I think the following solution is ideal:
Explanation of the code:

- A struct `token` stores the address and length of each token.
- The `tokens` array is sized for the worst case, which is when `str` is made up entirely of separators, so there are `strlen(str) + 1` tokens, all of them empty strings.
- `str` is scanned once, recording the address and length of every token.
- The output array is allocated with one extra slot for a `NULL` sentinel value.
- Each token is copied with `memcpy`, as it's faster than `strcpy` and we know the lengths.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *start;
    size_t len;
} token;

char **split(const char *str, char sep)
{
    char **array;
    unsigned int start = 0, stop, toks = 0, t;
    /* Worst case: every character is a separator */
    token *tokens = malloc((strlen(str) + 1) * sizeof(token));
    for (stop = 0; str[stop]; stop++) {
        if (str[stop] == sep) {
            tokens[toks].start = str + start;
            tokens[toks].len = stop - start;
            toks++;
            start = stop + 1;
        }
    }
    /* Mop up the last token */
    tokens[toks].start = str + start;
    tokens[toks].len = stop - start;
    toks++;
    array = malloc((toks + 1) * sizeof(char*));
    for (t = 0; t < toks; t++) {
        /* Calloc makes it nul-terminated */
        char *copy = calloc(tokens[t].len + 1, 1);
        memcpy(copy, tokens[t].start, tokens[t].len);
        array[t] = copy;
    }
    /* Add a sentinel */
    array[t] = NULL;
    free(tokens);
    return array;
}
```
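Because the returned array ends with a `NULL` sentinel, the caller can walk and free it without tracking a separate count. The following is a minimal sketch of that caller-side cleanup; the helper names `token_count` and `free_tokens` are hypothetical and not part of the code above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: counts tokens by walking to the NULL sentinel. */
size_t token_count(char *const *array)
{
    size_t n = 0;
    while (array[n] != NULL)
        n++;
    return n;
}

/* Hypothetical helper: frees every token, then the array itself.
 * Accepts NULL so it can be called unconditionally. */
void free_tokens(char **array)
{
    size_t i;
    if (array == NULL)
        return;
    for (i = 0; array[i] != NULL; i++)
        free(array[i]);
    free(array);
}
```

A caller would typically do `char **parts = split(line, ',');`, use `parts[i]` until it hits `NULL`, and then call `free_tokens(parts)`.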
Note: `malloc` checking is omitted for brevity.
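For completeness, here is one way the allocation checks might be added. This is only a sketch: the name `split_checked` and the choice to return `NULL` on any allocation failure (after releasing the partial result) are my assumptions, and the algorithm is otherwise the same as above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of split(): returns NULL if any allocation fails,
 * releasing whatever had already been allocated. */
char **split_checked(const char *str, char sep)
{
    size_t start = 0, stop, toks = 0, t;
    char **array;

    /* Worst case: strlen(str) + 1 tokens, all of them empty. */
    struct span { const char *start; size_t len; } *spans =
        malloc((strlen(str) + 1) * sizeof *spans);
    if (spans == NULL)
        return NULL;

    for (stop = 0; str[stop]; stop++) {
        if (str[stop] == sep) {
            spans[toks].start = str + start;
            spans[toks].len = stop - start;
            toks++;
            start = stop + 1;
        }
    }
    /* The last token runs to the end of the string. */
    spans[toks].start = str + start;
    spans[toks].len = stop - start;
    toks++;

    array = malloc((toks + 1) * sizeof *array);
    if (array == NULL) {
        free(spans);
        return NULL;
    }

    for (t = 0; t < toks; t++) {
        /* calloc zero-fills, so the copy is NUL-terminated. */
        array[t] = calloc(spans[t].len + 1, 1);
        if (array[t] == NULL) {
            /* Undo the partial result before reporting failure. */
            while (t > 0)
                free(array[--t]);
            free(array);
            free(spans);
            return NULL;
        }
        memcpy(array[t], spans[t].start, spans[t].len);
    }
    array[toks] = NULL; /* sentinel */
    free(spans);
    return array;
}
```

With this variant, a caller only needs to test the return value against `NULL` before using the array.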
In general, I wouldn't return an array of `char *` pointers from a split function like this, as it places a lot of responsibility on the caller to free them correctly. An interface I prefer is to allow the caller to pass a callback function that is called for every token, as I have described here: Split a String in C.
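To give a rough idea of what a callback-style interface could look like, here is a minimal sketch. The names `token_fn`, `split_each`, `tally` and `tally_token` are hypothetical and the signature is an assumption for illustration; it is not necessarily the interface from the linked answer.

```c
#include <stddef.h>

/* Hypothetical callback type: receives a pointer to the token's first
 * character (NOT NUL-terminated), its length, and a caller-supplied context. */
typedef void (*token_fn)(const char *start, size_t len, void *ctx);

/* Hypothetical interface: calls fn once per token and allocates nothing,
 * so there is nothing for the caller to free afterwards. */
void split_each(const char *str, char sep, token_fn fn, void *ctx)
{
    size_t start = 0, stop;
    for (stop = 0; str[stop]; stop++) {
        if (str[stop] == sep) {
            fn(str + start, stop - start, ctx);
            start = stop + 1;
        }
    }
    /* The last token runs to the end of the string. */
    fn(str + start, stop - start, ctx);
}

/* Example context and callback: counts tokens and sums their lengths. */
struct tally {
    size_t count;
    size_t total_len;
};

void tally_token(const char *start, size_t len, void *ctx)
{
    struct tally *t = ctx;
    (void)start; /* the token contents are not needed here */
    t->count++;
    t->total_len += len;
}
```

The trade-off is that tokens are passed as pointer-plus-length into the original string, so a callback that needs a NUL-terminated copy has to make one itself.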