
I have a string, for example:

char* cmd = "a bcd ef hijk lmmopq";

The string consists of segments separated by spaces; the number of segments is not fixed.

Intuitively, I can build an array of strings by allocating memory dynamically, for example:

char** argv = malloc(/* number of segments * sizeof *argv */);
argv[0] = malloc(/* length of segment 0, plus 1 */);
...
argv[i] = malloc(/* length of segment i, plus 1 */);

But can I transform the original string into a 2D char array like the one below, so that I avoid memory allocation?

char* argv[] = {"a", "bcd", "ef", "hijk", "lmmopq"};  
  • char** argv is not a 2D array, but a pointer to a pointer. char a[3][4] is a 2D array. Commented Apr 12, 2017 at 4:43
  • "But can I get a 2d char array like below to avoid memory allocation" what do you mean by "can I get"? Get from where? Do you know the size of that array in advance? Commented Apr 12, 2017 at 4:43
  • I want to transform the original cmd to a 2d array. I want to use it like a 2d array afterwards.@AlexLop. Commented Apr 12, 2017 at 4:49
  • Yes, you can tokenize your string into discrete words with strtok or strsep, or by simply walking a pair of pointers down your string and identifying the individual words directly. You can then assign each word to a successive pointer in a pointer-to-pointer-to-char, or copy the words into a 2D array of chars of sufficient size. Commented Apr 12, 2017 at 5:04
  • Do those pointer-to-pointers have to point at your cmd string? Do you need to keep this string in one place and not copy it anywhere? Commented Apr 12, 2017 at 5:07
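
The 2D-array route mentioned in the comments could look roughly like the sketch below. It leaves cmd untouched and copies each word into a fixed-size array of chars, so no malloc is needed. MAX_WORDS, MAX_LEN and split_copy are hypothetical names chosen for illustration, and the limits are assumptions: words beyond MAX_WORDS are ignored and over-long words are truncated.

```c
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 16   /* assumed upper bound on the number of words */
#define MAX_LEN   32   /* assumed upper bound on word length, incl. '\0' */

/* Copy space-separated words from src into dst without modifying src.
 * Returns the number of words stored (at most MAX_WORDS). */
size_t split_copy (const char *src, char dst[MAX_WORDS][MAX_LEN])
{
    size_t n = 0;

    while (n < MAX_WORDS) {
        src += strspn (src, " ");            /* skip leading spaces */
        if (*src == '\0')
            break;
        size_t len = strcspn (src, " ");     /* length of this word */
        size_t copy = len < MAX_LEN - 1 ? len : MAX_LEN - 1;
        memcpy (dst[n], src, copy);          /* truncate over-long words */
        dst[n][copy] = '\0';
        n++;
        src += len;                          /* move past the word */
    }
    return n;
}
```

A caller would declare char words[MAX_WORDS][MAX_LEN], call split_copy(cmd, words), and then use words[i] like any other string. Because the words are copied, this works even when cmd points to a string literal.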

2 Answers


As pointed out in another answer, strtok can be used to split up your string in-place so that delimiters (spaces) are replaced with null terminators.

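To know how many strings there will be, you'll have to iterate through the string twice. For the first iteration, write a quick and simple function that doesn't alter the string, like this (the number of spaces plus one equals the number of words as long as the words are separated by single spaces):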

size_t count_spaces (const char* src)
{
  size_t spaces=0;
  for(; *src != '\0'; src++)
  {
    if(*src == ' ')
    {
      spaces++;
    }
  }
  return spaces;
}

Then for the second iteration, use strtok. Full example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

size_t count_spaces (const char* src)
{
  size_t spaces=0;
  for(; *src != '\0'; src++)
  {
    if(*src == ' ')
    {
      spaces++;
    }
  }
  return spaces;
}

size_t tokenize (char* restrict src, 
                 size_t dst_size, 
                 char* restrict dst [dst_size])
{
  size_t i;
  char* ptr = strtok(src, " ");
  for(i=0; i<dst_size && ptr != NULL; i++)
  {
    dst[i] = ptr;
    ptr = strtok(NULL, " ");
  }
  return i; // number of pointers actually stored
}

int main (void) 
{
  char str [] = "a bcd ef hijk lmmopq";
  size_t size = count_spaces(str) + 1; // upper bound on the number of tokens
  char* ptr_arr [size];

  // only the first `count` entries are initialized
  size_t count = tokenize(str, size, ptr_arr);

  for(size_t i=0; i<count; i++)
  {
    puts(ptr_arr[i]);
  }
}
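
One caveat about the sizing step: count_spaces(str) + 1 equals the number of tokens only when words are separated by exactly one space and there are no leading or trailing spaces, because strtok treats a run of delimiters as a single separator. A minimal sketch of a counter that matches strtok's behaviour for the " " delimiter follows; count_words is a hypothetical name, not part of the answer above.

```c
#include <stddef.h>
#include <string.h>

/* Count the words that strtok(src, " ") would produce,
 * without modifying src. */
size_t count_words (const char *src)
{
    size_t words = 0;

    for (;;) {
        src += strspn (src, " ");    /* skip a run of spaces */
        if (*src == '\0')
            break;
        words++;
        src += strcspn (src, " ");   /* skip the word itself */
    }
    return words;
}
```

Used in place of count_spaces(str) + 1, this sizes the pointer array exactly. Note that it returns 0 for an empty or all-space string, and a variable-length array must have a size greater than zero, so that case needs its own check.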

Continuing from my comment, you can tokenize your string with, e.g., strtok, and assign a pointer to each individual word to the next element of an array of pointers to char. For example:

#include <stdio.h>
#include <string.h>

#define MAX 10

int main (void) {

    /* alternatively, declare as an array: char cmd[] = "a bcd ef hijk lmmopq"; */
    char *cmd = (char[]){"a bcd ef hijk lmmopq"},       /* compound literal */
        *arr[MAX] = {NULL},
        *delim = " \n";
    size_t n = 0;

    for (char *p = strtok (cmd, delim); n < MAX && p; p = strtok (NULL, delim))
        arr[n++] = p;

    for (int i = 0; i < (int)n; i++)
        printf ("arr[%d]: %s\n", i, arr[i]);

    return 0;
}

Example Use/Output

$ ./bin/str2ptp
arr[0]: a
arr[1]: bcd
arr[2]: ef
arr[3]: hijk
arr[4]: lmmopq

Note: You cannot pass a string literal to strtok, because strtok modifies the string and a string literal must not be modified. Either point cmd at a writable array (such as the compound literal above) or declare and initialize a normal char[] array to begin with.
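
If the text only exists as a string literal (or as any const char *), one option is to copy it into a writable buffer before tokenizing. The sketch below is one way to do that without malloc; copy_for_strtok is a hypothetical helper name, and the buffer size is whatever the caller chooses.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into the caller's writable buffer dst of dst_size bytes.
 * Returns 0 on success, -1 if src (plus its '\0') does not fit. */
int copy_for_strtok (char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen (src) + 1;   /* include the terminating '\0' */

    if (len > dst_size)
        return -1;
    memcpy (dst, src, len);
    return 0;
}
```

A caller could declare char buf[64], check that copy_for_strtok(buf, sizeof buf, cmd) returns 0, and then pass buf to strtok instead of cmd.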


Dynamically allocating pointers for an unknown number of words

If you have no idea whether you will read twenty words or 2000 words, you can handle the situation by dynamically allocating a block of pointers and then reallocating a larger block each time the current allocation fills up. It is a simple process, e.g.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX 10
#define MAXB 4096

int main (void) {

    char cmd[MAXB] = "",  /* buffer of 4096 chars to hold each line input */
        **arr = calloc (MAX, sizeof *arr),
        *delim = " \n";
    size_t n = 0, max = MAX;

    while (fgets (cmd, MAXB, stdin)) { /* read each line of input on stdin */

        size_t len = strlen (cmd);  /* get line length */
        if (len && cmd[len - 1] == '\n')   /* test for trailing '\n'  */
            cmd[--len] = 0;         /* overwrite with nul-byte */

        n = 0;  /* restart per line: older pointers refer to the overwritten cmd */

        for (char *p = strtok (cmd, delim); p; p = strtok (NULL, delim)) {
            arr[n++] = p;

            if (n == max) { /* grow arr by MAX pointers when n reaches max */
                void *tmp = realloc (arr, (max + MAX) * sizeof *arr);
                if (!tmp) {
                    fprintf (stderr, "error: memory exhausted.\n");
                    break;
                }
                arr = tmp;
                /* zero new pointers (optional) */
                memset (arr + max, 0, MAX * sizeof *arr);
                max += MAX; /* update the current capacity */
            }
        }
        for (int i = 0; i < (int)n; i++)
            printf ("arr[%2d]: %s\n", i, arr[i]);
    }

    free (arr);  /* don't forget, if you allocate it -> you free it. */

    return 0;
}

Always validate your allocations with, e.g., if (!arr) { /* handle error */ }; that check was omitted from the initial allocation of arr above for brevity.
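
One way to keep every allocation checked is to put the growth step in a small helper that reports failure to the caller. The following is only a sketch of that idea; grow_ptr_array is a hypothetical name. Because realloc(NULL, size) behaves like malloc(size), the same helper can also perform the initial allocation when called with *arr == NULL and *max == 0.

```c
#include <stdlib.h>

/* Grow *arr from *max to *max + step pointers, setting the new slots
 * to NULL. Returns 0 on success; on failure returns -1 and leaves
 * *arr and *max unchanged, so the caller can still free the old block. */
int grow_ptr_array (char ***arr, size_t *max, size_t step)
{
    char **tmp = realloc (*arr, (*max + step) * sizeof **arr);

    if (tmp == NULL)
        return -1;
    for (size_t i = *max; i < *max + step; i++)
        tmp[i] = NULL;               /* initialize the new pointers */
    *arr = tmp;
    *max += step;
    return 0;
}
```

In the loop above, the body of if (n == max) would then reduce to a single call such as grow_ptr_array(&arr, &max, MAX), with a non-zero return treated as the out-of-memory case.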

Example Use/Output

$ echo "A quick brown fox jumps over the lazy dog while the dog watched" | \
./bin/strtop2pdyn
arr[ 0]: A
arr[ 1]: quick
arr[ 2]: brown
arr[ 3]: fox
arr[ 4]: jumps
arr[ 5]: over
arr[ 6]: the
arr[ 7]: lazy
arr[ 8]: dog
arr[ 9]: while
arr[10]: the
arr[11]: dog
arr[12]: watched

3 Comments

What if there are 20 tokens? Or 100? The last token will be composed of X-MAX space-separated substrings.
You either set MAX sufficiently large or you dynamically allocate as needed, your choice. However, in the context of this question the problem definition was for a set of 5 tokens. If I were implementing this, then I would dynamically allocate the first MAX pointers in arr and then call realloc as required to handle as many words as there are in the line.
That's correct, but OP stated that the number of tokens is not fixed and he wants to avoid memory allocation. Most of the time setting MAX to a sufficiently large value works, but IMHO it's bad practice.
