C - String Tokenizing


Breaking a string into words is called tokenizing.

The strtok() function tokenizes a string.

It requires two arguments: the string to be tokenized and a string containing all the possible delimiter characters.
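
As a minimal sketch of the two-argument call described above, the hypothetical helper print_tokens() below (the name is an assumption made for illustration) lists every token in a string. It relies on two facts about strtok(): the string is passed only on the first call and NULL on later calls, and the function writes null characters into the string, so the text must be in a writable array.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: prints each token in text and returns how many were found.
// strtok() modifies text in place, so text must not be a string literal.
size_t print_tokens(char *text, const char *delimiters)
{
  size_t count = 0;
  char *token = strtok(text, delimiters);   // First call: pass the string itself
  while (token)
  {
    printf("%s\n", token);
    ++count;
    token = strtok(NULL, delimiters);       // Later calls: pass NULL to continue
  }
  return count;                             // strtok() returned NULL: no more tokens
}
```

Calling print_tokens() with an array holding "Hello, world! How are you?" and the delimiters " ,!?" would list five words, one per line.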

strtok_s() is a safer alternative to the standard strtok() function.

Because it's an optional standard function, you need to define the __STDC_WANT_LIB_EXT1__ symbol as 1 before including string.h to use it.
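
Not every compiler provides the optional functions. An implementation that supports them defines the __STDC_LIB_EXT1__ macro, so a program can test for it. The sketch below wraps that test in a hypothetical function, has_bounds_checking_functions(), whose name is an assumption made for illustration.

```c
#define __STDC_WANT_LIB_EXT1__ 1           // Request the optional functions
#include <string.h>

// Hypothetical helper: returns 1 if the implementation provides the optional
// bounds-checking functions such as strtok_s(), and 0 otherwise.
int has_bounds_checking_functions(void)
{
#ifdef __STDC_LIB_EXT1__
  return 1;                                // The library defines the macro
#else
  return 0;                                // The optional functions are not available
#endif
}
```

If the function returns 0, code that calls strtok_s() with four arguments will probably not compile with that library.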

The strtok_s() function requires four arguments:

The address of the string to be tokenized, or NULL for second and subsequent tokenizing operations after the first on the same string.

The address of an integer variable containing the size of the array in which the first argument is stored. This will be updated by the function, so it stores the number of characters left to be searched in the string to be tokenized after the current search.

The address of a string that contains all possible token delimiters.

A pointer to a variable of type char* in which the function will store information to allow it to continue searching for tokens after the first has been found.
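
To illustrate what the last argument is for, here is a rough sketch of a tokenizer that keeps its position in a caller-supplied char* variable. The function next_token() is a hypothetical stand-in written for this example, not the library's implementation of strtok_s(), and it leaves out the size argument. It uses only the standard strspn() and strcspn() functions.

```c
#include <string.h>

// Hypothetical stand-in that mimics the context argument of strtok_s().
// *context records where the next search should start.
char *next_token(char *str, const char *delimiters, char **context)
{
  if (!str)
    str = *context;                              // Continue after the previous token
  if (!str)
    return NULL;                                 // No string and no saved position
  str += strspn(str, delimiters);                // Skip leading delimiters
  if (*str == '\0')
  {
    *context = str;
    return NULL;                                 // Only delimiters were left
  }
  char *end = str + strcspn(str, delimiters);    // Find the end of the token
  if (*end != '\0')
    *end++ = '\0';                               // Terminate the token, step past it
  *context = end;                                // Save the position for the next call
  return str;
}
```

Because the position lives in the caller's variable rather than inside the function, two strings can be tokenized in alternation, each with its own context variable. That is one reason this style of interface is considered safer than strtok().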


#define __STDC_WANT_LIB_EXT1__ 1           // Make optional versions of functions available
#include <stdio.h>
#include <string.h>
#include <stdbool.h>

int main(void)
{
  char delimiters[] = " \".,;:!?)(";       // Prose delimiters
  char buf[100];                           // Buffer for a line of keyboard input
  char str[1000];                          // Stores the prose to be tokenized
  char* ptr = NULL;                        // Pointer used by strtok_s()
  str[0] = '\0';                           // Set 1st character to null
  size_t str_len = sizeof(str);
  size_t buf_len = sizeof(buf);
  printf("Enter some prose that is less than %zu characters.\n"
    "Terminate input by entering an empty line:\n", str_len);

  // Read multiple lines of prose from the keyboard
  while (true)
  {
    if (!gets_s(buf, buf_len))                                // Read a line of input
    {
      printf("Error reading string.\n");
      return 1;
    }
    if (!strnlen_s(buf, buf_len))                             // An empty line ends input
      break;

    // gets_s() drops the newline, so add a space to keep words on adjacent lines apart
    if (strcat_s(str, str_len, buf) || strcat_s(str, str_len, " "))
    {
      printf("Maximum permitted input length exceeded.\n");
      return 1;
    }
  }
  printf("The words in the prose that you entered are:\n");

  // Find and list all the words in the prose.
  // Note: Microsoft's strtok_s() omits the second (size) argument.
  unsigned int word_count = 0;
  char * pWord = strtok_s(str, &str_len, delimiters, &ptr);   // Find 1st word
  if (pWord)
  {
    do
    {
      printf("%-18s", pWord);
      if (++word_count % 5 == 0)                              // Five words per line
        printf("\n");
      pWord = strtok_s(NULL, &str_len, delimiters, &ptr);     // Find subsequent words
    } while (pWord);                                           // NULL ends tokenizing
    printf("\n%u words found.\n", word_count);
  }
  else
    printf("No words found.\n");

  return 0;
}