Strtok strange behaviour

You have a large number of errors in your code and you are over-complicating the problem. The most pressing error is the one covered in "Why is while ( !feof (file) ) always wrong?" Why? Trace the execution path within your loop. You attempt to read with fgets(), and then you use sentence without knowing whether EOF was reached, calling tok_ptr = strtok(sentence, " \n"); before you ever get around to checking feof(f1_ptr).

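What happens when you actually reach EOF? The final fgets() call fails and leaves sentence unchanged, so your loop tokenizes and outputs the last line a second time before the feof() test ever ends the loop. That IS “Why while ( !feof (file) ) is always wrong?” Instead, you always want to control your read-loop with the return of the read function you are using, e.g. while (fgets(sentence, SIZE, f1_ptr) != NULL)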
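To make the difference concrete, here is a minimal sketch of a read-loop controlled by the return of fgets(). The names sentence, SIZE and f1_ptr are taken from your code; the value of SIZE, the delimiter string and the count_tokens() helper are assumptions made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define SIZE 1024   /* line buffer size (assumed value, use your own) */

/* Count whitespace-separated tokens in a stream (hypothetical helper).
 * The loop is controlled by the return of fgets(), so sentence is only
 * used after a successful read and the loop exits cleanly at EOF.
 */
size_t count_tokens (FILE *f1_ptr)
{
    char sentence[SIZE];        /* buffer to hold each line */
    size_t ntok = 0;            /* number of tokens found */

    while (fgets (sentence, SIZE, f1_ptr) != NULL)
        for (char *tok_ptr = strtok (sentence, " \t\r\n"); tok_ptr;
             tok_ptr = strtok (NULL, " \t\r\n"))
            ntok++;             /* one more token in this line */

    return ntok;
}
```

Because the body only runs when fgets() succeeds, the last line is never processed twice, which is what can happen with the feof() version.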

What is it you actually need your code to do?

The larger question is why are you over-complicating the problem with strtok, and arrays (and fgets() for that matter)? Think about what you need to do:

  1. read each character in the file,
  2. if it is whitespace, ignore it, set the in-word flag false,
  3. if it is non-whitespace, capitalize it if it is the 1st char in a word, output the char, set the in-word flag true and increment the number of chars output to the current line, and finally
  4. if it is the 20th character output, output a newline and reset the counter to zero.

The bare-minimum tools you need from your C-toolbox are fgetc(), isspace() and toupper() from ctype.h, a counter for the number of characters output, and a flag to know if the character is the first non-whitespace character after a whitespace.

Implementing the logic

That makes the problem very simple. Read a character. Is it whitespace? If so, set your in-word flag false. Otherwise, capitalize it if your in-word flag is false, output the character, set your in-word flag true, and increment your character count. The last thing you need to do is check whether your character count has reached the limit; if so, output a '\n' and reset your character count to zero. Repeat until you run out of characters.

You can turn that into code with something similar to the following:

#include <stdio.h>
#include <ctype.h>

#define CPL 20      /* chars per-line, if you need a constant, #define one (or more) */

int main (int argc, char **argv) {
    
    int c, in = 0, n = 0;   /* char, in-word flag, no. of chars output in line */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    
    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    while ((c = fgetc(fp)) != EOF) {            /* read / validate each char in file */
        if (isspace(c))                         /* char is whitespace? */
            in = 0;                             /* set in-word flag false */
        else {  /* otherwise, not whitespace */
            putchar (in ? c : toupper(c));      /* output char, capitalize 1st in word */
            in = 1;                             /* set in-word flag true */
            n++;                                /* increment character count */
        }
        if (n == CPL) {                         /* CPL limit reached? */
            putchar ('\n');                     /* output newline */
            n = 0;                              /* reset cpl counter */
        }
    }
    if (n)              /* tidy up with newline if a partial line remains */
        putchar ('\n');
    
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);
}

Example Use/Output

Given your input file stored on my computer in dat/text220.txt, you can produce the output you are looking for with:

$ ./bin/text220 dat/text220.txt
WatchYourThoughts;Th
eyBecomeWords.WatchY
ourWords;TheyBecomeA
ctions.WatchYourActi
ons;TheyBecomeHabits
.WatchYourHabits;The
yBecomeCharacter.Wat
chYourCharacter;ItBe
comesYourDestiny.

(the executable for the code was compiled to bin/text220; I usually keep separate dat, obj, and bin directories for data, object files and executables to keep my source code directory clean)

note: by reading from stdin by default if no filename is provided as the first argument to the program, you can use your program to read input directly, e.g.

$ echo "my dog      has   fleas  -   bummer!" | ./bin/text220
MyDogHasFleas-Bummer
!

No fancy string functions required, just a loop, a character, a flag and a counter. The rest is just arithmetic. It’s always worth trying to boil your programming problems down to basic steps and then look around your C-toolbox to find the right tool for each basic step.

Using strtok

Don’t get me wrong, there is nothing wrong with using strtok, and it makes a fairly simple solution in this case. The point I was making is that for simple character-oriented string processing, it’s often just as simple to loop over the characters in the line. You don’t gain any efficiencies using fgets() with an array and strtok(); the read from the file is already placed into a buffer of BUFSIZ bytes (see footnote 1).

If you did want to use strtok(), you should control your read-loop with the return from fgets(), and then you can tokenize with strtok(), also checking its return at each point. That gives you a read-loop with fgets() and a tokenization loop with strtok(). Then you handle first-character capitalization and limit your output to 20 chars per line.

You could do something like the following:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define CPL 20      /* chars per-line, if you need a constant, #define one (or more) */
#define MAXC 1024
#define DELIM " \t\r\n"

void putcharCPL (int c, int *n)
{
    if (*n == CPL) {            /* if n == limit */
        putchar ('\n');         /* output '\n' */
        *n = 0;                 /* reset value at mem address to 0 */
    }
    putchar (c);                /* output character */
    (*n)++;                     /* increment value at mem address */
}

int main (int argc, char **argv) {
    
    char line[MAXC];    /* buffer to hold each line */
    int n = 0;          /* no. of chars output in line */
    /* use filename provided as 1st argument (stdin by default) */
    FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
    
    if (!fp) {  /* validate file open for reading */
        perror ("file open failed");
        return 1;
    }
    
    while (fgets (line, MAXC, fp))  /* read each line and tokenize line */
        for (char *tok = strtok (line, DELIM); tok; tok = strtok (NULL, DELIM)) {
            /* convert 1st char to upper (cast keeps toupper() arg valid) */
            putcharCPL (toupper((unsigned char)*tok), &n);
            for (int i = 1; tok[i]; i++)        /* output rest unchanged */
                putcharCPL (tok[i], &n);
        }
    putchar ('\n');     /* tidy up with newline */
    
    if (fp != stdin)    /* close file if not stdin */
        fclose (fp);
}

(same output)

The putcharCPL() function is just a helper that checks if 20 characters have been output and if so outputs a '\n' and resets the counter. It then outputs the current character and increments the counter by one. A pointer to the counter is passed so it can be updated within the function making the updated value available back in main().
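If the pointer-to-counter idea is new to you, the following sketch shows the same helper writing to a caller-supplied stream so you can see the counter persist across calls. The name fputcCPL and the out parameter are made up for this illustration; the logic is otherwise the same as putcharCPL() above.

```c
#include <stdio.h>

#define CPL 20      /* chars per-line */

/* Variant of putcharCPL() that writes to any stream (hypothetical helper).
 * n points to the caller's counter, so the update made here is visible
 * to the caller after the function returns.
 */
void fputcCPL (int c, int *n, FILE *out)
{
    if (*n == CPL) {            /* if counter == limit */
        fputc ('\n', out);      /* output '\n' */
        *n = 0;                 /* reset caller's counter to 0 */
    }
    fputc (c, out);             /* output character */
    (*n)++;                     /* increment caller's counter */
}
```

Calling fputcCPL ('a', &n, stdout) 25 times with n starting at 0 writes a line of 20 characters, a newline, then 5 more characters, and leaves n equal to 5 back in the caller. Had the counter been passed by value, the function would only change its own copy and the caller's n would never move.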

Look things over and let me know if you have further questions.

footnotes:

1. Depending on your version of glibc, the constant in the source setting the read-buffer size may be _IO_BUFSIZ. _IO_BUFSIZ was changed to BUFSIZ in glibc commit 9964a14579e5eef9. On Linux, BUFSIZ is defined as 8192 (512 on Windows).
