hashcollision: C

Sometimes I really, really dislike C for its low-levelness.

Here's one concrete example of why. Predict the output of the following program:


#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(int argc, char** argv) {
  printf("hello world\n");
  if (fork() == 0) {
    printf("child\n");
  } else {
    printf("parent\n");
  }
  return 0;
}

when it's compiled and run like this:


dyoo@dyoo-desktop:~/work/net-2$ gcc -Wall test-fork.c
dyoo@dyoo-desktop:~/work/net-2$ ./a.out
[predict your output here #1]
dyoo@dyoo-desktop:~/work/net-2$ ./a.out | cat
[predict your output here #2]

Yes; I was surprised too, but #1 and #2 can produce different output.

At least, this is what I see:


dyoo@dyoo-desktop:~/work/net-2$ ./a.out
hello world
child
parent
dyoo@dyoo-desktop:~/work/net-2$ ./a.out | cat
hello world
child
hello world
parent

The bug here is the interaction between the low-level fork() and the way that buffered output works in C. When stdout is a terminal, stdio line-buffers it, so "hello world" is written out as soon as printf() sees the newline. In the second case, since I'm piping through cat, stdout is a pipe and gets fully buffered instead: the first printf() is not written immediately but sits in a buffer inside the process. The fork() duplicates the process, buffer and all. So when the child and parent processes each exit, they flush their respective copies of the buffer... and we see "hello world" twice.
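
One way around this, as far as I can tell, is to flush stdio's buffers before calling fork(), so the child never inherits pending output. Here's a minimal sketch of that idea. The names fork_after_flush() and run_demo() are made up for illustration and aren't from any library. I've also added a waitpid() and an explicit _exit() in the child, which the original program doesn't have, so the ordering is deterministic.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of any library): flush every open
   stdio output stream, then fork, so the child starts out with
   empty buffers. */
pid_t fork_after_flush(void) {
  fflush(NULL);  /* NULL means "all open output streams" */
  return fork();
}

/* Hypothetical demo: the same program as before, but using the
   helper. Returns 0 on success, 1 if fork() fails. */
int run_demo(void) {
  printf("hello world\n");
  pid_t pid = fork_after_flush();
  if (pid < 0) {
    perror("fork");
    return 1;
  }
  if (pid == 0) {
    printf("child\n");
    fflush(stdout);  /* _exit() does not flush stdio buffers */
    _exit(0);
  }
  waitpid(pid, NULL, 0);  /* assumption: wait so the child prints first */
  printf("parent\n");
  return 0;
}
```

Calling run_demo() from main() and piping the result through cat should now show "hello world" only once, followed by "child" and "parent".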

Some man pages do talk about this, like the one from SGI IRIX. Linux's man pages, not so much. Ah well.

Sunday, April 15, 2007

stick a fork() in it