./C FAQs

posted by cli on

C FAQs

This is a condensed version of comp.lang.c Frequently Asked Questions

  1. In C, the syntax and interpretation of a declaration is not really type identifier, but rather base_type declarator. declarator consists of an identifier name along with optional *, [] or () syntax indicating(if present) that the identifier is a pointer, array, function, or some combination.

  2. There can be many declarations of a global variable or function, but there must be exactly one definition. For global variables, the definition is the declaration that actually allocates space, and provides an initialization value, if any. For functions, the definition is the declaration that provides a function body.

  3. The language in the standard does not require all declarations for the same static function or variable to include the storage class static, but the rules are rather intricate, and are slightly different for function than for data objects. Therefore, it’s safest if static appears consistently in the definition and declarations.

    ``` /* object / / function */

    int o1; int f1(); /* external linkage / static int o2; static int f2(); / internal linkage / static int o3; static int f3(); / internal linkage */

    static int o1; static int f1(); /* ERROR, both have external linkage / int o2; / ERROR, o2 has internal linkage / int f2(); / OK, picks up internal linkage / extern int o3; extern int f3(); / OK, both pick up internal linkage */ ```

    The difference is case(2), where functions do pick up a previous linkage even without extern, objects don’t.

  4. extern is significant only with data declarations. In function declarations, it can be used as a stylish hint to indicate the function’s definition is probably in another source file.

  5. typedefs obey scope rules(that is, they can be declared local to a function or block), #defines don’t(usually a warning is issued if a redefinition is encounted).

  6. The famous const questions. Rule of thumb: read it from right to left.

    c int* /* pointer to int */ int const * /* pointer to const int */ int * const /* const pointer to int */ int const * const /* const pointer to const int */ Now the first const can be on either side of the type so:

    c const int * == int const * const int * const == int const * const If you want to go really crazy you can do things like this:

    c int ** /* pointer to pointer to int */ int ** const /* a const pointer to a pointer to an int */ int * const * /* a pointer to a const pointer to an int */ int const ** /* a pointer to a pointer to a const int */ int * const * const /* a const pointer to a const pointer to an int */ typedef substitutions are not purely textual. In this declaration

    c typedef char *charp; const charp p; p is const for the same reason const int i declares i as const. The typedefed declaration of p does not look inside the typedef to see that there is a pointer invloved.

  7. One way to make sense of complicated C declarations is by reading them inside out, remembering that [] and () bind more tightly than *. For example, given

    char *(*pfpc)();

    We can see that pfpc is a pointer (the inner *) to a function (the ()) returning a pointer (the outer *) to char. When we later use pfpc, the expression *(*pfpc)() (the value pointed to by the return value of a function pointed to by pfpc) will be a char.

    cdecl can help a lot when declarations get messy.

  8. If you really need to declare a pointer to an entire array, use something like int (*ap)[N];.

  9. Postfix ++ has higher precedence than unary prefix *.

    c x = *p++; /* pointer p gets incremented */ x = (*p)++; /* increment the pointed-to location */

  10. There is a technique known as the Clockwise/Spiral Rule which enables any C programmer to parse in their head any C declaration! There are three simple steps to follow:
    (a). Starting with the unknown element, move in a spiral/clockwise direction; when ecountering the following elements replace them with the corresponding english statements:

    [X] or []      => Array X size of... or Array undefined size of...
    (type1, type2) => function passing type1 and type2 returning...
    *              => pointer(s) to...
    

    (b). Keep doing this in a spiral/clockwise direction until all tokens have been covered.
    (c). Always resolve anything in parenthesis first!

  11. Function calls are allowed in initializers only for automatic variables, that is for local, non-static variables. e.g. Compliers complain if char *p = malloc(10) is placed in global scope.

  12. A string literal can be used in two slightly different ways:
    (a). As the initializer for an array of char, e.g. char s[] = "foobar";. It specifies initial values of the characters in that array, and if necessary, its size.
    (b). Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therfore cannot be necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual. char *p = "foobar"; initializes p to point to the unnamed array’s first element.

  13. When the name of a function appears in an expression, it decays into a point, that is, it has its address implicitly taken, much as an array name does. e.g. int (*fp)() = func;

  14. There are four different kinds of namespaces:
    (a). labels (i.e. goto targets);
    (b). tags (names of structures, unions, and enumerations; these three aren’t separate even though they theoretically could be);
    (c). structure/union members (one namespace per structure or union);
    (d). everything else (functions, variables, typedef names, enumeration constants), termed ordinary identifiers by the Standard.

  15. There are four kinds of scope (regions over which an identifier’s declaration is in effect) in C: function, file, block, and prototype. The fourth one exists only in the parameter lists of function prototype declarations

  16. There’re three variations on not precisely defined by the standard:
    (a). implementation-defined: The implementation must pick some behavior; it may not fail to compile the program. (The program using the construct is not incorrect.) The choice must be documented. The Standard may specify a set of allowable behaviors from which to choose, or it may impose no particular requirements.
    (b). unspecified: Like implementation-defined, except that the choice need not be documented.
    (c). undefined: Anything at all can happen; the Standard imposes no requirements. The program may fail to compile, or it may execute incorrectly (either crashing or silently generating incorrect results), or it may fortuitously do exactly what the programmer intended. Here is another way of looking at it, due to Roger Miller: > Somebody told me that in basketball you can’t hold the ball and run. I got a basketball > and tried it and it worked just fine. He obviously didn’t understand basketball.

  17. Kernighan and Ritchie wisely point out > If you don’t know how they are done on various machines, that innocence may help to protect you.

  18. When you need to ensure the order of subexpression evaluation, you may need to use explicit temporary variables and separate statements. Parentheses won’t help here. Parentheses tell the compiler which operands go with which operators; they do not force the compiler to evaluate everything within the parentheses first.

  19. Sequence point, read c-seq-point carefully.

  20. To avoid surprises, it’s best to avoid mixing signed and unsigned types in the same expression. You can always use explicit casts to indicate, unambiguously, exactly when and how you want conversions performed.

  21. void * acts as a generic pointer only because conversions (if necessary) are applied automatically when other pointer types are assigned to and from void *’s.

    void *’s are only guaranteed to hold object (i.e. data) pointers; it is not portable to convert a function pointer to type void *. It is guaranteed, however, that all function pointers can be interconverted, as long as they are converted back to an appropriate type before calling. Therefore, you can pick any function type (usually int (*)() or void (*)(), that is, pointer to function of unspecified arguments returning int or void) as a generic function pointer. When you need a place to hold object and function pointers interchangeably, the portable solution is to use a union of a void * and a generic function pointer (of whichever type you choose).

  22. Originally, a pointer to a function had to be turned into a real function, with the * operator, before calling:

    int r, (*fp)(), func();
    fp = func;
    r = (*fp)();
    

    It can also be argued that functions are always called via pointers, and that real function names always decay implicitly into pointers (in expressions, as they do in initializations). This reasoning means that

    r = fp();
    

    is legal and works correctly, whether fp is the name of a function or a pointer to one.

    The ANSI C Standard essentially adopts the latter interpretation, meaning that the explicit * is not required, though it is still allowed.

    Remember that functions are second-class citizens in C.

  23. The language definition states that for each pointer type, there is a special value, the null pointer – which is distinguishable from all other pointer values and which is guaranteed to compare unequal to a pointer to any object or function. The internal values of null pointers for different types may be different.

    According to the language definition, an integral constant expression with the value 0 in a pointer context is converted into a correctly typed null pointer at compile time.

    It is only in pointer contexts that NULL and 0 are equivalent. NULL should not be used when another kind of 0 is required, even though it might work, because doing so sends the wrong stylistic message. Furthermore, ANSI allows the definition of NULL to be ((void *)0), which will not work at all in non-pointer contexts.

    Although symbolic constants are often used in place of numbers because the numbers might change, this is not the reason that NULL is used in place of 0. NULL is used only as a stylistic convention.

    If you’re confused, here are two simple rules to follow:

    (a). When you want a null pointer constant in source code, use 0 or NULL.
    (b). If the usage of 0 or NULL is an argument in a function call, cast it to the pointer type expected by the function being called.

    Rule (b) is conservative, but it doesn’t hurt. Strictly speaking, casts on pointer arguments are only required in function calls without prototypes in scope, and in the variable-length part of variable-length argument lists.

  24. As Wayne Throop has put it > It’s pointer arithmetic and array indexing that are equivalent in C, pointers and arrays > are different.

  25. A reference to an object of type array-of-T which appears in an expression decays (with three exceptions) into a pointer to its first element; the type of the resultant pointer is pointer-to-T.

  26. Since arrays decay immediately into pointers, an array is never actually passed to a function. You can pretend that a function receives an array as a parameter, and illustrate it by declaring the corresponding parameter as an array:

    void f(char a[])
    { ... }
    

    Interpreted literally, this declaration would have no use, so the compiler turns around and pretends that you’d written a pointer declaration, since that’s what the function will in fact receive:

    void f(char *a)
    { ... }
    
  27. There’s hardly such a thing as an array, after all, arrays are second-class citizens in C; one upshot of this prejudice is that you cannot assign to them.

  28. An array is not a pointer, nor vice versa. An array reference (that is, any mention of an array in a value context), turns into a pointer.

  29. The difference between arr and &arr is the type. In Standard C, &arr yields a pointer, of type pointer-to-array-of-T, to the entire array. A simple reference (without an explicit &) to an array yields a pointer, of type pointer-to-T, to the array’s first element.

    Arrays of type T decay into pointers to type T, which is convenient; subscripting or incrementing the resultant pointer will access the individual members of the array. True pointers to arrays, when subscripted or incremented, step over entire arrays, and are generally useful only when operating on arrays of arrays, if at all.

  30. Pointer arithmetic is defined only as long as the pointer points within the same allocated block of memory, or to the imaginary terminating element one past it; otherwise, the behavior is undefined, even if the pointer is not dereferenced.

  31. The rule by which arrays decay into pointers is not applied recursively.

  32. Perhaps surprisingly, character constants in C are of type int, so sizeof('a') is sizeof(int). It makes perfect sense if you know the rule that character constants are of type int, even if that rule doesn’t seem to make much sense in itself.

  33. It is true that any nonzero value is considered true in C, but this applies only on input, i.e. where a boolean value is expected. When a boolean value is generated by a built-in operator such as ==, !=, and <=, it is guaranteed to be 1 or 0. Therefore, the test

    if((a == b) == TRUE)
    

    would work as expected (as long as TRUE is 1), but it is obviously silly. In fact, explicit tests against TRUE and FALSE are generally inappropriate. In particular, and unlike the built-in operators, some library functions (notably isupper, isalpha, etc.) return, on success, a nonzero value which is not necessarily 1, so comparing their return values against a single value such as TRUE is quite risky and likely not to work.

    A good rule of thumb is to use TRUE and FALSE (or the like) only for assignment to a boolean variable or function parameter, or as the return value from a boolean function, but never in a comparison.

  34. There are three important rules to remember when defining function-like macros:
    (a). The macro expansion must always be parenthesized to protect any lower-precedence operators from the surrounding expression.
    (b). Within the macro definition, all occurrences of the parameters must be parenthesized to protect any low-precedence operators in the actual arguments from the rest of the macro expansion.
    (c). If a parameter appears several times in the expansion, the macro may not work properly if the actual argument is an expression with side effects.

  35. When a function accepts a variable number of arguments, its prototype does not (and cannot) provide any information about the number and types of those variable arguments. Therefore, the usual protections do not apply in the variable-length part of variable-length argument lists: the compiler cannot perform implicit conversions or (in general) warn about mismatches. The programmer must make sure that arguments match, or must manually insert explicit casts.

  36. You can use a pointer-to-T (for any type T) where a pointer-to-const-T is expected. However, the rule (an explicit exception) which permits slight mismatches in qualified pointer types is not applied recursively, but only at the top level. For example, you can not pass a char ** to a function which expects a const char **.

    If you must assign or pass pointers which have qualifier mismatches at other than the first level of indirection, you must use explicit casts, although as always, the need for such a cast may indicate a deeper problem which the cast doesn’t really fix.

  37. Perhaps surprisingly, \n in a scanf format string does not mean to expect a newline, but rather to read and discard characters as long as each is a whitespace character. In fact, any whitespace character in a scanf format string means to read and discard whitespace characters. Furthermore, formats like %d also discard leading whitespace, so you usually don’t need explicit whitespace in scanf format strings at all.

  38. As a general rule, you shouldn’t try to interlace calls to scanf with calls to gets() (or any other input routines); scanf’s peculiar treatment of newlines almost always leads to trouble. Either use scanf to read everything or nothing.