C Language Elements

* C Tokens
        - In a C source program, the basic element recognized by the compiler is the "token." A token is source-program text that the compiler does not break down into component elements.

Syntax

token:
keyword
identifier
constant
string-literal
operator
punctuator
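
As a rough illustration of the token categories listed above, the sketch below annotates the tokens of one statement. The function name add_one is hypothetical and exists only for this example.

```c
/* Hypothetical helper used only to show how one statement splits into tokens. */
int add_one(int count)
{
    return count + 1;
    /* Tokens in the statement above:
         return   keyword
         count    identifier
         +        operator
         1        constant
         ;        punctuator */
}
```

The white space between tokens is not itself a token; it only separates them.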

* C Comments
       - A "comment" is a sequence of characters beginning with a forward slash/asterisk combination (/*) that is treated as a single white-space character by the compiler and is otherwise ignored. A comment can include any combination of characters from the representable character set, including newline characters, but excluding the "end comment" delimiter (*/). Comments can occupy more than one line but cannot be nested.
Comments can appear anywhere a white-space character is allowed. Since the compiler treats a comment as a single white-space character, you cannot include comments within tokens. The compiler ignores the characters in the comment.
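A minimal sketch of the "single white-space character" rule, assuming a hypothetical function named comment_as_space: the comment below is the only separator between the keyword int and the identifier value, and the declaration still compiles because the comment is read as one space.

```c
/* Hypothetical example: a comment standing in for the space between two tokens. */
int comment_as_space(void)
{
    int/* read as one space */value = 5;
    return value;
}
```

Placing the same comment inside the identifier would split it into two tokens, which is why comments cannot appear within tokens.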
Use comments to document your code. This example is a comment accepted by the compiler:
/* Comments can contain keywords such as
   for and while without generating errors. */
Comments can appear on the same line as a code statement:
printf( "Hello\n" );  /* Comments can go here */
You can choose to precede functions or program modules with a descriptive comment block:
/* MATHERR.C illustrates writing an error routine 
 * for math functions. 
 */ 
Since comments cannot contain nested comments, this example causes an error:
/* Comment out this routine for testing 

   /* Open file */
    fh = _open( "myfile.c", _O_RDONLY );
    .
    .
    .
 */
The error occurs because the compiler recognizes the first */, after the words Open file, as the end of the comment. It tries to process the remaining text and produces an error when it finds the */ outside a comment.
While you can use comments to render certain lines of code inactive for test purposes, the preprocessor directives #if and #endif and conditional compilation are a useful alternative for this task. For more information, see Preprocessor Directives in the Preprocessor Reference.
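The following sketch shows one way the #if and #endif approach might look. The function name active_value is an assumption made for this example. Because the disabled region is removed by the preprocessor rather than by a comment, it can safely contain comments of its own.

```c
/* Hypothetical example: disabling code with #if 0 instead of a comment. */
int active_value(void)
{
    int value = 1;

#if 0
    /* Open file */
    value = 2;          /* this line is never compiled */
#endif

    return value;
}
```

Changing #if 0 to #if 1 re-enables the block without touching the comments inside it.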
Microsoft Specific
The Microsoft compiler also supports single-line comments preceded by two forward slashes (//). If you compile with /Za (ANSI standard), these comments generate errors. These comments cannot extend to a second line.
// This is a valid comment
Comments beginning with two forward slashes (//) are terminated by the next newline character that is not preceded by a backslash. In the next example, the newline character is preceded by a backslash (\), which acts as a line-continuation character. The compiler therefore treats the next line as part of the previous line. (For more information, see Escape Sequences.)
// my comment \
    i++; 
Therefore, the i++; statement is commented out.
Microsoft extensions are enabled by default in Microsoft C. Use /Za to disable these extensions.
END Microsoft Specific

* C Keywords
      "Keywords" are words that have special meaning to the C compiler. In translation phases 7 and 8, an identifier cannot have the same spelling and case as a C keyword. (See a description of translation phases in the Preprocessor Reference; for information on identifiers, see Identifiers.) The C language uses the following keywords:
auto
double
int
struct
break
else
long
switch
case
enum
register
typedef
char
extern
return
union
const
float
short
unsigned
continue
for
signed
void
default
goto
sizeof
volatile
do
if
static
while
You cannot redefine keywords. However, you can specify text to be substituted for keywords before compilation by using C preprocessor directives.
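As a hedged sketch of the substitution mentioned above, the macro below makes the keyword register expand to nothing before compilation. The function name sum_to is hypothetical. The example deliberately includes no standard headers, since the C standard restricts macros named like keywords when standard headers are in use.

```c
/* Hypothetical example: substitute empty text for the keyword "register". */
#define register

int sum_to(int limit)
{
    register int total = 0;     /* compiled as: int total = 0; */
    register int i;             /* compiled as: int i;         */

    for (i = 1; i <= limit; i++)
        total += i;
    return total;
}

#undef register                 /* restore the keyword for later code */
```

The keyword itself is not redefined; the preprocessor simply replaces its spelling before the compiler sees it.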
Microsoft Specific
The ANSI C standard allows identifiers with two leading underscores to be reserved for compiler implementations. Therefore, the Microsoft convention is to precede Microsoft-specific keyword names with double underscores. These words cannot be used as identifier names. For a description of the ANSI rules for naming identifiers, including the use of double underscores, see Identifiers.
The following keywords and special identifiers are recognized by the Microsoft C compiler:
__asm
dllimport 2
__int8
naked 2
__based 1
__except
__int16
__stdcall
__cdecl
__fastcall
__int32
thread 2
__declspec
__finally
__int64
__try
dllexport 2
__inline
__leave

1. The __based keyword has limited uses for 32-bit and 64-bit target compilations.
2. These are special identifiers when used with __declspec; their use in other contexts is not restricted.
Microsoft extensions are enabled by default. To ensure that your programs are fully portable, you can disable Microsoft extensions by specifying the /Za option (compile for ANSI compatibility) during compilation. When you do this, Microsoft-specific keywords are disabled.
When Microsoft extensions are enabled, you can use the keywords listed above in your programs. For ANSI compliance, most of these keywords are prefaced by a double underscore. The four exceptions, dllexport, dllimport, naked, and thread, are used only with __declspec and therefore do not require a leading double underscore. For backward compatibility, single-underscore versions of the rest of the keywords are supported.
END Microsoft Specific

* C Identifiers 
 "Identifiers" or "symbols" are the names you supply for variables, types, functions, and labels in your program. Identifier names must differ in spelling and case from any keywords. You cannot use keywords (either C or Microsoft) as identifiers; they are reserved for special use. You create an identifier by specifying it in the declaration of a variable, type, or function. In this example, result is an identifier for an integer variable, and main and printf are identifier names for functions.
#include <stdio.h>

int main( void )
{
    /* Placeholder value; a real program would obtain this from a file call. */
    int result = -1;

    if ( result != 0 )
        printf( "Bad file handle\n" );

    return 0;
}
Once declared, you can use the identifier in later program statements to refer to the associated value.
A special kind of identifier, called a statement label, can be used in goto statements. (Declarations are described in Declarations and Types. Statement labels are described in The goto and Labeled Statements.)
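
A small sketch of a statement label used with goto follows. The function name first_negative and the label name done are assumptions chosen for this example.

```c
/* Hypothetical example: "done" is a statement label targeted by goto. */
int first_negative(const int *values, int count)
{
    int index = -1;     /* -1 means no negative value was found */
    int i;

    for (i = 0; i < count; i++) {
        if (values[i] < 0) {
            index = i;
            goto done;  /* jump to the labeled statement below */
        }
    }

done:
    return index;
}
```

The label follows the same spelling rules as any other identifier and is followed by a colon where it is defined.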

 Syntax

identifier:
nondigit
identifier nondigit
identifier digit
nondigit: one of
_ a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
digit: one of
0 1 2 3 4 5 6 7 8 9
The first character of an identifier name must be a nondigit (that is, the first character must be an underscore or an uppercase or lowercase letter). ANSI allows six significant characters in an external identifier's name and 31 for names of internal (within a function) identifiers. External identifiers (ones declared at global scope or declared with storage class extern) may be subject to additional naming restrictions because these identifiers have to be processed by other software such as linkers.
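The grammar above can be sketched as a small checker. The function names here are hypothetical, and the letter range tests assume an ASCII-compatible character set. The sketch checks spelling only; it does not reject keywords or enforce any length limit.

```c
#include <stddef.h>

/* nondigit: underscore or an uppercase or lowercase letter (ASCII assumed). */
static int is_nondigit(char c)
{
    return c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

/* digit: one of 0 through 9. */
static int is_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Returns 1 if name matches the identifier grammar, 0 otherwise. */
int is_valid_identifier(const char *name)
{
    size_t i;

    /* The first character must be a nondigit; this also rejects "". */
    if (name == NULL || !is_nondigit(name[0]))
        return 0;

    /* Every later character must be a nondigit or a digit. */
    for (i = 1; name[i] != '\0'; i++) {
        if (!is_nondigit(name[i]) && !is_digit(name[i]))
            return 0;
    }
    return 1;
}
```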
Microsoft Specific
Although ANSI allows 6 significant characters in external identifier names and 31 for names of internal (within a function) identifiers, the Microsoft C compiler allows 247 characters in an internal or external identifier name. If you aren't concerned with ANSI compatibility, you can modify this default to a smaller or larger number using the /H (restrict length of external names) option.
END Microsoft Specific
The C compiler considers uppercase and lowercase letters to be distinct characters. This feature, called "case sensitivity," enables you to create distinct identifiers that have the same spelling but different cases for one or more of the letters. For example, each of the following identifiers is unique:
add
ADD
Add
aDD
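As a brief sketch, the four spellings above can be declared side by side as separate variables. The function name case_sensitive_sum is hypothetical.

```c
/* Hypothetical example: four distinct variables that differ only in case. */
int case_sensitive_sum(void)
{
    int add = 1;
    int ADD = 2;
    int Add = 3;
    int aDD = 4;

    return add + ADD + Add + aDD;   /* 1 + 2 + 3 + 4 */
}
```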
Microsoft Specific
Do not select names for identifiers that begin with two underscores or with an underscore followed by an uppercase letter. The ANSI C standard allows identifier names that begin with these character combinations to be reserved for compiler use. Identifiers with file-level scope should also not be named with an underscore and a lowercase letter as the first two letters. Identifier names that begin with these characters are also reserved. By convention, Microsoft uses an underscore and an uppercase letter to begin macro names and double underscores for Microsoft-specific keyword names. To avoid any naming conflicts, always select identifier names that do not begin with one or two underscores.
END Microsoft Specific
The following are examples of valid identifiers that conform to either ANSI or Microsoft naming restrictions:
j
count
temp1
top_of_page
skip12
LastNum
Microsoft Specific
Although identifiers in source files are case sensitive by default, symbols in object files are not. Microsoft C treats identifiers within a compilation unit as case sensitive.
The Microsoft linker is case sensitive. You must specify all identifiers consistently according to case.
The "source character set" is the set of legal characters that can appear in source files. For Microsoft C, the source set is the standard ASCII character set. The source character set and execution character set include the ASCII characters used as escape sequences. See Character Constants for information about the execution character set.
END Microsoft Specific
An identifier has "scope," which is the region of the program in which it is known, and "linkage," which determines whether the same name in another scope refers to the same identifier. These topics are explained in Lifetime, Scope, Visibility, and Linkage.

* C Constants 
  A "constant" is a number, character, or character string that can be used as a value in a program. Use constants to represent floating-point, integer, enumeration, or character values that cannot be modified.

Syntax

constant:
floating-point-constant
integer-constant
enumeration-constant
character-constant
Constants are characterized by having a value and a type. Floating-point, integer, and character constants are discussed in the next three sections. Enumeration constants are described in Enumeration Declarations.
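The sketch below uses one constant of each kind in a single expression. The enumeration color and the function constant_sum are hypothetical names, and the value of 'A' assumes an ASCII execution character set.

```c
/* Hypothetical enumeration: RED is 0, GREEN is 5, BLUE is 6. */
enum color { RED, GREEN = 5, BLUE };

double constant_sum(void)
{
    return 2.5      /* floating-point constant, type double        */
         + 10       /* integer constant, type int                  */
         + BLUE     /* enumeration constant, type int, value 6     */
         + 'A';     /* character constant, type int, 65 in ASCII   */
}
```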

* C String Literals 
A "string literal" is a sequence of characters from the source character set enclosed in double quotation marks (" "). String literals are used to represent a sequence of characters which, taken together, form a null-terminated string. You must always prefix wide-string literals with the letter L.

Syntax

string-literal:
" s-char-sequence opt "
L" s-char-sequence opt "
s-char-sequence:
s-char
s-char-sequence s-char
s-char:
any member of the source character set except the double quotation mark ("), backslash (\), or newline character
escape-sequence
The example below is a simple string literal:
char *amessage = "This is a string literal.";
All escape codes listed in the Escape Sequences table are valid in string literals. To represent a double quotation mark in a string literal, use the escape sequence \". The single quotation mark (') can be represented without an escape sequence. The backslash (\) must be followed with a second backslash (\\) when it appears within a string. When a backslash appears at the end of a line, it is always interpreted as a line-continuation character.
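A short sketch of these rules follows, using hypothetical names. The first literal contains an unescaped single quotation mark, two escaped double quotation marks, and one escaped backslash; the second is a wide-string literal with the L prefix.

```c
#include <stddef.h>
#include <string.h>

/* Stored text:  It's "C:\dir"  */
static const char quoted_path[] = "It's \"C:\\dir\"";

/* Wide-string literal; the array also holds the terminating null. */
static const wchar_t wide_text[] = L"wide";

/* Number of characters in quoted_path, not counting the terminator. */
size_t quoted_path_length(void)
{
    return strlen(quoted_path);
}

/* Number of wchar_t elements in wide_text, counting the terminator. */
size_t wide_text_count(void)
{
    return sizeof wide_text / sizeof wide_text[0];
}
```

Each escape sequence, such as \" or \\, occupies a single character in the stored string.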

* Punctuation and Special Characters
The punctuation and special characters in the C character set have various uses, from organizing program text to defining the tasks that the compiler or the compiled program carries out. They do not specify an operation to be performed. Some punctuation symbols are also operators (see Operators). The compiler determines their use from context.

Syntax

punctuator: one of
[ ]   ( )   { }   *   ,   :   =   ;   ...   #
These characters have special meanings in C. Their uses are described throughout this book. The pound sign (#) can occur only in preprocessing directives.
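As an illustration of how context determines the role of a symbol, the sketch below uses the asterisk in three ways. The function name punctuator_demo is hypothetical.

```c
/* Hypothetical example: the same symbol as punctuator and as operator. */
int punctuator_demo(void)
{
    int values[3] = { 1, 2, 3 };    /* [ ] { } , = ; organize the declaration */
    int *first = values;            /* * here is part of the declarator       */

    /* The first * dereferences the pointer; the second * multiplies. */
    return *first * values[2];
}
```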
