Free TextTransformer Projects


c_pp gui
for the donationcoder N.A.N.Y. Programming Challenge
download (1200 KB): c_pp.zip
Last Update: 11 July 2008

c_pp


c_pp.ttp is the name of a TextTransformer project which imitates a C preprocessor. With c_pp, C++ files can be converted into their preprocessed form, as they are "seen" by the compiler: preprocessor directives are removed, include files are included, defined names are replaced, undefined regions are removed and macros are expanded. In contrast to the existing preprocessors of the various compiler vendors, c_pp does not merely create an intermediate sequence of tokens, but real text.


The name "c_pp"

The name "c_pp" stands for C preprocessor. The underscore distinguishes the name from an existing C++ parser named "Cpp".


History of the c_pp project

The original version of this C++ preprocessor was developed to prepare the translation of a company's software, written in C++, into Java. So the aim was not to produce a general preprocessor that copes with every possible trick of preprocessor metaprogramming. The aim was rather pragmatic: the preprocessor directives in a finite number of files were to be replaced in a way that preserved their meaning.

  • "real" C++ constants were inserted in the code for defined constants
  • quite a number of macros were not resolved but replaced by functions
  • comments were left in the code
  • headers of system files and library files were not included; their contents were to be replaced directly by their Java counterparts
  • for every company header a corresponding preprocessed header was produced, so the include directives for these headers were left in the source code

These special treatments, tailored to the company software in question, were removed from the c_pp project published here. However, corresponding special treatments for other translation projects can easily be added again.


Possible applications of c_pp

Other applications of c_pp are conceivable in addition to the task just described. For example, it could be used to test whether preprocessor directives actually produce the expected code. There are so many pitfalls that a long section of the GNU preprocessor manual is dedicated to them:

http://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html#Macro-Pitfalls

Even correctly written directives have the disadvantage that they are difficult to debug. That is why Scott Meyers advises, in the first chapter of his well-known book "Effective C++": "prefer the compiler to the preprocessor". So another conceivable application of the c_pp project is to actually transform C++ files into new files in which the preprocessor directives have been resolved.



Remarks on the construction of the project and on standards conformance


c_pp is a nearly standards-conformant implementation of the mandated C99/C++ preprocessor functionality. The deviations are discussed in the following annotations, which are ordered like the excellent introduction to the C preprocessor at:

http://gcc.gnu.org/onlinedocs/cpp/

They follow the order of "cpp_info".


Initial processing


1. Reading the input file

The file is read continuously, without first breaking it into lines as other preprocessors do. Spaces, tabs and comments are ignored.


2. Trigraphs

c_pp doesn't handle trigraphs.

3. Continued lines

A backslash `\' at the end of a line is removed, together with any spaces that follow it.
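
The following Python sketch illustrates the idea of continued lines. It is not taken from c_pp.ttp: the function name `splice_lines` is hypothetical, and the sketch assumes that the line break after the backslash is dropped as well, so that the two physical lines become one logical line.

```python
import re

# Assumption: a continuation is a backslash, optional spaces or tabs,
# and the line break that follows; the whole sequence is dropped.
CONTINUATION = re.compile(r"\\[ \t]*\r?\n")

def splice_lines(text: str) -> str:
    """Join physical lines that end in a backslash into one logical line."""
    return CONTINUATION.sub("", text)

if __name__ == "__main__":
    source = "#define SUM(a, b) \\  \n    ((a) + (b))\n"
    print(splice_lines(source))
```

A backslash in the middle of a line, for example inside a string literal, is left untouched because it is not followed by a line break.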

4. Comments

All comments are replaced with single spaces in the production "comment". "comment" is set as an inclusion production in c_pp, a special TextTransformer feature for handling comments and similar elements easily.
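
As a rough illustration of replacing each comment with a single space, here is a Python sketch. It does not use TextTransformer's inclusion mechanism; the name `strip_comments` is hypothetical, and string and character literals are matched first so that comment markers inside them are preserved.

```python
import re

# Literals are listed before comments so that "/*" inside a string
# is consumed as part of the string and never treated as a comment.
TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'      # string literal
    r"|'(?:\\.|[^'\\])*'"     # character literal
    r"|/\*.*?\*/"             # block comment
    r"|//[^\n]*",             # line comment
    re.DOTALL,
)

def strip_comments(text: str) -> str:
    """Replace every C or C++ comment with one space; keep literals as they are."""
    def repl(match):
        piece = match.group(0)
        return " " if piece.startswith("/") else piece
    return TOKEN.sub(repl, text)

if __name__ == "__main__":
    print(strip_comments("int a; /* first */ int b; // second\n"))
```

Comment markers that are split across lines with backslash-newline are not recognized by this sketch either.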

"Extremely confusing" tricks, like splitting `/*', `*/' and `//' across multiple lines with backslash-newline, aren't handled correctly by c_pp.ttp.

5. Line breaks

In TextTransformer projects, the regular expression for the ignored characters often contains line breaks too. However, line breaks play an important role in the grammar of the C preprocessor, so the places where they may occur are specified explicitly.

Tokenization

In the 1999 C standard, identifiers may contain letters which are not part of the ASCII character set. c_pp cannot handle such identifiers.



Header Files

Include Syntax

Include directives for user and system header files, like

`#include "FILE"'

or

`#include <FILE>'

are both recognized by the expression

PD_INCLUDE ::= #\s*include\s*("([^"]+)"|<([^>]+)>)

The first sub-expression of this expression (in TextTransformer notation: 'xState.str(1)') gives the included file 'FILE'.
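
The expression can be tried out with Python's `re` module, as in the sketch below. The helper name `included_file` is hypothetical. Note that under Python's group numbering, group 1 still carries the surrounding quotes or angle brackets, while groups 2 and 3 hold the bare file name; whether TextTransformer numbers its sub-expressions the same way is not verified here.

```python
import re

# The PD_INCLUDE expression from the text, unchanged.
PD_INCLUDE = re.compile(r'#\s*include\s*("([^"]+)"|<([^>]+)>)')

def included_file(line: str):
    """Return the bare file name of an include directive, or None."""
    match = PD_INCLUDE.search(line)
    if match is None:
        return None
    # Group 2 is filled for "FILE", group 3 for <FILE>.
    return match.group(2) or match.group(3)

if __name__ == "__main__":
    for line in ('#include "myheader.h"', "#  include <vector>", "int x;"):
        print(repr(line), "->", included_file(line))
```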


Include Operation


Every time c_pp finds an include directive, the function

'scan_include_file'

is called. In this function the file is loaded with 'load_file'. Then the file is processed with the production 'header' in the same way as the original file, i.e. the preprocessed text of the included file is appended to the text already generated from the original file. After the processing of the included file is complete, processing of the including file continues. If the included file contains include directives too, inclusion is carried out analogously at the next higher level. An integer parameter for the current level is incremented each time 'header' is called.

Whether a file is really included is controlled by the function 'ReallyInclude'. The C++ to Java translator mentioned at the beginning only included headers belonging directly to the source file.
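
The recursive inclusion with a level counter can be sketched in Python as follows. This is an illustration under assumptions, not the c_pp code: files are held in a dictionary instead of being loaded from disk, the names `process`, `really_include` and `max_level` are hypothetical, and directives for files that are not included are simply left in the output.

```python
import re

INCLUDE = re.compile(r'^\s*#\s*include\s*(?:"([^"]+)"|<([^>]+)>)')

def process(name, files, level=0,
            really_include=lambda target, level: True, max_level=20):
    """Return the text of files[name] with its includes expanded recursively."""
    if level > max_level:  # assumed guard against files that include themselves
        raise RecursionError("include nesting too deep: " + name)
    out = []
    for line in files[name].splitlines():
        match = INCLUDE.match(line)
        if match:
            target = match.group(1) or match.group(2)
            if target in files and really_include(target, level + 1):
                # The included text is appended first; afterwards the
                # including file continues where it left off.
                out.append(process(target, files, level + 1,
                                   really_include, max_level))
                continue
        out.append(line)
    return "\n".join(out)

if __name__ == "__main__":
    files = {
        "main.cpp": '#include "a.h"\nint main_var;',
        "a.h": '#include "b.h"\nint a;',
        "b.h": "int b;",
    }
    print(process("main.cpp", files))
```

Passing `really_include=lambda target, level: level <= 1` restricts inclusion to headers named directly in the source file, similar in spirit to the behaviour described for the C++ to Java translator.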


Search Path

c_pp presently does not distinguish between system and user headers. In both cases the headers are looked up in the same list of directories. This list is held in the vector

m_vIncludeDirs

The list can be passed to the project as a configuration parameter. Depending on how the TextTransformer project is executed, the configuration parameter has to be set in the project options (for the working space of the IDE), in the transformation manager or as a command line parameter.
Each directory of the list has to be put on a line of its own in the configuration string, e.g.

D:\Tetra\Projects\Divers\Cpp
C:\Programme\Borland\CBuilder6\Include

Before the c_pp preprocessor starts, this list is parsed with the production 'IncludePaths' to fill the vector m_vIncludeDirs.

  (
    SKIP {{ AddIncludeDir(trim_copy(xState.str())); }}
    ( EOL | EOF )
  )*

The sub-parser is called in the Init function:

IncludePaths(ConfigParam());

In addition the root path of the source file is set as an include path with:

m_vIncludeDirs.push_back(SourceRoot());
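
A Python sketch of the same idea: one directory per line of the configuration string, trimmed and collected into a list, with the source root appended afterwards. The function names `parse_include_paths` and `find_header` are hypothetical, and searching the directories in list order is an assumption about how the lookup works.

```python
from pathlib import Path

def parse_include_paths(config: str, source_root: str) -> list:
    """Collect one directory per non-empty line, then add the source root."""
    dirs = [line.strip() for line in config.splitlines() if line.strip()]
    dirs.append(source_root)
    return dirs

def find_header(name: str, include_dirs: list):
    """Return the first existing path for the header, or None if not found."""
    for directory in include_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None

if __name__ == "__main__":
    config = "D:\\Tetra\\Projects\\Divers\\Cpp\nC:\\Programme\\Borland\\CBuilder6\\Include\n"
    print(parse_include_paths(config, "."))
```

Because system and user headers share one list, the same function serves both `#include "FILE"` and `#include <FILE>`.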

Once-Only Headers

The names of preprocessed headers are stored in the map

m_mHeaderPaths

The preprocessor could be accelerated if files already contained in this map were not parsed again. This method, however, would not be strictly correct, since the set of defined names can change between two inclusions of the same file. The same file can therefore yield a different result when it is processed again.



Macros

Macros are abbreviations for code fragments and are defined by the preprocessor directive '#define'. Function-like macros have parentheses and possibly arguments, while object-like macros are simply identifiers.

Macro definitions are parsed in the production 'definition'. It starts with the token

PD_DEFINE ::= #\s*define

For object-like macros a simple identifier 'ID' follows. If, however, a token like

MACRO_DEF_BEGIN ::= (\w+)\(

follows, it is a function-like macro definition.

If c_pp finds a macro call, the macro is expanded: arguments are evaluated, macro arguments preceded by '#' are stringified, and '##' concatenates tokens. Variadic macros aren't supported. Macros can be undefined or redefined. c_pp does not produce a warning if a new definition differs from the original.
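
The distinction between the two kinds of definition, and the effect of '#' and '##', can be sketched in Python. The two regular expressions are the ones given above; everything else (`parse_definition`, `expand_call`, the `PIECE` expression) is hypothetical and much simpler than a real preprocessor: arguments are not expanded recursively, string literals in the macro body are not protected, and stringification does not escape quotes.

```python
import re

PD_DEFINE = re.compile(r"#\s*define")
MACRO_DEF_BEGIN = re.compile(r"(\w+)\(")
ID = re.compile(r"\w+")
# '##' first, then '#param', then plain identifiers.
PIECE = re.compile(r"\s*##\s*|#\s*(\w+)|(\w+)")

def parse_definition(line):
    """Return (name, params, body); params is None for object-like macros."""
    text = line.strip()
    start = PD_DEFINE.match(text)
    if start is None:
        return None
    rest = text[start.end():].lstrip()
    func = MACRO_DEF_BEGIN.match(rest)
    if func:  # the name is directly followed by '(': function-like macro
        close = rest.index(")", func.end())
        params = [p.strip() for p in rest[func.end():close].split(",") if p.strip()]
        return func.group(1), params, rest[close + 1:].strip()
    ident = ID.match(rest)
    if ident is None:
        return None
    return ident.group(0), None, rest[ident.end():].strip()

def expand_call(params, body, args):
    """Substitute the arguments into the body, handling '#' and '##'."""
    table = dict(zip(params, args))
    def repl(match):
        if match.group(1) is not None:   # '#param' becomes a string literal
            name = match.group(1)
            return '"%s"' % table[name] if name in table else match.group(0)
        if match.group(2) is not None:   # identifier: parameter or other word
            return table.get(match.group(2), match.group(2))
        return ""                        # '##' glues its neighbours together
    return PIECE.sub(repl, body)

if __name__ == "__main__":
    name, params, body = parse_definition("#define SQR(x) ((x) * (x))")
    print(name, params, expand_call(params, body, ["a + 1"]))
```

A definition such as `#define F (x)` is classified as object-like, because the parenthesis does not follow the name directly.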



Conditionals

The simplest sort of conditional is

#ifdef MACRO

CONTROLLED TEXT

#endif /* MACRO */

CONTROLLED TEXT will be included in the output of the preprocessor if and only if MACRO is defined. The CONTROLLED TEXT inside a conditional can include preprocessing directives. They are executed only if the conditional succeeds. You can nest conditional groups inside other conditional groups.
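
Nested conditional groups can be handled with a stack, as in the following Python sketch. It is an assumption-laden illustration rather than the c_pp grammar: the name `preprocess_conditionals` is hypothetical, only `#ifdef`, `#ifndef`, `#else`, `#endif`, `#define` and `#undef` are recognized, and all other lines are treated as plain text.

```python
import re

DIRECTIVE = re.compile(
    r"^\s*#\s*(ifdef|ifndef|else|endif|define|undef)\b\s*(\w*)")

def preprocess_conditionals(text, defined=()):
    """Keep only the lines whose enclosing conditional groups all succeed."""
    names = set(defined)
    stack = []   # one entry per open conditional: does this group succeed?
    out = []
    for line in text.splitlines():
        match = DIRECTIVE.match(line)
        active = all(stack)          # text counts only if every level succeeds
        if match is None:
            if active:
                out.append(line)
            continue
        kind, name = match.groups()
        if kind == "ifdef":
            stack.append(name in names)
        elif kind == "ifndef":
            stack.append(name not in names)
        elif kind == "else":
            if not stack:
                raise ValueError("#else without an open conditional")
            stack[-1] = not stack[-1]
        elif kind == "endif":
            if not stack:
                raise ValueError("#endif without an open conditional")
            stack.pop()
        elif active and kind == "define":   # executed only in active text
            names.add(name)
        elif active and kind == "undef":
            names.discard(name)
    if stack:
        raise ValueError("unterminated conditional")
    return "\n".join(out)

if __name__ == "__main__":
    source = "#ifdef A\na\n#ifdef B\nb\n#endif /* B */\n#endif /* A */\nc"
    print(preprocess_conditionals(source, {"A"}))
```

Running the same text with a different set of defined names gives a different result, which is why a directive inside a failed group must not be executed.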



Diagnostics

Line Control


Pragmas


Other Directives

Preprocessor Output

Links

A C preprocessor written in Pascal by Dr. Hans-Peter Diettrich
http://members.aol.com/vbdis/

A standards-conformant implementation of the mandated C99/C++ preprocessor functionality, written in C++ by Hartmut Kaiser
http://www.boost.org/libs/wave/index.html


