CTA is a C Call-Tree-Analysis tool.

This utility examines a set of source files, and builds a database
describing each file, function, and all function calls made within that
function. You can then make queries on that database to see the complete
function call tree above or below a specific function. Although it does
have limitations (see below), the information it can provide is immensely
useful when tracking down all kinds of problems relating to the static
call tree, such as proving that a function can or cannot be called below
a specific point, finding the entry points which can lead to a specific
function, resolving race conditions, access conflicts, non-reentrancy
problems and many others.


To use it, first prepare a FILES.IDX file with a list of all the files
that you want to analyze. I have provided a simple utility to assist in
creating this index file for an existing directory structure:

  Call-Tree-Files - Dave Dunfield - Jan 16 2007

  Use:    CTF pattern [options]

  opts:   /D      - remove Directory in display names
          /L      - add Long filenames to display
          /Q      - Quiet (less informational output)
          /S      - scan Subdirectories

For most projects, the procedure will be to:

Go to the root of the project that you wish to index, then execute
the command: CTF *.C /L /S

This will build a FILES.IDX file containing the names (short and long)
of all .C files occurring in the directory tree.

You may have to edit FILES.IDX and remove any .C files which are not part
of the actual software build. At this time you can also remove any .C files
that for whatever reason you don't want scanned.

Note that the format of each line in FILES.IDX is:

  <short-name> [display-name]

where:

<short-name> is the 8.3 format filename required by the Call-Tree-Analysis
utility (it's a 16-bit DOS program and can't work with long filenames).

[display-name] is optional, and may contain any name you wish to be displayed
for a given file instead of <short-name>. If you use the /L option with CTF
it will automatically insert the full Windows long filename as the
[display-name] for each file.
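
As a rough illustration of the line format above, here is a minimal C sketch
that splits one FILES.IDX line into its two fields. It assumes the fields are
separated by spaces or tabs and that the short name itself contains no spaces.
The names idx_entry and parse_idx_line, and the 128-byte buffer, are
hypothetical and are not part of CTF or CTA.

```c
#include <string.h>

/* One parsed FILES.IDX line (hypothetical structure, for illustration). */
struct idx_entry {
    char        short_name[128];    /* 8.3 name, possibly with a path */
    const char *display_name;       /* rest of the line, or NULL if absent */
};

/* Split "<short-name> [display-name]". Returns 1 on success, 0 if the
   line is empty or the short name does not fit in the buffer. */
int parse_idx_line(const char *line, struct idx_entry *out)
{
    size_t len = strcspn(line, " \t\r\n");  /* length of the first field */

    out->display_name = NULL;
    if (len == 0 || len >= sizeof out->short_name)
        return 0;
    memcpy(out->short_name, line, len);
    out->short_name[len] = '\0';

    line += len;
    line += strspn(line, " \t");            /* skip the separator */
    if (*line != '\0' && *line != '\r' && *line != '\n')
        out->display_name = line;           /* points into the caller's line */
    return 1;
}
```

Note that display_name points into the original line, so any trailing
newline is still attached and the caller must keep the line buffer alive.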



Once you have the list of files, you can begin using my Call-Tree-Analysis
utility:

  Call-Tree-Analysis - Dave Dunfield - Jan 16 2007

  Use:    CTA /B file|@list ...   - Build call-tree database
          CTA function-name       - show call-tree for function
          CTA /M                  - show database Memory use

  opts:   /F                      - Full references (can be long)
          /Q                      - Quiet (less informational output)
          /R                      - Reverse lookup
          /W(0-3)                 - Warning level                 [/W1]
          D=filename[.CTA]        - Database file                 [DATABASE.CTA]
          E=filename[.TXT]        - Error/warning output file     [stderr]
          I=indent                - Indent size                   [72]
          O=filename[.TXT]        - Output file (scanning)        [none]
          O=filename[.CTO]        - Options file (/B)             [none]
          T=column                - Tab column                    [74]

The first thing you must do is build the Call-Tree database.
Use the command: CTA /B @FILES

This will read FILES.IDX and parse every file listed therein, making a list
of each function occurring and each function called, which is written to
DATABASE.CTA. The format of DATABASE.CTA is quite simple:

  :filename_1
  defined_function_1
   called_function_1
   called_function_2
  defined_function_2
   called_function_1
   called_function_2
  :filename_2
   .. etc ..

This database will describe every function called by every other function
in every file of the project (it can get quite large).
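
If you want to post-process or hand-check the database, the three kinds of
line can be told apart by their first character. The following C sketch
assumes that the two leading spaces in the listing above are only document
indentation, so that a file line starts with ':', a called-function line
starts with a single space, and anything else is a defined function. The
names line_kind and classify_line are hypothetical and are not part of CTA.

```c
#include <stddef.h>

/* Kinds of line in DATABASE.CTA, following the layout described above. */
enum line_kind {
    LINE_BLANK,     /* empty line (assumed to be ignorable) */
    LINE_FILE,      /* ":filename" - starts the entries for a source file */
    LINE_DEFINED,   /* "name"      - function defined in that file */
    LINE_CALLED     /* " name"     - function called by the last defined one */
};

/* Classify one database line by its first character (assumed layout). */
enum line_kind classify_line(const char *line)
{
    if (line == NULL || *line == '\0' || *line == '\n' || *line == '\r')
        return LINE_BLANK;
    if (*line == ':')
        return LINE_FILE;
    if (*line == ' ' || *line == '\t')
        return LINE_CALLED;
    return LINE_DEFINED;
}
```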

See the notes under "Limitations" below for information on certain things
to watch out for when building the database. If necessary, you can manually
"tweak" the database file after running CTA /B.



To query the database, use: CTA function_name

This will show the call-tree for all functions called by
'function_name'

To do a reverse query, use: CTA function_name /R
This will show the call-tree for all functions which call
'function_name'


Examples: Analysis of any arbitrary function in our source tree usually
results in many pages of output, however I have hunted around in a project
I'm currently working on and found "NetworkFailure", a function which
occurs within a very small call tree, which I will use as an example:


Here is the output of a forward scan of NetworkFailure - this
depicts the call tree BELOW the function:


Line  File  Lvl  Functions called by: NetworkFailure
------------------------------------
1           1    BITSET
2     1     1    ProcessMessageLight
3           2      BITTEST2
4     2     1    send_log_event

Files:
1     Project\Source\Ctrl.c
2     Project\Source\Prot.c


Here we see the following trees:
  NetworkFailure->BITSET
  NetworkFailure->ProcessMessageLight->BITTEST2
  NetworkFailure->send_log_event

In this example, BITSET and BITTEST2 are actually macros, which are
called in a manner that makes them look like functions to my
parser - note that no file number is given, which means that the
parser was not able to identify these functions as being defined
within any file in the project. The call-tree database creation
utility does have a feature which allows you to maintain a file of
names to omit from the call-tree database (see "Options file" below).
We should maintain such a file of MACRO and other names with which we
do not wish to clutter the database.
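
To make the macro case concrete, here is a small C sketch of a parameterized
macro that reads like a function call. The macro body, the status_flags
variable and the mark_network_down function are hypothetical (the real
BITSET definition is not shown in this document); only the shape of the call
site matters.

```c
/* Hypothetical macro body, for illustration only. */
#define BITSET(var, bit)   ((var) |= (1u << (bit)))

static unsigned status_flags;   /* hypothetical global */

/* To a parser that looks for a symbol followed by '(' the line below
   reads as a call to a function named BITSET, even though the compiler
   expands it inline and no such function exists. */
unsigned mark_network_down(void)
{
    BITSET(status_flags, 3);
    return status_flags;
}
```

Listing BITSET in the options file used with CTA /B is the way to keep
such names out of the database.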

Although it seems odd - "send_log_event" does indeed end the call
tree without calling any other functions - it inserts the log event
into a global queue which exists as data variables in RAM.


Here is the output of a reverse scan of NetworkFailure - this
depicts the call tree ABOVE the function:


Line  File  Lvl  Functions calling: NetworkFailure
----------------------------------
1     1     1    EthernetInterrupt
2     2     2      OneMsSec
3     1     1    receive_packet
4     1     2      EthernetInterrupt . . . . . . . . . . . . . . . . . >1
5     1     1    packet_timer
6     2     2      OneMsSec . . . . . . . . . . . . . . . . . . . . . .>2

Files:
1     Project\Source\Prot.c
2     Project\Source\Ctrl.c


Here we see the following trees:
  OneMsSec->EthernetInterrupt->NetworkFailure
  OneMsSec->EthernetInterrupt->receive_packet->NetworkFailure
  OneMsSec->packet_timer->NetworkFailure

Note that this is a "compressed" display where duplicated sections of the
call-tree are depicted by referencing the previously occurring display of
the higher portion - the tool can also generate a /Fully expanded call-tree,
however this gets VERY large with a complex project.

Note that when CTA detects a recursive calling loop, it always indicates
this as a reference back to the recursive calling point using a special
'^' indicator.
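
For reference, a recursive calling loop does not have to be a function
calling itself directly. The following C sketch shows an indirect loop
between two hypothetical functions, is_even and is_odd; a forward scan of
either one arrives back at its own starting point.

```c
int is_odd(unsigned n);     /* prototype needed for the mutual recursion */

/* is_even calls is_odd ... */
int is_even(unsigned n)
{
    return (n == 0) ? 1 : is_odd(n - 1);
}

/* ... and is_odd calls is_even, which closes the loop. */
int is_odd(unsigned n)
{
    return (n == 0) ? 0 : is_even(n - 1);
}
```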


Options file:
-------------
The O=filename command line option defines an "options" file which
allows you to control how certain symbols are processed by CTA when
building (/B) the database:
- Symbols may be defined one per line which causes CTA to treat these
  names the same as reserved words - ie: NOT try and resolve them as
  a function definition or reference. This is useful to exclude the
  names of parameterized macros, and external functions you do not
  wish to track from the database.
- Macro names may be defined as above by preceding them with '#', and
  following them with an optional value. These will be used in the
  processing of #if/#ifdef/#ifndef conditional blocks. NOTE: Macro names
  are processed for conditional directives only, and are NOT replaced
  in the source code - Macro names may only be assigned a numeric value
  in the range of (-32768 to 32767).
        Debug             <- Exclude symbol 'Debug'
        Setbit            <- Exclude symbol 'Setbit'
        Clrbit            <- Exclude symbol 'Clrbit'
        #DEBUG_MODE       <- Define 'DEBUG_MODE' with value 0
        #DEBUG_LEVEL 2    <- Define 'DEBUG_LEVEL' with value 2


Static Functions:
-----------------
Analysing a full project call-tree is complicated by Static functions,
which can have the same name in separate modules. CTA handles static
function names by prefixing them with 'filename:', so the function:

    static func myfunc()

occurring in MYFILE.C would appear in the database as MYFILE.C:myfunc

To properly reference static functions, CTA must encounter the function
definition, or a prototype for the function including the "static" keyword
before any references are processed.
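
A minimal C sketch of the ordering that satisfies this rule: the static
function is used before it is defined, so a prototype carrying the "static"
keyword is placed ahead of the first reference. The function names scale and
scaled_sum are hypothetical.

```c
/* Prototype with "static", placed before the first reference. */
static int scale(int value);

int scaled_sum(int a, int b)
{
    return scale(a) + scale(b);     /* references to the static function */
}

/* The definition follows its first use, which is legal C as long as the
   prototype above has been seen. */
static int scale(int value)
{
    return value * 2;
}
```

Without the prototype line (or with a prototype that omits "static"), the
references in scaled_sum would be met before CTA has any way to know that
scale is static.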


#included C files:
------------------
Sometimes, one C file #includes another - this confuses CTA with regard
to static functions, because it does not know that the multiple C files
represent a single compilation unit (and therefore share static functions).
You can manually force CTA to process multiple files as a single source
unit by placing a '+' at the beginning of any filenames in the .IDX file
which are to be "included" with the preceding file as one compilation
unit. In this case, the filename prepended to static functions will be
the name of the first file in the group.

eg:
    myfunc.c
    +myfunc1.c
    +myfunc2.c

In the above example, all three files, myfunc.c, myfunc1.c and myfunc2.c
will be scanned as a single compilation unit, and any static functions
occurring in any of the three files will have the prefix "myfunc.c".

Note that order is important - if myfunc.c #includes myfunc1.c at an
early point, and subsequently references a static function defined in
myfunc1.c, this will not be known to CTA unless it processes myfunc1.c
BEFORE the reference in myfunc.c. In this case, you might re-order the
files as:

    myfunc1.c
    +myfunc.c
    +myfunc2.c


#preprocessor conditionals
--------------------------
Blocks of code conditionally compiled with the preprocessor directives
#if/#ifdef/#ifndef/#else/#elif/#endif can cause problems with CTA.
CTA implements a simple processor for these preprocessor conditionals.
Please note:
- Other preprocessor macros such as #define are NOT processed by
  CTA - MACRO names used in conditionals must be defined in the Options
  (O=) file.
- #ifdef/#ifndef/defined(....) assume a macro name is NOT defined
  unless it has been defined in the Options (O=) file.
- #if assumes undefined names == 0
- #if supports the following value element types:
     n    - Decimal number      (sub)  - Sub-expression as value
     0n   - Octal number        -value - Negation of any other value
     0xn  - Hexadecimal number  ~value - Complement of any other value
     name - Macro name value    !value - Logical NOT of any other value
     defined(name)  - True (1) if indicated name is defined.
- #if supports the following two-operand operations:
     +    - Addition             ==    - Test for equal
     -    - Subtraction          !=    - Test for not-equal
     *    - Multiplication       <     - Test for less than
     /    - Division             >     - Test for greater than
     %    - Modulus              <=    - Test for less than or equal
     &    - Bitwise AND          >=    - Test for greater or equal
     |    - Bitwise OR           <<    - Shift left          
     ^    - Bitwise XOR          >>    - Shift right
     &&   - Logical AND          ||    - Logical OR
- #if works with 16-bit signed integers only (-32768 to 32767)
- The /C option causes CTA to ignore conditionals and process ALL blocks.
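
As an illustration of the points above, here is a small C sketch of a
conditional that uses only the operators listed. The macro names HAS_DISPLAY
and DEBUG_LEVEL and the three functions are hypothetical. Because CTA does
not process #define, the two #define lines below satisfy the compiler only;
for CTA the same names would also have to appear in the options (O=) file,
for example as "#HAS_DISPLAY 1" and "#DEBUG_LEVEL 2".

```c
#define HAS_DISPLAY 1   /* seen by the compiler, not by CTA */
#define DEBUG_LEVEL 2   /* seen by the compiler, not by CTA */

int log_to_display(int code) { return code + 100; }    /* hypothetical */
int log_to_serial(int code)  { return code + 200; }    /* hypothetical */

int report(int code)
{
#if defined(HAS_DISPLAY) && (DEBUG_LEVEL >= 2)
    return log_to_display(code);    /* kept when the condition is true */
#elif DEBUG_LEVEL != 0
    return log_to_serial(code);
#else
    return code;
#endif
}
```

If neither name is given in the options file, the rules above mean that
defined(HAS_DISPLAY) is false and DEBUG_LEVEL evaluates to 0, so only the
final #else branch would be scanned.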


Limitations:
------------
CTA is not a full C parser - it contains a high-speed simplified parser
that can identify most function definitions and references, however it
can be confused by "odd" C constructs.  It also does not detect erroneous
source code - it is assumed that your program compiles and links without
error before you perform a Call-Tree Analysis.

CTA cannot track dynamic portions of the call tree - ie: calls occurring
through function pointers.
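
A short C sketch of such a dynamic call, using hypothetical names
(handler_fn, handle_start, handle_stop, dispatch). The handler names appear
only as data in a table, and the call itself is made through an array
element, so the name of the function being called never appears in front
of a '('.

```c
typedef int (*handler_fn)(int);

int handle_start(int arg) { return arg + 1; }
int handle_stop(int arg)  { return arg - 1; }

/* The handlers are referenced here as data, not called. */
static const handler_fn handlers[] = { handle_start, handle_stop };

int dispatch(unsigned index, int arg)
{
    /* The call target is chosen at run time from the table. */
    return handlers[index % 2](arg);
}
```

In a case like this the link from dispatch to handle_start and handle_stop
has to be added by hand if it matters to the analysis.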

CTA cannot track calls occurring through non-C source files - the most
common occurrence of this is assembly language files or other languages.

CTA cannot track the movement of data beyond the end of a call tree.
For example, the log event and message light status in the examples
above are moved into queues to trigger activity by another function
at some other point in the code.

CTA can be tricked by certain preprocessor constructs:

- CTA keys off the sequence <symbol> <opening-round-bracket> to
  detect function calls. Use of the preprocessor to hide this
  construct, eg: #define callfunc func()
  will prevent CTA from detecting this function call.

- CTA knows all C keywords, and watches for certain ones - use of
  preprocessor to disguise keywords can confuse CTA.

- CTA counts the occurrences of '{' and '}' to know when a function
  definition is active. Use of the preprocessor can invalidate this,
  typically in a conditional block such as:

    #if DEBUG
       int debug_global;
       void function(parameters)
       {
            int debug_local;
    #else
       void function(parameters)
       {
    #endif

  This can confuse CTA if both blocks get processed (/C), which will generate
  an error message indicating "End of file at level n" where 'n' is the number
  of {} levels outstanding when the file ended.

  If conditionals cause an excess of '}'s to be detected, CTA will issue
  the error message "'}' occured at level 0" - indicating that an extra
  '}' was detected where no matching '{' had been processed.

  In cases such as these, you would have to move the '{' or '}'s outside
  of the conditional blocks or otherwise ensure that the blocks can
  always be balanced by counting '{'s and '}'s.
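
One possible way to restructure the conditional example above so that the
braces balance no matter which blocks are scanned is sketched below. The
function header and its '{' appear once, outside any conditional, and only
the debug-specific lines stay inside #if blocks. The value given to DEBUG,
the int return type and the function body are assumptions made so that the
fragment compiles on its own.

```c
#define DEBUG 1     /* hypothetical setting, seen by the compiler only */

#if DEBUG
int debug_global;
#endif

/* One header and one '{', outside the conditional blocks. */
int function(int parameters)
{
#if DEBUG
    int debug_local = parameters;
    debug_global = debug_local;
#endif
    return parameters * 2;
}
```

Written this way, every '{' has exactly one matching '}' whether the
conditional blocks are evaluated or all processed with /C.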

Certain macros, casts and typedefs can resemble function invocations closely
enough to fool CTA - This can be resolved by using the O= (Options file)
option to specify the names that CTA will not try to process when you
build (/B) the database.


Errors and Warnings:
--------------------
To assist in detecting errors in parsing, CTA issues the following
errors and warnings. Errors are serious enough that CTA halts further
processing. Some warnings do not necessarily mean a problem, so they
are available in an optional range of warning levels - since the
level 1 warnings are almost always errors, it is not recommended to
set the warning level to 0 (no warnings).

ERR: Out of memory in segment <segment-name>

  The indicated storage segment has become exhausted. You will have
  to use CTA on a smaller set of files.

ERR: End of file searching for <item>

  CTA was searching forward for the indicated item and hit the end of
  the file without finding it. Normally this would be a '}', ')' or ']'
  character indicating the end of a block construct, or a ';' indicating
  the end of a statement.

ERR: Not found: <name>

  CTA was not able to find the referenced function in the database.

/W1: End of file in quoted value

  CTA encountered the end of the file while it was parsing a '.'
  quoted character.

/W1: End of file in string

  CTA encountered the end of the file while it was parsing a "..."
  quoted string.

/W1: End of file in comment

  CTA encountered the end of the file while it was processing a
  /* ... */ comment construct.

ERR: Unknown item: <char>

  CTA could not identify the character as part of a C construct.
  Typically this means that an illegal character (control or non-
  valid printable character) occurred in the C source file.

/W1: '}' occured at level 0

  Indicates that the { } blocks have become unbalanced in the file,
  and CTA found a '}' with no matching '{' - see limitations above.

/W1: End of file at level (n)

  Indicates that the { } blocks have become unbalanced in the file,
  and CTA encountered the end of the file while it still expected
  (n) blocks to be closed by the occurrence of '}' characters.

/W1: Call before def: name

  CTA found a construct that looks like a function call but no
  function definition was being processed.

/W1: Duplicate: name

  CTA found two definitions for the same function. Most likely cause is
  conditional blocks (since CTA does not perform preprocessor functions,
  it sees ALL the definitions). The two definitions were not sequential
  (one right after the other) which prevents CTA from processing them as
  one function (see below). All occurrences except for the first one
  encountered are ignored.

/W2: Sequential duplicate: name

  CTA found two sequential definitions for the same function name.
  This sometimes occurs when you have the same function declared
  multiple times in conditional blocks (or even just the function
  header). Since CTA does not process the conditional blocks, it
  sees ALL of the definitions.

  CTA handles this special case by NOT outputting the second definition,
  and placing all references from both functions into a single entry in
  the database. This is normally what you would want (to know any and
  all functions which could be called by either version).

  Note that if multiple conditional definitions are NOT sequential
  and consecutive in the source file (ie: other definitions occur
  between them), CTA cannot combine them, and the "Duplicate" warning
  above will occur.
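
  A C sketch of the sequential case, using hypothetical names (SAFE_BUILD,
  total, add_checked, add_unchecked). The two definitions of total sit one
  right after the other, with nothing but the #else between them.

```c
int add_checked(int a, int b)   { return a + b; }   /* hypothetical helper */
int add_unchecked(int a, int b) { return a + b; }   /* hypothetical helper */

#ifdef SAFE_BUILD
int total(int a, int b)
{
    return add_checked(a, b);
}
#else
int total(int a, int b)
{
    return add_unchecked(a, b);
}
#endif
```

  When both blocks are scanned, the behaviour described above gives a single
  database entry for total which references both add_checked and
  add_unchecked.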

/W2: Def/no-ref

  CTA found a definition for the indicated function, but no references
  to it. Most likely causes are:
  - Function is "dead code" (not called anywhere)
  - Function is called only via pointers (which CTA cannot track)
  - Function is called only in other modules which have not been
    included in the CTA scan (other languages or non-scanned C files)
  - Function calls are performed in an odd manner which the CTA parser
    failed to detect.
  Scanning for the function name in all of your source files should
  quickly show if/how the function is referenced.

/W2: Ref/no-def

  CTA found references to a function which was not defined in the
  scanned file. Most likely causes are:
  - Function is defined in another module which was not included in the
    CTA scan (other languages or non-scanned C files)
  - Function is actually a #define macro or some other non-function
    construct which resembles a function call closely enough to confuse
    the CTA parser.
  - Function is defined in some odd manner which the CTA parser failed
    to detect.

/W3: Skipping non-function {} at level 0

  CTA has found a construct containing { } blocks which does not
  appear to be, or to be contained within, a function definition. This
  often occurs in global array and structure declarations.

/W3: Skipping typedef

  Since portions of some typedef constructs can be structured to parse
  like function references, CTA simply skips the entire construct when
  it encounters the keyword "typedef".

Note: It's easier to see where these warnings occur if you use the /Q
option with CTA when you build (/B) the database.


Dunfield Development Services (DDS) offers software and firmware
development services specializing in systems and embedded applications.
For more information, visit: http://www.dunfield.com
