Resolving Undefined Symbol linker messages.

by Oct 19, 1993

 Technical Information Database

TI867C.txt   Resolving Undefined Symbol linker messages.       
Category   :General
Platform    :All
Product    :Borland C++  3.x    

Description:
  The purpose of this document is to provide an overview of the
  Linking process and help in identifying causes of 'unresolved
  external symbols'.
      The code for printf is in a module in the run time library.
  When one calls printf in a C/C++ module, the compiler creates a
  record ( referred to as EXTDEF - EXTernal DEFinition ) which
  indicates the call to an 'extern'al function.  The linker then
  looks at that OBJ, along with all the other modules and libraries
  specified and attempts to find another module ( .OBJ or .LIB )
  which defines/provides the symbol printf.
  If the search succeeds, the call to printf is resolved;
  otherwise the linker generates an error indicating that 'printf'
  is an undefined symbol.
  The error message, however, is very often not the result of
  leaving out the module containing the symbol being looked for but
  rather a discrepancy between the name used by the caller ( the
  C/C++ module calling printf() in the case mentioned above ) and
  the supplier ( the LIBRARY containing the code to printf()  ).
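  The resolution process described above can be sketched as a small
  model.  This is only an illustration written in present-day
  standard C++, not a description of how TLINK is implemented; the
  'Module' type and the 'undefinedSymbols' function are hypothetical
  names introduced for this example.  It shows that a symbol is
  reported as undefined whenever no module defines a name matching
  the reference exactly, even if a similarly spelled name exists.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of one object module or library member: the
// symbols it defines (publics) and the symbols it references (externals).
struct Module {
    std::set<std::string> defines;
    std::set<std::string> references;
};

// Returns every referenced symbol that no module defines, i.e. the
// names a linker would report as undefined.  Matching is exact.
std::vector<std::string> undefinedSymbols(const std::vector<Module>& modules)
{
    // Pass 1: collect every symbol that some module provides.
    std::set<std::string> defined;
    for (const Module& m : modules)
        defined.insert(m.defines.begin(), m.defines.end());

    // Pass 2: any reference without a matching definition is unresolved.
    std::set<std::string> missing;
    for (const Module& m : modules)
        for (const std::string& ref : m.references)
            if (defined.count(ref) == 0)
                missing.insert(ref);

    return std::vector<std::string>(missing.begin(), missing.end());
}
```

  In this model, a caller that references 'FOO' while the library
  defines '_Foo' gets 'FOO' reported as undefined, although the
  library is present in the link.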
  The *real* name of any symbol is almost always different from the
  name/identifier used by the programmer.   For example, the *real*
  name ( by *real* name we mean the identifier used/generated by the
  tools ) of strcpy is:  '_strcpy'.   The *real* name of a symbol
  depends on various settings and options.   The relevant ones
  are listed below:
       Calling Conventions:
              > cdecl
              > pascal
              > fastcall
      Compiler Settings:
              > generate underbars
              > unsigned chars        ( C++ only )
      Optimizations:
              > Object Data Calling   ( C++ only )
      Virtual Table:
              > Far Virtual Tables
      Language used:
              > C
              > C++
              > Assembly
  Furthermore there are two options which will affect how the
  linker attempts to match symbols:
               > Case sensitive link
               > Case sensitive exports ( Windows only )
  The following is a discussion of how the above mentioned options
  affect the *real* name of symbols, hence the resolution of
  symbols.
  Calling Conventions:
  ====================
       Borland/Turbo C++ allows one to specify the default
  calling convention.  This default can be overridden by using the
  'pascal', '_fastcall' or 'cdecl' keywords, however.  Whether set
  globally or on individual function instances, the calling
  convention affects the name of functions;  by default, when the
  compiler encounters a function declared as,
  int Foo( int );               [ or 'int cdecl Foo( int )' ];
  the name generated is _Foo: that is, the resulting name has the
  same case but is preceded by a leading underscore.  The
  generation of the underscore is the default behavior, and is
  necessary when one links to the run time libraries.  There is no
  'printf' symbol in the RTL (!), but there is a '_printf'.
  While the C calling convention implies 'Case Sensitivity' and
  'Generation of Underbars', Borland/Turbo C++ provides separate
  settings for the generation of underbars and the calling
  convention:  Generation of Underbars can be controlled from the
  Options | Compiler | Advanced Code Generation Dialog, or the -u
  option from the command line ( -u- turns it off; it is on
  by default );  the 'Calling Convention' can be modified via the
  Options | Compiler | Entry/Exit Code Dialog.
  If our function 'Foo' is declared with the pascal modifier, for
  example:
  int pascal Foo( int );
  ( Or if the default 'Calling Convention' is set to 'PASCAL' ) the
  resulting name will be FOO:  that is, all uppercase with no
  underscore.  The '_fastcall' modifier is similar to 'cdecl' with
  regard to Case Sensitivity, but the underscore character is
  replaced with the '@' character.   Hence:
  int _fastcall Foo( int );
  will result in the '@Foo' symbol.
  Therefore, mismatching the calling conventions may/will result in
  'Undefined Symbols' - Watch for clues in the undefined symbol
  name provided in the Linker error messages ( e.g. look at the
  Case Sensitivity and any leading characters ) to spot cases of
  incorrect settings in the 'Calling Convention' and/or 'Generation
  of Underbars'.
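  The naming rules in this section can be summarized in a short
  model, which may help when reading a linker error message.  It is
  a sketch in present-day standard C++ and does not use the Borland
  keywords themselves; 'CallConv' and 'decorate' are hypothetical
  names.  The result for cdecl with underbars turned off is assumed
  to be the bare identifier.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class CallConv { Cdecl, Pascal, Fastcall };

// Hypothetical helper: returns the link-time name of a C function
// 'name' according to the rules described in this section.
std::string decorate(const std::string& name, CallConv conv,
                     bool generateUnderbars = true)
{
    assert(!name.empty());
    switch (conv) {
    case CallConv::Pascal: {
        // pascal: all uppercase, no underscore.
        std::string upper = name;
        for (char& ch : upper)
            ch = static_cast<char>(
                std::toupper(static_cast<unsigned char>(ch)));
        return upper;
    }
    case CallConv::Fastcall:
        // _fastcall: case preserved, '@' prefix instead of '_'.
        return "@" + name;
    case CallConv::Cdecl:
    default:
        // cdecl: case preserved, '_' prefix unless -u- is in effect
        // (assumption: the name is then left unchanged).
        return generateUnderbars ? "_" + name : name;
    }
}
```

  For 'Foo' this yields '_Foo' for cdecl, 'FOO' for pascal and '@Foo'
  for _fastcall, matching the examples above.  Comparing these forms
  with the name shown in the error message usually reveals which
  setting differs between caller and callee.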
  NAME MANGLING:
  ==============
       The C++ language uses yet another naming convention as part
  of its implementation of 'type safe linkage'.  Imagine a function
  foo which takes two longs [ e.g. void foo( long, long ) ]; what if
  someone has it incorrectly prototyped in a calling module as
  taking two floats  [ e.g.  void foo( float, float ); ].  The
  results of such a call will be unpredictable.  When using the C
  language, the linker would resolve such a call since the symbol
  the compiler uses to call the function taking two floats will be
  '_foo', and the name the compiler used in the module which
  implements the function taking two longs is also '_foo'.
  In C++, however, the name the compiler generates for a function
  is a 'mangled' name: it is encoded based on the parameter
  types the function expects.  In the scenario described in the
  prior paragraph, the call to 'foo' will not be resolved since the
  compiler will generate different names for 'void foo( float,
  float )'  and  'void foo( long, long )'.
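  A consequence of this encoding can be seen directly in source
  code.  The following minimal sketch (standard C++, with return
  values chosen only to make the result visible) defines both
  versions of 'foo' in one program.  They can coexist precisely
  because each signature receives its own link-time name.

```cpp
#include <cassert>
#include <string>

// Two functions sharing the identifier 'foo'.  Under C naming both
// would be '_foo' and would collide; in C++ the parameter types are
// part of the generated name, so these are distinct symbols.
std::string foo(long, long)   { return "foo(long, long)"; }
std::string foo(float, float) { return "foo(float, float)"; }
```

  A call such as foo( 1L, 2L ) binds to the first function and
  foo( 1.0f, 2.0f ) to the second.  If only one of them were
  defined, a call to the other signature declared in a different
  module would fail at link time instead of misbehaving at run time.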
  Because a C++ function's ( real! ) name depends
  on the types of its parameters, enabling the unsigned chars
  compiler option will change the names of functions declared to
  take a char or char *.  Unsigned chars is off by default; it is
  turned on under the Options | Compiler | Code generation menu,
  or by specifying the -K option with the command line compiler.
  Watch out for potential 'Undefined Symbol' messages caused by a
  mismatch of char vs. unsigned char.
  The 'virtual mechanism' of C++ is implemented via a table
  commonly referred to as 'Virtual Table' ( or VMT - Virtual Method
  Table ).  Various settings of the Compiler dictate whether the
  Table ends up in the Default Data Segment or in a Far Segment (
  namely Memory Model, '_export' and 'huge' class modifiers,
  Virtual Table Control Options etc ).  To further enforce 'type-
  safe-linkage', the Borland/Turbo C++ compiler includes the
  'distance' of the Virtual Table as part of its 'Name-Mangling'
  logic.  This prevents the linker from resolving function calls
  which would crash at run-time because of mismatched 'Virtual
  Table Control' settings.   By the same token, Borland provides
  the 'Object Data Calling convention' for improved efficiency of
  C++ code.  Once again, the 'Name-mangling' algorithm also
  reflects the enabling of 'Object Data Calling'.  This ensures
  that function calls involving mismatched 'Object Data Calling'
  convention between caller and callee will be caught at link time
  ( instead of resulting in erratic run-time behaviour! ) .
  To illustrate the effect of 'Virtual Table Control' and 'Object
  Data Calling', let's create a simple class and look at the
  effects of the various settings on the resulting names:
  class  Test
  {
       public:
           virtual int Process( void );
  };
  int main( void )
  {
       Test t;
       return t.Process();
  }
  The following table illustrates the effects of Compiler Settings
  on the *actual* name generated for the member function 'int
  Test::Process(void)'.
       +--------------+-------------+----------+--------------------+
       | Object Call. | Far V. Tbl. | Huge Md. |   [ REAL  NAME ]   |
       |--------------+-------------+----------+--------------------|
       |      No      |     No      |    No    | @Test@Process$qv   |
       |--------------+-------------+----------+--------------------|
       |      No      |     Yes     |    No    | @Test@0Process$qv  |
       |--------------+-------------+----------+--------------------|
       |      Yes     |     No      |    No    | @Test@1Process$qv  |
       |--------------+-------------+----------+--------------------|
       |      Yes     |     No      |    Yes   | @Test@2Process$qv  |
       +--------------+-------------+----------+--------------------+
  ( NOTE:  Using the '_export' or 'huge' keyword when defining a
           class results in Far Virtual Tables for the class ).
  'Undefined Symbol Messages' caused by mismatching Virtual Table
  Controls or Object Data Calling conventions may be hard to
  identify;  it is often useful to use the TDUMP.EXE utility to
  find the actual names of the unresolved symbols:  watch out for
  any '0', '1' or '2' following the '@ClassName@' portion of the
  real names.
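  When comparing many names from a TDUMP listing, the check for the
  digit can be automated.  The following sketch, in present-day
  standard C++, assumes only the name layout shown in the table
  above; 'mangledFlag' and 'flagsAgree' are hypothetical helpers and
  are not part of any Borland tool.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given a mangled member-function name such as
// "@Test@1Process$qv", returns the digit ('0', '1' or '2') that
// follows "@ClassName@", or '\0' when no such digit is present.
char mangledFlag(const std::string& mangled)
{
    if (mangled.empty() || mangled[0] != '@')
        return '\0';
    // Locate the '@' that terminates the class name.
    std::string::size_type second = mangled.find('@', 1);
    if (second == std::string::npos || second + 1 >= mangled.size())
        return '\0';
    char next = mangled[second + 1];
    return (next >= '0' && next <= '2') ? next : '\0';
}

// The name the caller wants and the name the supplier provides can
// only refer to the same function if their flags agree.
bool flagsAgree(const std::string& wanted, const std::string& provided)
{
    return mangledFlag(wanted) == mangledFlag(provided);
}
```

  If the unresolved name carries a different digit (or none) than
  the name found in the library, the two modules were most likely
  compiled with different Virtual Table or Object Data Calling
  settings.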
  LANGUAGE USED:
  ==============
       By default an assembler ( including TASM  ) does not modify
  public names, but merely converts them to upper case.  With TASM,
  the /mx option will force the assembler to treat public symbols
  with case sensitivity.  Without /mx, a call to _foo from an
  assembly module would look like _FOO to the linker, which could
  cause undefined symbols when linking C and assembly.  ( NOTE:
  TASM also has extensions which will cause the automatic generation
  of underscores.  See the .MODEL and language directives in
  your TASM manual ).
  As mentioned in the above section about 'Name Mangling', the C++
  language uses a different Naming Convention from the C language.
  This can result in undefined symbols when calling C from C++ ( or
  vice-versa ).  C++ modules should use the 'extern "C"' syntax
  when interfacing with C modules  ( see the Name Mangling section
  of the Programmer's Guide for details on this syntax ).
  LINKER SETTINGS:
  ================
       By default, the linker will treat _foo and _FOO as different
  symbols.  However, one can control whether the linker pays
  attention to Case Sensitivity via an option:
       The Option can be set via the Options | Linker | Settings
  dialog (IDE), or the /c option with TLINK. ( /c: Enables Case
  Sensitivity [default], /c- turns Option off).
       For example, if the option is disabled, a call to _foo could
  be resolved to _FoO.
       When creating a Windows application, not only can one link
  to 'static' modules ( .OBJs or .LIBs which are a collection of
  .OBJs), but one can also link to dynamic libraries ( i.e. the
  resolution of the call is completed by Windows at load time ).
  Functions residing in DLLs and called from one's EXE are said to
  be imported.  Functions that one codes in an .EXE or .DLL and
  which are called by Windows/Other Exes/DLLs are said to be
  exported.
       Functions are imported in two ways: by listing them in the
  IMPORTS section of the .DEF file, or by linking to an import
  library.  Functions can be exported by two methods: by using the
  _export keyword in the source code, or listing the functions in
  the EXPORTS section of the .DEF file.
       So suppose our App. calls a symbol _foo which is in a DLL.
  The linker can treat symbols coming from an import library, or
  IMPORTS section of the .DEF file with or without case
  sensitivity ( determined by the setting of case sensitive
  exports under the Options | Linker | Settings Dialog or /C option
  on the TLINK command line ).  If this setting is NOT enabled,
  then the Linker treats symbols in import libs or IMPORTS sections
  as all uppercase.  It then considers these upper case symbols
  during the link phase.  At that point it is doing normal linking
  using the setting of the case sensitive link option.  If we are
  importing both _foo and _FOO without the /C option, the linker
  can only resolve the call to _FOO.
       If we are calling _foo, ( NOTE: foo is a cdecl function )
  and performing a case sensitive link, but do not have case
  sensitivity on EXPORTS, _foo will show up as undefined.
       > Imported cdecl functions and C++ names will link when /c
       > and /C are both enabled, or neither is enabled.
  C++ names are always generated with lowercase letters.  When
  importing or exporting C++ names, it is recommended that one uses
  both the /c and /C options.
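  The interaction between the two options can be sketched as a
  two-step model.  This is an illustration of the behavior described
  above, written in present-day standard C++, and is not TLINK's
  actual code; 'importResolves' and 'toUpper' are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns an uppercase copy of 's'.
static std::string toUpper(std::string s)
{
    for (char& ch : s)
        ch = static_cast<char>(
            std::toupper(static_cast<unsigned char>(ch)));
    return s;
}

// Hypothetical model: can a call to 'wanted' be resolved against the
// imported symbol 'imported'?
//   caseSensitiveLink    corresponds to /c
//   caseSensitiveExports corresponds to /C
bool importResolves(const std::string& wanted,
                    const std::string& imported,
                    bool caseSensitiveLink,
                    bool caseSensitiveExports)
{
    // Step 1: without /C, imported symbols are treated as uppercase.
    std::string seen = caseSensitiveExports ? imported
                                            : toUpper(imported);
    // Step 2: normal linking, governed by the /c setting.
    if (caseSensitiveLink)
        return wanted == seen;
    return toUpper(wanted) == toUpper(seen);
}
```

  In this model a call to '_foo' resolves with both options on or
  both off, but not with /c on and /C off, because the import is
  then seen as '_FOO' and compared exactly.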
  Now let's apply the above to some common scenarios and provide
  possible diagnostics and suggestions:
  PROBLEM:
       All the functions in a 3rd party library are undefined!
  SOLUTION:
       3rd party libraries must be explicitly linked in.  To
  explicitly link to a 3rd party library from the IDE, open a
  project file and insert the .LIB file into the project file.
  The project file also needs to have all of your source code files
  listed in it.  From the command line, insert the .LIB on your
  command line to TLINK.
  PROBLEM:
       All the functions in the RTL are undefined!
  SOLUTION:
       You need to link in Cx.LIB, where x is the memory model.  A
  feature of the IDE in Turbo C++ and Borland C++ v2.x is that if
  you put a .LIB in the project file which starts out as Cx where x
  is a memory model, the new library overrides the normal run time
  library, and the latter will not be linked in ( e.g. using a
  library named CSERVE.LIB ).  Rename any such libraries, then the
  normal Cx.LIB will automatically be linked in.  ( Borland C++ 3.x
  has a dialog for specifying if the Run Time Libraries should be
  linked in ).
  PROBLEM:
       When mixing C and C++ modules (.c and .cpp source) symbols
  are undefined.
  SOLUTION:
       Because of name mangling (see above) the symbol the linker
  sees being called from a C++ module will not look like the symbol
  in the C module.  To turn name mangling off when prototyping
  functions:
       // SOURCE.CPP
       extern "C" {
            int Cfunc1( void );
            int Cfunc2( int  );
       }
  NOTE:  You can also disable name-mangling for functions written
  in C++ and called from C.
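  For the reverse direction mentioned in the NOTE, a minimal sketch
  might look as follows.  'CppFunc' is a hypothetical function name;
  the '#ifdef __cplusplus' guard is the usual way to let one header
  be included by both C and C++ modules.

```cpp
#include <cassert>

// Declaration as it might appear in a header shared by C and C++
// modules: the extern "C" wrapper is only seen by a C++ compile.
#ifdef __cplusplus
extern "C" {
#endif
int CppFunc(int value);   /* hypothetical function name */
#ifdef __cplusplus
}
#endif

// Definition in the .CPP module.  The body is ordinary C++; only the
// link-time name follows the C convention (no mangling).
extern "C" int CppFunc(int value)
{
    return value * 2;
}
```

  A C module that includes the same declaration can then call
  CppFunc() and the linker will find a matching unmangled symbol.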
       A C++ compile will happen if the source file has a .CPP
  extension, or if 'Use C++ compiler' under Options | Compiler |
  C++ options is set to 'always'.
  PROBLEM:
       randomize and other macros are coming up as undefined
  symbols.
  SOLUTION:
       Turn keywords to Borland C++.  Since some macros are not
  ANSI compatible, the header files will not define them if
  compiled with ANSI, or UNIX keywords on.
  PROBLEM:
        min and max are coming up undefined.
  SOLUTION:
       These macros are only included in a C compile, and will not
  be seen by the compiler if compiling in C++.  In C, you must
  #include  to use them.
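  In a C++ compile, one possible workaround is to supply equivalent
  inline functions yourself.  The following is a sketch only;
  'minOf' and 'maxOf' are hypothetical names chosen to avoid
  clashing with any macros or library functions, and templates
  require a compiler version that supports them.

```cpp
#include <cassert>

// Type-safe replacements for the C-only min/max macros.  Unlike a
// macro, each argument is evaluated exactly once.
template <class T>
inline const T& minOf(const T& a, const T& b)
{
    return (b < a) ? b : a;
}

template <class T>
inline const T& maxOf(const T& a, const T& b)
{
    return (a < b) ? b : a;
}
```

  Both arguments must have the same type, so a call such as
  minOf( 3, 7 ) compiles while minOf( 3, 7.5 ) is rejected by the
  compiler rather than silently converted.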
  PROBLEM:
       I cannot get my assembly modules to link with my C/C++
  program.
  SOLUTION:
       For C++, see above.  Otherwise, the .ASM must be assembled
  with case sensitivity on public symbols.  ( /mx for TASM ).  It
  must also match the C naming convention, which will have an
  underscore in front of the name.  So given,
       int foo(  void );
  in the C module, you need to
       call _foo
  from the assembly module.  (NOTE: TASM has extensions which will
  automatically generate underscores for you).  Also, make sure the
  .OBJ which has the assembly code is listed in the project file,
  or on the tlink line.
  PROBLEM:
       wsprintf is coming up undefined.
  SOLUTION:
       In Borland C++ 2.0, to use wsprintf when case sensitive
  exports is on, you need to reverse a define in windows.h via the
  following:
  #ifdef   wsprintf
  #undef   wsprintf
  #define  wsprintf wsprintf
  extern   "C" int FAR cdecl wsprintf( LPSTR, LPSTR, ... );
  #endif
  To call wsprintf (or any cdecl imported function ) with case
  sensitive exports off, you need to match an upper case name.
  Thus windows.h #defines wsprintf to be WSPRINTF.  wsprintf is one
  of the cdecl functions from Windows, so the compiler will
  generate a lower case symbol when calling it.
  PROBLEM:
       FIWRQQ and FIDRQQ are undefined
  SOLUTION:
       These symbols are in the EMU or FP87 library.  You must link
  it in explicitly when using TLINK, or set the IDE to link it in
  under the Options | Compiler | Advanced Code Generation Floating
  point box.
  PROBLEM:
       Warning attempt to export non-public symbol ...
  SOLUTION:
       The exports section of the .DEF file has a symbol which does
  not match one the compiler generated in the source code.  This
  will happen if:
       - The source was compiled in C++ (the symbol name will be
         mangled).  Resolve by exporting with the _export keyword,
         compiling in C, or declaring the function as extern "C".
       - Case sensitive exports is ON, you are exporting a PASCAL
         function, and exporting it like:  WndProc.  Resolve by
         exporting as WNDPROC or by turning case sensitive exports
         off.
       - You are exporting a cdecl function.  If declared as
                 int foo( void );
         export as _foo and turn case sensitive exports on ( or
         just use the _export keyword).
       NOTE: When using the '_export' keyword, it must be used in
             the prototype of the function:
                 i.e. int FAR _export foo( int );
  PROBLEM:
       C++ and DLL linking problems.
  SOLUTION:
       Classes declared in the DLL need to be declared as
  follows:
       class _export A
       {
            ...
       };
       When defined in the EXE, the same must be prototyped as:
       class huge A
       {
            ...
       };
       ( see user's guide for more info).
  Then, link with /c and /C on ( i.e. both case sensitive link and
  case sensitive exports ENABLED ), when building BOTH the DLL and
  the calling EXE.
  PROBLEM:
       OWL and undefined symbols.
  SOLUTION:
       If linking to the static libraries:
            - with BC 2.0, link in owlwx.lib, tclasswx.lib, and
              sallocwx.lib. (You don't need sallocwx.lib with BC
              v 3.x ).
            - do NOT define _CLASSDLL in the code generation
              dialog, or before including owl.h.
            - link with /c and /C.  (from IDE, under linker
              options, case sensitive link and case sensitive
              exports).
            - Do NOT compile with -K or unsigned char's on.  You
              will get several undefined symbols in this case.
       If linking to the OWL DLL, DO define _CLASSDLL before
  including OWL.H  and use /c and /C linker options ( i.e. both
  Case Sensitive Link and Case Sensitive Exports ENABLED ).
  PROBLEM:
       With an OWL App., wsprintf is undefined in module when
       linking to the static libs.
  SOLUTION:
       Link with /C ( i.e. case sensitive exports ENABLED ).
  PROBLEM:
       _main is an undefined symbol.
  SOLUTION:
       main is the entry point for every C/C++ program.  Make sure
  you write a function called main (all lowercase) in your program.
  If you have a project file loaded, make sure your source code
  file (.c or .cpp file) which has main in it is listed in the .prj
  file.   Make sure generate underbars is turned on.
  PROBLEM:
       iostream members, like <

Article originally contributed by Borland Staff