Why do I get wrong debug symbols? - dll

I have the following workflow:
1) Build the dll and pdb files.
2) Share the dll with the customer.
3) Analyze the memory dump from the customer.
When I run !analyze -v in WinDbg I get the following (part of the output shown below):
....
MANAGED_STACK_COMMAND: _EFN_StackTrace
PRIMARY_PROBLEM_CLASS: WRONG_SYMBOLS
BUGCHECK_STR: APPLICATION_FAULT_WRONG_SYMBOLS
// some callstack here
MODULE_NAME: RTPLogic
IMAGE_NAME: RTPLogic.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 58a43706
STACK_COMMAND: ~541s; .ecxr ; kb
FAILURE_BUCKET_ID: WRONG_SYMBOLS_c0000374_RTPLogic.dll!CSRTPStack::Finalize
BUCKET_ID: X64_APPLICATION_FAULT_WRONG_SYMBOLS_rtplogic!CSRTPStack::Finalize+1da
It looks like we have the wrong debug symbols for RTPLogic.dll.
I downloaded the ChkMatch tool.
I got the pdb path from WinDbg:
0:541> !lmi RTPlogic.dll
Loaded Module Info: [rtplogic.dll]
Module: RTPLogic
.....
Age: 1, Pdb: D:\Work\path_to_original_pdb\RTPLogic.pdb
Image Type: MEMORY - Image read successfully from loaded memory.
Symbol Type: PDB - Symbols loaded successfully from image header.
C:\ProgramData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb
Compiler: Resource - front end [0.0 bld 0] - back end [9.0 bld 21022]
Load Report: private symbols & lines, not source indexed
C:\ProgramData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb
I have logs related to this dump and I can see that my changes appear in the logs, so the customer did not forget to install my DLL before taking the memory dump.
I ran ChkMatch:
PS D:\tools> .\ChkMatch.exe -c "D:\Work\path_to_dll\RTPLogic.dll" "C:\ProgramData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb"
.....
Result: Matched
How is it possible that I get wrong debug symbols in this situation?

The symbols for RTPLogic.dll!CSRTPStack::Finalize are correct, but other symbols that are required to reconstruct the call stack are incorrect. It's likely that you have some operating system methods on the call stack and the symbols for ntdll or similar are missing.
ChkMatch checks only one single PDB file, whereas WinDbg works with many PDBs. Each result is reliable and correct for what it checks, so the two do not contradict each other.
Your sympath probably contains only a local path to your own DLLs and does not contain any information about Microsoft's symbol server. In the output of .sympath (which you did not post), I expect to see something like
0:000> .sympath
D:\Work\path_to_dll
You should include Microsoft symbols as well, as described in How to set up symbols in WinDbg. To fix the problem, use the following commands:
.symfix+ c:\symbols
.reload /f
The output of .sympath should now look like
0:000> .sympath
D:\Work\path_to_dll;SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
This should help WinDbg in reconstructing the complete call stack, resolve OS methods of ntdll and others and thus get rid of the "wrong symbols" message.
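As a side note on the cache path shown in the !lmi output: the directory name 9F82CDF359044635ADEBA578CA1D1D031 is the PDB GUID (32 hex digits) followed by the age in hex (here 1). Below is a minimal Python sketch of that naming. The function names symstore_key and symstore_path are made up for illustration, and the GUID is simply read back from the path in the question.

```python
import uuid


def symstore_key(guid: uuid.UUID, age: int) -> str:
    # GUID as uppercase hex without dashes, followed by the age in hex.
    return f"{guid.hex.upper()}{age:X}"


def symstore_path(pdb_name: str, guid: uuid.UUID, age: int) -> str:
    # Layout seen in the symbol cache: <pdb name>\<key>\<pdb name>
    return "\\".join([pdb_name, symstore_key(guid, age), pdb_name])


# GUID taken from the cache path in the question, age from "Age: 1".
guid = uuid.UUID("9F82CDF3-5904-4635-ADEB-A578CA1D1D03")
print(symstore_path("RTPLogic.pdb", guid, 1))
```

If the DLL and PDB match, the key computed from the DLL's debug record is the same one WinDbg uses to look up the PDB, which is consistent with ChkMatch reporting "Matched" for this one file.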

Related

Mismatch between IDs from minidump_stackwalk and dump_syms

I am trying to use Google Breakpad, but I am facing a strange issue.
I am working on Linux. I have my own library, my_lib.so, which I process with dump_syms; it generates this symbol file header:
$ dump_syms my_lib.so|head -2
MODULE Linux mips 3BB485681467218D36EB2FF02287096C0 my_lib.so
INFO CODE_ID 6885B43B67148D2136EB2FF02287096C
I create the symbols directory with the appropriate subdirectories. I then generate a minidump for the program that uses a stripped version of my_lib.so, but when I try to process it with minidump_stackwalk, I get:
0x77dce000 - 0x77e23fff my_lib.so ??? (WARNING: No symbols, my_lib.so, AC40136B433E5A68F66CCE8C2C2E6C250)
It is searching for a different ID, AC40136B433E5A68F66CCE8C2C2E6C250, so it does not find the symbols. Why the mismatch?
Knowing that it searches for AC40136B433E5A68F66CCE8C2C2E6C250, I manually changed the directory tree in symbols to match that ID, just as a test. I also changed the ID inside the my_lib.so.sym file. After that, minidump_stackwalk no longer complains about missing symbols, but I still can't see the stack trace.
Any ideas about this mismatch?
By the way, if I run readelf -n on the original library and on the stripped one, I get the same GNU build ID.
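For orientation, the two identifiers in the dump_syms output above are the same 16 bytes in two layouts: the MODULE ID looks like the CODE_ID read as a GUID (first three fields byte-swapped) with a trailing 0 appended. The Python sketch below reproduces that relationship for the values in the question. The function name is hypothetical, and treating the trailing digit as an "age" of 0 is an assumption. It does not explain where the different ID reported by minidump_stackwalk comes from.

```python
def module_id_from_code_id(code_id_hex: str, age: int = 0) -> str:
    # Use the first 16 bytes of the build ID, zero-padded if shorter.
    raw = bytes.fromhex(code_id_hex)[:16].ljust(16, b"\0")
    data1 = raw[0:4][::-1]   # 4-byte field, byte order reversed
    data2 = raw[4:6][::-1]   # 2-byte field, byte order reversed
    data3 = raw[6:8][::-1]   # 2-byte field, byte order reversed
    data4 = raw[8:16]        # remaining 8 bytes kept as-is
    return (data1 + data2 + data3 + data4).hex().upper() + f"{age:X}"


# CODE_ID from the question's dump_syms output.
print(module_id_from_code_id("6885B43B67148D2136EB2FF02287096C"))
```

Running the same conversion on the build ID reported by readelf -n for the library on the target would show whether the ID minidump_stackwalk asks for really belongs to the same build.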

How to add a user defined function in QDB Library?

QDB is a database provided with the QNX Neutrino package. I went through the QDB documentation to add a user-defined SQL function: http://www.qnx.com/developers/docs/6.5.0/topic/com.qnx.doc.qdb_en_dev_guide/writing_functions.html?cp=2_0_8
I created a source file containing my user-defined SQL function written in C and the qdb_function structure definition. I built it with a makefile to create libudf.so.
As suggested by the QDB documentation, I added Function = udftag#libudf.so to qdb.cfg. But when I run qdb from the shell prompt, it gives the following error (the last line of the output):
qdb -I basic -V -R set -v -c /etc/sql/qdb.cfg -s de_DE#cldr -o tempstore=/fs/tmpfs
QDB: No script registered for handling corrupt database.
qdb: processing [TempMainAddressBook]Function - Can't access shared library
and qdb exits immediately.
I have tried the following things:
Made sure the sqlite3 library is added in the makefile.
Kept the source code strictly in C by using the extern "C" directive to avoid name mangling, as the file extension is .cpp. I also tried the .c extension.
Gave the absolute path of libudf.so in qdb.cfg as: Function = udftag#/usr/lib/libudf.so
Made sure the qdb_function struct is properly defined in the library's source code only.
Tried without the static declaration of the function (mentioned in the QDB docs).
After all this trial and error, I still get the same error every time: Can't access shared library
If anyone has any idea how to resolve this error, please share.
Suggestion 1: run qdb with LD_DEBUG=1 set, as in:
LD_DEBUG=1 qdb command line options
This will output a lot of debug information from the dynamic loader as it attempts to locate and then load the .so files. Check which path it prints just before the "Can't access" message is displayed.
Suggestion 2: obvious, but make sure that the permissions are OK for the .so file. Is the execute permission set?
Suggestion 3: check whether the error message is identical if you completely remove the .so file from the system.
Suggestion 4: increase the number of lower-case 'v' options. QDB likely supports more, with progressively more verbose output as you add them (6 should be enough for full verbosity).
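Suggestions 2 and 3 can be checked quickly from a script. The following is a minimal Python sketch, assuming a Python interpreter is available on the target (which is not a given on QNX). check_shared_library is a hypothetical helper name, and the path in the usage lines is the one from the question.

```python
import os


def check_shared_library(path):
    """Return a list of basic access problems found for the given file."""
    if not os.path.isfile(path):
        return ["file does not exist"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append("file is not readable by the current user")
    if not os.access(path, os.X_OK):
        problems.append("file is not executable by the current user")
    return problems


if __name__ == "__main__":
    # Path taken from the question; adjust to the actual location.
    for problem in check_shared_library("/usr/lib/libudf.so"):
        print("libudf.so:", problem)
```

An empty result only rules out the basic file-access causes. A library that exists and is accessible can still fail to load because of unresolved dependencies, which is what the LD_DEBUG output from suggestion 1 would reveal.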

literal string expected error

Please have a look at the following code
with text_io;
use text_io;
procedure hello is
begin
put_line("hello");
new_line(3);
end hello;
When I click "build all" in GPS IDE, I get this error
gnatmake -d -PC:\Users\yohan\firstprogram.gpr
firstprogram.gpr:1:06: literal string expected
firstprogram.gpr:2:01: "end" expected
gnatmake: "C:\Users\yohan\firstprogram.gpr" processing failed
[2013-04-03 13:29:58] process exited with status 4 (elapsed time: 00.47s)
I am very new to Ada; as you can see, this is my first program. Please help.
On the command line, gnatmake will happily compile a file which contains Ada code but has the extension .gpr. GPS knows "better" than that, and insists on treating firstprogram.gpr as a GNAT Project file, which of course it isn't.
You'll find life with GNAT much easier if you stick with its file naming conventions: .ads for a spec, .adb for a body, and the file name needs to be the unit name in lower case. In your case, the file should have been called hello.adb.
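The naming convention can be written down as a small rule. The Python sketch below is only an illustration: gnat_file_name is a made-up helper, and the handling of child units (dots replaced by hyphens) is GNAT's default scheme as I understand it, which goes beyond what is stated above.

```python
def gnat_file_name(unit_name: str, spec: bool = False) -> str:
    # Unit name in lower case; dots of child units become hyphens.
    base = unit_name.lower().replace(".", "-")
    # .ads for a spec, .adb for a body.
    return base + (".ads" if spec else ".adb")


print(gnat_file_name("Hello"))                    # body of procedure Hello
print(gnat_file_name("My_Pkg.Child", spec=True))  # spec of a hypothetical child package
```

For the program in the question this gives hello.adb, matching the advice above.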
The simplest approach to creating a GNAT project file in GPS is to go to the Project menu and select New. The only places where you must enter data are on the "Naming the project" page (you might choose firstproject!) and the "Main files" page, where you'd click on the blue + to add hello.adb; you can Forward through the others.
After adding the main file, you can click Apply to install the new project file; now you can Build all and Run.
You may find the GPS tutorial helpful (Help menu, GPS ...)

WinDBG doesn't display source lines despite loading private pdb files

I am trying to debug a problem in a native DLL using WinDBG. I believe that I have the private symbols loaded, but WinDBG is not displaying the source lines or parameter information. Here is what I am observing; any help would be greatly appreciated!
I have the PDB which I believe corresponds to the DLL in the symbol search path. Running lm I see:
01050000 01058000 3NMSMTHR C (private pdb symbols) e:\ads_symbols\3NMSMTHR.pdb
As this states "private pdb symbols" I expect that this is the private pdb.
I also ran symchk and see the following output:
C:\utils\inetmgr\patch01>"c:\Program Files\Debugging Tools for Windows (x86)\symchk.exe" /v 3nmsmthr.dll /s c:\utils\inetmgr\patch01
[SYMCHK] Searching for symbols to C:\utils\inetmgr\patch01\3nmsmthr.dll in path c:\utils\inetmgr\patch01
DBGHELP: Symbol Search Path: c:\utils\inetmgr\patch01
[SYMCHK] Using search path "c:\utils\inetmgr\patch01"
DBGHELP: No header for C:\utils\inetmgr\patch01\3NMSMTHR.DLL. Searching for image on disk
DBGHELP: C:\utils\inetmgr\patch01\3NMSMTHR.DLL - OK
DBGHELP: 3NMSMTHR - private symbols & lines
c:\utils\inetmgr\patch01\3NMSMTHR.pdb
[SYMCHK] MODULE64 Info ----------------------
[SYMCHK] Struct size: 1680 bytes
[SYMCHK] Base: 0x10000000
[SYMCHK] Image size: 32768 bytes
[SYMCHK] Date: 0x4cc1b0f8
[SYMCHK] Checksum: 0x00000000
[SYMCHK] NumSyms: 0
[SYMCHK] SymType: SymPDB
[SYMCHK] ModName: 3NMSMTHR
[SYMCHK] ImageName: C:\utils\inetmgr\patch01\3NMSMTHR.DLL
[SYMCHK] LoadedImage: C:\utils\inetmgr\patch01\3NMSMTHR.DLL
[SYMCHK] PDB: "c:\utils\inetmgr\patch01\3NMSMTHR.pdb"
[SYMCHK] CV: RSDS
[SYMCHK] CV DWORD: 0x53445352
[SYMCHK] CV Data: I:\usr\bpi\adrutl\3NMSMTHR.pdb
[SYMCHK] PDB Sig: 0
[SYMCHK] PDB7 Sig: {A865C40A-5070-4752-AD1F-CD3087843807}
[SYMCHK] Age: 4
[SYMCHK] PDB Matched: TRUE
[SYMCHK] DBG Matched: TRUE
[SYMCHK] Line nubmers: TRUE
[SYMCHK] Global syms: TRUE
[SYMCHK] Type Info: TRUE
[SYMCHK] ------------------------------------
SymbolCheckVersion 0x00000002
Result 0x001f0001
DbgFilename
DbgTimeDateStamp 0x4cc1b0f8
DbgSizeOfImage 0x00008000
DbgChecksum 0x00000000
PdbFilename c:\utils\inetmgr\patch01\3NMSMTHR.pdb
PdbSignature {A865C40A-5070-4752-AD1F-CD3087843807}
PdbDbiAge 0x00000004
[SYMCHK] [ 0x00000000 - 0x001f0001 ] Checked "C:\utils\inetmgr\patch01\3NMSMTHR.DLL"
SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1
This finds the PDB in the right path I've given it (note that I copied this exact PDB file to e:\ads_symbols which is the path seen in the lm output). This symchk output states Line Numbers: true and thus I expect to see private style information. However, if I run ~kv then for my functions in the stack trace I see:
00bef2ac 01052a8a 00000000 00000000 00020aa4 3NMSMTHR!BPMThrProcTerm+0x2c0
00bef2cc 100073eb 00bef4d8 00000000 00000000 3NMSMTHR!BPMThrThreadInitName+0x2a
And this doesn't seem like it's reading the private information; I don't get the source listing like I do for the MS CRT functions, which have private symbols on the MSFT symbol server. Also, if I do x /t /d 3NMSMTHR!ThreadInitName then I get
01052a60 <NoType> 3NMSMTHR!BPMThrThreadInitName = <no type information>
And lastly, if I use .frame 3 (to go to that frame) and then execute dv to display the locals, I receive the following:
0:001> .frame
03 00bef2cc 100073eb 3NMSMTHR!BPMThrThreadInitName+0x2a
0:001> dv
Unable to enumerate locals, HRESULT 0x80004005
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.
This doesn't make sense to me. Any help would be much appreciated. My overall goal is to get the parameter and source information. OR to confirm that the PDB file I have is in fact NOT the private symbols. I didn't build this DLL or PDB nor do I know any specifics about the linker options passed to it.
Thanks!
EDIT:
I failed to mention that I am getting the checksum error:
*** WARNING: Unable to verify checksum for C:\utils\inetmgr\3NMSMTHR.dll
Sorry! I was trying to run the .lines command as suggested below and I see:
*** WARNING: Unable to verify checksum for C:\utils\inetmgr\3NMSMTHR.dll
DBGHELP: 3NMSMTHR - private symbols & lines
e:\ads_symbols\3NMSMTHR.pdb
Line number information will not be loaded
So I guess that's my problem. Which leads to my next question: is there a way to fix the checksum (which is listed as 0, see above symchk output)? This PDB is the correct one given the symchk output. Can I have it bypass the checksum check?
EDIT2:
For anyone else that comes across this: I was able to fix the checksum warning by:
editbin /release 3NMSMTHR.DLL
This set the checksum in the PE header. Then I had to run
.symopt+0x40
in WinDbg to force it to load the PDB even though the timestamp on the DLL was different. I'm sure that alternatively I could have used some utility to update the modified timestamp as well.
That fixed the warning about the checksum...but STILL no parameter info (running dv on the right frame), no source line info, etc.
So now I'm lost. Is it possible that these PDBs don't contain that info? How could I confirm that? How would I build them to contain it? We use NMAKE to build these.
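The two header fields discussed in this edit (timestamp and checksum) can also be read directly from the DLL. The following Python sketch assumes the standard PE layout: the offset of the PE signature is stored at 0x3C, TimeDateStamp is 4 bytes into the COFF header, and CheckSum is 64 bytes into the optional header. The function name is hypothetical and the sketch does no validation beyond the two signatures.

```python
import struct


def read_pe_timestamp_and_checksum(data: bytes):
    """Return (TimeDateStamp, CheckSum) from the headers of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4                  # COFF file header follows the signature
    (timestamp,) = struct.unpack_from("<I", data, coff + 4)
    optional = coff + 20                  # COFF file header is 20 bytes long
    (checksum,) = struct.unpack_from("<I", data, optional + 64)
    return timestamp, checksum


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        ts, cs = read_pe_timestamp_and_checksum(f.read())
    print(f"TimeDateStamp=0x{ts:08x} CheckSum=0x{cs:08x}")
```

Run against the DLL before and after editbin /release, this should show the checksum changing from 0x00000000 (as in the symchk output above) to a non-zero value.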
EDIT3:
I rebuilt the DLL and PDB as DEBUG and then got all of the stack trace information that I expected. So now my question is: (1) is it possible to build in release and get the static functions, parameter info, etc. (private symbol info)? and (2) the stack trace I was getting with the release dlls+pdbs was incorrect-- the first function entrypoint was correct, but then the next stack frame showed a func that wasn't called. My assumption is that the release DLL inlined some functions and somehow the PDB was just "guessing" at the function in that frame? Very strange.
Did you try the .lines command?
If you want to be able to make sense of dumps or stack traces even in Release mode, you should ensure the following:
You compile with /Zi or /ZI (Debug Information Format is one of the two Program Database options).
You do not compile with /Oy (Omit Frame Pointers).
You link with /DEBUG (Generate Debug Info).
You keep (but don't distribute) the resulting .pdb file.
The main thing is to avoid omitting frame pointers; omitting them saves a little bit of time/space in a function call but makes it very hard to stack walk. Note that you may still get odd stack traces from release builds due to other optimisation settings (particularly inlining) but they should still have the majority of interesting functions.
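The checklist above can be turned into a quick sanity check over the flags a build passes to cl and link. This is a rough Python sketch: missing_debug_settings is a made-up helper, it only looks at explicitly listed, slash-prefixed options, and it does not know about options that imply others.

```python
def missing_debug_settings(compile_flags, link_flags):
    """Return a list of deviations from the checklist for the given flags."""
    problems = []
    # Compiler options are case-sensitive: /Zi and /ZI are distinct options.
    if "/Zi" not in compile_flags and "/ZI" not in compile_flags:
        problems.append("compile with /Zi or /ZI")
    if "/Oy" in compile_flags:
        problems.append("do not compile with /Oy")
    # Linker options are matched case-insensitively here.
    if not any(flag.upper() == "/DEBUG" for flag in link_flags):
        problems.append("link with /DEBUG")
    return problems


print(missing_debug_settings(["/O2", "/Zi"], ["/DEBUG"]))  # nothing to report
print(missing_debug_settings(["/O2", "/Oy"], ["/OPT:REF"]))
```

For an NMAKE build, the flag lists would come from whatever the makefile passes to the compiler and linker.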
You will not have type information if the function is written in assembly language. Also it is possible that a static library was linked to the DLL and the static library did not have full debug information.
I know this is old, but for anyone coming across this issue, what worked for me was to run ".lines -e". This is probably what Naveen was suggesting.

In an ELF file, how does the address for _start get detemined?

I've been reading the ELF specification and cannot figure out where the program entry point and _start address come from.
It seems like they should have to be in a pretty consistent place, but I made a few trivial programs, and _start is always in a different place.
Can anyone clarify?
The _start symbol may be defined in any object file. Normally it is provided automatically by the C runtime startup code, which eventually calls main. You can also define it yourself, for instance in an assembler source file:
.globl _start
_start:
// assembly here
When the linker has processed all object files, it looks for the _start symbol and puts its value in the e_entry field of the ELF header. The loader takes the address from this field and transfers control to it after it has finished loading all segments into memory and is ready to execute the file.
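To see the value the linker stored, the e_entry field can be read straight from the file. A minimal Python sketch, assuming only the standard ELF header layout (e_entry at byte offset 24, 4 bytes wide for 32-bit files and 8 bytes for 64-bit files); the function name is made up for illustration:

```python
import struct


def elf_entry_point(data: bytes) -> int:
    """Return e_entry from the ELF header contained in data."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = data[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    byte_order = "<" if data[5] == 1 else ">"  # EI_DATA: 1 = little, 2 = big endian
    fmt = byte_order + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, data, 24)
    return entry


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(hex(elf_entry_point(f.read(64))))
```

The printed value should agree with the "Entry point address" line of readelf -h and, for a program linked as described, with the address of _start.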
Take a look at the linker script ld is using:
ld -verbose
The format is documented at: https://sourceware.org/binutils/docs-2.25/ld/Scripts.html
It determines basically everything about how the executable will be generated.
On Binutils 2.24 Ubuntu 14.04 64-bit, it contains the line:
ENTRY(_start)
which sets the entry point to the _start symbol (goes to the ELF header as mentioned by ctn)
And then:
. = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;
which sets the location counter, and therefore the address of whatever the script places first, to 0x400000 + SIZEOF_HEADERS.
I modified that address to 0x800000, passed my custom script with ld -T, and it worked: readelf -s says that _start moved accordingly.
Another way to change it is to use the -Ttext-segment=0x800000 option.
Note that 0x400000 is 4 MiB, which is not the value reported by getconf PAGE_SIZE (typically 4096 on x86-64). The reason for not starting at address zero is discussed at: Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?
A question describes how to set _start from the command line: Why is the ELF entry point 0x8048000 not changeable with the "ld -e" option?
SIZEOF_HEADERS is the size of the ELF + program headers, which are at the beginning of the ELF file. That data gets loaded into the very beginning of the virtual memory space by Linux (TODO why?) In a minimal Linux x86-64 hello world with 2 program headers it is worth 0xb0, so that the _start symbol comes at 0x4000b0.
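The 0xb0 figure can be checked by hand: a 64-bit ELF header is 64 bytes and each 64-bit program header is 56 bytes. The short Python sketch below does the arithmetic, under the assumption (true for the minimal example described here) that the code starts immediately after the headers; the names are illustrative only.

```python
ELF64_HEADER_SIZE = 64           # sizeof(Elf64_Ehdr)
ELF64_PROGRAM_HEADER_SIZE = 56   # sizeof(Elf64_Phdr)


def start_address(segment_start: int, program_headers: int) -> int:
    # SIZEOF_HEADERS = ELF header + all program headers
    sizeof_headers = ELF64_HEADER_SIZE + program_headers * ELF64_PROGRAM_HEADER_SIZE
    return segment_start + sizeof_headers


print(hex(start_address(0x400000, 2)))  # 0x4000b0
```

With more program headers (for example in a dynamically linked executable) SIZEOF_HEADERS grows, which is one reason _start lands at different addresses in different programs.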
I'm not sure, but try this link: http://www.docstoc.com/docs/23942105/UNIX-ELF-File-Format
Page 8 shows where the entry point is stored if the file is an executable. Basically, you need to calculate the offset and you have it.
Remember the little-endianness of x86 (I guess that is what you use) and reorder the bytes if you read bytewise. Edit: or maybe not; I'm not quite sure about this, to be honest.