What kind of info are there in the first 832 bytes of .so file? - elf

I saw many similar stuff like this:
open("/lib64/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260W \0242\0\0\0"..., 832) = 832
What's there in the beginning 832 bytes?

If the listing above was captured at program startup, then it is
likely that you are seeing the runtime loader in action, as it brings
in shared libraries and resolves symbols prior to launching the
program.
As for the initial contents being read, every ELF file starts with an
ELF header which describes the layout and contents of the rest of the
file---please see the tutorial "libelf by Example" for more
information.

Related

Why i got wrong debug symbols?

I have next workflow:
1) Build dll and pdb files.
2) Share dll to cutomer
3) Analize memory dump from customer.
When I run !analyze -v in WinDbg I got (below part of output)
....
MANAGED_STACK_COMMAND: _EFN_StackTrace
PRIMARY_PROBLEM_CLASS: WRONG_SYMBOLS
BUGCHECK_STR: APPLICATION_FAULT_WRONG_SYMBOLS
// some callstack here
MODULE_NAME: RTPLogic
IMAGE_NAME: RTPLogic.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 58a43706
STACK_COMMAND: ~541s; .ecxr ; kb
FAILURE_BUCKET_ID: WRONG_SYMBOLS_c0000374_RTPLogic.dll!CSRTPStack::Finalize
BUCKET_ID: X64_APPLICATION_FAULT_WRONG_SYMBOLS_rtplogic!CSRTPStack::Finalize+1da
Looks like we have wrong debug symbol for RTPLogic.dll.
I download ChkMatch tool.
I get pdb path from windbg
0:541> !lmi RTPlogic.dll
Loaded Module Info: [rtplogic.dll]
Module: RTPLogic
.....
Age: 1, Pdb: D:\Work\path_to_original_pdb\RTPLogic.pdb
Image Type: MEMORY - Image read successfully from loaded memory.
Symbol Type: PDB - Symbols loaded successfully from image header.
C:\ProgramData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb
Compiler: Resource - front end [0.0 bld 0] - back end [9.0 bld 21022]
Load Report: private symbols & lines, not source indexed
C:\ProgramData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb
I have logs related to this dump and I see that my changes appears in logs. So customer not forgotten to install my DLL before get the memdump.
I run ChkMatch
PS D:\tools> .\ChkMatch.exe -c "D:\Work\path_to_dll\RTPLogic.dll" "C:\Progra
mData\dbg\sym\RTPLogic.pdb\9F82CDF359044635ADEBA578CA1D1D031\RTPLogic.pdb"
.....
Result: Matched
How it possible that I got wrong debug symbols in such situation?
The symbols for RTPLogic.dll!CSRTPStack::Finalize are correct, but other symbols that are required to reconstruct the call stack are incorrect. It's likely that you have some operating system methods on the call stack and the symbols for ntdll or similar are missing.
Since with ChkMatch, you're only checking one single PDB file, the result of ChkMatch is as reliable and correct (for one PDB) as that of WinDbg (for many PDBs) and they do not contradict each other.
Your sympath probably contains only a local path to your own DLLs and does not contain any information about Microsoft's symbol server. In the output of .sympath (which you did not post), I expect to see something like
0:000> .sympath
D:\Work\path_to_dll
You should include Microsoft symbols as well, as described in How to set up symbols in WinDbg. To fix the problem, use the following commands:
.symfix+ c:\symbols
.reload /f
The output of .sympath should now look like
0:000> .sympath
D:\Work\path_to_dll;SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
This should help WinDbg in reconstructing the complete call stack, resolve OS methods of ntdll and others and thus get rid of the "wrong symbols" message.

How to convert ELF file to binary file?

My understanding is that a binary file is the hex-codes of the instructions of the processor (can be loaded into memory & start executing from entry point) and a ELF file is the same with NO-Fixed memory addresses assigned for data etc...
Now, how can I convert ELF to binary?
How the conversion works? I mean how the memory addresses are assigned?
In general
An ELF file does not need to use "NO-Fixed memory addresses". In fact, the typical ELF executable file (ET_EXEC) is using a fixed address.
A binary file is usually understood as a file containing non-text data. In the context of programs, it is usually understood to mean the compiled form of the program (in opposition to the source form which is usually a bunch of text files). ELF file are binary files.
Now you might want to know how the ELF file is transformed into the in-memory-representation of the program: the ELF file contains additional information such as where in the program (virtual) address-space each segment of the program should be loaded, which dynamic-libraries should be loaded, how to link the main program and the dynamic libraries together, how to initialise the program, where is the entry point of the program, etc.
One important part of an executable or shared-object is the location of the segments which must be loaded into the program address space. You can look at them using readelf -l:
$ readelf -l /bin/bash
Elf file type is EXEC (Executable file)
Entry point 0x4205bc
There are 9 program headers, starting at offset 64
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
PHDR 0x0000000000000040 0x0000000000400040 0x0000000000400040
0x00000000000001f8 0x00000000000001f8 R E 8
INTERP 0x0000000000000238 0x0000000000400238 0x0000000000400238
0x000000000000001c 0x000000000000001c R 1
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000f1a74 0x00000000000f1a74 R E 200000
LOAD 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000009068 0x000000000000f298 RW 200000
DYNAMIC 0x00000000000f1df8 0x00000000006f1df8 0x00000000006f1df8
0x0000000000000200 0x0000000000000200 RW 8
NOTE 0x0000000000000254 0x0000000000400254 0x0000000000400254
0x0000000000000044 0x0000000000000044 R 4
GNU_EH_FRAME 0x00000000000d6af0 0x00000000004d6af0 0x00000000004d6af0
0x000000000000407c 0x000000000000407c R 4
GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000
0x0000000000000000 0x0000000000000000 RW 10
GNU_RELRO 0x00000000000f1de0 0x00000000006f1de0 0x00000000006f1de0
0x0000000000000220 0x0000000000000220 R 1
Each LOAD (PT_LOAD) entry describes a segment which must be loaded in the program address-space.
Reading and processing this information is the job of the ELF loaders: on your typical OS this is done in part by the kernel and in part by the dynamic-linker (ld.so, also called "program interpreter" in ELF parlance).
ARM plain binary files
(I don't really known about ARM stuff.)
You're apparently talking about embedded platforms. On ARM, a plain binary file contains the raw content of the initial memory of the program. It does not contain things such as string tables, symbol tables, relocation tables, debug informations but only the data of the (PT_LOAD) segments.
It is a binary file, not hex-encoded. The vhx files are hex-encoded.
Plain binary files can be generated from the ELF files with fromelf.
The basic idea here is that each PT_LOAD entry of a ELF file is dumped at its correct position in the file and remaining gaps (if any) between them are filled with zeros.
The ELF file already has addresses assigned in the p_vaddr field of each segment so this conversion process does not need to determine addresses: this has already been done by the link editor (and the linker script).
References
ARM ELF file format
I came here while searching for "convert .elf into binary file" (with arm files in mind, though).
It turned out that the easiest way in my case was to use
arm-none-eabi-objcopy -O binary kernel.elf kernel.bin
I do not understand what do you want to say but ELF(Executable Linkable format) is a new executable format. ofcourse its sections including .text need to be mapped in memory for execution. but if you want to convert ELF into binary check what is the difference between ELF files and bin files. some answers contain information how to change ELF into other binary format
in order to clear how ELF is loaded into memory check http://www.gsp.com/cgi-bin/man.cgi?topic=elf. if you have still some problems come with specific question.

How does numpy handle mmap's over npz files?

I have a case where I would like to open a compressed numpy file using mmap mode, but can't seem to find any documentation about how it will work under the covers. For example, will it decompress the archive in memory and then mmap it? Will it decompress on the fly?
The documentation is absent for that configuration.
The short answer, based on looking at the code, is that archiving and compression, whether using np.savez or gzip, is not compatible with accessing files in mmap_mode. It's not just a matter of how it is done, but whether it can be done at all.
Relevant bits in the np.load function
elif isinstance(file, gzip.GzipFile):
fid = seek_gzip_factory(file)
...
if magic.startswith(_ZIP_PREFIX):
# zip-file (assume .npz)
# Transfer file ownership to NpzFile
tmp = own_fid
own_fid = False
return NpzFile(fid, own_fid=tmp)
...
if mmap_mode:
return format.open_memmap(file, mode=mmap_mode)
Look at np.lib.npyio.NpzFile. An npz file is a ZIP archive of .npy files. It loads a dictionary(like) object, and only loads the individual variables (arrays) when you access them (e.g. obj[key]). There's no provision in its code for opening those individual files inmmap_mode`.
It's pretty obvious that a file created with np.savez cannot be accessed as mmap. The ZIP archiving and compression is not the same as the gzip compression addressed earlier in the np.load.
But what of a single array saved with np.save and then gzipped? Note that format.open_memmap is called with file, not fid (which might be a gzip file).
More details on open_memmap in np.lib.npyio.format. Its first test is that file must be a string, not an existing file fid. It ends up delegating the work to np.memmap. I don't see any provision in that function for gzip.

File Name in log file using log4php

I write a wrapper over log4php api named as loggerimplementation.php
Now I faced a problem in log files. When I called a function for logging defined in loggerimplementation.
Log file generated as:
2011-06-10 10:49:17 [INFO] Log4phpTest: Hello (at C:\xampp\htdocs\logger\Loggerimplementation.php line 74)
2011-06-10 10:50:08 [INFO] Log4phpTest: Hello (at C:\xampp\htdocs\logger\Loggerimplementation.php line 74)
2011-06-10 11:04:01 [INFO] Log4phpTest: Hello (at C:\xampp\htdocs\logger\Loggerimplementation.php line 78)
Now , as you can see %F will always show wrapper file name because it's always called from there. But I want real file name which one used this wrapper.
So please suggest me the way for this.
Or
tell me from where %F details I can get?
Waiting for reply.

WinDBG doesn't display source lines despite loading private pdb files

I am trying to debug a problem in a native DLL using WinDBG. I believe that I have the private symbols loaded, but WinDBG is not displaying the source lines or parameter information. Here is what I am observing; any help would be greatly appreciated!
I have the PDB which I believe corresponds to the DLL in the symbol search path. Running lm I see:
01050000 01058000 3NMSMTHR C (private pdb symbols) e:\ads_symbols\3NMSMTHR.pdb
As this states "private pdb symbols" I expect that this is the private pdb.
I also ran symchk and see the following output:
C:\utils\inetmgr\patch01>"c:\Program Files\Debugging Tools for Windows (x86)\symchk.exe" /v 3nmsmthr.dll /s c:\utils\inetmgr\patch01
[SYMCHK] Searching for symbols to C:\utils\inetmgr\patch01\3nmsmthr.dll in path c:\utils\inetmgr\patch01
DBGHELP: Symbol Search Path: c:\utils\inetmgr\patch01
[SYMCHK] Using search path "c:\utils\inetmgr\patch01"
DBGHELP: No header for C:\utils\inetmgr\patch01\3NMSMTHR.DLL. Searching for image on disk
DBGHELP: C:\utils\inetmgr\patch01\3NMSMTHR.DLL - OK
DBGHELP: 3NMSMTHR - private symbols & lines
c:\utils\inetmgr\patch01\3NMSMTHR.pdb
[SYMCHK] MODULE64 Info ----------------------
[SYMCHK] Struct size: 1680 bytes
[SYMCHK] Base: 0x10000000
[SYMCHK] Image size: 32768 bytes
[SYMCHK] Date: 0x4cc1b0f8
[SYMCHK] Checksum: 0x00000000
[SYMCHK] NumSyms: 0
[SYMCHK] SymType: SymPDB
[SYMCHK] ModName: 3NMSMTHR
[SYMCHK] ImageName: C:\utils\inetmgr\patch01\3NMSMTHR.DLL
[SYMCHK] LoadedImage: C:\utils\inetmgr\patch01\3NMSMTHR.DLL
[SYMCHK] PDB: "c:\utils\inetmgr\patch01\3NMSMTHR.pdb"
[SYMCHK] CV: RSDS
[SYMCHK] CV DWORD: 0x53445352
[SYMCHK] CV Data: I:\usr\bpi\adrutl\3NMSMTHR.pdb
[SYMCHK] PDB Sig: 0
[SYMCHK] PDB7 Sig: {A865C40A-5070-4752-AD1F-CD3087843807}
[SYMCHK] Age: 4
[SYMCHK] PDB Matched: TRUE
[SYMCHK] DBG Matched: TRUE
[SYMCHK] Line nubmers: TRUE
[SYMCHK] Global syms: TRUE
[SYMCHK] Type Info: TRUE
[SYMCHK] ------------------------------------
SymbolCheckVersion 0x00000002
Result 0x001f0001
DbgFilename
DbgTimeDateStamp 0x4cc1b0f8
DbgSizeOfImage 0x00008000
DbgChecksum 0x00000000
PdbFilename c:\utils\inetmgr\patch01\3NMSMTHR.pdb
PdbSignature {A865C40A-5070-4752-AD1F-CD3087843807}
PdbDbiAge 0x00000004
[SYMCHK] [ 0x00000000 - 0x001f0001 ] Checked "C:\utils\inetmgr\patch01\3NMSMTHR.DLL"
SYMCHK: FAILED files = 0
SYMCHK: PASSED + IGNORED files = 1
This finds the PDB in the right path I've given it (note that I copied this exact PDB file to e:\ads_symbols which is the path seen in the lm output). This symchk output states Line Numbers: true and thus I expect to see private style information. However, if I run ~kv then for my functions in the stack trace I see:
00bef2ac 01052a8a 00000000 00000000 00020aa4 3NMSMTHR!BPMThrProcTerm+0x2c0
00bef2cc 100073eb 00bef4d8 00000000 00000000 3NMSMTHR!BPMThrThreadInitName+0x2a
And this doesn't seem like its reading the private information-- I don't get the source listing like I do for the MS CRT functions which have private symbols on the MSFT symbol server. Also if I do x /t /d 3NMSMTHR!ThreadInitName then I get
01052a60 <NoType> 3NMSMTHR!BPMThrThreadInitName = <no type information>
And lastly if I try to use .frame3 (to go to that frame) and then execute dv to display the locals, I receive the:
0:001> .frame
03 00bef2cc 100073eb 3NMSMTHR!BPMThrThreadInitName+0x2a
0:001> dv
Unable to enumerate locals, HRESULT 0x80004005
Private symbols (symbols.pri) are required for locals.
Type ".hh dbgerr005" for details.
This doesn't make sense to me. Any help would be much appreciated. My overall goal is to get the parameter and source information. OR to confirm that the PDB file I have is in fact NOT the private symbols. I didn't build this DLL or PDB nor do I know any specifics about the linker options passed to it.
Thanks!
EDIT:
I failed to mention that I am getting the checksum error:
*** WARNING: Unable to verify checksum for C:\utils\inetmgr\3NMSMTHR.dll
Sorry! I was trying to run the .lines command as suggested below and I see:
*** WARNING: Unable to verify checksum for C:\utils\inetmgr\3NMSMTHR.dll
DBGHELP: 3NMSMTHR - private symbols & lines
e:\ads_symbols\3NMSMTHR.pdb
Line number information will not be loaded
So I guess that's my problem. Which leads to my next question: is there a way to fix the checksum (which is listed as 0, see above symchk output)? This PDB is the correct one given the symchk output. Can I have it bypass the checksum check?
EDIT2:
For anyone else that comes across this: I was able to fix the checksum warning by:
editbin /release 3NMSMTHR.DLL
This set the checksum in the PE header. Then I had to run the
.symopt+0x40
In WinDbg in order to force it to load the PDB even though the timestamp on the DLL was different. I'm sure that alternatively I could've used some utility to update the modified timestamp as well.
That fixed the warning about the checksum...but STILL no parameter info (running dv on the right frame), no source line info, etc.
So now I'm lost. Is it possible that these PDBs don't contain that info? How could I confirm that? How would I build them to contain it? We use NMAKE to build these.
EDIT3:
I rebuilt the DLL and PDB as DEBUG and then got all of the stack trace information that I expected. So now my question is: (1) is it possible to build in release and get the static functions, parameter info, etc. (private symbol info)? and (2) the stack trace I was getting with the release dlls+pdbs was incorrect-- the first function entrypoint was correct, but then the next stack frame showed a func that wasn't called. My assumption is that the release DLL inlined some functions and somehow the PDB was just "guessing" at the function in that frame? Very strange.
Did you try the .lines command?
If you want to be able to make sense of dumps or stack traces even in Release mode, you should ensure the following:
You compile with /Zi or /ZI (Debug Information Format is one of the two Program Database options).
You do not compile with /Oy (Omit Frame Pointers).
You link with /DEBUG (Generate Debug Info).
You keep (but don't distribute) the resulting .pdb file.
The main thing is to avoid omitting frame pointers; omitting them saves a little bit of time/space in a function call but makes it very hard to stack walk. Note that you may still get odd stack traces from release builds due to other optimisation settings (particularly inlining) but they should still have the majority of interesting functions.
You will not have type information if the function is written in assembly language. Also it is possible that a static library was linked to the DLL and the static library did not have full debug information.
I know this is old, but for anyone coming across this issue, what worked for me was to run ".lines -e". This is probably what Naveen was suggesting.