How can I view the differences between two DLLs? - dll

Is there a way to view the difference between two binary DLL files? I have PDBs for both.
Ideally I'd like to see:
What functions have been added
What functions have been removed
What functions have been modified (with a diff of the disassembly)
What other entries (static variables, resources, etc) have been added/removed/modified
Note: this is different from this question as I am dealing with native DLLs.

If you want to compare executable files, you have a couple of alternatives:
BinDiff: a commercial extension for the commercial disassembler IDA Pro. It is a de facto standard tool for reverse engineering. According to the vendor's description, it allows you to:
Identify identical and similar functions in different binaries
Port function names, anterior and posterior comment lines, standard comments and local names from one disassembly to the other
Detect & highlight changes between two variants of the same function
http://www.zynamics.com/bindiff.html
There is also a free alternative: PatchDiff. Like BinDiff, it is a plugin for IDA Pro. According to the developer, PatchDiff can perform the following tasks:
Display the list of identical functions
Display the list of matched functions
Display the list of unmatched functions (with the CRC)
Display a flow graph for identical and matched functions
http://cgi.tenablesecurity.com/tenable/patchdiff.php

A "low tech" approach (no disassembly) would be to use DUMPBIN /ALL (or another switch, depending on what exactly you want to know) on the DLLs and do a text compare on the result.
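The text-compare approach above can be scripted. What follows is a minimal sketch, assuming dumpbin is on the PATH (for example from a Visual Studio developer prompt); the function names dump and diff_dumps are made up for illustration. Expect some noise in the result, since the dumps also contain things like file names and timestamps.

```python
import difflib
import subprocess


def dump(dll_path):
    """Run DUMPBIN /ALL on one DLL and return its text output as a list of lines."""
    result = subprocess.run(
        ["dumpbin", "/ALL", dll_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def diff_dumps(old_lines, new_lines, old_name="old.dll", new_name="new.dll"):
    """Return a unified diff (as a list of lines) between two text dumps."""
    return list(difflib.unified_diff(
        old_lines, new_lines, fromfile=old_name, tofile=new_name, lineterm="",
    ))
```

Typical use would be print("\n".join(diff_dumps(dump("v1.dll"), dump("v2.dll")))), where v1.dll and v2.dll stand in for your own files. Swapping /ALL for a narrower switch such as /EXPORTS keeps the diff focused on added and removed functions.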

Structure of QuickTime's 'dref' atom 'alis' element

I need to rewrite a QuickTime reference movie, making it point to another set of files.
I'm working in a Windows environment, so I don't have access to the QuickTime API. Because the referenced files are inaccessible, I also can't use the COM interface to load the movie: it can't resolve the referenced paths.
The documentation in the "QuickTime File Format Specification" says that the 'dref' atom can have a list of 'alis', 'url ' and 'rsrc' data references. In this case I need to parse the 'alis' elements. According to the reference, "Data reference is a Macintosh alias".
So far, I have not been able to find a declaration of the structure or any related information. Do you know the structure of an alias record? Where can I find detailed information about its structure?
Thank you a lot for your help!
The format is very similar to the sort of alias that you could generate in the Finder by right-clicking an item, and creating an alias to it.
Aside: When the QuickTime format was originally specified, Apple intelligently chose to incorporate a number of other standards and paradigms that were already extensively used elsewhere in the OS. This is one of the reasons why QT is (or was) able to do really clever things like reference movies. Unfortunately, there's also now a lot of cruft left over from OS features that are no longer relevant (e.g. AppleShare). Back in its heyday, QuickTime was slick, especially compared to its competitors; today, it's vastly underappreciated due to the buggy Windows port and the relatively low processing power of the desktop systems of its time.
Back on topic: unfortunately, the format for alias files is not an open, published standard, and there is precious little documentation on it on the net. There's one really old doc that deconstructs the alias format used in Mac OS Classic. Although the structure used in OS X is very similar, the alias files themselves tend to be much larger, as they contain numerous extra data strings at the end of the file that are not covered in that document.
Also, aliases created in the Finder do look a bit different from the ones contained within the dref atom, although I've never run through them bit by bit to deduce the actual differences. If you want to take a peek at those files and have the OS X Developer Tools installed, you can run
setfile -a a [filename]
on a Finder-generated alias to strip the file of its alias-ness so that you can look at its contents in a hex editor (otherwise, the OS will just redirect you to the linked file - doh!). You can re-set the file's alias attribute, or arbitrarily designate any file as an alias by running
setfile -a A [filename]
Unfortunately, during my experiments, dumping the alis portion of a QT movie's dref atom has never seemed to generate an alias that Mac OS was able to interpret.
Fortunately (or not, as it was in my case), the functions that Mac OS allegedly uses to create/handle aliases are part of a public API called the Alias Manager, which is part of the very-low-level CoreServices framework. If you've got time to delve into this further, you can write some code to experiment with Mac OS's built-in alias-generating and interpreting capabilities.
Unfortunately, if you're dealing with an old/buggy file, you have no way of knowing if the file was actually generated by CoreServices' Alias Manager, or if that framework has changed/evolved/regressed since then. Because it's a closed format, 3rd-party developers who opt to not use the Alias Manager can only take guesses as to the format's "legal" structure.
You can use this Java program to see what is in the header and extract data (it's a bit old, but may still work). More useful, though, is the author's thorough discussion of the QuickTime header.
But I think you may just be looking for the Apple documentation, currently found here.
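For the parsing step the question describes, the container around each 'alis' entry is at least documented, even if the alias record inside is not. A minimal sketch, assuming the layout given in the QuickTime File Format Specification (big-endian size and type, one version byte, three flag bytes, a 32-bit entry count, then one size/type/version/flags header per data reference); parse_dref is a made-up name, and the alias payload is returned as opaque bytes rather than interpreted:

```python
import struct


def parse_dref(atom):
    """Split a complete 'dref' atom (including its 8-byte header) into entries.

    Returns a list of (type, flags, payload) tuples, where payload is the raw
    data following each entry's version/flags field.
    """
    size, kind = struct.unpack_from(">I4s", atom, 0)
    if kind != b"dref":
        raise ValueError("not a dref atom")
    # Bytes 8-11 hold version (1 byte) and flags (3 bytes); the count follows.
    (count,) = struct.unpack_from(">I", atom, 12)
    entries = []
    offset = 16
    for _ in range(count):
        entry_size, entry_type = struct.unpack_from(">I4s", atom, offset)
        if entry_size < 12 or offset + entry_size > size:
            raise ValueError("corrupt data reference entry")
        flags = int.from_bytes(atom[offset + 9:offset + 12], "big")
        payload = atom[offset + 12:offset + entry_size]
        entries.append((entry_type.decode("latin-1"), flags, payload))
        offset += entry_size
    return entries
```

Rewriting the movie would then mean replacing the entries and recomputing the size fields of the dref atom and every atom that encloses it, which this sketch does not attempt.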

Is it possible to override the behavior of a merge module

Suppose I have a merge module that installs a file "MyFile.txt" to a certain location. I wish to use that merge module, but I want to supply a different copy of "MyFile.txt" from the one supplied with the merge module.
Is it possible to do this? (And for bonus points, how can I do this using WiX?)
Update: Roughly speaking, MyFile.txt is part of a packaged-up component of installable items that we provide to others; they then combine these components with their own to produce an installer.
In an ideal world they would only need to add new files to the output. However, this is a replacement for an existing system in which they can currently modify or even replace items (such as MyFile.txt) in the end installer, so without the ability to do the same with the merge module the migration path will be difficult.
The packaged up component doesn't need to be a merge module if there is a better solution, however merge modules seemed like the sensible choice and in all other respects provide a very nice re-usable package of installer logic.
It's possible, but every technique that I know is a bit of a hack and doesn't scale very well. Can you tell me more about what type of file MyFile.txt is and what the intent of the different flavors of the file is? Usually my goal is to never have the same filename twice (darn component rules) and then design variation points to support the needs. Sometimes upstream changes to the application are required to do this correctly.

How to determine where, or if, a variable is used in an SSIS package

I've inherited a collection of largely undocumented SSIS packages. The entry point package (i.e. the one that forks off in a variety of directions to call other packages) defines a number of variables. I would like to know how these variables are being used, but there doesn't seem to be an equivalent of "right click/Find All References".
Is there a reliable way to determine where these variables are being used?
A hackish way would be to open the dtsx file in a text editor/xml viewer and search for the variable name.
If it's being used in expressions, it should show it and you can trace the xml tree back up until you find the object it's being used on.
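The search-and-trace-back step can be automated. This is a rough sketch that assumes nothing about the package schema beyond it being XML: it does a plain substring match on every attribute value and text node, and reports the chain of element names leading to each hit. The function name find_variable_uses is made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_variable_uses(dtsx_xml, variable_name):
    """Return (path, where) pairs for every element that mentions variable_name.

    path is the chain of tag names (namespaces stripped) from the root down to
    the element; where is 'text' or the name of the attribute holding the match.
    """
    def local(tag):
        # "{namespace}Name" -> "Name"
        return tag.rsplit("}", 1)[-1]

    hits = []

    def walk(element, trail):
        trail = trail + [local(element.tag)]
        path = "/".join(trail)
        for attr, value in element.attrib.items():
            if variable_name in value:
                hits.append((path, local(attr)))
        if element.text and variable_name in element.text:
            hits.append((path, "text"))
        for child in element:
            walk(child, trail)

    walk(ET.fromstring(dtsx_xml), [])
    return hits
```

Searching for the qualified form (for example User::MyVariable, a hypothetical name) tends to skip the variable's own declaration and leave only the places that reference it. Run it over each child package as well, reading the file contents with open(path, encoding="utf-8").read().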
You can use the BIDS Helper add-in, which gives you visual feedback on where variables are used in your package. That makes it very fast and easy to find them. Besides that, it offers several other valuable features.
Check out: http://bidshelper.codeplex.com/

LD_PRELOAD on AIX

Can someone here tell me if there is something similar to LD_PRELOAD on recent versions of AIX? More specifically I need to intercept calls from my binary to time(), returning a constant time, for testing purposes.
AIX 5.3 introduced the LDR_PRELOAD (for 32-bit programs) and LDR_PRELOAD64 (for 64-bit programs) variables. They are analogous to LD_PRELOAD on Linux. Both are colon-separated lists of libraries, and symbols will be pre-emptively loaded from the listed shared objects before anything else.
For example, if you have a shared object foo.so:
LDR_PRELOAD=foo.so
If you use archives, use the AIX style to specify the object within the archive:
LDR_PRELOAD="bar.a(shr.so)"
And separate multiple entries with a colon:
LDR_PRELOAD="foo.so:bar.a(shr.so)"
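For a test harness, the variable can be set only for the process under test instead of the whole shell. A small sketch along those lines; preload_env and run_with_preload are hypothetical helper names, and the interposing library itself (the one that supplies a constant time()) still has to be written and built separately:

```python
import os
import subprocess


def preload_env(libraries, bits=32, base_env=None):
    """Return a copy of the environment with the AIX preload variable set.

    libraries is a list such as ["foo.so", "bar.a(shr.so)"].
    """
    name = "LDR_PRELOAD64" if bits == 64 else "LDR_PRELOAD"
    env = dict(os.environ if base_env is None else base_env)
    env[name] = ":".join(libraries)  # entries are colon-separated
    return env


def run_with_preload(command, libraries, bits=32):
    """Run command (a list of arguments) with the given libraries preloaded."""
    return subprocess.run(command, env=preload_env(libraries, bits), check=False)
```

Passing the environment explicitly keeps the override out of the harness's own process, so only the binary being tested sees the substituted symbols.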
AIX 5L uses the LDR_PRELOAD variable.
Not that I'm aware of. Closest thing we've done (with malloc/free for debugging) is to
create a new library file with just the functions desired (same name as original).
place it in a different directory to the original.
make a dependency from our library file to the original.
change the LD_LIBRARY_PATH (or SHLIB_PATH?) to put our library first in the search chain.
That way, our functions got picked up first by the loader, any we didn't supply were provided by the original.
This was a while ago. AIX 5L is supposed to be much more like Linux (hence the L) so it may be able to do exactly what you require.
Alternatively, if you have the source, replace the calls to time() with calls to mytime() and provide your own function. You're not testing exactly the same software, but the differences for that sort of minimal change shouldn't matter.

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then 'size' again to see how big each of the object files are that link to create the hex files. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck: there are some functions which I don't call directly (e.g. _vfprintf), and I can't find what calls them, so I can't remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As I can see there are functions being called which take up a lot of memory. I cannot however find what is calling it.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
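The nm step can be narrowed down to a single unwanted symbol. A minimal sketch, assuming the text comes from running nm -A over your object files and static libraries (the -A option prefixes every line with the file name); find_referencing_objects is a made-up name:

```python
def find_referencing_objects(nm_output, symbol):
    """Return the object files that reference symbol without defining it.

    nm_output is the text printed by `nm -A`: one "file:value type name"
    record per line, where undefined symbols have type U and no value.
    """
    users = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        prefix, sym_type, name = fields[0], fields[-2], fields[-1]
        if sym_type == "U" and name == symbol:
            # Drop the trailing ":" (or ":value") to leave the file name.
            obj = prefix.rsplit(":", 1)[0]
            if obj not in users:
                users.append(obj)
    return users
```

This only identifies which object file pulls the symbol in, not which function inside it makes the call, but it turns "could be called from any number of library functions" into a short list of candidates to inspect.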
The original question included a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
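To rank the worst offenders, whether in your own code or in a statically linked library, the symbol table can be sorted by size. A rough sketch, assuming the text comes from readelf -s (the tool the question mentions) run on an unstripped ELF file; largest_functions is a made-up name:

```python
def largest_functions(readelf_output, limit=10):
    """Return up to limit (size, name) pairs for FUNC symbols, largest first.

    readelf_output is the text printed by `readelf -s` on an unstripped file.
    """
    sizes = {}
    for line in readelf_output.splitlines():
        fields = line.split()
        # Symbol rows look like: "50: 00008124   412 FUNC GLOBAL DEFAULT 1 name"
        if len(fields) < 8 or not fields[0].rstrip(":").isdigit():
            continue
        if fields[3] != "FUNC":
            continue
        size = int(fields[2], 0)  # base 0 also accepts a 0x-prefixed size
        name = fields[7]
        sizes[name] = max(size, sizes.get(name, 0))
    ranked = sorted(sizes.items(), key=lambda item: (-item[1], item[0]))
    return [(size, name) for name, size in ranked[:limit]]
```

Like the map file, this fixes nothing by itself; it only shows where the bytes are going.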
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
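Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions. With -ffunction-sections each function gets its own section, so --gc-sections can discard the ones that nothing references.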
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of the standard "printf()" I use a very simple custom "printf()"; this function can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, a sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference: do you use Thumb instructions? They're 16-bit versions of the normal 32-bit instructions. Sometimes you need two 16-bit instructions to do the work of one 32-bit instruction, so the saving in code space is less than 50%.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
Ok, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Once I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls... Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
•I want to omit those functions (if possible) but I can't find what's
calling them!! Could be called from any number of library functions I
guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine the library dependency tree. Among other things, it lets you easily browse up and down the call tree.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.