finding malware in 2 different files of the same program - malware

So this is an intro class I am taking in reverse engineering.
So I have two files that are the same program and one is supposed to have a trojan in it.
I looked at both files and have found some very odd things. However, I don't have reasons as to why it would happen.
The PE header is different. In one file in the DOS header the PE header is located at offset F0 and the other at F8. Why? I don't really understand. Why would someone change the PE header by 8 bytes?
I noticed the code entry points are different too. Does this mean that the start of the program is jumping else where meaning both programs are running from different locations.
I noticed all of the RVA's for say the export or import table have increased or shifted up higher. I assume this is because the PE header shifted by 8 bytes, therefore everything else in the file will shift up too.
The size of code value is different, as I found one file is a bit larger than the other. The time stamps are different too meaning that the file must have been edited.
One of the files has the import symbol execve, while the other does not. I don't know what this symbol does?
Lastly, I think 1 of the export symbols has jumps and such, that the other does not have. Meaning that it is doing something it shouldn't be doing.
Anyway, these are some observations I have noticed. I just need help making sense of what these observations might mean.
Thanks.
A Noob reverse engineer.

hopefully this will clear some things up.
I noticed the code entry points are different too. Does this mean that the start of the program is jumping else where meaning both programs are running from different locations.
Ok the change in the code entry points can clearly indicate that the code has been tampered with and often means that the malicious code will be called on entry and then the malicious code will run the normal code there-after. This is done so that the user does not notice the application has been tampered with.
The size of code value is different, as I found one file is a bit larger than the other. The time stamps are different too meaning that the file must have been edited.
The change in size can also indicate that there is malicious code in the executable because executables are not supposed to grow (I don't know you are feeding yours).
One of the files has the import symbol execve, while the other does not. I don't know what this symbol does?
As for execv, please see _execv, _wexecv MSDN

Related

What does "Has been cloaked" mean?

I'm working on moving some types to a matching file refactoring in a VB.NET application written years ago by someone whose no longer with us. He tended to put lots of classes in a file. For example, he'd put a dozen classes into a BaseClases.vb file. It has become hard to find those classes, so I started performing this refactoring to help me as I support the app.
However, every time I'd refactor a type into a separate class file, VS 2019 would generate a message that reads like this:
"The item SomeClass.vb has been cloaked. See output tool window for information on any other errors."
I'd look at the Errors window but didn't see anything obvious. Although there's lots of other errors unrelated to this, so it might be there, and I've missed it in the crowd of errors.
I've searched SO for the phrase "has been cloaked". It only brought up 2 posts, neither of which addressed the situation I'm in.
So, what does it mean to have a new class cloaked? From what is it being cloaked?

Too Few Observation Sequences while training new voice for MarryTTS

I'm trying to build a new voice for MaryTTS in German for a while now, but didn't succeed so far. I followed a tutorial (https://github.com/marytts/marytts/wiki/HMMVoiceCreation) and tried to understand each step. No matter what I do, I get stuck at step 14 (HMMVoiceMakeVoice), the error being:
ERROR [+2121] HInit: Too Few Observation Sequences
which usually means, that the tested phone (en9 in this example) is not found within my data set.
After changing the locale, the same error happend on the phone "de27" as Nikolay Shmyrev pointed out.
I doubt that though, since I use about 500 Audio files, which have a length of at least 5 sec, so a total well over an hour of footage.
In fact, I skipped the "en9" phone, since I don't know what exactly is represented by it. The next one to fail was "oI", which I located manually about ten times in the first few audio files.
I think it has to do with the automatic labeling to not work properly (step 2-4), but I don't know, what I can do, to get a better result?
Edit: I uploaded all the files I get until this step, which can be inspected on this shared google drive. Note, that I could not, for copyright reasons, upload the wav folder. In the logs directory, you can find the logs after each step. I couldn't find any problems there, but maybe someone will.
I do not completely understand the structure of the generated data, but I thought changing the MARYBASE/mary/trickyPhones.txt and running the make tools again would be enough to change the map name from "tS" to "Z" which sounds about the same in German. But the HMMVoiceMakeVoice still results in the same output.

How to find the size of a reg in verilog?

I was wondering if there were a way to compute the size of a reg in Verilog. I researched it quite a bit, and found $size(a), but it's only in SystemVerilog, and it won't work in my verilog program.
Does anyone know an alternative for this??
I also wanted to ask as a side note; I'm having some trouble with my test bench in the sense that when I update a value in the file, that change is not taken in consideration when I simulate. I've been told I might have been using an old test bench but the one I am continuously simulating is the only one available in this project.
EDIT:
To give you an idea of what's the problem: in my code there is a "start" signal and when it is set to 1, the operation starts. Otherwise, it stays idle. I began writing the test bench with start=0, tested it and simulated it, then edited the test bench by setting start to 1. But when I simulate it, the start signal remains 0 in the waveform. I tried to check whether I was using another test bench, but it is the only test bench I am using in this project.
Given that I was on a deadline, I worked on the code so that it would adapt to the "frozen" test bench. I am getting now all the results I want, but I wanted to test some other features of my code, so I created a new project and copy pasted the code in new files (including the same test bench). But when I ran a simulation, the waveform displayed wrong results (even though I was using the exact same code in all modules and test bench). Any idea why?
Any help would be appreciated :)
There is a standardised way to do this, but it requires you to use the VPI, which I don't think you get on Modelsim's student edition. In short, you have to write C code, and dynamically link it to the simulator. In the C code, you can get object properties using routines such as vpi_get. Useful properites might be vpiSize, which is what you want, vpiLeftRange, vpiRightRange, and so on.
Having said all that, Verilog is essentially a static language, and objects have to be declared with a static width using constant expressions. Having a run-time method to determine an object's size is therefore of pretty limited value (since you should already know it), and may not solve whatever problem you actually have. Your question would make more sense for VHDL (and SystemVerilog?), which are much more dynamic.
Note on Icarus: the developers have pushed lots of SystemVerilog stuff back into the main language. If you take advantge of this you may find that your code is not portable.
Second part of your question: you need to be specific on what your problem actually is.

Any way to determine when a .net program was compiled/built

I need to prove that a VB.NET program that I wrote was written at a particular time.
(the reason is an academic integrity investigation where someone copied my code).
I have all the code on my disk including the debug and release folders, with my username in the build paths.
Are their addition things I could do, such as looking in the binaries?
If you use IL Disassembler to open the EXE/DLL, then select menu option View>Header, there is a field called "Time-date stamp" in the COFF/PE header. It's in binary format, and according to MSDN it is:
The low 32 bits of the time stamp of the image. This represents the date and time the image was created by the linker. The value is represented in the number of seconds elapsed since midnight (00:00:00), January 1, 1970, Universal Coordinated Time, according to the system clock.
First thing you should do it copy all of the data as it stands to another device - making sure you preserve all date times. Do not open or edit any of the files.
Each file will have three timestamps, when it was created, when it was last modified etc. These can be found using DIR /T
/T Controls which time field displayed or used for sorting
timefield C Creation
A Last Access
W Last Written
Get a listing of the directory like this:
DIR myrootdir /s /ah /as /tc > fileslist.txt
This will dump out all the files with creation times to a file called fileslist.txt
Also as #EricJ says : offer your disk as evidence - but like I said make a copy first. It would be best to make an image copy (windows backup) to an another drive first.
The investigators are going about this all the wrong way.
Any timestamp data can be faked, so the best way would be for them to sit down and ask detailed questions about how the code works, to both parties seperately.
Or to ask both parties to complete a small test project, again seperately - under exam conditions.
The one that copied the work wont understand what they copied most likely, and wont be able to reproduce something based on similar concepts.
The one who did write it - well unless they cheated to, they will understand it all in depth.

Using open source SNES emulator code to turn a rom file into a self-contained executable game

Would it be possible to take the source code from a SNES emulator (or any other game system emulator for that matter) and a game ROM for the system, and somehow create a single self-contained executable that lets you play that particular ROM without needing either the individual rom or the emulator itself to play? Would it be difficult, assuming you've already got the rom and the emulator source code to work with?
It shouldn't be too difficult if you have the emulator source code. You can use a method that is often used to store images in c source files.
Basically, what you need to do is create a char * variable in a header file, and store the contents of the rom file in that variable. You may want to write a script to automate this for you.
Then, you will need to alter the source code so that instead of reading the rom in from a file, it uses the in memory version of the rom, stored in your variable and included from your header file.
It may require a little bit of work if you need to emulate file pointers and such, or you may be lucky and find that the rom loading function just loads the whole file in at once. In this case it would probably be as simple as replacing the file load function with a function to return your pointer.
However, be careful for licensing issues. If the emulator is licensed under the GPL, you may not be legally allowed to store a proprietary file in the executable, so it would be worth checking that, especially before you release / distribute it (if you plan to do so).
Yes, more than possible, been done many times. Google: static binary translation. Graham Toal has a good howto paper on the subject, should show up early in the hits. There may be some code out there I may have left some code out there.
Completely removing the rom may be a bit more work than you think, but not using an emulator, definitely possible. Actually, both requirements are possible and you may be surprised how many of the handheld console games or set top box games are translated and not emulated. Esp platforms like those from Nintendo where there isnt enough processing power to emulate in real time.
You need a good emulator as a reference and/or write your own emulator as a reference. Then you need to write a disassembler, then you have that disassembler generate C code (please dont try to translate directly to another target, I made that mistake once, C is portable and the compilers will take care of a lot of dead code elimination for you). So an instruction of a make believe instruction set might be:
add r0,r0,#2
And that may translate into:
//add r0,r0,#2
r0=r0+2;
do_zflag(r0);
do_nflag(r0);
It looks like the SNES is related to the 6502 which is what Asteroids used, which is the translation I have been working on off and on for a while now as a hobby. The emulator you are using is probably written and tuned for runtime performance and may be difficult at best to use as a reference and to check in lock step with the translated code. The 6502 is nice because compared to say the z80 there really are not that many instructions. As with any variable word length instruction set the disassembler is your first big hurdle. Do not think linearly, think execution order, think like an emulator, you cannot linearly translate instructions from zero to N or N down to zero. You have to follow all the possible execution paths, marking bytes in the rom as being the first byte of an instruction, and not the first byte of an instruction. Some bytes you can decode as data and if you choose mark those, otherwise assume all other bytes are data or fill. Figuring out what to do with this data to get rid of the rom is the problem with getting rid of the rom. Some code addresses data directly others use register indirect meaning at translation time you have no idea where that data is or how much of it there is. Once you have marked all the starting bytes for instructions then it is a trivial task to walk the rom from zero to N disassembling and or translating.
Good luck, enjoy, it is well worth the experience.