Can I set a maximum value for a number in protobuf?

In protobuf, we only have the choice of using signed or unsigned 32- or 64-bit integers to limit the range of a value.
However, the data structure I want to define contains a mixture of 8-bit, 16-bit and 32-bit integers to save space on embedded devices. On them, the data structure is also implemented somewhat differently and requires reserved special values for some fields, so the maximum number for them is not a power of 2.
On these embedded devices, the protobuf definition is only used for transmission to and from them, not for actual storage. So I could just limit the numbers when reading them in.
However, I'd rather define these maximum values in the .proto or .options file to make sure all client applications are aware of these limitations.
Is there a way to do this?
I know there are field options, but the ones listed here do not include an option for this. It is possible to create custom options, but that seems to require writing a compiler extension, which means I would have to manually implement this limit checking for every language I want to compile to, and that costs more time than it will ever save.

This is not possible in protobuf by default, and the specification includes no syntax to enforce limits like this.
However, some third party implementations do include such support.
For example, my own nanopb has the int_size option:
int_size: Override the integer type of a field. (To use e.g. uint8_t to save RAM.)
This will return an error at runtime from pb_decode() if the value does not fit in the field.

No, there is no syntax for expressing that intent, and no built-in tool or code generator that will enforce the rule you want to add. You would need to handle this manually.
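Handling it manually can be as small as a table of per-field maxima that is checked right after a message is decoded. The following is only a minimal sketch: the field names and limits are hypothetical, and nothing here is part of protobuf or nanopb.

```python
# Hypothetical per-field maxima; the names and limits are illustrative only.
FIELD_MAX = {
    "channel": 0xFE,      # 8-bit field, 0xFF reserved as a special value
    "sensor_id": 0xFFFD,  # 16-bit field with two reserved values
}


def validate(fields):
    """Return a list of error strings for values outside 0..max."""
    errors = []
    for name, value in fields.items():
        limit = FIELD_MAX.get(name)
        # Fields without an entry in FIELD_MAX are not checked.
        if limit is not None and not 0 <= value <= limit:
            errors.append(f"{name}={value} outside 0..{limit}")
    return errors
```

Keeping the table in one shared file at least gives every client application a single place to look the limits up, even though each language still needs its own copy of the check.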

Related

Is there a way to get exported constants from an Objective-C framework? [duplicate]

I'm trying to find a constant (something like a secret token) inside an iOS app in order to build an app using an undocumented web API (by the way, I'm not into anything illegal). So far, I have the decrypted app executable on my Mac (jailbreak + SSH + dumping the decrypted executable as a file). I can use the strings command to get a readable list of strings, and I can use the class-dump tool (http://stevenygard.com/projects/class-dump/) to get a list of interface definitions (headers) of the classes. Although this gives me an idea of the app's inner workings, I still can't find the constants I'm looking for. There are literally thousands of string definitions in the strings command dump. Is there any way to dump the strings so that I get the names of the NSString constants together with their values? I don't need the implementation details of the methods; I know that it's compiled and all I can get is assembly code. But if I can get the names of the string constants (both in the strings dump and the class dump) and also the string values (in the strings dump), I think there may be a way to associate them with each other.
Thanks,
Can.
Unfortunately, no, unless there's some black magic tool out there that I'm unaware of, or unless the executable was built with debug symbols (which is likely not the case). If there are debug symbols, you should be able to run it through a debugger and get variable names.
At compile time, the compiler strips off the name of the constant and replaces all occurrences of the constant in the code with the address of its location in memory (which is usually the same byte offset as inside the executable). Because of this, the original name of the constant is lost, leaving only the value. This is why you can't find the constant names anywhere.
To try to find the secret token, I would capture all the data traffic that the app creates and then look for the same patterns in the binary. If the token is indeed in there, and it isn't obfuscated somehow, then at least that narrows it down for you greatly.
Good luck! RE can be very rewarding but sometimes it really sucks.

Can hard-coded strings in a compiled exe be changed?

Let's say you have some code in your app with a hard-coded string.
If somevalue = "test123" Then
End If
Once the application is compiled, is it possible for someone to modify the .exe file and change 'test123' to something else? If so, would it only work if the string contained the same number of characters?
It's possible but not necessarily straightforward. For example, if your string is loaded in memory, someone could use a memory editing tool to modify the value stored at the string's address directly.
Alternatively, they could decompile your app, change the string, and recompile it to create a new assembly with the new string. However, whether this is likely to happen depends on your app and how important it is for that string to be changed.
You could use an obfuscator to make it a bit harder to do but, ultimately, a determined cracker would be able to do it. The question is whether that string is important enough to worry about and, if so, maybe consider an alternative approach such as using a web service to provide the string.
Strings hard-coded without any obfuscation techniques can easily be found inside compiled executables by opening them in any hex editor. Once found, the string can be replaced in 2 ways:
1. Easy way (*conditions apply)
If the following conditions apply in your case, this is a very quick-fire way of modifying the hard-coded strings in the executable binary.
length(new-string) <= length(old-string)
No logic in the code to check for executable modification using CRC.
This is a viable option ONLY if the new string is the same length as or shorter than the old string. Use a hex editor to find occurrences of the old string and replace them with the new string. Pad any leftover space with null bytes, i.e. 0x00.
For example, old-long-string in the binary is modified to a shorter new-string and padded with null characters to the same length as the original string in the binary executable file.
Note that such modifications to the executable file are detected by any code that verifies the checksum of the binary file against the precalculated checksum of the original binary executable file.
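The same-length replacement with null padding can be sketched in a few lines. This is an illustration only: patch_string is a hypothetical helper, it works on the raw bytes of the file, and it assumes the string is stored as plain single-byte text.

```python
def patch_string(image: bytes, old: bytes, new: bytes) -> bytes:
    """Replace every occurrence of old with new, null-padded to len(old)."""
    if len(new) > len(old):
        raise ValueError("new string must not be longer than the old one")
    # Pad with 0x00 so every byte after the string keeps its original offset.
    padded = new + b"\x00" * (len(old) - len(new))
    return image.replace(old, padded)
```

Because the padded replacement has exactly the length of the original, the overall file size and all other offsets stay unchanged.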
2. Harder way (applicable in almost all cases)
Decompiling the binary back to source code opens up the possibility of modifying any strings (and even code) and rebuilding it to obtain a new binary executable.
There are dozens of decompiler tools for VB.NET (and .NET in general). An excellent detailed comparison of the most popular ones (ILSpy, JustDecompile, dotPeek and .NET Reflector, to name a few) can be found here.
There do exist scenarios in which even the harder way will NOT be successful. This is the case when the original developer has used obfuscation techniques to prevent the strings from being detected and modified in the executable binary. One such obfuscation technique is storing encrypted strings.

Is there any way in Fortran 90 to read data at a specified byte?

I have encountered a problem that demands reading data at a specified byte from a binary input file, like reading at a location 40000 bytes off the start of the file. I intend to use direct access to the file, but that requires the file to be divided into segments of the same size, which is specified in the recl argument. Can anybody provide a feasible solution? Some programming languages like C provide a function that can jump to a specified byte.
The Fortran 2003 standard introduced unformatted stream access to do pretty much exactly this. Once the file has been opened appropriately, you can just use a POS specifier in the relevant read statement. Support for this Fortran 2003 feature is reasonably widespread amongst the Fortran compilers that are actively supported. The compiler needs to use a file storage unit of a byte, but all compilers that I am aware of do this (this is also what the standard recommends).
Otherwise, the closest standard Fortran 90 approach is to use unformatted direct access with a record length that is some reasonable common factor of the desired position and the size of the elements of data to be read. For instance, if you were reading eight-byte real numbers from the file, then a record length of eight might work: you would start reading at record number 5001, since records are numbered from 1 and record 5001 begins 40000 bytes into the file. This requires both that the file storage unit of the Fortran processor be a byte (common, perhaps with compile options) and that no record delimiters or similar exist in the file for unformatted direct access (mostly the case, again perhaps with compile options).
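Both approaches count from 1 rather than 0, which is easy to get wrong when starting from a byte offset. The arithmetic can be sketched as follows (in Python purely for illustration; the helper names are hypothetical, and a file storage unit of one byte is assumed):

```python
def stream_pos(byte_offset: int) -> int:
    """POS value for stream access; POS=1 is the first byte of the file."""
    return byte_offset + 1


def direct_record(byte_offset: int, recl_bytes: int) -> int:
    """1-based record number of the record that starts at byte_offset."""
    if byte_offset % recl_bytes != 0:
        raise ValueError("offset is not a multiple of the record length")
    return byte_offset // recl_bytes + 1
```

The check in direct_record reflects the constraint mentioned above: direct access only works when the record length divides the desired position.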

Changing embedded serial number

I have a Serial number string "1080910" embedded in a programmable device which has been downloaded to a binary file using the ALL-100 programmer. This is my Master file as it were. I need to change this serial number to that of the unit that I need to re-flash using the Master file - the ALL-100 programmer uses XACCESS User Interface which has Edit feature showing Address location, Hex data field and Ascii field. Somewhere in this file is the serial number string - can anybody assist me in how to locate and edit the serial number string as I have been unable to locate it using the search function and have not been able to visually pick up the sequence of numbers. Help !!!
If the data has a symbolic address in the source code, and is not a local variable, its address will appear in the map file generated by the linker. If it is a local variable initialised with a literal constant, then the data will exist in the static initialisation data the location of which should also be identified in the map file.
Another possibility is that your application image is compressed and the start-up code expands it into RAM at run-time. This will be obvious in the map file if the data and code addresses are in RAM rather than ROM. If this is the case then what you are attempting will be very difficult. You would have to know the compression algorithm used, and which part of the image is the compressed part (part of it will be the decompression code that runs from ROM). You would then have to decompress the image, modify the string, and then recompress it. Further, if the decompression performs any kind of checksum on the compressed or decompressed data, you will have to recalculate and modify that too.
If this was a requirement from the outset, you would have done better to reserve the space in the linker script or use compiler specific extensions to absolutely locate the data at a specific location.
Maybe it is stored as Unicode (UTF-16), so every other byte is 00 and a plain ASCII search will not find it.
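One way to test that idea is to search the dumped image for the serial number in several encodings at once. A minimal sketch, assuming the image has been read into memory as bytes (find_serial is a hypothetical helper, not a feature of the ALL-100 software):

```python
def find_serial(image: bytes, serial: str) -> dict:
    """Return the first byte offset of serial per encoding (-1 if absent)."""
    candidates = {
        "ascii": serial.encode("ascii"),
        "utf-16-le": serial.encode("utf-16-le"),  # 31 00 30 00 ...
        "utf-16-be": serial.encode("utf-16-be"),  # 00 31 00 30 ...
    }
    return {name: image.find(pattern) for name, pattern in candidates.items()}
```

If every encoding returns -1, the number is probably stored in some other form (for example as a binary integer or compressed), which would match the difficulties described in the longer answer.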

Process for reducing the size of an executable

I'm producing a hex file to run on an ARM processor which I want to keep below 32K. It's currently a lot larger than that and I wondered if someone might have some advice on what's the best approach to slim it down?
Here's what I've done so far
So I've run 'size' on it to determine how big the hex file is.
Then 'size' again to see how big each of the object files are that link to create the hex files. It seems the majority of the size comes from external libraries.
Then I used 'readelf' to see which functions take up the most memory.
I searched through the code to see if I could eliminate calls to those functions.
Here's where I get stuck, there's some functions which I don't call directly (e.g. _vfprintf) and I can't find what calls it so I can remove the call (as I think I don't need it).
So what are the next steps?
Response to answers:
As I can see there are functions being called which take up a lot of memory. I cannot however find what is calling it.
I want to omit those functions (if possible) but I can't find what's calling them! Could be called from any number of library functions I guess.
The linker is working as desired, I think, it only includes the relevant library files. How do you know if only the relevant functions are being included? Can you set a flag or something for that?
I'm using GCC
General list:
Make sure that you have the compiler and linker debug options disabled
Compile and link with all size options turned on (-Os in gcc)
Run strip on the executable
Generate a map file and check your function sizes. You can either get your linker to generate your map file (-M when using ld), or you can use objdump on the final executable (note that this will only work on an unstripped executable!) This won't actually fix the problem, but it will let you know of the worst offenders.
Use nm to investigate the symbols that are called from each of your object files. This should help in finding who's calling functions that you don't want called.
The original question included a sub-question about including only relevant functions. gcc will include all functions within every object file that is used. To put that another way, if you have an object file that contains 10 functions, all 10 functions are included in your executable even if only 1 is actually called.
The standard libraries (eg. libc) will split functions into many separate object files, which are then archived. The executable is then linked against the archive.
By splitting into many object files the linker is able to include only the functions that are actually called. (this assumes that you're statically linking)
There is no reason why you can't do the same trick. Of course, you could argue that if the functions aren't called then you can probably remove them yourself.
If you're statically linking against other libraries you can run the tools listed above over them too to make sure that they're following similar rules.
Another optimization that might save you work is -ffunction-sections, -Wl,--gc-sections, assuming you're using GCC. A good toolchain will not need to be told that, though.
Explanation: GNU ld links sections, and GCC emits one section per translation unit unless you tell it otherwise. But in C++, the nodes in the dependency graph are objects and functions.
On deeply embedded projects I always try to avoid using any standard library functions. Even simple functions like "strtol()" blow up the binary size. If possible just simply avoid those calls.
In most deeply embedded projects you don't need a versatile "printf()" or dynamic memory allocation (many controllers have 32kb or less RAM).
Instead of the standard "printf()" I use a very simple custom "printf()" that can only print numbers in hexadecimal or decimal format, nothing more. Most data structures are preallocated at compile time.
Andrew EdgeCombe has a great list, but if you really want to scrape every last byte, sstrip is a good tool that is missing from the list and can shave off a few more kB.
For example, when run on strip itself, it can shave off ~2kB.
From an old README (see the comments at the top of this indirect source file):
sstrip is a small utility that removes the contents at the end of an
ELF file that are not part of the program's memory image.
Most ELF executables are built with both a program header table and a
section header table. However, only the former is required in order
for the OS to load, link and execute a program. sstrip attempts to
extract the ELF header, the program header table, and its contents,
leaving everything else in the bit bucket. It can only remove parts of
the file that occur at the end, after the parts to be saved. However,
this almost always includes the section header table, and occasionally
a few random sections that are not used when running a program.
Note that due to some of the information that it removes, a sstrip'd executable is rumoured to have issues with some tools. This is discussed more in the comments of the source.
Also... for an entertaining/crazy read on how to make the smallest possible executable, this article is worth a read.
Just to double-check and document for future reference, but do you use Thumb instructions? They're 16 bit versions of the normal instructions. Sometimes you might need 2 16 bit instructions, so it won't save 50% in code space.
A decent linker should take just the functions needed. However, you might need compiler and linker settings to package functions for individual linking.
OK, so in the end I just reduced the project to its simplest form, then slowly added files one by one until the function that I wanted to remove appeared in the 'readelf' output. Then, when I had the file, I commented everything out and slowly added things back in until the function popped up again. So in the end I found out what called it and removed all those calls. Now it works as desired... sweet!
Must be a better way to do it though.
To answer this specific need:
• I want to omit those functions (if possible) but I can't find what's calling them!! Could be called from any number of library functions I guess.
If you want to analyze your code base to see who calls what, by whom a given function is being called and things like that, there is a great tool out there called "Understand C" provided by SciTools.
https://scitools.com/
I have used it very often in the past to perform static code analysis. It can really help to determine library dependency tree. It allows to easily browse up and down the calling tree among other things.
They provide a limited time evaluation, then you must purchase a license.
You could look at something like executable compression.