Reading Binary File - fread

So I am trying to read a filesystem disk image, which has been provided. What I want to do is read the value at byte 1044 of the filesystem. What I am currently doing is the following:
if (fp = fopen("filesysFile-full", "r")) {
    fseek(fp, 1044, SEEK_SET); //seek to byte offset 1044
    int check[sizeof(char)*4]; //creates a buffer array 4 bytes long
    fread(check, 1, 4, fp); //reads 4 bytes from the file
    printf("%d",check); //prints
    int close = fclose(fp);
    if (close == 0) {
        printf("Closed");
    }
}
The value that check should be printing is 1. However, I am getting negative values which keep changing every time I run the program. I don't understand what I am doing wrong. Am I taking the right approach to reading bytes of the disk and printing them?
What I basically want to do is read bytes of the disk and inspect the values at certain offsets. Those bytes are fields which will help me understand the structure/format of the disk.
Any help would be appreciated.
Thank you.

This line:
int check[sizeof(char)*4];
allocates an array of 4 ints (sizeof(char) is guaranteed to be 1, so this is just int check[4]).
When passed to a function, check decays to an int*, so this line:
printf("%d",check);
prints the address of the array.
What you should do is declare it as a single int:
int check;
and then fread into it:
fread(&check, 1, sizeof(int), fp);
(This code, incidentally, assumes that int is 4 bytes.)
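If you want to avoid that assumption, a fixed-width type from <stdint.h> plus a couple of error checks helps. A minimal sketch (it still assumes the 4-byte field at offset 1044 is stored in the machine's native byte order; the file name is taken from the question):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    FILE *fp = fopen("filesysFile-full", "rb");   /* "rb": binary mode matters on Windows */
    if (fp == NULL)
        return 1;

    uint32_t check = 0;                           /* exactly 4 bytes, unlike int */
    if (fseek(fp, 1044, SEEK_SET) != 0 ||         /* seek to byte offset 1044 */
        fread(&check, sizeof check, 1, fp) != 1)  /* read one 4-byte value */
    {
        fclose(fp);
        return 1;
    }

    printf("%lu\n", (unsigned long)check);
    fclose(fp);
    return 0;
}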

int check[sizeof(char)*4]; //creates a buffer array 4 bytes long
This is incorrect. You are creating an array of four integers, which are typically 32 bits each, and then when you printf("%d",check) you are printing the address of that array, which will probably change every time you run the program. I think what you want is this:
if (fp = fopen("filesysFile-full", "r")) {
    fseek(fp, 1044, SEEK_SET); //seek to byte offset 1044
    int check; //a single integer to read into
    fread(&check, 1, sizeof(int), fp); //reads an integer (presumably 1) from the file
    printf("%d",check); //prints
    int close = fclose(fp);
    if (close == 0) {
        printf("Closed");
    }
}
Note that instead of declaring an array of integers, you are declaring just one. Also note the change from fread(check, ...) to fread(&check, ...). The first parameter to fread is the address of the buffer (in this case, a single integer) into which you want to read the data.
Keep in mind that while integers are probably 32 bits long, this isn't guaranteed. Also, on most common architectures (x86, for example) integers are stored little-endian, with the least significant byte first, so you will read 1 only if the data on the disk looks like this at byte 1044:
0x01 0x00 0x00 0x00
If it is the other way around, 0x00 00 00 01, that will be read as 16777216 (0x01000000).
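If the filesystem's documentation specifies a fixed byte order, you can sidestep the host's endianness entirely by assembling the value from individual bytes. A minimal sketch, assuming the field is stored little-endian (adjust the shifts for big-endian):

unsigned char b[4];
if (fread(b, 1, 4, fp) == 4) {
    /* assemble a 32-bit value from 4 little-endian bytes,
       independent of the host machine's byte order */
    unsigned long value = (unsigned long)b[0]
                        | ((unsigned long)b[1] << 8)
                        | ((unsigned long)b[2] << 16)
                        | ((unsigned long)b[3] << 24);
    printf("%lu\n", value);
}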
If you want to read more than one integer, you can use an array as follows:
if (fp = fopen("filesysFile-full", "r")) {
    fseek(fp, 1044, SEEK_SET); //seek to byte offset 1044
    int check[10]; //creates a buffer of ten integers
    fread(check, sizeof(int), 10, fp); //reads 10 integers into the array
    for (int i = 0; i < 10; i++)
        printf("%d ", check[i]); //prints
    int close = fclose(fp);
    if (close == 0) {
        printf("Closed");
    }
}
In this case, check (without brackets) decays to a pointer to the first element of the array, which is why I've changed the fread back to fread(check, ...).
Hope this helps!

Related

Why is the bitfield's least significant bit promoted to the MSb during typecasting in the program below?

Why do we get ffffffff as the output?
#include <stdio.h>

struct bitfield {
    signed char bitflag:1;
};

int main()
{
    unsigned char i = 1;
    struct bitfield *var = (struct bitfield*)&i;
    printf("\n %x \n", var->bitflag);
    return 0;
}
I know that when a memory block the size of the data type is interpreted as a signed type, the most significant bit indicates whether the value is positive (0) or negative (1). But I still can't figure out why -1 (ffffffff) is printed. With only one bit set in the bit-field, I was expecting that bit to become the LSb of a 1-byte char when it gets promoted, because my machine is little-endian.
Can someone please explain? I'm really confused.
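The surprise in the output can be reproduced without the pointer cast at all, which may help isolate where it comes from. A small sketch (the exact behaviour of storing 1 in a 1-bit signed bit-field is implementation-defined, but typical compilers behave as in the question):

#include <stdio.h>

struct bitfield {
    signed char bitflag:1;   /* a 1-bit signed field: it can only hold 0 or -1 */
};

int main(void)
{
    struct bitfield s;
    s.bitflag = 1;   /* 1 does not fit; on typical compilers the stored bit reads back as -1 */
    printf("%d %x\n", s.bitflag, s.bitflag);   /* commonly prints: -1 ffffffff */
    return 0;
}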

STM32 reading variables out of Received Buffer with variable size

I am not really familiar with programming STM32 microcontrollers. I am using the STM32F303RE.
I am receiving data via a UART connection with DMA.
Code:
HAL_UARTEx_ReceiveToIdle_DMA(&huart2, RxBuf, RxBuf_SIZE);
__HAL_DMA_DISABLE_IT(&hdma_usart2_rx, DMA_IT_HT);
I write the received data into a receive buffer and then transfer it into a main buffer. The following function and declarations are placed before int main(void).
#define RxBuf_SIZE 100
#define MainBuf_Size 100

uint8_t RxBuf[RxBuf_SIZE];
uint8_t MainBuf[MainBuf_Size];

void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size){
    if (huart->Instance == USART2){
        memcpy(MainBuf, RxBuf, Size);
        HAL_UARTEx_ReceiveToIdle_DMA(&huart2, RxBuf, RxBuf_SIZE);
    }
    for (int i = 0; i < Size; i++){
        if ((MainBuf[i] == 'G') && (MainBuf[i+1] == 'O')){
            RecieveData();
            HAL_UART_DMAStop(&huart2);
        }
    }
}
I now receive the data into a buffer, and it stops as soon as "GO" is transmitted. Up to this point it is working. The function ReceiveData() should then convert this buffer into the variables, but it isn't working for me.
Now I want to split this received data at "breakpoints" into variables.
So I want to send: "S2000S1000S1S10S2GO".
I always have 5 variables (in this case: 2000, 1000, 1, 10, 2). I want to read the values out of the string and convert each one into a uint16_t for further processing. The size/length of each variable can change, which is why I tried to use the 'S' characters as a kind of breakpoint/separator.
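A minimal parsing sketch in plain C (the helper name parseMessage is made up for illustration; it assumes the message always has the form "S<number>...GO" and that the buffer is null-terminated):

#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a message such as "S2000S1000S1S10S2GO" into up to five values.
   Returns the number of values actually found. */
static int parseMessage(char *msg, uint16_t values[5])
{
    int count = 0;
    char *p = msg;
    while (count < 5 && (p = strchr(p, 'S')) != NULL) {
        p++;                                        /* skip the 'S' separator */
        values[count++] = (uint16_t)strtoul(p, &p, 10);
    }
    return count;
}

Called on MainBuf (cast to char * and null-terminated first), this would yield 2000, 1000, 1, 10 and 2 for the example string.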

Objective C char array mismatch

I am having an issue getting the correct format of a char array in Objective C
Correct sample:
unsigned char bytes[] = {2, 49, 53, 49, 3, 54};
When printing to the debug area I get this:
Printing description of bytes:
(unsigned char [6]) bytes = "\x02151\x0365"
Incorrect sample:
I then attempt to populate an unsigned char array with characters manually (via a for-loop that produces the below samples):
unsigned char bb[64];
bb[0] = 2;
bb[1] = 49;
bb[2] = 52;
bb[3] = 49;
bb[4] = 3;
bb[5] = 54;
When printing to the debug area I get this:
Printing description of bb: (unsigned char [64]) bb = "\x02151\x036";
Also, when expanding the array while debugging, I can see Xcode telling me that the 'bytes' array has int values and the 'bb' array has characters such as '\x02' in it.
This is just a high level piece of code that does not do much yet, but I need to match the array named 'bytes' before being able to proceed.
Any ideas? Thanks
You don't:
state what kind (local, instance, etc.) of variables bytes and bb are and that makes a difference;
show your for loop; or
state what you mean by "printing"
so this answer is going to be a guess!
Try the following code (it's a "complete" Cocoa app):

@implementation AppDelegate

unsigned char bytes[] = {2, 49, 53, 49, 3, 54};
char randomBytes[] = { 0x35, 0x0 };
unsigned char bb[64];

- (void)applicationDidFinishLaunching:(NSNotification *)aNotification
{
    for (int ix = 0; ix < 6; ix++) bb[ix] = bytes[ix];
    // breakpoint here
}

@end
Now in the debugger, at least on my machine/compiler combination (this result is not guaranteed, as explained below), I get:
(lldb) p bytes
(unsigned char [6]) $0 = "\x02151\x0365"
(lldb) p bb
(unsigned char [64]) $1 = "\x02151\x036"
I think this reproduces your result. So what is going on?
The variable bytes is an array but as it is characters the debugger is choosing to interpret it as a C string when displaying it - note the double quotes around the value and the \x hex escapes for non-printable characters.
A C string terminates on a null (zero) byte, so when the debugger interprets your array as a C string it will display characters until it finds a null byte and stops. It just so happens that on your machine the two bytes following your bytes array have the values 0x35 and 0x0 (I have reproduced that here by adding the randomBytes array); those are the character '5' and the null byte, so the debugger prints the extra 5.
So why does bb only print 6 characters? Global variables are zero initialised, so bb has 64 null bytes before the for loop. After the loop the 7th of those null bytes acts as the EOS (end of string) marker and the print just shows the 6 characters you expect.
Finally why do I say the above results are not guaranteed? The memory layout order of global variables is not specified as part of the C Standard, which underlies Objective-C, so there is in fact no guarantee that the randomBytes array immediately follows the bytes array. If the global variable layout algorithm is different on your computer/compiler combination you may not get the same results. However the underlying cause of your issue is the same - the "printing" is running off the end of your bytes array.
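If you want to compare the actual byte contents rather than rely on the debugger's C-string interpretation, printing the bytes as hex avoids the null-terminator issue entirely. A small sketch (plain C, and it works the same inside an Objective-C method):

#include <stdio.h>
#include <stddef.h>

/* print the first n bytes of a buffer as hex, e.g. "02 31 35 31 03 36" */
static void dumpBytes(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        printf("%02x ", buf[i]);
    printf("\n");
}

dumpBytes(bytes, 6) and dumpBytes(bb, 6) will then print identical output once the loop has copied the six values.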
HTH

NSSwapInt from byte array

I'm trying to implement a function that will read a 32-bit int stored with different endianness from a byte array (which is a char* in my code). It was suggested that I use NSSwapInt, but I'm clueless about how to go about it. Could anyone show me a snippet?
Thanks in advance!
Here's a short example:
unsigned char bytes[] = { 0x00, 0x00, 0x01, 0x02 };
int intData = *((int *)bytes);
int reverseData = NSSwapInt(intData);
NSLog(@"integer:%d", intData);
NSLog(@"bytes:%08x", intData);
NSLog(@"reverse integer: %d", reverseData);
NSLog(@"reverse bytes: %08x", reverseData);
The output will be:
integer:33619968
bytes:02010000
reverse integer: 258
reverse bytes: 00000102
As mentioned in the docs:
Swaps the bytes of inv and returns the resulting value. Bytes are swapped from each low-order position to the corresponding high-order position and vice versa. For example, if the bytes of inv are numbered from 1 to 4, this function swaps bytes 1 and 4, and bytes 2 and 3.
There are also NSSwapShort and NSSwapLongLong.
There is a potential for a data misalignment exception if you solve this problem by using integer pointers - some architectures require 32-bit values to be at addresses which are multiples of 2 or 4 bytes. The ARM architecture used by the iPhone et al. may throw an exception in this case, but I have no iOS device handy to test whether it does.
A safe way to do this which will never throw any misalignment exceptions is to assemble the integer directly:
int32_t bytes2int(unsigned char *b)
{
    int32_t i;
    i = b[0] | b[1] << 8 | b[2] << 16 | b[3] << 24; // little-endian, or
    i = b[3] | b[2] << 8 | b[1] << 16 | b[0] << 24; // big-endian (pick one)
    return i;
}
You can pass this any byte pointer and it will assemble 4 bytes to make a 32-bit int. You can extend the idea to 64-bit integers if required.
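For instance, a 64-bit variant of the same idea might look like this (a sketch, assuming the bytes are stored little-endian; reverse the indices for big-endian):

#include <stdint.h>

int64_t bytes2int64(const unsigned char *b)
{
    uint64_t i = 0;
    /* assemble 8 little-endian bytes into a 64-bit value,
       one byte at a time, with no alignment requirement */
    for (int n = 0; n < 8; n++)
        i |= (uint64_t)b[n] << (8 * n);
    return (int64_t)i;
}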

Functions to compress and uncompress array of integers

I was recently asked to complete a task for a C++ role; however, as the application was not progressed any further, I thought I would post here for some feedback / advice / improvements / a reminder of concepts I've forgotten.
The task was:
The following data is a time series of integer values
int timeseries[32] = {67497, 67376, 67173, 67235, 67057, 67031, 66951,
66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044,
67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620,
66579, 66596, 66713, 66852, 66715};
The series might be, for example, the closing price of a stock each day
over a 32 day period.
As stored above, the data will occupy 32 x sizeof(int) bytes = 128 bytes
assuming 4 byte ints.
Using delta encoding, write a function to compress, and a function to uncompress, data like the above.
Ok, so before this point I had never looked at compression, so my solution is far from perfect. The manner in which I approached the problem was to compress the array of integers into an array of byte arrays. When representing an integer as bytes I calculate its most significant byte (msb) and keep everything up to that point, throwing the rest away. This is then added to the byte array. For negative values I increment the msb count by 1 so that we can differentiate between positive and negative values when decoding, by keeping the leading 1 bits.
When decoding I parse this jagged byte array and simply reverse the actions performed when compressing. As mentioned, I had never looked at compression prior to this task, so I came up with my own method to compress the data. I had been looking at C++/CLI recently and had not really used it before, so I decided to write it in that language, for no particular reason. Below is the class, with a unit test at the very bottom. Any advice / improvements / enhancements will be much appreciated.
Thanks.
array<array<Byte>^>^ CDeltaEncoding::CompressArray(array<int>^ data)
{
    int temp = 0;
    int original;
    int size = 0;
    array<int>^ tempData = gcnew array<int>(data->Length);
    data->CopyTo(tempData, 0);
    array<array<Byte>^>^ byteArray = gcnew array<array<Byte>^>(tempData->Length);

    for (int i = 0; i < tempData->Length; ++i)
    {
        original = tempData[i];
        tempData[i] -= temp;
        temp = original;
        int msb = GetMostSignificantByte(tempData[i]);
        byteArray[i] = gcnew array<Byte>(msb);
        System::Buffer::BlockCopy(BitConverter::GetBytes(tempData[i]), 0, byteArray[i], 0, msb);
        size += byteArray[i]->Length;
    }
    return byteArray;
}

array<int>^ CDeltaEncoding::DecompressArray(array<array<Byte>^>^ buffer)
{
    System::Collections::Generic::List<int>^ decodedArray = gcnew System::Collections::Generic::List<int>();
    int temp = 0;
    for (int i = 0; i < buffer->Length; ++i)
    {
        int retrievedVal = GetValueAsInteger(buffer[i]);
        decodedArray->Add(retrievedVal);
        decodedArray[i] += temp;
        temp = decodedArray[i];
    }
    return decodedArray->ToArray();
}

int CDeltaEncoding::GetMostSignificantByte(int value)
{
    array<Byte>^ tempBuf = BitConverter::GetBytes(Math::Abs(value));
    int msb = tempBuf->Length;
    for (int i = tempBuf->Length - 1; i >= 0; --i)
    {
        if (tempBuf[i] != 0)
        {
            msb = i + 1;
            break;
        }
    }
    if (!IsPositiveInteger(value))
    {
        //We need an extra byte to differentiate the negative integers
        msb++;
    }
    return msb;
}

bool CDeltaEncoding::IsPositiveInteger(int value)
{
    return value / Math::Abs(value) == 1;
}

int CDeltaEncoding::GetValueAsInteger(array<Byte>^ buffer)
{
    array<Byte>^ tempBuf;
    if (buffer->Length % 2 == 0)
    {
        //With an even buffer length there is no need to allocate a new byte array
        tempBuf = buffer;
    }
    else
    {
        tempBuf = gcnew array<Byte>(4);
        System::Buffer::BlockCopy(buffer, 0, tempBuf, 0, buffer->Length);
        unsigned int val = buffer[buffer->Length-1] &= 0xFF;
        if (val == 0xFF)
        {
            //We have a negative integer compressed into 3 bytes
            //Copy over this last byte as well so we keep the negative pattern
            System::Buffer::BlockCopy(buffer, buffer->Length-1, tempBuf, buffer->Length, 1);
        }
    }
    switch (tempBuf->Length)
    {
    case sizeof(short):
        return BitConverter::ToInt16(tempBuf, 0);
    case sizeof(int):
    default:
        return BitConverter::ToInt32(tempBuf, 0);
    }
}
And then in a test class I had:
void CTestDeltaEncoding::TestCompression()
{
    array<array<Byte>^>^ byteArray = CDeltaEncoding::CompressArray(m_testdata);
    array<int>^ decompressedArray = CDeltaEncoding::DecompressArray(byteArray);
    int totalBytes = 0;
    for (int i = 0; i < byteArray->Length; i++)
    {
        totalBytes += byteArray[i]->Length;
    }
    Assert::IsTrue(m_testdata->Length * sizeof(m_testdata) > totalBytes, "Expected the total bytes to be less than the original array!!");
    //Expected totalBytes = 53
}
This smells a lot like homework to me. The crucial phrase is: "Using delta encoding."
Delta encoding means you encode the delta (difference) between each number and the next:
67497, 67376, 67173, 67235, 67057, 67031, 66951, 66974, 67042, 67025, 66897, 67077, 67082, 67033, 67019, 67149, 67044, 67012, 67220, 67239, 66893, 66984, 66866, 66693, 66770, 66722, 66620, 66579, 66596, 66713, 66852, 66715
would turn into:
[Base: 67497]: -121, -203, +62
and so on. Assuming 8-bit bytes, the original numbers require 3 bytes apiece (and given how few compilers have 3-byte integer types, you're normally going to end up with 4 bytes apiece). From the looks of things, the differences will fit quite easily in 2 bytes apiece, and if you can ignore one (or possibly two) of the least significant bits, you can fit them in one byte apiece.
Delta encoding is most often used for things like sound encoding where you can "fudge" the accuracy at times without major problems. For example, if you have a change from one sample to the next that's larger than you've left space to encode, you can encode a maximum change in the current difference, and add the difference to the next delta (and if you don't mind some back-tracking, you can distribute some to the previous delta as well). This will act as a low-pass filter, limiting the gradient between samples.
For example, in the series you gave, a simple delta encoding requires ten bits to represent all the differences. By dropping the LSB, however, nearly all the samples (all but one, in fact) can be encoded in 8 bits. That one has a difference (right shifted one bit) of -173, so if we represent it as -128, we have 45 left. We can distribute that error evenly between the preceding and following sample. In that case, the output won't be an exact match for the input, but if we're talking about something like sound, the difference probably won't be particularly obvious.
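To make the idea concrete, here is a minimal lossless sketch in plain C (not the byte-packed scheme discussed above): it stores the first value as a 32-bit base and every difference as a signed 16-bit value, which is already enough for the series in the question.

#include <stdint.h>
#include <stddef.h>

/* Encode: keep the first value as a base, then store 16-bit differences. */
void delta_encode(const int32_t *in, size_t n, int32_t *base, int16_t *deltas)
{
    *base = in[0];
    for (size_t i = 1; i < n; i++)
        deltas[i - 1] = (int16_t)(in[i] - in[i - 1]);
}

/* Decode: rebuild each value by adding its stored difference to the previous one. */
void delta_decode(int32_t base, const int16_t *deltas, size_t n, int32_t *out)
{
    out[0] = base;
    for (size_t i = 1; i < n; i++)
        out[i] = out[i - 1] + deltas[i - 1];
}

For the 32-value series above this comes to 4 + 31 * 2 = 66 bytes instead of 128; packing the deltas into fewer bits, as described, shrinks it further.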
I did mention that it was an exercise I had to complete, and the solution I submitted was deemed not good enough, so I wanted some constructive feedback, seeing as companies never tell you what you did wrong.
When the array is compressed I store the differences, not the original values (except for the first), as this was my understanding of delta encoding. If you look at my code, I have provided a full solution; my question was how bad it is.