xmlNewCDataBlock implicit conversion to int - objective-c

I'm parsing XML with the libxml2 library. After updating Xcode to 5.1, I got a warning that the last parameter, the length, is implicitly converted to int, while the value I pass is an unsigned long.
Here's function declaration:
XMLPUBFUN xmlNodePtr XMLCALL
xmlNewCDataBlock(xmlDocPtr doc,
const xmlChar *content,
int len);
Is there a similar function that takes unsigned long values? I don't know how big my data can be, and I want to process it safely.

There's no such function. libxml2's string manipulation functions use ints for string lengths and offsets, so text nodes longer than INT_MAX are not supported.
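Since the length has to end up in an int anyway, one way to process the data safely is to check the range before narrowing. Here is a minimal sketch of that check in plain C; the helper name checked_len_to_int is made up for illustration and is not part of libxml2.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not a libxml2 function): returns 1 and writes the
   narrowed length to *out when len fits in an int; otherwise returns 0
   and leaves *out untouched. */
int checked_len_to_int(size_t len, int *out)
{
    if (len > (size_t)INT_MAX) {
        return 0;               /* too long for an int length parameter */
    }
    *out = (int)len;            /* safe: the value is known to fit */
    return 1;
}
```

The caller would pass the resulting int to xmlNewCDataBlock only when the helper returns 1, and report an error (or split the data) otherwise.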

Related

Reading binary file into a struct using C++/CLI

I have a problem (and I think it can be resolved easily, but it is driving me crazy). I have checked other posts, but I was not able to find the solution.
I would like to read a binary file into a struct using C++/CLI. The problem is that after reading it, some of the values do not match the expected ones. In the example below, all the struct fields are read correctly up to and including "a" (at around byte 100). From that field on, the rest have wrong values. I know that the values are wrong and that the source file is right, because I previously read it with Python, and with FileStream and BinaryReader from C++/CLI. However, I am no longer using those, because I would like to read the binary file into a struct.
In addition, in some cases I also get a value of -1 for the variable "size" (the size of the file), but not always. I am not sure whether it could get a wrong value when the file is too big.
So my question is whether you can see something that I cannot, or whether I am doing something wrong.
#include <stdio.h>

struct LASheader
{
    unsigned short x;
    char y[16];
    unsigned char v1;
    unsigned char v2;
    char y1[68];
    unsigned short a;
    unsigned long b;
    unsigned long c;
    unsigned char z;
    unsigned short d;
    unsigned long e;
};

int main()
{
    FILE *ptr = fopen("E:\\Pablo\\file.las", "rb");
    //I go at the end of the file to get the size
    fseek(ptr, 0L, SEEK_END);
    unsigned long long size = ftell(ptr);
    struct LASheader lasHeader;
    //I want an offset of 6 bytes
    fseek(ptr, 6, SEEK_SET);
    fread(&lasHeader, sizeof(lasHeader), 1, ptr);
    unsigned short a1 = lasHeader.a;
    unsigned long b1 = lasHeader.b;
    unsigned long c1 = lasHeader.c;
    unsigned short d1 = lasHeader.d;
    unsigned long e1 = lasHeader.e;
}
Thank you!
Pablo.
There are a couple of things here. I'll tackle the direct problem first.
You didn't say how this binary format was being written, but I think it's an alignment issue.
Without a #pragma pack directive, unsigned long b will align to a 4-byte boundary. Struct members x through a are 90 bytes total, so two padding bytes are inserted between a and b so that b is aligned properly.
To fix the alignment, you can surround the struct with #pragma pack(push, 1) and #pragma pack(pop).
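To make the padding visible, here is a small sketch in plain C that declares the same member layout twice, once with default alignment and once inside #pragma pack(push, 1) / #pragma pack(pop), and reports where b ends up. Two assumptions on my part: I use fixed-width types, with uint32_t standing in for unsigned long (which is 4 bytes with Visual C++ but 8 bytes on many 64-bit Unix compilers), and the struct and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Default alignment: the compiler may insert padding so that 4-byte
   members start on a 4-byte boundary. */
struct PaddedHeader {
    uint16_t x;
    char     y[16];
    uint8_t  v1;
    uint8_t  v2;
    char     y1[68];
    uint16_t a;      /* x through a occupy 90 bytes */
    uint32_t b;      /* stands in for "unsigned long b" (assumed 4 bytes) */
    uint32_t c;
    uint8_t  z;
    uint16_t d;
    uint32_t e;
};

/* Same members, but packed to 1-byte alignment: no padding at all. */
#pragma pack(push, 1)
struct PackedHeader {
    uint16_t x;
    char     y[16];
    uint8_t  v1;
    uint8_t  v2;
    char     y1[68];
    uint16_t a;
    uint32_t b;
    uint32_t c;
    uint8_t  z;
    uint16_t d;
    uint32_t e;
};
#pragma pack(pop)

/* Offset of b with default alignment (92 on typical compilers). */
size_t padded_offset_of_b(void) { return offsetof(struct PaddedHeader, b); }

/* Offset of b in the packed layout (90, directly after a). */
size_t packed_offset_of_b(void) { return offsetof(struct PackedHeader, b); }

/* Total size of the packed layout: the sum of all member sizes. */
size_t packed_header_size(void) { return sizeof(struct PackedHeader); }
```

If the file really is written without padding (which is my guess, since you didn't say how it was produced), fread into the packed struct lines up with the bytes on disk, while the padded struct shifts everything from b onward.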
Second, a more overall issue:
You called this C++/CLI code, and you tagged it C++/CLI, but you're not actually using any managed features in this code. Also, you said you previously read the file successfully with FileStream and BinaryReader, and BinaryReader works fine in C++/CLI, so you technically already had a C++/CLI solution in hand.
If the rest of your C++/CLI project is this way (not using managed code), consider switching your project to C++, or perhaps splitting it. If your project is largely making use of managed code, then I would strongly consider using BinaryReader instead of fopen to read this data.

Obj-C: Is it really safe to compare BOOL variables?

I used to think that in 64-bit Obj-C runtime BOOL is actually _Bool and it's a real type so it's safe to write like this:
BOOL a = YES;
BOOL b = NO;
if (a != b) {...}
It's been working seemingly fine but today I found a problem when I use bit field structs like this:
typedef struct
{
    BOOL flag1 : 1;
} FlagsType;
FlagsType f;
f.flag1 = YES;
BOOL b = YES;
if (f.flag1 != b)
{
    // DOES GET HERE!!!
}
It seems that BOOL returned from the bit field is equal to -1 while the regular BOOL is 1, and they are not equal!!!
Note that I am aware of the situation when an arbitrary integer number is cast to BOOL and therefore becomes a "strange" BOOL which is not safe to compare.
However in this situation, both flag1 field and b were declared as BOOL and never cast. What is the problem? Is this a compiler bug?
The bigger question is if it's really safe to compare BOOLs at all or should I write a XORing helper function? (It would be such a chore, because boolean comparisons are so ubiquitous...)
I won't repeat the usual advice that using a C boolean type solves the problems one can have with BOOL. That is true, and in particular here, as you can read below, but most of those problems result from storing a wrong value into a boolean (C) object. In this case, however, _Bool or unsigned (int) seems to be the only possible solution (apart from solutions that need extra code). There is a reason for it:
I cannot find precise documentation of the new behavior of BOOL in Objective-C, but the behavior you found is somewhere between bad and buggy. I expected the latest behavior to be analogous to _Bool. That's not true in your case. (Thanks for finding that out!) Maybe this is for backwards compatibility. To tell the full story:
In C an object of the type int is a signed int. (This is different from char, for which the signedness is implementation-defined.)
— int, signed, or signed int
ISO/IEC 9899:TC3, 6.7.2-2
Each of the comma-separated sets designates the same type, […]
ISO/IEC 9899:TC3, 6.7.2-5
But there is a weird exception for historical reasons:
If the int object is a bit-field, it is implementation-defined whether it is a signed int or an unsigned int. (Likely this is because some CPUs in the past could not automatically sign-extend a partial-byte integer. So having an unsigned integer is easier, because zeroing the top bits is enough.)
On clang the default is signed int. So, just as for full-width integers, int always denotes a signed integer, even if it has only one bit. An int member : 1 can only store 0 and -1! (Therefore using int instead is no solution.)
Each of the comma-separated sets designates the same type, except that for bit-fields, it is implementation-defined whether the specifier int designates the same type as signed int or the same type as unsigned int.
ISO/IEC 9899:TC3, 6.7.2-5
The C standard says that a boolean bit-field is an integer type and is therefore subject to the weird integer signedness rule for bit-fields:
A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits.
ISO/IEC 9899:TC3, 6.7.2.1-9
This is the behavior you found. Because this is meaningless for 1-bit boolean types, the C standard explicitly requires that a 1 stored into a boolean bit-field compares equal to 1 in every case:
If the value 0 or 1 is stored into a nonzero-width bit-field of type _Bool, the value of the bit-field shall compare equal to the value stored.
ISO/IEC 9899:TC3, 6.7.2.1-9
This leads to the strange situation that an implementation can implement booleans of width 1 as { 0, -1 }, but has to fulfill 1 == -1. Great.
So, the short story: BOOL behaves like an integer bit-field (conforming to the standard), but is not subject to the extra requirement for _Bool.
I think this is because of legacy code. (One could expect -1 in the past.)
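To see the difference outside of Objective-C, here is a small sketch in plain C with three one-bit fields. It is only an illustration: the struct and function names are made up, a signed int field stands in for the signed behavior described above, and the -1 result for the signed field is implementation-defined (it is what gcc and clang produce).

```c
#include <stdbool.h>

struct Flags {
    signed int   s : 1;  /* one signed bit: can hold only 0 and -1 */
    unsigned int u : 1;  /* one unsigned bit: can hold 0 and 1 */
    bool         b : 1;  /* _Bool: a stored 0 or 1 must compare equal */
};

/* Store value into the signed one-bit field and read it back. */
int signed_flag_after_store(int value)
{
    struct Flags f = {0};
    f.s = value;         /* 1 does not fit; the result is implementation-defined */
    return f.s;
}

/* Store the low bit of value into the unsigned one-bit field. */
int unsigned_flag_after_store(int value)
{
    struct Flags f = {0};
    f.u = (unsigned int)value & 1u;
    return (int)f.u;
}

/* Store value (as a truth value) into the _Bool one-bit field. */
int bool_flag_after_store(int value)
{
    struct Flags f = {0};
    f.b = (value != 0);
    return f.b;
}
```

With gcc or clang, storing 1 reads back as -1 from the signed field and as 1 from the unsigned and _Bool fields, which is why declaring the flag as _Bool (bool) or unsigned makes the comparison with YES behave as expected.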

how to test if unsigned __int64 number exceeds range

I have a function where calculated values can exceed the range of unsigned __int64, whose maximum Microsoft documents as 18,446,744,073,709,551,615. How can I test whether a number has exceeded that range? I've converted the int into a char string and tried checking its length with strlen. However, some values longer than the specified length, for example with if(strlen(charvar)>17), mysteriously slip through. So how can I test this effectively?
If you can use a modern compiler or Boost, then lexical_cast will do the job:
#include <boost/lexical_cast.hpp>

uint64_t bigint;
try {
    bigint = boost::lexical_cast<uint64_t>(str);
} catch (const boost::bad_lexical_cast &e) {
    // do whatever you want to do when the string isn't valid
}
// Safely use bigint
See this link for the Boost library. You can definitely get this for VS 2008.
If this is Windows only you can also look at _atoi64 and the like. See msdn. These return _I64_MAX and _I64_MIN in case of over/underflow.
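If neither Boost nor the Windows-specific functions are an option, the same check can be sketched with the standard C function strtoull, which sets errno to ERANGE when the text does not fit. The function name parse_u64 is made up for this example, and I am assuming unsigned long long is 64 bits wide, as it is on mainstream platforms.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parses a decimal string into *out.
   Returns 1 on success, 0 if the text is empty, signed, has trailing
   characters, or does not fit into an unsigned 64-bit integer. */
int parse_u64(const char *s, uint64_t *out)
{
    const char *p = s;
    char *end = NULL;
    unsigned long long v;

    while (isspace((unsigned char)*p)) {
        p++;                            /* skip leading whitespace */
    }
    if (!isdigit((unsigned char)*p)) {
        return 0;                       /* rejects "", "-1", "+5", "abc" */
    }

    errno = 0;
    v = strtoull(p, &end, 10);
    if (errno == ERANGE) {
        return 0;                       /* value exceeds the unsigned range */
    }
    if (*end != '\0') {
        return 0;                       /* trailing junk such as "12abc" */
    }

    *out = (uint64_t)v;                 /* assumes a 64-bit unsigned long long */
    return 1;
}
```

The explicit digit check matters because strtoull accepts a leading minus sign and silently wraps the result, so "-1" would otherwise come back as the maximum value without any error.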

Create Managed Array with long/size_t length

Jumping straight to code, this is what I would like to do:
size_t len = obj->someLengthFunctionThatReturnsTypeSizeT();
array<int>^ a = gcnew array<int>(len);
When I try this, I get the error
conversion from size_t to int, possible loss of data
Is there a way I can get this code to compile without explicitly casting to int? I find it odd that I can't initialize an array to this size, especially because there is a LongLength property (and how could you get a length as a long - bigger than int - if you can only initialize a length as an int?).
Thanks!
P.S.: I did find this article that says that it may be impractical to allocate an array that is truly size_t, but I don't think that is a concern. The point is that the length I would like to initialize to is stored in a size_t variable.
Managed arrays use Int32 indices; there is no way around that. You cannot allocate arrays larger than Int32.MaxValue.
You could use the static method Array::CreateInstance (the overload that takes a Type and an array of Int64), and then cast the resulting System::Array to the appropriate actual array type (e.g. array<int>^). Note that the passed values must not be larger than Int32.MaxValue. And you would still need to cast.
So you have at least two options. Either casting:
// Would truncate the value if it is too large
array<int>^ a = gcnew array<int>((int)len);
or this (no need to cast len, but the result of CreateInstance):
// Throws an ArgumentOutOfRangeException if len is too large
array<int>^ a = (array<int>^)Array::CreateInstance(int::typeid, len);
Personally, I find the first better. You still might want to check the actual size of len so that you don't run into any of the mentioned errors.

Pass Byte as SmallInt?

I have an Informix stored procedure which takes an int and a "smallint" as parameters. I'm trying to call this SP from a .net4 Visual Basic program.
As far as I know, "smallint" is a byte. Unfortunately, when loading up the IfxCommand.Parameters collection with an Integer and a Byte, I get an ArgumentException thrown of {"The parameter data type of Byte is invalid."} with the following stack trace:
at IBM.Data.Informix.TypeMap.FromObjectType(Type dataType, Int32 length)
at IBM.Data.Informix.TypeMap.FromObjectType(Type dataType)
at IBM.Data.Informix.IfxParameter.GetTypeMap()
at IBM.Data.Informix.IfxParameter.GetOutputValue(IntPtr stmt, CNativeBuffer valueBuffer, CNativeBuffer lenIndBuffer)
at IBM.Data.Informix.IfxDataReader.Dispose(Boolean disposing)
at IBM.Data.Informix.IfxDataReader.System.IDisposable.Dispose()
at IBM.Data.Informix.IfxCommand.ExecuteReaderObject(CommandBehavior behavior, String method)
at IBM.Data.Informix.IfxCommand.ExecuteReader(CommandBehavior behavior)
at IBM.Data.Informix.IfxCommand.ExecuteReader()
Presumably I need to cast the Byte I have to a smallint, somehow, but google isn't giving me any relevant answers, just at the moment.
I have tried using:
cmd.Parameters.Add(New IfxParameter("myVal", IBM.Data.Informix.IfxType.SmallInt)).Value = myByte
but I still get the same ArgumentException when executing the reader.
Can someone tell me what I'm doing wrong?
An Informix SmallInt is a 16 bit signed integer, byte is an 8 bit unsigned integer. A better equivalent would be Int16 or Short which is a 16 bit signed integer, just like SmallInt. I suspect that will work.
Informix has no analogue for an unsigned 8 bit integer like the .Net Byte or the TSQL TinyInt.
Int16 should work since it covers the full range of SmallInt (-32,767 to 32,767).
Informix has 4 types that are related: BYTE and TEXT (since 1990), BLOB and CLOB (since 1996). Collectively, they are all large objects. A BYTE type is absolutely NOT a small integer type.
You may be able to use BYTE in a language that thinks it is a small integer if the language or driver fixes up the types.
But the native BYTE type is a large object. It requires a 56-byte descriptor in the main row of data, and then uses other storage (possibly IN TABLE, possibly in a blobspace) for the actual data storage.