Is it safe to convert a mysqlpp::sql_blob to a std::string? - blob

I'm grabbing some binary data out of my MySQL database. It comes out as a mysqlpp::sql_blob type.
It just so happens that this BLOB is a serialized Google Protobuf. I need to de-serialize it so that I can access it normally.
This gives a compile error, since ParseFromString() is not intended for mysqlpp:sql_blob types:
protobuf.ParseFromString( record.data );
However, if I force the cast, it compiles OK:
protobuf.ParseFromString( (std::string) record.data );
Is this safe? I'm particularly worried because of this snippet from the mysqlpp documentation:
"Because C++ strings handle binary data just fine, you might think you can use std::string instead of sql_blob, but the current design of String converts to std::string via a C string. As a result, the BLOB data is truncated at the first embedded null character during population of the SSQLS. There’s no way to fix that without completely redesigning either String or the SSQLS mechanism."
Thanks for your assistance!

It doesn't look like it would be a problem judging by that quote (it's basically saying if a null character is found in the blob it will stop the string there, however ASCII strings won't have random nulls in the middle of them). However, this might present a problem for internalization (multibyte charsets may have nulls in the middle).

Related

wxStaticText inconsistently displays 'degree' character

In the same application I have two different instances of wxStaticText. Each displays an angular value expressed in degrees. I've tested both instances for font name and font encoding. They are the same for both. I've tested that both strings passed to SetLabel() are using the same character value, decimal 176. Yet one displays the 'degree' character (small circle, up high) as expected and the other instead displays an odd character I'm not familiar with. How can this be? Is there some other property of wxStaticText I need to test?
I can't explain what you're seeing because obviously two identical controls must behave in the same way, but I can tell you that using decimal 176 is not a good way to encode the degree sign, unless you explicitly use wxConvISO8859_1 to create the corresponding wxString.
It is better to use wxString::FromUTF8("\xc2\xb0") instead or, preferably, make sure that your source files are UTF-8 encoded and just use wxString::FromUTF8("°").
Arghhhh! Found it. I was assuming SetLabel() was wxStaticText::SetLabel(), inherited from the wxWindow base class. It's not. We have a wrapper class of our own around wxStaticText that I was not aware of. It's the wrapper class that is bollixing the string value.
Moral: When debugging unfamiliar code, don't make assumptions, step ALL THE WAY in.

IOS JSON escaping special characters

I'm working in IOS and trying to pass some content to a web server via an NSURLRequest. On the server I have a PHP script setup to accept the request string and convert it into an JSON object using the Zend_JSON framework. The issue I am having is whenever the character "ø" is in any part of the request parameters, then the request string is cut short by one character.
Request string before going to server.
[{"description":"Blah blah","type":"Russebuss","name":"Roscoe Simulator","appVersion":"1.0.20","osVersion":"IOS 5.1","phone":"5555555","country":"Østfold","udid":"bed164974ea0d436a43f3cdee0e005a1"}]
Request string on server before any parsing
[{"description":"Blah blah","type":"Russebuss","name":"Roscoe Simulator","appVersion":"1.0.20","osVersion":"IOS 5.1","phone":"5555555","country":"Nord-Trøndelag","udid":"bed164974ea0d436a43f3cdee0e005a1"}
Everything looks exactly the same except the final closing ] is missing. I'm thinking it's having an issue when converting the string to UTF-8, but not sure the correct way to fix this issue.
Does anyone have any ideas why this is happening?
first of all do not trust the xcode console in such cases. you never know which coding the console is actually using.
second, escape the invalid characters before you build you json string. easiest way would probably to make sure you are using the same unicode representation, like utf-8, all the time.
third, if there are still invalid characters use a json lib with a parser (does the encoding). validate the output by parsing back to e.g. NSString. or validate the output manually by using a web form like http://jsonformatter.curiousconcept.com/
the badest way is to replace the single characters in the string, build your json and convert back. one way to do this could be to replace e.g an german ä with its unicode representaion U+00E4 (http://www.utf8-chartable.de/).
Thats the way I do it. I am glad that I nerver needed to go further than step three and this is the step you should do anyway to keep your code simple.
Please try to use Zends internal json Encoding:
Zend_Json::$useBuiltinEncoderDecoder = true;
should fix your issue.

Use of byte arrays and hex values in Cryptography

When we are using cryptography always we are seeing byte arrays are being used instead of String values. But when we are looking at the techniques of most of the cryptography algorithms they uses hex values to do any operations. Eg. AES: MixColumns, SubBytes all these techniques(I suppose it uses) uses hex values to do those operations.
Can you explain how these byte arrays are used in these operations as hex values.
I have an assignment to develop a encryption algorithm , therefore any related sample codes would be much appropriate.
Every four digits of binary makes a hexadecimal digit, so, you can convert back and forth quite easily (see: http://en.wikipedia.org/wiki/Hexadecimal#Binary_conversion).
I don't think I full understand what you're asking, though.
The most important thing to understand about hexadecimal is that it is a system for representing numeric values, just like binary or decimal. It is nothing more than notation. As you may know, many computer languages allow you to specify numeric literals in a few different ways:
int a = 42;
int a = 0x2A;
These store the same value into the variable 'a', and a compiler should generate identical code for them. The difference between these two lines will be lost very early in the compilation process, because the compiler cares about the value you specified, and not so much about the representation you used to encode it in your source file.
Main takeaway: there is no such thing as "hex values" - there are just hex representations of values.
That all said, you also talk about string values. Obviously 42 != "42" != "2A" != 0x2A. If you have a string, you'll need to parse it to a numeric value before you do any computation with it.
Bytes, byte arrays and/or memory areas are normally displayed within an IDE (integrated development environment) and debugger as hexadecimals. This is because it is the most efficient and clear representation of a byte. It is pretty easy to convert them into bits (in his mind) for the experienced programmer. You can clearly see how XOR and shift works as well, for instance. Those (and addition) are the most common operations when doing symmetric encryption/hashing.
So it's unlikely that the program performs this kind of conversion, it's probably the environment you are in. That, and source code (which is converted to bytes at compile time) probably uses a lot of literals in hexadecimal notation as well.
Cryptography in general except hash functions is a method to convert data from one format to another mostly referred as cipher text using a secret key. The secret key can be applied to the cipher text to get the original data also referred as plain text. In this process data is processed in byte level though it can be bit level as well. The point here the text or strings which we referring to are in limited range of a byte. Example ASCII is defined in certain range in byte value of 0 - 255. In practical when a crypto operation is performed, the character is converted to equivalent byte and the using the key the process is performed. Now the outcome byte or bytes will most probably be out of range of human readable defined text like ASCII encoded etc. For this reason any data to which a crypto function is need to be applied is converted to byte array first. For example the text to be enciphered is "Hello how are you doing?" . The following steps shall be followed:
1. byte[] data = "Hello how are you doing?".getBytes()
2. Process encipher on data using key which is also byte[]
3. The output blob is referred as cipherTextBytes[]
4. Encryption is complete
5. Using Key[], a process is performed over cipherTextBytes[] which returns data bytes
6 A simple new String(data[]) will return string value of Hellow how are you doing.
This is a simple info which might help you to understand reference code and manuals better. In no way I am trying to explain you the core of cryptography here.

Char.ConvertFromUtf32 not available in Silverlight

I'm converting a WinForms app to Silverlight (VB.NET). What should I use instead of Char.ConvertFromUtf32 as it's not available to use in Silverlight?
UTF-32 is currently not part of Silverlight, so you have to find a way around the limitation. I think you should stop a moment and think exactly why you need to read UTF32-encoded text.
If you are reading such text from a database or a file on the server, I would perform the conversion server-side (if possible I would convert everything to UTF-8 and get rid of the UTF-32 data in one shot).
If you are parsing a user-provided file on the client side, I would detect the UTF-32 encoding and gently tell the user that the file encoding is not supported. UTF32 is pretty rare nowadays, so I guess it should not be a very common case (but I could be wrong not knowing your exact situation).
In order to detect the file encoding you have to look at the first few bytes (byte order mark) -more information here, if they are not present the task becomes much harder and involves some kind of heuristics based on character frequency.
From: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/how-to-convert-between-hexadecimal-strings-and-numeric-types
You can use a direct cast, like:
// Get the character corresponding to the integral value.
string stringValue = Char.ConvertFromUtf32(value);
char charValue = (char)value;
Small warning, it will only work up to 0xffff. It will not work for high range Unicode from 0x10000 to 0x10ffff.
Also, if you need to parse \uXXXX, try this other question: How do I convert Unicode escape sequences to Unicode characters in a .NET string?

00626 SQL Loader error

How to avoid
"characterset conversion buffer overflow" error in sql*loader? error # 00626.
I am not able to find this on internet please suggest me the solution for this.
What is the character set of the input datafile? You might try specifying the character set in the control file:
CHARACTERSET char_set_name LENGTH SEMANTICS CHARACTER
By default, if not specified, Oracle will use byte length semantics. Thus, if you define a field length in your control file as VARCHAR(20), in byte semantics you'd have 20 byte buffer, but in character length semantics you might have a 40 byte buffer. This would be my guess as to what could be the source of the error.
It's not a lot of help, but here's what the Oracle error manual has to say about that error:
SQL*Loader-00626: Character set
conversion buffer overflow.
Cause: A conversion from the datafile character set to the client
character set required more space than
that allocated for the conversion
buffer. The size of the conversion
buffer is limited by the maximum size
of a varchar2 column.
Action: The input record is rejected. The data will not fit into
the column.
It sounds like there isn't any way to work around this within SQLLoader. If it is affecting a small number of records then it may be easiest to simply handle those manually. If it is many records, then you probably need to find or create a different loading tool.
Just a few ideas for you to think about:
You could try to load different parts of the "string" into different fields in the database .. maybe that way you can work around the limitation.
You could try to do the character set conversion in a different tool .. some text editors may give you some options .. and then load the file without it requiring the conversion.
Not sure if there's any merit in these ideas, but hopefully you can work something out.
Thanks for all your help. This problem has been resolved. We split the file and loaded in chunks and it worked fine