How to store Bytes/Slice(UInt8) as a string in Crystal?

I'm encoding an Object into Bytes (ie Slice(UInt8)) via MessagePack. How would I store this in a datastore client (eg Crystal-Redis) that only accepts Strings?

If you have no choice other than to store the Slice as a String, you can encode it as a String, but at the cost of reduced performance.
There's Base64 strict_encode/decode:
encoded = An_Object.to_msgpack # Slice(UInt8)
save_to_datastore "my_stuff", Base64.strict_encode(encoded)
from_storage = get_from_datastore "my_stuff"
if from_storage
  My_MsgPack_Mapping.from_msgpack(Base64.decode(from_storage))
end
Or you can use Slice#hexstring and String#hexbytes:
encoded = An_Object.to_msgpack # Slice(UInt8)
save_to_datastore "my_stuff", encoded.hexstring
from_storage = get_from_datastore "my_stuff"
# hexbytes? returns nil when the string is not valid hex, so decode only once
if from_storage && (bytes = from_storage.hexbytes?)
  My_MsgPack_Mapping.from_msgpack(bytes)
end
(Crystal-Redis users have another option: see this issue.)
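To make the trade-off between the two encodings concrete, here is a minimal sketch in Python (not Crystal) of the same two round trips. The `payload` value is just a stand-in for a MessagePack-encoded object; it is an assumption for illustration, not part of the original code.

```python
import base64

# Stand-in for a MessagePack-encoded object (hypothetical payload).
payload = bytes(range(256))

# Base64 emits 4 ASCII characters for every 3 input bytes (with padding).
b64_text = base64.b64encode(payload).decode("ascii")

# Hex emits exactly 2 ASCII characters for every input byte.
hex_text = payload.hex()

# Both encodings round-trip back to the original bytes.
assert base64.b64decode(b64_text) == payload
assert bytes.fromhex(hex_text) == payload

print(len(payload), len(b64_text), len(hex_text))  # 256 344 512
```

Base64 is the more compact of the two; hex is simpler to read when debugging.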

Both Crystal and Redis should be able to handle strings with non-valid UTF-8 bytes, so you could just directly create a String from the slice and store this to Redis and vice versa.
This is of course not entirely safe: you should make sure to avoid invoking any string methods that expect a valid UTF-8 string.
But apart from that, this direct method should be perfectly fine. It is faster and more memory-efficient than using a string encoding.
redis.set key, String.new(slice)
# get returns nil when the key is missing, so guard the conversion
redis.get(key).try &.to_slice
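The caveat about invalid UTF-8 can be illustrated with a small Python sketch (Python rather than Crystal, and only an illustration of the general point). The helper name `is_valid_utf8` and the sample bytes are made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Arbitrary binary data (hypothetical sample) is usually not valid UTF-8:
# 0x93 is a continuation byte with no lead byte in front of it.
raw = bytes([0x93, 0xFF, 0x00, 0xC3])

print(is_valid_utf8(raw))                      # False
print(is_valid_utf8("héllo".encode("utf-8")))  # True
```

This is why a binary payload stored as a string should be treated as opaque and only ever converted back to bytes.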

Related

protocol-buffers: string or byte sequence of the exact length

Looking at https://developers.google.com/protocol-buffers/docs/proto3#scalar it appears that string and bytes types don't limit the length? Does it mean that we're expected to specify the length of transmitted string in a separate field, e.g. :
message Person {
  string name = 1;
  int32 name_len = 2;
  int32 user_id = 3;
  ...
}

The wire type used for string/bytes is length-delimited. This means that the message includes the string's length. How this is made available to you will depend upon the language you are using. For example, the table says that in C++ a string type is used, so you can call name.length() to retrieve the length.
So there is no need to specify the length in a separate field.
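As a rough sketch of what "length-delimited" means on the wire, the Python below hand-encodes a single string field: a tag (field number and wire type 2), then the byte length as a varint, then the UTF-8 bytes. The function names are made up for illustration; in practice the generated protobuf code does this for you.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field: tag, length prefix, then the bytes."""
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2  # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(data)) + data


def decode_string_field(buf: bytes) -> Tuple[int, str]:
    """Decode a buffer holding one string field; return (field number, text)."""
    tag, pos = decode_varint(buf, 0)
    length, pos = decode_varint(buf, pos)
    return tag >> 3, buf[pos:pos + length].decode("utf-8")


wire = encode_string_field(1, "Bob")
print(wire)                       # b'\n\x03Bob'
print(decode_string_field(wire))  # (1, 'Bob')
```

The `\x03` after the tag byte is the length, which is why a separate name_len field is redundant.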

One of the things that I wished GPB did was allow the schema to be used to set constraints on such things as list/array length, or numerical value ranges. The best you can do is to have a comment in the .proto file and hope that programmers pay attention to it!
Other serialisation technologies do do this, like XSD (though often the tools are poor), ASN.1 and JSON schema. It's very useful. If GPB added these (it doesn't change wire formats), GPB would be pretty well "complete".

Encrypt and Decrypt a String value with return type as String

I would like to have a function in SQL that encrypts a given varchar value and returns it as a varchar, and vice versa. I have checked out the following SQL functions:
ENCRYPTBYPASSPHRASE ENCRYPTBYKEY ENCRYPTBYCERT
I have also checked out other questions here, but I am unable to find (or maybe understand) the solution I am looking for. What I want to do is pass the encrypted string to a URL as a query string (I would like to do this from the DB), and it must not contain any / so that it does not mess with the URL routes.
Step 1: I made random string generator. (Works)
DECLARE @RandomString VARCHAR (500) = ( SELECT RandomString
    FROM dbo.SfRandomStringGenerator (64, 1) );
Example string:
X922t1N2udpdi30HZN9W4U9N997UatHZMJKWvI4si0w9g9q6FA3Lqd8NxCJXAe5D
Step 2: I would like to encrypt the string from Step 1 so that the output is also a string, but the encryption methods that I have come across all return VARBINARY, which is not what I want.
DECLARE @EncryptedString NVARCHAR(MAX) = dbo.SfEncrypt(@RandomString)
SELECT @EncryptedString
Example string:
WDkyMnQxTjJ1ZHBkaTMwSFpOOVc0VTlOOTk3VWF0SFpNSktXdkk0c2kwdzlnOXE2RkEzTHFkOE54Q0pYQWU1RA
Step 3: I would also like to be able to decrypt the encrypted string.
DECLARE @DecryptedString NVARCHAR(MAX) = dbo.SfDecrypt(@EncryptedString)
SELECT @DecryptedString
Example string:
X922t1N2udpdi30HZN9W4U9N997UatHZMJKWvI4si0w9g9q6FA3Lqd8NxCJXAe5D
Thus far I have not been able to get the encryption I want.
Any help or pointers towards the solution would be helpful. Thanks.

Read the VARBINARY result into a byte[] array.
Convert the byte[] array into a Base 64 string using System.Convert.ToBase64String()
Base64 includes the + and / characters which have special meaning in a url path, so they must be encoded.
UrlEncode the base 64 string using System.Web.HttpUtility.UrlEncode(string) to encode any + or / characters with %2B or %2F.
You should then be able to add the url-encoded string to your url.
On the url endpoint, reverse the procedure to obtain the original encrypted byte[] array.
EDIT: Be aware that using a query string can come up against a length limit, of the order of 2048 characters, so if your data is larger than this, you may find that the server will refuse to handle the request.
In that case, consider using POST to send your data, and supply the encrypted data in the body of the message request.
EDIT in response to comment:
I figured that you would be processing the result of your "do encryption" query before sending it to a server as a url, and so the fact that the SQL encryption functions return varbinary should not have presented a problem.
If you are happy that you can actually do encryption and decryption in T-SQL, then I'll refer you to this SO post, which offers a way to do Base64 encoding and decoding in T-SQL courtesy of the XML query API. Do your encryption to a varbinary field, run it through the example to generate a Base64 varchar, and then use T-SQL REPLACE to convert '+' to '%2B' and '/' to '%2F'.
You'll then have a varchar value of a url-encoded, base-64 encoded, encrypted representation of your input data, safe for transmission as a query string.
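The encode-then-escape pipeline described above can be sketched in Python (rather than C# or T-SQL) as follows. The function names and the sample ciphertext are hypothetical; the bytes stand in for whatever your encryption step returns.

```python
import base64
from urllib.parse import quote, unquote


def to_query_value(ciphertext: bytes) -> str:
    """Base64-encode the bytes, then percent-encode '+', '/' and '='."""
    b64 = base64.b64encode(ciphertext).decode("ascii")
    return quote(b64, safe="")


def from_query_value(value: str) -> bytes:
    """Reverse the procedure: percent-decode, then Base64-decode."""
    return base64.b64decode(unquote(value))


# Hypothetical ciphertext chosen so that the Base64 form contains + and /.
ciphertext = bytes([0xFB, 0xFF, 0xFE])

encoded = to_query_value(ciphertext)
print(encoded)  # %2B%2F%2F%2B
assert from_query_value(encoded) == ciphertext
```

Many web frameworks percent-decode query parameters automatically, in which case only the Base64 decode is needed on the receiving side.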

How can I get pymongo to always return str and not unicode?

From the pymongo docs:
"MongoDB stores data in BSON format. BSON strings are UTF-8 encoded so PyMongo must ensure
that any strings it stores contain only valid UTF-8 data. Regular strings (<type 'str'>) are
validated and stored unaltered. Unicode strings (<type 'unicode'>) are encoded UTF-8 first.
The reason our example string is represented in the Python shell as u'Mike' instead of
'Mike' is that PyMongo decodes each BSON string to a Python unicode string, not a regular
str."
It seems a bit silly to me that the database can only store UTF-8 encoded strings, but the return type in pymongo is unicode, meaning the first thing I have to do with every string from the document is once again call encode('utf-8') on it. Is there some way around this, i.e. telling pymongo not to give me unicode back but just give me the raw str?

No, there is no such feature in PyMongo; every string decoded from BSON is decoded as UTF-8. Python represents the string internally as UCS-2 or some other format, depending on the Python version. See the code where the BSON decoder extracts a string.
In the upcoming PyMongo 3.x series we may add features for more flexible BSON decoding to allow developers to optimize uncommon use cases like this.
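Since the driver offers no switch for this, one workaround is to convert in application code after the document has been fetched. The sketch below is written for Python 3, where str plays the role of Python 2's unicode and bytes the role of Python 2's str. The helper name `encode_strings` is made up, and the sample document is a plain dict standing in for a PyMongo result.

```python
def encode_strings(value):
    """Recursively encode every text string in a document to UTF-8 bytes."""
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, dict):
        return {encode_strings(k): encode_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [encode_strings(item) for item in value]
    return value  # numbers, None, dates, etc. are left untouched


# Plain dict standing in for a document returned by a query.
doc = {"name": "Mike", "tags": ["admin", "dev"], "age": 30}
print(encode_strings(doc))
```

This walks the whole document on every read, so it only makes sense if the raw bytes are really needed downstream.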

Determining whether a column is an encryption key or plain text

We have a column of type varchar(25) in a SQL Server table that mistakenly had plain text values inserted when they should have been encrypted with AES. We are going to remove the plain text values from the database. The plan was to verify the block size of the field, though this would cause some unencrypted values to be left. Are there any other criteria I can check to reliably identify valid encrypted data?
We need it to be a T-SQL only solution.
Update
Just dug a little deeper: it's getting the values back from a web service. This web service encrypts them using AES in ASP.NET. It takes the returned byte array and then uses this method to convert the byte array to a string:
static public string ByteArrToString(byte[] byteArr)
{
    byte val;
    string tempStr = "";
    for (int i = 0; i <= byteArr.GetUpperBound(0); i++)
    {
        val = byteArr[i];
        if (val < (byte)10)
            tempStr += "00" + val.ToString();
        else if (val < (byte)100)
            tempStr += "0" + val.ToString();
        else
            tempStr += val.ToString();
    }
    return tempStr;
}
For clarity, I should say I did not originally write this code!
Cheers

Not really, especially since the encoding method doesn't look normal to me. It is more common to Base64-encode the data, which makes it very distinctive. How easy it is to determine whether the data is encrypted really depends on what the unencrypted data consists of: is it words or numbers, does it have spaces, and so on (the encoded data has no spaces, for instance).
It looks like your encoded data will all be numeric represented as a string so depending on length of data, you could see if your column will cast to a BIGINT.
I'm not sure of the best way off the top of my head, but there is an answer here that might help you do a "try cast" in T-SQL: StackOverflow-8453861
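To show what the structure of that encoding allows you to check, here is a sketch in Python (the question asks for T-SQL, so the same logic would need porting). It mirrors ByteArrToString, where every byte becomes exactly three decimal digits, and tests whether a value could have come from it. The function names are made up for illustration.

```python
def byte_arr_to_string(data: bytes) -> str:
    """Mirror of ByteArrToString: each byte as three zero-padded digits."""
    return "".join(f"{b:03d}" for b in data)


def looks_encoded(value: str) -> bool:
    """True if value could be output of the three-digits-per-byte encoding.

    This is a necessary check, not a sufficient one: a plain-text value
    such as "123045" passes it too.
    """
    if not value or len(value) % 3 != 0:
        return False
    if any(ch not in "0123456789" for ch in value):
        return False
    # Every three-digit group must fit in a byte (0..255).
    return all(int(value[i:i + 3]) <= 255 for i in range(0, len(value), 3))


print(byte_arr_to_string(bytes([7, 42, 255])))  # 007042255
print(looks_encoded("007042255"))               # True
print(looks_encoded("hello world"))             # False
print(looks_encoded("999000"))                  # False (999 is not a byte)
```

Any value that fails this check is definitely not encoded; values that pass it still need a second look.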

String version of term_to_binary

I'm trying to write a simple server that talks to clients via tcp. I have it sending messages around just fine, but now I want it to interpret the messages as Erlang data types. For example, pretend it's HTTP-like (it's not) and that I want to send from the client {get, "/foo.html"} and have the server interpret that as a tuple containing an atom and a list, instead of just a big list or binary.
I will probably end up using term_to_binary and binary_to_term, but debugging text-based protocols is so much easier that I was hoping to find a more list-friendly version. Is there one hiding somewhere?

You can parse a string as an expression (similar to file:consult) via:
% InputString = "...",
{ok, Scanned, _} = erl_scan:string(InputString),
{ok, Exprs} = erl_parse:parse_exprs(Scanned),
{value, ParsedValue, _} = erl_eval:exprs(Exprs, [])
(See http://www.trapexit.org/String_Eval)
You should be able to use io_lib:format to convert an expression to a string using the ~w or ~p format codes, such as io_lib:format("~w", [{get, "/foo.html"}]).
I don't think this will be very fast, so if performance is an issue you should probably not use strings like this.
Also note that this is potentially unsafe since you're evaluating arbitrary expressions -- if you go this route, you should probably do some checks on the intermediate output. I'd suggest looking at the result of erl_parse:parse_exprs to make sure it contains the formats you're interested in (i.e., it's always a tuple of {atom(), list()}) with no embedded function calls. You should be able to do this via pattern matching.
