Convert SQL binary(16) to UTF-8 byte in .NET

I have values stored in a sql database with datatype of binary(16), they come over into the .NET application (using Entity Framework) as type System.Data.Linq.Binary. I'd like to convert this binary representation of my data to data type byte[] without losing any data and preferably using UTF-8 encoding. Is this not built into the .NET framework? Must I convert it to some intermediary data type first before being able to get my byte array?

Binary has nothing to do with UTF-8.
binary(16) means 16 bytes of binary data. There is a Binary::ToArray method to get a byte array.
https://msdn.microsoft.com/en-us/library/system.data.linq.binary.toarray
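To illustrate why UTF-8 does not enter into it, here is a minimal sketch in Python rather than .NET. The 16-byte value and the helper name try_decode_utf8 are made up for illustration. The point is that the column value is already a byte array, and that arbitrary binary data is often not valid UTF-8 at all.

```python
# Hypothetical 16-byte value standing in for a binary(16) column.
raw = bytes(range(240, 256))


def try_decode_utf8(data: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


# The value is already a byte array; no encoding step is needed to obtain it.
print(len(raw))

# Interpreting it as UTF-8 only makes sense if the bytes actually hold text.
print(try_decode_utf8(raw))
print(try_decode_utf8(b"hello"))
```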

Related

How to insert image data into a Postgres database and select it back

I have converted a JPG image to base64 using the base64-img module, and that base64 data is stored in Postgres in a varchar column.
Can I store the image data without conversion?
I'm using the following queries to store the varchar data.
Create table:
CREATE TABLE images (
name varchar,
base64data varchar
);
Insert values:
INSERT INTO images(name, base64data) VALUES ('image1', 'base64imagedata in string type');
Either you do it as text:
CREATE TABLE images (
name varchar,
base64data text
);
INSERT INTO images(name, base64data)
VALUES ('image1', 'base64imagedata in string type');
With varchar(n) you need to determine the size in advance. With text, you are spared the trouble of working out the maximum size of your base64-encoded images beforehand. The limit for text is still 1GB I think, if you can even call that a limit.
Or you do it with bytea:
CREATE TABLE images (
name varchar,
base64data bytea
);
INSERT INTO images(name, base64data)
VALUES ('image1', decode('base64imagedata in string type', 'base64'));
It's better to use the data type that reflects the actual data. An image is not a sequence of ASCII / Unicode characters (a string/varchar/text). Rather, it is a sequence of bytes, each of which can take any value from 0 to 255: a byte array. The bytea data type is the proper data type for images.
By the way, it's better not to transport the base64 to the database at all. Aside from saving bandwidth, this spares the database from performing the decoding. Instead, convert the base64 to a byte array on the application side. How to do that depends on your data access layer or ORM.
Here's one, on Sequelize (nodejs app): PostgreSQL - How to insert Base64 images strings into a BYTEA column?
Sequelize.BLOB('tiny')
In fact, if your image already comes from a byte array, you don't even need to convert it to base64 and decode it back on Postgres. You can simply pass your byte array to your data access layer. An example in .NET: https://docs.huihoo.com/enterprisedb/8.1/dotnet-usingbytea.html
That saves network bandwidth when transporting the image, because the base64 representation is bigger than the raw byte array, and it saves the database some work too. Offload some of the work to the application layer.
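The size difference can be checked with a minimal sketch in Python. The image_bytes value is a made-up stand-in for real image data, and no database driver is involved here; the sketch only compares the raw size with the base64 size and shows the application-side conversion back to bytes.

```python
import base64

# Stand-in for image bytes; a real application would read them from a file.
image_bytes = bytes(range(256)) * 12  # 3072 bytes

# What would travel over the network if the image were sent as base64 text.
encoded = base64.b64encode(image_bytes)

# Application-side conversion back to a byte array, ready for a bytea parameter.
decoded = base64.b64decode(encoded)

print(len(image_bytes), len(encoded))  # prints "3072 4096"
```

Base64 turns every 3 bytes into 4 characters, so the encoded form is about a third larger than the raw bytes.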

What is the limit of BINARY data types in Hive 1.2?

I did not find much about BINARY data types in apache docs: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types
I created a table with BINARY column using-
create table table1(col1 binary);
After fetching metadata via JDBC I found,
columnSize:2147483647
Is there any official document for this?
From the Binary DataType Proposal:
How is 'binary' represented internally in Hive
Binary type in Hive will map to 'binary' data type in thrift.
Primitive java object for 'binary' type is ByteArrayRef
PrimitiveWritableObject for 'binary' type is BytesWritable
And since ByteArrayRef holds a reference to a byte array, the answer should be Integer.MAX_VALUE - 5, see here

Parquet Binary Data type

I have a question regarding the Binary data type. I am trying to write a Parquet schema for my MR job to create the Parquet file, rather than having Hive or Impala create one. I see some references to a Binary type which I do not see in Parquet.
Is binary an alias to BYTE_ARRAY?
Also is UTF-8 a default encoding on Binary data types?
Raw bytes are stored in Parquet either as a fixed-length byte array (FIXED_LEN_BYTE_ARRAY) or as a variable-length byte array (BYTE_ARRAY, also called binary). Fixed is used when you have values with a constant size, like a SHA1 hash value. Most of the time, the variable-length version is used.
Strings are encoded as variable-length binary with the UTF8 type annotation to indicate how to interpret the raw bytes back into a String. UTF8 is the only encoding supported in the format, but not every binary uses UTF8 because not all binary fields are storing string data.
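The distinction can be sketched in plain Python, without any Parquet library (the variable names are illustrative only): a SHA1 digest always has the same size, which suits a fixed-length byte array, while a string becomes a variable-length run of bytes that only reads back as text when it is interpreted as UTF-8.

```python
import hashlib

# A SHA1 digest is always 20 bytes, whatever the input: a fixed-length value.
digest_a = hashlib.sha1(b"a").digest()
digest_b = hashlib.sha1(b"a much longer input value").digest()
print(len(digest_a), len(digest_b))

# A string stored as binary: the raw bytes are its UTF-8 encoding, and the
# length varies with the content (the accented letter takes two bytes).
raw = "héllo".encode("utf-8")
print(len(raw))

# Reading it back requires knowing that the bytes are UTF-8 text.
print(raw.decode("utf-8"))
```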
There is no data type in parquet-column called BYTE_ARRAY.
I looked at their PrimitiveType in the latest package but could not find it.
Could not write byte[] in binary as well.

Constructing an ascii message in a string datatype

I need to construct a message to send to a serial device that is 1024 ASCII characters. If I construct this message in a regular string data type will this work? I assume the data will be wrong because strings are formatted in Unicode. How could I go about doing something like this?
First of all, strings are not designed to handle binary data, so you should not use them when you need access to raw binary data.
The most suitable candidate for this is a byte array. It ensures that the data is stored exactly as you want it.
For writing to a serial port, you should use binary streams. A binary stream takes a byte array and can write it to any writable device.
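A minimal sketch of building such a message as a byte array, in Python. The function name build_message, the sample payload, and the choice to pad with spaces are assumptions made for illustration; the question does not say how the device expects the message to be filled.

```python
MESSAGE_LENGTH = 1024  # message size stated in the question


def build_message(payload: str) -> bytes:
    """Pad payload with spaces to 1024 characters and encode it as ASCII bytes."""
    if len(payload) > MESSAGE_LENGTH:
        raise ValueError("payload is longer than the message size")
    padded = payload.ljust(MESSAGE_LENGTH)  # space padding is an assumption
    # encode() raises UnicodeEncodeError if any character is outside ASCII.
    return padded.encode("ascii")


message = build_message("HELLO DEVICE")
print(len(message))  # prints "1024"
```

Each ASCII character becomes exactly one byte, so the resulting byte array can be handed to a binary stream as it is.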

Postgres' text column doesn't like my zlib compressed data

Is there a better data type to be using to store a zlib compressed string in Postgresql?
Use bytea: "The bytea data type allows storage of binary strings".
Use a bytea. Compressed data is not text.
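A minimal sketch in Python of why the compressed output does not belong in a text column. The sample text and the helper name is_valid_utf8 are illustrative only. The compressed bytes are arbitrary binary data, and they round-trip correctly only when they are kept as bytes.

```python
import zlib

text = "hello world " * 20
compressed = zlib.compress(text.encode("utf-8"))  # default compression level


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes can be interpreted as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The compressed output is binary, not text.
print(is_valid_utf8(compressed))

# Kept as bytes (as a bytea column would keep them), the data round-trips.
restored = zlib.decompress(compressed).decode("utf-8")
print(restored == text)
```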