What is the difference between VARBINARY and BLOB in hsqldb? - blob

I would like to store files in an hsqldb database.
Some files are text files ranging from several KB to hundreds of KB in size and some are binary files which may reach several MB in size.
According to the documentation (http://hsqldb.org/doc/guide/sqlgeneral-chapt.html#sgc_binary_types) I can use the types VARCHAR, VARBINARY and BLOB for storing files.
I think I will store files as binary, in order to be able to store both text and binary in the same way. But I do not understand the difference between the two binary types - VARBINARY and BLOB, except that they have different default length values.
What is the difference between them? Which one is better suited for file content storage?

VARCHAR and VARBINARY are not optimal types for storing files that run to several kilobytes or more.
You can use the BLOB type instead. The difference between BLOB and VARBINARY is the dedicated large-object storage mechanism used for BLOB and CLOB values.
See here for more details:
http://hsqldb.org/doc/guide/management-chapt.html#mtc_large_objects
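A minimal sketch of how that can look in HSQLDB (the table and column names here are illustrative, not taken from the question); the BLOB column uses the dedicated large-object storage described in the linked chapter:

    CREATE TABLE stored_files (
        id        INTEGER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
        file_name VARCHAR(255) NOT NULL,
        content   BLOB(16M)   -- large objects are kept in the database's .lobs file
    );

    -- Hex literal only as a placeholder; a real application would stream the
    -- file content through a JDBC PreparedStatement setBinaryStream call.
    INSERT INTO stored_files (file_name, content)
    VALUES ('example.bin', X'48656C6C6F');

Choose the declared maximum (16M here) to cover the largest file you expect to store.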

Related

Is it possible to optimize a table if I know all blobs will have the same size?

I know for sure that all blob entries in the table will have the same size, but I only know the size at runtime, and I would rather avoid byte1, byte2, byte3, etc.
I assume this has been asked a billion times already, but I can't seem to find the right keywords to find such a question, neither here nor on Google.
If you mean to optimise storage of the data itself, then no, other than to perhaps save a compressed version of the data.
SQLite, with the exception of the rowid or an alias-of-rowid column (for rowid tables, as opposed to the more rarely used WITHOUT ROWID tables), already optimises the storage of the data (e.g. an integer value is stored in 1, 2, 3, 4, 6 or 8 bytes rather than always 8 bytes). Blobs, however, are stored as they are:
"BLOB. The value is a blob of data, stored exactly as it was input." (Datatypes In SQLite Version 3, section 2: Storage Classes and Datatypes)
However, if you mean optimise for searching, then BLOBs are inefficient from a search aspect.
If you mean file access, then BLOBs can be faster than the file system:
"SQLite reads and writes small blobs (for example, thumbnail images) 35% faster than the same blobs can be read from or written to individual files on disk using fread() or fwrite()." (35% Faster Than The Filesystem)
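To illustrate the first point, a minimal sketch (the table is hypothetical): SQLite keeps the blob exactly as supplied, so knowing the size in advance gives nothing to tune at the schema level:

    CREATE TABLE samples (
        id   INTEGER PRIMARY KEY,   -- alias of the rowid, stored compactly
        data BLOB NOT NULL          -- stored byte-for-byte as inserted, whatever its length
    );

    INSERT INTO samples (data) VALUES (x'DEADBEEFDEADBEEF');

A declared length such as BLOB(1024) is accepted by SQLite's parser but has no effect on how the value is stored.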

Vertica Large Objects

I am migrating a table from Oracle to Vertica that contains a LOB column. The maximum actual size of the LOB column amounts to 800 MB. How can this data be accommodated in Vertica? Is it appropriate to use a Flex Table?
Vertica's documentation says that data loaded into a Flex table is stored in the raw column, which is a LONG VARBINARY data type. By default it has a maximum size of 32 MB, which, according to the documentation, can be changed (i.e. increased) using the parameter FlexTablesRawSize.
I'm thinking this is the approach for storing large objects in Vertica: we just need to update the FlexTablesRawSize parameter to handle 800 MB of data. I'd like to ask whether this is the optimal way or if there is a better one. Or will this conflict with Vertica's row-size limitation, which only allows up to 32 MB of data per row?
Thank you in advance.
If you use Vertica for what it's built for, running a Big Data analytical database, you would, as in any analytical database, try to avoid large objects in your tables. BLOBs and CLOBs are usually used to store unstructured data: large documents, image files, audio files, video files. You can't filter by such a column, you can't run functions on it or sum it, and you can't group by it.
A safe and performant design stores the file name in a Vertica table column, stores the file itself elsewhere (maybe even in Hadoop), and lets the front end (usually a BI tool, and all BI tools support that) retrieve the file and bring it to a report screen ...
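A hedged sketch of that design (the table and column names are made up for illustration): the Vertica table keeps only a reference to the file, and the file content itself lives outside the database:

    CREATE TABLE documents (
        doc_id    INT NOT NULL,
        loaded_at TIMESTAMP,
        file_path VARCHAR(4096)   -- e.g. an HDFS or shared-storage location the front end can resolve
    );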
Good luck ...
Marco

Encoding byte data and storing as TEXT vs storing in BYTEA in PostgreSQL

I have some byte data (millions of rows), and currently I am converting it to base64 first and storing it as TEXT. There is an index on the column that contains the base64 data. I assume Postgres does the conversion to base64 itself.
Will it be faster if I store using BYTEA data type instead?
How will the indexed queries be affected on two data types?
Converting bytes to text using Base64 consumes about 33% more space than the raw bytes. Even if this were faster, you would use considerably more space on disk, and loading and storing the data would be slower as well. I see no advantage in doing that.
Postgres supports indexes on BYTEA columns. Since the bytes are shorter than the base64 text, indexed bytea columns should be faster than indexed text columns as well.
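A minimal sketch, assuming a simple lookup-by-content use case (the table and index names are illustrative): the raw bytes go straight into a bytea column and are indexed directly, with no base64 round trip:

    CREATE TABLE payloads (
        id   bigserial PRIMARY KEY,
        data bytea NOT NULL
    );

    -- Plain btree index on the bytea value itself
    CREATE INDEX payloads_data_idx ON payloads (data);

If the values are long, an index on a digest of the bytes (for example md5(data)) is usually smaller and faster for pure equality lookups than an index on the full value.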

Is there any datatype that can store more than 2 GB in SQL Server?

I have a requirement to store more than 2 gigabytes of data in a column. Is there any way I can do that? The data I store needs to be in the database itself, not in files on the computer, which is what happens when using the FILESTREAM feature.
No, there isn't. NVARCHAR(MAX) is the datatype that can be used to store up to 2 GB of data in a column, but you cannot store more than that; 2 GB is the upper limit of the datatype.
On a side note, what makes you store such a large amount of data in a single column? It can cause a lot of performance overhead, and it might not be worth proceeding with. I am sure you can find alternatives.
One possible alternative is to split the data and store it across multiple rows.
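A hedged sketch of that split-across-rows alternative (the table and column names are made up): each chunk stays well under the 2 GB per-value limit, and the application reassembles the pieces in order:

    CREATE TABLE dbo.BigObjectChunks (
        object_id  INT            NOT NULL,
        chunk_no   INT            NOT NULL,
        chunk_data VARBINARY(MAX) NOT NULL,  -- e.g. keep each chunk to some tens of MB
        CONSTRAINT PK_BigObjectChunks PRIMARY KEY (object_id, chunk_no)
    );

    -- Reassemble by reading the chunks back in order
    SELECT chunk_data
    FROM dbo.BigObjectChunks
    WHERE object_id = 1
    ORDER BY chunk_no;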
Otherwise, as commented by Mladen Prajdic, you can use FILESTREAM to store more than 2 GB of data.

Why is my database file the same size as an ordinary text file with the same data?

We currently store data for a product I work on in ascii plaintext files in a format like this:
timestamp:2011120211T10:42:23
value:42
error:Foobar error
value:100
error:
timestamp:2011120211T10:43:58
value:0
...
I tried importing this exact data from one 13 MB text file into an SQLite database with columns (DATETIME, TEXT, TEXT, TEXT, TEXT). However, much to my surprise, the file size of the database was also 13 MB.
Why is this? I would expect a database to use a format more space efficient than plain ascii, is that not the case?
That is definitely not the case. There is lots of metadata there, and space is actually often wasted in the name of efficiency, to allow for inserts, for indexing, etc.
The only time I would expect an ASCII dump to be larger than the database file is if the database were largely binary data, which would need to be Base64-encoded to be output as ASCII, and if there were no or minimal indexes.
Databases can support compression of data, but that affects performance. I'm not familiar with SQLite, but I'd guess that data compression is an option you would need to turn on.
I would imagine that the efficiency and speed of the database would stem from the data structures it uses in memory and algorithms that it implements to search, not the structure of the files.
A database is not designed to be more space-efficient; it is designed to be more time-efficient. In many cases the database does not waste any space, but plain text does not waste much space either.
Numbers are stored more compactly in a database record than as text, but text is still stored as text.
Even where there is some space saving, it will not be big enough to spot easily; you will only notice it if you measure in bytes rather than in MB.
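A small sketch of why the sizes end up so close for data like the sample above (the schema is assumed for illustration, not taken from the question): the numeric field shrinks, but the timestamp and error strings are stored as text either way:

    CREATE TABLE readings (
        ts    TEXT,     -- '2011120211T10:42:23' is ~20 bytes of text in both formats
        value INTEGER,  -- 42 fits in a single byte instead of two ASCII characters
        error TEXT      -- 'Foobar error' stays as text
    );

The few bytes saved on the integers are roughly offset by SQLite's per-record and per-page overhead, which is why the database file comes out about the same size as the ASCII file.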