Dynamic DB storage based on unknown result from source - sql

I have a source of data that I get from a web service. I can never know when it'll change, and I need to store it in a DB as soon as I get it. What is the best way to make the storage solution adapt to whatever I put there? I am using MySQL. Would serialization be the key?

I would store the content in a column using the TEXT data type, and consider MEDIUMTEXT or LONGTEXT if the content can exceed the 65,535 bytes that a TEXT column holds. MySQL 5.1 has XML functionality to get values out of the XML payload...
Ideally, I'd consume the webservice and populate tables appropriately.
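As a rough illustration of the "store the raw payload, extract values later" idea, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the table name raw_payload, its columns, and the helper functions are all hypothetical names chosen for this example, not anything from the question.

```python
import sqlite3
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Create the (hypothetical) table that holds payloads exactly as received."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_payload ("
        " id INTEGER PRIMARY KEY,"
        " received_at TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def save_payload(conn, body):
    """Store the payload unchanged, so a changed structure never breaks the insert."""
    received_at = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO raw_payload (received_at, body) VALUES (?, ?)",
        (received_at, body),
    )
    conn.commit()
    return cur.lastrowid


def extract_value(conn, payload_id, path):
    """Parse a stored XML payload on read and return the text of the first match."""
    row = conn.execute(
        "SELECT body FROM raw_payload WHERE id = ?", (payload_id,)
    ).fetchone()
    if row is None:
        return None
    element = ET.fromstring(row[0]).find(path)
    return None if element is None else element.text
```

Because the payload is only interpreted when it is read, a change in the web service's output affects the extraction code but never the insert itself.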

Related

HANA: Unknown Characters in Database column of datatype BLOB

I need help on how to resolve characters of unknown type from a database field into a readable format, because I need to overwrite this value at the database level with another valid value (in the exact format the application stores it in) to automate system copy activities.
I have a proprietary application that also allows users to configure it via the frontend. This configuration data gets stored in a table, and the values of a configuration property are stored in a column of type "BLOB". For the value in question, I provide a valid URL in the application frontend (like http://myserver:8080). However, what gets stored in the database is not readable (some square characters). I tried all sorts of HANA conversion functions (hex, binary), both on their own and cascaded (e.g. first to binary, then to varchar), to make it readable. I also tried it the other way around, making the value that I want to insert appear in the correct format (conversion to BLOB via hex or binary), but this does not work either. I copied the value to the clipboard and compared it to all sorts of character set tables (although I am not sure whether this can work at all).
My conversion attempts look somewhat like this:
SELECT TO_ALPHANUM('') FROM DUMMY;
where the quotes would contain the characters in question. I can't even print them here.
How can one approach this and maybe find out the character set that is used by this application? I would be grateful for some more ideas.
What you have in your BLOB column is a series of bytes. As you mentioned, these bytes have been written by an application that uses an unknown character set.
In order to interpret those bytes correctly, you need to know the character set, as this is literally the mapping of bytes to characters or character identifiers (e.g. code points in Unicode).
Now, HANA doesn't come with a whole lot of options to work on LOB data in the first place and for C(haracter)LOB data most manipulations implicitly perform a conversion to a string data type.
So, what I would recommend is to write a custom application that reads out the BLOB bytes and performs the conversion itself. Once the bytes have been successfully converted into a string, you can store the data in a new NCLOB field that keeps it as Unicode text.
You will have to know the character set in the first place, though. No way around that.
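A small custom application along those lines could start as the following Python sketch. It assumes you have already fetched the BLOB bytes with whatever database driver you use (that part is left out), and the helper names and the list of candidate encodings are assumptions for illustration. Since you know what the value should say (the URL you typed into the frontend), the sketch tries each candidate encoding and reports those under which the bytes decode to text containing that known value.

```python
# Candidate encodings to try; extend this list with whatever the
# application's platform makes plausible (this selection is an assumption).
CANDIDATE_ENCODINGS = ["utf-8", "utf-16-le", "utf-16-be", "latin-1", "cp1252"]


def hex_dump(raw):
    """Show the raw bytes as hex, which is often the first useful clue."""
    return raw.hex()


def try_decodings(raw, expected_fragment, encodings=CANDIDATE_ENCODINGS):
    """Return the encodings under which raw decodes to text containing expected_fragment."""
    matches = []
    for name in encodings:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        if expected_fragment in text:
            matches.append(name)
    return matches


def encode_for_blob(text, encoding):
    """Produce bytes for a replacement value once the encoding is known."""
    return text.encode(encoding)
```

If no candidate matches, the value may be compressed, encrypted, or wrapped in an application-specific binary format rather than being plain encoded text, and the hex dump is the place to look for such a header.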
I assume you are on Oracle. You can convert BLOB to CLOB as described here.
http://www.dba-oracle.com/t_convert_blob_to_clob_script.htm
In case of your example try this query:
select UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(<your_blob_value>)) from dual;
Obviously this only works for values below 32767 characters.

Storing and returning emojis

What's the simplest way to write and then read emoji symbols in an Oracle table?
Currently I have this situation:
The iOS client passes percent-encoded emojis: One%20more%20time%20%F0%9F%98%81%F0%9F%98%94%F0%9F%98%8C%F0%9F%98%92. For example, %F0%9F%98%81 means 😁;
The column type is nvarchar2(2000), so when I view the saved text in Oracle SQL Developer it looks like: One more time ????????.
This seems more a client problem than a database problem. Certain iOS programs are capable of interpreting that string and showing an image instead of that string.
SQL Developer does not do that.
As long as the data stored in the database is the same as the data retrieved from the database, you have no problem.
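To make the "same in, same out" check concrete, here is a short Python sketch that decodes the percent-encoded string from the question and verifies that it survives a round trip. The function names are hypothetical; only urllib.parse.quote and urllib.parse.unquote from the standard library are used.

```python
from urllib.parse import quote, unquote

# The percent-encoded text sent by the iOS client in the question.
ENCODED = "One%20more%20time%20%F0%9F%98%81%F0%9F%98%94%F0%9F%98%8C%F0%9F%98%92"


def decode_client_text(encoded):
    """Turn the percent-encoded UTF-8 bytes back into a Unicode string."""
    return unquote(encoded, encoding="utf-8", errors="strict")


def encode_client_text(text):
    """Percent-encode a Unicode string the way the client sent it."""
    return quote(text, safe="")


def survives_round_trip(encoded):
    """True if decoding and re-encoding reproduces the original string exactly."""
    return encode_client_text(decode_client_text(encoded)) == encoded
```

Each emoji here occupies four bytes in UTF-8, so any layer between client and table that cannot represent characters outside the Basic Multilingual Plane will replace them, and the same round-trip comparison applied to the value read back from the database will show it.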
In the end, we Base64-encode the text before storing it and decode it after reading. This is suitable for small texts.
In MySQL the character set needs to support 4-byte characters (utf8mb4 is the usual choice) to be able to save emojis; I assume Oracle would need the same ch

Using hsqldb as a key-value store

I would like to use hsqldb as a simple key-value store, where both the key and the value are strings.
The value would be a JSON of some data, say no more than 10K in size.
The type of the value column is LONGVARCHAR.
I would like to know whether this type is suitable for this purpose.
P.S.
A bit of background. We wanted to use MongoDB or CouchDB, but the latest MongoDB does not support Windows XP and the latest CouchDB does not support 32-bit Windows, both of which are requirements. Using a DB like Cassandra seems like enormous overkill in our case.
If the values are already in UTF-8 or another 8-bit encoded form, you can use BLOB or VARBINARY. Otherwise, use CLOB or VARCHAR for Unicode characters. Both forms are suitable for values of up to 10K. Note that LONGVARCHAR is simply a long VARCHAR.
If speed of access is essential, you can test with both types and decide which one is the best for your data. The same access API can be used for BLOB/VARBINARY or CLOB/VARCHAR when the values are relatively small (10k).
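The shape of such a key-value table is the same in most SQL databases. The following Python sketch uses the standard-library sqlite3 module purely as a stand-in for hsqldb (which would be accessed over JDBC), and the class name, table name kv, and column names are hypothetical. It stores each JSON value as character data, corresponding to the CLOB/VARCHAR option above.

```python
import json
import sqlite3


class KeyValueStore:
    """A string-keyed store whose values are JSON documents kept as text."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)"
        )

    def put(self, key, value):
        """Serialize value to JSON and insert it, replacing any existing entry."""
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()

    def get(self, key, default=None):
        """Return the deserialized value for key, or default if it is absent."""
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return default if row is None else json.loads(row[0])
```

To compare the two column types as suggested, the same class could store json.dumps(value).encode("utf-8") in a binary column instead and be timed against this version on representative data.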

What data type to use for variable length data (for performance)?

What data type should I use for data that can be very short, e.g. an HTML link (think Twitter), or very long, e.g. an HTML blog post (think WordPress)?
I am thinking that if I use varchar(4000), it may be too short for an HTML-formatted blog entry. But if I use text, will it take up more space and be less efficient?
[update]
I am still considering using MySQL (if PHP 5.3/Zend Framework) or MSSQL (if ASP.NET MVC 2).
MySQL also has a Text data type for storing an arbitrarily large amount of text. You can find more here: The BLOB and TEXT Types
If you are using Microsoft SQL Server 2008 you can use varchar(max).
Edit:
Text is also available but isn't searchable without text indexing.

What MySQL datatype & attributes should be used to store large amounts of html formatted data?

I'm setting up a database using phpMyAdmin and many fields will be large chunks of HTML.
What MySQL datatype & attributes should be used for fields that store large amounts of HTML data?
TEXT, MEDIUMTEXT, or LONGTEXT
I would recommend against storing large chunks of HTML (or other text data) in a database. I find that it's often far more useful to store files as files, and put a filename in the database instead.
On the other hand, if you're doing everything through phpMyAdmin, you may not have that option available to you.
You really really should start with the documentation, then if you have questions based on the data types you find there, try to ask for some clarification. But it really helps to understand what the datatypes are before asking the question: Documentation here:
http://dev.mysql.com/doc/refman/5.4/en/data-types.html
That said, take a closer look at TEXT and BLOB. TEXT will store a large body of textual information (probably a good choice), whereas BLOB is designed for binary data. This makes a difference to which query functions you can use, depending on the data types they operate on.
I think you can store HTML in a simple TEXT field. If your HTML is more than 64KB then you can use MEDIUMTEXT instead.
See also Storage Requirements for String Types for more details about maximum length of stored value.
Also remember that characters in Unicode can require more than 1 byte to store.
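Because the MySQL TEXT family is limited by bytes rather than characters, it can help to measure the encoded size of representative content. The Python sketch below picks the smallest TEXT type that fits a given string; the function names are hypothetical, and it assumes the column uses a UTF-8 character set.

```python
# Maximum payload sizes of the MySQL TEXT family, in bytes.
TEXT_TYPE_LIMITS = [
    ("TINYTEXT", 2**8 - 1),     # 255 bytes
    ("TEXT", 2**16 - 1),        # 65,535 bytes (about 64KB)
    ("MEDIUMTEXT", 2**24 - 1),  # 16,777,215 bytes (about 16MB)
    ("LONGTEXT", 2**32 - 1),    # 4,294,967,295 bytes (about 4GB)
]


def byte_length(text, encoding="utf-8"):
    """Number of bytes the text occupies once encoded."""
    return len(text.encode(encoding))


def smallest_text_type(text, encoding="utf-8"):
    """Name of the smallest TEXT type whose byte limit fits the encoded text."""
    size = byte_length(text, encoding)
    for name, limit in TEXT_TYPE_LIMITS:
        if size <= limit:
            return name
    raise ValueError("content exceeds the LONGTEXT limit")
```

For example, 40,000 copies of "é" are only 40,000 characters but 80,000 bytes in UTF-8, so they no longer fit in a TEXT column and need MEDIUMTEXT.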