How can we use substring, trim, and length operations on text stored in a blob datatype? And how can we update a column of a blob datatype using a query?
Thanks,
With difficulty!
First of all, which of the four types of blob are you discussing:
BYTE
TEXT
BLOB
CLOB
These come in pairs (like Sith Lords): there is a binary version (BYTE, BLOB) and a text version (TEXT, CLOB). There's also another pairing: old (BYTE, TEXT) and newer (BLOB, CLOB). The BYTE and TEXT types were introduced with Informix OnLine 4.00 in about 1989. The BLOB and CLOB types were introduced with Informix Universal Server 9.00 in 1996, and are also known as SmartBlobs.
However, there's a very real sense in which it doesn't matter which of the types you are referring to.
There are very few operations that can be performed on BYTE and TEXT blobs. They can be fetched and stored, but for all practical purposes, that's all. I believe you can use LENGTH to determine the length of a TEXT blob. I don't believe there are any methods available to update part of a BYTE or TEXT blob; it is an all-or-nothing replacement. Further, the replacement has to come from a host variable of the appropriate type - there are no BYTE or TEXT literals.
The situation is a bit better with SmartBlobs, but I'm not an expert on them. There are mechanisms for obtaining a LO (large object) handle and then manipulating that, but I don't think those are available server-side (from SQL or SPL). I may be willfully not understanding what's available with the SmartBlobs, but I think the operations are only available from programming APIs and not within SQL. There are no BLOB or CLOB literals either. However, you can use SQL to load from files (FILETOBLOB, FILETOCLOB) and write to files (LOTOFILE) - with the files either on the server or on the client.
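For example, a minimal sketch of that file-based route (the table name, paths, and the presence of a configured sbspace for smart blobs are my assumptions):
CREATE TABLE docs (id SERIAL, body CLOB);
-- load a file that lives on the server into the CLOB
INSERT INTO docs (id, body) VALUES (0, FILETOCLOB('/tmp/doc.txt', 'server'));
-- write the CLOB back out to a file on the client
SELECT LOTOFILE(body, '/tmp/doc_out.txt', 'client') FROM docs WHERE id = 1;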
I have already answered your question about substring: substring operation on blob text in informix. With BLOBs you can use the substring operator, but not the SUBSTRING() or SUBSTR() functions.
You can also use LENGTH(), but not TRIM().
Example code:
CREATE TABLE _text_test (id serial, txt_vch varchar(200), txt_text text);
INSERT INTO _text_test (txt_vch, txt_text) VALUES ('1234567890', '1234567890');
SELECT txt_vch, txt_text, txt_vch[3,5], txt_text[3,5], length(txt_text) FROM _text_test;
In my example I used the TEXT blob type (Jonathan listed more blob types; you should state in your question which kind of blob you use). The last SELECT shows usage of the substring operator and the LENGTH() function. You can replace the LENGTH() function with other functions such as TRIM() to test them in your environment. In my case the TRIM() test ends with:
ODBC Error: -880 [Informix][Informix ODBC Driver][Informix]
Trim character and trim source must be of string data type.
The last SELECT works well with the JDBC 3.70JC1 driver, but it seems that the ODBC 3.70TC1 driver has a bug and shows the first 3 chars, 123, instead of 345. Test it yourself.
In a recent version (12.10) there is a DBMS_LOB package.
However, it doesn't work as documented: for example, there is no dbms_lob.get_length function. Instead I've found that dbms_lob_get_length works as expected.
So for CLOB fields you have the following useful operations:
dbms_lob_get_length;
dbms_lob_instr;
dbms_lob_substr (unfortunately it gets data after get_length too);
I've also found one undocumented but very, very useful function: dbms_lob_new_clob, which takes an lvarchar argument and converts it to a CLOB.
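A hedged sketch of how those calls might look (table and column names are mine, and I'm assuming the argument order mirrors the documented DBMS_LOB package, so verify against your server):
SELECT dbms_lob_get_length(doc) AS len,
       dbms_lob_instr(doc, 'xml', 1, 1) AS pos,
       dbms_lob_substr(doc, 20, 1) AS head
FROM clob_docs;
-- build a CLOB value from a plain string
UPDATE clob_docs SET doc = dbms_lob_new_clob('<root/>');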
I know that this answer is very late, but I think it can be useful for other people searching for ways to handle blobs in Informix (I found this post a few days ago when I was starting a mini-research into using blobs for storing XML).
I'm trying to convert the following code from Oracle to Snowflake:
order by nlssort(name, 'NLS_SORT=BINARY')
I know NLSSORT is not a function in Snowflake, but is there anything I can use as an alternative?
It should already be pretty similar to Snowflake's default sorting - you just need to check your database character set in Oracle (select * from nls_database_parameters where parameter = 'NLS_CHARACTERSET') and see whether it has a different binary order than ASCII/UTF-8.
Oracle's documentation:
If the value is BINARY, then comparison is based directly on byte values in the binary encoding of the character values being compared.
Snowflake's documentation:
All data is sorted according to the numeric byte value of each character in the ASCII table. UTF-8 encoding is supported.
So I think you should be able to just do:
order by name
It's kind of odd that somebody would write that Oracle code to begin with, since BINARY is the default sort order (collation). But if your Oracle database is using multilingual collation (which is not common) for other queries, I don't think you're going to be able to easily emulate that in Snowflake.
I need help resolving characters of an unknown type from a database field into a readable format, because I need to overwrite this value at the database level with another valid value (in the exact format the application stores it in) to automate system copy activities.
I have a proprietary application that also allows users to configure it via the frontend. This configuration data gets stored in a table, and the values of a configuration property are stored in a column of type "BLOB". For the desired value, I provide a valid URL in the application frontend (like http://myserver:8080). However, what gets stored in the database is not readable (some square characters). I tried all sorts of HANA conversion functions (HEX, binary), both simple and cascaded (e.g. first to binary, then to varchar), to make it readable. I also tried it the other way around, making the value that I want to insert appear in the correct format (conversion to BLOB via hex or binary), but this does not work either. I copied the value to the clipboard and compared it to all sorts of character set tables (although I am not sure whether this can work at all).
My conversion attempts look somewhat like this:
SELECT TO_ALPHANUM('') FROM DUMMY;
where the quotes would contain the characters in question. I can't even print them here.
How can one approach this and maybe find out the character set that is used by this application? I would be grateful for some more ideas.
What you have in your BLOB column is a series of bytes. As you mentioned, these bytes have been written by an application that uses an unknown character set.
In order to interpret those bytes correctly, you need to know the character set as this is literally the mapping of bytes to characters or character identifiers (e.g. code points in UTF).
Now, HANA doesn't come with a whole lot of options to work on LOB data in the first place and for C(haracter)LOB data most manipulations implicitly perform a conversion to a string data type.
So, what I would recommend is to write a custom application that is able to read out the BLOB bytes and perform the conversion in that custom app. Once successfully converted into a string, you can store the data in a new NCLOB field that keeps it in Unicode encoding.
You will have to know the character set in the first place, though. No way around that.
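If it helps, a minimal sketch of the target side (table and column names are my assumptions; the actual byte decoding still has to happen in the custom app):
ALTER TABLE config_table ADD (config_text NCLOB);
-- once the app has decoded the BLOB bytes into a proper string:
UPDATE config_table SET config_text = 'http://myserver:8080' WHERE id = 1;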
I assume you are on Oracle. You can convert BLOB to CLOB as described here.
http://www.dba-oracle.com/t_convert_blob_to_clob_script.htm
In case of your example try this query:
select UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(<your_blob_value>)) from dual;
Obviously this only works for values below 32767 characters.
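Spelled out with explicit arguments (table and column names are placeholders), it might look like this:
select UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(blob_col, 2000, 1)) from my_table;
-- DBMS_LOB.SUBSTR(lob, amount, offset): amount is the number of bytes to read, offset is 1-based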
I have a function in a VB .NET class library, which inserts XML text into a VARCHAR(MAX) column.
The column ends up with an extra "?" at the front of the data. I do not want that character in my data.
The column data starts like :
?<?xml version="1.0" encoding="utf-8"?><Registration xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"....
The insert function is :
INSERT INTO Table (Data) OUTPUT Inserted.ID VALUES (@Data)
The table has 2 columns, Data and ID.
Am I doing something wrong? The XML is created by the .NET XmlSerializer.
Thanks
First, all XML in SQL Server is stored in Unicode (UCS-2, to be precise), and data access libraries probably know that. So storing their output in a varchar column isn't the best idea - you might run into various issues with implicit conversion and such. Try switching the column data type to nvarchar and see whether that helps.
Second, it might be the byte order mark (BOM) that is usually found at the start of disk files stored in UTF-8. Since SQL Server doesn't support this encoding, these bytes might have been converted (again, implicitly) into something unreadable. Try something like this query:
select cast(substring(XMLField, 1, 10) as varbinary)
from dbo.MyTable;
It will show you the byte values of those characters, at least.
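As a rough follow-up check (reusing the same assumed names), you can also look at the code point of the first character after casting to nvarchar; 65279 (U+FEFF) would mean a BOM survived, while 63 means it was already flattened into a literal '?':
select UNICODE(LEFT(cast(XMLField as nvarchar(max)), 1)) as first_char_code
from dbo.MyTable;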
My best guess, however, would be to get rid of UTF-8 completely - the only way to store such data in SQL Server is via varbinary columns, but I doubt you will like the resulting overhead. Try switching to UTF-16 - it's backward compatible with UCS-2 (unless you deal in something truly exotic).
Varchar can only hold characters from the column's single-byte code page. My guess would be that you have some Unicode character at the beginning of that string.
Switch to nvarchar; you won't get rid of that initial character, but you won't lose it either.
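A hedged sketch of that switch, with the table and column names assumed (and the clean-up only as an option, since well-formed XML here should start with '<'):
alter table dbo.MyTable alter column Data nvarchar(max);
-- optionally strip the stray leading character where the data doesn't start with '<'
update dbo.MyTable set Data = STUFF(Data, 1, 1, '') where LEFT(Data, 1) <> '<';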
I had some difficulty with this using System.Xml.Serialization.XmlSerializer. I also wanted the XML to be stored as a human-readable string (because of reasons).
Here's the code I used in case it is useful to someone:
var ser = new XmlSerializer(typeof(Model.SomeRootType));
using var ms = new MemoryStream();
// StreamWriter defaults to UTF-8 *without* a byte order mark,
// which avoids a stray character at the start of the stored string
using var writer = new StreamWriter(ms);
ser.Serialize(writer, myObjectModel);
writer.Flush(); // ensure buffered output reaches the MemoryStream before reading it
var xml = System.Text.Encoding.UTF8.GetString(ms.ToArray());
// format the XML
var doc = System.Xml.Linq.XDocument.Parse(xml);
var niceXmlForTheDB = doc.ToString();
Notes:
Needs C# 8 for the using statements without the braces
Doesn't strictly need using for MemoryStream, but hey...
Can convert to VB if needed using one of the many online tools (you'll be used to them if you code in VB!)
This code should probably be in a try-catch which does something sensible if it fails!
I am using Simple.Data.Oracle to insert data in a table. I am trying to insert a very large value in one of the columns and it is giving me the following error
ORA-22835: Buffer too small for CLOB to CHAR or BLOB to RAW conversion (actual: 19471, maximum: 4000)
I am a long way ahead in the project and can't afford to dump Simple.Data.Oracle and look for alternatives at the moment...
You're converting a CLOB to a string of some description. There's a maximum of 4,000 characters in a string in Oracle SQL, so you need to take a substring of the CLOB if you want to place it into a character type.
The simplest way to do this is to use DBMS_LOB.SUBSTR.
It doesn't appear as though Simple.Data.Oracle supports CLOBs natively; if you look at the code for the DBTypeConverter class, the dictionary maps a CLOB to DbType.String:
{"CLOB", DbType.String},
This is despite the documentation linked in the comments indicating that it should be mapped to DbType.Object. Though the mappings have changed very slightly in the 11.2 documentation, a CLOB is still an object.
I don't know the reasons for deciding that a CLOB is a string but as I see it you have a few options:
Raise a bug report on GitHub and be prepared to be told it's not a bug.
Pull a version and change it yourself - be warned that as a CLOB is assumed to be a string there may be a lot of changes to make.
Use DBMS_LOB.SUBSTR and "chunk" the CLOB into as many strings as are required using the offset parameter.
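For the third option, a rough sketch of the chunking (4,000 matches the SQL limit from the error; table and column names are placeholders):
select DBMS_LOB.SUBSTR(clob_col, 4000, 1)    as chunk_1,
       DBMS_LOB.SUBSTR(clob_col, 4000, 4001) as chunk_2,
       DBMS_LOB.SUBSTR(clob_col, 4000, 8001) as chunk_3
from my_table;
-- for a CLOB the amount and offset are in characters; repeat until a chunk comes back null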
I have an Informix 11.70 database. I am unable to successfully execute this insert statement on a table.
INSERT INTO some_table(
col1,
col2,
text_col,
col3)
VALUES(
5,
50,
CAST('"id","title1","title2"
"row1","some data","some other data"
"row2","some data","some other"' AS TEXT),
3);
The error I receive is:
[Error Code: -9634, SQL State: IX000] No cast from char to text.
I found that I should add this statement in order to allow new lines in text literals, so I added it above the same query I had already written:
EXECUTE PROCEDURE IFX_ALLOW_NEWLINE('t');
Still, I receive the same error.
I have also read the IBM documentation that says I could alternatively set the ALLOW_NEWLINE parameter in the ONCONFIG file. I suppose that requires administrative access to the server to alter the config file, which I do not have, and I prefer not to rely on that setting.
Informix's TEXT (and BYTE) columns pre-date any standard, and are in many ways very peculiar types. TEXT in Informix is very different from TEXT found in other DBMS. One of the long-standing (over 20 years) problems with them is that there isn't a string literal notation that can be used to insert data into them. The 'No cast from char to text' is saying there is no explicit conversion from string literal to TEXT, either.
You have a variety of options:
Use LVARCHAR in the table (good if your values won't be longer than a few KiB, because the total row length is approximately 32 KiB). Maximum size of an LVARCHAR column is just under 32 KiB.
Use a programming language which can handle Informix 'locator' structures — in ESQL/C, the type used to hold a TEXT is loc_t.
Consider using CLOB instead. However, this has the same limitation (no string to CLOB conversion), but you'd be able to use the FILETOCLOB() function to get the information from a file on the client to the database (and LOTOFILE transfers information from the DB to a file on the client).
If you can use LVARCHAR, that is by far the simplest alternative.
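For illustration, a minimal sketch of the LVARCHAR route (the column types and size are my assumptions; LVARCHAR maxes out just under 32 KiB):
CREATE TABLE some_table (col1 INT, col2 INT, text_col LVARCHAR(30000), col3 INT);
-- newlines inside quoted literals still need the per-session switch
EXECUTE PROCEDURE IFX_ALLOW_NEWLINE('t');
INSERT INTO some_table (col1, col2, text_col, col3)
VALUES (5, 50, '"id","title1","title2"
"row1","some data","some other data"
"row2","some data","some other"', 3);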
I forgot to mention an important detail in the question - I use Java and the Hibernate ORM to access my Informix database, thus some of the suggested approaches (the loc_t handling in particular) in Jonathan Leffler's answer are unfortunately not applicable. Also, I need to store large data of dynamic length and I fear the LVARCHAR column would not be sufficient to hold it.
The way I got it working was to follow Michał Niklas's suggestion from his comment and use a PreparedStatement. This could potentially be explained by Informix handling the TEXT data type in its own manner.