Regular expression search of Oracle BLOB field - sql

I have a table with a BLOB field containing SOAP-serialised .NET objects (XML).
I want to search for records representing objects with specific values against known properties. I have a working .NET client that pulls back the objects and deserialises them one at a time to check the properties; this is user-friendly but creates a huge amount of network traffic and is very slow.
Now I would like to implement a server-side search by sending a regular expression to a stored procedure that will search the text inside the BLOB. Is this possible?
I have tried casting the column to varchar2 using utl_raw.cast_to_varchar2, but the text is too long (in some cases 100KB).
dbms_lob.instr allows me to search the text field for a substring, but with such a complex XML structure I would like the additional flexibility offered by regular expressions.

Related

STRING type or SSTRING element for a text field in table? Pros and cons

I need to create a Z table to store reasons for modifications of a certain custom object.
In the UI, the user will pick a reason ID and then optionally fill a text box. The table will have more or less the fields below:
key objectID
key changeReasonID
changedOn
changedBy
comments
My doubt is with the comments field. I read the documentation about the limitations of STRING and SSTRING, but it's not clear to me if a STRING type field used in a transparent table has a limited length or not.
Even if the length is not limited (at least by the DB), I'm not sure if it's a good idea to use this approach or would you recommend CHAR/SSTRING types with a fix length instead?
My system is running on an MSSQL database.
Strings have unlimited length, both in ABAP structures/tables, and in the database.
Most databases will store only a pointer in this column that points to the real CLOB value which is stored in a different memory segment. As a result, they restrict the usage of these columns, and may not allow you to use them as a key or index.
If I remember correctly, ABAP supports a maximum of 16 string fields per structure, which naturally limits its use cases. Also consider that ABAP structures have a maximum size.
For your case, if the comment will remain the only long field, and if you are actually fine with storing unlimited input (--> security constraints?), string sounds like a reasonable option.
If you are unsure what the future will bring, or to be on the safe side regarding security, you might want to opt for sstring or simply a long char instead.

HANA: Unknown Characters in Database column of datatype BLOB

I need help on how to resolve characters of unknown type from a database field into a readable format, because I need to overwrite this value on database level with another valid value (in the exact format the application stores it in) to automate system copy activities.
I have a proprietary application that also allows users to configure it via the frontend. This configuration data gets stored in a table, and the values of a configuration property are stored in a column of type "BLOB". For the value in question, I provide a valid URL in the application frontend (like http://myserver:8080). However, what gets stored in the database is not readable (some square characters). I tried all sorts of HANA conversion functions (HEX, binary), both on their own and cascaded (e.g. first to binary, then to varchar), to make it readable. I also tried it the other way around, making the value that I want to insert appear in the correct format (conversion to BLOB via hex or binary), but this does not work either. I copied the value to the clipboard and compared it to all sorts of character set tables (although I am not sure if this can work at all).
My conversion tries look somewhat like this:
SELECT TO_ALPHANUM('') FROM DUMMY;
where the brackets would contain the characters in question. I can't even print them here.
How can one approach this and maybe find out the character set that is used by this application? I would be grateful for some more ideas.
What you have in your BLOB column is a series of bytes. As you mentioned, these bytes have been written by an application that uses an unknown character set.
In order to interpret those bytes correctly, you need to know the character set as this is literally the mapping of bytes to characters or character identifiers (e.g. code points in UTF).
Now, HANA doesn't come with a whole lot of options to work on LOB data in the first place and for C(haracter)LOB data most manipulations implicitly perform a conversion to a string data type.
So, what I would recommend is to write a custom application that reads out the BLOB bytes and performs the conversion itself. Once the bytes are successfully converted into a string, you can store the data in a new NCLOB field that keeps it in Unicode.
You will have to know the character set in the first place, though. No way around that.
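Since a known value (the URL entered in the frontend) is available, one way to narrow down the character set in such a custom application is to encode that known text with a few candidate encodings and check which byte sequence occurs in the BLOB. The following is only a minimal sketch of that idea: the function name, the candidate list and the sample bytes are assumptions made for illustration, and the real application may additionally wrap or serialise the value in ways this does not detect.

```python
# Hypothetical helper: find which candidate encodings reproduce a known
# plaintext inside the raw BLOB bytes. The candidate list is an assumption.
CANDIDATE_ENCODINGS = ["utf-8", "utf-16-le", "utf-16-be", "latin-1", "cp1252"]


def guess_encodings(blob, expected_text, candidates=CANDIDATE_ENCODINGS):
    """Return the candidate encodings under which expected_text occurs in blob."""
    matches = []
    for name in candidates:
        try:
            needle = expected_text.encode(name)
        except UnicodeEncodeError:
            # The text cannot be represented in this encoding at all.
            continue
        if needle in blob:
            matches.append(name)
    return matches


# Illustrative data only: two made-up header bytes followed by the URL
# stored as UTF-16 little-endian.
sample_blob = b"\x01\x02" + "http://myserver:8080".encode("utf-16-le")
result = guess_encodings(sample_blob, "http://myserver:8080")
print(result)
```

For plain ASCII text several single-byte encodings produce identical bytes, so more than one candidate can match; a value containing non-ASCII characters discriminates better.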
I assume you are on Oracle. You can convert BLOB to CLOB as described here.
http://www.dba-oracle.com/t_convert_blob_to_clob_script.htm
In case of your example try this query:
select UTL_RAW.CAST_TO_VARCHAR2(DBMS_LOB.SUBSTR(<your_blob_value>)) from dual;
Obviously this only works for values below 32767 characters.

Display 500+ character field from SAP transparent table

As is commonly known, SAP does not recommend using 255+ character fields in transparent tables. One should use several 255-character fields instead, wrap the text in LCHR, LRAW or STRING, use SO10 text, etc.
However, while maintaining legacy (and ugly) developments, such problem often arises: how to view what is stored in char500 or char1000 field in database?
The real life scenario:
we have a development where some structure is written to and read from a char1000 field in a transparent table
we know the field structure, and parsing the field through CL_ABAP_CONTAINER_UTILITIES=>FILL_CONTAINER_C or SO_STRUCT_TO_CHAR works fine, all fields are filled correctly
displaying the field via SE11/SE16/SE16n shows nothing useful, as the field is truncated to 255 characters, and to 132 in the debugger, AFAIR.
Is there any standard tool, transaction or FM we can use to display such long field?
In the DBA Cockpit (ST04) there is a SQL command line where you can enter native SQL commands directly and display the result as an ALV view. With a substring function, you can split a field into several sections (e.g. select substr(sql_text,1,100) s1, substr(sql_text,101,100) s2, substr(sql_text,201,100) s3, substr(sql_text,301,100) s4 from dba_hist_sqltext where sql_id = '0cuyjatkcmjf0'). PS: every ALV cell holds 128 characters at most.
I am not sure whether this tool is available for all supported database systems.
There is also an equivalent program named RSDU_EXEC_SQL (in all ABAP-based systems?)
Unfortunately, they won't work for SAP's substitutes for database tables (cluster tables and so on), as those can be queried only with ABAP "Open SQL".
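Writing the substr columns by hand gets tedious for a char1000 field, so the statement to paste into the SQL command line can be generated. The sketch below is illustrative only: the function name and parameters are hypothetical, it assumes the database offers a substr(column, start, length) function as in the example above (other databases may spell it differently), and it builds the text by plain string formatting, so it is not meant for untrusted input.

```python
# Hypothetical generator for a "split one long column into chunks" query.
def build_chunked_select(table, column, total_len, chunk_len=100, where=None):
    """Build a SELECT that returns `column` as consecutive substr() sections."""
    parts = []
    # SQL substr positions are 1-based, hence the range starting at 1.
    for i, start in enumerate(range(1, total_len + 1, chunk_len), start=1):
        parts.append(f"substr({column},{start},{chunk_len}) s{i}")
    sql = f"select {', '.join(parts)} from {table}"
    if where:
        sql += f" where {where}"
    return sql


# Reproduces the statement quoted above for a 400-character field.
statement = build_chunked_select(
    "dba_hist_sqltext", "sql_text", 400, 100, "sql_id = '0cuyjatkcmjf0'"
)
print(statement)
```

Keeping chunk_len at or below the 128-character ALV cell limit mentioned above avoids a second truncation in the result list.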
If you have an ERP system to hand, check out transaction PP01 with infotype 1002. Basically, they store the text in tables HRP1002 and HRT1002 and create a special view with a text editor. It looks like this: http://www.sapfunctional.com/HCM/Positions/Page1.13.jpg
In the debugger you can switch the view to e.g. HTML and you should see the whole string, but as far as I know editing is limited to a certain number of characters.

What data type to use for variable length data (for performance)?

What data type should I use for data that can be very short, e.g. an HTML link (think Twitter), or very long, e.g. an HTML blog post (think WordPress)?
I am thinking that if I use varchar(4000), it may be too short for an HTML-formatted blog entry, but if I use text, will it take up more space and be less efficient?
[update]
I am still considering using MySQL (if PHP 5.3/Zend Framework) or MSSQL (if ASP.NET MVC 2).
MySQL also has a Text data type for storing an arbitrarily large amount of text. You can find more here: The BLOB and TEXT Types
If you are using Microsoft SQL Server 2008 you can use varchar(max).
Edit:
Text is also available but isn't searchable without text indexing.

Informix SQL text Blob wildcard search

I am looking for an efficient way to use a wild card search on a text (blob) column.
I have seen that it is internally stored as bytes...
The amount of data will be limited, but unfortunately my vendor has decided to use this stupid datatype. I would also consider moving everything into a temp table if there is an easy system-side function to convert it - unfortunately something like rpad does not work...
I can see the text value correctly via using the column in the select part or when reading the data via Perl's DBI module.
Unfortunately, you are stuck - there are very few operations that you can perform on TEXT or BYTE blobs. In particular, none of these work:
+ create table t (t text in table);
+ select t from t where t[1,3] = "abc";
SQL -615: Blobs are not allowed in this expression.
+ select t from t where t like "%abc%";
SQL -219: Wildcard matching may not be used with non-character types.
+ select t from t where t matches "*abc*";
SQL -219: Wildcard matching may not be used with non-character types.
Depending on the version of IDS, you may have options with BTS - Basic Text Search (requires IDS v11), or with other text search data blades. On the other hand, if the data is already in the DB and cannot be type-converted, then you may be forced to extract the blobs and search them client-side, which is less efficient. If you must do that, ensure you filter on as many other conditions as possible to minimize the traffic that is needed.
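If you do end up searching client-side, the matching itself is simple once the blobs have been fetched. The following is a rough sketch, not a drop-in solution: it assumes the rows have already been retrieved (with the other filter conditions applied in SQL) as (key, bytes) pairs, the function name and the utf-8 default are made up for illustration, and it uses Python's fnmatch module, whose `*`, `?` and `[...]` wildcards resemble but are not guaranteed to be identical to those of MATCHES.

```python
from fnmatch import fnmatchcase


# Hypothetical client-side filter over already-fetched TEXT blobs.
def search_text_blobs(rows, pattern, encoding="utf-8"):
    """Return the keys of all rows whose decoded blob matches the wildcard pattern.

    rows: iterable of (key, blob_bytes) pairs fetched from the database.
    """
    hits = []
    for key, blob in rows:
        # Undecodable bytes are replaced rather than raising an error.
        text = blob.decode(encoding, errors="replace")
        if fnmatchcase(text, pattern):
            hits.append(key)
    return hits


# Illustrative rows only.
rows = [(1, b"xxabcxx"), (2, b"nothing here"), (3, b"line1\nabc\nline3")]
found = search_text_blobs(rows, "*abc*")
print(found)
```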
You might also notice that DBD::Informix has to go through some machinations to make blobs appear to work - machinations that it should not, quite frankly, have to go through. So far, in a decade of trying, I've not persuaded people that these things need fixing.