SELECT as much data from CLOB to VARCHAR2 as possible, with multibyte chars in the data - sql

Multi-byte characters had caused me a lot of pain.
Any suggestion for this problem?
I have a CLOB field that might contains some multi-byte characters, and I need to select in SQL and convert this field into a string for downstream process, currently I am using:
SELECT DBMS_LOB.SUBSTR( description, 4000, 1 ) FROM table
But the 4000 in above command is in length of characters, rather than bytes. So I had to change to 3000 to handle any multi-byte characters that might have crept into the data else buffer size error will occur.
The problem is for records that do not contain multibyte character, it might unnecessarily truncated more data than it need to.
(The 4000 is the string limitation, we can/had to live with that.)
Is there a way to do something in equivalent of:
SELECT DBMS_LOB.SUBSTR( description, 4000bytes, 1 ) FROM table
That way I can get as much data out as possible.
Note: I am not allowed to create temp tables/views, not using PL/SQL, only SQL SELECT...

Jeffrey's thinking process is ok, but alchn is also right. Just ran into this same problem and here is my solution. You'll have to be able to create a function though:
Create Or Replace Function clob_substr(p_clob In Clob
,p_offset In Pls_Integer
,p_length In Pls_Integer) Return Varchar2 Is
Begin
Return substrb(dbms_lob.substr(p_clob
,p_length
,p_offset)
,1
,p_length);
End;
/
Here is a demo of it's use:
Select c
,clob_substr(c
,1
,4000)
From (
Select xmlelement("t", rpad('é', 4000, 'é'), rpad('é', 4000, 'é')).extract('//text()').getclobval() c
From dual
);

Maybe truncate the resulting varchar2 with SUBSTR:
SELECT SUBSTRB( DBMS_LOB.SUBSTR( description, 4000, 1 ), 1, 4000) FROM table

Related

ORACLE inserts with CLOB column (more than 4000 characters)

first of all sorry for my english.
I am trying to insert approximately 15000 rows, the problem is that there is a column of type CLOB that can have more than 4000 characters giving the error ORA-01704, I know how to insert one by one like this: TO_CLOB (string) || TO_CLOB (string) and it works, but I have approximately 1000 cases where it happens and I don't want to do it manually, what way can you think of to insert them?
Thank you.
i think you're trying something like that:
insert into values('your clob data');
in this case you are using sql literal size which length 4000 bytes.
11g and below,the limit is 4000 bytes.
But if you use script like that :
declare
l_clob varchar2(32767);
begin
...
insert into values(l_clob );
...
end
it means you are using pl sql literal string which has 32k.

What does it mean when the size of a VARCHAR2 in Oracle is declared as 1 byte?

I know that I can declare a varchar2 using the number of the characters that it should be able to contain.
However, in an Oracle database on which I am working, I found that a field (named PDF) is defined as follows:
VARCHAR2(1 BYTE)
What does this mean? How many characters can it contain?
Another, related question: What is the difference between a VARCHAR and a VARCHAR2?
You can declare columns/variables as varchar2(n CHAR) and varchar2(n byte).
n CHAR means the variable will hold n characters. In multi byte character sets you don't always know how many bytes you want to store, but you do want to garantee the storage of a certain amount of characters.
n bytes means simply the number of bytes you want to store.
varchar is deprecated. Do not use it.
What is the difference between varchar and varchar2?
The VARCHAR datatype is synonymous with the VARCHAR2 datatype. To avoid possible changes in behavior, always use the VARCHAR2 datatype to store variable-length character strings.
If your database runs on a single-byte character set (e.g. US7ASCII, WE8MSWIN1252 or WE8ISO8859P1) it does not make any difference whether you use VARCHAR2(x BYTE) or VARCHAR2(x CHAR).
It makes only a difference when your DB runs on multi-byte character set (e.g. AL32UTF8 or AL16UTF16). You can simply see it in this example:
CREATE TABLE my_table (
VARCHAR2_byte VARCHAR2(1 BYTE),
VARCHAR2_char VARCHAR2(1 CHAR)
);
INSERT INTO my_table (VARCHAR2_char) VALUES ('€');
1 row created.
INSERT INTO my_table (VARCHAR2_char) VALUES ('ü');
1 row created.
INSERT INTO my_table (VARCHAR2_byte) VALUES ('€');
INSERT INTO my_table (VARCHAR2_byte) VALUES ('€')
Error at line 10
ORA-12899: value too large for column "MY_TABLE"."VARCHAR2_BYTE" (actual: 3, maximum: 1)
INSERT INTO my_table (VARCHAR2_byte) VALUES ('ü')
Error at line 11
ORA-12899: value too large for column "MY_TABLE"."VARCHAR2_BYTE" (actual: 2, maximum: 1)
VARCHAR2(1 CHAR) means you can store up to 1 character, no matter how many byte it has. In case of Unicode one character may occupy up to 4 bytes.
VARCHAR2(1 BYTE) means you can store a character which occupies max. 1 byte.
If you don't specify either BYTE or CHAR then the default is taken from NLS_LENGTH_SEMANTICS session parameter.
Unless you have Oracle 12c where you can set MAX_STRING_SIZE=EXTENDED the limit is VARCHAR2(4000 CHAR)
However, VARCHAR2(4000 CHAR) does not mean you are guaranteed to store up to 4000 characters. The limit is still 4000 bytes, so in worst case you may store only up to 1000 characters in such field.
See this example (€ in UTF-8 occupies 3 bytes):
CREATE TABLE my_table2(VARCHAR2_char VARCHAR2(4000 CHAR));
BEGIN
INSERT INTO my_table2 VALUES ('€€€€€€€€€€');
FOR i IN 1..7 LOOP
UPDATE my_table2 SET VARCHAR2_char = VARCHAR2_char ||VARCHAR2_char;
END LOOP;
END;
/
SELECT LENGTHB(VARCHAR2_char) , LENGTHC(VARCHAR2_char) FROM my_table2;
LENGTHB(VARCHAR2_CHAR) LENGTHC(VARCHAR2_CHAR)
---------------------- ----------------------
3840 1280
1 row selected.
UPDATE my_table2 SET VARCHAR2_char = VARCHAR2_char ||VARCHAR2_char;
UPDATE my_table2 SET VARCHAR2_char = VARCHAR2_char ||VARCHAR2_char
Error at line 1
ORA-01489: result of string concatenation is too long
See also Examples and limits of BYTE and CHAR semantics usage (NLS_LENGTH_SEMANTICS) (Doc ID 144808.1)
To answer you first question:
Yes, it means that 1 byte allocates for 1 character. Look at this example
SQL> conn / as sysdba
Connected.
SQL> create table test (id number(10), v_char varchar2(10));
Table created.
SQL> insert into test values(11111111111,'darshan');
insert into test values(11111111111,'darshan')
*
ERROR at line 1:
ORA-01438: value larger than specified precision allows for this column
SQL> insert into test values(11111,'darshandarsh');
insert into test values(11111,'darshandarsh')
*
ERROR at line 1:
ORA-12899: value too large for column "SYS"."TEST"."V_CHAR" (actual: 12,
maximum: 10)
SQL> insert into test values(111,'Darshan');
1 row created.
SQL>
And to answer your next one:
The difference between varchar2 and varchar :
VARCHAR can store up to 2000 bytes of characters while VARCHAR2 can store up to 4000 bytes of characters.
If we declare datatype as VARCHAR then it will occupy space for NULL values, In case of VARCHAR2 datatype it will not occupy any space.
it means ONLY one byte will be allocated per character - so if you're using multi-byte charsets, your 1 character won't fit
if you know you have to have at least room enough for 1 character, don't use the BYTE syntax unless you know exactly how much room you'll need to store that byte
when in doubt, use VARCHAR2(1 CHAR)
same thing answered here Difference between BYTE and CHAR in column datatypes
Also, in 12c the max for varchar2 is now 32k, not 4000. If you need more than that, use CLOB
in Oracle, don't use VARCHAR

How to interpret ALL_TAB_COLS.DATA_LENGTH = 4000 for Oracle's CLOB types?

When I create a table with a CLOB in it, the CLOB column is reported to have a DATA_LENGTH of 4000:
create table test (
data clob
);
-- Returns 4000
select data_length from all_tab_cols
where table_name = 'TEST';
How can I interpret that? Is that some backwards-compatible tweak, considering that the limit for VARCHAR2 is also 4000? Or is this to indicate that I can only insert / update string literals of length 4000? Is this behaviour of SYS.ALL_TAB_COLS documented somewhere?
Up to 4000 bytes can be stored in-line in the tablespace. If the length of the CLOB exceeds 4000 bytes it must be stored in/readed from LOB storage area via special functions.
See also:
http://www.dba-oracle.com/t_blob.htm
http://www.dba-oracle.com/t_table_blob_lob_storage.htm

How to query a CLOB column in Oracle

I'm trying to run a query that has a few columns that are a CLOB datatype. If i run the query like normal, all of those fields just have (CLOB) as the value.
I tried using DBMS_LOB.substr(column) and i get the error
ORA-06502: PL/SQL: numeric or value error: character string buffer too small
How can i query the CLOB column?
This works
select DBMS_LOB.substr(myColumn, 3000) from myTable
When getting the substring of a CLOB column and using a query tool that has size/buffer restrictions sometimes you would need to set the BUFFER to a larger size. For example while using SQL Plus use the SET BUFFER 10000 to set it to 10000 as the default is 4000.
Running the DBMS_LOB.substr command you can also specify the amount of characters you want to return and the offset from which. So using DBMS_LOB.substr(column, 3000) might restrict it to a small enough amount for the buffer.
See oracle documentation for more info on the substr command
DBMS_LOB.SUBSTR (
lob_loc IN CLOB CHARACTER SET ANY_CS,
amount IN INTEGER := 32767,
offset IN INTEGER := 1)
RETURN VARCHAR2 CHARACTER SET lob_loc%CHARSET;
I did run into another condition with HugeClob in my Oracle database. The dbms_lob.substr only allowed a value of 4000 in the function, ex:
dbms_lob.substr(column,4000,1)
so for my HughClob which was larger, I had to use two calls in select:
select dbms_lob.substr(column,4000,1) part1,
dbms_lob.substr(column,4000,4001) part2 from .....
I was calling from a Java app so I simply concatenated part1 and part2 and sent as a email.
If it's a CLOB why can't we to_char the column and then search normally ?
Create a table
CREATE TABLE MY_TABLE(Id integer PRIMARY KEY, Name varchar2(20), message clob);
Create few records in this table
INSERT INTO MY_TABLE VALUES(1,'Tom','Hi This is Row one');
INSERT INTO MY_TABLE VALUES(2,'Lucy', 'Hi This is Row two');
INSERT INTO MY_TABLE VALUES(3,'Frank', 'Hi This is Row three');
INSERT INTO MY_TABLE VALUES(4,'Jane', 'Hi This is Row four');
INSERT INTO MY_TABLE VALUES(5,'Robert', 'Hi This is Row five');
COMMIT;
Search in the clob column
SELECT * FROM MY_TABLE where to_char(message) like '%e%';
Results
ID NAME MESSAGE
===============================
1 Tom Hi This is Row one
3 Frank Hi This is Row three
5 Robert Hi This is Row five
For big CLOB selects also can be used:
SELECT dbms_lob.substr( column_name, dbms_lob.getlength(column_name), 1) FROM foo
Another option is to create a function and call that function everytime you need to select clob column.
create or replace function clob_to_char_func
(clob_column in CLOB,
for_how_many_bytes in NUMBER,
from_which_byte in NUMBER)
return VARCHAR2
is
begin
Return substrb(dbms_lob.substr(clob_column
,for_how_many_bytes
,from_which_byte)
,1
,for_how_many_bytes);
end;
and call that function as;
SELECT tocharvalue, clob_to_char_func(tocharvalue, 1, 9999)
FROM (SELECT clob_column AS tocharvalue FROM table_name);
To add to the answer.
declare
v_result clob;
begin
---- some operation on v_result
dbms_lob.substr( v_result, 4000 ,length(v_result) - 3999 );
end;
/
In dbms_lob.substr
first parameter is clob which you want to extract .
Second parameter is how much length of clob you want to extract.
Third parameter is from which word you want to extract .
In above example i know my clob size is more than 50000 , so i want last 4000 character .
If you are using SQL*Plus try the following...
set long 8000
select ...

How do I get textual contents from BLOB in Oracle SQL

I am trying to see from an SQL console what is inside an Oracle BLOB.
I know it contains a somewhat large body of text and I want to just see the text, but the following query only indicates that there is a BLOB in that field:
select BLOB_FIELD from TABLE_WITH_BLOB where ID = '<row id>';
the result I'm getting is not quite what I expected:
BLOB_FIELD
-----------------------
oracle.sql.BLOB#1c4ada9
So what kind of magic incantations can I do to turn the BLOB into it's textual representation?
PS: I am just trying to look at the content of the BLOB from an SQL console (Eclipse Data Tools), not use it in code.
First of all, you may want to store text in CLOB/NCLOB columns instead of BLOB, which is designed for binary data (your query would work with a CLOB, by the way).
The following query will let you see the first 32767 characters (at most) of the text inside the blob, provided all the character sets are compatible (original CS of the text stored in the BLOB, CS of the database used for VARCHAR2) :
select utl_raw.cast_to_varchar2(dbms_lob.substr(BLOB_FIELD)) from TABLE_WITH_BLOB where ID = '<row id>';
SQL Developer provides this functionality too :
Double click the results grid cell, and click edit :
Then on top-right part of the pop up , "View As Text" (You can even see images..)
And that's it!
You can use below SQL to read the BLOB Fields from table.
SELECT DBMS_LOB.SUBSTR(BLOB_FIELD_NAME) FROM TABLE_NAME;
Use this SQL to get the first 2000 chars of the BLOB.
SELECT utl_raw.cast_to_varchar2(dbms_lob.substr(<YOUR_BLOB_FIELD>,2000,1)) FROM <YOUR_TABLE>;
Note: This is because, Oracle will not be able to handle the conversion of BLOB that is more than length 2000.
If you want to search inside the text, rather than view it, this works:
with unzipped_text as (
select
my_id
,utl_compress.lz_uncompress(my_compressed_blob) as my_blob
from my_table
where my_id='MY_ID'
)
select * from unzipped_text
where dbms_lob.instr(my_blob, utl_raw.cast_to_raw('MY_SEARCH_STRING'))>0;
I can get this to work using TO_CLOB (docs):
select
to_clob(BLOB_FIELD)
from
TABLE_WITH_BLOB
where
ID = '<row id>';
This works for me in Oracle 19c, with a BLOB field which larger the the VARCHAR limit. I get readable text (from a JSON-holding BLOB)
Barn's answer worked for me with modification because my column is not compressed. The quick and dirty solution:
select * from my_table
where dbms_lob.instr(my_UNcompressed_blob, utl_raw.cast_to_raw('MY_SEARCH_STRING'))>0;
I struggled with this for a while and implemented the PL/SQL solution, but later realized that in Toad you can simply double click on the results grid cell, and it brings up an editor with contents in text. (i'm on Toad v11)
In case your text is compressed inside the blob using DEFLATE algorithm and it's quite large, you can use this function to read it
CREATE OR REPLACE PACKAGE read_gzipped_entity_package AS
FUNCTION read_entity(entity_id IN VARCHAR2)
RETURN VARCHAR2;
END read_gzipped_entity_package;
/
CREATE OR REPLACE PACKAGE BODY read_gzipped_entity_package IS
FUNCTION read_entity(entity_id IN VARCHAR2) RETURN VARCHAR2
IS
l_blob BLOB;
l_blob_length NUMBER;
l_amount BINARY_INTEGER := 10000; -- must be <= ~32765.
l_offset INTEGER := 1;
l_buffer RAW(20000);
l_text_buffer VARCHAR2(32767);
BEGIN
-- Get uncompressed BLOB
SELECT UTL_COMPRESS.LZ_UNCOMPRESS(COMPRESSED_BLOB_COLUMN_NAME)
INTO l_blob
FROM TABLE_NAME
WHERE ID = entity_id;
-- Figure out how long the BLOB is.
l_blob_length := DBMS_LOB.GETLENGTH(l_blob);
-- We'll loop through the BLOB as many times as necessary to
-- get all its data.
FOR i IN 1..CEIL(l_blob_length/l_amount) LOOP
-- Read in the given chunk of the BLOB.
DBMS_LOB.READ(l_blob
, l_amount
, l_offset
, l_buffer);
-- The DBMS_LOB.READ procedure dictates that its output be RAW.
-- This next procedure converts that RAW data to character data.
l_text_buffer := UTL_RAW.CAST_TO_VARCHAR2(l_buffer);
-- For the next iteration through the BLOB, bump up your offset
-- location (i.e., where you start reading from).
l_offset := l_offset + l_amount;
END LOOP;
RETURN l_text_buffer;
EXCEPTION
WHEN OTHERS THEN
DBMS_OUTPUT.PUT_LINE('!ERROR: ' || SUBSTR(SQLERRM,1,247));
END;
END read_gzipped_entity_package;
/
Then run select to get text
SELECT read_gzipped_entity_package.read_entity('entity_id') FROM DUAL;
Hope this will help someone.
You can try this:
SELECT TO_CHAR(dbms_lob.substr(BLOB_FIELD, 3900)) FROM TABLE_WITH_BLOB;
However, It would be limited to 4000 byte
Worked for me,
select lcase((insert(
insert(
insert(
insert(hex(BLOB_FIELD),9,0,'-'),
14,0,'-'),
19,0,'-'),
24,0,'-'))) as FIELD_ID
from TABLE_WITH_BLOB
where ID = 'row id';
Use TO_CHAR function.
select TO_CHAR(BLOB_FIELD) from TABLE_WITH_BLOB where ID = '<row id>'
Converts NCHAR, NVARCHAR2, CLOB, or NCLOB data to the database character set. The value returned is always VARCHAR2.