Oracle database: How to read a BLOB? - sql

I'm working with an Oracle database, and I would like to read the contents of a BLOB. How do I do this?
When I do a simple select statement, it merely returns "(BLOB)" (without the quotes). How do I read the actual contents?

You can dump the value in hex using UTL_RAW.CAST_TO_RAW(UTL_RAW.CAST_TO_VARCHAR2()).
SELECT b FROM foo;
-- (BLOB)
SELECT UTL_RAW.CAST_TO_RAW(UTL_RAW.CAST_TO_VARCHAR2(b))
FROM foo;
-- 1F8B080087CDC1520003F348CDC9C9D75128CF2FCA49D1E30200D7BBCDFC0E000000
This is handy because this is the same format used for inserting into BLOB columns:
CREATE GLOBAL TEMPORARY TABLE foo (
b BLOB);
INSERT INTO foo VALUES ('1f8b080087cdc1520003f348cdc9c9d75128cf2fca49d1e30200d7bbcdfc0e000000');
DESC foo;
-- Name Null Type
-- ---- ---- ----
-- B BLOB
However, at a certain point (2000 bytes?) the corresponding hex string exceeds Oracle's maximum string length. If you need to handle that case, you'll have to combine "How do I get textual contents from BLOB in Oracle SQL" with the documentation for DBMS_LOB.SUBSTR for a more complicated approach that will let you see substrings of the BLOB.
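For example, a minimal sketch of that combined approach (using the foo table and column b from above; in a SQL statement DBMS_LOB.SUBSTR on a BLOB returns a RAW of at most 2000 bytes, which the client displays as hex):
-- Walk the BLOB in 2000-byte chunks by moving the offset:
SELECT DBMS_LOB.GETLENGTH(b)          AS blob_length,
       DBMS_LOB.SUBSTR(b, 2000, 1)    AS bytes_1_to_2000_hex,
       DBMS_LOB.SUBSTR(b, 2000, 2001) AS bytes_2001_to_4000_hex
FROM foo;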

SQL Developer can show the blob as an image (at least it works for jpegs). In the Data view, double click on the BLOB field to get the "pencil" icon. Click on the pencil to get a dialog that will allow you to select a "View As Image" checkbox.

If the content is not too large, you can also use
SELECT CAST ( <blobfield> AS RAW( <maxFieldLength> ) ) FROM <table>;
or
SELECT DUMP ( CAST ( <blobfield> AS RAW( <maxFieldLength> ) ) ) FROM <table>;
This will show you the HEX values.
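For instance, plugged into the foo example from the first answer (assuming b fits within the 2000-byte RAW limit):
SELECT CAST(b AS RAW(2000)) FROM foo;
SELECT DUMP(CAST(b AS RAW(2000))) FROM foo;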

If you use the Oracle native data provider rather than the Microsoft driver then you can get at all field types
Dim cn As New Oracle.DataAccess.Client.OracleConnection
Dim cm As New Oracle.DataAccess.Client.OracleCommand
Dim dr As Oracle.DataAccess.Client.OracleDataReader
The connection string does not require a Provider value so you would use something like:
"Data Source=myOracle;UserID=Me;Password=secret"
Open the connection:
cn.ConnectionString = "Data Source=myOracle;User Id=Me;Password=secret"
cn.Open()
Attach the command and set the Sql statement
cm.Connection = cn
cm.CommandText = strCommand
Set the Fetch size. I use 4000 because it's as big as a varchar can be
cm.InitialLONGFetchSize = 4000
Start the reader and loop through the records/columns
dr = cm.ExecuteReader
Do While dr.Read()
    strMyLongString = dr(i)
Loop
You can be more specific with the read, e.g. dr.GetOracleString(i), dr.GetOracleClob(i), etc., if you first identify the data type of the column. If you're reading a LONG datatype then the simple dr(i) or dr.GetOracleString(i) works fine. The key is to ensure that InitialLONGFetchSize is big enough for the data. Note also that the native driver does not support CommandBehavior.SequentialAccess for the data reader, but you don't need it; also, the LONG field does not even have to be the last field in the select statement.

What client do you use? .Net, Java, Ruby, SQLPLUS, SQL DEVELOPER? Where did you write that simple select statement?
And why do you want to read the content of the BLOB? A BLOB contains binary data, so that data is not directly readable. You should use a CLOB instead of a BLOB if you want to store text instead of binary content.
I suggest that you download SQL DEVELOPER: http://www.oracle.com/technetwork/developer-tools/sql-developer/overview/index.html . With SQL DEVELOPER you can see the content.

If you're interested in getting the plain text (body part) out of a BLOB, you could use the CTX_DOC package.
For example, the CTX_DOC.FILTER procedure can "generate either a plain text or a HTML version of a document". Be aware that CTX_DOC.FILTER requires an index on the BLOB column. If you don't want that, you could use the CTX_DOC.POLICY_FILTER procedure instead, which doesn't require an index.
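A rough PL/SQL sketch of the POLICY_FILTER route (the policy name is hypothetical and must be created first with CTX_DDL.CREATE_POLICY; check the CTX_DOC documentation for the exact overloads and required privileges):
-- One-time setup (assumed): EXEC ctx_ddl.create_policy('my_filter_policy');
DECLARE
  l_text CLOB;
BEGIN
  DBMS_LOB.CREATETEMPORARY(l_text, TRUE);
  FOR r IN (SELECT b FROM foo) LOOP
    -- last argument TRUE = plain text output rather than HTML
    CTX_DOC.POLICY_FILTER('my_filter_policy', r.b, l_text, TRUE);
    DBMS_OUTPUT.PUT_LINE(DBMS_LOB.SUBSTR(l_text, 2000, 1));
  END LOOP;
  DBMS_LOB.FREETEMPORARY(l_text);
END;
/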

Related

BigQuery: Convert a text column to UTF-8

I want to start a Vertex AI AutoML Text Entity Extraction batch prediction job, but from my own experience, the texts (the "content" field in the JSONL structure) must also satisfy the following two requirements:
Every text's size must be between 10 and 10000 bytes: DONE
Every text's encoding must be UTF-8: UNKNOWN
My original data is stored in BigQuery, so I'll have to export it to Google Cloud Storage for the later batch prediction. To take advantage of BigQuery optimization, I want to accomplish the two previous tasks in the BigQuery data source table itself. I have checked Google's official documentation, and the closest related information I have found is this; however, it is not exactly what I want. BTW, the query looks as follows:
WITH mydata AS (
  SELECT
    CASE
      WHEN BYTE_LENGTH(posting) > 10000 THEN LEFT(posting, 9950)
      WHEN BYTE_LENGTH(posting) < 10 THEN CONCAT(posting, " is possibly an skill")
      ELSE posting
    END AS posting
  FROM `my-project.Machine_Learning_Datasets.sample-data-source` -- Modified for data protection
)
SELECT
  posting AS content, -- Something needs to be done here
  "text" AS mimeType
FROM mydata
And the `my-project.Machine_Learning_Datasets.sample-data-source` schema looks as follows:
Field name | Type   | Mode     | Records
posting    | STRING | NULLABLE | 100M
Any ideas?
The following answer did the job, FYI:
WITH mydata AS (
  SELECT
    CASE
      WHEN BYTE_LENGTH(posting) > 10000 THEN LEFT(posting, 9950)
      WHEN BYTE_LENGTH(posting) < 10 THEN CONCAT(posting, " is possibly an skill")
      ELSE posting
    END AS posting
  FROM `my-project.Machine_Learning_Datasets.sample-data-source`
)
SELECT
  REGEXP_REPLACE(posting, r'[^\x00-\x7F]+', '') AS content,
  "text/plain" AS mimeType
FROM mydata
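As a quick sanity check on the result (a hypothetical follow-up query, not part of the original answer; exported_table stands for wherever the output above was materialized):
SELECT
  COUNTIF(BYTE_LENGTH(content) BETWEEN 10 AND 10000) AS rows_within_size_limits,
  COUNT(*) AS total_rows
FROM exported_table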
UPDATE: This case has been considered, for an improved workaround.
Thanks!

npgsql ExecuteScalar() always returns Nothing

I'm using Npgsql as a NuGet package in Visual Studio 2017 with Visual Basic.
Various commands work very well, but ExecuteScalar always returns Nothing although it should give a result.
The command looks like this:
Dim ser As Integer
Dim find = New NpgsqlCommand("SELECT serial from dbo.foreigncode WHERE code = '@code';", conn)
Dim fcode = New NpgsqlParameter("code", NpgsqlTypes.NpgsqlDbType.Varchar)
find.Parameters.Add(fcode)
find.Prepare()
fcode.Value = "XYZ"
ser = find.ExecuteScalar() ==> nothing
When the command string is copied as a value during debugging and pasted into the query tool of PGADMIN it delivers the correct result. The row is definitely there.
Different Commands executed with ExecuteNonQuery() work well, including ones performing UPDATE statements on the row in question.
When I look into the properties of the parameter fcode immediately before the ExecuteScalar it shows 'fcode.DataTypeName' caused an exception 'System.NotImplementedException'.
If I change my prepared statement to "SELECT @code" and set the value of the parameter to an arbitrary value, just that value is returned. There is no access to the table taking place, because the table name is not part of the SELECT in this case. If I remove the WHERE clause in the SELECT and just select one column, I would also expect something to be returned. But again it is Nothing.
Yes, there is a column named serial. It is of type bigint and cannot contain NULL.
A Query shows that there is no single row that contains NULL in any column.
Latest findings:
I queried a different table where the search column and the result column happen to have the same datatype. It works, so the syntax, parameter passing, Prepare, etc. seem to work in principle.
The System.NotImplementedException in the DataTypeName property of the parameter occurs as well but it works anyway.
I rebuilt the index of the table in question. No change.
Still: when I copy/paste the CommandText and execute it in PGAdmin it shows the correct result.
Modifying the command and using plain text there, without a parameter and without Prepare, still yields nothing. The plain-text CommandText was copied/pasted from PGAdmin, where it had been successfully executed before.
Very strange.
Reversing the search column and the result column also gives nothing as a result.
Please try these two alternatives and post back your results:
' Alternative 1: fetch the entire row, see what's returned
Dim dr = find.ExecuteReader()
While dr.Read()
    Console.WriteLine("{0}" & vbTab & "{1}", dr(0), dr(1))
End While
' Alternative 2: Check if "ExecuteScalar()" returns something other than an int
Dim result = find.ExecuteScalar()
... and (I just noticed Honeyboy Wilson's response!) ...
Fix your syntax:
' Try this first: remove the single quotes around "@code"!
Dim find = New NpgsqlCommand("SELECT serial from dbo.foreigncode WHERE code = @code;", conn)
Update 1
Please try this:
Dim find = New NpgsqlCommand("SELECT * from dbo.foreigncode;", conn)
Q: Does this return anything?
Dim dr = find.ExecuteReader()
While dr.Read()
    Console.WriteLine("{0}" & vbTab & "{1}", dr(0), dr(1))
End While
Q: Does this?
Dim result = find.ExecuteScalar()
Q: Do you happen to have a column named "serial"? What is its data type? Is it non-null for the row(s) with 'XYZ'?
Please update your original post with this information.
Update 2
You seem to be doing "everything right":
You've confirmed that you can connect,
You've confirmed that non-query updates to the same table work (with npgsql),
You've confirmed that the SQL queries themselves are valid (by copying/pasting the same SQL into PGAdmin and getting valid results).
As Shay Rojansky said, "System.NotImplementedException in the DataTypeName property" is a known issue stepping through the debugger. It has nothing to do with your problem: https://github.com/npgsql/npgsql/issues/2520
SUGGESTIONS (I'm grasping at straws):
Double-check "permissions" on your database and your table.
Consider installing a different version of npgsql.
Be sure your code is detecting any/all error returns and exceptions (it sounds like you're probably already doing this, but it never hurts to ask)
... and ...
Enable verbose logging, both client- and server-side:
https://www.npgsql.org/doc/logging.html
https://www.postgresql.org/docs/9.0/runtime-config-logging.html
... Finally ...
Q: Can you make ANY query, from ANY table, using ANY query method (ExecuteReader(), ExecuteScalar(), ... ANYTHING) from your npgsql/.Net client AT ALL?
I finally found it. It's often the small things that can have a big impact.
When the value was assigned to the parameter, a substring index was incorrect.
Now it works perfectly.
Thanks to everybody who spent his time on this.

SQL Server datetime culture (localization) headache

SQL Server 2005. Visual Studio 2010. ASP.NET 2.0 Web Application
This is a web application that supports multiple languages, one of them is Korean. I have “langid” in the query string to differentiate different languages, if langid=3 it is Korean.
In my code-behind C# code, I read a table using this query:
"select * from Reservations where rsv_id = 1234"
There is a column named "rsv_date" in the table, which is the reservation date, of type datetime. In the db table its value is "11/22/2012 4:14:37 PM"; I checked this in SQL Server Management Studio. But when I read it out, I get "2012-11-22 오후 4:14:37"! Where does that Korean "오후" come from? Is it because of some culture setting somewhere? But I don't see where, either in my code or in SQL Server. This causes a problem for me, because when I modify this record it tries to write "2012-11-22 오후 4:14:37" back to the db, which of course SQL Server rejects with an error.
My original code:
Hashtable reservation = new Hashtable();
SqlCommand sqlCommand = null;
SqlDataReader dataReader;
string queryCommand = "select * from Reservations where rsv_id = @RsvID";
sqlCommand = new SqlCommand(queryCommand, getConnection());
sqlCommand.Connection.Open();
sqlCommand.Parameters.AddWithValue("@RsvID", rsvID);
dataReader = sqlCommand.ExecuteReader();
while (dataReader.Read())
{
reservation["rsvID"] = dataReader["rsv_id"];
reservation["rsvCode"] = dataReader["rsv_code"];
reservation["rsvType"] = dataReader["rsv_type"];
reservation["rsvDate"] = dataReader["rsv_date"]; // where does Korean come from?
...
}
It's a common misunderstanding that you can "check" the format of datetime fields in the database.
The format you see on screen will always depend on the client, even if the client is "SQL server management studio".
In the database, the datetime is stored in a binary format that very few people ever need to look at.
So, the Korean characters are from the client, in this case your own program.
And Yes, they will depend on some culture setting somewhere.
Your example doesn't show what happens to reservation["rsvDate"]. Where is the value displayed with the Korean characters?
How are you trying to write the value with Korean characters to the database?
To avoid Korean characters you could use .ToString(CultureInfo.InvariantCulture) where you use the Date value.
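The root fix is to pass the datetime back as a typed parameter rather than formatting it to a string at all; but if you do have to exchange it as text, an ISO 8601 literal is culture-invariant on the SQL Server side. A minimal sketch against the question's table (column names and id value taken from the question):
-- ISO 8601 is parsed the same way regardless of the client's culture or SET LANGUAGE:
UPDATE Reservations
SET rsv_date = '2012-11-22T16:14:37'
WHERE rsv_id = 1234;

-- Reading it back in a culture-invariant textual form (style 126 = ISO 8601):
SELECT CONVERT(varchar(30), rsv_date, 126) AS rsv_date_iso
FROM Reservations
WHERE rsv_id = 1234;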

SQL table to delphi record using BCP

I have a scenario in which I have to export around 500,000 records from a SQL table to be used in a Delphi application. The data is to be loaded into a packed record. Is there a way to use BCP to write a data file laid out like the record, similar to writing the records directly to a file?
As of now I am loading the data using this pseudocode:
// Assign the data file generated from BCP to the TextFile object.
AssignFile(losDataFile, loslFileName);
Reset(losDataFile);
while not Eof(losDataFile) do
begin
  // Read from the data file until we encounter the end of the file
  ReadLn(losDataFile, loslDataString);
  // Use the string list's CommaText to split the fields
  loclTempSList.CommaText := loslDataString;
  // Load the record from the items of the string list.
  DummyRec.Name := loclTempSList[0];
  DummyRec.Mapped := loclTempSList[1] = 'Y';
end;
For convenience I have listed the type of DummyRec below:
TDummyRec = packed record
Name : string[255];
Mapped : Boolean;
end;
So, my question is: instead of exporting the data to a text file, would it be possible to export the data to binary so that I can read from the file directly into the record type?
like
loclFileStream := TFileStream.Create('xxxxxx.dat', fmOpenRead or fmShareDenyNone);
while loclFileStream.Position < loclFileStream.Size do
begin
  // Read one record from the binary file
  loclFileStream.Read(losDummyData, SizeOf(TDummyRec));
  // ... do whatever I want with losDummyData.
end;
I don't have much experience with using BCP. Please help me with this.
Thanks
Terminator...
In your record, a string[255] will create a fixed-size Ansi string (i.e. a so-called shortstring). This type is clearly deprecated, and should not be used in your code.
It will be an awful waste of space to save it directly using a TFileStream (even though it will work): each record will store 256 bytes for the Name alone.
And using a string[255] (i.e. a shortstring) will trigger a hidden conversion to a string on most accesses. So it is not the best option, IMHO.
My advice is to use a dynamic array of records for the storage, then serialize / unserialize it with our Open Source classes. They work from Delphi 5 up to XE2, and you'll be able to use a string in the record:
TDummyRec = packed record
Name : string; // native Delphi string (no shortstring)
Mapped : Boolean;
end;
Edit after OP's comment:
BCP is just a command-line tool meant to bulk-copy a lot of rows into or out of a SQL table. So IMHO BCP is not a good candidate for your purpose.
You seem to need to read a lot of rows from a SQL table.
In this case:
Using shortstring will in any case waste memory, so you'll run out of memory sooner than with a plain string;
You can try our Open Source classes to retrieve the data rows one by one, then populate your records from them: see the SynDB classes - they are lighter than ADO. You'll then be able to use our record serialization functions to create some binary content - or try a dedicated, faster engine like our SynBigTable;
There are some articles about using the OleDB bulk-copy feature used by BCP directly from Delphi code here - it is in French, but you can use Google to translate it - and here for fast bulk copy; full source code is included.
You want to read a SQL table into a record; I have no idea why you are working with the archaic AssignFile.
You should really use a TADOQuery (or a suitable variant) for your database.
Put a sensible SQL-query in it; something like:
SELECT field1, field2, field3 FROM tablename WHERE .....
When in doubt you can use:
SELECT * FROM tablename
Which will select all fields from the table.
The following code will walk through all the records and all the fields, save each value as a variant, and write them to a FileStream.
function NewFile(Filename: string): TFileStream;
begin
  // fmCreate so the file is created if it does not exist yet
  Result := TFileStream.Create(Filename, fmCreate);
end;

function SaveQueryToFileStream(AFile: TFileStream; AQuery: TADOQuery): boolean;
const
  Success = True;
  Failure = False;
  UniqueFilePrefix = 'MyCustomFileTypeId';
  BufSize = 4096;
var
  Value: Variant;
  Writer: TWriter;
  FieldCount: integer;
  c: integer;
  RowCount: integer;
begin
  Result := Success;
  try
    if not AQuery.Active then AQuery.Open;
    FieldCount := AQuery.Fields.Count;
    Writer := TWriter.Create(AFile, BufSize);
    try
      Writer.WriteString(UniqueFilePrefix);
      // Write the record info first
      Writer.WriteInteger(FieldCount);
      // Write the number of rows
      RowCount := AQuery.RecordCount;
      Writer.WriteInteger(RowCount);
      AQuery.First;
      while not AQuery.Eof do begin
        for c := 0 to FieldCount - 1 do begin
          Value := AQuery.Fields[c].Value;
          Writer.WriteVariant(Value);
        end; {for c}
        AQuery.Next;
      end; {while}
    finally
      Writer.Free;
    end;
  except
    Result := Failure;
  end;
end;

SQL Bulk import from CSV

I need to import a large CSV file into SQL Server. I'm using this:
BULK
INSERT CSVTest
FROM 'c:\csvfile.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
The problem is all my fields are surrounded by quotes (" ") so a row actually looks like:
"1","","2","","sometimes with comma , inside", ""
Can I somehow bulk import them and tell SQL to use the quotes as field delimiters?
Edit: The problem with using '","' as the delimiter, as the suggested examples do, is this:
What most examples do is import the data including the first " in the first column and the last " in the last column, then strip those out afterwards. Alas, my first (and last) column is a datetime and will not allow a "20080902 (with the stray quote) to be imported as a datetime.
From what I've been reading around, I think FORMATFILE is the way to go, but the documentation (including MSDN) is terribly unhelpful.
Try FIELDTERMINATOR='","'
Here is a great link to help with the first and last quote... look at how he used SUBSTRING in the SP:
http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file
Another hack which I sometimes use is to open the CSV in Excel, then write your SQL statement into a cell at the end of each row.
For example:
=CONCATENATE("insert into myTable (columnA,columnB) values ('",A1,"','",B1,"')")
A fill-down can populate this into every row for you. Then just copy and paste the output into a new query window.
It's old-school, but if you only need to do imports once in a while it saves you messing around with reading all the obscure documentation on the 'proper' way to do it.
Try OpenRowSet. This can be used to import Excel stuff. Excel can open CSV files, so you only need to figure out the correct connection string, for example:
Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=c:\txtFilesFolder\;Extensions=asc,csv,tab,txt;
I know this isn't a real solution but I use a dummy table for the import with nvarchar set for everything. Then I do an insert which strips out the " characters and does the conversions. It isn't pretty but it does the job.
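A sketch of that staging-table approach (table and column names are invented for illustration, and it still assumes no embedded commas inside the quoted fields):
CREATE TABLE CSVTest_Staging (col1 nvarchar(max), col2 nvarchar(max), col3 nvarchar(max));

BULK INSERT CSVTest_Staging
FROM 'c:\csvfile.txt'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Strip the quotes and convert while copying into the real table:
INSERT INTO CSVTest
SELECT CAST(REPLACE(col1, '"', '') AS datetime),
       REPLACE(col2, '"', ''),
       REPLACE(col3, '"', '')
FROM CSVTest_Staging;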
I'd say use FileHelpers, it's an open-source library.
Do you need to do this programmatically, or is it a one-time shot?
Using the Enterprise Manager, right-click Import Data lets you select your delimiter.
You have to watch out with BCP/BULK INSERT because neither BCP nor BULK INSERT handles this well if the quoting is not consistent, even with format files (even XML format files don't offer the option), or with dummy ["] characters at the beginning and end and [","] as the separator. Technically, CSV files do not need to have ["] characters if there are no embedded [,] characters.
It is for this reason that comma-delimited files are sometimes referred to as comedy-limited files.
OpenRowSet will require Excel on the server and could be problematic in 64-bit environments - I know it's problematic using Excel in Jet in 64-bit.
SSIS is really your best bet if the file is likely to vary from your expectations in the future.
You can try this code, which is quite handy if you want it;
it will remove the unwanted quote characters from your data.
If, for example, your data is like this: "Kelly","Reynold","kelly@reynold.com"
Bulk insert test1
from 'c:\1.txt' with (
fieldterminator ='","'
,rowterminator='\n')
update test1<br>
set name =Substring (name , 2,len(name))
where name like **' "% '**
update test1
set email=substring(email, 1,len(email)-1)
where email like **' %" '**
First you need to import the CSV file into a DataTable.
Then you can insert the rows in bulk using SqlBulkCopy:
using System;
using System.Data;
using System.Data.SqlClient;

namespace SqlBulkInsertExample
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable prodSalesData = new DataTable("ProductSalesData");

            // Create Column 1: SaleDate
            DataColumn dateColumn = new DataColumn();
            dateColumn.DataType = Type.GetType("System.DateTime");
            dateColumn.ColumnName = "SaleDate";

            // Create Column 2: ProductName
            DataColumn productNameColumn = new DataColumn();
            productNameColumn.ColumnName = "ProductName";

            // Create Column 3: TotalSales
            DataColumn totalSalesColumn = new DataColumn();
            totalSalesColumn.DataType = Type.GetType("System.Int32");
            totalSalesColumn.ColumnName = "TotalSales";

            // Add the columns to the ProductSalesData DataTable
            prodSalesData.Columns.Add(dateColumn);
            prodSalesData.Columns.Add(productNameColumn);
            prodSalesData.Columns.Add(totalSalesColumn);

            // Let's populate the datatable with our stats.
            // You can add as many rows as you want here!

            // Create a new row
            DataRow dailyProductSalesRow = prodSalesData.NewRow();
            dailyProductSalesRow["SaleDate"] = DateTime.Now.Date;
            dailyProductSalesRow["ProductName"] = "Nike";
            dailyProductSalesRow["TotalSales"] = 10;

            // Add the row to the ProductSalesData DataTable
            prodSalesData.Rows.Add(dailyProductSalesRow);

            // Copy the DataTable to SQL Server using SqlBulkCopy
            using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=dbProduct;Integrated Security=SSPI;Connection Timeout=60;Min Pool Size=2;Max Pool Size=20;"))
            {
                dbConnection.Open();
                using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
                {
                    s.DestinationTableName = prodSalesData.TableName;
                    foreach (var column in prodSalesData.Columns)
                        s.ColumnMappings.Add(column.ToString(), column.ToString());
                    s.WriteToServer(prodSalesData);
                }
            }
        }
    }
}
This is an old question, so I'm writing this to help anyone who stumbles upon it.
SQL Server 2017 introduces the FIELDQUOTE parameter which is intended for this exact use case.
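Applied to the question's statement it would look roughly like this (requires SQL Server 2017 or later):
BULK INSERT CSVTest
FROM 'c:\csvfile.txt'
WITH
(
    FORMAT = 'CSV',        -- treat the file as quoted CSV
    FIELDQUOTE = '"',
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)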
Yup, K Richard is right: FIELDTERMINATOR = '","'
See http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file for more info.
You could also use DTS or SSIS.
Do you have control over the input format? Pipes (|) and tabs (\t) usually make for better field terminators.
If you figure out how to get the file parsed into a DataTable, I'd suggest the SqlBulkCopy class for inserting it into SQL Server.