Inserting a file into a Postgres bytea column using perl/SQL - sql

I'm working with a legacy system and need to find a way to insert files into a pre-existing Postgres 8.2 bytea column using Perl.
So far my searching has led me to believe the following:
there is no consensus best approach for this.
lo_import looks promising, but apparently I don't know enough Perl to get it to work.
I was hoping to do something like the following:
my $bind1 = "foo";
my $bind2 = "123";
my $file = "/path/to/file.ext";
my $q = q{
INSERT INTO generic_file_table
(column_1,
column_2,
bytea_column
)
VALUES
(?, ?, lo_import(?))
};
my $sth = $dbh->prepare($q);
$sth->execute($bind1, $bind2, $file);
$sth->finish();
My script works without the lo_import/bytea part. But with it I get this error:
DBD::Pg::st execute failed: ERROR: column "contents" is of type bytea but expression is of type oid at character 176
HINT: You will need to rewrite or cast the expression.
What I think I'm doing wrong is that I'm not passing the actual binary file to the DB properly. I think I'm passing the file path, but not the file itself. If that's true then what I need to figure out is how to open/read the file into a tmp buffer, and then use the buffer for the import.
Or am I way off base here? I'm open to any pointers, or alternative solutions as long as they work with Perl 5.8/DBI/PG 8.2.

Pg offers two ways to store binary files:
large objects, in the pg_largeobject table, which are referred to by an oid. Often used via the lo extension. May be loaded with lo_import.
bytea columns in regular tables. Represented as octal escapes like \000\001\002fred\004 in PostgreSQL 9.0 and below, or as hex escapes by default in Pg 9.1 and above, e.g. \x0102. The bytea_output setting lets you select between escape (octal) and hex format in versions that have the hex format.
You're trying to use lo_import to load data into a bytea column. That won't work.
What you need to do is send PostgreSQL correctly escaped bytea data. In a supported, current PostgreSQL version you'd just format it as hex, bang a \x in front, and you'd be done. In your version you'll have to escape it as octal backslash-sequences and (because you're on an old PostgreSQL that doesn't use standard_conforming_strings) probably have to double the backslashes too.
This mailing list post provides a nice example that will work on your version, and the follow-up message even explains how to fix it to work on less prehistoric PostgreSQL versions too. It shows how to use parameter binding to force bytea quoting.
Basically, you need to read the file data in. You can't just pass the file name as a parameter - how would the database server access the local file and read it? It'd be looking for a path on the server.
Once you've read the data in, you need to escape it as bytea and send that to the server as a parameter.
Update: Like this:
use strict;
use warnings;
use 5.16.3;
use DBI;
use DBD::Pg qw(:pg_types);
use File::Slurp;

die("Usage: $0 filename") unless defined($ARGV[0]);
die("File $ARGV[0] doesn't exist") unless (-e $ARGV[0]);
my $filename = $ARGV[0];

my $dbh = DBI->connect("dbi:Pg:dbname=regress", "", "", {AutoCommit => 0});

$dbh->do(q{
    DROP TABLE IF EXISTS byteatest;
    CREATE TABLE byteatest( blah bytea not null );
});
$dbh->commit();

# Read the file as raw bytes so nothing is mangled by line-ending or
# encoding translation.
my $filedata = read_file($filename, binmode => ':raw');

my $sth = $dbh->prepare("INSERT INTO byteatest(blah) VALUES (?)");
# Note the need to specify the bytea type. Otherwise the data won't be escaped;
# it'll be sent as if it were text in client_encoding, so NUL bytes will cause
# the string to be truncated. If it isn't valid utf-8 you'll get an error. If it
# is, it might not be stored how you want.
#
# So specify { pg_type => DBD::Pg::PG_BYTEA }.
#
$sth->bind_param(1, $filedata, { pg_type => DBD::Pg::PG_BYTEA });
$sth->execute();
undef $filedata;
$dbh->commit();

Thank you to those who helped me out. It took a while to nail this one down. The solution was to open the file and store its contents, then specifically flag the bind variable as type bytea. Here is the detailed solution:
.....
##some variables
my $datum1 = "foo";
my $datum2 = "123";
my $file = "/path/to/file.dat";
my $contents;
##open the file and store it
open my $FH, '<', $file or die "Could not open file: $!";
binmode $FH;   ## read raw bytes, not text
{
local $/ = undef;
$contents = <$FH>;
};
close $FH;
print "$contents\n";
##prepare SQL
my $q = q{
INSERT INTO generic_file_table
(column_1,
column_2,
bytea_column
)
VALUES
(?, ?, ?)
};
my $sth = $dbh->prepare($q);
##bind variables and specifically set #3 to bytea; then execute.
$sth->bind_param(1,$datum1);
$sth->bind_param(2,$datum2);
$sth->bind_param(3,$contents, { pg_type => DBD::Pg::PG_BYTEA });
$sth->execute();
$sth->finish();

Related

Importing csv data to SQL using PowerShell

Hi Glorious People of the Interwebz!
I come to you with a humble question (please go easy on me, I am fairly OK in PowerShell, but my SQL skills are minimal... :( )
So I have been tasked with writing a PowerShell script to import data (from a number of CSV files) into a database, and I have made good progress, based on this (I heavily modified my version). All works dashingly, except one part: when I try to insert the values (I created a sort of "mapping file" to map the CSV headers to the data), I can't seem to use the created string in the VALUES part. So here is what I have:
This is my current code for powershell (ignore the comments)
This is a sample data csv
This is my mapping file
What I would want is to replace the
VALUES(
'$($CSVLine.Invoice_Status_Text)',
'$($CSVLine.Invoice_Status)',
'$($CSVLine.Dispute_Required_Text)',
'$($CSVLine.Dispute_Required)',
'$($CSVLine.Dispute_Resolved_Text)',
'$($CSVLine.Dispute_Resolved)',
'$($CSVLine.Sub_Account_Number)',
'$($CSVLine.QTY)',
'$($CSVLine.Date_of_Service)',
'$($CSVLine.Service)',
'$($CSVLine.Amount_of_Service)',
'$($CSVLine.Total)',
'$($CSVLine.Location)',
'$($CSVLine.Dispute_Reason_Text)',
'$($CSVLine.Dispute_Reason)',
'$($CSVLine.Numeric_counter)'
);"
part, for example with a string generated this way:
But when I replace the long - and honestly, boring to type - values with the $valueString, I get this type of error:
Incorrect syntax was encountered while parsing '$($'.
Not sure if it matters, but my PowerShell version is 7.1.
Can any good people give a suggestion on how to build the values from my text file...?
Ta,
F.
As commented, wrapping variables inside single-quotes takes the variable as written literally, so you do not get the value contained (7957), but a string like $($CSVLine.Numeric_counter) instead.
I don't do SQL a lot, but I think I would change the part where you construct the values to insert like this:
# demo, read the csv file in your example
$csv = Import-Csv D:\Test\test.csv -Delimiter ';'
# demo, these are the headers (or better yet, the Property Names to use from the objects in the CSV) as ARRAY
# (you use `$headers = Get-Content -Path 'C:\Temp\SQL\ImportingCSVsIntoSQLv1\config\headers.txt'`)
$headers = 'Invoice_Status_Text','Invoice_Status','Dispute_Required_Text','Dispute_Required',
'Dispute_Resolved_Text','Dispute_Resolved','Sub_Account_Number','QTY','Date_of_Service',
'Service','Amount_of_Service','Total','Location','Dispute_Reason_Text','Dispute_Reason','Numeric_counter'
# capture formatted blocks of values for each row in the CSV
$AllValueStrings = $csv | ForEach-Object {
# get a list of values using propertynames you have in the $headers
$values = foreach ($propertyName in $headers) {
$value = $_.$propertyName
# output the VALUE to be captured in $values
# for SQL, single-quote the string type values. Numeric values without quotes
if ($value -match '^[\d\.]+$') { $value }
else { "'{0}'" -f $value }
}
# output the values for this row in the CSV
$values -join ",`r`n"
}
# $AllValueStrings will now have as many formatted values to use
# in the SQL as there are records (rows) in the csv
$AllValueStrings
Using your examples, $AllValueStrings would yield
'Ready To Pay',
1,
'No',
2,
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
'',
7957

Does blobstor have an opposite command in Ingres?

I am using the blobstor command to load jpeg images into an Ingres db, which is fine. But at some point I need to develop a manual way to copy them back out again.
I can find some examples of this that use BCP, however these are for SQL Server dbs. So my question is, does blobstor have an equal and opposite command to extract blobs that can be used when selecting from an Ingres db? Pointers to any examples would be much appreciated.
I don't believe there is a blobstor-opposite tool which ships with Ingres; when I've needed such a thing before now, the solution was to write a short program.
As an example, here's a perl script. It uses DBI and the DBD-IngresII module. Hope it's of some use.
# Required: db=, table=, col=. Optional: user=.
# Anything else is a where clause.
use DBI;

my %p = ();
my $where = "";
foreach my $arg (@ARGV)
{
    if ($arg =~ /(db|table|col|user)=(\S+)$/) { $p{$1} = $2; next; }
    $where .= " " . $arg if ($p{db} and $p{table} and $p{col});
}
die "db, table and col required.\n" if (!$p{db} or !$p{table} or !$p{col});

my $user = "";
$user = $p{user} if defined($p{user});

my $dbh = DBI->connect("dbi:IngresII:" . $p{db}, $user, "");
my $stm = "select " . $p{col} . " from " . $p{table};
$stm .= " where" . $where if ($where ne "");

my $sth = $dbh->prepare($stm);
$sth->execute;

# Write the blob from the first row straight to stdout in binary mode.
binmode STDOUT;
my @row = $sth->fetchrow_array;
print $row[0];

$sth->finish;
$dbh->disconnect;

how to insert utf8 characters into oracle database using robotframework database library

I have a robot script which inserts some SQL statements from a SQL file; some of these statements contain UTF-8 characters. If I insert this file manually into the database using the Navicat tool, everything's fine. But when I try to execute this file using the database library of Robot Framework, the UTF-8 characters go crazy!
This is my utf8 included sql statement:
INSERT INTO "MY_TABLE" VALUES (2, 'تست1');
This is how I use database library:
Connect To Database Using Custom Params cx_Oracle ${dbConnection}
Execute Sql Script ${sqlFile}
Disconnect From Database
This is what I get in the database:
������������ 1
I have tried to execute the SQL file using cx_Oracle directly and it still fails! It seems there is a problem in the underlying library. This is what I've used for importing the SQL file:
import cx_Oracle

if __name__ == "__main__":
    dsn_tns = cx_Oracle.makedsn(ip, port, sid)
    db = cx_Oracle.connect(username, password, dsn_tns)
    sql_commands = open(sql_file_addr, 'r').read().split(";")
    cr = db.cursor()
    for command in sql_commands:
        if command not in ["", "\t", "\n", "\r", "\n\r", "\r\n", None]:
            print "Executing SQL command:", command
            cr.execute(command)
    db.commit()
I have found that I can define the character set in the connection string. I've done it for a MySQL database and the framework successfully inserted UTF-8 characters into the database; this is my connection string for MySQL:
database='db_name', user='db_username', password='db_password', host='db_ip', port=3306, charset='utf8'
But I don't know how to define character-set for Oracle connection string. I have tried this:
'db_username','db_password','db_ip:1521/db_sid','utf8'
And I've got this error:
TypeError: an integer is required
As @Yu Zhang suggested, I read the discussion in this link and found out that I should set the environment variable NLS_LANG in order to have a UTF-8 connection to the database. So I've added the line below in my test setup:
os.environ["NLS_LANG"] = "AMERICAN_AMERICA.AL32UTF8"
Would any of the links below help?
http://docs.oracle.com/cd/B19306_01/server.102/b14225/ch6unicode.htm#i1006779
http://www.theserverside.com/news/thread.tss?thread_id=39575
https://community.oracle.com/thread/502949
There can be several problems here...
The first problem might be that you don't save the test files using UTF-8 encoding.
Robot Framework expects plain text test files to be saved using UTF-8 encoding, yet most text editors will not save using UTF-8 by default.
Verify that your editor saves that way - for example, by opening the file in Notepad++ and choosing Encoding -> UTF-8.
Another problem might be the connection to the Oracle database. It doesn't seem like you can configure the connection's custom properties to explicitly state UTF-8.
This means you probably need to state that the database schema itself is UTF-8.
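Assuming the schema already uses AL32UTF8, a minimal sketch of forcing a UTF-8 client-side encoding with cx_Oracle might look like the following. The NLS_LANG value comes from the question above; the connection details are placeholders, and the encoding/nencoding keywords are only available in newer cx_Oracle releases.
import os
import cx_Oracle

# Option 1: set NLS_LANG before the first connection is created (as in the question).
os.environ["NLS_LANG"] = "AMERICAN_AMERICA.AL32UTF8"

# Option 2: newer cx_Oracle versions accept explicit encoding arguments.
dsn = cx_Oracle.makedsn("db_ip", 1521, "db_sid")          # placeholders
conn = cx_Oracle.connect("db_username", "db_password", dsn,
                         encoding="UTF-8", nencoding="UTF-8")

cur = conn.cursor()
# Bind the value instead of inlining it, so nothing is re-encoded inside the SQL text.
cur.execute(u'INSERT INTO "MY_TABLE" VALUES (2, :val)', val=u'تست1')
conn.commit()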

How to get parameter value of sql from shell script

I use a shell script to run sqlite3 commands; the code is as below:
idd=0;
/usr/bin/sqlite3 ~/Library/Application\ Support/NotificationCenter/*.db <<SQL_END
select app_id from app_info where bundleid='com.myapp.main';
select last_known_path from app_loc where app_id=29;
SQL_END
I want to get the last_known_path from the SQL, and I want to pass a parameter instead of hard-coding '29'. Can anyone help me with this? Thanks.
String interpolation, the mechanism that replaces variables in double-quoted strings, also happens in here-documents. You can thus wrap your call in a function as follows:
# notificationdb_last_known_path APP_ID
notificationdb_last_known_path()
{
local APP_ID
APP_ID="$1"
/usr/bin/sqlite3 ~/Library/Application\ Support/NotificationCenter/*.db <<SQL_END
select app_id from app_info where bundleid='com.myapp.main';
select last_known_path from app_loc where app_id=${APP_ID};
SQL_END
}
You can then call this function like this to store answer in a variable:
my_app_id=$(notificationdb_last_known_path 29)
Factorize the sqlite invocation
If you have several similar database accesses, it is worthy to factorize the sqlite invocation as follows:
# notificationdb_session
notificationdb_session()
{
    /usr/bin/sqlite3 ~/Library/Application\ Support/NotificationCenter/*.db
}
# notificationdb_last_known_path_query APP_ID
notificationdb_last_known_path_query()
{
local APP_ID
APP_ID="$1"
cat <<SQL_END
select app_id from app_info where bundleid='com.myapp.main';
select last_known_path from app_loc where app_id=${APP_ID};
SQL_END
}
# notificationdb_last_known_path_query APP_ID
notificationdb_last_known_path()
{
notificationdb_last_known_path_query "$1" | notificationdb_session
}
Do not store variables
In shell programming it is much easier to pass structured data around as data flows between processes than to pass it through variables. The programs sort, cut, paste, join and awk are especially useful when working on structured data streams.
Disable text interpolation in here-documents
If you need to disable text interpolation in a here-document, you can achieve this by writing the here-document delimiter between single-quotes as in
cat <<'SQL_END'
SQL_STATEMENT
SQL_END

Execute SQL from file in SQLAlchemy

How can I execute a whole SQL file against a database using SQLAlchemy? There can be many different SQL queries in the file, including begin and commit/rollback.
sqlalchemy.text or sqlalchemy.sql.text
The text construct provides a straightforward method to directly execute .sql files.
from sqlalchemy import create_engine
from sqlalchemy import text
# or from sqlalchemy.sql import text
engine = create_engine('mysql://{USR}:{PWD}@localhost:3306/db', echo=True)

with engine.connect() as con:
    with open("src/models/query.sql") as file:
        query = text(file.read())
    con.execute(query)
SQLAlchemy: Using Textual SQL
text()
I was able to run .sql schema files using pure SQLAlchemy and some string manipulations. It surely isn't an elegant approach, but it works.
# Open the .sql file
sql_file = open('file.sql', 'r')

# Create an empty command string
sql_command = ''

# Iterate over all lines in the sql file
for line in sql_file:
    # Ignore commented lines
    if not line.startswith('--') and line.strip('\n'):
        # Append line to the command string
        sql_command += line.strip('\n')

        # If the command string ends with ';', it is a full statement
        if sql_command.endswith(';'):
            # Try to execute statement and commit it
            try:
                session.execute(text(sql_command))
                session.commit()
            # Assert in case of error
            except:
                print('Ops')
            # Finally, clear command string
            finally:
                sql_command = ''
It iterates over all lines in a .sql file ignoring commented lines.
Then it concatenates lines that form a full statement and tries to execute the statement. You just need a file handler and a session object.
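For completeness, here is one way to create the session object the snippet above assumes (a minimal sketch; the SQLite URL is just a placeholder for whatever database you actually use):
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

# Any SQLAlchemy URL works here; sqlite:///example.db is only a placeholder.
engine = create_engine('sqlite:///example.db')
Session = sessionmaker(bind=engine)
session = Session()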
You can do it with SQLAlchemy and psycopg2.
import sqlalchemy

file = open(path)
engine = sqlalchemy.create_engine(db_url)
escaped_sql = sqlalchemy.text(file.read())
engine.execute(escaped_sql)
Unfortunately I'm not aware of a good general answer for this. Some dbapi's (psycopg2 for instance) support executing many statements at a time. If the files aren't huge you can just load them into a string and execute them on a connection. For others, I would try to use a command-line client for that db and pipe the data into that using the subprocess module.
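For the command-line-client route, a minimal sketch (assuming Python 3.5+, PostgreSQL's psql on the PATH, and placeholder file name and connection details) could look like this:
import subprocess

with open("script.sql", "rb") as f:
    subprocess.run(
        ["psql", "--dbname", "mydb", "--username", "myuser",
         "-v", "ON_ERROR_STOP=1"],   # abort on the first failing statement
        stdin=f,                     # pipe the whole file into psql
        check=True,                  # raise CalledProcessError if psql exits non-zero
    )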
If those approaches aren't acceptable, then you'll have to go ahead and implement a small SQL parser that can split the file apart into separate statements. This is really tricky to get 100% correct, as you'll have to factor in database-dialect-specific literal escaping rules, the charset used, and any database configuration options that affect literal parsing (e.g. PostgreSQL's standard_conforming_strings).
If you only need to get this 99.9% correct, then some regexp magic should get you most of the way there.
If you are using sqlite3, it has a useful extension to the DBAPI called conn.executescript(str). I've hooked this up via something like this and it seemed to work (not all context is shown, but it should be enough to get the drift):
def init_from_script(script):
    Base.metadata.drop_all(db_engine)
    Base.metadata.create_all(db_engine)

    # HACK ALERT: we can do this using the sqlite3 low level api, then reopen session.
    f = open(script)
    script_str = f.read().strip()

    global db_session
    db_session.close()

    import sqlite3
    conn = sqlite3.connect(db_file_name)
    conn.executescript(script_str)
    conn.commit()

    db_session = Session()
Is this pure evil, I wonder? I looked in vain for a 'pure' SQLAlchemy equivalent; perhaps that could be added to the library, something like db_session.execute_script(file_name)? I'm hoping that db_session will work just fine after all that (i.e. no need to restart the engine), but I'm not sure yet... further research needed (i.e. do we need to get a new engine, or just a session, after going behind SQLAlchemy's back?).
FYI sqlite3 includes a related routine: sqlite3.complete_statement(sql) if you roll your own parser...
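As a rough illustration, a minimal sketch of using complete_statement() to drive a simple line-accumulating splitter (file and database names are placeholders):
import sqlite3

def iter_statements(path):
    """Yield complete SQL statements from a file, one at a time."""
    buffer = ""
    with open(path) as f:
        for line in f:
            buffer += line
            # True once the buffer ends in a properly terminated (';') statement.
            if sqlite3.complete_statement(buffer):
                yield buffer.strip()
                buffer = ""
    if buffer.strip():
        yield buffer.strip()

# usage:
# conn = sqlite3.connect("example.db")
# for stmt in iter_statements("schema.sql"):
#     conn.execute(stmt)
# conn.commit()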
You can access the raw DBAPI connection through this
raw_connection = mySqlAlchemyEngine.raw_connection()
raw_cursor = raw_connection() #get a hold of the proxied DBAPI connection instance
but then it will depend on which dialect/driver you are using which can be referred to through this list.
For psycopg2, you can just do
raw_cursor.execute(open("my_script.sql").read())
but for pysqlite you would need to do
raw_cursor.executescript(open("my_script").read())
and in line with that you would need to check the documentation of whichever DBAPI driver you are using to see if multiple statements are allowed in one execute or if you would need to use a helper like executescript which is unique to pysqlite.
Here's how to run the script by splitting the statements and running each statement directly with a "connectionless" execution on the SQLAlchemy Engine. This assumes that each statement ends with a ; and that there's no more than one statement per line.
import re
from sqlalchemy import create_engine, text

engine = create_engine(url)

with open('script.sql') as file:
    statements = re.split(r';\s*$', file.read(), flags=re.MULTILINE)

for statement in statements:
    if statement:
        engine.execute(text(statement))
In the current answers, I did not find a solution which works when a combination of these features is present in the .SQL file:
Comments with "--"
Multi-line statements with additional comments after "--"
Function definitions which contain multiple SQL queries ending with ";" but must be executed as a whole statement
I found a rather simple solution:
# check for /* */
with open(file, 'r') as f:
    assert '/*' not in f.read(), 'comments with /* */ not supported in SQL file python interface'

# read the SQL file line-by-line into a list of strings (without \n, ...)
with open(file, 'r') as f:
    queries = [line.strip() for line in f.readlines()]

# from each line, remove all text which is behind a '--'
def cut_comment(query: str) -> str:
    idx = query.find('--')
    if idx >= 0:
        query = query[:idx]
    return query

# join everything into a single line of code with blank spaces
queries = [cut_comment(q) for q in queries]
sql_command = ' '.join(queries)

# execute in connection (e.g. sqlalchemy)
conn.execute(sql_command)
The code below works for me in Alembic migrations.
from alembic import op
import sqlalchemy as sa
from ekrec.common import get_project_root

def upgrade():
    path = f'{get_project_root()}/migrations/versions/fdb8492f75b2_.sql'
    op.execute(open(path).read())
I had success with David's answer here, with two slight modifications:
Use get_bind() as I was working with a Session rather than an Engine
Call cursor() on the raw connection
raw_connection = myDbSession.get_bind().raw_connection()
raw_cursor = raw_connection.cursor()
raw_cursor.execute(open("my_script.sql").read())