Remove quote marks in Google Cloud Datalab SQL module parameters?

The parameterization example in the "SQL Parameters" IPython notebook in the datalab github repo (under datalab/tutorials/BigQuery/) shows how to change the value being tested for in a WHERE clause.
%%sql --module get_data
SELECT *
FROM
[myproject:mydataset.mytable]
WHERE
$query
However, this syntax always seems to insert quotation marks around the parameter. This breaks when I pass parameters that aren't just a simple value:
import gcp.bigquery as bq
query = "(bnf_code LIKE '1202%') OR (bnf_code LIKE '1203%')"
query = bq.Query(get_data, query=query)
print query.sql
This prints an invalid query:
SELECT * FROM [myproject:mydataset.mytable]
WHERE "(bnf_code LIKE '1202%') OR (bnf_code LIKE '1203%')"
Is there any way I can insert values that aren't wrapped in quotation marks?
I'm using the module repeatedly in my code, with variable numbers of OR clauses in the query parameter. So I do need a way to pass in more complicated queries.

Sorry, variables are meant to be simple scalars, or tables, or (soon) lists for use in IN clauses. They are not meant for expressions.

Passing unquoted arguments to SQL modules isn't possible, but you can create a SqlStatement (from datalab.data._sql_statement) directly from a plain SQL string. With that you can use your own Python-style placeholders to substitute values as you see fit:
import datalab.data._sql_statement as bqsql
statement = bqsql.SqlStatement(
"SELECT some-field FROM %s" % '[your-instance:some-table-name]')
query = bq.Query(statement)
I don't know if they're doing anything special with placeholders or the in-notebook command processing but... well, I didn't see any of that in my (admittedly limited) spelunking.
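For the variable number of OR clauses in the question, the WHERE fragment can be assembled in plain Python before it is interpolated into the statement text. This is only a minimal sketch, assuming the values are code prefixes that can be restricted to letters and digits; build_like_clauses is a hypothetical helper and not part of Datalab:

```python
import re

def build_like_clauses(column, prefixes):
    """Return "(col LIKE 'p1%') OR (col LIKE 'p2%') ..." for the given prefixes."""
    # Column and prefixes end up as raw SQL text, so reject anything that
    # is not a plain identifier or an alphanumeric code (assumption).
    if not re.match(r"[A-Za-z_][A-Za-z0-9_]*\Z", column):
        raise ValueError("unexpected column name: %r" % (column,))
    clauses = []
    for prefix in prefixes:
        if not re.match(r"[A-Za-z0-9]+\Z", prefix):
            raise ValueError("unexpected prefix: %r" % (prefix,))
        clauses.append("(%s LIKE '%s%%')" % (column, prefix))
    if not clauses:
        raise ValueError("need at least one prefix")
    return " OR ".join(clauses)

where = build_like_clauses("bnf_code", ["1202", "1203"])
sql = "SELECT * FROM [myproject:mydataset.mytable] WHERE %s" % where
print(sql)
```

The resulting string could then be handed to SqlStatement as in the answer above. The validation step matters because nothing else quotes or escapes the interpolated text.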

Related

Properly escaping strings in raw Django SQL query

Really the root of the problem is poor database design going back to long before I started here, but that problem isn't going to be resolved by the end of the day or even the end of the week. Thus, I need to find a better way to handle the following to get a feature added.
I'm forced to deviate away from the Django ORM because I need to build a raw SQL query due to having to write logic around the FROM <table>. We've elected to go this route instead of updating the models.py a couple times a year with the new tables that are added.
There are numerous areas starting here where the Django documentation says "Do not use string formatting on raw queries or quote placeholders in your SQL strings!"
If I write the query like this:
cursor.execute("""
SELECT * \
FROM agg_data_%s \
WHERE dataview = %s \
ORDER BY sortorder """, [tbl, data_view])
This adds single quotes around tbl, which breaks the table name, although it correctly quotes the value in the WHERE clause.
Doing it the following way does not put single quotes around tbl, but it forces me to quote the WHERE value by hand, which is bad to say the least (it opens the query up to SQL injection):
sql = """ \
SELECT * \
FROM agg_data_%s \
WHERE dataview = '%s' \
ORDER BY sortorder """ % (tbl, data_view)
cursor.execute(sql)
Any way to make lemonade out of these lemons?
The %s parameters can only be used for values, not for (parts of) identifiers such as table or column names. If you are sure tbl is safe, you can interpolate it first and escape the value placeholder:
cursor.execute("""
SELECT * \
FROM agg_data_%s \
WHERE dataview = %%s \
ORDER BY sortorder """ % tbl, [data_view])
Here %% collapses to % during the string interpolation, so the query passed to cursor.execute still contains a %s placeholder for data_view.
But it might be better not to use different tables for this, and thus filter the table, for example by a foreign key.
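One way to make "sure tbl is safe" concrete is to check it against a fixed set of known suffixes before interpolating it. The sketch below uses the standard-library sqlite3 module so that it runs on its own; note that sqlite3 uses ? as its placeholder where Django's cursor uses %s, and that ALLOWED_SUFFIXES and fetch_agg_data are hypothetical names chosen for illustration:

```python
import sqlite3

# Hypothetical whitelist of table suffixes that are known to exist.
ALLOWED_SUFFIXES = {"2023", "2024"}

def fetch_agg_data(conn, tbl, data_view):
    # Identifiers cannot be bound as parameters, so validate tbl against
    # the whitelist before it becomes part of the SQL text.
    if tbl not in ALLOWED_SUFFIXES:
        raise ValueError("unknown table suffix: %r" % (tbl,))
    sql = "SELECT * FROM agg_data_%s WHERE dataview = ? ORDER BY sortorder" % tbl
    # The value is still passed separately, so the driver handles quoting.
    return conn.execute(sql, [data_view]).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agg_data_2024 (dataview TEXT, sortorder INTEGER)")
conn.executemany(
    "INSERT INTO agg_data_2024 VALUES (?, ?)",
    [("o'brien", 2), ("o'brien", 1), ("other", 3)],
)
rows = fetch_agg_data(conn, "2024", "o'brien")
print(rows)
```

The value containing an apostrophe goes through untouched because it never becomes part of the SQL string.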

Tcl SQLite update variable substitution cannot have apostrophe

Here's the problem: if I use { } for the update command like so:
package require sqlite3
fileRepo eval {UPDATE uploads SET $col=$data WHERE rowid=$id}
I cannot substitute any variables inside the curly brackets. It all has to be hard coded.
However, if I use " " for the update command like so:
fileRepo eval "UPDATE uploads SET $col='$data' WHERE rowid=$id"
I can substitute variables inside the double quotes, but I must use ' ' in order to put in data with spaces so sql sees it as one input. If I don't I get an error if I send something like
$data = "Legit Stack"
Because it has a space the sql will choke on the word: Stack
unless it is placed within single quotes
Therefore...
If I send this data to the update command:
$col = description
$data = "Stack's Pet"
I get the following error:
near "s": syntax error while executing "fileRepo eval "UPDATE uploads
SET $col='$data' WHERE rowid=$id" ...
Thus given these rules I can see no way to pass a single quote or apostrophe to the update command successfully. Is there a different way to do this?
Thanks!
While it is true that you can escape the single quotes by doubling them (as usual in SQL), you open up your code to the dangers of SQL injection attacks.
It might be better to split your code into two distinct steps:
1. Substitute the column name with format {UPDATE uploads SET %s=$data WHERE rowid=$id} $col
2. Let the sqlite3 eval command turn $data and $id into bound variables for a prepared statement
This way you only need to sanitize your col variable, to make sure it contains a valid column name and nothing else (should be easy), instead of all your data. In addition, you do not need to copy large values as often, so a two step approach will even be faster. To make it even clearer you want to use a bind variable, try the alternative syntax with a : in front of a variable name.
package require sqlite3
set stmt [format {UPDATE uploads SET %s=:data WHERE rowid=:id} $col]
fileRepo eval $stmt
Recommended Reading:
For the : syntax: https://www.sqlite.org/tclsqlite.html#eval
For more information about SQL Injections: https://www.owasp.org/index.php/SQL_Injection_Prevention_Cheat_Sheet
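The same two-step idea (validated column name, bound values) can be sketched in Python with the standard-library sqlite3 module, which also accepts the :name placeholder syntax. This is an illustration under assumptions rather than a translation of the Tcl code: update_column is a hypothetical helper, and checking the column against PRAGMA table_info is just one possible way to sanitize it:

```python
import sqlite3

def update_column(conn, col, data, rowid):
    # Column names cannot be bound, so check col against the table's
    # actual columns (second field of each PRAGMA table_info row).
    valid = {row[1] for row in conn.execute("PRAGMA table_info(uploads)")}
    if col not in valid:
        raise ValueError("not a column of uploads: %r" % (col,))
    stmt = "UPDATE uploads SET %s = :data WHERE rowid = :id" % col
    # data and id are bound, so apostrophes and spaces need no escaping.
    conn.execute(stmt, {"data": data, "id": rowid})

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE uploads (name TEXT, description TEXT)")
conn.execute("INSERT INTO uploads VALUES ('file.txt', '')")
update_column(conn, "description", "Stack's Pet", 1)
result = conn.execute(
    "SELECT description FROM uploads WHERE rowid = 1"
).fetchone()[0]
print(result)
```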
You have to use an escape apostrophe. So it should look like this:
$data = "Stack''s Pet"

how insert ODI step error message in to Oracle table, if the error message has single quotes and colons

I'm trying to insert ODI step error message into oracle table.
I captured the error message using <%=odiRef.getPrevStepLog("MESSAGE")%>.
ODI-1226: Step PRC_POA_XML_synchronize fails after 1 attempt(s).
ODI-1232: Procedure PRC_POA_XML_synchronize execution fails.
ODI-1227: Task PRC_POA_XML_synchronize (Procedure) fails on the source XML connection XML_PFIZER_LOAD_POA_DB_DEV.
Caused By: java.sql.SQLException: class java.sql.SQLException
oracle.xml.parser.v2.XMLParseException: End tag does not match start tag 'tns3:ContctID'.
at com.sunopsis.jdbc.driver.xml.SnpsXmlFile.readDocument(SnpsXmlFile.java:459)
at com.sunopsis.jdbc.driver.xml.SnpsXmlFile.readDocument(SnpsXmlFile.java:469)
When I try to insert this into a table, I'm getting the following error:
Missing IN or OUT parameter at index:: 1
I tried with substr, replace. Nothing works as in middle of the error message we have a single quotes 'tns3:ContctID'.
Is there any way to insert this into a table?
That's a tough one if you want to use pure Java BeanShell, and you've given too few details for a short, direct answer. For example:
how are you trying to insert this (command on source/target, BeanShell only, Oracle SQL + jBS, Jython, Groovy, etc.)?
The problem here is not only the quotes but also the newlines.
Replacing them is even more difficult, as every parsing step (<%, <?, <#) requires a different trick to define those literals.
What will work for sure is writing a Jython task to insert the log data (Jython as the technology).
There you can use Python's multiline string literals,
simply:
⋮
err_log = """
<?=odiRef.getPrevStepLog("MESSAGE")?>
"""
⋮
I faced this error a few days ago and applied the following solution in ODI:
Use q'#<%=odiRef.getPrevStepLog("MESSAGE")%>#'
This escapes the single quote (') for the INSERT statement.
I have used this in my code and it works fine.
For example:
select 'testing'abcd' from dual;
This query gives the following error:
"ORA-01756: quoted string not properly terminated"
select q'#testing'abcd#' from dual;
This query gives no error, and SQL Developer returns:
testing'abcd
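The q'#...#' form only works as long as the text does not itself contain the closing delimiter followed by a single quote. A small sketch of how a delimiter could be chosen, for instance inside a Jython task; oracle_q_quote and its candidate delimiter list are hypothetical, and the sketch only addresses quoting, not how ODI parses the surrounding command:

```python
def oracle_q_quote(text, delimiters="#!|~^"):
    """Wrap text in an Oracle q'...' literal using the first safe delimiter."""
    for d in delimiters:
        # The literal ends at the first delimiter followed by a single
        # quote, so that two-character sequence must not occur in the text.
        if d + "'" not in text:
            return "q'%s%s%s'" % (d, text, d)
    raise ValueError("no safe delimiter found for this text")

literal = oracle_q_quote("testing'abcd")
print("select %s from dual;" % literal)
```

Where the driver allows it, passing the message as a bind variable avoids the quoting question altogether.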

How can I extract field names from SQL with Perl?

I have a series of select statements in a text file and I need to extract the field names from each select query. This would be easy if some of the fields didn't use nested functions like to_char() etc.
Given select statement fields that could have several nested parenthese like:
ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name,
Or the simple case of just base_field_name as a field, what would the regex look like in Perl?
Don't try to write a regex parser (though perl regexes can handle nested patterns like that), use SQL::Statement::Structure.
Why not ask the target database itself how it would interpret the queries?
In perl, one can use the DBI to query the prepared representation of a SQL query. Sometimes this is database-specific: some drivers (under the perl DBD:: namespace) support their RDBMS' idea of describing statements in ways analogous to the RDBMS' native C or C++ API.
It can be done generically, however, as the DBI will put the names of result columns in the statement handle attribute NAME. The following, for example, has a good chance of working on any DBI-supported RDBMS:
use strict;
use warnings;
use DBI;
use constant DSN => 'dbi:YouHaveNotToldUs:dbname=we_do_not_know';
my $dbh = DBI->connect(DSN, ..., { RaiseError => 1 });
my $sth;
while (<>) {
next unless /^SELECT/i; # SELECTs only, assume whole query on one line
chomp;
my $sql = /\bWHERE\b/i ? "$_ AND 1=0" : "$_ WHERE 1=0"; # XXX ugly!
eval {
$sth = $dbh->prepare($sql); # some drivers don't know column names
$sth->execute(); # until after a successful execute()
};
print $@, next if $@; # oops, problem with that one
print join(', ', @{$sth->{NAME}}), "\n";
}
The XXX ugly! bit there tries to append an always-false condition on the SELECT, so that the SQL engine doesn't have to do any real work when you execute(). It's a terribly naive approach -- that /\bWHERE\b/i test is no more correctly identifying a SQL WHERE clause than simple regexes correctly parse out SELECT field names -- but it is likely to work.
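The "ask the database" idea can be tried out without a server. The sketch below is a Python analogue using the standard-library sqlite3 module, where cursor.description plays the role of the NAME attribute; it wraps the query in a subquery with LIMIT 0 instead of appending 1=0, which sidesteps the WHERE detection. The table t and the helper result_column_names are made up for the example, and to_char is left out because SQLite does not provide it:

```python
import sqlite3

def result_column_names(conn, select_sql):
    # Wrap the query so no rows come back; the engine still has to
    # work out the names of the result columns.
    wrapped = "SELECT * FROM (%s) LIMIT 0" % select_sql.strip().rstrip(";")
    cursor = conn.execute(wrapped)
    return [column[0] for column in cursor.description]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (base_field_name TEXT, other_field INTEGER)")
names = result_column_names(
    conn,
    "SELECT ltrim(rtrim(base_field_name)) AS renamed_field_name, other_field FROM t;",
)
print(names)
```

As with the Perl version, this yields the result column names (aliases included), not the underlying base field names.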
In a somewhat related problem at the office I used:
my @SqlKeyWordList = qw/select from where .../; # (1)
my @Candidates = split(/\s/, $SqlSelectQuery); # (2)
my %FieldHash; # (3)
for my $Word (@Candidates) {
next if grep { lc($Word) eq $_ } @SqlKeyWordList; # skip SQL keywords
$FieldHash{$Word}++;
}
Comments:
(1) SqlKeyWordList contains all the SQL keywords that could appear in the SQL statement (we use MySQL; there are many SQL dialects, and choosing or building this list is work, see my comments below). If someone decided to use a keyword as a field name, you will need a regex after all (better to refactor the code).
(2) Split the SQL statement into a list of words. This is the trickiest part and WILL REQUIRE tweaking. For now it uses Perl's notion of "space" (= not in a word) to split. Splitting the field list (select a,b,c) and the "from" portion of the SQL separately might be advisable here, depending on your SQL statements.
(3) %FieldHash will contain one entry per select field (and gunk, until you have validated your SqlKeyWordList and the regex in (2)).
Beware
there is nothing in this code that could not be done in Python.
your life would be much easier if you can influence the creation of said SQL statements. (e.g. make sure each field is written to a comment)
there are so many things that can/will go wrong in this parsing approach, you really should sidestep the issue entirely, by changing the process (saves time in the long run).
this is the regex we use at the office
my @Candidates=split(/[\s
\(
\)
\+
\,
\*
\/
\-
\n
\
\=
\r
]+/,$SqlSelectQuery
);
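Since the answer notes that none of this is Perl-specific, here is roughly what the keyword-filter approach looks like in Python. It is a sketch with the same weaknesses: SQL_KEYWORDS is a deliberately incomplete, hypothetical list, and function and table names survive as "gunk" in the result:

```python
import re

# Deliberately incomplete keyword list (assumption for the example).
SQL_KEYWORDS = {"select", "from", "where", "and", "or", "as"}

def candidate_field_names(select_sql):
    # Split on whitespace and the punctuation used in the regex above.
    words = re.split(r"[\s()+,*/\-=]+", select_sql)
    counts = {}
    for word in words:
        # Skip empty strings produced by the split and known keywords.
        if not word or word.lower() in SQL_KEYWORDS:
            continue
        counts[word] = counts.get(word, 0) + 1
    return counts

fields = candidate_field_names("select a, b, ltrim(c) from my_table where a = b")
print(sorted(fields))
```

Here ltrim and my_table end up among the candidates, which illustrates why the keyword list and the split pattern need validation before the output can be trusted.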
How about splitting each line into terms (replace every parenthesis, comma and space with a newline), then sorting:
perl -ne's/[(), ]/\n/g; print' < textfile | sort -u
You'll end up with a lot of content like:
fieldname1
fieldname1
formatstring
ltrim
rtrim
to_char

BASH - Single quote inside double quote for SQL Where clause

I need to send a properly formatted date comparison WHERE clause to a program on the command line in bash.
Once it gets inside the called program, the WHERE clause should be valid for Oracle, and should look exactly like this:
highwater>TO_DATE('11-Sep-2009', 'DD-MON-YYYY')
The date value is in a variable. I've tried a variety of combinations of quotes and backslashes. Rather than confuse the issue and give examples of my mistakes, I'm hoping for a pristine accurate answer unsullied by dreck.
If I were to write it in Perl, the assignment would I think look like this:
$hiwaterval = '11-Sep-2009';
$where = "highwater>TO_DATE(\'$hiwaterval\', \'DD-MON-YYYY\')";
How do I achieve the same effect in bash?
hiwaterval='11-Sep-2009'
where="highwater > TO_DATE('$hiwaterval', 'DD-MON-YYYY')"
Optionally add "export " before the final variable assignment if it needs to be visible outside the current shell.
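If the clause is assembled by a script rather than typed by hand, the shell quoting can be generated instead of worked out manually. A minimal Python sketch using shlex.quote from the standard library; the variable names mirror the ones above and no real program is invoked:

```python
import shlex

hiwaterval = "11-Sep-2009"
where = "highwater>TO_DATE('%s', 'DD-MON-YYYY')" % hiwaterval

# shlex.quote returns a form that a POSIX shell passes on as one argument.
quoted = shlex.quote(where)
print(quoted)

# Round trip: splitting the quoted form the way a shell would
# recovers the original clause unchanged.
recovered = shlex.split(quoted)
print(recovered == [where])
```

When the program is launched from Python directly, passing the arguments as a list to subprocess.run avoids the shell, and with it the quoting, entirely.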
Have you tried using double ticks? Like highwater>TO_DATE(''11-Sep-2009'', ''DD-MON-YYYY''). Just a suggestion. I haven't tried it out.
You can assign the where clause like this:
export WHERECLAUSE=`echo "where highwater >TO_DATE('11-Sep-2009', 'DD-MON-YYYY')"`
(with backticks around the echo statement - they're not showing up in my editor here...)
which works with a shell script of the form:
sqlplus /nolog <<EOS
connect $USERNAME/$PASSWD@$DB
select * from test $WHERECLAUSE
;
exit
EOS