Using IN with sets of tuples in SQL (SQLite3) - sql

I have the following table in a SQLite3 database:
CREATE TABLE overlap_results (
neighbors_of_annotation varchar(20),
other_annotation varchar(20),
set1_size INTEGER,
set2_size INTEGER,
jaccard REAL,
p_value REAL,
bh_corrected_p_value REAL,
PRIMARY KEY (neighbors_of_annotation, other_annotation)
);
I would like to perform the following query:
SELECT * FROM overlap_results WHERE
(neighbors_of_annotation, other_annotation)
IN (('16070', '8150'), ('16070', '44697'));
That is, I have a couple of tuples of annotation IDs, and I'd like to fetch
records for each of those tuples. The sqlite3 prompt gives me the following
error:
SQL error: near ",": syntax error
How do I properly express this as a SQL statement?
EDIT I realize I did not explain well what I am really after. Let me try another crack at this.
If a person gives me an arbitrary list of terms in neighbors_of_annotation that they're interested in, I can write a SQL statement like the following:
SELECT * FROM overlap_results WHERE
neighbors_of_annotation
IN (TERM_1, TERM_2, ..., TERM_N);
But now suppose that person wants to give me pairs of terms of the form (TERM_1,1, TERM_1,2), (TERM_2,1, TERM_2,2), ..., (TERM_N,1, TERM_N,2), where TERM_i,1 is in neighbors_of_annotation and TERM_i,2 is in other_annotation. Does the SQL language provide an equally elegant way to formulate the query for pairs (tuples) of interest?
The simplest solution seems to be to create a new table, just for these pairs,
and then join that table with the table to be queried, and select only the
rows where the first terms and the second terms match. Creating tons of AND /
OR statements looks scary and error prone.

I've never seen SQL like that. If it exists, I would suspect it's a non-standard extension. Try:
SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND other_annotation = '8150'
UNION ALL SELECT * FROM overlap_results
WHERE neighbors_of_annotation = '16070'
AND other_annotation = '44697';
In other words, build the dynamic query from your tuples but as a series of unions instead, or as a series of ANDs within ORs:
SELECT * FROM overlap_results
WHERE (neighbors_of_annotation = '16070' AND other_annotation = '8150')
OR (neighbors_of_annotation = '16070' AND other_annotation = '44697');
So, instead of code (pseudo-code, tested only in my head so debugging is your responsibility) such as:
query = "SELECT * FROM overlap_results"
query += " WHERE (neighbors_of_annotation, other_annotation) IN ("
sep = ""
for element in list:
    query += sep + "('" + element.noa + "','" + element.oa + "')"
    sep = ","
query += ");"
you would instead have something like:
query = "SELECT * FROM overlap_results "
sep = "WHERE "
for element in list:
    query += sep + "(neighbors_of_annotation = '" + element.noa + "'"
    query += " AND other_annotation = '" + element.oa + "')"
    sep = " OR "
query += ";"
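The pseudo-code above can be sketched as runnable Python with the standard sqlite3 module. This is only a sketch under some assumptions: the function name fetch_pairs is made up for illustration, the demo table keeps just three of the original columns, and the values are bound as ? placeholders rather than concatenated into the string.

```python
import sqlite3

def fetch_pairs(conn, pairs):
    """Return rows whose (neighbors_of_annotation, other_annotation) is in pairs."""
    if not pairs:
        return []  # an empty WHERE clause would be a syntax error
    # One "(col1 = ? AND col2 = ?)" group per pair, joined with OR.
    clause = " OR ".join(
        "(neighbors_of_annotation = ? AND other_annotation = ?)" for _ in pairs
    )
    query = "SELECT * FROM overlap_results WHERE " + clause
    # Flatten [(a, b), (c, d)] into [a, b, c, d] to match the placeholders.
    params = [value for pair in pairs for value in pair]
    return conn.execute(query, params).fetchall()

# Demo with a cut-down version of the table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE overlap_results ("
    " neighbors_of_annotation varchar(20),"
    " other_annotation varchar(20),"
    " jaccard REAL,"
    " PRIMARY KEY (neighbors_of_annotation, other_annotation))"
)
conn.executemany(
    "INSERT INTO overlap_results VALUES (?, ?, ?)",
    [("16070", "8150", 0.1), ("16070", "44697", 0.2), ("16070", "9987", 0.3)],
)
rows = fetch_pairs(conn, [("16070", "8150"), ("16070", "44697")])
```

Each pair consumes two bound parameters, and SQLite caps the number of parameters per statement (999 by default in older builds), so a very long list of pairs may need to be split into batches.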

I'm not aware of any SQL dialects that support tuples inside IN clauses. I think you're stuck with:
SELECT * FROM overlap_results WHERE (neighbors_of_annotation = '16070' and other_annotation = '8150') or (neighbors_of_annotation = '16070' and other_annotation = '44697')
Of course, this particular query can be simplified to something like:
SELECT * FROM overlap_results WHERE neighbors_of_annotation = '16070' and (other_annotation = '8150' or other_annotation = '44697')
Generally, each predicate in a SQL WHERE clause filters on a single column.
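The other option raised in the question, loading the pairs into their own table and joining against it, might look roughly like the following in Python with sqlite3. The table name wanted_pairs, its column names noa and oa, and the function name are all invented for this sketch, and the demo table is a cut-down version of the original.

```python
import sqlite3

def fetch_pairs_via_join(conn, pairs):
    """Load pairs into a temporary table, then join it to overlap_results."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS wanted_pairs (noa TEXT, oa TEXT)"
    )
    conn.execute("DELETE FROM wanted_pairs")  # clear pairs from earlier calls
    conn.executemany("INSERT INTO wanted_pairs VALUES (?, ?)", pairs)
    return conn.execute(
        "SELECT r.* FROM overlap_results AS r"
        " JOIN wanted_pairs AS w"
        "   ON r.neighbors_of_annotation = w.noa"
        "  AND r.other_annotation = w.oa"
    ).fetchall()

# Demo with a cut-down version of the table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE overlap_results ("
    " neighbors_of_annotation varchar(20),"
    " other_annotation varchar(20),"
    " jaccard REAL,"
    " PRIMARY KEY (neighbors_of_annotation, other_annotation))"
)
conn.executemany(
    "INSERT INTO overlap_results VALUES (?, ?, ?)",
    [("16070", "8150", 0.1), ("16070", "44697", 0.2), ("16070", "9987", 0.3)],
)
joined = fetch_pairs_via_join(conn, [("16070", "8150"), ("16070", "44697")])
```

The query text stays the same no matter how many pairs there are. One caveat: a pair that appears twice in the input would return its row twice, so the input should be deduplicated first if that matters.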

Related

How can I join two queries in PostgreSQL

I want to join two queries into one query.
The first query retrieves the user's ratings, which include a resourceindex column, sorted by rank:
String loadRates = "SELECT * FROM ratings WHERE userindex="
+ uindex
+ " ORDER BY rank DESC";
The second query should retrieve the resourceinfo row for each of those resourceindex values:
String loadResources = "SELECT * FROM resourceinfo WHERE resourceindex = "
+ rs.getInt("resourceindex");
How can I combine these into a single query?
Do not use old style join but use the keyword join.
Never ever write an SQL string like that with concatenation of parameters but use parameters instead.
"SELECT * FROM public.resourceinfo"
+ " INNER JOIN public.ratings ON ratings.resourceindex = resourceinfo.resourceindex"
+ " WHERE ratings.userindex = $1"
+ " ORDER BY ratings.rank DESC;";
How you apply the parameters depends on the language you are using, which you didn't tag.
EDIT: If you meant that it should also be filtered by a resourceindex parameter, then add that too:
AND resourceinfo.resourceindex = $2
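As an illustration of the join plus bound parameter, here is a small Python sketch. It uses the standard sqlite3 module as a stand-in because it needs no server, so the placeholder is ? rather than $1, and a PostgreSQL driver would use its own placeholder style. The title column, the sample rows, and the function name are assumptions made for the demo.

```python
import sqlite3

def load_rated_resources(conn, userindex):
    """Return the resources rated by one user, highest rank first."""
    query = (
        "SELECT resourceinfo.resourceindex, resourceinfo.title, ratings.rank"
        " FROM resourceinfo"
        " INNER JOIN ratings"
        "   ON ratings.resourceindex = resourceinfo.resourceindex"
        " WHERE ratings.userindex = ?"
        " ORDER BY ratings.rank DESC"
    )
    # The user index is bound as a parameter, never concatenated.
    return conn.execute(query, (userindex,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ratings (userindex INTEGER, resourceindex INTEGER, rank INTEGER)"
)
conn.execute("CREATE TABLE resourceinfo (resourceindex INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO ratings VALUES (?, ?, ?)", [(7, 1, 3), (7, 2, 5), (8, 1, 4)]
)
conn.executemany(
    "INSERT INTO resourceinfo VALUES (?, ?)", [(1, "alpha"), (2, "beta")]
)
rated = load_rated_resources(conn, 7)
```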

execute variable values Google BigQuery

I am trying to execute the value of the variable, but I can't find documentation about it in Google BigQuery.
DECLARE SQL STRING;
SELECT
SQL =
CONCAT(
"CREATE TABLE IF NOT EXISTS first.rdds_",
REPLACE(CAST(T.actime AS STRING), " 00:00:00+00", ""),
" PARTITION BY actime ",
" CLUSTER BY id ",
" OPTIONS( ",
" partition_expiration_days=365 ",
" ) ",
" AS ",
"SELECT * ",
"FROM first.rdds AS rd ",
"WHERE rd.actime = ",
"'", CAST(T.actime AS STRING), "'",
" AND ",
"EXISTS ( ",
"SELECT 1 ",
"FROM first.rdds_load AS rd_load ",
"WHERE rd_load.id= rd.id ",
")"
) AS SQ
FROM (
SELECT DISTINCT actime
FROM first.rdds AS rd
WHERE EXISTS (
SELECT 1
FROM first.rdds_load AS rd_load
WHERE rd_load.id= rd.id
)
) T;
My variable will contain many rows, each holding a script that creates a table, and I need to execute the contents of this variable.
In SQL Server, a variable is executed like this:
EXEC(@variable);
How do I execute a SQL variable in Google BigQuery?
EDIT:
I ran a new test with the beta version:
Using array, all rows in one result (ARRAY_AGG):
DECLARE SQL ARRAY<STRING>;
SET SQL = (
SELECT
ARRAY_AGG(
CONCAT(
"CREATE TABLE IF NOT EXISTS first.rdds_",
REPLACE(CAST(T.actime AS STRING), " 00:00:00+00", ""),
" PARTITION BY actime ",
" CLUSTER BY id ",
" OPTIONS( ",
" partition_expiration_days=365 ",
" ) ",
" AS ",
"SELECT * ",
"FROM first.rdds AS rd ",
"WHERE rd.actime = ",
"'", CAST(T.actime AS STRING), "'",
" AND ",
"EXISTS ( ",
"SELECT 1 ",
"FROM first.rdds_load AS rd_load ",
"WHERE rd_load.id= rd.id ",
")"
)
) AS SQ
FROM (
SELECT DISTINCT actime
FROM first.rdds AS rd
WHERE EXISTS (
SELECT 1
FROM first.rdds_load AS rd_load
WHERE rd_load.id= rd.id
)
) T
);
My result:
One row containing all the instructions. But I can't run it with all the instructions in it.
Update: as of 5/20/2020, BigQuery has released a dynamic SQL feature that achieves this goal.
Dynamic SQL is now available as a beta release in all BigQuery regions. Dynamic SQL lets you generate and execute SQL statements dynamically at runtime. For more information, see EXECUTE IMMEDIATE.
================
BigQuery does not support this (Dynamic SQL) in pure SQL, but you can implement this in any client of your choice
While Mikhail is correct that this historically hasn't been supported in BigQuery, the very new beta release of BigQuery Scripting should let you accomplish similar results:
https://cloud.google.com/bigquery/docs/reference/standard-sql/scripting
In this case, you would need to use SET to assign the value of the variable, and there isn't an EXEC statement at this time, but there is support for conditionals, loops, variables, etc.
To recreate your example, you could store the results of a query against your first.rdds and first.rdds_load tables, then use WHILE to loop over those results. Within that loop, you can run a normal CREATE TABLE if it doesn't already exist. I'm thinking something along these lines based on your example:
DECLARE results ARRAY<STRING>;
DECLARE i INT64 DEFAULT 1;
DECLARE cnt INT64 DEFAULT 0;
SET results = ARRAY(
SELECT
DISTINCT AS VALUE
CAST(actime AS STRING)
FROM
first.rdds AS rd
WHERE
EXISTS (
SELECT
1
FROM
first.rdds_load AS rd_load
WHERE
rd_load.id = rd.id
)
);
SET cnt = ARRAY_LENGTH(results);
WHILE i <= cnt DO
/* Body of CREATE TABLE goes here; you can access the rows from the query above using results[ORDINAL(i)] as you loop through */
SET i = i + 1;
END WHILE;
There's also support for stored procedures, which can be executed via CALL with passed arguments, which may work in your case as well (if you need to abstract the creation logic used by many scripts).
(I would argue that this scripting support is superior to building and executing strings, since you'll still get SQL validation and such for your query.)
As always with beta features, use with caution in production—but for what it's worth, thus far my experience has been incredibly stable.

Select last created table with sqlite3 python

I found a way to do it like this:
import sqlite3
conn = sqlite3.connect(
'avg_prices.db',
detect_types=sqlite3.PARSE_DECLTYPES | sqlite3.PARSE_COLNAMES)
cursorObj = conn.cursor()
cursorObj.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
listed_tables = cursorObj.fetchall()
last_table = listed_tables[-1][0]
rate = "VES/COP"
# Table names cannot be bound as parameters, so quote the name and bind the value.
sql = 'SELECT * FROM "' + last_table + '" WHERE Rates = ?'
cursorObj.execute(sql, (rate,))
result = cursorObj.fetchone()
Here, result holds the values that I wanted to get.
I checked this question: MySQL - query for last created table
It seems there is an easier way, but I didn't manage to get it to work in Python.
SQLite does not store the creation time of tables. That means that you cannot do what you want.
The sqlite_master has no relevant column. You are at your own risk if you assume that the results are returned in table creation order.
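To see this for yourself, you can list the columns of sqlite_master from Python. The second half of the sketch below is one possible workaround and not something SQLite provides: a hand-maintained log table, here called table_log, that records each table name at creation time. The helper name create_logged_table and the table layout are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical log table: AUTOINCREMENT ids grow in insertion order.
conn.execute(
    "CREATE TABLE IF NOT EXISTS table_log ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

def create_logged_table(conn, name):
    """Create a table and record its name in table_log."""
    quoted = '"' + name.replace('"', '""') + '"'  # escape embedded quotes
    conn.execute("CREATE TABLE " + quoted + " (Rates TEXT, Value REAL)")
    conn.execute("INSERT INTO table_log (name) VALUES (?)", (name,))

create_logged_table(conn, "prices_2020_01")
create_logged_table(conn, "prices_2020_02")

# sqlite_master exposes only these columns; none holds a creation time.
cur = conn.execute("SELECT * FROM sqlite_master")
master_columns = [d[0] for d in cur.description]

# The most recently logged table is the one with the highest id.
last_table = conn.execute(
    "SELECT name FROM table_log ORDER BY id DESC LIMIT 1"
).fetchone()[0]
```

This only works for tables created through the helper; tables created elsewhere never reach the log.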

Insert Into Select SQL Server

I am trying to do a kind of insert into select statement. I want to insert one column as standard and the second through a select. However this is not working:
queryString = "INSERT INTO Words (Word, SortedId) VALUES ('" + words[i] + "', (SELECT TOP 1 SortedId FROM SortedWords WHERE SortedWord = '" + sortWord(words[i]) + "'))";
SortedWords is already filled with data, but at the moment I get this error:
{"There was an error parsing the query. [ Token line number = 1,Token line offset = 50,Token in error = SELECT ]"}
Note:
I'm not sure whether I need the TOP 1 bit; I get the error either way. But I obviously only want to insert one row.
Change your query to
queryString = "INSERT INTO Words (Word, SortedId) SELECT '" + words[i] + "', (SELECT TOP 1 SortedId FROM SortedWords WHERE SortedWord = '" + sortWord(words[i]) + "')";
Also, instead of concatenating strings to get your query, use parameters to avoid SQL injection.
Try the following; it is also better practice to use SqlParameters:
INSERT INTO words
(word,
sortedid)
(SELECT TOP 1 @Word,
sortedid
FROM sortedwords
WHERE sortedword = @SortedWord)
Before executing the query, create the parameters (C#):
//Assume you have a SqlCommand object (let's name it command)
command.Parameters.AddWithValue("@Word", words[i]);
command.Parameters.AddWithValue("@SortedWord", sortWord(words[i]));
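The same INSERT ... SELECT pattern with bound parameters can be tried out in Python. This sketch uses sqlite3 as a stand-in for SQL Server, so it uses ? placeholders and LIMIT 1 where SQL Server would use named parameters and TOP 1. The sort_word helper is only a guess at what the question's sortWord() does.

```python
import sqlite3

def sort_word(word):
    # Assumed behaviour of sortWord(): the letters in alphabetical order.
    return "".join(sorted(word))

def insert_word(conn, word):
    """Insert word together with the SortedId of its sorted form."""
    conn.execute(
        "INSERT INTO Words (Word, SortedId)"
        " SELECT ?, SortedId FROM SortedWords"
        " WHERE SortedWord = ? LIMIT 1",
        (word, sort_word(word)),
    )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SortedWords (SortedId INTEGER PRIMARY KEY, SortedWord TEXT)"
)
conn.execute("CREATE TABLE Words (Word TEXT, SortedId INTEGER)")
conn.execute("INSERT INTO SortedWords (SortedId, SortedWord) VALUES (1, 'act')")

for w in ["cat", "act", "dog"]:
    insert_word(conn, w)

inserted = conn.execute(
    "SELECT Word, SortedId FROM Words ORDER BY Word"
).fetchall()
```

Note that with this form a word with no match in SortedWords ("dog" here) inserts nothing, whereas the scalar-subquery form would insert the word with a NULL SortedId.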

Is it possible to generate and execute Python code in a Python script? [Dynamic Python code]

I am working on some reports (counts) and I have to fetch counts for different parameters. Pretty simple but tedious.
A sample query for one parameter :
qCountsEmployee = (
"select count(*) from %s where EmployeeName is not null"
% (tablename)
)
CountsEmployee = execute_query(qCountsEmployee)
Now I have few hundred such parameters!
What I did was : create a list of all the parameters and generate them using a quick Python script and then copy that text and put it in the main script to avoid the tedious lines.
columnList = ['a', 'b', ............'zzzz']
for each in columnList:
    print(
        'q' + each + ' = '
        + '"select count(*) from %s where ' + each
        + ' is not null" % (tablename)'
    )
    print(each + ' = execute_query(' + 'q' + each + ')')
While this approach works, I was wondering if instead of a separate script to generate lines of code and copy paste into main program, can I generate them in the main script directly and let the script treat them as lines of code? That will make the code much more readable is what I think. Hope I made sense! Thank you...
It would be possible, but is not useful here.
Just do something like
columnList = ['a', 'b', ............'zzzz']
results = {}
for column in columnList:
    query = (
        "select count(*) from " + tablename
        + " where " + column + " is not null"
    )
    result = execute_query(query)
    results[column] = result
You can also put all of this together in a generator function:
def do_counting(column_list):
    for column in column_list:
        query = (
            "select count(*) from " + tablename
            + " where " + column + " is not null"
        )
        result = execute_query(query)
        yield column, result
result_dict = dict(do_counting(['...']))
You can do:
cmd = compile( 'a = 5', '<string>', 'exec' )
exec( cmd )
That is the same as just writing:
a = 5
The string passed as the first argument to compile can be built dynamically.
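For example, a source string can be assembled from a list of names and run in its own namespace. This is a minimal sketch; the names and the namespace dictionary are illustrative only.

```python
# Build one assignment per name: "a = 0", "b = 1", "c = 2".
names = ["a", "b", "c"]
source = "\n".join("%s = %d" % (name, i) for i, name in enumerate(names))

# Run the generated code in a separate dict instead of the module globals.
namespace = {}
exec(compile(source, "<string>", "exec"), namespace)

value_of_b = namespace["b"]
```

Passing an explicit dictionary keeps the generated variables out of the script's own globals, which makes them easier to inspect and harder to clobber by accident.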
To build on what glglgl said, you are probably better off with dynamic SQL than dynamic Python (though dynamic Python is definitely possible using things like eval). When working with dynamic SQL, you should be careful about SQL injection. It seems like it would not come up in your particular use case, but it certainly comes up more often than many developers realize.
I happen to have written an article about SQL Injection and Python which is available at Simple-talk.
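Because table and column names cannot be bound as query parameters, one common precaution when building such queries is to validate every identifier before it goes into the SQL string. The sketch below assumes sqlite3, a made-up employees table, and a deliberately strict pattern for acceptable names; none of these come from the original post.

```python
import re
import sqlite3

# Accept only plain identifiers: letters, digits, underscores.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def count_not_null(conn, tablename, columns):
    """Return {column: number of non-NULL values} for the given columns."""
    for name in [tablename, *columns]:
        if not IDENTIFIER.match(name):
            raise ValueError("unsafe identifier: %r" % name)
    results = {}
    for column in columns:
        query = "SELECT count(*) FROM %s WHERE %s IS NOT NULL" % (
            tablename, column)
        results[column] = conn.execute(query).fetchone()[0]
    return results

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (EmployeeName TEXT, Dept TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", "IT"), ("Bob", None), (None, None)],
)
counts = count_not_null(conn, "employees", ["EmployeeName", "Dept"])
```

A stricter alternative is to check each name against a fixed list of known columns instead of a pattern.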