Returning a tuple column type from a Slick plain SQL query

In Slick 3 with Postgres, I'm trying to use a plain SQL query with a tuple column return type. My query is something like this:
sql"""
select (column1, column2) as tup from table group by tup;
""".as[((Int, String))]
But at compile time I get the following error:
could not find implicit value for parameter rconv: slick.jdbc.GetResult[((Int, String), String)]
How can I return a tuple column type with a plain sql query?

GetResult[T] is a wrapper for a function PositionedResult => T. Slick looks for an implicit instance of it and uses PositionedResult methods such as nextInt and nextString to extract the typed fields positionally. The following implicit val should address your need:
implicit val getTableResult = GetResult(r => (r.nextInt, r.nextString))
More details can be found in the Slick documentation on plain SQL queries.
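Putting it together, a minimal sketch (assuming column1 is an INT and column2 a text column; the table and column names are the placeholders from the question). The columns are selected individually so that PositionedResult can read them one by one:
import slick.jdbc.GetResult
import slick.jdbc.PostgresProfile.api._

// read the two columns positionally into a tuple
implicit val getTupleResult: GetResult[(Int, String)] =
  GetResult(r => (r.nextInt(), r.nextString()))

val action = sql"""
  select column1, column2 from table group by column1, column2
""".as[(Int, String)]
// db.run(action) then yields a Future[Vector[(Int, String)]]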

Related

How to fetch all rows where an array contains any of the fields array elements

I have a table with a column video_ids of type bigint[]. I would like to find all the rows whose array contains any of the elements of an array passed in a select statement. So, if I have a row whose video_ids field looks like this:
{9529387, 9548200, 9579636}
I would like to fetch it if I pass an array containing any of these video_ids. I thought I could do that with any, but I am not sure how to write this in SQL. I have tried this:
select id, finished, failed, video_ids, invoiced_video_ids, failed_video_ids
from video_order_execution
where order_ids = any(
'{9548200, 11934626, 9579636, 11936321, 11509698, 11552728, 11592106, 11643565, 11707543, 11810386, 11846268}'
::bigint[]);
I get an error if I do that:
ERROR: operator does not exist: bigint[] = bigint Hint: No operator
matches the given name and argument types. You might need to add
explicit type casts.
How can I write a statement that does what I need?
Use the && operator, which returns true if the two array operands have any elements in common:
select id, finished, failed, video_ids, invoiced_video_ids, failed_video_ids
from video_order_execution
where order_ids &&
'{9548200, 11934626, 9579636, 11936321, 11509698, 11552728, 11592106, 11643565, 11707543, 11810386, 11846268}'::bigint[];

How to resolve this SQL error with schema_of_json

I need to find out the schema of a given JSON file. I see that Spark SQL has a schema_of_json function, and something like this works flawlessly:
> SELECT schema_of_json('[{"col":0}]');
ARRAY<STRUCT<`col`: BIGINT>>
But if I query a column of my table, it gives me the following error:
>SELECT schema_of_json(Transaction) as json_data from table_name;
Error in SQL statement: AnalysisException: cannot resolve 'schemaofjson(`Transaction`)' due to data type mismatch: The input json should be a string literal and not null; however, got `Transaction`.; line 1 pos 7;
Transaction is one of the columns in my table, and after checking it manually I can attest that it is of String type (containing JSON).
I want the SQL statement to give me the schema of the JSON. How can I do that?
After looking further into the documentation, it is clear that "foldable" means the input must be a static (literal) string, so a JSON column from a table won't work.
For a minimal reproducible example, here you go:
SELECT schema_of_json(CAST('{ "a": "b" }' AS STRING))
As soon as the cast is introduced in the above statement, schema_of_json fails. It needs a static JSON string as its input.
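A common workaround (a hedged sketch, assuming a SparkSession named spark and that the Transaction column really holds JSON text) is to pull one sample value out of the column and pass it to schema_of_json as a literal, which satisfies the foldable requirement:
import org.apache.spark.sql.functions.{lit, schema_of_json}

// take one sample JSON document from the column
val sample: String = spark.table("table_name")
  .select("Transaction")
  .head()
  .getString(0)

// the input is now a string literal, so schema_of_json accepts it
val jsonSchema: String = spark.range(1)
  .select(schema_of_json(lit(sample)).as("json_schema"))
  .head()
  .getString(0)

println(jsonSchema)  // e.g. STRUCT<...> describing the sampled document
The derived schema string can then be fed to from_json to parse the whole column; the obvious caveat is that it only reflects the structure of the sampled document.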

Cannot have map type columns in DataFrame which calls set operations

org.apache.spark.sql.AnalysisException: Cannot have map type columns in DataFrame which calls set operations(intersect, except, etc.), but the type of column map_col is map
I have a Hive table with a column of type MAP<Float, Float>. I get the above error when I try to do an insertion on this table in a Spark context. The insertion works fine without the 'distinct'.
create table test_insert2(`test_col` string, `map_col` MAP<INT,INT>)
location 's3://mybucket/test_insert2';
insert into test_insert2
select distinct 'a' as test_col, map(0,0) as map_col
Try converting the DataFrame to an RDD with .rdd, then apply the .distinct function.
Example:
spark.sql("select 'a'test_col,map(0,0)map_col
union all
select 'a'test_col,map(0,0)map_col").rdd.distinct.collect
Result:
Array[org.apache.spark.sql.Row] = Array([a,Map(0 -> 0)])
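If the deduplicated rows are needed back as a DataFrame (for example to run the insert into test_insert2 from the question), a sketch along these lines should work, assuming a SparkSession named spark; createDataFrame rebuilds the DataFrame from the deduplicated RDD of Rows by reusing the original schema:
val df = spark.sql("""select 'a' test_col, map(0,0) map_col
                      union all
                      select 'a' test_col, map(0,0) map_col""")

// deduplicate on the RDD, where map columns are allowed, then rebuild the DataFrame
val deduped = spark.createDataFrame(df.rdd.distinct, df.schema)
deduped.write.insertInto("test_insert2")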

Passing an optional list argument from Django to filter with IN in raw SQL

When using primitive types such as an integer, I can run a query like this without any problems:
with connection.cursor() as cursor:
    cursor.execute(sql='''SELECT count(*) FROM account
                          WHERE %(pk)s ISNULL OR id = %(pk)s''', params={'pk': 1})
This either returns the row with id = 1, or all rows if the pk parameter is None.
However, when trying to use a similar approach to pass a list/tuple of IDs, I always get a SQL syntax error when passing an empty tuple or None. For example, this:
with connection.cursor() as cursor:
    cursor.execute(sql='''SELECT count(*) FROM account
                          WHERE %(ids)s ISNULL OR id IN %(ids)s''', params={'ids': (1,2,3)})
works, but passing () produces a SQL syntax error:
psycopg2.ProgrammingError: syntax error at or near ")"
LINE 1: SELECT count(*) FROM account WHERE () ISNULL OR id IN ()
Or if I pass None I get:
django.db.utils.ProgrammingError: syntax error at or near "NULL"
LINE 1: ...LECT count(*) FROM account WHERE NULL ISNULL OR id IN NULL
I tried wrapping the argument in parentheses in the SQL, i.e. (%(ids)s), but that always breaks one or the other condition. I also tried playing around with pg_typeof and casting the argument, but with no results.
Notes:
The actual SQL is much more complex; the query here is a simplification for illustrative purposes.
As a last resort I could alter the SQL in Python based on the argument, but I really wanted to avoid that.
At first I had the idea of using just one argument, replacing an empty list with a dummy value [-1] and then using it like this:
cursor.execute(sql='''SELECT ... WHERE -1 = any(%(ids)s) OR id = ANY(%(ids)s)''', params={'ids': ids if ids else [-1]})
but this did a full table scan for non-empty lists, which was unfortunate, so it was a no-go.
Then I thought I could do a little preprocessing in Python and send two arguments instead of just the single list: the actual list and a boolean indicating whether it is empty. That is:
cursor.execute(sql='''SELECT ... WHERE %(empty_ids)s = TRUE OR id = ANY(%(ids)s)''', params={'empty_ids': not ids, 'ids': ids})
Not the most elegant solution, but it performs quite well (an index scan for a non-empty list, a full table scan for an empty list, but that returns the whole table anyway, so it's OK).
And finally I came up with the simplest and quite elegant solution:
cursor.execute(sql='''SELECT ... WHERE '{}' = %(ids)s OR id = ANY(%(ids)s)''', params={'ids': ids})
This one also performs an index scan for non-empty lists, so it's quite fast.
From the psycopg2 docs:
Note: You can use a Python list as the argument of the IN operator using the PostgreSQL ANY operator.
ids = [10, 20, 30]
cur.execute("SELECT * FROM data WHERE id = ANY(%s);", (ids,))
Furthermore, ANY also works with empty lists, whereas IN () is a SQL syntax error.

Setting a Clob value in a native query

Oracle DB.
Spring JPA using Hibernate.
I am having difficulty inserting a Clob value into a native SQL query.
The code calling the query is as follows:
@SuppressWarnings("unchecked")
public List<Object[]> findQueryColumnsByNativeQuery(String queryString, Map<String, Object> namedParameters)
{
    List<Object[]> result = null;
    final Query query = em.createNativeQuery(queryString);
    if (namedParameters != null)
    {
        Set<String> keys = namedParameters.keySet();
        for (String key : keys)
        {
            final Object value = namedParameters.get(key);
            query.setParameter(key, value);
        }
    }
    query.setHint(QueryHints.HINT_READONLY, Boolean.TRUE);
    result = query.getResultList();
    return result;
}
The query string is of the format
SELECT COUNT ( DISTINCT ( <column> ) ) FROM <Table> c where (exact ( <column> , (:clobValue), null ) = 1 )
where "(exact ( , (:clobValue), null ) = 1 )" is a function and "clobValue" is a Clob.
I can adjust the query to work as follows:
SELECT COUNT ( DISTINCT ( <column> ) ) FROM <Table> c where (exact ( <column> , to_clob((:stringValue)), null ) = 1 )
where "stringValue" is a String but obviously this only works up to the max sql string size (4000) and I need to pass in much more than that.
I have tried to pass the Clob value as a java.sql.Clob using the method
final Clob clobValue = org.hibernate.engine.jdbc.ClobProxy.generateProxy(stringValue);
This results in a java.io.NotSerializableException: org.hibernate.engine.jdbc.ClobProxy
I have tried to Serialize the Clob using
final Clob clob = org.hibernate.engine.jdbc.ClobProxy.generateProxy(stringValue);
final Clob clobValue = SerializableClobProxy.generateProxy(clob);
But this appears to provide the wrong type of argument to the "exact" function, resulting in:
(org.hibernate.engine.jdbc.spi.SqlExceptionHelper:144) - SQL Error: 29900, SQLState: 99999
(org.hibernate.engine.jdbc.spi.SqlExceptionHelper:146) - ORA-29900: operator binding does not exist
ORA-06553: PLS-306: wrong number or types of arguments in call to 'EXACT'
After reading some posts about using Clobs with entities I have tried passing in a byte[], but this also provides the wrong argument type:
(org.hibernate.engine.jdbc.spi.SqlExceptionHelper:144) - SQL Error: 29900, SQLState: 99999
(org.hibernate.engine.jdbc.spi.SqlExceptionHelper:146) - ORA-29900: operator binding does not exist
ORA-06553: PLS-306: wrong number or types of arguments in call to 'EXACT'
I can also just pass in the value as a String, as long as it does not exceed the maximum string size.
I have seen a post (Using function in where clause with clob parameter) which seems to suggest that the only way is to use "plain old JDBC". This is not an option.
I am up against a hard deadline so any help is very welcome.
I'm afraid your assumptions about CLOBs in Oracle are wrong. In Oracle, a CLOB locator is something like a file handle, and such a handle can be created only by the database. So you cannot simply pass a CLOB as a bind variable. A CLOB must be related to database storage, because it can occupy up to 176 TB and something that large cannot be held in the Java heap.
So the usual approach is to call either the DB function empty_clob() or dbms_lob.createtemporary (in some form). You then get a CLOB from the database, even if you think of it as an "IN" parameter. You can then write as much data as you want into that locator (handle, CLOB) and use the CLOB as a parameter for your query.
If you do not follow this pattern, your code will not work. It does not matter whether you use JPA, Spring Batch, or plain JDBC; this constraint is imposed by the database.
It seems that in such cases the parameter type has to be set explicitly for Hibernate. The following code worked for me:
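// assumption: reader and length come from the source String, e.g.
// Reader reader = new java.io.StringReader(stringValue); long length = stringValue.length();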
Clob clob = entityManager
.unwrap(Session.class)
.getLobHelper()
.createClob(reader, length);
int inserted = entityManager
.unwrap(org.hibernate.Session.class)
.createSQLQuery("INSERT INTO EXAMPLE ( UUID, TYPE, DATA) VALUES (:uuid, :type, :data)")
.setParameter("uuid", java.util.Uuid.randomUUID(), org.hibernate.type.UUIDBinaryType.INSTANCE)
.setParameter("type", java.util.Uuid.randomUUID(), org.hibernate.type.StringType.INSTANCE)
.setParameter("data", clob, org.hibernate.type.ClobType.INSTANCE)
.executeUpdate();
A similar workaround is available for Blob.
THE ANSWER: Thank you both for your answers. I should have updated this when I solved the issue some time ago. In the end I used JDBC and the problem disappeared in a puff of smoke!