Cannot do a where clause with the jOOQ DSL in Kotlin/Java

I am trying to run a query of the following form with jooq in Kotlin:
val create = DSL.using(SQLDialect.POSTGRES)
val query: Query = create.select().from(DSL.table(tableName))
.where(DSL.field("timestamp").between("1970-01-01T00:00:00Z").and("2021-11-05T00:00:00Z"))
.orderBy(DSL.field("id").desc())
The code above gives me:
syntax error at or near "and"
Also, looking at this query in the debugger, the query.sql renders as:
select * from data_table where timestamp between ? and ? order by id desc
I am not sure whether the ? indicates that it could not render the values to SQL, or whether they are some sort of placeholders.
Also, the code works without the where chain.
Additionally, on the Postgres command line I can run the following and the query executes:
select * from data_table where timestamp between '1970-01-01T00:00:00Z' and '2021-11-05T00:00:00Z' order by id
Querying the datatypes on the schema, the timestamp column type is rendered as timestamp without time zone.
Before I had declared variables as:
val lowFilter = "1970-01-01T00:00:00Z"
val highFilter = "2021-11-05T00:00:00Z"
and this did not work, and it seems passing raw strings does not work either. I am very new to this, so I am pretty sure I am messing up the usage here.
EDIT
Following #nulldroid's suggestion, I did something like:
.where(DSL.field("starttime").between(DSL.timestamp("1970-01-01T00:00:00Z")).and(DSL.timestamp("2021-11-05T00:00:00Z")))
and this resulted in:
Type class org.jooq.impl.Val is not supported in dialect POSTGRES

Not using the code generator:
I'm going to assume you have a good reason not to use the code generator for this particular query, the main reason usually being that your schema is dynamic.
So, the correct way to write your query is this:
create.select()
.from(DSL.table(tableName))
// Attach a DataType to your timestamp field, to let jOOQ know about this
.where(DSL.field("timestamp", SQLDataType.OFFSETDATETIME)
// Use bind values of a temporal type
.between(OffsetDateTime.parse("1970-01-01T00:00:00Z"))
.and(OffsetDateTime.parse("2021-11-05T00:00:00Z")))
.orderBy(DSL.field("id").desc())
Notice how I'm using actual temporal data types, not strings to compare dates and declare fields.
I'm assuming, from your question's UTC timestamps, that you're using TIMESTAMPTZ. Otherwise, if you're using TIMESTAMP, just replace OffsetDateTime with LocalDateTime (and SQLDataType.OFFSETDATETIME with SQLDataType.LOCALDATETIME).
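As a side note on those two java.time types (a plain-JDK sketch, independent of jOOQ): OffsetDateTime.parse accepts the trailing Z offset of your input strings, while LocalDateTime.parse would reject it, so for a TIMESTAMP WITHOUT TIME ZONE column you would parse first and then drop the offset:

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;

public class BindValues {
    public static void main(String[] args) {
        // For TIMESTAMPTZ columns: OffsetDateTime understands the trailing "Z".
        OffsetDateTime tz = OffsetDateTime.parse("1970-01-01T00:00:00Z");

        // For TIMESTAMP (without time zone) columns: LocalDateTime.parse
        // cannot handle the "Z", so strip the offset explicitly instead:
        LocalDateTime plain = tz.toLocalDateTime();

        System.out.println(tz);    // 1970-01-01T00:00Z
        System.out.println(plain); // 1970-01-01T00:00
    }
}
```

Either value can then be passed as the bind value in the between(...).and(...) call above.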
Using the code generator
If using the code generator, which is always recommended if your schema isn't dynamic, you'd write almost the same thing as above, but type safe:
create.select()
.from(MY_TABLE)
// Attach a DataType to your timestamp field, to let jOOQ know about this
.where(MY_TABLE.TIMESTAMP
// Use bind values of a temporal type
.between(OffsetDateTime.parse("1970-01-01T00:00:00Z"))
.and(OffsetDateTime.parse("2021-11-05T00:00:00Z")))
.orderBy(MY_TABLE.ID.desc())

Related

Why does this SQL work? DATE keyword + date string (YYYY-MM-DD)

While examining a former colleague's code, I came across the following: DATE'2019-01-01'
Why does this work? It is used in a BigQuery Standard SQL context.
Wouldn't it need to be DATE('2019-01-01') ? As per the documentation https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions#date
I'm not sure why both syntaxes are allowed, but they produce the same data type. So it won't matter which one you use, since you won't encounter a data type mismatch. I tested the data type for both syntaxes using bqutil.fn.typeof. See the test below:
Query:
SELECT DATE('2019-01-01') as date_1,
DATE'2019-01-01' as date_2,
bqutil.fn.typeof(DATE('2019-01-01')) as typeof_date1,
bqutil.fn.typeof(DATE'2019-01-01') as typeof_date2
Both syntaxes produce the DATE data type.

Raw SQL Query that causes error when translated to ActiveRecord Query (Firebird Database)

I am trying to do a fairly simply query that involves getting all the records between two dates via a specific column.
The raw SQL works fine in ISQL Firebird:
SELECT * FROM OPS_HEADER
WHERE PB_BOL_DT
BETWEEN '2020-09-01' AND '2020-09-10';
Here is my ActiveRecord Conversion:
OpsHeader.where('pb_bol_dt BETWEEN 2020-09-01 AND 2020-09-10')
This above line gives me this error:
expression evaluation not supported. Only one operand can be of type TIMESTAMP
I may be converting it wrong but it sure seems like this is the exact way to do it... I have no idea why it's giving me so much trouble.
You're missing quotes on the date literals; you'd want to start with:
OpsHeader.where(%q(pb_bol_dt BETWEEN '2020-09-01' AND '2020-09-10'))
But you can get ActiveRecord to build a BETWEEN by passing it a range:
OpsHeader.where(pb_bol_dt: '2020-09-01' .. '2020-09-10')
and letting AR deal with the quoting. You could also pass the end points separately using positional or named placeholders:
OpsHeader.where('pb_bol_dt between ? and ?', '2020-09-01', '2020-09-10')
OpsHeader.where('pb_bol_dt between :lower and :upper', lower: '2020-09-01', upper: '2020-09-10')
All of these will end up sending the same SQL to the database; the only difference is the small amount of string processing and type handling needed to build the latter three queries.
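As for why the unquoted version fails (an illustration; the exact behaviour is up to Firebird's parser, but most SQL engines treat it this way): without quotes, 2020-09-01 is read as integer arithmetic rather than a date literal, so the server ends up comparing a TIMESTAMP column with a plain number, which is exactly the "Only one operand can be of type TIMESTAMP" complaint. The same arithmetic, spelled out:

```java
public class UnquotedDate {
    public static void main(String[] args) {
        // What the parser sees in "BETWEEN 2020-09-01 AND 2020-09-10":
        // subtraction of integers, not two dates.
        int lower = 2020 - 9 - 1;   // 2010
        int upper = 2020 - 9 - 10;  // 2001
        System.out.println(lower + " " + upper); // 2010 2001
    }
}
```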

Airflow - Bigquery operator not working as intended

I'm new with Airflow and I'm currently stuck on an issue with the Bigquery operator.
I'm trying to execute a simple query on a table from a given dataset and copy the result to a new table in the same dataset. I'm using the BigQuery operator to do so, since according to the docs the 'destination_dataset_table' parameter is supposed to do exactly what I'm looking for (source: https://airflow.apache.org/docs/stable/_api/airflow/contrib/operators/bigquery_operator/index.html#airflow.contrib.operators.bigquery_operator.BigQueryOperator).
But instead of copying the data, all I get is a new empty table with the schema of the one I'm querying from.
Here's my code
default_args = {
    'owner': 'me',
    'depends_on_past': False,
    'start_date': datetime(2019, 1, 1),
    'end_date': datetime(2019, 1, 3),
    'retries': 10,
    'retry_delay': timedelta(minutes=1),
}

dag = DAG(
    dag_id='my_dag',
    default_args=default_args,
    schedule_interval=timedelta(days=1)
)

copyData = BigQueryOperator(
    task_id='copyData',
    dag=dag,
    sql="SELECT some_columns,x,y,z FROM dataset_d.table_t WHERE some_columns=some_value",
    destination_dataset_table='dataset_d.table_u',
    bigquery_conn_id='something',
)
I don't get any warnings or errors, the code is running and the tasks are marked as success. It does create the table I wanted, with the columns I specified, but totally empty.
Any idea what I'm doing wrong?
EDIT: I tried the same code on a much smaller table (from 10Gb to a few Kb), performing a query with a much smaller result (from 500Mb to a few Kb), and it did work this time. Do you think the size of the table/the query result matters? Is it limited? Or does performing a too large query cause a lag of some sort?
EDIT2: After a few more tests I can confirm that this issue is not related to the size of the query or the table. It seems to have something to do with the Date format. In my code the WHERE condition is actually checking if a date_column = 'YYYY-MM-DD'. When I replace this condition with an int or string comparison it works perfectly. Do you guys know if Bigquery uses a particular date format or requires a particular syntax?
EDIT3: Finally getting somewhere: When I cast my date_column as a date (CAST(date_column AS DATE)) to force its type to DATE, I get an error that says that my field is actually an int-32 (Argument type mismatch). But I'm SURE that this field is a date, so that implies that either Bigquery stores it as an int while displaying it as a date, or that the Bigquery operator does some kind of hidden type conversion while loading tables. Any idea on how to fix this?
I had a similar issue when transferring data from other data sources than big-query.
I suggest casting the date_column as follows: to_char(date_column, 'YYYY-MM-DD') as date
In general, I have seen that BigQuery's schema auto-detection is often problematic. The safest approach is to always specify the schema before executing the corresponding query, or to use operators that support schema definition.

Realm-js Querying on timeStamp does not work

This is with reference to https://realm.io/docs/javascript/latest/api/tutorial-query-language.html
I am not looking for variable substitution syntax as mentioned in the documentation.
So i have a date field by the name createDate, and i am trying to query on the same.
The filter query looks like createDate = ${someDate}, where someDate is in the format 'YYYY-MM-DD#HH:mm:ss'. I also tried == instead of = in the query, but it simply does not work.
It's hard to say without seeing how you're actually writing the query, but it could be how you're constructing it. Maybe try
filtered('createDate = $0', someDate)
Using a placeholder variable ($0) solves query problems in Realm for me.

Problems using functions on geography types with PostgreSQL - PostGIS (extension)

Here is the problem,
I recently installed postgresql with postGIS extension (with functions)
I built a small database model with some geography data.
I'm actually trying to insert some geography data into it, and use functions on it... like points / polygons / ...
Problem is that when I try to use PostGIS functions like ST_Area, ST_Perimeter, ... PostgreSQL always returns errors. Most of them are 42P01, meaning "undefined table" if I'm right. But the tables exist...
Here is a screenshot of my actual test db :
http://s30.postimg.org/prnyyw7gh/image.png
You can see on the screenshot that the postGIS extension is active on my current model (the 1050 functions of this extension are available in this model)
I inserted data this way :
INSERT INTO "Base".points (point,lat,lng) VALUES ("Base".ST_GeographyFromText('POINT(45.5555 32.2222)'),'45.5555','32.2222');
and
INSERT INTO "Base".polygons (polygon) VALUES ("Base".ST_GeographyFromText('POLYGON((x y,x1 y1,x2 y2,x y))'));
For table Points, I have a serial field (id), a geography field (point) and 2 text fields (lat, lng).
For table Polygons, I have a serial field (id) and a geography field (polygon).
Here are the two queries I'm trying to make:
SELECT "Base".ST_Area(polygon) FROM "Base".polygons WHERE id=1
or
SELECT "Base".ST_Perimeter(polygon) FROM "Base".polygons WHERE id=1
These 2 tests do not work. They return error 42P01.
When trying to test another function on my table "points", this also fails but returns a strange message. What I'm trying is this:
SELECT "Base".ST_Distance((SELECT point FROM "Base".points WHERE id=1),(SELECT point FROM "Base".points WHERE id=2))
This function exists, but returns error message SQL state: 42883 with message :
ERROR : function _st_distance("Base".geography, "Base".geography, numeric, boolean) does not exist
I don't send any numeric or boolean... I can't explain where these errors are coming from...
I have to say that I'm new to postgresql... Problem may come from this...
Thanks for reading/Help
[Tip - post your create table statement, so we can see what the column types actually are].
I think your problems are because you are using the geography data type rather than geometry; not all functions work (either at all, or with the same arguments) for the geography type. Here's an explanation of why; in short, according to the article:
If you do a lot of measurements and e.g. have to compare sizes of
large polygons, it would make sense to use geography rather than
geometry.
To find out what arguments are expected by which functions, check the postgis documentation. It will show you e.g.
float ST_Area(geometry g1);
float ST_Area(geography geog, boolean use_spheroid=true);
So you can see there are two versions of st_area. One accepts a geometry as an argument, the other accepts a geography, but you also have to add another argument.
If "polygon" is of type geography, you need
SELECT "Base".ST_Area(polygon, TRUE) FROM "Base".polygons WHERE id=1
-- true will measure around the geography spheroid and is more accurate
-- than false - read the documentation for what you need!
st_perimeter is similar.
Regarding your error on st_distance, did you see that underscore before "st_distance" in the error message? I suspect that you are running into problems because you have created the postgis extension in a schema. In PGAdmin, have a look for the function "ST_distance". You'd see that it calls in turn the function "_st_distance" - it can't find that because in your case the function is in a different schema. Try just doing this:
CREATE EXTENSION postgis
I think that will save you a world of pain.
Lastly, I think you have the lat and lon arguments in your st_geographyfromtext the wrong way round (but I may be wrong).