Does SQLite support common table expressions? - sql

Does SQLite support common table expressions?
I'd like to run a query like this:
with temp (ID, Path)
as (
select ID, Path from Messages
) select * from temp

As of version 3.8.3, SQLite supports common table expressions.
Change log
Instructions
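A quick way to check this against the SQLite build you actually have is to run a CTE through Python's built-in sqlite3 module. This is only a sketch: the Messages table and its columns come from the question, the sample rows are made up, and the CTE is named msgs rather than temp to stay clear of the TEMP keyword.

```python
import sqlite3

# In-memory database; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Messages (ID INTEGER PRIMARY KEY, Path TEXT)")
conn.executemany(
    "INSERT INTO Messages (ID, Path) VALUES (?, ?)",
    [(1, "/inbox/a"), (2, "/inbox/b")],  # hypothetical sample rows
)

cte_query = """
WITH msgs (ID, Path) AS (
    SELECT ID, Path FROM Messages
)
SELECT * FROM msgs ORDER BY ID
"""

# Fails with a syntax error on SQLite builds older than 3.8.3.
cte_rows = conn.execute(cte_query).fetchall()

print("SQLite library version:", sqlite3.sqlite_version)
print(cte_rows)
```

If the query raises a syntax error near "WITH", the linked SQLite library predates CTE support.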

Another solution is to integrate a "CTE to SQLite" translation layer in your application:
"with w as (y) z" => "create temp view w as y;z"
"with w(x) as (y) z" => "create temp table w(x);insert into w y;z"
As an (ugly, desperate, but working) example:
http://nbviewer.ipython.org/github/stonebig/baresql/blob/master/examples/baresql_with_cte_code_included.ipynb

Older versions of SQLite don't support CTEs (added in 3.8.3) or window functions (added in 3.25.0). You can, however, write your own user-defined functions and register them with the database through the SQLite C API using sqlite3_create_function(). Once registered, they can be called from the SQL in your own application code. For example, you can write an aggregate function that computes the sum of a series of averages based on the individual column values. For each value, a step callback is called that allows you to perform some calculation on the data, and a pointer for holding state data is also available.
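The same step-callback idea is exposed in Python's sqlite3 module through create_aggregate(), which wraps the C API. The sketch below is an illustration rather than the answer's own code: the sum_of_squares function, the table t, and its rows are all hypothetical. The instance attribute plays the role of the state pointer mentioned above.

```python
import sqlite3


class SumOfSquares:
    """Aggregate that keeps its running state on the instance."""

    def __init__(self):
        self.total = 0

    def step(self, value):
        # Called once per row; NULLs are skipped.
        if value is not None:
            self.total += value * value

    def finalize(self):
        # Called once after the last row to produce the result.
        return self.total


agg_conn = sqlite3.connect(":memory:")
# Register the aggregate under a SQL name, taking one argument.
agg_conn.create_aggregate("sum_of_squares", 1, SumOfSquares)

agg_conn.execute("CREATE TABLE t (x INTEGER)")
agg_conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (3,)])

agg_result = agg_conn.execute("SELECT sum_of_squares(x) FROM t").fetchone()[0]
print(agg_result)
```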

Related

Querying data with type not supported by Trino

I use Trino to consume data from a MariaDB table.
I have a specific column in this table with geographical data (Point data, https://mariadb.com/kb/en/geometry-types/). Querying the source, the data appears like this:
SELECT location FROM x.y.z
location
---------------------------
POINT (51.566682 83.32865)
POINT (46.77708 16.32856)
POINT (84.857691 4.295681)
But this kind of data isn't supported by Trino (https://trino.io/docs/current/connector/mariadb.html)
I just want the values (x, y) inside POINT(x,y).
The documentation describes a setting, unsupported-type-handling=CONVERT_TO_VARCHAR, but when I use it the data comes back like this:
location
-------------------------
�Q�GHJk ���*#
�{���GMg'���(#
0�Z¶nK#�B< / #
I tested a lot of conversions on this varchar, but none of them worked. So how can I read this data type using Trino?
The data type is not text, so converting it will not help.
You can always use functions native to MariaDB, as long as they return data types that Trino supports.
Table functions
The connector provides specific table functions to access MariaDB.
query(varchar) -> table
The query function allows you to query the underlying database directly. It requires syntax native to MariaDB, because the full query is pushed down and processed in MariaDB. This can be useful for accessing native features which are not available in Trino or for improving query performance in situations where running a query natively may be faster.
So a query like this will work:
SELECT
X,Y
FROM
TABLE(
mariadb.system.query(
query => 'SELECT
ST_X(location) AS X,
ST_Y(location) AS Y
FROM
mytable'
)
);

How do I generate a table name that contains today's date?

It may seem a little strange, but there are already tables with names for each date.
In my project, I have tables for each date to make statistics easier to handle.
Of course, I don't think this is always the best way, but this is the table structure for my project.
(It's a common technique in Google BigQuery and Amazon Athena. This question is about Google BigQuery)
So to get the data, I want to generate today's date. If I use TODAY, I can get the data of the latest day without rewriting the code even if it is the next day.
I tried the following, but neither attempt worked.
Attempt 1 (does not work):
CONCAT in FROM
SELECT
*
FROM
CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
Error:
Table-valued function not found: CONCAT at [4:3]
Attempt 2 (does not work):
create temporary function:
create temporary function getTableName() as (CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo')));
Error:
CREATE TEMPORARY FUNCTION statements must be followed by an actual query.
Question
How do I generate a table name that contains TODAY's date?
In this case, I would recommend using wildcard tables in BigQuery, which are available in Standard SQL.
With wildcard tables you can use the _TABLE_SUFFIX pseudo column, which lets you filter which of the matching tables are scanned. The syntax would be as follows:
SELECT *
FROM `test-proj-261014.sample.test_*`
where _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', CURRENT_DATE)
I hope it helps.
Your first query should go like this:
select CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
For creating temporary function, use the below code:
create temp function getTableName() as
((select CONCAT('foo_', FORMAT_TIMESTAMP('%Y%m%d', CURRENT_TIMESTAMP(), 'Asia/Tokyo'))
));
select getTableName()
The error "CREATE TEMPORARY FUNCTION statements must be followed by an actual query." occurs because a temporary function only exists for the query that follows its definition: you have to define it and use it in the same request, after which the function is gone. To define persistent UDFs and use them in multiple queries, see the documentation on permanent functions. You can reuse persistent UDFs across multiple queries, whereas temporary UDFs can only be used in a single query.
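If the table name has to appear literally in the FROM clause, another option is to build it in application code before sending the query. A minimal sketch in Python, assuming the foo_ prefix and %Y%m%d suffix from the question. The function name is hypothetical, and a fixed UTC+9 offset stands in for Asia/Tokyo (which has no daylight saving time) so the example does not depend on a time zone database being installed.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Asia/Tokyo is modelled as a fixed UTC+9 offset.
TOKYO = timezone(timedelta(hours=9), "JST")


def table_name_for(moment=None, prefix="foo_"):
    """Return e.g. 'foo_20240102' for the Tokyo calendar date of `moment`."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    return prefix + moment.astimezone(TOKYO).strftime("%Y%m%d")


# 20:00 UTC on 1 January is already 2 January in Tokyo.
example_name = table_name_for(datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc))
print(example_name)
print(f"SELECT * FROM `{table_name_for()}`")
```

The wildcard-table approach keeps everything inside SQL; this variant only makes sense when the query text is generated by a program anyway.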

nameUUIDFromBytes in sql

I've used Java's UUID.nameUUIDFromBytes. I'd like to run the same logic on a column in table x. Are there any options for doing this in a SQL query, ideally one that BigQuery supports?
I'm unable to modify the data in table x. Additionally, table x is quite large and I'd rather not put in the resources to write a pipeline to copy it to another table y (using the Java function in the pipeline) if I could do this in SQL.
Is this what you are looking for?
SELECT GENERATE_UUID() AS uuid;
https://cloud.google.com/bigquery/docs/reference/standard-sql/uuid_functions#generate_uuid
Source thread:
https://stackoverflow.com/a/49438112/11928117
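For comparison, it may help to spell out what the Java method computes, since GENERATE_UUID() returns a random UUID rather than one derived from the input. As far as I understand it, nameUUIDFromBytes takes the MD5 digest of the bytes and overwrites the version bits with 3 and the variant bits with the RFC 4122 pattern, without any namespace prefix. A sketch of that in Python (the function name is made up for the illustration):

```python
import hashlib
import uuid


def name_uuid_from_bytes(name: bytes) -> uuid.UUID:
    """Name-based (version 3) UUID from the raw MD5 of `name`, no namespace."""
    digest = hashlib.md5(name).digest()
    # version=3 makes uuid.UUID set the version and RFC 4122 variant bits.
    return uuid.UUID(bytes=digest, version=3)


sample_uuid = name_uuid_from_bytes(b"hello")
print(sample_uuid)
```

An SQL equivalent would therefore need an MD5 function plus some bit manipulation on two of the resulting bytes, not a random generator.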

Encapsulating complex code in BigQuery

I recently had to generate a BQ table out of other BQ tables. The logic was rather involved and I ended up writing a complex SQL statement.
In Oracle SQL I would have written a PL/SQL procedure with the logic broken down into separate pieces (most often merge statements). In some cases I would encapsulate some code into functions. The resulting procedure would be a sequence of DML statements, easy to read and maintain.
However, nothing similar exists for BQ. The UDFs are only temporary and cannot be stored within, say, a view.
Question: I am looking for ways to make my complex BQ SQL code more modular and readable. Is there any way I could accomplish this?
A currently available option is to use the WITH clause.
The WITH clause contains one or more named subqueries whose output acts as a temporary table which subsequent SELECT statements can reference in any clause or subquery.
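To illustrate how named subqueries break a long statement into readable steps, here is a small sketch. It runs on SQLite through Python's sqlite3 module only because that is easy to execute locally; the WITH syntax shown is standard and should carry over to BigQuery. The orders table, its rows, and the step names are all hypothetical.

```python
import sqlite3

with_conn = sqlite3.connect(":memory:")
with_conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
with_conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("a", 10.0), ("a", 30.0), ("b", 5.0)],  # made-up sample rows
)

modular_query = """
WITH totals AS (              -- step 1: aggregate per customer
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
),
big_spenders AS (             -- step 2: builds on step 1 by name
    SELECT customer, total
    FROM totals
    WHERE total > 20
)
SELECT customer, total
FROM big_spenders
ORDER BY customer
"""

big_spender_rows = with_conn.execute(modular_query).fetchall()
print(big_spender_rows)
```

Each named step can be read and tested on its own by temporarily selecting from it directly.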
I would still consider User-Defined Functions as a really good option.
JS and SQL UDFs are available in BigQuery and, from what is known, the BigQuery team is working on introducing permanent UDFs soon.
In the meantime, you can store the body of a JS UDF as a JS library and reference it in your UDF using the OPTIONS section. See "Including external libraries" in the above reference.
October 2019 Update
The ability to use scripting and stored procedures is now in Beta.
So you can send multiple statements to BigQuery in one request, use variables, and use control flow statements such as IF and WHILE.
You can also use a procedure, which is a block of statements that can be called from other queries.
Note: it is still in Beta.
BigQuery supports persistent user-defined functions. To get started, see the documentation.
For example, here's a CREATE FUNCTION statement that creates a function to compute the median of an array:
CREATE FUNCTION dataset.median(arr ANY TYPE) AS (
(
SELECT
IF(
MOD(ARRAY_LENGTH(arr), 2) = 0,
(arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2) - 1)] + arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]) / 2,
arr[OFFSET(DIV(ARRAY_LENGTH(arr), 2))]
)
FROM (SELECT ARRAY_AGG(x ORDER BY x) AS arr FROM UNNEST(arr) AS x)
)
);
After executing this statement, you can reference it in a follow-up query:
SELECT dataset.median([7, 1, 2, 10]) AS median;
You can also reference the function inside logical views. Note that you currently need to qualify the reference to the function inside the view using a project, however:
CREATE VIEW dataset.sampleview AS
SELECT x, `project-name`.dataset.median(array_column) AS median
FROM `project-name`.dataset.table

Is it possible to access a BigQuery partition in Standard SQL using the '$' decorator?

In Google BigQuery, I'm trying to use the $ decorator when querying a partitioned table using Standard SQL. I assume this is supposed to allow me to access partitions and table metadata as it did in Legacy SQL, but it doesn't appear to work in Standard SQL.
Both of the following queries return Error: Table "dataset.partitioned_table$___" cannot include decorator:
1) Accessing a partition directly:
#StandardSQL
SELECT a, b, c
FROM `mydataset.partitioned_table$20161115`
2) Accessing table metadata:
#StandardSQL
SELECT partition_id
FROM `mydataset.partitioned_table$__PARTITIONS_SUMMARY__`;
The obvious workaround for the first query is to use the _PARTITIONTIME pseudocolumn:
#StandardSQL
SELECT a, b, c
FROM mydataset.partitioned_table
WHERE _PARTITIONTIME = '2016-11-15'
However, I haven't been able to find a workaround for the second query, which is useful for retrieving the most recent partition (though using that info to actually query the latest partition seems broken as well. See: How to choose the latest partition in BigQuery table?)
Obtaining the partitions summary using a decorator is currently not supported in StandardSQL. We are planning some work in this area but we don't have an ETA currently on when that might be available. The fastest option right now is to run the query over T$__PARTITIONS_SUMMARY__ using legacy SQL.