Flattening nested record in postgres - sql

How to flatten the foo column in the outer select (in PostgreSQL)?
WITH RECURSIVE t AS (
SELECT row(d.*) as foo FROM some_multicolumn_table as d
UNION ALL
SELECT foo FROM t WHERE random() < .5
)
SELECT foo FROM t
What I want is to output all the columns (horizontally, i.e. as a row of multiple columns) of some_multicolumn_table in the outer select, not just a single "record" column.
How to do that?

You don't need the ROW constructor there, and so you can expand the record by using (foo).*:
WITH RECURSIVE t AS (
SELECT d as foo FROM some_multicolumn_table as d
UNION ALL
SELECT foo FROM t WHERE random() < .5
)
SELECT (foo).* FROM t;
Although this query could be simple written as:
WITH RECURSIVE t AS (
SELECT d.* FROM some_multicolumn_table as d
UNION ALL
SELECT t.* FROM t WHERE random() < .5
)
SELECT * FROM t;
And I recommend trying to keep it as simple as possible. But I'm assuming it was just an exemplification.

Related

How to copy a row n times in sqlite?

For a load test I have to copy a row in my table in a sqlite database 1000 (5000, 10000) times.
With the statement
INSERT INTO MYTABLE (
created,
modified,
anotherfield,
etc
)
SELECT created,
modified,
anotherfield,
etc FROM MYTABLE WHERE id = 1;
I can copy it one time. But it would be great to be able to put this into a loop to execute this statement n times.
It seems like SQLite does not support for-loops. I found something called WITH RECURSIVE which could be something like the SQLite way to handle loops. But if I execute
WITH RECURSIVE
cnt(x) AS (VALUES(1) UNION ALL SELECT x+1 FROM cnt WHERE x<1000)
<insert_statement_from_above>
the insert statement gets executed only once.
What am I doing wrong? How can I get to insert 1000 (5000, 10000) rows without having to add them all one by one? Thanks!
You must CROSS join the table to the recursive cte to produce 1000 rows:
WITH RECURSIVE cte(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM cte WHERE x < 1000)
INSERT INTO MYTABLE (created, modified, anotherfield)
SELECT m.created, m.modified, m.anotherfield
FROM MYTABLE m CROSS JOIN cte c
WHERE m.id = 1;
See the demo (for 3 rows).
Another way to use the recursive cte:
WITH RECURSIVE cte AS (
SELECT created, modified, anotherfield, 1 x
FROM MYTABLE
WHERE id = 1
UNION ALL
SELECT created, modified, anotherfield, x + 1
FROM cte
WHERE x < 1000
)
INSERT INTO MYTABLE (created, modified, anotherfield)
SELECT created, modified, anotherfield
FROM cte;
See the demo.

Is there a SQL function to expand table?

I vaguely remember there being a function that does this, but I think I may be going crazy.
Say I have a datatable, call it table1. It has three columns: column1, column2, column3. The query
SELECT * FROM table1
returns all rows/columns from table1. Isn't there some type of EXPAND function that allows me to duplicate that result? For example, if I want to duplicate everything from the SELECT * FROM table1 query three times, I can do something like EXPAND(3) ?
In BigQuery, I would recommend a CROSS JOIN:
SELECT t1.*
FROM table1 CROSS JOIN
(SELECT 1 as n UNION ALL SELECT 2 UNION ALL SELECT 3) n;
This can get cumbersome for lots of copies, but you can simplify this by generating the numbers:
SELECT t1.*
FROM table1 CROSS JOIN
UNNEST(GENERATE_ARRAY(1, 3)) n
This creates an array with three elements and unnests it into rows.
In both these cases, you can include n in the SELECT to distinguish the copies.
Below is for BigQuery Standard SQL
I think below is close enough to what "got you crazy" o)
#standardSQL
SELECT copy.*
FROM `project.dataset.tabel1` t, UNNEST(FN.EXPAND(t, 3)) copy
To be able to do so, you can leverage recently announced support for persistent standard SQL UDFs, namely - you need to create FN.EXPAND() function as in below example (note: you need to have FN dataset in your project - or use existing dataset in which case you should use YOUR_DATASET.EXPAND() reference
#standardSQL
CREATE FUNCTION FN.EXPAND(s ANY TYPE, dups INT64) AS (
ARRAY (
SELECT s FROM UNNEST(GENERATE_ARRAY(1, dups))
)
);
Finally, if you don't want to create persistent UDF - you can use temp UDF as in below example
#standardSQL
CREATE TEMP FUNCTION EXPAND(s ANY TYPE, dups INT64) AS ( ARRAY(
SELECT s FROM UNNEST(GENERATE_ARRAY(1, dups))
));
SELECT copy.*
FROM `project.dataset.tabel1` t, UNNEST(EXPAND(t, 3)) copy
if you want a cartesian product (all the combination on a row ) you could use
SELECT a.*, b.*, c.*
FROM table1 a
CROSS JOIN table1 b
CROSS JOIN table1 c
if you want the same rows repeated you can use UNION ALL
SELECT *
FROM table1
UNION ALL
SELECT *
FROM table1
UNION ALL
SELECT *
FROM table1
Use union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
For reuse purposes can embed this code in a procedure like
Create Procedure
expandTable(tablename
varchar2(50))
As
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
Union all
Select * from table1
End
/

How to join two tables with the same number of rows in SQLite?

I have almost the same problem as described in this question. I have two tables with the same number of rows, and I would like to join them together one by one.
The tables are ordered, and I would like to keep this order after the join, if it is possible.
There is a rowid based solution for MSSql, but in SQLite rowid can not be used if the table is coming from a WITH statement (or RECURSIVE WITH).
It is guaranteed that the two tables have the exact same number of rows, but this number is not known beforehand. It is also important to note, that the same element may occur more than twice. The results are ordered, but none of the columns are unique.
Example code:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, table_b.s
FROM table_a
LEFT JOIN table_b ON ( table_a.rowid = table_b.rowid )
The result I would like to achieve is:
(2, 'valuex'),
(4, 'valuey'),
(5, 'valuez')
SQLFiddle: http://sqlfiddle.com/#!5/9eecb7/6888
This is quite complicated in SQLite -- because you are allowing duplicates. But you can do it. Here is the idea:
Summarize the table by the values.
For each value, get the count and offset from the beginning of the values.
Then use a join to associate the values and figure out the overlap.
Finally use a recursive CTE to extract the values that you want.
The following code assumes that n and s are ordered -- as you specify in your question. However, it would work (with small modifications) if another column specified the ordering.
You will notice that I have included duplicates in the sample data:
WITH table_a (n) AS (
SELECT 2 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuez' UNION ALL
SELECT 'valuez'
),
a as (
select a.n, count(*) as a_cnt,
(select count(*) from table_a a2 where a2.n < a.n) as a_offset
from table_a a
group by a.n
),
b as (
select b.s, count(*) as b_cnt,
(select count(*) from table_b b2 where b2.s < b.s) as b_offset
from table_b b
group by b.s
),
ab as (
select a.*, b.*,
max(a.a_offset, b.b_offset) as offset,
min(a.a_offset + a.a_cnt, b.b_offset + b.b_cnt) - max(a.a_offset, b.b_offset) as cnt
from a join
b
on a.a_offset + a.a_cnt - 1 >= b.b_offset and
a.a_offset <= b.b_offset + b.b_cnt - 1
),
cte as (
select n, s, offset, cnt, 1 as ind
from ab
union all
select n, s, offset, cnt, ind + 1
from cte
where ind < cnt
)
select n, s
from cte
order by n, s;
Here is a DB Fiddle showing the results.
I should note that this would be much simpler in almost any other database, using window functions (or perhaps variables in MySQL).
Since the tables are ordered, you can add row_id values by comparing n values.
But still the best way in order to get better performance would be inserting the ID values while creating the tables.
http://sqlfiddle.com/#!5/9eecb7/7014
WITH
table_a_a (n, id) AS
(
WITH table_a (n) AS
(
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
) ,
table_b_b (n, id) AS
(
WITH table_a (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
)
select table_a_a.n,table_b_b.n from table_a_a,table_b_b where table_a_a.ID = table_b_b.ID
or convert the input set to comma separated list and try like this:
http://sqlfiddle.com/#!5/9eecb7/7337
WITH RECURSIVE table_b( id,element, remainder ) AS (
SELECT 0,NULL AS element, 'valuex,valuey,valuz,valuz' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_b
WHERE remainder IS NOT NULL
),
table_a( id,element, remainder ) AS (
SELECT 0,NULL AS element, '2,4,5,7' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_a
WHERE remainder IS NOT NULL
)
SELECT table_b.element, table_a.element FROM table_b, table_a WHERE table_a.element IS NOT NULL and table_a.id = table_b.id;
SQL
SELECT a1.n, b1.s
FROM table_a a1
LEFT JOIN table_b b1
ON (SELECT COUNT(*) FROM table_a a2 WHERE a2.n <= a1.n) =
(SELECT COUNT(*) FROM table_b b2 WHERE b2.s <= b1.s)
Explanation
The query simply counts the number of rows up until the current one for each table (based on the ordering column) and joins on this value.
Demo
See SQL Fiddle demo.
Assumptions
A single column in used for the ordering in each table. (But the query could easily be modified to allow multiple ordering columns).
The ordering values in each table are unique.
The values in the ordering column aren't necessarily the same between the two tables.
It is known that table_a contains either the same or more rows than table_b. (If this isn't the case then a FULL OUTER JOIN would need to be emulated since SQLite doesn't provide one.)
No further changes to the table structure are allowed. (If they are, it would be more efficient to have pre-populated columns for the ordering).
Either way...
Use something like
WITH
v_table_a (n, rowid) AS (
SELECT 2, 1
UNION ALL
SELECT 4, 2
UNION ALL
SELECT 5, 3
),
v_table_b (s, rowid) AS (
SELECT 'valuex', 1
UNION ALL
SELECT 'valuey', 2
UNION ALL
SELECT 'valuez', 3
)
SELECT v_table_a.n, v_table_b.s
FROM v_table_a
LEFT JOIN v_table_b ON ( v_table_a.rowid = v_table_b.rowid );
for "virtual" tables (with WITH or without),
WITH RECURSIVE vr_table_a (n, rowid) AS (
VALUES (2, 1)
UNION ALL
SELECT n + 2, rowid + 1 FROM vr_table_a WHERE rowid < 3
)
, vr_table_b (s, rowid) AS (
VALUES ('I', 1)
UNION ALL
SELECT s || 'I', rowid + 1 FROM vr_table_b WHERE rowid < 3
)
SELECT vr_table_a.n, vr_table_b.s
FROM vr_table_a
LEFT JOIN vr_table_b ON ( vr_table_a.rowid = vr_table_b.rowid );
for "virtual" tables using recursive WITHs (in this example the values are others then yours, but I guess you get the point) and
CREATE TABLE p_table_a (n INT);
INSERT INTO p_table_a VALUES (2), (4), (5);
CREATE TABLE p_table_b (s VARCHAR(6));
INSERT INTO p_table_b VALUES ('valuex'), ('valuey'), ('valuez');
SELECT p_table_a.n, p_table_b.s
FROM p_table_a
LEFT JOIN p_table_b ON ( p_table_a.rowid = p_table_b.rowid );
for physical tables.
I'd be careful with the last one though. A quick test shows, that the numbers of rowid are a) reused -- when some rows are deleted and others are inserted, the inserted rows get the rowids from the old rows (i.e. rowid in SQLite isn't unique past the lifetime of a row, whereas e.g. Oracle's rowid AFAIR is) -- and b) corresponds to the order of insertion. But I don't know and didn't find a clue in the documentation, if that's guaranteed or is subject to change in other/future implementations. Or maybe it's just a mere coincidence in my test environment.
(In general physical order of rows may be subject to change (even within the same database using the same DMBS as a result of some reorganization) and is therefore no good choice to rely on. And it's not guaranteed, a query will return the result ordered by physical position in the table as well (it might use the order of some index instead or have a partial result ordered some other way influencing the output's order). Consider designing your tables using common (sort) keys in corresponding rows for ordering and to join on.)
You can create temp tables to carry CTE data row. then JOIN them by sqlite row_id column.
CREATE TEMP TABLE temp_a(n integer);
CREATE TEMP TABLE temp_b(n VARCHAR(255));
WITH table_a(n) AS (
SELECT 2 n
UNION ALL
SELECT 4
UNION ALL
SELECT 5
UNION ALL
SELECT 5
)
INSERT INTO temp_a (n) SELECT n FROM table_a;
WITH table_b (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
UNION ALL
SELECT 'valuew'
)
INSERT INTO temp_b (n) SELECT n FROM table_b;
SELECT *
FROM temp_a a
INNER JOIN temp_b b on a.rowid = b.rowid;
sqlfiddle:http://sqlfiddle.com/#!5/9eecb7/7252
It is possible to use the rowid inside a with statement but you need to select it and make it available to the query using it.
Something like this:
with tablea AS (
select id, rowid AS rid from someids),
tableb AS (
select details, rowid AS rid from somedetails)
select tablea.id, tableb.details
from
tablea
left join tableb on tablea.rid = tableb.rid;
It is however as they have already warned you a really bad idea. What if the app breaks after inserting in one table but before the other one? What if you delete an old row? If you want to join two tables you need to specify the field to do so. There are so many things that could go wrong with this design. The most similar thing to this would be an incremental id field that you would save in the table and use in your application. Even simpler, make those into one table.
Read this link for more information about the rowid: https://www.sqlite.org/lang_createtable.html#rowid
sqlfiddle: http://sqlfiddle.com/#!7/29fd8/1
It is possible to use the rowid inside a with statement but you need to select it and make it available to the query using it. Something like this:
with tablea AS (select id, rowid AS rid from someids),
tableb AS (select details, rowid AS rid from somedetails)
select tablea.id, tableb.details
from
tablea
left join tableb on tablea.rid = tableb.rid;
The problem statement indicates:
The tables are ordered
If this means that the ordering is defined by the ordering of the values in the UNION ALL statements, and if SQLite respects that ordering, then the following solution may be of interest because, apart from small tweaks to the last three lines of the sample program, it adds just two lines:
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
That is, table A is table_a augmented with a rowid, and similarly for table B.
Unfortunately, there is a caveat, though it might just be the result of my not having found the relevant specifications. Before delving into that, however, here is the full proposed solution:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
),
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
SELECT A.n, B.s
FROM A LEFT JOIN B
ON ( A.rid = B.rid );
Caveat
The proposed solution has been tested against a variety of data sets using sqlite version 3.29.0, but whether or not it is, and will continue to be, "guaranteed" to work is unclear to me.
Of course, if SQLite offers no guarantees with respect to the ordering of the UNION ALL statements (that is, if the question is based on an incorrect assumption), then it would be interesting to see a well-founded reformulation.

Aggregate two columns and rows into one

I have the following table structure
start|end
09:00|11:00
13:00|14:00
I know
SELECT ARRAY_AGG(start), ARRAY_AGG(end)
Will result in
start|end
[09:00,13:00]|[11:00,14:00]
But how can i get the following result?
result
[09:00,11:00,13:00,14:00]
BTW, I'm using Postgres
You could do array concatenation (if order is not important):
SELECT ARRAY_AGG(start) || ARRAY_AGG(end) FROM TABLE1
If order is important you could use Gordon's approach but:
add aggregate order array_agg(d order by d ASC)
use unnest instead of union all, because Gordon's solution (union all) performs two sequence scan. If table is big it could be better for performance to use:
SELECT array_agg(d ORDER BY d ASC) FROM(
SELECT unnest(ARRAY[start] || ARRAY[end]) as d from table1
) sub
which performs only one sequence scan on table (and will be faster).
One method is to unpivot them and then aggregate:
select array_agg(d)
from (select start as d from t
union all
select end as d from t
) t;
A similar method uses a cross join:
select array_agg(case when n.n = 1 then t.start else t.end end)
from t cross join
(select 1 as n union all select 2) n;
I assume the start and end are character type
select ARRAY_AGG(col)
from(select string_agg(strt::text||','||en::text,',') col
from b
)t

SELECT on two other queries in Oracle

So, lets say I want to do something like:
SELECT Query1.a,
Query2.b
FROM (
SELECT q as a
FROM somewhere
),
(
SELECT g as b
FROM elsewhere
)
where Query 1 is
(
SELECT q as a
FROM somewhere
)
and Query2 is
(
SELECT g as b
FROM elsewhere
)
So, i want to select from two other select statements.
Query 1 produces a table
a
value1
Query 2 produces a table
b
value 2
And Query 3 (the outer select statement) produces
a b
value 1 value 2
So, essentially, two result tables are combined as columns and not as rows.
Thank you, if you have any hints.
You basically have your solution. You are only missing the names of your queries, so do like this:
SELECT Query1.a,
Query2.b
FROM (
SELECT q as a
FROM somewhere
) Query1,
(
SELECT g as b
FROM elsewhere
) Query2
It's not clear how you need to connect different rows from tables but it can be something like this:
select query1.a,
query2.b
FROM
(select q as a, ROW_NUMBER() OVER (ORDER BY q) as RN from a) Query1
FULL JOIN
(select q as b, ROW_NUMBER() OVER (ORDER BY q) as RN from b) Query2
ON Query1.RN=Query2.RN
SQLFiddle example
Your syntax is a bit off the SQL charts, but in essence ritgh:
It is possible to do a subquery:
select A.field from (select field from a_table) A;
It is essential that you name your query, if you want to use it in the select or where clauses.
And even possible to combine them like regular tables:
select A.field, B.other_field from (select field from table1) A, (select other_field from table2) B;
It is also possible to do al kind of where, grouping and sorting stuff on it, but not needed in your case.
I assume this is what you're looking for:
SELECT query1.a, query2.b
FROM
(SELECT q as a FROM somewhere) query1,
(SELECT g as b FROM elsewhere) query2
Here is a SQLFiddle to test the query