row_number() over() continuous over union select - sql

I am creating a table and inserting data into that table with a query1 union query2. The issue is that I want to add row_number() to the table however when I add row_number() over() to either of the queries, the numbering only applies to query1 or query2 but not to the entire table as a whole.
I did a hack to get my result where I insert the data into the table (table_no_serial) using insert query1 union query2, then I create a second table like so
insert into table_w_serial select row_number() over(), * from table_no_serial;
is it possible to get this right the first time around?
insert into table purchase_table
select row_number() over(), w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table1 w
where
w.action = 'stop'
union
select row_number() over(), t.ts, t.tail, t.event, t.action, t.msg, t.tags
from table2 t
where
f.action = 'stop';
I want something like this to work.
I want to write a code where the resulting table (endtable) will be a union of the first query and the second query and will include a constant row number across both queries so that if query1 returns 50 results and query2 returns 40 results. End table will have row number from 1-90

Use a subquery:
insert into table purchase_table ( . . . ) -- include column names here
select row_number() over (), ts, tail, event, action, msg, tags
from ((select w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table1 w
where w.action = 'stop'
) union all
(select w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table2 w
where f.action = 'stop'
)
) w;
Note that this also changes union to union all. union all is more efficient; only use union if you want to incur the overhead of removing duplicates.

Related

Defining a subtable and then query from that table using SQL

I have a table with many columns, and I want to count the unique values of each column. I know that I can do
SELECT sho_01, COUNT(*) from sho GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from sho GROUP BY sho_02
UNION ALL
....
Here sho is the table and sho_01,.... are the individual columns. This is BigQuery by the way, so they use UNION ALL.
Next, I want to do the same thing, but for a subset of sho, say SELECT * FROM sho WHERE id in (1,2,3). Is there a way where I can create a subtable first, and then query the subtable? Something like this
SELECT * FROM (SELECT * FROM sho WHERE id IN (1,2,3)) AS t1;
SELECT sho_01, COUNT(*) from t1 GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from t1 GROUP BY sho_02
UNION ALL
....
Thanks
Presumably, the columns are all of the same type. If so, you can simplify this using arrays:
select el.which, el.val, count(*)
from (select t1.*,
array[struct('sho_01' as which, sho_01 as val),
struct('sho_2', show_02),
. . .
] as ar
from t
) t cross join
unnest(ar) el
group by el.which, el.val;
You can then easily filter however you want by adding a where clause before the group by.
Below is for BigQuery Standard SQL and allows you to avoid manual typing of column names or even knowing them in advance
#standardSQL
SELECT
TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') column,
SPLIT(kv, ':')[OFFSET(1)] value,
COUNT(1) cnt
FROM `project.dataset.table` t,
UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
GROUP BY column, value
-- ORDER BY column, value

Return COUNT rows of two SELECT related by UNION

I want to return the number of rows with columns from two Tables related by UNION.
I wrote this query
SELECT(
(SELECT * FROM(
(SELECT
ID_COMPTE,
TITLE,
LINK,
DATE_CREAT,
DATE_MODIF,
'TF1' AS "TYPE_FICHIER",
case when DATE_MODIF is null then DATE_CREAT else DATE_MODIF end as LAST_UPDATE FROM FIRST_TABLE FFF where ID_COMPTE= 11111111)
UNION
(SELECT
ID_COMPTE,
TITLE,
LINK,
DATE_CREAT,
DATE_MODIF,
'TF2' AS "TYPE_FICHIER",
case when DATE_MODIF is null then DATE_CREAT else DATE_MODIF end as LAST_UPDATE FROM SECOND_TABLE SSS where ID_COMPTE= 11111111)
order by LAST_UPDATE desc
) parentSelect WHERE ROWNUM BETWEEN 0 AND 2)), count(firstSelect) FROM firstSelect;
The purpose is to return the the last two rows with the count of all the rows of Table 1 and Table 2.
The query without the count works fine it's just the count that cause problem, I don't know how to insert it.
I tried also to use count() for each SELECT and SUM in the parent SELECT but it doesn't work.
This concept should work for you. Basically you select the data you want in the with clause. Then in the main select, you select your data, and the count.
WITH
base AS
(
SELECT 'TEST1' DATA FROM DUAL
UNION ALL
SELECT 'TEST2' DATA FROM DUAL
UNION ALL
SELECT 'TEST3' DATA FROM DUAL
)
SELECT (SELECT COUNT(*) FROM base) AS KOUNT, base.*
FROM base
;
You can use #Temp ( TempTable ).
Insert or manupulate the rows in it which you want to have and finally return it from stored procedure.

Select a third column based on two distant rows within the same table

I want to select a third column based on two distant columns within the same table.
I could only think of this:
select tl.thirdcolumn
from table1 t1
WHERE
EXISTS
(
Select distinct tl.firstcolumn , t1.secondcolumn
From t1
)
This:
select distinct tl.thirdcolumn
from table t1
won't work as I don't want the distinct thirdrow. I want the thirdrow to be based on the first two rows being distinct.
I guess its a kind of nested sql statment with a select top 1... idk
CATEGORY NAME Query
---------------------------------------------------
STUDENTS NUMBER_OF_CHAPTERS QueryA
STUDENTS NUMBER_OF_STUDENT_MEMBERS QueryB
STUDENTS NUMBER_OF_STUDENT_MEMBERS QueryB
MEMBERS NUMBER_OF_MEMBERS_WORLDWIDE QueryC
MEMBERS NUMBER_OF_MEMBERS_WORLDWIDE QueryC
Your question is rather hard to follow, but I think you might simply want group by:
select tl.firstcolumn , t1.secondcolumn, max(tl.thirdcolumn)
from table1 t1
group by tl.firstcolumn , t1.secondcolumn;
If you want rows where the pair of values only appears once, then add having count(*) = 1:
select tl.firstcolumn , t1.secondcolumn, max(tl.thirdcolumn)
from table1 t1
group by tl.firstcolumn , t1.secondcolumn
having count(*) = 1;
Query -
SELECT
CATEGORY,NAME,QUERY
FROM
(
WITH TAB AS (
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_CHAPTERS' AS NAME,
'QUERYA' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
) SELECT
CATEGORY,
NAME,
QUERY,
COUNT(*) OVER(PARTITION BY
CATEGORY,
NAME
ORDER BY
CATEGORY,
NAME,
QUERY
) AS RNK
FROM
TAB
)
WHERE
RNK = 1;
Output -
"CATEGORY","NAME","QUERY"
"STUDENTS","NUMBER_OF_CHAPTERS","QueryA"

How to join two tables with the same number of rows in SQLite?

I have almost the same problem as described in this question. I have two tables with the same number of rows, and I would like to join them together one by one.
The tables are ordered, and I would like to keep this order after the join, if it is possible.
There is a rowid based solution for MSSql, but in SQLite rowid can not be used if the table is coming from a WITH statement (or RECURSIVE WITH).
It is guaranteed that the two tables have the exact same number of rows, but this number is not known beforehand. It is also important to note, that the same element may occur more than twice. The results are ordered, but none of the columns are unique.
Example code:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, table_b.s
FROM table_a
LEFT JOIN table_b ON ( table_a.rowid = table_b.rowid )
The result I would like to achieve is:
(2, 'valuex'),
(4, 'valuey'),
(5, 'valuez')
SQLFiddle: http://sqlfiddle.com/#!5/9eecb7/6888
This is quite complicated in SQLite -- because you are allowing duplicates. But you can do it. Here is the idea:
Summarize the table by the values.
For each value, get the count and offset from the beginning of the values.
Then use a join to associate the values and figure out the overlap.
Finally use a recursive CTE to extract the values that you want.
The following code assumes that n and s are ordered -- as you specify in your question. However, it would work (with small modifications) if another column specified the ordering.
You will notice that I have included duplicates in the sample data:
WITH table_a (n) AS (
SELECT 2 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 4 UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuey' UNION ALL
SELECT 'valuez' UNION ALL
SELECT 'valuez'
),
a as (
select a.n, count(*) as a_cnt,
(select count(*) from table_a a2 where a2.n < a.n) as a_offset
from table_a a
group by a.n
),
b as (
select b.s, count(*) as b_cnt,
(select count(*) from table_b b2 where b2.s < b.s) as b_offset
from table_b b
group by b.s
),
ab as (
select a.*, b.*,
max(a.a_offset, b.b_offset) as offset,
min(a.a_offset + a.a_cnt, b.b_offset + b.b_cnt) - max(a.a_offset, b.b_offset) as cnt
from a join
b
on a.a_offset + a.a_cnt - 1 >= b.b_offset and
a.a_offset <= b.b_offset + b.b_cnt - 1
),
cte as (
select n, s, offset, cnt, 1 as ind
from ab
union all
select n, s, offset, cnt, ind + 1
from cte
where ind < cnt
)
select n, s
from cte
order by n, s;
Here is a DB Fiddle showing the results.
I should note that this would be much simpler in almost any other database, using window functions (or perhaps variables in MySQL).
Since the tables are ordered, you can add row_id values by comparing n values.
But still the best way in order to get better performance would be inserting the ID values while creating the tables.
http://sqlfiddle.com/#!5/9eecb7/7014
WITH
table_a_a (n, id) AS
(
WITH table_a (n) AS
(
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
) ,
table_b_b (n, id) AS
(
WITH table_a (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
)
SELECT table_a.n, (select count(1) from table_a b where b.n <= table_a.n) id
FROM table_a
)
select table_a_a.n,table_b_b.n from table_a_a,table_b_b where table_a_a.ID = table_b_b.ID
or convert the input set to comma separated list and try like this:
http://sqlfiddle.com/#!5/9eecb7/7337
WITH RECURSIVE table_b( id,element, remainder ) AS (
SELECT 0,NULL AS element, 'valuex,valuey,valuz,valuz' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_b
WHERE remainder IS NOT NULL
),
table_a( id,element, remainder ) AS (
SELECT 0,NULL AS element, '2,4,5,7' AS remainder
UNION ALL
SELECT id+1,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, 0, INSTR( remainder, ',' ) )
ELSE
remainder
END AS element,
CASE
WHEN INSTR( remainder, ',' )>0 THEN
SUBSTR( remainder, INSTR( remainder, ',' )+1 )
ELSE
NULL
END AS remainder
FROM table_a
WHERE remainder IS NOT NULL
)
SELECT table_b.element, table_a.element FROM table_b, table_a WHERE table_a.element IS NOT NULL and table_a.id = table_b.id;
SQL
SELECT a1.n, b1.s
FROM table_a a1
LEFT JOIN table_b b1
ON (SELECT COUNT(*) FROM table_a a2 WHERE a2.n <= a1.n) =
(SELECT COUNT(*) FROM table_b b2 WHERE b2.s <= b1.s)
Explanation
The query simply counts the number of rows up until the current one for each table (based on the ordering column) and joins on this value.
Demo
See SQL Fiddle demo.
Assumptions
A single column in used for the ordering in each table. (But the query could easily be modified to allow multiple ordering columns).
The ordering values in each table are unique.
The values in the ordering column aren't necessarily the same between the two tables.
It is known that table_a contains either the same or more rows than table_b. (If this isn't the case then a FULL OUTER JOIN would need to be emulated since SQLite doesn't provide one.)
No further changes to the table structure are allowed. (If they are, it would be more efficient to have pre-populated columns for the ordering).
Either way...
Use something like
WITH
v_table_a (n, rowid) AS (
SELECT 2, 1
UNION ALL
SELECT 4, 2
UNION ALL
SELECT 5, 3
),
v_table_b (s, rowid) AS (
SELECT 'valuex', 1
UNION ALL
SELECT 'valuey', 2
UNION ALL
SELECT 'valuez', 3
)
SELECT v_table_a.n, v_table_b.s
FROM v_table_a
LEFT JOIN v_table_b ON ( v_table_a.rowid = v_table_b.rowid );
for "virtual" tables (with WITH or without),
WITH RECURSIVE vr_table_a (n, rowid) AS (
VALUES (2, 1)
UNION ALL
SELECT n + 2, rowid + 1 FROM vr_table_a WHERE rowid < 3
)
, vr_table_b (s, rowid) AS (
VALUES ('I', 1)
UNION ALL
SELECT s || 'I', rowid + 1 FROM vr_table_b WHERE rowid < 3
)
SELECT vr_table_a.n, vr_table_b.s
FROM vr_table_a
LEFT JOIN vr_table_b ON ( vr_table_a.rowid = vr_table_b.rowid );
for "virtual" tables using recursive WITHs (in this example the values are others then yours, but I guess you get the point) and
CREATE TABLE p_table_a (n INT);
INSERT INTO p_table_a VALUES (2), (4), (5);
CREATE TABLE p_table_b (s VARCHAR(6));
INSERT INTO p_table_b VALUES ('valuex'), ('valuey'), ('valuez');
SELECT p_table_a.n, p_table_b.s
FROM p_table_a
LEFT JOIN p_table_b ON ( p_table_a.rowid = p_table_b.rowid );
for physical tables.
I'd be careful with the last one though. A quick test shows, that the numbers of rowid are a) reused -- when some rows are deleted and others are inserted, the inserted rows get the rowids from the old rows (i.e. rowid in SQLite isn't unique past the lifetime of a row, whereas e.g. Oracle's rowid AFAIR is) -- and b) corresponds to the order of insertion. But I don't know and didn't find a clue in the documentation, if that's guaranteed or is subject to change in other/future implementations. Or maybe it's just a mere coincidence in my test environment.
(In general physical order of rows may be subject to change (even within the same database using the same DMBS as a result of some reorganization) and is therefore no good choice to rely on. And it's not guaranteed, a query will return the result ordered by physical position in the table as well (it might use the order of some index instead or have a partial result ordered some other way influencing the output's order). Consider designing your tables using common (sort) keys in corresponding rows for ordering and to join on.)
You can create temp tables to carry CTE data row. then JOIN them by sqlite row_id column.
CREATE TEMP TABLE temp_a(n integer);
CREATE TEMP TABLE temp_b(n VARCHAR(255));
WITH table_a(n) AS (
SELECT 2 n
UNION ALL
SELECT 4
UNION ALL
SELECT 5
UNION ALL
SELECT 5
)
INSERT INTO temp_a (n) SELECT n FROM table_a;
WITH table_b (n) AS
(
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
UNION ALL
SELECT 'valuew'
)
INSERT INTO temp_b (n) SELECT n FROM table_b;
SELECT *
FROM temp_a a
INNER JOIN temp_b b on a.rowid = b.rowid;
sqlfiddle:http://sqlfiddle.com/#!5/9eecb7/7252
It is possible to use the rowid inside a with statement but you need to select it and make it available to the query using it.
Something like this:
with tablea AS (
select id, rowid AS rid from someids),
tableb AS (
select details, rowid AS rid from somedetails)
select tablea.id, tableb.details
from
tablea
left join tableb on tablea.rid = tableb.rid;
It is however as they have already warned you a really bad idea. What if the app breaks after inserting in one table but before the other one? What if you delete an old row? If you want to join two tables you need to specify the field to do so. There are so many things that could go wrong with this design. The most similar thing to this would be an incremental id field that you would save in the table and use in your application. Even simpler, make those into one table.
Read this link for more information about the rowid: https://www.sqlite.org/lang_createtable.html#rowid
sqlfiddle: http://sqlfiddle.com/#!7/29fd8/1
It is possible to use the rowid inside a with statement but you need to select it and make it available to the query using it. Something like this:
with tablea AS (select id, rowid AS rid from someids),
tableb AS (select details, rowid AS rid from somedetails)
select tablea.id, tableb.details
from
tablea
left join tableb on tablea.rid = tableb.rid;
The problem statement indicates:
The tables are ordered
If this means that the ordering is defined by the ordering of the values in the UNION ALL statements, and if SQLite respects that ordering, then the following solution may be of interest because, apart from small tweaks to the last three lines of the sample program, it adds just two lines:
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
That is, table A is table_a augmented with a rowid, and similarly for table B.
Unfortunately, there is a caveat, though it might just be the result of my not having found the relevant specifications. Before delving into that, however, here is the full proposed solution:
WITH
table_a (n) AS (
SELECT 2
UNION ALL
SELECT 4
UNION ALL
SELECT 5
),
table_b (s) AS (
SELECT 'valuex'
UNION ALL
SELECT 'valuey'
UNION ALL
SELECT 'valuez'
),
A(rid,n) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, n FROM table_a),
B(rid,s) AS (SELECT ROW_NUMBER() OVER ( ORDER BY 1 ) rid, s FROM table_b)
SELECT A.n, B.s
FROM A LEFT JOIN B
ON ( A.rid = B.rid );
Caveat
The proposed solution has been tested against a variety of data sets using sqlite version 3.29.0, but whether or not it is, and will continue to be, "guaranteed" to work is unclear to me.
Of course, if SQLite offers no guarantees with respect to the ordering of the UNION ALL statements (that is, if the question is based on an incorrect assumption), then it would be interesting to see a well-founded reformulation.

How to use order by with union all in sql?

I tried the sql query given below:
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
It results in the following error:
The ORDER BY clause is invalid in views, inline functions, derived
tables, subqueries, and common table expressions, unless TOP or FOR
XML is also specified.
I need to use order by in union all. How do I accomplish this?
SELECT *
FROM
(
SELECT * FROM TABLE_A
UNION ALL
SELECT * FROM TABLE_B
) dum
-- ORDER BY .....
but if you want to have all records from Table_A on the top of the result list, the you can add user define value which you can use for ordering,
SELECT *
FROM
(
SELECT *, 1 sortby FROM TABLE_A
UNION ALL
SELECT *, 2 sortby FROM TABLE_B
) dum
ORDER BY sortby
You don't really need to have parenthesis. You can sort directly:
SELECT *, 1 AS RN FROM TABLE_A
UNION ALL
SELECT *, 2 AS RN FROM TABLE_B
ORDER BY RN, COLUMN_1
Not an OP direct response, but I thought I would jimmy in here responding to the the OP's ERROR messsage, which may point you in another direction entirely!
All these answers are referring to an overall ORDER BY once the record set has been retrieved and you sort the lot.
What if you want to ORDER BY each portion of the UNION independantly, and still have them "joined" in the same SELECT?
SELECT pass1.* FROM
(SELECT TOP 1000 tblA.ID, tblA.CustomerName
FROM TABLE_A AS tblA ORDER BY 2) AS pass1
UNION ALL
SELECT pass2.* FROM
(SELECT TOP 1000 tblB.ID, tblB.CustomerName
FROM TABLE_B AS tblB ORDER BY 2) AS pass2
Note the TOP 1000 is an arbitary number. Use a big enough number to capture all of the data you require.
There will be times when you need to do something like this :
Pull top 5 from table 1 based on a sort
and bottom 5 from table 2 based on another sort
and union these together.
solution
select * from (
-- top 5 records
select top 5 col1, col2, col3
from table1
group by col1, col2
order by col3 desc ) z
union all
select * from (
-- bottom 5 records
select top 5 col1, col2, col3
from table2
group by col1, col2
order by col3 ) z
this was the only way i was able to get around the error and worked fine for me.
SELECT * FROM (SELECT *
FROM TABLE_A ORDER BY COLUMN_1)DUMMY_TABLE
UNION ALL
SELECT * FROM TABLE_B
ORDER BY 2;
2 is column number here .. In Oracle SQL you can use the column number by which you want to sort the data
This solved my SELECT statement:
SELECT * FROM
(SELECT id,name FROM TABLE_A
UNION ALL
SELECT id,name FROM TABLE_B ) dum
order by dum.id , dum.name
where id and name columns available in tables and you can use your columns .
Simply use that , no need parenthesis or anything else
SELECT *, id as TABLE_A_ID FROM TABLE_A
UNION ALL
SELECT *, id as TABLE_B_ID FROM TABLE_B
ORDER BY TABLE_A_ID, TABLE_B_ID
ORDER BY after the last UNION should apply to both datasets joined by union.
The solution shown below:
SELECT *,id AS sameColumn1 FROM Locations
UNION ALL
SELECT *,id AS sameColumn2 FROM Cities
ORDER BY sameColumn1,sameColumn2
select CONCAT(Name, '(',substr(occupation, 1, 1), ')') AS f1
from OCCUPATIONS
union
select temp.str AS f1 from
(select count(occupation) AS counts, occupation, concat('There are a total of ' ,count(occupation) ,' ', lower(occupation),'s.') As str from OCCUPATIONS group by occupation order by counts ASC, occupation ASC
) As temp
order by f1