I have a table with many columns, and I want to count the unique values of each column. I know that I can do
SELECT sho_01, COUNT(*) from sho GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from sho GROUP BY sho_02
UNION ALL
....
Here sho is the table and sho_01,.... are the individual columns. This is BigQuery by the way, so they use UNION ALL.
Next, I want to do the same thing, but for a subset of sho, say SELECT * FROM sho WHERE id in (1,2,3). Is there a way where I can create a subtable first, and then query the subtable? Something like this
SELECT * FROM (SELECT * FROM sho WHERE id IN (1,2,3)) AS t1;
SELECT sho_01, COUNT(*) from t1 GROUP BY sho_01
UNION ALL
SELECT sho_02, COUNT(*) from t1 GROUP BY sho_02
UNION ALL
....
Thanks
Presumably, the columns are all of the same type. If so, you can simplify this using arrays:
select el.which, el.val, count(*)
from (select t1.*,
array[struct('sho_01' as which, sho_01 as val),
struct('sho_02', sho_02),
. . .
] as ar
from sho t1
) t cross join
unnest(ar) el
group by el.which, el.val;
You can then easily filter however you want by adding a where clause before the group by.
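To make the "filter first, then count" idea concrete, here is a minimal sketch run against SQLite from Python rather than BigQuery. It assumes a common table expression is acceptable as the "subtable"; the sample rows and the `cnt` alias are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sho (id INTEGER, sho_01 TEXT, sho_02 TEXT);
    INSERT INTO sho VALUES
        (1, 'a', 'x'), (2, 'a', 'y'), (3, 'b', 'y'), (4, 'c', 'z');
""")

query = """
    WITH t1 AS (                 -- the "subtable": filter once, reuse below
        SELECT * FROM sho WHERE id IN (1, 2, 3)
    )
    SELECT 'sho_01' AS which, sho_01 AS val, COUNT(*) AS cnt FROM t1 GROUP BY sho_01
    UNION ALL
    SELECT 'sho_02' AS which, sho_02 AS val, COUNT(*) AS cnt FROM t1 GROUP BY sho_02
    ORDER BY which, val
"""
subset_counts = conn.execute(query).fetchall()
for row in subset_counts:
    print(row)
```

The row with id 4 never reaches the counts because the filter lives in one place, the CTE, instead of being repeated in every branch.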
Below is for BigQuery Standard SQL and allows you to avoid manual typing of column names or even knowing them in advance
#standardSQL
SELECT
TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') column,
SPLIT(kv, ':')[OFFSET(1)] value,
COUNT(1) cnt
FROM `project.dataset.table` t,
UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv
GROUP BY column, value
-- ORDER BY column, value
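The query above works by turning each row into key/value pairs and counting the pairs, without naming any column. The same idea can be sketched in plain Python; this is only an illustration of the technique, and the function name and sample rows are hypothetical.

```python
from collections import Counter


def count_column_values(rows):
    """Count (column, value) pairs across rows given as dicts."""
    counts = Counter()
    for row in rows:
        # Each row contributes one (column, value) pair per column,
        # which mirrors unnesting the split JSON string in the query.
        for column, value in row.items():
            counts[(column, value)] += 1
    return counts


rows = [
    {"sho_01": "a", "sho_02": "x"},
    {"sho_01": "a", "sho_02": "y"},
    {"sho_01": "b", "sho_02": "y"},
]
column_counts = count_column_values(rows)
for (column, value), cnt in sorted(column_counts.items()):
    print(column, value, cnt)
```

One caveat worth keeping in mind: as far as I can tell, splitting the JSON text on ',' and ':' can misparse values that themselves contain those characters, which the dict-based version here does not have to worry about.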
I am creating a table and inserting data into it with query1 union query2. I want to add row_number() to the table; however, when I add row_number() over() to either of the queries, the numbering only applies to query1 or to query2, not to the table as a whole.
As a workaround, I insert the data into a table (table_no_serial) using insert query1 union query2, then I fill a second table like so
insert into table_w_serial select row_number() over(), * from table_no_serial;
Is it possible to get this right the first time around?
insert into table purchase_table
select row_number() over(), w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table1 w
where
w.action = 'stop'
union
select row_number() over(), t.ts, t.tail, t.event, t.action, t.msg, t.tags
from table2 t
where
t.action = 'stop';
I want something like this to work.
I want to write code where the resulting table (endtable) is the union of the first query and the second query, with one continuous row number across both. So if query1 returns 50 results and query2 returns 40 results, the end table will have row numbers 1-90.
Use a subquery:
insert into table purchase_table ( . . . ) -- include column names here
select row_number() over (), ts, tail, event, action, msg, tags
from ((select w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table1 w
where w.action = 'stop'
) union all
(select w.ts, w.tail, w.event, w.action, w.msg, w.tags
from table2 w
where w.action = 'stop'
)
) w;
Note that this also changes union to union all. union all is more efficient; only use union if you want to incur the overhead of removing duplicates.
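Here is a small runnable sketch of the subquery approach, using SQLite from Python instead of the original engine. It assumes SQLite 3.25 or later for window functions, trims the tables down to two made-up columns, and adds an ORDER BY inside OVER so the numbering is deterministic (with an empty OVER () the order is unspecified).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ts INTEGER, action TEXT);
    CREATE TABLE table2 (ts INTEGER, action TEXT);
    INSERT INTO table1 VALUES (1, 'stop'), (2, 'start'), (3, 'stop');
    INSERT INTO table2 VALUES (4, 'stop'), (5, 'stop');
""")

# Number the rows only after both branches have been combined.
numbered = conn.execute("""
    SELECT row_number() OVER (ORDER BY ts) AS rn, ts, action
    FROM (
        SELECT ts, action FROM table1 WHERE action = 'stop'
        UNION ALL
        SELECT ts, action FROM table2 WHERE action = 'stop'
    ) AS w
""").fetchall()

for row in sorted(numbered):
    print(row)
```

Because row_number() is evaluated over the combined result, the numbers run continuously across rows from both tables.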
I want to select a third column based on two distinct columns within the same table.
I could only think of this:
select t1.thirdcolumn
from table1 t1
WHERE
EXISTS
(
Select distinct t1.firstcolumn , t1.secondcolumn
From t1
)
This:
select distinct t1.thirdcolumn
from table t1
won't work, as I don't want the distinct values of the third column. I want the third column to be selected based on the first two columns being distinct.
I guess it's a kind of nested SQL statement with a select top 1... idk
CATEGORY   NAME                          Query
---------------------------------------------------
STUDENTS   NUMBER_OF_CHAPTERS            QueryA
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
Your question is rather hard to follow, but I think you might simply want group by:
select t1.firstcolumn , t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn , t1.secondcolumn;
If you want rows where the pair of values only appears once, then add having count(*) = 1:
select t1.firstcolumn , t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn , t1.secondcolumn
having count(*) = 1;
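Both variants can be tried out with a quick sketch in SQLite from Python. It assumes the question's sample rows map to firstcolumn, secondcolumn and thirdcolumn in that order, which is a guess on my part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (firstcolumn TEXT, secondcolumn TEXT, thirdcolumn TEXT);
    INSERT INTO table1 VALUES
        ('STUDENTS', 'NUMBER_OF_CHAPTERS',          'QueryA'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QueryB'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QueryB'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QueryC'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QueryC');
""")

base = """
    SELECT t1.firstcolumn, t1.secondcolumn, MAX(t1.thirdcolumn)
    FROM table1 t1
    GROUP BY t1.firstcolumn, t1.secondcolumn
"""
# One row per distinct (firstcolumn, secondcolumn) pair.
one_per_pair = conn.execute(base + " ORDER BY 1, 2").fetchall()
# Only the pairs that occur exactly once in the table.
unique_pairs = conn.execute(base + " HAVING COUNT(*) = 1").fetchall()

print(one_per_pair)
print(unique_pairs)
```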
Query -
SELECT
CATEGORY,NAME,QUERY
FROM
(
WITH TAB AS (
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_CHAPTERS' AS NAME,
'QUERYA' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
) SELECT
CATEGORY,
NAME,
QUERY,
COUNT(*) OVER(PARTITION BY
CATEGORY,
NAME
ORDER BY
CATEGORY,
NAME,
QUERY
) AS RNK
FROM
TAB
)
WHERE
RNK = 1;
Output -
"CATEGORY","NAME","QUERY"
"STUDENTS","NUMBER_OF_CHAPTERS","QUERYA"
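The RNK column in the query above is really a per-group count rather than a rank. A reduced sketch of the same filter, run against SQLite (3.25 or later assumed) from Python instead of Oracle, might look like this. It assumes the intent is to keep (CATEGORY, NAME) pairs that occur exactly once, so the ORDER BY inside the window is dropped; the `cnt` and `counted` names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (category TEXT, name TEXT, query TEXT);
    INSERT INTO tab VALUES
        ('STUDENTS', 'NUMBER_OF_CHAPTERS',          'QUERYA'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QUERYB'),
        ('STUDENTS', 'NUMBER_OF_STUDENT_MEMBERS',   'QUERYB'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QUERYC'),
        ('MEMBERS',  'NUMBER_OF_MEMBERS_WORLDWIDE', 'QUERYC');
""")

unique_rows = conn.execute("""
    SELECT category, name, query
    FROM (
        SELECT category, name, query,
               COUNT(*) OVER (PARTITION BY category, name) AS cnt  -- group size
        FROM tab
    ) AS counted
    WHERE cnt = 1
""").fetchall()

print(unique_rows)
```

Unlike the GROUP BY version, the window form keeps every original column available without wrapping it in an aggregate.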
I'm working on learning more about how the UNION operator works in SQL Server.
I've got a query that is directed at a single table:
SELECT Category, COUNT(*) AS Number
FROM Table1
GROUP BY Category;
This returns the number of entries for each distinct value in the Category column.
I have multiple tables that are organized by this Category column and I'd like to be able to have the results for every table returned by one query.
It seems like UNION will accomplish what I want it to do but the way I've tried implementing the query doesn't work with COUNT(*).
SELECT *
FROM (SELECT Table1.Category
Table1.COUNT(*) AS Number
FROM dbo.Table1
UNION
SELECT Table2.Category
Table2.COUNT(*) AS Number
FROM dbo.Table2) AS a
GROUP BY a.Category
I'm sure there's an obvious reason why this doesn't work but can anyone point out what that is and how I could accomplish what I'm trying to do?
You cannot write one common GROUP BY clause for two different SELECTs. Each SELECT needs its own GROUP BY clause:
SELECT TABLE1.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE1. alias name
FROM dbo.TABLE1
GROUP BY Category
UNION ALL --UNION
SELECT TABLE2.Category, --missing comma here
COUNT(*) as Number -- Remove TABLE2. alias name
FROM dbo.TABLE2
GROUP BY Category
If you really want to remove duplicates in result then change UNION ALL to UNION
COUNT, like any aggregate function, needs a GROUP BY when the select list also contains non-aggregated columns. You have to use GROUP BY for each subquery separately:
SELECT * FROM (
SELECT TABLE1.Category,
COUNT(*) as Number
FROM dbo.TABLE1
GROUP BY TABLE1.Category
UNION ALL
SELECT TABLE2.Category,
COUNT(*) as Number
FROM dbo.TABLE2
GROUP BY TABLE2.Category
) as a
It is better to use UNION ALL than UNION here. UNION eliminates duplicates from the result sets, so if you want to merge both results as they are, UNION ALL is the safer choice.
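The difference is easy to see when two tables happen to produce the same (Category, Number) row. A minimal sketch in SQLite from Python, with made-up data (SQL Server itself is not involved here):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (Category TEXT);
    CREATE TABLE Table2 (Category TEXT);
    INSERT INTO Table1 VALUES ('A'), ('A'), ('B');
    INSERT INTO Table2 VALUES ('A'), ('A'), ('C');
""")

# Each branch carries its own GROUP BY; only the set operator changes.
per_table_sql = """
    SELECT Category, COUNT(*) AS Number FROM Table1 GROUP BY Category
    {op}
    SELECT Category, COUNT(*) AS Number FROM Table2 GROUP BY Category
"""
union_all_rows = sorted(conn.execute(per_table_sql.format(op="UNION ALL")).fetchall())
union_rows = sorted(conn.execute(per_table_sql.format(op="UNION")).fetchall())

print(union_all_rows)  # keeps the ('A', 2) row from each table
print(union_rows)      # the two identical ('A', 2) rows collapse into one
```

With UNION, the count for category A from one of the tables silently disappears, which is usually not what you want when reporting per-table results.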
select *
from
{
SELECT
ID, CLASS, CHANGE_NUMBER AS OBJECT_NUMBER
FROM table_A
UNION
SELECT
ID, CLASS, CUST_NO AS OBJECT_NUMBER
FROM table_B
ORDER BY ID
} x where x.id ='5434';
Help me to run this query.
I am getting error "invalid table name"
I would suggest writing the query like this:
select x.*
from (SELECT ID, CLASS, CHANGE_NUMBER AS OBJECT_NUMBER FROM table_A
UNION ALL
SELECT ID, CLASS, CUST_NO AS OBJECT_NUMBER FROM table_B
) x
where x.id = '5434';
Notes:
The curly braces are probably your syntax problem.
Use UNION ALL instead of UNION, unless you really want to incur the overhead of removing duplicates.
The ORDER BY is not needed. After all, you are only choosing one id.
If you do have an ORDER BY, it is better practice to put it in the outer query than in the subquery.
Use '(' bracket instead of '{'.
select * from
(
SELECT ID,CLASS, CHANGE_NUMBER AS OBJECT_NUMBER FROM table_A
UNION
SELECT ID,CLASS,CUST_NO AS OBJECT_NUMBER FROM table_B
ORDER BY ID
) x where x.id ='5434';
I have a table (say table_1) with some columns of type number. Now I want to create another table (say table_2) with columns (name, sum, avg, max, min) that stores the computed values for each column of table_1.
Right now I'm creating table_2 and then inserting a row into table_2 for each column of table_1, one at a time.
I want to do this in a single statement. A query like: "Create table_2(name, sum, avg, ...) select ....".
Please help me write the statement.
This is an UNPIVOT operation.
SELECT colname, SUM(value), AVG(value), MIN(value), MAX(value)
FROM table1
UNPIVOT ( value FOR colname IN (x,y,..) )
GROUP BY colname
Where "x,y,..." should be the actual column names in your source table.
Edited to add
In pre-11g versions of Oracle, you can roll your own unpivot. Example for two columns:
WITH driver AS (
SELECT level colnum FROM dual CONNECT BY level <= 2
)
SELECT
CASE WHEN colnum=1 THEN 'x' WHEN colnum=2 THEN 'y' END colname,
CASE WHEN colnum=1 THEN sum_x WHEN colnum=2 THEN sum_y END colsum,
CASE WHEN colnum=1 THEN avg_x WHEN colnum=2 THEN avg_y END colavg
FROM driver
CROSS JOIN (
SELECT SUM(x) sum_x, AVG(x) avg_x, SUM(y) sum_y, AVG(y) avg_y
FROM mytable
)
ORDER BY colnum
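The hand-rolled unpivot can be tried outside Oracle too. Below is a rough SQLite translation run from Python; it replaces the CONNECT BY driver with a two-row UNION ALL (SQLite has neither CONNECT BY nor DUAL), and the sample data and the `agg` alias are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (x REAL, y REAL);
    INSERT INTO mytable VALUES (1, 10), (2, 20), (3, 30);
""")

manual_unpivot = conn.execute("""
    WITH driver(colnum) AS (SELECT 1 UNION ALL SELECT 2)  -- one row per source column
    SELECT
        CASE colnum WHEN 1 THEN 'x'   WHEN 2 THEN 'y'   END AS colname,
        CASE colnum WHEN 1 THEN sum_x WHEN 2 THEN sum_y END AS colsum,
        CASE colnum WHEN 1 THEN avg_x WHEN 2 THEN avg_y END AS colavg
    FROM driver
    CROSS JOIN (
        SELECT SUM(x) AS sum_x, AVG(x) AS avg_x, SUM(y) AS sum_y, AVG(y) AS avg_y
        FROM mytable
    ) AS agg
    ORDER BY colnum
""").fetchall()

print(manual_unpivot)
```

The aggregates are computed once in the single-row subquery, and the driver rows only decide which of them each output row picks up.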
That's legitimate: you can do
CREATE TABLE myTable as
SELECT....
;
or even using a common table expression:
CREATE TABLE myTable as
WITH myCte as
(
SELECT....
)
SELECT....
;
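Putting the pieces together, a single CREATE TABLE ... AS statement can build the summary table directly. This is a sketch in SQLite from Python, not Oracle; it unpivots with UNION ALL instead of UNPIVOT, and the column names x and y, the output column names and the data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (x REAL, y REAL);
    INSERT INTO table_1 VALUES (1, 10), (2, 20), (3, 30);

    -- Create and fill the summary table in one statement.
    CREATE TABLE table_2 AS
    WITH unpivoted AS (
        SELECT 'x' AS name, x AS value FROM table_1
        UNION ALL
        SELECT 'y' AS name, y AS value FROM table_1
    )
    SELECT name,
           SUM(value) AS total,
           AVG(value) AS average,
           MAX(value) AS maximum,
           MIN(value) AS minimum
    FROM unpivoted
    GROUP BY name;
""")

stats = conn.execute("SELECT * FROM table_2 ORDER BY name").fetchall()
print(stats)
```

Each source column costs one extra UNION ALL branch, which is the price of not having UNPIVOT available.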