Is There a Way to Automate the Conversion of SQL Rows to Column Using Case? - sql

I was playing with usa_names dataset on Bigquery and in order to be able to visualize the top 10 names between 1910 and 2020, I had to GROUP BY year and create a new column for each of the 10 names using CASE.
The thing is, I will like to visualize the top 100 and I want to know if there is a way to automate the CASE, in the sense that I don't have to write a "WHEN and THEN Clause for each name in order to create a column for them.
I had to use the following SQL query code to first get the top 10 names;
SELECT
name,
SUM(number) AS total
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
total DESC
LIMIT
10
And then use the following code to convert each name row to columns;
SELECT
year,
SUM(CASE WHEN name = 'James' THEN number ELSE 0 END) AS James,
SUM(CASE WHEN name = 'John' THEN number ELSE 0 END) AS John,
SUM(CASE WHEN name = 'Robert' THEN number ELSE 0 END) AS Robert,
SUM(CASE WHEN name = 'Michael' THEN number ELSE 0 END) AS Michael,
SUM(CASE WHEN name = 'William' THEN number ELSE 0 END) AS William,
SUM(CASE WHEN name = 'Mary' THEN number ELSE 0 END) AS Mary,
SUM(CASE WHEN name = 'Richard' THEN number ELSE 0 END) AS Richard,
SUM(CASE WHEN name = 'Joseph' THEN number ELSE 0 END) AS Joseph,
SUM(CASE WHEN name = 'Charles' THEN number ELSE 0 END) AS Charles,
SUM(CASE WHEN name = 'Thomas' THEN number ELSE 0 END) AS Thomas
FROM
bigquery-public-data.usa_names.usa_1910_current
GROUP BY
year
ORDER BY
year
I want to achieve the same result without having to first pull out the name and manually enter them into the CASE statements.
Also, this won't be needed if there is a way to visualize the data directly without having to convert the names from row to columns.
Thanks.

You need to combine 2 capabilities:
row to column: PIVOT clause
scripting to automate the query finding the top 10 names
declare top_names default ((
select concat("'", string_agg(name, "','"), "'")
from (
// your query in question
SELECT
name
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
SUM(number) DESC
LIMIT
10
)));
select top_names;
The output is:
'James','John','Robert','Michael','William','Mary','David','Richard','Joseph','Charles'
The PIVOT query you will need is:
SELECT * FROM
(select year, name, sum(number) number
from bigquery-public-data.usa_names.usa_1910_current
group by year, name
)
PIVOT(SUM(number) FOR name IN ('James','John','Robert','Michael','William','Mary','David','Richard','Joseph','Charles'
))
which output exactly as your second query.
To stick the 2 together, you will need something like:
execute immediate concat(
"""
SELECT * FROM
(select year, name, sum(number) number
from bigquery-public-data.usa_names.usa_1910_current
group by year, name
)
PIVOT(SUM(number) FOR name IN (
""",
top_names,
"))");

You shouldn't need to create a column for each name. Your first query is sufficient (would obviously just need to change the limit to 100). Based on the questions tags I'm assuming your using Tableau, so it would be as simple as choosing your desired visualisation (say a bar chart) and placing names on one axis and total on the other axis.
Based on your follow up comment it would look like this
SELECT
name,
year,
SUM(number) AS total
From bigquery-public-data.usa_names.usa_1910_current
WHERE name IN
(
SELECT name
FROM
(
SELECT
name,
SUM(number) AS total
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
total DESC
LIMIT
100
))
GROUP BY name, year
You could also look into using calculate fields within Tableau ok the raw data to achieve the desired visualisation.

Related

Count the occurrences of a given list of values in a column using a single SQL query

I would like to get the count of occurrences of a given list of values in a column using a single SQL query. The operations must be optimised for performance.
Please refer the example given below,
Sample Table name - history
code_list
5lysgj
627czl
1lqnd8
627czl
dtrtvp
627czl
esdop9
esdop9
3by104
1lqnd8
Expected Output
Need to get the count of occurrences for these given list of codes 627czl, 1lqnd8, esdop9, aol4m6 in the format given below.
code
count
627czl
3
esdop9
2
1lqnd8
2
aol4m6
0
Method I tried in show below but the count of each input is shown as a new column using this query,
SELECT
sum(case when h.code_list = 'esdop9' then 1 else 0 end) AS count_esdop9,
sum(case when h.code_list = '627czl' then 1 else 0 end) AS count_627czl,
sum(case when h.code_list = '1lqnd8' then 1 else 0 end) AS count_1lqnd8,
sum(case when h.code_list = 'aol4m6' then 1 else 0 end) AS count_aol4m6
FROM history h;
Note - The number inputs need to be given in the query in 10 also the real table has millions of records.
If i properly understand you need to get the count of occurrences for the following codes: 627czl, 1lqnd8, esdop9.
In this case you can try this one:
SELECT code_list, count(*) as count_
FROM history
WHERE code_list in ('627czl','1lqnd8','esdop9')
GROUP BY code_list
ORDER BY count_ DESC;
dbfiddle
If you need to get the count of occurrences for all codes you can run the following query:
SELECT code_list, count(*) as count_
FROM history
GROUP BY code_list
ORDER BY count_ DESC;
you can try to use GROUP BY
Something like this
SELECT code_list, COUNT(1) as 'total' ROM h GROUP by code_list order by 'total' ;

check and compare the count from two tables without relation

I have below tables
Table1: "Demo"
Columns: SSN, sales, Create_DT,Update_Dt
Table2: "Agent"
Columns: SSN,sales, Agent_Name, Create_Dt, Update_DT
Scenario 1 and desired result set:
I want output as 0 if the count of SSN in Demo table is matched with the count of SSN in Agent table
if the count is not matched then I want result as 1
Scenario 2 and desired result set:
I want output as 0 if the sum of sales in Demo table is matched with the sum of sales in Agent table
if the sum is not matched then I want result as 1
Please help on this query part
Thanks
You can write two queries separately to take counts within the result query
SELECT (SELECT count(Demo.SSN) as SSN1 from Demo)!=(SELECT count(Agent.SSN) as SSN2 from Agent) AS Result;
Basically what the inner queries does is it checked whether the counts are equal or not and outputs 1 if it is true and 0 if it is false. Since you have asked to output 1 if it is false I used '!=' sign.
You can try the same procedure in scenario 2 also
For scenario 1
select (Case when (select count(ssn) from Demo)=(select count(ssn) from Agent) then 0 else 1 end) as desired_result
If you want to count unique ssn then:
select (Case when (select count(distinct ssn) from Demo)=(select count(distinct ssn) from Agent) then 0 else 1 end) as desired_result
For scenario 2:
select (Case when (select sum(sales) from Demo)=(select sum(sales) from Agent) then 0 else 1 end) as desired_result
I would suggest one query with both sets of information:
select (d.num_ssn <> a.num_ssn) as have_different_ssn_count,
(d.sales <> a.sales) as have_different_sales
from (select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from demo
) d cross join
(select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from agent
) a;
Note: This returns boolean values -- true/false rather than 1/0. If you really want 0/1, then use case:
select (case when d.num_ssn <> a.num_ssn then 1 else 0 end) as have_different_ssn_count,
(case when d.sales <> a.sales then 1 else 0 end) as have_different_sales
It would not surprise me if you were not only interested in the total counts but also that the agent/sales combinations are the same in both tables. If that is the case, please ask a new question with a clear explanation. Sample data and desired results help.

Proportion request sql

There is a table of accidents and output the share of accidents number 2 to all accidents I wrote this code, but I can not make it work:
select ((select count("ID") from "DTP" where "REASON"=2)/count("REASON"))
from "DTP"
group by "ID"
Something like this (not tested):
select id, count(case reason when 2 then 1 end)/count(*) as proportion
from your_table
-- where ... (if you need to filter, for example by date)
group by id
;
count(*) counts all the rows in a group (that is, all the rows for each separate id). The case expression returns 1 when the reason is 2 and it returns null otherwise; count counts only non-null values, so it will count the rows where the reason is 2.
You can use avg():
select id,
avg(case when reason = 2 then 1.0 else 0 end)
from "DTP"
group by "ID"
This produces the ratio for each id -- based on your sample query. If you only want one row for all the data, then:
select avg(case when reason = 2 then 1.0 else 0 end)
from "DTP";

Multiple Word Count in SQL

I have a list of words I need to find in a specific column , "description of what happenned "
this holds anything up to 500 or more characters. I have the script below that does work
However how do I replace the Name column 1.2.3 with the actual name of the word I am looking for with the total next to it.
Just cant get it to display prob something simple.
select GROUPING_ID ( Amoxicillin ,Atorvastatin ) as Name ,count(*) as Total
from ( select case when [description_of_what_happened] like '%Amoxicillin%'
then 1 else 0 end as Amoxicillin ,
case when [description_of_what_happened] like '%Atorvastatin%'
then 1 else 0 end as Atorvastatin
FROM "NAME OF TABLE"
group by grouping sets (() ,(Amoxicillin),(Atorvastatin))
having coalesce (Amoxicillin,1) != 0 and coalesce (Atorvastatin,1) != 0
order by grouping_id (Amoxicillin,Atorvastatin)
row 3 being the total I need row 1 and row 2 to show the name of the product
result as below
Name Total
1 7
2 9
3 4112
You can use strings instead of flags:
select coalesce(Amoxicillin, Atorvastatin, 'Total') as Name,
count(*) as Total
from (select (case when [description_of_what_happened] like '%Amoxicillin%'
then 'Amoxicillin'
end) as Amoxicillin ,
(case when [description_of_what_happened] like '%Atorvastatin%'
then 'Atorvastatin'
end
) as Atorvastatin
from "NAME OF TABLE"
where Amoxicillin is not null or Atorvastatin is not null
group by grouping sets ((), (Amoxicillin), (Atorvastatin))
order by name;
Note that I also moved the logic in the having to the where.

Summing the results of Case queries in SQL

I think this is a relatively straightforward question but I have spent the afternoon looking for an answer and cannot yet find it. So...
I have a view with a country column and a number column. I want to make any number less than 10 'other' and then sum the 'other's into one value.
For example,
AR 10
AT 7
AU 11
BB 2
BE 23
BY 1
CL 2
I used CASE as follows:
select country = case
when number < 10 then 'Other'
else country
end,
number
from ...
This replaces the countries values with less than 10 in the number column to other but I can't work out how to sum them. I want to end up with a table/view which looks like this:
AR 10
AU 11
BE 23
Other 12
Any help is greatly appreciated.
Just group by your case statement:
select
country = case
when number < 10 then 'Other'
else country
end,
sum(number)
from ...
group by
case
when number < 10 then 'Other'
else country
end
select country, number
from your_table
where number >= 10
union
select 'Other' as country, sum(number)
from your_table
where number < 10
You can wrap it into a derived table to avoid repeating the CASE statement.
select country,
sum(number) as number
from (
select case
when number < 10 then 'Other'
else country
end as country,
number
from ...
) t
group by country
You need a Group by
select country = case
when number < 10 then 'Other'
else country
end,
sum(number)
from ...
group by
case when number < 10 then 'Other' else country end
You were very close... You just can't do country = case... Try
Select
Case when number < 10 then 'Other'
Else country end as finalcountry,
Sum(number) as total number
From
YourTable
Group by
Finalcountry
Since I don't have MySQL handy, I don't remember if it allows you to use the final column name as a group by... You may have to re-copy the case when clause into the group by clause too.