How to create a crosstab with two fields in BigQuery with standard or legacy SQL - sql

I want to take two columns from a table and create a crosstab showing how many products each customer bought in each product category.
Here is some example data from my table:
Row Customer_ID Style
1 MEM014 BLS87
2 KAR810 DR126
3 NIKE61 MMQ5
4 NIKE61 MMQ5
5 STT019 BLS83
6 STT019 BLS84
7 STT019 BLS87
And I want to get a result table like this:
Customer DR126 MMQ5 BLS83 BLS84 BLS87
MEM014 0 0 0 0 1
KAR810 1 0 0 0 0
NIKE61 0 2 0 0 0
STT019 0 0 1 1 1

Below is for BigQuery Standard SQL
Step #1 - generate pivot query
#standardSQL
SELECT CONCAT(
  "SELECT Customer_ID,",
  STRING_AGG(CONCAT("COUNTIF(Style='", Style, "') ", Style)),
  " FROM `project.dataset.your_table` GROUP BY Customer_ID ORDER BY Customer_ID")
FROM (
  SELECT DISTINCT Style
  FROM `project.dataset.your_table`
  ORDER BY Style
)
If you run it against the dummy data from your question, as below:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'MEM014' Customer_ID, 'BLS87' Style UNION ALL
SELECT 'KAR810', 'DR126' UNION ALL
SELECT 'NIKE61', 'MMQ5' UNION ALL
SELECT 'NIKE61', 'MMQ5' UNION ALL
SELECT 'STT019', 'BLS83' UNION ALL
SELECT 'STT019', 'BLS84' UNION ALL
SELECT 'STT019', 'BLS87'
)
SELECT CONCAT(
  "SELECT Customer_ID,",
  STRING_AGG(CONCAT("COUNTIF(Style='", Style, "') ", Style)),
  " FROM `project.dataset.your_table` GROUP BY Customer_ID ORDER BY Customer_ID")
FROM (
  SELECT DISTINCT Style
  FROM `project.dataset.your_table`
  ORDER BY Style
)
you will get the following pivot query:
SELECT Customer_ID,COUNTIF(Style='BLS83') BLS83,COUNTIF(Style='BLS84') BLS84,COUNTIF(Style='BLS87') BLS87,COUNTIF(Style='DR126') DR126,COUNTIF(Style='MMQ5') MMQ5 FROM `project.dataset.your_table` GROUP BY Customer_ID ORDER BY Customer_ID
Step #2 - run generated pivot query
If you run it against your dummy data, you get the expected result:
Row Customer_ID BLS83 BLS84 BLS87 DR126 MMQ5
1 KAR810 0 0 0 1 0
2 MEM014 0 0 1 0 0
3 NIKE61 0 0 0 0 2
4 STT019 1 1 1 0 0
Note 1: The above assumes your Style values are valid column names (those in your example are). If not, you will need to escape the unsupported characters (an easy adjustment to step 1).
Note 2: The maximum unresolved query length is 256 KB. So if your Style names are similar to those in your example, the above solution will support around 8,500 styles, which should be below the limit (10K?) on the number of columns in a table.
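The two steps above can also be driven from a client program. Here is a minimal sketch in Python, assuming an in-memory SQLite database as a stand-in for BigQuery (so SUM(CASE ...) replaces COUNTIF, and the table name your_table is only illustrative). The helper count_column is hypothetical; it shows the kind of escaping Note 1 refers to.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (Customer_ID TEXT, Style TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("MEM014", "BLS87"), ("KAR810", "DR126"), ("NIKE61", "MMQ5"),
     ("NIKE61", "MMQ5"), ("STT019", "BLS83"), ("STT019", "BLS84"),
     ("STT019", "BLS87")],
)

# Step 1: build the pivot query text from the distinct styles.
styles = [row[0] for row in conn.execute(
    "SELECT DISTINCT Style FROM your_table ORDER BY Style")]


def count_column(style):
    """Return one conditional-count column for the given style (hypothetical helper)."""
    literal = style.replace("'", "''")      # escape for use as a string literal
    identifier = style.replace('"', '""')   # escape for use as a quoted column name
    return (f"SUM(CASE WHEN Style = '{literal}' THEN 1 ELSE 0 END) "
            f'AS "{identifier}"')


pivot_sql = (
    "SELECT Customer_ID, "
    + ", ".join(count_column(s) for s in styles)
    + " FROM your_table GROUP BY Customer_ID ORDER BY Customer_ID"
)

# Step 2: run the generated pivot query.
cursor = conn.execute(pivot_sql)
header = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(header)
for row in rows:
    print(row)
```

Because the column list is rebuilt on every run, new styles show up as new columns without editing the query by hand.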

You can use conditional aggregation:
select customer,
sum(case when style = 'DR126' then 1 else 0 end) as DR126,
sum(case when style = 'MMQ5' then 1 else 0 end) as MMQ5,
. . .
from t
group by customer;
This works if you have the exact list of styles. If not, then you should be thinking in terms of arrays for the result set.
EDIT:
You can create an array of structs if that better suits your purpose:
select customer, array_agg(cs) as styles
from (select customer, style, count(*) as cnt
      from t
      group by customer, style
     ) cs
group by customer;
What you cannot do is have a query return a variable number of columns. For that, you need dynamic SQL and a programming language.
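As a rough illustration of the "fixed columns plus a programming language" route, here is a sketch in Python. It assumes an in-memory SQLite database with a table t(customer, style) holding the question's sample data; the per-customer dictionary plays a role similar in spirit to the array of structs.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite with the question's sample data (table/column names assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (customer TEXT, style TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("MEM014", "BLS87"), ("KAR810", "DR126"), ("NIKE61", "MMQ5"),
     ("NIKE61", "MMQ5"), ("STT019", "BLS83"), ("STT019", "BLS84"),
     ("STT019", "BLS87")],
)

# The query always returns three columns: one row per (customer, style) pair.
pairs = conn.execute(
    "SELECT customer, style, COUNT(*) AS cnt "
    "FROM t GROUP BY customer, style ORDER BY customer, style"
).fetchall()

# Collect the pairs per customer on the client side.
styles_by_customer = defaultdict(dict)
for customer, style, cnt in pairs:
    styles_by_customer[customer][style] = cnt

for customer, counts in styles_by_customer.items():
    print(customer, counts)
```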

Related

Find items in table with 2 specific sizes

I have an items table where the item code repeats because each item comes in different sizes and variants.
I want to find items which have two specific kinds of size, i.e. a size in both M/Y and Euro.
Items table:
Id size
1 0
1 2Y
1 EU-15
2 2M
2 4M
3 0
3 2M-4M
3 EU-12
4 EU-11
4 EU-15
Required: I want the query to return item ids 1 and 3.
I was trying with SUM() and CASE but could not figure it out, as it involves the LIKE operator (Size like '[^EU]%' and Size like 'EU%').
#Update:
With a little hint, I could do it with 2 queries using a temp table. It would be nice to see it in a single query.
1st Query.
select id,
case when size like '[^EU]%' then 'S'
when size like 'EU%' then 'EU' END as size
into #t from table
2nd Query.
select id, size from table
where id in
( select id from #t
group by id
having count(distinct(size))>1)
order by id, size
Thanks.
I think you want the Ids that have both an EU% size and a non-EU% size:
select t.Id
from tbl t
group by t.Id
having count(distinct case when size like 'EU%' then 1 else 2 end) = 2
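To check the HAVING approach against the sample data, here is a small sketch in Python using an in-memory SQLite database (an assumption; the question appears to target SQL Server, but this particular query uses only portable syntax). The table name tbl follows the answer.

```python
import sqlite3

# In-memory SQLite with the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (Id INTEGER, size TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "0"), (1, "2Y"), (1, "EU-15"),
     (2, "2M"), (2, "4M"),
     (3, "0"), (3, "2M-4M"), (3, "EU-12"),
     (4, "EU-11"), (4, "EU-15")],
)

# Each row is classified as EU (1) or non-EU (2); an Id qualifies only
# when both classes occur among its rows.
ids = [row[0] for row in conn.execute(
    "SELECT t.Id FROM tbl t "
    "GROUP BY t.Id "
    "HAVING COUNT(DISTINCT CASE WHEN size LIKE 'EU%' THEN 1 ELSE 2 END) = 2 "
    "ORDER BY t.Id"
)]

print(ids)
```

Note that this treats every size that does not start with EU (including 0) as the "other" kind, which matches the answer's reading of the question.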
You can use analytic functions as follows:
select * from
(select t.*,
count(case when Size like '%M' OR Size like '%Y' then 1 end)
over (partition by id) cnt1,
count(case when Size like 'EU%' then 1 end)
over (partition by id) cnt2
from your_Table t) t
where cnt1 > 0 AND cnt2 > 0

Multiple Word Count in SQL

I have a list of words I need to find in a specific column, "description of what happened", which holds anything up to 500 or more characters. I have the script below, which does work.
However, how do I replace the 1, 2, 3 in the Name column with the actual name of the word I am looking for, with the total next to it?
I just can't get it to display; it's probably something simple.
select GROUPING_ID ( Amoxicillin ,Atorvastatin ) as Name ,count(*) as Total
from ( select case when [description_of_what_happened] like '%Amoxicillin%'
then 1 else 0 end as Amoxicillin ,
case when [description_of_what_happened] like '%Atorvastatin%'
then 1 else 0 end as Atorvastatin
FROM "NAME OF TABLE"
) t
group by grouping sets (() ,(Amoxicillin),(Atorvastatin))
having coalesce (Amoxicillin,1) != 0 and coalesce (Atorvastatin,1) != 0
order by grouping_id (Amoxicillin,Atorvastatin)
Row 3 is the total; I need rows 1 and 2 to show the name of the product.
The result is as below:
Name Total
1 7
2 9
3 4112
You can use strings instead of flags:
select coalesce(Amoxicillin, Atorvastatin, 'Total') as Name,
count(*) as Total
from (select (case when [description_of_what_happened] like '%Amoxicillin%'
then 'Amoxicillin'
end) as Amoxicillin ,
(case when [description_of_what_happened] like '%Atorvastatin%'
then 'Atorvastatin'
end
) as Atorvastatin
from "NAME OF TABLE"
) t
where Amoxicillin is not null or Atorvastatin is not null
group by grouping sets ((), (Amoxicillin), (Atorvastatin))
order by name;
Note that I also moved the logic in the having to the where.

Create a query that counts instances in a related table

I have re-written this query about 20 times today and I keep getting close but no dice... I'm sure this is easy-peasy for y'all, but my SQL (Oracle) is pretty rusty.
Here's what I need:
PersonID Count1 Count2 Count3 Count4
1 0 0 2 1
2 1 1 1 0
3 1 1 1 2
Data is coming from several sources. I have a table People, and a table Values. People can have any number of values in that table.
PersonID Item Value
1 Check1 3
1 Check2 3
1 Check3 4
2 Check4 2
2 Check5 3
2 Check6 1
.. etc
So the query would, for each PersonID, count how many times the particular Value appears. The values are always 1, 2, 3, or 4. I tried to do 4 subqueries, but it wouldn't read the PersonID from the main query and just returned the count of all instances of value=1.
I was then thinking do a Group_By ... I don't know. Any help is appreciated!
ETA: I've deleted & re-written the query many times in many ways and unfortunately did not save any intermediate attempts. I didn't include it originally because I was in the middle of rearranging it again, and it's not runnable as-is. But here it is as it stands now:
/*sources are the tested requirements
values are the scores people received on the tested sources
people are those who were tested on the requirements */
WITH sub_query4 (
SELECT values.personid,
count (values.ID) as count4 --how many 4s
FROM values
INNER JOIN sources ON values.valueid = sources.sourceid
INNER JOIN people ON people.personid = values.personid
WHERE values.yearid = 2017
AND values.quarter = 'Q1'
AND instr (sources.identifier, 'TESTBANK.01', 1 ,1) > 0
AND values.value = '4'
GROUP_BY people.personid
)
SELECT p.first_name,
p.last_name,
p.position,
p.email,
p.locationid,
sub_query4.count4 as count4 --eventually this would repeat for 1, 2, & 3
FROM people p
WHERE p.locationid=406
AND p.position in (9,10);
values is a bad name for a table because it is a SQL keyword.
In any case, conditional aggregation should work:
select personid,
sum(case when value = 1 then 1 else 0 end) as cnt_1,
sum(case when value = 2 then 1 else 0 end) as cnt_2,
sum(case when value = 3 then 1 else 0 end) as cnt_3,
sum(case when value = 4 then 1 else 0 end) as cnt_4
from values
group by personid;
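A quick way to sanity-check the conditional aggregation is to run it on the sample rows from the question. This sketch uses Python with an in-memory SQLite database rather than Oracle (an assumption), and names the table vals (a hypothetical name) to avoid the keyword clash mentioned above.

```python
import sqlite3

# In-memory SQLite with the six sample rows shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vals (personid INTEGER, item TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO vals VALUES (?, ?, ?)",
    [(1, "Check1", 3), (1, "Check2", 3), (1, "Check3", 4),
     (2, "Check4", 2), (2, "Check5", 3), (2, "Check6", 1)],
)

# One SUM(CASE ...) column per possible value (1 through 4).
counts = conn.execute(
    "SELECT personid, "
    "SUM(CASE WHEN value = 1 THEN 1 ELSE 0 END) AS cnt_1, "
    "SUM(CASE WHEN value = 2 THEN 1 ELSE 0 END) AS cnt_2, "
    "SUM(CASE WHEN value = 3 THEN 1 ELSE 0 END) AS cnt_3, "
    "SUM(CASE WHEN value = 4 THEN 1 ELSE 0 END) AS cnt_4 "
    "FROM vals GROUP BY personid ORDER BY personid"
).fetchall()

for row in counts:
    print(row)
```

The two rows printed correspond to PersonID 1 and 2 in the desired output table from the question.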
I prefer to use PIVOT for this. Here is an example:
SELECT "PersonID", val1,val2,val3,val4 FROM
(
SELECT "PersonID", "Value" from VALS
)
PIVOT
(
count("Value")
FOR "Value" IN (1 as val1, 2 as val2, 3 as val3, 4 as val4)
);

select result set row to columns transformation

I have a table remarks with columns id, story_id, and like; like can be +1 or -1.
I want my select query to return the following columns: story_id, total, n_like, n_dislike, where total = n_like + n_dislike, without subqueries.
I am currently doing a group by on like and selecting like as like_t, count(like) as total, which gives me output like
-- like_t --+ --- total --
-1 | 2
1 | 6
and returns two rows in the result set. But what I want is 1 row where n_like is 6, n_dislike is 2, and total is 8.
First, LIKE is a reserved word in PostgreSQL, so you have to double-quote it. Maybe a better name should be picked for this column.
CREATE TABLE testbed (id int4, story_id int4, "like" int2);
INSERT INTO testbed VALUES
(1,1,'+1'),(1,1,'+1'),(1,1,'+1'),
(1,1,'+1'),(1,1,'+1'),(1,1,'+1'),
(1,1,'-1'),(1,1,'-1');
SELECT
story_id,
sum(CASE WHEN "like" > 0 THEN abs("like") ELSE 0 END) AS n_like,
sum(CASE WHEN "like" < 0 THEN abs("like") ELSE 0 END) AS n_dislike,
count(story_id) AS total
-- for cases +2 / -3 in the "like" field, use following construct instead
-- sum(abs("like")) AS total
FROM testbed
GROUP BY story_id;
I used abs("like") for cases when you'll have +2 or -3 in your "like" column.
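The query above can be tried end to end with a short Python script. This is a sketch that assumes an in-memory SQLite database in place of PostgreSQL; SQLite also accepts the double-quoted "like" column name and has abs().

```python
import sqlite3

# In-memory SQLite standing in for PostgreSQL (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE testbed (id INTEGER, story_id INTEGER, "like" INTEGER)')
conn.executemany(
    "INSERT INTO testbed VALUES (?, ?, ?)",
    [(1, 1, 1)] * 6 + [(1, 1, -1)] * 2,   # six likes and two dislikes
)

# One output row per story: likes, dislikes, and the overall total.
result = conn.execute(
    'SELECT story_id, '
    'SUM(CASE WHEN "like" > 0 THEN abs("like") ELSE 0 END) AS n_like, '
    'SUM(CASE WHEN "like" < 0 THEN abs("like") ELSE 0 END) AS n_dislike, '
    'COUNT(story_id) AS total '
    'FROM testbed GROUP BY story_id'
).fetchall()

print(result)
```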

Get the distinct count of values from a table with multiple where clauses

My table structure is this
id last_mod_dt nr is_u is_rog is_ror is_unv
1 x uuid1 1 1 1 0
2 y uuid1 1 0 1 1
3 z uuid2 1 1 1 1
I want the count of rows with:
is_ror=1 or is_rog =1
is_u=1
is_unv=1
All in a single query. Is it possible?
The problem I am facing is that there can be same values for nr as is the case in the table above.
Case statements provide mondo flexibility...
SELECT
sum(case
when is_ror = 1 or is_rog = 1 then 1
else 0
end) FirstCount
,sum(case
when is_u = 1 then 1
else 0
end) SecondCount
,sum(case
when is_unv = 1 then 1
else 0
end) ThirdCount
from MyTable
You can use union all to get multiple results, e.g.
select count(*) from table where is_ror=1 or is_rog=1
union all
select count(*) from table where is_u=1
union all
select count(*) from table where is_unv=1
Then the result set will contain three rows, each with one of the counts.
Sounds pretty simple, if "all in a single query" does not disqualify subselects:
SELECT
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_ror=1 OR is_rog=1) cnt_ror_reg,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_u=1) cnt_u,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_unv=1) cnt_unv;
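To see how distinct counts differ from plain row counts on the sample data, here is a sketch in Python with an in-memory SQLite database (an assumption; the question does not name its database). The single-pass variant using COUNT(DISTINCT CASE ...) is not from the answers above; it is shown only as one possible alternative to the subselects.

```python
import sqlite3

# In-memory SQLite with the three sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (id INTEGER, last_mod_dt TEXT, nr TEXT, "
    "is_u INTEGER, is_rog INTEGER, is_ror INTEGER, is_unv INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, "x", "uuid1", 1, 1, 1, 0),
     (2, "y", "uuid1", 1, 0, 1, 1),
     (3, "z", "uuid2", 1, 1, 1, 1)],
)

# Distinct nr values per condition, using scalar subselects.
subselect_counts = conn.execute(
    "SELECT "
    "(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_ror=1 OR is_rog=1), "
    "(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_u=1), "
    "(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_unv=1)"
).fetchone()

# Same distinct counts in one pass: CASE yields NULL for non-matching rows,
# and COUNT(DISTINCT ...) ignores NULLs.
single_pass_counts = conn.execute(
    "SELECT "
    "COUNT(DISTINCT CASE WHEN is_ror=1 OR is_rog=1 THEN nr END), "
    "COUNT(DISTINCT CASE WHEN is_u=1 THEN nr END), "
    "COUNT(DISTINCT CASE WHEN is_unv=1 THEN nr END) "
    "FROM table1"
).fetchone()

# Plain row counts, where repeated nr values are counted every time.
row_counts = conn.execute(
    "SELECT "
    "SUM(CASE WHEN is_ror=1 OR is_rog=1 THEN 1 ELSE 0 END), "
    "SUM(CASE WHEN is_u=1 THEN 1 ELSE 0 END), "
    "SUM(CASE WHEN is_unv=1 THEN 1 ELSE 0 END) "
    "FROM table1"
).fetchone()

print(subselect_counts, single_pass_counts, row_counts)
```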
how about something like
SELECT
SUM(IF(is_u > 0 AND is_rog > 0, 1, 0)) AS count_something,
...
from table
group by nr
I think it will do the trick
I am of course not sure what you want exactly, but I believe you can use the logic to produce your desired result.