Find items in table with 2 specific sizes - sql

I have an items table where the item code repeats because each item comes in different sizes and variants.
I want to find the items that have 2 specific kinds of size, i.e. a size in M/Y and a size in Euro.
Items table:
Id size
1 0
1 2Y
1 EU-15
2 2M
2 4M
3 0
3 2M-4M
3 EU-12
4 EU-11
4 EU-15
Required: I want the query to return item ids 1 and 3.
I was trying with SUM() and CASE but could not figure it out, as it involves the LIKE operator (Size like '[^EU]%' and Size like 'EU%').
#Update:
With a little hint, I could do it with 2 queries using a temp table. It would be nice to see it in a single query.
1st Query.
select id,
case when size like '[^EU]%' then 'S'
when size like 'EU%' then 'EU' END as size
into #t from table
2nd Query.
select id, size from table
where id in
( select id from #t
group by id
having count(distinct(size))>1)
order by id, size
Thanks.

I think you want the Ids that have both an EU% size and a non-EU% size:
select t.Id
from tbl t
group by t.Id
having count(distinct case when size like 'EU%' then 1 else 2 end) = 2
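To check the query above against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. The table name items is an assumption (the question only calls it "table"), and SQLite stands in for whatever engine the asker uses.

```python
import sqlite3

# In-memory database with the sample rows from the question.
# The table name "items" is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, size TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "0"), (1, "2Y"), (1, "EU-15"),
     (2, "2M"), (2, "4M"),
     (3, "0"), (3, "2M-4M"), (3, "EU-12"),
     (4, "EU-11"), (4, "EU-15")],
)

# Each row is mapped to 1 (EU size) or 2 (any other size).
# An id qualifies only if both values occur in its group.
rows = conn.execute("""
    SELECT id
    FROM items
    GROUP BY id
    HAVING COUNT(DISTINCT CASE WHEN size LIKE 'EU%' THEN 1 ELSE 2 END) = 2
    ORDER BY id
""").fetchall()

ids = [r[0] for r in rows]
print(ids)
```

Note that this treats every non-EU size (including 0) as the "other" kind, which is what makes ids 1 and 3 qualify.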

You can use analytic (window) functions as follows:
select * from
(select t.*,
count(case when Size like '%M' OR Size like '%Y' then 1 end)
over (partition by id) cnt1,
count(case when Size like 'EU%' then 1 end)
over (partition by id) cnt2
from your_Table t) t
where cnt1 > 0 AND cnt2 > 0
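The window-function version keeps the full rows instead of collapsing them to one row per id. A small sketch of it against the question's sample data, run in SQLite via Python's sqlite3 module (this assumes SQLite 3.25 or later for window functions; the table name items is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, size TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "0"), (1, "2Y"), (1, "EU-15"),
     (2, "2M"), (2, "4M"),
     (3, "0"), (3, "2M-4M"), (3, "EU-12"),
     (4, "EU-11"), (4, "EU-15")],
)

# cnt1 counts M/Y sizes per id, cnt2 counts EU sizes per id.
# CASE without ELSE yields NULL, which COUNT ignores.
rows = conn.execute("""
    SELECT id, size
    FROM (
        SELECT t.*,
               COUNT(CASE WHEN size LIKE '%M' OR size LIKE '%Y' THEN 1 END)
                   OVER (PARTITION BY id) AS cnt1,
               COUNT(CASE WHEN size LIKE 'EU%' THEN 1 END)
                   OVER (PARTITION BY id) AS cnt2
        FROM items t
    ) AS counted
    WHERE cnt1 > 0 AND cnt2 > 0
    ORDER BY id, size
""").fetchall()

for row in rows:
    print(row)
```

All rows of ids 1 and 3 come back, including the size 0 rows, because the filter is applied per id rather than per row.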

Related

Why is SUM() and COUNT() returning different values?

I have two nearly identical queries and I'm trying to understand why they return different results. I would like to produce a table with a user_id column and a food_orders column showing how many items each user has ordered. Both queries produce this table, but in some rows they calculate different values for food_orders.
My questions are: why is this, and which should I use?
Simplified versions of my queries are below.
Query 1 (using COUNT):
WITH order_made AS (
SELECT restaurant_id,
count(
CASE
WHEN item LIKE '%_angus' then 1
WHEN item LIKE '%_organic' then 1
WHEN item LIKE '%_lean' then 1
ELSE 0
END)
AS burgers
FROM mcdonalds.specialist_orders
GROUP BY mcdonalds.specialist_orders.user_id
UNION ALL
SELECT restaurant_id,
COUNT(
CASE
WHEN item LIKE 'salad%' THEN 1
WHEN item LIKE 'tomato%' THEN 1
WHEN item LIKE 'potatoes%' THEN 1
ELSE 0
END)
AS vegetables
FROM public.bulk_orders
GROUP BY public.bulk_orders.user_id),
Query 2 (using SUM):
WITH orders_made AS (
SELECT user_id, SUM(food_orders) AS food_orders
FROM (SELECT user_id,
CASE
WHEN item LIKE '%_angus' then 1
WHEN item LIKE '%_organic' then 1
WHEN item LIKE '%_lean' then 1
ELSE 0
END
AS food_orders
FROM mcdonalds.specialist_orders
UNION ALL
SELECT user_id,
CASE
WHEN item LIKE 'salad%' THEN 1
WHEN item LIKE 'tomato%' THEN 1
WHEN item LIKE 'potatoes%' THEN 1
ELSE 0
END
AS food_orders
FROM public.bulk_orders
GROUP BY user_id)
COUNT counts every row where the expression is not NULL, and 0 is not NULL, so the ELSE 0 rows are counted as well. If you don't want a row to be counted, the CASE expression has to return NULL for it.
So if you remove ELSE 0 from your CASE (or use ELSE NULL instead), the result will be the same as with SUM:
WITH order_made AS (
SELECT restaurant_id,
count(
CASE
WHEN item LIKE '%_angus' then 1
WHEN item LIKE '%_organic' then 1
WHEN item LIKE '%_lean' then 1
END)
AS burgers
FROM mcdonalds.specialist_orders
GROUP BY mcdonalds.specialist_orders.user_id
UNION ALL
SELECT restaurant_id,
COUNT(
CASE
WHEN item LIKE 'salad%' THEN 1
WHEN item LIKE 'tomato%' THEN 1
WHEN item LIKE 'potatoes%' THEN 1
END)
AS vegetables
FROM public.bulk_orders
GROUP BY public.bulk_orders.user_id),
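The difference is easy to see on a tiny table. The sketch below uses a hypothetical orders table with made-up rows (not the asker's data) and runs in SQLite through Python's sqlite3 module; it puts the three variants side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, just to illustrate the behaviour.
conn.execute("CREATE TABLE orders (user_id INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "salad_small"), (1, "cola"), (1, "tomato_soup"), (2, "fries")],
)

rows = conn.execute("""
    SELECT user_id,
           -- ELSE 0 is non-NULL, so COUNT counts every row
           COUNT(CASE WHEN item LIKE 'salad%' OR item LIKE 'tomato%'
                      THEN 1 ELSE 0 END) AS count_else_0,
           -- no ELSE: non-matching rows become NULL and are skipped
           COUNT(CASE WHEN item LIKE 'salad%' OR item LIKE 'tomato%'
                      THEN 1 END) AS count_no_else,
           -- SUM adds the 1s and 0s, so only matches contribute
           SUM(CASE WHEN item LIKE 'salad%' OR item LIKE 'tomato%'
                    THEN 1 ELSE 0 END) AS sum_else_0
    FROM orders
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

for row in rows:
    print(row)
```

For user 1 the first column is 3 (all rows) while the other two are 2 (matching rows only), which is the discrepancy described in the question.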

How to compare a number with count result then use it in limit statement in redshift/sql

I have a table with two columns, id and flag.
The data is very imbalanced: only a few rows have flag = 1 and the rest have flag = 0.
id flag
1 0
2 0
3 0
4 0
5 1
6 1
7 0
Now I want to create a balanced table. Therefore, I want to get a subset of the flag = 0 rows whose size is based on the number of records where flag = 1. Also, I don't want that number to be greater than 1000.
I am thinking about code like this:
select *
from table
where flag = 0
order by random()
limit (least(1000,
select count(*)
from table
where flag = 1));
Expected result (only two records have flag = 1, so I get two records with flag = 0; if more than 1000 records have flag = 1, I will only get 1000):
id flag
2 0
7 0
If you want a balanced sample:
select t.*
from (select t.*, row_number() over (partition by flag order by flag) as seqnum,
sum(case when flag = 1 then 1 else 0 end) over () as cnt_1
from t
) t
where seqnum <= cnt_1;
To also cap the sample at an overall maximum, change the filter to:
where seqnum <= least(cnt_1, 1000)
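As a rough check of the balanced-sample idea, here is a sketch on the question's seven rows, run in SQLite via Python's sqlite3 module (SQLite 3.25+ is assumed for window functions). Two things differ from the answer and are assumptions of this sketch: SQLite has no LEAST, so its two-argument scalar MIN is used instead, and the rows are numbered with ORDER BY RANDOM() so that the flag = 0 rows are picked at random, as the question asked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, flag INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 0), (2, 0), (3, 0), (4, 0), (5, 1), (6, 1), (7, 0)],
)

# seqnum numbers the rows inside each flag group in random order;
# cnt_1 is the total number of flag = 1 rows, repeated on every row.
sample = conn.execute("""
    SELECT id, flag
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY flag ORDER BY RANDOM()) AS seqnum,
               SUM(CASE WHEN flag = 1 THEN 1 ELSE 0 END) OVER () AS cnt_1
        FROM t
    ) AS numbered
    WHERE seqnum <= MIN(cnt_1, 1000)   -- MIN(a, b) is SQLite's LEAST
    ORDER BY flag, id
""").fetchall()

zeros = [row for row in sample if row[1] == 0]
ones = [row for row in sample if row[1] == 1]
print(len(zeros), len(ones))
```

The result contains both classes in equal numbers. If only the flag = 0 rows are wanted, as in the question's expected output, adding AND flag = 0 to the outer WHERE would do it.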
You can use row_number to simulate LIMIT.
select * from (
select column1, column2, row_number() OVER() AS rownum
from table
where flag = 0 ) t
where rownum <= 1000
If I’ve made a bad assumption please comment and I’ll refocus my answer.

How to create crosstab with two fields in BigQuery with standard or legacy SQL

I want to take two columns from a table and create a crosstab showing how many products each customer bought in each product category.
Here is some example data from my table:
Row Customer_ID Style
1 MEM014 BLS87
2 KAR810 DR126
3 NIKE61 MMQ5
4 NIKE61 MMQ5
5 STT019 BLS83
6 STT019 BLS84
7 STT019 BLS87
And I want to get a result table like this:
Customer DR126 MMQ5 BLS83 BLS84 BLS87
MEM014 0 0 0 0 1
KAR810 1 0 0 0 0
NIKE61 0 2 0 0 0
STT019 0 0 1 1 1
Below is for BigQuery Standard SQL
Step #1 - generate pivot query
#standardSQL
SELECT CONCAT(
"SELECT Customer_ID,",
STRING_AGG(CONCAT("COUNTIF(Style='", Style, "') ", Style)),
" FROM `project.dataset.your_table` GROUP BY Customer_ID ORDER BY Customer_ID")
FROM (
SELECT DISTINCT Style
FROM `project.dataset.your_table`
ORDER BY Style
)
If you run it with dummy data from your question like below
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 'MEM014' Customer_ID, 'BLS87' Style UNION ALL
SELECT 'KAR810', 'DR126' UNION ALL
SELECT 'NIKE61', 'MMQ5' UNION ALL
SELECT 'NIKE61', 'MMQ5' UNION ALL
SELECT 'STT019', 'BLS83' UNION ALL
SELECT 'STT019', 'BLS84' UNION ALL
SELECT 'STT019', 'BLS87'
)
SELECT CONCAT(
"SELECT Customer_ID,",
STRING_AGG(CONCAT("COUNTIF(Style='", Style, "') ", Style)),
" FROM `project.dataset.your_table` GROUP BY Customer_ID")
FROM (
SELECT DISTINCT Style
FROM `project.dataset.your_table`
ORDER BY Style
)
you will get the following pivot query
SELECT Customer_ID,COUNTIF(Style='BLS83') BLS83,COUNTIF(Style='BLS84') BLS84,COUNTIF(Style='BLS87') BLS87,COUNTIF(Style='DR126') DR126,COUNTIF(Style='MMQ5') MMQ5 FROM `project.dataset.your_table` GROUP BY Customer_ID
Step #2 - run generated pivot query
If you run it against your dummy data, you get the expected result
Row Customer_ID BLS83 BLS84 BLS87 DR126 MMQ5
1 KAR810 0 0 0 1 0
2 MEM014 0 0 1 0 0
3 NIKE61 0 0 0 0 2
4 STT019 1 1 1 0 0
Note 1: The above assumes your Style names comply with column naming conventions (those in your example do). If not, you will need to escape unsupported characters and so on (an easy adjustment for step 1).
Note 2: Maximum unresolved query length is 256 KB. So if your Style names are similar to those in your example, the above solution will support around 8500 styles, which should be less than the limit (10K?) on the number of columns in a table.
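The same two-step idea (generate the pivot query from the distinct styles, then run it) can be sketched outside BigQuery. The example below uses Python's sqlite3 module with a hypothetical purchases table. SQLite has no COUNTIF, so SUM(CASE ... END) is swapped in as an equivalent, and the helper count_column is made up for this sketch. Building SQL by string concatenation like this is only reasonable when the style values are trusted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; rows are the sample data from the question.
conn.execute("CREATE TABLE purchases (Customer_ID TEXT, Style TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("MEM014", "BLS87"), ("KAR810", "DR126"), ("NIKE61", "MMQ5"),
     ("NIKE61", "MMQ5"), ("STT019", "BLS83"), ("STT019", "BLS84"),
     ("STT019", "BLS87")],
)

def count_column(style):
    """Build one pivot column: a count of rows with the given style."""
    literal = style.replace("'", "''")   # escape for a SQL string literal
    alias = style.replace('"', '""')     # escape for a quoted identifier
    return f"SUM(CASE WHEN Style = '{literal}' THEN 1 ELSE 0 END) AS \"{alias}\""

# Step 1: generate the pivot query from the distinct styles.
styles = [r[0] for r in conn.execute(
    "SELECT DISTINCT Style FROM purchases ORDER BY Style")]
pivot_sql = (
    "SELECT Customer_ID, "
    + ", ".join(count_column(s) for s in styles)
    + " FROM purchases GROUP BY Customer_ID ORDER BY Customer_ID"
)

# Step 2: run the generated pivot query.
result = conn.execute(pivot_sql).fetchall()
for row in result:
    print(row)
```

The columns come out in alphabetical style order (BLS83, BLS84, BLS87, DR126, MMQ5), matching the result table shown above.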
You can use conditional aggregation:
select customer,
sum(case when style = 'DR126' then 1 else 0 end) as DR126,
sum(case when style = 'MMQ5' then 1 else 0 end) as MMQ5,
. . .
from t
group by customer;
This works if you have the exact list of styles. If not, then you should be thinking in terms of arrays for the result set.
EDIT:
You can create an array of structs if that better suits your purpose:
select customer, array_agg(cs) as styles
from (select customer, style, count(*) as cnt
from t
group by customer, style
) cs
group by customer;
What you cannot do is have a query return a variable number of columns. For that, you need dynamic SQL and a programming language.

SQL Count with multiple conditions then join

Quick one,
I have a table, with the following structure
id lid taken
1 1 0
1 1 0
1 1 1
1 1 1
1 2 1
Pretty simple so far, right?
I need to query the taken/available from the lid of 1, which should return
taken available
2 2
I know I can simply do two counts and join them, but is there a more efficient way of doing this than two separate queries?
I was looking at the following type of format, but I cannot for the life of me get it to execute in SQL...
SELECT
COUNT(case taken=1) AS taken,
COUNT(case taken=0) AS available FROM table
WHERE
lid=1
Thank you SO much.
You can do this:
SELECT taken, COUNT(*) AS count
FROM table
WHERE lid = 1
GROUP BY taken
This will return two rows:
taken count
0 2
1 2
Each count corresponds to how many times that particular taken value was seen.
Your query is nearly correct; it just needs a bit of rearranging:
SELECT
SUM(case taken WHEN 1 THEN 1 ELSE 0 END) AS taken,
SUM(case taken WHEN 1 THEN 0 ELSE 1 END) AS available FROM table
WHERE
lid=1
Alternatively you could do:
SELECT
SUM(taken) AS taken,
COUNT(id) - SUM(taken) AS available
FROM table
WHERE
lid=1
SELECT
SUM(case WHEN taken=1 THEN 1 ELSE 0 END) AS taken,
SUM(case WHEN taken=0 THEN 1 ELSE 0 END) AS available
FROM table
WHERE lid=1
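For a quick sanity check of the conditional-aggregation form, here is a sketch on the question's five rows using Python's sqlite3 module. The table name slots is an assumption, since the question only calls it "table".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "slots" is a hypothetical name for the question's table.
conn.execute("CREATE TABLE slots (id INTEGER, lid INTEGER, taken INTEGER)")
conn.executemany(
    "INSERT INTO slots VALUES (?, ?, ?)",
    [(1, 1, 0), (1, 1, 0), (1, 1, 1), (1, 1, 1), (1, 2, 1)],
)

# One pass over the lid = 1 rows; each SUM adds 1 per matching row.
taken, available = conn.execute("""
    SELECT SUM(CASE WHEN taken = 1 THEN 1 ELSE 0 END) AS taken,
           SUM(CASE WHEN taken = 0 THEN 1 ELSE 0 END) AS available
    FROM slots
    WHERE lid = 1
""").fetchone()

print(taken, available)
```

Both counts come back in a single row, which is the shape the question asked for.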
Weird application of CTEs:
WITH lid AS (
SELECT DISTINCT lid FROM taken
)
, tak AS (
SELECT lid,taken , COUNT(*) AS cnt
FROM taken t0
GROUP BY lid,taken
)
SELECT l.lid
, COALESCE(a0.cnt, 0) AS available
, COALESCE(a1.cnt, 0) AS taken
FROM lid l
LEFT JOIN tak a0 ON a0.lid=l.lid AND a0.taken = 0
LEFT JOIN tak a1 ON a1.lid=l.lid AND a1.taken = 1
WHERE l.lid=1
;

Get the distinct count of values from a table with multiple where clauses

My table structure is this
id last_mod_dt nr is_u is_rog is_ror is_unv
1 x uuid1 1 1 1 0
2 y uuid1 1 0 1 1
3 z uuid2 1 1 1 1
I want the count of rows with:
is_ror=1 or is_rog =1
is_u=1
is_unv=1
All in a single query. Is it possible?
The problem I am facing is that there can be same values for nr as is the case in the table above.
CASE statements provide mondo flexibility...
SELECT
sum(case
when is_ror = 1 or is_rog = 1 then 1
else 0
end) FirstCount
,sum(case
when is_u = 1 then 1
else 0
end) SecondCount
,sum(case
when is_unv = 1 then 1
else 0
end) ThirdCount
from MyTable
You can use UNION ALL to get multiple results, e.g.
select count(*) from table where is_ror=1 or is_rog =1
union all
select count(*) from table where is_u=1
union all
select count(*) from table where is_unv=1
Then the result set will contain three rows, each with one of the counts. (Plain UNION would merge rows whose counts happen to be equal.)
Sounds pretty simple if "all in a single query" does not disqualify subselects;
SELECT
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_ror=1 OR is_rog=1) cnt_ror_reg,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_u=1) cnt_u,
(SELECT COUNT(DISTINCT nr) FROM table1 WHERE is_unv=1) cnt_unv;
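If the three subselects feel heavy, the same distinct counts can probably be had in one pass by putting the condition inside COUNT(DISTINCT ...). The sketch below is one possible variant, not what the answer above proposes: it runs on the question's three rows in SQLite via Python's sqlite3 module, and it also shows the plain row counts next to the distinct nr counts. The table name table1 follows the answer above; the column aliases are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE table1 (
        id INTEGER, last_mod_dt TEXT, nr TEXT,
        is_u INTEGER, is_rog INTEGER, is_ror INTEGER, is_unv INTEGER
    )
""")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, "x", "uuid1", 1, 1, 1, 0),
     (2, "y", "uuid1", 1, 0, 1, 1),
     (3, "z", "uuid2", 1, 1, 1, 1)],
)

# CASE without ELSE yields NULL for non-matching rows, and
# COUNT(DISTINCT ...) ignores NULLs, so each column counts the
# distinct nr values among the rows that satisfy its condition.
row = conn.execute("""
    SELECT SUM(CASE WHEN is_ror = 1 OR is_rog = 1 THEN 1 ELSE 0 END) AS rows_ror_rog,
           COUNT(DISTINCT CASE WHEN is_ror = 1 OR is_rog = 1 THEN nr END) AS nr_ror_rog,
           COUNT(DISTINCT CASE WHEN is_u = 1 THEN nr END) AS nr_u,
           COUNT(DISTINCT CASE WHEN is_unv = 1 THEN nr END) AS nr_unv
    FROM table1
""").fetchone()

print(row)
```

Three rows match the first condition but they share only two nr values, which is the duplicate-nr situation the question describes.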
How about something like:
SELECT
SUM(IF(is_u > 0 AND is_rog > 0, 1, 0)) AS count_something,
...
from table
group by nr
I think it will do the trick
I am of course not sure what you want exactly, but I believe you can use the logic to produce your desired result.