CASE function for multiple instances - sql

I am looking to return a table with each column tagging whether or not a user used a certain product in a given month. The data appears as below:
User Date Product
001 1/1/2019 A
001 1/1/2019 A
001 1/1/2019 B
002 1/1/2019 A
002 1/1/2019 A
003 1/1/2019 C
004 1/1/2019 A
004 1/1/2019 B
004 1/1/2019 C
I would like the SQL code to result in this (i.e. a column for each product type, with a 1 or 0 if that product appeared in the row in the data):
User A B C
001 1 1 0
002 1 0 0
003 0 0 1
004 1 1 1
Currently I end up with only a single 1 in each column, even when multiple instances exist. (This is in SQL Server.)
Thanks!

You can use conditional aggregation:
-- user is a reserved keyword in SQL Server, so the column name is bracketed
select [user],
max(case when product = 'A' then 1 else 0 end) as a,
max(case when product = 'B' then 1 else 0 end) as b,
max(case when product = 'C' then 1 else 0 end) as c
from t
group by [user]
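As a quick check of the conditional aggregation above, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite only stands in for SQL Server here, and the column is named user_id (an assumption made for this sketch) so that no keyword quoting is needed.

```python
import sqlite3

# Sample rows from the question: (user_id, date, product).
rows = [
    ("001", "1/1/2019", "A"), ("001", "1/1/2019", "A"), ("001", "1/1/2019", "B"),
    ("002", "1/1/2019", "A"), ("002", "1/1/2019", "A"),
    ("003", "1/1/2019", "C"),
    ("004", "1/1/2019", "A"), ("004", "1/1/2019", "B"), ("004", "1/1/2019", "C"),
]

conn = sqlite3.connect(":memory:")
# user_id is a hypothetical column name chosen for this sketch.
conn.execute("CREATE TABLE t (user_id TEXT, date TEXT, product TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

# MAX over a 0/1 CASE yields 1 if the product appears at least once for the user.
product_flags = conn.execute("""
    SELECT user_id,
           MAX(CASE WHEN product = 'A' THEN 1 ELSE 0 END) AS a,
           MAX(CASE WHEN product = 'B' THEN 1 ELSE 0 END) AS b,
           MAX(CASE WHEN product = 'C' THEN 1 ELSE 0 END) AS c
    FROM t
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

for row in product_flags:
    print(row)
```

Because MAX collapses duplicates, a user with two rows for product A still gets a single 1 in column a.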

Related

Using SQL to categorize multiple rows of a column in Snowflake

A tricky conundrum I'm trying to figure out in Snowflake.
Let's say I have data like this
ID tag
001 A
001 A
002 B
003 A
004 1
003 1
005 B
005 2
004 B
002 C
006 A
006 2
006 A
My goal is to assign each ID a single category based on the following criteria, evaluated across ALL rows of a given ID:
If any row for the ID has tag 1 and/or A, then "GROUPA"
If any row for the ID has tag 2 and/or B, then "GROUPB"
If 1 AND B appear for the same ID, OR if 2 AND A appear for the same ID, then NULL
Any other values can be ignored; I only care about 1, 2, A, B, and each ID will have a row with at least one of these.
So the resulting DF will be...
ID GROUP
001 GROUPA
002 GROUPB
003 GROUPA
004 NULL
005 GROUPB
006 NULL
Notice, 004 and 006 were nulled out because in 004 both 1 and B appeared. Similarly, even though A appeared twice in 006, the 2 does not match and thus is NULL.
You can use conditional aggregation, here with COUNT_IF:
SELECT
ID,
CASE WHEN COUNT_IF(tag IN ('1','A')) > 0 AND COUNT_IF(tag IN ('2','B')) > 0 THEN NULL
WHEN COUNT_IF(tag IN ('1','A')) > 0 THEN 'GROUPA'
WHEN COUNT_IF(tag IN ('2','B')) > 0 THEN 'GROUPB'
END AS grp
FROM tab
WHERE tag IN ('1', '2', 'A', 'B')
GROUP BY ID
ORDER BY ID;
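COUNT_IF is specific to Snowflake, so the sketch below checks the same logic in SQLite (via Python's sqlite3) by swapping in SUM(CASE ... END) > 0 as a portable substitute. The table name tab and the sample rows come from the question; everything else is an assumption of the sketch.

```python
import sqlite3

# Sample rows from the question: (ID, tag).
rows = [
    ("001", "A"), ("001", "A"), ("002", "B"), ("003", "A"), ("004", "1"),
    ("003", "1"), ("005", "B"), ("005", "2"), ("004", "B"), ("002", "C"),
    ("006", "A"), ("006", "2"), ("006", "A"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (ID TEXT, tag TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", rows)

# SUM(CASE ...) > 0 stands in for Snowflake's COUNT_IF(...) > 0.
groups = conn.execute("""
    SELECT ID,
           CASE
               WHEN SUM(CASE WHEN tag IN ('1', 'A') THEN 1 ELSE 0 END) > 0
                AND SUM(CASE WHEN tag IN ('2', 'B') THEN 1 ELSE 0 END) > 0 THEN NULL
               WHEN SUM(CASE WHEN tag IN ('1', 'A') THEN 1 ELSE 0 END) > 0 THEN 'GROUPA'
               WHEN SUM(CASE WHEN tag IN ('2', 'B') THEN 1 ELSE 0 END) > 0 THEN 'GROUPB'
           END AS grp
    FROM tab
    GROUP BY ID
    ORDER BY ID
""").fetchall()

for row in groups:
    print(row)  # SQL NULL is shown as None in Python
```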

Redshift Row_Number() Query with partitions that restart

I have data with id, timestamp (ts), event, capital_event_bool, and prev_event_capital_bool.
id ts event capital_event_bool prev_event_capital_bool
001 00:01 a 0 0
002 00:02 b 0 0
002 00:03 b 0 0
002 00:04 b 1 0
002 00:05 c 0 1
003 00:03 c 0 0
003 00:04 b 0 0
003 00:05 b 1 0
003 00:06 b 0 1
003 00:07 b 0 0
003 00:08 b 1 0
Only "b" events can have a capital_event_bool = True.
What I would like to accomplish is have a way to count all capital_event_bool = False b events prior to every capital_event_bool = True event for every id. I originally thought I could accomplish this via the row_number() window function in Redshift with
ROW_NUMBER() OVER (PARTITION BY id, event, capital_event_bool ORDER BY ts) AS row_num
but the part that is tripping me up is how to get the count to restart after every capital_event_bool = True event. It is fine if the row numbering will stop at every capital_event_bool = True event and then restart because I can just use a case statement with the capital_event_bool to reach my final result.
row_num DESIRED only row_num Final Desired Result
1 1 0
1 1 0
2 2 0
1 3 2
1 1 0
2 2 0
1 1 0
1 2 1
2 1 0
3 2 0
2 3 2
This is a type of gaps-and-islands problem. You need to split each id's rows into groups, where each group ends at a row with capital_event_bool = 1. A reverse cumulative sum of capital_event_bool (ordered by ts descending) does exactly that: it assigns the same group number to each such row and to the rows leading up to it. Then, you can use window functions on this group:
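select t.*,
(case when capital_event_bool = 1
then sum( (event = 'b')::int ) over (partition by id, grp) - 1 -- "b" rows in the group, excluding the capital event itself
else 0
end) as final_result
from (select t.*,
-- Redshift requires an explicit frame when an aggregate window function has an ORDER BY
sum(capital_event_bool) over (partition by id order by ts desc
rows between unbounded preceding and current row) as grp
from t
) t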
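To see the grouping at work, here is a minimal sketch that runs the same idea on the question's sample rows in SQLite through Python's sqlite3. It assumes an SQLite build with window function support (3.25 or later), and it replaces the Redshift cast (event = 'b')::int with a CASE expression; the alias g is only a name chosen for the sketch.

```python
import sqlite3

# Sample rows from the question: (id, ts, event, capital_event_bool).
rows = [
    ("001", "00:01", "a", 0),
    ("002", "00:02", "b", 0), ("002", "00:03", "b", 0),
    ("002", "00:04", "b", 1), ("002", "00:05", "c", 0),
    ("003", "00:03", "c", 0), ("003", "00:04", "b", 0), ("003", "00:05", "b", 1),
    ("003", "00:06", "b", 0), ("003", "00:07", "b", 0), ("003", "00:08", "b", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id TEXT, ts TEXT, event TEXT, capital_event_bool INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

result = conn.execute("""
    SELECT g.id, g.ts,
           CASE WHEN g.capital_event_bool = 1
                -- count the "b" rows in the group, minus the capital event itself
                THEN SUM(CASE WHEN g.event = 'b' THEN 1 ELSE 0 END)
                         OVER (PARTITION BY g.id, g.grp) - 1
                ELSE 0
           END AS final_result
    FROM (SELECT t.*,
                 -- reverse cumulative sum: the group number changes at each capital event
                 SUM(capital_event_bool) OVER (
                     PARTITION BY id ORDER BY ts DESC
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
          FROM t) AS g
    ORDER BY g.id, g.ts
""").fetchall()

final_result = [r[2] for r in result]
print(final_result)
```

The printed list lines up row by row with the last column of the desired output table in the question.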

Display output in columns based on filter criteria

I am trying to display data in columns and sub-columns based on certain filter criteria using a CASE WHEN statement, but I am not getting the required output.
data:
ID ID2 Country Type
1 001 US A
1 009 US A
2 002 AU B
3 003 CA A
3 005 CA A
4 007 US B
5 001 FR B
6 003 US B
7 002 US A
8 004 NZ A
based on my current case statement, here is how my output looks:
Type Country Count
B Other 2
B US 1
B Subtotal 3
A Other 4
A US 3
A Subtotal 7
Total 10
I want to display the following format, bonus if I can get the subtotal/totals:
Type-A Type-B
US Other US Other
3 4 1 2
I also need Subtotals, and Grandtotals, but these need to be calculated separately.
SubTotal: 7 SubTotal: 3
Grand Total: 10
You can do it like this:
select
sum(case when Type = 'A' and Country = 'US' then 1 else 0 end) as US_TYPE_A,
sum(case when Type = 'A' and Country != 'US' then 1 else 0 end) as Other_TYPE_A,
sum(case when Type = 'B' and Country = 'US' then 1 else 0 end) as US_TYPE_B,
sum(case when Type = 'B' and Country != 'US' then 1 else 0 end) as Other_TYPE_B
from myTable
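The question also asks for subtotals and a grand total. One possible way, sketched here in SQLite through Python's sqlite3, is to add further conditional sums to the same single-row query. The Subtotal_A, Subtotal_B and Grand_Total column names are assumptions of this sketch, and the counts reflect only the ten sample rows, so they differ from the figures quoted in the question.

```python
import sqlite3

# Sample rows from the question: (ID, ID2, Country, Type).
rows = [
    (1, "001", "US", "A"), (1, "009", "US", "A"), (2, "002", "AU", "B"),
    (3, "003", "CA", "A"), (3, "005", "CA", "A"), (4, "007", "US", "B"),
    (5, "001", "FR", "B"), (6, "003", "US", "B"), (7, "002", "US", "A"),
    (8, "004", "NZ", "A"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (ID INTEGER, ID2 TEXT, Country TEXT, Type TEXT)")
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?, ?)", rows)

# One output row; each column counts the rows matching its own condition.
pivot_row = conn.execute("""
    SELECT
        SUM(CASE WHEN Type = 'A' AND Country = 'US'  THEN 1 ELSE 0 END) AS US_TYPE_A,
        SUM(CASE WHEN Type = 'A' AND Country != 'US' THEN 1 ELSE 0 END) AS Other_TYPE_A,
        SUM(CASE WHEN Type = 'A' THEN 1 ELSE 0 END)                     AS Subtotal_A,
        SUM(CASE WHEN Type = 'B' AND Country = 'US'  THEN 1 ELSE 0 END) AS US_TYPE_B,
        SUM(CASE WHEN Type = 'B' AND Country != 'US' THEN 1 ELSE 0 END) AS Other_TYPE_B,
        SUM(CASE WHEN Type = 'B' THEN 1 ELSE 0 END)                     AS Subtotal_B,
        COUNT(*)                                                        AS Grand_Total
    FROM myTable
""").fetchone()

print(pivot_row)
```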

Conditional Row Deleting in SQL

I have a table that contains 4 columns. I need to remove some of the rows based on the Code and ID columns. A code of 1 initiates the process I'm trying to track and a code of 2 terminates it. I would like to remove all rows for a specific ID when a code of 2 comes after a code of 1 and there is not an additional code 1. For example, my current data set looks like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
2 $70 2/8/2016 3
1 $120 1/3/2016 3
2 $0 6/15/2015 2
1 $120 3/22/2016 2
1 $50 8/15/2015 1
2 $200 8/1/2015 1
After I run my script I would like it to look like this:
Code Deposit Date ID
1 $100 3/2/2016 5
2 $0 3/1/2016 5
1 $120 2/8/2016 5
1 $120 3/22/2016 4
1 $50 8/15/2015 1
2 $200 8/1/2015 1
In all, I have about 150,000 IDs in my actual table, but this is the general idea.
You can get the ids using logic like this:
select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end) -- no code 1 after code2
It is then easy enough to incorporate this into a query to get the rest of the details:
select t.*
from t
where t.id not in (select t.id
from t
group by t.id
having max(case when code = 2 then date end) > min(case when code = 1 then date end) and -- code 2 after code 1
max(case when code = 2 then date end) > max(case when code = 1 then date end)
);
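Here is a minimal sketch of how the HAVING logic above behaves on the question's sample rows, using SQLite through Python's sqlite3. As an assumption of the sketch, the dates are rewritten as ISO strings so that text comparison follows chronological order, and the deposit amounts are stored as plain integers.

```python
import sqlite3

# Sample rows from the question: (code, deposit, date, id), dates in ISO form.
rows = [
    (1, 100, "2016-03-02", 5), (2, 0, "2016-03-01", 5), (1, 120, "2016-02-08", 5),
    (1, 120, "2016-03-22", 4),
    (2, 70, "2016-02-08", 3), (1, 120, "2016-01-03", 3),
    (2, 0, "2015-06-15", 2), (1, 120, "2016-03-22", 2),
    (1, 50, "2015-08-15", 1), (2, 200, "2015-08-01", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code INTEGER, deposit INTEGER, date TEXT, id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# ids where the latest code 2 comes after both the first and the last code 1.
flagged_sql = """
    SELECT t.id
    FROM t
    GROUP BY t.id
    HAVING MAX(CASE WHEN code = 2 THEN date END) > MIN(CASE WHEN code = 1 THEN date END)
       AND MAX(CASE WHEN code = 2 THEN date END) > MAX(CASE WHEN code = 1 THEN date END)
"""

flagged_ids = [r[0] for r in conn.execute(flagged_sql + " ORDER BY t.id")]
kept_ids = [r[0] for r in conn.execute(
    "SELECT DISTINCT id FROM t WHERE id NOT IN (" + flagged_sql + ") ORDER BY id"
)]

print("flagged:", flagged_ids)
print("kept:", kept_ids)
```

On these rows only ID 3 is flagged. ID 2 is kept because its code 2 row is dated before its code 1 row, which differs from the expected output shown in the question.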
The approach I took was to add up the Code values for each ID. If the sum equals exactly 3, the ID should be removed.
;WITH keepID as (
Select
ID
,SUM(code) as 'sumCode'
From #testInit
Group by ID
HAVING SUM(code) <> 3
)
Select *
From #testInit
Where ID IN (Select ID from keepID)
Your post shows ID = 1 being kept, which does not seem to fit the criteria. Are you sure you would keep ID = 1? It only has 2 records, one with a code of 1 and one with a code of 2, which add up to 3, so it would be removed.
I have only shown the logic of the approach; let me know if you need help with the delete code.
-- t is a placeholder for the actual table name
delete from t
where t.id in
(select B.id
from
(select id, max(date) as date from t where code = 1 group by id) as A
join
(select id, max(date) as date from t where code = 2 group by id) as B
on A.id = B.id
where B.date > A.date)
Explanation: the subquery aliased as A,
select id, max(date) as date from t where code = 1 group by id
fetches the latest code 1 date for each id, and the subquery aliased as B,
select id, max(date) as date from t where code = 2 group by id
fetches the latest code 2 date for each id.
Joining them on A.id = B.id and filtering on B.date > A.date selects all the ids for which the latest code 2 date is later than the latest code 1 date.
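A runnable sketch of this delete, again on the question's sample rows in SQLite through Python's sqlite3, might look as follows. The table name t, the last_date alias and the ISO date strings are assumptions of the sketch.

```python
import sqlite3

# Sample rows from the question: (code, deposit, date, id), dates in ISO form.
rows = [
    (1, 100, "2016-03-02", 5), (2, 0, "2016-03-01", 5), (1, 120, "2016-02-08", 5),
    (1, 120, "2016-03-22", 4),
    (2, 70, "2016-02-08", 3), (1, 120, "2016-01-03", 3),
    (2, 0, "2015-06-15", 2), (1, 120, "2016-03-22", 2),
    (1, 50, "2015-08-15", 1), (2, 200, "2015-08-01", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (code INTEGER, deposit INTEGER, date TEXT, id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Delete every row of an id whose latest code 2 date is after its latest code 1 date.
conn.execute("""
    DELETE FROM t
    WHERE id IN (
        SELECT b.id
        FROM (SELECT id, MAX(date) AS last_date FROM t WHERE code = 1 GROUP BY id) AS a
        JOIN (SELECT id, MAX(date) AS last_date FROM t WHERE code = 2 GROUP BY id) AS b
          ON a.id = b.id
        WHERE b.last_date > a.last_date
    )
""")

remaining_ids = [r[0] for r in conn.execute("SELECT DISTINCT id FROM t ORDER BY id")]
remaining_rows = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(remaining_ids, remaining_rows)
```

An id with no code 2 row (such as ID 4) never appears in b, so the inner join leaves it untouched.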

Is it possible to extract one column into four columns?

I'm not sure whether this is possible, and I have no idea how to do it.
This is my table, tbchecked:
[id] [status]
001 present
001 present
001 absent
001 absent
001 leave
001 present
001 present
002 present
002 absent
002 leave
Is it possible to output the following in gridview1? How should I write the query?
[id] [present] [absent] [leave]
001 4 2 1
002 1 1 1
You can do it this way.
select id,
sum(case when status = 'present' then 1 else 0 end) as present,
sum(case when status = 'absent' then 1 else 0 end) as absent,
sum(case when status = 'leave' then 1 else 0 end) as leave
from tbchecked
group by id
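The query above can be checked against the sample rows with a short sketch in SQLite through Python's sqlite3. The table and column names follow the question; SQLite is only a stand-in for whichever database sits behind the grid view.

```python
import sqlite3

# Sample rows from the question: (id, status).
rows = [
    ("001", "present"), ("001", "present"), ("001", "absent"), ("001", "absent"),
    ("001", "leave"), ("001", "present"), ("001", "present"),
    ("002", "present"), ("002", "absent"), ("002", "leave"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbchecked (id TEXT, status TEXT)")
conn.executemany("INSERT INTO tbchecked VALUES (?, ?)", rows)

# SUM over a 0/1 CASE counts the rows with each status per id.
status_counts = conn.execute("""
    SELECT id,
           SUM(CASE WHEN status = 'present' THEN 1 ELSE 0 END) AS present,
           SUM(CASE WHEN status = 'absent'  THEN 1 ELSE 0 END) AS absent,
           SUM(CASE WHEN status = 'leave'   THEN 1 ELSE 0 END) AS leave
    FROM tbchecked
    GROUP BY id
    ORDER BY id
""").fetchall()

for row in status_counts:
    print(row)
```

Using SUM here, rather than MAX, is what turns the per-status flag into a count.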