SQL CASE statement with IN and GROUP BY

I have a two-column table with the columns "user_name" and "characteristic". Each user_name may appear multiple times, each time with a different characteristic.
The values in characteristic are:
Online
Instore
Account
Email
I want to write a SQL statement like the following, but it doesn't work:
SELECT user_name,
case
when characteristic in ("online","instore") then 1
else 0
END as purchase_yn,
case
when characteristic in ("online","instore") and
characteristic in ("email",'account') then 1
else 0
END as purchaser_with_account
FROM my_table
GROUP BY user_name;
Essentially, the first field is a flag that checks for the presence of either value for that user_name.
The second field should be set when the user meets this criterion AND also has either 'email' or 'account'.

An example of the structure of your data would help us understand what you are trying to accomplish, but I think I get what you are trying to do.
When you use GROUP BY, every selected column that is not in the GROUP BY has to be wrapped in an aggregate function, something like SUM or AVG.
But first you need to build a pivot of your data, and then you can use that pivot to check your criteria.
The following creates a pivot that shows, for each record, which criteria are met:
SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table
Now what you probably want is to aggregate these entries per user_name and use the aggregates to build the fields you need. MAX is the safe choice here: it is 1 as soon as any row of the user meets the criterion, whereas an average only reaches 1 when every row does. For that, use the statement created earlier as an inline table:
Select
user_name,
case when max(online_yn + instore_yn) >= 1 then 1 else 0 end as purchase_yn,
case when max(online_yn + instore_yn) >= 1 and max(email_yn + account_yn) >= 1 then 1 else 0 end as purchaser_with_account
From
(SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table) flags
group by
user_name;
This should help.
It may not be efficient in terms of performance but you'll get what you want.

You just have to enclose the CASE expressions in COUNT aggregates:
SELECT user_name,
COUNT(case when characteristic in ('online','instore') then 1 END) as purchase_yn,
COUNT(case when characteristic in ('email','account') then 1 END) as user_with_account
FROM my_table
GROUP BY user_name
If purchase_yn > 0, then your first flag is set. If purchase_yn > 0 and user_with_account > 0, then your second flag is set as well.
Note: You have to remove ELSE 0 from the CASE expressions because COUNT counts every non-NULL value, including 0.
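To see the conditional COUNT idea end to end, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample users (alice, bob, carol) and the lowercase characteristic values are made up for illustration, and the outer CASE that turns the counts into 0/1 flags is my own addition, not part of the answer above.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_name TEXT, characteristic TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("alice", "online"), ("alice", "email"),
        ("bob", "account"),
        ("carol", "instore"),
    ],
)

# COUNT ignores NULL, so a CASE without ELSE counts only the matching rows.
query = """
SELECT user_name,
       CASE WHEN COUNT(CASE WHEN characteristic IN ('online', 'instore') THEN 1 END) > 0
            THEN 1 ELSE 0 END AS purchase_yn,
       CASE WHEN COUNT(CASE WHEN characteristic IN ('online', 'instore') THEN 1 END) > 0
             AND COUNT(CASE WHEN characteristic IN ('email', 'account') THEN 1 END) > 0
            THEN 1 ELSE 0 END AS purchaser_with_account
FROM my_table
GROUP BY user_name
ORDER BY user_name
"""
flags = conn.execute(query).fetchall()
print(flags)  # [('alice', 1, 1), ('bob', 0, 0), ('carol', 1, 0)]
```

Note that SQLite compares strings case-sensitively by default, so 'Online' would not match 'online' here; other databases may behave differently depending on collation.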

You haven't mentioned a specific RDBMS, but if SUM(DISTINCT ...) is available the following is quite nice:
SELECT
user_name,
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
ELSE 0
END) AS purchase_yn,
CASE WHEN (
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
WHEN characteristic in ('email','account') THEN 2
ELSE 0 END
)
) = 3 THEN 1 ELSE 0 END as purchaser_with_account
FROM
my_table
GROUP BY
user_name
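SQLite happens to support SUM(DISTINCT ...), so the trick can be tried out quickly from Python. This is only a sketch with invented sample users; "dave" has both 'online' and 'instore' to show why DISTINCT matters: without it his purchase_yn would come out as 2.

```python
import sqlite3

# Hypothetical sample data for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_name TEXT, characteristic TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("alice", "online"), ("alice", "email"),
        ("bob", "account"),
        ("carol", "instore"),
        ("dave", "online"), ("dave", "instore"),
    ],
)

# Purchases map to 1 and account/email to 2, so the distinct values
# sum to 3 only when a user has at least one row of each kind.
query = """
SELECT user_name,
       SUM(DISTINCT CASE WHEN characteristic IN ('online', 'instore') THEN 1
                         ELSE 0 END) AS purchase_yn,
       CASE WHEN SUM(DISTINCT CASE WHEN characteristic IN ('online', 'instore') THEN 1
                                   WHEN characteristic IN ('email', 'account') THEN 2
                                   ELSE 0 END) = 3
            THEN 1 ELSE 0 END AS purchaser_with_account
FROM my_table
GROUP BY user_name
ORDER BY user_name
"""
result = conn.execute(query).fetchall()
print(result)
```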

If I understand correctly, when a user has 'online' or 'instore', you want 1 in the purchase_yn column for that user, and when the user also has 'email' or 'account', you want 1 in the purchaser_with_account column.
If this is correct, then one way is:
with your_table(user_name, characteristic) as(
select 1, 'online' union all
select 1, 'instore' union all
select 1, 'account' union all
select 1, 'email' union all
select 2, 'account' union all
select 2, 'email' union all
select 3, 'online'
)
-- below is actual query:
select your_table.user_name, coalesce(max(t1.purchase_yn), 0) as purchase_yn, coalesce(max(t2.purchaser_with_account), 0) as purchaser_with_account
from your_table
left join (SELECT user_name, 1 as purchase_yn from your_table where characteristic in('online','instore') ) t1
on your_table.user_name = t1.user_name
left join (SELECT user_name, 1 as purchaser_with_account from your_table where characteristic in('email', 'account') ) t2
on t1.user_name = t2.user_name
group by your_table.user_name

Related

How to check unique values in SQL

I have a table named Bank that contains a Bank_Value column. I need a calculated Bank_Value_Unique column that shows whether each Bank_Value exists somewhere else in the table (i.e. whether its count is greater than 1).
I prepared this query, but it does not work. Could anyone help me with this and/or modify this query?
SELECT
CASE
WHEN NULLIF(LTRIM(RTRIM(Bank_Value)), '') =
(SELECT Bank_Value
FROM [Bank]
GROUP BY Bank_Value
HAVING COUNT(*) = 1)
THEN '0' ELSE '1'
END AS Bank_Key_Unique
FROM [Bank]
A windowed count should work:
SELECT
*,
CASE
COUNT(*) OVER (PARTITION BY Bank_Value)
WHEN 1 THEN 1 ELSE 0
END AS Bank_Value_Unique
FROM
Bank
;
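For anyone who wants to try the windowed count, here is a small sketch using SQLite from Python. It assumes SQLite 3.25 or later (the first version with window functions), and the Bank rows are made up.

```python
import sqlite3

# Hypothetical Bank table with one duplicated value ('A').
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Bank (Bank_Value TEXT)")
conn.executemany("INSERT INTO Bank VALUES (?)", [("A",), ("B",), ("A",), ("C",)])

# COUNT(*) OVER (PARTITION BY ...) attaches the group size to every row
# without collapsing the rows the way GROUP BY would.
query = """
SELECT Bank_Value,
       CASE COUNT(*) OVER (PARTITION BY Bank_Value)
            WHEN 1 THEN 1 ELSE 0
       END AS Bank_Value_Unique
FROM Bank
ORDER BY Bank_Value
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('A', 0), ('A', 0), ('B', 1), ('C', 1)]
```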
That works too, but I also found a solution:
select CASE WHEN NULLIF(LTRIM(RTRIM(Bank_Value)),'') =
(select Bank_Value
from Bank
group by Bank_Value
having (count(distinct Bank_Value) > 2 )) THEN '1' ELSE '0' END AS
Bank_Value_Uniquness
from Bank
It was missing "distinct" in having part.

Combining multiple rows with the same ID, but different 'Yes'/'No' values for several columns, into one row showing all 'Yes'/'No' values

For the above table, I need to reduce the rows down to one per Filter ID and have all the possible yes/no values showing for that particular Filter Id
for example:
Filter ID | Outpatient Prescriptions | Opioid Outpatient Prescriptions | ... | IP Pharmacy Medication Orders - Component Level
1 | Yes | Yes | ... | No
How is this achieved?
If I understand your question, for each partition of FilterID value, you want any field that has a yes to be aggregated up as 'Yes', otherwise 'No'. If you group by FilterID then you can handle the rollup using a CASE SUM CASE.
SELECT
FilterID,
Field1Response = CASE WHEN SUM(CASE WHEN Field1='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END,
Field2Response = CASE WHEN SUM(CASE WHEN Field2='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END,
Field3Response = CASE WHEN SUM(CASE WHEN Field3='Yes' THEN 1 ELSE 0 END) > 0 THEN 'Yes' ELSE 'No' END
...
FROM
Data
GROUP BY
FilterID
Given the nature of the data, you can also simply use MAX. This is not a good habit to get into, because the values may change over time; however, if the values are always 'Yes' or 'No', MAX works because 'Yes' sorts after 'No':
SELECT
FilterID,
Field1Response = MAX(Field1),
Field2Response = MAX(Field2),
Field3Response = MAX(Field3)
...
FROM
Data
GROUP BY
FilterID
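Both rollups can be compared side by side in a quick SQLite sketch from Python. The table contents and the two-field layout are invented for illustration, and since the `Alias = expression` form is SQL Server syntax, the sketch uses `AS` aliases instead. A single 'Yes' in the group should be enough, so the SUM is compared with `> 0`.

```python
import sqlite3

# Hypothetical data: FilterID 1 has one 'Yes' in Field1, FilterID 2 has one in Field2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Data (FilterID INTEGER, Field1 TEXT, Field2 TEXT)")
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?)",
    [(1, "Yes", "No"), (1, "No", "No"), (2, "No", "Yes")],
)

query = """
SELECT FilterID,
       -- explicit rollup: 'Yes' if at least one row in the group says 'Yes'
       CASE WHEN SUM(CASE WHEN Field1 = 'Yes' THEN 1 ELSE 0 END) > 0
            THEN 'Yes' ELSE 'No' END AS Field1Response,
       -- shortcut: relies on 'Yes' sorting after 'No'
       MAX(Field2) AS Field2Response
FROM Data
GROUP BY FilterID
ORDER BY FilterID
"""
rolled_up = conn.execute(query).fetchall()
print(rolled_up)  # [(1, 'Yes', 'No'), (2, 'No', 'Yes')]
```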

SQL - Subselect in select clause - how to create column which decides uniqueness logic

I am trying to write a subselect that runs through the returned data, checks the status of every row, and then decides the uniqueness logic.
Is there any way to achieve the following?
If any row has 'Active' status, the first such row is marked as 1 and everything else as 0.
If there is no 'Active' status, the first 'Expired' row is marked as 1 and everything else as 0.
If there is neither 'Active' nor 'Expired' status, the first 'In Progress' row is marked as 1 and everything else as 0.
I tried to write it like this, but I need to have it in one CASE statement:
SELECT a.id, a.status,
,(SELECT
CASE WHEN b.STATUS = 'Active' THEN 1 ELSE 0 END
CASE WHEN b.STATUS = 'Expired' THEN 1 ELSE 0 END
FROM b.TABLE
WHERE a.id=b.id )AS unique
FROM my.TABLE
Result should look like https://i.stack.imgur.com/qCA74.png picture for expired case
Thank you in advance for any tips.
Use a window function:
select t.*,
(case when row_number() over (partition by id
order by case status when 'Active' then 1 when 'Expired' then 2 else 3 end
) = 1
then 1 else 0
end) as unique_flag
from my.table t;
If the lookup table is the same as the source table, then you can use the LAG function with a constant and rely on its default value to mark the first row with 1 and all others with 0. But you need to order your rows by some additional fields to deal with duplicate statuses.
select a.id, a.status,
lag(0, 1, 1) over(
partition by a.id
order by
case a.status
when 'Active' then 0
when 'Expired' then 1
else 3
end asc,
a.some_more_columns asc /*To find that first row when there are duplicates by status*/
) as unique_flag
from MY_TABLE a
As for object naming: never use keywords as identifiers. Calling a date column date, a users table users, and some unknown table table makes your design error-prone.

Optimizing code with multiple conditions on multiple tables?

I want to check whether these customers have a LEAD action or a SELL action, both of which are stored in other tables. However, it takes forever to finish.
create table ct_nguyendang.visitor
as
select user_id, updated_at::date,
case
when user_id in (select distinct d_visitor_id from xiti.lead_detail) then 'lead'
else 'None'
end as lead_action,
case
when user_id in (select distinct account_id from ct_nguyendang.daily_listor) then 'sell'
else 'None'
end as sell_action
I think you can use union all and aggregation:
select user_id, max(is_lead) as has_lead, max(is_sale) as has_sale
from ((select d_visitor_id as user_id, 1 as is_lead, 0 as is_sale
from xiti.lead_detail
) union all
(select account_id, 0, 1
from ct_nguyendang.daily_listor
)
) ls
group by user_id;
If you have a table of users, then you can use correlated subqueries:
select u.*,
(case when exists (select 1
from xiti.lead_detail l
where u.user_id = l.d_visitor_id
)
then 1 else 0
end) as has_lead,
(case when exists (select 1
from ct_nguyendang.daily_listor s
where u.user_id = s.account_id
)
then 1 else 0
end) as has_sale
from users u;
Note that I prefer using 1 for "true" and 0 for "false". Of course, you can use string values if you prefer.
To optimize this query, you want indexes on xiti.lead_detail(d_visitor_id) and ct_nguyendang.daily_listor(account_id).
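As a rough illustration of the EXISTS version together with the suggested indexes, here is a sketch against in-memory SQLite from Python. The schema prefixes (xiti., ct_nguyendang.) are dropped because SQLite has no schemas, and the users table, the index names, and the sample ids are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY);    -- hypothetical users table
CREATE TABLE lead_detail (d_visitor_id INTEGER);
CREATE TABLE daily_listor (account_id INTEGER);
-- indexes on the lookup columns let each EXISTS probe stop at the first match
CREATE INDEX idx_lead_visitor ON lead_detail (d_visitor_id);
CREATE INDEX idx_listor_account ON daily_listor (account_id);
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO lead_detail VALUES (1), (1), (3);
INSERT INTO daily_listor VALUES (3);
""")

query = """
SELECT u.user_id,
       CASE WHEN EXISTS (SELECT 1 FROM lead_detail l
                         WHERE l.d_visitor_id = u.user_id)
            THEN 1 ELSE 0 END AS has_lead,
       CASE WHEN EXISTS (SELECT 1 FROM daily_listor s
                         WHERE s.account_id = u.user_id)
            THEN 1 ELSE 0 END AS has_sale
FROM users u
ORDER BY u.user_id
"""
actions = conn.execute(query).fetchall()
print(actions)  # [(1, 1, 0), (2, 0, 0), (3, 1, 1)]
```

User 1 appears twice in lead_detail but still yields a single row, which is the main practical difference from joining the tables directly.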

Use EXISTS in SQL for multiple select

I have a table STATUSES which contains the columns NAME and ACTIVE_FLAG. NAME may hold the values new, pending, or cancel. I want to output the count for each NAME where ACTIVE_FLAG = 'Y'.
I thought of using EXISTS to select the records for a single NAME:
SELECT COUNT(*) AS PENDING
FROM STATUSES
WHERE EXISTS (select NAME from STATUSES where NAME='Pending' and ACTIVE_FLAG = 'Y')
Is there any way to get the counts for the other statuses in the same SQL statement?
This seems like a job for COUNT and GROUP BY:
SELECT
name
, count(*)
FROM statuses
WHERE active_flag = 'Y'
GROUP BY name
You can use something like this, as I don't see any need for EXISTS:
SELECT sum(case when name='Pending' then 1 else 0 end) AS PENDING,
sum(case when name='new' then 1 else 0 end) AS NEW,
sum(case when name='cancel' then 1 else 0 end) AS CANCEL
FROM STATUSES
WHERE ACTIVE_FLAG = 'Y'
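The two suggestions return the same counts in different shapes: one row per status versus one column per status. A small sketch with SQLite from Python shows both; the sample rows are made up.

```python
import sqlite3

# Hypothetical rows: three 'Pending' (one inactive), one active 'new', one inactive 'cancel'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STATUSES (NAME TEXT, ACTIVE_FLAG TEXT)")
conn.executemany(
    "INSERT INTO STATUSES VALUES (?, ?)",
    [("Pending", "Y"), ("Pending", "Y"), ("Pending", "N"),
     ("new", "Y"), ("cancel", "N")],
)

# One row per status; statuses with no active rows do not appear at all.
per_row = conn.execute("""
    SELECT NAME, COUNT(*)
    FROM STATUSES
    WHERE ACTIVE_FLAG = 'Y'
    GROUP BY NAME
    ORDER BY NAME
""").fetchall()

# One column per status; a status with no active rows shows up as 0.
per_column = conn.execute("""
    SELECT SUM(CASE WHEN NAME = 'Pending' THEN 1 ELSE 0 END) AS PENDING,
           SUM(CASE WHEN NAME = 'new'     THEN 1 ELSE 0 END) AS NEW,
           SUM(CASE WHEN NAME = 'cancel'  THEN 1 ELSE 0 END) AS CANCEL
    FROM STATUSES
    WHERE ACTIVE_FLAG = 'Y'
""").fetchone()

print(per_row)     # [('Pending', 2), ('new', 1)]
print(per_column)  # (2, 1, 0)
```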