SQL using a function to check existence of event over a user id? - sql

I have a table 'EVENTS' with a 'User' column and an 'Event' column:
User | Event
1    | a
1    | a
1    | a
1    | b
2    | b
2    | c
In the above example, user 1 has never had event c appear for them. I want to do something like
WITH table_a as (
SELECT
CASE WHEN EVENT = 'c' Then 'Y' ELSE 'n' end as event_occured,
user_id
FROM EVENTS)
and then get a result such as
User | is_occured
1    | n
2    | y
So I first tried it like this:
SELECT DISTINCT USER,'y' is_occured FROM table_a WHERE event_occured='y'
UNION
SELECT DISTINCT USER,'n' is_occured FROM table_a WHERE event_occured='n'
But this is obviously a bit clunky and will become unmanageable, especially as more columns are added to the event table and needed in the query. So next I tried a window function, but I'm not certain how to collapse the values to one row per user when I only care whether the event exists.
SELECT user,
CASE WHEN ... over(partition by user)
FROM EVENTS
But I'm very confused about how to proceed, or whether this is even the right track.

If you are purely trying to get a Y or N onto these, you can do a simple MAX with a case expression:
select [User]
, MAX(case when [Event] = 'c' then 'Y' else 'N' end) is_occurred
from [EVENTS]
group by [User]
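To see why MAX works here, note that 'Y' sorts after 'N', so a single matching row is enough to turn the whole group into 'Y'. Below is a minimal sketch that runs the same query against the sample data from the question. It uses Python's sqlite3 module purely for convenience; the lowercase table name and the user_id column name are assumptions for the demo, not the asker's schema.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "a"), (1, "b"), (2, "b"), (2, "c")],
)

# 'Y' > 'N' as strings, so MAX yields 'Y' if any row in the group matched.
rows = conn.execute(
    """
    SELECT user_id,
           MAX(CASE WHEN event = 'c' THEN 'Y' ELSE 'N' END) AS is_occurred
    FROM events
    GROUP BY user_id
    ORDER BY user_id
    """
).fetchall()

print(rows)  # [(1, 'N'), (2, 'Y')]
```

The trick depends on the ordering of the two labels; with labels such as 'yes' and 'no' it still works, but with arbitrary labels it is safer to aggregate 1/0 and map to text afterwards.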
If you wanted to avoid group by, you could do a window function:
select distinct [User]
, MAX(case when [Event] = 'c' then 'Y' else 'N' end) over (partition by [User])
from [EVENTS]
If you wanted to have this as a function, you could parameterize the [Event] comparison and pass the user as well to something like:
select MAX(case when [Event] = @p_checked_event then 'Y' else 'N' end)
from [EVENTS]
where [User] = @p_checked_user
Return the results of that query, and call it like:
select distinct [User]
, CheckEventOccurred([User], 'c')
from [EVENTS]

Related

Optimizing code with multiple conditions on multiple tables?

I want to check whether these customers have a LEAD action or a SELL action, which are stored in two other tables. However, the query takes forever to finish.
create table ct_nguyendang.visitor
as
select user_id, updated_at::date,
case
when user_id in (select distinct d_visitor_id from xiti.lead_detail) then 'lead'
else 'None'
end as lead_action,
case
when user_id in (select distinct account_id from ct_nguyendang.daily_listor) then 'sell'
else 'None'
end as sell_action
I think you can use union all and aggregation:
select user_id, max(is_lead) as has_lead, max(is_sale) as has_sale
from ((select d_visitor_id as user_id, 1 as is_lead, 0 as is_sale
from xiti.lead_detail
) union all
(select account_id, 0, 1
from ct_nguyendang.daily_listor
)
) ls
group by user_id;
If you have a table of users, then you can use correlated subqueries:
select u.*,
(case when exists (select 1
from xiti.lead_detail l
where u.user_id = l.d_visitor_id
)
then 1 else 0
end) as has_lead,
(case when exists (select 1
from ct_nguyendang.daily_listor s
where u.user_id = s.account_id
)
then 1 else 0
end) as has_sale
from users u;
Note that I prefer using 1 for "true" and 0 for "false". Of course, you can use string values if you prefer.
To optimize this query, you want indexes on xiti.lead_detail(d_visitor_id) and ct_nguyendang.daily_listor(account_id).
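As a rough illustration of the EXISTS approach together with the suggested indexes, here is a small sketch using Python's sqlite3 module. The unqualified table names, the index names, and the sample rows are all assumptions made for the demo (SQLite has no schemas like xiti or ct_nguyendang).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY);
    CREATE TABLE lead_detail (d_visitor_id INTEGER);
    CREATE TABLE daily_listor (account_id INTEGER);

    -- Indexes on the lookup columns, as the answer recommends.
    CREATE INDEX idx_lead_visitor ON lead_detail (d_visitor_id);
    CREATE INDEX idx_listor_account ON daily_listor (account_id);

    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO lead_detail VALUES (1), (1), (3);
    INSERT INTO daily_listor VALUES (3);
    """
)

# One correlated EXISTS per flag; each can stop at the first matching row.
flags = conn.execute(
    """
    SELECT u.user_id,
           CASE WHEN EXISTS (SELECT 1 FROM lead_detail l
                             WHERE l.d_visitor_id = u.user_id)
                THEN 1 ELSE 0 END AS has_lead,
           CASE WHEN EXISTS (SELECT 1 FROM daily_listor s
                             WHERE s.account_id = u.user_id)
                THEN 1 ELSE 0 END AS has_sale
    FROM users u
    ORDER BY u.user_id
    """
).fetchall()

print(flags)  # [(1, 1, 0), (2, 0, 0), (3, 1, 1)]
```

User 1 appears twice in lead_detail but still produces a single row, which is the main advantage over joining the detail tables directly.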

Group data by one column and creating a new column based on a rows in each group

I have 'Task' and 'Start time' columns in my data. For each Task, there may be one or more Start times. What I want to do is categorize each task as an 'X' task if all its Start times are equal, and as a 'Y' task if its Start times are not all equal.
This is how the table should look:
This can be achieved by grouping by task, counting the distinct start times in each group, and using a case expression to return X or Y.
Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end)
from tasks
group by task;
Or use this if you want it to look exactly like the picture:
Select tasks.task, tasks.start_time, new.new
from tasks, (Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end) as new
from tasks
group by task) as new
where tasks.task = new.task;
You can view my solution here https://paiza.io/projects/Zu7IBFc-5tFBK8xDuf3hPg?language=mysql P.S. I just use Integer instead of date because I didn't feel like dealing with dates lol.
With group by task get the value of new_column for each task and then join to the table tasks:
select t.id, t.task, g.new_column
from tasks t inner join (
select
task,
(case when count(distinct starttime) = 1 then 'X' else 'Y' end) new_column
from tasks
group by task
) g on g.task = t.task
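The group-then-join-back pattern can be checked quickly with a small sketch like the one below, written with Python's sqlite3 module. The id column, the integer start times, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER, task TEXT, start_time INTEGER)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 10), (3, "B", 10), (4, "B", 20), (5, "C", 30)],
)

# Classify each task once in the subquery, then attach the label to every row.
labelled = conn.execute(
    """
    SELECT t.id, t.task, g.new_column
    FROM tasks t
    INNER JOIN (
        SELECT task,
               CASE WHEN COUNT(DISTINCT start_time) = 1 THEN 'X' ELSE 'Y' END
                   AS new_column
        FROM tasks
        GROUP BY task
    ) g ON g.task = t.task
    ORDER BY t.id
    """
).fetchall()

print(labelled)
# [(1, 'A', 'X'), (2, 'A', 'X'), (3, 'B', 'Y'), (4, 'B', 'Y'), (5, 'C', 'X')]
```

A task with a single row (task C here) counts as 'X', since one start time is trivially equal to itself.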

Use EXISTS in SQL for multiple select

I have a table STATUSES which contains the columns NAME and ACTIVE_FLAG. The NAME column may hold the values new, pending, or cancel. I want to output the count of each NAME where ACTIVE_FLAG = 'Y'.
My idea was to use EXISTS to select the records for a single NAME:
SELECT COUNT(*) AS PENDING
FROM STATUSES
WHERE EXISTS (select NAME from STATUSES where NAME='Pending' and ACTIVE_FLAG = 'Y')
Is there a way to get the counts for the other statuses in the same SQL statement?
This looks like a job for count and group by:
SELECT
name
, count(*)
FROM statuses
WHERE active_flag = 'Y'
GROUP BY name
You can use something like this, as I don't see any need for EXISTS:
SELECT sum(case when name='Pending' then 1 else 0 end) AS PENDING,
sum(case when name='new' then 1 else 0 end) AS NEW,
sum(case when name='cancel' then 1 else 0 end) AS CANCEL
FROM STATUSES
WHERE ACTIVE_FLAG = 'Y'
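Here is a minimal sketch of this conditional aggregation, run through Python's sqlite3 module. The sample rows are invented for the demo; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statuses (name TEXT, active_flag TEXT)")
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?)",
    [
        ("Pending", "Y"),
        ("Pending", "Y"),
        ("Pending", "N"),  # inactive, filtered out by the WHERE clause
        ("new", "Y"),
        ("cancel", "N"),   # inactive, so the cancel count stays at 0
    ],
)

# Each SUM counts only the rows whose name matches its CASE condition.
counts = conn.execute(
    """
    SELECT SUM(CASE WHEN name = 'Pending' THEN 1 ELSE 0 END) AS pending,
           SUM(CASE WHEN name = 'new'     THEN 1 ELSE 0 END) AS new,
           SUM(CASE WHEN name = 'cancel'  THEN 1 ELSE 0 END) AS cancel
    FROM statuses
    WHERE active_flag = 'Y'
    """
).fetchone()

print(counts)  # (2, 1, 0)
```

Unlike the GROUP BY version, this returns one row with one column per status, including statuses whose count is zero.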

How to add where condition if result count is greater than one

I want to build a SQL query that returns a unique id.
My problem is that I need to add another condition to the query if I get more than one result.
select u.id
from users u
where u.id in ('1','2','3')
and u.active = 'Y'
If I get more than one result, I need to add:
and u.active_contact = 'Y'
I tried to build this query
select * from (
select u.id, count(u.id) as results
from users u
where u.id in ('1','2','3')
and u.active = 'Y'
group by u.id
) tab
If(tab.results > 1) then
where tab.u.active_contact = 'Y'
end
Thanks in advance.
I hope I explained myself well enough.
Here's a different approach:
SELECT id
FROM (SELECT id,
(CASE WHEN active = 'Y' THEN 1 ELSE 0 END) + (CASE WHEN active_contact = 'Y' THEN 1 ELSE 0 END) AS actv
FROM users) t
WHERE actv > 0
ORDER BY actv DESC
LIMIT 1
The subquery adds a column, actv, that counts how many of active and active_contact are 'Y'. The outer SELECT keeps the rows where at least one flag is set, sorts them so that rows with both flags set come first, and returns the top one. The ORDER BY belongs in the outer query, because the ordering of a subquery is not guaranteed to survive. I believe this provides the intended result.
Among the possible ways to solve this, here are two.
1) Use the active_contact id. If there is none use another id.
select coalesce( max(case when active_contact = 'Y' then id end), max(id) ) as id
from users
where id in ('1','2','3')
and active = 'Y';
2) Sort with active_contact coming first. Then get the first record.
select id
from
(
select id
from users
where id in ('1','2','3')
and active = 'Y'
order by case when active_contact = 'Y' then 1 else 2 end
) where rownum = 1;
A method using Analytic functions
SELECT id
FROM (SELECT u.id
, u.active_contact
, count(*) OVER () actives
FROM users u
WHERE u.id IN ('1','2','3')
AND u.active = 'Y')
WHERE ( actives = 1
OR ( actives > 1
AND active_contact = 'Y'))
If there is more than one record where active = 'Y' AND active_contact = 'Y' it will return them all. If only one of these is required you will need to identify the criteria for choosing that one.
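The behaviour of the analytic version in both situations can be sketched as follows, using Python's sqlite3 module (window functions need SQLite 3.25 or later). The helper name matching_ids and the sample rows are hypothetical; the subquery alias t is added because some databases require one.

```python
import sqlite3

QUERY = """
SELECT id
FROM (SELECT u.id,
             u.active_contact,
             COUNT(*) OVER () AS actives  -- total active rows, repeated per row
      FROM users u
      WHERE u.id IN ('1', '2', '3')
        AND u.active = 'Y') t
WHERE actives = 1
   OR (actives > 1 AND active_contact = 'Y')
ORDER BY id
"""


def matching_ids(user_rows):
    """Load the given (id, active, active_contact) rows and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id TEXT, active TEXT, active_contact TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", user_rows)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


# Only one active user: the extra condition is not applied.
one_active = matching_ids([("1", "Y", "N"), ("2", "N", "Y"), ("3", "N", "N")])

# Several active users: only those with active_contact = 'Y' remain.
several_active = matching_ids([("1", "Y", "N"), ("2", "Y", "Y"), ("3", "Y", "N")])

print(one_active, several_active)  # ['1'] ['2']
```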

SQL using CASE in SELECT with GROUP BY. Need CASE-value but get row-value

So basically there is one question and one problem:
1. Question: when I have something like 100 columns in a table (with no key or unique index set) and I want to join or subselect that table with itself, do I really have to write out every column name?
2. Problem: the example below shows question 1 and my actual SQL statement problem.
Example:
A.FIELD1,
(SELECT CASE WHEN B.FIELD2 = 1 THEN B.FIELD3 ELSE null FROM TABLE B WHERE A.* = B.*) AS CASEFIELD1
(SELECT CASE WHEN B.FIELD2 = 2 THEN B.FIELD4 ELSE null FROM TABLE B WHERE A.* = B.*) AS CASEFIELD2
FROM TABLE A
GROUP BY A.FIELD1
The story is: if I don't put the CASE into its own select statement, then I have to put the actual column name into the GROUP BY, and the GROUP BY then groups on the actual value from the row rather than on the NULL produced by the CASE. Because of that I would have to either join or subselect on all columns, since there is no key and no unique index, or somehow find another solution.
DBServer is DB2.
So now to describing it just with words and no SQL:
I have "order items" which can be divided into "ZD" and "EK" (1 = ZD, 2 = EK) and can be grouped by "distributor". Even though an "order item" belongs to only one of the two "departments" (ZD, EK), the ZD and EK columns are always both filled. I need the grouping to consider the "department", and I want a new group to be created only when the value of the designated "department" column (ZD or EK) changes.
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
ZD,
EK,
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
This worked with the CASE expressions in the SELECT and ZD, EK in the GROUP BY. The only problem was that even if EK was not the designated DEPARTEMENT, a new group was still opened whenever EK changed, because the GROUP BY was using the real EK value and not the NULL from the CASE, as I already explained above.
And here, ladies and gentlemen, is the solution to the problem:
SELECT
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END) AS ZD,
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END) AS EK,
TABLE.DISTRIBUTOR,
sum(TABLE.SOMETHING) AS SOMETHING
FROM TABLE
GROUP BY
(CASE WHEN TABLE.DEPARTEMENT = 1 THEN TABLE.ZD ELSE null END),
(CASE WHEN TABLE.DEPARTEMENT = 2 THEN TABLE.EK ELSE null END),
TABLE.DISTRIBUTOR,
TABLE.DEPARTEMENT
@t-clausen.dk: Thank you!
@others: ...
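The effect of repeating the CASE expressions in the GROUP BY can be reproduced with a small sketch, here using Python's sqlite3 module. The table name order_items, the output aliases zd_group and ek_group, and the sample rows are assumptions for the demo; the aliases are deliberately different from the column names so that nothing in the GROUP BY can be mistaken for the raw column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items "
    "(departement INTEGER, zd TEXT, ek TEXT, distributor TEXT, something INTEGER)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Z1", "E1", "D1", 10),
        (1, "Z1", "E2", "D1", 5),  # EK differs, but ZD is the designated column
        (2, "Z1", "E1", "D1", 7),
        (2, "Z2", "E1", "D1", 3),  # ZD differs, but EK is the designated column
    ],
)

# Grouping on the CASE expressions ignores the non-designated column.
groups = conn.execute(
    """
    SELECT CASE WHEN departement = 1 THEN zd END AS zd_group,
           CASE WHEN departement = 2 THEN ek END AS ek_group,
           distributor,
           SUM(something) AS something
    FROM order_items
    GROUP BY CASE WHEN departement = 1 THEN zd END,
             CASE WHEN departement = 2 THEN ek END,
             distributor,
             departement
    ORDER BY departement
    """
).fetchall()

print(groups)  # [('Z1', None, 'D1', 15), (None, 'E1', 'D1', 10)]
```

Grouping on the raw zd and ek columns instead would split these four rows into four groups.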
Actually there is a wildcard equality test.
I am not sure why you would group by field1, that would seem impossible in your example. I tried to fit it into your question:
SELECT FIELD1,
CASE WHEN FIELD2 = 1 THEN FIELD3 END AS CASEFIELD1,
CASE WHEN FIELD2 = 2 THEN FIELD4 END AS CASEFIELD2
FROM
(
SELECT * FROM A
INTERSECT
SELECT * FROM B
) C
UNION -- results in a distinct
SELECT
FIELD1,
null,
null
FROM
(
SELECT * FROM A
EXCEPT
SELECT * FROM B
) C
This will fail for data types that are not comparable.
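To show what the two set operators contribute, here is a minimal sketch using Python's sqlite3 module. The lowercase table names a and b, the four columns, and the sample rows are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE a (field1 INTEGER, field2 INTEGER, field3 TEXT, field4 TEXT);
    CREATE TABLE b (field1 INTEGER, field2 INTEGER, field3 TEXT, field4 TEXT);

    INSERT INTO a VALUES (1, 1, 'x', 'y'), (2, 2, 'p', 'q'), (3, 1, 'm', 'n');
    -- Row 3 differs from table a in the last column only.
    INSERT INTO b VALUES (1, 1, 'x', 'y'), (2, 2, 'p', 'q'), (3, 1, 'm', 'other');
    """
)

# Rows that match on every column, without listing any column names.
in_both = conn.execute(
    "SELECT * FROM a INTERSECT SELECT * FROM b ORDER BY 1"
).fetchall()

# Rows of a that have no full-row match in b.
only_in_a = conn.execute(
    "SELECT * FROM a EXCEPT SELECT * FROM b ORDER BY 1"
).fetchall()

print(in_both)    # [(1, 1, 'x', 'y'), (2, 2, 'p', 'q')]
print(only_in_a)  # [(3, 1, 'm', 'n')]
```

Both operators compare whole rows positionally, so the two tables need the same number of columns in the same order.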
No, there's no wildcard equality test. You'd have to list every field you want tested individually. If you don't want to test each individual field, you could use a hack such as concatenating all the fields, e.g.
WHERE (a.foo + a.bar + a.baz) = (b.foo + b.bar + b.baz)
but either way, you're listing all of the fields.
I might tend to solve it something like this
WITH q as
(SELECT
DEPARTEMENT
, (CASE WHEN DEPARTEMENT = 1 THEN ZD
WHEN DEPARTEMENT = 2 THEN EK
ELSE null
END) AS GRP
, DISTRIBUTOR
, SOMETHING
FROM mytable
)
SELECT
DEPARTEMENT
, GRP
, DISTRIBUTOR
, sum(SOMETHING) AS SumTHING
FROM q
GROUP BY
DEPARTEMENT
, GRP
, DISTRIBUTOR
If you need to find all rows in TableA that match in TableB, how about INTERSECT or INTERSECT DISTINCT?
select * from A
INTERSECT DISTINCT
select * from B
However, if you only want rows from A where the entire row matches the values in a row from B, then why does your sample code take some values from A and others from B? If the row matches on all columns, then that would seem pointless. (Perhaps your question could be explained a bit more fully?)