Group data by one column and create a new column based on the rows in each group - sql

I have 'Task' and 'Start time' columns in my data. For each Task, there may be one or more Start times. What I want to do is categorize each task as an 'X' task if all its Start times are equal, and as a 'Y' task if its Start times are not all equal.
This is how the table should look:

This can be achieved by using a group by and counting the distinct start time and using a case to return X or Y.
Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end) as new_column
from tasks
group by task;
or this if you want it to look exactly like the picture.
Select tasks.task, tasks.start_time, g.new_column
from tasks
join (Select task,
(case when count(distinct start_time) = 1 then 'X' else 'Y' end) as new_column
from tasks
group by task) as g
on tasks.task = g.task;
You can view my solution here: https://paiza.io/projects/Zu7IBFc-5tFBK8xDuf3hPg?language=mysql
P.S. I used integers instead of dates because I didn't feel like dealing with dates.

Use group by task to get the value of new_column for each task, then join that result back to the tasks table:
select t.id, t.task, g.new_column
from tasks t inner join (
select
task,
(case when count(distinct starttime) = 1 then 'X' else 'Y' end) new_column
from tasks
group by task
) g on g.task = t.task
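
To try the group-by-and-join approach end to end, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the integer start_time values, and the column name new_column are assumptions made up for illustration; they are not the asker's real data.

```python
import sqlite3

# Hypothetical sample data: integer start times stand in for real timestamps.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (task TEXT, start_time INTEGER)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [("A", 1), ("A", 1), ("B", 1), ("B", 2), ("C", 5)],
)

# Classify each task once in a grouped subquery, then join the label
# back so that every original row carries its task's category.
rows = conn.execute(
    """
    SELECT t.task, t.start_time, g.new_column
    FROM tasks t
    INNER JOIN (
        SELECT task,
               CASE WHEN COUNT(DISTINCT start_time) = 1 THEN 'X' ELSE 'Y' END AS new_column
        FROM tasks
        GROUP BY task
    ) g ON g.task = t.task
    ORDER BY t.task, t.start_time
    """
).fetchall()

for row in rows:
    print(row)
```

Task A (two identical start times) and task C (a single start time) come out as 'X'; task B, with two different start times, comes out as 'Y'.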

Related

SQL using a function to check existence of event over a user id?

I have a table 'EVENTS' with a user column, and an 'event' column
User  Event
1     a
1     a
1     a
1     b
2     b
2     c
In the above example, user 1 has never had event c appear for them. I want to do something like
WITH table_a as (
SELECT
CASE WHEN EVENT = 'c' THEN 'y' ELSE 'n' END as event_occured,
user_id
FROM EVENTS)
and then get a result such as
User  is_occured
1     n
2     y
So I first tried it like this:
SELECT DISTINCT USER,'y' is_occured FROM table_a WHERE event_occured='y'
UNION
SELECT DISTINCT USER,'n' is_occured FROM table_a WHERE event_occured='n'
But this is obviously a bit clunky and will become unmanageable, especially as more columns are added to the event table and needed in the query. So next I tried to do it with a window function, but I'm not certain how to collapse the values to one row per user when I'm only checking for existence.
SELECT user,
CASE WHEN ... over(partition by user)
FROM EVENTS
But I'm very confused about how to proceed, or whether this is even the right track.
If you only need a Y or N per user, you can take a simple MAX over a case expression. This works because 'Y' sorts after 'N', so MAX returns 'Y' as soon as any row for the user matches:
select [User]
, MAX(case when [Event] = 'c' then 'Y' else 'N' end) is_occurred
from [EVENTS]
group by [User]
If you wanted to avoid group by, you could do a window function:
select distinct [User]
, MAX(case when [Event] = 'c' then 'Y' else 'N' end) over (partition by [User]) as is_occurred
from [EVENTS]
If you wanted to have this as a function, you could parameterize the [Event] comparison, pass the user as well, and use something like:
select MAX(case when [Event] = @p_checked_event then 'Y' else 'N' end)
from [EVENTS]
where [User] = @p_checked_user
Return the results of that query, and call it like:
select distinct [User]
, CheckEventOccurred([User], 'c')
from [EVENTS]
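
For anyone who wants to check the MAX-over-CASE idea against the sample rows from the question, here is a small sketch using Python's sqlite3 module. The table and column names (events, user_id, event) and the helper event_occurred are assumptions for illustration, and SQLite is swapped in for SQL Server, so the bracket quoting is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT)")
# Same rows as the sample table in the question.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "a"), (1, "a"), (1, "a"), (1, "b"), (2, "b"), (2, "c")],
)


def event_occurred(connection, checked_event):
    """Return one (user_id, 'Y' or 'N') row per user for the given event."""
    return connection.execute(
        """
        SELECT user_id,
               MAX(CASE WHEN event = ? THEN 'Y' ELSE 'N' END) AS is_occurred
        FROM events
        GROUP BY user_id
        ORDER BY user_id
        """,
        (checked_event,),
    ).fetchall()


result_c = event_occurred(conn, "c")
result_a = event_occurred(conn, "a")
print(result_c)
print(result_a)
```

Passing the event as a bound parameter plays the same role as the parameterized function above: the query text stays fixed and only the event being checked changes.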

Is there a way to collect the data and inspect in one pass using groupby function

Sample Data of table_1
I have this query:
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0 END) AS num_activity_a_or_b
from table_1
group by customer
Results:
I want to extend this to return one more column: for a given code, say X1, if the customer has both activity "a" and activity "c", return num_of_a_and_c_activity.
I am a bit stuck on how to collect and inspect the code and activities in one pass.
Can we combine a window function with this to achieve it?
Please advise and help
UPDATE:
Based on the updated results, maybe the query below is what you need.
My assumption is that you need both a and c to be present, together with code x1.
So I count the distinct activities that are a or c and then do an integer division by 2. If only a is present, the distinct count is 1, but 1/2 = 0 in integer division.
The result is 1 only when both a and c are present.
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0
END) AS num_activity_a_or_b,
COUNT(DISTINCT CASE WHEN code IN ('x1') AND activity IN ( 'a','c')
THEN activity
ELSE NULL
END)/2 AS num_activity_a_and_c
from table_1
group by customer
Maybe your query can be
select
customer,
SUM(CASE WHEN activity IN ( 'a','b')
THEN 1
ELSE 0
END) AS num_activity_a_or_b,
SUM(CASE WHEN code IN ('x1') AND activity IN ( 'a','c')
THEN 1
ELSE 0
END) AS num_activity_a_or_c
from table_1
group by customer
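
Here is a rough sketch of the COUNT(DISTINCT ...)/2 trick from the UPDATE, run on SQLite through Python. The rows below are invented, since the original sample data is not shown. Note the assumption that `/` between two integers performs integer division, which holds in SQLite; in an engine where `/` returns a decimal (MySQL, for example) you would need its integer-division operator instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (customer TEXT, code TEXT, activity TEXT)")
# Hypothetical rows: c1 has both 'a' and 'c' under code x1, c2 only has 'a'.
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [
        ("c1", "x1", "a"),
        ("c1", "x1", "c"),
        ("c1", "x2", "b"),
        ("c2", "x1", "a"),
        ("c2", "x1", "a"),
        ("c2", "x2", "c"),
    ],
)

rows = conn.execute(
    """
    SELECT customer,
           SUM(CASE WHEN activity IN ('a', 'b') THEN 1 ELSE 0 END) AS num_activity_a_or_b,
           -- 2 distinct activities / 2 = 1; 1 distinct activity / 2 = 0 (integer division)
           COUNT(DISTINCT CASE WHEN code IN ('x1') AND activity IN ('a', 'c')
                               THEN activity END) / 2 AS num_activity_a_and_c
    FROM table_1
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for row in rows:
    print(row)
```

Customer c2 gets 0 in the last column because its only 'c' row belongs to code x2, so under x1 it has just one distinct activity.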

Count rows with and combine them with math operator

I have a table with different events for a case. I want to calculate how many times event A occurs for each case, minus the number of times event B occurs.
I started with code like this, but it does not work.
SELECT
((SELECT
case_id,
Count(*)
from database
where event = "A"
Group by case_id)
-
(SELECT
case_id,
Count(*)
from database
where event = "B"
Group by case_id)) as count,
case-id,
from database
I suspect that you want conditional aggregation:
select case_id,
( sum(case when event = 'A' then 1 else 0 end) -
sum(case when event = 'B' then 1 else 0 end)
) as diff
from database
group by case_id;
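
A quick way to convince yourself that conditional aggregation gives the A-minus-B difference per case is to run it on a few rows. This is only a sketch: the table name events (used here instead of database) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (case_id INTEGER, event TEXT)")
# Hypothetical rows: case 1 has 2 A and 1 B, case 2 has 1 A and 2 B,
# case 3 has 1 A, no B, and one unrelated event.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "A"), (1, "A"), (1, "B"), (2, "B"), (2, "B"), (2, "A"), (3, "A"), (3, "C")],
)

rows = conn.execute(
    """
    SELECT case_id,
           SUM(CASE WHEN event = 'A' THEN 1 ELSE 0 END)
         - SUM(CASE WHEN event = 'B' THEN 1 ELSE 0 END) AS diff
    FROM events
    GROUP BY case_id
    ORDER BY case_id
    """
).fetchall()

for row in rows:
    print(row)
```

Both counts are computed in a single scan of the table, and a case with no B events still produces a row because the second SUM is simply 0.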

Partition table based on joined table

We have 2 Tables Lead and Task.
One lead can have multiple Tasks.
We want to determine if a Lead has a Task whose description contains the string 'x'.
If the Lead has such a Task, it should belong to group1; if it doesn't, to group2.
Then we want to count the leads per group and week.
The problem we have is that if a Lead has several tasks and one of them has string 'x' in its description and the others don't it is counted in both groups.
We would need something that resembles a break; statement in the IFF clause of the subquery, so that if the first condition = Contain string x is satisfied the other tasks are not counted anymore.
How would we achieve that?
So far we have the following statement:
--SQL:
SELECT LeadDate, GROUP, COUNT(LEAD_ID_T1)
FROM LEAD Lead INNER JOIN
(SELECT DISTINCT LEAD.ID AS LEAD_ID_T1,
IFF(CONTAINS(Task.DESCRIPTION,
'x'),
'GROUP1',
'GROUP2') AS GROUP
FROM TASK Task
RIGHT JOIN LEAD ON TASK.WHO_ID = LEAD.ID
) T1 ON T1.LEAD_ID_T1 = LEAD.ID
GROUP BY LeadDate,GROUP;
The code breaks because it cannot aggregate the measures.
Really thankful for any input. This has been bothering me for a few days now.
I am thinking EXISTS with a CASE expression:
select l.*,
(case when exists (select 1
from task t
where t.who_id = l.id and
t.description like '%x%'
)
then 'GROUP1' else 'GROUP2'
end) as the_group
from lead l;
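
To show how the EXISTS classification avoids the double counting and then feeds the per-week count, here is a small sketch on SQLite via Python. The table names leads and tasks, the lead_week column, and all rows are assumptions for illustration; the original schema uses LeadDate and Snowflake-style functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE leads (id INTEGER, lead_week TEXT)")
conn.execute("CREATE TABLE tasks (who_id INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO leads VALUES (?, ?)",
    [(1, "2021-W01"), (2, "2021-W01"), (3, "2021-W02")],
)
# Lead 1 has one matching and one non-matching task; lead 3 has no tasks.
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?)",
    [(1, "call about x"), (1, "send mail"), (2, "follow up")],
)

rows = conn.execute(
    """
    SELECT lead_week, the_group, COUNT(*) AS n_leads
    FROM (
        -- One row per lead: EXISTS is true as soon as any task matches,
        -- so a lead can never land in both groups.
        SELECT l.id,
               l.lead_week,
               CASE WHEN EXISTS (SELECT 1
                                 FROM tasks t
                                 WHERE t.who_id = l.id
                                   AND t.description LIKE '%x%')
                    THEN 'GROUP1' ELSE 'GROUP2'
               END AS the_group
        FROM leads l
    ) classified
    GROUP BY lead_week, the_group
    ORDER BY lead_week, the_group
    """
).fetchall()

for row in rows:
    print(row)
```

Lead 1 is counted once, in GROUP1, even though only one of its two tasks matches, and lead 3 falls into GROUP2 because it has no tasks at all.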
You can also try something like this, CASE with 1 and 0 then take the SUM
SELECT LeadDate,
sum(CASE When t.description like '%x%' then 1 else 0 end) as Group1,
sum(CASE When t.description like '%x%' then 0 else 1 end) as Group2
FROM TASK t
RIGHT JOIN LEAD l ON t.WHO_ID = l.ID
GROUP BY LeadDate;

SQL to 'group' certain records

I am not entirely sure if this is possible in SQL (I am by no means an expert).
I have lines of data in TABLEA like below:
I wish to have an SQL output that will 'group' together any records where Activity != D. The output would need to look like the table below:
Any 'merged' activities would be grouped as 'Merged'. In the example, this would be A and B.
I got started with:
select
cycle_start,
cycle_end,
activity_start,
activity_end,
case when (Activity="D") then "D" else "Merged" end
but then I struggle to get the aggregation correct
Yes, you can do this with a case in the group by:
select cycle_start, cycle_end,
min(activity_start) as activity_start,
max(activity_end) as activity_end,
(case when Activity = 'D' then 'D' else 'Merged' end) as activity
from TABLEA t
group by cycle_start, cycle_end,
(case when Activity = 'D' then 'D' else 'Merged' end)
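
As a sanity check of grouping by a case expression, here is a minimal sketch on SQLite from Python. The rows and the integer start/end values are invented, because the original sample table is not shown, and the output aliases merged_start, merged_end and activity_group are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (cycle_start INTEGER, cycle_end INTEGER, "
    "activity_start INTEGER, activity_end INTEGER, activity TEXT)"
)
# Hypothetical rows: integers stand in for timestamps.
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 1, 3, "A"),
        (1, 10, 3, 6, "B"),
        (1, 10, 6, 10, "D"),
        (11, 20, 11, 15, "A"),
        (11, 20, 15, 20, "D"),
    ],
)

rows = conn.execute(
    """
    SELECT cycle_start,
           cycle_end,
           MIN(activity_start) AS merged_start,
           MAX(activity_end)   AS merged_end,
           CASE WHEN activity = 'D' THEN 'D' ELSE 'Merged' END AS activity_group
    FROM table_a
    GROUP BY cycle_start, cycle_end,
             CASE WHEN activity = 'D' THEN 'D' ELSE 'Merged' END
    ORDER BY cycle_start, merged_start
    """
).fetchall()

for row in rows:
    print(row)
```

Within the first cycle, A and B collapse into a single 'Merged' row spanning 1 to 6, while D keeps its own row.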