Group by month and name SQL

I need some help with SQL.
I have
Table1 with columns Id, Date1 and Date2
Table2 with columns Table1Id and Table2Id
Table3 with columns Id and Name
Here is my try:
with tmp_tab as (
select
v."Name" as name
, date_part('month', cv."OfferAcceptedDate") as MonthAcceptedName
, date_part('month', cv."OfferSentDate") as MonthSentName
, 1 as cntAcc
, 1 as cntSent
from hr_metrics."CvInfo" as cv
join hr_metrics."CvInfoVacancy" as civ
on civ."CvInfosId" = cv."Id"
join hr_metrics."Vacancy" as v
on civ."VacanciesId" = v."Id"
where cv."OfferSentDate" is not null
and date_part('year', cv."OfferSentDate") = date_part('year', CURRENT_DATE)
group by v."Name" , date_part('month', cv."OfferAcceptedDate"),
date_part('month', cv."OfferSentDate")
)
select distinct
tmp_tab."name" as name,
tmp_tab.MonthSentName as mSent,
tmp_tab.MonthAcceptedName as mAcc,
Sum(tmp_tab.cntSent) as sented,
Sum(tmp_tab.cntacc) as accepted
from tmp_tab as tmp_tab
group by tmp_tab.name, tmp_tab.MonthSentName, tmp_tab.MonthAcceptedName;
I need to compute Count(Date2)/Count(Date1), grouped by month and name.
I have no idea how to do that, as there is no table of months.
DB - Postgres
sample data from comment:
t1
1 | 01/01/2021 | 31/03/2021
2 | 05/01/2021 | 18/01/2021
3 | 12/01/2021 | 31/01/2021
4 | 13/03/2021 | 22/03/2021
t2
1 | 1
2 | 1
3 | 2
4 | 1
t3
1 | SomeName1
2 | someName2
Desired result:
Name | month | value
SomeName1 | 1 | 1\2
SomeName1 | 3 | 2
SomeName2 | 1 | 1
Update: if count(date2) == 0, then count(date2) = -1

Source answer
Here is the code that works for my question. And yes, I asked it on ru too.
select name, month, sum((SRC=1)::int) as AcceptedCount, sum((SRC=2)::int) as SentCount,
case when sum((SRC=1)::int) = 0 then -1
else sum((SRC=2)::int)::float / sum((SRC=1)::int) end as Result
from (
select v.name, SRC,
extract('month' from case SRC when 1 then OfferAcceptedDate else OfferSentDate end) as month
from (select (date_part('year', CURRENT_DATE)::char(4) || '-01-01')::timestamptz as from_date) x
cross join (select 1 as SRC union all select 2) s
join CvInfo as cv on (SRC=1 and cv.OfferAcceptedDate >= from_date and cv.OfferAcceptedDate < from_date + interval '1 year')
or (SRC=2 and cv.OfferSentDate >= from_date and cv.OfferSentDate < from_date + interval '1 year')
join CvInfoVacancy as civ on civ.CvInfosId = cv.Id
join Vacancy as v on civ.VacanciesId = v.Id
where case SRC when 1 then OfferAcceptedDate else OfferSentDate end is not null
) x
group by name, month
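The core trick in the query above (cross join each row with a two-row "source" table so both date columns land in one month column, then aggregate conditionally) can be tried on the question's sample data. This is only a sketch under assumptions: it uses Python's sqlite3 as a stand-in for Postgres, the simplified t1/t2/t3 tables from the question, no current-year filter, and it divides accepted by sent to match the desired result table. The names sample_db and ratio_by_name_and_month are made up for illustration.

```python
import sqlite3

def sample_db():
    """Build an in-memory database holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (Id INTEGER, Date1 TEXT, Date2 TEXT);
        CREATE TABLE t2 (Table1Id INTEGER, Table2Id INTEGER);
        CREATE TABLE t3 (Id INTEGER, Name TEXT);
        INSERT INTO t1 VALUES (1, '2021-01-01', '2021-03-31'),
                              (2, '2021-01-05', '2021-01-18'),
                              (3, '2021-01-12', '2021-01-31'),
                              (4, '2021-03-13', '2021-03-22');
        INSERT INTO t2 VALUES (1, 1), (2, 1), (3, 2), (4, 1);
        INSERT INTO t3 VALUES (1, 'SomeName1'), (2, 'someName2');
    """)
    return conn

def ratio_by_name_and_month(conn):
    """Return (name, month, accepted / sent); the ratio is None when nothing was sent."""
    rows = conn.execute("""
        SELECT name, month,
               SUM(src = 1) AS sent,      -- src 1 rows carry Date1 (sent)
               SUM(src = 2) AS accepted   -- src 2 rows carry Date2 (accepted)
        FROM (
            SELECT t3.Name AS name, s.src AS src,
                   CAST(strftime('%m', CASE s.src WHEN 1 THEN t1.Date1
                                                  ELSE t1.Date2 END) AS INTEGER) AS month
            FROM t1
            CROSS JOIN (SELECT 1 AS src UNION ALL SELECT 2) AS s
            JOIN t2 ON t2.Table1Id = t1.Id
            JOIN t3 ON t3.Id = t2.Table2Id
            WHERE CASE s.src WHEN 1 THEN t1.Date1 ELSE t1.Date2 END IS NOT NULL
        ) AS x
        GROUP BY name, month
        ORDER BY name, month
    """).fetchall()
    return [(name, month, (accepted / sent) if sent else None)
            for name, month, sent, accepted in rows]
```

On the sample rows this gives 0.5 for SomeName1 in month 1, 2.0 for SomeName1 in month 3 and 1.0 for someName2 in month 1, which lines up with the desired result in the question.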

Related

how to know the changed name in table by date_key

I have a table with 3 columns:
Date_key | user_name | user_id
2022-07-12 | milkcotton | 1
2022-09-12 | cereal | 2
2022-06-12 | musicbox1 | 3
2022-12-31 | harrybel1 | 1
2022-12-25 | milkcotton1| 4
2023-01-01 | cereal | 2
I want to know which users changed their user_name within one semester (01 July 2022 - 31 December 2022). Can I do this?
My expected result is:
previous_name| new_name | user_id
milkcotton | harrybel1 | 1
Thank you!
Note: This is written for PostgreSQL. It should be similar in most SQL engines, though date functions may differ slightly.
Try this:
with BaseTbl as(
select *,
cast(to_char(Date_key, 'YYYYMM') as int) as year_month,
cast(to_char(Date_key, 'MM') as int) as month,
row_number() over(partition by user_id order by date_key desc) as rnk
from Table1
),
LatestTwoChanges as(
select *
from BaseTbl
where user_id in (select user_id from BaseTbl where rnk=2 )
and rnk <=2
)
select
t2.user_name as previous_name,
t1.user_name as new_name,
t1.user_id
from LatestTwoChanges t1
join LatestTwoChanges t2
on t1.user_id=t2.user_id
where t1.rnk=1
and t2.rnk=2
and t1.year_month-t2.year_month <6
and t1.user_name <> t2.user_name
and (t1.month + t2.month <= 12 or t1.month + t2.month >=14 )
-- this is meant to check whether the two dates fall in the same semester.
Here, t1 holds the latest change and t2 holds the previous change for a user_id.
The last filter condition
and (t1.month + t2.month <= 12 or t1.month + t2.month >=14 )
is meant to check that the two dates fall in the same semester, i.e. that both months are between 1 and 6 or both are between 7 and 12.
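The month-sum test is a heuristic rather than an exact semester check: March and July 2022, for instance, pass both the year_month and the month-sum filters (202207 - 202203 = 4, and 3 + 7 = 10) while lying in different semesters. A more direct way to express "same semester" is to compare a (year, half) pair. The sketch below shows that idea in plain Python on the question's rows; the function names and the choice to compare consecutive names inside the semester are assumptions for illustration, not part of the original answer.

```python
from datetime import date

def semester(d):
    """Map a date to (year, 1) for January-June or (year, 2) for July-December."""
    return (d.year, 1 if d.month <= 6 else 2)

def name_changes(rows, sem):
    """rows: (date_key, user_name, user_id). Return (previous, new, user_id) changes inside sem."""
    names_by_user = {}
    for date_key, user_name, user_id in sorted(rows):   # oldest first
        if semester(date_key) == sem:
            names_by_user.setdefault(user_id, []).append(user_name)
    changes = []
    for user_id, names in names_by_user.items():
        for previous, new in zip(names, names[1:]):      # consecutive pairs
            if previous != new:
                changes.append((previous, new, user_id))
    return changes

ROWS = [
    (date(2022, 7, 12), "milkcotton", 1),
    (date(2022, 9, 12), "cereal", 2),
    (date(2022, 6, 12), "musicbox1", 3),
    (date(2022, 12, 31), "harrybel1", 1),
    (date(2022, 12, 25), "milkcotton1", 4),
    (date(2023, 1, 1), "cereal", 2),
]

print(name_changes(ROWS, (2022, 2)))
```

In SQL the same key could be built from the year and a CASE on the month, and both rows of a pair compared on it.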

Obtain Name Column Based on Value

I have a table that calculates the number of associated records that fit a criteria for each parent record. See example below:
note - morning, afternoon and evening are only weekdays
| id | morning | afternoon | evening | weekend |
| -- | ------- | --------- | ------- | ------- |
| 1 | 0 | 2 | 3 | 1 |
| 2 | 2 | 9 | 4 | 6 |
What I am trying to achieve is to determine which columns have the lowest value and get their column name as such:
| id | time_of_day |
| -- | ----------- |
| 1 | morning |
| 2 | afternoon |
Here is my current SQL code that produces the first table:
SELECT
leads.id,
COALESCE(morning, 0) morning,
COALESCE(afternoon, 0) afternoon,
COALESCE(evening, 0) evening,
COALESCE(weekend, 0) weekend
FROM leads
LEFT OUTER JOIN (
SELECT DISTINCT ON (lead_id) lead_id, COUNT(*) AS morning
FROM lead_activities
WHERE lead_activities.modality = 'Call' AND lead_activities.bound_type = 'outbound' AND extract('dow' from created_at) IN (0,1,2,3,4,5) AND (extract('hour' from created_at) >= 0 AND extract('hour' from created_at) < 12)
GROUP BY lead_id
) morning ON morning.lead_id = leads.id
LEFT OUTER JOIN (
SELECT DISTINCT ON (lead_id) lead_id, COUNT(*) AS afternoon
FROM lead_activities
WHERE lead_activities.modality = 'Call' AND lead_activities.bound_type = 'outbound' AND extract('dow' from created_at) IN (0,1,2,3,4,5) AND (extract('hour' from created_at) >= 12 AND extract('hour' from created_at) < 17)
GROUP BY lead_id
) afternoon ON afternoon.lead_id = leads.id
LEFT OUTER JOIN (
SELECT DISTINCT ON (lead_id) lead_id, COUNT(*) AS evening
FROM lead_activities
WHERE lead_activities.modality = 'Call' AND lead_activities.bound_type = 'outbound' AND extract('dow' from created_at) IN (0,1,2,3,4,5) AND (extract('hour' from created_at) >= 17 AND extract('hour' from created_at) < 25)
GROUP BY lead_id
) evening ON evening.lead_id = leads.id
LEFT OUTER JOIN (
SELECT DISTINCT ON (lead_id) lead_id, COUNT(*) AS weekend
FROM lead_activities
WHERE lead_activities.modality = 'Call' AND lead_activities.bound_type = 'outbound' AND extract('dow' from created_at) IN (6,7)
GROUP BY lead_id
) weekend ON weekend.lead_id = leads.id
You can use CASE/WHEN/ELSE to check for the specific conditions and produce different values. For example:
with
q as (
-- your query here
)
select
id,
case
when morning <= least(afternoon, evening, weekend) then 'morning'
when afternoon <= least(morning, evening, weekend) then 'afternoon'
when evening <= least(morning, afternoon, weekend) then 'evening'
else 'weekend'
end as time_of_day
from q
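To see how the CASE chain resolves ties, here is a small Python sketch of the same selection. It assumes each row is a dict keyed by the four column names; lowest_column is a hypothetical helper. Python's min returns the first minimal item, so listing the columns in the same order as the WHEN branches gives the same tie-breaking (morning wins over afternoon, and so on).

```python
COLUMNS = ("morning", "afternoon", "evening", "weekend")

def lowest_column(row, columns=COLUMNS):
    """Return the name of the column with the smallest value; earlier columns win ties."""
    return min(columns, key=lambda name: row[name])

ROWS = [
    {"id": 1, "morning": 0, "afternoon": 2, "evening": 3, "weekend": 1},
    {"id": 2, "morning": 2, "afternoon": 9, "evening": 4, "weekend": 6},
]

for row in ROWS:
    print(row["id"], lowest_column(row))
```

For the two sample rows this picks morning both times, since 2 is the smallest count in the second row.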

SQL Select Statement for Time and attendance for a month

Can anyone help with this one, please? Our attendance system generates the following data:
Empid Department Timestamp Read_ID
3221 IT 2017-01-29 11:12:00.000 1
5565 IT 2017-01-29 12:28:06.000 1
5565 IT 2017-01-29 12:28:07.000 1
3221 IT 2017-01-29 13:12:00.000 2
5565 IT 2017-01-29 13:28:06.000 2
3221 IT 2017-01-30 07:42:15.000 1
3221 IT 2017-01-30 16:16:15.000 2
3221 IT 2017-01-31 09:05:00.000 1
3221 IT 2017-01-31 11:05:00.000 2
3221 IT 2017-01-31 13:20:00.000 1
3221 IT 2017-01-31 16:10:00.000 2
where the Read_ID values are:
1 = Entry
2 = Exit
I'm looking for a SQL query to run on MS SQL 2014 that summarizes attendance time for each employee on a monthly basis, for instance:
Empid Department Year Month TotalHours
3221 IT 2017 1 15:24
5565 IT 2017 1 01:00
This query should give you the result you need. It works by selecting each entry and joining it with the next exit of the same employee (entries without a later exit are ignored); this gives us the duration of each shift. The results are then aggregated, and the shift durations are summed within each group.
SELECT
t1.empid,
t1.department,
YEAR(t1.timestamp) Year,
MONTH(t1.timestamp) Month,
CONVERT(
varchar(12),
DATEADD(minute, SUM(DATEDIFF(minute, t1.timestamp, t2.timestamp)), 0),
114
) TotalHours
FROM
mytable t1
INNER JOIN mytable t2
ON t1.empid = t2.empid
AND t2.read_id = 2
AND t2.timestamp = (
SELECT MIN(timestamp)
FROM mytable
WHERE
read_id = 2
AND empid = t2.empid
AND timestamp > t1.timestamp
)
WHERE
t1.read_id = 1
GROUP BY t1.empid, t1.department, YEAR(t1.timestamp), MONTH(t1.timestamp)
ORDER BY 1, 2, 3, 4
Returns :
empid | department | Year | Month | TotalHours
----: | :--------- | ---: | ----: | :-----------
3221 | IT | 2017 | 1 | 15:24:00:000
5565 | IT | 2017 | 1 | 02:00:00:000
DB Fiddle demo on SQL Server 2014
There is an edge case, however, where an employee enters twice and then exits. This happens in your data: employee 5565 enters at 29/01/2017 12:28:06 and again at 29/01/2017 12:28:07, and then exits at 29/01/2017 13:28:06. The above query takes both overlapping entries into account and maps them to the same exit, so this hour of work is counted twice (02:00 instead of the 01:00 in your expected results).
Is this what you really want? Here is an alternative query that, when the same employee has several consecutive entries, only takes the latest one into account:
SELECT
t1.empid,
t1.department,
YEAR(t1.timestamp) Year,
MONTH(t1.timestamp) Month,
CONVERT(
varchar(12),
DATEADD(minute, SUM(DATEDIFF(minute, t1.timestamp, t2.timestamp)), 0),
114
) TotalHours
FROM
mytable t1
INNER JOIN mytable t2
ON t1.empid = t2.empid
AND t2.read_id = 2
AND t2.timestamp = (
SELECT MIN(timestamp)
FROM mytable
WHERE
read_id = 2
AND empid = t2.empid
AND timestamp > t1.timestamp
)
WHERE
t1.read_id = 1
AND NOT EXISTS (
SELECT 1
FROM mytable
WHERE
read_id = 1
AND empid = t1.empid
AND timestamp > t1.timestamp
AND timestamp < t2.timestamp
)
GROUP BY t1.empid, t1.department, YEAR(t1.timestamp), MONTH(t1.timestamp)
ORDER BY 1, 2, 3, 4
Returns :
empid | department | Year | Month | TotalHours
----: | :--------- | ---: | ----: | :-----------
3221 | IT | 2017 | 1 | 15:24:00:000
5565 | IT | 2017 | 1 | 01:00:00:000
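The pairing rule of the second query (match each exit with the latest preceding entry of the same employee) can also be written as a single ordered pass. The Python sketch below is an illustration under assumptions, not a translation of the T-SQL: monthly_totals is a made-up name, timestamps are truncated to the minute to mirror how DATEDIFF(minute, ...) counts minute boundaries, and each shift is attributed to the month of its entry. It formats hours without wrapping, which matters because CONVERT with style 114 renders only a time of day, so a monthly total of 24 hours or more would not display correctly.

```python
from collections import defaultdict
from datetime import datetime

def monthly_totals(events):
    """events: (empid, department, timestamp, read_id) with 1 = entry, 2 = exit."""
    totals = defaultdict(int)   # (empid, department, year, month) -> minutes worked
    open_entry = {}             # empid -> (department, entry time) not yet matched
    for empid, dept, ts, read_id in sorted(events, key=lambda e: (e[0], e[2])):
        ts = ts.replace(second=0, microsecond=0)   # count whole minute boundaries
        if read_id == 1:
            open_entry[empid] = (dept, ts)          # a later entry replaces an unmatched one
        elif empid in open_entry:
            entry_dept, start = open_entry.pop(empid)
            minutes = int((ts - start).total_seconds() // 60)
            totals[(empid, entry_dept, start.year, start.month)] += minutes
    # HH:MM where HH may exceed 23
    return {key: f"{m // 60:02d}:{m % 60:02d}" for key, m in totals.items()}

def _ts(text):
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")

SAMPLE = [
    (3221, "IT", _ts("2017-01-29 11:12:00"), 1),
    (5565, "IT", _ts("2017-01-29 12:28:06"), 1),
    (5565, "IT", _ts("2017-01-29 12:28:07"), 1),
    (3221, "IT", _ts("2017-01-29 13:12:00"), 2),
    (5565, "IT", _ts("2017-01-29 13:28:06"), 2),
    (3221, "IT", _ts("2017-01-30 07:42:15"), 1),
    (3221, "IT", _ts("2017-01-30 16:16:15"), 2),
    (3221, "IT", _ts("2017-01-31 09:05:00"), 1),
    (3221, "IT", _ts("2017-01-31 11:05:00"), 2),
    (3221, "IT", _ts("2017-01-31 13:20:00"), 1),
    (3221, "IT", _ts("2017-01-31 16:10:00"), 2),
]

print(monthly_totals(SAMPLE))
```

On the sample data this yields 15:24 for employee 3221 and 01:00 for employee 5565, the same totals as the second query.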
Try this. I was not sure what time format would satisfy your system, so I put both:
SELECT * INTO #Tbl3 FROM (VALUES
(3221,'IT','2017-01-29 11:12:00.000',1),
(5565,'IT','2017-01-29 12:28:06.000',1),
(5565,'IT','2017-01-29 12:28:07.000',1),
(3221,'IT','2017-01-29 13:12:00.000',2),
(5565,'IT','2017-01-29 13:28:06.000',2),
(3221,'IT','2017-01-30 07:42:15.000',1),
(3221,'IT','2017-01-30 16:16:15.000',2),
(3221,'IT','2017-01-31 09:05:00.000',1),
(3221,'IT','2017-01-31 11:05:00.000',2),
(3221,'IT','2017-01-31 13:20:00.000',1),
(3221,'IT','2017-01-31 16:10:00.000',2))
x (Empid,Department,Timestamp,Read_ID)
;With cte as (
SELECT t1.Empid, t1.Department
, [Year] = Year(t1.Timestamp)
, [Month] = Month(t1.Timestamp)
, Seconds = SUM(DATEDIFF(second, t1.Timestamp, t2.Timestamp))
FROM #Tbl3 as t1
OUTER APPLY (
SELECT Timestamp = MIN(t.Timestamp)
FROM #Tbl3 as t
WHERE t.Department = t1.Department and t.Empid = t1.Empid
and t.Timestamp > t1.Timestamp and t.Read_ID = 2
) as t2
WHERE t1.Read_ID = 1
GROUP BY t1.Empid, t1.Department, Year(t1.Timestamp), Month(t1.Timestamp))
SELECT *, TotalHours = Seconds / 3600., TotalTime =
RIGHT('0'+CAST(Seconds / 3600 as VARCHAR),2) + ':' +
RIGHT('0'+CAST((Seconds % 3600) / 60 as VARCHAR),2) + ':' +
RIGHT('0'+CAST(Seconds % 60 as VARCHAR),2)
FROM cte;

SQL - Conditional column selection in join

I am not sure if this scenario can be achieved using T-SQL. I have a table called WorkingDays, which holds this data:
ID | EmployeeId | Monday | Tuesday | Wednesday | Thursday | Friday
----------------------------------------------------------------------
1 | 1 | 2 | 2 | 3 | 6 | 5
2 | 2 | 1 | 7 | 5 | 2 | 3
The day columns store Ids from the WorkingSchedule table, which has these columns:
ID int Primary Key
StartTime time
EndTime time
So what I need is to get the StartTime and EndTime of an employee for the current date (using the getdate() function).
That means I need to select the correct day column name to make the join.
How can I achieve this?
The dynamic sql version:
declare @sql nvarchar(max) ='
select
t.EmployeeId
, StarTime = max(case when t.rn=1 then '+quotename(datename(weekday,getdate()))+' end)
, EndTime = max(case when t.rn=2 then '+quotename(datename(weekday,getdate()))+' end)
from (
select *
, rn = row_number() over (partition by t.EmployeeId order by t.Id)
from t
) t
group by t.EmployeeId;'
exec sp_executesql @sql;
rextester demo: http://rextester.com/WNH34961
returns:
+------------+----------+---------+
| EmployeeId | StarTime | EndTime |
+------------+----------+---------+
| 1 | 5 | 3 |
+------------+----------+---------+
Depending on how you want the output, here are two other ways that do not use dynamic sql:
Both use cross apply() to unpivot the data, and WorkDay = datename(weekday,getdate()) to get the current WorkDay column.
For one row output we add some conditional aggregation:
/* one row per employeeId */
select
t.EmployeeId
, x.WorkDay
, StarTime = max(case when t.rn=1 then x.Time end)
, EndTime = max(case when t.rn=2 then x.Time end)
from (
select *
, rn = row_number() over (partition by t.EmployeeId order by t.Id)
from t
) t
cross apply (values
('Monday',Monday),('Tuesday',Tuesday),('Wednesday',Wednesday)
,('Thursday',Thursday),('Friday',Friday)
) x (WorkDay,Time)
where WorkDay = datename(weekday,getdate())
group by t.EmployeeId, x.WorkDay
returns:
+------------+---------+----------+---------+
| EmployeeId | WorkDay | StarTime | EndTime |
+------------+---------+----------+---------+
| 1 | Friday | 5 | 3 |
+------------+---------+----------+---------+
If you want the output on two rows, like your current output:
/* two rows per employeeId */
select
t.Id
, t.EmployeeId
, x.WorkDay
, t.StartEnd
, x.Time
from (
select *
, StartEnd = case
when row_number() over (partition by t.EmployeeId order by t.Id) = 1
then 'StartTime'
else 'EndTime'
end
from t
) t
cross apply (values
('Monday',Monday),('Tuesday',Tuesday),('Wednesday',Wednesday)
,('Thursday',Thursday),('Friday',Friday)
) x (WorkDay,Time)
where WorkDay = datename(weekday,getdate());
returns:
+----+------------+---------+-----------+------+
| Id | EmployeeId | WorkDay | StartEnd | Time |
+----+------------+---------+-----------+------+
| 1 | 1 | Friday | StartTime | 5 |
| 2 | 1 | Friday | EndTime | 3 |
+----+------------+---------+-----------+------+
select wd.EmployeeId, ws.StartTime, ws.EndTime
from WorkingDays wd
join WorkingSchedule ws on ws.Id = case datename(weekday, getdate())
when 'Monday' then wd.Monday
when 'Tuesday' then wd.Tuesday
when 'Wednesday' then wd.Wednesday
when 'Thursday' then wd.Thursday
when 'Friday' then wd.Friday
else 0
end
Hint: datename(weekday, getdate()) returns you the weekday name in your current locale! This might be better:
select wd.EmployeeId, ws.StartTime, ws.EndTime
from WorkingDays wd
join WorkingSchedule ws on ws.Id = case datepart(weekday, getdate())
when 1 then wd.Monday
when 2 then wd.Tuesday
when 3 then wd.Wednesday
when 4 then wd.Thursday
when 5 then wd.Friday
else 0
end
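But then you have to check which day counts as the first day of the week: the number returned by datepart(weekday, ...) depends on the SET DATEFIRST setting (see @@DATEFIRST), so the mapping above is only right when Monday is day 1.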
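The underlying idea, independent of T-SQL settings, is to turn the current date into a fixed day index and use it to pick the matching day column. The Python sketch below shows this with date.weekday(), which always returns 0 for Monday regardless of locale. The dict layout, the schedule_for name and the times in SCHEDULES are hypothetical; only the WorkingDays row comes from the question.

```python
from datetime import date

DAY_COLUMNS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

def schedule_for(working_days_row, schedules, today):
    """Return (StartTime, EndTime) for today's column, or None on weekends or unknown ids."""
    day_index = today.weekday()            # Monday == 0 ... Sunday == 6
    if day_index >= len(DAY_COLUMNS):      # Saturday or Sunday: no column to read
        return None
    schedule_id = working_days_row[DAY_COLUMNS[day_index]]
    return schedules.get(schedule_id)

# Row 1 of WorkingDays from the question.
ROW = {"ID": 1, "EmployeeId": 1, "Monday": 2, "Tuesday": 2,
       "Wednesday": 3, "Thursday": 6, "Friday": 5}

# Hypothetical WorkingSchedule contents, keyed by ID.
SCHEDULES = {2: ("08:00", "16:00"), 3: ("10:00", "18:00"),
             5: ("09:00", "17:00"), 6: ("07:00", "15:00")}

print(schedule_for(ROW, SCHEDULES, date(2024, 1, 5)))   # 2024-01-05 is a Friday
```

Returning None for weekends plays the role of the else 0 branch in the SQL, which matches no WorkingSchedule row.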

Count and pivot a table by date

I would like to identify the returning customers from an Oracle(11g) table like this:
CustID | Date
-------|----------
XC321 | 2016-04-28
AV626 | 2016-05-18
DX970 | 2016-06-23
XC321 | 2016-05-28
XC321 | 2016-06-02
So I can see which customers returned within various windows, for example within 10, 20, 30, 40 or 50 days. For example:
CustID | 10_day | 20_day | 30_day | 40_day | 50_day
-------|--------|--------|--------|--------|--------
XC321 | | | 1 | |
XC321 | | | | 1 |
I would even accept a result like this:
CustID | Date | days_from_last_visit
-------|------------|---------------------
XC321 | 2016-05-28 | 30
XC321 | 2016-06-02 | 5
I guess it would use a partition by windowing clause with unbounded following and preceding clauses... but I cannot find any suitable examples.
Any ideas...?
Thanks
No need for window functions here, you can simply do it with conditional aggregation using CASE EXPRESSION :
SELECT t.custID,
COUNT(CASE WHEN (t.visit_date - t.last_visit) <= 10 THEN 1 END) as "10_day",
COUNT(CASE WHEN (t.visit_date - t.last_visit) between 11 and 20 THEN 1 END) as "20_day",
COUNT(CASE WHEN (t.visit_date - t.last_visit) between 21 and 30 THEN 1 END) as "30_day",
.....
FROM (SELECT s.custID,
s."Date" as visit_date, -- DATE is a reserved word in Oracle; this assumes the column was created as "Date"
LEAD(s."Date") OVER(PARTITION BY s.custID ORDER BY s."Date" DESC) as last_visit -- the previous (earlier) visit
FROM YourTable s) t
GROUP BY t.custID
Oracle Setup:
CREATE TABLE customers ( CustID, Activity_Date ) AS
SELECT 'XC321', DATE '2016-04-28' FROM DUAL UNION ALL
SELECT 'AV626', DATE '2016-05-18' FROM DUAL UNION ALL
SELECT 'DX970', DATE '2016-06-23' FROM DUAL UNION ALL
SELECT 'XC321', DATE '2016-05-28' FROM DUAL UNION ALL
SELECT 'XC321', DATE '2016-06-02' FROM DUAL;
Query:
SELECT *
FROM (
SELECT CustID,
Activity_Date AS First_Date,
COUNT(1) OVER ( PARTITION BY CustID
ORDER BY Activity_Date
RANGE BETWEEN CURRENT ROW AND INTERVAL '10' DAY FOLLOWING )
- 1 AS "10_Day",
COUNT(1) OVER ( PARTITION BY CustID
ORDER BY Activity_Date
RANGE BETWEEN CURRENT ROW AND INTERVAL '20' DAY FOLLOWING )
- 1 AS "20_Day",
COUNT(1) OVER ( PARTITION BY CustID
ORDER BY Activity_Date
RANGE BETWEEN CURRENT ROW AND INTERVAL '30' DAY FOLLOWING )
- 1 AS "30_Day",
COUNT(1) OVER ( PARTITION BY CustID
ORDER BY Activity_Date
RANGE BETWEEN CURRENT ROW AND INTERVAL '40' DAY FOLLOWING )
- 1 AS "40_Day",
COUNT(1) OVER ( PARTITION BY CustID
ORDER BY Activity_Date
RANGE BETWEEN CURRENT ROW AND INTERVAL '50' DAY FOLLOWING )
- 1 AS "50_Day",
ROW_NUMBER() OVER ( PARTITION BY CustID ORDER BY Activity_Date ) AS rn
FROM Customers
)
WHERE rn = 1;
Output
CUSTID FIRST_DATE 10_Day 20_Day 30_Day 40_Day 50_Day RN
------ ------------------- ---------- ---------- ---------- ---------- ---------- ----------
AV626 2016-05-18 00:00:00 0 0 0 0 0 1
DX970 2016-06-23 00:00:00 0 0 0 0 0 1
XC321 2016-04-28 00:00:00 0 0 1 2 2 1
Here is an answer that works for me, I have based it on your answers above, thanks for contributions from MT0 and Sagi:
SELECT CustID,
visit_date,
Prev_Visit ,
COUNT( CASE WHEN (Days_between_visits) <=10 THEN 1 END) AS "0-10_day" ,
COUNT( CASE WHEN (Days_between_visits) BETWEEN 11 AND 20 THEN 1 END) AS "11-20_day" ,
COUNT( CASE WHEN (Days_between_visits) BETWEEN 21 AND 30 THEN 1 END) AS "21-30_day" ,
COUNT( CASE WHEN (Days_between_visits) BETWEEN 31 AND 40 THEN 1 END) AS "31-40_day" ,
COUNT( CASE WHEN (Days_between_visits) BETWEEN 41 AND 50 THEN 1 END) AS "41-50_day" ,
COUNT( CASE WHEN (Days_between_visits) >50 THEN 1 END) AS "51+_day"
FROM
(SELECT CustID,
visit_date,
Lead(T1.visit_date) over (partition BY T1.CustID order by T1.visit_date DESC) AS Prev_visit,
visit_date - Lead(T1.visit_date) over (
partition BY T1.CustID order by T1.visit_date DESC) AS Days_between_visits
FROM T1
) T2
WHERE Days_between_visits >0
GROUP BY T2.CustID ,
T2.visit_date ,
T2.Prev_visit ,
T2.Days_between_visits;
This returns:
CUSTID | VISIT_DATE | PREV_VISIT | DAYS_BETWEEN_VISITS | 0-10_DAY | 11-20_DAY | 21-30_DAY | 31-40_DAY | 41-50_DAY | 51+_DAY
XC321 | 2016-05-28 | 2016-04-28 | 30 | | | 1 | | |
XC321 | 2016-06-02 | 2016-05-28 | 5 | 1 | | | | |
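The two steps in this final query (find the gap to the previous visit per customer, then drop it into a day-range bucket) can be sketched in plain Python to check the numbers. The helper names are made up for illustration, and the bucket labels follow the column aliases used in the query above.

```python
from datetime import date

def days_from_last_visit(visits):
    """visits: (cust_id, visit_date). Return (cust_id, visit_date, days) for repeat visits."""
    result = []
    last_seen = {}                                  # cust_id -> previous visit date
    for cust_id, visit_date in sorted(visits):      # by customer, then date
        if cust_id in last_seen:
            result.append((cust_id, visit_date, (visit_date - last_seen[cust_id]).days))
        last_seen[cust_id] = visit_date
    return result

def bucket(days):
    """Map a gap in days to the matching column label."""
    if days <= 10:
        return "0-10_day"
    if days > 50:
        return "51+_day"
    low = ((days - 1) // 10) * 10 + 1               # 11, 21, 31 or 41
    return f"{low}-{low + 9}_day"

VISITS = [
    ("XC321", date(2016, 4, 28)),
    ("AV626", date(2016, 5, 18)),
    ("DX970", date(2016, 6, 23)),
    ("XC321", date(2016, 5, 28)),
    ("XC321", date(2016, 6, 2)),
]

for cust_id, visit_date, days in days_from_last_visit(VISITS):
    print(cust_id, visit_date, days, bucket(days))
```

For the sample data only XC321 has repeat visits, with gaps of 30 and 5 days, which fall into the 21-30 and 0-10 buckets shown in the output above.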