SQL filling missing date entries, and including previous date's counts - sql

I have a table as follows:
Date        Id  Group     Name  ScoreCount
2022-06-20  1   Athlete   Adam  52
2022-06-23  1   Athlete   Adam  77
2022-06-25  1   Athlete   Adam  79
2022-06-19  1   Employee  Adam  65
2022-06-22  1   Employee  Adam  28
I'd like the missing dates to be filled in for each individual Id and Group, with each added date carrying over the ScoreCount of the previous date. So it should look something like:
Date        Id  Group     Name  ScoreCount
2022-06-20  1   Athlete   Adam  52
2022-06-21  1   Athlete   Adam  52
2022-06-22  1   Athlete   Adam  52
2022-06-23  1   Athlete   Adam  77
2022-06-24  1   Athlete   Adam  77
2022-06-25  1   Athlete   Adam  79
2022-06-19  1   Employee  Adam  65
2022-06-20  1   Employee  Adam  65
2022-06-21  1   Employee  Adam  65
2022-06-22  1   Employee  Adam  28
My code is as follows:
WITH t as (SELECT
Id,
Group,
Name,
min(Date) as MinDate,
max(Date) as MaxDate
FROM recordTable
GROUP BY Id,Group,Name
)
SELECT t.Id,
t.Group,
t.Name,
c.Days,
(SELECT LAST_VALUE(ScoreCount) FROM recordTable WHERE t.Id = recordTable.Id AND t.Group = recordTable.Group)
FROM t
LEFT JOIN calendar c ON c.Days BETWEEN t.MinDate AND t.MaxDate
calendar is a table that contains every individual date in 2022, so it can be joined against. Everything works except for ScoreCount: LAST_VALUE isn't doing what I want it to do. How can I fix this?

You can simply try reversing the order of your joined tables -
WITH t as (SELECT Id,
Group,
Name,
min(Date) as MinDate,
max(Date) as MaxDate
FROM recordTable
GROUP BY Id,Group,Name
)
SELECT t.Id,
t.Group,
t.Name,
c.Days,
(SELECT LAST_VALUE(ScoreCount) OVER(<your over clause is missing>)
FROM recordTable
WHERE t.Id = recordTable.Id
AND t.Group = recordTable.Group)
FROM calendar c
LEFT JOIN t ON c.Days BETWEEN t.MinDate AND t.MaxDate
I have not tested this query, but it should give you an idea of how to proceed.

You don't need LAST_VALUE; for each calendar day you can take the first row dated on or before that day, ordered by date descending:
WITH t as (
SELECT
[Id],
[Group],
[Name],
min([Date]) as MinDate,
max([Date]) as MaxDate
FROM recordTable
GROUP BY [Id],[Group],[Name]
)
SELECT
t.Id,
t.[Group],
t.[Name],
c.[Days] as [Date],
(SELECT top 1 ScoreCount
from recordTable x
where x.[Date] <= c.[Days]
and x.[Id] = t.[Id]
and x.[Group] = t.[Group]
and x.[Name] = t.[Name]
order by x.[Date] desc
) ScoreCount
FROM t
LEFT JOIN calendar c ON c.[Days] BETWEEN t.MinDate AND t.MaxDate
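A minimal sketch of the correlated-subquery idea, run against SQLite through Python's standard `sqlite3` module so it can be checked end to end. The assumptions here are mine and not from the answers: SQLite has no `TOP`, so `LIMIT 1` stands in for it; the subquery also matches on `Id`; and the `calendar` table is a hypothetical stand-in covering June 2022 only.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE recordTable '
    '("Date" TEXT, Id INTEGER, "Group" TEXT, Name TEXT, ScoreCount INTEGER)'
)
conn.executemany(
    "INSERT INTO recordTable VALUES (?, ?, ?, ?, ?)",
    [
        ("2022-06-20", 1, "Athlete", "Adam", 52),
        ("2022-06-23", 1, "Athlete", "Adam", 77),
        ("2022-06-25", 1, "Athlete", "Adam", 79),
        ("2022-06-19", 1, "Employee", "Adam", 65),
        ("2022-06-22", 1, "Employee", "Adam", 28),
    ],
)

# Hypothetical stand-in for the asker's calendar table: one row per day of June 2022.
conn.execute("CREATE TABLE calendar (Days TEXT)")
conn.executemany(
    "INSERT INTO calendar VALUES (?)",
    [((date(2022, 6, 1) + timedelta(days=i)).isoformat(),) for i in range(30)],
)

query = """
WITH t AS (
    SELECT Id, "Group", Name, MIN("Date") AS MinDate, MAX("Date") AS MaxDate
    FROM recordTable
    GROUP BY Id, "Group", Name
)
SELECT c.Days, t.Id, t."Group", t.Name,
       -- latest recorded score on or before this calendar day
       (SELECT x.ScoreCount
        FROM recordTable x
        WHERE x.Id = t.Id
          AND x."Group" = t."Group"
          AND x.Name = t.Name
          AND x."Date" <= c.Days
        ORDER BY x."Date" DESC
        LIMIT 1) AS ScoreCount
FROM t
JOIN calendar c ON c.Days BETWEEN t.MinDate AND t.MaxDate
ORDER BY t."Group", c.Days
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The dates are stored as ISO `YYYY-MM-DD` text, which is what makes the `<=` and `BETWEEN` comparisons behave like date comparisons in SQLite.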


SQL query group by with null values is returning duplicates

I have the following query.
My #dates table has the following records:
month year saledate
9 2020 2020-09-01
10 2020 2020-10-01
11 2020 2020-11-01
with monthlysalesdata as(
select month(salesdate) as salemonth, year(salesdate) as saleyear,salesrepid, salespercentage
from salesrecords r
join #dates d on d.saledate = r.salesdate
group by salesrepid, salesdate),
averagefor3months as(
select 0 as salemonth, 0 as saleyear, salesrepid, salespercentage
from monthlysalesdata
group by salesrepid)
finallist as(
select * from monthlysalesdata
union
select * from averagefor3months
This query returns the following records. The averagefor3months result set contains a duplicate row whenever the first CTE (monthlysalesdata) has a null record. How can I get the average for 3 months as one record instead of duplicates?
salesrepid salemonth saleyear percentage
232 0 0 null -------------this is the duplicate record
232 0 0 90
232 9 2020 80
232 10 2020 null
232 11 2020 100
My first cte has this result:
salerepid month year percentage
---------------------------------------------
232 9 2020 80
232 10 2020 null
232 11 2020 100
My second cte has this result:
salerepid month year percentage
---------------------------------------------
232 0 0 null
232 0 0 90
How can I avoid the duplicate record in my second CTE?
I suspect that you want a summary row per sales rep based on some aggregation. Your question is not clear on what is needed for the aggregation, but something like this:
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
)
select ym.*
from ym
union all
select salesrepid, null, null, avg(whatever)
from ym
group by salesrepid;
I changed the second CTE to group by directly from the table instead of from the previous CTE, and got the results I wanted. Thank you all for helping.
with ym as (
select r.salesrepid, d.year, d.month, sum(<something>) as whatever
from salesrecords r join
#dates d
on d.saledate = r.salesdate
group by r.salesrepid, d.year, d.month
),
threemonthsaverage as (
-- 0 marks the summary row for year and month, as in the question
select r.salesrepid, 0 as year, 0 as month, sum(<something>) as whatever
from salesrecords as r
group by r.salesrepid
)
select * from ym
union
select * from threemonthsaverage

SQLite query - Limit occurrence of value

I have a query that returns this result. How can I limit the number of occurrences of each value in the 4th column?
19 1 _BOURC01 1
20 1 _BOURC01 3 2019-11-18
20 1 _BOURC01 3 2017-01-02
21 1 _BOURC01 6
22 1 _BOURC01 10
23 1 _BOURC01 13 2016-06-06
24 1 _BOURC01 21 2016-09-19
My Query:
SELECT "_44_SpeakerSpeech"."id" AS "id", "_44_SpeakerSpeech"."active" AS "active", "_44_SpeakerSpeech"."id_speaker" AS "id_speaker", "_44_SpeakerSpeech"."Speech" AS "Speech", "34 Program Weekend"."date" AS "date"
FROM "_44_SpeakerSpeech"
LEFT JOIN "_34_programWeekend" "34 Program Weekend" ON "_44_SpeakerSpeech"."Speech" = "34 Program Weekend"."theme_id"
WHERE "id_speaker" = "_BOURC01"
ORDER BY id_speaker, Speech, date DESC
Thanks
I think this is what you want here:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY s.id, s.active, s.id_speaker, s.Speech
ORDER BY p.date DESC) rn
FROM "_44_SpeakerSpeech" s
LEFT JOIN "_34_programWeekend" p ON s.Speech = p.theme_id
WHERE s.id_speaker = '_BOURC01'
)
SELECT id, active, id_speaker, Speech, date
FROM cte
WHERE rn = 1;
This logic assumes that when two or more records have the same column values (excluding the date), you want to retain only the latest record.
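The `ROW_NUMBER()` logic above can be tried out with a small sketch like the following, using Python's standard `sqlite3` module. The table contents are reconstructed from the result set in the question, so they are an assumption, and the CTE lists its columns explicitly instead of using `SELECT *`. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "_44_SpeakerSpeech" (id INTEGER, active INTEGER, id_speaker TEXT, Speech INTEGER);
CREATE TABLE "_34_programWeekend" (theme_id INTEGER, date TEXT);

-- Rows reconstructed from the question's output (an assumption).
INSERT INTO "_44_SpeakerSpeech" VALUES
    (19, 1, '_BOURC01', 1), (20, 1, '_BOURC01', 3), (21, 1, '_BOURC01', 6),
    (22, 1, '_BOURC01', 10), (23, 1, '_BOURC01', 13), (24, 1, '_BOURC01', 21);
INSERT INTO "_34_programWeekend" VALUES
    (3, '2019-11-18'), (3, '2017-01-02'), (13, '2016-06-06'), (21, '2016-09-19');
""")

query = """
WITH cte AS (
    SELECT s.id, s.active, s.id_speaker, s.Speech, p.date,
           -- number the joined rows of each speech, newest date first
           ROW_NUMBER() OVER (PARTITION BY s.id, s.active, s.id_speaker, s.Speech
                              ORDER BY p.date DESC) AS rn
    FROM "_44_SpeakerSpeech" s
    LEFT JOIN "_34_programWeekend" p ON s.Speech = p.theme_id
    WHERE s.id_speaker = '_BOURC01'
)
SELECT id, active, id_speaker, Speech, date
FROM cte
WHERE rn = 1
ORDER BY id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Speech 3 matches two dates in the joined table, and only the row with the later date, 2019-11-18, survives the `rn = 1` filter.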

View and complex query count distinct locations employee stayed in SQL

I have a view which looks like this view_1:
id Office Begin_dt Last_dt Days
1 Office1 2019-09-02 2019-09-08 6
1 Office2 2019-09-09 2019-09-30 21
1 Office1 2019-10-01 2019-10-31 30
5 Office3 2017-10-01 2017-10-16 15
5 Office2 2017-10-17 2017-10-30 13
5 Office2 2017-11-01 2017-11-31 30
I want to find the office where each employee stayed the longest, and also the number of distinct office locations they stayed in.
Expected output
id Max_time_in_Office Days Distinct_office_locations
1 Office1 36 2
5 Office2 43 2
So id 1 spends 6 and 30 days, 36 days overall, in Office1, which is where he spends the most time. Distinct locations are 2.
id 5 spends 13 and 30 days, 43 days overall, in Office2, which is where he spends the most time. Distinct locations are 2.
Code tried
select v.*
from (select v.id, v.office, sum(days) as Max_time_in_Office, count(Office) as Distinct_office_locations,
rank() over (partition by id order by sum(days) desc) as seqnum
from view_1 v
group by id, office
) v
where seqnum = 1;
Output obtained
id Max_time_in_Office Days Distinct_office_locations
1 Office1 36 1
5 Office2 43 1
So I am getting the wrong output: Distinct_office_locations comes out as 1 instead of 2. Can someone please help?
Close. You want a window function:
select v.*
from (select v.id, v.office, sum(days) as Max_time_in_Office,
count(*) over (partition by id) as Distinct_office_locations,
rank() over (partition by id order by sum(days) desc) as seqnum
from view_1 v
group by id, office
) v
where seqnum = 1;
Basically the window function is counting the number of rows returned after the aggregation -- and there is one row per office.
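To see the "count rows after the aggregation" idea in isolation, here is a rough sketch using Python's standard `sqlite3` module. It is a restructuring of the query above, not the same statement: the per-office totals are computed in a CTE first (named `per_office` here, my own name), and the window functions then run over that result. The view is modelled as a plain table holding only the `id`, `Office` and `Days` columns. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- view_1 modelled as a table; only the columns the query needs (an assumption).
CREATE TABLE view_1 (id INTEGER, Office TEXT, Days INTEGER);
INSERT INTO view_1 VALUES
    (1, 'Office1', 6), (1, 'Office2', 21), (1, 'Office1', 30),
    (5, 'Office3', 15), (5, 'Office2', 13), (5, 'Office2', 30);
""")

query = """
WITH per_office AS (
    -- one row per (id, office), so counting rows per id counts distinct offices
    SELECT id, Office, SUM(Days) AS total_days
    FROM view_1
    GROUP BY id, Office
)
SELECT id, Office, total_days, distinct_offices
FROM (
    SELECT id, Office, total_days,
           COUNT(*) OVER (PARTITION BY id) AS distinct_offices,
           RANK() OVER (PARTITION BY id ORDER BY total_days DESC) AS seqnum
    FROM per_office
)
WHERE seqnum = 1
ORDER BY id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because `RANK()` is used, an employee with two offices tied for the most days would come back as two rows; swapping in `ROW_NUMBER()` would keep exactly one.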
You could use the apply operator to achieve that:
select V.Id,
T.Max_Time_Office,
T.Days,
Distinct_office_locations = count(distinct V.Office)
from view_1 V
Cross apply
(
Select top 1 Id,
Max_Time_Office = Office,
Days = sum(Days)
From view_1 VG
where V.Id = VG.Id
group by VG.Id, VG.Office
order by sum(Days) desc
) T
group by V.Id, T.Max_Time_Office, T.Days
Basically, you are getting the Office with most days in the order by sum(Days) desc inside the Cross apply, and using that in the outer expression. I then just did a count(distinct V.Office) to get the distinct offices.

SQL Join two tables by unrelated date

I’m looking to join two tables that do not have a common data point, but common value (date). I want a table that lists the date and total number of hired/terminated employees on that day. Example is below:
Table 1
Hire Date Employee Number Employee Name
--------------------------------------------
5/5/2018 10078 Joe
5/5/2018 10077 Adam
5/5/2018 10078 Steve
5/8/2018 10079 Jane
5/8/2018 10080 Mary
Table 2
Termination Date Employee Number Employee Name
----------------------------------------------------
5/5/2018 10010 Tony
5/6/2018 10025 Jonathan
5/6/2018 10035 Mark
5/8/2018 10052 Chris
5/9/2018 10037 Sam
Desired result:
Date Total Hired Total Terminated
--------------------------------------
5/5/2018 3 1
5/6/2018 0 2
5/7/2018 0 0
5/8/2018 2 1
5/9/2018 0 1
Getting the total count is easy; I'm just unsure of the best approach for "adding" a date column.
If you need all dates within some window then you need to join the data to a calendar. You can then left join and sum flags for data points.
DECLARE @StartDate DATETIME = (SELECT MIN(ActionDate) FROM(SELECT ActionDate = MIN(HireDate) FROM Table1 UNION SELECT ActionDate = MIN(TerminationDate) FROM Table2)AS X)
DECLARE @EndDate DATETIME = (SELECT MAX(ActionDate) FROM(SELECT ActionDate = MAX(HireDate) FROM Table1 UNION SELECT ActionDate = MAX(TerminationDate) FROM Table2)AS X)
;WITH AllDates AS
(
SELECT CalendarDate=@StartDate
UNION ALL
SELECT DATEADD(DAY, 1, CalendarDate)
FROM AllDates
WHERE DATEADD(DAY, 1, CalendarDate) <= @EndDate
)
SELECT
CalendarDate,
TotalHired = SUM(CASE WHEN H.HireDate IS NULL THEN NULL ELSE 1 END),
TotalTerminated = SUM(CASE WHEN T.TerminationDate IS NULL THEN NULL ELSE 1 END)
FROM
AllDates D
LEFT OUTER JOIN Table1 H ON H.HireDate = D.CalendarDate
LEFT OUTER JOIN Table2 T ON T.TerminationDate = D.CalendarDate
/* If you only want dates with data points then uncomment out the where clause
WHERE
NOT (H.HireDate IS NULL AND T.TerminationDate IS NULL)
*/
GROUP BY
CalendarDate
I would do this with a union all and aggregations:
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
order by dte;
This does not include the "missing" dates. If you want those, a calendar or recursive CTE works. For instance:
with ht as (
select dte, sum(is_hired) as num_hired, sum(is_termed) as num_termed
from (select hiredate as dte, 1 as is_hired, 0 as is_termed from table1
union all
select terminationdate, 0 as is_hired, 1 as is_termed from table2
) ht
group by dte
),
d as (
select min(dte) as dte, max(dte) as max_dte
from ht
union all
select dateadd(day, 1, dte), max_dte
from d
where dte < max_dte
)
select d.dte, coalesce(ht.num_hired, 0) as num_hired, coalesce(ht.num_termed, 0) as num_termed
from d left join
ht
on d.dte = ht.dte
order by dte;
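For anyone who wants to check the `union all` plus recursive CTE approach against the sample data, here is a minimal sketch using Python's standard `sqlite3` module. It is a SQLite port, so `date(dte, '+1 day')` replaces `dateadd`, and the dates are stored as ISO text; the short column names (`empno`, `name`) are my own assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (hiredate TEXT, empno INTEGER, name TEXT);
CREATE TABLE table2 (terminationdate TEXT, empno INTEGER, name TEXT);
INSERT INTO table1 VALUES
    ('2018-05-05', 10078, 'Joe'), ('2018-05-05', 10077, 'Adam'),
    ('2018-05-05', 10078, 'Steve'), ('2018-05-08', 10079, 'Jane'),
    ('2018-05-08', 10080, 'Mary');
INSERT INTO table2 VALUES
    ('2018-05-05', 10010, 'Tony'), ('2018-05-06', 10025, 'Jonathan'),
    ('2018-05-06', 10035, 'Mark'), ('2018-05-08', 10052, 'Chris'),
    ('2018-05-09', 10037, 'Sam');
""")

query = """
WITH RECURSIVE ht AS (
    -- one row per date that has at least one hire or termination
    SELECT dte, SUM(is_hired) AS num_hired, SUM(is_termed) AS num_termed
    FROM (SELECT hiredate AS dte, 1 AS is_hired, 0 AS is_termed FROM table1
          UNION ALL
          SELECT terminationdate, 0, 1 FROM table2)
    GROUP BY dte
),
d(dte, max_dte) AS (
    -- every day from the first event to the last, including days with no events
    SELECT MIN(dte), MAX(dte) FROM ht
    UNION ALL
    SELECT date(dte, '+1 day'), max_dte FROM d WHERE dte < max_dte
)
SELECT d.dte,
       COALESCE(ht.num_hired, 0) AS num_hired,
       COALESCE(ht.num_termed, 0) AS num_termed
FROM d
LEFT JOIN ht ON d.dte = ht.dte
ORDER BY d.dte
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Aggregating before joining to the generated dates keeps each calendar day to a single row, so hires and terminations on the same day cannot multiply each other.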
Try this one
SELECT ISNULL(a.THE_DATE, b.THE_DATE) as Date,
ISNULL(a.Total_Hire,0) as Total_Hire,
ISNULL (b.Total_Terminate,0) as Total_terminate
FROM (SELECT Hire_date as the_date, COUNT(1) as Total_Hire
FROM TABLE_HIRE GROUP BY HIRE_DATE) a
FULL OUTER JOIN (SELECT Termination_Date as the_date, COUNT(1) as Total_Terminate
FROM TABLE_TERMINATE GROUP BY Termination_Date) b
ON a.the_date = b.the_date

How to determine the maximum value for each category in SQL?

My table has records like below:
ID EmpID EffectiveDate PayElement Amount ComputeType AddDeduction
42 ISIPL001 2010-04-16 00:00:00.000 Basic 8000.00 On Attendance Addition
43 ISIPL001 2010-04-01 00:00:00.000 Con 2000.00 On Attendance Addition
44 ISIPL001 2010-04-01 00:00:00.000 HRA 2000.00 On Attendance Addition
54 ISIPL001 2011-01-01 00:00:00.000 Basic 15000.00 On Attendance Addition
55 ISIPL001 2011-01-01 00:00:00.000 Con 6000.00 On Attendance Addition
57 ISIPL001 2011-01-01 00:00:00.000 HRA 6000.00 On Attendance Addition
61 ISIPL001 2010-07-10 00:00:00.000 Basic 12000.00 On Attendance Addition
66 ISIPL001 2010-07-10 00:00:00.000 HRA 4200.00 On Attendance Addition
68 ISIPL001 2010-07-10 00:00:00.000 Con 5600.00 On Attendance Addition
I want the result displayed as below, i.e. for each pay element in my database, I need the record with the latest EffectiveDate.
So my output should be as given below:
54 Basic 15000
55 Con 6000
57 HRA 6000
Try this:
SELECT ID,
PayElement,
Amount
FROM (
SELECT a.*,
RANK() OVER(PARTITION BY PayElement ORDER BY EffectiveDate DESC) AS rn
FROM <YOUR_TABLE> a
) a
WHERE rn = 1
;with cte as
(
select *,
row_number() over(partition by PayElement order by EffectiveDate desc) as rn
from YourTable
)
select
ID,
PayElement,
Amount
from cte
where rn = 1
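The two window-function answers above can be checked against the sample rows with a sketch along these lines, using Python's standard `sqlite3` module. The table name `YourTable` is the placeholder from the answer, and only the columns the query touches are created, which is my simplification. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable "
    "(ID INTEGER, EmpID TEXT, EffectiveDate TEXT, PayElement TEXT, Amount REAL)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)",
    [
        (42, "ISIPL001", "2010-04-16", "Basic", 8000.0),
        (43, "ISIPL001", "2010-04-01", "Con", 2000.0),
        (44, "ISIPL001", "2010-04-01", "HRA", 2000.0),
        (54, "ISIPL001", "2011-01-01", "Basic", 15000.0),
        (55, "ISIPL001", "2011-01-01", "Con", 6000.0),
        (57, "ISIPL001", "2011-01-01", "HRA", 6000.0),
        (61, "ISIPL001", "2010-07-10", "Basic", 12000.0),
        (66, "ISIPL001", "2010-07-10", "HRA", 4200.0),
        (68, "ISIPL001", "2010-07-10", "Con", 5600.0),
    ],
)

query = """
WITH cte AS (
    SELECT *,
           -- rn = 1 is the row with the latest EffectiveDate in each PayElement
           ROW_NUMBER() OVER (PARTITION BY PayElement
                              ORDER BY EffectiveDate DESC) AS rn
    FROM YourTable
)
SELECT ID, PayElement, Amount
FROM cte
WHERE rn = 1
ORDER BY ID
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The practical difference between the two answers shows up only with ties: `RANK()` returns every row that shares the latest date within a pay element, while `ROW_NUMBER()` returns exactly one of them.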
Try this.
select
T.ID,
T.PayElement,
T.Amount
from
Test T inner join (select MAX(T_DATE.EffectiveDate) as MAX_DATE, T_DATE.PayElement from Test T_DATE group by T_DATE.PayElement) T_DATE on (T.PayElement = T_DATE.PayElement) and (T.EffectiveDate = T_DATE.MAX_DATE)
order by
T.ID
Select a.Id,
a.PayElement,
a.Amount
From dbo.YourTable a
Join
(
Select PayElement,
Max(EffectiveDate) as[MaxDate]
From dbo.YourTable
Group By PayElement
)b on a.PayElement = b.PayElement
And a.EffectiveDate = b.MaxDate
try something like
Select
a.ID, a.PayElement, a.Amount
From MyTable a
Inner Join (
Select PayElement, max(EffectiveDate) as MaxDate From MyTable Group By PayElement
) sub on a.EffectiveDate = sub.MaxDate and a.PayElement = sub.PayElement
select
a.Id, a.PayElement, a.Amount
from
YourTable a
inner join
(select
PayElement, max(EffectiveDate) as EffectiveDate
from
YourTable
group by
PayElement) b
on
a.PayElement = b.PayElement
and a.EffectiveDate = b.EffectiveDate