Summing Hours Worked Based On Two Unique Identifiers - sql

I want to sum the total hours worked in a given two-week pay period for each employee in the company. I have a view with a column of unique employee identifiers [CODE_USER], a column of pay type codes (Regular, Overtime, Holiday, Vacation, etc.) [Code], a column of hours worked [Hours], and a date column for the day worked [Day].
As it stands right now, the [Hours] column shows the hours worked per day for each employee, split by pay type (such as regular hours or overtime hours).
I need to combine all hours worked over a two-week period for each employee [CODE_USER] and each pay type [Code] into a summarized column named 'Hours'.
An ideal end result would look something like the following, given that employee ID 125 worked 80 regular hours and 20 overtime hours over the course of two weeks (E1 equals Regular hours, E2 equals Overtime hours):
CODE_USER Code Hours
125 E1 80.00
125 E2 20.00
The closest I have come is the following code. However, it does not SUM the hours worked by a unique CODE_USER over the two-week period; it lists the hours worked on each day of the period as separate rows for that employee. For example, the following code returns 18 rows for employee ID 125: 10 regular-time rows marked E1 (one per day worked, about 8.00 hours each) and 8 overtime rows marked E2.
CODE:
SELECT [CODE_USER],
[Code],
SUM(Hours) AS Hours,
[Day]
FROM [LookUp].[dbo].[Daily_Hours_Worked]
WHERE [Day] >= '20191007' AND [Day] < '20191019'
AND [CODE_USER] LIKE '%125%'
GROUP BY [CODE_USER], [Code], [Hours], [Day]
ORDER BY [CODE_USER], [Day] DESC;
RESULTS:
CODE_USER Code Hours Day
125 E1 8.00 2019-10-18 00:00:00.000
125 E2 0.70 2019-10-18 00:00:00.000
125 E1 8.00 2019-10-17 00:00:00.000
125 E2 1.65 2019-10-17 00:00:00.000
125 E1 8.00 2019-10-16 00:00:00.000
125 E2 1.15 2019-10-16 00:00:00.000
125 E1 8.00 2019-10-15 00:00:00.000
125 E2 0.97 2019-10-15 00:00:00.000
125 E1 8.00 2019-10-14 00:00:00.000
125 E2 1.99 2019-10-14 00:00:00.000
125 E1 8.00 2019-10-11 00:00:00.000
125 E2 0.12 2019-10-11 00:00:00.000
125 E1 8.00 2019-10-10 00:00:00.000
125 E2 0.05 2019-10-10 00:00:00.000
125 E1 8.00 2019-10-09 00:00:00.000
125 E2 0.10 2019-10-09 00:00:00.000
125 E1 7.99 2019-10-08 00:00:00.000
125 E1 7.99 2019-10-07 00:00:00.000
EXPECTED RESULTS:
I want to see a SUM of E1, E2, etc., for the input pay period (2 week period) for each unique Employee ID [CODE_USER] in the table. The end result should be two rows for each employee with Regular Time (E1) and Overtime (E2) that SUMs that employee's hours worked for each category over the given time period.

Is it not simply that you should remove the day from the grouping and the specific employee from the where clause?
SELECT [CODE_USER],
[Code],
SUM(Hours) AS Hours
FROM [LookUp].[dbo].[Daily_Hours_Worked]
WHERE [Day] >= '20191007' AND [Day] < '20191019'
GROUP BY [CODE_USER], [Code]
ORDER BY [CODE_USER]
You don't need to group by hours; you're summing it. Situations where you should group by a column that you're also aggregating are rare.
I'm confused as to why you say two weeks when the dates in your WHERE clause are only 12 days apart (2019-10-07 up to, but not including, 2019-10-19). What if someone works on a weekend? I've left this part as it was; I just wanted to raise it, because if the job runs once a fortnight, a 12-day window would cover only every other weekend.
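To check the grouping against the sample rows from the question, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the SQL Server view, so the date handling differs from T-SQL. It also assumes the pay period is a full 14 days starting on 2019-10-07, which is my assumption and not something the question confirms.

```python
import sqlite3
from datetime import date, timedelta

# In-memory SQLite stands in for the SQL Server view; names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Daily_Hours_Worked (CODE_USER TEXT, Code TEXT, Hours REAL, Day TEXT)"
)

# The 18 daily rows shown in the question for employee 125.
regular = [
    ("2019-10-18", 8.00), ("2019-10-17", 8.00), ("2019-10-16", 8.00),
    ("2019-10-15", 8.00), ("2019-10-14", 8.00), ("2019-10-11", 8.00),
    ("2019-10-10", 8.00), ("2019-10-09", 8.00), ("2019-10-08", 7.99),
    ("2019-10-07", 7.99),
]
overtime = [
    ("2019-10-18", 0.70), ("2019-10-17", 1.65), ("2019-10-16", 1.15),
    ("2019-10-15", 0.97), ("2019-10-14", 1.99), ("2019-10-11", 0.12),
    ("2019-10-10", 0.05), ("2019-10-09", 0.10),
]
rows = [("125", "E1", h, d) for d, h in regular]
rows += [("125", "E2", h, d) for d, h in overtime]
conn.executemany("INSERT INTO Daily_Hours_Worked VALUES (?, ?, ?, ?)", rows)

# Assumption: a pay period is 14 full days, expressed as a half-open range.
period_start = date(2019, 10, 7)
period_end = period_start + timedelta(days=14)  # exclusive upper bound

# Group only by employee and pay type; Hours is aggregated, Day is filtered.
totals = conn.execute(
    """
    SELECT CODE_USER, Code, ROUND(SUM(Hours), 2) AS Hours
    FROM Daily_Hours_Worked
    WHERE Day >= ? AND Day < ?
    GROUP BY CODE_USER, Code
    ORDER BY CODE_USER, Code
    """,
    (period_start.isoformat(), period_end.isoformat()),
).fetchall()

for code_user, code, hours in totals:
    print(code_user, code, hours)
```

For the sample rows this gives one E1 row (79.98 hours) and one E2 row (6.73 hours) for employee 125, instead of 18 daily rows.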

Related

Using WITH and UNION to compute number of flights and weather condition with two tables

Table A
date        flight  airport
2012-10-01  oneway  ATL, GA
2012-10-01  oneway  LAX, CA
2012-10-02  oneway  SAN, CA
2012-10-02  oneway  DTW, MI
2012-10-03  round   SFO, CA
2012-10-04  round   SFO, CA
2012-10-05  round   SFO, CA
Table B
date        temp  precip
2012-10-01  27    0.02
2012-10-02  35    0.00
2012-10-03  66    0.18
2012-10-04  57    0.00
2012-10-05  78    0.24
Table A has about 100k rows, whereas Table B has only about 60 rows.
I am trying to write a query that finds the total number of flights on cold days and on warm days, as well as the number of cold days and warm days.
A cold day is one where temp from Table B is below 40 (temp < 40); any other day is warm.
In the real data, only 10 days in total have matching dates in both tables, so I need to account for that when aggregating. I tried to get the totals without using a CTE, but I keep getting wrong counts.
The expected outcome
Days      Num_of_flight  Num_of_days
cold day  4              2
warm day  3              3
You need a LEFT join of TableB to TableA and aggregation on the result of a CASE expression which returns 'cold day' or 'warm day':
SELECT CASE WHEN b.temp < 40 THEN 'cold day' ELSE 'warm day' END Days,
COUNT(a.date) Num_of_flight, -- counts matched flights only; COUNT(*) would add 1 for a day with no flights
COUNT(DISTINCT a.date) Num_of_days
FROM TableB b LEFT JOIN TableA a
ON a.date = b.date
GROUP BY Days;
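A minimal sketch of the same join, run against the sample rows from the question with Python's built-in sqlite3 module. SQLite is an assumption here, since the question does not name the database. It accepts the Days alias in GROUP BY, which not every engine does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "date" and "temp" are quoted because they double as SQL keywords.
conn.execute('CREATE TABLE TableA ("date" TEXT, flight TEXT, airport TEXT)')
conn.execute('CREATE TABLE TableB ("date" TEXT, "temp" INTEGER, precip REAL)')

conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [
        ("2012-10-01", "oneway", "ATL, GA"),
        ("2012-10-01", "oneway", "LAX, CA"),
        ("2012-10-02", "oneway", "SAN, CA"),
        ("2012-10-02", "oneway", "DTW, MI"),
        ("2012-10-03", "round", "SFO, CA"),
        ("2012-10-04", "round", "SFO, CA"),
        ("2012-10-05", "round", "SFO, CA"),
    ],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?, ?)",
    [
        ("2012-10-01", 27, 0.02),
        ("2012-10-02", 35, 0.00),
        ("2012-10-03", 66, 0.18),
        ("2012-10-04", 57, 0.00),
        ("2012-10-05", 78, 0.24),
    ],
)

# Classify each weather day, then count matched flights and distinct flight days.
summary = conn.execute(
    """
    SELECT CASE WHEN b."temp" < 40 THEN 'cold day' ELSE 'warm day' END AS Days,
           COUNT(a."date") AS Num_of_flight,
           COUNT(DISTINCT a."date") AS Num_of_days
    FROM TableB b
    LEFT JOIN TableA a ON a."date" = b."date"
    GROUP BY Days
    ORDER BY Days
    """
).fetchall()

for row in summary:
    print(row)
```

On the sample data this reproduces the expected outcome: 4 flights over 2 cold days and 3 flights over 3 warm days.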

Calculating difference (or deltas) between current and previous row with clickhouse

It would be awesome if there was a way to index rows during a query.
Is there a way to SELECT (compute) the difference of a single column between consecutive rows?
Let's say, something like the following query
SELECT
toStartOfDay(stamp) AS day,
count(day ) AS events ,
day[current] - day[previous] AS difference, -- how do I calculate this
day[current] / day[previous] as percent, -- and this
FROM records
GROUP BY day
ORDER BY day
I want to get the integer and percentage difference between the current row's 'events' column and the previous one for something similar to this:
day                  events  difference  percent
2022-01-06 00:00:00  197     NULL        NULL
2022-01-07 00:00:00  656     459         3.32
2022-01-08 00:00:00  15      -641        0.02
2022-01-09 00:00:00  7       -8          0.46
2022-01-10 00:00:00  137     130         19.5
My version of ClickHouse doesn't support window functions, but while looking into the LAG() function mentioned in the comments I found neighbor(), which works perfectly for what I'm trying to do:
SELECT
toStartOfDay(stamp) AS day,
count(day ) AS events ,
(events - neighbor(events, -1)) as diff,
(events / neighbor(events, -1)) as perc
FROM records
GROUP BY day
ORDER BY day
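To make the row-to-row arithmetic explicit, here is a plain-Python sketch of what the two neighbor() expressions compute, using the event counts from the question. It is an illustration, not ClickHouse code, and the helper name with_deltas is made up for it. One difference worth checking: this sketch returns None for the first row, whereas neighbor(), as far as I know, falls back to the column type's default value (0 here) when there is no previous row.

```python
# (day, events) pairs from the question, already in ORDER BY day order.
daily_events = [
    ("2022-01-06", 197),
    ("2022-01-07", 656),
    ("2022-01-08", 15),
    ("2022-01-09", 7),
    ("2022-01-10", 137),
]


def with_deltas(rows):
    """Return (day, events, difference, ratio) tuples, comparing each row
    with the one before it. The first row has no predecessor, so its
    difference and ratio are None."""
    result = []
    previous = None
    for day, events in rows:
        if previous is None:
            result.append((day, events, None, None))
        else:
            # Guard against dividing by a day that had zero events.
            ratio = events / previous if previous != 0 else None
            result.append((day, events, events - previous, ratio))
        previous = events
    return result


deltas = with_deltas(daily_events)
for row in deltas:
    print(row)
```

The differences come out as 459, -641, -8 and 130, matching the table in the question.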

Readmission of patient through 30 days after first discharge (total 31 days)

I have the sample data below in one of my tables, and I want to apply this rule: "If the discharge is followed by a readmission through 30 days after first discharge (total 31 days), use the admit date from the first admission and the discharge date from the last discharge".
PatientId ClaimId Admit Date Discharge Date
A001 110001 12/20/2019 1/17/2020
A001 110002 4/30/2020 4/30/2020
A001 110003 4/18/2020 4/30/2020
A001 110004 5/1/2020 5/5/2020
A001 110005 5/8/2020 5/27/2020
A001 110006 8/22/2020 9/20/2020
A001 110007 9/2/2020 9/5/2020
A001 110008 9/21/2020 10/20/2020
A001 110009 10/21/2020 11/19/2020
A001 110010 9/2/2020 9/5/2020
I tried the query below, but it only gives me the min of the admit date. I'm not sure how to find the max of the discharge date through 30 days after the first discharge. I'd appreciate any help.
SELECT A.PatientId,
A.Discharge_Date,
Min(B.Admit_Date) AS MinOfadmitDate,
DATEDIFF(dd,A.Discharge_Date,Min(B.Admit_Date)) AS Day_span
FROM Table1 A
INNER JOIN Table1 AS B ON A.PatientId = B.PatientId
WHERE B.Admit_Date > A.Discharge_Date
GROUP BY A.PatientId, A.Discharge_Date
HAVING DATEDIFF(dd,A.Discharge_Date, Min(B.Admit_Date))<=30
Can anyone help me find the min of the admit date and the max of the discharge date through 30 days after the first discharge?
I think you can use an analytic function to find the first discharge date, and then use aggregate functions to find the min and max within 30 days of it, as follows:
select t.patientid,
min(admit_date) as min_Admit_date,
max(discharge_date) as max_discharge_date
from (select t.*,
min(discharge_date) over (partition by patientid) as min_d_date
from your_table t) t
where dateadd(d,30,min_d_date) > admit_date
group by t.patientid -- required because patientid is selected next to aggregates
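Here is a minimal sketch of that logic on the sample data, using Python's sqlite3 module as a stand-in for SQL Server. Three things are assumptions of mine: the dates are rewritten in ISO format so they compare correctly as text, the table is called claims, and the window function is swapped for a grouped subquery (which computes the same first discharge date per patient). The strict comparison mirrors the query above; switch to <= if day 30 itself should count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (patientid TEXT, claimid INTEGER, "
    "admit_date TEXT, discharge_date TEXT)"
)
# Sample rows from the question, with dates converted to YYYY-MM-DD.
conn.executemany(
    "INSERT INTO claims VALUES (?, ?, ?, ?)",
    [
        ("A001", 110001, "2019-12-20", "2020-01-17"),
        ("A001", 110002, "2020-04-30", "2020-04-30"),
        ("A001", 110003, "2020-04-18", "2020-04-30"),
        ("A001", 110004, "2020-05-01", "2020-05-05"),
        ("A001", 110005, "2020-05-08", "2020-05-27"),
        ("A001", 110006, "2020-08-22", "2020-09-20"),
        ("A001", 110007, "2020-09-02", "2020-09-05"),
        ("A001", 110008, "2020-09-21", "2020-10-20"),
        ("A001", 110009, "2020-10-21", "2020-11-19"),
        ("A001", 110010, "2020-09-02", "2020-09-05"),
    ],
)

# Subquery f: first discharge date per patient.
# Outer query: admissions starting before first discharge + 30 days.
stays = conn.execute(
    """
    SELECT t.patientid,
           MIN(t.admit_date) AS min_admit_date,
           MAX(t.discharge_date) AS max_discharge_date
    FROM claims t
    JOIN (SELECT patientid, MIN(discharge_date) AS min_d_date
          FROM claims
          GROUP BY patientid) f
      ON f.patientid = t.patientid
    WHERE t.admit_date < date(f.min_d_date, '+30 days')
    GROUP BY t.patientid
    """
).fetchall()

print(stays)
```

With this sample only claim 110001 falls inside the window, so the result is a single row running from 2019-12-20 to 2020-01-17. Note that this anchors everything to the patient's very first discharge; it does not chain later readmissions into separate episodes.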

SQL how to count but only count one instance if two columns match?

Wondering how to select from a table:
FIELDID personID purchaseID dateofPurchase
--------------------------------------------------
2 13 147 2014-03-21 00:00:00
3 15 165 2015-03-23 00:00:00
4 13 456 2018-03-24 00:00:00
5 1 133 2018-03-21 00:00:00
6 23 123 2013-03-22 00:00:00
7 25 456 2013-03-21 00:00:00
8 25 456 2013-03-23 00:00:00
9 22 456 2013-03-28 00:00:00
10 25 589 2013-03-21 00:00:00
11 82 147 1991-10-22 00:00:00
12 82 453 2003-03-22 00:00:00
I'd like to get a result table of two columns: weekday and the number of purchases of each weekday, but only count the distinct days of purchases if done by the same person on the same day - for example since personID 25 purchased two things on 2013-03-21, that should only count as one 'thursday' instead of 2.
Basically, if the personID and the dateofPurchase are the same for more than one row, only count it once is what I want.
Here is what I have currently: It does everything correctly except it will count the above scenario under the thursday twice, when I would only want to add one:
SELECT v.wkday as day, COUNT(*) as 'absences'
FROM dbo.AttendanceRecord pr CROSS APPLY
(VALUES (CASE WHEN DATEPART(WEEKDAY, date) IN (1, 7)
THEN 'Weekend'
ELSE DATENAME(WEEKDAY, date)
END)
) v(wkday)
GROUP BY v.wkday;
to clarify:
If a person has a purchase under at least one purchaseID on a specific day, they should be counted as having purchased on that day, and should not be counted again for each additional purchaseID on that day.
I think you want to count distinct persons, so that would be:
COUNT(DISTINCT personid) as absences
Note that single quotes are not appropriate around column aliases. If you need to escape them, use square brackets.
EDIT:
If you want to count distinct person-days, then you can use:
COUNT(DISTINCT CONCAT(personid, ':', dateofpurchase)) as absences
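A small sketch of the person-day count on the sample rows, using Python's sqlite3 module instead of SQL Server, so the syntax differs: || replaces CONCAT, and strftime('%w', ...) replaces DATENAME, returning '0' for Sunday through '6' for Saturday. The table name purchases is a placeholder of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (FIELDID INTEGER, personID INTEGER, "
    "purchaseID INTEGER, dateofPurchase TEXT)"
)
# Sample rows from the question, dates reduced to YYYY-MM-DD.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (2, 13, 147, "2014-03-21"),
        (3, 15, 165, "2015-03-23"),
        (4, 13, 456, "2018-03-24"),
        (5, 1, 133, "2018-03-21"),
        (6, 23, 123, "2013-03-22"),
        (7, 25, 456, "2013-03-21"),
        (8, 25, 456, "2013-03-23"),
        (9, 22, 456, "2013-03-28"),
        (10, 25, 589, "2013-03-21"),
        (11, 82, 147, "1991-10-22"),
        (12, 82, 453, "2003-03-22"),
    ],
)

# COUNT(*) counts every row; the DISTINCT version counts each
# (person, day) combination once.
result = conn.execute(
    """
    SELECT strftime('%w', dateofPurchase) AS wkday,
           COUNT(*) AS all_rows,
           COUNT(DISTINCT personID || ':' || dateofPurchase) AS person_days
    FROM purchases
    GROUP BY wkday
    ORDER BY wkday
    """
).fetchall()

by_weekday = {wkday: (all_rows, person_days) for wkday, all_rows, person_days in result}
print(by_weekday["4"])  # Thursday
```

For Thursday there are three rows but only two person-days, because personID 25 bought twice on 2013-03-21.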

Select aggregate and date range

I can't figure this query out and it should be easy. But I'm at a loss.
How do you query using an aggregate SUM bound by a date range?
Given this table:
ID EmployeeID PayAmount PayDate
1 48 289.0000 2003-12-22 00:00:00.000
2 251 458.0000 2003-12-30 00:00:00.000
3 48 248.0000 2003-12-30 00:00:00.000
4 167 255.5000 2003-12-30 00:00:00.000
5 48 100.00 2004-01-31 00:00:00.000
6 251 100.00 2004-01-31 00:00:00.000
7 251 300.00 2004-02-14 00:00:00.000
I would like to run a query to see how much each employee earned during a given year. So for 2003, the results would look like this:
EmployeeID TotalPaid
48 537.00
167 255.50
251 458.00
For 2004 the results would be:
EmployeeID TotalPaid
48 100.00
251 400.00
In Microsoft SQL Server you can do it like this: group the data by year for each EmployeeID, then filter for a particular year using a HAVING clause with the YEAR function.
This query returns the data for the year 2004:
SELECT EmployeeID,
SUM(PayAmount) as TotalPaid,
DATEADD(YEAR, DATEDIFF(YEAR,0, Paydate), 0) as Year
FROM Table1
GROUP BY EmployeeID, DATEADD(YEAR, DATEDIFF(YEAR,0, Paydate), 0)
HAVING YEAR(DATEADD(YEAR, DATEDIFF(YEAR,0, Paydate), 0)) =2004
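As an alternative sketch, the year can also be restricted in the WHERE clause with a half-open date range, so rows are filtered before they are grouped. This is a different technique from the HAVING query above, not a rewrite of it. The example runs on the sample rows with Python's sqlite3 module; the table name Payments and the helper totals_for_year are placeholders of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Payments (ID INTEGER, EmployeeID INTEGER, "
    "PayAmount REAL, PayDate TEXT)"
)
# Sample rows from the question, dates reduced to YYYY-MM-DD.
conn.executemany(
    "INSERT INTO Payments VALUES (?, ?, ?, ?)",
    [
        (1, 48, 289.00, "2003-12-22"),
        (2, 251, 458.00, "2003-12-30"),
        (3, 48, 248.00, "2003-12-30"),
        (4, 167, 255.50, "2003-12-30"),
        (5, 48, 100.00, "2004-01-31"),
        (6, 251, 100.00, "2004-01-31"),
        (7, 251, 300.00, "2004-02-14"),
    ],
)


def totals_for_year(year):
    """Sum PayAmount per employee for pay dates in [Jan 1 of year, Jan 1 of year + 1)."""
    return conn.execute(
        """
        SELECT EmployeeID, SUM(PayAmount) AS TotalPaid
        FROM Payments
        WHERE PayDate >= ? AND PayDate < ?
        GROUP BY EmployeeID
        ORDER BY EmployeeID
        """,
        (f"{year}-01-01", f"{year + 1}-01-01"),
    ).fetchall()


totals_2003 = totals_for_year(2003)
totals_2004 = totals_for_year(2004)
print(totals_2003)
print(totals_2004)
```

The totals match the expected results in the question: 537.00, 255.50 and 458.00 for 2003, and 100.00 and 400.00 for 2004.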