Find the difference between the top 2 rows - SQL

I am trying to work out the difference between the top 2 rows for all columns in a table. I will just specify one column to make this easier. I am new to SQL writing, so apologies if this is easy. I am using SSMS, and so far I think I need to inner join the table and then compare row 1 of Table1 with row 2 of Table2. The id column works in that the newest row has the highest id. I need to take the highest id and then the second highest id (row 2) and find the difference.
SELECT Table1.id,
       Table1.transferdate,
       Table1.payment,
       Table2.id,
       Table2.transferdate,
       Table2.payment
FROM Table1 AS Table1
INNER JOIN Table2 AS Table2 ON Table1.id = Table2.id
I want to see the difference between yesterday (top row) and the previous day (second row) in the payment column. The payment column should only ever increase, as each day's data is added on to the previous day's. I'm just not sure where to go with it after the INNER JOIN, and nothing I have tried has worked.
Data example of what I currently have:
id | transferdate | payment | debt   | mailing_batch
46 | 2017-05-18   | 651681  | 616816 | 1861651
45 | 2017-05-17   | 601680  | 516168 | 1616866
What I want is the difference:
id | transferdate | payment | debt   | mailing_batch
1  | 1            | 50001   | 100648 | 244785
I only ever want to see the difference between the top 2 rows for each column. Would I just remove the other SELECT column names and leave only the ones I have used the LEAD function on, with TOP 1?
I am not interested in any of the other rows, just the top 2, as this is a data copy table and this is a way to ensure the data has updated correctly in the business setup.

SELECT TOP 1 Table1.id,
       Table1.payment - LEAD(Table1.payment, 1) OVER (ORDER BY Table1.id DESC) AS payment_difference
FROM Table1
INNER JOIN Table2 ON Table1.id = Table2.id
ORDER BY Table1.id DESC
If you are using SQL Server 2012 or higher, you can use the above query.

;With CTE(id, transferdate, payment, debt, mailing_batch)
AS
(
    SELECT 46, '2017-05-18', 651681, 616816, 1861651 UNION ALL
    SELECT 45, '2017-05-17', 601680, 516168, 1616866
)
SELECT id
    ,transferdate
    ,payment
    ,debt
    ,mailing_batch
FROM (
    SELECT (LeadId - id) AS id
        ,DATEDIFF(DAY, transferdate, Leadtransferdate) AS transferdate
        ,(Leadpayment - payment) AS payment
        ,(Leaddebt - debt) AS debt
        ,(Leadmailing_batch - mailing_batch) AS mailing_batch
        ,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS Seq
    FROM (
        SELECT *
            ,LEAD(id) OVER (ORDER BY id) AS LeadId
            ,LEAD(transferdate) OVER (ORDER BY transferdate) AS Leadtransferdate
            ,LEAD(payment) OVER (ORDER BY payment) AS Leadpayment
            ,LEAD(debt) OVER (ORDER BY debt) AS Leaddebt
            ,LEAD(mailing_batch) OVER (ORDER BY mailing_batch) AS Leadmailing_batch
        FROM CTE
    ) Dt
) Final
WHERE Final.Seq = 1
Output:
id | transferdate | payment | debt   | mailing_batch
1  | 1            | 50001   | 100648 | 244785

Which version of SQL Server are you using? If it is 2012+, then the LEAD and LAG functions will work. You can read more about them here and here.
Try it yourself, as they're really easy. Show some effort, and if it's not working, I will help you out.
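For a quick feel of what they do, here is a minimal sketch against Table1 from the question (only the id and payment columns from the question are assumed):
-- LAG looks at the previous row in the given order; LEAD looks at the next.
SELECT id,
       payment,
       payment - LAG(payment) OVER (ORDER BY id) AS diff_from_previous_row
FROM Table1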
Update after comments:
CREATE TABLE [dbo].[t1_new](
[Id] [int] NULL,
[transferdate] [date] NULL,
[payment] [int] NULL
) ON [PRIMARY]
GO
INSERT [dbo].[t1_new] ([Id], [transferdate], [payment]) VALUES (1, CAST(N'2017-05-18' AS Date), 1000)
GO
INSERT [dbo].[t1_new] ([Id], [transferdate], [payment]) VALUES (2, CAST(N'2017-05-19' AS Date), 1100)
GO
INSERT [dbo].[t1_new] ([Id], [transferdate], [payment]) VALUES (3, CAST(N'2017-05-20' AS Date), 1200)
GO
INSERT [dbo].[t1_new] ([Id], [transferdate], [payment]) VALUES (4, CAST(N'2017-05-21' AS Date), 1400)
GO
SELECT TOP 1 t1.*,
       LAG(transferdate) OVER (ORDER BY transferdate) AS previous_date,
       LAG(payment) OVER (ORDER BY payment) AS previous_payment,
       t1.payment - LAG(payment) OVER (ORDER BY payment) AS payment_difference
FROM t1_new t1
ORDER BY Id DESC
Screenshots of the input and output tables are omitted here.
PS: you don't need 2 separate tables. Your issue can be solved with 1 table by itself.

Related

How to show only the latest record in SQL

I have this issue where I want to show only the latest record (Col 1). I deleted the date column, thinking that it might not work if it has different values. But in that case the record itself has a different name (Col 1), because it has a different date in its name.
Is it possible to fetch one record in this case?
The code:
SELECT distinct p.ID,
max(at.Date) as date,
at.[RAPID3 Name] as COL1,
at.[DLQI Name] AS COL2,
at.[HAQ-DI Name] AS COL3,
phy.name as phyi,
at.State_ID
FROM dbo.[Assessment Tool] as at
Inner join dbo.patient as p on p.[ID] = at.[Owner (Patient)_Patient_ID]
Inner join dbo.[Physician] as phy on phy.ID = p.Physician_ID
where (at.State_ID in (162, 165,168) and p.ID = 5580)
group by
at.[RAPID3 Name],
at.[DLQI Name],
at.[HAQ-DI Name],
p.ID, phy.name,
at.State_ID
Screenshot omitted. In it, I want to show only the latest record (COL1) for ID 5580, meaning the first row for this ID.
Thank you
The most accurate way to handle this: extract the date, then use TOP and ORDER BY.
create table #Temp(
    ID int,
    Col1 Varchar(50) null,
    Col2 Varchar(50) null,
    Col3 Varchar(50) null,
    Phyi Varchar(50) null,
    State_ID int)
Insert Into #Temp values(5580,'[9/29/2021]-[9.0]High Severity',null,null,'Eman Elshorpagy',168)
Insert Into #Temp values(5580,'[10/3/2021]-[9.3]High Severity',null,null,'Eman Elshorpagy',168)
select top 1 *
from #Temp as t
order by cast(REPLACE(REPLACE((SELECT top 1 Value FROM STRING_SPLIT(t.Col1,'-')),'[',''),']','') as date) desc
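If the bracketed date is always the first token of Col1, a simpler ORDER BY is possible. This is just a sketch, assuming that format holds for every row and that your session's date settings parse m/d/y; TRY_CAST returns NULL instead of failing on a malformed value:
select top 1 *
from #Temp as t
-- cut the text between the first '[' and ']' and convert it to a date
order by TRY_CAST(SUBSTRING(t.Col1, 2, CHARINDEX(']', t.Col1) - 2) as date) desc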
This is close to ANSI standard, and it also caters for the newest row per id.
The principle is to use ROW_NUMBER() with a descending order on the date/timestamp (using a DATE type instead of a DATETIME, and avoiding the keyword DATE as a column name) in one query, then to select from that query using the row number result as the filter.
-- your input, but 2 id-s to show how it works with many ...
WITH indata(id,dt,col1,phyi,state_id) AS (
          SELECT 5580,DATE '2021-10-03','[10/3/2021] - [9,3] High Severity','Eman Elshorpagy',168
UNION ALL SELECT 5580,DATE '2021-09-29','[9/29/2021] - [9,0] High Severity','Eman Elshorpagy',168
UNION ALL SELECT 5581,DATE '2021-10-03','[10/3/2021] - [9,3] High Severity','Eman Elshorpagy',168
UNION ALL SELECT 5581,DATE '2021-09-29','[9/29/2021] - [9,0] High Severity','Eman Elshorpagy',168
)
-- real query starts here; when selecting from a real table, replace the following comma with "WITH" ...
,
with_rank AS (
SELECT
*
, ROW_NUMBER() OVER(PARTITION BY id ORDER BY dt DESC) AS rank_id
FROM indata
)
SELECT
id
, dt
, col1
, phyi
, state_id
FROM with_rank
WHERE rank_id=1
;
id | dt | col1 | phyi | state_id
------+------------+-----------------------------------+-----------------+----------
5580 | 2021-10-03 | [10/3/2021] - [9,3] High Severity | Eman Elshorpagy | 168
5581 | 2021-10-03 | [10/3/2021] - [9,3] High Severity | Eman Elshorpagy | 168

T-SQL data cleaning - spare row if certain attribute is NULL when tuple occurs more than once

Table with sample data:
UserID | DateID   | Code  | Type
0815   | 20191211 | 'oef' | xx    -> keep this row in the result
0815   | 20191211 | 'oef' | NULL  -> should not be in the result set
0916   | 20191212 | 'bin' | NULL  -> keep this row if it is the only occurrence for this user on that day
In the above sample, both Type and Code can be NULL.
A conditional data clean-up should be applied when Type is NULL.
The second row should not be in the result set, because its only difference from the first is that Type is NULL.
The third row exists only once for that user on that day with that code, so it should be kept.
I can't come up with an elegant and performant solution for this clean-up task, so if anybody has an idea I would be glad.
There is a clustered index on UserID and DateID (I could change it to a columnstore if it helps - MS SQL Server 2016).
We are talking about 100,000,000 rows in that table.
If I understand correctly, you want all rows where the values in the two columns are not NULL.
Then you want rows with NULL values if there is no corresponding row based on the other column. Based on what I interpret as what you want:
select t.*
from t
where (t.code is not null and t.type is not null) or
(t.code is null and
not exists (select 1
from t t2
where t2.user = t.user and t2.dateid = t.dateid and
t2.code is not null and
(t2.type = t.type or t2.type is null and t.type is null)
)
) or
(t.type is null and
not exists (select 1
from t t2
where t2.user = t.user and t2.dateid = t.dateid and
t2.type is not null and
(t2.code = t.code or t2.code is null and t.code is null)
)
) ;
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE MyTab (UserID int, DateID int , Code varchar(255),Type varchar(255))
INSERT INTO MyTab (UserID,DateID,Code,Type) VALUES (0815, 20191211 ,'oef','xx'),(0815, 20191211 ,'oef',NULL),(0916,20191212 ,'bin',NULL)
Query 1:
;WITH CTE AS (
SELECT * , ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY UserID desc) AS rn
FROM MyTab )
SELECT UserID,DateID,Code,Type FROM CTE
where Type IS NULL and rn=1
Results:
| UserID | DateID | Code | Type |
|--------|----------|------|--------|
| 916 | 20191212 | bin | (null) |

Sum Quantity and Filter Results

I have the following table with order ids and quantities. I need to be able to sum the quantities and retrieve the ids whose quantities sum to less than a provided number.
| id | quantity |
|------|----------|
| 100 | 1 |
| 200 | 25 |
| 300 | 15 |
For example, I need the ids where the sum of quantity is less than 25.
When I try the following, it only returns the first id (100).
Select *
from (
select *,
SUM (Quantity) OVER (ORDER BY Id) AS SumQuantity
from dbo.Orders
) as A
where SumQuantity <= 25
Is it possible to adjust this query so that it returns ids 100 and 300, since the sum total of those orders is less than 25?
I know I can use a WHERE clause for quantity less than 25, but the important thing here is that I need to sum the quantities and pull the ids whose sum is less than the provided number.
Thank you in advance!
Perhaps you want to order by the quantity instead of id?
Select o.*
from (select o.*, SUM(Quantity) OVER (ORDER BY quantity) AS SumQuantity
      from dbo.Orders o
     ) o
where SumQuantity <= 25;
This chooses the smallest values so you will get the most rows.
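One caveat worth knowing: with an ORDER BY and no explicit frame, SUM ... OVER defaults to RANGE UNBOUNDED PRECEDING, so rows tied on quantity are treated as peers and summed together. A sketch of a variant that spells out a ROWS frame (with id added as a tie-breaker) avoids that:
Select o.*
from (select o.*,
             SUM(Quantity) OVER (ORDER BY quantity, id
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumQuantity
      from dbo.Orders o
     ) o
where SumQuantity <= 25;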
Group by Id and set the condition in the HAVING clause:
select Id, SUM(Quantity) AS SumQuantity
from Orders
group by Id
having SUM(Quantity) <= 25
See the demo.
Results:
Id | SumQuantity
100 | 1
200 | 25
300 | 15
If you want to include all the columns you can modify your query to not ORDER BY id but PARTITION BY id:
select *
from (
select *,
SUM (Quantity) OVER (PARTITION BY Id) AS SumQuantity
from Orders
) as A
where SumQuantity <= 25
For this dataset:
CREATE TABLE Orders([id] varchar(6), [quantity] int);
INSERT INTO Orders([id], [quantity])VALUES
('100', '1'), ('100', '2'),
('200', '25'), ('200', '3'),
('300', '15'), ('300', '5');
Results:
id | quantity | SumQuantity
100 | 1 | 3
100 | 2 | 3
300 | 15 | 20
300 | 5 | 20
See the demo.
Setup:
Your threshold can vary, so let's make it into a variable:
declare @threshold int = 25;
But I also imagine that your table values can vary, like if we add another row only having a quantity of 2:
declare @orders table (id int, quantity int)
insert @orders values (100,1), (200,25), (300,15), (400, 2);
Solution:
For this, we'll need a recursive kind of cross joining:
with traverse as (
    select ids = convert(nvarchar(255), id),
           id,
           quantity
    from @orders
    where quantity < @threshold

    union all

    select ids = convert(nvarchar(255), tv.ids + ',' + convert(nvarchar(255), o.id)),
           o.id,
           quantity = tv.quantity + o.quantity
    from traverse tv
    cross join @orders o
    where tv.id < o.id
      and tv.quantity + o.quantity < @threshold
)
select t.ids, t.quantity
from traverse t;
which will produce the qualifying id combinations with their summed quantities (results screenshot omitted).
Explanation:
The above code is an algorithm that builds a tree. It starts with your base ids and quantities as nodes (the anchor part of the CTE) and trims anything not meeting the threshold.
It then adds edges by cross joining with the orders table again (the recursive part of the CTE), but it only includes the following:
Ids that are greater than the last id considered in the current node (this avoids duplicate considerations, such as ids = '300,400' and ids = '400,300').
Ids where the running sum of quantities is still less than the threshold.
Warnings:
Beware: this type of problem has combinatorial complexity. Because of the trimming conditions, though, this will be more efficient than doing all the cross joins first and then filtering the result set at the end.
Also, keep in mind that your table may contain no single set of numbers that sums to less than 25; rather, you can get different paths to that sum. The way I produce the results here will help you identify such a situation.
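A practical footnote (general SQL Server behavior, not specific to this data): recursive CTEs stop after 100 recursion levels by default, so for tables allowing longer combinations you may need a query hint on the final SELECT:
select t.ids, t.quantity
from traverse t
option (maxrecursion 0); -- 0 lifts the default 100-level cap; use with care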
cross join is perfect for this task, try:
declare @tbl table (id int, quantity int);
insert into @tbl values
(100, 1), (200, 25), (300, 15), (400, 10);

select distinct
       case when t1.id > t2.id then t1.id else t2.id end as id_a,
       case when t1.id < t2.id then t1.id else t2.id end as id_b
from @tbl t1
cross join @tbl t2
where t1.id <> t2.id
  and t1.quantity + t2.quantity < 25

Join information from one table to another without joining twice (cross apply, T-SQL)

Situation:
I have two tables. #t1 has logins and emails. #t2 has the country associated with each email.
I would like to join the information from #t2 to #t1 without having to join it twice. Joining it only once, in either the inner or the outer query, would break the cross apply logic.
My current query uses a cross apply to get rolling information as such (fiddle data below):
SELECT DISTINCT CAST(logins AS DATE) AS Dates,
count(distinct d.email) AS DAU,
count(distinct m.MAU) AS MAU
FROM #t1 d
CROSS APPLY (
SELECT CAST(m.logins as date) as dates, m.email AS MAU
FROM #t1 m
WHERE m.logins BETWEEN d.logins and DATEADD(dd, 30, d.logins)
) m
group by CAST(logins as date)
The only way I found to join the two tables without breaking my cross apply was to inner join in both the outer and the inner query, which is probably wrong, but at least the output is correct. I do that so I can add my second condition to the WHERE clause of the inner query. When I apply this logic to my actual table, the performance is dreadful (fiddle data below):
SELECT distinct CASt(logins AS DATE) AS Dates,
#t2.country,
count(distinct d.email) AS DAU,
count(distinct m.MAU) AS MAU
FROM #t1 d
inner join #t2 on d.email=#t2.email
CROSS APPLY (
SELECT cast(m.logins as date) as dates, m.email AS MAU, country.country AS country
FROM #t1 m
inner join #t2 country on m.email=country.email
WHERE m.logins BETWEEN d.logins and DATEADD(dd, 30, d.logins)
and #t2.country = country.country
) m
group by cast(logins as date), #t2.country
+-------------+---------+-----+-----+
| Dates | country | DAU | MAU |
+-------------+---------+-----+-----+
| 2019-04-01 | france | 1 | 2 |
| 2019-04-02 | france | 1 | 2 |
| 2019-04-03 | france | 1 | 2 |
| 2019-04-10 | france | 1 | 1 |
| 2019-04-03 | italie | 2 | 2 |
+-------------+---------+-----+-----+
Objective:
How can I join information from one table to another without having to join it twice? (fiddle data below)
The result should look like this (output from the second query above):
DAU: how many distinct logins per country happened on day 'x'.
MAU: how many distinct logins per country happened between day 'x' and 30 days after.
Fiddle:
create table #t1 (email varchar(max), logins datetime)
insert into #t1 values
('aa@gmail.com', '2019-04-01 00:00:00.000'),
('aa@gmail.com', '2019-04-02 00:00:00.000'),
('aa@gmail.com', '2019-04-03 00:00:00.000'),
('zz@gmail.com', '2019-04-10 00:00:00.000'),
('cc@gmail.com', '2019-04-03 00:00:00.000'),
('dd@gmail.com', '2019-04-03 00:00:00.000'),
('dd@gmail.com', '2019-04-03 00:00:00.000')

create table #t2 (country varchar(max), email varchar(max))
insert into #t2 values
('france', 'aa@gmail.com'),
('france', 'zz@gmail.com'),
('italie', 'cc@gmail.com'),
('italie', 'dd@gmail.com')
Update
So I had initially said the second should perform better, but I'll eat those words. The first works better by far in my testing.
In my test environment I generated your tables as permanent tables and first populated #t2 (I called it emailLocation) with 100,000 unique email addresses spread across 206 countries. The second table (loginRecord) was populated with 2,000,000 random entries spread across 1/1/2018 - 12/31/2019. Both of these tables are indexed.
The below query is basically the one I said would be slower (it isn't). The major difference in this one is that I am filtering the dates within the CTE to reduce the data set. In my environment this runs in 20 seconds and returns 48,410 rows. I didn't test how long it would take to return the whole set, but trying this same CTE with a self-join ran for 10 minutes before I killed it.
WITH joined AS
(
SELECT
t1.logins AS dates,
t1.email,
t2.country
FROM loginRecord t1
JOIN dbo.emailLocation t2 ON t2.email = t1.email
WHERE t1.logins > GETDATE()
)
SELECT
dates,
country,
COUNT(DISTINCT(email)) AS DAU,
(SELECT COUNT(DISTINCT(email)) FROM joined WHERE country = j.country AND dates BETWEEN j.dates AND DATEADD(DAY,30,j.dates)) AS MAU
FROM joined j
GROUP BY j.dates, j.country
ORDER BY country, dates
---original answer
It feels like you're stuck on using the cross apply logic.
Here are two options that don't use cross apply. Both use a CTE to get a nice clean grouping of your temp tables, then the first option is a correlated subquery (blech) and the second is a self join.
Rextester here: https://rextester.com/AVJS76389
WITH joined AS
(
SELECT
t1.logins AS dates,
t1.email,
t2.country
FROM #t1 t1
JOIN #t2 t2 ON t2.email = t1.email
)
SELECT
dates,
country,
COUNT(DISTINCT(email)) AS DAU,
(SELECT COUNT(DISTINCT(email)) FROM joined WHERE country = j.country AND dates BETWEEN j.dates AND DATEADD(DAY,30,j.dates)) AS MAU
FROM joined j
GROUP BY j.dates, j.country;
WITH joined AS
(
SELECT
t1.logins AS dates,
t1.email,
t2.country
FROM #t1 t1
JOIN #t2 t2 ON t2.email = t1.email
)
SELECT
j1.dates,
j1.country,
COUNT(DISTINCT(j1.email)) AS DAU,
COUNT(DISTINCT(j2.email)) AS MAU
FROM joined j1
JOIN joined j2
ON j1.country = j2.country
AND j2.dates BETWEEN j1.dates AND DATEADD(DAY,30,j1.dates)
GROUP BY j1.dates, j1.country

Exclude rows where dates exist in another table

I have 2 tables, one is working pattern, another is absences.
1) Work pattern
ID  | Shift Start | Shift End
123 | 01-03-2017  | 02-03-2017
2) Absences
ID  | Absence Start | Absence End
123 | 01-03-2017    | 04-03-2017
What would be the best way, when selecting rows from work pattern, to exclude any that have a date marked as an absence in the absence table?
For example, I have a report that uses the work pattern table to count how many days a week an employee has worked; however, I don't want it to include the days that have been marked as an absence in the absence table, if that makes sense. I also don't want it to include any days that fall between the absence start and absence end dates.
If the span of the absence should always encompass the shift to be excluded you can use not exists():
select *
from WorkPatterns w
where not exists (
select 1
from Absences a
where a.Id = w.Id
and a.AbsenceStart <= w.ShiftStart
and a.AbsenceEnd >= w.ShiftEnd
)
rextester demo: http://rextester.com/DCODC76816
returns:
+-----+------------+------------+
| id | ShiftStart | ShiftEnd |
+-----+------------+------------+
| 123 | 2017-02-27 | 2017-02-28 |
| 123 | 2017-03-05 | 2017-03-06 |
+-----+------------+------------+
given this test setup:
create table WorkPatterns ([id] int, [ShiftStart] datetime, [ShiftEnd] datetime) ;
insert into WorkPatterns ([id], [ShiftStart], [ShiftEnd]) values
(123, '20170227', '20170228')
,(123, '20170301', '20170302')
,(123, '20170303', '20170304')
,(123, '20170305', '20170306')
;
create table Absences ([id] int, [AbsenceStart] datetime, [AbsenceEnd] datetime) ;
insert into Absences ([id], [AbsenceStart], [AbsenceEnd]) values
(123, '20170301', '20170304');
What would be the best way, when selecting rows from work pattern
If you are dealing only with dates (no time) and have control over the db schema, one approach is to create a calendar table, where you put all dates since the company started, plus some years into the future. Fill that table once. After that it is easy to join other tables with dates and do the math.
If you have trouble constructing the T-SQL query, please edit the question with more details about the columns and values of the tables, their relations, and the needed results.
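A minimal sketch of that idea, reusing the WorkPatterns and Absences test tables from the earlier answer (the Calendar table name and date range are illustrative assumptions):
-- One row per day; fill once, covering the range you care about.
CREATE TABLE Calendar (TheDate date NOT NULL PRIMARY KEY);

DECLARE @d date = '20150101';
WHILE @d <= '20301231'
BEGIN
    INSERT INTO Calendar (TheDate) VALUES (@d);
    SET @d = DATEADD(DAY, 1, @d);
END

-- Count worked days per id, skipping any day covered by an absence.
SELECT w.id, COUNT(*) AS DaysWorked
FROM WorkPatterns w
JOIN Calendar c ON c.TheDate BETWEEN w.ShiftStart AND w.ShiftEnd
WHERE NOT EXISTS (SELECT 1
                  FROM Absences a
                  WHERE a.id = w.id
                    AND c.TheDate BETWEEN a.AbsenceStart AND a.AbsenceEnd)
GROUP BY w.id;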
How about this:
SELECT WP_START.[id], WP_START.[shift_start], WP_START.[shift_end]
FROM work_pattern AS WP_START
INNER JOIN absences AS A ON WP_START.id = A.id
WHERE WP_START.[shift_start] NOT BETWEEN A.[absence_start] AND A.[absence_end]
UNION
SELECT WP_END.[id], WP_END.[shift_start], WP_END.[shift_end]
FROM work_pattern AS WP_END
INNER JOIN absences AS A ON WP_END.id = A.id
WHERE WP_END.[shift_end] NOT BETWEEN A.[absence_start] AND A.[absence_end]
See it on SQL Fiddle: http://sqlfiddle.com/#!6/49ae6/6
Here is my example that includes a Date Dimension table. If your DBAs won't add one, you can create #dateDim as a temp table, like I've done with SQLFiddle (didn't know I could do that). A typical date dimension would have a lot more details about the days, but if the table can't be added, just use what you need. You'll have to populate the other holidays you need. The DateDim I use often is at https://github.com/shawnoden/SQL_Stuff/blob/master/sql_CreateDateDimension.sql
SQL Fiddle
MS SQL Server 2014 Schema Setup:
/* Tables for your test data. */
CREATE TABLE WorkPatterns ( id int, ShiftStart date, ShiftEnd date ) ;
INSERT INTO WorkPatterns ( id, ShiftStart, ShiftEnd )
VALUES
(123, '20170101', '20171031')
, (124, '20170601', '20170831')
;
CREATE TABLE Absences ( id int, AbsenceStart date, AbsenceEnd date ) ;
INSERT INTO Absences ( id, AbsenceStart, AbsenceEnd )
VALUES
( 123, '20170123', '20170127' )
, ( 123, '20170710', '20170831' )
, ( 124, '20170801', '20170820' )
;
/* ******** MAKE SIMPLE CALENDAR TABLE ******** */
CREATE TABLE dateDim (
theDate DATE NOT NULL
, IsWeekend BIT DEFAULT 0
, IsHoliday BIT DEFAULT 0
, IsWorkDay BIT DEFAULT 0
);
/* Populate basic details of dates. */
INSERT dateDim(theDate, IsWeekend, IsHoliday)
SELECT d
, CONVERT(BIT, CASE WHEN DATEPART(dw,d) IN (1,7) THEN 1 ELSE 0 END)
, CONVERT(BIT, CASE WHEN d = '20170704' THEN 1 ELSE 0 END) /* 4th of July. */
FROM (
SELECT d = DATEADD(DAY, rn - 1, '20170101')
FROM
(
SELECT TOP (DATEDIFF(DAY, '20170101', '20171231'))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.[object_id]
) AS x
) AS y ;
/* If not a weekend or holiday, it's a WorkDay. */
UPDATE dateDim
SET IsWorkDay = CASE WHEN IsWeekend = 0 AND IsHoliday = 0 THEN 1 ELSE 0 END
;
Query For Calculation:
SELECT wp.ID, COUNT(d.theDate) AS workDayCount
FROM WorkPatterns wp
INNER JOIN dateDim d ON d.theDate BETWEEN wp.ShiftStart AND wp.ShiftEnd
AND d.IsWorkDay = 1
LEFT OUTER JOIN Absences a ON d.theDate BETWEEN a.AbsenceStart AND a.AbsenceEnd
AND wp.ID = a.ID
WHERE a.ID IS NULL
GROUP BY wp.ID
ORDER BY wp.ID
Results:
| ID | workDayCount |
|-----|--------------|
| 123 | 172 | << 216 total days, 44 non-working
| 124 | 51 | << 65 total days, 14 non-working