I have a table that tracks leave days for each user:
ID | Start | End | IDUser
1 | 02-02-2020 | 03-02-2020 | 2
2 | 01-02-2020 | 21-02-2020 | 2
IDUser connects to the Users table, which has IDUser and Username columns.
I have a view (query) that shows the columns above plus a column named UsedDays, which counts how many leave days were used:
DATEDIFF(DAY, dbo.leavedays.start, dbo.leavedays.[end]) + 1
This is what I have now:
Start | End | IDUser | UsedDays
02-02-2020 | 03-02-2020 | 2 | 1
01-02-2020 | 21-02-2020 | 1 | 20
Each user has a total number of available days per year, so I would like a column that subtracts the used days from that user's total and shows how many are left.
Example:
John (IDUser = 2) has 30 days available this year and he already used 1, so there are 29 left
Start | End | IDUser | TotalDaysYear | UsedDays | LeftDays
02-02-2020 | 03-02-2020 | 2 | 30 | 1 | 29
01-02-2020 | 21-02-2020 | 1 | 20 | 20 | 0
I believe I have to create a table for TotalDaysYear, probably with:
ID | Year | TotalDaysYear | IDUser
1 | 2020 | 30 | 2
2 | 2020 | 20 | 1
IDUser connects to the Users Table, that has IDUser and Username columns
But I'm having trouble working out the relationship and how to get the result I want, since it also depends on the year (available days may change per year, per user).
Assuming you are using SQL Server, this should work:
SELECT
ld.start,
ld.[end],
ld.IDUser,
ldy.TotalDaysYear,
SUM(DATEDIFF(DAY, ld.start, ld.[end])+1) OVER (PARTITION BY ld.IDUser, YEAR(ld.start) ORDER BY ld.start) as UsedDays,
ldy.TotalDaysYear - SUM(DATEDIFF(DAY, ld.start, ld.[end])+1) OVER (PARTITION BY ld.IDUser, YEAR(ld.start) ORDER BY ld.start) as LeftDays
FROM leavedays ld
LEFT JOIN leavedaysperyear ldy
ON YEAR(ld.start) = ldy.Year AND ld.IDUser = ldy.IDUser
The basic idea is to keep a running total of used days per user, per year, and then subtract it from the total available days for that user in that same year.
Here's a SQLFiddle
NB. The example provided doesn't handle leave periods across years
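To make the running-total idea concrete outside the database, here is a minimal Python sketch of the same logic. The leave rows and the allowance are hypothetical sample data, and `running_balance` is an illustrative helper name, not part of the query above. Like the query, it attributes a period to the year of its start date, so it does not split periods that cross a year boundary.

```python
from collections import defaultdict
from datetime import date

def running_balance(leaves, allowances):
    """leaves: iterable of (start, end, user_id).
    allowances: {(user_id, year): total_days}.

    Returns rows (start, end, user_id, total, used_so_far, left), mirroring
    SUM(...) OVER (PARTITION BY user, year ORDER BY start).
    """
    used = defaultdict(int)  # running total per (user, year)
    rows = []
    for start, end, user_id in sorted(leaves, key=lambda r: (r[2], r[0])):
        key = (user_id, start.year)
        used[key] += (end - start).days + 1  # inclusive count, like DATEDIFF + 1
        total = allowances.get(key)          # None when no allowance row exists (LEFT JOIN)
        left = None if total is None else total - used[key]
        rows.append((start, end, user_id, total, used[key], left))
    return rows

# Hypothetical data: user 2 has 30 days in 2020 and takes two leave periods.
leaves = [
    (date(2020, 3, 10), date(2020, 3, 12), 2),
    (date(2020, 2, 2), date(2020, 2, 3), 2),
]
allowances = {(2, 2020): 30}
result = running_balance(leaves, allowances)
for row in result:
    print(row)
```

With this data the used days accumulate as 2 and then 5, leaving 28 and then 25.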
I have a main table in Microsoft Access that consists of a document number "AD", a revision number "Rev" and a "Decision Date". There is occasionally more than 1 revision for every AD and 1-2 decision dates for every revision. I want to create a query that selects the most recent entry by decision date, and create a new table that only contains the most recent entries. The purpose of this new table is to have only unique ADs, so that AD can be made a primary key and related to other objects in the database.
Current Table: tbl1_Complete_Data
+----+--------+-----+---------------+
| ID | AD | Rev | Decision Date |
+----+--------+-----+---------------+
| 1 |98-24-02| 0 | 1998-06-20 |
| 2 |98-24-02| 0 | 1998-06-21 |
| 3 |98-24-02| 1 | 1998-06-24 |
| 4 |98-24-02| 1 | 1998-06-24 |
| 5 |98-24-03| 0 | 1998-06-24 |
| 6 |98-24-03| 0 | 1998-06-24 |
+----+--------+-----+---------------+
New Table: tbl2_Report_Data
+----+--------+-----+---------------+
| ID | AD | Rev | Decision Date |
+----+--------+-----+---------------+
| 3 |98-24-02| 1 | 1998-06-24 |
| 5 |98-24-03| 0 | 1998-06-24 |
+----+--------+-----+---------------+
^The goal of this table is to get rid of ID.
Consider:
SELECT tbl1_Complete_Data.* FROM tbl1_Complete_Data WHERE ID IN (
SELECT TOP 1 ID FROM tbl1_Complete_Data AS Dupe
WHERE Dupe.AD = tbl1_Complete_Data.AD ORDER BY Dupe.DecisionDate DESC, Dupe.ID);
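I strongly advise against using spaces or punctuation/special characters in object names. The query above assumes the field is named DecisionDate; if the space is kept, it has to be written as [Decision Date].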
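The correlated-subquery pattern can be tried quickly with Python's built-in sqlite3 module. This is only a sketch under assumptions: SQLite stands in for Access, so `TOP 1` becomes `LIMIT 1`, the date column is named DecisionDate, and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl1_Complete_Data "
    "(ID INTEGER PRIMARY KEY, AD TEXT, Rev INTEGER, DecisionDate TEXT)"
)
conn.executemany(
    "INSERT INTO tbl1_Complete_Data VALUES (?, ?, ?, ?)",
    [
        (1, "98-24-02", 0, "1998-06-20"),
        (2, "98-24-02", 0, "1998-06-21"),
        (3, "98-24-02", 1, "1998-06-24"),
        (4, "98-24-02", 1, "1998-06-24"),
        (5, "98-24-03", 0, "1998-06-24"),
        (6, "98-24-03", 0, "1998-06-24"),
    ],
)

# For each AD, keep the row whose ID is first when sorted by
# latest decision date, with the lowest ID breaking ties.
latest = conn.execute(
    """
    SELECT t.ID, t.AD, t.Rev, t.DecisionDate
    FROM tbl1_Complete_Data AS t
    WHERE t.ID IN (
        SELECT Dupe.ID
        FROM tbl1_Complete_Data AS Dupe
        WHERE Dupe.AD = t.AD
        ORDER BY Dupe.DecisionDate DESC, Dupe.ID
        LIMIT 1
    )
    ORDER BY t.ID
    """
).fetchall()
print(latest)
```

The tie-breaker on ID is what guarantees exactly one row per AD when several rows share the latest decision date.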
I have a Production Table and a Standing Data table. The relationship of Production to Standing Data is actually Many-To-Many which is different to how this relationship is usually represented (Many-to-One).
The standing data table holds a list of tasks and the score each task is worth. Tasks can appear multiple times with different "ValidFrom" dates for changing the score at different points in time. What I am trying to do is query the Production Table so that the TaskID is looked up in the table and uses the date it was logged to check what score it should return.
Here's an example of how I want the data to look:
Production Table:
+----------+------------+-------+-----------+--------+-------+
| RecordID | Date | EmpID | Reference | TaskID | Score |
+----------+------------+-------+-----------+--------+-------+
| 1 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 2 | 27/02/2020 | 1 | 123 | 1 | 1.5 |
| 3 | 30/02/2020 | 1 | 123 | 1 | 2 |
| 4 | 31/02/2020 | 1 | 123 | 1 | 2 |
+----------+------------+-------+-----------+--------+-------+
Standing Data
+----------+--------+----------------+-------+
| RecordID | TaskID | DateActiveFrom | Score |
+----------+--------+----------------+-------+
| 1 | 1 | 01/02/2020 | 1.5 |
| 2 | 1 | 28/02/2020 | 2 |
+----------+--------+----------------+-------+
I have tried the below code but unfortunately due to multiple records meeting the criteria, the production data duplicates with two different scores per record:
SELECT p.[RecordID],
p.[Date],
p.[EmpID],
p.[Reference],
p.[TaskID],
s.[Score]
FROM ProductionTable as p
LEFT JOIN StandingDataTable as s
ON s.[TaskID] = p.[TaskID]
AND s.[DateActiveFrom] <= p.[Date];
What is the correct way to return the correct and singular/scalar Score value for this record based on the date?
You can use APPLY:
SELECT p.[RecordID], p.[Date], p.[EmpID], p.[Reference], p.[TaskID], s.[Score]
FROM ProductionTable as p OUTER APPLY
( SELECT TOP (1) s.[Score]
FROM StandingDataTable AS s
WHERE s.[TaskID] = p.[TaskID] AND
s.[DateActiveFrom] <= p.[Date]
ORDER BY S.DateActiveFrom DESC
) s;
If you need the score matched at record level instead, change the WHERE clause inside the APPLY accordingly.
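The "latest score in effect on the logged date" lookup can be sketched with Python's sqlite3 module. Assumptions: SQLite has no OUTER APPLY, so a correlated scalar subquery with `LIMIT 1` plays the same role here, and the production dates are hypothetical valid ISO dates rather than the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ProductionTable
        (RecordID INTEGER, Date TEXT, EmpID INTEGER, Reference INTEGER, TaskID INTEGER);
    CREATE TABLE StandingDataTable
        (RecordID INTEGER, TaskID INTEGER, DateActiveFrom TEXT, Score REAL);
    INSERT INTO ProductionTable VALUES
        (1, '2020-02-27', 1, 123, 1),
        (2, '2020-02-27', 1, 123, 1),
        (3, '2020-02-28', 1, 123, 1),
        (4, '2020-02-29', 1, 123, 1);
    INSERT INTO StandingDataTable VALUES
        (1, 1, '2020-02-01', 1.5),
        (2, 1, '2020-02-28', 2.0);
    """
)

# For every production row, pick the single standing-data row with the
# most recent DateActiveFrom that is not later than the production date.
scores = conn.execute(
    """
    SELECT p.RecordID, p.Date,
           (SELECT s.Score
            FROM StandingDataTable AS s
            WHERE s.TaskID = p.TaskID
              AND s.DateActiveFrom <= p.Date
            ORDER BY s.DateActiveFrom DESC
            LIMIT 1) AS Score
    FROM ProductionTable AS p
    ORDER BY p.RecordID
    """
).fetchall()
print(scores)
```

Each production row comes back exactly once, and a row with no applicable standing-data entry would get a NULL score, which matches the outer-join behaviour of OUTER APPLY.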
Background information:
My company requires employees to maintain at least one certification (cert) on a position. There are a total of 17 different certifications that an employee can get.
An employee can hold multiple certs. But on any one day they can only "sit" one of the positions that they are certified in. Most employees primarily sit the highest level position that they hold a cert in, but can sit a lower level position if there are manning shortages in that position and if they hold that particular cert (some employees come to us holding the higher level certs but none of the lower ones because they let them expire).
Multiple employees can hold the same cert.
Around 90% of employees are on contract, meaning they have a set termination date. Contracts can be extended but for the sake of this Access database, and the report to be generated, we're presuming that the termination date is set in stone.
My boss (and boss' boss) are wanting to put together a manning projection report so that they don't get caught off guard should we start running low on employees certified in any one position.
Example of what they want:
Let's say you have three employees:
Employee1 has certs in position1, position2, and position3 but he primarily sits as position3 and his contract expires June 2020.
Employee2 has certs in position1 and position2 but primarily sits as position2 and her contract expires in February 2022.
Employee3 is new and arrived August 2019 and is in training to get position1, maximum allowed training time for initial cert is 3 months, so presumably he should have his position1 cert by December 2019 and his contract expires August 2025.
Let's say my boss wants to project out 12 months with the starting month being November 2019 (he'll only be able to select a starting month-year that is equal to or later than the current month-year). The charts below, which are generated in subreports, should be what gets generated from the above employee information.
All Certifications Chart
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| Cert | Nov 19 | Dec 19 | Jan 20 | Feb 20 | Mar 20 | Apr 20 | May 20 | Jun 20 | Jul 20 | Aug 20 | Sep 20 | Oct 20 |
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| Position1 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 2 |
| Position2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 |
| Position3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
Primary Certifications Chart
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| Cert | Nov 19 | Dec 19 | Jan 20 | Feb 20 | Mar 20 | Apr 20 | May 20 | Jun 20 | Jul 20 | Aug 20 | Sep 20 | Oct 20 |
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
| Position1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Position2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Position3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
+-----------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+--------+
Now I already have a solution in place but it's extremely inefficient and involves a query for each cell (2 Charts X 12 Months X 17 positions = 408 Queries when a report is generated). I'm hoping to do something more efficient with a crosstab query.
The tables are set up as such (only listing relevant fields):
Emp_table
ID (autoNum)
contractStarted (Date)
contractEnd (Date)
Cert_individual
ID (autoNum)
certID (num, many->one relationship to cert_table.ID)
EmpID (num, many->one relationship to Emp_table.ID)
date_cert_received (date)
primary (yes/no)
cert_table
ID (autoNum)
cert_name (short text)
Obviously I'd need to do a couple of INNER JOINS in order to get everything together and I tried using the format from this website for my crosstab query but it would only add an individual cert to a count on the month-year that the employee received it and not to every month that the employee will hold the cert.
So my question is:
Is there a way in SQL or VBA to get a cert counted across multiple columns (month-years) based off of when the employee received the cert and when their contract is scheduled to terminate?
As far as I know, the main problem in getting the crosstab query is that it can only generate columns with data that you already have.
A solution for you to get the monthly columns would be to have a side table with the 12 dates and then use the Cartesian product to generate the monthly data for each of your records in your certification table. This "date" table can be updated and maintained to match the months that you require in your report with a query.
For example, if you have a table named TempDates :
And a table with Employees with the following data :
You can generate the cartesian product with a query that I named QryCertsDates :
SELECT Employees.*, TempDates.* FROM TempDates, Employees;
Which lets you attach all the wanted dates with your original date from the table Employees in order to obtain data similar to below :
Now you can generate your crosstab query pivoting on the month and year and filtering the dates with the WHERE criteria such as :
TRANSFORM Count(QryCertsDates.Cert) AS CountOfCert
SELECT QryCertsDates.Cert
FROM QryCertsDates
WHERE (((CDate([Yr] & "-" & [Mo])) Between CDate([Start]) And CDate([Expire])))
GROUP BY QryCertsDates.Cert
PIVOT CDate([Yr] & "-" & Format([Mo],"00"));
You will end up ultimately with something like this :
You can do the same thing to get your second table/report as well. I don't know your database structure, so you will most likely need to do some adaptation. The other possible way that you can achieve a similar result would be to fill in a table using VBA.
However, this might be the easier solution to implement. Good luck!
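The cross-join-then-pivot idea can be checked with a small Python sketch before building the Access queries. Everything here is an assumption for illustration: the cert rows are hypothetical (loosely modelled on the three employees in the question), months are (year, month) tuples, and a cert counts in a month when it has been received by that month and the contract ends in a later month.

```python
from itertools import product

def month_range(start, count):
    """Yield `count` consecutive (year, month) tuples starting at `start`."""
    year, month = start
    for _ in range(count):
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1

# Hypothetical rows: (cert, month received, month the contract ends).
certs = [
    ("Position1", (2018, 1), (2020, 6)),   # Employee1
    ("Position2", (2018, 1), (2020, 6)),
    ("Position3", (2018, 1), (2020, 6)),
    ("Position1", (2018, 1), (2022, 2)),   # Employee2
    ("Position2", (2018, 1), (2022, 2)),
    ("Position1", (2019, 12), (2025, 8)),  # Employee3
]
months = list(month_range((2019, 11), 12))  # Nov 2019 .. Oct 2020

# Cartesian product of cert rows and months, filtered by the date window,
# then counted into one row per cert and one column per month.
table = {name: [0] * len(months) for name in sorted({c[0] for c in certs})}
for (name, received, contract_end), (col, month) in product(certs, enumerate(months)):
    if received <= month < contract_end:
        table[name][col] += 1

for name, counts in table.items():
    print(name, counts)
```

Under these assumptions the counts line up with the "All Certifications Chart" in the question. Whether the contract-end month itself should count is a business rule, so adjust the comparison if it should be inclusive.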
I have prescription drug data that has a prescription date and the number of days supplied for that prescription. I am trying to estimate actual drug intake dates, which can differ from the prescription date if people (1) refill their prescription before their current prescription is done or (2) lose their current prescription and so need a refill.
Below is sample data for 1 patient:
| patient_id | rx_start_date | days_supply |
|------------|---------------|-------------|
| 1 | 1/10/2013 | 3 |
| 1 | 1/11/2013 | 3 |
| 1 | 1/14/2013 | 3 |
Without adjusting for stockpiling, the end dates are calculated as rx_start_date + days_supply - 1:
| patient_id | rx_start_date | days_supply | rx_end_date |
|------------|---------------|-------------|-------------|
| 1 | 1/10/2013 | 3 | 1/12/2013 |
| 1 | 1/11/2013 | 3 | 1/13/2013 |
| 1 | 1/14/2013 | 3 | 1/16/2013 |
As you can see the start date for the 2nd prescription is overlapped by the first prescription. If we assume that they filled their prescription early then the actual intake date for the 2nd prescription should start on 1/13/2013. But moving the end date of the 2nd prescription causes an overlap over the 3rd prescription and so that must be moved as well. See the expected resulting table below:
| patient_id | rx_start_date | days_supply | rx_end_date |
|------------|---------------|-------------|-------------|
| 1 | 1/10/2013 | 3 | 1/12/2013 |
| 1 | 1/13/2013 | 3 | 1/15/2013 |
| 1 | 1/16/2013 | 3 | 1/18/2013 |
In the other case, we might say that if the current prescription overlaps the next one by more than 50%, then we assume they lost their prescription and the 2nd prescription's start date is the actual intake date. This means, though, that we need to truncate the current prescription to end when the 2nd one starts.
The algorithm is relatively simple using a non-SQL iterative solution, but I'm having trouble with a generic SQL solution, since adjusting dates at time X could potentially cause a cascading effect that adjusts many other dates. I'm using Impala SQL, so recursive CTEs are not an option, and I'd like this to work on other databases, so database-specific functions are not ideal either.
The following should give you what you are looking for, so long as there are no gaps in the treatment regime:
with aggs as (
  select d1.patient_id, d1.rx_start_dt,
         sum(ds.days_supply) days_supply,
         min(ds.rx_start_dt) + sum(ds.days_supply) - 1 end_dt
  from drugs d1
  inner join drugs ds
    on ds.patient_id = d1.patient_id and ds.rx_start_dt <= d1.rx_start_dt
  group by d1.patient_id, d1.rx_start_dt)
select patient_id,
       coalesce(lag(end_dt+1) over (partition by patient_id order by rx_start_dt), rx_start_dt) start_dt,
       end_dt
from aggs;
Using the given sample data, this gives as output:
ID Start End
1 2013-01-10 2013-01-12
1 2013-01-13 2013-01-15
1 2013-01-16 2013-01-18
This was tested on Oracle, but all the functions used appear to be available in Impala as well, so it should work there too.
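For cross-checking any SQL result, the iterative version of the early-refill case is short to write in Python. This is a sketch of that iterative approach, not a translation of the query: `shift_for_stockpiling` is a hypothetical helper name, it handles only the "filled early" case (not the 50% lost-prescription rule), and unlike the cumulative-sum query it leaves a prescription's start unchanged when there is a gap before it.

```python
from datetime import date, timedelta

def shift_for_stockpiling(prescriptions):
    """prescriptions: list of (rx_start_date, days_supply) for one patient.

    Returns (intake_start, intake_end) pairs. When a prescription starts on
    or before the previous intake end, its start is pushed to the next day.
    """
    adjusted = []
    prev_end = None
    for rx_start, days_supply in sorted(prescriptions):
        start = rx_start
        if prev_end is not None and start <= prev_end:
            start = prev_end + timedelta(days=1)  # early refill: take it after the current supply
        end = start + timedelta(days=days_supply - 1)
        adjusted.append((start, end))
        prev_end = end  # the shifted end is what the next prescription is compared against
    return adjusted

sample = [
    (date(2013, 1, 10), 3),
    (date(2013, 1, 11), 3),
    (date(2013, 1, 14), 3),
]
intake = shift_for_stockpiling(sample)
for start, end in intake:
    print(start, end)
```

Carrying `prev_end` forward is what produces the cascading effect: the third prescription is compared against the already shifted end of the second one.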
The question below is copied from another post that asked for a Tableau answer, but I would like to use SQL to avoid performance problems.
I'm trying to calculate user retention rates across dates and for the last 14 days. For example, if 44 users arrive for the first time on September 16th, and then 19 of them show up again on September 17th, our day 1 retention for those September 16th users is 19/44. And if 41 users showed up for the first time on September 17th and 24 of those came back again on September 18th, then the September 17th 1 day retention would be 24/41. And if 18 users returned on September 18th who arrived for the first time on September 16th, then their 2 day retention would be 18/44.
The final outcome I would like is shown below. I'm trying to figure out how to calculate the retention for each cohort day by date. The login table contains the following columns: TimeStamp, userid, gamelabel, and play_time.
Login Table
TimeStamp | Userid | GameLabel | playtime |
-----------------------+-----------+------------+-----------
2016-09-16 21:00:24+8 | af07 | LL | 60010 |
2016-09-16 21:00:25+8 | 9dbe | YY | 60016 |
2016-09-16 21:01:24+8 | af07 | SS | 60009 |
The Final Outcome I would like to have
Retention| Today | Today- 1 Day|Today- 2 Day...|Today-12 Day |Today-13 Day
---------+--------+-------------+---------------+------------------+--------
|09/29/16| 09/28/16 | 09/27/16...| 09/17/16 | 09/16/16
0 | | | | 41/41| 44/44
1 | | | | 24/41| 19/44
2 | | | | | 18/44
3 | | | | |
7 | | | | |
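One way to reason about the calculation before writing the SQL is a small Python sketch. The login rows below are hypothetical, `retention` is an illustrative helper name, and timestamps are assumed to be already truncated to calendar dates. A user's cohort is the date of their first login, and day-N retention is the number of distinct cohort users seen N days later, divided by the cohort size.

```python
from collections import Counter, defaultdict
from datetime import date

def retention(logins):
    """logins: iterable of (login_date, userid).

    Returns {(cohort_date, day_offset): (returning_users, cohort_size)}.
    """
    logins = list(logins)
    first_seen = {}
    for day, user in logins:
        if user not in first_seen or day < first_seen[user]:
            first_seen[user] = day
    cohort_size = Counter(first_seen.values())  # users per first-login date

    returning = defaultdict(set)  # distinct users per (cohort, day offset)
    for day, user in logins:
        cohort = first_seen[user]
        returning[(cohort, (day - cohort).days)].add(user)

    return {
        (cohort, offset): (len(users), cohort_size[cohort])
        for (cohort, offset), users in returning.items()
    }

# Hypothetical logins: a, b, c arrive on Sep 16; d arrives on Sep 17.
logins = [
    (date(2016, 9, 16), "a"), (date(2016, 9, 16), "b"),
    (date(2016, 9, 16), "c"), (date(2016, 9, 16), "a"),
    (date(2016, 9, 17), "a"), (date(2016, 9, 17), "d"),
    (date(2016, 9, 18), "a"), (date(2016, 9, 18), "b"),
    (date(2016, 9, 18), "d"),
]
rates = retention(logins)
for (cohort, offset), (back, size) in sorted(rates.items()):
    print(cohort, offset, f"{back}/{size}")
```

In SQL the same two steps would typically be a MIN(date) per user followed by a COUNT(DISTINCT userid) grouped by cohort date and day offset, with the pivot into columns done last.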