Could you please take a look at the following task?
I have a DATA table (it contains data for the previous week):
CREATE TABLE DATA
(
EMPLOYEE nvarchar(50),
ABSENCE_START_DATE datetime,
ABSENCE_END_DATE datetime,
ABSENCE_TYPE nvarchar(50)
)
ABSENCE_START_DATE - date when absence starts
ABSENCE_END_DATE - date when absence ends
ABSENCE_TYPE - type of absence
Current table contains the following data:
INSERT INTO DATA(EMPLOYEE,ABSENCE_START_DATE,ABSENCE_END_DATE,ABSENCE_TYPE) VALUES
('EMP01','2017-09-04 00:00:00.000','2017-09-06 00:00:00.000','Sickness'),--Monday - Wednesday
('EMP01','2017-09-08 00:00:00.000','2017-09-08 00:00:00.000','Vacation'),--Friday - Friday
('EMP02','2017-09-04 00:00:00.000','2017-09-09 00:00:00.000','Sickness'),--Monday - Friday
('EMP03','2017-09-05 00:00:00.000','2017-09-09 00:00:00.000','Sickness')--Tuesday - Friday
Also, I have another table - STORAGE (it contains data for dates which are earlier than start of previous week).
CREATE TABLE STORAGE
(
EMPLOYEE nvarchar(50),
APPLY_DATE datetime,
ABSENCE_TYPE nvarchar(50)
)
There are daily records (excluding Saturdays and Sundays - they will never exist in this table)
INSERT INTO STORAGE(EMPLOYEE,APPLY_DATE,ABSENCE_TYPE) VALUES
('EMP01','2017-08-27 00:00:00.000','Sickness'),
('EMP01','2017-08-28 00:00:00.000','Worked'),
('EMP01','2017-08-29 00:00:00.000','Worked'),
('EMP01','2017-08-30 00:00:00.000','Sickness'),
('EMP01','2017-08-31 00:00:00.000','Sickness'),
('EMP01','2017-09-01 00:00:00.000','Sickness'),
('EMP02','2017-08-31 00:00:00.000','Worked'),
('EMP02','2017-09-01 00:00:00.000','Sickness')
So, the task is:
The SQL script should find the original start date of each absence period (from the DATA table) whose absence start date is a Monday.
In other words, the script should walk back day by day and find the date on which that absence period actually started.
The absence on Monday is not necessarily 'Sickness'. It could also be 'Travel', 'Maternity', and so on.
Expected result for the examples above (note the first and third rows: their absence start dates differ from the corresponding rows in the DATA table):
Thank you in advance.
From the sample data, it looks like you want to replace the ABSENCE_START_DATE of the DATA table with the earliest matching APPLY_DATE from STORAGE. The query below is one option.
SELECT d.Employee,
coalesce(CASE
WHEN d.ABSENCE_TYPE = 'Sickness' THEN
(SELECT min(apply_date)
FROM
STORAGE s
WHERE s.employee = d.employee
AND s.ABSENCE_TYPE = 'Sickness')
ELSE d.ABSENCE_START_DATE
END,d.ABSENCE_START_DATE) AS ABSENCE_START_DATE,
d.ABSENCE_END_DATE,
ABSENCE_TYPE
FROM DATA d;
Update 1:
A more generic query is below.
SELECT d.Employee,
coalesce(
(SELECT min(apply_date)
FROM
STORAGE s
WHERE s.employee = d.employee
AND s.ABSENCE_TYPE = d.ABSENCE_TYPE),d.ABSENCE_START_DATE) AS ABSENCE_START_DATE,
d.ABSENCE_END_DATE,
d.ABSENCE_TYPE
FROM DATA d
Update 2:
If you want to exclude weekends from the data, below is the query.
SELECT d.Employee,
coalesce(
(SELECT min(apply_date)
FROM
STORAGE s
WHERE s.employee = d.employee
AND s.ABSENCE_TYPE = d.ABSENCE_TYPE
AND DATENAME(dw,apply_date) NOT IN('Sunday','Saturday')),d.ABSENCE_START_DATE) AS ABSENCE_START_DATE,
d.ABSENCE_END_DATE,
d.ABSENCE_TYPE
FROM DATA d
Result:
Employee ABSENCE_START_DATE ABSENCE_END_DATE ABSENCE_TYPE
--------------------------------------------------------------------
EMP01 30.08.2017 00:00:00 06.09.2017 00:00:00 Sickness
EMP01 08.09.2017 00:00:00 08.09.2017 00:00:00 Vacation
EMP02 01.09.2017 00:00:00 09.09.2017 00:00:00 Sickness
EMP03 05.09.2017 00:00:00 09.09.2017 00:00:00 Sickness
you can check the demo here
Hope this will help.
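The queries above take the earliest matching STORAGE date, so they would also pick up an older, unrelated absence of the same type. A minimal sketch of the day-by-day walk described in the question, written in Python rather than SQL to keep the logic easy to follow. The names previous_weekday and original_start are hypothetical helpers, and the sketch assumes STORAGE holds at most one row per employee per weekday.

```python
from datetime import date, timedelta

# Sample rows copied from the DATA and STORAGE tables in the question.
DATA = [
    ("EMP01", date(2017, 9, 4), date(2017, 9, 6), "Sickness"),
    ("EMP01", date(2017, 9, 8), date(2017, 9, 8), "Vacation"),
    ("EMP02", date(2017, 9, 4), date(2017, 9, 9), "Sickness"),
    ("EMP03", date(2017, 9, 5), date(2017, 9, 9), "Sickness"),
]
STORAGE = {
    ("EMP01", date(2017, 8, 27)): "Sickness",
    ("EMP01", date(2017, 8, 28)): "Worked",
    ("EMP01", date(2017, 8, 29)): "Worked",
    ("EMP01", date(2017, 8, 30)): "Sickness",
    ("EMP01", date(2017, 8, 31)): "Sickness",
    ("EMP01", date(2017, 9, 1)): "Sickness",
    ("EMP02", date(2017, 8, 31)): "Worked",
    ("EMP02", date(2017, 9, 1)): "Sickness",
}


def previous_weekday(day):
    """Return the closest earlier date that is not a Saturday or Sunday."""
    day -= timedelta(days=1)
    while day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        day -= timedelta(days=1)
    return day


def original_start(employee, start, absence_type, storage):
    """Walk back over consecutive weekdays that carry the same absence type."""
    if start.weekday() != 0:  # only a Monday start can continue an older absence
        return start
    current = start
    while storage.get((employee, previous_weekday(current))) == absence_type:
        current = previous_weekday(current)
    return current


results = [
    (emp, original_start(emp, start, kind, STORAGE), end, kind)
    for emp, start, end, kind in DATA
]
for row in results:
    print(*row)
```

The walk stops at the first weekday whose stored type differs (or is missing), so EMP01's 'Worked' days on 2017-08-28 and 2017-08-29 end the search at 2017-08-30.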
I have a sqlite3 database maintained on an AWS exchange that is regularly updated by a Python script. One of the things it tracks is when any team generates a new post for a given topic. The entries look something like this:
id   client           team      date        industry      city
895  acme industries  blueteam  2022-06-30  construction  springfield
I'm trying to create a table that shows me how many entries for construction occur each day. Right now, the entries with data populate, but they exclude dates with no entries. For example, if I search for just
SELECT date, count(id) as num_records
from mytable
WHERE industry = "construction"
group by date
order by date asc
I'll get results that look like this:
date        num_records
2022-04-01  3
2022-04-04  1
How can I make sqlite output this instead:
date        num_records
2022-04-01  3
2022-04-02  0
2022-04-03  0
2022-04-04  1
I'm trying to generate some graphs from this data and need to be able to include all dates for the target timeframe.
EDIT/UPDATE:
The table does not already include every date; it only includes dates relevant to an entry. If no team posts work on a day, the date column will jump from day 1 (e.g. 2022-04-01) to day 3 (2022-04-03).
Assuming your "mytable" table already contains every date you need, you can first select the distinct dates, then LEFT JOIN your own query to them, and map the resulting NULL values of "num_records" to 0 using the COALESCE function.
WITH cte AS (
SELECT date,
COUNT(id) AS num_records
FROM mytable
WHERE industry = 'construction'
GROUP BY date
)
SELECT dates.date,
COALESCE(cte.num_records, 0) AS num_records
FROM (SELECT DISTINCT date FROM mytable) dates
LEFT JOIN cte
ON dates.date = cte.date
ORDER BY dates.date
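If the table does not contain every date (as the question's update says), the date list has to be generated instead. A minimal sketch using SQLite's recursive CTE support from Python's sqlite3 module; the in-memory database, the sample rows and the :first_day / :last_day parameter names are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, client TEXT, team TEXT, "
    "date TEXT, industry TEXT, city TEXT)"
)
# Illustrative rows: three construction posts on 04-01, one on 04-04.
conn.executemany(
    "INSERT INTO mytable (client, team, date, industry, city) VALUES (?, ?, ?, ?, ?)",
    [
        ("acme industries", "blueteam", "2022-04-01", "construction", "springfield"),
        ("acme industries", "redteam", "2022-04-01", "construction", "springfield"),
        ("acme industries", "blueteam", "2022-04-01", "construction", "springfield"),
        ("other client", "blueteam", "2022-04-02", "retail", "springfield"),
        ("acme industries", "blueteam", "2022-04-04", "construction", "springfield"),
    ],
)

QUERY = """
WITH RECURSIVE dates(day) AS (
    SELECT :first_day                      -- anchor row: first day of the range
    UNION ALL
    SELECT date(day, '+1 day')             -- add one day per recursion step
    FROM dates
    WHERE day < :last_day
)
SELECT dates.day AS date, COUNT(t.id) AS num_records
FROM dates
LEFT JOIN mytable AS t
       ON t.date = dates.day AND t.industry = 'construction'
GROUP BY dates.day
ORDER BY dates.day
"""

rows = conn.execute(
    QUERY, {"first_day": "2022-04-01", "last_day": "2022-04-04"}
).fetchall()
for row in rows:
    print(row)
```

The industry filter sits in the ON clause rather than in a WHERE clause, so days without a matching row survive the LEFT JOIN and COUNT(t.id) reports 0 for them.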
I have a Time In and a Time Out, and there are time ranges defined for Breakfast, Lunch and Dinner. What I want is to subtract these times from the attendance time (Time In and Time Out).
The sample data is
Attendance Table Data
EMPID 1095
TimeIN 2017-03-01 08:52:45.000
TimeOut 2017-03-01 19:59:18.000
The Mess Timings are
type StartTime EndTime
BreakFast 06:30:39 10:00:39
Dinner 19:00:39 21:00:39
Lunch 12:00:23 15:00:23
What I need is to subtract these mess timings from the actual attendance time to get the employee's actual duty time.
Thanks.
This approach uses a numbers table to build a lookup table of every second between your @TimeIn and @TimeOut values. It will work for periods covering multiple days, albeit with some severe caveats:
Breakfast, Lunch and Dinner are at the same time every day.
Your @TimeIn to @TimeOut period doesn't get so big that it overflows the int value holding the number of seconds. If it does, you will need to either work in minutes or find a different method.
Your return value is less than 24 hours. If it isn't, don't return the difference as a time data type; handle it accordingly.
declare @TimeIn datetime = '2017-03-01 08:52:45.000'
,@TimeOut datetime = '2017-03-01 19:59:18.000'
,@BStart time = '06:30:39'
,@BEnd time = '10:00:39'
,@LStart time = '12:00:23'
,@LEnd time = '15:00:23'
,@DStart time = '19:00:39'
,@DEnd time = '21:00:39';
-- Create numbers table then use it to build a table of seconds between TimeIn and TimeOut
-- (six cross-joined copies of the 10-row table n give at most 1,000,000 rows)
with n(n) as (select n from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) as n(n))
,s(s) as (select top (select datediff(s,@TimeIn,@TimeOut)+1) dateadd(s,row_number() over (order by (select 1))-1,@TimeIn) from n n1,n n2,n n3,n n4,n n5,n n6)
select cast(dateadd(s,count(1),0) as time) as s
from s
where s between @TimeIn and @TimeOut -- Return all seconds that aren't within Breakfast, Lunch or Dinner
and cast(s as time) not between @BStart and @BEnd
and cast(s as time) not between @LStart and @LEnd
and cast(s as time) not between @DStart and @DEnd
Which returns: 05:59:58.0000000
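The same figure can also be reached without enumerating seconds, by subtracting only the part of each meal window that overlaps the attendance period. A minimal sketch of that interval arithmetic in Python; duty_time is a hypothetical helper, and it assumes the meal windows repeat daily, do not overlap each other and do not cross midnight.

```python
from datetime import datetime, time, timedelta


def duty_time(time_in, time_out, breaks):
    """Attendance duration minus the overlap with each daily break window."""
    total = time_out - time_in
    day = time_in.date()
    while day <= time_out.date():  # repeat the windows for every day covered
        for start, end in breaks:
            window_start = datetime.combine(day, start)
            window_end = datetime.combine(day, end)
            # Overlap of [time_in, time_out] with this window; negative if disjoint.
            overlap = min(time_out, window_end) - max(time_in, window_start)
            if overlap > timedelta(0):
                total -= overlap
        day += timedelta(days=1)
    return total


MESS_TIMINGS = [
    (time(6, 30, 39), time(10, 0, 39)),   # BreakFast
    (time(12, 0, 23), time(15, 0, 23)),   # Lunch
    (time(19, 0, 39), time(21, 0, 39)),   # Dinner
]

worked = duty_time(
    datetime(2017, 3, 1, 8, 52, 45),
    datetime(2017, 3, 1, 19, 59, 18),
    MESS_TIMINGS,
)
print(worked)
```

For the sample row this gives 6:00:00. The two-second difference from the query above comes from its inclusive BETWEEN tests, which count both boundary seconds of every window as break time.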
I have the daily mess timings in another table, so I created a view that puts all of those fields alongside the daily attendance, then used a CASE statement to match the timings against the daily attendance time.
EmployeeID AttendanceDate ShiftID TimeIn TimeOut BreakOut BreakIn LeaveType TotalHours LeaveHours ATOThours DeductedHrs OTHours UserID AudtDate Reason SM SY OTDed DutyDed Mark Expr1 MARKL BreakFastStart BreakFastEnd LunchStart LunchEnd DinnerStart DinnerEnd
1095 2017-03-01 00:00:00.000 1 2017-03-01 08:52:45.000 2017-03-01 19:59:18.000 NULL NULL NULL 0 NULL 0 0 0 NULL NULL NULL 3 2017 NULL NULL NULL NULL NULL 2017-02-20 06:30:34.000 2017-02-20 09:30:34.000 2017-02-20 12:00:26.000 2017-02-20 15:00:26.000 2017-02-20 19:00:59.000 2017-02-20 21:00:59.000
For now it works well; I will check its reliability over time.
Thanks for the support.
You can also use the following script in a view or in a JOIN query over the tables. Note that I got a different answer, which I think is correct.
SELECT CONVERT(varchar, DATEADD(ss,
(DATEDIFF(ss,TimeIn, [TimeOut]) -
(
DATEDIFF(ss,[BreakFastStartTime], [BreakFastEndTime]) +
DATEDIFF(ss,[LunchStartTime], [LunchEndTime]) +
DATEDIFF(ss,[DinnerStartTime], [DinnerEndTime])
)
), 0), 108)
FROM [Attendance Data]
For your example, answer is 02:36:33
I have a customer table in which a new row is inserted whenever a customer signs up.
Problem
I want to know the total number of signups per day for a given date range.
For example, find the total number of signups on each day from 2015-07-01 to 2015-07-10.
customer table
sample data [relevant columns shown]
customerid username created
1 mrbean 2015-06-01
2 tom 2015-07-01
3 jerry 2015-07-01
4 bond 2015-07-02
5 superman 2015-07-10
6 tintin 2015-08-01
7 batman 2015-08-01
8 joker 2015-08-01
Required Output
created signup
2015-07-01 2
2015-07-02 1
2015-07-03 0
2015-07-04 0
2015-07-05 0
2015-07-06 0
2015-07-07 0
2015-07-08 0
2015-07-09 0
2015-07-10 1
Query used
SELECT
DATE(created) AS created, COUNT(1) AS signup
FROM
customer
WHERE
DATE(created) BETWEEN '2015-07-01' AND '2015-07-10'
GROUP BY DATE(created)
ORDER BY DATE(created)
I am getting the following output:
created signup
2015-07-01 2
2015-07-02 1
2015-07-10 1
What modification should I make in the query to get the required output?
You're looking for a way to get all the days listed, even those days that aren't represented in your customer table. This is a notorious pain in the neck in SQL. That's because in its pure form SQL lacks the concept of a contiguous sequence of anything ... cardinal numbers, days, whatever.
So, you need to introduce a table containing a source of contiguous cardinal numbers, or dates, or something, and then LEFT JOIN your existing data to that table.
There are a few ways of doing that. One is to create yourself a calendar table with a row for every day in the present decade or century or whatever, then join to it. (That table won't be very big compared to the capability of a modern database.)
Let's say you have that table, and it has a column named date. Then you'd do this.
SELECT calendar.date AS created,
COALESCE(a.customer_count, 0) AS customer_count
FROM calendar
LEFT JOIN (
SELECT COUNT(*) AS customer_count,
DATE(created) AS created
FROM customer
GROUP BY DATE(created)
) a ON calendar.date = a.created
WHERE calendar.date BETWEEN '2015-07-01' AND '2015-07-10'
ORDER BY calendar.date
Notice a couple of things. First, the LEFT JOIN from the calendar table to your data set. If you use an ordinary JOIN, the missing data in your data set will suppress the rows from the calendar.
Second, the COALESCE in the top-level SELECT, which turns the missing (NULL) values from your data set into zeros.
Now, you ask, where can I get that calendar table? I respectfully suggest you look that up, and ask another question if you can't figure it out.
I wrote a little essay on this, which you can find here: http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/
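For readers who want to see the calendar-table idea end to end, here is a minimal sketch using an in-memory SQLite database from Python. The daterange helper, the one-year calendar span and the use of SQLite instead of the asker's database are all assumptions for illustration; the customer rows are the sample data from the question.

```python
import sqlite3
from datetime import date, timedelta


def daterange(first, last):
    """Yield every date from first to last, inclusive."""
    day = first
    while day <= last:
        yield day
        day += timedelta(days=1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (date TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE customer (customerid INTEGER, username TEXT, created TEXT)")

# One calendar row per day of 2015; a real table would cover many years.
conn.executemany(
    "INSERT INTO calendar (date) VALUES (?)",
    [(d.isoformat(),) for d in daterange(date(2015, 1, 1), date(2015, 12, 31))],
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1, "mrbean", "2015-06-01"), (2, "tom", "2015-07-01"),
        (3, "jerry", "2015-07-01"), (4, "bond", "2015-07-02"),
        (5, "superman", "2015-07-10"), (6, "tintin", "2015-08-01"),
        (7, "batman", "2015-08-01"), (8, "joker", "2015-08-01"),
    ],
)

signups = conn.execute(
    """
    SELECT calendar.date AS created,
           COALESCE(a.customer_count, 0) AS signup
    FROM calendar
    LEFT JOIN (
        SELECT created, COUNT(*) AS customer_count
        FROM customer
        GROUP BY created
    ) a ON calendar.date = a.created
    WHERE calendar.date BETWEEN ? AND ?
    ORDER BY calendar.date
    """,
    ("2015-07-01", "2015-07-10"),
).fetchall()
for row in signups:
    print(row)
```

The calendar table is filled once and reused by every report, which is the main reason to prefer it over generating the dates inside each query.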
Create a calendar table and join it in your query.
DECLARE @MinDate DATE = '2015-07-01',
@MaxDate DATE = '2015-07-10';
-- One row per day in the range, with signup preset to 0
Create Table tblTempDates
(created date, signup int)
insert into tblTempDates
SELECT TOP (DATEDIFF(DAY, @MinDate, @MaxDate) + 1)
Date = DATEADD(DAY, ROW_NUMBER() OVER(ORDER BY a.object_id) - 1, @MinDate), 0 As Signup
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
-- Actual signup counts for the days that have any
Create Table tblTempQueryDates
(created date, signup int)
INSERT INTO tblTempQueryDates
SELECT
created AS created, COUNT(*) AS signup
FROM
customer
WHERE
created BETWEEN @MinDate AND @MaxDate
GROUP BY created
-- Overwrite the zeros where real counts exist
UPDATE tblTempDates
SET tblTempDates.signup = tblTempQueryDates.signup
FROM tblTempDates INNER JOIN
tblTempQueryDates ON tblTempDates.created = tblTempQueryDates.created
select * from tblTempDates
order by created
Drop Table tblTempDates
Drop Table tblTempQueryDates
Not pretty, but it gives you what you want.
I have found similar queries on StackOverflow (e.g. Finding simultaneous events in a database between times) but nothing that matches exactly what I am after as far as I can tell so thought it OK to add as a new question.
I have a table that logs jobs (or "Activities"), with a start/end time for the job. I need to calculate working time (you can disregard non-working days, break times etc. as I have that covered). The complication is an individual can work on simultaneous jobs, overlapping at different points (the assumption is equal effort on simultaneous jobs), and the working time needs to reflect that. Minute accuracy is all that is required, not to the second.
Based on other suggestions I have this query, implemented as a table-valued function. It looks at each minute an activity is running, checks whether any other activities are running at the same time for the same person, and makes the calculations based on that. It works, but is very inefficient, taking over a minute to execute. Any ideas how I can do this more efficiently?
Running SQL 2005. By the way, I have done the obvious, such as adding indexes on foreign keys.
CREATE FUNCTION [dbo].[WorkActivity_WorkTimeCalculations] (@StartDate smalldatetime, @EndDate smalldatetime)
RETURNS @retActivity TABLE
(
ActivityID bigint PRIMARY KEY NOT NULL,
WorkMins decimal NOT NULL
)
/********************************************************************
Summary: Calculates the WORKING time on each activity running in a given date/time range
Remarks: Takes into account staff working simultaneously on jobs
(evenly distributes working time across simultaneous jobs)
Input Params: @StartDate - the start of the period to calculate
@EndDate - the end of the period to calculate
Output Params:
Returns: Recordset of activities and associated working time (minutes)
********************************************************************/
AS
BEGIN
-- any work activities still running use the overall end date as the activity's end date for the purpose of calculating
-- simultaneous jobs running
-- POPULATE A TEMP TABLE WITH EVERY MINUTE IN THE DATE RANGE
DECLARE @Minutes TABLE (MinuteDateTime smalldatetime NOT NULL)
;WITH cte AS (
SELECT @StartDate AS myDate
UNION ALL
SELECT DATEADD(minute,1,myDate)
FROM cte
WHERE DATEADD(minute,1,myDate) <= @EndDate
)
INSERT INTO @Minutes (MinuteDateTime)
SELECT myDate FROM cte
OPTION (MAXRECURSION 0)
-- POPULATE A TEMP TABLE WITH WORKLOAD PER EMPLOYEE PER MINUTE
DECLARE @JobsRunningByStaff TABLE (StaffID smallint NOT NULL, MinuteDateTime smalldatetime NOT NULL, JobsRunning decimal NOT NULL)
INSERT INTO @JobsRunningByStaff (StaffID, MinuteDateTime, JobsRunning)
SELECT wka_StaffID, MinuteDateTime, COUNT(DISTINCT wka_ItemID) JobsRunning
FROM dbo.WorkActivities
INNER JOIN @Minutes ON (MinuteDateTime BETWEEN wka_StartTime AND DATEADD(minute,-1,ISNULL(wka_EndTime,@EndDate)))
GROUP BY wka_StaffID, MinuteDateTime
-- FINALLY MAKE THE CALCULATIONS FOR EACH ACTIVITY
INSERT INTO @retActivity
SELECT wka_ActivityID, SUM(1/JobsRunning)WorkMins
FROM dbo.WorkActivities
INNER JOIN @JobsRunningByStaff ON (wka_StaffID = StaffID AND MinuteDateTime BETWEEN wka_StartTime AND DATEADD(minute,-1,ISNULL(wka_EndTime,@EndDate)))
GROUP BY wka_ActivityID
RETURN
END
Some example data (sorry for the poor formatting!)...
Source Data from WorkActivities table:
ACTIVITY ID | START TIME | END TIME | STAFF ID
1 | 03/03/2016 10:30 | 03/03/2016 10:50 | 1
2 | 03/03/2016 10:40 | 03/03/2016 11:00 | 1
And the desired results for a function call of SELECT * FROM dbo.WorkActivity_WorkTimeCalculations ('03-Mar-2016 10:30','03-Mar-2016 11:30'):
ACTIVITY ID | WORKMINS
1 | 15
2 | 15
So, the results take into account between 10:40 and 10:50 there are two jobs happening simultaneously, so calculates 5 mins working time on each over that period.
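The equal-split rule can also be computed without expanding every minute, by cutting each person's timeline only at activity start and end times. A minimal sketch of that boundary-based approach in Python; it is a different technique from the function above, work_minutes is a hypothetical name, and it counts overlapping activities rather than distinct wka_ItemID values.

```python
from collections import defaultdict
from datetime import datetime


def work_minutes(activities):
    """activities: iterable of (activity_id, staff_id, start, end) tuples.

    Returns {activity_id: minutes}, sharing each stretch of time equally
    between the activities one person has running during it.
    """
    by_staff = defaultdict(list)
    for activity in activities:
        by_staff[activity[1]].append(activity)

    totals = defaultdict(float)
    for staff_activities in by_staff.values():
        # Every start or end time is a point where the workload can change.
        points = sorted({t for _, _, s, e in staff_activities for t in (s, e)})
        for left, right in zip(points, points[1:]):
            running = [a for a, _, s, e in staff_activities if s <= left and e >= right]
            if not running:
                continue  # idle gap between jobs
            share = (right - left).total_seconds() / 60 / len(running)
            for activity_id in running:
                totals[activity_id] += share
    return dict(totals)


sample = [
    (1, 1, datetime(2016, 3, 3, 10, 30), datetime(2016, 3, 3, 10, 50)),
    (2, 1, datetime(2016, 3, 3, 10, 40), datetime(2016, 3, 3, 11, 0)),
]
minutes = work_minutes(sample)
print(minutes)
```

For the two sample rows as listed, each activity gets 10 minutes on its own plus 5 of the shared 10 minutes, so 15 minutes each. The amount of work grows with the number of activities per person rather than with the number of minutes in the date range.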
As suggested by posters, indexing made a significant difference - creating an index with wka_StartTime and wka_EndTime sorted it.
(sorry, couldn't see how to mark the comments made by others as an answer!)
So this is a fairly common question on here, but I haven't found an answer that really suits my specific needs. I have 2 tables. One has a list of ProjectClosedDates. The other is a calendar table that runs through roughly 2025, with a column flagging whether the row date is a weekend day and another flagging whether it is a holiday.
My end goal is to find out based on the ProjectClosedDate, what date is 5 business days post that date. My idea was that I was going to use the Calendar table and join it to itself so I could then insert a column into the calendar table that was 5 Business days away from the row-date. Then I was going to join the Project table to that table based on ProjectClosedDate = RowDate.
If I was just going to check the actual business-date table for one record, I could use this:
SELECT actual_date FROM
(
SELECT actual_date, ROW_NUMBER() OVER(ORDER BY actual_date) AS Row
FROM DateTable
WHERE is_holiday = 0 AND actual_date > '2013-12-01'
) X
WHERE Row = 65
from here:
sql working days holidays
However, this is just one date and I need a column of dates based off of each row. Any thoughts of what the best way to do this would be? I'm using SQL-Server Management Studio.
Completely untested and not thought through:
If the concept of "business days" is common and important in your system, you could add a column "Business Day Sequence" to your table. The column would be a simple unique sequence, incremented by one for every business day and null for every day not counting as a business day.
The data would look something like this:
Date BDAY_SEQ
========== ========
2014-03-03 1
2014-03-04 2
2014-03-05 3
2014-03-06 4
2014-03-07 5
2014-03-08
2014-03-09
2014-03-10 6
Now it's a simple task to find the Nth business day from any date.
You simply do a self join on the calendar table, adding the offset in the join condition.
select a.actual_date
,b.actual_date as nth_business_day
from DateTable a
join DateTable b on(
b.bday_seq = a.bday_seq + 5
);
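Since the answer is flagged as untested, here is a minimal sketch that tries the idea against an in-memory SQLite database from Python. The three-week date range, the empty holidays set and the is_weekend column name are assumptions for illustration; DateTable, actual_date, is_holiday and bday_seq follow the names used above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DateTable (actual_date TEXT PRIMARY KEY, is_weekend INTEGER, "
    "is_holiday INTEGER, bday_seq INTEGER)"
)

holidays = set()  # hypothetical: add date(...) values to mark holidays
rows, seq, day = [], 0, date(2014, 3, 3)
while day <= date(2014, 3, 21):
    is_weekend = day.weekday() >= 5
    is_holiday = day in holidays
    if is_weekend or is_holiday:
        bday_seq = None          # non-business days get no sequence number
    else:
        seq += 1
        bday_seq = seq
    rows.append((day.isoformat(), int(is_weekend), int(is_holiday), bday_seq))
    day += timedelta(days=1)
conn.executemany("INSERT INTO DateTable VALUES (?, ?, ?, ?)", rows)

pairs = conn.execute(
    """
    SELECT a.actual_date, b.actual_date AS nth_business_day
    FROM DateTable a
    JOIN DateTable b ON b.bday_seq = a.bday_seq + 5
    ORDER BY a.actual_date
    """
).fetchall()
for pair in pairs:
    print(pair)
```

Rows whose bday_seq is NULL never satisfy the join condition, so a ProjectClosedDate that falls on a weekend or holiday produces no result with this query and would need separate handling.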