How to apply an Excel operation in a SQL Server query?

I am currently using SSMS 2008.
I would like to reproduce in SSMS the operation shown in the Excel screenshot.
I have two tables joined, one with a positive count for when employees start working and one with a negative count for when employees leave. I am looking for a column showing the count of employees per hour.
I appreciate any help on this matter,
Thank you,

This is a running total, which can be implemented with a windowed SUM (SUM ... OVER with ORDER BY requires SQL Server 2012 or later):
SELECT *, SUM(Employee) OVER(ORDER BY [Date], [Time]) as Total_available
FROM tab
ORDER BY [Date], [Time];

An alternative method to SUM OVER is a self-join, with an aggregation on the lower or equal values.
Sample data:
CREATE TABLE TestEmployeeRegistration (
[Date] DATE,
[Time] TIME,
[Employees] INT NOT NULL DEFAULT 0,
PRIMARY KEY ([Date], [Time])
);
INSERT INTO TestEmployeeRegistration
([Date], [Time], [Employees]) VALUES
('2019-11-01', '08:00', 2),
('2019-11-01', '09:00', 5),
('2019-11-01', '10:00', 3),
('2019-11-01', '12:00',-5),
('2019-11-01', '13:00', 2),
('2019-11-01', '14:00',-5);
Query:
SELECT t.[Date], t.[Time], t.[Employees]
, SUM(t2.[Employees]) AS [Total available]
FROM [TestEmployeeRegistration] t
JOIN [TestEmployeeRegistration] t2
ON t2.[Date] = t.[Date]
AND t2.[Time] <= t.[Time]
GROUP BY t.[Date], t.[Time], t.[Employees]
ORDER BY t.[Date], t.[Time];
When using SUM as a window function, I advise partitioning by [Date], so that the total restarts each day.
SELECT *
, SUM([Employees]) OVER (PARTITION BY [Date] ORDER BY [Time]) AS [Total available]
FROM [TestEmployeeRegistration]
ORDER BY [Date], [Time];
A test on rextester here
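The self-join approach can be checked against a plain cumulative sum on the sample data. The following is only a sketch: it assumes Python with its built-in sqlite3 module as a stand-in for SQL Server, and the table name reg and the columns d, t, employees are hypothetical names chosen for the illustration.

```python
import sqlite3
from itertools import accumulate

# Sample rows from the answer above: (date, time, change in employee count).
rows = [
    ("2019-11-01", "08:00", 2),
    ("2019-11-01", "09:00", 5),
    ("2019-11-01", "10:00", 3),
    ("2019-11-01", "12:00", -5),
    ("2019-11-01", "13:00", 2),
    ("2019-11-01", "14:00", -5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reg (d TEXT, t TEXT, employees INTEGER, PRIMARY KEY (d, t))"
)
conn.executemany("INSERT INTO reg VALUES (?, ?, ?)", rows)

# Self-join running total: each row sums all rows of the same day
# whose time is less than or equal to its own time.
running_sql = """
    SELECT a.d, a.t, a.employees, SUM(b.employees) AS total_available
    FROM reg AS a
    JOIN reg AS b ON b.d = a.d AND b.t <= a.t
    GROUP BY a.d, a.t, a.employees
    ORDER BY a.d, a.t
"""
totals_sql = [r[3] for r in conn.execute(running_sql)]

# The same running total computed directly in Python as a cross-check
# (valid here because all sample rows share one date).
totals_py = list(accumulate(e for _, _, e in rows))

print(totals_sql)
print(totals_sql == totals_py)
```

Because the sample contains a single date, the per-day and overall running totals coincide; with several dates the self-join above restarts each day, like the PARTITION BY version.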

SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE MyTable (Dates Date,Times Time, EmployeesAvailable int)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','08:00',2)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','09:00',5)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','10:00',3)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','12:00',-5)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','13:00',2)
INSERT INTO MyTable (Dates,Times,EmployeesAvailable) VALUES('2019-11-01','14:00',-5)
Query 1:
SELECT Dates,Times,EmployeesAvailable,
SUM(EmployeesAvailable) OVER(ORDER BY Dates,Times) AS 'Total Available'
FROM MyTable
Results:
| Dates | Times | EmployeesAvailable | Total Available |
|------------|------------------|--------------------|-----------------|
| 2019-11-01 | 08:00:00.0000000 | 2 | 2 |
| 2019-11-01 | 09:00:00.0000000 | 5 | 7 |
| 2019-11-01 | 10:00:00.0000000 | 3 | 10 |
| 2019-11-01 | 12:00:00.0000000 | -5 | 5 |
| 2019-11-01 | 13:00:00.0000000 | 2 | 7 |
| 2019-11-01 | 14:00:00.0000000 | -5 | 2 |

Related

Fill in missing timestamp values in SQL

SQL newbie here looking for a bit of help in writing a query.
Some sample data
Time Value
9:00 1.2
9:01 2.3
9:05 2.4
9:06 2.5
I need to fill in those missing times with zero - so the query would return
Time Value
9:00 1.2
9:01 2.3
9:02 0
9:03 0
9:04 0
9:05 2.4
9:06 2.5
Is this possible in T-SQL?
Thanks for any help / advice ...
One method uses a recursive CTE to generate the list of times and then uses a left join to bring in the values:
with cte as (
select min(s.time) as time, max(s.time) as maxt
from sample s
union all
select dateadd(minute, 1, cte.time), cte.maxt
from cte
where cte.time < cte.maxt
)
select cte.time, coalesce(s.value, 0)
from cte left join
sample s
on cte.time = s.time
order by cte.time;
Note that if you have more than 100 minutes, you will need option (maxrecursion 0) at the end of the query.
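To see the gap-filling idea run end to end, here is a small sketch that assumes Python's built-in sqlite3 module in place of SQL Server. The SQLite spelling differs from T-SQL (WITH RECURSIVE and time(t, '+1 minute') instead of DATEADD), and the names minutes, t and max_t are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (t TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("09:00:00", 1.2), ("09:01:00", 2.3), ("09:05:00", 2.4), ("09:06:00", 2.5)],
)

# The recursive CTE starts at the earliest time and adds one minute per step
# until it reaches the latest time; the LEFT JOIN then brings in the values
# that exist, and COALESCE turns the missing ones into 0.
fill_sql = """
    WITH RECURSIVE minutes(t, max_t) AS (
        SELECT MIN(t), MAX(t) FROM sample
        UNION ALL
        SELECT time(t, '+1 minute'), max_t FROM minutes WHERE t < max_t
    )
    SELECT m.t, COALESCE(s.value, 0)
    FROM minutes AS m
    LEFT JOIN sample AS s ON s.t = m.t
    ORDER BY m.t
"""
filled = conn.execute(fill_sql).fetchall()

for t, value in filled:
    print(t, value)
```

SQLite has no equivalent of the 100-level default recursion limit, so the maxrecursion hint applies only to the SQL Server version.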
You can use a recursive CTE to build a calendar table and OUTER JOIN against it.
CREATE TABLE T(
[Time] Time,
Value float
);
insert into T values ('9:00',1.2);
insert into T values ('9:01',2.3);
insert into T values ('9:05',2.4);
insert into T values ('9:06',2.5);
Query 1:
with cte as (
SELECT MIN([Time]) minDt,MAX([Time] ) maxDt
FROM T
UNION ALL
SELECT dateadd(minute, 1, minDt) ,maxDt
FROM CTE
WHERE dateadd(minute, 1, minDt) <= maxDt
)
SELECT t1.minDt 'Time',
ISNULL(t2.[Value],0) 'Value'
FROM CTE t1
LEFT JOIN T t2 on t2.[Time] = t1.minDt
Results:
| Time | Value |
|------------------|-------|
| 09:00:00.0000000 | 1.2 |
| 09:01:00.0000000 | 2.3 |
| 09:02:00.0000000 | 0 |
| 09:03:00.0000000 | 0 |
| 09:04:00.0000000 | 0 |
| 09:05:00.0000000 | 2.4 |
| 09:06:00.0000000 | 2.5 |

Field correlation with date constraint

We're stuck on a big challenge here. Let's assume the tables of one database were not properly planned in the first place. That's how it is, and I need a solution for it.
There's a table A with 2 fields. Suppose I have an assistant who supports my work every day, but I only recorded when he/she started assisting me. That means the 'Stop Date' (which does not exist in the table) of each assistant is the day before the Start Date of the next one.
Assistant | Start Date
James | 07/01/17
Frank | 01/03/18
Erika | 01/06/18
There's a second table B that records how many hours my assistant worked:
Date | Worked Hours
12/31/17 | 7.5
01/01/18 | 7.5
01/02/18 | 9
01/03/18 | 8
01/04/18 | 9
01/05/18 | 7.5
01/06/18 | 9
01/07/18 | 10
Given the information above, I need to write a SQL to return a table like below, considering the Start Dates of each person:
Assistant | Date | Worked Hours
Basically I need to correlate the Date and Start Date somehow to return the Assistant, but it involves date comparisons that I have no idea how to do.
Any ideas how to solve this?
You can use a correlated subquery:
select b.*,
(select a.assistant
from a
where a.date <= b.date
order by a.date desc
fetch first 1 row only
) as assistant
from b;
Not all databases support the ANSI standard fetch first 1 row only, so you may need to use limit or top or whatever is appropriate for your database.
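As an illustration of the correlated subquery on the question's data, here is a sketch that assumes Python's built-in sqlite3 module, where LIMIT 1 takes the place of fetch first 1 row only. The column names start_date, work_date and worked_hours are hypothetical, and the dates are stored in ISO format so that text comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (assistant TEXT, start_date TEXT)")
conn.execute("CREATE TABLE b (work_date TEXT, worked_hours REAL)")
conn.executemany(
    "INSERT INTO a VALUES (?, ?)",
    [("James", "2017-07-01"), ("Frank", "2018-01-03"), ("Erika", "2018-01-06")],
)
conn.executemany(
    "INSERT INTO b VALUES (?, ?)",
    [
        ("2017-12-31", 7.5), ("2018-01-01", 7.5), ("2018-01-02", 9.0),
        ("2018-01-03", 8.0), ("2018-01-04", 9.0), ("2018-01-05", 7.5),
        ("2018-01-06", 9.0), ("2018-01-07", 10.0),
    ],
)

# For each work date, pick the assistant with the latest start date
# that is not after that work date.
lookup_sql = """
    SELECT (SELECT a.assistant
            FROM a
            WHERE a.start_date <= b.work_date
            ORDER BY a.start_date DESC
            LIMIT 1) AS assistant,
           b.work_date,
           b.worked_hours
    FROM b
    ORDER BY b.work_date
"""
assigned = conn.execute(lookup_sql).fetchall()

for row in assigned:
    print(row)
```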
You can try this.
DECLARE @TableA TABLE (Assistant VARCHAR(10), [Start Date] DATE)
INSERT INTO @TableA VALUES
('James','07/01/17'),
('Frank','01/03/18'),
('Erika','01/06/18')
DECLARE @TableB TABLE ([Date] DATE, [Worked Hours] DECIMAL(18,2))
INSERT INTO @TableB VALUES
('12/31/17', 7.5),
('01/01/18', 7.5),
('01/02/18', 9 ),
('01/03/18', 8 ),
('01/04/18', 9 ),
('01/05/18', 7.5),
('01/06/18', 9 ),
('01/07/18', 10 )
-- For each date, number the matching assistants by start date, latest first
;WITH CTE AS (
SELECT *, RN = ROW_NUMBER() OVER( PARTITION BY [Date] ORDER BY [Start Date] DESC)
FROM
@TableA A
INNER JOIN @TableB B ON A.[Start Date] <= B.Date
)
select Assistant, Date, [Worked Hours] FROM CTE WHERE RN = 1
Result:
Assistant Date Worked Hours
---------- ---------- ---------------------------------------
James 2017-12-31 7.50
James 2018-01-01 7.50
James 2018-01-02 9.00
Frank 2018-01-03 8.00
Frank 2018-01-04 9.00
Frank 2018-01-05 7.50
Erika 2018-01-06 9.00
Erika 2018-01-07 10.00
Use lead() to find the next record's start date, and use 'infinity' to keep the final interval open (PostgreSQL):
CREATE TABLE a
( assistant text primary key
, startdate date
);
SET datestyle = 'mdy';
insert into a(assistant,startdate) VALUES
('James', '07/01/17' )
,('Frank', '01/03/18' )
,('Erika', '01/06/18' )
;
CREATE TABLE b
( ddate DATE NOT NULL primary key
, workedhours DECIMAL(4,1)
);
insert into b(ddate,workedhours) VALUES
('12/31/17', 7.5)
,('01/01/18', 7.5)
,('01/02/18', 9)
,('01/03/18', 8)
,('01/04/18', 9)
,('01/05/18', 7.5)
,('01/06/18', 9)
,('01/07/18', 10)
;
WITH aa AS (
SELECT a.assistant
, a.startdate
, lead(a.startdate, 1, 'infinity'::date) OVER (ORDER BY a.startdate)
AS enddate
FROM a
)
-- SELECT * FROM a ;
SELECT aa.startdate, aa.enddate, aa.assistant
, SUM(b.workedhours) AS workedhours
FROM aa
LEFT JOIN b ON b.ddate >= aa.startdate
AND b.ddate < aa.enddate
GROUP BY 1,2,3
;
Output:
CREATE TABLE
SET
INSERT 0 3
CREATE TABLE
INSERT 0 8
startdate | enddate | assistant | workedhours
------------+------------+-----------+-------------
2017-07-01 | 2018-01-03 | James | 24.0
2018-01-03 | 2018-01-06 | Frank | 24.5
2018-01-06 | infinity | Erika | 19.0
(3 rows)
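The interval logic behind lead() can be sketched without a database. This is only an illustration in Python, which is an assumption and not part of the answer: each start date is paired with the next one, and date.max plays the role of 'infinity' for the last assistant.

```python
from datetime import date

starts = [
    ("James", date(2017, 7, 1)),
    ("Frank", date(2018, 1, 3)),
    ("Erika", date(2018, 1, 6)),
]
hours = [
    (date(2017, 12, 31), 7.5), (date(2018, 1, 1), 7.5), (date(2018, 1, 2), 9.0),
    (date(2018, 1, 3), 8.0), (date(2018, 1, 4), 9.0), (date(2018, 1, 5), 7.5),
    (date(2018, 1, 6), 9.0), (date(2018, 1, 7), 10.0),
]

# Order by start date, then take each following start date as the end
# of the current interval (what lead() does in the query above).
ordered = sorted(starts, key=lambda s: s[1])
ends = [start for _, start in ordered[1:]] + [date.max]

# Sum the hours that fall in the half-open interval [start, end).
totals = {}
for (name, start), end in zip(ordered, ends):
    totals[name] = sum(h for d, h in hours if start <= d < end)

print(totals)
```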

Average of Days between ordered dates per group

+-------+-------+-----------+
| EmpID | PerID | VisitDate |
+-------+-------+-----------+
| 1 | 22 | 2/24/2017 |
| 1 | 22 | 3/25/2017 |
| 1 | 22 | 4/5/2017 |
| 2 | 33 | 5/6/2017 |
| 2 | 33 | 8/9/2017 |
| 2 | 33 | 6/7/2017 |
+-------+-------+-----------+
I am trying to find the latest visit date and average days between visits per EmpID. For Avg, I'll first have to order the days consecutively and then find the average.
E.g., the average for EmpID=1 and PerID=22 would be [29 (days between 2/24 and 3/25) + 11 (days between 3/25 and 4/5)] / 2 = 20 days.
Desired Output:
+-------+-------+----------+----------+
| EmpID | PerID | MaxVDate | AvgVDays |
+-------+-------+----------+----------+
| 1 | 22 | 4/5/2017 | 20 |
| 2 | 33 | 8/9/2017 | 47.5 |
+-------+-------+----------+----------+
Attempt:
SELECT
EmpID
,PerID
,MAX(VisitDate) AS MaxVDate
,--Dunno how to find average AS AvgVDays
FROM
T1
GROUP BY
EmpID
,PerID
You can use lag to get the previous date and compute the date difference. Then use the avg window function to get the average days. Multiplying by 1.0 keeps AVG from truncating the result to an integer.
Select distinct empid,perid,maxVdate,avg(diff_with_prev * 1.0) OVER(Partition by empid) as avgVDays
from (
SELECT EmpID,PerID
,MAX(VisitDate) OVER(Partition BY EmpID) AS MaxVDate
,DATEDIFF(DAY,LAG(VisitDate) OVER(Partition BY EmpID order by VisitDate), VisitDate) as diff_with_prev
FROM T1
) t
Here's an option...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
EmpID INT NOT NULL,
PerID INT NOT NULL,
VisitDate DATE NOT NULL
);
INSERT #TestData (EmpID, PerID, VisitDate) VALUES
(1, 22, '2/24/2017'),
(1, 22, '3/25/2017'),
(1, 22, '4/5/2017'),
(2, 33, '5/6/2017'),
(2, 33, '8/9/2017'),
(2, 33, '6/7/2017');
-- SELECT * FROM #TestData td;
SELECT
db.EmpID,
db.PerID,
AvgDays = AVG(db.DaysBetween * 1.0)
FROM (
SELECT
*,
DaysBetween = DATEDIFF(dd, LAG(td.VisitDate, 1) OVER (PARTITION BY td.EmpID, td.PerID ORDER BY td.VisitDate), td.VisitDate)
FROM
#TestData td
) db
GROUP BY
db.EmpID,
db.PerID;
Results...
EmpID PerID AvgDays
----------- ----------- ---------------------------------------
1 22 20.000000
2 33 47.500000
The task is much easier than you think. You get the average with (last visit - first visit) / (count visits - 1).
select
empid,
perid,
max(VisitDate) as MaxVDate,
datediff(day, min(VisitDate), max(VisitDate)) * 1.0 / (count(*) - 1) as avgvdays
from mytable
group by empid, perid
having count(*) > 1
order by empid, perid;
The multiplication by 1.0 is necessary in order to avoid integer division. (You could also cast to decimal instead.)
As the calculation only makes sense for empid/perid pairs with more than one entry (and in order to avoid division by zero), I have added a corresponding HAVING clause.
Here is a test: http://rextester.com/AIFPA62612
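The shortcut works because the gaps between consecutive visits add up to the span from the first to the last visit, so their average is that span divided by the number of gaps. A small Python sketch (Python and the function names are assumptions for illustration only) compares both calculations on the question's data:

```python
from datetime import date

visits = {
    (1, 22): [date(2017, 2, 24), date(2017, 3, 25), date(2017, 4, 5)],
    (2, 33): [date(2017, 5, 6), date(2017, 8, 9), date(2017, 6, 7)],
}


def avg_gap_pairwise(dates):
    """Average of the day gaps between consecutive visits (the LAG approach)."""
    ordered = sorted(dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def avg_gap_shortcut(dates):
    """(last visit - first visit) / (number of visits - 1)."""
    return (max(dates) - min(dates)).days / (len(dates) - 1)


for key, dates in visits.items():
    print(key, avg_gap_pairwise(dates), avg_gap_shortcut(dates))
```

Both functions need at least two visits, which is the same condition the HAVING clause enforces.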

SQL Find First Occurrence

I've been at this for about an hour now and am making little to no progress - thought I'd come here for some help/advice.
So, given a sample of my table:
+-----------+-----------------------------+--------------+
| MachineID | DateTime | AlertType |
+-----------+-----------------------------+--------------+
| 56 | 2015-10-05 00:00:23.0000000 | 2000 |
| 42 | 2015-10-05 00:01:26.0000000 | 1006 |
| 50 | 2015-10-05 00:08:33.0000000 | 1018 |
| 56 | 2015-10-05 00:08:48.0000000 | 2003 |
| 56 | 2015-10-05 00:10:15.0000000 | 2000 |
| 67 | 2015-10-05 00:11:59.0000000 | 3001 |
| 60 | 2015-10-05 00:13:02.0000000 | 1006 |
| 67 | 2015-10-05 00:13:08.0000000 | 3000 |
| 56 | 2015-10-05 00:13:09.0000000 | 2003 |
| 67 | 2015-10-05 00:14:50.0000000 | 1018 |
| 67 | 2015-10-05 00:15:00.0000000 | 1018 |
| 47 | 2015-10-05 00:16:55.0000000 | 1006 |
+-----------+-----------------------------+--------------+
How would I get the first occurrence of MachineID w/ an AlertType of 2000
and the last occurrence of the same MachineID w/ an AlertType of 2003?
Here is what I have tried - but it is not outputting what I expect.
SELECT *
FROM [Alerts] a
where
DateTime >= '2015-10-05 00:00:00'
AND DateTime <= '2015-10-06 00:00:00'
and not exists(
select b.MachineID
from [Alerts] b
where b.AlertType=a.AlertType and
b.MachineID<a.MachineID
)
order by a.DateTime ASC
EDIT: The above code doesn't get me what I want because I am not specifically telling it to search for AlertType = 2000 or AlertType = 2003, but even when I try that, I am still unable to gather my desired results.
Here is what I would like my output to display:
+-----------+-----------------------------+--------------+
| MachineID | DateTime | AlertType |
+-----------+-----------------------------+--------------+
| 56 | 2015-10-05 00:00:23.0000000 | 2000 |
| 56 | 2015-10-05 00:13:09.0000000 | 2003 |
+-----------+-----------------------------+--------------+
Any help with this would be greatly appreciated!
Not sure, but:
select * from [Table]
WHERE [DateTime] IN (
SELECT MIN([DateTime]) as [DateTime]
FROM [Table]
WHERE AlertType = 2000
GROUP BY MachineId
UNION ALL
SELECT MAX([DateTime]) as [DateTime]
FROM [Table]
WHERE AlertType = 2003
GROUP BY MachineId)
ORDER BY MachineId, AlertType
It looks like your outer section takes all records between 2015-10-05 to 2015-10-06, which includes all the records sorted by date. The inner portion only happens when no records fit the outer date range.
Looks like GSazheniuk has it right, but I am not sure if you just want the 2 records or everything that matches the MachineID and the two alerts?
Not sure what your attempt has to do with your question, but to answer this:
How would I get the first occurrence of MachineID w/ an AlertType of
2000 and the last occurrence of the same MachineID w/ an AlertType of
2003.
Simple:
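-- Each TOP 1 ... ORDER BY is wrapped in its own derived table,
-- because ORDER BY is not allowed directly on a branch of a UNION.
SELECT * FROM (
SELECT TOP 1 * FROM Alerts WHERE AlertType='2000' ORDER BY Datetime ASC
) first_alert
UNION ALL
SELECT * FROM (
SELECT TOP 1 * FROM Alerts WHERE AlertType='2003' ORDER BY Datetime DESC
) last_alert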
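A per-machine variant of the same idea can be tried on the question's sample rows. This sketch assumes Python's built-in sqlite3 module rather than SQL Server, and the table and column names (alerts, machine_id, dt, alert_type) are hypothetical. It takes the earliest 2000 alert and the latest 2003 alert for each machine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (machine_id INTEGER, dt TEXT, alert_type INTEGER)")
conn.executemany(
    "INSERT INTO alerts VALUES (?, ?, ?)",
    [
        (56, "2015-10-05 00:00:23", 2000), (42, "2015-10-05 00:01:26", 1006),
        (50, "2015-10-05 00:08:33", 1018), (56, "2015-10-05 00:08:48", 2003),
        (56, "2015-10-05 00:10:15", 2000), (67, "2015-10-05 00:11:59", 3001),
        (60, "2015-10-05 00:13:02", 1006), (67, "2015-10-05 00:13:08", 3000),
        (56, "2015-10-05 00:13:09", 2003), (67, "2015-10-05 00:14:50", 1018),
        (67, "2015-10-05 00:15:00", 1018), (47, "2015-10-05 00:16:55", 1006),
    ],
)

# Earliest 2000 alert and latest 2003 alert, grouped per machine.
first_last_sql = """
    SELECT machine_id, MIN(dt), 2000 FROM alerts
    WHERE alert_type = 2000 GROUP BY machine_id
    UNION ALL
    SELECT machine_id, MAX(dt), 2003 FROM alerts
    WHERE alert_type = 2003 GROUP BY machine_id
    ORDER BY 1, 2
"""
first_last = conn.execute(first_last_sql).fetchall()

for row in first_last:
    print(row)
```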
I think everyone misses that your alert type is NOT a deciding factor, but a supplemental.
This should give you what you are looking for. I walked through the whole process.
IF OBJECT_ID('tempdb..#alerts') IS NOT NULL DROP table #alerts
CREATE TABLE #alerts
(
MachineID int,
dte DATETIME,
alerttype int
)
INSERT INTO #alerts VALUES ('56','20151005 00:00:23','2000')
INSERT INTO #alerts VALUES ('42','20151005 00:01:26','1006')
INSERT INTO #alerts VALUES ('50','20151005 00:08:33','1018')
INSERT INTO #alerts VALUES ('56','20151005 00:08:48','2003')
INSERT INTO #alerts VALUES ('56','20151005 00:10:15','2000')
INSERT INTO #alerts VALUES ('67','20151005 00:11:59','3001')
INSERT INTO #alerts VALUES ('60','20151005 00:13:02','1006')
INSERT INTO #alerts VALUES ('67','20151005 00:13:08','3000')
INSERT INTO #alerts VALUES ('56','20151005 00:13:09','2003')
INSERT INTO #alerts VALUES ('67','20151005 00:14:50','1018')
INSERT INTO #alerts VALUES ('67','20151005 00:15:00','1018')
INSERT INTO #alerts VALUES ('47','20151005 00:16:55','1006')
GO
WITH rnk as ( --identifies the order of the records.
Select
MachineID,
dte = dte,
rnk = RANK() OVER (partition BY machineid ORDER BY dte DESC) --ranks each machine's rows by date (latest first)
FROM #alerts
),
agg as( --Pulls the lowest and highest rank per machine, i.e. its latest and earliest record
SELECT
MachineID,
frst = MIN(rnk),
lst = MAX(rnk)
FROM rnk
GROUP BY MachineID
)
SELECT
pop.MachineID,
pop.dte,
pop.alerttype
FROM #alerts pop
JOIN rnk r ON pop.MachineID = r.MachineID AND pop.dte = r.dte --the date join allows you to hook into your ranks
JOIN agg ON pop.MachineID = agg.MachineID
WHERE agg.frst = r.rnk OR agg.lst = r.rnk -- or clause can be replaced by two queries with a union all
ORDER BY 1,2 --viewability... machineID, date
I personally use CROSS APPLY to perform tasks like this, but CTEs are much more visually friendly for this exercise.

Get more records that appear more than once

How can I see all the records that appear more than once per day?
I have this table:
ID Name Date
1 John 27.03.2010 18:17:00
2 Mike 27.03.2010 16:38:00
3 Sonny 28.03.2010 20:23:00
4 Anna 29.03.2010 13:51:00
5 Maria 29.03.2010 21:59:00
6 Penny 29.03.2010 17:25:00
7 Alba 30.03.2010 09:36:00
8 Huston 31.03.2010 10:19:00
I wanna get:
1 John 27.03.2010 18:17:00
2 Mike 27.03.2010 16:38:00
4 Anna 29.03.2010 13:51:00
5 Maria 29.03.2010 21:59:00
6 Penny 29.03.2010 17:25:00
This should work assuming you are using MySQL.
SELECT *
FROM `thetable`
GROUP BY DATE(`Date`)
HAVING COUNT(*) > 1
I have not tested this, however; it's just the first thing I can think of.
The DATE function takes only the date part of a DATETIME (which I assumed you were using, since a time component is shown too). I have also enclosed the Date field name in backticks, since DATE is a keyword in MySQL (and did the same with the table name for consistency).
Bear in mind that different RDBMSs will probably have different functions for achieving this.
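I'm calling the table mytable, and changed the name of Date to somedate to avoid using a keyword:
--create table mytable(ID int,Name varchar(32), somedate datetime)
-- Group by the date part only; grouping by id as well would never count more than 1
select *
from mytable
where convert(varchar(10), somedate, 101) in (
select convert(varchar(10), somedate, 101)
from mytable
group by convert(varchar(10), somedate, 101)
having count(1) > 1
)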
UPDATE: Using SQL Server:
CREATE TABLE t1 (id int IDENTITY PRIMARY KEY, name varchar(20), date datetime);
INSERT INTO t1 VALUES('John', '2010-03-27 18:17:00');
INSERT INTO t1 VALUES('Mike', '2010-03-27 16:38:00');
INSERT INTO t1 VALUES('Sonny', '2010-03-28 20:23:00');
INSERT INTO t1 VALUES('Anna', '2010-03-29 13:51:00');
INSERT INTO t1 VALUES('Maria', '2010-03-29 21:59:00');
INSERT INTO t1 VALUES('Penny', '2010-03-29 17:25:00');
INSERT INTO t1 VALUES('Alba', '2010-03-30 09:36:00');
INSERT INTO t1 VALUES('Huston', '2010-03-31 10:19:00');
SELECT t1.id, t1.name, sub_t.date
FROM t1
JOIN (SELECT DATEADD(dd, DATEDIFF(dd,0, date), 0) as date
FROM t1
GROUP BY DATEADD(dd, DATEDIFF(dd,0, date), 0)
HAVING COUNT(id) > 1) sub_t ON
(sub_t.date = DATEADD(dd, DATEDIFF(dd,0, t1.date), 0));
Returns:
+----+-------+---------------------+
| id | name | date |
+----+-------+---------------------+
| 1 | John | 2010-03-27 00:00:00 |
| 2 | Mike | 2010-03-27 00:00:00 |
| 4 | Anna | 2010-03-29 00:00:00 |
| 5 | Maria | 2010-03-29 00:00:00 |
| 6 | Penny | 2010-03-29 00:00:00 |
+----+-------+---------------------+
Previous answer assumed MySQL:
Joining with a sub query would be one option:
CREATE TABLE t1 (id int AUTO_INCREMENT PRIMARY KEY,
name varchar(20),
date datetime);
INSERT INTO t1 VALUES(NULL, 'John', '2010-03-27 18:17:00');
INSERT INTO t1 VALUES(NULL, 'Mike', '2010-03-27 16:38:00');
INSERT INTO t1 VALUES(NULL, 'Sonny', '2010-03-28 20:23:00');
INSERT INTO t1 VALUES(NULL, 'Anna', '2010-03-29 13:51:00');
INSERT INTO t1 VALUES(NULL, 'Maria', '2010-03-29 21:59:00');
INSERT INTO t1 VALUES(NULL, 'Penny', '2010-03-29 17:25:00');
INSERT INTO t1 VALUES(NULL, 'Alba', '2010-03-30 09:36:00');
INSERT INTO t1 VALUES(NULL, 'Huston', '2010-03-31 10:19:00');
SELECT t1.id, t1.name, sub_t.date
FROM t1
JOIN (SELECT DATE(date) as date
FROM t1
GROUP BY DATE(date)
HAVING COUNT(id) > 1) sub_t ON (sub_t.date = DATE(t1.date));
Returns:
+----+-------+------------+
| id | name | date |
+----+-------+------------+
| 1 | John | 2010-03-27 |
| 2 | Mike | 2010-03-27 |
| 4 | Anna | 2010-03-29 |
| 5 | Maria | 2010-03-29 |
| 6 | Penny | 2010-03-29 |
+----+-------+------------+
5 rows in set (0.02 sec)
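The same "days with more than one record" filter can be tried quickly outside MySQL or SQL Server. The sketch below assumes Python's built-in sqlite3 module, where date() strips the time part; the column name created is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO t1 (name, created) VALUES (?, ?)",
    [
        ("John", "2010-03-27 18:17:00"), ("Mike", "2010-03-27 16:38:00"),
        ("Sonny", "2010-03-28 20:23:00"), ("Anna", "2010-03-29 13:51:00"),
        ("Maria", "2010-03-29 21:59:00"), ("Penny", "2010-03-29 17:25:00"),
        ("Alba", "2010-03-30 09:36:00"), ("Huston", "2010-03-31 10:19:00"),
    ],
)

# Keep every row whose calendar day occurs more than once in the table.
dup_days_sql = """
    SELECT id, name, created
    FROM t1
    WHERE date(created) IN (
        SELECT date(created)
        FROM t1
        GROUP BY date(created)
        HAVING COUNT(*) > 1
    )
    ORDER BY id
"""
dup_rows = conn.execute(dup_days_sql).fetchall()

for row in dup_rows:
    print(row)
```

Unlike the join versions above, this form returns the original timestamps rather than the truncated dates.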
You mean, "how can I select all records for days which have more than one record?"
select *
from your_table
where trunc(date) in ( select trunc(date)
from your_table
group by trunc(date)
having count(*) > 1)
/
edit
Oh, you're on SQL Server. I used Oracle's TRUNC() function, which takes a datetime and strips the time element. Apparently SQL Server doesn't have an exact equivalent, but there are some workarounds.
You could do a join on the date part of the date variable, which will rule out any where there is only one row for that date (because it has nothing to join to).
This example is in t-sql:
select distinct l.*
from #table l
join
#table r
on convert(varchar, l.[date], 102) = convert(varchar, r.[date],102)
and l.id != r.id
Select *
From YourTable
Where ADate in (Select ADate
From YourTable
Group By ADate
Having Count(Distinct Id) > 1
)
Here's the final version:
SELECT *
FROM table
WHERE convert(varchar, table.DataInput, 102) IN
(SELECT convert(varchar, table.DataInput, 102)
FROM table
GROUP BY convert(varchar, table.DataInput, 102)
HAVING COUNT(*) > 2)
Thank you all for your input! :)
You can do it as follows:
SELECT *
FROM theTable t1
WHERE t1.created_at IN
(SELECT t2.created_at
FROM theTable t2
GROUP BY created_at
HAVING COUNT(*) > 1)