Using XML PATH to manipulate row data when grouping - sql

In a SQL query I'm trying to do two things.
I have a table Attendance like this
TABLE
EID | PID | In_Time | Out_Time | Shift
__________________________________________________________
100 | S001 | 2014-05-01 07:10 | 2014-05-01 19:20 | D
100 | S001 | 2014-05-04 07:00 | 2014-05-04 19:00 | D
100 | S001 | 2014-05-04 19:00 | 2014-05-05 07:00 | N
EID - EmployeeID
PID - PointID (Location)
D - Day Shift
N - Night Shift
When I group by all fields except Shift (taking only the date part of In_Time when grouping), I want to get this:
INTERMEDIATE STEP
EID | DAY | Shift |
___________________
100 | 01 | D |
100 | 04 | D/N |
Finally I want to PIVOT this to get the following result:
EXPECTED FINAL RESULT
EmployeeID | 01 | 02 | 03 | 04 |
__________________________________
100 | D | _ | _ | D/N |
I use the following query for this purpose, but I'm getting a slightly different result.
SELECT EID AS EmployeeID, [1],[2],[3],[4]
FROM (
SELECT
EID, datepart(dd,in_time) as [DAY],
STUFF((
SELECT '/ ' + Shift
FROM Attendance
WHERE ([in_time] = Results.[in_time] )
FOR XML PATH(''),TYPE).value('(./text())[1]','VARCHAR(MAX)')
,1,2,'') AS Shifts
FROM Attendance Results
WHERE EID = '100' AND PID ='C002'
GROUP BY EID , in_time
) AS SourceTable
PIVOT
(
MAX (Shifts )
FOR [DAY] IN ( [1],[2],[3],[4])
) AS PivotTable
This is the result of the query
EmployeeID | 01 | 02 | 03 | 04 |
________________________________
100 | D | _ | _ | N/N |
So something is wrong in my query. Could you please help me sort this out? What am I missing in this query? Do you know a better way to do this?
EDIT: I just realized that the above code works well as long as the Attendance table contains records for a single employee and a single point (PID).
If the table has details of several employees (EIDs) who work at different locations (PIDs), then the output is wrong. So the code is not consistent, but my SQL knowledge doesn't seem to be enough to sort this out without help :(
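For reference, here is a minimal sketch (not the original query) of how the FOR XML PATH correlation could be scoped per employee, per location and per calendar day, so the grouping stays consistent when several EIDs and PIDs are present; the EID/PID filter values are just the sample ones from the question:
-- Sketch only (assumed fix, not the asker's query): concatenate shifts per
-- employee, per location, per calendar day, then pivot on the day number.
SELECT EID AS EmployeeID, [1], [2], [3], [4]
FROM (
    SELECT
        a.EID,
        DATEPART(dd, a.in_time) AS [DAY],
        STUFF((
            SELECT '/' + b.Shift
            FROM Attendance b
            WHERE b.EID = a.EID                                   -- same employee
              AND b.PID = a.PID                                   -- same location
              AND CAST(b.in_time AS date) = CAST(a.in_time AS date) -- same calendar day
            ORDER BY b.Shift
            FOR XML PATH(''), TYPE).value('(./text())[1]', 'VARCHAR(MAX)')
        , 1, 1, '') AS Shifts
    FROM Attendance a
    WHERE a.EID = '100' AND a.PID = 'S001'
    GROUP BY a.EID, a.PID, CAST(a.in_time AS date), DATEPART(dd, a.in_time)
) AS SourceTable
PIVOT (
    MAX(Shifts)
    FOR [DAY] IN ([1], [2], [3], [4])
) AS PivotTable;
Correlating the inner query on EID, PID and CAST(in_time AS date), instead of on the full in_time value, is what keeps the D and N rows of the same day together while keeping different employees and locations apart.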

The answer below may help you. The SQL looks lengthy, but the logic is simple: just group the D and N data and finally join. Maybe you can work from this.
DECLARE @Lu_Dt table
(id int identity(1,1), Dt date)
INSERT INTO @Lu_Dt VALUES('2014-05-01'),
('2014-05-02'),('2014-05-03'),
('2014-05-04'),('2014-05-05')
DECLARE @tab table
(EID int, PID varchar(15), In_Time datetime, Out_Time datetime, [Shift] char(1))
INSERT INTO @tab VALUES
(100,'S001','2014-05-01 07:10','2014-05-01 19:20','D'),
(100,'S001','2014-05-04 07:00','2014-05-04 19:00','D'),
(100,'S001','2014-05-04 19:00','2014-05-05 07:00','N')
SELECT * FROM @tab
DECLARE @refTab table
(id int identity(1,1), EID int, Dt date, [shift] varchar(3))
INSERT INTO @refTab
SELECT Lu.EID, Lu.Dt, COALESCE(LU1.[Flag], LU2.[Flag], LU3.[Flag]) [shift]
FROM (SELECT *, 100 EID FROM @Lu_Dt) Lu
LEFT JOIN (SELECT D.EID, D.In_Time, 'D/N' [Flag]
FROM (SELECT EID, CAST(In_Time AS DATE) In_Time FROM @tab WHERE Shift = 'D') D
JOIN (SELECT EID, CAST(In_Time AS DATE) In_Time FROM @tab WHERE Shift = 'N') N
ON D.In_Time = N.In_Time AND D.EID = N.EID) LU1 ON Lu.Dt = LU1.In_Time
LEFT JOIN (SELECT CAST(In_Time AS DATE) In_Time, 'D' [Flag] FROM @tab WHERE Shift = 'D') LU2 ON Lu.Dt = LU2.In_Time
LEFT JOIN (SELECT CAST(In_Time AS DATE) In_Time, 'N' [Flag] FROM @tab WHERE Shift = 'N') LU3 ON Lu.Dt = LU3.In_Time
SELECT * FROM @refTab
SELECT *
FROM (SELECT EID, Dt, [shift] FROM @refTab) AS src
PIVOT
(MAX([shift]) for Dt in
( [2014-05-01],
[2014-05-02],
[2014-05-03],
[2014-05-04],
[2014-05-05])) AS PivotTable;
Result
As there is a lot of hard coding, it can't be used as such. Hope this helped.
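If the hard coding of the date columns is the main concern, one possible direction (a sketch, not part of the answer above) is to copy @refTab into a temp table and build the pivot column list dynamically; table variables are not visible inside a dynamic SQL batch, hence the copy into #refTab:
-- Sketch only: assumes the @refTab table variable populated above.
SELECT EID, Dt, [shift] INTO #refTab FROM @refTab;

DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build the column list "[2014-05-01],[2014-05-02],..." from the distinct dates.
SELECT @cols = STUFF((
        SELECT ',' + QUOTENAME(CONVERT(char(10), Dt, 120))
        FROM (SELECT DISTINCT Dt FROM #refTab) d
        ORDER BY Dt
        FOR XML PATH('')), 1, 1, '');

SET @sql = N'SELECT *
FROM (SELECT EID, Dt, [shift] FROM #refTab) AS src
PIVOT (MAX([shift]) FOR Dt IN (' + @cols + ')) AS p;';

EXEC sp_executesql @sql;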

Related

SQL Server - Insert lines with null values when month doesn't exist

I have a table like this one:
Yr | Mnth | W_ID | X_ID | Y_ID | Z_ID | Purchases | Sales | Returns |
2015 | 10 | 1 | 5210 | 1402 | 2 | 1000.00 | etc | etc |
2015 | 12 | 1 | 5210 | 1402 | 2 | 12000.00 | etc | etc |
2016 | 1 | 1 | 5210 | 1402 | 2 | 1000.00 | etc | etc |
2016 | 3 | 1 | 5210 | 1402 | 2 | etc | etc | etc |
2014 | 3 | 9 | 880 | 2 | 7 | etc | etc | etc |
2014 | 12 | 9 | 880 | 2 | 7 | etc | etc | etc |
2015 | 5 | 9 | 880 | 2 | 7 | etc | etc | etc |
2015 | 7 | 9 | 880 | 2 | 7 | etc | etc | etc |
For each combination of (W, X, Y, Z) I would like to insert the months that don't appear in the table and are between the first and last month.
In this example, for combination (W=1, X=5210, Y=1402, Z=2), I would like to have additional rows for 2015/11 and 2016/02, where Purchases, Sales and Returns are NULL. For combination (W=9, X=880, Y=2, Z=7) I would like to have additional rows for the months between 2014/04 and 2014/11, between 2015/01 and 2015/04, and for 2015/06.
I hope I have explained myself correctly.
Thank you in advance for any help you can provide.
The process is rather cumbersome in this case, but quite possible. One method uses a recursive CTE. Another uses a numbers table. I'm going to use the latter.
The idea is:
Find the minimum and maximum values for the year/month combination for each set of ids. For this, the values will be turned into months since time 0 using the formula year*12 + month.
Generate a bunch of numbers.
Generate all rows between the two values for each combination of ids.
For each generated row, use arithmetic to re-extract the year and month.
Use left join to bring in the original data.
The query looks like:
with n as (
select row_number() over (order by (select null)) - 1 as n -- start at 0
from master..spt_values
),
minmax as (
select w_id, x_id, y_id, z_id, min(yr*12 + mnth) as minyyyymm,
max(yr*12 + mnth) as maxyyyymm
from t
group by w_id, x_id, y_id, z_id
),
wxyz as (
select minmax.*, minmax.minyyyymm + n.n as yyyymm,
(minmax.minyyyymm + n.n - 1) / 12 as yyyy,
((minmax.minyyyymm + n.n - 1) % 12) + 1 as mm
from minmax join
n
on minmax.minyyyymm + n.n <= minmax.maxyyyymm
)
select wxyz.yyyy, wxyz.mm, wxyz.w_id, wxyz.x_id, wxyz.y_id, wxyz.z_id,
<columns from t here>
from wxyz left join
t
on wxyz.w_id = t.w_id and wxyz.x_id = t.x_id and wxyz.y_id = t.y_id and
wxyz.z_id = t.z_id and wxyz.yyyy = t.yr and wxyz.mm = t.mnth;
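As a quick sanity check of the month-number arithmetic (not part of the original answer): mnth runs from 1 to 12 rather than 0 to 11, which is why the extraction above subtracts 1 before dividing.
-- Illustrative only: encode 2015-12 as 2015*12 + 12 = 24192, then decode it.
SELECT (2015*12 + 12 - 1) / 12       AS yyyy,  -- 2015 (24192 / 12 alone would give 2016)
       ((2015*12 + 12 - 1) % 12) + 1 AS mm;    -- 12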
Thank you for your help.
Your solution works, but I noticed it is not very good in terms of performance; meanwhile, I have managed to come up with a solution to my problem.
DECLARE @start_date DATE, @end_date DATE;
SET @start_date = (SELECT MIN(EOMONTH(DATEFROMPARTS(Yr, Mnth, 1))) FROM Table_Input);
SET @end_date   = (SELECT MAX(EOMONTH(DATEFROMPARTS(Yr, Mnth, 1))) FROM Table_Input);

DECLARE @tdates TABLE (Period DATE, Yr INT, Mnth INT);

WHILE @start_date <= @end_date
BEGIN
    INSERT INTO @tdates(Period, Yr, Mnth) VALUES(@start_date, YEAR(@start_date), MONTH(@start_date));
    SET @start_date = EOMONTH(DATEADD(mm, 1, DATEFROMPARTS(YEAR(@start_date), MONTH(@start_date), 1)));
END

DECLARE @pks TABLE (W_ID NVARCHAR(50), X_ID NVARCHAR(50)
                  , Y_ID NVARCHAR(50), Z_ID NVARCHAR(50)
                  , PerMin DATE, PerMax DATE);

INSERT INTO @pks (W_ID, X_ID, Y_ID, Z_ID, PerMin, PerMax)
SELECT W_ID, X_ID, Y_ID, Z_ID
     , MIN(EOMONTH(DATEFROMPARTS(Yr, Mnth, 1))) AS PerMin
     , MAX(EOMONTH(DATEFROMPARTS(Yr, Mnth, 1))) AS PerMax
FROM Table_Input
GROUP BY W_ID, X_ID, Y_ID, Z_ID;

INSERT INTO Table_Output(Yr, Mnth, W_ID, X_ID, Y_ID, Z_ID
     , Purchases, Sales, Returns)
SELECT TP.Yr, TP.Mnth, TP.W_ID, TP.X_ID, TP.Y_ID, TP.Z_ID
     , TA.Purchases, TA.Sales, TA.Returns
FROM
(
    SELECT Period, Yr, Mnth, W_ID, X_ID, Y_ID, Z_ID
    FROM @tdates CROSS JOIN @pks
    WHERE Period BETWEEN PerMin AND PerMax
) AS TP
LEFT JOIN Table_Input AS TA
    ON  TP.W_ID = TA.W_ID AND TP.X_ID = TA.X_ID AND TP.Y_ID = TA.Y_ID
    AND TP.Z_ID = TA.Z_ID
    AND TP.Yr = TA.Yr
    AND TP.Mnth = TA.Mnth
ORDER BY TP.W_ID, TP.X_ID, TP.Y_ID, TP.Z_ID, TP.Yr, TP.Mnth;
I do the following:
Get the min and max dates of the entire table - the @start_date and @end_date variables;
Create an auxiliary table with all the dates between min and max - the @tdates table variable;
Get all the combinations of (W_ID, X_ID, Y_ID, Z_ID) along with the min and max dates of each combination - the @pks table variable;
Create the Cartesian product between @tdates and @pks, and in the WHERE clause filter the results between the min and max of each combination;
Compute a LEFT JOIN of the Cartesian product with the input data table.

Find total days in MS SQL 2008

I need help finding the total days in MS SQL 2008. For example, I have a course table like the following:
+----------+------------+------------+
| Course | DateFrom | DateTo |
+----------+------------+------------+
| Course1a | 12/22/2015 | 12/22/2015 |
| Course1b | 12/22/2015 | 12/22/2015 |
| Course1c | 12/24/2015 | 12/28/2015 |
+----------+------------+------------+
and a Holiday table that stores holidays, which means no course runs on that day:
+-----------+------------+
| name | DateFrom |
+-----------+------------+
| Christmas | 12/25/2015 |
+-----------+------------+
Here I want the total days for Course1 to be 5 days (12/22, 12/24, 12/25 (not counted, Christmas holiday), 12/26, 12/27, 12/28).
One way to achieve it is to use:
;WITH tally AS
(
SELECT TOP 1000 r = ROW_NUMBER() OVER(ORDER BY (SELECT 1)) - 1
FROM master..spt_values
), cte AS
(
SELECT Course, DATEADD(d, t.r, c.DateFrom) AS dat
FROM #courses c
JOIN tally t
ON DATEADD(d, t.r, c.DateFrom) <= c.DateTo
)
SELECT LEFT(Course, 7) AS Course_Name,
COUNT(DISTINCT dat) AS Total_Days
FROM cte c
LEFT JOIN #holidays h
ON c.dat = h.DateFrom
WHERE h.DateFrom IS NULL
GROUP BY LEFT(Course, 7);
Output:
╔═════════════╦════════════╗
║ Course_Name ║ Total_days ║
╠═════════════╬════════════╣
║ Course1 ║ 5 ║
╚═════════════╩════════════╝
How it works:
tally generates a numbers table (any method will do)
cte expands each course's DateFrom/DateTo range into one row per date
the LEFT JOIN with the holidays table excludes holiday dates
GROUP BY LEFT(Course, 7) is a workaround (your course names should be distinct without the suffixes a, b, c, or you need another column indicating that the three courses together form one course)
COUNT only DISTINCT dates to get the total number of days
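For completeness, here is a small setup sketch for the #courses and #holidays temp tables the query assumes (they are not shown in the answer; the column types are guesses and the data comes from the question):
-- Sample data matching the question's tables (assumed column types).
CREATE TABLE #courses (Course varchar(20), DateFrom date, DateTo date);
CREATE TABLE #holidays (name varchar(20), DateFrom date);

INSERT INTO #courses VALUES
('Course1a', '2015-12-22', '2015-12-22'),
('Course1b', '2015-12-22', '2015-12-22'),
('Course1c', '2015-12-24', '2015-12-28');

INSERT INTO #holidays VALUES ('Christmas', '2015-12-25');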

How to transform rows into columns? [duplicate]

This question already has answers here:
Convert Rows to columns using 'Pivot' in SQL Server
I have a table like this, and there are only two features for each user in this table:
+-------+---------+-----------+----------+
| User | Feature | StartDate | EndDate |
+-------+---------+-----------+----------+
| Peter | F1 | 2015/1/1 | 2015/2/1 |
| Peter | F2 | 2015/3/1 | 2015/4/1 |
| John | F1 | 2015/5/1 | 2015/6/1 |
| John | F2 | 2015/7/1 | 2015/8/1 |
+-------+---------+-----------+----------+
I want to transform to
+-------+--------------+------------+--------------+------------+
| User | F1_StartDate | F1_EndDate | F2_StartDate | F2_EndDate |
+-------+--------------+------------+--------------+------------+
| Peter | 2015/1/1 | 2015/2/1 | 2015/3/1 | 2015/4/1 |
| John | 2015/5/1 | 2015/6/1 | 2015/7/1 | 2015/8/1 |
+-------+--------------+------------+--------------+------------+
If you are using SQL Server 2005 or up by any chance, PIVOT is what you are looking for.
The best general way to perform this sort of operation is a simple GROUP BY with conditional aggregation. This should work across all major RDBMSs:
select [User],
max(case when Feature='F1' then StartDate else null end) F1_StartDate,
max(case when Feature='F1' then EndDate else null end) F1_EndDate,
max(case when Feature='F2' then StartDate else null end) F2_StartDate,
max(case when Feature='F2' then EndDate else null end) F2_EndDate
from Table1
group by [User]
Note: as mentioned in the comments, this is often bad practice, as depending on your needs, it can make the data harder to work with. However, there are cases where it makes sense, such as when you have a small, limited number of values.
This is a bit of a hack with a CTE:
;WITH CTE AS (
SELECT [User], [Feature] + '_StartDate' AS [Type], StartDate AS [Date]
FROM Table1
UNION ALL
SELECT [User], [Feature] + '_EndDate' AS [Type], EndDate AS [Date]
FROM Table1)
SELECT * FROM CTE
PIVOT(MAX([Date]) FOR [Type] IN ([F1_StartDate],[F2_StartDate], [F1_EndDate], [F2_EndDate])) PIV
Use UNPIVOT & PIVOT like this:
Test data:
DECLARE @t table
(User1 varchar(20),Feature char(2),StartDate date,EndDate date)
INSERT @t VALUES
('Pete','F1','2015/1/1','2015/2/1'),
('Pete','F2','2015/3/1','2015/4/1'),
('John','F1','2015/5/1','2015/6/1'),
('John','F2','2015/7/1','2015/8/1')
Query:
;WITH CTE AS
(
SELECT User1, date1, Feature + '_' + Seq cat
FROM @t AS p
UNPIVOT
(date1 FOR Seq IN
([StartDate], [EndDate]) ) AS unpvt
)
SELECT * FROM CTE
PIVOT
(MIN(date1)
FOR cat
IN ([F1_StartDate],[F1_EndDate],[F2_StartDate],[F2_EndDate])
) as p
Result:
User1 F1_StartDate F1_EndDate F2_StartDate F2_EndDate
John 2015-05-01 2015-06-01 2015-07-01 2015-08-01
Pete 2015-01-01 2015-02-01 2015-03-01 2015-04-01

SQL Add hours for employees

I have a table for employees signing in and out. It has date and time fields for in and out, and a PersonID number that links to the employee's name, etc.
I need to work out the difference between the 2 dates and times then add them all together for each employee.
select a.*,
b.timein,
b.timeout,
datediff(mi,b.timein,b.timeout) as total_mins
from tbl_people a
left join tbl_register b on a.id=b.personid
Output:
+----+-----------+----------+-------------------------+-------------------------+------------+
| ID | FirstName | LastName | TimeIn | TimeOut | Total_Mins |
+----+-----------+----------+-------------------------+-------------------------+------------+
| 1 | David | Test | 2015-05-12 12:11:00.000 | 2015-05-12 12:13:00.000 | 2 |
| 2 | David | Test | 2015-05-12 12:15:00.000 | 2015-05-12 12:18:00.000 | 3 |
+----+-----------+----------+-------------------------+-------------------------+------------+
This is what I'm currently getting. I would like it to show one record for each person with the total number of minutes worked.
Thanks in anticipation!
Basically you have at least 2 options:
Option 1 - Use DISTINCT and SUM with OVER clause:
SELECT DISTINCT a.*,
SUM(DATEDIFF(mi, b.timein, b.timeout)) OVER(PARTITION BY a.id) AS total_mins
FROM tbl_people a
LEFT JOIN tbl_register b ON a.id=b.personid
Option 2 - Use a derived table for the GROUP BY part:
SELECT a.*,
total_mins
from tbl_people a
left join (
SELECT personid,
SUM(DATEDIFF(mi, timein, timeout)) AS total_mins
FROM tbl_register
GROUP BY personid
) b ON a.id=b.personid
select
ppl.FirstName + ' ' + ppl.LastName as 'Person',
sum( datediff(mi, reg.timein, reg.timeout)) as 'total_mins'
from
tbl_people ppl
left join tbl_register reg on ppl.id = reg.personid
group by
ppl.FirstName + ' ' + ppl.LastName

Count by unique ID, group by another

I've inherited some scripts that count the number of people in a team by department; the current scripts create a table for each individual department and the previous user would copy/paste the data into Excel. I've been tasked to pull this report into SSRS so I need one table for all the departments by team.
Current Table
+-------+-----------+---------+
| Dept | DataMatch | Team |
+-------+-----------+---------+
| 01 | 4687Joe | Dodgers |
| 01 | 3498Cindy | RedSox |
| 01 | 1057Bob | Yankees |
| 01 | 0497Lucy | Dodgers |
| 02 | 7934Jean | Yankees |
| 02 | 4584Tom | Dodgers |
+-------+-----------+---------+
Desired Results
+-------+---------+--------+---------+
| Dept | Dodgers | RedSox | Yankees |
+-------+---------+--------+---------+
| 01 | 2 | 1 | 1 |
| 02 | 1 | 0 | 1 |
+-------+---------+--------+---------+
The DataMatch field is the unique identifier I will be counting. I started by wrapping each department in a CTE; however, this puts Dept in the columns, which would not work for my report, so I need to transpose my results, and I haven't been able to figure that out. There are 60 departments, and my query was getting very long.
Current query
SELECT Dept, DataMatch, Team INTO #temp_Team
FROM TeamDatabase
WHERE Status = 14
AND Team <> 'Missing'
;WITH A_cte (Team, Dept01)
AS
(
SELECT Team
, COUNT(DISTINCT datamatch) AS 'Dept01'
FROM #temp_Team
WHERE Dept = '01'
GROUP BY Team
),
B_cte (Team, Dept02) AS
(
SELECT Team
, COUNT(DISTINCT datamatch) AS 'Dept02'
FROM #temp_Team
WHERE Dept = '02'
GROUP BY Team
)
SELECT A_cte.Team
, A_cte.Dept01
, B_cte.Dept02
FROM A_cte
INNER JOIN B_cte
ON A_cte.Team=B_cte.Team
Which results in:
+----------------------------+-------+-------+
| Team | Prg01 | Prg02 |
+----------------------------+-------+-------+
| RedSox | 144 | 141 |
| Yankees | 63 | 236 |
| Dodgers | 298 | 196 |
+----------------------------+-------+-------+
I feel that using a pivot on top of my already very long query would be excessive and would impact performance: 60 departments with over 30,000 rows.
What, most likely basic, step am I missing?
TL;DR - How do I count people by team and list by department?
I would replace the whole query with a dynamic pivot instead of adding a pivot to your CTEs.
You can add your Status/Team conditions to the SELECT inside the dynamic query at the bottom. They would be WHERE Status = 14 AND Team <> ''Missing''; note that those are two single quotes, used to nest the literal within the string (a sketch with the conditions folded in follows the code below).
IF OBJECT_ID('tempdb..#data') IS NOT NULL DROP TABLE #data
CREATE TABLE #data (Dept VARCHAR(50), DataMatch NVARCHAR(50), Team VARCHAR(50))
INSERT INTO #data (Dept, DataMatch, Team)
VALUES ('01', '4687Joe','Dodgers'),
('01', '3498Cindy','RedSox'),
('01', '1057Bob','Yankees'),
('01', '0497Lucy','Dodgers'),
('02', '7934Jean','Yankees'),
('02', '4584Tom','Dodgers')
DECLARE @cols AS NVARCHAR(MAX),
@sql AS NVARCHAR(MAX)
SET @cols = STUFF(
(SELECT N',' + QUOTENAME(y) AS [text()]
FROM (SELECT DISTINCT Team AS y FROM #data) AS Y
ORDER BY y
FOR XML PATH('')),
1, 1, N'');
SET @sql = 'SELECT Dept, '+@cols+'
FROM (SELECT Dept, DataMatch, Team
FROM #data D) SUB
PIVOT (COUNT([DataMatch]) FOR Team IN ('+@cols+')) AS P'
PRINT @sql
EXEC (@sql)
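For example, with the conditions folded in (a sketch that swaps the #data sample for the asker's actual TeamDatabase table; note the doubled single quotes around Missing):
-- Sketch only: same dynamic query, but reading from TeamDatabase with the
-- asker's Status/Team filters nested inside the string.
SET @sql = 'SELECT Dept, ' + @cols + '
FROM (SELECT Dept, DataMatch, Team
      FROM TeamDatabase
      WHERE Status = 14 AND Team <> ''Missing'') SUB
PIVOT (COUNT([DataMatch]) FOR Team IN (' + @cols + ')) AS P'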
In case you don't want to use a dynamic pivot, here is just a stand-alone query... again, add your conditions as you need.
SELECT Dept, Dodgers, RedSox, Yankees
FROM (SELECT Dept, DataMatch, Team
FROM #data D) SUB
PIVOT (COUNT([DataMatch]) FOR Team IN ([Dodgers], [RedSox], [Yankees])) AS P
I'm not sure I follow what relevance your existing query has, but to get from your current table to your desired results is a pretty straightforward usage of PIVOT:
SELECT *
FROM Table1
PIVOT(COUNT(DataMatch) FOR Team IN (Dodgers,RedSox,Yankees))pvt
And this of course could be done dynamically if the teams list isn't static.