How to group subtotals on the same row by date, by code - sql

I couldn't find an equivalent question on here. Apologies if this is a repeat.
Basically, I have a table of transactions. Each transaction has a code and a datetime stamp. I want to write a SQL query whose results look something like this:
+------------+--------+--------+-------+--------+-------+--------+
| DATE | CODE1 | COUNT1 | CODE2 | COUNT2 | CODE3 | COUNT3 |
+------------+--------+--------+-------+--------+-------+--------+
| 2017-01-01 | George | 12 | John | 10 | Ringo | 114 |
+------------+--------+--------+-------+--------+-------+--------+
I currently have a query that returns the subtotals on individual lines, i.e.:
SELECT CAST(mytime AS DATE), code, COUNT(*) FROM transactiontable
GROUP BY CAST(mytime AS DATE), code
ORDER BY CAST(mytime AS DATE), code
which gives me
DATE CODE COUNT
-----------------------------------
2017-01-01 George 12
2017-01-01 John 10
etc ...
I don't currently have a separate table for the codes, but I am considering it.
Thanks !

You can also use PIVOT for this.
DECLARE @Table TABLE (DATE DATETIME, CODE VARCHAR(10), [COUNT] INT)
INSERT INTO @Table
VALUES
('2017-01-01','George',12),
('2017-01-01','John',10)
;WITH CTE AS
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY DATE), * FROM @Table
)
)
SELECT * FROM
(SELECT DATE, CONCAT('CODE',RN) RN, CODE Value FROM CTE
UNION ALL
SELECT DATE, CONCAT('COUNT',RN) RN, CONVERT(VARCHAR,[COUNT]) Value FROM CTE
) SRC
PIVOT (MAX(Value) FOR RN IN ([CODE1],[COUNT1],[CODE2],[COUNT2])) PVT
Result:
DATE CODE1 COUNT1 CODE2 COUNT2
----------- ----------- ----------- -------- -------
2017-01-01 George 12 John 10

You can use the window function row_number to number the codes within each date, and then use conditional aggregation to pivot:
select dt,
max(case when rn = 1 then code end) as code_1,
max(case when rn = 1 then cnt end) as count_1,
max(case when rn = 2 then code end) as code_2,
max(case when rn = 2 then cnt end) as count_2,
max(case when rn = 3 then code end) as code_3,
max(case when rn = 3 then cnt end) as count_3
-- .... (repeat the pair of columns for rn = 4, 5, and so on, as needed)
from (
select convert(date, mytime) as dt,
code,
count(*) as cnt,
row_number() over (partition by convert(date, mytime) order by code) as rn
from transactiontable
group by convert(date, mytime), code
) t
group by dt
order by dt;


First value in DATE minus 30 days SQL

I have a bunch of data, from which I'm showing the ID, the max date and its corresponding values (user id, type, ...). Then I need to take the MAX date for each ID, subtract 30 days, and show the first date within that period along with its corresponding values.
Example:
ID Date Name
1 01.05.2018 AAA
1 21.04.2018 CCC
1 05.04.2018 BBB
1 28.03.2018 AAA
expected:
ID max_date max_name previous_date previous_name
1 01.05.2018 AAA 05.04.2018 BBB
I have a working solution using subselects, but because my WHERE clause is quite large, the refresh takes ages.
SUBSELECT looks like that:
(SELECT MIN(N.name)
FROM t1 N
WHERE N.ID = T.ID
AND (N.date < MAX(T.date) AND N.date >= (MAX(T.date)-30))
AND (...)) AS PreviousName
How'd you write the select?
I'm using TSQL
Thanks
I can do this with 2 CTEs to build up the dates and names.
SQL Fiddle
MS SQL Server 2017 Schema Setup:
CREATE TABLE t1 (ID int, theDate date, theName varchar(10)) ;
INSERT INTO t1 (ID, theDate, theName)
VALUES
( 1,'2018-05-01','AAA' )
, ( 1,'2018-04-21','CCC' )
, ( 1,'2018-04-05','BBB' )
, ( 1,'2018-03-27','AAA' )
, ( 2,'2018-05-02','AAA' )
, ( 2,'2018-05-21','CCC' )
, ( 2,'2018-03-03','BBB' )
, ( 2,'2018-01-20','AAA' )
;
Main Query:
;WITH cte1 AS (
SELECT t1.ID, t1.theDate, t1.theName
, DATEADD(day,-30,t1.theDate) AS dMinus30
, ROW_NUMBER() OVER (PARTITION BY t1.ID ORDER BY t1.theDate DESC) AS rn
FROM t1
)
, cte2 AS (
SELECT c2.ID, c2.theDate, c2.theName
, ROW_NUMBER() OVER (PARTITION BY c2.ID ORDER BY c2.theDate) AS rn
, COUNT(*) OVER (PARTITION BY c2.ID) AS theCount
FROM cte1
INNER JOIN cte1 c2 ON cte1.ID = c2.ID
AND c2.theDate >= cte1.dMinus30
WHERE cte1.rn = 1
GROUP BY c2.ID, c2.theDate, c2.theName
)
SELECT cte1.ID, cte1.theDate AS max_date, cte1.theName AS max_name
, cte2.theDate AS previous_date, cte2.theName AS previous_name
, cte2.theCount
FROM cte1
INNER JOIN cte2 ON cte1.ID = cte2.ID
AND cte2.rn=1
WHERE cte1.rn = 1
Results:
| ID | max_date | max_name | previous_date | previous_name |
|----|------------|----------|---------------|---------------|
| 1 | 2018-05-01 | AAA | 2018-04-05 | BBB |
| 2 | 2018-05-21 | CCC | 2018-05-02 | AAA |
cte1 computes, for each row, the date 30 days earlier, and uses a ROW_NUMBER() window function to rank each ID's rows from the most recent date down, so rn = 1 marks the max_date and max_name. cte2 joins back to that row to collect every date within the 30 days leading up to cte1's max date, then numbers them in ascending order so that rn = 1 is the earliest date in the window. The outer query joins the two results, keeping only the most recent row from cte1 and the earliest row from cte2.
I'm not sure how well it will scale with your data, but the optimizer should handle the CTEs reasonably well.
EDIT: For the additional requirement, I just added in another COUNT() window function to cte2.
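A more compact variant of the same idea, offered as a sketch rather than a tuned solution: compute each ID's latest date with a windowed MAX(), keep only the rows inside the 30-day window, and pick the newest and oldest of those with two ROW_NUMBER() calls. The table variable @t1, the CTE names and the rnLatest/rnEarliest aliases are assumptions made so the example is self-contained; the rows are the sample data from the schema setup.

```sql
-- Assumption: @t1 stands in for the real table t1.
DECLARE @t1 TABLE (ID int, theDate date, theName varchar(10));
INSERT INTO @t1 (ID, theDate, theName)
VALUES (1,'2018-05-01','AAA'), (1,'2018-04-21','CCC'),
       (1,'2018-04-05','BBB'), (1,'2018-03-27','AAA'),
       (2,'2018-05-02','AAA'), (2,'2018-05-21','CCC'),
       (2,'2018-03-03','BBB'), (2,'2018-01-20','AAA');

WITH withMax AS (
    -- Attach each ID's latest date to every one of its rows.
    SELECT ID, theDate, theName,
           MAX(theDate) OVER (PARTITION BY ID) AS maxDate
    FROM @t1
),
inWindow AS (
    -- Keep only rows within 30 days of the latest date, ranked both ways.
    SELECT ID, theDate, theName,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY theDate DESC) AS rnLatest,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY theDate ASC)  AS rnEarliest
    FROM withMax
    WHERE theDate >= DATEADD(day, -30, maxDate)
)
SELECT ID,
       MAX(CASE WHEN rnLatest   = 1 THEN theDate END) AS max_date,
       MAX(CASE WHEN rnLatest   = 1 THEN theName END) AS max_name,
       MAX(CASE WHEN rnEarliest = 1 THEN theDate END) AS previous_date,
       MAX(CASE WHEN rnEarliest = 1 THEN theName END) AS previous_name
FROM inWindow
GROUP BY ID
ORDER BY ID;
```

On the sample data this returns 2018-05-01 / AAA with 2018-04-05 / BBB for ID 1, and 2018-05-21 / CCC with 2018-05-02 / AAA for ID 2. As in the CTE query, when no other row falls inside the window the max row itself is reported as the previous one.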
I would do:
select id,
max(case when seqnum = 1 then date end) as max_date,
max(case when seqnum = 1 then name end) as max_name,
max(case when seqnum = 2 then date end) as prev_date,
max(case when seqnum = 2 then name end) as prev_name
from (select e.*, row_number() over (partition by id order by date desc) as seqnum
from example e
) e
group by id;

select top N records for each entity

I have a table like below -
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
2 | 2016-03-08 09:08:32.827 | 1
3 | 2016-03-08 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
5 | 2016-03-05 09:08:32.827 | 2
Now, I want the top 1 row, based on the date column, for each Device_ID.
Expected Output
ID | Reported Date | Device_ID
-------------------------------------------
1 | 2016-03-09 09:08:32.827 | 1
4 | 2016-03-10 09:08:32.827 | 2
I am using SQL Server 2008 R2. I could write a stored procedure to handle it, but I want to do it with a simple query.
EDIT: The answer by 'Felix Pamittan' worked well. For the top 'N' rows, just change it to
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn <= N
He mentioned this in a comment; I'm adding it to the question so that nobody misses it.
Use ROW_NUMBER:
SELECT
Id, [Reported Date], Device_ID
FROM (
SELECT *,
Rn = ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC)
FROM tbl
)t
WHERE Rn = 1
You can also try using CTE
With DeviceCTE AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Device_ID ORDER BY [Reported Date] DESC) AS Num
FROM tblname)
SELECT Id, [Reported Date], Device_ID
From DeviceCTE
Where Num = 1
If you can't use an analytic function, e.g. because your application layer won't allow it, then you can try the following solution which uses a subquery to arrive at the answer:
SELECT t1.ID, t2.maxDate, t1.Device_ID
FROM yourTable t1
INNER JOIN
(
SELECT Device_ID, MAX([Reported Date]) AS maxDate
FROM yourTable
GROUP BY Device_ID
) t2
ON t1.Device_ID = t2.Device_ID AND t1.[Reported Date] = t2.maxDate
Select * from DEVICE_TABLE D
where [Reported Date] = (Select Max([Reported Date]) from DEVICE_TABLE where Device_ID = D.Device_ID)
should do the trick, assuming that "top 1 row based on date column" means that you want the latest reported date for each Device_ID.
As for your title, to select the top 5 rows for each Device_ID:
Select * from DEVICE_TABLE D
where [Reported Date] in (Select top 5 [Reported Date] from DEVICE_TABLE D2 where D2.Device_ID = D.Device_ID order by [Reported Date] desc)
order by Device_ID, [Reported Date] desc
will give you the 5 latest reports for each device id.
The ORDER BY inside the subquery matters: without it, TOP 5 returns an arbitrary 5 dates rather than the latest ones. If several rows share a date, this can return more than 5 rows per device.
Again with no analytic functions, you can use CROSS APPLY:
DECLARE @tbl TABLE(Id INT,[Reported Date] DateTime , Device_ID INT)
INSERT INTO @tbl
VALUES
(1,'2016-03-09 09:08:32.827',1),
(2,'2016-03-08 09:08:32.827',1),
(3,'2016-03-08 09:08:32.827',1),
(4,'2016-03-10 09:08:32.827',2),
(5,'2016-03-05 09:08:32.827',2)
SELECT r.*
FROM ( SELECT DISTINCT Device_ID FROM @tbl ) d
CROSS APPLY ( SELECT TOP 1 *
FROM @tbl t
WHERE d.Device_ID = t.Device_ID
ORDER BY t.[Reported Date] DESC ) r
Can be easily modified to support N records.
Credits go to wBob answering this question here
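One possible way to extend the CROSS APPLY query to N rows per device, as a sketch: the variable @N and the Id tie-breaker in the ORDER BY are assumptions added for illustration, not part of the answer above. The inline DECLARE initialisation and multi-row VALUES need SQL Server 2008 or later.

```sql
-- Assumption: @N is a hypothetical parameter holding the rows wanted per device.
DECLARE @N int = 2;

DECLARE @tbl TABLE (Id int, [Reported Date] datetime, Device_ID int);
INSERT INTO @tbl (Id, [Reported Date], Device_ID)
VALUES (1,'2016-03-09 09:08:32.827',1),
       (2,'2016-03-08 09:08:32.827',1),
       (3,'2016-03-08 09:08:32.827',1),
       (4,'2016-03-10 09:08:32.827',2),
       (5,'2016-03-05 09:08:32.827',2);

SELECT r.*
FROM (SELECT DISTINCT Device_ID FROM @tbl) AS d
CROSS APPLY (
    -- Latest @N rows for this device; Id breaks ties between equal timestamps.
    SELECT TOP (@N) t.*
    FROM @tbl AS t
    WHERE t.Device_ID = d.Device_ID
    ORDER BY t.[Reported Date] DESC, t.Id
) AS r
ORDER BY r.Device_ID, r.[Reported Date] DESC, r.Id;
```

With @N = 2 and the sample rows, this returns Ids 1 and 2 for device 1 and Ids 4 and 5 for device 2. Without the Id tie-breaker, either of the two rows dated 2016-03-08 could be picked for device 1.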

SQL count changes in column

I need to count the changes in the assigned group on a ticket. The problem is that my log also records changes of assignee within the same group.
Here is some sample data
ticket_id | assigned_group | assignee | date
----------------------------------------------------
1001 | group A | john | 1-1-15
1001 | group A | michael | 1-2-15
1001 | group A | jacob | 1-3-15
1001 | group B | eddie | 1-4-15
1002 | group A | john | 1-1-15
1002 | group B | eddie | 1-2-15
1002 | group A | john | 1-3-15
1002 | group B | eddie | 1-4-15
1002 | group A | john | 1-5-15
I need this to return
ticket_id | count
--------------------
10001 | 2
10002 | 4
My query is like this
select ticket_id, assigned_group, count(*) from mytable group by ticket_id, assigned_group
But that gives me
ticket_id | count
--------------------
10001 | 4
10002 | 5
edit:
Also if I use
select ticket_id, count(Distinct assigned_group) as [Count] from mytable group by ticket_id
I only get
ticket_id | count
--------------------
10001 | 2
10002 | 2
Any advice?
Use Distinct Count to get the result
select ticket_id, count(Distinct assigned_group) as [Count]
from mytable
group by ticket_id
try this..
with temp as
(
select ticket_id, assigned_group, count(*) as count,date from mytable group by ticket_id, assigned_group,date
)
select ticket_id, count from temp
You can use the ROW_NUMBER() function to compare each record with the previous one.
with tbl as (select *, row_number() over(partition by ticket_id order by [date]) as rn from mytable)
select a.ticket_id,
sum(case when a.assigned_group <> b.assigned_group then 1 else 0 end) as No_of_change
from tbl as a
left join tbl as b
on a.ticket_id = b.ticket_id
and a.rn = b.rn + 1
group by a.ticket_id
If you are using SQL Server 2012 or later, then you can use the LAG function to determine the previous assigned group easily. Then, if the previous assigned group is different from the current assigned group, you can increment the count, as below:
WITH previous_groups AS
(
SELECT
ticket_id,
assign_date,
assigned_group,
LAG(assigned_group, 1, NULL) OVER (PARTITION BY ticket_id ORDER BY assign_date) AS prev_assign_group
FROM mytable
)
SELECT
ticket_id,
SUM(CASE
WHEN assigned_group <> prev_assign_group THEN 1
ELSE 0
END) AS count
FROM previous_groups
WHERE prev_assign_group IS NOT NULL
GROUP BY ticket_id
ORDER BY ticket_id;
If you are using SQL Server 2008 or earlier versions, then you need an extra step to determine the previous assigned group, as below:
WITH previous_assign_dates AS
(
SELECT
mt1.ticket_id,
mt1.assign_date,
MAX(mt2.assign_date) AS prev_assign_date
FROM mytable mt1
LEFT JOIN mytable mt2
ON mt1.ticket_id = mt2.ticket_id
AND mt2.assign_date < mt1.assign_date
GROUP BY
mt1.ticket_id,
mt1.assign_date
),
previous_groups AS
(
SELECT
mt1.*,
mt2.assigned_group AS prev_assign_group
FROM mytable mt1
INNER JOIN previous_assign_dates pad
ON mt1.ticket_id = pad.ticket_id
AND mt1.assign_date = pad.assign_date
LEFT JOIN mytable mt2
ON pad.ticket_id = mt2.ticket_id
AND pad.prev_assign_date = mt2.assign_date
)
SELECT
ticket_id,
SUM(CASE
WHEN assigned_group <> prev_assign_group THEN 1
ELSE 0
END) AS count
FROM previous_groups
WHERE prev_assign_group IS NOT NULL
GROUP BY ticket_id
ORDER BY ticket_id;
SQL Fiddle demo
References:
The LAG function on MSDN
Adding an ordinal number within the ticket, then a self join where the group is different and consecutive ordinals, should work:
SELECT t1.ticket_id, COUNT(*) FROM
(SELECT *, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY date) ordinal
FROM mytable) t1
JOIN
(SELECT *, ROW_NUMBER() OVER(PARTITION BY ticket_id ORDER BY date) ordinal FROM mytable) t2
ON t1.ticket_id=t2.ticket_id AND t1.assigned_group<>t2.assigned_group AND t1.ordinal+1=t2.ordinal
GROUP BY t1.ticket_id
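To see which rows the ordinal-based comparison treats as a change, it can help to look at the step before aggregation. The sketch below loads the question's sample rows into a table variable and lists each row next to the previous group on the same ticket. The names @mytable, prev_group, cur_group and is_change are assumptions, and the dates are rewritten in ISO form on the assumption that 1-1-15 through 1-5-15 are consecutive days.

```sql
-- Assumption: @mytable stands in for the real log table.
DECLARE @mytable TABLE (ticket_id int, assigned_group varchar(20),
                        assignee varchar(20), [date] date);
INSERT INTO @mytable (ticket_id, assigned_group, assignee, [date])
VALUES (1001,'group A','john',   '2015-01-01'),
       (1001,'group A','michael','2015-01-02'),
       (1001,'group A','jacob',  '2015-01-03'),
       (1001,'group B','eddie',  '2015-01-04'),
       (1002,'group A','john',   '2015-01-01'),
       (1002,'group B','eddie',  '2015-01-02'),
       (1002,'group A','john',   '2015-01-03'),
       (1002,'group B','eddie',  '2015-01-04'),
       (1002,'group A','john',   '2015-01-05');

WITH ordered AS (
    -- Number each ticket's rows in date order.
    SELECT ticket_id, assigned_group, [date],
           ROW_NUMBER() OVER (PARTITION BY ticket_id ORDER BY [date]) AS ordinal
    FROM @mytable
)
SELECT cur.ticket_id, cur.[date],
       prev.assigned_group AS prev_group,
       cur.assigned_group  AS cur_group,
       -- The first row of a ticket has no predecessor, so it is never a change.
       CASE WHEN cur.assigned_group <> prev.assigned_group THEN 1 ELSE 0 END AS is_change
FROM ordered AS cur
LEFT JOIN ordered AS prev
       ON prev.ticket_id = cur.ticket_id
      AND prev.ordinal + 1 = cur.ordinal
ORDER BY cur.ticket_id, cur.[date];
```

Summing is_change per ticket gives 1 for ticket 1001 and 4 for ticket 1002 on this data. The LEFT JOIN keeps tickets whose group never changes, whereas an inner join on consecutive ordinals with differing groups drops them from the result.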

Need to select table data into distinct columns based on a date in a row

I am not sure what I need here, looks sort of like I could use a pivot but I don't think it's that complicated and would like to avoid pivot if I can as I haven't used it much (er, at all).
I have data like this:
ID score notes CreateDate
1661 9.2 8.0 on Sept 2010 7/22/2010
1661 7.6 11/4/2010
1661 7.9 6/10/2011
1661 8.3 9/28/2011
1661 7.9 1/20/2012
I want to organize all that data on to one row with the oldest date being first and then use the next oldest date, then next oldest...until I use 4 or 5 dates. So the end result would look something like this:
ID score1 notes1 date1 score2 notes2 date2 score3 notes3 date3 score4 notes4 date4
1661 9.2 8.0 on Sept 2010 7/22/2010 7.6 blah 11/4/2010 7.9 blah2 6/10/2011 8.3 blah3 9/28/2011
PIVOT would be tricky in this situation, since you have more than one column per test (PIVOT works well if you only wanted to show Score1, Score2, Score3, etc). Fortunately, you can create a simple (if long-winded) solution with CASE statements:
select
ID,
max(case when RowNum = 1 then Score else null end) as Score1,
max(case when RowNum = 1 then Notes else null end) as Notes1,
max(case when RowNum = 1 then CreateDate else null end) as Date1,
max(case when RowNum = 2 then Score else null end) as Score2,
max(case when RowNum = 2 then Notes else null end) as Notes2,
max(case when RowNum = 2 then CreateDate else null end) as Date2,
max(case when RowNum = 3 then Score else null end) as Score3,
max(case when RowNum = 3 then Notes else null end) as Notes3,
max(case when RowNum = 3 then CreateDate else null end) as Date3,
max(case when RowNum = 4 then Score else null end) as Score4,
max(case when RowNum = 4 then Notes else null end) as Notes4,
max(case when RowNum = 4 then CreateDate else null end) as Date4,
max(case when RowNum = 5 then Score else null end) as Score5,
max(case when RowNum = 5 then Notes else null end) as Notes5,
max(case when RowNum = 5 then CreateDate else null end) as Date5
from
(
select
*, row_number() over (partition by ID order by CreateDate) as RowNum
from
mytable
) tt
group by
ID
This is hard-coded to cover 5 tests. It will be OK with fewer, but won't display a 6th. You can obviously create more CASE expressions to handle more tests.
Just because I love pivots, I will show you how this can be done using the PIVOT function. In order to get the result with the PIVOT function you will first want to UNPIVOT your multiple columns score, notes and createdate. The unpivot process will convert the multiple columns into multiple rows.
Since you are using SQL Server 2008 you can use CROSS APPLY to unpivot your data, the first part of the query will be similar to:
;with cte as
(
select id, score, notes, createdate,
row_number() over(partition by id order by createdate) seq
from yourtable
)
select id, col, value
from
(
select t.id,
col = col + cast(seq as varchar(10)),
value
from cte t
cross apply
(
values
('score', cast(score as varchar(10))),
('notes', notes),
('date', convert(varchar(10), createdate, 120))
) c (col, value)
) d;
See SQL Fiddle with Demo. Doing this gets your data in the format:
| ID | COL | VALUE |
| 1661 | score1 | 9.20 |
| 1661 | notes1 | 8.0 on Sept 2010 |
| 1661 | date1 | 2010-07-22 |
| 1661 | score2 | 7.60 |
| 1661 | notes2 | (null) |
| 1661 | date2 | 2010-11-04 |
| 1661 | score3 | 7.90 |
Now you can apply the PIVOT function:
;with cte as
(
select id, score, notes, createdate,
row_number() over(partition by id order by createdate) seq
from yourtable
)
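select id, score1, notes1, date1, score2, notes2, date2,
score3, notes3, date3, score4, notes4, date4,
score5, notes5, date5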
from
(
select t.id,
col = col + cast(seq as varchar(10)),
value
from cte t
cross apply
(
values
('score', cast(score as varchar(10))),
('notes', notes),
('date', convert(varchar(10), createdate, 120))
) c (col, value)
) d
pivot
(
max(value)
for col in (score1, notes1, date1, score2, notes2, date2,
score3, notes3, date3, score4, notes4, date4,
score5, notes5, date5)
) piv;
See SQL Fiddle with Demo.
Then if you were going to have an unknown number of values for each id, you could implement dynamic SQL to get the result:
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(col + cast(seq as varchar(10)))
from
(
select row_number() over(partition by id order by createdate) seq
from yourtable
) d
cross apply
(
select 'score', 1 union all
select 'notes', 2 union all
select 'date', 3
) c (col, so)
group by seq, col, so
order by seq, so
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT id, ' + @cols + '
from
(
select t.id,
col = col + cast(seq as varchar(10)),
value
from
(
select id, score, notes, createdate,
row_number() over(partition by id order by createdate) seq
from yourtable
) t
cross apply
(
values
(''score'', cast(score as varchar(10))),
(''notes'', notes),
(''date'', convert(varchar(10), createdate, 120))
) c (col, value)
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute sp_executesql @query;
See SQL Fiddle with Demo. Both versions give the result:
| ID | SCORE1 | NOTES1 | DATE1 | SCORE2 | NOTES2 | DATE2 | SCORE3 | NOTES3 | DATE3 | SCORE4 | NOTES4 | DATE4 | SCORE5 | NOTES5 | DATE5 |
| 1661 | 9.20 | 8.0 on Sept 2010 | 2010-07-22 | 7.60 | (null) | 2010-11-04 | 7.90 | (null) | 2011-06-10 | 8.30 | (null) | 2011-09-28 | 7.90 | (null) | 2012-01-20 |

Display multiple rows and column values into a single row, multiple column values

I have to show multiple incomes, type of income and employer name values for a single individual in a single row. So, if 'A' has three different incomes from three different sources,
id | Name | Employer | IncomeType | Amount
123 | XYZ | ABC.Inc | EarningsformJob | $200.00
123 | XYZ | Self | Self Employment | $300.00
123 | XYZ. | ChildSupport| Support | $500.00
I need to show them as
id | Name | Employer1 | Incometype1| Amount1 | Employer2 | incometype2 | Amount2| Employer3 | Incometype3| Amount3.....
123 |XYZ | ABC.Inc |EarningsformJob | $200.00|Self | Self Employment | $300.00|ChildSupport| Support | $500.00.....
I need both the 'fixed number of columns' logic (where we know how many times the employer, incometype and amount columns are going to repeat) and the 'dynamic display of columns' logic (where these columns repeat an unknown number of times).
Thanks.
Since you are using SQL Server there are several ways that you can transpose the rows of data into columns.
Aggregate Function / CASE: You can use an aggregate function with a CASE expression along with row_number(). This version would require that you have a known number of values to become columns:
select id,
name,
max(case when rn = 1 then employer end) employer1,
max(case when rn = 1 then IncomeType end) IncomeType1,
max(case when rn = 1 then Amount end) Amount1,
max(case when rn = 2 then employer end) employer2,
max(case when rn = 2 then IncomeType end) IncomeType2,
max(case when rn = 2 then Amount end) Amount2,
max(case when rn = 3 then employer end) employer3,
max(case when rn = 3 then IncomeType end) IncomeType3,
max(case when rn = 3 then Amount end) Amount3
from
(
select id, name, employer, incometype, amount,
row_number() over(partition by id order by employer) rn
from yourtable
) src
group by id, name;
See SQL Fiddle with Demo.
PIVOT/UNPIVOT: You could use the UNPIVOT and PIVOT functions to get the result. The UNPIVOT converts your multiple columns of Employer, IncomeType and Amount into multiple rows before applying the pivot. You did not specify which version of SQL Server you are using; assuming you have a known number of values, you could use the following in SQL Server 2005+, which uses CROSS APPLY with UNION ALL to unpivot:
select id, name,
employer1, incometype1, amount1,
employer2, incometype2, amount2,
employer3, incometype3, amount3
from
(
select id, name, col+cast(rn as varchar(10)) col, value
from
(
select id, name, employer, incometype, amount,
row_number() over(partition by id order by employer) rn
from yourtable
) t
cross apply
(
select 'employer', employer union all
select 'incometype', incometype union all
select 'amount', cast(amount as varchar(50))
) c (col, value)
) src
pivot
(
max(value)
for col in (employer1, incometype1, amount1,
employer2, incometype2, amount2,
employer3, incometype3, amount3)
) piv;
See SQL Fiddle with Demo.
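The unpivot step can also be written with the UNPIVOT operator itself instead of CROSS APPLY. The following is a minimal sketch of just that step, assuming a table variable @yourtable with the question's sample rows and an amount column of type decimal(10,2); neither the variable nor the column types come from the original post. UNPIVOT requires all source columns to share the same type and length, hence the casts to varchar(50).

```sql
-- Assumption: @yourtable and its column types are illustrative only.
DECLARE @yourtable TABLE (id int, name varchar(20), employer varchar(50),
                          incometype varchar(50), amount decimal(10,2));
INSERT INTO @yourtable (id, name, employer, incometype, amount)
VALUES (123,'XYZ','ABC.Inc','EarningsformJob',200.00),
       (123,'XYZ','Self','Self Employment',300.00),
       (123,'XYZ','ChildSupport','Support',500.00);

SELECT id, name,
       col + CAST(rn AS varchar(10)) AS col,   -- e.g. employer1, amount2
       value
FROM (
    -- Cast every column to the same type so UNPIVOT accepts them together.
    SELECT id, name,
           CAST(employer   AS varchar(50)) AS employer,
           CAST(incometype AS varchar(50)) AS incometype,
           CAST(amount     AS varchar(50)) AS amount,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY employer) AS rn
    FROM @yourtable
) AS t
UNPIVOT (value FOR col IN (employer, incometype, amount)) AS u;
```

This yields nine rows for the sample data, one per source row and column, ready to feed into the same PIVOT as before. One difference from the CROSS APPLY form is that UNPIVOT silently drops rows whose value is NULL, so a missing incometype would simply not appear.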
Dynamic Version: Lastly, if you have an unknown number of values then you will need to use dynamic SQL to generate the result.
DECLARE @cols AS NVARCHAR(MAX),
@query AS NVARCHAR(MAX)
select @cols = STUFF((SELECT ',' + QUOTENAME(col+cast(rn as varchar(10)))
from
(
select row_number() over(partition by id order by employer) rn
from yourtable
) d
cross apply
(
select 'employer', 1 union all
select 'incometype', 2 union all
select 'amount', 3
) c (col, so)
group by col, rn, so
order by rn, so
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = 'SELECT id, name,' + @cols + '
from
(
select id, name, col+cast(rn as varchar(10)) col, value
from
(
select id, name, employer, incometype, amount,
row_number() over(partition by id order by employer) rn
from yourtable
) t
cross apply
(
select ''employer'', employer union all
select ''incometype'', incometype union all
select ''amount'', cast(amount as varchar(50))
) c (col, value)
) x
pivot
(
max(value)
for col in (' + @cols + ')
) p '
execute(@query);
See SQL Fiddle with Demo. All versions give a result:
| ID | NAME | EMPLOYER1 | INCOMETYPE1 | AMOUNT1 | EMPLOYER2 | INCOMETYPE2 | AMOUNT2 | EMPLOYER3 | INCOMETYPE3 | AMOUNT3 |
-------------------------------------------------------------------------------------------------------------------------------------
| 123 | XYZ | ABC.Inc | EarningsformJob | 200 | ChildSupport | Support | 500 | Self | Self Employment | 300 |