I'm fairly new to SQL and not sure how to pivot a table so that a categorical column is turned into binary (0/1) columns.
Here is my current table:
+---------+------------------+--------------------+----------------+
| User ID | Cell Phone Brand | Purchased Platform | Recorded Usage |
+---------+------------------+--------------------+----------------+
| 1001 | Apple | Retail | 4 |
| 1001 | Samsung | Online | 4 |
| 1002 | Samsung | Retail | 5 |
| 1003 | Google | Online | 3 |
| 1003 | LG | Online | 3 |
| 1004 | LG | Online | 6 |
| 1005 | Apple | Online | 3 |
| 1006 | Google | Retail | 5 |
| 1007 | Google | Online | 3 |
| 1008 | Samsung | Retail | 4 |
| 1009 | LG | Retail | 4 |
| 1009 | Apple | Retail | 3 |
| 1010 | Apple | Retail | 6 |
+---------+------------------+--------------------+----------------+
I'd like to have the following result with aggregated Recorded Usage and binary data for devices:
+---------+--------------------+----------------+-------+---------+--------+----+
| User ID | Purchased Platform | Recorded Usage | Apple | Samsung | Google | LG |
+---------+--------------------+----------------+-------+---------+--------+----+
| 1001 | Retail | 4 | 1 | 0 | 0 | 0 |
| 1001 | Online | 4 | 0 | 1 | 0 | 0 |
| 1002 | Retail | 5 | 0 | 1 | 0 | 0 |
| 1003 | Online | 3 | 0 | 0 | 1 | 0 |
| 1003 | Online | 3 | 0 | 0 | 0 | 1 |
| 1004 | Online | 6 | 0 | 0 | 0 | 1 |
| 1005 | Online | 3 | 1 | 0 | 0 | 0 |
| 1006 | Retail | 5 | 0 | 0 | 1 | 0 |
| 1007 | Online | 3 | 0 | 0 | 1 | 0 |
| 1008 | Retail | 4 | 0 | 1 | 0 | 0 |
| 1009 | Retail | 4 | 0 | 0 | 0 | 1 |
| 1009 | Retail | 3 | 1 | 0 | 0 | 0 |
| 1010 | Retail | 6 | 1 | 0 | 0 | 0 |
+---------+--------------------+----------------+-------+---------+--------+----+
You can use CASE WHEN expressions:
declare @tmp table (UserID int, CellPhoneBrand varchar(10), PurchasedPlatform varchar(10), RecordedUsage int)
insert into @tmp
values
(1001,'Apple' ,'Retail', 4)
,(1001,'Samsung','Online', 4)
,(1002,'Samsung','Retail', 5)
,(1003,'Google' ,'Online', 3)
,(1003,'LG' ,'Online', 3)
,(1004,'LG' ,'Online', 6)
,(1005,'Apple' ,'Online', 3)
,(1006,'Google' ,'Retail', 5)
,(1007,'Google' ,'Online', 3)
,(1008,'Samsung','Retail', 4)
,(1009,'LG' ,'Retail', 4)
,(1009,'Apple' ,'Retail', 3)
,(1010,'Apple' ,'Retail', 6)
select UserID, PurchasedPlatform, RecordedUsage
,case when CellPhoneBrand ='Apple' then 1 else 0 end as Apple
,case when CellPhoneBrand ='Samsung' then 1 else 0 end as Samsung
,case when CellPhoneBrand ='Google' then 1 else 0 end as Google
,case when CellPhoneBrand ='LG' then 1 else 0 end as LG
from @tmp
This gets you the result shown in your expected output. As I mentioned in my comment, though, I would more likely expect an aggregated pivot here:
WITH VTE AS(
SELECT *
FROM (VALUES(1001,'Apple ','Retail',4),
(1001,'Samsung','Online',4),
(1002,'Samsung','Retail',5),
(1003,'Google ','Online',3),
(1003,'LG ','Online',3),
(1004,'LG ','Online',6),
(1005,'Apple ','Online',3),
(1006,'Google ','Retail',5),
(1007,'Google ','Online',3),
(1008,'Samsung','Retail',4),
(1009,'LG ','Retail',4),
(1009,'Apple ','Retail',3),
(1010,'Apple ','Retail',6)) V(ID, Brand, Platform, Usage))
SELECT ID,
Platform,
Usage,
CASE WHEN Brand = 'Apple' THEN 1 ELSE 0 END AS Apple,
CASE WHEN Brand = 'Samsung' THEN 1 ELSE 0 END AS Samsung,
CASE WHEN Brand = 'Google' THEN 1 ELSE 0 END AS Google,
CASE WHEN Brand = 'LG' THEN 1 ELSE 0 END AS LG
FROM VTE;
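For comparison, an aggregated pivot (one row per user) could look roughly like the sketch below. It is only an illustration under assumptions: it runs against an in-memory SQLite database from Python rather than SQL Server, the table name phones and its column names are made up, and only a subset of the sample rows is loaded. The MAX(CASE ...) / GROUP BY pattern itself is plain SQL and should carry over to T-SQL.

```python
import sqlite3

# Subset of the sample rows: (user_id, brand, platform, usage)
rows = [
    (1001, "Apple", "Retail", 4),
    (1001, "Samsung", "Online", 4),
    (1002, "Samsung", "Retail", 5),
    (1003, "Google", "Online", 3),
    (1003, "LG", "Online", 3),
    (1009, "LG", "Retail", 4),
    (1009, "Apple", "Retail", 3),
]


def pivot_per_user(rows):
    """Return one row per user: total usage plus a 0/1 flag per brand."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table and column names, chosen for this sketch only.
    con.execute(
        "CREATE TABLE phones (user_id INT, brand TEXT, platform TEXT, usage INT)"
    )
    con.executemany("INSERT INTO phones VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT user_id,
               SUM(usage) AS total_usage,
               MAX(CASE WHEN brand = 'Apple'   THEN 1 ELSE 0 END) AS apple,
               MAX(CASE WHEN brand = 'Samsung' THEN 1 ELSE 0 END) AS samsung,
               MAX(CASE WHEN brand = 'Google'  THEN 1 ELSE 0 END) AS google,
               MAX(CASE WHEN brand = 'LG'      THEN 1 ELSE 0 END) AS lg
        FROM phones
        GROUP BY user_id
        ORDER BY user_id
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


pivoted = pivot_per_user(rows)
for row in pivoted:
    print(row)
```

Using MAX keeps each flag at 0 or 1 even when a user has several rows for the same brand; SUM would count the rows instead.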
Since you used the word pivot in your description, here is a solution that shows how to pivot data in SQL Server using the PIVOT operator:
declare @temp TABLE
(
[UserID] varchar(50),
[CellPhoneBrand] varchar(50),
[PurchasedPlatform] varchar(50),
[RecordedUsage] int
);
INSERT INTO @temp
(
[UserID],
[CellPhoneBrand],
[PurchasedPlatform],
[RecordedUsage]
)
VALUES
(1001,'Apple', 'Retail', 4),
(1001,'Samsung', 'Online', 4),
(1002,'Samsung', 'Retail', 5),
(1003,'Google', 'Online', 3),
(1003,'LG', 'Online', 3),
(1004,'LG', 'Online', 6),
(1005,'Apple', 'Online', 3),
(1006,'Google', 'Retail', 5),
(1007,'Google', 'Online', 3),
(1008,'Samsung', 'Retail', 4),
(1009,'LG', 'Retail', 4),
(1009,'Apple', 'Retail', 3),
(1010,'Apple', 'Retail', 6)
select *
from
(
select [UserID], [PurchasedPlatform], [RecordedUsage],[CellPhoneBrand]
from @temp
) src
pivot
(
count(CellPhoneBrand)
for [CellPhoneBrand] in ([Apple], [Samsung],[Google],[LG])
) piv;
I am working on a SQL query in the Azure Databricks environment that has the following dataset:
CREATE OR REPLACE TABLE touchpoints_table
(
List STRING,
Path_Length INT
);
INSERT INTO touchpoints_table VALUES
('BBB, AAA, CCC', 3),
('CCC', 1),
('DDD, AAA', 2),
('DDD, BBB, AAA, EEE, CCC', 5),
('EEE, AAA, EEE, CCC', 4);
SELECT * FROM touchpoints_table
| | List | Path_length |
| 0 | BBB, AAA, CCC | 3 |
| 1 | CCC | 1 |
| 2 | DDD, AAA | 2 |
| 3 | DDD, BBB, AAA, EEE, CCC | 5 |
| 4 | EEE, AAA, EEE, CCC | 4 |
and the task consists of generating the following table:
| | Content | Unique | Started | Middleway | Finished |
| 0 | AAA | 0 | 0 | 3 | 1 |
| 1 | BBB | 0 | 1 | 1 | 0 |
| 2 | CCC | 1 | 0 | 0 | 3 |
| 3 | DDD | 0 | 2 | 0 | 0 |
| 4 | EEE | 0 | 1 | 2 | 0 |
where the columns contain the following:
Content: the elements found in the List
Unique: the number of times that the element appears alone in the list
Started: the number of times that the element appears at the beginning
Finished: the number of times that the element appears at the end
Middleway: the number of times the element appears between the beginning and the end.
Using the following query I almost get the result, but somehow the GROUP BY does not work correctly:
WITH tb1 AS(
SELECT
CAST(touch_array AS STRING) AS touch_list,
EXPLODE(touch_array) AS explode_list,
ROW_NUMBER()OVER(PARTITION BY CAST(touch_array AS STRING) ORDER BY (SELECT 1)) touch_count,
COUNT(*)OVER(PARTITION BY touch_array) touch_lenght
FROM (SELECT SPLIT(List, ',') AS touch_array FROM touchpoints_table)
)
SELECT
explode_list AS Content,
SUM(CASE WHEN touch_lenght=1 THEN 1 ELSE 0 END) AS Unique,
SUM(CASE WHEN touch_count=1 AND touch_lenght > 1 THEN 1 ELSE 0 END) AS Started,
SUM(CASE WHEN touch_count>1 AND touch_count < touch_lenght THEN 1 ELSE 0 END) AS Middleway,
SUM(CASE WHEN touch_count>1 AND touch_count = touch_lenght THEN 1 ELSE 0 END) AS Finished
FROM tb1
GROUP BY explode_list
ORDER BY explode_list
| | Content | Unique | Started | Middleway | Finished |
| 0 | AAA | 0 | 0 | 3 | 1 |
| 1 | BBB | 0 | 0 | 1 | 0 |
| 2 | CCC | 0 | 0 | 0 | 3 |
| 3 | EEE | 0 | 0 | 2 | 0 |
| 4 | BBB | 1 | 1 | 0 | 0 |
| 5 | DDD | 0 | 2 | 0 | 0 |
| 6 | EEE | 0 | 1 | 0 | 0 |
Could you help me by suggesting code that solves this task?
Here is a way to do this using SQL Server.
with main_data
as (
select list
,ltrim(x.value) as split_val
,x.ordinal
,case when x.ordinal=1 and tt.path_length=1 then
'unique'
when x.ordinal=1 then
'start'
when x.ordinal=tt.path_length then
'end'
else
'middle' -- any position strictly between the first and the last
end as pos
from touchpoints_table tt
-- the third argument (enable_ordinal) needs SQL Server 2022 or Azure SQL Database
CROSS APPLY STRING_SPLIT(list,',',1) x
)
select split_val
,count(case when pos='unique' then 1 end) as unique_cnt
,count(case when pos='start' then 1 end) as start_cnt
,count(case when pos='middle' then 1 end) as middle_cnt
,count(case when pos='end' then 1 end) as end_cnt
from main_data
group by split_val
+-----------+------------+-----------+------------+---------+
| split_val | unique_cnt | start_cnt | middle_cnt | end_cnt |
+-----------+------------+-----------+------------+---------+
| AAA | 0 | 0 | 3 | 1 |
| BBB | 0 | 1 | 1 | 0 |
| CCC | 1 | 0 | 0 | 3 |
| DDD | 0 | 2 | 0 | 0 |
| EEE | 0 | 1 | 2 | 0 |
+-----------+------------+-----------+------------+---------+
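The position rules from the question (alone, first, last, anything in between) can also be checked outside the database. The following is a minimal Python sketch, not Databricks or SQL Server code; the function name position_counts and the dictionary keys are made up for illustration, and the input uses the rows as displayed in the question's SELECT output (second row 'CCC'). Note the strip() call: splitting 'BBB, AAA' on ',' leaves a leading space on ' AAA', which is a likely reason the question's GROUP BY produced separate groups for what looks like the same element.

```python
from collections import defaultdict

# Rows as displayed in the question's SELECT output.
paths = [
    "BBB, AAA, CCC",
    "CCC",
    "DDD, AAA",
    "DDD, BBB, AAA, EEE, CCC",
    "EEE, AAA, EEE, CCC",
]


def position_counts(paths):
    """Count, per element, how often it is alone, first, in between, or last."""
    counts = defaultdict(
        lambda: {"unique": 0, "started": 0, "middleway": 0, "finished": 0}
    )
    for path in paths:
        # strip() removes the space left behind after each comma
        items = [part.strip() for part in path.split(",")]
        last = len(items) - 1
        for index, item in enumerate(items):
            if last == 0:
                kind = "unique"
            elif index == 0:
                kind = "started"
            elif index == last:
                kind = "finished"
            else:
                kind = "middleway"
            counts[item][kind] += 1
    return {name: counts[name] for name in sorted(counts)}


result = position_counts(paths)
for name, row in result.items():
    print(name, row)
```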
I have a simple Invoice table that has each item sold and the date it was sold.
Here is some sample data, obtained by taking the base table and counting how many times each item was sold per week.
+------+-----------------+------------+---------+
| Week | Item_Number | Color_Code | Touches |
+------+-----------------+------------+---------+
| 1 | 11073900LRGMO | 02000 | 7 |
| 1 | 11073900MEDMO | 02000 | 9 |
| 2 | 1114900011BMO | 38301 | 62 |
| 2 | 1114910012BMO | 21701 | 147 |
| 2 | 1114910012BMO | 38301 | 147 |
| 2 | 1114910012BMO | 46260 | 147 |
| 3 | 13MK430R03R | 00101 | 2 |
| 3 | 13MK430R03R | 10001 | 2 |
| 3 | 13MK430R03R | 65004 | 8 |
| 3 | 13MK430R03S | 00101 | 2 |
| 3 | 13MK430R03S | 10001 | 2 |
+------+-----------------+------------+---------+
Then I created a matrix out of this data using a dynamic query and the pivot operator. Here is how I did that,
First, I create a temporary table
DECLARE @cols AS NVARCHAR(MAX)
DECLARE @query AS NVARCHAR(MAX)
IF OBJECT_ID('tempdb..#VTable') IS NOT NULL
DROP TABLE #VTable
CREATE TABLE #VTable
(
[Item_Number] NVARCHAR(100),
[Color_Code] NVARCHAR(100),
[Item_Cost] NVARCHAR(100),
[Week] NVARCHAR(10),
[xCount] int
);
Then I insert my data into that table,
INSERT INTO #VTable
(
[Item_Number],
[Color_Code],
[Item_Cost],
[Week],
[xCount]
)
SELECT
*
FROM (
SELECT
Item_Number
,Color_Code
,Item_Cost
,Week
,Count(Item_Number) Touches
FROM (
SELECT
DATEPART (year, I.Date_Invoiced) Year
,DATEPART (month, I.Date_Invoiced) Month
,Concat(CASE WHEN DATEPART (week, I.Date_Invoiced) <10 THEN CONCAT('0',DATEPART (week, I.Date_Invoiced)) ELSE CAST(DATEPART (week, I.Date_Invoiced) AS NVARCHAR) END,'-',RIGHT(DATEPART (year, I.Date_Invoiced),2) ) WEEK
,DATEPART (day, I.Date_Invoiced) Day
,I.Invoice_Number
,I.Customer_Number
,I.Warehouse_Code
,S.Pack_Type
,S.Quantity_Per_Carton
,S.Inner_Pack_Quantity
,LTRIM(RTRIM(ID.Item_Number)) Item_Number
,LTRIM(RTRIM(ID.Color_Code)) Color_Code
,CASE
WHEN ISNULL(s.Actual_Cost, 0) = 0
THEN ISNULL(s.Standard_Cost, 0)
ELSE s.Actual_Cost
END Item_Cost
,ID.Quantity
,case when s.Pack_Type='carton' then id.Quantity/s.Quantity_Per_Carton when s.Pack_Type='Inner Poly' then id.Quantity/s.Inner_Pack_Quantity end qty
,ID.Line_Number
FROM Invoices I
LEFT JOIN Invoices_Detail ID on I.Company_Code = ID.Company_Code and I.Division_Code = ID.Division_Code and I.Invoice_Number = ID.Invoice_Number
LEFT JOIN Style S on I.Company_Code = S.Company_Code and I.Division_Code = S.Division_Code and ID.Item_Number = S.Item_Number and ID.Color_Code = S.Color_Code
WHERE 1=1
AND (I.Company_Code = @LocalCompanyCode OR @LocalCompanyCode IS NULL)
AND (I.Division_Code = @LocalDivisionCode OR @LocalDivisionCode IS NULL)
AND (I.Warehouse_Code = @LocalWarehouse OR @LocalWarehouse IS NULL)
AND (S.Pack_Type = @LocalPackType OR @LocalPackType IS NULL)
AND (I.Customer_Number = @LocalCustomerNumber OR @LocalCustomerNumber IS NULL)
AND (I.Date_Invoiced Between @LocalFromDate And @LocalToDate)
) T
GROUP BY Item_Number,Color_Code,Item_Cost,Week
) TT
Then I use a dynamic query to create the matrix:
select @cols = STUFF((SELECT ',' + QUOTENAME(Week)
from #VTable
group by Week
order by (Right(Week,2) + LEFT(Week,2))
FOR XML PATH(''), TYPE
).value('.', 'NVARCHAR(MAX)')
,1,1,'')
set @query = '
SELECT
*
FROM (
SELECT Item_Number,Color_Code, Item_Cost,' + @cols + ' from
(
select Item_Number, Color_Code, Item_Cost, week, xCount
from #Vtable
) x
pivot
(
sum(xCount)
for week in (' + @cols + ')
) p
)T
'
execute(@query);
This gives me what I am looking for, here is what the matrix looks like.
+---------------+------------+-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| Item_Number | Color_Code | Item_Cost | 36-18 | 37-18 | 38-18 | 39-18 | 40-18 | 41-18 | 42-18 | 43-18 | 44-18 | 45-18 | 46-18 | 47-18 | 48-18 | 49-18 | 50-18 | 51-18 | 52-18 | 53-18 | 01-19 | 02-19 | 03-19 | 04-19 | 05-19 | 06-19 | 07-19 | 08-19 | 09-19 | 10-19 | 11-19 | 12-19 | 13-19 | 14-19 | 15-19 | 16-19 | 17-19 | 18-19 | 19-19 | 20-19 | 21-19 | 22-19 | 23-19 | 24-19 | 25-19 | 26-19 | 27-19 | 28-19 | 29-19 | 30-19 | 31-19 | 32-19 | 33-19 | 34-19 | 35-19 |
+---------------+------------+-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| 11073900LRGMO | 02000 | 8.51 | 1 | NULL | 13 | NULL | 3 | NULL | NULL | 3 | 3 | NULL | 4 | 3 | 6 | NULL | 4 | NULL | NULL | NULL | 7 | 4 | NULL | 3 | 2 | 5 | 30 | 7 | 3 | 10 | NULL | 9 | 19 | 5 | NULL | 10 | 9 | 5 | 2 | 3 | 5 | 4 | 3 | 9 | 7 | NULL | 5 | 1 | 3 | 5 | NULL | NULL | 11 | 7 | 3 |
| 11073900MEDMO | 02000 | 8.49 | 11 | NULL | 22 | NULL | 5 | NULL | NULL | 14 | 4 | NULL | 4 | 3 | 8 | NULL | 9 | NULL | NULL | NULL | 9 | 3 | NULL | 7 | 6 | 4 | 37 | 10 | 8 | 9 | NULL | 7 | 30 | 14 | NULL | 12 | 5 | 7 | 8 | 7 | 2 | 4 | 6 | 15 | 4 | NULL | 2 | 7 | 3 | 7 | NULL | NULL | 11 | 9 | 3 |
| 11073900SMLMO | 02000 | 8.50 | 6 | NULL | 18 | NULL | 3 | NULL | NULL | 3 | 7 | NULL | 5 | NULL | 7 | NULL | 9 | NULL | NULL | NULL | 7 | 4 | NULL | 7 | 2 | 6 | 37 | 9 | 4 | 7 | NULL | 7 | 19 | 7 | NULL | 11 | 5 | 7 | 7 | 2 | 3 | 8 | 8 | 9 | 2 | NULL | 2 | 2 | 2 | 4 | NULL | NULL | 8 | 5 | 4 |
| 11073900XLGMO | 02000 | 8.51 | 2 | NULL | 6 | NULL | 3 | NULL | NULL | 2 | 4 | NULL | 3 | 1 | 3 | NULL | 4 | NULL | NULL | NULL | 4 | 4 | NULL | NULL | 3 | 1 | 27 | 4 | 3 | 4 | NULL | 8 | 11 | 9 | NULL | 7 | 2 | 4 | 1 | 5 | 1 | 6 | 5 | 6 | 1 | NULL | 1 | 3 | NULL | 3 | NULL | NULL | 3 | 4 | 2 |
+---------------+------------+-----------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+
The last thing I want to do is find a good way to sort this table. I think the best way would be to sort by which item numbers are picked the most across all weeks. A column-wise sum would give me the total touches per week for all items, but I want a row-wise sum: another column at the end that holds the total touches per item. Does anyone know how I would do this? I've tried adapting another dynamic query from this link (Calculate Row Wise Sum - SQL Server),
but I couldn't get it to work.
Here is a quick-and-dirty solution based on this answer to do the "coalesce sum" over your wk-yr columns. This does not modify your existing code, but for efficiency it may be better to do as @Sean Lange suggests.
Tested on SQL Server 2017 latest (linux docker image).
Input dataset:
(Only 3 wk-yr columns here for simplicity. The code should work on an arbitrary number of columns):
create table WeeklySum (
Item_Number varchar(50),
Color_Code varchar(10),
Item_Cost float,
[36-18] float,
[37-18] float,
[38-18] float
)
insert into WeeklySum (Item_Number, Color_Code, Item_Cost, [36-18], [37-18], [38-18])
values ('11073900LRGMO', '02000', 8.51, 1, NULL, 13),
('11073900MEDMO', '02000', 8.49, 11, NULL, 22),
('11073900SMLMO', '02000', 8.50, 6, NULL, 18),
('11073900XLGMO', '02000', 8.51, 2, NULL, 6);
select * from WeeklySum;
Sample Code:
/* 1. Expression of the sum of coalesce(wk-yr, 0) */
declare @s varchar(max);
-- In short, this query selects the wanted columns by exclusion from sys.columns
-- and then builds the "coalesce sum" over the selected columns in a row.
-- The "@s = coalesce()" expression avoids a redundant '+' at the beginning.
-- NOTE: May have to change sys.columns -> syscolumns for SQL Server 2000
-- or earlier versions
select @s = coalesce(@s + ' + coalesce([' + C.name + '], 0)', 'coalesce([' + C.name + '], 0)')
from sys.columns as C
where C.object_id = (select top 1 object_id from sys.objects
where name = 'WeeklySum')
and C.name not in ('Item_Number', 'Color_Code', 'Item_Cost');
print @s;
/* 2. Perform the sorting query */
declare @sql varchar(max);
set @sql = 'select *, ' + @s + ' as totalCount ' +
'from WeeklySum ' +
'order by totalCount desc';
print @sql;
execute(@sql);
Output:
| Item_Number | Color_Code | Item_Cost | 36-18 | 37-18 | 38-18 | totalCount |
|---------------|------------|-----------|-------|-------|-------|------------|
| 11073900MEDMO | 02000 | 8.49 | 11 | NULL | 22 | 33 |
| 11073900SMLMO | 02000 | 8.5 | 6 | NULL | 18 | 24 |
| 11073900LRGMO | 02000 | 8.51 | 1 | NULL | 13 | 14 |
| 11073900XLGMO | 02000 | 8.51 | 2 | NULL | 6 | 8 |
Also check the generated expressions in the Messages window:
@s:
coalesce([36-18], 0) + coalesce([37-18], 0) + coalesce([38-18], 0)
@sql:
select *, coalesce([36-18], 0) + coalesce([37-18], 0) + coalesce([38-18], 0) as totalCount from WeeklySum order by totalCount desc
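The same idea (read the column names from the catalog, build one COALESCE term per week column, then sort by the sum) can be tried without a SQL Server instance. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module, SQLite's PRAGMA table_info in place of sys.columns, and double-quoted identifiers in place of square brackets. The table and column names are taken from the sample above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    'CREATE TABLE WeeklySum (Item_Number TEXT, Color_Code TEXT, Item_Cost REAL, '
    '"36-18" REAL, "37-18" REAL, "38-18" REAL)'
)
con.executemany(
    "INSERT INTO WeeklySum VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("11073900LRGMO", "02000", 8.51, 1, None, 13),
        ("11073900MEDMO", "02000", 8.49, 11, None, 22),
        ("11073900SMLMO", "02000", 8.50, 6, None, 18),
        ("11073900XLGMO", "02000", 8.51, 2, None, 6),
    ],
)

# Columns that are not week buckets are excluded by name, as in the T-SQL version.
fixed_columns = {"Item_Number", "Color_Code", "Item_Cost"}

# PRAGMA table_info returns one row per column; index 1 holds the column name.
week_columns = [
    info[1]
    for info in con.execute("PRAGMA table_info(WeeklySum)")
    if info[1] not in fixed_columns
]

# One COALESCE term per week column, with embedded double quotes escaped.
total_expr = " + ".join(
    'COALESCE("{}", 0)'.format(name.replace('"', '""')) for name in week_columns
)
sql = (
    "SELECT Item_Number, {} AS totalCount "
    "FROM WeeklySum ORDER BY totalCount DESC".format(total_expr)
)

totals = con.execute(sql).fetchall()
for row in totals:
    print(row)
```

Because the week columns are discovered at run time, adding a new week column needs no change to the query-building code.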
I have a table in a SQL Server database containing:
int value (column's name : Value)
datetime value (column's name : Date)
bit value (column's name : LastLineOfPage)
I would like to write a pagination query over this table. The pagination logic is the following:
The query must return the lines of a given page (parameter @PageNumber), after sorting the lines by the Date column
The query must also give the SUM of the lines of all previous pages
The number of lines per page is not fixed: by default it is 14 lines per page, but if the LastLineOfPage bit is true, the page ends at the line with the true value
Here is a synthetic view of the process :
Here is the data in text :
ID DATE VALUE LASTLINEOFPAGE
1 07/10/2006 10 0
2 14/10/2006 12 0
3 21/10/2006 4 1
4 28/10/2006 6 0
5 04/11/2006 8 1
6 25/11/2006 125 0
7 02/12/2006 1 0
8 09/12/2006 5 0
9 16/12/2006 45 0
10 30/12/2006 1 1
So the query receives @PageNumber and also @DefaultLineNumberPerPage (which will be equal to 14, but maybe one day that will change).
Could you help me design this query or SQL function?
Thanks !
Sample data
I added a few rows to illustrate how it works when there are more rows per page than @DefaultLineNumberPerPage. In this example I'll use @DefaultLineNumberPerPage=5 and you'll see how extra pages are generated.
DECLARE @T TABLE (ID int, dt date, VALUE int, LASTLINEOFPAGE bit);
INSERT INTO @T(ID, dt, VALUE, LASTLINEOFPAGE) VALUES
(1 , '2006-10-07', 10 , 0),
(2 , '2006-10-14', 12 , 0),
(3 , '2006-10-21', 4 , 1),
(4 , '2006-10-28', 6 , 0),
(5 , '2006-11-04', 8 , 1),
(6 , '2006-11-25', 125, 0),
(7 , '2006-12-02', 1 , 0),
(8 , '2006-12-09', 5 , 0),
(9 , '2006-12-16', 45 , 0),
(10, '2006-12-30', 1 , 1),
(16, '2007-01-25', 125, 0),
(17, '2007-02-02', 1 , 0),
(18, '2007-02-09', 5 , 0),
(19, '2007-02-16', 45 , 0),
(20, '2007-02-20', 1 , 0),
(26, '2007-02-25', 125, 0),
(27, '2007-03-02', 1 , 0),
(28, '2007-03-09', 5 , 0),
(29, '2007-03-10', 5 , 0),
(30, '2007-03-11', 5 , 0),
(31, '2007-03-12', 5 , 0),
(32, '2007-03-13', 5 , 1),
(41, '2007-10-07', 10 , 0),
(42, '2007-10-14', 12 , 0),
(43, '2007-10-21', 4 , 1);
Query
Run it step-by-step, CTE-by-CTE and examine intermediate results to understand what it does.
CTE_FirstLines sets the FirstLineOfPage flag to 1 for the first line of the page instead of the last.
CTE_SimplePages uses a cumulative SUM to calculate the simple page numbers based on FirstLineOfPage page breaks.
CTE_ExtraPages uses ROW_NUMBER divided by @DefaultLineNumberPerPage to calculate extra page numbers if there is a page that has more than @DefaultLineNumberPerPage rows.
CTE_CompositePages combines simple page numbers with extra page numbers to make a single composite page "Number". It assumes that there will be fewer than 1000 rows between original LASTLINEOFPAGE flags. If such a long sequence of rows is possible, increase the 1000 constant and consider using the bigint type for the CompositePageNumber column.
CTE_FinalPages uses DENSE_RANK to assign sequential numbers without gaps for each final page.
DECLARE @DefaultLineNumberPerPage int = 5;
DECLARE @PageNumber int = 3;
WITH
CTE_FirstLines
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE
,CAST(ISNULL(LAG(LASTLINEOFPAGE)
OVER (ORDER BY dt), 1) AS int) AS FirstLineOfPage
FROM @T
)
,CTE_SimplePages
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage
,SUM(FirstLineOfPage) OVER(ORDER BY dt
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SimplePageNumber
FROM CTE_FirstLines
)
,CTE_ExtraPages
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber
,(ROW_NUMBER() OVER(PARTITION BY SimplePageNumber ORDER BY dt) - 1)
/ @DefaultLineNumberPerPage AS ExtraPageNumber
FROM CTE_SimplePages
)
,CTE_CompositePages
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
,SimplePageNumber * 1000 + ExtraPageNumber AS CompositePageNumber
FROM CTE_ExtraPages
)
,CTE_FinalPages
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
,CompositePageNumber
,DENSE_RANK() OVER(ORDER BY CompositePageNumber) AS FinalPageNumber
FROM CTE_CompositePages
)
,CTE_Sum
AS
(
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
,CompositePageNumber
,FinalPageNumber
,SUM(Value) OVER(ORDER BY FinalPageNumber, dt
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS SumCumulative
FROM CTE_FinalPages
)
SELECT
ID,dt, VALUE, LASTLINEOFPAGE, FirstLineOfPage, SimplePageNumber, ExtraPageNumber
,CompositePageNumber
,FinalPageNumber
,SumCumulative
FROM CTE_Sum
-- WHERE FinalPageNumber = @PageNumber
ORDER BY dt
;
Result with the final WHERE filter commented out
Here is the full result with all intermediate columns to illustrate how the query works.
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| ID | dt | VALUE | Lst | Fst | Simple | Extra | Composite | Final | TotalValue |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| 1 | 2006-10-07 | 10 | 0 | 1 | 1 | 0 | 1000 | 1 | 10 |
| 2 | 2006-10-14 | 12 | 0 | 0 | 1 | 0 | 1000 | 1 | 22 |
| 3 | 2006-10-21 | 4 | 1 | 0 | 1 | 0 | 1000 | 1 | 26 |
| 4 | 2006-10-28 | 6 | 0 | 1 | 2 | 0 | 2000 | 2 | 32 |
| 5 | 2006-11-04 | 8 | 1 | 0 | 2 | 0 | 2000 | 2 | 40 |
| 6 | 2006-11-25 | 125 | 0 | 1 | 3 | 0 | 3000 | 3 | 165 |
| 7 | 2006-12-02 | 1 | 0 | 0 | 3 | 0 | 3000 | 3 | 166 |
| 8 | 2006-12-09 | 5 | 0 | 0 | 3 | 0 | 3000 | 3 | 171 |
| 9 | 2006-12-16 | 45 | 0 | 0 | 3 | 0 | 3000 | 3 | 216 |
| 10 | 2006-12-30 | 1 | 1 | 0 | 3 | 0 | 3000 | 3 | 217 |
| 16 | 2007-01-25 | 125 | 0 | 1 | 4 | 0 | 4000 | 4 | 342 |
| 17 | 2007-02-02 | 1 | 0 | 0 | 4 | 0 | 4000 | 4 | 343 |
| 18 | 2007-02-09 | 5 | 0 | 0 | 4 | 0 | 4000 | 4 | 348 |
| 19 | 2007-02-16 | 45 | 0 | 0 | 4 | 0 | 4000 | 4 | 393 |
| 20 | 2007-02-20 | 1 | 0 | 0 | 4 | 0 | 4000 | 4 | 394 |
| 26 | 2007-02-25 | 125 | 0 | 0 | 4 | 1 | 4001 | 5 | 519 |
| 27 | 2007-03-02 | 1 | 0 | 0 | 4 | 1 | 4001 | 5 | 520 |
| 28 | 2007-03-09 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 525 |
| 29 | 2007-03-10 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 530 |
| 30 | 2007-03-11 | 5 | 0 | 0 | 4 | 1 | 4001 | 5 | 535 |
| 31 | 2007-03-12 | 5 | 0 | 0 | 4 | 2 | 4002 | 6 | 540 |
| 32 | 2007-03-13 | 5 | 1 | 0 | 4 | 2 | 4002 | 6 | 545 |
| 41 | 2007-10-07 | 10 | 0 | 1 | 5 | 0 | 5000 | 7 | 555 |
| 42 | 2007-10-14 | 12 | 0 | 0 | 5 | 0 | 5000 | 7 | 567 |
| 43 | 2007-10-21 | 4 | 1 | 0 | 5 | 0 | 5000 | 7 | 571 |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
To get only one given page uncomment the WHERE filter in the final SELECT.
Result with the final WHERE filter
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| ID | dt | VALUE | Lst | Fst | Simple | Extra | Composite | Final | TotalValue |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
| 6 | 2006-11-25 | 125 | 0 | 1 | 3 | 0 | 3000 | 3 | 165 |
| 7 | 2006-12-02 | 1 | 0 | 0 | 3 | 0 | 3000 | 3 | 166 |
| 8 | 2006-12-09 | 5 | 0 | 0 | 3 | 0 | 3000 | 3 | 171 |
| 9 | 2006-12-16 | 45 | 0 | 0 | 3 | 0 | 3000 | 3 | 216 |
| 10 | 2006-12-30 | 1 | 1 | 0 | 3 | 0 | 3000 | 3 | 217 |
+----+------------+-------+-----+-----+--------+-------+-----------+-------+------------+
The TotalValue in the last row (the SumCumulative column of the query) gives you the total that you want to show at the bottom of the page. If you sum all values on this page (125+1+5+45+1 = 177) and subtract that from the last TotalValue (217-177 = 40), you get the total of the previous pages that you want to show at the top of the page. It is better to do these calculations on the client.
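If the calculation is done on the client, the paging rule itself is short to express. Here is a rough Python sketch, assuming the rows arrive already sorted by date as (id, value, last_line_of_page) tuples; the function names paginate and page_with_totals are made up for illustration. A page is closed either by a row whose flag is set or when it reaches the default number of lines, which should give the same pages as the composite page number in the query above.

```python
# The ten rows from the question, sorted by date: (id, value, last_line_of_page)
rows = [
    (1, 10, 0), (2, 12, 0), (3, 4, 1),
    (4, 6, 0), (5, 8, 1),
    (6, 125, 0), (7, 1, 0), (8, 5, 0), (9, 45, 0), (10, 1, 1),
]


def paginate(rows, default_lines_per_page):
    """Split rows into pages; a page ends at a flagged row or at the size limit."""
    pages = [[]]
    for row in rows:
        pages[-1].append(row)
        if row[2] or len(pages[-1]) == default_lines_per_page:
            pages.append([])
    if not pages[-1]:
        pages.pop()  # drop the empty page opened after the final row
    return pages


def page_with_totals(rows, page_number, default_lines_per_page):
    """Return (page rows, sum of all previous pages, running total at page end)."""
    pages = paginate(rows, default_lines_per_page)
    page = pages[page_number - 1]
    previous_total = sum(r[1] for p in pages[: page_number - 1] for r in p)
    return page, previous_total, previous_total + sum(r[1] for r in page)


page, previous_total, running_total = page_with_totals(rows, 3, 14)
print([r[0] for r in page], previous_total, running_total)
```

With the question's data and 14 lines per page, page 3 holds IDs 6 to 10, with 40 carried over from the previous pages and 217 at the bottom, matching the filtered result above.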
I have a partial solution. It still doesn't take the default page size into account, but it can give you an idea, so let me know what you think. I hope you are familiar with CTEs. Test each step to see the partial results.
WITH cte as (
SELECT [ID], [DATE], [VALUE], [LASTLINEOFPAGE],
SUM([VALUE]) OVER (ORDER BY [ID]) as Total,
SUM([LASTLINEOFPAGE]) OVER (ORDER BY [ID]) as page_group
FROM Table1
),
pages as (
SELECT c1.[ID], c1.[Total],
CASE WHEN c1.[ID] = 1 THEN 0
WHEN c1.[ID] = m.[minID] THEN c1.[page_group] -1
ELSE c1.[page_group]
END as [page_group]
FROM cte as c1
JOIN (SELECT [page_group], MIN([ID]) as minID
FROM cte
GROUP BY [page_group]) m
ON c1.[page_group] = m.[page_group]
)
SELECT c.[ID], c.[DATE], c.[VALUE], c.[LASTLINEOFPAGE],
(SELECT MAX([Total])
FROM pages p2
WHERE p2.[page_group] = p.[page_group]) as [Total],
p.[page_group]
FROM cte c
JOIN pages p
ON c.[ID] = p.[id]
As you can see, the total and the page are in additional columns, and you shouldn't display those in your app.
Table schema:
CREATE TABLE TRANSACTIONDETAILS
(
TransNo CHAR(15),
Serial INT,
Project CHAR(3)
)
Dataset:
+-----------------+--------+---------+
| TransNo | Serial | Project |
+-----------------+--------+---------+
| A00000000000001 | 1 | 100 |
| A00000000000001 | 2 | 101 |
| A00000000000002 | 1 | 100 |
| A00000000000002 | 2 | 101 |
| A00000000000003 | 1 | 100 |
| A00000000000003 | 2 | 200 |
| A00000000000004 | 1 | 200 |
| A00000000000004 | 2 | 100 |
| A00000000000005 | 1 | 101 |
| A00000000000005 | 2 | 100 |
+-----------------+--------+---------+
I want to identify transactions that have the same set of projects.
Expected output:
+-----------------+--------+---------+---------+
| TransNo | Serial | Project | Flag |
+-----------------+--------+---------+---------+
| A00000000000001 | 1 | 100 | 1 |
| A00000000000001 | 2 | 101 | 1 |
| A00000000000002 | 1 | 100 | 1 |
| A00000000000002 | 2 | 101 | 1 |
| A00000000000005 | 1 | 101 | 1 |
| A00000000000005 | 2 | 100 | 1 |
| A00000000000003 | 1 | 100 | 2 |
| A00000000000003 | 2 | 200 | 2 |
| A00000000000004 | 1 | 200 | 2 |
| A00000000000004 | 2 | 100 | 2 |
+-----------------+--------+---------+---------+
I am using SQL Server 2012 and later.
Thanks.
UPDATE 1: My objective would be partially achieved if I could produce the following from the input dataset.
+-----------------+---------+---------+
| TransNo | Project1| Project2|
+-----------------+---------+---------+
| A00000000000001 | 100 | 101 |
| A00000000000002 | 100 | 101 |
| A00000000000003 | 100 | 200 |
| A00000000000004 | 200 | 100 |
| A00000000000005 | 101 | 100 |
+-----------------+---------+---------+
UPDATE 2:
Data set
+-----------------+--------+---------+
| TransNo | Serial | Project |
+-----------------+--------+---------+
| A00000000000001 | 1 | 100 |
| A00000000000001 | 2 | 101 |
| A00000000000001 | 3 | 200 |
| A00000000000002 | 1 | 100 |
| A00000000000002 | 2 | 101 |
| A00000000000003 | 1 | 100 |
| A00000000000003 | 2 | 200 |
| A00000000000004 | 1 | 200 |
| A00000000000004 | 2 | 100 |
| A00000000000005 | 1 | 101 |
| A00000000000005 | 2 | 100 |
+-----------------+--------+---------+
Output:
+-----------------+--------+---------+---------+
| TransNo | Serial | Project | Flag |
+-----------------+--------+---------+---------+
| A00000000000001 | 1 | 100 | 1 |
| A00000000000001 | 2 | 101 | 1 |
| A00000000000001 | 3 | 200 | 1 |
| A00000000000002 | 1 | 100 | 2 |
| A00000000000002 | 2 | 101 | 2 |
| A00000000000005 | 1 | 101 | 2 |
| A00000000000005 | 2 | 100 | 2 |
| A00000000000003 | 1 | 100 | 3 |
| A00000000000003 | 2 | 200 | 3 |
| A00000000000004 | 1 | 200 | 3 |
| A00000000000004 | 2 | 100 | 3 |
+-----------------+--------+---------+---------+
Try this
;WITH cte
AS (SELECT *,
Concat(Min(Project)OVER(partition BY TransNo), Max(Project)OVER(partition BY TransNo)) AS inter
FROM TRANSACTIONDETAILS)
SELECT TransNo,
Serial,
Project,
Dense_rank()OVER(ORDER BY inter) AS flag
FROM cte
Update : For partial result
SELECT TransNo,
Max(CASE WHEN Serial = 1 THEN Project END) AS Project_1,
Max(CASE WHEN Serial = 2 THEN Project END) AS Project_2
FROM TRANSACTIONDETAILS
GROUP BY TransNo
CREATE TABLE #test_trans
([TransNo] varchar(15), [Serial] int, [Project] int)
;
INSERT INTO #test_trans
([TransNo], [Serial], [Project])
VALUES
('A00000000000001', 1, 100),
('A00000000000001', 2, 101),
('A00000000000001', 3, 200),
('A00000000000002', 1, 100),
('A00000000000002', 2, 101),
('A00000000000003', 1, 100),
('A00000000000003', 2, 200),
('A00000000000004', 1, 200),
('A00000000000004', 2, 100),
('A00000000000005', 1, 101),
('A00000000000005', 2, 100)
;
;WITH cte
AS (select [TransNo],(
Select ',' + cast(ST1.[Project] as varchar(max)) AS [text()]
From #test_trans ST1
where st1.TransNo=st2.TransNo
Order by ST1.[Project] -- fixed order so the same set always gives the same string
For XML PATH ('')) as rn,Project,st2.Serial from #test_trans st2)
SELECT TransNo,
Serial,
Project,
Dense_rank()OVER(ORDER BY rn) AS flag
FROM cte
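To sanity-check the grouping independently of the SQL, the rule "same flag for the same set of projects" can be sketched in Python. This is an illustration only: the function name flag_by_project_set is hypothetical, and flags are numbered in order of first appearance by TransNo, which happens to reproduce the numbering in the UPDATE 2 expected output (a DENSE_RANK over a concatenated string may number the groups differently while grouping them identically).

```python
from collections import defaultdict

# UPDATE 2 dataset: (TransNo, Serial, Project)
details = [
    ("A00000000000001", 1, 100),
    ("A00000000000001", 2, 101),
    ("A00000000000001", 3, 200),
    ("A00000000000002", 1, 100),
    ("A00000000000002", 2, 101),
    ("A00000000000003", 1, 100),
    ("A00000000000003", 2, 200),
    ("A00000000000004", 1, 200),
    ("A00000000000004", 2, 100),
    ("A00000000000005", 1, 101),
    ("A00000000000005", 2, 100),
]


def flag_by_project_set(details):
    """Map each TransNo to a flag shared by all transactions with the same projects."""
    projects = defaultdict(set)
    for trans_no, _serial, project in details:
        projects[trans_no].add(project)

    flags = {}
    seen = {}  # project set -> flag, numbered in order of first appearance
    for trans_no in sorted(projects):
        key = frozenset(projects[trans_no])  # order of projects does not matter
        if key not in seen:
            seen[key] = len(seen) + 1
        flags[trans_no] = seen[key]
    return flags


flags = flag_by_project_set(details)
for trans_no, flag in flags.items():
    print(trans_no, flag)
```

Comparing whole sets, rather than only the minimum and maximum project, keeps a three-project transaction such as A00000000000001 apart from a two-project one that shares the same smallest and largest values.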
I need help with a SQL query that will convert this table:
===================
| Id | FK | Status|
===================
| 1 | A | 100 |
| 2 | A | 101 |
| 3 | B | 100 |
| 4 | B | 101 |
| 5 | C | 100 |
| 6 | C | 101 |
| 7 | A | 102 |
| 8 | A | 102 |
| 9 | B | 102 |
| 10 | B | 102 |
===================
to this:
==========================================
| FK | Count 100 | Count 101 | Count 102 |
==========================================
| A | 1 | 1 | 2 |
| B | 1 | 1 | 2 |
| C | 1 | 1 | 0 |
==========================================
I can do simple counts, etc., but am struggling to pivot the table with the derived information. Any help is appreciated.
Use:
SELECT t.fk,
SUM(CASE WHEN t.status = 100 THEN 1 ELSE 0 END) AS count_100,
SUM(CASE WHEN t.status = 101 THEN 1 ELSE 0 END) AS count_101,
SUM(CASE WHEN t.status = 102 THEN 1 ELSE 0 END) AS count_102
FROM TABLE t
GROUP BY t.fk
use:
select * from
(select fk,fk as fk1,statusFK from #t
) as t
pivot
(COUNT(fk1) for statusFK IN ([100],[101],[102])
) AS pt
Just adding a shortcut to @OMG's answer.
In MySQL, where a comparison evaluates to 1 or 0, you can eliminate the CASE expression (this does not work in SQL Server):
SELECT t.fk,
SUM(t.status = 100) AS count_100,
SUM(t.status = 101) AS count_101,
SUM(t.status = 102) AS count_102
FROM TABLE t
GROUP BY t.fk
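The CASE form and the comparison shortcut can be compared side by side in SQLite, which, like MySQL, evaluates a comparison to 1 or 0. The sketch below uses Python's sqlite3 module with an in-memory table named t; the table name and the lower-case column names are assumptions for the example, and the data is the sample from the question.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INT, fk TEXT, status INT)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "A", 100), (2, "A", 101), (3, "B", 100), (4, "B", 101),
        (5, "C", 100), (6, "C", 101), (7, "A", 102), (8, "A", 102),
        (9, "B", 102), (10, "B", 102),
    ],
)

# Portable form: conditional aggregation with CASE.
case_sql = """
    SELECT fk,
           SUM(CASE WHEN status = 100 THEN 1 ELSE 0 END) AS count_100,
           SUM(CASE WHEN status = 101 THEN 1 ELSE 0 END) AS count_101,
           SUM(CASE WHEN status = 102 THEN 1 ELSE 0 END) AS count_102
    FROM t GROUP BY fk ORDER BY fk
"""

# Shortcut form: relies on the comparison itself yielding 1 or 0.
bool_sql = """
    SELECT fk,
           SUM(status = 100) AS count_100,
           SUM(status = 101) AS count_101,
           SUM(status = 102) AS count_102
    FROM t GROUP BY fk ORDER BY fk
"""

case_rows = con.execute(case_sql).fetchall()
bool_rows = con.execute(bool_sql).fetchall()
print(case_rows)
print(bool_rows)
```

Both queries return the same three rows here; the CASE version is the one to prefer when the query has to run on engines without an implicit boolean-to-integer conversion.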