GroupBy with respect to record intervals on another table - sql

I prepared a SQL Fiddle for my question; it contains working code. I am asking whether there is an alternative solution that I have not thought of.
CREATE TABLE [Product]
([Timestamp] bigint NOT NULL PRIMARY KEY,
[Value] float NOT NULL
)
;
CREATE TABLE [PriceTable]
([Timestamp] bigint NOT NULL PRIMARY KEY,
[Price] float NOT NULL
)
;
INSERT INTO [Product]
([Timestamp], [Value])
VALUES
(1, 5),
(2, 3),
(4, 9),
(5, 2),
(7, 11),
(9, 3)
;
INSERT INTO [PriceTable]
([Timestamp], [Price])
VALUES
(1, 1),
(3, 4),
(7, 2.5),
(10, 3)
;
Query:
SELECT [Totals].*, [PriceTable].[Price]
FROM
(
SELECT [PriceTable].[Timestamp]
,SUM([Value]) AS [TotalValue]
FROM [Product],
[PriceTable]
WHERE [PriceTable].[Timestamp] <= [Product].[Timestamp]
AND NOT EXISTS (SELECT * FROM [dbo].[PriceTable] pt
WHERE pt.[Timestamp] <= [Product].[Timestamp]
AND pt.[Timestamp] > [PriceTable].[Timestamp])
GROUP BY [PriceTable].[Timestamp]
) AS [Totals]
INNER JOIN [dbo].[PriceTable]
ON [PriceTable].[Timestamp] = [Totals].[Timestamp]
ORDER BY [PriceTable].[Timestamp]
Result
| Timestamp | TotalValue | Price |
|-----------|------------|-------|
| 1 | 8 | 1 |
| 3 | 11 | 4 |
| 7 | 14 | 2.5 |
Here, my first table, [Product], contains the product values for different timestamps, and my second table, [PriceTable], contains the prices for different time intervals. A given price is valid until a new price is set, so the price with timestamp 1 applies to the products with timestamps 1 and 2.
I am trying to get the total product value for each price. The SQL in the fiddle produces what I expect.
Is there a smarter way to get the same result?
By the way, I am using SQL Server 2014.
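To make the rule "a price is valid until a new price is set" concrete, here is a minimal Python sketch of the same grouping, outside SQL. The function name `totals_per_price` is hypothetical and is used only for illustration; the data is the sample data from the question.

```python
from bisect import bisect_right
from collections import defaultdict

# Sample rows from the question: (timestamp, value) and (timestamp, price).
products = [(1, 5), (2, 3), (4, 9), (5, 2), (7, 11), (9, 3)]
prices = [(1, 1), (3, 4), (7, 2.5), (10, 3)]


def totals_per_price(products, prices):
    """Sum product values under the latest price whose timestamp is <= the product's.

    Hypothetical helper for illustration; it mirrors the intent of the query,
    not its execution plan.
    """
    price_times = sorted(ts for ts, _ in prices)
    price_by_time = dict(prices)
    totals = defaultdict(float)
    for ts, value in products:
        idx = bisect_right(price_times, ts) - 1  # last price timestamp <= ts
        if idx < 0:
            continue  # product predates every price, so it matches no price row
        totals[price_times[idx]] += value
    return [(t, totals[t], price_by_time[t]) for t in sorted(totals)]


result = totals_per_price(products, prices)
print(result)
```

The price at timestamp 10 does not appear in the result because no product falls at or after it, which matches the three rows shown in the question.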

DECLARE @Product TABLE
(
    [Timestamp] BIGINT NOT NULL PRIMARY KEY,
    [Value] FLOAT NOT NULL
);
DECLARE @PriceTable TABLE
(
    [Timestamp] BIGINT NOT NULL PRIMARY KEY,
    [Price] FLOAT NOT NULL
);
INSERT INTO @Product ([Timestamp], [Value])
VALUES (1, 5), (2, 3), (4, 9), (5, 2), (7, 11), (9, 3);
INSERT INTO @PriceTable ([Timestamp], [Price])
VALUES (1, 1), (3, 4), (7, 2.5), (10, 3);
-- Pair each price timestamp with the next one to form an interval
WITH cte
AS ( SELECT * ,
            LEAD(pt.[Timestamp]) OVER ( ORDER BY pt.[Timestamp] ) AS [lTimestamp]
     FROM @PriceTable pt
   )
-- Sum the product values that fall inside each interval
SELECT cte.[Timestamp] ,
       ( SELECT SUM(Value)
         FROM @Product
         WHERE [Timestamp] >= cte.[Timestamp]
           AND [Timestamp] < cte.[lTimestamp]
       ) AS [TotalValue],
       cte.[Price]
FROM cte
The idea is to generate intervals from the price table, like:
1 - 3
3 - 7
7 - 10
and sum up all values in those intervals.
Output:
| Timestamp | TotalValue | Price |
|-----------|------------|-------|
| 1 | 8 | 1 |
| 3 | 11 | 4 |
| 7 | 14 | 2.5 |
| 10 | NULL | 3 |
You can simply add a WHERE clause if you want to filter out rows where nothing was sold.
You can also pass a default value to the LEAD window function if you want to close the last interval, like:
LEAD(pt.[Timestamp], 1, 100)
and I guess it would be something like this in production:
LEAD(pt.[Timestamp], 1, GETDATE())
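The interval idea in this answer can be sketched in Python as follows. This is only an illustration under stated assumptions: `totals_by_interval` and its `last_end` parameter are hypothetical names, `last_end` stands in for the third argument of LEAD, and the extra product at timestamp 12 is invented to show what closing the last interval changes.

```python
# Sample rows from the question: (timestamp, value) and (timestamp, price).
products = [(1, 5), (2, 3), (4, 9), (5, 2), (7, 11), (9, 3)]
prices = [(1, 1), (3, 4), (7, 2.5), (10, 3)]


def totals_by_interval(products, prices, last_end=None):
    """Pair each price timestamp with the next one and sum values in [start, end)."""
    ordered = sorted(prices)
    starts = [ts for ts, _ in ordered]
    ends = starts[1:] + [last_end]  # plays the role of LEAD(..., 1, default)
    rows = []
    for (start, price), end in zip(ordered, ends):
        if end is None:
            total = None  # a comparison against NULL matches no rows in SQL
        else:
            matched = [v for ts, v in products if start <= ts < end]
            total = sum(matched) if matched else None  # SUM over no rows is NULL
        rows.append((start, total, price))
    return rows


open_ended = totals_by_interval(products, prices)

# Hypothetical extra product after the last price change.
with_late_product = products + [(12, 4)]
still_open = totals_by_interval(with_late_product, prices)
closed = totals_by_interval(with_late_product, prices, last_end=100)
print(open_ended, still_open[-1], closed[-1])
```

Without a default, the last price row never collects anything, even when a later product exists; with a default end of 100, the hypothetical product at timestamp 12 is counted under the price set at timestamp 10.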

I think I've got a query which is easier to read. Does this work for you?
select pt.*,
(select sum(P.Value) from Product P where
P.TimeStamp between pt.TimeStamp and (
--get the next time stamp
select min(TimeStamp)-1 from PriceTable where TimeStamp > pt.TimeStamp
)) as TotalValue from PriceTable pt
--exclude entries with timestamps greater than those in Product table
where pt.TimeStamp < (select max(TimeStamp) from Product)
Very detailed question BTW

You could use a CTE:
;with cte as
(
select p1.[timestamp] as lowval,
case
when p2.[timestamp] is not null then p2.[timestamp] - 1
else 999999
end hival,
p1.price
from
(
select p1.[timestamp],p1.price,
row_number() over (order by p1.[timestamp]) rn
from pricetable p1 ) p1
left outer join
(select p1.[timestamp],p1.price,
row_number() over (order by p1.[timestamp]) rn
from pricetable p1) p2
on p2.rn = p1.rn + 1
)
select cte.lowval as 'timestamp',sum(p1.value) TotalValue,cte.price
from product p1
join cte on p1.[Timestamp] between cte.lowval and cte.hival
group by cte.lowval,cte.price
order by cte.lowval
It's a lot easier to understand, and the execution plan compares favourably with your query (about 10% cheaper).

Related

SQL to Find Max date Value from each group- SQL Server

I have 2 tables which I am joining using an INNER JOIN.
Table 1:
| Name | Batch_Date | AcctID |
|-------|------------|--------|
| Bob | 18-08-11 | 32 |
| Bob | 19-08-11 | 32 |
| Shawn | 18-08-11 | 42 |
| Shawn | 20-08-11 | 42 |
| Paul | 18-08-11 | 36 |
| Paul | 19-08-11 | 36 |
Table 2:
| Code | order_Date | AcctID |
|------|------------|--------|
| 1 | 18-08-11 | 32 |
| 0 | NULL | 32 |
| 0 | NULL | 42 |
| 0 | NULL | 42 |
| 1 | 18-08-11 | 36 |
| 1 | 18-08-11 | 36 |
I want to get the name, last batch_date and AcctID from table 1, and the code and order date from table 2.
The challenge is that table 2 has multiple rows for the same AcctID. If any row for an AcctID has a non-null date, I want to select that date; if the date is null in every row for that AcctID, I want to select NULL.
So the resulting dataset should look like this:
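| Name | Batch_Date | AcctID | Code | Order_Date |
|-------|------------|--------|------|------------|
| Bob | 19-08-11 | 32 | 1 | 18-08-11 |
| Shawn | 20-08-11 | 42 | 0 | NULL |
| Paul | 19-08-11 | 36 | 1 | 18-08-11 |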
OK, try this
--Set up your sample data in useable form, skip in your actual solution
with cteT1 as (
SELECT *
FROM (VALUES ('Bob', '18-08-11', 32), ('Bob', '19-08-11', 32)
, ('Shawn', '18-08-11', 42), ('Shawn', '20-08-11', 42)
, ('Paul', '18-08-11', 36), ('Paul', '19-08-11', 36)
) as T1 (CustName, BatchDate, AcctID)
), cteT2 as (
SELECT *
FROM (VALUES (1, '18-08-11', 32), (0, NULL, 32), (0, NULL, 42)
, (0, NULL, 42), (1, '18-08-11', 36), (1, '18-08-11', 36)
) as T2 (OrderCode, OrderDate, AcctID)
)
--Set up the solution - tag the newest of each table
, cteTopBatches as (
SELECT ROW_NUMBER() over (PARTITION BY AcctID ORDER BY BatchDate DESC) as BatchNewness, *
FROM cteT1
), cteTopOrders as (
SELECT ROW_NUMBER() over (PARTITION BY AcctID ORDER BY OrderDate DESC) as OrderNewness, *
FROM cteT2 --NOTE: NULLs sort below actual dates, but you could use COALESCE to force a specific value to use, probably in another CTE for readability
)
--Now combine the 2 tables keeping only the newest of each
SELECT T1.AcctID , T1.CustName , T1.BatchDate , T2.OrderCode , T2.OrderDate
FROM cteTopBatches as T1 INNER JOIN cteTopOrders as T2 ON T1.AcctID = T2.AcctID
WHERE T1.BatchNewness = 1 AND T2.OrderNewness = 1
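The "tag the newest row per account, then join" approach above can be sketched in Python like this. It is an illustration only: `newest_by_account` is a hypothetical helper, and the sample strings are assumed to be dd-mm-yy dates (the question does not say), so they are parsed into real dates instead of being compared as text.

```python
from datetime import date

# Sample rows from the question, assuming the strings are dd-mm-yy dates.
batches = [
    ("Bob", date(2011, 8, 18), 32), ("Bob", date(2011, 8, 19), 32),
    ("Shawn", date(2011, 8, 18), 42), ("Shawn", date(2011, 8, 20), 42),
    ("Paul", date(2011, 8, 18), 36), ("Paul", date(2011, 8, 19), 36),
]
orders = [
    (1, date(2011, 8, 18), 32), (0, None, 32), (0, None, 42),
    (0, None, 42), (1, date(2011, 8, 18), 36), (1, date(2011, 8, 18), 36),
]


def newest_by_account(rows, date_of, account_of):
    """Keep one row per account: the one with the latest date, missing dates last."""
    best = {}
    for row in rows:
        acct = account_of(row)
        # (is-not-null, date) ranks a missing date below every real date,
        # like NULLs coming last under ORDER BY ... DESC in SQL Server.
        key = (date_of(row) is not None, date_of(row) or date.min)
        if acct not in best or key > best[acct][0]:
            best[acct] = (key, row)
    return {acct: row for acct, (_, row) in best.items()}


top_batches = newest_by_account(batches, lambda r: r[1], lambda r: r[2])
top_orders = newest_by_account(orders, lambda r: r[1], lambda r: r[2])

# Inner join of the two "newest" sets on the account id.
joined = [
    (name, batch_date, acct, top_orders[acct][0], top_orders[acct][1])
    for acct, (name, batch_date, _) in top_batches.items()
    if acct in top_orders
]
print(joined)
```

Comparing real dates avoids relying on the text order of strings such as '18-08-11', which happens to work for this sample but is not a date ordering in general.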

SQL Server grouping rows

I have this data:
Id | Name | count | Group_number
------+-------+-------+--------------
1 | cdd | 50 | 0
2 | cdd | 15 | 0
3 | cdd | 0 | 0
4 | cdd | 25 | 0
5 | cdd | 11 | 0
I want a script that makes three or four groups on the condition that Sum(count) for each group does not exceed 50.
I want this output:
1 | cdd | 50 | 1
2 | cdd | 15 | 2
3 | cdd | 0 | 2
4 | cdd | 25 | 2
5 | cdd | 11 | 3
Assuming this has to be done for each name, you can use a recursive cte.
with rownums as (select t.*,row_number() over(partition by name order by id) as rnum from t)
,cte(rnum,id,name,cnt,runningsum,grp) as
(select rnum,id,name,cnt,cnt,1 from rownums where rnum=1
union all
select t.rnum,t.id,t.name,t.cnt
,case when c.runningsum+t.cnt > 50 then t.cnt else c.runningsum+t.cnt end
,case when c.runningsum+t.cnt > 50 then t.id else c.grp end
from cte c
join rownums t on t.rnum=c.rnum+1 and t.name=c.name
)
select id,cnt,name,dense_rank() over(partition by name order by grp) as grp
from cte
Keep track of the running sum and reset it when it goes over 50. Also remember the id when the sum goes over 50. This can be used to assign group numbers.
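The running-sum-with-reset logic can be traced in a few lines of Python. This is a sketch, not the query itself: `assign_groups` is a hypothetical name, and it numbers groups directly instead of remembering an id and applying DENSE_RANK afterwards.

```python
def assign_groups(counts, limit=50):
    """Walk the rows in order; start a new group when the running sum would exceed limit."""
    groups = []
    running = 0
    group = 0
    for i, cnt in enumerate(counts):
        if i == 0 or running + cnt > limit:
            group += 1      # reset point: this row opens a new group
            running = cnt   # the running sum restarts from this row
        else:
            running += cnt
        groups.append(group)
    return groups


groups = assign_groups([50, 15, 0, 25, 11])
print(groups)
```

For the sample counts the running sums are 50, 15, 15, 40 and 11, with resets at the second and fifth rows, which yields the three groups requested in the question.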
For records where the count is less than 50, we can generate a grouping id by calculating a running total of the count and dividing that running total by 50. However, some records may already have a count that is greater than or equal to 50, which might generate an incorrect id. To solve this problem, we need to force the generation of a new grouping id on the next record. This can be done by adding 50 to the next record's count whenever the current record's count is 50 or greater.
The following example demonstrates how this can be done:
CREATE TABLE #Items
(
[Id] INT NOT NULL PRIMARY KEY
,[Name] VARCHAR(50) NOT NULL
,[Count] INT NOT NULL
)
INSERT INTO #Items
VALUES
(1, 'cdd', 50 ),
(2, 'cdd', 15 ),
(3, 'cdd', 0 ),
(4, 'cdd', 25 ),
(5, 'cdd', 11 );
;WITH CTE_ItemCountsAdjusted
AS
(
SELECT [Id]
,[Name]
,[Count]
,LAG([Count], 1, 0) OVER (PARTITION BY [Name] ORDER BY [Id]) AS PrevCount
,(
CASE
WHEN LAG([Count], 1, 0) OVER (PARTITION BY [Name] ORDER BY [Id]) >= 50 THEN [Count] + 50
ELSE [Count]
END
) AdjustedCount
FROM #Items
)
SELECT [Id]
,[Name]
,[Count]
,SUM([AdjustedCount]) OVER (PARTITION BY [Name] ORDER BY [Id] ROWS UNBOUNDED PRECEDING) / 50 AS [Group_number]
FROM CTE_ItemCountsAdjusted
ORDER BY [Id]
This method eliminates the need for recursive calls. Note that if you need the group id to be strictly sequential (no gaps between group numbers), you can use the DENSE_RANK() window function, as in the following example:
INSERT INTO #Items
VALUES
(1, 'cdd', 50 ),
(2, 'cdd', 15 ),
(3, 'cdd', 0 ),
(4, 'cdd', 25 ),
(5, 'cdd', 11 ),
(6, 'cdd', 200 ),
(7, 'cdd', 10 );
;WITH CTE_ItemCountsAdjusted
AS
(
SELECT [Id]
,[Name]
,[Count]
,LAG([Count], 1, 0) OVER (PARTITION BY [Name] ORDER BY [Id]) AS PrevCount
,(
CASE
WHEN LAG([Count], 1, 0) OVER (PARTITION BY [Name] ORDER BY [Id]) >= 50 THEN [Count] + 50
ELSE [Count]
END
) AdjustedCount
FROM #Items
),CTE_ItemCountsWithGroupID
AS
(
SELECT [Id]
,[Name]
,[Count]
,SUM([AdjustedCount]) OVER (PARTITION BY [Name] ORDER BY [Id] ROWS UNBOUNDED PRECEDING) / 50 AS [Group_number]
FROM CTE_ItemCountsAdjusted
)
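SELECT [Id]
,[Name]
,[Count]
-- DENSE_RANK closes the gaps between the raw group numbers
,DENSE_RANK() OVER (PARTITION BY [Name] ORDER BY [Group_number]) AS [Group_number]
FROM CTE_ItemCountsWithGroupID
ORDER BY [Id]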
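The arithmetic behind this non-recursive method is easy to check by hand, so here is a small Python sketch of it. The names `bucket_ids` and `dense_rank` are hypothetical; the first mirrors the LAG adjustment plus running total divided by 50, the second mirrors DENSE_RANK() over a single partition.

```python
from itertools import accumulate


def bucket_ids(counts, size=50):
    """Running total of adjusted counts, integer-divided by size."""
    # LAG(count, 1, 0): the previous row's count, 0 for the first row.
    previous = [0] + counts[:-1]
    # Add size to a row when the previous row alone already reached size.
    adjusted = [c + size if p >= size else c for c, p in zip(counts, previous)]
    return [total // size for total in accumulate(adjusted)]


def dense_rank(values):
    """Replace each value by its rank among the distinct values, without gaps."""
    rank_of = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [rank_of[v] for v in values]


first = bucket_ids([50, 15, 0, 25, 11])
second_raw = bucket_ids([50, 15, 0, 25, 11, 200, 10])
second = dense_rank(second_raw)
print(first, second_raw, second)
```

For the seven-row sample the raw ids are 1, 2, 2, 2, 3, 7, 8, and ranking them gives 1, 2, 2, 2, 3, 4, 5. One thing worth checking against your own data: the id comes from dividing a running total, so the sum is not reset at each boundary. With counts 30, 30, 30, 30 this sketch returns 0, 1, 1, 2, which puts two rows totalling 60 in the same bucket.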

Transpose SQL table results using Pivot

I am trying to transpose column results in my table into row results. Here is the query that generates the table results:
CREATE TABLE Zone
([Zone] varchar(9), [CompanyID] int, [SubCount] int);
CREATE TABLE Company
([UniqueIdentifier]int, [Name] varchar(50));
--Adding Values into the table
INSERT INTO Company
([UniqueIdentifier], [Name])
VALUES
( 1, 'CompanyA'),
( 2, 'CompanyB'),
( 3, 'CompanyC'),
( 4, 'CompanyD'),
( 5, 'CompanyE');
--Adding Values to the table
INSERT INTO Zone
([Zone], [CompanyID], [SubCount])
VALUES
( 'Zone1', 1, 100),
( 'Zone2', 1, 200),
( 'Zone3', 2, 1250),
( 'Zone4', 3, 1440),
( 'Zone5', 4, 1445),
( 'Zone6', 4, 3250),
( 'Zone7', 5, 4440);
--Getting TOTALS
SELECT
CASE WHEN GROUPING(dbo.Company.Name)=1 THEN 'Grand Total' else dbo.Company.Name end as Company,
SUM(dbo.Zone.SubCount) as Subs
FROM dbo.Company INNER JOIN dbo.Zone ON dbo.Company.UniqueIdentifier = dbo.Zone.CompanyID
WHERE (dbo.Zone.SubCount IS NOT NULL) AND (dbo.Zone.SubCount > 0)
Group by ROLLUP(dbo.Company.Name)
ORDER BY Subs DESC;
HERE ARE THE RESULTS OF THE QUERY:
Company | Subs
------------------------
1 Grand Total | 12125
2 CompanyD | 4695
3 CompanyE | 4440
4 CompanyC | 1440
5 CompanyB | 1250
6 CompanyA | 300
DESIRED RESULT WOULD LOOK LIKE:
Company |CompanyA|CompanyB|CompanyC|CompanyE|CompanyD|Grand Total
---------------------------------------------------------------------
Subs | 300 | 1250 | 1440 | 4440 | 4695 | 12125
ANY HELP IS GREATLY APPRECIATED!
All you need to do is use the PIVOT operator to convert these rows into columns:
SELECT 'Subs' AS Company
,[CompanyA]
,[CompanyB]
,[CompanyC]
,[CompanyD]
,[CompanyE]
,[Grand Total]
FROM (
SELECT CASE
WHEN GROUPING(dbo.Company.NAME) = 1
THEN 'Grand Total'
ELSE dbo.Company.NAME
END AS Company
,SUM(dbo.Zone.SubCount) AS Subs
FROM dbo.Company
INNER JOIN dbo.Zone ON dbo.Company.[UniqueIdentifier] = dbo.Zone.CompanyID
WHERE (dbo.Zone.SubCount IS NOT NULL)
AND (dbo.Zone.SubCount > 0)
GROUP BY ROLLUP(dbo.Company.NAME)
) a
pivot(max(subs) FOR Company IN (
[CompanyA]
,[CompanyB]
,[CompanyC]
,[CompanyD]
,[CompanyE]
,[Grand Total]
)) piv;
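As a cross-check of what the pivoted row should contain, the same aggregation can be sketched in Python. `pivot_subs` is a hypothetical helper; it reproduces the filter (non-null, positive SubCount), the per-company sums, the rollup total, and the final single row keyed by company name.

```python
from collections import defaultdict

# Sample data from the question.
companies = {1: "CompanyA", 2: "CompanyB", 3: "CompanyC", 4: "CompanyD", 5: "CompanyE"}
zones = [
    ("Zone1", 1, 100), ("Zone2", 1, 200), ("Zone3", 2, 1250), ("Zone4", 3, 1440),
    ("Zone5", 4, 1445), ("Zone6", 4, 3250), ("Zone7", 5, 4440),
]


def pivot_subs(companies, zones):
    """Sum SubCount per company, then lay the sums out as one row with a grand total."""
    totals = defaultdict(int)
    for _, company_id, sub_count in zones:
        # Same filter as the WHERE clause, plus the inner join on company id.
        if sub_count is not None and sub_count > 0 and company_id in companies:
            totals[companies[company_id]] += sub_count
    row = {"Company": "Subs"}
    for name in sorted(totals):
        row[name] = totals[name]  # one output column per company
    row["Grand Total"] = sum(totals.values())  # the ROLLUP row
    return row


pivoted = pivot_subs(companies, zones)
print(pivoted)
```

Unlike the PIVOT query, which needs the column list spelled out in the IN clause, this sketch discovers the company names from the data.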

For different groups, sql query to replace null value with known value from next known value

I have this SQL table which looks like this:
customer date number
--------- ---- ------
A 1 3
A 2 NULL
A 3 5
A 4 NULL
A 5 6
B 1 NULL
B 2 NULL
B 3 10
Per customer, I'm looking to add an extra column number_NEW which, when number is NULL, holds the next known number in chronological order (determined by date):
customer date number number_NEW
--------- ---- ------ ----------
A 1 3 3
A 2 NULL 5
A 3 5 5
A 4 NULL 6
A 5 6 6
B 1 NULL 10
B 2 NULL 10
B 3 10 10
How would I go about this in SQL?
Thanks a lot!
You can use APPLY:
SELECT
*,
Number_NEW = ISNULL(t.Number, x.Number)
FROM Test t
OUTER APPLY(
SELECT TOP 1 Number
FROM Test
WHERE
Customer = t.Customer
AND Date > t.Date
AND Number IS NOT NULL
ORDER BY Date
)x
ORDER BY t.Customer, t.Date
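The lookup that OUTER APPLY performs for each row can be sketched in Python as follows. `fill_from_next` is a hypothetical name; for every row with a missing number it searches the same customer's later rows for the earliest one that has a number, as the TOP 1 ... ORDER BY Date subquery does.

```python
# Sample rows from the question: (customer, date, number).
rows = [
    ("A", 1, 3), ("A", 2, None), ("A", 3, 5), ("A", 4, None), ("A", 5, 6),
    ("B", 1, None), ("B", 2, None), ("B", 3, 10),
]


def fill_from_next(rows):
    """Append number_new: the row's own number, or the next known one for that customer."""
    out = []
    for customer, day, number in rows:
        if number is None:
            # Later rows of the same customer that have a number, earliest first.
            later = sorted(
                (d, n) for c, d, n in rows
                if c == customer and d > day and n is not None
            )
            number_new = later[0][1] if later else None  # no later value: stays NULL
        else:
            number_new = number
        out.append((customer, day, number, number_new))
    return out


filled = fill_from_next(rows)
print([r[3] for r in filled])
```

A trailing NULL with no later value for the same customer stays NULL, which is also what the OUTER APPLY version returns.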
Your sample data is not quite clear, in particular the [date] column, so to be safe I have used ROW_NUMBER, which I think is required.
I think your problem is already solved by the other answer, but I have written this script for SQL Server 2012 using a dynamic LEAD() offset.
It not only gives the correct output but also shows how LEAD() can be used dynamically.
DECLARE @t TABLE (customer varchar(20), [date] int, number int)
INSERT INTO @t VALUES
('A', 1, 3)
,('A', 2, NULL)
,('A', 3, 5)
,('A', 4, NULL)
,('A', 5, 6)
,('B', 1, NULL)
,('B', 2, NULL)
,('B', 3, 10)
;WITH CTE
AS (
SELECT *
,ROW_NUMBER() OVER (
PARTITION BY customer ORDER BY [date]
) RN
FROM @t
)
--SELECT * FROM CTE
SELECT *
,IIF(number IS NULL, LEAD(number, (
SELECT TOP 1 RN - A.RN
FROM CTE
WHERE customer = a.customer
AND RN > a.RN
AND number IS NOT NULL
ORDER BY RN
), number) OVER (
ORDER BY customer
,[date]
), number) number_NEW
FROM CTE A
alter table T add number_NEW int null;
update T /* substitute table name here -- I realize that SQL Server allows aliases */
set number_NEW =
case
when number is null
then (
select min(t2.number) /* do date and number always increase together? */
from T as t2
/* substitute full table name here as well */
where t2.customer = T.customer and t2.date > T."date"
)
else number
end
;
alter table T alter column number_NEW int not null;

How to limit the selection in SQL Server by sum of a column?

Can I limit rows by sum of a column in a SQL Server database?
For example:
Type | Time (in minutes)
-------------------------
A | 50
B | 10
C | 30
D | 20
E | 70
...
I want to limit the selection by the sum of time, for example to a maximum of 100 minutes. The result must look like this:
Type | Time (in minutes)
-------------------------
A | 50
B | 10
C | 30
Any ideas? Thanks.
DECLARE @T TABLE
(
    [Type] CHAR(1) PRIMARY KEY,
    [Time] INT
)
INSERT INTO @T
SELECT 'A',50 UNION ALL
SELECT 'B',10 UNION ALL
SELECT 'C',30 UNION ALL
SELECT 'D',20 UNION ALL
SELECT 'E',70;
WITH RecursiveCTE
AS (
    -- Anchor: the first row in [Type] order
    SELECT TOP 1 [Type], [Time], CAST([Time] AS BIGINT) AS Total
    FROM @T
    ORDER BY [Type]
    UNION ALL
    -- Recursive step: take the next row while the running total stays within 100
    SELECT R.[Type], R.[Time], R.Total
    FROM (
        SELECT T.*,
               T.[Time] + Total AS Total,
               rn = ROW_NUMBER() OVER (ORDER BY T.[Type])
        FROM @T T
        JOIN RecursiveCTE R
            ON R.[Type] < T.[Type]
    ) R
    WHERE R.rn = 1 AND Total <= 100
)
SELECT [Type], [Time], Total
FROM RecursiveCTE
OPTION (MAXRECURSION 0);
Or, if your table is small:
SELECT t1.[Type],
       t1.[Time],
       SUM(t2.[Time]) AS Total
FROM @T t1
JOIN @T t2
    ON t2.[Type] <= t1.[Type]
GROUP BY t1.[Type], t1.[Time]
HAVING SUM(t2.[Time]) <= 100
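Both queries amount to "keep rows, in [Type] order, while the running total stays within the limit". A minimal Python sketch of that rule, with the hypothetical name `take_until_total`, looks like this:

```python
from itertools import accumulate, takewhile

# Sample rows from the question: (type, time in minutes).
rows = [("A", 50), ("B", 10), ("C", 30), ("D", 20), ("E", 70)]


def take_until_total(rows, limit):
    """Return (type, time, running_total) rows until the running total exceeds limit."""
    ordered = sorted(rows)  # same order as ORDER BY [Type]
    totals = accumulate(t for _, t in ordered)
    kept = takewhile(lambda pair: pair[1] <= limit, zip(ordered, totals))
    return [(typ, t, total) for (typ, t), total in kept]


selected = take_until_total(rows, 100)
print(selected)
```

The sketch stops at the first row that would push the total past the limit, like the recursive CTE. The self-join with HAVING instead keeps every row whose running total is within the limit; the two agree as long as the times are never negative.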