Grouping Column Without Breaking The Sequence - sql

The goal is to number the rows by consecutive runs of the Amount column, so that if a different value appears between two occurrences of the same value, those occurrences are numbered separately.
This is the raw data here:
SELECT Area, DateA, DateB, Amount
FROM (VALUES
('ABC', '2019-08-18', '2019-08-18 00:07:47.000', 3.75),
('ABC','2019-08-19', '2019-08-19 00:08:47.000', 3.75),
('ABC','2019-08-20', '2019-08-20 00:09:47.000', 3.65),
('ABC','2019-08-21', '2019-08-21 00:09:57.000', 3.75))
AS FeeCollection(Area, DateA, DateB, Amount)
I've tried this, but I don't know how to make it number the rows in this special way:
DENSE_RANK() OVER(ORDER BY Area, Amount)
This is the sample result I want to achieve. I'm looking for simple logic to do it. Using a cursor or a WHILE loop would not be efficient enough for me.

I believe this is what you want. I use LAG to get the value of the prior row in a CTE, and then use a windowed COUNT to reduce the value of ROW_NUMBER by 1 for each row with the same consecutive value for amount:
WITH CTE AS(
SELECT Area,
DateA,
DateB,
Amount,
LAG(Amount) OVER (PARTITION BY Area ORDER BY DateA) AS PrevAmount
FROM (VALUES
('ABC', '2019-08-18', '2019-08-18 00:07:47.000', 3.75),
('ABC','2019-08-19', '2019-08-19 00:08:47.000', 3.75),
('ABC','2019-08-20', '2019-08-20 00:09:47.000', 3.65),
('ABC','2019-08-21', '2019-08-21 00:09:57.000', 3.75))
AS FeeCollection(Area, DateA, DateB, Amount))
SELECT Area,
DateA,
DateB,
Amount,
ROW_NUMBER() OVER (PARTITION BY Area ORDER BY DateA) -
COUNT(CASE Amount WHEN PrevAmount THEN 1 END) OVER (PARTITION BY Area ORDER BY DateA
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Number
FROM CTE
ORDER BY DateA;
I did assume your PARTITION BY clause, which you may need to change, remove, or move to the ORDER BY. As there was only one value for Area, it was impossible to know what the number should be when Area changes.

I would also do this using lag() and a cumulative sum, but written like this:
select t.*,
sum(case when prev_amount = amount then 0 else 1 end) over
(partition by area order by datea) as number
from (select t.*,
lag(amount) over (partition by area order by datea) as prev_amount
from t
) t;
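To see what the lag() plus cumulative sum approach produces on the sample rows from the question, here is a minimal sketch. It runs the same idea through Python's built-in sqlite3 module rather than SQL Server (an assumption made only so the example is self-contained; it needs SQLite 3.25 or later for window functions), and it assumes rows are ordered by DateA within each Area.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FeeCollection (Area TEXT, DateA TEXT, DateB TEXT, Amount REAL)"
)
conn.executemany(
    "INSERT INTO FeeCollection VALUES (?, ?, ?, ?)",
    [
        ("ABC", "2019-08-18", "2019-08-18 00:07:47.000", 3.75),
        ("ABC", "2019-08-19", "2019-08-19 00:08:47.000", 3.75),
        ("ABC", "2019-08-20", "2019-08-20 00:09:47.000", 3.65),
        ("ABC", "2019-08-21", "2019-08-21 00:09:57.000", 3.75),
    ],
)

# Add 1 whenever the amount differs from the previous row (or there is no
# previous row); the running total of those flags is the group number.
query = """
SELECT DateA, Amount,
       SUM(CASE WHEN PrevAmount = Amount THEN 0 ELSE 1 END)
           OVER (PARTITION BY Area ORDER BY DateA) AS Number
FROM (SELECT Area, DateA, Amount,
             LAG(Amount) OVER (PARTITION BY Area ORDER BY DateA) AS PrevAmount
      FROM FeeCollection) AS t
ORDER BY DateA
"""
numbers = [row[2] for row in conn.execute(query)]
print(numbers)  # [1, 1, 2, 3]
```

The first row has no previous amount, so the comparison yields NULL and the CASE falls through to 1, which is what starts the numbering at 1.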

Related

SQL - get start & end balance for each member each year

I'd like to get the starting and ending balance for each member, for every year in which there is a record. For example, the query below gives me the latest balance for each member in each year, based on the date column:
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol desc) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
and the below would give me the earliest balance for each year
SELECT
T.MemberID,
T.DateCol,
T.Amount
FROM
(SELECT T.MemberID,
T.DateCol,
Amount,
ROW_NUMBER() OVER (PARTITION BY MemberID,
YEAR(DateCol)
ORDER BY
DateCol) AS seqnum
FROM
Tablet T
GROUP BY DateCol, MemberID, Amount
) T
WHERE
seqnum = 1 AND
MemberID = '1000009'
This would give me a result set like the below, column titles (MemberID, Date, Amount)
What I'm looking for is one query which is done by YEAR, MEMBERID, STARTBALANCE, ENDBALANCE as the columns. And would look like the below
What would be the best way to go about this?
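One possible way to combine the two queries above is to compute both row numbers (ascending and descending) in a single subquery and then pivot them with conditional aggregation. The sketch below is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module (3.25 or later), so it uses strftime('%Y', DateCol) where SQL Server would use YEAR(DateCol), and the balances in the table are made-up sample values, not data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tablet (MemberID TEXT, DateCol TEXT, Amount INTEGER)")
# Hypothetical balances for one member, invented for this example.
conn.executemany(
    "INSERT INTO Tablet VALUES (?, ?, ?)",
    [
        ("1000009", "2018-01-15", 100),
        ("1000009", "2018-06-30", 150),
        ("1000009", "2018-12-20", 120),
        ("1000009", "2019-02-01", 130),
        ("1000009", "2019-11-11", 90),
    ],
)

# seq_asc = 1 marks the earliest row of the year, seq_desc = 1 the latest.
# MAX(CASE ...) picks the single matching Amount per member and year.
query = """
SELECT MemberID, Yr,
       MAX(CASE WHEN seq_asc = 1 THEN Amount END) AS StartBalance,
       MAX(CASE WHEN seq_desc = 1 THEN Amount END) AS EndBalance
FROM (SELECT MemberID, Amount,
             strftime('%Y', DateCol) AS Yr,
             ROW_NUMBER() OVER (PARTITION BY MemberID, strftime('%Y', DateCol)
                                ORDER BY DateCol) AS seq_asc,
             ROW_NUMBER() OVER (PARTITION BY MemberID, strftime('%Y', DateCol)
                                ORDER BY DateCol DESC) AS seq_desc
      FROM Tablet) AS s
GROUP BY MemberID, Yr
ORDER BY MemberID, Yr
"""
balances = conn.execute(query).fetchall()
print(balances)
# [('1000009', '2018', 100, 120), ('1000009', '2019', 130, 90)]
```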

SQL calculation with previous row + current row

I want to reproduce a calculation from an Excel file. I managed to get the first two records with LAG (as you can see in the second screenshot), but I'm out of ideas on how to proceed and need help. I just need the Calculation column to use its own value from the previous row, computed automatically over all the dates. I also tried applying LAG to the calculation manually, but the result was one extra row of data instead of NULL. This is a headache.
LAG([Data ingested], 1) OVER (ORDER BY [Date] ASC) AS LAG
You seem to want cumulative sums:
select t.*,
(sum(reconciliation + aves - microa) over (order by date) -
first_value(aves - microa) over (order by date)
) as calculation
from CalcTable t;
Here is a SQL Fiddle.
EDIT:
Based on your comment, you just need to define a group:
select t.*,
(sum(reconciliation + aves - microa) over (partition by grp order by date) -
first_value(aves - microa) over (partition by grp order by date)
) as calculation
from (select t.*,
count(nullif(reconciliation, 0)) over (order by date) as grp
from CalcTable t
) t
order by date;
IMO this could be solved using a "gaps and islands" approach. When Reconciliation > 0, create a gap. SUM(gap) OVER converts the gaps into island groupings. In the outer query, the sum_over column (which corresponds to Calculation) is a cumulative sum partitioned by the island groupings.
with
gap_cte as (
select *, case when [Reconciliation]>0 then 1 else 0 end gap
from CalcTable),
grp_cte as (
select *, sum(gap) over (order by [Date]) grp
from gap_cte)
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
[EDIT]
The CASE expression could be CROSS APPLY'ed instead:
with
grp_cte as (
select c.*, v.gap, sum(v.gap) over (order by [Date]) grp
from CalcTable c
cross apply (values (case when [Reconciliation]>0 then 1 else 0 end)) v(gap))
select *, sum([Reconciliation]+
(case when gap=1 then 0 else Aves end)-
(case when gap=1 then 0 else Microa end))
over (partition by grp order by [Date]) sum_over
from grp_cte;
Here is a fiddle

How to get the validity date range of a price from individual daily prices in SQL

I have some prices for the month of January.
Date,Price
1,100
2,100
3,115
4,120
5,120
6,100
7,100
8,120
9,120
10,120
Now, the output I need is a non-overlapping date range for each price.
price,from,To
100,1,2
115,3,3
120,4,5
100,6,7
120,8,10
I need to do this using SQL only.
For now, if I simply group by and take min and max dates, I get the below, which is an overlapping range:
price,from,to
100,1,7
115,3,3
120,4,10
This is a gaps-and-islands problem. The simplest solution is the difference of row numbers:
select price, min(date), max(date)
from (select t.*,
row_number() over (order by date) as seqnum,
row_number() over (partition by price order by date) as seqnum2
from t
) t
group by price, (seqnum - seqnum2)
order by min(date);
Why this works is a little hard to explain. But if you look at the results of the subquery, you will see how the adjacent rows are identified by the difference in the two values.
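To make the difference-of-row-numbers idea concrete, the sketch below prints the subquery's rows for the prices from the question and then runs the grouped query. It uses Python's sqlite3 module (SQLite 3.25 or later) instead of SQL Server, purely so the example is self-contained; the table name t and its columns follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (date INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 115), (4, 120), (5, 120),
     (6, 100), (7, 100), (8, 120), (9, 120), (10, 120)],
)

# seqnum counts all rows by date; seqnum2 counts rows within one price.
# Inside an unbroken run of a price both advance together, so their
# difference stays constant; it jumps once another price interrupts the run.
inner = """
SELECT date, price,
       ROW_NUMBER() OVER (ORDER BY date) AS seqnum,
       ROW_NUMBER() OVER (PARTITION BY price ORDER BY date) AS seqnum2
FROM t
"""
for date, price, seqnum, seqnum2 in conn.execute(inner + " ORDER BY date"):
    print(date, price, seqnum, seqnum2, seqnum - seqnum2)

# Grouping by price and the difference yields one row per consecutive run.
ranges = conn.execute(f"""
SELECT price, MIN(date), MAX(date)
FROM ({inner}) AS s
GROUP BY price, seqnum - seqnum2
ORDER BY MIN(date)
""").fetchall()
print(ranges)
# [(100, 1, 2), (115, 3, 3), (120, 4, 5), (100, 6, 7), (120, 8, 10)]
```

For price 100, dates 1 and 2 give a difference of 0 while dates 6 and 7 give 3, which is why the two runs end up in separate groups even though the price is the same.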
SELECT Lag.price,Lag.[date] AS [From], MIN(Lead.[date]-Lag.[date])+Lag.[date] AS [to]
FROM
(
SELECT [date],[Price]
FROM
(
SELECT [date],[Price],LAG(Price) OVER (ORDER BY DATE,Price) AS LagID FROM #table1 A
)B
WHERE CASE WHEN Price <> ISNULL(LagID,1) THEN 1 ELSE 0 END = 1
)Lag
JOIN
(
SELECT [date],[Price]
FROM
(
SELECT [date],Price,LEAD(Price) OVER (ORDER BY DATE,Price) AS LeadID FROM [#table1] A
)B
WHERE CASE WHEN Price <> ISNULL(LeadID,1) THEN 1 ELSE 0 END = 1
)Lead
ON Lag.[Price] = Lead.[Price]
WHERE Lead.[date]-Lag.[date] >= 0
GROUP BY Lag.[date],Lag.[price]
ORDER BY Lag.[date]
Another method, using ROWS UNBOUNDED PRECEDING:
SELECT price, MIN([date]) AS [from], [end_date] AS [To]
FROM
(
SELECT *, MIN([abc]) OVER (ORDER BY DATE DESC ROWS UNBOUNDED PRECEDING ) end_date
FROM
(
SELECT *, CASE WHEN price = next_price THEN NULL ELSE DATE END AS abc
FROM
(
SELECT a.* , b.[date] AS next_date, b.price AS next_price
FROM #table1 a
LEFT JOIN #table1 b
ON a.[date] = b.[date]-1
)AA
)BB
)CC
GROUP BY price, end_date

SQL - Rank monthly dataset high, medium, low

I have a table which includes the month, accountID and a set of application scores. I want to create a new column which either gives a 'high', 'medium' or 'low' for the top, middle and bottom 33% of the results each month.
If I use rank() I can order the application scores for a single month or for the whole dataset, but I'm unsure how to order them per month. Also, percent_rank() does not work on my version of SQL Server.
select
AccountID
, ApplicationScore
, rank() over (order by applicationscore asc) as Rank
from Table
I then know I need to put the rank() statement in a subquery and then use a case statement to apply the 'high', 'medium' or 'low'.
select
AccountID
, case when rank <= total/3 then 'low'
when rank > total/3 and rank <= (total/3)*2 then 'medium'
when rank > (total/3)*2 then 'high' end ApplicationScore
from (subquery) a
NTILE(3) worked very well:
select
AccountID
, Monthstart
, ApplicationScore
, ntile(3) over (partition by monthstart order by applicationscore) Rank
from table
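The NTILE(3) query returns tile numbers 1 to 3 rather than the labels the question asks for, so a CASE in an outer query can map them. The sketch below is one way to do that, shown on SQLite through Python's sqlite3 module (3.25 or later) as a stand-in for SQL Server. The table name Scores, the Band column, and the sample scores are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Scores (AccountID INTEGER, MonthStart TEXT, ApplicationScore INTEGER)"
)
# Made-up scores: six accounts in January, three in February.
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 10), (2, "2020-01-01", 20), (3, "2020-01-01", 30),
        (4, "2020-01-01", 40), (5, "2020-01-01", 50), (6, "2020-01-01", 60),
        (7, "2020-02-01", 15), (8, "2020-02-01", 25), (9, "2020-02-01", 35),
    ],
)

# NTILE(3) splits each month's rows, ordered by score, into three buckets;
# the outer CASE turns bucket numbers into labels.
query = """
SELECT AccountID,
       CASE Tile WHEN 1 THEN 'low' WHEN 2 THEN 'medium' ELSE 'high' END AS Band
FROM (SELECT AccountID,
             NTILE(3) OVER (PARTITION BY MonthStart
                            ORDER BY ApplicationScore) AS Tile
      FROM Scores) AS s
ORDER BY AccountID
"""
bands = dict(conn.execute(query).fetchall())
print(bands)
```

With these rows, accounts 1 and 2 come out as 'low', 3 and 4 as 'medium', and 5 and 6 as 'high' for January, and each of the three February accounts gets its own band.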
SQL Server may have something built in to handle your problem. But we can easily use a ratio of counts to find the three segments of your scores, for each month. The ratio we can use is the count, partitioned by month and ordered by score, divided by the count for the entire month.
WITH cte AS (
SELECT *,
1.0 * COUNT(*) OVER (PARTITION BY Month ORDER BY ApplicationScore) /
COUNT(*) OVER (PARTITION BY Month) AS cnt
FROM yourTable
)
SELECT
AccountID,
Month,
ApplicationScore,
CASE WHEN cnt < 0.34 THEN 'low'
WHEN cnt < 0.67 THEN 'medium'
ELSE 'high' END AS rank
FROM cte
ORDER BY
Month,
ApplicationScore DESC;
Demo

How can I add cumulative sum column?

I use SQL Server Express.
The following query gives me the attached result.
SELECT ReceiptId, Date, Amount, [Transaction]
FROM (
SELECT ReceiptId, Date, Amount, 'DR' AS [Transaction]
FROM ReceiptCRDR
WHERE (Amount > 0)
UNION ALL
SELECT ReceiptId, Date, Amount, 'CR' AS [Transaction]
FROM ReceiptCR
WHERE (Amount > 0)
UNION ALL
SELECT strInvoiceNo AS ReceiptId, CONVERT(datetime, dtInvoiceDt, 103) AS Date, floatTotal AS Amount, 'DR' AS [Transaction]
FROM tblSellDetails
) AS t
ORDER BY Date
Result
I want a new column that shows the balance amount.
For example, the 1st row should show -2500, the 2nd should show -3900, the 3rd should show -700, and so on.
Basically, it needs the previous row's Account column data and a calculation based on the transaction type.
Sample Result
Well, that looks like SQL Server. If you are using 2012 or later, use SUM() OVER():
SELECT t.*,
SUM(CASE WHEN t.[Transaction] = 'DR'
THEN t.amount*-1
ELSE t.amount END)
OVER(PARTITION BY t.date ORDER BY t.receiptId, t.[Transaction] DESC) as Cumulative_Col
FROM (YourQuery Here) t
This sums the amount as is when the transaction is CR, and the amount multiplied by -1 when it is DR.
Right now I partition by date, meaning the column restarts each day. If you want a running total across all dates, replace the OVER() clause with this:
OVER(ORDER BY t.date, t.receiptId, t.[Transaction] DESC) as Cumulative_Col
Also, I didn't understand why, on the same date and for the same ReceiptId, DR is calculated before CR. I've added that to the ORDER BY, but if that's not what you want, please explain the logic in more detail.
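As a rough check of the running-total version, the sketch below applies the same signed SUM() OVER (ORDER BY ...) to three rows whose amounts are inferred from the balances the question expects (-2500, -3900, -700). It runs on SQLite through Python's sqlite3 module (3.25 or later) rather than SQL Server. The Ledger table, the TxType column (standing in for [Transaction]), and the dates are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Ledger (ReceiptId INTEGER, Date TEXT, Amount INTEGER, TxType TEXT)"
)
# Amounts chosen so the balances match the ones quoted in the question;
# the dates and receipt ids are invented.
conn.executemany(
    "INSERT INTO Ledger VALUES (?, ?, ?, ?)",
    [
        (1, "2016-01-01", 2500, "DR"),
        (2, "2016-01-02", 1400, "DR"),
        (3, "2016-01-03", 3200, "CR"),
    ],
)

# DR rows count as negative, CR rows as positive; ordering the window by
# date and receipt id makes SUM() a running balance across all rows.
query = """
SELECT ReceiptId,
       SUM(CASE WHEN TxType = 'DR' THEN -Amount ELSE Amount END)
           OVER (ORDER BY Date, ReceiptId) AS Balance
FROM Ledger
ORDER BY Date, ReceiptId
"""
running = [row[1] for row in conn.execute(query)]
print(running)  # [-2500, -3900, -700]
```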