Aggregation by positive/negative values v.2 - sql

I've posted several topics on this and every query had some problem, so I've changed the table and examples for better understanding.
I have a table called PROD_COST with 5 fields
(ID,Duration,Cost,COST_NEXT,COST_CHANGE).
I need an extra field called "Groups" for aggregation.
Duration = number of days the price is valid (1 day = 1 row).
Cost = product price on that day.
Cost_next = lead(cost, 1, 0).
Cost_change = Cost_next - Cost.
Example:
+----+----------+------+-------------+--------+
| ID | Duration | Cost | Cost_change | Groups |
+----+----------+------+-------------+--------+
|  1 |        1 | 10   |        -1.5 |      1 |
|  2 |        1 |  8.5 |         3.7 |      2 |
|  3 |        1 | 12.2 |         0   |      2 |
|  4 |        1 | 12.2 |        -2.2 |      3 |
|  5 |        1 | 10   |         0   |      3 |
|  6 |        1 | 10   |         3.2 |      4 |
|  7 |        1 | 13.2 |        -2.7 |      5 |
|  8 |        1 | 10.5 |        -1.5 |      5 |
|  9 |        1 |  9   |         0   |      5 |
| 10 |        1 |  9   |         0   |      5 |
| 11 |        1 |  9   |        -1   |      5 |
| 12 |        1 |  8   |         1.5 |      6 |
+----+----------+------+-------------+--------+
Now I need to fill the "Groups" field by grouping on Cost_change, which can be positive, negative, or 0.
Someone kindly suggested this query:
select id, COST_CHANGE, sum(GRP) over (order by id asc) + 1
from
(
    select *,
           case when sign(COST_CHANGE) != sign(isnull(lag(COST_CHANGE)
                         over (order by id asc), COST_CHANGE))
                 and COST_CHANGE != 0 then 1 else 0 end as GRP
    from PROD_COST
) X
But there is a problem: when 0 values fall between two positive or two negative values, the query still splits the run into separate groups. For example:
+-------------+--------+
| Cost_change | Groups |
+-------------+--------+
| 9.262 | 5777 |
| -9.262 | 5778 |
| 9.262 | 5779 |
| 0.000 | 5779 |
| 9.608 | 5780 |
| -11.231 | 5781 |
| 10.000 | 5782 |
+-------------+--------+
I need to have:
+-------------+--------+
| Cost_change | Groups |
+-------------+--------+
| 9.262 | 5777 |
| -9.262 | 5778 |
| 9.262 | 5779 |
| 0.000 | 5779 |
| 9.608 | 5779 | -- Here
| -11.231 | 5780 |
| 10.000 | 5781 |
+-------------+--------+
In other words, if there are 0 values between two positive or two negative values, they should all fall into one group, because the sequence MINUS-0-0-MINUS contains no sign change. But for MINUS-0-0-PLUS the Groups values should be 1-1-1-2, because the sign flips from negative to positive.
Thank you for your attention!
I'm using SQL Server 2012.

I think the best approach is to remove the zeros, do the calculation, and then re-insert them. So:
with pcg as (
      select pc.*, min(id) over (partition by grp) as grpid
      from (select pc.*,
                   (row_number() over (order by id) -
                    row_number() over (partition by sign(cost_change) order by id)
                   ) as grp
            from prod_cost pc
            where cost_change <> 0
           ) pc
     )
select pc.*, max(g.groups) over (order by pc.id) as groups
from prod_cost pc left join
     (select pcg.*, dense_rank() over (order by grpid) as groups
      from pcg
     ) g
     on pc.id = g.id;
The CTE assigns a group identifier based on the lowest id in each group, where the groups are bounded by actual sign changes (the zero rows having been filtered out). The dense_rank() subquery turns that identifier into a sequential number. The outer query then takes a running maximum, which carries each group's number forward onto the 0 rows.
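If you want to see what the difference-of-row-numbers step produces, you can run the inner subquery on its own (a diagnostic sketch against the PROD_COST sample above; the grp values themselves are arbitrary, all that matters is that they are equal within each run of same-signed rows):
select id, cost_change,
       row_number() over (order by id) as rn_all,
       row_number() over (partition by sign(cost_change) order by id) as rn_sign,
       row_number() over (order by id) -
           row_number() over (partition by sign(cost_change) order by id) as grp
from PROD_COST
where cost_change <> 0;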

Related

SQL group by changing column

Suppose I have a table sorted by date, like so:
+-------------+--------+
| DATE | VALUE |
+-------------+--------+
| 01-09-2020 | 5 |
| 01-15-2020 | 5 |
| 01-17-2020 | 5 |
| 02-03-2020 | 8 |
| 02-13-2020 | 8 |
| 02-20-2020 | 8 |
| 02-23-2020 | 5 |
| 02-25-2020 | 5 |
| 02-28-2020 | 3 |
| 03-13-2020 | 3 |
| 03-18-2020 | 3 |
+-------------+--------+
I want to group consecutive rows with the same value over that date range, and add a column whose number increments at each change to mark the groups.
I have tried a number of different things, such as using the lag function:
SELECT value, value - lag(value) over (order by date) as count
FROM t
GROUP BY value
In short, I want to take the table above and have it look like:
+-------------+--------+-------+
| DATE | VALUE | COUNT |
+-------------+--------+-------+
| 01-09-2020 | 5 | 1 |
| 01-15-2020 | 5 | 1 |
| 01-17-2020 | 5 | 1 |
| 02-03-2020 | 8 | 2 |
| 02-13-2020 | 8 | 2 |
| 02-20-2020 | 8 | 2 |
| 02-23-2020 | 5 | 3 |
| 02-25-2020 | 5 | 3 |
| 02-28-2020 | 3 | 4 |
| 03-13-2020 | 3 | 4 |
| 03-18-2020 | 3 | 4 |
+-------------+--------+-------+
Eventually, I want it all condensed into one small table with the earliest date for each group:
+-------------+--------+-------+
| DATE | VALUE | COUNT |
+-------------+--------+-------+
| 01-09-2020 | 5 | 1 |
| 02-03-2020 | 8 | 2 |
| 02-23-2020 | 5 | 3 |
| 02-28-2020 | 3 | 4 |
+-------------+--------+-------+
Any help would be very much appreciated.
You can use a combination of the Row_number and Dense_rank functions to get the required results, like below:
;with cte
as
(
select t.DATE, t.VALUE
,Dense_rank() over(partition by t.VALUE order by t.DATE) as d_rank
,Row_number() over(partition by t.VALUE order by t.DATE) as r_num
from t
)
Select DATE, VALUE, d_rank as count
from cte
where r_num = 1
You can use lag(), a cumulative sum, and a subquery:
SELECT value,
SUM(CASE WHEN prev_value = value THEN 0 ELSE 1 END) OVER (ORDER BY date)
FROM (SELECT t.*, LAG(value) OVER (ORDER BY date) as prev_value
FROM t
) t
Here is a db<>fiddle.
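To also get the condensed one-row-per-group table the question asks for at the end, you can wrap the query above and take the earliest date per group (a sketch building on the previous query; the alias grp is used in place of count to sidestep the reserved word, and corresponds to the COUNT column in the desired output):
SELECT MIN(date) AS date, value, grp
FROM (SELECT date, value,
             SUM(CASE WHEN prev_value = value THEN 0 ELSE 1 END) OVER (ORDER BY date) as grp
      FROM (SELECT t.*, LAG(value) OVER (ORDER BY date) as prev_value
            FROM t
           ) t
     ) t
GROUP BY value, grp
ORDER BY MIN(date);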
You can also combine the lag() and row_number() analytic functions, filtering on the inequality between the current and lagged values:
WITH t2 AS
(
    SELECT LAG(value, 1, value - 1) OVER (ORDER BY date) as lg,
           t.*
    FROM t
)
SELECT t2.date, t2.value, ROW_NUMBER() OVER (ORDER BY t2.date) as count
FROM t2
WHERE value - lg != 0
Demo
The default value - 1 in LAG() guarantees the first row differs from its (nonexistent) predecessor and is therefore kept; the query returns only the first row of each group, i.e. the condensed result table.

Obtain MIN() and MAX() over not correlative values in PostgreSQL

I have a problem that I can't find a solution to. This is my scenario:
parent_id | transaction_code | way_to_pay | type_of_receipt | unit_price | period | series | number_from | number_to | total_numbers
10 | 2444 | cash | local | 15.000 | 2018 | A | 19988 | 26010 | 10
This is the result of grouping by parent_id, transaction_code, way_to_pay, type_of_receipt, unit_price, period, series with MIN(number), MAX(number), and COUNT(number). But the grouping hides the fact that the numbers are not consecutive, because this is my child rows' situation:
parent_id | child_id | number
10 | 1 | 19988
10 | 2 | 19989
10 | 3 | 19990
10 | 4 | 19991
10 | 5 | 22001
10 | 6 | 22002
10 | 7 | 26007
10 | 8 | 26008
10 | 9 | 26009
10 | 10 | 26010
What is the magic SQL to achieve the following?
parent_id | transaction_code | way_to_pay | type_of_receipt | unit_price | period | series | number_from | number_to | total_numbers
10 | 2444 | cash | local | 15.000 | 2018 | A | 19988 | 19991 | 4
10 | 2444 | cash | local | 15.000 | 2018 | A | 22001 | 22002 | 2
10 | 2444 | cash | local | 15.000 | 2018 | A | 26007 | 26010 | 4
You can identify runs of consecutive numbers by subtracting a sequence number. It would help if you showed your query, but the idea is this:
select parent_id, transaction_code, way_to_pay, type_of_receipt, unit_price, period, series,
       min(number) as number_from, max(number) as number_to, count(*) as total_numbers
from (select t.*,
             row_number() over
                 (partition by parent_id, transaction_code, way_to_pay, type_of_receipt, unit_price, period, series
                  order by number
                 ) as seqnum
      from t
     ) t
group by parent_id, transaction_code, way_to_pay, type_of_receipt, unit_price, period, series,
         (number - seqnum);
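The trick works because, within a run of consecutive numbers, number - seqnum is constant, and the constant differs between runs. Hand-computed from the child rows above:
number | seqnum | number - seqnum
19988  |      1 |           19987
19989  |      2 |           19987
19990  |      3 |           19987
19991  |      4 |           19987
22001  |      5 |           21996
22002  |      6 |           21996
26007  |      7 |           26000
26008  |      8 |           26000
26009  |      9 |           26000
26010  |     10 |           26000
Grouping by that difference (together with the other columns) yields exactly the three ranges requested.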

Alternation of positive and negative values

Thank you for your attention.
I have a table called "PROD_COST" with 5 fields
(ID,Duration,Cost,COST_NEXT,COST_CHANGE).
I need an extra field called "Groups" for aggregation.
Duration = number of days the price is valid (1 day = 1 row).
Cost = product price on that day.
Cost_next = lead(cost,1,0).
Cost_change = Cost_next - Cost.
Now I need to group by Cost_change, which can be positive, negative, or 0.
+----+----------+------+-----------+-------------+
| ID | Duration | Cost | Cost_next | Cost_change |
+----+----------+------+-----------+-------------+
|  1 |        1 | 10   |       8.5 |        -1.5 |
|  2 |        1 |  8.5 |      12.2 |         3.7 |
|  3 |        1 | 12.2 |       5.3 |        -6.9 |
|  4 |        1 |  5.3 |       4.2 |         1.2 |
|  5 |        1 |  4.2 |       6.2 |         2   |
|  6 |        1 |  6.2 |       9.2 |         3   |
|  7 |        1 |  9.2 |       7.5 |        -2.7 |
|  8 |        1 |  7.5 |       6.2 |        -1.3 |
|  9 |        1 |  6.2 |       6.3 |         0.1 |
| 10 |        1 |  6.3 |       7.2 |         0.9 |
| 11 |        1 |  7.2 |       7.5 |         0.3 |
| 12 |        1 |  7.5 |       0   |         7.5 |
+----+----------+------+-----------+-------------+
I need a query which starts a new group at each change of sign (+ - + - + -). The last field below is what I want.
Sorry for my English.
+----+----------+------+-----------+-------------+--------+
| ID | Duration | Cost | Cost_next | Cost_change | Groups |
+----+----------+------+-----------+-------------+--------+
|  1 |        1 | 10   |       8.5 |        -1.5 |      1 |
|  2 |        1 |  8.5 |      12.2 |         3.7 |      2 |
|  3 |        1 | 12.2 |       5.3 |        -6.9 |      3 |
|  4 |        1 |  5.3 |       4.2 |         1.2 |      4 |
|  5 |        1 |  4.2 |       6.2 |         2   |      4 |
|  6 |        1 |  6.2 |       9.2 |         3   |      4 |
|  7 |        1 |  9.2 |       7.5 |        -2.7 |      5 |
|  8 |        1 |  7.5 |       6.2 |        -1.3 |      5 |
|  9 |        1 |  6.2 |       6.3 |         0.1 |      6 |
| 10 |        1 |  6.3 |       7.2 |         0.9 |      6 |
| 11 |        1 |  7.2 |       7.5 |         0.3 |      6 |
| 12 |        1 |  7.5 |       0   |         7.5 |      6 |
+----+----------+------+-----------+-------------+--------+
If you're on SQL Server 2012 you can use window functions to do this:
select id, COST_CHANGE, sum(GRP) over (order by id asc) + 1
from
(
    select *,
           case when sign(COST_CHANGE) != sign(isnull(lag(COST_CHANGE)
                         over (order by id asc), COST_CHANGE)) then 1 else 0 end as GRP
    from PROD_COST
) X
lag() fetches the value from the previous row; the case expression compares its sign to the current row's sign and returns 1 when they differ. The outer select keeps a running total of these 1s, so the total increases by one at every sign change.
The same logic works on older versions too; you just have to fetch the previous row via the id and compute the running total by re-examining all rows before the current one.
Example in SQL Fiddle
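For reference, here is what that pre-2012 variant could look like: a correlated-subquery sketch of the same running-total idea (an untested sketch, O(n^2), so suitable only for modest tables; it assumes the same PROD_COST layout as above):
select pc.id, pc.COST_CHANGE,
       (select count(*) + 1          -- running count of sign changes up to this row
        from PROD_COST p2
        where p2.id <= pc.id
          and sign(p2.COST_CHANGE) != sign(isnull(
                  (select top 1 p3.COST_CHANGE   -- manual "lag": latest row before p2
                   from PROD_COST p3
                   where p3.id < p2.id
                   order by p3.id desc), p2.COST_CHANGE))
       ) as Groups
from PROD_COST pc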
James's answer is close, but it doesn't handle the zero values correctly. That is an easy modification. One trick uses the difference between the signs of consecutive changes:
select id, COST_CHANGE, sum(IsNewGroup) over (order by id asc)
from (select pc.*,
             (case when sign(cost_change) - sign(lag(cost_change) over (order by id)) between -1 and 1
                   then 0
                   else 1 -- the NULL on the first row intentionally lands here
              end) as IsNewGroup
      from Prod_Cost pc
     ) pc
For clarity, here is a SQL Fiddle with zero values. From my reading of the question, the OP only wants a new group on an actual sign change.
This may still not be exactly right; the OP simply isn't clear about what should happen with 0 values.
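To make the case expression concrete, here are the possible sign transitions and whether each opens a new group under this rule (a new group only when the difference of signs falls outside -1..1, i.e. an actual flip between +1 and -1):
+-----------+-----------+------------+-----------------+
| prev sign | curr sign | difference | new group?      |
+-----------+-----------+------------+-----------------+
| +1        | -1        | -2         | yes             |
| -1        | +1        | +2         | yes             |
| +1        | 0         | -1         | no              |
| 0         | +1        | +1         | no              |
| -1        | 0         | +1         | no              |
| 0         | -1        | -1         | no              |
| NULL      | any       | NULL       | yes (first row) |
+-----------+-----------+------------+-----------------+
As the middle rows show, a 0 never opens a group and never lets the following value open one, which is why this variant treats MINUS-0-0-PLUS as a single group.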

Sequential Group By in sql server

For this Table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1 | 1 | 4 |
| 2 | 1 | 7 |
| 3 | 1 | 9 |
| 4 | 2 | 1 |
| 5 | 2 | 7 |
| 6 | 1 | 8 |
| 7 | 1 | 9 |
| 8 | 2 | 1 |
| 9 | 0 | 4 |
| 10 | 0 | 3 |
| 11 | 0 | 8 |
| 12 | 1 | 9 |
| 13 | 3 | 1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result.
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1 | 20 |
| 2 | 8 |
| 1 | 17 |
| 2 | 1 |
| 0 | 15 |
| 1 | 9 |
| 3 | 1 |
+--------+------------+
How can I do that in SQL Server?
NB: The values in the ID column are contiguous.
Per the tag I added to your question this is a gaps and islands problem.
The best performing solution will likely be
WITH T
AS (SELECT *,
ID - ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
FROM YourTable)
SELECT [STATUS],
SUM([VALUE]) AS [SUM(VALUE)]
FROM T
GROUP BY [STATUS],
Grp
ORDER BY MIN(ID)
If the ID values were not guaranteed contiguous as stated, you would instead need to use
ROW_NUMBER() OVER (ORDER BY [ID]) -
    ROW_NUMBER() OVER (PARTITION BY [STATUS] ORDER BY [ID]) AS Grp
in the CTE definition.
SQL Fiddle
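To see why ID - ROW_NUMBER() isolates the islands, here are the values hand-computed for the sample data above (rn is the per-Status row number; Grp stays constant within each run, and runs of the same Status get different Grp values):
+----+--------+----+-----+
| ID | Status | rn | Grp |
+----+--------+----+-----+
|  1 |      1 |  1 |   0 |
|  2 |      1 |  2 |   0 |
|  3 |      1 |  3 |   0 |
|  4 |      2 |  1 |   3 |
|  5 |      2 |  2 |   3 |
|  6 |      1 |  4 |   2 |
|  7 |      1 |  5 |   2 |
|  8 |      2 |  3 |   5 |
|  9 |      0 |  1 |   8 |
| 10 |      0 |  2 |   8 |
| 11 |      0 |  3 |   8 |
| 12 |      1 |  6 |   6 |
| 13 |      3 |  1 |  12 |
+----+--------+----+-----+
Grouping by (Status, Grp) therefore yields exactly the seven sums in the desired result.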

Oracle rank function issue

I am experiencing an issue with Oracle analytic functions.
I want the rank to be assigned sequentially, but in a cyclic fashion, and the cycle should restart within each group.
Say I have 10 groups.
Within each group, rows must be ranked 1 through 9; past 9 the rank value must start again from 1 and continue cycling to the end of the group.
+--------+------------+------------+------+
| emp id | date1      | date2      | Rank |
+--------+------------+------------+------+
| 123    | 13/6/2012  | 13/8/2021  |    1 |
| 123    | 14/2/2012  | 12/8/2014  |    2 |
| .      | .          | .          |    . |
| .      | .          | .          |    . |
| 123    | 9/10/2013  | 12/12/2015 |    9 |
| 123    | 16/10/2013 | 15/10/2013 |    1 |
| 123    | 16/3/2014  | 15/9/2015  |    2 |
+--------+------------+------------+------+
In the above example, for the rows of empid 123, the rank is split into two sub-groups: ranks 1 to 9 form one sub-group, and for the remaining rows the rank starts again from 1. How can I achieve this with Oracle's ranking functions?
As per the suggestion from Egor Skriptunoff above:
select
  empid, date1, date2
, row_number() over (order by date1, date2) as rn
, mod(row_number() over (order by date1, date2) - 1, 9) + 1 as ranked
from yourtable
Example result:
| empid | date1 | date2 | rn | ranked |
|-------|----------------------|----------------------|----|--------|
| 72232 | 2016-10-26T00:00:00Z | 2017-03-07T00:00:00Z | 1 | 1 |
| 04365 | 2016-11-03T00:00:00Z | 2017-07-29T00:00:00Z | 2 | 2 |
| 79203 | 2016-12-15T00:00:00Z | 2017-05-16T00:00:00Z | 3 | 3 |
| 68638 | 2016-12-18T00:00:00Z | 2017-02-08T00:00:00Z | 4 | 4 |
| 75784 | 2016-12-24T00:00:00Z | 2017-11-18T00:00:00Z | 5 | 5 |
| 72836 | 2016-12-24T00:00:00Z | 2018-09-10T00:00:00Z | 6 | 6 |
| 03679 | 2017-01-24T00:00:00Z | 2017-10-14T00:00:00Z | 7 | 7 |
| 43527 | 2017-02-12T00:00:00Z | 2017-01-15T00:00:00Z | 8 | 8 |
| 03138 | 2017-02-26T00:00:00Z | 2017-01-30T00:00:00Z | 9 | 9 |
| 89758 | 2017-03-29T00:00:00Z | 2018-04-12T00:00:00Z | 10 | 1 |
| 86377 | 2017-04-14T00:00:00Z | 2018-10-07T00:00:00Z | 11 | 2 |
| 49169 | 2017-04-28T00:00:00Z | 2017-04-21T00:00:00Z | 12 | 3 |
| 45523 | 2017-05-03T00:00:00Z | 2017-05-07T00:00:00Z | 13 | 4 |
SQL Fiddle
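Note that the query above numbers and cycles across all rows in date order. The question asks for the cycle to restart within each empid group; assuming that, the same idea with PARTITION BY empid added to both window functions would be (a sketch):
select
  empid, date1, date2
, row_number() over (partition by empid order by date1, date2) as rn
, mod(row_number() over (partition by empid order by date1, date2) - 1, 9) + 1 as ranked
from yourtable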