Is there a way to partition by incremental series in PostgreSQL?

In PostgreSQL, is there a way to get the result below using PARTITION BY or some other approach?
last_name  year  increment  partition
Doe        2000   1         1
Doe        2001   2         1
Doe        2002   3         1
Doe        2003  -1         2
Doe        2004   1         3
Doe        2005   2         3
Doe        2006   3         3
Doe        2007  -1         4
Doe        2008  -2         4

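SELECT last_name,
       year,
       increment,
       -- Running count of negative rows. On the sample data this yields
       -- 0,0,0,1,1,1,1,2,3 rather than the partition column shown above,
       -- because it counts every negative row instead of every sign change.
       SUM(CASE WHEN increment < 0 THEN 1 ELSE 0 END) OVER (PARTITION BY last_name ORDER BY year) AS partition
FROM your_table
ORDER BY last_name, year;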

It seems that you want to group consecutive positive/negative values together. One option is to use the difference between two row_number functions. This creates the partitions, but the group numbers are not in order.
select *,
       row_number() over (partition by last_name order by year) -
       row_number() over (partition by last_name,
                          case when increment >= 0 then 1 else 2 end
                          order by year) as prt
from tbl
order by last_name, year
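To see the group keys this produces, here is a small Python sketch that runs the same query against an in-memory SQLite database. SQLite is an assumption made only to keep the example self-contained (it needs SQLite 3.25 or later for window functions); the question itself targets PostgreSQL. The rows are the sample data from the question.

```python
import sqlite3

# Sample rows from the question; "tbl" is the table name used in the query above.
rows = [("Doe", 2000, 1), ("Doe", 2001, 2), ("Doe", 2002, 3),
        ("Doe", 2003, -1), ("Doe", 2004, 1), ("Doe", 2005, 2),
        ("Doe", 2006, 3), ("Doe", 2007, -1), ("Doe", 2008, -2)]

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (last_name text, year integer, increment integer)")
conn.executemany("insert into tbl values (?, ?, ?)", rows)

query = """
select year, increment,
       row_number() over (partition by last_name order by year) -
       row_number() over (partition by last_name,
                          case when increment >= 0 then 1 else 2 end
                          order by year) as prt
from tbl
order by last_name, year
"""

# Keep only the computed group key for each row, in year order.
prt_values = [prt for _, _, prt in conn.execute(query)]
print(prt_values)  # [0, 0, 0, 3, 1, 1, 1, 6, 6]
```

The keys 0, 3, 1 and 6 separate the four runs, but they are not sequential.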
If you want the partitions in order (1, 2, 3...), you could try another approach using lag and a running sum, as follows:
select last_name, year, increment,
       1 + sum(case when sign(increment) <> sign(pre_inc) then 1 else 0 end) over
           (partition by last_name order by year) as prt
from
(
    select *,
           lag(increment, 1, increment) over
               (partition by last_name order by year) pre_inc
    from tbl
) t
order by last_name, year
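The lag-and-running-sum logic can be checked by hand with a short Python sketch. This is an illustration of the same idea in plain Python, not a replacement for the query; the list holds the question's increment values in year order, and the helper name `sign` is introduced here for the example.

```python
from itertools import accumulate

# Increment values from the question, already ordered by year.
increments = [1, 2, 3, -1, 1, 2, 3, -1, -2]


def sign(n):
    """Return -1, 0 or 1, like SQL's sign()."""
    return (n > 0) - (n < 0)


# lag(increment, 1, increment): the first row is compared with itself.
previous = [increments[0]] + increments[:-1]

# 1 where the sign differs from the previous row, else 0.
changes = [1 if sign(cur) != sign(prev) else 0
           for cur, prev in zip(increments, previous)]

# Running sum of the change flags, shifted so the first group is 1.
partitions = [1 + total for total in accumulate(changes)]
print(partitions)  # [1, 1, 1, 2, 3, 3, 3, 4, 4]
```

The result matches the partition column requested in the question.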

Each row is marked 1 if its increment is greater than or equal to the previous year's increment, and 0 otherwise. We then use LAG to compare each mark with the previous row's mark and start a new group whenever the mark changes, regardless of whether the increment is positive or negative.
with cte as (
    select *,
           row_number() over (partition by last_name order by year) as row_num,
           case when increment >= LAG(increment, 1, 0) over (partition by last_name order by year)
                then 1 else 0 end rank_num
    from mytable
),
cte2 as (
    select *, LAG(rank_num, 1, 1) over (partition by last_name order by year) as pre
    from cte
    order by year
)
select last_name, year, increment,
       1 + sum(case when pre <> rank_num then 1 else 0 end) over
           (partition by last_name order by year) as partition
from cte2;

Related

Selecting rows that have row_number more than 1

I have a table as follows (using BigQuery):
id   year  month  sales  row_number
111  2020  11     1000   1
111  2020  12     2000   2
112  2020  11     3000   1
113  2020  11     1000   1
Is there a way to select all rows for the ids that have a row number greater than one?
For example, my desired output is:
id   year  month  sales  row_number
111  2020  11     1000   1
111  2020  12     2000   2
I don't want only the rows with row_number = 2; I also want the rows with row_number = 1 for the same id.
The original query I used to produce the first table is:
SELECT
    id,
    year,
    month,
    SUM(sales) AS sales,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY id ASC) AS row_number
FROM
    table
GROUP BY
    id, year, month

You can use window functions:
select t.* except (cnt)
from (select t.*,
             count(*) over (partition by id) as cnt
      from t
     ) t
where cnt > 1;
As applied to your aggregation query:
SELECT iym.* EXCEPT (cnt)
FROM (SELECT id, year, month,
             SUM(sales) as sales,
             ROW_NUMBER() OVER (Partition by id ORDER BY id ASC) AS row_number,
             COUNT(*) OVER (Partition by id ORDER BY id ASC) AS cnt
      FROM table
      GROUP BY id, year, month
     ) iym
WHERE cnt > 1;
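The idea behind `count(*) over (partition by id)` is to attach the per-id row count to every row and then keep the rows whose count exceeds one. A minimal Python sketch of that filter, using the sample rows from the question (the dictionary keys simply mirror the column names):

```python
from collections import Counter

# Aggregated rows from the question (one row per id/year/month).
rows = [
    {"id": 111, "year": 2020, "month": 11, "sales": 1000},
    {"id": 111, "year": 2020, "month": 12, "sales": 2000},
    {"id": 112, "year": 2020, "month": 11, "sales": 3000},
    {"id": 113, "year": 2020, "month": 11, "sales": 1000},
]

# Equivalent of count(*) over (partition by id): rows per id.
cnt = Counter(row["id"] for row in rows)

# Equivalent of "where cnt > 1": keep every row of ids that occur more than once.
result = [row for row in rows if cnt[row["id"]] > 1]

for row in result:
    print(row)
```

Only the two rows for id 111 remain, which is the desired output from the question.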

You can wrap your query as in the example below:
select * except(flag) from (
    select *, countif(row_number > 1) over(partition by id) > 0 flag
    from (YOUR_ORIGINAL_QUERY)
)
where flag
so it can look like this:
select * except(flag) from (
    select *, countif(row_number > 1) over(partition by id) > 0 flag
    from (
        SELECT id,
               year,
               month,
               SUM(sales) as sales,
               ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
        FROM table
        GROUP BY id, year, month
    )
)
where flag
When applied to the sample data in your question, it returns the desired output: the two rows for id 111.

Try this:
with tmp as (SELECT id,
                    year,
                    month,
                    SUM(sales) as sales,
                    ROW_NUMBER() OVER(Partition by id ORDER BY id ASC) AS row_number
             FROM table
             GROUP BY id, year, month)
select * from tmp a where exists (select 1 from tmp b where a.id = b.id and b.row_number = 2)
This is a straightforward use of EXISTS.

This is what I use. It's similar to @ElapsedSoul's answer, but as I understand it, IN is better than EXISTS for a static list. I'm not sure whether the performance difference, if any, is significant:
Difference between EXISTS and IN in SQL?
WITH T1 AS
(
    SELECT
        id,
        year,
        month,
        SUM(sales) as sales,
        ROW_NUMBER() OVER(PARTITION BY id ORDER BY id ASC) AS ROW_NUM
    FROM table
    GROUP BY id, year, month
)
SELECT *
FROM T1
WHERE id IN (SELECT id FROM T1 WHERE ROW_NUM > 1);

Convert number sequence format so that it is hyphenated

I have a sequence of numbers that needs to be rendered with hyphens, but I'm not sure how best to do this when selecting from the SQL database.
The expected result:
Peter: 1,3-7,10,11,13
Andrew: 1-3
Paul: 1-3
An example of the data from the table (small selection):
NAME #
Peter 1
Andrew 1
Paul 1
Andrew 2
Paul 2
Peter 3
Andrew 3
Paul 3
Peter 4
Peter 5
Peter 6
Peter 7

This is part gaps-and-islands and part string aggregation. This identifies the groupings:
select name,
       (case when min(number) = max(number)
             then convert(varchar(max), min(number))
             else concat(min(number), '-', max(number))
        end) as range
from (select name, number,
             row_number() over (partition by name order by number) as seqnum
      from t
     ) t
group by name, (number - seqnum);
With this, you can add an additional level of aggregation to get the final result:
select name,
       string_agg(range, ',') within group (order by min_number) as col
from (select name, min(number) as min_number,
             (case when min(number) = max(number)
                   then convert(varchar(max), min(number))
                   else concat(min(number), '-', max(number))
              end) as range
      from (select name, number,
                   row_number() over (partition by name order by number) as seqnum
            from t
           ) t
      group by name, (number - seqnum)
     ) n
group by name;
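The grouping key `number - seqnum` is constant within a run of consecutive numbers, which is what lets each run collapse into one range. A Python sketch of the same two steps (find the runs, then join them per name), using only the small data selection shown in the question; the function name `hyphenate` is made up for this example.

```python
from itertools import groupby

# The small selection of (name, number) rows shown in the question.
data = [("Peter", 1), ("Andrew", 1), ("Paul", 1), ("Andrew", 2), ("Paul", 2),
        ("Peter", 3), ("Andrew", 3), ("Paul", 3), ("Peter", 4), ("Peter", 5),
        ("Peter", 6), ("Peter", 7)]


def hyphenate(numbers):
    """Collapse integers into comma-separated values and 'low-high' ranges."""
    ordered = sorted(set(numbers))
    parts = []
    # seqnum is the 1-based position; number - seqnum identifies each run.
    for _, run in groupby(enumerate(ordered, start=1), key=lambda p: p[1] - p[0]):
        values = [number for _, number in run]
        low, high = values[0], values[-1]
        parts.append(str(low) if low == high else f"{low}-{high}")
    return ",".join(parts)


# Collect the numbers per name, then format each list.
by_name = {}
for name, number in data:
    by_name.setdefault(name, []).append(number)

ranges = {name: hyphenate(numbers) for name, numbers in by_name.items()}
for name, text in ranges.items():
    print(f"{name}: {text}")
```

Like the query above, this sketch writes a run of two numbers as a range (for example 10-11), whereas the expected result in the question lists 10,11 separately. If that matters, a run could be hyphenated only when it has three or more members.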

How to calculate unique rank in SQL Server (without any duplication)?

I want to calculate unique rankings but I get duplicate rankings
Here's my attempt:
SELECT
    TG.EMPCODE,
    DENSE_RANK() OVER (ORDER BY TS.COUNT_DEL DESC, TG.COUNT_TG DESC) AS YOUR_RANK
FROM
    (SELECT
         EmpCode,
         SUM(CASE WHEN Tgenerate = 1 THEN 1 ELSE 0 END) AS COUNT_TG
     FROM
         TBLTGENERATE1
     GROUP BY
         EMPCODE) TG
INNER JOIN
    (SELECT
         EMP_CODE,
         SUM(CASE WHEN STATUS = 'DELIVERED' THEN 1 ELSE 0 END) AS COUNT_DEL
     FROM
         TBLSTAT
     GROUP BY
         EMP_CODE) TS ON TG.EMPCODE = TS.EMP_CODE;
The output I get is like this:
EID Rank
---------
102 1
105 2
101 2
103 3
106 4
105 and 101 have the same rank.
How do I calculate unique ranking?

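Use ROW_NUMBER() instead of DENSE_RANK():
SELECT TG.EMPCODE,
       ROW_NUMBER() OVER (ORDER BY TS.COUNT_DEL DESC, TG.COUNT_TG DESC) AS YOUR_RANK
-- the FROM and JOIN clauses stay the same as in your query
Ties will then be given sequential rankings, in an arbitrary order unless you add a tie-breaker such as TG.EMPCODE to the ORDER BY.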
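To compare the two functions side by side, here is a Python sketch that runs both against an in-memory SQLite database (an assumption for the sake of a runnable example; it needs SQLite 3.25 or later, and the question targets SQL Server). The table name `scores` and the counts are hypothetical, chosen only so that 105 and 101 tie as in the question. The extra `empcode` sort key is an added tie-breaker that makes the row numbers deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table scores (empcode integer, count_del integer, count_tg integer)"
)
# Hypothetical counts: 105 and 101 have identical values, so they tie.
conn.executemany(
    "insert into scores values (?, ?, ?)",
    [(102, 9, 4), (105, 7, 3), (101, 7, 3), (103, 5, 2), (106, 1, 1)],
)

query = """
select empcode,
       dense_rank() over (order by count_del desc, count_tg desc) as dense_rnk,
       row_number() over (order by count_del desc, count_tg desc, empcode) as unique_rnk
from scores
order by unique_rnk
"""
ranked = conn.execute(query).fetchall()
for empcode, dense_rnk, unique_rnk in ranked:
    print(empcode, dense_rnk, unique_rnk)
```

With these made-up counts, dense_rank gives both tied employees rank 2, while row_number gives every row a distinct value from 1 to 5.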

Sql query to Count Total Consecutive Years from latest year

I have a table Temp:
CREATE TABLE Temp
(
    [ID] [int],
    [Year] [INT]
)
ID Year
1 2016
1 2016
1 2015
1 2012
1 2011
1 2010
2 2016
2 2015
2 2014
2 2012
2 2011
2 2010
2 2009
3 2016
3 2015
3 2004
3 1999
4 2016
4 2015
4 2014
4 2010
5 2016
5 2014
5 2013
I want to calculate the total consecutive years starting from the most recent Year.
Result should look like this:
ID Total Consecutive Yrs
1 2
2 3
3 2
4 3
5 1

select ID, year,
       -- distance from the most recent year, plus 1: gap-free only while the years are consecutive
       first_value(year) over (partition by ID order by year desc) - year + 1 as x,
       -- returns a sequence without gaps
       row_number() over (partition by ID order by year desc) as rn
from Temp
e.g. for ID=1 (ignoring the duplicate 2016 row):
ID  year  x  rn
1   2016  1  1
1   2015  2  2
1   2012  5  3
1   2011  6  4
1   2010  7  5
As long as there's no gap, both sequences increase in step.
Now check for equal sequences and count the rows:
with cte as
(
    select ID,
           -- returns a sequence without gaps for consecutive years
           first_value(year) over (partition by ID order by year desc) - year + 1 as x,
           -- returns a sequence without gaps
           row_number() over (partition by ID order by year desc) as rn
    from Temp
)
select ID, count(*)
from cte
where x = rn -- no gap
group by ID
Edit:
Based on your year zero comment:
with cte as
(
    select ID, year,
           -- returns a sequence without gaps for consecutive years
           first_value(year) over (partition by ID order by year desc) - year + 1 as x,
           -- returns a sequence without gaps
           row_number() over (partition by ID order by year desc) as rn
    from Temp
)
select ID,
       -- remove the year zero from counting
       sum(case when year <> 0 then 1 else 0 end)
from cte
where x = rn
group by ID
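The `x = rn` comparison can be traced in plain Python. The sketch below is an illustration of the same idea, not the query itself: it uses the rows from the question and, as an added assumption, removes duplicate years first (ID 1 has 2016 twice) so that the row number and the year distance stay aligned. The function name `streak_from_latest` is made up for this example.

```python
from collections import defaultdict

# (ID, Year) rows from the question, including the duplicate (1, 2016).
rows = [(1, 2016), (1, 2016), (1, 2015), (1, 2012), (1, 2011), (1, 2010),
        (2, 2016), (2, 2015), (2, 2014), (2, 2012), (2, 2011), (2, 2010), (2, 2009),
        (3, 2016), (3, 2015), (3, 2004), (3, 1999),
        (4, 2016), (4, 2015), (4, 2014), (4, 2010),
        (5, 2016), (5, 2014), (5, 2013)]

# Distinct years per ID; the set drops the duplicate row.
years_by_id = defaultdict(set)
for id_, year in rows:
    years_by_id[id_].add(year)


def streak_from_latest(years):
    """Count the years that are consecutive starting from the most recent one."""
    ordered = sorted(years, reverse=True)
    latest = ordered[0]
    # rn is the 1-based position, x = latest - year + 1; they agree only
    # until the first gap, after which x stays ahead of rn.
    return sum(1 for rn, year in enumerate(ordered, start=1)
               if latest - year + 1 == rn)


result = {id_: streak_from_latest(years) for id_, years in years_by_id.items()}
print(result)  # {1: 2, 2: 3, 3: 2, 4: 3, 5: 1}
```

The counts match the expected result listed in the question.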

You can use LEAD to get these counts, as below:
Select top (1) with ties Id, RowN as [Total Consecutive Years]
from (
    Select *,
           Num = case when ([Year] - lead([Year]) over (partition by Id order by [Year] desc) > 1) then 0 else 1 end,
           RowN = Row_Number() over (partition by Id order by [Year] desc)
    from temp
) a
where a.Num = 0
order by row_number() over (partition by Id order by RowN)
Output as below:
+----+-------------------------+
| Id | Total Consecutive Years |
+----+-------------------------+
| 1 | 2 |
| 2 | 3 |
| 3 | 2 |
| 4 | 3 |
| 5 | 1 |
+----+-------------------------+

You can do this using window functions:
select id, count(distinct year)
from (select t.*,
             dense_rank() over (partition by id order by year + seqnum desc) as grp
      from (select t.*,
                   dense_rank() over (partition by id order by year desc) as seqnum
            from temp t
           ) t
     ) t
where grp = 1
group by id;
This assumes that "most recent year" is per id.

Gordon Linoff,
Your code is awesome!
Your code pulls consecutive years from the most recent year.
I modified it to pull overall max consecutive years.
Posted here in case anyone else needs it:
--overall max consecutive years
select id, max(yr_cnt) max_consecutive_years
from (
    -- count(distinct year) so that duplicate rows for the same year are not counted twice
    select id, grp, count(distinct year) yr_cnt
    from (select t.*,
                 dense_rank() over (partition by id order by year + seqnum desc) as grp
          from (select t.*,
                       dense_rank() over (partition by id order by year desc) as seqnum
                from temp t
               ) t
         ) t
    group by id, grp) t2
group by id;

Find date sequence in SQL Server

I'm trying to find the maximum sequence of days by customer in my data.
I want to find the longest run of consecutive days on which a specific customer placed orders. If someone used my app on 25/08/16, 26/08/16, 27/08/16, 01/09/16 and 02/09/16, the max sequence is 3 days (25, 26, 27).
In the end (the output), I want two fields: custid | MaxDaySequence
I have the following fields in my data table:
custid | orderdate (timestamp)
For example:
custid orderdate
1 25/08/2007
1 03/10/2007
1 13/10/2007
1 15/01/2008
1 16/03/2008
1 09/04/2008
2 18/09/2006
2 08/08/2007
2 28/11/2007
2 04/03/2008
3 27/11/2006
3 15/04/2007
3 13/05/2007
3 19/06/2007
3 22/09/2007
3 25/09/2007
3 28/01/2008
I'm using SQL Server 2014.
Thanks

There is a trick: if you have an incrementing number ordered by your date, then subtracting that number of days from each date gives the same result for consecutive dates. Like this:
SELECT custid,
       min(orderdate) as start_of_group,
       max(orderdate) as end_of_group,
       count(*) as num_days
FROM (
    SELECT custid, orderdate,
           ROW_NUMBER() OVER (PARTITION BY custid ORDER BY orderdate) as rn
    FROM your_table  -- placeholder: the question does not name the table
) x
GROUP BY custid, dateadd(day, - rn, orderdate);
You could take the result of this and pull out the max number of days to solve your problem:
SELECT custid, max(num_days) as longest
FROM (
    SELECT custid,
           count(*) as num_days
    FROM (
        SELECT custid, orderdate,
               ROW_NUMBER() OVER (PARTITION BY custid ORDER BY orderdate) as rn
        FROM your_table  -- placeholder: the question does not name the table
    ) x
    GROUP BY custid, dateadd(day, - rn, orderdate)
) y
GROUP BY custid
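The date-minus-row-number trick can be checked with a short Python sketch. It uses the five dates from the question's own example; assigning them to custid 1 is an assumption for illustration, and the function name `longest_streak` is made up. Dates are de-duplicated first, since several orders on one day would otherwise break the alignment between date and row number.

```python
from datetime import date, timedelta
from itertools import groupby

# Dates from the question's narrative example (25-27 Aug and 1-2 Sep 2016),
# attached to a hypothetical custid 1.
visits = {
    1: [date(2016, 8, 25), date(2016, 8, 26), date(2016, 8, 27),
        date(2016, 9, 1), date(2016, 9, 2)],
}


def longest_streak(dates):
    """Length of the longest run of consecutive calendar days."""
    ordered = sorted(set(dates))
    # Subtracting rn days from the rn-th date gives the same anchor date
    # for every member of a consecutive run.
    runs = groupby(enumerate(ordered, start=1),
                   key=lambda p: p[1] - timedelta(days=p[0]))
    return max((len(list(run)) for _, run in runs), default=0)


longest = {custid: longest_streak(dates) for custid, dates in visits.items()}
print(longest)  # {1: 3}
```

The three days from 25 to 27 August form the longest run, as described in the question.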

If you want to solve it with MySQL:
select user_id, max(num_days) as longest
from (
    select user_id, count(*) as num_days
    from
    (
        SELECT (CASE a1.user_id
                    WHEN @curType
                    THEN @curRow := @curRow + 1
                    ELSE @curRow := 1 AND @curType := a1.user_id END
               ) AS rank,
               a1.user_id,
               a1.last_update as dat
        FROM (select a2.user_id, left(FROM_UNIXTIME(a2.last_update), 10) as 'last_update'
              from visits as a2 group by 1, 2) as a1,
             (SELECT @curRow := 0, @curType := '') r
        ORDER BY a1.user_id DESC, dat) x
    group by user_id, DATE_ADD(dat, INTERVAL -rank day)
) y
group by 1
order by longest desc
