Find gaps of a sequence in PostgreSQL tables

I have a table invoices with a field invoice_number. This is what I get when I execute select invoice_number from invoices:
invoice_number
1
2
3
5
6
10
11
I want a SQL query that gives me the following result:
gap_start gap_end
1 3
5 6
10 11

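You can use the row_number() window function to number the rows and use the difference between each value and its row number as the grouping criterion; that difference stays constant within a run of consecutive values (the query below uses a table t with a column invoice):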
SELECT
    MIN(invoice) AS start,
    MAX(invoice) AS end
FROM (
    SELECT
        *,
        invoice - row_number() OVER (ORDER BY invoice) AS group_id
    FROM t
) s
GROUP BY group_id
ORDER BY start
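As a rough, self-contained check of the row_number() approach, the sketch below runs the same query shape against the sample data from the question. It assumes SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in for PostgreSQL; the table name invoices and the aliases range_start and range_end are illustrative choices, not part of the original answer.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (invoice_number INTEGER)")
conn.executemany(
    "INSERT INTO invoices VALUES (?)",
    [(n,) for n in (1, 2, 3, 5, 6, 10, 11)],
)

ranges = conn.execute(
    """
    SELECT MIN(invoice_number) AS range_start,
           MAX(invoice_number) AS range_end
    FROM (
        SELECT invoice_number,
               -- value minus row number is constant within a consecutive run
               invoice_number - ROW_NUMBER() OVER (ORDER BY invoice_number) AS group_id
        FROM invoices
    ) AS s
    GROUP BY group_id
    ORDER BY range_start
    """
).fetchall()

print(ranges)  # [(1, 3), (5, 6), (10, 11)]
```

For the sample data the differences are 0, 0, 0, 1, 1, 4, 4, which is why three groups come out.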

Related

Running total for every 4 rows in Snowflake

I have one column: NMV
I want to calculate the cumulative sum for it but like this:
nmv cumsum
1 1
2 3
3 6
4 10 ---stops here
5 5 --- starts again
6 11
7 18
Using helper column:
CREATE OR REPLACE TABLE tab(nmv INT);
INSERT INTO tab(nmv) VALUES (1),(2),(3),(4),(5),(6),(7);
WITH cte AS (
    SELECT *, CEIL(ROW_NUMBER() OVER(ORDER BY NMV)/4) AS subgrp
    FROM tab
)
SELECT *, SUM(NMV) OVER(PARTITION BY subgrp ORDER BY NMV) AS cumsum
FROM cte;
Output:
NMV SUBGRP CUMSUM
1 1 1
2 1 3
3 1 6
4 1 10
5 2 5
6 2 11
7 2 18
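Alternatively, if the nmv values are consecutive integers starting at 1, you can partition on the value itself:
select nmv, sum(nmv) over (partition by floor((nmv - 1) / 4) order by nmv) as cumsum
from tab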
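The helper-column idea can be tried locally with a minimal sketch like the one below. It assumes SQLite 3.25 or later via Python's sqlite3 module rather than Snowflake. Because SQLite truncates integer division, the subgroup is computed as (row number - 1) / 4 instead of CEIL(row number / 4); that substitution is an adaptation for this sketch, not the original answer's expression.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (nmv INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?)", [(n,) for n in range(1, 8)])

running = conn.execute(
    """
    WITH cte AS (
        SELECT nmv,
               -- rows 1-4 get subgroup 0, rows 5-8 get subgroup 1, and so on
               (ROW_NUMBER() OVER (ORDER BY nmv) - 1) / 4 AS subgrp
        FROM tab
    )
    SELECT nmv,
           SUM(nmv) OVER (PARTITION BY subgrp ORDER BY nmv) AS cumsum
    FROM cte
    ORDER BY nmv
    """
).fetchall()

for nmv, cumsum in running:
    print(nmv, cumsum)
```

The running sum restarts at the fifth row because the partition changes there.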

How to select top 2 values for each id

I have a table with values
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
1 1 "2015-01-01"
1 1 "2015-01-01"
2 7 "2015-01-05"
2 6 "2015-01-04"
2 4 "2015-01-03"
3 11 "2015-01-08"
3 10 "2015-01-07"
3 9 "2015-01-06"
3 8 "2015-01-05"
I want to select top two values of each id as shown in desired output.
Desired output:
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
2 7 "2015-01-05"
2 6 "2015-01-04"
3 11 "2015-01-08"
3 10 "2015-01-07"
My attempt:
select transactions.salesperson_id, transactions.id, transactions.date
from transactions
ORDER BY transactions.salesperson_id ASC, transactions.date DESC;
Can someone help me with this? Thank you in advance!
This can be done using window functions:
select id, sales, "date"
from (
    select id, sales, "date",
           dense_rank() over (partition by id order by "date" desc) as rnk
    from transactions
) t
where rnk <= 2;
If there are multiple rows on the same date, this might return more than two rows for the same ID. If you don't want that, use row_number() instead of dense_rank().
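Alternatively, row_number() will get what you want (note the descending order, so that the latest dates come first):
select id, sales, "date" from
(select id, sales, "date", row_number() over (partition by id order by "date" desc) as rn from transactions) t1
where t1.rn <= 2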
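To see how dense_rank() and row_number() differ when dates tie, here is a small sketch using the question's data. It assumes SQLite 3.25 or later via Python's sqlite3 module; the helper top_n is hypothetical and exists only for this illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE transactions (id INTEGER, sales INTEGER, "date" TEXT)')
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        (1, 5, "2015-01-04"), (1, 3, "2015-01-03"),
        (1, 1, "2015-01-01"), (1, 1, "2015-01-01"),
        (2, 7, "2015-01-05"), (2, 6, "2015-01-04"), (2, 4, "2015-01-03"),
        (3, 11, "2015-01-08"), (3, 10, "2015-01-07"),
        (3, 9, "2015-01-06"), (3, 8, "2015-01-05"),
    ],
)


def top_n(rank_function, n):
    """Return the top n rows per id by date, ranked with the given window function.

    rank_function is interpolated into the SQL text, so pass only trusted
    literals such as "dense_rank" or "row_number".
    """
    query = f"""
        SELECT id, sales, "date"
        FROM (
            SELECT id, sales, "date",
                   {rank_function}() OVER (PARTITION BY id ORDER BY "date" DESC) AS rnk
            FROM transactions
        ) AS t
        WHERE rnk <= ?
        ORDER BY id, "date" DESC
    """
    return conn.execute(query, (n,)).fetchall()


top2 = top_n("dense_rank", 2)
print(top2)
# With n = 3 the tie on 2015-01-01 for id 1 matters:
print(len(top_n("dense_rank", 3)), len(top_n("row_number", 3)))
```

With n = 2 both functions return the same six rows for this data, since the duplicated date is not among the two latest; the difference only shows up once the tied rows fall inside the cut-off.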

SQL query to find counts of numbers in running total

Suppose the table has 1 column ID and the values are as below:
ID
5
5
5
6
5
5
6
6
the output should be
ID count
5 3
6 1
5 2
6 2
How can we do that in a single SQL query?
If you only need the total number of records, you can write:
select count(*) from table_name;
In relational databases, the rows of a table have no inherent order; see https://en.wikipedia.org/wiki/Table_(database):
the database system does not guarantee any ordering of the rows unless
an ORDER BY clause is specified in the SELECT statement that queries
the table.
Therefore, to get the desired result, the table must have an additional column that defines the order of the rows (and can be used in the ORDER BY clause).
In the example below, the rn column defines such an order:
select * from tab123 ORDER BY rn;
RN ID
---------- -------
1 5
2 5
3 5
4 6
5 5
6 5
7 6
8 6
Starting with Oracle 12c, the new MATCH_RECOGNIZE clause can be used:
select * from tab123
match_recognize(
    order by rn
    measures
        strt.id as id,
        count(*) as cnt
    one row per match
    after match skip past last row
    pattern( strt ss* )
    define ss as ss.id = prev( ss.id )
);
On earlier versions that support window functions (Oracle 10 and above), you can combine two window functions, LAG ... OVER and SUM ... OVER, like this:
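select max( id ) as id, count(*) as cnt
FROM (
    -- yyy: running count of run starts, i.e. a number identifying each run
    select id, sum( xxx ) over (order by rn ) as yyy
    from (
        -- xxx: 1 when the id differs from the previous row (a new run starts), else 0
        select t.*,
               case lag( id ) over (order by rn )
                   when id then 0 else 1 end as xxx
        from tab123 t
    )
)
GROUP BY yyy
ORDER BY yyy;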
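The LAG plus running SUM technique is not Oracle-specific. As a minimal sketch, the version below runs it on SQLite 3.25 or later through Python's sqlite3 module (an assumption made so the example is runnable anywhere); the column names is_new_run and run_no are illustrative replacements for xxx and yyy.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab123 (rn INTEGER, id INTEGER)")
conn.executemany(
    "INSERT INTO tab123 VALUES (?, ?)",
    list(enumerate([5, 5, 5, 6, 5, 5, 6, 6], start=1)),
)

runs = conn.execute(
    """
    SELECT MAX(id) AS id, COUNT(*) AS cnt
    FROM (
        -- running total of run starts gives every run its own number
        SELECT id, SUM(is_new_run) OVER (ORDER BY rn) AS run_no
        FROM (
            -- flag rows whose id differs from the previous row (NULL on the first row)
            SELECT rn, id,
                   CASE WHEN id = LAG(id) OVER (ORDER BY rn) THEN 0 ELSE 1 END AS is_new_run
            FROM tab123
        ) AS flagged
    ) AS numbered
    GROUP BY run_no
    ORDER BY run_no
    """
).fetchall()

print(runs)  # [(5, 3), (6, 1), (5, 2), (6, 2)]
```

For the sample rows the flags are 1, 0, 0, 1, 1, 0, 1, 0, so the run numbers are 1, 1, 1, 2, 3, 3, 4, 4.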

Oracle Nested Grouping

The question is: For each day, list the User ID who has read the most number of messages.
user_id msgID read_date
1 1 10
1 2 10
2 2 10
2 2 23
3 2 23
I believe the date is an outer group and user_id is an inner group, but how do I nest groups in SQL? Or can this somehow be avoided?
This is a task for a Window Function:
select *
from
(
    select user_id, read_date, count(*) as cnt,
           rank()
           over (partition by read_date   -- each day
                 order by count(*) desc) as rnk   -- maximum number
    from tab
    group by user_id, read_date
) dt
where rnk = 1
This might return multiple users for one day if they share the same maximum count; if you want just one (chosen arbitrarily), switch to ROW_NUMBER.
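An alternative without window functions keeps, for each day, the users whose count equals that day's maximum:
select read_date, user_id, cnt
from
(
    select read_date, user_id, count(msgID) as cnt
    from tab
    group by read_date, user_id
) c
where cnt = (
    select max(count(msgID))
    from tab t
    where t.read_date = c.read_date
    group by t.user_id
);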
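The ranking approach can be checked against the question's five rows with a short sketch. It assumes SQLite 3.25 or later via Python's sqlite3 module instead of Oracle, uses a hypothetical table name reads, and computes the per-day counts in a CTE first rather than ranking directly on count(*); that split is a variation chosen for portability, not the original answer's exact query.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reads (user_id INTEGER, msgID INTEGER, read_date INTEGER)")
conn.executemany(
    "INSERT INTO reads VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 10), (2, 2, 10), (2, 2, 23), (3, 2, 23)],
)

top_readers = conn.execute(
    """
    WITH counts AS (
        -- inner grouping: messages read per user per day
        SELECT read_date, user_id, COUNT(*) AS cnt
        FROM reads
        GROUP BY read_date, user_id
    )
    SELECT read_date, user_id, cnt
    FROM (
        -- outer step: rank users within each day by their count
        SELECT read_date, user_id, cnt,
               RANK() OVER (PARTITION BY read_date ORDER BY cnt DESC) AS rnk
        FROM counts
    ) AS ranked
    WHERE rnk = 1
    ORDER BY read_date, user_id
    """
).fetchall()

print(top_readers)
```

On day 23 two users are tied with one message each, so RANK keeps both of them.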

Divide data in SQL into groups ordered by another column

I have this set of data
shopId companyId date
1 1 25/8/2015
2 1 26/8/2015
3 1 22/8/2015
4 2 20/8/2015
5 2 27/8/2015
What I need is this result:
shopId companyId date dense_rank
1 2 27/8/2015 1
2 2 20/8/2015 1
3 1 26/8/2015 2
4 1 25/8/2015 2
5 1 22/8/2015 2
How can I rank all the groups but order them by date?
SELECT *
, DENSE_RANK() OVER (ORDER BY companyId DESC, [Date] DESC) AS [DENSE_RANK]
FROM TableName
If you want the groups ordered by date, then you need two steps: first get the maximum date for each group. Then use dense_rank():
select shopid, companyid, date,
       dense_rank() over (order by maxd desc) as dense_rank
from (select t.*, max(date) over (partition by companyid) as maxd
      from TableName t
     ) t
Note: this assumes that your date is really stored as a date and not as a string. You will need additional transformations if the data is (improperly) stored as a string.
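The two-step idea (maximum date per group, then dense_rank() over that maximum) can be sketched as follows. This assumes SQLite 3.25 or later via Python's sqlite3 module, a hypothetical table name shops, and dates stored as ISO strings (YYYY-MM-DD) so that text comparison orders them correctly.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE shops (shopId INTEGER, companyId INTEGER, "date" TEXT)')
conn.executemany(
    "INSERT INTO shops VALUES (?, ?, ?)",
    [
        (1, 1, "2015-08-25"),
        (2, 1, "2015-08-26"),
        (3, 1, "2015-08-22"),
        (4, 2, "2015-08-20"),
        (5, 2, "2015-08-27"),
    ],
)

ranked = conn.execute(
    """
    SELECT shopId, companyId, "date",
           -- step 2: rank the groups by their latest date
           DENSE_RANK() OVER (ORDER BY maxd DESC) AS grp_rank
    FROM (
        -- step 1: attach each company's latest date to all of its rows
        SELECT s.*, MAX("date") OVER (PARTITION BY companyId) AS maxd
        FROM shops AS s
    ) AS t
    ORDER BY grp_rank, "date" DESC
    """
).fetchall()

for row in ranked:
    print(row)
```

Company 2 comes first because its latest date (27/8/2015) is later than company 1's (26/8/2015). Two companies sharing the same latest date would receive the same rank in this sketch.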