SQL query to find counts of numbers in running total - sql

Suppose the table has 1 column ID and the values are as below:
ID
5
5
5
6
5
5
6
6
The output should be
ID count
5 3
6 1
5 2
6 2
How can we do that in a single SQL query?

If you only want the total number of records in the table, you can write:
select count(*) from table_name;

In relational databases, the rows of a table have no inherent order; see https://en.wikipedia.org/wiki/Table_(database):
the database system does not guarantee any ordering of the rows unless
an ORDER BY clause is specified in the SELECT statement that queries
the table.
Therefore, to get the desired results, the table must have an additional column that defines the order of the rows (and can be used in an ORDER BY clause).
In the example below, the rn column defines such an order:
select * from tab123 ORDER BY rn;
RN ID
---------- -------
1 5
2 5
3 5
4 6
5 5
6 5
7 6
8 6
Starting from Oracle version 12c, the new MATCH_RECOGNIZE clause can be used:
select * from tab123
match_recognize(
  order by rn
  measures
    strt.id as id,
    count(*) as cnt
  one row per match
  after match skip past last row
  pattern( strt ss* )
  define ss as ss.id = prev( ss.id )
);
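As a rough illustration of what the pattern strt ss* matches, here is a small Python sketch. It is not Oracle code, and the list literal is simply the sample ids in rn order. It collapses each run of equal consecutive ids into one (id, cnt) pair, which is what "one row per match" should produce for this data.

```python
from itertools import groupby

# Sample ids in rn order, copied from the table above.
ids = [5, 5, 5, 6, 5, 5, 6, 6]

# groupby() starts a new group whenever the value changes, which mirrors
# "strt ss*": one starting row followed by rows equal to their predecessor.
runs = [(key, sum(1 for _ in grp)) for key, grp in groupby(ids)]

for run_id, cnt in runs:
    print(run_id, cnt)
```

The order of the input matters here just as it does in SQL: shuffle the list and the runs change, which is why the rn column is needed.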
On earlier versions that support window functions (Oracle 10 and above), you can combine two window functions, LAG ... OVER and SUM ... OVER, like this:
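select max( id ) as id, count(*) as cnt
FROM (
  -- yyy: running sum of the change flags, i.e. a group number per run
  select id, sum( xxx ) over (order by rn ) as yyy
  from (
    -- xxx: 0 when id equals the previous row's id, otherwise 1
    select t.*,
           case lag( id ) over (order by rn )
             when id then 0 else 1 end as xxx
    from tab123 t
  )
)
GROUP BY yyy
ORDER BY yyy;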
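To try the LAG / SUM query without an Oracle instance, one option is to run it through Python's built-in sqlite3 module. This is only a sketch under assumptions: it needs an SQLite build with window functions (3.25 or later), it recreates tab123 from the sample rows above, and SQLite is not Oracle, so it demonstrates the logic rather than Oracle's behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab123 (rn INTEGER, id INTEGER)")
# enumerate() supplies rn = 1..8 for the sample ids.
conn.executemany(
    "INSERT INTO tab123 VALUES (?, ?)",
    list(enumerate([5, 5, 5, 6, 5, 5, 6, 6], start=1)),
)

query = """
SELECT MAX(id) AS id, COUNT(*) AS cnt
FROM (
    SELECT id, SUM(xxx) OVER (ORDER BY rn) AS yyy
    FROM (
        SELECT t.*,
               CASE LAG(id) OVER (ORDER BY rn)
                   WHEN id THEN 0 ELSE 1 END AS xxx
        FROM tab123 t
    )
)
GROUP BY yyy
ORDER BY yyy
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the first row LAG returns NULL, the CASE falls through to ELSE 1, and so the first run gets group number 1.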

Related

SQL query to partition rows into groups where lag (difference between rows) is greater than some value

Suppose I have a table like
id
1
3
4
10
12
19
and I'd like to group the ids (in sorted order) into the same group if they differ by 5 or less, and a new group if they differ by 6 or more. So the output would be:
id group
1 1
3 1
4 1
10 2
12 2
19 3
Is this possible in SQL? It will be a query in Trino, and I see they have commands like lag and partition. Has anyone made a query like this that can help out?
You can use a CTE with lead():
with cte(id, l1) as (
  select t.id,
         abs(coalesce(lead(t.id) over (order by t.id), 0) - t.id) < 6
  from tbl t
)
select c.id,
       (select sum(c1.id < c.id and c1.l1 = 0) from cte c1) + 1
from cte c
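Since the question mentions lag, here is a different formulation from the query above: flag each row whose gap to the previous id exceeds 5, then take a running sum of the flags. This is a sketch only. It runs on SQLite (3.25 or later) through Python's sqlite3 module and has not been checked against Trino; prev_id and grp are names made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)",
                 [(i,) for i in (1, 3, 4, 10, 12, 19)])

query = """
SELECT id,
       1 + SUM(CASE WHEN id - prev_id > 5 THEN 1 ELSE 0 END)
               OVER (ORDER BY id) AS grp
FROM (
    SELECT id, LAG(id) OVER (ORDER BY id) AS prev_id
    FROM tbl
)
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The first row has no previous id, so the comparison yields NULL and the CASE contributes 0, leaving that row in group 1.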

(SQL) Per ID, starting from the first row, return all successive rows with a value N greater than the prior returned row

I have the following example dataset:
ID Value Row index
a 4 1
a 7 2
a 12 3
a 12 4
a 13 5
b 1 6
b 2 7
b 3 8
b 4 9
b 5 10
(The row index is for reference purposes only and does not need to exist in the final output.)
I would like to write a SQL script that, starting from the first row per ID and ordering ascending by [Value], returns each next row whose Value is at least N greater than that of the previously returned row. An example of the final table for N = 3 should look like the following:
ID Value Row index
a 4 1
a 7 2
a 12 3
b 1 6
b 4 9
Can this script be written in a vectorised manner? Or must a loop be utilised? Any advice would be greatly appreciated. Thanks!
SQL tables represent unordered sets. There is no definition of "previous" value, unless you have a column that specifies the ordering. With such a column, you can use lag():
select t.*
from (select t.*,
             lag(value) over (partition by id order by <ordering column>) as prev_value
      from t
     ) t
where prev_value is null or prev_value <= value - 3;
EDIT:
I think I misunderstood what you want to do. You seem to want to start with the first row for each id. Then get the next row that is 3 or higher in value. Then hold onto that value and get the next that is 3 or higher than that. And so on.
You can do this in SQL using a recursive CTE:
with ts as (
      select distinct t.id, t.value,
             dense_rank() over (partition by id order by value) as seqnum
      from t
     ),
     cte as (
      select id, value, value as grp_value, 1 as within_seqnum, seqnum
      from ts
      where seqnum = 1
      union all
      select ts.id, ts.value,
             (case when ts.value >= cte.grp_value + 3 then ts.value else cte.grp_value end),
             (case when ts.value >= cte.grp_value + 3 then 1 else cte.within_seqnum + 1 end),
             ts.seqnum
      from cte join
           ts
           on ts.id = cte.id and ts.seqnum = cte.seqnum + 1
     )
select *
from cte
where within_seqnum = 1
order by id, value;
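The recursion is easier to follow as a plain loop. The Python sketch below is an illustration rather than a replacement for the SQL: for each id it walks the distinct values in ascending order and remembers the last value it kept (grp_value in the CTE). The variable names are made up for the example and N is the threshold from the question.

```python
from collections import defaultdict

# (id, value) pairs copied from the example dataset above.
rows = [("a", 4), ("a", 7), ("a", 12), ("a", 12), ("a", 13),
        ("b", 1), ("b", 2), ("b", 3), ("b", 4), ("b", 5)]
N = 3

values_by_id = defaultdict(set)   # a set drops duplicate values, like DISTINCT
for row_id, value in rows:
    values_by_id[row_id].add(value)

kept = []
for row_id in sorted(values_by_id):
    grp_value = None              # value of the most recently kept row
    for value in sorted(values_by_id[row_id]):
        # Keep the first value, then every value at least N above the last kept one.
        if grp_value is None or value >= grp_value + N:
            kept.append((row_id, value))
            grp_value = value

for row in kept:
    print(row)
```

Each decision depends on the previously kept row rather than the previous row, which is why a single lag() is not enough and the SQL needs recursion.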

Find gaps of a sequence in PostgreSQL tables

I have a table invoices with a field invoice_number. This is what I get when I execute select invoice_number from invoices:
invoice_number
1
2
3
5
6
10
11
I want a SQL that gives me the following result:
gap_start gap_end
1 3
5 6
10 11
You can use the row_number() window function to number the rows, then use the difference between each value and its row number as the grouping criterion (here the table is t and the column is invoice):
SELECT
    MIN(invoice) AS start,
    MAX(invoice) AS end
FROM (
    SELECT
        *,
        invoice - row_number() OVER (ORDER BY invoice) AS group_id
    FROM t
) s
GROUP BY group_id
ORDER BY start
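To see why the difference works as a group key, here is a short Python sketch of the same arithmetic. It assumes the invoice numbers are distinct integers; the variable names are made up for the example.

```python
from itertools import groupby

invoices = [1, 2, 3, 5, 6, 10, 11]

# Pair each value with its 1-based position in sorted order (the row_number()).
numbered = list(enumerate(sorted(invoices), start=1))

# value - row_number stays constant inside a run of consecutive values
# and jumps whenever a number is skipped.
ranges = []
for _, run in groupby(numbered, key=lambda pair: pair[1] - pair[0]):
    values = [value for _, value in run]
    ranges.append((values[0], values[-1]))

for start, end in ranges:
    print(start, end)
```

The differences for the sample data are 0, 0, 0, 1, 1, 4, 4, so three groups fall out.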

Query to group based on the sorted table result

Below is my table
a 1
a 2
a 1
b 1
a 2
a 2
b 3
b 2
a 1
My Expected output is
a 4
b 1
a 4
b 5
a 1
I want them to be grouped if they are in sequence.
If your DBMS supports window functions, you can use the difference of two row numbers to assign the same group to consecutive rows that have the same value in one column. After assigning the groups, it is easy to sum the values for each group.
select col1, sum(col2)
from (select t.*,
             row_number() over (order by someid)
             - row_number() over (partition by col1 order by someid) as grp
      from tablename t
     ) x
group by col1, grp
Replace tablename, col1, col2 and someid with the appropriate table and column names. someid should be the column that defines the row order.
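The following Python sketch traces the two row numbers by hand on the question's data, assuming the rows are listed in someid order. It is only meant to show why grp is constant within a run; the variable names are made up for the example.

```python
from collections import defaultdict

# (col1, col2) rows in someid order, copied from the question above.
rows = [("a", 1), ("a", 2), ("a", 1), ("b", 1), ("a", 2),
        ("a", 2), ("b", 3), ("b", 2), ("a", 1)]

seen = defaultdict(int)   # rows seen so far per col1 value
totals = {}               # (col1, grp) -> sum of col2; dicts keep insertion order
for overall_rn, (col1, col2) in enumerate(rows, start=1):
    seen[col1] += 1                   # row_number() over (partition by col1 ...)
    grp = overall_rn - seen[col1]     # unchanged while col1 keeps repeating
    totals[(col1, grp)] = totals.get((col1, grp), 0) + col2

result = [(col1, total) for (col1, _), total in totals.items()]
for row in result:
    print(row)
```

Within a run both counters advance by one per row, so their difference does not move; as soon as another col1 value intervenes, only the overall counter advances and the difference changes.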

SQL of group by in order by

[Raw data]
A B C
1 10 1
1 10 2
2 20 3
2 20 4
1 100 5
1 100 6
[Wanted result]
A SUM_OF_B
1 20
2 40
1 200
A plain 'group by' clause or 'dense_rank over partition by' does not help, because both group across all rows. I want rows to be grouped only while they are adjacent in the ordering. How do I write the proper query?
You need to identify the groups of adjacent records. You can actually do this by using a difference of row numbers approach -- assuming that c orders the rows. The difference is constant for consecutive values of a that are the same:
select a, sum(b)
from (select t.*,
             (row_number() over (order by c) -
              row_number() over (partition by a order by c)
             ) as grp
      from tablename t  -- tablename is a placeholder; TABLE itself is a reserved word
     ) t
group by grp, a
order by min(c);
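For a quick check against the raw data in the question, the query can be run through Python's sqlite3 module. This is a sketch under assumptions: it needs SQLite 3.25 or later for window functions, and raw_data is a table name invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_data (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO raw_data VALUES (?, ?, ?)",
    [(1, 10, 1), (1, 10, 2), (2, 20, 3), (2, 20, 4), (1, 100, 5), (1, 100, 6)],
)

query = """
SELECT a, SUM(b) AS sum_of_b
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (ORDER BY c)
           - ROW_NUMBER() OVER (PARTITION BY a ORDER BY c) AS grp
    FROM raw_data t
) t
GROUP BY grp, a
ORDER BY MIN(c)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On this data the second and third runs both get grp = 2, so grouping by grp alone would merge them; including a in the GROUP BY keeps them apart.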