How to GROUP entries BY uninterrupted sequence?

CREATE TABLE entries (
id serial NOT NULL,
title character varying,
load_sequence integer
);
and this data:
INSERT INTO entries(title, load_sequence) VALUES ('A', 1);
INSERT INTO entries(title, load_sequence) VALUES ('A', 2);
INSERT INTO entries(title, load_sequence) VALUES ('A', 3);
INSERT INTO entries(title, load_sequence) VALUES ('A', 6);
INSERT INTO entries(title, load_sequence) VALUES ('B', 4);
INSERT INTO entries(title, load_sequence) VALUES ('B', 5);
INSERT INTO entries(title, load_sequence) VALUES ('B', 7);
INSERT INTO entries(title, load_sequence) VALUES ('B', 8);
Is there a way in PostgreSQL to write SQL that groups rows into segments of the same title after ordering them by load_sequence?
I mean:
=# SELECT id, title, load_sequence FROM entries ORDER BY load_sequence;
id | title | load_sequence
----+-------+---------------
9 | A | 1
10 | A | 2
11 | A | 3
13 | B | 4
14 | B | 5
12 | A | 6
15 | B | 7
16 | B | 8
And I want these groups:
=# SELECT title, string_agg(id::text, ',' ORDER BY id) FROM entries ???????????;
so result would be:
title | string_agg
-------+-------------
A | 9,10,11
B | 13,14
A | 12
B | 15,16

You can use the following query:
SELECT title, string_agg(id::text, ',' ORDER BY id)
FROM (
SELECT id, title,
ROW_NUMBER() OVER (ORDER BY load_sequence) -
ROW_NUMBER() OVER (PARTITION BY title
ORDER BY load_sequence) AS grp
FROM entries ) AS t
GROUP BY title, grp
The calculated grp field identifies runs of rows that share a title and are adjacent when ordered by load_sequence: within such a run both row numbers increase by 1, so their difference stays constant. Grouping by title and grp then gives the required aggregation over id values.
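To see why the difference of the two row numbers is constant inside a run, here is a minimal Python sketch that recomputes both numberings for the sample rows. The function name `island_keys` is hypothetical, and the ids are the ones shown in the question's output; it illustrates the arithmetic only, not how PostgreSQL evaluates the query.

```python
from collections import defaultdict

# Sample rows as (id, title, load_sequence); ids as shown in the question.
ROWS = [
    (9, "A", 1), (10, "A", 2), (11, "A", 3), (12, "A", 6),
    (13, "B", 4), (14, "B", 5), (15, "B", 7), (16, "B", 8),
]

def island_keys(rows):
    """Return {(title, grp): [ids]}, where grp mimics the row-number difference."""
    ordered = sorted(rows, key=lambda r: r[2])      # ORDER BY load_sequence
    per_title = defaultdict(int)                    # row counter per title
    groups = defaultdict(list)
    for overall_rn, (row_id, title, _seq) in enumerate(ordered, start=1):
        per_title[title] += 1                       # ROW_NUMBER() OVER (PARTITION BY title ...)
        grp = overall_rn - per_title[title]         # stays constant inside one run
        groups[(title, grp)].append(row_id)
    return dict(groups)

if __name__ == "__main__":
    for (title, grp), ids in island_keys(ROWS).items():
        print(title, grp, ",".join(str(i) for i in ids))
```

The grp values themselves carry no meaning; only the pair (title, grp) is used as a grouping key.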

There's a trick for this: use sum as a window function over a flag computed with lag.
The idea is that you return 1 when you hit an edge (a discontinuity) and 0 otherwise, detecting the discontinuities with the lag window function. The running sum of those flags then numbers the partitions.
SELECT title, string_agg(id::text, ', ') FROM (
SELECT
id, title, load_sequence,
sum(title_changed) OVER (ORDER BY load_sequence) AS partition_no
FROM (
SELECT
id, title, load_sequence,
CASE WHEN title = lag(title, 1) OVER (ORDER BY load_sequence) THEN 0 ELSE 1 END AS title_changed FROM entries
) x
) y
GROUP BY partition_no, title;
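The flag-and-running-sum idea can be traced by hand. The sketch below, assuming the same sample rows and a hypothetical helper name `partition_numbers`, mirrors the two nested subqueries in plain Python.

```python
from itertools import accumulate

# Sample rows as (id, title, load_sequence); ids as shown in the question.
ROWS = [
    (9, "A", 1), (10, "A", 2), (11, "A", 3), (12, "A", 6),
    (13, "B", 4), (14, "B", 5), (15, "B", 7), (16, "B", 8),
]

def partition_numbers(rows):
    """Return [(id, title, partition_no)] in load_sequence order."""
    ordered = sorted(rows, key=lambda r: r[2])
    titles = [r[1] for r in ordered]
    # lag(title, 1): the previous title; None for the first row,
    # which falls into the ELSE 1 branch just like SQL NULL does.
    previous = [None] + titles[:-1]
    changed = [0 if cur == prev else 1 for cur, prev in zip(titles, previous)]
    running = accumulate(changed)                   # sum(...) OVER (ORDER BY load_sequence)
    return [(r[0], r[1], n) for r, n in zip(ordered, running)]

if __name__ == "__main__":
    for row in partition_numbers(ROWS):
        print(row)
```

Unlike the row-number difference, partition_no increases with load_sequence, so it can also be used to order the groups.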

Related

SQL Occurrence of Sequence Number

I want to find every Name that has 4 or more rows with consecutive SeqNo values.
If the SeqNo values for a Name have a break, but 4 or more of its rows are still consecutive, I need that Name too.
Example:
SeqNo | Name
10 | A
15 | A
16 | A
17 | A
18 | A
9 | B
10 | B
13 | B
14 | B
6 | C
7 | C
9 | C
10 | C
Expected output:
A
Here is the setup script for anyone helping:
create table testseq (Id int, Name char)
INSERT into testseq values
(10, 'A'),
(15, 'A'),
(16, 'A'),
(17, 'A'),
(18, 'A'),
(9, 'B'),
(10, 'B'),
(13, 'B'),
(14, 'B'),
(6, 'C'),
(7, 'C'),
(9, 'C'),
(10, 'C')
SELECT * FROM testseq
You can use some gaps-and-islands techniques for this.
If you want names that have at least 4 consecutive records where seqno increases by 1, you can use the difference between seqno and row_number() to define the groups, and then aggregate (seqno here is the column the setup script calls Id):
select distinct name
from (
select t.*, row_number() over(partition by name order by seqno) rn
from testseq t
) t
group by name, rn - seqno
having count(*) >= 4
Note that for the sample data shown, this returns A: its seqno values 15 through 18 form a run of four consecutive values. B and C have runs of at most two.
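As a cross-check of the grouping key, the following sketch recomputes rn - seqno per name for the rows in the setup script. The name `names_with_run` is hypothetical, and the code assumes seqno values are unique within a name.

```python
from collections import Counter, defaultdict

# Rows from the setup script as (seqno, name).
TESTSEQ = [
    (10, "A"), (15, "A"), (16, "A"), (17, "A"), (18, "A"),
    (9, "B"), (10, "B"), (13, "B"), (14, "B"),
    (6, "C"), (7, "C"), (9, "C"), (10, "C"),
]

def names_with_run(rows, min_len=4):
    """Names that have at least min_len rows with consecutive seqno values."""
    by_name = defaultdict(list)
    for seqno, name in rows:
        by_name[name].append(seqno)
    found = []
    for name, seqnos in by_name.items():
        # rn - seqno is identical for every row of an unbroken run.
        keys = Counter(rn - s for rn, s in enumerate(sorted(seqnos), start=1))
        if max(keys.values()) >= min_len:           # HAVING count(*) >= 4
            found.append(name)
    return sorted(found)

if __name__ == "__main__":
    print(names_with_run(TESTSEQ))
```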
I don't really view this as a "gaps-and-islands" problem. You are just looking for a minimum number of adjacent rows. This is easily handled using lag() or lead():
select t.*
from (select t.*,
lead(seqno, 3) over (partition by name order by seqno) as seqno_name_3
from t
) t
where seqno_name_3 = seqno + 3;
This compares each row with the row three positions later for the same name. If that row's seqno equals seqno + 3, the four rows from the current one to that one have consecutive sequence numbers.
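The look-ahead comparison can be sketched for a single name as follows. The helper `rows_starting_run` is a hypothetical name, and the sketch assumes the seqno values of one name are distinct.

```python
def rows_starting_run(seqnos, lookahead=3):
    """Return the seqno values that start a run of lookahead + 1 consecutive values."""
    ordered = sorted(seqnos)                        # ORDER BY seqno within one name
    starts = []
    for i, s in enumerate(ordered):
        # lead(seqno, 3): the value three rows later, missing near the end.
        if i + lookahead < len(ordered) and ordered[i + lookahead] == s + lookahead:
            starts.append(s)
    return starts

if __name__ == "__main__":
    print(rows_starting_run([10, 15, 16, 17, 18]))  # values of name A
    print(rows_starting_run([9, 10, 13, 14]))       # values of name B
```

Because the values are sorted and distinct, the value three rows later can only equal seqno + 3 when no value in between is skipped.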
If you just want the name and to handle duplicates:
select distinct name
from (select t.*,
lead(seqno, 3) over (partition by name order by seqno) as seqno_name_3
from t
) t
where seqno_name_3 = seqno + 3;
If the sequence numbers can have gaps, and the four rows only need to be adjacent in the overall seqno order (this assumes seqno is unique across the table):
select distinct name
from (select t.*,
lead(seqno, 3) over (partition by name order by seqno) as seqno_name_3,
lead(seqno, 3) over (order by seqno) as seqno_3
from t
) t
where seqno_name_3 = seqno_3;
A solution in plain SQL, no LAG() or LEAD() or ROW_NUMBER():
SELECT t1.Name
FROM testseq t1
WHERE (
SELECT count(t2.Id)
FROM testseq t2
WHERE t2.Name=t1.Name
and t2.Id between t1.Id and t1.Id+3
GROUP BY t2.Name)>=4
GROUP BY t1.Name;
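Since this query uses no window functions, it can be tried on almost any engine. Here is a minimal sketch that runs it against an in-memory SQLite database through Python's sqlite3 module; SQLite is a stand-in chosen for convenience, not the engine from the question, and `find_names` is a hypothetical wrapper.

```python
import sqlite3

ROWS = [
    (10, "A"), (15, "A"), (16, "A"), (17, "A"), (18, "A"),
    (9, "B"), (10, "B"), (13, "B"), (14, "B"),
    (6, "C"), (7, "C"), (9, "C"), (10, "C"),
]

QUERY = """
SELECT t1.Name
FROM testseq t1
WHERE (
    SELECT count(t2.Id)
    FROM testseq t2
    WHERE t2.Name = t1.Name
      AND t2.Id BETWEEN t1.Id AND t1.Id + 3
    GROUP BY t2.Name) >= 4
GROUP BY t1.Name
"""

def find_names(rows):
    """Load the rows into a temporary table and run the correlated subquery."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE testseq (Id INTEGER, Name TEXT)")
    conn.executemany("INSERT INTO testseq VALUES (?, ?)", rows)
    names = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return names

if __name__ == "__main__":
    print(find_names(ROWS))
```

The count can only reach 4 when all of Id, Id+1, Id+2 and Id+3 exist for the name, so the query relies on Id being unique within a name.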

Custom ordering before regular ordering?

I have 3 tables:
CREATE TABLE items (
id integer PRIMARY KEY,
title varchar (256) NOT NULL
);
INSERT INTO items (id, title) VALUES (1, 'qux');
INSERT INTO items (id, title) VALUES (2, 'quux');
INSERT INTO items (id, title) VALUES (3, 'quuz');
INSERT INTO items (id, title) VALUES (4, 'corge');
INSERT INTO items (id, title) VALUES (5, 'grault');
CREATE TABLE last_used (
item_id integer NOT NULL REFERENCES items (id),
date integer NOT NULL
);
INSERT INTO last_used (item_id, date) VALUES (2, 1000);
INSERT INTO last_used (item_id, date) VALUES (3, 2000);
INSERT INTO last_used (item_id, date) VALUES (2, 3000);
CREATE TABLE rating (
item_id integer NOT NULL REFERENCES items (id),
rating integer NOT NULL
);
INSERT INTO rating (item_id, rating) VALUES (1, 400);
INSERT INTO rating (item_id, rating) VALUES (2, 100);
INSERT INTO rating (item_id, rating) VALUES (3, 200);
INSERT INTO rating (item_id, rating) VALUES (4, 300);
INSERT INTO rating (item_id, rating) VALUES (5, 500);
I want to select rows in the following order:
Last used items matching the search string;
Most rated items matching the search string;
All other items matching the search string.
For the search i.title ~* '(?=.*u)', I get:
id | title | max(last_used.date) | rating.rating
3 | quuz | 2000 | 200
2 | quux | 3000 | 100
5 | grault | null | 500
1 | qux | null | 400
…with the following code:
WITH used AS (
SELECT lu.item_id
FROM last_used lu
JOIN (
SELECT item_id, max(date) AS date
FROM last_used
GROUP BY 1
) sub USING (date)
-- WHERE lu.user_id = 1
ORDER BY lu.date DESC
)
SELECT i.id, i.title, r.rating
FROM items i
LEFT JOIN rating r
ON r.item_id = i.id
WHERE
i.title ~* '(?=.*u)'
ORDER BY
i.id NOT IN (SELECT item_id FROM used),
r.rating DESC NULLS LAST
LIMIT 5 OFFSET 0
Is it possible to get the following results (latest used items first)?
id | title | max(last_used.date) | rating.rating
2 | quux | 3000 | 100
3 | quuz | 2000 | 200
5 | grault | null | 500
1 | qux | null | 400
You can get the desired order with an ORDER BY clause of l.date DESC NULLS LAST, r.rating DESC NULLS LAST:
SELECT i.id, i.title, l.date, r.rating
FROM items i
LEFT JOIN rating r
ON r.item_id = i.id
LEFT JOIN ( SELECT item_id, max(date) AS date FROM last_used GROUP BY 1 ) l
ON l.item_id = i.id
WHERE
i.title ~* '(?=.*u)'
ORDER BY l.date DESC NULLS LAST, r.rating DESC NULLS LAST
LIMIT 5 OFFSET 0;
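The effect of the two DESC NULLS LAST keys can be reproduced with a composite sort key. This is only a Python sketch over the sample data; `ordered_items` is a hypothetical name, and a plain substring test stands in for the regular expression, which is equivalent for this search.

```python
ITEMS = {1: "qux", 2: "quux", 3: "quuz", 4: "corge", 5: "grault"}
LAST_USED = [(2, 1000), (3, 2000), (2, 3000)]       # (item_id, date)
RATING = {1: 400, 2: 100, 3: 200, 4: 300, 5: 500}

def ordered_items(needle="u"):
    """Rows as (id, title, latest_date, rating), latest used first, then by rating."""
    latest = {}
    for item_id, date in LAST_USED:                 # max(date) ... GROUP BY item_id
        latest[item_id] = max(date, latest.get(item_id, date))
    rows = [
        (item_id, title, latest.get(item_id), RATING.get(item_id))
        for item_id, title in ITEMS.items()
        if needle in title.lower()                  # stands in for title ~* '(?=.*u)'
    ]

    def sort_key(row):
        _id, _title, date, rating = row
        # "is None" sorts missing values last; the negation gives descending order.
        return (date is None, -(date or 0), rating is None, -(rating or 0))

    return sorted(rows, key=sort_key)

if __name__ == "__main__":
    for row in ordered_items():
        print(row)
```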

SQL Grouping by sequential occurrences of a value

I have the below table with 2 columns
ID | Dept
1 | A
2 | A
3 | B
4 | B
5 | B
6 | A
I want to do a count such that the output should look as the table below.
Dept | Count
A | 2
B | 3
A | 1
Thanks for your help in advance!
Slightly different to Michael's, same result:
with cte1 as (
select id,
dept,
row_number() over (partition by dept order by id) -
row_number() over (order by id) group_num
from test),
cte2 as (
select dept,
group_num,
count(*) c_star,
max(id) max_id
from cte1
group by dept,
group_num)
select dept,
c_star
from cte2
order by max_id;
http://sqlfiddle.com/#!4/ff747/1
From your example, it looks like you're wanting to count sequential records for each department.
You can do this by combining the row number and the ordering Id.
create table tblDept (
id int not null,
dept varchar(50)
);
insert into tblDept values (1, 'A');
insert into tblDept values (2, 'A');
insert into tblDept values (3, 'B');
insert into tblDept values (4, 'B');
insert into tblDept values (5, 'B');
insert into tblDept values (6, 'A');
with orderedDepts as (
select
dept,
id,
row_number() over (partition by dept order by id) -
row_number() over (order by id) as rn
from tblDept
)
select
dept,
count(*) as num
from orderedDepts
group by
dept,
rn
order by
max(id)
Gives the output:
+------+-----+
| DEPT | NUM |
+------+-----+
| A | 2 |
| B | 3 |
| A | 1 |
+------+-----+
You cannot do this with a plain GROUP BY. COUNT(*) counts the rows in each group, so grouping by Dept alone would give you one count for A and one count for B.
GROUP BY works on values in the table, not on the order of rows, and order is not guaranteed in SQL unless you use ORDER BY anyway.
This query gives the total count per department (it does not split the counts by run):
SELECT Dept, count(*) FROM table_name group By Dept

How can I group a set split by change in a field with respect to an order?

I have a set of records.
ID Value
1 a
2 b
3 b
4 b
5 a
6 a
7 b
8 b
And I would like to group them like so.
MIN(ID) MAX(ID) Value
1 1 a
2 4 b
5 6 a
7 8 b
I'm vaguely aware of Oracle's OVER() analytic functions, which look like the right direction, but I don't know what this problem is called, much less how to solve it.
There is probably an easier way, but this may help to start. I ran it on Postgres, but it should work (maybe with a minor tweak) on Oracle. The innermost query puts the previous value on each row. We can use that to detect a group change (when value does not equal the previous value). Every time there is a group change, we flag it with a "1". Summing these flags as a running total gives a group id that increments every time the value changes. Then we can apply a normal GROUP BY.
create table x(id int, value varchar(1));
insert into x values(1, 'a');
insert into x values(2, 'b');
insert into x values(3, 'b');
insert into x values(4, 'b');
insert into x values(5, 'a');
insert into x values(6, 'a');
insert into x values(7, 'b');
insert into x values(8, 'b');
SELECT MIN(id), MAX(id), value
FROM ( SELECT id
,value
,previous_value
,SUM( CASE WHEN value = previous_value THEN 0 ELSE 1 END ) OVER(ORDER BY id) AS group_id
FROM ( SELECT id
,value
,COALESCE( LAG(value) OVER(ORDER BY id), value ) previous_value
FROM x
ORDER BY id
) y
) z
GROUP BY group_id, value
ORDER BY 1, 2;
min | max | value
-----+-----+-------
1 | 1 | a
2 | 4 | b
5 | 6 | a
7 | 8 | b
(4 rows)
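For comparison outside the database, the same min/max-per-run result can be produced with itertools.groupby, which groups adjacent equal keys. This is a different technique from the SQL above, shown only as a sketch; the function name `runs` is hypothetical.

```python
from itertools import groupby

# Sample records as (id, value).
X = [(1, "a"), (2, "b"), (3, "b"), (4, "b"),
     (5, "a"), (6, "a"), (7, "b"), (8, "b")]

def runs(rows):
    """Return [(min_id, max_id, value)] for each run of equal adjacent values."""
    ordered = sorted(rows)                          # ORDER BY id
    result = []
    # groupby starts a new group whenever the key changes between neighbours.
    for value, chunk in groupby(ordered, key=lambda r: r[1]):
        ids = [r[0] for r in chunk]
        result.append((min(ids), max(ids), value))
    return result

if __name__ == "__main__":
    for row in runs(X):
        print(row)
```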

double sorted selection from a single table

I have a table with an id as the primary key, and a description as another field.
I want to first select the records that have the id<=4, sorted by description, then I want all the other records (id>4), sorted by description. Can't get there!
select id, descr
from t
order by
case when id <= 4 then 0 else 1 end,
descr
-- desc, so that rows where low is true (id <= 4) come first
select *, id <= 4 as low from t order by low desc, description
You may want to use an id <= 4 expression in your ORDER BY clause:
SELECT * FROM your_table ORDER BY id <= 4 DESC, description;
Test case (using MySQL):
CREATE TABLE your_table (id int, description varchar(50));
INSERT INTO your_table VALUES (1, 'c');
INSERT INTO your_table VALUES (2, 'a');
INSERT INTO your_table VALUES (3, 'z');
INSERT INTO your_table VALUES (4, 'b');
INSERT INTO your_table VALUES (5, 'g');
INSERT INTO your_table VALUES (6, 'o');
INSERT INTO your_table VALUES (7, 'c');
INSERT INTO your_table VALUES (8, 'p');
Result:
+------+-------------+
| id | description |
+------+-------------+
| 2 | a |
| 4 | b |
| 1 | c |
| 3 | z |
| 7 | c |
| 5 | g |
| 6 | o |
| 8 | p |
+------+-------------+
8 rows in set (0.00 sec)
Related post:
Using MySql, can I sort a column but have 0 come last?
select id, description
from MyTable
order by case when id <= 4 then 0 else 1 end, description
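You can use UNION, but note that the row order of a UNION result is not guaranteed without an outer ORDER BY, and that UNION removes duplicate rows: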
SELECT * FROM (SELECT * FROM table1 WHERE id <=4 ORDER by description)aaa
UNION
SELECT * FROM (SELECT * FROM table1 WHERE id >4 ORDER by description)bbb
Or:
SELECT * FROM table1
ORDER BY
CASE WHEN id <=4 THEN 0
ELSE 1
END, description
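The CASE form is portable, so it can be checked quickly with an in-memory SQLite database via Python's sqlite3 module. SQLite is used here only as a convenient stand-in for the MySQL test case, and `two_tier_order` is a hypothetical wrapper name.

```python
import sqlite3

ROWS = [(1, "c"), (2, "a"), (3, "z"), (4, "b"),
        (5, "g"), (6, "o"), (7, "c"), (8, "p")]

QUERY = """
SELECT id, description
FROM your_table
ORDER BY CASE WHEN id <= 4 THEN 0 ELSE 1 END, description
"""

def two_tier_order(rows):
    """Rows with id <= 4 first, each tier sorted by description."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER, description TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in two_tier_order(ROWS):
        print(row)
```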