PLPGSQL - stored procedure to get a set of rows with count - sql

I am using PostgreSQL.
I need a stored procedure written in PL/pgSQL that returns a table (SETOF records) containing the counts of the top 2 and bottom 2 values, by count, in my_table.
For example:
my_table
id value
1 a
2 a
3 a
4 b
5 b
6 c
7 c
8 e
9 f
10 g
11 g
12 g
13 g
14 h
15 h
Returns:
count value
4 g
3 a
1 e
1 f
Thank you

You can use window functions with aggregation:
select v.value, v.cnt
from (select value, count(*) as cnt,
             row_number() over (order by count(*) desc) as seqnum_desc,
             row_number() over (order by count(*) asc) as seqnum_asc
      from my_table
      group by value
     ) v
where seqnum_desc <= 2 or seqnum_asc <= 2;
Note: In the case of ties -- particularly likely at the bottom end -- this returns arbitrary values with the same count. You can adjust for this using rank() or dense_rank(), depending on what you want in this case.
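To make the note about ties concrete, here is a minimal sketch of the rank() variant, run through Python's sqlite3 module so it can be tried without a PostgreSQL server. SQLite as a stand-in, the helper name top_and_bottom, and the two-level subquery are my assumptions, not part of the answer above; it also does not wrap the query in a PL/pgSQL function.

```python
import sqlite3

# Sample rows from the question as (id, value) pairs.
ROWS = list(enumerate("aaabbccefgggghh", start=1))

# Same idea as the answer, but with rank() so that tied counts share a rank.
QUERY = """
select cnt, value
from (select value, cnt,
             rank() over (order by cnt desc) as rnk_desc,
             rank() over (order by cnt asc)  as rnk_asc
      from (select value, count(*) as cnt
            from my_table
            group by value) c
     ) v
where rnk_desc <= 2 or rnk_asc <= 2
order by cnt desc, value
"""

def top_and_bottom(rows):
    """Load rows into an in-memory table and run the rank()-based query."""
    con = sqlite3.connect(":memory:")
    con.execute("create table my_table (id integer, value text)")
    con.executemany("insert into my_table values (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    for cnt, value in top_and_bottom(ROWS):
        print(cnt, value)
```

With rank(), every value tied for a qualifying position is returned, so the result can hold more than four rows when three or more values share the lowest (or highest) count.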

Related

Repeated values should not show together in SQL

I want to display some data, with the requirement that repeated values should not appear next to each other.
Right now the data in the table is in this order:
ID Name
1 A
2 A
3 B
4 C
5 B
6 B
7 C
8 C
9 C
Expected result (it should be in the order below):
ID Name
1 A
3 B
4 C
2 A
5 B
7 C
6 B
8 C
9 C
This can be done using the ROW_NUMBER window function.
SELECT *, ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS rn
FROM mytable
ORDER BY rn, Name
You can put row_number() directly in the order by. I would recommend:
select t.*
from t
order by row_number() over (partition by name order by id),
name;
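As a quick check of the row_number() ordering, the sketch below loads the question's rows into an in-memory SQLite database from Python. SQLite, the table name mytable, and the helper name interleave are assumptions made for illustration; the window function itself is the one used in the answers.

```python
import sqlite3

# Rows from the question as (id, name) pairs.
ROWS = [(1, "A"), (2, "A"), (3, "B"), (4, "C"), (5, "B"),
        (6, "B"), (7, "C"), (8, "C"), (9, "C")]

# Number the rows within each name, then sort by that number first,
# so the first A, B and C come before the second A, B and C, and so on.
QUERY = """
select id, name
from (select id, name,
             row_number() over (partition by name order by id) as rn
      from mytable) t
order by rn, name
"""

def interleave(rows):
    """Return the rows in round-robin order by name."""
    con = sqlite3.connect(":memory:")
    con.execute("create table mytable (id integer, name text)")
    con.executemany("insert into mytable values (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    for row in interleave(ROWS):
        print(*row)
```

When one name has more rows than all the others, its surplus rows still end up adjacent at the end (here ids 8 and 9), which matches the expected result in the question.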

PSQL, adding a "step increasing" column

I have these values in a table column (select a from tab):
a
1
2
3
4
5
6
7
15
16
18
Using a variable = 3, how can I create a column b that starts with min(a) and has the following values?
a b
1 1
2 1
3 1
4 4
5 4
6 4
7 7
15 15
17 15
18 18
Something like: for each a (in order), keep the same b for a span of at most 3, otherwise reset.
Thanks,
AAWNSD
I think you want window functions and groups of three based on arithmetic on a:
select a,
min(a) over (partition by ceiling(a / 3.0)) as b
from tab;
Hmmm . . . I realize that the above returns "16" for the last row rather than 18. My above interpretation may not be correct. You may be saying that you want groups -- once they start -- to never exceed the group starting value plus 2.
If so, one approach is a recursive CTE:
with recursive tt as (
      select a, row_number() over (order by a) as seqnum
      from tab
     ),
     cte as (
      select a, seqnum, a as grp
      from tt
      where seqnum = 1
      union all
      select tt.a, tt.seqnum,
             (case when tt.a <= grp + 2 then grp else tt.a end)
      from cte join
           tt
           on tt.seqnum = cte.seqnum + 1
     )
select *
from cte;
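Here is a small sketch that runs the recursive CTE against the question's values, using Python's sqlite3 module as a stand-in for PostgreSQL. The :step parameter (replacing the hard-coded 2 with step - 1) and the helper name step_groups are my additions for illustration.

```python
import sqlite3

# The recursive CTE from the answer, with the hard-coded "+ 2" written as
# ":step - 1" so the group width comes from a parameter.
QUERY = """
with recursive tt as (
    select a, row_number() over (order by a) as seqnum
    from tab
),
cte as (
    select a, seqnum, a as grp
    from tt
    where seqnum = 1
    union all
    select tt.a, tt.seqnum,
           case when tt.a <= cte.grp + :step - 1 then cte.grp else tt.a end
    from cte
    join tt on tt.seqnum = cte.seqnum + 1
)
select a, grp as b
from cte
order by a
"""

def step_groups(values, step):
    """Return (a, b) pairs where b is the value that started a's group."""
    con = sqlite3.connect(":memory:")
    con.execute("create table tab (a integer)")
    con.executemany("insert into tab values (?)", [(v,) for v in values])
    result = con.execute(QUERY, {"step": step}).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    for a, b in step_groups([1, 2, 3, 4, 5, 6, 7, 15, 16, 18], 3):
        print(a, b)
```

Each step of the recursion looks only at the next row by seqnum and carries the current group start forward, which is why a group can restart at 7, 15, and 18 even though those values are not multiples of the step.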

SQL Random N rows for each distinct value in column

I have the following table:
Name Field
A 1
B 1
C 1
D 1
E 1
F 1
G 1
H 2
I 2
J 2
K 3
L 3
M 3
N 3
O 3
P 3
Q 3
R 3
S 3
T 3
I need a SQL query which will generate me a set with 5 random rows for each distinct value on column Field.
For example, results expected:
Name Field
A 1
B 1
D 1
E 1
G 1
J 2
I 2
H 2
M 3
Q 3
T 3
S 3
P 3
Is there an easy way to do this? Or should I split the table into several tables, pick random rows from each, and then union them?
You can do this with a CTE using a ROW_NUMBER() whilst PARTITIONing on the Field:
;With Cte As
(
Select Name, Field,
Row_Number() Over (Partition By Field Order By NewId()) RN
From YourTable
)
Select Name, Field
From Cte
Where RN <= 5
You can readily do this with row_number():
select name, field
from (select t.*,
row_number() over (partition by field order by newid()) as seqnum
from t
) t
where seqnum <= 5;
An enhancement to Gordon Linoff's code: this version helped me when I needed extra criteria in the query.
select *
from (select t.*,
row_number() over (partition by region order by newid()) as seqnum
from MyTable t
WHERE t.program = 'ACME'
) t
where seqnum <= 1500;
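For anyone without SQL Server at hand, the same pattern can be sketched with Python's sqlite3 module. Note the assumptions: SQLite has no newid(), so random() is used for the shuffle instead, and the table name t, the :n parameter, and the helper name sample_per_field are made up for this example.

```python
import sqlite3

# Rows from the question: names A..T with fields 1 (7 rows), 2 (3 rows), 3 (10 rows).
ROWS = ([(ch, 1) for ch in "ABCDEFG"]
        + [(ch, 2) for ch in "HIJ"]
        + [(ch, 3) for ch in "KLMNOPQRST"])

# Shuffle rows within each field, then keep the first :n of each shuffle.
QUERY = """
select name, field
from (select name, field,
             row_number() over (partition by field order by random()) as seqnum
      from t) s
where seqnum <= :n
order by field, name
"""

def sample_per_field(rows, n):
    """Return up to n randomly chosen rows for each distinct field."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (name text, field integer)")
    con.executemany("insert into t values (?, ?)", rows)
    result = con.execute(QUERY, {"n": n}).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    for name, field in sample_per_field(ROWS, 5):
        print(name, field)
```

A field with fewer than n rows (field 2 here) simply returns all of its rows, as in the expected output of the question.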

MS Sql Server, same column with a different row neighbors

I need a little help on a SQL query. I could not get the result that I wanted.
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 1 0 1 6 2
1 0 2 0 2 12 2
1 0 3 0 3 17 4
1 0 3 0 3 18 4
1 0 3 0 3 19 4
1 0 3 0 3 20 4
What I want is one row for each ID: the one with the highest CC.
For example,
ID I10 H 10NS HNS CC NSCC
0 1 1 1 1 14 14
1 0 3 0 3 20 4
I tried with this code:
SELECT a.ID, b.name, a.i10 as[i-10-index], a.h as[h-index], 10ns as[i-10-index based on non-self-citation], a.hns as [h-index based on non-self-citation],
max(a.[Citation Count]), (a.[Non-Self-Citation Count])
FROM tbl_lpNumerical as a
join tbl_lpAcademician as b
on a.ID= (b.ID-1)
GROUP BY a.ID, b.name, a.i10, a.h, a.10ns, a.hns,
a.[Non-Self-Citation Count]
order by a.ID desc
However, I could not get the desired results.
Thank you for your time.
You can simply select every row for which no other row with the same ID has a higher CC:
SELECT n.*
FROM tbl_lpNumerical n
WHERE NOT EXISTS ( SELECT 'b'
FROM tbl_lpNumerical n2
WHERE n2.ID = n.ID
AND n2.CC > n.CC
)
In SQL Server, you can use row_number() for this. Based on your sample data, something like:
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;
I have no idea what your query has to do with the sample data. If it generates the data, then you can use a CTE:
with sampledata as (
<some query here>
)
select sd.*
from (select sd.*, row_number() over (partition by id order by cc desc) as seqnum
from sampledata sd
) sd
where seqnum = 1;
The following query will select a single row from each ID partition: the one with the highest CC value:
SELECT *
FROM (SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CC DESC) AS rn
FROM mytable) t
WHERE t.rn = 1
If there can be multiple rows having the same CC max value and you want all of them selected, then you can replace ROW_NUMBER() with RANK().
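The difference between ROW_NUMBER() and RANK() only shows up when there is a tie, which the sample data does not contain. The sketch below therefore uses hypothetical rows with a tie on CC for ID 1, and runs them through Python's sqlite3 module rather than SQL Server; the table layout and the helper name highest_cc are assumptions for illustration.

```python
import sqlite3

# Hypothetical rows as (id, cc, nscc); the last two rows tie on cc for id 1.
ROWS = [(0, 14, 14), (1, 17, 4), (1, 20, 4), (1, 20, 5)]

# {fn} is filled in with either row_number or rank (checked against a whitelist).
TEMPLATE = """
select id, cc, nscc
from (select id, cc, nscc,
             {fn}() over (partition by id order by cc desc) as rn
      from mytable) t
where rn = 1
order by id, nscc
"""

def highest_cc(rows, fn):
    """Return the top row(s) per id using the given ranking function."""
    if fn not in ("row_number", "rank"):
        raise ValueError("fn must be 'row_number' or 'rank'")
    con = sqlite3.connect(":memory:")
    con.execute("create table mytable (id integer, cc integer, nscc integer)")
    con.executemany("insert into mytable values (?, ?, ?)", rows)
    result = con.execute(TEMPLATE.format(fn=fn)).fetchall()
    con.close()
    return result

if __name__ == "__main__":
    print(highest_cc(ROWS, "row_number"))  # one row per id, tie broken arbitrarily
    print(highest_cc(ROWS, "rank"))        # both tied rows for id 1
```

The NOT EXISTS query shown earlier behaves like the RANK() version here: it keeps every row that ties for the highest CC within an ID.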

SQL Local Minima and Maxima

I have this data:
row_id type value
1 a 1
2 a 2
3 a 3
4 a 5 --note that type a, value 4 is missing
5 a 6
6 a 7
7 b 1
8 b 2
9 b 3
10 b 4
11 b 5 --note that type b is missing no values from 1 to 5
12 c 1
13 c 3 --note that type c, value 2 is missing
I want to find the minimum and maximum values for each consecutive "run" within each type. That is, I want to return
row_id type group_num min_value max_value
1 a 1 1 3
2 a 2 5 7
3 b 1 1 5
4 c 1 1 1
5 c 2 3 3
I am a fairly experienced SQL user, but I've never solved this problem. Obviously I know how to get the overall minimum and maximum for each type, using GROUP, MIN, and MAX, but I'm really at a loss for these local minima and maxima. I haven't found anything on other questions that answers my question.
I'm using PL/SQL Developer with Oracle 11g. Thanks!
This is a gaps-and-islands problem. You can use an analytic function trick to find the chains of contiguous values for each type:
select type,
min(value) as min_value,
max(value) as max_value
from (
select type, value,
dense_rank() over (partition by type order by value)
- dense_rank() over (partition by null order by value) as chain
from your_table
)
group by type, chain
order by type, min(value);
The inner query uses the difference between the ranking of the values within the type and within the entire result set to create the 'chain' number. The outer query just uses that for the grouping.
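To see the chain numbers the inner query produces, here is a sketch that runs both queries on the question's data with Python's sqlite3 module. SQLite instead of Oracle, the subquery alias s, dropping "partition by null" in favour of a plain "order by value", and the helper names chains and islands are all assumptions for illustration.

```python
import sqlite3

# Rows from the question as (type, value) pairs.
ROWS = ([("a", v) for v in (1, 2, 3, 5, 6, 7)]
        + [("b", v) for v in (1, 2, 3, 4, 5)]
        + [("c", v) for v in (1, 3)])

# Rank within the type minus rank over the whole table: constant inside a run.
INNER = """
select type, value,
       dense_rank() over (partition by type order by value)
         - dense_rank() over (order by value) as chain
from your_table
"""

OUTER = f"""
select type, min(value) as min_value, max(value) as max_value
from ({INNER}) s
group by type, chain
order by type, min_value
"""

def _run(query, rows):
    con = sqlite3.connect(":memory:")
    con.execute("create table your_table (type text, value integer)")
    con.executemany("insert into your_table values (?, ?)", rows)
    result = con.execute(query).fetchall()
    con.close()
    return result

def chains(rows):
    """Return (type, value, chain) for every row, sorted by type and value."""
    return sorted(_run(INNER, rows))

def islands(rows):
    """Return (type, min_value, max_value) for each consecutive run."""
    return _run(OUTER, rows)

if __name__ == "__main__":
    for row in chains(ROWS):
        print(*row)
    for row in islands(ROWS):
        print(*row)
```

On this data the chain is 0 for the first run of each type and -1 after the gap. As far as I can tell, that only separates runs correctly when each value missing from one type occurs somewhere else in the table, as it does in the sample data (type b supplies the 4 and the 2).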
This is one way to achieve the result you require:
with step_1 as (
select w.type,
w.value,
w.value - row_number() over (partition by w.type order by w.row_id) as grp
from window_test w
), step_2 as (
select x.type,
x.value,
dense_rank() over (partition by x.type order by x.grp) as grp
from step_1 x
)
select rank() over (order by y.type, y.grp) as row_id,
y.type,
y.grp as group_num,
min(y.value) as min_val,
max(y.value) as max_val
from step_2 y
group by y.type, y.grp
order by 1;