postgresql count the top 3 rows - sql

I know we can use LIMIT in PostgreSQL to get the top 3 values in a relation, but what if there are duplicate values? For example:
5
4
4
3
2
Ordering in DESC order and using LIMIT 3 will just return 5,4,4. But how do we get 5,4,4,3 (the top 3 distinct values, with duplicates included)?
I know how to do this the long way, but I was wondering whether PostgreSQL has anything built in for it.

One easy way would be to use the dense_rank window function to rank the values as desired and then peel off those with the desired ranks.
For example, given this:
create table t (
id serial not null primary key, -- just a placeholder so that we can differentiate the duplicate `c`s.
c int not null
);
insert into t (c)
values (1), (1), (2), (3), (4), (4), (4), (5);
you'd presumably want the rows with c in (5,4,3) and you could do that with:
select id, c
from (
select id, c, dense_rank() over (order by c desc) as r
from t
) dt
where r <= 3
Demo: http://sqlfiddle.com/#!15/5b262/8
Note that you need to use dense_rank rather than rank because rank will arrange things in the right order but it will leave gaps in the rankings so r <= 3 wouldn't necessarily work. Compare the r values in the above fiddle with dense_rank and rank and you'll see the difference.
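The difference between the two ranking functions can be checked with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs anywhere, and it needs an SQLite build with window functions, 3.25 or later), and the helper name `top3` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key, c int not null)")
conn.executemany("insert into t (c) values (?)",
                 [(v,) for v in (1, 1, 2, 3, 4, 4, 4, 5)])

def top3(rank_fn):
    # rank_fn is "dense_rank" or "rank"; both are standard window functions.
    sql = f"""
        select c
        from (select c, {rank_fn}() over (order by c desc) as r from t) dt
        where r <= 3
        order by c desc
    """
    return [row[0] for row in conn.execute(sql)]

dense = top3("dense_rank")  # ranks: 5 -> 1, 4 -> 2, 3 -> 3
plain = top3("rank")        # ranks: 5 -> 1, 4 -> 2, 3 -> 5 (gap after the three 4s)
print(dense)  # [5, 4, 4, 4, 3]
print(plain)  # [5, 4, 4, 4]
```

With rank, the three tied 4s push the value 3 down to rank 5, so the r <= 3 filter drops it.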

Suppose you have the records below:
index Name
1 "ABC"
2 "XYZ"
2 "ABC"
1 "XYZ"
1 "XYZ"
In this case you can use DISTINCT to select only the non-duplicate records:
Select distinct index, name from table_name;

Related

PostgreSQL check if values in a given list exist in a table

Given below table in Postgres:
id some_col
1 a
1 b
2 a
3 a
I want to get output as id and true (if at least one row with that id is present in the table) or false (if no rows with that id are found in the table).
For example where id in (1, 2, 3, 4, 5):
id value
1 true
2 true
3 true
4 false
5 false
I tried GROUP BY and CASE with conditions, but is there a more efficient way of achieving this in SQL? GROUP BY counts all the rows per id, which is unnecessary in this scenario, and with an EXISTS query I couldn't find a way to return multiple rows.
Thanks in Advance!
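The IN clause alone won't do it, because it can only filter rows that already exist in the table. You need to build a list of values somehow (the VALUES table value constructor works in most RDBMSs) and check each value against the table, either with an outer join or, as here, with a correlated EXISTS: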
SELECT x.id, EXISTS (
SELECT *
FROM t
WHERE t.id = x.id
) AS value
FROM (VALUES (1), (2), (3), (4), (5)) AS x(id)
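A quick way to try the idea is sketched below with Python's sqlite3 module standing in for Postgres (an assumption for portability). SQLite does not accept the `AS x(id)` column alias on a derived table, so the value list is written as a CTE instead; the EXISTS logic is unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (id int, some_col text)")
conn.executemany("insert into t values (?, ?)",
                 [(1, "a"), (1, "b"), (2, "a"), (3, "a")])

sql = """
    with x(id) as (values (1), (2), (3), (4), (5))
    select x.id,
           exists (select 1 from t where t.id = x.id) as value
    from x
    order by x.id
"""
# SQLite returns 1/0 for EXISTS, so convert to Python booleans.
result = [(i, bool(v)) for i, v in conn.execute(sql)]
print(result)
# [(1, True), (2, True), (3, True), (4, False), (5, False)]
```

EXISTS can stop at the first matching row for each id, which is what makes it cheaper than counting every row per group.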

Effective way of locating top ranked rows on Oracle DB

I have a large table (millions of records) and I need to write an efficient select statement.
The table looks like this:
create table tab1 (
pt_key number
, cp_key number
, ext_info varchar2(10)
, resp_nm varchar2(20)
, resp_dttm date
, rank number
);
Sample records:
insert into tab1 values (1,1,'info1','OK', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,1,'info2','FAILED', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),2);
insert into tab1 values (1,1,'info3','SENT', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),3);
insert into tab1 values (1,1,'info4','SENT', to_date('02.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),3);
insert into tab1 values (1,2,'info5','OK', to_date('05.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,2,'info6','OK', to_date('06.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,2,'info7','FAILED', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),2);
For each combination of pt_key and cp_key (which are part of a composite primary key; the other columns are not indexed), I would like the query to return the record with the highest rank. If several records share the highest rank for a combination, pick the one with the greatest resp_dttm.
The select statement should return only the first four columns.
For the above posted sample data the desired result would be:
1 1 info4 SENT
1 2 info7 FAILED
Thanks for help.
Here's one approach using row_number():
select pt_key, cp_key, ext_info, resp_nm
from (
select tab1.*, row_number() over (partition by pt_key, cp_key
order by rank desc, resp_dttm desc) rn
from tab1
) t
where rn = 1
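The row_number() approach can be tried against the sample data with the sketch below. It uses Python's sqlite3 module rather than Oracle (an assumption, so that it runs without a database server), stores the dates as ISO strings, and renames the rank column to `rnk` purely to avoid any clash with the function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table tab1 (pt_key int, cp_key int, ext_info text,
                                   resp_nm text, resp_dttm text, rnk int)""")
conn.executemany("insert into tab1 values (?, ?, ?, ?, ?, ?)", [
    (1, 1, "info1", "OK",     "2018-03-01 17:00", 1),
    (1, 1, "info2", "FAILED", "2018-03-01 17:00", 2),
    (1, 1, "info3", "SENT",   "2018-03-01 17:00", 3),
    (1, 1, "info4", "SENT",   "2018-03-02 17:00", 3),
    (1, 2, "info5", "OK",     "2018-03-05 17:00", 1),
    (1, 2, "info6", "OK",     "2018-03-06 17:00", 1),
    (1, 2, "info7", "FAILED", "2018-03-01 17:00", 2),
])

sql = """
    select pt_key, cp_key, ext_info, resp_nm
    from (
        select tab1.*,
               row_number() over (partition by pt_key, cp_key
                                  order by rnk desc, resp_dttm desc) as rn
        from tab1
    ) t
    where rn = 1
    order by pt_key, cp_key
"""
best = conn.execute(sql).fetchall()
print(best)
# [(1, 1, 'info4', 'SENT'), (1, 2, 'info7', 'FAILED')]
```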
Here's another approach using FIRST aggregate function:
select pt_key,
cp_key,
max(ext_info) keep (dense_rank first order by t.rank desc, t.resp_dttm desc) as ext_info,
max(resp_nm) keep (dense_rank first order by t.rank desc, t.resp_dttm desc) as resp_nm
from tab1 t
group by pt_key, cp_key
Here's how it works on Oracle Live SQL
EDIT 2:
Result:
PT_KEY | CP_KEY | EXT_INFO | RESP_NM
--------+--------+----------+---------
1 | 1 | info4 | SENT
1 | 2 | info7 | FAILED
EDIT 1:
This solution has an important drawback if, for a certain combination of pt_key and cp_key, there are multiple rows with the same rank and resp_dttm values. In that case it will "combine" those rows and calculate the aggregates for ext_info and resp_nm separately (in my example it takes the max value of each).
You can refine that behavior by adding tertiary sort criteria to make the ranking distinct (e.g. add all other columns from the primary key).
The answer from @sgeddes is a bit better in that sense: it will use one (arbitrary) row from the equally ranked rows, without combining the data and without having to add sorting criteria. It is also easier to maintain, as it has the ranking criteria in one place, while mine has them in two.
You should probably test the performance of both in your specific scenario (e.g. specific indexes, specific data profile/statistics).

How to select only the next smaller value

I am trying to select the next smaller number from the database with SQL.
I have a table with records like this:
ID NodeName NodeType
4 A A
2 B B
2 C C
1 D D
0 E E
and other columns like name, and type.
If I pass "4" as a parameter then I want to receive the next smallest number records:
ID NodeName NodeType
2 B B
2 C C
Right now if I am using the < sign then it is giving me
ID NodeName NodeType
2 B B
2 C C
1 D D
0 E E
How can I get this working?
You can use WITH TIES clause:
SELECT TOP (1) WITH TIES *
FROM mytable
WHERE ID < 4
ORDER BY ID DESC
TOP clause in conjunction with WHERE and ORDER BY selects the next smallest value to 4. WITH TIES clause guarantees that all these values will be returned, in case there is more than one.
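TOP ... WITH TIES is specific to SQL Server, so the sketch below reproduces the same result with a portable scalar subquery instead (a swapped-in technique, not the syntax above). It runs on Python's sqlite3 module, and the table name `mytable` and the parameter value 4 come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (ID int, NodeName text, NodeType text)")
conn.executemany("insert into mytable values (?, ?, ?)", [
    (4, "A", "A"), (2, "B", "B"), (2, "C", "C"), (1, "D", "D"), (0, "E", "E"),
])

sql = """
    select ID, NodeName, NodeType
    from mytable
    where ID = (select max(ID) from mytable where ID < ?)
    order by NodeName
"""
# The subquery finds the next smaller ID (2); the outer query keeps every tie.
rows = conn.execute(sql, (4,)).fetchall()
print(rows)  # [(2, 'B', 'B'), (2, 'C', 'C')]
```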
select ID
from dbo.your_table
where ID in
(
select top 1 ID
from dbo.your_table
where ID < 4
order by ID desc
);
Note: dbo.your_table is your source table.
This uses an inner query to find the next smallest ID below your selected value. The outer query then returns all records whose ID matches that value.
Here's a full working example:
use TestDatabase;
go
create table dbo.TestTable1
(
ID int not null
);
go
insert into dbo.TestTable1 (ID)
values (6), (4), (2), (2), (1), (0);
go
select ID
from dbo.TestTable1
where ID in
(
select top 1 ID
from dbo.TestTable1
where ID < 4
order by ID desc
);
/*
ID
2
2
*/

create a table of duplicated rows of another table using the select statement

I have a table with one column containing different integers.
For each integer in the table, I would like to duplicate it as many times as it has digits.
For example:
12345 (5 digits):
1. 12345
2. 12345
3. 12345
4. 12345
5. 12345
I thought of doing it using with recursive t (...) as (), but I didn't manage to, since I don't really understand how it works and what is happening "behind the scenes".
I don't want to use insert because I want it to be scalable and automatic for as many integers as needed in a table.
Any thoughts and an explanation would be great.
The easiest way is to join to a table with numbers from 1 to n in it.
SELECT n, x
FROM yourtable
JOIN
(
SELECT day_of_calendar AS n
FROM sys_calendar.CALENDAR
WHERE n BETWEEN 1 AND 12 -- maximum number of digits
) AS dt
ON n <= CHAR_LENGTH(TRIM(ABS(x)))
In my example I abused TD's builtin calendar, but that's not a good choice, as the optimizer doesn't know how many rows will be returned and as the plan must be a Product Join it might decide to do something stupid. So better use a number table...
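The numbers-table join can be tried out with the sketch below. It uses Python's sqlite3 module instead of Teradata (an assumption), builds the numbers 1 to 12 with a recursive CTE rather than the system calendar, and adds a negative value (-42, not in the question) to show why ABS is applied before counting digits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table yourtable (x int)")
conn.executemany("insert into yourtable values (?)", [(12345,), (678,), (-42,)])

sql = """
    with recursive numbers(n) as (
        select 1
        union all
        select n + 1 from numbers where n < 12   -- maximum number of digits
    )
    select n, x
    from yourtable
    join numbers on n <= length(cast(abs(x) as text))
    order by x, n
"""
rows = conn.execute(sql).fetchall()

# Count how many copies each integer produced.
copies = {}
for n, x in rows:
    copies[x] = copies.get(x, 0) + 1
print(copies)  # {-42: 2, 678: 3, 12345: 5}
```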
Create a numbers table that will contain the integers from 1 to the maximum number of digits that the numbers in your table will have (I went with 6):
create table numbers(num int)
insert numbers
select 1 union select 2 union select 3 union select 4 union select 5 union select 6
You already have your table (but here's what I was using to test):
create table your_table(num int)
insert your_table
select 12345 union select 678
Here's the query to get your results:
select ROW_NUMBER() over(partition by b.num order by b.num) row_num, b.num, LEN(cast(b.num as char)) num_digits
into #temp
from your_table b
cross join numbers n
select t.num
from #temp t
where t.row_num <= t.num_digits
I found a nice way to perform this action. Here goes:
with recursive t (num,num_as_char,char_n)
as
(
select num
,cast (num as varchar (100)) as num_as_char
,substr (num_as_char,1,1)
from numbers
union all
select num
,substr (t.num_as_char,2) as num_as_char2
,substr (num_as_char2,1,1)
from t
where char_length (num_as_char2) > 0
)
select *
from t
order by num,char_length (num_as_char) desc
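To see what the recursion does step by step, here is a sketch of the same idea on Python's sqlite3 module (an assumption, since the query above relies on Teradata-style alias reuse that SQLite does not support). The anchor member starts with the full digit string, and each recursive step drops the first character until one digit is left, which yields one row per digit. The column name `rest` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (num int)")
conn.executemany("insert into your_table values (?)", [(12345,), (678,)])

sql = """
    with recursive t(num, rest) as (
        select num, cast(num as text) from your_table   -- anchor: full string
        union all
        select num, substr(rest, 2) from t              -- step: drop first char
        where length(rest) > 1
    )
    select num, rest
    from t
    order by num, length(rest) desc
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
# (678, '678'), (678, '78'), (678, '8'), (12345, '12345'), ... (12345, '5')
```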

Dynamic SQL Procedure that can insert into a table using a while loop to control the number of row entries

I have a small SQL-based challenge that I'm trying to solve to better my knowledge of Dynamic SQL.
My requirements are as follows.
I created a table that looks as follows:
CREATE TABLE Prison_Doors
(
DoorNum INT IDENTITY(1,1) PRIMARY KEY,
DoorOpen BIT,
DoorClosed BIT,
Trips INT
)
GO
I need to Create a Dynamic SQL Proc to insert 50 Door numbers and assign them as closed.
Expected result of proc:
|DoorNum|DoorOpen|DoorClosed|Trips|
|-------|--------|----------|-----|
| 1 | 0 | 1 |null |
|-------|--------|----------|-----|
|---------All the way to 50-------|
|-------|--------|----------|-----|
| 50 | 0 | 1 |null |
This is what I have written but it is not inserting:
BEGIN
DECLARE @SQL VARCHAR(8000)
DECLARE @Index INT
SET @Index=1
WHILE (@Index<=50)
BEGIN
SET @SQL= 'INSERT INTO Prison_Doors(DoorNum,DoorOpen,DoorClosed)
VALUES('+CAST(@Index AS VARCHAR)+',0,1),'
SET @Index=@Index+1
END
SET @SQL = SUBSTRING(@SQL, 1, LEN(@SQL)-1)
EXEC(@SQL)
END
I would like to know what I am doing wrong.
After all of this is done, I then need to run another loop that starts at door one and changes every second door to open and sets trips to one, then changes every third door to open and sets trips to two, and so on until all doors are open, at which point it selects the number of trips that it took.
I hope somebody can assist me with this as I am new to Dynamic SQL and I just need some guidance and not the complete solution.
Help is much appreciated :)
The error you got is because you are trying to insert into the identity column DoorNum:
DoorNum INT IDENTITY(1,1) PRIMARY KEY,
Remove that column from the column list. Instead of:
INSERT INTO Prison_Doors(DoorNum,DoorOpen,DoorClosed)
leave DoorNum out:
INSERT INTO Prison_Doors(DoorOpen,DoorClosed)
...
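Separately from the identity column, the loop in the question assigns a fresh string to the variable on every pass (SET with = and no concatenation), so only the statement for the last door survives until EXEC. The sketch below mirrors that string-building logic in Python; the function names are hypothetical, and sqlite3 is used only to confirm how many rows each generated statement inserts.

```python
import sqlite3

PREFIX = "INSERT INTO Prison_Doors(DoorOpen,DoorClosed) VALUES"

def build_overwriting():
    sql = ""
    for _ in range(1, 51):
        # Mirrors the question's SET: each pass replaces the previous text.
        sql = PREFIX + "(0,1),"
    return sql[:-1]  # strip the trailing comma

def build_appending():
    sql = PREFIX
    for _ in range(1, 51):
        # Appending keeps every row in one multi-row VALUES list.
        sql += "(0,1),"
    return sql[:-1]

def rows_inserted(statement):
    conn = sqlite3.connect(":memory:")
    conn.execute("""create table Prison_Doors (
        DoorNum integer primary key autoincrement,
        DoorOpen int, DoorClosed int, Trips int)""")
    conn.execute(statement)
    return conn.execute("select count(*) from Prison_Doors").fetchone()[0]

print(rows_inserted(build_overwriting()))  # 1
print(rows_inserted(build_appending()))    # 50
```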
However, there is no need for dynamic SQL to do that, you can do this using an anchor table like this:
WITH temp
AS
(
SELECT n
FROM (VALUES(1), (2), (3), (4)) temp(n)
), nums
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS n
FROM temp t1, temp t2, temp t3
)
INSERT INTO Prison_Doors(DoorOpen, DoorClosed)
SELECT 0 AS DoorOpen, 1 AS DoorClosed
FROM nums
WHERE n <= 50;
Update:
What my code does line by line?
Generating Sequence of numbers:
The first problem was generating a sequence of 50 numbers from 1 to 50, I used an anchor table with only four rows from 1 to 4 like this:
SELECT n
FROM (VALUES(1), (2), (3), (4)) temp(n);
This syntax using VALUES was introduced in SQL Server 2008; it is called a table value constructor (a row value constructor in standard SQL terms). After the VALUES list, you assign a table alias and the target column names in parentheses, like temp(n).
For older versions you have to use something like:
SELECT n
FROM
(
SELECT 1 AS n
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4
) AS temp;
This gives you only 4 rows, but we need to generate 50. That's why I CROSS JOIN this table with itself three times, using:
FROM temp t1, temp t2, temp t3
It is the same as
FROM temp t1
CROSS JOIN temp t2
CROSS JOIN temp t3
This multiplies the rows: 4 × 4 × 4 = 4^3 = 64 rows:
1 1 1
1 2 1
1 3 1
1 4 1
2 1 1
2 2 1
2 3 1
2 4 1
....
3 1 4
3 2 4
3 3 4
3 4 4
4 1 4
4 2 4
4 3 4
4 4 4
The Use Of ROW_NUMBER() Function:
Then using the ROW_NUMBER() will give us a ranking number from 1 to 64 like this:
WITH temp
AS
(
SELECT n
FROM (VALUES(1), (2), (3), (4)) temp(n)
)
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS n
FROM temp t1, temp t2, temp t3;
Note that ROW_NUMBER requires an ORDER BY clause. In our case the order doesn't matter, so I used (SELECT 1).
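The cross join plus ROW_NUMBER step can be checked with the sketch below, which runs on Python's sqlite3 module instead of SQL Server (an assumption). Two details differ from the T-SQL above: the CTE is named `base` rather than temp, and the window is written as OVER () because SQLite, unlike SQL Server, does not require an ORDER BY there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

sql = """
    with base(n) as (values (1), (2), (3), (4))
    select row_number() over () as n
    from base t1, base t2, base t3   -- 4 * 4 * 4 = 64 combinations
"""
numbers = sorted(row[0] for row in conn.execute(sql))
print(len(numbers), numbers[0], numbers[-1])  # 64 1 64
```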
There is also another way of generating a sequence of numbers without ROW_NUMBER. It also relies on the CROSS JOIN, with an anchor table like:
WITH temp
AS
(
SELECT n
FROM (
VALUES(0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
) temp(n)
), nums
AS
(
SELECT t1.n * 100 + t2.n * 10 + t3.n + 1 AS n
FROM temp t1, temp t2, temp t3
)
SELECT n
FROM nums
ORDER BY n;
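The arithmetic behind this variant treats the three joined copies as the hundreds, tens, and units digits, so the expression covers 1 to 1000 exactly once. A small sketch on Python's sqlite3 module (an assumption, with the CTE renamed to `digits`) confirms that:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

sql = """
    with digits(n) as (
        values (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
    )
    select t1.n * 100 + t2.n * 10 + t3.n + 1 as n
    from digits t1, digits t2, digits t3
    order by n
"""
sequence = [row[0] for row in conn.execute(sql)]
print(len(sequence), sequence[:3], sequence[-1])  # 1000 [1, 2, 3] 1000
```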
Another Way of Generating Sequence Of Numbers
Common Table Expressions:
The CTE (common table expression) was introduced in SQL Server 2005. It is one of the four types of table expressions that SQL Server supports. The other three are:
Derived tables,
Views, and
Inline table-valued functions.
It has a lot of important advantages. One of them is that it lets you create virtual tables that you can reuse later, like what I did:
WITH temp
AS
(
SELECT n
FROM (VALUES(1), (2), (3), (4)) temp(n)
), nums
AS
(
SELECT ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS n
FROM temp t1, temp t2, temp t3
)
...
Here I defined two CTEs: temp, and then nums, which reuses temp. This is fine: you can define multiple CTEs by separating them with a comma, each with its own name, the AS keyword, and a query in parentheses.
Insert into one table from another table using INSERT INTO ... SELECT ...
Now we have a virtual table nums with rows from 1 to 64, and we need to insert only rows 1 to 50.
For this, you can use the INSERT INTO ... SELECT ....
Note that the column list in the INSERT clause is optional, but if you omit it you have to supply a value for every column of the table. If you list four columns and supply only three values in the VALUES clause or in the SELECT clause, you will get an error. The exception is identity columns, which are defined with:
IDENTITY(1,1)
         ^ ^
         | +---- the increment
         +------ the seed (start value)
In this case you simply omit that column from the column list in the INSERT clause and it will receive the identity value. There is an option that lets you insert a value manually, as in @Raj's answer.
Note that in my answer I am not inserting the sequence numbers into that column; I am just inserting the same values 50 times. The actual door numbers are generated automatically by the identity column:
...
INSERT INTO Prison_Doors(DoorOpen, DoorClosed)
SELECT 0 AS DoorOpen, 1 AS DoorClosed
FROM nums
WHERE n <= 50;
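Putting the pieces together, the sketch below runs the whole insert on Python's sqlite3 module (an assumption, not SQL Server): `integer primary key autoincrement` plays the role of IDENTITY(1,1), the CTE is named `base` instead of temp, and OVER () replaces the ORDER BY (SELECT 1) that SQL Server requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table Prison_Doors (
    DoorNum integer primary key autoincrement,  -- stands in for IDENTITY(1,1)
    DoorOpen int,
    DoorClosed int,
    Trips int)""")

conn.execute("""
    with base(n) as (values (1), (2), (3), (4)),
    nums as (
        select row_number() over () as n
        from base t1, base t2, base t3
    )
    insert into Prison_Doors (DoorOpen, DoorClosed)
    select 0, 1
    from nums
    where n <= 50
""")

doors = conn.execute(
    "select DoorNum, DoorOpen, DoorClosed, Trips from Prison_Doors order by DoorNum"
).fetchall()
print(len(doors), doors[0], doors[-1])
# 50 (1, 0, 1, None) (50, 0, 1, None)
```

DoorNum is never mentioned in the INSERT, yet it comes out as 1 to 50 because the key column numbers the rows as they arrive.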
First, you have declared DoorNum as an identity column and then you are trying to explicitly insert values into it. SQL Server does not allow this unless you first run
SET IDENTITY_INSERT Prison_Doors ON
Try this query. It should insert the 50 rows you want:
DECLARE @SQL VARCHAR(8000)
DECLARE @Index INT
SET @Index=1
WHILE (@Index<=50)
BEGIN
SET @SQL= 'INSERT INTO Prison_Doors(DoorOpen,DoorClosed)VALUES(0,1)'
EXEC(@SQL)
SET @Index=@Index+1
END
Raj