Oracle ranking columns on multiple fields - sql

I am having some issues with ranking some columns in Oracle. I have two columns I need to rank--a group id and a date.
I want to group the table two ways:
Rank the records in each GROUP_ID by DATETIME (RANK_1)
Rank the GROUP_IDs by their DATETIME, GROUP_ID (RANK_2)
It should look like this:
GROUP_ID | DATE | RANK_1 | RANK_2
----------|------------|-----------|----------
2 | 1/1/2012 | 1 | 1
2 | 1/2/2012 | 2 | 1
2 | 1/4/2012 | 3 | 1
3 | 1/1/2012 | 1 | 2
1 | 1/3/2012 | 1 | 3
I have been able to do the former, but have been unable to figure out the latter.
SELECT group_id,
datetime,
ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
DENSE_RANK() OVER (ORDER BY group_id) AS rn2
FROM table_1
ORDER BY group_id;
This incorrectly orders the RANK_2 field:
GROUP_ID | DATE | RANK_1 | RANK_2
----------|------------|-----------|----------
1 | 1/3/2012 | 1 | 1
2 | 1/1/2012 | 1 | 2
2 | 1/2/2012 | 2 | 2
2 | 1/4/2012 | 3 | 2
3 | 1/1/2012 | 1 | 3

Assuming you don't have an actual id column in the table, it appears that you want to do the second rank by the earliest date in each group. This will require a nested subquery:
select group_id, datetime, rn,
dense_rank() over (order by EarliestDate, group_id) as rn2
from (SELECT group_id, datetime,
ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
min(datetime) OVER (partition by group_id) as EarliestDate
FROM table_1
) t
ORDER BY group_id;
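To check the nested query against the expected output in the question, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module (an assumption made for the sake of a runnable example; the question is about Oracle). The dates are rewritten as ISO strings so they sort correctly, and the final ORDER BY is changed to rn2, rn so the rows come back in the order the question shows.

```python
import sqlite3

# Sample rows from the question, with dates as ISO strings (assumption).
rows = [
    (2, "2012-01-01"),
    (2, "2012-01-02"),
    (2, "2012-01-04"),
    (3, "2012-01-01"),
    (1, "2012-01-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (group_id INTEGER, datetime TEXT)")
conn.executemany("INSERT INTO table_1 VALUES (?, ?)", rows)

query = """
SELECT group_id, datetime, rn,
       DENSE_RANK() OVER (ORDER BY earliest_date, group_id) AS rn2
FROM (SELECT group_id, datetime,
             ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
             MIN(datetime) OVER (PARTITION BY group_id) AS earliest_date
      FROM table_1) t
ORDER BY rn2, rn
"""

# Each tuple is (group_id, datetime, rn, rn2).
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

Groups 2 and 3 share the same earliest date, so group_id acts as the tie-breaker inside DENSE_RANK; without it both groups would get the same rn2.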

Related

Filtering consecutive dates ranges using SQL Server

I want to filter categories that only have consecutive dates.
I will explain with an example.
My table is
| ID | Category | Date |
|--------------------|-----------------|---------------------|
| 1 | 1 | 01-04-2021 |
| 2 | 1 | 02-04-2021 |
| 3 | 2 | 01-03-2021 |
| 4 | 2 | 04-03-2021 |
| 5 | 2 | 01-02-2010 |
| 6 | 3 | 02-02-2010 |
| 7 | 3 | 03-02-2010 |
| 8 | 4 | 03-02-2010 |
Expected output:
| Category |
|----------------|
| 1 |
| 3 |
| 4 |
I would like to filter my data so that I keep only the categories whose dates are all consecutive (a category with a single date also qualifies).
… for unique dates per category
select category
from mytable
group by category
having max(Date) = dateadd(day, count(*)-1, min(Date))
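The idea behind that HAVING clause is that, when dates are unique per category, n consecutive dates span exactly n - 1 days. A rough sketch of the same check in SQLite via Python's sqlite3 module follows; SQLite has no dateadd, so its date() function with a '+N days' modifier stands in for it, and the column name dt and the ISO date strings are assumptions of this example.

```python
import sqlite3

# Rows from the question: (id, category, date), dates converted from DD-MM-YYYY to ISO.
rows = [
    (1, 1, "2021-04-01"), (2, 1, "2021-04-02"),
    (3, 2, "2021-03-01"), (4, 2, "2021-03-04"), (5, 2, "2010-02-01"),
    (6, 3, "2010-02-02"), (7, 3, "2010-02-03"),
    (8, 4, "2010-02-03"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, category INTEGER, dt TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

query = """
SELECT category
FROM mytable
GROUP BY category
-- latest date must equal earliest date plus (row count - 1) days
HAVING MAX(dt) = DATE(MIN(dt), '+' || (COUNT(*) - 1) || ' days')
ORDER BY category
"""

consecutive = [r[0] for r in conn.execute(query)]
print(consecutive)
```

If a category can contain the same date twice, count(*) overcounts and the check fails for it, which is why the answer limits itself to unique dates per category.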
Here's one way. You may have to adjust it for your particular flavor of SQL.
WITH a AS (
SELECT
category,
DATEDIFF('days', LAG(date) OVER (PARTITION BY category ORDER BY date),
date) AS days_apart
FROM tbl
),
b AS (
SELECT
category,
MAX(days_apart) AS max_days_apart
FROM a
GROUP BY category
)
SELECT
category
FROM b
WHERE max_days_apart IS NULL OR max_days_apart = 1
select distinct category
from dates
where category not in (
select distinct category
from (
select category, [date],
row_number() over (partition by category order by [date]) as days_cnt,
min([date]) over (partition by category) as min_date
from dates
group by category, [date]
) as c
where c.[date]<>dateadd(d, c.days_cnt-1, c.min_date))
order by category
Categories where the sequence of dates is the same as the sequence of ids.
with cte as (
select [category],
row_number() over (partition by [category] order by [date], [id])
- row_number() over (partition by [category] order by [id]) drn
from mytable -- table name assumed; the original omitted the FROM clause
)
select [category]
from cte
group by [category]
having sum(abs(drn)) = 0;

SQL Server Add row number each group

I am working on a query for SQL Server 2016. The rows are ordered by serial_no, and I would like to add a row number that restarts each time pay_type changes, as in the example below:
row_no | pay_type | serial_no
1 | A | 4000118445
2 | A | 4000118458
3 | A | 4000118461
4 | A | 4000118473
5 | A | 4000118486
1 | B | 4000118499
2 | B | 4000118506
3 | B | 4000118519
4 | B | 4000118521
1 | A | 4000118534
2 | A | 4000118547
3 | A | 4000118550
1 | B | 4000118562
2 | B | 4000118565
3 | B | 4000118570
4 | B | 4000118572
Help me please..
SELECT
ROW_NUMBER() OVER(PARTITION BY paytype ORDER BY serial_no) as row_no,
paytype, serial_no
FROM table
ORDER BY serial_no
You can assign groups to adjacent pay types that are the same and then use row_number(). For this purpose, the difference of row numbers is a good way to determine the groups:
select row_number() over (partition by pay_type, seqnum - seqnum_2 order by serial_no) as row_no,
t.*
from (select t.*,
row_number() over (order by serial_no) as seqnum,
row_number() over (partition by pay_type order by serial_no) as seqnum_2
from t
) t;
This type of problem is one example of a gaps-and-islands problem. Why does the difference of row numbers work? I find that the simplest way to understand is to look at the results of the subquery.
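To make the subquery results visible, here is a small sketch that runs the same query in SQLite through Python's sqlite3 module on a shortened subset of the question's rows (the subset and the extra grp output column are choices of this example, not part of the answer above).

```python
import sqlite3

# Shortened subset of the question's data: (pay_type, serial_no).
rows = [
    ("A", 4000118445), ("A", 4000118458), ("A", 4000118461),
    ("B", 4000118499), ("B", 4000118506),
    ("A", 4000118534), ("A", 4000118547),
    ("B", 4000118562),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (pay_type TEXT, serial_no INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

query = """
SELECT pay_type, serial_no, seqnum, seqnum_2,
       seqnum - seqnum_2 AS grp,
       ROW_NUMBER() OVER (PARTITION BY pay_type, seqnum - seqnum_2
                          ORDER BY serial_no) AS row_no
FROM (SELECT t.*,
             ROW_NUMBER() OVER (ORDER BY serial_no) AS seqnum,
             ROW_NUMBER() OVER (PARTITION BY pay_type ORDER BY serial_no) AS seqnum_2
      FROM t) s
ORDER BY serial_no
"""

# Each tuple is (pay_type, serial_no, seqnum, seqnum_2, grp, row_no).
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

Within a run of adjacent rows of the same pay_type, seqnum and seqnum_2 both increase by one per row, so their difference stays constant. When the other pay_type interrupts, seqnum keeps counting while seqnum_2 does not, so the next run gets a different difference. The pair (pay_type, grp) therefore identifies each island.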
Add this to your select list:
ROW_NUMBER() OVER ( ORDER BY (SELECT 1) )
Since the query already sorts the rows, the window function does not need a sort of its own, which uses less CPU.

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
Putting this in words, I want to return per person the maximum date. I tried something like this
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This returns only one row (of course, because MAX returns only the overall maximum date), but I don't know how to approach this. Any help would be appreciated.
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
Given that your ids are increasing, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
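The "ids are increasing" condition matters because MAX(id) and MAX(date) are computed independently. The sketch below, run in SQLite through Python's sqlite3 module, compares the ROW_NUMBER query with the plain MAX/MAX query on the question's data, then adds one hypothetical out-of-order row (id 6 with an older date, invented for this example) to show where they diverge. Dates are rewritten as ISO strings so they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE P_PROV (id INTEGER, date TEXT, person_id INTEGER)")
conn.executemany("INSERT INTO P_PROV VALUES (?, ?, ?)", [
    (1, "2019-06-19", 1),
    (2, "2010-07-18", 2),
    (3, "2020-06-19", 1),
    (4, "2020-06-17", 2),
    (5, "2020-06-28", 3),
])

by_row_number = """
SELECT id, date, person_id
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) AS rn
      FROM P_PROV t) t
WHERE rn = 1
ORDER BY person_id
"""

by_independent_max = """
SELECT MAX(id) AS id, MAX(date) AS date, person_id
FROM P_PROV
GROUP BY person_id
ORDER BY person_id
"""

# With the question's data, ids grow with dates, so both queries agree.
before_rn = conn.execute(by_row_number).fetchall()
before_max = conn.execute(by_independent_max).fetchall()
print(before_rn == before_max)

# Hypothetical row: a newer id carrying an older date for person 1.
conn.execute("INSERT INTO P_PROV VALUES (6, '2018-01-01', 1)")
after_rn = conn.execute(by_row_number).fetchall()
after_max = conn.execute(by_independent_max).fetchall()
print(after_rn[0], after_max[0])
```

After the extra insert, the MAX/MAX query pairs id 6 with the date of id 3, a combination that exists in no row, while the ROW_NUMBER query still returns the real latest row.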

Select ONLY row with max(id) in SQL SERVER

I have a table A :
ID | ProductCatId | ProductCode | Price
1 | 1 | PROD0001 | 2
2 | 2 | PROD0005 | 2
3 | 2 | PROD0005 | 2
4 | 3 | PROD0008 | 2
5 | 5 | PROD0009 | 2
6 | 7 | PROD0012 | 2
I want to select ID, ProductCatId, ProductCode and Price with this condition:
if several rows share the same ProductCatId, keep only the one with max(ID), like this:
ID | ProductCatId | ProductCode | Price
1 | 1 | PROD0001 | 2
3 | 2 | PROD0005 | 2
4 | 3 | PROD0008 | 2
5 | 5 | PROD0009 | 2
6 | 7 | PROD0012 | 2
Go for window function and row_number()
select ID , ProductCatId , ProductCode , Price
from (
select ID , ProductCatId , ProductCode , Price, row_number() over (partition by ProductCatId order by ID desc) as rn
from myTable
) as t
where t.rn = 1
select
top 1 with ties
ID,ProductCatId,ProductCode,Price
from
table
order by
row_number() over (partition by productcatid order by id desc)
You may use row_number():
select t.*
from (select t.*,
row_number() over (partition by ProductCatId order by ID desc) as seqnum
from #Table t
) t
where seqnum = 1
order by ID;
You can try this,
Select Max(ID),ProductCatId,ProductCode,price
From TableName
Group By ProductCatId,ProductCode,price
A little shorter:
SELECT DISTINCT
max(ID) OVER (PARTITION BY ProductCatId,
ProductCode,
Price) AS ID,
ProductCatId,
ProductCode,
Price
FROM myTable

How can I remove partial duplicates in my MS SQL Database?

I have a database table which is automatically filled from different sources. Now I have the problem that there are some duplicate entries.
For example:
EID | TID | StartDate | EndDate
--------------------------------------------
1 | 1 | 20.01.2012 | 23.01.2012
1 | 2 | 25.01.2012 | 26.01.2012
1 | 3 | 27.01.2012 | 30.01.2012
2 | 2 | 20.02.2012 | 23.02.2012
2 | 2 | 25.01.2012 | 26.01.2012
3 | 1 | 20.01.2012 | 23.01.2012
As you can see, there are two rows in which EID and TID are the same. What I am trying to achieve is that, of these, the row with the later date is deleted.
The only workaround I found, is a query where only the lower ones are selected.
SELECT EID, TID, Min(StartDate), Min(EndDate) FROM Table1 GROUP BY EID, TID
You can use a CTE and the ROW_NUMBER function:
WITH CTE AS
(
SELECT EID, TID, StartDate, EndDate,
RN = ROW_NUMBER() OVER (PARTITION BY EID, TID ORDER BY StartDate, EndDate)
FROM Table1
)
DELETE FROM CTE WHERE RN > 1
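For a quick way to try the idea outside SQL Server, here is a rough sketch in SQLite via Python's sqlite3 module. SQLite cannot delete through a CTE the way SQL Server can, so this example swaps in a different mechanism: it numbers the rows with the same ROW_NUMBER expression and deletes by SQLite's implicit rowid. The ISO date strings are also an assumption of the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Table1 (EID INTEGER, TID INTEGER, StartDate TEXT, EndDate TEXT)"
)
conn.executemany("INSERT INTO Table1 VALUES (?, ?, ?, ?)", [
    (1, 1, "2012-01-20", "2012-01-23"),
    (1, 2, "2012-01-25", "2012-01-26"),
    (1, 3, "2012-01-27", "2012-01-30"),
    (2, 2, "2012-02-20", "2012-02-23"),
    (2, 2, "2012-01-25", "2012-01-26"),
    (3, 1, "2012-01-20", "2012-01-23"),
])

# Number the rows inside each (EID, TID) group, earliest dates first,
# then delete everything except the first row of each group.
conn.execute("""
DELETE FROM Table1
WHERE rowid IN (
    SELECT rid
    FROM (SELECT rowid AS rid,
                 ROW_NUMBER() OVER (PARTITION BY EID, TID
                                    ORDER BY StartDate, EndDate) AS rn
          FROM Table1) AS numbered
    WHERE rn > 1)
""")

remaining = conn.execute(
    "SELECT EID, TID, StartDate, EndDate FROM Table1 ORDER BY EID, TID"
).fetchall()
for row in remaining:
    print(row)
```

Ordering by StartDate, EndDate inside the window means the earliest row in each (EID, TID) group gets rn = 1 and survives, which matches the requirement to delete the row with the later date.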