Select Top 1 based on distinct columns

I need to select the top 1 record from each (UnitID, CompanyCode) group, ordered by the CreatedDate column.
Here's an example of my table
ID | UnitID | CompanyCode | CreatedDate |
----------------------------------------|
1 | A1 | G100 | 2020-03-12 |
2 | A1 | G100 | 2020-03-13 |
3 | A1 | G100 | 2020-03-14 |
4 | B2 | G100 | 2020-03-12 |
5 | B2 | F200 | 2020-03-13 |
6 | B2 | E300 | 2020-03-14 |
My expected results would be these rows
ID | UnitID | CompanyCode | CreatedDate |
----------------------------------------|
3 | A1 | G100 | 2020-03-14 |
4 | B2 | G100 | 2020-03-12 |
5 | B2 | F200 | 2020-03-13 |
6 | B2 | E300 | 2020-03-14 |
UnitID is considered first, then CompanyCode. If a record has a different CompanyCode within the same UnitID, it is displayed; if several records share the same CompanyCode, only the top 1 ordered by CreatedDate is selected.
SIMPLE QUERY: SELECT ID, UnitID, CompanyCode, CreatedDate FROM Tbl_Unit ORDER BY CreatedDate
Anyone know how this can be achieved?

If I understand correctly, the following statement may help:
SELECT ID, UnitID, CompanyCode, CreatedDate
FROM (
SELECT
ID, UnitID, CompanyCode, CreatedDate,
ROW_NUMBER() OVER (PARTITION BY UnitID, CompanyCode ORDER BY CreatedDate DESC) AS Rn
FROM Tbl_Unit
) t
WHERE Rn = 1
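As a quick check of the ROW_NUMBER() approach against the sample rows from the question, here is a minimal sketch that runs the query in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption; window functions need SQLite 3.25 or later), and the dates are stored as ISO text. The query orders by CreatedDate DESC because the expected output keeps the newest row of each (UnitID, CompanyCode) group.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tbl_Unit (ID INTEGER, UnitID TEXT, CompanyCode TEXT, CreatedDate TEXT)"
)
conn.executemany(
    "INSERT INTO Tbl_Unit VALUES (?, ?, ?, ?)",
    [
        (1, "A1", "G100", "2020-03-12"),
        (2, "A1", "G100", "2020-03-13"),
        (3, "A1", "G100", "2020-03-14"),
        (4, "B2", "G100", "2020-03-12"),
        (5, "B2", "F200", "2020-03-13"),
        (6, "B2", "E300", "2020-03-14"),
    ],
)

# Number the rows inside each (UnitID, CompanyCode) group, newest first,
# then keep only the first row of every group.
latest_per_group = conn.execute(
    """
    SELECT ID, UnitID, CompanyCode, CreatedDate
    FROM (
        SELECT ID, UnitID, CompanyCode, CreatedDate,
               ROW_NUMBER() OVER (
                   PARTITION BY UnitID, CompanyCode
                   ORDER BY CreatedDate DESC
               ) AS Rn
        FROM Tbl_Unit
    ) AS t
    WHERE Rn = 1
    ORDER BY ID
    """
).fetchall()

for row in latest_per_group:
    print(row)
```

With the sample data this should return IDs 3, 4, 5 and 6, which matches the expected rows in the question.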

I like using TOP 1 WITH TIES for handling this type of query on SQL Server:
SELECT TOP 1 WITH TIES *
FROM Tbl_Unit
ORDER BY
ROW_NUMBER() OVER (PARTITION BY UnitID, CompanyCode ORDER BY CreatedDate DESC);

Using the ROW_NUMBER() function:
SELECT ID, UnitID, CompanyCode, CreatedDate FROM
(
SELECT TAB.*, ROW_NUMBER() OVER (PARTITION BY UnitID, CompanyCode ORDER BY CreatedDate DESC) AS RNK FROM TAB
) Derived WHERE RNK = 1;
How the window expression breaks down:
ROW_NUMBER() -- function that generates the row number
OVER (PARTITION BY UnitID, CompanyCode -- columns that define each group
ORDER BY CreatedDate DESC) -- sort order within each group

Related

Filtering consecutive dates ranges using SQL Server

I want to filter categories that only have consecutive dates.
I will explain with an example.
My table is
| ID | Category | Date |
|--------------------|-----------------|---------------------|
| 1 | 1 | 01-04-2021 |
| 2 | 1 | 02-04-2021 |
| 3 | 2 | 01-03-2021 |
| 4 | 2 | 04-03-2021 |
| 5 | 2 | 01-02-2010 |
| 6 | 3 | 02-02-2010 |
| 7 | 3 | 03-02-2010 |
| 8 | 4 | 03-02-2010 |
Expected output:
| Category |
|----------------|
| 1 |
| 3 |
| 4 |
I would like to filter my data so that I keep only the categories whose dates are all consecutive.

… for unique dates per category
select category
from mytable
group by category
having max(Date) = dateadd(day, count(*)-1, min(Date))
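To see why the HAVING test works: if a category has n distinct dates and they are consecutive, the latest date must be exactly n - 1 days after the earliest. The sketch below checks this on the question's sample rows using SQLite through Python's sqlite3 module. It rests on a few assumptions: the dates in the question are read as dd-mm-yyyy and stored as ISO text, the table is called mytable as in the answer, and julianday() arithmetic stands in for SQL Server's dateadd().

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (ID INTEGER, Category INTEGER, Date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 1, "2021-04-01"),
        (2, 1, "2021-04-02"),
        (3, 2, "2021-03-01"),
        (4, 2, "2021-03-04"),
        (5, 2, "2010-02-01"),
        (6, 3, "2010-02-02"),
        (7, 3, "2010-02-03"),
        (8, 4, "2010-02-03"),
    ],
)

# A category passes when its latest date is exactly (row count - 1) days
# after its earliest date. This assumes dates are unique within a category.
consecutive_categories = [
    category
    for (category,) in conn.execute(
        """
        SELECT Category
        FROM mytable
        GROUP BY Category
        HAVING julianday(MAX(Date)) = julianday(MIN(Date)) + COUNT(*) - 1
        ORDER BY Category
        """
    )
]

print(consecutive_categories)
```

Category 2 is rejected because its three dates span far more than two days; a category with a single date passes trivially.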

Here's one way. You'll have to maybe adjust it for your particular flavor of SQL.
WITH a AS (
SELECT
category,
DATEDIFF('days', LAG(date) OVER (PARTITION BY category ORDER BY date), date) AS days_apart
FROM tbl
),
b AS (
SELECT
category,
MAX(days_apart) AS max_days_apart
FROM a
GROUP BY 1
)
SELECT
category
FROM b
WHERE max_days_apart IS NULL OR max_days_apart = 1

select distinct category
from dates
where category not in (
select distinct category
from (
select category, [date],
row_number() over (partition by category order by [date]) as days_cnt,
min([date]) over (partition by category) as min_date
from dates
group by category, [date]
) as c
where c.[date]<>dateadd(d, c.days_cnt-1, c.min_date))
order by category

Categories where the sequence of dates is the same as the sequence of ids:
with cte as (
select [category],
row_number() over (partition by [category] order by [date], [id])
- row_number() over (partition by [category] order by [id]) drn
from mytable -- assumed table name
)
select [category]
from cte
group by [category]
having sum(abs(drn)) = 0;

Select top 1 Student Fee From List In SQL Server

In my SQL Server table, I have this data:
+------+-----+------------+
| Name | Fee | Date_Time |
+------+-----+------------+
| AA | 50 | 2018-03-27 |
| AA | 30 | 2018-04-10 |
| BB | 40 | 2018-01-10 |
| BB | 10 | 2018-04-10 |
| CC | 10 | 2018-04-10 |
| DD | 10 | 2018-04-10 |
+------+-----+------------+
How can I get data using SQL query like TOP 1 for (AA, BB, CC, DD) ORDER BY Date_Time DESC into a list?
+------+-----+------------+
| Name | Fee | Date_Time |
+------+-----+------------+
| AA | 30 | 2018-04-10 |
| BB | 10 | 2018-04-10 |
| CC | 10 | 2018-04-10 |
| DD | 10 | 2018-04-10 |
+------+-----+------------+

Use the row_number() function to get the most recent Fee for each Name:
select top(1) with ties Name, Fee, Date_Time
from table t
order by row_number() over (partition by Name order by Date_Time desc)

Another approach can be
SELECT Name,Fee,Date_Time FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY NAME ORDER BY DATE_TIME DESC) RN
FROM [TABLE_NAME]
) T
WHERE RN=1
If a name has multiple entries on the same latest day and you want all of them to appear, you can use DENSE_RANK() instead of ROW_NUMBER(), as follows.
SELECT Name,Fee,Date_Time FROM
(
SELECT *, DENSE_RANK() OVER(PARTITION BY NAME ORDER BY DATE_TIME DESC) RN
FROM [TABLE_NAME]
) T
WHERE RN=1
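The difference between the two functions only shows up when rows tie on Date_Time. The sketch below is a rough illustration using SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption). The table name fees and the extra AA row with Fee 20 are hypothetical, added only to create a tie on the latest date.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fees (Name TEXT, Fee INTEGER, Date_Time TEXT)")
conn.executemany(
    "INSERT INTO fees VALUES (?, ?, ?)",
    [
        ("AA", 50, "2018-03-27"),
        ("AA", 30, "2018-04-10"),
        ("AA", 20, "2018-04-10"),  # hypothetical extra row that ties on the latest date
        ("BB", 40, "2018-01-10"),
        ("BB", 10, "2018-04-10"),
        ("CC", 10, "2018-04-10"),
        ("DD", 10, "2018-04-10"),
    ],
)


def top_rows(ranking_function):
    """Return the rows ranked 1 per Name by the given window function."""
    query = f"""
        SELECT Name, Fee, Date_Time
        FROM (
            SELECT *, {ranking_function}() OVER (
                PARTITION BY Name ORDER BY Date_Time DESC
            ) AS RN
            FROM fees
        ) AS T
        WHERE RN = 1
        ORDER BY Name, Fee
    """
    return conn.execute(query).fetchall()


row_number_rows = top_rows("ROW_NUMBER")  # exactly one row per Name
dense_rank_rows = top_rows("DENSE_RANK")  # every row tied on the latest date

print(len(row_number_rows))
print(len(dense_rank_rows))
```

ROW_NUMBER() keeps one of the two tied AA rows (which one is not guaranteed), while DENSE_RANK() keeps both.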

Assign a row_number partitioned by Name and ordered by Date_Time descending, then select the rows whose row_number is 1.
Query
;with cte as (
select [rn] = row_number() over(
partition by [Name]
order by [Date_Time] desc
), *
from [your_table_name]
)
select [Name], [Fee], [Date_Time]
from cte
where [rn] = 1;

Partition By over Two Columns in Row_Number function

I am trying to RANK the records using the following query:
SELECT
ROW_NUMBER() over (partition by
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate
order by TW.EMPL_ID,TW.Effective_Bdate) RN,
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate,Effective_BDate from
TT_EMPLOYEE_WORKDAY TW
where TW.HR_DOMAIN_CODE = 'SGP'
However the resultant Row_Number computed column only displays partition for the first column. Ideally I expected to have the same value for Row_Number where the partition by column data is identical.
Any clue where I might be going wrong?
USING RANK or DENSE RANK isn't an option as I want to identify all such rows for multiple employee where EMPL_ID, HR_DEPT_ID and Transfer_StartDate are same (RN=1)
Sample data:
RN AON_EMPL_ID HR_DEPT_ID Transfer_Startdate Effective_BDate
1 0100690 69895 01/01/2017 2017-01-01
2 0100690 69895 01/01/2017 2017-01-03
3 0100690 69895 01/01/2017 2017-01-04

expanding sample data to:
create table t (
aon_empl_id varchar(16)
, hr_dept_id varchar(16)
, Transfer_Startdate date
, Effective_bdate date
);
insert into t values
('0100690','69895','01/01/2017','2017-01-01')
,('0100690','69895','01/01/2017','2017-01-03')
,('0100690','69895','01/01/2017','2017-01-04')
,('0200700','69895','01/01/2016','2016-01-01')
,('0200700','69895','01/01/2016','2016-01-03')
,('0200700','69896','01/01/2017','2017-01-04')
,('0200700','69896','01/01/2017','2017-01-04');
using top with ties
select top 1 with ties
aon_empl_id
, hr_dept_id
, Transfer_Startdate = convert(char(10),Transfer_Startdate,120)
, Effective_bdate = convert(char(10),Effective_bdate,120)
from t
order by row_number() over (
partition by aon_empl_id, hr_dept_id, Transfer_Startdate
order by Effective_bdate
)
rextester demo: http://rextester.com/KOIZ42069
returns:
+-------------+------------+--------------------+-----------------+
| aon_empl_id | hr_dept_id | Transfer_Startdate | Effective_bdate |
+-------------+------------+--------------------+-----------------+
| 0100690 | 69895 | 2017-01-01 | 2017-01-01 |
| 0200700 | 69895 | 2016-01-01 | 2016-01-01 |
| 0200700 | 69896 | 2017-01-01 | 2017-01-04 |
+-------------+------------+--------------------+-----------------+
Alternative using a common table expression with row_number():
;with cte as (
select
rn = row_number() over (
partition by aon_empl_id, hr_dept_id, Transfer_Startdate
order by Effective_bdate
)
, aon_empl_id
, hr_dept_id
, Transfer_Startdate = convert(char(10),Transfer_Startdate,120)
, Effective_bdate = convert(char(10),Effective_bdate,120)
from t tw
)
select *
from cte
where rn = 1
returns:
+----+-------------+------------+--------------------+-----------------+
| rn | aon_empl_id | hr_dept_id | Transfer_Startdate | Effective_bdate |
+----+-------------+------------+--------------------+-----------------+
| 1 | 0100690 | 69895 | 2017-01-01 | 2017-01-01 |
| 1 | 0200700 | 69895 | 2016-01-01 | 2016-01-01 |
| 1 | 0200700 | 69896 | 2017-01-01 | 2017-01-04 |
+----+-------------+------------+--------------------+-----------------+

SELECT
RANK() over (partition by --or DENSE_RANK()
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate
order by TW.EMPL_ID,TW.Effective_Bdate) RN,
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate,Effective_BDate from
TT_EMPLOYEE_WORKDAY TW
where TW.HR_DOMAIN_CODE = 'SGP'
UPDATE
SELECT
RANK() over (partition by --or DENSE_RANK()
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate
order by TW.EMPL_ID) RN,
TW.EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate,Effective_BDate from
TT_EMPLOYEE_WORKDAY TW
where TW.HR_DOMAIN_CODE = 'SGP'
Order by RN,TW.Effective_Bdate

This bit of code appears to be working:
SELECT
dense_rank() over (partition by AON_EMPL_ID
order by AON_EMPL_ID,HR_DEPT_ID,Transfer_StartDate) RN,
TW.AON_EMPL_ID,TW.HR_DEPT_ID,TW.Transfer_Startdate,Effective_BDate from
TT_AON_EMPLOYEE_WORKDAY TW
where TW.HR_DOMAIN_CODE = 'SGP'
Apparently, I just need to partition by AON_EMPL_ID and everything else should go to Order By clause.
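To illustrate the point about moving columns from PARTITION BY to ORDER BY, here is a small sketch run against the expanded sample data from the earlier answer. It uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption), with dates stored as ISO text. Rows that share the same department and transfer start date get the same number within an employee.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (aon_empl_id TEXT, hr_dept_id TEXT, "
    "Transfer_Startdate TEXT, Effective_bdate TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("0100690", "69895", "2017-01-01", "2017-01-01"),
        ("0100690", "69895", "2017-01-01", "2017-01-03"),
        ("0100690", "69895", "2017-01-01", "2017-01-04"),
        ("0200700", "69895", "2016-01-01", "2016-01-01"),
        ("0200700", "69895", "2016-01-01", "2016-01-03"),
        ("0200700", "69896", "2017-01-01", "2017-01-04"),
        ("0200700", "69896", "2017-01-01", "2017-01-04"),
    ],
)

# Partition only by employee; rows that tie on (hr_dept_id, Transfer_Startdate)
# in the window ORDER BY receive the same DENSE_RANK value.
ranked = conn.execute(
    """
    SELECT aon_empl_id, hr_dept_id, Transfer_Startdate, Effective_bdate,
           DENSE_RANK() OVER (
               PARTITION BY aon_empl_id
               ORDER BY hr_dept_id, Transfer_Startdate
           ) AS RN
    FROM t
    ORDER BY aon_empl_id, hr_dept_id, Transfer_Startdate, Effective_bdate
    """
).fetchall()

for row in ranked:
    print(row)
```

If all three columns were in PARTITION BY instead, every row of a partition would tie on them and the numbering would restart for each distinct combination, which is what the question observed with ROW_NUMBER().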

Select ONLY row with max(id) in SQL SERVER

I have a table A :
ID | ProductCatId | ProductCode | Price
1 | 1 | PROD0001 | 2
2 | 2 | PROD0005 | 2
3 | 2 | PROD0005 | 2
4 | 3 | PROD0008 | 2
5 | 5 | PROD0009 | 2
6 | 7 | PROD0012 | 2
I want to select ID, ProductCatId, ProductCode, Price with this condition:
if several rows share the same ProductCatId, return only the row with max(ID), like:
ID | ProductCatId | ProductCode | Price
1 | 1 | PROD0001 | 2
3 | 2 | PROD0005 | 2
4 | 3 | PROD0008 | 2
5 | 5 | PROD0009 | 2
6 | 7 | PROD0012 | 2

Go for window function and row_number()
select ID , ProductCatId , ProductCode , Price
from (
select ID , ProductCatId , ProductCode , Price, row_number() over (partition by ProductCatId order by ID desc) as rn
from myTable
) as t
where t.rn = 1

select
top 1 with ties
ID,ProductCatId,ProductCode,Price
from
table
order by
row_number() over (partition by productcatid order by id desc)

You may use row_number():
select t.*
from (select t.*,
row_number() over (partition by ProductCatId order by ID desc) as seqnum
from #Table t
) t
where seqnum = 1
order by ID;

You can try this,
Select Max(ID) AS ID,ProductCatId,ProductCode,price
From TableName
Group By ProductCatId,ProductCode,price

A little shorter:
SELECT DISTINCT
max(ID) OVER (PARTITION BY ProductCatId,
ProductCode,
Price) AS ID,
ProductCatId,
ProductCode,
Price
FROM myTable

Oracle ranking columns on multiple fields

I am having some issues with ranking some columns in Oracle. I have two columns I need to rank--a group id and a date.
I want to group the table two ways:
Rank the records in each GROUP_ID by DATETIME (RANK_1)
Rank the GROUP_IDs by their DATETIME, GROUP_ID (RANK_2)
It should look like this:
GROUP_ID | DATE | RANK_1 | RANK_2
----------|------------|-----------|----------
2 | 1/1/2012 | 1 | 1
2 | 1/2/2012 | 2 | 1
2 | 1/4/2012 | 3 | 1
3 | 1/1/2012 | 1 | 2
1 | 1/3/2012 | 1 | 3
I have been able to do the former, but have been unable to figure out the latter.
SELECT group_id,
datetime,
ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
DENSE_RANK() OVER (ORDER BY group_id) AS rn2
FROM table_1
ORDER BY group_id;
This incorrectly orders the RANK_2 field:
GROUP_ID | DATE | RANK_1 | RANK_2
----------|------------|-----------|----------
1 | 1/3/2012 | 1 | 1
2 | 1/1/2012 | 1 | 2
2 | 1/2/2012 | 2 | 2
2 | 1/4/2012 | 3 | 2
3 | 1/1/2012 | 1 | 3

Assuming you don't have an actual id column in the table, it appears that you want to do the second rank by the earliest date in each group. This will require a nested subquery:
select group_id, datetime, rn,
dense_rank() over (order by EarliestDate, group_id) as rn2
from (SELECT group_id, datetime,
ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
min(datetime) OVER (partition by group_id) as EarliestDate
FROM table_1
) t
ORDER BY group_id;
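The nested-subquery idea can be checked against the sample rows from the question. The sketch below runs it in SQLite through Python's sqlite3 module, which is only a stand-in for Oracle here (an assumption); dates are stored as ISO text, and the final ORDER BY is changed to rn2, rn so the output follows the layout the question asked for.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (group_id INTEGER, datetime TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [
        (2, "2012-01-01"),
        (2, "2012-01-02"),
        (2, "2012-01-04"),
        (3, "2012-01-01"),
        (1, "2012-01-03"),
    ],
)

# Inner query: number rows per group and attach each group's earliest date.
# Outer query: rank the groups by that earliest date, breaking ties by group_id.
ranked = conn.execute(
    """
    SELECT group_id, datetime, rn,
           DENSE_RANK() OVER (ORDER BY EarliestDate, group_id) AS rn2
    FROM (
        SELECT group_id, datetime,
               ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY datetime) AS rn,
               MIN(datetime) OVER (PARTITION BY group_id) AS EarliestDate
        FROM table_1
    ) AS t
    ORDER BY rn2, rn
    """
).fetchall()

for row in ranked:
    print(row)
```

Groups 2 and 3 both start on 2012-01-01, so group_id breaks the tie and group 2 ranks first; group 1 starts latest and ranks third.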
