How to select unique records from a result in oracle SQL?

How to select unique records from a result in oracle SQL? - sql

I am running a SQL query on oracle database.
SELECT DISTINCT flow_id , COMPOSITE_NAME FROM CUBE_INSTANCE where flow_id IN(200148,
200162);
I am getting below results as follow.
200162 ABCWS1
200148 ABCWS3
200162 ABCWS2
200148 OutputLog
200162 OutputLog
In this result 200162 came thrice as composite Name is different in each result. But my requirement is to get only one row of 200162 which is 1st one. If result contains same flow_id multiple times then it should only display result of first flow_id and ignore whatever it has in 2nd and 3rd.
EXPECTED OUTPUT -
200162 ABCWS1
200148 ABCWS3
Could you please help me with modification of query?
Thank you in advance !!!

It appears that you want to take the lexicographically first composite name for each flow_id:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY flow_id ORDER BY COMPOSITE_NAME) rn
FROM CUBE_INSTANCE t
WHERE flow_id IN (200148, 200162)
)
SELECT flow_id, COMPOSITE_NAME
FROM cte
WHERE rn = 1;

There is no such thing as a "first" row, unless a column specifies that information.
But you can easily use aggregation for this purpose:
select ci.flow_id, min(ci.composite_name)
from cube_instance ci
where flow_id in (200148, 200162);
group by ci.flow_id
If you do have a column that specifies the ordering, you can still use aggregation. The equivalent of the "first" function in Oracle is:
select ci.flow_id,
min(ci.composite_name) keep (dense_rank first order by <ordering col>)
from cube_instance ci
where flow_id in (200148, 200162);
group by ci.flow_id

Related

ROW_NUMBER function does not start from 1

I would like to ask about strange behaviour in SQL Server whilst using ROW_NUMBER() Function. Typically it should start from 1 and Order values by the selected column in Order By clause, which for the most scenarios works for me just as it is supposed to, but I have a particular case when I use a basic Select Statement:
SELECT
ROW_NUMBER() OVER (ORDER BY VIN) AS RN,
*
FROM dbo.RawData
and I get such result:
RN VIN
6301 JTEBR3FJ00K096082
6302 JTEBR3FJ00K096132
6303 JTEBR3FJ00K096146
6304 JTEBR3FJ00K096163
6305 JTEBR3FJ00K096180
6306 JTEBR3FJ00K096275
1801 5TDDZRFHX0S820530
1802 5TDDZRFHX0S824111
1803 5TDDZRFHX0S824500
1804 5TDDZRFHX0S825971
1805 5TDDZRFHX0S826456
and those are the first columns in the return table. The whole ROW_NUMBER function works randomly, after chain from 6301 to 6306, the chain from 1801 to 1940 starts etc.
The VIN column (the one I sort data based on) is set to nvarchar(17)
could you please help with solving the issue which might occur in this case?
I would be grateful for any tips what might be wrong

You can use ORDER BY to order the rows in a desired way:
SELECT ROW_NUMBER() OVER (ORDER BY VIN) AS RN
,*
FROM dbo.RawData
ORDER BY RN;
As the row_number is calculated in the SELECTE, you can use its value in the ORDER BY clause without the need of nested query.

How to get first row of 3 specific values of a column using Oracle SQL?

I have a table which has ID, FAMILY, ENV_XML_PATH and CREATED_DATE columns.
ID
FAMILY
ENV_XML_PATH
CREATED_DATE
15826841
CRM
path1.xml
03-09-22 6:50:34AM
15826856
SCM
path3.xml
03-10-22 7:12:20AM
15826786
IC
path4.xml
02-10-22 12:50:52AM
15825965
CRM
path5.xml
02-10-22 1:50:52AM
15653951
null
path6.xml
04-10-22 12:50:52AM
15826840
FIN
path7.xml
03-10-22 2:34:09AM
15826841
SCM
path8.xml
02-10-22 8:40:52AM
15223450
IC
path9.xml
03-09-22 5:34:09AM
15026853
SCM
path10.xml
05-10-22 4:40:59AM
Now there are 18 DISTINCT values in FAMILY column and each value has multiple rows associated (as you can see from the above image).
What I want is to get the first row of 3 specific values (CRM, SCM and IC) in FAMILY column.
Something like this:
ID
FAMILY
ENV_XML_PATH
CREATED_DATE
15826841
CRM
path1.xml
date1
15826856
SCM
path3.xml
date2
15826786
IC
path4.xml
date3
I am new to this, though I understand the logic but I am not sure how to implement it. Kindly help. Thanks.

You can use RANK for that. Something like this:
WITH groupedData AS
(SELECT id, family, env_xml_path, created_date,
RANK () OVER (PARTITION BY family ORDER BY id) AS r_num
FROM yourtable
GROUP BY id, family, env_xml_path, created_date)
SELECT id, family, env_xml_path, created_date
FROM groupedData
WHERE r_num = 1
ORDER BY id;
Thus, within the first query, your data will be grouped by family and sorted by the column you want (in my example, it will be sorted by id).
After that, you will use the second query to only take the first row of each family.
Add a WHERE clause to the first query if you need to apply further restrictions on the result set.
See here a working example: db<>fiddle

You could use a window function to get to know the row number of each partition in family ordered by the created_date, and then filter by the the three families you are interested in:
with row_window as (
select
id,
family,
env_xml_path,
created_date,
row_number() over (partition by family order by created_date asc) as rn
from <your_table>
where family in ('CRM', 'SCM', 'IC')
)
select
id,
family,
env_xml_path,
created_date
from row_window
where rn = 1
Output:
ID
FAMILY
ENV_XML_PATH
CREATED_DATE
15826841
CRM
path1.xml
03-09-22 6:50:34
15826856
SCM
path3.xml
03-10-22 7:12:20
15826786
IC
path4.xml
02-10-22 12:50:52

The question doesn't really specify what 'first' means, but I assume it means the first to be added in the table, aka the person whose date is the oldest. Try this code:
SELECT DISTINCT * FROM (yourTable) WHERE Family = 'CRM' OR
Family = 'SCM' OR Family = 'IC' ORDER BY Created_Date ASC FETCH FIRST (number) ROWS ONLY;
What it does:
Distinct - It selects different rows, which means you won't get same type of rows at the top.
Where - checks if certain condition is true
OR - it means that the select should choose rows that match those requirements. In the current situation the distinct clause means that same rows won't repeat, so you won't be getting 2 different 'CRM' family names, so it will find the first 'CRM' then the first 'SCM' and so on.
ORDER BY - orders the column in specified order. In the current one, if first rows mean the oldest, then by ordering them by date and using ASC the oldest(aka smallest date) will be at the top.
FETCH FIRST (number) ROWS ONLY - It selects only the very first couple of rows you want. For example if you need 3 different 'first' rows you need to get FETCH FIRST 3 ROWS ONLY. Combined with the distinct word it will only show 3 different rows.

Sql Islands and Gaps Merge Contiguous records if relevant fields hold same values

I have created a test case here for my problem https://rextester.com/ZRXSQ14415
Its must each easier to show the problem to explain what I am trying to achieve.
I have a list of records across time and I wish to merge contiguous records into a single record.
Each record has a period Date, Risk Levels and a couple of flags. When these risks and flags are the same the records should be merged when they are different then they should be a separate row.
On the Rextester example, i have almost achieved my goal, however look at rows 3 + 4 of the result.
What I want to achieve is that rows 3 + 4 would be combined such that row 3
StartDate End Date Name ... ...
17.03.2019 20.03.2019 CPWJ40-A ... ...
As all flags and risk levels are the same.

Change the SEQ expression to
..
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS SEQ
..
This way you'll get the proper grouping of islands of ImplicitRisk,QCReadyRisk,IsQualityControlReady, ActivePeriod.

This answer is purely to complement Serg answer with the full query.
SELECT MIN(d.PeriodDate) AS StartDate,
MAX(d.PeriodDate) AS EndDate,
ImplicitRisk,
QcReadyRisk,
IsQualityControlReady,
ActivePeriod,
LocationEventName
FROM
(
SELECT c.*,
ROW_NUMBER() OVER (ORDER BY PeriodDate) - ROW_NUMBER() OVER (Partition BY LocationEventId, ImplicitRisk, QCReadyRisk, IsQualityControlReady, ActivePeriod ORDER BY PeriodDate) AS grp
FROM tab c
--order by PeriodDate
) d
group by ImplicitRisk, QcReadyRisk, IsQualityControlReady, ActivePeriod, LocationEventName, grp
order by 1

How to compare ordered datasets with the dataset before?

I have the following query:
select * from events order by Source, DateReceived
This gives me something like this:
I would like to get the results which i marked blue -> When there are two or more equal ErrorNr-Entries behind each other FROM THE SAME SOURCE.
So I have to compare every row with the row before. How can I achieve that?
This is what I want to get:

Apply the row number over partition by option on your table:
SELECT
ROW_NUMBER() OVER(PARTITION BY Source ORDER BY datereceived)
AS Row,
* FROM events
Either you can run a (max) having > 1 option on the result set's row number. Or if you need the details, apply the same query deducting the row nuber with 1.
Then you can make a join on the source and the row numbers and if the error nr is the same then you have a hit.

You can use the partition by as below.
select * from(select
*,row_number()over(partition by source,errornr order by Source, DateReceived) r
from
[yourtable])t
where r>1
You can specify your column names in the outer select.

How to Order only first 20 records in a resultset using SQL?

My requirement is to get the List of Diagnosis based on the most used Diagnosis. So, to achieve that I have added one Column named DiagnosisCounter in the tblDiagnosisMst Table of the database which increases by 1 for each Diagnosis the each time user selects it. So, my query is like below:
select DiagnosisID,DiagnosisCode,Name from tblDiagnosisMst
where GroupName = 'Common' and RecStatus = 'A' order by DiagnosisCounter desc,
Name asc
So, this query is helping me to get the list of Diagnosis but in descending order for Diagnosis and then alphabetically for Diagnosis Name. But now my client wants to show only 20 most used Diagnosis name at the top and then all the names should appear in alphabetical order. But unfortunately I am stuck in this point. It would be so appreciative if I get your helpful advice for this problem.

This should do the trick:
;With Ordered as (
select DiagnosisID,DiagnosisCode,Name,
ROW_NUMBER() OVER (ORDER BY DiagnosisCounter desc) as rn
from tblDiagnosisMst
where GroupName = 'Common' and RecStatus = 'A'
)
select * from Ordered
order by CASE WHEN rn <= 20 THEN rn ELSE 21 END,
Name asc
We use ROW_NUMBER to assign the numbers 1-x to each of the rows, based on the diagnosiscounter. We then use that value for the first ORDER BY condition if it's in 1-20, and all other rows sort equally in position 21. The second condition is then used as a tie-breaker to sort those remaining row by name.

Try this
SELECT TOP 20
* FROM tblDiagnosisMst ORDER BY DiagnosisCounter;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to select unique records from a result in oracle SQL? - sql

It appears that you want to take the lexicographically first composite name for each flow_id: WITH cte AS ( SELECT t.*, ROW_NUMBER() OVER (PARTITION BY flow_id ORDER BY COMPOSITE_NAME) rn FROM CUBE_INSTANCE t WHERE flow_id IN (200148, 200162) ) SELECT flow_id, COMPOSITE_NAME FROM cte WHERE rn = 1;

Related

ROW_NUMBER function does not start from 1

How to get first row of 3 specific values of a column using Oracle SQL?

Sql Islands and Gaps Merge Contiguous records if relevant fields hold same values

How to compare ordered datasets with the dataset before?

How to Order only first 20 records in a resultset using SQL?

Categories

Resources