Oracle-Complex sql view creation - sql

I have a table like below:
For each distinct combination of ID and VALUE, I have several steps. For example, for the combination of A and B, I have three steps: QC, LC and DR, and so on for C and D. Now, I want a view like below:
That is, I want a column "OUTPUT" in the view that holds the first step after QC for each combination of ID and VALUE. For example, for A and B, the first step after QC is LC, so the OUTPUT value is LC. For C and D, there is no QC, so the OUTPUT value is NA.
Can anyone please help me on this issue.
Thanks in advance.

In SQL, tables are inherently unordered. So, you need a column to specify the ordering. Let me assume that you have such a column, say StepOrder in the table. If so, then you can do what you want using analytic functions.
The lead() in the innermost subquery returns the next step. The max() in the middle subquery picks out the step that follows QC, and the outer max() spreads that value over all rows with the same id and value (your_table stands in for your actual table name):
select id, value, step,
       coalesce(max(qc_next) over (partition by id, value), 'NA') as "Output"
from (select t.*,
             max(case when step = 'QC' then nextstep end) over (partition by id, value) as qc_next
      from (select t.*,
                   lead(step) over (partition by id, value order by StepOrder) as nextstep
            from your_table t
           ) t
     ) t;
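To try the idea without an Oracle instance, here is a minimal sketch that runs the same lead()/max() logic against SQLite through Python's sqlite3 module. The table name steps, the column step_order and the sample rows are assumptions made up for illustration, and SQLite (3.25 or later) is only a stand-in for Oracle here.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE steps (id TEXT, value TEXT, step TEXT, step_order INTEGER)")
conn.executemany(
    "INSERT INTO steps VALUES (?, ?, ?, ?)",
    [
        ("A", "B", "QC", 1),
        ("A", "B", "LC", 2),
        ("A", "B", "DR", 3),
        ("C", "D", "LC", 1),  # hypothetical steps for C/D: no QC here
        ("C", "D", "DR", 2),
    ],
)

query = """
SELECT id, value, step,
       COALESCE(MAX(CASE WHEN step = 'QC' THEN next_step END)
                    OVER (PARTITION BY id, value), 'NA') AS output
FROM (SELECT s.*,
             LEAD(step) OVER (PARTITION BY id, value ORDER BY step_order) AS next_step
      FROM steps s) AS t
ORDER BY id, value, step_order
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row of the A/B group gets LC, and every row of the C/D group falls back to NA because no QC step exists there.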

Related

How to get first row of 3 specific values of a column using Oracle SQL?

I have a table which has ID, FAMILY, ENV_XML_PATH and CREATED_DATE columns.
ID        FAMILY  ENV_XML_PATH  CREATED_DATE
15826841  CRM     path1.xml     03-09-22 6:50:34AM
15826856  SCM     path3.xml     03-10-22 7:12:20AM
15826786  IC      path4.xml     02-10-22 12:50:52AM
15825965  CRM     path5.xml     02-10-22 1:50:52AM
15653951  null    path6.xml     04-10-22 12:50:52AM
15826840  FIN     path7.xml     03-10-22 2:34:09AM
15826841  SCM     path8.xml     02-10-22 8:40:52AM
15223450  IC      path9.xml     03-09-22 5:34:09AM
15026853  SCM     path10.xml    05-10-22 4:40:59AM
There are 18 distinct values in the FAMILY column, and each value has multiple rows associated with it (as you can see from the table above).
What I want is to get the first row for each of 3 specific values (CRM, SCM and IC) in the FAMILY column.
Something like this:
ID        FAMILY  ENV_XML_PATH  CREATED_DATE
15826841  CRM     path1.xml     date1
15826856  SCM     path3.xml     date2
15826786  IC      path4.xml     date3
I am new to this. I understand the logic, but I am not sure how to implement it. Kindly help. Thanks.
You can use RANK for that. Something like this:
WITH groupedData AS
  (SELECT id, family, env_xml_path, created_date,
          RANK () OVER (PARTITION BY family ORDER BY id) AS r_num
   FROM yourtable
   GROUP BY id, family, env_xml_path, created_date)
SELECT id, family, env_xml_path, created_date
FROM groupedData
WHERE r_num = 1
ORDER BY id;
Thus, within the first query, your data will be partitioned by family and ranked by the column you want (in my example, by id).
After that, the second query keeps only the rows ranked first in each family.
Add a WHERE clause to the first query if you need to apply further restrictions on the result set.
See here a working example: db<>fiddle
You could use a window function to number the rows within each family partition, ordered by created_date, and then filter by the three families you are interested in:
with row_window as (
  select
    id,
    family,
    env_xml_path,
    created_date,
    row_number() over (partition by family order by created_date asc) as rn
  from <your_table>
  where family in ('CRM', 'SCM', 'IC')
)
select
  id,
  family,
  env_xml_path,
  created_date
from row_window
where rn = 1
Output:
ID        FAMILY  ENV_XML_PATH  CREATED_DATE
15826841  CRM     path1.xml     03-09-22 6:50:34
15826856  SCM     path3.xml     03-10-22 7:12:20
15826786  IC      path4.xml     02-10-22 12:50:52
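As a rough way to check the row_number() approach locally, the sketch below runs it on SQLite through Python's sqlite3 module. The table name env_files, the ids, paths and ISO-formatted timestamps are all invented for illustration (they are not the rows from the question), and "first" is assumed to mean the oldest created_date.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE env_files (id INTEGER, family TEXT, env_xml_path TEXT, created_date TEXT)"
)
conn.executemany(
    "INSERT INTO env_files VALUES (?, ?, ?, ?)",
    [
        (1, "CRM", "a.xml", "2022-09-03 06:50:34"),
        (2, "CRM", "b.xml", "2022-10-02 01:50:52"),
        (3, "SCM", "c.xml", "2022-10-03 07:12:20"),
        (4, "SCM", "d.xml", "2022-10-02 08:40:52"),
        (5, "IC", "e.xml", "2022-10-02 00:50:52"),
        (6, "FIN", "f.xml", "2022-10-03 02:34:09"),  # filtered out by the WHERE clause
    ],
)

query = """
WITH row_window AS (
    SELECT id, family, env_xml_path, created_date,
           ROW_NUMBER() OVER (PARTITION BY family ORDER BY created_date) AS rn
    FROM env_files
    WHERE family IN ('CRM', 'SCM', 'IC')
)
SELECT id, family, env_xml_path, created_date
FROM row_window
WHERE rn = 1
ORDER BY family
"""
first_rows = conn.execute(query).fetchall()
for row in first_rows:
    print(row)
```

Exactly one row per requested family comes back, namely the one with the earliest created_date in that family.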
The question doesn't really specify what 'first' means, but I assume it means the first row added to the table, i.e. the row with the oldest date. Try this code:
SELECT DISTINCT * FROM (yourTable) WHERE Family = 'CRM' OR
Family = 'SCM' OR Family = 'IC' ORDER BY Created_Date ASC FETCH FIRST (number) ROWS ONLY;
What it does:
DISTINCT - removes rows that are identical in every column, so exact duplicates appear only once. It does not limit the result to one row per family.
WHERE - keeps only the rows for which the condition is true.
OR - a row is kept if it matches any of the listed conditions, i.e. if its family is CRM, SCM or IC.
ORDER BY - sorts the result by the given column. If 'first' means the oldest, ordering by date with ASC puts the oldest (smallest) date at the top.
FETCH FIRST (number) ROWS ONLY - returns only the first (number) rows of the sorted result. With FETCH FIRST 3 ROWS ONLY you get the three oldest rows among the selected families. These may all belong to the same family, so number the rows per family (as in the window-function answers) if you need exactly one row for each.

How to select unique records from a result in oracle SQL?

I am running a SQL query on oracle database.
SELECT DISTINCT flow_id , COMPOSITE_NAME FROM CUBE_INSTANCE where flow_id IN(200148,
200162);
I am getting below results as follow.
200162 ABCWS1
200148 ABCWS3
200162 ABCWS2
200148 OutputLog
200162 OutputLog
In this result, 200162 appears three times because the composite name is different in each row. But my requirement is to get only one row for 200162, the first one. If the result contains the same flow_id multiple times, it should display only the first row for that flow_id and ignore the 2nd and 3rd.
EXPECTED OUTPUT -
200162 ABCWS1
200148 ABCWS3
Could you please help me with modification of query?
Thank you in advance !!!
It appears that you want to take the lexicographically first composite name for each flow_id:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY flow_id ORDER BY COMPOSITE_NAME) rn
FROM CUBE_INSTANCE t
WHERE flow_id IN (200148, 200162)
)
SELECT flow_id, COMPOSITE_NAME
FROM cte
WHERE rn = 1;
There is no such thing as a "first" row, unless a column specifies that information.
But you can easily use aggregation for this purpose:
select ci.flow_id, min(ci.composite_name)
from cube_instance ci
where flow_id in (200148, 200162)
group by ci.flow_id;
If you do have a column that specifies the ordering, you can still use aggregation. The equivalent of the "first" function in Oracle is:
select ci.flow_id,
       min(ci.composite_name) keep (dense_rank first order by <ordering col>)
from cube_instance ci
where flow_id in (200148, 200162)
group by ci.flow_id;
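Here is a small sketch of both ideas on the rows from the question, run on SQLite via Python's sqlite3 module. SQLite has no KEEP (DENSE_RANK FIRST ...) clause, so the second query substitutes a ROW_NUMBER() filter for it. The seq column is a hypothetical ordering column that the original table is not known to have.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
# seq is a hypothetical column recording the order in which rows arrived.
conn.execute("CREATE TABLE cube_instance (seq INTEGER, flow_id INTEGER, composite_name TEXT)")
conn.executemany(
    "INSERT INTO cube_instance VALUES (?, ?, ?)",
    [
        (1, 200162, "ABCWS1"),
        (2, 200148, "ABCWS3"),
        (3, 200162, "ABCWS2"),
        (4, 200148, "OutputLog"),
        (5, 200162, "OutputLog"),
    ],
)

# Variant 1: plain aggregation, taking the alphabetically smallest name.
by_name = conn.execute("""
    SELECT flow_id, MIN(composite_name)
    FROM cube_instance
    WHERE flow_id IN (200148, 200162)
    GROUP BY flow_id
    ORDER BY flow_id
""").fetchall()

# Variant 2: first row per flow_id according to the ordering column.
by_seq = conn.execute("""
    SELECT flow_id, composite_name
    FROM (SELECT ci.*,
                 ROW_NUMBER() OVER (PARTITION BY flow_id ORDER BY seq) AS rn
          FROM cube_instance ci
          WHERE flow_id IN (200148, 200162)) AS t
    WHERE rn = 1
    ORDER BY flow_id
""").fetchall()

print(by_name)
print(by_seq)
```

With this sample data both variants agree, but only the second one stays correct when the earliest row does not happen to carry the alphabetically smallest name.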

Using a subquery in a where clause to find the second smallest value

I'm trying to find the second smallest value for a list to put it into SSRS as a way to highlight that value. This issue is there are multiple minimum values for a given element. The data is presented such that there is an overarching group A that encompasses smaller groups B and I am wanting the second smallest value for each of the smaller groups.
I have a query set up right now that uses a subquery in the where clause to exclude the minimum value from the search so that the second smallest value will be considered the new minimum value. This seemed to work but the subquery only rules out the minimum value for the larger A group, which may or may not be the minimum value for each B group. Here is my query:
Select
    BPosition,
    Min(Value) as SecondMinimum
From Table
Where Value > (Select
                   Min(Value)
               From Table
               Where APosition = @AName)
  and APosition = @AName
Group By BPosition
I was expecting a list of the second smallest values for each B group, but it is pulling in the smallest value in each B group that is greater than the smallest value of the A group. This is right for the one B group that contains the true smallest value but incorrect for the others.
If you want the second smallest value, use dense_rank():
Select distinct BPosition, Value as SecondMinimum
From (select t.*,
             dense_rank() over (partition by APosition, BPosition order by Value) as seqnum
      from Table t
     ) t
where seqnum = 2;
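The following is a minimal sketch of the dense_rank() idea, run on SQLite through Python's sqlite3 module. The table name measurements and all sample values are invented for illustration. Note how ties on the minimum share rank 1, so rank 2 is the second smallest distinct value.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (aposition TEXT, bposition TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [
        ("A1", "B1", 1), ("A1", "B1", 1), ("A1", "B1", 3), ("A1", "B1", 5),
        ("A1", "B2", 2), ("A1", "B2", 4), ("A1", "B2", 4),
        ("A1", "B3", 7),  # only one distinct value, so no second minimum
    ],
)

query = """
SELECT DISTINCT bposition, value AS second_minimum
FROM (SELECT m.*,
             DENSE_RANK() OVER (PARTITION BY aposition, bposition
                                ORDER BY value) AS seqnum
      FROM measurements m) AS t
WHERE seqnum = 2 AND aposition = ?
ORDER BY bposition
"""
second_minimums = conn.execute(query, ("A1",)).fetchall()
print(second_minimums)
```

Group B1 has its minimum twice, yet the query still reports 3 as the second smallest value. B3 is absent from the result because it has no second distinct value.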

How to implement lag function in teradata.

Input :
Output :
I want the output as shown in the image below.
In the output image, 4 in 'behind' is evaluated as tot_cnt-tot and the subsequent numbers in 'behind', for eg: 2 is evaluated as lag(behind)-tot & as long as the 'rank' remains same, even 'behind' should remain same.
Can anyone please help me implement this in teradata?
You appear to want:
select t.*, (select count(*)
             from table t1
             where t1.rank > t.rank
            ) as behind
from table t;
I would summarize the data and do:
select id, max(tot_cnt), max(tot),
(max(tot_cnt) -
sum(max(tot)) over (order by id rows between unbounded preceding and current row)
) as diff
from t
group by id;
This provides one row per id, which makes a lot more sense to me. If you want the original data rows (which are all duplicates anyway), you can join this back to your table.
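Since the input image is not available, here is a sketch of the running-sum approach on made-up data, run on SQLite through Python's sqlite3 module. The table name scores and its rows are assumptions. To stay on safe ground the aggregation is done in a subquery and the cumulative sum in the outer query, rather than nesting max() inside sum() over ().

```python
import sqlite3

# In-memory SQLite database as a stand-in for Teradata (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER, tot_cnt INTEGER, tot INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        (1, 10, 6), (1, 10, 6),  # duplicate detail rows per id
        (2, 10, 2), (2, 10, 2),
        (3, 10, 2),
    ],
)

query = """
SELECT id, tot_cnt, tot,
       tot_cnt - SUM(tot) OVER (ORDER BY id
                                ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS behind
FROM (SELECT id, MAX(tot_cnt) AS tot_cnt, MAX(tot) AS tot
      FROM scores
      GROUP BY id) AS s
ORDER BY id
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

The first behind value is tot_cnt - tot (10 - 6 = 4), and each later one is the previous behind minus the current tot (4 - 2 = 2, then 2 - 2 = 0), which is the same as subtracting the running total from tot_cnt.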

How to compare ordered datasets with the dataset before?

I have the following query:
select * from events order by Source, DateReceived
This gives me something like this:
I would like to get the results which I marked blue, i.e. rows where two or more equal ErrorNr entries follow each other FROM THE SAME SOURCE.
So I have to compare every row with the row before. How can I achieve that?
This is what I want to get:
Number the rows of your table within each source using ROW_NUMBER() with PARTITION BY:
SELECT
ROW_NUMBER() OVER(PARTITION BY Source ORDER BY datereceived)
AS Row,
* FROM events
You can then either apply a HAVING MAX(...) > 1 condition to the result set's row number or, if you need the details, run the same query a second time with 1 subtracted from the row number.
Then join the two result sets on the source and the row numbers; if the error number is the same in both, you have a hit.
You can use the partition by as below.
select *
from (select *,
             row_number() over (partition by source, errornr order by Source, DateReceived) r
      from [yourtable]) t
where r > 1
You can specify your column names in the outer select.
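Comparing each row with the one before it can also be written with LAG() and LEAD(), which is a different technique from the row-number approaches above. The sketch below runs on SQLite through Python's sqlite3 module. The column names and sample rows are assumptions, since the original screenshot is not available.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (source TEXT, date_received TEXT, error_nr INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("S1", "2020-01-01 10:00", 5),
        ("S1", "2020-01-01 11:00", 5),  # same error as the row before
        ("S1", "2020-01-01 12:00", 7),
        ("S1", "2020-01-01 13:00", 5),  # same error again, but not adjacent
        ("S2", "2020-01-01 10:00", 7),
        ("S2", "2020-01-01 11:00", 3),
    ],
)

query = """
SELECT source, date_received, error_nr
FROM (SELECT e.*,
             LAG(error_nr)  OVER (PARTITION BY source ORDER BY date_received) AS prev_error,
             LEAD(error_nr) OVER (PARTITION BY source ORDER BY date_received) AS next_error
      FROM events e) AS t
WHERE error_nr = prev_error OR error_nr = next_error
ORDER BY source, date_received
"""
repeated = conn.execute(query).fetchall()
for row in repeated:
    print(row)
```

Only the two adjacent S1 rows with error 5 are returned. The later S1 row with error 5 is skipped because a different error sits in between, and comparisons against the NULL at a partition edge never match.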