Trying to aggregate string column by choosing 1 of several values

I have a dataset that I want to aggregate based on a string column. Dataset is basically:
system status
-------------------
PRE1-SYS1 SUCCESS
PRE1-SYS2 SUCCESS
PRE2-SYS1 RUNNING
PRE2-SYS2 SUCCESS
PRE3-SYS1 SUCCESS
PRE3-SYS2 <blank>
Basically, I want this to become:
system status
-------------------
PRE1 SUCCESS
PRE2 RUNNING
PRE3 RUNNING
I have the SQL needed to trim the system values down to PRE1, PRE2, and so on, but I'm not sure how to aggregate the status column so that, when a system has:
only SUCCESS, then status is SUCCESS
only nulls, then status is PENDING
any other combination, then RUNNING (SUCCESS/RUNNING, SUCCESS/null, RUNNING/null)
I've looked at LISTAGG but I don't think it applies.

Here is an SQL query you could use:
select regexp_substr(system, '^[^-]*') as prefix,
case
when count(status) = 0 then 'PENDING'
when count(*) = count(case when status = 'SUCCESS' then 1 end) then 'SUCCESS'
else 'RUNNING'
end as status
from mytable
group by regexp_substr(system, '^[^-]*')
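To see the counting logic run end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. It rests on two assumptions: SQLite has no regexp_substr, so the prefix is taken with substr/instr instead, and the blank status is stored as NULL (Oracle treats an empty string as NULL, SQLite does not).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (system TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("PRE1-SYS1", "SUCCESS"),
        ("PRE1-SYS2", "SUCCESS"),
        ("PRE2-SYS1", "RUNNING"),
        ("PRE2-SYS2", "SUCCESS"),
        ("PRE3-SYS1", "SUCCESS"),
        ("PRE3-SYS2", None),  # the <blank> row, stored as NULL (assumption)
    ],
)

# count(status) ignores NULLs, so 0 means "only nulls" -> PENDING.
# count(*) equal to the number of SUCCESS rows means "only SUCCESS".
query = """
SELECT substr(system, 1, instr(system, '-') - 1) AS prefix,
       CASE
           WHEN count(status) = 0 THEN 'PENDING'
           WHEN count(*) = count(CASE WHEN status = 'SUCCESS' THEN 1 END)
               THEN 'SUCCESS'
           ELSE 'RUNNING'
       END AS status
FROM mytable
GROUP BY substr(system, 1, instr(system, '-') - 1)
ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

PRE3 comes out as RUNNING because it has one SUCCESS and one NULL: neither "only nulls" nor "only SUCCESS" holds, so the ELSE branch applies.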

with
inputs ( system, status ) as (
select 'PRE1-SYS1', 'SUCCESS' from dual union all
select 'PRE1-SYS2', 'SUCCESS' from dual union all
select 'PRE2-SYS1', 'RUNNING' from dual union all
select 'PRE2-SYS2', 'SUCCESS' from dual union all
select 'PRE3-SYS1', 'SUCCESS' from dual union all
select 'PRE3-SYS2', '' from dual
),
prep ( system, flag ) as (
select substr(system, 1, instr(system, '-') - 1),
case status when 'SUCCESS' then 0
when 'RUNNING' then 1 else 2 end
from inputs
)
select system,
case when min(flag) = 2 then 'PENDING'
when max(flag) = 0 then 'SUCCESS'
else 'RUNNING' end as status
from prep
group by system
order by system;
Output:
SYSTEM STATUS
--------- -------
PRE1 SUCCESS
PRE2 RUNNING
PRE3 RUNNING

I would approach it by ranking the responses. For example, assign a number to each status, from most desired to least desired:
SUCCESS = 1
RUNNING = 2
<blank> = 3
PENDING = 3
Then select a min based on that.
select xval = case status when 'Success' then 1
when 'Running' then 2
when 'Pending' then 3
else 3
end
Use a nested sub select on the value you get here so you only get one record per system.
select System, Min(Xval)
then display the 1 as Success
the 2 as Running
and 3 as Pending
It is hard to do in text format, easier to do with numbers. The numbers you assign to the string are important because they determine when you have multiple values which single one you return in your final query.

Another alternative. Practically, this ends up being very similar to @trincot's solution; I'm just separating the logic for getting the counts from the logic that interprets those counts. If your logic gets more complicated in the future, this may be a bit more flexible.
with
inputs ( system, status ) as (
select 'PRE1-SYS1', 'SUCCESS' from dual union all
select 'PRE1-SYS2', 'SUCCESS' from dual union all
select 'PRE2-SYS1', 'RUNNING' from dual union all
select 'PRE2-SYS2', 'SUCCESS' from dual union all
select 'PRE3-SYS1', 'SUCCESS' from dual union all
select 'PRE3-SYS2', '' from dual
),
/* The cnts CTE counts how many rows relate to a SYSTEM,
how many of those are SUCCESS, and how many are NULL.
*/
cnts( system, num_rows, num_success, num_null ) as (
select substr(system,1,instr(system, '-')-1) system,
count(*),
sum(case when status = 'SUCCESS' then 1 else 0 end),
sum(case when status is null then 1 else 0 end)
from inputs
group by substr(system,1,instr(system, '-')-1)
)
/* Using the counts from the CTE, we can implement whatever logic we
want
*/
select system,
(case when num_rows = num_success then 'SUCCESS'
when num_rows = num_null then 'PENDING'
else 'RUNNING'
end) status
from cnts

Related

Looking for best way to execute Yes/No Query check in select statement

I was wondering if anyone could recommend the best way to execute this. I will introduce you to what I'm working on.
I've written a select query with some sub-queries which gets order records. There are a number of business rules that these orders need to meet in order to come up on the report.
Additionally, I've added a nested case statement which helps me determine whether the business logic is met; it simply returns a Yes or a No. So far all looks great!
E.G.
Above is just a sample result for one order (29817). What I need to do next is only show Order_No when NOYESCHECK returns all YES's.
Nested Case statement:
(case when sm.supply_code='Project Inventory' and
(select po.order_no
from purchase_order_line_all po
where po.contract = sm.contract
and po.part_no = sm.part_no
and po.activity_seq = sm.activity_seq
and po.project_id = sm.project_id
and po.state in ('Closed','Arrived','Recieved') order by po.date_entered desc fetch first 1 row only) is not null then 'YES'
when sm.supply_code='Invent Order' and
( select sum(QTY_ONHAND - QTY_RESERVED)
from inventory_part_in_stock ipis
where ipis.contract = sm.contract
and ipis.part_no = sm.part_no
and ipis.QTY_ONHAND - ipis.QTY_RESERVED > '0'
and ipis.project_id is null
and ipis.AVAILABILITY_CONTROL_ID not in ('QUARANTINE','RD','TRANSIT','PRE SCRAP')
) is not null then 'YES'
else 'NO' end)NoYesCheck
What would be the best way to achieve this? I have tried using ALL operator but it didn't work quite as expected. What I tried with ALL operator:
and 'YES' = ALL (case when sm.supply_code='Project Inventory' and
(select po.order_no
from purchase_order_line_all po
where po.contract = sm.contract
and po.part_no = sm.part_no
and po.activity_seq = sm.activity_seq
and po.project_id = sm.project_id
and po.state in ('Closed','Arrived','Recieved') order by po.date_entered desc fetch first 1 row only) is not null then 'YES'
when sm.supply_code='Invent Order' and
( select sum(QTY_ONHAND - QTY_RESERVED)
from inventory_part_in_stock ipis
where ipis.contract = sm.contract
and ipis.part_no = sm.part_no
and ipis.QTY_ONHAND - ipis.QTY_RESERVED > '0'
and ipis.AVAILABILITY_CONTROL_ID not in ('QUARANTINE','RD','TRANSIT','PRE SCRAP')
and ipis.project_id is null
) is not null then 'YES'
else 'NO' end)
It seemed to return only lines with 'YES' in my check but the purpose here is:
If check is done per order and returns at least one 'No' then do not show the order. So in above image this order was never meant to show up as a result in my query but it did. So I'm a little stuck.
Any help would be appreciated. Let me know if I need to provide more info.
Thanks,
Kasia

You can use your NOYESCHECK column in a subselect within the where clause combined with a NOT IN check.
Pseudocode:
select
--main query columns
from data_source
where key_column not in (
select distinct
key_column
from (
select
key_column,
noyescheck_column
from data_source
where noyescheck_column = 'NO'
)
)
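The pseudocode above can be tried out with a small sketch using Python's built-in sqlite3 module. The table name order_lines and the sample rows are hypothetical; in the real query, the key column would be the order number and noyescheck would be the nested CASE expression.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_lines (order_no INTEGER, component_part INTEGER, noyescheck TEXT)"
)
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?)",
    [
        (29817, 100, "NO"),   # one NO disqualifies the whole order
        (29817, 101, "YES"),
        (30000, 200, "YES"),
        (30000, 201, "YES"),
        (40000, 300, "NO"),
    ],
)

# Keep only the orders that never appear in the set of orders with a NO.
query = """
SELECT order_no, component_part, noyescheck
FROM order_lines
WHERE order_no NOT IN (
    SELECT order_no FROM order_lines WHERE noyescheck = 'NO'
)
ORDER BY order_no, component_part
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

One thing to watch with NOT IN: if the subquery can return a NULL key, the comparison is never true and no rows come back, so a NOT EXISTS form may be safer when the key column is nullable.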

Would this help? See comments within code.
with
-- this is what your query currently returns
test (order_no, component_part, noyescheck) as
(select 29817, 100, 'NO' from dual union all
select 29817, 101, 'YES' from dual union all
--
select 30000, 200, 'YES' from dual union all
select 30000, 201, 'YES' from dual union all
--
select 40000, 300, 'NO' from dual
),
-- find ORDER_NOs whose NOYESCHECK = YES only
yess as
(select order_no
from test
group by order_no
having min(noyescheck) = max(noyescheck)
and min(noyescheck) = 'YES'
)
-- return only ORDER_NOs that satisfy condition
select t.*
from test t join yess y on y.order_no = t.order_no;
Output:
ORDER_NO COMPONENT_PART NOY
---------- -------------- ---
30000 200 YES
30000 201 YES

sql case statement IN with group by

I have a 2 column table with the columns : "user_name" and "characteristic". Each user_name may appear multiple times with a different characteristic.
The values in characteristic are:
Online
Instore
Account
Email
I want to write a sql statement that goes like this - but obviously this isn't working:
SELECT user_name,
case
when characteristic in ("online","instore") then 1
else 0
END as purchase_yn,
case
when characteristic in ("online","instore") and
characteristic in ("email",'account') then 1
else 0
END as purchaser_with_account
FROM my_table
GROUP BY user_name;
Essentially the first is a flag where I check for the presence of either value for that user_name.
The Second field is that they meet this criteria AND that they meet the criteria for having either 'email' or 'account'

An example of the structure of your data would help to understand what you are trying to accomplish, but I think I get what you are trying to do.
You have to use an aggregate function in order to use a group by.
Something like SUM or AVG.
But first you need to build a pivot of your data, and then you can use that pivot to check your criteria:
This would create a pivot table that shows, for each record, which criteria are met:
SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table
Now what you probably want is a summed version of these entries grouped by user_name, and to use those sums to create the fields you wanted. A sum is needed rather than an average, because an average only reaches 1 when every row for the user matches. For that you need to use the statement created earlier as an inline table:
Select
user_name,
case when sum(online_yn + instore_yn) >= 1 then 1 else 0 end as purchase_yn,
case when sum(online_yn + instore_yn) >= 1 and sum(email_yn + account_yn) >= 1 then 1 else 0 end as purchaser_with_account
From
(SELECT
user_name,
case when characteristic = 'online' then 1 else 0 end as online_yn,
case when characteristic = 'instore' then 1 else 0 end as instore_yn,
case when characteristic = 'account' then 1 else 0 end as account_yn,
case when characteristic = 'email' then 1 else 0 end as email_yn
FROM my_table) flags
group by
user_name;
This should help.
It may not be efficient in terms of performance but you'll get what you want.

You just have to enclose the CASE expressions in COUNT aggregates:
SELECT user_name,
COUNT(case when characteristic in ('online','instore') then 1 END) as purchase_yn,
COUNT(case when characteristic in ('email','account') then 1 END) as user_with_account
FROM my_table
GROUP BY user_name
If purchase_yn > 0 then your first flag is set. If purchase_yn > 0 and user_with_account > 0 then your second flag is set as well.
Note: You have to remove ELSE 0 from the CASE expressions because COUNT takes into account all not null values.
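As a rough illustration of turning those counts into the two 0/1 flags, here is a sketch using Python's built-in sqlite3 module. The user names and rows are made up for the example, and the inner column names purchases and accounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_name TEXT, characteristic TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("ann", "online"), ("ann", "email"),    # purchase and account
        ("bob", "account"), ("bob", "email"),   # account only
        ("cat", "instore"),                     # purchase only
    ],
)

# The inner query counts matching rows per user (CASE without ELSE yields
# NULL, which COUNT skips); the outer query converts counts into flags.
query = """
SELECT user_name,
       CASE WHEN purchases > 0 THEN 1 ELSE 0 END AS purchase_yn,
       CASE WHEN purchases > 0 AND accounts > 0 THEN 1 ELSE 0 END
           AS purchaser_with_account
FROM (
    SELECT user_name,
           COUNT(CASE WHEN characteristic IN ('online', 'instore') THEN 1 END) AS purchases,
           COUNT(CASE WHEN characteristic IN ('email', 'account') THEN 1 END) AS accounts
    FROM my_table
    GROUP BY user_name
) AS counts
ORDER BY user_name
"""
flags = conn.execute(query).fetchall()
for row in flags:
    print(row)
```

Note that the comparison is case-sensitive here, so the stored values must match the literals in the IN lists exactly.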

You haven't mentioned a specific RDBMS, but if SUM(DISTINCT ...) is available the following is quite nice:
SELECT
user_name,
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
ELSE 0
END) AS purchase_yn,
CASE WHEN (
SUM(DISTINCT
CASE
WHEN characteristic in ('online','instore') THEN 1
WHEN characteristic in ('email','account') THEN 2
ELSE 0 END
)
) = 3 THEN 1 ELSE 0 END as purchaser_with_account
FROM
my_table
GROUP BY
user_name

If I understand correctly: if a user has 'online' or 'instore', you want 1 in the purchase_yn column for that user, and if the user also has 'email' or 'account', you want 1 in the purchaser_with_account column.
If this is correct, then one way is:
with your_table(user_name, characteristic) as(
select 1, 'online' union all
select 1, 'instore' union all
select 1, 'account' union all
select 1, 'email' union all
select 2, 'account' union all
select 2, 'email' union all
select 3, 'online'
)
-- below is actual query:
select your_table.user_name, coalesce(max(t1.purchase_yn), 0) as purchase_yn, coalesce(max(t2.purchaser_with_account), 0) as purchaser_with_account
from your_table
left join (SELECT user_name, 1 as purchase_yn from your_table where characteristic in('online','instore') ) t1
on your_table.user_name = t1.user_name
left join (SELECT user_name, 1 as purchaser_with_account from your_table where characteristic in('email', 'account') ) t2
on t1.user_name = t2.user_name
group by your_table.user_name

ORACLE: SELECT VALUE IF

I am trying to select different values depending on different conditions, but I don't know exactly how to achieve this in SQL/Oracle.
Here is an example:
SELECT VALUE (I dont exactly know what to write here)
FROM
(SELECT
(CASE
WHEN (Select 1 from DUAL) = 1 THEN 'TEST'
WHEN (Select 1 from DUAL) = 0 THEN 'TEST1'
WHEN (Select 1 from DUAL) = 0 THEN 'TEST2'
ELSE 'N/A'
END)
FROM DUAL);
I want to print different results according to the conditions...For instance, in the example above it should print "TEST"

You need to provide an alias to the CASE statement:
SELECT alias_for_your_case_value
FROM (
SELECT CASE (Select 1 from DUAL)
WHEN 1 THEN 'TEST'
WHEN 0 THEN 'TEST1'
WHEN 0 THEN 'TEST2'
ELSE 'N/A'
END AS alias_for_your_case_value
FROM DUAL
);

Pulling data while pivoting at the same time

ID | Type | Code
1 Purchase A1
1 Return B1
1 Exchange C1
2 Purchase D1
2 Return NULL
2 Exchange F1
3 Purchase G1
3 Return H1
3 Exchange I1
4 Purchase J1
4 Exchange K1
Above is sample data. What I want to return is:
ID | Type | Code
1 Purchase A1
1 Return B1
1 Exchange C1
3 Purchase G1
3 Return H1
3 Exchange I1
So if a field is null in code or the values of Purchase, Return and Exchange are not all present for that ID, ignore that ID completely. However there is one last step. I want this data to then be pivoted this way:
ID | Purchase | Return | Exchange
1 A1 B1 C1
3 G1 H1 I1
I asked this yesterday without the pivot portion which you can see here:
SQL query to return data only if ALL necessary columns are present and not NULL
However, I forgot to note the last part. I tried to play around with Excel but had no luck. I tried to make a temp table, but the data is too large for that, so I was wondering if this could all be done in one SQL statement?
I personally used this query with success:
select t.*
from t
where 3 = (select count(distinct t2.type)
from t t2
where t2.id = t.id and
t2.type in ('Purchase', 'Exchange', 'Return') and
t2.Code is not null
);
So how can we adjust that to include the pivot part. Is that possible?

Quite easily. Just use conditional aggregation:
select t.id,
max(case when type = 'Purchase' then code end) as Purchase,
max(case when type = 'Exchange' then code end) as Exchange,
max(case when type = 'Return' then code end) as Return
from t
where 3 = (select count(distinct t2.type)
from t t2
where t2.id = t.id and
t2.type in ('Purchase', 'Exchange', 'Return') and
t2.Code is not null
)
group by t.id;
This is actually simpler to express (in my opinion) using having without the subquery:
select t.id,
max(case when type = 'Purchase' then code end) as Purchase,
max(case when type = 'Exchange' then code end) as Exchange,
max(case when type = 'Return' then code end) as Return
from t
group by t.id
having max(case when type = 'Purchase' then code end) is not null and
max(case when type = 'Exchange' then code end) is not null and
max(case when type = 'Return' then code end) is not null;
Many databases would allow:
having Purchase is not null and Exchange is not null and Return is not null
But Oracle doesn't allow the use of column aliases in the having clause.

UPDATE - Based on discussion in the question comments, my previous query had a faulty assumption (which I carried over from what I thought I saw in the original query in the question); I've eliminated the bad assumption.
select id
, max(case when type='Purchase' then Code end) Purchase
, max(case when type='Return' then Code end) Return
, max(case when type='Exchange' then Code end) Exchange
from t
where code is not null
and type in ('Purchase', 'Return', 'Exchange')
group by id
having count(distinct type) = 3
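To check the filter-then-pivot logic against the sample data from the question, here is a small sketch using Python's built-in sqlite3 module instead of Oracle. The output column names purchase_code, return_code and exchange_code are my own choice for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type TEXT, code TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, "Purchase", "A1"), (1, "Return", "B1"), (1, "Exchange", "C1"),
        (2, "Purchase", "D1"), (2, "Return", None), (2, "Exchange", "F1"),
        (3, "Purchase", "G1"), (3, "Return", "H1"), (3, "Exchange", "I1"),
        (4, "Purchase", "J1"), (4, "Exchange", "K1"),
    ],
)

# Rows with a NULL code are dropped first, so an id only survives the
# HAVING clause if all three types are present with non-null codes.
query = """
SELECT id,
       max(CASE WHEN type = 'Purchase' THEN code END) AS purchase_code,
       max(CASE WHEN type = 'Return'   THEN code END) AS return_code,
       max(CASE WHEN type = 'Exchange' THEN code END) AS exchange_code
FROM t
WHERE code IS NOT NULL
  AND type IN ('Purchase', 'Return', 'Exchange')
GROUP BY id
HAVING count(DISTINCT type) = 3
ORDER BY id
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

ID 2 drops out because its Return row has a NULL code, and ID 4 drops out because it has no Return row at all.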

I will point out again (as I did in your other thread) that analytic functions will do the job much faster - they need the base table to be read just once, and there are no explicit or implicit joins.
with
test_data ( id, type, code ) as (
select 1, 'Purchase', 'A1' from dual union all
select 1, 'Return' , 'B1' from dual union all
select 1, 'Exchange', 'C1' from dual union all
select 2, 'Purchase', 'D1' from dual union all
select 2, 'Return' , null from dual union all
select 2, 'Exchange', 'F1' from dual union all
select 3, 'Purchase', 'G1' from dual union all
select 3, 'Return' , 'H1' from dual union all
select 3, 'Exchange', 'I1' from dual union all
select 4, 'Purchase', 'J1' from dual union all
select 4, 'Exchange', 'K1' from dual
)
-- end of test data; actual solution (SQL query) begins below this line
select id, purchase, return, exchange
from ( select id, type, code
from ( select id, type, code,
count( distinct case when type in ('Purchase', 'Return', 'Exchange')
then type end
) over (partition by id) as ct_type,
count( case when code is null then 1 end
) over (partition by id) as ct_code
from test_data
)
where ct_type = 3 and ct_code = 0
)
pivot ( min(code) for type in ('Purchase' as purchase, 'Return' as return,
'Exchange' as exchange)
)
;
Output:
ID PURCHASE RETURN EXCHANGE
--- -------- -------- --------
1 A1 B1 C1
3 G1 H1 I1
2 rows selected.

select single row based on count of rows present oracle

I've got the table below from a query. Now I want to fetch a single record based on the conditions explained below and assign it to two variables, v_dte_meeting and v_status_meeting, declared in my stored procedure:
Dte_Meeting| Status_Meeting
########################
15-Oct-14 | Due
30-Oct-14 | Due
15-Dec-14 | Init
30-Dec-14 | Init
30-Nov-15 | Approved
I want to assign value to these variables based on below conditions:
If one or more records are present with Status_Meeting 'Due', then assign v_dte_meeting the greatest date with 'Due' status and assign v_status_meeting the value 'Due'.
If the above condition fails, check whether one or more records are present with Status_Meeting 'Init'. If so, assign v_dte_meeting the greatest date with 'Init' status and assign v_status_meeting the value 'Init'.
If both conditions fail, assign NULL to both variables.
Please help me do this the best way in Oracle.

Hope this helps. I am only showing the SELECT statement (I didn't create variables, so I am not selecting INTO, but that wasn't your difficulty, you know how to do that.) I use subquery factoring (the WITH clause), available only in versions >= 11 I believe, otherwise you can rewrite to put the subqueries where they belong.
Note the use of rank(); in Gordon's solution, he will pick up max(dte) over ALL rows, not just those with status = 'Due', so it can't be as simple as what he wrote. EDIT: I also don't see where he selects NULL, NULL if neither 'Due' nor 'Init' are present. (Sorry for the abuse, this should be a comment to his answer, I lack privileges.)
WITH t (Date_Meeting, Status_Meeting) AS
(
SELECT TO_DATE('15-OCT-14', 'DD-MON-YY'), 'Due' FROM dual UNION ALL
SELECT TO_DATE('30-OCT-14', 'DD-MON-YY'), 'Due' FROM dual UNION ALL
SELECT TO_DATE('15-DEC-14', 'DD-MON-YY'), 'Init' FROM dual UNION ALL
SELECT TO_DATE('30-DEC-14', 'DD-MON-YY'), 'Init' FROM dual UNION ALL
SELECT TO_DATE('15-NOV-15', 'DD-MON-YY'), 'Approved' FROM dual
),
s (Date_Meeting, Status_Meeting) AS
(
SELECT Date_Meeting, Status_Meeting FROM t
WHERE Status_Meeting = 'Due' OR Status_Meeting = 'Init'
UNION ALL SELECT NULL, NULL FROM dual -- To ensure you have the nulls if needed
),
r (Date_Meeting, Status_Meeting, rk) AS
(
SELECT Date_Meeting, Status_Meeting,
RANK() OVER (ORDER BY DECODE(Status_Meeting, 'Due', 0, 'Init', 1, 2),
Date_Meeting DESC) -- make sure you understand this
FROM s
)
SELECT Date_Meeting, Status_Meeting FROM r WHERE rk = 1
/
Result:
DATE_MEET STATUS_M
--------- --------
30-OCT-14 Due
1 row selected.

One method uses aggregation:
select (case when sum(case when status_meeting = 'Due' then 1 else 0 end) > 0
then max(case when status_meeting = 'Due' then dte_meeting end)
when sum(case when status_meeting = 'Init' then 1 else 0 end) > 0
then max(case when status_meeting = 'Init' then dte_meeting end)
end),
(case when sum(case when status_meeting = 'Due' then 1 else 0 end) > 0
then 'Due'
when sum(case when status_meeting = 'Init' then 1 else 0 end) > 0
then 'Init'
end)
into v_dte_meeting, v_status_meeting
from t;
However, I think a simpler version just uses order by:
select max(dte_meeting), max(status_meeting)
into v_dte_meeting, v_status_meeting
from (select t.*
from t
where status_meeting in ('Due', 'Init')
order by (case when status_meeting = 'Due' then 1
when status_meeting = 'Init' then 2
end),
dte_meeting desc
) t
where rownum = 1;
The max() is just to ensure that exactly one row is returned.
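The order-by idea can be sketched with Python's built-in sqlite3 module. This is an approximation of the Oracle query, under a few assumptions: LIMIT 1 stands in for rownum = 1, dates are stored as ISO text so they sort correctly, and the sort includes the date in descending order so the latest date within the chosen status comes first. The helper name pick_meeting is hypothetical.

```python
import sqlite3

QUERY = """
SELECT max(dte_meeting), max(status_meeting)
FROM (
    SELECT dte_meeting, status_meeting
    FROM t
    WHERE status_meeting IN ('Due', 'Init')
    ORDER BY CASE status_meeting WHEN 'Due' THEN 1 ELSE 2 END,
             dte_meeting DESC
    LIMIT 1
)
"""

def pick_meeting(meetings):
    """Return (date, status) for the preferred meeting, or (None, None)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (dte_meeting TEXT, status_meeting TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", meetings)
    # The outer max() turns "no rows" into a single row of NULLs.
    return conn.execute(QUERY).fetchone()

meetings = [
    ("2014-10-15", "Due"),
    ("2014-10-30", "Due"),
    ("2014-12-15", "Init"),
    ("2014-12-30", "Init"),
    ("2015-11-30", "Approved"),
]
print(pick_meeting(meetings))                                # ('2014-10-30', 'Due')
print(pick_meeting([m for m in meetings if m[1] != "Due"]))  # ('2014-12-30', 'Init')
print(pick_meeting([("2015-11-30", "Approved")]))            # (None, None)
```

In a stored procedure the two result columns would go into v_dte_meeting and v_status_meeting with SELECT ... INTO, as in the answer above.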
