Select rows when a value appears multiple times - sql

I have a table like this one:
+------+------+
| ID | Cust |
+------+------+
| 1 | A |
| 1 | A |
| 1 | B |
| 1 | B |
| 2 | A |
| 2 | A |
| 2 | A |
| 2 | B |
| 3 | A |
| 3 | B |
| 3 | B |
+------+------+
I would like to get the IDs that have A at least twice and B at least twice. So in my example, the query should return only ID 1.
Thanks!

In MySQL:
SELECT id
FROM test
GROUP BY id
HAVING GROUP_CONCAT(cust ORDER BY cust SEPARATOR '') LIKE '%AA%BB%'
In Oracle:
WITH cte AS ( SELECT id, LISTAGG(cust, '') WITHIN GROUP (ORDER BY cust) custs
FROM test
GROUP BY id )
SELECT id
FROM cte
WHERE custs LIKE '%AA%BB%'

I would just use two levels of aggregation:
select id
from (select id, cust, count(*) as cnt
from t
where cust in ('A', 'B')
group by id, cust
) ic
group by id
having count(*) = 2 and -- both customers are in the result set
min(cnt) >= 2 -- and there are at least two instances
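To check the two-level aggregation against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the table name t are assumptions made for illustration; the SQL is the query from this answer.

```python
import sqlite3

# Sample data from the question; the table name "t" is an assumption.
rows = [(1, "A"), (1, "A"), (1, "B"), (1, "B"),
        (2, "A"), (2, "A"), (2, "A"), (2, "B"),
        (3, "A"), (3, "B"), (3, "B")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, cust TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

query = """
SELECT id
FROM (SELECT id, cust, COUNT(*) AS cnt
      FROM t
      WHERE cust IN ('A', 'B')
      GROUP BY id, cust) ic
GROUP BY id
HAVING COUNT(*) = 2      -- both customers are present for this id
   AND MIN(cnt) >= 2     -- and each one appears at least twice
"""
two_level_ids = [r[0] for r in conn.execute(query)]
print(two_level_ids)
```

Only ID 1 has both customers with a per-customer count of at least 2, so it is the only ID that survives the HAVING clause.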

This is one option; lines #1 - 13 represent sample data. The query you might be interested in begins at line #14.
SQL> with test (id, cust) as
2 (select 1, 'a' from dual union all
3 select 1, 'a' from dual union all
4 select 1, 'b' from dual union all
5 select 1, 'b' from dual union all
6 select 2, 'a' from dual union all
7 select 2, 'a' from dual union all
8 select 2, 'a' from dual union all
9 select 2, 'b' from dual union all
10 select 3, 'a' from dual union all
11 select 3, 'b' from dual union all
12 select 3, 'b' from dual
13 )
14 select id
15 from (select
16 id,
17 sum(case when cust = 'a' then 1 else 0 end) suma,
18 sum(case when cust = 'b' then 1 else 0 end) sumb
19 from test
20 group by id
21 )
22 where suma >= 2
23 and sumb >= 2;
ID
----------
1
SQL>
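Since the question asks for "at least two" of each customer, the comparison operator matters. The sketch below, using Python's sqlite3 module, runs the conditional-aggregation idea with both `= 2` and `>= 2`. It folds the inline view into a HAVING clause, and it adds a hypothetical ID 4 (three A, two B) that is not in the original data, purely to show the difference.

```python
import sqlite3

# Sample data from the question plus a hypothetical ID 4 (three A, two B).
rows = [(1, "A"), (1, "A"), (1, "B"), (1, "B"),
        (2, "A"), (2, "A"), (2, "A"), (2, "B"),
        (3, "A"), (3, "B"), (3, "B"),
        (4, "A"), (4, "A"), (4, "A"), (4, "B"), (4, "B")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, cust TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)

# {op} is filled in with the comparison operator under test.
template = """
SELECT id
FROM test
GROUP BY id
HAVING SUM(CASE WHEN cust = 'A' THEN 1 ELSE 0 END) {op} 2
   AND SUM(CASE WHEN cust = 'B' THEN 1 ELSE 0 END) {op} 2
ORDER BY id
"""
exactly_two = [r[0] for r in conn.execute(template.format(op="="))]
at_least_two = [r[0] for r in conn.execute(template.format(op=">="))]
print(exactly_two, at_least_two)
```

With `= 2` the hypothetical ID 4 is dropped even though it satisfies "at least two A and two B"; with `>= 2` it is returned alongside ID 1.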

You can use GROUP BY and HAVING for the relevant Cust values ('A', 'B'), then check the result twice. I used WITH so that the aggregation is written only once.
with at_least_2 as
(
select Id, Cust, count(*) c
from tab
where Cust in ('A', 'B')
group by Id, Cust
having count(*) >= 2
)
select distinct Id
from tab
where exists ( select 1 from at_least_2 where at_least_2.Id = tab.Id and at_least_2.Cust = 'A')
and exists ( select 1 from at_least_2 where at_least_2.Id = tab.Id and at_least_2.Cust = 'B')

What you want is a perfect candidate for match_recognize. Here you go:
select id from t
match_recognize
(
partition by id -- match within each id, never across ids
order by cust
pattern (A{2,} B{2,})
define A as cust = 'A',
B as cust = 'B'
)
Regards,
Ranagal


Need to get the mismatched cells in a specific partition

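| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_2 | PASS |
| 1 | tc_3 | FAIL |
| 1 | tc_4 | PASS |
| 1 | tc_5 | FAIL |
| 2 | tc_1 | FAIL |
| 2 | tc_2 | PASS |
| 2 | tc_3 | FAIL |
| 2 | tc_4 | FAIL |
| 2 | tc_5 | FAIL |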
I'm trying to find all records that have conflicting "Result" on the same "TC_No" and among different "ID" values, filtered by ID IN (1,2).
Here's the expected output:
| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |
and my attempted query:
SELECT * From
(SELECT * from Excel As T1
UNION
SELECT * from Excel As T2)
As c
where ID in(1,2) order By TC_NO
Find the distinct count of Result for each TC_No, then select the records whose count is greater than 1.
Query:
select * from your_tbl_name a
where exists(
select 1 from (
select tc_no, count(distinct result) as cnt
from your_tbl_name
where result in ('PASS','FAIL')
group by tc_no
) b
where a.tc_no = b.tc_no
and b.cnt > 1
)
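As a rough check of the "count distinct results per TC_No" idea, here is a small sketch using Python's sqlite3 module with the sample data from the question. The table name excel comes from the asker's attempt; the `ID IN (1, 2)` filter from the question is added to both the outer query and the inner aggregation, which is an assumption about how the filter should apply.

```python
import sqlite3

# Sample data from the question.
rows = [(1, "tc_1", "PASS"), (1, "tc_2", "PASS"), (1, "tc_3", "FAIL"),
        (1, "tc_4", "PASS"), (1, "tc_5", "FAIL"),
        (2, "tc_1", "FAIL"), (2, "tc_2", "PASS"), (2, "tc_3", "FAIL"),
        (2, "tc_4", "FAIL"), (2, "tc_5", "FAIL")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE excel (id INTEGER, tc_no TEXT, result TEXT)")
conn.executemany("INSERT INTO excel VALUES (?, ?, ?)", rows)

query = """
SELECT a.id, a.tc_no, a.result
FROM excel a
WHERE a.id IN (1, 2)
  AND EXISTS (SELECT 1
              FROM (SELECT tc_no, COUNT(DISTINCT result) AS cnt
                    FROM excel
                    WHERE id IN (1, 2)
                    GROUP BY tc_no) b
              WHERE b.tc_no = a.tc_no
                AND b.cnt > 1)   -- more than one distinct result means a conflict
ORDER BY a.id, a.tc_no
"""
conflicts = conn.execute(query).fetchall()
for row in conflicts:
    print(row)
```

Only tc_1 and tc_4 have both PASS and FAIL across the two IDs, so those are the four rows returned.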
You can simply INNER JOIN the table onto itself and add a predicate in the WHERE clause to return only mismatched results.
SQL:
SELECT
a.ID,
a.TC_No,
a.Result
FROM
Excel a
INNER JOIN Excel b ON a.TC_No = b.TC_No
WHERE
a.Result <> b.Result;
Result:
| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |
Check when the maximum result is different than the minimum one, in your filtered data, using window functions.
WITH cte AS (
SELECT tab.*,
MAX(Result_) OVER(PARTITION BY TC_No) AS max_result,
MIN(Result_) OVER(PARTITION BY TC_No) AS min_result
FROM tab
WHERE ID IN (1,2)
)
SELECT Id, Tc_No, Result_
FROM cte
WHERE min_result < max_result
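The MIN/MAX window approach can be tried locally with Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or newer (window functions were added in that release). The table and column names follow this answer; the extra row for ID 3 is hypothetical and only there to show that the `ID IN (1,2)` filter is applied before the comparison.

```python
import sqlite3

# Sample data from the question, plus one hypothetical row for ID 3.
rows = [(1, "tc_1", "PASS"), (1, "tc_2", "PASS"), (1, "tc_3", "FAIL"),
        (1, "tc_4", "PASS"), (1, "tc_5", "FAIL"),
        (2, "tc_1", "FAIL"), (2, "tc_2", "PASS"), (2, "tc_3", "FAIL"),
        (2, "tc_4", "FAIL"), (2, "tc_5", "FAIL"),
        (3, "tc_2", "FAIL")]  # would make tc_2 "conflicting" if not filtered out

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, tc_no TEXT, result_ TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

query = """
WITH cte AS (
    SELECT id, tc_no, result_,
           MAX(result_) OVER (PARTITION BY tc_no) AS max_result,
           MIN(result_) OVER (PARTITION BY tc_no) AS min_result
    FROM tab
    WHERE id IN (1, 2)          -- filter first, then compare within each tc_no
)
SELECT id, tc_no, result_
FROM cte
WHERE min_result < max_result   -- differing min and max means mixed results
ORDER BY id, tc_no
"""
mismatched = conn.execute(query).fetchall()
for row in mismatched:
    print(row)
```

Because the filter sits inside the CTE, the hypothetical ID 3 row does not cause tc_2 to be reported.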
You could use the analytic function COUNT() OVER() to find the rows with differing content, then filter the result in the WHERE clause:
SELECT ID, TC_NO, RESULT
FROM ( Select ID, TC_NO, RESULT,
CASE WHEN Count(DISTINCT RESULT) OVER(Partition By TC_NO) = 2 THEN 'Y' END "IS_DIFF"
From tbl
)
WHERE IS_DIFF = 'Y'
ORDER BY ID
With your sample data:
WITH
tbl (ID, TC_NO, RESULT) AS
(
Select 1, 'tc_1', 'PASS' From Dual Union All
Select 1, 'tc_2', 'PASS' From Dual Union All
Select 1, 'tc_3', 'FAIL' From Dual Union All
Select 1, 'tc_4', 'PASS' From Dual Union All
Select 1, 'tc_5', 'FAIL' From Dual Union All
Select 2, 'tc_1', 'FAIL' From Dual Union All
Select 2, 'tc_2', 'PASS' From Dual Union All
Select 2, 'tc_3', 'FAIL' From Dual Union All
Select 2, 'tc_4', 'FAIL' From Dual Union All
Select 2, 'tc_5', 'FAIL' From Dual
)
... the result is:
| ID | TC_NO | RESULT |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |

SQL: summing a column starting from row immediately after two consecutive 'trigger' values in another column

How to sum all values after two consecutive YES's in the CONDITION_SATISFIED column?
ID | CONDITION_SATISFIED | VALUE
--------------------------------
1 | NO | 100
2 | NO | 300
3 | NO | 500
4 | YES | 100
5 | YES | 300
6 | NO | 500 <-
7 | NO | 100 <-
8 | YES | 300 <-
9 | NO | 500 <-
--------------------------------
SUM | 1400
Note: further occurrences of YES/NO are ignored once the summation is started.
I've gotten to the point where I am able to generate two extra columns for the CONDITION_SATISFIED column like this:
ID | CONDITION_SATISFIED | VALUE RANK | REPEAT_COUNT
-------------------------------- -------------------
1 | NO | 100 1 | 3
2 | NO | 300 1 | 3
3 | NO | 500 1 | 3
4 | YES | 100 2 | 2
5 | YES | 300 2 | 2
6 | NO | 500 3 | 2 <- start from here
7 | NO | 100 3 | 2
8 | YES | 300 4 | 1
9 | NO | 500 5 | 1
-------------------------------- -------------------
But I'm not able to figure out how to get the first instance of REPEAT_COUNT >= 2 AND CONDITION_SATISFIED = 'YES', and then start the summation immediately after the 2nd YES (as indicated).
You can find the rows that immediately follow two consecutive YES values using lag():
select t.*
from (select t.*,
lag(condition_satisfied) over (order by id) as prev_cs,
lag(condition_satisfied, 2) over (order by id) as prev2_cs
from t
) t
where prev2_cs = 'YES' and prev_cs = 'YES';
Then you can use the first such row as the starting point and sum every value from there onward:
select sum(t.value) as sum_value
from t join
(select min(t.id) as id
from (select t.*,
lag(condition_satisfied) over (order by id) as prev_cs,
lag(condition_satisfied, 2) over (order by id) as prev2_cs
from t
) t
where prev2_cs = 'YES' and prev_cs = 'YES'
) yy
on t.id >= yy.id;
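To try the lag() approach without an Oracle instance, the sketch below runs it through Python's sqlite3 module on the question's data. It assumes SQLite 3.25 or newer for window function support, and it selects the sum directly since that is what the question asks for.

```python
import sqlite3

# Sample data from the question: (id, condition_satisfied, value).
rows = [(1, "NO", 100), (2, "NO", 300), (3, "NO", 500),
        (4, "YES", 100), (5, "YES", 300), (6, "NO", 500),
        (7, "NO", 100), (8, "YES", 300), (9, "NO", 500)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, condition_satisfied TEXT, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)

query = """
SELECT SUM(t.value)
FROM t
JOIN (SELECT MIN(id) AS id
      FROM (SELECT id,
                   LAG(condition_satisfied)    OVER (ORDER BY id) AS prev_cs,
                   LAG(condition_satisfied, 2) OVER (ORDER BY id) AS prev2_cs
            FROM t) x
      WHERE prev2_cs = 'YES' AND prev_cs = 'YES') yy  -- first row after two YESes
  ON t.id >= yy.id
"""
sum_after_trigger = conn.execute(query).fetchone()[0]
print(sum_after_trigger)
```

The first row whose two predecessors are both YES is ID 6, so the sum covers IDs 6 through 9.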
Oracle 12c: pattern matching
with t1 (id, condition_satisfied, value) as (
select 1, 'NO' , 100 from dual union all
select 2, 'NO' , 300 from dual union all
select 3, 'NO' , 500 from dual union all
select 4, 'YES', 100 from dual union all
select 5, 'YES', 300 from dual union all
select 6, 'NO' , 500 from dual union all
select 7, 'NO' , 100 from dual union all
select 8, 'YES', 300 from dual union all
select 9, 'NO' , 500 from dual)
select sum(v_value) as sum_value
from t1
match_recognize(
order by id
measures s.value as v_value
all rows per match
pattern (yes{2} s+)
define
yes as condition_satisfied = 'YES'
);
SUM_VALUE
----------
1400
If your version is lower than 12c, analytic functions let you avoid the self-join and the duplicates it can generate:
with s (id, condition_satisfied, value) as (
select 1, 'NO' , 100 from dual union all
select 2, 'NO' , 300 from dual union all
select 3, 'NO' , 500 from dual union all
select 4, 'YES', 100 from dual union all
select 5, 'YES', 300 from dual union all
select 6, 'NO' , 500 from dual union all
select 7, 'YES' , 100 from dual union all
select 8, 'YES', 300 from dual union all
select 9, 'NO' , 500 from dual)
select sum(value) sum_value
from
(select s.*, min(first_id) over () min_id
from
(select s.*,
case when condition_satisfied = 'YES' and condition_satisfied = lag(condition_satisfied) over (order by id) then id end first_id
from s
) s
)
where id > min_id;
SUM_VALUE
----------
1400

SQL recursive id nodes

I have a table structure like so
Id Desc Node
---------------------
1 A
2 Aa 1
3 Ab 1
4 B
5 Bb 4
6 Bb1 5
These Desc values are presented in a listview to the user. If the user chooses Bb, I want ID 5 and also ID 4, because that is the root node of that entry. Similarly, if the user chooses Bb1, I need IDs 6, 5 and 4.
I am only able to query one level up, but there could be n levels, so my query at the moment looks like this
SELECT Id
FROM tbl
WHERE Desc = 'Bb1'
OR Id = (SELECT Node FROM tbl WHERE Desc = 'Bb1');
You can do this with a recursive CTE, as shown below.
Schema:
CREATE TABLE #TAB (ID INT, DESCS VARCHAR(10), NODE INT)
INSERT INTO #TAB
SELECT 1 AS ID, 'A' DESCS, NULL NODE
UNION ALL
SELECT 2 , 'AA', 1
UNION ALL
SELECT 3, 'AB', 1
UNION ALL
SELECT 4, 'B', NULL
UNION ALL
SELECT 5, 'BB', 4
UNION ALL
SELECT 6, 'BB1', 5
Now use a recursive CTE that picks up the NODE value of each row and joins it back to #TAB.
;WITH CTE AS(
SELECT ID, DESCS, NODE FROM #TAB WHERE ID=6
UNION ALL
SELECT T.ID, T.DESCS, T.NODE FROM #TAB T
INNER JOIN CTE C ON T.ID = C.NODE
)
SELECT * FROM CTE
When you pass 6 to the first query in CTE, the result will be
+----+-------+------+
| ID | DESCS | NODE |
+----+-------+------+
| 6 | BB1 | 5 |
| 5 | BB | 4 |
| 4 | B | NULL |
+----+-------+------+
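The same parent-walking recursion can be sketched with Python's sqlite3 module, starting from the Desc value the user picked rather than from a hard-coded ID. The table name tab, the column name descs (Desc is a reserved word) and the helper ancestor_ids are hypothetical names chosen for this illustration; note that SQLite spells the clause `WITH RECURSIVE`.

```python
import sqlite3

# Sample data from the question: (id, descs, node), where node is the parent id.
rows = [(1, "A", None), (2, "Aa", 1), (3, "Ab", 1),
        (4, "B", None), (5, "Bb", 4), (6, "Bb1", 5)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (id INTEGER, descs TEXT, node INTEGER)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)

def ancestor_ids(conn, descs):
    """Return the id of the chosen entry plus the ids of all its ancestors."""
    query = """
    WITH RECURSIVE cte (id, descs, node) AS (
        SELECT id, descs, node FROM tab WHERE descs = ?   -- anchor: chosen entry
        UNION ALL
        SELECT t.id, t.descs, t.node                      -- step: move to the parent
        FROM tab t
        JOIN cte c ON t.id = c.node
    )
    SELECT id FROM cte
    """
    return [r[0] for r in conn.execute(query, (descs,))]

print(ancestor_ids(conn, "Bb1"))
print(ancestor_ids(conn, "Bb"))
```

The recursion stops on its own at a root row, because a NULL node never matches any id in the join.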

How can I use the self join to find equal values within a group?

I am looking to find instances of GROUPID where all price values are 0. The following is a simplified version of what I am looking at
--------------------------------
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 002 | 4 | 4 |
| 002 | 4 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
| 004 | 4 | 4 |
| 004 | 7 | 4 |
--------------------------------
I am attempting to use the following query to find all GROUPID where both PRICE values for that particular group = 0.
SELECT * FROM MYTABLE WHERE GROUPID IN
(SELECT TB1.GROUPID FROM MYTABLE TB1 JOIN MYTABLE TB2 ON TB1.GROUPID = TB2.GROUPID
AND TB1.PRICE = 0 AND TB2.PRICE = 0)
and CUSTOMER = 4
ORDER BY GROUPID;
This query returns:
| Groupid | Price | Customer|
--------------------------------
| 001 | 9 | 4 |
| 001 | 0 | 4 |
| 003 | 0 | 4 |
| 003 | 0 | 4 |
--------------------------------
In my case, I only need it to return GROUPID 003.
I'd also like to ask for assistance in modifying the query to return all non 0 equal PRICE values within a groupid. It doesn't have to be in the same query as above. For example the return would be:
| Groupid | Price | Customer|
--------------------------------
| 002 | 4 | 4 |
| 002 | 4 | 4 |
Any help would be appreciated. Thank you for your time.
If all the prices are zero, then look at the minimum and maximum price for the groupid:
select groupid
from mytable t
group by groupid
having min(price) = 0 and max(price) = 0;
I should point out that no self-join is required for this.
Try this:
SELECT * FROM MYTABLE as m1 where Price = 0
and not exists (
select 1
from MYTABLE as m2
where m2.Groupid = m1.Groupid
and m2.Price <> 0
)
You can list the rows whose groupid has no rows with a non-zero price:
select groupid, price, customer from mytable t
where not exists (
select 1 from mytable m where m.groupid = t.groupid
and m.price != 0
);
I would do it by counting the number of distinct prices over each group (and I'm assuming that each group will have the same customer) and then filtering on the price(s) you're interested in.
For example, for prices that are 0 for all rows in each groupid+customer:
WITH mytable AS (SELECT '001' groupid, 9 price, 4 customer FROM dual UNION ALL
SELECT '001' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '002' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '003' groupid, 0 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 4 price, 4 customer FROM dual UNION ALL
SELECT '004' groupid, 7 price, 4 customer FROM dual)
SELECT groupid,
customer,
price
FROM (SELECT groupid,
customer,
price,
COUNT(DISTINCT price) OVER (PARTITION BY groupid, customer) num_distinct_prices
FROM mytable)
WHERE num_distinct_prices = 1
AND price = 0;
GROUPID CUSTOMER PRICE
------- ---------- ----------
003 4 0
003 4 0
Just change the and price = 0 to and price != 0 if you want the groups which have the same non-zero price for all rows. Or simply remove that predicate altogether.
EDIT
Gordon's is the best solution for the first part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING (MAX(price) = 0 and MIN(price) = 0)
And for the second part:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING MIN(price) <> 0 AND (MAX(price) = MIN(price))
My original one:
SELECT groupid
FROM mytable t
GROUP BY GroupID
HAVING SUM(Price) =0
This assumes there are no negative prices.
To the second part of your question:
SELECT groupid
FROM mytable t
WHERE Price > 0
GROUP BY GroupID, Price
HAVING COUNT(price) > 1
In your sample data, you have only one customer. I assume if you had more than one customer, you would still want to return the rows where the groupid has the same price, across all rows and all customers. If so, you could use the query below. It is almost the same as Boneist's - I just use min(price) and max(price) instead of count(distinct), and I don't include customer in partition by.
If the price may be NULL, it will be ignored in the computation of max price and min price; if all the NON-NULL prices are equal for a groupid, all the rows for that group will be returned. If price can be NULL and this is NOT the desired behavior, that can be changed easily - but the OP will have to clarify.
The query below retrieves all the cases when there is a single price for the groupid. To retrieve only the groups where the price is 0 (an additional condition), add and price = 0 to the WHERE clause of the outer query. I added more test data to illustrate some of the cases the query covers.
with
test_data ( groupid, price, customer ) as (
select '001', 9, 4 from dual union all
select '001', 0, 4 from dual union all
select '002', 4, 4 from dual union all
select '002', 4, 4 from dual union all
select '003', 0, 4 from dual union all
select '003', 0, 4 from dual union all
select '004', 4, 4 from dual union all
select '004', 7, 4 from dual union all
select '002', 4, 8 from dual union all
select '005', 2, 8 from dual union all
select '005', null, 8 from dual
),
prep ( groupid, price, customer, min_price, max_price) as (
select groupid, price, customer,
min(price) over (partition by groupid),
max(price) over (partition by groupid)
from test_data
)
select groupid, price, customer
from prep
where min_price = max_price
;
GROUPID     PRICE  CUSTOMER
------- --------- ---------
002             4         8
002             4         4
002             4         4
003             0         4
003             0         4
005                       8
005             2         8
This may be what you want:
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price <> 0)
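Changing the last line gives the groups that contain no zero price at all. Note that this does not check that the prices within a group are equal, so group 004 is returned as well as 002: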
SELECT * FROM MYTABLE
WHERE GROUPID NOT IN (
SELECT GROUPID
FROM MYTABLE
WHERE Price = 0)
I would do it very similarly to what Gordon posted
SELECT groupId
FROM MyTable
GROUP BY groupId
HAVING SUM(price) = 0
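The `SUM(price) = 0` test only works when prices cannot be negative. The sketch below, using Python's sqlite3 module, compares it with the MIN/MAX test on the question's data plus a hypothetical group 006 holding -5 and 5 (not part of the original data), and also runs the MIN/MAX variant for equal non-zero prices.

```python
import sqlite3

# Sample data from the question plus a hypothetical group 006 with -5 and 5.
rows = [("001", 9, 4), ("001", 0, 4), ("002", 4, 4), ("002", 4, 4),
        ("003", 0, 4), ("003", 0, 4), ("004", 4, 4), ("004", 7, 4),
        ("006", -5, 4), ("006", 5, 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (groupid TEXT, price INTEGER, customer INTEGER)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)

def groups_having(condition):
    """Return the groupids whose aggregate satisfies the given HAVING condition."""
    sql = ("SELECT groupid FROM mytable GROUP BY groupid "
           "HAVING " + condition + " ORDER BY groupid")
    return [r[0] for r in conn.execute(sql)]

by_sum = groups_having("SUM(price) = 0")
by_min_max = groups_having("MIN(price) = 0 AND MAX(price) = 0")
equal_non_zero = groups_having("MIN(price) <> 0 AND MIN(price) = MAX(price)")
print(by_sum, by_min_max, equal_non_zero)
```

The SUM version also reports the hypothetical group 006, whose prices cancel out, while the MIN/MAX version reports only group 003.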

Oracle - count records if number of token less than 2

I have records like...
ID | KEY
-------|---------
1 | 123_456_abc
1 | 123_xyz
1 | 456_abc
2 | 123_abc
2 | 122_73_zcc
3 | 123_wer
4 | 345_23_fhd
4 | 3453_abc
5 | ad1fr2h3_abcasd
5 | ers2g45bb_abc2rtd
5 | asf23g_abc1_sf45
I want count(ID) where count(tokenize(numeric(KEY),'_')) < 2,
so count(ID) will be 6.
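You can try something like this:
SELECT COUNT(ID) FROM xyz WHERE key NOT LIKE '%\_%\_%' ESCAPE '\';
This keeps only the keys that have fewer than two underscores. The underscore has to be escaped because an unescaped _ in LIKE is a wildcard that matches any single character.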
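Because `_` is a single-character wildcard in LIKE, the pattern behaves very differently with and without an ESCAPE clause. Here is a minimal sketch using Python's sqlite3 module and the keys from the question; the column is named key_value here (an assumption, to stay clear of keyword clashes), and SQLite's LIKE is used as a stand-in for Oracle's.

```python
import sqlite3

# Keys from the question.
keys = ["123_456_abc", "123_xyz", "456_abc", "123_abc", "122_73_zcc",
        "123_wer", "345_23_fhd", "3453_abc", "ad1fr2h3_abcasd",
        "ers2g45bb_abc2rtd", "asf23g_abc1_sf45"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xyz (key_value TEXT)")
conn.executemany("INSERT INTO xyz VALUES (?)", [(k,) for k in keys])

# Unescaped: each _ matches any character, so '%_%_%' matches every string
# of two or more characters and NOT LIKE rejects all of them.
unescaped_sql = "SELECT COUNT(*) FROM xyz WHERE key_value NOT LIKE '%_%_%'"

# Escaped: \_ matches a literal underscore only.
escaped_sql = r"SELECT COUNT(*) FROM xyz WHERE key_value NOT LIKE '%\_%\_%' ESCAPE '\'"

unescaped_count = conn.execute(unescaped_sql).fetchone()[0]
escaped_count = conn.execute(escaped_sql).fetchone()[0]
print(unescaped_count, escaped_count)
```

Without the escape the query counts nothing, since every key is at least two characters long; with it, the seven keys containing fewer than two underscores are counted.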
Try this:
select Count(1) from
(with abc(id,key) as (select '1','123_456_abc' from dual
Union all
select '1','123_xyz' from dual
UNion all
select '1','456_abc' from dual
Union all
select '2','123_abc' from dual
UNion all
select '2','123_73_zcc' from dual
Union all
select '3','123_wer' from dual
UNion all
select '1','345_23_fhd' from dual
UNion all
select '1','345_abc' from dual
)
select key, length(regexp_replace(key,'[^_]*','')) cntr
from abc )
where cntr = 1
Eliminate all records that have more than one underscore,
then eliminate the ones that do not start with a number,
then sum it up:
select sum(cnt) from (
select key, cnt, id from (
select key, length(regexp_replace(key,'[^_]*','')) cnt, id from table_name
) where cnt < 2
) where regexp_like(key,'^[0-9]')
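Whether the "starts with a number" pattern is anchored changes the result. As a cross-check outside the database, this Python sketch applies the same two filters to the keys from the question, once with an unanchored pattern and once with one anchored at the start. It uses Python's re module rather than Oracle's regexp_like, and the helper count_keys is a hypothetical name.

```python
import re

# Keys from the question.
keys = ["123_456_abc", "123_xyz", "456_abc", "123_abc", "122_73_zcc",
        "123_wer", "345_23_fhd", "3453_abc", "ad1fr2h3_abcasd",
        "ers2g45bb_abc2rtd", "asf23g_abc1_sf45"]

def count_keys(pattern):
    """Count keys with fewer than two underscores that match the pattern."""
    return sum(1 for k in keys
               if k.count("_") < 2 and re.search(pattern, k))

# Unanchored: matches any key that merely contains a digit from 1 to 9.
unanchored = count_keys(r"[1-9]+(.)*")

# Anchored with ^: matches only keys whose first character is a digit.
anchored = count_keys(r"^[0-9]")

print(unanchored, anchored)
```

The unanchored pattern also accepts ad1fr2h3_abcasd and ers2g45bb_abc2rtd, which contain digits but do not start with one.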