Count distinct values separated by a comma - sql

I have this table called test
id | my_list
-: | :------
1 | aa//11, aa//34, ab//65
2 | bb//43, bb//43, be//54
3 |
4 | cc//76
I want to count the distinct values in my_list, where each item in the list is separated by a comma. In this case:
id=1 will have 3 distinct values
id=2 will have 2 distinct values, as bb//43 shows up twice
id=3 will have 0 distinct values, as it has an empty list
id=4 will have 1 since there is only 1 item in the list
I want to do this in pure SQL and not using a custom made procedure. I tried with the statement below but it is showing 1.
SELECT id, COUNT(DISTINCT my_list) as my_count
FROM test;
Expected result:
id | my_count
-: | -------:
1 | 3
2 | 2
3 | 0
4 | 1

You need to turn your list into a table so that you can count the distinct values inside it, for example with json_table.
with a(id, my_list) as (
  select 1, 'aa//11, aa//34, ab//65' from dual union all
  select 2, 'bb//43, bb//43, be//54' from dual union all
  select 3, null from dual union all
  select 4, 'cc//76' from dual
)
select
  id
  , (
    select count(distinct val)
    from json_table(
      /* Replace comma with quotes and comma: ','
         and wrap with array brackets
      */
      '[''' || regexp_replace(my_list, '\s*,\s*', ''',''') || ''']'
      , '$[*]'
      columns ( val varchar(20) path '$')
    )
  ) as cnt
from a
ID | CNT
-: | --:
1 | 3
2 | 2
3 | 0
4 | 1
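As a quick cross-check of the expected counts, here is a minimal Python sketch of the same idea: split each list on commas, trim whitespace, and count the distinct items. The helper name `count_distinct_items` is hypothetical and only models the logic; it is not part of the SQL answer above.

```python
def count_distinct_items(my_list):
    """Count distinct comma-separated items; None or an empty string counts as 0."""
    if not my_list:
        return 0
    # Split on commas and trim whitespace, similar in spirit to the '\s*,\s*' regex.
    items = {item.strip() for item in my_list.split(",")}
    items.discard("")  # ignore empty fragments such as a trailing comma
    return len(items)


# Sample rows from the question; id 3 has an empty (NULL) list.
rows = [
    (1, "aa//11, aa//34, ab//65"),
    (2, "bb//43, bb//43, be//54"),
    (3, None),
    (4, "cc//76"),
]

counts = [(row_id, count_distinct_items(my_list)) for row_id, my_list in rows]
print(counts)  # [(1, 3), (2, 2), (3, 0), (4, 1)]
```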

Related

Need to get the mismatched cells in specific partition

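| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_2 | PASS |
| 1 | tc_3 | FAIL |
| 1 | tc_4 | PASS |
| 1 | tc_5 | FAIL |
| 2 | tc_1 | FAIL |
| 2 | tc_2 | PASS |
| 2 | tc_3 | FAIL |
| 2 | tc_4 | FAIL |
| 2 | tc_5 | FAIL |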
I'm trying to find all records that have conflicting "Result" on the same "TC_No" and among different "ID" values, filtered by ID IN (1,2).
Here's the expected output:
| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |
and my attempted query:
SELECT * From
(SELECT * from Excel As T1
UNION
SELECT * from Excel As T2)
As c
where ID in(1,2) order By TC_NO
Find the distinct count of Result for each TC_No, then select the records whose count is greater than 1.
Query:
select * from your_tbl_name a
where exists(
  select 1 from (
    select tc_no, count(distinct result) as cnt
    from your_tbl_name
    where result in ('PASS','FAIL')
    group by tc_no
  ) b
  where a.tc_no = b.tc_no
  and b.cnt > 1
)
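The distinct-count idea can be tried end to end with a small harness. The sketch below uses Python's built-in sqlite3 module, which is an assumption made only so the example runs anywhere; the table name `excel` follows the question. It also applies the `ID IN (1, 2)` filter from the question in both the outer query and the subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE excel (id INTEGER, tc_no TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO excel VALUES (?, ?, ?)",
    [
        (1, "tc_1", "PASS"), (1, "tc_2", "PASS"), (1, "tc_3", "FAIL"),
        (1, "tc_4", "PASS"), (1, "tc_5", "FAIL"),
        (2, "tc_1", "FAIL"), (2, "tc_2", "PASS"), (2, "tc_3", "FAIL"),
        (2, "tc_4", "FAIL"), (2, "tc_5", "FAIL"),
    ],
)

# Keep only rows whose tc_no has more than one distinct result among ids 1 and 2.
mismatch_sql = """
    SELECT id, tc_no, result
    FROM excel
    WHERE id IN (1, 2)
      AND tc_no IN (
          SELECT tc_no
          FROM excel
          WHERE id IN (1, 2)
          GROUP BY tc_no
          HAVING COUNT(DISTINCT result) > 1
      )
    ORDER BY id, tc_no
"""
mismatches = conn.execute(mismatch_sql).fetchall()
for row in mismatches:
    print(row)
# (1, 'tc_1', 'PASS')
# (1, 'tc_4', 'PASS')
# (2, 'tc_1', 'FAIL')
# (2, 'tc_4', 'FAIL')
```

Repeating the ID filter inside the subquery matters if the table holds other IDs: otherwise a conflict with, say, ID 3 would mark a test case as mismatched even when IDs 1 and 2 agree.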
You can simply INNER JOIN the table onto itself and add a predicate in the WHERE clause to return only mismatched results.
SQL:
SELECT
a.ID,
a.TC_No,
a.Result
FROM
Excel a
INNER JOIN Excel b ON a.TC_No = b.TC_No
WHERE
a.Result <> b.Result;
Result:
| ID | TC_No | Result |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |
Check when the maximum result is different than the minimum one, in your filtered data, using window functions.
WITH cte AS (
SELECT tab.*,
MAX(Result_) OVER(PARTITION BY TC_No) AS max_result,
MIN(Result_) OVER(PARTITION BY TC_No) AS min_result
FROM tab
WHERE ID IN (1,2)
)
SELECT Id, Tc_No, Result_
FROM cte
WHERE min_result < max_result
You could use analytic function COUNT() OVER() to find the rows with different content and then just filter the result in Where clause:
SELECT ID, TC_NO, RESULT
FROM ( Select ID, TC_NO, RESULT,
CASE WHEN Count(DISTINCT RESULT) OVER(Partition By TC_NO) = 2 THEN 'Y' END "IS_DIFF"
From tbl
)
WHERE IS_DIFF = 'Y'
ORDER BY ID
With your sample data:
WITH
tbl (ID, TC_NO, RESULT) AS
(
Select 1, 'tc_1', 'PASS' From Dual Union All
Select 1, 'tc_2', 'PASS' From Dual Union All
Select 1, 'tc_3', 'FAIL' From Dual Union All
Select 1, 'tc_4', 'PASS' From Dual Union All
Select 1, 'tc_5', 'FAIL' From Dual Union All
Select 2, 'tc_1', 'FAIL' From Dual Union All
Select 2, 'tc_2', 'PASS' From Dual Union All
Select 2, 'tc_3', 'FAIL' From Dual Union All
Select 2, 'tc_4', 'FAIL' From Dual Union All
Select 2, 'tc_5', 'FAIL' From Dual
)
... the result is:
| ID | TC_NO | RESULT |
|----|-------|--------|
| 1 | tc_1 | PASS |
| 1 | tc_4 | PASS |
| 2 | tc_1 | FAIL |
| 2 | tc_4 | FAIL |

How can I count the total number of rows that do not contain some value in an array field? It should include null values as well

| names |
| -----------------------------------|
| null |
| null |
| [{name:'test'},{name:'test1'}] |
| [{name:'test'},{name:'test1'}] |
| [{name:'test1'},{name:'test2'}] |
I want to count the number of rows that do not have the value 'test' in the name key.
Here the answer should be 3 (rows 1, 2 and 5), because none of these rows contains the value 'test'.
Use the approach below:
select count(*)
from your_table
where 0 = ifnull(array_length(regexp_extract_all(names, r"\b(name:'test')")), 0)
You can test it with the data below (which resembles what you presented in your question):
with your_table as (
select null names union all
select null union all
select "[{name:'test'},{name:'test1'}]" union all
select "[{name:'test'},{name:'test1'}]" union all
select "[{name:'test1'},{name:'test2'}]"
)
The approach below will also work:
with your_table as (
select null names union all
select null union all
select [('name','test'),('name','test1')] union all
select [('name','test'),('name','test1')] union all
select [('name','test1'),('name','test2')]
)
SELECT COUNT(*) as count FROM your_table
WHERE NOT EXISTS (
SELECT 1
FROM UNNEST(your_table.names) AS names
WHERE names IN (('name','test'))
)
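The requirement that NULL rows be counted is the subtle part: a negated string match evaluates to NULL (not true) for NULL input, so those rows drop out unless they are handled explicitly. Here is a minimal sketch of that effect, assuming the `names` column is stored as plain text and using Python's sqlite3 module purely as a stand-in engine (the answers above target BigQuery).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (names TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?)",
    [
        (None,),
        (None,),
        ("[{name:'test'},{name:'test1'}]",),
        ("[{name:'test'},{name:'test1'}]",),
        ("[{name:'test1'},{name:'test2'}]",),
    ],
)

# The closing quote in the pattern keeps name:'test1' from matching.
pattern = "%name:'test'%"

# NOT LIKE alone: NULL rows evaluate to NULL and are filtered out.
without_null_check = conn.execute(
    "SELECT COUNT(*) FROM your_table WHERE names NOT LIKE ?", (pattern,)
).fetchone()[0]

# Explicit IS NULL branch: NULL rows are counted as "does not contain".
with_null_check = conn.execute(
    "SELECT COUNT(*) FROM your_table WHERE names IS NULL OR names NOT LIKE ?",
    (pattern,),
).fetchone()[0]

print(without_null_check, with_null_check)  # 1 3
```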

unpivot with a flag in Oracle

I have this table structure:
ID,SUPPLIER_GROUP1,SUPPLIER1,SUPPLIER_GROUP2,SUPPLIER2
I want to unpivot it and get:
ID,SUPPLIER_GROUP,SUPPLIER,TYPE
so that each supplier_group and supplier value goes to the appropriate column, and the TYPE column is either 1 or 2 to show whether the SUPPLIER_GROUP and SUPPLIER values came from the supplier1 or the supplier2 columns.
Use UNPIVOT with multiple column groups:
SELECT *
FROM table_name
UNPIVOT (
(supplier_group, supplier) FOR type IN (
(supplier_group1, supplier1) AS 1,
(supplier_group2, supplier2) AS 2
)
);
Which, for the sample data:
CREATE TABLE table_name (ID,SUPPLIER_GROUP1,SUPPLIER1,SUPPLIER_GROUP2,SUPPLIER2) AS
SELECT 1, 'sg1.1', 's1.1', 'sg2.1', 's2.1' FROM DUAL UNION ALL
SELECT 2, 'sg1.2', 's1.2', 'sg2.2', 's2.2' FROM DUAL UNION ALL
SELECT 3, 'sg1.3', 's1.3', 'sg2.3', 's2.3' FROM DUAL
Outputs:
ID | TYPE | SUPPLIER_GROUP | SUPPLIER
-: | ---: | :------------- | :-------
1 | 1 | sg1.1 | s1.1
1 | 2 | sg2.1 | s2.1
2 | 1 | sg1.2 | s1.2
2 | 2 | sg2.2 | s2.2
3 | 1 | sg1.3 | s1.3
3 | 2 | sg2.3 | s2.3
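For engines without UNPIVOT, the same reshaping can be written as one SELECT per column group combined with UNION ALL. This is a different technique from the Oracle answer above, sketched here with Python's sqlite3 module as an assumed test engine. One difference to keep in mind: Oracle's UNPIVOT excludes rows whose unpivoted values are all NULL by default, whereas this UNION ALL form keeps them unless you add a WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE table_name (
        id INTEGER,
        supplier_group1 TEXT, supplier1 TEXT,
        supplier_group2 TEXT, supplier2 TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?, ?, ?)",
    [
        (1, "sg1.1", "s1.1", "sg2.1", "s2.1"),
        (2, "sg1.2", "s1.2", "sg2.2", "s2.2"),
        (3, "sg1.3", "s1.3", "sg2.3", "s2.3"),
    ],
)

# One SELECT per (supplier_group, supplier) pair; the literal is the TYPE flag.
unpivot_sql = """
    SELECT id, 1 AS type, supplier_group1 AS supplier_group, supplier1 AS supplier
    FROM table_name
    UNION ALL
    SELECT id, 2, supplier_group2, supplier2
    FROM table_name
    ORDER BY id, type
"""
unpivoted = conn.execute(unpivot_sql).fetchall()
for row in unpivoted:
    print(row)
# (1, 1, 'sg1.1', 's1.1')
# (1, 2, 'sg2.1', 's2.1')
# (2, 1, 'sg1.2', 's1.2')
# (2, 2, 'sg2.2', 's2.2')
# (3, 1, 'sg1.3', 's1.3')
# (3, 2, 'sg2.3', 's2.3')
```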

case statement after where clause to omit the row of data if satisfied

Hi I have a table as below and I'm trying to extract the data from them if and only if the below condition is satisfied.
ID Rank
45689 1
54789 2
98765 1
96541 2
98523 3
92147 4
96741 2
99999 10
If an ID starting with 9 has the same Rank as an ID starting with 4 or 5, then omit those rows. If an ID starts with 9 and its Rank matches no ID starting with 4 or 5, then show it in the result.
So My Output should look like
ID Rank
98523 3
92147 4
99999 10
How can I use case statement in where clause to filter the data?
If I understand correctly, you want to select only those ID that begin with a 9, and have a rank that is not also the rank of (another) ID that begins with 4 or 5. Is that correct?
The query below is for the case ID is of string data type (although it will work OK, probably, if ID is numeric data type - through implicit conversion).
select *
from your_table
where id like '9%'
and rank not in (
select rank
from your_table
where substr(id, 1, 1) in ('4', '5')
)
;
One option would be to use the SUM() analytic function over a conditional CASE expression, such as
WITH t2 AS
(
SELECT SUM(CASE WHEN SUBSTR(id,1,1) IN ('5','9') OR
SUBSTR(id,1,1) IN ('4','9') THEN 1 END ) OVER
(PARTITION BY Rank) AS count, t.*
FROM t -- your original table
)
SELECT id, rank
FROM t2
WHERE count = 1
You can use an analytic function to only query the table once:
SELECT id,
rank
FROM (
SELECT t.*,
COUNT( CASE WHEN id LIKE '4%' OR id LIKE '5%' THEN 1 END )
OVER ( PARTITION BY Rank )
AS num_match
FROM table_name t
WHERE id LIKE '4%'
OR id LIKE '5%'
OR id LIKE '9%'
)
WHERE id LIKE '9%'
AND num_match = 0;
Which, for the sample data:
CREATE TABLE table_name ( ID, Rank ) AS
SELECT 45689, 1 FROM DUAL UNION ALL
SELECT 54789, 2 FROM DUAL UNION ALL
SELECT 98765, 1 FROM DUAL UNION ALL
SELECT 96541, 2 FROM DUAL UNION ALL
SELECT 98523, 3 FROM DUAL UNION ALL
SELECT 92147, 4 FROM DUAL UNION ALL
SELECT 96741, 2 FROM DUAL UNION ALL
SELECT 99999, 10 FROM DUAL;
Outputs:
ID | RANK
----: | ---:
98523 | 3
92147 | 4
99999 | 10
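The same filter can also be phrased with NOT EXISTS, which stays correct even if some Rank values are NULL (a NOT IN subquery that returns a NULL matches nothing). Below is a minimal sketch using Python's sqlite3 module as an assumed test engine; the column is named `rnk` here only to avoid any clash with the RANK function name, and the numeric ID is cast to text before matching on its first digit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER, rnk INTEGER)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [
        (45689, 1), (54789, 2), (98765, 1), (96541, 2),
        (98523, 3), (92147, 4), (96741, 2), (99999, 10),
    ],
)

# Keep IDs starting with 9 whose rank is not shared by any ID starting with 4 or 5.
filter_sql = """
    SELECT id, rnk
    FROM table_name AS candidate
    WHERE CAST(candidate.id AS TEXT) LIKE '9%'
      AND NOT EXISTS (
          SELECT 1
          FROM table_name AS other
          WHERE other.rnk = candidate.rnk
            AND (CAST(other.id AS TEXT) LIKE '4%'
                 OR CAST(other.id AS TEXT) LIKE '5%')
      )
    ORDER BY rnk
"""
kept = conn.execute(filter_sql).fetchall()
print(kept)  # [(98523, 3), (92147, 4), (99999, 10)]
```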

SQL theory: Filtering out duplicates in one column, picking lowest value in other column

I am trying to figure out the best way to remove rows from a result set where either the value in one column or the value in a different column has a duplicate in the result set.
Imagine the results of a query are as follows:
a_value | b_value
-----------------
1 | 1
2 | 1
2 | 2
3 | 1
4 | 3
5 | 2
6 | 4
6 | 5
What I want to do is:
Eliminate all rows that have duplicate values in a_value
Pick only 1 row for a given b_value
So I'd want the filtered results to end up like this after eliminating a_value duplicates:
a_value | b_value
-----------------
1 | 1
3 | 1
4 | 3
5 | 2
And then like this after picking only a single b_value:
a_value | b_value
-----------------
1 | 1
4 | 3
5 | 2
I'd appreciate suggestions on how to accomplish this task in an efficient way via SQL.
with
q_res ( a_value, b_value ) as (
select 1, 1 from dual union all
select 2, 1 from dual union all
select 2, 2 from dual union all
select 3, 1 from dual union all
select 4, 3 from dual union all
select 5, 2 from dual union all
select 6, 4 from dual union all
select 6, 5 from dual
)
-- end test data; solution begins below
select min(a_value) as a_value, b_value
from (
select a_value, min(b_value) as b_value
from q_res
group by a_value
having count(*) = 1
)
group by b_value
order by a_value -- ORDER BY is optional
;
A_VALUE B_VALUE
------- -------
1 1
4 3
5 2
1) The inner query finds every a_value that occurs exactly once, which excludes all duplicated a_value entries, and exposes the result as t2. Joining t2 back to the input table (t1) returns the full rows without any a_value duplicates, as per requirement #1.
SELECT t1.*
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value;
2) Once the filtered data is obtained, I assign a row number to each row within its b_value group in the filtered dataset from step 1, and select only the rows with rn = 1.
SELECT X.a_value,
X.b_value
FROM
(
SELECT t1.*,
ROW_NUMBER() OVER ( PARTITION BY t1.b_value ORDER BY t1.a_value,t1.b_value ) AS rn
FROM Table t1,
(
SELECT a_value
FROM Table
GROUP BY a_value
HAVING COUNT(*) = 1
) t2
WHERE t1.a_value = t2.a_value
) X
WHERE X.rn = 1;
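The two steps can be checked against the sample data from the question. The sketch below runs them with Python's sqlite3 module, an assumption made so the example is self-contained, and uses the table name `q_res` from the first answer. Step 2 uses MIN with GROUP BY rather than ROW_NUMBER; for picking the lowest a_value per b_value the two give the same rows here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE q_res (a_value INTEGER, b_value INTEGER)")
conn.executemany(
    "INSERT INTO q_res VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (4, 3), (5, 2), (6, 4), (6, 5)],
)

# Step 1: keep only rows whose a_value occurs exactly once.
step1_sql = """
    SELECT t1.a_value, t1.b_value
    FROM q_res AS t1
    JOIN (
        SELECT a_value
        FROM q_res
        GROUP BY a_value
        HAVING COUNT(*) = 1
    ) AS t2 ON t1.a_value = t2.a_value
    ORDER BY t1.a_value
"""
step1 = conn.execute(step1_sql).fetchall()
print(step1)  # [(1, 1), (3, 1), (4, 3), (5, 2)]

# Step 2: from the filtered rows, keep the lowest a_value for each b_value.
step2_sql = """
    SELECT MIN(filtered.a_value) AS a_value, filtered.b_value
    FROM (
        SELECT t1.a_value, t1.b_value
        FROM q_res AS t1
        JOIN (
            SELECT a_value
            FROM q_res
            GROUP BY a_value
            HAVING COUNT(*) = 1
        ) AS t2 ON t1.a_value = t2.a_value
    ) AS filtered
    GROUP BY filtered.b_value
    ORDER BY a_value
"""
step2 = conn.execute(step2_sql).fetchall()
print(step2)  # [(1, 1), (4, 3), (5, 2)]
```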