SQL query to find duplicates with mismatching another column

Table looks like this:
col1 | col2
A | B
A | B
D | C
D | C
E | F
E | G
What I need is to extract the col1 value E, i.e. the one whose rows have mismatching col2 values.
I have already tried a couple of variants of SELECT DISTINCT and SELECT ..., COUNT(*) with GROUP BY, but can't figure it out.
P.S. The DBMS is Oracle.

Something like this? Grouping by col1 and keeping only the groups that have more than one distinct col2 value returns E.
SQL> with test (col1, col2) as
       -- sample data from the question
       (select 'A', 'B' from dual union all
        select 'A', 'B' from dual union all
        select 'D', 'C' from dual union all
        select 'D', 'C' from dual union all
        select 'E', 'F' from dual union all
        select 'E', 'G' from dual
       )
     select col1
     from test
     group by col1
     having count(distinct col2) > 1;
C
-
E
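To try the same HAVING COUNT(DISTINCT ...) idea without an Oracle instance, here is a minimal sketch that uses Python's built-in sqlite3 module. The in-memory SQLite database is only a stand-in for Oracle (an assumption for illustration); the table name test mirrors the answer above.

```python
import sqlite3

# Sample rows from the question; SQLite stands in for Oracle here.
rows = [("A", "B"), ("A", "B"), ("D", "C"), ("D", "C"), ("E", "F"), ("E", "G")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)", rows)

# Keep only col1 values that are paired with more than one distinct col2.
query = """
    SELECT col1
    FROM test
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) > 1
    ORDER BY col1
"""
mismatched = [row[0] for row in conn.execute(query)]
print(mismatched)
```

A and D are duplicated too, but each maps to a single col2 value, so COUNT(DISTINCT col2) is 1 for them and the HAVING clause filters them out.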

Related

Filter Alphabets from Oracle table

I have input like this:
Table 1 | Table 2
A | 4
B | 5
C | 6
1 | X
2 | Y
3 | Z
And the output must be:
Output
A
B
C
X
Y
Z

One method uses a union followed by a filter:
SELECT val
FROM
(
SELECT col1 AS val FROM yourTable
UNION ALL
SELECT col2 FROM yourTable
) t
WHERE REGEXP_LIKE(val, '^[A-Z]+$')
ORDER BY val;

If the data really looks the way you posted it, a simple option is to use the GREATEST function.
Sample data:
SQL> with test (col1, col2) as
       (select 'A', '4' from dual union all
        select 'B', '5' from dual union all
        select 'C', '6' from dual union all
        select '1', 'X' from dual union all
        select '2', 'Y' from dual union all
        select '3', 'Z' from dual
       )
Query:
     select greatest(col1, col2) result
     from test;
RESULT
----------
A
B
C
X
Y
Z
6 rows selected.
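The GREATEST approach relies on an ordering assumption: in an ASCII-based character set, digits compare lower than uppercase letters, so the letter always wins when each row holds exactly one letter and one digit. The following Python sketch (illustrative only, not Oracle code) shows that assumption next to the explicit pattern filter, which does not depend on it.

```python
import re

# (col1, col2) pairs from the question.
pairs = [("A", "4"), ("B", "5"), ("C", "6"), ("1", "X"), ("2", "Y"), ("3", "Z")]

# Code-point comparison: '0'..'9' (48-57) sort below 'A'..'Z' (65-90),
# so max() picks the letter in every pair.
by_greatest = [max(col1, col2) for col1, col2 in pairs]

# Explicit filter: flatten both columns and keep purely alphabetic values.
by_regex = sorted(
    value for pair in pairs for value in pair if re.fullmatch(r"[A-Z]+", value)
)

print(by_greatest)
print(by_regex)
```

If a row could contain two letters, two digits, or lowercase text, the comparison shortcut would no longer be equivalent to the filter, so the regular-expression version is the safer choice for less regular data.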

Is there a Dataprep rolling list equivalent function in BigQuery?

I'm looking for functionality similar to this in BigQuery: https://cloud.google.com/dataprep/docs/html/ROLLINGLIST-Function_118228853
Does anyone know of a suitable function?

Below is an example for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'a' col1, 1 col2 UNION ALL
SELECT 'b', 2 UNION ALL
SELECT 'c', 3 UNION ALL
SELECT 'd', 4 UNION ALL
SELECT 'e', 5 UNION ALL
SELECT 'f', 6 UNION ALL
SELECT 'g', 7 UNION ALL
SELECT 'h', 8
)
SELECT *,
STRING_AGG(col1)
OVER(ORDER BY col2 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) rolling_list
FROM `project.dataset.table`
with output:
Row col1 col2 rolling_list
1 a 1 a
2 b 2 a,b
3 c 3 a,b,c
4 d 4 b,c,d
5 e 5 c,d,e
6 f 6 d,e,f
7 g 7 e,f,g
8 h 8 f,g,h
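The window frame ROWS BETWEEN 2 PRECEDING AND CURRENT ROW keeps at most three rows: the current one and the two before it. A minimal Python sketch of the same sliding window, using a bounded deque (the function name rolling_list and its parameters are made up for illustration):

```python
from collections import deque

def rolling_list(values, window=3, sep=","):
    """Join each value with up to window-1 preceding values, in input order."""
    recent = deque(maxlen=window)  # oldest entry is dropped automatically
    result = []
    for value in values:
        recent.append(value)
        result.append(sep.join(recent))
    return result

# col1 values already ordered by col2, as in the query above.
result = rolling_list(["a", "b", "c", "d", "e", "f", "g", "h"])
print(result)
```

The first two entries are shorter than the window because there are not yet two preceding rows, which matches rows 1 and 2 of the query output.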

SQL How to align ranges of data points in rows?

Suppose we have the following data set:
with
data_table(title, x) as (
select 'a', 1 from dual union all
select 'a', 3 from dual union all
select 'a', 4 from dual union all
select 'a', 5 from dual union all
select 'b', 1 from dual union all
select 'b', 2 from dual union all
select 'b', 3 from dual union all
select 'b', 6 from dual
)
select * from data_table;
TITLE | X
-----------
a 1
a 3
a 4
a 5
b 1
b 2
b 3
b 6
We see that the points for a and b are different.
How can I align the values in the X column so that both groups have the same points, filling the gaps with NULL?
The expected result is:
TITLE | X
-----------
a 1
a NULL
a 3
a 4
a 5
a NULL
b 1
b 2
b 3
b NULL
b NULL
b 6
The straightforward solution I came up with is:
with
data_table(title, x) as (
select 'a', 1 from dual union all
select 'a', 3 from dual union all
select 'a', 4 from dual union all
select 'a', 5 from dual union all
select 'b', 1 from dual union all
select 'b', 2 from dual union all
select 'b', 3 from dual union all
select 'b', 6 from dual
),
all_points(x) AS (
select distinct x from data_table
),
all_titles(title) AS (
select distinct title from data_table
),
aligned_data(title, x) as (
select t.title, p.x from all_points p cross join all_titles t
)
select ad.title, dt.x
from aligned_data ad
left join data_table dt on dt.title = ad.title and dt.x = ad.x
order by ad.title, ad.x;
As we can see, the cross join in the aligned_data definition is a bottleneck and can kill performance on sizable data sets.
I wonder whether this task can be solved more elegantly. Maybe there is a trick with window functions.
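For reference, the cross join plus left join logic can be restated outside SQL. This is a minimal Python sketch, assuming the data fits in memory; the variable names are illustrative. It also makes one thing visible: the expected result has one row per (title, point) combination by construction, so some form of product is inherent in the output itself.

```python
from itertools import product

# (title, x) rows from the question.
data = [("a", 1), ("a", 3), ("a", 4), ("a", 5),
        ("b", 1), ("b", 2), ("b", 3), ("b", 6)]

present = set(data)                          # plays the role of the left join lookup
titles = sorted({title for title, _ in data})
points = sorted({x for _, x in data})

# Every title paired with every point; None marks a gap (NULL in SQL).
aligned = [
    (title, x if (title, x) in present else None)
    for title, x in product(titles, points)
]

for row in aligned:
    print(row)
```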

T-SQL ORDER BY based on MIN of a group's column

Take the following data as an example:
id | value
----------
A | 3
A | 9
B | 7
B | 2
C | 4
C | 5
I want to list all the data ordered by the min value of each id group, so that the expected output is:
id | value
----------
B | 2
B | 7
A | 3
A | 9
C | 4
C | 5
i.e. the min of group A is 3, of group B is 2, and of group C is 4, so group B comes first, with its rows in ascending order, followed by group A and then group C.
I tried this, but that's not what I want:
SELECT * FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
GROUP BY id, value
ORDER BY MIN(value)
Please help! Thank you

SELECT * FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
ORDER BY MIN(value) OVER(PARTITION BY id), id, value
See the OVER Clause (Transact-SQL) documentation.
Add the OVER() clause to your query output and you can see what it does for you:
SELECT *,
MIN(value) OVER(PARTITION BY id) OrderedBy FROM (
SELECT 'A' AS id, '3' AS value
UNION SELECT 'A', '9' UNION SELECT 'B', '7' UNION SELECT 'B', '2'
UNION SELECT 'C', '4' UNION SELECT 'C', '5') data
ORDER BY MIN(value) OVER(PARTITION BY id), id, value
Result:
id value OrderedBy
---- ----- ---------
B 2 2
B 7 2
A 3 3
A 9 3
C 4 4
C 5 4
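The ordering produced by MIN(value) OVER(PARTITION BY id), id, value can be mirrored in a few lines of Python: compute each group's minimum once, then sort by (group minimum, id, value). This is only a sketch of the logic; it uses integer values, whereas the sample queries above use string literals, which compare as text and would misorder multi-digit values such as '10'.

```python
# (id, value) rows from the question, with numeric values.
data = [("A", 3), ("A", 9), ("B", 7), ("B", 2), ("C", 4), ("C", 5)]

# Minimum value per id, the equivalent of MIN(value) OVER(PARTITION BY id).
group_min = {}
for row_id, value in data:
    group_min[row_id] = min(value, group_min.get(row_id, value))

# Sort by the group's minimum first, then id as a tie-breaker, then value.
ordered = sorted(data, key=lambda row: (group_min[row[0]], row[0], row[1]))
print(ordered)
```

Including id in the sort key keeps rows of two groups together even when both groups share the same minimum.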

Oracle: normalized fields to CSV string

I have some one-many normalized data that looks like this.
a | x
a | y
a | z
b | i
b | j
b | k
What query will return the data such that the "many" side is represented as a CSV string?
a | x,y,z
b | i,j,k

Mark,
If you are on version 11gR2, and who isn't :-), then you can use listagg:
SQL> create table t (col1, col2)
     as
     select 'a', 'x' from dual union all
     select 'a', 'y' from dual union all
     select 'a', 'z' from dual union all
     select 'b', 'i' from dual union all
     select 'b', 'j' from dual union all
     select 'b', 'k' from dual
     /
Table created.
SQL> select col1
     , listagg(col2, ',') within group (order by col2) col2s
     from t
     group by col1
     /
COL1 COL2S
----- ----------
a x,y,z
b i,j,k
2 rows selected.
If your version is not 11gR2, but higher than 10gR1, then I recommend using the model clause for this, as written here: http://rwijk.blogspot.com/2008/05/string-aggregation-with-model-clause.html
If lower than 10, then you can see several techniques in rexem's link to the oracle-base page, or in the link to the OTN-thread in the blogpost mentioned above.
Regards,
Rob.
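As a language-neutral illustration of what listagg(col2, ',') within group (order by col2) computes, here is a minimal Python sketch: collect the "many" side per key, sort each group, and join it with commas. The variable names are illustrative.

```python
from collections import defaultdict

# (col1, col2) rows from the question.
rows = [("a", "x"), ("a", "y"), ("a", "z"), ("b", "i"), ("b", "j"), ("b", "k")]

# Collect the "many" side per key.
groups = defaultdict(list)
for col1, col2 in rows:
    groups[col1].append(col2)

# Sort within each group (the WITHIN GROUP ... ORDER BY part), then join.
csv_by_key = {key: ",".join(sorted(values)) for key, values in sorted(groups.items())}
print(csv_by_key)
```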
