Column_A | Column_B
1        | X
1        | Z
2        | X
2        | Y
3        | Y
4        | X
4        | Y
4        | Z
5        | Y
I want to get all distinct values of Column_A that have one row with Column_B equal to 'X' and another row with Column_B equal to 'Y'.
The result should look like this (with the rows above, the values that have both an 'X' row and a 'Y' row are 2 and 4):
Column_A
2
4
I tried it this way:
SELECT DISTINCT TABLE.COLUMN_A
FROM TABLE
INNER JOIN (
SELECT DISTINCT COLUMN_A
FROM TABLE
WHERE COLUMN_B = 'X') SUBTABLE
ON TABLE.COLUMN_A = SUBTABLE.COLUMN_A
WHERE TABLE.COLUMN_B = 'Y';
I think this solution works, but it isn't optimal.
Thanks, and have a nice day.
You can apply a simple aggregation by:
filtering only the Column_B values you're interested in
grouping by Column_A and counting the distinct values of Column_B
checking that the number of distinct values equals 2
SELECT Column_A
FROM tab
WHERE Column_B IN ('X', 'Y')
GROUP BY Column_A
HAVING COUNT(DISTINCT Column_B) = 2
Or you can use the INTERSECT operator between:
the records having Column_B = 'X'
the records having Column_B = 'Y'
SELECT DISTINCT Column_A FROM tab WHERE Column_B = 'X'
INTERSECT
SELECT DISTINCT Column_A FROM tab WHERE Column_B = 'Y'
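To check that the two queries above agree, here is a minimal sketch that runs both against the question's sample rows. It assumes Python's built-in sqlite3 module and an in-memory database, which is a stand-in for whatever engine the question actually uses; the helper name `run` is made up for this example. With the sample rows as listed, the values that have both an 'X' row and a 'Y' row are 2 and 4.

```python
import sqlite3

# Sample rows copied from the question; the table name "tab" follows the answer.
ROWS = [(1, "X"), (1, "Z"), (2, "X"), (2, "Y"), (3, "Y"),
        (4, "X"), (4, "Y"), (4, "Z"), (5, "Y")]

AGGREGATE_SQL = """
    SELECT Column_A
    FROM tab
    WHERE Column_B IN ('X', 'Y')
    GROUP BY Column_A
    HAVING COUNT(DISTINCT Column_B) = 2
    ORDER BY Column_A
"""

# INTERSECT already removes duplicates, so DISTINCT is optional here.
INTERSECT_SQL = """
    SELECT Column_A FROM tab WHERE Column_B = 'X'
    INTERSECT
    SELECT Column_A FROM tab WHERE Column_B = 'Y'
    ORDER BY Column_A
"""


def run(sql):
    """Load the sample rows into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab (Column_A INTEGER, Column_B TEXT)")
    conn.executemany("INSERT INTO tab VALUES (?, ?)", ROWS)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


print(run(AGGREGATE_SQL))   # [2, 4]
print(run(INTERSECT_SQL))   # [2, 4]
```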
I'm querying a database to get the distinct ids where all the rows with each id match the criteria. For example, I would like to query the table below to get the distinct ids where all values are true. In this case, I would only return a single row with the id of 1.
Column A | Column B
1        | true
1        | true
2        | false
2        | true
2        | true
3        | false
3        | false
3        | false
Expected result
ColumnA
1
Currently, I have a query like this:
select
columnA
from
table
group by
columnA
having
(count(columnB = false) = 0)
But I end up returning no data at all. Not an error, just nothing matching my query. This is an example with dummy data, but the actual DB is quite large so I would expect lots of data back.
Any help is appreciated!
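Consider using the LOGICAL_AND aggregate function, which returns true only when every value in the group is true. Your query returns nothing because COUNT(expr) counts every row where expr is not NULL, and columnB = false is non-NULL (either true or false) for every row with a non-NULL columnB, so the count is never 0.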
WITH sample_table AS (
SELECT 1 column_a, true column_b UNION ALL
SELECT 1 column_a, true column_b UNION ALL
SELECT 2 column_a, false column_b UNION ALL
SELECT 2 column_a, true column_b UNION ALL
SELECT 2 column_a, true column_b UNION ALL
SELECT 3 column_a, false column_b UNION ALL
SELECT 3 column_a, false column_b UNION ALL
SELECT 3 column_a, false column_b
)
SELECT column_a
FROM sample_table
GROUP BY 1
HAVING LOGICAL_AND(column_b) IS TRUE;
+----------+
| column_a |
+----------+
|        1 |
+----------+
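LOGICAL_AND is not available in every engine, so here is a sketch of the same check written with portable aggregates. It assumes Python's built-in sqlite3 module, where booleans are stored as 1/0; the helper name `ids_having` is made up for this example, and the two replacement conditions are swapped-in alternatives, not the answer's method.

```python
import sqlite3

# Sample rows from the question; booleans are stored as 1/0 because
# SQLite has no separate boolean type.
ROWS = [(1, 1), (1, 1), (2, 0), (2, 1), (2, 1), (3, 0), (3, 0), (3, 0)]


def ids_having(condition):
    """Return the column_a groups that satisfy the given HAVING condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sample_table (column_a INTEGER, column_b INTEGER)")
    conn.executemany("INSERT INTO sample_table VALUES (?, ?)", ROWS)
    sql = ("SELECT column_a FROM sample_table "
           "GROUP BY column_a HAVING " + condition + " ORDER BY column_a")
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


# The question's condition: COUNT counts non-NULL values, and the comparison
# is 0 or 1 (never NULL) here, so the count equals the group size.
print(ids_having("COUNT(column_b = 0) = 0"))                            # []
# Count only the false rows instead.
print(ids_having("SUM(CASE WHEN column_b = 0 THEN 1 ELSE 0 END) = 0"))  # [1]
# With 1/0 values, the minimum is 1 only when every row is true.
print(ids_having("MIN(column_b) = 1"))                                  # [1]
```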
This is a follow-up to the question "Capture changes in 2 datasets".
I need to capture the changes between 2 datasets based on one or more keys: one is a historical version and the other the current version of the same dataset (both share the same schema). These datasets can contain duplicate rows as well. In the example below, id is the key used for comparison:
-- Table t_curr
-------
id col
-------
1 A
1 B
2 C
3 F
-- Table t_hist
-------
id col
-------
1 B
2 C
2 D
4 G
-- Expected output t_change
----------------
id col change
----------------
1 A modified -- change status is 'modified' as first row for id=1 is different for both tables
1 B inserted
2 C same
2 D deleted
3 F inserted
4 G deleted
I'm looking for an efficient solution to get the desired output.
EDIT
Explanation: suppose the records are fetched from t_curr in the order shown and are ranked within each id:
1/A is the first and 1/B the second record in t_curr
1/B is the first record in t_hist
The first records of both datasets are compared, i.e. 1/A in t_curr is compared with 1/B in t_hist, hence 1/A is marked as modified in t_change
Since the second record, 1/B, is present only in t_curr, it is marked as inserted
I was able to do it using full outer join and row_number(). Query:
with t_hist as (
select 1 as id, 'B' as col union all
select 2 as id, 'C' as col union all
select 2 as id, 'D' as col union all
select 4 as id, 'G' as col
),
t_curr as (
select 1 as id1, 'A' as col1 union all
select 1 as id1, 'B' as col1 union all
select 2 as id1, 'C' as col1 union all
select 3 as id1, 'F' as col1
)
select
case when id1 is null then id else id1 end as id_,
case when col1 is null then col else col1 end as col_,
case
when id is null then 'inserted'
when id1 is null then 'deleted'
when col = col1 then 'same'
else 'modified'
end
as change
from
(
select t_curr.*, t_hist.*
from (select *, row_number() over (partition by id1 order by id1) r1 from t_curr) t_curr
full outer join (select *, row_number() over (partition by id) r from t_hist) t_hist
on id1 = id and r1 = r
)
order by id_
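The pairing logic behind the full outer join on (id, row number) can be sketched outside SQL as well. The following is a plain-Python illustration, not the query itself: it assumes the rows arrive in the order shown in the question, and the names `capture_changes` and `_MISSING` are made up for this example.

```python
from collections import defaultdict
from itertools import zip_longest

_MISSING = object()  # marks "no row at this position" on one side


def capture_changes(curr, hist):
    """Pair rows of the same id by position and label each pair."""
    def by_id(rows):
        groups = defaultdict(list)
        for key, col in rows:
            groups[key].append(col)
        return groups

    curr_groups, hist_groups = by_id(curr), by_id(hist)
    changes = []
    for key in sorted(set(curr_groups) | set(hist_groups)):
        # zip_longest plays the role of the full outer join on (id, row number).
        pairs = zip_longest(curr_groups.get(key, []), hist_groups.get(key, []),
                            fillvalue=_MISSING)
        for curr_col, hist_col in pairs:
            if hist_col is _MISSING:
                changes.append((key, curr_col, "inserted"))
            elif curr_col is _MISSING:
                changes.append((key, hist_col, "deleted"))
            elif curr_col == hist_col:
                changes.append((key, curr_col, "same"))
            else:
                changes.append((key, curr_col, "modified"))
    return changes


t_curr = [(1, "A"), (1, "B"), (2, "C"), (3, "F")]
t_hist = [(1, "B"), (2, "C"), (2, "D"), (4, "G")]
for change in capture_changes(t_curr, t_hist):
    print(change)
```

On the sample data this reproduces the expected t_change rows. As in the SQL version, the result depends on row order within each id, so a deterministic ordering column would be needed for repeatable output.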
I want to select rows where some precise column has different values while another precise column has the same value.
Example:
COLUMN_A  | COLUMN_B
__________|___________
          |
1         | 2002
1         | 2002
2         | 2001
2         | 2007
3         | 2010
3         | 2010
Now, suppose I want to know which rows have the same A but a different B. The query would return the rows
2 | 2001
2 | 2007
or just
2
as long as I know which one it is.
This is a case for Count(Distinct ColumnName), which ensures that only unique values are counted.
With Src As (
Select *
From (Values
(1, 2002),
(1, 2002),
(2, 2001),
(2, 2007),
(3, 2010),
(3, 2010)
) V (COLUMN_A, COLUMN_B)
)
Select *
From Src
Where COLUMN_A In (
Select COLUMN_A
From Src
Group By COLUMN_A
Having Count(Distinct COLUMN_B) > 1 --<- "More than one unique value" condition
)
COLUMN_A COLUMN_B
2 2001
2 2007
You can use:
SELECT COLUMN_A
FROM dbo.YourTable
GROUP BY COLUMN_A
HAVING MIN(COLUMN_B) <> MAX(COLUMN_B);
Another way can be using EXISTS:
SELECT *
FROM dbo.YourTable A
WHERE EXISTS(SELECT 1 FROM dbo.YourTable
WHERE COLUMN_A = A.COLUMN_A
AND COLUMN_B <> A.COLUMN_B);
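A quick way to see how the two queries above differ is to run them on the question's rows. This sketch assumes Python's built-in sqlite3 module; the table is called YourTable without the dbo schema because SQLite has no such schema, and the helper name `run` is made up for this example.

```python
import sqlite3

ROWS = [(1, 2002), (1, 2002), (2, 2001), (2, 2007), (3, 2010), (3, 2010)]

# One row per group whose smallest and largest COLUMN_B differ.
MIN_MAX_SQL = """
    SELECT COLUMN_A
    FROM YourTable
    GROUP BY COLUMN_A
    HAVING MIN(COLUMN_B) <> MAX(COLUMN_B)
"""

# Every row that has a sibling with the same COLUMN_A but another COLUMN_B.
EXISTS_SQL = """
    SELECT *
    FROM YourTable A
    WHERE EXISTS (SELECT 1 FROM YourTable B
                  WHERE B.COLUMN_A = A.COLUMN_A
                    AND B.COLUMN_B <> A.COLUMN_B)
    ORDER BY A.COLUMN_A, A.COLUMN_B
"""


def run(sql):
    """Load the sample rows into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE YourTable (COLUMN_A INTEGER, COLUMN_B INTEGER)")
    conn.executemany("INSERT INTO YourTable VALUES (?, ?)", ROWS)
    result = conn.execute(sql).fetchall()
    conn.close()
    return result


print(run(MIN_MAX_SQL))  # [(2,)]
print(run(EXISTS_SQL))   # [(2, 2001), (2, 2007)]
```

The grouped form returns only the key, while the EXISTS form returns the full rows, which matches the two output shapes the question says are acceptable.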
This one is without GROUPing:
SELECT x.column_a, x.column_b, y.column_b
FROM table_name x
JOIN table_name y
ON ( x.column_a = y.column_a AND x.column_b <> y.column_b )
You just join the table to itself, and provide the conditions you are looking for.
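One thing to be aware of with the self-join is that `<>` matches every differing pair in both directions. The sketch below, which assumes Python's built-in sqlite3 module and a made-up helper `pairs`, shows that on the question's rows and how using `<` instead keeps each pair once.

```python
import sqlite3

ROWS = [(1, 2002), (1, 2002), (2, 2001), (2, 2007), (3, 2010), (3, 2010)]

# {op} is filled in with the comparison operator to try.
PAIRS_SQL = """
    SELECT x.column_a, x.column_b, y.column_b
    FROM table_name x
    JOIN table_name y
      ON x.column_a = y.column_a AND x.column_b {op} y.column_b
    ORDER BY x.column_a, x.column_b
"""


def pairs(op):
    """Run the self-join with the given comparison between the two column_b values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (column_a INTEGER, column_b INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?, ?)", ROWS)
    result = conn.execute(PAIRS_SQL.format(op=op)).fetchall()
    conn.close()
    return result


print(pairs("<>"))  # [(2, 2001, 2007), (2, 2007, 2001)]
print(pairs("<"))   # [(2, 2001, 2007)]
```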
Consider this example table "Table1".
Col1 Col2
A 1
B 1
A 4
A 5
A 3
A 2
D 1
B 2
C 3
B 4
I am trying to fetch those values from Col1 that correspond to all of the given values (in this case, 1,2,3,4,5). Here the query should return 'A', as none of the others have all of the values 1,2,3,4,5 in Col2.
Note that the values in Col2 are decided by other parameters in the query and they will always return some numeric values. Out of those values the query needs to fetch values from Col1 corresponding to all in Col2. The values in Col2 could be 11,12,1,2,3,4 for instance (meaning not necessarily in sequence).
I have tried the following select query:
select distinct Col1 from Table1 where Col1 in (1,2,3,4,5);
select distinct Col1 from Table1 where Col1 exists (select distinct Col2 from Table1);
and different variations of them. The problem is that I need an 'and' condition on Col2, not an 'or':
return a value from Col1 where Col2 'contains' all values between 1 and 5.
I'd appreciate any suggestions.
You could use the analytic ROW_NUMBER() function.
SELECT col1
FROM
(SELECT col1,
col2,
row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
FROM your_table
WHERE col2 IN (1,2,3,4,5)
)
WHERE rn =5;
UPDATE As requested by OP, some explanation about how the query works.
The inner sub-query gives you the following resultset:
SQL> SELECT col1,
2 col2,
3 row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
4 FROM t
5 WHERE col2 IN (1,2,3,4,5);
C COL2 RN
- ---------- ----------
A 1 1
A 2 2
A 3 3
A 4 4
A 5 5
B 1 1
B 2 2
B 4 3
C 3 1
D 1 1
10 rows selected.
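The PARTITION BY clause groups the rows by col1, and ORDER BY sorts col2 within each group. The sub-query therefore numbers the rows of each col1 group in order. Because the WHERE clause keeps only the five wanted values, a group reaches row number 5 only if it has five matching rows. So, in the outer query, all you need to do is filter with WHERE rn = 5. Note that this assumes each (col1, col2) pair appears at most once; duplicate rows would also be numbered and could reach 5 without covering all five values.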
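The ROW_NUMBER() approach counts rows rather than distinct values, which matters if the table can contain repeated (col1, col2) pairs. The following sketch assumes Python's built-in sqlite3 module (with a SQLite version that supports window functions, 3.25 or later); the extra duplicate rows are hypothetical and not part of the question, and the de-duplicating variant is one possible adjustment rather than the answer's query.

```python
import sqlite3

# Rows from the question's Table1.
BASE_ROWS = [("A", 1), ("B", 1), ("A", 4), ("A", 5), ("A", 3),
             ("A", 2), ("D", 1), ("B", 2), ("C", 3), ("B", 4)]
# Hypothetical duplicates: B now has five rows but still only 1, 2 and 4.
EXTRA_ROWS = [("B", 1), ("B", 2)]

NAIVE_SQL = """
    SELECT col1
    FROM (SELECT col1, col2,
                 ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
          FROM your_table
          WHERE col2 IN (1, 2, 3, 4, 5)) AS ranked
    WHERE rn = 5
    ORDER BY col1
"""

# Same idea, but rows are de-duplicated before they are numbered.
DEDUP_SQL = """
    SELECT col1
    FROM (SELECT col1, col2,
                 ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
          FROM (SELECT DISTINCT col1, col2
                FROM your_table
                WHERE col2 IN (1, 2, 3, 4, 5)) AS d) AS ranked
    WHERE rn = 5
    ORDER BY col1
"""


def run(sql, rows):
    """Load the given rows into an in-memory database and run one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (col1 TEXT, col2 INTEGER)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


print(run(NAIVE_SQL, BASE_ROWS))               # ['A']
print(run(NAIVE_SQL, BASE_ROWS + EXTRA_ROWS))  # ['A', 'B'], B is a false match
print(run(DEDUP_SQL, BASE_ROWS + EXTRA_ROWS))  # ['A']
```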
You can use the LISTAGG function, for example:
SELECT Col1
FROM
(select Col1,listagg(Col2,',') within group (order by Col2) Col2List from Table1
group by Col1)
WHERE Col2List = '1,2,3,4,5'
You can also use the following:
SELECT COL1
FROM TABLE_NAME
GROUP BY COL1
HAVING
COUNT(COL1)=5
AND
SUM(
(CASE WHEN COL2=1 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=2 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=3 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=4 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=5 THEN 1 ELSE 0
END))=5
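Since the question says the required Col2 values come from other parameters, the hard-coded CASE terms above may be awkward to maintain. As one possible alternative (a swapped-in technique, not the answer's query), the sketch below builds the value list from a Python sequence and compares COUNT(DISTINCT Col2) with its length. It assumes Python's built-in sqlite3 module; `make_table1` and `col1_with_all` are names made up for this example.

```python
import sqlite3

# Rows from the question's Table1.
ROWS = [("A", 1), ("B", 1), ("A", 4), ("A", 5), ("A", 3),
        ("A", 2), ("D", 1), ("B", 2), ("C", 3), ("B", 4)]


def make_table1():
    """Create an in-memory copy of the question's Table1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (Col1 TEXT, Col2 INTEGER)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?)", ROWS)
    return conn


def col1_with_all(conn, required):
    """Return the Col1 values that have a row for every value in `required`."""
    wanted = sorted(set(required))
    if not wanted:
        return []
    # One "?" placeholder per wanted value, so the values are bound, not pasted in.
    placeholders = ", ".join("?" for _ in wanted)
    sql = (f"SELECT Col1 FROM Table1 WHERE Col2 IN ({placeholders}) "
           "GROUP BY Col1 HAVING COUNT(DISTINCT Col2) = ? ORDER BY Col1")
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]


conn = make_table1()
print(col1_with_all(conn, [1, 2, 3, 4, 5]))  # ['A']
print(col1_with_all(conn, [1, 2, 4]))        # ['A', 'B']
```

Unlike the COUNT(COL1)=5 condition above, this version does not require a group to have exactly as many rows as there are wanted values, so extra rows with other Col2 values do not exclude a group.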
How can I obtain a transposed UNION of the T-SQL query results below?
SELECT TOP 1 Column_A FROM table1
SELECT TOP 1 Column_B FROM table2
SELECT TOP 1 Column_C FROM table3
So that the output will be ONE row of 3 columns with a single value in each:
[Column_A] [Column_B] [Column_C]
Like this:
Select
(SELECT TOP 1 Column_A FROM table1) as 'Column_A',
(SELECT TOP 1 Column_B FROM table2) as 'Column_B',
(SELECT TOP 1 Column_C FROM table3) as 'Column_C'
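The same scalar-subquery pattern can be tried out locally. This sketch assumes Python's built-in sqlite3 module, so TOP 1 is written as LIMIT 1; the sample values are invented, and the ORDER BY clauses are an addition to make the picked row deterministic (without one, which row TOP 1 returns is not guaranteed).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Column_A INTEGER);
    CREATE TABLE table2 (Column_B TEXT);
    CREATE TABLE table3 (Column_C REAL);   -- left empty on purpose
    INSERT INTO table1 VALUES (20), (10);
    INSERT INTO table2 VALUES ('y'), ('x');
""")

# Each scalar subquery contributes one column of the single result row.
ONE_ROW_SQL = """
    SELECT
        (SELECT Column_A FROM table1 ORDER BY Column_A LIMIT 1) AS Column_A,
        (SELECT Column_B FROM table2 ORDER BY Column_B LIMIT 1) AS Column_B,
        (SELECT Column_C FROM table3 ORDER BY Column_C LIMIT 1) AS Column_C
"""

row = conn.execute(ONE_ROW_SQL).fetchone()
print(row)  # (10, 'x', None)
```

A subquery over an empty table yields NULL (None in Python) for its column, while the outer query still returns exactly one row.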