I am having trouble extracting data from one table based on data in a second table in SQL.
select A, B, C, D
from Table_one T1
where A in (select T2.A from Table_two T2
where T2.E <> 'ZZZ');
This returns A, B, C, D for rows where E in T2 is not ZZZ.
However, when I add another condition to the where clause, as below,
it also returns rows where T2.E is ZZZ.
select A, B, C, D
from Table_one T1
where A in (select T2.A from Table_two T2
where T2.E <> 'ZZZ')
and D <> 0 ;
This seems to ignore the "T2.E <> 'ZZZ'" part, while "D <> 0" is still applied.
Why is this happening?
Because you have duplicates in Table_two: some values of A appear in more than one row. For some of those duplicates, one row has the value ZZZ and another does not, so the IN subquery still returns that A.
You are using the wrong logic if you want to exclude rows that have a ZZZ in table_two. I would recommend NOT EXISTS:
select A, B, C, D
from Table_one T1
where not exists (select 1
from Table_two T2
where T1.A = T2.A and
T2.E = 'ZZZ'
) and
D <> 0 ;
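The effect of duplicates can be reproduced with a small in-memory SQLite database driven from Python. This is only a sketch: the sample rows are invented for illustration, assuming A = 1 appears twice in Table_two, once with 'ZZZ' and once with another value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_one (A INTEGER, B TEXT, C TEXT, D INTEGER);
    CREATE TABLE Table_two (A INTEGER, E TEXT);
    -- Hypothetical sample data: A = 1 has both a 'ZZZ' row and a non-'ZZZ' row.
    INSERT INTO Table_one VALUES (1, 'b1', 'c1', 5), (2, 'b2', 'c2', 7), (3, 'b3', 'c3', 0);
    INSERT INTO Table_two VALUES (1, 'ZZZ'), (1, 'YYY'), (2, 'YYY'), (3, 'YYY');
""")

# Original logic: keep A if ANY matching row in Table_two is not 'ZZZ'.
in_rows = [r[0] for r in conn.execute("""
    SELECT A FROM Table_one T1
    WHERE A IN (SELECT T2.A FROM Table_two T2 WHERE T2.E <> 'ZZZ')
      AND D <> 0
    ORDER BY A
""")]

# Suggested logic: keep A only if NO matching row in Table_two is 'ZZZ'.
not_exists_rows = [r[0] for r in conn.execute("""
    SELECT A FROM Table_one T1
    WHERE NOT EXISTS (SELECT 1 FROM Table_two T2
                      WHERE T1.A = T2.A AND T2.E = 'ZZZ')
      AND D <> 0
    ORDER BY A
""")]

print(in_rows)          # A = 1 is kept by the IN version
print(not_exists_rows)  # A = 1 is dropped by the NOT EXISTS version
```

With this data the IN version keeps A = 1 because one of its rows is not 'ZZZ', while NOT EXISTS drops it because one of its rows is.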
I have written the query:
Select distinct a,b from t1 minus Select distinct a,b from t2.
Here t1 and t2 are two tables. I want the distinct values of a and b that occur in t1 but not in t2, so I'm using the minus operator. I want both a and b in the output, but I know that in some cases the value of b for the same a may be different in t1 and t2. Minus compares whole rows, so such rows are not removed and the result includes values of a that are present in both tables. How can I do this successfully?
How can I get values of a and b that are present in table t1 but not in table t2 even though in some cases values of b might not match in both the tables?
table1:              table2:
column1  column2     column1  column2
1        a           1        c
2        b           3        d
In this case I would want values (2,b) only. I would not want (1,a) as 1 is also present in table2.
Start with not exists:
select distinct. . .
from t1
where not exists (select 1 from t2 where t2.a = t1.a and t2.b = t1.b);
From what you describe, you might want the comparison only on a:
select distinct a, b
from t1
where not exists (select 1 from t2 where t2.a = t1.a);
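The difference between the two NOT EXISTS variants can be checked against the sample rows from the question. A minimal sketch using Python's sqlite3 module, assuming the question's column1/column2 map to a/b:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER, b TEXT);
    CREATE TABLE t2 (a INTEGER, b TEXT);
    -- Sample rows taken from the question.
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (1, 'c'), (3, 'd');
""")

# Compare on both columns: (1, 'a') survives because t2 holds (1, 'c'), not (1, 'a').
both_columns = conn.execute("""
    SELECT DISTINCT a, b FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.a = t1.a AND t2.b = t1.b)
    ORDER BY a
""").fetchall()

# Compare on a only: any row whose a appears in t2 is removed.
key_only = conn.execute("""
    SELECT DISTINCT a, b FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2 WHERE t2.a = t1.a)
    ORDER BY a
""").fetchall()

print(both_columns)
print(key_only)
```

Only the comparison on a alone yields the (2, b) row the question asks for.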
Another option is to use a subquery in the WHERE condition, as below:
SELECT A.*
FROM table1 A
WHERE A.column1 NOT IN
(SELECT DISTINCT column1 FROM table2)
You can also use a LEFT JOIN, as below, which gives the same output:
SELECT A.*
FROM table1 A
LEFT JOIN table2 B ON A.column1 = B.column1
WHERE B.column1 IS NULL
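One caveat when choosing between these two forms: they agree only while table2.column1 contains no NULLs. The sketch below adds a NULL key to the question's sample data (that extra row is an assumption made purely to show the difference) and runs both queries in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 INTEGER, column2 TEXT);
    CREATE TABLE table2 (column1 INTEGER, column2 TEXT);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
    -- The (NULL, 'e') row is hypothetical, added to expose the NOT IN behaviour.
    INSERT INTO table2 VALUES (1, 'c'), (3, 'd'), (NULL, 'e');
""")

# 2 NOT IN (1, 3, NULL) evaluates to NULL rather than true, so no row qualifies.
not_in = conn.execute("""
    SELECT A.* FROM table1 A
    WHERE A.column1 NOT IN (SELECT column1 FROM table2)
""").fetchall()

# The anti-join is unaffected: a NULL key in table2 never matches anything.
left_join = conn.execute("""
    SELECT A.* FROM table1 A
    LEFT JOIN table2 B ON A.column1 = B.column1
    WHERE B.column1 IS NULL
""").fetchall()

print(not_in)
print(left_join)
```

If the key column can be NULL, NOT EXISTS or the LEFT JOIN form is the safer choice.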
For rows that are not included in t2, you can use either NOT EXISTS or a LEFT OUTER JOIN.
Here is the solution.
Using NOT EXISTS
SELECT DISTINCT A,B FROM T1 WHERE NOT EXISTS (SELECT 1 FROM T2 WHERE T2.A = T1.A AND T2.B = T1.B);
Using Left Join
SELECT DISTINCT T1.a, T1.b FROM T1 LEFT JOIN T2 ON T1.a = T2.a AND T1.b = T2.b WHERE T2.a IS NULL;
Hope it helps.
How do I get not-null results from a sub query in SELECT statement?
SELECT a, b, c,
(SELECT d
FROM table2
WHERE ...) as d
FROM table1
WHERE ...
I want to get results only when all values (a, b, c, d) are not NULL.
Wouldn't it be weird or inefficient to repeat the same subquery in the main WHERE clause, this time with EXISTS?
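The easiest way to do this is to put your original query in a subquery, then test the whole row that the subquery returns. This relies on row-value comparison as implemented in PostgreSQL, where a row IS NOT NULL only when every one of its fields is non-null: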
SELECT *
FROM (
SELECT a, b, c,
(SELECT d
FROM table2
WHERE ...) AS d
FROM table1
WHERE ...
) AS sub
WHERE sub IS NOT NULL
sub being the row of (a,b,c,d) returned by the subquery.
You can use a subquery:
select a, b, c, d
from (SELECT a, b, c,
(SELECT d
FROM table2
WHERE ...) as d
FROM table1
WHERE ... and
a is not null and b is not null and c is not null
) x
where d is not null;
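The derived-table approach can be tried end to end in SQLite from Python. The original elides the WHERE conditions, so the sketch below assumes a hypothetical id column that links the two tables; the sample rows are invented as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: an id column is assumed as the correlation key.
    CREATE TABLE table1 (id INTEGER, a TEXT, b TEXT, c TEXT);
    CREATE TABLE table2 (id INTEGER, d TEXT);
    INSERT INTO table1 VALUES (1, 'a1', 'b1', 'c1'),
                              (2, 'a2', NULL, 'c2'),
                              (3, 'a3', 'b3', 'c3');
    INSERT INTO table2 VALUES (1, 'd1'), (2, 'd2');
""")

rows = conn.execute("""
    SELECT a, b, c, d
    FROM (SELECT a, b, c,
                 (SELECT t2.d FROM table2 t2 WHERE t2.id = t1.id) AS d
          FROM table1 t1
          WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL
         ) x
    WHERE d IS NOT NULL
""").fetchall()

print(rows)
```

Row 2 is dropped by the inner filter because b is NULL, and row 3 is dropped by the outer filter because the scalar subquery finds no match and yields NULL. The scalar subquery is written only once.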
In all likelihood, though, you can use JOIN:
SELECT a, b, c, x.d
FROM table1 JOIN
(SELECT d
FROM table2
WHERE ...
) x ON ...
WHERE ... and
a is not null and b is not null and c is not null and x.d is not null;
SELECT
t1.a,
t1.b,
t1.c,
t2.d
FROM table1 t1
left join table2 as t2 on t2.ID = t1.ID
WHERE t1.a is not null and t1.b is not null and t1.c is not null and t2.d is not null
I have two rather large tables in a SQL database; they share one column that relates them.
I want to pull only around 10,000 rows from the joined result, so I don't have to query the whole database.
I want to keep this as generic as possible, so let's say we have one table with fields A,B,C and another with C,D,E. Each table has around 3 million entries, and my output should be a table with A,B,C,D,E with only 10,000 entries.
Thanks!
Select
A, B, C, D, E
from
table1 t1
inner join
table2 t2 on t1.c = t2.c
limit 10000
Here is an answer for SQL Server:
select top(10000) A, B, C, D, E
into t3
from t1 join t2 on t1.pk = t2.fk
Here is an answer for MySQL:
create table t3 as
select A, B, C, D, E
from t1 inner join t2 on t1.pk = t2.fk
limit 10000
SELECT k.A,k.B,k.C,k.D,k.E
FROM
(
SELECT t1.A,t1.B,t1.C,t2.D,t2.E,ROW_NUMBER() OVER ( ORDER BY t1.c ) AS rn
FROM table1 t1
INNER JOIN
table2 t2
ON t1.c = t2.c
) k
WHERE k.rn <= 10000;
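The two ways of capping the joined result, LIMIT and a ROW_NUMBER() filter, can be compared on a toy dataset. This is a sketch only: it uses SQLite (window functions need SQLite 3.25 or later), 50 invented rows per table, and N = 10 standing in for the 10,000 of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (A TEXT, B TEXT, C INTEGER);
    CREATE TABLE table2 (C INTEGER, D TEXT, E TEXT);
""")
# Hypothetical data: 50 rows per table, related through the shared column C.
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(f"a{i}", f"b{i}", i) for i in range(50)])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?)",
                 [(i, f"d{i}", f"e{i}") for i in range(50)])

N = 10  # stands in for the 10,000 rows wanted in the question

# Variant 1: join, order, and let LIMIT cut the result.
limited = conn.execute("""
    SELECT t1.A, t1.B, t1.C, t2.D, t2.E
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.C = t2.C
    ORDER BY t1.C
    LIMIT ?
""", (N,)).fetchall()

# Variant 2: number the joined rows and keep the first N.
numbered = conn.execute("""
    SELECT k.A, k.B, k.C, k.D, k.E
    FROM (SELECT t1.A, t1.B, t1.C, t2.D, t2.E,
                 ROW_NUMBER() OVER (ORDER BY t1.C) AS rn
          FROM table1 t1
          INNER JOIN table2 t2 ON t1.C = t2.C) k
    WHERE k.rn <= ?
    ORDER BY k.C
""", (N,)).fetchall()

print(len(limited), limited == numbered)
```

Because C is unique in this toy data, both variants pick the same N rows. The ROW_NUMBER() form is mainly useful on systems without LIMIT or TOP.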
Select top 10000
A, B, C, D, E
from
table1 t1
inner join
table2 t2 on t1.c = t2.c
Order by newid()
The query above (SQL Server syntax) returns 10,000 random records.
Let's say I have a table with columns: A, B, C & D
Any two rows are considered a duplicate if:
A, B, C have equal values but not D
or
A, B, D have equal values but not C.
How do I get a set of duplicate rows? Using a CTE is OK.
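I think you can do it with a union of two self-joins, one for each condition. The comparison has to be between two rows, not between columns of the same row.
select t1.* from tablename t1 join tablename t2 on t1.a=t2.a and t1.b=t2.b and t1.c=t2.c and t1.d<>t2.d
union
select t1.* from tablename t1 join tablename t2 on t1.a=t2.a and t1.b=t2.b and t1.d=t2.d and t1.c<>t2.c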
Using a self join it's quite easy:
SELECT DISTINCT t1.*
FROM TableName t1
INNER JOIN TableName t2
ON T1.A = T2.A
AND T1.B = T2.B
AND (T1.C = T2.C OR T1.D = T2.D)
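Assuming, of course, that if all 4 columns are equal it's a duplicated row as well. Note that, as written, every row also matches itself in this join, so you need an extra condition on a unique key column (if the table has one) to avoid getting every row back.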
However, if for some strange reason these rows are not considered as duplicates, you can change the conditions in the ON clause to this:
SELECT DISTINCT t1.*
FROM TableName t1
INNER JOIN TableName t2
ON T1.A = T2.A
AND T1.B = T2.B
AND (
(T1.C = T2.C AND T1.D <> T2.D)
OR (T1.C <> T2.C AND T1.D = T2.D)
)
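The stricter self-join can be checked on a handful of rows. A minimal sketch in SQLite via Python; the table contents are invented so that one pair differs only in D, one pair differs only in C, and two rows have no duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableName (A INTEGER, B INTEGER, C INTEGER, D INTEGER);
    -- Hypothetical rows.
    INSERT INTO TableName VALUES
        (1, 1, 1, 1), (1, 1, 1, 2),   -- same A, B, C; different D
        (2, 2, 3, 4), (2, 2, 5, 4),   -- same A, B, D; different C
        (3, 3, 3, 3),                 -- no partner at all
        (1, 1, 9, 9);                 -- same A, B as the first pair, but C and D both differ
""")

duplicates = conn.execute("""
    SELECT DISTINCT t1.*
    FROM TableName t1
    INNER JOIN TableName t2
       ON t1.A = t2.A
      AND t1.B = t2.B
      AND (
            (t1.C = t2.C AND t1.D <> t2.D)
         OR (t1.C <> t2.C AND t1.D = t2.D)
          )
    ORDER BY t1.A, t1.B, t1.C, t1.D
""").fetchall()

print(duplicates)
```

Because each branch of the condition requires one column to differ, a row can never match itself here, so no key column is needed to exclude self-matches.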
You can use RANK() to detect duplicates without having to select from the table twice :
SELECT s.* FROM (
SELECT t.*,
RANK() OVER(PARTITION BY t.a,t.b,t.c ORDER BY t.d) as d_dif,
RANK() OVER(PARTITION BY t.a,t.b,t.D ORDER BY t.c) as c_dif
FROM YourTable t) s
WHERE s.d_dif > 1 or s.c_dif > 1
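Unlike ROW_NUMBER(), RANK() gives tied rows the same rank, so if two rows in a partition have the same d (or c), both get the same rank and neither is selected on that account. Note that the first row of each partition always has rank 1, so only the later rows of each duplicate group are returned.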
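To see which rows the RANK() approach actually returns, here is a small sketch in SQLite (3.25 or later for window functions) driven from Python. The rows are invented, and the inner table is given the alias t so that t.* resolves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (a INTEGER, b INTEGER, c INTEGER, d INTEGER);
    -- Hypothetical rows: one pair differing in d, one pair differing in c.
    INSERT INTO YourTable VALUES
        (1, 1, 1, 1), (1, 1, 1, 2),
        (2, 2, 3, 4), (2, 2, 5, 4),
        (3, 3, 3, 3);
""")

ranked = conn.execute("""
    SELECT s.a, s.b, s.c, s.d FROM (
        SELECT t.*,
               RANK() OVER (PARTITION BY t.a, t.b, t.c ORDER BY t.d) AS d_dif,
               RANK() OVER (PARTITION BY t.a, t.b, t.d ORDER BY t.c) AS c_dif
        FROM YourTable t) s
    WHERE s.d_dif > 1 OR s.c_dif > 1
    ORDER BY s.a, s.b, s.c, s.d
""").fetchall()

print(ranked)
```

On this data the query returns one row per pair, the one ranked second, rather than both members, which is worth keeping in mind if the full set of duplicates is needed.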
I want to get one row for each unique value of b: the row with the minimum value of c for that value of b. There can be more than one row with the same minimum value; in that case, just choose the first one.
myTable
a integer (unique)
b integer
c integer
I've tried this query
SELECT t1.*
FROM myTable t1,
(SELECT b,
MIN(c) as c
FROM myTable
GROUP BY b) t2
WHERE t1.b = t2.b
AND t1.c = t2.c
However, in this table it's possible for there to be more than 1 instance of the minimum value of c for a given value of b. The above query generates duplicates under these conditions.
I've got a feeling that I need to use rownum somewhere, but I'm not quite sure where.
You can use ROW_NUMBER:
SELECT *
FROM (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY b ORDER BY c, a) AS rn
FROM myTable t
) T1
WHERE rn = 1
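A quick way to confirm that ROW_NUMBER() returns exactly one row per b, even when the minimum c is tied, is to run it on a few rows. This sketch uses SQLite (3.25 or later) from Python with invented data, and orders by c and then a so that the choice among tied rows is deterministic; the extra sort key is an assumption about what "the first one" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (a INTEGER UNIQUE, b INTEGER, c INTEGER);
    -- Hypothetical rows: b = 10 has two rows tied on the minimum c = 3.
    INSERT INTO myTable VALUES (1, 10, 5), (2, 10, 3), (3, 10, 3),
                               (4, 20, 7), (5, 20, 8);
""")

first_per_b = conn.execute("""
    SELECT a, b, c
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY b ORDER BY c, a) AS rn
          FROM myTable t) T1
    WHERE rn = 1
    ORDER BY b
""").fetchall()

print(first_per_b)
```

Without the second sort key the database is free to pick either of the tied rows.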
To tie-break between the equal c's, you will need to subquery one level further to get the min-a for each group of equal c's per b. (A mouthful!)
select t0.*
FROM myTable t0
inner join (
select t1.b, t1.c, MIN(a) as a
from myTable t1
inner join (
select b, min(c) as c
from myTable
group by b
) t2 on t1.b = t2.b and t1.c = t2.c
group by t1.b, t1.c
) t3 on t3.a = t0.a and t3.b = t0.b and t3.c = t0.c
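For comparison, the sketch below runs the query from the question and the nested GROUP BY version side by side in SQLite from Python. The rows are invented, with a tie on the minimum c for b = 10.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (a INTEGER UNIQUE, b INTEGER, c INTEGER);
    -- Hypothetical rows: b = 10 has two rows tied on the minimum c = 3.
    INSERT INTO myTable VALUES (1, 10, 5), (2, 10, 3), (3, 10, 3),
                               (4, 20, 7), (5, 20, 8);
""")

# Query from the question: returns every row that attains the minimum c.
with_ties = conn.execute("""
    SELECT t1.*
    FROM myTable t1,
         (SELECT b, MIN(c) AS c FROM myTable GROUP BY b) t2
    WHERE t1.b = t2.b AND t1.c = t2.c
    ORDER BY t1.a
""").fetchall()

# Nested version: among the tied rows, keep the one with the smallest a.
tie_broken = conn.execute("""
    SELECT t0.*
    FROM myTable t0
    INNER JOIN (
        SELECT t1.b, t1.c, MIN(a) AS a
        FROM myTable t1
        INNER JOIN (
            SELECT b, MIN(c) AS c FROM myTable GROUP BY b
        ) t2 ON t1.b = t2.b AND t1.c = t2.c
        GROUP BY t1.b, t1.c
    ) t3 ON t3.a = t0.a AND t3.b = t0.b AND t3.c = t0.c
    ORDER BY t0.a
""").fetchall()

print(with_ties)
print(tie_broken)
```

This form needs no window functions, so it also works on databases that lack them.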