Filtering for column pairs in SQL

Consider two tables that have the following columns:
A
1,X
2,Y
3,Z
B
1,X
1,Y
1,Z
2,X
2,Y
2,Z
3,X
3,Y
3,Z
Is it possible to select the rows in B that have the same column pairs as in A, without joining or a third column?
Something like:
select * from B where distinct columns in (select distinct columns from A)

You could use EXISTS logic:
SELECT col1, col2
FROM TableB b
WHERE EXISTS (SELECT 1 FROM TableA a WHERE a.col1 = b.col1 AND a.col2 = b.col2);
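To see the EXISTS filter at work on the sample data from the question, here is a minimal sketch that runs it from Python. The in-memory SQLite database is an assumption made only so the example can run, and the names TableA, TableB, col1 and col2 are the ones the answer assumes rather than names given in the question.

```python
import sqlite3

# In-memory SQLite database: an assumption made so the example is runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (col1 INTEGER, col2 TEXT);
    CREATE TABLE TableB (col1 INTEGER, col2 TEXT);
    INSERT INTO TableA VALUES (1, 'X'), (2, 'Y'), (3, 'Z');
""")

# TableB holds every combination of 1..3 and X..Z, as in the question.
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?)",
    [(n, c) for n in (1, 2, 3) for c in "XYZ"],
)

# Keep only the rows of TableB whose (col1, col2) pair also occurs in TableA.
matching_pairs = conn.execute("""
    SELECT col1, col2
    FROM TableB b
    WHERE EXISTS (SELECT 1 FROM TableA a
                  WHERE a.col1 = b.col1 AND a.col2 = b.col2)
    ORDER BY col1, col2
""").fetchall()

print(matching_pairs)  # [(1, 'X'), (2, 'Y'), (3, 'Z')]
```

Only three of the nine rows in TableB survive, because the subquery is evaluated per row of TableB and both columns must match at once.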

Related

Hive join tables and keep only 1 column

I have the table join below and noticed that Hive keeps two copies of the pk column: one from table b and one from table c. Is there a way to keep only one of those columns?
I can always replace select * with an explicit list (select column1, column2, etc.), but that won't be too efficient.
with a as (
select
*
from table1 b left join table2 c
on b.pk = c.pk
)
select
*
from a;
Update 1
Is it possible to alias many columns at once?
For example, the line below works:
select b.pk as duplicate_pk
But is there a way to do something like
select b.* as table2
to add the text table2 before all the column names of table b?
Not sure if you already tried this, but you can choose what to select using either:
b.* to select only the columns of table1
c.* to select only the columns of table2
Example:
with a as (
select
b.*
from table1 b left join table2 c
on b.pk = c.pk
)
select
*
from a;
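The effect of qualifying the star can be checked by looking at the column names a query returns. The sketch below uses an in-memory SQLite database rather than Hive, so treat it as an illustration of the idea only; the non-key columns name and city are hypothetical, and the helper column_names is made up for this example.

```python
import sqlite3

# SQLite stands in for Hive here (an assumption); name and city are hypothetical columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (pk INTEGER, name TEXT);
    CREATE TABLE table2 (pk INTEGER, city TEXT);
    INSERT INTO table1 VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO table2 VALUES (1, 'oslo');
""")

def column_names(sql):
    """Return the result-set column names of a query (hypothetical helper)."""
    cur = conn.execute(sql)
    return [d[0] for d in cur.description]

# Unqualified star: pk comes back twice, once per table.
star_cols = column_names(
    "SELECT * FROM table1 b LEFT JOIN table2 c ON b.pk = c.pk"
)

# Qualified star plus the specific columns wanted from the other table.
mixed_cols = column_names(
    "SELECT b.*, c.city FROM table1 b LEFT JOIN table2 c ON b.pk = c.pk"
)

print(star_cols)
print(mixed_cols)
```

Combining b.* with a short list of columns from c keeps a single pk without having to spell out every column of table1.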

Which is faster: joining two filtered subqueries without WHERE, or joining the tables and filtering with WHERE?

Assume I have 2 tables A (ID, col1, col2) and B (ID, col1, col2).
Is it better to write
SELECT *
FROM
(SELECT *
FROM A
WHERE col1 = 1) A
JOIN
(SELECT *
FROM B
WHERE col2 = 2) B ON A.ID = B.ID
or rather:
SELECT *
FROM A
JOIN B ON A.ID = B.ID
WHERE A.col1 = 1
AND B.col2 = 2
Yes. It's better to use joins.
Joining data through a WHERE clause does NOT let you retrieve unmatched rows the way a LEFT or RIGHT JOIN does; a WHERE clause only returns data the way an INNER JOIN does.
For further details, please see: Visual representation of SQL Joins
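For an inner join, the two forms in the question are logically equivalent, so they return the same rows; which one runs faster depends on the database's optimizer, and many optimizers push the filters down in either case. The sketch below only checks the equivalence of the results, not the timing. It assumes an in-memory SQLite database, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite database with made-up sample rows (assumptions for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER, col1 INTEGER, col2 INTEGER);
    CREATE TABLE B (ID INTEGER, col1 INTEGER, col2 INTEGER);
    INSERT INTO A VALUES (1, 1, 0), (2, 1, 0), (3, 5, 0);
    INSERT INTO B VALUES (1, 0, 2), (2, 0, 9), (3, 0, 2);
""")

# Form 1: filter each table in a subquery, then join.
subquery_form = conn.execute("""
    SELECT A.ID
    FROM (SELECT * FROM A WHERE col1 = 1) A
    JOIN (SELECT * FROM B WHERE col2 = 2) B ON A.ID = B.ID
    ORDER BY A.ID
""").fetchall()

# Form 2: join the tables, then filter in WHERE.
where_form = conn.execute("""
    SELECT A.ID
    FROM A
    JOIN B ON A.ID = B.ID
    WHERE A.col1 = 1
      AND B.col2 = 2
    ORDER BY A.ID
""").fetchall()

print(subquery_form, where_form)  # [(1,)] [(1,)]
```

To compare actual performance, inspect the execution plan your own database produces for each form. With an outer join the two forms can differ, because a WHERE filter on the outer-joined table discards the unmatched rows.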

Why can't we use/refer to derived table in subquery

I wonder why a subquery can't select from a derived table (Query2), even though it can access a column of that derived table (Query1)?
Query1
select B.col1, col2 from dummy B
where B.col1 = (select col1 from dummy A where B.col1 = 'aa' and A.col1 = B.col1 limit 1);
Here we can use B.col1.
Query2
select B.col1, col2 from dummy B
where B.col1 = (select col1 from B where B.col1 = 'aa' limit 1);
Here we can't use B. The error says B doesn't exist.
You can find the sqlfiddle here.
Because the first reference is a correlated reference. A table alias is a reference to a particular "instance" of a table in the query, not another "table" itself.
If you want that ability, use a CTE:
with b as (
select d.*
from dummy d
)
Then you can use b multiple times in the ensuing query.
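As a rough illustration of the difference, the sketch below first runs Query2 and catches the error, then runs the same logic with b defined as a CTE. It assumes an in-memory SQLite database and two made-up rows in dummy; other databases word the error differently.

```python
import sqlite3

# In-memory SQLite database with made-up rows (assumptions for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dummy (col1 TEXT, col2 TEXT);
    INSERT INTO dummy VALUES ('aa', 'first'), ('bb', 'second');
""")

# Query2 from the question: B is only an alias, so FROM B cannot be resolved.
alias_error = None
try:
    conn.execute("""
        SELECT B.col1, col2 FROM dummy B
        WHERE B.col1 = (SELECT col1 FROM B WHERE B.col1 = 'aa' LIMIT 1)
    """)
except sqlite3.OperationalError as exc:
    alias_error = str(exc)

# With a CTE, b is a named query that can appear in FROM more than once.
cte_rows = conn.execute("""
    WITH b AS (
        SELECT d.*
        FROM dummy d
    )
    SELECT b.col1, col2
    FROM b
    WHERE b.col1 = (SELECT col1 FROM b WHERE col1 = 'aa' LIMIT 1)
""").fetchall()

print(alias_error)
print(cte_rows)  # [('aa', 'first')]
```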
The problem is that in your second query you are trying to select from a table that does not exist, whereas in your first query you are referring to a column that does.
In both queries, B and A are only aliases, not tables.
With that in mind, let's remove the aliases from the second query:
select col1, col2 from dummy
where col1 = (select col1 from where col1 = 'aa' limit 1);
Do you see what is missing from the query structure now?
To conclude, the B that causes the problem is the one in FROM B, not the one in WHERE B.col1.
Hope this helps...

SQL: how to check if a value in a column is NOT in another table

Maybe I need another coffee because this seems so simple yet I cannot get my head around it.
Let's say I have a tableA with a col1 where employee IDs are stored... ALL employee IDs. The second table, tableB, has col2, which lists all employee IDs that have a negative evaluation.
I need a query which returns all IDs from col1 of tableA, plus a new column which shows a '1' for those IDs which do NOT exist in col2 of tableB.
I am doing this in dashDB.
One option uses a LEFT JOIN between the two tables:
SELECT a.col1,
CASE WHEN b.col2 IS NULL THEN 1 ELSE 0 END AS new_col
FROM tableA a
LEFT JOIN tableB b
ON a.col1 = b.col2
Alternatively, you can use a LEFT JOIN with the IFNULL function, as below. Note that for IDs that do exist in tableB this returns the matching col2 value rather than 0.
SELECT a.col1,
IFNULL(b.col2, 1) NewCol
FROM tableA a
LEFT JOIN tableB b
ON a.col1 = b.col2
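The two LEFT JOIN variants above can be compared side by side on a small data set. This is a sketch under assumptions: it runs on an in-memory SQLite database rather than dashDB, and the employee IDs 101 to 103 are made up.

```python
import sqlite3

# SQLite stands in for dashDB here (an assumption); the IDs are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (col1 INTEGER);
    CREATE TABLE tableB (col2 INTEGER);
    INSERT INTO tableA VALUES (101), (102), (103);
    INSERT INTO tableB VALUES (102);  -- only 102 has a negative evaluation
""")

# CASE variant: 1 when the ID is absent from tableB, otherwise 0.
case_rows = conn.execute("""
    SELECT a.col1,
           CASE WHEN b.col2 IS NULL THEN 1 ELSE 0 END AS new_col
    FROM tableA a
    LEFT JOIN tableB b ON a.col1 = b.col2
    ORDER BY a.col1
""").fetchall()

# IFNULL variant: 1 when absent, otherwise the matching col2 value itself.
ifnull_rows = conn.execute("""
    SELECT a.col1,
           IFNULL(b.col2, 1) AS NewCol
    FROM tableA a
    LEFT JOIN tableB b ON a.col1 = b.col2
    ORDER BY a.col1
""").fetchall()

print(case_rows)    # [(101, 1), (102, 0), (103, 1)]
print(ifnull_rows)  # [(101, 1), (102, 102), (103, 1)]
```

If tableB can list the same ID more than once, either query returns one row per match, so you may want to deduplicate tableB first.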

How to select all rows from tableA where two correlating columns in tableB are identical to the ones in tableA

Here is an example row found in both tableA and TableB:
col1 col2 col3
123 | asdf | ddd
I was going to try to use a CTE to retrieve this value, but I have only got it to work with one column at a time.
Here is what I have so far, but right now I am stumped.
;with cte as (
SELECT A.col1
,A.col2
FROM tableA A
)
select col1, col2 from cte where col1, col2
not in (select col1, col2 from FHU.dbo.HolidayCallers)
And just to reiterate, I am expecting the output to be the original sample row up top.
Based on your question, you could use EXISTS to find rows where the related columns are in both tables.
SELECT a.ConversationID, a.SendDateUtc
FROM tableA a
WHERE EXISTS(SELECT NULL
FROM FHU.dbo.HolidayCallers hc
WHERE a.ConversationID = hc.ConversationID
AND a.SendDateUtc = hc.SendDateUtc);
But, based on the sample code you provided, it seems like you are looking for a NOT EXISTS condition:
SELECT a.ConversationID, a.SendDateUtc
FROM tableA a
WHERE NOT EXISTS(SELECT NULL
FROM FHU.dbo.HolidayCallers hc
WHERE a.ConversationID = hc.ConversationID
AND a.SendDateUtc = hc.SendDateUtc);
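The difference between the two conditions is easy to see on a few rows. The sketch below assumes an in-memory SQLite database, so the table is called plain HolidayCallers instead of using the three-part SQL Server name, and the sample rows are made up.

```python
import sqlite3

# SQLite stands in for SQL Server here (an assumption); the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (ConversationID INTEGER, SendDateUtc TEXT);
    CREATE TABLE HolidayCallers (ConversationID INTEGER, SendDateUtc TEXT);
    INSERT INTO tableA VALUES
        (123, '2020-01-01'), (123, '2020-01-02'), (456, '2020-01-01');
    INSERT INTO HolidayCallers VALUES (123, '2020-01-01');
""")

# EXISTS: rows of tableA whose pair of columns also appears in HolidayCallers.
in_both = conn.execute("""
    SELECT a.ConversationID, a.SendDateUtc
    FROM tableA a
    WHERE EXISTS (SELECT NULL
                  FROM HolidayCallers hc
                  WHERE a.ConversationID = hc.ConversationID
                    AND a.SendDateUtc = hc.SendDateUtc)
    ORDER BY a.ConversationID, a.SendDateUtc
""").fetchall()

# NOT EXISTS: rows of tableA whose pair of columns does not appear there.
only_in_a = conn.execute("""
    SELECT a.ConversationID, a.SendDateUtc
    FROM tableA a
    WHERE NOT EXISTS (SELECT NULL
                      FROM HolidayCallers hc
                      WHERE a.ConversationID = hc.ConversationID
                        AND a.SendDateUtc = hc.SendDateUtc)
    ORDER BY a.ConversationID, a.SendDateUtc
""").fetchall()

print(in_both)    # [(123, '2020-01-01')]
print(only_in_a)  # [(123, '2020-01-02'), (456, '2020-01-01')]
```

A row with the same ConversationID but a different SendDateUtc counts as a non-match, since both columns are compared together.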
You can use an INNER JOIN to find records that are in both tables:
SELECT a.ConversationID, a.SendDateUtc
FROM tableA AS a
INNER JOIN FHU.dbo.HolidayCallers AS hc ON a.ConversationId = hc.ConversationId
AND a.SendDateUtc = hc.SendDateUtc