Finding unmatched records with SQL - sql

I'm trying to write a query to find records which don't have a matching record in another table.
For example, I have two tables whose structures look something like this:
Table1
State | Product | Distributor | other fields
CA | P1 | A | xxxx
OR | P1 | A | xxxx
OR | P1 | B | xxxx
OR | P1 | X | xxxx
WA | P1 | X | xxxx
VA | P2 | A | xxxx
Table2
State | Product | Version | other fields
CA | P1 | 1.0 | xxxx
OR | P1 | 1.5 | xxxx
WA | P1 | 1.0 | xxxx
VA | P2 | 1.2 | xxxx
(State/Product/Distributor together form the key for Table1. State/Product is the key for Table2)
I want to find all the State/Product/Version combinations which are not using distributor X. (So the result in this example is CA-P1-1.0 and VA-P2-1.2.)
Any suggestions on a query to do this?

SELECT
*
FROM
Table2 T2
WHERE
NOT EXISTS (SELECT *
FROM
Table1 T1
WHERE
T1.State = T2.State AND
T1.Product = T2.Product AND
T1.Distributor = 'X')
This should be ANSI compliant.
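To check the query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only an assumption for the demo (the question does not name a database), and the "other fields" columns are omitted. It runs the NOT EXISTS form and the equivalent LEFT JOIN ... IS NULL form side by side:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (State TEXT, Product TEXT, Distributor TEXT,
                     PRIMARY KEY (State, Product, Distributor));
CREATE TABLE Table2 (State TEXT, Product TEXT, Version TEXT,
                     PRIMARY KEY (State, Product));
INSERT INTO Table1 VALUES
  ('CA','P1','A'), ('OR','P1','A'), ('OR','P1','B'),
  ('OR','P1','X'), ('WA','P1','X'), ('VA','P2','A');
INSERT INTO Table2 VALUES
  ('CA','P1','1.0'), ('OR','P1','1.5'), ('WA','P1','1.0'), ('VA','P2','1.2');
""")

# Anti-join with NOT EXISTS: keep Table2 rows with no distributor-X row in Table1.
not_exists_rows = conn.execute("""
SELECT T2.State, T2.Product, T2.Version
FROM Table2 T2
WHERE NOT EXISTS (SELECT 1 FROM Table1 T1
                  WHERE T1.State = T2.State
                    AND T1.Product = T2.Product
                    AND T1.Distributor = 'X')
ORDER BY T2.State
""").fetchall()

# Same anti-join written as LEFT JOIN ... IS NULL.
left_join_rows = conn.execute("""
SELECT T2.State, T2.Product, T2.Version
FROM Table2 T2
LEFT JOIN Table1 T1
  ON T1.State = T2.State AND T1.Product = T2.Product AND T1.Distributor = 'X'
WHERE T1.Distributor IS NULL
ORDER BY T2.State
""").fetchall()

print(not_exists_rows)  # [('CA', 'P1', '1.0'), ('VA', 'P2', '1.2')]
print(left_join_rows)   # same rows
```

Both forms return CA-P1-1.0 and VA-P2-1.2, which matches the result asked for in the question.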

In T-SQL:
SELECT DISTINCT Table2.State, Table2.Product, Table2.Version
FROM Table2
LEFT JOIN Table1 ON Table1.State = Table2.State AND Table1.Product = Table2.Product AND Table1.Distributor = 'X'
WHERE Table1.Distributor IS NULL
No subqueries required.
Edit: As the comments indicate, the DISTINCT is not necessary. Thanks!

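select * from table2 where state not in (select state from table1 where distributor = 'X')
Probably not the most clever, but that should work for the sample data. Note that it matches on State only, so it would also exclude other products in a state where any product uses distributor X.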

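SELECT t2.State, t2.Product, t2.Version
FROM table2 t2
LEFT JOIN table1 t1 ON t1.State = t2.State AND t1.Product = t2.Product
GROUP BY t2.State, t2.Product, t2.Version
-- keep only combinations where no joined row has distributor X
HAVING SUM(CASE WHEN t1.Distributor = 'X' THEN 1 ELSE 0 END) = 0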

In Oracle:
SELECT t2.State, t2.Product, t2.Version
FROM Table2 t2, Table1 t1
WHERE t1.State(+) = t2.State
AND t1.Product(+) = t2.Product
AND t1.Distributor(+) = :distributor
AND t1.State IS NULL

Related

Joining two tables at max date where conditions occur

I have two tables,
[TABLE_1]
| ID1 | ID2 | ID3 |
|-----+-----+-----|
| 200 | 125 | 300 |
| 206 | 128 | 650 |
| 230 | 543 | 989 |
[TABLE_2]
| ID1 | ID2 | ID3 | Date |
|-----+-----+-----+--------|
| 200 | 125 | 300 | 1/1/18 |
| 200 | 125 | 300 | 1/1/19 |
| 206 | 128 | 650 | 1/1/13 |
| 206 | 128 | 650 | 1/2/13 |
| 206 | 128 | 650 | 9/5/05 |
I'm trying to Left Join TABLE_1 to TABLE_2 while filtering the output so only rows where Date is at its maximum for those classifications are displayed. I simplified the data in my tables a little bit, but there's NO overall max date that can be used for all items in the table, the max date is unique to each item.
Desired results on the above example would be:
| ID1 | ID2 | ID3 | Date |
|-----+-----+-----+--------|
| 200 | 125 | 300 | 1/1/19 |
| 206 | 128 | 650 | 1/2/13 |
Here's my latest attempt at the query. It seems a little too complicated, as I'm relatively new to SQL, and it has been running for a long time without returning a result, so I'm afraid it may be stuck in an endless loop somehow:
SELECT DISTINCT *
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2
ON t1.ID1 = t2.ID1
AND t1.ID2 = t2.ID2
AND t1.ID3 = t2.ID3
WHERE t2.Date = (SELECT MAX(Date) FROM Table_2
WHERE t1.ID1 = t2.ID2
AND t1.ID2 = t2.ID2
AND t1.ID3 = t2.ID3);
Any help on how better to query this will be greatly appreciated, thanks!
Notes: The column names are not identical (I know this would cause an error), I only labeled them like that for simplification.
For your given results, why not just aggregate table 2?
select id1, id2, id3, max(date)
from table2
group by id1, id2, id3;
If you need to filter this only for the triples in table1, then:
select t2.id1, t2.id2, t2.id3, max(t2.date)
from table2 t2 join
table1 t1
on t2.id1 = t1.id1 and t2.id2 = t1.id2 and t2.id3 = t1.id3
group by t2.id1, t2.id2, t2.id3;
So as per the comments, I think the following is the complete code:
SELECT
T1.ID1,
T1.ID2,
T1.ID3,
T2.MAX_DATE AS DATE
FROM
TABLE1 T1
-- LEFT JOIN is used because T2 may not have a row for every row in T1
LEFT JOIN (
SELECT
ID1,
ID2,
ID3,
MAX(DATE) AS MAX_DATE
FROM
TABLE2
GROUP BY
ID1,
ID2,
ID3
) T2 ON T2.ID1 = T1.ID1
AND T2.ID2 = T1.ID2
AND T2.ID3 = T1.ID3;
Cheers!!
You can also use the WITH statement, preaggregating the entries in table 2:
WITH agg_table_2 AS (
SELECT
ID1
,ID2
,ID3
,MAX(DATE) AS MAX_DATE
FROM TABLE_2
GROUP BY
ID1
,ID2
,ID3
)
SELECT
T1.ID1
,T1.ID2
,T1.ID3
,T2.MAX_DATE
FROM TABLE_1 T1
--LEFT
JOIN agg_table_2 T2
ON T1.ID1 = T2.ID1
AND T1.ID2 = T2.ID2
AND T1.ID3 = T2.ID3
;
Note that I commented out the LEFT in the JOIN because the row 230, 543, 989 is not present in your desired output. Uncomment it if you want to keep that row with a NULL date.
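As a quick check of the pre-aggregation approach, here is a small sketch using Python's sqlite3 module with the sample rows from the question. Two assumptions: SQLite stands in for whatever database the asker uses, and the m/d/yy dates are rewritten as ISO strings (reading 1/2/13 as January 2) so that MAX() compares them correctly. The helper latest_dates is a hypothetical name used only for this demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TABLE_1 (ID1 INTEGER, ID2 INTEGER, ID3 INTEGER);
CREATE TABLE TABLE_2 (ID1 INTEGER, ID2 INTEGER, ID3 INTEGER, Date TEXT);
INSERT INTO TABLE_1 VALUES (200,125,300), (206,128,650), (230,543,989);
-- Assumption: dates stored as ISO strings so MAX() sorts them chronologically
INSERT INTO TABLE_2 VALUES
  (200,125,300,'2018-01-01'), (200,125,300,'2019-01-01'),
  (206,128,650,'2013-01-01'), (206,128,650,'2013-01-02'),
  (206,128,650,'2005-09-05');
""")

def latest_dates(join_kind):
    """Join TABLE_1 to the per-key maximum date using the given join keyword."""
    sql = f"""
    WITH agg_table_2 AS (
        SELECT ID1, ID2, ID3, MAX(Date) AS MAX_DATE
        FROM TABLE_2
        GROUP BY ID1, ID2, ID3
    )
    SELECT T1.ID1, T1.ID2, T1.ID3, T2.MAX_DATE
    FROM TABLE_1 T1
    {join_kind} agg_table_2 T2
      ON T1.ID1 = T2.ID1 AND T1.ID2 = T2.ID2 AND T1.ID3 = T2.ID3
    ORDER BY T1.ID1
    """
    return conn.execute(sql).fetchall()

inner_rows = latest_dates("JOIN")       # only keys that have dates
left_rows = latest_dates("LEFT JOIN")   # also keeps 230/543/989 with None
print(inner_rows)
print(left_rows)
```

Because TABLE_2 is reduced to one row per key before the join, there is no correlated subquery to evaluate per row, which is usually what makes the original attempt slow.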

What is the correct way from performance perspective to match(replace) every value in every row in temp table using SQL Server 2016 or 2017?

I am wondering what I should use in SQL Server 2016 or 2017 (CTE, LOOP, JOINS, CURSOR, REPLACE, etc.) to match (replace) every value in every row of a temp table. What is the best solution from a performance perspective?
Source Table
|id |id2|
| 1 | 2 |
| 2 | 1 |
| 1 | 1 |
| 2 | 2 |
Mapping Table
|id |newid|
| 1 | 3 |
| 2 | 4 |
Expected result
|id |id2|
| 3 | 4 |
| 4 | 3 |
| 3 | 3 |
| 4 | 4 |
You may join the second table to the first table twice:
WITH cte AS (
SELECT
t1.id AS id_old,
t1.id2 AS id2_old,
t2a.newid AS id_new,
t2b.newid AS id2_new
FROM table1 t1
LEFT JOIN table2 t2a
ON t1.id = t2a.id
LEFT JOIN table2 t2b
ON t1.id2 = t2b.id
)
UPDATE cte
SET
id_old = id_new,
id2_old = id2_new;
Not sure if you want just a select here, or maybe an update, or an insert into another table. In any case, the core logic I gave above should work for all these cases.
You'd need to use a join in the update query. Something like this:
Update tblA set column1 = 'something', column2 = 'something'
from actualName tblA
inner join MappingTable tblB
on tblA.ID = tblB.ID
This query compares each row by ID and, where a match is found, updates (replaces) the column values as you specify.
Just join the mapping table twice:
SELECT t1.newid as id, t2.newid as id2
FROM table1 t
INNER JOIN table2 t1 on t1.id = t.id
INNER JOIN table2 t2 on t2.id = t.id2
This may have the best performance of the solutions posted here, if you have indexes set appropriately:
select (select [newid] from MappingTable where id = [ST].[id]) [id],
(select [newid] from MappingTable where id = [ST].[id2]) [id2]
from SourceTable [ST]
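The "join the mapping table twice" idea can be checked against the sample data with a short sketch in Python's sqlite3 module. SQLite is an assumption made only so the example runs anywhere; it says nothing about SQL Server performance. The table names SourceTable and MappingTable follow the headings in the question, and ordering by rowid is a SQLite-specific way to keep the source row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SourceTable (id INTEGER, id2 INTEGER);
CREATE TABLE MappingTable (id INTEGER PRIMARY KEY, newid INTEGER);
INSERT INTO SourceTable VALUES (1,2), (2,1), (1,1), (2,2);
INSERT INTO MappingTable VALUES (1,3), (2,4);
""")

# One join per column to translate: m1 maps id, m2 maps id2.
mapped = conn.execute("""
SELECT m1.newid AS id, m2.newid AS id2
FROM SourceTable s
JOIN MappingTable m1 ON m1.id = s.id
JOIN MappingTable m2 ON m2.id = s.id2
ORDER BY s.rowid
""").fetchall()

print(mapped)  # [(3, 4), (4, 3), (3, 3), (4, 4)]
```

The output matches the expected result in the question. With inner joins, a source value that has no entry in the mapping table drops the row; a LEFT JOIN would keep it with a NULL instead.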

Getting all the current effective records from an Oracle table

I have two tables in oracle database
Table 1 say table1 with fields (id, name)
Records e.g.
###############
id | name
1 | Chair
2 | Table
3 | Bed
###############
and Table 2 say table2 with fields (id, table1_id, date, price)
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
2 | 1 | 2013-08-09 | 300
3 | 2 | 2013-09-09 | 5100
4 | 2 | 2013-08-09 | 5000
5 | 3 | 2013-09-09 | 10500
################################
What I want to achieve is to retrieve the latest price of each item from table 2.
Result of SQL should be like
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
3 | 2 | 2013-09-09 | 5100
5 | 3 | 2013-09-09 | 10500
################################
I am able to do this in MySQL with the following query:
SELECT t2.id, t1.id, t1.name, t2.date, t2.price
FROM table1 t1 JOIN table2 t2
ON (t1.id = t2.table1_id
AND t2.id = (
SELECT id
FROM table2
WHERE table1_id = t1.id
ORDER BY table2.date DESC
LIMIT 1
));
but it's not working in Oracle. I need a query which can run on both servers with minor modification.
You may try this (should work in both MySQL and Oracle):
select t2.id, t2.table1_id, t2.dat, t2.price
from table1 t1 join table2 t2 on (t1.id = t2.table1_id)
join (select table1_id, max(dat) max_date
from table2 group by table1_id) tmax
on (tmax.table1_id = t2.table1_id and tmax.max_date = t2.dat);
This query may return several rows for the same table1_id and date if there are several prices in table2, like this:
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
2 | 1 | 2013-09-09 | 300
It's possible to change the query to retrieve only 1 row for each table1_id, but there should be some additional requirements (which row to choose in the above example)
If it doesn't matter, then you may try this:
select max(t2.id) as id, t2.table1_id, t2.dat, max(t2.price) as price
from table1 t1 join table2 t2 on (t1.id = t2.table1_id)
join (select table1_id, max(dat) max_date
from table2 group by table1_id) tmax
on (tmax.table1_id = t2.table1_id and tmax.max_date = t2.dat)
group by t2.table1_id, t2.dat;
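Here is a minimal sketch that runs the join-to-max-date query on the question's sample rows, using Python's sqlite3 module. SQLite is an assumption chosen only to make the example runnable; the query sticks to plain joins and GROUP BY, which is why it should carry over to MySQL and Oracle. The date column is named dat, as in the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER,
                     dat TEXT, price INTEGER);
INSERT INTO table1 VALUES (1,'Chair'), (2,'Table'), (3,'Bed');
INSERT INTO table2 VALUES
  (1,1,'2013-09-09',500),  (2,1,'2013-08-09',300),
  (3,2,'2013-09-09',5100), (4,2,'2013-08-09',5000),
  (5,3,'2013-09-09',10500);
""")

# tmax holds one row per item with its latest date; joining back to table2
# on (table1_id, dat) picks the price row(s) recorded on that date.
latest = conn.execute("""
SELECT t2.id, t2.table1_id, t2.dat, t2.price
FROM table1 t1
JOIN table2 t2 ON t1.id = t2.table1_id
JOIN (SELECT table1_id, MAX(dat) AS max_date
      FROM table2
      GROUP BY table1_id) tmax
  ON tmax.table1_id = t2.table1_id AND tmax.max_date = t2.dat
ORDER BY t2.table1_id
""").fetchall()

print(latest)
# [(1, 1, '2013-09-09', 500), (3, 2, '2013-09-09', 5100), (5, 3, '2013-09-09', 10500)]
```

This reproduces the desired result from the question. If an item had two prices on its latest date, both rows would come back, as described above.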
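You can try this using GROUP BY instead, since you're not retrieving the product name from table1, only the product id (which is already in table2). Note that this returns only the latest date per item; to get the matching id and price you still need to join the result back to table2:
SELECT table1_id, max(date)
FROM table2
GROUP BY table1_id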
this is what you want :
select t2.id,t2.table1_id,t1.name,t2.pricedate,t2.price
from table1 t1
join
(
select id,table1_id, pricedate,price, row_number() over (partition by table1_id order by pricedate desc) rn
from table2
) t2
on t1.id = t2.table1_id
where t2.rn = 1

Postgresql array_agg, INNER JOIN and LEFT JOIN problems

I have a slight problem with one of my queries. The goal of this query is to get all the table1 items of a user and their information. As you can see, the data model is quite complex (for good reasons), and this requires a big query (my goal is to gather everything with one query only).
Here is the data model :
What I want :
All T1 info
All T2 info for one T1 item (it is a 1-to-n relation, so I'll use array_agg)
All T3 info for one T1 item
All T4 info for one T1 item
All T6 info for one T1 item
i18n info for the T1 item
Here are the table1_table2 and table4_table6 SELECT * :
table1_id | table2_id
-------------+---------------
item2id | table2item1
item4id | table2item2
item4id | table2item1
item5id | table2item3
item5id | table2item2
table4_id | table6_id
------------------+--------------------
table4item1 | table6item1
table4item1 | table6item2
table4item2 | table6item2
table4item3 | table6item3
table4item1 | table6item3
table4item2 | table6item3
Here are the Table1 SELECT with id and its foreign key.
table1_id | table3_id
------------------------
item1id | table3item1
item2id | table3item1
item6id | table3item4
item3id | table3item2
item4id | table3item2
item5id | table3item3
Same for table3 :
table3_id | table4_id
------------+--------------
table3item1 | table4item1
table3item4 | table4item1
table3item2 | table4item2
table3item3 | table4item3
Finally, here is my query :
SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items,
array_to_json(array_agg(row_to_json(t6))) AS table6items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
LEFT JOIN table4_table6 t5_t6 ON t5_t6.table4_id = t3.table4_id
LEFT JOIN table6 t6 ON t6.id = t5_t6.table6_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
Here is the result :
id | name | table3_id | table4_id | table2items | table6items
-------------+------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
item1id | MyFirstItem | table3item1 | table4item1 | [null,null,null] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item2id | MySecondItem | table3item1 | table4item1 | [{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item1"},{"id":"table6item2"},{"id":"table6item3"}]
item3id | MyThirdItem | table3item2 | table4item2 | [null,null] | [{"id":"table6item2"},{"id":"table6item3"}]
item4id | MyFourthItem | table3item2 | table4item2 | [{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10},{"id":"table2item2","data1":"range","data2":20},{"id":"table2item1","data1":"damage","data2":10}] | [{"id":"table6item2"},{"id":"table6item3"},{"id":"table6item3"},{"id":"table6item2"}]
item5id | MyFifthItem | table3item3 | table4item3 | [{"id":"table2item3","data1":"range","data2":20},{"id":"table2item2","data1":"range","data2":20}] | [{"id":"table6item3"},{"id":"table6item3"}]
item6id | MySixthItem | table3item4 | table4item1 | [null,null,null] | [{"id":"table6item2"},{"id":"table6item1"},{"id":"table6item3"}]
Well, I've got a problem here. As you can see, my table2_items and table6_items arrays have the same size. I don't know the reason for this, but it seems that I'm missing something.
Worse, instead of filling this array with null value, this query creates duplicates which should not appear.
Details :
item1 and item6 have the same problem : no links to table2, and 3 items in table6. I end up with an array [null, null, null] for table2_items
item2 has 3 links to table 6, and 1 to table2. I end up with 3 times the same table2 object in the array
item4... I don't know what's happening here. Should have 2 things in each array, and I've got 4 (duplicates)
item5 : you can clearly see the duplication.
I have tried to group by table6.id, or table2.id. It doesn't work (I get a line for each of them, so several lines for each item).
Note : If I do
SELECT t1.id,
na.name,
array_to_json(array_agg(row_to_json(t2))) AS table2items
FROM table1 t1
INNER JOIN table1_i18n na ON na.table1_id = t1.id
INNER JOIN table3 t3 ON t3.id = t1.table3_id
INNER JOIN table4 t4 ON t4.id = t3.table4_id
LEFT JOIN table1_table2 t1t2 ON t1t2.table1_id = t1.id
LEFT JOIN table2 t2 ON t2.id = t1t2.table2_id
WHERE t1.user_id = 'myuserid' AND na.lang = 'en_US'
GROUP BY t1.id, na.name, t4.id
ORDER BY t1.id;
alone, it works perfectly. Same for t6. It's only when I try to gather everything at the same time that I got some problems.
If it is not clear enough, ask for details. It's really not easy to explain such a problem :).
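The symptoms described (both arrays always the same size, table2 entries repeated once per table6 link) are what you would expect when two independent one-to-many joins are aggregated in the same GROUP BY: each item's rows are the cross product of its two link sets. Below is a minimal sketch of that effect and of one common way around it, aggregating each relation in its own subquery before joining. It uses Python's sqlite3 module with hypothetical tables item, item_a and item_b (standing in for table1, the table2 links and the table6 links) and COUNT in place of array_agg, so it illustrates the mechanism rather than the exact PostgreSQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-ins: item ~ table1, item_a ~ table2 links, item_b ~ table6 links
CREATE TABLE item (id TEXT PRIMARY KEY);
CREATE TABLE item_a (item_id TEXT, a TEXT);
CREATE TABLE item_b (item_id TEXT, b TEXT);
INSERT INTO item VALUES ('item2id');
INSERT INTO item_a VALUES ('item2id', 'table2item1');
INSERT INTO item_b VALUES
  ('item2id', 'table6item1'), ('item2id', 'table6item2'), ('item2id', 'table6item3');
""")

# Both joins in one GROUP BY: 1 a-row x 3 b-rows = 3 joined rows,
# so the single a-row is counted three times.
naive = conn.execute("""
SELECT i.id, COUNT(a.a), COUNT(b.b)
FROM item i
LEFT JOIN item_a a ON a.item_id = i.id
LEFT JOIN item_b b ON b.item_id = i.id
GROUP BY i.id
""").fetchall()

# Aggregate each relation separately, then join one row per item.
pre_aggregated = conn.execute("""
SELECT i.id, COALESCE(a.n, 0), COALESCE(b.n, 0)
FROM item i
LEFT JOIN (SELECT item_id, COUNT(*) AS n FROM item_a GROUP BY item_id) a
  ON a.item_id = i.id
LEFT JOIN (SELECT item_id, COUNT(*) AS n FROM item_b GROUP BY item_id) b
  ON b.item_id = i.id
""").fetchall()

print(naive)           # [('item2id', 3, 3)]
print(pre_aggregated)  # [('item2id', 1, 3)]
```

In PostgreSQL the same shape should apply with array_agg(row_to_json(...)) inside each subquery instead of COUNT(*), which also explains why each aggregate works perfectly when run alone.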

select from multiple tables, when one is never used

I improved my question with example tables for a better understanding
I have 3 tables with following rows:
TABLE1 t1
ID | NAME | OBS
1 | Name1 | Obs1
2 | Name2 | Obs2
3 | Name3 | Obs3
4 | Name4 | Obs4
5 | Name5 | Obs5
6 | Name6 | Obs6
7 | Name7 | Obs7
TABLE2 t2
ID | HW_VER
1 | HWVer1
2 | HWVer2
3 | HWVer3
TABLE3 t3
ID | SERIAL
5 | Serial5
6 | Serial6
7 | Serial7
Now, I want to select the id, name and obs when 2 conditions are fulfilled:
the id is present in t2 or t3 (never in both);
it refers to either t2 or t3 attributes (eg. t2.HW_VER='HWVER1'), never on both
I did something like this, but it's wrong:
SELECT DISTINCT t1.id, t1.name, t1.obs
FROM table1 t1, table2 t2, table3 t3
WHERE t1.id IN (t2.id, t3.id) AND t3.serial='Serial6';
I cannot use unions, external tables or views for this.
Please let me know in case of further questions.
Thanks a lot for your answers, I really appreciate your time..
You need to select from T2 OR T3 but never both? I think you want something like this
select count(*)
from t1
where exists (
select 'x'
from t2
where MyPrimaryKey_Name = 'random_name'
and t2.id = t1.id
)
or exists (
select 'x'
from t3
where MyPrimaryKey_Name = 'random_name'
and t3.id = t1.id
)
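To make the EXISTS ... OR EXISTS pattern concrete, here is a small sketch on the sample tables from the question, using Python's sqlite3 module (SQLite is an assumption for the demo only). The helper find and its hw_ver / serial parameters are hypothetical names; passing None for the filter you are not using makes that EXISTS branch match nothing, so only one of the two tables decides the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT, obs TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, hw_ver TEXT);
CREATE TABLE table3 (id INTEGER PRIMARY KEY, serial TEXT);
INSERT INTO table1 VALUES
  (1,'Name1','Obs1'), (2,'Name2','Obs2'), (3,'Name3','Obs3'), (4,'Name4','Obs4'),
  (5,'Name5','Obs5'), (6,'Name6','Obs6'), (7,'Name7','Obs7');
INSERT INTO table2 VALUES (1,'HWVer1'), (2,'HWVer2'), (3,'HWVer3');
INSERT INTO table3 VALUES (5,'Serial5'), (6,'Serial6'), (7,'Serial7');
""")

def find(hw_ver=None, serial=None):
    """Return table1 rows whose id matches the table2 or the table3 filter."""
    return conn.execute("""
    SELECT t1.id, t1.name, t1.obs
    FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2
                  WHERE t2.id = t1.id AND t2.hw_ver = :hw_ver)
       OR EXISTS (SELECT 1 FROM table3 t3
                  WHERE t3.id = t1.id AND t3.serial = :serial)
    ORDER BY t1.id
    """, {"hw_ver": hw_ver, "serial": serial}).fetchall()

by_serial = find(serial="Serial6")
by_hw_ver = find(hw_ver="HWVer1")
print(by_serial)  # [(6, 'Name6', 'Obs6')]
print(by_hw_ver)  # [(1, 'Name1', 'Obs1')]
```

Unlike the comma join in the question, this never combines table2 and table3 rows with each other, so no DISTINCT is needed and no union or view is involved.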