SQL: Join/union of two tables, on multiple conditions

Provided the following structure and data:
CREATE TABLE "CHANGES" (
"ID" NUMBER(38),
"LAST_UPD_DATE" DATE DEFAULT SYSDATE
);
CREATE TABLE "EXPORT_LOG" (
"ID" NUMBER(38),
"LAST_EXPORT" DATE DEFAULT SYSDATE
);
Table CHANGES contains:
----------------------------
| ID | LAST_UPD_DATE |
----------------------------
| 123 | 12-MAY-16 12.23.23 |
| 124 | 12-MAY-16 12.24.23 |
| 125 | 12-MAY-16 12.11.23 |
----------------------------
and EXPORT_LOG
----------------------------
| ID | LAST_EXPORT |
----------------------------
| 124 | 12-MAY-16 12.23.12 |
| 125 | 12-MAY-16 12.12.24 |
----------------------------
I need the records in CHANGES that either don't exist in EXPORT_LOG or, if they do exist, have a LAST_UPD_DATE later than LAST_EXPORT.
So in the above example, I should get 123 and 124.
I've tried different JOINs but cannot seem to get the result I want:
INNER JOIN is used for intersections, and LEFT JOIN returns ALL rows of the first table but only those rows of the second table that match the join condition. Neither of these is what I want. So is the solution some sort of UNION?

Try this:
SELECT t1.*
FROM CHANGES t1
LEFT JOIN EXPORT_LOG t2 ON t1.ID = t2.ID
WHERE (t2.ID IS NULL) OR (t1.LAST_UPD_DATE > t2.LAST_EXPORT)
This will return all records of CHANGES table that don't have a match in EXPORT_LOG table plus the records of CHANGES table that have a LAST_UPD_DATE that is later than LAST_EXPORT.
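To check the LEFT JOIN answer against the sample data, here is a minimal sketch that runs the same query in an in-memory SQLite database from Python. SQLite stands in for Oracle here, and storing the dates as ISO-formatted text is an assumption made for this illustration, not part of the original schema.

```python
import sqlite3

# In-memory SQLite stand-in for the Oracle tables. Dates are stored as
# ISO strings (an assumption), which compare correctly as plain text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE changes (id INTEGER, last_upd_date TEXT);
CREATE TABLE export_log (id INTEGER, last_export TEXT);
INSERT INTO changes VALUES
  (123, '2016-05-12 12:23:23'),
  (124, '2016-05-12 12:24:23'),
  (125, '2016-05-12 12:11:23');
INSERT INTO export_log VALUES
  (124, '2016-05-12 12:23:12'),
  (125, '2016-05-12 12:12:24');
""")

# Unmatched rows have t2.id NULL; matched rows are kept only if they
# were updated after the last export.
left_join_ids = [row[0] for row in conn.execute("""
SELECT t1.id
FROM changes t1
LEFT JOIN export_log t2 ON t1.id = t2.id
WHERE t2.id IS NULL OR t1.last_upd_date > t2.last_export
ORDER BY t1.id
""")]
print(left_join_ids)  # [123, 124]
```

123 is returned because it has no EXPORT_LOG row, 124 because its update is later than its export, and 125 is dropped because its export is later than its update.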

One method is to translate the conditions directly using exists:
select c.*
from changes c
where not exists (select 1 from export_log el where c.id = el.id) or
not exists (select 1 from export_log el where c.id = el.id and el.last_export >= c.last_upd_date);
This can be simplified, because the second condition is also true when no matching EXPORT_LOG row exists at all:
select c.*
from changes c
where not exists (select 1 from export_log el where c.id = el.id and el.last_export >= c.last_upd_date);

Try the query below, which uses Oracle's legacy (+) outer-join operator. It should meet your requirement.
SELECT CH.ID, CH.LAST_UPD_DATE
FROM CHANGES CH, EXPORT_LOG EL
WHERE CH.ID = EL.ID(+)
AND ((EL.ID IS NULL) OR (CH.LAST_UPD_DATE > EL.LAST_EXPORT));

Related

Update a table based on a condition

I have a column Table1.Tradedate and another column Table2.SettlementDate.
Based on the comparison between these 2, I want to update a column in table 2
IF (Table1.TradeDate <= Table2.SettlementDate)
BEGIN
UPDATE Table2 SET Status='Y'
END
This is what I have tried, but I know it's wrong, since the tables will obviously contain more than one record. So I believe what I should do is:
1. use a join on the 2 tables based on some #id to pick a particular record
2. check the IF condition for that particular record
3. update the Status column in table2.
I hope my approach is correct and I am just writing it incorrectly.
Table1:
SKacc | Name | TradeDate | Othercolumns....
1 | xxx | 01/07/2019 |
2 | xxx | 01/06/2019 |
Table2:
SKAcc | Name | SettlementDate | Status |Other Columns....
1 | xxx | 01/08/2019 | NULL |
2 | xxx | 01/08/2019 | NULL |

Try below
update t2 set Status = 'Y'
from table2 t2
join table1 t1 on t1.id = t2.id
where t1.tradeDate <= t2.settlementDate

Try joining the two tables on the related column, and then update the target table with the value. The example uses an inner join, but that can change depending on the use case.
UPDATE Table2
SET Status = 'Y'
FROM Table2
INNER JOIN Table1 ON Table1.id = Table2.table1_id
WHERE Table1.TradeDate <= Table2.SettlementDate

I would not recommend a JOIN for this purpose. Instead:
update table2
set Status = 'Y'
where exists (select 1
from table1 t1
where t1.id = table2.id and
t1.tradeDate <= table2.settlementDate
);
I recommend this version because you have not specified that id is unique in table1. In general, you only want to use JOIN in UPDATE when you can guarantee that there is only one matching row.
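The point about non-unique ids can be illustrated with a small sketch, again using an in-memory SQLite database from Python. The SKacc join column and the ISO date strings are assumptions based on the question's sample tables; the duplicate row for account 1 and the whole of account 3 are hypothetical additions for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (skacc INTEGER, tradedate TEXT);
CREATE TABLE table2 (skacc INTEGER, settlementdate TEXT, status TEXT);
-- skacc 1 appears twice in table1 (hypothetical duplicate);
-- skacc 3 is a hypothetical account traded after its settlement date.
INSERT INTO table1 VALUES
  (1, '2019-01-07'), (1, '2019-01-05'), (2, '2019-01-06'), (3, '2019-01-09');
INSERT INTO table2 VALUES
  (1, '2019-01-08', NULL), (2, '2019-01-08', NULL), (3, '2019-01-08', NULL);
""")

# EXISTS only asks whether at least one qualifying table1 row exists,
# so duplicates in table1 cannot affect how table2 rows are updated.
cur = conn.execute("""
UPDATE table2
SET status = 'Y'
WHERE EXISTS (SELECT 1
              FROM table1 t1
              WHERE t1.skacc = table2.skacc
                AND t1.tradedate <= table2.settlementdate)
""")
updated_rows = cur.rowcount
statuses = dict(conn.execute("SELECT skacc, status FROM table2"))
print(updated_rows, statuses)  # 2 {1: 'Y', 2: 'Y', 3: None}
```

Each table2 row is touched at most once, no matter how many table1 rows match it.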

Retrieve data from the same table if subId is an id of other item

I have a table of records, and I want to keep only the records whose SUBID matches the ID of an existing row. If no row with that ID exists, the record should be left out of the result. The result should contain no duplicates, and rows with SUBID 0 should not be checked, because they are parents, not children.
----------------------------
ID | SUBID | NAME | ENABLED |
30 | 0 | EXP1 | TRUE |
55 | 30 | EXP2 | TRUE |
70 | 30 | EXP3 | FALSE |
99 | 42 | EXP4 | FALSE |
232| 0 | EXP5 | TRUE |
65 | 232 | EXP6 | TRUE |
-----------------------------
Expected result:
----------------------------
ID | SUBID | NAME | ENABLED |
30 | 0 | EXP1 | TRUE |
55 | 30 | EXP2 | TRUE |
70 | 30 | EXP3 | FALSE |
232| 0 | EXP5 | TRUE |
65 | 232 | EXP6 | TRUE |
-----------------------------
If someone could help me write this SQL statement in a good way, I would be grateful.

You can use 'Exists':
SELECT T1.* FROM TEST T1
WHERE EXISTS (SELECT T2.ID FROM TEST T2 WHERE T2.ID = T1.SUBID)
OR EXISTS (SELECT T3.SUBID FROM TEST T3 WHERE T3.SUBID = T1.ID)

How about a union
select a.*
from have a
inner join have b
on a.subid=b.id
union
select b.*
from have a
inner join have b
on a.subid=b.id;

This can be actually pretty complicated, as you've evidently found out. I'd suggest a CTE and a UNION with your JOINs and aliases. It also looks like it'll need all that in a subquery to do a DISTINCT, too.
Without testing this, I'd imagine it looks something like this:
WITH MAIN AS (
SELECT ID, SUBID, NAME
FROM TABLE t
WHERE ENABLED = TRUE
)
SELECT DISTINCT ID, NAME
FROM (
SELECT ID, NAME
FROM MAIN
UNION
SELECT t.ID, t.NAME
FROM MAIN
LEFT JOIN TABLE t on MAIN.SUBID = t.ID
WHERE MAIN.SUBID <> 0
)
The outer select might not be needed if you do a distinct on each of the inner queries, but without testing it, I can't say for sure. I'd guess it would only DISTINCT the two lists separately, which isn't your intended result.
I'm kind of hoping someone else can come up with a less complicated version. I'd also suggest you do some more research on CTEs, UNIONs, aliases, and see if you can make this simpler on your own. But this should get you in the right direction.
BTW, I used a CTE (WITH MAIN AS) so that the query wouldn't be duplicated.

Try this script:
SELECT YT.*
FROM your_table YT
INNER JOIN (
SELECT DISTINCT(B.ID)
FROM your_table A
LEFT JOIN your_table B
ON A.SUBID = B.ID
WHERE B.ID IS NOT NULL
) C ON YT.ID = C.ID OR YT.SUBID = C.ID

By my understanding of what you are trying to do, you simply want:
SELECT * FROM myTable t1
WHERE SubID = 0
OR EXISTS (SELECT NULL FROM myTable t2 WHERE t2.id = t1.SubID)
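As a quick check of this last query against the question's sample rows, the sketch below runs it in an in-memory SQLite database from Python. The table name mytable and the text-typed ENABLED column are placeholders chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (id INTEGER, subid INTEGER, name TEXT, enabled TEXT);
INSERT INTO mytable VALUES
  (30, 0, 'EXP1', 'TRUE'), (55, 30, 'EXP2', 'TRUE'), (70, 30, 'EXP3', 'FALSE'),
  (99, 42, 'EXP4', 'FALSE'), (232, 0, 'EXP5', 'TRUE'), (65, 232, 'EXP6', 'TRUE');
""")

# Keep parents (subid = 0) and children whose parent row actually exists.
kept_ids = [row[0] for row in conn.execute("""
SELECT t1.id
FROM mytable t1
WHERE t1.subid = 0
   OR EXISTS (SELECT NULL FROM mytable t2 WHERE t2.id = t1.subid)
ORDER BY t1.id
""")]
print(kept_ids)  # [30, 55, 65, 70, 232]
```

Row 99 is the only one dropped, because no row with ID 42 exists. Note that, unlike the first answer, this version also keeps a parent that has no children.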

Type of join based on condition postgres

I have 2 tables where, based on the type of an item in table A, I would like either to require the item's existence in another table or not to require it (in order to return its id).
I wrote the following, but I am getting an SQL error.
How can I achieve this behaviour?
SELECT
item.id, delivery
FROM
item
(CASE
WHEN item.id not in (select ad_object_id from delivery) THEN
LEFT JOIN
ad_object_delivery
ON
item.id = ad_object_id
ELSE
JOIN
ad_object_delivery
ON
item.id = ad_object_id
END
)
Example data:
item
id | name | type
1 | John | socks
2 | Daniel | pants
3 | Barak | shirt
delivery
id | item_id | delivery
1 | 1 | UK
1 | 1 | US
definition
id | item_id | definition
1 | 1 | UK
1 | 2 | IL
As a result, I would like to get only the John and Barak records, because Daniel appears only in definition but not in delivery. Barak appears in neither, so that's OK.

I think you want something like this:
SELECT i.id, COALESCE(d.delivery, aod2.delivery)
FROM item i LEFT JOIN
delivery d
ON i.id = d.ad_object_id LEFT JOIN
ad_object_delivery aod2
ON i.id = aod2.ad_object_id AND d.ad_object_id IS NULL
WHERE d.ad_object_id IS NOT NULL OR aod2.ad_object_id IS NOT NULL;
This matches to ad_object_delivery only when delivery does not exist. Note that you need to adjust the SELECT clause to select columns from the two tables.

If I understand correctly, you need a JOIN with UNION ALL:
select i.id, i.name, dl.delivery
from item i inner join
delivery dl
on dl.item_id = i.id inner join
definition df
on df.item_id = i.id
union all
select i.id, i.name, null
from item i
where not exists (select 1 from delivery dl where dl.item_id = i.id) and
not exists (select 1 from definition df where df.item_id = i.id);

A confusing requirement, but I think it's this:
select i.*
from item i
left join (select distinct item_id from delivery) del on del.item_id=i.id
left join (select distinct item_id from definition) def on def.item_id=i.id
where (del.item_id is null and def.item_id is null)
or (del.item_id is not null and def.item_id is not null)
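The "in both tables or in neither" reading can be tried out on the question's sample data. This is a minimal sketch using an in-memory SQLite database from Python rather than Postgres, and it assumes the tables are named item, delivery and definition as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER, name TEXT, type TEXT);
CREATE TABLE delivery (id INTEGER, item_id INTEGER, delivery TEXT);
CREATE TABLE definition (id INTEGER, item_id INTEGER, definition TEXT);
INSERT INTO item VALUES (1, 'John', 'socks'), (2, 'Daniel', 'pants'), (3, 'Barak', 'shirt');
INSERT INTO delivery VALUES (1, 1, 'UK'), (1, 1, 'US');
INSERT INTO definition VALUES (1, 1, 'UK'), (1, 2, 'IL');
""")

# DISTINCT in the derived tables keeps one row per item even though
# item 1 has two delivery rows.
names = [row[0] for row in conn.execute("""
SELECT i.name
FROM item i
LEFT JOIN (SELECT DISTINCT item_id FROM delivery) del ON del.item_id = i.id
LEFT JOIN (SELECT DISTINCT item_id FROM definition) def ON def.item_id = i.id
WHERE (del.item_id IS NULL AND def.item_id IS NULL)
   OR (del.item_id IS NOT NULL AND def.item_id IS NOT NULL)
ORDER BY i.id
""")]
print(names)  # ['John', 'Barak']
```

John is present in both tables and Barak in neither, so both are returned; Daniel is present in only one of the two and is filtered out.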

Hive / SQL - Left join with fallback

In Apache Hive I have two tables that I would like to left-join, keeping all the data from the left table and adding data from the right table where possible.
The join uses two conditions, because it is based on two fields (a material_id and a location_id).
This works fine with a traditional left join:
SELECT
a.*,
b.*
FROM a
LEFT JOIN (some more complex select) b
ON a.material_id=b.material_id
AND a.location_id=b.location_id;
For the location_id the database only contains two distinct values, say 1 and 2.
We now have the requirement that, if there is no "perfect match", the join should fall back on location_id. That is, when the material_id can be joined but table b has no row with the right combination of material_id and location_id (e.g. material_id=100 and location_id=1), the join should "default" to the other possible value of location_id (e.g. material_id=100 and location_id=2), and vice versa. This fallback should apply only to location_id.
We have already looked into many possible approaches, including CASE, but to no avail. We tried a setup like
...
ON a.material_id=b.material_id AND a.location_id=
CASE WHEN a.location_id = b.location_id THEN b.location_id ELSE ...;
but could not figure out how to make it work in Hive Query Language.
Thank you for your help! Maybe somebody has a smart idea.
Here is some sample data:
Table a
| material_id | location_id | other_column_a |
| 100 | 1 | 45 |
| 101 | 1 | 45 |
| 103 | 1 | 45 |
| 103 | 2 | 45 |
Table b
| material_id | location_id | other_column_b |
| 100 | 1 | 66 |
| 102 | 1 | 76 |
| 103 | 2 | 88 |
Left-join table (desired result)
| material_id | location_id | other_column_a | other_column_b |
| 100 | 1 | 45 | 66 |
| 101 | 1 | 45 | NULL (mat. not in b) |
| 103 | 1 | 45 | DEFAULT TO where location_id=2 (88) |
| 103 | 2 | 45 | 88 |
PS: As stated here, EXISTS etc. does not work in a subquery inside the ON clause.

The solution is to left join on material_id only (without a.location_id = b.location_id), number the joined rows in order of preference, and then filter by row number. In the code below, the join first duplicates rows, because every b row with a matching material_id is joined. The row_number() function then assigns 1 to the row where a.location_id = b.location_id. If no such row exists, it assigns 1 to a row where a.location_id <> b.location_id instead. b.location_id is added to the ORDER BY of row_number() so that, when there is no exact match, the row with the lowest b.location_id is preferred.
select * from
(
SELECT
a.*,
b.*,
row_number() over(partition by a.material_id, a.location_id
order by CASE WHEN a.location_id = b.location_id THEN 1 ELSE 2 END, b.location_id ) as rn
FROM a
LEFT JOIN (some more complex select) b
ON a.material_id=b.material_id
)s
where rn=1
;
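The row_number() fallback can be tried on the question's sample data. The sketch below is an illustration only: it runs in an in-memory SQLite database from Python instead of Hive, and it assumes an SQLite build with window-function support (3.25 or later). It partitions by both a.material_id and a.location_id so that every row of a is numbered on its own, and it selects explicit columns instead of a.*, b.* to avoid duplicate column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (material_id INTEGER, location_id INTEGER, other_column_a INTEGER);
CREATE TABLE b (material_id INTEGER, location_id INTEGER, other_column_b INTEGER);
INSERT INTO a VALUES (100, 1, 45), (101, 1, 45), (103, 1, 45), (103, 2, 45);
INSERT INTO b VALUES (100, 1, 66), (102, 1, 76), (103, 2, 88);
""")

# Join on material_id only, rank exact location matches first, then keep
# the best-ranked b row for every row of a.
fallback_rows = conn.execute("""
SELECT material_id, location_id, other_column_a, other_column_b
FROM (
  SELECT a.material_id, a.location_id, a.other_column_a, b.other_column_b,
         ROW_NUMBER() OVER (
           PARTITION BY a.material_id, a.location_id
           ORDER BY CASE WHEN a.location_id = b.location_id THEN 1 ELSE 2 END,
                    b.location_id
         ) AS rn
  FROM a
  LEFT JOIN b ON a.material_id = b.material_id
) s
WHERE rn = 1
ORDER BY material_id, location_id
""").fetchall()
for row in fallback_rows:
    print(row)
```

On this data the row (103, 1) has no exact match in b and picks up 88 from location 2, while (101, 1) keeps NULL because material 101 does not occur in b at all.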

Maybe this is helpful for somebody in the future:
We also came up with a different approach.
First, we create another table that holds averages from table b per material_id over all (!) locations.
Second, in the join table we create three columns:
c1 - the value where material_id and location_id both match (the result of a left join of table a with table b). This column is NULL if there is no perfect match.
c2 - the value from the averages (fallback) table for this material_id, regardless of the location.
c3 - the "actual value" column, where a CASE statement uses c2 (the average over the locations for the material) when c1 is NULL (there is no perfect match of material and location), and c1 otherwise. This column is used for the further calculations.
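The three columns described above might be built roughly as follows. This is a sketch under several assumptions: it runs in an in-memory SQLite database from Python rather than Hive, it computes the averages in a derived table (aliased f here) instead of a separate physical table, and it reuses the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (material_id INTEGER, location_id INTEGER, other_column_a INTEGER);
CREATE TABLE b (material_id INTEGER, location_id INTEGER, other_column_b INTEGER);
INSERT INTO a VALUES (100, 1, 45), (101, 1, 45), (103, 1, 45), (103, 2, 45);
INSERT INTO b VALUES (100, 1, 66), (102, 1, 76), (103, 2, 88);
""")

# c1: exact match on material and location; c2: per-material average;
# c3: c1 when present, otherwise the fallback c2.
avg_rows = conn.execute("""
SELECT a.material_id, a.location_id,
       b.other_column_b AS c1,
       f.avg_b AS c2,
       CASE WHEN b.other_column_b IS NULL THEN f.avg_b
            ELSE b.other_column_b END AS c3
FROM a
LEFT JOIN b
  ON a.material_id = b.material_id AND a.location_id = b.location_id
LEFT JOIN (SELECT material_id, AVG(other_column_b) AS avg_b
           FROM b GROUP BY material_id) f
  ON a.material_id = f.material_id
ORDER BY a.material_id, a.location_id
""").fetchall()
c3_values = [row[4] for row in avg_rows]
print(c3_values)
```

With only two locations per material, the average fallback and the "other location" fallback coincide whenever exactly one b row exists for the material, as is the case for material 103 here.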

How to get a single result with columns from multiple records in a single table?

Platform: Oracle 10g
I have a table (let's call it t1) like this:
ID | FK_ID | SOME_VALUE | SOME_DATE
----+-------+------------+-----------
1 | 101 | 10 | 1-JAN-2013
2 | 101 | 20 | 1-JAN-2014
3 | 101 | 30 | 1-JAN-2015
4 | 102 | 150 | 1-JAN-2013
5 | 102 | 250 | 1-JAN-2014
6 | 102 | 350 | 1-JAN-2015
For each FK_ID I wish to show a single result showing the two most recent SOME_VALUEs. That is:
FK_ID | CURRENT | PREVIOUS
------+---------+---------
101 | 30 | 20
102 | 350 | 250
There is another table (let's call it t2) for the FK_ID, and it is here that there is a reference saying which is the 'CURRENT' record. So a table like:
ID | FK_CURRENT | OTHER_FIELDS
----+------------+-------------
101 | 3 | ...
102 | 6 | ...
I was attempting this with a flawed sub query join along the lines of:
SELECT id, curr.some_value as current, prev.some_value as previous FROM t2
JOIN t1 curr ON t2.fk_current = t1.id
JOIN t1 prev ON t1.id = (
SELECT * FROM (
SELECT id FROM (
SELECT id, ROW_NUMBER() OVER (ORDER BY SOME_DATE DESC) as rno FROM t1
WHERE t1.fk_id = t2.id
) WHERE rno = 2
)
)
However, the t1.fk_id = t2.id is flawed (i.e. won't run), as (I now know) you can't pass a parent field value into a sub query more than one level deep.
Then I started wondering if Common Table Expressions (CTE) are the tool for this, but I've no experience using them (so I would like to know I'm not going down the wrong track attempting to use them).
So I guess the key complexity that is tripping me up is:
Determining the previous value by ordering, but while limiting it to the first record (and not the whole table). (Hence the somewhat convoluted sub query attempt.)
Otherwise, I can just write some code to first execute a query to get the 'current' value, and then execute a second query to get the 'previous' - but I'd love to know how to solve this with a single SQL query, as it seems this would be a common enough thing to do (it sure is with the DB I need to work with).
Thanks!

Try an approach with the LAG function:
SELECT FK_ID ,
SOME_VALUE as "CURRENT",
PREV_VALUE as Previous
FROM (
SELECT t1.*,
lag( some_value ) over (partition by fk_id order by some_date ) prev_value
FROM t1
) x
JOIN t2 on t2.id = x.fk_id
and t2.fk_current = x.id
Demo: http://sqlfiddle.com/#!4/d3e640/15
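For readers without access to the fiddle, the LAG approach can be reproduced locally. The sketch below is an illustration, not the original demo: it uses an in-memory SQLite database from Python (assuming window-function support, SQLite 3.25 or later), ISO date strings, and the hypothetical column aliases current_value and previous_value to sidestep reserved words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER, fk_id INTEGER, some_value INTEGER, some_date TEXT);
CREATE TABLE t2 (id INTEGER, fk_current INTEGER);
INSERT INTO t1 VALUES
  (1, 101, 10, '2013-01-01'), (2, 101, 20, '2014-01-01'), (3, 101, 30, '2015-01-01'),
  (4, 102, 150, '2013-01-01'), (5, 102, 250, '2014-01-01'), (6, 102, 350, '2015-01-01');
INSERT INTO t2 VALUES (101, 3), (102, 6);
""")

# LAG attaches the value of the preceding row (by date, within fk_id) to
# every row; joining to t2 then keeps only the row marked as current.
pairs = conn.execute("""
SELECT x.fk_id, x.some_value AS current_value, x.prev_value AS previous_value
FROM (
  SELECT t1.*,
         LAG(some_value) OVER (PARTITION BY fk_id ORDER BY some_date) AS prev_value
  FROM t1
) x
JOIN t2 ON t2.id = x.fk_id AND t2.fk_current = x.id
ORDER BY x.fk_id
""").fetchall()
print(pairs)  # [(101, 30, 20), (102, 350, 250)]
```

Because the window function is evaluated before the join filters the rows, the previous value is still available on the current row.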

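Try this (it assumes that the highest ID within each FK_ID is the current record and that the IDs are consecutive, so the previous record has the current ID minus 1):
select t1.FK_ID, t1.SOME_VALUE as "CURRENT",
(select prev.SOME_VALUE from t1 prev where prev.id=p1.id2 and prev.fk_id=p1.fk_id) as PREVIOUS
from t1 inner join
(
select fk_id, max(id) as id1, max(id)-1 as id2 from t1 group by fk_id
) p1 on t1.id=p1.id1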
