Join table based on date - sql

I have two tables:
Table A
ID | name
---+----------
1 | example
2 | example2
Table B (created field is timestamptz)
ID | id_table_a | dek | created
---+------------+------+---------------------
1 | 1 | deka | 2019-10-21 10:00:00
2 | 2 | dekb | 2019-10-21 11:00:00
3 | 1 | dekc | 2019-10-21 09:00:00
4 | 2 | dekd | 2019-10-21 09:40:00
5 | 1 | deke | 2019-10-21 09:21:00
I need to get the records from Table A, and each record should carry the latest dek from Table B, based on created.
How can I do that?

I would use a lateral join; very often this is faster than using a select max():
select a.*, tb.dek
from table_a a
join lateral (
select id, id_table_a, dek, created
from table_b b
where b.id_table_a = a.id
order by created desc
limit 1
) tb on true;
Another alternative is to use distinct on:
select a.*, tb.dek
from table_a a
join (
select distinct on (id_table_a) id, id_table_a, dek, created
from table_b
order by id_table_a, created desc
) tb on tb.id_table_a = a.id;
It depends on your data distribution which one is faster.

With a CTE returning the joined tables and NOT EXISTS:
with cte as (
select a.id, a.name, b.dek, b.created
from table_a a inner join table_b b
on b.id_table_a = a.id
)
select t.* from cte t
where not exists (
select 1 from cte
where id = t.id and created > t.created
)
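To try the NOT EXISTS approach without a Postgres instance, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in chosen for convenience (an assumption, not the asker's setup), and the timestamps are stored as ISO-formatted text, which happens to sort correctly.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table table_a (id integer primary key, name text);
create table table_b (
    id integer primary key,
    id_table_a integer,
    dek text,
    created text  -- ISO text sorts chronologically
);
insert into table_a values (1, 'example'), (2, 'example2');
insert into table_b values
    (1, 1, 'deka', '2019-10-21 10:00:00'),
    (2, 2, 'dekb', '2019-10-21 11:00:00'),
    (3, 1, 'dekc', '2019-10-21 09:00:00'),
    (4, 2, 'dekd', '2019-10-21 09:40:00'),
    (5, 1, 'deke', '2019-10-21 09:21:00');
""")

# Keep a joined row only if no later row exists for the same table_a id.
latest_dek = conn.execute("""
with cte as (
    select a.id, a.name, b.dek, b.created
    from table_a a inner join table_b b on b.id_table_a = a.id
)
select t.id, t.name, t.dek
from cte t
where not exists (
    select 1 from cte where id = t.id and created > t.created
)
order by t.id
""").fetchall()

print(latest_dek)
```

If two rows for the same id share the latest created value, this query returns both of them, whereas the limit 1 and distinct on variants return only one.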

Related

select single row from foreign table in left join

I want to fetch only the first row from the foreign table where the foreign key matches, but I don't know how to select just that first row.
events table
id | name
----------------
1 | john
----------------
2 | Cat
event_attendee table
id | event_id | type
--------------------------
1 | 1 | User
--------------------------
2 | 1 | Local
--------------------------
3 | 1 | User
--------------------------
4 | 2 | User
--------------------------
5 | 2 | User
I want this result
id | name | event_id | type
------------------------------------
1 | John | 1 | User
------------------------------------
2 | Cat | 2 | User
Tried
select
a.*,
b.*
from
events as a
left join (
select
distinct
event_attendee.event_id,
event_attendee.type
from
event_attendee
left join events on
event_attendee.event_id = events.id
where
events.id = event_attendee.event_id
limit 1
) as b on
a.id = b.event_id
Problem
It only works for the first row; for the second row, type comes back empty:
id | name | type
------------------------------------
1 | John | User
------------------------------------
2 | Cat |
You can do this using a lateral join. In Postgres, the syntax is:
select e.*, ea.*
from events e left join lateral
(select ea.event_id, ea.type
from event_attendee ea
where ea.event_id = e.id
order by ea.id
limit 1
) ea
on 1=1;
However, distinct on is a way to do this with no subqueries:
select distinct on (e.id) e.*, ea.*
from events e join
event_attendee ea
on ea.event_id = e.id
order by e.id, ea.id;
I would expect the lateral join to work better on larger tables, particularly with the correct indexes.
In SQL Server, where top and cross apply are available, this is easy with a cross apply:
select *
from events e
cross apply (
select top (1) event_Id, Type
from event_attendee ea
where ea.event_id=e.id
order by id
)x
Edit, alternative compatible method!
select e.*,ea.event_Id, (select type from event_attendee ea2 where ea2.id=ea.id ) Type
from (
select Min(id) Id, event_id
from event_attendee
group by event_id
)ea
join events e on e.id=ea.event_id
One way is to rank the attendees within each event and keep only the first-ranked row:
select
t_.id, t_.name, t_.type
from
(
select a.*, b.type,
rank() OVER (PARTITION BY a.id ORDER BY b.id asc) rank_
from events a
left join event_attendee b
on
a.id = b.event_id
) t_
where
t_.rank_ = 1
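The ranking approach can be checked against the sample data with a short sketch. It runs the query in an in-memory SQLite database through Python's sqlite3 module; that choice is an assumption made for portability (window functions need SQLite 3.25 or newer), and the same SQL should carry over to Postgres.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table events (id integer primary key, name text);
create table event_attendee (
    id integer primary key,
    event_id integer,
    type text
);
insert into events values (1, 'John'), (2, 'Cat');
insert into event_attendee values
    (1, 1, 'User'),
    (2, 1, 'Local'),
    (3, 1, 'User'),
    (4, 2, 'User'),
    (5, 2, 'User');
""")

# Rank attendees per event by attendee id and keep the first one.
first_attendee = conn.execute("""
select t_.id, t_.name, t_.type
from (
    select a.id, a.name, b.type,
           rank() over (partition by a.id order by b.id asc) as rank_
    from events a
    left join event_attendee b on a.id = b.event_id
) t_
where t_.rank_ = 1
order by t_.id
""").fetchall()

print(first_attendee)
```

Because event_attendee.id is unique, rank() and row_number() give the same result here; with a non-unique ordering column, row_number() is the safer choice for getting exactly one row per event.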

SQL Performance Inner Join

Let me ask you something I've been thinking about for a while. Imagine that you have two tables with data:
MAIN TABLE (A)
| ID | Date |
|:-----------|------------:|
| 1 | 01-01-1990|
| 2 | 01-01-1991|
| 3 | 01-01-1992|
| 4 | 01-01-2000|
| 5 | 01-01-2001|
| 6 | 01-01-2003|
SECONDARY TABLE (B)
| ID | Date | TOTAL |
|:-----------|------------:|--------:|
| 1 | 01-01-1990| 1 |
| 2 | 01-01-1991| 2 |
| 3 | 01-01-1992| 1 |
| 4 | 01-01-2000| 5 |
| 5 | 01-01-2001| 8 |
| 6 | 01-01-2003| 7 |
and you want to select only ID with date greater than 31-12-1999 and get the following columns: ID, Date and Total. For that we have many options but my question would be which of the following would be better in terms of performance:
OPTION 1
With main as(
select id,
date
from A
where date > '31-12-1999'
)
select main.id,
main.date,
B.total
from main inner join B on main.id = b.id
OPTION 2
With main as(
select id,
date
from A
where date > '31-12-1999'
),
secondary as (
select id,
total
from B
where date > '31-12-1999'
)
select main.id,
main.date,
secondary.total
from main inner join secondary on main.id = secondary.id
Which of the two queries would be better in terms of performance? Thanks in advance!
Note: the date column means the same thing in both tables.
You don't need a CTE; you can join the two tables directly:
select A.id,
A.date,
B.total
from A inner join B on A.id = b.id
where A.date > '31-12-1999'
You would need to test on your data. But there is really no need for CTEs:
select a.id, a.date, b.total
from a inner join
b
on a.id = b.id
where a.date > '1999-12-31' and b.date > '1999-12-31';
As for your specific question, the two queries are not the same, because the first is filtering on only one date and the second is filtering on two dates. You should run the query that implements the logic that you intend.
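The point that the two queries are not equivalent can be illustrated with a small sketch. It uses Python's sqlite3 module with an in-memory database (an assumption made so the example is runnable anywhere) and ISO date strings. The update statement is a hypothetical change, not part of the question's data, introduced only to make the dates in the two tables disagree.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table a (id integer primary key, date text);
create table b (id integer primary key, date text, total integer);
insert into a values
    (1, '1990-01-01'), (2, '1991-01-01'), (3, '1992-01-01'),
    (4, '2000-01-01'), (5, '2001-01-01'), (6, '2003-01-01');
insert into b values
    (1, '1990-01-01', 1), (2, '1991-01-01', 2), (3, '1992-01-01', 1),
    (4, '2000-01-01', 5), (5, '2001-01-01', 8), (6, '2003-01-01', 7);
""")

# Filter on a.date only (option 1 without the CTE).
ONE_FILTER = """
select a.id, a.date, b.total
from a inner join b on a.id = b.id
where a.date > '1999-12-31'
order by a.id
"""

# Filter on both a.date and b.date (option 2 without the CTEs).
TWO_FILTERS = """
select a.id, a.date, b.total
from a inner join b on a.id = b.id
where a.date > '1999-12-31' and b.date > '1999-12-31'
order by a.id
"""

# With the question's data, both queries return the same rows.
same_before = conn.execute(ONE_FILTER).fetchall() == conn.execute(TWO_FILTERS).fetchall()

# Hypothetical change: the date in b for id 6 no longer matches a.
conn.execute("update b set date = '1995-01-01' where id = 6")
one_filter_rows = conn.execute(ONE_FILTER).fetchall()
two_filter_rows = conn.execute(TWO_FILTERS).fetchall()

print(same_before, one_filter_rows, two_filter_rows)
```

As long as the dates really are identical in both tables, the extra filter changes nothing in the result, so any performance difference has to be measured on the actual data and engine.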

Exclude first record associated with each parent record in Postgres

There are 2 tables, users and job_experiences.
I want to return a list of all job_experiences except the first associated with each user.
users
id
---
1
2
3
job_experiences
id | start_date | user_id
--------------------------
1 | 201001 | 1
2 | 201201 | 1
3 | 201506 | 1
4 | 200901 | 2
5 | 201005 | 2
Desired result
id | start_date | user_id
--------------------------
2 | 201201 | 1
3 | 201506 | 1
5 | 201005 | 2
Current query
select
*
from job_experiences
order by start_date asc
offset 1
But this doesn't work as it would need to apply the offset to each user individually.
You can do this with a lateral join:
select je.*
from users u cross join lateral
(select je.*
from job_experiences je
where u.id = je.user_id
order by id
offset 1 -- all except the first
) je;
For performance, an index on job_experiences(user_id, id) is recommended.
Use the row_number() window function to number each user's jobs by start_date and skip the first one:
with cte as
(
select e.*,
row_number() over (partition by user_id order by start_date) rn
from users u join job_experiences e on u.id = e.user_id
)
select id, start_date, user_id
from cte
where rn > 1
You could use an intermediate CTE to get the first (MIN) jobs for each user, and then use that to determine which records to exclude:
WITH user_first_je("user_id", "job_id") AS
(
SELECT "user_id", MIN("id")
FROM job_experiences
GROUP BY "user_id"
)
SELECT job_experiences.*
FROM job_experiences
LEFT JOIN user_first_je ON
user_first_je.job_id = job_experiences.id
WHERE user_first_je.job_id IS NULL;
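As a quick check of the MIN-based exclusion, the sketch below loads the sample rows into an in-memory SQLite database through Python's sqlite3 module (an assumption for convenience; the question targets Postgres) and runs the same anti-join. The short alias names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table users (id integer primary key);
create table job_experiences (
    id integer primary key,
    start_date integer,
    user_id integer
);
insert into users values (1), (2), (3);
insert into job_experiences values
    (1, 201001, 1),
    (2, 201201, 1),
    (3, 201506, 1),
    (4, 200901, 2),
    (5, 201005, 2);
""")

# Find each user's first job by lowest id, then keep every job
# that does not match one of those first jobs.
remaining = conn.execute("""
with user_first_je(user_id, job_id) as (
    select user_id, min(id)
    from job_experiences
    group by user_id
)
select je.id, je.start_date, je.user_id
from job_experiences je
left join user_first_je f on f.job_id = je.id
where f.job_id is null
order by je.id
""").fetchall()

print(remaining)
```

This treats the lowest id as the "first" job. If the first job should instead be the one with the earliest start_date, the CTE would need to pick the row by start_date rather than by id.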

SQL Server - Select first row that meets criteria

I have 2 tables that contain IDs. There will be duplicate IDs in one of the tables and I only want to return one row for each matching ID in table B. For example:
Table A
+-----------+-----------+
| objectIdA | objectIdB |
+-----------+-----------+
| 1 | A |
| 1 | B |
| 1 | D |
| 5 | F |
+-----------+-----------+
Table B
+-----------+
| objectIdA |
+-----------+
| 1 |
| 5 |
+-----------+
Would return:
+-----------+-----------+
| objectIdA | objectIdB |
+-----------+-----------+
| 1 | D |
| 5 | F |
+-----------+-----------+
I only need one entry from Table A that matches Table B. It doesn't matter which row of table A is returned.
I'm using SQL Server.
Thanks.
;WITH CTE
AS (
SELECT B.objectIdA
,A.objectIdB
,ROW_NUMBER() OVER (PARTITION BY B.objectIdA ORDER BY A.objectIdB DESC) rn
FROM TableA A
INNER JOIN TableB B ON A.objectIdA = B.objectIdA
)
SELECT C.objectIdA
,C.objectIdB
FROM CTE C
WHERE rn = 1
You can do so by using a subselect on table a to get one entry per objectIdA group:
select b.*,a.[objectIdB]
from b
join
(select [objectIdA], max([objectIdB]) [objectIdB]
from a group by [objectIdA]
) a
on(b.[objectIdA] = a.[objectIdA])
Edit from comments: to get a whole row from table a, you can use a self join on table a:
select b.*,a.*
from b
join a
on(b.[objectIdA] = a.[objectIdA])
join (select [objectIdA], max([objectIdB]) [objectIdB]
from a group by [objectIdA]) a1
on(a.[objectIdA] = a1.[objectIdA]
and
a.[objectIdB] = a1.[objectIdB])
SELECT MAX(b.ID) AS ID
,MAX(Value) AS Value
,MAX(OtherCol1) AS OtherCol1
,MAX(OtherCol2) AS OtherCol2
,MAX(OtherCol3) AS OtherCol3
FROM TblA AS a
INNER JOIN TblB AS b ON a.TblBID = b.ID
GROUP BY TblBID
You can use ROW_NUMBER() with OVER (PARTITION BY ...) to achieve the result.
SELECT
t.objectIdA,
t.objectIdB
FROM (
SELECT
a.objectIdA,
a.objectIdB,
rowid = ROW_NUMBER() OVER (PARTITION BY a.objectIdA ORDER BY a.objectIdB DESC)
FROM TableA a
INNER JOIN TableB b ON (a.objectIdA = b.objectIdA)
) t
WHERE rowid <= 1
Fiddle Code: http://sqlfiddle.com/#!3/a2ccd/1
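For readers without a SQL Server instance at hand, here is a minimal sketch of the ROW_NUMBER() approach run against the question's sample rows in an in-memory SQLite database via Python's sqlite3 module. SQLite is an assumption made for portability (window functions need SQLite 3.25 or newer), and the column alias rn is illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table TableA (objectIdA integer, objectIdB text);
create table TableB (objectIdA integer);
insert into TableA values (1, 'A'), (1, 'B'), (1, 'D'), (5, 'F');
insert into TableB values (1), (5);
""")

# Number the TableA rows within each objectIdA and keep the first.
one_per_id = conn.execute("""
select t.objectIdA, t.objectIdB
from (
    select a.objectIdA, a.objectIdB,
           row_number() over (
               partition by a.objectIdA
               order by a.objectIdB desc
           ) as rn
    from TableA a
    inner join TableB b on a.objectIdA = b.objectIdA
) t
where t.rn = 1
order by t.objectIdA
""").fetchall()

print(one_per_id)
```

Ordering by objectIdB descending is what makes the row for id 1 come out as D; since the question says any matching row will do, a different ORDER BY would be equally valid.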

How to create view that combine multiple row from 2 tables?

I want to create a view that combines data from two tables. Sample data for each table is shown below.
SELECT Command for TableA
SELECT [ID], [Date], [SUM]
FROM TableA
Result
ID | Date | SUM
1 | 1/1/2010 | 2
1 | 1/2/2010 | 4
3 | 1/3/2010 | 6
SELECT Command for TableB
SELECT [ID], [Date], [SUM]
FROM TableB
Result
ID | Date | SUM
1 | 1/1/2010 | 5
1 | 2/1/2010 | 3
1 | 31/1/2010 | 2
2 | 1/2/2010 | 20
I want output like below
ID | Date | SUMA | SUMB
1 | 1/1/2010 | 2 | 10
1 | 1/2/2010 | 4 | 0
2 | 1/2/2010 | 0 | 20
3 | 1/3/2010 | 6 | 0
How can I do that on SQL Server 2005?
The date values are not fixed; they vary as the tables are modified.
Try this...
SELECT
ISNULL(TableA.ID, TableB.ID) ID,
ISNULL(TableA.[Date], TableB.[Date]) [Date],
ISNULL(TableA.[SUM], 0) SUMA,
ISNULL(TableB.[SUM], 0) SUMB
FROM
TableA FULL OUTER JOIN TableB
ON TableA.ID = TableB.ID AND TableA.[Date] = TableB.[Date]
ORDER BY
ID
A full outer join is what you need because you want to include results from both tables regardless of whether there is a match or not.
I usually union the two queries together and then group them like so:
SELECT ID, [Date], SUM(SUMA) AS SUMA, SUM(SUMB) AS SUMB
FROM (
SELECT ID, [Date], [SUM] AS SUMA, 0 AS SUMB
FROM TableA
UNION ALL
SELECT ID, [Date], 0 AS SUMA, [SUM] AS SUMB
FROM TableB
) AS t
GROUP BY ID, [Date]
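To see what the UNION ALL plus GROUP BY approach produces on the sample data, the sketch below runs it in an in-memory SQLite database through Python's sqlite3 module. Two assumptions apply: SQLite stands in for SQL Server 2005, and the question's dates are read as day/month/year and rewritten as ISO strings so they sort correctly.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table TableA (ID integer, [Date] text, [SUM] integer);
create table TableB (ID integer, [Date] text, [SUM] integer);
-- Dates from the question, read as d/m/yyyy and stored as ISO text.
insert into TableA values
    (1, '2010-01-01', 2),
    (1, '2010-02-01', 4),
    (3, '2010-03-01', 6);
insert into TableB values
    (1, '2010-01-01', 5),
    (1, '2010-01-02', 3),
    (1, '2010-01-31', 2),
    (2, '2010-02-01', 20);
""")

# Stack both tables with a zero in the other table's column, then
# add the values up per ID and date.
combined = conn.execute("""
select ID, [Date], sum(SUMA) as SUMA, sum(SUMB) as SUMB
from (
    select ID, [Date], [SUM] as SUMA, 0 as SUMB from TableA
    union all
    select ID, [Date], 0 as SUMA, [SUM] as SUMB from TableB
) as t
group by ID, [Date]
order by ID, [Date]
""").fetchall()

for row in combined:
    print(row)
```

The result has one row per ID and exact date found in either table, so dates that exist only in TableB appear as their own rows. That may not match the asker's desired output, which leaves the intended combination rule open.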
SELECT
ISNULL(a.ID, b.ID) AS ID,
ISNULL(a.Date, b.Date) AS Date,
ISNULL(a.SUM, 0) AS SUMA,
ISNULL(b.SUM, 0) AS SUMB
FROM
TableA AS a
FULL JOIN
TableB AS b
ON a.ID = b.ID
AND a.Date = b.Date;
It's not obvious how you want to combine the two tables. I think this is what you're after, but can you confirm please?
TableA.Date is the most important field; if a given date occurs in TableA then it will be included in the view, but not if it only occurs in TableB.
If a date has records in TableA and TableB and the records have a matching ID, they are combined into one row in the view with SUMA being taken from TableA.Sum and SUMB being TableA.Sum * TableB.Sum (e.g. Date: 01/01/2010, ID: 1) (e.g. Date: 01/03/2010 ID: 3).
If a date has records in TableA and TableB with different IDs, the view includes these records separately without multiplying the Sum values at all (e.g. Date 02/01/2010, ID: 1 and ID: 2).