matching value by non-unique id and minimum date difference - sql

I'm using sqlite through the RSQLite package in R.
I have two tables:
Table 1 has important columns 'PERMCO' and 'Reporting_Period'.
('Reporting_Period' is an integer date)
Table 2 has important columns 'PERMCO' and 'date'.
('date' is an integer date)
I want to do a left join with table 1 as the left table.
The catch is that 'PERMCO' is not unique in the second table (many rows share the same value).
For a given row of table 1, I want the match from the second table to be the row from table 2 with matching PERMCO that is closest in absolute date to 'Reporting_Period' in the first table.
Not really sure how to do this...
Thank you

The idea is to use a correlated subquery to find the row in table2 whose date is closest to Reporting_Period in table1:
SELECT t1.*, t2.*
FROM table1 t1
LEFT JOIN table2 t2
  ON t1.permco = t2.permco
WHERE ABS(t2."date" - t1.Reporting_Period) = (SELECT MIN(ABS("date" - t1.Reporting_Period))
                                              FROM table2
                                              WHERE permco = t1.permco
                                             )
   OR t2.permco IS NULL -- keeps table1 rows that have no match, as a left join should
;
I'm not familiar with SQLite, so you may need to adjust how the two dates are subtracted. If both columns are plain integers, as you describe, the subtraction should work as written.
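To see the query above in action, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the integer dates and the extra payload column are made up for illustration; only PERMCO, Reporting_Period and date come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (PERMCO INTEGER, Reporting_Period INTEGER);
    CREATE TABLE table2 (PERMCO INTEGER, "date" INTEGER, payload TEXT);
    -- Made-up integer dates (plain day counts), for illustration only.
    INSERT INTO table1 VALUES (1, 100), (2, 200), (3, 300);
    INSERT INTO table2 VALUES
        (1, 90, 'a'), (1, 103, 'b'), (1, 150, 'c'),
        (2, 260, 'd');
""")

closest_sql = """
    SELECT t1.PERMCO, t1.Reporting_Period, t2."date", t2.payload
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1.PERMCO = t2.PERMCO
    WHERE ABS(t2."date" - t1.Reporting_Period) = (
              SELECT MIN(ABS("date" - t1.Reporting_Period))
              FROM table2
              WHERE PERMCO = t1.PERMCO)
       OR t2.PERMCO IS NULL   -- table1 rows without any match
    ORDER BY t1.PERMCO
"""
rows = conn.execute(closest_sql).fetchall()
for row in rows:
    print(row)
```

Note that if two table2 rows are equally close to Reporting_Period, both are returned, so a tie-break rule would be needed to guarantee one row per table1 row.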

Related

MariaDB Select Min-Date (earliest) from multiple columns

I need to do a select within a MariaDB, where I have one row per customer and the earliest date out of three different tables with action datetime values.
Example Tables:
Main Table
customer-id   Column_X
First         Things
Second        Things
Table One
customer-id   Date
First         Date
Second        Date
Table Two
customer-id   Date
First         Date
Second        Earliest Date (Table2)
Table Three
customer-id   Date
First         Earliest Date (Table3)
Second        Date
My aim is to have the earliest date out of the three columns in the other tables in one column within the select.
What I tried to do is this:
SELECT main.customer-id , main.Column_X
(SELECT LEAST(C) FROM (VALUES ((table1.date) , (table2.date), (table3.date)) AS C) AS First_Action
FROM main_table main
LEFT JOIN table_one table1 ON table1.cutomer-id = main.customer-id
LEFT JOIN table_two table2 ON table2.cutomer-id = main.customer-id
LEFT JOIN table_three table3 ON table3.cutomer-id = main.customer-id
GROUP BY main.customer-id;
Unfortunately, I don't get any results, just an error message.
So the resulting table should look something like this:
Result
customer-id   Column_X   First_Action
First         Things     Earliest Date (Table 3)
Second        Things     Earliest Date (Table 2)
I just started working with SQL statements and therefore have basically no experience. Help would be much appreciated!
Many Greetings
Chris
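It seems you simply need the LEAST function. Note that the hyphenated column name has to be quoted with backticks, and that LEAST returns NULL if any of its arguments is NULL, which can happen here because of the LEFT JOINs:
SELECT main.`customer-id`, main.Column_X,
       LEAST(table1.date, table2.date, table3.date) AS First_Action
FROM main_table main
LEFT JOIN table_one table1 ON table1.`customer-id` = main.`customer-id`
LEFT JOIN table_two table2 ON table2.`customer-id` = main.`customer-id`
LEFT JOIN table_three table3 ON table3.`customer-id` = main.`customer-id`;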
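The following is a small sketch of how the row-wise minimum behaves when one of the joined tables has no row for a customer. It runs on SQLite from Python, not on MariaDB: SQLite has no LEAST, so its multi-argument min() is used as a stand-in (like LEAST in MariaDB, it yields NULL as soon as one argument is NULL). The underscore column names, the sample dates and the '9999-12-31' sentinel are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table  (customer_id TEXT, column_x TEXT);
    CREATE TABLE table_one   (customer_id TEXT, "date" TEXT);
    CREATE TABLE table_two   (customer_id TEXT, "date" TEXT);
    CREATE TABLE table_three (customer_id TEXT, "date" TEXT);
    INSERT INTO main_table  VALUES ('First', 'Things'), ('Second', 'Things');
    INSERT INTO table_one   VALUES ('First', '2021-03-01'), ('Second', '2021-05-01');
    INSERT INTO table_two   VALUES ('First', '2021-02-01'), ('Second', '2021-01-15');
    -- Deliberately no table_three row for 'Second'.
    INSERT INTO table_three VALUES ('First', '2021-01-10');
""")

least_sql = """
    SELECT m.customer_id,
           -- Plain row-wise minimum: NULL when any joined date is missing.
           min(t1."date", t2."date", t3."date") AS plain_least,
           -- Replace missing dates with a far-future sentinel, then undo it.
           NULLIF(min(COALESCE(t1."date", '9999-12-31'),
                      COALESCE(t2."date", '9999-12-31'),
                      COALESCE(t3."date", '9999-12-31')),
                  '9999-12-31') AS first_action
    FROM main_table m
    LEFT JOIN table_one   t1 ON t1.customer_id = m.customer_id
    LEFT JOIN table_two   t2 ON t2.customer_id = m.customer_id
    LEFT JOIN table_three t3 ON t3.customer_id = m.customer_id
    ORDER BY m.customer_id
"""
rows = conn.execute(least_sql).fetchall()
for row in rows:
    print(row)
```

The same COALESCE/NULLIF wrapping should carry over to LEAST in MariaDB, but that has not been run here.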

How to JOIN and get data from either table based on specific logics?

Let's say I have 2 tables as shown below:
Table 1:
Table 2:
I want to join the 2 tables together so that the output table will have a "date" column, a "hrs_billed_v1" column from table1, and a "hrs_billed_v2" column from table2. Sometimes a date only exists in one of the tables, and sometimes a date exists in both tables. If a date exists in both table1 and table2, then I want to allocate the hrs_billed_v1 from table1 and hrs_billed_v2 from table2 to the output table.
So the ideal result will look like this:
I've tried "FULL OUTER JOIN" but it returned some null values for "date" in the output table. Below is the query I wrote:
SELECT
DISTINCT CASE WHEN table1.date is null then table2.date WHEN table2.date is null then table1.date end as date,
CASE WHEN table1.hrs_billed_v1 is null then 0 else table1.hrs_billed_v1 END AS hrs_billed_v1,
CASE WHEN table2.hrs_billed_v2 is null then 0 else table2.hrs_billed_v2 END AS hrs_billed_v2
FROM table1
FULL OUTER JOIN table2 ON table1.common = table2.common
Note that the "common" column that I use to join table1 and table2 is just a constant string that exists in both tables.
Any advice would be greatly appreciated!
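A full join is indeed what you want. The null dates come from your CASE expression, which has no branch for rows where both dates are present. I think the query would be: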
select
common,
date,
coalesce(t1.hrs_billed_v1, 0) as hrs_billed_v1,
coalesce(t2.hrs_billed_v2, 0) as hrs_billed_v2
from table1 t1
full join table2 t2 using (common, date)
Rationale:
you don't show what common is; your data indicates that you want to match rows of the same date - so I put both in the join condition; you might need to adapt that
there should really be no need for distinct
coalesce() is much shorter than the case expressions
using () is handy to express the join condition when the columns to match have the same name in both tables
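As a rough illustration of the result this produces, here is a sketch on an in-memory SQLite database from Python. SQLite only supports FULL JOIN from version 3.39.0, so to stay runnable on older builds the example swaps in an emulation: a LEFT JOIN plus a UNION ALL of the table2 rows that found no partner. The sample rows are invented, and the emulation assumes each (common, date) pair appears at most once per table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (common TEXT, "date" TEXT, hrs_billed_v1 INTEGER);
    CREATE TABLE table2 (common TEXT, "date" TEXT, hrs_billed_v2 INTEGER);
    INSERT INTO table1 VALUES ('x', '2021-01-01', 5), ('x', '2021-01-02', 3);
    INSERT INTO table2 VALUES ('x', '2021-01-02', 4), ('x', '2021-01-03', 6);
""")

full_join_sql = """
    -- Every table1 row, with its table2 partner if there is one.
    SELECT t1.common, t1."date",
           COALESCE(t1.hrs_billed_v1, 0) AS hrs_billed_v1,
           COALESCE(t2.hrs_billed_v2, 0) AS hrs_billed_v2
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t2.common = t1.common AND t2."date" = t1."date"
    UNION ALL
    -- Plus the table2 rows that have no partner in table1.
    SELECT t2.common, t2."date", 0, COALESCE(t2.hrs_billed_v2, 0)
    FROM table2 t2
    LEFT JOIN table1 t1
      ON t1.common = t2.common AND t1."date" = t2."date"
    WHERE t1."date" IS NULL
    ORDER BY 2
"""
rows = conn.execute(full_join_sql).fetchall()
for row in rows:
    print(row)
```

Every date shows up exactly once, with 0 filled in for the side that has no row for that date.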

New table from two table with max(timestamp) - Bigquery SQL

I have two tables that share a common key: the combination of retailer and id. I need to create a new table containing every retailer + id combination from the first table, together with the data for that combination from the second table that has the latest timestamp.
The first table has only one record per retailer, id combination, but the second table has multiple records per combination, one for each time it was scraped. I need to create a new table with the latest-timestamp data for each combination.
input table 1:
input table 2:
output table:
This is basically aggregation and join:
select *
from table1 t1 left join
     (select t2.retailer, t2.id, max(timestamp) as max_timestamp
      from table2 t2
      group by t2.retailer, t2.id
     ) t2
     using (retailer, id);
If you wanted the entire most recent row, you can use a variant of this:
select *
from table1 t1 left join
     (select ( array_agg(t2 order by timestamp desc limit 1) )[safe_ordinal(1)].*
      from table2 t2
      group by t2.retailer, t2.id
     ) t2
     using (retailer, id);
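The "latest row per key" idea can be tried outside BigQuery as well. The sketch below uses an in-memory SQLite database from Python and swaps the array_agg trick for a plain join against the per-key maximum timestamp, since SQLite has no equivalent of array_agg(... limit 1). The name and price columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (retailer TEXT, id INTEGER, name TEXT);
    CREATE TABLE table2 (retailer TEXT, id INTEGER, price REAL, "timestamp" TEXT);
    INSERT INTO table1 VALUES ('A', 1, 'widget'), ('B', 1, 'gadget'), ('B', 2, 'gizmo');
    INSERT INTO table2 VALUES
        ('A', 1, 9.5, '2021-01-01 10:00'),
        ('A', 1, 9.0, '2021-01-02 10:00'),
        ('B', 1, 5.0, '2021-01-01 09:00');
""")

latest_sql = """
    SELECT t1.retailer, t1.id, t1.name, t2.price, t2."timestamp"
    FROM table1 t1
    LEFT JOIN (
        -- Keep only the table2 rows carrying the newest timestamp per key.
        SELECT s.*
        FROM table2 s
        JOIN (SELECT retailer, id, MAX("timestamp") AS max_ts
              FROM table2
              GROUP BY retailer, id) m
          ON m.retailer = s.retailer
         AND m.id = s.id
         AND m.max_ts = s."timestamp"
    ) t2
      ON t2.retailer = t1.retailer AND t2.id = t1.id
    ORDER BY t1.retailer, t1.id
"""
rows = conn.execute(latest_sql).fetchall()
for row in rows:
    print(row)
```

Combinations that were never scraped stay in the output with NULLs, and two scrapes sharing the same newest timestamp would both be returned.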

Comparing base table value with second table's sum of value with group by

I have two tables:
One is a base table and the second is a transaction table. I want to compare a value in the base table with the grouped sum of a value in the transaction table.
Table1(T1Id,Amount1,...)
Table2(T2Id,T1ID,Amount2)
I want those rows from Table1 where the SUM(Amount2) from Table2 is greater than or equal to Table1's Amount1.
* T1ID relates the two tables
* The real SQL query has many joins with other tables for data retrieval.
One approach uses a join:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
INNER JOIN Table2 t2
ON t1.T1Id = t2.T1ID
GROUP BY
t1.T1Id, t1.Amount1
HAVING
SUM(t2.Amount2) >= t1.Amount1;
We can also try doing this via a correlated subquery:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
WHERE t1.Amount1 <= (SELECT SUM(t2.Amount2) FROM Table2 t2
WHERE t1.T1Id = t2.T1ID);
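Both forms can be checked side by side. Here is a minimal sketch on an in-memory SQLite database from Python, with invented amounts; it also shows that a Table1 row with no transactions at all is dropped by both queries (the inner join finds no partner, and the subquery's SUM is NULL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (T1Id INTEGER, Amount1 INTEGER);
    CREATE TABLE Table2 (T2Id INTEGER, T1ID INTEGER, Amount2 INTEGER);
    INSERT INTO Table1 VALUES (1, 100), (2, 50), (3, 10);
    -- T1Id 1 sums to 110, T1Id 2 sums to 20, T1Id 3 has no transactions.
    INSERT INTO Table2 VALUES (1, 1, 60), (2, 1, 50), (3, 2, 20);
""")

join_sql = """
    SELECT t1.T1Id, t1.Amount1
    FROM Table1 t1
    INNER JOIN Table2 t2
        ON t1.T1Id = t2.T1ID
    GROUP BY t1.T1Id, t1.Amount1
    HAVING SUM(t2.Amount2) >= t1.Amount1
    ORDER BY t1.T1Id
"""
subquery_sql = """
    SELECT t1.T1Id, t1.Amount1
    FROM Table1 t1
    WHERE t1.Amount1 <= (SELECT SUM(t2.Amount2)
                         FROM Table2 t2
                         WHERE t1.T1Id = t2.T1ID)
    ORDER BY t1.T1Id
"""
join_rows = conn.execute(join_sql).fetchall()
subquery_rows = conn.execute(subquery_sql).fetchall()
print(join_rows)
print(subquery_rows)
```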
I would use something similar to the query below:
SELECT
a.T1Id, a.Amount1, SUM(b.Amount2)
FROM Table1 a
INNER JOIN Table2 b on b.T1Id = a.T1Id
GROUP BY a.T1Id, a.Amount1
HAVING SUM(b.Amount2) >= a.Amount1;
Basically what the query above does is give you the ID, Amount from table 1 and the summed amount from table 2. The HAVING clause at the end of query filters out those records where the summed amount from the second table is smaller than the amount from the first one.
If you want to add further table joins to the query, you can do so by adding as many joins as you wish. I would recommend having a referenced ID for each table you are joining in the Table1 table.

add new column with matching id in both Table 1 and Table 2

I have two tables,
in table1 I have 5 rows and
in table2 3 rows
table1:
#no---Name---value
1-----John---100
2-----Cooper-200
3-----Mil----300
4-----Key----200
5-----Van----300
Table 2:
#MemID-#no---FavID
19-----1-----2
21-----1-----3
22-----2-----5
Now expected result:
#no---name---value---MyFav
1-----John---100-----NULL
2-----Cooper-200-----1
3-----Mil----300-----1
4-----Key----200-----NULL
5-----Van----300-----NULL
1 indicates - My favorites
MyFav - new column ( alias)
This is the expected result, please suggest how to get it.
I think I understand the logic. You want MyFav to be marked as a 1 if that row is a favorite of John. You can do this with a left join and some more filtering:
select t1.*,
(case when t2.#no is not null then 1 end) as MyFav
from table1 t1 left join
table2 t2
on t1.#no = t2.FavId and
t2.#no = (select tt1.#no from table1 tt1 where tt1.Name = 'John');
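The query can be checked against the sample data from the question. This sketch runs it on an in-memory SQLite database from Python; it assumes the column is literally named #no, which in SQLite has to be written as the quoted identifier "#no" (other databases may need different quoting).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 ("#no" INTEGER, Name TEXT, value INTEGER);
    CREATE TABLE table2 (MemID INTEGER, "#no" INTEGER, FavID INTEGER);
    INSERT INTO table1 VALUES
        (1, 'John', 100), (2, 'Cooper', 200), (3, 'Mil', 300),
        (4, 'Key', 200), (5, 'Van', 300);
    INSERT INTO table2 VALUES (19, 1, 2), (21, 1, 3), (22, 2, 5);
""")

fav_sql = """
    SELECT t1."#no", t1.Name, t1.value,
           CASE WHEN t2."#no" IS NOT NULL THEN 1 END AS MyFav
    FROM table1 t1
    LEFT JOIN table2 t2
      ON t1."#no" = t2.FavID
     -- Only favorites that belong to John count.
     AND t2."#no" = (SELECT tt1."#no" FROM table1 tt1 WHERE tt1.Name = 'John')
    ORDER BY t1."#no"
"""
rows = conn.execute(fav_sql).fetchall()
for row in rows:
    print(row)
```

Van is a favorite of Cooper, not of John, so that row keeps NULL in MyFav, matching the expected result in the question.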
Just use a natural join for that. It joins the two tables on the columns that have the same name in both, which in your case is #no (which I assume is your primary key).
For more information on natural join please visit SQL Joins