New table from two tables with max(timestamp) - BigQuery SQL

I have two tables that share the combination of retailer and id. I need to create a new table containing every retailer + id combination from the first table, together with the data from the second table that has the latest timestamp for that combination.
The first table has only one record per retailer, id combination, but the second table has multiple records per combination, one for each time it was scraped. I need to create a new table with the latest-timestamp data for each combination.
input table 1:
input table 2:
output table:

This is basically an aggregation and a join. Because the key is the combination of retailer and id, both columns go into the group by and the join:
select *
from table1 t1 left join
(select t2.retailer, t2.id, max(t2.timestamp) as max_timestamp
from table2 t2
group by t2.retailer, t2.id
) t2
using (retailer, id);
If you want the entire most recent row, you can use a variant of this:
select *
from table1 t1 left join
(select ( array_agg(t2 order by t2.timestamp desc limit 1) )[safe_ordinal(1)].*
from table2 t2
group by t2.retailer, t2.id
) t2
using (retailer, id);
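For anyone who wants to try the latest-row-per-key idea without a BigQuery project, here is a rough sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not the BigQuery query itself: the sample rows and the price column are made up, and because SQLite has no array_agg(... limit 1), the sketch joins back to table2 on the grouped max(timestamp) instead.

```python
import sqlite3

# In-memory database with invented sample data (price is a hypothetical column).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (retailer TEXT, id INTEGER);
    CREATE TABLE table2 (retailer TEXT, id INTEGER, timestamp TEXT, price REAL);
    INSERT INTO table1 VALUES ('A', 1), ('A', 2), ('B', 1);
    INSERT INTO table2 VALUES
        ('A', 1, '2020-01-01', 10.0),
        ('A', 1, '2020-02-01', 12.0),
        ('B', 1, '2020-01-15', 7.0);
""")

# Step 1: find the latest timestamp per (retailer, id).
# Step 2: join back to table2 to pick up the full row for that timestamp.
# LEFT JOINs keep table1 combinations that were never scraped.
latest_rows = conn.execute("""
    SELECT t1.retailer, t1.id, t2.timestamp, t2.price
    FROM table1 t1
    LEFT JOIN (
        SELECT retailer, id, MAX(timestamp) AS max_timestamp
        FROM table2
        GROUP BY retailer, id
    ) m ON m.retailer = t1.retailer AND m.id = t1.id
    LEFT JOIN table2 t2
        ON t2.retailer = m.retailer
       AND t2.id = m.id
       AND t2.timestamp = m.max_timestamp
    ORDER BY t1.retailer, t1.id
""").fetchall()

for row in latest_rows:
    print(row)
```

Note that if two rows in table2 tie on the latest timestamp for the same key, this join-back form returns both of them, whereas the array_agg variant returns exactly one.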

Related

Group using data from one query into another

I have a table that looks like below. It is created using a query -
NPI Other_Columns
123 Several_Other_Columns
456 Several_Other_Columns
How do I take every NPI from this table and get a count of the number of times they appeared in another table? The structure of the other table is like so -
Claim_id NPI1 NPI2 NPI3 NPI4 NPI5 NPI6 NPI7 NPI8
If an NPI from the first table appears in any field of the second table, we want to count that claim.
The first task is the join:
SELECT
t1.npi,
t1.other_columns,
t2.claim_id
FROM table1 as t1
JOIN table2 as t2 ON t1.npi in (t2.npi1,t2.npi2,t2.npi3,t2.npi4,t2.npi5,t2.npi6,t2.npi7,t2.npi8)
That gets you every claim joined to each NPI that appears on it.
Now count the claims per NPI:
SELECT
t1.npi,
count(t2.claim_id) as claim_count
FROM table1 as t1
JOIN table2 as t2 ON t1.npi in (t2.npi1,t2.npi2,t2.npi3,t2.npi4,t2.npi5,t2.npi6,t2.npi7,t2.npi8)
GROUP BY t1.npi
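To see the IN-list join and a per-NPI count working end to end, here is a small sketch with Python's sqlite3 module. The claim rows are invented, the column names npi1 to npi8 follow the structure given in the question, and the database engine in the original question is not stated, so treat SQLite here as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Column names npi1..npi8 mirror the claim table layout from the question.
npi_cols = [f"npi{i}" for i in range(1, 9)]
col_defs = ", ".join(f"{c} INTEGER" for c in npi_cols)
conn.execute("CREATE TABLE table1 (npi INTEGER, other_columns TEXT)")
conn.execute(f"CREATE TABLE table2 (claim_id INTEGER, {col_defs})")

# Invented sample data: NPI 123 appears on claims 1 and 2, NPI 456 on claim 2.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(123, "x"), (456, "y")])
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 123, None, None, None, None, None, None, None),
        (2, 999, 123, 456, None, None, None, None, None),
        (3, 999, None, None, None, None, None, None, 888),
    ],
)

# Join on "npi appears in any of the eight columns", then count claims per NPI.
in_list = ", ".join(f"t2.{c}" for c in npi_cols)
claim_counts = conn.execute(f"""
    SELECT t1.npi, COUNT(t2.claim_id) AS claim_count
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.npi IN ({in_list})
    GROUP BY t1.npi
    ORDER BY t1.npi
""").fetchall()

print(claim_counts)
```

With an inner join, NPIs that appear on no claim drop out of the result; switching to LEFT JOIN would keep them with a count of 0.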

Comparing base table value with second table's sum of value with group by

I have two tables:
One is a base table and the second is a transaction table. I want to compare the base table's value with the sum of the second table's values, grouped by the base table's ID.
Table1(T1Id,Amount1,...)
Table2(T2Id,T1ID,Amount2)
I want those rows from Table1 where Table2's SUM(Amount2) is greater than or equal to Table1's Amount1.
* T1ID relates the two tables.
* The SQL query has many joins with other tables for data retrieval.
One approach uses a join:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
INNER JOIN Table2 t2
ON t1.T1Id = t2.T1ID
GROUP BY
t1.T1Id, t1.Amount1
HAVING
SUM(t2.Amount2) >= t1.Amount1;
We can also try doing this via a correlated subquery:
SELECT t1.T1Id, t1.Amount1
FROM Table1 t1
WHERE t1.Amount1 <= (SELECT SUM(t2.Amount2) FROM Table2 t2
WHERE t1.T1Id = t2.T1ID);
I would use something similar to the query below:
SELECT
a.T1Id, a.Amount1, SUM(b.Amount2)
FROM Table1 a
INNER JOIN Table2 b on b.T1Id = a.T1Id
GROUP BY a.T1Id, a.Amount1
HAVING SUM(b.Amount2) >= a.Amount1;
Basically what the query above does is give you the ID, Amount from table 1 and the summed amount from table 2. The HAVING clause at the end of query filters out those records where the summed amount from the second table is smaller than the amount from the first one.
If you want to add further table joins to the query, you can do so by adding as many joins as you wish. I would recommend having a referenced ID for each table you are joining in the Table1 table.
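As a quick check that the GROUP BY / HAVING form and the correlated-subquery form select the same rows, here is a sketch using Python's sqlite3 module. The amounts are invented sample data, and SQLite is only a stand-in since the question does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (T1Id INTEGER PRIMARY KEY, Amount1 REAL);
    CREATE TABLE Table2 (T2Id INTEGER PRIMARY KEY, T1ID INTEGER, Amount2 REAL);
    -- T1Id 1: transactions sum to 100 (meets Amount1)
    -- T1Id 2: transactions sum to 20 (below Amount1)
    -- T1Id 3: no transactions at all
    INSERT INTO Table1 VALUES (1, 100), (2, 50), (3, 10);
    INSERT INTO Table2 VALUES (1, 1, 60), (2, 1, 40), (3, 2, 20);
""")

# Variant 1: join, group, and filter the groups with HAVING.
having_rows = conn.execute("""
    SELECT t1.T1Id, t1.Amount1
    FROM Table1 t1
    INNER JOIN Table2 t2 ON t1.T1Id = t2.T1ID
    GROUP BY t1.T1Id, t1.Amount1
    HAVING SUM(t2.Amount2) >= t1.Amount1
    ORDER BY t1.T1Id
""").fetchall()

# Variant 2: correlated subquery; SUM over no rows is NULL, so the
# comparison is not true and rows without transactions are excluded too.
subquery_rows = conn.execute("""
    SELECT t1.T1Id, t1.Amount1
    FROM Table1 t1
    WHERE t1.Amount1 <= (SELECT SUM(t2.Amount2)
                         FROM Table2 t2
                         WHERE t1.T1Id = t2.T1ID)
    ORDER BY t1.T1Id
""").fetchall()

print(having_rows, subquery_rows)
```

Both variants leave out Table1 rows that have no transactions; if those rows should be treated as a sum of 0, the subquery can be wrapped in COALESCE(..., 0).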

Multiple rows in table with values from another table

I am struggling with following issue:
Table1:
Table2:
Expected result:
Basically I want to multiply the rows in the dates table by the rows from the User table. Is it somehow possible? (using TSQL).
You need to apply a cross join:
select date,month,userid,userno from table1 cross join table2
select Date, Month, USER_ID, ID from
t1 cross join t2
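A cross join pairs every row of one table with every row of the other, so 2 dates and 2 users give 4 rows. Here is a small sketch with Python's sqlite3 module; the table contents are invented, and SQLite stands in for T-SQL here (the CROSS JOIN syntax is the same in both).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (Date TEXT, Month INTEGER);
    CREATE TABLE t2 (USER_ID TEXT, ID INTEGER);
    INSERT INTO t1 VALUES ('2018-01-01', 1), ('2018-02-01', 2);
    INSERT INTO t2 VALUES ('alice', 10), ('bob', 20);
""")

# Every date row is combined with every user row.
pairs = conn.execute("""
    SELECT Date, Month, USER_ID, ID
    FROM t1 CROSS JOIN t2
    ORDER BY Date, USER_ID
""").fetchall()

for row in pairs:
    print(row)
```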

How to select records of a table with last related record on another table in T-sql

I have two tables, and I want to take some records from the first table and get the last related record from the other one.
You can see my tables
I want to join table 1 with last record of table 2. (creationDate = 2018-07-20)
If you simply want to get the latest record in table 2 for every ID in table one then this will work:
select t1.ID, t1.Name, q.ID, q.CreationDate
from table1 t1
outer apply
(
select top 1 t2.ID, t2.CreationDate
from table2 t2
where t2.tbl_1_Id = t1.ID
order by t2.CreationDate desc
)q
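OUTER APPLY is specific to T-SQL, so it cannot be run in SQLite directly. As a portable way to experiment with the same "latest related row, or NULLs if none" behaviour, here is a sketch with Python's sqlite3 module that swaps in a LEFT JOIN on a correlated subquery with ORDER BY ... LIMIT 1. The sample rows are invented, and the sketch assumes table2.ID is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE table2 (ID INTEGER PRIMARY KEY, tbl_1_Id INTEGER, CreationDate TEXT);
    INSERT INTO table1 VALUES (1, 'first'), (2, 'second');
    INSERT INTO table2 VALUES
        (10, 1, '2018-07-18'),
        (11, 1, '2018-07-20'),
        (12, 1, '2018-07-19');
""")

# For each table1 row, the subquery picks the ID of its newest table2 row;
# the LEFT JOIN keeps table1 rows that have no related row (like OUTER APPLY).
last_related = conn.execute("""
    SELECT t1.ID, t1.Name, q.ID, q.CreationDate
    FROM table1 t1
    LEFT JOIN table2 q
        ON q.ID = (
            SELECT t2.ID
            FROM table2 t2
            WHERE t2.tbl_1_Id = t1.ID
            ORDER BY t2.CreationDate DESC
            LIMIT 1
        )
    ORDER BY t1.ID
""").fetchall()

for row in last_related:
    print(row)
```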

matching value by non-unique id and minimum date difference

I'm using sqlite through the RSQLite package in R.
I have two tables:
Table 1 has important columns 'PERMCO' and 'Reporting_Period'.
('Reporting_Period' is an integer date)
Table 2 has important columns 'PERMCO' and 'date'.
('date' is an integer date)
I want to do a left join with table 1 as the left table.
Thing is that 'PERMCO' is not unique (row-wise, many duplicates) in the second table.
For a given row of table 1, I want the match from the second table to be the row from table 2 with matching PERMCO that is closest in absolute date to 'Reporting_Period' in the first table.
Not really sure how to do this...
Thank you
The idea is to use a correlated subquery to find the date in table2 that is closest to Reporting_Period in table1:
SELECT t1.*, t2.*
FROM table1 t1
LEFT JOIN table2 t2
ON t1.permco = t2.permco
WHERE ABS(t2."date" - t1.Reporting_Period) = (SELECT MIN(ABS("date" - t1.Reporting_Period) )
FROM table2
WHERE permco = t1.permco
)
OR t2.permco IS NULL --because you want a left join
;
I'm not familiar with SQLite, so you may need to change how the query subtracts the two dates.
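Here is a sketch that runs the closest-date query in SQLite through Python's sqlite3 module. The rows and the value column are invented, and it assumes the integer dates are plain day counts so that subtracting them gives a real distance; integers in a yyyymmdd layout would need converting first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (PERMCO INTEGER, Reporting_Period INTEGER);
    CREATE TABLE table2 (PERMCO INTEGER, date INTEGER, value TEXT);
    -- PERMCO 1 has three candidates, PERMCO 2 has one, PERMCO 3 has none.
    INSERT INTO table1 VALUES (1, 17000), (2, 17000), (3, 17000);
    INSERT INTO table2 VALUES
        (1, 16990, 'a'),
        (1, 17003, 'b'),
        (1, 17100, 'c'),
        (2, 16000, 'd');
""")

# Keep only the table2 row whose distance to Reporting_Period equals the
# minimum distance for that PERMCO; the IS NULL branch preserves table1
# rows that have no match at all.
closest = conn.execute("""
    SELECT t1.PERMCO, t1.Reporting_Period, t2."date", t2.value
    FROM table1 t1
    LEFT JOIN table2 t2
        ON t1.PERMCO = t2.PERMCO
    WHERE ABS(t2."date" - t1.Reporting_Period) = (
              SELECT MIN(ABS("date" - t1.Reporting_Period))
              FROM table2
              WHERE PERMCO = t1.PERMCO
          )
       OR t2.PERMCO IS NULL
    ORDER BY t1.PERMCO
""").fetchall()

for row in closest:
    print(row)
```

If two table2 rows are equally close (one before and one after the reporting period), both are returned, so a tie-breaking rule may be needed.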