Pair record with closest from join - sql

I have the following tables:
id | detected
-----------+----------------
288 | 26817612
288 | 26817734
468 | 26817609
468 | 26817646
476 | 26817700
502 | 26817609
502 | 26817616
502 | 26817655
and
id | fulfilled
-----------+-----------------
288 | 26817616
288 | 26817635
468 | 26817623
468 | 26817659
476 | 26817706
502 | 26817621
502 | 26817627
502 | 26817663
What I need to do is JOIN these two tables by id, matching each record from the first table with its closest fulfilled counterpart.
For example:
id | detected | fulfilled
-------------------------
288| 26817612 | 26817616
288| 26817734 | 26817635
468| 26817609 | 26817623
... and so on.
Is there any way to do this with this data, or am I wasting my time and should I gather new data?

I have created a solution at DB Fiddle using only the data for id 288; it should work for all the other ids as well. Here is the URL: https://www.db-fiddle.com/f/vieqapnXDrrGzGeUA7GE5h/4
Here is the final SQL:
SELECT
s1.Id, s1.detected, s2.fulfilled
FROM
(SELECT
t1.Id, t1.detected, MIN(ABS(t1.detected - t2.fulfilled)) AS Diff
FROM
table1 t1
LEFT JOIN
table2 t2
ON t1.Id = t2.Id
Group by t1.Id, t1.detected) s1
LEFT JOIN
table2 s2
ON s1.Id = s2.Id
WHERE
s1.Diff = ABS(s1.detected - s2.fulfilled)

You seem to want to reduce the number of rows as well. To me, this suggests row_number():
select t12.*
from (select t1.*, t2.fulfilled,
             -- partition by id and detected so every detected row keeps its own closest match
             row_number() over (partition by t1.id, t1.detected order by abs(t1.detected - t2.fulfilled)) as seqnum
      from t1 join
           t2
           on t1.id = t2.id
     ) t12
where seqnum = 1;

One option, assuming both tables always contain the same list of id values, is to use cross apply with top (1) (SQL Server syntax) and order by the absolute difference of the two values to pick the closest match:
select *
from t1
cross apply(
select top (1) t2.fulfilled
from t2
where t2.id=t1.id
order by Abs(t1.detected-t2.fulfilled)
)t2
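
The row_number() idea from the answers above can be tried end to end against the sample data from the question. The following is a minimal sketch, assuming an in-memory SQLite database driven from Python (the question does not name a database) and the table names t1 and t2 used in the answers. Ties in distance are broken by the smaller fulfilled value, which is an assumption the question does not settle.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER, detected INTEGER);
    CREATE TABLE t2 (id INTEGER, fulfilled INTEGER);
    INSERT INTO t1 VALUES
        (288, 26817612), (288, 26817734), (468, 26817609), (468, 26817646),
        (476, 26817700), (502, 26817609), (502, 26817616), (502, 26817655);
    INSERT INTO t2 VALUES
        (288, 26817616), (288, 26817635), (468, 26817623), (468, 26817659),
        (476, 26817706), (502, 26817621), (502, 26817627), (502, 26817663);
""")

# Rank every candidate pair per detected row by absolute distance and keep
# the best one. Requires SQLite 3.25 or newer for window functions.
query = """
    SELECT id, detected, fulfilled
    FROM (
        SELECT t1.id AS id,
               t1.detected AS detected,
               t2.fulfilled AS fulfilled,
               ROW_NUMBER() OVER (
                   PARTITION BY t1.id, t1.detected
                   ORDER BY ABS(t1.detected - t2.fulfilled), t2.fulfilled
               ) AS seqnum
        FROM t1
        JOIN t2 ON t1.id = t2.id
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY id, detected
"""
closest = conn.execute(query).fetchall()
for row in closest:
    print(row)
```

Note that the same fulfilled value can be matched to more than one detected row (for id 502, both 26817609 and 26817616 are closest to 26817621). If each fulfilled record may be used only once, a different pairing strategy is needed.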

Related

How to group total amount spend based on both ID and name

I have a table like this:
patientId | Units | Amount | PatientName
1234 | 1 | 20 |lisa
1111 | 5 | 10 |john
1234 | 10 | 200 |lisa
345 | 2 | 30 | xyz
I want to get the ID in one column, then the patient name, then the total amount that patient spent on different items.
Please note that I got the patient name in the table above by joining 2 tables using ID as the key.
This is the query I use to get that table:
select t1.*,t2.name from table1 as t1 inner join table2 as t2
on t1.id = t2.id
Then, to add up the amounts, I am trying to use the GROUP BY clause, but that gives an error.
Please note that I cannot use a temp table for this; I need to do it using a subquery only. How can I do it?
Are you looking for group by?
select t1.patientid, t2.patientname, sum(t1.amount)
from table1 t1 join
table2 t2
on t1.id = t2.id
group by t1.patientid, t2.patientname;
select t1.id,
       t2.name,
       sum(t1.amount) as total_amount
from table1 t1
inner join table2 t2
on t1.id = t2.id
group by t1.id, t2.name
What are table1 and table2 like? What's the error message?
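
The GROUP BY approach from the answers can be checked against the sample rows in the question. This is only a sketch: the split into table1(id, units, amount) and table2(id, name) is an assumption, since the question never shows the two source tables, and SQLite is used here purely so the query can be run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the question only shows the joined result, not the two tables.
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, units INTEGER, amount INTEGER);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1234, 1, 20), (1111, 5, 10), (1234, 10, 200), (345, 2, 30);
    INSERT INTO table2 VALUES (1234, 'lisa'), (1111, 'john'), (345, 'xyz');
""")

# Every selected column is either grouped or aggregated, which avoids the
# error that "select t1.*" together with GROUP BY typically produces.
totals = conn.execute("""
    SELECT t1.id, t2.name, SUM(t1.amount) AS total_amount
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.id = t2.id
    GROUP BY t1.id, t2.name
    ORDER BY t1.id
""").fetchall()
for row in totals:
    print(row)
```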

select the master table and relation table where by some column id

I don't know whether this is possible with SQL alone.
What I want to do is select from the master table and the relation table, filtered by a column id.
product_id | product
-------------------+-------------
21 | Milk
26 | HeadPhone
25 | TV
product_id | custom_id
-------------------+----------
21 | 213
26 | 245
26 | 229
25 | 245
25 | 244
Is it possible to find products by custom_id?
Below is the kind of result I want for custom_id = 245:
product_id | product | custom_id
------------+-------------+---------
26 | HeadPhone | 245,229
25 | TV | 245,244
I am not entirely sure I understand the part "something likes where custom_id = 245", but maybe you are looking for something like this:
select mt.product_id,
mt.product,
string_agg(rt.custom_id::text, ',' order by rt.custom_id desc) as custom_ids
from master_table mt
join relation_table rt on mt.product_id = rt.product_id
where exists (select *
from relation_table mt2
where mt2.product_id = mt.product_id
and mt2.custom_id = 245)
group by mt.product_id, mt.product;
If you are using Postgres version 8.4 or higher, you can use the array_agg function.
The array_agg function returns an array, but it can be cast to text.
SELECT t1.product_id, t1.product, array_agg(t2.custom_id)
FROM Table1 t1, Table2 t2
WHERE t1.product_id = t2.product_id
GROUP BY t1.product_id, t1.product
If you are using Postgres version 9.0 or higher, you can use the string_agg function, and your query could look like this:
SELECT t1.product_id, t1.product, string_agg(t2.custom_id::text, ',')
FROM Table1 t1, Table2 t2
WHERE t1.product_id = t2.product_id
GROUP BY t1.product_id, t1.product
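
The filter-then-aggregate pattern from the first answer can be tried without a Postgres server. The sketch below swaps in SQLite, whose GROUP_CONCAT plays the role of string_agg; this substitution and the table names master_table and relation_table (taken from the first answer) are assumptions. GROUP_CONCAT does not guarantee element order here, so the ids are sorted in Python rather than in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_table (product_id INTEGER, product TEXT);
    CREATE TABLE relation_table (product_id INTEGER, custom_id INTEGER);
    INSERT INTO master_table VALUES (21, 'Milk'), (26, 'HeadPhone'), (25, 'TV');
    INSERT INTO relation_table VALUES (21, 213), (26, 245), (26, 229), (25, 245), (25, 244);
""")

# EXISTS keeps only products linked to custom_id 245, while the outer join
# still aggregates all custom ids of those products.
rows = conn.execute("""
    SELECT mt.product_id, mt.product, GROUP_CONCAT(rt.custom_id) AS custom_ids
    FROM master_table AS mt
    JOIN relation_table AS rt ON mt.product_id = rt.product_id
    WHERE EXISTS (SELECT 1
                  FROM relation_table AS rt2
                  WHERE rt2.product_id = mt.product_id
                    AND rt2.custom_id = 245)
    GROUP BY mt.product_id, mt.product
    ORDER BY mt.product_id
""").fetchall()

# Normalise the comma-separated strings into sorted integer lists.
result = {pid: (product, sorted(int(c) for c in ids.split(",")))
          for pid, product, ids in rows}
print(result)
```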

Oracle Efficiently joining tables with subquery in FROM

Table 1:
| account_no | **other columns**...
+------------+-----------------------
| 1 |
| 2 |
| 3 |
| 4 |
Table 2:
| account_no | TX_No | Balance | History |
+------------+-------+---------+------------+
| 1 | 123 | 123 | 12.01.2011 |
| 1 | 234 | 2312 | 01.03.2011 |
| 3 | 232 | 212 | 19.02.2011 |
| 4 | 117 | 234 | 24.01.2011 |
I have a query with multiple joins. One of the tables in the query (Table 2) is problematic: it is a view that computes many other things, so every query against it is costly. From Table 2, for each account_no in Table 1, I need the whole row with the greatest TX_NO. This is how I do it:
SELECT * FROM TABLE1 A LEFT JOIN
( SELECT
X.ACCOUNT_NO,
HISTORY,
X.BALANCE
FROM TABLE2 X INNER JOIN
(SELECT
ACCOUNT_NO,
MAX(TX_NO) AS TX_NO
FROM TABLE2
GROUP BY ACCOUNT_NO) Y ON X.ACCOUNT_NO = Y.ACCOUNT_NO AND X.TX_NO = Y.TX_NO) B
ON B.ACCOUNT_NO = A.ACCOUNT_NO
As I understand it, this first performs the inner join over all the rows in Table2 and only afterwards left joins the needed account_no values with Table1, which is what I would like to avoid.
My question: Is there a way to find the max(TX_NO) for only those accounts that are in Table1 instead of going through all? I think it will help to increase the speed of the query.
I think you are on the right track, but I don't think that you need to, and would not myself, nest the subqueries the way you have done. Instead, if you want to get each record from table 1 and the matching max record from table 2, you can try the following:
SELECT * FROM TABLE1 t1
LEFT JOIN
(
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY account_no ORDER BY TX_No DESC) rn
FROM TABLE2 t
) t2
ON t1.account_no = t2.account_no AND
t2.rn = 1
If you want to continue with your original approach, this is how I would do it:
SELECT *
FROM TABLE1 t1
LEFT JOIN TABLE2 t2
ON t1.account_no = t2.account_no
INNER JOIN
(
SELECT account_no, MAX(TX_No) AS max_tx_no
FROM TABLE2
GROUP BY account_no
) t3
ON t2.account_no = t3.account_no AND
t2.TX_No = t3.max_tx_no
Instead of using a window function to find the greatest record per account in TABLE2, we use a second join to a subquery. I would expect the window function approach to perform better than this double join approach, and once you get used to it, it can even be easier to read.
If table1 is comparatively less expensive, you could do the left outer join first, which would considerably decrease the result set, and then pick only the latest transaction records from that:
select <required columns> from
(
  select f.*, row_number() over (partition by f.account_no order by f.tx_no desc) as rn
  from
  (
    select a.*, b.tx_no, b.balance, b.history
    from table1 a left outer join table2 b
    on a.account_no = b.account_no
  ) f
) g where g.rn = 1
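
The window-function answer can be run against the sample rows from the question. This is a sketch on SQLite rather than Oracle, so it says nothing about how the Oracle optimizer treats the expensive view; it only shows that the query returns the row with the greatest TX_No per account and keeps accounts with no transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (account_no INTEGER);
    CREATE TABLE table2 (account_no INTEGER, tx_no INTEGER, balance INTEGER, history TEXT);
    INSERT INTO table1 VALUES (1), (2), (3), (4);
    INSERT INTO table2 VALUES
        (1, 123, 123, '12.01.2011'),
        (1, 234, 2312, '01.03.2011'),
        (3, 232, 212, '19.02.2011'),
        (4, 117, 234, '24.01.2011');
""")

# Number the transactions of each account from newest to oldest tx_no and
# left join only the first one. Needs SQLite 3.25+ for ROW_NUMBER().
latest = conn.execute("""
    SELECT t1.account_no, t2.tx_no, t2.balance, t2.history
    FROM table1 AS t1
    LEFT JOIN (
        SELECT account_no, tx_no, balance, history,
               ROW_NUMBER() OVER (PARTITION BY account_no ORDER BY tx_no DESC) AS rn
        FROM table2
    ) AS t2
      ON t1.account_no = t2.account_no AND t2.rn = 1
    ORDER BY t1.account_no
""").fetchall()
for row in latest:
    print(row)
```

Account 2 has no rows in table2, so it comes back with NULL (None in Python) for the transaction columns instead of being dropped.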

Query that counts pairs with same values depending on third column

I have three columns: Team_Code, ID, times_together.
I'm trying to count how many times IDs share the same "Team_Code" and add times_together to that count.
In other words, I'm trying to list all the pairs of values in one column, check how many times they have the same value in another row, and add the third column to it.
The simplest way to ask this question is with a picture:
Values can appear twice (for example
1110 with 8888
and then
8888 with 1110).
You could self join the table on team_code and sum the times_together:
SELECT t1.id, t2.id, SUM(t1.times_together)
FROM mytable t1
JOIN mytable t2 ON t1.team_code = t2.team_code AND t1.id != t2.id
GROUP BY t1.id, t2.id
If you want to make sure each pair only appears once, you could add a condition to always take the lower id on the left:
SELECT t1.id, t2.id, SUM(t1.times_together)
FROM mytable t1
JOIN mytable t2 ON t1.team_code = t2.team_code AND t1.id < t2.id
GROUP BY t1.id, t2.id
I would suggest this self-joining SQL which takes all possible ID pairs (but only where the first is smaller than the second), and uses a CASE to sum the times_together when the persons played in the same team:
select t1.id,
t2.id,
sum(case when t1.Team_Code = t2.Team_Code
then t1.times_together
else 0
end) times_together
from t as t1
inner join t as t2
on t1.id < t2.id
group by t1.id, t2.id
order by 1, 2
Output in the example case is:
| id | id | times_together |
|------|------|----------------|
| 1028 | 1110 | 0 |
| 1028 | 2220 | 0 |
| 1028 | 8888 | 0 |
| 1110 | 2220 | 1 |
| 1110 | 8888 | 1 |
| 2220 | 8888 | 6 |
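
The self-join with t1.id < t2.id can be sketched on a tiny data set. The rows below are invented for illustration, because the question's sample data was only given as a picture; the table name mytable follows the first answer. Unlike the CASE-based query, this variant returns only pairs that actually share a team, so pairs with a total of 0 do not appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample rows: the original data is not available in the text.
conn.executescript("""
    CREATE TABLE mytable (team_code TEXT, id INTEGER, times_together INTEGER);
    INSERT INTO mytable VALUES
        ('A', 1110, 1), ('A', 2220, 1),
        ('B', 2220, 6), ('B', 8888, 6);
""")

# t1.id < t2.id keeps each unordered pair once (1110/2220 but not 2220/1110).
pairs = conn.execute("""
    SELECT t1.id AS id_a, t2.id AS id_b, SUM(t1.times_together) AS times_together
    FROM mytable AS t1
    JOIN mytable AS t2
      ON t1.team_code = t2.team_code AND t1.id < t2.id
    GROUP BY t1.id, t2.id
    ORDER BY 1, 2
""").fetchall()
for row in pairs:
    print(row)
```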

Getting all the current effective records from a ORACLE table

I have two tables in oracle database
Table 1 say table1 with fields (id, name)
Records e.g.
###############
id | name
1 | Chair
2 | Table
3 | Bed
###############
and Table 2 say table2 with fields (id, table1_id, date, price)
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
2 | 1 | 2013-08-09 | 300
3 | 2 | 2013-09-09 | 5100
4 | 2 | 2013-08-09 | 5000
5 | 3 | 2013-09-09 | 10500
################################
What I want is to retrieve the latest price of each item from table 2.
The result of the SQL should look like this:
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
3 | 2 | 2013-09-09 | 5100
5 | 3 | 2013-09-09 | 10500
################################
I can do this in MySQL with the following query:
SELECT t2.id, t1.id, t1.name, t2.date, t2.price
FROM table1 t1 JOIN table2 t2
ON (t1.id = t2.table1_id
AND t2.id = (
SELECT id
FROM table2
WHERE table1_id = t1.id
ORDER BY table2.date DESC
LIMIT 1
));
but it does not work in Oracle. I need a query that can run on both servers with only minor modification.
You may try this (should work in both MySQL and Oracle):
select t2.id, t2.table1_id, t2.dat, t2.price
from table1 t1 join table2 t2 on (t1.id = t2.table1_id)
join (select table1_id, max(dat) max_date
from table2 group by table1_id) tmax
on (tmax.table1_id = t2.table1_id and tmax.max_date = t2.dat);
This query may return several rows for the same table1_id and date if there are several prices in table2, like this:
##############################
id |table1_id| date | price
1 | 1 | 2013-09-09 | 500
2 | 1 | 2013-09-09 | 300
It's possible to change the query to retrieve only 1 row for each table1_id, but that needs an additional requirement (which row to choose in the above example).
If it doesn't matter, then you may try this:
select max(t2.id) as id, t2.table1_id, t2.dat, max(t2.price) as price
from table1 t1 join table2 t2 on (t1.id = t2.table1_id)
join (select table1_id, max(dat) max_date
from table2 group by table1_id) tmax
on (tmax.table1_id = t2.table1_id and tmax.max_date = t2.dat)
group by t2.table1_id, t2.dat;
You can try this using GROUP BY instead, since you're not retrieving the product name from table1 except the product id (which is already in table2)
SELECT id,table1_id,max(date),price
FROM table2
GROUP BY id,table1_id,price
This is what you want:
select t2.id,t2.table1_id,t1.name,t2.pricedate,t2.price
from table1 t1
join
(
select id,table1_id, pricedate,price, row_number() over (partition by table1_id order by pricedate desc) rn
from table2
) t2
on t1.id = t2.table1_id
where t2.rn = 1
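
The portable "join against MAX(date) per item" answer can be checked with the sample rows from the question. This sketch runs on SQLite rather than MySQL or Oracle, and the column name price_date is an assumption chosen to avoid the reserved word DATE, much as the answers use dat and pricedate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, table1_id INTEGER, price_date TEXT, price INTEGER);
    INSERT INTO table1 VALUES (1, 'Chair'), (2, 'Table'), (3, 'Bed');
    INSERT INTO table2 VALUES
        (1, 1, '2013-09-09', 500),
        (2, 1, '2013-08-09', 300),
        (3, 2, '2013-09-09', 5100),
        (4, 2, '2013-08-09', 5000),
        (5, 3, '2013-09-09', 10500);
""")

# Find the newest date per item in a derived table, then join back to fetch
# the full row. ISO-formatted date strings compare correctly as text.
current_prices = conn.execute("""
    SELECT t2.id, t2.table1_id, t1.name, t2.price_date, t2.price
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.id = t2.table1_id
    JOIN (SELECT table1_id, MAX(price_date) AS max_date
          FROM table2
          GROUP BY table1_id) AS tmax
      ON tmax.table1_id = t2.table1_id AND tmax.max_date = t2.price_date
    ORDER BY t2.table1_id
""").fetchall()
for row in current_prices:
    print(row)
```

As the answer above points out, this form returns more than one row per item when several prices share the latest date.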