Linking Table to Itself and Getting Relational ID - sql

I want to find accounts that share the same id2 as other accounts and then work out which account each one is related to (see the tables below for an example).
Table Structure
Account ID | flag | id2
123 | Y | 1
456 | N | 1
789 | N | 1
888 | Y | 2
999 | N | 2
Results I want:
Account ID | id2 | src_account_id
456 | 1 | 123
789 | 1 | 123
999 | 2 | 888
Here's the query that I have
Select account_id, id2, src_account_id
FROM table1
WHERE id2 IN (Select id2 FROM table1 WHERE flag = 'Y')
But I'm stuck on how to get src_account_id. I'm fairly sure it involves an inner join of the table to itself, but I'm still not sure how to get the src_account_id out of that.

You can try this: use a subquery to get the flag = 'Y' rows, then join the table to that result set on id2.
SELECT t1.AccountID,t1.id2,t2.AccountID
FROM T t1 inner join (
SELECT id2,AccountID
FROM T
WHERE flag = 'Y'
) t2 on t1.id2 = t2.id2
WHERE t1.flag = 'N'
[Results]:
| AccountID | id2 | AccountID |
|-----------|-----|-----------|
| 456 | 1 | 123 |
| 789 | 1 | 123 |
| 999 | 2 | 888 |

Self join the table on id2 and flag.
SELECT t1."Account ID",
t1.id2,
t2."Account ID" src_account_id
FROM elbat t1
INNER JOIN elbat t2
ON t2.id2 = t1.id2
AND t1.flag = 'N'
AND t2.flag = 'Y';
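The self join above can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module, and the table name accounts and the column name account_id are assumptions standing in for the real schema.

```python
import sqlite3

# In-memory database; "accounts" and "account_id" are assumed names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER, flag TEXT, id2 INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(123, "Y", 1), (456, "N", 1), (789, "N", 1), (888, "Y", 2), (999, "N", 2)],
)

# t1 holds the 'N' accounts, t2 the 'Y' account that shares the same id2.
self_join_rows = conn.execute(
    """
    SELECT t1.account_id, t1.id2, t2.account_id AS src_account_id
    FROM accounts t1
    INNER JOIN accounts t2
            ON t2.id2 = t1.id2
           AND t2.flag = 'Y'
    WHERE t1.flag = 'N'
    ORDER BY t1.account_id
    """
).fetchall()
print(self_join_rows)
```

Aliasing the second account column as src_account_id avoids having two result columns with the same name.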


Take the row after the specific row

I have a table from which I need to take the row that comes right after the row with course 'TA' and flag = 1. To help with this I created the column rnum (a row number per student, ordered by date).
| student | date | course | flag | rnum |
| ------- | ----- | ----------- | ---- | ---- |
| 1 | 17:00 | Math | null | 1 |
| 1 | 17:10 | Python | null | 2 |
| 1 | 17:15 | TA | 1 | 3 |
| 1 | 17:20 | English | null | 4 |
| 1 | 17:35 | Geography | null | 5 |
| 2 | 16:10 | English | null | 1 |
| 2 | 16:20 | TA | 1 | 2 |
| 2 | 16:30 | SQL | null | 3 |
| 2 | 16:40 | Python | null | 4 |
| 3 | 19:05 | English | null | 1 |
| 3 | 19:20 | Literachure | null | 2 |
| 3 | 19:30 | TA | null | 3 |
| 3 | 19:40 | Python | null | 4 |
| 3 | 19:50 | Python | null | 5 |
As a result I should have:
| student | date | course | flag | rnum |
| ------- | ----- | ------- | ---- | ---- |
| 1 | 17:20 | English | null | 4 |
| 2 | 16:30 | SQL | null | 3 |
There are many ways to get your desired result, let's see some of them.
1) EXISTS
You can use the EXISTS clause, specifying a subquery to match for the condition.
SELECT T2.*
FROM #MyTable T2
WHERE EXISTS (
SELECT 'x' x
FROM #MyTable T1
WHERE T1.course = 'TA' AND T1.flag = 1
AND T1.student = T2.student AND T2.rnum = T1.rnum + 1
)
2) LAG
You can use the window function LAG to access the previous row for a given order and then filter the result set on your conditions.
SELECT w.student, w.date, w.course, w.flag, w.rnum
FROM (
SELECT T1.*
, LAG(course, 1) OVER (PARTITION BY student ORDER BY rnum) prevCourse
, LAG(flag, 1) OVER (PARTITION BY student ORDER BY rnum) prevFlag
FROM #MyTable T1
) w
WHERE prevCourse = 'TA' AND prevFlag = 1
3) JOIN
You can self-JOIN your table on the next rnum and keep only the rows that match the right condition.
SELECT T2.*
FROM #MyTable T1
JOIN #MyTable T2 ON T1.student = T2.student AND T2.rnum = T1.rnum + 1
WHERE T1.course = 'TA' AND T1.flag = 1
4) CROSS APPLY
You can use CROSS APPLY to specify a subquery with the matching condition. It is much like the EXISTS clause, but the columns of the subquery are also available to the outer SELECT.
SELECT T2.*
FROM #MyTable T2
CROSS APPLY (
SELECT 'x' x
FROM #MyTable T1
WHERE T1.course = 'TA' AND T1.flag = 1
AND T1.student = T2.student AND T2.rnum = T1.rnum + 1
) x
5) CTE
You can use a common table expression (CTE) to extract the matching rows and then use it to filter your table with a JOIN.
;WITH
T1 AS (
SELECT student, rnum
FROM #MyTable T1
WHERE T1.course = 'TA' AND T1.flag = 1
)
SELECT T2.*
FROM #MyTable T2
JOIN T1 ON T1.student = T2.student AND T2.rnum = T1.rnum + 1
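To see that these approaches agree on the sample data, here is a minimal sketch that runs the EXISTS and LAG variants side by side. It assumes Python's sqlite3 module with SQLite 3.25 or later (needed for window functions) and uses a regular table named my_table in place of the #MyTable temp table; CROSS APPLY is not available in SQLite, so it is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# my_table is an assumed stand-in for the #MyTable temp table.
conn.execute(
    "CREATE TABLE my_table "
    "(student INTEGER, date TEXT, course TEXT, flag INTEGER, rnum INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "17:00", "Math", None, 1),
        (1, "17:10", "Python", None, 2),
        (1, "17:15", "TA", 1, 3),
        (1, "17:20", "English", None, 4),
        (1, "17:35", "Geography", None, 5),
        (2, "16:10", "English", None, 1),
        (2, "16:20", "TA", 1, 2),
        (2, "16:30", "SQL", None, 3),
        (2, "16:40", "Python", None, 4),
        (3, "19:05", "English", None, 1),
        (3, "19:20", "Literachure", None, 2),
        (3, "19:30", "TA", None, 3),
        (3, "19:40", "Python", None, 4),
        (3, "19:50", "Python", None, 5),
    ],
)

# Variant 1: keep a row if the same student has a flagged TA row at rnum - 1.
exists_sql = """
SELECT t2.student, t2.date, t2.course, t2.flag, t2.rnum
FROM my_table t2
WHERE EXISTS (
    SELECT 1
    FROM my_table t1
    WHERE t1.course = 'TA' AND t1.flag = 1
      AND t1.student = t2.student AND t2.rnum = t1.rnum + 1
)
ORDER BY t2.student
"""

# Variant 2: look back one row per student with LAG and filter on it.
lag_sql = """
SELECT w.student, w.date, w.course, w.flag, w.rnum
FROM (
    SELECT t1.*,
           LAG(course, 1) OVER (PARTITION BY student ORDER BY rnum) AS prev_course,
           LAG(flag, 1)   OVER (PARTITION BY student ORDER BY rnum) AS prev_flag
    FROM my_table t1
) w
WHERE w.prev_course = 'TA' AND w.prev_flag = 1
ORDER BY w.student
"""

exists_rows = conn.execute(exists_sql).fetchall()
lag_rows = conn.execute(lag_sql).fetchall()
print(exists_rows)
print(lag_rows)
```

Student 3 is left out by both variants because the TA row for that student has a null flag.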
Adding the row number was a good start; you can use it to join the table with itself:
WITH matches AS (
SELECT
student,
rnum
FROM table
WHERE flag = 1
AND course = 'TA'
)
SELECT t.*
FROM table t
JOIN matches m
on t.student = m.student
and t.rnum = m.rnum + 1

SQL query join- unrelated tables

Can someone help me join two tables that don't have any primary or secondary keys? The sample tables are:
TABLE 1
| ID | NAME |
| 1 | x |
| 2 | Y |
| 3 | z |
TABLE 2
| Num | NAME | DATE |
| 52 | X | 12-aug-17 |
| 53 | X | 11-apr-17 |
| 62 | X | 10-aug-11 |
| 12 | y | 2-jan-16 |
| 23 | Y | 3-apr-18 |
I want to retrieve the data for X:
select *
from table2
where name = 'x';
| Num | NAME | DATE |
| 52 | X | 12-aug-17 |
| 53 | X | 11-apr-17 |
| 62 | X | 10-aug-11 |
This gives me three rows from table2, and I'm a little stuck after this step. I want to take the top row from table 2 and combine it with table 1.
The final output should be:
| ID | NAME | Num | DATE |
| 1 | x | 52 | 12-aug-17 |
Can someone suggest how I can join these tables? It's easy to join when there is a primary key, but that's not the case here.
Thanks
You can use this:
SELECT TOP(1) table1.ID, table2.Num, table2.Name, table2.DATE
FROM table2 INNER JOIN table1 ON table1.NAME = table2.NAME
WHERE table2.NAME = 'x'
ORDER BY table2.DATE DESC
or
SELECT table1.ID, table2.Num, table2.Name, table2.DATE
FROM table1 INNER JOIN
(SELECT TOP(1) * FROM table2 WHERE NAME = 'x' ORDER BY DATE DESC) table2
ON table1.NAME = table2.NAME
You need to get the maximum DATE using a subquery, as in:
select t1.id, t2.*
from table1 t1
join table2 t2 on t2.name = t1.name
where t2.date = (
select max(date) from table2 where name = 'x'
);
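As a rough check of the "latest row for one name" idea, here is a sketch using Python's sqlite3 module. Several things in it are assumptions rather than part of the original question: the dates are stored as ISO strings so that they sort correctly, names are compared case-insensitively with lower() because the sample mixes 'x' and 'X', and LIMIT 1 stands in for TOP(1), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (num INTEGER, name TEXT, date TEXT)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "x"), (2, "Y"), (3, "z")])
# Dates rewritten as ISO strings (assumption) so ORDER BY sorts them by time.
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?)",
    [
        (52, "X", "2017-08-12"),
        (53, "X", "2017-04-11"),
        (62, "X", "2011-08-10"),
        (12, "y", "2016-01-02"),
        (23, "Y", "2018-04-03"),
    ],
)

# Join on the name, keep only 'x', and take the most recent row.
latest_row = conn.execute(
    """
    SELECT t1.id, t1.name, t2.num, t2.date
    FROM table1 t1
    JOIN table2 t2 ON lower(t2.name) = lower(t1.name)
    WHERE lower(t1.name) = 'x'
    ORDER BY t2.date DESC
    LIMIT 1
    """
).fetchone()
print(latest_row)
```

Sorting in descending order is what makes the 12-aug-17 row come out on top, which is the row shown in the desired output.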

How to group results of a postgres table if any of the fields match?

I have a PostgreSQL table of records where every record has another record in that table that matches it on at least one of three possible fields.
id | name | email | phone | product
----------------------------------------------------
1 | Rob A | foo#bar.com | 123 | 102
2 | Rob B | foo#bar.com | 323 | 102
3 | Rob C | foo#bcr.com | 123 | 102
4 | Rob A | foo#bdr.com | 523 | 102
5 | Rob A | foo#bar.com | 123 | 104
6 | Cat A | liz#bar.com | 999 | 102
7 | Cat B | lid#bar.com | 999 | 102
8 | Cat A | lib#bar.com | 991 | 102
I want to group any records where the "product" matches and at least one of the other three fields (name, email, phone) matches too. So the groups would end up looking like:
id | name | email | phone | product
----------------------------------------------------
1 | Rob A | foo#bar.com | 123 | 102
2 | Rob B | foo#bar.com | 323 | 102
3 | Rob C | foo#bcr.com | 123 | 102
4 | Rob A | foo#bdr.com | 523 | 102

5 | Rob A | foo#bar.com | 123 | 104

6 | Cat A | liz#bar.com | 999 | 102
7 | Cat B | lid#bar.com | 999 | 102
8 | Cat A | lib#bar.com | 991 | 102
Is there any way to do this?
If we INNER JOIN the table with itself like
SELECT t1.id id1,
t2.id id2
FROM elbat t1
INNER JOIN elbat t2
ON t2.product = t1.product
AND (t2.name = t1.name
OR t2.email = t1.email
OR t2.phone = t1.phone)
AND t2.id > t1.id;
we get, in id1, the lowest ID of each "group" that has more than one row. For each id1, the other members of the "group" are in id2.
We can then join this result back to the table, so that each row is paired with the lowest ID of its "group". Rows that form a "group" on their own, and rows that already carry the lowest ID of their "group", won't find a partner row, so we have to LEFT JOIN. The "group" ID is then the joined lowest ID or, where no partner row was joined, the row's own ID, which coalesce() gives us.
SELECT coalesce(x.id1, t.id) groupid,
t.*
FROM elbat t
LEFT JOIN (SELECT t1.id id1,
t2.id id2
FROM elbat t1
INNER JOIN elbat t2
ON t2.product = t1.product
AND (t2.name = t1.name
OR t2.email = t1.email
OR t2.phone = t1.phone)
AND t2.id > t1.id) x
ON x.id2 = t.id
ORDER BY coalesce(x.id1, t.id);
As we also ordered by the "group" ID, we can sequentially traverse the result in any application and know, if the "group" ID changes, we're reading the first row of a new "group".
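The grouping query can be tried on the sample rows with a short script. This is a sketch only, run through Python's sqlite3 module rather than PostgreSQL, and it keeps the placeholder table name elbat used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elbat (id INTEGER, name TEXT, email TEXT, phone TEXT, product INTEGER)"
)
conn.executemany(
    "INSERT INTO elbat VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Rob A", "foo#bar.com", "123", 102),
        (2, "Rob B", "foo#bar.com", "323", 102),
        (3, "Rob C", "foo#bcr.com", "123", 102),
        (4, "Rob A", "foo#bdr.com", "523", 102),
        (5, "Rob A", "foo#bar.com", "123", 104),
        (6, "Cat A", "liz#bar.com", "999", 102),
        (7, "Cat B", "lid#bar.com", "999", 102),
        (8, "Cat A", "lib#bar.com", "991", 102),
    ],
)

# Pair each row with a lower-ID row of the same product that shares
# name, email or phone; rows without such a partner keep their own ID.
group_rows = conn.execute(
    """
    SELECT coalesce(x.id1, t.id) AS groupid, t.id
    FROM elbat t
    LEFT JOIN (SELECT t1.id AS id1, t2.id AS id2
               FROM elbat t1
               INNER JOIN elbat t2
                       ON t2.product = t1.product
                      AND (t2.name = t1.name
                           OR t2.email = t1.email
                           OR t2.phone = t1.phone)
                      AND t2.id > t1.id) x
           ON x.id2 = t.id
    ORDER BY coalesce(x.id1, t.id), t.id
    """
).fetchall()
print(group_rows)
```

On this sample each row matches at most one lower-ID row. With other data, a row that matches several lower-ID rows would appear once per match, and rows linked only through a chain of matches would not be merged into one "group", so the result is worth checking against the real table.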

How can I write a select statement for this use case?

Please help me compose a SELECT statement. I have these two tables:
Table1 Table2
---------------- ------------------------------------------------
ID | PName | | ID | NameID | DateActive | HoursActive |
---------------- ------------------------------------------------
1 | Neil | | 1 | 1 | 8/2/2013 | 3 |
2 | Mark | | 2 | 1 | 8/3/2013 | 4 |
3 | Onin | | 3 | 2 | 8/2/2013 | 2 |
---------------- | 4 | 2 | 8/6/2013 | 5 |
| 5 | 3 | 8/7/2013 | 1 |
| 6 | 3 | 8/8/2013 | 10 |
------------------------------------------------
And I just want to retrieve the earliest DateActive for each PName, with no duplicate PName. Like this:
PName | DateActive | HoursActive |
----------------------------------------
Neil | 8/2/2013 | 3 |
Mark | 8/2/2013 | 2 |
Onin | 8/7/2013 | 1 |
----------------------------------------
Something like this might do it. You need to find the min date for each NameID first, then join back to the table to get the hours.
SELECT
PName, t3.mindate as DateActive, HoursActive
From
Table1 t1
inner Join Table2 t2 on t1.ID = t2.NameID
Inner Join (Select min(DateActive) as mindate, NameID from Table2 Group by NameID) as t3 on t3.mindate = t2.DateActive and t3.NameID = t2.NameID
This should be a pretty standard solution:
select t.pname,
t2.dateactive,
t2.hoursactive
from table1 t
join table2 t2 on t.id = t2.nameid
join (
select nameid, min(dateactive) mindateactive
from table2
group by nameid
) t3 on t2.nameid = t3.nameid
and t3.mindateactive = t2.dateactive
If you are using an RDBMS that supports window functions (the OVER (PARTITION BY ...) clause), then this may be more efficient:
select pname, dateactive, HoursActive
from (
select t.pname,
t2.dateactive,
t2.hoursactive,
rank() over (partition by t.id order by t2.dateactive) rownum
from table1 t
join table2 t2 on t.id = t2.nameid
) t
where rownum = 1
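The "min date per NameID, then join back" approach can be verified on the sample rows. The following is a sketch using Python's sqlite3 module; storing the dates as ISO strings is an assumption made so that min() compares them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, pname TEXT)")
conn.execute(
    "CREATE TABLE table2 (id INTEGER, nameid INTEGER, dateactive TEXT, hoursactive INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)", [(1, "Neil"), (2, "Mark"), (3, "Onin")]
)
# Dates rewritten as ISO strings (assumption) so min() orders them by time.
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2013-08-02", 3),
        (2, 1, "2013-08-03", 4),
        (3, 2, "2013-08-02", 2),
        (4, 2, "2013-08-06", 5),
        (5, 3, "2013-08-07", 1),
        (6, 3, "2013-08-08", 10),
    ],
)

# t3 finds the earliest date per nameid; joining back on both nameid
# and that date picks up the matching hoursactive.
earliest_rows = conn.execute(
    """
    SELECT t.pname, t2.dateactive, t2.hoursactive
    FROM table1 t
    JOIN table2 t2 ON t.id = t2.nameid
    JOIN (SELECT nameid, min(dateactive) AS mindateactive
          FROM table2
          GROUP BY nameid) t3
      ON t2.nameid = t3.nameid
     AND t3.mindateactive = t2.dateactive
    ORDER BY t.id
    """
).fetchall()
print(earliest_rows)
```

If a name had two rows on its earliest date, both would be returned, so an extra tie-breaker would be needed in that case.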

Is there a workaround to the Oracle Correlated Subquery Nesting Limit?

I have a situation where I'm trying to use a correlated subquery but am running into the nesting limit in Oracle. I might be missing another feature that Oracle has, so I thought I'd post this question here. Does anyone know how to rewrite the below SQL without running into this nesting limit, but also staying within the below constraints?
Constraints:
Only the SQL in the IN clause can be modified (Due to constraints beyond my control)
As shown, the filtering in the parent query needs to be applied to the aggregation subquery before the aggregation occurs.
Filter out 0 on an aggregation of colB after the parent filter is applied
The below code shows my try at this before running into the Oracle limit. Also, the Oracle version I'm on is 11.2.0.2. Any help would be appreciated. Thanks!
SELECT
*
FROM
table1 t1
WHERE
t1.colA BETWEEN XXXX AND XXXX
AND t1.pk_id IN (
SELECT
t2.pk_id
FROM (
SELECT
t3.pk_id,
SUM(t3.amt) OVER (PARTITION BY t3.colB) amt
FROM table1 t3
WHERE t3.colA = t1.colA
) t2
WHERE
t2.amt <> 0
)
Here are some sample input/outputs of what I was looking for when running the above SQL:
Sample table1:
-----------------------------
| pk_id | colA | colB | amt |
-----------------------------
| 1 | 1 | A | 2 |
| 2 | 1 | A | -1 |
| 3 | 1 | B | 1 |
| 4 | 2 | B | 1 |
| 5 | 2 | A | -2 |
| 6 | 2 | A | 1 |
| 7 | 3 | A | 1 |
Results of SUM over t3.colB with t1.colA BETWEEN 1 And 2:
---------------
| pk_id | amt |
---------------
| 1 | 0 |
| 2 | 0 |
| 3 | 2 |
| 4 | 2 |
| 5 | 0 |
| 6 | 0 |
Results of subquery for IN clause with t1.colA BETWEEN 1 And 2:
---------
| pk_id |
---------
| 3 |
| 4 |
Result of top level query with t1.colA BETWEEN 1 And 2:
-----------------------------
| pk_id | colA | colB | amt |
-----------------------------
| 3 | 1 | B | 1 |
| 4 | 2 | B | 1 |
After working through some of the answers provided, I have a way of avoiding the nesting limit in Oracle with a simple CASE statement:
SELECT
*
FROM
table1 t1
WHERE
t1.colA BETWEEN 1 AND 2
AND t1.pk_id IN (
SELECT
CASE
WHEN SUM(t2.amt) OVER (PARTITION BY t2.colB) <> 0 THEN t2.pk_id
ELSE NULL
END
FROM table1 t2
WHERE t2.colA = t1.colA
)
Unfortunately this surfaced the real problem. Because this is a correlated subquery, it only sees one value of the t1.colA range at a time. This appears to make it impossible to execute the analytic sum across that whole range in the subquery. Because I can only modify the SQL within the IN clause, I don't see a solution to this problem. If anyone has any suggestions please let me know. Thanks.
If you know what the between values are and can use those in your subquery, then you can add that to your subquery instead:
SELECT
*
FROM
table1 t1
WHERE
t1.colA BETWEEN 1 AND 2
AND t1.pk_id IN (
SELECT
t2.pk_id
FROM
(
SELECT
t3.pk_id,
SUM(t3.amt) OVER (PARTITION BY t3.colB) amt
FROM table1 t3
WHERE t3.colA BETWEEN 1 AND 2
) t2
WHERE
t2.amt <> 0
)
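To check that repeating the range inside the subquery gives the results listed in the question, here is a minimal sketch run against the sample table1. It assumes Python's sqlite3 module with SQLite 3.25 or later for the analytic SUM, so it illustrates the logic rather than Oracle's nesting behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (pk_id INTEGER, colA INTEGER, colB TEXT, amt INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, 1, "A", 2),
        (2, 1, "A", -1),
        (3, 1, "B", 1),
        (4, 2, "B", 1),
        (5, 2, "A", -2),
        (6, 2, "A", 1),
        (7, 3, "A", 1),
    ],
)

# The subquery is not correlated: it applies the same colA range itself,
# sums amt per colB over that range, and keeps the non-zero groups.
filtered_rows = conn.execute(
    """
    SELECT t1.pk_id, t1.colA, t1.colB, t1.amt
    FROM table1 t1
    WHERE t1.colA BETWEEN 1 AND 2
      AND t1.pk_id IN (
          SELECT t2.pk_id
          FROM (
              SELECT t3.pk_id,
                     SUM(t3.amt) OVER (PARTITION BY t3.colB) AS amt
              FROM table1 t3
              WHERE t3.colA BETWEEN 1 AND 2
          ) t2
          WHERE t2.amt <> 0
      )
    ORDER BY t1.pk_id
    """
).fetchall()
print(filtered_rows)
```

Within the range, colB = 'A' sums to 0 and colB = 'B' sums to 2, so only the two 'B' rows survive.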
You can rewrite your query like this:
SELECT *
FROM table1 t1
WHERE t1.colA BETWEEN XXXX AND XXXX and
t1.pk_id IN (
SELECT t2.pk_id
FROM (SELECT t3.pk_id, t3.ColA, SUM(t3.amt) as amt
FROM table1 t3
group by t3.pk_id, t3.ColA
having sum(t3.amt) <> 0
) t2
WHERE t2.colA = t1.colA
)
From here, you can rewrite it as:
select t1.*
from table1 t1 join
(SELECT t3.pk_id, t3.ColA, SUM(t3.amt) as amt
FROM table1 t3
group by t3.pk_id, t3.ColA
having sum(t3.amt) <> 0
) t2
on t1.pk_id = t2.pk_id and t1.ColA = t2.ColA
WHERE t1.colA BETWEEN XXXX AND XXXX