Self join for two same rows with one different column

Self join for two same rows with one different column - sql

Hi I have table with the following data
A B bid status
10 20 1 SUCCESS_1
10 20 1 SUCCESS_2
10 30 2 SUCCESS_1
10 30 2 SUCCESS_2
Now I want to print or count above rows based on SUCCESS_1 and SUCCESS_2. I created the following query but it does not work it just returns one row by combining two rows.
select * from tbl t1 join tbl t2 on
on (t1.A=t2.A and t1.B=t2.B and
(t1.Status = 'SUCCESS_1' and t2.Status = 'SUCCESS_2')
where t1.bid= 1
I want output as the following for the above query
A B bid status
10 20 1 SUCCESS_1
10 20 1 SUCCESS_2
I am new to SQL please guide. Thanks in advance.

If you need to do the join for some reason (e.g. your database does not let you select everything if you group by 1 column, because it wants everything projected to either be grouped or be an aggregate), you could do the following:
select t1.*
from tbl t1 join tbl t2
on (t1.A=t2.A and t1.B=t2.B and t1.Status = 'SUCCESS_1' and t2.Status = 'SUCCESS_2')
where t1.bid= 1
union all select t2.*
from tbl t1 join tbl t2
on (t1.A=t2.A and t1.B=t2.B and t1.Status = 'SUCCESS_1' and t2.Status = 'SUCCESS_2')
where t1.bid= 1
order by 1,2,3,4
Your original query is pulling back all the data in one row, but this one pulls back the two rows that make that resulting join row separately.

SELECT * FROM `tbl1` WHERE `bid`=1 GROUP BY `status`

Related

counting totals after left join and requiring 0 for a NULL variable - SQL Server

I am using SQL Server Management Studio 2012 and I am running the following query:
SELECT T1.ID, COUNT(DISTINCT T2.APPOINTMENT_DATE) AS [TOTAL_APPOINTMENTS]
FROM T1
LEFT JOIN T2
ON T1.ID = T2.ID
WHERE T2.APPOINTMENT_DATE > '2019-01-01' AND T2.APPOINTMENT_DATE < '2020-01-01'
AND (T1.ID = 1 OR T1.ID = 2 OR T1.ID = 3)
I would like the total number of appointments for these 3 individuals for now. Then, I will include everyone in Table 1. Table 1 gives me the ID (one row per individual), Table 2 gives me all appointments across different days per individual.
The results I get are:
ID TOTAL_APPOINTMENTS
1 12
2 3
But I would like:
ID TOTAL_APPOINTMENTS
1 12
2 3
3 0
Can you please advise?

Move the WHERE conditions on the second table to the ON clause:
SELECT T1.ID, COUNT(DISTINCT T2.APPOINTMENT_DATE) AS [TOTAL_APPOINTMENTS]
FROM T1 LEFT JOIN
T2
ON T1.ID = T2.ID AND
T2.APPOINTMENT_DATE > '2019-01-01' AND
T2.APPOINTMENT_DATE < '2020-01-01'
WHERE T1.ID IN (1, 2, 3);
Note that the conditions on the first table remain in the WHERE clause. Also, IN is simpler than a bunch of OR conditions.

SQL Subquery Join and Sum

I have Table 1 & 2 with common Column name ID in both.
Table 1 has duplicate entries of rows which I was able to trim using:
SELECT DISTINCT
Table 2 has duplicate numeric entries(dollarspent) for ID's which I needed and was able to sum up:
Table 1 Table 2
------------ ------------------
ID spec ID Dol1 Dol2
54 A 54 1 0
54 A 54 2 1
55 B 55 0 2
56 C 55 3 0
-I need to join these two queries into one so I get a resultant JOIN of Table 1 & Table 2 ON column ID, (a) without duplicates in Table 1 & (b) Summed $ values from Table 2
For eg:
NewTable
----------------------------------------
ID Spec Dol1 Dol2
54 A 3 1
55 B 3 2
Notes : No. of rows in Table 1 and 2 are not the same.
Thanks

Use a derived table to get the distinct values from table1 and simply join to table 2 and use aggregation.
The issue you have is you have a M:M relationship between table1 and table2. You need it to be a 1:M for the summations to be accurate. Thus we derive t1 from table1 by using a select distinct to give us the unique records in the 1:M relationship (assuming specs are same for each ID)
SELECT T1.ID, T1.Spec, Sum(T2.Dol1) as Dol1, sum(T2.Dol2) as Dol2
FROM (SELECT distinct ID, spec
FROM table1) T1
INNER JOIN table2 T2
on t2.ID = T1.ID
GROUP BY T1.ID, T1.Spec
This does assume you only want records that exist in both. Otherwise we may need to use an (LEFT, RIGHT, or FULL) outer join; depending on desired results.

I can't really see your data, but you might want to try:
SELECT DISTINCT ID
FROM TblOne
UNION ALL
SELECT DISTINCT ID, SUM(Dol)
FROM TblTwo
GROUP BY ID

Pre-aggregate table 2 and then join:
select t1.id, t1.spec, t2.dol1, t2.dol2
from (select t2.id, sum(dol1) as dol1, sum(dol2) as dol2
from table2 t2
group by t2.id
) t2 join
(select distinct t1.id, t1.spec
from table1 t1
) t1
on t1.id = t2.id;
For your data examples, you don't need to pre-aggregate table 2. This gives the correct sums -- albeit in multiple rows -- if table1 has multiple specs for a given id.

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
that means, I want unique ID, its values and Repeats. Repeats value should display NA when there are multiple values against single ID and it should display the Repeats value in case there is single value for repeats.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
and this returns the correct result but when trying to implement the CASE it fails to return unique ID results. Please help!!

One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
repeats is not null.

For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.

SQL Server Return Rows Where Field Changed

I have a table with 3 values.
ID AuditDateTime UpdateType
12 12-15-2015 18:09 1
45 12-04-2015 17:41 0
75 12-21-2015 04:26 0
12 12-17-2015 07:43 0
35 12-01-2015 05:36 1
45 12-15-2015 04:35 0
I'm trying to return only records where the UpdateType has changed from AuditDateTime based on the IDs. So in this example, ID 12 changes from the 12-15 entry to the 12-17 entry. I would want that record returned. There will be multiple instances of ID 12, and I need all records returned where an ID's UpdateType has changed from its previous entry. I tried adding a row_number but it didn't insert sequentially because the records are not in the table in order. I've done a ton of searching with no luck. Any help would be greatly appreciated.

By using a CTE it is possible to find the previous record based upon the order of the AuditDateTime
WITH CTEData AS
(SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY AuditDateTime) [ROWNUM], *
FROM #tmpTable)
SELECT A.ID, A.AuditDateTime, A.UpdateType
FROM CTEData A INNER JOIN CTEData B
ON (A.ROWNUM - 1) = B.ROWNUM AND
A.ID = B.ID
WHERE A.UpdateType <> B.UpdateType
The Inner Join back onto the CTE will give in one query both the current record (Table Alias A) and previous row (Table Alias B).

This should do what you're trying to do I believe
SELECT
T1.ID,
T1.AuditDateTime,
T1.UpdateType
FROM
dbo.My_Table T1
INNER JOIN dbo.My_Table T2 ON
T2.ID = T1.ID AND
T2.UpdateType <> T1.UpdateType AND
T2.AuditDateTime < T1.AuditDateTime
LEFT OUTER JOIN dbo.My_Table T3 ON
T3.ID = T1.ID AND
T3.AuditDateTime < T1.AuditDateTime AND
T3.AuditDateTime > T2.AuditDateTime
WHERE
T3.ID IS NULL
Alternatively:
SELECT
T1.ID,
T1.AuditDateTime,
T1.UpdateType
FROM
dbo.My_Table T1
INNER JOIN dbo.My_Table T2 ON
T2.ID = T1.ID AND
T2.UpdateType <> T1.UpdateType AND
T2.AuditDateTime < T1.AuditDateTime
WHERE
NOT EXISTS
(
SELECT *
FROM
dbo.My_Table T3
WHERE
T3.ID = T1.ID AND
T3.AuditDateTime < T1.AuditDateTime AND
T3.AuditDateTime > T2.AuditDateTime
)
The basic gist of both queries is that you're looking for rows where an earlier row had a different type and no other rows exist between the two rows (hence, they're sequential). Both queries are logically identical, but might have differing performance.
Also, these queries assume that no two rows will have identical audit times. If that's not the case then you'll need to define what you expect to get when that happens.

You can use the lag() window function to find the previous value for the same ID. Now you can pick only those rows that introduce a change:
select *
from (
select lag(UpdateType) over (
partition by ID
order by AuditDateTime) as prev_updatetype
, *
from YourTable
) sub
where prev_updatetype <> updatetype
Example at SQL Fiddle.

Oracle: Check if rows exist in other table

I've got a query joining several tables and returning quite a few columns.
An indexed column of another table references the PK of one of these joined tables. Now I would like to add another column to the query that states if at least one row with that ID exists in the new table.
So if I have one of the old tables
ID
1
2
3
and the new table
REF_ID
1
1
1
3
then I'd like to get
ID REF_EXISTS
1 1
2 0
3 1
I can think of several ways to do that, but what is the most elegant/efficient one?
EDIT
I tested the performance of the queries provided with 50.000 records in the old table, every other record matched by two rows in the new table, so half of the records have REF_EXISTS=1.
I'm adding average results as comments to the answers in case anyone is interested. Thanks everyone!

Another option:
select O.ID
, case when N.ref_id is not null then 1 else 0 end as ref_exists
from old_table o
left outer join (select distinct ref_id from new_table) N
on O.id = N.ref_id

I would:
select distinct ID,
case when exists (select 1 from REF_TABLE where ID_TABLE.ID = REF_TABLE.REF_ID)
then 1 else 0 end
from ID_TABLE
Provided you have indexes on the PK and FK you will get away with a table scan and index lookups.
Regards
K

Use:
SELECT DISTINCT t1.id,
CASE WHEN t2.ref_id IS NULL THEN 0 ELSE 1 END AS REF_EXISTS
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.ref_id = t1.id
Added DISTINCT to ensure only unique rows are displayed.

A join could return multiple rows for one id, as it does for id=1 in the example data. You can limit it to one row per id with a group by:
SELECT
t1.id
, COUNT(DISTINCT t2.ref_id) as REF_EXISTS
FROM TABLE_1 t1
LEFT JOIN TABLE_2 t2 ON t2.ref_id = t1.id
GROUP BY t1.id
The group by ensures there's only one row per id. And count(distinct t2.ref_id) will be 1 if a row is found and 0 otherwise.
EDIT: You can rewrite it without a group by, but I doubt that will make things easer:
SELECT
t1.id
, CASE WHEN EXISTS (
SELECT * FROM TABLE_2 t2 WHERE t2.ref_id = t1.id)
THEN 1 ELSE 0 END as REF_EXISTS
, ....
FROM TABLE_1 t1

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Self join for two same rows with one different column - sql

SELECT * FROM `tbl1` WHERE `bid`=1 GROUP BY `status`

Related

counting totals after left join and requiring 0 for a NULL variable - SQL Server

SQL Subquery Join and Sum

Querying two tables to filter data using select case

SQL Server Return Rows Where Field Changed

Oracle: Check if rows exist in other table

Categories

Resources