SQL select 1 to many within the same row - sql

I have a table with 1 record, which then ties back to a secondary table which can contain either no match, 1 match, or 2 matches.
I need to fetch the corresponding records and display them within the same row which would be easy using left join if I just had 1 or no matches to tie back, however, because I can get 2 matches, it produces 2 records.
Example with 1 match:
Select T1.ID, T1.Person1, T2.Owner
From T1
Left Join T2
ON T1.ID = T2.MatchID
Output
ID Person1 Owner1
----------------------
1 John Frank
Example with 2 match:
Select T1.ID, T1.Person1, T2.Owner
From T1
Left Join T2
ON T1.ID = T2.MatchID
Output
ID Person1 Owner
----------------------
1 John Frank
1 John Peter
Is there a way I can formulate my select so that my output would reflect the following When I have 2 matches:
ID Person1 Owner1 Owner2
-------------------------------
1 John Frank Peter
I explored Oracle Pivots a bit, however couldn't find a way to make this work. Also explored the possibility of using left join on the same table twice using MIN() and MAX() when fetching the matches, however I can only see myself resorting this as a "no other option" scenario.
Any suggestions?
** EDIT **
#ughai - Using CTE does address the issue to some extent, however when attempting to retrieve all of the records, the details derived from this common table isn't showing any records on the LEFT JOIN unless I specify the "MatchID" (CASE_MBR_KEY) value, meaning by removing the "where" clause, my outer joins produce no records, even though the CASE_MBR_KEY values are there in the CTE data.
WITH CTE AS
(
SELECT TEMP.BEAS_KEY,
TEMP.CASE_MBR_KEY,
TEMP.FULLNAME,
TEMP.BIRTHDT,
TEMP.LINE1,
TEMP.LINE2,
TEMP.LINE3,
TEMP.CITY,
TEMP.STATE,
TEMP.POSTCD,
ROW_NUMBER()
OVER(ORDER BY TEMP.BEAS_KEY) R
FROM TMP_BEN_ASSIGNEES TEMP
--WHERE TEMP.CASE_MBR_KEY = 4117398
)
The reason for this is because the ROW_NUMBER value, given the amount of records won't necessarily be 1 or 2, so I attempted the following, but getting ORA-01799: a column may not be outer-joined to a subquery
--// BEN ASSIGNEE 1
LEFT JOIN CTE BASS1
ON BASS1.CASE_MBR_KEY = C.CASE_MBR_KEY
AND BASS1.R IN (SELECT min(R) FROM CTE A WHERE A.CASE_MBR_KEY = C.CASE_MBR_KEY)
--// END BA1
--// BEN ASSIGNEE 2
LEFT JOIN CTE BASS2
ON BASS2.CASE_MBR_KEY = C.CASE_MBR_KEY
AND BASS2.R IN (SELECT MAX(R) FROM CTE B WHERE B.CASE_MBR_KEY = C.CASE_MBR_KEY)
--// END BA2
** EDIT 2 **
Fixed the above issue by moving the Row number clause to the "Where" portion of the query instead of within the JOIN clause. Seems to work now.

You can use CTE with ROW_NUMBER() with 2 LEFT JOIN OR with PIVOT like this.
SQL Fiddle
Query with Multiple Left Joins
WITH CTE as
(
SELECT MatchID,Owner,ROW_NUMBER()OVER(ORDER BY Owner) r FROM t2
)
select T1.ID, T1.Person, t2.Owner as Owner1, t3.Owner as Owner2
FROM T1
LEFT JOIN CTE T2
ON T1.ID = T2.MatchID AND T2.r = 1
LEFT JOIN CTE T3
ON T1.id = T3.MatchID AND T3.r = 2;
Query with PIVOT
WITH CTE as
(
SELECT MatchID,Owner,ROW_NUMBER()OVER(ORDER BY Owner) R FROM t2
)
SELECT ID, Person,O1,O2
FROM T1
LEFT JOIN CTE T2
ON T1.ID = T2.MatchID
PIVOT(MAX(Owner) FOR R IN (1 as O1,2 as O2));
Output
ID PERSON OWNER1 OWNER2
1 John Maxwell Peter

If you know there are at most two matches, you can also use aggregation:
Select T1.ID, T1.Person1,
MIN(T2.Owner) as Owner1,
(CASE WHEN MIN(t2.Owner) <> MAX(t2.Owner) THEN MAX(t2.Owner) END) as Owner2
From T1 Left Join
T2
on T1.ID = T2.MatchID
Group By t1.ID, t1.Person1;

Related

Joining and grouping to equate on two tables

I've tried to minify this problem as much as possible. I've got two tables which share some Id's (among other columns)
id id
---- ----
1 1
1 1
2 1
2
2
Firstly, I can get each table to resolve to a simple count of how many of each Id there is:
select id, count(*) from tbl1 group by id
select id, count(*) from tbl2 group by id
id | tbl1-count id | tbl2-count
--------------- ---------------
1 2 1 3
2 1 2 2
but then I'm at a loss, I'm trying to get the following output which shows the count from tbl2 for each id, divided by the count from tbl1 for the same id:
id | count of id in tbl2 / count of id in tbl1
==========
1 | 1.5
2 | 2
So far I've got this:
select tbl1.Id, tbl2.Id, count(*)
from tbl1
join tbl2 on tbl1.Id = tbl2.Id
group by tbl1.Id, tbl2.Id
which just gives me... well... something nowhere near what I need, to be honest! I was trying count(tbl1.Id), count(tbl2.Id) but get the same multiplied amount (because I'm joining I guess?) - I can't get the individual representations into individual columns where I can do the division.
This gives consideration to your naming of tables -- the query from tbl2 needs to be first so the results will include all records from tbl2. The LEFT JOIN will include all results from the first query, but only join those results that exist in tbl1. (Alternatively, you could use a FULL OUTER JOIN or UNION both results together in the first query.) I also added an IIF to give you an option if there are no records in tbl1 (dividing by null would produce null anyway, but you can do what you want).
Counts are cast as decimal so that the ratio will be returned as a decimal. You can adjust precision as required.
SELECT tb2.id, tb2.table2Count, tb1.table1Count,
IIF(ISNULL(tb1.table1Count, 0) != 0, tb2.table2Count / tb1.table1Count, null) AS ratio
FROM (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table2Count
FROM tbl2
GROUP BY id
) AS tb2
LEFT JOIN (
SELECT id, CAST(COUNT(1) AS DECIMAL(18, 5)) AS table1Count
FROM tbl1
GROUP BY id
) AS tb1 ON tb1.id = tb2.id
(A subqquery with a LEFT JOIN will allow the query optimizer to determine how to generate the results and will generally outperform a CROSS APPLY, as that executes a calculation for every record.)
Assuming your expected results are wrong, then this is how I would do it:
CREATE TABLE T1 (ID int);
CREATE TABLE T2 (ID int);
GO
INSERT INTO T1 VALUES(1),(1),(2);
INSERT INTO T2 VALUES(1),(1),(1),(2),(2);
GO
SELECT T1.ID AS OutID,
(T2.T2Count * 1.) / COUNT(T1.ID) AS OutCount --Might want a CONVERT to a smaller scale and precision decimal here
FROM T1
CROSS APPLY (SELECT T2.ID, COUNT(T2.ID) AS T2Count
FROM T2
WHERE T2.ID = T1.ID
GROUP BY T2.ID) T2
GROUP BY T1.ID,
T2.T2Count;
GO
DROP TABLE T1;
DROP TABLE T2;
You can aggregate in subqueries and then join:
select t1.id, t2.cnt * 1.0 / t1.cnt
from (select id, count(*) as cnt
from tbl1
group by id
) t1 join
(select id, count(*) as cnt
from tbl2
group by id
) t2
on t1.id = t2.id

Querying two tables to filter data using select case

I have two tables
Table 1 looks like this
ID Repeats
-----------
A 1
A 1
A 0
B 2
B 2
C 2
D 1
Table 2 looks like this
ID values
-----------
A 100
B 200
C 100
D 300
Using a view I need a result like this
ID values Repeats
-------------------
A 100 NA
B 200 2
C 100 2
D 300 1
that means, I want unique ID, its values and Repeats. Repeats value should display NA when there are multiple values against single ID and it should display the Repeats value in case there is single value for repeats.
Initially I needed to display the max value of repeats so I tried the following view
ALTER VIEW [dbo].[BookingView1]
AS
SELECT bv.*, bd2.Repeats FROM Table1 bv
JOIN
(
SELECT distinct bd.id, bd.Repeats FROM table2 bd
JOIN
(
SELECT Id, MAX(Repeats) AS MaxRepeatCount
FROM table2
GROUP BY Id
) bd1
ON bd.Id = bd1.Id
AND bd.Repeats = bd1.MaxRepeatCount
) bd2
ON bv.Id = bd2.Id;
and this returns the correct result but when trying to implement the CASE it fails to return unique ID results. Please help!!
One method uses outer apply:
select t2.*, t1.repeats
from table2 t2 outer apply
(select (case when max(repeats) = min(repeats) then max(repeats)
else 'NA'
end) as repeats
from table1 t1
where t1.id = t2.id
) t1;
Two notes:
This assumes that repeats is a string. If it is a number, you need to cast it to a string.
repeats is not null.
For the sake of completeness, I'm including another approach that will work if repeats is NULL. However, Gordon's answer has a much simpler query plan and should be preferred.
Option 1 (Works with NULLs):
SELECT
t1.ID, t2.[Values],
CASE
WHEN COUNT(*) > 1 THEN 'NA'
ELSE CAST(MAX(Repeats) AS VARCHAR(2))
END Repeats
FROM (
SELECT DISTINCT t1.ID, t1.Repeats
FROM #table1 t1
) t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values]
Option 2 (does not contain explicit subqueries, but does not work with NULLs):
SELECT DISTINCT
t1.ID,
t2.[Values],
CASE
WHEN COUNT(t1.Repeats) OVER (PARTITION BY COUNT(DISTINCT t1.Repeats), t1.ID) > 1 THEN 'NA'
ELSE CAST(t1.Repeats AS VARCHAR(2))
END Repeats
FROM #table1 t1
LEFT OUTER JOIN #table2 t2
ON t1.ID = t2.ID
GROUP BY t1.ID, t2.[Values], t1.Repeats
NOTE:
This may not give desired results if table2 has different values for the same ID.

SQL Server Return Rows Where Field Changed

I have a table with 3 values.
ID AuditDateTime UpdateType
12 12-15-2015 18:09 1
45 12-04-2015 17:41 0
75 12-21-2015 04:26 0
12 12-17-2015 07:43 0
35 12-01-2015 05:36 1
45 12-15-2015 04:35 0
I'm trying to return only records where the UpdateType has changed from AuditDateTime based on the IDs. So in this example, ID 12 changes from the 12-15 entry to the 12-17 entry. I would want that record returned. There will be multiple instances of ID 12, and I need all records returned where an ID's UpdateType has changed from its previous entry. I tried adding a row_number but it didn't insert sequentially because the records are not in the table in order. I've done a ton of searching with no luck. Any help would be greatly appreciated.
By using a CTE it is possible to find the previous record based upon the order of the AuditDateTime
WITH CTEData AS
(SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY AuditDateTime) [ROWNUM], *
FROM #tmpTable)
SELECT A.ID, A.AuditDateTime, A.UpdateType
FROM CTEData A INNER JOIN CTEData B
ON (A.ROWNUM - 1) = B.ROWNUM AND
A.ID = B.ID
WHERE A.UpdateType <> B.UpdateType
The Inner Join back onto the CTE will give in one query both the current record (Table Alias A) and previous row (Table Alias B).
This should do what you're trying to do I believe
SELECT
T1.ID,
T1.AuditDateTime,
T1.UpdateType
FROM
dbo.My_Table T1
INNER JOIN dbo.My_Table T2 ON
T2.ID = T1.ID AND
T2.UpdateType <> T1.UpdateType AND
T2.AuditDateTime < T1.AuditDateTime
LEFT OUTER JOIN dbo.My_Table T3 ON
T3.ID = T1.ID AND
T3.AuditDateTime < T1.AuditDateTime AND
T3.AuditDateTime > T2.AuditDateTime
WHERE
T3.ID IS NULL
Alternatively:
SELECT
T1.ID,
T1.AuditDateTime,
T1.UpdateType
FROM
dbo.My_Table T1
INNER JOIN dbo.My_Table T2 ON
T2.ID = T1.ID AND
T2.UpdateType <> T1.UpdateType AND
T2.AuditDateTime < T1.AuditDateTime
WHERE
NOT EXISTS
(
SELECT *
FROM
dbo.My_Table T3
WHERE
T3.ID = T1.ID AND
T3.AuditDateTime < T1.AuditDateTime AND
T3.AuditDateTime > T2.AuditDateTime
)
The basic gist of both queries is that you're looking for rows where an earlier row had a different type and no other rows exist between the two rows (hence, they're sequential). Both queries are logically identical, but might have differing performance.
Also, these queries assume that no two rows will have identical audit times. If that's not the case then you'll need to define what you expect to get when that happens.
You can use the lag() window function to find the previous value for the same ID. Now you can pick only those rows that introduce a change:
select *
from (
select lag(UpdateType) over (
partition by ID
order by AuditDateTime) as prev_updatetype
, *
from YourTable
) sub
where prev_updatetype <> updatetype
Example at SQL Fiddle.

SQL - get max result

Assume there is a table name "test" below:
name value
n1 1
n2 2
n3 3
Now, I want to get the name which has the max value, I have some solution below:
Solution 1:
SELECT TOP 1 name
FROM test
ORDER BY value DESC
solution 2:
SELECT name
FROM test
WHERE value = (SELECT MAX(value) FROM test);
Now, I hope use join operation to find the result, like
SELECT name
FROM test
INNER JOIN test ON...
Could someone please help and explain how it works?
If you are looking for JOIN then
SELECT T.name, T.value
FROM test T
INNER JOIN
( SELECT T1.name, T1.value ,
RANK() OVER (PARTITION BY T1.name ORDER BY T1.value) N
FROM test T1
WHERE T1.value IN (SELECT MAX(t2.value) FROM test T2)
)T3 ON T3.N = 1 AND T.name = T3.name
FIDDLE DEMO
or
select name, value
from
(
select name, value,
row_number() over(order by value desc) rn
from test
) src
where rn = 1
FIDDLE DEMO
First, note that solutions 1 and 2 could give different results when value is not unique. If in your test data there would be an additional record ('n4', 3), then solution 1 would return either 'n3' or 'n4', but solution 2 would return both.
A solution with JOIN will need aliases for the table, because as you started of, the engine would say Ambiguous column name 'name'.: it would not know whether to take name from the first or second occurrence of the test table.
Here is a way to complete the JOIN version:
SELECT t1.name
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
WHERE t2.value IS NULL;
This query takes each of the records, and checks if any records exist that have a higher value. If not, the first record will be in the result. Note the use of LEFT: this denotes an outer join, so that records from t1 that have no match with t2 -- based on the ON condition -- are not immediately rejected (as would be the case with INNER): in fact, we want to reject all the other records, which is done with the WHERE clause.
A way to understand this mechanism, is to look at a variant of the query above, which lacks the WHERE clause and returns the values of both tables:
SELECT t1.value, t2.value
FROM test t1
LEFT JOIN test t2
ON t2.value > t1.value
On your test data this will return:
t1.value t2.value
1 2
1 3
2 3
3 (null)
Note that the last entry would not be there if the join where an INNER JOIN. But with the outer join, one can now look for the NULL values and actually get those records in the result that would be excluded from an INNER JOIN.
Note that this query will give the same result as solution 2 when there are duplicate values. If you want to have also only one result like with solution 1, it suffices to add TOP 1 after SELECT.
Here is a fiddle.
Alternative with pure INNER JOIN
If you really want an INNER join, then this will do it. Again the TOP 1 is only needed if you have non-unique values:
SELECT TOP 1 t1.name
FROM test t1
INNER JOIN (SELECT Max(value) AS value FROM test) t2
ON t2.value = t1.value;
But this one really is very similar to what you did in solution 2. Here is fiddle for it.

SQL Server - Setting multiple columns from another table

I have two tables.
Table 1
ID Code1 Code2 Code3
10 1.1 1.2 1.3
Table 2
Code Group Category
1.1 a cat1
1.2 b cat1
1.3 c cat2
1.4 d cat3
Now I need to get the outputs in two different forms from these two tables tables
Output 1
ID Group1 Group2 Group3
10 a b c
Output 2
ID cat1 cat2 cat3
10 1 1 0
Here the cat1, cat2, cat3 columns are Boolean in nature since the table 1 did not have any code corresponding to cat3 so the value for this is 0.
I was thinking of doing this with case statements but there are about 1000 codes mapped to about 50 categories. Is their a way to do this? I am struggling to come up with a query for this.
First off, I strongly suggest you look into an alternative. This will get messy very fast, as you're essentially treating rows as columns. It doesn't help much that Table1 is already denormalized - though if it really only has 3 columns, it's not that big of a deal to normalize it again.:
CREATE VIEW v_Table1 AS
SELECT Id, Code1 as Code FROM Table1
UNION SELECT Id, Code2 as Code FROM Table1
UNION SELECT Id, Code3 as Code FROM Table1
If we take you second query, it appears you want all possible combinations of ID and Category, and a boolean of whether that combination appears in Table2 (using Code to get back to ID in Table1).
Since there doesn't appear to be a canonical list of ID and Category, we'll generate it:
CREATE VIEW v_AllCategories AS
SELECT DISTINCT ID, Category FROM v_Table1 CROSS JOIN Table2
Getting the list of represented ID and Category is pretty straightforward:
CREATE VIEW v_ReportedCategories AS
SELECT DISTINCT ID, Category FROM Table2
JOIN v_Table1 ON Table2.Code = v_Table1.Code
Put those together, and we can then get the bool to tell us which exists:
CREATE VIEW v_CategoryReports AS
SELECT
T1.ID, T1.Category, CASE WHEN T2.ID IS NULL THEN 0 ELSE 1 END as Reported
FROM v_AllCategories as T1
LEFT OUTER JOIN v_ReportedCategories as T2 ON
T1.ID = T2.ID
AND T1.Category = T2.Category
That gets you your answer in a normalized form:
ID | Category | Reported
10 | cat1 | 1
10 | cat2 | 1
10 | cat3 | 0
From there, you'd need to do a PIVOT to get your Category values as columns:
SELECT
ID,
cat1,
cat2,
cat3
FROM v_CategoryReports
PIVOT (
MAX([Reported]) FOR Category IN ([cat1], [cat2], [cat3])
) p
Since you mentioned over 50 'Categories', I'll assume they're not really 'cat1' - 'cat50'. In which case, you'll need to code gen the pivot operation.
SqlFiddle with a self-contained example.
These answers assume that all 3 codes are available in table 2. If not, then you should use OUTER joins instead of INNER.
Output 1 can be achieved like this:
select t1.ID,
cd1.Group as Group1,
cd2.Group as Group2,
cd3.Group as Group3
from table1 t1
inner join table2 cd1
on t1.Code1 = cd1.Code
inner join table2 cd2
on t1.Code2 = cd2.Code
inner join table2 cd3
on t1.Code3 = cd3.Code
Output 2 is trickier. Since you want a column for every row in Table2, you could write SQL that writes SQL.
Basically start with this base statement:
select t1.ID,
//THE BELOW WILL BE GENERATED ONCE PER ROW
Case when cd1.Category = '' OR
cd2.Category = '' OR
cd3.Category = '' then convert(bit,1) else 0 end as '',
//END GENERATED CODE
from table1 t1
inner join table2 cd1
on t1.Code1 = cd1.Code
inner join table2 cd2
on t1.Code2 = cd2.Code
inner join table2 cd3
on t1.Code3 = cd3.Code
then you can generate the code in the middle like this:
select distinct 'Case when cd1.Category = '''+t2.Category+''' OR
cd2.Category = '''+t2.Category+''' OR
cd3.Category = '''+t2.Category+''' then convert(bit,1) else 0 end as ['+t2.Category+'],'
from table2 t2
Paste those results into the original SQL statement (strip off the trailing comma) and you should be good to go.
We can use the Pivot feature and build the query dynamically. Some what like below:
Query 1
Select * from
(SELECT Id, Code, GroupCode
FROM Table2 join Table1
ON Table1.Code1 = Table2.Code
OR Table1.Code2 = Table2.Code
OR Table1.Code3 = Table2.Code
) ps
PIVOT
(
Max (GroupCode)
FOR Code IN
( [1.1], [1.2], [1.3])
) AS Result
Query 2
Select * from
(SELECT Id, GroupCode, Category
FROM Table2 join Table1
ON Table1.Code1 = Table2.Code
OR Table1.Code2 = Table2.Code
OR Table1.Code3 = Table2.Code
) ps
PIVOT
(
Count (GroupCode)
FOR Category IN
( [cat1], [cat2], [cat3])
) AS Result
Unfortunately your stuck with a bad design for Table1. A better approach would have been to have 3 rows for ID 10.
But, given your current design, your query will look something like this:
SELECT ID, G1.Group Group1, G2.Group Group2, G3.Group Group3
FROM Table1 T1
INNER JOIN Table2 G1 ON T1.Code1 = G1.Code
INNER JOIN Table2 G2 ON T1.Code2 = G2.Code
INNER JOIN Table2 G3 ON T1.Code3 = G3.Code
and
SELECT ID, G1.Category Cat1, G2.Category Cat2, G3.Category Cat3
FROM Table1 T1
INNER JOIN Table2 G1 ON T1.Code1 = G1.Code
INNER JOIN Table2 G2 ON T1.Code2 = G2.Code
INNER JOIN Table2 G3 ON T1.Code3 = G3.Code
The PIVOT and CROSS APPLY keywords within MSSQL would help you out. Though it's not exactly clear what you are trying to accomplish. CROSS APPLY for performing a join on a correlated subquery and displaying different output for each join, and PIVOT for doing a crosstab on your data.
For table 1 it might be easier if you mash it together into a more normalized style.
WITH cteTab1 (Id, Code) AS
(
SELECT Id, Code1 FROM Table1
UNION ALL
SELECT Id, Code2 FROM Table1
UNION ALL
SELECT Id, Code3 FROM Table1)
SELECT *
FROM Table2 INNER JOIN cteTab1 ON Table2.Code = cteTab1.Code