Inner join 2 tables by 2 categories and multiple columns - sql

I have 2 tables: Candidates and Jobs.
In Jobs there are columns Profession and Subprofession.
For each row in Candidates there are 8 columns:
Selected_Profession1, Selected_Subprofession1,
Selected_Profession2, Selected_Subprofession2,
Selected_Profession3, Selected_Subprofession3,
Selected_Profession4, Selected_Subprofession4
I would like to make a query that would select all the jobs whose profession and subprofession are in one of the respective fields in the Candidates table.
So let's say we have the following Jobs table:
(profession subprofession) -----> (100, 200)
(100, 201)
(101, 200)
(101, 201)
and the following Candidates table:
(prof1 subprof1 prof2 subprof2 prof3 subprof3 prof4 subprof4) ---->
(100, 200, 300, 400, 100, 200, 100, 300)
(101, 200, 102, 200, 300, 200, 200, 300)
(100, 200, 300, 400, 101, 201, 100, 300)
(101, 101, 200, 200, 300, 300, 400, 400)
The query will return rows 1, 3 and 4 from the Jobs table (because Candidate 1 has the pair 100, 200 and candidate 2 has the pair 101, 200 and candidate 3 has the pair 101, 201).
Hope this is clear enough...

You can do the join on multiple fields with an or condition:
select j.*
from jobs j join
candidates c
on (j.prof = c.prof1 and j.subprof = c.subprof1) or
(j.prof = c.prof2 and j.subprof = c.subprof2) or
(j.prof = c.prof3 and j.subprof = c.subprof3) or
(j.prof = c.prof4 and j.subprof = c.subprof4);
If you have large tables, the performance on this will not be very good. You can fix the data structure to get better performance, by having a CandidateProf table, where each prof/subprof pair is on a different row.
With the data structure you have, you would get better performance with separate joins for each prof/subprof grouping, particularly by having an index on the pair. The problem is the select clause. So:
select distinct j.*
from jobs j left outer join
candidates c1
on (j.prof = c1.prof1 and j.subprof = c1.subprof1) left outer join
candidates c2
on (j.prof = c2.prof2 and j.subprof = c2.subprof2) left outer join
. . .
where c1.prof1 is not null or c2.prof2 is not null or
c3.prof3 is not null or c4.prof4 is not null
And you need to remove duplicates because one candidate might have multiple qualifications.
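To see the duplicate issue concretely, here is a minimal sketch that runs the OR-condition join against the sample data from the question. It uses Python's sqlite3 module purely as a convenient test harness (an assumption; the question does not name a database), and the column names prof/subprof follow the answer's shorthand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (prof INTEGER, subprof INTEGER);
INSERT INTO jobs VALUES (100, 200), (100, 201), (101, 200), (101, 201);
CREATE TABLE candidates (
    prof1 INTEGER, subprof1 INTEGER, prof2 INTEGER, subprof2 INTEGER,
    prof3 INTEGER, subprof3 INTEGER, prof4 INTEGER, subprof4 INTEGER
);
INSERT INTO candidates VALUES
    (100, 200, 300, 400, 100, 200, 100, 300),
    (101, 200, 102, 200, 300, 200, 200, 300),
    (100, 200, 300, 400, 101, 201, 100, 300),
    (101, 101, 200, 200, 300, 300, 400, 400);
""")

# Shared FROM/JOIN part: a job matches when any of the four pairs matches.
OR_JOIN = """
FROM jobs j
JOIN candidates c
  ON (j.prof = c.prof1 AND j.subprof = c.subprof1)
  OR (j.prof = c.prof2 AND j.subprof = c.subprof2)
  OR (j.prof = c.prof3 AND j.subprof = c.subprof3)
  OR (j.prof = c.prof4 AND j.subprof = c.subprof4)
"""

# Without DISTINCT, a job is returned once per matching candidate.
raw_rows = conn.execute("SELECT j.prof, j.subprof" + OR_JOIN).fetchall()

# With DISTINCT, each matching job appears once.
matched_jobs = conn.execute(
    "SELECT DISTINCT j.prof, j.subprof" + OR_JOIN + "ORDER BY j.prof, j.subprof"
).fetchall()

print(len(raw_rows))   # job (100, 200) is matched by two different candidates
print(matched_jobs)
```

The job (100, 200) comes back twice in the raw result because candidates 1 and 3 both hold that pair, which is why DISTINCT (or an EXISTS-style rewrite) is needed.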

If your data structures were normalised, this kind of query becomes easier, and your database becomes more flexible.
That is, your table should look more like this:
CandidateID  ProfessionOrder  Profession  SubProfession
1            1                100         200
1            2                300         400
...
2            1                101         200
The following query based on your current data structures firstly normalises the candidate/professions table, and then joins in order to demonstrate the ease of finding the solution with normalised data structures.
select
candidateid
from
jobs
inner join
(
select
candidateid, prof1 as profession, subprof1 as subprofession
from candidates
union
select
candidateid, prof2 , subprof2
from candidates
union
select
candidateid, prof3 , subprof3
from candidates
union
select
candidateid, prof4 , subprof4
from candidates
) candidates
on jobs.profession = candidates.profession
and jobs.subprofession = candidates.subprofession
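As a rough check of the normalise-then-join idea, the sketch below runs the same UNION query on the question's sample data through Python's sqlite3 module. The candidateid column and its values 1 to 4 are assumptions for illustration (the question's table does not show an id), and the job columns are added to the select list so the matches are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (profession INTEGER, subprofession INTEGER);
INSERT INTO jobs VALUES (100, 200), (100, 201), (101, 200), (101, 201);
-- candidateid is a hypothetical key, numbered in the order of the question's rows
CREATE TABLE candidates (
    candidateid INTEGER PRIMARY KEY,
    prof1 INTEGER, subprof1 INTEGER, prof2 INTEGER, subprof2 INTEGER,
    prof3 INTEGER, subprof3 INTEGER, prof4 INTEGER, subprof4 INTEGER
);
INSERT INTO candidates VALUES
    (1, 100, 200, 300, 400, 100, 200, 100, 300),
    (2, 101, 200, 102, 200, 300, 200, 200, 300),
    (3, 100, 200, 300, 400, 101, 201, 100, 300),
    (4, 101, 101, 200, 200, 300, 300, 400, 400);
""")

# UNION (not UNION ALL) also removes a pair repeated within one candidate,
# such as candidate 1 listing (100, 200) twice.
matches = conn.execute("""
SELECT cp.candidateid, j.profession, j.subprofession
FROM jobs j
JOIN (
    SELECT candidateid, prof1 AS profession, subprof1 AS subprofession FROM candidates
    UNION
    SELECT candidateid, prof2, subprof2 FROM candidates
    UNION
    SELECT candidateid, prof3, subprof3 FROM candidates
    UNION
    SELECT candidateid, prof4, subprof4 FROM candidates
) cp
  ON j.profession = cp.profession
 AND j.subprofession = cp.subprofession
ORDER BY cp.candidateid, j.profession, j.subprofession
""").fetchall()

print(matches)
```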

Related

Transpose data in SQL Server Select

I am wondering if there is a better way to write this query. It achieves the target result, but my colleague would prefer it be written without the subselects into temp tables t1-t3. The main "challenge" here is transposing the data from dbo.ReviewsData into a single row along with the rest of the data joined from dbo.Products and dbo.Reviews.
CREATE TABLE dbo.Products (
idProduct int identity,
product_title varchar(100)
PRIMARY KEY (idProduct)
);
INSERT INTO dbo.Products VALUES
(1001, 'poptart'),
(1002, 'coat hanger'),
(1003, 'sunglasses');
CREATE TABLE dbo.Reviews (
Rev_IDReview int identity,
Rev_IDProduct int
PRIMARY KEY (Rev_IDReview)
FOREIGN KEY (Rev_IDProduct) REFERENCES dbo.Products(idProduct)
);
INSERT INTO dbo.Reviews VALUES
(456, 1001),
(457, 1002),
(458, 1003);
CREATE TABLE dbo.ReviewFields (
RF_IDField int identity,
RF_FieldName varchar(32),
PRIMARY KEY (RF_IDField)
);
INSERT INTO dbo.ReviewFields VALUES
(1, 'Customer Name'),
(2, 'Review Title'),
(3, 'Review Message');
CREATE TABLE dbo.ReviewsData (
RD_idData int identity,
RD_IDReview int,
RD_IDField int,
RD_FieldContent varchar(100)
PRIMARY KEY (RD_idData)
FOREIGN KEY (RD_IDReview) REFERENCES dbo.Reviews(Rev_IDReview)
);
INSERT INTO dbo.ReviewsData VALUES
(79, 456, 1, 'Daniel'),
(80, 456, 2, 'Love this item!'),
(81, 456, 3, 'Works well...blah blah'),
(82, 457, 1, 'Joe!'),
(84, 457, 2, 'Pure Trash'),
(85, 457, 3, 'It was literally a used banana peel'),
(86, 458, 1, 'Karen'),
(87, 458, 2, 'Could be better'),
(88, 458, 3, 'I can always find something wrong');
SELECT P.product_title as "item", t1.ReviewedBy, t2.ReviewTitle, t3.ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewedBy", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewTitle", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2
) t2
ON t2.RD_IDReview = R.Rev_IDReview
INNER JOIN (
SELECT D.RD_FieldContent AS "ReviewContent", D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3
) t3
ON t3.RD_IDReview = R.Rev_IDReview
EDIT: I have updated this post with the create statements for the tables as opposed to an image of the data (shame on me) and a more specific description of what exactly needed to be improved. Thanks to all for the comments and patience.
As others have said in comments, there is nothing objectively wrong with the query. However, you could argue that it's verbose and hard to read.
One way to shorten it is to replace INNER JOIN with CROSS APPLY:
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
APPLY lets you refer to values from the outer query, like in a subquery:
CROSS APPLY (
SELECT D.RD_FieldContent AS 'ReviewedBy'
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND D.RD_IDReview = R.Rev_IDReview
) t1
I think of APPLY like a subquery that brings in new columns. It's like a cross between a subquery and a join. Benefits:
The query can be shorter, because you don't have to repeat the ID column twice.
You don't have to expose columns that you don't need.
Disadvantages:
If the query in the APPLY references outer values, then you can't extract it and run it all by itself without modifications.
APPLY is not standard SQL (SQL Server and Oracle 12c+ support it; the closest standard equivalent is a LATERAL join), and it's not that widely used.
Another thing to consider is using subqueries instead of joins for values that you only need in one place. Benefits:
The queries can be made shorter, because you don't have to repeat the ID column twice, and you don't have to give the output columns unique aliases.
You only have to look in one place to see the whole subquery.
A scalar subquery can return at most 1 row (SQL Server raises an error if it returns more), so you can't accidentally create extra rows when only 1 row is desired.
SELECT
P.product_title as 'item',
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewedBy,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 2 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewTitle,
(SELECT D.RD_FieldContent
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 3 AND
D.RD_IDReview = R.Rev_IDReview) as ReviewContent
FROM dbo.Reviews R
INNER JOIN dbo.Products P ON P.idProduct = R.Rev_IDProduct
Edit:
It just occurred to me that you have made the joins themselves unnecessarily verbose (@Dale K actually already pointed this out in the comments):
INNER JOIN (
SELECT D.RD_FieldContent AS 'ReviewedBy', D.RD_IDReview
FROM dbo.ReviewsData D
WHERE D.RD_IDField = 1
) t1
ON t1.RD_IDReview = R.Rev_IDReview
Shorter:
SELECT RevBy.RD_FieldContent AS 'ReviewedBy'
...
INNER JOIN dbo.ReviewsData RevBy
ON RevBy.RD_IDReview = R.Rev_IDReview AND
RevBy.RD_IDField = 1
The originally submitted query is undoubtedly and unnecessarily verbose. Having digested various feedback from the community it has been revised to the following, working splendidly. In retrospect I feel very silly for having done this with subselects originally. I am clearly intermediate at best when it comes to SQL - I had not realized an "AND" clause could be included in the "ON" clause in a "JOIN" statement. Not sure why I would have made such a poor assumption.
SELECT
P.product_title as 'item',
D1.RD_FieldContent as 'ReviewedBy',
D2.RD_FieldContent as 'ReviewTitle',
D3.RD_FieldContent as 'ReviewContent'
FROM dbo.Reviews R
INNER JOIN dbo.Products P
ON P.idProduct = R.Rev_IDProduct
INNER JOIN dbo.ReviewsData D1
ON D1.RD_IDReview = R.Rev_IDReview AND D1.RD_IDField = 1
INNER JOIN dbo.ReviewsData D2
ON D2.RD_IDReview = R.Rev_IDReview AND D2.RD_IDField = 2
INNER JOIN dbo.ReviewsData D3
ON D3.RD_IDReview = R.Rev_IDReview AND D3.RD_IDField = 3
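For anyone who wants to try the revised query without a SQL Server instance, here is a small sketch that runs it on the sample rows through Python's sqlite3 module. This is an assumption-laden stand-in: SQLite has no dbo schema, so the prefix is dropped, and the identity columns become plain integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (idProduct INTEGER PRIMARY KEY, product_title TEXT);
INSERT INTO Products VALUES (1001, 'poptart'), (1002, 'coat hanger'), (1003, 'sunglasses');
CREATE TABLE Reviews (Rev_IDReview INTEGER PRIMARY KEY, Rev_IDProduct INTEGER);
INSERT INTO Reviews VALUES (456, 1001), (457, 1002), (458, 1003);
CREATE TABLE ReviewsData (
    RD_idData INTEGER PRIMARY KEY, RD_IDReview INTEGER,
    RD_IDField INTEGER, RD_FieldContent TEXT
);
INSERT INTO ReviewsData VALUES
    (79, 456, 1, 'Daniel'), (80, 456, 2, 'Love this item!'),
    (81, 456, 3, 'Works well...blah blah'),
    (82, 457, 1, 'Joe!'), (84, 457, 2, 'Pure Trash'),
    (85, 457, 3, 'It was literally a used banana peel'),
    (86, 458, 1, 'Karen'), (87, 458, 2, 'Could be better'),
    (88, 458, 3, 'I can always find something wrong');
""")

# One join per field: the field id filter lives in the ON clause,
# so each alias contributes exactly one column to the output row.
rows = conn.execute("""
SELECT P.product_title AS item,
       D1.RD_FieldContent AS ReviewedBy,
       D2.RD_FieldContent AS ReviewTitle,
       D3.RD_FieldContent AS ReviewContent
FROM Reviews R
INNER JOIN Products P     ON P.idProduct = R.Rev_IDProduct
INNER JOIN ReviewsData D1 ON D1.RD_IDReview = R.Rev_IDReview AND D1.RD_IDField = 1
INNER JOIN ReviewsData D2 ON D2.RD_IDReview = R.Rev_IDReview AND D2.RD_IDField = 2
INNER JOIN ReviewsData D3 ON D3.RD_IDReview = R.Rev_IDReview AND D3.RD_IDField = 3
ORDER BY R.Rev_IDReview
""").fetchall()

for row in rows:
    print(row)
```

Note that with INNER JOIN a review that lacks one of the three fields drops out entirely; switching those joins to LEFT JOIN would keep it with a NULL in the missing column.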

Combine two table for one output sql query

I have two tables
Threads:
i_id thread_note seq_id
1 ABC 2
2 CDE 2
3 FGH 1
4 IJK 2
Notes:
i_id note_text entered_date
1 stack 09/08/2017
5 queue 07/07/2014
3 push 09/07/1996
I want the output as
i_id thread_note seq_id note_text entered_date
1 ABC 2 stack 09/08/2017
2 CDE 2 null null
3 FGH 1 push 09/07/1996
4 IJK 2 null null
5 null null queue 07/07/2014
How do I achieve this? The tables are not related to each other.
Note: This is different from most of the questions similar to this asked because there are some "i_id" values which are present in threads table but not in notes table and there are some "i_id" values present in notes table but not in threads table
Use a full outer join:
SELECT
COALESCE(t.i_id, n.i_id) AS i_id,
t.thread_note,
t.seq_id,
n.note_text,
n.entered_date
FROM Threads t
FULL OUTER JOIN Notes n
ON n.i_id = t.i_id
ORDER BY
i_id;
Note that needing a full outer join can indicate a problem with your relational model, because it suggests that the key relationship between the two tables is not well defined.
Edit:
If you are using a database such as MySQL which does not support a full outer join, we can still simulate one:
SELECT *
FROM Threads t
LEFT JOIN Notes n
ON n.i_id = t.i_id
UNION ALL
SELECT *
FROM Threads t
RIGHT JOIN Notes n
ON n.i_id = t.i_id
WHERE t.i_id IS NULL;
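The emulation can be checked on the question's data with a short sketch. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example runnable), lists columns explicitly so both branches line up, and writes the second branch as a LEFT JOIN from Notes, which is equivalent to the RIGHT JOIN above and also works on older SQLite builds that lack RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Threads (i_id INTEGER, thread_note TEXT, seq_id INTEGER);
INSERT INTO Threads VALUES (1, 'ABC', 2), (2, 'CDE', 2), (3, 'FGH', 1), (4, 'IJK', 2);
CREATE TABLE Notes (i_id INTEGER, note_text TEXT, entered_date TEXT);
INSERT INTO Notes VALUES
    (1, 'stack', '09/08/2017'), (5, 'queue', '07/07/2014'), (3, 'push', '09/07/1996');
""")

rows = conn.execute("""
-- Branch 1: every thread, with its note if one exists.
SELECT t.i_id, t.thread_note, t.seq_id, n.note_text, n.entered_date
FROM Threads t
LEFT JOIN Notes n ON n.i_id = t.i_id
UNION ALL
-- Branch 2: only the notes that have no thread (anti-join).
SELECT n.i_id, NULL, NULL, n.note_text, n.entered_date
FROM Notes n
LEFT JOIN Threads t ON t.i_id = n.i_id
WHERE t.i_id IS NULL
ORDER BY 1
""").fetchall()

for row in rows:
    print(row)
```

UNION ALL is safe here because the WHERE clause in the second branch excludes every row the first branch already produced.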
First, you need to get all i_id values from both tables in a subquery. Once you have those rows, join them to the two tables to get the columns you need:
SELECT a.i_id,
b.thread_note,
b.seq_id,
c.Note_text,
c.entered_date
FROM
(
SELECT i_id FROM Threads UNION
SELECT i_id FROM Notes
) a
LEFT JOIN Threads b
ON a.i_id = b.i_id
LEFT JOIN Notes c
ON a.i_id = c.i_id
ORDER BY a.i_id
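A quick way to convince yourself that the key-list approach gives the requested output is to run it on the sample rows. The sketch below does that with Python's sqlite3 module, chosen here only as a lightweight harness (an assumption, since the question does not name the database).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Threads (i_id INTEGER, thread_note TEXT, seq_id INTEGER);
INSERT INTO Threads VALUES (1, 'ABC', 2), (2, 'CDE', 2), (3, 'FGH', 1), (4, 'IJK', 2);
CREATE TABLE Notes (i_id INTEGER, note_text TEXT, entered_date TEXT);
INSERT INTO Notes VALUES
    (1, 'stack', '09/08/2017'), (5, 'queue', '07/07/2014'), (3, 'push', '09/07/1996');
""")

# The UNION subquery yields each i_id once; both tables are then
# LEFT JOINed to that list, so no id from either side is lost.
combined = conn.execute("""
SELECT a.i_id, b.thread_note, b.seq_id, c.note_text, c.entered_date
FROM (
    SELECT i_id FROM Threads
    UNION
    SELECT i_id FROM Notes
) a
LEFT JOIN Threads b ON a.i_id = b.i_id
LEFT JOIN Notes c   ON a.i_id = c.i_id
ORDER BY a.i_id
""").fetchall()

for row in combined:
    print(row)
```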
You could just use a FULL OUTER JOIN here. If I make some test data:
DECLARE @threads TABLE (i_id INT, thread_note NVARCHAR(3), seq_id INT);
INSERT INTO @threads SELECT 1, 'ABC', 2;
INSERT INTO @threads SELECT 2, 'CDE', 2;
INSERT INTO @threads SELECT 3, 'FGH', 1;
INSERT INTO @threads SELECT 4, 'IJK', 2;
DECLARE @notes TABLE (i_id INT, note_text NVARCHAR(10), entered_date DATE);
INSERT INTO @notes SELECT 1, 'stack', '20170809';
INSERT INTO @notes SELECT 5, 'queue', '20140707';
INSERT INTO @notes SELECT 3, 'push', '19960709';
Then my query is simply:
SELECT
ISNULL(t.i_id, n.i_id) AS i_id,
t.thread_note,
t.seq_id,
n.note_text,
n.entered_date
FROM
@threads t
FULL OUTER JOIN @notes n ON n.i_id = t.i_id
ORDER BY
ISNULL(t.i_id, n.i_id);
No need to make a list of unique I_ids.
Use the query below:
select isnull(n.i_id, t.i_id) as i_id, t.thread_note, t.seq_id, n.note_text, n.entered_date
from Notes n full outer join Threads t on n.i_id = t.i_id
order by isnull(n.i_id, t.i_id)

Find ID of parent where all children exactly match

The Scenario
Let's suppose we have a set of database tables that represent four key concepts:
Entity Types (e.g. account, client, etc.)
Entities (e.g. instances of the above Entity Types)
Cohorts (a named group)
Cohort Members (the Entities that form up the membership of a Cohort)
The rules around Cohorts are:
A Cohort always has at least one Cohort Member.
A Cohort's Members must be unique to that Cohort (i.e. Entity 5 cannot be a member of Cohort 3 twice, though it could be a member of Cohort 3 and Cohort 4)
No two Cohorts will ever be entirely equal in membership, though one Cohort may legitimately be a subset of another Cohort.
The rules around Entities are:
No two Entities may have the same value pair (business_key, entity_type_id)
Two entities with a different entity_type_id may share a business_key
Because pictures tell a thousand lines of code, here is the ERD:
The Question
I want a SQL query that, when provided a collection of (business_key, entity_type_id) pairs, will search for a Cohort that matches exactly, returning one row with just the cohort_id if that Cohort exists, and zero rows otherwise.
i.e. if the set of Entities matches entity_ids 1 and 2, it will only return a cohort_id where the cohort_members are exactly 1 and 2: not just 1, not just 2, and not a cohort with entity_ids 1, 2 and 3. If no cohort exists that satisfies this, then zero rows are returned.
The Test Cases
To help people addressing the question, I have created a fiddle of the tables along with some data that defines various Entity Types, Entities, and Cohorts. There is also a table with test data for matching, named test_cohort. It contains 6 test cohorts which test various scenarios. The first 5 tests should exactly match just one cohort. The 6th test is a bogus one to test the zero-row clause. When using the test table, the associated INSERT statement should just have one line uncommented (see fiddle, it's set up like that initially):
http://sqlfiddle.com/#!18/2d022
My attempt in SQL is the following, though it fails tests #2 and #4 (which can be found in the fiddle):
SELECT actual_cohort_member.cohort_id
FROM test_cohort
INNER JOIN entity
ON entity.business_key = test_cohort.business_key
AND entity.entity_type_id = test_cohort.entity_type_id
INNER JOIN cohort_member AS existing_potential_member
ON existing_potential_member.entity_id = entity.entity_id
INNER JOIN cohort
ON cohort.cohort_id = existing_potential_member.cohort_id
RIGHT OUTER JOIN cohort_member AS actual_cohort_member
ON actual_cohort_member.cohort_id = cohort.cohort_id
AND actual_cohort_member.cohort_id = existing_potential_member.cohort_id
AND actual_cohort_member.entity_id = existing_potential_member.entity_id
GROUP BY actual_cohort_member.cohort_id
HAVING
SUM(CASE WHEN
actual_cohort_member.cohort_id = existing_potential_member.cohort_id AND
actual_cohort_member.entity_id = existing_potential_member.entity_id THEN 1 ELSE 0
END) = COUNT(*)
;
This can be achieved with a compound condition in the WHERE clause, since you are comparing against pairs of values. You then check two counts per cohort_id: the number of rows matched by the WHERE clause, and the total number of rows (members) in the cohort.
SELECT c.cohort_id
FROM cohort c
INNER JOIN cohort_member cm
ON c.cohort_id = cm.cohort_id
INNER JOIN entity e
ON cm.entity_id = e.entity_id
WHERE (e.entity_type_id = 1 AND e.business_key = 'acc1') -- condition here
OR (e.entity_type_id = 1 AND e.business_key = 'acc2')
GROUP BY c.cohort_id
HAVING COUNT(*) = 2 -- must equal the number of pairs in the WHERE clause
AND (SELECT COUNT(*)
FROM cohort_member cm2
WHERE cm2.cohort_id = c.cohort_id) = 2 -- must also equal the number of pairs
Test Case #1
Test Case #2
Test Case #3
Test Case #4
Test Case #5
Test Case #6
As you can see in the test cases above, the value in the filter depends on the number of conditions in the WHERE clause. It would be advisable to create a dynamic query on this.
UPDATE
If the table test_cohort contains only one scenario, then the query below will meet your requirement. If test_cohort contains a list of scenarios, you might want to look at the other answer, since this solution does not alter any table schema.
SELECT c.cohort_id
FROM cohort c
INNER JOIN cohort_member cm
ON c.cohort_id = cm.cohort_id
INNER JOIN entity e
ON cm.entity_id = e.entity_id
INNER JOIN test_cohort tc
ON tc.business_key = e.business_key
AND tc.entity_type_id = e.entity_type_id
GROUP BY c.cohort_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM test_cohort)
AND (SELECT COUNT(*)
FROM cohort_member cm2
WHERE cm2.cohort_id = c.cohort_id) = (SELECT COUNT(*) FROM test_cohort)
Test Case #1
Test Case #2
Test Case #3
Test Case #4
Test Case #5
Test Case #6
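Since the fiddle data is not reproduced here, the following sketch rebuilds a small data set from the comments in the test cases (cohort 1 = acc1 + acc2, cohort 2 = cli1 + cli2, cohort 3 = cli1, cohort 4 = all four, cohort 5 = acc1 + cli2) and runs the double-count check against it. The entity_id values, the helper name find_exact_cohort, and the use of Python's sqlite3 module are assumptions for illustration; the cohort table is left out because only cohort_member is needed for the counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (entity_id INTEGER PRIMARY KEY, business_key TEXT, entity_type_id INTEGER);
INSERT INTO entity VALUES (1, 'acc1', 1), (2, 'acc2', 1), (3, 'cli1', 2), (4, 'cli2', 2);
CREATE TABLE cohort_member (cohort_id INTEGER, entity_id INTEGER);
INSERT INTO cohort_member VALUES
    (1, 1), (1, 2), (2, 3), (2, 4), (3, 3),
    (4, 1), (4, 2), (4, 3), (4, 4), (5, 1), (5, 4);
CREATE TABLE test_cohort (business_key TEXT, entity_type_id INTEGER);
""")

# A cohort matches exactly when (a) every requested pair is a member and
# (b) the cohort has no members beyond the requested pairs.
EXACT_MATCH = """
SELECT cm.cohort_id
FROM cohort_member cm
JOIN entity e       ON e.entity_id = cm.entity_id
JOIN test_cohort tc ON tc.business_key = e.business_key
                   AND tc.entity_type_id = e.entity_type_id
GROUP BY cm.cohort_id
HAVING COUNT(*) = (SELECT COUNT(*) FROM test_cohort)
   AND (SELECT COUNT(*) FROM cohort_member cm2
        WHERE cm2.cohort_id = cm.cohort_id) = (SELECT COUNT(*) FROM test_cohort)
"""

def find_exact_cohort(pairs):
    """Load one scenario into test_cohort and return the matching cohort ids."""
    conn.execute("DELETE FROM test_cohort")
    conn.executemany("INSERT INTO test_cohort VALUES (?, ?)", pairs)
    return [row[0] for row in conn.execute(EXACT_MATCH)]

print(find_exact_cohort([("cli1", 2), ("cli2", 2)]))   # scenario from test #2
print(find_exact_cohort([("acc1", 3), ("cli2", 3)]))   # bogus scenario from test #6
```

The first count rejects cohorts that are missing a requested entity, and the second rejects supersets such as cohort 4 when only two entities were requested.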
I have added a column i to your test_cohort table, so that you can test all your scenarios at the same time. Here is a DDL
CREATE TABLE test_cohort (
i int,
business_key NVARCHAR(255),
entity_type_id INT
);
INSERT INTO test_cohort VALUES
(1, 'acc1', 1), (1, 'acc2', 1) -- TEST #1: should match against cohort 1
,(2, 'cli1', 2), (2, 'cli2', 2) -- TEST #2: should match against cohort 2
,(3, 'cli1', 2) -- TEST #3: should match against cohort 3
,(4, 'acc1', 1), (4, 'acc2', 1), (4, 'cli1', 2), (4, 'cli2', 2) -- TEST #4: should match against cohort 4
,(5, 'acc1', 1), (5, 'cli2', 2) -- TEST #5: should match against cohort 5
,(6, 'acc1', 3), (6, 'cli2', 3) -- TEST #6: should not match any cohort
And the query:
select
c.i, m.cohort_id
from
(
select
*, cnt = count(*) over (partition by i)
from
test_cohort
) c
join entity e on c.entity_type_id = e.entity_type_id and c.business_key = e.business_key
join (
select
*, cnt = count(*) over (partition by cohort_id)
from
cohort_member
) m on e.entity_id = m.entity_id and c.cnt = m.cnt
group by m.cohort_id, c.cnt, c.i
having count(*) = c.cnt
Output
i cohort_id
------------
1 1
2 2
3 3
4 4
5 5
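The idea is to count the rows on each side (per test set and per cohort) before the join, join only where those counts are equal, and then keep the groups in which every row found a match.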
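Here is a hedged, runnable version of that windowed-count query using Python's sqlite3 module. The entity and cohort_member rows are reconstructed from the test comments (the entity_id values are assumptions), the T-SQL `cnt = count(*) over (...)` alias form is rewritten as `... AS cnt`, and window functions require SQLite 3.25 or newer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (entity_id INTEGER PRIMARY KEY, business_key TEXT, entity_type_id INTEGER);
INSERT INTO entity VALUES (1, 'acc1', 1), (2, 'acc2', 1), (3, 'cli1', 2), (4, 'cli2', 2);
CREATE TABLE cohort_member (cohort_id INTEGER, entity_id INTEGER);
INSERT INTO cohort_member VALUES
    (1, 1), (1, 2), (2, 3), (2, 4), (3, 3),
    (4, 1), (4, 2), (4, 3), (4, 4), (5, 1), (5, 4);
CREATE TABLE test_cohort (i INTEGER, business_key TEXT, entity_type_id INTEGER);
INSERT INTO test_cohort VALUES
    (1, 'acc1', 1), (1, 'acc2', 1),
    (2, 'cli1', 2), (2, 'cli2', 2),
    (3, 'cli1', 2),
    (4, 'acc1', 1), (4, 'acc2', 1), (4, 'cli1', 2), (4, 'cli2', 2),
    (5, 'acc1', 1), (5, 'cli2', 2),
    (6, 'acc1', 3), (6, 'cli2', 3);
""")

results = conn.execute("""
SELECT c.i, m.cohort_id
FROM (
    -- size of each test set
    SELECT *, COUNT(*) OVER (PARTITION BY i) AS cnt FROM test_cohort
) c
JOIN entity e
  ON c.entity_type_id = e.entity_type_id AND c.business_key = e.business_key
JOIN (
    -- size of each cohort
    SELECT *, COUNT(*) OVER (PARTITION BY cohort_id) AS cnt FROM cohort_member
) m
  ON e.entity_id = m.entity_id AND c.cnt = m.cnt   -- only cohorts of equal size
GROUP BY m.cohort_id, c.cnt, c.i
HAVING COUNT(*) = c.cnt                             -- and every member matched
ORDER BY c.i
""").fetchall()

print(results)
```

Test set 6 produces no row because its pairs do not exist in entity, which is the zero-row behaviour the question asks for.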

Insert a batch number into a table in Oracle SQL

I have a temp table where I insert modified data extracted from a SELECT query.
In this temp table I want to group my rows into batches, so I added an indexed INT column called "BATCH_NUM"
The idea that I am hoping to achieve is this (for say 1000 results in my SELECT statement).
Pseudo Code
Batch Size = 100
Count = 0
For batch size in results set
Insert Into Temp Table (a , b , y , count)
Count++
Current SQL - inputs static value of 1 into BATCH_NUM column
INSERT INTO TEMP_TABLE
(
ASSET_ID,
PAR_PROM_INTEG_ID,
IGNORE,
BATCH_NUM
)
SELECT carelevel.row_id, pstn.PROM_INTEG_ID,
CASE
WHEN promoprod.fabric_cd = 'Disabled'
THEN 'Y'
ELSE 'N'
END,
'1'
FROM SIEBEL.S_ASSET carelevel
INNER JOIN SIEBEL.S_ASSET pstn
ON pstn.row_id = carelevel.par_asset_id
INNER JOIN SIEBEL.S_ASSET promotion
ON pstn.prom_integ_id = promotion.integration_id
INNER JOIN SIEBEL.S_PROD_INT prod
ON prod.row_id = carelevel.prod_id
INNER JOIN SIEBEL.S_ORG_EXT bill
ON carelevel.bill_accnt_id = bill.row_id
INNER JOIN SIEBEL.S_INV_PROF invoice
ON bill.row_id = invoice.accnt_id
INNER JOIN SIEBEL.S_PROD_INT promoprod
ON promotion.prod_id = promoprod.row_id
WHERE prod.part_num = 'Testproduct'
But if the select statement has 1000 results, then I want BATCH_NUM to go from 1,2,3,4,5,6,7,8,9,10 per 100 records.
Can this be done?
To map a record to a batch, you can simply use integer division. It is slightly more complicated because rows are numbered from 1, but something like TRUNC((ROWNUM-1)/100)+1 will do the trick.
Here is a test for that mapping:
select level, trunc((ROWNUM-1)/100)+1 from dual connect by level <= 1000
Result:
ROWNUM TRUNC((ROWNUM-1)/100)+1
1 1
...
100 1
101 2
...
200 2
201 3
...
...
900 9
901 10
...
1000 10
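The same arithmetic can be sanity-checked outside the database. This is a small Python sketch of the mapping only (the function name batch_number and its batch_size parameter are illustrative, not part of the Oracle query); floor division plays the role of TRUNC for positive row numbers.

```python
from collections import Counter

def batch_number(row_num: int, batch_size: int = 100) -> int:
    """Map a 1-based row number to a 1-based batch number."""
    if row_num < 1 or batch_size < 1:
        raise ValueError("row_num and batch_size must be positive")
    # Subtract 1 so that rows 1..batch_size fall into the same bucket,
    # then add 1 so that batches are numbered from 1 rather than 0.
    return (row_num - 1) // batch_size + 1

batches = [batch_number(n) for n in range(1, 1001)]
sizes = Counter(batches)

print(batch_number(100), batch_number(101), batch_number(1000))
print(sorted(sizes.items()))
```

Without the "- 1", row 100 would land in batch 2 and the first batch would only hold 99 rows.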
Given your query:
INSERT INTO TEMP_TABLE
(
ASSET_ID,
PAR_PROM_INTEG_ID,
IGNORE,
BATCH_NUM
)
SELECT carelevel.row_id, pstn.PROM_INTEG_ID,
CASE
WHEN promoprod.fabric_cd = 'Disabled'
THEN 'Y'
ELSE 'N'
END,
TRUNC((ROWNUM-1)/100)+1
-- ^^^^^^^^^^^^^^^^^^^^
-- map rows 1-100 to batch 1, rows 101-200 to batch 2 and so on
FROM SIEBEL.S_ASSET carelevel
INNER JOIN SIEBEL.S_ASSET pstn
ON pstn.row_id = carelevel.par_asset_id
INNER JOIN SIEBEL.S_ASSET promotion
ON pstn.prom_integ_id = promotion.integration_id
INNER JOIN SIEBEL.S_PROD_INT prod
ON prod.row_id = carelevel.prod_id
INNER JOIN SIEBEL.S_ORG_EXT bill
ON carelevel.bill_accnt_id = bill.row_id
INNER JOIN SIEBEL.S_INV_PROF invoice
ON bill.row_id = invoice.accnt_id
INNER JOIN SIEBEL.S_PROD_INT promoprod
ON promotion.prod_id = promoprod.row_id
WHERE prod.part_num = 'Testproduct'

How can I write this query as a full join instead of a union left/right join?

Here's the code, showing the inputs and the required output.
Basically, I'm trying to self-join to match the results of my broker's statement with my internal records. So left set of columns is broker's list, right side is my list. If broker has a position, and I don't, NULLs on the right. If I have a position and broker doesn't, NULLs on the left.
The left join + right join + union works exactly as I want. Seems like there should be some voodoo to allow a full join to get that without two selects, but I can't figure it out.
drop table MatchPositions
go
create table MatchPositions
(
mt_source varchar (10),
mt_symbol varchar (10),
mt_qty float,
mt_price float
)
go
insert into MatchPositions values ('BROKER', 'IBM', 100, 50.25)
insert into MatchPositions values ('BROKER', 'MSFT', 75, 30)
insert into MatchPositions values ('BROKER', 'GOOG', 25, 500)
insert into MatchPositions values ('BROKER', 'SPY', 200, 113)
insert into MatchPositions values ('MODEL', 'MSFT', 75, 30)
insert into MatchPositions values ('MODEL', 'GOOG', 25, 500)
insert into MatchPositions values ('MODEL', 'GLD', 300, 150)
go
select * from MatchPositions b
left join MatchPositions m on b.mt_symbol = m.mt_symbol and m.mt_source = 'MODEL'
where b.mt_source = 'BROKER'
union
select * from MatchPositions b
right join MatchPositions m on b.mt_symbol = m.mt_symbol and b.mt_source = 'BROKER'
where m.mt_source = 'MODEL'
and here's the expected output:
mt_source mt_symbol mt_qty mt_price mt_source mt_symbol mt_qty mt_price
---------- ---------- ---------------------- ---------------------- ---------- ---------- ---------------------- ----------------------
NULL NULL NULL NULL MODEL GLD 300 150
BROKER GOOG 25 500 MODEL GOOG 25 500
BROKER IBM 100 50.25 NULL NULL NULL NULL
BROKER MSFT 75 30 MODEL MSFT 75 30
BROKER SPY 200 113 NULL NULL NULL NULL
;WITH T1 AS
(
SELECT *
FROM MatchPositions
WHERE mt_source = 'BROKER'
), T2 AS
(
SELECT *
FROM MatchPositions
WHERE mt_source = 'MODEL'
)
SELECT *
FROM T1 FULL JOIN T2 ON T1.mt_symbol = T2.mt_symbol
Possibly using an isnull function:
SELECT *
FROM MatchPositions b
FULL JOIN MatchPositions m on b.mt_symbol = m.mt_symbol
and b.mt_source != m.mt_source
WHERE isnull(b.mt_source, 'BROKER') = 'BROKER'
and isnull(m.mt_source, 'MODEL') = 'MODEL'
SELECT *
FROM MatchPositions b
FULL JOIN MatchPositions m ON b.mt_symbol = m.mt_symbol
AND b.mt_source = 'BROKER'
AND m.mt_source = 'MODEL'
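The intent is to split the table into its 'BROKER' and 'MODEL' parts inside the join condition. Note, however, that conditions in the ON clause of a FULL JOIN only decide which rows match; unmatched rows from both sides are still returned, so MODEL rows also appear on the left and BROKER rows on the right. Filtering each side before the join (with CTEs or derived tables) avoids this.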
Try this:
select *
from MatchPositions broker
full join MatchPositions model on model.mt_symbol = broker.mt_symbol
and model.mt_source <> broker.mt_source
where ( broker.mt_source = 'BROKER' or broker.MT_SOURCE is null )
and ( model.mt_source = 'MODEL' or model.MT_SOURCE is null )
From the first logical source table, you want either broker rows, or missing rows.
From the second logical source table, you want either model rows or missing rows.
If your RDBMS supports FULL JOIN (also known as FULL OUTER JOIN):
SELECT *
FROM (SELECT * FROM MatchPositions WHERE mt_source = 'BROKER') b
FULL
JOIN (SELECT * FROM MatchPositions WHERE mt_source = 'MODEL' ) m
ON b.mt_symbol = m.mt_symbol
This solution is basically the same as Martin's; it just uses different syntax, which may be helpful if your RDBMS doesn't support CTEs.
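To check the "filter each side first, then outer join" idea on the sample rows, here is a sketch using Python's sqlite3 module. Because FULL JOIN is only available in recent SQLite versions, the sketch assumes it is missing and emulates it with a LEFT JOIN plus an anti-join; only the symbol columns are selected to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MatchPositions (mt_source TEXT, mt_symbol TEXT, mt_qty REAL, mt_price REAL);
INSERT INTO MatchPositions VALUES
    ('BROKER', 'IBM', 100, 50.25), ('BROKER', 'MSFT', 75, 30),
    ('BROKER', 'GOOG', 25, 500),   ('BROKER', 'SPY', 200, 113),
    ('MODEL', 'MSFT', 75, 30),     ('MODEL', 'GOOG', 25, 500),
    ('MODEL', 'GLD', 300, 150);
""")

rows = conn.execute("""
WITH b AS (SELECT * FROM MatchPositions WHERE mt_source = 'BROKER'),
     m AS (SELECT * FROM MatchPositions WHERE mt_source = 'MODEL')
-- broker positions, with the model position where one exists
SELECT b.mt_symbol, m.mt_symbol
FROM b LEFT JOIN m ON b.mt_symbol = m.mt_symbol
UNION ALL
-- model positions the broker does not have
SELECT NULL, m.mt_symbol
FROM m LEFT JOIN b ON b.mt_symbol = m.mt_symbol
WHERE b.mt_symbol IS NULL
""").fetchall()

# Sort in Python by whichever side carries the symbol.
positions = sorted(rows, key=lambda r: r[0] or r[1])
for broker_symbol, model_symbol in positions:
    print(broker_symbol, model_symbol)
```

Because each side is filtered before the join, no MODEL row can appear in the broker columns and vice versa.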