How to select from a master table but replace certain rows using a secondary, linked table? - sql

I have two tables with a foreign key relationship on an ID. I'll refer to them as master and secondary to make things easier and also not worry about the FK for now. Here is cut down, easy to reproduce example using table variables to represent the problem:
DECLARE #Master TABLE (
[MasterID] Uniqueidentifier NOT NULL
,[Description] NVARCHAR(50)
)
DECLARE #Secondary TABLE (
[SecondaryID] Uniqueidentifier NOT NULL
,[MasterID] Uniqueidentifier NOT NULL
,[OtherInfo] NVARCHAR(50)
)
INSERT INTO #Master ([MasterID], [Description])
VALUES ('0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3', 'Test')
,('2696ECD2-FFDB-4E26-83D0-F146ED419C9C', 'Test 2')
,('F21568F0-59C5-4950-B936-AA73DA6009B5', 'Test 3')
INSERT INTO #Secondary (SecondaryID, MasterID, Otherinfo)
VALUES ('514673A6-8B5C-429B-905F-15BD8B55CB5D','0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3','Other info')
SELECT [MasterID], [Description], NULL AS [OtherInfo] FROM #Master
UNION
SELECT S.[MasterID], M.[Description], [OtherInfo] FROM #Secondary S
JOIN #Master M ON M.MasterID = S.MasterID
With the results.....
0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3 Test NULL
0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3 Test Other info
F21568F0-59C5-4950-B936-AA73DA6009B5 Test 3 NULL
2696ECD2-FFDB-4E26-83D0-F146ED419C9C Test 2 NULL
.... I would like to only return records from #Secondary if there is a duplicate MasterID, so this is my expected output:
0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3 Test Other info
F21568F0-59C5-4950-B936-AA73DA6009B5 Test 3 NULL
2696ECD2-FFDB-4E26-83D0-F146ED419C9C Test 2 NULL
I tried inserting my union query into a temporary table, then using a CTE with the partition function. This kind of works but unfortunately returns the row from the #Master table rather than the #Secondary table (regardless of the order I select). See below.
DECLARE #Results TABLE (MasterID UNIQUEIDENTIFIER,[Description] NVARCHAR(50),OtherInfo NVARCHAR(50))
INSERT INTO #Results
SELECT [MasterID], [Description], NULL AS [OtherInfo] FROM #Master
UNION
SELECT S.[MasterID], M.[Description], [OtherInfo] FROM #Secondary S
JOIN #Master M ON M.MasterID = S.MasterID
;WITH CTE AS (
SELECT *, RN= ROW_NUMBER() OVER (PARTITION BY [MasterID] ORDER BY [Description] DESC) FROM #Results
)
SELECT * FROM CTE WHERE RN =1
Results:
0C1F1A0C-1DB5-4FA2-BC70-26AA9B10D5C3 Test NULL 1
F21568F0-59C5-4950-B936-AA73DA6009B5 Test 3 NULL 1
2696ECD2-FFDB-4E26-83D0-F146ED419C9C Test 2 NULL 1
Note that I am not just trying to select the rows which have a value for OtherInfo, this is just to help differentiate the two tables in the result set.
Just to reiterate, what I need to only return the rows present in #Secondary, when there is a duplicate MasterID. If #Secondary has a row for a particular MasterID, I don't need the row from #Master. I hope this makes sense.
What is the best way to do this? I am happy to redesign my database structure. I'm effectively trying to have a master list of items but sometimes take one of those and assign extra info to it + tie it to another ID. In this instance, that record replaces the master list.

You are way overcomplicating this. All you need is a left join.
SELECT M.[MasterID], M.[Description], S.[OtherInfo] FROM #Master M
LEFT JOIN #Secondary S ON M.MasterID = S.MasterID

Union seems to be the wrong approach... I would suggest a left join:
SELECT m.[MasterID], m.[Description], s.[OtherInfo]
FROM #Master m
LEFT JOIN #Secondary s ON s.MasterID = m.MasterID

Related

Query for a group of IDs

I use the simple query below to output a list of partIDs based on a modelID that we get from a printout on a sheet of paper.
We've always just used one modelId at a time like this:
SELECT gm.partId, 343432346 as modelId
FROM dbo.systemMemberTable sm
WHERE gm.partID NOT IN (SELECT partID FROM systemsPartTable)
Is there a way to create a query that uses 10 modelIds at a time?
I can't think of a way because the modelIds are not in any table, just a printout that is handed to us.
Thanks!
SELECT gm.partId, T.number as modelId
FROM ( values (4),(9),(16),(25)
) as T(number)
CROSS JOIN dbo.systemMemberTable sm
WHERE gm.partID NOT IN (SELECT partID FROM systemsPartTable)
op said getting an error but this is the test that runs for me
SELECT T.number as [modelID]
,[main].[name]
FROM ( values (4),(9),(16),(25)
) as T(number)
cross join [Gabe2a_ENRONb].[dbo].[docFieldDef] as [main]
where [main].[ID] not in (1)
insert model ids into a table variable and then do the join with this table variable
Also use not exists instead of not in as not in doesn't work if there are null values in the parts table.
declare #modelIds table
(
model_id int
)
insert into #modelIds values (343432346) , (123456)
your select would be
As you want same model id repeated for all parts, you can just use cross join
select gm.partId, m.model_id
from dbo.systemMeberTable sm
cross join #modelIds m
where not exists ( select 1 from systemPartsTable SPT where SPT.partId = gm.PartID )
Try this:
DECLARE #T TABLE (ModelId INT);
INSERT INTO #T (ModelID)
VALUES (343432346), (343432347) -- And so on and so forth
SELECT gm.partId, T.ModelId
FROM dbo.systemMemberTable sm
INNER JOIN #T AS T
ON T.ModelId = SM.ModelID
WHERE gm.partID NOT IN (SELECT partID FROM systemsPartTable)
Create table #tempmodelTable
(
#modelid int
)
insert all modelid here, then use join with your select query
INSERT INTO #tempmodelTable values(123123)
INSERT INTO #tempmodelTable values(1232323)
INSERT INTO #tempmodelTable values(1232343123)
SELECT gm.partId, modelId
FROM dbo.systemMemberTable gm inner join #tempmodelTable
WHERE gm.partID NOT IN (SELECT partID FROM systemsPartTable)

SQL - select selective row multiple times

I need to produce mailing labels for my company and I thought I would do a query for that:
I have 2 tables - tblAddress , tblContact.
In tblContact I have "addressNum" which is a foreign key of address and "labelsNum" column that represents the number of times the address should appear in the labels sheet.
I need to create an inner join of tblcontact and tbladdress by addressNum,
but if labelsNum exists more than once it should be displayed as many times as labelsNum is.
I suggest using a recursive query to do the correct number of iterations for each row.
Here is the code (+ link to SQL fiddle):
;WITH recurs AS (
SELECT *, 1 AS LEVEL
FROM tblContact
UNION ALL
SELECT t1.*, LEVEL + 1
FROM tblContact t1
INNER JOIN
recurs t2
ON t1.addressnum = t2.addressnum
AND t2.labelsnum > t2.LEVEL
)
SELECT *
FROM recurs
ORDER BY addressnum
Wouldn't the script return multiple lines for different contacts anyway?
CREATE TABLE tblAddress (
AddressID int IDENTITY
, [Address] nvarchar(35)
);
CREATE TABLE tblContact (
ContactID int IDENTITY
, Contact nvarchar(35)
, AddressNum int
, labelsNum int
);
INSERT INTO tblAddress VALUES ('foo1');
INSERT INTO tblAddress VALUES ('foo2');
INSERT INTO tblContact VALUES ('bar1', 1, 1);
INSERT INTO tblContact VALUES ('bar2', 2, 2);
INSERT INTO tblContact VALUES ('bar3', 2, 2);
SELECT * FROM tblAddress a JOIN tblContact c ON a.AddressID = c.AddressNum
This yields 3 rows on my end. The labelsNum column seems redundant to me. If you add a third contact for address foo2, you would have to update all labelsNum columns for all records referencing foo2 in order to keep things consistent.
The amount of labels is already determined by the amount of different contacts.
Or am I missing something?

Using GROUP-BY-HAVING on two joined tables

I'm trying to join two tables, and select columns from both based on where constraints as a well as a group-by-having condition. I'm experiencing some issues and behavior that I do not understand. I'm using sybase. A simple example below
CREATE TABLE #test(
name varchar(4),
num int,
cat varchar(3)
)
CREATE TABLE #other(
name varchar(4),
label varchar(20)
)
Insert #test VALUES('a',2,'aa')
Insert #test VALUES ('b',2,'aa')
Insert #test VALUES ('c',3,'bb')
Insert #test VALUES ( 'a',3,'aa')
Insert #test VALUES ( 'd',4,'aa')
Insert #other VALUES('a','this label is a')
Insert #other VALUES ('b','this label is b')
Insert #other VALUES ('c','this label is c')
Insert #other VALUES ( 'd','this label is d')
SELECT t.name,t.num,o.label
FROM #other o inner JOIN #test t ON o.name=t.name
WHERE t.name='a'
GROUP BY t.name
HAVING t.num=MAX(t.num)
I get non-sense when I have the GROUP BY (the label columns are clearly related to a different t.name). If I cut out the GROUP BY statement the query behaves as I would expect, but then I am forced to use this as a subquery and then apply
SELECT * FROM (subquery here) s GROUP BY s.name having s.num=MAX(s.num)
There has to be a better way of doing this. Any help and explanation for this behavior would be very appreciated.
**I should clarify. In my actual query I have something like
SELECT .... FROM (joined tables) WHERE name IN (long list of names), GROUP BY .....
You can try the following, if I understand your query well.
SELECT t.name,t.num,o.label
FROM #other o inner JOIN #test t ON o.name=t.name
WHERE t.name='a' AND t.num=MAX(t.num)
Your GROUP BY must include t.name, t.num and o.label. If you do this
GROUP BY t.name, t.num, o.label
then the query executes without error.
However you're not computing any aggregate values in the group by.
What are you trying to do?
If I understand your requirement correctly, running this...
SELECT t.name, num=MAX(t.num), o.label
FROM #other o
INNER JOIN #test t ON o.name=t.name
WHERE t.name='a'
GROUP BY t.name, o.label;
...gives me this result:
name num label
---- ----------- --------------------
a 3 this label is a

How can I condense strings in several string rows into a single field?

For a class project, a few others and I have decided to make a (very ugly) limited clone of StackOverflow. For this purpose, we're working on one query:
Home Page: List all the questions, their scores (calculated from votes), and the user corresponding to their first revision, and the number of answers, sorted in date-descending order according to the last action on the question (where an action is an answer, an edit of an answer, or an edit of the question).
Now, we've gotten the entire thing figured out, except for how to represent tags on questions. We're currently using a M-N mapping of tags to questions like this:
CREATE TABLE QuestionRevisions (
id INT IDENTITY NOT NULL,
question INT NOT NULL,
postDate DATETIME NOT NULL,
contents NTEXT NOT NULL,
creatingUser INT NOT NULL,
title NVARCHAR(200) NOT NULL,
PRIMARY KEY (id),
CONSTRAINT questionrev_fk_users FOREIGN KEY (creatingUser) REFERENCES
Users (id) ON DELETE CASCADE,
CONSTRAINT questionref_fk_questions FOREIGN KEY (question) REFERENCES
Questions (id) ON DELETE CASCADE
);
CREATE TABLE Tags (
id INT IDENTITY NOT NULL,
name NVARCHAR(45) NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE QuestionTags (
tag INT NOT NULL,
question INT NOT NULL,
PRIMARY KEY (tag, question),
CONSTRAINT qtags_fk_tags FOREIGN KEY (tag) REFERENCES Tags(id) ON
DELETE CASCADE,
CONSTRAINT qtags_fk_q FOREIGN KEY (question) REFERENCES Questions(id) ON
DELETE CASCADE
);
Now, for this query, if we just join to QuestionTags, then we'll get the questions and titles over and over and over again. If we don't, then we have an N query scenario, which is just as bad. Ideally, we'd have something where the result row would be:
+-------------+------------------+
| Other Stuff | Tags |
+-------------+------------------+
| Blah Blah | TagA, TagB, TagC |
+-------------+------------------+
Basically -- for each row in the JOIN, do a string join on the resulting tags.
Is there a built in function or similar which can accomplish this in T-SQL?
Here's one possible solution using recursive CTE:
The methods used are explained here
TSQL to set up the test data (I'm using table variables):
DECLARE #QuestionRevisions TABLE (
id INT IDENTITY NOT NULL,
question INT NOT NULL,
postDate DATETIME NOT NULL,
contents NTEXT NOT NULL,
creatingUser INT NOT NULL,
title NVARCHAR(200) NOT NULL)
DECLARE #Tags TABLE (
id INT IDENTITY NOT NULL,
name NVARCHAR(45) NOT NULL
)
DECLARE #QuestionTags TABLE (
tag INT NOT NULL,
question INT NOT NULL
)
INSERT INTO #QuestionRevisions
(question,postDate,contents,creatingUser,title)
VALUES
(1,GETDATE(),'Contents 1',1,'TITLE 1')
INSERT INTO #QuestionRevisions
(question,postDate,contents,creatingUser,title)
VALUES
(2,GETDATE(),'Contents 2',2,'TITLE 2')
INSERT INTO #Tags (name) VALUES ('Tag 1')
INSERT INTO #Tags (name) VALUES ('Tag 2')
INSERT INTO #Tags (name) VALUES ('Tag 3')
INSERT INTO #Tags (name) VALUES ('Tag 4')
INSERT INTO #Tags (name) VALUES ('Tag 5')
INSERT INTO #Tags (name) VALUES ('Tag 6')
INSERT INTO #QuestionTags (tag,question) VALUES (1,1)
INSERT INTO #QuestionTags (tag,question) VALUES (3,1)
INSERT INTO #QuestionTags (tag,question) VALUES (5,1)
INSERT INTO #QuestionTags (tag,question) VALUES (4,2)
INSERT INTO #QuestionTags (tag,question) VALUES (2,2)
Here's the action part:
;WITH CTE ( id, taglist, tagid, [length] )
AS ( SELECT question, CAST( '' AS VARCHAR(8000) ), 0, 0
FROM #QuestionRevisions qr
GROUP BY question
UNION ALL
SELECT qr.id
, CAST(taglist + CASE WHEN [length] = 0 THEN '' ELSE ', ' END + t.name AS VARCHAR(8000) )
, t.id
, [length] + 1
FROM CTE c
INNER JOIN #QuestionRevisions qr ON c.id = qr.question
INNER JOIN #QuestionTags qt ON qr.question=qt.question
INNER JOIN #Tags t ON t.id=qt.tag
WHERE t.id > c.tagid )
SELECT id, taglist
FROM ( SELECT id, taglist, RANK() OVER ( PARTITION BY id ORDER BY length DESC )
FROM CTE ) D ( id, taglist, rank )
WHERE rank = 1;
This was the solution I ended up settling on. I checkmarked Mack's answer because it works with arbitrary numbers of tags, and because it matches what I asked for in my question. I ended up though going with this, however, simply because I understand what this is doing, while I have no idea how Mack's works :)
WITH tagScans (qRevId, tagName, tagRank)
AS (
SELECT DISTINCT
QuestionTags.question AS qRevId,
Tags.name AS tagName,
ROW_NUMBER() OVER (PARTITION BY QuestionTags.question ORDER BY Tags.name) AS tagRank
FROM QuestionTags
INNER JOIN Tags ON Tags.id = QuestionTags.tag
)
SELECT
Questions.id AS id,
Questions.currentScore AS currentScore,
answerCounts.number AS answerCount,
latestRevUser.id AS latestRevUserId,
latestRevUser.caseId AS lastRevUserCaseId,
latestRevUser.currentScore AS lastRevUserScore,
CreatingUsers.userId AS creationUserId,
CreatingUsers.caseId AS creationUserCaseId,
CreatingUsers.userScore AS creationUserScore,
t1.tagName AS tagOne,
t2.tagName AS tagTwo,
t3.tagName AS tagThree,
t4.tagName AS tagFour,
t5.tagName AS tagFive
FROM Questions
INNER JOIN QuestionRevisions ON QuestionRevisions.question = Questions.id
INNER JOIN
(
SELECT
Questions.id AS questionId,
MAX(QuestionRevisions.id) AS maxRevisionId
FROM Questions
INNER JOIN QuestionRevisions ON QuestionRevisions.question = Questions.id
GROUP BY Questions.id
) AS LatestQuestionRevisions ON QuestionRevisions.id = LatestQuestionRevisions.maxRevisionId
INNER JOIN Users AS latestRevUser ON latestRevUser.id = QuestionRevisions.creatingUser
INNER JOIN
(
SELECT
QuestionRevisions.question AS questionId,
Users.id AS userId,
Users.caseId AS caseId,
Users.currentScore AS userScore
FROM Users
INNER JOIN QuestionRevisions ON QuestionRevisions.creatingUser = Users.id
INNER JOIN
(
SELECT
MIN(QuestionRevisions.id) AS minQuestionRevisionId
FROM Questions
INNER JOIN QuestionRevisions ON QuestionRevisions.question = Questions.id
GROUP BY Questions.id
) AS QuestionGroups ON QuestionGroups.minQuestionRevisionId = QuestionRevisions.id
) AS CreatingUsers ON CreatingUsers.questionId = Questions.id
INNER JOIN
(
SELECT
COUNT(*) AS number,
Questions.id AS questionId
FROM Questions
INNER JOIN Answers ON Answers.question = Questions.id
GROUP BY Questions.id
) AS answerCounts ON answerCounts.questionId = Questions.id
LEFT JOIN tagScans AS t1 ON t1.qRevId = QuestionRevisions.id AND t1.tagRank = 1
LEFT JOIN tagScans AS t2 ON t2.qRevId = QuestionRevisions.id AND t2.tagRank = 2
LEFT JOIN tagScans AS t3 ON t3.qRevId = QuestionRevisions.id AND t3.tagRank = 3
LEFT JOIN tagScans AS t4 ON t4.qRevId = QuestionRevisions.id AND t4.tagRank = 4
LEFT JOIN tagScans AS t5 ON t5.qRevId = QuestionRevisions.id AND t5.tagRank = 5
ORDER BY QuestionRevisions.postDate DESC
This is a common question that comes up quite often phrased in a number of different ways (concatenate rows as string, merge rows as string, condense rows as string, combine rows as string, etc.). There are two generally accepted ways to handle combining an arbitrary number of rows into a single string in SQL Server.
The first, and usually the easiest, is to abuse XML Path combined with the STUFF function like so:
select rsQuestions.QuestionID,
stuff((select ', '+ rsTags.TagName
from #Tags rsTags
inner join #QuestionTags rsMap on rsMap.TagID = rsTags.TagID
where rsMap.QuestionID = rsQuestions.QuestionID
for xml path(''), type).value('.', 'nvarchar(max)'), 1, 1, '')
from #QuestionRevisions rsQuestions
Here is a working example (borrowing some slightly modified setup from Mack). For your purposes you could store the results of that query in a common table expression, or in a subquery (I'll leave that as an exercise).
The second method is to use a recursive common table expression. Here is an annotated example of how that would work:
--NumberedTags establishes a ranked list of tags for each question.
--The key here is using row_number() or rank() partitioned by the particular question
;with NumberedTags (QuestionID, TagString, TagNum) as
(
select QuestionID,
cast(TagName as nvarchar(max)) as TagString,
row_number() over (partition by QuestionID order by rsTags.TagID) as TagNum
from #QuestionTags rsMap
inner join #Tags rsTags on rsTags.TagID = rsMap.TagID
),
--TagsAsString is the recursive query
TagsAsString (QuestionID, TagString, TagNum) as
(
--The first query in the common table expression establishes the anchor for the
--recursive query, in this case selecting the first tag for each question
select QuestionID,
TagString,
TagNum
from NumberedTags
where TagNum = 1
union all
--The second query in the union performs the recursion by joining the
--anchor to the next tag, and so on...
select NumberedTags.QuestionID,
TagsAsString.TagString + ', ' + NumberedTags.TagString,
NumberedTags.TagNum
from NumberedTags
inner join TagsAsString on TagsAsString.QuestionID = NumberedTags.QuestionID
and NumberedTags.TagNum = TagsAsString.TagNum + 1
)
--The result of the recursive query is a list of tag strings building up to the final
--string, of which we only want the last, so here we select the longest one which
--gives us the final result
select QuestionID, max(TagString)
from TagsAsString
group by QuestionID
And here is a working version. Again, you could use the results in a common table expression or subquery to join against your other tables to get your ultimate result. Hopefully the annotations help you understand a little more how the recursive common table expression works (though the link in Macks answer also goes into some detail about the method).
There is, of course, another way to do it, which doesn't handle an arbitrary number of rows, which is to join against your table aliased multiple times, which is what you did in your answer.

How to show fields from most recently added detail in a view?

QUERY:
drop table #foot
create table #foot (
id int primary key not null,
name varchar(50) not null
)
go
drop table #note
create table #note (
id int primary key not null,
note varchar(MAX) not null,
foot_id int not null references #foot(id)
)
go
insert into #foot values
(1, 'Joe'), (2, 'Mike'), (3, 'Rob')
go
insert into #note (id, note, foot_id) values (1, 'Joe note 1', 1)
go
insert into #note (id, note, foot_id) values(2, 'Joe note 2', 1)
go
insert into #note (id, note, foot_id) values(3, 'Mike note 1', 2)
go
select F.name, N.note, N.id
from #foot F left outer join #note N on N.foot_id=F.id
RESULT:
QUESTION:
How can I create a view/query resulting in one row for each master record (#foot) along with fields from the most recently inserted detail (#note), if any?
GOAL:
(NOTE: the way I would tell which one is most recent is the id which would be higher for newer records)
select t.name, t.note, t.id
from (select F.name, N.note, N.id,
ROW_NUMBER() over(partition by F.id order by N.id desc) as RowNum
from #foot F
left outer join #note N
on N.foot_id=F.id) t
where t.RowNum = 1
Assuming the ID created in the #note table is always incremental (imposed by using IDENTITY or by controlling the inserts to always increment the by by max value) you can use the following query (which uses rank function):
WITH Dat AS
(
SELECT f.name,
n.note,
n.id,
RANK() OVER(PARTITION BY n.foot_id ORDER BY n.id DESC) rn
FROM #foot f LEFT OUTER JOIN #note n
ON n.foot_id = f.id
)
SELECT *
FROM Dat
WHERE rn = 1