Flattening multiple rows - sql

I'm trying to combine the results of multiple rows into one. I would like to flatten the first two rows below, and perhaps combine the keywords so that they are in the same column. How can I do that simply using a select statement (i.e. no functions)?
I'm currently getting:
documentid documentname keyword
1 doc1 politics politics italy
2 doc2 politics politics italy
I would like to get:
documentid documentname keyword
1 doc1 politics italy
2 doc2 politics
This is part of my query:
SELECT d.DocumentId AS documentid ,
m.Title AS documentname ,
STUFF(( SELECT N' ' + k.Word
FROM [arabicarchive].[dbo].[Keywords] k
JOIN [arabicarchive].[dbo].DocumentKeywords dk ON k.KeywordId = dk.Keyword_KeywordId
JOIN [arabicarchive].[dbo].Documents d ON dk.Document_DocumentId = d.DocumentId
FOR
XML PATH('')
), 1, 1, '') AS Keyword
FROM [arabicarchive].[dbo].[Metadatas] m
JOIN [arabicarchive].[dbo].[Documents] d ON d.DocumentId = m.DocumentId
WHERE d.Status = 1
EDIT: I have updated the query and the results that I am currently getting. I haven't used STUFF or XML PATH before so please bear with me.
EDIT 2: I have managed to get rid of the duplicate row, but the result in the keyword column is not correct.
EDIT 3: Adding DISTINCT to the query still doesn't produce a keyword column with the correct values.

In an article I once wrote, towards the end, I showed how to do this
Database Migration Scripts: Getting from place A to place B
If you look near the end, you'll see some code that is similar to what you want. I've modified it to fit your case a bit more closely.
SELECT title_ID, title, ltrim(
(SELECT distinct ' '+tagname.tag FROM titles thisTitle
INNER JOIN TagTitle ON titles.title_ID=TagTitle.Title_ID
INNER JOIN Tagname ON Tagname.TagName_ID=TagTitle.TagName_ID
WHERE ThisTitle.title_id=titles.title_ID
FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'))
FROM titles
ORDER BY title_ID
I reckon that something like this would work but I've no means of testing it!...
SELECT d.DocumentId AS documentid , m.Title AS documentname ,
ltrim(
( SELECT distinct N' ' + k.Word
FROM [arabicarchive].[dbo].[Keywords] k
JOIN [arabicarchive].[dbo].DocumentKeywords dk ON k.KeywordId = dk.Keyword_KeywordId
Where dk.Document_DocumentId=d.DocumentId
FOR
XML PATH(''), TYPE).value('.', 'nvarchar(max)')
) AS Keyword
FROM [arabicarchive].[dbo].[Metadatas] m
JOIN [arabicarchive].[dbo].[Documents] d ON d.DocumentId = m.DocumentId
WHERE d.Status = 1

Related

Update a column in Table B with multiple row values from Table A using XML PATH

I have 4 columns in Table A viz., Inv_Num1, Inv_Date1, Inv_Amt1, Inv_DocNum1
I have 4 columns in Table B viz., Inv_Num2, Inv_Date2, Inv_Amt2, Inv_Status2
I would like to match the rows between Table A and Table B using an inner join where the ON condition is
Inv_Num1=Inv_Num2 AND Inv_Date1=Inv_Date2 AND Inv_Amt1=Inv_Amt2
When I do this matching I may get more than one row from Table A (Inv_DocNum1 column).
I tried the XML PATH code, but I don't know how to use it in an UPDATE statement:
update cis2
set cis2.Inv_Status2 =
(SELECT
TypeName = STUFF((
SELECT '; ' + imd1.Inv_DocNum1
FROM [VRS].[Table_B] cis1
INNER JOIN [Table_A] imd1
ON cis1.Inv_Num1 = imd1.Inv_Num2
WHERE cis1.Inv_Num1 = imd1.Inv_Num2
AND cis1.Inv_Date1 = imd1.Inv_Date2
AND cis1.Inv_Amt1 = imd1.Inv_Amt2
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
) FROM Table_B cis2
Doing this to your database is against good practice, since it violates 1NF. But you could still do this if you are dead set on it. Something along these lines should work.
with myCte as
(
-- one row per (Inv_Num2, Inv_Date2, Inv_Amt2) key in Table B
SELECT b.Inv_Num2, b.Inv_Date2, b.Inv_Amt2
, TypeName = STUFF((
-- correlated subquery: only the Table A rows matching this key
SELECT '; ' + a.Inv_DocNum1
FROM [Table_A] a
WHERE a.Inv_Num1 = b.Inv_Num2
AND a.Inv_Date1 = b.Inv_Date2
AND a.Inv_Amt1 = b.Inv_Amt2
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') -- strip the leading '; '
from [VRS].[Table_B] b
group by b.Inv_Num2, b.Inv_Date2, b.Inv_Amt2
)
update tb
set Inv_Status2 = c.TypeName
from Table_B tb
join myCte c on c.Inv_Num2 = tb.Inv_Num2
and c.Inv_Date2 = tb.Inv_Date2
and c.Inv_Amt2 = tb.Inv_Amt2
The answer has two parts. First, you need to produce your comma-separated list per row. The best way to do it is STRING_AGG, available from SQL Server 2017 (https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-2017)
You will need to use it with GROUP BY, like select <key fields>, STRING_AGG(Inv_DocNum1, ',') ... group by <key fields>, where <key fields> stands for the three fields that form your unique key.
Second, you need to use update ... from syntax, see https://learn.microsoft.com/en-us/sql/t-sql/queries/update-transact-sql?view=sql-server-2017#l-specifying-a-table-alias-as-the-target-object. In your case, it will be from your target table joining the resultset you computed at step one.
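The two steps above might be combined roughly as in the sketch below. It assumes SQL Server 2017 or later (for STRING_AGG), that Inv_DocNum1 is a character column, and that the three matching columns from the question identify the rows to update. The CTE name DocLists and the column name DocNums are made up for illustration.

```sql
-- Sketch only: table and column names follow the question;
-- DocLists and DocNums are hypothetical names.
WITH DocLists AS
(
    -- Step 1: one delimited list per (number, date, amount) key in Table A.
    -- The CAST avoids the 8000-byte limit of STRING_AGG on non-MAX input.
    SELECT a.Inv_Num1, a.Inv_Date1, a.Inv_Amt1,
           STRING_AGG(CAST(a.Inv_DocNum1 AS NVARCHAR(MAX)), '; ') AS DocNums
    FROM Table_A AS a
    GROUP BY a.Inv_Num1, a.Inv_Date1, a.Inv_Amt1
)
-- Step 2: UPDATE ... FROM, joining the target table to the aggregated result.
UPDATE b
SET Inv_Status2 = d.DocNums
FROM Table_B AS b
JOIN DocLists AS d
  ON d.Inv_Num1  = b.Inv_Num2
 AND d.Inv_Date1 = b.Inv_Date2
 AND d.Inv_Amt1  = b.Inv_Amt2;
```

Rows in Table B without a match in Table A are left untouched, because the join is an inner join.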

display more than one value using a SQL query

I am trying to display multiple authors per title in a single column. At the moment there are repeating rows, because some titles have more than one FirstName. Is there a form of concatenation that can resolve this and display all the authors in a single field, perhaps separated by commas?
This is my current query:
SELECT
Submission.Title, Researcher.FirstName, Submission.Type
FROM
Submission
INNER JOIN
((Faculty
INNER JOIN
School ON Faculty.FacultyID = School.[FacultyID])
INNER JOIN
(Researcher
INNER JOIN
ResearcherSubmission ON Researcher.ResearcherID = ResearcherSubmission.ResearcherID)
ON School.SchoolID = Researcher.SchoolID)
ON Submission.SubmissionID = ResearcherSubmission.SubmissionID
GROUP BY
Submission.Title, Researcher.FirstName, Submission.Type;
This is the output it generates:
[screenshot not available]
This is the output I am trying to generate:
Title                  FirstName                Type
---------------------------------------------------------------------------
21st Century Business  Matthew, Teshar          Book Chapter
A Family Tree...       Keshant, Lawrence        Book Chapter
Benefits of BPM...     Jafta                    Journal Article
Business Innovation    Matthew, Morna, Teshar   Book Chapter
You may include the concatenation logic within a CROSS APPLY:
SELECT
Submission.Title
, CA.FirstNames
, Submission.Type
FROM Submission
CROSS APPLY (
SELECT
STUFF((
SELECT /* DISTINCT ??? */
', ' + r.FirstName
FROM ResearcherSubmission rs
INNER JOIN Researcher r ON r.ResearcherID = rs.ResearcherID
WHERE Submission.SubmissionID = rs.SubmissionID
FOR XML PATH (''), TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 2, '') -- remove the leading ', ' without leaving a space behind
) AS CA (FirstNames)
GROUP BY
Submission.Title
, CA.FirstNames
, Submission.Type
;
NB: I'm not sure whether you need DISTINCT in the subquery when concatenating the names. For example, if there were a 'Jane' (Smith) and a 'Jane' (Jones), do you want the final list to be 'Jane' or 'Jane, Jane'?
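The difference DISTINCT makes can be seen in a small self-contained sketch. The table variable @Names and its rows are invented sample data, not the asker's tables.

```sql
-- Hypothetical sample data: two different researchers share the first name 'Jane'.
DECLARE @Names TABLE (SubmissionID int, FirstName nvarchar(50));
INSERT INTO @Names VALUES (1, N'Jane'), (1, N'Jane'), (1, N'Matthew');

SELECT
    -- Without DISTINCT every row contributes: 'Jane, Jane, Matthew'
    STUFF((SELECT ', ' + FirstName
           FROM @Names
           WHERE SubmissionID = 1
           ORDER BY FirstName
           FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') AS AllNames,
    -- With DISTINCT, 'Jane' is listed only once
    STUFF((SELECT DISTINCT ', ' + FirstName
           FROM @Names
           WHERE SubmissionID = 1
           FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') AS DistinctNames;
```

Note that DISTINCT works on the concatenated expression, so the order of names in the second column is not guaranteed.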
You can do this in your application logic as well.
But if you want to do it with a query, you should be able to do something like this:
SELECT DISTINCT
sm.Title,
STUFF(
(SELECT ', ' + r.FirstName
FROM ResearcherSubmission rs
INNER JOIN Researcher r ON r.ResearcherID = rs.ResearcherID
WHERE sm.SubmissionID = rs.SubmissionID
FOR XML PATH('')), 1, 2, '') AS FirstNames,
sm.Type
FROM Submission sm
You can use the query below to generate the output you want from the output you have now.
CREATE TABLE #temptable(Title VARCHAR(200), FirstName VARCHAR(200), Type VARCHAR(200))
INSERT INTO #temptable
SELECT 'Book1','Matt','Chapter' UNION
SELECT 'Book1','Tesh','Chapter' UNION
SELECT 'BPM','Jafta','Article' UNION
SELECT 'Ethics','William','Journal' UNION
SELECT 'Ethics','Lawrence','Journal' UNION
SELECT 'Ethics','Vincent','Journal' UNION
SELECT 'Cellular','Jane','Conference'
SELECT Title
,STUFF((SELECT ', ' + FirstName [text()]
FROM #temptable
WHERE Title = t.Title
FOR XML PATH(''), TYPE)
.value('.','NVARCHAR(MAX)'),1,2,'') List_Output
,Type
FROM #temptable t
GROUP BY Title,Type

SQL Server dynamically change WHERE clause in a SELECT based on returned data

I'm mainly a presentation/logic tier developer and don't mess around with SQL all that much but I have a problem and am wondering if it's impossible within SQL as it's not a full programming language.
I have a field ContactID which has a CompanyID attached to it.
In another table, the CompanyID is attached to a CompanyName.
I am trying to create a SELECT statement that returns ONE ContactID per row and, in a separate column, an aggregate of all the companies attached to this contact (by name).
E.G
ContactID - CompanyID - CompanyName
***********************************
1 001 Lol
1 002 Haha
1 003 Funny
2 002 Haha
2 004 Lmao
I want to return
ContactID - Companies
*********************
1 Lol, Haha, Funny
2 Haha, Lmao
I have found the logic to do so with ONE ContactID at a time:
SELECT x.ContactID, substring(
(
SELECT ', '+y.CompanyName AS [text()]
FROM TblContactCompany x INNER JOIN TblCompany y ON x.CompanyID = y.CompanyID WHERE x.ContactID = 13963
For XML PATH (''), root('MyString'), type
).value('/MyString[1]','varchar(max)')
, 3, 1000)
[OrgNames] from TblContact x WHERE x.ContactID = 13963
As you can see here, I am hardcoding the ContactID 13963, which is necessary to return only the companies this individual is linked to.
The issue is when I want to return this aggregate information PER ROW on a much bigger scale SELECT (on a whole table full of ContactID's).
I want to have x.ContactID = (this.ContactID) but I can't figure out how!
Failing this, could I run one statement to return a list of ContactID's, then in the same StoredProc run another statement that LOOPS through this list of ContactID's (essentially performing the second statement x times where x = no. of ContactID's)?
Any help greatly appreciated.
You want a correlated subquery:
SELECT ct.ContactID,
stuff((SELECT ', ' + co.CompanyName AS [text()]
FROM TblContactCompany cc INNER JOIN
TblCompany co
ON cc.CompanyID = co.CompanyID
WHERE cc.ContactID = ct.ContactId
For XML PATH (''), root('MyString'), type
).value('/MyString[1]', 'varchar(max)'),
1, 2, '')
[OrgNames]
from TblContact ct;
Note the where clause on the inner subquery.
I also made two other changes:
I changed the table aliases to better represent the table names. This makes queries easier to understand. (Plus, the aliases had to be changed because you were using x in the outer query and the inner query.)
I replaced the substring() with stuff(), which does exactly what you want.
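To see why stuff() is the better fit, here is a small standalone comparison. The variable @list is made up and holds the kind of string the FOR XML subquery produces, with a leading delimiter.

```sql
-- Hypothetical input: a concatenated list that still has its leading ', '.
DECLARE @list varchar(max) = ', Lol, Haha, Funny';

SELECT
    -- STUFF deletes 2 characters starting at position 1 and inserts '' there,
    -- so the rest of the string is kept whatever its length.
    STUFF(@list, 1, 2, '')    AS WithStuff,      -- 'Lol, Haha, Funny'
    -- SUBSTRING needs an explicit length, so anything past 1000 characters
    -- would be cut off silently.
    SUBSTRING(@list, 3, 1000) AS WithSubstring;  -- 'Lol, Haha, Funny'
```

Both columns match here, but only the STUFF version stays correct for lists longer than the hardcoded length.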
You could use a table variable to store the required ContactIDs and then filter on it with an IN clause in the WHERE clause of your main query, like below:
WHERE
...
x.ContactID IN (SELECT ContactID FROM @YourTableVariable)
I guess all you need to do is to use unique table identifiers in your subquery and join the table in subquery with outer table x:
SELECT x.ContactID, substring(
(
SELECT ', '+z.CompanyName AS [text()]
FROM TblContactCompany y, TblCompany z WHERE y.CompanyID = z.CompanyID AND y.ContactId = x.ContactId
For XML PATH (''), root('MyString'), type
).value('/MyString[1]','varchar(max)')
, 3, 1000)
[OrgNames] from TblContact x
Don't loop, or you will get performance problems (RBAR, row by agonising row). Instead, write set-based queries.
This is untested but should give you an idea of how it may work:
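SELECT
x.ContactID,
substring(
(SELECT ', '+z.CompanyName AS [text()]
FROM TblContactCompany y
INNER JOIN TblCompany z ON z.CompanyID = y.CompanyID -- CompanyName lives in TblCompany
WHERE y.ContactID = x.ContactID -- correlate on the contact of the outer row
For XML PATH (''), root('MyString'), type).value('/MyString[1]','varchar(max)')
, 3, 1000)
[OrgNames]
FROM TblContact x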
And I have a feeling you can use CONCAT instead of substring

How to concatenate distinct columns of otherwise duplicate rows into one row without using FOR XML PATH('')?

I have this query:
SELECT DISTINCT
f1.CourseEventKey,
STUFF
(
(
SELECT '; ' + Title
FROM (
SELECT DISTINCT
ces.CourseEventKey,
f.Title
FROM CourseEventSchedule ces
INNER JOIN Facility f ON f.FacilityKey = ces.FacilityKey
WHERE ces.CourseEventKey IN
(
SELECT CourseEventKey
FROM #CourseEvents
)
) f2
WHERE f2.CourseEventKey = f1.CourseEventKey
FOR XML PATH('')
), 1, 2, ''
)
FROM (
SELECT DISTINCT
ces.CourseEventKey,
f.Title
FROM CourseEventSchedule ces
INNER JOIN Facility f ON f.FacilityKey = ces.FacilityKey
WHERE ces.CourseEventKey IN
(
SELECT CourseEventKey
FROM #CourseEvents
)
) f1
It produces this result set:
CourseEventKey Titles
-------------- ----------------------------------
29 Test Facility 1
30 Memphis Training Room
32 Drury Inn & Suites Creve Coeur
The data is accurate, but I can't have FOR XML PATH('') because it escapes certain special characters.
To be clear, I'm using FOR XML PATH('') because it is possible for records with the same CourseEventKey to have multiple Facility titles associated with them.
How can I retain the data returned by this query without using FOR XML PATH('')?
Change the "for xml path('') )" section to "for xml path(''), root('root'), type).query('root').value('.', 'varchar(max)')"; this will unescape the characters correctly.
Sorry for the poor formatting but not at my computer right now. I can give a full example later if you need.
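A fuller illustration of that change might look like the sketch below. The table variable @Titles and its two rows are invented stand-ins for the Facility titles, chosen to contain characters that XML escapes.

```sql
-- Hypothetical sample data containing '&', '<' and '>'.
DECLARE @Titles TABLE (CourseEventKey int, Title nvarchar(100));
INSERT INTO @Titles VALUES (32, N'Drury Inn & Suites'), (32, N'Room <A>');

SELECT
    -- Plain FOR XML PATH(''): the result is XML text, so special characters are entitized.
    -- Gives: Drury Inn &amp; Suites; Room &lt;A&gt;
    STUFF((SELECT '; ' + Title
           FROM @Titles
           ORDER BY Title
           FOR XML PATH('')), 1, 2, '') AS Escaped,
    -- With ROOT/TYPE plus .query().value(), the XML is converted back to a plain string.
    -- Gives: Drury Inn & Suites; Room <A>
    STUFF((SELECT '; ' + Title
           FROM @Titles
           ORDER BY Title
           FOR XML PATH(''), ROOT('root'), TYPE).query('root').value('.', 'varchar(max)'), 1, 2, '') AS Unescaped;
```

In the original query, the same replacement would go in the subquery that feeds STUFF, leaving the outer SELECT DISTINCT unchanged.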

How should I modify this SQL statement?

My SQL Server view
SELECT
geo.HyperLinks.CatID, geo.Tags.Tag, geo.HyperLinks.HyperLinksID
FROM
geo.HyperLinks LEFT OUTER JOIN
geo.Tags INNER JOIN
geo.TagsList ON geo.Tags.TagID = geo.TagsList.TagID ON geo.HyperLinks.HyperLinksID = geo.TagsList.HyperLinksID WHERE HyperLinksID = 1
returns these...
HyperLinksID CatID Tags
1 2 Sport
1 2 Tennis
1 2 Golf
How should I modify the above to have results like
HyperLinksID CatID TagsInOneRowSeperatedWithSpaceCharacter
1 2 Sport Tennis Golf
UPDATE: As Brad suggested, I came up with this...
DECLARE @TagList varchar(100)
SELECT @TagList = COALESCE(@TagList + ', ', '') + CAST(TagID AS nvarchar(100))
FROM TagsList
WHERE HyperLinksID = 1
SELECT @TagList
Now the result looks like
HyperLinksID CatID TagsInOneRowSeperatedWithSpaceCharacter
1 2 ID_OF_Sport ID_OF_Tennis ID_OF_Golf
And of course I have to combine the contents of the @TagList variable with the original SELECT statement...
Which means that I'll have to wait for the holy SO bounty :(
If SQL, try this post:
Concatenating Row Values
If you want to try your hand at CLR code, there are examples of creating a custom aggregate function for concatenation, again, for MS SQL.
This post is pretty exhaustive with lots of ways to accomplish your goal.
Using the approach from here to avoid any issues if your tag names contain special XML characters:
;With HyperLinks As
(
SELECT 1 AS HyperLinksID, 2 AS CatID
),
TagsList AS
(
SELECT 1 AS TagId, 1 AS HyperLinksID UNION ALL
SELECT 2 AS TagId, 1 AS HyperLinksID UNION ALL
SELECT 3 AS TagId, 1 AS HyperLinksID
)
,
Tags AS
(
SELECT 1 AS TagId, 'Sport' as Tag UNION ALL
SELECT 2 AS TagId, 'Tennis' as Tag UNION ALL
SELECT 3 AS TagId, 'Golf' as Tag
)
SELECT HyperLinksID,
CatID ,
(SELECT mydata
FROM ( SELECT Tag AS [data()]
FROM Tags t
JOIN TagsList tl
ON t.TagId = tl.TagId
WHERE tl.HyperLinksID = h.HyperLinksID
ORDER BY t.TagId
FOR XML PATH(''), TYPE
) AS d ( mydata ) FOR XML RAW,
TYPE
)
.value( '/row[1]/mydata[1]', 'varchar(max)' ) TagsInOneRowSeperatedWithSpaceCharacter
FROM HyperLinks h
Edit: As KM points out in the comments this method actually automatically adds spaces so I've removed the manually added spaces. For delimiters other than spaces such as commas Peter's answer seems more appropriate.
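The automatic spacing can be checked in isolation with a short sketch. The table variable @Tags is invented sample data mirroring the tags in the question.

```sql
-- Hypothetical sample data.
DECLARE @Tags TABLE (TagId int, Tag varchar(20));
INSERT INTO @Tags VALUES (1, 'Sport'), (2, 'Tennis'), (3, 'Golf');

SELECT
    -- Columns named data() are treated as atomic values, and FOR XML puts
    -- a single space between adjacent atomic values: 'Sport Tennis Golf'
    (SELECT Tag AS [data()]
     FROM @Tags
     ORDER BY TagId
     FOR XML PATH('')) AS WithDataAlias,
    -- An unnamed expression is emitted as plain text with no separator: 'SportTennisGolf'
    (SELECT Tag + ''
     FROM @Tags
     ORDER BY TagId
     FOR XML PATH('')) AS WithoutAlias;
```

This is why the data() form suits space-delimited lists only, while other delimiters are added by hand and trimmed with STUFF.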
If you know your data will not contain any problematic characters then a simpler (probably more performant) version is
SELECT CatID ,
HyperLinksID,
stuff(
( SELECT ' ' + Tag
FROM Tags t
JOIN TagsList tl
ON t.TagId = tl.TagId
WHERE tl.HyperLinksID = h.HyperLinksID
ORDER BY t.TagId
FOR XML PATH('')
), 1, 1, '') TagsInOneRowSeperatedWithSpaceCharacter
FROM HyperLinks h
Use FOR XML in a correlated subquery. For a space-delimited list:
SELECT h.HyperLinksID, h.CatID
, TagList = (
SELECT t.Tag AS [data()]
FROM geo.TagsList l
JOIN geo.Tags t ON l.TagId = t.TagId
WHERE l.HyperLinksID = h.HyperLinksID
ORDER BY t.Tag
FOR XML PATH(''), TYPE
).value('.','NVARCHAR(MAX)')
FROM geo.HyperLinks AS h
WHERE h.HyperLinksID = 1
For any other delimiter:
SELECT h.HyperLinksID, h.CatID
, TagList = STUFF((
SELECT ', '+t.Tag
FROM geo.TagsList l
JOIN geo.Tags t ON l.TagId = t.TagId
WHERE l.HyperLinksID = h.HyperLinksID
ORDER BY t.Tag
FOR XML PATH(''), TYPE
).value('.','NVARCHAR(MAX)')
,1,2,'')
FROM geo.HyperLinks AS h
WHERE h.HyperLinksID = 1
The subquery creates a delimited list, and then STUFF(...,1,2,'') removes the leading ', ' delimiter. TYPE).value() gets around the most common problems with special characters in XML.