SQL Merge Cells from many rows into a single cell - sql

Similar to the question here:
http://forums.asp.net/t/1580379.aspx/1
I'm trying to merge common cells into a single comma-delimited cell, however across an inner join.
My SQL is:
SELECT DISTINCT tb_Order.OrderNumber, tb_Order.OrderId,
tb_Order.orderDate, tb_Order.OrderTotal,
tb_OrderStatus.OrderStatus, tb_Order.GroupOrderId,
tb_Venue.Title AS Venue
FROM tb_Order INNER JOIN tb_OrderItem ON tb_Order.OrderId = tb_OrderItem.OrderId
INNER JOIN tb_Show ON tb_OrderItem.ShowId = tb_Show.showId
INNER JOIN tb_OrderStatus ON tb_Order.OrderStatusId = tb_OrderStatus.OrderStatusID
INNER JOIN tb_Venue ON tb_Show.VenueId = tb_Venue.id
WHERE (tb_Order.OrderId = 705)
I need the [venue] to be comma-delimited like:
"Interactive Seating Chart Advanced, Interactive Seating Chart Mode Multi-Click"

If you have SQL Server 2017 (14.x) or later, you can use the STRING_AGG function.
SELECT tb_Order.OrderNumber, tb_Order.OrderId,
tb_Order.orderDate, tb_Order.OrderTotal,
tb_OrderStatus.OrderStatus, tb_Order.GroupOrderId,
STRING_AGG(tb_Venue.Title, ',') AS Venue
FROM tb_Order INNER JOIN tb_OrderItem ON tb_Order.OrderId = tb_OrderItem.OrderId
INNER JOIN tb_Show ON tb_OrderItem.ShowId = tb_Show.showId
INNER JOIN tb_OrderStatus ON tb_Order.OrderStatusId = tb_OrderStatus.OrderStatusID
INNER JOIN tb_Venue ON tb_Show.VenueId = tb_Venue.id
WHERE (tb_Order.OrderId = 705)
GROUP BY tb_Order.OrderNumber, tb_Order.OrderId,
tb_Order.orderDate, tb_Order.OrderTotal,
tb_OrderStatus.OrderStatus, tb_Order.GroupOrderId
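As a rough illustration of the aggregation step, here is a minimal, self-contained sketch in Python using SQLite, where group_concat stands in for STRING_AGG. The table names (venues, order_items), the function names and the sample rows are made up for the demo and are not your schema. It de-duplicates in a derived table first, because an order with two items at the same venue would otherwise list that venue twice.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical demo tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE venues (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE order_items (order_id INTEGER, venue_id INTEGER);
        INSERT INTO venues VALUES (1, 'Hall A'), (2, 'Hall B');
        -- order 705 has two items at Hall A and one at Hall B
        INSERT INTO order_items VALUES (705, 1), (705, 2), (705, 1), (706, 2);
    """)
    return conn


def venues_per_order(conn):
    """Return {order_id: sorted list of distinct venue titles}."""
    rows = conn.execute("""
        SELECT d.order_id, group_concat(d.title, ', ')
        FROM (
            -- de-duplicate before aggregating
            SELECT DISTINCT oi.order_id, v.title
            FROM order_items oi
            JOIN venues v ON v.id = oi.venue_id
        ) AS d
        GROUP BY d.order_id
    """).fetchall()
    # The concatenation order is not guaranteed, so split and sort here.
    return {order_id: sorted(titles.split(", ")) for order_id, titles in rows}


if __name__ == "__main__":
    print(venues_per_order(build_demo_db()))
```

The same shape (aggregate over a de-duplicated derived table, grouped by the order columns) should carry over to the STRING_AGG query above, but test it against your own data.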

One way to do this is by using a side effect of row processing order to populate a variable iteratively, like so. The downside of this is that it won't work in a simple query context and it's not the most efficient solution.
DECLARE @venues varchar(max)
SET @venues = ''
SELECT @venues =
CASE WHEN @venues = '' THEN v.Title
ELSE @venues + ',' + v.Title END
FROM tb_Venue v
SELECT @venues
A second way to do this is with STUFF and SQL Server XML extensions, like so:
SELECT DISTINCT v.Title,
(STUFF(
(SELECT ',' + v2.Title
FROM tb_Venue v2
-- uncomment this line if you are going
-- to aggregate only by something in the outer query
-- WHERE v2.GroupKey = v.GroupKey
ORDER BY v2.Title
FOR XML PATH(''), TYPE, ROOT
).value('root[1]','varchar(max)'),1,1,'')) as Aggregation
FROM tb_Venue v
A CLR-based solution is usually the most performant for this use case, and one of those is described here along with a boatload of other less ideal solutions...
Your problem is not one rooted in set logic so there isn't a clean SQL solution...

Related

How to convert inline SQL queries to JOINS in SQL SERVER to reduce load time

I need help in optimizing this SQL query.
In the main SELECT statement there are three columns which is dependent on the outer query result. This is why my query is taking a long time to return data. I have tried making left joins but this is not working properly.
Can anyone help me to resolve this issue?
SELECT
DISTINCT ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
(
SELECT
STRING_AGG(
(ug.UG_Name),
','
)
FROM
Groups ug
INNER JOIN ApplicantUserGroup augm ON augm.AUGM_UserGroupID = ug.UG_ID
WHERE
augm.AUGM_OrganizationUserID = ou.OrganizationUserID
AND ug.UG_IsDeleted = 0
AND augm.AUGM_IsDeleted = 0
) AS UserGroups,
order1.OrderNumber AS OrderId -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttribute),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributes -- UAT-2455
,
(
SELECT
STRING_AGG(
(CActe.CustomAttributeID),
','
)
FROM
CustomAttributeCte CActe
WHERE
CActe.HierarchyNodeID = dpm.DPM_ID
AND CActe.OrganizationUserID = ps.OrganizationUserID
) AS CustomAttributeID
FROM
ApplicantData acd WITH (NOLOCK)
INNER JOIN ClientPackage ps WITH (NOLOCK) ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
INNER JOIN [ClientOrder] order1 WITH (NOLOCK) ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
INNER JOIN OUser ou WITH (NOLOCK) ON ou.OrganizationUserID = ps.OrganizationUserID
It looks like this query can be simplified, and the dependent subqueries in your SELECT clause removed. Consider your second and third dependent subqueries. You can refactor them into one nondependent subquery with a LEFT JOIN. Using nondependent subqueries is more efficient because the query planner can run them just once, rather than once for each row.
You want two STRING_AGG() results from the same table. This subquery gives those two outputs for every possible combination of HierarchyNodeID and OrganizationUserID values. STRING_AGG() is an aggregate function like SUM() and so works nicely with GROUP BY.
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes, -- UAT-2455
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
You can run this subquery itself to convince yourself it works.
Now, we can LEFT JOIN that into your query. Like this. (For readability I took out the NOLOCKs and used JOIN: it means the same thing as INNER JOIN.)
SELECT DISTINCT
ou.OrganizationUserID AS StudentID,
ou.FirstName,
ou.LastName,
'tempvalue' AS UserGroups, -- shortened for testing
order1.OrderNumber AS OrderId, -- UAT-2455
uat2455.CustomAttributes, -- UAT-2455
uat2455.CustomAttributeIDs -- UAT-2455
FROM ApplicantData acd
JOIN ClientPackage ps
ON acd.ClientSubscriptionID = ps.ClientSubscriptionID
JOIN ClientOrder order1
ON order1.OrderID = ps.OrderID
AND order1.IsDeleted = 0
JOIN OUser ou
ON ou.OrganizationUserID = ps.OrganizationUserID
LEFT JOIN (
SELECT HierarchyNodeID, OrganizationUserID,
STRING_AGG((CActe.CustomAttribute), ',') CustomAttributes, -- UAT-2455
STRING_AGG((CActe.CustomAttributeID), ',') CustomAttributeIDs -- UAT-2455
FROM CustomAttributeCte CActe
GROUP BY HierarchyNodeID, OrganizationUserID
) uat2455
ON uat2455.HierarchyNodeID = dpm.DPM_ID
AND uat2455.OrganizationUserId = ps.OrganizationUserID
See how we collapsed your second and third dependent subqueries to just one, then used it as a virtual table with LEFT JOIN? We transformed the WHERE clauses from the dependent subqueries into an ON clause.
You can test this: run it with TOP(50) and eyeball the results.
When you're happy, the next step is to transform your first dependent subquery the same way.
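If you want to convince yourself that the two forms return the same rows, here is a small sketch in Python with SQLite, where group_concat stands in for STRING_AGG. The tables (users, user_attrs) and their contents are hypothetical and only loosely mirror your query. It runs a dependent subquery and the equivalent LEFT JOIN to a grouped subquery, then normalises both results so they can be compared.

```python
import sqlite3

# Dependent (correlated) subquery: evaluated per outer row.
CORRELATED = """
    SELECT u.user_id,
           (SELECT group_concat(a.attr, ',')
              FROM user_attrs a
             WHERE a.user_id = u.user_id) AS attr_list
      FROM users u
     ORDER BY u.user_id
"""

# Nondependent subquery: aggregated once, then joined.
JOINED = """
    SELECT u.user_id, agg.attr_list
      FROM users u
      LEFT JOIN (SELECT user_id, group_concat(attr, ',') AS attr_list
                   FROM user_attrs
                  GROUP BY user_id) agg
        ON agg.user_id = u.user_id
     ORDER BY u.user_id
"""


def normalise(rows):
    """Split each list so that concatenation order does not matter."""
    return [(uid, sorted(s.split(",")) if s is not None else None)
            for uid, s in rows]


def run_both():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY);
        CREATE TABLE user_attrs (user_id INTEGER, attr TEXT);
        INSERT INTO users VALUES (1), (2), (3);
        -- user 3 has no attributes on purpose
        INSERT INTO user_attrs VALUES (1, 'x'), (1, 'y'), (2, 'z');
    """)
    return (normalise(conn.execute(CORRELATED).fetchall()),
            normalise(conn.execute(JOINED).fetchall()))


if __name__ == "__main__":
    correlated, joined = run_both()
    print(correlated == joined, joined)
```

Note that the user without attributes gets NULL in both versions, which is why the join has to be a LEFT JOIN rather than an INNER JOIN.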
Pro tip: Don't use WITH (NOLOCK), ever, unless a database administration expert tells you to after looking at your specific query. If your query's purpose is a historical report and you don't care whether the most recent transactions in your database are represented exactly right, you can instead precede your query with the following statement. It lets the query run without waiting on locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
Pro tip: Be obsessive about formatting your queries for readability. You, your colleagues, and yourself a year from now must be able to read and reason about queries like this.

How do I fix the syntax of a sub query with joins?

I have the following query:
SELECT tours_atp.NAME_T, today_atp.TOUR, today_atp.ID1, odds_atp.K1, today_atp.ID2, odds_atp.K2
FROM (players_atp INNER JOIN (players_atp AS players_atp_1 INNER JOIN (today_atp INNER JOIN odds_atp ON (today_atp.TOUR = odds_atp.ID_T_O) AND (today_atp.ID1 = odds_atp.ID1_O) AND (today_atp.ID2 = odds_atp.ID2_O) AND (today_atp.ROUND = odds_atp.ID_R_O)) ON players_atp_1.ID_P = today_atp.ID2) ON players_atp.ID_P = today_atp.ID1) INNER JOIN tours_atp ON today_atp.TOUR = tours_atp.ID_T
WHERE (((tours_atp.RANK_T) Between 1 And 4) AND ((today_atp.RESULT)="") AND ((players_atp.NAME_P) Not Like "*/*") AND ((players_atp_1.NAME_P) Not Like "*/*") AND ((odds_atp.ID_B_O)=2))
ORDER BY tours_atp.NAME_T;
I'd like to add a field to this query that provides me with the sum of a field in another table (FS) with a few criteria applied.
I've been able to build a stand alone query to get the sum of FS by ID_T as follows:
SELECT tbl_Ts_base_atp.ID_T, Sum(tbl_Ts_mkv_atp.FS) AS SumOfFS
FROM tbl_Ts_base_atp INNER JOIN tbl_Ts_mkv_atp ON tbl_Ts_base_atp.ID_Ts = tbl_Ts_mkv_atp.ID_Ts
WHERE (((tbl_Ts_base_atp.DATE_T)>Date()-2000 And (tbl_Ts_base_atp.DATE_T)<Date()))
GROUP BY tbl_Ts_base_atp.ID_T, tbl_Ts_mkv_atp.ID_Ts;
I now want to match up the sum of FS from the second query to the records of the first query by ID_T. I realise I need to do this using a sub query. I'm confident using these when there's only one table but I consistently get 'syntax errors' when there are joins.
I simplified the first query down to remove all the WHERE conditions so it was easier for me to try and error check but no luck. I guess the resulting SQL will also be easier for you guys to follow:
SELECT today_atp.TOUR, (SELECT Sum(tbl_Ts_mkv_atp.FS)
FROM tbl_Ts_mkv_atp INNER JOIN (tbl_Ts_base_atp INNER JOIN today_atp ON tbl_Ts_base_atp.ID_T = today_atp.TOUR) ON tbl_Ts_mkv_atp.ID_Ts = tbl_Ts_base_atp.ID_Ts AS tt
WHERE tt.DATE_T>Date()-2000 And tt.DATE_T<Date() AND tt.TOUR=today_atp.TOUR
ORDER BY tt.DATE_T) AS SumOfFS
FROM today_atp
Can you spot where I'm going wrong? My hunch is that the issue is in the FROM line of the sub query but I'm not sure. Thanks in advance.
It's difficult to advise an appropriate solution without knowledge of how the database tables relate to one another, but assuming that I've correctly understood what you are looking to achieve, you might wish to try the following solution:
select
tours_atp.name_t,
today_atp.tour,
today_atp.id1,
odds_atp.k1,
today_atp.id2,
odds_atp.k2,
subq.sumoffs
from
(
(
(
(
today_atp inner join odds_atp on
today_atp.tour = odds_atp.id_t_o and
today_atp.id1 = odds_atp.id1_o and
today_atp.id2 = odds_atp.id2_o and
today_atp.round = odds_atp.id_r_o
)
inner join players_atp as players_atp_1 on
players_atp_1.id_p = today_atp.id2
)
inner join players_atp on
players_atp.id_p = today_atp.id1
)
inner join tours_atp on
today_atp.tour = tours_atp.id_t
)
inner join
(
select
tbl_ts_base_atp.id_t,
sum(tbl_ts_mkv_atp.fs) as sumoffs
from
tbl_ts_base_atp inner join tbl_ts_mkv_atp on
tbl_ts_base_atp.id_ts = tbl_ts_mkv_atp.id_ts
where
tbl_ts_base_atp.date_t > date()-2000 and tbl_ts_base_atp.date_t < date()
group by
tbl_ts_base_atp.id_t
) subq on
tours_atp.id_t = subq.id_t
where
(tours_atp.rank_t between 1 and 4) and
today_atp.result = "" and
players_atp.name_p not like "*/*" and
players_atp_1.name_p not like "*/*" and
odds_atp.id_b_o = 2
order by
tours_atp.name_t;

SQL to combine multiple rows into a single row

I am currently writing a SQL script that takes a business term and all of its related synonyms. It currently produces multiple rows per term, because there are multiple synonyms (and other columns could hold multiple values as well).
What I am trying to do is to create a single row for every business term, and concatenate values (, delimited) so that I get one line item for each business term only.
Currently my SQL script is:
SELECT dbo.TblBusinessTerm.BusinessTerm, dbo.TblBusinessTerm.BusinessTermLongDesc,
dbo.TblBusinessTerm.DomainCatID, dbo.TblSystem.SystemName,
dbo.TblDomainCat.DataSteward, dbo.TblDomainCat.DomainCatName,
dbo.TblField.GoldenSource, dbo.TblField.GTS_table,
dbo.TblTableOwner.TableOwnerName, dbo.TblBusinessSynonym.Synonym
FROM dbo.TblTableOwner INNER JOIN
dbo.TblBusinessTerm INNER JOIN
dbo.TblBusinessSynonym ON dbo.TblBusinessTerm.BusinessTermID = dbo.TblBusinessSynonym.BusinessTermID INNER JOIN
dbo.TblField ON dbo.TblBusinessTerm.BusinessTermID = dbo.TblField.BusinessTermID INNER JOIN
dbo.TblSystem INNER JOIN
dbo.TblTable ON dbo.TblSystem.SystemID = dbo.TblTable.SystemID ON dbo.TblField.TableID = dbo.TblTable.TableID INNER JOIN
dbo.TblDomainCat ON dbo.TblBusinessTerm.DomainCatID = dbo.TblDomainCat.DomainCatID ON dbo.TblTableOwner.TableOwnerID = dbo.TblDomainCat.DataSteward
Is there an easy way to do this that takes performance into consideration? I am new to SQL.
Thank you
I have managed to create a with statement that now concatenates my rows:
With syn as (
select [BusinessTermID],
syns = STUFF((SELECT ', ' + dbo.TblBusinessSynonym.Synonym
FROM dbo.TblBusinessSynonym
WHERE [BusinessTermID] = x.[BusinessTermID]
AND dbo.TblBusinessSynonym.Synonym <> ''
FOR XML PATH ('')),1,2,'')
FROM dbo.TblBusinessSynonym AS x
GROUP BY [BusinessTermID]
)
select * from syn
But now how can I use it in the above query where everything links?
Would want to replace dbo.TblBusinessSynonym.Synonym with the results from syn
Any SQL 2014 developers that can assist?
Write your with statement at the very top, without the select.
Then write your upper query as it is and change
INNER JOIN dbo.TblBusinessSynonym ON dbo.TblBusinessTerm.BusinessTermID = dbo.TblBusinessSynonym.BusinessTermID
to
INNER JOIN syn ON syn.BusinessTermID = dbo.TblBusinessTerm.BusinessTermID
That's it
With syn as (
select [BusinessTermID],
syns = STUFF((SELECT ', ' + dbo.TblBusinessSynonym.Synonym
FROM dbo.TblBusinessSynonym
WHERE [BusinessTermID] = x.[BusinessTermID]
AND dbo.TblBusinessSynonym.Synonym <> ''
FOR XML PATH ('')),1,2,'')
FROM dbo.TblBusinessSynonym AS x
GROUP BY [BusinessTermID]
)
SELECT dbo.TblBusinessTerm.BusinessTerm,
dbo.TblBusinessTerm.BusinessTermLongDesc,
dbo.TblBusinessTerm.DomainCatID, dbo.TblSystem.SystemName,
dbo.TblDomainCat.DataSteward, dbo.TblDomainCat.DomainCatName,
dbo.TblField.GoldenSource, dbo.TblField.GTS_table,
dbo.TblTableOwner.TableOwnerName, syn.syns
FROM dbo.TblTableOwner INNER JOIN
dbo.TblBusinessTerm INNER JOIN
syn ON dbo.TblBusinessTerm.BusinessTermID = syn.BusinessTermID INNER JOIN
dbo.TblField ON dbo.TblBusinessTerm.BusinessTermID = dbo.TblField.BusinessTermID INNER JOIN
dbo.TblSystem INNER JOIN
dbo.TblTable ON dbo.TblSystem.SystemID = dbo.TblTable.SystemID ON dbo.TblField.TableID = dbo.TblTable.TableID INNER JOIN
dbo.TblDomainCat ON dbo.TblBusinessTerm.DomainCatID = dbo.TblDomainCat.DomainCatID ON dbo.TblTableOwner.TableOwnerID = dbo.TblDomainCat.DataSteward
Please use the STRING_AGG function. It combines the values of a field across records into a single value, separated by a specified delimiter.
Details are here:
https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-2017
Your query is complicated, so I will just post some sample data here and show how to handle it the way you want. The operation is string aggregation with concatenation. In the latest version there is the string_agg function, which does the work for us. But as you can't use this function, here's a workaround:
select * into #tt
from (values (1, '1'),(1, '2'),(2, '1'),(2, '2')) A(id, someStr)
select id, (select someStr + ',' from #tt where id = [t].id for xml path('')) [grouped]
from #tt [t] group by id
The above query groups by id and concatenates all corresponding rows in the someStr column.

Nested Select or Inner Join including rows from another table

I have a pretty complex select statement that returns counted statistics from tables (think of it as an answer bank -- the complex select statement below) using inner join.
These answers are related to a table called Questions_Bank_AnswerChoices (which stores all the questions).
I am attempting to first pull the Questions (from the table Questions_Bank_AnswerChoices) then match them up with the statistics (complex statement below). The complex statement below pulls the statistics, but does not pull the questions unless they have been answered.
So, if no one answers question 1, then question 1 will not show up in the statistics, because it is not included in the Answers table (no one answered it).
How can I achieve this? I think that I need to outer join?
Complex Select Statement:
WITH tbl as (
SELECT
Questions_Bank.QuestionID, Questions_Bank.QuestionName,
REPLACE(Schools_Answers_Items.AnswerValue, '? ', ', ') as AnswerValue,
COUNT(Schools_Answers_Items.SchoolsAnswersItemID) AS CountAnswer,
Schools_Answers_Items.SchoolID
FROM Questions_Bank
INNER JOIN Schools_Answers_Items
ON Questions_Bank.QuestionID = Schools_Answers_Items.QuestionID
LEFT OUTER JOIN Schools_Answers
ON Schools_Answers_Items.SchoolsAnswerID = Schools_Answers.SchoolsAnswerID
WHERE (Questions_Bank.QuestionID = 1108)
AND (Schools_Answers.SchoolID = 103)
GROUP BY
Schools_Answers_Items.SchoolID,
Schools_Answers_Items.AnswerValue,
Questions_Bank.QuestionID,
Questions_Bank.QuestionName
)
SELECT
QuestionID, QuestionName, AnswerValue, CountAnswer,
SUM(CountAnswer) OVER () AS CountAllAnswers
FROM tbl
Try changing this
INNER JOIN Schools_Answers_Items
ON Questions_Bank.QuestionID = Schools_Answers_Items.QuestionID
to
LEFT OUTER JOIN Schools_Answers_Items
ON Questions_Bank.QuestionID = Schools_Answers_Items.QuestionID
and you might want to remove this
AND (Schools_Answers.SchoolID = 103)
or replace it with this
AND (Schools_Answers.SchoolID = 103 OR Schools_Answers.SchoolID IS NULL)
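To see why the WHERE condition matters, here is a minimal sketch in Python with SQLite. The tables (questions, answers), the function name and the data are hypothetical and much simpler than your schema. It compares a LEFT JOIN whose school filter sits in the WHERE clause against one whose filter sits in the ON clause, which is a different technique from the OR ... IS NULL test suggested above.

```python
import sqlite3

FILTER_IN_WHERE = """
    SELECT q.name, COUNT(a.question_id)
      FROM questions q
      LEFT JOIN answers a ON a.question_id = q.question_id
     WHERE a.school_id = 103          -- discards the NULL-extended rows
     GROUP BY q.question_id, q.name
     ORDER BY q.question_id
"""

FILTER_IN_ON = """
    SELECT q.name, COUNT(a.question_id)
      FROM questions q
      LEFT JOIN answers a ON a.question_id = q.question_id
                         AND a.school_id = 103   -- filter applied before the outer join
     GROUP BY q.question_id, q.name
     ORDER BY q.question_id
"""


def answer_counts(sql):
    """Run one of the two queries against a small hypothetical data set."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE questions (question_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE answers (question_id INTEGER, school_id INTEGER);
        INSERT INTO questions VALUES (1, 'Q1'), (2, 'Q2'), (3, 'Q3');
        -- Q1 answered twice by school 103, Q2 only by another school, Q3 never
        INSERT INTO answers VALUES (1, 103), (1, 103), (2, 999);
    """)
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    print(answer_counts(FILTER_IN_WHERE))
    print(answer_counts(FILTER_IN_ON))
```

With the filter in WHERE, only Q1 survives. With the filter in ON, all three questions are returned, and the unanswered ones get a count of 0. A question answered only by other schools (Q2 here) is also kept with the ON form, which the OR ... IS NULL form would drop, so pick whichever matches what you want to report.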
Try this:
SELECT
Questions_Bank.QuestionID, Questions_Bank.QuestionName,
REPLACE(Schools_Answers_Items.AnswerValue, '? ', ', ') as AnswerValue,
Schools_Answers_Items.SchoolID
FROM Questions_Bank
LEFT OUTER JOIN Schools_Answers_Items
ON Questions_Bank.QuestionID = Schools_Answers_Items.QuestionID
LEFT OUTER JOIN Schools_Answers
ON Schools_Answers_Items.SchoolsAnswerID = Schools_Answers.SchoolsAnswerID
WHERE Schools_Answers_Items.SchoolID

Passing result of requery as parameter to UDF

I need to execute a UDF within a query statement and its parameter depends on the current row in the larger query. I need to get a scalar from another table and pass that to the UDF however I get syntax errors if I try to use a query within the parameters of a UDF.
Example:
SELECT M.Col1
FROM MyTable M
WHERE M.RemoteID = UDFLookupRemoteID(SELECT W.Name
FROM WidgetNames W
WHERE W.Col2 = M.RemoteID)
The select within the UDF cannot be done elsewhere since it depends on the outer query.
What's the correct syntax for this?
I think this will give you what you need.
SELECT m.col1
FROM mytable m
INNER JOIN widgetnames w
ON w.col2 = m.remoteid
WHERE m.remoteid = Udflookupremoteid(w.name)
Here's an example I tested with the AdventureWorks database
SELECT pr.*
FROM production.productreview pr
INNER JOIN production.product p
ON p.productid = pr.productid
WHERE pr.rating < dbo.Ufngetstock(p.productid)
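The same pattern, joining first and then passing a column of the joined row to the function, can be tried out in a small Python sketch with SQLite. Everything here is a stand-in: lookup_remote_id is a hypothetical Python function registered as a SQL function in place of UDFLookupRemoteID, and the tables and rows are invented for the demo.

```python
import sqlite3


def lookup_remote_id(name):
    """Hypothetical stand-in for UDFLookupRemoteID: map a widget name to an id."""
    return {"alpha": 1, "beta": 20}.get(name)


def matching_rows():
    conn = sqlite3.connect(":memory:")
    # Register the Python function so SQL can call it with one argument.
    conn.create_function("lookup_remote_id", 1, lookup_remote_id)
    conn.executescript("""
        CREATE TABLE my_table (col1 TEXT, remote_id INTEGER);
        CREATE TABLE widget_names (col2 INTEGER, name TEXT);
        INSERT INTO my_table VALUES ('keep', 1), ('drop', 2);
        INSERT INTO widget_names VALUES (1, 'alpha'), (2, 'beta');
    """)
    # The join supplies w.name for each outer row, so no subquery is
    # needed inside the function's argument list.
    return conn.execute("""
        SELECT m.col1
          FROM my_table m
          JOIN widget_names w ON w.col2 = m.remote_id
         WHERE m.remote_id = lookup_remote_id(w.name)
         ORDER BY m.col1
    """).fetchall()


if __name__ == "__main__":
    print(matching_rows())
```

One thing to keep in mind: if the lookup table can hold several rows per key, the join multiplies the outer rows, whereas a scalar subquery would raise an error or need a TOP 1, so the two forms are only interchangeable when the lookup is unique.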