T-SQL combining XML columns with same ID - sql

Hey all, I have a question about combining rows with the same ID that also have an XML column.
My data I'm trying to combine:
_ID _xml _indivisualCommaList _eachIndividual
------ ------------------------------------------------------------------------------------------------- ----------------------- ---------------
46589 <Individual><TBS>768-hER-382</TBS><Categories /><TBS2>768-hER-382,908-YTY-354</TBS2></Individual> 768-hER-382,908-YTY-354 768-hER-382
46589 <Individual><TBS>768-hER-382</TBS><Categories /><TBS2>768-hER-382,908-YTY-354</TBS2></Individual> 768-hER-382,908-YTY-354 908-YTY-354
Where
_ID = INT
_xml = XML
_indivisualCommaList = VARCHAR(MAX)
_eachIndividual = VARCHAR(MAX)
Pretty (easier to read) XML from above:
<Individual>
<TBS>768-hER-382</TBS>
<Categories />
<TBS2>768-hER-382,908-YTY-354</TBS2>
</Individual>
<Individual>
<TBS>768-hER-382</TBS>
<Categories />
<TBS2>768-hER-382,908-YTY-354</TBS2>
</Individual>
The XML, ID and _indivisualCommaList will always be the same no matter how many rows are returned. The only column that varies is _eachIndividual.
So I tried the following query to group like IDs together:
SELECT
*
FROM
#tblData
WHERE
_ID = @AssetID
GROUP BY
_ID
Naturally, because of my XML column, I get the error of:
Column '#tblData._xml' is invalid in the select list because it is not
contained in either an aggregate function or the GROUP BY clause.
So I'm really not sure what I can do in order to combine these rows?
The end result I am looking to have is:
_ID _xml _indivisualCommaList _eachIndividual
------ ------------------------------------------------------------------------------------------------- ----------------------- -----------------------
46589 <Individual><TBS>768-hER-382</TBS><Categories /><TBS2>768-hER-382,908-YTY-354</TBS2></Individual> 768-hER-382,908-YTY-354 768-hER-382,908-YTY-354
So, is this possible to do?

A solution (with horrible performance) without string_agg should be:
SELECT
dataA._id,
dataA._xml,
dataA._individualCommaList,
CONCAT(dataA._eachIndividual,',',dataB._eachIndividual) as _eachIndividual
FROM data dataA
JOIN data dataB ON dataA._id = dataB._id AND dataA._eachIndividual != dataB._eachIndividual
WHERE dataA._individualCommaList = CONCAT(dataA._eachIndividual,',',dataB._eachIndividual)
Join the table to itself to get the necessary data into one row, but only join rows for different individuals.
The WHERE clause ensures that only the row with the correct order is kept.
Alternatively, you could use LIKE to keep the row for the first (?) individual in the list.

If I've got it right, and only _eachIndividual varies for a given _ID:
select top(1) with ties t._ID, t._xml, t._indivisualCommaList, t2.x as _eachIndividual
from tbl t
join (select _ID, string_agg(_eachIndividual, ',') x
from tbl
group by _ID) t2 on t._ID = t2._ID
order by row_number() over(partition by t._ID order by t._ID)
Using FOR XML PATH aggregation in older versions (without string_agg):
select top(1) with ties t._ID, t._xml, t._indivisualCommaList,
stuff((select ',' + t2._eachIndividual
from tbl t2
where t2._ID = t._ID
for xml path ('')),
1,1, '') _eachIndividual
from tbl t
order by row_number() over(partition by t._ID order by t._ID)
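For anyone who wants to try the aggregate-then-join-back idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not T-SQL: SQLite's group_concat stands in for string_agg, the XML is stored as plain text, and the table name tbl and the rn alias are just assumptions for the demo. The shape of the query (aggregate the varying column per _ID, join it back, keep one row per _ID) is the same as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (_ID INTEGER, _xml TEXT, "
    "_indivisualCommaList TEXT, _eachIndividual TEXT)"
)
xml = ("<Individual><TBS>768-hER-382</TBS><Categories />"
       "<TBS2>768-hER-382,908-YTY-354</TBS2></Individual>")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (46589, xml, "768-hER-382,908-YTY-354", "768-hER-382"),
        (46589, xml, "768-hER-382,908-YTY-354", "908-YTY-354"),
    ],
)

# Inner query "agg": one comma-separated list per _ID (only the varying column).
# Outer query: join the list back and keep a single representative row per _ID,
# so the XML column never has to appear in a GROUP BY.
rows = conn.execute(
    """
    SELECT _ID, _xml, _indivisualCommaList, x AS _eachIndividual
    FROM (
        SELECT t._ID, t._xml, t._indivisualCommaList, agg.x AS x,
               ROW_NUMBER() OVER (PARTITION BY t._ID ORDER BY t._ID) AS rn
        FROM tbl AS t
        JOIN (SELECT _ID, group_concat(_eachIndividual, ',') AS x
              FROM tbl
              GROUP BY _ID) AS agg
          ON agg._ID = t._ID
    )
    WHERE rn = 1
    """
).fetchall()

for row in rows:
    print(row)
```

Note that SQLite does not promise the order of items inside group_concat, so the combined list may come back in either order.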

Related

How to deselect duplicate entries in a query?

I've got a query like this:
SELECT *
FROM RecipeTable, RecipeIngredientTable, SyncRecipeIngredientTable
WHERE RecipeTable.recipe_id = SyncRecipeIngredientTable.recipe_id
AND RecipeIngredientTable.recipe_ingredient_id =
SyncRecipeIngredientTable.recipe_ingredient_id
AND RecipeIngredientTable.recipe_item_name in ("ayva", "pirinç", "su")
GROUP by RecipeTable.recipe_id
HAVING COUNT(*) >= 3;
and this query returns the result like this:
As you can see in the image, there are 3 duplicate, unnecessary entries (no, I can't delete them because of the multiple foreign keys). How can I exclude these duplicate entries from the query result? In the end I want to return 6 entries, not 9.
What you want to eliminate from the result set is duplication of recipe_name values, not of recipe_id values.
You just need to partition by recipe_name using the ROW_NUMBER() analytic function:
SELECT recipe_id, author_name ...
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY recipe_name) AS rn,
sr.recipe_id, author_name ...
FROM SyncRecipeIngredientTable sr
JOIN RecipeIngredientTable ri
ON ri.recipe_ingredient_id = sr.recipe_ingredient_id
JOIN RecipeTable rt
ON rt.recipe_id = sr.recipe_id
WHERE ri.recipe_item_name in ("ayva", "pirinç", "su")
)
WHERE rn = 1
This way, you pick only the one record with rn = 1 for each recipe_name. (An ORDER BY clause can be added to the analytic function after the PARTITION BY clause if a specific record needs to be picked.)
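A small, self-contained way to see the rn = 1 filter in action is sketched below with Python's sqlite3 module. The recipes table and its rows are made up for illustration (the real schema has more tables and columns), and it assumes a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recipes (recipe_id INTEGER, recipe_name TEXT)")
conn.executemany(
    "INSERT INTO recipes VALUES (?, ?)",
    [(1, "pilav"), (2, "pilav"), (3, "komposto"), (4, "komposto"), (5, "corba")],
)

# Number the rows inside each recipe_name group, lowest recipe_id first,
# then keep only the first row of every group.
rows = conn.execute(
    """
    SELECT recipe_id, recipe_name
    FROM (
        SELECT recipe_id, recipe_name,
               ROW_NUMBER() OVER (PARTITION BY recipe_name
                                  ORDER BY recipe_id) AS rn
        FROM recipes
    )
    WHERE rn = 1
    ORDER BY recipe_id
    """
).fetchall()

print(rows)  # [(1, 'pilav'), (3, 'komposto'), (5, 'corba')]
```

The ORDER BY inside OVER is what decides which duplicate survives; here it is the one with the lowest recipe_id.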

SQL to retrieve all linked records

I have a product table (tProduct) and a product links table (tProductLink) to allow establishing links between products. Given a ProductID and ProductLinkID, I need to get all of the tProduct.ID records that are related.
In the example table (tProductLink) below, all of the IDs would be returned. Note that it's not possible to create a recursive link; that is, given the first row in the table below, there cannot be a row where ProductID is 31563 and ProductLinkID is 28818.
So say I search for all products related to the link in row 4, ProductID 137902 and LinkProductID 410901. Given that link, it should return all six rows.
Here is an example of the data.
I have tried various techniques such as a recursive CTE and calling a table function using "cross apply" but I have got nowhere.
This is one of the last solutions I tried, which ended up not returning all products as noted in the comments.
declare @ProductID int, @ProductLinkID int
select @ProductID = 137902
select @ProductLinkID = 410901
;with p1 as
(
select ProductID, ProductLinkID
from tProductLink
where ProductID = @ProductID and ProductLinkID = @ProductLinkID
union all
select tProductLink.ProductID, tProductLink.ProductLinkID
from tProductLink
join p1 on p1.ProductLinkID = tProductLink.ProductID
where not (tProductLink.ProductID = @ProductID and tProductLink.ProductLinkID = @ProductLinkID)
)
select distinct ProductID from p1
union
select ProductLinkID from p1
You start with one ID. It can appear in multiple rows of the second table, as either ProductLinkId or ProductId. You then look up the corresponding IDs thus found in the second table again.
This calls for a recursive query in which you keep collecting all corresponding IDs. Unfortunately, SQL Server does not support DISTINCT in recursive queries, so the same IDs get looked up multiple times. SQL Server also doesn't protect against cycles (the query fails instead), so we must prevent them ourselves by remembering which IDs we have already found. Ideally this would be done with an array or set that we fill, but SQL Server doesn't support those, so we must build a string instead.
The complete query:
with cte(id, seen) as
(
select 28520 as id, cast('/28520/' as varchar(max)) as seen from t1
union all
select case when cte.id = t2.productid then t2.linkproductid
else t2.productid end as id,
cte.seen + cast(case when cte.id = t2.productid
then t2.linkproductid
else t2.productid end as varchar(max)) + '/'
from cte
join t2 on cte.id in (t2.productid, t2.linkproductid)
and charindex('/' + cast(case when cte.id = t2.productid
then t2.linkproductid
else t2.productid end as varchar(max))+ '/', cte.seen) = 0
)
select distinct id from cte
option (maxrecursion 1000);
Rextester demo: http://rextester.com/WJJ78304
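The "remember what we have seen in a string" trick can also be tried locally. Below is a rough sketch with Python's sqlite3 module, so the syntax is SQLite rather than SQL Server: || and instr() replace + and charindex(). The links table, its column names and the sample IDs are invented for the demo, and instead of the CASE expression the links are expanded in both directions in a helper CTE called edges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (product_id INTEGER, link_product_id INTEGER)")
conn.executemany(
    "INSERT INTO links VALUES (?, ?)",
    [(1, 2), (2, 3), (4, 3), (5, 6)],  # 1-2-3-4 are connected, 5-6 are separate
)

LINKED_SQL = """
WITH RECURSIVE
edges(a, b) AS (
    -- treat every link as usable in both directions
    SELECT product_id, link_product_id FROM links
    UNION ALL
    SELECT link_product_id, product_id FROM links
),
cte(id, seen) AS (
    SELECT :start, '/' || :start || '/'
    UNION ALL
    SELECT e.b, cte.seen || e.b || '/'
    FROM cte
    JOIN edges AS e ON e.a = cte.id
    -- only follow a link if the target is not yet on this path
    WHERE instr(cte.seen, '/' || e.b || '/') = 0
)
SELECT DISTINCT id FROM cte ORDER BY id
"""

def linked_ids(start):
    """Return every product ID reachable from `start`, including itself."""
    return [row[0] for row in conn.execute(LINKED_SQL, {"start": start})]

print(linked_ids(1))  # [1, 2, 3, 4]
print(linked_ids(5))  # [5, 6]
```

The slashes around each ID matter: without them, a search for 3 would also match inside 31563.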

Group by not working to get count of a column with other max record in sql

I have a table named PublishedData, see image below
I'm trying to get the output like, below image
I think you can use a query like this:
SELECT dt.DistrictName, ISNULL(dt.Content, 'N/A') Content, dt.UpdatedDate, mt.LastPublished, mt.Unpublished
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY DistrictName ORDER BY UpdatedDate DESC, ISNULL(Content, 'zzzzz')) seq
FROM PublishedData) dt
INNER JOIN (
SELECT DistrictName, MAX(LastPublished) LastPublished, COUNT(CASE WHEN IsPublished = 0 THEN 1 END) Unpublished
FROM PublishedData
GROUP BY DistrictName) mt
ON dt.DistrictName = mt.DistrictName
WHERE
dt.seq = 1;
This assumes that ordering by UpdatedDate and then Content is how you pick the row that supplies the first columns of your output.
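Since the sample data was only posted as an image, here is a hedged sketch of the same pattern (latest row per district joined to per-district aggregates) with invented rows, run through Python's sqlite3 module. The column names follow the query above; the values and the COALESCE in place of ISNULL are assumptions for the SQLite demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PublishedData (DistrictName TEXT, Content TEXT, "
    "UpdatedDate TEXT, LastPublished TEXT, IsPublished INTEGER)"
)
conn.executemany(
    "INSERT INTO PublishedData VALUES (?, ?, ?, ?, ?)",
    [
        ("North", "old", "2020-01-01", "2020-01-02", 1),
        ("North", "mid", "2020-02-01", None, 0),
        ("North", "new", "2020-03-01", None, 0),
        ("South", None, "2020-05-01", "2020-05-02", 1),
    ],
)

# dt: every row numbered per district, newest UpdatedDate first.
# mt: one row per district with the max LastPublished and a conditional count.
rows = conn.execute(
    """
    SELECT dt.DistrictName, COALESCE(dt.Content, 'N/A') AS Content,
           dt.UpdatedDate, mt.LastPublished, mt.Unpublished
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY DistrictName
                                     ORDER BY UpdatedDate DESC) AS seq
        FROM PublishedData
    ) AS dt
    JOIN (
        SELECT DistrictName,
               MAX(LastPublished) AS LastPublished,
               COUNT(CASE WHEN IsPublished = 0 THEN 1 END) AS Unpublished
        FROM PublishedData
        GROUP BY DistrictName
    ) AS mt ON dt.DistrictName = mt.DistrictName
    WHERE dt.seq = 1
    ORDER BY dt.DistrictName
    """
).fetchall()

for row in rows:
    print(row)
```

COUNT over a CASE without an ELSE counts only the rows where the condition holds, so a district with nothing unpublished still shows up with 0 instead of disappearing.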
Check out something like this (I don't have your tables, but you will get the idea where to follow with your query):
SELECT DistrictName,
MAX(UpdatedDate),
MAX(LastPublished),
(
SELECT COUNT(*)
FROM PublishedData inr
WHERE inr.DistrictName = outr.DistrictName
AND inr.IsPublished = 0
) AS Unpublished
FROM PublishedData outr
GROUP BY DistrictName
We would need a unique identity column in the PublishedData table to produce the required output, because the latest Content can't be determined from the given schema.
If you want the data apart from Content (DistrictName, UpdatedDate, LastPublished and the count of unpublished records), use the query below:
select T1.DistrictName, T1.UpdatedDate, T1.LastPublished, T2.Unpublished
from (select DistrictName, Max(UpdatedDate) as UpdatedDate, Max(LastPublished) as LastPublished
from PublishedData
group by DistrictName) T1
inner join (select DistrictName, count(IsPublished) as Unpublished
from PublishedData
where IsPublished = 0
group by DistrictName) T2 on T1.DistrictName = T2.DistrictName
order by T2.Unpublished desc

SQL Server : combine two rows into one

I want to write a query which will display the following result
FROM
ID Contract# Market
1 123kjs 40010
1 123kjs 40011
2 121kjs 40098
2 121kjs 40099
TO
ID Contract# Market
1 123kjs 40010,40011
2 121kjs 40098,40099
Try this query. It uses GROUP_CONCAT to combine the column values into a single field per group. Note that GROUP_CONCAT is a MySQL function and is not available in SQL Server.
Also, replace nameOfThatTable in the FROM clause with the name of your table.
SELECT ID,Contract#, GROUP_CONCAT(Market SEPARATOR ',')
FROM nameOfThatTable GROUP BY ID;
Try this out. I used PIVOT to solve it.
SELECT
ID,
Contract#,
ISNULL(CONVERT(varchar,[40010]) + ',' + CONVERT(varchar,[40011]),
CONVERT(varchar,[40098]) + ',' + CONVERT(varchar,[40099])) AS Market FROM
( SELECT * FROM ContractTable) AS A
PIVOT(MIN(Market) FOR Market IN ([40010],[40011],[40098],[40099])) AS PVT
ORDER BY ID
You can use ', ' + CAST(Market AS VARCHAR(30)) in sub-query and join Id and Contract# of sub-query with outer query to get values of Market as Comma Separated Values for each Id and Contract#.
SELECT DISTINCT ID,Contract#,
SUBSTRING(
(SELECT ', ' + CAST(Market AS VARCHAR(30))
FROM #TEMP T1
WHERE T2.Id=T1.Id AND T2.Contract#=T1.Contract#
FOR XML PATH('')),2,200000) Market
FROM #TEMP T2
Note: if you want to get CSV values for Id only, remove T2.Contract#=T1.Contract# from the sub-query.

Getting row number for query

I have a query which will return one row. Is there any way I can find the row index of the row I'm querying when the table is sorted?
I've tried rowid but got #582 when I was expecting row #7.
Eg:
CategoryID Name
I9GDS720K4 CatA
LPQTOR25XR CatB
EOQ215FT5_ CatC
K2OCS31WTM CatD
JV5FIYY4XC CatE
--> C_L7761O2U CatF <-- I want this row (#5)
OU3XC6T19K CatG
L9YKCYAYMG CatH
XKWMQ7HREG CatI
I've tried rowid with unexpected results:
SELECT rowid FROM Categories WHERE CategoryID = 'C_L7761O2U' ORDER BY Name
EDIT: I've also tried J Cooper's suggestion (below), but the row numbers just aren't right.
using (var cmd = conn.CreateCommand()) {
cmd.CommandText = string.Format(@"SELECT (SELECT COUNT(*) FROM Recipes AS t2 WHERE t2.RecipeID <= t1.RecipeID) AS row_Num
FROM Recipes AS t1
WHERE RecipeID = 'FB3XSAXRWD'
ORDER BY Name");
cmd.Parameters.AddWithValue("@recipeId", id);
idx = Convert.ToInt32(cmd.ExecuteScalar());
}
Here is a way to get the row number in SQLite:
SELECT CategoryID,
Name,
(SELECT COUNT(*)
FROM mytable AS t2
WHERE t2.Name <= t1.Name) AS row_Num
FROM mytable AS t1
ORDER BY Name, CategoryID;
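If only the position of one particular row is needed, the same counting idea can be narrowed to a single lookup. Below is a minimal sketch using Python's sqlite3 module with the sample rows from the question; the helper name row_index is made up, and it assumes Name values are unique (ties would all get the same, highest position).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Categories (CategoryID TEXT PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Categories VALUES (?, ?)",
    [
        ("I9GDS720K4", "CatA"), ("LPQTOR25XR", "CatB"), ("EOQ215FT5_", "CatC"),
        ("K2OCS31WTM", "CatD"), ("JV5FIYY4XC", "CatE"), ("C_L7761O2U", "CatF"),
        ("OU3XC6T19K", "CatG"), ("L9YKCYAYMG", "CatH"), ("XKWMQ7HREG", "CatI"),
    ],
)

def row_index(category_id):
    """1-based position of the category when the table is sorted by Name.

    Counts how many rows have a Name less than or equal to the target's Name.
    Returns 0 if the CategoryID does not exist.
    """
    (count,) = conn.execute(
        """
        SELECT COUNT(*)
        FROM Categories AS t2
        WHERE t2.Name <= (SELECT Name FROM Categories WHERE CategoryID = ?)
        """,
        (category_id,),
    ).fetchone()
    return count

position = row_index("C_L7761O2U")
print(position)  # 6 (CatF is the sixth name alphabetically); subtract 1 for a 0-based index
```

The key point is that the comparison has to be on the column you sort by (Name here), not on the ID column; comparing IDs gives the position in ID order instead.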
Here's a funny trick you can use in Spatialite to get the order of values. If you use the count() function with a WHERE clause limiting to only values >= the current value, then the count will actually give the order. So if I have a point layer called "mypoints" with columns "value" and "val_order" then:
SELECT value, (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value) AS val_order
FROM mypoints
ORDER BY value DESC;
Gives the descending order of the values.
I can update the "val_order" column this way:
UPDATE mypoints SET val_order = (
SELECT count(*) FROM mypoints AS my
WHERE my.value>=mypoints.value
);
What you are asking can be interpreted in two different ways, but I'm assuming you want to sort the resulting table and then number the rows according to that sort. The rows have to be numbered before the filter is applied, otherwise the single matching row always gets number 1.
declare @resultrow int
select @resultrow = rn
from (select CategoryID, row_number() over (order by Name asc) as rn
from Categories) c
where CategoryID = 'C_L7761O2U'
select @resultrow