Problems with using STUFF - sql

Why the heck isn't this working??? Seems to follow everything I've found around here. I'm getting the error:
Column '#TempTable.clientId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I add tt.clientId to the GROUP BY, the values are no longer combined into one line; they come back as separate rows. Did I make a typo or something?
SELECT tt.Station, STUFF((SELECT ', ' + c.client_code
FROM client c
WHERE tt.clientId = c.ID
FOR XML PATH('')),1,1,'') [Values]
FROM #TempTable tt
GROUP BY tt.Station

The SELECT ... FOR XML PATH subquery should depend only on the GROUP BY column(s),
tt.Station in your case.
Something like this:
SELECT tt.Station, STUFF((SELECT ', ' + c.client_code
FROM client c
JOIN #TempTable tt2
ON tt2.clientId = c.ID
AND tt2.Station = tt.Station -- correlate on the grouped column only
FOR XML PATH('')),1,2,'') [Values] -- strip the 2-character leading ', '
FROM #TempTable tt
GROUP BY tt.Station

You have to add tt.ClientId to the GROUP BY because you are using it to correlate the subquery here: WHERE tt.clientId = c.ID
Otherwise, how does SQL Server know which ClientId to use for each Station?
If you don't want to group by ClientId, then you have to correlate by tt.Station.
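As a rough illustration of correlating by Station instead of clientId, here is a minimal T-SQL sketch. The temp table #client and the sample rows are made up for the example; only the column names come from the question.

```sql
-- Hypothetical sample data; column names follow the question.
CREATE TABLE #client (ID int, client_code varchar(10));
CREATE TABLE #TempTable (Station varchar(10), clientId int);

INSERT INTO #client VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
INSERT INTO #TempTable VALUES ('S1', 1), ('S1', 2), ('S2', 3);

SELECT s.Station,
       STUFF((SELECT ', ' + c.client_code
              FROM #TempTable tt
              JOIN #client c ON c.ID = tt.clientId
              WHERE tt.Station = s.Station   -- correlate on the grouped column
              ORDER BY c.client_code
              FOR XML PATH('')), 1, 2, '') AS [Values]
FROM #TempTable s
GROUP BY s.Station;
```

With this data the query should return one row per station: S1 with "AAA, BBB" and S2 with "CCC". The subquery only references s.Station, which is in the GROUP BY, so the original error does not apply.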

There are a couple alternatives here, but here are some things to consider:
Determine what you want your left-most item to be -- in this case, it looks like it's supposed to be station. If that is the case, one way to approach it is to first get the distinct set of stations. This can be done using a group by or a distinct.
Determine the level of the items for which you wish to generate a list -- in this case it's client_code. Therefore, you want to get your inner select to be at that level. One way to do that is to resolve the distinct set of client codes prior to attempting to use for xml.
One critique -- it's always good to provide a simple set of data. Makes providing an answer much easier and faster.
Again, there are alternatives here, but here's a possible solution.
The test data
select top (100)
client_id = abs(checksum(newid())) % 100,
client_code = char(abs(checksum(newid())) % 10 + 65)
into #client
from sys.all_columns a
cross join sys.all_columns b;
select top (100)
station = abs(checksum(newid())) % 10,
client_id = abs(checksum(newid())) % 50 -- just a subset
into #temp
from sys.all_columns a
cross join sys.all_columns b;
The query
select station, client_codes = stuff((
select ', ' + cc.client_code
from (
select distinct c.client_code -- distinct client codes
from #temp t
join #client c
on t.client_id = c.client_id
where t.station = s.station) cc
order by client_code
for xml path('')), 1, 2, '')
from (
select distinct station
from #temp) s; -- distinct stations
The results
station     client_codes
----------- ----------------------
0           B, C, D, E, G, J
1           A, B, D, G, H, J
2           A, C, E, F, G, H, J
3           B, C, H, J
4           A, B, C, D, F, H, J
5           H, J
6           D, E, F, G, I
7           A, C, D, F, G, H, J
8           A, E, G
9           C, E, F, G, I
Hope this helps.

Related

SQL Finding duplicate values in two of the three columns of each row

Let's say we have three columns: A, B, and C.
I would like to filter the results as follows:
The values of A and B are the same (duplicated) for > 1 (more than 1) row, and the value of C is always different.
In the attached image, the values that appear selected would meet the conditions mentioned above.
What I've tried:
SELECT
a.notation as A, a.gene as B, b.id as C
FROM
`db-dummy`.sgdata c
join `db-dummy`.g_info a on a.rec_id = c.gen_id
join `db-dummy`.spec_data b on b.rec_id = c.spec_id GROUP BY A, B HAVING COUNT(*) > 1;
I thought that using GROUP BY and HAVING COUNT(*) > 1 I could get the desired result, but I get the following error:
SQL Error [1055] [42000]: (conn=1632) Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'db-dummy.b.spec_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
If you had a single table, I would suggest just using exists. But because you have a join, use window functions. If you are looking for different values of id:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
MIN(b.id) OVER (PARTITION BY a.notation, a.gene) as min_id,
MAX(b.id) OVER (PARTITION BY a.notation, a.gene) as max_id
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE min_id <> max_id;
If you are just looking for multiple rows for a given A and B, then you can use:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
COUNT(*) OVER (PARTITION BY a.notation, a.gene) as cnt
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE cnt > 1;
SELECT * FROM `db-dummy`.sgdata a
LEFT JOIN
(SELECT COUNT(Id) as count, notation, gene
FROM `db-dummy`.sgdata
GROUP BY notation, gene
HAVING COUNT(id) > 1) b
on a.notation = b.notation AND a.gene = b.gene
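For MySQL versions before 8.0, which lack window functions, one possible variant is to compute the qualifying (notation, gene) pairs in a grouped derived table and join back to the detail rows. The sketch below uses a hypothetical miniature of the question's schema (unqualified table names, invented rows); the derived table selects only grouped columns, so it should be compatible with only_full_group_by.

```sql
-- Hypothetical miniature of the question's schema (MySQL syntax).
CREATE TABLE g_info    (rec_id INT PRIMARY KEY, notation VARCHAR(20), gene VARCHAR(20));
CREATE TABLE spec_data (rec_id INT PRIMARY KEY, id INT);
CREATE TABLE sgdata    (gen_id INT, spec_id INT);

INSERT INTO g_info    VALUES (1, 'n1', 'g1'), (2, 'n2', 'g2');
INSERT INTO spec_data VALUES (10, 100), (11, 101), (12, 102);
INSERT INTO sgdata    VALUES (1, 10), (1, 11), (2, 12);

SELECT a.notation AS A, a.gene AS B, b.id AS C
FROM sgdata c
JOIN g_info a    ON a.rec_id = c.gen_id
JOIN spec_data b ON b.rec_id = c.spec_id
JOIN (
    -- (notation, gene) pairs that occur with more than one distinct id
    SELECT a2.notation, a2.gene
    FROM sgdata c2
    JOIN g_info a2    ON a2.rec_id = c2.gen_id
    JOIN spec_data b2 ON b2.rec_id = c2.spec_id
    GROUP BY a2.notation, a2.gene
    HAVING COUNT(DISTINCT b2.id) > 1
) d ON d.notation = a.notation AND d.gene = a.gene;
```

With the sample rows, only the two (n1, g1) rows should come back, with C values 100 and 101.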

FOR Clause Issues/Combine Row into One Column

Currently my goal is that if I have three rows (a, b, z), (c, d, z), (e,f,z). I will combine them to form one column so (a:b, c:d, e:f, z).
I have tried the code:
SELECT
d.engagement_id,
(SELECT cf.field + ':' + cf.custom_field_value
FROM LEFT OUTER JOIN custom_fields cf ON cf.engagement_id = d.engagement_id
FOR XML PATH('')) [SECTORS]
FROM
pseudo_table d
Currently, it says I am missing a right parenthesis before the FOR. Any ideas on why this is happening/get to my goal?
In Oracle, you would use listagg():
SELECT d.engagement_id,
LISTAGG(cf.field || ':' || cf.custom_field_value, ', ') WITHIN GROUP (ORDER BY cf.field) as sectors
FROM pseudo_table d LEFT OUTER JOIN
custom_fields cf
ON cf.engagement_id = d.engagement_id
GROUP BY d.engagement_id;

SQL UPDATE Statement on complex select query

I have a complex select query with many joins. It is generated by a tool. I have to update a field based on that query.
I tried to decode it, but I'm not sure that an update based on my decoding is correct. Can I update the values based on the tool-generated query, like below?
UPDATE F_Sales SET d_source = "XYZ" WHERE
F_Sales.customer_code in (SELECT A, B, C, D......... FROM K, L, M, N, O,P ....)
create table #temp(customer_code INT)
insert into #temp SELECT A, B, C, D......... FROM K, L, M, N, O,P ....
UPDATE F_Sales SET d_source = 'XYZ' -- single quotes: a string literal, not an identifier
FROM F_Sales join #temp ON
F_Sales.customer_code = #temp.customer_code
Provided one of the A,B, .. columns (column D for example) can be mapped to F_Sales.customer_code
UPDATE F_Sales SET d_source = 'XYZ'
WHERE
F_Sales.customer_code in (
SELECT D
FROM ( -- untouched original query
SELECT A, B, C, D......... FROM K, L, M, N, O,P ....) q
)
or
UPDATE F_Sales SET d_source = 'XYZ'
FROM F_Sales
JOIN ( -- untouched original query
SELECT A, B, C, D......... FROM K, L, M, N, O,P ....) q
ON F_Sales.customer_code = q.D
Probably we can make it much better if you can show the generated query
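Another option, not taken from the answers above, is to keep the generated query untouched as a derived table and test it with EXISTS. In this T-SQL sketch, #Orders and its rows are hypothetical stand-ins for the tool-generated query, which isn't shown in the question.

```sql
-- Hypothetical stand-ins: #Orders plays the role of the generated query's source.
CREATE TABLE #F_Sales (customer_code int, d_source varchar(10));
CREATE TABLE #Orders  (customer_code int, amount int);

INSERT INTO #F_Sales VALUES (1, NULL), (2, NULL), (3, NULL);
INSERT INTO #Orders  VALUES (1, 50), (3, 75), (3, 20);

UPDATE f
SET d_source = 'XYZ'
FROM #F_Sales AS f
WHERE EXISTS (SELECT 1
              FROM ( -- the generated query would go here, unchanged
                    SELECT o.customer_code, o.amount
                    FROM #Orders o) AS q
              WHERE q.customer_code = f.customer_code);
```

With this data, customer codes 1 and 3 should end up with d_source = 'XYZ' and code 2 should stay NULL. Unlike a join, EXISTS does not care that code 3 appears twice in the inner query's result.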

return column name of the maximum value in sql server 2012

My table looks like this (Totally different names)
ID Column1--Column2---Column3--------------Column30
X 0 2 6 0101 31
I want to find the second maximum value of Column1 to Column30 and put the column name in a separate column.
First row would look like :
ID Column1--Column2---Column3--------------Column30------SecondMax
X 0 2 6 0101 31 Column3
Query :
Update Table
Set SecondMax= (select Column_Name from table where ...)
with unpvt as (
select id, c, m
from T
unpivot (c for m in (c1, c2, c3, ..., c30)) as u /* <-- your list of columns */
)
update T
set SecondMax = (
select top 1 m
from unpvt as u1
where
u1.id = T.id
and u1.c < (
select max(c) from unpvt as u2 where u2.id = u1.id
)
order by c desc, m
)
I really don't like relying on top but this isn't a standard sql question anyway. And it doesn't do anything about ties other than returning the first column name by order of alphabetical sort.
You could modify the condition as shown below to get the "third maximum". (Obviously the constant 2 comes from 3 - 1.) Your version of SQL Server lets you use a variable there as well, written as top (@n). SQL Server 2012 does not support the limit syntax, but it does add offset ... fetch to order by if that's preferable to top. And since it should work for top 0 and top 1 as well, you might just be able to run this query in a loop to populate all of your "maximums" from first to thirtieth.
Once you start having ties you'll eventually get a "thirtieth maximum" that's null. Make sure you cover those cases though.
and u1.c < all (
select distinct top 2 c from unpvt as u2 where u2.id = u1.id
order by c desc /* without this, top 2 picks arbitrary values */
)
After thinking about it some more: if you're going to rank and update so many columns, it would probably make even more sense to use a proper ranking function and do the update all at once. You'll also handle the ties a lot better, even if the alphabetic sorting is still arbitrary.
with unpvt as (
select id, c, m, row_number() over (partition by id order by c desc, m) as nth_max
from T
unpivot (c for m in (c1, c2, c3, ..., c30)) as u /* <-- your list of columns */
)
update T set
FirstMax = (select m from unpvt as u where u.id = T.id and nth_max = 1), /* m is the column name, c is the value */
SecondMax = (select m from unpvt as u where u.id = T.id and nth_max = 2),
...
NthMax = (select m from unpvt as u where u.id = T.id and nth_max = N)
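If ties should share a rank, so that "second maximum" means the second distinct value as in the first query, DENSE_RANK could be swapped in for ROW_NUMBER. This is a small sketch under that assumption; the table #T, its three value columns c1 to c3, and the rows are hypothetical.

```sql
-- Hypothetical three-column version of the table.
CREATE TABLE #T (id char(1), c1 int, c2 int, c3 int, SecondMax nvarchar(128) NULL);
INSERT INTO #T (id, c1, c2, c3) VALUES ('X', 0, 6, 2), ('Y', 1, 5, 5);

WITH unpvt AS (
    SELECT id, c, m,
           DENSE_RANK() OVER (PARTITION BY id ORDER BY c DESC) AS rnk
    FROM #T
    UNPIVOT (c FOR m IN (c1, c2, c3)) AS u
)
UPDATE t
SET SecondMax = (SELECT MIN(m)          -- alphabetically first name on a tie
                 FROM unpvt AS u
                 WHERE u.id = t.id AND u.rnk = 2)
FROM #T AS t;

SELECT id, SecondMax FROM #T;
```

For row X the values are 0, 6, 2, so the second distinct maximum is 2 and SecondMax should be c3. For row Y the two 5s share rank 1, so SecondMax should be c1. Note that UNPIVOT drops NULL values, so NULL columns are never candidates.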

Join Multiple Rows into 1 row different columns SQL Server

I am working on a query and would love some help.
I will provide a simplified version of the query in hopes that it communicates what I am attempting to do.
Given the following Tables:
TableA (RecordNumber, TableAID, SomeValue)
TableB (RecordNumber, X, Y, Z)
TableC (RecordNumber, D, E, F, G)
The result set I am looking for:
TableB.RecordNumber, X, Y, Z, D, E, F, G, SomeValue1, SomeValue2, SomeValue3, SomeValue4
My query currently is
Select
TableB.RecordNumber, X, Y, Z, D, E, F, G, SomeValue
from TableB
inner join
TableC on TableB.RecordNumber = TableC.RecordNumber
inner join
TableA on TableB.RecordNumber = TableA.RecordNumber
I realize that this is returning 1 row per SomeValue in TableA.
What I would like to do is combine each row for a RecordNumber into 1 row populating the SomeValueX with the SomeValue value from row X for that record number.
Thoughts?
You can't have a variable number of columns without using dynamic SQL but you can have a variable number of comma-separated values in a single column.
SELECT b.RecordNumber, X, Y, Z, D, E, F, G,
STUFF((
SELECT ',' + SomeValue
FROM TableA a
WHERE a.RecordNumber = b.RecordNumber
ORDER BY SomeValue
FOR XML PATH('')
), 1, 1, '') AS SomeValues
FROM TableB b
INNER JOIN TableC c
ON b.RecordNumber = c.RecordNumber
If TableA.SomeValue is not of a character or string data type, you would also want to cast it to a varchar of an appropriate length.
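A minimal sketch of that cast, assuming SomeValue is an int; the temp tables and rows here are invented for illustration and cut down to the columns needed.

```sql
-- Hypothetical cut-down tables with a numeric SomeValue.
CREATE TABLE #TableA (RecordNumber int, SomeValue int);
CREATE TABLE #TableB (RecordNumber int, X int);

INSERT INTO #TableA VALUES (1, 10), (1, 20), (2, 30);
INSERT INTO #TableB VALUES (1, 100), (2, 200);

SELECT b.RecordNumber, b.X,
       STUFF((SELECT ',' + CAST(a.SomeValue AS varchar(12))  -- cast before concatenating
              FROM #TableA a
              WHERE a.RecordNumber = b.RecordNumber
              ORDER BY a.SomeValue
              FOR XML PATH('')), 1, 1, '') AS SomeValues
FROM #TableB b;
```

Without the cast, SQL Server would try to convert ',' to int and raise a conversion error. With it, record 1 should show "10,20" and record 2 should show "30".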