Join Multiple Rows into 1 row different columns SQL Server - sql

I am working on a query and would love some help.
I will provide a simplified version of the query in hopes that it communicates what I am attempting to do.
Given the following Tables:
TableA (RecordNumber, TableAID, SomeValue)
TableB (RecordNumber, X, Y, Z)
TableC (RecordNumber, D, E, F, G)
The result set I am looking for:
TableB.RecordNumber, X, Y, Z, D, E, F, G, SomeValue1, SomeValue2, SomeValue3, SomeValue4
My query currently is
Select
TableB.RecordNumber, X, Y, Z, D, E, F, G, SomeValue
from TableB
inner join
TableC on TableB.RecordNumber = TableC.RecordNumber
inner join
TableA on TableB.RecordNumber = TableA.RecordNumber
I realize that this is returning 1 row per SomeValue in TableA.
What I would like to do is combine each row for a RecordNumber into 1 row populating the SomeValueX with the SomeValue value from row X for that record number.
Thoughts?

You can't have a variable number of columns without using dynamic SQL but you can have a variable number of comma-separated values in a single column.
SELECT b.RecordNumber, X, Y, Z, D, E, F, G,
STUFF((
SELECT ',' + SomeValue
FROM TableA a
WHERE a.RecordNumber = b.RecordNumber
ORDER BY SomeValue
FOR XML PATH('')
), 1, 1, '') AS SomeValues
FROM TableB b
INNER JOIN TableC c
ON b.RecordNumber = c.RecordNumber
If TableA.SomeValue is not of a character or string data type, you would also want to cast it to a varchar of an appropriate length.
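If the number of SomeValue columns can be capped (four in the question), dynamic SQL is not required. A minimal sketch, assuming at most four TableA rows per RecordNumber and assuming TableAID decides which row counts as row 1 (the question does not say how the rows should be ordered):

```sql
-- Assumption: TableAID gives the order of the rows within a RecordNumber.
-- Assumption: four output columns are enough; a fifth row would be ignored.
SELECT b.RecordNumber, b.X, b.Y, b.Z, c.D, c.E, c.F, c.G,
       MAX(CASE WHEN a.rn = 1 THEN a.SomeValue END) AS SomeValue1,
       MAX(CASE WHEN a.rn = 2 THEN a.SomeValue END) AS SomeValue2,
       MAX(CASE WHEN a.rn = 3 THEN a.SomeValue END) AS SomeValue3,
       MAX(CASE WHEN a.rn = 4 THEN a.SomeValue END) AS SomeValue4
FROM TableB b
INNER JOIN TableC c
    ON b.RecordNumber = c.RecordNumber
LEFT JOIN (
    -- Number the TableA rows 1, 2, 3, ... within each RecordNumber.
    SELECT RecordNumber, SomeValue,
           ROW_NUMBER() OVER (PARTITION BY RecordNumber ORDER BY TableAID) AS rn
    FROM TableA
) a
    ON a.RecordNumber = b.RecordNumber
GROUP BY b.RecordNumber, b.X, b.Y, b.Z, c.D, c.E, c.F, c.G;
```

Records with fewer than four TableA rows get NULL in the unused columns. The LEFT JOIN keeps records that have no TableA rows at all, which matches the STUFF version above; switch it to INNER JOIN to mirror the original query.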

Related

SQL Finding duplicate values in two of the three columns of each row

Let's say we have three columns: A, B, and C.
I would like to filter the results as follows:
The values of A and B are the same (duplicated) for > 1 (more than 1) row, and the value of C is always different.
In the attached image, the values that appear selected would meet the conditions mentioned above.
What I've tried:
SELECT
a.notation as A, a.gene as B, b.id as C
FROM
`db-dummy`.sgdata c
join `db-dummy`.g_info a on a.rec_id = c.gen_id
join `db-dummy`.spec_data b on b.rec_id = c.spec_id
GROUP BY A, B HAVING COUNT(*) > 1;
I thought that using GROUP BY and HAVING COUNT(*) > 1 I could get the desired result, but I get the following error:
SQL Error [1055] [42000]: (conn=1632) Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'db-dummy.b.spec_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
If you had a single table, I would suggest just using exists. But because you have a join, use window functions. If you are looking for different values of id:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
MIN(b.id) OVER (PARTITION BY a.notation, a.gene) as min_id,
MAX(b.id) OVER (PARTITION BY a.notation, a.gene) as max_id
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE min_id <> max_id;
If you are just looking for multiple rows for a given A and B, then you can use:
SELECT A, B, C
FROM (SELECT a.notation as A, a.gene as B, b.id as C,
COUNT(*) OVER (PARTITION BY a.notation, a.gene) as cnt
FROM `db-dummy`.sgdata c JOIN
`db-dummy`.g_info a
ON a.rec_id = c.gen_id JOIN
`db-dummy`.spec_data b
ON b.rec_id = c.spec_id
) x
WHERE cnt > 1;
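For reference, the single-table EXISTS approach mentioned at the start of this answer could look like the sketch below. The table name t and its columns A, B, C are hypothetical stand-ins, not the tables from the question:

```sql
-- Hypothetical table t(A, B, C).
-- Keep a row only if another row shares its A and B but has a different C.
SELECT t.A, t.B, t.C
FROM t
WHERE EXISTS (
    SELECT 1
    FROM t t2
    WHERE t2.A = t.A
      AND t2.B = t.B
      AND t2.C <> t.C
);
```

Note that a comparison with <> never matches NULL, so rows where C is NULL would need separate handling.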
SELECT * FROM `db-dummy`.sgdata a
INNER JOIN
(SELECT COUNT(Id) as count, notation, gene
FROM `db-dummy`.sgdata
GROUP BY notation, gene
HAVING COUNT(id) > 1) b
on a.notation = b.notation AND a.gene = b.gene

Using the result from a subquery elsewhere in the query

I have the following pseudo-sqlite call:
SELECT x, y,
(SELECT --very long SQL call--) AS z,
(SELECT a FROM diff_table_name WHERE b = z) AS e
FROM table_name
WHERE c = d
Essentially I want to use the z variable result from the first subquery in the second subquery, but I get a
no such column: z
error when I do. I can repeat the very long SQL call in the second subquery and that works, but I was hoping to not have to do that. Or maybe there's a way to return both a and z from one subquery?
This part of your query:
SELECT x, y,
(SELECT --very long SQL call--) AS z
FROM table_name
WHERE c = d
can be safely wrapped inside a CTE, and the outer query can then use the value of z:
WITH cte AS (
SELECT x, y,
(SELECT --very long SQL call--) AS z
FROM table_name
WHERE c = d
)
SELECT x, y, z,
(SELECT a FROM diff_table_name WHERE b = z) AS e
FROM cte
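The same idea also works with a subquery in the FROM clause, which may be handy on SQLite builds older than 3.8.3 that lack WITH. A runnable sketch follows; the column types, the sample rows, and the stand-in (SELECT x + y) for the very long SQL call are all assumptions made up for illustration:

```sql
-- Hypothetical sample data, only to make the example runnable in SQLite.
CREATE TABLE table_name (x INTEGER, y INTEGER, c INTEGER, d INTEGER);
CREATE TABLE diff_table_name (a TEXT, b INTEGER);
INSERT INTO table_name VALUES (1, 2, 5, 5), (3, 4, 5, 6);
INSERT INTO diff_table_name VALUES ('three', 3), ('seven', 7);

-- (SELECT x + y) stands in for the very long SQL call.
-- z is computed once in the inner query and reused by the outer one.
SELECT x, y, z,
       (SELECT a FROM diff_table_name WHERE b = z) AS e
FROM (
    SELECT x, y,
           (SELECT x + y) AS z
    FROM table_name
    WHERE c = d
) AS sub;
```

With this sample data the query returns the single row 1, 2, 3, three, because only the first row of table_name satisfies c = d.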

SQL UPDATE Statement on complex select query

I have a complex select query with many joins. It is generated by a tool. I have to update a field based on that query.
I tried to decode it, but I am not sure that an update based on my decoded version would be correct. Can I update the values based on the tool-generated query, like below?
UPDATE F_Sales SET d_source = "XYZ" WHERE
F_Sales.customer_code in (SELECT A, B, C, D......... FROM K, L, M, N, O,P ....)
create table #temp(customer_code INT)
insert into #temp SELECT A, B, C, D......... FROM K, L, M, N, O,P ....
UPDATE F_Sales SET d_source = "XYZ"
FROM F_Sales join #temp ON
F_Sales.customer_code = #temp.customer_code
Provided one of the A, B, ... columns (column D, for example) can be mapped to F_Sales.customer_code, you can use:
UPDATE F_Sales SET d_source = "XYZ"
WHERE
F_Sales.customer_code in (
SELECT D
FROM ( -- untouched original query
SELECT A, B, C, D......... FROM K, L, M, N, O,P ....) q
)
or
UPDATE F_Sales SET d_source = "XYZ"
FROM F_Sales
JOIN ( -- untouched original query
SELECT A, B, C, D......... FROM K, L, M, N, O,P ....) q
ON F_Sales.customer_code = q.D
We can probably make it much better if you show the generated query.

Problems with using STUFF

Why the heck isn't this working??? Seems to follow everything I've found around here. I'm getting the error:
Column '#TempTable.clientId' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I add the tt.clientId to the group by, then it doesn't do the stuff, and combine them all into 1 line, they come up as separate rows. Did I make a typo or something?
SELECT tt.Station, STUFF((SELECT ', ' + c.client_code
FROM client c
WHERE tt.clientId = c.ID
FOR XML PATH('')),1,1,'') [Values]
FROM #TempTable tt
GROUP BY tt.Station
The SELECT ... FOR XML PATH subquery should depend only on the GROUP BY column(s), tt.Station in your case.
Something like this:
SELECT tt.Station, STUFF((SELECT ', ' + c.client_code
FROM client c
JOIN #TempTable tt2
ON tt2.clientId = c.ID
AND tt2.Station = tt.Station
FOR XML PATH('')), 1, 2, '') [Values]
FROM #TempTable tt
GROUP BY tt.Station
You have to add tt.ClientId to the GROUP BY because you are using it to correlate the subquery here: WHERE tt.clientId = c.ID
Otherwise, how does SQL Server know which ClientId to use for each Station?
If you don't want to group by ClientId, then you have to correlate by tt.Station.
There are a couple alternatives here, but here are some things to consider:
Determine what you want your left-most item to be -- in this case, it looks like it's supposed to be station. If that is the case, one way to approach it is to first get the distinct set of stations. This can be done using a group by or a distinct.
Determine the level of the items for which you wish to generate a list -- in this case it's client_code. Therefore, you want your inner select to be at that level. One way to do that is to resolve the distinct set of client codes prior to attempting to use for xml.
One critique -- it's always good to provide a simple set of data. Makes providing an answer much easier and faster.
Again, there are alternatives here, but here's a possible solution.
The test data
select top (100)
client_id = abs(checksum(newid())) % 100,
client_code = char(abs(checksum(newid())) % 10 + 65)
into #client
from sys.all_columns a
cross join sys.all_columns b;
select top (100)
station = abs(checksum(newid())) % 10,
client_id = abs(checksum(newid())) % 50 -- just a subset
into #temp
from sys.all_columns a
cross join sys.all_columns b;
The query
select station, client_codes = stuff((
select ', ' + cc.client_code
from (
select distinct c.client_code -- distinct client codes
from #temp t
join #client c
on t.client_id = c.client_id
where t.station = s.station) cc
order by client_code
for xml path('')), 1, 2, '')
from (
select distinct station
from #temp) s; -- distinct stations
The results
station     client_codes
----------- ---------------------
0           B, C, D, E, G, J
1           A, B, D, G, H, J
2           A, C, E, F, G, H, J
3           B, C, H, J
4           A, B, C, D, F, H, J
5           H, J
6           D, E, F, G, I
7           A, C, D, F, G, H, J
8           A, E, G
9           C, E, F, G, I
Hope this helps.
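On SQL Server 2017 or later, STRING_AGG is another option that avoids STUFF and FOR XML PATH altogether. A sketch against the same #temp and #client test tables from this answer, assuming that version is available:

```sql
-- Assumption: SQL Server 2017+ (STRING_AGG is not available before that).
select s.station,
       client_codes = string_agg(s.client_code, ', ')
                      within group (order by s.client_code)
from (
    -- STRING_AGG has no DISTINCT option in SQL Server,
    -- so the distinct pairs are resolved first.
    select distinct t.station, c.client_code
    from #temp t
    join #client c
      on t.client_id = c.client_id) s
group by s.station;
```

One difference from the query above: a station with no matching client rows is left out here, whereas the FOR XML version returns it with a NULL list. STRING_AGG also does not entity-encode characters such as & or <, which FOR XML PATH('') does.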

Improving a query to find out-of-sync values between two tables

I have the following query:
SELECT
tableOneId,
SUM(a+b+c) AS tableOneData,
MIN(d) AS tableTwoData
FROM
tableTwo JOIN tableOne ON tableOneId = tableTwoId
GROUP BY
tableOneId
All of the mentioned columns are declared as numeric(30,6) NOT NULL.
In tableOne, I have entries whose sum (columns a, b, c) should be equivalent to column d in Table Two.
A simple example of this:
Table One (id here should read tableOneId to match above query)
id=1, a=1, b=0, c=0
id=1, a=0, b=2, c=0
id=2, a=1, b=0, c=0
Table Two (id here should read tableTwoId to match above query)
id=1, d=3
id=2, d=1
My first iteration used SUM(d)/COUNT(*) but division is messy so I'm currently using MIN(d). What would be a better way to write this query?
Try this:
SELECT
tableOneId,
tableOneData,
d AS tableTwoData
FROM tableTwo
JOIN (select tableOneId, sum(a + b + c) AS tableOneData
from tableOne
group by tableOneId) x ON tableOneId = tableTwoId
where tableOneData <> d;
This will return all rows that have incorrect data in table 2.
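To see this on concrete data, here is a small sketch using the sample rows from the question plus one deliberately out-of-sync id. The integer id columns and the extra id = 3 rows are assumptions added for illustration:

```sql
-- Sample rows from the question, plus a hypothetical mismatch for id 3.
CREATE TABLE tableOne (
    tableOneId int NOT NULL,   -- id type is an assumption
    a numeric(30,6) NOT NULL,
    b numeric(30,6) NOT NULL,
    c numeric(30,6) NOT NULL
);
CREATE TABLE tableTwo (
    tableTwoId int NOT NULL,   -- id type is an assumption
    d numeric(30,6) NOT NULL
);
INSERT INTO tableOne VALUES (1, 1, 0, 0), (1, 0, 2, 0), (2, 1, 0, 0), (3, 1, 1, 0);
INSERT INTO tableTwo VALUES (1, 3), (2, 1), (3, 5);

-- Aggregate tableOne first, then compare each total with d.
SELECT tableOneId, tableOneData, d AS tableTwoData
FROM tableTwo
JOIN (SELECT tableOneId, SUM(a + b + c) AS tableOneData
      FROM tableOne
      GROUP BY tableOneId) x ON tableOneId = tableTwoId
WHERE tableOneData <> d;
```

Ids 1 and 2 are in sync (3 = 3 and 1 = 1), so only id 3 comes back, with a tableOne total of 2 against d = 5. Because tableOne is aggregated before the join, neither MIN(d) nor SUM(d)/COUNT(*) is needed.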
select tableOneId, SUM(a) + SUM(b) + SUM(c) as tableOneData, d as tableTwoData
from tableTwo JOIN tableOne ON tableOneId = tableTwoId
GROUP BY tableOneId, d