SQL Server: Combine columns, skipping duplicate values - sql

I am using ADO to connect to SQL Server. I have a table and I want to combine several columns into one column. The values in the new column must be distinct.
Thanks, all!

Import your Excel file into SQL Server so you can run queries.
Then transpose your table. Transposing swaps columns and rows, turning a table like this:
+------+---------+----------+
| Name | Email1  | Email2   |
+------+---------+----------+
| A    | A@a.com | A@aa.com |
+------+---------+----------+
| B    | B@b.com | B@bb.com |
+------+---------+----------+
To something like this:
+--------+----------+----------+
| Name   | A        | B        |
+--------+----------+----------+
| Email1 | A@a.com  | B@b.com  |
+--------+----------+----------+
| Email2 | A@aa.com | B@bb.com |
+--------+----------+----------+
The approach is described here: Simple way to transpose columns and rows in Sql?
Then you can run SELECT DISTINCT [A] FROM [MyTable] for each column (that is, for each person), one at a time, and insert the results into a temp table with a single column.
Then:
SELECT STUFF((
SELECT ', ' + [temptablecolumn]
FROM #temptable
FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
For person A, this query gives you: A@a.com, A@aa.com

You can use APPLY to convert your TMs into rows and concatenate them using the FOR XML PATH() clause:
WITH t AS (
SELECT DISTINCT src.name, tt.tm
FROM [table] AS src -- replace [table] with your table name
CROSS APPLY
( VALUES (TM1), (TM2), (TM3), (TM4), (TM5)
) tt (tm)
)
SELECT t.name,
(SELECT '' + t1.tm
FROM t t1
WHERE t1.name = t.name
FOR XML PATH('')
) AS tn
FROM t
GROUP BY t.name; -- one output row per name

One method uses a giant case expression:
select id,
(tn1 +
(case when tn2 not in (tn1) then tn2 else '' end) +
(case when tn3 not in (tn1, tn2) then tn3 else '' end) +
(case when tn4 not in (tn1, tn2, tn3) then tn4 else '' end) +
(case when tn5 not in (tn1, tn2, tn3, tn4) then tn5 else '' end)
) as tn
from t;
I will add that having multiple columns with essentially the same data is usually a sign of a bad data model. Normally, you would want a table with one row per tn and id pair.
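As a rough sketch of that normalized shape: the temp table #t_normalized below is a hypothetical name, and the sample rows are made up for illustration. With one row per (id, tn) pair and a primary key on both columns, a duplicate cannot be stored in the first place, so the combined string is just a per-id concatenation.

```sql
-- Hypothetical normalized table: one row per (id, tn) pair.
CREATE TABLE #t_normalized (
    id int         NOT NULL,
    tn varchar(50) NOT NULL,
    PRIMARY KEY (id, tn)   -- the same tn cannot be stored twice for one id
);

-- Made-up sample rows.
INSERT INTO #t_normalized (id, tn)
VALUES (1, 'a'), (1, 'b'), (2, 'c');

-- Concatenate the tn values of each id, in tn order.
SELECT n.id,
       (SELECT '' + n2.tn
        FROM #t_normalized AS n2
        WHERE n2.id = n.id
        ORDER BY n2.tn
        FOR XML PATH('')) AS tn
FROM #t_normalized AS n
GROUP BY n.id;

DROP TABLE #t_normalized;
```

Note that a plain FOR XML PATH('') entitizes characters such as & and <; adding , TYPE and .value('.', 'nvarchar(max)') avoids that if your data may contain them.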

Related

SQL group by concatenate string of a query result

I have a query that joins several tables. The result has several fields, but I need to group by one of them, concatenating the contents of another field into a string.
The query result looks like the following table:
* query result
+-----------+-------------+
| element | option |
+-----------+-------------+
| 25 | foo 2 |
| 25 | bar 1 |
| 25 | baz 1 |
| 30 | foo 2 |
| 30 | baz 5 |
| 32 | baz 1 |
+-----------+-------------+
I have done similar things before with GROUP_CONCAT like this:
SELECT
result.element,
GROUP_CONCAT(result.options SEPARATOR ', ') AS 'options'
FROM (
-- place here an sql query with joins and some calculated fields --
) AS result
GROUP BY result.element
And it usually works, but it seems that the SQL Server instance I have to run this query on now does not support GROUP_CONCAT.
The sql server version is Microsoft SQL Server 2014 (SP2-CU8) (KB4037356) - 12.0.5557.0 (X64) Standard Edition (64-bit) on Windows NT 6.3 (Build 9600: ) (Hypervisor)
What I need in the end is something like this:
* final result
+-----------+-----------------------------+
| element | option |
+-----------+-----------------------------+
| 25 | foo 2, bar 1, baz 1 |
| 30 | foo 2, baz 5 |
| 32 | baz 1 |
+-----------+-----------------------------+
I've searched a lot and I found a way to do this directly from a table, but not from another query result. How can it be done?
EDIT: please, remember that I have to do the xml path from a query result, not from a table. I understand how to use it from a table, but I do not understand how to use the xml path from a query result.
If I use something like:
SELECT
result.element,
( SELECT STUFF((SELECT ',' + options
FROM result T2
WHERE T2.element= result.element
ORDER BY element
FOR XML PATH('')), 1, 1, '') )AS 'options'
FROM (
SELECT
st.element AS 'element',
CONCAT(st.salesoriginid, ' ', COUNT(st.salesoriginid)) AS 'options'
FROM SALESTABLE AS st WITH (NOLOCK)
LEFT JOIN SALESLINE AS sl WITH (NOLOCK) ON sl.SALESID = st.SALESID AND sl.DATAAREAID = st.DATAAREAID
LEFT JOIN INVENTDIM AS idim WITH (NOLOCK) ON idim.INVENTDIMID = sl.INVENTDIMID AND idim.DATAAREAID = sl.DATAAREAID
WHERE st.salestype = 3
AND st.salesoriginid IS NOT NULL
AND st.salesoriginid != ''
GROUP BY st.element, st.salesoriginid
) AS result
GROUP BY result.element
Then I get error:
Invalid object name 'result' [SQL State=S0002, DB Errorcode=208]
You can use STUFF (option is a reserved word in SQL Server, so it is bracketed here):
Select Distinct element, (
SELECT STUFF((SELECT ',' + [option]
FROM #T T2
Where T2.element = T1.element
ORDER BY element
FOR XML PATH('')), 1, 1, '') )AS [Options]
From #T T1
This should work:
select t2.element, [option] = stuff((select ',' + t1.[option] from [table] t1 where
t1.element=t2.element for xml path ('')), 1, 1, '') from [table] t2
group by t2.element
How about using a CTE?
with t as (
<your query here>
)
select e.element,
stuff( (select ',' + t2.[option]
from t t2
where t2.element = e.element
for xml path ('')
), 1, 1, ''
) as options
from (select distinct element from t) e;
You can probably simplify this by pulling the elements directly from a base table.
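The derived-table attempt fails because the alias result is only visible to the outer SELECT, not as an object that a subquery can select FROM again, whereas a CTE name can be referenced more than once in the same statement. A minimal, self-contained sketch of the CTE approach follows; the VALUES list is only a stand-in for the real joined query and simply copies the sample rows from the question.

```sql
WITH t AS (
    -- Stand-in for the real joined query; rows copied from the question.
    SELECT v.element, v.[option]
    FROM (VALUES (25, 'foo 2'), (25, 'bar 1'), (25, 'baz 1'),
                 (30, 'foo 2'), (30, 'baz 5'),
                 (32, 'baz 1')) AS v (element, [option])
)
SELECT e.element,
       -- Build ', a, b, c' for each element, then strip the leading ', '.
       STUFF((SELECT ', ' + t2.[option]
              FROM t AS t2
              WHERE t2.element = e.element
              FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
             1, 2, '') AS options
FROM (SELECT DISTINCT element FROM t) AS e
ORDER BY e.element;
```

This should return one row per element. The order of the options inside each string is not guaranteed unless you add an ORDER BY inside the subquery.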

Getting similar column names and count from multiple tables

So I have a table called 'SongsMetadata' in my database with 6 columns as shown below (appx 70k records). It contains all songs related information.
It is slightly different than the regular database table. The 'File_name' column contains .csv files. Those are the actual tables and values in front of them are the columns in that csv file.
So for the '1001186_1_7562755270480253254.csv' record in the SongsMetadata table, '1001186_1_7562755270480253254' is the table name and its columns are '&nbsp', 'name', 'album', 'time', 'price' (these tables contain a lot of garbage values).
My goal is to compare all the tables (in this case .csv files) to get all the similar column names and their count. I already have a solution to get common column names and counts for normal tables here. Each table will be compared with every other table. However, I'm not sure how I can achieve the same with .csv tables.
The expected output is:
1001186_1_7562755270480253254.csv & 1001186_0_5503858345485431752.csv | &nbsp, name, price| 3 #common columns count
1001186_0_5503858345485431752.csv & 99524146_0_3894874701785592836.csv | &nbsp, name, price| 3
and so on...
Any suggestions are appreciated.
The following solution shows how to treat your existing table so that the wanted matching can occur efficiently. This requires an unpivot, although the effect of an unpivot is achieved here by using cross apply and values, which is a simple and efficient method. After that the "matching" is shown, followed by an alternative query for details you may also find useful. Lastly the new table is displayed just to help visualize what it is.
See this as a live demo at SQL Fiddle
Small Sample:
CREATE TABLE SongsMetadata
([file_name] varchar(7), [col1] varchar(6), [col2] varchar(6), [col3] varchar(6), [col4] varchar(6))
;
INSERT INTO SongsMetadata
([file_name], [col1], [col2], [col3], [col4])
VALUES
('abc.csv', ' ', 'name', 'price', 'artist'),
('def.csv', 'name', ' ', ' ', 'price')
;
UNPIVOT Query
This query moves the column information into a normalized structure to enable the subsequent matching to occur. It is vital to the overall solution. As an added bonus you can mark some column names as "bad" so that these may be ignored later, e.g. a blank name (which is most likely garbage data).
select
file_name, column_number, column_name
, case when column_name IN (' ','</b>','other-unwanted') then 0 else 1 end as col_is_good
into SongsMetadataUpivot
from (
select file_name, column_number, column_name
from SongsMetadata
cross apply (
values
(1, col1)
, (2, col2)
, (3, col3)
, (4, col4)
) ca (column_number, column_name)
) d
;
Query 1:
This is the "matching logic" provided at http://rextester.com/TLQ28814 but applied to the unpivoted songs data, AND it has the ability to exclude column names you simply don't want to consider (col_is_good).
with fmatch as (
select
l.file_name + ' & ' + r.file_name AS comparing_files
, l.column_name
from SongsMetadataUpivot l
inner join SongsMetadataUpivot r on l.column_name = r.column_name
and l.file_name < r.file_name
and r.col_is_good = 1
where l.col_is_good = 1
)
select --* from fmatch
f.comparing_files
, STUFF((
SELECT
N', ' + column_name
FROM fmatch c
WHERE f.comparing_files = c.comparing_files
order by c.column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'') as columns
, count(*) as num_col_matches
from fmatch f
group by f.comparing_files
Results:
| comparing_files | columns | num_col_matches |
|-------------------|-------------|-----------------|
| abc.csv & def.csv | name, price | 2 |
Query 2:
This will simply allow production of the column lists, in name order, together with their respective column positions in each file.
SELECT
file_name, ca.*
from SongsMetadata f
cross apply (
select
STUFF((
SELECT
N', ' + column_name
FROM SongsMetadataUpivot c
WHERE f.file_name = c.file_name
AND c.col_is_good = 1
ORDER BY column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'')
, STUFF((
SELECT
N', ' + cast(column_number as nvarchar)
FROM SongsMetadataUpivot c
WHERE f.file_name = c.file_name
AND c.col_is_good = 1
ORDER BY column_name
FOR xml PATH (''), TYPE
)
.value('text()[1]', 'nvarchar(max)'), 1, 2, N'')
) ca (column_names, col_numbers)
Results:
| file_name | column_names | col_numbers |
|-----------|---------------------|-------------|
| abc.csv | artist, name, price | 4, 2, 3 |
| def.csv | name, price | 1, 4 |
Query 3:
This shows the "unpivoted" data so you can visualize it; the overall solution requires this step.
select * from SongsMetadataUpivot
Results:
| file_name | column_number | column_name | col_is_good |
|-----------|---------------|-------------|-------------|
| abc.csv | 1 | | 0 |
| abc.csv | 2 | name | 1 |
| abc.csv | 3 | price | 1 |
| abc.csv | 4 | artist | 1 |
| def.csv | 1 | name | 1 |
| def.csv | 2 | | 0 |
| def.csv | 3 | | 0 |
| def.csv | 4 | price | 1 |

Rotate columns to rows for joined tables

I have two tables similar to shown below (just leaving out fields for simplicity).
Table lead :
id | fname | lname | email
---------------------------------------------
1 | John | Doe | jd@test.com
2 | Mike | Johnson | mj@test.com
Table leadcustom :
id | leadid | name | value
-------------------------------------------------
1 | 1 | utm_medium | cpc
2 | 1 | utm_term | fall
3 | 1 | subject | business
4 | 2 | utm_medium | display
5 | 2 | utm_term | summer
6 | 2 | month | may
7 | 2 | color | red
I have a database that captures leads for a wide variety of forms that often have many different form fields. The first table gets the basic info that I know is on each form. The second table captures all other form fields that were sent over, so it can contain a lot of different fields.
What I am trying to do is to do a join where I can grab all fields from lead table along with utm_medium and utm_term from leadcustom table. I don't need any additional fields even if they were sent over.
Desired results :
id | fname | lname | email | utm_medium | utm_term
---------------------------------------------------------------------------
1 | John | Doe | jd@test.com | cpc | fall
2 | Mike | Johnson | mj@test.com | display | summer
The only way I know to do this is to grab all the lead data and then, for each record, make more calls to get the leadcustom data I am looking for, but I know there has to be a more efficient way of getting this data.
I appreciate any help with this. Changing the way I capture the data or the table formats is not an option.
If your columns are fixed, you can do this with group by + case + max like this:
select
fname,
lname,
email,
max(case when name = 'utm_medium' then value end) as utm_medium,
max(case when name = 'utm_term' then value end) as utm_term
from
lead l
join leadcustom c
on l.id = c.leadid
group by
fname,
lname,
email
The case expression returns the value from the leadcustom table when the row matches the given name and null otherwise, and max picks the assigned value, if one exists, over the nulls.
You can test this in SQL Fiddle
The other way to do this is to use the pivot operator, but that syntax is slightly more complex, or at least this way is easier for me.
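For a self-contained variant of the group by + case + max approach, here is a sketch that loads the sample rows from the question into table variables (@lead and @leadcustom are illustrative names, not the real tables). It also makes two changes that are my own assumptions about what is wanted: it returns id, and it uses a LEFT JOIN with the name filter in the ON clause so that a lead with no matching custom fields still appears, with nulls.

```sql
DECLARE @lead TABLE (id int, fname varchar(20), lname varchar(20), email varchar(50));
DECLARE @leadcustom TABLE (id int, leadid int, name varchar(20), value varchar(20));

-- Sample rows copied from the question.
INSERT INTO @lead (id, fname, lname, email)
VALUES (1, 'John', 'Doe', 'jd@test.com'),
       (2, 'Mike', 'Johnson', 'mj@test.com');

INSERT INTO @leadcustom (id, leadid, name, value)
VALUES (1, 1, 'utm_medium', 'cpc'),
       (2, 1, 'utm_term', 'fall'),
       (3, 1, 'subject', 'business'),
       (4, 2, 'utm_medium', 'display'),
       (5, 2, 'utm_term', 'summer'),
       (6, 2, 'month', 'may'),
       (7, 2, 'color', 'red');

-- One row per lead; each CASE keeps only the value whose name matches.
SELECT l.id, l.fname, l.lname, l.email,
       MAX(CASE WHEN c.name = 'utm_medium' THEN c.value END) AS utm_medium,
       MAX(CASE WHEN c.name = 'utm_term'   THEN c.value END) AS utm_term
FROM @lead AS l
LEFT JOIN @leadcustom AS c
       ON c.leadid = l.id
      AND c.name IN ('utm_medium', 'utm_term')  -- ignore the other custom fields
GROUP BY l.id, l.fname, l.lname, l.email
ORDER BY l.id;
```

With the sample data this should give the two rows shown under "Desired results".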
Unless I interpret your question incorrectly - in which case I'm happy to be corrected - you could achieve your goal with simple left joins on the ID of the first table, one join per custom field you need:
select ld.*, m.value as utm_medium, t.value as utm_term
from lead ld
left join leadcustom m
on ld.id = m.leadid and m.name = 'utm_medium'
left join leadcustom t
on ld.id = t.leadid and t.name = 'utm_term'
You can use a cte or a derived table to solve this:
cte:
;with cte as
(
select leadid, [name], [value]
from leadcustom
where name in('utm_medium', 'utm_term')
)
select id, fname, lname, email, [name], [value]
from lead
inner join cte on(id = leadid)
Derived table:
select id, fname, lname, email, [name], [value]
from lead
inner join
(
select leadid, [name], [value]
from leadcustom
where name in('utm_medium', 'utm_term')
) derived on(id = leadid)
and since suslov used JamesZ's fiddle, I will use it too...
declare @t table (Id int,fname varchar(10),lname varchar(10),email varchar(20))
insert into @t(Id,fname,lname,email)values (1,'john','doe','jd@test.com'),(2,'mike','johnson','mj@test.com')
declare @tt table (id int,leadid int,name varchar(10),value varchar(10))
insert into @tt(id,leadid,name,value)values
(1,1,'utm_medium','cpc'),
(2,1,'utm_term','fall'),
(3,1,'subject','business'),
(4,2,'utm_medium','display'),
(5,2,'utm_term','summer'),
(6,2,'month','may'),(7,2,'color','red')
select Id,fname,lname,
email,
[utm_medium],
[utm_term]
from (
select t.Id,
t.fname,
t.lname,
t.email,
tt.name,
tt.value
from @t t JOIN @tt tt
ON t.Id = tt.leadid)R
PIVOT(MAX(value) for name IN([utm_medium],[utm_term]))P
You can try with pivot and join:
select [id]
, [fname]
, [lname]
, [email]
, [utm_medium]
, [utm_term]
from ( select t2.*
, t1.[name]
, t1.[value]
from [leadcustom] t1
join [lead] t2 on t2.[id] = t1.[leadid]
) t
pivot (
max([value])
for [name] in ([utm_medium], [utm_term])
) pt
pivot rotates the joined table-valued expression by turning the unique values of the [name] column into the [utm_medium] and [utm_term] columns of the output. The max function is only a nominal aggregation: pivot requires an aggregate because several [value] rows could exist for one lead and one [name], even though the sample data has only one.
SQLFiddle

SQL Query to get aggregated result in comma separators along with group by column in SQL Server

I need to write a sql query on the table such that the result would have the group by column along with the aggregated column with comma separators.
My table would be in the below format
|`````````|````````|
| ID | Value |
|_________|________|
| 1 | a |
|_________|________|
| 1 | b |
|_________|________|
| 2 | c |
|_________|________|
Expected result should be in the below format
|`````````|````````|
| ID | Value |
|_________|________|
| 1 | a,b |
|_________|________|
| 2 | c |
|_________|________|
You want to use FOR XML PATH construct:
select
ID,
stuff((select ', ' + Value
from YourTable t2 where t1.ID = t2.ID
for xml path('')),
1,2,'') [Values]
from YourTable t1
group by ID
The STUFF function is to get rid of the leading ', '.
You can also see other examples here:
SQL same unit between two tables needs order numbers in 1 cell
SQL and Coldfusion left join tables getting duplicate results as a list in one column
Just for a balanced view, you can also do this with a recursive CTE, but I don't think it's as good as the cross apply method. I've coded this off the hoof, so apologies if it doesn't work.
WITH CommaDelimitedCTE (RowNumber,ID,[Value],[Values]) AS
(
SELECT 1,MT.ID , MIN(MT.Value), CAST(MIN(MT.Value) AS VARCHAR(8000))
FROM MyTable MT
GROUP BY MT.ID
UNION ALL
SELECT CT.RowNumber + 1, MT.ID, MT.Value, CAST(CT.[Values] + ', ' + MT.Value AS VARCHAR(8000))
FROM MyTable MT
INNER JOIN CommaDelimitedCTE CT ON CT.ID = MT.ID
WHERE MT.[Value] > CT.[Value]
)
-- keep only the longest chain built for each ID
SELECT C.* FROM CommaDelimitedCTE C
INNER JOIN (SELECT ID, MAX(RowNumber) AS MaxRowNumber FROM CommaDelimitedCTE GROUP BY ID) Q ON Q.ID = C.ID
AND Q.MaxRowNumber = C.RowNumber
In SQL Server 2017 (14.x) and later you can use the STRING_AGG function:
https://learn.microsoft.com/en-us/sql/t-sql/functions/string-agg-transact-sql?view=sql-server-ver16
SELECT
ID,
STRING_AGG(Value, ',')
FROM TableName
GROUP BY ID
Depending on the data type of Value you might need to convert it:
SELECT
ID,
STRING_AGG(CONVERT(NVARCHAR(max), Value), ',')
FROM TableName
GROUP BY ID
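STRING_AGG on its own does not promise any particular order for the concatenated values. If the order matters, the optional WITHIN GROUP clause controls it. A small sketch, using a table variable @t as a stand-in for the real table and rows inserted deliberately out of order:

```sql
DECLARE @t TABLE (ID int, Value varchar(10));

-- Same data as the question, inserted out of order on purpose.
INSERT INTO @t (ID, Value)
VALUES (1, 'b'), (1, 'a'), (2, 'c');

SELECT ID,
       STRING_AGG(Value, ',') WITHIN GROUP (ORDER BY Value) AS [Values]
FROM @t
GROUP BY ID
ORDER BY ID;
-- ID 1 -> a,b
-- ID 2 -> c
```

Like STRING_AGG itself, this requires SQL Server 2017 or later.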

How to select FK as header names and values as a list of these values?

I have the following structure:
TABLE: Field
ID | Name
---|--------
1 | Email
2 | City
And
TABLE: Answers
ID | Field | Value | User
-----------------------------------
1 | 1 | m1@mail.com | 3
2 | 2 | abc | 3
3 | 1 | m2@mail.com | 4
4 | 2 | qwe | 4
I want to select:
Email | City
-------------------
m1@mail.com | abc
m2@mail.com | qwe
How can I do it?
You can try this:
DECLARE @columns NVARCHAR(MAX)
SELECT @columns = COALESCE(@columns + ',[' + cast(f.[Name] as varchar) + ']',
'[' + CAST(f.[Name] as VARCHAR)+ ']')
FROM Answers AS a INNER JOIN Field AS f ON a.[Field] = f.[ID]
GROUP BY f.[Name]
DECLARE @query NVARCHAR(MAX)
SET @query = '
SELECT * FROM
(SELECT f.[Name], a.[Value], a.[User]
FROM Answers AS a INNER JOIN Field AS f ON a.[Field]
= f.[ID]) AS s
PIVOT (MAX(Value) FOR [Name] IN (' + @columns + ')) AS p'
EXECUTE(@query);
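A variant of the same dynamic pivot, sketched under a few assumptions of my own: temp tables #Field and #Answers stand in for the real tables (temp tables, unlike table variables, are visible to the dynamic batch), QUOTENAME is used instead of hand-written brackets so that a field name containing ] cannot break the statement, and the column list is built with FOR XML PATH rather than variable concatenation.

```sql
CREATE TABLE #Field   (ID int, Name varchar(50));
CREATE TABLE #Answers (ID int, Field int, Value varchar(100), [User] int);

-- Sample rows copied from the question.
INSERT INTO #Field (ID, Name) VALUES (1, 'Email'), (2, 'City');
INSERT INTO #Answers (ID, Field, Value, [User])
VALUES (1, 1, 'm1@mail.com', 3), (2, 2, 'abc', 3),
       (3, 1, 'm2@mail.com', 4), (4, 2, 'qwe', 4);

DECLARE @columns nvarchar(max), @query nvarchar(max);

-- Build '[Email],[City]'; QUOTENAME adds the brackets and escapes any ] in a name.
SELECT @columns = STUFF((SELECT ',' + QUOTENAME(f.Name)
                         FROM #Field AS f
                         ORDER BY f.ID
                         FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'),
                        1, 1, '');

-- Select only the pivoted columns so [User] is used for grouping but not shown.
SET @query = N'
SELECT ' + @columns + N'
FROM (SELECT f.Name, a.Value, a.[User]
      FROM #Answers AS a
      JOIN #Field AS f ON a.Field = f.ID) AS s
PIVOT (MAX(Value) FOR Name IN (' + @columns + N')) AS p;';

EXEC sys.sp_executesql @query;

DROP TABLE #Answers, #Field;
```

Because the column list comes from the Field table, adding a third field there changes the output without touching the query text.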
I don't see how you can do that in a single select statement.
It's a little confusing, but I think this could work:
SELECT
Emails.Value as Email,
Cities.City
FROM
Answers as Emails
JOIN
(
SELECT
Answers.Value as City,
Answers.[User]
FROM
Answers
WHERE
Answers.Field = 2
) AS Cities
ON
(Emails.[User] = Cities.[User])
WHERE
Emails.Field = 1
Since the column is the same, I'm first selecting the email and then selecting the city, and finally joining them both so they appear in the same result row.
SELECT [User],
MAX(CASE WHEN Field=1 THEN Value END) AS [Email],
MAX(CASE WHEN Field=2 THEN Value END) AS [City]
FROM Answers
GROUP BY [User];
You can also do the same using PIVOT, but personally I find the syntax above clearer and easier to use than PIVOT. If you have dynamic fields, you need to make this query generic as well. I'd assume you would create a function that reads all distinct values in the first table, iterates through them, and returns the proper query (you need to append MAX(CASE WHEN field=N THEN value END) AS [Field_N_Name] for each ID in the first table).
SELECT A1.Value AS Email, A2.Value AS City FROM Answers A1 JOIN Answers A2 ON A1.[User] = A2.[User] WHERE A1.Field = 1 AND A2.Field = 2
"Self-join". But this is a non-generic solution that will break when you add Field 3.