Remove Duplicate Texts in a Column


In my temp table I have a column for a list of email addresses which might be repeated. For example:
Row#1: test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com
Desired Results:
Row#1: test@gmail.com; test@yahoo.com; abc@gmail.com
Row#2: abc@yahoo.com; abcde@yahoo.com
Is there a way to achieve this in SQL Server language?

Well, assuming SQL Server 2017, and that you have a key column (or combination of columns), you could use both STRING_SPLIT and STRING_AGG:
WITH CTE AS
(
    SELECT DISTINCT
           T.KeyColumn,
           LTRIM(RTRIM(E.value)) AS Email -- trim the space left after each ';' so duplicates compare equal
    FROM dbo.YourTable T
         OUTER APPLY STRING_SPLIT(T.Email,';') E
)
SELECT KeyColumn,
       STRING_AGG(Email,'; ') AS Email
FROM CTE
GROUP BY KeyColumn;
UPDATE for SQL Server 2016:
With no STRING_AGG, you'll have to use one of the old ways; for instance:
WITH CTE AS
(
    SELECT DISTINCT
           T.KeyColumn,
           LTRIM(RTRIM(E.value)) AS Email -- trim the space left after each ';' so duplicates compare equal
    FROM dbo.YourTable T
         OUTER APPLY STRING_SPLIT(T.Email,';') E
)
SELECT t.KeyColumn,
       Email = STUFF((SELECT '; ' + CONVERT(varchar(255),Email)
                      FROM CTE
                      WHERE KeyColumn = t.KeyColumn
                      FOR XML PATH(''), TYPE).value('.[1]','nvarchar(max)'),1,2,'') -- STUFF strips the leading '; '
FROM CTE t
GROUP BY t.KeyColumn;
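Neither query above guarantees that the addresses come back in their original order, because STRING_SPLIT does not promise one. As a minimal sketch, assuming SQL Server 2022 or Azure SQL Database (where STRING_SPLIT accepts a third enable_ordinal argument), the first position of each address can be kept and used to order the result. The table variable @T and its sample rows are stand-ins for the real table:

```sql
-- Hypothetical sample data standing in for dbo.YourTable
DECLARE @T TABLE (KeyColumn int PRIMARY KEY, Email varchar(500));
INSERT INTO @T (KeyColumn, Email)
VALUES (1, 'test@gmail.com; test@gmail.com; test@yahoo.com; abc@gmail.com'),
       (2, 'abc@yahoo.com; abcde@yahoo.com; abcde@yahoo.com');

WITH Parts AS
(
    SELECT T.KeyColumn,
           LTRIM(RTRIM(E.value)) AS Email,
           MIN(E.ordinal) AS FirstPos      -- position of the first occurrence
    FROM @T AS T
         CROSS APPLY STRING_SPLIT(T.Email, ';', 1) AS E  -- 1 = return the ordinal column
    GROUP BY T.KeyColumn, LTRIM(RTRIM(E.value))          -- one row per distinct address
)
SELECT KeyColumn,
       STRING_AGG(Email, '; ') WITHIN GROUP (ORDER BY FirstPos) AS Email
FROM Parts
GROUP BY KeyColumn;
-- KeyColumn 1: test@gmail.com; test@yahoo.com; abc@gmail.com
-- KeyColumn 2: abc@yahoo.com; abcde@yahoo.com
```

Grouping on the trimmed value replaces the DISTINCT, so the duplicate removal and the ordering come from the same pass.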

Related

Microsoft SQL Server - Convert column values to list for SELECT IN

I have this (3 int columns in one table)
Int1 Int2 Int3
---------------
1 2 3
I would like to run such query with another someTable:
SELECT * FROM someTable WHERE someInt NOT IN (1,2,3)
where 1,2,3 are list of INTs converted to a list that I can use with SELECT * NOT IN statement
Any suggestions on how to achieve this without stored procedures in Microsoft SQL Server 2019?

If you want rows in some table that are not in one of three columns of another table, then use not exists:
select t.*
from sometable t
where not exists (select 1
                  from othertable t2 -- placeholder name for the table holding int1, int2, int3
                  where t.someint in (t2.int1, t2.int2, t2.int3)
                 );
The subquery returns a row where there is a match. The outer query then rejects any rows with a match.

Seems like you actually want a NOT EXISTS?
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.oneTable oT
                  WHERE sT.someInt IN (oT.int1,oT.int2,oT.int3)); -- IN, not NOT IN: the NOT EXISTS already negates the match
An alternative method would be to unpivot the data, and then use an equality operator:
SELECT {Your Columns}
FROM dbo.someTable sT
WHERE NOT EXISTS (SELECT 1
FROM dbo.oneTable oT
CROSS APPLY (VALUES(oT.int1),(oT.int2),(oT.int3))V(I)
WHERE V.I = sT.someInt);
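To try the unpivot variant without touching real tables, here is a small sketch using table variables. The names @oneTable and @someTable and the sample values are made up for illustration:

```sql
-- Hypothetical sample data
DECLARE @oneTable TABLE (int1 int, int2 int, int3 int);
DECLARE @someTable TABLE (someInt int);

INSERT INTO @oneTable (int1, int2, int3) VALUES (1, 2, 3);
INSERT INTO @someTable (someInt) VALUES (1), (3), (4), (7);

SELECT sT.someInt
FROM @someTable AS sT
WHERE NOT EXISTS (SELECT 1
                  FROM @oneTable AS oT
                       -- turn the three columns into three rows
                       CROSS APPLY (VALUES (oT.int1), (oT.int2), (oT.int3)) AS V(I)
                  WHERE V.I = sT.someInt);
-- returns the rows with someInt = 4 and someInt = 7
```

Unlike NOT IN against a list that may contain NULL, NOT EXISTS still returns the unmatched rows when one of the three columns is NULL.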

Column names of a CTE in SQL Server

I know it is possible to SELECT, from sys.columns and from tempdb.sys.columns the names of the columns of a specific table.
Can the same be done from a CTE?
with SampleCTE as (
Select
'Tom' as Name
,'Bombadill' as Surname
,99999 as Age
,'Withywindle' as Address
)
Is there any way to know that the columns of this CTE are Name, Surname, Age and Address, without resorting to dumping the CTE result to a temporary table and reading the columns from there?
Thanks!

Here is a "dynamic" approach without actually using Dynamic SQL.
Unpivot (dynamic or not) would be more performant
Example
with SampleCTE as (
Select
'Tom' as Name
,'Bombadill' as Surname
,99999 as Age
,'Withywindle' as Address
)
Select C.*
From SampleCTE A
Cross Apply ( values (cast((Select A.* for XML RAW) as xml))) B(XMLData)
Cross Apply (
Select Item = a.value('local-name(.)','varchar(100)')
,Value = a.value('.','varchar(max)')
From B.XMLData.nodes('/row') as C1(n)
Cross Apply C1.n.nodes('./@*') as C2(a)
Where a.value('local-name(.)','varchar(100)') not in ('ID','ExcludeOtherCol')
) C
Returns
Item Value
Name Tom
Surname Bombadill
Age 99999
Address Withywindle

Yes, it is possible with sys.dm_exec_describe_first_result_set:
This dynamic management function takes a Transact-SQL statement as a parameter and describes the metadata of the first result set for the statement.
SELECT name
FROM sys.dm_exec_describe_first_result_set(
N'
with SampleCTE as (
Select
''Tom'' as Name
,''Bombadill'' as Surname
,99999 as Age
,''Withywindle'' as Address
)
SELECT * FROM SampleCTE
', NULL, NULL);
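The same function also exposes each column's position, data type, and nullability, which can be handy when the CTE is less trivial. A minimal sketch, reusing the sample CTE from the question:

```sql
SELECT column_ordinal,
       name,
       system_type_name,   -- the inferred type of each column
       is_nullable
FROM sys.dm_exec_describe_first_result_set(
    N'WITH SampleCTE AS (
          SELECT ''Tom'' AS Name,
                 ''Bombadill'' AS Surname,
                 99999 AS Age,
                 ''Withywindle'' AS Address
      )
      SELECT * FROM SampleCTE', NULL, 0)
ORDER BY column_ordinal;
```

The statement is only described, not executed, so this works without materializing the CTE's rows.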

String_agg for SQL Server before 2017

Can anyone help me make this query work for SQL Server 2014?
This works on PostgreSQL and probably on SQL Server 2017. On Oracle it is listagg instead of string_agg.
Here is the SQL:
select
string_agg(t.id,',') AS id
from
Table t
I checked on the site some xml option should be used but I could not understand it.

In SQL Server pre-2017, you can do:
select stuff( (select ',' + cast(t.id as varchar(max))
               from [Table] t -- bracketed because TABLE is a reserved word
               for xml path ('')
              ), 1, 1, ''
            );
The only purpose of stuff() is to remove the initial comma. The work is being done by for xml path.

Note that for some characters, the values will be escaped when using FOR XML PATH, for example:
SELECT STUFF((SELECT ',' + V.String
FROM (VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String)
FOR XML PATH('')),1,1,'');
This returns the string below:
7 &gt; 5,Salt &amp; pepper,2&#x0D;
lines
This is unlikely to be what you want. You can get around this using TYPE and then getting the value of the XML:
SELECT STUFF((SELECT ',' + V.String
FROM (VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String)
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'');
This returns the string below:
7 > 5,Salt & pepper,2
lines
This would replicate the behaviour of the following:
SELECT STRING_AGG(V.String,',')
FROM (VALUES('7 > 5'),('Salt & pepper'),('2
lines'))V(String);
Of course, there might be times where you want to group the data, which the above doesn't demonstrate. To achieve this you would need to use a correlated subquery. Take the following sample data:
CREATE TABLE dbo.MyTable (ID int IDENTITY(1,1),
GroupID int,
SomeCharacter char(1));
INSERT INTO dbo.MyTable (GroupID, SomeCharacter)
VALUES (1,'A'), (1,'B'), (1,'D'),
(2,'C'), (2,NULL), (2,'Z');
From this, suppose you want the results below:
GroupID Characters
1 A,B,D
2 C,Z
To achieve this you would need to do something like this:
SELECT MT.GroupID,
STUFF((SELECT ',' + sq.SomeCharacter
FROM dbo.MyTable sq
WHERE sq.GroupID = MT.GroupID --This is your correlated join and should be on the same columns as your GROUP BY
--You "JOIN" on the columns that would have been in the PARTITION BY
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')
FROM dbo.MyTable MT
GROUP BY MT.GroupID; --I use GROUP BY rather than DISTINCT as we are technically aggregating here
So, if you were grouping on 2 columns, then you would have 2 clauses in your subquery's WHERE: WHERE MT.SomeColumn = sq.SomeColumn AND MT.AnotherColumn = sq.AnotherColumn, and your outer GROUP BY would be GROUP BY MT.SomeColumn, MT.AnotherColumn.
Finally, let's add an ORDER BY into this, which you also define in the subquery. Let's, for example, assume you wanted to sort the data by the value of the ID descending in the string aggregation:
SELECT MT.GroupID,
STUFF((SELECT ',' + sq.SomeCharacter
FROM dbo.MyTable sq
WHERE sq.GroupID = MT.GroupID
ORDER BY sq.ID DESC --This is identical to the ORDER BY you would have in the WITHIN GROUP clause of STRING_AGG
FOR XML PATH(''),TYPE).value('(./text())[1]','varchar(MAX)'),1,1,'')
FROM dbo.MyTable MT
GROUP BY MT.GroupID;
This would produce the following results:
GroupID Characters
1 D,B,A
2 Z,C
Unsurprisingly, this will never be as efficient as STRING_AGG, because it has to reference the table multiple times (if you need to perform multiple aggregations, then you need multiple subqueries), but a well-indexed table will greatly help the RDBMS. If performance really is a problem because you're doing multiple string aggregations in a single query, then I would suggest either reconsidering whether you need the aggregation, or that it's about time you considered upgrading.
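For comparison, here is a sketch of what the grouped, ordered aggregation might look like on SQL Server 2017 or later, using the dbo.MyTable sample above. The Characters alias is only there to match the result headings:

```sql
SELECT MT.GroupID,
       -- STRING_AGG skips the NULL SomeCharacter in group 2
       STRING_AGG(MT.SomeCharacter, ',') WITHIN GROUP (ORDER BY MT.ID DESC) AS Characters
FROM dbo.MyTable AS MT
GROUP BY MT.GroupID;
-- GroupID 1: D,B,A
-- GroupID 2: Z,C
```

The table is read once, and the ordering lives inside the aggregate rather than in a correlated subquery.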

SQL merging rows with dynamic column headings

I am trying to populate a GridView with checkboxes enabled per student, depending on certain values from this query:
@SelectedDate is provided via a TextBox as a date only
SELECT v1.StudentID,
       v1.StudentPreferredName + ' ' + v1.StudentFamilyName AS StudentName,
       bcs.CheckStatusName,
       rce.DateSubmitted,
       rcp.RollCallPeriod
FROM tblBoardingRollCallEntries AS rce
     INNER JOIN vwBoardingTenants AS v1
         ON v1.StudentID = rce.StudentID
        AND v1.[Year] = YEAR(@SelectedDate)
     INNER JOIN tblBoardingCheckStatus AS bcs
         ON bcs.CheckStatusID = rce.CheckStatusID
        AND bcs.StatusActive = 1
     INNER JOIN tblBoardingRollCallPeriods AS rcp
         ON rcp.RollCallPeriodID = rce.RollCallPeriodID
        AND rcp.PeriodYear = YEAR(@SelectedDate)
        AND @SelectedDate BETWEEN rcp.PeriodStart AND rcp.PeriodEnd
        AND rcp.RowStatus = 1
WHERE dbo.fnDateOnly(rce.DateSubmitted) = dbo.fnDateOnly(@SelectedDate)
My GridView, what it shows, and the data: [images omitted]
I want to be able to basically condense the rows in the GridView to be one student per row and the checkboxes ticked according to RollCallPeriod text.
I am playing with SQL pivots, to get the data to be as close as possible to what I am after so as to avoid code-behind, etc. However, I cannot get this to work.
select StudentID, [1],[10],[2],[3],[4],[5],[6],[7],[8],[9]
from
(
select StudentID, RollCallID, CheckStatusID
from tblBoardingRollCallEntries
unpivot
(
value for name in ([RollCallID],[StudentID],[CheckStatusID],[DateSubmitted],[StaffID])
) unpiv
) src
pivot
(
sum(RollCallPeriodID)
for RollCallPeriodID in ([1],[10],[2],[3],[4],[5],[6],[7],[8],[9])
) piv
I receive the following error:
Lookup Error - SQL Server Database Error: The type of column
"StudentID" conflicts with the type of other columns specified in the
UNPIVOT list.
Any other ideas?
Thanks

A couple of ways you can do this depending on your actual data.
This will give you the CheckStatusName as the value for the RollCallPeriod
SELECT *
FROM (
SELECT StudentName,
CheckStatusName,
RollCallPeriod
FROM [YourQueryGoesHere]
) t
PIVOT (
MAX(CheckStatusName)
FOR RollCallPeriod IN ([6:15 AM],[8:00 AM],[3:00 PM],[6:00 PM],[9:00 PM])
) p
Or you can keep the status and use COUNT() to show whether the student has a value for each CheckStatusName and RollCallPeriod:
SELECT *
FROM (
SELECT StudentName,
CheckStatusName,
RollCallPeriod
FROM [YourQueryGoesHere]
) t
PIVOT (
COUNT(RollCallPeriod)
FOR RollCallPeriod IN ([6:15 AM],[8:00 AM],[3:00 PM],[6:00 PM],[9:00 PM])
) p

Two options:
1. Instead of unpivoting directly on tblBoardingRollCallEntries, first select the columns cast to a VARCHAR(...) type in a derived table, then UNPIVOT the derived table. Shortened example:
select StudentID, RollCallID, CheckStatusID
from
(
SELECT ..., CAST(StudentId AS VARCHAR(128)) AS StudentId, ... FROM tblBoardingRollCallEntries
) AS ups
unpivot
(
value for name in ([RollCallID],[StudentID],[CheckStatusID],[DateSubmitted],[StaffID])
) unpiv
2. Use CROSS APPLY (SELECT CAST(StudentId AS VARCHAR(128)) UNION ALL ... ) to unpivot; that way you can cast each column directly to the appropriate type as you unpivot it.
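The CROSS APPLY option could be sketched as follows. The table variable @Entries, its column types, and the sample row are assumptions made for illustration, since the real definition of tblBoardingRollCallEntries is not shown:

```sql
-- Hypothetical stand-in for tblBoardingRollCallEntries; the column types are guesses
DECLARE @Entries TABLE (RollCallID int, StudentID int, CheckStatusID int,
                        DateSubmitted datetime, StaffID varchar(20));
INSERT INTO @Entries (RollCallID, StudentID, CheckStatusID, DateSubmitted, StaffID)
VALUES (1, 1001, 2, '2020-03-01T06:15:00', 'S01');

SELECT e.RollCallID, u.name, u.value
FROM @Entries AS e
     CROSS APPLY (
         -- every branch casts to the same type, so no UNPIVOT type conflict can arise
         SELECT 'StudentID', CAST(e.StudentID AS varchar(128))
         UNION ALL
         SELECT 'CheckStatusID', CAST(e.CheckStatusID AS varchar(128))
         UNION ALL
         SELECT 'DateSubmitted', CONVERT(varchar(128), e.DateSubmitted, 126) -- style 126 = ISO 8601
         UNION ALL
         SELECT 'StaffID', CAST(e.StaffID AS varchar(128))
     ) AS u(name, value);
```

One difference from UNPIVOT worth keeping in mind: this form keeps rows whose value is NULL, whereas UNPIVOT drops them.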

Adding aliasing to field names in Pivot SQL query

In the query below, the field [Cmp-Goal-RF-148] is pivoted to become a column. I need the column title to be something other than [Cmp-Goal-RF-148], so I suppose I need to alias it. Doing this throws an error: ([Cmp-Goal-RF-148] AS 'Ghost'). What am I missing?
select *
from
(
select EmpRvwPdDtl.Emp, EmpRvwPdDtl.Rvwr,
EmpRvwPdDtl.RvwItm,
CAST(EmpRvwPdDtl.RvwItmCom as VARCHAR(MAX)) as comment
from EmpRvwPdDtl
inner join EmpRvwPd
on (EmpRvwPd.Emp=EmpRvwPdDtl.Emp)
where EmpRvwPdDtl.RvwItmCom is not null
AND EmpRvwPd.Sup='RM04'
) as s
PIVOT
(
MAX(comment) for RvwItm in ([Cmp-Goal-RF-148])
) as pvit

You will add the alias in the final SELECT list:
select Emp, Rvwr,
[Cmp-Goal-RF-148] as Ghost -- alias goes here
from
(
select EmpRvwPdDtl.Emp, EmpRvwPdDtl.Rvwr,
EmpRvwPdDtl.RvwItm,
CAST(EmpRvwPdDtl.RvwItmCom as VARCHAR(MAX)) as comment
from EmpRvwPdDtl
inner join EmpRvwPd
on (EmpRvwPd.Emp=EmpRvwPdDtl.Emp)
where EmpRvwPdDtl.RvwItmCom is not null
AND EmpRvwPd.Sup='RM04'
) as s
PIVOT
(
MAX(comment) for RvwItm in ([Cmp-Goal-RF-148])
) as pvit
