Generate unique ID CSV list from table - sql

I am trying to get a unique list of IDs from a table in CSV format. I am close; I just need to be able to remove duplicates. So far I have:
DECLARE @csv VARCHAR(max)
set @csv = null
SELECT @csv = COALESCE(@csv + ',', '') + ''''+ID+''''
FROM TABLE
select @csv
The only problem is the table can have multiple IDs, and I only want each occurrence once. I tried adding a "DISTINCT" before the ID but it doesn't like being there.

Using the syntax SELECT @Variable = @Variable + ... FROM is a documented antipattern and should be avoided; it relies on the data engine processing your data row by row in a particular order, which is not guaranteed. Instead, use string aggregation to achieve the same result. In recent versions of SQL Server (2017 and later) that means STRING_AGG; in older versions you'll need to use FOR XML PATH (and STUFF).
Assuming you are on a fully supported version of SQL Server, then use a CTE/derived table to get the DISTINCT values, and then aggregate that:
WITH CTE AS(
SELECT DISTINCT ID
FROM dbo.YourTable)
SELECT STRING_AGG(ID,',')
FROM CTE;
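For completeness, here is a rough sketch of the two routes mentioned above, with each ID wrapped in single quotes as in the question. dbo.YourTable and ID are the placeholder names from the answer; the CAST to varchar(50) is an assumption about the ID column's size, so adjust it to your schema.

```sql
-- SQL Server 2017+: de-duplicate first, then aggregate.
-- CONCAT adds the quotes; the CAST to varchar(max) avoids the 8000-byte
-- limit that STRING_AGG applies to non-max inputs.
WITH CTE AS (
    SELECT DISTINCT ID
    FROM dbo.YourTable
)
SELECT STRING_AGG(CAST(CONCAT('''', ID, '''') AS varchar(max)), ',') AS csv
FROM CTE;

-- Older versions: FOR XML PATH builds ",'a','b',..." and STUFF removes
-- the leading comma. DISTINCT removes the duplicates before concatenation.
SELECT STUFF(
    (SELECT DISTINCT ',' + '''' + CAST(ID AS varchar(50)) + ''''
     FROM dbo.YourTable
     FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
    1, 1, '') AS csv;
```

The TYPE directive plus .value() keeps characters such as & or < in the IDs from being returned as XML entities.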

Related

How to get the data in SQL Server for string concat value compare to int value without using like operator

I have table with data like this:
Id | StringValue
----+-------------
1 | 4,50
2 | 90,40
I will receive an input value such as 4, and I need to fetch only the record that contains exactly that value. When I use the LIKE operator, the select query returns two rows, but I need only the exactly matching record.
Can anybody please help me with this?
SELECT *
FROM Table1
WHERE StringValue like '%4%'
But that returns two rows - both ID 1 and 2.
My expectation is I need to get ID = 1 row only
Storing delimited data like this is a well documented anti-pattern, violates basic normalisation principles and prevents the database engine from fully utilising an index.
What you can do is delimit your search value and also ensure the expression to search is correctly delimited; this is a non-sargable expression, however, and the storage engine will have to scan all rows every time:
declare @valueToFind varchar(10) = '4';
select *
from Table1 as t
where Concat(',', t.StringValue, ',') like Concat('%,', @valueToFind, ',%');
For SQL Server 2016 and later you can use STRING_SPLIT; for earlier versions of SQL Server there are many alternatives, just search for them.
Or, you can simply do
SELECT * FROM Table1 where ',' + StringValue + ',' like '%,4,%'
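The STRING_SPLIT route mentioned above could look roughly like this. It is only a sketch: it assumes SQL Server 2016+ with database compatibility level 130 or higher, and it reuses the Table1 and StringValue names from the question.

```sql
DECLARE @valueToFind varchar(10) = '4';

-- Split each row's list and keep the row if any element matches exactly.
SELECT t.*
FROM Table1 AS t
WHERE EXISTS (
    SELECT 1
    FROM STRING_SPLIT(t.StringValue, ',') AS s
    -- Trim in case the stored list has spaces after the commas.
    WHERE LTRIM(RTRIM(s.value)) = @valueToFind
);
```

Like the LIKE-based versions, this still has to read every row; only normalising the list into its own table lets an index do the work.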

How do I remove duplicate word in a cell in SQL

How do I remove duplicates in the following case in T-SQL?
I have a table with a column Code of type varchar(max).
It contains values like truck/rail/truck/rail, and I need the cell value to be truck/rail.
Another possibility is truck/rail/ship/truck, which needs to become truck/rail/ship.
This should be done using a table-valued function.
Thanks.
You can use String_Split along with String_agg to remove the duplicates.
DECLARE @t table(id int, val varchar(max))
insert into @t values(1,'truck/rail/truck/rail'), (2,'truck/rail/ship/truck')
SELECT t.id,STRING_AGG(splitval,'/') as newval FROM @t as t
cross apply (
SELECT distinct value from string_split(t.val,'/')) as ca(splitval)
group by t.id
id | newval
----+-----------------
1 | rail/truck
2 | rail/ship/truck
Note1: String_Split does not guarantee order, so after the duplicates are removed your concatenated results might be in a different order from the original list. If you want to preserve the order, you need a different solution using XML nodes or a JSON array.
Note2: String_Split was introduced in SQL Server 2016 and String_agg in SQL Server 2017. If you are using an earlier version, you have to go for a recursive CTE and CHARINDEX based solution.
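As a sketch of the JSON-array idea from Note1: OPENJSON returns each array element together with its index in the [key] column, so the first position of every word can be used to order the aggregate. This assumes SQL Server 2017+ and that the values contain no double quotes or backslashes (they would break the hand-built JSON); first_pos is just an illustrative alias.

```sql
DECLARE @t TABLE (id int, val varchar(max));
INSERT INTO @t VALUES (1, 'truck/rail/truck/rail'), (2, 'truck/rail/ship/truck');

SELECT t.id,
       -- Re-join the distinct words in order of first appearance.
       STRING_AGG(ca.splitval, '/') WITHIN GROUP (ORDER BY ca.first_pos) AS newval
FROM @t AS t
CROSS APPLY (
    -- 'truck/rail' becomes the JSON array ["truck","rail"];
    -- [key] is the zero-based position of each element.
    SELECT j.[value] AS splitval,
           MIN(CAST(j.[key] AS int)) AS first_pos
    FROM OPENJSON('["' + REPLACE(t.val, '/', '","') + '"]') AS j
    GROUP BY j.[value]
) AS ca
GROUP BY t.id;
```

For the two sample rows this gives truck/rail and truck/rail/ship, keeping the original word order.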
If you know that the error exists, then just do an UPDATE where you replace the truck/rail/truck/rail with truck/rail using REPLACE(Code,'truck/rail/truck/rail','truck/rail').
The same goes for your truck/rail/ship/truck issue.
If you need automatic detection and correction to be done, that's a whole 'nuther story but could still be done using nested REPLACES. Detection of the issue is the hard part. Personally, I'd be having a talk with the people that are providing the data.

Is there a way to pull part of a SQL query from a .sql file?

Let me simplify with an example. Let's say I have the following query saved on:
C:\sample.sql
grp.id IN
(001 --Bob
,002 --Tom
,003 --Fay
)
Now, that group of IDs could change, but instead of updating those IDs in every query it's related to, I was hoping to just update in sample.sql and the rest of the queries will pull from that SQL file directly.
For example, I have several queries that would have a section like this:
SELECT *
FROM GROUP grp
WHERE grp.DATERANGE >= '2017-12-01' AND grp.DATERANGE <= '2017-12-31'
AND -- **this is where I would need to insert that query (ie. C:\sample.sql)**
More explained update:
Issue: I have several reports/queries having the same ID filter (that's the only thing in common between those reports)
What's needed: Instead of updating those IDs every time they change on each report, I was wondering if I can update those IDs in its own SQL file (like the example above) and have the rest of the queries pull from there.
Note. I can't create a table or database in the used database.
Maybe the bulk insert utility could help. Hold your data in csv files and load them into temp tables at run time. Use these temp tables to drive your query.
CREATE TABLE #CsvData(
Column1 VARCHAR(40),
Column2 VARCHAR(40)
)
GO
BULK
INSERT #CsvData
FROM 'c:\csvtest.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
--Use #CsvData to drive your query
SELECT *
FROM #CsvData
Maybe what you could use is a CTE (Common Table Expression) to pull your IDs using an additional query, especially if you only have read access. It would look something like this:
WITH myIDs AS (select IDs from grp where (conditions to get the IDs))
SELECT *
FROM grp
WHERE grp.DATERANGE BETWEEN '2017-12-01' AND '2017-12-31'
AND IDs in (select * from myIDs)
I've changed the date filter to use BETWEEN since it's more compact; BETWEEN is inclusive at both ends, so it is equivalent to the >= / <= pair in the question.
Hope this helps!
Cheers!
The only chance to build a query out of text fragments is dynamic SQL:
Try this:
DECLARE @SomeCommand VARCHAR(MAX)='SELECT * FROM sys.objects';
EXEC(@SomeCommand);
Returns a list of all sys.objects entries
Now I append a WHERE clause to the string
SET @SomeCommand=@SomeCommand + ' WHERE object_id IN(1,2,3,4,5,6,7,8,9)';
EXEC(@SomeCommand);
And you get a reduced result.
Another option is a dynamic IN-list with a CSV parameter.
This does not work: DECLARE @idList VARCHAR(100)='1,2,3,4' and use it like IN (@idList), because the variable is treated as one single string value, not as a list.
But this works:
DECLARE @idList VARCHAR(100)='1,2,3,4,5,6,7,8,9';
SELECT sys.objects.*
FROM sys.objects
--use REPLACE to transform the list to <x>1</x><x>2</x>...
OUTER APPLY(SELECT CAST('<x>' + REPLACE(@idList,',','</x><x>') + '</x>' AS XML)) AS A(ListSplitted)
--Now use the XML (the former CSV) within your IN() as set-based filter
WHERE @idList IS NULL OR LEN(@idList)=0 OR object_id IN(SELECT B.v.value('.','int') FROM ListSplitted.nodes('/x') AS B(v));
With SQL Server 2016+ this can be done much more easily using STRING_SPLIT().
This approach allows you to pass the id-list as simple text parameter.
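A minimal sketch of the STRING_SPLIT variant mentioned above, assuming SQL Server 2016+ with compatibility level 130 or higher. TRY_CAST is used here so that a non-numeric list element becomes NULL instead of raising a conversion error; that choice is mine, not part of the original answer.

```sql
DECLARE @idList VARCHAR(100) = '1,2,3,4,5,6,7,8,9';

SELECT o.*
FROM sys.objects AS o
-- An empty or NULL list means "no filter", as in the XML version.
WHERE @idList IS NULL
   OR LEN(@idList) = 0
   OR o.object_id IN (SELECT TRY_CAST(s.value AS int)
                      FROM STRING_SPLIT(@idList, ',') AS s);
```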

SQL query to check for inclusion of any element from an array

I have a database column containing a string that might look something like this u/1u/3u/19/g1/g4 for a particular row.
Is there a performant way to get all rows that have at least one of the following elements ['u/3', 'g4'] in that column?
I know I can use AND clauses, but the number of elements to verify against varies and could become large..
I am using RoR/ActiveRecord in my project.
in sql server, you can use XML to convert your list of search params into a record set, then cross join that with the base table, and do charIndex() to see if the column contains the substring.
Since i don't know your table or column names, i used a table (persons) that i already had data in, which has a column 'phone_home'. To search for any phone number that contains '202' or '785' i would use this query:
select person_id,phone_home,Split.data.value('.', 'VARCHAR(10)')
from (select *, cast('<n>202</n><n>785</n>' as XML) as myXML
from persons) as data cross apply myXML.nodes('/n') as Split(data)
where charindex(Split.data.value('.', 'VARCHAR(10)'),data.phone_Home) > 0
you will get duplicate records if it matches more than one value, so throw a distinct in there and remove the Split from the select statement if that is not desired.
Using xml in sql is voodoo magic to me...i got the idea from this post http://www.sqljason.com/2010/05/converting-single-comma-separated-row.html
no idea what performance is like...but at least there aren't any cursors or dynamic sql.
EDIT: Casting the XML is pretty slow, so i made it a variable so it only gets cast once.
declare @xml XML
set @xml = cast('<n>202</n><n>785</n>' as XML)
select person_id,phone_home,Split.persons.value('.', 'VARCHAR(10)')
from persons cross apply @xml.nodes('/n') as Split(persons)
where charindex(Split.persons.value('.', 'VARCHAR(10)'),phone_Home) > 0

T-SQL, Select @variable within a sub-query causes syntax error

I'm using a sub-query to get results needed (multiple records returned), and I want to put those results in a single record returned.
When I run the sub-query on its own, it works, but once I use it as a sub query, it no longer works due to a syntax error.
The following code causes a syntax error
(Incorrect syntax near '='.)
declare @test varchar(1000)
set @test = ''
SELECT description, (SELECT @test = @test + FirstName
FROM EMP_tblEmployee
)select @test
FROM EMP_tblCrew
So essentially, the sub query
(SELECT @test = @test + FirstName
FROM EMP_tblEmployee
)select @test
returns "charliejohnjacob"
The main query
SELECT description FROM EMP_tblCrew
returns "janitor"
So I want it to say
janitor | charliejohnjacob
2 fields, 1 record.
Your query is not syntactically correct and the T-SQL parser has a nasty habit of not reporting an error quite accurately at times. This is a bit of a stab in the dark but try:
SELECT
description,
(SELECT FirstName + ' ' FROM EMP_tblEmployee FOR XML PATH('')) AS [Name Concat Result]
FROM EMP_tblCrew
That will fix one thing at least, though I'm not sure how SQL server feels about concatenating inline like that. You also risk overflowing the varchar(1000) if your table is of appreciable size. Even varchar 8000 isn't very much for this kind of query.
Try searching google for "SQL Concatenate rows into string". There are a number of useful solutions for this.
It looks like you also need to join the employee table to the crew table, so that you don't get a cartesian product (usually not what is wanted).
Probably the easiest path involves using a recursive CTE (common table expression). A detailed example of that is at https://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/
Note, this basically requires that you have sql 2008.
Another path would be to create a user defined function that returned the concatenated values from the EMP_tblEmployee table. You could do this in 2005 or 2008.
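To tie the suggestions above together, here is a hedged sketch that correlates the FOR XML PATH subquery to each crew row. The question does not show how employees relate to crews, so the CrewID column on both tables is purely hypothetical; substitute whatever key actually links them.

```sql
SELECT c.description,
       -- Build " charlie john jacob" per crew, then strip the leading space.
       STUFF((SELECT ' ' + e.FirstName
              FROM EMP_tblEmployee AS e
              WHERE e.CrewID = c.CrewID   -- hypothetical join column
              FOR XML PATH(''), TYPE).value('.', 'varchar(max)'),
             1, 1, '') AS FirstNames
FROM EMP_tblCrew AS c;
```

Because the result is varchar(max), the varchar(1000) overflow concern raised earlier does not apply to this form.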