Get rows from comma separated list - sql

I want to convert a comma separated list back into a table.
For example, I have a table which looks like this:
Sid roleid
500 1,5,
501 1,5,6,
I want output like this
Sid roleid
500 1
500 5
501 1
501 5
501 6
Please help.
Create table #temp(Sid int,roleid varchar(100))
Insert into #temp values(500,'1,5,'),(501,'1,5,6,')

Using STRING_SPLIT() means that you are working on SQL Server 2016 (or higher).
However, STRING_SPLIT() has a huge drawback: it is not guaranteed to return the items in the expected order (see the docs, section "Remarks"). In my eyes this is an absolute show stopper...
But, luckily, there is a fast and easy-to-use workaround in v2016+:
Create table #temp(Sid int,roleid varchar(100))
Insert into #temp values(500,'1,5,'),(501,'1,5,6,');
SELECT t.[Sid]
,A.[key] AS position
,A.[value] AS roleid
FROM #temp t
CROSS APPLY OPENJSON(CONCAT('["',REPLACE(t.roleid,',','","'),'"]')) A
WHERE A.[value]<>'';
A simple number array 1,3,5 needs nothing more than brackets to be a JSON array ([1,3,5]). In your case, due to the trailing comma, I deal with it as strings. 1,3,5, will be taken as array of strings: ["1","3","5",""]. The final empty string is taken away by the WHERE clause. The rest is easy...
Unlike with STRING_SPLIT(), the docs confirm that OPENJSON reflects an item's position in the [key] column:
When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.
General hint: avoid STRING_SPLIT() as long as there is no additional key/position column added to its result set.
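Newer versions did add such a position column: in SQL Server 2022 and Azure SQL Database, STRING_SPLIT() accepts an optional third argument, enable_ordinal, and then returns an ordinal column next to value. A minimal sketch, assuming one of those versions (the table variable @temp is only a stand-in for the #temp table from the question):

```sql
-- Stand-in for the question's #temp table.
DECLARE @temp TABLE (Sid int, roleid varchar(100));
INSERT INTO @temp VALUES (500, '1,5,'), (501, '1,5,6,');

-- The third argument (1) switches on the ordinal column.
-- Requires SQL Server 2022 or Azure SQL; older versions reject the extra argument.
SELECT t.Sid,
       s.ordinal AS position,
       s.value   AS roleid
FROM @temp AS t
CROSS APPLY STRING_SPLIT(t.roleid, ',', 1) AS s
WHERE s.value <> ''          -- drop the empty item caused by the trailing comma
ORDER BY t.Sid, s.ordinal;
```

On SQL Server 2016 through 2019 the OPENJSON approach above remains the way to get a reliable position.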

Use string_split():
select t.sid, spt.value
from #temp t cross apply
string_split(t.roleid, ',') as spt
where spt.value <> '' -- skip the empty item left by the trailing comma
order by t.sid, spt.value;

Use string_split():
select t.sid, s.value as rid
from #temp t
cross apply string_split(t.roleid, ',') s
where s.value <> '' -- skip the empty item left by the trailing comma
order by t.sid, rid

Related

How to select the immediate characters BEFORE a specific string in SQL

So imagine I have a SQL tempTable with a text field with 2 pages worth of text in it.
select * from tempTable
pkid text
0 This is an example text with some images names like image1.svg or another one like image2.svg
1 This is another example text image3.svg and several images more like image4 and image5
What I want to know is if it's possible to select the characters before the .svg extension, so that the select result would look like
result
ike image1.svg
ike image2.svg
ext image3.svg
and so on.
I've already read about CHARINDEX and SUBSTRING, but I've only been able to find selects that return ALL text before my filter (.svg).
I found a way to do it using PATINDEX(). This is the query I used:
select pkid, SUBSTRING (text, PATINDEX('%.svg%',text)-60,65)
from tempTable
where text like '%.svg%'
This way you can return either all text before the desired word/expression or only a certain number of characters before it; you just need to change the SUBSTRING arguments. Note that PATINDEX() only finds the first .svg in each row.
Here's what I came up with. This uses the string_split and LAG functions in MSSQL, but other database engines have similar features.
--create the temp table (a table variable, so the name takes @ rather than #)
DECLARE @temp AS TABLE (
pkid int,
text nvarchar(max)
)
--populate the temp table
INSERT INTO @temp (pkid, text) VALUES
(0, 'This is an example text with some images names like image1.svg or another one like image2.svg'),
(1, 'This is another example text image3.svg and several images more like image4 and image5')
--run the query to get the desired results
--note: string_split does not guarantee output order, and ORDER BY pkid does not order
--the fragments within a row, so LAG relies on the fragments arriving in text order
SELECT
CONCAT(RIGHT(split.priorValue, 10), '.svg') AS result
FROM (
SELECT
ss.value,
LAG(ss.value, 1) OVER (ORDER BY pkid) AS priorValue
FROM @temp
CROSS APPLY string_split(text, '.') ss
) AS split
WHERE split.value LIKE 'svg%'
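Because string_split() gives no ordering guarantee, the LAG lookup can be made deterministic by splitting with OPENJSON, whose [key] column carries each fragment's index. The following is only a sketch of that variation, assuming SQL Server 2016+ (for OPENJSON and STRING_ESCAPE) and the same sample data; it is not the original author's query.

```sql
DECLARE @temp TABLE (pkid int, [text] nvarchar(max));
INSERT INTO @temp (pkid, [text]) VALUES
(0, N'This is an example text with some images names like image1.svg or another one like image2.svg'),
(1, N'This is another example text image3.svg and several images more like image4 and image5');

SELECT CONCAT(RIGHT(split.priorValue, 10), '.svg') AS result
FROM (
    SELECT t.pkid,
           j.[value],
           -- [key] is the array index as text; cast it so 10 sorts after 9
           LAG(j.[value]) OVER (PARTITION BY t.pkid
                                ORDER BY CAST(j.[key] AS int)) AS priorValue
    FROM @temp AS t
    -- Escape quotes/backslashes, then turn each '.' into a JSON array separator
    CROSS APPLY OPENJSON(
        CONCAT('["', REPLACE(STRING_ESCAPE(t.[text], 'json'), '.', '","'), '"]')
    ) AS j
) AS split
WHERE split.[value] LIKE 'svg%';
```

With the sample rows this yields the three results from the question (ike image1.svg, ike image2.svg, ext image3.svg). PARTITION BY pkid also keeps a fragment from picking up text from the previous row.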

filter ids by comma separated nvarchar ids

Imagine I have a simple table which contains Id (primary key) and Name columns.
Now I have a comma-separated list of ids, like 2,5,6. I want to compare each of these ids with the existing data and return only those ids that do not exist in the database. Please note the output should be in the same format as the input, that is, comma separated. I am using Microsoft SQL Server 2017.
What I have already tried is shown below:
select * from DemoTable where Id 2,5,6 not in DemoTable
But this is not correct syntax. How can I fix it?
You may use this. You need to look at this problem from the other side. I found it very interesting.
Actually, you have a list of ids as the base, and you want to keep those ids from the string which are not in some table. So first we split the string into a list of ids, then we exclude the ids found in the table to get the desired result. At the end you may use STUFF or STRING_AGG to convert the final result into a comma-separated string.
select Value from (
select value from string_split('1,2,3',',')) as t
where t.value not in (select id from demotable)
You may check the linked fiddle for a working example.
Here's your query.
select string_agg(val, ',') as result
from (
select value as val from string_split('1,2,3',',')) as t
where t.val not in (select id from tableA)
Result: (comma separated)
1,2,3
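The same idea applied to the names from the question, as a hedged sketch: the table variable @DemoTable and its two sample rows are made up for illustration, and NOT EXISTS with TRY_CAST is swapped in for NOT IN so that a non-numeric list item cannot raise a conversion error.

```sql
-- Hypothetical stand-in for the question's DemoTable.
DECLARE @DemoTable TABLE (Id int PRIMARY KEY, Name nvarchar(50));
INSERT INTO @DemoTable VALUES (2, N'two'), (3, N'three');

DECLARE @ids nvarchar(100) = N'2,5,6';

-- Keep only the list items that have no matching Id, then re-join them.
SELECT STRING_AGG(s.value, ',') AS missing_ids
FROM STRING_SPLIT(@ids, ',') AS s
WHERE NOT EXISTS (
    SELECT 1
    FROM @DemoTable AS d
    WHERE d.Id = TRY_CAST(s.value AS int)
);
```

With these sample rows, ids 5 and 6 are reported as missing; STRING_AGG needs SQL Server 2017+, which matches the version named in the question.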

How do i find max combination from given result string in SQL

Here is the output.
ID Stack
-----------------------------------
123 307290,303665,307285
123 307290,307285,303424,303665
123 307290,307285,303800,303665
123 307061,307290
I want only the last three rows in the output. The reason is that all three numbers in the Stack column of the first row also appear in the Stack column of rows 2 and 3, so I don't need row 1.
Rows 2, 3 and 4 are different, so I want those rows in my result.
I have tried doing it with row_number() and charindex but I'm not getting the proper result.
Thank you.
All the comments telling you to change your database's structure are right! You really should avoid comma-separated values. This breaks 1NF and will be a pain in the neck forever.
The result of the second CTE might be used to shift all data into a new 1:n related structure.
Something like this?
DECLARE @tbl TABLE(ID INT,Stack VARCHAR(100));
INSERT INTO @tbl VALUES
(123,'307290,303665,307285')
,(123,'307290,307285,303424,303665')
,(123,'307290,307285,303800,303665')
,(123,'307061,307290');
WITH Splitted AS
(
SELECT ID
,Stack
,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowIndex
,CAST('<x>' + REPLACE(Stack,',','</x><x>') + '</x>' AS XML) Casted
FROM @tbl
)
,DerivedDistinctValues AS
(
SELECT DISTINCT
ID
,Stack
,RowIndex
,StackNr.value('.','int') AS Nr
FROM Splitted
CROSS APPLY Casted.nodes('/x') AS A(StackNr)
)
SELECT ddv1.ID
,ddv1.Stack
FROM DerivedDistinctValues AS ddv1
FULL OUTER JOIN DerivedDistinctValues AS ddv2 ON ddv1.RowIndex<>ddv2.RowIndex
AND ddv1.Nr=ddv2.Nr
WHERE ddv2.ID IS NULL
GROUP BY ddv1.ID,ddv1.Stack
This will be slow, especially with larger data sets.
Some explanation:
The first CTE will transform the CSV numbers to <x>307290</x><x>303665</x>... This can be cast to XML, which makes it possible to generate a derived table returning all the numbers as rows. This happens in the second CTE, which calls the XQuery method .nodes().
The last query does a full outer join of every row with every other row. Every row that contains at least one number with no match in any other row is kept.
But I assume that this might not work in every situation (e.g. circular data).
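The join above keeps a row only when it holds a number found in no other row. A stricter reading of the question is "drop a row when some other row contains all of its numbers", which can be sketched with nested NOT EXISTS. This is an alternative under stated assumptions, not the query above: it uses STRING_SPLIT (SQL Server 2016+) and a hypothetical IDENTITY column RowIndex to give each row a stable number.

```sql
-- RowIndex is an assumed helper column that identifies each input row.
DECLARE @tbl TABLE (RowIndex int IDENTITY(1,1) PRIMARY KEY, ID int, Stack varchar(100));
INSERT INTO @tbl (ID, Stack) VALUES
 (123, '307290,303665,307285')
,(123, '307290,307285,303424,303665')
,(123, '307290,307285,303800,303665')
,(123, '307061,307290');

WITH Items AS
(
    -- One row per (input row, number)
    SELECT DISTINCT t.RowIndex, s.value AS Nr
    FROM @tbl AS t
    CROSS APPLY STRING_SPLIT(t.Stack, ',') AS s
)
SELECT n.ID, n.Stack
FROM @tbl AS n
WHERE NOT EXISTS
(
    -- another row o with the same ID ...
    SELECT 1
    FROM @tbl AS o
    WHERE o.RowIndex <> n.RowIndex
      AND o.ID = n.ID
      AND NOT EXISTS
      (
          -- ... for which no number of n is missing from o
          SELECT 1
          FROM Items AS i
          WHERE i.RowIndex = n.RowIndex
            AND NOT EXISTS (SELECT 1 FROM Items AS j
                            WHERE j.RowIndex = o.RowIndex AND j.Nr = i.Nr)
      )
);
```

For the sample data this returns rows 2, 3 and 4. Two rows with exactly the same set of numbers would eliminate each other, so a tie-break on RowIndex would be needed if duplicates can occur.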

Get each <tag> in String - stackexchange database

Mockup code for my problem:
SELECT Id FROM Tags WHERE TagName IN '<osx><keyboard><security><screen-lock>'
The problem in detail
I am trying to get tags used in 2011 from apple.stackexchange data. (this query)
As you can see, tags in tag changes are stored as plain text in the Text field.
<tag1><tag2><tag3>
<osx><keyboard><security><screen-lock>
How can I create a unique list of the tags, to look them up in the Tags table, instead of this hardcoded version:
SELECT * FROM Tags
WHERE TagName = 'osx'
OR TagName = 'keyboard'
OR TagName = 'security'
Here is an interactive example.
Stackexchange uses T-SQL, my local copy is running under postgresql using Postgres app version 9.4.5.0.
Assuming this table definition:
CREATE TABLE posthistory(post_id int PRIMARY KEY, tags text);
Depending on what you want exactly:
To convert the string to an array, trim leading and trailing '<>', then treat '><' as separator:
SELECT *, string_to_array(trim(tags, '><'), '><') AS tag_arr
FROM posthistory;
To get list of unique tags for whole table (I guess you want this):
SELECT DISTINCT tag
FROM posthistory, unnest(string_to_array(trim(tags, '><'), '><')) tag;
The implicit LATERAL join requires Postgres 9.3 or later.
This should be substantially faster than using regular expressions. If you want to try regexp, use regexp_split_to_table() instead of regexp_split_to_array() followed by unnest(), as suggested in another answer:
SELECT DISTINCT tag
FROM posthistory, regexp_split_to_table(trim(tags, '><'), '><') tag;
Also with implicit LATERAL join. Related:
Split column into multiple rows in Postgres
What is the difference between LATERAL and a subquery in PostgreSQL?
To search for particular tags:
SELECT *
FROM posthistory
WHERE tags LIKE '%<security>%'
AND tags LIKE '%<osx>%';
SQL Fiddle.
Applied to your search in T-SQL in our data explorer:
SELECT TOP 100
PostId, UserId, Text AS Tags FROM PostHistory
WHERE year(CreationDate) = 2011
AND PostHistoryTypeId IN (3 -- initial tags
, 6 -- edit tags
, 9) -- rollback tags
AND Text LIKE ('%<' + ##TagName:String?postgresql## + '>%');
(T-SQL syntax uses the non-standard + instead of ||.)
https://data.stackexchange.com/apple/query/edit/417055
I've simplified the data to the relevant column only and called it tags to present the example.
Sample data
create table posthistory(tags text);
insert into posthistory values
('<lion><backup><time-machine>'),
('<spotlight><alfred><photo-booth>'),
('<lion><pdf><preview>'),
('<pdf>'),
('<asd>');
Query to get unique list of tags
SELECT DISTINCT
unnest(
regexp_split_to_array(
trim('><' from tags), '><'
)
)
FROM
posthistory
First we remove all occurrences of leading and trailing > and < signs from each row, then use the regexp_split_to_array() function to get the values into arrays, and then unnest() to expand each array to a set of rows. Finally, DISTINCT eliminates duplicate values.
Presenting SQLFiddle to preview how it works.
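For the T-SQL side of the question, the same unique tag list can be sketched with STRING_SPLIT. Since STRING_SPLIT only accepts a single-character separator, the '><' boundaries are first replaced by one character. The assumptions here are SQL Server 2016+, that '|' never occurs inside a tag name, and a table variable @PostHistory standing in for the real PostHistory table.

```sql
-- Hypothetical sample rows shaped like PostHistory.Text for tag edits.
DECLARE @PostHistory TABLE (PostId int, [Text] nvarchar(max));
INSERT INTO @PostHistory VALUES
 (1, N'<osx><keyboard><security><screen-lock>')
,(2, N'<osx><pdf>');

SELECT DISTINCT s.value AS TagName
FROM @PostHistory AS ph
CROSS APPLY STRING_SPLIT(
    -- '<a><b>' -> '<a|b>' -> 'a|b'
    REPLACE(REPLACE(REPLACE(ph.[Text], N'><', N'|'), N'<', N''), N'>', N''),
    N'|'
) AS s
WHERE s.value <> N'';
```

The result can then be joined to Tags on TagName instead of hardcoding each tag in an OR list.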

Conversion failed when converting the varchar value to data type int

I have this query and it works fine for me:
SELECT STUFF(
(SELECT ','+SLT_SubListName FROM sublists where SLT_SubListId in (1,2) FOR XML PATH('')),1,1,'');
But when I change the IN parameters (1,2) into 'select SBS_SubListId from subscriber where SBS_SubscriberId=1',
which also returns 1,2:
SELECT STUFF(
(SELECT ','+SLT_SubListName FROM sublists where SLT_SubListId in (select SBS_SubListId from subscriber where SBS_SubscriberId=1
) FOR XML PATH('')),1,1,'');
it gives me the following error:
Conversion failed when converting the varchar value '1,2,4,5' to data type int.
If anybody needs it, I can also post my table schema here.
Thanks.
You first have to split these comma-separated values on the comma, and then apply the conversion.
I suspect that your subquery is returning a single entry of "1,2,4,5", which is not an integer. You need to get it to return 4 rows with one of these values in each.
If you are already doing this, show the results of the subquery, as you may then have type differences.
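A minimal sketch of the split-then-convert idea, assuming SQL Server 2016+ for STRING_SPLIT. The table variables and their sample rows are hypothetical; only the column names are taken from the question.

```sql
-- Hypothetical stand-ins for the question's tables.
DECLARE @sublists   TABLE (SLT_SubListId int, SLT_SubListName varchar(50));
DECLARE @subscriber TABLE (SBS_SubscriberId int, SBS_SubListId varchar(100));
INSERT INTO @sublists   VALUES (1, 'News'), (2, 'Offers'), (3, 'Events');
INSERT INTO @subscriber VALUES (1, '1,2');

SELECT STUFF((
    SELECT ',' + sl.SLT_SubListName
    FROM @sublists AS sl
    WHERE sl.SLT_SubListId IN (
        -- Split the stored list into one row per id and convert each to int
        SELECT TRY_CAST(s.value AS int)
        FROM @subscriber AS sb
        CROSS APPLY STRING_SPLIT(sb.SBS_SubListId, ',') AS s
        WHERE sb.SBS_SubscriberId = 1
    )
    ORDER BY sl.SLT_SubListId
    FOR XML PATH('')), 1, 1, '');
```

With these sample rows the query returns News,Offers. TRY_CAST yields NULL for any item that is not a number, so a stray value is skipped rather than raising the conversion error.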
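Since your subquery seems to be returning the varchar '1,2,4,5', you can try matching against it as text. Note that CONTAINS is a full-text predicate and does not accept a subquery, so a LIKE comparison on the delimited string is the usual way to do this:
SELECT STUFF(
(SELECT ','+SLT_SubListName FROM sublists where ','+(select SBS_SubListId from subscriber where SBS_SubscriberId=1)+',' LIKE '%,'+CAST(SLT_SubListId AS varchar(20))+',%'
FOR XML PATH('')),1,1,'');
That way it treats the results from the subquery as text.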