SQL Splitting data in a column into a new table - sql

I am being given data constantly in the following format:
Items Instrument
------- ---------
1|2|3 201400001
2|3 201400002
3 201400003
1|4 201400004
and need to output:
Item Instrument
------- ---------
1 201400001
2 201400001
3 201400001
2 201400002
3 201400002
3 201400003
1 201400004
4 201400004
My answer is a stored function or procedure I know, but which? further, do i write the function or procedure to accept columns from any database i have or will i need to write this on a case by case basis? essentially these are reports being turned in often and i am asked to get them into sql so the data can be analyzed further.
i hope this makes sense. thank you in advance for your time.

If you have a list of all possible items, then you can do:
select i.itemId, t.instrument
from table t join
items i
on '|' + t.items + '|' like '%|'+cast(i.itemId as varchar(255))+'|%';

Original Code: http://dotnetbites.com/split-strings-into-table
Modified for your case:
CREATE FUNCTION [dbo].[SplitStringToTable]
(
#InputString VARCHAR(MAX) = ''
, #Delimiter CHAR(1) = ','
, #Instrument char(8)
)
RETURNS #RESULT TABLE(ID INT IDENTITY, Items VARCHAR(1000), Instrument char(8))
AS
BEGIN
DECLARE #XML XML
SELECT #XML = CONVERT(XML, SQL_TEXT)
FROM (
SELECT '<root><item>'
+ REPLACE(#InputString, #Delimiter, '</item><item>')
+ '</item></root>' AS SQL_TEXT
) dt
INSERT INTO #RESULT(Items, Instrument)
SELECT t.col.query('.').value('.', 'VARCHAR(1000)') AS Items , #Instrument As Instrument
FROM #XML.nodes('root/item') t(col)
RETURN
END

Related

Split a string by a delimiter using SQL

My table in DB has a column which stores values in following format.
1234#2345#6780
Four digit numbers are stored using delimiter "#".
Due to a data corruption, there are some records with five digit numbers. There may be one or more than one five digit numbers in a given row.
1234#12345#67895
I'm trying to write a script to get only those corrupted records But cannot find a way to split and check values.
Any help is appreciated.
I'm using SQL server 12.0 version
you can use this function to split values:
CREATE FUNCTION [dbo].[fnSplit]
(#sInputList VARCHAR(8000) -- List of delimited items
, #sDelimiter VARCHAR(8000) = '#' -- delimiter that separates items
)
RETURNS #List TABLE (item VARCHAR(8000))
BEGIN
DECLARE #sItem VARCHAR(8000)
WHILE CHARINDEX(#sDelimiter,#sInputList,0) <> 0
BEGIN
SELECT
#sItem=RTRIM(LTRIM(SUBSTRING(#sInputList,1,CHARINDEX(#sDelimiter,#sInputList,0)-1))),
#sInputList=RTRIM(LTRIM(SUBSTRING(#sInputList,CHARINDEX(#sDelimiter,#sInputList,0)+LEN(#sDelimiter),LEN(#sInputList))))
IF LEN(#sItem) > 0
INSERT INTO #List SELECT #sItem as item
END
IF LEN(#sInputList) > 0
INSERT INTO #List SELECT #sInputList as item -- Put the last item in
RETURN
END
and then ask about the result
You can use XML nodes to split the string before 2016 version.
Create table Xmltest(ID int, numbers nvarchar(max))
insert into Xmltest values (1, '1234#12345#67895')
select ID, N.value('.', 'varchar(255)') as xmlValue
from (
select ID ,
cast(('<w>' + replace(numbers,'#','</w><w>') + '</w>') as xml) as xmlValue
from Xmltest
) as z
cross apply xmlValue.nodes ('//w') as split(N)
Output you get, I added this ID column to identify which row may have more than 4 Characters.
ID xmlValue
1 1234
1 12345
1 67895
To check where you have more than 4 characters you can do:
select ID, N.value('.', 'varchar(255)') as xmlValue
from (
select ID ,
cast(('<w>' + replace(numbers,'#','</w><w>') + '</w>') as xml) as xmlValue
from Xmltest
) as z
cross apply xmlValue.nodes ('//w') as split(N)
where len(N.value('.', 'varchar(255)')) > 4
Output you get:
ID xmlValue
1 12345
1 67895
You can use this. it returns any numbers row which length greater than 4.
SELECT * FROM SampleData
WHERE data LIKE '%[0-9][0-9][0-9][0-9][0-9]%'
this will work patindex is orcale's equivalent of regexp_like():
select * from table_name where not PATINDEX ('^[0-9]{4}(#){1}[0-9]{4}(#){1}[0-9]
{4}$',col_name) !=0;
for SQL Server (starting with 2016)
you can use the built in function of SQL to split a string.
sample:
DECLARE #Text VARCHAR(100) = '1234#12345#67895'
SELECT * FROM STRING_SPLIT(#Text,'#')
result:
value
----
123
4456
78902
you can now easily manipulate the values after

SQL CSV as Query Results Column

I have the following SQL which queries a single table, single row, and returns the results as a comma separate string e.g.
Forms
1, 10, 4
SQL :
DECLARE #tmp varchar(250)
SET #tmp = ''
SELECT #tmp = #tmp + Form_Number + ', '
FROM Facility_EI_Forms_Required
WHERE Facility_ID = 11 AND EI_Year=2012 -- single Facility, single year
SELECT SUBSTRING(#tmp, 1, LEN(#tmp) - 1) AS Forms
The Facility_EI_Forms_Required table has three records for Facility_ID = 11
Facility_ID EI_Year Form_Number
11 2012 1
11 2012 10
11 2012 4
Form_number is a varchar field.
And I have a Facility table with Facility_ID and Facility_Name++.
How do I create a query to query all Facilites for a given year and produce the CSV output field?
I have this so far:
DECLARE #tmp varchar(250)
SET #tmp = ''
SELECT TOP 100 A.Facility_ID, A.Facility_Name,
(
SELECT #tmp = #tmp + B.Form_Number + ', '
FROM B
WHERE B.Facility_ID = A.Facility_ID
AND B.EI_Year=2012
)
FROM Facility A, Facility_EI_Forms_Required B
But it gets syntax errors on using #tmp
My guess is this is too complex a task for a query and a stored procedure may be need, but I have little knowledge of SPs. Can this be done with a nested query?
I tried a Scalar Value Function
ALTER FUNCTION [dbo].[sp_func_EI_Form_List]
(
-- Add the parameters for the function here
#p1 int,
#pYr int
)
RETURNS varchar
AS
BEGIN
-- Declare the return variable here
DECLARE #Result varchar
-- Add the T-SQL statements to compute the return value here
DECLARE #tmp varchar(250)
SET #tmp = ''
SELECT #tmp = #tmp + Form_Number + ', '
FROM OIS..Facility_EI_Forms_Required
WHERE Facility_ID = #p1 AND EI_Year = #pYr -- single Facility, single year
SELECT #Result = #tmp -- SUBSTRING(#tmp, 1, LEN(#tmp) - 1)-- #p1
-- Return the result of the function
RETURN #Result
END
The call
select Facility_ID, Facility.Facility_Name,
dbo.sp_func_EI_Form_List(Facility_ID,2012)
from facility where Facility_ID=11
returns
Facility_ID Facility_Name Form_List
11 Hanson Aggregates 1
so it is only returning the first record instead of all three. What am I doing wrong?
Try the following approach, which is an analogy to SO answer Concatenate many rows into a single text string. I hope it is correct, as I cannot try it out without having the schema and some demo data (maybe you can add schema and data to your question):
Select distinct A.Facility_ID, A.Facility_Name,
substring(
(
Select ',' + B.Form_Number AS [text()]
From Facility_EI_Forms_Required B
Where B.Facility_ID = A.Facility_ID
AND B.EI_Year=2012
ORDER BY B.Facility_ID
For XML PATH ('')
), 2, 1000) [Form_List]
From Facility A

Dynamically Create tables and Insert into it from another table with CSV values

Have a Table with the CSV Values in the columns as below
ID Name text
1 SID,DOB 123,12/01/1990
2 City,State,Zip NewYork,NewYork,01234
3 SID,DOB 456,12/21/1990
What is need to get is 2 tables in this scenario as out put with the corresponding values
ID SID DOB
1 123 12/01/1990
3 456 12/21/1990
ID City State Zip
2 NewYork NewYork 01234
Is there any way of achieving it using a Cursor or any other method in SQL server?
There are several ways that this can be done. One way that I would suggest would be to split the data from the comma separated list into multiple rows.
Since you are using SQL Server, you could implement a recursive CTE to split the data, then apply a PIVOT function to create the columns that you want.
;with cte (id, NameItem, Name, textItem, text) as
(
select id,
cast(left(Name, charindex(',',Name+',')-1) as varchar(50)) NameItem,
stuff(Name, 1, charindex(',',Name+','), '') Name,
cast(left(text, charindex(',',text+',')-1) as varchar(50)) textItem,
stuff(text, 1, charindex(',',text+','), '') text
from yt
union all
select id,
cast(left(Name, charindex(',',Name+',')-1) as varchar(50)) NameItem,
stuff(Name, 1, charindex(',',Name+','), '') Name,
cast(left(text, charindex(',',text+',')-1) as varchar(50)) textItem,
stuff(text, 1, charindex(',',text+','), '') text
from cte
where Name > ''
and text > ''
)
select id, SID, DOB
into table1
from
(
select id, nameitem, textitem
from cte
where nameitem in ('SID', 'DOB')
) d
pivot
(
max(textitem)
for nameitem in (SID, DOB)
) piv;
See SQL Fiddle with Demo. The recursive version will work great but if you have a large dataset, you could have some performance issues so you could also use a user defined function to split the data:
create FUNCTION [dbo].[Split](#String1 varchar(MAX), #String2 varchar(MAX), #Delimiter char(1))
returns #temptable TABLE (colName varchar(MAX), colValue varchar(max))
as
begin
declare #idx1 int
declare #slice1 varchar(8000)
declare #idx2 int
declare #slice2 varchar(8000)
select #idx1 = 1
if len(#String1)<1 or #String1 is null return
while #idx1 != 0
begin
set #idx1 = charindex(#Delimiter,#String1)
set #idx2 = charindex(#Delimiter,#String2)
if #idx1 !=0
begin
set #slice1 = left(#String1,#idx1 - 1)
set #slice2 = left(#String2,#idx2 - 1)
end
else
begin
set #slice1 = #String1
set #slice2 = #String2
end
if(len(#slice1)>0)
insert into #temptable(colName, colValue) values(#slice1, #slice2)
set #String1 = right(#String1,len(#String1) - #idx1)
set #String2 = right(#String2,len(#String2) - #idx2)
if len(#String1) = 0 break
end
return
end;
Then you can use a CROSS APPLY to get the result for each row:
select id, SID, DOB
into table1
from
(
select t.id,
c.colname,
c.colvalue
from yt t
cross apply dbo.split(t.name, t.text, ',') c
where c.colname in ('SID', 'DOB')
) src
pivot
(
max(colvalue)
for colname in (SID, DOB)
) piv;
See SQL Fiddle with Demo
You'd need to approach this as a multi-step ETL project. I'd probably start with exporting the two types of rows into a couple staging tables. So, for example:
select * from yourtable /* rows that start with a number */
where substring(text,1,1) in
('0','1','2','3','4','5','6','7','8','9')
select * from yourtable /* rows that don't start with a number */
where substring(text,1,1)
not in ('0','1','2','3','4','5','6','7','8','9')
/* or simply this to follow your example explicitly */
select * from yourtable where name like 'sid%'
select * from yourtable where name like 'city%'
Once you get the two types separated then you can split them out with one of the already written split functions found readily out on the interweb.
Aaron Bertrand (who is on here often) has written up a great post on the variety of ways to split comma delimted strings using SQL. Each of the methods are compared and contrasted here.
http://www.sqlperformance.com/2012/07/t-sql-queries/split-strings
If your row count is minimal (under 50k let's say) and it's going to be a one time operation than pick the easiest way and don't worry too much about all the performance numbers.
If you have a ton of rows or this is an ETL process that will run all the time then you'll really want to pay attention to that stuff.
A simple solution using cursors to build temporary tables. This has the limitation of making all columns VARCHAR and would be slow for large amounts of data.
--** Set up example data
DECLARE #Source TABLE (ID INT, Name VARCHAR(50), [text] VARCHAR(200));
INSERT INTO #Source
(ID, Name, [text])
VALUES (1, 'SID,DOB', '123,12/01/1990')
, (2, 'City,State,Zip', 'NewYork,NewYork,01234')
, (3, 'SID,DOB', '456,12/21/1990');
--** Declare variables
DECLARE #Name VARCHAR(200) = '';
DECLARE #Text VARCHAR(1000) = '';
DECLARE #SQL VARCHAR(MAX);
--** Set up cursor for the tables
DECLARE cursor_table CURSOR FAST_FORWARD READ_ONLY FOR
SELECT s.Name
FROM #Source AS s
GROUP BY Name;
OPEN cursor_table
FETCH NEXT FROM cursor_table INTO #Name;
WHILE ##FETCH_STATUS = 0
BEGIN
--** Dynamically create a temp table with the specified columns
SET #SQL = 'CREATE TABLE ##Table (' + REPLACE(#Name, ',', ' VARCHAR(50),') + ' VARCHAR(50));';
EXEC(#SQL);
--** Set up cursor to insert the rows
DECLARE row_cursor CURSOR FAST_FORWARD READ_ONLY FOR
SELECT s.Text
FROM #Source AS s
WHERE Name = #Name;
OPEN row_cursor;
FETCH NEXT FROM row_cursor INTO #Text;
WHILE ##FETCH_STATUS = 0
BEGIN
--** Dynamically insert the row
SELECT #SQL = 'INSERT INTO ##Table VALUES (''' + REPLACE(#Text, ',', ''',''') + ''');';
EXEC(#SQL);
FETCH NEXT FROM row_cursor INTO #Text;
END
--** Display the table
SELECT *
FROM ##Table;
--** Housekeeping
CLOSE row_cursor;
DEALLOCATE row_cursor;
DROP TABLE ##Table;
FETCH NEXT FROM cursor_table INTO #Name;
END
CLOSE cursor_table;
DEALLOCATE cursor_table;

Remove a sentence from a paragraph that has a specific pattern with T-SQL

I have a large number of descriptions that can be anywhere from 5 to 20 sentences each. I am trying to put a script together that will locate and remove a sentence that contains a word with numbers before or after it.
before example: Hello world. Todays department has 345 employees. Have a good day.
after example: Hello world. Have a good day.
My main problem right now is identifying the violation.
Here "345 employees" is what causes the sentence to be removed. However, each description will have a different number and possibly a different variation of the word employee.
I would like to avoid having to create a table of all the different variations of employee.
JTB
This would make a good SQL Puzzle.
Disclaimer: there are probably TONS of edge cases that would blow this up
This would take a string, split it out into a table with a row for each sentence, then remove the rows that matched a condition, and then finally join them all back into a string.
CREATE FUNCTION dbo.fn_SplitRemoveJoin(#Val VARCHAR(2000), #FilterCond VARCHAR(100))
RETURNS VARCHAR(2000)
AS
BEGIN
DECLARE #tbl TABLE (rid INT IDENTITY(1,1), val VARCHAR(2000))
DECLARE #t VARCHAR(2000)
-- Split into table #tbl
WHILE CHARINDEX('.',#Val) > 0
BEGIN
SET #t = LEFT(#Val, CHARINDEX('.', #Val))
INSERT #tbl (val) VALUES (#t)
SET #Val = RIGHT(#Val, LEN(#Val) - LEN(#t))
END
IF (LEN(#Val) > 0)
INSERT #tbl VALUES (#Val)
-- Filter out condition
DELETE FROM #tbl WHERE val LIKE #FilterCond
-- Join back into 1 string
DECLARE #i INT, #rv VARCHAR(2000)
SET #i = 1
WHILE #i <= (SELECT MAX(rid) FROM #tbl)
BEGIN
SELECT #rv = IsNull(#rv,'') + IsNull(val,'') FROM #tbl WHERE rid = #i
SET #i = #i + 1
END
RETURN #rv
END
go
CREATE TABLE #TMP (rid INT IDENTITY(1,1), sentence VARCHAR(2000))
INSERT #tmp (sentence) VALUES ('Hello world. Todays department has 345 employees. Have a good day.')
INSERT #tmp (sentence) VALUES ('Hello world. Todays department has 15 emps. Have a good day. Oh and by the way there are 12 employees somewhere else')
SELECT
rid, sentence, dbo.fn_SplitRemoveJoin(sentence, '%[0-9] Emp%')
FROM #tmp t
returns
rid | sentence | |
1 | Hello world. Todays department has 345 employees. Have a good day. | Hello world. Have a good day.|
2 | Hello world. Todays department has 15 emps. Have a good day. Oh and by the way there are 12 employees somewhere else | Hello world. Have a good day. |
I've used the split/remove/join technique as well.
The main points are:
This uses a pair of recursive CTEs, rather than a UDF.
This will work with all English sentence endings: . or ! or ?
This removes whitespace to make the comparison for "digit then employee" so you don't have to worry about multiple spaces and such.
Here's the SqlFiddle demo, and the code:
-- Split descriptions into sentences (could use period, exclamation point, or question mark)
-- Delete any sentences that, without whitespace, are like '%[0-9]employ%'
-- Join sentences back into descriptions
;with Splitter as (
select ID
, ltrim(rtrim(Data)) as Data
, cast(null as varchar(max)) as Sentence
, 0 as SentenceNumber
from Descriptions -- Your table here
union all
select ID
, case when Data like '%[.!?]%' then right(Data, len(Data) - patindex('%[.!?]%', Data)) else null end
, case when Data like '%[.!?]%' then left(Data, patindex('%[.!?]%', Data)) else Data end
, SentenceNumber + 1
from Splitter
where Data is not null
), Joiner as (
select ID
, cast('' as varchar(max)) as Data
, 0 as SentenceNumber
from Splitter
group by ID
union all
select j.ID
, j.Data +
-- Don't want "digit+employ" sentences, remove whitespace to search
case when replace(replace(replace(replace(s.Sentence, char(9), ''), char(10), ''), char(13), ''), char(32), '') like '%[0-9]employ%' then '' else s.Sentence end
, s.SentenceNumber
from Joiner j
join Splitter s on j.ID = s.ID and s.SentenceNumber = j.SentenceNumber + 1
)
-- Final Select
select a.ID, a.Data
from Joiner a
join (
-- Only get max SentenceNumber
select ID, max(SentenceNumber) as SentenceNumber
from Joiner
group by ID
) b on a.ID = b.ID and a.SentenceNumber = b.SentenceNumber
order by a.ID, a.SentenceNumber
One way to do this. Please note that it only works if you have one number in all sentences.
declare #d VARCHAR(1000) = 'Hello world. Todays department has 345 employees. Have a good day.'
declare #dr VARCHAR(1000)
set #dr = REVERSE(#d)
SELECT REVERSE(RIGHT(#dr,LEN(#dr) - CHARINDEX('.',#dr,PATINDEX('%[0-9]%',#dr))))
+ RIGHT(#d,LEN(#d) - CHARINDEX('.',#d,PATINDEX('%[0-9]%',#d)) + 1)

Is there a way to return multiple results with a subquery?

I have need to return multiple results from a subquery and have been unable to figure it out. The end result will produce the persons name across the vertical axis, various actions based on an action category across the horizontal axis. So the end result looking like:
----------
**NAME CATEGORY 1 CATEGORY 2**
Smith, John Action 1, Action 2 Action 1, Action 2, Action 3
----------
Is there a way to do this in a single query?
select
name,
(select action from actionitemtable where actioncategory = category1 and contact = contactid)
from
contact c
inner join actionitemtable a
on c.contactid = a.contactid
If more than one result is returned in that subquery I would like to be able to display it as a single comma separated string, or list of actions, etc.
Thank you.
Microsoft Sql Server 2005 is being used.
I use a User Defined Function for this task. The udf creates a delimited string with all elements matching the parameters, then you call the udf from your select statement such that you pull a delimited list for each record in the recordset.
CREATE FUNCTION dbo.ud_Concat(#actioncategory int, #contactid int)
RETURNS VARCHAR(8000)
AS
BEGIN
DECLARE #sOutput VARCHAR(8000)
SET #sOutput = ''
SELECT #sOutput = COALESCE(#sOutput, '') + action + ', '
FROM dbo.actionitemtable
WHERE actioncategory=#actioncategory AND contact=#contact
ORDER BY action
RETURN #sOutput
END
SELECT
name,
dbo.ud_Concat(category1, contactid) as contactList
FROM contact c
INNER JOIN actionitemtable a ON c.contactid = a.contactid
you need to give more info about your table structure and how they join to each other.
here is a generic example about combining multiple rows into a single column:
declare #table table (name varchar(30)
,ID int
,TaskID char(3)
,HoursAssigned int
)
insert into #table values ('John Smith' ,4592 ,'A01' ,40)
insert into #table values ('Matthew Jones',2863 ,'A01' ,20)
insert into #table values ('Jake Adams' ,1182 ,'A01' ,100)
insert into #table values ('Matthew Jones',2863 ,'A02' ,50)
insert into #table values ('Jake Adams' ,2863 ,'A02' ,10)
SELECT DISTINCT
t1.TaskID
,SUBSTRING(
replace(
replace(
(SELECT
t2.Name
FROM #Table AS t2
WHERE t1.TaskID=t2.TaskID
ORDER BY t2.Name
FOR XML PATH(''))
,'</NAME>','')
,'<NAME>',', ')
,3,2000) AS PeopleAssigned
FROM #table AS t1
OUTPUT:
TaskID PeopleAssigned
------ --------------------------------------
A01 Jake Adams, John Smith, Matthew Jones
A02 Jake Adams, Matthew Jones
(2 row(s) affected)
This is pretty abstract and complex. My initial reaction was "pivot query", but the more I looked at it (and at the earlier responses) the more I thought: Can you pass this one off to the application team? You return the "base", and they write and apply the procedural code that makes this kind of problem a snap. Sure, you can squeeze it in to SQL, but that doesn't make it the right place to do the work.
According to your query try this:
SELECT [Name],
STUFF(
(
SELECT ' ,' + [Action]
FROM [AactionItemTable]
WHERE [ActionCategory] = category1
AND [Contact] = contactid
FOR XML PATH('')
), 1, 2, ''
) AS [AdditionalData]
FROM [Contact] C
INNER JOIN [ActionItemTable] A
ON C.[ContactId] = A.[ContactId]
Guess this is the simplest way to do what you want.
EDIT: if there is no action in the subquery found, the [AdditionalData] result will be NULL.
You will probably have to create a custom aggregate function. Microsoft has a knowledge base article with sample code here.
If you don't mind using cursors, you can write your own function to do this. Here's an example that will work on the Adventureworks sample DB:
CREATE FUNCTION CommaFunctionSample
(
#SalesOrderID int
)
RETURNS varchar(max)
AS
BEGIN
DECLARE OrderDetailCursor CURSOR LOCAL FAST_FORWARD
FOR
SELECT SalesOrderDetailID
FROM Sales.SalesOrderDetail
WHERE SalesOrderID = #SalesOrderID
DECLARE #SalesOrderDetailID INT
OPEN OrderDetailCursor
FETCH NEXT FROM OrderDetailCursor INTO #SalesOrderDetailID
DECLARE #Buffer varchar(max)
WHILE ##FETCH_STATUS = 0
BEGIN
IF #Buffer IS NOT NULL SET #Buffer = #Buffer + ','
ELSE SET #Buffer = ''
SET #Buffer = #Buffer + CAST(#SalesOrderDetailID AS varchar(12))
FETCH NEXT FROM OrderDetailCursor INTO #SalesOrderDetailID
END
CLOSE OrderDetailCursor
DEALLOCATE OrderDetailCursor
RETURN #Buffer
END
This is how such a function would appear in a select query:
SELECT AccountNumber, dbo.CommaFunctionSample(SalesOrderID)
FROM Sales.SalesOrderHeader