Selecting substring x.x.x.x from a string field - sql

Using Sql Server, I have the following data in a column in a table:
NFPA 1: 4.5.8.1, SFM 69A-21
NFPA 101:7.2.1.8.
NFPA 1 14.13.1.1*
NFPA 101 7.2.1.15.8
NFPA 1 13.6.9.3.1.1.1
NFPA 101:7.1.3.2.1 (6)
?NFPA ?1 14?.?6?.?3?*
I would like to select just records with x.x.x.x.(.x)(.x)etc that have a period sequence. I would also like to show the x.x.x.x.(.x)(.x)etc data in the output and not the data before or after the periods. For example from the above data, the following would display as the output of the sql select:
4.5.8.1
7.2.1.8
14.13.1.1
7.2.1.15.8
13.6.9.3.1.1.1
7.1.3.2.1
[THIS RECORD WOULD NOT BE SELECTED BECAUSE THE DATA IS NOT IN THE FORMAT: ?NFPA ?1 14?.?6?.?3?*]
Any help would be appreciated thanks before hand.
UPDATE: Please check the sql fiddle:
SQL FIDDLE HERE
**UPDATE #2 **: No answers so far, hope someone can help.

Considering that you want a variable number of elements that are a variable number of digits, the best way that I am aware of is to use Regular Expressions. Regular Expressions are not natively available in T-SQL so you need to add SQLCLR functions to so that part. After that, it is just a simple Regular Expression to grab data matching that pattern.
DECLARE #TestData TABLE (String NVARCHAR(100));
INSERT INTO #TestData (String) VALUES ('NFPA 1: 4.5.8.1, SFM 69A-21');
INSERT INTO #TestData (String) VALUES ('NFPA 101:7.2.1.8.');
INSERT INTO #TestData (String) VALUES ('NFPA 1 14.13.1.1*');
INSERT INTO #TestData (String) VALUES ('NFPA 101 7.2.1.15.8');
INSERT INTO #TestData (String) VALUES ('NFPA 1 13.6.9.3.1.1.1');
INSERT INTO #TestData (String) VALUES ('NFPA 101:7.1.3.2.1 (6)');
INSERT INTO #TestData (String) VALUES ('?NFPA ?1 14?.?6?.?3?*');
;WITH cte AS
(
SELECT SQL#.RegEx_MatchSimple(tmp.String, N'\d+(?:\.\d+)+', 1, NULL)
AS [Filtered]
FROM #TestData tmp
)
SELECT *
FROM cte
WHERE cte.Filtered <> '';
Output:
4.5.8.1
7.2.1.8
14.13.1.1
7.2.1.15.8
13.6.9.3.1.1.1
7.1.3.2.1
For the example I used the SQL# library (which I am the author of). The RegEx function used here, and others, are available in the Free version.

Related

Get every string before character in SQL Server

I got two record in table which is as below -:
1.123-21
2.123-21-30
How to query for all string before certain place of character . Below shown expected output
1. 123-21 -> 123
2. 123-21-30 ->123-21
How can I solve it?
DECLARE #T TABLE (Vals VARCHAR(100))
INSERT INTO #T(Vals) VALUES ('123-21') , ('123-21-30')
SELECT LEFT(Vals, LEN(Vals) - CHARINDEX('-', REVERSE(Vals)) )
FROM #T

String_Split inserts only the first value

I'm trying to insert comma separated Guids into a temp table, to later check for a value using IN in these Guids. The following query is inserting only the first value in the table twice.
DECLARE #campaignids nvarchar(max) = '1DEBD122-FF1B-4E87-8812-D427ABA5D54E,FBD06A2E-24D1-4C06-B71D-B4306D8EA3BD'
DECLARE #TempCampaignIds TABLE (CampaignId uniqueidentifier)
INSERT INTO #TempCampaignIds
SELECT CAST(#campaignids AS uniqueidentifier)
FROM STRING_SPLIT(#campaignids, ',')
SELECT CampaignId FROM #TempCampaignIds
--result
CampaignId
1DEBD122-FF1B-4E87-8812-D427ABA5D54E
1DEBD122-FF1B-4E87-8812-D427ABA5D54E
You need to use the value from the string:
INSERT INTO #TempCampaignIds (CampaignId)
SELECT CAST(s.value AS uniqueidentifier)
FROM STRING_SPLIT(#campaignids, ',') s;
Here is a db<>fiddle.
I'm actually surprised that your code works, but SQL Server converts the first value of such a string without an error. That doesn't seem to happen for other data types. In fact, SQL Server appears to look at only the first 36 characters for a unique identifier.

Replace string without fixed length

I have some data that I'm looking at that has text formatting stored within a NTEXT field.
Happy enough with SQL Replace to remove data of a known length and format, however there are some fields with what looks like colour formatting and I'm trying to find a way to remove these.
An example of the data below, however (if possible) I would like to be able to remove whatever numbers follow the colours in the data but can't see how to introduce a wildcard into the replace statement.
Something like '\red***\green\***\blue***' as per Excel, but this doesn't work in Sql Server.
declare #str varchar(1500) = '\red3\green73\blue125;Jimmy Jazz\red31\green73\blue125;'
select #str,
replace(#str,'\red31\green73\blue125;','')
Any pointers would be gratefully received, thanks in advance.
Based on your sample data it would appear that you only need to remove the numbers in your string you can use patreplace8k or using patextract8K. Note the sample data and examples below:
-- Sample data
DECLARE #strings TABLE(stringId INT IDENTITY, string VARCHAR(100));
INSERT #strings VALUES('DeepPurple1978\yellow2\red009;pink\black3322'),
('red202\yellow5\red009;hotpink2'),('purple999\gray65\violet;blue\yellow381');
--==== Solution #1 Patreplace8k
SELECT
s.stringId,
pr.newString
FROM #strings AS s
CROSS APPLY samd.patReplace8K(s.string,'[0-9]','') AS pr;
--==== Solution #2 PatExtract8k + STRING_AGG (SQL 2017+)
SELECT
s.stringId,
NewString = STRING_AGG(pe.Item,'') WITHIN GROUP (ORDER BY pe.ItemNumber)
FROM #strings AS s
CROSS APPLY samd.patExtract8K(s.string,'[0-9]') AS pe
GROUP BY s.stringId;
--==== Solution #3 PatExtract8k + XML Concatination (Pre SQL 2017\)
SELECT
s.stringId,
NewString =
(
SELECT pe.item+''
FROM #strings AS s2
CROSS APPLY samd.patExtract8K(s2.string,'[0-9]') AS pe
WHERE s.stringId = s2.stringid
ORDER BY pe.itemNumber
FOR XML PATH('')
)
FROM #strings AS s
GROUP BY s.stringId;
Each of these solutions return:
stringId NewString
----------- -------------------------------------
1 DeepPurple\yellow\red;pink\black
2 red\yellow\red;hotpink
3 purple\gray\violet;blue\yellow
The second and third leverage concatenation, the second compatible with SQL Server 2017+ the third works with earlier versions (you did not include what version you are on.)
To only strip the numbers that follow one or more pre-defined colors you could use patternsplitCM. Note the use of a table with a group of colors your are seeking; in the real world I'd use a real table.
-- Colors
DECLARE #colors TABLE(color VARCHAR(20) PRIMARY KEY);
INSERT #colors VALUES('red'),('green'),('blue'),('yellow'),('purple'),('grey');
-- Sample data
DECLARE #strings TABLE(stringId INT IDENTITY, string VARCHAR(100));
INSERT #strings VALUES('Burger1978\yellow2\red009;pink\86thisfool'),
('red202\yellow5\red009;Freddy99'),('green999\grey65\violet;blue\yellow381');
SELECT
s.stringId, s.string, NewString =
(
SELECT
(
SELECT SUBSTRING(f.Item, IIF(f.M=0 AND EXISTS (SELECT c.Color FROM #colors AS c
WHERE c.Color = f.L),NULLIF(PATINDEX('%[^0-9]',f.item),0),1),8000)
FROM
(
SELECT ps.ItemNumber, ps.Item, ps.[Matched],
LAG(ps.Item,1,ps.Item) OVER (ORDER BY ps.ItemNumber)
FROM dbo.PatternSplitCM(s.string,'[^0-9\ ;]') AS ps
) AS f(ItemNumber,Item,M,L)
ORDER BY f.ItemNumber
FOR XML PATH(''), TYPE
).value('(text())[1]','varchar(8000)')
)
FROM #strings AS s;
Returns:
stringId string NewString
----------- --------------------------------------------- ----------------------------------------
1 Burger1978\yellow2\red009;pink\86thisfool Burger1978\yellow\red;pink\86thisfool
2 red202\yellow5\red009;Freddy99 red\yellow\red;Freddy99
3 green999\grey65\violet;blue\yellow381 green\grey\violet;blue\yellow

How to manipulate comma-separated list in SQL Server

I have a list of values such as
1,2,3,4...
that will be passed into my SQL query.
I need to have these values stored in a table variable. So essentially I need something like this:
declare #t (num int)
insert into #t values (1),(2),(3),(4)...
Is it possible to do that formatting in SQL Server? (turning 1,2,3,4... into (1),(2),(3),(4)...
Note: I can not change what those values look like before they get to my SQL script; I'm stuck with that list. also it may not always be 4 values; it could 1 or more.
Edit to show what values look like: under normal circumstances, this is how it would work:
select t.pk
from a_table t
where t.pk in (#place_holder#)
#placeholder# is just a literal place holder. when some one would run the report, #placeholder# is replaced with the literal values from the filter of that report:
select t.pk
from a_table t
where t.pk in (1,2,3,4) -- or whatever the user selects
t.pk is an int
note: doing
declare #t as table (
num int
)
insert into #t values (#Placeholder#)
does not work.
Your description is a bit ridicuolus, but you might give this a try:
Whatever you mean with this
I see what your trying to say; but if I type out '#placeholder#' in the script, I'll end up with '1','2','3','4' and not '1,2,3,4'
I assume this is a string with numbers, each number between single qoutes, separated with a comma:
DECLARE #passedIn VARCHAR(100)='''1'',''2'',''3'',''4'',''5'',''6'',''7''';
SELECT #passedIn; -->: '1','2','3','4','5','6','7'
Now the variable #passedIn holds exactly what you are talking about
I'll use a dynamic SQL-Statement to insert this in a temp-table (declared table variable would not work here...)
CREATE TABLE #tmpTable(ID INT);
DECLARE #cmd VARCHAR(MAX)=
'INSERT INTO #tmpTable(ID) VALUES (' + REPLACE(SUBSTRING(#passedIn,2,LEN(#passedIn)-2),''',''','),(') + ');';
EXEC (#cmd);
SELECT * FROM #tmpTable;
GO
DROP TABLE #tmpTable;
UPDATE 1: no dynamic SQL necessary, all ad-hoc...
You can get the list of numbers as derived table in a CTE easily.
This can be used in a following statement like WHERE SomeID IN(SELECT ID FROM MyIDs) (similar to this: dynamic IN section )
WITH MyIDs(ID) AS
(
SELECT A.B.value('.','int') AS ID
FROM
(
SELECT CAST('<x>' + REPLACE(SUBSTRING(#passedIn,2,LEN(#passedIn)-2),''',''','</x><x>') + '</x>' AS XML) AS AsXml
) as tbl
CROSS APPLY tbl.AsXml.nodes('/x') AS A(B)
)
SELECT * FROM MyIDs
UPDATE 2:
And to answer your question exactly:
With this following the CTE
insert into #t(num)
SELECT ID FROM MyIDs
... you would actually get your declared table variable filled - if you need it later...

Splitting delimited values in a SQL column into multiple rows

I would really like some advice here, to give some background info I am working with inserting Message Tracking logs from Exchange 2007 into SQL. As we have millions upon millions of rows per day I am using a Bulk Insert statement to insert the data into a SQL table.
In fact I actually Bulk Insert into a temp table and then from there I MERGE the data into the live table, this is for test parsing issues as certain fields otherwise have quotes and such around the values.
This works well, with the exception of the fact that the recipient-address column is a delimited field seperated by a ; character, and it can be incredibly long sometimes as there can be many email recipients.
I would like to take this column, and split the values into multiple rows which would then be inserted into another table. Problem is anything I am trying is either taking too long or not working the way I want.
Take this example data:
message-id recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5#Server.com user1#domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E#Server.com user2#domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user3#domain3.com;user4#domain4.com;user5#domain5.com
I would like this to be formatted as followed in my Recipients table:
message-id recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5#Server.com user1#domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E#Server.com user2#domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user3#domain3.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user4#domain4.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user5#domain5.com
Does anyone have any ideas about how I can go about doing this?
I know PowerShell pretty well, so I tried in that, but a foreach loop even on 28K records took forever to process, I need something that will run as quickly/efficiently as possible.
Thanks!
If you are on SQL Server 2016+
You can use the new STRING_SPLIT function, which I've blogged about here, and Brent Ozar has blogged about here.
SELECT s.[message-id], f.value
FROM dbo.SourceData AS s
CROSS APPLY STRING_SPLIT(s.[recipient-address], ';') as f;
If you are still on a version prior to SQL Server 2016
Create a split function. This is just one of many examples out there:
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
AS
RETURN (SELECT Number = ROW_NUMBER() OVER (ORDER BY Number),
Item FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(#List, Number,
CHARINDEX(#Delimiter, #List + #Delimiter, Number) - Number)))
FROM (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1 CROSS APPLY sys.all_objects) AS n(Number)
WHERE Number <= CONVERT(INT, LEN(#List))
AND SUBSTRING(#Delimiter + #List, Number, 1) = #Delimiter
) AS y);
GO
I've discussed a few others here, here, and a better approach than splitting in the first place here.
Now you can extrapolate simply by:
SELECT s.[message-id], f.Item
FROM dbo.SourceData AS s
CROSS APPLY dbo.SplitStrings(s.[recipient-address], ';') as f;
Also I suggest not putting dashes in column names. It means you always have to put them in [square brackets].
SQL Server 2016 include a new table function string_split(), similar to the previous solution.
The only requirement is Set compatibility level to 130 (SQL Server 2016)
You may use CROSS APPLY (available in SQL Server 2005 and above) and STRING_SPLIT function (available in SQL Server 2016 and above):
DECLARE #delimiter nvarchar(255) = ';';
-- create tables
CREATE TABLE MessageRecipients (MessageId int, Recipients nvarchar(max));
CREATE TABLE MessageRecipient (MessageId int, Recipient nvarchar(max));
-- insert data
INSERT INTO MessageRecipients VALUES (1, 'user1#domain.com; user2#domain.com; user3#domain.com');
INSERT INTO MessageRecipients VALUES (2, 'user#domain1.com; user#domain2.com');
-- insert into MessageRecipient
INSERT INTO MessageRecipient
SELECT MessageId, ltrim(rtrim(value))
FROM MessageRecipients
CROSS APPLY STRING_SPLIT(Recipients, #delimiter)
-- output results
SELECT * FROM MessageRecipients;
SELECT * FROM MessageRecipient;
-- delete tables
DROP TABLE MessageRecipients;
DROP TABLE MessageRecipient;
Results:
MessageId Recipients
----------- ----------------------------------------------------
1 user1#domain.com; user2#domain.com; user3#domain.com
2 user#domain1.com; user#domain2.com
and
MessageId Recipient
----------- ----------------
1 user1#domain.com
1 user2#domain.com
1 user3#domain.com
2 user#domain1.com
2 user#domain2.com
for table = "yelp_business", split the column categories values separated by ; into rows and display as category column.
SELECT unnest(string_to_array(categories, ';')) AS category
FROM yelp_business;