Shuffling numbers based on the numbers from the row - sql

Let's say each row holds a 12-digit number.
AccountNumber
=============
136854775807
293910210121
763781239182
Is it possible to shuffle the numbers of a single row solely based on the numbers of that row? e.g. 136854775807 would become 573145887067

I have created a user-defined function to shuffle the digits.
It takes each character and stores it in a table variable along with a random number, then concatenates the characters in ascending order of the random number.
The RAND function cannot be called directly inside a user-defined function, so I created a view that returns a random number.
View : random_num
create view dbo.[random_num]
as
select floor(rand()* 12) as [rnd];
The random number does not have to fall between 0 and 11; a multiplier larger than 12 works as well and reduces the number of ties.
User-defined function : fn_shuffle
create function dbo.[fn_shuffle](
    @acc varchar(12)
)
returns varchar(12)
as begin
    declare @tbl as table([a] varchar(1), [b] int);
    declare @i as int = 1;
    declare @l as int;
    set @l = (select len(@acc));
    while(@i <= @l)
    begin
        -- store each digit together with a random sort key
        insert into @tbl([a], [b])
        select substring(@acc, @i, 1), [rnd] from [random_num];
        set @i += 1;
    end
    declare @res as varchar(12);
    -- concatenate the digits in order of their random key
    select @res = stuff((
            select '' + [a]
            from @tbl
            order by [b], [a]
            for xml path('')
        )
        , 1, 0, ''
    );
    return @res;
end
Then, you would be able to use the function like below.
select [acc_no],
dbo.[fn_shuffle]([acc_no]) as [shuffled]
from dbo.[your_table_name];

I don't really see the utility, but you can. Here is one way:
select t.accountnumber, x.shuffled
from t cross apply
     (select '' + digit
      from (values (substring(accountnumber, 1, 1)),
                   (substring(accountnumber, 2, 1)),
                   . . .
                   (substring(accountnumber, 12, 1))
           ) v(digit)
      order by newid()
      for xml path ('')
     ) x(shuffled);
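The question asks for a shuffle that depends only on the row's own digits, whereas the answers above return a different order on every run. One way to make the result repeatable, sketched here as an assumption rather than something either answer proposed, is to order the digits by a hash of the account number and the digit position instead of by a random value. The variable @acc and the choice of HASHBYTES with SHA2_256 (available from SQL Server 2012) are illustrative.

```sql
-- Hypothetical input value; in practice this would be the column of the current row.
DECLARE @acc varchar(12) = '136854775807';

SELECT shuffled = (
    SELECT '' + SUBSTRING(@acc, n.pos, 1)
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) AS n(pos)
    -- the sort key depends only on the account number and the position,
    -- so the same input always produces the same permutation
    ORDER BY HASHBYTES('SHA2_256', @acc + ':' + CAST(n.pos AS varchar(2)))
    FOR XML PATH('')
);
```

Running the query twice with the same @acc returns the same permutation, which is not the case when ordering by newid() or rand().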

Related

TSQL - Split GUID/UNIQUEIDENTIFIER

Case: We have smart GUIDs in a table and need to extract the 2nd and 4th parts from them. I was thinking about writing a function that takes a @partnumber and returns the extracted value for it.
e.g.
DECLARE @Guid UNIQUEIDENTIFIER = 'A7DDAA60-C33A-4D7A-A2D8-ABF20127C9AE'
1st part = A7DDAA60, 2nd part = C33A, 3rd part = 4D7A, 4th part = A2D8, and 5th part = ABF20127C9AE
Based on the @partnumber, it would return one of those values.
I'm trying to figure out how to split it most efficiently (STRING_SPLIT doesn't guarantee order).
I am not sure exactly what you mean by "smart" guids, but why not just cast it to a char and pull out the parts by position?
create table t(myguid uniqueidentifier);
declare @p tinyint = 5;
select case @p
         when 1 then left(c.v, 8)
         when 2 then substring(c.v, 10, 4)
         when 3 then substring(c.v, 15, 4)
         when 4 then substring(c.v, 20, 4)
         when 5 then right(c.v, 12)
       end
from t
cross apply (select cast(t.myguid as char(36))) c(v)
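As a quick check of the positions used above, here is a minimal standalone sketch that pulls the 2nd and 4th parts out of the sample GUID from the question. The variable @s is only there for illustration.

```sql
DECLARE @Guid uniqueidentifier = 'A7DDAA60-C33A-4D7A-A2D8-ABF20127C9AE';
-- a GUID always casts to a 36-character string: 8-4-4-4-12 with hyphens
DECLARE @s char(36) = CAST(@Guid AS char(36));

SELECT part2 = SUBSTRING(@s, 10, 4),   -- C33A
       part4 = SUBSTRING(@s, 20, 4);   -- A2D8
```

Because the layout of a GUID string is fixed, fixed offsets are enough and no splitting function is needed.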
You can use OPENJSON:
DECLARE @Guid UNIQUEIDENTIFIER = 'A7DDAA60-C33A-4D7A-A2D8-ABF20127C9AE',
@s varchar(100)
Select @s = replace(@Guid,'-','","')
Select * from
(
Select [key] + 1 as Position, Value as Part
FROM OPENJSON('["' + @s + '"]')
) Q
Where Position in (2,4)

How to get all the two characters long substrings separated by dot (.) from an email address in SQL server. I want Scalar function

I have an email column with values like 'claudio.passerini@uni.re.dit.mn.us'. I want to take the two-character strings between dots (to check for country and state codes).
I want a result like this:
col1=re,mn,us
Solution
To do exactly what you've asked; i.e. pull back just the 2 char codes from within the email address's domain, you could use a function such as this:
create function dbo.fn_Get2AlphaCharCodesFromEmail
(
    @email nvarchar(254) --max length of an email is 254: http://stackoverflow.com/questions/386294/what-is-the-maximum-length-of-a-valid-email-address
) returns nvarchar(254)
as
begin
    declare @result nvarchar(254) = null
    , @maxLen int = 254
    ;with cte(i, remainder,result) as
    (
        select cast(0 as int)
        , cast('.' + substring(@email,charindex('@',@email)+1,@maxLen) + '.' as nvarchar(254))
        , cast(null as nvarchar(254))
        union all
        select cast(i+1 as int)
        , cast(substring(remainder,patindex('%.[A-Z][A-Z].%',remainder)+3,@maxLen)as nvarchar(254))
        , cast(coalesce(result + ',','') + substring(remainder,patindex('%.[A-Z][A-Z].%',remainder)+1,2) as nvarchar(254))
        from cte
        where patindex('%.[A-Z][A-Z].%',remainder) > 0
    )
    select top 1 @result = result from cte order by i desc;
    return @result;
end
go
--demo
select dbo.fn_Get2AlphaCharCodesFromEmail ('claudio.passerini@uni.re.dit.mn.us')
--returns: re,mn,us
select dbo.fn_Get2AlphaCharCodesFromEmail ('claudio.passerini@uni.123.dit.mnx.usx')
--returns: NULL
Explanation
Create a function called fn_Get2AlphaCharCodesFromEmail in the schema dbo which takes a single parameter, @email, which is a string of up to 254 characters, and returns a string of up to 254 characters.
create function dbo.fn_Get2AlphaCharCodesFromEmail
(
@email nvarchar(254)
) returns nvarchar(254)
as
begin
--... code that does the work goes here
end
Declare the variables we'll be using later on.
@result holds the value we'll be returning from the function.
@maxLen records the maximum length of an email; this makes it slightly easier should this length ever need to change, though not entirely simple, since we still have to specify the 254 length in our column and variable definitions later on.
declare @result nvarchar(254) = null
, @maxLen int = 254
Now comes the interesting bit. We create a common table expression with 3 columns:
i is used to record which iteration each record was produced in; the highest value of i is the last record to be created.
remainder is used to hold the yet-to-be processed characters from the email.
result is used to record the 2 char codes; each new row adds another value to this column's comma separated values.
;with cte(i, remainder,result) as
(
--code to iterate through the email string, breaking it down, goes here
)
This gives us our first row in the cte "table".
The cast statements throughout this part are to ensure we have a consistent data type, as data types in a CTE are implicit, and not always correct.
We initialise i (i.e. the first column) with value 0 to say that this is our first row (we could choose pretty much any value here; it doesn't matter).
We initialise remainder (i.e. 2nd column) as the part of the email address which follows the @ character, i.e. the email's domain, wrapped in a leading and trailing dot so that codes at either end of the domain still match the pattern.
We initialise result (i.e. 3rd column) as null, as we've not yet found a result (i.e. a 2 char string within the email's domain).
There is no from component as we're just getting data from the @email variable; no tables/views/etc are required.
select cast(0 as int)
, cast('.' + substring(@email,charindex('@',@email)+1,@maxLen) + '.' as nvarchar(254))
, cast(null as nvarchar(254))
union all is used to combine the first result(s) with the results of the next (recurring) statement. NB: The CTE code before this statement is run once to give initial values; the code after is run once for each new set of rows generated.
union all
The recurring code in the CTE is applied to new rows in the CTE until no new rows are generated.
i takes the value of the previous iteration's row's i incremented by 1.
select cast(i+1 as int)
remainder takes the previous iteration's remainder, and removes everything before (and including) the next 2 character code (result).
patindex('%.[A-Z][A-Z].%',remainder) returns the position of the first occurrence of a dot followed by 2 letters followed by a dot, anywhere in the input string (or 0 if there is none).
, cast(substring(remainder,patindex('%.[A-Z][A-Z].%',remainder)+3,@maxLen)as nvarchar(254))
result uses the same logic as remainder, only it takes the 2 characters found, rather than everything after them. These characters are added on to the end of the previous iteration's row's result value, separated by a comma.
, cast(coalesce(result + ',','') + substring(remainder,patindex('%.[A-Z][A-Z].%',remainder)+1,2) as nvarchar(254))
the from cte part just says that we're referencing the same "table" we're creating; i.e. this is how the recursion occurs
from cte
the where statement is used to prevent infinite recursion; i.e. once there are no more 2 char codes left in the remainder, stop looking.
where patindex('%.[A-Z][A-Z].%',remainder) > 0
Once we've found all the 2 char codes in the string, we know that the last row's result will contain the complete set; as such we assign this single row's value to the @result variable.
select top 1 @result = result
the from statement shows we're referencing the data we created in our with cte statement
from cte
the order by is used to determine which record comes first (i.e. which record is the top 1 record). We want it to be the last row generated by the CTE. Since we've been incrementing i by 1 each time, this last record will have the highest value of i, so by sorting by i desc (descending) that last generated row will be the row we get.
order by i desc;
Finally, we return the result generated above.
Return @result;
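To see the recursion described above at work, the CTE can be run on its own and all of its rows returned instead of only the last one. This is a minimal sketch using the sample address from the question; it assumes a case-insensitive default collation, so that [A-Z] also matches lower-case letters. The row listing in the trailing comment was worked out by hand from the pattern positions.

```sql
DECLARE @email nvarchar(254) = N'claudio.passerini@uni.re.dit.mn.us';

WITH cte (i, remainder, result) AS
(
    -- anchor row: the domain wrapped in dots, no codes found yet
    SELECT CAST(0 AS int),
           CAST('.' + SUBSTRING(@email, CHARINDEX('@', @email) + 1, 254) + '.' AS nvarchar(254)),
           CAST(NULL AS nvarchar(254))
    UNION ALL
    -- recursive step: keep what follows the next ".xx." match and append the code
    SELECT CAST(i + 1 AS int),
           CAST(SUBSTRING(remainder, PATINDEX('%.[A-Z][A-Z].%', remainder) + 3, 254) AS nvarchar(254)),
           CAST(COALESCE(result + ',', '') + SUBSTRING(remainder, PATINDEX('%.[A-Z][A-Z].%', remainder) + 1, 2) AS nvarchar(254))
    FROM cte
    WHERE PATINDEX('%.[A-Z][A-Z].%', remainder) > 0
)
SELECT i, remainder, result
FROM cte
ORDER BY i;

-- i  remainder            result
-- 0  .uni.re.dit.mn.us.   NULL
-- 1  .dit.mn.us.          re
-- 2  .us.                 re,mn
-- 3  .                    re,mn,us
```

Note that the remainder keeps the trailing dot of each match (the offset is +3, not +4), which is what lets two adjacent codes such as .mn.us. both be found.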
Alternative Approach
However, if you're trying to extract information from your emails, I'd recommend an alternate approach: have a list of the values you're looking for and compare your email against it, without having to break apart the email address (beyond splitting on the @ to ensure you're only checking the email's domain).
declare @countryCodes table (code nchar(2), name nvarchar(64)) --you'd use a real table for this; I'm just using a table variable so this demo's throwaway code
insert into @countryCodes (code, name)
values
('es','Spain')
,('fr','France')
,('uk','United Kingdom')
,('us','USA')
--etc.
--check a single mail
declare @mail nvarchar(256) = 'claudio.passerini@uni.re.dit.mn.us'
if exists (select top 1 1 from @countryCodes where '.' + substring(@mail,charindex('@',@mail)+1,256) + '.' like '%.' + code + '.%')
begin
    select name from @countryCodes where '.' + substring(@mail,charindex('@',@mail)+1,256) + '.' like '%.' + code + '.%'
end
else
begin
    select 'no results found'
end
--check a bunch of mails
declare @emailsToCheck table (email nvarchar(256))
insert into @emailsToCheck (email)
values
('claudio.passerini@uni.re.dit.mn.us')
,('someone@someplace.co.uk')
,('cant.see.me@never.never.land')
,('some.fr.address.hidden@france.not.in.this.bit')
select e.email, c.name
from @emailsToCheck e
left outer join @countryCodes c
on '.' + substring(email,charindex('@',email)+1,256) + '.' like '%.' + code + '.%'
order by e.email, c.name
If you want individual columns you will need to pivot your data after splitting out your strings with a table-valued function, as per Marc's answer. If you are happy having them in rows, you can just use the select statement inside the brackets.
Query to get the data
declare @t table (Email nvarchar(50));
insert into @t values('claudio.passerini@uni.re.dit.mn.us'),('claudio.passerini@uni.ry.dit.mn.urg'),('claudio.passerini@uni.rn.dit.mn.uk');
select Email
,[1]
,[2]
,[3]
,[4]
,[5]
,[6]
from(
select t.Email
,s.Item
,row_number() over (partition by t.Email order by s.Item) as rn
from @t t
cross apply dbo.DelimitedSplit8K(t.Email,'.') s
where len(s.Item) = 2
) a
pivot
(
max(Item) for rn in([1],[2],[3],[4],[5],[6])
) pvt
Table valued function to split out the strings, courtesy of Jeff Moden
http://www.sqlservercentral.com/articles/Tally+Table/72993/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
You can create your own function to split strings.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fnSplitString]
(
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
)
RETURNS @output TABLE(splitdata NVARCHAR(MAX)
)
BEGIN
    -- dbo.cSeparador() supplies the default delimiter; it is defined elsewhere and not shown here
    set @delimiter = coalesce(@delimiter, dbo.cSeparador());
    DECLARE @start INT, @end INT
    SELECT @start = 1, @end = CHARINDEX(@delimiter, @string)
    WHILE @start < LEN(@string) + 1 BEGIN
        IF @end = 0
            SET @end = LEN(@string) + 1
        INSERT INTO @output (splitdata)
        VALUES(SUBSTRING(@string, @start, @end - @start))
        SET @start = @end + 1
        SET @end = CHARINDEX(@delimiter, @string, @start)
    END
    RETURN
END
Using this function you can get all your country&state codes :
select splitdata from dbo.fnSplitString('claudio.passerini@uni.re.dit.mn.us', '.')
where len(splitdata) = 2
You can modify that query to concatenate the result on a single string :
SELECT
STUFF((SELECT ',' + splitdata
FROM dbo.fnSplitString('claudio.passerini@uni.re.dit.mn.us', '.')
WHERE len(splitdata) = 2
FOR XML PATH('')), 1, 1, '')
Here is how you put it into an scalar function :
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fnCountryCodes](@email nvarchar(max)) returns nvarchar(max)
AS
BEGIN
RETURN (SELECT
STUFF((SELECT ',' + splitdata
FROM dbo.fnSplitString(@email, '.')
WHERE len(splitdata) = 2
FOR XML PATH('')), 1, 1, ''));
END
You call it like this :
select dbo.fnCountryCodes('claudio.passerini@uni.re.dit.mn.us')
Alternatively you can create a table-valued function that returns all the 2 characters long substrings from the domain of a mail address :
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE FUNCTION [dbo].[fnCountryCodes] (@email NVARCHAR(MAX))
RETURNS @output TABLE(subdomain1 nvarchar(2), subdomain2 nvarchar(2), subdomain3 nvarchar(2), subdomain4 nvarchar(2), subdomain5 nvarchar(2))
as
BEGIN
    DECLARE @subdomain1 nvarchar(2);
    DECLARE @subdomain2 nvarchar(2);
    DECLARE @subdomain3 nvarchar(2);
    DECLARE @subdomain4 nvarchar(2);
    DECLARE @subdomain5 nvarchar(2);
    DECLARE CURSOR_SUBDOMAINS CURSOR FOR select splitdata from dbo.fnSplitString(@email, '.') where len(splitdata) = 2;
    OPEN CURSOR_SUBDOMAINS;
    FETCH NEXT FROM CURSOR_SUBDOMAINS INTO @subdomain1;
    FETCH NEXT FROM CURSOR_SUBDOMAINS INTO @subdomain2;
    FETCH NEXT FROM CURSOR_SUBDOMAINS INTO @subdomain3;
    FETCH NEXT FROM CURSOR_SUBDOMAINS INTO @subdomain4;
    FETCH NEXT FROM CURSOR_SUBDOMAINS INTO @subdomain5;
    CLOSE CURSOR_SUBDOMAINS;
    DEALLOCATE CURSOR_SUBDOMAINS;
    INSERT INTO @output (subdomain1, subdomain2, subdomain3, subdomain4, subdomain5)
    values (@subdomain1, @subdomain2, @subdomain3, @subdomain4, @subdomain5)
    RETURN
END
You use it like this:
select * from dbo.fnCountryCodes('claudio.passerini@uni.re.dit.mn.us')

Query to get only numbers from a string

I have data like this:
string 1: 003Preliminary Examination Plan
string 2: Coordination005
string 3: Balance1000sheet
The output I expect is
string 1: 003
string 2: 005
string 3: 1000
And I want to implement it in SQL.
First create this UDF
CREATE FUNCTION dbo.udf_GetNumeric
(
    @strAlphaNumeric VARCHAR(256)
)
RETURNS VARCHAR(256)
AS
BEGIN
    DECLARE @intAlpha INT
    -- position of the first non-digit character, 0 if there is none
    SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric)
    BEGIN
        WHILE @intAlpha > 0
        BEGIN
            -- delete that character and look for the next one
            SET @strAlphaNumeric = STUFF(@strAlphaNumeric, @intAlpha, 1, '' )
            SET @intAlpha = PATINDEX('%[^0-9]%', @strAlphaNumeric )
        END
    END
    RETURN ISNULL(@strAlphaNumeric,0)
END
GO
Now use the function as
SELECT dbo.udf_GetNumeric(column_name)
from table_name
I hope this solved your problem.
Try this one -
Query:
DECLARE @temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO @temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT LEFT(subsrt, PATINDEX('%[^0-9]%', subsrt + 't') - 1)
FROM (
SELECT subsrt = SUBSTRING(string, pos, LEN(string))
FROM (
SELECT string, pos = PATINDEX('%[0-9]%', string)
FROM @temp
) d
) t
Output:
----------
003
005
1000
Query:
DECLARE @temp TABLE
(
string NVARCHAR(50)
)
INSERT INTO @temp (string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%',
string) + 1) AS Number
FROM @temp
Please try:
declare @var nvarchar(max)='Balance1000sheet'
SELECT LEFT(Val,PATINDEX('%[^0-9]%', Val+'a')-1) from(
SELECT SUBSTRING(@var, PATINDEX('%[0-9]%', @var), LEN(@var)) Val
)x
Getting the numbers from a string can be done in a one-liner.
Try this:
SUBSTRING('your-string-here', PATINDEX('%[0-9]%', 'your-string-here'), LEN('your-string-here'))
NB: This returns everything from the first digit to the end of the string, so it only yields a clean number when the digits come last; e.g. abc123vfg34 returns 123vfg34, not 123.
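To make the behaviour of the one-liner concrete, here is a small sketch that compares it with the LEFT/PATINDEX trimming used in the earlier answers. The column aliases tail and first_run are only illustrative names.

```sql
SELECT v.s,
       t.tail,   -- the one-liner: everything from the first digit to the end
       first_run = LEFT(t.tail, PATINDEX('%[^0-9]%', t.tail + 'x') - 1)  -- first run of digits only
FROM (VALUES ('abc123vfg34'), ('Coordination005')) AS v(s)
CROSS APPLY (SELECT tail = SUBSTRING(v.s, PATINDEX('%[0-9]%', v.s), LEN(v.s))) AS t;

-- s                tail       first_run
-- abc123vfg34      123vfg34   123
-- Coordination005  005        005
```

Appending a non-digit ('x') before calling PATINDEX guarantees a match even when the digits run to the end of the string.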
I found this approach works about 3x faster than the top voted answer. Create the following function, dbo.GetNumbers:
CREATE FUNCTION dbo.GetNumbers(@String VARCHAR(8000))
RETURNS VARCHAR(8000)
AS
BEGIN;
WITH
Numbers
AS (
--Step 1.
--Get a column of numbers to represent
--every character position in the @String.
SELECT 1 AS Number
UNION ALL
SELECT Number + 1
FROM Numbers
WHERE Number < LEN(@String)
)
,Characters
AS (
SELECT Character
FROM Numbers
CROSS APPLY (
--Step 2.
--Use the column of numbers generated above
--to tell substring which character to extract.
SELECT SUBSTRING(@String, Number, 1) AS Character
) AS c
)
--Step 3.
--Pattern match to return only numbers from the CTE
--and use STRING_AGG to rebuild it into a single string.
SELECT @String = STRING_AGG(Character,'')
FROM Characters
WHERE Character LIKE '[0-9]'
--allows going past the default maximum of 100 loops in the CTE
OPTION (MAXRECURSION 8000)
RETURN @String
END
GO
Testing
Testing for purpose:
SELECT dbo.GetNumbers(InputString) AS Numbers
FROM ( VALUES
('003Preliminary Examination Plan') --output: 003
,('Coordination005') --output: 005
,('Balance1000sheet') --output: 1000
,('(111) 222-3333') --output: 1112223333
,('1.38hello#f00.b4r#\-6') --output: 1380046
) testData(InputString)
Testing for performance:
Start off setting up the test data...
--Add table to hold test data
CREATE TABLE dbo.NumTest (String VARCHAR(8000))
--Make a 6,400-character string (8 characters x 800) with a mix of numbers and letters
DECLARE @Num VARCHAR(8000) = REPLICATE('12tf56se',800)
--Add this to the test table 500 times
DECLARE @n INT = 0
WHILE @n < 500
BEGIN
    INSERT INTO dbo.NumTest VALUES (@Num)
    SET @n = @n +1
END
Now testing the dbo.GetNumbers function:
SELECT dbo.GetNumbers(NumTest.String) AS Numbers
FROM dbo.NumTest -- Time to complete: 1 min 7s
Then testing the UDF from the top voted answer on the same data.
SELECT dbo.udf_GetNumeric(NumTest.String)
FROM dbo.NumTest -- Time to complete: 3 mins 12s
Decimals
If you need it to handle decimals, you can use either of the following approaches; I found no noticeable performance difference between them.
change '[0-9]' to '[0-9.]'
change Character LIKE '[0-9]' to ISNUMERIC(Character) = 1 (SQL Server treats a single decimal point as "numeric"; note that ISNUMERIC also accepts other single characters such as +, - and currency symbols)
Bonus
You can easily adapt this to differing requirements by swapping out WHERE Character LIKE '[0-9]' for one of the following options:
WHERE Character LIKE '[a-zA-Z]' --Get only letters
WHERE Character LIKE '[0-9a-zA-Z]' --Remove non-alphanumeric
WHERE Character LIKE '[^0-9a-zA-Z]' --Get only non-alphanumeric
With the previous queries I get these results:
'AAAA1234BBBB3333' >>>> Output: 1234
'-çã+0!\aº1234' >>>> Output: 0
The code below returns All numeric chars:
1st output: 12343333
2nd output: 01234
declare @StringAlphaNum varchar(255)
declare @Character varchar
declare @SizeStringAlfaNumerica int
declare @CountCharacter int
set @StringAlphaNum = 'AAAA1234BBBB3333'
set @SizeStringAlfaNumerica = len(@StringAlphaNum)
set @CountCharacter = 1
while isnumeric(@StringAlphaNum) = 0
begin
    while @CountCharacter < @SizeStringAlfaNumerica
    begin
        if substring(@StringAlphaNum,@CountCharacter,1) not like '[0-9]%'
        begin
            set @Character = substring(@StringAlphaNum,@CountCharacter,1)
            set @StringAlphaNum = replace(@StringAlphaNum, @Character, '')
        end
        set @CountCharacter = @CountCharacter + 1
    end
    set @CountCharacter = 0
end
select @StringAlphaNum
declare @puvodni nvarchar(20)
set @puvodni = N'abc1d8e8ttr987avc'
WHILE PATINDEX('%[^0-9]%', @puvodni) > 0 SET @puvodni = REPLACE(@puvodni, SUBSTRING(@puvodni, PATINDEX('%[^0-9]%', @puvodni), 1), '' )
SELECT @puvodni
A solution for SQL Server 2017 and later, using TRANSLATE:
DECLARE @T table (string varchar(50) NOT NULL);
INSERT @T
(string)
VALUES
('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet');
SELECT
result =
REPLACE(
TRANSLATE(
T.string COLLATE Latin1_General_CI_AI,
'abcdefghijklmnopqrstuvwxyz',
SPACE(26)),
SPACE(1),
SPACE(0))
FROM @T AS T;
Output:
result
003
005
1000
The code works by:
Replacing characters a-z (ignoring case & accents) with a space
Replacing spaces with an empty string.
The string supplied to TRANSLATE can be expanded to include additional characters.
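As a hedged illustration of that last point, the sketch below runs the same expression on a made-up input that contains punctuation, once with the letters-only list and once with a colon added to it. The input string and the column aliases are assumptions for the example; the one firm rule is that TRANSLATE requires its second and third arguments to have the same length, so SPACE(n) has to grow with the character list.

```sql
DECLARE @s varchar(50) = 'Tel: 555 0199';

SELECT letters_only =
           REPLACE(TRANSLATE(@s COLLATE Latin1_General_CI_AI,
                             'abcdefghijklmnopqrstuvwxyz', SPACE(26)), ' ', ''),
       with_colon =
           REPLACE(TRANSLATE(@s COLLATE Latin1_General_CI_AI,
                             'abcdefghijklmnopqrstuvwxyz:', SPACE(27)), ' ', '');

-- letters_only -> :5550199   (the colon is not in the list, so it survives)
-- with_colon   -> 5550199
```

Spaces that were already in the input are removed as well, which is usually what is wanted when only the digits matter.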
I did not have rights to create functions but had text like
["blahblah012345679"]
And needed to extract the numbers out of the middle
Note this assumes the numbers are grouped together and not at the start and end of the string.
select substring(column_name,patindex('%[0-9]%', column_name),patindex('%[0-9][^0-9]%', column_name)-patindex('%[0-9]%', column_name)+1)
from table_name
Although this is an old thread, it's the first result in a Google search, and I came up with a different answer from those above. It lets you pass in your own criteria for what to keep from a string, whatever those criteria might be. You can put it in a function to call over and over again if you want.
declare @String VARCHAR(MAX) = '-123. a 456-78(90)'
declare @MatchExpression VARCHAR(255) = '%[0-9]%'
declare @return varchar(max)
WHILE PatIndex(@MatchExpression, @String) > 0
begin
    set @return = CONCAT(@return, SUBSTRING(@String,patindex(@MatchExpression, @String),1))
    SET @String = Stuff(@String, PatIndex(@MatchExpression, @String), 1, '')
end
select (@return)
This UDF will work for all types of strings:
CREATE FUNCTION udf_getNumbersFromString (@string varchar(max))
RETURNS varchar(max)
AS
BEGIN
    WHILE @string like '%[^0-9]%'
        SET @string = REPLACE(@string, SUBSTRING(@string, PATINDEX('%[^0-9]%', @string), 1), '')
    RETURN @string
END
Just a little modification to @Epsicron's answer
SELECT SUBSTRING(string, PATINDEX('%[0-9]%', string), PATINDEX('%[0-9][^0-9]%', string + 't') - PATINDEX('%[0-9]%',
string) + 1) AS Number
FROM (values ('003Preliminary Examination Plan'),
('Coordination005'),
('Balance1000sheet')) as a(string)
no need for a temporary variable
First find the position of the first digit, then reverse the string and find the first digit again (which gives the position of the last digit, counted from the end). Subtracting 1 from each position gives the number of non-digit characters before and after the number; subtracting both from the total length of the string gives the length of the number. Then extract the number using SUBSTRING.
declare @fieldName nvarchar(100)='AAAA1221.121BBBB'
declare @lenSt int=(select PATINDEX('%[0-9]%', @fieldName)-1)
declare @lenEnd int=(select PATINDEX('%[0-9]%', REVERSE(@fieldName))-1)
select SUBSTRING(@fieldName, PATINDEX('%[0-9]%', @fieldName), (LEN(@fieldName) - @lenSt -@lenEnd))
T-SQL function to read all the integers from text and return the one at the indicated index, starting from left or right, also using a starting search term (optional):
create or alter function dbo.udf_number_from_text(
    @text nvarchar(max),
    @search_term nvarchar(1000) = N'',
    @number_position tinyint = 1,
    @rtl bit = 0
) returns int
as
begin
    declare @result int = 0;
    declare @search_term_index int = 0;
    if @text is null or len(@text) = 0 goto exit_label;
    set @text = trim(@text);
    if len(@text) = len(@search_term) goto exit_label;
    if len(@search_term) > 0
    begin
        set @search_term_index = charindex(@search_term, @text);
        if @search_term_index = 0 goto exit_label;
    end;
    -- keep only the text after (left-to-right) or before (right-to-left) the search term
    if @search_term_index > 0
        if @rtl = 0
            set @text = trim(right(@text, len(@text) - @search_term_index - len(@search_term) + 1));
        else
            set @text = trim(left(@text, @search_term_index - 1));
    if len(@text) = 0 goto exit_label;
    declare @patt_number nvarchar(10) = '%[0-9]%';
    declare @patt_not_number nvarchar(10) = '%[^0-9]%';
    declare @number_start int = 1;
    declare @number_end int;
    declare @found_numbers table (id int identity(1,1), val int);
    -- collect every run of digits, in order of appearance
    while @number_start > 0
    begin
        set @number_start = patindex(@patt_number, @text);
        if @number_start > 0
        begin
            if @number_start = len(@text)
            begin
                insert into @found_numbers(val)
                select cast(substring(@text, @number_start, 1) as int);
                break;
            end;
            else
            begin
                set @text = right(@text, len(@text) - @number_start + 1);
                set @number_end = patindex(@patt_not_number, @text);
                if @number_end = 0
                begin
                    insert into @found_numbers(val)
                    select cast(@text as int);
                    break;
                end;
                else
                begin
                    insert into @found_numbers(val)
                    select cast(left(@text, @number_end - 1) as int);
                    if @number_end = len(@text)
                        break;
                    else
                    begin
                        set @text = trim(right(@text, len(@text) - @number_end));
                        if len(@text) = 0 break;
                    end;
                end;
            end;
        end;
    end;
    -- pick the number at the requested position, counting from the left or from the right
    if @rtl = 0
        select @result = coalesce(a.val, 0)
        from (select row_number() over (order by m.id asc) as c_row, m.val
              from @found_numbers as m) as a
        where a.c_row = @number_position;
    else
        select @result = coalesce(a.val, 0)
        from (select row_number() over (order by m.id desc) as c_row, m.val
              from @found_numbers as m) as a
        where a.c_row = @number_position;
exit_label:
    return @result;
end;
Example:
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'term',2,1);
returns 10;
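A few more calls, traced by hand through the function body above, show how the search term, the position and the direction interact. They assume the function has been created as dbo.udf_number_from_text and that the database uses a case-insensitive collation.

```sql
-- no search term, counting from the left
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'', 1, 0);      -- 10
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'', 2, 0);      -- 25
-- there is no third number, so the initial value of @result (0) is returned
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'', 3, 0);      -- 0
-- left to right, only the text after the search term is scanned
select dbo.udf_number_from_text(N'Text text 10 text, 25 term', N'text,', 1, 0); -- 25
```

Scalar functions do not let arguments be omitted, so every parameter is passed explicitly here; the keyword default can be used instead to pick up a declared default value.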
This is one of the simplest approaches. It works on the entire string, including multiple occurrences of numbers.
CREATE FUNCTION dbo.fn_GetNumbers(@strInput NVARCHAR(500))
RETURNS NVARCHAR(500)
AS
BEGIN
    DECLARE @strOut NVARCHAR(500) = '', @intCounter INT = 1
    WHILE @intCounter <= LEN(@strInput)
    BEGIN
        SELECT @strOut = @strOut + CASE WHEN SUBSTRING(@strInput, @intCounter, 1) LIKE '[0-9]' THEN SUBSTRING(@strInput, @intCounter, 1) ELSE '' END
        SET @intCounter = @intCounter + 1
    END
    RETURN @strOut
END
The following solution uses a single recursive common table expression (CTE).
DECLARE @s AS TABLE (id int PRIMARY KEY, value nvarchar(max));
INSERT INTO @s
VALUES
(1, N'003Preliminary Examination Plan'),
(2, N'Coordination005'),
(3, N'Balance1000sheet');
SELECT * FROM @s ORDER BY id;
WITH t AS (
SELECT
id,
1 AS i,
SUBSTRING(value, 1, 1) AS c
FROM
@s
WHERE
LEN(value) > 0
UNION ALL
SELECT
t.id,
t.i + 1 AS i,
SUBSTRING(s.value, t.i + 1, 1) AS c
FROM
t
JOIN @s AS s ON t.id = s.id
WHERE
t.i < LEN(s.value)
)
SELECT
id,
STRING_AGG(c, N'') WITHIN GROUP (ORDER BY i ASC) AS value
FROM
t
WHERE
c LIKE '[0-9]'
GROUP BY
id
ORDER BY
id;
DECLARE @index NVARCHAR(20);
SET @index = 'abd565klaf12';
WHILE PATINDEX('%[0-9]%', @index) != 0
BEGIN
    SET @index = REPLACE(@index, SUBSTRING(@index, PATINDEX('%[0-9]%', @index), 1), '');
END
SELECT @index;
As written, this strips the digits and keeps the letters. Replace [0-9] with [a-z] if only the numbers are wanted, then convert the result to the desired type using the CAST function.
Using a user-defined function can slow the query down considerably. This expression extracts the number from the string inline:
SELECT
Reverse(substring(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) , patindex('%[0-9]%', Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) )))) ), len(Reverse(rtrim(ltrim( substring([FieldName] , patindex('%[0-9]%', [FieldName] ) , len([FieldName]) ))))) )) NumberValue
FROM dbo.TableName
CREATE OR REPLACE FUNCTION count_letters_and_numbers(input_string TEXT)
RETURNS TABLE (letters INT, numbers INT) AS $$
BEGIN
    RETURN QUERY SELECT
        coalesce(sum(CASE WHEN ch ~ '[A-Za-z]' THEN 1 ELSE 0 END), 0)::int AS letters,
        coalesce(sum(CASE WHEN ch ~ '[0-9]' THEN 1 ELSE 0 END), 0)::int AS numbers
    -- a NULL delimiter makes string_to_array split the string into single characters
    FROM unnest(string_to_array(input_string, NULL)) AS ch;
END;
$$ LANGUAGE plpgsql;
For the hell of it...
This solution is different to all earlier solutions, viz:
There is no need to create a function
There is no need to use pattern matching
There is no need for a temporary table
This solution uses a recursive common table expression (CTE)
But first - note the question does not specify where such strings are stored. In my solution below, I create a CTE as a quick and dirty way to put these strings into some kind of "source table".
Note also - this solution uses a recursive common table expression (CTE) - so don't get confused by the usage of two CTEs here. The first is simply to make the data available to the solution - but it is only the second CTE that is required in order to solve this problem. You can adapt the code to make this second CTE query your existing table, view, etc.
Lastly - my coding is verbose, trying to use column and CTE names that explain what is going on and you might be able to simplify this solution a little. I've added in a few pseudo phone numbers with some (expected and atypical, as the case may be) formatting for the fun of it.
with SOURCE_TABLE as (
select '003Preliminary Examination Plan' as numberString
union all select 'Coordination005' as numberString
union all select 'Balance1000sheet' as numberString
union all select '1300 456 678' as numberString
union all select '(012) 995 8322 ' as numberString
union all select '073263 6122,' as numberString
),
FIRST_CHAR_PROCESSED as (
select
len(numberString) as currentStringLength,
isNull(cast(try_cast(replace(left(numberString, 1),' ','z') as tinyint) as nvarchar),'') as firstCharAsNumeric,
cast(isNull(cast(try_cast(nullIf(left(numberString, 1),'') as tinyint) as nvarchar),'') as nvarchar(4000)) as newString,
cast(substring(numberString,2,len(numberString)) as nvarchar) as remainingString
from SOURCE_TABLE
union all
select
len(remainingString) as currentStringLength,
cast(try_cast(replace(left(remainingString, 1),' ','z') as tinyint) as nvarchar) as firstCharAsNumeric,
cast(isNull(newString,'') as nvarchar(3999)) + isNull(cast(try_cast(nullIf(left(remainingString, 1),'') as tinyint) as nvarchar(1)),'') as newString,
substring(remainingString,2,len(remainingString)) as remainingString
from FIRST_CHAR_PROCESSED fcp2
where fcp2.currentStringLength > 1
)
select
newString
,* -- comment this out when required
from FIRST_CHAR_PROCESSED
where currentStringLength = 1
So what's going on here?
Basically in our CTE we are selecting the first character and using try_cast (see docs) to cast it to a tinyint (which is a large enough data type for a single-digit numeral). Note that the type-casting rules in SQL Server say that an empty string (or a space, for that matter) will resolve to zero, so the nullif is added to force spaces and empty strings to resolve to null (see discussion) (otherwise our result would include a zero character any time a space is encountered in the source data).
The CTE also returns everything after the first character - and that becomes the input to our recursive call on the CTE; in other words: now let's process the next character.
Lastly, the field newString in the CTE is generated (in the second SELECT) via concatenation. With recursive CTEs the data type must match between the two SELECT statements for any given column - including the column size. Because we know we are adding (at most) a single character, we are casting that character to nvarchar(1) and we are casting the newString (so far) as nvarchar(3999). Concatenated, the result will be nvarchar(4000) - which matches the type casting we carry out in the first SELECT.
If you run this query and exclude the WHERE clause, you'll get a sense of what's going on - but the rows may be in a strange order. (You won't necessarily see all rows relating to a single input value grouped together - but you should still be able to follow).
Hope it's an interesting option that may help a few people wanting a strictly expression-based solution.
In Oracle
You can get what you want using this:
SUBSTR('ABCD1234EFGH',REGEXP_INSTR ('ABCD1234EFGH', '[[:digit:]]'),REGEXP_COUNT ('ABCD1234EFGH', '[[:digit:]]'))
Sample Query:
SELECT SUBSTR('003Preliminary Examination Plan ',REGEXP_INSTR ('003Preliminary Examination Plan ', '[[:digit:]]'),REGEXP_COUNT ('003Preliminary Examination Plan ', '[[:digit:]]')) SAMPLE1,
SUBSTR('Coordination005',REGEXP_INSTR ('Coordination005', '[[:digit:]]'),REGEXP_COUNT ('Coordination005', '[[:digit:]]')) SAMPLE2,
SUBSTR('Balance1000sheet',REGEXP_INSTR ('Balance1000sheet', '[[:digit:]]'),REGEXP_COUNT ('Balance1000sheet', '[[:digit:]]')) SAMPLE3 FROM DUAL
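Note that the SUBSTR approach relies on the digits forming one contiguous run, because REGEXP_COUNT counts every digit in the string. As a hedged alternative sketch (the sample literals and column aliases are made up), REGEXP_SUBSTR returns the first run of digits, and REGEXP_REPLACE without a replacement argument strips everything that is not a digit:

```sql
SELECT REGEXP_SUBSTR('Balance1000sheet', '[[:digit:]]+') AS first_run,  -- first contiguous run of digits
       REGEXP_REPLACE('A1B22C333', '[^[:digit:]]')       AS all_digits  -- every digit, in original order
FROM DUAL;
```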
If you are using Postgres and your data looks like '2000 - some sample text', combine substring and position as shown here; if your data has no delimiter, you will need a regular expression instead.
SUBSTRING(column_name FROM 0 FOR POSITION('-' IN column_name) - 1) AS number_column_name

SQL server - Split and sum of a single cell

I have a table cell of type nvarchar(max) that typically looks like this:
A03 B32 Y660 P02
i.e. a letter followed by a number, separated by spaces. What I want to do is get the sum of all those numbers in a SQL procedure. That would be rather simple in other languages, but I am fairly new to SQL, and it seems to me like a rather clumsy language for playing around with strings.
Aaanyway, I imagine it would go like this:
1) Create a temporary table and fill it using a split function
2) Strip the first character of every cell
3) Convert the data to int
4) Update target table.column set to sum of said temporary table.
So I got as far as this:
CREATE PROCEDURE [dbo].[SumCell] @delimited nvarchar(max), @row int
AS
BEGIN
declare @t table(data nvarchar(max))
declare @xml xml
set @xml = N'<root><r>' + replace(@delimited,' ','</r><r>') + '</r></root>'
insert into @t(data)
select
r.value('.','varchar(5)') as item
from @xml.nodes('//root/r') as records(r)
UPDATE TargetTable
SET TargetCell = SUM(@t.data) WHERE id = @row
END
Obviously, the first-character stripping and the conversion to int are missing, and on top of that, I get a "must declare the scalar variable @t" error...
The question is not very clear, so assuming your text is in a single cell like A3 B32 Y660 P20, the following snippet can be used to get the sum.
DECLARE @Cell NVARCHAR(400), @Sum INT, @CharIndex INT
SELECT @Cell = 'A3 B32 Y660 P20', @Sum = 0
WHILE (LEN(LTRIM(@Cell)) > 0)
BEGIN
-- position of the next space; 0 when only the last token is left
SELECT @CharIndex = CHARINDEX(' ', @Cell, 0)
-- skip the leading letter and add the digits (implicitly converted to int)
SELECT @Sum = @Sum +
SUBSTRING(@Cell, 2, CASE WHEN @CharIndex > 2 THEN @CharIndex - 2 ELSE LEN(@Cell) - 1 END)
-- drop the token that was just processed
SELECT @Cell = SUBSTRING(@Cell, @CharIndex + 1, LEN(@Cell))
IF NOT (@CharIndex > 0) BREAK;
END
-- @Sum has the total of cell numbers
SELECT @Sum
I'm making the assumption that you really want to be able to find the sum of values in your delimited list for a full selection of a table. Therefore, I believe the most complicated part of your question is splitting the values. The method I tend to use requires a numbers table, so I'll start with that:
--If you really want to use a temporary numbers table don't use this method!
create table #numbers(
Number int identity(1,1) primary key
)
declare @counter int
set @counter = 1
while @counter<=10000
begin
insert into #numbers default values
set @counter = @counter + 1
end
I'll also create some test data
create table #data(
id int identity(1,1),
cell nvarchar(max)
)
insert into #data(cell) values('A03 B32 Y660 P02')
insert into #data(cell) values('Y72 A12 P220 B42')
Then, I'd put the split functionality into a CTE to keep things clean:
;with split as (
select d.id,
[valOrder] = row_number() over(partition by d.id order by n.Number),
[fullVal] = substring(d.cell, n.Number, charindex(' ',d.cell+' ',n.Number) - n.Number),
[char] = substring(d.cell, n.Number, 1),
[numStr] = substring(d.cell, n.Number+1, charindex(' ',d.cell+' ',n.Number) - n.Number)
from #data d
join #numbers n on substring(' '+d.cell, n.Number, 1) = ' '
where n.Number <= len(d.cell)+1
)
select id, sum(cast(numStr as int))
from split
group by id
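As an alternative sketch that avoids the numbers table, assuming SQL Server 2016 or later (database compatibility level 130 or higher), the built-in STRING_SPLIT function can do the splitting. It reuses the #data test table from above; the alias total is made up for illustration:

```sql
select d.id,
       -- drop the leading letter of each token, then sum the numeric parts
       total = sum(cast(substring(s.value, 2, len(s.value)) as int))
from #data d
cross apply string_split(d.cell, N' ') s
where s.value <> N''   -- ignore empty tokens produced by repeated spaces
group by d.id;
```

For the two test rows inserted above, the totals should come out as 697 and 346.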

Passing a variable into an IN clause within a SQL function? [duplicate]

Possible Duplicate:
Parameterizing an SQL IN clause?
I have a SQL function whereby I need to pass a list of IDs in, as a string, into:
WHERE ID IN (@MyList)
I have looked around and most of the answers are either where the SQL is built within C# and they loop through and call AddParameter, or the SQL is built dynamically.
My SQL function is fairly large and so building the query dynamically would be rather tedious.
Is there really no way to pass in a string of comma-separated values into the IN clause?
My variable being passed in is representing a list of integers so it would be:
"1,2,3,4,5,6,7" etc
Here is a slightly more efficient way to split a list of integers. First, create a numbers table, if you don't already have one. This will create a table with up to 1,000,000 unique integers, matching the TOP (1000000) in the query (you may need more or fewer):
;WITH x AS
(
SELECT TOP (1000000) Number = ROW_NUMBER() OVER
(ORDER BY s1.[object_id])
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
ORDER BY s1.[object_id]
)
SELECT Number INTO dbo.Numbers FROM x;
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number);
Then a function:
CREATE FUNCTION [dbo].[SplitInts_Numbers]
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = CONVERT(INT, SUBSTRING(@List, Number,
CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number))
FROM dbo.Numbers
WHERE Number <= CONVERT(INT, LEN(@List))
AND SUBSTRING(@Delimiter + @List, Number, 1) = @Delimiter
);
You can compare the performance to an iterative approach here:
http://sqlfiddle.com/#!3/960d2/1
To avoid the numbers table, you can also try an XML-based version of the function - it is more compact but less efficient:
CREATE FUNCTION [dbo].[SplitInts_XML]
(
@List VARCHAR(MAX),
@Delimiter CHAR(1)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN ( SELECT Item = CONVERT(INT, Item) FROM (
SELECT Item = x.i.value('(./text())[1]', 'int') FROM (
SELECT [XML] = CONVERT(XML, '<i>' + REPLACE(@List, @Delimiter, '</i><i>')
+ '</i>').query('.') ) AS a CROSS APPLY [XML].nodes('i') AS x(i)) AS y
WHERE Item IS NOT NULL
);
Anyway, once you have a function, you can simply say:
WHERE ID IN (SELECT Item FROM dbo.SplitInts_Numbers(@MyList, ','));
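For instance, a quick way to try the function on its own might look like the sketch below. The variable value and the table name dbo.SomeTable are hypothetical placeholders:

```sql
DECLARE @MyList NVARCHAR(MAX) = N'1,2,3,4,5,6,7';

-- inspect the split result directly: one row per integer in the list
SELECT Item FROM dbo.SplitInts_Numbers(@MyList, N',');

-- use it as the source of an IN list (dbo.SomeTable is a placeholder)
SELECT *
FROM dbo.SomeTable
WHERE ID IN (SELECT Item FROM dbo.SplitInts_Numbers(@MyList, N','));
```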
Passing a string directly into the IN clause is not possible. However, if you are providing the list as a string to a stored procedure, for example, you can use the following dirty method.
First, create this function:
CREATE FUNCTION [dbo].[fnNTextToIntTable] (@Data NTEXT)
RETURNS
@IntTable TABLE ([Value] INT NULL)
AS
BEGIN
DECLARE @Ptr int, @Length int, @v nchar, @vv nvarchar(10)
SELECT @Length = (DATALENGTH(@Data) / 2) + 1, @Ptr = 1
WHILE (@Ptr < @Length)
BEGIN
SET @v = SUBSTRING(@Data, @Ptr, 1)
IF @v = ','
BEGIN
INSERT INTO @IntTable (Value) VALUES (CAST(@vv AS int))
SET @vv = NULL
END
ELSE
BEGIN
SET @vv = ISNULL(@vv, '') + @v
END
SET @Ptr = @Ptr + 1
END
-- If the last number was not followed by a comma, add it to the result set
IF @vv IS NOT NULL
INSERT INTO @IntTable (Value) VALUES (CAST(@vv AS int))
RETURN
END
(Note: this is not my original code, but thanks to versioning systems here at my place of work, I have lost the header comment linking to the source.)
Then use it like so:
SELECT *
FROM tblMyTable
INNER JOIN fnNTextToIntTable(@MyList) AS List ON tblMyTable.ID = List.Value
Or, as in your question:
SELECT *
FROM tblMyTable
WHERE ID IN ( SELECT Value FROM fnNTextToIntTable(@MyList) )