Extract e-mail address from string - sql

I have table called Entities with the column CustomData.
I need to extract the email address from each row.
Also, if the value is null I need to show it as null.
Sample rows from CustomData:
Id CustomData Name
273 [{"Name":"Customer","Value":"test customer"},{"Name":"Address","Value":null},{"Name":"Email","Value":null},{"Name":"Company Name","Value":null},{"Name":"Other Phone","Value":null}] 2323123213
274 [{"Name":"Customer","Value":"Cash Sale"},{"Name":"Address","Value":null},{"Name":"Email","Value":"test@outlook.com"},{"Name":"Company Name","Value":null},{"Name":"Other Phone","Value":null}] 2222222222
This is the string I will be using to update my system.
I have previously managed to select the phone number from this same data, but it was a fixed length. I can't seem to pull out the e-mail address.
I will post a couple of the different methods I have tried so far once I'm back at my PC.

Well, with data this denormalized your only option is to parse it and try to get the email. The most elegant way is to use a JSON parser, but one is not available in SQL Server versions before 2016, so there you have to parse it manually.
Assuming each record for email starts with {"Name":"Email","Value":, you can do it in a few steps:
Find the position of {"Name":"Email","Value": in your string.
Find the first occurrence of } in the remainder of the string to the right.
Get the substring in between.
If that substring equals 'null', return NULL; otherwise return the substring itself.
So it can be done like in this snippet:
declare @data nvarchar(max), @pattern nvarchar(max)
select @data = '[{"Name":"Customer","Value":"test customer"},
{"Name":"Email","Value":"test@outlook.com"},
{"Name":"Company Name","Value":null},
{"Name":"Other Phone","Value":null}]'
select @pattern = '{"Name":"Email","Value":'
-- take the text between the pattern and the next '}' (surrounding double quotes are kept);
-- the literal text null becomes SQL NULL
select nullif(substring(@data,
charindex(@pattern, @data, 0) + len(@pattern),
charindex('}', @data, charindex(@pattern, @data, 0))
- charindex(@pattern, @data, 0) - len(@pattern)
), 'null')
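On SQL Server 2016 or later, built-in JSON functions make the manual parsing unnecessary. What follows is only a sketch under a few assumptions: the database runs at compatibility level 130 or higher (needed for OPENJSON), the table variable @Entities is a stand-in for the real Entities table, and the two sample rows are shortened versions of the ones in the question.

```sql
-- Sketch only: requires SQL Server 2016+ (compatibility level 130 or higher).
-- @Entities is a hypothetical stand-in for the Entities table.
DECLARE @Entities TABLE (Id int, CustomData nvarchar(max));
INSERT INTO @Entities (Id, CustomData) VALUES
    (273, N'[{"Name":"Customer","Value":"test customer"},{"Name":"Email","Value":null}]'),
    (274, N'[{"Name":"Customer","Value":"Cash Sale"},{"Name":"Email","Value":"test@outlook.com"}]');

SELECT e.Id, j.[Value] AS Email
FROM @Entities AS e
OUTER APPLY (
    -- one row per array element, exposing its Name and Value properties
    SELECT [Value]
    FROM OPENJSON(e.CustomData)
         WITH ([Name]  nvarchar(100) '$.Name',
               [Value] nvarchar(400) '$.Value')
    WHERE [Name] = N'Email'
) AS j;
```

A JSON null comes back as SQL NULL, and OUTER APPLY keeps rows that have no Email entry at all, so both cases show up as NULL in the result.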


get sub string in between mix symbols

I want to get a substring; my output should look like gmail, outlook, Skype.
My string values look something like this; as you can see, they have variable length and a mix of the symbols '.' and '@'.
The string values are stored in a column named Mail_ID in the table tbl_Data.
I am using SQL Server 2012.
I use CHARINDEX to get the substring:
select SUBSTRING(Mail_ID, CHARINDEX('@',Mail_ID)+1, (CHARINDEX('.',Mail_ID) - (CHARINDEX('@', Mail_ID)+1)))
from tbl_data
And I want my output like:
Please help me; I am a newbie in SQL Server.
This is my solution. I first get the position of the '@', and then get the position of the '.' in the string prior to it (the '@'). Then I can use those results to get the appropriate substring:
SELECT V.YourString,
SUBSTRING(V.YourString,D.I,A.I - D.I) AS StringPart
FROM (VALUES('abc@gmail.com'),
('cde.nitish@yahoo.com'),
('xyz.vijay@sarvang.com.com'))V(YourString)
CROSS APPLY(VALUES(CHARINDEX('@',V.YourString)))A(I) --Get position of @ to not repeat logic
CROSS APPLY(VALUES(CHARINDEX('.',LEFT(V.YourString,A.I))+1))D(I) --Get position of . to not repeat logic
Note that for a value of 'abc.def.steve@... it would return 'def.steve'; however, we don't have such an example so I don't know what the correct return value would be.
I'm posting this as a new answer, as the OP moved the goal posts from the original answer. My initial answer was based on their original question, not their "new" one, and it seems silly to remove an answer that was correct at the time:
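SELECT V.YourString,
SUBSTRING(V.YourString,A.I, D.I - A.I) AS StringPart
FROM (VALUES('abc@gmail.com'),('cde.nitish@yahoo.com'),('xyz.vijay@sarvang.com.com'))V(YourString)
CROSS APPLY(VALUES(CHARINDEX('@',V.YourString)+1))A(I) --Position just after the @
CROSS APPLY(VALUES(CHARINDEX('.',V.YourString,A.I)))D(I) --First . after the @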
This answers the original version of the question.
This may be simplest with a case expression to detect if there is a period before the '@':
select (case when email like '%.%@%'
then stuff(left(email, charindex('@', email) - 1), 1, charindex('.', email), '')
else left(email, charindex('@', email) - 1)
end)
from (values ('abc@gmail.com'), ('cde.nitish@yahoo.com'), ('xyz.vijay@sarvang.com.com')) v(email)
I created a temp table with your data and wrote the query below; it worked.
And this is the output of my T-SQL:
abc@gmail.com              gmail
cde.nitish@yahoo.com       yahoo
xyz.vijay@sarvang.com.com  sarvang
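The query in the question raises an error whenever a '.' appears before the '@', because the computed length goes negative. One way to avoid that, sketched below, is to start the search for '.' after the '@' and let NULLIF turn a missing separator into NULL. The table variable @tbl_Data is a stand-in for tbl_Data, the column alias Provider is made up, and the last two rows are invented edge cases.

```sql
-- @tbl_Data is a hypothetical stand-in for tbl_Data; the last two rows are made-up edge cases.
DECLARE @tbl_Data TABLE (Mail_ID varchar(100));
INSERT INTO @tbl_Data (Mail_ID) VALUES
    ('abc@gmail.com'),
    ('xyz.vijay@sarvang.com.com'),
    ('no-at-sign.example'),
    ('user@localhost');

SELECT t.Mail_ID,
       -- a NULL start or length makes SUBSTRING return NULL instead of raising an error
       SUBSTRING(t.Mail_ID,
                 NULLIF(a.pos, 0) + 1,
                 NULLIF(d.pos, 0) - a.pos - 1) AS Provider
FROM @tbl_Data AS t
CROSS APPLY (VALUES (CHARINDEX('@', t.Mail_ID))) AS a(pos)             -- position of @, 0 if absent
CROSS APPLY (VALUES (CHARINDEX('.', t.Mail_ID, a.pos + 1))) AS d(pos); -- first . after the @
```

The first two rows give gmail and sarvang; the rows without an '@' or without a '.' after it give NULL rather than failing the whole query.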

Needing to parse out data

I am trying to parse out certain data from a string and I am having issues.
Here is the string:
1=BETA.1.0^2=175^3=812^4=R^5=N^9=1^12=1^13=00032^14=REP NOT FOUND ON REP TABLE, CANNOT INSERT TO REPRGR.^10=107~117~265~1114~3143~3505~3506~3513~5717^11=SA16~1~WY~WY~A~S~20100210~001~SE62^-omitted due to existing Rep Not Found
Here is my query:
SELECT CONVERT(VARCHAR(5000),CHARINDEX('14=',Column)) FROM Table
If you're parsing, can we assume that you don't know what might come after the '^14=', but you need to capture whatever does? So searching for a particular string won't work because anything could come after '^14='. The best approach is to identify the longest reliable specific string that gives you a "foothold" to find the data you're looking for. What you don't want to do is accidentally capture the wrong data if the '^14=' appears more than once in your string. It looks like the '^' is your delimiter, since I don't see one at the start of the string. So you were actually on the right track, you just need to use SUBSTRING as a commenter mentioned. You also need to identify a marker for the end of the error message, which looks like it might be the next occurring '^', correct? Check several samples to be sure of this, and make sure the end marker doesn't at any point exist before your start marker or you'll get an error.
SELECT CAST((SUBSTRING(Column,CHARINDEX('14=',Column,0),CHARINDEX('^',Column,CHARINDEX('14=',Column,0) + 1) - CHARINDEX('14=',Column,0))) AS VARCHAR(5000)) FROM Table
You may need to increment or decrement the start position and end position by doing a +1 or -1 to fully capture your error message. But this should dynamically grab any length error message provided you are positive of your starting and ending markers.
I also have here a table-valued parsing function, where you would pass it the string and the '^' and it will return a table of data with not only the 14=, but everything.
CREATE function [dbo].[fn_SplitStringByDelimeter]
(
    @list nvarchar(4000) -- nvarchar(n) allows at most 4000 characters
    ,@splitOn char(1)
)
returns @rtnTable table
(
    id int identity(1,1)
    ,value nvarchar(100)
)
as
begin
    declare @index int
    declare @string nvarchar(4000)
    select @index = 1
    if len(@list) < 1 or @list is null return
    while @index!= 0
    begin
        -- take everything up to the next delimiter, or the rest of the list if none is left
        set @index = charindex(@splitOn,@list)
        if @index!=0
            set @string = left(@list,@index - 1)
        else
            set @string = @list
        insert into @rtnTable(value) values(@string)
        set @list = right(@list,len(@list) - @index)
        if len(@list) = 0 break
    end
    return
end
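If the server is SQL Server 2016 or later, the built-in STRING_SPLIT function can stand in for a hand-written splitter. A minimal sketch, assuming compatibility level 130 or higher; the sample string is a shortened version of the one in the question:

```sql
-- Assumes SQL Server 2016+ (compatibility level 130 or higher for STRING_SPLIT).
DECLARE @s varchar(5000) =
    '1=BETA.1.0^2=175^13=00032^14=REP NOT FOUND ON REP TABLE, CANNOT INSERT TO REPRGR.^10=107~117';

SELECT STUFF(value, 1, 3, '') AS errorMessage   -- drop the leading '14='
FROM STRING_SPLIT(@s, '^')                      -- one row per ^-delimited argument
WHERE value LIKE '14=%';
```

Because each argument becomes its own row, the filter only matches a piece that begins with 14=, so an argument such as 114= is not picked up by accident.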
It sounds like you're trying to get the value of argument 14. This should do it:
select substring(
someData
, charindex('^14=',someData) + 4
, charindex('^',someData, charindex('^14=',someData) + 4) - charindex('^14=',someData) - 4
) errorMessage
from myData
where charindex('^14=',someData) > 0
and charindex('^',someData, charindex('^14=',someData) + 4) > 0
Try it here: http://sqlfiddle.com/#!18/22f23/2
This gets a substring of the given input.
The substring starts at the first character after the string ^14=; i.e. we get the index of ^14= in the string, then add 4 to it to skip over the matched characters themselves.
The substring ends at the first ^ character after the one in ^14=. We get the index of that character, then subtract the starting position from it to get the length of the desired output.
Caveats: If there is no parameter (^) after ^14= this will not work. Equally if there is no ^14= (even if the string starts 14=) this will not work. From the information available that's OK; but if this is a concern please say and we can provide something to handle that more complex scenario.
Code to create table & populate demo data
create table myData (someData nvarchar(256))
insert myData (someData)
values ('1=BETA.1.0^2=175^3=812^4=R^5=N^9=1^12=1^13=00032^14=REP NOT FOUND ON REP TABLE, CANNOT INSERT TO REPRGR.^10=107~117~265~1114~3143~3505~3506~3513~5717^11=SA16~1~WY~WY~A~S~20100210~001~SE62^-omitted due to existing Rep Not Found')
, ('1xx^14=something else.^10=xx')
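The two caveats above (no ^ after the argument, or the string starting with 14=) can probably be handled by padding the string with a ^ on both ends before searching. The following is a sketch of that idea, not a tested replacement; the table variable and its rows are made up to cover each case.

```sql
-- Made-up rows covering the cases named in the caveats
DECLARE @myData TABLE (someData nvarchar(256));
INSERT INTO @myData (someData) VALUES
    (N'1=A^14=first message^10=x'),   -- 14 in the middle
    (N'14=starts the string^2=y'),    -- 14 is the first argument
    (N'1=A^14=ends the string'),      -- 14 is the last argument
    (N'1=A^2=B');                     -- no argument 14 at all

SELECT d.someData,
       CASE WHEN s.pos > 0
            THEN SUBSTRING(p.padded,
                           s.pos + 4,
                           CHARINDEX('^', p.padded, s.pos + 4) - s.pos - 4)
       END AS errorMessage             -- NULL when there is no argument 14
FROM @myData AS d
CROSS APPLY (VALUES ('^' + d.someData + '^')) AS p(padded)        -- delimiter on both ends
CROSS APPLY (VALUES (CHARINDEX('^14=', p.padded))) AS s(pos);     -- 0 when 14 is absent
```

With the padding in place, every argument is both preceded and followed by a ^, so the same start and end markers work regardless of where argument 14 sits.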
You could try to use a Case When statement with wildcards to find the value that you want.
WHEN x LIKE '%REP Not Found%'
You could use this query (assuming MySQL database):
-- item is the column that contains the string
select SUBSTR(item, LOCATE('REP',item), LOCATE('REPRGR.',item) + LENGTH('REPRGR.') - LOCATE('REP', item)) info_msg from Table;
create table parsetest (item varchar(5000));
insert into parsetest values('1=BETA.1.0^2=175^3=812^4=R^5=N^9=1^12=1^13=00032^14=REP NOT FOUND ON REP TABLE, CANNOT INSERT TO REPRGR.^10=107~117~265~1114~3143~3505~3506~3513~5717^11=SA16~1~WY~WY~A~S~20100210~001~SE62^-omitted due to existing Rep Not Found');
select * from parsetest;
| item |
| 1=BETA.1.0^2=175^3=812^4=R^5=N^9=1^12=1^13=00032^14=REP NOT FOUND ON REP TABLE, CANNOT INSERT TO REPRGR.^10=107~117~265~1114~3143~3505~3506~3513~5717^11=SA16~1~WY~WY~A~S~20100210~001~SE62^-omitted due to existing Rep Not Found |
select SUBSTR(item, LOCATE('REP',item), LOCATE('REPRGR.',item) + LENGTH('REPRGR.') - LOCATE('REP', item)) info_msg from parsetest;
| info_msg |

Using Upper to Capitalize the first letter of City name

I am doing some data clean-up and need to capitalize the first letter of city names. How do I capitalize the second word in a city like Terra Bella?
FROM masterfeelisting
My result is 'Terra bella' and I need 'Terra Bella'. Thanks in advance.
Ok, I know I answered this before, but it bugged me that we couldn't write something efficient to handle an unknown amount of 'text segments'.
So, rethinking it and researching, I discovered a way to change the [MAILCITY] field into XML nodes where each 'text segment' is assigned its own node within the XML field. Then those nodes can be processed one by one, concatenated together, and changed back to a SQL varchar. It's convoluted, but it works. :)
Here's the code:
CREATE TABLE #masterfeelisting (
[MAILCITY] varchar(max) not null
);
INSERT INTO #masterfeelisting VALUES
('terra bellA')
,(' terrA novA ')
,('chicagO ')
,('porT dE sanTo')
,(' porT dE sanTo pallo ');
SELECT
RTRIM(
    (SELECT
        UPPER([xmlField].[xmlNode].value('.', 'char(1)')) +
        LOWER(STUFF([xmlField].[xmlNode].value('.', 'varchar(max)'), 1, 1, '')) + ' '
    FROM [xmlNodeRecordSet].[nodeField].nodes('/N') as [xmlField]([xmlNode]) FOR
    xml path(''), type
    ).value('.', 'varchar(max)')
) as [MAILCITY]
FROM (
    SELECT
    CAST('<N>' + REPLACE([MAILCITY],' ','</N><N>')+'</N>' as xml) as [nodeField]
    FROM #masterfeelisting
) as [xmlNodeRecordSet];
Drop table #masterfeelisting;
First I create a table and fill it with dummy values.
Now here is the beauty of the code:
For each record in #masterfeelisting, we are going to create an xml field with a node for each 'text segment'.
ie. '<N></N><N>terrA</N><N>novA</N><N></N>'
(This is built from the varchar ' terrA novA ')
1) The way this is done is by using the REPLACE function.
The string starts with a '<N>' to designate the beginning of the node. Then:
REPLACE([MAILCITY],' ','</N><N>')
This effectively goes through the whole [MAILCITY] string and replaces each ' ' with '</N><N>'.
Then the string ends with a '</N>', where '</N>' designates the end of each node.
So now we have a beautiful XML string with a couple of empty nodes and the 'text segments' nicely nestled in their own node. All the 'spaces' have been removed.
2) Then we have to CAST the string into xml. And we will name that field [nodeField]. Now we can use xml functions on our newly created record set. (Conveniently named [xmlNodeRecordSet].)
3) Now we can read the [xmlNodeRecordSet] into the main sub-Select by stating:
FROM [xmlNodeRecordSet].[nodeField].nodes('/N')
This tells us we are reading the [nodeField] as nodes with a '/N' delimiter.
This table of node fields is then parsed by stating:
as [xmlField]([xmlNode]) FOR xml path(''), type
This means each [xmlField] will be parsed for each [xmlNode] in the xml string.
4) So in the main sub-select:
Each blank node '<N></N>' is discarded. (Or not processed.)
Each node with a 'text segment' in it will be parsed. ie <N>terrA</N>
UPPER([xmlField].[xmlNode].value('.', 'char(1)')) +
This code will grab each node out of the field, take its contents ('.'), and grab only the first character ('char(1)'). Then it will upper-case that character. The plus sign at the end means it will concatenate this letter with the next bit of code:
LOWER(STUFF([xmlField].[xmlNode].value('.', 'varchar(max)'), 1, 1, ''))
Now here is the beauty... STUFF is a function that will take a string, from a position, for a length, and substitute another string.
STUFF(string, start position, length, replacement string)
So our string is:
[xmlField].[xmlNode].value('.', 'varchar(max)')
Which grabs the whole string inside the current node since it is 'varchar(max)'.
The start position is 1. The length is 1. And the replacement string is ''. This effectively strips off the first character by replacing it with nothing. So the remaining string is all the other characters that we want to have lower case. So that's what we do... we use LOWER to make them all lower case. And this result is concatenated to our first letter that we already upper cased.
But wait... we are not done yet... we still have to append a + ' '. Which adds a blank space after our nicely capitalized 'text segment'. Just in case there is another 'text segment' after this node is done.
This main sub-Select will now parse each node in our [xmlField] and concatenate them all nicely together.
5) But now that we have one big happy concatenation, we still have to change it back from an xml field to a SQL varchar field. So after the main sub-select we need:
.value('.', 'varchar(max)')
This changes our [MAILCITY] back to a SQL varchar.
6) But hold on... we still are not done. Remember we put an extra space at the end of each 'text segment'? Well, the last 'text segment' still has that extra space after it. So we need to right-trim that space off by using RTRIM.
7) And don't forget to rename the final field back to [MAILCITY].
8) And that's it. This code will take an unknown number of 'text segments' and format each one of them. All using the fun of XML and its node parsers.
Hope that helps :)
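To see what steps 1 to 3 produce before any capitalization happens, it may help to run the intermediate pieces on their own. This is only an illustrative sketch; the variable @city and the aliases t, n and x are made up for the example.

```sql
DECLARE @city varchar(max) = ' terrA novA ';

-- Steps 1 and 2: the string rewritten as XML, one <N> element per space-separated piece
SELECT CAST('<N>' + REPLACE(@city, ' ', '</N><N>') + '</N>' AS xml) AS nodeField;

-- Step 3: the same XML shredded into rows; empty elements come back as empty strings
SELECT n.x.value('.', 'varchar(max)') AS segment
FROM (SELECT CAST('<N>' + REPLACE(@city, ' ', '</N><N>') + '</N>' AS xml)) AS t(nodeField)
CROSS APPLY t.nodeField.nodes('/N') AS n(x);
```

One thing to watch for with this technique: a city name containing a character that is special in XML, such as & or <, would make the CAST fail unless it is escaped first.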
Here's one way to handle this using APPLY. Note that this solution supports up to 3 substrings (e.g. "Phoenix", "New York", "New York City") but can easily be updated to handle more.
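DECLARE @string varchar(100) = 'nEW yoRk ciTY';
SELECT
  OldString = @string,
  NewString =
    CASE spaces
      WHEN 0 THEN UPPER(SUBSTRING(string,1,1))+LOWER(SUBSTRING(string,2,8000))
      WHEN 1 THEN UPPER(SUBSTRING(string,1,1))+LOWER(SUBSTRING(string,2,CI1-1)) +
                  UPPER(SUBSTRING(string,CI1+1,1))+LOWER(SUBSTRING(string,CI1+2,8000))
      WHEN 2 THEN UPPER(SUBSTRING(string,1,1))+LOWER(SUBSTRING(string,2,CI1-1)) +
                  UPPER(SUBSTRING(string,CI1+1,1))+LOWER(SUBSTRING(string,CI1+2,CI2-(CI1+1))) +
                  UPPER(SUBSTRING(string,CI2+1,1))+LOWER(SUBSTRING(string,CI2+2,8000))
    END
FROM (SELECT @string, LEN(RTRIM(LTRIM(@string)))-LEN(REPLACE(RTRIM(LTRIM(@string)),' ',''))) base(string, spaces)
CROSS APPLY (SELECT CHARINDEX(char(32), string, 1)) CI1(CI1)
CROSS APPLY (SELECT CHARINDEX(char(32), string, CI1.CI1+1)) CI2(CI2);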
OldString       NewString
--------------- --------------
nEW yoRk ciTY   New York City
This will only capitalize the first letter of the second word. A shorter but less flexible approach. Replace @str with [Mail City].
DECLARE @str AS VARCHAR(50) = 'Los angelas'
SELECT STUFF(@str, CHARINDEX(' ', @str) + 1, 1, UPPER(SUBSTRING(@str, CHARINDEX(' ', @str) + 1, 1)));
This is a way to use embedded SELECTs for three city name parts.
It uses CHARINDEX to find the location of your separator character (i.e. a space).
I put an 'if' structure around the Select to test if you have any records with more than 3 parts to the city name. If you ever get the warning message, you could add another sub-Select to handle another city part.
Although... just to be clear... SQL is not the best language to do complicated formatting. It was written as a data retrieval engine with the idea that another program will take that data and massage it into a friendlier look and feel. It may be easier to handle the formatting in another program. But if you insist on using SQL and you need to account for city names with 5 or more parts... you may want to consider using Cursors so you can loop through the variable possibilities. (But Cursors are not a good habit to get into. So don't do that unless you've exhausted all other options.)
Anyway, the following code creates and populates a table so you can test the code and see how it works. Enjoy!
CREATE TABLE #masterfeelisting (
[MAILCITY] varchar(30) not null
);
Insert into #masterfeelisting select 'terra bella';
Insert into #masterfeelisting select ' terrA novA ';
Insert into #masterfeelisting select 'chicagO ';
Insert into #masterfeelisting select 'bostoN';
Insert into #masterfeelisting select 'porT dE sanTo';
--Insert into #masterfeelisting select ' porT dE sanTo pallo ';
Declare @intSpaceCount as integer;
SELECT @intSpaceCount = max (len(RTRIM(LTRIM([MAILCITY]))) - len(replace([MAILCITY],' ',''))) FROM #masterfeelisting;
if @intSpaceCount > 2
SELECT 'You need to account for more than 3 city name parts ' as Warning, @intSpaceCount as SpacesFound;
cThird.[MAILCITY1] + cThird.[MAILCITY2] + cThird.[MAILCITY3] as [MAILCITY]
bSecond.[MAILCITY1] as [MAILCITY1]
,SUBSTRING(bSecond.[MAILCITY2],1,bSecond.[intCol2]) as [MAILCITY2]
,UPPER(SUBSTRING(bSecond.[MAILCITY2],bSecond.[intCol2] + 1, 1)) +
SUBSTRING(bSecond.[MAILCITY2],bSecond.[intCol2] + 2,LEN(bSecond.[MAILCITY2]) - bSecond.[intCol2]) as [MAILCITY3]
SUBSTRING(aFirst.[MAILCITY],1,aFirst.[intCol1]) as [MAILCITY1]
,UPPER(SUBSTRING(aFirst.[MAILCITY],aFirst.[intCol1] + 1, 1)) +
SUBSTRING(aFirst.[MAILCITY],aFirst.[intCol1] + 2,LEN(aFirst.[MAILCITY]) - aFirst.[intCol1]) as [MAILCITY2]
,CHARINDEX ( ' ', SUBSTRING(aFirst.[MAILCITY],aFirst.[intCol1] + 1, LEN(aFirst.[MAILCITY]) - aFirst.[intCol1]) ) as intCol2
,CHARINDEX ( ' ', RTRIM(LTRIM(mstr.[MAILCITY]))) as intCol1
#masterfeelisting as mstr -- Initial Master Table
) as aFirst -- First Select Shell
) as bSecond -- Second Select Shell
) as cThird; -- Third Select Shell
Drop table #masterfeelisting;

using PARSENAME to find the last item in a list

I am using Parsename in SQL and would like to extract the last element in a list of items. I am using the following code.
Declare @string as varchar(1000)
set @string = ''
This works and returns the value 28 as I expect. However if I expand my list past more than 4 items then the result returns a NULL. For example:
Declare @string2 as varchar(1000)
set @string2 = ''
I would expect this to return a value of 29 however only NULL is returned
I'm sure there is a simple explanation for this; can anyone help?
PARSENAME is designed specifically to parse an sql object name. The number of periods in the latter example exempt it from being such a name so the call correctly fails.
select right(@string2, charindex('.', reverse(@string2), 1) - 1)
PARSENAME ( 'object_name' , object_piece )
Is the name of the object for which to retrieve the specified object part.
This name can have four parts: the server name, the database name, the owner name, and the object name.
If we give more than 4 parts, it will always return null.
For Ref: http://msdn.microsoft.com/en-us/library/ms188006.aspx
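The behaviour described in these answers can be checked side by side with a short sketch. The two dotted strings are assumed values, chosen only so that the last items are 28 and 29 as in the question; the original strings are not shown there.

```sql
-- The two strings are assumed sample values, not the ones from the question.
DECLARE @four varchar(1000) = '25.26.27.28',
        @five varchar(1000) = '25.26.27.28.29';

SELECT PARSENAME(@four, 1) AS FourParts,   -- four parts: a valid object-name shape, last part returned
       PARSENAME(@five, 1) AS FiveParts,   -- five parts: not a valid object name, so NULL
       -- appending '.' to the reversed string keeps this working when there is no period at all
       RIGHT(@five, CHARINDEX('.', REVERSE(@five) + '.') - 1) AS LastItem;
```

The REVERSE-based expression has no limit on the number of parts, which is why it is the safer choice for arbitrary delimited lists.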

SQL strip text and convert to integer

In my database (SQL 2005) I have a field which holds a comment but in the comment I have an id and I would like to strip out just the id, and IF possible convert it to an int:
activation successful of id 1010101
The line above is the exact structure of the data in the db field.
And no I don't want to do this in the code of the application, I actually don't want to touch it, just in case you were wondering ;-)
This should do the trick:
SELECT SUBSTRING(column, PATINDEX('%[0-9]%', column), 999)
FROM table
Based on your sample data, this assumes that there is only one occurrence of an integer in the string and that it is at the end.
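If some comments might not follow that pattern, the PATINDEX approach can be wrapped in a check before converting to int. A rough sketch, with a made-up table variable and made-up rows; it uses only features that should be available on SQL Server 2005, and it does not guard against numbers too large for an int.

```sql
-- @t and its rows are hypothetical test data.
DECLARE @t TABLE (comment varchar(255));
INSERT INTO @t (comment) VALUES ('activation successful of id 1010101');
INSERT INTO @t (comment) VALUES ('activation failed, no id recorded');
INSERT INTO @t (comment) VALUES ('id 12 retried 3 times');

SELECT t.comment,
       -- convert only when digits exist and nothing but digits follows the first one
       CASE WHEN p.pos > 0
             AND SUBSTRING(t.comment, p.pos, 255) NOT LIKE '%[^0-9]%'
            THEN CONVERT(int, SUBSTRING(t.comment, p.pos, 255))
       END AS id
FROM @t AS t
CROSS APPLY (SELECT PATINDEX('%[0-9]%', t.comment)) AS p(pos);  -- position of the first digit, 0 if none
```

Rows that contain no digits, or that have other text after the first digit, come back as NULL instead of raising a conversion error.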
I don't have a means to test it at the moment, but:
-- LEN ignores the trailing space in the prefix, so the number starts at LEN(prefix) + 2
select convert(int, substring(fieldName, len('activation successful of id ') + 2, len(fieldName))) from tableName
Would you be open to writing a bit of code? One option, create a CLR User Defined function, then use Regex. You can find more details here. This will handle complex strings.
If your above line is always formatted as 'activation successful of id #######', with your number at the end of the field, then:
declare @myColumn varchar(100)
set @myColumn = 'activation successful of id 1010102'
select
@myColumn as [OriginalColumn]
, CONVERT(int, REVERSE(LEFT(REVERSE(@myColumn), CHARINDEX(' ', REVERSE(@myColumn))))) as [DesiredColumn]
Will give you:
OriginalColumn DesiredColumn
---------------------------------------- -------------
activation successful of id 1010102 1010102
(1 row(s) affected)
select cast(right(column_name,charindex(' ',reverse(column_name))) as int)
-- Test table, you will probably use some query
DECLARE @testTable TABLE(comment VARCHAR(255))
INSERT INTO @testTable(comment)
VALUES ('activation successful of id 1010101')
-- Use Charindex to find "id " then isolate the numeric part
-- Finally check to make sure the number is numeric before converting
SELECT CASE WHEN ISNUMERIC(justnumber) = 1 THEN CAST(justnumber AS INT) END AS id
FROM (
select right(comment, len(comment) - charindex('id ', comment)-2) as justnumber
from @testtable) TT
I would also add that this approach is more set based and hence more efficient for a bunch of data values. But it is super easy to do it just for one value as a variable. Instead of using the column comment you can use a variable like @chvComment.
If the comment string is EXACTLY like that you can use replace.
select replace(comment_col, 'activation successful of id ', '') as id from ....
It almost certainly won't be though - what about unsuccessful Activations?
You might end up with nested replace statements
select replace(replace(comment_col, 'activation not successful of id ', ''), 'activation successful of id ', '') as id from ....
[sorry can't tell from this edit screen if that's entirely valid sql]
That starts to get messy; you might consider creating a function and putting the replace statements in that.
If this is a one-off job, it won't really matter. You could also use a regex, but that's quite slow (and in any case means you now have 2 problems).