SQL Server : to get last 4 character in first column and get the first letter of the words in 2nd column but ignore non alphabets - sql

Can I check will it be possible to run SQL with this requirement? I trying to get a new value for new column from these 2 existing columns ID and Description.
For ID, simply retrieve last 4 characters
For Description, would like to get the first alphabets for each word but ignore the numbers & symbols.

SQL Server has lousy string processing capabilities. Even split_string() doesn't preserve the order of the words that it finds.
One approach to this uses a recursive CTE to split the strings and accumulate the initials:
with t as (
select v.*
from (values (2004120, 'soccer field 2010'), (2004121, 'ruby field')) v(id, description)
),
cte as (
select id, description, convert(varchar(max), left(description, charindex(' ', description + ' '))) as word,
convert(varchar(max), stuff(description, 1, charindex(' ', description + ' ') , '')) as rest,
1 as lev,
(case when description like '[a-zA-Z]%' then convert(varchar(max), left(description, 1)) else '' end) as inits
from t
union all
select id, description, convert(varchar(max), left(rest, charindex(' ', rest + ' '))) as word,
convert(varchar(max), stuff(rest, 1, charindex(' ', rest + ' ') , '')) as rest,
lev + 1,
(case when rest like '[a-zA-Z]%' then convert(varchar(max), inits + left(rest, 1)) else inits end) as inits
from cte
where rest > ''
)
select id, description, inits + right(id, 4)
from (select cte.*, max(lev) over (partition by id) as max_lev
from cte
) cte
where lev = max_lev;
Here is a db<>fiddle.

To get the last 4 numbers of the ID you could use:
SELECT Id%10000 as New_Id from Tablename;
To get the starting of each Word you could use(letting the answer be String2):
LEFT(Description,1)
This is equivalent to using SUBSTRING(Description,1,1)
This helps you get the first letter of each word.
To concatenate both of them you could use the CONCAT function:
SELECT CONCAT(String2,New_Id)
See more on the CONCAT function here

Related

Retrieve email ids from multiple full Email Address using SQL query

Using SQL-server
Am trying to retrieve email id from list of full email addresses
Example
Input
"ABCD"<abcd#gmail.com>;"TEST" <test#outlook.com>;
Needed output
abcd#gmail.com,test#outlook.com
i have tried the below query but it is giving only one email id not all.Some help will be greatly appreciated.
SELECT (CASE WHEN to_address LIKE '%<%'
THEN SUBSTRING(to_address, CHARINDEX('<',to_address)+1, CHARINDEX('>',to_address) - CHARINDEX('<',to_address)-1)
ELSE to_address
END) AS ToAddress
from test
In SQL Server, you can use string_split():
select replace(stuff(s.value, 1, charindex('<', s.value), ''), '>', '')
from t cross apply
string_split(t.list, ';') s
If you want to re-concatenate them:
select s.emails
from t cross apply
(select string_agg(replace(stuff(s.value, 1, charindex('<', s.value), ''), '>', ''), ';') as email
from string_split(t.list, ';') s
) s;
EDIT:
Another approach which works with earlier versions of SQL Server is a recursive CTE:
with cte as (
select convert(varchar(max), null) as email,
convert(varchar(max), emails) as rest,
convert(varchar(max), '') as output_emails, 0 as lev
from (values ('"ABCD"<abcd#gmail.com>;"TEST" <test#outlook.com>; ')
) v(emails)
union all
select replace(stuff(v.el, 1, charindex('<', v.el), ''), '>;', ''),
stuff(rest, 1, charindex(';', rest), ''),
concat(output_emails,
replace(stuff(v.el, 1, charindex('<', v.el), ''), '>;', ''),
';'),
lev + 1
from cte cross apply
(values (left(rest, charindex(';', rest)))) v(el)
where rest <> '' and lev < 5
)
select output_emails
from (select cte.*, row_number() over (order by lev desc) as seqnum
from cte
) cte
where seqnum = 1;
Here is a db<>fiddle.

How to split names into first/middle/last if there are people who typed last/first/middle? Order known

I am trying to split names into first, middle, last, based on an indicated order. I am at a lost on how to do this, and any help would be much appreciated. I am using sql server 2008 for work.
I attached an example dataset and the ideal dataset I would like to create.
ID ORDER NAME
1 first, middle, last Bruce, Batman, Wayne
2 middle, last, first Superman, Kent, Clark
3 last, first, middle Prince, Diana, Wonderwoman
INTO:
ID ORDER NAME
1 first Bruce
1 middle Batman
1 last Wayne
2 middle Superman
2 last Kent
2 first Clark
3 last Prince
3 first Diana
3 middle Wonderwoman
SQL Server does not have very good string processing functions. You can do this using a recursive CTE, though:
with cte as (
select id,
convert(varchar(max), left(ord, charindex(',', ord) - 1)) as ord,
convert(varchar(max), left(name, charindex(',', name) - 1)) as name,
convert(varchar(max), stuff(ord, 1, charindex(',', ord) + 1, '')) as ord_rest,
convert(varchar(max), stuff(name, 1, charindex(',', name) + 1, '')) as name_rest,
1 as lev
from t
union all
select id,
convert(varchar(max), left(ord_rest, charindex(',', ord_rest + ',') - 1)) as ord,
convert(varchar(max), left(name_rest, charindex(',', name_rest + ',') - 1)) as name,
convert(varchar(max), stuff(ord_rest, 1, charindex(',', ord_rest + ',') + 1, '')) as ord_rest,
convert(varchar(max), stuff(name_rest, 1, charindex(',', name_rest + ',') + 1, '')) as name_rest,
lev + 1
from cte
where ord_rest <> '' and lev < 10
)
select id, ord, name
from cte
order by id, lev
Here is a db<>fiddle.
With the help of a parse/split function that returns the sequence, this becomes a small matter using a CROSS APPLY
Example
Select A.ID
,B.*
From YourTable A
Cross Apply (
Select [Order] = B1.RetVal
,[Name] = B2.RetVal
From [dbo].[tvf-Str-Parse]([ORDER],',') B1
Join [dbo].[tvf-Str-Parse]([NAME] ,',') B2 on B1.RetSeq=B2.RetSeq
) B
Returns
ID Order Name
1 first Bruce
1 middle Batman
1 last Wayne
2 middle Superman
2 last Kent
2 first Clark
3 last Prince
3 first Diana
3 middle Wonderwoman
The Function if Interested
CREATE FUNCTION [dbo].[tvf-Str-Parse] (#String varchar(max),#Delimiter varchar(10))
Returns Table
As
Return (
Select RetSeq = row_number() over (order by 1/0)
,RetVal = ltrim(rtrim(B.i.value('(./text())[1]', 'varchar(max)')))
From ( values (cast('<x>' + replace((Select replace(#String,#Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml).query('.'))) as A(x)
Cross Apply x.nodes('x') AS B(i)
);
I found the other answers a bit hard to follow - they're neat tricks for sure but I think anyone coming to maintain them might be like "whaaaat?". Here I work out the indexes of the commas (the first comma string index goes in o1/n1, the second comma goes in o2/n2) in the first cte, cut the string up (substring between 1 and first comma, substring between first and second comma, substring after third comma) in the second cte and then use a couple of unions to turn the results from 7 columns into 3
WITH idxs AS
(
SELECT
id,
order,
name,
CHARINDEX(',', [order]) as o1,
CHARINDEX(',', [order], CHARINDEX(',', [order]) + 1) as o2,
CHARINDEX(',', name) as n1,
CHARINDEX(',', name, CHARINDEX(',', name) + 1) as n2
FROM
t
),
cuts as (
SELECT
id,
SUBSTRING([order], 1, o1-1) as ord1,
SUBSTRING([order], o1+1, o2-o1-1) as ord2,
SUBSTRING([order], o2+1, 4000) as ord3,
SUBSTRING(name, 1, n1-1) as nam1,
SUBSTRING(name, n1+1, n2-n1-1) as nam2,
SUBSTRING(name, n2+1, 4000) as nam3
FROM
idxs
)
SELECT id, ord1 as [order], nam1 as name FROM cuts
UNION ALL
SELECT id, ord2, nam2 FROM cuts
UNION ALL
SELECT id, ord3, nam3 FROM cuts
Note that if your data sometimes has spaces in and sometimes does not you'll benefit from using either LTRIM/RTRIM in the output
if the spaces are always there after a comma, you could also adjust the substring indexes to cut the spaces out (any start index that is x+1 would be x+2 and the length would hence have to be -2)

SQL - re.split: BY comma THEN space

I want to split the following string into City, Province, Postal Code.
Thank you so much for the help!!!!
Description: Split by comma, then split by space only once.
A = 'Vaughan, ON L6D 9X0'
Result:
(Vaughan, ON, L6D9X0)
Attempt:
re.split(',|/s[1]', A)
Try This,
This selected the values into rows, maybe you can pivot the cte C2 same in order to get it as columns
WITH CTE
AS
(
SELECT
1 Seq,
'Vaughan, ON L6D 9X0' "Txt"
UNION ALL
SELECT
Seq+1 "Seq",
Txt = CASE WHEN Txt LIKE '% %'
THEN LTRIM(RTRIM(SUBSTRING(Txt,CHARINDEX(' ',Txt),LEN(Txt))))
ELSE NULL END
FROM CTE
WHERE ISNULL(txt,'')<>''
),C2
AS
(
SELECT
CASE Seq
WHEN 1 THEN 'City'
WHEN 2 THEN 'Province'
ELSE 'PostalCode' END "Head",
CASE Seq
WHEN 1
THEN SUBSTRING(Txt,1,CHARINDEX(',',Txt)-1)
WHEN 2
THEN SUBSTRING(Txt,1,CHARINDEX(' ',Txt)-1)
ELSE Txt END "Txt"
FROM CTE
WHERE Seq<4
)
SELECT
*
FROM C2;
If you have multiple rows that need to be parsed in the same way then, you can give the same in the select statement of 1st CTE and in that case, the logic for Seq on the first select might need to be changed. As per the above, the output will be as follows
With Postgres you can do that using:
select split_part(a, ',', 1) as city,
left(trim(split_part(a,',',2)), strpos(trim(split_part(a,',',2)), ' ')),
substr(trim(split_part(a,',',2)) as province, strpos(trim(split_part(a,',',2)), ' ') + 1) as postal_code
from the_table;
This can be made a bit more readable by using a derived table:
select city,
left(second_part, strpos(second_part, ' ')) as province,
substr(second_part, strpos(second_part, ' ') + 1) as postal_code
from (
select split_part(a, ',', 1) as city,
trim(split_part(a, ',', 2)) as second_part
from the_table
) t
IF, You are working with Microsoft SQL Server, then you could use SUBSTRING() &CHARINDEX() function to find specific Char index to split the string values.
SELECT SUBSTRING(<column>, 1, CHARINDEX(',', <column>)-1) City,
SUBSTRING(SUBSTRING(<column>, CHARINDEX(' ', <column>)+1, LEN(<column>)), 1, CHARINDEX(' ', SUBSTRING(<column>, CHARINDEX(' ', <column>)+1, LEN(<column>)))) Province,
SUBSTRING(SUBSTRING(<column>, CHARINDEX(' ', <column>)+1, LEN(<column>)), CHARINDEX(' ', SUBSTRING(<column>, CHARINDEX(' ', <column>)+1, LEN(<column>)))+1, LEN(<column>)) [Postal Code];
Desired Output :
City Province Postal Code
Vaughan ON L6D 9X0

How to get middle portion from Sql server table data?

I am trying to get First name from employee table, in employee table full_name is like this: Dow, Mike P.
I tried with to get first name using below syntax but it comes with Middle initial - how to remove middle initial from first name if any. because not all name contain middle initial value.
-- query--
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
len(Employee_First_Name)) AS FirstName
---> remove middle initial from right side from employee
-- result
Full_name Firstname Dow,Mike P. Mike P.
--few example for Full_name data---
smith,joe j. --->joe (need result as)
smith,alan ---->alan (need result as)
Instead of specifying the len you need to use charindex again, but specify that you want the second occurrence of a space.
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
CHARINDEX(' ', Employee_First_Name, 2)) AS FirstName
One thing to note, the second charindex can return 0 if there is no second occurence. In that case, you would want to use something like the following:
select Employee_First_Name as full_name,
SUBSTRING(
Employee_First_Name,
CHARINDEX(',', Employee_First_Name) + 1,
IIF(CHARINDEX(' ', Employee_First_Name, 2) = 0, Len(Employee_First_name), CHARINDEX(' ', Employee_First_Name, 2))) AS FirstName
This removes the portion before the comma.. then uses that string and removes everything after space.
WITH cte AS (
SELECT *
FROM (VALUES('smith,joe j.'),('smith,alan'),('joe smith')) t(fullname)
)
SELECT
SUBSTRING(
LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname))),
0,
COALESCE(NULLIF(CHARINDEX(' ',LTRIM(SUBSTRING(fullname,CHARINDEX(',',fullname) + 1,LEN(fullname)))),0),LEN(fullname)))
FROM cte
output
------
joe
alan
joe
To be honest, this is most easily expressed using multiple levels of logic. One way is using outer apply:
select ttt.firstname
from t outer apply
(select substring(t.full_name, charindex(', ', t.full_name) + 2, len(t.full_name) as firstmi
) tt outer apply
(select (case when tt.firstmi like '% %'
then left(tt.firstmi, charindex(' ', tt.firstmi)
else tt.firstmi
end) as firstname
) as ttt
If you want to put this all in one complicated statement, I would suggest a computed column:
alter table t
add firstname as (stuff((case when full_name like '%, % %.',
then left(full_name,
charindex(' ', full_name, charindex(', ', full_name) + 2)
)
else full_name
end),
1,
charindex(', ', full_name) + 2,
'')
If format of this full_name field is the same for all rows, you may utilize power of SQL FTS word breaker for this task:
SELECT N'Dow, Mike P.' AS full_name INTO #t
SELECT display_term FROM #t
CROSS APPLY sys.dm_fts_parser(N'"' + full_name + N'"', 1033, NULL, 1) p
WHERE occurrence = 2
DROP TABLE #t

Can the Select list in a SQL Statement use Regular Expressions

I have a SQL statement,
select ColumnName from Table
And I get this result,
Error 192.168.1.67 UserName 0bce6c62-1efb-416d-bce5-71c3c8247b75 An existing ....
So anyway the field has a lot of stuff in it, I just want to get out the 'UserName'.
Can I use a regex for that?
I mean it would be kind of like this,
select SUBSTRING(ColumnName, 0, 5) from Table
Except the SUBSTRING would be replaced with a regex of some kind. I am comfortable with regex, but I am not sure how to apply it in this case, or even if you can.
If I could get this working it would be great because I plan to pull the data into a temporary table, and do some quite complicated things matching it with other tables etc. If I can get this all working it would save me writing a C# app to do it with.
Thanks.
No, out of the box, SQL Server doesn't support regexs.
You could retrofit those by means of a SQL-CLR assembly that you deploy into SQL Server.
I think going you should use SUBSTRING anyway. Using regular expression is more flexible but also lead to a large processing overhead. This becomes even worse if your have to process a large recordsets.
You have to justify if there's the need for flexibility in first place.
If so you should read about it here:
http://msdn.microsoft.com/en-us/magazine/cc163473.aspx
Using T-SQL only can look like that:
SELECT 'Error 192.168.1.67 XUserNameX 0bce6c62-1efb-416d-bce5-71c3c8247b75 An existing' expr
INTO log_table
GO
WITH
split1 (expr, cstart, cend)
AS (
SELECT
expr, 1, 0
FROM
log_table a
), split2 (expr, cstart, cend, div)
AS (
SELECT
a.expr, a.cend + 1, CHARINDEX(' ', a.expr, a.cend + 1), 1
FROM
split1 a
UNION ALL
SELECT
a.expr, a.cend + 1, CHARINDEX(' ', a.expr, a.cend + 1), div+1
FROM
split2 a
WHERE
a.cend > 1
), substrings(expr, div)
AS (
SELECT
SUBSTRING(expr, cstart, cend - cstart), div
FROM
split2
)
SELECT expr from
substrings a
where
a.div = 3
UPDATE
we cannot tell where the start of the
username is. Unless we can say 'find
me the start character after the
second space'
That is fairly straightforward:
Filter out strings that have fewer than
two spaces (alternatively, have three
or more words);
Find the position after the first
space (alternatively, the beginning
of the second word);
Find the position after the the first
space after the first space
(alternatively, the beginning of the
third word);
Determine the length of the third
word using the position of the next
space (or the end of the string is
there are only three words);
Use the above values with the
SUBSTRING() function to return the
third word.
Example:
WITH MyTable (ColumnName)
AS
(
SELECT NULL
UNION ALL
SELECT ''
UNION ALL
SELECT 'One.'
UNION ALL
SELECT 'Two words.'
UNION ALL
SELECT 'Three word sentence.'
UNION ALL
SELECT 'Sentence containing four words.'
UNION ALL
SELECT 'Five words in this sentence.'
UNION ALL
SELECT 'Sentence containing more than five words.'
),
AtLeastThreeWords (ColumnName, pos_word_2_start)
AS
(
SELECT M1.ColumnName, CHARINDEX(' ', M1.ColumnName) + LEN(' ') + 1
FROM MyTable AS M1
WHERE LEN(M1.ColumnName) - LEN(REPLACE(M1.ColumnName, ' ', '')) >= 2
),
MyTable2 (ColumnName, pos_word_3_start)
AS
(
SELECT M1.ColumnName,
CHARINDEX(' ', M1.ColumnName, pos_word_2_start) + LEN(' ') + 1
FROM AtLeastThreeWords AS M1
),
MyTable3 (ColumnName, pos_word_3_start, pos_word_3_end)
AS
(
SELECT M1.ColumnName, M1.pos_word_3_start,
CHARINDEX(' ', M1.ColumnName, pos_word_3_start) + LEN(' ')
FROM MyTable2 AS M1
),
MyTable4 (ColumnName, pos_word_3_start, word_3_length)
AS
(
SELECT M1.ColumnName, M1.pos_word_3_start,
CASE
WHEN pos_word_3_start < pos_word_3_end
THEN pos_word_3_end - pos_word_3_start
ELSE LEN(M1.ColumnName) - pos_word_3_start + 1
END
FROM MyTable3 AS M1
)
SELECT M1.ColumnName,
SUBSTRING(M1.ColumnName, pos_word_3_start, word_3_length)
AS word_3
FROM MyTable4 AS M1;
ORIGINAL ANSWER:
Is the problem that the position and/or length of the username value may not be constant in the data but always follows the string 'username '? If so, you can use CHARINDEX with SUBSTRING e.g.
WITH MyTable (ColumnName)
AS
(
SELECT 'Error 192.168.1.67 UserName 0bce6c62-1efb-416d-bce5-71c3c8247b75 An existing ....'
UNION ALL
SELECT 'Username onedaywhen is invalid'
),
MyTable1 (ColumnName, pos1)
AS
(
SELECT M1.ColumnName, CHARINDEX('UserName ', M1.ColumnName) + LEN('UserName ') + 1
FROM MyTable AS M1
),
MyTable2 (ColumnName, pos1, pos2)
AS
(
SELECT M1.ColumnName, M1.pos1,
CHARINDEX(' ', M1.ColumnName, pos1) - M1.pos1
FROM MyTable1 AS M1
)
SELECT SUBSTRING(M1.ColumnName, M1.pos1, M1.pos2)
FROM MyTable2 AS M1;
...though you'd need to make it more robust e.g. when there is no trailing space after the username value etc.