Get Rightmost Pair of Letters from String in SQL

Given a field with combinations of letters and numbers, is there a way to get the last (Rightmost) pair of letters (2 letters) in SQL?
Sample data:
RT34-92837DF82982
DRE3-9292928373DO
For those, I would want DF and DO.
For clarity, there will only be numbers after these letters.
Edit: This is for SQL Server.

I would remove any characters that aren't letters, using REGEXP_REPLACE or a similar function, depending on your DBMS.
regexp_replace(col1, '[^a-zA-Z]+', '')
Then use a RIGHT or SUBSTRING function to select the rightmost two letters.
right(regexp_replace(col1, '[^a-zA-Z]+', ''), 2)
substring(regexp_replace(col1, '[^a-zA-Z]+', ''), len(regexp_replace(col1, '[^a-zA-Z]+', '')) - 1, 2)
If single letters can occur ('DF1234A124'), you could change the regex pattern to remove those as well: ([^a-zA-Z][a-zA-Z][^a-zA-Z])|[^a-zA-Z]
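To sanity-check the strip-then-take-right idea outside the database, here is a minimal Python sketch. It uses Python's re.sub in place of REGEXP_REPLACE, and the function name last_letter_pair is made up for illustration. Regex dialects differ between engines, so treat it as a check of the logic rather than of any particular DBMS.

```python
import re

def last_letter_pair(value: str) -> str:
    """Drop every non-letter, then keep the last two characters."""
    # Mirrors regexp_replace(col1, '[^a-zA-Z]+', '') with all matches removed.
    letters_only = re.sub(r"[^a-zA-Z]+", "", value)
    # Mirrors right(..., 2).
    return letters_only[-2:]

for sample in ("RT34-92837DF82982", "DRE3-9292928373DO"):
    print(sample, "->", last_letter_pair(sample))
```

With the two sample values from the question this gives DF and DO.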

Since, as you said, there will only be numbers after these letters, you can use the TRIM and RIGHT functions as follows:
select Right(Trim('0123456789' from val), 2) as res
from t
Note: This works in SQL Server 2017 and later.
For older versions, try the following:
select
Left
(
Right(val, PATINDEX('%[A-Z]%', Reverse(val))+1),
2
) as res
from t
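The index arithmetic in the older-version query is easy to misread, so here is a small Python trace of the same steps. The helper names are hypothetical, and the [A-Za-z] class assumes the column uses a case-insensitive collation, under which [A-Z] in PATINDEX also matches lowercase letters.

```python
import re

def patindex_first_letter(text: str) -> int:
    """1-based position of the first letter, or 0 if there is none."""
    match = re.search(r"[A-Za-z]", text)
    return match.start() + 1 if match else 0

def rightmost_pair(val: str) -> str:
    # PATINDEX('%[A-Z]%', REVERSE(val)): distance of the last letter from the end.
    pos = patindex_first_letter(val[::-1])
    # RIGHT(val, pos + 1): the last letter, the character before it, and any trailing digits.
    tail = val[-(pos + 1):]
    # LEFT(..., 2): keep only the two leading characters of that tail.
    return tail[:2]

print(rightmost_pair("RT34-92837DF82982"))
print(rightmost_pair("DRE3-9292928373DO"))
```

For RT34-92837DF82982 the reversed string has its first letter at position 6, so the tail is the last 7 characters (DF82982) and the result is DF.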

Related

Split and Concat String on SQL and SSIS

I am trying to split and concat a string.
Example: Data value1: "12abc,34efg,56hij"
Data value2: "12abc"
Expected result:
Numbers Column 1: "12,34,56"
Numbers Column 2: "12"
Alphabets Column 1: "abc,efg,hij"
Alphabets Column 2 "abc"
Several attempts made:
1.
SELECT [String], value, CONCAT(SUBSTRING(value,1,2), ',') AS Numbers, CONCAT(SUBSTRING(value,3,3), ',') AS Alphabets, LEFT(String,LEN(String)-CHARINDEX(',',String))
FROM [Test].[dbo].[TEST]
CROSS APPLY string_split([String],',') value
WHERE String = String
2.
SELECT [String], LEFT(String,LEN(String)-CHARINDEX(',',String)), LEFT(String,2) AS Numbers, RIGHT(STRING,3) AS Alphabets
FROM [Test].[dbo].[TEST]
WHERE String = String
I followed [How to split a string after specific character in SQL Server and update this value to specific column] because I thought it was quite similar, but I did not get the results I wanted, so I do not know how to proceed or where I went wrong.
I am unsure of how to concatenate different columns into 1 column.
Additional info:
I am currently using SQL Server Management Studio v18.9.2.
*Apologies if my explanation is horrible.
Firstly, let's get to the point: your design is flawed. Never store delimited data in your database; it breaks the fundamental rules of normalisation. I strongly suggest that you fix your design and normalise your data.
Next, the assumptions:
You are using SQL Server 2017+
The column string can only contain alphanumerical characters (A-z, 0-9)
You are using a case insensitive collation or all characters are lowercase
If this is the case, then you can just use TRANSLATE and REPLACE to remove the characters. You'll need to create some variables (or use the tally inline) to create the replacement strings first.
So, firstly, we get the 2 variables we need: one containing the letters a-z, and the other containing the digits 0-9. I use a tally to achieve this:
DECLARE @Alphas varchar(26),
@Numerics varchar(10);
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (26)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3)
SELECT @Alphas = STRING_AGG(CHAR(96 + T.I),''),
@Numerics = STRING_AGG(CASE WHEN T.I <= 10 THEN CHAR(47+T.I) END,'')
FROM Tally T;
Now we can use those values to TRANSLATE all those characters to a different character (I'm going to use a pipe (|)) and then REPLACE those pipe characters with nothing:
SELECT YT.String,
REPLACE(TRANSLATE(YT.String, @Alphas,REPLICATE('|',LEN(@Alphas))),'|','') AS Numerics,
REPLACE(TRANSLATE(YT.String, @Numerics,REPLICATE('|',LEN(@Numerics))),'|','') AS Alphas
FROM dbo.YourTable YT;
Or, of course, you could just type it out. ;)
SELECT YT.String,
REPLACE(TRANSLATE(YT.String, 'abcdefghijklmnopqrstuvwxyz',REPLICATE('|',LEN('abcdefghijklmnopqrstuvwxyz'))),'|','') AS Numerics,
REPLACE(TRANSLATE(YT.String, '0123456789',REPLICATE('|',LEN('0123456789'))),'|','') AS Alphas
FROM dbo.YourTable YT;
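The TRANSLATE-then-REPLACE trick can be traced in a few lines of Python. This is only a sketch of the idea, not T-SQL: strip_chars is a hypothetical helper, and like the SQL version it assumes the placeholder character (a pipe here) never appears in the data.

```python
import string

def strip_chars(text: str, unwanted: str, placeholder: str = "|") -> str:
    # TRANSLATE step: map every unwanted character to the placeholder.
    table = str.maketrans(unwanted, placeholder * len(unwanted))
    # REPLACE step: delete the placeholder characters.
    return text.translate(table).replace(placeholder, "")

value = "12abc,34efg,56hij"
numerics = strip_chars(value, string.ascii_lowercase)  # letters removed
alphas = strip_chars(value, string.digits)             # digits removed
print(numerics)  # 12,34,56
print(alphas)    # abc,efg,hij
```

The commas survive because they are in neither character list, which is why the delimited shape of the input is preserved.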
You can CROSS APPLY to a subquery that runs STRING_SPLIT and then uses STRING_AGG (available since SQL Server 2017) to stick the numbers and letters back together.
select Numbers, Alphabets
from TEST
cross apply (
select
string_agg(left(value, patindex('%[0-9][^0-9]%', value)), ',') as Numbers
, string_agg(right(value, len(value)-patindex('%[0-9][^0-9]%', value)), ',') as Alphabets
from string_split(String, ',') s
) ca;
GO
Numbers | Alphabets
:------- | :----------
12,34,56 | abc,efg,hij
12 | abc
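As a rough cross-check of the split, cut and re-aggregate steps, the same flow can be sketched in Python. The function name is hypothetical, and each SQL function is only approximated by the Python call noted beside it.

```python
import re

def split_numbers_and_letters(text: str) -> tuple:
    numbers, alphabets = [], []
    for item in text.split(","):                  # STRING_SPLIT(String, ',')
        # PATINDEX('%[0-9][^0-9]%', value): 1-based position of the first
        # digit that is followed by a non-digit, or 0 when there is none.
        match = re.search(r"[0-9][^0-9]", item)
        cut = match.start() + 1 if match else 0
        numbers.append(item[:cut])                # LEFT(value, cut)
        alphabets.append(item[cut:])              # RIGHT(value, LEN(value) - cut)
    # STRING_AGG(..., ',')
    return ",".join(numbers), ",".join(alphabets)

print(split_numbers_and_letters("12abc,34efg,56hij"))
print(split_numbers_and_letters("12abc"))
```

Python lists keep their order. In SQL Server, neither STRING_SPLIT nor STRING_AGG promises an output order unless one is requested, so it is worth checking that the two aggregated columns line up on real data.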

SQL syntax alphanumeric characters, SQL Server

If I pull this ID down from my source system it looks like 9006ABCD.
What would the syntax look like if I just want to return 9006 as the ID?
Essentially, I don't need the alpha characters.
Assuming that '9006ABCD' is a string value, then you can extract the leading numbers using:
select left(id, patindex('%[^0-9]%', id + 'X') - 1)
Of course, there may be easier ways. If you just want the first four characters, then use left(id, 4).
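The appended 'X' in the PATINDEX expression is what keeps it working when the ID contains no letters at all. A short Python trace makes that visible. The helper leading_digits is a made-up name, and re.search stands in for PATINDEX.

```python
import re

def leading_digits(identifier: str) -> str:
    # id + 'X': the sentinel guarantees that at least one non-digit exists.
    padded = identifier + "X"
    # PATINDEX('%[^0-9]%', ...): 1-based position of the first non-digit.
    first_non_digit = re.search(r"[^0-9]", padded).start() + 1
    # LEFT(id, position - 1): everything before that non-digit.
    return identifier[: first_non_digit - 1]

print(leading_digits("9006ABCD"))  # 9006
print(leading_digits("9006"))      # 9006, found via the sentinel
```

Without the sentinel, an all-digit ID would have no match, PATINDEX would return 0, and the length passed to LEFT would be -1.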

SQL Substring \g

I would just like to know where do I put the \g in this query?
SELECT project,
SUBSTRING(address FROM 'A-Za-z') AS letters,
SUBSTRING(address FROM '\d') AS numbers
FROM repositories
I tried this but this brings back nothing (it doesn't throw an error though)
SELECT project,
SUBSTRING(CONCAT(address, '#') FROM 'A-Za-z' FOR '#') AS letters,
SUBSTRING(CONCAT(address, '#') FROM '\d' FOR '#') AS numbers
FROM repositories
Here is an example: I would like the string 1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX to return DDsgbXmhWFTVNBLwuQHwiUkhX. So basically return all the letters...and then my second one is to return all the numbers.
The g (“global”) modifier in regular expressions indicates that all matches rather than only the first one should be used.
That doesn't make much sense in the substring function, which returns only a single value, namely the first match. So there is no way to use g with substring.
In those functions where it makes sense in PostgreSQL (regexp_replace and regexp_matches), the g can be specified in the optional last flags parameter.
If you want to find all substrings that match a pattern, use regexp_matches.
For your example, which really has nothing to do with substring at all, I'd use
SELECT translate('1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX', '0123456789', '');
translate
---------------------------
DDsgbXmhWFTVNBLwuQHwiUkhX
(1 row)
This is PostgreSQL-specific rather than standard SQL, but it also does the job:
SELECT project,
regexp_replace(address, '[^A-Za-z]', '', 'g') AS letters,
regexp_replace(address, '[^0-9]', '', 'g') AS numbers
FROM repositories;
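To see what the 'g' flag changes, here is a small Python comparison. Python's re.sub replaces all matches by default, so passing count=1 is used here to imitate a replacement without 'g'. This is an analogy to the PostgreSQL behaviour, not the same API.

```python
import re

address = "1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX"

# Only the first non-letter is removed, as when the 'g' flag is omitted.
first_only = re.sub(r"[^A-Za-z]", "", address, count=1)

# Every match is removed, as with the 'g' flag.
letters = re.sub(r"[^A-Za-z]", "", address)
numbers = re.sub(r"[^0-9]", "", address)

print(first_only)  # DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX
print(letters)     # DDsgbXmhWFTVNBLwuQHwiUkhX
print(numbers)     # 16363445
```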

SQL / REGEX pattern matching

I want to use regex through sql to query some data to return values. The only valid values below returned would be "GB" and "LDN", or could also be "GB-LDN"
G-GB-LDN-TT-TEST
G-GB-LDNN-TT-TEST
G-GBS-LDN-TT-TEST
As written, the first set (GB) needs to have exactly 2 characters, and the second (LDN) exactly 3 characters. The groups are separated by a - symbol. I need to extract the data while also ensuring it fits that pattern. I looked at regex, but I can't see how to do it; it feels like a substring problem, but I can't work it out.
If I understand correctly, you could still use the substring() function to extract the string parts separated by -.
select left(parsename(a.string, 3), 2) +'-'+ left(parsename(a.string, 2) ,3) from
(
select replace(substring(data, 1, len(data)-charindex('-', reverse(data))), '-', '.') [string] from <table>
) a
As shown above, you can also define the length of each extracted string.
Result:
GB-LDN
GB-LDN
GB-LDN
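Because LEFT truncates the second and third segments, all three sample rows come back as GB-LDN. If rows whose segments are too long should be rejected instead, as the question seems to ask, the length check could look roughly like the Python sketch below. The function name and the choice to return None for invalid rows are assumptions made for illustration.

```python
def extract_code(data: str):
    """Return 'XX-YYY' when segment 2 has exactly 2 characters and
    segment 3 has exactly 3 characters, otherwise None."""
    parts = data.split("-")
    if len(parts) < 3:
        return None
    country, city = parts[1], parts[2]
    if len(country) == 2 and len(city) == 3:
        return f"{country}-{city}"
    return None

for row in ("G-GB-LDN-TT-TEST", "G-GB-LDNN-TT-TEST", "G-GBS-LDN-TT-TEST"):
    print(row, "->", extract_code(row))
```

Only the first sample row passes the check. The other two return None.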

Cut string after first occurrence of a character

I have strings like 'keepme:cutme' or 'string-without-separator' which should become respectively 'keepme' and 'string-without-separator'. Can this be done in PostgreSQL? I tried:
select substring('first:last' from '.+:')
But this leaves the : in and won't work if there is no : in the string.
Use split_part():
SELECT split_part('first:last', ':', 1) AS first_part
Returns the whole string if the delimiter is not there. And it's simple to get the 2nd or 3rd part etc.
Substantially faster than functions using regular expression matching. And since we have a fixed delimiter we don't need the magic of regular expressions.
Related:
Split comma separated column data into additional columns
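The behaviour described for split_part(), including returning the whole string when the delimiter is missing, can be mimicked in Python for a quick check. The helper first_part is a hypothetical name, and str.split is only an analogue of the PostgreSQL function.

```python
def first_part(text: str, delimiter: str = ":") -> str:
    # Split at most once and keep the piece before the delimiter.
    # If the delimiter is absent, the whole string is returned.
    return text.split(delimiter, 1)[0]

print(first_part("keepme:cutme"))              # keepme
print(first_part("string-without-separator"))  # string-without-separator
```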
regexp_replace() may be overkill for what you need, but it also gives you the additional power of regex, for instance if strings use multiple delimiters.
Example use:
select regexp_replace( 'first:last', E':.*', '');

SQL Select to pick everything after the last occurrence of a character

select right('first:last', charindex(':', reverse('first:last')) - 1)