I am trying to split and concat a string.
Example: Data value1: "12abc,34efg,56hij"
Data value2: "12abc"
Expected result:
Numbers Column 1: "12,34,56"
Numbers Column 2: "12"
Alphabets Column 1: "abc,efg,hij"
Alphabets Column 2 "abc"
Several attempts made:
1.
SELECT [String], value, CONCAT(SUBSTRING(value,1,2), ',') AS Numbers, CONCAT(SUBSTRING(value,3,3), ',') AS Alphabets, LEFT(String,LEN(String)-CHARINDEX(',',String))
FROM [Test].[dbo].[TEST]
CROSS APPLY string_split([String],',') value
WHERE String = String
2.
SELECT [String], LEFT(String,LEN(String)-CHARINDEX(',',String)), LEFT(String,2) AS Numbers, RIGHT(STRING,3) AS Alphabets
FROM [Test].[dbo].[TEST]
WHERE String = String
I have followed [How to split a string after specific character in SQL Server and update this value to specific column] because I thought it was pretty similar but I did not receive the results I want so I do not know how to proceed or what I went wrong.
I am unsure of how to concatenate different columns into 1 column.
Additional info:
I am currently using SQL Server Management Studio v18.9.2.
*Apologies if my explanation is horrible.
Firstly, let's get to the point; your design is flawed. Never store delimited data in your database, it breaks the fundamental rules of normalisation. I strongly suggest that what you actually do here is fix your design and normalise your data.
Next, the assumptions:
You are using SQL Server 2017+
The column string can only contain alphanumerical characters (A-z, 0-9)
You are using a case insensitive collation or all characters are lowercase
If this is the case, then you can just use TRANSLATE and REPLACE to remove the characters. You'll need to create some variables (or use the tally inline) to create the replacement strings first.
So, firstly, we get the 2 variables we need, which is one containing the letters a-z, and the other with the numbers 0-9. I use a tally to achieve this:
DECLARE #Alphas varchar(26),
#Numerics varchar(10);
WITH N AS(
SELECT N
FROM (VALUES(NULL),(NULL),(NULL))N(N)),
Tally AS(
SELECT TOP (26)
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS I
FROM N N1, N N2, N N3)
SELECT #Alphas = STRING_AGG(CHAR(96 + T.I),''),
#Numerics = STRING_AGG(CASE WHEN T.I <= 10 THEN CHAR(47+T.I) END,'')
FROM Tally T;
Now we can use those values to TRANSLATE all those characters to a different character (I'm going to use a pipe (|)) and the REPLACE those pipe characters with nothing:
SELECT YT.String,
REPLACE(TRANSLATE(YT.String, #Alphas,REPLICATE('|',LEN(#Alphas))),'|','') AS Numerics,
REPLACE(TRANSLATE(YT.String, #Numerics,REPLICATE('|',LEN(#Numerics))),'|','') AS Alphas
FROM dbo.YourTable YT;
Or, of course, you could just type it out. ;)
SELECT YT.String,
REPLACE(TRANSLATE(YT.String, 'abcdefghijklmnopqrstuvwxyz',REPLICATE('|',LEN('abcdefghijklmnopqrstuvwxyz'))),'|','') AS Numerics,
REPLACE(TRANSLATE(YT.String, '0123456789',REPLICATE('|',LEN('0123456789'))),'|','') AS Numerics
FROM dbo.YourTable YT;
You can CROSS APPLY to a STRING_SPLIT that uses STRING_AGG (since Sql Server 2017) to stick the numbers and alphabets back together.
select Numbers, Alphabets
from TEST
cross apply (
select
string_agg(left(value, patindex('%[0-9][^0-9]%', value)), ',') as Numbers
, string_agg(right(value, len(value)-patindex('%[0-9][^0-9]%', value)), ',') as Alphabets
from string_split(String, ',') s
) ca;
GO
Numbers | Alphabets
:------- | :----------
12,34,56 | abc,efg,hij
12 | abc
db<>fiddle here
If I pull this ID down from my source system it looks like 9006ABCD.
What would the syntax look like if I just want to return 9006 as the ID?
Essentially, I don't need the alpha characters.
Assuming that '9006ABCD' is a string value, then you can extract the leading numbers using:
select left(id, patindex('%[^0-9]%', id + 'X') - 1)
Of course, there may be easier ways. If you just want the first four characters, then use left(id, 4).
I would just like to know where do I put the \g in this query?
SELECT project,
SUBSTRING(address FROM 'A-Za-z') AS letters,
SUBSTRING(address FROM '\d') AS numbers
FROM repositories
I tried this but this brings back nothing (it doesn't throw an error though)
SELECT project,
SUBSTRING(CONCAT(address, '#') FROM 'A-Za-z' FOR '#') AS letters,
SUBSTRING(CONCAT(address, '#') FROM '\d' FOR '#') AS numbers
FROM repositories
Here is an example: I would like the string 1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX to return DDsgbXmhWFTVNBLwuQHwiUkhX. So basically return all the letters...and then my second one is to return all the numbers.
The g (“global”) modifier in regular expressions indicates that all matches rather than only the first one should be used.
That doesn't make much sense in the substring function, which returns only a single value, namely the first match. So there is no way to use g with substring.
In those functions where it makes sense in PostgreSQL (regexp_replace and regexp_matches), the g can be specified in the optional last flags parameter.
If you want to find all substrings that match a pattern, use regexp_matches.
For your example, which really has nothing to do with substring at all, I'd use
SELECT translate('1DDsg6bXmh3W63FTVN4BLwuQ4HwiUk5hX', '0123456789', '');
translate
---------------------------
DDsgbXmhWFTVNBLwuQHwiUkhX
(1 row)
So this is not pure SQL but Postgresql, but this also does the job:
SELECT project,
regexp_replace(address, '[^A-Za-z]', '', 'g') AS letters,
regexp_replace(address, '[^0-9]', '', 'g') AS numbers
FROM repositories;
I want to use regex through sql to query some data to return values. The only valid values below returned would be "GB" and "LDN", or could also be "GB-LDN"
G-GB-LDN-TT-TEST
G-GB-LDNN-TT-TEST
G-GBS-LDN-TT-TEST
As it writes the first GB set needs to have 2 characters specifically, and the LDN needs to have 3 characters specifically. Both sets/groups seperated by an - symbol. I kind of need to extract the data but at the same time ensure it is within that pattern. I took a look at regex but I can't see how to, well it's like substring but I can't see it.
IF i undertsand correctly, you could still use of substring() function to extract the string parts separated by -.
select left(parsename(a.string, 3), 2) +'-'+ left(parsename(a.string, 2) ,3) from
(
select replace(substring(data, 1, len(data)-charindex('-', reverse(data))), '-', '.') [string] from <table>
) a
As in above you could also define the length of extracted string.
Result :
GB-LDN
GB-LDN
GB-LDN
I have strings like 'keepme:cutme' or 'string-without-separator' which should become respectively 'keepme' and 'string-without-separator'. Can this be done in PostgreSQL? I tried:
select substring('first:last' from '.+:')
But this leaves the : in and won't work if there is no : in the string.
Use split_part():
SELECT split_part('first:last', ':', 1) AS first_part
Returns the whole string if the delimiter is not there. And it's simple to get the 2nd or 3rd part etc.
Substantially faster than functions using regular expression matching. And since we have a fixed delimiter we don't need the magic of regular expressions.
Related:
Split comma separated column data into additional columns
regexp_replace() may be overload for what you need, but it also gives the additional benefit of regex. For instance, if strings use multiple delimiters.
Example use:
select regexp_replace( 'first:last', E':.*', '');
SQL Select to pick everything after the last occurrence of a character
select right('first:last', charindex(':', reverse('first:last')) - 1)