Extracting substring before alphabet in SQL - sql

Say I have the following strings contained in column1:
1) 12345BC01
2) 67890DE05
How can I write my SELECT clause to extract only the characters before the first alphabetic character? My output should look like:
1) 12345
2) 67890
I found the following solution, but it seems to grab everything AFTER the alphabet characters:
SELECT STUFF(column1,1,ISNULL(NULLIF(PATINDEX('%[^0-9]%',column1),0)-1,0),'')
I wish I could detail what else I've tried, but unfortunately I don't know the first thing about regex. Any help would be greatly appreciated.

You can try like this:
Select left(column1,patindex( '%[^0-9]%', column1+'A')-1)
from #YourTable

It would be easier to just use the LEFT function and wrap the result in ISNULL to handle values that contain only digits:
select IsNull(Left(Col1, NullIf(PatIndex('%[^0-9]%', col1), 0)-1), col1)
from t;
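As a quick check of the approach above, here is a minimal sketch assuming SQL Server. The table variable @t and its rows are hypothetical sample data, including one all-digit value to exercise the edge case.

```sql
-- Hypothetical sample data; @t and its rows are illustrative only.
DECLARE @t TABLE (column1 VARCHAR(20));
INSERT INTO @t VALUES ('12345BC01'), ('67890DE05'), ('24680');

SELECT column1,
       -- Appending 'A' guarantees PATINDEX finds a non-digit,
       -- so all-digit values are returned whole instead of failing.
       LEFT(column1, PATINDEX('%[^0-9]%', column1 + 'A') - 1) AS leading_digits
FROM @t;
-- 12345BC01 -> 12345
-- 67890DE05 -> 67890
-- 24680     -> 24680
```

A value that starts with a letter yields an empty string, since PATINDEX returns 1 and LEFT is asked for zero characters.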

Related

Substring of a specific occurence

I have a column as varchar2 datatype, the data in it is in format:
100323.3819823.222
100.323123.443422
1001010100.233888
LOL12333.DDD33.44
I need to remove the whole part after the first occurrence of '.'
In the end it should look like this:
100323
100
1001010100
LOL12333
I can't seem to find the right substring expression, because the first part does not have a fixed length.
One way is to use REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR(column_name,'^[^.]*') FROM table
The other way is to combine SUBSTR with INSTR, which is a bit faster, but will result in NULL if the data doesn't contain a dot, so you'll have to add a switch if needed:
SELECT SUBSTR(column_name, 1, INSTR(column_name,'.') - 1) FROM table
For Oracle you can try this:
select substr(i, 1, instr(i, '.', 1) - 1) from table_name;
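The "switch" mentioned for the SUBSTR/INSTR approach could look like the sketch below, assuming Oracle. Here column_name and table_name are placeholders, and returning the whole value when there is no dot is an assumption about the desired behaviour.

```sql
-- Assumes Oracle; column_name and table_name are placeholders.
SELECT CASE
         WHEN INSTR(column_name, '.') > 0
           THEN SUBSTR(column_name, 1, INSTR(column_name, '.') - 1)
         ELSE column_name  -- no dot: keep the value as-is (assumed requirement)
       END AS first_part
FROM table_name;
```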

Replace all occurrences of a substring between 2 characters using SQL

Input string: ["1189-13627273","89-13706681","118-13708388"]
Expected Output: ["14013627273","14013706681","14013708388"]
What I am trying to achieve is to replace any numbers till the '-' for each item with hard coded text like '140'
SELECT replace(value_to_replace, '-', '140')
FROM (
VALUES ('1189-13627273-77'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
check this
I found the right way to achieve that using the below regular expression.
SELECT REGEXP_REPLACE (string_to_change, '\\"[0-9]+\\-', '140')
You don't need a regexp for this, it's as easy as concatenation of 140 and the substring from - (or the second part when you split by -)
select '140'||substring('89-13706681' from position('-' in '89-13706681')+1 for 1000)
select '140'||split_part('89-13706681','-',2)
Also, it's important to consider whether you might have values that don't contain -, and what the output should be in that case.
Use regexp_replace(text,text,text) function to do so giving the pattern to match and replacement string.
First argument is the value to be replaced, second is the POSIX regular expression and third is a replacement text.
Example
SELECT regexp_replace('1189-13627273', '.*-', '140');
Output: 14013627273
Sample data set query
SELECT regexp_replace(value_to_replace, '.*-', '140')
FROM (
VALUES ('1189-13627273'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
Caution! The pattern .*- is greedy: it replaces everything up to and including the last occurrence of - with the text 140.
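If only the part up to the first - should be replaced, one option is an anchored pattern that cannot cross a dash. This is a sketch assuming PostgreSQL; the sample values are illustrative and include one with two dashes and one with none.

```sql
-- Assumes PostgreSQL; the VALUES list is illustrative sample data.
SELECT value_to_replace,
       -- ^[^-]*- matches from the start up to and including the first dash only.
       regexp_replace(value_to_replace, '^[^-]*-', '140') AS replaced
FROM (
VALUES ('1189-13627273-77'), ('89-13706681'), ('13708388')
) t(value_to_replace);
-- 1189-13627273-77 -> 14013627273-77
-- 89-13706681      -> 14013706681
-- 13708388         -> 13708388 (no dash, left unchanged)
```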

Regular expression for getting data after - in sql

I have a column with assignment numbers like - 11827,27266,91717,09818-2,726252-3,8716151-0,827272,18181
Now I am selecting the records like this:
select assignment_number from table;
But now I want the column retrieved in such a way that only the numbers are returned, without the -2, -3 etc. suffixes, like:
726252-3 --> 726252, 8716151-0 --> 8716151
I know I can use regex for this, but I do not know how to use it.
This will select everything before the character -:
^([^-]+)
From 726252-3 will match 726252
You would use regexp_substr():
select regexp_substr(assignment_number, '[0-9]+')
This will return the first string of numbers encountered in the string.
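The two suggestions can be combined by passing the "everything before -" pattern to regexp_substr. This is a sketch assuming Oracle, where REGEXP_SUBSTR is available; table_name is a placeholder for the real table.

```sql
-- Assumes Oracle; table_name is a placeholder.
SELECT assignment_number,
       -- ^[^-]+ takes everything from the start up to (not including) the first dash.
       REGEXP_SUBSTR(assignment_number, '^[^-]+') AS base_number
FROM table_name;

-- Standalone check against a literal:
SELECT REGEXP_SUBSTR('726252-3', '^[^-]+') FROM dual;  -- 726252
```

Unlike '[0-9]+', this pattern keeps any non-digit characters that appear before the dash.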

Get rows that contain only certain characters

I want to get only those rows that contain ONLY certain characters in a column.
Let's say the column name is DATA.
I want to get all rows where DATA contains ONLY the following (all three must be present!):
Numeric characters (1 2 3 4 5 6 7 8 9 0)
Dash (-)
Comma (,)
For instance:
Value "10,20,20-30,30" IS OK
Value "10,20A,20-30,30Z" IS NOT OK
Value "30" IS NOT OK
Value "AAAA" IS NOT OK
Value "30-" IS NOT OK
Value "30," IS NOT OK
Value "-," IS NOT OK
Try patindex:
select * from(
select '10,20,20-30,30' txt union
select '10,20,20-30,40' txt union
select '10,20A,20-30,30Z' txt
)x
where patindex('%[^0-9,-]%', txt)=0
For your table, try:
select
DATA
from
YourTable
where
patindex('%[^0-9,-]%', DATA)=0
As per your new edited question, the query should be like:
select
DATA
from
YourTable
where
PATINDEX('%[^0-9,-]%', DATA)=0 and
PATINDEX('%[0-9]%', LEFT(DATA, 1))=1 and
PATINDEX('%[0-9]%', RIGHT(DATA, 1))=1 and
PATINDEX('%[,-][-,]%', DATA)=0
Edit: Your question was edited, so this answer is no longer correct. I won't bother updating it since someone else already has updated theirs. This answer does not fulfil the condition that all three character types must be found.
You can use a LIKE expression for this, although it's slightly convoluted:
where data not like '%[^0123456789,!-]%' escape '!'
Explanation:
[^...] matches any character that is not in the ... part. % matches any number (including zero) of any character. So [^0123456789,!-] matches any character outside the set that you want to allow.
However: - is a special character inside of [], so we must escape it, which we do by using an escape character, and I've chosen !.
So, you match rows that do not contain (not like) any character that is outside your allowed set.
One option is to combine PATINDEX with the LIKE operator:
SELECT *
FROM dbo.test70
WHERE PATINDEX('%[A-Z]%', DATA) = 0
AND PATINDEX('%[0-9]%', DATA) > 0
AND DATA LIKE '%-%'
AND DATA LIKE '%,%'
Demo on SQLFiddle
As already mentioned, you can use a LIKE expression, but it only works with some minor modifications; otherwise too many rows will be filtered out.
SELECT * FROM X WHERE T NOT LIKE '%[^0-9!-,]%' ESCAPE '!'
see working example here:
http://sqlfiddle.com/#!3/474f5/6
edit:
to meet all 3 conditions:
SELECT *
FROM X
WHERE T LIKE '%[0-9]%'
AND T LIKE '%-%'
AND T LIKE '%,%'
see: http://sqlfiddle.com/#!3/86328/1
Maybe not the most beautiful but a working solution.
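Putting the "only these characters" check and the "all three present" checks together might look like the sketch below, assuming SQL Server. The table variable @x and its rows are hypothetical and mirror the examples in the question.

```sql
-- Hypothetical sample data taken from the question's examples.
DECLARE @x TABLE (T VARCHAR(50));
INSERT INTO @x VALUES
    ('10,20,20-30,30'), ('10,20A,20-30,30Z'), ('30'),
    ('AAAA'), ('30-'), ('30,'), ('-,');

SELECT T
FROM @x
WHERE PATINDEX('%[^0-9,-]%', T) = 0  -- nothing other than digits, comma, dash
  AND T LIKE '%[0-9]%'               -- at least one digit
  AND T LIKE '%-%'                   -- at least one dash
  AND T LIKE '%,%';                  -- at least one comma
```

Of the sample rows, only 10,20,20-30,30 meets all four conditions. A value such as 30-, would also pass, which is why the answer above adds checks on the first and last characters and on adjacent separators.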

Searching set of characters in a string

How can I query whether a string contains the numbers (characters) 1,2,3,4,5,6,7 in any order? It can be achieved by writing AND-ed LIKE conditions for the numbers 1, 2 and so on, like below:
Where days like '%1%' and days like '%2%' ...... and so on
Is there a query that checks whether specific characters are present in a string, or a shorthand for the example above? Please help. Thanks.
One way to do this is to create a table with all the strings you'd like to search for.
e.g.
DECLARE @searchstr TABLE (s VARCHAR(10))
INSERT INTO @searchstr VALUES ('1'),('2'),('3'),('4'),('5'),('6'),('7')
DECLARE @tbl TABLE (days VARCHAR(100))
INSERT INTO @tbl VALUES ('1234567'),('123'),('1122334'),('7654321')
SELECT t.days
FROM @tbl t
LEFT JOIN @searchstr s
ON t.days LIKE '%' + s.s + '%'
GROUP BY t.days HAVING COUNT(DISTINCT s.s) = 7
Will not CONTAINS() work for you?
Also notice how to split word into char array.
I would suggest looking into PATINDEX. From MSDN:
Returns the starting position of the first occurrence of a pattern in
a specified expression, or zeros if the pattern is not found, on all
valid text and character data types.
You will probably end up using something like WHERE PATINDEX('%[0-9]%', foo) > 0
You can use CHARINDEX:
CHARINDEX('1', stringthatcontainschars, numofpositiontostartlookingat)
Output: the position at which the searched-for character was found (for example, 24). Returns zero if not found.
If you were looking for the number 1 in a column or a string you would put:
CHARINDEX('1', columnname, 1)
This would search for '1' beginning at position 1.
You can write a conditional statement based on whether or not it returns 0 to decide if the character is in the string, and write whatever code you need after that.
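A minimal sketch of such a conditional, assuming SQL Server. The variable @days and its value are hypothetical. Note that CHARINDEX takes the text to find first and the text to search second.

```sql
-- Hypothetical input value.
DECLARE @days VARCHAR(20) = '7654321';

SELECT CHARINDEX('1', @days) AS position_of_one,  -- 7 for this value
       CASE WHEN CHARINDEX('1', @days) > 0
            THEN 'contains 1'
            ELSE 'does not contain 1'
       END AS has_one;
```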
I'd use isnumeric to ensure they're all digits, then test for 0, 8, or 9 instead of 1-7 because it's shorter.
If the days are comma separated you can use FIND_IN_SET:
SELECT FIND_IN_SET(1, days) AND FIND_IN_SET(2, days);
It is different, not better, but as far as I know there is no other native way.
And LIKEs will probably be much faster than FIND_IN_SET.