extract all numbers from start of string? - sql

I have a table which contains some bad data I am trying to clean up.
An example of the fields is below
36234735HAN876
2342JOE9823
554444PUT003
What I want to do is remove all the numeric characters before the first alphabetical character so it would look like the below:
HAN876
JOE9823
PUT003
What would be the best way to achieve this? I have tried the method linked below, but it extracts ALL numeric characters from the string, not just the ones before the first alphabetical character:
How to get the numeric part from a string using T-SQL?

You could achieve this using PATINDEX to locate the first position of an alphabetical character in the string, and then use SUBSTRING to only return the characters after that position:
CREATE TABLE #temp (val VARCHAR(50));
INSERT INTO #temp VALUES ('36234735HAN876'), ('2342JOE9823'), ('554444PUT003'), ('TEST1234');
SELECT val,
SUBSTRING(val, PATINDEX('%[A-Z]%', val), LEN(val)) AS output
FROM #temp;
DROP TABLE #temp;
Outputs:
val output
36234735HAN876 HAN876
2342JOE9823 JOE9823
554444PUT003 PUT003
TEST1234 TEST1234
Note that I have created a temporary table with a column named val. You should change this to work with whatever the actual column is called.
About case sensitivity: If you are using a non-case sensitive collation this will work without issue. If your collation is case sensitive then you may need to alter the pattern being matched to cater for upper- and lower-case letters.
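To see why the case-sensitivity note matters, here is a minimal Python sketch of the same "find the first letter, then slice" logic. It is not T-SQL: `first_letter_suffix` is a hypothetical helper, and Python's `re` module is case sensitive by default, so it behaves roughly the way a case-sensitive collation would. When nothing matches, the sketch simply returns the value unchanged, which is a choice made here and not a statement about what the query does.

```python
import re

def first_letter_suffix(val: str, pattern: str = r"[A-Za-z]") -> str:
    """Return val from its first match of `pattern` onward (hypothetical helper)."""
    match = re.search(pattern, val)
    # No match: this sketch returns the value unchanged.
    return val[match.start():] if match else val

for val in ["36234735HAN876", "2342JOE9823", "554444PUT003", "TEST1234"]:
    print(val, first_letter_suffix(val))

# With an upper-case-only pattern, lower-case letters are skipped,
# which is the situation the case-sensitivity note warns about.
print(first_letter_suffix("123abC9", r"[A-Z]"))     # C9
print(first_letter_suffix("123abC9", r"[A-Za-z]"))  # abC9
```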

Use PATINDEX to find the first non-numeric character (or first alpha character, depending on the logic) and STUFF to remove them:
SELECT STUFF(V.YourString,1,ISNULL(NULLIF(PATINDEX('%[^0-9]%',V.YourString),0)-1,0),'')
FROM (VALUES('36234735HAN876'),
('2342JOE9823'),
('554444PUT003'),
('ABC123'))V(YourString)
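If the logic is the first alpha character, instead of the first non-numeric, then the pattern would be '%[A-Za-z]%' (avoid [A-z], which under a binary collation also matches the punctuation characters that sit between Z and a).
The NULLIF and ISNULL are in there for when PATINDEX finds no match and returns 0 (for example, a string made up only of digits): the 3rd parameter of STUFF would otherwise be -1, which makes STUFF return NULL. With them, the length becomes 0 and the string is returned unchanged. A string that starts with an alpha/non-numeric character also gives a length of 0 and is returned as is; this is demonstrated with the additional example I put into the sample data ('ABC123').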
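The index arithmetic in that query can be traced with a small Python sketch. The helper names `patindex_non_digit` and `strip_leading_digits` are made up for illustration; they only mirror the 1-based position and the "no match means length 0" guard, not SQL Server itself.

```python
import re

def patindex_non_digit(s: str) -> int:
    """1-based position of the first non-digit, or 0 if there is none."""
    match = re.search(r"[^0-9]", s)
    return match.start() + 1 if match else 0

def strip_leading_digits(s: str) -> str:
    pos = patindex_non_digit(s)
    # Counterpart of ISNULL(NULLIF(pos, 0) - 1, 0): "no match" becomes length 0.
    length = pos - 1 if pos != 0 else 0
    # Counterpart of STUFF(s, 1, length, ''): drop the first `length` characters.
    return s[length:]

for s in ["36234735HAN876", "2342JOE9823", "554444PUT003", "ABC123"]:
    print(s, strip_leading_digits(s))
```

With this guard, a value made up only of digits comes back unchanged; if such values should become empty instead, that case needs its own rule.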

Related

Postgres SQL regexp_replace replace all number

I need some help with the following. I have a text field in SQL that records a list of times separated by '|'. For example:
'14613|15474|3832|148|5236|5348|1055|524' Each value is a time in milliseconds. The field can be any length; for example, '3215|2654' and '4565' (only one value) are both perfectly valid. I need to take this field and replace every number with the value -1000.
So '14613|15474|3832|148|5236|5348|1055|524' will be '-1000|-1000|-1000|-1000|-1000|-1000|-1000|-1000'
Or '3215|2654' => '-1000|-1000' Or '4565' => '-1000'.
I tried regexp_replace(times_field,'[[:digit:]]','-1000','g') but it replaces each digit, not the complete number, so in this example:
for '3215|2654', which should become '-1000|-1000', I get:
'-1000-1000-1000-1000|-1000-1000-1000-1000'. I tried other combinations and more regexp options, but I'm stuck.
Any help would be appreciated, thanks!
We can try using REGEXP_REPLACE here:
UPDATE yourTable
SET times_field = REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g');
If instead you don't really want to alter your data but rather just view your data this way, then use a select:
SELECT
times_field,
REGEXP_REPLACE(times_field, '\y[0-9]+\y', '-1000', 'g') AS times_field_replace
FROM yourTable;
Note that in either case we pass g as the fourth parameter to REGEXP_REPLACE to do a global replacement of all pipe-separated numbers.
[[:digit:]] - matches a digit [0-9]
+ Quantifier - matches between one and unlimited times, as many times as possible
So your regexp should look like this:
regexp_replace(times_field,'[[:digit:]]+','-1000','g')
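The effect of the + quantifier is easy to check outside the database. This is a rough Python sketch, not Postgres: Python's `re` module has no POSIX `[[:digit:]]` class, so `[0-9]` stands in for it here, and the variable names are only illustrative.

```python
import re

times = "3215|2654"

# One digit per match: every digit is replaced separately.
per_digit = re.sub(r"[0-9]", "-1000", times)
# One or more digits per match: each whole number is replaced once.
per_number = re.sub(r"[0-9]+", "-1000", times)

print(per_digit)   # -1000-1000-1000-1000|-1000-1000-1000-1000
print(per_number)  # -1000|-1000
```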

SQLite TRIM same character, multiple columns

I have a table in an SQLite db which has multiple columns with leading '='. I understand that I can use...
SELECT TRIM(`column1`, '=') FROM table;
to clean one column however I get a syntax error if I try for example, this...
SELECT TRIM(`column1`, `column2`, `column3`, '=') FROM table;
Due to incorrect number of arguments.
Is there a more efficient way of writing this code than applying the trim to each column separately like this?
SELECT TRIM(`column1`,'=')as `col1`, TRIM(`column2`,'=')as `col2`, TRIM(`column3`,'=')as `col3` FROM table;
As the SQLite documentation says:
trim(X,Y)
The trim(X,Y) function returns a string formed by removing any and all
characters that appear in Y from both ends of X. If the Y argument is
omitted, trim(X) removes spaces from both ends of X.
trim takes at most two parameters, so it is impossible to apply it to three columns in one call.
The first parameter is the column or value to trim. The second parameter is the set of characters to remove.
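Since each call handles one value, the per-column calls cannot be collapsed, but the repetitive SELECT list can be generated. Below is a minimal sketch using Python's built-in `sqlite3` module. The table name `demo`, the sample row, and the column list are assumptions for illustration, and it uses `ltrim` because the question mentions leading '=' only (`trim` would also strip trailing ones).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.execute("INSERT INTO demo VALUES ('=a', '==b=', 'c')")

# Assumed, trusted column names: they are interpolated into the SQL text,
# so they must never come from user input.
columns = ["column1", "column2", "column3"]
select_list = ", ".join(f"ltrim({col}, '=') AS {col}" for col in columns)
query = f"SELECT {select_list} FROM demo"

rows = conn.execute(query).fetchall()
print(query)
print(rows)  # [('a', 'b=', 'c')]
```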

How can I extract a substring from a character column without using SUBSTR()?

I have a question about the data below.
As you can see, each EMP_IDENTIFIER starts with its EMP_ID.
I need to pull out only the 10-character identifier so that it can be inserted into another column.
How would I do that?
I did it the traditional way, using INSTR and SUBSTR.
I just want to know whether there is any other way to do it without using INSTR or SUBSTR.
EMP_ID(VARCHAR2) EMP_IDENTIFIER(VARCHAR2)
62049 62049-2162400111
6394 6394-1368000222
64473 64473-1814702333
61598 61598-0876000444
57452 57452-0336503555
5842 5842-0000070666
75778 75778-0955501777
76021 76021-0546004888
76274 76274-0000454999
73910 73910-0574500122
I am using Oracle 11g.
If you want the second part of the identifier and it is always 10 characters:
select t.*, substr(emp_identifier, -10) as secondpart
from t;
Here is one way:
REGEXP_SUBSTR (EMP_IDENTIFIER, '-(.{10})',1,1,null,1)
That will give the 1st 10 character string that follows a dash ("-") in your string. Thanks to mathguy for the improvement.
Beyond that, you'll have to provide more details on the exact logic for picking out the identifier you want.
Since apparently this is for learning purposes... let's say the assignment was more complicated. Let's say you had a longer input string, and it had several groups separated by -, and the groups could include letters and digits. You know there are at least two groups that are "digits only" and you need to grab the second such "purely numeric" group. Then something like this will work (and there will not be an instr/substr solution):
select regexp_substr(input_str, '(-|^)(\d+)(-|$)', 1, 2, null, 2) from ....
This searches the input string for one or more digits ( \d means any digit, + means one or more occurrences) between a - or the beginning of the string (^ means beginning of the string; (a|b) means match a OR b) and a - or the end of the string ($ means end of the string). It starts searching at the first character (the second argument of the function is 1); it looks for the second occurrence (the argument 2); it doesn't do any special matching such as ignore case (the argument "null" to the function), and when the match is found, return the fragment of the match pattern included in the second set of parentheses (the last argument, 2, to the regexp function). The second fragment is the \d+ - the sequence of digits, without the leading and/or trailing dash -.
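This solution is overkill for your example. One caveat: each match consumes its trailing dash, so a numeric group that directly follows another numeric group has no leading dash left to match. In something like AS23302-ATX-20032-33900293-CWV20-3499-RA the first occurrence is 20032, but the second occurrence found is 3499 rather than 33900293; for the same reason, the second half of 62049-2162400111 is not found as a second occurrence.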
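The same idea (pick the second purely numeric, dash-separated group) can be tried in Python. This is a sketch, not an Oracle pattern: it relies on lookarounds from Python's `re` module, and `numeric_groups` is a hypothetical helper. The lookarounds check the neighbouring dash without consuming it, so two numeric groups that share a single dash are both found.

```python
import re
from typing import List

def numeric_groups(input_str: str) -> List[str]:
    """Return the dash-separated groups that consist only of digits."""
    # (?<![^-]) : preceded by a dash or by the start of the string
    # (?![^-])  : followed by a dash or by the end of the string
    return re.findall(r"(?<![^-])\d+(?![^-])", input_str)

sample = "AS23302-ATX-20032-33900293-CWV20-3499-RA"
groups = numeric_groups(sample)
print(groups)     # ['20032', '33900293', '3499']
print(groups[1])  # 33900293

print(numeric_groups("62049-2162400111")[1])  # 2162400111
```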

replace two characters in one cell

I am using this query to replace one character in a cell
select replace(id,',','')id from table
But I want to replace two characters in a cell.
If the cell is having this data (1,3.1), and I want it to look like this (131).
How can I replace two different characters in one cell?
Use TRANSLATE instead of REPLACE(). It replaces each occurrence of a character in the first pattern with its matched character in the second. To remove characters, simply cut short the replacement string:
select translate(id, '1,.', '1') id from table
Note that the second string cannot be null. Hence the need to include 1 (or some other character) in both strings.
Obviously the more characters you need to convert/remove the more attractive TRANSLATE() becomes. The main use for REPLACE is changing patterns (such as words) rather than individual characters.
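For comparison, the same character-by-character idea exists in Python as `str.translate`. This is only an illustration of the technique in another language, not the SQL function; `remove_separators` and `cleaned` are names chosen for the sketch.

```python
# str.maketrans(from, to, delete): the third argument lists characters to drop.
remove_separators = str.maketrans("", "", ",.")

cleaned = "1,3.1".translate(remove_separators)
print(cleaned)  # 131
```

Adding another character to remove only means extending the delete string, which is the same reason TRANSLATE scales better than nested REPLACE calls.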
Can use
select replace(translate(id,',.',' '),' ','') from table;
or
select regexp_replace('1,3.1','[,.]','') from dual;
or
select replace(replace(id,',',''),'.','') from table;
Call the replace again.
select replace(replace(id,',',''), '.','') id from table
Do this:
select REPLACE(REPLACE(id,',',''),'.','')
Or use a regular expression:
select regexp_replace(id, '[.,]', '') id from table

SQL: insert space before numbers in string

I have a nvarchar field in my table, which contains all sorts of strings.
In case there are strings which contain a number following a non-number sign, I want to insert a space before that number.
That is, if a certain entry in that field is abc123, it should be turned into abc 123, and ab12.34 should become ab 12. 34. I want this to be done throughout the entire table.
What's the best way to achieve it?
You can try something like this:
select left(col,PATINDEX('%[0-9]%',col)-1 )+space(1)+
case
when PATINDEX('%[.]%',col)<>0
then substring(col,PATINDEX('%[0-9]%',col),len(col)+1-PATINDEX('%[.]%',col))
+space(1)+
substring(col,PATINDEX('%[.]%',col)+1,len(col)+1-PATINDEX('%[.]%',col))
else substring(col,PATINDEX('%[0-9]%',col),len(col)+1-PATINDEX('%[0-9]%',col))
end
from tab
It's not simple, but I hope it helps.
SQL Fiddle
I used these functions:
LEFT, PATINDEX, SPACE, SUBSTRING, LEN
together with PATINDEX wildcard patterns (these are not full regular expressions).
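To pin down the intended rule (a space before every digit that directly follows a non-digit, non-space character), here is a minimal Python sketch. It is not T-SQL and `space_before_numbers` is a hypothetical helper; it can serve as a reference for checking the query's output on sample values.

```python
import re

def space_before_numbers(text: str) -> str:
    # Zero-width match: a spot preceded by a non-digit, non-space character
    # and followed by a digit. Nothing is consumed; a space is inserted there.
    return re.sub(r"(?<=[^0-9\s])(?=[0-9])", " ", text)

print(space_before_numbers("abc123"))   # abc 123
print(space_before_numbers("ab12.34"))  # ab 12. 34
```

Unlike the query above, this handles any number of digit runs in one value and leaves values without digits untouched.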