Trim FIRST character in string that has multiple text of that character - sql

I'm using SQL Server 2008 and I'm trying to trim values that looks like this
DocID
----------------
FOO_1_23_456
FOO1_1_23_4567
I'm trying to make it so it will only give me everything after the first '_'
Result
_1_23_456
_1_23_4567
Right now my query is
select
right(DocIDDocument, LEN(DocID.Document) - 3)) AS NewDocID
which only the trims the first 3 characters, I need it to where it trims everything before the first '_'
Thanks

Use stuff() and charindex():
select stuff(document, 1, charindex('_', document) - 1, '')

Related

how to replace dots from 2nd occurrence

I have column with software versions. I was trying to remove dot from 2nd occurrence in column, like
select REGEXP_REPLACE('12.5.7.8', '.','');
expected out is 12.578
sample data is here
Is it possible to remove dot from 2nd occurrence
One option is to break this into two pieces:
Get the first number.
Get the rest of the numbers as an array.
Then convert the array to a string with no separator and combine with the first:
select (split_part('12.5.7.8', '.', 1) || '.' ||
array_to_string((REGEXP_SPLIT_TO_ARRAY('12.5.7.8', '[.]'))[2:], '')
)
Another option is to replace the first '.' with something else, then get rid of the '.'s and replace the something else with a '|':
select translate(regexp_replace(version, '^([^.]+)[.](.*)$', '\1|\2'), '|.', '.')
from software_version;
Here is a db<>fiddle with the three versions, including the version a_horse_with_no_name mentions in the comment.
I'd just take the left and right:
concat(
left(str, position('.' in str)),
replace(right(str, -position('.' in str)), '.', '')
)
For a str of 12.34.56.78, the left gives 12. and the right using a negative position gives 34.56.78 upon which the replace then removes the dots

Isolating an alphanumerical ID that can be anywhere in a text based cell

I'm trying to isolate a certain character string from a text cell.
For example, I would like to extract "AB-T120-15" from the string "His server ID was AB-T120-15 and his problem was that he needed a reboot"
AB-T120-15 is an example, but they would all be codes of a max length of 13 characters starting by something like AB-T, CL-R, etc.
The codes can appear anywhere in a text field of the column.
string_split() cannot be used since the DB we are under is older.
I have tried many combinations of Substring and LEFT, but I cannot seem to have it worked.
Any thoughts?
String operations are not the strength of SQL Server -- which I assume you are using.
You can do this with rather painful string manipulation:
select left(stuff(str, 1, patindex('%[A-Z][A-Z]-[A-Z]%', str) - 1, ''),
charindex(' ', stuff(str, 1, patindex('%[A-Z][A-Z]-[A-Z]%', str), '') + ' ')
)
from (values ('His server ID was AB-T120-15 and his problem was that he needed a reboot')) v(str);

SQL - How to remove a space character from a string

I have a table where the max length of a column (varchar) is 12, someone has loaded some value with a space, so rather than 'SPACE' it's 'SPACE '
I want to remove the space using a script, I was positive RTRIM or REPLACE(myValue, ' ', '') would work but LEN(myValue) shows there is still and extra character?
As mentioned by a couple folks, it may not be a space. Grab a copy of ngrams8k and you use it to identify the issue. For example, here we have the text, " SPACE" with a preceding space and trailing CHAR(160) (HTML BR tag). CHAR(160) looks like a space in SSMS but isn't "trimable". For example consider this query:
DECLARE #string VARCHAR(100) = ' SPACE'+CHAR(160);
SELECT '"'+#string+'"'
Using ngrams8k you could do this:
DECLARE #string VARCHAR(100) = ' SPACE'+CHAR(160);
SELECT
ng.position,
ng.token,
asciival = ASCII(ng.token)
FROM dbo.ngrams8k(#string,1) AS ng;
Returns:
position token asciival
---------- ------- -----------
1 32
2 S 83
3 P 80
4 A 65
5 C 67
6 E 69
7   160
As you can see, the first character (position 1) is CHAR(32), that's a space. The last character (postion 7) is not a space.
Knowing that CHAR(160) is the issue you could fix it like so:
SET #string = REPLACE(LTRIM(#string),CHAR(160),'')
If you are using SQL Server 2017+ you can also use TRIM which does a whole lot more than just LTRIM-and-RTRIM-ing. For example, this will remove
leading and trailing tabs, spaces, carriage returns, line returns and HTML BR tags.
SET #string = SELECT TRIM(CHAR(32)+CHAR(9)+CHAR(10)+CHAR(13)+CHAR(160) FROM #string)
Odds are it is some other non-printing character, carriage return is a big one when going from between *nix and other OS. One way to tell is to use the DUMP function. So you could start with a query like:
SELECT dump(column_name)
FROM your_table
WHERE column_name LIKE 'SPACE%'
That should help you find the offending character, however, that doesn't fix your problem. Instead, I would use something like REGEXP_REPLACE:
SELECT REGEXP_REPLACE(column_name, '[^A-z]')
FROM your_table
That should take care of any non-printing characters. You may need to play with the regular expression if you expect numbers or symbols in your string. You could switch to a character class like:
SELECT REGEXP_REPLACE(column_name, '[:cntrl:]')
FROM your_table

Snowflake Syntax - Unexpected space character in Syntax

Query
select sis.subject_code||'_'||LEFT(REPLACE(sis.SIS_TERM_ID,0,''),LENGTH(sis.SIS_TERM_ID) - 4)||''|| REPLACE(SUBSTR(sis.SIS_TERM_ID, 8, 8),'','')
from TableX;
Result looks like the following
XXXX888543_134 1 ---there is a space before the last value. I am not sure where this is being derived from. Any ideas on what I could modify in the string above please.
Assuming the space is really a space, how about doing the replace() across the whole string?
select replace(sis.subject_code || '_' || LEFT(REPLACE(sis.SIS_TERM_ID, 0, ''), LENGTH(sis.SIS_TERM_ID) - 4) || SUBSTR(sis.SIS_TERM_ID, 8, 8), '', '')
It is unclear whether the replace is coming from the last element or the one before it. But you don't seem to want any spaces in the string.

Remove leading zeros

Given data in a column which look like this:
00001 00
00026 00
I need to use SQL to remove anything after the space and all leading zeros from the values so that the final output will be:
1
26
How can I best do this?
Btw I'm using DB2
This was tested on DB2 for Linux/Unix/Windows and z/OS.
You can use the LOCATE() function in DB2 to find the character position of the first space in a string, and then send that to SUBSTR() as the end location (minus one) to get only the first number of the string. Casting to INT will get rid of the leading zeros, but if you need it in string form, you can CAST again to CHAR.
SELECT CAST(SUBSTR(col, 1, LOCATE(' ', col) - 1) AS INT)
FROM tab
In DB2 (Express-C 9.7.5) you can use the SQL standard TRIM() function:
db2 => CREATE TABLE tbl (vc VARCHAR(64))
DB20000I The SQL command completed successfully.
db2 => INSERT INTO tbl (vc) VALUES ('00001 00'), ('00026 00')
DB20000I The SQL command completed successfully.
db2 => SELECT TRIM(TRIM('0' FROM vc)) AS trimmed FROM tbl
TRIMMED
----------------------------------------------------------------
1
26
2 record(s) selected.
The inner TRIM() removes leading and trailing zero characters, while the outer trim removes spaces.
This worked for me on the AS400 DB2.
The "L" stands for Leading.
You can also use "T" for Trailing.
I am assuming the field type is currently VARCHAR, do you need to store things other than INTs?
If the field type was INT, they would be removed automatically.
Alternatively, to select the values:
SELECT (CAST(CAST Col1 AS int) AS varchar) AS Col1
I found this thread for some reason and find it odd that no one actually answered the question. It seems that the goal is to return a left adjusted field:
SELECT
TRIM(L '0' FROM SUBSTR(trim(col) || ' ',1,LOCATE(' ',trim(col) || ' ') - 1))
FROM tab
One option is implicit casting: SELECT SUBSTR(column, 1, 5) + 0 AS column_as_number ...
That assumes that the structure is nnnnn nn, ie exactly 5 characters, a space and two more characters.
Explicit casting, ie SUBSTR(column,1,5)::INT is also a possibility, but exact syntax depends on the RDBMS in question.
Use the following to achieve this when the space location is variable, or even when it's fixed and you want to make a more robust query (in case it moves later):
SELECT CAST(SUBSTR(LTRIM('00123 45'), 1, CASE WHEN LOCATE(' ', LTRIM('00123 45')) <= 1 THEN LEN('00123 45') ELSE LOCATE(' ', LTRIM('00123 45')) - 1 END) AS BIGINT)
If you know the column will always contain a blank space after the start:
SELECT CAST(LOCATE(LTRIM('00123 45'), 1, LOCATE(' ', LTRIM('00123 45')) - 1) AS BIGINT)
both of these result in:
123
so your query would
SELECT CAST(SUBSTR(LTRIM(myCol1), 1, CASE WHEN LOCATE(' ', LTRIM(myCol1)) <= 1 THEN LEN(myCol1) ELSE LOCATE(' ', LTRIM(myCol1)) - 1 END) AS BIGINT)
FROM myTable1
This removes any content after the first space character (ignoring leading spaces), and then converts the remainder to a 64bit integer which will then remove all leading zeroes.
If you want to keep all the numbers and just remove the leading zeroes and any spaces you can use:
SELECT CAST(REPLACE('00123 45', ' ', '') AS BIGINT)
While my answer might seem quite verbose compared to simply SELECT CAST(SUBSTR(myCol1, 1, 5) AS BIGINT) FROM myTable1 but it allows for the space character to not always be there, situations where the myCol1 value is not of the form nnnnn nn if the string is nn nn then the convert to int will fail.
Remember to be careful if you use the TRIM function to remove the leading zeroes, and actually in all situations you will need to test your code with data like 00120 00 and see if it returns 12 instead of the correct value of 120.