Split a string to only use the middle part in SQL - sql

I have a string
ABC - ABCDEFGHIJK - 05/07/2016
I want to only use the ABCDEFGHIJK section and remove the first and third parts of the string.
I have tried using SUBSTRING with CHARINDEX, but was only able to remove the first part of the string.
Anyone help with this?

You can use SUBSTRING_INDEX() (available in MySQL) to isolate the middle part and TRIM() to strip the surrounding spaces:
SELECT TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(string_col,'-',2),'-',-1)) AS String_Col
FROM YourTable;
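To check the nesting logic outside the database, here is a minimal Python sketch of the same idea. The helper names substring_index and middle_part are hypothetical, and the helper only approximates how SUBSTRING_INDEX() treats positive and negative counts; it is not the MySQL implementation.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough stand-in for SUBSTRING_INDEX (hypothetical helper).

    count > 0: everything before the count-th delimiter from the left.
    count < 0: everything after the |count|-th delimiter from the right.
    """
    if count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    return delim.join(parts[count:])


def middle_part(s: str) -> str:
    # Keep the first two dash-separated parts, then the last of those, then trim.
    first_two = substring_index(s, "-", 2)
    return substring_index(first_two, "-", -1).strip()


print(middle_part("ABC - ABCDEFGHIJK - 05/07/2016"))  # ABCDEFGHIJK
```

The inner call keeps "ABC - ABCDEFGHIJK ", the outer call keeps " ABCDEFGHIJK ", and the trim removes the padding.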

Related

TERADATA REGEXP_SUBSTR Get string between two values

I am fairly new to Teradata, and I am trying to understand how to use REGEXP_SUBSTR.
For example I have the following cell value = ABCD^1234567890^1
How can I extract 1234567890
What I attempted to do is the following:
REGEXP_SUBSTR(x, '(?<=^).*?(?=^)')
But this didn't seem to work.
Can anyone help?
It might (or might not) be possible to use REGEXP_SUBSTR() to handle this, but you would need to use a capture group. An alternative here would be to do a regex replacement instead:
SELECT x, REGEXP_REPLACE(x, '^.*?\^|\^.*$', '') AS output
FROM yourTable;
The regex pattern used here matches:
^.*?\^ everything from the start to the first ^
| OR
\^.*$ everything from the second ^ to the end
We then replace with empty string to remove the content being matched.
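The two alternatives of this pattern can be tried out with Python's re module, as a rough sketch only. Python's regex engine is not Teradata's, so this illustrates the pattern rather than Teradata behaviour, and the function name between_carets is hypothetical.

```python
import re

# Same pattern as in the answer: strip the prefix up to the first ^,
# and strip everything from the next ^ to the end.
PATTERN = re.compile(r"^.*?\^|\^.*$")


def between_carets(value: str) -> str:
    # Replace both matched regions with the empty string.
    return PATTERN.sub("", value)


print(between_carets("ABCD^1234567890^1"))  # 1234567890
```

The first alternative is anchored at the start, so it can only match once; the remaining ^ is then picked up by the second alternative.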

How to remove a specific part of a string in Postgres SQL?

Say I have a column in postgres database called pid that looks like this:
set1/2019-10-17/ASR/20190416832-ASR.pdf
set1/2019-03-15/DEED/20190087121-DEED.pdf
set1/2021-06-22/DT/20210376486-DT.pdf
I want to remove everything after the last dash "-" including the dash itself. So expected results:
set1/2019-10-17/ASR/20190416832.pdf
set1/2019-03-15/DEED/20190087121.pdf
set1/2021-06-22/DT/20210376486.pdf
I've looked into replace() and split_part() functions but still can't figure out how to do this. Please advise.
We can use a regex replacement here:
SELECT col, REGEXP_REPLACE(col, '^(.*)-[^-]+(\.\w+)$', '\1\2') AS col_out
FROM yourTable;
The regex used above captures the column value before the last dash in \1, and the extension in \2. It then builds the output using \1\2 as the replacement.
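As a quick check of the two capture groups, the same pattern can be run against the sample values in Python. This is only a sketch: Python's re syntax happens to accept this pattern unchanged, but it is not the Postgres engine, and the function name drop_suffix is hypothetical.

```python
import re

# Group 1: everything before the last dash. Group 2: the file extension.
PATTERN = re.compile(r"^(.*)-[^-]+(\.\w+)$")


def drop_suffix(pid: str) -> str:
    # Rebuild the value from the two captured pieces.
    return PATTERN.sub(r"\1\2", pid)


for pid in (
    "set1/2019-10-17/ASR/20190416832-ASR.pdf",
    "set1/2019-03-15/DEED/20190087121-DEED.pdf",
    "set1/2021-06-22/DT/20210376486-DT.pdf",
):
    print(drop_suffix(pid))
```

Because (.*) is greedy, it extends to the last dash, so the dashes inside the date are left alone.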

using Regex get substring between underscore 2 and underscore 3 of string, vb.net

I have a string like: Title Name_2021-04-13_A+B+C_Division.txt. I need to extract the A+B+C. The A+B+C may be other letters. I believe that using Regex would be the simplest way to do this. In other words I need to get the substring between underscore 2 and underscore 3 of string. All of my code is written in vb.net. I have tried:
boatClass = Regex.Match(myFile, "(?<=_)(.*)(?=_)").ToString
I know this is not right but I think it is close. What do I need to add or change?
The regex code that will extract a substring between the second and third underscore of a string is:
(?:[^_]+_){2}([^_]+)
However, I chose to use the split function:
myString.Split("_"c)(2)
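For comparison, both approaches can be sketched in Python (not VB.NET, so this is only an illustration of the logic; the function names are hypothetical). Note that with the regex, the wanted text is in capture group 1, not in the whole match.

```python
import re

# Skip two "text_" segments, then capture the third segment.
PATTERN = re.compile(r"(?:[^_]+_){2}([^_]+)")


def third_segment_regex(name: str) -> str:
    match = PATTERN.match(name)
    # Group 1 holds the text between the second and third underscore.
    return match.group(1) if match else ""


def third_segment_split(name: str) -> str:
    # Zero-based index 2 is the third underscore-separated piece.
    return name.split("_")[2]


sample = "Title Name_2021-04-13_A+B+C_Division.txt"
print(third_segment_regex(sample))  # A+B+C
print(third_segment_split(sample))  # A+B+C
```

The split version raises an error when the string has fewer than three pieces, while the regex version returns an empty string here, so the choice depends on how malformed names should be handled.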

Finding strings between dashes using REGEXP_EXTRACT in Bigquery

In Bigquery, I am trying to find a way to extract particular segments of a string based on how many dashes come before it. The number of total dashes in the string will always be the same. For example, I could be looking for the string after the second dash and before the third dash in the following string:
abc-defgh-hij-kl-mnop
Currently, I am using the following regex to extract, which counts the dashes from the back:
([^-]+)(?:-[^-]+){2}$
The problem is that if there is nothing in between the dashes, the regex doesn't work. For example, something like this returns null:
abc-defgh-hij--mnop
Is there a way to use regex to extract a string after a certain number of dashes and cut it off before the subsequent dash?
Thank you!
The following is for BigQuery Standard SQL.
The simplest way in your case is to use SPLIT and OFFSET, as in the example below:
SELECT SPLIT(str, '-')[OFFSET(3)]
The above returns an empty string for abc-defgh-hij--mnop.
To prevent an error when the requested element does not exist, it is better to use SAFE_OFFSET, which returns NULL instead:
SELECT SPLIT(str, '-')[SAFE_OFFSET(3)]
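The difference between the two accessors can be mimicked in Python as a rough sketch. The function name safe_offset is hypothetical, and None stands in for SQL NULL; this is not BigQuery code.

```python
from typing import Optional


def safe_offset(s: str, index: int, delim: str = "-") -> Optional[str]:
    """Return the zero-based piece at index, or None if it does not exist."""
    parts = s.split(delim)
    if 0 <= index < len(parts):
        return parts[index]
    return None


print(repr(safe_offset("abc-defgh-hij-kl-mnop", 3)))  # 'kl'
print(repr(safe_offset("abc-defgh-hij--mnop", 3)))    # ''
print(repr(safe_offset("abc-defgh", 3)))              # None
```

An empty segment between two dashes is still a real element, so it comes back as an empty string and not as a missing value.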

Remove Special Characters from an Oracle String

From within an Oracle 11g database, using SQL, I need to remove the following sequence of special characters from a string, i.e.
~!##$%^&*()_+=\{}[]:”;’<,>./?
If any of these characters exist within a string, except for these two characters, which I DO NOT want removed, i.e.: "|" and "-" then I would like them completely removed.
For example:
From: 'ABC(D E+FGH?/IJK LMN~OP' To: 'ABCD EFGHIJK LMNOP' after removal of special characters.
I have tried this small test which works for this sample, i.e:
select regexp_replace('abc+de)fg','\+|\)') from dual
but is there a better means of using my sequence of special characters above without doing this string pattern of '\+|\)' for every special character using Oracle SQL?
You can replace anything other than letters and spaces with an empty string:
[^a-zA-Z ]
As per below comments
I still need to keep the following two special characters within my string, i.e. "|" and "-".
Just exclude more characters from the match:
[^a-zA-Z |-]
Note: the hyphen - should be placed first or last in the character class, or escaped as \- where the regex engine supports escapes inside a class, because between two characters it defines a range.
For more info read about Character Classes or Character Sets
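A negated character class of this kind can be tried against the sample string in Python, as a sketch only. Python's re module is not Oracle's regex engine, the class below assumes spaces should be kept (as in the expected output of the question), and the function name strip_specials is hypothetical.

```python
import re

# Keep letters, spaces, "|" and "-"; the hyphen is last so it is literal.
NOT_ALLOWED = re.compile(r"[^a-zA-Z |-]")


def strip_specials(text: str) -> str:
    # Remove every character that is not in the allowed set.
    return NOT_ALLOWED.sub("", text)


print(strip_specials("ABC(D E+FGH?/IJK LMN~OP"))  # ABCD EFGHIJK LMNOP
print(strip_specials("a|b-c+d"))                  # a|b-cd
```

Listing what to keep is usually shorter than listing what to remove, but it also strips digits here, so the class would need 0-9 added if numbers must survive.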
Consider using this regex replacement instead:
REGEXP_REPLACE('abc+de)fg', '[~!##$%^&*()_+=\\{}[\]:”;’<,>.\/?]', '')
The replacement will match any character from your list.
The regex to match your sequence of special characters is:
[]~!##$%^&*()_+=\{}[:”;’<,>./?]+
I feel you still have not escaped all the regex-special characters.
To achieve that, work iteratively:
build a test string and build up your regex character by character, checking that it removes what you expect to be removed.
If the latest character does not work, you have to escape it.
That should do the trick.
SELECT TRANSLATE('~!##$%sdv^&*()_+=\dsv{}[]:”;’<,>dsvsdd./?', '~!##$%^&*()_+=\{}[]:”;’<,>./?',' ')
FROM dual;
result:
TRANSLATE
-------------
sdvdsvdsvsdd
SQL> select translate('abc+de#fg-hq!m', 'a+-#!', etc.) from dual;
TRANSLATE(
----------
abcdefghqm
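The character-mapping behaviour that these TRANSLATE calls rely on can be approximated in Python. This is a sketch under assumptions: the function name translate_like is hypothetical, it only imitates the rule that characters in the second argument with no counterpart in the third are removed, and the third argument 'a' used below is an assumed value chosen to match the output shown, since the original statement does not spell it out.

```python
def translate_like(text: str, from_chars: str, to_chars: str) -> str:
    """Map from_chars[i] to to_chars[i]; drop chars with no counterpart."""
    table = {}
    for i, ch in enumerate(from_chars):
        # The first occurrence of a repeated character decides its mapping.
        table.setdefault(ord(ch), to_chars[i] if i < len(to_chars) else None)
    return text.translate(table)


# 'a' maps to itself; '+', '-', '#' and '!' have no counterpart and are dropped.
print(translate_like("abc+de#fg-hq!m", "a+-#!", "a"))  # abcdefghqm
```

The leading 'a' in both arguments serves as a placeholder so that the third argument is not empty, which lets every other listed character be removed.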