SQL substr query - sql

I need to understand this query as best as possible, thanks
substr(b_Aplicacion,1,4)
|| '-'
|| substr(b_Aplicacion,5,2)
|| '-'
|| substr(b_Aplicacion,7,2)

I assume you're aware of how the substr() function works. (If not, here's an explanation.)
In Oracle SQL and PL/SQL, || is the string concatenation operator.
Example: 'left' || ' - ' || 'right' evaluates to 'left - right'
Your example looks like it's reformatting a string that is probably a date, such as 20120102, into 2012-01-02.

This expression inserts dashes into the string after the 4th and 6th positions, and throws away any characters after the 8th position. For example, abcdefghijkl becomes abcd-ef-gh.
Substr cuts three parts out of the string: abcd, ef, and gh in my example. || '-' || glues the parts back together, inserting dashes in between. || between two string expressions represents concatenation, i.e. it makes one string by gluing the part on its left to the part on its right.
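A quick way to check this is to run the same expression against literals. This is only a sketch, assuming an Oracle session; the alias names s and formatted are made up for illustration.

```sql
-- Apply the expression from the question to two literal strings.
SELECT s,
       substr(s, 1, 4) || '-' || substr(s, 5, 2) || '-' || substr(s, 7, 2) AS formatted
FROM (
  SELECT 'abcdefghijkl' AS s FROM dual
  UNION ALL
  SELECT '20120102' FROM dual
);
-- 'abcdefghijkl' gives abcd-ef-gh (the trailing ijkl is dropped)
-- '20120102'     gives 2012-01-02
```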

substr( string, start_position, [ length ] ) works like this:
string is the source string.
start_position is the position at which extraction starts. The first position in the string is always 1.
length is optional. It is the number of characters to extract. If this parameter is omitted, substr returns everything from start_position to the end of the string.
The || represents concatenation.
So that query splits the string into three parts and places a '-' character after the 4th and 6th positions.
For example, if you have 20121221 as b_Aplicacion, that query will return 2012-12-21.
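The effect of each substr argument can be seen side by side in a small sketch like the one below (the alias names are illustrative only). It also shows a negative start position, which counts from the end of the string.

```sql
SELECT substr('20121221', 5)    AS from_pos_5,  -- '1221': no length, so it runs to the end
       substr('20121221', 5, 2) AS month_part,  -- '12'
       substr('20121221', -2)   AS last_two     -- '21': negative start counts from the end
FROM dual;
```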

Related

Issue with replace function in Oracle

I want to replace 6 of the last 10 digits in a string with XXXXXX. The length of the string can be 16 or 19.
Using below query:
SELECT REPLACE('0000000000000000000',SUBSTR('0000000000000000000',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --XXXXXXXXXXXXXXXXXX0
--Expected Output--000000000XXXXXX0000
SELECT REPLACE('1234561234561234561',SUBSTR('1234561234561234561',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --123XXXXXXXXXXXX4561
--Expected Output--123456123XXXXXX4561
SELECT REPLACE('0004421640006525212',SUBSTR('0004421640006525212',-10,6), 'XXXXXX') FROM DUAL;
--Actual Output --000442164XXXXXX5212
--Expected Output--000442164XXXXXX5212
Why do the first two give the wrong result, and how can I fix the query?
If the length of the string was always 19 you could do:
substr('0004421640006525212', 1, 9) || 'XXXXXX' || substr('0004421640006525212', -4)
With two possible lengths you could use a case expression to decide the second argument for the first substr() call, based on the actual string length; or you could allow for any length (of at least 10, anyway) with:
substr('0004421640006525212', 1, length('0004421640006525212') - 10) || 'XXXXXX' || substr('0004421640006525212', -4)
or with a placeholder/column for brevity:
substr(str, 1, length(str) - 10) || 'XXXXXX' || substr(str, -4)
Or maybe simpler, but slower, you could use a regular expression:
regexp_replace('0004421640006525212', '^(.*?)(.{6})(.{4})$', '\1XXXXXX\3')
The regular expression splits the string into three groups; working backwards, (.{4})$ is a group of exactly four characters at the end of the string; then (.{6}) is a group of exactly six characters (the ones you want to replace); then ^(.*?) is a group of any/all the remaining characters from the start of the string. The replacement pattern keeps the first and third groups - with \1 and \3 - and puts the fixed Xs between those. The second group - of six characters - is discarded.
SQL Fiddle getting the values, and a couple of shorter ones, from a table to avoid having to repeat them all; which also shows the first version doesn't work properly with varying lengths.
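As a rough sketch of that comparison, the two length-independent versions can be run against an inline set of values instead of a table. The 16-character value and the names t, masked_substr and masked_regexp are assumptions made up for this example.

```sql
-- Two 19-character values from the question plus one invented 16-character value.
WITH t AS (
  SELECT '0000000000000000000' AS str FROM dual UNION ALL
  SELECT '1234561234561234561' FROM dual UNION ALL
  SELECT '1234567890123456' FROM dual
)
SELECT str,
       substr(str, 1, length(str) - 10) || 'XXXXXX' || substr(str, -4) AS masked_substr,
       regexp_replace(str, '^(.*?)(.{6})(.{4})$', '\1XXXXXX\3')        AS masked_regexp
FROM t;
-- Both columns should agree:
-- 0000000000000000000 gives 000000000XXXXXX0000
-- 1234561234561234561 gives 123456123XXXXXX4561
-- 1234567890123456    gives 123456XXXXXX3456
```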
The replace function replaces every occurrence of one string with another. It doesn't know or care how the second argument is generated; it doesn't know you're getting it from a particular position in the same string.
When you do:
REPLACE('0004421640006525212',SUBSTR('0004421640006525212',-10,6), 'XXXXXX')
the SUBSTR() evaluates to '000652', so it's effectively:
REPLACE('0004421640006525212','000652', 'XXXXXX')
and that does what you want, because that substring only appears once in the original string. But with:
REPLACE('1234561234561234561',SUBSTR('1234561234561234561',-10,6), 'XXXXXX')
the SUBSTR() evaluates to '456123', so it's effectively:
REPLACE('1234561234561234561','456123', 'XXXXXX')
and that appears multiple times in the original string:
1234561234561234561
   ^^^^^^
         ^^^^^^
and both of those are replaced. With all zeros it's even worse; the SUBSTR() is now '000000', so it matches three times:
0000000000000000000
^^^^^^
      ^^^^^^
            ^^^^^^
and all three of those are replaced.
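One way to confirm how many times the extracted substring occurs is to count the matches. This sketch assumes Oracle 11g or later (for REGEXP_COUNT) and relies on the values being digits only, so the substring is safe to use as a regular expression pattern.

```sql
SELECT str,
       substr(str, -10, 6)                    AS target,
       regexp_count(str, substr(str, -10, 6)) AS occurrences
FROM (
  SELECT '0004421640006525212' AS str FROM dual UNION ALL
  SELECT '1234561234561234561' FROM dual UNION ALL
  SELECT '0000000000000000000' FROM dual
);
-- 0004421640006525212: target 000652, 1 occurrence
-- 1234561234561234561: target 456123, 2 occurrences
-- 0000000000000000000: target 000000, 3 occurrences (non-overlapping)
```

Any row with more than one occurrence is a row where REPLACE changes more than the intended six characters.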

Extract Specific Set of data from a String in Oracle

I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'. I am trying to remove the _1_2_3_4_5 and the 1_ portions from the string, which are essentially the numeric portions of the string. Any special characters after the last letter must also be removed. In this example the _ after the character E must not be present.
and the Query I am trying is as below
SELECT
REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+',1,1)
from dual
The Data I get from the above query is as below: -
_A_B_C_D_E_
I am trying to figure a way to remove the underscore towards the end. Any other way to approach this?
Assuming the "letters" come first and then the "digits", you could do something like this:
select regexp_substr('A_B_C_D_E_1_2_3_4_5','.*[A-Z]') from dual;
This will pull all the characters from the beginning of the string, up to the last upper-case letter in the string (.* is greedy, it will extend as far as possible while still allowing for one more upper-case letter to complete the match).
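If the string can also start with digits, as in the original question, the same greedy idea can be anchored on a letter at both ends. This is only a sketch and assumes the value contains at least two upper-case letters.

```sql
-- Match from the first upper-case letter to the last one; .* is greedy.
SELECT regexp_substr('1_A_B_C_D_E_1_2_3_4_5', '[A-Z].*[A-Z]') AS letters
FROM dual;
-- letters: A_B_C_D_E
```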
I have the string '1_A_B_C_D_E_1_2_3_4_5' and I am trying to extract the data 'A_B_C_D_E'
Use REGEXP_REPLACE:
SQL> SELECT trim(BOTH '_' FROM
(REGEXP_REPLACE('1_A_B_C_D_E_1_2_3_4_5','[0-9]+', ''))) str
FROM dual;
STR
---------
A_B_C_D_E
How it works:
REGEXP_REPLACE will remove all numeric occurrences '[0-9]+' from the string. Alternatively, you could also use the POSIX character class '[[:digit:]]+'
TRIM BOTH '_' will remove any leading and trailing _ from the string.
Also using REGEXP_SUBSTR:
SELECT trim(BOTH '_' FROM
(REGEXP_SUBSTR('1_A_B_C_D_E_1_2_3_4_5','[^0-9]+'))) str
FROM dual;
STR
---------
A_B_C_D_E

Hidden character in SQL column value in oracle

I have a value in a column of a table, and somehow there is a hidden character at the end of the string. I cannot see it or remove it. The string is shown below. I can see 25 characters in this string, but when I check the length of the string it shows as 26. I tried the TRIM function, thinking it could be a space, but it is not. How can I remove this kind of character from a string in an Oracle query? I am actually using regexp_replace to replace part of this string, but because of this issue the regex is not able to match the last number in the string and replace everything before it.
28/110/41492/171486/98122
Here is my regex function
regexp_replace(trim(ATTRIBUTE_VALUE), '(^|.*?/)' || '98122' || '(/|$)', 'replaced' || '\2', 1, 1)
Do this in two steps:
remove all non-printable characters
apply your replace pattern
This is:
regexp_replace
(
regexp_replace(attribute_value, '[^[:print:]]'), -- printable string
'(^|.*?/)' || '98122' || '(/|$)', -- search pattern
'replaced' || '\2', -- replace pattern
1, -- position
1 -- occurrence
)
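To find out what the hidden character actually is, DUMP can list the byte codes of the value. The sketch below assumes, purely for illustration, that the extra character is a carriage return (CHR(13)); the real column may contain something else.

```sql
WITH t AS (
  -- Assumption: simulate the hidden character with a carriage return.
  SELECT '28/110/41492/171486/98122' || chr(13) AS attribute_value FROM dual
)
SELECT length(attribute_value)                                 AS raw_length,   -- 26
       dump(attribute_value)                                   AS byte_codes,   -- last code listed is 13
       length(regexp_replace(attribute_value, '[^[:print:]]')) AS clean_length  -- 25
FROM t;
```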

Oracle SQL - Is there a better way to concatenate multiple strings with a given delimiter?

First question, so apologies in advance if this is stupid or unoriginal, but I've searched for about 30 mins now without finding any mention anywhere of my exact question:
Is there a way to concatenate a series of strings, to be separated by a given delimiter, without manually putting the delimiter between each column being concatenated?
To give a concrete example, I currently have this:
SELECT member_no as Member#,
(member_gname
|| ' '
|| member_fname) as Name,
(member_street
|| ' '
|| member_city
|| ' '
|| member_state
|| ' '
|| member_postcode) AS Address,
member_phone AS Phone,
TO_CHAR(member_joindate, 'dd-Mon-yyyy') as Joined
FROM MEMBER;
It works fine, and produces exactly the output I wanted, but as this is for study I'm less concerned about the output and more concerned with the readability and 'best practise' factors of the .sql file itself. I understand that CONCAT() only takes two arguments, so that won't work without nesting them (which is even uglier and less readable). I'm coming in totally naively here, but I was hoping there'd be some kind of magical AWESOMECONCAT() type of function that would take all the columns i need, as well as allowing me to specify what character I want separating them (in this case, a space). Any ideas?
Also, this is a separate question not worthy of posting by itself, but is there any way to select a column 'AS' and give it a name including whitespace? E.g 'Member #' would look better imo, and 'Join Date' would be clearer, but I've tried both brackets and single quotes after the AS and neither seems to fly with SQL developer.
Unfortunately, Oracle has no built-in function for this; the concatenation operator only does the basic job. We can still write our own AWESOMECONCAT().
By using double quotes around the alias, you can make the column name case sensitive and even include blanks. But note that any further reference to that column/expression needs double quotes with exactly the same text.
SELECT member_no as "Member #",
(member_gname
|| ' '
|| member_fname) as Name,
(member_street
|| ' '
|| member_city
|| ' '
|| member_state
|| ' '
|| member_postcode) AS Address,
member_phone AS Phone,
TO_CHAR(member_joindate, 'dd-Mon-yyyy') as "Join Date"
FROM MEMBER;
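As for writing our own AWESOMECONCAT(), a minimal sketch could look like the following. The function name awesome_concat and its parameter names are hypothetical; it relies on the predefined collection type SYS.ODCIVARCHAR2LIST to accept a variable number of strings, and it does not skip NULL parts.

```sql
-- Hypothetical helper: joins the given parts with a delimiter.
CREATE OR REPLACE FUNCTION awesome_concat (
  p_delim IN VARCHAR2,
  p_parts IN sys.odcivarchar2list
) RETURN VARCHAR2
IS
  l_result VARCHAR2(32767);
BEGIN
  FOR i IN 1 .. p_parts.COUNT LOOP
    IF i > 1 THEN
      l_result := l_result || p_delim;  -- delimiter only between parts
    END IF;
    l_result := l_result || p_parts(i);
  END LOOP;
  RETURN l_result;  -- NULL when no parts are passed
END;
/

-- Usage against the MEMBER table from the question:
SELECT awesome_concat(' ', sys.odcivarchar2list(member_street, member_city,
                                                member_state, member_postcode)) AS address
FROM member;
```

Calling a PL/SQL function for every row is likely to be slower than plain ||, so this trades some performance for readability.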
Is there a way to concatenate a series of strings, to be separated by a given delimiter, without manually putting the delimiter between each column being concatenated?
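There is no shortcut for the concatenation itself, but from 10g onwards the alternative quoting mechanism q'[]' makes the string literals in a concatenation easier to read, because embedded single quotes do not need to be doubled.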
For example :
select q'[This is a string, 'this is also a string'.]' from dual
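For comparison, here is a small sketch of the same literals written with doubled quotes and with the q'[...]' syntax; the alias names are illustrative only.

```sql
SELECT 'It''s ' || 'O''Brien''s'    AS doubled_quotes,
       q'[It's ]' || q'[O'Brien's]' AS q_quoted
FROM dual;
-- Both columns contain: It's O'Brien's
```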

INSTR/SUBSTR to filter string in Oracle SQL

I have a string that appears as:
00012345678 Rain, Kip
I would like to filter out the first numbers/integers, then re-arrange the first and last name.
Kip Rain
I was thinking that I could do INSTR({string},',','1') to get to the first comma, but I am unsure how to do both numbers and punctuation in one line. Would I have to chain the INSTR?
Thanks for your help!
You can chain them; but with complicated things this quickly becomes confusing to work out what's happening. Unless you have demonstrable performance concerns it's often quicker to use regular expressions. In this case, it's probably easiest to use REGEXP_REPLACE()
select regexp_replace(your_string
, '[^[:alpha:]]+([[:alpha:]]+)[^[:alpha:]]+([[:alpha:]]+)'
, '\2 \1')
from ...
The second parameter is the match string; in this case we're searching for everything that is not an alphabetic character ([^[:alpha:]]) 1 or more times (+), followed by alphabetic characters ([[:alpha:]]) 1 or more times. This is repeated to take into account the spaces and comma; and would match your string as follows:
| string         | matched by     |
+----------------+----------------+
| '00012345678 ' | [^[:alpha:]]+  |
| 'Rain'         | ([[:alpha:]]+) |
| ', '           | [^[:alpha:]]+  |
| 'Kip'          | ([[:alpha:]]+) |
The parentheses here define groups; the first set of parentheses is the first group, the second set is the second group, and so on.
The third parameter of REGEXP_REPLACE() tells Oracle what to replace your string with; this is where the groups come in - you can reference the groups in any order. In this instance I want the second group (Kip), followed by a space, followed by the first group (Rain).
You can see this all demonstrated in this SQL Fiddle
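A self-contained version of the query, run against inline values instead of a table, might look like this. The second sample value and the names t and rearranged are made up for illustration.

```sql
WITH t AS (
  SELECT '00012345678 Rain, Kip' AS your_string FROM dual UNION ALL
  SELECT '00099999999 Smith, Anna' FROM dual
)
SELECT your_string,
       regexp_replace(your_string,
                      '[^[:alpha:]]+([[:alpha:]]+)[^[:alpha:]]+([[:alpha:]]+)',
                      '\2 \1') AS rearranged
FROM t;
-- 00012345678 Rain, Kip   gives Kip Rain
-- 00099999999 Smith, Anna gives Anna Smith
```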
Yes, it is alright to chain them:
substr(str, 1, instr(str, ' ') - 1) number_part
substr(str, instr(str, ' ') + 1, instr(str, ',') - instr(str, ' ') - 1) Rain
substr(str, instr(str, ' ', 1, 2) + 1) Kip
In the last example the length argument is omitted, so substr returns everything up to the end of the string; supply a more precise length if your string can be longer.
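Put together as one runnable query, the chained approach could be sketched as follows. It assumes the value always has the shape "digits, space, last name, comma, space, first name"; the names t, number_part and full_name are illustrative.

```sql
WITH t AS (SELECT '00012345678 Rain, Kip' AS str FROM dual)
SELECT substr(str, 1, instr(str, ' ') - 1) AS number_part,  -- '00012345678'
       substr(str, instr(str, ',') + 2)    -- first name: everything after ', '
       || ' ' ||
       substr(str, instr(str, ' ') + 1,    -- last name: between first space and comma
              instr(str, ',') - instr(str, ' ') - 1) AS full_name  -- 'Kip Rain'
FROM t;
```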
I am biased towards using the regular expression variation of the substr function.
First, obtain the first run of alphabetic (or hyphen) characters, which is the last name, as follows:
REGEXP_SUBSTR('00012345678 Rain, Kip','([[:alpha:]]|[-])+',1,1)
where [[:alpha:]] is a character class where all alphabetic characters are included.
The bracketed expression, [-], is just a matching list which is my way of identifying that the last name, Rain, could include a hyphen. The alternation operator, '|', states that either the alphabetic or hyphen characters are acceptable.
The '+' indicates that we are looking to match one or more occurrences.
Second, obtain the first name, i.e. the run of characters at the end of the string that contains no comma or space:
REGEXP_SUBSTR('00012345678 Rain, Kip','[^, ]+$',1,1)
Here, I am going to the end of the string (using the anchor, '$'), and finding all characters after the comma and space.
Next I combine the two (with a space in between) using the concatenation operator, ||.
REGEXP_SUBSTR('00012345678 Rain, Kip','[^, ]+$',1,1) ||' ' || REGEXP_SUBSTR('00012345678 Rain, Kip','([[:alpha:]]|[-])+',1,1)
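As a sketch, the combined expression can be tried against a couple of inline values, including an invented hyphenated last name to show why the hyphen is allowed in the first pattern. The names t and full_name are illustrative.

```sql
WITH t AS (
  SELECT '00012345678 Rain, Kip' AS str FROM dual UNION ALL
  SELECT '00012345678 Smith-Jones, Kip' FROM dual
)
SELECT str,
       regexp_substr(str, '[^, ]+$', 1, 1)              -- first name at the end
       || ' ' ||
       regexp_substr(str, '([[:alpha:]]|[-])+', 1, 1)   -- last name, hyphen allowed
         AS full_name
FROM t;
-- 00012345678 Rain, Kip        gives Kip Rain
-- 00012345678 Smith-Jones, Kip gives Kip Smith-Jones
```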