Oracle SQL replace - sql

Unfortunately I can't change the field type.
I would like to REPLACE a , with a . in a Typ=1 (character) field (e.g. 4,37 should end up as 4.37). I've tried CAST(), TO_NUMBER and TO_CHAR, among other things, but I keep getting ORA-01722 and it's driving me crazy. Why does it have to be a number for replacing?
SELECT REPLACE(fmm, ',', '.') fmm FROM ...
Or do you have a better idea of how I can do it, maybe without REPLACE?
UPDATE: it seems the problem is with:
ORDER BY TO_NUMBER(fmm, '99D99')
So it seems it is taking the replaced version of fmm, the one with the ., but why?

Try removing the commas with replace(nvl(nr,0),',','') and then formatting the result with to_char:
with tab as
(
  select '1,234,567' as nr
  from dual
)
select to_char(
         replace(nvl(nr,0),',','')
         ,'fm999G999G990','NLS_NUMERIC_CHARACTERS = '',.''')
       as "Number"
from tab;
Number
----------
1.234.567

Passing a string (varchar2) value into the replace function cannot throw an ORA-01722.
it seems he has a problem with:
ORDER BY TO_NUMBER(fmm, '99D99')
If that's complaining when fmm is '4,37' then you could add a replace() call inside the to_number(), but it's simpler and clearer to specify NLS_NUMERIC_CHARACTERS as part of the conversion, so it knows that D is represented by a comma and doesn't rely on the session settings:
order by to_number(fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=,.')
If your table has a mix of values with period and comma decimal separators then you need to fix the data. This is the main reason you should not be storing numbers as strings in the first place. If you can't fix the data then you can work around it with replace(), but it isn't ideal; you can then use a fixed period as the decimal character:
order by to_number(replace(fmm, ',', '.'), '99.99');
or still specify NLS_NUMERIC_CHARACTERS:
order by to_number(replace(fmm, ',', '.'), '99D99', 'NLS_NUMERIC_CHARACTERS=.,')
Either way, that 'normalises' all the strings so they only have periods, with no commas, which allows them all to be converted.
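The normalising idea above is not specific to Oracle. As a minimal sketch in Python (the function name and sample values are made up for illustration, and it assumes the strings never contain thousands separators), replacing the comma before converting lets mixed values sort numerically:

```python
from decimal import Decimal


def to_decimal(text: str) -> Decimal:
    """Parse a numeric string that uses either ',' or '.' as the decimal separator."""
    # Same idea as replace(fmm, ',', '.'): normalise to a period before converting.
    return Decimal(text.strip().replace(",", "."))


# Hypothetical sample data mixing both separators.
values = ["4,37", "12.5", "0,99", "4.4"]

# Sort the original strings by their numeric value, not alphabetically.
print(sorted(values, key=to_decimal))
```

The original strings are kept as they are; only the sort key is converted, which is the same split as ordering by a converted expression while selecting the raw column.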
what I don't understand, if I do some changes in the SELECT to a field, how can it affect the ORDER BY section? fmm should still remain 4,37 and not 4.37 in the ORDER BY section, shouldn't it?
No, because you gave the column expression REPLACE(fmm, ',', '.') the alias fmm, which is the same as the original column name. The ORDER BY clause is the only place in the same query block where column aliases can be referenced, and there the alias masks the original table column. When you do:
ORDER BY TO_NUMBER(fmm, '99D99')
the fmm in that conversion is the value of the column expression aliased as fmm, and not the original table column.
You can still access the table column, but to do so you have to prefix it with the table name or alias, because the column expression from the select list takes precedence (which is implied but not stated clearly in the docs):
expr orders rows based on their value for expr. The expression is based on columns in the select list or columns in the tables, views, or materialized views in the FROM clause.
So you can either explicitly refer to the table column via the table name or, here, an alias:
SELECT REPLACE(t.fmm, ',', '.') fmm
FROM your_table t
ORDER BY TO_NUMBER(t.fmm, '99D99')
though you still shouldn't rely on the session NLS settings really, so can/should still specify the NLS option to match the table column format:
SELECT REPLACE(t.fmm, ',', '.') fmm
FROM your_table t
ORDER BY TO_NUMBER(t.fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=,.')
or use the replaced value and specify the NLS option for that (notice the option itself is different):
SELECT REPLACE(fmm, ',', '.') fmm
FROM your_table
ORDER BY TO_NUMBER(fmm, '99D99', 'NLS_NUMERIC_CHARACTERS=.,')
If your table has a mix of period and comma values then you need to use the column-alias version so it is consistent when it tries to convert. If you only have commas then you can use either. (But again, you shouldn't be storing numbers as strings in the first place...)

Related

extract all numbers from start of string?

I have a table which contains some bad data I am trying to clean up.
An example of the fields is below
36234735HAN876
2342JOE9823
554444PUT003
What I want to do is remove all the numeric characters before the first alphabetical character so it would look like the below:
HAN876
JOE9823
PUT003
What would be the best way to achieve this? I have used the below method but this can only be used to extract ALL numeric from the string, not the ones before the alphabetical characters
How to get the numeric part from a string using T-SQL?
You could achieve this using PATINDEX to locate the first position of an alphabetical character in the string, and then use SUBSTRING to only return the characters after that position:
CREATE TABLE #temp (val VARCHAR(50));
INSERT INTO #temp VALUES ('36234735HAN876'), ('2342JOE9823'), ('554444PUT003'), ('TEST1234');
SELECT val,
SUBSTRING(val, PATINDEX('%[A-Z]%', val), LEN(val)) AS output
FROM #temp;
DROP TABLE #temp;
Outputs:
val             output
36234735HAN876  HAN876
2342JOE9823     JOE9823
554444PUT003    PUT003
TEST1234        TEST1234
Note that I have created a temporary table with a column named val. You should change this to work with whatever the actual column is called.
About case sensitivity: If you are using a non-case sensitive collation this will work without issue. If your collation is case sensitive then you may need to alter the pattern being matched to cater for upper- and lower-case letters.
Use PATINDEX to find the first non-numeric character (or first alpha character, depending on the logic) and STUFF to remove them:
SELECT STUFF(V.YourString,1,ISNULL(NULLIF(PATINDEX('%[^0-9]%',V.YourString),0)-1,0),'')
FROM (VALUES('36234735HAN876'),
('2342JOE9823'),
('554444PUT003'),
('ABC123'))V(YourString)
If the logic is the first alpha character, instead of the first non-numeric, then the pattern would be [A-z].
The NULLIF and ISNULL are in there so that STUFF is never passed -1 as its 3rd parameter, which is what would happen when PATINDEX returns 0. A string that already starts with an alpha/non-numeric character is left unchanged; this is demonstrated with the additional example I put into the sample data ('ABC123').
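For comparison outside the database, the same "drop the digits before the first other character" rule can be sketched in Python. This is only an illustration of the logic, not a translation of either T-SQL query, and the function name is made up:

```python
import re


def strip_leading_digits(value: str) -> str:
    # Remove the run of digits at the very start of the string.
    # Strings that do not start with a digit are returned unchanged.
    return re.sub(r"^[0-9]+", "", value)


for sample in ["36234735HAN876", "2342JOE9823", "554444PUT003", "ABC123"]:
    print(sample, "->", strip_leading_digits(sample))
```

One difference to be aware of: a string made up only of digits comes back empty here, whereas the STUFF query above leaves such a string untouched.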

SQLite TRIM same character, multiple columns

I have a table in an SQLite db which has multiple columns with leading '='. I understand that I can use...
SELECT TRIM(`column1`, '=') FROM table;
to clean one column however I get a syntax error if I try for example, this...
SELECT TRIM(`column1`, `column2`, `column3`, '=') FROM table;
Due to incorrect number of arguments.
Is there a more efficient way of writing this code than applying the trim to each column separately like this?
SELECT TRIM(`column1`,'=')as `col1`, TRIM(`column2`,'=')as `col2`, TRIM(`column3`,'=')as `col3` FROM table;
As the SQLite documentation says:
trim(X,Y)
The trim(X,Y) function returns a string formed by removing any and all
characters that appear in Y from both ends of X. If the Y argument is
omitted, trim(X) removes spaces from both ends of X.
The function takes only two parameters, so it's impossible to apply it to three columns in one call.
The first parameter is the column or value to trim. The second parameter is the set of characters to remove from both ends.
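If the repetition is the real annoyance, one option is to generate the select list in application code rather than typing it out. A minimal sketch using Python's sqlite3 module, assuming a throwaway table called demo and column names that you control (identifiers are interpolated into the SQL text, so never build them from user input):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, only for the demonstration.
conn.execute("CREATE TABLE demo (column1 TEXT, column2 TEXT, column3 TEXT)")
conn.execute("INSERT INTO demo VALUES ('=a', '==b', '=c=')")

# Trusted, hard-coded column names.
columns = ["column1", "column2", "column3"]

# Build "TRIM(column1, '=') AS column1, TRIM(column2, '=') AS column2, ..."
select_list = ", ".join(f"TRIM({col}, '=') AS {col}" for col in columns)

rows = conn.execute(f"SELECT {select_list} FROM demo").fetchall()
print(rows)
```

The database still runs one TRIM call per column; this only saves typing. Note that TRIM strips from both ends, as the '=c=' value shows.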

Remove # characters from arrays in PostgreSQL table?

I have a field (of type character varying) called 'directedlink_href' in a table which contains arrays that have values that all start with a '#' character.
How am I able to remove the '#' character from any entries in these arrays in this field?
For instance...
{#osgb4000000030451486,#osgb4000000030451491}
to
{osgb4000000030451486,osgb4000000030451491}
The clean solution is to unnest, replace and then re-aggregate the values:
select id,
(select array_agg(substr(x.val,2) order by x.idx) from unnest(t1.directedlink_href) with ordinality as x(val,idx)) as data
from the_table t1;
If you want to actually change the data in the table:
update the_table t1
set directedlink_href = (select array_agg(substr(x.val,2) order by x.idx) from unnest(t1.directedlink_href) with ordinality as x(val,idx));
This simply strips off the first character. If you might have other characters at the start of the value, you need to use regexp_replace(x.val,'^#', '') instead of the substr(x.val,2)
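The difference between the two variants is easy to see on a value that does not start with '#'. A small Python sketch of the per-element logic (the function names and the second sample value are made up for illustration):

```python
import re


def strip_first_char(items):
    # Mirrors substr(x.val, 2): drops the first character unconditionally.
    return [item[1:] for item in items]


def strip_leading_hash(items):
    # Mirrors regexp_replace(x.val, '^#', ''): drops a leading '#' only.
    return [re.sub(r"^#", "", item) for item in items]


# The second element deliberately has no leading '#'.
links = ["#osgb4000000030451486", "osgb4000000030451491"]

print(strip_first_char(links))    # second element loses its 'o'
print(strip_leading_hash(links))  # second element is left alone
```

Both versions keep the elements in their original order, which is what the "with ordinality" / "order by x.idx" part guarantees in the SQL.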
@a_horse_with_no_name got my upvote for a cleaner and more "Postgres-ish" solution.
I was about to delete this answer, but after some tests, it seems that performance-wise this solution has an advantage.
Therefore, I will leave this solution here, but I do recommend choosing the solution of @a_horse_with_no_name as the right answer.
I'm using chr(1) as a character that most likely does not appear in the array's elements.
select string_to_array(substr(replace(array_to_string(directedlink_href,chr(1)),chr(1)||'#',chr(1)),2),chr(1))
from t
;
Think this is a simpler and more generic solution, thought I'd share:
SELECT regexp_split_to_array(regexp_replace(array_to_string(ARRAY['#osgb4000000030451486','#osgb4000000030451491'], '__DELIMITER__'), '#', '', 'g'), '__DELIMITER__');

How to make to_number ignore non-numerical values

Column xy of type 'nvarchar2(40)' in table ABC.
Column consists mainly of numerical Strings
how can I make a
select to_number(trim(xy)) from ABC
query, that ignores non-numerical strings?
In general in relational databases, the order of evaluation is not defined, so it is possible that the select functions are called before the where clause filters the data. I know this is the case in SQL Server. Here is a post that suggests that the same can happen in Oracle.
The case statement, however, does cascade, so it is evaluated in order. For that reason, I prefer:
select (case when NOT regexp_like(xy,'[^[:digit:]]') then to_number(xy)
end)
from ABC;
This will return NULL for values that are not numbers.
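The guard-then-convert pattern in that CASE expression can be sketched in Python as follows. The function name is hypothetical, and like the SQL pattern it accepts only plain digit strings, so signs and decimal separators are rejected:

```python
import re


def to_number_or_none(text):
    # NULL in, NULL out.
    if text is None:
        return None
    # Mirrors NOT regexp_like(xy, '[^[:digit:]]'): reject anything that
    # contains a non-digit character. The empty string is rejected too.
    if text == "" or re.search(r"[^0-9]", text):
        return None
    # The conversion only runs after the check has passed.
    return int(text)


print([to_number_or_none(v) for v in ["123", "12a", "", None, "-5"]])
```

The point is the ordering: the validity check runs before the conversion is attempted, so a bad value never reaches it.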
You could use regexp_like to find out if it is a number (with/without plus/minus sign, decimal separator followed by at least one digit, thousand separators in the correct places if any) and use it like this:
SELECT TO_NUMBER( CASE WHEN regexp_like(xy,'.....') THEN xy ELSE NULL END )
FROM ABC;
However, as the built-in function TO_NUMBER is not able to deal with all numbers (it fails at least when a number contains thousand separators), I would suggest writing a PL/SQL function TO_NUMBER_OR_DEFAULT(numberstring, defaultnumber) to do what you want.
EDIT: You may want to read my answer on using regexp_like to determine if a string contains a number here: https://stackoverflow.com/a/21235443/2270762.
You can add WHERE
SELECT TO_NUMBER(TRIM(xy)) FROM ABC WHERE REGEXP_INSTR(xy, '[A-Za-z]') = 0
The WHERE clause skips rows whose value contains letters. See the documentation

How to extract group from regular expression in Oracle?

I got this query and want to extract the value between the brackets.
select de_desc, regexp_substr(de_desc, '\[(.+)\]', 1)
from DATABASE
where col_name like '[%]';
It however gives me the value with the brackets such as "[TEST]". I just want "TEST". How do I modify the query to get it?
The third parameter of the REGEXP_SUBSTR function indicates the position in the target string (de_desc in your example) where you want to start searching. Assuming a match is found in the given portion of the string, it doesn't affect what is returned.
In Oracle 11g, there is a sixth parameter to the function, that I think is what you are trying to use, which indicates the capture group that you want returned. An example of proper use would be:
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]', 1,1,NULL,1) from dual;
Where the last parameter, 1, indicates the number of the capture group you want returned. Here is a link to the documentation that describes the parameter.
10g does not appear to have this option, but in your case you can achieve the same result with:
select substr( match, 2, length(match)-2 ) from (
SELECT regexp_substr('abc[def]ghi', '\[(.+)\]') match FROM dual
);
since you know that a match will have exactly one excess character at the beginning and end. (Alternatively, you could use RTRIM and LTRIM to remove brackets from both ends of the result.)
You need to do a replace and use a regex pattern that matches the whole string.
select regexp_replace(de_desc, '.*\[(.+)\].*', '\1') from DATABASE;
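Both approaches (ask for the capture group directly, or match the whole string and replace it with the group) have close analogues in other regex engines. A short Python sketch, with made-up function names, using the same pattern:

```python
import re


def between_brackets(text):
    # Like REGEXP_SUBSTR(..., 1, 1, NULL, 1): return capture group 1 only.
    match = re.search(r"\[(.+)\]", text)
    return match.group(1) if match else None


def between_brackets_by_replace(text):
    # Like REGEXP_REPLACE(de_desc, '.*\[(.+)\].*', '\1'): match the whole
    # string and keep only the group. Input without brackets is unchanged.
    return re.sub(r".*\[(.+)\].*", r"\1", text)


print(between_brackets("abc[def]ghi"))
print(between_brackets_by_replace("abc[def]ghi"))
```

The two differ when nothing matches: the first returns nothing, the second returns the input as it was.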