Can (should) I include elements of TO_CHAR, CONCAT and FM in the same select statement?

I'm following up on a previous post that successfully generated an Oracle SELECT statement. Within that script I now need to concatenate two fields (numeric values holding 3-digit area codes and 7-digit phone numbers) and format the resulting column as XXX-XXX-XXXX.
My attempts at using TO_CHAR, CONCAT (or ||; I have tried the concatenation both ways) and FM in the same line result in "invalid number" or "invalid operator" errors, depending on how I've arranged the elements, painfully reminding me that my barely-elementary scripting shows a significant lack of understanding of proper use and syntax.
The combination of TO_CHAR and CONCAT (||) successfully produces a 10-digit string, but I'm trying to attain a result formatted as XXX-XXX-XXXX from the following. (I've edited out the lines for data elements not relevant to this question; nothing in the original query is nested, it just selects several fields and has a series of left joins on a common UID field in different tables.)
select distinct
cn.dflt_id StudentIdNumber,
to_char (p.area_code || p.phone_no) Phone
from
co_name cn
left join co_v_name_phone1 p on cn.name_id = p.name_id
order by cn.dflt_id
Would anyone offer helpful advice on attaining the desired XXX-XXX-XXXX formatting in the resulting Phone column? My attempts with variants of 'fm999g999g9999' have thus far not been successful.
Thanks,
Scott

Here are a few options that crossed my mind; have a look, pick the one you find the most appropriate. If you still have problems, post your own test case.
RES2 is a simple concatenation of substrings that have a - in between
RES3 uses format mask with an adjusted NLS_NUMERIC_CHARACTERS for thousands
RES4 concatenates area code (which is OK by itself) with regular expression that splits a string into two parts; the first has {3} characters, and the second one has {4} of them
By the way, are area codes really numbers? No leading zeros?
SQL> with test (area_code, phone_number) as
       (select 123, 9884556 from dual union
        select 324, 1254789 from dual
       )
     select
       to_char(area_code) || to_char(phone_number) l_concat,
       --
       substr(to_char(area_code) || to_char(phone_number), 1, 3) ||'-'||
       substr(to_char(area_code) || to_char(phone_number), 4, 3) ||'-'||
       substr(to_char(area_code) || to_char(phone_number), 7)
       res2,
       --
       to_char(to_char(area_code) || to_char(phone_number),
         '000g000g0000', 'nls_numeric_characters=.-') res3,
       --
       to_char(area_code) ||'-'||
       regexp_replace(to_char(phone_number), '(\d{3})(\d{4})', '\1-\2') res4
     from test;
L_CONCAT      RES2          RES3          RES4
------------- ------------- ------------- -------------
1239884556    123-988-4556  123-988-4556  123-988-4556
3241254789    324-125-4789  324-125-4789  324-125-4789
SQL>
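For readers who do the formatting on the client side instead, the substring idea (RES2) and the regular-expression idea (RES4) can be sketched in Python. This is only an illustration, not Oracle code: the function names are hypothetical, and zero-padding the inputs is an assumption added to cope with the leading-zero concern raised above.

```python
import re


def format_phone_substr(area_code: int, phone_no: int) -> str:
    """Mirror RES2: concatenate, then slice into 3-3-4 with hyphens."""
    # Zero-pad so values stored as numbers keep their full width (assumption).
    digits = f"{area_code:03d}{phone_no:07d}"
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def format_phone_regex(area_code: int, phone_no: int) -> str:
    """Mirror RES4: keep the area code, split the local number with a regex."""
    local = re.sub(r"^(\d{3})(\d{4})$", r"\1-\2", f"{phone_no:07d}")
    return f"{area_code:03d}-{local}"


if __name__ == "__main__":
    print(format_phone_substr(123, 9884556))  # 123-988-4556
    print(format_phone_regex(324, 1254789))   # 324-125-4789
```

Both helpers treat the inputs as integers, as in the test data above; if the columns are really strings, the padding step can be dropped.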

Related

Search a pattern from comma separated parameters in PL/SQL

My Parameter to a procedure lv_ip := 'MNS-GC%|CS,MIB-TE%|DC'
My cursor query should search for records that start with 'MNS-GC%' and 'MIB-TE%'.
Select id, date,program,program_start_date
from table_1
where program like 'MNS-GC%' or program LIKE 'MIB-TE%'
Please suggest ways to read it from the parameter and an alternative to LIKE.
Since you mention you want to preserve what's on the right side of the pipe, and want to be able to process parameters dynamically, here's a way to parse multi-delimited data that could give you some ideas using a CTE.
The table called 'tbl' just sets up your original data. tbl_comma contains that data split on the comma. The final query splits that data into name/value pairs.
Hopefully this will help give you some ideas even though it's not the exact answer you are looking for.
COLUMN ID FORMAT a3
COLUMN PROGRAM FORMAT a10
COLUMN part2 FORMAT a6
-- Original data
WITH tbl(ID, DATA) AS (
SELECT 1, 'MNS-GC%|CS,MIB-TE%|DC' FROM dual UNION ALL
SELECT 2, 'MNS-GC%|CS,MIB-TE%|DC,MIB-TA%|AB,MIB-TB%|BC' FROM dual
),
tbl_comma(ID, CASE) AS (
SELECT ID,
REGEXP_SUBSTR(DATA, '(.*?)(,|$)', 1, LEVEL, NULL, 1) CASE
FROM tbl
CONNECT BY REGEXP_SUBSTR(DATA, '(.*?)(,|$)', 1, LEVEL) IS NOT NULL
AND PRIOR ID = ID
AND PRIOR SYS_GUID() IS NOT NULL
)
--SELECT * FROM tbl_comma;
-- Parse into name/value pairs
SELECT ID,
REGEXP_REPLACE(CASE, '^(.*)\|.*', '\1') PROGRAM,
REGEXP_REPLACE(CASE, '.*\|(.*)$', '\1') PART2
FROM tbl_comma;
ID PROGRAM PART2
--- ---------- ------
1 MNS-GC% CS
1 MIB-TE% DC
2 MNS-GC% CS
2 MIB-TE% DC
2 MIB-TA% AB
2 MIB-TB% BC
6 rows selected.
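The same two-level split (first on commas, then on the pipe) can be sketched outside the database. The Python below is only an illustration of the parsing logic, with a hypothetical function name; it is not a translation of the CONNECT BY query.

```python
def parse_pairs(data: str) -> list:
    """Split 'A|x,B|y' into [('A', 'x'), ('B', 'y')]."""
    pairs = []
    for element in data.split(","):
        # partition() splits on the first pipe only; part2 is '' if no pipe exists.
        program, _, part2 = element.partition("|")
        pairs.append((program, part2))
    return pairs


if __name__ == "__main__":
    for program, part2 in parse_pairs("MNS-GC%|CS,MIB-TE%|DC"):
        print(program, part2)
```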
If you're stuck with that input and the structure is fixed, with each comma-separated element having a pipe-delimited value, you could possibly convert that string to a regular expression pattern, and then use regexp_like to pattern-match:
select id, date, program, program_start_date
from table_1
where regexp_like(
program,
'^(' || rtrim(regexp_replace(lv_ip, '%\|.*?(,|$)', '|'), '|') || ')')
With your example parameter, the
'^(' || rtrim(regexp_replace(lv_ip, '%\|.*?(,|$)', '|'), '|') || ')'
would generate the pattern
^(MNS-GC|MIB-TE)
i.e. looking for either of those strings at the start of the program value.
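To see how the parameter turns into that pattern, here is the same transformation sketched in Python. The function name is hypothetical, and Python's re module is not identical to Oracle's regular-expression engine, so treat this as a way to check the logic rather than as database code.

```python
import re


def prefix_pattern(lv_ip: str) -> str:
    """Turn 'MNS-GC%|CS,MIB-TE%|DC' into '^(MNS-GC|MIB-TE)'."""
    # Replace each '%|value' tail, up to the next comma or the end, with '|'.
    alternatives = re.sub(r"%\|.*?(,|$)", "|", lv_ip)
    # The last replacement leaves a trailing '|', as in the SQL rtrim().
    return "^(" + alternatives.rstrip("|") + ")"


if __name__ == "__main__":
    pattern = prefix_pattern("MNS-GC%|CS,MIB-TE%|DC")
    print(pattern)
    print(bool(re.search(pattern, "MNS-GC-2020")))
```

Program prefixes that contain regex metacharacters would need escaping before being used this way; the sample values do not.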
Alternatively you could split the input up yourself, with instr and substr, and - since the number of elements may vary - create a dynamic query using them. That might be faster than using regular expression, but might be harder to maintain.
What would the regexp be to match CS|DC?
It depends how you plan to use those values, but if you're looking for some column exactly matching one of them, then you could do something similar with:
'^(' || ltrim(regexp_replace(lv_ip, '(^|,)[^|]*', null), '|') || ')$'
which with your input string would generate the pattern
^(CS|DC)$
But if you need to match the corresponding values as pairs - so the equivalent of something like:
where (program like 'MNS-GC%' and some_col = 'CS')
or (program like 'MIB-TE%' and some_col = 'DC')
... then you'd need to extract them as pairs, as @Gary_W has shown.

Querying substrings against a list of values

I'm reading from a dataset that I unfortunately don't have the access to modify. It has concatenated strings of values, and I want to select records for which any of those substrings (as split by a given character) matches any of the values in a specific list. I'll be passing the queries in via Python, so it won't be compared against a static list.
For example, the table looks like:
CrappyColumn
-----------
1;2
4
1
2;1
1;3
2
And I might want to return anything that has 2 or 4 in it. So, my result should be:
1;2
4
2
2;1
I have played with regexp_substr and gotten something that actually works; however, it just runs indefinitely (as much as 10 minutes before I give up) when I run it on the full dataset (which only includes about three thousand records with values that are often a couple hundred characters long). I need something that works in a reasonable amount of time for repeated execution.
I realize that--even with a variable comparison list--I could just write my Python code to parse the list and construct multiple LIKE statements, but that seems inefficient, and I assume that there is a better way.
And here's what I've done that takes too long:
SELECT DISTINCT CrappyColumn
FROM
(SELECT DISTINCT CrappyColumn, regexp_substr(CrappyColumn, '[^;]+', 1, LEVEL) as UGH
FROM CrappyTable
CONNECT BY regexp_substr(CrappyColumn, '[^;]+', 1, LEVEL) IS NOT NULL)
WHERE UGH IN ('2', '4')
Is there a better, faster, cleaner way to accomplish this?
EDIT - RESOLUTION:
Thanks to vkp's help, here is what I implemented:
regexp_like(SITE_ID, '^(2|4)(:)|(:)(2|4)(:)|(:)(2|4)$|^(2|4)$')
I modified it for my final product, so that it can handle strings of more than one character--by changing [2|4] to (2|4). This works in cases of searching for numbers that aren't single-digit.
You can use like:
select t.*
from crappytable t
where ';' || crappycolumn || ';' like '%;2;%' or
';' || crappycolumn || ';' like '%;4;%';
You seem to know that storing lists of values in a single column is a bad idea, so I'll spare the harangue ;)
EDIT:
If you don't like like, you can use regexp_like() like this:
where regexp_like(';' || crappycolumn || ';', ';2;|;4;')
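The delimiter-wrapping trick works because every element, including the first and last, ends up surrounded by semicolons. A small Python sketch of the same check, with a hypothetical function name and the sample rows from the question:

```python
def contains_any(crappy_value: str, wanted, sep: str = ";") -> bool:
    """True if any wanted value is a whole element of the sep-delimited string."""
    wrapped = f"{sep}{crappy_value}{sep}"
    return any(f"{sep}{w}{sep}" in wrapped for w in wanted)


if __name__ == "__main__":
    rows = ["1;2", "4", "1", "2;1", "1;3", "2"]
    print([r for r in rows if contains_any(r, ["2", "4"])])
```

Because whole elements are compared, a value such as 22 does not match a search for 2.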
A simpler method would be to use regexp_like to check if the list has 2 or 4 in it.
select *
from tablename
where regexp_like(crappycolumn,'^[2|4][^0-9]|[^0-9][2|4][^0-9]|[^0-9][2|4]$|^[2|4]$')
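^[2|4][^0-9] - Starts with 2 or 4, not followed by a digit.
[^0-9][2|4][^0-9] - 2 or 4, neither preceded nor followed by a digit.
[^0-9][2|4]$ - Ends with 2 or 4, not preceded by a digit.
^[2|4]$ - 2 or 4 is the only character in the string.
Note that inside a bracket expression | is a literal character, so [2|4] also matches a pipe; [24] expresses "2 or 4" exactly.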
Another form of regexp_like(). This regex looks for 2 or 4 only when preceded by the beginning of the line or a semi-colon and when followed by a semi-colon or the end of the line:
SQL> with crappy_tbl(crappy_col) as (
select '1;2' from dual union
select '4' from dual union
select '1' from dual union
select '2;1' from dual union
select '1;3' from dual union
select '2' from dual union
select '22;;44;' from dual
)
select crappy_col
from crappy_tbl
where regexp_like(crappy_col, '(^|;)(2|4)(;|$)');
CRAPPY_
-------
1;2
2
2;1
4
SQL>
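Since the question mentions that the comparison list arrives from Python, the anchored pattern can be built from that list before it is handed to the query. The sketch below uses a hypothetical helper name and checks the pattern with Python's own re module; Oracle's regex flavour differs in places, so values with special characters should be tested against the database as well.

```python
import re


def build_token_pattern(values, sep=";"):
    """Build '(^|;)(2|4)(;|$)'-style patterns from a list of wanted values."""
    alternatives = "|".join(re.escape(v) for v in values)
    s = re.escape(sep)
    return f"(^|{s})({alternatives})({s}|$)"


if __name__ == "__main__":
    pattern = re.compile(build_token_pattern(["2", "4"]))
    rows = ["1;2", "4", "1", "2;1", "1;3", "2", "22;;44;"]
    print([r for r in rows if pattern.search(r)])
```

The resulting string could then be passed as a bind variable to regexp_like rather than concatenated into the SQL text.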

how to add zeros after decimal in Oracle

I want to add zeroes after the number.
For example, if a = 6895
then a = 6895.00
The datatype of a is number(12).
I am using the code below:
select to_char(6895,'0000.00') from dual;
I'm getting the desired result from this code, but 6895 can be any number, so I would have to add more 0s to the format manually each time.
For example:
select to_char(68955698,'00000000.00') from dual;
Can anyone suggest a better method?
The number format models are the place to start when converting numbers to characters. 0 prepends 0s, which means you'd have to get rid of them somehow. However, 9 means:
Returns value with the specified number of digits with a leading space if positive or with a leading minus if negative. Leading zeros are blank, except for a zero value, which returns a zero for the integer part of the fixed-point number.
So, the following gets you almost there, save for the leading space:
SQL> select to_char(987, '9999.00') from dual;
TO_CHAR(
--------
987.00
You then need to use a format model modifier, FM, which is described thusly:
FM Fill mode. Oracle uses trailing blank characters and leading zeroes
to fill format elements to a constant width. The width is equal to the
display width of the largest element for the relevant format model
...
The FM modifier suppresses the above padding in the return value of
the TO_CHAR function.
This gives you a format model of fm9999.00, which'll work:
SQL> select to_char(987, 'fm9999.00') from dual;
TO_CHAR(
--------
987.00
If you want a lot of digits before the decimal then simply add a lot of 9s.
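If the formatting can happen in the application rather than in SQL, fixed two-decimal output does not need a width-specific mask at all. A minimal Python sketch, with a hypothetical helper name; it is a client-side alternative, not a replacement for the TO_CHAR masks above:

```python
def with_two_decimals(value) -> str:
    """Format a number with exactly two digits after the decimal point."""
    return f"{value:.2f}"


if __name__ == "__main__":
    for n in (987, 6895, 68955698):
        print(with_two_decimals(n))
```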
datatype of a is number(12);
Then use 12 9s in the format model, and keep the decimal part at just 2 digits. Since the column datatype is NUMBER(12), no value can have more than 12 digits, so the mask always fits.
SQL> WITH DATA AS(
       SELECT 12 num FROM dual UNION ALL
       SELECT 234 num FROM dual UNION ALL
       SELECT 9999 num FROM dual UNION ALL
       SELECT 123456789 num FROM dual)
     SELECT to_char(num,'999999999999D99') FROM DATA
     /
TO_CHAR(NUM,'999
----------------
12.00
234.00
9999.00
123456789.00
SQL>
Update Regarding leading spaces
SQL> select ltrim(to_char(549,'999999999999.00')) from dual;
LTRIM(TO_CHAR(54
----------------
549.00
SQL>
Using CASE and SUBSTR it is very simple.
CASE WHEN SUBSTR(COLUMN_NAME,1,1) = '.' THEN '0'||COLUMN_NAME ELSE COLUMN_NAME END

SQL change date formats inside a string

I would like to convert a string containing dates in SQL select from Oracle 11g database.
Original string (CLOB) example:
"1.12.2011 - event 1
2.2.2012 - event 2
13.3.2012 - event 44"
Desired output:
"20111201 - event 1
20120202 - event 2
20120313 - event 44"
Is there a better (faster) way than using 4 separate replacements?
regexp_replace(regexp_replace(regexp_replace(regexp_replace(my_string,
'(\d\d)\.(\d\d)\.(20\d\d)', '\3\2\1'),
'(\d\d)\.(\d)\.(20\d\d)', '\30\2\1'),
'(\d)\.(\d\d)\.(20\d\d)', '\3\20\1'),
'(\d)\.(\d)\.(20\d\d)', '\30\20\1')
Especially if you're using clobs you have to be careful unless you're certain of the data in there.
However, if your clob only looks like that, then you need three regexp_replace calls for this to work; it'll also be much more dynamic. Just explicitly specify digits using [[:digit:]], then specify the minimum and maximum number of times these digits can occur using {1,2}.
Then the following would work:
select regexp_replace(
regexp_replace(
regexp_replace( my_string
, '([[:digit:]]{1,2})\.([[:digit:]]{1,2})\.(20[[:digit:]]{2})'
, '\3-\2-\1')
, '-([[:digit:]]{1}(-|$))'
, '0\1' )
, ('-')
, '')
from dual
This means:
match ( group 1 ) 1 or 2 digits
match a full stop.
match ( group 2 ) 1 or 2 digits
match a full stop
match ( group 3 ) 20 + 2 digits.
Then take out only groups 1, 2 and 3, i.e. ignoring the full stops, and return them in the order 3, 2, 1 separated by hyphens.
Then replace any single [digit] that follows a hyphen and is followed by either a hyphen or the end of the string (i.e. a part with only one digit) with 0[digit].
Lastly remove all the hyphens.
Separately from that I agree with tbone. It would make a lot more sense to store this data in a separate table (event_id number, event_date date). Any string transformations are easy with no chance of getting it wrong, unlike in this situation, and the data is easy to query and compare.
There are no better options (both correct and readable) with better performance; or if there are, no one cares.
I prefer a two-level regexp_replace for the date part:
select regexp_replace(
regexp_replace( my_string,
'([[:digit:]]{1,2})\.([[:digit:]]{1,2})\.(20[[:digit:]]{2})',
'\3-0\2-0\1' ),
'(20[[:digit:]]{2})-0?([[:digit:]]{2})-0?([[:digit:]]{2})',
'\1\2\3' )
from dual;
Maybe try doing:
select to_char(to_date('13.3.2011', 'DD.MM.YYYY'),'YYYYMMDD') from dual;
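The to_date/to_char conversion handles one date at a time. To apply the same idea to every date embedded in a longer string, one option is a regex with a callback that converts each match. The sketch below does this in Python on the application side; the names are hypothetical, and the 20xx year restriction simply mirrors the patterns used above.

```python
import re
from datetime import date

# Day.month.year with 1-2 digit day/month and a 20xx year, as in the question.
DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(20\d{2})\b")


def _to_yyyymmdd(match):
    day, month, year = (int(g) for g in match.groups())
    # date() raises ValueError for impossible dates such as 31.2.2012.
    return date(year, month, day).strftime("%Y%m%d")


def reformat_dates(text: str) -> str:
    """Rewrite every D.M.YYYY date in text as YYYYMMDD."""
    return DATE_RE.sub(_to_yyyymmdd, text)


if __name__ == "__main__":
    sample = "1.12.2011 - event 1\n2.2.2012 - event 2\n13.3.2012 - event 44"
    print(reformat_dates(sample))
```

Going through a real date type means invalid dates fail loudly instead of being silently reshuffled.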

Finding rows that don't contain numeric data in Oracle

I am trying to locate some problematic records in a very large Oracle table. The column should contain all numeric data even though it is a varchar2 column. I need to find the records which don't contain numeric data (The to_number(col_name) function throws an error when I try to call it on this column).
I was thinking you could use a regexp_like condition and use the regular expression to find any non-numerics. I hope this might help?!
SELECT * FROM table_with_column_to_search WHERE REGEXP_LIKE(varchar_col_with_non_numerics, '[^0-9]+');
To get an indicator:
DECODE( TRANSLATE(your_number,' 0123456789',' '), NULL, 'number','contains char')
e.g.
SQL> select DECODE( TRANSLATE('12345zzz_not_numberee',' 0123456789',' '), NULL, 'number','contains char')
     from dual
     /
"contains char"
and
SQL> select DECODE( TRANSLATE('12345',' 0123456789',' '), NULL, 'number','contains char')
     from dual
     /
"number"
and
SQL> select DECODE( TRANSLATE('123405',' 0123456789',' '), NULL, 'number','contains char')
     from dual
     /
"number"
Oracle 11g has regular expressions, so you could use this to get the rows that are actually numeric:
SQL> SELECT colA
     FROM t1
     WHERE REGEXP_LIKE(colA, '^[[:digit:]]+$');
COLA
----------
47845
48543
12
...
If there is a non-numeric value like '23g' it will just be ignored.
In contrast to SGB's answer, I prefer doing the regexp defining the actual format of my data and negating that. This allows me to define values like $DDD,DDD,DDD.DD
In the OP's simple scenario, it would look like
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^[0-9]+$');
which finds all values that are not non-negative integers. If you want to accept negative integers as well, it's an easy change: just add an optional leading minus.
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+$');
Accepting floating-point values as well:
SELECT *
FROM table_with_column_to_search
WHERE NOT REGEXP_LIKE(varchar_col_with_non_numerics, '^-?[0-9]+(\.[0-9]+)?$');
The same goes for any other format. You will generally already have the formats used to validate input data, so when you want to find data that does not match a format, it's simpler to negate that format than to come up with another one; with SGB's approach that would be a bit tricky if you want more than just positive integers.
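The "describe the valid format, then negate it" idea is easy to try out on sample values before running it against a large table. A Python sketch with hypothetical names, using the same format as the last query (optional minus, digits, optional decimal part):

```python
import re

# The accepted format: optional minus, digits, optional fractional part.
NUMERIC_FORMAT = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def non_numeric(values):
    """Return the values that do NOT match the accepted numeric format."""
    # fullmatch() anchors at both ends, like ^...$ in the SQL pattern.
    return [v for v in values if not NUMERIC_FORMAT.fullmatch(v)]


if __name__ == "__main__":
    print(non_numeric(["123", "-4", "1.5", "23g", "1,23", "12."]))
```

Changing the accepted format only means editing NUMERIC_FORMAT; the negation stays the same.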
Use this
SELECT *
FROM TableToSearch
WHERE NOT REGEXP_LIKE(ColumnToSearch, '^-?[0-9]+(\.[0-9]+)?$');
After doing some testing, I came up with this solution; let me know whether it helps.
Add the two conditions below to your query and it will find the records which don't contain numeric data:
and REGEXP_LIKE(<column_name>, '\D') -- this selects non numeric data
and not REGEXP_LIKE(column_name,'^[-]{1}\d{1}') -- this filters out negative(-) values
Starting with Oracle 12.2, the to_number function has an optional ON CONVERSION ERROR clause that can catch the exception and provide a default value.
This can be used to test for numeric values: simply return NULL when the conversion fails and filter for the non-NULL results.
Example
with num as (
select '123' vc_col from dual union all
select '1,23' from dual union all
select 'RV12P2000' from dual union all
select null from dual)
select
vc_col
from num
where /* filter numbers */
vc_col is not null and
to_number(vc_col DEFAULT NULL ON CONVERSION ERROR) is not null
;
VC_COL
---------
123
1,23
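The "try the conversion and fall back to a default" pattern looks like this in application code. This Python sketch uses a hypothetical helper name and the decimal module; unlike the Oracle output above, where '1,23' converted (presumably because of the session's NLS settings), Decimal rejects a comma. Decimal also accepts a few forms Oracle would not, such as 'NaN', so it is an analogy rather than an equivalent.

```python
from decimal import Decimal, InvalidOperation


def to_number_or_none(text):
    """Return a Decimal, or None when text is missing or not convertible."""
    if text is None:
        return None
    try:
        return Decimal(text)
    except InvalidOperation:
        return None


if __name__ == "__main__":
    samples = ["123", "1,23", "RV12P2000", None]
    print([s for s in samples if to_number_or_none(s) is not None])
```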
From http://www.dba-oracle.com/t_isnumeric.htm
LENGTH(TRIM(TRANSLATE(<column_name>, ' +-.0123456789', ' '))) is null
If there is anything left in the string after the TRIM it must be non-numeric characters.
I've found this useful:
select translate('your string','_0123456789','_') from dual
If the result is NULL, it's numeric (ignoring floating point numbers.)
However, I'm a bit baffled why the underscore is needed. Without it the following also returns null, presumably because Oracle treats the empty third argument as NULL:
select translate('s123','0123456789', '') from dual
There is also one of my favorite tricks - not perfect if the string contains stuff like "*" or "#":
SELECT 'is a number' FROM dual WHERE UPPER('123') = LOWER('123')
After doing some testing, building upon the suggestions in the previous answers, there seem to be two usable solutions.
Method 1 is fastest, but less powerful in terms of matching more complex patterns.
Method 2 is more flexible, but slower.
Method 1 - fastest
I've tested this method on a table with 1 million rows.
It seems to be 3.8 times faster than the regex solutions.
The 0-replacement solves the issue that 0 is mapped to a space, and does not seem to slow down the query.
SELECT *
FROM <table>
WHERE TRANSLATE(replace(<char_column>,'0',''),'0123456789',' ') IS NOT NULL;
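The translate approach boils down to "delete every digit and see whether anything is left". A Python sketch of that check, with a hypothetical function name; Python's str.translate can delete characters directly, so the space and 0-replacement workarounds needed in Oracle do not apply here.

```python
# Translation table that deletes the ten digit characters.
_DELETE_DIGITS = str.maketrans("", "", "0123456789")


def has_non_digit(text: str) -> bool:
    """True if anything remains after removing all digits."""
    return text.translate(_DELETE_DIGITS) != ""


if __name__ == "__main__":
    for value in ("12345", "123405", "12345zzz_not_number"):
        print(value, has_non_digit(value))
```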
Method 2 - slower, but more flexible
I've compared the speed of putting the negation inside or outside the regex. Both are equally slow compared with the translate solution. As a result, @ciuly's approach seems most sensible when using regex.
SELECT *
FROM <table>
WHERE NOT REGEXP_LIKE(<char_column>, '^[0-9]+$');
You can use this check:
create or replace function to_n(c varchar2) return number is
begin
  return to_number(c);
exception
  -- any conversion failure maps to a sentinel value
  when others then return -123456;
end;
/
select id, n from t where to_n(n) = -123456;
I tried an ORDER BY on the problematic column and found the offending rows that way.
SELECT
D.UNIT_CODE,
D.CUATM,
D.CAPITOL,
D.RIND,
D.COL1 AS COL1
FROM
VW_DATA_ALL_GC D
WHERE
(D.PERIOADA IN (:pPERIOADA)) AND
(D.FORM = 62)
AND D.COL1 IS NOT NULL
-- AND REGEXP_LIKE (D.COL1, '[[:alpha:]]')
-- AND REGEXP_LIKE(D.COL1, '[[:digit:]]')
--AND REGEXP_LIKE(TO_CHAR(D.COL1), '[^0-9]+')
GROUP BY
D.UNIT_CODE,
D.CUATM,
D.CAPITOL,
D.RIND ,
D.COL1
ORDER BY
D.COL1