How to remove symbols from a string in Oracle? - sql

In an Oracle database I have the following table.
| TREE | ORG_NAME |
|---------------------------------|----------|
| \Google earth\Nest global\ATAP | ATAP |
| \Google earth\Nest\Beemoney\ | Beemoney |
| \Google\\\BeeKey\ | |
| | York |
I am trying to write a SQL query that returns the following result.
| ORGANIZATION |
|-----------------------------------|
| Google earth > Nest global > ATAP |
| Google earth > Nest > Beemoney |
| Google > BeeKey |
| York |
As you can see, I want to:
1) Remove the \ symbol at the beginning and end of the string.
2) Replace each \ symbol inside the string with a > symbol.
3) Replace each \\\ sequence inside the string with a single > symbol.
4) If the TREE column is empty, take the value from the ORG_NAME column.
Here is how I started. This SQL query solves parts 2, 3 and 4. How can I solve part 1? I think I need to use REGEXP_REPLACE, right? How do I use it correctly? Is there a more elegant way to redesign the query? As you can see, I go over the same table several times.
SELECT
COALESCE (TREE, ORG_NAME) as ORGANIZATION
FROM (
SELECT
REPLACE(TREE, '\', '>') AS TREE,
ORG_NAME
FROM (
SELECT
REPLACE(TREE, '\\\', '>') AS TREE,
ORG_NAME
FROM
ORG
)
)

One way is to use trim to remove the characters from the beginning and the end of the string, and regexp_replace to convert the separators that remain:
select nvl(regexp_replace( trim('\' from tree), '\\+', ' > '), org_name)
from yourTable
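To make the three steps of that expression easier to follow, here is a minimal Python sketch that mirrors them (trim, collapse each run of backslashes, fall back to ORG_NAME). It is not Oracle code; the function name organization_label is made up for illustration, and the rows are the sample data from the question.

```python
import re

def organization_label(tree, org_name):
    # Oracle treats '' as NULL, so an empty or missing tree falls back to org_name.
    trimmed = (tree or "").strip("\\")      # like trim('\' from tree)
    label = re.sub(r"\\+", " > ", trimmed)  # each run of backslashes becomes ' > '
    return label or org_name                # like nvl(..., org_name)

# Sample rows from the question (backslashes doubled for Python string literals).
rows = [
    ("\\Google earth\\Nest global\\ATAP", "ATAP"),
    ("\\Google earth\\Nest\\Beemoney\\", "Beemoney"),
    ("\\Google\\\\\\BeeKey\\", None),
    (None, "York"),
]

for tree, org_name in rows:
    print(organization_label(tree, org_name))
```

Note that the fallback only applies when nothing is left after trimming; a tree with real content always wins over org_name, as in the nvl call.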

Here is a working solution which uses two calls to regexp_replace:
select
regexp_replace(
regexp_replace('\Google\\\BeeKey\', '^\\?(.*?)\\?$', '\1'), '\\+', ' > ')
from dual;
Google > BeeKey
The inner call to regexp_replace strips off a leading and a trailing path separator, if present. The outer call converts each run of one or more internal path separators (\) to a single > separator.
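One detail of this pattern is that \\? is optional but not repeated, so at most one separator is removed from each end. The Python sketch below uses the same two patterns to show that; it is a mirror of the logic and not Oracle code, and the function names are hypothetical.

```python
import re

def strip_one_separator(path):
    # Same pattern as the inner regexp_replace: at most one leading and
    # one trailing backslash are dropped.
    return re.sub(r"^\\?(.*?)\\?$", r"\1", path)

def to_label(path):
    # Same pattern as the outer regexp_replace: each run of backslashes
    # becomes a single ' > '.
    return re.sub(r"\\+", " > ", strip_one_separator(path))

print(to_label("\\Google\\\\\\BeeKey\\"))  # the value used in the answer
print(repr(to_label("\\\\Google\\")))      # two leading backslashes
```

With two leading backslashes, one survives the inner step and then turns into a leading ' > '. If the data can contain doubled separators at the ends, trimming the character as in the other answer may be the safer choice.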

Related

Regex to match the pattern split with slash

I want to query a database column with a regex that matches strings like the following:
1. qwge1/dg2/hjetg3
2. tahry4/rtg5
3. jtyg6
How can I allow zero or more slashes and match the [a-z]+[0-9] part of each segment?
You could use:
^([a-z]+[0-9](/|$))+$
The inner expression, [a-z]+[0-9](/|$), describes a series of alphabetic characters followed by a digit, then by a slash or the end of the string. This expression may be repeated 1 to N times, followed by the end of the string.
Demo on DB Fiddle - I added a few non-matching strings to your sample data:
select val, val ~ '^([a-z]+[0-9](/|$))+$'
from (values
('qwge1/dg2/hjetg3'),
('tahry4/rtg5'),
('jtyg6'),
('abc'),
('qwge1/dg2/hjetg'),
('qwge1/dg2/3')
) x(val)
val | ?column?
:--------------- | :-------
qwge1/dg2/hjetg3 | t
tahry4/rtg5 | t
jtyg6 | t
abc | f
qwge1/dg2/hjetg | f
qwge1/dg2/3 | f
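The same pattern can be tried outside the database. Below is a small Python sketch that runs it against the sample values; Python's re module is used here only as a stand-in for the database regex engine. It also shows one thing to be aware of: because (/|$) accepts a slash right before the end of the string, a value with a trailing slash passes. STRICT is a hypothetical variant, not part of the answer above, that rejects a trailing slash.

```python
import re

# Pattern from the answer above.
PATTERN = re.compile(r"^([a-z]+[0-9](/|$))+$")

# Hypothetical stricter variant: segments separated by single slashes,
# with no slash allowed at the end.
STRICT = re.compile(r"^[a-z]+[0-9](/[a-z]+[0-9])*$")

samples = [
    "qwge1/dg2/hjetg3",
    "tahry4/rtg5",
    "jtyg6",
    "abc",
    "qwge1/dg2/hjetg",
    "qwge1/dg2/3",
    "jtyg6/",  # extra case: trailing slash
]

for val in samples:
    print(val, bool(PATTERN.search(val)), bool(STRICT.search(val)))
```

The two patterns agree on the six values from the demo and differ only on the trailing-slash case.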

sql-remove dashes from string column

In a stored procedure, I have this field:
LTRIM(ISNULL(O.Column1, ''))
If there is a dash (-) symbol at the start or end of the value, I want to remove it. Dashes elsewhere in the value should stay.
Any suggestions?
EDIT:
Microsoft SQL Server 2014 12.0.5546.0
Expected output:
1)input: "abc-abc" //output: "abc-abc"
2)input: "abc-" //output: "abc"
3)input: "abc" //output: "abc"
I think you might be stuck with string manipulation here.
The CASE expression here takes the LTRIM/RTRIM result from your column and checks first whether both ends have a dash, and then each end separately. If a dash exists at an end, it strips it out. It's not pretty, and won't perform well on a mountain of data, but it will do what you need.
Data setup:
create table trim (col1 varchar(10));
insert trim (col1)
values
('abc'),
(' abc-'),
('abc- '),
('abc-abc '),
(' -abc'),
('-abc '),
(NULL),
(''),
(' -abc- ');
The query:
select
case
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
and left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then substring(ltrim(rtrim(isnull(col1,''))),2,len(ltrim(rtrim(isnull(col1,''))))-2)
when right(ltrim(rtrim(isnull(col1,''))),1) = '-'
then left(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
when left(ltrim(rtrim(isnull(col1,''))),1) = '-'
then right(ltrim(rtrim(isnull(col1,''))), len(ltrim(rtrim(isnull(col1,''))))-1)
else ltrim(rtrim(isnull(col1,'')))
end as trimmed
from trim;
Results:
+---------+
| trimmed |
+---------+
| abc |
| abc |
| abc |
| abc-abc |
| abc |
| abc |
| |
| |
| abc |
+---------+
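For readers who find the nested CASE hard to follow, the same logic can be sketched in a few lines of Python: treat NULL as an empty string, trim spaces, then drop one dash from each end if present. This is only a mirror of the query for illustration (the name trim_dashes is made up), run on the same sample values.

```python
def trim_dashes(value):
    # isnull(col1, '') followed by ltrim/rtrim (spaces only)
    s = (value or "").strip(" ")
    # Drop one leading dash, if any
    if s.startswith("-"):
        s = s[1:]
    # Drop one trailing dash, if any
    if s.endswith("-"):
        s = s[:-1]
    return s

samples = ["abc", " abc-", "abc- ", "abc-abc ", " -abc", "-abc ", None, "", " -abc- "]

for value in samples:
    print(repr(trim_dashes(value)))
```

Checking each end independently covers the "both ends" branch of the CASE without a separate condition.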
Since the database is not mentioned, here is how to do it (or rather, where to find it):
SQL Server
Remove the last character in a string in T-SQL?
Oracle
Remove last character from string in sql plus
Postgresql
Postgresql: Remove last char in text-field if the column ends with minus sign
MySQL
Strip last two characters of a column in MySQL
You can use the RIGHT function, along with SUBSTRING, to achieve the result.
SELECT CASE WHEN RIGHT(stringVal,1)= '-' THEN SUBSTRING(stringVal,1,LEN(stringVal)-1)
ELSE stringVal END AS ModifiedString
from
( VALUES ('abc-abc'), ('abc-'),('abc')) as t(stringVal)
+----------------+
| ModifiedString |
+----------------+
| abc-abc |
| abc |
| abc |
+----------------+

Oracle SQL - Substring issue

I have a field named pattern, and the value in that field is INDI/17-18/6767/KER/787. I want to get 6767 from this string.
I used the query
select substr(pattern,12,15) from pattern_table
But the output I got is 6767/KER/787 instead of 6767.
The 3rd argument of SUBSTR is the length, not the end position. Try this:
SELECT SUBSTR(pattern,12,4) FROM pattern_table
For a generic result to get the 3rd value separated by a delimiter, you may use REGEXP_SUBSTR.
Query 1:
SELECT pattern,REGEXP_SUBSTR(pattern, '[^/]+', 1, 3) id
FROM pattern_table
Results:
| PATTERN | ID |
|--------------------------|-------|
| INDI/17-18/6767/KER/787 | 6767 |
| INDI/17-18-19/67/KER/787 | 67 |
| INDI/16-18/67890/KAR/986 | 67890 |
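To see why the original query returned 6767/KER/787, here is a rough Python sketch of the two ideas above: SUBSTR takes a 1-based start and a length, and REGEXP_SUBSTR with '[^/]+' and occurrence 3 picks the third run of non-slash characters. The helpers oracle_substr and nth_field are hypothetical names written for this illustration, not Oracle functions.

```python
import re

def oracle_substr(s, start, length):
    # Mimics SUBSTR(s, start, length) for a positive, 1-based start.
    return s[start - 1:start - 1 + length]

def nth_field(s, n):
    # Mimics REGEXP_SUBSTR(s, '[^/]+', 1, n): the n-th run of non-slash characters.
    fields = re.findall(r"[^/]+", s)
    return fields[n - 1] if n <= len(fields) else None

pattern = "INDI/17-18/6767/KER/787"

print(oracle_substr(pattern, 12, 15))  # length 15 runs to the end of the string
print(oracle_substr(pattern, 12, 4))   # length 4 stops after the third field
print(nth_field(pattern, 3))
print(nth_field("INDI/17-18-19/67/KER/787", 3))
```

One caveat with the '[^/]+' approach: empty fields (two slashes in a row) are skipped rather than counted, so it fits data like the samples here, where every field has a value.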
This will also work:
SELECT substr('INDI/17-18/6767/KER/787', instr('INDI/17-18/6767/KER/787', '/', 1, 2) + 1, 4) FROM dual;

regex to convert alphanumeric and special characters in a string to * in oracle

I have a requirement to convert all the characters in my string to *. My string can contain special characters as well.
For Example:
abc_d$ should be converted to ******.
Can anybody help me with a regex for this in Oracle?
Thanks
Use REGEXP_REPLACE and replace any single character (.) with *.
SELECT
REGEXP_REPLACE (col, '.', '*')
FROM yourTable
Instead of regex you could also use
select rpad('*', length('abc_d$ s'),'*') from dual
-- use '*' and pad it until length fits with other *
Docs: rpad(string,length,appendWhat)
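Repeat with a string of '*' should work as well where the database provides it: repeat(string,count) (not tested; Oracle itself has no built-in REPEAT function, while MySQL and PostgreSQL do)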
regex or rpad makes no difference to the execution plan - both come down to the same plan:
n-th try of rpad:
Plan Hash Value : 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 | 00:00:01 |
| 1 | FAST DUAL | | 1 | | 2 | 00:00:01 |
-----------------------------------------------------------------
n-th try of regexp_replace:
Plan Hash Value : 1388734953
-----------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost | Time |
-----------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 | 00:00:01 |
| 1 | FAST DUAL | | 1 | | 2 | 00:00:01 |
-----------------------------------------------------------------
So, as far as the execution plan goes, it does not matter which you use.
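The two masking approaches can be compared side by side in a short Python sketch: one replaces every character matched by '.', the other builds a string of asterisks of the same length. This is a mirror of the idea and not Oracle code; mask_regex and mask_pad are made-up names. The last value shows where the two can differ: in Python '.' does not match a newline by default, and as far as I know Oracle behaves the same way unless the 'n' match parameter is passed to REGEXP_REPLACE.

```python
import re

def mask_regex(s):
    # Like REGEXP_REPLACE(col, '.', '*'): every matched character becomes '*'.
    return re.sub(r".", "*", s)

def mask_pad(s):
    # Like RPAD('*', LENGTH(col), '*'): a run of '*' as long as the input.
    return "*" * len(s)

for value in ["abc_d$", "abc_d$ s", "a\nb"]:
    print(repr(mask_regex(value)), repr(mask_pad(value)))
```

For single-line values both give the same result, so the choice comes down to readability and run time.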
THIS IS NOT AN ANSWER
As suggested by Tom Biegeleisen’s brother Tim, I ran a test to compare a solution based on regular expressions to one using just standard string functions. (Specifically, Tim's answer with regular expressions vs. Patrick Artner's solution using just LENGTH and RPAD.)
Details of the test are shown below.
CONCLUSION: On a table with 5 million rows, each consisting of one string of length 30 (in a single column), the regular expression query runs in 21 seconds. The query using LENGTH and RPAD runs in one second. Both solutions read all the data from the table; the only difference is the function used in the SELECT clause. As noted already, both queries have the same execution plan, AND the same estimated cost - because the cost does not take into account differences in function calculation time.
Setup:
create table tbl ( str varchar2(30) );
insert into tbl
select a.str
from ( select dbms_random.string('p', 30) as str
from dual
connect by level <= 100
) a
cross join
( select level
from dual
connect by level <= 50000
) b
;
commit;
Note that there are only 100 distinct values, and each is repeated 50,000 times for a total of 5 million values. We know the values are repeated; Oracle doesn't know that. It will really do "the same thing" 5 million times, it won't just do it 100 times and then simply copy the results; it's not that smart. This is something that would be known only by seeing the actual stored data, it's not known to Oracle beforehand, so it can't "prepare" for such shortcuts.
Queries:
The two queries - note that I didn't want to send 5 million rows to screen, nor did I want to populate another table with the "masked" values (and muddy the waters with the time it takes to INSERT the results into another table); rather, I compute all the new strings and take the MAX. Again, in this test all "new" strings are equal to each other - they are all strings of 30 asterisks - but there is no way for Oracle to know that. It really has to compute all 5 million new strings and take the max over them all.
select max(new_str)
from ( select regexp_replace(str, '.', '*' ) as new_str
from tbl
)
;
select max(new_str)
from ( select rpad('*', length(str), '*') as new_str
from tbl
)
;
Try this:
SELECT
REGEXP_REPLACE('B^%2',
'*([A-Z]|[a-z]|[0-9]|[ ]|([^A-Z]|[^a-z]|[^0-9]|[^ ]))', '*') "REGEXP_REPLACE"
FROM DUAL;
I have included whitespace too.
select name,lpad(regexp_replace(name,name,'*'),length(name),'*')
from customer;

Replacing first occurrence of character in a string using HiveQL

I am trying to replace the first occurrence of '-' in a string in a Hive table. I am using HiveQL. I searched for this topic here and on other websites, but could not find a clear explanation of how to use metacharacters with regexp_replace() to do that.
This is a string in which I need to replace the first '-' with an empty string: 16-001-02707
The result should be like this: 16001-02707
This is the method I used:
select regexp_replace ('16-001-02707','[^[:digit:]]', '');
However, this doesn't do anything.
select regexp_replace ('16-001-02707','^(.*?)-', '$1');
16001-02707
Following the OP's question in the comments:
with t as (select '111-22-333333-4-555-6-7-8888-999999' as col)
select regexp_replace (col,'^(.*?)-','$1')
,regexp_replace (col,'^(.*?-.*?)-','$1')
,regexp_replace (col,'^((.*?-){2}.*?)-','$1')
,regexp_replace (col,'^((.*?-){3}.*?)-','$1')
,regexp_replace (col,'^((.*?-){4}.*?)-','$1')
,regexp_replace (col,'^((.*?-){5}.*?)-','$1')
from t
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
| _c0 | _c1 | _c2 | _c3 | _c4 | _c5 |
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
| 11122-333333-4-555-6-7-8888-999999 | 111-22333333-4-555-6-7-8888-999999 | 111-22-3333334-555-6-7-8888-999999 | 111-22-333333-4555-6-7-8888-999999 | 111-22-333333-4-5556-7-8888-999999 | 111-22-333333-4-555-67-8888-999999 |
+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+------------------------------------+
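The six patterns above follow one scheme: capture everything up to, but not including, the n-th dash, then put the captured text back without that dash. Here is a Python sketch of that scheme with n as a parameter. It is only an illustration of the regex idea, not HiveQL; remove_nth_dash is a hypothetical helper, and Python writes the backreference as \1 where Hive uses $1.

```python
import re

def remove_nth_dash(s, n):
    # Group 1 lazily captures (n - 1) "text-" chunks plus the text before
    # the n-th dash; the n-th dash itself sits outside the group and is dropped.
    pattern = r"^((?:.*?-){%d}.*?)-" % (n - 1)
    return re.sub(pattern, r"\1", s)

print(remove_nth_dash("16-001-02707", 1))

col = "111-22-333333-4-555-6-7-8888-999999"
for n in range(1, 7):
    print(remove_nth_dash(col, n))
```

If the string has fewer than n dashes, the pattern does not match and the value comes back unchanged.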