How to split a camel case sentence into a tsvector (PostgreSQL)? - sql

When I run the query
SELECT to_tsvector('simple','logDescription');
the output is:
to_tsvector
--------------------
'logdescription':1
I want output like the following. Is there a way to achieve it?
to_tsvector
--------------------
'log': 1 'Description':1

The simple way is to split your string into words using regexp_replace before converting it into a tsvector:
SELECT to_tsvector(
'simple',
regexp_replace('logDescription', '([a-z])([A-Z])', '\1 \2','g')
);
https://sqlize.online/sql/psql15/6647b42a8cb3bb752c0a40d52811761e/
+=========================+
| to_tsvector |
+=========================+
| 'description':2 'log':1 |
+-------------------------+
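The pattern above only splits where a lowercase letter is followed by an uppercase one, so a boundary after a digit is left alone. The following is a minimal sketch of one possible extension, assuming digits should also end a word; the sample values are made up for illustration and are not from the question.

```sql
-- Sketch: widen the first character class so a digit can also end a word.
-- The values in the VALUES list are hypothetical.
SELECT s,
       -- 'logDescription' becomes 'log Description',
       -- 'errorCode2Text' becomes 'error Code2 Text'
       regexp_replace(s, '([a-z0-9])([A-Z])', '\1 \2', 'g') AS split_text,
       to_tsvector('simple',
                   regexp_replace(s, '([a-z0-9])([A-Z])', '\1 \2', 'g')) AS tsv
FROM (VALUES ('logDescription'), ('errorCode2Text')) AS t(s);
```

Runs of capitals such as an embedded acronym are still not split by this pattern and would need their own rule.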

Related

SQL - trimming values before bracket

I have a column of values where some values contain brackets with text which I would like to remove. This is an example of what I have and what I want:
CREATE TABLE test
(column_i_have varchar(50),
column_i_want varchar(50));
INSERT INTO test (column_i_have, column_i_want)
VALUES ('hospital (PWD)', 'hopistal'),
('nursing (LLC)','nursing'),
('longterm (AT)', 'longterm'),
('inpatient', 'inpatient');
I have only come across approaches that use the number of characters or the position to trim the string, but these values have varying lengths. One approach I tried was:
TRIM('(*',col1)
but it doesn't work. Is there a way to do this in PostgreSQL without using the position? Thank you!
If all the values contain "valid" brackets, then you can use the split_part function without any regular expressions:
select
test.*,
trim(split_part(column_i_have, '(', 1)) as res
from test
column_i_have | column_i_want | res
:------------- | :------------ | :--------
hospital (PWD) | hopistal | hospital
nursing (LLC) | nursing | nursing
longterm (AT) | longterm | longterm
inpatient | inpatient | inpatient
You can replace partial patterns using regular expressions. For example:
select *, regexp_replace(v, '\([^\)]*\)', '', 'g') as r
from (
select '''hospital (PWD)'', ''nursing (LLC)'', ''longterm (AT)'', ''inpatient''' as v
) x
Result:
r
-------------------------------------------------
'hospital ', 'nursing ', 'longterm ', 'inpatient'
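The regular expression above leaves the space that preceded the bracket in place. As a rough sketch, assuming the bracketed text is always at the end of the value, the pattern can also absorb that whitespace and be applied to the column directly. The inline VALUES list stands in for the test table from the question.

```sql
-- Sketch: remove optional whitespace, the first '(' and everything after it.
-- Assumes nothing worth keeping follows the bracket.
SELECT column_i_have,
       regexp_replace(column_i_have, '\s*\(.*$', '') AS res
FROM (VALUES ('hospital (PWD)'),
             ('nursing (LLC)'),
             ('longterm (AT)'),
             ('inpatient')) AS test(column_i_have);
-- res: hospital, nursing, longterm, inpatient
```

Values without a bracket do not match the pattern and are returned unchanged.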
Could it be as easy as:
SELECT SUBSTRING(column_i_have, '\w+') AS column_i_want FROM test
If not, and you still want to use SUBSTRING() to get everything up to but excluding the parenthesis, then maybe:
SELECT SUBSTRING(column_i_have, '^(.+?)(?:\s*\(.*)?$') AS column_i_want FROM test
But if you really only need the text up to the opening parenthesis, then maybe just use SPLIT_PART():
SELECT SPLIT_PART(column_i_have, ' (', 1) AS column_i_want FROM test

Merging tags to values separated by new line character in Oracle SQL

I have a database field with several values separated by newline.
E.g. (there can be more than 3 values):
A
B
C
I want to modify these values by adding tags at the front and end,
i.e. the previous 3 values should be turned into
<Test>A</Test>
<Test>B</Test>
<Test>C</Test>
Is there any possible query operation in Oracle SQL to perform such an operation?
Just replace the start and end of each string with the XML tags using a multi-line match parameter of the regular expression:
SELECT REGEXP_REPLACE(
REGEXP_REPLACE( value, '^', '<Test>', 1, 0, 'm' ),
'$', '</Test>', 1, 0, 'm'
) AS replaced_value
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name ( value ) AS
SELECT 'A
B
C' FROM DUAL;
Outputs:
| REPLACED_VALUE |
| :------------- |
| <Test>A</Test> |
| <Test>B</Test> |
| <Test>C</Test> |
You can use the plain REPLACE function as follows:
Select '<Test>'
|| replace(your_column,chr(10),'</Test>'||chr(10)||'<Test>')
|| '</Test>'
From your_table;
It should be faster than the regexp_replace version.
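The two nested REGEXP_REPLACE calls can probably be collapsed into one by capturing each line and wrapping it. This is only a sketch, assuming lines are separated by CHR(10) and none of them is empty; the WITH clause recreates the sample data inline.

```sql
-- Sketch: with the 'm' match parameter, ^ and $ anchor at every line,
-- and . does not match the newline, so (.*) captures one line at a time.
WITH table_name (value) AS (
  SELECT 'A' || CHR(10) || 'B' || CHR(10) || 'C' FROM DUAL
)
SELECT REGEXP_REPLACE(value, '^(.*)$', '<Test>\1</Test>', 1, 0, 'm') AS replaced_value
FROM table_name;
```

Data with Windows line endings (CHR(13) || CHR(10)) would need the carriage return handled separately in any of these variants.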

Remove single quotes in Oracle

I have a string like (''acc','xyz'') and I need the output to be ('acc','xyz').
What query or regular expression would remove the extra quotes?
Try the REPLACE function:
replace(q'[(''acc','xyz'')]', q'['']',q'[']')
Demo: http://www.sqlfiddle.com/#!4/abb5d3/1
SELECT replace(q'[(''acc','xyz'')]', q'['']',q'[']')
FROM dual;
| REPLACE(Q'[(''ACC','XYZ'')]',Q'['']',Q'[']') |
|----------------------------------------------|
| ('acc','xyz') |
If that string always looks like you described, then replace two consecutive single quotes (CHR(39)) with a single one, such as
with test (col) as
(select q'[(''acc','xyz'')]' from dual)
select col,
replace(col, chr(39)||chr(39), chr(39)) result
from test;
COL RESULT
--------------- ---------------
(''acc','xyz'') ('acc','xyz')
Why CHR(39)? Because this: replace(col, '''''', '''') is difficult to read, and this: replace(col, q'['']', q'[']') looks stupid, but feel free to use any of these (or invent your own way).

Non-greedy Oracle SQL regexp_replace [duplicate]

I'm having some issues dealing with the non-greedy regex operator in Oracle.
This seems to work:
select regexp_replace('abcc', '^ab.*?c', 'Z') from dual;
-- output: Zc (does not show greedy behavior)
while this does not:
select regexp_replace('abc:"123", def:"456", hji="789", dasdjaoijdsa', '(^.*def:")(.*?)(".*$)', '\2') from dual;
-- output: 456", hji="789 (shows greedy behavior)
-- I would expect 456 as output.
Is there something glaringly obvious that I may be missing here?
Thanks
You can use a non-greedy regular expression in REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR(
'abc:"123", def:"456", hji="789", dasdjaoijdsa', -- input
'def:"(.*?)"', -- pattern
1, -- start character
1, -- occurrence
NULL, -- flags
1 -- capture group
) AS def
FROM DUAL;
Results:
| DEF |
|-----|
| 456 |
If you want to skip escaped quotation marks then you can use:
SELECT REGEXP_SUBSTR(
'abc:"123", def:"456\"Test\"", hji="789", dasdjaoijdsa',
'def:"((\\"|[^"])*)"',
1,
1,
NULL,
1
) AS def
FROM DUAL;
Results:
| DEF |
|-------------|
| 456\"Test\" |
Update:
You can get your query to work by making the first wild-card match non-greedy:
select regexp_replace(
'abc:"123", def:"456", hji="789", dasdjaoijdsa',
'(^.*?def:")(.*?)(".*$)',
'\2'
) AS def
FROM DUAL;
Results:
| DEF |
|-----|
| 456 |
I don't know exactly why your regex replace is failing, but I can offer a version of your query which is working:
select
regexp_replace('abc:"123", def:"456", hji="789", dasdjaoijdsa',
'^(.*def:")([^"]*).*',
'\2') from dual
The only explanation I have is that lazy dot isn't working, at least not in the context of the capture group. When I switch ([^"]*) above to (.*?), the query will fail.
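Both working variants above avoid combining a greedy .* with a later lazy .*? in the same pattern. A minimal sketch along the same lines, assuming the value never contains a double quote, uses a negated character class with REGEXP_SUBSTR so that no lazy quantifier is needed at all:

```sql
-- Sketch: [^"]* stops at the next double quote by construction,
-- so the result does not depend on greedy versus lazy matching.
SELECT REGEXP_SUBSTR(
         'abc:"123", def:"456", hji="789", dasdjaoijdsa', -- input
         'def:"([^"]*)"',                                  -- pattern
         1,                                                -- start character
         1,                                                -- occurrence
         NULL,                                             -- flags
         1                                                 -- capture group
       ) AS def
FROM DUAL;
-- def: 456
```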

SQL Regex to select string between second and third forward slash

I am using Postgres/Redshift to query a table of URLs and am trying to use
SELECT regexp_substr to select the string that is between the second and third forward slash in the column.
For example I need the second slash delimited string in the following data:
/abc/required_string/5856365/
/abc/required_string/2/
/abc/required_string/l-en/
/abc/required_string/l-en/
Following some of the regexes in this thread:
SELECT regexp_substr(column, '/[^/]*/([^/]*)/')
FROM table
None seem to work. I keep getting:
/abc/required_string/
/abc/required_string/
What about split_part?
SELECT split_part(column, '/', 3) FROM table
Example:
select split_part ('/abc/required_string/2/', '/', 3)
Returns: required_string
This may work:
PostgreSQL 9.3 Schema Setup:
CREATE TABLE t
("c" varchar(29))
;
INSERT INTO t
("c")
VALUES
('/abc/required_string/5856365/'),
('/abc/required_string/2/'),
('/abc/required_string/l-en/'),
('/abc/required_string/l-en/')
;
Query 1:
SELECT substring("c" from '/[^/]*/([^/]*)/')
FROM t
Results:
| substring |
|-----------------|
| required_string |
| required_string |
| required_string |
| required_string |
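The likely reason the original attempt returned /abc/required_string/ is that regexp_substr, called with only a string and a pattern, returns the whole match, whereas substring(... from pattern) returns the first parenthesized group. As a sketch for PostgreSQL 10 or later, regexp_match exposes the capture groups as an array; this function may not be available on Redshift, and the inline VALUES list stands in for the table.

```sql
-- Sketch: regexp_match returns text[] holding the capture groups,
-- so element [1] is the segment between the second and third slash.
SELECT c,
       (regexp_match(c, '/[^/]*/([^/]*)/'))[1] AS second_segment
FROM (VALUES ('/abc/required_string/5856365/'),
             ('/abc/required_string/l-en/')) AS t(c);
-- second_segment: required_string for both rows
```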