How do you escape this regular expression? - sql

I'm looking for
"House M.D." (2004)
with anything after it. I've tried where id~'"House M\.D\." \(2004\).*'; and there are no matches.
This works: id~'.*House M.D..*2004.*'; but it is a little slow.

I suspect you're on an older PostgreSQL version that interprets strings in a non standards-compliant C-escape-like mode by default, so the backslashes are being treated as escapes and consumed. Try SET standard_conforming_strings = 'on';.
As per the lexical structure documentation on string constants, you can either:
Ensure that standard_conforming_strings is on, in which case you must double any single quotes (i.e. ' becomes '') but backslashes aren't treated as escapes:
id ~ '"House M\.D\." \(2004\)'
Use the non-standard, PostgreSQL-specific E'' syntax and double your backslashes:
id ~ E'"House M\\.D\\." \\(2004\\)'
PostgreSQL versions 9.1 and above set standard_conforming_strings to on by default; see the documentation.
You should turn it on in older versions after testing your code, because it'll make updating later much easier. You can turn it on globally in postgresql.conf, on a per-user level with ALTER ROLE ... SET, on a per-database level with ALTER DATABASE ... SET or on a session level with SET standard_conforming_strings = on. Use SET LOCAL to set it within a transaction scope.
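To see why the original pattern found nothing, it may help to model what the regex engine receives once the backslashes have been consumed. The sketch below is Python, not PostgreSQL: strip_backslashes is a hypothetical helper that only imitates the "backslash followed by an ordinary character yields that character" rule, and Python's re module stands in for PostgreSQL's regex engine.

```python
import re

def strip_backslashes(literal: str) -> str:
    """Toy model of an escape-style string literal: a backslash followed by
    any character yields that character (\\n, \\t, octal etc. are ignored)."""
    return re.sub(r"\\(.)", r"\1", literal)

title = '"House M.D." (2004) Season 1'

as_written = r'"House M\.D\." \(2004\)'      # what the query author typed
as_received = strip_backslashes(as_written)  # what the regex engine would see

# With the backslashes intact, the dots and parentheses are literal.
print(bool(re.search(as_written, title)))    # True

# Without them, (2004) is a capture group matching just "2004",
# so the literal "(" in the title is never matched.
print(as_received)                           # "House M.D." (2004)
print(bool(re.search(as_received, title)))   # False
```

Under this model the pattern silently changes meaning rather than failing, which is consistent with the "no matches" symptom in the question.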

It looks like your regexp is OK:
http://sqlfiddle.com/#!12/d41d8/113

CREATE OR REPLACE FUNCTION public.regexp_quote(IN TEXT)
RETURNS TEXT
LANGUAGE plpgsql
STABLE
AS $$
/*******************************************************************************
* Function Name: regexp_quote
* In-coming Param:
* The string to escape for literal use in a regular expression.
* Returns:
* This function produces a TEXT that can be used as a regular expression
* pattern that would match the input as if it were a literal pattern.
* Description:
* Takes a TEXT value and escapes all of the necessary characters so that
* the output can be used as a regular expression to match the input as if
* it were a literal pattern.
******************************************************************************/
BEGIN
RETURN REGEXP_REPLACE($1, '([[\\](){}.+*^$|\\\\?-])', '\\\\\\1', 'g');
END;
$$
Test:
SELECT regexp_quote('"House M.D." (2004)'); -- produces: "House M\\.D\\." \\(2004\\)
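For comparison, here is a minimal Python sketch of the same idea, assuming the same set of metacharacters as the SQL function above. It is an illustration in a different language, not a replacement for the PL/pgSQL version, and it sidesteps the SQL string-literal question entirely by using raw strings.

```python
import re

def regexp_quote(text: str) -> str:
    """Backslash-escape the regex metacharacters listed in the SQL version."""
    return re.sub(r"([\[\](){}.+*^$|\\?-])", r"\\\1", text)

title = '"House M.D." (2004)'
pattern = regexp_quote(title)

print(pattern)  # "House M\.D\." \(2004\)

# The quoted pattern matches the original text and nothing looser.
print(re.fullmatch(pattern, title) is not None)                # True
print(re.fullmatch(pattern, '"House MXDX" 2004') is not None)  # False
```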

Related

PostgreSQL RETURNING fails with REGEXP_REPLACE

I'm running PostgreSQL 9.4 and am inserting a lot of records into my database. I use the RETURNING clause to get values back for further use after an insert.
When I simply run:
... RETURNING my_car, brand, color, contact
everything works, but if I try to use REGEXP_REPLACE it fails:
... RETURNING my_car, brand, color, REGEXP_REPLACE(contact, '^(\+?|00)', '') AS contact
it fails with:
ERROR: invalid regular expression: quantifier operand invalid
If I simply run the query directly in PostgreSQL, it works and returns the expected output.
I tried to reproduce this and could not:
t=# create table s1(t text);
CREATE TABLE
t=# insert into s1 values ('+4422848566') returning REGEXP_REPLACE(t, '^(\+?|00)', '');
regexp_replace
----------------
4422848566
(1 row)
INSERT 0 1
So, elaborating on the reason @pozs suggested:
set standard_conforming_strings to off;
leads to
WARNING: nonstandard use of escape in a string literal
LINE 1: ...alues ('+4422848566') returning REGEXP_REPLACE(t, '^(\+?|00)...
^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
ERROR: invalid regular expression: quantifier operand invalid
update
As the OP reports, standard_conforming_strings is on (the default since 9.1) when working with psql and off when working with pg-promise.
update from vitaly-t
The issue is simply with the JavaScript literal escaping, not with the flag.
He elaborates further in his answer.
The current value of the standard_conforming_strings setting is inconsequential here. You can see this by prefixing your query with SET standard_conforming_strings = true;, which changes nothing.
Passing in a regEx string unescaped from the client is the same as using E prefix from the command line: E'^(\+?|00)'.
In JavaScript, \ is treated as a special symbol inside string literals, so you always have to write \\ to produce a single backslash, which is what your regular expression needs.
Other than that, pg-promise will escape everything correctly, here's an example:
db.any("INSERT INTO users(name) VALUES('hello') RETURNING REGEXP_REPLACE(name, $1, $2)", ['^(\\+?|00)', 'replaced'])
To understand how the command-line works, prefix the regex string with E:
db.any("INSERT INTO users(name) VALUES('hello') RETURNING REGEXP_REPLACE(name, E$1, $2)", ['^(\\+?|00)', 'replaced'])
And you will get the same error: invalid regular expression: quantifier operand invalid.
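The error message makes sense once you look at the pattern with its backslash removed. Here is a rough illustration using Python's re module as a stand-in for the PostgreSQL regex engine (the wording of the error differs, and the two engines are not identical, so treat this as an analogy only):

```python
import re

intended = r"^(\+?|00)"   # backslash survives: an optional literal "+", or "00"
mangled = "^(+?|00)"      # backslash lost somewhere on the way to the server

print(re.sub(intended, "", "+4422848566"))  # 4422848566

try:
    re.compile(mangled)
except re.error as exc:
    # Python words it differently from PostgreSQL, but the cause is the same:
    # the "+" has nothing in front of it to quantify.
    print("invalid pattern:", exc)
```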

escape in a select statement

In the following SQL, what is the use of escape?
select * from dual where dummy like 'funny&_' escape '&';
SQL*Plus asks for the value of _ whether escape is specified or not.
The purpose of the escape clause is to stop the wildcard characters (e.g. % or _) from being considered as wildcards, as per the documentation.
The reason why you're being prompted for the value of _ is because you're using &, which is also usually the character used to prompt for a substitution variable.
To stop the latter from happening, you could:
change to a different escape character
prior to running your statement, run set define off if you're using SQL*Plus (or as a script in a GUI, eg. Toad) or turn off the substitution variable prompting if you're using a GUI.
change the define character to something different by running set define <character>
The escape character is used to indicate that the underscore should be matched as an actual character, rather than as a single-character wildcard. This is explained in the documentation.
You can include the actual characters % or _ in the pattern by using the ESCAPE clause, which identifies the escape character. If the escape character precedes the character % or _ in the pattern, then Oracle interprets this character literally in the pattern rather than as a special pattern-matching character.
If you didn't have the escape clause then the underscore would match any single character, so where dummy like 'funny_' would match 'funnyA', 'funnyB', etc. and not just an actual underscore.
The escape character you've chosen is & which is the default SQL*Plus client substitution variable marker. It has nothing to do with the escape clause, and using that is causing the &_ part of the pattern to be interpreted as a substitution variable called _, hence your being prompted. As it isn't related, the escape clause has no effect on that.
The simplest thing is probably to choose a different escape character. If you want to use that specific escape character and not be prompted, disable or change the substitution character:
set define off
select * from dual where dummy like 'funny&_' escape '&';
set define on
That will then match rows where dummy contains exactly the string 'funny_'. (It's therefore equivalent to where dummy = 'funny_', as there are no unescaped wildcards, making the like pattern matching redundant). It will not match any that start with that pattern (it's sort of like using regexp_like with start and end anchors, and you might be expecting it to work as if you hadn't supplied anchors, but it doesn't). You would need to add a % wildcard for that:
set define off
select * from dual where dummy like 'funny&_%' escape '&';
set define on
And if you want to match any that don't start with funny_ but have it somewhere in the middle of the value, you would need to add another wildcard before it too:
set define off
select * from dual where dummy like '%funny&_%' escape '&';
set define on
You haven't shown any sample data or expected results, so it isn't clear which pattern you need.
SQL Fiddle doesn't have substitution variables but here's an example showing how those three patterns match various values.
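If you don't have a database handy, the matching rules described above can be approximated in a few lines of Python. This is only a toy model of LIKE with an ESCAPE clause (like_to_regex and like are hypothetical helper names, and edge cases such as NULLs or trailing blanks in CHAR columns are ignored), but it shows which values each of the three patterns would accept.

```python
import re
from typing import Optional

def like_to_regex(pattern: str, escape: Optional[str] = None):
    """Translate a LIKE pattern into an equivalent regex (toy model)."""
    out = []
    chars = iter(pattern)
    for ch in chars:
        if escape is not None and ch == escape:
            nxt = next(chars, None)
            if nxt is None:
                raise ValueError("pattern ends with the escape character")
            out.append(re.escape(nxt))  # escaped character is taken literally
        elif ch == "%":
            out.append(".*")            # any string, including empty
        elif ch == "_":
            out.append(".")             # exactly one character
        else:
            out.append(re.escape(ch))
    return re.compile("".join(out), re.DOTALL)

def like(value: str, pattern: str, escape: Optional[str] = None) -> bool:
    # LIKE must match the whole value, hence fullmatch.
    return like_to_regex(pattern, escape).fullmatch(value) is not None

print(like("funnyA", "funny_"))                      # True: _ is a wildcard
print(like("funnyA", "funny&_", "&"))                # False: _ is literal
print(like("funny_", "funny&_", "&"))                # True
print(like("funny_business", "funny&_", "&"))        # False: no trailing %
print(like("funny_business", "funny&_%", "&"))       # True
print(like("not funny_ at all", "%funny&_%", "&"))   # True
```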
The syntax for the SQL LIKE Condition is:
expression LIKE pattern [ ESCAPE 'escape_character' ]
Parameters or Arguments
expression : A character expression such as a column or field.
pattern : A character expression that contains pattern matching. The patterns that you can choose from are:
Wildcard | Explanation
---------+-------------
% | Allows you to match any string of any length (including zero length)
_ | Allows you to match on a single character
escape_character: Optional. It allows you to test for literal instances of a wildcard character such as % or _.
Source : http://www.techonthenet.com/sql/like.php

How to escape value in parameter passed to Oracle SQL script

I have an SQL script which is executed using SQL*Plus. It reads input parameters, and the beginning looks like this:
SET DEFINE ON
DEFINE PARAM1 = '&1'
DEFINE PARAM2 = '&2'
DECLARE
...
Now I would like to use this script with the parameters, but I need to use some special characters, particularly '
##./update.sql 'value of first param' 'Doesn't work'
^
--------------------------------------------| Here's the problem
commit;
When I do the usual way of concatenation strings like this:
'Doesn'||chr(39)||'t work'
only Doesn appears in PARAM2. Is there some way to escape the character so that SQL*Plus will read it as a single string?
You need to use escape characters to achieve this.
{} Use braces to escape a string of characters or symbols. Everything within a set of braces is considered part of the escape sequence. When you use braces to escape a single character, the escaped character becomes a separate token in the query.
\ Use the backslash character to escape a single character or symbol. Only the character immediately following the backslash is escaped.
Some examples on single character escape
SELECT 'Frank''s site' AS text FROM DUAL;
TEXT
--------------------
Frank's site
Read more here
For escaping & in SQL*Plus
SET ESCAPE '\'
SELECT '\&abc' FROM dual;
OR
SET SCAN OFF
SELECT '&ABC' x FROM dual;
Escaping wild card
SELECT name FROM emp
WHERE id LIKE '%/_%' ESCAPE '/';
SELECT name FROM emp
WHERE id LIKE '%\%%' ESCAPE '\';
From a shell you should call the script like this:
sqlplus user@db/passwd @update 'value of first param' "Doesn't work"
(the word @update refers to your script, which is named update.sql)
Then you have to use literal quoting in your script:
DEFINE PARAM2 = q'[&2]';
Documentation for literals can be found here.
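There are two quoting layers here: the shell and SQL. A small Python sketch can illustrate both, under some assumptions: shlex only approximates POSIX shell word splitting, run_script is a made-up command name, and the final replace() merely imitates what the &2 substitution would leave in the script text.

```python
import shlex

# Double quotes around the second argument keep the apostrophe intact.
good = 'run_script \'value of first param\' "Doesn\'t work"'
args = shlex.split(good)
print(args)  # ['run_script', 'value of first param', "Doesn't work"]

# With single quotes, the apostrophe ends the quoted string early and
# leaves a dangling quote at the end of the line.
bad = "run_script 'value of first param' 'Doesn't work'"
try:
    shlex.split(bad)
except ValueError as exc:
    print("shell-style parse failed:", exc)

# Inside the script, q'[...]' lets the substituted text contain an apostrophe.
line = "DEFINE PARAM2 = q'[&2]'".replace("&2", args[2])
print(line)  # DEFINE PARAM2 = q'[Doesn't work]'
```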

Escaping quotes inside text when dumping Postgres Sql

Let's say my table is:
id text
+-----+----------+
123 | foo bar
321 | bar "baz"
Is there any way to escape those quotes around 'baz' when dumping?
My query is in the form:
SELECT text FROM aTable WHERE ...
And I would like the output to be:
foo bar
bar \"baz\"
rather than:
foo bar
bar baz
You probably want to use replace:
SELECT REPLACE(text, '"', E'\\"') FROM aTable WHERE ...
You'll need to escape your escape character to get a literal backslash (hence the doubled backslash) and use the "E" prefix on the replacement string to get the right escape syntax.
UPDATE: And thanks to a_horse_with_no_name's usual strictness (a good thing BTW), we have a solution that doesn't need the extra backslash or non-standard "E" prefix:
set standard_conforming_strings = on;
SELECT REPLACE(text, '"', '\"') FROM aTable WHERE ...
The standard_conforming_strings option tells PostgreSQL to use standard syntax for SQL strings:
This controls whether ordinary string literals ('...') treat backslashes literally, as specified in the SQL standard.
This would also impact your \x5C escape:
If the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. This is for backward compatibility with the historical behavior, where backslash escapes were always recognized.
You can use the following incarnation of the COPY command:
COPY (SELECT * FROM table) TO ... WITH (FORMAT csv, ESCAPE '<WHATEVER ESCAPE CHARACTER YOU WANT>')
as described here.
You might not have to do anything, as in some cases your QUOTE character will be doubled automatically. Please consult the examples in the referenced link. You can also use VALUES in addition to SELECT. No further data mangling should be necessary.
This is assuming you are using 7.3 or higher. The syntax is slightly different between 7.3 and 9.0, so please consult the appropriate docs.
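To get a feel for the two CSV conventions mentioned above (doubling the quote character versus prefixing it with an escape character), here is a short sketch using Python's csv module. This is not PostgreSQL's COPY and the exact quoting decisions may differ between the two; it only illustrates what the two styles look like for the sample rows from the question.

```python
import csv
import io

rows = [["foo bar"], ['bar "baz"']]

def dump(**fmt) -> str:
    """Write the sample rows as CSV text using the given csv.writer options."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n", **fmt).writerows(rows)
    return buf.getvalue()

# Default convention: a field containing quotes is wrapped in quotes
# and each embedded quote is doubled.
print(dump())
# foo bar
# "bar ""baz"""

# Alternative: escape embedded quotes with a backslash instead of doubling.
print(dump(quoting=csv.QUOTE_ALL, doublequote=False, escapechar="\\"))
# "foo bar"
# "bar \"baz\""
```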

How do I ignore ampersands in a SQL script running from SQL Plus?

I have a SQL script that creates a package with a comment containing an ampersand (&). When I run the script from SQL Plus, I am prompted to enter a substitute value for the string starting with &. How do I disable this feature so that SQL Plus ignores the ampersand?
This may work for you:
set define off
Otherwise the ampersand needs to be at the end of a string,
'StackOverflow &' || ' you'
EDIT: I was click-happy when saving... This was referenced from a blog.
If you sometimes use substitution variables you might not want to turn define off. In these cases you could convert the ampersand from its numeric equivalent as in || Chr(38) || or append it as a single character as in || '&' ||.
I resolved with the code below:
set escape on
and put a \ to the left of the &, as in 'value_\&_intert'
You can set the special character, which is looked for upon execution of a script, to another value by means of using the SET DEFINE <1_CHARACTER>
By default, the DEFINE function itself is on, and it is set to &
It can be turned off, as mentioned already, but the prompt can also be avoided by setting the define character to a different value. Be very aware of what sign you set it to. In the example below I've chosen the # character, but that choice is just an example.
SQL> select '&var_ampersand #var_hash' from dual;
Enter value for var_ampersand: a value
'AVALUE#VAR_HASH'
-----------------
a value #var_hash
SQL> set define #
SQL> r
1* select '&var_ampersand #var_hash' from dual
Enter value for var_hash: another value
'&VAR_AMPERSANDANOTHERVALUE'
----------------------------
&var_ampersand another value
SQL>
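The session above can be mimicked with a toy model of the substitution scan. This is a sketch under loud assumptions: substitute is a hypothetical function, the dictionary stands in for what you would type at the "Enter value for" prompt, and real SQL*Plus has more rules (such as && variables and the concatenation character) than this handles.

```python
import re

def substitute(sql, values, define="&"):
    """Toy model of SQL*Plus substitution; define=None means SET DEFINE OFF."""
    if define is None:
        return sql
    marker = re.compile(re.escape(define) + r"(\w+)")
    # A function replacement avoids backslash processing in the values.
    return marker.sub(lambda m: values[m.group(1)], sql)

sql = "select '&var_ampersand #var_hash' from dual"

# Default define character: only &var_ampersand is replaced.
print(substitute(sql, {"var_ampersand": "a value"}))
# select 'a value #var_hash' from dual

# After "set define #": the ampersand is left alone, #var_hash is replaced.
print(substitute(sql, {"var_hash": "another value"}, define="#"))
# select '&var_ampersand another value' from dual

# After "set define off": nothing is replaced.
print(substitute(sql, {}, define=None))
# select '&var_ampersand #var_hash' from dual
```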
set define off <- This is the best solution I found
I also tried...
set define }
I was able to insert several records containing ampersand characters '&', but then I could not use the '}' character in the text.
So I decided to use "set define off" and everything works as it should.
According to this nice FAQ there are a couple solutions.
You might also be able to escape the ampersand with the backslash character \ if you can modify the comment.
I had a CASE statement with WHEN column = 'sometext & more text' THEN ....
I replaced it with
WHEN column = 'sometext ' || CHR(38) || ' more text' THEN ...
you could also use
WHEN column LIKE 'sometext _ more text' THEN ...
(_ is the wildcard for a single character)