Postgres PL/pgSQL Function results to file, with filename as argument - sql

I am migrating some client side stuff into the server, and want to put it into a function.
I need to get the results of a query into a CSV file. But, I'd like to pass the file name/location of the resulting file as an argument of the function.
So, this is a simple example of what I want to do:
CREATE FUNCTION send_email_results(filename1 varchar) RETURNS void AS $$
DECLARE
BEGIN
COPY(SELECT * FROM mytable) TO filename1 WITH CSV;
END;
$$ LANGUAGE plpgsql;
Postgres is complaining about this though, as it is translating the filename1 argument to '$1', and it doesn't know what to do.
I can hardcode the path if need be, but being able to pass it as a parameter sure would be handy.
Anyone have any clues?

I just ran into this. It turns out that you can't use parameterized arguments with the COPY command (at least that's the case with Python as the stored procedure language). So you have to build the command as a string and run it with EXECUTE. The file name must end up as a quoted string literal in the generated command, so wrap it in quote_literal:
CREATE FUNCTION send_email_results(filename1 varchar) RETURNS void AS $$
DECLARE
BEGIN
EXECUTE 'COPY (SELECT * FROM mytable) TO ' || quote_literal(filename1) || ' WITH CSV';
END;
$$ LANGUAGE plpgsql;
I don't use plpgsql as a Postgres function language myself, so the syntax might need adjusting.
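To make the string-building step concrete, here is a minimal Python sketch of what the concatenation produces. The quote_literal helper below is only a simplified stand-in for the PostgreSQL function of the same name (it doubles single quotes and ignores backslash handling), and the path /tmp/results.csv is a made-up example.

```python
def quote_literal(value: str) -> str:
    """Simplified stand-in for PostgreSQL's quote_literal.

    Wraps the value in single quotes and doubles embedded single quotes.
    Assumption: standard_conforming_strings is on, so backslashes need
    no special treatment. The real function handles more cases.
    """
    return "'" + value.replace("'", "''") + "'"


def build_copy_command(filename: str) -> str:
    """Build the COPY statement text that EXECUTE would receive."""
    return (
        "COPY (SELECT * FROM mytable) TO "
        + quote_literal(filename)
        + " WITH CSV"
    )


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    print(build_copy_command("/tmp/results.csv"))
```

Without the quoting step the file name would land in the statement as a bare token, which COPY rejects.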

How to decode the HTML characters in SQL query [duplicate]

I have just set about the task of stripping out HTML entities from our database, as we do a lot of crawling and some of the crawlers didn't do this at input time :(
So I started writing a bunch of queries that look like:
UPDATE nodes SET name=regexp_replace(name, '&#xe0;', 'à', 'g') WHERE name LIKE '%#xe0%';
UPDATE nodes SET name=regexp_replace(name, '&#xe1;', 'á', 'g') WHERE name LIKE '%#xe1%';
UPDATE nodes SET name=regexp_replace(name, '&#xe2;', 'â', 'g') WHERE name LIKE '%#xe2%';
Which is clearly a pretty naive approach. I've been trying to figure out if there is something clever I can do with the decode function; maybe grabbing the html entity by regex like /&#x(..);/, then passing just the %1 part to the ascii decoder, and reconstructing the string...or something...
Shall I just press on with the queries? There will probably only be 40 or so of them.
Write a function using pl/perlu and use this module https://metacpan.org/pod/HTML::Entities
Of course you need to have perl installed and pl/perl available.
1) First of all, create the procedural language pl/perlu:
CREATE EXTENSION plperlu;
2) Then create a function like this:
CREATE FUNCTION decode_html_entities(text) RETURNS TEXT AS $$
use HTML::Entities;
return decode_entities($_[0]);
$$ LANGUAGE plperlu;
3) Then you can use it like this:
select decode_html_entities('aaabbb&amp;.... asasdasdasd &hellip;');
decode_html_entities
---------------------------
aaabbb&.... asasdasdasd …
(1 row)
You could use xpath (HTML-encoded content is the same as XML-encoded content):
select
'AT&amp;T' as input,
(xpath('/z/text()', ('<z>' || 'AT&amp;T' || '</z>')::xml))[1] as output;
This is what it took for me to get it working on Ubuntu 18.04 with PG10. Perl didn't decode some entities like &comma; for some reason, so I used Python 3.
From the command line
sudo apt install postgresql-plpython3-10
From your SQL interface:
CREATE LANGUAGE plpython3u;
CREATE OR REPLACE FUNCTION htmlchars(str TEXT) RETURNS TEXT AS $$
# html.unescape replaces HTMLParser.unescape, which was removed in Python 3.9
import html
if str is None:
    return None
return html.unescape(str)
$$ LANGUAGE plpython3u;
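Before wiring this into the database, the decoding step can be tried on its own. The following is a small standalone sketch using the standard library's html.unescape; the function name decode_entities and the sample strings are only illustrative.

```python
import html
from typing import Optional


def decode_entities(text: Optional[str]) -> Optional[str]:
    """Decode numeric and named HTML entities; pass NULL (None) through."""
    if text is None:
        return None
    return html.unescape(text)


if __name__ == "__main__":
    # Hex entity, named entity, and an ampersand, as in the question.
    for sample in ["&#xe0;", "caf&eacute;", "AT&amp;T"]:
        print(sample, "->", decode_entities(sample))
```

Because it handles hex, decimal and named entities in one pass, a single UPDATE calling such a function could stand in for the 40 or so regexp_replace queries.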

Unterminated dollar-quoted string at or near "$$

I'm trying to declare some variables using DBeaver and keep hitting this error.
Unterminated dollar-quoted string at or near "$$
DO $$
DECLARE A integer; B integer;
BEGIN
END$$;
Any ideas?
DBeaver was the issue. Switched to PGAdmin and no more problems.
As of DBeaver 6, you can execute the script with ALT-X (on Windows), which does not attempt to do variable capture/interpolation involving dollar signs.
The syntax posted is fine. Your problem is that the client application or driver is mangling the query, probably because it doesn't understand dollar-quoting.
It might be trying to split it into separate statements on semicolons, running:
DO $$ DECLARE A integer;
B integer;
BEGIN END$$;
as three separate statements. This would result in the reported error, e.g.
$ psql -c 'DO $$ DECLARE A integer;'
ERROR: unterminated dollar-quoted string at or near "$$ DECLARE A integer;"
LINE 1: DO $$ DECLARE A integer;
^
This is why you must specify your client driver/application when asking questions.
Another possibility with some clients is that it might treat $$ as an escaped query-parameter placeholder and replace it with a single $, or try to substitute it for a server-side placeholder like $1. That's not what's happening here, though.
DBeaver also gives this error when there is a SQL syntax error in the script.
In my case, it was a pair of mismatched parentheses in a calculated column of a select.
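The statement-splitting explanation can be illustrated with a short Python sketch. It is not how DBeaver or any particular driver is implemented; it only contrasts a naive split on semicolons with a splitter that tracks dollar-quote tags. Both function names are invented for this example, and the aware splitter deliberately ignores single-quoted strings and comments.

```python
import re

# Matches $$ or a tagged delimiter such as $body$.
DOLLAR_TAG = re.compile(r"\$([A-Za-z_][A-Za-z_0-9]*)?\$")


def naive_split(script: str) -> list:
    """Split on every semicolon, as a dollar-quote-unaware client might."""
    return [part.strip() for part in script.split(";") if part.strip()]


def dollar_aware_split(script: str) -> list:
    """Split on semicolons that are outside dollar-quoted bodies."""
    statements, current, open_tag, pos = [], [], None, 0
    while pos < len(script):
        match = DOLLAR_TAG.match(script, pos)
        if match:
            tag = match.group(0)
            if open_tag is None:
                open_tag = tag        # entering a dollar-quoted body
            elif tag == open_tag:
                open_tag = None       # matching tag closes the body
            current.append(tag)
            pos = match.end()
            continue
        char = script[pos]
        if char == ";" and open_tag is None:
            statement = "".join(current).strip()
            if statement:
                statements.append(statement)
            current = []
        else:
            current.append(char)
        pos += 1
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


SCRIPT = "DO $$\nDECLARE A integer; B integer;\nBEGIN\nEND$$;"

if __name__ == "__main__":
    print(len(naive_split(SCRIPT)), "pieces from the naive split")
    print(len(dollar_aware_split(SCRIPT)), "piece from the aware split")
```

The first piece of the naive split opens a dollar quote that never closes, which is the fragment the server reports as unterminated.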

how to insert filenames with single quotes in a postgresql COPY command from delphi, parameters?

I'm having some fun trying to include a filename in a Postgresql script in delphi, the desired string is
COPY myschema.mytable FROM 'c:\data\data.csv' CSV HEADER;
I know this SQL query is parsed OK by PostgreSQL, as I've tested it in pgAdmin. The problem is how to generate it in Delphi. Delphi uses single quotes for strings, so even using the QuotedStr method like
TempSQL := 'COPY myschema.mytable FROM '+QuotedStr(myfilename)+ ' CSV HEADER';
ADOQuery1.SQL.Add (TempSQL);
the string is generated as
COPY myschema.mytable FROM ''c:\data\data.csv'' CSV HEADER;
So I'm trying to use Parameters.ParamByName like
TempSQL := 'COPY myschema.mytable FROM :PFileName CSV HEADER';
ADOQuery1.SQL.Add (TempSQL);
FileNameParam := ADOQuery1.Parameters.ParamByName('PFileName');
FileNameParam.DataType := ftstring;
FileNameParam.Value := 'c:\data\data.csv';
ADOQuery1.Open;
This gives the error: ERROR: syntax error at or near "$1"; Error while executing the query. A $1 error is usually caused by parameter names being the same as column names, but that's not the case here; I tried different parameter names. I think the problem is that ParamByName may not work for this type of argument. It's normally used like
SELECT * FROM myschema.mytable WHERE myfield = :myparameter
i.e. the colon comes after an =, which isn't the case with the COPY command. Any suggestions welcome. The Delphi code basically scans a directory for (1000s of) suitable files and keeps a log of what is imported, so maybe I have to interface with the DB in a different way entirely.
This code works perfectly in a quick test app, and displays the properly quoted string in the call to ShowMessage, which means there's something other than what you've shown us here going on in your code.
procedure TForm4.FormCreate(Sender: TObject);
var
TempStr: string;
MyFileName: string;
begin
MyFileName := 'somefile.txt';
TempStr := 'COPY myschema.mytable FROM ' + QuotedStr(myfilename) + ' CSV HEADER';
ShowMessage(TempStr);
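end;
The resulting dialog shows the string with single quotes around the file name:
COPY myschema.mytable FROM 'somefile.txt' CSV HEADER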

How do you escape this regular expression?

I'm looking for
"House M.D." (2004)
with anything after it. I've tried where id~'"House M\.D\." \(2004\).*'; but there are no matches.
This works: id~'.*House M.D..*2004.*'; but it is a little slow.
I suspect you're on an older PostgreSQL version that interprets strings in a non standards-compliant C-escape-like mode by default, so the backslashes are being treated as escapes and consumed. Try SET standard_conforming_strings = 'on';.
As per the lexical structure documentation on string constants, you can either:
Ensure that standard_conforming_strings is on, in which case you must double any single quotes (ie ' becomes '') but backslashes aren't treated as escapes:
id ~ '"House M\.D\." \(2004\)'
Use the non-standard, PostgreSQL-specific E'' syntax and double your backslashes:
id ~ E'"House M\\.D\\." \\(2004\\)'
PostgreSQL versions 9.1 and above set standard_conforming_strings to on by default; see the documentation.
You should turn it on in older versions after testing your code, because it'll make updating later much easier. You can turn it on globally in postgresql.conf, on a per-user level with ALTER ROLE ... SET, on a per-database level with ALTER DATABASE ... SET or on a session level with SET standard_conforming_strings = on. Use SET LOCAL to set it within a transaction scope.
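To see why consumed backslashes break the match, here is a small Python sketch. It assumes the old escape-string mode simply drops the backslash in front of characters like . and (, and it uses Python's re module as a stand-in for PostgreSQL's regex engine. The two dialects differ, but both treat unescaped parentheses as a group. The row value is hypothetical.

```python
import re

# Hypothetical row value: the title followed by "anything after it".
row_id = '"House M.D." (2004) Pilot'

# The pattern as typed in the question.
intended = r'"House M\.D\." \(2004\).*'

# What the regex engine receives if the string parser eats the backslashes.
consumed = intended.replace("\\", "")


def matches(pattern: str, value: str) -> bool:
    """Return True if the pattern matches at the start of the value."""
    return re.match(pattern, value) is not None


if __name__ == "__main__":
    print("intended pattern matches:", matches(intended, row_id))
    print("consumed pattern matches:", matches(consumed, row_id))
```

With the backslashes gone, (2004) becomes a capture group that expects the bare text 2004, so the literal parentheses in the stored value no longer match.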
It looks like your regexp is OK:
http://sqlfiddle.com/#!12/d41d8/113
CREATE OR REPLACE FUNCTION public.regexp_quote(IN TEXT)
RETURNS TEXT
LANGUAGE plpgsql
STABLE
AS $$
/*******************************************************************************
* Function Name: regexp_quote
* In-coming Param:
* The string to escape.
* Returns:
* This function produces a TEXT that can be used as a regular expression
* pattern that would match the input as if it were a literal pattern.
* Description:
* Takes in a TEXT in and escapes all of the necessary characters so that
* the output can be used as a regular expression to match the input as if
* it were a literal pattern.
******************************************************************************/
BEGIN
-- Assumes standard_conforming_strings = on (the default since PostgreSQL 9.1).
RETURN REGEXP_REPLACE($1, '([[\](){}.+*^$|\\?-])', '\\\1', 'g');
END;
$$;
Test:
SELECT regexp_quote('"House M.D." (2004)'); -- produces: "House M\.D\." \(2004\)
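The same idea can be sketched in Python with the standard library's re.escape. This is only an illustration of the technique: Python's regex dialect and its set of escaped characters differ from PostgreSQL's, and the regexp_quote name is reused here purely for comparison.

```python
import re


def regexp_quote(text: str) -> str:
    """Escape text so it matches itself literally inside a regex."""
    return re.escape(text)


title = '"House M.D." (2004)'

# Literal title followed by anything, as the question asks for.
pattern = regexp_quote(title) + ".*"

if __name__ == "__main__":
    print(pattern)
    print(re.match(pattern, title + " Pilot") is not None)
```

Escaping the input programmatically avoids hand-counting backslashes, whichever string-literal mode is in effect.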
