Generate unique random strings in plpgsql - sql

I am trying to write a function to create unique random tokens of variable length. However, I am stumped by the plpgsql syntax. My intention is to create a function which
Takes a table and column as input
Generates a random string of a given length, with a given set of characters
Checks if the string is already in the column
If so (and this is expected to be rare), simply generate a new random string.
Otherwise, return the random string
My current attempt looks like this:
CREATE FUNCTION random_token(_table TEXT, _column TEXT, _length INTEGER) RETURNS text AS $$
DECLARE
alphanum CONSTANT text := 'abcdefghijkmnopqrstuvwxyz23456789';
range_head CONSTANT integer := 25;
range_tail CONSTANT integer := 33;
random_string text;
BEGIN
REPEAT
SELECT substring(alphanum from trunc(random() * range_head + 1)::integer for 1) ||
array_to_string(array_agg(substring(alphanum from trunc(random() * range_tail + 1)::integer for 1)), '')
INTO random_string FROM generate_series(1, _length - 1);
UNTIL random_string NOT IN FORMAT('SELECT %I FROM %I WHERE %I = random_string;', _column, _table, _column)
END REPEAT;
RETURN random_string;
END
$$ LANGUAGE plpgsql;
However, this doesn't work, and gives me a not very helpful error:
DatabaseError: error 'ERROR: syntax error at or near "REPEAT"
I have tried a number of variations, but without knowing what the error in the syntax is I am stumped. Any idea how to fix this function?

PL/pgSQL has no REPEAT statement. Use a plain LOOP with EXIT WHEN instead.
CREATE OR REPLACE FUNCTION random_token(_table TEXT, _column TEXT, _length INTEGER) RETURNS text AS $$
DECLARE
alphanum CONSTANT text := 'abcdefghijkmnopqrstuvwxyz23456789';
range_head CONSTANT integer := 25;
range_tail CONSTANT integer := 33;
random_string text;
ct int;
BEGIN
LOOP
SELECT substring(alphanum from trunc(random() * range_head + 1)::integer for 1) ||
array_to_string(array_agg(substring(alphanum from trunc(random() * range_tail + 1)::integer for 1)), '')
INTO random_string FROM generate_series(1, _length - 1);
EXECUTE FORMAT('SELECT count(*) FROM %I WHERE %I = %L', _table, _column, random_string) INTO ct;
EXIT WHEN ct = 0;
END LOOP;
RETURN random_string;
END
$$ LANGUAGE plpgsql;
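Note that random_string must be passed as a parameter to format() (here via %L, which quotes it as a literal), because PL/pgSQL variables are not visible inside a dynamically executed query string.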
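As a quick illustration of what format() does with these placeholders (the table name "my table" and the value it's are made up for the example): %I quotes an identifier only when that is needed, and %L quotes and escapes a value as a string literal.

```sql
-- Hypothetical inputs, chosen only to show the quoting rules.
SELECT format('SELECT count(*) FROM %I WHERE %I = %L',
              'my table',   -- identifier with a space: gets double quotes
              'token',      -- plain identifier: left unquoted
              'it''s');     -- value containing a quote: escaped as a literal
-- result: SELECT count(*) FROM "my table" WHERE token = 'it''s'
```

The resulting string is what EXECUTE runs, so the random value reaches the query as a properly quoted literal rather than as a variable name.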
Update. According to the accurate hint from Abelisto, this should be faster for a large table:
DECLARE
dup boolean;
...
EXECUTE FORMAT('SELECT EXISTS(SELECT 1 FROM %I WHERE %I = %L)', _table, _column, random_string) INTO dup;
EXIT WHEN NOT dup;
...

This is almost certainly not what you want. When you say, "checks if the string is already in the column" you're not referring to something that looks unique, you're referring to something that actually is UNIQUE.
Instead, I would point you to this answer I gave about UUIDs.
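A minimal sketch of the point about UNIQUE, assuming a hypothetical table invite with a token column and reusing random_token() from the answer above. The check inside random_token() only makes a collision unlikely, because another session can insert the same token between the check and the insert. The constraint is what enforces uniqueness, and the insert can retry when it fires.

```sql
-- Hypothetical table: the UNIQUE constraint is what guarantees uniqueness.
CREATE TABLE invite (
    token text NOT NULL UNIQUE
);

-- Hypothetical helper: inserts a fresh token and returns it.
CREATE OR REPLACE FUNCTION insert_invite(_length integer) RETURNS text AS $$
DECLARE
    new_token text;
BEGIN
    LOOP
        new_token := random_token('invite', 'token', _length);
        BEGIN
            INSERT INTO invite (token) VALUES (new_token);
            RETURN new_token;
        EXCEPTION WHEN unique_violation THEN
            -- Another session inserted the same token first: loop and try again.
        END;
    END LOOP;
END
$$ LANGUAGE plpgsql;

SELECT insert_invite(8);
```

With the constraint in place, the pre-check in random_token() becomes an optimisation rather than the guarantee.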

Related

ORACLE: Missing IN or OUT parameter at index:: 1

Can someone tell me what's wrong with my code? I need to create a function that returns the number of digits in a given number, but I keep getting a "missing IN or OUT parameter" error. I am using Oracle SQL. Thank you.
SET SERVEROUTPUT ON;
CREATE OR REPLACE FUNCTION Digit (n1 IN OUT INTEGER) RETURN INTEGER IS
Counter INTEGER := 0;
BEGIN
WHILE (n1 != 0 ) LOOP
n1 := n1 /10;
Counter := Counter + 1;
END LOOP;
RETURN Counter;
END;
Test block:
DECLARE
n1 INTEGER := 0;
BEGIN:
n1 := &n1;
DBMS_OUTPUT.PUT_LINE('The number of digit = ' ||Digit(Counter));
END;
The error is probably caused by the stray : character after BEGIN in your test block. The test block also passes Counter, which is not declared there; it should pass n1.
I would write it like this:
create or replace function digits
( p_num integer )
return integer
as
pragma udf;
i simple_integer := p_num;
l_digits simple_integer := 0;
begin
while i <> 0 loop
-- trunc() is needed: the result of i / 10 is rounded on assignment (5 / 10 would become 1)
i := trunc(i / 10);
l_digits := l_digits + 1;
end loop;
return l_digits;
end digits;
I made the parameter in only, instead of in out. This means you can use it in SQL queries, and also in PL/SQL code without needing to pass in a variable whose value will get changed to 0 by the function.
pragma udf tells the compiler to optimise the function for use in SQL.
I used simple_integer as in theory it's slightly more efficient for arithmetic operations, although I doubt any improvement is measurable in the real world (and I'm rather trusting the optimising compiler to cast my literal 10 as a simple_integer, as otherwise the overhead of type conversion will defeat any arithmetic efficiency).

How to get all matching positions in a string?

I have a column flag_acumu in a table in PostgreSQL with values like:
'SSNSSNNNNNNNNNNNNNNNNNNNNNNNNNNNNSNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN'
I need to show all positions with an 'S'. With this code, I only get the first such position, but not the later ones.
SELECT codn_conce, flag_acumu, position('S' IN flag_acumu) AS the_pos
FROM dh12
WHERE position('S' IN flag_acumu) != 0
ORDER BY the_pos ASC;
How to get all of them?
In Postgres 9.4 or later you can conveniently use unnest() in combination with WITH ORDINALITY:
SELECT *
FROM dh12 d
JOIN unnest(string_to_array(d.flag_acumu, NULL))
WITH ORDINALITY u(elem, the_pos) ON u.elem = 'S'
WHERE d.flag_acumu LIKE '%S%' -- optional, see below
ORDER BY d.codn_conce, u.the_pos;
This returns one row per match.
WHERE d.flag_acumu LIKE '%S%' is optional to quickly eliminate source rows without any matches. Pays if there are more than a few such rows.
Detailed explanation and alternatives for older versions:
PostgreSQL unnest() with element number
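If one row per source value is preferred over one row per match, the positions can be aggregated into an array. A small self-contained sketch on a literal string (the short value 'SSNSSN' is made up for the example):

```sql
-- string_to_array(..., NULL) splits the string into single characters;
-- WITH ORDINALITY numbers them starting at 1.
SELECT array_agg(u.the_pos ORDER BY u.the_pos) AS positions
FROM unnest(string_to_array('SSNSSN', NULL)) WITH ORDINALITY AS u(elem, the_pos)
WHERE u.elem = 'S';
-- positions: {1,2,4,5}
```

Against the real table, the same aggregate would be grouped by the row's key column.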
Since the question doesn't fully specify the requirement, I'm assuming you want a list of the positions at which a substring (possibly longer than one character) occurs.
Here's the function to do that using:
FOR .. LOOP control structure,
function substr(text, int, int).
CREATE OR REPLACE FUNCTION get_all_positions_of_substring(text, text)
RETURNS text
STABLE
STRICT
LANGUAGE plpgsql
AS $$
DECLARE
output_text TEXT := '';
BEGIN
FOR i IN 1..length($1)
LOOP
IF substr($1, i, length($2)) = $2 THEN
output_text := CONCAT(output_text, ';', i);
END IF;
END LOOP;
-- Remove first semicolon
output_text := substr(output_text, 2, length(output_text));
RETURN output_text;
END;
$$;
Sample call and output
postgres=# select * from get_all_positions_of_substring('soklesocmxsoso','so');
get_all_positions_of_substring
--------------------------------
1;6;11;13
This works too. And a bit faster I think.
create or replace function findAllposition(_pat varchar, _tar varchar)
returns int[] as
$body$
declare _poslist int[]; _pos int;
begin
_pos := position(_pat in _tar);
while (_pos>0)
loop
if array_length(_poslist,1) is null then
_poslist := _poslist || (_pos);
else
-- _pos is relative to the shortened _tar, which starts right after the previous match position
_poslist := _poslist || (_pos + _poslist[array_length(_poslist,1)]);
end if;
_tar := substr(_tar, _pos + 1, length(_tar));
_pos := position(_pat in _tar);
end loop;
return _poslist;
end;
$body$
language plpgsql;
Will return a position list which is an int array.
{position1, position2, position3, etc.}

How to cut varchar/text before n'th occurrence of delimiter? PostgreSQL

I have strings (saved in the database as varchar) and I have to cut them just before the n'th occurrence of a delimiter.
Example input:
String: 'My-Example-Awesome-String'
Delimiter: '-'
Occurrence: 2
Output:
My-Example
I implemented this function as a quick prototype:
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar AS
$BODY$
DECLARE
result varchar = '';
arr text[] = regexp_split_to_array( fulltext, delimiter);
word text;
counter integer := 0;
BEGIN
FOREACH word IN ARRAY arr LOOP
EXIT WHEN ( counter = occurence );
IF (counter > 0) THEN result := result || delimiter;
END IF;
result := result || word;
counter := counter + 1;
END LOOP;
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
For now it assumes that the string is not empty (guaranteed by the query that calls the function) and that it contains at least one occurrence of the given delimiter.
But now I need something better for a performance test. If possible, I would love to see the most universal solution, because not every user of my system works on a PostgreSQL database (a few of them prefer Oracle, MySQL or SQLite), but that is not the most important thing. Performance is, because for a specific search the function can be called several hundred times.
I didn't find anything about a fast and easy way to treat a varchar as an array of chars and check it for occurrences of the delimiter (I could remember the positions of the occurrences and then take the substring from the first char to the n'th delimiter position - 1). Any ideas? Are there smarter solutions?
# EDIT: Yes, I know that the function will be a bit different in every database, but the body of the function can be very similar or the same. Generality is not the main goal :) And sorry for the bad working name of the function, I just noticed that it doesn't describe what it does.
you can try doing something based on this:
select
varcharColumnName,
INSTR(varcharColumnName,'-',1,2),
case when INSTR(varcharColumnName,'-',1,2) <> 0
THEN SUBSTR(varcharColumnName, 1, INSTR(varcharColumnName,'-',1,2) - 1)
else '...'
end
from tableName;
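Of course, you have to handle the "else" branch the way you want. I tested it on Postgres and Oracle. Note that the four-argument INSTR is an Oracle function rather than standard SQL: stock PostgreSQL does not ship instr() (it is available through the orafce extension or the sample implementation in the PostgreSQL "Porting from Oracle PL/SQL" documentation), and the INSTR functions in MySQL and SQLite take only two arguments.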
//edit - as a function, however this way it's rather hard to make it cross-dbms
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar as
$BODY$
DECLARE
result varchar := '';
delimiterPos integer := 0;
BEGIN
delimiterPos := INSTR(fulltext,delimiter,1,occurence);
result := SUBSTR(fulltext, 1, delimiterPos - 1);
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
create or replace function trunc(string text, delimiter char, occurence int) returns text as $$
return delimiter.join(string.split(delimiter)[:occurence])
$$ language plpythonu;
# select trunc('My-Example-Awesome-String', '-', 2);
trunc
------------
My-Example
(1 row)
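The same split, slice and join idea can also be written with PostgreSQL built-ins alone, without plpythonu. A minimal sketch on the example input from the question; the delimiter is treated as a literal string here, not as a regular expression:

```sql
-- Split on the delimiter, keep the first 2 parts, join them again.
SELECT array_to_string(
           (string_to_array('My-Example-Awesome-String', '-'))[1:2],
           '-');
-- result: My-Example
```

If the string has fewer parts than requested, the slice simply returns all of them, so the whole string comes back unchanged.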

How can I retrieve a column value of a specific row

I'm using PostgreSQL 9.3.
The table partner.partner_statistic contains the following columns:
id reg_count
serial integer
I wrote the function convert(integer):
CREATE FUNCTION convert(d integer) RETURNS integer AS $$
BEGIN
--Do something and return integer result
END
$$ LANGUAGE plpgsql;
And now I need to write a function that returns an array of integers, as follows:
CREATE FUNCTION res() RETURNS integer[] AS $$
<< outerblock >>
DECLARE
arr integer[]; --That array of integers I need to fill in depends on the result of query
r partner.partner_statistic%rowtype;
table_name varchar DEFAULT 'partner.partner_statistic';
BEGIN
FOR r IN
SELECT * FROM partner.partner_statistic offset 0 limit 100
LOOP
--
-- I need to add convert(r[reg_count]) to arr where r[id] = 0 (mod 5)
--
-- How can I do that?
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
You don't need (and shouldn't use) PL/pgSQL loops for this. Just use an aggregate. I'm guessing at what you mean by "where r[id] = 0 (mod 5)", but I'm assuming you mean "where id is evenly divisible by 5". (Note that this is NOT the same thing as "every fifth row", because generated IDs have gaps.)
Something like:
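-- The subquery limits the input rows; a LIMIT placed after array_agg() would only limit the single result row.
SELECT array_agg(convert(r.reg_count))
FROM (SELECT id, reg_count FROM partner.partner_statistic LIMIT 100) r
WHERE r.id % 5 = 0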
probably meets your needs.
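If you want to return the value from a function declared RETURNS integer[], use RETURN (SELECT ...) or SELECT ... INTO a variable and return that (RETURN QUERY only works in functions declared RETURNS SETOF or RETURNS TABLE), or preferably use a simple sql language function.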
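A minimal sketch of the sql language variant, assuming the convert(integer) function and the partner.partner_statistic table from the question exist as described. It follows the loop in the question: take up to 100 rows, then keep those whose id is divisible by 5.

```sql
CREATE OR REPLACE FUNCTION res() RETURNS integer[] AS $$
    -- The subquery picks the input rows; the outer query filters and aggregates them.
    SELECT array_agg(convert(r.reg_count))
    FROM (SELECT id, reg_count
          FROM partner.partner_statistic
          LIMIT 100) r
    WHERE r.id % 5 = 0;
$$ LANGUAGE sql;

SELECT res();
```

Without an ORDER BY in the subquery, which 100 rows are taken is not defined, exactly as in the original loop.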
If you want a dynamic table name, use:
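-- %s with a regclass value keeps the schema-qualified name intact; a literal % must be written as %% inside format()
EXECUTE format('
SELECT array_agg(convert(r.reg_count))
FROM (SELECT id, reg_count FROM %s LIMIT 100) r
WHERE r.id %% 5 = 0', table_name::regclass)
INTO arr;
RETURN arr;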

Call Function Postgresql

I want to insert into the table lokasi from a function, but when I call the function I get an error. Please help.
CREATE OR REPLACE FUNCTION insert_lokasi2
(anip character varying, aeksemplar character varying)
RETURNS boolean AS
$BODY$
DECLARE
eks integer;
tot integer;
nilai boolean;
eks1 integer;
eks2 integer;
tot2 integer;
BEGIN
select sum(CAST(eksemplar AS INT))
INTO eks
from lokasi
where nip = anip;
tot := eks + aeksemplar;
select CAST(eksemplar AS INT)
INTO eks1
from sensus
where nip = anip;
select CAST(eksemplar2 AS INT)
INTO eks2
from sensus
where nip = anip;
tot2 := eks1 + eks2;
IF (tot <> tot2) THEN
nilai := false;
else
nilai := true;
END IF;
RETURN nilai;
END
$BODY$
LANGUAGE 'plpgsql' VOLATILE
COST 100;
ALTER FUNCTION insert_lokasi2(character varying, character varying) OWNER TO postgres;
select * from insert_lokasi2('10.1010.4703','1');
ERROR: operator does not exist: integer + character varying
LINE 1: SELECT $1 + $2
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY: SELECT $1 + $2
CONTEXT: PL/pgSQL function "insert_lokasi2" line 12 at assignment
eks is an integer while aeksemplar is a string. You need a cast in your addition:
tot := eks + CAST(aeksemplar AS INT);
It would be better either to do all these casts at the top of the function or, if possible, to change the argument types so that the casts are unnecessary.
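A sketch of the second suggestion, assuming the caller can pass the count as a number: aeksemplar is declared as integer, so the addition needs no cast. The table and column names are taken from the question. The two lookups on sensus are merged into one query, which is an assumption about what the original intended.

```sql
-- Remove the old (varchar, varchar) version so that calls are not ambiguous.
DROP FUNCTION IF EXISTS insert_lokasi2(character varying, character varying);

CREATE OR REPLACE FUNCTION insert_lokasi2(anip character varying, aeksemplar integer)
RETURNS boolean AS
$BODY$
DECLARE
    eks  integer;
    tot  integer;
    tot2 integer;
BEGIN
    SELECT sum(CAST(eksemplar AS INT)) INTO eks
    FROM lokasi
    WHERE nip = anip;

    tot := eks + aeksemplar;   -- both operands are integers now

    SELECT CAST(eksemplar AS INT) + CAST(eksemplar2 AS INT) INTO tot2
    FROM sensus
    WHERE nip = anip;

    IF (tot <> tot2) THEN
        RETURN false;
    ELSE
        RETURN true;
    END IF;
END
$BODY$
LANGUAGE plpgsql;

SELECT insert_lokasi2('10.1010.4703', 1);
```

The stored columns still need their casts as long as eksemplar and eksemplar2 are kept as text in the tables.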