Postgresql: CASE statement inside STRING_AGG() loops forever - sql

I am trying to execute a (postgresql) query the auto-generates the INSERT statement for a table dynamically, for test data. I am using a SWITCH CASE statement to decide which value to return based on the column's type.
Anyone know why the following SQL query seems to execute indefinitely (or at least, for a very, very long time).
select STRING_AGG(
CASE
WHEN (udt_name = 'timestamptz' OR udt_name = 'timestamp' OR udt_name = 'date') THEN
'1990-02-02'
WHEN (udt_name = 'text') THEN
--text value as long as length of column
schema_a.make_text_value(character_maximum_length) -- returns value instantly
WHEN (udt_name = 'numeric') THEN
null
WHEN (udt_name = 'bit') THEN
null
WHEN (udt_name = 'bool') THEN
null
WHEN (udt_name = 'int4') THEN
null
ELSE null
END
, ','
) cols
FROM INFORMATION_SCHEMA.COLUMNS col2
WHERE TABLE_SCHEMA='schema_a'
AND col2.TABLE_NAME='big_event' -- big_event has 4 columns
LIMIT 1
The stored procedure called is as follows:
CREATE OR REPLACE FUNCTION schema_a.make_text_value(loopMax integer)
RETURNS TEXT AS $$
DECLARE
loopCounter INTEGER := 1;
val TEXT := '';
BEGIN
LOOP
EXIT WHEN loopCounter = loopMax ;
val := val || 'A';
loopCounter := loopCounter + 1;
END LOOP;
-- CONCAT('version_id' , CAST(idCounter AS TEXT)),
-- '2020-08-28 13:25:00.123456'
RETURN val;
END;
$$
LANGUAGE plpgsql VOLATILE

If loopMax is NULL, it can never be equal to loopCounter. So now let's look at the definition for INFORMATION_SCHEMA.Columns.character_maximum_length:
If data_type identifies a character or bit string type, the declared maximum length; null for all other data types or if no maximum length was declared.
(emphasis mine)
It certainly seems possible for this value to be NULL, based on that definition. Of course, you know your own schema, so you might be able to show this never happens... but I'd triple check to be sure, and maybe add a well-considered guard around the loopMax value anyway; you probably don't want to concatenate two billion strings in that situation.

Related

Procedure to apply formatting to all rows in a table

I had a SQL procedure that increments through each row and and pads some trailing zeros on values depending on the length of the value after a decimal point. Trying to carry this over to a PSQL environment I realized there was a lot of syntax differences between SQL and PSQL. I managed to make the conversion over time but I am still getting a syntax error and cant figure out why. Can someone help me figure out why this wont run? I am currently running it in PGadmin if that makes any difference.
DO $$
DECLARE
counter integer;
before decimal;
after decimal;
BEGIN
counter := 1;
WHILE counter <> 2 LOOP
before = (select code from table where ID = counter);
after = (SELECT SUBSTRING(code, CHARINDEX('.', code) + 1, LEN(code)) as Afterward from table where ID = counter);
IF before = after
THEN
update table set code = before + '.0000' where ID = counter;
ELSE
IF length(after) = 1 THEN
update table set code = before + '000' where ID = counter;
ELSE IF length(after) = 2 THEN
update table set code = before + '00' where ID = counter;
ELSE IF length(after) = 3 THEN
update table set code = before + '0' where ID = counter;
ELSE
select before;
END IF;
END IF;
counter := counter + 1;
END LOOP
END $$;
Some examples of the input/output of the intended result:
Input 55.5 > Output 55.5000
Input 55 > Output 55.0000
Thanks for your help,
Justin
There is no need for a function or even an update on the table to format values when displaying them.
Assuming the values are in fact numbers stored in a decimal or float column, all you need to do is to apply the to_char() function when retrieving them:
select to_char(code, 'FM999999990.0000')
from data;
This will output 55.5000 or 55.0000
The drawback of the to_char() function is that you need to anticipate the maximum number of digits of that can occur. If you have not enough 9 in the format mask, the output will be something like #.###. But as too many digits in the format mask don't hurt, I usually throw a lot into the format mask.
For more information on formatting functions, please see the manual: https://www.postgresql.org/docs/current/static/functions-formatting.html#FUNCTIONS-FORMATTING-NUMERIC-TABLE
If you insist on storing formatted data, you can use to_char() to update the table:
update the_table
set code = to_char(code::numeric, 'FM999999990.0000');
Casting the value to a number will of course fail if there a non-numeric values in the column.
But again: I strong recommend to store numbers as numbers, not as strings.
If you want to compare this to a user input, it's better to convert the user input to a proper number and compare that to the (number) values stored in the database.
The string matching that you are after doesn't actually require a function either. Using substring() with a regex will do that:
update the_table
set code = code || case length(coalesce(substring(code from '\.[0-9]*$'), ''))
when 4 then '0'
when 3 then '00'
when 2 then '000'
when 1 then '0000'
when 0 then '.0000'
else ''
end
where length(coalesce(substring(code from '\.[0-9]*$'), '')) < 5;
substring(code from '\.[0-9]*$') extracts everything the . followed by numbers that is at the end of the string. So for 55.0 it returns .0 for 55.50 it returns .50 if there is no . in the value, then it returns null that's why the coalesce is needed.
The length of that substring tells us how many digits are present. Depending on that we can then append the necessary number of zeros. The case can be shortened so that not all possible length have to be listed (but it's not simpler):
update the_table
set code = code || case length(coalesce(substring(code from '\.[0-9]*$'), ''))
when 0 then '.0000'
else lpad('0', 5- length(coalesce(substring(code from '\.[0-9]*$'), '')), '0')
end
where length(coalesce(substring(code from '\.[0-9]*$'), '')) < 5;
Another option is to use the position of the . inside the string to calculate the number of 0 that need to be added:
update the_table
set code =
code || case
when strpos(code, '.') = 0 then '0000'
else rpad('0', 4 - (length(code) - strpos(code, '.')), '0')
end
where length(code) - strpos(code, '.') < 4;
Regular expressions are quite expensive not using them will make this faster. The above will however only work if there is always at most one . in the value.
But if you can be sure that every value can be cast to a number, the to_char() method with a cast is definitely the most robust one.
To only process rows where the code columns contains correct numbers, you can use a where clause in the SQL statement:
where code ~ '^[0-9]+(\.[0-9][0-9]?)?$'
To change the column type to numeric:
alter table t alter column code type numeric

How get all matching positions in a string?

I have a column flag_acumu in a table in PostgreSQL with values like:
'SSNSSNNNNNNNNNNNNNNNNNNNNNNNNNNNNSNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN'
I need to show all positions with an 'S'. With this code, I only get the first such position, but not the later ones.
SELECT codn_conce, flag_acumu, position('S' IN flag_acumu) AS the_pos
FROM dh12
WHERE position('S' IN flag_acumu) != 0
ORDER BY the_pos ASC;
How to get all of them?
In Postgres 9.4 or later you can conveniently use unnest() in combination with WITH ORDINALITY:
SELECT *
FROM dh12 d
JOIN unnest(string_to_array(d.flag_acumu, NULL))
WITH ORDINALITY u(elem, the_pos) ON u.elem = 'S'
WHERE d.flag_acumu LIKE '%S%' -- optional, see below
ORDER BY d.codn_conce, u.the_pos;
This returns one row per match.
WHERE d.flag_acumu LIKE '%S%' is optional to quickly eliminate source rows without any matches. Pays if there are more than a few such rows.
Detailed explanation and alternatives for older versions:
PostgreSQL unnest() with element number
Since you didn't specify your needs to a point in which one could answer properly, I'm going with my assumption that you want a list of positions of occurence of a substring (can be more than 1 character long).
Here's the function to do that using:
FOR .. LOOP control structure,
function substr(text, int, int).
CREATE OR REPLACE FUNCTION get_all_positions_of_substring(text, text)
RETURNS text
STABLE
STRICT
LANGUAGE plpgsql
AS $$
DECLARE
output_text TEXT := '';
BEGIN
FOR i IN 1..length($1)
LOOP
IF substr($1, i, length($2)) = $2 THEN
output_text := CONCAT(output_text, ';', i);
END IF;
END LOOP;
-- Remove first semicolon
output_text := substr(output_text, 2, length(output_text));
RETURN output_text;
END;
$$;
Sample call and output
postgres=# select * from get_all_positions_of_substring('soklesocmxsoso','so');
get_all_positions_of_substring
--------------------------------
1;6;11;13
This works too. And a bit faster I think.
create or replace function findAllposition(_pat varchar, _tar varchar)
returns int[] as
$body$
declare _poslist int[]; _pos int;
begin
_pos := position(_pat in _tar);
while (_pos>0)
loop
if array_length(_poslist,1) is null then
_poslist := _poslist || (_pos);
else
_poslist := _poslist || (_pos + _poslist[array_length(_poslist,1)] + 1);
end if;
_tar := substr(_tar, _pos + 1, length(_tar));
_pos := position(_pat in _tar);
end loop;
return _poslist;
end;
$body$
language plpgsql;
Will return a position list which is an int array.
{position1, position2, position3, etc.}

How to cut varchar/text before n'th occurence of delimiter? PostgreSQL

I have strings (saved in database as varchar) and I have to cut them just before n'th occurence of delimiter.
Example input:
String: 'My-Example-Awesome-String'
Delimiter: '-'
Occurence: 2
Output:
My-Example
I implemented this function for fast prototype:
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar AS
$BODY$
DECLARE
result varchar = '';
arr text[] = regexp_split_to_array( fulltext, delimiter);
word text;
counter integer := 0;
BEGIN
FOREACH word IN ARRAY arr LOOP
EXIT WHEN ( counter = occurence );
IF (counter > 0) THEN result := result || delimiter;
END IF;
result := result || word;
counter := counter + 1;
END LOOP;
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
For now it assumes that string is not empty (provided by query where I will call function) and delimiter string contains at least one delimiter of provided pattern.
But now I need something better for performance test. If it is possible, I would love to see the most universal solution, because not every user of my system is working on PostgreSQL database (few of them prefer Oracle, MySQL or SQLite), but it is not the most importatnt. But performance is - because on specific search, that function can be called even few hundreds times.
I didn't find anything about fast and easy using varchar as a table of chars and checking for occurences of delimiter (I could remember position of occurences and then create substring from first char to n'th delimiter position-1). Any ideas? Are smarter solutions?
# EDIT: yea, I know that function in every database will be a bit different, but body of function can be very similliar or the same. Generality is not a main goal :) And sorry for that bad function working-name, I just saw it has not right meaning.
you can try doing something based on this:
select
varcharColumnName,
INSTR(varcharColumnName,'-',1,2),
case when INSTR(varcharColumnName,'-',1,2) <> 0
THEN SUBSTR(varcharColumnName, 1, INSTR(varcharColumnName,'-',1,2) - 1)
else '...'
end
from tableName;
of course, you have to handle "else" the way you want. It works on postgres and oracle (tested), it should work on other dbms's because these are standard sql functions
//edit - as a function, however this way it's rather hard to make it cross-dbms
CREATE OR REPLACE FUNCTION find_position_delimiter(fulltext varchar, delimiter varchar, occurence integer)
RETURNS varchar as
$BODY$
DECLARE
result varchar := '';
delimiterPos integer := 0;
BEGIN
delimiterPos := INSTR(fulltext,delimiter,1,occurence);
result := SUBSTR(fulltext, 1, delimiterPos - 1);
RETURN result;
END;
$BODY$
LANGUAGE 'plpgsql' IMMUTABLE;
SELECT find_position_delimiter('My-Example-Awesome-String', '-', 2);
create or replace function trunc(string text, delimiter char, occurence int) returns text as $$
return delimiter.join(string.split(delimiter)[:occurence])
$$ language plpythonu;
# select trunc('My-Example-Awesome-String', '-', 2);
trunc
------------
My-Example
(1 row)

How can I determine if a string is numeric in SQL?

In a SQL query on Oracle 10g, I need to determine whether a string is numeric or not. How can I do this?
You can use REGEXP_LIKE:
SELECT 1 FROM DUAL
WHERE REGEXP_LIKE('23.9', '^\d+(\.\d+)?$', '')
You ca try this:
SELECT LENGTH(TRIM(TRANSLATE(string1, ' +-.0123456789', ' '))) FROM DUAL
where string1 is what you're evaluating. It will return null if numeric. Look here for further clarification
I don't have access to a 10G instance for testing, but this works in 9i:
CREATE OR REPLACE FUNCTION is_numeric (p_val VARCHAR2)
RETURN NUMBER
IS
v_val NUMBER;
BEGIN
BEGIN
IF p_val IS NULL OR TRIM (p_val) = ''
THEN
RETURN 0;
END IF;
SELECT TO_NUMBER (p_val)
INTO v_val
FROM DUAL;
RETURN 1;
EXCEPTION
WHEN OTHERS
THEN
RETURN 0;
END;
END;
SELECT is_numeric ('333.5') is_numeric
FROM DUAL;
I have assumed you want nulls/empties treated as FALSE.
As pointed out by Tom Kyte in http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:7466996200346537833, if you're using the built-in TO_NUMBER in a user defined function, you may need a bit of extra trickery to make it work.
FUNCTION is_number(x IN VARCHAR2)
RETURN NUMBER
IS
PROCEDURE check_number (y IN NUMBER)
IS
BEGIN
NULL;
END;
BEGIN
PRAGMA INLINE(check_number, 'No');
check_number(TO_NUMBER(x);
RETURN 1;
EXCEPTION
WHEN INVALID_NUMBER
THEN RETURN 0;
END is_number;
The problem is that the optimizing compiler may recognize that the result of the TO_NUMBER is not used anywhere and optimize it away.
Says Tom (his example was about dates rather then numbers):
the disabling of function inlining will make it do the call to
check_date HAS to be made as a function call - making it so that the
DATE has to be pushed onto the call stack. There is no chance for the
optimizing compiler to remove the call to to_date in this case. If the
call to to_date needed for the call to check_date fails for any
reason, we know that the string input was not convertible by that date
format.
Here is a method to determine numeric that can be part of a simple query, without creating a function. Accounts for embedded spaces, +- not the first character, or a second decimal point.
var v_test varchar2(20);
EXEC :v_test := ' -24.9 ';
select
(case when trim(:v_test) is null then 'N' ELSE -- only banks, or null
(case when instr(trim(:v_test),'+',2,1) > 0 then 'N' ELSE -- + sign not first char
(case when instr(trim(:v_test),'-',2,1) > 0 then 'N' ELSE -- - sign not first char
(case when instr(trim(:v_test),' ',1,1) > 0 then 'N' ELSE -- internal spaces
(case when instr(trim(:v_test),'.',1,2) > 0 then 'N' ELSE -- second decimal point
(case when LENGTH(TRIM(TRANSLATE(:v_test, ' +-.0123456789',' '))) is not null then 'N' ELSE -- only valid numeric charcters.
'Y'
END)END)END)END)END)END) as is_numeric
from dual;
I found that the solution
LENGTH(TRIM(TRANSLATE(string1, ' +-.0123456789', ' '))) is null
allows embedded blanks ... it accepts "123 45 6789" which for my purpose is not a number.
Another level of trim/translate corrects this. The following will detect a string field containing consecutive digits with leading or trailing blanks such that to_number(trim(string1)) will not fail
LENGTH(TRIM(TRANSLATE(translate(trim(string1),' ','X'), '0123456789', ' '))) is null
For integers you can use the below. The first translate changes spaces to be a character and the second changes numbers to be spaces. The Trim will then return null if only numbers exist.
TRIM(TRANSLATE(TRANSLATE(TRIM('1 2 3d 4'), ' ','#'),'0123456789',' ')) is null

Parameters in query with in clause?

I want to use parameter for query like this :
SELECT * FROM MATABLE
WHERE MT_ID IN (368134, 181956)
so I think about this
SELECT * FROM MATABLE
WHERE MT_ID IN (:MYPARAM)
but it doesn't work...
Is there a way to do this ?
I actually use IBX and Firebird 2.1
I don't know how many parameters in IN clause.
For whom ever is still interested. I did it in Firebird 2.5 using another stored procedure inspired by this post.
How to split comma separated string inside stored procedure?
CREATE OR ALTER PROCEDURE SPLIT_STRING (
ainput varchar(8192))
RETURNS (
result varchar(255))
AS
DECLARE variable lastpos integer;
DECLARE variable nextpos integer;
DECLARE variable tempstr varchar(8192);
BEGIN
AINPUT = :AINPUT || ',';
LASTPOS = 1;
NEXTPOS = position(',', :AINPUT, LASTPOS);
WHILE (:NEXTPOS > 1) do
BEGIN
TEMPSTR = substring(:AINPUT from :LASTPOS for :NEXTPOS - :LASTPOS);
RESULT = :TEMPSTR;
LASTPOS = :NEXTPOS + 1;
NEXTPOS = position(',', :AINPUT, LASTPOS);
suspend;
END
END
When you pass the SP the following list
CommaSeperatedList = 1,2,3,4
and call
SELECT * FROM SPLIT_STRING(:CommaSeperatedList)
the result will be :
RESULT
1
2
3
4
And can be used as follows:
SELECT * FROM MyTable where MyKeyField in ( SELECT * FROM SPLIT_STRING(:CommaSeperatedList) )
I ended up using a global temporary table in Firebird, inserting parameter values first and to retrieve results I use a regular JOIN instead of a WHERE ... IN clause. The temporary table is transaction-specific and cleared on commit (ON COMMIT DELETE ROWS).
Maybe you should wite it like this:
SELECT * FROM MATABLE
WHERE MT_ID IN (:MYPARAM1 , :MYPARAM2)
I don't think it's something that can be done. Are there any particular reason why you don't want to build the query yourself?
I've used this method a couple of times, it doesn't use parameters though. It uses a stringlist and it's property DelimitedText. You create a IDList and populate it with your IDs.
Query.SQL.Add(Format('MT_ID IN (%s)', [IDList.DelimitedText]));
You might also be interested in reading the following:
http://www.sommarskog.se/dynamic_sql.html
and
http://www.sommarskog.se/arrays-in-sql-2005.html
Covers dynamic sql with 'in' clauses and all sorts. Very interesting.
Parameters are placeholders for single values, that means that an IN clause, that accepts a comma delimited list of values, cannot be used with parameters.
Think of it this way: wherever I place a value, I can use a parameter.
So, in a clause like: IN (:param)
I can bind the variable to a value, but only 1 value, eg: IN (4)
Now, if you consider an "IN clause value expression", you get a string of values: IN (1, 4, 6) -> that's 3 values with commas between them. That's part of the SQL string, not part of a value, which is why it cannot be bound by a parameter.
Obviously, this is not what you want, but it's the only thing possible with parameters.
The answer from Yurish is a solution in two out of three cases:
if you have a limited number of items to be added to your in clause
or, if you are willing to create parameters on the fly for each needed element (you don't know the number of elements in design time)
But if you want to have arbitrary number of elements, and sometimes no elements at all, then you can generate SLQ statement on the fly. Using format helps.
SELECT * FROM MATABLE
WHERE MT_ID IN (:MYPARAM) instead of using MYPARAM with :, use parameter name.
like SELECT * FROM MATABLE
WHERE MT_ID IN (SELECT REGEXP_SUBSTR(**MYPARAM,'[^,]+', 1, LEVEL)
FROM DUAL
CONNECT BY REGEXP_SUBSTR(MYPARAM, '[^,]+', 1, LEVEL) IS NOT NULL))**
MYPARAM- '368134,181956'
If you are using Oracle, then you should definitely check out Tom Kyte's blog post on exactly this subject (link).
Following Mr Kyte's lead, here is an example:
SELECT *
FROM MATABLE
WHERE MT_ID IN
(SELECT TRIM(substr(text, instr(text, sep, 1, LEVEL) + 1,
instr(text, sep, 1, LEVEL + 1) -
instr(text, sep, 1, LEVEL) - 1)) AS token
FROM (SELECT sep, sep || :myparam || sep AS text
FROM (SELECT ',' AS sep
FROM dual))
CONNECT BY LEVEL <= length(text) - length(REPLACE(text, sep, '')) - 1)
Where you would bind :MYPARAM to '368134,181956' in your case.
Here is a technique I have used in the past to get around that 'IN' statement problem. It builds an 'OR' list based on the amount of values specified with parameters (unique). Then all I had to do was add the parameters in the order they appeared in the supplied value list.
var
FilterValues: TStringList;
i: Integer;
FilterList: String;
Values: String;
FieldName: String;
begin
Query.SQL.Text := 'SELECT * FROM table WHERE '; // set base sql
FieldName := 'some_id'; // field to filter on
Values := '1,4,97'; // list of supplied values in delimited format
FilterList := '';
FilterValues := TStringList.Create; // will get the supplied values so we can loop
try
FilterValues.CommaText := Values;
for i := 0 to FilterValues.Count - 1 do
begin
if FilterList = '' then
FilterList := Format('%s=:param%u', [FieldName, i]) // build the filter list
else
FilterList := Format('%s OR %s=:param%u', [FilterList, FieldName, i]); // and an OR
end;
Query.SQL.Text := Query.SQL.Text + FilterList; // append the OR list to the base sql
// ShowMessage(FilterList); // see what the list looks like.
if Query.ParamCount <> FilterValues.Count then
raise Exception.Create('Param count and Value count differs.'); // check to make sure the supplied values have parameters built for them
for i := 0 to FilterValues.Count - 1 do
begin
Query.Params[i].Value := FilterValues[i]; // now add the values
end;
Query.Open;
finally
FilterValues.Free;
end;
Hope this helps.
There is one trick to use reversed SQL LIKE condition.
You pass the list as string (VARCHAR) parameter like '~12~23~46~567~'
Then u have query like
where ... :List_Param LIKE ('%~' || CAST( NumField AS VARCHAR(20)) || '~%')
CREATE PROCEDURE TRY_LIST (PARAM_LIST VARCHAR(255)) RETURNS (FIELD1....)
AS
BEGIN
/* Check if :PARAM_LIST begins with colon "," and ands with colon ","
the list should look like this --> eg. **",1,3,4,66,778,33,"**
if the format of list is right then GO if not just add then colons
*/
IF (NOT SUBSTRING(:PARAM_LIST FROM 1 FOR 1)=',') THEN PARAM_LIST=','||PARAM_LIST;
IF (NOT SUBSTRING(:PARAM_LIST FROM CHAR_LENGTH(:PARAM_LIST) FOR 1)=',') THEN PARAM_LIST=PARAM_LIST||',';
/* Now you are shure thet :PARAM_LIST format is correct */
/ * NOW ! */
FOR SELECT * FROM MY_TABLE WHERE POSITION(','||MY_FIELD||',' in :PARAM_LIST)>0
INTO :FIELD1, :FIELD2 etc... DO
BEGIN
SUSPEND;
END
END
How to use it.
SELECT * FROM TRY_LIST('3,4,544,87,66,23')
or SELECT * FROM TRY_LIST(',3,4,544,87,66,23,')
if the list have to be longer then 255 characters then just change the part of header f.eg. like PARAM_LIST VARCHAR(4000)