How to get the first field from an anonymous row type in PostgreSQL 9.4?

=# select row(0, 1) ;
row
-------
(0,1)
(1 row)
How can I get 0 within the same query? The query below sort of works, but is there a simpler way?
=# select json_agg(row(0, 1))->0->'f1' ;
?column?
----------
0
(1 row)
No luck with array-like syntax [0].
Thanks!

Your row type is anonymous and therefore you cannot access its elements easily. What you can do is create a TYPE and then cast your anonymous row to that type and access the elements defined in the type:
CREATE TYPE my_row AS (
x integer,
y integer
);
SELECT (row(0,1)::my_row).x;
As Craig Ringer commented on your question, you should avoid producing anonymous rows in the first place if you can help it, and give a type to whatever data you use in your data model and queries.

If you just want the first element from any row, convert the row to JSON and select f1...
SELECT row_to_json(row(0,1))->'f1'
Or, if you are always going to have two integers or a strict structure, you can create a temporary table (or type) and a function that selects the first column.
CREATE TABLE tmptable(f1 int, f2 int);
CREATE FUNCTION gettmpf1(tmptable) RETURNS int AS 'SELECT $1.f1' LANGUAGE SQL;
SELECT gettmpf1(ROW(0,1));
Resources:
https://www.postgresql.org/docs/9.2/static/functions-json.html
https://www.postgresql.org/docs/9.2/static/sql-expressions.html
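For intuition about where the f1 key comes from, here is a rough Python sketch of the JSON approach above. It assumes only what the query output shows: the fields of an anonymous row are keyed f1, f2, ... by position. row_to_json_sketch is a hypothetical stand-in written for this illustration, not a PostgreSQL or library function.

```python
import json


def row_to_json_sketch(*values):
    """Hypothetical stand-in: key the fields of an anonymous row by position."""
    fields = {f"f{i}": value for i, value in enumerate(values, start=1)}
    # Compact separators, so the text looks like the database output.
    return json.dumps(fields, separators=(",", ":"))


as_json = row_to_json_sketch(0, 1)
first_field = json.loads(as_json)["f1"]  # same idea as ->'f1'
print(as_json, first_field)
```

The lookup by "f1" works for a row of any width, which is why the JSON route needs no type declaration.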

The json solution is very elegant. Just for fun, this is a solution using regexp (much uglier):
WITH r AS (SELECT row('quotes, "commas",
and a line break".',null,null,'"fourth,field"')::text AS r)
--WITH r AS (SELECT row('',null,null,'')::text AS r)
--WITH r AS (SELECT row(0,1)::text AS r)
SELECT CASE WHEN r.r ~ '^\("",' THEN ''
WHEN r.r ~ '^\("' THEN regexp_replace(regexp_replace(regexp_replace(right(r.r, -2), '""', '\"', 'g'), '([^\\])",.*', '\1'), '\\"', '"', 'g')
ELSE (regexp_matches(right(r.r, -1), '^[^,]*'))[1] END
FROM r
When converting a row to text, PostgreSQL uses quoted CSV formatting. I couldn't find any tools for importing quoted CSV into an array, so the above is a crude text manipulation via mostly regular expressions. Maybe someone will find this useful!

With PostgreSQL 13+, you can just reference individual fields of the row with .fN notation. For your example:
select (row(0, 1)).f1; --> returns 0.
See https://www.postgresql.org/docs/13/sql-expressions.html#SQL-SYNTAX-ROW-CONSTRUCTORS

Related

PostgreSQL SQL query to find number of occurrences of substring in string

I’m trying to wrap my head around a problem but I’m hitting a blank. I know SQL quite well, but I’m not sure how to approach this.
My problem:
Given a string and a table of possible substrings, I need to find the number of occurrences.
The search table consists of a single column:
searchtable
| pattern TEXT PRIMARY KEY|
|-------------------------|
| my |
| quick |
| Earth |
Given the string "Earth is my home planet and where my friends live", the expected outcome is 3 (2x "my" and 1x "Earth").
In my function, I have variable bodytext which is the string to examine.
I know I can do IN (SELECT pattern FROM searchtable) to get the list of substrings, and I could possibly use a LIKE ANY clause to get matches, but how can I count occurrences of the substrings in the table within the search string?
This is easily done without a custom function:
select count(*)
from (values ('Earth is my home planet and where my friends live')) v(str) cross join lateral
regexp_split_to_table(v.str, ' ') word join
patterns p
on word = p.pattern
Just break the original string into "words". Then match on the words.
Another method uses regular expression matching:
select (select count(*) from regexp_matches(v.str, p.rpattern, 'g'))
from (values ('Earth is my home planet and where my friends live')) v(str) cross join
(select string_agg(pattern, '|') as rpattern
from patterns
) p;
This stuffs all the patterns into a single regular expression. Note that this version does not take word breaks into account.
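The difference between the two queries can be sketched outside the database. This is a Python illustration of the logic only, not a translation of the SQL: the function names are made up for the example, and re.escape is an added assumption (the SQL version concatenates the patterns unescaped).

```python
import re


def count_word_matches(text, patterns):
    """Split on single spaces and count the words that equal a pattern."""
    wanted = set(patterns)
    return sum(1 for word in text.split(" ") if word in wanted)


def count_regex_matches(text, patterns):
    """Join the patterns with '|' and count every regex hit."""
    if not patterns:
        return 0  # an empty alternation would match at every position
    combined = "|".join(re.escape(p) for p in patterns)
    return len(re.findall(combined, text))


text = "Earth is my home planet and where my friends live"
patterns = ["my", "quick", "Earth"]
print(count_word_matches(text, patterns), count_regex_matches(text, patterns))
print(count_word_matches("my army", ["my"]), count_regex_matches("my army", ["my"]))
```

On the sample sentence both counts agree. On "my army" they differ, because the regex version also counts the "my" inside "army".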
I solved the problem with the following code:
CREATE OR REPLACE FUNCTION count_matches(body TEXT, OUT matches INTEGER) AS $$
DECLARE
results INTEGER := 0;
matchlist RECORD;
BEGIN
FOR matchlist IN (SELECT pattern FROM searchtable)
LOOP
results := results + (SELECT LENGTH(body) -
LENGTH(REPLACE(body, matchlist.pattern, ''))) /
LENGTH(matchlist.pattern);
END LOOP;
matches := results;
END;
$$ LANGUAGE plpgsql;
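The arithmetic inside the loop is a length-difference trick: removing every occurrence of a pattern shortens the string by (occurrences x pattern length). A small Python sketch of the same idea follows; the guard against empty patterns is an addition for the sketch, since dividing by a zero length would fail.

```python
def count_matches(body, patterns):
    """Count non-overlapping occurrences of each pattern via length difference."""
    total = 0
    for pattern in patterns:
        if not pattern:
            continue  # added guard: an empty pattern would divide by zero
        removed = len(body) - len(body.replace(pattern, ""))
        total += removed // len(pattern)
    return total


body = "Earth is my home planet and where my friends live"
print(count_matches(body, ["my", "quick", "Earth"]))
```

Like the regex version, this counts substrings rather than whole words, and it counts non-overlapping occurrences only.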

NOT IN is not working as expected with Listagg function

Below is the DDL of the table
create or replace table tempdw.blk_table
(
db_name varchar,
tbl_expr varchar
);
insert into tempdw.blk_table values ('edw','ABC%');
insert into tempdw.blk_table values ('edw','EFG%');
select * from tempdw.blk_table;
The code below is not working; I expect it to return no rows:
select * from tempdw.blk_table where tbl_expr not in (
select regexp_replace(regexp_replace(replace(listagg(tbl_expr,','),',','\',\''),'^','\''),'$','\'') from tempdw.blk_table);
When I run the code below, it works fine. I am trying to understand why the code above does not:
select * from tempdw.blk_table where tbl_expr NOT IN('ABC%','EFG%');
Au contraire. The code is working just fine. You don't understand the difference between a string that has commas and a list of strings.
Unfortunately, it is rather hard to figure out what you do want to do, because your question does not explain that.
I can speculate that you want something like:
select bt.*
from blk_table bt
where db_name like tbl_expr;
This is just a guess, however.
with data as (
select * from values ('edw','ABC%'),('edw','ABC%') v(db_name,tbl_expr )
)
select * from data
where tbl_expr not in (
select regexp_replace(regexp_replace(replace(listagg(tbl_expr,','),',','\',\''),'^','\''),'$','\'') from data);
does indeed give the results you don't want. aka:
DB_NAME TBL_EXPR
edw ABC%
edw ABC%
because your sub-query returns only one row: you have aggregated the two input rows into one.
REGEXP_REPLACE( REGEXP_REPLACE( REPLACE( LISTAGG( TBL_EXPR,','),',','\',\''),'^','\''),'$','\'')
'ABC%','ABC%'
And NOT IN is an exact match. To see this, switch from strings to numbers:
SELECT num, num in (2,3,4) FROM values (1),(3),(5) v(num);
gives:
NUM NUM IN (2,3,4)
1 0
3 1
5 0
So your NOT IN only filters out strings that are equal to the one string in your list. Since that string is the aggregate of all the inputs, no individual input is ever equal to it.
Back to strings:
SELECT str
,str in ('str_a', 'str_b')
,str not in ('str_a', 'str_b')
from values ('a'),('str_b') v(str);
gives:
STR STR IN ('STR_A', 'STR_B') STR NOT IN ('STR_A', 'STR_B')
a 0 1
str_b 1 0
Thus the results you are getting.
Now, I suspect you want LIKE-type behavior or a regex match, but given that you are building the list yourself, you presumably know what you are doing there.
Also note that:
listagg(tbl_expr,',') AS a
,replace(a,',','\',\'') AS b
,regexp_replace(b,'^','\'') AS c
,regexp_replace(c,'$','\'') AS d
is the effect of what you are doing, and it can be replaced with
listagg('\'' || tbl_expr || '\'',',')
unless you want strings with embedded comma to become independent "list" items..
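The core point, that one aggregated string is not a list of strings, can be sketched in a few lines of Python. This mirrors the logic only and ignores SQL NULL handling; the variable and function names are made up for the illustration.

```python
values = ["ABC%", "EFG%"]

# What the LISTAGG/REPLACE chain builds: ONE string that merely looks like a list.
aggregated = "'" + "','".join(values) + "'"


def not_in(candidates, allowed):
    """Keep the candidates that are not exactly equal to any allowed item."""
    return [c for c in candidates if c not in allowed]


single_string_list = not_in(values, [aggregated])  # list holds one long string
real_list = not_in(values, values)                 # list holds two strings

print(aggregated)
print(single_string_list)
print(real_list)
```

Against the one-element list nothing is filtered out, so both rows come back. Against the real two-element list, no rows come back.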

How to remove elements of array in PostgreSQL?

Is it possible to remove multiple elements from an array?
Before removing elements Array1 is :
{1,2,3,4}
Array2 that contains some elements I wish to remove:
{1,4}
And I want to get:
{2,3}
How can I do this?
Use unnest() with array_agg(), e.g.:
with cte(array1, array2) as (
values (array[1,2,3,4], array[1,4])
)
select array_agg(elem)
from cte, unnest(array1) elem
where elem <> all(array2);
array_agg
-----------
{2,3}
(1 row)
If you often need this functionality, define the simple function:
create or replace function array_diff(array1 anyarray, array2 anyarray)
returns anyarray language sql immutable as $$
select coalesce(array_agg(elem), '{}')
from unnest(array1) elem
where elem <> all(array2)
$$;
You can use the function for any array, not only int[]:
select array_diff(array['a','b','c','d'], array['a','d']);
array_diff
------------
{b,c}
(1 row)
With some help from this post:
select array_agg(elements) from
(select unnest('{1,2,3,4}'::int[])
except
select unnest('{1,4}'::int[])) t (elements)
Result:
{2,3}
With the intarray extension, you can simply use -:
select '{1,2,3,4}'::int[] - '{1,4}'::int[]
Result:
{2,3}
You'll need to install the intarray extension if you didn't already. It adds many convenient functions and operators if you're dealing with arrays of integers.
This answer is the simplest I think:
https://stackoverflow.com/a/6535089/673187
SELECT array(SELECT unnest(:array1) EXCEPT SELECT unnest(:array2));
so you can easily use it in an UPDATE command, when you need to remove some elements from an array column:
UPDATE table1 SET array1_column=(SELECT array(SELECT unnest(array1_column) EXCEPT SELECT unnest('{2, 3}'::int[])));
You can use this function for when you are dealing with bigint/int8 numbers and want to maintain order:
CREATE OR REPLACE FUNCTION arr_subtract(int8[], int8[])
RETURNS int8[] AS
$func$
SELECT ARRAY(
SELECT a
FROM unnest($1) WITH ORDINALITY x(a, ord)
WHERE a <> ALL ($2)
ORDER BY ord
);
$func$ LANGUAGE sql IMMUTABLE;
I got this solution from the following answer to a similar question: https://stackoverflow.com/a/8584080/1544473
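The answers above fall into two families that behave differently on duplicates and ordering: filtering with <> ALL keeps the original order and any duplicates, while EXCEPT works on sets. A Python sketch of the distinction, with made-up function names and assuming hashable elements:

```python
def array_diff(array1, array2):
    """Filter approach: keeps the order and duplicates of array1."""
    remove = set(array2)
    return [elem for elem in array1 if elem not in remove]


def array_except(array1, array2):
    """Set approach: duplicates collapse and no order is guaranteed."""
    return set(array1) - set(array2)


print(array_diff([1, 2, 3, 4], [1, 4]))
print(array_diff([3, 2, 2, 1], [1]))
print(array_except([3, 2, 2, 1], [1]))
```

If the array column can hold repeated values, or if element order matters, the filter approach is the safer choice.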
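Use array slice notation. Note that this selects by position, not by value, so it only works here because the values to remove happen to sit at the first and last positions: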
array[<start index>:<end index>]
WITH t(stack, dim) as (
VALUES(ARRAY[1,2,3,4], ARRAY[1,4])
) SELECT stack[dim[1]+1:dim[2]-1] FROM t

Select a portion of a comma delimited string in DB2/DB2400

I need to select a value within a comma delimited string using only SQL. Is this possible?
Data
A B C
1 Luigi Apple,Banana,Pineapple,,Citrus
I need to select specifically the 2nd item in column C, in this case banana. I need help. I cannot create new SQL functions, I can only use SQL. This is the as400 so the SQL is somewhat old tech.
Update..
With help from @Sandeep we were able to come up with
SELECT xmlcast(xmlquery('$x/Names/Name[2]' passing xmlparse(document CONCAT(CONCAT('<?xml version="1.0" encoding="UTF-8" ?><Names><Name>',REPLACE(ODWDATA,',','</Name><Name>')),'</Name></Names>')) as "x") as varchar(1000)) FROM ACL00
I'm getting this error
Keyword PASSING not expected. Valid tokens: ) ,.
New update. Problem solved by using UDF of Oracle's INSTR
I'm assuming db2 which I don't use, so the following syntax may not be bang on but the approach works.
In Oracle I'd use INSTR() and SUBSTR(), Google suggests LOCATE() and SUBSTR() for db2
Use LOCATE to get the position of the first comma, and use that value in SUBSTR to grab the end of YourColumn starting after the first comma
SUBSTR(YourColumn, LOCATE(',', YourColumn) + 1)
You started with "Apple,Banana,Pineapple,,Citrus" and should now have "Banana,Pineapple,,Citrus", so we use LOCATE and SUBSTR again on the string returned above.
SUBSTR(SUBSTR(YourColumn, LOCATE(',', YourColumn) + 1), 1, LOCATE(',', SUBSTR(YourColumn, LOCATE(',', YourColumn) + 1)) - 1)
First SUBSTR is getting the right hand side of the string so we only need a start position parameter, second SUBSTR is grabbing the left side of the string so we need two, the start position and the length to return.
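The two steps can be traced in Python, with str.find standing in for LOCATE and slicing standing in for SUBSTR. This is a sketch of the approach rather than DB2 code; second_item is a hypothetical helper, and the None and "no second comma" branches are additions that the SQL expression does not have.

```python
def second_item(value):
    """Return the 2nd comma-separated item, or None if there is no comma."""
    first_comma = value.find(",")
    if first_comma < 0:
        return None
    # Step 1: everything after the first comma.
    rest = value[first_comma + 1:]
    # Step 2: everything in that remainder before its first comma.
    next_comma = rest.find(",")
    return rest if next_comma < 0 else rest[:next_comma]


print(second_item("Apple,Banana,Pineapple,,Citrus"))
```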
If you only want the 2nd item, you can cast the string to XML and select the element by position:
DECLARE @TABLE TABLE
(
A INT,
B VARCHAR(100),
C VARCHAR(100)
)
DECLARE @NTH INT = 2
INSERT INTO @TABLE VALUES (1,'Luigi','Apple,Banana,Pineapple,,Citrus')
SELECT REPLACE(REPLACE(CAST(CAST('<Name>'+ REPLACE(C,',','</Name><Name>') +'</Name>' AS XML).query('/Name[sql:variable("@NTH")]') AS VARCHAR(1000)),'<Name>',''),'</Name>','') FROM @TABLE
I am answering my own question now. It is impossible to do this with the built in functions within AS400
You have to create a UDF that emulates Oracle's INSTR.
Enter this within STRSQL; it will create a new function called INSTRB:
CREATE FUNCTION INSTRB (C1 VarChar(4000), C2 VarChar(4000), N integer, M integer)
RETURNS Integer
SPECIFIC INSTRBOracleBase
LANGUAGE SQL
CONTAINS SQL
NO EXTERNAL ACTION
DETERMINISTIC
BEGIN ATOMIC
DECLARE Pos, R, C2L Integer;
SET C2L = LENGTH(C2);
IF N > 0 THEN
SET (Pos, R) = (N, 0);
WHILE R < M AND Pos > 0 DO
SET Pos = LOCATE(C2,C1,Pos);
IF Pos > 0 THEN
SET (Pos, R) = (Pos + 1, R + 1);
END IF;
END WHILE;
RETURN (Pos - 1)*(1-SIGN(M-R));
ELSE
SET (Pos, R) = (LENGTH(C1)+N, 0);
WHILE R < M AND Pos > 0 DO
IF SUBSTR(C1,Pos,C2L) = C2 THEN
SET R = R + 1;
END IF;
SET Pos = Pos - 1;
END WHILE;
RETURN (Pos + 1)*(1-SIGN(M-R));
END IF;
END
Then, to select the nth delimited value within a comma-delimited string (in this case the 14th), use this query utilizing the new function:
SELECT SUBSTRING(C,INSTRB(C,',',1,13)+1,INSTRB(C,',',1,14)-INSTRB(C,',',1,13)-1) FROM TABLE
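The position arithmetic in that query can be checked with a small Python model. It covers only the forward-search case (positive start) of the function above, uses 1-based positions with 0 meaning "not found", and assumes a single-character separator. The handling of the last field and of a missing field is an addition for the sketch; the SQL expression does not handle either.

```python
def instr(source, search, start=1, occurrence=1):
    """1-based position of the given occurrence of search, or 0 if absent."""
    if occurrence < 1:
        return 0
    pos = start - 1  # convert the 1-based start to a 0-based index
    for _ in range(occurrence):
        pos = source.find(search, pos)
        if pos < 0:
            return 0
        pos += 1  # step past this hit; pos is now its 1-based position
    return pos


def nth_field(source, n, sep=","):
    """n-th separated field (1-based), or None if there are fewer fields."""
    begin = instr(source, sep, 1, n - 1)  # 0 when n == 1
    if n > 1 and begin == 0:
        return None  # added: fewer than n fields
    end = instr(source, sep, 1, n)
    if end == 0:
        end = len(source) + 1  # added: the last field has no trailing separator
    return source[begin:end - 1]


print(nth_field("Apple,Banana,Pineapple,,Citrus", 2))
```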
A much prettier solution IMO would be to encapsulate a Recursive Common Table Expression (recursive CTE aka RCTE) of the data from the column C to generate a result TABLE [i.e. a User Defined Table Function (a Table UDF aka UDTF)] then use a Scalar Subselect to choose which effective record\row number.
select
a
, b
, ( select S.token_vc
from table( split_tokens(c) ) as S
where S.token_nbr = 2
) as "2nd Item of column C"
from The_Table /* in OP described with columns a,b,c but no DDL */
Yet prettier would be to make the result of that same RCTE a scalar value, so as to allow being invoked simply as a Scalar UDF with the effective row number [as another argument] defining specifically which element to select.
select
a
, b
, split_tokens(c, 2) as "2nd Item of column C"
from The_Table /* in OP described with columns a,b,c but no DDL */
The latter could be more efficient, limiting the row-data produced by the RCTE, to only the desired numbered token and those preceding numbered tokens. I can not comment on the efficiency with regard to impacts on CPU and storage as contrasted with any of the other answers offered, but my own experience with the temporary-storage implementation and the overall quickness of the RCTE results has been positive especially when other row selection limits the number of derived-table results that must be produced for the overall query request.
The UDF [and\or UDTF and the RCTE that implements them] is left as an exercise for the reader; mostly, because I do not have a system on a release that has support for recursive table expressions. If asked [e.g. in a comment to this answer], I could provide untested code source.
I have found the locate_in_string function to work very well in this case.
select substr(
c,
locate_in_string(c, ',')+1,
locate_in_string(c, ',', locate_in_string(c, ',')+1) - locate_in_string(c, ',')-1
) as fruit2
from ACL00 for read only with ur;

Selecting data into a Postgres array

I have the following data:
name id url
John 1 someurl.com
Matt 2 cool.com
Sam 3 stackoverflow.com
How can I write an SQL statement in Postgres to select this data into a multi-dimensional array, i.e.:
{{John, 1, someurl.com}, {Matt, 2, cool.com}, {Sam, 3, stackoverflow.com}}
I've seen this kind of array usage before in Postgres but have no idea how to select data from a table into this array format.
Assuming here that all the columns are of type text.
You cannot use array_agg() to produce multi-dimensional arrays, at least not up to PostgreSQL 9.4.
(But the upcoming Postgres 9.5 ships a new variant of array_agg() that can!)
What you get out of @Matt Ball's query is an array of records (the_table[]).
An array can only hold elements of the same base type. You obviously have number and string types. Convert all columns (that aren't already) to text to make it work.
You can create an aggregate function for this like I demonstrated to you here before.
CREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND = '{}'
);
Call:
SELECT array_agg_mult(ARRAY[ARRAY[name, id::text, url]]) AS tbl_mult_arr
FROM tbl;
Note the additional ARRAY[] layer to make it a multidimensional array (2-dimensional, to be precise).
Instant demo:
WITH tbl(id, txt) AS (
VALUES
(1::int, 'foo'::text)
,(2, 'bar')
,(3, '}b",') -- txt has meta-characters
)
, x AS (
SELECT array_agg_mult(ARRAY[ARRAY[id::text,txt]]) AS t
FROM tbl
)
SELECT *, t[1][1] AS arr_element_1_1, t[3][2] AS arr_element_3_2
FROM x;
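What the custom aggregate does can be modelled in Python as a fold: start from an empty array and concatenate one single-row 2-D array per input row. This is a sketch of the idea only. The Python functions borrow the SQL names for readability and are not real library calls, and Python indexes from 0 where PostgreSQL arrays start at 1.

```python
from functools import reduce


def array_cat(left, right):
    """Model of array_cat: append the rows of right to left."""
    return left + right


def array_agg_mult(arrays):
    """Fold array_cat over the inputs, starting from an empty array ('{}')."""
    return reduce(array_cat, arrays, [])


tbl = [(1, "foo"), (2, "bar"), (3, '}b",')]

# One [[id::text, txt]] per row; the extra list layer keeps the rows separate.
t = array_agg_mult([[[str(row_id), txt]] for row_id, txt in tbl])

print(t)
print(t[0][0], t[2][1])  # t[1][1] and t[3][2] in PostgreSQL's 1-based indexing
```

Without the extra layer the fold would flatten everything into one long 1-D array, which is the reason for the additional ARRAY[] in the call.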
You need to use an aggregate function; array_agg should do what you need.
SELECT array_agg(s) FROM (SELECT name, id, url FROM the_table ORDER BY id) AS s;