Working with where clauses inside views - sql

CREATE VIEW summary AS
SELECT sales.inventory_id,
sum(sales.value) AS value,
count(sales.*) AS sales_n
FROM promotion
JOIN sales USING (promotion_id)
GROUP BY sales.inventory_id
;
This is a simplified version of a query that I am dealing with.
However, I would like to be able to add a WHERE clause to the view. Something like this: WHERE promotion.created_at > ?
QUESTION:
How should I modify the view so that I can do something like this:
SELECT *
FROM summary
WHERE promotion.created_at > [timestamp]
I am currently not able to do this because the view groups by inventory_id, so promotion.created_at is not part of its output.

Use an SQL function taking the timestamp as parameter instead:
CREATE FUNCTION summary(_created_at timestamp)
RETURNS TABLE (inventory_id int, value bigint, sales_n bigint) -- count() returns bigint
AS
$func$
SELECT s.inventory_id
, sum(s.value) -- AS value -- alias not visible outside function
, count(*) -- AS sales_n
FROM promotion p
JOIN sales s USING (promotion_id)
WHERE p.created_at > $1
GROUP BY s.inventory_id
$func$ LANGUAGE sql STABLE;
Call:
SELECT * FROM summary('2014-06-01 10:00');
I shortened the syntax with table aliases.
count(*) does the same as count(s.*) here.
Adapt RETURNS TABLE(...) to your actual data types.
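To try this end to end, here is a minimal sketch with made-up sample data. The column types, the sample rows and the use of the pg_temp schema (so that nothing outlives the session) are my assumptions, not part of the question:

```sql
-- Hypothetical tables and rows, for illustration only
CREATE TEMP TABLE promotion (promotion_id int PRIMARY KEY, created_at timestamp NOT NULL);
CREATE TEMP TABLE sales (promotion_id int, inventory_id int, value int);

INSERT INTO promotion VALUES (1, '2014-05-01 00:00'), (2, '2014-07-01 00:00');
INSERT INTO sales VALUES (1, 10, 5), (2, 10, 7), (2, 10, 3), (2, 20, 4);

-- A function along the lines of the one above, created in pg_temp.
-- sum() over int yields bigint, count(*) yields bigint.
CREATE FUNCTION pg_temp.summary(_created_at timestamp)
  RETURNS TABLE (inventory_id int, value bigint, sales_n bigint)
  LANGUAGE sql STABLE AS
$func$
SELECT s.inventory_id, sum(s.value), count(*)
FROM   promotion p
JOIN   sales s USING (promotion_id)
WHERE  p.created_at > $1          -- the filter is applied before aggregating
GROUP  BY s.inventory_id
$func$;

-- Only promotion 2 qualifies:
-- inventory 10 -> value 10, sales_n 2; inventory 20 -> value 4, sales_n 1
SELECT * FROM pg_temp.summary('2014-06-01 10:00') ORDER BY inventory_id;
```

Functions in pg_temp have to be called schema-qualified, as shown in the last statement.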

Related

Table variable join equivalent in Oracle

I'm given quite a huge table My_Table and a user-defined collection Picture_Arr as an input usually having 5-10 rows declared as:
TYPE Picture_Rec IS RECORD (
seq_no NUMBER,
task_seq NUMBER);
TYPE Picture_Arr IS TABLE OF Picture_Rec;
In MS SQL I would normally write something like:
DECLARE @Picture_Arr TABLE (seq_no INT, task_seq INT)
SELECT M.*
FROM My_Table M
INNER JOIN @Picture_Arr A
ON M.seq_no = A.seq_no AND M.task_seq = A.task_seq
But I can't get my head around how to rewrite the same code in Oracle, as Picture_Arr is not a table. Some tutorials suggest looping through My_Table and comparing keys, but is that efficient in Oracle, or is there another way of doing it?
Perhaps this is what you are looking for. It is a bit hard to tell what the desired output is, and whether the data of the records is stored somewhere or not.
create type Picture_Rec as object(
seq_no NUMBER,
task_seq NUMBER
)
/
create type Picture_Tab as table of Picture_Rec
/
create or replace function get_picture_list
return Picture_Tab
is
l_pic Picture_Tab;
begin
select Picture_Rec ( seqno, taskseq )
bulk collect into l_pic
from your_table; -- the table you have these records
return l_pic;
end;
/
Then you run
SELECT M.*
FROM My_Table M
JOIN TABLE ( get_picture_list() ) p
ON M.seq_no = p.seq_no AND M.task_seq = p.task_seq

PostgreSQL function returning a data cube

First off, the iceberg-cube query is defined as in the image (not included).
Let's say I have a relation sales(item, location, year, supplier, unit_sales),
and I would like to write a plpgsql function as
a wrapper around the query in the image, to specify the parameter N,
like so:
create or replace function iceberg_query( percentage integer )
returns cube
/* Code here */
as
$$
declare
numrows int;
begin
select count(*) into numrows from sales;
select item, location, year, count(*)
from sales
group by cube(item,location,year)
having count(*) >= numrows*percentage/100;
end;
$$ language 'plpgsql'
What do I need to add in the "Code here" part to make this work? How do I specify a data cube as a return type in plpgsql?
To make your plpgsql function work, you need a RETURNS clause matching what you return. And you need to actually return something. I suppose:
CREATE OR REPLACE FUNCTION iceberg_query ( percentage numeric)
RETURNS TABLE (item ?TYPE?, location ?TYPE?, year ?TYPE?, ct bigint)
AS
$func$
DECLARE
numrows bigint := (SELECT count(*) FROM sales);
BEGIN
RETURN QUERY
SELECT s.item, s.location, s.year, count(*)
FROM sales s
GROUP BY cube(s.item,s.location,s.year)
HAVING count(*) >= numrows * percentage / 100;
END
$func$ LANGUAGE plpgsql;
Replace the placeholders ?TYPE? with actual (undisclosed) data types.
Call the function with:
SELECT * FROM iceberg_query (10);
Note how I table-qualify all column names in the query to avoid naming collisions with the new OUT parameters of the same name.
And note the use of numeric instead of integer as pointed out by Scoots in a comment.
Related:
How to return result of a SELECT inside a function in PostgreSQL?
plpgsql error "RETURN NEXT cannot have a parameter in function with OUT parameters" in table-returning function
Aside: you don't need a function for this. This plain SQL query does the same:
SELECT s.item, s.location, s.year, count(*)
FROM sales s
GROUP BY cube(s.item,s.location,s.year)
HAVING count(*) >= (SELECT count(*) * $percentage / 100 FROM sales); -- your pct here
Provide a numeric literal (10.0, not 10) to avoid integer division and the rounding that comes with it.
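To see why the literal matters, compare the threshold for a hypothetical table of 25 rows at 10 percent (the numbers are made up for illustration):

```sql
SELECT 25 * 10   / 100 AS int_threshold      -- 2: integer division truncates 2.5
     , 25 * 10.0 / 100 AS numeric_threshold; -- 2.5: numeric division keeps the fraction
```

With the integer version, a group of 2 rows would pass the HAVING filter although it is below 10 percent of 25 rows.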

PostgreSQL Function Returning Table or SETOF

Quick background: very new to PostgreSQL, came from SQL Server. I am working on converting stored procedures from SQL Server to PostgreSQL. I have read Postgres 9.3 documentation and looked through numerous examples and questions but still cannot find a solution.
I have this select statement that I execute weekly to return new values
select distinct id, null as charges
, generaldescription as chargedescription, amount as paidamount
from mytable
where id not in (select id from myothertable)
What can I do to turn this into a function that I can just right-click and execute on a weekly basis? Eventually I will automate this using a program another developer built, so it will execute with no user involvement and send the results in a spreadsheet to the user. I am not sure if that last part matters to how the function is written.
This is one of my many failed attempts:
CREATE FUNCTION newids
RETURNS TABLE (id VARCHAR, charges NUMERIC
, chargedescription VARCHAR, paidamount NUMERIC) AS Results
Begin
SELECT DISTINCT id, NULL AS charges
, generaldescription AS chargedescription, amount AS paidamount
FROM mytable
WHERE id NOT IN (SELECT id FROM myothertable)
END;
$$ LANGUAGE plpgsql;
Also I am using Navicat and the error is:
function result type must be specified
You can use PL/pgSQL here and it may even be the better choice.
Difference between language sql and language plpgsql in PostgreSQL functions
But you need to fix a syntax error, match your dollar-quoting, add an explicit cast (NULL::numeric) and use RETURN QUERY:
CREATE FUNCTION newids_plpgsql()
RETURNS TABLE (
id varchar
, charges numeric
, chargedescription varchar
, paidamount numeric
) AS -- Results -- remove this
$func$
BEGIN
RETURN QUERY
SELECT ...;
END;
$func$ LANGUAGE plpgsql;
Or use a simple SQL function:
CREATE FUNCTION newids_sql()
RETURNS TABLE (
id varchar
, charges numeric
, chargedescription varchar
, paidamount numeric
) AS
$func$
SELECT ...;
$func$ LANGUAGE sql;
Either way, the SELECT statement can be more efficient:
SELECT DISTINCT t.id, NULL::numeric AS charges
, t.generaldescription AS chargedescription, t.amount AS paidamount
FROM mytable t
LEFT JOIN myothertable m ON m.id = t.id
WHERE m.id IS NULL;
Select rows which are not present in other table
Of course, all data types must match. I can't tell, table definition is missing.
And are you sure you need DISTINCT?
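Apart from performance, there is a correctness argument for the anti-join, sketched here with made-up rows (the table contents and column types are my assumption): NOT IN returns no rows at all as soon as the subquery produces a NULL, while the LEFT JOIN variant is not affected.

```sql
-- Hypothetical sample data
CREATE TEMP TABLE mytable (id varchar, generaldescription varchar, amount numeric);
CREATE TEMP TABLE myothertable (id varchar);

INSERT INTO mytable VALUES ('a', 'x', 1), ('b', 'y', 2);
INSERT INTO myothertable VALUES ('a'), (NULL);

-- Returns no rows: 'b' <> NULL is unknown, so the NOT IN test is never true
SELECT id
FROM   mytable
WHERE  id NOT IN (SELECT id FROM myothertable);

-- Returns 'b': only rows without a matching id survive the IS NULL filter
SELECT t.id
FROM   mytable t
LEFT   JOIN myothertable m ON m.id = t.id
WHERE  m.id IS NULL;
```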

Left join with dynamic table name derived from column

I am new to PostgreSQL and I wonder if it's possible to use a number from table tdc as part of the table name in a left join: 'pa' || number. So for example, if number is 456887, I want a left join with table pa456887. Something like this:
SELECT tdc.cpa, substring(tdc.ku,'[0-9]+') AS number, paTab.vym
FROM public."table_data_C" AS tdc
LEFT JOIN concat('pa' || number) AS paTab ON (paTab.cpa = tdc.cpa)
And I want to use only PostgreSQL, not additional code in PHP for example.
Either way, you need dynamic SQL.
Table name as given parameter
CREATE OR REPLACE FUNCTION foo(_number int)
RETURNS TABLE (cpa int, nr text, vym text) AS -- adapt to actual data types!
$func$
BEGIN
RETURN QUERY EXECUTE format(
'SELECT t.cpa, substring(t.ku, ''[0-9]+''), p.vym -- single quotes doubled inside the string
FROM public."table_data_C" t
LEFT JOIN %s p USING (cpa)'
, 'pa' || _number
);
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM foo(456887);
Generally, you would sanitize table names with format ( %I ) to avoid SQL injection. With just an integer as dynamic input that's not necessary. More details and links in this related answer:
INSERT with dynamic table name in trigger function
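A quick way to see what %I does is to call format() directly. The third input is a made-up hostile string, purely for illustration:

```sql
SELECT format('%I', 'pa456887')          AS plain       -- pa456887
     , format('%I', 'table_data_C')      AS mixed_case  -- "table_data_C"
     , format('%I', 'pa1; DROP TABLE t') AS hostile;    -- "pa1; DROP TABLE t"
```

Identifiers that need no quoting are passed through unchanged. Anything else is wrapped in double quotes and treated as a single (possibly nonexistent) table name instead of being executed as SQL.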
Data model
There may be good reasons for the data model. Like partitioning / sharding or separate privileges ...
If you don't have such a good reason, consider consolidating multiple tables with identical schema into one and add the number as column. Then you don't need dynamic SQL.
Consider inheritance. Then you can add a condition on tableoid to only retrieve rows from a given child table:
SELECT * FROM parent_table
WHERE tableoid = 'pa456887'::regclass
Be aware of limitations for inheritance, though. Related answers:
Get the name of a row's source table when querying the parent it inherits from
Select (retrieve) all records from multiple schemas using Postgres
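A minimal sketch of the inheritance variant. The parent table, its columns and the sample rows are hypothetical; only the child table names follow the pattern from the question:

```sql
-- Hypothetical parent with two children of identical structure
CREATE TEMP TABLE parent_table (cpa int, vym text);
CREATE TEMP TABLE pa456887 () INHERITS (parent_table);
CREATE TEMP TABLE pa456888 () INHERITS (parent_table);

INSERT INTO pa456887 VALUES (1, 'a');
INSERT INTO pa456888 VALUES (2, 'b');

-- Querying the parent sees rows of all children;
-- the tableoid condition keeps only the row stored in pa456887 (cpa = 1)
SELECT tableoid::regclass AS source_table, *
FROM   parent_table
WHERE  tableoid = 'pa456887'::regclass;
```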
Name of 2nd table depending on value in 1st table
Deriving the name of the join table from values in the first table dynamically complicates things.
For only a few tables
LEFT JOIN each on tableoid. There is only one match per row, so use COALESCE.
SELECT t.*, COALESCE(p1.vym, p2.vym, p3.vym) AS vym -- t.* already includes tbl
FROM (
SELECT cpa, ('pa' || substring(ku,'[0-9]+'))::regclass AS tbl
FROM public."table_data_C"
-- WHERE <some condition>
) t
LEFT JOIN pa456887 p1 ON p1.cpa = t.cpa AND p1.tableoid = t.tbl
LEFT JOIN pa456888 p2 ON p2.cpa = t.cpa AND p2.tableoid = t.tbl
LEFT JOIN pa456889 p3 ON p3.cpa = t.cpa AND p3.tableoid = t.tbl
For many tables
Combine a loop with dynamic queries:
CREATE OR REPLACE FUNCTION foo(_number int)
RETURNS TABLE (cpa int, nr text, vym text) AS
$func$
DECLARE
_nr text;
BEGIN
FOR _nr IN
SELECT DISTINCT substring(ku,'[0-9]+')
FROM public."table_data_C"
LOOP
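RETURN QUERY EXECUTE format(
'SELECT t.cpa, $1, p.vym
FROM public."table_data_C" t
LEFT JOIN %I p USING (cpa)
WHERE t.ku LIKE ($1 || ''%%'')' -- %% is a literal % in format(), quotes are doubled
, 'pa' || _nr
)
USING _nr; -- pass _nr as a value: plpgsql variables are not visible inside EXECUTE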
END LOOP;
END
$func$ LANGUAGE plpgsql;

How to return result of a SELECT inside a function in PostgreSQL?

I have this function in PostgreSQL, but I don't know how to return the result of the query:
CREATE OR REPLACE FUNCTION wordFrequency(maxTokens INTEGER)
RETURNS SETOF RECORD AS
$$
BEGIN
SELECT text, count(*), 100 / maxTokens * count(*)
FROM (
SELECT text
FROM token
WHERE chartype = 'ALPHABETIC'
LIMIT maxTokens
) as tokens
GROUP BY text
ORDER BY count DESC
END
$$
LANGUAGE plpgsql;
But I don't know how to return the result of the query inside the PostgreSQL function.
I found that the return type should be SETOF RECORD, right? But the return command is not right.
What is the right way to do this?
Use RETURN QUERY:
CREATE OR REPLACE FUNCTION word_frequency(_max_tokens int)
RETURNS TABLE (txt text -- also visible as OUT param in function body
, cnt bigint
, ratio bigint)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT t.txt
, count(*) AS cnt -- column alias only visible in this query
, (count(*) * 100) / _max_tokens -- I added parentheses
FROM (
SELECT t.txt
FROM token t
WHERE t.chartype = 'ALPHABETIC'
LIMIT _max_tokens
) t
GROUP BY t.txt
ORDER BY cnt DESC; -- potential ambiguity
END
$func$;
Call:
SELECT * FROM word_frequency(123);
Defining the return type explicitly is much more practical than returning a generic record. This way you don't have to provide a column definition list with every function call. RETURNS TABLE is one way to do that. There are others. Data types of OUT parameters have to match exactly what is returned by the query.
Choose names for OUT parameters carefully. They are visible in the function body almost anywhere. Table-qualify columns of the same name to avoid conflicts or unexpected results. I did that for all columns in my example.
But note the potential naming conflict between the OUT parameter cnt and the column alias of the same name. In this particular case (RETURN QUERY SELECT ...) Postgres uses the column alias over the OUT parameter either way. This can be ambiguous in other contexts, though. There are various ways to avoid any confusion:
Use the ordinal position of the item in the SELECT list: ORDER BY 2 DESC. Example:
Select first row in each GROUP BY group?
Repeat the expression ORDER BY count(*).
(Not required here.) Set the configuration parameter plpgsql.variable_conflict or use the special command #variable_conflict error | use_variable | use_column in the function. See:
Naming conflict between function parameter and result of JOIN with USING clause
Don't use "text" or "count" as column names. Both are legal to use in Postgres, but "count" is a reserved word in standard SQL and a basic function name and "text" is a basic data type. Can lead to confusing errors. I use txt and cnt in my examples, you may want more explicit names.
Added a missing ; and corrected a syntax error in the header. (_max_tokens int), not (int maxTokens) - data type after name.
While working with integer division, it's better to multiply first and divide later, to minimize the rounding error. Or work with numeric or a floating point type. See below.
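The effect of the order of operations is easy to check in isolation. The numbers (7 occurrences, 300 tokens) are made up for illustration:

```sql
SELECT 100 / 300 * 7             AS divide_first    -- 0: 100 / 300 is truncated to 0 first
     , (7 * 100) / 300           AS multiply_first  -- 2: only the final result is truncated
     , round(7 * 100.0 / 300, 2) AS as_numeric;     -- 2.33: no integer division at all
```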
Alternative
This is what I think your query should actually look like (calculating a relative share per token):
CREATE OR REPLACE FUNCTION word_frequency(_max_tokens int)
RETURNS TABLE (txt text
, abs_cnt bigint
, relative_share numeric)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT t.txt, t.cnt
, round((t.cnt * 100) / (sum(t.cnt) OVER ()), 2) -- AS relative_share
FROM (
SELECT t.txt, count(*) AS cnt
FROM token t
WHERE t.chartype = 'ALPHABETIC'
GROUP BY t.txt
ORDER BY cnt DESC
LIMIT _max_tokens
) t
ORDER BY t.cnt DESC;
END
$func$;
The expression sum(t.cnt) OVER () is a window function. You could use a CTE instead of the subquery. Pretty, but a subquery is typically cheaper in simple cases like this one (mostly before Postgres 12).
A final explicit RETURN statement is not required (but allowed) when working with OUT parameters or RETURNS TABLE (which makes implicit use of OUT parameters).
round() with two parameters only works for numeric types. count() in the subquery produces a bigint result and a sum() over this bigint produces a numeric result, thus we deal with a numeric number automatically and everything just falls into place.
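The core of that query can be tried without the function wrapper. This is a sketch under assumptions: the token table below is a hypothetical stand-in with a txt column and a handful of made-up rows.

```sql
-- Hypothetical stand-in for the token table
CREATE TEMP TABLE token (txt text, chartype text);
INSERT INTO token VALUES
  ('the', 'ALPHABETIC'), ('the', 'ALPHABETIC'), ('the', 'ALPHABETIC')
, ('cat', 'ALPHABETIC'), ('42', 'NUMERIC');

-- sum(t.cnt) OVER () is the total over all result rows (here 4),
-- so each row gets its share of that total: the -> 75.00, cat -> 25.00
SELECT t.txt, t.cnt
     , round((t.cnt * 100) / sum(t.cnt) OVER (), 2) AS relative_share
FROM  (
   SELECT txt, count(*) AS cnt
   FROM   token
   WHERE  chartype = 'ALPHABETIC'
   GROUP  BY txt
   ) t
ORDER  BY t.cnt DESC;
```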
Please see the following link for documentation:
https://www.postgresql.org/docs/current/xfunc-sql.html
Example:
CREATE FUNCTION sum_n_product_with_tab (x int)
RETURNS TABLE(sum int, product int) AS $$
SELECT $1 + tab.y, $1 * tab.y FROM tab;
$$ LANGUAGE SQL;