Setting variable in a Postgres function - sql

CREATE OR REPLACE FUNCTION "freeTicket" (eid integer NOT NULL)
DECLARE
couponCode text
BEGIN
INSERT INTO purchases p (cid, pdate, eid, ccode)
VALUES
(
SELECT p.cid, GETDATE(), $1, couponCode FROM purchase p
GROUP BY p.cid
HAVING COUNT(1) > 5
ORDER BY p.cid
);
END; LANGUAGE plpgsql;
I need to set the variable of couponCode to the output of:
Select code from couponCode where eid = $1 and percentage = 100;
And use it in the insert query above.
What is the best way to do this?

That would be SELECT <expressions> INTO <variables> FROM ..., but you can do it all in one statement:
INSERT INTO purchases (cid, pdate, eid, ccode)
SELECT p.cid,
current_date,
$1,
(SELECT code FROM couponcode
WHERE eid = $1 AND percentage = 100)
FROM purchase p
GROUP BY p.cid
HAVING COUNT(1) > 5;
ORDER BY makes no sense here.
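For completeness, the SELECT ... INTO form mentioned above could look roughly like this. It is only a sketch: the function name free_ticket_sketch and the variable name _coupon_code are made up for illustration, and the table and column names are taken from the question.

```sql
-- Sketch only: free_ticket_sketch and _coupon_code are hypothetical names.
CREATE OR REPLACE FUNCTION free_ticket_sketch(_eid integer)
  RETURNS void AS
$func$
DECLARE
   _coupon_code text;
BEGIN
   -- Assign the query result to the variable.
   -- If no row matches, _coupon_code stays NULL.
   SELECT code
   INTO   _coupon_code
   FROM   couponcode
   WHERE  eid = _eid
   AND    percentage = 100;

   -- Use the variable in the insert.
   INSERT INTO purchases (cid, pdate, eid, ccode)
   SELECT cid, current_date, _eid, _coupon_code
   FROM   purchase
   GROUP  BY cid
   HAVING count(*) > 5;
END
$func$ LANGUAGE plpgsql;
```

The single-statement version avoids the extra round through a variable, so the variable only pays off if the coupon code is needed again later in the function.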

Basics about assigning variables in PL/pgSQL:
Store query result in a variable using in PL/pgSQL
Apart from that, your function has a number of syntax errors and other problems. Starting with:
CREATE OR REPLACE FUNCTION "freeTicket" (eid integer NOT NULL)
DECLARE ...
NOT NULL isn't valid syntax here.
You must declare the return type somehow. If the function does not return anything, add RETURNS void.
For your own good, avoid CaMeL-case identifiers in Postgres. Use legal, lower-case identifiers exclusively if possible. See:
Are PostgreSQL column names case-sensitive?
The function would work like this:
CREATE OR REPLACE FUNCTION free_ticket(_eid integer, OUT _row_ct int) AS
$func$
DECLARE
coupon_code text; -- semicolon required
BEGIN
INSERT INTO purchases (cid, pdate, eid, ccode)
SELECT cid, now()::date, _eid
, (SELECT code FROM couponCode WHERE eid = _eid AND percentage = 100)
FROM purchase
GROUP BY cid
HAVING COUNT(*) > 5 -- count(*) is faster
ORDER BY cid; -- ORDER BY is *not* pointless.
GET DIAGNOSTICS _row_ct := ROW_COUNT;
END
$func$ LANGUAGE plpgsql;
The added OUT parameter _row_ct int is returned at the end of the function automatically. It obviates the need for an explicit RETURNS declaration.
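A call might then look like this, assuming the free_ticket function above has been created. The event id 1 is just an example value.

```sql
-- Runs the INSERT and returns the number of inserted rows
-- as a single column named _row_ct.
SELECT * FROM free_ticket(1);
```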
You also had a table alias in:
INSERT INTO purchases p (cid, pdate, eid, ccode)
But INSERT statements require the AS keyword for aliases to avoid ambiguity (unlike other DML statements). So: INSERT INTO purchases AS p .... But no need for an alias since there is no ambiguity in the statement.
Related:
Count the rows affected by plpgsql function
Asides: Two tables named purchase and purchases, that's bound to lead to confusion. And the second table might also be replaced with a VIEW or MATERIALIZED VIEW.

Related

Adjusting Month Specific SQL Query to Iterate Across all Months Greater than Base Month

I've inherited a query whose parameters restrict it to pulling data for a single desired month. The extract then gets manually added to the previous months' extracts in Excel. I'd like to eliminate the manual portion by adjusting the existing query to iterate across all months greater than a given base month, then (if this is what makes most sense) unioning the individual "final" outputs.
My attempt was to add the entire block of code for each specific month to the existing code, and then run it together. The idea was that I'd just paste in a new block each new month. I knew this was very inefficient, but I don't have the luxury of learning how to do it efficiently, so if it worked I'd be happy.
I ran into problems because the existing query has two subqueries which then are used to create a final table, and I couldn't figure out how to retain the final table at the end of the code so that it could be referenced in a union later (fwiw, I was attempting to use a Select Into for that final table).
with eligibility_and_customer_type AS
(SELECT DISTINCT ON(sub_id, mbr_sfx_id)
sub_id AS subscriber_id
, mbr_sfx_id AS member_suffix_id
, src_mbr_key
, ctdv.cstmr_typ_cd
, gdv.grp_name
FROM adw_common.cstmr_typ_dim_vw ctdv
JOIN adw_common.mbr_eligty_by_mo_fact_vw
ON ctdv.cstmr_typ_key = mbr_eligty_by_mo_fact_vw.cstmr_typ_key
AND mbr_eligty_yr = '2018'
AND mbr_eligty_mo = '12'
JOIN adw_common.prod_cat_dim_vw
ON prod_cat_dim_vw.prod_cat_key = mbr_eligty_by_mo_fact_vw.prod_cat_key
AND prod_cat_dim_vw.prod_cat_cd = 'M'
JOIN adw_common.mbr_dim_abr
ON mbr_eligty_by_mo_fact_vw.mbr_key = mbr_dim_abr.mbr_key
JOIN consumer.facets_xref_abr fxf
ON mbr_dim_abr.src_mbr_key = fxf.source_member_key
JOIN adw_common.grp_dim_vw gdv
ON gdv.grp_key=mbr_eligty_by_mo_fact_vw.grp_key),
facets_ip as
(select distinct cl.meme_ck
FROM gpgen_cr_ai.cmc_clcl_claim_abr cl
/* LEFT JOIN gpgen_cr_ai.cmc_clhp_hosp_abr ch
ON cl.clcl_id = ch.clcl_id*/
LEFT JOIN gpgen_cr_ai.cmc_cdml_cl_line cd
ON cl.clcl_id = cd.clcl_id
WHERE cd.pscd_id = '21'
/*AND ch.clcl_id IS NULL*/
AND cl.clcl_cur_sts NOT IN ('91','92')
AND cl.clcl_low_svc_dt >= '20181201'
and cl.clcl_low_svc_dt <= '20181231'
group by 1)
select distinct c.meme_ck,
e.cstmr_typ_cd,
'201812' as Yearmo
from facets_ip c
left join eligibility_and_customer_type e
on c.meme_ck = e.src_mbr_key;
The code above has date parameters that get updated when necessary.
The final output would be a version of the final table created above, but with results corresponding to, say, 201801 - present.
If you provide the DDL of the underlying tables, sample data for them, the expected result set, and the DBMS you are using, then one would be able to provide the best solution here.
Without knowing them, and as you said you only care about dynamically looping through each month, here is one way you can utilize your code to loop it through in SQL Server. Please fill in the @StartDate and @EndDate variable values and provide the proper data types for meme_ck and cstmr_typ_cd.
IF OBJECT_ID ('tempdb..#TempTable', N'U') IS NOT NULL
BEGIN
DROP TABLE #TempTable
END
CREATE TABLE #TempTable
(
meme_ck <ProvideProperDataTypeHere>
,cstmr_typ_cd <ProvideProperDataTypeHere>
,Yearmo VARCHAR(10)
)
DECLARE @StartDate DATE = '<Provide the first day of the start month>'
DECLARE @EndDate DATE = '<Provide the end date inclusive>'
WHILE @StartDate <= @EndDate
BEGIN
DECLARE @MonthEndDate DATE = CASE WHEN DATEADD(DAY, -1, DATEADD(MONTH, 1, @StartDate)) <= @EndDate THEN DATEADD(DAY, -1, DATEADD(MONTH, 1, @StartDate)) ELSE @EndDate END
DECLARE @MonthYear VARCHAR(6) = LEFT(CONVERT(VARCHAR(8), @StartDate, 112), 6)
--This is your code, which I am not touching without knowing any detail about it. Just feeding in the variables to make it dynamic
;with eligibility_and_customer_type AS
(SELECT DISTINCT ON(sub_id, mbr_sfx_id)
sub_id AS subscriber_id
, mbr_sfx_id AS member_suffix_id
, src_mbr_key
, ctdv.cstmr_typ_cd
, gdv.grp_name
FROM adw_common.cstmr_typ_dim_vw ctdv
JOIN adw_common.mbr_eligty_by_mo_fact_vw
ON ctdv.cstmr_typ_key = mbr_eligty_by_mo_fact_vw.cstmr_typ_key
AND mbr_eligty_yr = CAST(YEAR(@StartDate) AS VARCHAR(10)) -- NO need to cast if mbr_eligty_yr is an Integer
AND mbr_eligty_mo = CAST(MONTH(@StartDate) AS VARCHAR(10)) -- NO need to cast if mbr_eligty_mo is an Integer
JOIN adw_common.prod_cat_dim_vw
ON prod_cat_dim_vw.prod_cat_key = mbr_eligty_by_mo_fact_vw.prod_cat_key
AND prod_cat_dim_vw.prod_cat_cd = 'M'
JOIN adw_common.mbr_dim_abr
ON mbr_eligty_by_mo_fact_vw.mbr_key = mbr_dim_abr.mbr_key
JOIN consumer.facets_xref_abr fxf
ON mbr_dim_abr.src_mbr_key = fxf.source_member_key
JOIN adw_common.grp_dim_vw gdv
ON gdv.grp_key=mbr_eligty_by_mo_fact_vw.grp_key),
facets_ip as
(select distinct cl.meme_ck
FROM gpgen_cr_ai.cmc_clcl_claim_abr cl
/* LEFT JOIN gpgen_cr_ai.cmc_clhp_hosp_abr ch
ON cl.clcl_id = ch.clcl_id*/
LEFT JOIN gpgen_cr_ai.cmc_cdml_cl_line cd
ON cl.clcl_id = cd.clcl_id
WHERE cd.pscd_id = '21'
/*AND ch.clcl_id IS NULL*/
AND cl.clcl_cur_sts NOT IN ('91','92')
AND cl.clcl_low_svc_dt BETWEEN @StartDate AND @MonthEndDate
group by 1)
INSERT INTO #TempTable
(
meme_ck
,cstmr_typ_cd
,Yearmo
)
select distinct c.meme_ck,
e.cstmr_typ_cd,
@MonthYear as Yearmo
from facets_ip c
left join eligibility_and_customer_type e
on c.meme_ck = e.src_mbr_key;
SET @StartDate = DATEADD(MONTH, 1, @StartDate)
END
END
SELECT * FROM #TempTable;
I don't have enough information on your tables to really create an optimal solution. The solutions I am providing just have a single parameter (table name) and for your solution, you will need to pass in an additional parameter for the date filter.
The idea of "looping" is not something you'll need to do in Greenplum. That is common for OLTP databases like SQL Server or Oracle that can't handle big data very well and have to process smaller amounts at a time.
For these example solutions, a table is needed with some data in it.
CREATE TABLE public.foo
(id integer,
fname text,
lname text)
DISTRIBUTED BY (id);
insert into foo values (1, 'jon', 'roberts'),
(2, 'sam', 'roberts'),
(3, 'jon', 'smith'),
(4, 'sam', 'smith'),
(5, 'jon', 'roberts'),
(6, 'sam', 'roberts'),
(7, 'jon', 'smith'),
(8, 'sam', 'smith');
Solution 1: Learn how functions work in the database. Here is a quick example of how it would work.
Create a function that does the Create Table As Select (CTAS) where you pass in a parameter.
Note: Because the table name is passed in as a parameter, the statement cannot be written directly in the function body. You have to build it as a string and run it with "EXECUTE" instead.
create or replace function fn_test(p_table_name text) returns void as
$$
declare
v_sql text;
begin
v_sql :='drop table if exists ' || p_table_name;
execute v_sql;
v_sql := 'create table ' || p_table_name || ' with (appendonly=true, compresstype=quicklz) as
with t as (select * from foo)
select * from t
distributed by (id)';
execute v_sql;
end;
$$
language plpgsql;
Execute the function with a simple select statement.
select fn_test('foo3');
Notice how I pass in a table name that will be created when you execute the function.
Solution 2: Use psql variables
Create a SQL file named "test.sql" with the following contents.
drop table if exists :p_table_name;
create table :p_table_name with (appendonly=true, compresstype=quicklz) as
with t as (select * from foo)
select * from t
distributed by (id);
Next, you execute psql and pass in the variable p_table_name.
psql -f test.sql -v p_table_name=foo4
psql:test.sql:1: NOTICE: table "foo4" does not exist, skipping
DROP TABLE
SELECT 8
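To illustrate the earlier point that no loop is needed: a set-based query can derive the month label from the date column itself, so a single statement covers every month from the base month onward. This is only a sketch; demo_claims and its columns are hypothetical stand-ins for the real claims tables, and it assumes the service date is stored as a date.

```sql
-- Hypothetical stand-in for the claims data: one row per claim.
CREATE TEMP TABLE demo_claims (member_id integer, svc_dt date);

INSERT INTO demo_claims VALUES
  (1, '2018-01-15'),
  (1, '2018-01-20'),
  (2, '2018-02-03'),
  (3, '2017-12-31');   -- before the base month, filtered out below

-- One row per member and month, for all months >= the base month.
SELECT DISTINCT
       member_id,
       to_char(svc_dt, 'YYYYMM') AS yearmo
FROM   demo_claims
WHERE  svc_dt >= date '2018-01-01'
ORDER  BY yearmo, member_id;
```

The eligibility join from the original query would need the same treatment, joining on year and month columns instead of filtering on one fixed month.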

converting multiline Table valued function to inline in SQL

Because of performance issues, I want to turn my multi-statement table-valued function (TVF) into an inline TVF.
Here is the sample code for Multiline TVF:
CREATE FUNCTION MatchAptNumber (@AptNumberFromUser nvarchar(20))
RETURNS @MatchedData Table
(
RowNumber int null ,
PercentMatch int null
)
AS
Begin
Insert into @MatchedData(RowNumber) select dbo.Patients.Rowid from dbo.Patients where dbo.Patients.Aptnumber = @AptNumberFromUser
update @MatchedData set PercentMatch= 100
RETURN;
END;
Go
Here is how I use it:
select @constVal = FunctionWeight from dbo.FunctionWeights where FunctionWeights.FunctionName = 'MatchAptNumber';
INSERT INTO #Temp2(RowNumber, ValFromFunc, FuncWeight, percentage)
SELECT RowNumber, PercentMatch, @constVal, PercentMatch * @constVal
from dbo.MatchAptNumber(@Aptnumber);
Is it possible to convert it into an inline TVF and use it as shown above? I know the syntactic differences between the two, but I'm not sure whether it can be used the same way. Can I get some pointers?
You can get the '100' as a constant in the SELECT so the function becomes;
CREATE FUNCTION MatchAptNumber (@AptNumberFromUser nvarchar(20))
RETURNS TABLE AS RETURN
SELECT
p.Rowid AS RowNumber ,
CAST(100 AS INT) AS PercentMatch
FROM
dbo.Patients p
WHERE
p.Aptnumber = @AptNumberFromUser
GO
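The calling code should not need to change, since an inline TVF is queried like any other table source. A sketch of the call from the question follows; the decimal type for @constVal and the sample apartment number are assumptions, and #Temp2 is assumed to exist already.

```sql
-- Assumption: FunctionWeight is numeric; adjust the type to match dbo.FunctionWeights.
DECLARE @constVal decimal(9, 2);
DECLARE @Aptnumber nvarchar(20) = N'12B';  -- hypothetical sample input

SELECT @constVal = FunctionWeight
FROM   dbo.FunctionWeights
WHERE  FunctionName = 'MatchAptNumber';

-- Same call shape as with the multi-statement version.
INSERT INTO #Temp2 (RowNumber, ValFromFunc, FuncWeight, percentage)
SELECT m.RowNumber, m.PercentMatch, @constVal, m.PercentMatch * @constVal
FROM   dbo.MatchAptNumber(@Aptnumber) AS m;
```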

Compare two columns and make the insert

I want to compare two columns which come from two different tables.
For one of the columns, I need to SUM all rows with a given identity, let's say 3, and store the result in a variable.
After that, I want to compare it with one row from the other table for the same identity 3, and INSERT something if first_column <= second_column, or else BREAK.
Can someone suggest some query for this? For Postgresql...
CREATE OR REPLACE FUNCTION "SA_PRJ".usp_add_timesheet_test(p_uid integer, p_project_id integer, p_allocated_time numeric, p_achieved_time numeric, p_task_desc character varying, p_obs character varying, p_date timestamp without time zone)
RETURNS character varying AS
$BODY$
DECLARE sum_alloc_time numeric;
DECLARE alloc_hours integer;
DECLARE fld_id integer;
DECLARE alloc_id integer;
BEGIN
if not "SA_ADM".usp_check_permission(p_uid, 'SA_PRJ', 'usp_add_timesheet_record') then
raise exception 'User ID % dont have permission!', p_uid;
end if;
select a.fld_id into alloc_id from "SD_PRJ".tbl_project_allocation a where a.fld_emp_id = p_uid and a.fld_project_id = p_project_id;
SELECT SUM(fld_allocated_time)
INTO sum_alloc_time
FROM "SD_PRJ".tbl_project_timesheet
WHERE fld_project_id = p_project_id;
SELECT p.fld_allocated_days, p.fld_id
INTO alloc_hours, fld_id
FROM "SD_PRJ".tbl_project p
JOIN "SD_PRJ".tbl_project_timesheet t USING (fld_id)
WHERE t.fld_project_id = p_project_id;
IF @sum_alloc_time <= @alloc_hours THEN
INSERT INTO "SD_PRJ".tbl_project_timesheet
(fld_emp_id, fld_project_id, fld_is_allocated, fld_allocated_time
, fld_achieved_time, fld_task_desc, fld_obs, fld_date)
VALUES (p_uid, p_project_id, coalesce(alloc_id,0), p_allocated_time
, p_achieved_time, p_task_desc, p_obs, p_date);
RAISE NOTICE 'INSERT OK!';
ELSE
RAISE NOTICE 'NOT OK';
END IF;
END
1. tbl_project (fld_id, fld_allocated_days, fld_project_id)
2. tbl_project_timesheet (fld_id, fld_allocated_time, fld_project_id), all INTEGER
I have this, but it doesn't work as I wish. Thanks.
I think one problem is here:
SELECT p.fld_allocated_days, p.fld_id
INTO alloc_hours, fld_id
FROM "SD_PRJ".tbl_project p
JOIN "SD_PRJ".tbl_project_timesheet t USING (fld_id)
WHERE t.fld_project_id = p_project_id;
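That is unreliable whenever the select query returns more than one row, i.e. whenever tbl_project_timesheet has more than one record for a fld_id, project_id combination. PL/pgSQL does not raise an error in that case unless you write INTO STRICT; it silently keeps the first row returned and discards the rest, so the values you get are arbitrary.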
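A minimal sketch of how plain INTO and INTO STRICT differ on a multi-row result. It uses an inline VALUES list instead of the real tables and a DO block, which assumes PostgreSQL 9.0 or later.

```sql
DO $$
DECLARE
   v integer;
BEGIN
   -- Plain INTO: the query returns two rows; one is kept, no error is raised.
   SELECT x INTO v FROM (VALUES (1), (2)) AS t(x);
   RAISE NOTICE 'plain INTO kept %', v;

   -- INTO STRICT: the same query now raises TOO_MANY_ROWS.
   SELECT x INTO STRICT v FROM (VALUES (1), (2)) AS t(x);
EXCEPTION
   WHEN too_many_rows THEN
      RAISE NOTICE 'INTO STRICT rejected the multi-row result';
END
$$;
```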
Anyway. Here's a partial, simplified answer, but hopefully you get the idea...
I wouldn't use local variables. Do the insert in one step:
INSERT INTO timesheet(emp_id,project_id) -- other columns
SELECT
p_uid,p.fld_project_id -- other columns
FROM
projects p
INNER JOIN
(SELECT SUM(fld_allocated_time) as sumtime
FROM timesheet t WHERE fld_project_id = p_project_id) as sumtime_subquery
ON sumtime <= p.fld_allocated_days -- insert only while the summed time is within the allocation
WHERE p.fld_project_id = p_project_id;
Now, you need to know if anything was actually inserted. I think you can use the RETURNING option of the INSERT statement, e.g. from here (caveat - I have never used RETURNING, nor set a local variable from a with statement):
WITH ROWS AS (
INSERT INTO timesheet(emp_id,project_id) -- other columns
SELECT
p_uid,p.fld_project_id -- other columns
FROM
projects p
INNER JOIN
(SELECT SUM(fld_allocated_time) as sumtime
FROM timesheet t WHERE fld_project_id = p_project_id) as sumtime_subquery
ON sumtime <= p.fld_allocated_days -- insert only while the summed time is within the allocation
WHERE p.fld_project_id = p_project_id
RETURNING 1
)
SELECT COUNT(*) into l_updatedCount FROM rows; -- you have to declare l_updatedCount
-- Now an if statement to handle l_updatedCount
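An alternative that avoids the CTE is to read the row count of the last statement with GET DIAGNOSTICS. The sketch below assumes PostgreSQL 9.0 or later for the DO block; demo_timesheet and the literal values are hypothetical and only stand in for the real tables and parameters.

```sql
DO $$
DECLARE
   l_updated_count integer;
BEGIN
   -- Hypothetical scratch table so the sketch is self-contained.
   CREATE TEMP TABLE demo_timesheet (emp_id integer, project_id integer);

   -- Conditional insert: the WHERE clause plays the role of the
   -- "summed time <= allocated time" check (10 and 20 are sample values).
   INSERT INTO demo_timesheet (emp_id, project_id)
   SELECT 1, 42
   WHERE  10 <= 20;

   -- Number of rows affected by the statement just executed.
   GET DIAGNOSTICS l_updated_count = ROW_COUNT;

   IF l_updated_count > 0 THEN
      RAISE NOTICE 'INSERT OK!';
   ELSE
      RAISE NOTICE 'NOT OK';
   END IF;
END
$$;
```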

plpgsql how to just execute query in a function or procedure

I am beginning to learn stored procedures and functions in sql in a postgres database.
I need an example to get me going for what I am trying to accomplish.
I need to run a procedure and have it return results. For example something like this:
run_query(name):
begin
return select * from employees where first_name = $name
end
end
I want something like the above to return the result set when I run it. Is this possible? Thank you for your help in advance!
Here is the function I'm trying to create:
CREATE OR REPLACE FUNCTION test() RETURNS TABLE(id INT, subdomain varchar, launched_on_xxx timestamp, UVs bigint, PVs bigint) AS
'SELECT dblink_connect(''other_DB'');
SELECT c.id as id, c.subdomain, c.launched_on_xxx, COALESCE(SUM(tbd.new_unique_visitors), 0) AS UVs, COALESCE(SUM(tbd.page_views), 0) AS PVs
FROM dblink(''SELECT id, subdomain, launched_on_xxx FROM communities'')
AS c(id int, subdomain character varying, launched_on_xxx timestamp)
LEFT OUTER JOIN days_of_center tbd
ON c.id = tbd.community_id
WHERE c.launched_on_xxx < now()
GROUP BY c.id, c.subdomain, c.launched_on_xxx;
SELECT dblink_disconnect();'
LANGUAGE SQL;
Your function could look like this:
CREATE OR REPLACE FUNCTION test()
RETURNS TABLE(id int, subdomain varchar, launched_on_xxx timestamp
,uvs bigint, pvs bigint) AS
$func$
SELECT dblink_connect('other_DB');
SELECT c.id
,c.subdomain
,c.launched_on_xxx
,COALESCE(SUM(tbd.new_unique_visitors), 0) AS uvs
,COALESCE(SUM(tbd.page_views), 0) AS pvs
FROM dblink('
SELECT id, subdomain, launched_on_xxx
FROM communities
WHERE launched_on_xxx < now()')
AS c(id int, subdomain varchar, launched_on_xxx timestamp)
LEFT JOIN days_of_center tbd ON tbd.community_id = c.id
GROUP BY c.id, c.subdomain, c.launched_on_xxx;
SELECT dblink_disconnect();
$func$ LANGUAGE SQL;
Pull the WHERE clause down into the dblink function. It's much more effective not to fetch rows to begin with - instead of fetching them from the external database and then discarding them.
Use dollar-quoting to avoid confusion with quoting. That has become standard procedure with bigger function definitions.
To output it in "table format", call a function returning multiple columns like this:
SELECT * FROM test();
Just about the simplest possible example would be this;
CREATE FUNCTION test() RETURNS TABLE(num INT) AS
'SELECT id FROM table1'
LANGUAGE SQL;
SELECT * FROM test()
An SQLfiddle to test with.
If you need a parameter, here's another example;
CREATE FUNCTION test(sel INT) RETURNS TABLE(val VARCHAR) AS
'SELECT value FROM table1 WHERE id=sel'
LANGUAGE SQL;
SELECT * FROM test(2)
Another SQLfiddle to test with.
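Since the question title mentions plpgsql, the run_query pseudocode from the question could also be written as a PL/pgSQL function with RETURN QUERY. This is a sketch that assumes the employees table from the question exists; the parameter name _name and the sample value are made up.

```sql
CREATE OR REPLACE FUNCTION run_query(_name text)
  RETURNS SETOF employees AS
$func$
BEGIN
   -- RETURN QUERY appends the rows of the query to the function's result set.
   RETURN QUERY
   SELECT *
   FROM   employees
   WHERE  first_name = _name;
END
$func$ LANGUAGE plpgsql;

-- Call it like a table ('Alice' is a sample value).
SELECT * FROM run_query('Alice');
```

For a single query like this, the plain LANGUAGE SQL versions above are simpler. PL/pgSQL is mainly worth it once variables or control flow are needed.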

SQL UDF Group By Parameter Issue

I'm having some issues with a group by clause in SQL. I have the following basic function:
CREATE FUNCTION dbo.fn_GetWinsYear (@Year int)
RETURNS int
AS
BEGIN
declare @W int
select @W = count(1)
from tblGames
where WinLossForfeit = 'W' and datepart(yyyy,Date) = @Year
return @W
END
I'm trying to run the following basic query:
select dbo.fn_GetWinsYear(datepart(yyyy,date))
from tblGames
group by datepart(yyyy,date)
However, I'm encountering the following error message: Column 'tblGames.Date' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any ideas why this is occurring? FYI, I know I can remove the function and combine into one call but I'd like to keep the function in place if possible.
I think you should be calling your function like this.
select dbo.fn_GetWinsYear(datepart(yyyy,getdate()))
OR
select dbo.fn_GetWinsYear('2010')
Essentially you are just passing a year to your function and the function is returning the number of wins for that year.
If you don't know the year, your function could look something like this...
CREATE FUNCTION dbo.fn_GetWinsYear ()
RETURNS @tblResults TABLE
( W INT, Y INT )
AS
BEGIN
INSERT @tblResults
SELECT count(1), datepart(yyyy,[Date])
FROM tblGames
WHERE WinLossForfeit = 'W'
GROUP BY datepart(yyyy,[Date])
RETURN
END
SELECT * FROM dbo.fn_GetWinsYear()
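If the original scalar function taking a year parameter should stay in place, one possible workaround is to compute the distinct years in a derived table first and call the function in the outer query, where no GROUP BY is involved. A sketch, with Yr and Wins as made-up column aliases:

```sql
-- The derived table yields one row per year; the scalar function
-- is then called on a plain column of that derived table.
SELECT y.Yr,
       dbo.fn_GetWinsYear(y.Yr) AS Wins
FROM  (SELECT DISTINCT DATEPART(yyyy, [Date]) AS Yr
       FROM   tblGames) AS y
ORDER  BY y.Yr;
```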