I've inherited a query whose parameters make it pull data for a single desired month. The extract then gets manually appended to the previous months' extracts in Excel. I'd like to eliminate the manual portion by adjusting the existing query to iterate across all months greater than a given base month, then (if this is what makes most sense) unioning the individual "final" outputs.
My attempt was to add the entire block of code for each specific month to the existing code, and then run it together. The idea was that I'd just paste in a new block each new month. I knew this was very inefficient, but I don't have the luxury of learning how to do it efficiently, so if it worked I'd be happy.
I ran into problems because the existing query has two subqueries which then are used to create a final table, and I couldn't figure out how to retain the final table at the end of the code so that it could be referenced in a union later (fwiw, I was attempting to use a Select Into for that final table).
with eligibility_and_customer_type AS
(SELECT DISTINCT ON(sub_id, mbr_sfx_id)
sub_id AS subscriber_id
, mbr_sfx_id AS member_suffix_id
, src_mbr_key
, ctdv.cstmr_typ_cd
, gdv.grp_name
FROM adw_common.cstmr_typ_dim_vw ctdv
JOIN adw_common.mbr_eligty_by_mo_fact_vw
ON ctdv.cstmr_typ_key = mbr_eligty_by_mo_fact_vw.cstmr_typ_key
AND mbr_eligty_yr = '2018'
AND mbr_eligty_mo = '12'
JOIN adw_common.prod_cat_dim_vw
ON prod_cat_dim_vw.prod_cat_key = mbr_eligty_by_mo_fact_vw.prod_cat_key
AND prod_cat_dim_vw.prod_cat_cd = 'M'
JOIN adw_common.mbr_dim_abr
ON mbr_eligty_by_mo_fact_vw.mbr_key = mbr_dim_abr.mbr_key
JOIN consumer.facets_xref_abr fxf
ON mbr_dim_abr.src_mbr_key = fxf.source_member_key
JOIN adw_common.grp_dim_vw gdv
ON gdv.grp_key=mbr_eligty_by_mo_fact_vw.grp_key),
facets_ip as
(select distinct cl.meme_ck
FROM gpgen_cr_ai.cmc_clcl_claim_abr cl
/* LEFT JOIN gpgen_cr_ai.cmc_clhp_hosp_abr ch
ON cl.clcl_id = ch.clcl_id*/
LEFT JOIN gpgen_cr_ai.cmc_cdml_cl_line cd
ON cl.clcl_id = cd.clcl_id
WHERE cd.pscd_id = '21'
/*AND ch.clcl_id IS NULL*/
AND cl.clcl_cur_sts NOT IN ('91','92')
AND cl.clcl_low_svc_dt >= '20181201'
and cl.clcl_low_svc_dt <= '20181231'
group by 1)
select distinct c.meme_ck,
e.cstmr_typ_cd,
'201812' as Yearmo
from facets_ip c
left join eligibility_and_customer_type e
on c.meme_ck = e.src_mbr_key;
The code above has date parameters that get updated when necessary.
The final output would be a version of the final table created above, but with results corresponding to, say, 201801 - present.
If you provide the following, it will be possible to suggest the best solution here:
DDL of the underlying tables
Sample data of the underlying tables
Expected result set
DBMS you are using
Without knowing them, and since you said you only care about dynamically looping through each month, here is one way to adapt your code so that it loops in SQL Server. Please fill in the @StartDate and @EndDate values and provide the proper data types for meme_ck and cstmr_typ_cd.
IF OBJECT_ID ('tempdb..#TempTable', N'U') IS NOT NULL
BEGIN
DROP TABLE #TempTable
END
CREATE TABLE #TempTable
(
meme_ck <ProvideProperDataTypeHere>
,cstmr_typ_cd <ProvideProperDataTypeHere>
,Yearmo VARCHAR(10)
)
DECLARE @StartDate DATE = '<Provide the first day of the start month>'
DECLARE @EndDate DATE = '<Provide the end date inclusive>'
WHILE @StartDate <= @EndDate
BEGIN
DECLARE @MonthEndDate DATE = CASE WHEN DATEADD(DAY, -1, DATEADD(MONTH, 1, @StartDate)) <= @EndDate THEN DATEADD(DAY, -1, DATEADD(MONTH, 1, @StartDate)) ELSE @EndDate END
DECLARE @MonthYear VARCHAR(6) = LEFT(CONVERT(VARCHAR(8), @StartDate, 112), 6)
--This is your code, which I am not touching since I don't know any details about it. I am just feeding in the variables to make it dynamic
;with eligibility_and_customer_type AS
(SELECT DISTINCT ON(sub_id, mbr_sfx_id)
sub_id AS subscriber_id
, mbr_sfx_id AS member_suffix_id
, src_mbr_key
, ctdv.cstmr_typ_cd
, gdv.grp_name
FROM adw_common.cstmr_typ_dim_vw ctdv
JOIN adw_common.mbr_eligty_by_mo_fact_vw
ON ctdv.cstmr_typ_key = mbr_eligty_by_mo_fact_vw.cstmr_typ_key
AND mbr_eligty_yr = CAST(YEAR(@StartDate) AS VARCHAR(10)) -- No need to cast if mbr_eligty_yr is an integer
AND mbr_eligty_mo = CAST(MONTH(@StartDate) AS VARCHAR(10)) -- No need to cast if mbr_eligty_mo is an integer; if months are stored zero-padded (e.g. '01'), pad the value accordingly
JOIN adw_common.prod_cat_dim_vw
ON prod_cat_dim_vw.prod_cat_key = mbr_eligty_by_mo_fact_vw.prod_cat_key
AND prod_cat_dim_vw.prod_cat_cd = 'M'
JOIN adw_common.mbr_dim_abr
ON mbr_eligty_by_mo_fact_vw.mbr_key = mbr_dim_abr.mbr_key
JOIN consumer.facets_xref_abr fxf
ON mbr_dim_abr.src_mbr_key = fxf.source_member_key
JOIN adw_common.grp_dim_vw gdv
ON gdv.grp_key=mbr_eligty_by_mo_fact_vw.grp_key),
facets_ip as
(select distinct cl.meme_ck
FROM gpgen_cr_ai.cmc_clcl_claim_abr cl
/* LEFT JOIN gpgen_cr_ai.cmc_clhp_hosp_abr ch
ON cl.clcl_id = ch.clcl_id*/
LEFT JOIN gpgen_cr_ai.cmc_cdml_cl_line cd
ON cl.clcl_id = cd.clcl_id
WHERE cd.pscd_id = '21'
/*AND ch.clcl_id IS NULL*/
AND cl.clcl_cur_sts NOT IN ('91','92')
AND cl.clcl_low_svc_dt BETWEEN @StartDate AND @MonthEndDate
group by 1)
INSERT INTO #TempTable
(
meme_ck
,cstmr_typ_cd
,Yearmo
)
select distinct c.meme_ck,
e.cstmr_typ_cd,
@MonthYear as Yearmo
from facets_ip c
left join eligibility_and_customer_type e
on c.meme_ck = e.src_mbr_key;
SET @StartDate = DATEADD(MONTH, 1, @StartDate)
END
END
SELECT * FROM #TempTable;
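To make the month arithmetic in the loop above easier to check, here is a minimal Python sketch of the same idea: walk from the first day of the start month to an inclusive end date and produce, for each month, the Yearmo label plus the first and last day to filter on. The function name month_ranges and the sample dates are hypothetical and chosen only for illustration. As in the loop, the last range is capped at the end date.

```python
from datetime import date, timedelta

def month_ranges(start, end):
    """Yield (yearmo, first_day, last_day) for each month from start's month through end.

    The last range is capped at `end`, mirroring the CASE expression in the loop.
    """
    year, month = start.year, start.month
    while date(year, month, 1) <= end:
        first = date(year, month, 1)
        # First day of the following month; one day earlier is this month's last day.
        next_first = date(year + (month == 12), month % 12 + 1, 1)
        last = min(next_first - timedelta(days=1), end)
        yield f"{year}{month:02d}", first, last
        year, month = next_first.year, next_first.month

# Hypothetical sample range, for illustration only.
ranges = list(month_ranges(date(2018, 11, 1), date(2019, 2, 15)))
for yearmo, first, last in ranges:
    print(yearmo, first.strftime("%Y%m%d"), last.strftime("%Y%m%d"))
```

Each tuple corresponds to one pass of the loop: the label feeds the Yearmo column and the two dates bound clcl_low_svc_dt.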
I don't have enough information on your tables to really create an optimal solution. The solutions I am providing just have a single parameter (table name) and for your solution, you will need to pass in an additional parameter for the date filter.
The idea of "looping" is not something you'll need to do in Greenplum. That is common for OLTP databases like SQL Server or Oracle that can't handle big data very well and have to process smaller amounts at a time.
For these example solutions, a table is needed with some data in it.
CREATE TABLE public.foo
(id integer,
fname text,
lname text)
DISTRIBUTED BY (id);
insert into foo values (1, 'jon', 'roberts'),
(2, 'sam', 'roberts'),
(3, 'jon', 'smith'),
(4, 'sam', 'smith'),
(5, 'jon', 'roberts'),
(6, 'sam', 'roberts'),
(7, 'jon', 'smith'),
(8, 'sam', 'smith');
Solution 1: Learn how functions work in the database. Here is a quick example of how it would work.
Create a function that does the Create Table As Select (CTAS) where you pass in a parameter.
Note: Because the table name is passed in as a parameter, the DDL statements have to be built as strings and run with "EXECUTE" (dynamic SQL) rather than written directly in the function body.
create or replace function fn_test(p_table_name text) returns void as
$$
declare
v_sql text;
begin
v_sql :='drop table if exists ' || p_table_name;
execute v_sql;
v_sql := 'create table ' || p_table_name || ' with (appendonly=true, compresstype=quicklz) as
with t as (select * from foo)
select * from t
distributed by (id)';
execute v_sql;
end;
$$
language plpgsql;
Execute the function with a simple select statement.
select fn_test('foo3');
Notice how I pass in a table name that will be created when you execute the function.
Solution 2: Use psql variables
Create a SQL file named "test.sql" with the following contents.
drop table if exists :p_table_name;
create table :p_table_name with (appendonly=true, compresstype=quicklz) as
with t as (select * from foo)
select * from t
distributed by (id);
Next, you execute psql and pass in the variable p_table_name.
psql -f test.sql -v p_table_name=foo4
psql:test.sql:1: NOTICE: table "foo4" does not exist, skipping
DROP TABLE
SELECT 8
Problem Illustration
I am trying to find that magical query to generate summary information. I have mapped my problem onto a fictitious illustration. I have a 'WaterLeakage%' table which records the leakage that occurred in hotel rooms over several years.
I have another table which records WaterConsumption in liters for each room per day.
Now I have to find the actual water leakage in liters for a given room number over a given date range.
Basically, I have to match several rows in the 'WaterLeakage%' table to several rows in the 'WaterConsumption' table. I am trying to figure out an efficient query to do this but have been unable to find one. Please help.
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
DECLARE @ROOM_NUMBER INT = 101;
-- Temp tables live in tempdb, so check for them with OBJECT_ID rather than INFORMATION_SCHEMA.TABLES
IF OBJECT_ID('tempdb..#WATER_CONSUMPTION') IS NOT NULL
DROP TABLE #WATER_CONSUMPTION;
IF OBJECT_ID('tempdb..#WATER_LEAKAGE_PER') IS NOT NULL
DROP TABLE #WATER_LEAKAGE_PER;
--Table for daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/01','2017/01/02',5),
(102,'2017/01/01','2017/01/05',10),
(101,'2017/01/04','2017/02/06',10);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'2017/01/01',100),
(101,'2017/01/02',100),
(101,'2017/01/03',100),
(101,'2017/01/04',100),
(101,'2017/01/05',100),
(101,'2017/01/06',100),
(102,'2017/01/01',100),
(102,'2017/01/02',100),
(102,'2017/01/03',100),
(102,'2017/01/04',100),
(102,'2017/01/05',100);
DECLARE @TotalLeak REAL = 0;
SELECT * FROM #WATER_CONSUMPTION;
SELECT * FROM #WATER_LEAKAGE_PER;
SELECT * FROM #WATER_CONSUMPTION T1 JOIN (SELECT * FROM #WATER_LEAKAGE_PER WHERE ROOM_NUMBER=@ROOM_NUMBER) T2
ON (T1.ROOM_NUMBER=T2.ROOM_NUMBER AND T1.UDAY >= T2.START_DATE AND T1.UDAY <= T2.END_DATE);
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
I am very close to a solution now. Basically, I changed my thinking: I will join in the reverse direction.
BEGIN
--Input Parameters for calculating water wastage between date range
DECLARE @START_DATE_PARAM DATE = '01/10/2017';
DECLARE @END_DATE_PARAM DATE = '01/31/2017';
--Table for daily water consumption per room
CREATE TABLE #WATER_CONSUMPTION(
ROOM_NUMBER INT,
UDAY DATE,
WATER_CONSUMPTION_LITER INT
)
--Table for water leakage percent per room for date range
CREATE TABLE #WATER_LEAKAGE_PER
(
ROOM_NUMBER INT,
START_DATE DATE,
END_DATE DATE,
WATER_LEAKAGE_PERCENT INT,
LEAKAGE_PER_DAY_IN_LITER INT
)
-- Leakage in liters per room for each day. This will have multiple entries for a room and date if that room and date fall within multiple date ranges, e.g. in the #WATER_LEAKAGE_PER table room number 101 has multiple entries with overlapping dates
CREATE TABLE #DAY_WISE_LEAKAGE
(
ROOM_NUMBER INT,
LDATE DATE,
LEAKAGE_IN_LITER INT
)
-- Raw Data
INSERT INTO #WATER_LEAKAGE_PER(ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT)
VALUES(101,'2017/01/15','2017/01/18',30),
(102,'2017/01/15','2017/01/18',10),
(101,'2017/01/15','2017/02/13',5);
-- Raw Data
INSERT INTO #WATER_CONSUMPTION
VALUES(101,'01/01/2017',1001),
(101,'01/02/2017',1001),
(101,'01/03/2017',1001),
(101,'01/04/2017',1001),
(101,'01/05/2017',1001),
(101,'01/06/2017',1001),
(101,'01/07/2017',1001),
(101,'01/08/2017',1001),
(101,'01/09/2017',1001),
(101,'01/10/2017',1001),
(101,'01/11/2017',1001),
(101,'01/12/2017',1001),
(101,'01/13/2017',1001),
(101,'01/14/2017',1001),
(101,'01/15/2017',1001),
(101,'01/16/2017',1001),
(101,'01/17/2017',1001),
(101,'01/18/2017',1001),
(101,'01/19/2017',1001),
(101,'01/20/2017',1001),
(101,'01/21/2017',1001),
(101,'01/22/2017',1001),
(101,'01/23/2017',1001),
(101,'01/24/2017',1001),
(101,'01/25/2017',1001),
(101,'01/26/2017',1001),
(101,'01/27/2017',1001),
(101,'01/28/2017',1001),
(101,'01/29/2017',1001),
(101,'01/30/2017',1001),
(101,'01/31/2017',1001);
DECLARE @ROOM_NUMBER INT
DECLARE @START_DATE DATE
DECLARE @END_DATE DATE
DECLARE @WATER_LEAKAGE_PERCENT INT
-- cursor for calculating water wastage per day for each date range available in the #WATER_LEAKAGE_PER table
DECLARE WATER_LEAKAGE_PER_CURSOR CURSOR FOR
SELECT ROOM_NUMBER,START_DATE,END_DATE,WATER_LEAKAGE_PERCENT FROM #WATER_LEAKAGE_PER
OPEN WATER_LEAKAGE_PER_CURSOR
FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
INTO @ROOM_NUMBER, @START_DATE ,@END_DATE, @WATER_LEAKAGE_PERCENT
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @TOTAL_WATER_USED_FOR_DATE_RANGE INT=0;
DECLARE @NUMBER_OF_DAYS INT=0;
DECLARE @LEAKAGE_PER_DAY_IN_LITER INT=0;
-- Total liters of water used for one date range
SELECT @TOTAL_WATER_USED_FOR_DATE_RANGE =SUM(WATER_CONSUMPTION_LITER),@NUMBER_OF_DAYS=COUNT(1) FROM #WATER_CONSUMPTION WHERE ROOM_NUMBER=@ROOM_NUMBER AND UDAY BETWEEN @START_DATE AND @END_DATE;
-- Liters of water leakage per day for the date range selected in the cursor
-- NULLIF avoids a divide-by-zero when a room has no consumption rows in the range
SELECT @LEAKAGE_PER_DAY_IN_LITER=((@TOTAL_WATER_USED_FOR_DATE_RANGE*@WATER_LEAKAGE_PERCENT)/100)/NULLIF(@NUMBER_OF_DAYS,0);
UPDATE #WATER_LEAKAGE_PER SET LEAKAGE_PER_DAY_IN_LITER = @LEAKAGE_PER_DAY_IN_LITER WHERE ROOM_NUMBER=@ROOM_NUMBER AND START_DATE = @START_DATE AND END_DATE=@END_DATE AND WATER_LEAKAGE_PERCENT=@WATER_LEAKAGE_PERCENT;
-- generate dates and water leakage, this will be used for actual calculation of water leakage in date range.
;WITH n AS
(
SELECT TOP (DATEDIFF(DAY, @START_DATE, @END_DATE) + 1)
n = ROW_NUMBER() OVER (ORDER BY [object_id])
FROM sys.all_objects
)
INSERT INTO #DAY_WISE_LEAKAGE SELECT @ROOM_NUMBER, DATEADD(DAY, n-1, @START_DATE),@LEAKAGE_PER_DAY_IN_LITER
FROM n;
FETCH NEXT FROM WATER_LEAKAGE_PER_CURSOR
INTO @ROOM_NUMBER, @START_DATE ,@END_DATE, @WATER_LEAKAGE_PERCENT
END
CLOSE WATER_LEAKAGE_PER_CURSOR;
DEALLOCATE WATER_LEAKAGE_PER_CURSOR;
-- Total liters of water leakage per room number within the requested date range.
SELECT ROOM_NUMBER,SUM(LEAKAGE_IN_LITER) FROM #DAY_WISE_LEAKAGE WHERE LDATE BETWEEN @START_DATE_PARAM AND @END_DATE_PARAM GROUP BY ROOM_NUMBER;
DROP TABLE #WATER_CONSUMPTION;
DROP TABLE #WATER_LEAKAGE_PER;
DROP TABLE #DAY_WISE_LEAKAGE
END
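As a cross-check on the numbers, the calculation can also be sketched without a cursor. The Python below uses the sample rows from the question and makes one assumption that differs from the cursor version: a day's leakage is taken as that day's consumption multiplied by the percentage of every leakage range covering the day, rather than an average spread over the range. The function name leaked_liters is hypothetical.

```python
from datetime import date

# (room, day, liters): the sample consumption rows from the question
consumption = (
    [(101, date(2017, 1, d), 100) for d in range(1, 7)]
    + [(102, date(2017, 1, d), 100) for d in range(1, 6)]
)

# (room, start, end, percent): the sample leakage ranges from the question
leakage = [
    (101, date(2017, 1, 1), date(2017, 1, 2), 5),
    (102, date(2017, 1, 1), date(2017, 1, 5), 10),
    (101, date(2017, 1, 4), date(2017, 2, 6), 10),
]

def leaked_liters(room, start, end):
    """Sum consumption * percent / 100 over every (day, covering range) pair.

    Assumption: leakage on a day is that day's consumption times the
    percentage of each leakage range that contains the day.
    """
    total = 0.0
    for c_room, day, liters in consumption:
        if c_room != room or not (start <= day <= end):
            continue
        for l_room, l_start, l_end, percent in leakage:
            if l_room == room and l_start <= day <= l_end:
                total += liters * percent / 100
    return total

print(leaked_liters(101, date(2017, 1, 1), date(2017, 1, 31)))  # 40.0
print(leaked_liters(102, date(2017, 1, 1), date(2017, 1, 31)))  # 50.0
```

For room 101, days 1 and 2 contribute 5 liters each, day 3 falls in no range, and days 4 to 6 contribute 10 liters each, giving 40. The nested loops correspond to a join on room number and date range followed by a grouped sum.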
I have a table in which a string column tells me whether I should delete the row. However, this string needs some parsing to work out whether to delete or not.
What is the string? It describes the recurrence of meetings, e.g. every day starting 21st March for 10 meetings.
My table is a single column called recurrence:
Recurrence
-------------------------------
daily;1;21/03/2015;times;10
daily;1;01/02/2016;times;8
monthly;1;01/01/2016;times;2
weekly;1;21/01/2016;times;4
What to do: if the meetings are finished then remove the row.
The string is of the following format
<frequency tag>;<frequency number>;<start date>;times;<no of times>
For example
daily;1;21/03/2016;times;10
every day starting 21st March, 10 times
Does anybody know how I would calculate whether the string indicates that all meetings are in the past? I want a SELECT statement that tells me whether each recurrence value is in the past: true or false.
I added one string ('weekly;1;21/05/2016;times;4') that definitely must not be deleted, to show some output. First, try loading all the data from your table into the table variable @table1 and check that the right rows are deleted.
DECLARE @table1 TABLE (
Recurrence nvarchar(max)
)
DECLARE @xml xml
INSERT INTO @table1 VALUES
('daily;1;21/03/2016;times;10'),
('daily;1;21/03/2015;times;10'),
('daily;1;01/02/2016;times;8'),
('monthly;1;01/01/2016;times;2'),
('weekly;1;21/01/2016;times;4'),
('weekly;1;21/05/2016;times;4')
SELECT @xml= (
SELECT CAST('<s><r>' + REPLACE(Recurrence,';','</r><r>') + '</r><r>'+ Recurrence+'</r></s>' as xml)
FROM @table1
FOR XML PATH ('')
)
;WITH cte as (
SELECT t.v.value('r[1]','nvarchar(10)') as how,
t.v.value('r[2]','nvarchar(10)') as every,
CONVERT(date,t.v.value('r[3]','nvarchar(10)'),103) as since,
t.v.value('r[4]','nvarchar(10)') as what,
t.v.value('r[5]','int') as howmany,
t.v.value('r[6]','nvarchar(max)') as Recurrence
FROM @xml.nodes('/s') as t(v)
)
DELETE t
FROM @table1 t
LEFT JOIN cte c ON c.Recurrence=t.Recurrence
WHERE
CASE WHEN how = 'daily' THEN DATEADD(day,howmany,since)
WHEN how = 'weekly' THEN DATEADD(week,howmany,since)
WHEN how = 'monthly' THEN DATEADD(month,howmany,since)
ELSE NULL END < GETDATE()
SELECT * FROM @table1
Output:
Recurrence
-----------------------------
weekly;1;21/05/2016;times;4
(1 row(s) affected)
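The parsing rule itself can be sketched in a few lines of Python, which may help when checking the date arithmetic by hand. This is an illustration under two stated assumptions, not a restatement of the query above: the second field is read as an interval (every N days, weeks or months), and the first meeting happens on the start date, so the last meeting is interval * (count - 1) units later. The function names add_months, last_occurrence and is_finished are hypothetical.

```python
import calendar
from datetime import date, datetime, timedelta

def add_months(d, months):
    """Add calendar months, clamping the day to the target month's length."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

def last_occurrence(recurrence):
    """Return the date of the final meeting described by a recurrence string.

    Assumed format: <frequency>;<interval>;<dd/mm/yyyy>;times;<count>
    """
    freq, every, start, _, count = recurrence.split(";")
    start_date = datetime.strptime(start, "%d/%m/%Y").date()
    steps = int(every) * (int(count) - 1)  # first meeting is on the start date
    if freq == "daily":
        return start_date + timedelta(days=steps)
    if freq == "weekly":
        return start_date + timedelta(weeks=steps)
    if freq == "monthly":
        return add_months(start_date, steps)
    raise ValueError(f"unknown frequency: {freq}")

def is_finished(recurrence, today):
    """True when every meeting in the recurrence lies strictly before today."""
    return last_occurrence(recurrence) < today

today = date(2016, 5, 1)  # fixed date so the example is reproducible
for rec in ["daily;1;21/03/2015;times;10", "weekly;1;21/05/2016;times;4"]:
    print(rec, is_finished(rec, today))
```

With the fixed date of 1 May 2016, the first string is finished (its last meeting is 30 March 2015) and the second is not (its last meeting is 11 June 2016).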
I've now spent all day on a fairly simple UDF. It's below. When I paste the SELECT statement into a query, it runs as expected, but when I execute the entire function, I get "0" every time. As you know, there aren't a ton of debugging options, so it's hard to see which values are or aren't being set as it executes. The basic purpose is to make sure stock data exists in a daily pricing table. I pass in how many days of data to check for, the ticker, and the latest trading date to check. A subquery gets me the correct trading dates, and I use "IN" to pull data out of the pricing and vol table. If the count of what comes back is less than the number of days I'm checking, no good. If it matches, we're in business. Any help would be great; I'm a newb who is punting at this point:
ALTER FUNCTION dbo.PricingVolDataAvailableToDateProvided
(@Ticker char,
@StartDate DATE,
@NumberOfDaysBack int)
RETURNS bit
AS
BEGIN
DECLARE @Result bit
DECLARE @RecordCount int
SET @RecordCount = (
SELECT COUNT(TradeDate) AS Expr1
FROM (SELECT TOP (100) PERCENT TradeDate
FROM tblDailyPricingAndVol
WHERE ( Symbol = @Ticker )
AND ( TradeDate IN (SELECT TOP (@NumberOfDaysBack)
CAST(TradingDate AS DATE) AS Expr1
FROM tblTradingDays
WHERE ( TradingDate <= @StartDate )
ORDER BY TradingDate DESC) )
ORDER BY TradeDate DESC) AS TempTable )
IF @RecordCount = @NumberOfDaysBack
SET @Result = 1
ELSE
SET @Result = 0
RETURN @Result
END
@Ticker char seems suspect.
If you don't declare a length in the parameter definition, it defaults to char(1), so quite likely your passed-in tickers are being silently truncated, hence no matches.
SELECT TOP (100) PERCENT TradeDate ... ORDER BY TradeDate DESC
in the derived table is pointless but won't affect the result.
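To see why the truncation leads to a count of 0, here is a small Python simulation. It only imitates the behaviour and does not call SQL Server: as_char stands in for assigning a value to a fixed-width char(n) parameter, and sql_equal stands in for a string comparison that ignores trailing spaces. Both helper names and the 'MSFT' ticker are hypothetical.

```python
def as_char(value, length=1):
    """Imitate assigning a string to a char(length) parameter.

    Longer values are cut to `length`; shorter ones are padded with spaces.
    """
    return value[:length].ljust(length)

def sql_equal(left, right):
    """Imitate an equality test that ignores trailing spaces."""
    return left.rstrip(" ") == right.rstrip(" ")

stored_symbol = "MSFT"  # hypothetical value in the Symbol column

truncated = as_char("MSFT")   # parameter declared as plain char
sized = as_char("MSFT", 10)   # parameter declared as char(10)

print(repr(truncated), sql_equal(truncated, stored_symbol))
print(repr(sized), sql_equal(sized, stored_symbol))
```

With the default length, only the first character survives, so the comparison against the stored symbol fails for every row and the count can never reach the number of days requested. Declaring an explicit length on the parameter, sized to the Symbol column, avoids this.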