I am trying to add date parameters for the start and end date in my code. The code amortizes revenue over a period of time selected through the date parameters.
However, I get the error 'Table-value function not found'. Does anyone have an idea why I'm getting this error?
Edit:
All data come from tables in my BigQuery project. I appended multiple tables and saved the result as a View, which is what I am querying.
The original table has columns like:
User | fullname | subscription_start_date | subscription_end_date | Amount
I wrote code that amortizes the Amount by considering the subscription duration and the period selected from the dynamic date input.
The issue now is that, after writing the code below, I get the error about the table function not being found. I am not sure what that means.
CASE
  -- Sub falls within the selected period
  WHEN DATE_DIFF(@DS_END_DATE,'2000-01-01', DAY) >= DATE_DIFF(end_at,'2000-01-01', DAY) AND DATE_DIFF(@DS_START_DATE,'2000-01-01', DAY) <= DATE_DIFF(start_at,'2000-01-01', DAY) THEN CAST(t1.amount AS NUMERIC)
  -- Selected period falls within the sub period
  WHEN DATE_DIFF(@DS_END_DATE,'2000-01-01', DAY) <= DATE_DIFF(end_at,'2000-01-01', DAY) AND DATE_DIFF(@DS_START_DATE,'2000-01-01', DAY) >= DATE_DIFF(start_at,'2000-01-01', DAY) THEN
    (DATE_DIFF(@DS_END_DATE,'2000-01-01', DAY) - DATE_DIFF(@DS_START_DATE,'2000-01-01', DAY))/(DATE_DIFF(end_at,'2000-01-01', DAY) - DATE_DIFF(start_at,'2000-01-01', DAY)) * CAST(t1.amount AS NUMERIC)
  -- Subscription starts within but ends outside the selected period
  WHEN DATE_DIFF(@DS_END_DATE, '2000-01-01', DAY) <= DATE_DIFF(end_at,'2000-01-01', DAY) AND DATE_DIFF(@DS_START_DATE,'2000-01-01', DAY) <= DATE_DIFF(start_at,'2000-01-01', DAY) THEN
    (DATE_DIFF(@DS_END_DATE, '2000-01-01', DAY) - DATE_DIFF(start_at, '2000-01-01', DAY))/(DATE_DIFF(end_at, '2000-01-01', DAY) - DATE_DIFF(start_at, '2000-01-01', DAY)) * CAST(t1.amount AS NUMERIC)
  -- Subscription starts outside of but ends within the selected period
  WHEN DATE_DIFF(@DS_END_DATE, '2000-01-01', DAY) >= DATE_DIFF(end_at, '2000-01-01', DAY) AND DATE_DIFF(@DS_START_DATE, '2000-01-01', DAY) >= DATE_DIFF(start_at, '2000-01-01', DAY) THEN
    (DATE_DIFF(end_at, '2000-01-01', DAY) - DATE_DIFF(@DS_START_DATE, '2000-01-01', DAY))/(DATE_DIFF(end_at, '2000-01-01', DAY) - DATE_DIFF(start_at, '2000-01-01', DAY)) * CAST(t1.amount AS NUMERIC)
END AS real_revenue
The real_revenue is supposed to be the contribution per subscription for the time period selected.
NB: Data Studio date parameters are represented by @DS_START_DATE and @DS_END_DATE
Have a quick look at this Doc; it seems to imply that you cannot use query parameters in views, and @DS_START_DATE and @DS_END_DATE are query parameters.
If you include your original query (the one used to build the view) as a subquery, this should work fine. You can add custom queries in Data Studio when you add a data source: just copy the query into that UI, and it should work with the date parameters (if the option is ticked).
Side note: I've simplified your query a little and added the subquery.
SELECT *,
  CASE
    -- Sub falls within the selected period
    WHEN DATE_DIFF(@DS_END_DATE, end_at, DAY) >= 0 AND DATE_DIFF(@DS_START_DATE, start_at, DAY) <= 0 THEN CAST(t1.amount AS NUMERIC)
    -- Selected period falls within the sub period
    WHEN DATE_DIFF(@DS_END_DATE, end_at, DAY) <= 0 AND DATE_DIFF(@DS_START_DATE, start_at, DAY) >= 0 THEN
      DATE_DIFF(@DS_END_DATE, @DS_START_DATE, DAY) / DATE_DIFF(end_at, start_at, DAY) * CAST(t1.amount AS NUMERIC)
    -- Subscription starts within but ends outside the selected period
    WHEN DATE_DIFF(@DS_END_DATE, end_at, DAY) <= 0 AND DATE_DIFF(@DS_START_DATE, start_at, DAY) <= 0 THEN
      DATE_DIFF(@DS_END_DATE, start_at, DAY) / DATE_DIFF(end_at, start_at, DAY) * CAST(t1.amount AS NUMERIC)
    -- Subscription starts outside of but ends within the selected period
    WHEN DATE_DIFF(@DS_END_DATE, end_at, DAY) >= 0 AND DATE_DIFF(@DS_START_DATE, start_at, DAY) >= 0 THEN
      DATE_DIFF(end_at, @DS_START_DATE, DAY) / DATE_DIFF(end_at, start_at, DAY) * CAST(t1.amount AS NUMERIC)
  END AS real_revenue
FROM (
  -- YOUR ORIGINAL VIEW CREATION QUERY
) AS t1
WHERE
  -- any conditions you might want to add.
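As a way to sanity-check the CASE branches, note that whenever the subscription and the selected period overlap, every branch reduces to amount × (overlapping days / subscription days). The Python sketch below models only that arithmetic; the function name `amortized_revenue` and the sample dates are hypothetical, and returning 0 for non-overlapping periods is an assumption (the CASE above does not handle that situation explicitly).

```python
from datetime import date
from decimal import Decimal


def amortized_revenue(amount, sub_start, sub_end, sel_start, sel_end):
    """Share of `amount` that falls inside the selected period, prorated by days."""
    sub_days = (sub_end - sub_start).days
    # The overlap runs from the later start to the earlier end.
    overlap_start = max(sub_start, sel_start)
    overlap_end = min(sub_end, sel_end)
    overlap_days = (overlap_end - overlap_start).days
    if sub_days <= 0 or overlap_days <= 0:
        # Assumption: no overlap (or a zero-length subscription) contributes nothing.
        return Decimal("0")
    return amount * overlap_days / sub_days


# Hypothetical 30-day subscription worth 300, with a 10-day selected window inside it.
print(amortized_revenue(Decimal("300"), date(2021, 1, 1), date(2021, 1, 31),
                        date(2021, 1, 11), date(2021, 1, 21)))
```

Comparing a few rows of the query output against this function is a quick way to spot a branch whose comparison signs are flipped.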
For a given date, I want to add business days to it. For example, if today is 10-17-2022 and I have a field with a value of 8 business days, how can I add 8 business days to 10-17-2022 to get 10-27-2022?
Current Data:
BUSINESS_DAYS | Date
8 | 10-11-2022
10 | 10-13-2022
9 | 10-12-2022
Desired Output Data:
BUSINESS_DAYS | Date | FINAL_DATE
8 | 10-11-2022 | 10-21-2022
10 | 10-13-2022 | 10-27-2022
9 | 10-12-2022 | 10-25-2022
As you can see we are skipping all weekends. We can ignore holidays for now.
Update:
Using the suggested logic, I got the following answer. I changed the names.
I used:
DATE_ADD(A.PO_SENT_DATE , INTERVAL
(CAST(PREDICTED_LEAD_TIME AS INT64)
+ (date_diff(A.PO_SENT_DATE , DATE_ADD(A.PO_SENT_DATE , INTERVAL CAST(PREDICTED_LEAD_TIME AS INT64) DAY), week)* 2))
DAY) as FINAL_DATE
Update2: Using the following:
DATE_ADD(`Date`, INTERVAL
(BUSINESS_DAYS
+ (date_diff( DATE_ADD(`Date`, INTERVAL BUSINESS_DAYS DAY),`Date`, week) * 2))
DAY) as FINAL_DATE
There are instances where the result falls on the weekend. See screenshot below. 10-22-2022 falls on a Saturday.
Consider below simple solution
select *,
( select day
from unnest(generate_date_array(date, date + (div(business_days, 5) + 1) * 7)) day
where not extract(dayofweek from day) in (1, 7)
qualify row_number() over(order by day) = business_days + 1
) final_date
from your_table
if applied to sample data in your question
with your_table as (
select 8 business_days, date '2022-10-11' date union all
select 10, '2022-10-13' union all
select 9, '2022-10-12'
)
output is
The solution from @mikhailberlyant is really cool and very innovative. However, if your table has many rows and the "business_days" values vary widely, the query becomes less efficient, especially for larger "business_days" values, because it has to generate the entire date array for each row, unnest it, and then filter that array.
This might help you do calculation without any array business:
select day, add_days as add_business_days,
DATE_ADD(day, INTERVAL cast(add_days +2*ceil((add_days -(5-(
(case when EXTRACT(DAYOFWEEK FROM day) = 7 then 1 else EXTRACT(DAYOFWEEK FROM day) end)
-1)))/5)+(case when EXTRACT(DAYOFWEEK FROM day) = 7 then 1 else 0 end) as int64) DAY) as final_day
from
(select parse_date('%Y-%m-%d', "2022-10-11") as day, 8 as add_days)
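To check the closed-form expression above against a straightforward day-by-day count, here is a small Python sketch. Both function names are hypothetical, it assumes at least one business day is being added, and holidays are ignored, as in the question. The weekday number is converted to BigQuery's DAYOFWEEK convention (Sunday = 1, Saturday = 7) so the arithmetic mirrors the query.

```python
import math
from datetime import date, timedelta


def add_business_days_loop(start, n):
    """Reference version: step one day at a time, counting only Mon-Fri."""
    day = start
    remaining = n
    while remaining > 0:
        day += timedelta(days=1)
        if day.isoweekday() <= 5:  # Monday=1 ... Friday=5
            remaining -= 1
    return day


def add_business_days_closed_form(start, n):
    """Same arithmetic as the SQL expression above, assuming n >= 1."""
    dow = start.isoweekday() % 7 + 1            # BigQuery DAYOFWEEK: Sunday=1 ... Saturday=7
    is_saturday = dow == 7
    effective_dow = 1 if is_saturday else dow   # treat Saturday like Sunday, plus one extra day
    weekdays_left = 5 - (effective_dow - 1)     # business days remaining in the current week
    weekends = math.ceil((n - weekdays_left) / 5)
    return start + timedelta(days=n + 2 * weekends + (1 if is_saturday else 0))


# Sample rows from the question
for n, d in [(8, date(2022, 10, 11)), (10, date(2022, 10, 13)), (9, date(2022, 10, 12))]:
    print(n, d, add_business_days_closed_form(d, n))
```

The loop version is easier to extend with a holiday calendar later; the closed form avoids per-row iteration.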
I have a postgres table "Generation" with half-hourly timestamps spanning 2009 - present with energy data:
I need to aggregate (average) the data across different intervals from specific timepoints, for example data from 2021-01-07T00:00:00.000Z for one year at 7 day intervals, or 3 months at 1 day interval or 7 days at 1h interval etc. date_trunc() partly solves this, but rounds the weeks to the nearest monday e.g.
SELECT date_trunc('week', "DATETIME") AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= '2021-01-07T00:00:00.000Z' AND "DATETIME" <= '2022-01-06T23:59:59.999Z'
GROUP BY week
ORDER BY week ASC
;
returns the first time series interval as 2021-01-04 with an incorrect count:
week count gas coal
"2021-01-04 00:00:00" 192 18291.34375 2321.4427083333335
"2021-01-11 00:00:00" 336 14477.407738095239 2027.547619047619
"2021-01-18 00:00:00" 336 13947.044642857143 1152.047619047619
EDIT: the following returns the correct weekly intervals by checking the start date relative to the nearest Monday (start of week) and adjusting the results accordingly:
WITH vars1 AS (
SELECT '2021-01-07T00:00:00.000Z'::timestamp as start_time,
'2021-01-28T00:00:00.000Z'::timestamp as end_time
),
vars2 AS (
SELECT
((select start_time from vars1)::date - (date_trunc('week', (select start_time from vars1)::timestamp))::date) as diff
)
SELECT date_trunc('week', "DATETIME" - ((select diff from vars2) || ' day')::interval)::date + ((select diff from vars2) || ' day')::interval AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= (select start_time from vars1) AND "DATETIME" < (select end_time from vars1)
GROUP BY week
ORDER BY week ASC
returns..
week count gas coal
"2021-01-07 00:00:00" 336 17242.752976190477 2293.8541666666665
"2021-01-14 00:00:00" 336 13481.497023809523 1483.0565476190477
"2021-01-21 00:00:00" 336 15278.854166666666 1592.7916666666667
And then for any daily or hourly (swap out day with hour) intervals you can use the following:
SELECT date_trunc('day', "DATETIME") AS day,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= '2022-01-07T00:00:00.000Z' AND "DATETIME" < '2022-01-10T23:59:59.999Z'
GROUP BY day
ORDER BY day ASC
;
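The weekly query earlier in this edit works by shifting each timestamp back by the offset between the start date and its Monday, truncating to the week, and shifting forward again. A minimal Python sketch of that idea follows; the function name `week_bucket` is hypothetical, and it assumes naive timestamps with `start` at midnight.

```python
from datetime import datetime, time, timedelta


def week_bucket(ts, start):
    """Start of the 7-day bucket containing `ts`, with buckets anchored at `start`."""
    diff = timedelta(days=start.weekday())  # days between start and the Monday of its week
    shifted = ts - diff
    # Truncate the shifted timestamp to Monday 00:00, like date_trunc('week', ...)
    monday = datetime.combine(shifted.date() - timedelta(days=shifted.weekday()), time.min)
    return monday + diff


start = datetime(2021, 1, 7)
print(week_bucket(datetime(2021, 1, 13, 23, 30), start))
print(week_bucket(datetime(2021, 1, 14, 0, 0), start))
```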
To select complete weeks, change the WHERE clause to something like:
WHERE "DATETIME" >= date_trunc('week','2021-01-07T00:00:00.000Z'::timestamp)
AND "DATETIME" < (date_trunc('week','2022-01-06T23:59:59.999Z'::timestamp) + interval '7' day)::date
This effectively gets the records from January 4, 2021 up to and including January 9, 2022.
Note: I changed <= to < so that the end date itself is excluded.
EDIT:
when you want your weeks to start on January 7, you can always group by:
(date_part('day',(d-'2021-01-07'))::int-(date_part('day',(d-'2021-01-07'))::int % 7))/7
(where d is the column containing the datetime-value.)
see: dbfiddle
EDIT:
This will get the list from a given date, and a specified interval.
see DBFIDDLE
WITH vars AS (
SELECT
'2021-01-07T00:00:00.000Z'::timestamp AS qstart,
'2022-01-06T23:59:59.999Z'::timestamp AS qend,
7 as qint,
INTERVAL '1 DAY' as qinterval
)
SELECT
(select date(qstart) FROM vars) + (SELECT qinterval from vars) * ((date_part('day',("DATETIME"-(select date(qstart) FROM vars)))::int-(date_part('day',("DATETIME"-(select date(qstart) FROM vars)))::int % (SELECT qint FROM vars)))::int) AS week,
count(*),
AVG("GAS") AS gas,
AVG("COAL") AS coal
FROM "Generation"
WHERE "DATETIME" >= (SELECT qstart FROM vars) AND "DATETIME" <= (SELECT qend FROM vars)
GROUP BY week
ORDER BY week
;
I added the WITH vars to do the variable stuff on top and no need to mess with the rest of the query. (Idea borrowed here)
I only tested with qint=7,qinterval='1 DAY' and qint=14,qinterval='1 DAY' (but others values should work too...)
Using the function EXTRACT you may calculate the difference in days, weeks and hours between your timestamp ts and the start_date as follows
Difference in Days
extract (day from ts - start_date)
Difference in Weeks
Is the difference in day divided by 7 and truncated
trunc(extract (day from ts - start_date)/7)
Difference in Hours
Is the difference in day times 24 + the difference in hours of the day
extract (day from ts - start_date)*24 + extract (hour from ts - start_date)
The difference can be used in GROUP BY directly. E.g. for week grouping the first group is difference 0, i.e. same week, the next group with difference 1, the next week, etc.
Sample Example
I'm using a CTE for the start date to avoid multiple copies of the parameter
with start_time as
(select DATE'2021-01-07' as start_ts),
prep as (
select
ts,
extract (day from ts - (select start_ts from start_time)) day_diff,
trunc(extract (day from ts - (select start_ts from start_time))/7) week_diff,
extract (day from ts - (select start_ts from start_time)) *24 + extract (hour from ts - (select start_ts from start_time)) hour_diff,
value
from test_table
where ts >= (select start_ts from start_time)
)
select week_diff, avg(value)
from prep
group by week_diff order by 1
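The three differences used for grouping can be mirrored in a few lines of Python, which may help when checking which bucket a given timestamp should land in. This is only a sketch: `bucket_indexes` is a hypothetical name, and it assumes `ts >= start`, matching the WHERE clause in the query above.

```python
from datetime import datetime


def bucket_indexes(ts, start):
    """Return (day_diff, week_diff, hour_diff) of `ts` relative to `start`."""
    delta = ts - start
    day_diff = delta.days                               # extract(day from ts - start)
    week_diff = day_diff // 7                           # trunc(day_diff / 7) for non-negative diffs
    hour_diff = day_diff * 24 + delta.seconds // 3600   # whole days as hours plus the hour of the day
    return day_diff, week_diff, hour_diff


print(bucket_indexes(datetime(2021, 1, 15, 5, 30), datetime(2021, 1, 7)))
```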
At my work we run a report a couple times a week to pull some information from BigQuery.
We run the report every Monday and Thursday.
I'd like to automate the report to run on these days and want to know if I can put in some logic so that if I run the report on a Monday, it runs the data for the previous business week (Sunday - Saturday), and if I run the report on a Thursday, it runs the report for the current business week so far (Sunday - Wednesday).
On another report where I only run the report for previous week I use:
select last_day(current_date - 14, week(monday)) as lw_week_start, last_day(current_date - 7, week(sunday)) as lw_week_end
And to get the current week dates I can use:
select last_day (current_date -7, week(monday)), (current_date -1)
So can I put both of these in my query, and use some sort of logic to say, if I run on a Monday use the first one, if I run on a Thursday, use the second one?
Thanks
You can define the period as a CTE (or if you prefer as variables) and then use that information in the query:
with period as (
select (case when extract(dayofweek from current_date) = 2
then last_day(date_add(current_date, interval -14 day), week(monday))
when extract(dayofweek from current_date) = 5
then last_day(date_add(current_date, interval -7 day), week(monday))
end) as lw_week_start,
(case when extract(dayofweek from current_date) = 2
then last_day(date_add(current_date, interval -7 day), week(sunday))
when extract(dayofweek from current_date) = 5
then date_add(current_date, interval -1 day)
end) as lw_week_end
)
select . . .
from period cross join
. . .
Notes:
This only includes Mondays and Thursdays. I imagine you want to extend this to the other days of the week.
current_date is the current date UTC. You might want to include your timezone:
select current_date('America/New_York')
/* if current_date is monday then it will return previous week report
else it will give report for present week for any other current_date */
IF (EXTRACT (DAYOFWEEK FROM CURRENT_DATE)) = 2 THEN
select last_day(current_date - 14, week(monday)) as lw_week_start, last_day(current_date - 7, week(sunday)) as lw_week_end;
ELSE
select last_day (current_date -7, week(monday)) as week_start, (current_date -1) as previous_day ;
END IF;
Scripting on BigQuery
BigQuery Date functions
Simply add below to your where clause
where your_date_column in unnest(
case extract(dayofweek from current_date())
when 2 then generate_date_array(last_day(current_date() - 14, week(monday)), last_day(current_date() - 7, week(sunday)))
when 5 then generate_date_array(last_day (current_date() - 7, week(monday)), current_date() - 1)
end
)
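To double-check which dates the Monday and Thursday branches should produce, the same rule can be written out in Python. This is a sketch under the question's assumptions (business weeks run Sunday to Saturday, and the report runs only on Mondays and Thursdays); the name `report_period` is hypothetical, and a Sunday run is not meaningfully handled.

```python
from datetime import date, timedelta


def report_period(today):
    """(start, end) of the reporting window, both inclusive."""
    days_since_sunday = today.isoweekday() % 7      # Sunday -> 0, Monday -> 1, ..., Saturday -> 6
    this_sunday = today - timedelta(days=days_since_sunday)
    if today.isoweekday() == 1:                     # Monday: previous full week, Sunday to Saturday
        return this_sunday - timedelta(days=7), this_sunday - timedelta(days=1)
    # Any other day: current week so far, Sunday through yesterday
    return this_sunday, today - timedelta(days=1)


print(report_period(date(2022, 10, 17)))  # a Monday
print(report_period(date(2022, 10, 20)))  # a Thursday
```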
The below query returns a distinct count of 'members' for a given month and brand (see image below).
select to_char(transaction_date, 'YYYY-MM') as month, brand,
count(distinct UNIQUE_MEM_ID) as distinct_count
from source.table
group by to_char(transaction_date, 'YYYY-MM'), brand;
The data is collected with a 15 day lag after the month closes (meaning September 2016 MONTHLY data won't be 100% until October 15). I am only concerned with monthly data.
The query I would like to build: Until the 15th of this month (October), last month's data (September) should reflect August's data. The current partial month (October) should default to the prior month and thus also to the above logic.
After the 15th of this month, last month's data (September) is now 100% and thus September should reflect September (and October will reflect September until November 15th, and so on).
The current partial month will always = the prior month. The complexity of the query is how to calc prior month.
This query will be run on a rolling basis, so it needs to be dynamic.
To be clear, I am trying to build a query where distinct_count for the prior month (until end of current month + 15 days) should reflect (current month - 2) value (for each respective brand). After 15 days of the close of the month, prior month = (current month - 1).
Partial current month defaults to prior month's data. The 15 day value should be variable/modifiable.
First, simplify the query to:
select to_char(transaction_date, 'YYYY-MM') as month, brand,
count(distinct members) as distinct_count
from source.table
group by to_char(transaction_date, 'YYYY-MM'), brand;
Then, you are going to have a problem. The problem is that one row (say from Aug 20th) needs to go into two groups. A simple group by won't handle this. So, let's use union all. I think the result is something like this:
select date_trunc('month', transaction_date) as month, brand,
       count(distinct members) as distinct_count
from source.table
where (date_trunc('month', transaction_date) < date_trunc('month', current_date) - interval '1 month') or
      (extract(day from current_date) >= 15 and date_trunc('month', transaction_date) = date_trunc('month', current_date) - interval '1 month')
group by date_trunc('month', transaction_date), brand
union all
select date_trunc('month', current_date) - interval '1 month' as month, brand,
       count(distinct members) as distinct_count
from source.table
where (extract(day from current_date) < 15 and date_trunc('month', transaction_date) = date_trunc('month', current_date) - interval '1 month')
group by brand;
Since you already have a working query, I concentrate on the subselect. The condition you can use here is CASE, especially "Searched CASE"
case
when extract(day from current_date) < 15 then
extract(month from current_date - interval '2 months')
else
extract(month from current_date - interval '1 month')
end
This may be used as part of a where clause, for example.
Here is some pseudocode to get the begin date and the end date for your interval.
Begin date:
DATE_TRUNC('month', CURRENT_DATE - 15) - interval '1 month'
The DATE_TRUNC part moves to the current month only after the 15th day; from there you subtract a full month to get your starting point.
End Date:
To calculate this, grab the begin date, plus a month, minus a day.
If the source table is partitioned by transaction_date, this syntax (which does not wrap transaction_date in an expression) enables partition elimination.
select to_char(transaction_date, 'YYYY-MM') as month
,count (distinct members) as distinct_count
,brand as brand
FROM source.table
where transaction_date between date_trunc('month', current_date) - case when extract (day from current_date) >= 15 then 1 else 2 end * interval '1' month
and date_trunc('month', current_date) - case when extract (day from current_date) >= 15 then 0 else 1 end * interval '1' month - interval '1' day
group by to_char(transaction_date, 'YYYY-MM')
,brand
;
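The rule for picking the latest complete month can be sketched in Python to test the edge cases around the cutoff and the year boundary. The function name `latest_complete_month` is hypothetical, and treating the 15th itself as "data complete" is an assumption that follows the `>= 15` comparison in the query above; the lag is a parameter so it can be changed.

```python
from datetime import date


def latest_complete_month(today, lag_days=15):
    """(year, month) of the most recent month whose data is considered complete."""
    # On or after the cutoff day, last month is complete; before it, go back two months.
    months_back = 1 if today.day >= lag_days else 2
    index = today.year * 12 + (today.month - 1) - months_back
    year, month0 = divmod(index, 12)
    return year, month0 + 1


print(latest_complete_month(date(2016, 10, 14)))
print(latest_complete_month(date(2016, 10, 15)))
```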
I have a table which has all the purchases of my customers. I want to select all entries from the last week (weeks start on Sunday).
id value date
5907 1.20 "2015-06-05 09:08:34-03"
5908 120.00 "2015-06-09 07:58:12-03"
I've tried this:
SELECT id, valor, created FROM compras WHERE created >= now() - interval '1 week' AND parceiro_id = '1'
But I got the data from the last week including data from this week, I only want data from the last week.
How to get data only from last week ?
This condition will return records from Sunday till Saturday last week:
WHERE created BETWEEN
NOW()::DATE-EXTRACT(DOW FROM NOW())::INTEGER-7
AND NOW()::DATE-EXTRACT(DOW from NOW())::INTEGER
There is an example:
WITH compras AS (
SELECT ( NOW() + (s::TEXT || ' day')::INTERVAL )::TIMESTAMP(0) AS created
FROM generate_series(-20, 20, 1) AS s
)
SELECT to_char( created, 'DY'::TEXT), created
FROM compras
WHERE created BETWEEN
NOW()::DATE-EXTRACT(DOW FROM NOW())::INTEGER-7
AND NOW()::DATE-EXTRACT(DOW from NOW())::INTEGER
In answer to @d456:
Wouldn't using BETWEEN include midnight on Sunday at both ends of the interval?
That's right, BETWEEN includes midnight on Sunday at both ends of the interval. To exclude midnight on Sunday at the end of the interval, use the operators >= and < instead:
WITH compras AS (
SELECT s as created
FROM generate_series( -- this would produce timestamps with 20 minutes step
(now() - '20 days'::interval)::date,
(now() + '20 days'::interval)::date,
'20 minutes'::interval) AS s
)
SELECT to_char( created, 'DY'::TEXT), created
FROM compras
WHERE TRUE
AND created >= NOW()::DATE-EXTRACT(DOW FROM NOW())::INTEGER-7
AND created < NOW()::DATE-EXTRACT(DOW from NOW())::INTEGER
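The boundary arithmetic in the condition above (today minus the day-of-week number, minus seven days) can be checked with a short Python sketch. The name `last_week_bounds` is hypothetical, as is the "today" used in the example; the day-of-week number is converted to the Postgres `dow` convention (Sunday = 0).

```python
from datetime import date, timedelta


def last_week_bounds(today):
    """Half-open range [start, end) covering last week, Sunday through Saturday."""
    dow = today.isoweekday() % 7                 # Postgres DOW: Sunday=0 ... Saturday=6
    this_sunday = today - timedelta(days=dow)    # start of the current week
    return this_sunday - timedelta(days=7), this_sunday


# Hypothetical "today" of 2015-06-10, a Wednesday
start, end = last_week_bounds(date(2015, 6, 10))
print(start, end)
# The sample row dated 2015-06-05 falls inside the range; the one dated 2015-06-09 does not.
print(start <= date(2015, 6, 5) < end, start <= date(2015, 6, 9) < end)
```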
Postgres by default starts weeks on a Sunday, so you are in luck. You can use date_trunc() to get the beginning of the previous week:
WHERE (created >= date_trunc('week', CURRENT_TIMESTAMP - interval '1 week') and
created < date_trunc('week', CURRENT_TIMESTAMP)
)
EDIT:
By default, Postgres starts the week on Monday for date_trunc but on Sunday for dow. So you can do what you want using the dow logic, as Nicolai does in his answer.