How to query the daily cost of a specific product in BigQuery? - google-bigquery

I exported billing to BigQuery, and I want to get the total cost of translations for a specific date (like April 1, 2019) or for a whole month.
The Google docs sample query only gets monthly totals:
SELECT
invoice.month,
SUM(cost)
+ SUM(IFNULL((SELECT SUM(c.amount)
FROM UNNEST(credits) c), 0))
AS total,
(SUM(CAST(cost * 1000000 AS int64))
+ SUM(IFNULL((SELECT SUM(CAST(c.amount * 1000000 as int64))
FROM UNNEST(credits) c), 0))) / 1000000
AS total_exact
FROM `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
GROUP BY 1
ORDER BY 1 ASC
;
but I created my query this way:
$myVariable=
"SELECT
COUNT(*) total_times,
SUM(cost) total_cost
FROM
`project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
WHERE
service.description = 'Translate' AND (usage_end_time >= timestamp('2019-04-04 00:00:00') AND usage_end_time <= timestamp('2019-04-04 23:59:59'))";
I want to get the total cost of the current day and the total cost from the first day of the month to the current day.
sample:
1. 2019/04/04: 4223.05 - (882 Times)
2. 2019/04/Total: 16505.43 - (3882 Times)

You can further add details to your working query:
SELECT
service.description,
timestamp_trunc(usage_start_time,DAY) as time_fragment,
ROUND(SUM(cost)
+ SUM(IFNULL((SELECT SUM(c.amount)
FROM UNNEST(credits) c), 0)),3)
AS total,
round((SUM(CAST(cost * 1000000 AS int64))
+ SUM(IFNULL((SELECT SUM(CAST(c.amount * 1000000 as int64))
FROM UNNEST(credits) c), 0))) / 1000000,3)
AS total_exact
FROM `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
WHERE service.description='Translate'
GROUP BY 1,2
ORDER BY 2 desc;
which displays one row per day for the Translate service, with the rounded totals.
You can go down to HOURly granularity by editing line 3 of the query (the timestamp_trunc expression), as sketched below.
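For example, a hedged sketch of the hourly variant, plus a month-to-date total like the one asked for (both reuse the same export table; whether usage_start_time or usage_end_time is the right timestamp to bucket on is an assumption to verify against your export):
-- hourly granularity: only the TIMESTAMP_TRUNC unit changes
SELECT
service.description,
TIMESTAMP_TRUNC(usage_start_time, HOUR) AS time_fragment,
ROUND(SUM(cost)
+ SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)), 3) AS total
FROM `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
WHERE service.description = 'Translate'
GROUP BY 1, 2
ORDER BY 2 DESC;
-- month-to-date total (first day of the current month up to now)
SELECT
COUNT(*) AS total_times,
ROUND(SUM(cost), 3) AS total_cost
FROM `project.dataset.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
WHERE service.description = 'Translate'
AND usage_start_time >= TIMESTAMP(DATE_TRUNC(CURRENT_DATE(), MONTH));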

Related

How to spread month to day with amount value divided by total days per month

I have data with an amount for one month and want to spread it over the days of that month.
If the amount for one month is 20000, then the amount per day is 666.67.
The following are the sample data and results:
Account | Project | Date           | Segment | Amount
Acc1    | 1       | September 2022 | Actual  | 20000
Result :
I need a query for SQL Server.
You may try a set-based approach using an appropriate number table and a calculation with windowed COUNT().
Data:
SELECT *
INTO Data
FROM (VALUES
('Acc1', 1, CONVERT(date, '20220901'), 'Actual', 20000.00)
) v (Account, Project, [Date], Segment, Amount)
Statement for all versions starting from SQL Server 2016 (the number table is generated using a JSON-based approach with OPENJSON()):
SELECT d.Account, d.Project, a.[Date], d.Segment, a.Amount
FROM Data d
CROSS APPLY (
SELECT
d.Amount / COUNT(*) OVER (ORDER BY (SELECT NULL)),
DATEADD(day, CONVERT(int, [key]), d.[Date])
FROM OPENJSON('[1' + REPLICATE(',1', DATEDIFF(day, d.[Date], EOMONTH(d.[Date]))) + ']')
) a (Amount, Date)
Statement for SQL Server 2022 (the number table is generated with GENERATE_SERIES()):
SELECT d.Account, d.Project, a.[Date], d.Segment, a.Amount
FROM Data d
CROSS APPLY (
SELECT
d.Amount / COUNT(*) OVER (ORDER BY (SELECT NULL)),
DATEADD(day, [value], d.[Date])
FROM GENERATE_SERIES(0, DATEDIFF(day, d.[Date], EOMONTH(d.[Date])))
) a (Amount, Date)
Notes:
Both approaches use the actual number of days in each month. If you always want 30 days per month, replace DATEDIFF(day, d.[Date], EOMONTH(d.[Date])) with 29.
There is a rounding issue with this calculation. You may need an additional calculation for the last day of the month; see the sketch below.
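A minimal sketch of that last-day adjustment (an assumption, not part of the original answer: it reuses the same Data table and the OPENJSON number table from the first statement; every day gets the two-decimal rounded share and the final day receives whatever remains):
SELECT d.Account, d.Project, a.[Date], d.Segment, a.Amount
FROM Data d
CROSS APPLY (
SELECT
-- last day of the month: amount minus the sum of all earlier rounded shares
CASE WHEN CONVERT(int, [key]) = DATEDIFF(day, d.[Date], EOMONTH(d.[Date]))
THEN d.Amount - ROUND(d.Amount / (DATEDIFF(day, d.[Date], EOMONTH(d.[Date])) + 1), 2) * DATEDIFF(day, d.[Date], EOMONTH(d.[Date]))
-- every other day: the monthly amount divided by the day count, rounded to 2 decimals
ELSE ROUND(d.Amount / (DATEDIFF(day, d.[Date], EOMONTH(d.[Date])) + 1), 2)
END,
DATEADD(day, CONVERT(int, [key]), d.[Date])
FROM OPENJSON('[1' + REPLICATE(',1', DATEDIFF(day, d.[Date], EOMONTH(d.[Date]))) + ']')
) a (Amount, Date)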
You can use a recursive CTE to generate each day of the month and then divide the amount by the number of days in the month to achieve the required output:
DECLARE @Amount NUMERIC(18,2) = 20000,
@MonthStart DATE = '2022-09-01'
;WITH CTE
AS
(
SELECT
CurrentDate = @MonthStart,
DayAmount = CAST(@Amount/DAY(EOMONTH(@MonthStart)) AS NUMERIC(18,2)),
RemainingAmount = CAST(@Amount - (@Amount/DAY(EOMONTH(@MonthStart))) AS NUMERIC(18,2))
UNION ALL
SELECT
CurrentDate = DATEADD(DAY,1,CurrentDate),
DayAmount = CASE WHEN DATEADD(DAY,1,CurrentDate) = EOMONTH(@MonthStart)
THEN RemainingAmount
ELSE DayAmount END,
RemainingAmount = CASE WHEN DATEADD(DAY,1,CurrentDate) = EOMONTH(@MonthStart)
THEN 0
ELSE CAST(RemainingAmount-DayAmount AS NUMERIC(18,2)) END
FROM CTE
WHERE CurrentDate < EOMONTH(@MonthStart)
)
SELECT
CurrentDate,
DayAmount
FROM CTE
In case you want an equal split without rounding errors and without loops you can use this calculation. It spreads the rounding error across all entries, so they are all as equal as possible.
DECLARE @Amount NUMERIC(18,2) = 20000,
@MonthStart DATE = '20220901'
SELECT DATEADD(DAY,Numbers.i - 1,@MonthStart)
, ShareSplit.Calculated_Share
, SUM(ShareSplit.Calculated_Share) OVER (ORDER BY (SELECT NULL)) AS Calculated_Total
FROM (SELECT DISTINCT number FROM master..spt_values WHERE number BETWEEN 1 AND DAY(EOMONTH(@MonthStart)))Numbers(i)
CROSS APPLY(SELECT CAST(ROUND(@Amount * 100 / DAY(EOMONTH(@MonthStart)),0) * 0.01
+ CASE
WHEN Numbers.i
<= ABS((@Amount - (ROUND(@Amount * 100 / DAY(EOMONTH(@MonthStart)),0) / 100.0 * DAY(EOMONTH(@MonthStart)))) * 100)
THEN 0.01 * SIGN(@Amount - (ROUND(@Amount * 100 / DAY(EOMONTH(@MonthStart)),0) / 100.0 * DAY(EOMONTH(@MonthStart))))
ELSE 0
END AS DEC(18,2)) AS Calculated_Share
)ShareSplit
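A worked example of how this spreads the rounding error, using the sample month (September 2022, 30 days): 20000 * 100 / 30 rounds to 66667, so the base share is 666.67; 30 * 666.67 = 20000.10, overshooting by 0.10, so SIGN(...) is -1 and the first ABS(-0.10) * 100 = 10 days are reduced by 0.01 to 666.66. The total is then 10 * 666.66 + 20 * 666.67 = 20000.00 exactly.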

SQL - Calculate percentage by group, for multiple groups

I have a table in GBQ in the following format :
UserId Orders Month
XDT 23 1
XDT 0 4
FKR 3 6
GHR 23 4
... ... ...
It shows the number of orders per user and month.
I want to calculate the percentage of users who have orders. I did it as follows:
SELECT
HasOrders,
ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
SELECT
*,
CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
FROM `Table` )
GROUP BY
HasOrders
ORDER BY
Parts
It gives me the following result:
HasOrders Parts
0 35
1 65
I need to calculate the percentage of users who have orders, by month, in a way that every month adds up to 100%.
Currently, to do this I execute the query once per month, which is not practical:
SELECT
HasOrders,
ROUND(COUNT(*) * 100 / CAST( SUM(COUNT(*)) OVER () AS float64), 2) Parts
FROM (
SELECT
*,
CASE WHEN Orders = 0 THEN 0 ELSE 1 END AS HasOrders
FROM `Table` )
WHERE Month = 1
GROUP BY
HasOrders
ORDER BY
Parts
Is there a way execute a query once and have this result ?
HasOrders Parts Month
0 25 1
1 75 1
0 45 2
1 55 2
... ... ...
SELECT
SIGN(Orders),
ROUND(COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month), 2) AS Parts,
Month
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)
Demo on Postgres:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=4cd2d1455673469c2dfc060eccea8020
You've stated that it's important for the total to be 100%, so you might consider rounding down for the no-orders row and rounding up for the has-orders row in the scenarios where the percentage falls precisely on an odd multiple of 0.5%. Rounding toward even, or rounding the smaller value down, might be better options:
WITH DATA AS (
SELECT SIGN(Orders) AS HasOrders, Month,
COUNT(*) * 100.000 / SUM(COUNT(*)) OVER (PARTITION BY Month) AS PartsPercent
FROM T
GROUP BY Month, SIGN(Orders)
ORDER BY Month, SIGN(Orders)
)
select HasOrders, Month, PartsPercent,
PartsPercent - TRUNCATE(PartsPercent) AS Fraction,
CASE WHEN HasOrders = 0
THEN FLOOR(PartsPercent) ELSE CEILING(PartsPercent)
END AS PartsRound0Down,
CASE WHEN PartsPercent - TRUNCATE(PartsPercent) = 0.5
AND MOD(TRUNCATE(PartsPercent), 2) = 0
THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
END AS PartsRoundTowardEven,
CASE WHEN PartsPercent - TRUNCATE(PartsPercent) = 0.5 AND PartsPercent < 50
THEN FLOOR(PartsPercent) ELSE ROUND(PartsPercent) -- halfway up
END AS PartsSmallestTowardZero
from DATA
It's usually not advisable to test floating-point values for equality, and I don't know how BigQuery's float64 will handle the comparison against 0.5; one half is nevertheless exactly representable in binary. The fiddle below shows these options for a case where the breakout is 101 vs 99. I don't have immediate access to BigQuery, so be aware that Postgres's rounding behavior is different:
https://dbfiddle.uk/?rdbms=postgres_10&fiddle=c8237e272427a0d1114c3d8056a01a09
Consider below approach
select hasOrders, round(100 * parts, 2) as parts, month from (
select month,
countif(orders = 0) / count(*) `0`,
countif(orders > 0) / count(*) `1`,
from your_table
group by month
)
unpivot (parts for hasOrders in (`0`, `1`))
with output like below
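For illustration (the original answer showed a screenshot of the result), running that query over just the four sample rows in the question would give output along these lines:
hasOrders  parts  month
0          0.0    1
1          100.0  1
0          50.0   4
1          50.0   4
0          0.0    6
1          100.0  6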

Postgresql - Aggregate queries inside aggregate queries

I'm working on building a select statement for a sales rep commission report that uses postgresql tables. I want it to show these columns:
-Customer No.
-Part No.
-Month-to-date Qty (MTD Qty)
-Year-to-date Qty (YTD Qty)
-Month-to-date Extended Selling Price (MTD Extended)
-Year-to-date Extended Selling Price (YTD Extended)
The data is in two tables:
Sales_History (one record per invoice and this table includes Cust. No. and Invoice Date)
Sales_History_Items (one record per part no. per invoice and this table includes Part No., Qty and Unit Price).
If I do a simple query that combines these two tables, this is what it looks like:
Date / Cust / Part / Qty / Unit Price
Apr 1 / ABC Co. / WIDGET / 5 / $11
Apr 4 / ABC Co. / WIDGET / 8 / $11.50
Apr 1 / ABC Co. / GADGET / 1 / $30
Apr 7 / XYZ Co. / WIDGET / 3 / $11.50
etc.
This is the final result I want (one line per customer per part):
Cust / Part / MTD Qty / MTD Sales / YTD Qty / YTD Sales
ABC Co. / WIDGET / 13 / $147 / 1500 / $16,975
ABC Co. / GADGET / 1 / $30 / 7 / $210
XYZ Co. / WIDGET / 3 / $34.50 / 18 / $203.40
I've been able to come up with this SQL statement so far. What it does not do is compute the extended selling price (committed_qty * unit_price) per line and then summarize it by cust no./part no., and that's my problem:
with mtd as
(SELECT sales_history.cust_no, part_no, Sum(sales_history_items.committed_qty) AS MTDQty
FROM sales_history left JOIN sales_history_items
ON sales_history.invoice_no = sales_history_items.invoice_no where sales_history_items.part_no is not null and sales_history.invoice_date >= '2020-04-01' and sales_history.invoice_date <= '2020-04-30'
GROUP BY sales_history.cust_no, sales_history_items.part_no),
ytd as
(SELECT sales_history.cust_no, part_no, Sum(sales_history_items.committed_qty) AS YTDQty
FROM sales_history left JOIN sales_history_items
ON sales_history.invoice_no = sales_history_items.invoice_no where sales_history_items.part_no is not null and sales_history.invoice_date >= '2020-01-01' and sales_history.invoice_date <= '2020-12-31' GROUP BY sales_history.cust_no, sales_history_items.part_no),
mysummary as
(select MTDQty, YTDQty, coalesce(ytd.cust_no,mtd.cust_no) as cust_no,coalesce(ytd.part_no,mtd.part_no) as part_no
from ytd full outer join mtd on ytd.cust_no=mtd.cust_no and ytd.part_no=mtd.part_no)
select * from mysummary;
I believe that I have to nest another couple of aggregate queries in here that would group by cust_no, part_no, unit_price but then have those extended price totals (qty * unit_price) sum up by cust_no, part_no.
Any assistance would be greatly appreciated. Thanks!
Do this in one go with filter expressions:
with params as (
select '2020-01-01'::date as year, 4 as month
)
SELECT h.cust_no, i.part_no,
SUM(i.committed_qty) AS YTDQty,
SUM(i.committed_qty * i.unit_price) as YTDSales,
SUM(i.committed_qty) FILTER
(WHERE extract('month' from h.invoice_date) = p.month) as MTDQty,
SUM(i.committed_qty * i.unit_price) FILTER
(WHERE extract('month' from h.invoice_date) = p.month) as MTDSales
FROM params p
CROSS JOIN sales_history h
LEFT JOIN sales_history_items i
ON i.invoice_no = h.invoice_no
WHERE i.part_no is not null
AND h.invoice_date >= p.year
AND h.invoice_date < p.year + interval '1 year'
GROUP BY h.cust_no, i.part_no
If I follow you correctly, you can do conditional aggregation:
select sh.cust_no, shi.part_no,
sum(shi.qty) ytd_qty,
sum(shi.qty * shi.unit_price) ytd_sales,
sum(shi.qty) filter(where sh.invoice_date >= date_trunc('month', current_date)) mtd_qty,
sum(shi.qty * shi.unit_price) filter(where sh.invoice_date >= date_trunc('month', current_date)) mtd_sales
from sales_history sh
left join sales_history_items shi on sh.invoice_no = shi.invoice_no
where shi.part_no is not null and sh.invoice_date >= date_trunc('year', current_date)
group by sh.cust_no, shi.part_no
The logic is to filter on the current year, and use simple aggregation to compute the "year to date" figures. To get the "month to date" columns, we can just filter the aggregate functions.
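If your engine (or an older version) lacks the FILTER clause, a minimal equivalent sketch uses CASE expressions inside the aggregates (table and column names follow the answer above, which are themselves assumptions taken from the question):
select sh.cust_no, shi.part_no,
       sum(shi.qty) as ytd_qty,
       sum(shi.qty * shi.unit_price) as ytd_sales,
       -- the CASE returns 0 outside the current month, so the sum only counts month-to-date rows
       sum(case when sh.invoice_date >= date_trunc('month', current_date) then shi.qty else 0 end) as mtd_qty,
       sum(case when sh.invoice_date >= date_trunc('month', current_date) then shi.qty * shi.unit_price else 0 end) as mtd_sales
from sales_history sh
left join sales_history_items shi on sh.invoice_no = shi.invoice_no
where shi.part_no is not null
  and sh.invoice_date >= date_trunc('year', current_date)
group by sh.cust_no, shi.part_no;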

google bigQuery realtime does not match ga report

Hello, I would like to see the real-time data status using the Google BigQuery realtime table.
However, simple query statements do not match the GA reports. I created a query that shows the number of sessions per hour, but I got an error rate of 10 to 30%.
Is the accuracy of BigQuery realtime data not that good, or am I making a mistake?
WITH noDuplicateTable as (
SELECT
ARRAY_AGG(t ORDER BY exportTimeUsec DESC LIMIT 1)[OFFSET(0)].*
FROM
`tablename_20*` AS t
WHERE
_TABLE_SUFFIX = FORMAT_DATE("%y%m%d", CURRENT_DATE('Asia/Seoul'))
GROUP BY
t.visitKey
),
session as (
SELECT
ROW_NUMBER() OVER () sessionRow,
FORMAT_TIMESTAMP('%H', TIMESTAMP_SECONDS(time), 'Asia/Seoul') AS startTime,
sum(session) as session,
(sum(session) - sum(isVisit)) as uniqueSession,
(sum(isVisit) / sum(session) * 100) as bounce,
sum(totalPageView) as totalPageView
FROM (
SELECT
count(visitId) as session,
visitStartTime as time,
sum(IFNULL(totals.bounces, 0)) as isVisit,
sum(totals.pageviews) as totalPageView
FROM
noDuplicateTable
GROUP BY
visitStartTime
)
GROUP BY startTime
)
select * from session

Calculating cumulative returns using SQL

I currently generate a user's "monthly_return" between two months using the code below. How would I turn "monthly_return" into a cumulative "linked" return similar to the StackOverflow question linked below?
Similar question: Running cumulative return in sql
I tried:
exp(sum(log(1 + cumulative_return) over (order by date)) - 1)
But get the error:
PG::WrongObjectType: ERROR: OVER specified, but log is not a window function nor an aggregate function
LINE 3: exp(sum(log(1 + cumulative_return) over (order by date)) - 1... ^
: SELECT portfolio_id,
exp(sum(log(1 + cumulative_return) over (order by date)) - 1)
FROM (SELECT date, portfolio_id,
(value_cents * 0.01 - cash_flow_cents * 0.01) / (lag(value_cents * 0.01, 1) over ( ORDER BY portfolio_id, date)) - 1 AS cumulative_return
FROM portfolio_balances
WHERE portfolio_id = 16
ORDER BY portfolio_id, date) as return_data;
The input data would be:
1/1/2017: $100 value, $100 cash flow
1/2/2017: $100 value, $0 cash flow
1/3/2017: $100 value, $0 cash flow
1/4/2017: $200 value, $100 cash flow
The output would be:
1/1/2017: 0% cumulative return
1/2/2017: 0% cumulative return
1/3/2017: 0% cumulative return
1/4/2017: 0% cumulative return
My current code, which shows monthly returns that are not linked (cumulative):
SELECT
date,
portfolio_id,
(value_cents * 0.01 - cash_flow_cents * 0.01) / (lag(value_cents * 0.01, 1) over ( ORDER BY portfolio_id, date)) - 1 AS monthly_return
FROM portfolio_balances
WHERE portfolio_id = 16
ORDER BY portfolio_id, date;
If you want a cumulative sum:
SELECT p.*,
SUM(monthly_return) OVER (PARTITION BY portfolio_id ORDER BY date) as running_monthly_return
FROM (SELECT date, portfolio_id,
(value_cents * 0.01 - cash_flow_cents * 0.01) / (lag(value_cents * 0.01, 1) over ( ORDER BY portfolio_id, date)) - 1 AS monthly_return
FROM portfolio_balances
WHERE portfolio_id = 16
) p
ORDER BY portfolio_id, date;
I don't see that this makes much sense, because you have the cumulative sum of a ratio, but that appears to be what you are asking for.
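If what you actually want is the linked (multiplicative) cumulative return, a minimal sketch of one way to get it: the error in the question comes from attaching OVER to log() instead of to SUM(), and ln() is the natural log the exp() trick needs (this assumes 1 + monthly_return stays positive; the first row per portfolio comes out NULL because lag() has no previous row):
SELECT p.*,
       -- product of (1 + return) via exp(sum(ln(...))), windowed over date
       exp(SUM(ln(1 + monthly_return)) OVER (PARTITION BY portfolio_id ORDER BY date)) - 1 AS cumulative_return
FROM (SELECT date, portfolio_id,
             (value_cents * 0.01 - cash_flow_cents * 0.01) / (lag(value_cents * 0.01, 1) over (ORDER BY portfolio_id, date)) - 1 AS monthly_return
      FROM portfolio_balances
      WHERE portfolio_id = 16
     ) p
ORDER BY portfolio_id, date;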