I am trying to calculate the average time our customers spend during the checkout process using Google BigQuery.
As I am very new to SQL and BigQuery, I am running the following query, trying to get the timestamp for checkout start, the timestamp for checkout completion, and then calculate the average.
SELECT
month, chekout_start as Checkout Started,
time_to_transaction as Checkout
FROM ((SELECT
MONTH(TIMESTAMP(date)) AS month,
TIME(AVG(TimeToCheckout)) AS time_to_transaction
FROM (
SELECT
date,
fullVisitorId,
timestamp(integer(visitStartTime*1000000)) as start_time,
timestamp(integer(visitStartTime*1000000 + hits.time*1000)) as hit_time,
(TIMESTAMP_TO_SEC(timestamp(integer(visitStartTime*1000000 + hits.time*1000))) - TIMESTAMP_TO_SEC(timestamp(integer(visitStartTime*1000000)) ) ) AS TimeToCheckout
FROM (TABLE_DATE_RANGE([data.ga_sessions_],
TIMESTAMP('2018-01-01'), TIMESTAMP('2018-12-31')))
WHERE totals.transactions>=1
)
GROUP BY month) transaction
INNER JOIN
(
SELECT
MONTH(TIMESTAMP(date)) AS month,
TIME(AVG(TimeToCheckout)) AS checkout_start
FROM (
SELECT
date,
fullVisitorId,
timestamp(integer(visitStartTime*1000000)) as start_time,
timestamp(integer(visitStartTime*1000000 + hits.time*1000)) as hit_time,
(TIMESTAMP_TO_SEC(timestamp(integer(visitStartTime*1000000 + hits.time*1000))) - TIMESTAMP_TO_SEC(timestamp(integer(visitStartTime*1000000)) ) ) AS TimeToCheckout
FROM (TABLE_DATE_RANGE([data.ga_sessions_],
TIMESTAMP('2018-01-01'), TIMESTAMP('2018-12-31')))
WHERE (hits.page.pagePath = 'checkout/buy')
)
GROUP BY month
) checkout_start
ON transaction.month = checkout_start.month)
ORDER BY month ASC
The desired outcome looks like this:
However, I am getting the error 'Encountered " "transaction "" at line 18, column 17. Was expecting: ")" ...'. Can you please have a look at my code and explain what I am doing wrong? Thanks!
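A likely cause: transaction is a reserved word in BigQuery legacy SQL, so the parser stops as soon as it reaches that subquery alias; the spaced alias (Checkout Started) and the chekout_start / checkout_start mismatch in the outer SELECT would be the next problems. A minimal sketch of just the outer query with an unreserved alias, assuming the two inner sub-selects are kept exactly as written and are valid on their own:
SELECT
  trans.month AS month,
  cs.checkout_start AS checkout_started,
  trans.time_to_transaction AS checkout
FROM (
  -- first sub-select (monthly AVG of TimeToCheckout where totals.transactions >= 1), unchanged
) trans
INNER JOIN (
  -- second sub-select (monthly AVG of TimeToCheckout for hits on 'checkout/buy'), unchanged
) cs
ON trans.month = cs.month
ORDER BY month ASC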
I am trying to use ORDER BY in BigQuery to sort my query results. What I want to do is order the results based on the week number of the year, but it doesn't seem to work, nor does it show any kind of syntax issue.
SELECT *
FROM (
  SELECT
    concat(cast(EXTRACT(week FROM elt.event_datetime) as string), ', ', extract(year from elt.event_datetime)) WEEK,
    elt.msg_source SOURCE,
    (elt.source_timedelta_s_ + elt.pipeline_timedelta_s_) Latency
  FROM <table> elt
  JOIN <table1> ai ON elt.msg_id = ai.msg_id
  WHERE ai.report_type <> 'PFR'
    AND EXTRACT(date FROM elt.event_datetime) > extract(date from (date_sub(current_timestamp(), INTERVAL 30 day)))
  ORDER BY WEEK desc
) PIVOT (AVG(Latency) FOR SOURCE IN ('FLYHT', 'SMTP')) t
Basically, I want my results as they are numbered in green in the image below.
Can someone check what the issue is?
SELECT *
FROM (
  SELECT
    concat(cast(EXTRACT(week FROM elt.event_datetime) as string), ', ', extract(year from elt.event_datetime)) WEEK,
    elt.msg_source SOURCE,
    (elt.source_timedelta_s_ + elt.pipeline_timedelta_s_) Latency
  FROM <table> elt
  JOIN <table1> ai ON elt.msg_id = ai.msg_id
  WHERE ai.report_type <> 'PFR'
    AND EXTRACT(date FROM elt.event_datetime) > extract(date from (date_sub(current_timestamp(), INTERVAL 30 day)))
) PIVOT (AVG(Latency) FOR SOURCE IN ('FLYHT', 'SMTP')) t
ORDER BY (select RIGHT(t.WEEK, 4)) desc, (select regexp_substr(t.WEEK, '[^,]+')) desc
as suggested by @Shipra Sarkar in the comments.
I'm trying to figure out a way to calculate two new columns for the following model:
Link: http://sqlfiddle.com/#!7/0a6ce9/1
A) First column: the "OPENING_BALANCE". It needs to be the accumulated sum of the "AMOUNT" column, starting from whenever the transactions begin (the first transaction date could be any date).
B) Second column: the "CLOSING_BALANCE". This one will always be the sum of the "OPENING_BALANCE" from the previous day + the "AMOUNT" of the current day. So, from the second TRANSACTION_DATE onward, the "OPENING_BALANCE" will always be the "CLOSING_BALANCE" from the previous day.
Here's an example:
Can anyone share any examples of how I could achieve this?
You can use the SUM() window function and a CASE expression to check for the 1st transaction:
SELECT *,
CASE
WHEN ROW_NUMBER() OVER (ORDER BY TRANSACTION_DATE, TRANSACTION_ID) = 1 THEN AMOUNT
ELSE SUM(AMOUNT) OVER (ORDER BY TRANSACTION_DATE, TRANSACTION_ID) - AMOUNT
END OPENING_BALANCE,
SUM(AMOUNT) OVER (ORDER BY TRANSACTION_DATE, TRANSACTION_ID) CLOSING_BALANCE
FROM TRANSATION_TABLE;
See the demo.
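To illustrate what the two balance columns look like, here is a hypothetical three-row example (made-up values, not the fiddle's data):
INSERT INTO TRANSATION_TABLE (TRANSACTION_ID, TRANSACTION_DATE, AMOUNT) VALUES
  (1, '2021-01-01', 100),
  (2, '2021-01-02', 50),
  (3, '2021-01-03', -30);
-- The query above then returns (balance columns shown):
-- TRANSACTION_ID | OPENING_BALANCE | CLOSING_BALANCE
-- 1              | 100             | 100
-- 2              | 100             | 150
-- 3              | 150             | 120
Each day opens with the previous day's closing balance, and the very first row opens with its own amount.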
select f.TRANSACTION_ID, f.TRANSACTION_DATE, f.AMOUNT,
       CASE WHEN sum(s.AMOUNT) - f.AMOUNT > 0
            THEN sum(s.AMOUNT) - f.AMOUNT
            ELSE s.AMOUNT END AS OPENING_BALANCE,
       sum(s.AMOUNT) as CLOSING_BALANCE
from TRANSATION_TABLE f
inner join TRANSATION_TABLE s on f.TRANSACTION_ID >= s.TRANSACTION_ID
group by f.TRANSACTION_ID, f.TRANSACTION_DATE, f.AMOUNT
order by f.TRANSACTION_ID asc
I have year, month, day of month, and hour columns, as well as a sales column. The data covers four years. How do I create lag variables for sales for the same month, day of month, and hour in the next year?
SELECT
[UtilityName],
[CustomerID],
[DT_EST],
[Date_Raw],
[Hour_Raw],
[EPT_Year],
[EPT_month],
[EPT_DayNum],
[EPT_Hour24],
[Sales],
lag([Sales]) over( partition by [UtilityName] ,[CustomerID],[EPT_month],
[EPT_DayNum],[EPT_Hour24] order by [DT_EST] ) as lag_Sales
FROM [dbo].[table]
I would suggest using a left join instead:
SELECT t.*, tprev.Sales as prev_year_sales
FROM [dbo].[table] t LEFT JOIN
[dbo].[table] tprev
ON tprev.UtilityName = t.UtilityName AND
tprev.CustomerId = t.CustomerId AND
tprev.EPT_Year = t.EPT_Year - 1 AND
tprev.EPT_month = t.EPT_month AND
tprev.EPT_DayNum = t.EPT_DayNum AND
tprev.EPT_Hour24 = t.EPT_Hour24;
You must also partition by month, day, and hour, and order by year, so that lag returns the previous year's value:
lag([Sales])
over(partition by
[UtilityName] ,[CustomerID],
[EPT_month], [EPT_DayNum], [EPT_Hour24]
order by [EPT_Year]
) as lag_Sales
I have to draft a SQL query which does the following:
Compare the current week's amount (e.g. week 10) to the average amount over the previous 4 weeks (weeks 9, 8, 7, 6).
Now I need to run the query on a monthly basis, so, say, for weeks 10, 11, 12, and 13.
As of now I am running it four times, supplying the week parameter on each run.
For example, my current query is something like this:
select account_id, curr.amount,hist.AVG_Amt
from
(
select
to_char(run_date,'IW') Week_ID,
sum(amount) Amount,
account_id
from Transaction t
where to_char(run_date,'IW') = '10'
group by account_id,to_char(run_date,'IW')
) curr,
(
select account_id,
sum(amount) / count(to_char(run_date,'IW')) as AVG_Amt
from Transactions
where to_char(run_date,'IW') in ('6','7','8','9')
group by account_id
) hist
where
hist.account_id = curr.account_id
and curr.amount > 2*hist.AVG_Amt;
As you can see, if I have to run the above query for weeks 11, 12, and 13, I have to run it three separate times. Is there a way to consolidate or structure the query so that I only run it once and get all of the comparison data together?
Just as additional info: I need to export the data to Excel (which I do from PL/SQL Developer after running the query).
Thanks!
-Abhi
You can use a correlated sub-query to get the sum of amounts for the last 4 weeks for a given week.
select
to_char(run_date,'IW') Week_ID,
sum(amount) curAmount,
(select sum(amount)/4.0 from transaction
where account_id = t.account_id
and to_char(run_date,'IW') between to_char(t.run_date,'IW')-4
and to_char(t.run_date,'IW')-1
) hist_amount,
account_id
from Transaction t
where to_char(run_date,'IW') in ('10','11','12','13')
group by account_id,to_char(run_date,'IW')
Edit: Based on the OP's comment about the performance of the query above, this can also be accomplished using lag to get the previous rows' values. The count of weeks actually present among the last 4 weeks can be obtained with case expressions.
with sum_amounts as
  (select to_char(run_date,'IW') wk, sum(amount) amount, account_id
   from Transaction
   group by account_id, to_char(run_date,'IW')
  )
select wk, account_id, amount,
       1.0 * (lag(amount,1,0) over (partition by account_id order by wk) +
              lag(amount,2,0) over (partition by account_id order by wk) +
              lag(amount,3,0) over (partition by account_id order by wk) +
              lag(amount,4,0) over (partition by account_id order by wk))
       / (case when lag(amount,1,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
          case when lag(amount,2,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
          case when lag(amount,3,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end +
          case when lag(amount,4,0) over (partition by account_id order by wk) <> 0 then 1 else 0 end)
       as hist_avg_amount
from sum_amounts
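One caveat: if none of the four previous weeks has a non-zero amount, the denominator above is 0 and Oracle raises ORA-01476 (divisor is equal to zero). Wrapping that parenthesized sum of case expressions in NULLIF(..., 0) makes such rows return NULL instead, because dividing by NULL yields NULL:
select 10 / nullif(0, 0) as safe_div from dual;  -- NULL rather than ORA-01476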
I think this is what you are looking for:
with lagt as (
  select to_char(run_date,'IW') Week_ID, sum(amount) Amount, account_id
  from Transaction t
  group by account_id, to_char(run_date,'IW')
)
select Week_ID, account_id, amount,
       (lag(amount,1,0) over (partition by account_id order by Week_ID) +
        lag(amount,2,0) over (partition by account_id order by Week_ID) +
        lag(amount,3,0) over (partition by account_id order by Week_ID) +
        lag(amount,4,0) over (partition by account_id order by Week_ID)) / 4 as average
from lagt;
I'm attempting to write a query that will return any customer that has multiple work orders falling on different days of the week. Every work order for a given customer should fall on the same day of the week, so I want to find where this is not the case so I can fix it.
The name of the table is Core.WorkOrder, and it contains a column called CustomerId that specifies which customer each work order belongs to. There is a column called TimeWindowStart that can be used to see which day each work order falls on (I'm using DATENAME(weekday, TimeWindowStart) to do so).
Any ideas how to write this query? I'm stuck here.
Thanks!
Select ...
From WorkOrder As W
Where Exists (
Select 1
From WorkOrder As W1
Where W1.CustomerId = W.CustomerId
And DatePart( dw, W1.TimeWindowStart ) <> DatePart( dw, W.TimeWindowStart )
)
SELECT *
FROM (
SELECT *,
COUNT(dp) OVER (PARTITION BY CustomerID) AS cnt
FROM (
SELECT DISTINCT CustomerID, DATEPART(dw, TimeWindowStart) AS dp
FROM workOrder
) q
) q
WHERE cnt >= 2
SELECT CustomerId,
MIN(DATENAME(weekday, TimeWindowStart)),
MAX(DATENAME(weekday, TimeWindowStart))
FROM Core.WorkOrder
GROUP BY CustomerId
HAVING MIN(DATENAME(weekday, TimeWindowStart)) != MAX(DATENAME(weekday, TimeWindowStart))
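If the goal is to then correct the offending rows, the grouped result can be joined back to the table to list every work order for those customers (a sketch reusing the same table and columns as above; adjust the select list as needed):
SELECT w.*
FROM Core.WorkOrder AS w
JOIN (
    SELECT CustomerId
    FROM Core.WorkOrder
    GROUP BY CustomerId
    HAVING MIN(DATENAME(weekday, TimeWindowStart)) != MAX(DATENAME(weekday, TimeWindowStart))
) AS flagged
    ON flagged.CustomerId = w.CustomerId
ORDER BY w.CustomerId, w.TimeWindowStart;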