I am using Firebase Analytics and have my event params in Google BigQuery. I ran this query:
SELECT * FROM `myProject.analytics_number.events_2021*`, UNNEST(event_params) AS param WHERE event_name = "Read_Free_Article" AND param.value.string_value = "X"
This gives me results like this:
Now I'd like to query multiple things. For example I'd like to query avg(timeSpend) to get the average timeSpend value and count(title) ... group by title to get the count of events that have the same title value.
But I don't understand how I can query by the different event_params.key values. I only managed to query by the event_name and param.value.string_value which just checks if any event_params.value.string_value has the desired value.
You can access the individual param values in the select part of the query like this:
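SELECT (SELECT ep.value.string_value
FROM UNNEST(event_params) ep
WHERE ep.key = 'title') title,
(SELECT ep.value.string_value
FROM UNNEST(event_params) ep
WHERE ep.key = 'publisher') publisher,
COUNT(1) count_titles,
-- aggregate the per-event value; a bare subquery over event_params cannot be used with GROUP BY
AVG((SELECT CAST(ep.value.string_value AS numeric)
FROM UNNEST(event_params) ep
WHERE ep.key = 'timeSpend')) avg_time_spend
FROM `myProject.analytics_number.events_2021*`
WHERE event_name = 'Read_Free_Article'
-- a SELECT alias is not visible in WHERE, so the publisher lookup is repeated here
AND (SELECT ep.value.string_value
FROM UNNEST(event_params) ep
WHERE ep.key = 'publisher') = 'X'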
GROUP BY
title,
publisher
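As a rough illustration of what the query above computes, here is a minimal Python sketch on made-up rows. The sample data and the `make_event`, `param` and `summarize` helpers are assumptions for illustration only; they are not part of BigQuery or the Firebase export.

```python
from collections import defaultdict


def make_event(name, title, publisher, time_spend):
    # Hypothetical row shape: event_params is a list of key/value pairs.
    return {
        "event_name": name,
        "event_params": [
            {"key": "title", "value": title},
            {"key": "publisher", "value": publisher},
            {"key": "timeSpend", "value": time_spend},
        ],
    }


# Made-up sample data, not real export rows.
events = [
    make_event("Read_Free_Article", "A", "X", "10"),
    make_event("Read_Free_Article", "A", "X", "20"),
    make_event("Read_Free_Article", "B", "X", "30"),
    make_event("Read_Free_Article", "A", "Y", "100"),
    make_event("Other_Event", "A", "X", "5"),
]


def param(event, key):
    """Look up one param by key; None when the event has no such key."""
    for p in event["event_params"]:
        if p["key"] == key:
            return p["value"]
    return None


def summarize(events, publisher):
    """Count events and average timeSpend per title for one publisher."""
    groups = defaultdict(list)
    for e in events:
        if e["event_name"] != "Read_Free_Article":
            continue
        if param(e, "publisher") != publisher:
            continue
        groups[param(e, "title")].append(param(e, "timeSpend"))
    summary = {}
    for title, raw in groups.items():
        # Like AVG in SQL, skip missing values when averaging.
        times = [float(v) for v in raw if v is not None]
        avg = sum(times) / len(times) if times else None
        summary[title] = (len(raw), avg)
    return summary


print(summarize(events, "X"))
```

The point of the sketch is that each param is looked up by its key per event, and the grouping then works on those looked-up values rather than on the raw array.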
Related
I don't understand the difference between these two queries:
SELECT event_timestamp ,user_pseudo_id, value.double_value as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
WHERE event_name = "purchase" and event_params.key = "tax"
The other query is:
SELECT event_timestamp ,user_pseudo_id,
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`
WHERE event_name = "purchase"
The first query returns 5,242 rows and the second returns 5,692. What is the mistake?
Thank you!
It depends on what you define as accurate. The row counts differ because of the way a missing tax field is handled. You can see the discrepancies by running the following query:
with unnested as (
SELECT event_timestamp ,user_pseudo_id, value.double_value as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
WHERE event_name = "purchase" and event_params.key = "tax"
)
SELECT events.event_timestamp ,events.user_pseudo_id,
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` events
LEFT JOIN unnested un
on events.event_timestamp=un.event_timestamp
and events.user_pseudo_id=un.user_pseudo_id
WHERE events.event_name = "purchase"
and un.event_timestamp is null
;
Pick a single record from that list and investigate it with the following two queries:
SELECT *
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`, UNNEST(event_params) as event_params
WHERE 1=1
-- and event_name = "purchase" and event_params.key = "tax"
and event_name = "purchase" and event_timestamp=1608955242902332 and user_pseudo_id='43627350.3807676886';
SELECT
*,
(SELECT value.double_value FROM UNNEST(event_params) WHERE key = "tax") as tax
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*` events
WHERE events.event_name = "purchase"
and event_timestamp=1608955242902332 and user_pseudo_id='43627350.3807676886'
;
The first query filters out the purchase events that have no tax parameter, while the second keeps them and returns a null tax value. If the number you want to report depends on a value being present in the tax field, 5,242 is the correct count.
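To make the difference concrete, here is a small Python sketch with made-up purchase events, one of which has no tax parameter. The data and the `lookup` helper are hypothetical and only mimic the shape of the two queries.

```python
# Made-up purchase events; "u2" has no "tax" param at all.
purchases = [
    {"user": "u1", "event_params": [{"key": "tax", "value": 1.5},
                                    {"key": "currency", "value": "USD"}]},
    {"user": "u2", "event_params": [{"key": "currency", "value": "USD"}]},
    {"user": "u3", "event_params": [{"key": "tax", "value": 0.0}]},
]

# First query style: cross join every event with its params, then keep
# only the joined rows where key = "tax". Events without that key vanish.
flattened = [
    (e["user"], p["value"])
    for e in purchases
    for p in e["event_params"]
    if p["key"] == "tax"
]


def lookup(event, key):
    """Scalar-subquery style lookup: one value per event, None if absent."""
    return next((p["value"] for p in event["event_params"] if p["key"] == key), None)


# Second query style: one output row per event, tax may be None (NULL).
per_event = [(e["user"], lookup(e, "tax")) for e in purchases]

print(len(flattened), flattened)
print(len(per_event), per_event)
```

The flattened list has one row fewer because the event without a tax key produces no joined row, which is the same effect as the lower count from the first query.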
I need help with a SQL query in BigQuery. I want to know which event a user performed just before uninstalling the app (event_name = 'app_remove').
I am trying to get 'event_timestamp', 'event_name', 'event_params.key', 'event_value.string_value' for the event just before a user triggers event_name = 'app_remove'.
My data looks like this: (there's a column named 'user_pseudo_id' not visible in pic)
I used below query to get the 'user_pseudo_id', 'event_params.key', 'event_value.string_value' associated with users who did event_name = 'app_remove'.
SELECT
TIMESTAMP_MICROS(event_timestamp),
user_pseudo_id,
event_name,
e.key,
e.value.string_value
FROM
`privatedata.events_20201129`, unnest(event_params) as e
WHERE
user_pseudo_id in
(
SELECT
user_pseudo_id
FROM
`privatedata.events_20201129`
WHERE
event_name = 'app_remove')
However, I only want the event (and its parameters) that was performed just before the 'app_remove' event.
Any help for me will be highly appreciated.
Thanks and Regards,
Shantanu Jain
Hmmm . . . I am thinking of using window functions, since event_name is a column of the table rather than a field inside event_params:
SELECT p.*
FROM (SELECT p.*,
-- name of the next event of the same user, ordered by time
LEAD(event_name) OVER (PARTITION BY user_pseudo_id
ORDER BY event_timestamp
) as next_event_name
FROM `privatedata.events_20201129` p
) p
WHERE next_event_name = 'app_remove'
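The "event just before app_remove" idea can also be sketched in plain Python. This assumes event_name, user_pseudo_id and event_timestamp are top-level columns and that each user's events are ordered by event_timestamp; the sample rows and the `events_before_remove` function are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Made-up rows; only the columns needed for the idea are included.
events = [
    {"user_pseudo_id": "u1", "event_timestamp": 1, "event_name": "session_start"},
    {"user_pseudo_id": "u1", "event_timestamp": 3, "event_name": "app_remove"},
    {"user_pseudo_id": "u1", "event_timestamp": 2, "event_name": "screen_view"},
    {"user_pseudo_id": "u2", "event_timestamp": 1, "event_name": "screen_view"},
    {"user_pseudo_id": "u2", "event_timestamp": 2, "event_name": "purchase"},
    {"user_pseudo_id": "u3", "event_timestamp": 5, "event_name": "app_remove"},
]


def events_before_remove(events):
    """Return, per user, the event immediately preceding each app_remove."""
    ordered = sorted(events, key=itemgetter("user_pseudo_id", "event_timestamp"))
    result = []
    for _, group in groupby(ordered, key=itemgetter("user_pseudo_id")):
        user_events = list(group)
        # Pair each event with the one that follows it for the same user.
        for prev, cur in zip(user_events, user_events[1:]):
            if cur["event_name"] == "app_remove":
                result.append(prev)
    return result


for row in events_before_remove(events):
    print(row["user_pseudo_id"], row["event_name"])
```

A user whose only event is app_remove (u3 here) yields nothing, because there is no earlier event to return.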
I'm new to BQ/SQL.
I'm using a Google Analytics dataset to pull a COUNT in the same query for two things:
Total event_name hits
Total event_name hits where a particular criterion was met
This is my query so far. How can I improve line #3 so that the second count occurs as a nested WHERE function while the first count queries the full table? Thanks.
SELECT
COUNT (event_name) AS total_events,
COUNT (event_name) AS goal WHERE event_name = 'visited x page',
FROM `foodotcom-app-plus-web.analytics_1234567.events_20200809`
Below is for BigQuery Standard SQL
#standardSQL
SELECT
COUNT(event_name) AS total_events,
COUNTIF(event_name = 'visited x page') AS goal
FROM `project.dataset.table`
select
count(event_name) as total_events,
count(case when event_name = 'visited x page' then event_name else null end) as goal
from `project.dataset.table`
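If you want to try the conditional count without a BigQuery project, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table and its rows are made up, and since COUNTIF is BigQuery-specific, the sketch uses the portable CASE form.

```python
import sqlite3

# In-memory SQLite database standing in for the events table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_name TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [("visited x page",), ("visited y page",), ("visited x page",), ("login",)],
)

# COUNT skips NULLs, so the CASE without ELSE counts only matching rows.
total_events, goal = conn.execute(
    """
    SELECT COUNT(event_name) AS total_events,
           COUNT(CASE WHEN event_name = 'visited x page' THEN event_name END) AS goal
    FROM events
    """
).fetchone()

print(total_events, goal)
```

Both counts come from a single pass over the table, so no nested WHERE is needed.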
I have a table named PublishedData, see image below
I'm trying to get the output like, below image
I think you can use a query like this:
SELECT dt.DistrictName, ISNULL(dt.Content, 'N/A') Content, dt.UpdatedDate, mt.LastPublished, mt.Unpublished
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY DistrictName ORDER BY UpdatedDate DESC, ISNULL(Content, 'zzzzz')) seq
FROM PublishedData) dt
INNER JOIN (
SELECT DistrictName, MAX(LastPublished) LastPublished, COUNT(CASE WHEN IsPublished = 0 THEN 1 END) Unpublished
FROM PublishedData
GROUP BY DistrictName) mt
ON dt.DistrictName = mt.DistrictName
WHERE
dt.seq = 1;
This is because I think you order by UpdatedDate and then Content to get your first two columns.
Check out something like this (I don't have your tables, but you will get the idea where to follow with your query):
SELECT DistrictName,
MAX(UpdatedDate) AS UpdatedDate,
MAX(LastPublished) AS LastPublished,
(
SELECT COUNT(*)
FROM PublishedData inr
WHERE inr.DistrictName = outr.DistrictName
AND inr.IsPublished = 0
) AS Unpublished
FROM PublishedData outr
GROUP BY DistrictName
The PublishedData table would need a unique identity column to produce that output, because the latest content cannot be determined from the given schema.
If you want the data other than Content, that is DistrictName, UpdatedDate, LastPublished and the count of unpublished records, please use the query below:
select T1.DistrictName,T1.UpdatedDate,T1.LastPublished,T2.Unpublished from
(select DistrictName,Max(UpdatedDate) as UpdatedDate,Max(LastPublished) as LastPublished from PublishedData group by DistrictName) T1
inner join
(select DistrictName,count(IsPublished) as Unpublished from PublishedData where isPublished=0 group by DistrictName) T2 ON T1.DistrictName=T2.DistrictName ORDER BY T2.Unpublished DESC
I have a table metrics which has the following columns :
stage name
---------------------
new member
new member
old member
new visitor
old visitor
Now I can find out how many new or old members are there by running a query like this :
select stage, count(*) from metrics where name = 'member' group by stage;
This will give me the following result:
stage count
-----------
new 2
old 1
But along with this I want output like this :
total stage count
------------------
3 new 2
3 old 1
Total is the sum of all rows satisfying the where clause above. How do I need to modify my previous query to get the result I need? Thanks.
You can do something like this:
with t as
(select stage from metrics where name = 'member')
select
(select count(*) from t) as total,
stage, count(*)
from t
group by stage
Check it: http://sqlfiddle.com/#!15/b97a4/9
This is a compact variant and includes the 'member' constant only once.
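For a quick local check of the CTE variant, here is a minimal sketch using Python's built-in sqlite3 module with the sample rows from the question. SQLite is only a stand-in here (the fiddle runs on a different engine), and the stage_count alias and ORDER BY are added for a stable, readable result.

```python
import sqlite3

# In-memory SQLite database loaded with the rows shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (stage TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?)",
    [
        ("new", "member"),
        ("new", "member"),
        ("old", "member"),
        ("new", "visitor"),
        ("old", "visitor"),
    ],
)

# The CTE filters once; the scalar subquery counts all filtered rows,
# while the outer query counts them per stage.
rows = conn.execute(
    """
    WITH t AS (SELECT stage FROM metrics WHERE name = 'member')
    SELECT (SELECT COUNT(*) FROM t) AS total,
           stage,
           COUNT(*) AS stage_count
    FROM t
    GROUP BY stage
    ORDER BY stage
    """
).fetchall()

for total, stage, stage_count in rows:
    print(total, stage, stage_count)
```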
The variant using a window function:
with member as (
select stage, count(*)
from metrics where name = 'member'
group by stage
)
select sum(count) over () as total, member.*
from member
http://sqlfiddle.com/#!15/b97a4/18
This can do what you want:
SELECT t2.totalCount,
t1.stage,
t1.stageCount
FROM
(SELECT stage,
COUNT(*) stageCount
FROM metrics
WHERE name = 'member'
GROUP BY stage
) t1,
(SELECT COUNT(*) AS totalCount FROM metrics WHERE name = 'member'
) t2;
See sqlfiddle http://sqlfiddle.com/#!2/0240b/5.
An advantage of this approach in comparison to using the subquery is that the sql finding the total count will not be run for each row of the sql defining the count by stage.
Try this code:
select (select count(*) as total from metrics where name = 'member' group by name),stage, count(*) from metrics where name = 'member' group by stage;