Computing conversion rate by counting TRUE/FALSE statements - sql

Sorry if this sounds very basic; bear with me.
I need to determine the conversion rate of 3 ads, each representing a product; that is, the number of subscriptions divided by the number of people who clicked the ad.
COLUMNS:
person_id - unique identifier of the person
date - date they were shown the ad
ad_id - content of the ad: ad_1_product1, ad_2_product2, or ad_3_product3
clicked (TRUE/FALSE) - clicked on the ad
signed_up - (TRUE/FALSE) created an account
subscribed (TRUE/FALSE) - started a paid subscription
I set clicked, signed_up and subscribed as boolean.
MY CODE:
SELECT ad_id, (count(subscribed) / count(clicked)) as CR
FROM videoadcampaign
WHERE subscribed = 'TRUE' AND clicked = 'TRUE'
GROUP BY ad_id;
Of course, the code above gives me a ratio of 1, because the WHERE clause keeps only the rows where both flags are TRUE, so both counts end up being the same number.
I am totally stuck.
I will also need to calculate other KPIs for clicks and signed_up, so filtering those booleans and putting them into a ratio is the core of what I need to do.
Is there a way I can tell SQL to compute CR = SUBS (TRUE) / SUBS (TRUE + FALSE) [or total count] and then filter by CLICK = TRUE?
Thank you tons for your help!

It depends on your database, but the general notion would be:
SELECT ad_id,
(SUM(CASE WHEN subscribed = 'TRUE' THEN 1.0 ELSE 0 END) /
SUM(CASE WHEN clicked = 'TRUE' THEN 1 ELSE 0 END)
) as CR
FROM videoadcampaign
GROUP BY ad_id;
In many databases, you can do something like this if the columns are integers (0 = false, 1 = true). Multiplying by 1.0 avoids integer division in databases that would otherwise truncate the result:
SELECT ad_id, SUM(subscribed) * 1.0 / SUM(clicked) as CR
FROM videoadcampaign
WHERE clicked = 1
GROUP BY ad_id;
Or even:
SELECT ad_id, AVG(subscribed * 1.0) as CR
FROM videoadcampaign
WHERE clicked = 1
GROUP BY ad_id;
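As a rough, runnable illustration of the conditional-aggregation idea above, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, the flags are stored as integers (SQLite has no separate boolean type), and the NULLIF guard against ads with zero clicks is an addition that is not part of the queries above.

```python
import sqlite3

# Hypothetical sample data: (person_id, ad_id, clicked, signed_up, subscribed),
# with 1 = TRUE and 0 = FALSE.
rows = [
    (1, "ad_1_product1", 1, 1, 1),
    (2, "ad_1_product1", 1, 1, 0),
    (3, "ad_1_product1", 0, 0, 0),
    (4, "ad_2_product2", 1, 0, 0),
    (5, "ad_2_product2", 1, 1, 1),
    (6, "ad_2_product2", 1, 1, 0),
    (7, "ad_2_product2", 1, 0, 0),
    (8, "ad_3_product3", 0, 0, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE videoadcampaign ("
    "person_id INTEGER, ad_id TEXT, clicked INTEGER, "
    "signed_up INTEGER, subscribed INTEGER)"
)
conn.executemany("INSERT INTO videoadcampaign VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT ad_id,
       SUM(clicked)    AS clicks,
       SUM(subscribed) AS subs,
       -- 1.0 forces decimal division; NULLIF turns zero clicks into NULL
       -- so the division yields NULL instead of failing or misleading.
       SUM(subscribed) * 1.0 / NULLIF(SUM(clicked), 0) AS cr
FROM videoadcampaign
GROUP BY ad_id
ORDER BY ad_id
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

No WHERE clause is needed here because each SUM only counts the rows where its own flag is set; the same pattern should extend to the other KPIs (for example signed_up per click) by swapping the column in the numerator.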

Related

SQL select column group by where the ratio of a value is 1

I am using PSQL.
I have a table with a few columns. One column, event, can have 4 different values: X1, X2, Y1, Y2. Another column holds the name of the service, and I want to group by that column.
My goal is to write a query that takes an event and verifies that, for a specific service name, count(X1) == count(X2); if not, it should display a new column with "error".
Is this even possible? I am kind of new to SQL and not sure how to write this.
So far I tried something like this:
select
service_name, event, count(service_name)
from
service_table st
group by
(service_name, event);
I am getting the count of each event for a specific service_name, but I would like to verify that the count of event 1 == the count of event 2 for each service_name.
I want to add that each service_name can only have 2 different events.
You may not need a subquery/CTE for this, but it will work (and makes the logic easier to follow):
WITH event_counts_by_service AS (SELECT
service_name
, COUNT(CASE WHEN event='X1' THEN 1 END) AS count_x1
, COUNT(CASE WHEN event='X2' THEN 1 END) AS count_x2
FROM service_table
GROUP BY service_name)
SELECT service_name
, CASE WHEN count_x1=count_x2 THEN NULL ELSE 'Error' END AS are_counts_equal
FROM event_counts_by_service
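For comparison, here is a minimal sketch of the same check written without the CTE, run against made-up rows through Python's sqlite3 module (the question uses PostgreSQL; SQLite is only an assumption to keep the example self-contained). The service names and the status column name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_table (service_name TEXT, event TEXT)")
# Hypothetical rows: svc_a is balanced (2 x X1, 2 x X2), svc_b is not (2 x X1, 1 x X2).
conn.executemany(
    "INSERT INTO service_table VALUES (?, ?)",
    [
        ("svc_a", "X1"), ("svc_a", "X2"), ("svc_a", "X1"), ("svc_a", "X2"),
        ("svc_b", "X1"), ("svc_b", "X1"), ("svc_b", "X2"),
    ],
)

query = """
SELECT service_name,
       -- COUNT ignores NULLs, so each CASE counts only the matching event
       COUNT(CASE WHEN event = 'X1' THEN 1 END) AS count_x1,
       COUNT(CASE WHEN event = 'X2' THEN 1 END) AS count_x2,
       CASE WHEN COUNT(CASE WHEN event = 'X1' THEN 1 END)
               = COUNT(CASE WHEN event = 'X2' THEN 1 END)
            THEN NULL ELSE 'Error' END AS status
FROM service_table
GROUP BY service_name
ORDER BY service_name
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

The single-query form repeats the two COUNT expressions, which is the readability cost the CTE version avoids.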

BigQuery Google Data Transfer Impressions & Cost Don't Match Google Ads UI

Disclaimer...I'm a Noob
I am writing a query from CampaignStats table that aggregates based on a stripped campaign label. The query returns correct values for all metrics except Impressions and Cost. No matter what I've tried so far, this figure still doesn't match. Here are the totals for two of my campaigns from yesterday (June 17th):
CampaignStats:
Date       label      Impressions  cost        clicks  avg_cpc
6/17/2022  sat_brand  2687         140.472666  15      9.3648444
CampaignBasicStats:
Date       label      Impressions  cost        clicks  avg_cpc
6/17/2022  sat_brand  699          152.620961  15      10.17473073
Using the CampaignBasicStats table, I get aggregated totals for all metrics that match the UI, including Impressions and Cost. The issue is that there are metrics in CampaignStats, so some insight into what I may be doing incorrectly will help in the future.
I did a JOIN with the Campaign table originally; the below query refers to a permanent table that I pulled out separately in case this was a cause of the discrepancy.
Code Below:
SELECT
cs.Date,
EXTRACT(ISOWEEK FROM cs.DATE) AS isoweek,
cl.label,
(SUM(cs.Cost) / 1000000) AS cost,
SUM(cs.Clicks) AS clicks,
CASE WHEN SUM(Clicks)=0 OR SUM(Cost)=0 THEN 0 ELSE
((SUM(Cost)/SUM(Clicks))/1000000) END AS avg_cpc,
SUM(cs.Impressions) AS Impressions,
CASE WHEN SUM(cs.Clicks)=0 THEN 0 ELSE
(SUM(cs.Clicks)/SUM(cs.Impressions)) END AS ctr,
SUM(cs.Conversions) as conversions,
CASE WHEN SUM(cs.Conversions)=0 OR SUM(cs.Clicks)=0 THEN 0 ELSE
(SUM(cs.Conversions)/SUM(cs.Clicks)) END AS cvr,
CASE WHEN SUM(cs.Conversions)=0 OR SUM(cs.Cost)=0 THEN 0 ELSE
(SUM(cs.Cost)/SUM(cs.Conversions))/1000000 END AS cost_per_conversion
FROM
`bold-quanta-######.######_google_ads_dataset.CampaignStats_##########` cs
JOIN
`bold-quanta-######.queried_permanent_tables.process_campaign_labels` cl
ON
cs.CampaignId = cl.CampaignId
GROUP BY
1, 2, 3

How to exclude users with multiple options sql

Trying to retrieve just users that don't have a disabled campaign, where disabled = 1.
A user can have a disabled campaign and a non-disabled campaign, but if they have any disabled campaigns I want to exclude them from my final result.
Thinking I need something like
SELECT DISTINCT
user_id,
CASE
WHEN
disabled = 1
THEN
'Disabled'
ELSE
'Good'
END
AS campaign_disabled
But this just returns two rows for each user_id, one with campaign_disabled = 'Good' and the other with campaign_disabled = 'Disabled'.
You want the users that have disabled = 0 for all their campaigns, so the max value of the column disabled must be 0:
SELECT user_id
FROM tablename
GROUP BY user_id
HAVING MAX(disabled) = 0
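To see the HAVING MAX(disabled) = 0 filter in action, here is a small sketch run through Python's sqlite3 module. The table name campaigns, its columns, and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per campaign, disabled is 0 or 1.
conn.execute(
    "CREATE TABLE campaigns (user_id INTEGER, campaign_id INTEGER, disabled INTEGER)"
)
conn.executemany(
    "INSERT INTO campaigns VALUES (?, ?, ?)",
    [
        (1, 10, 0), (1, 11, 0),   # user 1: no disabled campaigns
        (2, 20, 0), (2, 21, 1),   # user 2: one disabled campaign -> excluded
        (3, 30, 1),               # user 3: only a disabled campaign -> excluded
        (4, 40, 0),               # user 4: no disabled campaigns
    ],
)

query = """
SELECT user_id
FROM campaigns
GROUP BY user_id
-- the largest disabled value per user is 0 only if no campaign is disabled
HAVING MAX(disabled) = 0
ORDER BY user_id
"""
results = conn.execute(query).fetchall()
print(results)
```

Grouping first and filtering on the aggregate is what collapses the "two rows per user" from the CASE attempt into a single yes/no decision per user.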

How can I count field value based on another field?

I have an EmailEvent table. It is an STI table that stores opens and clicks (distinguished by the type attribute). Right now I have this code, which reports how many emails use each recipient address (email_address field):
EmailEvent.where("type = 'OpenEmailEvent' OR type = 'ClickEmailEvent'")
.where(email_id: email_ids).happened_between_dates(start_date, end_date)
.select("email_address, count(distinct email_id) as emails_count")
.group(:email_address).reorder(nil)
It generates this SQL:
SELECT email_address, count(distinct email_id) as emails_count
FROM "email_events"
WHERE "email_events"."deleted_at" IS NULL
AND (type = 'OpenEmailEvent' OR type = 'ClickEmailEvent')
AND "email_events"."email_id" IN (85487, 75554, 85489, 77184, 78562, 75587, 82879, 85535, 85534)
AND (created_at between '2017-02-28 22:52:01.000000' AND '2017-03-29 05:59:59.000000')
GROUP BY "email_events"."email_address"
I need to provide information on how many opens and clicks there are per email_address. I can't use count(id) because it would give me opens and clicks combined, so I need to use the type attribute somehow, but I can't figure out exactly how it should work. Can somebody suggest what I should do here?
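Replace this
count(distinct email_id) as emails_count
with something like the following, using CASE (or IIF in SQL Server). The [Opens] bracket-style aliases are SQL Server syntax; in other databases use "as opens" and "as clicks" instead:
sum(case when type = 'OpenEmailEvent' then 1 else 0 end) [Opens],
sum(case when type = 'ClickEmailEvent' then 1 else 0 end) [Click]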
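Here is a minimal sketch of that per-type counting, run through Python's sqlite3 module with portable "AS" aliases. The table is reduced to three columns and the rows (including the BounceEmailEvent type) are invented for illustration; the deleted_at and date filters from the generated query are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of the email_events table.
conn.execute(
    "CREATE TABLE email_events (email_address TEXT, email_id INTEGER, type TEXT)"
)
conn.executemany(
    "INSERT INTO email_events VALUES (?, ?, ?)",
    [
        ("a@example.com", 1, "OpenEmailEvent"),
        ("a@example.com", 1, "ClickEmailEvent"),
        ("a@example.com", 2, "OpenEmailEvent"),
        ("b@example.com", 1, "OpenEmailEvent"),
        ("b@example.com", 3, "BounceEmailEvent"),  # filtered out by WHERE
    ],
)

query = """
SELECT email_address,
       COUNT(DISTINCT email_id) AS emails_count,
       -- each SUM adds 1 only for rows of its own event type
       SUM(CASE WHEN type = 'OpenEmailEvent'  THEN 1 ELSE 0 END) AS opens,
       SUM(CASE WHEN type = 'ClickEmailEvent' THEN 1 ELSE 0 END) AS clicks
FROM email_events
WHERE type IN ('OpenEmailEvent', 'ClickEmailEvent')
GROUP BY email_address
ORDER BY email_address
"""
results = conn.execute(query).fetchall()
for row in results:
    print(row)
```

In the ActiveRecord version, the two SUM expressions would presumably go into the existing .select(...) string alongside (or instead of) emails_count.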

Comparing "queries" and then saving the result for new select statement

I am trying to write a query that will check whether a user is logged in or not.
The data is stored as 2 separate rows: one is called "in", written when a user logs in, and the other "out". I then need to find all people who are currently logged in but have not logged out. So what I've tried is comparing the two select statements. This gives me the name (UNILOGIN) of all the people who have logged in and not out:
select UNILOGIN from timereg where date = CONVERT(DATE,GETDATE(),110) and CHECKEDIN = 'IND'
except
select UNILOGIN from timereg where date = CONVERT(DATE,GETDATE(),110) and CHECKEDIN = 'UD'
I then need to find their top 1 time, when they checked in. How would one write a statement that gets this result in a single query, if that is possible at all? Something like:
SELECT TOP 1 UNILOGIN, TIME from TIMEREG where UNILOGIN = "result of query"
Tell me if I need to elaborate.
Aggregate. Get one result row per unilogin, make sure it has an 'IND' record and no 'UD' record and select the maximum login time for the date in question.
select unilogin, max(time)
from timereg
where date = convert(date, getdate(), 110)
and checkedin in ('IND', 'UD')
group by unilogin
having count(case when checkedin = 'IND' then 1 end) > 0
and count(case when checkedin = 'UD' then 1 end) = 0;
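The aggregate-and-filter approach can be tried out with a small sketch using Python's sqlite3 module. Since SQLite has no GETDATE() or CONVERT(), the date is passed in as a parameter, which is an assumption of this example; the column types and sample rows are also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; date and time are stored as text for simplicity.
conn.execute(
    'CREATE TABLE timereg (unilogin TEXT, "date" TEXT, "time" TEXT, checkedin TEXT)'
)
conn.executemany(
    "INSERT INTO timereg VALUES (?, ?, ?, ?)",
    [
        ("alice", "2024-01-15", "08:00", "IND"),
        ("alice", "2024-01-15", "16:00", "UD"),   # logged out -> excluded
        ("bob",   "2024-01-15", "09:00", "IND"),
        ("carol", "2024-01-15", "07:30", "IND"),
        ("carol", "2024-01-15", "10:15", "IND"),  # latest check-in wins
        ("dave",  "2024-01-14", "08:45", "IND"),  # different day -> ignored
    ],
)

query = """
SELECT unilogin, MAX("time") AS last_checkin
FROM timereg
WHERE "date" = ?
  AND checkedin IN ('IND', 'UD')
GROUP BY unilogin
-- keep users with at least one 'IND' row and no 'UD' row on that date
HAVING COUNT(CASE WHEN checkedin = 'IND' THEN 1 END) > 0
   AND COUNT(CASE WHEN checkedin = 'UD'  THEN 1 END) = 0
ORDER BY unilogin
"""
results = conn.execute(query, ("2024-01-15",)).fetchall()
for row in results:
    print(row)
```

Like the EXCEPT version in the question, this treats any 'UD' row on the date as "logged out", so a user who logs out and back in on the same day would be excluded.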