How to get average from counting values? - sql

I am using SQLite.
I need to determine average impressions per day, where impression = count of person_id.
COLUMNS:
person_id - unique identifier of the person
date - date they were shown the ad
ad_id - content of the ad: ad_1_product1, ad_2_product2, or ad_3_product3
clicked (TRUE/FALSE) - clicked on the ad
signed_up - (TRUE/FALSE) created an account
subscribed (TRUE/FALSE) - started a paid subscription
I set clicked, signed_up and subscribed as BOOLEAN, the rest is text.
MY CODE:
SELECT AVG(impressions) AS avg_impressions
FROM (
SELECT COUNT(person_id) as impressions
FROM videoadcampaign
GROUP BY date
) date;
I get 1 row with 1 column: avg_impressions = 591.
I cannot work out how to break the average down by date. Dates are in 2021-04-27 format, and there are 8 distinct dates in total.
[image: result]
[image: expected result; ignore the column name, it's just to show you]
Any help is highly appreciated.

If you want the percentage of rows for each day, then you can do it by dividing each day's count by the total number of rows.
With SUM() window function:
SELECT date, 1.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS avg_impressions
FROM videoadcampaign
GROUP BY date
Or, with a subquery:
SELECT date, 1.0 * COUNT(*) / (SELECT COUNT(*) FROM videoadcampaign) AS avg_impressions
FROM videoadcampaign
GROUP BY date
I assume that the column person_id is not nullable, so instead of COUNT(person_id) you may use COUNT(*).
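A minimal sketch of the subquery form, run through Python's built-in sqlite3 module. The rows are made up for illustration and only the two columns the query touches are created; everything else about the real table is assumed.

```python
import sqlite3

# Hypothetical miniature of the videoadcampaign table (values are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videoadcampaign (person_id TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO videoadcampaign VALUES (?, ?)",
    [
        ("p1", "2021-04-27"),
        ("p2", "2021-04-27"),
        ("p3", "2021-04-27"),
        ("p4", "2021-04-28"),
    ],
)

# Each day's row count divided by the total row count (subquery form).
shares = conn.execute(
    """
    SELECT date,
           1.0 * COUNT(*) / (SELECT COUNT(*) FROM videoadcampaign) AS share
    FROM videoadcampaign
    GROUP BY date
    ORDER BY date
    """
).fetchall()

for day, share in shares:
    print(day, share)
```

With these four rows, 3 of 4 fall on the first date and 1 of 4 on the second, so the shares add up to 1.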

The average impressions per day is the total number of impressions divided by the number of distinct days. No subquery is needed:
SELECT COUNT(*) * 1.0 / COUNT(DISTINCT date) AS avg_impressions
FROM videoadcampaign;
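As a sanity check on made-up rows, the sketch below compares the two-level AVG from the question with a single-level query that divides the row count by the number of distinct dates. It also shows that dividing by distinct person_id yields impressions per person, which is a different number. The table contents are hypothetical.

```python
import sqlite3

# Hypothetical rows: person p1 sees an ad on both days.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videoadcampaign (person_id TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO videoadcampaign VALUES (?, ?)",
    [
        ("p1", "2021-04-27"),
        ("p2", "2021-04-27"),
        ("p3", "2021-04-27"),
        ("p1", "2021-04-28"),
    ],
)

# Two levels of aggregation: count per day, then average the counts.
nested = conn.execute(
    """
    SELECT AVG(impressions)
    FROM (SELECT COUNT(person_id) AS impressions
          FROM videoadcampaign
          GROUP BY date) AS d
    """
).fetchone()[0]

# One level: total rows divided by the number of distinct days.
flat = conn.execute(
    "SELECT COUNT(*) * 1.0 / COUNT(DISTINCT date) FROM videoadcampaign"
).fetchone()[0]

# Dividing by distinct people instead gives impressions per person.
per_person = conn.execute(
    "SELECT COUNT(*) * 1.0 / COUNT(DISTINCT person_id) FROM videoadcampaign"
).fetchone()[0]

print(nested, flat, per_person)
```

The two per-day figures agree because every date in the table has at least one row; dates with no rows at all are not counted by either query.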

Related

want to calculate 2 different aggregations on different criteria in bigquery

I have customer payments and want to find the top 10 customers per day, based on the sum of amount per day per customer. Eventually I want to display those 10 customers and their payments per hour (the sum of amount per hour).
I tried to create two window functions in BigQuery: one per customer and per hour (Value_Hr), and one for the sum of values per customer (Value_customer).
with base as (
select Name, sum(amount) over W1 as Value_Hr, Hour, sum(amount) over w2 as Value_customer
from
(SELECT trim(cast(format('%t',Name) as string) ) as Name,
cast(round(amount) as numeric) as amount , extract(hour from SettlementTimestamp) as Hr
FROM Payments
where length(trim(Name))>0
)
qualify row_number() over (partition by Name,hr )=1
window w1 as (partition by Name,hr ),
w2 as (partition by Name)
)
select Name,Value_Hr,Hour ,Value_customer
from base
qualify row_number() over (partition by Value_customer order by Value_customer desc )<=10
I expect data as below
But row_number is calculated within each group of customer and hourly amounts, instead of per customer and its total value.
Can anyone help?

How to get the average of the number of actions per day

I have written the sql query:
SELECT id,
date_diff("day", create_date, date) as day,
action_type
FROM "my_database"
It brings this:
id day action_type
1 0 upload
1 0 upload
1 0 upload
1 1 upload
1 1 upload
2 0 upload
2 0 upload
2 1 upload
How can I change my query to get a table with the unique days in column day and the average number of "upload" actions across all ids? The desired result looks like this:
day avg_num_action
0 2.5
1 1.5
It is 2.5 because (3+2)/2: 3 uploads for id 1 and 2 uploads for id 2. The same logic gives 1.5 for day 1.
Please try this. Treat your given query as a derived table. If you need to filter on action_type, keep the WHERE clause; otherwise remove it.
SELECT t.day
, COUNT(*) * 1.0 / COUNT(DISTINCT t.id) avg_num_action -- * 1.0 avoids integer division on engines that truncate
FROM (SELECT id,
date_diff("day", create_date, date) as day,
action_type
FROM "my_database") t
WHERE t.action_type = 'upload'
GROUP BY t.day
Create a table from your given result set and write query based on that.
SELECT t.tday
, COUNT(*) / COUNT(DISTINCT t.id) avg_num_action
FROM my_database t
GROUP BY t.tday
See the fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=871935ea2b919c4e24eb83fcbce78973
Update: I think my two-step approach is more complicated than needed. Rahul Biswas shows how this can be done in one step. I suggest you use and accept his answer.
Original answer:
Two steps:
1. Count entries per ID and day.
2. Take the average count per day.
The query:
with rows as (select id, date_diff('day', create_date, date) as day from mytable)
, per_id_and_day as (select id, day, count(*) as cnt from rows group by id, day)
select day, avg(cnt)
from per_id_and_day
group by day
order by day;
You don't need a subquery for this logic:
SELECT date_diff("day", create_date, date) as day,
COUNT(*) * 1.0 / COUNT(DISTINCT id)
FROM "my_database"
GROUP BY date_diff("day", create_date, date)
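The sample rows from the question can be used to check that the two-step and one-step queries agree. This is a sketch under assumptions: it runs in SQLite through Python's sqlite3 module, the table name `actions` is hypothetical, and `day` is stored directly because SQLite has no date_diff function.

```python
import sqlite3

# The eight (id, day) rows from the question, all with action_type 'upload'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actions (id INTEGER, day INTEGER, action_type TEXT)")
conn.executemany(
    "INSERT INTO actions VALUES (?, ?, 'upload')",
    [(1, 0), (1, 0), (1, 0), (1, 1), (1, 1), (2, 0), (2, 0), (2, 1)],
)

# Two steps: count per id and day, then average the counts per day.
two_step = conn.execute(
    """
    WITH per_id_and_day AS (
        SELECT id, day, COUNT(*) AS cnt
        FROM actions
        GROUP BY id, day
    )
    SELECT day, AVG(cnt)
    FROM per_id_and_day
    GROUP BY day
    ORDER BY day
    """
).fetchall()

# One step: rows per day divided by distinct ids per day.
one_step = conn.execute(
    """
    SELECT day, COUNT(*) * 1.0 / COUNT(DISTINCT id)
    FROM actions
    WHERE action_type = 'upload'
    GROUP BY day
    ORDER BY day
    """
).fetchall()

print(two_step)  # [(0, 2.5), (1, 1.5)]
print(one_step)  # [(0, 2.5), (1, 1.5)]
```

Both forms only average over ids that have at least one row on a given day; an id with zero uploads on a day is not counted as a 0.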

I am trying to use the avg() function in a subquery after using a count in the inner query, but I cannot seem to get it to work in SQL

my table name is CustomerDetails and it has the following columns:
customer_id, login_id, session_id, login_date
I am trying to write a query that calculates the average number of customers logging in per day.
i tried this:
select avg(session_id)
from CustomerDetails
where exists (select count(session_id) from CustomerDetails as 'no_of_entries')
But then I realized it just goes straight to the session_id column and calculates the average of that column, which is not what I want. Can someone help me?
Thanks
The first thing you need to do is get logins per day:
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
Then you can use that to get average logins per day:
SELECT AVG(loginsPerDay)
FROM (
SELECT login_date, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY login_date
) AS t
If your login_date is a DATE type you're all set. If it has a time component then you'll need to truncate it to date only:
SELECT AVG(loginsPerDay)
FROM (
SELECT CAST(login_date AS DATE) AS login_day, COUNT(*) AS loginsPerDay
FROM CustomerDetails
GROUP BY CAST(login_date AS DATE)
) AS t
i am trying to write a query that calculates the average number of customers login in per day.
Count the number of customers. Divide by the number of days. I think that is:
select count(*) * 1.0 / count(distinct cast(login_date as date))
from customerdetails;
I understand that you want to count the number of visitors per day, not the number of visits. So if a customer logged in twice on the same day, you want to count them only once.
If so, you can use distinct and two levels of aggregation, like so:
select avg(cnt_visitors) avg_cnt_visitors_per_day
from (
select count(distinct customer_id) cnt_visitors
from CustomerDetails
group by cast(login_date as date)
) t
The inner query computes the count of distinct customers for each day; the outer query gives you the overall average.
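The difference between average logins per day and average distinct visitors per day shows up clearly on a few made-up rows. This sketch assumes SQLite via Python's sqlite3 module, so it uses SQLite's date() function to strip the time part instead of CAST(... AS DATE); the data is hypothetical.

```python
import sqlite3

# Hypothetical logins: customer 1 logs in twice on the first day.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CustomerDetails "
    "(customer_id INTEGER, login_id TEXT, session_id TEXT, login_date TEXT)"
)
conn.executemany(
    "INSERT INTO CustomerDetails VALUES (?, ?, ?, ?)",
    [
        (1, "a", "s1", "2023-01-01 08:00:00"),
        (1, "a", "s2", "2023-01-01 17:30:00"),
        (2, "b", "s3", "2023-01-01 09:15:00"),
        (1, "a", "s4", "2023-01-02 10:00:00"),
    ],
)

# Average number of logins (visits) per day.
avg_logins = conn.execute(
    """
    SELECT AVG(logins_per_day)
    FROM (SELECT date(login_date) AS login_day, COUNT(*) AS logins_per_day
          FROM CustomerDetails
          GROUP BY date(login_date)) AS t
    """
).fetchone()[0]

# Average number of distinct customers (visitors) per day.
avg_visitors = conn.execute(
    """
    SELECT AVG(cnt_visitors)
    FROM (SELECT COUNT(DISTINCT customer_id) AS cnt_visitors
          FROM CustomerDetails
          GROUP BY date(login_date)) AS t
    """
).fetchone()[0]

print(avg_logins, avg_visitors)  # 2.0 1.5
```

Which of the two numbers is wanted depends on whether repeat logins by the same customer on one day should count once or several times.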

Postgresql - Cumulative sum of created users

I have a users table with a timestamp when each user was created. I'd like to get the cumulative sum of users created per month.
I do have the following query which is working, but it's showing me the sum on a per day basis. I have a hard time going from this to a per month basis.
SELECT
created_at,
sum(count(*)) OVER (ORDER BY created_at) as total
FROM users
GROUP BY created_at
Expected output:
created_at count
-----------------
2016-07 100
2016-08 150
2016-09 200
2016-10 500
Former reading:
Calculating Cumulative Sum in PostgreSQL
Count cumulative total in Postgresql
Rolling sum / count / average over date interval
Cumulative sum of values by month, filling in for missing months
Postgres window function and group by exception
I'd take a two-step approach. First, use an inner query to count how many users were created each month. Then, wrap this query with another query that calculates the cumulative sum of these counts:
SELECT created_at, SUM(cnt) OVER (ORDER BY created_at ASC)
FROM (SELECT TO_CHAR(created_at, 'YYYY-MM') AS created_at, COUNT(*) AS cnt
FROM users
GROUP BY TO_CHAR(created_at, 'YYYY-MM')) t
ORDER BY 1 ASC;
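The same two-step idea can be tried out locally. The sketch below is an adaptation, not the PostgreSQL query itself: it assumes SQLite 3.25 or newer (for window functions) through Python's sqlite3 module, replaces TO_CHAR with SQLite's strftime, and uses made-up timestamps.

```python
import sqlite3

# Hypothetical users table with creation timestamps (no users in 2016-09).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (created_at TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?)",
    [
        ("2016-07-03 10:00:00",),
        ("2016-07-20 12:30:00",),
        ("2016-08-01 09:00:00",),
        ("2016-10-05 14:00:00",),
        ("2016-10-06 15:00:00",),
        ("2016-10-07 16:00:00",),
    ],
)

# Step 1 (inner query): users created per month.
# Step 2 (outer query): running total of those monthly counts.
cumulative = conn.execute(
    """
    SELECT month, SUM(cnt) OVER (ORDER BY month) AS total
    FROM (SELECT strftime('%Y-%m', created_at) AS month, COUNT(*) AS cnt
          FROM users
          GROUP BY strftime('%Y-%m', created_at)) AS t
    ORDER BY month
    """
).fetchall()

print(cumulative)  # [('2016-07', 2), ('2016-08', 3), ('2016-10', 6)]
```

Months with no new users produce no row at all, so 2016-09 is missing from the output here; filling such gaps would need a separate list of months to join against.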

Top 10 based on last month showing 6 previous months

I want to show a graph of income from different parties over the last 6 months, restricted to the 10 people with the highest income in the last month.
The top 10 can change each month as people deposit more money, so the graph should show the last 6 months of deposits for whoever is in the top 10 based on last month's deposits only.
I already used a LAG function and a RANK() OVER PARTITION function.
I don't understand why you'll need rank or lag functions.
You can simply use an IN statement:
SELECT * FROM YourTable t
WHERE t.depositDate BETWEEN StartRangeDate AND EndRangeDate
AND t.ID IN (SELECT ID FROM (SELECT s.ID, SUM(s.depositAmount) AS total
FROM YourTable s
WHERE s.depositDate BETWEEN ThisMonthStart AND ThisMonthEnd
GROUP BY s.ID
ORDER BY total DESC -- largest totals first
LIMIT 10) top10)
You can adjust the outer SELECT to return whatever you need, for example by adding a GROUP BY and summing the deposits.
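A runnable sketch of the IN approach, assuming SQLite through Python's sqlite3 module. The table name `deposits`, the date ranges, the data, and the use of a top 2 instead of a top 10 are all hypothetical choices to keep the example small.

```python
import sqlite3

# Hypothetical deposits; D has a large deposit but none in the last month.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deposits (id TEXT, depositDate TEXT, depositAmount INTEGER)"
)
conn.executemany(
    "INSERT INTO deposits VALUES (?, ?, ?)",
    [
        ("A", "2023-01-15", 50),
        ("A", "2023-06-10", 100),
        ("B", "2023-03-01", 20),
        ("B", "2023-06-05", 300),
        ("C", "2023-06-20", 200),
        ("D", "2023-02-01", 1000),
    ],
)

params = {
    "range_start": "2023-01-01",  # start of the 6-month window
    "range_end": "2023-06-30",    # end of the 6-month window
    "month_start": "2023-06-01",  # start of the ranking month
    "month_end": "2023-06-30",    # end of the ranking month
    "top_n": 2,                   # 10 in the original question
}

# Outer query: all deposits in the 6-month window.
# Inner query: ids with the largest totals in the last month only.
history = conn.execute(
    """
    SELECT t.id, t.depositDate, t.depositAmount
    FROM deposits AS t
    WHERE t.depositDate BETWEEN :range_start AND :range_end
      AND t.id IN (
          SELECT s.id
          FROM deposits AS s
          WHERE s.depositDate BETWEEN :month_start AND :month_end
          GROUP BY s.id
          ORDER BY SUM(s.depositAmount) DESC
          LIMIT :top_n
      )
    ORDER BY t.id, t.depositDate
    """,
    params,
).fetchall()

for row in history:
    print(row)
```

B and C have the two largest June totals, so only their rows appear, including B's earlier March deposit. SQLite accepts LIMIT directly inside the IN subquery; some other engines require wrapping it in a derived table first.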