sql - find the number of days a user was using the app


I would like to write a SQL query that counts the number of days each user used the application, and how many of those days were consecutive. A user can enter the app several times a day, but that should count as one day.
My table looks like this:
id | bigint
user_id | bigint
action_date | timestamp without time zone

To count the number of days per user:
SELECT user_id, count(DISTINCT action_date::date) AS days
FROM user_action_tbl
GROUP BY user_id;

One way to do it:
SELECT user_id,
       COUNT(*) AS days_total,
       SUM(consecutive) AS days_consecutive
FROM
(
    SELECT user_id,
           -- 1 when the user's next active date is exactly one day later
           CASE WHEN LEAD(date, 1) OVER (PARTITION BY user_id ORDER BY date) - date = 1 THEN 1 ELSE 0 END AS consecutive
    FROM
    (
        -- one row per user per calendar day
        SELECT user_id, action_date::date AS date
        FROM table1
        GROUP BY user_id, action_date::date
    ) q
) p
GROUP BY user_id;
Here is a SQLFiddle demo
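
The approach above can be tried without a PostgreSQL server. The following is a minimal sketch in Python using the standard-library sqlite3 module. It assumes SQLite 3.25 or newer (for window functions), swaps the PostgreSQL `action_date::date` cast for SQLite's `date()` and `julianday()` functions, and uses made-up sample rows. The table name user_action_tbl comes from the question; the helper name `days_per_user` is hypothetical.

```python
import sqlite3

# Hypothetical sample data: (user_id, action_date).
# User 1 is active on Jan 1 (twice), Jan 2, Jan 3 and Jan 10;
# user 2 is active on two non-adjacent days.
ROWS = [
    (1, "2024-01-01 08:00:00"),
    (1, "2024-01-01 21:30:00"),
    (1, "2024-01-02 09:15:00"),
    (1, "2024-01-03 10:00:00"),
    (1, "2024-01-10 12:00:00"),
    (2, "2024-01-05 07:00:00"),
    (2, "2024-01-07 07:00:00"),
]

QUERY = """
SELECT user_id,
       COUNT(*)         AS days_total,
       SUM(consecutive) AS days_consecutive
FROM (
    SELECT user_id,
           -- 1 when the next active day is exactly one day later
           CASE WHEN julianday(LEAD(day) OVER (PARTITION BY user_id ORDER BY day))
                     - julianday(day) = 1
                THEN 1 ELSE 0 END AS consecutive
    FROM (
        -- one row per user per calendar day
        SELECT DISTINCT user_id, date(action_date) AS day
        FROM user_action_tbl
    )
)
GROUP BY user_id
ORDER BY user_id
"""


def days_per_user(rows):
    """Load rows into an in-memory table and return (user_id, days_total, days_consecutive)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_action_tbl ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, action_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO user_action_tbl (user_id, action_date) VALUES (?, ?)", rows
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(days_per_user(ROWS))  # [(1, 4, 2), (2, 2, 0)]
```

Note that days_consecutive counts pairs of adjacent days, so the three-day streak of user 1 contributes 2, not 3.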

Related

SQL to find when amount reached a certain value for the first time

I have a table that has 3 columns: user_id, date, amount. I need to find out on which date the amount reached 1 Million for the first time. The amount can go up or down on any given day.
I tried using partition by user_id order by date desc but I can't figure out how to find the exact date on which it reached 1 Million for the first time. I am exploring lead, lag functions. Any pointers would be appreciated.

You may use conditional aggregation as follows:
select user_id,
min(case when amount >= 1000000 then date end) as expected_date
from table_name
group by user_id
If you want to check where the amount is exactly 1M, use case when amount = 1000000 ... instead.
If you meant that the amount accumulates over time (a running total ordered by date), then the query will be:
select user_id,
min(case when cumulative_amount >= 1000000 then date end) as expected_date
from
(
select *,
sum(amount) over (partition by user_id order by date) cumulative_amount
from table_name
) T
group by user_id;
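
As a quick check of the running-total variant, here is a minimal sketch in Python with the standard-library sqlite3 module. It assumes SQLite 3.25 or newer for the window function, and the sample amounts and the helper name `first_reach_dates` are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: (user_id, date, amount); amounts are daily changes.
ROWS = [
    (1, "2024-01-01", 400000),
    (1, "2024-01-02", 700000),   # running total 1,100,000: first time at or above 1M
    (1, "2024-01-03", -300000),  # drops back to 800,000
    (1, "2024-01-04", 500000),   # 1,300,000 again, but not the first time
    (2, "2024-01-01", 600000),
    (2, "2024-01-02", 300000),   # running total 900,000: never reaches 1M
]

QUERY = """
SELECT user_id,
       MIN(CASE WHEN cumulative_amount >= 1000000 THEN date END) AS expected_date
FROM (
    SELECT user_id, date,
           SUM(amount) OVER (PARTITION BY user_id ORDER BY date) AS cumulative_amount
    FROM table_name
) t
GROUP BY user_id
ORDER BY user_id
"""


def first_reach_dates(rows):
    """Return (user_id, first date the running total is >= 1M, or None)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (user_id INTEGER, date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(first_reach_dates(ROWS))  # [(1, '2024-01-02'), (2, None)]
```

Users whose running total never reaches the threshold come back with a NULL (None) date, because the CASE expression yields no value for MIN to pick.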

Try this:
select date,
       sum(amount) as totalamount
from tablename
group by date
having sum(amount) >= 1000000
order by date asc
limit 1
This would summarize the amount for each day and return 1 record where it reached 1M for the first time.
Sample result on SQL Fiddle.
To group by both date and user_id, add user_id to the select and group by clauses. Note that limit 1 still returns a single row overall (the earliest qualifying date across all users), not one row per user.
select user_id, date,
       sum(amount) as totalamount
from tablename
group by user_id, date
having sum(amount) >= 1000000
order by date asc
limit 1
Example here.

Daily count of user_ids who have visited my store 4 or more than 4 times every day

I have a table of user IDs that have visited my platform. I want to count only those user IDs that visited my store 4 or more times on each day, over a period of 10 days.
To achieve this I am using this query:
select date(arrival_timestamp), count(user_id)
from mytable
where date(arrival_timestamp) >= current_date-10
and date(arrival_timestamp) < current_date
group by 1
having count(user_id)>=4
order by 2 desc
limit 10;
But this query effectively picks up every day whose total count is 4 or more, rather than evaluating each user per day. That covers almost every user, so I am unable to isolate the users who visit my store multiple times on a particular day. Any help in this regard is appreciated.
Thanks

You can try this. Note that it counts visits per user over the whole 10-day window rather than per day:
with list as (
select user_id, count(*) as user_count, array_agg(arrival_timestamp) as arrival_timestamp
from mytable
where date(arrival_timestamp) >= current_date-10
and date(arrival_timestamp) < current_date
group by user_id)
select user_id, unnest(arrival_timestamp)
from list
where user_count >= 4

From the list of users that visited your store 4 or more times on a given day over the last 10 days (the inner query), select those that have 10 occurrences, i.e. one for every day.
select user_id
from
(
select user_id
from the_table
where arrival_timestamp::date between current_date - 10 and current_date - 1
group by user_id, arrival_timestamp::date
having count(*) >= 4
) t
group by user_id
having count(*) = 10;
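
The two-level grouping can be exercised with a small sketch in Python and the standard-library sqlite3 module. It is only an illustration under assumptions: `current_date` is replaced by an explicit `today` parameter so the result is reproducible, PostgreSQL's `::date` cast is replaced by SQLite's `date()`, and the sample visits and the helper names `make_visits` and `users_visiting_every_day` are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

QUERY = """
SELECT user_id
FROM (
    -- one row per user per day on which the user reached the visit threshold
    SELECT user_id
    FROM the_table
    WHERE date(arrival_timestamp) >= date(:today, '-' || :days || ' days')
      AND date(arrival_timestamp) <  date(:today)
    GROUP BY user_id, date(arrival_timestamp)
    HAVING COUNT(*) >= :min_visits
) t
GROUP BY user_id
HAVING COUNT(*) = :days   -- qualified on every day of the window
ORDER BY user_id
"""


def make_visits(today):
    """Build made-up visits for the 10 days before `today`."""
    rows = []
    for offset in range(1, 11):
        day = (today - timedelta(days=offset)).isoformat()
        # user 1: 4 visits on every day
        rows += [(1, f"{day} {9 + h:02d}:00:00") for h in range(4)]
        # user 2: 4 visits on every day except one, where there are only 3
        n = 3 if offset == 5 else 4
        rows += [(2, f"{day} {9 + h:02d}:30:00") for h in range(n)]
    # user 3: 5 visits, but on a single day only
    yesterday = (today - timedelta(days=1)).isoformat()
    rows += [(3, f"{yesterday} {9 + h:02d}:45:00") for h in range(5)]
    return rows


def users_visiting_every_day(rows, today, days=10, min_visits=4):
    """Return user_ids with at least `min_visits` visits on each of the last `days` days."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE the_table (user_id INTEGER, arrival_timestamp TEXT)")
    conn.executemany("INSERT INTO the_table VALUES (?, ?)", rows)
    params = {"today": today.isoformat(), "days": days, "min_visits": min_visits}
    result = [row[0] for row in conn.execute(QUERY, params)]
    conn.close()
    return result


TODAY = date(2024, 3, 11)
print(users_visiting_every_day(make_visits(TODAY), TODAY))  # [1]
```

Only user 1 survives: user 2 misses the threshold on one day and user 3 qualifies on a single day, so neither reaches 10 qualifying days.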

How to get the average of the number of actions per day

I have written the sql query:
SELECT id,
       date_diff("day", create_date, date) as day,
       action_type
FROM "my_database"
It brings this:
id day action_type
1 0 upload
1 0 upload
1 0 upload
1 1 upload
1 1 upload
2 0 upload
2 0 upload
2 1 upload
How can I change my query to get a table with the unique days in the day column and the average number of "upload" actions across all ids? The desired result looks like this:
day avg_num_action
0 2.5
1 1.5
It is 2.5 because (3 + 2) / 2 = 2.5 (3 uploads for id 1 and 2 uploads for id 2). Likewise, day 1 gives (2 + 1) / 2 = 1.5.

Please try this. Treat your given query as a derived table. Keep the WHERE clause if you need to filter by action type; otherwise remove it.
SELECT t.day
     , COUNT(*) * 1.0 / COUNT(DISTINCT t.id) AS avg_num_action  -- * 1.0 avoids integer division in dialects that truncate
FROM (SELECT id,
             date_diff("day", create_date, date) AS day,
             action_type
      FROM "my_database") t
WHERE t.action_type = 'upload'
GROUP BY t.day
Create a table from your given result set and write query based on that.
SELECT t.tday
, COUNT(*) / COUNT(DISTINCT t.id) avg_num_action
FROM my_database t
GROUP BY t.tday
Please check from url https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=871935ea2b919c4e24eb83fcbce78973

Update: I think my two-steps approach is more complicated than needed. Rahul Biswas shows how this can be done in one step. I suggest you use and accept his answer.
Original answer:
Two steps:
Count entries per ID and day
Take the average count per day
The query:
with rows as (select id, date_diff('day', create_date, date) as day from mytable)
, per_id_and_day as (select id, day, count(*) as cnt from rows group by id, day)
select day, avg(cnt)
from per_id_and_day
group by day
order by day;

You don't need a subquery for this logic:
SELECT date_diff("day", create_date, date) as day,
COUNT(*) * 1.0 / COUNT(DISTINCT id)
FROM "my_database"
GROUP BY date_diff("day", create_date, date)
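
To confirm that the single-step query reproduces the desired 2.5 and 1.5, here is a minimal sketch in Python with the standard-library sqlite3 module. SQLite has no `date_diff`, so the sketch assumes a `julianday()` difference as a stand-in; the concrete create_date and date values are made up to match the day offsets in the question, and the helper name `avg_actions_per_day` is hypothetical.

```python
import sqlite3

# Hypothetical rows: (id, create_date, date, action_type), chosen so that
# id 1 has 3 uploads on day 0 and 2 on day 1; id 2 has 2 on day 0 and 1 on day 1.
ROWS = (
    [(1, "2024-01-01", "2024-01-01", "upload")] * 3
    + [(1, "2024-01-01", "2024-01-02", "upload")] * 2
    + [(2, "2024-02-10", "2024-02-10", "upload")] * 2
    + [(2, "2024-02-10", "2024-02-11", "upload")] * 1
)

QUERY = """
SELECT CAST(julianday(date) - julianday(create_date) AS INTEGER) AS day,
       -- total actions on that day divided by the number of distinct ids
       COUNT(*) * 1.0 / COUNT(DISTINCT id) AS avg_num_action
FROM my_database
WHERE action_type = 'upload'
GROUP BY day
ORDER BY day
"""


def avg_actions_per_day(rows):
    """Return (day, average number of uploads per id) for each day offset."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_database ("
        "id INTEGER, create_date TEXT, date TEXT, action_type TEXT)"
    )
    conn.executemany("INSERT INTO my_database VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(avg_actions_per_day(ROWS))  # [(0, 2.5), (1, 1.5)]
```

Dividing the row count by the distinct id count gives the same number as averaging the per-id counts, as long as only ids with at least one action on that day should be included in the average.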

Hive/SQL How do you access the value of the column which you just computed for previous rows?

I have a table uv_user_date that looks like this:
It is basically a user login table that shows the cumulative login days, partitioned by user_id.
The column pre shows the previous login date for each login record.
Based on this, I want to compute the consecutive login days for each user record.
The answer should be:
My idea is : for a record
if(uv_date - pre = 1 day)
then consecutive login days is the last consecutive login days + 1
else
1
but I am having trouble with accessing the last consecutive login days value.
The Code would be:
SELECT *,
if(pre = date_add(uv_date, -1), last(consecutive_days) + 1, 1) consecutive_days
FROM uv_user_date
Is there any way to get the value of last(consecutive_days)?

First, find the date difference.
tbl1:
select *,
       if(pre is null, 1, datediff(uv_date, pre)) as diff
from your_table
Then take the difference between the cumulative sum of diff and accumulative_uv_date for each user_id. It stays constant within a run of consecutive days, so it can be used as a group key (rnk).
tbl2:
select *,
       sum(diff) over (partition by user_id order by uv_date rows between unbounded preceding and current row) - accumulative_uv_date as rnk
from tbl1
Finally, count the consecutive days within each group:
select user_id, uv_date, rnk,
       row_number() over (partition by user_id, rnk order by uv_date) as consecutive_days
from tbl2
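
The same grouping idea can be sketched in a runnable form with Python and the standard-library sqlite3 module. This is a common variant rather than the exact Hive query: instead of the precomputed pre and accumulative_uv_date columns, it subtracts a per-user row number from the day number, which is also constant within a run of consecutive dates. It assumes one row per user per date and SQLite 3.25 or newer; the sample logins and the helper name `consecutive_login_days` are made up.

```python
import sqlite3

# Hypothetical logins: (user_id, uv_date), one row per user per date.
ROWS = [
    (1, "2024-01-01"),
    (1, "2024-01-02"),
    (1, "2024-01-03"),
    (1, "2024-01-05"),  # gap: a new run starts here
    (1, "2024-01-06"),
    (2, "2024-01-02"),
    (2, "2024-01-04"),  # gap: a new run starts here
]

QUERY = """
WITH numbered AS (
    SELECT user_id, uv_date,
           -- day number minus per-user row number: constant inside a run
           CAST(julianday(uv_date) AS INTEGER)
             - ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY uv_date) AS grp
    FROM uv_user_date
)
SELECT user_id, uv_date,
       -- position of the row inside its run = consecutive login days so far
       ROW_NUMBER() OVER (PARTITION BY user_id, grp ORDER BY uv_date) AS consecutive_days
FROM numbered
ORDER BY user_id, uv_date
"""


def consecutive_login_days(rows):
    """Return (user_id, uv_date, consecutive_days) for every login row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uv_user_date (user_id INTEGER, uv_date TEXT)")
    conn.executemany("INSERT INTO uv_user_date VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in consecutive_login_days(ROWS):
    print(row)
```

For user 1 the counter runs 1, 2, 3 over the first three dates and restarts at 1 on 2024-01-05, which is the behaviour the question's if/else idea describes without needing to read the previous row's computed value.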

Active customers for each day who were active in last 30 days

I have a BQ table, user_events that looks like the following:
event_date | user_id | event_type
The data covers millions of users, for different event dates.
I want to write a query that will give me, for every day, the list of users who were active in the last 30 days.
The query below gives me the total unique users on only that day; I can't get it to give me the last 30 days for each date. Help is appreciated.
SELECT
user_id,
event_date
FROM
[TableA]
WHERE
1=1
AND user_id IS NOT NULL
AND event_date >= DATE_ADD(CURRENT_TIMESTAMP(), -30, 'DAY')
GROUP BY
1,
2
ORDER BY
2 DESC

The query below is for BigQuery Standard SQL and makes a few assumptions about your case:
there is only one row per date per user
a user is considered active in the last 30 days if the user has at least 5 entries/rows within those 30 days (this can be any number, even just 1)
If the above makes sense, see below:
#standardSQL
SELECT
user_id, event_date
FROM (
SELECT
user_id, event_date,
(COUNT(1)
OVER(PARTITION BY user_id
ORDER BY UNIX_DATE(event_date)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
) >= 5 AS activity
FROM `yourTable`
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
If assumption #1 above is not correct, you can simply add pre-grouping as a sub-select:
#standardSQL
SELECT
user_id, event_date
FROM (
SELECT
user_id, event_date,
(COUNT(1)
OVER(PARTITION BY user_id
ORDER BY UNIX_DATE(event_date)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
) >= 5 AS activity
FROM (
SELECT user_id, event_date
FROM `yourTable`
GROUP BY user_id, event_date
)
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
UPDATE
From comments: If a user has any event with event_type IN ('view', 'conversion', 'productDetail', 'search'), they will be considered active. That means any kind of event triggered within the app.
So I think you can go with the query below:
#standardSQL
SELECT
user_id, event_date
FROM (
SELECT
user_id, event_date,
(COUNT(1)
OVER(PARTITION BY user_id
ORDER BY UNIX_DATE(event_date)
RANGE BETWEEN 30 PRECEDING AND 1 PRECEDING)
) >= 5 AS activity
FROM (
SELECT user_id, event_date
FROM `yourTable`
WHERE event_type IN ('view', 'conversion', 'productDetail', 'search')
GROUP BY user_id, event_date
)
)
WHERE activity
GROUP BY user_id, event_date
-- ORDER BY event_date
