Subscription History - sql

I have a table with the columns user_id, subscription_tier, and activity_date (each date the user was active). How would I construct a table that shows, for each user, their movement between tiers and the dates of each move?
Using Min(activity_date) and Max(activity_date) would only work if the user never returned to a tier they had previously been at, whereas in reality people upgrade and downgrade all the time.
What I want to create is a table with the columns user_id, subscription_tier, tier_start_date, tier_end_date.

Do you just want `lead()`?
select user_id, subscription_tier, activity_date as tier_start_date,
lead(activity_date) over (partition by user_id order by activity_date) as tier_end_date
from t;
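Note that the `lead()` query returns one row per activity date. If one row per uninterrupted stretch on a tier is wanted instead, a common approach is "gaps and islands": flag each row whose tier differs from the user's previous row, take a running sum of the flags to number the stretches, then aggregate. Below is a minimal sketch of that idea, run through Python's sqlite3 module so it can be checked end to end. The table name `t` comes from the answer; the sample rows are made up, the query assumes SQLite 3.25+ (window functions), one row per user per date, and that tier_end_date means the last active date on the tier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (user_id int, subscription_tier text, activity_date text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        # made-up sample data: user 1 goes basic -> pro -> back to basic
        (1, "basic", "2023-01-01"),
        (1, "basic", "2023-01-05"),
        (1, "pro", "2023-01-10"),
        (1, "pro", "2023-01-12"),
        (1, "basic", "2023-02-01"),
        (2, "pro", "2023-01-03"),
    ],
)

query = """
with flagged as (
    select user_id, subscription_tier, activity_date,
           -- 1 when the tier differs from the user's previous row (or there is no previous row)
           case when subscription_tier =
                     lag(subscription_tier) over (partition by user_id order by activity_date)
                then 0 else 1 end as is_new_stint
    from t
),
numbered as (
    select user_id, subscription_tier, activity_date,
           -- the running total of the flags numbers each uninterrupted stint
           sum(is_new_stint) over (partition by user_id order by activity_date) as stint
    from flagged
)
select user_id, subscription_tier,
       min(activity_date) as tier_start_date,
       max(activity_date) as tier_end_date
from numbered
group by user_id, stint, subscription_tier
order by user_id, tier_start_date
"""
stints = conn.execute(query).fetchall()
for row in stints:
    print(row)
```

With this sample data, user 1 gets three rows (basic, pro, basic again), which is the case that plain Min/Max per tier cannot represent.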

Related

Reduce consecutive rows with the same value in a column to a single row

I am trying to create a biometric attendance system that receives data from a biometric device.
The structure of the attendance table received from the device looks something like this.
The table originally has a lot of data with more than one emp_no, but I created a stored procedure that extracts details of one employee on a specific date as seen above.
The challenge I am facing right now is that I need to analyze this table and restructure it (recreate another table) so that it has alternating check-ins and check-outs (each check-in must be followed by a check-out and vice versa). For consecutive check-ins I should take the earliest one, while for consecutive check-outs I should take the latest one.
Any ideas on how to go about this will be very much appreciated.
Thank you.
Use the window functions lag() and lead():
select emp_id, att_date, att_time, status
from (
select
emp_id, att_date, att_time, status,
case
when status = 'checkin' then lag(status) over w is distinct from 'checkin'
else lead(status) over w is distinct from 'checkout'
end as visible
from my_table
window w as (partition by emp_id, att_date order by att_time)
) s
where visible
Db<>fiddle.
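To see what the filter keeps, here is a small sketch of the same lag()/lead() idea run against an in-memory SQLite database. It is an adaptation, not the answer's exact query: SQLite's null-safe `is not` stands in for `is distinct from`, the window definition is written inline, and the sample rows for a single employee and day are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_table (emp_id int, att_date text, att_time text, status text)")
conn.executemany(
    "insert into my_table values (?, ?, ?, ?)",
    [
        # invented sample: two check-ins in a row, then two check-outs in a row
        (7, "2020-03-02", "08:00", "checkin"),
        (7, "2020-03-02", "08:05", "checkin"),
        (7, "2020-03-02", "12:00", "checkout"),
        (7, "2020-03-02", "12:10", "checkout"),
        (7, "2020-03-02", "13:00", "checkin"),
        (7, "2020-03-02", "17:00", "checkout"),
    ],
)

query = """
select att_time, status
from (
    select emp_id, att_date, att_time, status,
           case
               -- keep a check-in only if the previous row is not also a check-in
               when status = 'checkin'
                   then lag(status) over (partition by emp_id, att_date order by att_time)
                        is not 'checkin'
               -- keep a check-out only if the next row is not also a check-out
               else lead(status) over (partition by emp_id, att_date order by att_time)
                    is not 'checkout'
           end as visible
    from my_table
) s
where visible
order by att_time
"""
kept = conn.execute(query).fetchall()
for row in kept:
    print(row)
```

The earliest of the consecutive check-ins (08:00) and the latest of the consecutive check-outs (12:10) survive, so the remaining rows alternate.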

SQL database – track user's activity

I have a simple database structure, with models and relations:
Models:
User
Group
Activity
Relations:
User/Group –> User belongs to Group, Group has many Users
User/Activity –> User has many Activities, Activity belongs to User
Group/Activity –> Activity belongs to Group, Group has many Activities
My problem – I want to be able to track the number of activities performed in the group by the user within a given period (probably per week, but possibly per day), and I do not know the best or most performant way to achieve this.
Theoretically, I can just perform a query that would count those activities based on the created_at date attribute but I assume this is not the most performant way (am I wrong?)
Does anyone know how to properly structure something like this?
Going by the relationships you describe, your activity table has foreign keys user_id and group_id, so you can count a user's activities within a group on a given day:
SELECT a.user_id, a.group_id, count(a.user_id) AS activity_count
FROM activity a
WHERE a.user_id = '123'
AND a.group_id = '1'
AND a.activity_time >= '2019-08-31'
AND a.activity_time < '2019-08-31' + INTERVAL 1 DAY
GROUP BY a.user_id, a.group_id
Create a composite index on (user_id, group_id, activity_time) so that retrieval stays fast as the table grows.
Please note this query is in MySQL.
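For the per-week variant mentioned in the question, the same counting approach can be grouped by a week bucket. The sketch below is only an illustration run against in-memory SQLite: the `activity` table and its columns follow the answer, the index name and sample rows are invented, and SQLite's `strftime('%Y-%W', ...)` stands in for a MySQL week function such as `YEARWEEK()`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table activity (user_id int, group_id int, activity_time text)")
# composite index matching the filter and grouping columns (index name is hypothetical)
conn.execute(
    "create index idx_activity_user_group_time "
    "on activity (user_id, group_id, activity_time)"
)
conn.executemany(
    "insert into activity values (?, ?, ?)",
    [
        # invented sample data
        (123, 1, "2019-08-26 09:00:00"),
        (123, 1, "2019-08-31 18:30:00"),  # same week as the row above
        (123, 1, "2019-09-02 08:15:00"),  # following week
        (123, 2, "2019-08-31 10:00:00"),
        (456, 1, "2019-08-31 11:00:00"),
    ],
)

query = """
select user_id, group_id,
       strftime('%Y-%W', activity_time) as year_week,  -- week bucket, weeks start on Monday
       count(*) as activity_count
from activity
where activity_time >= '2019-08-01'
  and activity_time <  '2019-10-01'
group by user_id, group_id, year_week
order by user_id, group_id, year_week
"""
weekly = conn.execute(query).fetchall()
for row in weekly:
    print(row)
```

Counting on read like this is usually adequate with the index in place; a separate pre-aggregated table only becomes worth its upkeep if the counts are read far more often than activities are written.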

How to (partially) order selection so that rows are ordered for each specific user id?

Typically, using ORDER BY on a large selection runs on a single node, which is not preferable. To remedy this, I am fine with having my result ordered so that each user's timestamps are in ascending order, but not necessarily globally.
How can one achieve this?
E.g.
SELECT *
FROM table
ORDER BY timestamp OVER (PARTITION BY user_id)
This should give a result that is ordered by timestamp when considering one user at a time.
Try this:
ORDER BY user_id, timestamp
This sorts by user first, and each user's rows are then sorted by timestamp.
UPDATE:
The sort direction can also be set per column:
ORDER BY user_id DESC, timestamp ASC
The question is not very specific, so it's hard to give a specific answer. You can use a query of this form:
#standardSQL
SELECT
user_id,
ARRAY_AGG(
STRUCT(user_timestamp, user_event)
ORDER BY user_timestamp
) AS user_attributes
FROM UserTable
GROUP BY user_id
This builds an array with user attributes for each user ID, ordered by a user timestamp.

I'm trying to get a query to get activity, excluding the day of sign up (first occurrence of a specific id)

I have a basic table "events" with the row id and date. It records each time a user interacts with a site (registers or logs in).
I want to calculate active users per day. But I want to exclude all activity on the day the user signed up.
For now, the same table format but without the ids from the first occurrence would solve my problem.
OK, so what you essentially need for each event is:
user_id
activity_date
first_activity_date
You should already have the first two from your events table. For first_activity_date, you can use a window function. The following is a sample query:
SELECT
activity_date,
COUNT(*) AS cnt
FROM (
SELECT
user_id,
activity_date,
MIN(activity_date) OVER (PARTITION BY user_id) AS first_activity_date
FROM
[project_id:dataset.events]
)
WHERE
activity_date != first_activity_date
GROUP BY
1
ORDER BY
1
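The window-function approach can be tried on a toy table. The sketch below runs it against in-memory SQLite (3.25+ assumed) rather than BigQuery, with invented sample rows. One deliberate difference, labeled as an assumption: it uses count(distinct user_id) so that each day reports active users, as the question asks, rather than the number of events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table events (user_id int, activity_date text)")
conn.executemany(
    "insert into events values (?, ?)",
    [
        # invented sample data
        (1, "2020-01-01"),  # user 1 signs up
        (1, "2020-01-01"),
        (1, "2020-01-02"),
        (1, "2020-01-02"),  # second event on the same day, same user
        (2, "2020-01-02"),  # user 2 signs up
        (2, "2020-01-03"),
        (1, "2020-01-03"),
    ],
)

query = """
select activity_date,
       count(distinct user_id) as active_users
from (
    select user_id, activity_date,
           -- earliest date per user, repeated on every row of that user
           min(activity_date) over (partition by user_id) as first_activity_date
    from events
) e
where activity_date <> first_activity_date  -- drop everything on the sign-up day
group by activity_date
order by activity_date
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

2020-01-01 disappears entirely because its only activity is user 1's sign-up day, and user 2 is not counted on 2020-01-02 for the same reason.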
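Try this out:
SELECT e.date, e.userid FROM consumer.events e WHERE e.date > (SELECT MIN(f.date) FROM consumer.events f WHERE f.userid = e.userid) ORDER BY e.date, e.userid
Explanation
This select statement limits the result to rows dated after the user's earliest event, i.e. rows where the userid is not showing up for the first time.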

Assigning a field value to all uniques in a table

I have an analytics table with the following fields:
unique_id,
revenue,
pagename
An analytics record is created for every page a user visits. The question I would like to answer is this: how much revenue is coming from users who have been to a maps screen (pagename=mapview) versus users who have not? The revenue is only recorded when the user hits a page with a transactional element. I'm not keeping track of whether the user has been to a maps view once they hit a page with transaction elements.
Do I need to create a separate table that tracks whether a particular user (unique_id) has been to a map screen and then join this with the original table? Or is there an easier way?
You can do this with aggregation -- two levels of aggregation:
select isMapView, sum(revenue), count(*) as numUsers
from (select unique_id, sum(revenue) as revenue,
max(case when pagename = 'mapview' then 1 else 0 end) as isMapView
from t
group by unique_id
) u
group by isMapView;
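As a quick check of how the two levels fit together, here is the same query run against an in-memory SQLite table. The table name `t` and the columns come from the post; the sample rows are invented, and the column aliases and final ORDER BY are added only to make the output easy to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (unique_id text, revenue int, pagename text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        # invented sample data: u1 and u3 visit the map screen, u2 never does
        ("u1", 0, "mapview"),
        ("u1", 20, "checkout"),
        ("u2", 0, "home"),
        ("u2", 15, "checkout"),
        ("u3", 0, "mapview"),
        ("u3", 5, "checkout"),
        ("u3", 10, "checkout"),
    ],
)

query = """
select isMapView, sum(revenue) as totalRevenue, count(*) as numUsers
from (
    -- inner level: one row per user with total revenue and a map-view flag
    select unique_id,
           sum(revenue) as revenue,
           max(case when pagename = 'mapview' then 1 else 0 end) as isMapView
    from t
    group by unique_id
) u
-- outer level: one row per flag value
group by isMapView
order by isMapView
"""
summary = conn.execute(query).fetchall()
for row in summary:
    print(row)
```

No separate tracking table is needed: the inner max(case ...) marks a user as a map viewer if any of their rows has pagename = 'mapview', regardless of the order in which pages were visited.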