Compare number of users on year ago - sql

I want to get the count of distinct current users and one year ago using sql I have used the following query but it gets the current number and zero for one year ago values.
select CAST(root_tstamp as DATE) as Date,
count(DISTINCT userid) as sign_ups,
count(Distinct case when CAST(root_tstamp as DATE) = ADD_MONTHS(CAST(root_tstamp as
DATE),-12) then userid end) as Last_year_Signups
FROM table1

I have added sql server code.
For other type of DBs if there are any difference in syntax it will be only in date_add function.
;with table1 as(select '2022-01-25 13:14:37.840' as root_tstamp,1 as userid
union select '2022-01-25 13:14:37.840' as root_tstamp,2 as userid
union select '2021-01-25 13:14:37.840' as root_tstamp,1 as userid
union select '2021-01-25 13:14:37.840' as root_tstamp,2 as userid
union select '2022-01-25 13:14:37.840' as root_tstamp,1 as userid
)
select CAST(A.root_tstamp as DATE)as date
,count(DISTINCT a.userid) as sign_ups
,count(Distinct dateadd(MONTH,-12,CAST(B.root_tstamp as DATE))) as Last_year_Signups
from table1 a
left join table1 b on a.userid = b.userid and CAST(A.root_tstamp as DATE) = dateadd(MONTH,12, CAST(B.root_tstamp as DATE))
Result
date
sign_ups
Last_year_Signups
2021-01-25
2
0
2022-01-25
2
1

Related

how to 'CASE WHEN DATEDIFF(...) THEN DATEDIFF(..)..' without grouping by date

Using this query:
select a.username, count(*)
from A a inner join (
/* duplicate results will be removed by the union */
select username, date from B union
select username, date from C union
select username, date from D union
select username, date from E union
select username, date from F
) u on u.username = a.username
where a.min_date >= u.date
group by username;
which gets the count of unique dates across some tables and groups them by username.
I want to get the difference of the first date and the last date for each username within an 30 day interval i.e. something like this (but obviously it's not working):
CASE WHEN DATEDIFF(DAY, min_date, [Date]) <= 30
THEN DATEDIFF(DAY, min([Date]), max([Date])) ELSE NULL END as diff
which tells me I need to groupby min_date and [Date], but this is something I believe I don't want to do. Is there a workaround? If I use some aggregate function on case e.g. MAX(CASE... then I get the cannot use aggregated function containing an aggregate function.

SQL count display 0s using 1 table

I have a table called requests, and Im looking to count how many requests where the column rideId is not null each day during the last week. I have the following query:
Select count(*), dayname(time) as Day
from request
where time >= (select current_timestamp - interval 7 day) and rideId is not null
group by dayname(time)
order by dayofweek(Day);
How can I make it so it shows me those days where there is no request with rideId and count should be 0
Table is: Request(userId, time, rideId)
Move the not null check into your count, and join to a calendar table to bring in the missing days.
SELECT
t1.dname,
COALESCE(t2.numRides, 0) AS numRides
FROM
(
SELECT 'Monday' AS dname, 2 AS dow UNION ALL
SELECT 'Tuesday', 3 UNION ALL
SELECT 'Wednesday', 4 UNION ALL
SELECT 'Thursday', 5 UNION ALL
SELECT 'Friday', 6 UNION ALL
SELECT 'Saturday', 7 UNION ALL
SELECT 'Sunday', 1
) t1
LEFT JOIN
(
SELECT DAYNAME(time) AS dname, COUNT(rideId) AS numRides
FROM request
WHERE time >= DATE_SUB(CURDATE(),INTERVAL 7 DAY)
GROUP BY DAYNAME(time)
) t2
ON t1.dname = t2.dname
ORDER BY t1.dow;
Select a.day, coalesce(b.cnt, 0) as cnt
from (--select all days here) a
left join
(select dayname(time) as day, count(*) as cnt
from requests
where some_condition
group by day) b
using a.day = b.day
order by day;

BigQuery Monthly Active Users?

I'm currently working off a query from this post. That query is written in Legacy SQL and will not work in my environment. I've modified the query to use the modern SQL functions and updated the SELECT date as date to use timestamp_micros.
I should also mention that the rows I'm trying to select are coming in from Firebase Analytics.
My Query:
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date,
SUM(CASE WHEN period = 7 THEN users END) as days_07,
SUM(CASE WHEN period = 14 THEN users END) as days_14,
SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date,
periods.period as period,
COUNT(DISTINCT user_dim.app_info.app_instance_id) as users
FROM `com_sidearm_fanapp_uiowa_IOS.*` as activity
CROSS JOIN
UNNEST(event_dim) as event
CROSS JOIN (
SELECT
FORMAT_TIMESTAMP('%Y-%m-%d', TIMESTAMP_MICROS(event.timestamp_micros)) as date
FROM `com_sidearm_fanapp_uiowa_IOS.*`
CROSS JOIN
UNNEST(event_dim) as event
GROUP BY event.timestamp_micros
) as dates
CROSS JOIN (
SELECT
period
FROM
(
SELECT 7 as period
UNION ALL
SELECT 14 as period
UNION ALL
SELECT 30 as period
)
) as periods
WHERE
dates.date >= activity.date
AND
SAFE_CAST(FLOOR(TIMESTAMP_DIFF(dates.date, activity.date, DAY)/periods.period) AS INT64) = 0
GROUP BY 1,2
)
CROSS JOIN
UNNEST(event_dim) as event
GROUP BY date
ORDER BY date DESC
Column name period is ambiguous at [24:13] error.
to fix this particular error - you should fix below
CROSS JOIN (
SELECT
period
FROM
(SELECT 7 as period),
(SELECT 14 as period),
(SELECT 30 as period)
) as periods
so it should look like:
CROSS JOIN (
SELECT
period
FROM
(SELECT 7 as period UNION ALL
SELECT 14 as period UNION ALL
SELECT 30 as period)
) as periods
Answer on your updated question
Try below. I didn't have chance to test it but hope it can help you fix your query
SELECT
date,
SUM(CASE WHEN period = 7 THEN users END) as days_07,
SUM(CASE WHEN period = 14 THEN users END) as days_14,
SUM(CASE WHEN period = 30 THEN users END) as days_30
FROM (
SELECT
activity.date as date,
periods.period as period,
COUNT(DISTINCT user) as users
FROM (
SELECT
event.timestamp_micros as date,
user_dim.app_info.app_instance_id as user
FROM `yourTable` CROSS JOIN UNNEST(event_dim) as event
) as activity
CROSS JOIN (
SELECT
event.timestamp_micros as date
FROM `yourTable` CROSS JOIN UNNEST(event_dim) as event
GROUP BY event.timestamp_micros
) as dates
CROSS JOIN (
SELECT period
FROM
(SELECT 7 as period UNION ALL
SELECT 14 as period UNION ALL
SELECT 30 as period)
) as periods
WHERE dates.date >= activity.date
AND SAFE_CAST(FLOOR(TIMESTAMP_DIFF(TIMESTAMP_MICROS(dates.date), TIMESTAMP_MICROS(activity.date), DAY)/periods.period) AS INT64) = 0
GROUP BY 1,2
)
GROUP BY date
ORDER BY date DESC

Calculating per day in SQL

I have an sql table like that:
Id Date Price
1 21.09.09 25
2 31.08.09 16
1 23.09.09 21
2 03.09.09 12
So what I need is to get min and max date for each id and dif in days between them. It is kind of easy. Using SQLlite syntax:
SELECT id,
min(date),
max(date),
julianday(max(date)) - julianday(min(date)) as dif
from table group by id
Then the tricky one: how can I receive the price per day during this difference period. I mean something like this:
ID Date PricePerDay
1 21.09.09 25
1 22.09.09 0
1 23.09.09 21
2 31.08.09 16
2 01.09.09 0
2 02.09.09 0
2 03.09.09 12
I create a cte as you mentioned with calendar but dont know how to get the desired result:
WITH RECURSIVE
cnt(x) AS (
SELECT 0
UNION ALL
SELECT x+1 FROM cnt
LIMIT (SELECT ((julianday('2015-12-31') - julianday('2015-01-01')) + 1)))
SELECT date(julianday('2015-01-01'), '+' || x || ' days') as date FROM cnt
p.s. If it will be in sqllite syntax-would be awesome!
You can use a recursive CTE to calculate all the days between the min date and max date. The rest is just a left join and some logic:
with recursive cte as (
select t.id, min(date) as thedate, max(date) as maxdate
from t
group by id
union all
select cte.id, date(thedate, '+1 day') as thedate, cte.maxdate
from cte
where cte.thedate < cte.maxdate
)
select cte.id, cte.date,
coalesce(t.price, 0) as PricePerDay
from cte left join
t
on cte.id = t.id and cte.thedate = t.date;
One method is using a tally table.
To build a list of dates and join that with the table.
The date stamps in the DD.MM.YY format are first changed to the YYYY-MM-DD date format.
To make it possible to actually use them as a date in the SQL.
At the final select they are formatted back to the DD.MM.YY format.
First some test data:
create table testtable (Id int, [Date] varchar(8), Price int);
insert into testtable (Id,[Date],Price) values (1,'21.09.09',25);
insert into testtable (Id,[Date],Price) values (1,'23.09.09',21);
insert into testtable (Id,[Date],Price) values (2,'31.08.09',16);
insert into testtable (Id,[Date],Price) values (2,'03.09.09',12);
The SQL:
with Digits as (
select 0 as n
union all select 1
union all select 2
union all select 3
union all select 4
union all select 5
union all select 6
union all select 7
union all select 8
union all select 9
),
t as (
select Id,
('20'||substr([Date],7,2)||'-'||substr([Date],4,2)||'-'||substr([Date],1,2)) as [Date],
Price
from testtable
),
Dates as (
select Id, date(MinDate,'+'||(d2.n*10+d1.n)||' days') as [Date]
from (
select Id, min([Date]) as MinDate, max([Date]) as MaxDate
from t
group by Id
) q
join Digits d1
join Digits d2
where date(MinDate,'+'||(d2.n*10+d1.n)||' days') <= MaxDate
)
select d.Id,
(substr(d.[Date],9,2)||'.'||substr(d.[Date],6,2)||'.'||substr(d.[Date],3,2)) as [Date],
coalesce(t.Price,0) as Price
from Dates d
left join t on (d.Id = t.Id and d.[Date] = t.[Date])
order by d.Id, d.[Date];
The recursive SQL below was totally inspired by the excellent answer from Gordon Linoff.
And a recursive SQL is probably more performant for this anyway.
(He should get the 15 points for the accepted answer).
The difference in this version is that the datestamps are first formatted to YYYY-MM-DD.
with t as (
select Id,
('20'||substr([Date],7,2)||'-'||substr([Date],4,2)||'-'||substr([Date],1,2)) as [Date],
Price
from testtable
),
cte as (
select Id, min([Date]) as [Date], max([Date]) as MaxDate from t
group by Id
union all
select Id, date([Date], '+1 day'), MaxDate from cte
where [Date] < MaxDate
)
select cte.Id,
(substr(cte.[Date],9,2)||'.'||substr(cte.[Date],6,2)||'.'||substr(cte.[Date],3,2)) as [Date],
coalesce(t.Price, 0) as PricePerDay
from cte
left join t
on (cte.Id = t.Id and cte.[Date] = t.[Date])
order by cte.Id, cte.[Date];

SQL Server: Attempting to output a count with a date

I am trying to write a statement and just a bit puzzled what is the best way to put it together. So I am doing a UNION on a number of tables and then from there I want to produce as the output a count for the UserID within that day.
So I will have numerous tables union such as:
Order ID, USERID, DATE, Task Completed.
UNION
Order ID, USERID, DATE, Task Completed
etc
Above is layout of the table which will have 4 tables union together with same names.
Then statement output I want is for a count of USERID that occurred within the last 24 hours.
So output should be:
USERID--- COUNT OUTPUT-- DATE
I was attempting a WHERE statement but think the output is not what I am after exactly, just thinking if anyone can point me in the right direction and if there is alternative way compared to the union? Maybe a joint could be a better alternative, any help be appreciated.
I will eventually then put this into a SSRS report, so it gets updated daily.
You can try this:
select USERID, count(*) as [COUNT], cast(DATE as date) as [DATE]
from
(select USERID, DATE From SomeTable1
union all
select USERID, DATE From SomeTable2
....
) t
where DATE <= GETDATE() AND DATE >= DATEADD(hh, -24, GETDATE())
group by USERID, cast(DATE as date)
First, you should use union all rather than union. Second, you need to aggregate and use count distinct to get what you want:
So, the query you want is something like:
select count(distinct userid)
from ((select date, userid
from table1
where date >= '2015-05-26'
) union all
(select date, userid
from table2
where date >= '2015-05-26'
) union all
(select date, userid
from table3
where date >= '2015-05-26'
)
) du
Note that this hardcodes the date. In SQL Server, you would do something like:
date >= cast(getdate() - 1 as date)
And in MySQL
date >= date_sub(curdate(), interval 1 day)
EDIT:
I read the question as wanting a single day. It is easy enough to extend to all days:
select cast(date as date) as dte, count(distinct userid)
from ((select date, userid
from table1
) union all
(select date, userid
from table2
) union all
(select date, userid
from table3
)
) du
group by cast(date as date)
order by dte;
For even more readability, you could use a CTE:
;WITH cte_CTEName AS(
SELECT UserID, Date, [Task Completed] FROM Table1
UNION
SELECT UserID, Date, [Task Completed] FROM Table2
etc
)
SELECT COUNT(UserID) AS [Count] FROM cte_CTEName
WHERE Date <= GETDATE() AND Date >= DATEADD(hh, -24, GETDATE())
I think this is what you are trying to achieve...
Select
UserID,
Date,
Count(1)
from
(Select *
from table1
Union All
Select *
from table2
Union All
Select *
from table3
Union All
Select *
from table4
) a
Group by
Userid,
Date