Today vs weeks ago with aggregate function - sql

I'm working on the following Presto SQL query, using an inline FILTER clause to get a side-by-side comparison of the current date range against the same days one week earlier.
In my case the current date range is 2017-09-13 to 2017-09-14.
So far I get the results below, which unfortunately are not what I want.
Any kind of help would be greatly appreciated.
SELECT
DATE_TRUNC('day',DATE_PARSE(CAST(sample.datep AS VARCHAR),'%Y%m%d')) AS date,
CAST(SUM(sample.page_views) FILTER (WHERE sample.datep BETWEEN 20170913 AND 20170914) AS DOUBLE) AS page_views,
CAST(SUM(sample.page_views) FILTER (WHERE sample.datep BETWEEN 20170906 AND 20170907) AS DOUBLE) AS page_views_weeks_ago
FROM
sample
WHERE
(
datep BETWEEN 20170906 AND 20170914
)
GROUP BY
1
ORDER BY
1 ASC
LIMIT 50
Actual result:
+------------+------------+----------------------+
| date | page_views | page_views_weeks_ago |
+------------+------------+----------------------+
| 2017-09-06 | 0 | 990,929 |
| 2017-09-07 | 0 | 913,802 |
| 2017-09-08 | 0 | 0 |
| 2017-09-09 | 0 | 0 |
| 2017-09-10 | 0 | 0 |
| 2017-09-11 | 0 | 0 |
| 2017-09-12 | 0 | 0 |
| 2017-09-13 | 1,507,715 | 0 |
| 2017-09-14 | 48,625 | 0 |
+------------+------------+----------------------+
Expected result:
+------------+------------+----------------------+
| date | page_views | page_views_weeks_ago |
+------------+------------+----------------------+
| 2017-09-13 | 1,507,715 | 990,929 |
| 2017-09-14 | 48,625 | 913,802 |
+------------+------------+----------------------+

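You can achieve this by joining the table to itself, matching each day to the day seven days earlier. For brevity, I assume that we have a proper date field so that date subtraction can be done easily. Aggregate per day before joining; otherwise, if a day has more than one row, the join multiplies rows and inflates both sums.
WITH daily AS (
    SELECT date, SUM(page_views) AS page_views
    FROM sample
    GROUP BY date
)
SELECT curr.date,
    curr.page_views,
    prev.page_views AS page_views_weeks_ago
FROM daily curr
JOIN daily prev ON prev.date = curr.date - INTERVAL '7' DAY
ORDER BY 1 ASC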
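
The self-join idea can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module rather than Presto, so the date arithmetic (SQLite's date(x, '-7 days')) is not what Presto would use. The `day` column and the split of some daily totals into several rows are assumptions made for illustration; the daily totals themselves come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one text column holding an ISO date, one counter.
conn.execute("CREATE TABLE sample (day TEXT, page_views INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [
        ("2017-09-06", 990000), ("2017-09-06", 929),    # two rows for one day
        ("2017-09-07", 913802),
        ("2017-09-13", 1500000), ("2017-09-13", 7715),  # two rows for one day
        ("2017-09-14", 48625),
    ],
)

query = """
WITH daily AS (
    -- Collapse to one row per day first so the join cannot multiply rows.
    SELECT day, SUM(page_views) AS page_views
    FROM sample
    GROUP BY day
)
SELECT curr.day,
       curr.page_views,
       prev.page_views AS page_views_weeks_ago
FROM daily AS curr
JOIN daily AS prev ON prev.day = date(curr.day, '-7 days')
ORDER BY curr.day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only days that have a counterpart seven days earlier survive the inner join, which is why 2017-09-06 and 2017-09-07 do not appear as rows of their own.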

Related

SQL Server - Counting total number of days user had active contracts

I want to count the number of days on which a user had an active contract, based on a table with start and end dates for each service contract. Each day with any activity should be counted once, no matter whether the customer had 1 or 5 contracts active at the same time.
+---------+-------------+------------+------------+
| USER_ID | CONTRACT_ID | START_DATE | END_DATE |
+---------+-------------+------------+------------+
| 1 | 14 | 18.02.2021 | 18.04.2022 |
| 1 | 13 | 02.01.2019 | 02.01.2020 |
| 1 | 12 | 01.01.2018 | 01.01.2019 |
| 1 | 11 | 13.02.2017 | 13.02.2019 |
| 2 | 23 | 19.06.2021 | 18.04.2022 |
| 2 | 22 | 01.07.2019 | 01.07.2020 |
| 2 | 21 | 19.01.2019 | 19.01.2020 |
+---------+-------------+------------+------------+
As a result I want a table:
+---------+--------------------+
| USER_ID | DAYS_BEEING_ACTIVE |
+---------+--------------------+
| 1 | 1477 |
| 2 | 832 |
+---------+--------------------+
Where
1477 stands for 1053 (days from 13.02.2017 to 02.01.2020, during which the user always had at least one active contract) + 424 (days from 18.02.2021 to 18.04.2022)
832 stands for 529 (days from 19.01.2019 to 01.07.2020) + 303 (days from 19.06.2021 to 18.04.2022).
I tried some queries with joins, DATEDIFF, and CASE WHEN conditions, but nothing worked. I'll be grateful for any help.
If you don't have a Tally/Numbers table (highly recommended), you can use an ad-hoc one.
Example:
Select User_ID
,Days = count(DISTINCT dateadd(DAY,N,Start_Date))
from YourTable A
Join ( Select Top 10000 N=Row_Number() Over (Order By (Select NULL))
From master..spt_values n1, master..spt_values n2
) B
On N<=DateDiff(DAY,Start_Date,End_Date)
Group By User_ID
Results
User_ID Days
1 1477
2 832
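
The same tally-table idea can be checked against the sample data with a small sketch. It runs on SQLite through Python's sqlite3 module, not SQL Server, so the numbers table is built with a recursive CTE and the date arithmetic uses SQLite's date() and julianday(); the table name `contracts` is an assumption. A tally of 1000 numbers is enough here because the longest sample contract spans 730 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts ("
    "user_id INTEGER, contract_id INTEGER, start_date TEXT, end_date TEXT)"
)
# Sample rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO contracts VALUES (?, ?, ?, ?)",
    [
        (1, 14, "2021-02-18", "2022-04-18"),
        (1, 13, "2019-01-02", "2020-01-02"),
        (1, 12, "2018-01-01", "2019-01-01"),
        (1, 11, "2017-02-13", "2019-02-13"),
        (2, 23, "2021-06-19", "2022-04-18"),
        (2, 22, "2019-07-01", "2020-07-01"),
        (2, 21, "2019-01-19", "2020-01-19"),
    ],
)

query = """
WITH RECURSIVE tally(n) AS (
    -- Ad-hoc numbers table: 1, 2, ..., 1000.
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM tally WHERE n < 1000
)
SELECT c.user_id,
       -- Expand each contract into its individual days, then count each
       -- calendar day once even when contracts overlap.
       COUNT(DISTINCT date(c.start_date, '+' || t.n || ' days')) AS days
FROM contracts AS c
JOIN tally AS t
  ON t.n <= julianday(c.end_date) - julianday(c.start_date)
GROUP BY c.user_id
ORDER BY c.user_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

As in the answer's query, the numbers start at 1, so each contract contributes the days after its start date up to and including its end date.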

SQL: tricky question for finding lockout dates

Hope you can help. We have a table with two columns Customer_ID and Trip_Date. The customer receives 15% off on their first visit and on every visit where they haven't received the 15% off offer in the past thirty days. How do I write a single SQL query that finds all days where a customer received 15% off?
The table looks like this
+-------------+----------+
| Customer_ID | date     |
+-------------+----------+
| 1           | 01-01-17 |
| 1           | 01-17-17 |
| 1           | 02-04-17 |
| 1           | 03-01-17 |
| 1           | 03-15-17 |
| 1           | 04-29-17 |
| 1           | 05-18-17 |
+-------------+----------+
The desired output would look like this:
+-------------+----------+-------------------+
| Customer_ID | date     | received_discount |
+-------------+----------+-------------------+
| 1           | 01-01-17 | 1                 |
| 1           | 01-17-17 | 0                 |
| 1           | 02-04-17 | 1                 |
| 1           | 03-01-17 | 0                 |
| 1           | 03-15-17 | 1                 |
| 1           | 04-29-17 | 1                 |
| 1           | 05-18-17 | 0                 |
+-------------+----------+-------------------+
We are doing this work in Netezza. I can't think of a way using just window functions, only using recursion and looping. Is there some clever trick that I'm missing?
Thanks in advance,
GF
You didn't tell us what your backend is, nor did you give sample data, expected output, or a sensible data schema :( This is an example based on a guess at the schema, using PostgreSQL as the backend (it would be too messy as a comment):
(I think you have Customer_Id, Trip_Date and LocationId in the trips table?)
select * from trips t1
where not exists (
select * from trips t2
where t1.Customer_id = t2.Customer_id and
t1.Trip_Date > t2.Trip_Date
and t1.Trip_date - t2.Trip_Date < 30
);
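
One caveat on the query above: it keeps only trips with no earlier trip of any kind in the preceding 30 days. On the question's sample data that would leave out 02-04-17, because of the non-discounted visit on 01-17-17, although the desired output marks it as discounted. The rule depends on the date of the last discount, not the last visit, so each row depends on the result for earlier rows. A minimal sketch of that sequential rule in plain Python follows; the function name and the choice that a gap of exactly 30 days earns a new discount are assumptions.

```python
from datetime import date

LOCKOUT_DAYS = 30  # assumption: a gap of 30 days or more earns a new discount


def flag_discounts(trips):
    """Return (customer_id, trip_date, received_discount) per trip, oldest first."""
    last_discount = {}  # customer_id -> date of that customer's latest discount
    flagged = []
    for customer_id, trip_date in sorted(trips):
        previous = last_discount.get(customer_id)
        eligible = previous is None or (trip_date - previous).days >= LOCKOUT_DAYS
        if eligible:
            # Only a discounted visit restarts the lockout window.
            last_discount[customer_id] = trip_date
        flagged.append((customer_id, trip_date, int(eligible)))
    return flagged


# Sample data from the question (dates given there as MM-DD-YY).
trips = [
    (1, date(2017, 1, 1)),
    (1, date(2017, 1, 17)),
    (1, date(2017, 2, 4)),
    (1, date(2017, 3, 1)),
    (1, date(2017, 3, 15)),
    (1, date(2017, 4, 29)),
    (1, date(2017, 5, 18)),
]

result = flag_discounts(trips)
for customer_id, trip_date, received_discount in result:
    print(customer_id, trip_date.isoformat(), received_discount)
```

In SQL the same dependency on earlier results is what pushes the problem toward a recursive query or a procedural loop, as the question suspects.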

Count Query in Select Statement

Could someone explain why the two queries below don't return the same result in SSMS? What do I have to add to the second one to get the first output?
Table
|----------------------------------------|
| RoomBase |
|---------------------|------------------|
| CustomBit1 | Date |
| 1 | 2018-11-01 |
| 1 | 2018-11-01 |
| 1 | 2018-11-01 |
| 1 | 2018-11-01 |
| 1 | 2018-11-01 |
| 1 | 2018-10-01 |
| 0 | 2018-10-01 |
| 0 | 2018-10-01 |
| 0 | 2018-10-01 |
| 1 | 2018-10-01 |
|---------------------|------------------|
Result from first query. [Desired Result.]
|---------------------|
| Count |
|---------------------|
| 2 |
| 5 |
|---------------------|
Result from second query. [Undesired result.]
|---------------------|
| Count |
|---------------------|
| 630 |
| 630 |
|---------------------|
630 comes from the size of the original data set.
-- First query
SELECT ISNULL(COUNT(1), 0) FROM dbo.[RoomBase]
WHERE dbo.[RoomBase].CustomBit1 = 1
GROUP BY [RoomBase].Date
-- Second query
SELECT (SELECT ISNULL(COUNT(1), 0) FROM dbo.[RoomBase]
WHERE dbo.[RoomBase].CustomBit1 = 1)
FROM dbo.[RoomBase]
GROUP BY [RoomBase].Date
All help is appreciated...
Thank you!
You should also filter your sub query by date
SELECT (
SELECT ISnull(Count(1), 0) FROM dbo.[RoomBase] RB WHERE RB.CustomBit1 = 1
AND RB.Date = [RoomBase].Date
)
FROM dbo.[RoomBase]
GROUP BY [RoomBase].Date
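
The difference between the uncorrelated and the date-filtered sub-query can be reproduced on the ten sample rows. This is a sketch on SQLite via Python's sqlite3 module rather than SQL Server; the aliases `R` and `RB` and the ORDER BY clauses (added to make the output order predictable) are additions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RoomBase (CustomBit1 INTEGER, Date TEXT)")
# The ten sample rows from the question.
sample_rows = (
    [(1, "2018-11-01")] * 5
    + [(1, "2018-10-01")]
    + [(0, "2018-10-01")] * 3
    + [(1, "2018-10-01")]
)
conn.executemany("INSERT INTO RoomBase VALUES (?, ?)", sample_rows)


def counts(sql):
    """Run a single-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]


# First query: filter, then count per Date group.
grouped = counts(
    "SELECT COUNT(1) FROM RoomBase WHERE CustomBit1 = 1 "
    "GROUP BY Date ORDER BY Date"
)

# Second query: the sub-query ignores the outer Date, so every group
# gets the table-wide count.
uncorrelated = counts(
    "SELECT (SELECT COUNT(1) FROM RoomBase WHERE CustomBit1 = 1) "
    "FROM RoomBase GROUP BY Date ORDER BY Date"
)

# Fixed query: the sub-query is filtered by the outer group's Date.
correlated = counts(
    "SELECT (SELECT COUNT(1) FROM RoomBase AS RB "
    "        WHERE RB.CustomBit1 = 1 AND RB.Date = R.Date) "
    "FROM RoomBase AS R GROUP BY R.Date ORDER BY R.Date"
)

print(grouped, uncorrelated, correlated)
```

On the sample rows the table-wide count is 7, which plays the role of the 630 seen on the full data.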
The first query counts the records that have CustomBit1 = 1 within each Date group.
The second one counts all records that have CustomBit1 = 1 across the whole table, because the sub-query is not tied to the Date of the outer group; the same total is returned once per group.
So you'll have to correlate the sub-query with the outer Date, something like this:
SELECT
(
    SELECT COUNT(*)
    FROM RoomBase RB
    WHERE RB.CustomBit1 = 1
    AND RB.[Date] = R.[Date]
)
FROM RoomBase R
GROUP BY R.[Date]
Also, ISNULL() is not needed with COUNT(), which always returns an integer and never NULL. SUM() is different: it returns NULL when there are no non-NULL values to add up, so ISNULL() can still be useful there.
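
How the two aggregates behave when no row matches can be checked quickly. The sketch below uses SQLite through Python's sqlite3 module as a stand-in for SQL Server; the single inserted row is made up for the example. Here COUNT() yields 0 while SUM() yields NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE RoomBase (CustomBit1 INTEGER, Date TEXT)")
# One made-up row that does NOT match the filter below.
conn.execute("INSERT INTO RoomBase VALUES (0, '2018-10-01')")

count_result, sum_result = conn.execute(
    "SELECT COUNT(*), SUM(CustomBit1) FROM RoomBase WHERE CustomBit1 = 1"
).fetchone()

print(count_result, sum_result)
```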

SQL Query to Count Number of Responses Matching Certain Criteria over a Date Range and Display as Grouped per Day

I have the following set of survey responses in a table.
It's not very clear but the numbers represent the 'satisfaction' level where:
0 = happy
1 = neutral
2 = sad
+----------+--------+-------+------+-----------+-------------------------+
| friendly | polite | clean | rate | recommend | booking_date |
+----------+--------+-------+------+-----------+-------------------------+
| 2 | 2 | 2 | 0 | 0 | 2014-02-03 00:00:00.000 |
| 1 | 2 | 0 | 0 | 2 | 2014-02-04 00:00:00.000 |
| 0 | 0 | 0 | 1 | 0 | 2014-02-04 00:00:00.000 |
| 1 | 1 | 2 | 0 | 2 | 2014-02-04 00:00:00.000 |
| 0 | 0 | 1 | 2 | 1 | 2014-02-04 00:00:00.000 |
| 2 | 2 | 0 | 2 | 0 | 2014-02-05 00:00:00.000 |
| 2 | 1 | 1 | 0 | 2 | 2014-02-05 00:00:00.000 |
| 1 | 0 | 1 | 2 | 0 | 2014-02-05 00:00:00.000 |
| 0 | 1 | 1 | 1 | 1 | 2014-02-05 00:00:00.000 |
| 1 | 0 | 2 | 2 | 0 | 2014-02-05 00:00:00.000 |
+----------+--------+-------+------+-----------+-------------------------+
For each day I need the totals of each of the columns matching each response option. This will answer the question: "How many people answered happy, neutral or sad for each of the available question options".
I would then require a recordset returned such as:
+------------+----------+------------+--------+----------+------------+--------+
| Date | FriHappy | FriNeutral | FriSad | PolHappy | PolNeutral | PolSad |
+------------+----------+------------+--------+----------+------------+--------+
| 2014-02-03 | 0 | 0 | 1 | 0 | 0 | 1 |
| 2014-02-04 | 2 | 2 | 0 | 2 | 1 | 1 |
| 2014-02-05 | 1 | 2 | 2 | 2 | 2 | 1 |
+------------+----------+------------+--------+----------+------------+--------+
This shows that on the 4th two responders answered "happy" for the "Polite?" question, one answered "Neutral" and one answered "sad".
On the 5th, one responder answered "happy" for the Friendly option, two chose "neutral" and two chose "sad".
I really wish to avoid doing this in code but my SQL isn't great. I did have a look around but couldn't find anything matching this specific requirement.
Obviously this is never going to work (nice if it did) but this may help explain:
SELECT cast(booking_date as date) [booking_date],
COUNT(friendly=0) [FriHappy],
COUNT(friendly=1) [FriNeutral],
COUNT(friendly=2) [FriSad]
FROM [u-rate-gatwick-qsm].[dbo].[Questions]
WHERE booking_date >= '2014-02-01'
AND booking_date <= '2014-03-01'
GROUP BY cast(booking_date as date)
Any pointers would be much appreciated.
Many thanks.
Here is a working version of your sample query:
SELECT cast(booking_date as date) as [booking_date],
sum(case when friendly = 0 then 1 else 0 end) as [FriHappy],
sum(case when friendly = 1 then 1 else 0 end) as [FriNeutral],
sum(case when friendly = 2 then 1 else 0 end) as [FriSad]
FROM [u-rate-gatwick-qsm].[dbo].[Questions]
WHERE booking_date >= '2014-02-01' AND booking_date <= '2014-03-01'
GROUP BY cast(booking_date as date)
ORDER BY min(booking_date);
Your expression count(friendly = 0) doesn't work in SQL Server. Even if it did, it would be the same as count(friendly) -- that is, the number of non-NULL values in the column. Remember what count() does. It counts the number of non-NULL values.
The above logic says: add 1 when there is a match to the appropriate friendly value.
By the way, SQL Server doesn't guarantee the ordering of results from an aggregation, so I also added an order by clause. The min(booking_date) is just an easy way of ordering by the date.
And, I didn't make the change, but I think the second condition in the where should be < rather than <= so you don't include bookings on March 1st (even one at exactly midnight).
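
The conditional-aggregation pattern extends to the other question columns in the same way. Below is a sketch that runs on SQLite via Python's sqlite3 module instead of SQL Server: the helper `tally_columns` is hypothetical, only the friendly and polite columns of the sample data are loaded, and the upper bound uses < as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Questions (friendly INTEGER, polite INTEGER, booking_date TEXT)"
)
# friendly and polite values from the question's sample rows.
conn.executemany(
    "INSERT INTO Questions VALUES (?, ?, ?)",
    [
        (2, 2, "2014-02-03 00:00:00.000"),
        (1, 2, "2014-02-04 00:00:00.000"),
        (0, 0, "2014-02-04 00:00:00.000"),
        (1, 1, "2014-02-04 00:00:00.000"),
        (0, 0, "2014-02-04 00:00:00.000"),
        (2, 2, "2014-02-05 00:00:00.000"),
        (2, 1, "2014-02-05 00:00:00.000"),
        (1, 0, "2014-02-05 00:00:00.000"),
        (0, 1, "2014-02-05 00:00:00.000"),
        (1, 0, "2014-02-05 00:00:00.000"),
    ],
)


def tally_columns(column, prefix):
    """Build one SUM(CASE ...) expression per answer value (0, 1, 2)."""
    labels = ["Happy", "Neutral", "Sad"]
    return ", ".join(
        f"SUM(CASE WHEN {column} = {value} THEN 1 ELSE 0 END) AS {prefix}{label}"
        for value, label in enumerate(labels)
    )


query = f"""
SELECT date(booking_date) AS day,
       {tally_columns('friendly', 'Fri')},
       {tally_columns('polite', 'Pol')}
FROM Questions
WHERE booking_date >= '2014-02-01' AND booking_date < '2014-03-01'
GROUP BY date(booking_date)
ORDER BY day
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The column and prefix names are fixed strings in the script, so building the SQL text with an f-string is safe here; user-supplied values should never be spliced in this way.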

How to limit results by SUM

I have a table of events called event. For the purpose of this question it only has one field called date.
The following query returns me a number of events that are happening on each date for the next 14 days:
SELECT
DATE_FORMAT( ev.date, '%Y-%m-%d' ) as short_date,
count(*) as date_count
FROM event ev
WHERE ev.date >= NOW()
GROUP BY short_date
ORDER BY ev.start_date ASC
LIMIT 14
The result could be as follows:
+------------+------------+
| short_date | date_count |
+------------+------------+
| 2010-03-14 | 1 |
| 2010-03-15 | 2 |
| 2010-03-16 | 9 |
| 2010-03-17 | 8 |
| 2010-03-18 | 11 |
| 2010-03-19 | 14 |
| 2010-03-20 | 13 |
| 2010-03-21 | 7 |
| 2010-03-22 | 2 |
| 2010-03-23 | 3 |
| 2010-03-24 | 3 |
| 2010-03-25 | 6 |
| 2010-03-26 | 23 |
| 2010-03-27 | 14 |
+------------+------------+
14 rows in set (0.06 sec)
Let's say I want to display these events by date. At the same time I only want to display a maximum of 10 events at a time. How would I do this?
Somehow I need to limit this result by the SUM of the date_count field but I do not know how.
Anybody run into this problem before?
Any help would be appreciated. Thanks
Edited:
The extra requirement (a crucial one, oops), which I forgot in my original post, is that I only want whole days.
i.e., given a limit of 10, it would return only the following rows:
+------------+------------+
| short_date | date_count |
+------------+------------+
| 2010-03-14 | 1 |
| 2010-03-15 | 2 |
| 2010-03-16 | 9 |
+------------+------------+
Use a date function to restrict the results to the next 14 days, and use LIMIT to display the first 10 rows:
SELECT
    DATE_FORMAT( ev.date, '%Y-%m-%d' ) as short_date,
    count(*) as date_count
FROM event ev
WHERE ev.date between NOW() and date_add(now(), interval 14 day)
GROUP BY short_date
ORDER BY short_date ASC
LIMIT 0,10
I think that using LIMIT 0, 10 will work for you. Note that it caps the number of days returned at 10, not the total number of events.
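
To stop on the total number of events rather than the number of days, one option is a running total: keep a day as long as the events on all earlier days add up to less than the limit. That reproduces the three rows in the edited question (1 + 2 = 3 events before 2010-03-16, so that whole day is still shown). A sketch on SQLite via Python's sqlite3 module follows; it is not MySQL, the NOW() filter is left out, and the event timestamps are invented to match the first five daily counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (date TEXT)")
# Invented timestamps reproducing the first five daily counts from the question.
daily_counts = {
    "2010-03-14": 1,
    "2010-03-15": 2,
    "2010-03-16": 9,
    "2010-03-17": 8,
    "2010-03-18": 11,
}
for day, count in daily_counts.items():
    conn.executemany(
        "INSERT INTO event VALUES (?)", [(f"{day} 10:00:00",)] * count
    )

query = """
WITH daily AS (
    SELECT date(ev.date) AS short_date, COUNT(*) AS date_count
    FROM event AS ev
    GROUP BY date(ev.date)
)
SELECT d.short_date, d.date_count
FROM daily AS d
WHERE (
    -- Events on all earlier days; 0 for the first day.
    SELECT COALESCE(SUM(p.date_count), 0)
    FROM daily AS p
    WHERE p.short_date < d.short_date
) < ?
ORDER BY d.short_date
"""

event_limit = 10
rows = conn.execute(query, (event_limit,)).fetchall()
for row in rows:
    print(row)
```

The correlated sub-query is used for the running total so the sketch does not rely on window functions; where they are available, SUM(date_count) OVER (ORDER BY short_date) expresses the same thing more directly.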