I am trying to group date ranges together so I can sort my report by batch job. However, the batch ID repeats maybe twice per year, so I have to group by date as well as batch ID. My dilemma is that I am unable to get the range of batch IDs.
Let’s say I have these date values
1/1/2021
5/1/2021
8/1/2021
3/7/2020
4/2/2019
I want to get
8/12/2020 - 8/1/2021
3/6/2020 - 3/7/2020
4/1/2019 - 4/2/2019
First time asking for help on Stack Overflow and on mobile. Forgive the formatting
From your comment:
I’m trying to group by once a month
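Then use GROUP BY TRUNC(date_column, 'MM') to group into calendar months (TRUNC without a format model only truncates to the day). Since batch_id is selected, it has to be in the GROUP BY as well.
SELECT batch_id,
       TRUNC(date_column, 'MM') AS month,
       SUM(your_other_column) AS other_column_total
FROM table_name
GROUP BY batch_id, TRUNC(date_column, 'MM');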
If you want to group by a different range, e.g. from the 8th of the month until the 7th of the next month, then shift the date by an offset before truncating:
SELECT batch_id,
       TRUNC(date_column - INTERVAL '7' DAY, 'MM') AS month_from_8th,
       SUM(your_other_column) AS other_column_total
FROM table_name
GROUP BY batch_id, TRUNC(date_column - INTERVAL '7' DAY, 'MM');
If you want to group by something else then you will need to define how to calculate the group ranges.
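To see why the 7-day offset produces groups running from the 8th to the 7th, here is a minimal Python sketch of the same arithmetic. The function name month_from_8th is only an illustration (borrowed from the column alias above), and the sample dates are made up.

```python
from datetime import date, timedelta


def month_from_8th(d: date) -> date:
    """Return the first day of the month that labels d's 8th-to-7th period."""
    shifted = d - timedelta(days=7)   # same shift as date_column - INTERVAL '7' DAY
    return shifted.replace(day=1)     # same effect as truncating to the month


# The 7th still belongs to the previous period, the 8th starts a new one.
print(month_from_8th(date(2021, 1, 7)))   # 2020-12-01
print(month_from_8th(date(2021, 1, 8)))   # 2021-01-01
```

Any other group boundary works the same way: shift by (boundary day - 1) days, then truncate.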
Related
I'm trying to query a table comparing order numbers from last week (Sunday to Saturday) vs 2 weeks ago, and calculate percent change between the two. My thought process so far has been to group my date column by week, then use a lag function to pull last week and the previous week in to the same row. From there use basic arithmetic functions to calculate percent change. In practice, I haven't been able to get a working query, but I picture the table to look as follows:
Week         Orders   Orders - Previous Week   % Change
-------------------------------------------------------
2023-02-05   5        10                       -0.5
2023-01-29   10       2                        +5.0
2023-01-29   2
Important to note that the days in last week should not change regardless of what day it is today (i.e not use today -7 days to calculate last week, and -14 days to calculate 2 weeks ago)
My query so far:
SELECT
min(date) as date,
orders,
coalesce(lag(order) over (order by (date), 0)) as Orders - Previous Week
FROM `table`
WHERE date BETWEEN '2023-01-01' AND current_date()
group by date_trunc(date, WEEK)
ORDER BY date desc
I realize I'm not using coalesce and my lag function correctly, but a bit lost on how to correct it
To calculate the percent change, you can use the following query:
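SELECT
  date_trunc(date, WEEK) AS week_start,
  sum(orders) AS total_orders,
  lag(sum(orders)) OVER (ORDER BY date_trunc(date, WEEK)) AS orders_previous_week,
  -- safe_divide returns NULL instead of failing when the divisor is NULL or 0
  safe_divide(
    sum(orders) - lag(sum(orders)) OVER (ORDER BY date_trunc(date, WEEK)),
    lag(sum(orders)) OVER (ORDER BY date_trunc(date, WEEK))
  ) AS pct_change
FROM `table`
WHERE date BETWEEN '2023-01-01' AND current_date()
GROUP BY week_start
ORDER BY week_start DESC
In this query, sum aggregates the orders by week, and lag is applied to that weekly sum to fetch the previous week's total (a window function cannot be nested inside sum, so it has to be lag(sum(...)) rather than sum(lag(...))). date_trunc(date, WEEK) starts weeks on Sunday, so the week boundaries do not depend on what day it is today. For the earliest week there is no previous row, so orders_previous_week and pct_change come back as NULL; wrap them in coalesce if you would rather show 0. The percent change calculation uses the same formula you described.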
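The week-over-week logic can also be checked outside the database. The following is a small Python sketch, assuming weeks run Sunday to Saturday as in the question; week_start, weekly_change and the sample rows are hypothetical names and data, not part of the original query.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Sunday on or before d (date.weekday() is 0 for Monday, 6 for Sunday)."""
    return d - timedelta(days=(d.weekday() + 1) % 7)


def weekly_change(rows):
    """rows: iterable of (date, orders).

    Returns [(week, orders, previous_orders, pct_change)], newest week first.
    """
    totals = defaultdict(int)
    for d, orders in rows:
        totals[week_start(d)] += orders

    result = []
    previous = None
    for week in sorted(totals):
        current = totals[week]
        # With no previous week (or a previous total of 0) the change is undefined.
        change = (current - previous) / previous if previous else None
        result.append((week, current, previous, change))
        previous = current
    return result[::-1]


sample = [
    (date(2023, 1, 23), 2),   # week starting Sunday 2023-01-22
    (date(2023, 1, 30), 4),   # week starting Sunday 2023-01-29
    (date(2023, 2, 1), 6),
    (date(2023, 2, 7), 5),    # week starting Sunday 2023-02-05
]
for row in weekly_change(sample):
    print(row)
```

Like LAG over grouped rows, this compares each week with the previous week that has data, so a week with no orders at all would be skipped rather than counted as 0.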
I have a query with dates for the last two years. I want to get data dynamically for the last 4 months.
That is, if today is 04/01/2021, I want to get data from 01/09/2020 up to and including today.
The problem with my query is that I get data from 04/09/2020 up to today, i.e. 4 months back but without the full first month.
PostgreSQL
SELECT Category,
Product,
Sales,
Date
FROM Table
WHERE Date>= now() - INTERVAL '4 months'
What do I need to change in my query?
You can do it with date_trunc, which rounds the computed date down to the first day of its month, as in the following query.
SELECT Category,
Product,
Sales,
Date
FROM Table
WHERE Date>= date_trunc('month', current_date-interval '4 months')
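As a quick check of the arithmetic, here is a Python sketch that computes the same cutoff: go back a number of months, then snap to the first day of that month. The function name start_of_month_n_back is hypothetical.

```python
from datetime import date


def start_of_month_n_back(today: date, months: int = 4) -> date:
    """First day of the month that lies `months` months before today's month."""
    index = today.year * 12 + (today.month - 1) - months   # months counted from year 0
    year, month0 = divmod(index, 12)
    return date(year, month0 + 1, 1)


# The example from the question: 4 January 2021 gives 1 September 2020.
print(start_of_month_n_back(date(2021, 1, 4)))   # 2020-09-01
```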
In my Spiceworks database there is a table, tickets, with two columns I am concerned with, first_response_secs and created_at.
I have been tasked with finding the average response time of tickets for every week.
So if I run the following query:
select AVG(first_response_secs) from (
select first_response_secs,created_at
from tickets
where created_at BETWEEN '2017-03-19' and '2017-03-25'
)
I will get back the average first response seconds for that week. But that's as far as my limited SQL gets me. I need 6 months worth of data and I don't want to manually edit the date range and rerun the query 24 times.
I would like to write a query that will return output similar to the following:
WEEK AVERAGE RESPONSE TIME(secs)
-----------------------------------------------------------
2017-02-26 - 2017-03-04 21447
2017-03-05 - 2017-03-11 20564
2017-03-12 - 2017-03-18 25883
2017-03-19 - 2017-03-25 12244
Or something like that, back 6 months.
Weeks are tricky. How about:
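-- 7-day buckets counted back from Saturday 2017-03-25
select min(date(created_at)) as weekstart, avg(first_response_secs) as avg_response_secs
from tickets
where date(created_at) <= '2017-03-25'
group by cast((julianday('2017-03-25') - julianday(date(created_at))) / 7 as integer)
order by weekstart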
One dirty way is to use case to define week boundaries:
select week, avg(first_response_secs)
from (
select case
when created_at between '2017-02-26' and '2017-03-04' then '2017-02-26 - 2017-03-04'
when created_at between '2017-03-05' and '2017-03-11' then '2017-03-05 - 2017-03-11'
when created_at between '2017-03-12' and '2017-03-18' then '2017-03-12 - 2017-03-18'
when created_at between '2017-03-19' and '2017-03-25' then '2017-03-19 - 2017-03-25'
end as week,
first_response_secs
from tickets
) t
group by week;
Note that this method is a general purpose one and can be modified to change the boundaries as you wish.
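If writing out every week by hand is too tedious for six months of data, the week number can be derived by date arithmetic instead. The sketch below runs against an in-memory SQLite database from Python; the table layout and sample rows are assumptions based on the question, and the reference Saturday 2017-03-25 is taken from the expected output.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (created_at TEXT, first_response_secs INTEGER)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [
        ("2017-03-12 09:00:00", 100),   # made-up sample rows
        ("2017-03-18 17:30:00", 300),
        ("2017-03-19 08:15:00", 50),
        ("2017-03-25 23:59:00", 150),
    ],
)

# weeks_back = 0 for the week ending 2017-03-25, 1 for the week before, and so on.
QUERY = """
SELECT CAST((julianday('2017-03-25') - julianday(date(created_at))) / 7 AS INTEGER) AS weeks_back,
       AVG(first_response_secs) AS avg_secs
FROM tickets
WHERE date(created_at) <= '2017-03-25'
GROUP BY weeks_back
ORDER BY weeks_back DESC
"""

reference = date(2017, 3, 25)
weekly = []
for weeks_back, avg_secs in conn.execute(QUERY):
    end = reference - timedelta(weeks=weeks_back)
    start = end - timedelta(days=6)
    weekly.append((start.isoformat(), end.isoformat(), avg_secs))

for start, end, avg_secs in weekly:
    print(f"{start} - {end}  {avg_secs}")
```

Wrapping created_at in date() drops the time of day, so a ticket created late on the last day of a week still falls into that week.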
I am trying to write an efficient query to get the sum of the previous 7 days worth of values from a relational DB table, and record each total against the final date in the 7 day period (e.g. the 'WeeklyTotals Table' in the example below). For example, in my WeeklyTotals query, I would like the value for February 15th to be 333, since that is the total sum of users from Feb 9th - Feb 15th, and so on:
I have a base query which gets me my previous weeks users for today's date (simplified for the sake of the example):
SELECT Date, Sum("Total Users")
FROM "UserRecords"
WHERE dateadd(hour, -8, "UserRecords"."Date") BETWEEN
dateadd(hour, -8, sysdate) - INTERVAL '7 DAY' AND dateadd(hour, -8, sysdate);
The problem is that this only gets me the total for today's date. I need a query which will get me this information for each of the previous seven days.
I know I can make a view for each date (since I only need the previous seven entries) and join them all together, but that seems really inefficient (I'll have to create/update 7 views, and then do all the inner join operations). I am wondering if there's a more efficient way to achieve this.
Provided there are no gaps, you can use a running total with SUM OVER including the six previous rows. Use ROW_NUMBER to exclude the first six records, as their totals don't represent complete weeks.
select log_date, week_total
from
(
  select
    log_date,
    -- current row plus the six rows before it
    sum(total_users) over (order by log_date rows 6 preceding) as week_total,
    row_number() over (order by log_date) as rn
  from mytable
  -- add any filter on log_date here
) t
where rn >= 7
order by log_date;
UPDATE: In case there are gaps, it should be
sum(total_users) over (order by log_date range interval '6' day preceding)
PostgreSQL supports this kind of RANGE frame with an interval offset from version 11 onward; earlier versions do not. (Moreover, the ROW_NUMBER exclusion wouldn't work then and would have to be replaced by something else.)
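The difference between counting rows and counting days is easy to see in a small Python sketch. This is only an illustration of the date-based window, with made-up data and a hypothetical function name; missing days are treated as 0 users.

```python
from datetime import date, timedelta


def rolling_7_day_totals(daily):
    """daily: dict mapping date -> total users.

    Returns {date: sum over that date and the six calendar days before it}.
    """
    totals = {}
    for day in sorted(daily):
        window = (day - timedelta(days=offset) for offset in range(7))
        totals[day] = sum(daily.get(d, 0) for d in window)   # gaps contribute 0
    return totals


# Sample data with gaps on Feb 12 and Feb 14.
daily_users = {
    date(2021, 2, 9): 10,
    date(2021, 2, 10): 20,
    date(2021, 2, 11): 30,
    date(2021, 2, 13): 50,
    date(2021, 2, 15): 70,
    date(2021, 2, 16): 5,
}
weekly_totals = rolling_7_day_totals(daily_users)
print(weekly_totals[date(2021, 2, 15)])   # sum for Feb 9 through Feb 15
```

A row-based frame would instead reach back six rows regardless of their dates, which overcounts whenever a day is missing.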
Here's a query that self joins to the previous 6 days and sums the values to get the weekly totals:
select u1.date, sum(u2.total_users) as weekly_users
from UserRecords u1
join UserRecords u2
on u1.date - u2.date < 7
and u1.date >= u2.date
group by u1.date
order by u1.date
You can use SUM as a window function, with an expression based on the week date part.
Self joins are usually much slower than window functions.
I've been working on this query and I'm about 90% of the way there, but there is one piece I'm unable to figure out. Basically I'm looking for the sum of net flows by month, starting 12/31/2014 through the current date. I'm able to extract data by day and the sum of net flows for that day, but now I need to group all the dates into their respective months. For example, if I have 01/01/2015, 01/02/2015 and 01/03/2015, I want all of them to be grouped together and show up as 01/2015. Below is the query that I have written. Please help with this last step.
SELECT
DATE,
SUM(NET_FLOWS/1000000.00) AS YTD_NET_FLOWS
FROM
HISTORY_TBL
WHERE
DATE >= TO_DATE ('12312014','MMDDYYYY')
GROUP BY
DATE
ORDER BY
DATE
You can truncate a date to a given format (day, year, month, etc.), as shown here: http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions230.htm#i1002084
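SELECT TRUNC(DATE,'MM') AS FLOW_MONTH, SUM(NET_FLOWS/1000000.00) AS YTD_NET_FLOWS
FROM HISTORY_TBL
WHERE DATE >= TO_DATE ('12312014','MMDDYYYY')
GROUP BY TRUNC(DATE,'MM')
-- order by the grouped expression; the raw DATE column is no longer available after GROUP BY
ORDER BY TRUNC(DATE,'MM')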
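For anyone who wants to try the month grouping without an Oracle instance, here is a rough equivalent run from Python against in-memory SQLite. SQLite has no TRUNC for dates, so strftime stands in for it here; the column name flow_date and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history_tbl (flow_date TEXT, net_flows REAL)")
conn.executemany(
    "INSERT INTO history_tbl VALUES (?, ?)",
    [
        ("2014-12-31", 1000000.0),   # made-up sample rows
        ("2015-01-01", 2000000.0),
        ("2015-01-02", 3000000.0),
        ("2015-01-03", -1000000.0),
        ("2015-02-10", 500000.0),
    ],
)

# Group on year-month so the months sort correctly, display as MM/YYYY.
monthly = conn.execute(
    """
    SELECT strftime('%m/%Y', flow_date) AS month,
           SUM(net_flows / 1000000.0)   AS net_flows_mm
    FROM history_tbl
    WHERE flow_date >= '2014-12-31'
    GROUP BY strftime('%Y-%m', flow_date)
    ORDER BY strftime('%Y-%m', flow_date)
    """
).fetchall()

for month, total in monthly:
    print(month, total)
```

The three January dates collapse into a single 01/2015 row, which is the grouping asked for in the question.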