Calculate Average for Amount for certain date range in a year based on month - sql

I have a table like below :
ID  Amount  Date
1   500     2022-01-03
1   200     2022-01-04
1   500     2022-01-05
1   340     2022-01-06
1   500     2022-01-25
1   500     2022-01-26
1   567     2022-01-27
1   500     2022-01-28
1   598     2022-01-31
1   500     2022-02-01
1   787     2022-02-02
1   500     2022-02-03
1   5340    2022-02-04
Problem:
I have to calculate the average of Amount starting from a fixed StartDate = 03/01/2022 (3rd Jan 2022). For each month the range ends at a cutoff date: for January it is the average of Amount from StartDate to 25th Jan, for February from StartDate to 22nd Feb, and so on. The cutoff date (Last) is computed with this logic:
SET @Last = (SELECT DATEADD(DAY, CASE DATENAME(WEEKDAY, @Date)
WHEN 'Sunday' THEN -6
WHEN 'Saturday' THEN -5
ELSE -7 END, DATEDIFF(DAY, 0, @Date)))
RETURN @Last
ID  Amount  Date        Last
1   500     2022-01-03  2022-01-25
1   500     2022-01-04  2022-01-25
1   340     2022-01-05  2022-01-25
1   500     2022-01-06  2022-01-25
1   567     2022-01-25  2022-01-25
1   500     2022-01-26  2022-01-25
1   500     2022-01-27  2022-01-25
1   40      2022-01-28  2022-01-25
1   500     2022-01-31  2022-01-25
1   589     2022-02-01  2022-02-22
1   540     2022-02-02  2022-02-22
1   500     2022-02-03  2022-02-22
1   5340    2022-02-04  2022-02-22
The result looks like the table above.
Now if I calculate Avg(Amount) from 3rd Jan to 25th Jan for January, from 3rd Jan to 22nd Feb for February, and so on, the result is not the correct average: the amounts from the remaining days are included as well. Also, GROUP BY groups by month rather than by the range in the WHERE clause.
Select Avg(Amount) from Table
where Date BETWEEN @StartDate AND Last
StartDate is fixed at 3rd Jan.
This does not give the correct average. Is there another way to get the required data?
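One way to read the requirement is that every distinct Last value defines its own range starting at the fixed StartDate, and the average is taken over all rows that fall inside that range, regardless of which Last value those rows carry. The following is only a sketch of that reading in Python, using the rows of the second table above; the function name averages_by_cutoff is made up for illustration and is not part of the original query.

```python
from datetime import date

# Rows from the second table in the question: (amount, date, last).
rows = [
    (500, date(2022, 1, 3), date(2022, 1, 25)),
    (500, date(2022, 1, 4), date(2022, 1, 25)),
    (340, date(2022, 1, 5), date(2022, 1, 25)),
    (500, date(2022, 1, 6), date(2022, 1, 25)),
    (567, date(2022, 1, 25), date(2022, 1, 25)),
    (500, date(2022, 1, 26), date(2022, 1, 25)),
    (500, date(2022, 1, 27), date(2022, 1, 25)),
    (40, date(2022, 1, 28), date(2022, 1, 25)),
    (500, date(2022, 1, 31), date(2022, 1, 25)),
    (589, date(2022, 2, 1), date(2022, 2, 22)),
    (540, date(2022, 2, 2), date(2022, 2, 22)),
    (500, date(2022, 2, 3), date(2022, 2, 22)),
    (5340, date(2022, 2, 4), date(2022, 2, 22)),
]
START_DATE = date(2022, 1, 3)  # fixed start date from the question


def averages_by_cutoff(rows, start):
    """Average of amount over [start, cutoff] for every distinct cutoff date."""
    result = {}
    for cutoff in sorted({last for _, _, last in rows}):
        # Compare each row's own date with the cutoff, not the row's Last value.
        amounts = [amt for amt, day, _ in rows if start <= day <= cutoff]
        result[cutoff] = sum(amounts) / len(amounts)
    return result


averages = averages_by_cutoff(rows, START_DATE)
for cutoff, avg in averages.items():
    print(cutoff, round(avg, 2))
```

In SQL the same idea would correspond to joining the distinct Last values back to the table on Date BETWEEN @StartDate AND Last and grouping by Last, instead of filtering each row against its own Last.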

Related

How to get last N week data in different year

I need to get the last 6 weeks of data from a table. Right now the logic I use is this:
WEEK([date column]) BETWEEN WEEK(NOW()) - 6 AND WEEK(NOW())
It runs as I want, but January is near and I realized that this query will not work as it is. When I run the query as of 15th January 2022, I only get data from 1st January to 15th January.
TGL | MINGGU_KE
2022-01-01 | 1
2022-01-02 | 2
2022-01-03 | 2
2022-01-04 | 2
2022-01-05 | 2
2022-01-06 | 2
2022-01-07 | 2
2022-01-08 | 2
2022-01-09 | 3
2022-01-10 | 3
2022-01-11 | 3
2022-01-12 | 3
2022-01-13 | 3
2022-01-14 | 3
2022-01-15 | 3
Can I get the last 6 weeks of data, including the weeks that fall in the previous year?
This is my dbfiddle: https://dbfiddle.uk/o9BeAFJF
You can round the dates to the first day of the week using ROUND, TRUNC or THIS_WEEK:
WITH
SEARCH_WEEK (TGL) AS (
VALUES date '2020-12-01'
UNION ALL
SELECT tgl + 1 DAY FROM SEARCH_WEEK WHERE tgl < CURRENT date
),
BASE_DATE (base_date) AS (
VALUES date '2022-01-15'
),
OPTIONS (OPTION, OPTION_BASE_DATE) AS (
SELECT OPTION, option_base_date FROM base_date CROSS JOIN LATERAL (
VALUES
('ROUND D', ROUND(base_date, 'D')),
('ROUND IW', ROUND(base_date, 'IW')),
('ROUND W', ROUND(base_date, 'W')),
('ROUND WW', ROUND(base_date, 'WW')),
('TRUNC D', TRUNC(base_date, 'D')),
('TRUNC IW', TRUNC(base_date, 'IW')),
('TRUNC W', TRUNC(base_date, 'W')),
('TRUNC WW', TRUNC(base_date, 'WW')),
('THIS_WEEK', THIS_WEEK(base_date)),
('THIS_WEEK + 1 DAY', THIS_WEEK(base_date) + 1 DAY)
) a (OPTION, OPTION_BASE_DATE)
)
SELECT
OPTION,
MIN(TGL) BEGIN,
max(tgl) END,
dayname(MIN(TGL)) day_BEGIN,
dayname(max(tgl)) day_end,
days_between(max(tgl), min(tgl)) + 1 duration_in_days
FROM
SEARCH_WEEK
CROSS JOIN options
WHERE
TGL BETWEEN option_base_date - 35 DAYS AND option_base_date + 6 DAYS
GROUP BY OPTION
OPTION             BEGIN       END         DAY_BEGIN  DAY_END   DURATION_IN_DAYS
ROUND D            2021-12-12  2022-01-22  Sunday     Saturday  42
ROUND IW           2021-12-13  2022-01-23  Monday     Sunday    42
ROUND W            2021-12-11  2022-01-21  Saturday   Friday    42
ROUND WW           2021-12-11  2022-01-21  Saturday   Friday    42
THIS_WEEK          2021-12-05  2022-01-15  Sunday     Saturday  42
THIS_WEEK + 1 DAY  2021-12-06  2022-01-16  Monday     Sunday    42
TRUNC D            2021-12-05  2022-01-15  Sunday     Saturday  42
TRUNC IW           2021-12-06  2022-01-16  Monday     Sunday    42
TRUNC W            2021-12-11  2022-01-21  Saturday   Friday    42
TRUNC WW           2021-12-11  2022-01-21  Saturday   Friday    42
You can use DATEADD to get the date six weeks before now, like this:
Select * from tableName
where [dateColumn] between dateadd(WEEK,-6,getdate()) and getdate()
You can use DATEADD to get last 6 weeks of data as follows:
Select * from [TableName] where [DateColumn] between
DATEADD(WEEK,-6,GETDATE()) and GETDATE();
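The answers above share one idea: compare dates instead of week numbers, so the turn of the year stops mattering. A minimal Python sketch of the two variants follows, assuming the base date 2022-01-15 from the question and Monday-based weeks (comparable to the TRUNC IW row above). Both function names are invented for illustration.

```python
from datetime import date, timedelta


def last_six_weeks_rolling(today):
    """Rolling window: from exactly six weeks ago through today."""
    return today - timedelta(weeks=6), today


def last_six_weeks_aligned(today):
    """Week-aligned window: the current Monday-based week plus the five weeks before it."""
    week_start = today - timedelta(days=today.weekday())  # Monday of today's week
    return week_start - timedelta(days=35), week_start + timedelta(days=6)


today = date(2022, 1, 15)
print(last_six_weeks_rolling(today))
print(last_six_weeks_aligned(today))
```

Both windows start in December 2021, which is exactly the part that WEEK(NOW()) - 6 misses, because the week number restarts at the beginning of the year.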

CASE in WHERE Clause in Snowflake

I am trying to write a CASE statement within the WHERE clause in Snowflake, but I'm not quite sure how I should go about it.
What I'm trying to do is: if the current month is January, the WHERE clause for the date should be between the start of the previous year and today. Otherwise, it should be between the start of the current year and today.
WHERE
CASE MONTH(CURRENT_DATE()) = 1 THEN DATE BETWEEN DATE_TRUNC('YEAR', DATEADD(YEAR, -1, CURRENT_DATE())) AND CURRENT_DATE()
CASE MONTH(CURRENT_DATE()) != 1 THEN DATE BETWEEN DATE_TRUNC('YEAR', CURRENT_DATE()) AND CURRENT_DATE()
END
Appreciate any help on this!
Use a CASE expression that returns -1 if the current month is January or 0 for any other month, so that you can get with DATEADD() a date of the previous or the current year to use in DATE_TRUNC():
WHERE DATE BETWEEN
DATE_TRUNC('YEAR', DATEADD(YEAR, CASE WHEN MONTH(CURRENT_DATE()) = 1 THEN -1 ELSE 0 END, CURRENT_DATE()))
AND
CURRENT_DATE()
I suspect that you don't even need to use CASE here:
WHERE
(MONTH(CURRENT_DATE()) = 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', DATEADD(YEAR, -1, CURRENT_DATE())) AND
CURRENT_DATE()) OR
(MONTH(CURRENT_DATE()) != 1 AND
DATE BETWEEN DATE_TRUNC('YEAR', CURRENT_DATE()) AND CURRENT_DATE())
The other answers are quite good, but the answer can be even simpler.
Here is a little table to break down what is happening:
select
row_number() over (order by null) - 1 as rn,
dateadd('day', rn * 5, date_trunc('year',current_date())) as pretend_current_date,
DATEADD(YEAR, -1, pretend_current_date) as pcd_sub1,
month(pretend_current_date) as pcd_month,
DATE_TRUNC(year, iff(pcd_month = 1, pcd_sub1, pretend_current_date)) as _from,
pretend_current_date as _to
from table(generator(ROWCOUNT => 30))
order by rn;
This shows:
RN  PRETEND_CURRENT_DATE  PCD_SUB1    PCD_MONTH  _FROM       _TO
0   2022-01-01            2021-01-01  1          2021-01-01  2022-01-01
1   2022-01-06            2021-01-06  1          2021-01-01  2022-01-06
2   2022-01-11            2021-01-11  1          2021-01-01  2022-01-11
3   2022-01-16            2021-01-16  1          2021-01-01  2022-01-16
4   2022-01-21            2021-01-21  1          2021-01-01  2022-01-21
5   2022-01-26            2021-01-26  1          2021-01-01  2022-01-26
6   2022-01-31            2021-01-31  1          2021-01-01  2022-01-31
7   2022-02-05            2021-02-05  2          2022-01-01  2022-02-05
8   2022-02-10            2021-02-10  2          2022-01-01  2022-02-10
9   2022-02-15            2021-02-15  2          2022-01-01  2022-02-15
10  2022-02-20            2021-02-20  2          2022-01-01  2022-02-20
11  2022-02-25            2021-02-25  2          2022-01-01  2022-02-25
12  2022-03-02            2021-03-02  3          2022-01-01  2022-03-02
13  2022-03-07            2021-03-07  3          2022-01-01  2022-03-07
14  2022-03-12            2021-03-12  3          2022-01-01  2022-03-12
15  2022-03-17            2021-03-17  3          2022-01-01  2022-03-17
16  2022-03-22            2021-03-22  3          2022-01-01  2022-03-22
17  2022-03-27            2021-03-27  3          2022-01-01  2022-03-27
18  2022-04-01            2021-04-01  4          2022-01-01  2022-04-01
19  2022-04-06            2021-04-06  4          2022-01-01  2022-04-06
20  2022-04-11            2021-04-11  4          2022-01-01  2022-04-11
21  2022-04-16            2021-04-16  4          2022-01-01  2022-04-16
22  2022-04-21            2021-04-21  4          2022-01-01  2022-04-21
23  2022-04-26            2021-04-26  4          2022-01-01  2022-04-26
24  2022-05-01            2021-05-01  5          2022-01-01  2022-05-01
25  2022-05-06            2021-05-06  5          2022-01-01  2022-05-06
26  2022-05-11            2021-05-11  5          2022-01-01  2022-05-11
27  2022-05-16            2021-05-16  5          2022-01-01  2022-05-16
28  2022-05-21            2021-05-21  5          2022-01-01  2022-05-21
29  2022-05-26            2021-05-26  5          2022-01-01  2022-05-26
Your logic asks "is the current date in January?" If so, take the prior year and truncate to the start of that year; otherwise truncate the current date to the start of its own year. The result is the start of the BETWEEN range.
This is the same as subtracting one month from the current date and truncating the result to the year.
Thus there is no need for any IFF or CASE:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE())) AND CURRENT_DATE()
And if you would like to drop some parentheses, CURRENT_DATE can be used without them if you leave it in upper case, so the expression can be even shorter:
WHERE date BETWEEN DATE_TRUNC(year, DATEADD(month,-1, CURRENT_DATE)) AND CURRENT_DATE
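The equivalence claimed above can be checked by brute force. The sketch below is plain Python rather than Snowflake SQL, and both function names are made up for illustration: one function mirrors the CASE-style rule, the other mirrors "subtract one month, then truncate to the year", and the two are compared for every day of two sample years.

```python
from datetime import date, timedelta


def start_with_case(today):
    """CASE-style rule: in January use the previous year, otherwise the current year."""
    year = today.year - 1 if today.month == 1 else today.year
    return date(year, 1, 1)


def start_with_month_shift(today):
    """Shift back one month, then truncate to the start of that year."""
    # Only the year of the shifted date matters, so the day is pinned to 1.
    if today.month == 1:
        shifted = date(today.year - 1, 12, 1)
    else:
        shifted = date(today.year, today.month - 1, 1)
    return date(shifted.year, 1, 1)


def all_days(year):
    """Yield every calendar day of the given year."""
    day = date(year, 1, 1)
    while day.year == year:
        yield day
        day += timedelta(days=1)


# Compare both rules for a common year and a leap year.
mismatches = [d for y in (2022, 2024) for d in all_days(y)
              if start_with_case(d) != start_with_month_shift(d)]
print(mismatches)  # prints []
```

The shift only changes the year when the date is in January, which is why the month arithmetic can replace the conditional.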

SQL query for getting data for the last 6 months grouped by month?

I know a basic query to get some results for the last 6 months. Let's say like this:
SELECT *
FROM RANDOM_TABLE
WHERE Date_Column >= DATEADD(MONTH, -6, GETDATE())
But what if I'd like to get results grouped by month - each month looking back 6 months into the past?
The first three rows of a result could ideally look like this (count of IDs is random):
Month_and_year  COUNT(ID)
January 2017    120
February 2017   160
March 2017      240
The last three rows:
Month_and_year  COUNT(ID)
November 2021   80
December 2021   350
January 2021    260
Hope it's understandable.
Thanks in advance!
EDIT:
Over the hours I made a few corrections. Most notably I corrected the self join query to reflect my intentions and also added more details to better explain what is going on.
To my knowledge there are two ways about it (which are probably the same under the hood).
Also, please note that these solutions assume you have a month field already in place. If you have a date or timestamp field, you should take one extra preparation step.
[Addendum] To be more precise, I'd say that the ideal would be to have a date/timestamp field that is truncated/flattened to the first day of the month.
As an example,
month       amount
2021-01-01  50
2021-02-01  20
2021-03-01  10
2021-04-01  100
2021-05-01  20
2021-06-01  40
2021-07-01  80
2021-08-01  50
The first is to use a "self-non-equi join"
SELECT
a.month,
SUM(b.amount) AS amount_over_6_months
FROM table AS a
INNER JOIN table AS b ON a.month BETWEEN b.month AND DATEADD(MONTH, 5, b.month)
WHERE a.month >= DATEADD(MONTH, -5, GETDATE())
GROUP BY a.month
What happens here is that you are joining the table with itself. Specifically, each row in the (a) alias is joined to up to six rows from the (b) alias: the row for the same month and the rows for up to five months prior. So...
a.month     b.month     a.amount  b.amount
2021-01-01  2021-01-01  50        50
2021-02-01  2021-01-01  20        50
2021-02-01  2021-02-01  20        20
2021-03-01  2021-01-01  10        50
2021-03-01  2021-02-01  10        20
2021-03-01  2021-03-01  10        10
2021-04-01  2021-01-01  100       50
2021-04-01  2021-02-01  100       20
2021-04-01  2021-03-01  100       10
2021-04-01  2021-04-01  100       100
2021-05-01  2021-01-01  20        50
2021-05-01  2021-02-01  20        20
2021-05-01  2021-03-01  20        10
2021-05-01  2021-04-01  20        100
2021-05-01  2021-05-01  20        20
2021-06-01  2021-01-01  40        50
2021-06-01  2021-02-01  40        20
2021-06-01  2021-03-01  40        10
2021-06-01  2021-04-01  40        100
2021-06-01  2021-05-01  40        20
2021-06-01  2021-06-01  40        40
2021-07-01  2021-02-01  80        20
2021-07-01  2021-03-01  80        10
2021-07-01  2021-04-01  80        100
2021-07-01  2021-05-01  80        20
2021-07-01  2021-06-01  80        40
2021-07-01  2021-07-01  80        80
...         ...         ...       ...
Then it's just a matter of grouping based on the month in the (a) alias, and summing the amounts coming from the (b) alias.
The advantage of this approach is that it should be vendor- and version-agnostic, except for the DATEADD() function.
The second solution would be to use window functions. I cannot comment on whether this would work with your vendor and the specific version.
SELECT
month,
SUM(amount) OVER (ORDER BY month ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
FROM table
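To compare the two approaches on the sample data above, here is a small sketch using Python's built-in sqlite3 module. Several things are assumptions made for the sake of a runnable example: the table is named monthly (a name made up here), SQLite's date(b.month, '+5 months') stands in for DATEADD(MONTH, 5, b.month), the GETDATE() filter is left out because the sample rows are from 2021, and the SQLite library behind Python must be version 3.25 or newer for window functions.

```python
import sqlite3

# Sample rows from the answer: one row per month, already truncated to the first day.
rows = [
    ("2021-01-01", 50), ("2021-02-01", 20), ("2021-03-01", 10), ("2021-04-01", 100),
    ("2021-05-01", 20), ("2021-06-01", 40), ("2021-07-01", 80), ("2021-08-01", 50),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly (month TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO monthly VALUES (?, ?)", rows)

# Self non-equi join: each a-row picks up the b-rows from its own month
# back to five months earlier, then the b-amounts are summed per a.month.
self_join = conn.execute("""
    SELECT a.month, SUM(b.amount)
    FROM monthly AS a
    JOIN monthly AS b
      ON a.month BETWEEN b.month AND date(b.month, '+5 months')
    GROUP BY a.month
    ORDER BY a.month
""").fetchall()

# Window function: sum over the current row and the five rows before it.
windowed = conn.execute("""
    SELECT month,
           SUM(amount) OVER (ORDER BY month ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
    FROM monthly
    ORDER BY month
""").fetchall()

print(self_join)
print(windowed)
```

On this data both queries return the same rolling sums. Note that the window version counts rows, not months, so the two would diverge if a month were missing from the table.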

How to group data weekly in column and hourly in row

I have data like the following:
ID SalesTime Qty Unit Price Item
1 01/01/2021 08:10:00 10 10 A
2 01/01/2021 11:30:00 2 9 B
3 01/01/2021 11:59:50 1 8 C
4 01/02/2021 13:00:00 5 15 D
5 01/03/2021 10:00:00 4 10 A
6 01/03/2021 12:00:00 5 9 B
7 01/03/2021 12:50:00 6 15 D
8 01/04/2021 10:50:00 5 8 C
9 01/04/2021 11:10:00 2 10 A
10 ............
I want to summarize the totals in a form like this,
for example:
Mon Tue Wed Thu Fri Sat Sun
08:00~09:59 20 21 50 100 60 70 210
10:00~11:59 60 25 60 90 75 80 200
12:00~13:59 100 10 50 60 70 50 150
How can I do that in MS SQL? Thanks a lot.
You can extract the hour and divide by two for the rows. And then use conditional aggregation for the columns. Assuming you want the total of the price times quantity:
select convert(time, dateadd(hour, 2 * (datepart(hour, salestime) / 2), 0)) as hh,
sum(case when datename(weekday, salestime) = 'Monday' then qty * unit_price end) as mon,
sum(case when datename(weekday, salestime) = 'Tuesday' then qty * unit_price end) as tue,
. . .
from t
group by datepart(hour, salestime) / 2
order by datepart(hour, salestime) / 2;
Note: This just returns the beginning of the time period, rather than the full range.
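The same bucketing logic can be sketched outside SQL to see what the pivot contains. The Python below uses the nine sample rows from the question and assumes the dates are written as MM/DD/YYYY, which the question does not state. The function name pivot and the label format are chosen for illustration only.

```python
from collections import defaultdict
from datetime import datetime

# (SalesTime, Qty, Unit Price) from the question; dates are assumed to be MM/DD/YYYY.
sales = [
    ("01/01/2021 08:10:00", 10, 10),
    ("01/01/2021 11:30:00", 2, 9),
    ("01/01/2021 11:59:50", 1, 8),
    ("01/02/2021 13:00:00", 5, 15),
    ("01/03/2021 10:00:00", 4, 10),
    ("01/03/2021 12:00:00", 5, 9),
    ("01/03/2021 12:50:00", 6, 15),
    ("01/04/2021 10:50:00", 5, 8),
    ("01/04/2021 11:10:00", 2, 10),
]
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def pivot(sales):
    """Total qty * price keyed by (start hour of the 2-hour bucket, weekday name)."""
    totals = defaultdict(int)
    for stamp, qty, price in sales:
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
        bucket = 2 * (when.hour // 2)  # integer division groups 8-9, 10-11, 12-13, ...
        totals[(bucket, DAYS[when.weekday()])] += qty * price
    return totals


totals = pivot(sales)
for bucket in sorted({b for b, _ in totals}):
    label = f"{bucket:02d}:00~{bucket + 1:02d}:59"  # full range, e.g. 08:00~09:59
    print(label, [totals.get((bucket, day), 0) for day in DAYS])
```

Building the label from the bucket start also gives the full time range that the SQL version leaves out.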

Calculate number of days from date time column to a specific date - pandas

I have a df as shown below.
df:
ID open_date limit
1 2020-06-03 100
1 2020-06-23 500
1 2019-06-29 300
1 2018-06-29 400
From the above I would like to calculate a column named age_in_days.
age_in_days is the number of days from open_date to 2020-06-30.
Expected output
ID open_date limit age_in_days
1 2020-06-03 100 27
1 2020-06-23 500 7
1 2019-06-29 300 367
1 2018-06-29 400 732
Make sure open_date has a datetime dtype, then subtract it from 2020-06-30:
df['open_date'] = pd.to_datetime(df.open_date)
df['age_in_days'] = (pd.Timestamp('2020-06-30') - df.open_date).dt.days
Out[209]:
ID open_date limit age_in_days
0 1 2020-06-03 100 27
1 1 2020-06-23 500 7
2 1 2019-06-29 300 367
3 1 2018-06-29 400 732