Aggregate result from query by quarter SQL - sql

Lets say I have a table which holds all exports for some time back in Microsoft SQL database:
Name:
ExportTable
Columns:
id - numeric(18)
exportdate - datetime
In order to get the number of exports per week I can run the following query:
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports'
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
Returns:
exportdate totalExports
---------- ------------
27 13
28 12
29 15
30 8
31 17
32 10
33 7
34 15
35 4
36 18
37 10
38 14
39 14
40 21
41 19
Would it be possible to aggregate the week results by quarter so the output becomes something like the bellow?
UPDATE
Sorry for not being crystal clear, I would like the current result to add upp with previous result up to a new quarter.
Note week 41 contains 21+19 = 40
Week 39 contains 157 (13+12+15+8+17+10+7+15+4+18+10+14+14)
exportdate totalExports Quarter
---------- ------------ -------
27 13 3
28 25 3
29 40 3
30 48 3
31 65 3
32 75 3
33 82 3
34 97 3
35 101 3
36 119 3
37 129 3
38 143 3
39 157 3 -- Sum of 3 Quarter values.
40 21 4 -- New Quarter show current week value
41 40 4 -- (21+19)

You can use this.
SELECT
DATEPART(ISO_WEEK,[exportdate]) as 'exportdate'
, SUM( count(exportdate) ) OVER ( PARTITION BY DATEPART(QUARTER,MIN([exportdate])) ORDER BY DATEPART(ISO_WEEK,[exportdate]) ROWS UNBOUNDED PRECEDING ) as 'totalExports'
, DATEPART(QUARTER,MIN([exportdate])) [Quarter]
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;

You could use a case statement to separate the dates into quarters.
e.g.
CASE
WHEN EXPORT_DATE BETWEEN '1' AND '4' THEN 1
WHEN Export_Date BETWEEN '5' and '9' THEN 2
ELSE 0 AS [Quarter]
END
Its just an example but you get the idea.
You could then use the alias from the case

SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports', DATEPART(quarter,[exportdate]) as quarter FROM [ExportTable] Group By DATEPART(ISO_WEEK,[exportdate]), DATEPART(quarter,[exportdate]) order by exportdate;

Related

Display rows where multiple columns are different

I have data that looks like this. Thousands of rows returned, but this is just a sample.
Most days have the same numbers in them, but some do not. Note that ID 1 and 5 have identical numbers every day.
ID
Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
1
26
26
26
26
26
26
26
2
44
44
30
30
44
44
44
3
55
55
55
55
80
90
55
4
12
12
43
43
43
43
43
5
36
36
36
36
36
36
36
I'd like to only return rows where the days of the week have different numbers.
In this case, the only IDs returned should be 2, 3 & 4.
What would I want this query to look like?
Thanks!
One idea that should work in most RDBMS (with some syntax tweaks) is the following.
This is SQL Server compatible: pivot the days into rows and count the distinct values and filter accordingly:
select id
from t
cross apply (
select Count(distinct d) from (
values(sunday),(monday),(tuesday),(wednesday),(thursday),(friday),(saturday)
)d(d)
)d(v)
where d.v>1

Hive Summing up data in the table based on the date range

Have a table with the following schema design and the data residing inside it is like:
ID HITS MISS DDATE
1 10 3 20180101
1 33 21 20180122
1 84 11 20180901
1 11 2 20180405
1 54 23 20190203
1 33 43 20190102
4 54 22 20170305
4 56 88 20180115
5 87 22 20180809
5 66 48 20180617
5 91 53 20170606
DataTypes:
ID INT
HITS INT
MISS INT
DDATE STRING
The requirement is to calculate the total of the given (HITS and MISS) on yearly basis i.e 2017,2018,2019...
Written the following query:
SELECT ID,
SUM(HITS) AS HITS,SUM(MISS) AS MISS,
CASE
WHEN DDATE BETWEEN '201701' AND '201712' THEN '2017' ELSE
'NOTHING' END AS TTL_YR17_DATA
CASE
WHEN DDATE BETWEEN '201801' AND '201812' THEN '2018' ELSE
'NOTHING' END AS TTL_YR18_DATA
CASE
WHEN DDATE BETWEEN '201901' AND '201912' THEN '2019' ELSE
'NOTHING' END AS TTL_YR19_DATA
FROM
HST_TABLE
WHERE
DDATE BETWEEN '201801' AND '201812'
GROUP BY
ID,DDATE;
But, the query is not fetching the expected result.
Actual O/P:
1 10 3 2018
1 33 21 2018
1 84 11 2018
1 11 2 2018
1 54 23 2019
1 33 43 2019
4 54 22 2017
4 56 88 2018
5 87 22 2018
5 66 48 2018
5 91 53 2017
Expected O/P:
1 138 37 2018
4 56 88 2018
5 153 70 2018
1 87 66 2019
5 91 53 2017
Another related question:
Is there a way that I can avoid passing the DDATE range in the query? As this should be given by the user and shouldn't be hardcoded.
Any help/advice to achieve the above two requirements will be really helpful.
OK,it's easy to implement this with the substring function in HIVE, as below:
select
substring(dddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(dddate,0,4),
id
order by
the_year,
id
The answer above by #Shawn.X is correct but has a logical flaw. Below is the corrected one:
select
substring(ddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(ddate,0,4),
id
order by
the_year,
id;

yearly average from monthly daterange data

I have the following table in postgresql;
Value period
1 [2017-01-01,2017-02-01)
2 [2017-02-01,2017-03-01)
3 [2017-03-01,2017-04-01)
4 [2017-04-01,2017-05-01)
5 [2017-05-01,2017-06-01)
6 [2017-06-01,2017-07-01)
7 [2017-07-01,2017-08-01)
8 [2017-08-01,2017-09-01)
9 [2017-09-01,2017-10-01)
10 [2017-10-01,2017-11-01)
11 [2017-11-01,2017-12-01)
12 [2017-12-01,2018-01-01)
13 [2018-01-01,2018-02-01)
14 [2018-02-01,2018-03-01)
15 [2018-03-01,2018-04-01)
16 [2018-04-01,2018-05-01)
17 [2018-05-01,2018-06-01)
18 [2018-06-01,2018-07-01)
19 [2018-07-01,2018-08-01)
20 [2018-08-01,2018-09-01)
21 [2018-09-01,2018-10-01)
22 [2018-10-01,2018-11-01)
23 [2018-11-01,2018-12-01)
24 [2018-12-01,2019-01-01)
25 [2019-01-01,2019-02-01)
26 [2019-02-01,2019-03-01)
27 [2019-03-01,2019-04-01)
28 [2019-04-01,2019-05-01)
29 [2019-05-01,2019-06-01)
30 [2019-06-01,2019-07-01)
31 [2019-07-01,2019-08-01)
32 [2019-08-01,2019-09-01)
33 [2019-09-01,2019-10-01)
34 [2019-10-01,2019-11-01)
35 [2019-11-01,2019-12-01)
36 [2019-12-01,2020-01-01)
37 [2020-01-01,2020-02-01)
38 [2020-02-01,2020-03-01)
39 [2020-03-01,2020-04-01)
40 [2020-04-01,2020-05-01)
41 [2020-05-01,2020-06-01)
42 [2020-06-01,2020-07-01)
How can I get yearly average from monthly data in postgresql?
Note: Column Value is type integer and column period is type daterange.
The expected result should be
6.5 2017
18.5 2018
30.5 2019
39.5 2020
If your periods are always taking one month, including the lower bound and excluding the upper, you could try this
select
avg(value * 1.0) as average,
extract(year from lower(period)) as year
from table
group by year

Custom sorting by month name in SQL Server

I have a table where for some dates a certain number of entries are placed. Here is the table structure :
ID EntryName Entries DateOfEntry
1 A 20 2016-01-17
2 B 22 2016-01-29
3 C 23 2016-02-17
4 D 19 2016-02-17
5 E 29 2016-03-17
6 F 30 2016-03-17
7 G 43 2016-04-17
8 H 10 2016-04-17
9 I 5 2016-05-17
10 J 120 2016-05-17
11 K 220 2016-06-17
12 L 210 2016-06-17
13 M 10 2016-07-17
14 N 20 2016-07-17
15 O 15 2016-08-17
16 P 17 2016-08-17
17 Q 19 2016-09-17
18 R 23 2016-09-17
19 S 43 2016-10-17
20 T 56 2016-10-17
21 U 65 2016-11-17
22 V 78 2016-11-17
23 W 12 2016-12-17
24 X 23 2016-12-17
25 Y 43 2016-02-17
26 Z 67 2016-03-17
27 AA 35 2015-01-17
28 AB 23 2015-01-29
29 AC 43 2015-02-17
30 AD 35 2015-02-17
31 AE 45 2015-03-17
32 AF 23 2015-03-17
33 AG 43 2015-04-17
34 AH 19 2015-04-17
35 AI 21 2015-05-17
36 AJ 13 2015-05-17
37 AK 22 2015-06-17
38 AL 45 2015-06-17
39 AM 66 2015-07-17
40 AN 77 2015-07-17
41 AO 89 2015-08-17
42 AP 127 2015-08-17
43 AQ 19 2015-09-17
44 AR 223 2015-09-17
45 AS 143 2015-10-17
46 AT 36 2015-10-17
47 AU 45 2015-11-17
48 AV 28 2015-11-17
49 AW 72 2015-12-17
50 AX 24 2015-12-17
51 AY 46 2015-02-17
52 AZ 62 2015-03-17
The column EntryName is the entry identifier, the column Entries has the total number of entries for the date specified in the column DateOfEntry.
I am trying to formulate a query where the total number of entries are displayed on a month-wise basis. I currently have this query :
SELECT DateName(MONTH, e.DateOfEntry) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM #entry e
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry)
ORDER BY MONTH(e.DateOfEntry) ASC
which works fine as far as displaying the results are concerned. However, my issue here is that I need to sort the results on a month-wise basis where the starting month would be dynamic i.e. arising from a parameter (supplied by the user).
This means that if the user selects May of 2015 the results should be sorted from May 2015 to April 2016. Similarly, if the user selects October 2015, the results would be displayed from October 2015 to September 2016.
How would I go about getting this condition within the ORDER BY clause ?
You can put an offset into the ORDER BY using modulo arithmetic. For April:
ORDER BY (MONTH(e.DateOfEntry) + 12 - 4) % 12
--------------------------------------^ month number to start with
(The + 12 is simply so I don't have to remember if % returns negative numbers with negative operands.)
If you want the results chronologically, you can instead do:
ORDER BY MIN(e.DateOfEntry)
You could use the belosw in order by
ORDER BY YEAR(e.DATEOFENTRY),
DATEPART(MM,e.DAREOFENTRY)
This will sort the result first for Year and next for month.
Here you need to specify these same columns in Select.
If I understood you correctly
"This means that if the user selects May of 2015 the results should be sorted from May 2015 to April 2016. Similarly, if the user selects October 2015, the results would be displayed from October 2015 to September 2016."
this should work:
SAMPLE DATA:
IF OBJECT_ID('tempdb..#entry') IS NOT NULL
DROP TABLE #entry;
CREATE TABLE #entry(ID INT ,EntryName VARCHAR(10) , Entries INT , DateOfEntry DATE);
INSERT INTO #entry (ID ,EntryName ,Entries ,DateOfEntry)
VALUES
(1 ,'A', 20 ,'2016-01-17'),
(2 ,'B', 22 ,'2016-01-29'),
(3 ,'C', 23 ,'2016-02-17'),
(4 ,'D', 19 ,'2016-02-17'),
(5 ,'E', 29 ,'2016-03-17'),
(6 ,'F', 30 ,'2016-03-17'),
(7 ,'G', 43 ,'2016-04-17'),
(8 ,'H', 10 ,'2016-04-17'),
(9 ,'I', 5 ,'2016-05-17'),
(10,'J', 120 ,'2016-05-17'),
(11,'K', 220 ,'2016-06-17'),
(12,'L', 210 ,'2016-06-17'),
(13,'M', 10 ,'2016-07-17'),
(14,'N', 20 ,'2016-07-17'),
(15,'O', 15 ,'2016-08-17'),
(16,'P', 17 ,'2016-08-17'),
(17,'Q', 19 ,'2016-09-17'),
(18,'R', 23 ,'2016-09-17'),
(19,'S', 43 ,'2016-10-17'),
(20,'T', 56 ,'2016-10-17'),
(21,'U', 65 ,'2016-11-17'),
(22,'V', 78 ,'2016-11-17'),
(23,'W', 12 ,'2016-12-17'),
(24,'X', 23 ,'2016-12-17'),
(25,'Y', 43 ,'2016-02-17'),
(26,'Z', 67 ,'2016-03-17'),
(27,'AA',35 ,'2015-01-17'),
(28,'AB',23 ,'2015-01-29'),
(29,'AC',43 ,'2015-02-17'),
(30,'AD',35 ,'2015-02-17'),
(31,'AE',45 ,'2015-03-17'),
(32,'AF',23 ,'2015-03-17'),
(33,'AG',43 ,'2015-04-17'),
(34,'AH',19 ,'2015-04-17'),
(35,'AI',21 ,'2015-05-17'),
(36,'AJ',13 ,'2015-05-17'),
(37,'AK',22 ,'2015-06-17'),
(38,'AL',45 ,'2015-06-17'),
(39,'AM',66 ,'2015-07-17'),
(40,'AN',77 ,'2015-07-17'),
(41,'AO',89 ,'2015-08-17'),
(42,'AP',127 ,'2015-08-17'),
(43,'AQ',19 ,'2015-09-17'),
(44,'AR',223 ,'2015-09-17'),
(45,'AS',143 ,'2015-10-17'),
(46,'AT',36 ,'2015-10-17'),
(47,'AU',45 ,'2015-11-17'),
(48,'AV',28 ,'2015-11-17'),
(49,'AW',72 ,'2015-12-17'),
(50,'AX',24 ,'2015-12-17'),
(51,'AY',46 ,'2015-02-17'),
(52,'AZ',62 ,'2015-03-17')
QUERY WITH PARAMS:
DECLARE #Month VARCHAR(2) = '05', #Year VARCHAR(4) = '2015'
SELECT DateName(MONTH, e.DateOfEntry) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM #entry e
WHERE CAST(e.DateOfEntry AS DATE) >= CAST( #Year+#Month+'01' AS DATE)
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry)
ORDER BY MONTH(e.DateOfEntry) ASC
RESULTS:
add a where clause for the query
WHERE MONTH(e.DateOfEntry) < User.Month AND YEAR(e.DateOfEntry) < User.Year AND MONTH(e.DateOfEntry) > (User.Month-1) AND YEAR(e.DateOfEntry) > (User.Year+1)
Assuming your parameter is an Integer called #FirstMonth, you could get the proper month order using:
Case
WHEN MONTH(e.DateOfEntry) < #FirstMonth then MONTH(e.DateOfEntry) + 12
ELSE MONTH(e.DateOfEntry)
END AS MonthNumber
Of all the answers and suggestions I have come across here, I find this way (suggested by xQbert in the question's comments) to be the simplest one :
SELECT DateName(MONTH, e.DateOfEntry) + ' ' + CONVERT(NVARCHAR(100), YEAR(e.DateOfEntry)) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM Entry e
WHERE e.DateOfEntry BETWEEN #StartDate AND (DATEADD(YEAR, 1, #StartDate))
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry), YEAR(e.DateOfEntry)
ORDER BY YEAR(e.DateOfEntry) ASC, MONTH(e.DateOfEntry) ASC
A fiddle to demonstrate this : http://rextester.com/CJFFP5640
Initially, I was using the following query :
SELECT sortingList.MonthOfEntry,
sortingList.TotalEntries,
sortingList.MonthNumber
FROM (
SELECT DateName(MONTH, DATEADD(MONTH, MONTH(e.DateOfEntry), 0) - 1) + ' ' + CONVERT(nvarchar(20),YEAR(e.DateOfEntry)) AS MonthOfEntry,
SUM(e.Entries) as TotalEntries,
CASE
WHEN ((MONTH(e.DateOfEntry) - MONTH(#StartDate)) > 0)
THEN (MONTH(e.DateOfEntry) - MONTH(#StartDate)) + 1
WHEN ((MONTH(e.DateOfEntry) - MONTH(#StartDate)) = 0)
THEN 1
ELSE
((12 - MONTH(#StartDate)) + (MONTH(e.DateOfEntry))) + 1
END
AS MonthNumber
FROM Entry e
WHERE e.DateOfEntry >= #StartDate AND e.DateOfEntry < DATEADD(YEAR, 1, #StartDate)
GROUP BY DateName(MONTH, DATEADD(MONTH, MONTH(e.DateOfEntry), 0) - 1), YEAR(e.DateOfEntry), MONTH(e.DateOfEntry) - MONTH(#StartDate), MONTH(e.DateOfEntry)
) sortingList
ORDER BY sortingList.MonthNumber ASC
Here's an Fiddle to demonstrate this : http://rextester.com/LEVD30653
Explanation (non TL;DR)
You can see that it's essentially the same WHERE clause. However, the query at the top uses much simpler logic for sorting and is more fluent and readable.
Do note that the second solution (using the CASE statement) sorts the month numbers as per the user-provided month number i.e. if the user provides December 2015, then the second solution will number Dec 2015 as 1, January 2016 as 2, February 2016 as 3 and so on and so forth. This might be more beneficial in cases where you want to work on top of this data.
As far as my use-case is concerned, this makes more sense. However, as far as the scope of the question is concerned, the query at the top is the best one.

Oracle SQL weeks sales SUM

I have the sales data in terms of week:
ITEM LOC WEEK SALES
111 39 16/05/2015 10
222 39 16/05/2015 23
111 39 09/05/2015 13
222 39 09/05/2015 33
I want the sum of SALES column for the last 4 weeks.
So it comes like:
ITEM LOC 4-WEEKS-SALES
111 39 23
222 39 56
Just filter for last four weeks and agregate:
select ITEM, LOC,sum(SALES)
from theTable
where WEEK > SYSDATE - ( 7 * 4 )
group by ITEM,LOC
Try to this
select ITEM,LOC,sum(SALES) '4-WEEKS-SALES'
from tablename
where Datepart(wk, WEEK)>=(Datepart(wk, Getdate())-4)
Group by ITEM,LOC,Datepart(wk, WEEK)