Calculating Average based on Month and ID in SQL - sql

I have a sample table like below and would like to do average based on MONTH of the DATE and ID. Is there any way I could do this in SQL.
Table:Input
DATE ID VOLUME
20080630 A 45
20080628 A 23
20080629 A 34
20080627 A 33
20080730 A 45
20080728 A 12
20080730 A 34
20080724 A 56
20080430 A 34
20080428 A 23
20080630 B 12
20080628 B 45
20080629 B 67
20080627 B 78
20080730 B 45
20080728 B 12
20080730 B 34
20080724 B 56
20080430 B 2
20080428 B 34
Table:Output
DATE ID VOLUME AVERAGE
20080630 A 45 33.75
20080628 A 23 33.75
20080629 A 34 33.75
20080627 A 33 33.75
20080730 A 45 36.75
20080728 A 12 36.75
20080730 A 34 36.75
20080724 A 56 36.75
20080430 A 34 28.5
20080428 A 23 28.5
20080630 B 12 50.5
20080628 B 45 50.5
20080629 B 67 50.5
20080627 B 78 50.5
20080730 B 45 36.75
20080728 B 12 36.75
20080730 B 34 36.75
20080724 B 56 36.75
20080430 B 2 18
20080428 B 34 18

You could try this Query:
select DATE_FORMAT(date,'%Y%m'), id, avg(volume) from xxx1
group by DATE_FORMAT(date,'%Y%m'), ID;
or
select DATE_FORMAT(date,'%m'), id, avg(volume) from xxx1
group by DATE_FORMAT(date,'%m'), ID
Explanation:
DATE_FORMAT(date,'%Y%m') extracts the month and year from the date. Whereas DATE_FORMAT(date,'%m') extracts only the month. (I was not quite sure, if you want the month without the year).
Basically you extract the month and then group by it together with the id and calculate the average of the volume for these groups.

This is a Standard SQL answer, as you didn't tag your DBMS:
AVG(VOLUME)
OVER (PARTITION BY EXTRACT(YEAR FROM datecolumn)
,EXTRACT(MONTH FROM datecolumn))

If you are using the mssql server then so
select t1.*, t2.AVG
from TABLE1 t1
join
(
SELECT avg(volume) as AVG, MONTH(date) as DATE
FROM TABLE1
GROUP BY MONTH(date), ID
) t2 on MONTH(t1.DATE) = t2.DATE
The output will match the required. The query will select all columns from source table and will add GROUP BY one column.

Related

How to join two tables based on conditon in sql?

I need to Join two table with respective to two columns
Stu_id (Table 1) - Stu_id (Table 2)
Perf_yr(Table 1) - yr_month (Table 2)
perf_yr starts on every year Sept to Aug.
Perf_yr should match the yr_month based on Perf_yr start and end Month
Table 1
Stu_id Roll_No Avg_marks Perf_yr
1 100244 72 2017
2 200255 62 2018
3 100246 68 2019
Table 2
Stu_id Subject Marks yr_month
1 Maths 70 201609
1 Science 69 201701
1 Social 74 201712
2 Maths 60 201709
2 Science 61 201801
2 Social 62 201808
3 Maths 65 201810
3 Science 64 201912
3 Social 72 201902
Output
Stu_id Roll_No Avg_marks Perf_yr Subject Marks yr_month
1 100244 72 2017 Maths 70 201609
1 100244 72 2017 Science 70 201701
2 200255 62 2018 Maths 60 201709
2 200255 62 2018 Science 61 201801
2 200255 62 2018 Social 62 201808
3 100246 68 2019 Maths 65 201810
3 100246 68 2019 Science 64 201912
3 100246 68 2019 Social 72 201902
I TRIED :
SELECT A.*, B.* FROM
(SELECT * FROM TABLE1 )A
LEFT JOIN
(SELECT * FROM TABLE)B
ON
A.Stu_id = B.Stu_id
AND
A.Perf_yr = B.Yr_Month
BUT IT WONT GIVE THE DESIRED RESULT BECAUSE THE CONDITION IS NOT SATISFYING THE PERF YR START AND END DATE .
You need to parse eg 201609 as a date, add 4 months to it, then match it to the year from the other table. Adding 4 months converts a date range 201609-201708 into being 201701-201712 - we only care about the year part:
SELECT * FROM
t1
INNER JOIN t2
ON
t1.stu_id = t2.stu_id AND
t1.Perf_yr = EXTRACT(year FROM ADD_MONTHS(TO_DATE(t2.yr_month, 'YYYYMM'), 4))
This is oracle. The same logic will work for SQLS, you'll just need to adjust the functions used-
CONVERT(date, t2.yr_month+'01', 112) -- Convert yyyymmdd to a date
DATEADD(month, x, 4) -- add 4 months to x
YEAR(x) -- extract year from date x

Hive Summing up data in the table based on the date range

Have a table with the following schema design and the data residing inside it is like:
ID HITS MISS DDATE
1 10 3 20180101
1 33 21 20180122
1 84 11 20180901
1 11 2 20180405
1 54 23 20190203
1 33 43 20190102
4 54 22 20170305
4 56 88 20180115
5 87 22 20180809
5 66 48 20180617
5 91 53 20170606
DataTypes:
ID INT
HITS INT
MISS INT
DDATE STRING
The requirement is to calculate the total of the given (HITS and MISS) on yearly basis i.e 2017,2018,2019...
Written the following query:
SELECT ID,
SUM(HITS) AS HITS,SUM(MISS) AS MISS,
CASE
WHEN DDATE BETWEEN '201701' AND '201712' THEN '2017' ELSE
'NOTHING' END AS TTL_YR17_DATA
CASE
WHEN DDATE BETWEEN '201801' AND '201812' THEN '2018' ELSE
'NOTHING' END AS TTL_YR18_DATA
CASE
WHEN DDATE BETWEEN '201901' AND '201912' THEN '2019' ELSE
'NOTHING' END AS TTL_YR19_DATA
FROM
HST_TABLE
WHERE
DDATE BETWEEN '201801' AND '201812'
GROUP BY
ID,DDATE;
But, the query is not fetching the expected result.
Actual O/P:
1 10 3 2018
1 33 21 2018
1 84 11 2018
1 11 2 2018
1 54 23 2019
1 33 43 2019
4 54 22 2017
4 56 88 2018
5 87 22 2018
5 66 48 2018
5 91 53 2017
Expected O/P:
1 138 37 2018
4 56 88 2018
5 153 70 2018
1 87 66 2019
5 91 53 2017
Another related question:
Is there a way that I can avoid passing the DDATE range in the query? As this should be given by the user and shouldn't be hardcoded.
Any help/advice to achieve the above two requirements will be really helpful.
OK,it's easy to implement this with the substring function in HIVE, as below:
select
substring(dddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(dddate,0,4),
id
order by
the_year,
id
The answer above by #Shawn.X is correct but has a logical flaw. Below is the corrected one:
select
substring(ddate,0,4) as the_year,
id,
sum(hits) as hits_num,
sum(miss) as miss_num
from
hst_table
group by
substring(ddate,0,4),
id
order by
the_year,
id;

Custom sorting by month name in SQL Server

I have a table where for some dates a certain number of entries are placed. Here is the table structure :
ID EntryName Entries DateOfEntry
1 A 20 2016-01-17
2 B 22 2016-01-29
3 C 23 2016-02-17
4 D 19 2016-02-17
5 E 29 2016-03-17
6 F 30 2016-03-17
7 G 43 2016-04-17
8 H 10 2016-04-17
9 I 5 2016-05-17
10 J 120 2016-05-17
11 K 220 2016-06-17
12 L 210 2016-06-17
13 M 10 2016-07-17
14 N 20 2016-07-17
15 O 15 2016-08-17
16 P 17 2016-08-17
17 Q 19 2016-09-17
18 R 23 2016-09-17
19 S 43 2016-10-17
20 T 56 2016-10-17
21 U 65 2016-11-17
22 V 78 2016-11-17
23 W 12 2016-12-17
24 X 23 2016-12-17
25 Y 43 2016-02-17
26 Z 67 2016-03-17
27 AA 35 2015-01-17
28 AB 23 2015-01-29
29 AC 43 2015-02-17
30 AD 35 2015-02-17
31 AE 45 2015-03-17
32 AF 23 2015-03-17
33 AG 43 2015-04-17
34 AH 19 2015-04-17
35 AI 21 2015-05-17
36 AJ 13 2015-05-17
37 AK 22 2015-06-17
38 AL 45 2015-06-17
39 AM 66 2015-07-17
40 AN 77 2015-07-17
41 AO 89 2015-08-17
42 AP 127 2015-08-17
43 AQ 19 2015-09-17
44 AR 223 2015-09-17
45 AS 143 2015-10-17
46 AT 36 2015-10-17
47 AU 45 2015-11-17
48 AV 28 2015-11-17
49 AW 72 2015-12-17
50 AX 24 2015-12-17
51 AY 46 2015-02-17
52 AZ 62 2015-03-17
The column EntryName is the entry identifier, the column Entries has the total number of entries for the date specified in the column DateOfEntry.
I am trying to formulate a query where the total number of entries are displayed on a month-wise basis. I currently have this query :
SELECT DateName(MONTH, e.DateOfEntry) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM #entry e
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry)
ORDER BY MONTH(e.DateOfEntry) ASC
which works fine as far as displaying the results are concerned. However, my issue here is that I need to sort the results on a month-wise basis where the starting month would be dynamic i.e. arising from a parameter (supplied by the user).
This means that if the user selects May of 2015 the results should be sorted from May 2015 to April 2016. Similarly, if the user selects October 2015, the results would be displayed from October 2015 to September 2016.
How would I go about getting this condition within the ORDER BY clause ?
You can put an offset into the ORDER BY using modulo arithmetic. For April:
ORDER BY (MONTH(e.DateOfEntry) + 12 - 4) % 12
--------------------------------------^ month number to start with
(The + 12 is simply so I don't have to remember if % returns negative numbers with negative operands.)
If you want the results chronologically, you can instead do:
ORDER BY MIN(e.DateOfEntry)
You could use the belosw in order by
ORDER BY YEAR(e.DATEOFENTRY),
DATEPART(MM,e.DAREOFENTRY)
This will sort the result first for Year and next for month.
Here you need to specify these same columns in Select.
If I understood you correctly
"This means that if the user selects May of 2015 the results should be sorted from May 2015 to April 2016. Similarly, if the user selects October 2015, the results would be displayed from October 2015 to September 2016."
this should work:
SAMPLE DATA:
IF OBJECT_ID('tempdb..#entry') IS NOT NULL
DROP TABLE #entry;
CREATE TABLE #entry(ID INT ,EntryName VARCHAR(10) , Entries INT , DateOfEntry DATE);
INSERT INTO #entry (ID ,EntryName ,Entries ,DateOfEntry)
VALUES
(1 ,'A', 20 ,'2016-01-17'),
(2 ,'B', 22 ,'2016-01-29'),
(3 ,'C', 23 ,'2016-02-17'),
(4 ,'D', 19 ,'2016-02-17'),
(5 ,'E', 29 ,'2016-03-17'),
(6 ,'F', 30 ,'2016-03-17'),
(7 ,'G', 43 ,'2016-04-17'),
(8 ,'H', 10 ,'2016-04-17'),
(9 ,'I', 5 ,'2016-05-17'),
(10,'J', 120 ,'2016-05-17'),
(11,'K', 220 ,'2016-06-17'),
(12,'L', 210 ,'2016-06-17'),
(13,'M', 10 ,'2016-07-17'),
(14,'N', 20 ,'2016-07-17'),
(15,'O', 15 ,'2016-08-17'),
(16,'P', 17 ,'2016-08-17'),
(17,'Q', 19 ,'2016-09-17'),
(18,'R', 23 ,'2016-09-17'),
(19,'S', 43 ,'2016-10-17'),
(20,'T', 56 ,'2016-10-17'),
(21,'U', 65 ,'2016-11-17'),
(22,'V', 78 ,'2016-11-17'),
(23,'W', 12 ,'2016-12-17'),
(24,'X', 23 ,'2016-12-17'),
(25,'Y', 43 ,'2016-02-17'),
(26,'Z', 67 ,'2016-03-17'),
(27,'AA',35 ,'2015-01-17'),
(28,'AB',23 ,'2015-01-29'),
(29,'AC',43 ,'2015-02-17'),
(30,'AD',35 ,'2015-02-17'),
(31,'AE',45 ,'2015-03-17'),
(32,'AF',23 ,'2015-03-17'),
(33,'AG',43 ,'2015-04-17'),
(34,'AH',19 ,'2015-04-17'),
(35,'AI',21 ,'2015-05-17'),
(36,'AJ',13 ,'2015-05-17'),
(37,'AK',22 ,'2015-06-17'),
(38,'AL',45 ,'2015-06-17'),
(39,'AM',66 ,'2015-07-17'),
(40,'AN',77 ,'2015-07-17'),
(41,'AO',89 ,'2015-08-17'),
(42,'AP',127 ,'2015-08-17'),
(43,'AQ',19 ,'2015-09-17'),
(44,'AR',223 ,'2015-09-17'),
(45,'AS',143 ,'2015-10-17'),
(46,'AT',36 ,'2015-10-17'),
(47,'AU',45 ,'2015-11-17'),
(48,'AV',28 ,'2015-11-17'),
(49,'AW',72 ,'2015-12-17'),
(50,'AX',24 ,'2015-12-17'),
(51,'AY',46 ,'2015-02-17'),
(52,'AZ',62 ,'2015-03-17')
QUERY WITH PARAMS:
DECLARE #Month VARCHAR(2) = '05', #Year VARCHAR(4) = '2015'
SELECT DateName(MONTH, e.DateOfEntry) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM #entry e
WHERE CAST(e.DateOfEntry AS DATE) >= CAST( #Year+#Month+'01' AS DATE)
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry)
ORDER BY MONTH(e.DateOfEntry) ASC
RESULTS:
add a where clause for the query
WHERE MONTH(e.DateOfEntry) < User.Month AND YEAR(e.DateOfEntry) < User.Year AND MONTH(e.DateOfEntry) > (User.Month-1) AND YEAR(e.DateOfEntry) > (User.Year+1)
Assuming your parameter is an Integer called #FirstMonth, you could get the proper month order using:
Case
WHEN MONTH(e.DateOfEntry) < #FirstMonth then MONTH(e.DateOfEntry) + 12
ELSE MONTH(e.DateOfEntry)
END AS MonthNumber
Of all the answers and suggestions I have come across here, I find this way (suggested by xQbert in the question's comments) to be the simplest one :
SELECT DateName(MONTH, e.DateOfEntry) + ' ' + CONVERT(NVARCHAR(100), YEAR(e.DateOfEntry)) AS MonthOfEntry,
MONTH(e.DateOfEntry) AS MonthNumber,
SUM(e.Entries) AS TotalEntries
FROM Entry e
WHERE e.DateOfEntry BETWEEN #StartDate AND (DATEADD(YEAR, 1, #StartDate))
GROUP BY MONTH(e.DateOfEntry), DateName(MONTH,e.DateOfEntry), YEAR(e.DateOfEntry)
ORDER BY YEAR(e.DateOfEntry) ASC, MONTH(e.DateOfEntry) ASC
A fiddle to demonstrate this : http://rextester.com/CJFFP5640
Initially, I was using the following query :
SELECT sortingList.MonthOfEntry,
sortingList.TotalEntries,
sortingList.MonthNumber
FROM (
SELECT DateName(MONTH, DATEADD(MONTH, MONTH(e.DateOfEntry), 0) - 1) + ' ' + CONVERT(nvarchar(20),YEAR(e.DateOfEntry)) AS MonthOfEntry,
SUM(e.Entries) as TotalEntries,
CASE
WHEN ((MONTH(e.DateOfEntry) - MONTH(#StartDate)) > 0)
THEN (MONTH(e.DateOfEntry) - MONTH(#StartDate)) + 1
WHEN ((MONTH(e.DateOfEntry) - MONTH(#StartDate)) = 0)
THEN 1
ELSE
((12 - MONTH(#StartDate)) + (MONTH(e.DateOfEntry))) + 1
END
AS MonthNumber
FROM Entry e
WHERE e.DateOfEntry >= #StartDate AND e.DateOfEntry < DATEADD(YEAR, 1, #StartDate)
GROUP BY DateName(MONTH, DATEADD(MONTH, MONTH(e.DateOfEntry), 0) - 1), YEAR(e.DateOfEntry), MONTH(e.DateOfEntry) - MONTH(#StartDate), MONTH(e.DateOfEntry)
) sortingList
ORDER BY sortingList.MonthNumber ASC
Here's an Fiddle to demonstrate this : http://rextester.com/LEVD30653
Explanation (non TL;DR)
You can see that it's essentially the same WHERE clause. However, the query at the top uses much simpler logic for sorting and is more fluent and readable.
Do note that the second solution (using the CASE statement) sorts the month numbers as per the user-provided month number i.e. if the user provides December 2015, then the second solution will number Dec 2015 as 1, January 2016 as 2, February 2016 as 3 and so on and so forth. This might be more beneficial in cases where you want to work on top of this data.
As far as my use-case is concerned, this makes more sense. However, as far as the scope of the question is concerned, the query at the top is the best one.

T-SQL Group by day date but i want show query full date

I want to show the date field can not group.
My Query:
SELECT DAY(T1.UI_CreateDate) AS DATEDAY, SUM(1) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
GROUP BY DAY(T1.UI_CreateDate)
Result:
DATEDAY TOTALCOUNT
----------- -----------
15 186
9 1
3 2
26 481
21 297
27 342
18 18
30 14
4 183
25 553
13 8
22 469
16 1
17 28
20 331
28 90
14 33
8 1
But i want to show the full date...
Example result:
DATEDAY TOTALCOUNT
----------- -----------
15/06/2015 186
9/06/2015 1
3/06/2015 2
26/06/2015 481
21/06/2015 297
27/06/2015 342
18/06/2015 18
30/06/2015 14
4/06/2015 183
25/06/2015 553
13/06/2015 8
22/06/2015 469
16/06/2015 1
17/06/2015 28
20/06/2015 331
28/06/2015 90
14/06/2015 33
8/06/2015 1
I want to see the results...
I could not get a kind of results...
How can I do?
Thanx!
How about just casting to date to remove any time component:
SELECT CAST(T1.UI_CreateDate as DATE) AS DATEDAY, COUNT(*) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1
WHERE T1.UI_BR_BO_ID = 45
GROUP BY CAST(T1.UI_CreateDate as DATE)
ORDER BY DATEDAY;
SUM(1) for calculating the count does work. However, because SQL has the COUNT(*) function, it seems a bit awkward.
So you can group by DAY(T1.UI_CreateDate) or use full date for grouping. But these are different . As both these dates '2015-04-15' and '2015-12-15' result in same DAY value of 15.
Assuming you want to group on DAY rather than date please try the below version of query:
SELECT DISTINCT
T1.UI_CreateDate as DATEDAY,
count(1) over (PARTITION BY DAY(T1.UI_CreateDate) ) AS TOTALCOUNT
FROM mydb.dbo.LP_UseImpression T1 WHERE T1.UI_BR_BO_ID = 45
sql fiddle for demo: http://sqlfiddle.com/#!6/c3337/1

sum of two field in two table

I have four tables in the database as follows:
tblInvoice:
invcid,customerid,invoicedate
tblInvcDetail:
ID,invcid,item,itemprice,itemquantity
tblPay:
payid,invcid,paydate
tblPayDetail:
payid,amount
I need to create a list of invoiceid, invoicedate, (sum of itemprice*itemquantity), (sum of amount) where userid is given.
I tried this query:
SELECT tblinvoice.invcid,
tblinvoice.invcdate,
Sum(tblinvcdetail.itemprice * tblinvcdetail.itemquantity) AS SumOfInvoice,
Sum(tblpaydetail.amount) AS SumOfAmount
FROM ((tblinvoice
LEFT JOIN tblpay
ON tblinvoice.invcid = tblpay.invcid)
LEFT JOIN tblinvcdetail
ON tblinvoice.invcid = tblinvcdetail.invcid)
LEFT JOIN tblpaydetail
ON tblpay.payid = tblpaydetail.payid
GROUP BY tblinvoice.invcid,
tblinvoice.invcdate;
But the result is not quite correct
Please help me.
Thanks a lot.
Sample data:
tblInvoice:
invcid customerid invcdate |invcsum(manualy calculated)
18 8 6/30/2012 |$140,000
39 8 7/12/2012 |$170,000
40 8 7/12/2012 |$80,000
43 8 7/14/2012 |$80,000
44 8 7/14/2012 |$80,000
45 8 7/15/2012 |$700,000
46 8 7/17/2012 |$180,000
tblInvcDetail:
ID invccid itemname itemprice itemquantity
19 18 X $70,000 2
92 39 Y $80,000 1
93 39 Z $90,000 1
94 40 Y $80,000 1
97 43 Y $80,000 1
98 44 Y $80,000 1
99 45 W $700,000 1
100 46 Y $80,000 1
101 46 U $100,000 1
tblPay:
payid invcid paydate |AmountSUM(Manually Calculated)
35 18 7/11/2012 |$120,000
40 18 7/12/2012 |$147,000
41 40 7/12/2012 |$84,000
44 44 7/14/2012 |$84,000
46 45 7/15/2012 |$700,000
tblPayDetail:
payid amount
35 $100,000
35 $20,000
40 $147,000
41 $84,000
44 $84,000
46 $700,000
And finally the query result is:
invcid invcdate SumOfInvoice SumOfAmount
18 6/30/2012 $420,000.00 $267,000.00
39 7/12/2012 $170,000.00
40 7/12/2012 $80,000.00 $84,000.00
43 7/14/2012 $80,000.00
44 7/14/2012 $80,000.00 $84,000.00
45 7/15/2012 $700,000.00 $700,000.00
46 7/17/2012 $180,000.00
You can see that the calculation is wrong in the first row (SumOfInvoice column)
and the rest is correct!
How about:
SELECT a.invcid,
a.invcdate,
a.sumofinvoice,
b.sumofamount
FROM (SELECT ti.invcid,
ti.invcdate,
SUM(td.itemprice * td.itemquantity) AS SumOfInvoice
FROM tblinvoice AS ti
LEFT JOIN tblinvcdetail AS td
ON ti.invcid = td.invcid
GROUP BY ti.invcid,
ti.invcdate) a
LEFT JOIN (SELECT tp.invcid,
SUM(tpd.amount) AS SumOfAmount
FROM tblpay AS tp
LEFT JOIN tblpaydetail AS tpd
ON tp.payid = tpd.payid
GROUP BY tp.invcid) b
ON a.invcid = b.invcid