SUM rows by week syntax - sql

I am using the following query
SELECT CONVERT(date,lot.Killdate) as KillDate
,lot.[LotNo]
,lot.[HotWeight]
,lot.[AverageYieldGrade]
,(lot.HeadShorn + lot.HeadUnshorn) as 'Total'
,(lot.HotWeight) / (lot.HeadShorn + lot.HeadUnshorn) as 'AvgWeight'
,date.Date
,date.WeekOfYear
FROM [LambLot].[dbo].[LotHeader] lot, Master_Dim.dbo.DateDim date
WHERE CONVERT(date,lot.KillDate) = date.Date
and lot.KillLocation = 1
AND lot.HeadShorn > 0
AND lot.HeadUnshorn > 0
AND date.Year = 2016
group by lot.LotNo, lot.HeadShorn, lot.HeadUnshorn, lot.HotWeight, lot.AverageYieldGrade, lot.KillDate, date.Date, date.WeekOfYear
order by date.WeekOfYear asc
Here is a copy of the output
KillDate LotNo HotWeight AverageYieldGrade Total AvgWeight Date WeekOfYear
1 2016-01-04 102 21603.5 2.28 348 62.0790229885057 2016-01-04 2
2 2016-01-04 103 2305.3 1.42 53 43.4962264150943 2016-01-04 2
3 2016-01-04 105 1159 0 17 68.1764705882353 2016-01-04 2
4 2016-01-04 108 1493.6 0 39 38.2974358974359 2016-01-04 2
5 2016-01-04 109 2982.8 0 80 37.285 2016-01-04 2
What I would like to do is sum the rows by WeekOfYear, essentially giving me 52 rows of output, with the sum of each column's values shown for each week. Is there a way to do this?

If you need to SUM by just WeekOfYear, you need to GROUP BY just WeekOfYear. You have to decide how to handle the other values. Here's one way that may make sense (with some expected output it would be easier to figure out what you're really looking for):
SELECT CONVERT(date, MAX(lot.Killdate)) as KillDate
,MAX(lot.[LotNo]) -- ?? Does this really belong in this query?
,SUM(lot.[HotWeight])
,AVG(lot.[AverageYieldGrade])
,(SUM(lot.HeadShorn) + SUM(lot.HeadUnshorn)) as 'Total'
,SUM(lot.HotWeight) / (SUM(lot.HeadShorn) + SUM(lot.HeadUnshorn)) as 'AvgWeight'
,MAX(date.Date)
,date.WeekOfYear
FROM [LambLot].[dbo].[LotHeader] lot, Master_Dim.dbo.DateDim date
WHERE CONVERT(date,lot.KillDate) = date.Date
and lot.KillLocation = 1
AND lot.HeadShorn > 0
AND lot.HeadUnshorn > 0
AND date.Year = 2016
group by date.WeekOfYear
order by date.WeekOfYear asc
Just remember: whatever columns you GROUP BY will be factored into how a distinct group is figured out. Anything you don't have in your GROUP BY but want in your output just needs some kind of aggregate function (such as MIN, MAX, AVG, SUM, COUNT, etc.).
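To see the rule in action, here is a minimal sketch that runs the same kind of weekly roll-up against an in-memory SQLite database from Python. The table name lots, its columns, and the sample rows are made up for illustration, and SQLite stands in for SQL Server here, so treat it as a demonstration of the GROUP BY logic rather than of T-SQL syntax.

```python
import sqlite3

# Hypothetical table and rows, loosely modelled on the LotHeader columns above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lots (
        week_of_year INTEGER,
        hot_weight   REAL,
        head_shorn   INTEGER,
        head_unshorn INTEGER
    );
    INSERT INTO lots VALUES
        (2, 1000.0, 10, 10),
        (2,  500.0,  5,  5),
        (3,  900.0, 20, 10);
""")

# Only week_of_year is in GROUP BY; every other output column is an aggregate.
weekly_totals = conn.execute("""
    SELECT week_of_year,
           SUM(hot_weight)                                  AS hot_weight,
           SUM(head_shorn + head_unshorn)                   AS total_head,
           SUM(hot_weight) / SUM(head_shorn + head_unshorn) AS avg_weight
    FROM lots
    GROUP BY week_of_year
    ORDER BY week_of_year
""").fetchall()

for row in weekly_totals:
    print(row)
# (2, 1500.0, 30, 50.0)
# (3, 900.0, 30, 30.0)
```

Note that the average weight is computed as total weight over total head count, not as an average of the per-lot averages, which would weight small and large lots equally.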

Related

combine two rows with 2 months into one row of one month, containing null values into one

I would like to have a dataframe where 1 row only contains one month of data.
month cust_id closed_deals cum_closed_deals checkout cum_checkout
2019-10-01 1 15 15 null null
2019-10-01 1 null 15 210 210
2019-11-01 1 27 42 null 210
2019-11-01 1 null 42 369 579
Expected result:
month cust_id closed_deals cum_closed_deals checkout cum_checkout
2019-10-01 1 15 15 210 210
2019-11-01 1 27 42 369 579
At first, I thought a normal GROUP BY would work, but when I tried to group by only "month" and "cust_id", I got an error saying that closed_deals and checkout also need to be in the GROUP BY.
You may simply aggregate by the (first of the) month and cust_id and take the max of all other columns:
SELECT
month,
cust_id,
MAX(closed_deals) AS closed_deals,
MAX(cum_closed_deals) AS cum_closed_deals,
MAX(checkout) AS checkout,
MAX(cum_checkout) AS cum_checkout
FROM yourTable
GROUP BY
month,
cust_id;
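This works because aggregate functions such as MAX skip NULLs, so each group keeps the one non-null value per column. The sketch below replays the question's sample rows in an in-memory SQLite database from Python; the table name deals is an assumption, and SQLite is only a stand-in for whatever engine the original poster uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE deals (
        month TEXT, cust_id INTEGER,
        closed_deals INTEGER, cum_closed_deals INTEGER,
        checkout INTEGER, cum_checkout INTEGER
    )
""")
# Sample rows from the question; None is stored as SQL NULL.
conn.executemany("INSERT INTO deals VALUES (?, ?, ?, ?, ?, ?)", [
    ("2019-10-01", 1, 15,   15, None, None),
    ("2019-10-01", 1, None, 15, 210,  210),
    ("2019-11-01", 1, 27,   42, None, 210),
    ("2019-11-01", 1, None, 42, 369,  579),
])

# MAX ignores NULLs, so the two half-filled rows collapse into one full row.
merged = conn.execute("""
    SELECT month, cust_id,
           MAX(closed_deals), MAX(cum_closed_deals),
           MAX(checkout), MAX(cum_checkout)
    FROM deals
    GROUP BY month, cust_id
    ORDER BY month
""").fetchall()

for row in merged:
    print(row)
# ('2019-10-01', 1, 15, 15, 210, 210)
# ('2019-11-01', 1, 27, 42, 369, 579)
```

MAX is safe here only because each column has at most one distinct non-null value per group, or a cumulative value that never decreases within the month. If that did not hold, MAX would silently pick the largest value.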

SQL how to count but only count one instance if two columns match?

Wondering how to select from a table:
FIELDID personID purchaseID dateofPurchase
--------------------------------------------------
2 13 147 2014-03-21 00:00:00
3 15 165 2015-03-23 00:00:00
4 13 456 2018-03-24 00:00:00
5 1 133 2018-03-21 00:00:00
6 23 123 2013-03-22 00:00:00
7 25 456 2013-03-21 00:00:00
8 25 456 2013-03-23 00:00:00
9 22 456 2013-03-28 00:00:00
10 25 589 2013-03-21 00:00:00
11 82 147 1991-10-22 00:00:00
12 82 453 2003-03-22 00:00:00
I'd like to get a result table of two columns: weekday and the number of purchases of each weekday, but only count the distinct days of purchases if done by the same person on the same day - for example since personID 25 purchased two things on 2013-03-21, that should only count as one 'thursday' instead of 2.
Basically, if the personID and the dateofPurchase are the same for more than one row, only count it once is what I want.
Here is what I have currently: It does everything correctly except it will count the above scenario under the thursday twice, when I would only want to add one:
SELECT v.wkday as day, COUNT(*) as 'absences'
FROM dbo.AttendanceRecord pr CROSS APPLY
(VALUES (CASE WHEN DATEPART(WEEKDAY, date) IN (1, 7)
THEN 'Weekend'
ELSE DATENAME(WEEKDAY, date)
END)
) v(wkday)
GROUP BY v.wkday;
to clarify:
If an item is purchased for at least one purchaseID on a specific day, the person is counted as having purchased on that day and does not need to be counted again for each new purchaseID on that day.
I think you want to count distinct persons, so that would be:
COUNT(DISTINCT personid) as absences
Note that single quotes are not appropriate around column aliases. If you need to escape them, use square brackets.
EDIT:
If you want to count distinct person-days, then you can use:
COUNT(DISTINCT CONCAT(personid, ':', dateofpurchase)) as absences
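As a rough check of the person-day idea, the sketch below runs it on the 2013 rows from the question, using an in-memory SQLite database from Python. The table name purchases is an assumption, and SQLite's `||` operator and strftime('%w', ...) are substituted for SQL Server's CONCAT and DATEPART, so the weekday comes back as a number (0 = Sunday) rather than a name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE purchases (
        FIELDID INTEGER, personID INTEGER,
        purchaseID INTEGER, dateofPurchase TEXT
    )
""")
# The 2013 rows from the question; person 25 buys twice on 2013-03-21.
conn.executemany("INSERT INTO purchases VALUES (?, ?, ?, ?)", [
    (6,  23, 123, "2013-03-22 00:00:00"),
    (7,  25, 456, "2013-03-21 00:00:00"),
    (8,  25, 456, "2013-03-23 00:00:00"),
    (9,  22, 456, "2013-03-28 00:00:00"),
    (10, 25, 589, "2013-03-21 00:00:00"),
])

# all_rows counts every purchase; person_days counts each
# (person, calendar day) pair once.
per_weekday = conn.execute("""
    SELECT strftime('%w', dateofPurchase) AS weekday_no,
           COUNT(*) AS all_rows,
           COUNT(DISTINCT personID || ':' || date(dateofPurchase)) AS person_days
    FROM purchases
    GROUP BY weekday_no
    ORDER BY weekday_no
""").fetchall()

for row in per_weekday:
    print(row)
# ('4', 3, 2)   Thursday: three purchases, two person-days
# ('5', 1, 1)   Friday
# ('6', 1, 1)   Saturday
```

The separator in the concatenated key matters: without it, person 1 on day 12 and person 11 on day 2 could produce the same string.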

How to calculate a running total that is a distinct sum of values

Consider this dataset:
id site_id type_id value date
------- ------- ------- ------- -------------------
1 1 1 50 2017-08-09 06:49:47
2 1 2 48 2017-08-10 08:19:49
3 1 1 52 2017-08-11 06:15:00
4 1 1 45 2017-08-12 10:39:47
5 1 2 40 2017-08-14 10:33:00
6 2 1 30 2017-08-09 07:25:32
7 2 2 32 2017-08-12 04:11:05
8 3 1 80 2017-08-09 19:55:12
9 3 2 75 2017-08-13 02:54:47
10 2 1 25 2017-08-15 10:00:05
I would like to construct a query that returns a running total for each date by type. I can get close with a window function, but I only want the latest value for each site to be summed for the running total (a simple window function will not work because it sums all values up to a date--not just the last values for each site). So I guess it could be better described as a running distinct total?
The result I'm looking for would be like this:
type_id date sum
------- ------------------- -------
1 2017-08-09 06:49:47 50
1 2017-08-09 07:25:32 80
1 2017-08-09 19:55:12 160
1 2017-08-11 06:15:00 162
1 2017-08-12 10:39:47 155
1 2017-08-15 10:00:05 150
2 2017-08-10 08:19:49 48
2 2017-08-12 04:11:05 80
2 2017-08-13 02:54:47 155
2 2017-08-14 10:33:00 147
The key here is that the sum is not a running sum. It should only be the sum of the most recent values for each site, by type, at each date. I think I can help explain it by walking through the result set I've provided above. For my explanation, I'll walk through the original data chronologically and try to explain the expected result.
The first row of the result starts us off, at 2017-08-09 06:49:47, where chronologically, there is only one record of type 1 and it is 50, so that is our sum for 2017-08-09 06:49:47.
The second row of the result is at 2017-08-09 07:25:32, at this point in time we have 2 unique sites with values for type_id = 1. They have values of 50 and 30, so the sum is 80.
The third row of the result occurs at 2017-08-09 19:55:12, where now we have 3 sites with values for type_id = 1. 50 + 30 + 80 = 160.
The fourth row is where it gets interesting. At 2017-08-11 06:15:00 there are 4 records with a type_id = 1, but 2 of them are for the same site. I'm only interested in the most recent value for each site so the values I'd like to sum are: 30 + 80 + 52 resulting in 162.
The 5th row is similar to the 4th since the value for site_id:1, type_id:1 has changed again and is now 45. This results in the latest values for type_id:1 at 2017-08-12 10:39:47 are now: 30 + 80 + 45 = 155.
Reviewing the 6th row is also interesting when we consider that at 2017-08-15 10:00:05, site 2 has a new value for type_id 1, which gives us: 80 + 45 + 25 = 150 for 2017-08-15 10:00:05.
You can get a cumulative total (running total) by including an ORDER BY clause in your window frame.
select
type_id,
date,
sum(value) over (partition by type_id order by date) as sum
from your_table;
The ORDER BY works because of how the window frame defaults:
The default framing option is RANGE UNBOUNDED PRECEDING, which is the same as RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
SELECT type_id,
date,
SUM(value) OVER (PARTITION BY type_id ORDER BY type_id, date) - (SUM(value) OVER (PARTITION BY type_id, site_id ORDER BY type_id, date) - value) AS sum
FROM your_table
ORDER BY type_id,
date
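As far as I can tell, the query above subtracts only the current site's earlier values, so it matches the expected output until a second site gets an update: on the sample data its last type_id = 1 row would come out as 252 rather than 150. A minimal alternative sketch, assuming each (site_id, type_id) pair has unique timestamps, is to look up the latest row per site as of each date with a correlated subquery. It is run here in an in-memory SQLite database from Python; the table name readings is an assumption, and this approach is my own substitution, not one from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE readings (
        id INTEGER, site_id INTEGER, type_id INTEGER,
        value INTEGER, date TEXT
    )
""")
# Sample rows from the question.
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?, ?)", [
    (1,  1, 1, 50, "2017-08-09 06:49:47"),
    (2,  1, 2, 48, "2017-08-10 08:19:49"),
    (3,  1, 1, 52, "2017-08-11 06:15:00"),
    (4,  1, 1, 45, "2017-08-12 10:39:47"),
    (5,  1, 2, 40, "2017-08-14 10:33:00"),
    (6,  2, 1, 30, "2017-08-09 07:25:32"),
    (7,  2, 2, 32, "2017-08-12 04:11:05"),
    (8,  3, 1, 80, "2017-08-09 19:55:12"),
    (9,  3, 2, 75, "2017-08-13 02:54:47"),
    (10, 2, 1, 25, "2017-08-15 10:00:05"),
])

# For each row r, sum the rows v of the same type that are the most
# recent reading for their site at or before r.date.
latest_sums = conn.execute("""
    SELECT r.type_id, r.date,
           (SELECT SUM(v.value)
              FROM readings v
             WHERE v.type_id = r.type_id
               AND v.date = (SELECT MAX(x.date)
                               FROM readings x
                              WHERE x.type_id = v.type_id
                                AND x.site_id = v.site_id
                                AND x.date <= r.date)) AS latest_sum
    FROM readings r
    ORDER BY r.type_id, r.date
""").fetchall()

for row in latest_sums:
    print(row)
```

On the sample data this prints the sums 50, 80, 160, 162, 155, 150 for type 1 and 48, 80, 155, 147 for type 2, matching the expected result. The nested subqueries scale poorly on large tables, so an index on (type_id, site_id, date) or a window-function rewrite may be worth considering.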

Get YTD sum value

The first two columns work great but I do not know how to get the results for the third column. Please tell me how I can have the third column to show the year to date data with accordance to the week. Please provide assistance.
select "Builder","Traffic", sum(cast("Traffic" as int)) as YTD
from trafficdatapcr
where "Week" = '2016-12-11'
group by "Builder","Traffic"
The sample data:
Week Builder Traffic
2016-12-11 Macys 100
2016-10-11 Bloomingdales 15
2016-08-11 Saks 85
2016-02-11 Cole Haan 95
2015-12-25 Kroger 65
My current results:
Builder Traffic YTD
Macys 100 100
The expected results:
Builder Traffic YTD
Macys 100 100
Saks 0 85
Bloomingdales 0 15
Cole Haan 0 95
Kroger 0 65
Your WHERE clause is eliminating the records you want. Instead of filtering in the WHERE clause, use a CASE expression to conditionally display the traffic for the desired week:
select "Builder"
, case when "Week" = to_date('2016-12-11','YYYY-MM-DD') then "Traffic" else 0 end as "Traffic"
, sum(cast("Traffic" as int)) as YTD
from trafficdatapcr
group by "Builder","Traffic"
Order by week Desc
It does seem odd though that if someone were to select 2016-10-11 the YTD would be all dates.... so perhaps you want to conditionally sum as well...
select "Builder"
, case when "Week" = to_date('2016-12-11','YYYY-MM-DD') then "Traffic" else 0 end as "Traffic"
, sum(case when "Week" <= to_date('2016-12-11','YYYY-MM-DD') then cast("Traffic" as int) else 0 end) as YTD
from trafficdatapcr
group by "Builder","Traffic"
Order by week Desc
This way, assuming a selected date of 2016-10-11, Macys will show as 0 0 and Bloomingdales as 15 15.
So the second query should return the following for that date (I don't know what order you want the rows in):
Builder Traffic YTD
Macys 0 0
Saks 0 85
Bloomingdales 15 15
Cole Haan 0 95
Kroger 0 65
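The conditional-sum idea can be checked with a small sketch against an in-memory SQLite database from Python. It is a variation rather than a copy of the queries above: it groups by builder only and wraps both CASE expressions in SUM, so each builder yields exactly one row. The lowercase column names, the function name ytd_report, and the use of ISO date strings instead of to_date are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trafficdatapcr (week TEXT, builder TEXT, traffic INTEGER)")
# Sample data from the question.
conn.executemany("INSERT INTO trafficdatapcr VALUES (?, ?, ?)", [
    ("2016-12-11", "Macys", 100),
    ("2016-10-11", "Bloomingdales", 15),
    ("2016-08-11", "Saks", 85),
    ("2016-02-11", "Cole Haan", 95),
    ("2015-12-25", "Kroger", 65),
])

def ytd_report(connection, selected_week):
    """Return (builder, traffic in selected_week, traffic up to selected_week)."""
    return connection.execute("""
        SELECT builder,
               SUM(CASE WHEN week =  ? THEN traffic ELSE 0 END) AS traffic,
               SUM(CASE WHEN week <= ? THEN traffic ELSE 0 END) AS ytd
        FROM trafficdatapcr
        GROUP BY builder
        ORDER BY builder
    """, (selected_week, selected_week)).fetchall()

for row in ytd_report(conn, "2016-10-11"):
    print(row)
# ('Bloomingdales', 15, 15)
# ('Cole Haan', 0, 95)
# ('Kroger', 0, 65)
# ('Macys', 0, 0)
# ('Saks', 0, 85)
```

Like the queries above, this sums everything up to the selected week, including the 2015 Kroger row. A true year-to-date figure would also need a lower bound at the start of the selected year.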

How to query to get totals for last seven days?

I am using SQL Server 2008.
I want to write a query that gives me total activity for a number of given days. Specifically, I want to count total votes per day for the last seven days.
My table looks like this:
VoteID --- VoteDate -------------- Vote --- BikeID
1 2012-01-01 08:24:25 1 1234
2 2012-01-01 08:24:25 0 5678
3 2012-01-02 08:24:25 1 1289
4 2012-01-03 08:24:25 0 1234
5 2012-01-04 08:24:25 1 5645
6 2012-01-05 08:24:25 0 1213
7 2012-01-06 08:24:25 1 1234
8 2012-01-07 08:24:25 0 1125
I need my results to look like this
VoteDate ---- Total
2012-01-01 5
2012-01-02 6
2012-01-03 7
2012-01-04 1
2012-01-05 3
My thought is that I have to do something like this:
SELECT SUM(CASE WHEN Vote = 1 THEN 1 ELSE 0 END) AS Total
FROM Votes
GROUP BY VoteDate
This query doesn't work because it counts only votes that occurred (almost exactly) at the same time. Of course, I want to look only at a specific day. How do I make this happen?
Cast it as a date:
SELECT
cast(VoteDate as date) as VoteDate,
SUM(CASE WHEN Vote = 1 THEN 1 ELSE 0 END) AS Total
FROM Votes
WHERE VoteDate between dateadd(day, -7, GETDATE()) and GETDATE()
GROUP BY cast(VoteDate as date)
Your VoteDate column is a datetime, but you just want the date part of it. The easiest way to do that is to cast it to the date type, which is available from SQL Server 2008 onward.
And if your Vote column is either 1 or 0, you can just do sum(vote) as Total instead of doing the case statement.
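The effect of grouping on the date part can be seen in a small sketch run against an in-memory SQLite database from Python. SQLite's date() function stands in for SQL Server's cast(... as date), a fixed reference day replaces GETDATE() so the result is reproducible, and the sample rows and the name reference_day are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Votes (VoteID INTEGER, VoteDate TEXT, Vote INTEGER)")
# Hypothetical rows: three votes on 2012-01-01 at different times of day.
conn.executemany("INSERT INTO Votes VALUES (?, ?, ?)", [
    (1, "2011-12-20 08:00:00", 1),  # outside the seven-day window
    (2, "2012-01-01 08:24:25", 1),
    (3, "2012-01-01 17:05:00", 1),
    (4, "2012-01-01 09:00:00", 0),
    (5, "2012-01-02 08:24:25", 1),
])

reference_day = "2012-01-07"  # stand-in for "today"

# Grouping on date(VoteDate) puts all timestamps of one day in one group.
daily_totals = conn.execute("""
    SELECT date(VoteDate) AS vote_day,
           SUM(Vote)      AS total
    FROM Votes
    WHERE VoteDate >= date(?, '-7 days')
      AND VoteDate <  date(?, '+1 day')
    GROUP BY date(VoteDate)
    ORDER BY vote_day
""", (reference_day, reference_day)).fetchall()

for row in daily_totals:
    print(row)
# ('2012-01-01', 2)
# ('2012-01-02', 1)
```

Grouping on the raw VoteDate column instead would return one row per distinct timestamp, which is the behaviour the question describes.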
SELECT SUM(Vote) As Total, YEAR(VoteDate),Month(VoteDate),Day(VoteDate)
FROM Votes
Group By YEAR(VoteDate),Month(VoteDate),Day(VoteDate)