Rolling total with no sub-select and no vendor specific extensions - sql

What I'm trying to achieve: rolling total for quantity and amount for a given day, grouped by hour.
It's easy in most cases, but if you have some additional columns (dir and product in my case) and you don't want to group/filter on them, that's a problem.
I know there are extensions in Oracle and MSSQL specifically for that, and there's SELECT OVER PARTITION in Postgres.
At the moment I'm working on an app prototype, and it's backed by MySQL, and I have no idea what it will be using in production, so I'm trying to avoid vendor lock-in.
The entrire table:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
ORDER BY date, hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2230 | 65 | ABCDEDF | 2014-09-11 | 1 | 1 | 10 |
| 2231 | 64 | ABCDEDF | 2014-09-11 | 3 | 4 | 40 |
| 2232 | 64 | ABCDEDF | 2014-09-11 | 5 | 5 | 50 |
| 2235 | 64 | ZZ | 2014-09-11 | 7 | 6 | 60 |
| 2233 | 64 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2237 | 66 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
For a given date:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
WHERE date = '2014-09-18'
ORDER BY hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
The results that I need, using sub-select:
> SELECT date, hour, SUM(quantity),
( SELECT SUM(quantity) FROM sales s2
WHERE s2.hour <= s1.hour AND s2.date = s1.date
) AS total
FROM sales s1
WHERE s1.date = '2014-09-18'
GROUP by date, hour;
+------------+------+---------------+-------+
| date | hour | sum(quantity) | total |
+------------+------+---------------+-------+
| 2014-09-18 | 3 | 3 | 3 |
| 2014-09-18 | 5 | 2 | 5 |
| 2014-09-18 | 7 | 3 | 8 |
+------------+------+---------------+-------+
My concerns for using sub-select:
once there are round million records in the table, the query may become too slow, not sure if it's subject to optimizations even though it has no HAVING statements.
if I had to filter on a product or dir, I will have to put those conditions to both main SELECT and sub-SELECT too (WHERE product = / WHERE dir =).
sub-select only counts a single sum, while I need two of them (sum(quantity) и sum(amount)) (ERROR 1241 (21000): Operand should contain 1 column(s)).
The closest result I were able to get using JOIN:
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount, s2.id
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+------+
| ih | date | hour | quantity | amount | id |
+----+------------+------+----------+--------+------+
| 3 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 3 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 3 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 5 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 5 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 7 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 7 | 2014-09-18 | 7 | 3 | 300 | 2229 |
| 7 | 2014-09-18 | 3 | 1 | 11 | 2234 |
+----+------------+------+----------+--------+------+
I could stop here and just use those results to group by ih (hour), calculate the sum for quantity and amount and be happy. But something eats me up telling that this is wrong.
If I remove DISTINCT most rows become to be duplicated. Replacing JOIN with its invariants doesn't help.
Once I remove s2.id from statement you get a complete mess with disappearing/collapsion meaningful rows (e.g. ids 2236/2227 got collapsed):
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+
| ih | date | hour | quantity | amount |
+----+------------+------+----------+--------+
| 3 | 2014-09-18 | 3 | 1 | 100 |
| 3 | 2014-09-18 | 3 | 1 | 11 |
| 5 | 2014-09-18 | 3 | 1 | 100 |
| 5 | 2014-09-18 | 5 | 2 | 200 |
| 5 | 2014-09-18 | 3 | 1 | 11 |
| 7 | 2014-09-18 | 3 | 1 | 100 |
| 7 | 2014-09-18 | 5 | 2 | 200 |
| 7 | 2014-09-18 | 7 | 3 | 300 |
| 7 | 2014-09-18 | 3 | 1 | 11 |
+----+------------+------+----------+--------+
Summing doesn't help, and it adds up to the mess.
First row (hour = 3) should have SUM(s2.quantity) equal 3, but it has 9. What does SUM(s1.quantity) shows is a complete mystery to me.
> SELECT DISTINCT(s1.hour) AS hour, sum(s1.quantity), s2.date, SUM(s2.quantity)
FROM sales s1 JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
GROUP BY hour;
+------+------------------+------------+------------------+
| hour | sum(s1.quantity) | date | sum(s2.quantity) |
+------+------------------+------------+------------------+
| 3 | 9 | 2014-09-18 | 9 |
| 5 | 8 | 2014-09-18 | 5 |
| 7 | 15 | 2014-09-18 | 8 |
+------+------------------+------------+------------------+
Bonus points/boss level:
I also need a column that will show total_reference, the same rolling total for the same periods for a different date (e.g. 2014-09-11).

If you want a cumulative sum in MySQL, the most efficient way is to use variables:
SELECT date, hour,
(#q := q + #q) as cumeq, (#a := a + #a) as cumea
FROM (SELECT date, hour, SUM(quantity) as q, SUM(amount) as a
FROM sales s
WHERE s.date = '2014-09-18'
GROUP by date, hour
) dh cross join
(select #q := 0, #a := 0) vars
ORDER BY date, hour;
If you are planning on working with databases such as Oracle, SQL Server, and Postgres, then you should use a database more similar in functionality and that supports that ANSI standard window functions. The right way to do this is with window functions, but MySQL doesn't support those. Postgres, SQL Server, and Oracle all have free versions that yo can use for development purposes.
Also, with proper indexing, you shouldn't have a problem with the subquery approach, even on large tables.

Related

Count Since Last Max Within Window

I have been working on this query for most of the night, and just cannot get it to work. This is an addendum to this question. The query should find the "Seqnum" of the last Maximum over the last 10 records. I am unable to limit the last Maximum to just the window.
Below is my best effort at getting there although I have tried many other queries to no avail:
SELECT [id], high, running_max, seqnum,
MAX(CASE WHEN ([high]) = running_max THEN seqnum END) OVER (ORDER BY [id]) AS [lastmax]
FROM (
SELECT [id], [high],
MAX([high]) OVER (ORDER BY [id] ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS running_max,
ROW_NUMBER() OVER (ORDER BY [id]) as seqnum
FROM PY t
) x
When the above query is run, the below results.
id | high | running_max | seqnum | lastmax |
+----+--------+-------------+--------+---------+
| 1 | 28.12 | 28.12 | 1 | 1 |
| 2 | 27.45 | 28.12 | 2 | 1 |
| 3 | 27.68 | 28.12 | 3 | 1 |
| 4 | 27.4 | 28.12 | 4 | 1 |
| 5 | 28.09 | 28.12 | 5 | 1 |
| 6 | 28.07 | 28.12 | 6 | 1 |
| 7 | 28.2 | 28.2 | 7 | 7 |
| 8 | 28.7 | 28.7 | 8 | 8 |
| 9 | 28.05 | 28.7 | 9 | 8 |
| 10 | 28.195 | 28.7 | 10 | 8 |
| 11 | 27.77 | 28.7 | 11 | 8 |
| 12 | 28.27 | 28.7 | 12 | 8 |
| 13 | 28.185 | 28.7 | 13 | 8 |
| 14 | 28.51 | 28.7 | 14 | 8 |
| 15 | 28.5 | 28.7 | 15 | 8 |
| 16 | 28.23 | 28.7 | 16 | 8 |
| 17 | 27.59 | 28.7 | 17 | 8 |
| 18 | 27.6 | 28.51 | 18 | 8 |
| 19 | 27.31 | 28.51 | 19 | 8 |
| 20 | 27.11 | 28.51 | 20 | 8 |
| 21 | 26.87 | 28.51 | 21 | 8 |
| 22 | 27.12 | 28.51 | 22 | 8 |
| 23 | 27.22 | 28.51 | 23 | 8 |
| 24 | 27.3 | 28.5 | 24 | 8 |
| 25 | 27.66 | 28.23 | 25 | 8 |
| 26 | 27.405 | 27.66 | 26 | 8 |
| 27 | 27.54 | 27.66 | 27 | 8 |
| 28 | 27.65 | 27.66 | 28 | 8 |
+----+--------+-------------+--------+---------+
Unfortunately the lastmax column is taking the last max of all the previous records and not the max of the last 10 records only. The way it should result is below:
It is important to note that their can be duplicates in the "High" column, so this will need to be taken into account.
Any help would be greatly appreciated.
This isn't a bug. The issue is that high and lastmax have to come from the same row. This is a confusing aspect when using window functions.
Your logic in the outer query is looking for a row where the lastmax on that row matches the high on that row. That last occurred on row 8. The subsequent maxima are "local", in the sense that there was a higher value on that particular row.
For instance, on row 25, the value is 26.660. That is the maximum value that you want from row 26 onward. But on row 25 itself, then maximum is 28.230. That is clearly not equal to high on that row. So, it doesn't match in the outer query.
I don't think you can easily do what you want using window functions. There may be some tricky way.
A version using cross apply works. I've used id for the lastmax. I'm not sure if you really need seqnum:
select py.[id], py.high, t.high as running_max, t.id as lastmax
from py cross apply
(select top (1) t.*
from (SELECT top (10) t.*
from PY t
where t.id <= py.id
order by t.id desc
) t
order by t.high desc
) t;
Here is a db<>fiddle.

How to select data order by sort value and sort NULL By name

I wrote following SQL query to select data from #tmp table variable.
SELECT #rowCount AS [row-count],
t.[row-no] AS [row-no],
t.[ServiceID] AS ServiceID,
t.ServiceName AS ServiceName,
t.[BranchServiceSortValue] AS SortValue,
(CASE WHEN t.OptIn = 1 THEN 'Yes' ELSE 'No' END) AS OptIn
FROM #tmp t
INNER JOIN dbo.Category
ON Category.CategoryId = t.FkCategoryId
INNER JOIN dbo.ServiceType
ON ServiceType.ServiceTypeId = t.FkServiceTypeId
WHERE t.[row-no] >= #startRow
AND t.[row-no] <= #endRow
ORDER BY t.BranchServiceSortValue,t.serviceName
According to the data in #tmp table,my above query return following output.
| row-count | row-no | ServiceID | ServiceName | SortValue | OptIn |
|-----------|--------|-----------|-------------|-----------|-------|
| 24 | 4 | 1088 | AAB | NULL | No |
| 24 | 5 | 1089 | AAC | NULL | No |
| 24 | 6 | 1090 | AAD | NULL | No |
| 24 | 1 | 1093 | GDGD | 0 | Yes |
| 24 | 7 | 1091 | EETETE | 1 | Yes |
| 24 | 8 | 1092 | CSCDF | 2 | Yes |
| 24 | 3 | 1086 | CXCX | 3 | Yes |
| 24 | 9 | 16 | ASA | 4 | Yes |
| 24 | 2 | 1087 | BFB | 5 | Yes |
| 24 | 10 | 7 | Mortgage | 6 | Yes |
| 24 | 11 | 17 | DDWW | 7 | Yes |
| 24 | 12 | 11 | IL | 8 | Yes |
| 24 | 13 | 5 | SAA | 9 | Yes |
| 24 | 14 | 9 | CD | 10 | Yes |
You can see according to my above query data rows are sorted by SortValue and when SortValue = NULL, those 3 rows sorted by its ServiceName,
But I need to displaySortValue = NULLrows at the bottom of the other rows.Its mean I need to display Null rows after the SortValue Not NULL data and SortValue = NULL should be display order by its ServiceName.
My Expected Output is:
| row-count | row-no | ServiceID | ServiceName | SortValue | OptIn |
|-----------|--------|-----------|-------------|-----------|-------|
| 14 | 1 | 1093 | GDGD | 0 | Yes |
| 14 | 7 | 1091 | EETETE | 1 | Yes |
| 14 | 8 | 1092 | CSCDF | 2 | Yes |
| 14 | 3 | 1086 | CXCX | 3 | Yes |
| 14 | 9 | 16 | ASA | 4 | Yes |
| 14 | 2 | 1087 | BFB | 5 | Yes |
| 14 | 10 | 7 | Mortgage | 6 | Yes |
| 14 | 11 | 17 | DDWW | 7 | Yes |
| 14 | 12 | 11 | IL | 8 | Yes |
| 14 | 13 | 5 | SAA | 9 | Yes |
| 14 | 14 | 9 | CD | 10 | Yes |
| 14 | 4 | 1088 | AAB | NULL | No |
| 14 | 5 | 1089 | AAC | NULL | No |
| 14 | 6 | 1090 | AAD | NULL | No |
How should I need to change my query to get above output? please help me
NULL has the lowest value, so you'll need to use a CASE to put NULL at the end, and then sort by SortValue:
ORDER BY CASE WHEN t.BranchServiceSortValue IS NULL THEN 1 ELSE 0 END,
t.BranchServiceSortValue,
t.serviceName;
Just add a key to the ORDER BY:
ORDER BY (CASE WHEN t.BranchServiceSortValue IS NOT NULL THEN 1 ELSE 2 END),
t.BranchServiceSortValue, t.serviceName
The SQL standard provides the options NULLS FIRST and NULLS LAST for ORDER BY clauses. SQL Server does not (yet) implement these.

windowing function avg in Hive with - over (order by colName)

i'm trying to understand how windowing function avg works, and somehow it seems to not be working as i expect.
here is the dataset :
select * from winsales;
+-------------------+------------------+--------------------+-------------------+---------------+-----------------------+--+
| winsales.salesid | winsales.dateid | winsales.sellerid | winsales.buyerid | winsales.qty | winsales.qty_shipped |
+-------------------+------------------+--------------------+-------------------+---------------+-----------------------+--+
| 30001 | NULL | 3 | b | 10 | 10 |
| 10001 | NULL | 1 | c | 10 | 10 |
| 10005 | NULL | 1 | a | 30 | NULL |
| 40001 | NULL | 4 | a | 40 | NULL |
| 20001 | NULL | 2 | b | 20 | 20 |
| 40005 | NULL | 4 | a | 10 | 10 |
| 20002 | NULL | 2 | c | 20 | 20 |
| 30003 | NULL | 3 | b | 15 | NULL |
| 30004 | NULL | 3 | b | 20 | NULL |
| 30007 | NULL | 3 | c | 30 | NULL |
| 30001 | NULL | 3 | b | 10 | 10 |
+-------------------+------------------+--------------------+-------------------+---------------+-----------------------+--+
When i fire the following query ->
select salesid, sellerid, qty, avg(qty) over (order by sellerid) as avg_qty from winsales order by sellerid,salesid;
I get the following ->
+----------+-----------+------+---------------------+--+
| salesid | sellerid | qty | avg_qty |
+----------+-----------+------+---------------------+--+
| 10001 | 1 | 10 | 20.0 |
| 10005 | 1 | 30 | 20.0 |
| 20001 | 2 | 20 | 20.0 |
| 20002 | 2 | 20 | 20.0 |
| 30001 | 3 | 10 | 18.333333333333332 |
| 30001 | 3 | 10 | 18.333333333333332 |
| 30003 | 3 | 15 | 18.333333333333332 |
| 30004 | 3 | 20 | 18.333333333333332 |
| 30007 | 3 | 30 | 18.333333333333332 |
| 40001 | 4 | 40 | 19.545454545454547 |
| 40005 | 4 | 10 | 19.545454545454547 |
+----------+-----------+------+---------------------+--+
Question is - how is the avg(qty) being calculated.
Since i'm not using partition by, i would expect the avg(qty) to be the same for all rows.
Any ideas ?
if you want to have same avg(qty) to get for all rows then remove order by sellerid in over clause, then you are going to have 19.545454545454547 value for all the rows.
Query to get same avg(qty) for all rows:
hive> select salesid, sellerid, qty, avg(qty) over () as avg_qty from winsales order by sellerid,salesid;
If we include order by sellerid in over clause then you are getting cumulative avg is caluculated for each sellerid.
i.e. for
sellerid 1 you are having 2 records total 2 records with qty as 10,30 so avg would be
(10+30)/2.
sellerid 2 you are having 2 records total 4 records with qty as 20,20 so avg would be
(10+30+20+20)/4 = 20.0
sellerid 3 you are having 5 records total 9 records with qty as so 10,10,15,20,30 avg would be
(10+30+20+20+10+10+15+20+30)/9 = 18.333
sellerid 4 avg is 19.545454545454547
when we include over clause then this is an expected behavior from hive.

SQL window excluding current group?

I'm trying to provide rolled up summaries of the following data including only the group in question as well as excluding the group. I think this can be done with a window function, but I'm having problems with getting the syntax down (in my case Hive SQL).
I want the following data to be aggregated
+------------+---------+--------+
| date | product | rating |
+------------+---------+--------+
| 2018-01-01 | A | 1 |
| 2018-01-02 | A | 3 |
| 2018-01-20 | A | 4 |
| 2018-01-27 | A | 5 |
| 2018-01-29 | A | 4 |
| 2018-02-01 | A | 5 |
| 2017-01-09 | B | NULL |
| 2017-01-12 | B | 3 |
| 2017-01-15 | B | 4 |
| 2017-01-28 | B | 4 |
| 2017-07-21 | B | 2 |
| 2017-09-21 | B | 5 |
| 2017-09-13 | C | 3 |
| 2017-09-14 | C | 4 |
| 2017-09-15 | C | 5 |
| 2017-09-16 | C | 5 |
| 2018-04-01 | C | 2 |
| 2018-01-13 | D | 1 |
| 2018-01-14 | D | 2 |
| 2018-01-24 | D | 3 |
| 2018-01-31 | D | 4 |
+------------+---------+--------+
Aggregated results:
+------+-------+---------+----+------------+------------------+----------+
| year | month | product | ct | avg_rating | avg_rating_other | other_ct |
+------+-------+---------+----+------------+------------------+----------+
| 2018 | 1 | A | 5 | 3.4 | 2.5 | 4 |
| 2018 | 2 | A | 1 | 5 | NULL | 0 |
| 2017 | 1 | B | 4 | 3.6666667 | NULL | 0 |
| 2017 | 7 | B | 1 | 2 | NULL | 0 |
| 2017 | 9 | B | 1 | 5 | 4.25 | 4 |
| 2017 | 9 | C | 4 | 4.25 | 5 | 1 |
| 2018 | 4 | C | 1 | 2 | NULL | 0 |
| 2018 | 1 | D | 4 | 2.5 | 3.4 | 5 |
+------+-------+---------+----+------------+------------------+----------+
I've also considered producing two aggregates, one with the product in question and one without, but having trouble with creating the appropriate joining key.
You can do:
select year(date), month(date), product,
count(*) as ct, avg(rating) as avg_rating,
sum(count(*)) over (partition by year(date), month(date)) - count(*) as ct_other,
((sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
(sum(count(*)) over (partition by year(date), month(date)) - count(*))
) as avg_other
from t
group by year(date), month(date), product;
The rating for the "other" is a bit tricky. You need to add everything up and subtract out the current row -- and calculate the average by doing the sum divided by the count.

How to calculate running total (month to date) in SQL Server 2008

I'm trying to calculate a month-to-date total using SQL Server 2008.
I'm trying to generate a month-to-date count at the level of activities and representatives. Here are the results I want to generate:
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
From the following tables:
ACTIVITIES_FACT table
+-------------------+------+-----------+
| Representative_ID | Date | Activity |
+-------------------+------+-----------+
| 41 | 8/03 | Call |
| 41 | 8/04 | Call |
| 41 | 8/05 | Call |
+-------------------+------+-----------+
LU_TIME table
+-------+-----------------+--------+
| Month | Date | Week |
+-------+-----------------+--------+
| 8 | 8/01 | 8/08 |
| 8 | 8/02 | 8/08 |
| 8 | 8/03 | 8/08 |
| 8 | 8/04 | 8/08 |
| 8 | 8/05 | 8/08 |
+-------+-----------------+--------+
I'm not sure how to do this: I keep running into problems with multiple-counting or aggregations not being allowed in subqueries.
A running total is the summation of a sequence of numbers which is
updated each time a new number is added to the sequence, simply by
adding the value of the new number to the running total.
I THINK He wants a running total for Month by each Representative_Id, so a simple group by week isn't enough. He probably wants his Month_To_Date_Activities_Count to be updated at the end of every week.
This query gives a running total (month to end-of-week date) ordered by Representative_Id, Week
SELECT a.Representative_ID, l.month, l.Week, Count(*) AS Total_Week_Activity_Count
,(SELECT count(*)
FROM ACTIVITIES_FACT a2
INNER JOIN LU_TIME l2 ON a2.Date = l2.Date
AND a.Representative_ID = a2.Representative_ID
WHERE l2.week <= l.week
AND l2.month = l.month) Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
INNER JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Week, l.month
ORDER BY a.Representative_ID, l.Week
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
SQL Fiddle Sample
As I understand your question:
SELECT af.Representative_ID
, lt.Week
, COUNT(af.Activity) AS Qnt
FROM ACTIVITIES_FACT af
INNER JOIN LU_TIME lt ON lt.Date = af.date
GROUP BY af.Representative_ID, lt.Week
SqlFiddle
Representative_ID Week Month_To_Date_Activities_Count
41 2013-08-01 00:00:00.000 1
41 2013-08-08 00:00:00.000 3
USE tempdb;
GO
IF OBJECT_ID('#ACTIVITIES_FACT','U') IS NOT NULL DROP TABLE #ACTIVITIES_FACT;
CREATE TABLE #ACTIVITIES_FACT
(
Representative_ID INT NOT NULL
,Date DATETIME NULL
, Activity VARCHAR(500) NULL
)
IF OBJECT_ID('#LU_TIME','U') IS NOT NULL DROP TABLE #LU_TIME;
CREATE TABLE #LU_TIME
(
Month INT
,Date DATETIME
,Week DATETIME
)
INSERT INTO #ACTIVITIES_FACT(Representative_ID,Date,Activity)
VALUES
(41,'7/31/2013','Chat')
,(41,'8/03/2013','Call')
,(41,'8/04/2013','Call')
,(41,'8/05/2013','Call')
INSERT INTO #LU_TIME(Month,Date,Week)
VALUES
(8,'7/31/2013','8/01/2013')
,(8,'8/01/2013','8/08/2013')
,(8,'8/02/2013','8/08/2013')
,(8,'8/03/2013','8/08/2013')
,(8,'8/04/2013','8/08/2013')
,(8,'8/05/2013','8/08/2013')
--Begin Query
SELECT AF.Representative_ID
,LU.Week
,COUNT(*) AS Month_To_Date_Activities_Count
FROM #ACTIVITIES_FACT AS AF
INNER JOIN #LU_TIME AS LU
ON AF.Date = LU.Date
Group By AF.Representative_ID
,LU.Week