I'm fairly new to SQL and am trying to get the price for a product transaction on a particular date by looking up the most recent price of that product prior to the transaction within a price catalog.
Specifically, I have the two following tables:
Transactions
------------------------------------
ProductID | Design | Transaction_DT
1 | Plaid | 5/14/2016
2 | Solid | 3/26/2016
3 | PolkaDot | 4/12/2016
4 | Solid | 4/24/2016
5 | PolkaDot | 2/24/2016
6 | PinStripe | 3/29/2016
Catalog
------------------------------------
ProductID | Price | Effective_DT
1 | 20 | 4/22/2016
1 | 10 | 5/2/2016
1 | 5 | 5/15/2016
2 | 50 | 3/22/2016
2 | 25 | 4/1/2016
2 | 10 | 4/2/2016
3 | 30 | 4/5/2016
3 | 25 | 4/9/2016
3 | 22 | 4/12/2016
4 | 12 | 3/15/2016
4 | 8 | 3/27/2016
4 | 6 | 4/25/2016
5 | 15 | 2/23/2016
5 | 11 | 2/25/2016
5 | 6 | 2/28/2016
6 | 26 | 2/2/2016
6 | 17 | 3/19/2016
6 | 13 | 5/16/2016
I have entered the following code:
SELECT Transactions.ProductID,
       Catalog.Price,
       Transactions.Transaction_DT,
       Transactions.Design
FROM Transactions
LEFT JOIN Catalog
       ON Transactions.ProductID = Catalog.ProductID
      AND Catalog.Effective_DT = (
            SELECT MAX(Effective_DT)
            FROM Catalog
            WHERE Effective_DT <= Transactions.Transaction_DT
          )
And obtained the following output:
ProductID | Price | Transaction_DT | Design
1 | Null | 5/14/2016 | Plaid
2 | 50 | 3/26/2016 | Solid
3 | 22 | 4/12/2016 | PolkaDot
4 | Null | 4/24/2016 | Solid
5 | 15 | 2/24/2016 | PolkaDot
6 | Null | 3/29/2016 | PinStripe
I would like to return the Price for products 1, 4, and 6 to be 10, 8, and 17 respectively (in addition to the correct prices which were properly output) instead of the Null values I'm getting. Any ideas on how I can obtain the proper results?
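You forgot to filter the correlated subquery by ProductID. As written, MAX(Effective_DT) is taken over the catalog rows of every product, so the join only finds a price when the product itself happens to have a catalog entry on that date; otherwise you get Null. Restrict the subquery to the current product: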
SELECT Transactions.ProductID,
       Catalog.Price,
       Transactions.Transaction_DT,
       Transactions.Design
FROM Transactions
LEFT JOIN Catalog
       ON Transactions.ProductID = Catalog.ProductID
      AND Catalog.Effective_DT = (
            SELECT MAX(Effective_DT)
            FROM Catalog
            WHERE Effective_DT <= Transactions.Transaction_DT
              AND ProductID = Transactions.ProductID
          )
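As a rough check of the corrected query, here is a minimal sketch that loads the sample data into an in-memory SQLite database through Python's sqlite3 module. The use of SQLite, the aliases t, c and c2, and the ISO date format (YYYY-MM-DD) are assumptions made for this illustration; ISO text is used because it sorts chronologically when compared as strings, which M/D/YYYY text does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Transactions (ProductID INTEGER, Design TEXT, Transaction_DT TEXT);
CREATE TABLE Catalog (ProductID INTEGER, Price INTEGER, Effective_DT TEXT);
INSERT INTO Transactions VALUES
  (1, 'Plaid', '2016-05-14'), (2, 'Solid', '2016-03-26'),
  (3, 'PolkaDot', '2016-04-12'), (4, 'Solid', '2016-04-24'),
  (5, 'PolkaDot', '2016-02-24'), (6, 'PinStripe', '2016-03-29');
INSERT INTO Catalog VALUES
  (1, 20, '2016-04-22'), (1, 10, '2016-05-02'), (1, 5, '2016-05-15'),
  (2, 50, '2016-03-22'), (2, 25, '2016-04-01'), (2, 10, '2016-04-02'),
  (3, 30, '2016-04-05'), (3, 25, '2016-04-09'), (3, 22, '2016-04-12'),
  (4, 12, '2016-03-15'), (4, 8, '2016-03-27'), (4, 6, '2016-04-25'),
  (5, 15, '2016-02-23'), (5, 11, '2016-02-25'), (5, 6, '2016-02-28'),
  (6, 26, '2016-02-02'), (6, 17, '2016-03-19'), (6, 13, '2016-05-16');
""")

# The subquery is correlated on BOTH the product and the transaction date,
# so MAX() only looks at catalog rows of the product being priced.
query = """
SELECT t.ProductID, c.Price
FROM Transactions t
LEFT JOIN Catalog c
       ON c.ProductID = t.ProductID
      AND c.Effective_DT = (
            SELECT MAX(c2.Effective_DT)
            FROM Catalog c2
            WHERE c2.ProductID = t.ProductID
              AND c2.Effective_DT <= t.Transaction_DT
          )
ORDER BY t.ProductID
"""
prices = conn.execute(query).fetchall()
print(prices)  # [(1, 10), (2, 50), (3, 22), (4, 8), (5, 15), (6, 17)]
```

With the product filter in place, products 1, 4, and 6 come back as 10, 8, and 17, which is what the question asks for.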
I am trying (and failing) to join some tables in a SQLite database. The data itself is complicated but I think I have boiled it down to an illustrative example.
Here are the three tables I want to join.
Table: Events
+----+---------+-------+-----------+
| id | user_id | class | timestamp |
+----+---------+-------+-----------+
| 1 | 'user1' | 6 | 100 |
| 2 | 'user1' | 12 | 400 |
| 3 | 'user1' | 4 | 900 |
| 4 | 'user2' | 6 | 400 |
| 5 | 'user2' | 3 | 800 |
| 6 | 'user2' | 8 | 900 |
+----+---------+-------+-----------+
Table: Games
+---------+---------+------------+-----------+
| user_id | game_id | game_class | timestamp |
+---------+---------+------------+-----------+
| 'user1' | 1 | 'A' | 200 |
| 'user2' | 2 | 'A' | 300 |
| 'user1' | 3 | 'B' | 500 |
| 'user1' | 4 | 'A' | 600 |
| 'user1' | 5 | 'A' | 700 |
+---------+---------+------------+-----------+
Table: AScores
+---------+-------+
| game_id | score |
+---------+-------+
| 1 | 8 |
| 2 | 2 |
| 4 | 9 |
| 5 | 6 |
+---------+-------+
I would like to join these to add a column to the first table containing the user's current score in game class A at the time of the event, i.e. I would like the result of the join to look like this:
Desired Result
+----+----------+-------+-----------+-----------------+
| id | user_id | class | timestamp | current_a_score |
+----+----------+-------+-----------+-----------------+
| 1 | 'user1' | 6 | 100 | (null) |
| 2 | 'user1' | 12 | 400 | 8 |
| 3 | 'user1' | 4 | 900 | 6 |
| 4 | 'user2' | 6 | 400 | 2 |
| 5 | 'user2' | 3 | 800 | 2 |
| 6 | 'user2' | 8 | 900 | 2 |
+----+----------+-------+-----------+-----------------+
The following simple join pulls together the two tables AScores and Games.
SELECT * FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id
And so I was hoping to join this to the Events table as a sub-query. Something like this:
SELECT Events.*, AScoredGames.time_stamp AS game_time_stamp, AScoredGames.score
FROM Events
LEFT OUTER JOIN (
SELECT AScores.score, Games.* FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id
) AS AScoredGames
ON Events.user_id = AScoredGames.user_id
AND Events.time_stamp >= AScoredGames.time_stamp
ORDER BY Events.time_stamp ASC
That results in the following:
+----+---------+-------+------------+-----------------+-------+
| id | user_id | class | time_stamp | game_time_stamp | score |
+----+---------+-------+------------+-----------------+-------+
| 1 | user1 | 6 | 100 | NULL | NULL |
| 2 | user1 | 12 | 400 | 200 | 8 |
| 4 | user2 | 6 | 400 | 300 | 2 |
| 5 | user2 | 3 | 800 | 300 | 2 |
| 6 | user2 | 8 | 900 | 300 | 2 |
| 3 | user1 | 4 | 900 | 200 | 8 |
| 3 | user1 | 4 | 900 | 600 | 9 |
| 3 | user1 | 4 | 900 | 700 | 6 |
+----+---------+-------+------------+-----------------+-------+
So I need to group by Events.id to get rid of the triplicated row with Events.id 3. But what I want to do is to choose the row with the maximum game_time_stamp but then use the row's score. If I do MAX(game_time_stamp) as my aggregation I still have to independently aggregate the score. Is there a way to tie the row choice in the score column's aggregation function to the result of the game_time_stamp column's aggregation function?
(N.B. Existing answers to questions like Select first record in a One-to-Many relation using left join and SQL Server: How to Join to first row seem to suggest I cannot and say one must use a WHERE clause over a sub-query. But I am struggling with that (I'll post another question about that) and I can think of at least one solution and I am hoping there are better ones.)
The following query should do it. It uses a NOT EXISTS condition with a correlated subquery to locate the relevant game record for each event.
SELECT e.*, s.score current_a_score
FROM events e
LEFT JOIN games g
    ON g.user_id = e.user_id
    AND g.timestamp < e.timestamp
    AND NOT EXISTS (
        SELECT 1
        FROM games g1
        WHERE g1.user_id = e.user_id
          AND g1.timestamp < e.timestamp
          AND g1.timestamp > g.timestamp
    )
LEFT JOIN ascores s
    ON s.game_id = g.game_id
ORDER BY e.id
This DB Fiddle demo with your test data returns :
| id | user_id | class | timestamp | current_a_score |
| --- | ------- | ----- | --------- | --------------- |
| 1 | user1 | 6 | 100 | |
| 2 | user1 | 12 | 400 | 8 |
| 3 | user1 | 4 | 900 | 6 |
| 4 | user2 | 6 | 400 | 2 |
| 5 | user2 | 3 | 800 | 2 |
| 6 | user2 | 8 | 900 | 2 |
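One thing to be aware of: the query above picks the latest game of any class and only then looks up its A score, so an event that follows a class B game would get no score. A possible variant (a different technique, not the NOT EXISTS approach) is a correlated scalar subquery that considers only games with a row in AScores and takes the newest one with ORDER BY ... LIMIT 1. The sketch below runs it on the sample data through Python's sqlite3 module; the aliases and the strict `<` comparison are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (id INTEGER, user_id TEXT, class INTEGER, timestamp INTEGER);
CREATE TABLE Games (user_id TEXT, game_id INTEGER, game_class TEXT, timestamp INTEGER);
CREATE TABLE AScores (game_id INTEGER, score INTEGER);
INSERT INTO Events VALUES
  (1, 'user1', 6, 100), (2, 'user1', 12, 400), (3, 'user1', 4, 900),
  (4, 'user2', 6, 400), (5, 'user2', 3, 800), (6, 'user2', 8, 900);
INSERT INTO Games VALUES
  ('user1', 1, 'A', 200), ('user2', 2, 'A', 300), ('user1', 3, 'B', 500),
  ('user1', 4, 'A', 600), ('user1', 5, 'A', 700);
INSERT INTO AScores VALUES (1, 8), (2, 2), (4, 9), (5, 6);
""")

# For each event, look only at the user's scored (class A) games that
# happened before the event and keep the score of the newest one.
query = """
SELECT e.id,
       (SELECT s.score
          FROM Games g
          JOIN AScores s ON s.game_id = g.game_id
         WHERE g.user_id = e.user_id
           AND g.timestamp < e.timestamp
         ORDER BY g.timestamp DESC
         LIMIT 1) AS current_a_score
FROM Events e
ORDER BY e.id
"""
scores = conn.execute(query).fetchall()
print(scores)  # [(1, None), (2, 8), (3, 6), (4, 2), (5, 2), (6, 2)]
```

On this data both queries agree; they would differ only if a user's most recent game before an event had no entry in AScores.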
I have one work-around, but it feels hacky and relies on the specifics of my data. First note that the time_stamps are all multiples of 100 while the scores are all below 10. I can combine these in a way that does not interfere with the comparison, so that both are encoded in one numeric column. This query gives the desired result:
SELECT Events.id, MIN(Events.user_id) AS user_id, MIN(Events.class) AS class, MIN(Events.time_stamp) AS time_stamp, MAX(AScoredGames.combination) % 10 AS current_a_score
FROM Events
LEFT OUTER JOIN (
SELECT AScores.score, AScores.score + (Games.time_stamp - 10) AS combination, Games.* FROM AScores
INNER JOIN Games
ON AScores.game_id = Games.game_id) AS AScoredGames
ON Events.user_id = AScoredGames.user_id AND Events.time_stamp >= AScoredGames.time_stamp
GROUP BY Events.id
ORDER BY id ASC
(The combining is done in AScores.score + (Games.time_stamp - 10) and so the aggregate function becomes MAX(AScoredGames.combination) % 10.)
Actual Result
+----+---------+-------+------------+-----------------+
| id | user_id | class | time_stamp | current_a_score |
+----+---------+-------+------------+-----------------+
| 1 | user1 | 6 | 100 | NULL |
| 2 | user1 | 12 | 400 | 8 |
| 3 | user1 | 4 | 900 | 6 |
| 4 | user2 | 6 | 400 | 2 |
| 5 | user2 | 3 | 800 | 2 |
| 6 | user2 | 8 | 900 | 2 |
+----+---------+-------+------------+-----------------+
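Since the database here is SQLite, there may be a way to avoid the encoding trick. SQLite documents a special case for aggregate queries that contain a single MAX() or MIN(): bare (non-aggregated) columns in the result take their values from the row that holds the maximum or minimum. This is SQLite-specific behaviour and should not be expected from other engines. A minimal sketch on the sample data, run through Python's sqlite3 module, with the alias ag and the column name game_ts invented for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Events (id INTEGER, user_id TEXT, class INTEGER, timestamp INTEGER);
CREATE TABLE Games (user_id TEXT, game_id INTEGER, game_class TEXT, timestamp INTEGER);
CREATE TABLE AScores (game_id INTEGER, score INTEGER);
INSERT INTO Events VALUES
  (1, 'user1', 6, 100), (2, 'user1', 12, 400), (3, 'user1', 4, 900),
  (4, 'user2', 6, 400), (5, 'user2', 3, 800), (6, 'user2', 8, 900);
INSERT INTO Games VALUES
  ('user1', 1, 'A', 200), ('user2', 2, 'A', 300), ('user1', 3, 'B', 500),
  ('user1', 4, 'A', 600), ('user1', 5, 'A', 700);
INSERT INTO AScores VALUES (1, 8), (2, 2), (4, 9), (5, 6);
""")

# MAX(ag.game_ts) is the only aggregate, so SQLite fills the bare column
# ag.score from the same joined row that supplied the maximum timestamp.
query = """
SELECT e.id, MAX(ag.game_ts) AS game_ts, ag.score AS current_a_score
FROM Events e
LEFT JOIN (SELECT g.user_id, g.timestamp AS game_ts, s.score
             FROM Games g
             JOIN AScores s ON s.game_id = g.game_id) ag
       ON ag.user_id = e.user_id
      AND ag.game_ts <= e.timestamp
GROUP BY e.id
ORDER BY e.id
"""
latest = conn.execute(query).fetchall()
print(latest)
# [(1, None, None), (2, 200, 8), (3, 700, 6), (4, 300, 2), (5, 300, 2), (6, 300, 2)]
```

Unlike the combined-column trick, this does not depend on the timestamps being multiples of 100 or the scores being below 10.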
Here is my table A.
| Id | GroupId | StoreId | Amount |
| 1 | 20 | 7 | 15000 |
| 2 | 20 | 7 | 1230 |
| 3 | 20 | 7 | 14230 |
| 4 | 20 | 7 | 9540 |
| 5 | 20 | 7 | 24230 |
| 6 | 20 | 7 | 1230 |
| 7 | 20 | 7 | 1230 |
Here is my table B.
| Id | GroupId | StoreId | Credit |
| 12 | 20 | 7 | 1230 |
| 14 | 20 | 7 | 15000 |
| 15 | 20 | 7 | 14230 |
| 16 | 20 | 7 | 1230 |
| 17 | 20 | 7 | 7004 |
| 18 | 20 | 7 | 65523 |
I want to get this result without duplicating the Ids of either table.
I need the Ids of tables A and B where Amount = Credit.
| A.ID | B.ID | Amount |
| 1 | 14 | 15000 |
| 2 | 12 | 1230 |
| 3 | 15 | 14230 |
| 4 | null | 9540 |
| 5 | null | 24230 |
| 6 | 16 | 1230 |
| 7 | null | 1230 |
My problem is that when table A has two or more rows with the same Amount, I get a duplicate Id from table B where the value should be null. Please help me. Thank you.
I think you want a left join. But this is tricky because you have duplicate amounts, but you only want one to match. The solution is to use row_number():
select . . .
from (select a.*, row_number() over (partition by amount order by id) as seqnum
      from a
     ) a left join
     (select b.*, row_number() over (partition by credit order by id) as seqnum
      from b
     ) b
     on a.amount = b.credit and a.seqnum = b.seqnum;
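To see the pairing at work, here is a small sketch that runs the same row_number() idea on the sample data through Python's sqlite3 module. It assumes a SQLite build with window functions (3.25 or later); the table names table_a and table_b and the selected column list are choices made for this example, since the original select list is elided.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (Id INTEGER, GroupId INTEGER, StoreId INTEGER, Amount INTEGER);
CREATE TABLE table_b (Id INTEGER, GroupId INTEGER, StoreId INTEGER, Credit INTEGER);
INSERT INTO table_a VALUES
  (1, 20, 7, 15000), (2, 20, 7, 1230), (3, 20, 7, 14230), (4, 20, 7, 9540),
  (5, 20, 7, 24230), (6, 20, 7, 1230), (7, 20, 7, 1230);
INSERT INTO table_b VALUES
  (12, 20, 7, 1230), (14, 20, 7, 15000), (15, 20, 7, 14230),
  (16, 20, 7, 1230), (17, 20, 7, 7004), (18, 20, 7, 65523);
""")

# Number the rows within each amount on both sides, then match the n-th
# occurrence in A only with the n-th occurrence of the same value in B.
query = """
SELECT a.Id AS a_id, b.Id AS b_id, a.Amount
FROM (SELECT Id, Amount,
             ROW_NUMBER() OVER (PARTITION BY Amount ORDER BY Id) AS seqnum
        FROM table_a) a
LEFT JOIN (SELECT Id, Credit,
                  ROW_NUMBER() OVER (PARTITION BY Credit ORDER BY Id) AS seqnum
             FROM table_b) b
       ON a.Amount = b.Credit
      AND a.seqnum = b.seqnum
ORDER BY a.Id
"""
pairs = conn.execute(query).fetchall()
for row in pairs:
    print(row)
# (1, 14, 15000), (2, 12, 1230), (3, 15, 14230), (4, None, 9540),
# (5, None, 24230), (6, 16, 1230), (7, None, 1230)
```

The third 1230 in A (Id 7) has sequence number 3, and B only has two rows with that credit, so it is left unmatched.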
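Another approach, I think simpler and shorter :) Note that it returns the same B.ID for every row of A that shares an amount, so it does not null out the extra duplicates.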
select ID [A.ID],
(select top 1 ID from TABLE_B where Credit = A.Amount) [B.ID],
Amount
from TABLE_A [A]
I hope my description will be enough. I tried to remove all non-significant fields.
I have 5 tables (Customer, Invoice, Items, Invoice_Item, Payment):
Customer fields and sample data are:
+----+------+
| ID | Name |
+----+------+
| 1 | John |
| 2 | Mary |
+----+------+
Invoice fields and sample data are:
+----+-----------+----------+------+
| ID | Date | Customer | Tax |
+----+-----------+----------+------+
| 1 | 1.1.2017 | 1 | 0.10 |
| 2 | 2.1.2017 | 2 | 0.10 |
| 3 | 3.1.2017 | 1 | 0.10 |
| 4 | 3.1.2017 | 2 | 0.10 |
| 5 | 8.1.2017 | 1 | 0.10 |
| 6 | 11.1.2017 | 1 | 0.10 |
| 7 | 12.1.2017 | 2 | 0.10 |
| 8 | 13.1.2017 | 1 | 0.10 |
+----+-----------+----------+------+
Item fields and sample data are:
+----+--------+
| ID | Name |
+----+--------+
| 1 | Door |
| 2 | Window |
| 3 | Table |
| 4 | Chair |
+----+--------+
Invoice_Item fields and sample data are:
+------------+---------+--------+------------+
| Invoice_ID | Item_ID | Amount | Unit_Price |
+------------+---------+--------+------------+
| 1 | 1 | 4 | 10 |
| 1 | 2 | 2 | 20 |
| 1 | 3 | 1 | 30 |
| 1 | 4 | 2 | 40 |
| 2 | 1 | 1 | 10 |
| 2 | 3 | 1 | 15 |
| 2 | 4 | 2 | 12 |
| 3 | 3 | 4 | 15 |
| 4 | 1 | 1 | 10 |
| 4 | 2 | 20 | 30 |
| 4 | 3 | 15 | 30 |
| 5 | 1 | 4 | 10 |
| 5 | 2 | 2 | 20 |
| 5 | 3 | 1 | 30 |
| 5 | 4 | 2 | 40 |
| 6 | 1 | 1 | 10 |
| 6 | 3 | 1 | 15 |
| 6 | 4 | 2 | 12 |
| 7 | 3 | 4 | 15 |
| 8 | 1 | 1 | 10 |
| 8 | 2 | 20 | 30 |
| 8 | 3 | 15 | 30 |
+------------+---------+--------+------------+
The price is in this table rather than in the Item table because it is a customer-specific price.
Payment fields are:
+----------+--------+-----------+
| Customer | Amount | Date |
+----------+--------+-----------+
| 1 | 40 | 3.1.2017 |
| 2 | 10 | 7.1.2017 |
| 1 | 60 | 10.1.2017 |
+----------+--------+-----------+
My report should combine all tables and sort by date (from either Invoice or Payment) for a given customer.
For example, for customer John (1) it should look like:
+------------+----------------+---------+-----------+
| Invoice_ID | Invoice_Amount | Payment | Date |
+------------+----------------+---------+-----------+
| 1 | 171 | - | 1.1.2017 |
| 3 | 54 | - | 3.1.2017 |
| - | - | 40 | 3.1.2017 |
| 5 | 171 | - | 8.1.2017 |
| - | - | 60 | 10.1.2017 |
| 6 | 44.1 | - | 11.1.2017 |
| 8 | 954 | - | 13.1.2017 |
+------------+----------------+---------+-----------+
It is sorted by date; the invoice amount is (sum of (Amount * Unit_Price)) * (1 - Tax).
I started with a union but then got lost.
Here is my attempt:
SELECT Inv_ID as Num, SUM(Invoice_Items.II_Price*Invoice_Items.II_Amount) AS Amount, Inv_Date as Created
FROM Invoice INNER JOIN Invoice_Items ON Invoice.Inv_ID = Invoice_Items.II_Inv_ID
UNION ALL
SELECT Null as Num, P_Value as Amount, P_Date as Created
FROM Payments
ORDER BY created ASC
Your help is appreciated!
Thanks
You can generate the report you requested using the following SQL script:
SELECT CustomerID,Invoice_ID,Invoice_Amount,Payment,Date
FROM (
SELECT c.ID AS CustomerID, i.ID AS Invoice_ID, SUM((t.Amount * t.Unit_Price)*(1-i.tax)) AS Invoice_Amount, NULL AS Payment,i.Date
FROM (Customer c
LEFT JOIN Invoice i
ON c.ID = i.Customer)
LEFT JOIN Invoice_Item t
ON i.ID = t.Invoice_ID
GROUP BY c.ID, i.ID,i.Date
UNION
SELECT c.ID AS CustomerID,NULL AS Invoice_ID, NULL AS Invoice_Amount, p.Amount AS Payment, p.Date
FROM Customer c
INNER JOIN Payment p
ON c.ID = p.Customer ) a
ORDER BY CustomerID, Date, Payment ASC
Note: I've added CustomerID to the output so you know what customer the data corresponds to.
Here is the answer that worked for me, slightly corrected from @Catzeye's answer, which didn't show the second part of the union.
SELECT c.ID AS CustomerID,NULL AS Invoice_ID, NULL AS Invoice_Amount, p.Amount AS Payment, p.Date
FROM Customer c
INNER JOIN Payment p
ON c.ID = p.Customer
UNION ALL
SELECT c.ID AS CustomerID, i.ID AS Invoice_ID, SUM((t.Amount * t.Unit_Price)*(1-i.tax)) AS Invoice_Amount, NULL AS Payment,i.Date
FROM (Customer c
INNER JOIN Invoice i
ON c.ID = i.Customer)
INNER JOIN Invoice_Item t
ON i.ID = t.Invoice_ID
GROUP BY c.ID, i.ID,i.Date
ORDER BY CustomerID, Date, Payment;
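To check the UNION ALL report against the expected figures for customer 1 (John), here is a minimal sketch using Python's sqlite3 module. It assumes SQLite, stores the dates as ISO text (reading 3.1.2017 as 3 January 2017) so that they sort chronologically, loads only John's rows, and filters by customer with a bound parameter; those are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Invoice (ID INTEGER, Date TEXT, Customer INTEGER, Tax REAL);
CREATE TABLE Invoice_Item (Invoice_ID INTEGER, Item_ID INTEGER,
                           Amount INTEGER, Unit_Price INTEGER);
CREATE TABLE Payment (Customer INTEGER, Amount INTEGER, Date TEXT);
INSERT INTO Invoice VALUES
  (1, '2017-01-01', 1, 0.10), (3, '2017-01-03', 1, 0.10),
  (5, '2017-01-08', 1, 0.10), (6, '2017-01-11', 1, 0.10),
  (8, '2017-01-13', 1, 0.10);
INSERT INTO Invoice_Item VALUES
  (1, 1, 4, 10), (1, 2, 2, 20), (1, 3, 1, 30), (1, 4, 2, 40),
  (3, 3, 4, 15),
  (5, 1, 4, 10), (5, 2, 2, 20), (5, 3, 1, 30), (5, 4, 2, 40),
  (6, 1, 1, 10), (6, 3, 1, 15), (6, 4, 2, 12),
  (8, 1, 1, 10), (8, 2, 20, 30), (8, 3, 15, 30);
INSERT INTO Payment VALUES (1, 40, '2017-01-03'), (1, 60, '2017-01-10');
""")

# One branch per row type; each branch fills the other type's columns with
# NULL. In SQLite NULLs sort first, so on a shared date the invoice row
# (Payment IS NULL) is listed before the payment row.
query = """
SELECT i.ID AS Invoice_ID,
       ROUND(SUM(t.Amount * t.Unit_Price) * (1 - i.Tax), 2) AS Invoice_Amount,
       NULL AS Payment,
       i.Date AS Date
FROM Invoice i
JOIN Invoice_Item t ON t.Invoice_ID = i.ID
WHERE i.Customer = ?
GROUP BY i.ID, i.Date, i.Tax
UNION ALL
SELECT NULL, NULL, p.Amount, p.Date
FROM Payment p
WHERE p.Customer = ?
ORDER BY Date, Payment
"""
report = conn.execute(query, (1, 1)).fetchall()
for row in report:
    print(row)
```

The invoice amounts come out as 171, 54, 171, 44.1 and 954, with the two payments (40 and 60) interleaved by date.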
What I'm trying to achieve: rolling total for quantity and amount for a given day, grouped by hour.
It's easy in most cases, but if you have some additional columns (dir and product in my case) and you don't want to group/filter on them, that's a problem.
I know there are extensions in Oracle and MSSQL specifically for that, and there's SELECT OVER PARTITION in Postgres.
At the moment I'm working on an app prototype, and it's backed by MySQL, and I have no idea what it will be using in production, so I'm trying to avoid vendor lock-in.
The entire table:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
ORDER BY date, hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2230 | 65 | ABCDEDF | 2014-09-11 | 1 | 1 | 10 |
| 2231 | 64 | ABCDEDF | 2014-09-11 | 3 | 4 | 40 |
| 2232 | 64 | ABCDEDF | 2014-09-11 | 5 | 5 | 50 |
| 2235 | 64 | ZZ | 2014-09-11 | 7 | 6 | 60 |
| 2233 | 64 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2237 | 66 | ABCDEDF | 2014-09-11 | 7 | 6 | 60 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
For a given date:
> SELECT id, dir, product, date, hour, quantity, amount FROM sales
WHERE date = '2014-09-18'
ORDER BY hour;
+------+-----+---------+------------+------+----------+--------+
| id | dir | product | date | hour | quantity | amount |
+------+-----+---------+------------+------+----------+--------+
| 2227 | 64 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2236 | 66 | ABCDEDF | 2014-09-18 | 3 | 1 | 100 |
| 2234 | 64 | ZZ | 2014-09-18 | 3 | 1 | 11 |
| 2228 | 64 | ABCDEDF | 2014-09-18 | 5 | 2 | 200 |
| 2229 | 64 | ABCDEDF | 2014-09-18 | 7 | 3 | 300 |
+------+-----+---------+------------+------+----------+--------+
The results that I need, using sub-select:
> SELECT date, hour, SUM(quantity),
( SELECT SUM(quantity) FROM sales s2
WHERE s2.hour <= s1.hour AND s2.date = s1.date
) AS total
FROM sales s1
WHERE s1.date = '2014-09-18'
GROUP by date, hour;
+------------+------+---------------+-------+
| date | hour | sum(quantity) | total |
+------------+------+---------------+-------+
| 2014-09-18 | 3 | 3 | 3 |
| 2014-09-18 | 5 | 2 | 5 |
| 2014-09-18 | 7 | 3 | 8 |
+------------+------+---------------+-------+
My concerns for using sub-select:
once there are around a million records in the table, the query may become too slow; I'm not sure whether it can be optimized, even though it has no HAVING clauses.
if I have to filter on product or dir, I will have to put those conditions in both the main SELECT and the sub-SELECT (WHERE product = / WHERE dir =).
the sub-select only computes a single sum, while I need two of them (sum(quantity) and sum(amount)) (ERROR 1241 (21000): Operand should contain 1 column(s)).
The closest result I was able to get using JOIN:
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount, s2.id
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+------+
| ih | date | hour | quantity | amount | id |
+----+------------+------+----------+--------+------+
| 3 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 3 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 3 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 5 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 5 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 5 | 2014-09-18 | 3 | 1 | 11 | 2234 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2236 |
| 7 | 2014-09-18 | 3 | 1 | 100 | 2227 |
| 7 | 2014-09-18 | 5 | 2 | 200 | 2228 |
| 7 | 2014-09-18 | 7 | 3 | 300 | 2229 |
| 7 | 2014-09-18 | 3 | 1 | 11 | 2234 |
+----+------------+------+----------+--------+------+
I could stop here and just use those results to group by ih (hour), calculate the sums for quantity and amount, and be happy. But something nags at me that this is wrong.
If I remove DISTINCT, most rows become duplicated. Replacing JOIN with its variants doesn't help.
Once I remove s2.id from the statement, I get a complete mess with meaningful rows disappearing or collapsing (e.g. ids 2236/2227 got collapsed):
> SELECT DISTINCT(s1.hour) AS ih, s2.date, s2.hour, s2.quantity, s2.amount
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
ORDER by ih;
+----+------------+------+----------+--------+
| ih | date | hour | quantity | amount |
+----+------------+------+----------+--------+
| 3 | 2014-09-18 | 3 | 1 | 100 |
| 3 | 2014-09-18 | 3 | 1 | 11 |
| 5 | 2014-09-18 | 3 | 1 | 100 |
| 5 | 2014-09-18 | 5 | 2 | 200 |
| 5 | 2014-09-18 | 3 | 1 | 11 |
| 7 | 2014-09-18 | 3 | 1 | 100 |
| 7 | 2014-09-18 | 5 | 2 | 200 |
| 7 | 2014-09-18 | 7 | 3 | 300 |
| 7 | 2014-09-18 | 3 | 1 | 11 |
+----+------------+------+----------+--------+
Summing doesn't help, and it adds to the mess.
The first row (hour = 3) should have SUM(s2.quantity) equal to 3, but it has 9. What SUM(s1.quantity) shows is a complete mystery to me.
> SELECT DISTINCT(s1.hour) AS hour, sum(s1.quantity), s2.date, SUM(s2.quantity)
FROM sales s1 JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
GROUP BY hour;
+------+------------------+------------+------------------+
| hour | sum(s1.quantity) | date | sum(s2.quantity) |
+------+------------------+------------+------------------+
| 3 | 9 | 2014-09-18 | 9 |
| 5 | 8 | 2014-09-18 | 5 |
| 7 | 15 | 2014-09-18 | 8 |
+------+------------------+------------+------------------+
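The numbers above are what a join fan-out produces: every s1 row at a given hour is paired with every s2 row at that hour or earlier, and the sums run over the pairs rather than over the original rows. The sketch below counts the pairs for the 2014-09-18 rows, using Python's sqlite3 module with SQLite standing in for MySQL; the column aliases are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER, dir INTEGER, product TEXT, date TEXT,
                    hour INTEGER, quantity INTEGER, amount INTEGER);
INSERT INTO sales VALUES
  (2227, 64, 'ABCDEDF', '2014-09-18', 3, 1, 100),
  (2236, 66, 'ABCDEDF', '2014-09-18', 3, 1, 100),
  (2234, 64, 'ZZ',      '2014-09-18', 3, 1, 11),
  (2228, 64, 'ABCDEDF', '2014-09-18', 5, 2, 200),
  (2229, 64, 'ABCDEDF', '2014-09-18', 7, 3, 300);
""")

# s1_rows * s2_rows = joined_rows; each s1 quantity is repeated once per
# matching s2 row, and each s2 quantity once per s1 row in the group.
query = """
SELECT s1.hour,
       COUNT(DISTINCT s1.id) AS s1_rows,
       COUNT(DISTINCT s2.id) AS s2_rows,
       COUNT(*)              AS joined_rows,
       SUM(s1.quantity)      AS sum_s1,
       SUM(s2.quantity)      AS sum_s2
FROM sales s1
JOIN sales s2 ON s2.date = s1.date AND s2.hour <= s1.hour
WHERE s1.date = '2014-09-18'
GROUP BY s1.hour
ORDER BY s1.hour
"""
fanout = conn.execute(query).fetchall()
print(fanout)  # [(3, 3, 3, 9, 9, 9), (5, 1, 4, 4, 8, 5), (7, 1, 5, 5, 15, 8)]
```

Hour 3 has three s1 rows, so the correct running total of 3 is counted three times (9). Hours 5 and 7 have one s1 row each, so SUM(s2.quantity) happens to be right there, while SUM(s1.quantity) is that row's quantity multiplied by the number of matched s2 rows (2 * 4 = 8 and 3 * 5 = 15).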
Bonus points/boss level:
I also need a column that will show total_reference, the same rolling total for the same periods for a different date (e.g. 2014-09-11).
If you want a cumulative sum in MySQL, the most efficient way is to use variables:
SELECT date, hour,
       (@q := q + @q) as cumeq, (@a := a + @a) as cumea
FROM (SELECT date, hour, SUM(quantity) as q, SUM(amount) as a
      FROM sales s
      WHERE s.date = '2014-09-18'
      GROUP by date, hour
     ) dh cross join
     (select @q := 0, @a := 0) vars
ORDER BY date, hour;
If you are planning on working with databases such as Oracle, SQL Server, and Postgres, then you should use a database that is more similar in functionality and that supports the ANSI standard window functions. The right way to do this is with window functions, but MySQL doesn't support those before version 8.0. Postgres, SQL Server, and Oracle all have free versions that you can use for development purposes.
Also, with proper indexing, you shouldn't have a problem with the subquery approach, even on large tables.
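If vendor neutrality matters more than raw speed, one option that uses neither variables nor window functions is to aggregate per hour first and then self-join the aggregated rows. This is a sketch of that alternative, not the variable-based method above; it is run here against SQLite through Python's sqlite3 module, and the names per_hour, h1, h2, cum_q and cum_a are invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (id INTEGER, dir INTEGER, product TEXT, date TEXT,
                    hour INTEGER, quantity INTEGER, amount INTEGER);
INSERT INTO sales VALUES
  (2230, 65, 'ABCDEDF', '2014-09-11', 1, 1, 10),
  (2231, 64, 'ABCDEDF', '2014-09-11', 3, 4, 40),
  (2227, 64, 'ABCDEDF', '2014-09-18', 3, 1, 100),
  (2236, 66, 'ABCDEDF', '2014-09-18', 3, 1, 100),
  (2234, 64, 'ZZ',      '2014-09-18', 3, 1, 11),
  (2228, 64, 'ABCDEDF', '2014-09-18', 5, 2, 200),
  (2229, 64, 'ABCDEDF', '2014-09-18', 7, 3, 300);
""")

# One row per hour for the chosen day. Any filter on product or dir would
# go here, in one place, and both sums are computed together.
per_hour = """
    SELECT hour, SUM(quantity) AS q, SUM(amount) AS a
    FROM sales
    WHERE date = ?
    GROUP BY hour
"""

# Joining hourly rows (not raw rows) means each earlier hour is added
# exactly once to every later hour's running total.
query = f"""
SELECT h1.hour, h1.q, h1.a, SUM(h2.q) AS cum_q, SUM(h2.a) AS cum_a
FROM ({per_hour}) h1
JOIN ({per_hour}) h2 ON h2.hour <= h1.hour
GROUP BY h1.hour, h1.q, h1.a
ORDER BY h1.hour
"""
day = "2014-09-18"
running = conn.execute(query, (day, day)).fetchall()
print(running)  # [(3, 3, 211, 3, 211), (5, 2, 200, 5, 411), (7, 3, 300, 8, 711)]
```

The self-join is over at most 24 hourly rows per day, so its cost stays small even when the underlying table is large; the heavy part is the per-hour aggregation, which benefits from an index on date.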
I'm trying to calculate a month-to-date total using SQL Server 2008.
I'm trying to generate a month-to-date count at the level of activities and representatives. Here are the results I want to generate:
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
From the following tables:
ACTIVITIES_FACT table
+-------------------+------+-----------+
| Representative_ID | Date | Activity |
+-------------------+------+-----------+
| 41 | 8/03 | Call |
| 41 | 8/04 | Call |
| 41 | 8/05 | Call |
+-------------------+------+-----------+
LU_TIME table
+-------+-----------------+--------+
| Month | Date | Week |
+-------+-----------------+--------+
| 8 | 8/01 | 8/08 |
| 8 | 8/02 | 8/08 |
| 8 | 8/03 | 8/08 |
| 8 | 8/04 | 8/08 |
| 8 | 8/05 | 8/08 |
+-------+-----------------+--------+
I'm not sure how to do this: I keep running into problems with multiple-counting or aggregations not being allowed in subqueries.
A running total is the summation of a sequence of numbers which is
updated each time a new number is added to the sequence, simply by
adding the value of the new number to the running total.
I think he wants a running total for the month for each Representative_ID, so a simple GROUP BY week isn't enough. He probably wants Month_To_Date_Activities_Count to be updated at the end of every week.
This query gives a running total (month to end-of-week date) ordered by Representative_ID, Week:
SELECT a.Representative_ID, l.month, l.Week, Count(*) AS Total_Week_Activity_Count
,(SELECT count(*)
FROM ACTIVITIES_FACT a2
INNER JOIN LU_TIME l2 ON a2.Date = l2.Date
AND a.Representative_ID = a2.Representative_ID
WHERE l2.week <= l.week
AND l2.month = l.month) Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
INNER JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Week, l.month
ORDER BY a.Representative_ID, l.Week
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
SQL Fiddle Sample
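For readers without access to the fiddle, here is a small sketch of the same correlated-subquery pattern, run with Python's sqlite3 module. The rows below are hypothetical data made up for this illustration (they are not the fiddle's data), chosen so the counts have the same shape as the expected output; SQLite stands in for SQL Server 2008, and Week is stored as an ISO week-ending date so that `<=` compares chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACTIVITIES_FACT (Representative_ID INTEGER, Date TEXT, Activity TEXT);
CREATE TABLE LU_TIME (Month INTEGER, Date TEXT, Week TEXT);
-- Hypothetical calendar rows: one per date used below.
INSERT INTO LU_TIME VALUES
  (7, '2013-07-29', '2013-07-31'), (7, '2013-07-30', '2013-07-31'),
  (8, '2013-08-01', '2013-08-08'), (8, '2013-08-03', '2013-08-08'),
  (8, '2013-08-04', '2013-08-08'), (8, '2013-08-05', '2013-08-08'),
  (8, '2013-08-09', '2013-08-15'), (8, '2013-08-12', '2013-08-15'),
  (8, '2013-08-14', '2013-08-15'), (8, '2013-08-20', '2013-08-22');
-- Hypothetical activities for two representatives.
INSERT INTO ACTIVITIES_FACT VALUES
  (41, '2013-07-29', 'Call'), (41, '2013-07-30', 'Call'),
  (41, '2013-08-01', 'Call'), (41, '2013-08-03', 'Call'),
  (41, '2013-08-04', 'Call'), (41, '2013-08-05', 'Call'),
  (41, '2013-08-09', 'Call'), (41, '2013-08-12', 'Call'),
  (41, '2013-08-14', 'Call'), (41, '2013-08-20', 'Call'),
  (40, '2013-08-12', 'Call'), (40, '2013-08-20', 'Call');
""")

# The outer query counts activities per representative and week; the
# subquery recounts everything for the same representative and month up
# to and including that week, which gives the month-to-date total.
query = """
SELECT a.Representative_ID, l.Month, l.Week,
       COUNT(*) AS Total_Week_Activity_Count,
       (SELECT COUNT(*)
          FROM ACTIVITIES_FACT a2
          JOIN LU_TIME l2 ON a2.Date = l2.Date
         WHERE a2.Representative_ID = a.Representative_ID
           AND l2.Month = l.Month
           AND l2.Week <= l.Week) AS Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Month, l.Week
ORDER BY a.Representative_ID, l.Week
"""
mtd = conn.execute(query).fetchall()
for row in mtd:
    print(row)
```

With this data, representative 41 gets weekly counts 4, 3, 1 in month 8 and month-to-date totals 4, 7, 8, and the total starts over for each month because the subquery matches on Month.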
As I understand your question:
SELECT af.Representative_ID
, lt.Week
, COUNT(af.Activity) AS Qnt
FROM ACTIVITIES_FACT af
INNER JOIN LU_TIME lt ON lt.Date = af.date
GROUP BY af.Representative_ID, lt.Week
SqlFiddle
Representative_ID Week Month_To_Date_Activities_Count
41 2013-08-01 00:00:00.000 1
41 2013-08-08 00:00:00.000 3
USE tempdb;
GO
IF OBJECT_ID('#ACTIVITIES_FACT','U') IS NOT NULL DROP TABLE #ACTIVITIES_FACT;
CREATE TABLE #ACTIVITIES_FACT
(
Representative_ID INT NOT NULL
,Date DATETIME NULL
, Activity VARCHAR(500) NULL
)
IF OBJECT_ID('#LU_TIME','U') IS NOT NULL DROP TABLE #LU_TIME;
CREATE TABLE #LU_TIME
(
Month INT
,Date DATETIME
,Week DATETIME
)
INSERT INTO #ACTIVITIES_FACT(Representative_ID,Date,Activity)
VALUES
(41,'7/31/2013','Chat')
,(41,'8/03/2013','Call')
,(41,'8/04/2013','Call')
,(41,'8/05/2013','Call')
INSERT INTO #LU_TIME(Month,Date,Week)
VALUES
(8,'7/31/2013','8/01/2013')
,(8,'8/01/2013','8/08/2013')
,(8,'8/02/2013','8/08/2013')
,(8,'8/03/2013','8/08/2013')
,(8,'8/04/2013','8/08/2013')
,(8,'8/05/2013','8/08/2013')
--Begin Query
SELECT AF.Representative_ID
,LU.Week
,COUNT(*) AS Month_To_Date_Activities_Count
FROM #ACTIVITIES_FACT AS AF
INNER JOIN #LU_TIME AS LU
ON AF.Date = LU.Date
Group By AF.Representative_ID
,LU.Week