Cumulative days across rows with several constraints in SQL

I'm trying to figure out how to return a single line per Asset which shows the total days of cumulative Periods. However, I only want to add certain Periods depending on whether the StartDate is within 10 days of the EndDate of the previous Period.
TotalDays column in the sample data: If a Period does not have an EndDate the total days is Today (13/10/2018) minus the StartDate.
Breakdown of the Expected Output Table:
Row 1 / Asset1: TotalDays is 278 because
Period 2 started 1 day after Period 1 ended
and Period 3 started 6 days after Period 2 ended
therefore 63+29+186 = 278
Row 2 / Asset 2: TotalDays is 120 because
Period 1 and 2 are both Open so use the earliest StartDate
Today minus 15/06/2018 = 120
Row 3 / Asset 3: TotalDays is 66 because
Period 2 started over 10 days after Period 1 ended
If an Asset has no Open Periods it would not be displayed in the Output.
Happy to clarify anything as I know this is a bit fiddly!
Many Thanks.
Data Sample:
+-----+---------+--------+------------+------------+-----------+--------------------------------------+--------+
| Row | AssetID | Period | StartDate | EndDate | TotalDays | DaysBetweenEndDateAndStartDateOfNext | Status |
+-----+---------+--------+------------+------------+-----------+--------------------------------------+--------+
| 1 | 1 | 1 | 01/01/2018 | 05/03/2018 | 63 | NULL | Closed |
| 2 | 1 | 2 | 06/03/2018 | 04/04/2018 | 29 | 1 | Closed |
| 3 | 1 | 3 | 10/04/2018 | NULL | 186 | 6 | Open |
| 4 | 2 | 1 | 15/06/2018 | NULL | 120 | NULL | Open |
| 5 | 2 | 2 | 01/07/2018 | NULL | 104 | NULL | Open |
| 6 | 3 | 1 | 01/02/2018 | 10/02/2018 | 9 | NULL | Closed |
| 7 | 3 | 2 | 08/08/2018 | NULL | 66 | 179 | Open |
+-----+---------+--------+------------+------------+-----------+--------------------------------------+--------+
Expected Output:
+-----+---------+------------+---------+-----------+
| Row | AssetID | StartDate | EndDate | TotalDays |
+-----+---------+------------+---------+-----------+
| 1 | 1 | 01/01/2018 | NULL | 278 |
| 2 | 2 | 15/06/2018 | NULL | 120 |
| 3 | 3 | 08/08/2018 | NULL | 66 |
+-----+---------+------------+---------+-----------+
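The breakdown above can be read as a gaps-and-islands problem: periods are chained while each one starts within 10 days of the previous end, overlapping open periods are counted only once, and only a chain that is still open is reported. Below is a minimal Python sketch of that reading, not a definitive implementation. The function name `chain_totals`, the tuple layout, and the fixed `TODAY` value (13/10/2018, as in the question) are assumptions made for illustration.

```python
from datetime import date

TODAY = date(2018, 10, 13)  # "today" as stated in the question
MAX_GAP_DAYS = 10           # a period joins the chain if it starts within 10 days

# (asset_id, start_date, end_date or None for an open period), from the sample data
PERIODS = [
    (1, date(2018, 1, 1), date(2018, 3, 5)),
    (1, date(2018, 3, 6), date(2018, 4, 4)),
    (1, date(2018, 4, 10), None),
    (2, date(2018, 6, 15), None),
    (2, date(2018, 7, 1), None),
    (3, date(2018, 2, 1), date(2018, 2, 10)),
    (3, date(2018, 8, 8), None),
]


def chain_totals(periods, today=TODAY, max_gap=MAX_GAP_DAYS):
    """Return {asset_id: (chain_start, total_days)} for assets whose last chain is open."""
    by_asset = {}
    for asset, start, end in periods:
        by_asset.setdefault(asset, []).append((start, end))

    result = {}
    for asset, rows in by_asset.items():
        rows.sort(key=lambda r: r[0])
        chain_start = chain_end = None
        total = 0
        is_open = False
        for start, end in rows:
            effective_end = end or today  # open periods run until today
            if chain_end is None or (start - chain_end).days > max_gap:
                # First period, or the gap is too large: start a new chain.
                chain_start, chain_end = start, effective_end
                total = (effective_end - start).days
                is_open = end is None
            else:
                # Same chain: add only the days not already covered.
                covered_from = max(start, chain_end)
                total += max(0, (effective_end - covered_from).days)
                chain_end = max(chain_end, effective_end)
                is_open = is_open or end is None
        if is_open:
            result[asset] = (chain_start, total)
    return result


totals = chain_totals(PERIODS)
for asset, (start, days) in sorted(totals.items()):
    print(asset, start.strftime("%d/%m/%Y"), days)
```

On the sample rows this gives 278, 120 and 66 days for assets 1, 2 and 3. The same logic could be expressed in SQL with LAG() and a running SUM(), but that translation is left out here.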

Related

Find Aggregated Data Between Two Dates in Two Tables Where One is Updated Weekly and Other is Updated Hourly

I have data in two different tables: one is updated every week (or once mid-week if needed), and the other is updated roughly every hour because it holds more data. The first table looks like this:
agent_id | rank | ranking_date
---------------------------
1 | 1 | 2022-03-21
2 | 2 | 2022-03-21
1 | 4 | 2022-03-14
2 | 3 | 2022-03-14
1 | 2 | 2022-03-10
And the second table contains detailed information on sales.
agent_id | call_id | talk_time | product_sold | amount | call_date
------------------------------------------------------------------
1 | 1 | 13 | 1 | 53 |2022-03-10
1 | 2 | 24 | 2 | 2 |2022-03-10
2 | 3 | 43 | 4 | 11 |2022-03-10
1 | 4 | 31 | - | 0 |2022-03-10
2 | 5 | 12 | - | 0 |2022-03-10
1 | 6 | 11 | - | 0 |2022-03-11
1 | 7 | 35 | 2 | 79 |2022-03-11
2 | 8 | 76 | - | 0 |2022-03-11
1 | 9 | 42 | 1 | 23 |2022-03-11
2 | 10 | 69 | - | 0 |2022-03-11
How can I merge the two tables to get their aggregated information? Remember that the ranks change at the beginning of every week, while sales happen every day, and the rankings can also be changed in the middle of the week if needed. What I am trying to create is an aggregated table for understanding the sales by each agent, something like this:
agent_id | rank | ranking_date | total_calls_handled | total_talktime | total_amount
------------------------------------------------------------------------------------
1 | 1 | 2022-03-21 | 100 | 875 | 3000 (this is 3/21 - today)
2 | 2 | 2022-03-21 | 120 | 576 | 3689 (this is 3/21 - today)
1 | 4 | 2022-03-14 | 210 | 246 | 1846 (this is 3/14 - 3/21)
2 | 3 | 2022-03-14 | 169 | 693 | 8562 (this is 3/14 - 3/21)
1 | 2 | 2022-03-10 | 201 | 559 | 1749 (this is 3/7 - 3/10)
So the data is aggregated for each agent from 3/7 to 3/10, 3/10 to 3/14, then 3/14 to 3/21. Also, if the latest ranking date is 2022-03-21 and today is 2022-03-23, the query should return the aggregation up to today.
[Edit]: added table and data details
Table and data details:
Rankings table:
agent_id: unique_id of the agent
rank: rank assigned to the agent, updated every Monday or when needed
ranking_date: date when agent's ranking was last updated (Automatically every Monday or if needed)
Sales Table:
agent_id: unique_id of the agent
call_id: unique_id for a call
talk_time: duration of the call
product_sold: unique_id of the product sold (- if the agent did not make a sale)
amount: commission earned by the agent (so the same product_id can have different amounts; 0 if the agent did not make a sale)
call_date: date on which the call was made
[Edit 2]: Here is SQLFiddle.
Here we join where ranking_date and call_date are in the same week. If calls are made on a Sunday, you will need to check that they fall in the week you expect.
The syntax in the query is for SQL Server, as in the SQL Fiddle given. You will need to change the join line to
on date_part(w,r.ranking_date) = date_part(w,s.call_date)
which should be compatible with Amazon Redshift.
select
    r.agent_id,
    r.rank,
    r.ranking_date,
    count(s.call_id) TotalCalls,
    sum(s.talk_time) TotalTime,
    sum(s.amount) TotalAmount
from rankings r
left join sales s
    on datename(ww,r.ranking_date)= datename(ww,s.call_date)
group by
    r.agent_id,
    r.rank,
    r.ranking_date
GO
agent_id | rank | ranking_date | TotalCalls | TotalTime | TotalAmount
-------: | ---: | :----------- | ---------: | --------: | ----------:
1 | 1 | 2022-03-21 | 0 | null | null
1 | 2 | 2022-03-10 | 10 | 356 | 168
1 | 4 | 2022-03-14 | 0 | null | null
2 | 2 | 2022-03-21 | 0 | null | null
2 | 3 | 2022-03-14 | 0 | null | null
db<>fiddle here
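Note that the query above matches on week number only and does not compare agent_id, which is why a single ranking row picks up all ten calls. A different approach, sketched below under stated assumptions, is a range join: each ranking row is taken to cover the calls of the same agent from its ranking_date up to (but not including) that agent's next ranking_date, or up to today when there is no later ranking. That interpretation of the question is an assumption, as is the use of SQLite through Python (window functions need SQLite 3.25 or later); the product_sold column is left out because it is not aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rankings (agent_id INTEGER, "rank" INTEGER, ranking_date TEXT);
INSERT INTO rankings VALUES
  (1, 1, '2022-03-21'), (2, 2, '2022-03-21'),
  (1, 4, '2022-03-14'), (2, 3, '2022-03-14'),
  (1, 2, '2022-03-10');
CREATE TABLE sales (agent_id INTEGER, call_id INTEGER, talk_time INTEGER,
                    amount INTEGER, call_date TEXT);
INSERT INTO sales VALUES
  (1, 1, 13, 53, '2022-03-10'), (1, 2, 24, 2, '2022-03-10'),
  (2, 3, 43, 11, '2022-03-10'), (1, 4, 31, 0, '2022-03-10'),
  (2, 5, 12, 0, '2022-03-10'), (1, 6, 11, 0, '2022-03-11'),
  (1, 7, 35, 79, '2022-03-11'), (2, 8, 76, 0, '2022-03-11'),
  (1, 9, 42, 23, '2022-03-11'), (2, 10, 69, 0, '2022-03-11');
""")

QUERY = """
WITH r AS (
  SELECT agent_id, "rank", ranking_date,
         -- next ranking date of the same agent; NULL for the latest ranking
         LEAD(ranking_date) OVER (PARTITION BY agent_id
                                  ORDER BY ranking_date) AS next_date
  FROM rankings
)
SELECT r.agent_id, r."rank", r.ranking_date,
       COUNT(s.call_id)  AS total_calls_handled,
       SUM(s.talk_time)  AS total_talktime,
       SUM(s.amount)     AS total_amount
FROM r
LEFT JOIN sales s
  ON  s.agent_id = r.agent_id
  AND s.call_date >= r.ranking_date
  AND (r.next_date IS NULL OR s.call_date < r.next_date)
GROUP BY r.agent_id, r."rank", r.ranking_date
ORDER BY r.agent_id, r.ranking_date
"""

rows = conn.execute(QUERY).fetchall()
by_key = {(agent, day): (calls, talk, amount)
          for agent, _rank, day, calls, talk, amount in rows}
for row in rows:
    print(row)
```

With the sample rows, agent 1's ranking of 2022-03-10 collects that agent's six calls. Agent 2's calls predate the agent's first ranking and stay unassigned, so whether they should be attached to an earlier period is a design decision the question leaves open.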

Sum of Time differences while having times in 6 different columns

I have the following table:
Station|Status | Start_hour | Start_minute | Start_second | End_hour | End_minute | End_second
Is it possible to sum up times for every entry that has end_ fields filled between given time?
Example:
Station|Status|Start_hour|Start_minute|Start_second|End_hour|End_minute|End_second
8 | 0 | 10 | 5 | 0 | NULL | NULL | NULL
8 | 1 | 10 | 5 | 0 | 10 | 15 | 30
2 | 1 | 9 | 53 | 0 | 10 | 16 | 45
7 | 0 | 10 | 23 | 0 | NULL | NULL | NULL
So the output would look like this:
Sum of time
29:15 /* sum of times for stations 8 and 2 (5:30 + 23:45) */
The catch: I have to do it in a single query.
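One possible single-query approach is to convert each start and end to seconds since midnight, subtract, and sum over the rows whose end fields are filled. The sketch below runs that query in SQLite through Python; the table name `station_log` is hypothetical, and a filter for the "given time" window would go in the WHERE clause. With the rows exactly as printed, station 8 contributes 10:30 (10:05:00 to 10:15:30) rather than the 5:30 quoted, so the result here is 34:15.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE station_log (
  Station INTEGER, Status INTEGER,
  Start_hour INTEGER, Start_minute INTEGER, Start_second INTEGER,
  End_hour INTEGER, End_minute INTEGER, End_second INTEGER);
INSERT INTO station_log VALUES
  (8, 0, 10, 5, 0, NULL, NULL, NULL),
  (8, 1, 10, 5, 0, 10, 15, 30),
  (2, 1, 9, 53, 0, 10, 16, 45),
  (7, 0, 10, 23, 0, NULL, NULL, NULL);
""")

# Seconds since midnight for end minus start, summed over finished entries only.
QUERY = """
SELECT SUM((End_hour * 3600 + End_minute * 60 + End_second)
         - (Start_hour * 3600 + Start_minute * 60 + Start_second)) AS total_seconds
FROM station_log
WHERE End_hour IS NOT NULL
"""

total_seconds = conn.execute(QUERY).fetchone()[0]
minutes, seconds = divmod(total_seconds, 60)
formatted = f"{minutes}:{seconds:02d}"
print(formatted)
```

The arithmetic assumes start and end fall on the same day; entries that cross midnight would need a date column or an extra 24-hour correction.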

SQL generate unique ID from rolling ID

I've been trying to find an answer to this for the better part of a day with no luck.
I have a SQL table with measurement data for samples and I need a way to assign a unique ID to each sample. Right now each sample has an ID number that rolls over frequently. What I need is a unique ID for each sample. Below is a table with a simplified dataset, as well as an example of a possible UID that would do what I need.
| Row | Time | Meas# | Sample# | UID (Desired) |
| 1 | 09:00 | 1 | 1 | 1 |
| 2 | 09:01 | 2 | 1 | 1 |
| 3 | 09:02 | 3 | 1 | 1 |
| 4 | 09:07 | 1 | 2 | 2 |
| 5 | 09:08 | 2 | 2 | 2 |
| 6 | 09:09 | 3 | 2 | 2 |
| 7 | 09:24 | 1 | 3 | 3 |
| 8 | 09:25 | 2 | 3 | 3 |
| 9 | 09:25 | 3 | 3 | 3 |
| 10 | 09:47 | 1 | 1 | 4 |
| 11 | 09:47 | 2 | 1 | 4 |
| 12 | 09:49 | 3 | 1 | 4 |
My problem is that rows 10-12 have the same Sample# as rows 1-3. I need a way to uniquely identify and group each sample. Having the row number or time of the first measurement on the sample would be good.
One other complication is that the measurement number doesn't always start with 1. It's based on measurement locations, and sometimes it skips location 1 and only has locations 2 and 3.
I am going to speculate that you want a unique number assigned to each sample, where now you have repeats.
If so, you can use lag() and a cumulative sum:
select t.*,
       sum(case when prev_sample = sample then 0 else 1 end) over (order by row) as new_sample_number
from (select t.*,
             lag(sample) over (order by row) as prev_sample
      from t
     ) t;
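To check the idea against the sample rows, here is a small sketch that runs the same lag-plus-cumulative-sum query in SQLite through Python (window functions need SQLite 3.25 or later). The table name `measurements` and the column names `row_no`, `meas_no` and `sample_no` are stand-ins chosen for illustration, since `Row` and `Sample#` are awkward identifiers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurements (row_no INTEGER, meas_no INTEGER, sample_no INTEGER)"
)

# Sample# column from the question; rows 10-12 reuse sample number 1.
sample_numbers = [1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1]
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?, ?)",
    [(i + 1, i % 3 + 1, s) for i, s in enumerate(sample_numbers)],
)

QUERY = """
SELECT row_no,
       -- add 1 every time the sample number differs from the previous row
       SUM(CASE WHEN prev_sample = sample_no THEN 0 ELSE 1 END)
           OVER (ORDER BY row_no) AS uid
FROM (SELECT m.*,
             LAG(sample_no) OVER (ORDER BY row_no) AS prev_sample
      FROM measurements m)
ORDER BY row_no
"""

uids = [uid for _row_no, uid in conn.execute(QUERY)]
print(uids)
```

The first row has no previous sample, so the comparison is NULL and the CASE falls through to 1, which starts the numbering. One limitation of this approach: two consecutive samples that happen to carry the same Sample# would be merged into one UID.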

MS SQL Server: Load All Data vs Aggregate with +1 round trip

I'd love to get your opinion on this problem.
I need to show the list of order records for a particular date/time range, then summarise it with the number of orders compared with the "last" period. "Last" can mean either last month or last year.
Since I am going to show the list of order records anyway, I am thinking of fetching the records from last month or last year in the same hit (i.e. together with the records of the current date/time range).
Or, alternatively, I can:
Get the records of the current date/time range, then
Get the total number of orders (using an aggregate) for last month or last year
The alternative means there are 2 round trips to the database (but less data to return). Or should I stick with my current method (loading all records, including those from last month or last year)?
NOTE: The website and the SQL server is hosted in Microsoft Azure Cloud. But we might switch to AWS in the future.
Thanks
Input example (some fields are omitted including time for simplicity)
----------------------------------------------------------------
| Warehouse Id | Order Id | Product Id | Quantity | Order Date |
----------------------------------------------------------------
| 1 | 10 | 1 | 10 | 2016-09-25 |
| 1 | 9 | 5 | 5 | 2016-09-24 |
| 1 | 8 | 4 | 8 | 2016-09-23 |
| 1 | 7 | 6 | 2 | 2016-09-23 |
| 1 | 6 | 8 | 1 | 2016-09-23 |
| 1 | 5 | 1 | 2 | 2016-09-22 |
| 1 | 4 | 1 | 2 | 2016-09-21 |
| 1 | 3 | 5 | 10 | 2016-09-21 |
| 1 | 2 | 5 | 15 | 2016-08-12 |
| 1 | 1 | 5 | 5 | 2016-08-10 |
----------------------------------------------------------------
The desired output:
Input:
WarehouseId: 1
StartDate: 2016-09-01 EndDate: 2016-09-30
Comparison type: Last Month (i.e. StartDate: 2016-08-01 EndDate: 2016-08-31)
Output:
Warehouse: xxx
-------------------------------------------------
| Order Id | Product Id | Quantity | Order Date |
-------------------------------------------------
| 10 | 1 | 10 | 2016-09-25 |
| 9 | 5 | 5 | 2016-09-24 |
| 8 | 4 | 8 | 2016-09-23 |
| 7 | 6 | 2 | 2016-09-23 |
| 6 | 8 | 1 | 2016-09-23 |
| 5 | 1 | 2 | 2016-09-22 |
| 4 | 1 | 2 | 2016-09-21 |
| 3 | 5 | 10 | 2016-09-21 |
-------------------------------------------------
Total Order: 40 (increase 100% from last month)
So, what I am doing now is to get ALL records from 2016-08-01 to 2016-09-30. That way I can avoid 2 round trips.
Alternatively, I can do the following:
1. Get the records from 2016-09-01 to 2016-09-30
// Current-period rows; named recs so it does not clash with the range variable rec
var recs = (from rec in ef.tblOrders
            where (rec.WarehouseId == whsId) && (rec.OrderDate >= startDate) && (rec.OrderDate <= endDate)
            select rec).ToList();
2. Then do the SUM of total orders from 2016-08-01 to 2016-08-31 for comparison purposes
// Aggregate only: a single row comes back for the comparison period
var recSum = (from rec in ef.tblOrders
              where (rec.WarehouseId == whsId) && (rec.OrderDate >= cStartDate) && (rec.OrderDate <= cEndDate)
              group rec by rec.WarehouseId into grec
              select new
              {
                  TotalQty = grec.Sum(x => x.Quantity),
              }).FirstOrDefault();
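You can do this with window functions:
select o.*
from (select o.*,
             -- count the comparison-period orders once, across all rows
             sum(case when o.OrderDate >= @cStartDate and o.OrderDate <= @cEndDate
                      then 1 else 0 end) over () as last_num_orders
      from orders o
     ) o
where o.OrderDate between @startDate and @endDate;
I am not clear on what "last" means in this context, so the comparison range (@cStartDate to @cEndDate, named after the variables in the question) stands in for "last month" or "last year". However, you can do what you want with window functions, which is the preferred option.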
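As a rough check of the single-round-trip idea, the sketch below runs a window-function query of this shape against the sample orders in SQLite through Python (SQLite 3.25 or later). It sums Quantity rather than counting rows, because the "Total Order: 40" figure in the question matches the quantity total. The snake_case column names and the parameter names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (warehouse_id INTEGER, order_id INTEGER,
                product_id INTEGER, quantity INTEGER, order_date TEXT)""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", [
    (1, 10, 1, 10, "2016-09-25"), (1, 9, 5, 5, "2016-09-24"),
    (1, 8, 4, 8, "2016-09-23"), (1, 7, 6, 2, "2016-09-23"),
    (1, 6, 8, 1, "2016-09-23"), (1, 5, 1, 2, "2016-09-22"),
    (1, 4, 1, 2, "2016-09-21"), (1, 3, 5, 10, "2016-09-21"),
    (1, 2, 5, 15, "2016-08-12"), (1, 1, 5, 5, "2016-08-10"),
])

QUERY = """
SELECT t.order_id, t.quantity, t.last_qty
FROM (SELECT o.*,
             -- comparison-period quantity, repeated on every row
             SUM(CASE WHEN o.order_date BETWEEN :prev_start AND :prev_end
                      THEN o.quantity ELSE 0 END) OVER () AS last_qty
      FROM orders o
      WHERE o.warehouse_id = :whs) t
WHERE t.order_date BETWEEN :cur_start AND :cur_end
ORDER BY t.order_id DESC
"""

params = {"whs": 1,
          "cur_start": "2016-09-01", "cur_end": "2016-09-30",
          "prev_start": "2016-08-01", "prev_end": "2016-08-31"}
rows = conn.execute(QUERY, params).fetchall()

current_qty = sum(qty for _order_id, qty, _last in rows)
last_qty = rows[0][2]
change_pct = (current_qty - last_qty) / last_qty * 100
print(len(rows), current_qty, last_qty, change_pct)
```

Only the current-period rows travel back to the application, each carrying the comparison total, so the list and the summary come from one query. If the current range can be empty, `rows[0]` would need a guard.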

How to calculate running total (month to date) in SQL Server 2008

I'm trying to calculate a month-to-date total using SQL Server 2008.
I'm trying to generate a month-to-date count at the level of activities and representatives. Here are the results I want to generate:
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
From the following tables:
ACTIVITIES_FACT table
+-------------------+------+-----------+
| Representative_ID | Date | Activity |
+-------------------+------+-----------+
| 41 | 8/03 | Call |
| 41 | 8/04 | Call |
| 41 | 8/05 | Call |
+-------------------+------+-----------+
LU_TIME table
+-------+-----------------+--------+
| Month | Date | Week |
+-------+-----------------+--------+
| 8 | 8/01 | 8/08 |
| 8 | 8/02 | 8/08 |
| 8 | 8/03 | 8/08 |
| 8 | 8/04 | 8/08 |
| 8 | 8/05 | 8/08 |
+-------+-----------------+--------+
I'm not sure how to do this: I keep running into problems with multiple-counting or aggregations not being allowed in subqueries.
A running total is the summation of a sequence of numbers which is
updated each time a new number is added to the sequence, simply by
adding the value of the new number to the running total.
I think the asker wants a running total per month for each Representative_ID, so a simple GROUP BY week isn't enough. They probably want Month_To_Date_Activities_Count to be updated at the end of every week.
This query gives a running total (month to end-of-week date) ordered by Representative_ID and Week:
SELECT a.Representative_ID, l.month, l.Week, Count(*) AS Total_Week_Activity_Count
    ,(SELECT count(*)
      FROM ACTIVITIES_FACT a2
      INNER JOIN LU_TIME l2 ON a2.Date = l2.Date
          AND a.Representative_ID = a2.Representative_ID
      WHERE l2.week <= l.week
          AND l2.month = l.month) Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
INNER JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Week, l.month
ORDER BY a.Representative_ID, l.Week
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
SQL Fiddle Sample
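For readers without access to the fiddle, the correlated-subquery approach can be tried in SQLite through Python. The rows below are hypothetical: the question shows only a fragment of each table, so the dates were invented to reproduce the weekly counts in the expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ACTIVITIES_FACT (Representative_ID INTEGER, Date TEXT, Activity TEXT);
CREATE TABLE LU_TIME (Month INTEGER, Date TEXT, Week TEXT);
""")

# Hypothetical calendar rows: (Month, Date, Week)
conn.executemany("INSERT INTO LU_TIME VALUES (?, ?, ?)", [
    (7, "7/05", "7/08"),
    (8, "8/03", "8/08"), (8, "8/04", "8/08"), (8, "8/05", "8/08"),
    (8, "8/12", "8/09"), (8, "8/19", "8/10"),
])

# Hypothetical activity dates per representative
activity_dates = {
    40: ["7/05", "8/12", "8/19"],
    41: ["7/05", "7/05", "8/03", "8/04", "8/05", "8/05",
         "8/12", "8/12", "8/12", "8/19"],
}
conn.executemany(
    "INSERT INTO ACTIVITIES_FACT VALUES (?, ?, 'Call')",
    [(rep, d) for rep, dates in activity_dates.items() for d in dates],
)

QUERY = """
SELECT a.Representative_ID, l.Month, l.Week, COUNT(*) AS Total_Week_Activity_Count,
       -- all activities of the same representative in the same month up to this week
       (SELECT COUNT(*)
        FROM ACTIVITIES_FACT a2
        INNER JOIN LU_TIME l2 ON a2.Date = l2.Date
        WHERE a2.Representative_ID = a.Representative_ID
          AND l2.Week <= l.Week
          AND l2.Month = l.Month) AS Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
INNER JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Week, l.Month
ORDER BY a.Representative_ID, l.Week
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The week labels are compared as strings here, which only orders correctly because they are zero-padded and stay within single-digit months. A real LU_TIME table would be safer with a proper date or integer week column.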
As I understand your question:
SELECT af.Representative_ID
    , lt.Week
    , COUNT(af.Activity) AS Qnt
FROM ACTIVITIES_FACT af
INNER JOIN LU_TIME lt ON lt.Date = af.date
GROUP BY af.Representative_ID, lt.Week
SqlFiddle
Representative_ID Week Month_To_Date_Activities_Count
41 2013-08-01 00:00:00.000 1
41 2013-08-08 00:00:00.000 3
USE tempdb;
GO
IF OBJECT_ID('#ACTIVITIES_FACT','U') IS NOT NULL DROP TABLE #ACTIVITIES_FACT;
CREATE TABLE #ACTIVITIES_FACT
(
    Representative_ID INT NOT NULL
    ,Date DATETIME NULL
    ,Activity VARCHAR(500) NULL
)
IF OBJECT_ID('#LU_TIME','U') IS NOT NULL DROP TABLE #LU_TIME;
CREATE TABLE #LU_TIME
(
    Month INT
    ,Date DATETIME
    ,Week DATETIME
)
INSERT INTO #ACTIVITIES_FACT(Representative_ID,Date,Activity)
VALUES
    (41,'7/31/2013','Chat')
    ,(41,'8/03/2013','Call')
    ,(41,'8/04/2013','Call')
    ,(41,'8/05/2013','Call')
INSERT INTO #LU_TIME(Month,Date,Week)
VALUES
    (8,'7/31/2013','8/01/2013')
    ,(8,'8/01/2013','8/08/2013')
    ,(8,'8/02/2013','8/08/2013')
    ,(8,'8/03/2013','8/08/2013')
    ,(8,'8/04/2013','8/08/2013')
    ,(8,'8/05/2013','8/08/2013')
--Begin Query
SELECT AF.Representative_ID
    ,LU.Week
    ,COUNT(*) AS Month_To_Date_Activities_Count
FROM #ACTIVITIES_FACT AS AF
INNER JOIN #LU_TIME AS LU
    ON AF.Date = LU.Date
GROUP BY AF.Representative_ID
    ,LU.Week