Find start date of similar contiguous records

I have a table called activities that stores the activities that employees are doing. It stores simple information such as if they are working or if they are on various types of leave, e.g. annual leave, sick leave, compassionate leave etc. The table stores the employee number, the type of activity and the date that the activity is on. Only 1 type of activity can occur on a single day and only days that are normally worked will have an activity attributed to them. For example if an employee is a Monday to Friday worker and is on annual leave for a week, the weekend dates are not included in the table as they are not days the employee normally works.
Below is a sample table:
╔══════════╦════════════╦══════════════╗
║ Employee ║ Date ║ Activity ║
╠══════════╬════════════╬══════════════╣
║ 12345 ║ 25/11/2016 ║ Work ║
║ 12345 ║ 24/11/2016 ║ Work ║
║ 12345 ║ 23/11/2016 ║ Work ║
║ 12345 ║ 22/11/2016 ║ Work ║
║ 12345 ║ 21/11/2016 ║ Work ║
║ 12345 ║ 18/11/2016 ║ Work ║
║ 12345 ║ 17/11/2016 ║ Work ║
║ 12345 ║ 16/11/2016 ║ Work ║
║ 12345 ║ 15/11/2016 ║ Sick Leave ║
║ 12345 ║ 14/11/2016 ║ Sick Leave ║
║ 12345 ║ 11/11/2016 ║ Sick Leave ║
║ 12345 ║ 10/11/2016 ║ Work ║
║ 12345 ║ 9/11/2016 ║ Work ║
║ 12345 ║ 8/11/2016 ║ Work ║
║ 12345 ║ 7/11/2016 ║ Work ║
║ 12345 ║ 4/11/2016 ║ Work ║
║ 12345 ║ 3/11/2016 ║ Sick Leave ║
║ 12345 ║ 2/11/2016 ║ Sick Leave ║
║ 12345 ║ 1/11/2016 ║ Work ║
║ 12345 ║ 31/10/2016 ║ Work ║
║ 67890 ║ 25/11/2016 ║ Annual Leave ║
║ 67890 ║ 24/11/2016 ║ Annual Leave ║
║ 67890 ║ 23/11/2016 ║ Annual Leave ║
║ 67890 ║ 22/11/2016 ║ Annual Leave ║
║ 67890 ║ 21/11/2016 ║ Annual Leave ║
║ 67890 ║ 18/11/2016 ║ Work ║
║ 67890 ║ 17/11/2016 ║ Work ║
║ 67890 ║ 16/11/2016 ║ Work ║
║ 67890 ║ 15/11/2016 ║ Sick Leave ║
║ 67890 ║ 14/11/2016 ║ Sick Leave ║
║ 67890 ║ 11/11/2016 ║ Sick Leave ║
║ 67890 ║ 10/11/2016 ║ Work ║
║ 67890 ║ 9/11/2016 ║ Work ║
║ 67890 ║ 8/11/2016 ║ Work ║
║ 67890 ║ 7/11/2016 ║ Work ║
║ 67890 ║ 4/11/2016 ║ Work ║
║ 67890 ║ 3/11/2016 ║ Annual Leave ║
║ 67890 ║ 2/11/2016 ║ Annual Leave ║
║ 67890 ║ 1/11/2016 ║ Work ║
║ 67890 ║ 31/10/2016 ║ Work ║
╚══════════╩════════════╩══════════════╝
For a given employee, date and activity, I need to work backwards from that date and find the start date of the most recent block of that given activity. A 'block' is any group of the same activity, so it could be 1 day or many days.
As an example, using the table above, let's say I need to find the start date of the most recent 'Sick Leave' for employee 12345 working backwards from a date of 20/11/2016. In this case I would be looking to get a value of '11/11/2016' as this was the start date for the most recent block of sick leave.
As another example, using the table above, let's say I need to find the start date of the most recent 'Annual Leave' for employee 67890 working backwards from a date of 20/11/2016. In this case I would be looking to get a value of '21/11/2016' as this was the start date for the most recent block of annual leave.

This is an example of a "gaps-and-islands" problem. You can get the periods of activity for an employee using the difference of row numbers approach:
select employee, activity, min(date) as date_from, max(date) as date_to
from (select t.*,
             row_number() over (partition by employee order by date) as seqnum_e,
             row_number() over (partition by employee, activity order by date) as seqnum_ea
      from activities t
     ) t
group by employee, activity, (seqnum_e - seqnum_ea);
You can then use this to answer your questions. For instance:
with ea as (
      select employee, activity, min(date) as date_from, max(date) as date_to
      from (select t.*,
                   row_number() over (partition by employee order by date) as seqnum_e,
                   row_number() over (partition by employee, activity order by date) as seqnum_ea
            from activities t
           ) t
      group by employee, activity, (seqnum_e - seqnum_ea)
     )
select top 1 ea.*
from ea
where employee = 12345 and activity = 'Sick Leave'
order by date_from desc;
There are other solutions for particular questions, but this is likely to be the most general.
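To see the difference-of-row-numbers idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server, so it uses LIMIT 1 instead of TOP 1. It assumes an SQLite build with window functions (3.25 or later). The ISO date strings, the block_start helper and the date_from <= ? "as of" filter are assumptions of this sketch and are not part of the answer's query.

```python
import sqlite3

# Rows for employee 12345 from the question, with dates rewritten as ISO
# strings (an assumption of this sketch) so text ordering is chronological.
ROWS = [
    (12345, "2016-11-01", "Work"),
    (12345, "2016-11-02", "Sick Leave"),
    (12345, "2016-11-03", "Sick Leave"),
    (12345, "2016-11-04", "Work"),
    (12345, "2016-11-07", "Work"),
    (12345, "2016-11-08", "Work"),
    (12345, "2016-11-09", "Work"),
    (12345, "2016-11-10", "Work"),
    (12345, "2016-11-11", "Sick Leave"),
    (12345, "2016-11-14", "Sick Leave"),
    (12345, "2016-11-15", "Sick Leave"),
    (12345, "2016-11-16", "Work"),
    (12345, "2016-11-17", "Work"),
    (12345, "2016-11-18", "Work"),
]

# Within one block of the same activity, both row numbers advance together,
# so their difference is constant and identifies the block.
QUERY = """
WITH ea AS (
    SELECT employee, activity, MIN(date) AS date_from, MAX(date) AS date_to
    FROM (SELECT a.*,
                 ROW_NUMBER() OVER (PARTITION BY employee ORDER BY date) AS seqnum_e,
                 ROW_NUMBER() OVER (PARTITION BY employee, activity ORDER BY date) AS seqnum_ea
          FROM activities a) AS numbered
    GROUP BY employee, activity, (seqnum_e - seqnum_ea)
)
SELECT date_from
FROM ea
WHERE employee = ? AND activity = ? AND date_from <= ?  -- hypothetical as-of filter
ORDER BY date_from DESC
LIMIT 1
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activities (employee INTEGER, date TEXT, activity TEXT)")
conn.executemany("INSERT INTO activities VALUES (?, ?, ?)", ROWS)


def block_start(employee, activity, as_of):
    """Return the start date of the latest block beginning on or before as_of."""
    row = conn.execute(QUERY, (employee, activity, as_of)).fetchone()
    return row[0] if row else None


print(block_start(12345, "Sick Leave", "2016-11-20"))  # 2016-11-11
```

The as-of filter is one way to express "working backwards from a date"; without it the query simply returns the latest block in the table.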

Related

SQL Select priority data from overlapping date ranges

I have two tables, with overlapping data (actual tables have more columns, but the key I need to remove overlaps from is date):
Let's call them:
HighPri
╔════════╦═══════╗
║ Date ║ Value ║
╠════════╬═══════╣
║ Dec-19 ║ 1 ║
║ Jan-20 ║ 2 ║
║ Feb-20 ║ 3 ║
╚════════╩═══════╝
and LoPri
╔════════╦═══════╗
║ Date ║ Value ║
╠════════╬═══════╣
║ Jan-20 ║ 5 ║
║ Feb-20 ║ 6 ║
║ Mar-20 ║ 7 ║
╚════════╩═══════╝
And I'm looking for a SQL Server query that would return the following (preferring HighPri where there is overlap, otherwise LoPri):
╔════════╦═══════╗
║ Date ║ Value ║
╠════════╬═══════╣
║ Dec-19 ║ 1 ║
║ Jan-20 ║ 2 ║
║ Feb-20 ║ 3 ║
║ Mar-20 ║ 7 ║
╚════════╩═══════╝
Looking for pure sql solution.

I understand this as a full join with coalesce() for prioritization:
select coalesce(h.date, l.date) as date,
       coalesce(h.value, l.value) as value
from HighPri h full outer join
     LoPri l
     on h.date = l.date;
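The same priority rule can also be written without a full join: take every HighPri row, then add only those LoPri rows whose date has no HighPri match. The sketch below shows that alternative formulation (not the answer's own query) in Python's sqlite3, chosen because older SQLite builds lack FULL OUTER JOIN. The 'YYYY-MM' date strings are an assumption made so the rows sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HighPri (date TEXT PRIMARY KEY, value INTEGER);
    CREATE TABLE LoPri   (date TEXT PRIMARY KEY, value INTEGER);
    -- Sample data from the question, with months written as YYYY-MM (assumption).
    INSERT INTO HighPri VALUES ('2019-12', 1), ('2020-01', 2), ('2020-02', 3);
    INSERT INTO LoPri   VALUES ('2020-01', 5), ('2020-02', 6), ('2020-03', 7);
""")

# All high-priority rows, plus low-priority rows for dates HighPri does not cover.
merged = conn.execute("""
    SELECT date, value FROM HighPri
    UNION ALL
    SELECT l.date, l.value
    FROM LoPri l
    WHERE NOT EXISTS (SELECT 1 FROM HighPri h WHERE h.date = l.date)
    ORDER BY date
""").fetchall()

print(merged)
```

Both forms assume each date appears at most once per table; with duplicate dates the full join would multiply rows, while this variant would keep every HighPri duplicate.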

SQL Server Switch Based on grouping of previous values

I have a table of order information in the following format:
╔══════════════╦══════╦════════════════╦═══════════╦═════════╦══════════╗
║ Order Number ║ Line ║ Item ║ Warehouse ║ Carrier ║ Quantity ║
╠══════════════╬══════╬════════════════╬═══════════╬═════════╬══════════╣
║ 255 ║ 1 ║ STUFFED-ANIMAL ║ WH1 ║ UPS ║ 3 ║
║ 256 ║ 1 ║ BLOCKS ║ WH2 ║ FEDEX ║ 1 ║
║ 257 ║ 1 ║ DOLL ║ WH1 ║ UPS ║ 1 ║
║ 257 ║ 2 ║ DRESS ║ WH1 ║ UPS ║ 3 ║
║ 257 ║ 3 ║ SHOES ║ WH2 ║ UPS ║ 1 ║
║ 258 ║ 1 ║ CHAIR ║ WH3 ║ FEDEX ║ 1 ║
║ 258 ║ 2 ║ CHAIR ║ WH3 ║ UPS ║ 2 ║
╚══════════════╩══════╩════════════════╩═══════════╩═════════╩══════════╝
I am trying to query it in such a way that I partition it into groups based on a unique combination of columns.
In my example, I would like the following result:
╔════════════════╦══════╦════════════════╦═══════════╦═════════╦══════════╗
║ Package-Number ║ Line ║ Item ║ Warehouse ║ Carrier ║ Quantity ║
╠════════════════╬══════╬════════════════╬═══════════╬═════════╬══════════╣
║ 255 ║ 1 ║ STUFFED-ANIMAL ║ WH1 ║ UPS ║ 3 ║
║ 256 ║ 1 ║ BLOCKS ║ WH2 ║ FEDEX ║ 1 ║
║ 257-1 ║ 1 ║ DOLL ║ WH1 ║ UPS ║ 1 ║
║ 257-1 ║ 2 ║ DRESS ║ WH1 ║ UPS ║ 3 ║
║ 257-2 ║ 3 ║ SHOES ║ WH2 ║ UPS ║ 1 ║
║ 258-1 ║ 1 ║ CHAIR ║ WH3 ║ FEDEX ║ 1 ║
║ 258-2 ║ 2 ║ CHAIR ║ WH3 ║ UPS ║ 2 ║
╚════════════════╩══════╩════════════════╩═══════════╩═════════╩══════════╝
To break it down I would like to do the following:
If the order number, warehouse, and carrier are the same, that is one 'partition'. If an order has only one partition, we just leave the order number as the package number; otherwise we break it down into packages. Rows that share the same warehouse and carrier belong to the same package, and the package number is the order number followed by a suffix denoting which package it is.
I was looking into using row_number() over (partition by... that I found after searching for similar issues but I don't think it is exactly what I'm looking for.
Could someone point me in the right direction?

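This is tricky. It would help if count(distinct) were available as a window function, but SQL Server does not support that. There is a convenient trick instead: for any row, the ascending dense_rank() plus the descending dense_rank() minus 1 equals the number of distinct values in the partition.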
So, I think this does what you want:
select (case when seqnum_asc + seqnum_desc - 1 > 1  -- more than 1 distinct value
             then concat(ordernumber, '-', seqnum_asc)
             else concat(ordernumber, '')  -- just to convert the value to a string
        end) as packagenumber,
       t.*
from (select t.*,
             dense_rank() over (partition by ordernumber order by warehouse, carrier) as seqnum_asc,
             dense_rank() over (partition by ordernumber order by warehouse desc, carrier desc) as seqnum_desc
      from mytable t
     ) t;
Note: This does not take into account the ordering by the line number -- because your question doesn't mention that at all. If you only want adjacent rows with the same value to be included in each group, then ask a new question with appropriate sample data and desired results.
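As a quick check of the dense_rank() trick, the following sketch runs the same logic against the question's sample rows using Python's sqlite3 (assuming SQLite 3.25 or later for window functions). The table and column names (mytable, ordernumber, line) follow the answer's query; SQLite's || operator stands in for SQL Server's concat(), and the Item and Quantity columns are left out for brevity.

```python
import sqlite3

# (ordernumber, line, warehouse, carrier) from the question's sample table.
ROWS = [
    (255, 1, "WH1", "UPS"),
    (256, 1, "WH2", "FEDEX"),
    (257, 1, "WH1", "UPS"),
    (257, 2, "WH1", "UPS"),
    (257, 3, "WH2", "UPS"),
    (258, 1, "WH3", "FEDEX"),
    (258, 2, "WH3", "UPS"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ordernumber INTEGER, line INTEGER, warehouse TEXT, carrier TEXT)"
)
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", ROWS)

# seqnum_asc + seqnum_desc - 1 is the number of distinct (warehouse, carrier)
# pairs in the order; a suffix is added only when that number exceeds 1.
packages = conn.execute("""
    SELECT CASE WHEN seqnum_asc + seqnum_desc - 1 > 1
                THEN ordernumber || '-' || seqnum_asc
                ELSE CAST(ordernumber AS TEXT)
           END AS packagenumber,
           line
    FROM (SELECT t.*,
                 DENSE_RANK() OVER (PARTITION BY ordernumber
                                    ORDER BY warehouse, carrier) AS seqnum_asc,
                 DENSE_RANK() OVER (PARTITION BY ordernumber
                                    ORDER BY warehouse DESC, carrier DESC) AS seqnum_desc
          FROM mytable t) AS ranked
    ORDER BY ordernumber, line
""").fetchall()

for package, line in packages:
    print(package, line)
```

For order 257 the two WH1/UPS lines share rank 1 and the WH2/UPS line gets rank 2, which reproduces the 257-1, 257-1, 257-2 numbering requested in the question.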

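Here is an option using dense_rank() instead of row_number().
Example:
Select [Package-Number] = concat([Order Number]
           -- '' when the order has several lines, NULL (ignored by concat) when it has one
           ,left(nullif(count(*) over (partition by [Order Number]),1),0)
           -- rank the distinct warehouse/carrier combinations within the order
           +dense_rank() over (partition by [Order Number] order by Warehouse,Carrier)*-1 )
      ,Line
      ,Item
      ,Warehouse
      ,Carrier
      ,Quantity
From YourTable
Note that the suffix is driven by the number of lines in the order, so an order with several lines that all share one warehouse and carrier would still get a '-1' suffix.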

Find percentage ratio within sum values in same table

I have a list of retail transactions in a SQL table. The table contains details for the customer number, product number, transaction type, transaction date and amount.
Ultimately, I need to produce a record that contains the customer number, product number, transaction type, sum of transaction amounts and a %. The percentage represents what proportion of all transactions for that customer/product combination were of that given transaction type.
For example my table has data like this:
╔══════════╦═════════╦═════════╦════════════╦═══════════╗
║ Customer ║ Product ║ TxnType ║ TxnDate ║ TxnAmount ║
╠══════════╬═════════╬═════════╬════════════╬═══════════╣
║ Smith ║ 1234 ║ Cash ║ 01/01/2018 ║ 10 ║
║ Smith ║ 1234 ║ Credit ║ 02/01/2018 ║ 20 ║
║ Smith ║ 1234 ║ Cash ║ 03/01/2018 ║ 10 ║
║ Smith ║ 1234 ║ Cash ║ 04/01/2018 ║ 20 ║
║ Smith ║ 3456 ║ Cash ║ 01/01/2018 ║ 10 ║
║ Smith ║ 3456 ║ Credit ║ 02/01/2018 ║ 20 ║
║ Smith ║ 3456 ║ Cash ║ 03/01/2018 ║ 10 ║
║ Jones ║ 3456 ║ Credit ║ 01/01/2018 ║ 10 ║
║ Jones ║ 3456 ║ Cash ║ 02/01/2018 ║ 10 ║
║ Jones ║ 3456 ║ Credit ║ 01/01/2018 ║ 20 ║
║ Jones ║ 1234 ║ Credit ║ 01/01/2018 ║ 10 ║
║ Jones ║ 1234 ║ Credit ║ 02/01/2018 ║ 20 ║
║ Jones ║ 1234 ║ Credit ║ 03/01/2018 ║ 20 ║
║ Jones ║ 1234 ║ Credit ║ 04/01/2018 ║ 40 ║
╚══════════╩═════════╩═════════╩════════════╩═══════════╝
And I need a result of this:
╔══════════╦═════════╦═════════╦══════════════╦════════════╗
║ Customer ║ Product ║ TxnType ║ SumTxnAmount ║ %ofTxnType ║
╠══════════╬═════════╬═════════╬══════════════╬════════════╣
║ Smith ║ 1234 ║ Cash ║ 40 ║ 66% ║
║ Smith ║ 1234 ║ Credit ║ 20 ║ 33% ║
║ Smith ║ 3456 ║ Cash ║ 20 ║ 50% ║
║ Smith ║ 3456 ║ Credit ║ 20 ║ 50% ║
║ Jones ║ 3456 ║ Cash ║ 10 ║ 25% ║
║ Jones ║ 3456 ║ Credit ║ 30 ║ 75% ║
║ Jones ║ 1234 ║ Credit ║ 90 ║ 100% ║
╚══════════╩═════════╩═════════╩══════════════╩════════════╝

You can try the query below, where cte1 holds the transaction rows:
select customer, product, TxnType,
       sum(TxnAmount) as SumTxnAmount,
       cast(sum(TxnAmount) * 100.00
            / (select sum(TxnAmount)
               from cte1 b
               where a.customer = b.customer and a.product = b.product)
            as decimal(16,2)) as [%ofTxnType]
from cte1 a
group by customer, product, TxnType
Output:
customer  product  TxnType  SumTxnAmount  %ofTxnType
Smith     1234     Cash     40            66.67
Smith     1234     Credit   20            33.33
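To check the correlated-subquery approach against all of the sample rows, here is a small sketch using Python's sqlite3. The table name txns is hypothetical, TxnDate is omitted because the calculation does not use it, and ROUND(..., 2) stands in for SQL Server's cast to decimal(16,2).

```python
import sqlite3

# (customer, product, TxnType, TxnAmount) from the question's sample table.
ROWS = [
    ("Smith", 1234, "Cash", 10), ("Smith", 1234, "Credit", 20),
    ("Smith", 1234, "Cash", 10), ("Smith", 1234, "Cash", 20),
    ("Smith", 3456, "Cash", 10), ("Smith", 3456, "Credit", 20),
    ("Smith", 3456, "Cash", 10),
    ("Jones", 3456, "Credit", 10), ("Jones", 3456, "Cash", 10),
    ("Jones", 3456, "Credit", 20),
    ("Jones", 1234, "Credit", 10), ("Jones", 1234, "Credit", 20),
    ("Jones", 1234, "Credit", 20), ("Jones", 1234, "Credit", 40),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txns (customer TEXT, product INTEGER, TxnType TEXT, TxnAmount INTEGER)"
)
conn.executemany("INSERT INTO txns VALUES (?, ?, ?, ?)", ROWS)

# Each group's total is divided by the total for its customer/product pair,
# which the correlated subquery recomputes per group.
shares = conn.execute("""
    SELECT a.customer, a.product, a.TxnType,
           SUM(a.TxnAmount) AS SumTxnAmount,
           ROUND(SUM(a.TxnAmount) * 100.0 /
                 (SELECT SUM(b.TxnAmount)
                  FROM txns b
                  WHERE b.customer = a.customer AND b.product = a.product), 2) AS pct
    FROM txns a
    GROUP BY a.customer, a.product, a.TxnType
    ORDER BY a.customer, a.product, a.TxnType
""").fetchall()

for row in shares:
    print(row)
```

Multiplying by 100.0 rather than 100 matters here: it forces floating-point division, so the integer amounts are not truncated to a whole-number percentage.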

Return rows only if 2 values both exist

I need to return a list of customer names from a purchase summary table, but only if the customer has bought two specific items within a single transaction.
For example table 'transaction'
╔══════════════╦════════╦══════════════╦════════╗
║ CustomerName ║ Item ║ Transaction# ║ Amount ║
╠══════════════╬════════╬══════════════╬════════╣
║ Smith ║ Hammer ║ 1 ║ 50.00 ║
║ Smith ║ Nail ║ 1 ║ 4.00 ║
║ Smith ║ Screw ║ 1 ║ 5.00 ║
║ Brown ║ Nail ║ 2 ║ 4.00 ║
║ Brown ║ Screw ║ 2 ║ 4.00 ║
║ Jones ║ Hammer ║ 3 ║ 50.00 ║
║ Jones ║ Screw ║ 3 ║ 4.00 ║
║ Smith ║ Nail ║ 4 ║ 50.00 ║
║ Smith ║ Hammer ║ 4 ║ 4.00 ║
║ Smith ║ Screw ║ 5 ║ 5.00 ║
╚══════════════╩════════╩══════════════╩════════╝
I only want to return customers who have bought a Hammer and a screw in the same transaction. It doesn't matter what other items were bought in the same transaction, I only need the details for the hammer and the screw, and only if both the hammer and screw were present in the same transaction.
So the above only needs to return:
╔══════════════╦════════╦══════════════╦════════╗
║ CustomerName ║ Item ║ Transaction# ║ Amount ║
╠══════════════╬════════╬══════════════╬════════╣
║ Smith ║ Hammer ║ 1 ║ 50.00 ║
║ Smith ║ Screw ║ 1 ║ 5.00 ║
╚══════════════╩════════╩══════════════╩════════╝
Because only transaction 1 contained both a hammer and a screw in the same transaction.

Use a sub-select to find transactions including both Hammer and Screw:
select CustomerName, Item, Transaction#, Amount
from purchase
where Transaction# in (select Transaction#
                       from purchase
                       where Item in ('Hammer', 'Screw')
                       group by Transaction#
                       having count(distinct Item) = 2)
  and Item in ('Hammer', 'Screw')
Remove the last condition if the other rows of a matching transaction (such as the Nail row) should be returned as well.
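The sub-select approach can be tried on the sample rows with Python's sqlite3, as sketched below. The column is renamed from Transaction# to txn because SQLite does not accept # in an unquoted identifier; that rename is an assumption of this sketch. With the sample rows exactly as listed, transaction 3 (Jones) also contains both a Hammer and a Screw, so the query returns it alongside transaction 1.

```python
import sqlite3

# (CustomerName, Item, txn, Amount) from the question's sample table.
ROWS = [
    ("Smith", "Hammer", 1, 50.00),
    ("Smith", "Nail", 1, 4.00),
    ("Smith", "Screw", 1, 5.00),
    ("Brown", "Nail", 2, 4.00),
    ("Brown", "Screw", 2, 4.00),
    ("Jones", "Hammer", 3, 50.00),
    ("Jones", "Screw", 3, 4.00),
    ("Smith", "Nail", 4, 50.00),
    ("Smith", "Hammer", 4, 4.00),
    ("Smith", "Screw", 5, 5.00),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase (CustomerName TEXT, Item TEXT, txn INTEGER, Amount REAL)"
)
conn.executemany("INSERT INTO purchase VALUES (?, ?, ?, ?)", ROWS)

# The inner query keeps transactions that contain both items; the outer
# filter then returns only the Hammer and Screw rows of those transactions.
matches = conn.execute("""
    SELECT CustomerName, Item, txn, Amount
    FROM purchase
    WHERE txn IN (SELECT txn
                  FROM purchase
                  WHERE Item IN ('Hammer', 'Screw')
                  GROUP BY txn
                  HAVING COUNT(DISTINCT Item) = 2)
      AND Item IN ('Hammer', 'Screw')
    ORDER BY txn, Item
""").fetchall()

for row in matches:
    print(row)
```

Using count(distinct Item) rather than count(*) keeps a transaction with two Hammer rows and no Screw from qualifying by accident.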

SSRS - Group Consecutive Rows of Same Time Spans

Related to this SQL question - Group consecutive rows of same value using time spans
I want to convert this table:
╔═══════════╦════════════╦═══════════╦═══════════╦═════════╗
║ Classroom ║ CourseName ║ Lesson ║ StartTime ║ EndTime ║
╠═══════════╬════════════╬═══════════╬═══════════╬═════════╣
║ 1001 ║ Course 1 ║ Lesson 1 ║ 0800 ║ 0900 ║
║ 1001 ║ Course 1 ║ Lesson 2 ║ 0900 ║ 1000 ║
║ 1001 ║ Course 1 ║ Lesson 3 ║ 1000 ║ 1100 ║
║ 1001 ║ Course 2 ║ Lesson 10 ║ 1100 ║ 1200 ║
║ 1001 ║ Course 2 ║ Lesson 11 ║ 1200 ║ 1300 ║
║ 1001 ║ Course 1 ║ Lesson 4 ║ 1300 ║ 1400 ║
║ 1001 ║ Course 1 ║ Lesson 5 ║ 1400 ║ 1500 ║
╚═══════════╩════════════╩═══════════╩═══════════╩═════════╝
To this table:
╔═══════════╦════════════╦═══════════╦═════════╗
║ Classroom ║ CourseName ║ StartTime ║ EndTime ║
╠═══════════╬════════════╬═══════════╬═════════╣
║ 1001 ║ Course 1 ║ 0800 ║ 1100 ║
║ 1001 ║ Course 2 ║ 1100 ║ 1300 ║
║ 1001 ║ Course 1 ║ 1300 ║ 1500 ║
╚═══════════╩════════════╩═══════════╩═════════╝
The SQL solution from the related question works but the query takes forever because I have a lot of data in my tables and the SQL Query is using 2 sub queries.
Actually the original table is a query with 3 joins in itself so the complexity is even bigger.
I am looking for an SSRS solution.
Is it possible to do this in SSRS 2008 R2 using some "VB magic" or another kind of magic?
