Finding Failures Within Last 3 days of Orders Per Customer - sql

I'm trying to write a query which is confusing me.
In essence, I want to check each customer, find the last 3 days on which jobs were done and, if every one of those days has a failure, perform an action on that customer.
Customer A could have had jobs all week (I'll use this week as an example): 20/10 (Failure), 19/10 (Failure), 18/10 (Failure) would qualify.
Customer B only has jobs fortnightly: 20/10 (Failure), 05/10 (Failure), 20/09 (Failure).
What confuses me is how to filter when I'm not looking at the orders themselves, but at the 3 separate days on which orders were done.
I was thinking of a top 3 dates, but the customer could have multiple jobs in a day and I need to find all of the orders for the last 3 days on which jobs were done.
SELECT DISTINCT TOP 3 DATEPART(DAY, od.datetimeCreated), of.uniqueID
FROM order.data od
LEFT JOIN order.dataFailure of ON of.orderID = od.orderID
This gives me something similar to what I want, however I still want to see all of the data for those 3 days
Could anyone give me some pointers on how I could go about this?
Sample Data:
Not sure what data would help with this issue, in essence, when the orderID from order.data is inside order.dataFailure, then we consider it a failure, else if it joins as a null, it hasn't failed.
As for dates, I compare the dates on a field called datetimeCreated and datetimeFailed, and then group it by a customer account code
Desired Results:
I need to find the last 3 days on which orders were done for that customer. This could be 3 orders a day for the last 3 days, or 1 order a week for the last 3 weeks. I then want to see whether there is a failure on each of those days (that is, whether there is a row in the order.dataFailure table for each of the last 3 days).
In this image I have filtered on the customer. The query needs to see that there has been a failure on 2020-03-12, then check the next job day back, 2019-12-13 (also a failure), and then 2015-07-13 (no failure).
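One way to express "the last 3 days on which jobs were done, each with a failure" is to collapse orders to one row per customer per day, rank those days from newest to oldest, and keep customers whose top 3 days all contain a failure. The sketch below runs the SQL through Python's sqlite3 only so that it can be executed as-is. The table names order_data and order_dataFailure stand in for order.data and order.dataFailure, and the accountCode column and sample rows are assumptions, not taken from the real schema. In SQL Server the same shape should work with CAST(datetimeCreated AS date) in place of date().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_data (orderID INTEGER PRIMARY KEY, accountCode TEXT, datetimeCreated TEXT);
CREATE TABLE order_dataFailure (uniqueID INTEGER PRIMARY KEY, orderID INTEGER, datetimeFailed TEXT);
-- Customer A: jobs on 4 days, the latest 3 each have a failure.
-- Customer B: fortnightly jobs, the middle day has no failure.
INSERT INTO order_data VALUES
  (1, 'A', '2020-10-16 09:00:00'),
  (2, 'A', '2020-10-18 09:00:00'),
  (3, 'A', '2020-10-19 09:00:00'),
  (4, 'A', '2020-10-19 15:00:00'),
  (5, 'A', '2020-10-20 09:00:00'),
  (6, 'B', '2020-09-20 09:00:00'),
  (7, 'B', '2020-10-05 09:00:00'),
  (8, 'B', '2020-10-20 09:00:00');
INSERT INTO order_dataFailure VALUES
  (1, 2, '2020-10-18 10:00:00'),
  (2, 4, '2020-10-19 16:00:00'),
  (3, 5, '2020-10-20 10:00:00'),
  (4, 6, '2020-09-20 10:00:00'),
  (5, 8, '2020-10-20 10:00:00');
""")

QUERY = """
WITH job_days AS (
    -- One row per customer per day; day_failed is 1 if any order that day failed.
    SELECT od.accountCode,
           date(od.datetimeCreated)   AS job_day,
           MAX(f.orderID IS NOT NULL) AS day_failed
    FROM order_data od
    LEFT JOIN order_dataFailure f ON f.orderID = od.orderID
    GROUP BY od.accountCode, date(od.datetimeCreated)
),
ranked AS (
    -- Number each customer's job days from most recent to oldest.
    SELECT job_days.*,
           ROW_NUMBER() OVER (PARTITION BY accountCode ORDER BY job_day DESC) AS rn
    FROM job_days
)
SELECT accountCode
FROM ranked
WHERE rn <= 3
GROUP BY accountCode
HAVING COUNT(*) = 3 AND MIN(day_failed) = 1
ORDER BY accountCode
"""

flagged = [row[0] for row in conn.execute(QUERY)]
print(flagged)
```

To see every order on those 3 days rather than just the customer, join order_data back to the ranked rows on accountCode and job day with rn <= 3. The HAVING COUNT(*) = 3 condition is a design choice that skips customers with fewer than 3 job days; drop it if those should qualify too.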

Related

SQL performance issues with window functions on daily basis

Given ~23 million users, what is the most efficient way to compute the cumulative number of logins within the last X months for any given day (even when no login was performed) ? Start date of a customer is its first ever login, end date is today.
Desired output
c_id  day         nb_logins_past_6_months
----------------------------------------------
1     2019-01-01  10
1     2019-01-02  10
1     2019-01-03  9
...
1     today       5
➔ One line per user per day with the number of logins between current day and 179 days in the past
Approach 1
1. Cross join each customer ID with calendar table
2. Left join on login table on day
3. Compute window function (i.e. `sum(nb_logins) over (partition by c_id order by day rows between 179 preceding and current row)`)
+ Easy to understand and maintain
- Really heavy, quite impossible to run on daily basis
- Incremental does not bring much benefit : still have to go 179 days in the past
Approach 2
1. Cross join each customer ID with calendar table
2. Left join on login table on day between today and 179 days in the past
3. Group by customer ID and day to get nb logins within 179 days
+ Easier to do incremental
- Table at step 2 is exceeding 300 billion rows
What is the common way to deal with this knowing this is not the only use case, we have to compute other columns like this (nb logins in the past 12 months etc.)
In standard SQL, you would use:
select l.*,
       count(*) over (partition by customerid
                      order by login_date
                      range between interval '6 month' preceding and current row
                     ) as num_logins_180day
from logins l;
This assumes that the logins table has a date of the login with no time component.
I see no reason to multiply 23 million users by 180 days to generate a result set in excess of 4 billion rows to answer this question.
For performance, don't do the entire task all at once. Instead, gather subtotals at the end of each month (or day or whatever makes sense for your data). Then SUM up the subtotals to provide the 'report'.
More discussion (with a focus on MySQL): http://mysql.rjweb.org/doc.php/summarytables
(You should tag questions with the specific product; different products have different syntax/capability/performance/etc.)
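A minimal sketch of the subtotal idea, assuming daily granularity: logins are first collapsed into a small summary table with one row per customer per day that actually had logins, and the 180-day figure is then a SUM over that table. It runs on SQLite through Python's sqlite3 purely so it can be executed. The daily_logins table, the logins_past_180_days helper and the sample rows are hypothetical names and data for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logins (c_id INTEGER, login_date TEXT);
INSERT INTO logins VALUES
  (1, '2019-01-01'), (1, '2019-01-01'), (1, '2019-03-15'),
  (1, '2019-06-29'), (1, '2019-06-30'), (2, '2019-06-30');

-- Summary table: one row per customer per day that had at least one login.
-- In production this would be appended to incrementally each day.
CREATE TABLE daily_logins AS
  SELECT c_id, login_date AS day, COUNT(*) AS nb_logins
  FROM logins
  GROUP BY c_id, login_date;
""")

def logins_past_180_days(c_id, day):
    """Sum the daily subtotals for the 180 days ending on `day` (inclusive)."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(nb_logins), 0)
        FROM daily_logins
        WHERE c_id = ?
          AND day BETWEEN date(?, '-179 days') AND ?
        """,
        (c_id, day, day),
    ).fetchone()
    return row[0]

print(logins_past_180_days(1, '2019-06-29'))  # window starts 2019-01-01
print(logins_past_180_days(1, '2019-06-30'))  # window starts 2019-01-02
```

Days without logins never need to be stored: the value for any (customer, day) pair can be computed on demand, and the same summary table serves the 12-month variant by changing the offset.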

Reuse logic to query data based on date filter

I have logic in place to pull records based on date. For example, I have to check whether a record that appeared in a given week has also occurred in the next 14 days; if so, that record needs to be flagged. So basically I have used a self join to get that record.
Now I have to pull records for 3 months and see whether each record appeared again. The logic will be the same (within the next 14 days), so as it stands I would have to change the date filter in the query for every week and collect the data. Is there a way I can do it in a single query and get the full 3 months of data?
Let me know if more clarification is required.
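One possible shape for this, sketched under assumptions since the actual tables are not shown: make the 14-day window relative to each row's own date rather than to a fixed week, so a single filter over the whole 3 months is enough. The events table, its record_key and event_date columns, and the sample rows are all hypothetical. SQLite via Python's sqlite3 is used only to keep the sketch runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (record_key TEXT, event_date TEXT);
INSERT INTO events VALUES
  ('X', '2021-01-04'), ('X', '2021-01-12'),
  ('Y', '2021-01-05'), ('Y', '2021-02-20'),
  ('Z', '2021-03-01');
""")

rows = conn.execute("""
SELECT e.record_key,
       e.event_date,
       -- 1 if the same key shows up again within 14 days of this row's date
       EXISTS (SELECT 1
               FROM events n
               WHERE n.record_key = e.record_key
                 AND n.event_date >  e.event_date
                 AND n.event_date <= date(e.event_date, '+14 days')
              ) AS repeated_within_14_days
FROM events e
WHERE e.event_date BETWEEN '2021-01-01' AND '2021-03-31'
ORDER BY e.record_key, e.event_date
""").fetchall()

for row in rows:
    print(row)
```

Only the first 'X' row is flagged here, because its key recurs 8 days later; the 'Y' rows are 46 days apart and stay unflagged.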

Zoho Reports: SQL Query - Finding date and number of days

Problem Statement: I need to find the Over Due start date and, from that, calculate the number of Over Due days. I know how to do the Over Due days count, but I am not able to figure out how to find the Over Due start date.
Example: Let us say a customer did not pay for 4th November 2017, 4th December 2017, 4th Jan 2018 and 4th Feb 2018. For these, 4 zero-collection records were placed in the Collections table and 4 records were placed in the Over Due Collections table with the D flag. On 8th Feb the customer paid an installment, so the respective payment record was placed in the Collections table and another record in Over Due Collections with the C flag. Since this payment gets adjusted against 4th November 2017, the Over Due start date will be 4th December. If the customer had not paid, the Over Due start date would be 4th November 2017.
I have tables as follows for a Loan Management System:
Schedule (Payment Schedule): holds all the installments, with the dates and the respective amounts to be paid for each month.
Schema: LoanNo, Schedule Date, Installment No, Principle, Interest.
Collections (Payment Collections): holds what has been collected for each month. If a payment is not received, a record is placed with the respective date and a zero amount, and another record is placed in the Over Due Collections table with the D flag and the respective amounts. If a collection happens, another record is inserted there with the flag C, which represents collections.
Schema: LoanNo, PaymentReceived Date, Principle, Interest
Over Due Collections (a record is placed here whenever there is a due)
Schema: LoanID, Flag(D/C), Date, Principle, Interest
Please suggest and guide me on how to write a proper query for this.
It's an interesting yet easy problem. You can tackle it by calculating a running sum of the scheduled amounts and comparing it with the total payments made by the customer. Take all the schedule rows whose running sum is greater than the total paid, and choose the minimum date among them.
Let me know if you need further help and I will give you the SQL query, but you should try it on your own first.
Edit 1
This gives you the running sum (subquery1). The join uses >= so that each installment is included in its own running total:
select a.LoanNo, a.ScheduleDate, a.Amount, sum(b.Amount) as run_sum
from PaymentSchedule a
join PaymentSchedule b
  on a.LoanNo = b.LoanNo and a.ScheduleDate >= b.ScheduleDate
where a.ScheduleDate <= now()
group by 1, 2, 3
Total collection against each loan (subquery2):
select LoanNo, sum(Amount) as total_collection from Collection group by 1
Now combine the two:
select a.LoanNo, min(a.ScheduleDate) as overduestartdate
from subquery1 a
join subquery2 b
  on a.LoanNo = b.LoanNo and a.run_sum > b.total_collection
group by 1
Modify according to your schema.
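The running-sum idea can be checked against the example from the question with a small runnable sketch. It uses SQLite through Python's sqlite3 and a window function instead of a self join. The schedule and collections table layouts, the single Amount column (standing in for Principle plus Interest), the installment size of 100 and the overdue helper are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schedule (LoanNo INTEGER, ScheduleDate TEXT, Amount INTEGER);
CREATE TABLE collections (LoanNo INTEGER, ReceivedDate TEXT, Amount INTEGER);
-- Four monthly installments of 100, four zero collections, one payment on 8 Feb.
INSERT INTO schedule VALUES
  (1, '2017-11-04', 100), (1, '2017-12-04', 100),
  (1, '2018-01-04', 100), (1, '2018-02-04', 100);
INSERT INTO collections VALUES
  (1, '2017-11-04', 0), (1, '2017-12-04', 0),
  (1, '2018-01-04', 0), (1, '2018-02-04', 0),
  (1, '2018-02-08', 100);
""")

QUERY = """
WITH running AS (
    -- Cumulative amount due up to and including each installment.
    SELECT LoanNo, ScheduleDate,
           SUM(Amount) OVER (PARTITION BY LoanNo ORDER BY ScheduleDate) AS run_sum
    FROM schedule
    WHERE ScheduleDate <= :as_of
),
paid AS (
    SELECT LoanNo, SUM(Amount) AS total_paid
    FROM collections
    WHERE ReceivedDate <= :as_of
    GROUP BY LoanNo
)
SELECT r.LoanNo,
       MIN(r.ScheduleDate) AS overdue_start,
       CAST(julianday(:as_of) - julianday(MIN(r.ScheduleDate)) AS INTEGER) AS overdue_days
FROM running r
LEFT JOIN paid p ON p.LoanNo = r.LoanNo
WHERE r.run_sum > COALESCE(p.total_paid, 0)
GROUP BY r.LoanNo
"""

def overdue(as_of):
    """Return (LoanNo, overdue_start, overdue_days) rows as of the given date."""
    return conn.execute(QUERY, {"as_of": as_of}).fetchall()

print(overdue('2018-02-07'))  # before the payment
print(overdue('2018-02-08'))  # after the payment is applied to the oldest due
```

Before the payment the earliest unpaid installment is 4th November 2017; once 100 has been collected it moves to 4th December 2017, which matches the example in the question.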

DAX - Need column with row count within past year

I have a table with sales information at the transaction level. We want to institute a new model where we compensate sales reps if a customer makes a purchase after more than a year of dormancy. To figure out how much this would have cost historically, I want to add a column with a flag for whether or not each purchase was the Buyer's first in the past 365 days. What I'd like to do is a rowcount in Powerpivot of all sales made by that customer in the past 365 days, and wrap it in an IF to set the result to 0 or 1.
Example:
Order Date  Buyer  First Purchase in Year?
1/1/2015    1      1
1/2/2015    2      1
2/1/2015    1      0
4/1/2015    2      0
3/1/2016    2      1
5/1/2017    2      1
Any assistance would be greatly appreciated.
Excellent business use case! It's quite relevant in the business world.
To break this down for you, I will create 3 columns: 2 with intermediate calculations, and 1 with the result. Once you understand how I did this, you can combine all 3 column formulas into a single column for your dataset, if you like.
Here's a picture of the results:
Here are the 3 columns that I created:
Last Purchase - in order to run this calculation, you need to know when the buyer made their last purchase.
CALCULATE(MAX([Order Date]),FILTER(Table1,[Order Date]<EARLIER([Order Date]) && [Buyer]=EARLIER([Buyer])))
Days Since Last Purchase - now you can compare the Last Purchase date to the current Order Date.
DATEDIFF([Last Purchase],[Order Date],DAY)
First Purchase in 1 Year - finally, the results column. This simply checks to see if it has been more than 365 days since the last purchase OR if the last purchase column is blank (which means it was the first purchase), and creates the flag you want.
IF([Days Since Last Purchase]>365 || ISBLANK([Days Since Last Purchase]),1,0)
Now, you can easily combine the logic of these 3 columns into a single column and get what you want. Hope this helps!
One note I wanted to add is that for this type of analysis it's not a wise move to do row counts as you had originally suggested, as your dataset can easily expand later on (what if you wanted to add more attribute columns?) and then you would have problems. So this solution that I shared with you is much more robust.
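For anyone who wants to check the three-column logic outside Power Pivot, here is a small Python sketch that mirrors it on the example rows from the question, reading the dates as month/day/year. The function name first_purchase_flags is hypothetical; the steps follow the Last Purchase, Days Since Last Purchase and flag columns described above.

```python
from datetime import date

# (order_date, buyer) rows from the question's example, read as m/d/yyyy.
orders = [
    (date(2015, 1, 1), 1),
    (date(2015, 1, 2), 2),
    (date(2015, 2, 1), 1),
    (date(2015, 4, 1), 2),
    (date(2016, 3, 1), 2),
    (date(2017, 5, 1), 2),
]

def first_purchase_flags(orders):
    """Return 1 for rows with no earlier purchase by the buyer in the last 365 days."""
    flags = []
    for order_date, buyer in orders:
        # Last Purchase: latest strictly earlier order date for the same buyer.
        last_purchase = max(
            (d for d, b in orders if b == buyer and d < order_date),
            default=None,
        )
        # Days Since Last Purchase: None plays the role of BLANK.
        days_since = None if last_purchase is None else (order_date - last_purchase).days
        flags.append(1 if days_since is None or days_since > 365 else 0)
    return flags

print(first_purchase_flags(orders))
```

With a strict 365-day threshold the 3/1/2016 row comes out as 0, because buyer 2's previous purchase on 4/1/2015 was only 335 days earlier. The question's expected table shows 1 for that row, so the threshold may need adjusting if that row is meant to count.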

How to group within groups in Access

I've been trying for a while and I'm just about to give up. I need to prepare a report that displays Item Numbers, the line they were produced on, and their production date, among other things. So, as you would imagine, each row contains a line number, item number, production date, and information regarding the number of items planned and produced for that entry.
I need to group the rows by line first, that was simple enough, afterwards, I need to group them by week, that also worked like a charm, except the dates were not really in order after this. I would need to apply a sort but by day this time. This works well but it's the next step that causes the most trouble. I also need to group the runs of items produced. For example:
Day - Item
Day 1 - Item A
Day 2 - Item A
Day 3 - Item A
These would be grouped with a footer counting the number of items produced for those consecutive entries. However, sometimes production looks like this:
Day - Item
Day 1 - Item B
Day 2 - Item B
Day 3 - Item A
Day 3 - Item B
I don't see a way to have the items ordered in a particular way that they can be grouped since I'm already ordering/sorting them by date because the date order is messed up by the first group. If I'm to group items at that point I would get one group header/footer per row, meaning it's not working at all.
My client suggests I make it so that when Access "notices the item number changes it gives a total". While that's wonderful in words, it implies that the rows should be sorted by item number and date. He will produce item A for three days, then produce item B for 2 days but part of the problem is that sometimes he will produce A for two and a half days and start B on that third day (following A) so if it's ordered by date, it may put one row above the other since they are on the same day. To my knowledge there is no real way to have Access to just "know" which products are produced first so as to group them after the item number changes. Of course it can keep the order they were entered in but if I ever need them sorted, that order will be lost.
I'm not sure if this is at all possible with this kind of table structure. If not, can anyone suggest an alternative table structure? Or perhaps there's a way to have the first group by not mess up the dates, which would allow me to remove the sort by date (although I'm not sure that it would work even if I could do that).
@Steve Kass
Day - Item
Day 1 - Item B
Day 2 - Item B
Day 3 - Item B
Day 3 - Item A
Day 3 - Item C
Day 4 - Item A
Day 5 - Item C
This is how it's laid out in his Excel sheet:
Day - Item
Day 1 - Item B
Day 2 - Item C
Day 3 - Item C
Day 3 - Item A
Day 4 - Item A
Day 4 - Item D
Day 5 - Item D
I've picked letters that represent the alphabetical order of the actual item numbers.
@Abe Miessler, Query so far:
SELECT Planned.Line,
       Planned.[Production Date],
       Items.[Item Number],
       Items.[Bottles/Pallet],
       Planned.PQ1,
       Planned.AQ1,
       Planned.PQ2,
       Planned.AQ2,
       Planned.PQ3,
       Planned.AQ3
FROM Items
INNER JOIN Planned
    ON Items.ID = Planned.ItemID;
@David-W-Fenton: Well I'm being asked to have a production summary per run. A run would be described as consecutive production of the same product. Products are produced on one of two lines and there can be multiple entries per day. The report must be grouped first by line so that each group shows entries for that line. That was done with a simple grouping. Within each line grouping I'm required to separate entries by week. Now, within each week, the days are not appearing in order. If the days are not in order we will not see a run simply because a run will most likely happen with consecutive days. One product will be produced for 3 days in a row for example, if these days are mixed up with the other days of the week, there will not be a consecutive, identifiable run. To have the entries in each week be in the correct order (by day) I applied a sort. What I've noticed is that after applying this sort each entry is handled as a separate "group" but without a header/footer. This results in not being able to group by product number afterwards since each entry is within its own "group".
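The notion of a run, consecutive entries of the same product on a line, can be made concrete with a short sketch. It is written in Python rather than Access purely to illustrate the logic, and it assumes an explicit per-day sequence number (the hypothetical seq field) so that same-day entries have a stored order that survives sorting. The sample rows and the production_runs name are also hypothetical.

```python
from itertools import groupby

# (line, day, seq, item, produced); seq is an assumed entry-order column.
entries = [
    (1, 1, 1, "B", 10),
    (1, 2, 1, "B", 12),
    (1, 3, 1, "B", 5),
    (1, 3, 2, "A", 7),
    (1, 4, 1, "A", 9),
]

def production_runs(entries):
    """Collapse consecutive entries of the same item on a line into runs."""
    # Sort by line, then day, then entry order within the day.
    ordered = sorted(entries, key=lambda e: (e[0], e[1], e[2]))
    runs = []
    # groupby starts a new group each time (line, item) changes between neighbours.
    for (line, item), group in groupby(ordered, key=lambda e: (e[0], e[3])):
        rows = list(group)
        first_day, last_day = rows[0][1], rows[-1][1]
        runs.append((line, item, first_day, last_day, sum(r[4] for r in rows)))
    return runs

for run in production_runs(entries):
    print(run)
```

Each run could then be given a run number and stored or queried as a grouping field, which would give the report something to group on besides the raw item number.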
I think you're asking for something impossible. But just in case you aren't, please let us know what order you want if these are your rows:
Day - Item
Day 1 - Item B
Day 2 - Item B
Day 3 - Item A
Day 3 - Item B
Day 3 - Item C
Day 4 - Item A
Day 5 - Item C
You say in a comment that you started with this:
Group by=>line
Group by=>week
Group by=>product number
...but it didn't work "because after grouping by week, they're grouped by week but within the week they're no longer ordered." So you (correctly) added a sorting group, thus:
Group by=>line
Group by=>week
Sort by=>day
Group by=>product number
But you say:
"Now it's in order and you can see consecutive days with the same products but grouping results in each row being grouped separately."
Where are the controls displaying the data? In the detail or in the group/sort header? It makes all the difference in the world. To display all records, you use the DETAIL. To show summary data, you use the HEADER. It sounds to me like you're putting your controls in the header instead of the detail.
Can you take a screenshot of your report in design view and insert it into your question? Without it, I don't see how to get any further.