Selecting only if at least one row matches condition - sql

I have a select statement and want to return all values only if at least one of them has a date with 60 days of difference from today.
The problem is that i have an outer apply which returns the column i want to compare to, and they come from different tables (one belongs to cash items, and the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, only if the client is late for more than 60 days of the due date on any of it

I am also not sure, what do you expect, but how about that:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table C will have one or zero rows depending on that if the condition is met - it will than join to every row in other table.

Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60

Typically I do this using a CTE to select the key for the records I want to select and then join on that. Below is an attempt at an example:
with LateClients as
(
SELECT LoadId FROM Payment Where /*payment date later than 60 days*/
)
SELECT p.LoanId,
p.UserId
FROM Payment as p
INNER JOIN LateClients as LC
ON p.LoanId = lc.LoanId
OrderBy p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!

Related

Other alternatives to achieve LIMIT in SQL

I have created an SQL query to get certain data with LIMIT so I can use it in datatable. It has 76288 rows.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber,
ContainerNumber, BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber,
b.SealNumber, b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 100
ORDER BY TransDate ASC;
0 and 100 is inside a variable that changes when the pagination is clicked.
If it's for the first hundred pages, it loads okay. But when I click the last page, it breaks saying timeout exceeded. Also, when I use the search function of the datatable, it's not working the way it should.
Example: I searched for dino in the datatable, it will say it has 95 records but will only show 1 record since the query is only between 0 and 10.
SELECT TransDate, AgentName, OfficeCode, year, ControlNumber, ContainerNumber,
BookingNumber, SealNumber, VesselName, ShippingLine, ShippingDate
FROM (
SELECT a.TransDate, a.AgentName, a.OfficeCode, DATEPART(YEAR, a.TransDate) AS year,
a.ControlNumber, b.ContainerNumber, b.BookingNumber, b.SealNumber,
b.VesselName, b.ShippingLine, b.ShippingDate,
ROW_NUMBER() OVER (ORDER BY a.TransDate) R
FROM Cargo_Transactions a
JOIN Cargo_Vessels b ON a.ControlNumber = b.ControlNumber
LEFT OUTER JOIN [Routes] c ON a.RouteID = c.RouteID
WHERE
a.TransDate IS NOT NULL
AND a.TransDate <= GETDATE()
AND DATEPART(YEAR, a.TransDate) = '2018'
) as f WHERE R BETWEEN 0 and 10 AND AgentName LIKE '%dino%'
ORDER BY TransDate ASC;
I also tried TOP and EXCEPT but when I search for SELECT TOP 0... EXCEPT SELECT TOP 100... but it's only showing 9 rows.
UPDATE:
I was able to make it work by including the WHERE clause in the subquery. My only problem now is the ORDER BY. It only works in the current page shown which is data 1 - 10 but not for all the data.
Any alternatives? Your help is highly appreciated. Thanks!

Tweaking a Query - looking for duplicates within a certain day range

I posted a question similar to this, and got an answer, but the answer isn't configurable - my fault I should have been more clear, so I'll try again.
I have a table where TABLENAME has the following information - OrderDate, OrderNumber, CustomerID, ProductSKU, ProductName exist. This table has lines for invoices. So an order will have a data line for every item in the order.
I want to know, which customers have ordered the same item, more than once, where the order is within 90 of any other order of that same product by that customer, after a specific date. Same product in the same order number do not count. The catch is that I want "more than once" to be configurable, so if I need to see 3 or more, or 4 or more I can adjust AND I want to see the counts. Here's the query I have so far, which I think gives me the items and the counts - but not the 90 day thing:
EDITED: I don't think the former version gave me the right counts
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >'2017-11-01'
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 2
Try doing this. it'll go back 90 days
declare #date date = '2017-11-01'
SELECT customerid, productsku, productname, count(distinct ordernumber) FROM tablename
WHERE orderdate >= dateadd(DD,-90,#date) and orderdate <= #date
GROUP BY customerid, productsku, productname
HAVING COUNT(distinct ordernumber) > 1
yes that is what I was doing in the first query. so this might be a really crappy way of doing it but without seeing any data it was kind of tough. this query shows gives you the order dates as well. hope it helps
WITH DupsWithin90Days (customerid,productsku,productname,orderdate,num)
as
(
select customerid,productsku,productname,orderdate ,count(*) num from (
SELECT X.customerid, X.productsku, X.productname,X.ORDERDATE,ROW_NUMBER() OVER (partition by x.customerid,x.orderdate order by x.orderdate) rownum
FROM
(
SELECT T1.customerid, T1.productsku, T1.productname,T1.ORDERDATE
FROM TABLENAME1 T1
) X
JOIN
(
SELECT T2.customerid, T2.productsku, T2.productname,T2.ORDERDATE
FROM
TABLENAME1 T2
) Y
ON X.customerid = Y.customerid AND X.orderdate >= dateadd(DD,-90,Y.orderdate)
) dup
where rownum > 1
group by customerid,productsku,productname,orderdate
)
select customerid,productsku,productname,orderdate
from DupsWithin90Days
order by customerid ,orderdate desc

Using a date field for matching SQL Query

I'm having a bit of an issue wrapping my head around the logic of this changing dimension. I would like to associate these two tables below. I need to match the Cost - Period fact table to the cost dimension based on the Id and the effective date.
As you can see - if the month and year field is greater than the effective date of its associated Cost dimension, it should adopt that value. Once a new Effective Date is entered into the dimension, it should use that value for any period greater than said date going forward.
EDIT: I apologize for the lack of detail but the Cost Dimension will actually have a unique Index value and the changing fields to reference for the matching would be Resource, Project, Cost. I tried to match the query you provided with my fields, but I'm getting the incorrect output.
FYI: Naming convention change: EngagementId is Id, Resource is ConsultantId, and Project is ProjectId
I've changed the images below and here is my query
,_cte(HoursWorked, HoursBilled, Month, Year, EngagementId, ConsultantId, ConsultantName, ProjectId, ProjectName, ProjectRetainer, RoleId, Role, Rate, ConsultantRetainer, Salary, amount, EffectiveDate)
as
(
select sum(t.Duration), 0, Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantId, c.ConsultantName, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, 0, c.EffectiveDate
from timesheet t
left join Engagement c on t.EngagementId = c.EngagementId and Month(c.EffectiveDate) = Month(t.EndDate) and Year(c.EffectiveDate) = Year(t.EndDate)
group by Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantName, c.ConsultantId, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, c.EffectiveDate
)
select * from _cte where EffectiveDate is not null
union
select _cte.HoursWorked, _cte.HoursBilled, _cte.Month, _cte.Year, _cte.EngagementId, _cte.ConsultantId, _cte.ConsultantName, _cte.ProjectId, _Cte.ProjectName, _cte.ProjectRetainer, _cte.RoleId, _cte.Role, sub.Rate, _cte.ConsultantRetainer,_cte.Salary, _cte.amount, sub.EffectiveDate
from _cte
outer apply (
select top 1 EffectiveDate, Rate
from Engagement e
where e.ConsultantId = _cte.ConsultantId and e.ProjectId = _cte.ProjectId and e.RoleId = _cte.RoleId
and Month(e.EffectiveDate) < _cte.Month and Year(e.EffectiveDate) < _cte.Year
order by EffectiveDate desc
) sub
where _cte.EffectiveDate is null
Example:
I'm struggling with writing the query that goes along with this. At first I attempted to partition by greatest date. However, when I executed the join I got the highest effective date for every single period (even those prior to the effective date).
Is this something that can be accomplished in a query or should I be focusing on incremental updates of the destination table so that any effective date / time period in the past is left alone?
Any tips would be great!
Thanks,
Channing
Try this one:
; with _CTE as(
select p.* , c.EffectiveDate, c.Cost
from period p
left join CostDimension c on p.id = c.id and p.Month = DATEPART(month, c.EffectiveDate) and p.year = DATEPART (year, EffectiveDate)
)
select * from _CTE Where EffectiveDate is not null
Union
select _CTE.id, _CTE.Month, _CTE.Year, sub.EffectiveDate, sub.Cost
from _CTE
outer apply (select top 1 EffectiveDate, Cost
from CostDimension as cd
where cd.Id = _CTE.id and cd.EffectiveDate < DATETIMEFROMPARTS(_CTE.Year, _CTE.Month, 1, 0, 0, 0, 0)
order by EffectiveDate desc
) sub
where _Cte.EffectiveDate is null

Find consecutive days of service without a day break in between

service table:
claimid, customerid, serv-start-date, service-end-date, charge
1, A1, 1-1-14 , 1-5-14 , $200
2, A1, 1-6-14 , 1-8-14 , $300
3, A1, 2-1-14 , 2-1-14 , $100
4, A2, 2-1-14 , 2-1-14 , $100
5, A2, 2-3-14 , 2-5-14 , $100
6, A2, 2-6-14 , 2-8-14 , $100
Problem:
Basically to see the maximum total consecutive days Service start date and end date.
for customer A1 it would be 8 days (1-5 plus 6-8) and customer A2 it would be 5 6 days (3-5 plus 6-8) ... (claimid is unique PK).
Dates are in m-d-yy notation.
This gets a little messy since you could possibly have customers without multiple records. This uses a common-table-expressions, along with the max aggregate and union all to determine your results:
with cte as (
select s.customerid,
s.servicestartdate,
s2.serviceenddate,
datediff(day,s.servicestartdate,s2.serviceenddate)+1 daysdiff
from service s
join service s2 on s.customerid = s2.customerid
and s2.servicestartdate in (s.serviceenddate, dateadd(day,1,s.serviceenddate))
)
select customerid, max(daysdiff) daysdiff
from cte
group by customerid
union all
select customerid, max(datediff(day, servicestartdate, serviceenddate))
from service s
where not exists (
select 1
from cte
where s.customerid = cte.customerid
)
group by customerid
SQL Fiddle Demo
The second query in the union statement is what determines those service records without multiple records with consecutive days.
Here ya go, I think it's the simplest way:
SELECT customerid, sum(datediff([serv-end-date],[serv-start-date]))
FROM [service]
GROUP BY customerid
You will have to decide if same day start/end records count as 1. If they do, then add one to the datediff function, e.g. sum(datediff([serv-end-date],[serv-start-date]) + 1)
If you don't want to count same day services but DO want to count start/end dates inclusively when you sum them up, you will need to add a function that does the +1 only when start and end dates are different. Let me know if you want ideas on how to do that.
The only way I could think to solve the issue described by Jonathan Leffler (in a comment on another answer) was to use a temp table to merge contiguous date ranges. This would be best accomplished in an SP - but failing that the following batch may produce the output you are looking for:-
select *, datediff(day,servicestartdate,serviceenddate)+1 as numberofdays
into #t
from service
while ##rowcount>0 begin
update t1 set
t1.serviceenddate=t2.serviceenddate,
t1.numberofdays=datediff(day,t1.servicestartdate,t2.serviceenddate)+1
from #t t1
join #t t2 on t2.customerid=t1.customerid
and t2.servicestartdate=dateadd(day,1,t1.serviceenddate)
end
select
customerid,
max(numberofdays) as maxconsecutivedays
from #t
group by customerid
The update to the temp table needs to be in a loop because the date range could (I assume) be spread over any number of records (1->n). Interesting problem.
I've made updates to the code so that the temp table ends up with an extra column that holds the number of days in the date range on each record. This allows the following:-
select x.customerid, x.maxconsecutivedays, max(x.serviceenddate) as serviceenddate
from (
select t1.customerid, t1.maxconsecutivedays, t2.serviceenddate
from (
select
customerid,
max(numberofdays) as maxconsecutivedays
from #t
group by customerid
) t1
join #t t2 on t2.customerid=t1.customerid and t2.numberofdays=t1.maxconsecutivedays
) x
group by x.customerid, x.maxconsecutivedays
To identify the longest block of consecutive days (or the latest/longest if there is a tie) for each customer. This would allow you to subsequently dive back into the temp table to pull out the rows related to that block - by searching on the customerid and the serviceenddate (not maxconsecutivedays). Not sure this fits with your use case - but it may help.
WITH chain_builder AS
(
SELECT ROW_NUMBER() OVER(ORDER BY s.customerid, s.CLAIMID) as chain_ID,
s.customerid,
s.serv-start-date, s.service-end-date, s.CLAIMID, 1 as chain_count
FROM services s
WHERE s.serv-start-date <> ALL
(
SELECT DATEADD(d, 1, s2.service-end-date)
FROM services s2
)
UNION ALL
SELECT chain_ID, s.customerid, s.serv-start-date, s.service-end-date,
s.CLAIMID, chain_count + 1
FROM services s
JOIN chain_builder as c
ON s.customerid = c.customerid AND
s.serv-start-date = DATEADD(d, 1, c.service-end-date)
),
chains AS
(
SELECT chain_ID, customerid, serv-start-date, service-end-date,
CLAIMID, chain_count
FROM chain_builder
),
diff AS
(
SELECT c.chain_ID, c.customerid, c.serv-start-date, c.service-end-date,
c.CLAIMID, c.chain_count,
datediff(day,c.serv-start-date,c.service-end-date)+1 daysdiff
FROM chains c
),
diff_sum AS
(
SELECT chain_ID, customerid, serv-start-date, service-end-date,
CLAIMID, chain_count,
SUM(daysdiff) OVER (PARTITION BY chain_ID) as total_diff
FROM diff
),
diff_comp AS
(
SELECT chain_ID, customerid,
MAX(total_diff) OVER (PARTITION BY customerid) as total_diff
FROM diff_sum
)
SELECT DISTINCT ds.CLAIMID, ds.customerid, ds.serv-start-date,
ds.service-end-date, ds.total_diff as total_days, ds.chain_count
FROM diff_sum ds
JOIN diff_comp dc
ON ds.chain_ID = dc.chain_ID AND ds.customerid = dc.customerid
AND ds.total_diff = dc.total_diff
ORDER BY customerid, chain_count
OPTION (maxrecursion 0)

SQL Grouping Issues

I'm attempting to write a query that will return any customer that has multiple work orders with these work orders falling on different days of the week. Every work order for each customer should be falling on the same day of the week so I want to know where this is not the case so I can fix it.
The name of the table is Core.WorkOrder, and it contains a column called CustomerId that specifies which customer each work order belongs to. There is a column called TimeWindowStart that can be used to see which day each work order falls on (I'm using DATENAME(weekday, TimeWindowStart) to do so).
Any ideas how to write this query? I'm stuck here.
Thanks!
Select ...
From WorkOrder As W
Where Exists (
Select 1
From WorkOrder As W1
And W1.CustomerId = W.CustomerId
And DatePart( dw, W1.TimeWindowStart ) <> DatePart( dw, W.TimeWindowStart )
)
SELECT *
FROM (
SELECT *,
COUNT(dp) OVER (PARTITION BY CustomerID) AS cnt
FROM (
SELECT DISTINCT CustomerID, DATEPART(dw, TimeWindowStart) AS dp
FROM workOrder
) q
) q
WHERE cnt >= 2
SELECT CustomerId,
MIN(DATENAME(weekday, TimeWindowStart)),
MAX(DATENAME(weekday, TimeWindowStart))
FROM Core.WorkOrder
GROUP BY CustomerId
HAVING MIN(DATENAME(weekday, TimeWindowStart)) != MAX(DATENAME(weekday, TimeWindowStart))