Correlate Sequences of Independent Events - Calculate Time Intersection - sql

We are building a PowerBI reporting solution and I (well Stack) solved one problem and the business came up with a new reporting idea. Not sure of the best way to approach it as I know very little about PowerBI and the business seems to want quite complex reports.
We have two sequences of events from separate data sources. They both contain independent events occurring to vehicles. One describes what location a vehicle is within - the other describes incident events which have a reason code for the incident. The business wants to report on time spent in each location for each reason. Vehicles can change location totally independent of the incident events occurring - and events actually are datetime and occur at random points throughtout day. Each type of event has a startime/endtime and a vehicleID.
Vehicle Location Events
+------------------+-----------+------------+-----------------+----------------+
| LocationDetailID | VehicleID | LocationID | StartDateTime | EndDateTime |
+------------------+-----------+------------+-----------------+----------------+
| 1 | 1 | 1 | 2012-1-1 | 2016-1-1 |
| 2 | 1 | 2 | 2016-1-1 | 2016-4-1 |
| 3 | 1 | 1 | 2016-4-1 | 2016-11-1 |
| 4 | 2 | 1 | 2011-1-1 | 2016-11-1 |
+------------------+-----------+------------+-----------------+----------------+
Vehicle Status Events
+---------+---------------+-------------+-----------+--------------+
| EventID | StartDateTime | EndDateTime | VehicleID | ReasonCodeID |
+---------+---------------+-------------+-----------+--------------+
| 1 | 2012-1-1 | 2013-1-1 | 1 | 1 |
| 2 | 2013-1-1 | 2015-1-1 | 1 | 3 |
| 3 | 2015-1-1 | 2016-5-1 | 1 | 4 |
| 4 | 2016-5-1 | 2016-11-1 | 1 | 2 |
| 5 | 2015-9-1 | 2016-2-1 | 2 | 1 |
+---------+---------------+-------------+-----------+--------------+
Is there anyway I can correlate the two streams together and calculate total time per Vehicle per ReasonCode per location? This would seem to require me to be able to relate the two events - so a change of location may occur part way through a given ReasonCode.
Calculation Example ReasonCodeID 4
VehicleID 1 is in location ID 1 from 2012-1-1 to 2016-1-1 and
2016-4-1 to 2016-11-1
VehicleID 1 is in location ID 2 from 2016-1-1
to 2016-4-1
VehcileID 1 has ReasonCodeID 4 from 2015-1-1 to
2016-5-1
Therefore first Period in location 1 intersects with 365 days of ReasonCodeID 4 (2015-1-1 to 2016-1-1). 2nd period in location 1 intersects with 30 days (2016-4-1 to 2016-5-1).
In location 2 intersects with 91 days of ReasonCodeID 4(2016-1-1 to 2016-4-1
Desired output would be the below.
+-----------+--------------+------------+------------+
| VehicleID | ReasonCodeID | LocationID | Total Days |
+-----------+--------------+------------+------------+
| 1 | 1 | 1 | 366 |
| 1 | 3 | 1 | 730 |
| 1 | 4 | 1 | 395 |
| 1 | 4 | 2 | 91 |
| 1 | 2 | 1 | 184 |
| 2 | 1 | 1 | 154 |
+-----------+--------------+------------+------------+
I have created a SQL fiddle that shows the structure here
Vehicles have related tables and I'm sure the business will want them grouped by vehicle class etc but if I can understand how to calculate the intersection points in this case that would give me the basis for rest of reporting.

I think this solution requires a CROSS JOIN implementation. The relationship between both tables is Many to Many which implies the creation of a third table that bridges LocationEvents and VehicleStatusEvents tables so I think specifying the relationship in the expression could be easier.
I use a CROSS JOIN between both tables, then filter the results only to get those rows which VehicleID columns are the same in both tables. I am also filtering the rows that VehicleStatusEvents range dates intersects LocationEvents range dates.
Once the filtering is done I am adding a column to calculate the count of days between each intersection. Finally, the measure sums up the days for each VehicleID, ReasonCodeID and LocationID.
In order to implement the CROSS JOIN you will have to rename the VehicleID, StartDateTime and EndDateTime on any of both tables. It is necessary for avoiding ambigous column names errors.
I rename the columns as follows:
VehicleID : LocationVehicleID and StatusVehicleID
StartDateTime : LocationStartDateTime and StatusStartDateTime
EndDateTime : LocationEndDateTime and StatusEndDateTime
After this you can use CROSSJOIN in the Total Days measure:
Total Days =
SUMX (
FILTER (
ADDCOLUMNS (
FILTER (
CROSSJOIN ( LocationEvents, VehicleStatusEvents ),
LocationEvents[LocationVehicleID] = VehicleStatusEvents[StatusVehicleID]
&& LocationEvents[LocationStartDateTime] <= VehicleStatusEvents[StatusEndDateTime]
&& LocationEvents[LocationEndDateTime] >= VehicleStatusEvents[StatusStartDateTime]
),
"CountOfDays", IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] > [StatusStartDateTime]
&& [LocationEndDateTime] >= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [StatusEndDateTime], DAY ),
IF (
[LocationStartDateTime] <= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [StatusStartDateTime], [LocationEndDateTime], DAY ),
IF (
[LocationStartDateTime] >= [StatusStartDateTime]
&& [LocationEndDateTime] <= [StatusEndDateTime],
DATEDIFF ( [LocationStartDateTime], [LocationEndDateTime], DAY ),
BLANK ()
)
)
)
)
),
LocationEvents[LocationID] = [LocationID]
&& VehicleStatusEvents[ReasonCodeID] = [ReasonCodeID]
),
[CountOfDays]
)
Then in Power BI you can build a matrix (or any other visualization) using this measure:
If you don't understand completely the measure expression, here is the T-SQL translation:
SELECT
dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID,
SUM(dt.Diff) [Total Days]
FROM
(
SELECT
CASE
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- Inside range
THEN DATEDIFF(DAY, b.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime > b.StartDateTime AND a.EndDateTime >= b.EndDateTime -- |-----|*****|....|
THEN DATEDIFF(DAY, a.StartDateTime, b.EndDateTime)
WHEN a.StartDateTime <= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |...|****|-----|
THEN DATEDIFF(DAY, b.StartDateTime, a.EndDateTime)
WHEN a.StartDateTime >= b.StartDateTime AND a.EndDateTime <= b.EndDateTime -- |---|****|-----
THEN DATEDIFF(DAY, a.StartDateTime, a.EndDateTime)
END Diff,
a.VehicleID,
b.ReasonCodeID,
a.LocationID --a.StartDateTime, a.EndDateTime, b.StartDateTime, b.EndDateTime
FROM LocationEvents a
CROSS JOIN VehicleStatusEvents b
WHERE a.VehicleID = b.VehicleID
AND
(
(a.StartDateTime <= b.EndDateTime)
AND (a.EndDateTime >= b.StartDateTime)
)
) dt
GROUP BY dt.VehicleID,
dt.ReasonCodeID,
dt.LocationID
Note in T-SQL you could use an INNER JOIN operator too.
Let me know if this helps.

select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,sum
(
datediff
(
day
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
)
) as TotalDays
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID =
l.VehicleID
and case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end <=
case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end
group by coalesce(l.VehicleID,s.VehicleID)
,s.ReasonCodeID
,l.LocationID
or
select VehicleID
,ReasonCodeID
,LocationID
,sum (datediff (day,max_StartDateTime,min_EndDateTime)) as TotalDays
from (select coalesce(l.VehicleID,s.VehicleID) as VehicleID
,s.ReasonCodeID
,l.LocationID
,case when s.StartDateTime > l.StartDateTime then s.StartDateTime else l.StartDateTime end as max_StartDateTime
,case when s.EndDateTime < l.EndDateTime then s.EndDateTime else l.EndDateTime end as min_EndDateTime
from VehicleLocationEvents as l
full join VehicleStatusEvents as s
on s.VehicleID =
l.VehicleID
) ls
where max_StartDateTime <= min_EndDateTime
group by VehicleID
,ReasonCodeID
,LocationID

Related

SQL Day-over-Day count miscalculation

I'm encountering a bug in my SQL code that calculates the day-over-day (DoD) count difference. This table (curr_day) summarizes the count of trades on any business day (i.e. excluding weekends and government-mandated holidays) and is joined by a similar table (prev_day) that is day-lagged (previous day). The joining is based on the day's rank; for example the first day on the curr_day table is Jan-01 and it's rank is 1, the first day (rank 1) for the prev_day table is Dec-31.
My issue is that the trade count does not seem to calculate positive changes (see table below), only negative or no changes. This problem does not affect other fields that calculate the value of a trade, simply the amount of trades on a given day.
Sample of query
with curr_day as (select GROUP, COUNT from DB where DATE is not HOLIDAY),
prev_day as (select rank()over(partition by GROUP order by DATE) as RANK, GROUP, DATE, COUNT
from curr_day where DATE is not HOLIDAY)
select ID, DATE, curr_day.COUNT-prev_day.COUNT
from (select rank()over(partition by curr_day.GROUP order by curr_day.DATE) as RANK
from curr_day
where curr_day.DATE >= (select min(curr_day.DATE)+1) from curr_day)
left join prev_day on curr_day.RANK = prev_day.RANK and curr_day.GROUP = prev_day.GROUP)
;
Output table
Date | Group | Count | DoD_Cnt_Diff
2020-12-31 | A | 1 | 0
2021-01-01 | A | 1 | 0
2021-01-02 | A | 0 | -1
2021-01-03 | A | 1 | (null)
2021-01-04 | A | 0 | -1
2021-01-05 | A | 0 | 0
2021-12-31 | B | 0 | 0

SQL: Selecting data from multiple tables with multiple records to select and join

I have three tables: VolunteerRelationships, Organizations, and CampaignDates. I'm trying to write a query that will give me the organization id and name, and the org's start and end campaign dates for the current campaign year <#CampaignYear>, based on the selected volunteer <#SelectedInd>.
Dates are stored as separate column values for day, month and year which I'm trying to cast into a more an formatted date value. If I can get this, I'd also like to use a case statement to get the status of the campaign based on whether the date campaign dates are upcoming, currently running, or already closed, but need to get the first part of the query first.
Sorry if I'm leaving a lot of needed info out, this is my first time posting a question to this forumn. Thank you!
VolunteerRelationships
id | name | managesId |expiryDate
1 | john | 1 |
2 | jack | 2 |6/30/2020
3 | jerry| 3 |12/31/2021
Organizations
id | name1
1 | ACME
CampaignDates
orgId | dateDay | dateMonth | dateYear | dateType | Campaign Year
1 | 5 | 11 | 2020 | Start | 2020
1 | 15 | 11 | 2020 | End | 2020
Result
orgId | orgName | startDate | endDate | Status
1 | ACME | 2020-01-01| 2020-01-15 | Closed
select
v.MANAGEDACCOUNT,
o.Name1,
select * from
(select cast(cast dateyear*1000 + datemonth*100 + dateday as varchar(255)) as date as date1 from <#Schema>.CampaignDates where datetype = 'Start' and campaignyear = <#CampaignYear> and orgaccountnumber = v.MANAGEDACCOUNT) d1,
(select cast(cast dateyear*1000 + datemonth*100 + dateday as varchar(255)) as date as date2 from <#Schema>.CampaignDates where datetype = 'End' and campaignyear = <#CampaignYear> and orgaccountnumber = v.MANAGEDACCOUNT) d2
from <#Schema>.VolunteerRelationships v
inner join <#Schema>.organizations o
on o.accountnumber=v.MANAGEDACCOUNT
where v.VOLUNTEERACCOUNT = <#SelectedInd> and ( v.EXPIRYDATE IS NULL OR v.EXPIRYDATE > <#Today> )

How to select company which have two groups

I still tried select all customers which is in two group. Duplicate from customers is normal because select is from invoice but I need to know the customers who had a group in the first half year and jumped to another in the second half year.
Example:
SELECT
f.eankod as kod, --(groups)
ad.kod as firma, --(markComp)
f.nazfirmy as nazev, --(nameComp)
COUNT(ad.kod),
sum(f.sumZklZakl + f.sumZklSniz + f.sumOsv) as cena_bez_dph --(Price)
FROM
ddoklfak as f
LEFT OUTER JOIN aadresar ad ON ad.idfirmy = f.idfirmy
WHERE
f.datvyst >= '2017-01-01'
and f.datvyst <= '2017-12-31'
and f.modul like 'FAV'
GROUP BY
f.eankod,
ad.kod,
f.nazfirmy
HAVING COUNT (ad.kod) > 1
order by
ad.kod
Result:
GROUP markcomp nameComp price
| D002 | B5846 | Cosmopolis | price ... |
| D003 | B6987 | Tismotis | price ... |
| D009 | B8974 | Teramis | price ... |
| D006 | B8876 | Kesmethis | price ... | I need this, same company but diferent group, because this
| D008 | B8876 | Kesmethis | price ... | company jumped. I need know only jumped company. (last two rows from examples)
Thx for help.
You can use a CTE to find out which nameComp show up multiple times, and keep those ones only. For example:
with
x as (
-- your query
)
select * from x where nameComp in (
select nameComp from x group by nameComp having count(*) > 1
)

Is there way to add date difference values we get to the date automatically?

What I was trying to do is I have two dates and using DateDiff to get a difference between dates. For example, I Have planned Start Date and actual start Date and I got the difference between this date is 5, now I want to add this day to the Finish date.
If my Finish date is not what I assumed, but behind, then that difference we got I want to add and want to find next finish date because we are behind so next upcoming dates.
Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate)) OVER
(Partition
By ts.Id)as TotalVariance,
Case when (Sum (DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate))
OVER
(Partition By ts.Id) >30) then 'Positive' end as Violation,
DATEADD (day, DATEDIFF(day, sa.PlannedStartDate, sa.ActualStartDate))as
Summar violations,
If the activity 1 - planned Start date is 8/21/2019 but the actual start date is 9/21/2019, in this case we are behind 30 days.
Now the next activity will be delayed, so I want to add this difference to the next activity.
If the second activity planned Start date was 08/25/2019, but because of the delay of activity 1 the start date will change for second activity, in this case I want to find that new date.
Activity PlannedStartdate ActualStartDate Variance NewPlannedstartdate
Activity 1 8/21/2019 9/21/2019 30
Acivity 2 8/26/2019 null 9/26/2019
Here's an example you can run in SSMS:
-- CREATE ACTIVITY TABLE AND ADD SOME DATA --
DECLARE #Activity TABLE ( ActivityId INT, PlannedStart DATE, ActualStart DATE );
INSERT INTO #Activity (
ActivityId, PlannedStart, ActualStart
)
VALUES
( 1, '08/21/2019', '08/27/2019' ), ( 1, '08/26/2019', NULL ), ( 1, '09/14/2019', NULL );
Query #Activity to see what's in it:
SELECT * FROM #Activity ORDER BY ActivityId, PlannedStart;
#Activity content:
+------------+--------------+-------------+
| ActivityId | PlannedStart | ActualStart |
+------------+--------------+-------------+
| 1 | 2019-08-21 | 2019-08-27 |
| 1 | 2019-08-26 | NULL |
| 1 | 2019-09-14 | NULL |
+------------+--------------+-------------+
Query #Activity to factor the new starting dates:
;WITH Activity_CTE AS (
SELECT
ROW_NUMBER() OVER ( ORDER BY PlannedStart ) AS Id,
ActivityId, PlannedStart, ActualStart, DATEDIFF( dd, PlannedStart, ActualStart ) Delayed
FROM #Activity
WHERE
ActivityId = #ActivityId
)
SELECT
ActivityId,
PlannedStart,
ActualStart,
DATEADD( dd, Delays.DaysDelayed, PlannedStart ) AS NewStart
FROM Activity_CTE AS Activity
OUTER APPLY (
SELECT CASE
WHEN ( Delayed IS NOT NULL ) THEN Delayed
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
END AS DaysDelayed
) AS Delays
ORDER BY
PlannedStart;
Returns
+------------+--------------+-------------+------------+
| ActivityId | PlannedStart | ActualStart | NewStart |
+------------+--------------+-------------+------------+
| 1 | 2019-08-21 | 2019-08-27 | 2019-08-27 |
| 1 | 2019-08-26 | NULL | 2019-09-01 |
| 1 | 2019-09-14 | NULL | 2019-09-20 |
+------------+--------------+-------------+------------+
The real "magic" here is this line:
ELSE ISNULL( ( SELECT TOP 1 Delayed FROM Activity_CTE WHERE Id < Activity.Id AND Delayed IS NOT NULL ORDER BY Id DESC ), 0 )
It's checking for any prior records to itself that has a delay. If none are found, it returns 0. This value is then used to add days to the PlannedStart date to determine the NewStart date. The ORDER BY is of particular note too. Sorting in a DESC order ensures we get the "closest" delay prior to the current row.
Using a CTE in this way also takes into account the idea that the delay may not happen on the very first record (e.g., say the 08/26 planned was delayed instead of 08/21). It conveniently gives us a subtable to query against in our OUTER APPLY.
This is what you would see if you included all columns on the CTE's SELECT:
+----+------------+--------------+-------------+---------+-------------+
| Id | ActivityId | PlannedStart | ActualStart | Delayed | DaysDelayed |
+----+------------+--------------+-------------+---------+-------------+
| 1 | 1 | 2019-08-21 | 2019-08-27 | 6 | 6 |
| 2 | 1 | 2019-08-26 | NULL | NULL | 6 |
| 3 | 1 | 2019-09-14 | NULL | NULL | 6 |
+----+------------+--------------+-------------+---------+-------------+
Because the very first record is the only record with a delay, its delay of 6 days persists through each of the following records.

Most efficient way to count number of matching rows with multiple criterias at once

I have a very large table (called device_operation with 50 million rows) which holds all the operations of a product in its lifecycle (such as "start", "stop", "refill", ..." and the status of these operations (row status : Completed, Failed), with the ID of the associated device (row device_id) and a timestamp for each operation (row create_date).
Something like this :
/------+-----------+------------------+---------\
| ID | Device ID | Create_Date | Status |
+------+-----------+------------------+---------+
| 1 | 1 | 2012-03-04 01:43 | Success |
| 2 | 4 | 2012-04-04 02:34 | Failed |
| 3 | 9 | 2013-01-01 01:23 | Failed |
| 4 | 4 | 2013-12-12 12:34 | Success |
| 5 | 23 | 2014-02-01 03:45 | Success |
| 6 | 1 | 2014-05-03 08:34 | Failed |
\------+-----------+------------------+---------/
I also have another table (called subscription) that tells me when the warranty has started (row create_date) for the product (row device_id). Warranty lasts one year.
/-----------+------------------\
| Device ID | Create_Date |
+-----------+------------------+
| 2 | 2011-04-03 05:00 |
| 4 | 2012-03-05 03:45 |
| 5 | 2012-03-05 06:07 |
| ... | ... |
\-----------+------------------/
I am using PostgreSQL.
I want to do the following :
List all device IDs which had at least one successful operation before a given date (2014-07-06)
For each of those devices, count :
The number of failed operations after that date + 2 days (2014-07-08), and the device was under warranty when the operation was attempted
The number of failed operations after that date + 2 days (2014-07-08), and the device was outside warranty when the operation was attempted
The number of successful operations after that date (device being under warranty or not)
I had some limited success with the following (query has been simplified a little bit for readability - there are other joins involved to get to the subscription table, and other criterias to include the devices in the list) :
SELECT distinct device_operation.device_id as did, subscription.create_date,
(
SELECT COUNT(*)
FROM device_operation dop
WHERE dop.device_id = device_operation.device_id and
dop.create_date > '2014-07-08' and
dop.status = 'Success'
) as success,
(
SELECT COUNT(*)
FROM device_operation dop2
WHERE
dop2.device_id = subscription.device_id and
dop2.create_date > '2014-07-08' and
dop2.status = 'Failed' and
dop2.create_date <= subscription.create_date + interval '1 year'
) as failed_during_warranty,
(
SELECT COUNT(*)
FROM device_operation dop2
WHERE
dop2.device_id = subscription.device_id and
dop2.create_date > '2014-07-08' and
dop2.status = 'Failed' and
dop2.create_date > subscription.create_date + interval '1 year'
) as failed_after_warranty,
FROM device_operation, subscription
WHERE
device_operation.status = 'Success' and -- list operations which are successful
device_operation.create_date <= '2014-07-06' and -- list operations before that date
device_operation.device_id = subscription.device_id -- get warranty start for each operation
ORDER BY success DESC, failed_during_warranty DESC, failed_after_warranty DESC
As you can guess, it's so slow I cannot run the query. However it gives you an idea of the structure.
I have tried to use NULLIF to combine the requests into one, in the hope it's going to make PostgreSQL only list the subquery once instead of 3, but it returns "subquery must return only one column" :
SELECT distinct device_operation.device_id as did, subscription.create_date,
(
SELECT COUNT(NULLIF(dop2.status != 'Success', true)) as completed,
COUNT(NULLIF(dop2.status != 'Failed' or not (dop2.create_date <= subscription.create_date + interval '1 year'), true)) as failed_in_warranty,
COUNT(NULLIF(dop2.status != 'Failed' or (dop2.create_date <= subscription.create_date + interval '1 year'), true)) as failed_after_warranty
FROM device_operation dop2
WHERE
dop2.device_id = device_operation.device_id and
dop2.device_id = subscription.device_id and
dop2.create_date > '2014-07-08'
) as subq
FROM device_operation, subscription
WHERE
device_operation.status = 'Success' and -- list operations which are successful
device_operation.create_date <= '2014-07-06' and -- list operations before that date
device_operation.device_id = subscription.device_id -- get warranty start for each operation
ORDER BY success DESC, failed_in_warranty DESC, failed_outside_warranty DESC
I also tried to move the subquery to the FROM clause but that doesn't work as I need to run the subquery for each row of the main query (or do I ? maybe there's a better way)
What I expect is something like this :
/-----------+---------+------------------------+-----------------------\
| Device ID | Success | Failed during warranty | Failed after warranty |
+-----------+---------+------------------------+-----------------------+
| 194853 | 10 | 0 | 0 |
| 7853 | 5 | 5 | 0 |
| 5848 | 3 | 0 | 56 |
| 8546455 | 0 | 45 | 0 |
| 102 | 0 | 4 | 1 |
| 69329548 | 0 | 0 | 9 |
| 17 | 0 | 0 | 0 |
\-----------+---------+------------------------+-----------------------+
Can someone help me find the most efficient way to do it ?
EDIT: Corner cases: You can consider all devices have an entry in subscription.
Thank you very much !
I think you just require conditional aggregation. I find the data structure and logic a bit hard to follow, but I think the following is basically what you need:
SELECT d.device_id,
SUM(CASE WHEN d.status = 'Failed' AND d.create_date <= '2014-07-06' + interval '2 day'
THEN 1 ELSE 0
END) as NumFails,
SUM(CASE WHEN d.status = 'Failed' AND d.create_date <= '2014-07-06' + interval '2 day' AND
d.create_date > s.create_date + interval '1 year'
THEN 1 ELSE 0
END) as NumFailsNoWarranty,
SUM(CASE WHEN d.status = 'Success' AND d.create_date <= '2014-07-06' + interval '2 day'
THEN 1 ELSE 0
END) as NumSuccesses
FROM device_operation d JOIN
subscription s
ON d.device_id = s.device_id
GROUP BY d.device_id
HAVING SUM(CASE WHEN d.status = 'Success' AND d.create_date <= '2014-07-06' THEN 1 ELSE 0 END) > 0;