How to check if dates overlap on different lines in SQL Server? - sql

I have a database with electricity meter readings. Sometimes people get a new meter and then their original meter gets an end date and the new meter gets a start date and the end date remains NULL. This can happen multiple times in a year and I want to know if there are no gaps in measurement. In other words, I need to figure out if end date 1 is the same as start date 2 and so on.
Sample data:
cust_id  meter_id  start_date  end_date
-------  --------  ----------  ----------
a        1         2017-01-01  2017-05-02
a        2         2017-05-02  NULL
b        3         2017-01-01  2017-06-01
b        4         2017-06-05  NULL
This is what the data looks like and the result I am looking for is that for customer a the end date of meter 1 is equal to the start date of meter 2. For customer b however, there are 4 days between the end date of meter 3 and the start date of meter 4. That is something I want to flag.
I found customers for whom this can happen up to 8 times in the period I am researching. I tried something with nested queries and very complex cases but even I lost my way around it, so I was wondering if someone here has an idea of how to get to the answer a little smarter.

You can get the offending rows using lag(). Partition by cust_id only, so that each meter is compared with the same customer's previous meter:
select r.*
from (select r.*,
             lag(end_date) over (partition by cust_id order by start_date) as prev_end_date,
             row_number() over (partition by cust_id order by start_date) as seqnum
      from readings r
     ) r
where prev_end_date <> start_date or (prev_end_date is null and seqnum > 1);
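To try the lag() approach on the question's sample rows, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for SQL Server here (an assumption; window functions need SQLite 3.25 or later), the table name `readings` is taken from the answer, and the window is partitioned by cust_id only.

```python
import sqlite3

# Sample rows copied from the question; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table readings "
    "(cust_id text, meter_id integer, start_date text, end_date text)"
)
conn.executemany(
    "insert into readings values (?, ?, ?, ?)",
    [
        ("a", 1, "2017-01-01", "2017-05-02"),
        ("a", 2, "2017-05-02", None),
        ("b", 3, "2017-01-01", "2017-06-01"),
        ("b", 4, "2017-06-05", None),
    ],
)

# For every meter after a customer's first one, compare its start date
# with the end date of that customer's previous meter.
query = """
select cust_id, meter_id, start_date, prev_end_date
from (select r.*,
             lag(end_date) over (partition by cust_id order by start_date) as prev_end_date,
             row_number() over (partition by cust_id order by start_date) as seqnum
      from readings r
     ) r
where prev_end_date <> start_date or (prev_end_date is null and seqnum > 1)
order by cust_id, start_date
"""
gaps = conn.execute(query).fetchall()
print(gaps)  # only customer b's second meter is flagged
```

Customer a is not returned because meter 2 starts on the day meter 1 ends; customer b's meter 4 is returned together with the previous end date it failed to match.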

I'm guessing there is now a better way to pull this off using LEAD and LAG, but I wrote an article for SQL Server 2008 R2 called T-SQL: Identify bad dates in a time series. You can modify the big CTE in the middle of the article to handle your definition of a bad date.
Good luck. There's too much detail in the article to post in a single SO question, otherwise I'd do that here.

Related

Calculate total working hours of employee based on swipe in/swipe out using Oracle SQL

I was recently given a task to calculate an employee's total office hours based on his card swipe in/swipe out. I have the following data :
id gate_1 gate_2 gate_3 gate_4
100 null null null 9:00
100 null 13:30 null null
100 null null 16:00 null
100 null null 18:00 null
Here, employee 100 comes in via gate_4 at 9:00, takes a break at 13:30 and goes out via gate_2. He then comes back at 16:00 via gate_3 and leaves the office at 18:00, again via gate_3. How can I calculate the total time spent in the office from this data?
Thanks in advance.
As has been pointed out, your data model is denormalized to the point of not even satisfying first normal form. The first step is to correct that (here, within the query). Next, nothing indicates whether a swipe is in or out, so it must be assumed that the first swipe is always in and that ins and outs alternate properly. Finally, nothing indicates that multiple days are covered, so the assumption is a single period. That is a lot of assumptions.
Since an Oracle data type date contains time as well as the date and summing differences is much easier than with timestamps I convert timestamp to date in the first step of normalizing the data. Given all this we arrive at: (See Demo)
with normal (emp_id, inout_tm) as
( select emp_id, cast(gate1 as date)
    from emp_gate_time
   where gate1 is not null
  union all
  select emp_id, cast(gate2 as date)
    from emp_gate_time
   where gate2 is not null
  union all
  select emp_id, cast(gate3 as date)
    from emp_gate_time
   where gate3 is not null
  union all
  select emp_id, cast(gate4 as date)
    from emp_gate_time
   where gate4 is not null
)
select emp_id, round(24.0*(sum(hours)),1) hours_in_office
  from ( select emp_id, (time_out - time_in) hours
           from ( select emp_id, inout_tm time_in, rn
                       , lead(inout_tm) over(partition by emp_id order by inout_tm) time_out
                    from ( select n.*
                                , row_number() over(partition by emp_id order by inout_tm) rn
                             from normal n
                         )
                )
          where mod(rn,2) = 1
       )
 group by emp_id;
Items of Interest:
Subquery Factoring (CTE)
Date Arithmetic - in Hours ... Difference Between Dates in hours ...
Oracle Analytic Functions - Row_number, lead
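The pairing logic in the query above (odd-numbered swipes are "in", the following swipe is "out") can be sketched outside the database as well. This is only an illustration under the same assumptions as the answer: swipes alternate properly and all fall on one day. The calendar date and the function name `hours_in_office` are placeholders invented for the example.

```python
from datetime import datetime

# Swipe times for employee 100 from the question, one row per swipe
# (the normalized shape). The date part is an arbitrary placeholder.
swipes = [(100, "09:00"), (100, "13:30"), (100, "16:00"), (100, "18:00")]

def hours_in_office(rows):
    """Sum the in/out intervals per employee, in hours."""
    by_emp = {}
    for emp_id, hhmm in rows:
        stamp = datetime.strptime("2020-01-01 " + hhmm, "%Y-%m-%d %H:%M")
        by_emp.setdefault(emp_id, []).append(stamp)
    totals = {}
    for emp_id, times in by_emp.items():
        times.sort()
        # Swipes 1, 3, 5, ... are assumed to be "in" and the next swipe "out";
        # a trailing unpaired "in" is ignored.
        seconds = sum(
            (t_out - t_in).total_seconds()
            for t_in, t_out in zip(times[0::2], times[1::2])
        )
        totals[emp_id] = round(seconds / 3600, 1)
    return totals

totals = hours_in_office(swipes)
print(totals)  # 4.5 hours before the break plus 2 hours after it
```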
Your database schema is denormalized: it has one column per gate (gate_1, gate_2 and so on), which is the wrong approach. A better design has a reference table of gates, for example:
id|gate_name
--|---------
The table with the employee data would then look like this:
id_employee|id_gate|time
You can then sort the rows of this table by time and compute the period between each pair of consecutive rows.

traversing the records in sql

I would like to get the output for the overlapping date records.
Data:
Id  Open_date   Closed_Date
1   2016-01-01  2017-01-01
1   2016-12-31  2018-21-01
1   2016-01-01  2018-01-01
2   2017-01-01  2018-02-02
Here, the second and third records start on a date earlier than the Closed_Date of the record before them. I need to identify records of that type.
As your question is not very clear, I am assuming that you are looking for the minimum open date and the maximum close date.
If this is not the requirement, please edit the question to provide more details.
select id, min(Open_date), max(Closed_Date)
from table
group by id
Looks like you want to normalize a Slowly Changing Dimension Type 2. Of course, the best way to handle these would be temporal tables, using either Teradata or ANSI syntax.
Teradata has a nice syntax for getting your expected result based on the Period data type, and it's simple to cast your begin/end dates to a period:
SELECT id,
   -- split the period back into separate dates
   Begin(pd) AS Open_date,
   End(pd) AS Closed_Date
FROM
 (
   SELECT NORMALIZE -- magic keyword :-)
      id, PERIOD(Open_date, Closed_Date) AS pd
   FROM tab
 ) AS dt
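For readers without Teradata, the effect of merging overlapping periods per id can be sketched in a few lines of Python. This is only an approximation of what NORMALIZE is understood to do (combining periods that overlap or meet), not a description of Teradata's implementation. The function name `merge_periods` is made up for the example, and the question's unparseable value 2018-21-01 is replaced by 2018-01-21 purely as an assumption so that the rows can be loaded.

```python
from datetime import date

# Rows from the question; the second Closed_Date is an assumed value
# because the original (2018-21-01) is not a valid date.
rows = [
    (1, date(2016, 1, 1), date(2017, 1, 1)),
    (1, date(2016, 12, 31), date(2018, 1, 21)),  # assumption, see above
    (1, date(2016, 1, 1), date(2018, 1, 1)),
    (2, date(2017, 1, 1), date(2018, 2, 2)),
]

def merge_periods(rows):
    """Merge periods of the same id that overlap or meet."""
    merged = []
    for rec_id, start, end in sorted(rows):
        last = merged[-1] if merged else None
        if last and last[0] == rec_id and start <= last[2]:
            # Starts on or before the running end date: extend the period.
            last[2] = max(last[2], end)
        else:
            merged.append([rec_id, start, end])
    return [tuple(m) for m in merged]

result = merge_periods(rows)
for rec_id, open_date, closed_date in result:
    print(rec_id, open_date, closed_date)
```

With these rows the three records of id 1 collapse into a single period, and id 2 is left unchanged.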

in redshift, how can I use window functions to assign a count to a previous row's date

the title would be too wordy if I actually tried to cram it all in there but here's what I need help with...
We are trying to calculate retention of users. Our users have assignment start dates and assignment end dates that may overlap. What I need to do is look at all candidate assignments and determine if they are retained (30 days or less between previous end and new start). The tricky part: I need to assign the retention credit to the previous assignment end date. Here's a preview of the data:
month | user_id | start_date | end_date | rank | days_btw_assignment
1     | 5       | 1-1-16     | 1-31-16  | 1    | NULL
2     | 5       | 2-14-16    | 4-15-16  | 2    | 15
6     | 4       | 6-01-16    | 11-01-16 | 1    | NULL
8     | 4       | 8-01-16    | 11-01-16 | 2    | -81
Therefore, for user 5, I would need to credit the retention to Jan-16 because their first assignment ends on 1-31-16. For user 4, whose assignments overlap, I would credit the retention to Nov-16 because their previous assignment ends on 11-01-16.
I've restricted this example to use cases where they only have 2 assignments, though, there could be more. I just need a step in the right direction and I can probably handle all other use cases by myself.
Here's the sample code I'm currently using:
with placement_facts as (
    select date_trunc('month', assignment_start_date) as month,
           user_id,
           assignment_start_date,
           assignment_end_date,
           rank () over (partition by user_id order by assignment_start_date asc),
           extract(day from assignment_start_date
                   - lag(assignment_end_date, 1) over (partition by user_id order by assignment_start_date asc)) as time_btw_placement
    from activations as ca
    join offers on ca.offer_id = offers.id
    where assignment_start_date != assignment_end_date
    order by 2,4 asc)
select placement_facts.month,
       count(distinct case when time_btw_placement <= 30 then user_id else null end) as retained_raw
from placement_facts
group by 1;
Appreciate the help, and please let me know if I need to clarify anything!
If I understand your question then I think you can achieve what you want by replacing your use of LAG() with LEAD(). It's basically the same function but it looks at a given number of rows ahead.
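A rough sketch of the LEAD() idea, run against an in-memory SQLite database from Python rather than Redshift: each row looks ahead to the user's next start date, so the retention credit lands on the month of the current row's end date. The single table `assignments` is a hypothetical simplification of the activations/offers join, and SQLite's julianday() and strftime() stand in for Redshift's date functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table assignments "
    "(user_id integer, assignment_start_date text, assignment_end_date text)"
)
# The four assignments from the question's preview, as ISO dates.
conn.executemany(
    "insert into assignments values (?, ?, ?)",
    [
        (5, "2016-01-01", "2016-01-31"),
        (5, "2016-02-14", "2016-04-15"),
        (4, "2016-06-01", "2016-11-01"),
        (4, "2016-08-01", "2016-11-01"),
    ],
)

query = """
with placement_facts as (
    select user_id,
           assignment_end_date,
           -- days from this assignment's end to the user's next start
           julianday(lead(assignment_start_date) over (
               partition by user_id order by assignment_start_date
           )) - julianday(assignment_end_date) as days_to_next
    from assignments
)
select strftime('%Y-%m', assignment_end_date) as credit_month,
       count(distinct user_id) as retained_raw
from placement_facts
where days_to_next <= 30
group by credit_month
order by credit_month
"""
retained = conn.execute(query).fetchall()
print(retained)
```

A user's last assignment has no next start, so its days_to_next is NULL and it is never counted as retained.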

count occurrences for each week using db2

I am looking for some general advice rather than a solution. My problem is that I have a list of dates per person where due to administrative procedures, a person may have multiple records stored for this one instance, yet the date recorded is when the data was entered in as this person is passed through the paper trail. I understand this is quite difficult to explain so I'll give an example:
Person Date Audit
------ ---- -----
1 2000-01-01 A
1 2000-01-01 B
1 2000-01-02 C
1 2003-04-01 A
1 2003-04-03 A
where I want to know how many valid records a person has by removing annoying audits that have recorded the date as the day the data was entered, rather than the date the person first arrives in the dataset. So for the above person I am only interested in:
Person Date Audit
------ ---- -----
1 2000-01-01 A
1 2003-04-01 A
What makes this problem difficult is that I do not have the luxury of an audit column (the audit column here is just to show how the data is collected). I merely have dates. So one crude way to count real events (and remove repeated audit data) is to look at individual weeks within a person's history and, if one or more records exist for a given week, add 1 to my counter. This way, even though there are multiple records spread over a few days, I count the succession of dates as one record (which, after all, I am counting by date).
So does anyone know of any db2 functions that could help me solve this problem?
If you can live with standard weeks it's pretty simple:
select
    person, year(dt), week(dt), min(dt), min(audit)
from
    blah
group by
    person, year(dt), week(dt)
If you need seven-day ranges starting with the first date you'd need to generate your own week numbers, a calendar of sorts, e.g. like so:
with minmax(mindt, maxdt) as ( -- date range of the "calendar"
    select min(dt), max(dt)
    from blah
),
cal(dt,i) as ( -- fill the range with every date, count days
    select mindt, 0
    from minmax
    union all
    select dt+1 day , i+1
    from cal
    where dt < (select maxdt from minmax) and i < 100000
)
select
    person, year(blah.dt), wk, min(blah.dt), min(audit)
from
    (select dt, int(i/7)+1 as wk from cal) t -- generate week numbers
inner join
    blah
        on t.dt = blah.dt
group by person, year(blah.dt), wk
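The week-numbering idea in the second query (count days from the earliest date, then integer-divide by 7) can be checked quickly in Python. This is just a sketch of the bucketing arithmetic, not a replacement for the DB2 query; the function name `first_date_per_week` is invented for the example, and the rows are the dates from the question without the audit column.

```python
from datetime import date

# (person, date) pairs from the question's example.
records = [
    (1, date(2000, 1, 1)),
    (1, date(2000, 1, 1)),
    (1, date(2000, 1, 2)),
    (1, date(2003, 4, 1)),
    (1, date(2003, 4, 3)),
]

def first_date_per_week(rows):
    """Keep the earliest date per person, year and seven-day bucket."""
    origin = min(d for _, d in rows)  # first day of the "calendar"
    earliest = {}
    for person, d in rows:
        wk = (d - origin).days // 7 + 1  # same as int(i/7)+1 in the query
        key = (person, d.year, wk)
        if key not in earliest or d < earliest[key]:
            earliest[key] = d
    return sorted((p, d) for (p, _, _), d in earliest.items())

events = first_date_per_week(records)
print(events)  # one entry per seven-day bucket
```

Note that the buckets are counted from the earliest date in the whole table, as in the query, so two dates a few days apart can still fall on either side of a bucket boundary.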

Cumulative Difference

I have a table
Meter_Reading
MeterID | Reading | DateRead |
1 10 1-Jan-2012
1 20 2-Feb-2012
1 30 1-Mar-2012
1 60 2-Apr-2012
1 80 1-May-2012
The reading is a cumulative value, so I need to calculate the difference between the previous month and the current month.
Could you help me figure out how to generate a view where I can see the consumption (current month reading minus previous month reading) for each month?
I had tried the between function:
select address, reading as Consumption, dateread
from ServiceAddress, reading, meter
where address like '53 Drip Drive%'
and dateread
between (to_date('01-JAN-2012','DD-MON-YYYY')) and (to_date('30-SEP-2012', 'DD-MON-YYYY'))
and serviceaddress.serviceaddid = meter.serviceaddid and meter.meterid = reading.meterid;
but all I got was the readings for each month, not the difference.
How could I make it list the monthly consumption?
Try with analytic functions. Something like this should do the trick:
SELECT meterid, dateread,
reading - LAG(reading, 1, 0) OVER(PARTITION BY meterid ORDER BY dateread)
FROM meter_reading
You can use the LAG function to get the reading for the prior month. The query you posted references three tables-- ServiceAddress, Reading, and Meter none of which are the Meter_Reading table you posted the structure and data for. I'll ignore the query you posted since I'm not sure what the data in those tables looks like and focus on the Meter_Reading table that you posted data for
SELECT MeterID,
       DateRead,
       Reading,
       PriorReading,
       Reading - PriorReading AmountUsed
  FROM (SELECT MeterID,
               DateRead,
               Reading,
               nvl(lag(Reading) over (partition by MeterID
                                          order by DateRead),
                   0) PriorReading
          FROM meter_reading)
I assume that if there is no prior reading that you want to assume that the prior reading was 0
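To see what the LAG-based queries return for the sample readings, here is a small sketch that runs the first answer's query in an in-memory SQLite database from Python. SQLite is used only as a stand-in for Oracle (window functions need SQLite 3.25 or later), and the dates are stored as ISO text, which is an assumption about the column type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table meter_reading (meterid integer, reading integer, dateread text)"
)
# Readings from the question, with dates rewritten in ISO format.
conn.executemany(
    "insert into meter_reading values (?, ?, ?)",
    [
        (1, 10, "2012-01-01"),
        (1, 20, "2012-02-02"),
        (1, 30, "2012-03-01"),
        (1, 60, "2012-04-02"),
        (1, 80, "2012-05-01"),
    ],
)

# lag(reading, 1, 0) returns the previous reading for the same meter,
# or 0 when there is none (the first reading).
query = """
select meterid, dateread,
       reading - lag(reading, 1, 0) over (partition by meterid order by dateread) as consumption
from meter_reading
order by meterid, dateread
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because the default prior reading is 0, the first row reports the full first reading (10) as consumption; drop the default if that row should show NULL instead.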