How to calculate the start and end time of an event in SQL? - sql

I have a table as below:
I want to produce output in the following format:
AreaID, Power_ON_Date, Power_OFF_Date, Diff_In_Minutes
I also need to handle two cases:
Successive entries of the same event. If the same event is logged several times in a row with different times, only the first occurrence should count and the others should be ignored.
Merging two rows, a successive OFF event and ON event, into one row to get the desired result.

You can use conditional aggregation:
select areaid,
       -- min() keeps the first 'power on' when the event is repeated
       min(case when powerstatus = 'power on' then eventdatetime end) as Power_ON_Date,
       -- each group starts at a 'power off' row, so min() is the OFF time
       min(eventdatetime) as Power_OFF_Date,
       datediff(minute,
                min(eventdatetime),
                min(case when powerstatus = 'power on' then eventdatetime end)
               ) as diff_minute
from (select t.*,
             -- running count of 'power off' rows: every OFF starts a new group
             sum(case when powerstatus = 'power off' then 1 else 0 end) over (partition by areaid order by eventdatetime) as grp
      from table t
     ) t
group by areaid, grp;
Note: datediff() is SQL Server syntax. You didn't specify a database, so the function may be different in yours.
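The query above starts a new group at every 'power off' row, so a repeated OFF entry still produces an extra row. A minimal sketch of one way to drop repeated events first is shown here, run through Python's sqlite3 module purely so that it is executable. The table name power_events and the sample rows are assumptions, the minute difference uses SQLite's strftime() in place of datediff(), and window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; column names follow the query in the answer.
conn.execute(
    "CREATE TABLE power_events (areaid INTEGER, powerstatus TEXT, eventdatetime TEXT)"
)
conn.executemany(
    "INSERT INTO power_events VALUES (?, ?, ?)",
    [
        (1, "power off", "2024-01-01 10:00:00"),
        (1, "power off", "2024-01-01 10:05:00"),  # repeated OFF, should be ignored
        (1, "power on", "2024-01-01 10:30:00"),
        (1, "power on", "2024-01-01 10:40:00"),  # repeated ON, should be ignored
        (1, "power off", "2024-01-01 12:00:00"),
        (1, "power on", "2024-01-01 12:15:00"),
    ],
)

query = """
WITH marked AS (
    -- look at the previous status within the same area
    SELECT areaid, powerstatus, eventdatetime,
           LAG(powerstatus) OVER (PARTITION BY areaid ORDER BY eventdatetime) AS prev_status
    FROM power_events
),
deduped AS (
    -- keep only the first row of each run of identical statuses,
    -- then number the outages with a running count of OFF rows
    SELECT areaid, powerstatus, eventdatetime,
           SUM(CASE WHEN powerstatus = 'power off' THEN 1 ELSE 0 END)
               OVER (PARTITION BY areaid ORDER BY eventdatetime) AS grp
    FROM marked
    WHERE prev_status IS NULL OR prev_status <> powerstatus
)
SELECT areaid,
       MIN(CASE WHEN powerstatus = 'power on' THEN eventdatetime END) AS power_on_date,
       MIN(CASE WHEN powerstatus = 'power off' THEN eventdatetime END) AS power_off_date,
       (CAST(strftime('%s', MIN(CASE WHEN powerstatus = 'power on' THEN eventdatetime END)) AS INTEGER)
        - CAST(strftime('%s', MIN(CASE WHEN powerstatus = 'power off' THEN eventdatetime END)) AS INTEGER)) / 60
           AS diff_in_minutes
FROM deduped
GROUP BY areaid, grp
ORDER BY areaid, grp
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this sample data the two outages come back as one row each, using the first OFF and the first ON of every run.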

Related

Find difference between two rows in sql

I have a table that stores employee info across multiple rows under a common name, including the user's login and logout times for the website. I would like to get the result below. The table may contain multiple names (N1, N2, N3, etc.).
Name,Key,Time,
N1,TotalExp,No
N1,TotalYears,5
N1,LoggedIn,10:00:00
N1,LoggedOut,20:00:00
The expected output looks like this:
N1,TotalExp,TotalYrs,LoggedDifference
N1,No,5,10
Can anyone help me achieve this?
Although your database design doesn't look great, you can query your data this way:
with your_data as (
select 'N1' as Name,'TotalExp' as [Key],'No' as Time union all
select 'N1','TotalYears','5' union all
select 'N1','LoggedIn','10:00:00' union all
select 'N1','LoggedOut','20:00:00'
)
select
Name,
max(case when [Key] = 'TotalExp' then Time else null end) as TotalExp,
max(case when [Key] = 'TotalYears' then Time else null end) as TotalYrs,
datediff(
hour,
max(case when [Key] = 'LoggedIn' then convert(time, Time) else null end),
max(case when [Key] = 'LoggedOut' then convert(time, Time) else null end)
) as LoggedDifference
from your_data
group by Name

SQL Lag and LEAD query

I need the data shown in the Output column. When the status in the first column is P, we need the value from the filled date. When the status is anything other than P, we need the date from the last P status. Please let me know if my explanation is unclear. Thanks in advance.
In standard SQL, you can use:
select (case when status = 'P'
             then filled_dt
             -- in the standard, IGNORE NULLS follows the function call and precedes OVER
             else lag(case when status = 'P' then filled_dt end) ignore nulls over (partition by mbr_id order by filled_dt)
        end) as imputed_filled_dt
This is standard SQL; however, not all databases support ignore nulls. This probably does what you want:
select (case when status = 'P'
then filled_dt
else max(case when status = 'P' then filled_dt end) over (partition by mbr_id order by filled_dt)
end) as imputed_filled_dt
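A small runnable sketch of the second form (the running MAX), using Python's sqlite3 module as a stand-in database. The table name fills and the sample rows are assumptions; the column names come from the query above. Window functions need SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table built around the columns used in the answer.
conn.execute("CREATE TABLE fills (mbr_id INTEGER, filled_dt TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO fills VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "P"),
        (1, "2024-01-05", "X"),
        (1, "2024-01-09", "X"),
        (1, "2024-02-01", "P"),
        (1, "2024-02-10", "X"),
        (2, "2024-03-01", "X"),  # no earlier P row for this member
        (2, "2024-03-05", "P"),
    ],
)

query = """
SELECT mbr_id, filled_dt, status,
       CASE WHEN status = 'P'
            THEN filled_dt
            -- latest P date seen so far; valid because rows are ordered by filled_dt
            ELSE MAX(CASE WHEN status = 'P' THEN filled_dt END)
                 OVER (PARTITION BY mbr_id ORDER BY filled_dt)
       END AS imputed_filled_dt
FROM fills
ORDER BY mbr_id, filled_dt
"""

rows = conn.execute(query).fetchall()
imputed = [row[3] for row in rows]
for row in rows:
    print(row)
```

The running MAX only works here because the window is ordered by the same date that is being carried forward. A row with no earlier P row gets NULL.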

aggregate function error in case expression

I have this query
SELECT mylearning.Employee_Id,
case
when max(case when not mylearning.CourseStatusTXT = 'Completed' then 1 else 0 end) = 0 then '2018 Complete'
when max(case when mylearning.CourseStatusTXT in ('Started', 'Not Started') then 1 else 0 end) = 1 then '2018 Not Complete'
end as Completion_Status
FROM Analytics.myLearning_Completions as mylearning inner join Analytics.Workday WD on mylearning.Employee_ID = WD.Employee_ID
And I want to add a condition to the first when statement to make it like this
when max(case when not mylearning.CourseStatusTXT = 'Completed' then 1 else 0 end) = 0
and WD.Adjusted_Hire_Date like '2019% '
and mylearning.CourseTimeCompletedH < cast (WD.Adjusted_Hire_Date as date format 'YYYY/MM/DD') +7
then '2018 Complete'
but I keep getting this error
Executed as Single statement. Failed [3504 : HY000] Selected non-aggregate values must be part of the associated group.
Elapsed time = 00:00:00.069
How can I fix it?
As a couple of others mentioned, you are trying to mix grouped data with non-aggregated data in your calculation, which is why you're getting the 3504 error. You need to either include the referenced columns in your GROUP BY or wrap them in an aggregate function (e.g. MAX).
I'm not 100% sure if this is what you're after, but hopefully it can help you along.
SELECT
mylearning.Employee_Id,
CASE
WHEN
MAX(CASE WHEN NOT mylearning.CourseStatusTXT = 'Completed' THEN 1 ELSE 0 END) = 0 AND
WD.Adjusted_Hire_Date like '2019%' AND
-- Check if most recently completed course is before Hire (Date + 1 week)
MAX(mylearning.CourseTimeCompletedH) <
CAST(WD.Adjusted_Hire_Date AS DATE FORMAT 'YYYY/MM/DD') + 7
THEN '2018 Complete' -- No incomplete learnings
WHEN MAX(
CASE WHEN mylearning.CourseStatusTXT IN ('Started', 'Not Started') THEN 1 ELSE 0 END
) = 1 THEN '2018 Not Complete' -- Started / Not Started learnings exist
END AS Completion_Status
FROM Analytics.myLearning_Completions as mylearning -- Get learning info
INNER JOIN Analytics.Workday WD on mylearning.Employee_ID = WD.Employee_ID -- Employee info
GROUP BY mylearning.Employee_Id, WD.Adjusted_Hire_Date
This will give you a summary per employee, with a couple assumptions:
Assuming employee_ID value in Analytics.Workday is a unique value (one-to-one join), to use WD.Adjusted_Hire_Date in your comparisons, you just need to include it in the GROUP BY.
Assuming you have multiple courses per employee_Id, in order to use mylearning.CourseTimeCompletedH in your comparisons, you'd need to wrap that in an aggregate like MAX.
The caveat here is that the query will check if the most recently completed course per employee is before the "hire_date" expression, so I'm not sure if that's what you're after.
Give it a try and let me know.
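For illustration, here is a minimal sketch of the same pattern run through Python's sqlite3 module. The table names courses and workday, the shortened column names, and the sample rows are all assumptions, and SQLite's date(..., '+7 days') stands in for the Teradata date arithmetic. SQLite does not raise the 3504 error itself, so this only shows the corrected shape: every column is either in the GROUP BY or inside an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the two tables in the question.
conn.execute("CREATE TABLE courses (employee_id INTEGER, status TEXT, completed_on TEXT)")
conn.execute("CREATE TABLE workday (employee_id INTEGER, hire_date TEXT)")
conn.executemany(
    "INSERT INTO workday VALUES (?, ?)",
    [(1, "2019-01-10"), (2, "2019-02-01"), (3, "2018-05-01")],
)
conn.executemany(
    "INSERT INTO courses VALUES (?, ?, ?)",
    [
        (1, "Completed", "2019-01-12"),
        (1, "Completed", "2019-01-14"),
        (2, "Completed", "2019-02-03"),
        (2, "Started", None),
        (3, "Completed", "2018-05-02"),
    ],
)

query = """
SELECT c.employee_id,
       CASE
           WHEN MAX(CASE WHEN c.status <> 'Completed' THEN 1 ELSE 0 END) = 0
                AND w.hire_date LIKE '2019%'                          -- grouped column
                AND MAX(c.completed_on) < date(w.hire_date, '+7 days')  -- aggregated column
               THEN '2018 Complete'
           WHEN MAX(CASE WHEN c.status IN ('Started', 'Not Started') THEN 1 ELSE 0 END) = 1
               THEN '2018 Not Complete'
       END AS completion_status
FROM courses c
JOIN workday w ON w.employee_id = c.employee_id
GROUP BY c.employee_id, w.hire_date
ORDER BY c.employee_id
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Employee 3 matches neither branch (all courses completed, but hired in 2018), so the CASE returns NULL for that row. Adding an ELSE branch would make that outcome explicit.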
The issue here is that you are mixing row-by-row detail with grouped or aggregated data in the same query. An aggregate outputs a single value for all the rows unless you have a GROUP BY clause, in which case it outputs a single value for each group. When you are grouping, you can also select any columns that appear in the GROUP BY clause, since their values are unique within each group.
If you want this data for each employee, you could group by employee_id. Any other column would then also need to be wrapped in an aggregate, such as MAX(Adjusted_Hire_Date).
Maybe this is what you want?
SELECT
mylearning.employee_id
, case
when CourseStatusTXT = 'Completed' and WD.Adjusted_Hire_Date like '2019%'
and mylearning.CourseTimeCompletedH < cast (WD.Adjusted_Hire_Date as date format 'YYYY/MM/DD') +7
then '2018 Complete'
else '2018 Not Complete'
end CompletionStatus
FROM myLearning_Completions mylearning, Workday WD
WHERE mylearning.employee_id = WD.employee_id

SQL: Using case expression to compare values from column Conditional on Values of Other Column

For my example, I have (fake) crime data with three columns: city, number of crimes committed, and time period (containing time periods 1 and 2). I need to create a table with city as one column and crime_reduced as another which is an indicator for whether the crimes committed decreased from time period 1 to period 2.
How can I set up a condition to test that crimes_committed in period 2 is less than crimes_committed in period 1? My constraint is that I cannot save a physical copy of a table, so I cannot split my table into one for time period 1 and another for time period 2. I tried the following code with a case expression, which in retrospect makes no sense.
SELECT city,
CASE WHEN time_period = 1 AND crimes_committed > time_period = 2
AND crimes_committed THEN 1
ELSE 0 END AS crime_reduced
FROM crime_data
GROUP BY city;
Edit: Unfortunately, I couldn't get the CASE SIGN expression to work (it might be a platform problem). That led to this question, though: is there any way to embed a case expression within a case (this would allow for proper results without creating subqueries)? Something like the query below (this does not work in Teradata):
SELECT city,
SUM(CASE WHEN
(CASE WHEN time_period = 1 THEN crimes_commited END) > (CASE WHEN time_period = 2
THEN crimes_committed END)
THEN 1 ELSE 0 END) AS crime_reduced
FROM crime_data
GROUP BY city;
You could join two subqueries of the table, each querying a different period:
SELECT t1.city,
       CASE WHEN t1.crimes_committed > t2.crimes_committed THEN 'Yes'
            ELSE 'No'
       END AS crimes_reduced
FROM (SELECT city, crimes_committed
      FROM crime_data
      WHERE time_period = 1) t1
JOIN (SELECT city, crimes_committed
      FROM crime_data
      WHERE time_period = 2) t2 ON t1.city = t2.city
You need conditional aggregation inside the CASE:
SELECT city,
       CASE SIGN(
                -- data for 1st period
                MAX(CASE WHEN time_period = 1 THEN crimes_committed END)
                -- data for 2nd period
              - MAX(CASE WHEN time_period = 2 THEN crimes_committed END))
          WHEN 0 THEN 'Same'
          WHEN 1 THEN 'Decreased'
          WHEN -1 THEN 'Increased'
          ELSE 'Unknown (no data)'
       END
FROM crime_data
GROUP BY city;
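As a sketch of the same conditional-aggregation idea returning the 1/0 indicator the question asks for, here is a version run through Python's sqlite3 module. The sample rows are made up, and a plain comparison is used instead of SIGN so that it also runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime_data (city TEXT, crimes_committed INTEGER, time_period INTEGER)")
conn.executemany(
    "INSERT INTO crime_data VALUES (?, ?, ?)",
    [
        ("A", 10, 1), ("A", 7, 2),  # decreased
        ("B", 5, 1), ("B", 9, 2),   # increased
        ("C", 4, 1), ("C", 4, 2),   # unchanged
        ("D", 3, 1),                # no period 2 row
    ],
)

query = """
SELECT city,
       CASE
           -- each MAX picks out the single value for its period within the city
           WHEN MAX(CASE WHEN time_period = 1 THEN crimes_committed END)
              > MAX(CASE WHEN time_period = 2 THEN crimes_committed END)
               THEN 1
           ELSE 0
       END AS crime_reduced
FROM crime_data
GROUP BY city
ORDER BY city
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Note that a city with a missing period falls into the ELSE branch here, because a comparison with NULL is never true. If missing data should be reported separately, a third branch is needed, as in the SIGN version above.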

SQL statement to get record datetime field value as column of result

I have the following two tables
activity(activity_id, title, description, group_id)
statistic(statistic_id, activity_id, date, user_id, result)
group_id and user_id come from active directory. Result is an integer.
Given a user_id and a date range of 6 days (Mon - Sat) which I've calculated on the business logic side, and the fact that some of the dates in the date range may not have a statistic result for the particular date (ie. day1 and day 4 may have entered statistic rows for a particular activity, but there may not be any entries for days 2, 3, 5 and 6) how can I get a SQL result with the following format? Keep in mind that if a particular activity doesn't have a record for the particular date in the statistics table, then that day should return 0 in the SQL result.
activity_id group_id day1result day2result day3result day4result day5result day6 result
----------- -------- ---------- ---------- ---------- ---------- ---------- -----------
sample1 Secured 0 5 1 0 2 1
sample2 Unsecured 1 0 0 4 3 2
Note: Currently I am planning on handling this in the business logic, but that would require multiple queries (one to create a list of distinct activities for that user for the date range, and one for each activity looping through each date for a result or lack of result, to populate the 2nd dimension of the array with date-related results). That could end up with 50+ queries for each user per date range, which seems like overkill to me.
I got this working for 4 days and I can get it working for all 6 days, but it seems like overkill. Is there a way to simplify this?:
SELECT d1d2.activity_id, ISNULL(d1d2.result1,0) AS day1, ISNULL(d1d2.result2,0) AS day2, ISNULL(d3d4.result3,0) AS day3, ISNULL(d3d4.result4,0) AS day4
FROM
(SELECT ISNULL(d1.activity_id,0) AS activity_id, ISNULL(result1,0) AS result1, ISNULL(result2,0) AS result2
FROM
(SELECT ISNULL(statistic_result,0) AS result1, ISNULL(activity_id,0) AS activity_id
FROM statistic
WHERE user_id='jeremiah' AND statistic_date='11/22/2011'
) d1
FULL JOIN
(SELECT ISNULL(statistic_result,0) AS result2, ISNULL(activity_id,0) AS activity_id
FROM statistic WHERE user_id='jeremiah' AND statistic_date='11/23/2011'
) d2
ON d1.activity_id=d2.activity_id
) d1d2
FULL JOIN
(SELECT d3.activity_id AS activity_id, ISNULL(d3.result3,0) AS result3, ISNULL(d4.result4,0) AS result4
FROM
(SELECT ISNULL(statistic_result,0) AS result3, ISNULL(activity_id,0) AS activity_id
FROM statistic WHERE user_id='jeremiah' AND statistic_date='11/24/2011'
) d3
FULL JOIN
(SELECT ISNULL(statistic_result,0) AS result4, ISNULL(activity_id,0) AS activity_id
FROM statistic WHERE user_id='jeremiah' AND statistic_date='11/25/2011'
) d4
ON d3.activity_id=d4.activity_id
) d3d4
ON d1d2.activity_id=d3d4.activity_id
ORDER BY d1d2.activity_id
Here is a typical approach for this kind of thing:
DECLARE @minDate DATETIME,
        @maxDate DATETIME,
        @userID VARCHAR(200)
SELECT @minDate = '2011-11-15 00:00:00',
       @maxDate = '2011-11-22 23:59:59',
       @userID = 'jeremiah'
SELECT A.activity_id, A.group_id,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 0 THEN S.Result ELSE 0 END) AS Day1Result,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 1 THEN S.Result ELSE 0 END) AS Day2Result,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 2 THEN S.Result ELSE 0 END) AS Day3Result,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 3 THEN S.Result ELSE 0 END) AS Day4Result,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 4 THEN S.Result ELSE 0 END) AS Day5Result,
       SUM(CASE WHEN DATEDIFF(day, @minDate, S.date) = 5 THEN S.Result ELSE 0 END) AS Day6Result
FROM activity A
LEFT OUTER JOIN statistic S
    ON A.activity_id = S.activity_ID
    AND S.user_id = @userID
    -- the date filter belongs in the ON clause: in a WHERE clause it would
    -- discard activities that have no statistic rows at all
    AND S.date between @minDate AND @maxDate
GROUP BY A.activity_id, A.group_id
First, I'm using group by to reduce the resultset to one row per activity_id/group_id, then I'm using CASE to separate values for each individual column. In this case I'm looking at which day in the last seven, but you can use whatever logic there to determine what date. The case statements will return the value of S.result if the row is for that particular day, or 0 if it's not. SUM will add up the individual values (or just the one, if there is only one) and consolidate that into a single row.
You'll also note my date range is based on midnight on the first day in the range and 11:59PM on the last day of the range to ensure all times are included in the range.
Finally, I'm performing a left join so you will always have a 0 in your columns, even if there are no statistics.
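A cut-down, runnable sketch of this pivot, using Python's sqlite3 module with three days instead of six. The sample rows, the stat_date column name, and the parameter names are assumptions, and SQLite's date(..., '+N day') replaces DATEDIFF. All filters on statistic sit in the ON clause, so that an activity with no matching rows still appears with zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (activity_id TEXT, group_id TEXT)")
conn.execute(
    "CREATE TABLE statistic (activity_id TEXT, stat_date TEXT, user_id TEXT, result INTEGER)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?, ?)",
    [("sample1", "Secured"), ("sample2", "Unsecured"), ("sample3", "Secured")],
)
conn.executemany(
    "INSERT INTO statistic VALUES (?, ?, ?, ?)",
    [
        ("sample1", "2011-11-22", "jeremiah", 5),
        ("sample1", "2011-11-24", "jeremiah", 1),
        ("sample2", "2011-11-23", "jeremiah", 4),
        ("sample2", "2011-11-22", "someone_else", 9),  # other user, excluded
        ("sample1", "2011-12-01", "jeremiah", 7),      # outside the range, excluded
    ],
)

query = """
SELECT a.activity_id, a.group_id,
       SUM(CASE WHEN s.stat_date = date(:min_date)           THEN s.result ELSE 0 END) AS day1result,
       SUM(CASE WHEN s.stat_date = date(:min_date, '+1 day') THEN s.result ELSE 0 END) AS day2result,
       SUM(CASE WHEN s.stat_date = date(:min_date, '+2 day') THEN s.result ELSE 0 END) AS day3result
FROM activity a
LEFT JOIN statistic s
       ON s.activity_id = a.activity_id
      AND s.user_id = :user_id
      AND s.stat_date BETWEEN :min_date AND date(:min_date, '+2 day')
GROUP BY a.activity_id, a.group_id
ORDER BY a.activity_id
"""

rows = conn.execute(query, {"min_date": "2011-11-22", "user_id": "jeremiah"}).fetchall()
for row in rows:
    print(row)
```

sample3 has no statistic rows and still comes back as a row of zeros, which is the behaviour the left join is meant to provide.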
I'm not entirely sure how your results are segregated by group in addition to activity (unless group is a higher level construct), but here is the approach I would take:
SELECT activity_id,
       day1result = SUM(CASE DATEPART(weekday, date) WHEN 1 THEN result ELSE 0 END)
FROM statistic
GROUP BY activity_id
I will leave the rest of the days and addition of group_id to you, but you should see the general approach.