I have a table of login and logout times for users; the table looks something like the below:
| ID | User | WorkDate | Start | Finish |
| 1 | Bill | 07/12/2017 | 09:00:00 | 17:00:00 |
| 2 | John | 07/12/2017 | 09:00:00 | 12:00:00 |
| 3 | John | 07/12/2017 | 12:30:00 | 17:00:00 |
| 4 | Mary | 07/12/2017 | 09:00:00 | 10:00:00 |
| 5 | Mary | 07/12/2017 | 10:10:00 | 12:00:00 |
| 6 | Mary | 07/12/2017 | 12:10:00 | 17:00:00 |
I'm running a query to work out the length of the break each user took by taking a DateDiff between the Min of Finish and the Max of Start, then doing some other sums/queries to get the break length.
This works where I have a maximum of two rows per User per WorkDate, so rows 1, 2 and 3 give me workable data.
Rows 4, 5 and 6 do not.
So, long story short, how can I calculate the break times from the above data in a query in MS Access? I'm assuming I'm going to need some looping statement but have no idea where to begin.
Here is a solution that comes to mind first.
First query to get the min/max start and end times.
Second query to calculate the total elapsed time for each day, using the Min(Start) and Max(Finish) from the first query.
Third query to calculate the time worked in each shift (the difference between its Start and Finish) and sum it per day.
Fourth query to calculate the difference between the total from the second query and the total from the third query; that difference is the amount of break time they took (a single-query sketch of the same idea is shown below).
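The same arithmetic can also be collapsed into one aggregate query. A minimal sketch in Access SQL, assuming the table is called tblLogins (an assumed name) and bracketing User because it is a reserved word:

SELECT [User], WorkDate,
       DateDiff("n", Min(Start), Max(Finish))
           - Sum(DateDiff("n", Start, Finish)) AS BreakMinutes
FROM tblLogins
GROUP BY [User], WorkDate;

For Mary on 07/12/2017 this gives 480 - 460 = 20 minutes of break.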
If you need additional help, I can provide some screenshots of example queries.
I'm stuck; I think it should be simple but I can't get it to work. I have a table 'tbTimeTable' with all the hours of the day.
tbTimeTable (only the first 5 records are shown; it ends at 23:00, 24 records in total):
| ID | TimeStart | TimeStop |
| 1 | 0:00 | 1:00 |
| 2 | 1:00 | 2:00 |
| 3 | 2:00 | 3:00 |
| 4 | 3:00 | 4:00 |
| 5 | 4:00 | 5:00 |
I have a totals query qryPartCountTotalsPerHour with the part count per hour.
| DateIn | PartCount | PeriodIn | PeriodOut |
| 19-5-2021 | 221 | 0:00 | 1:00 |
| 19-5-2021 | 203 | 1:00 | 2:00 |
| 19-5-2021 | 201 | 2:00 | 3:00 |
| 19-5-2021 | 215 | 6:00 | 7:00 |
| 19-5-2021 | 174 | 7:00 | 8:00 |
What I want is to show the part count for every hour of the day, and if there are no records in an hour, show 0 for the part count. So every date in the DateIn field should have at least 24 records.
I tried this:
SELECT qryPartCountTotalsPerHour.DateIn, qryPartCountTotalsPerHour.PartCount, qryPartCountTotalsPerHour.PeriodIn, qryPartCountTotalsPerHour.PeriodOut
FROM tbTimeTable LEFT JOIN qryPartCountTotalsPerHour ON tbTimeTable.TimeStart = qryPartCountTotalsPerHour.PeriodIn
ORDER BY qryPartCountTotalsPerHour.DateIn;
I also tried converting PeriodIn and TimeStart to just an hour with the Hour() function, but nothing works. I'm making a mistake somewhere but can't find it.
Edit: tried to clarify that the DateIn contains more than one date.
You are close. You need to select the columns from the first table (the preserved side of the LEFT JOIN), wrap the nullable part count in Nz(), and use the alias in the join condition:
SELECT t.TimeStart, t.TimeStop, Nz(q.PartCount, 0) AS PartCount
FROM tbTimeTable AS t LEFT JOIN
     qryPartCountTotalsPerHour AS q
     ON t.TimeStart = q.PeriodIn
ORDER BY t.TimeStart;
This works for your sample data. It might get more complicated if more days are involved.
So I found the answer. This topic helped me get there, specifically the answer from Ken Sheridan and his public database "Payments". That database should give you enough info to solve this or a similar problem. However, here is a short summary of what I did.
I created a query called qryCalendar combining the time table and all the log dates (no relationship between them, i.e. a Cartesian product). This resulted in the complete time table for each logging date; a sketch of it is below.
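This is roughly what qryCalendar could look like (a sketch; I'm assuming the log dates are pulled as the distinct DateIn values from qryPartCountTotalsPerHour):

SELECT DISTINCT q.DateIn AS [Date], t.TimeStart, t.TimeStop
FROM tbTimeTable AS t, qryPartCountTotalsPerHour AS q;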
Then I created a new query to show the final result:
SELECT qryCalendar.Date, qryCalendar.TimeStop, qryCalendar.TimeStart, Nz([qryPartCountTotalsPerHour].[PartCount],0) AS tPartCount
FROM qryCalendar LEFT JOIN qryPartCountTotalsPerHour ON (qryCalendar.TimeStop = qryPartCountTotalsPerHour.PeriodOut) AND (qryCalendar.Date = qryPartCountTotalsPerHour.DateIn)
ORDER BY qryCalendar.Date, qryCalendar.TimeStart;
I have a table of sent alerts as below:
id | user_id | sent_at
1 | 123 | 01/01/2020 12:09:39
2 | 452 | 04/01/2020 02:39:50
3 | 264 | 11/01/2020 05:09:39
4 | 123 | 16/01/2020 11:09:39
5 | 452 | 22/01/2020 16:09:39
Alerts are sparse and I have around 100 million user_ids. This table has ~500 million entries in total (the last 2 months).
I want to query alerts per user in the last X hours/days/weeks/months for 10 million user_ids (saved in another table). I cannot use any external time-series database; it has to be done in Postgres only.
I tried keeping hourly buckets for each user, but the data is so sparse that I have too many rows (user_ids * hours). For example, getting the alert count for 10 million users over the last 10 hours takes a long time with this table:
user_id | hour | count
123 | 01/01/2020 12:00:00 | 2
123 | 01/01/2020 10:00:00 | 1
234 | 11/01/2020 12:00:00 | 1
There are not many alerts per user, so an index on (user_id) should be sufficient.
However, you might as well put the time into it as well, so I would recommend (user_id, sent_at). This covers the WHERE clause of your query. Postgres will still need to visit the table's heap pages to check row visibility.
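A minimal sketch of the index and the kind of query it covers (the alerts table name and a target_users table holding the 10 million user_ids are assumptions, since neither is named in the question):

CREATE INDEX idx_alerts_user_sent ON alerts (user_id, sent_at);

-- alerts per user over the last 10 hours, restricted to the saved user list
SELECT a.user_id, count(*) AS alert_count
FROM alerts AS a
JOIN target_users AS t ON t.user_id = a.user_id
WHERE a.sent_at >= now() - interval '10 hours'
GROUP BY a.user_id;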
I need a query that gives me rows of 30-minute intervals based on two times, start_hour and end_hour.
I have a table, and in this table I have the columns start_hour and end_hour.
Assuming that I have this:
| start_hour | end_hour |
| 09:00AM | 08:00PM |
I need a query that gives a result like this:
| intervals |
| 09:00AM |
| 09:30AM |
| 10:00AM |
| 10:30AM |
| 11:00AM |
| 11:30AM |
| 12:00PM |
| 12:30PM |
...
...
...
| 07:30PM |
| 08:00PM |
And the rows need to finish at the end_hour value I have in the table, as shown in the example.
Can someone help me with how to do it? I tried rounding the start_hour, but I didn't get any result.
This is a bit clunky and will take some editing for your specific needs, but it's a very slightly modified bit of code I used a few years back that should work as a solid starting point for you (a variant that pulls the bounds from your table follows the output below):
select to_char(time_slot, 'HH:MIPM')
from (select trunc(to_date('05/23/2019', 'MM/DD/YYYY')) + (rownum - 1) * (30/24/60) as time_slot
      from dual
      connect by level <= (24 * 2))
where to_char(time_slot, 'HH24:MI') between
      '09:00'  -- start_hour
      and
      '20:00'; -- end_hour
OUTPUT
09:00AM
09:30AM
10:00AM
10:30AM
11:00AM
11:30AM
12:00PM
12:30PM
01:00PM
01:30PM
02:00PM
02:30PM
03:00PM
03:30PM
04:00PM
04:30PM
05:00PM
05:30PM
06:00PM
06:30PM
07:00PM
07:30PM
08:00PM
I'm in a situation where I need to track user information very similar to fitbit steps, and am looking for feedback on two thoughts I had on modelling the data.
My requirements are to store the number of samples on a minute-by-minute basis. These are also going to be associated with a user (who did the steps), as well as challenges and tasks for the user to complete (gamification).
Now I can store all the samples in one table:
id(pk) | user | start date | steps | challengeId
uuid1 | user1 | 1/1/2015 10:00PM | 100 | challenge1
uuid2 | user1 | 1/1/2015 10:01PM | 101 | challenge1
... can have hundreds of minutes with a challenge
uuid3 | user1 | 1/1/2015 10:02PM | 102 |
uuid4 | user2 | 1/1/2015 10:00PM | 100 |
so user1 has 303 steps between 10:00PM and 10:02PM but was only participating in challenge1 at 10:00PM and 10:01 PM
However, I don't think this can scale. Assuming ideal data for a single user in a year:
12 (active hours in a day) * 60 (minutes in an hour) * 365 (days in a year) = 262,800 records in the database for 1 user. Considering 100k users, the table would become pretty large.
I'm also thinking about the idea of grouping the minutes into a concept of a session, where it would look like:
id(pk) | user | start date | steps | challengeId
uuid1 | user1 | 1/1/2015 10:00PM | [100,101] | challenge1
uuid2 | user1 | 1/1/2015 10:01PM | [102] |
uuid3 | user2 | 1/1/2015 10:02PM | [102] |
where the steps array assumes 1-minute intervals. Based on the use cases, there could be hundreds or thousands of minutes in a challenge. A rough schema for this is sketched below.
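For illustration only, the second approach in Postgres-style SQL might look like this (a sketch; the table and column names are assumptions, and each array element is the step count for one minute starting at start_date):

CREATE TABLE step_sessions (
    id           uuid PRIMARY KEY,
    user_id      text        NOT NULL,
    start_date   timestamptz NOT NULL,
    steps        integer[]   NOT NULL,  -- one element per minute from start_date
    challenge_id text                   -- NULL when not part of a challenge
);

-- total steps per session
SELECT user_id, start_date,
       (SELECT sum(s) FROM unnest(steps) AS s) AS total_steps
FROM step_sessions;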
I think the second approach makes sense, since it means querying single records instead of hundreds or thousands, and it could shrink the table by a factor of hundreds. But if there are any gotchas with this approach, or any other thoughts, they would be appreciated.
Right now I'm storing a number of records in SQL Server with a DATETIME column that stores the current timestamp using GETUTCDATE(). This ensures that we're always storing the exact moment without having to worry about questions like "well is this 2:00 my time or 2:00 your time?" Using UTC ensures that we know exactly when it happened regardless of time zone.
However, I have a query that essentially groups these records by date. A simplified version of this query looks something like this:
SELECT [created], SUM([amount]) AS [amount]
FROM (
    SELECT [amount], LEFT(CONVERT(VARCHAR, [created], 120), 10) AS [created]
    FROM (
        SELECT [amount], DATEADD(HOUR, -5, [created]) AS [created]
        FROM [sales]
        WHERE [organization] = 1
    ) AS s
) AS s
GROUP BY [created]
ORDER BY [created] ASC
Obviously this query is far from ideal--the whole reason I'm here is to ask how to improve it. First of all, it does (for the most part) accomplish the goal of what I'm looking for here--it has things grouped by dates and the other values aggregated accordingly. But what it doesn't accomplish is handling Daylight Savings Time correctly.
I live in Madison, WI and we're on Central Time, so between March and November we're UTC-5, otherwise we're UTC-6. That's why you see the -5 in the code there as a quick hack to get it working.
The problem is that if I run this query, and there are records that fall on both sides of the daylight savings time changeover, it could potentially group things incorrectly. So for instance, if the table looks something like this:
+----+--------+---------------------+
| id | amount | created |
+----+--------+---------------------+
| 1 | 100.00 | 2010-04-02 06:00:00 |
| 2 | 50.00 | 2010-04-02 04:30:00 |
| 3 | 75.00 | 2010-04-02 03:00:00 |
| 4 | 150.00 | 2010-03-02 07:00:00 |
| 5 | 25.00 | 2010-03-02 05:30:00 |
| 6 | 50.00 | 2010-03-02 04:00:00 |
+----+--------+---------------------+
My query will return this:
+------------+--------+
| created | amount |
+------------+--------+
| 2010-03-01 | 50.00 |
| 2010-03-02 | 175.00 |
| 2010-04-01 | 125.00 |
| 2010-04-02 | 100.00 |
+------------+--------+
However, ideally it SHOULD return this:
+------------+--------+
| created | amount |
+------------+--------+
| 2010-03-01 | 75.00 |
| 2010-03-02 | 150.00 |
| 2010-04-01 | 125.00 |
| 2010-04-02 | 100.00 |
+------------+--------+
The trouble is that if I just subtract a fixed -5, then April is correct but March is not, whereas if I subtract a fixed -6 then March is correct but April is not. What I really need is to convert to the appropriate time zone in a way that is aware of Daylight Saving Time and adjusts accordingly. Can I do this with a SQL query? How would I write it?
None of the current date/time functions are DST aware.
Using an auxiliary calendar table may be your best bet:
http://web.archive.org/web/20070611150639/http://sqlserver2000.databases.aspfaq.com/why-should-i-consider-using-an-auxiliary-calendar-table.html
You can store UTC offsets by date and reference them in your SELECT statement; a rough sketch follows.
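A minimal sketch of that idea, assuming a [calendar] table keyed by UTC date (stored here as a 'YYYY-MM-DD' string to match the style of your query) with that date's [utc_offset] from UTC in hours:

SELECT LEFT(CONVERT(VARCHAR, DATEADD(HOUR, c.[utc_offset], s.[created]), 120), 10) AS [created],
       SUM(s.[amount]) AS [amount]
FROM [sales] AS s
JOIN [calendar] AS c
    ON c.[calendar_date] = LEFT(CONVERT(VARCHAR, s.[created], 120), 10)  -- offset looked up per UTC date
WHERE s.[organization] = 1
GROUP BY LEFT(CONVERT(VARCHAR, DATEADD(HOUR, c.[utc_offset], s.[created]), 120), 10)
ORDER BY [created] ASC;

Rows within a few hours of the 2 AM changeover can still pick up the neighbouring offset, so a calendar keyed on the exact transition instants is more precise.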
If you were able to store your data in a datetimeoffset field instead of datetime, this might help:
http://msdn.microsoft.com/en-us/library/bb630289.aspx
This data type and the corresponding functions are a new feature of SQL Server 2008.
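For illustration, a datetimeoffset column can be shifted with SWITCHOFFSET before grouping by date (a sketch; note that a fixed offset string like '-05:00' still does not track DST by itself):

SELECT CONVERT(DATE, SWITCHOFFSET([created], '-05:00')) AS [created_date],
       SUM([amount]) AS [amount]
FROM [sales]
WHERE [organization] = 1
GROUP BY CONVERT(DATE, SWITCHOFFSET([created], '-05:00'))
ORDER BY [created_date];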