Problem overview
I'm working on a simple app for reminding the user of weekly goals. Let's say the goal is to do 30 minutes of exercise on specific days of the week.
Sample goal: do exercise on Mon, Wed, Fri.
The app also needs to track past record, i.e. dates when the user did exercise. It could be just dates, e.g.: 2019-09-02, 2019-09-05, 2019-09-11 means the user did exercise on these days and did not on the others (doesn't need to be on "exercise goal" days of the week).
The goal can change in time. Let's say today is 2019-09-11 and the goal for this week ([2019-09-09, 2019-09-15]) is Mon, Wed, Fri but from 2019-08-05 to 2019-09-08 it was Mon, Thu (repeatedly for all these weeks).
I need to store these week-oriented goals and historic exercise of data and be able to retrieve the following:
The goal days for the current week (or any week, let's say I can compute start and end day for any week given a date).
Exercise history for a larger range of days together with goal days for that range (e.g. to show when the user was supposed to exercise and when they actually did in the last month).
Question
How to best store this data in SQL.
This is a little bit academic because I'm working on a small Android app and the data is just for a single user. So there will be little data and I can successfully use any approach, even a very clumsy one will be efficient enough.
However, I'd like to explore the topic and maybe learn a thing or two.
Possible solutions
Here are two approaches that come to my mind.
In both cases I would store exercise history as a table of dates. If there is an entry for that date it means the user did exercise on that day.
It's the goal storage that is interesting.
Approach 1
Store the goals per-week (it's SQLite so dates are stored as strings - all dates are just 'YEAR-MONTH-DAY'):
CREATE TABLE goals (
start_date TEXT,
exercise_days TEXT);
"start_date" is the first day of the week,
"exercise_days" is a comma-separated list of weekdays (let's say numbers 1-7).
So for the example above we might have two rows:
'2019-08-05', '1,4'
'2019-09-09', '1,3,5'
meaning that since 2019-08-05 the goal is Mon, Thu for all weeks until 2019-09-09, when the goal becomes Mon, Wed, Fri. So there is a gap in the data. I wouldn't want to generate data for weeks starting on 2019-08-12, 2019-08-19, 2019-08-26.
With this approach it is easy to work with the data week-wise. The current goal is the one with MAX('start_date'). The goal for a week for a given date is MAX('start_date') WHERE 'start_date' <= :date.
However it gets cumbersome when I want to get data for the last 3 months and show the user their progress.
Or maybe I want to show the user the percentage of actual exercise days to what they set as their goal in a year.
In this case it seems the best approach is to fetch the data separately and merge it in the application (or maybe write some complex queries), processing week by week. This is ok performance-wise because the amount of data is small and I rarely need more than a handful of weeks.
Approach 2
Store goals in such a way that each goal day is a record:
CREATE TABLE goals (
day TEXT,
);
"day" is a day when the user should exercise. So for the week starting 2019-09-09 (Mon, Wed, Fri) we would have:
'2019-09-09'
'2019-09-11'
'2019-09-13'
and for the week starting 2019-08-05 (Mon, Thu) we would have:
'2019-08-05'
'2019-08-09'
but what for the weeks in-between?
If my app could fill all the weeks in-between then it would be easy to merge this data with the exercise history and display days when the user was supposed to exercise and when they actually did. Extracting the goal for any given week would also be easy.
The problem is: this requires the app to generate data for the "gap" weeks even if the user doesn't tweak the goal. This can be implemented as a transaction that is run each time the app process starts. In some cases it could take noticeable time for occasional users of the app (think progress bar for a second).
Maybe there a smart way to generate the data in-between when making a SELECT query?
I don't like the fact that it requires generating data. I do like the fact that I can just join the tables and then process that (e.g. compute how many exercise days there were supposed to be in August and how many days the user did actually exercise and then show them percentage like "you did 85% of your goal" - in fact I can do this without joining the tables).
Also, it seems this approach gives me more flexibility for analysis in the future.
But is there a third way? Or maybe I am overthinking this? :)
(I am asking mostly for the way of organizing the data, there's no need for exact SQL queries)
Perhaps I'm over-thinking this, but if a goal can have multiple components to it, and can change over time I'd have a goal header record, with the ID, name and other data about the goal as a whole, and then a separate table linked with the components of that goal which are time-boxed, for example:
CREATE TABLE goal_days (goal_day_ID INT,
goal_ID INT,
day_ID INT,
target_minutes INT,
start_date TEXT,
end_date TEXT)
I'd have thought that allows you to easily check against the history to map against each day of the goal - e.g. they got 100% of the Mondays, but kept missing Thursday - however when the goal was changed to Friday instead they got better.
Related
I’m a data analyst in the insurance industry and we currently have a program in SAS EG that tracks catastrophe development week by week since the start of the event for all of the catastrophic events that are reported.
(I.E week 1 is catastrophe start date + 7 days, week 2 would be end of week 1 + 7 days and so on) then all transaction amounts (dollars) for the specific catastrophes would be grouped into the respective weeks based on the date each transaction was made.
Problem that we’re faced with is we are moving away from SAS EG to GCP big query and the current process of calculating those weeks is a manually read in list which isn’t very efficient and not easily translated to BigQuery.
Curious if anybody has an idea that would allow me to calculate each week number in periods of 7 days since the start of an event in SQL or has an idea specific for BigQuery? There would be different start dates for each event.
It is complex, I know and I’m willing to give more explanation as needed. Open to any ideas for this as I haven’t been able to find anything.
So I'm developing a database for an agency that manages many relief staff.
Relief workers set their availability for each day in one of three categories (day, evening, night).
We also need to be able to set some part-time relief workers as busy on weekly, biweekly, and in one instance, on a 9-week rotation. Since we're already developing recurring patterns of availability here, we might as well also give the relief workers the option of setting recurring availability days.
We also need to be able to query the database, and determine if an employee is available for a given day.
But here's the gotcha - we need to be able to use change data capture. So I'm not sure if calculating availability is the best option.
My SQL prototype table looks like this:
TABLE Availability Day
employee_id_fk | workday (DATETIME) | day | eve | night (all booleans)| worksite_code_fk (can be null)
I'm really struggling how to wrap my head around recurring events. I could create say, a years worth, of availability days following a pattern in 'x' day cycle. But how far ahead of time do we store information? I can see running into problems when we reach the end of the data set.
I was thinking of storing say, 6 months of information, then adding a server side task that runs monthly to keep the tables updated with 6 months of data, but my intuition is telling me this is a bad fix.
For absolutely flexibility in the future and keeping data from bloating my first thought would be something like
Calendar Dimension Table - Make it for like 100 years or Whatever you Want make it include day of week information etc.
Time Dimension Table - Hour, Minutes, every 15 what ever but only for 24 hour period
Shifts Table - 1 record per shift e.g. Day, Evening, and Night
Specific Availability Table - Relationship to Calendar & Time with Start & Stops recommend 1 record per day so even if they choose a range of 7 days split that to 1 record perday and 1 record per shift.
Recurring Availability Table - for day of week (1-7),Month,WeekOfYear, whatever you can think of. But again I am thinking 1 record per value so if they are available Mondays and Tuesday's that would be 2 rows. and if multiple shifts then it would be multiple rows.
Now and here is the perhaps the weird part, I would put a Available Column on the Specific and Recurring Availability Tables, maybe make it a tiny int and store something like 0 not available, 1 available, 2 maybe available, 3 available with notice.
If you want to take into account Availability with Notice you could add columns for that too such as x # of days. If you want full flexibility maybe that becomes a related table too.
The queries would be complex but you could use a stored procedure or a table valued function to handle it fairly routinely.
I have a report that, according to users, started miscalculating dates in one field in November 2015. After some digging around, I found that one of the tables the field referenced seemed to have an end date on 2015-10-31.
The "D" field seems to represent the day of the week, with Sunday being day 1 and Saturday being 7.
Is there a way to extend the calendar so that it ends further into the future, for example 2049-12-31?
Our calendar table, for a variety of reasons, goes the the end of the current year. We have written a query that adds a new year to this table. This query takes care of most of the fields in that table. It does not touch the holiday field. That is updated manually through a web page.
We send ourselves reminders. Starting in March, we send monthly reminders that we should think about adding another year. After ensuring that the database segment has space, and that none of the definitions, such as fiscal periods, have changed, we run the query that adds a year.
Later in the year we start mailing ourselves reminders about the holidays. Then we check to see if HR has declared them, and if so, update the records accordingly.
This meets our business requirements. Yours will be different of course.
I'm trying to figure the best approach to solve this problem.
--
I have a "History" table that,
lists ALL years that have data.
If a user clicks a given Year, it segues to a new Table and,
lists ALL months that have data.
Clicking a given month, shows a new table that,
lists ALL days that have data.
Clicking a specific day, shows a list of one or multiple Time Stamps.
--
What is the best approach to solve this?
If user creates a Time Stamp. I need to insert it with today's date.
I also need to have the ability that if a user,
Deletes a given year. Everything in that year is deleted.
That same way,
Deleting a month, deletes everything in that month, for it's particular year.
And so on, to the point where the user should be able to delete Individual Time Stamps.
--
I thought I would Use a Dictionary with key for the "year". 2012, 2013, ...
And each retrieving another Dictionary with key for the "month", 1, 2, 3, 4, ...
And so on ... and so on ...
I also thought I could make a model using Core Data.
A Class Year representing the "Year" entity, having a relation to many possible Months, and each month, having a relation to many possible days, and days to Time *Stamps*.
And last,
I thought of creating a model with only two Entities.
Entries, with only one attribute "Date", that has a to-many relationship to "Time Stamps", receiving All the possible Time Stamps for that given day.
I am new to iOS programming. So this is all theory for me. But I did follow some Core Data tutorials and others working with NSDictionaries, protocols delegates and so on.
The "Dig In" approach as I go trough, seems more elegant. Specially because I think I could delete a particular given object in a cascade manner?
Do any of these make sense? Or is there a more obvious easy way to go about it? Also, please consider in the answer what would be easier to implement if a user chooses to delete a given entry in the "tree"
Any help is most appreciated.
Thank you advance!
Nuno
If you are going to rely on Core Data or any database engine, the best way to solve this is to use the database itself.
I see two possible solutions (there is more of course). The first, the simplest :
Entity
- timestamp
- year
- month
- day
- all_the_stuff_you_need
Make year, month and day readonly, updated along timestamp. Indexes: year, year+month, year+month+day. Easy call.
That way, you can very simple query the database, asking it to return the entities you need and only the entities you need.
A more complex setup would be:
Entity
- timestamp
- all_the_stuff_you_need
- year -> Year
- month -> Month
- day -> Day
Year
- year
- entities ->> Entity
Month
- month
- entities ->> Entity
Day
- day
- entities ->> Entity
So basically, 3 data domains for the years, months and days, months and days being immutable.
That structure is more complex, but it gives a better view of your data. You have a direct access to more information on your data as the data domains are explicit and well defined.
A third solution would be to create a date entity with year, month and day, with one entry per day. A middle ground between the two solutions above. Less interesting I think, but hey, it may suit your needs anyway.
I've started thinking about an employee shift management application to handle the shifts (who works when, trading, etc) at my current workplace (that uses pen and paper and hasn't got anyway for us employees to communicate about changes without going through the boss and be on site).
Currently the shifts are modeled loosely as:
There is a recurring 4 week period (from Monday week 1 to Sunday week 4)
There is a template for placing employees in this 4 week period
Every 4 months (ie 3 times a year) the 4 week template is projected over the next 4 month period
The shifts have been the same for a long time and it seems many employees would prefer to have them changed (I can say this by the requests for change that come in every time a new 4 month is set).
What I'm aiming at are the models:
Shift_group_tpl (the 4 week period above)
Shift_tpl (a single shift in the 4 week period, including info on who defaults to work this shift)
Shift_group (a set period of time whit actual shifts)
Shift (a set shift whit a real time period and an employee - and the possibility to be changed both in start_time, end_time and employee)
I've thought of a way to do this with recurring iCalendar events: Creating RRULE's (without an endtime) and then calculate (using temporary start and end times) if that specific Shift_group_tpl could be used within a real Shift_group. (The problem with this approach is that I can't figure out how to trim the Shift_group_tpl's to fit into the start or end of a Shift_group.)
What I'm looking for are some other perspectives or ways of doing it or even just a pat on the shoulder letting me know that I'm on the right track (and then giving advice on the trimming problem).
/iole1
What I'm aiming at are the models:
Shift_group_tpl (the 4 week period above)
Shift_tpl (a single shift in the 4 week period, including info on who defaults to work this shift)
Shift_group (a set period of time whit actual shifts)
Shift (a set shift whit a real time period and an employee - and the possibility to be changed both in start_time, end_time and employee)
You have "sql" as a tag for this post? So im guessing you want these as SQL tables?
By the sounds, the problem is that your considering the data you have, rather than the abstract concepts you need to store that data. Which is what you'd need to do to create an application. (Most likely a "Shifts" table, rather than the four tables above).
There is little information here to help, Consider refining your thoughts and ask another question.