How would one represent scheduled events in an RDBMS?

I have to store scheduled events (class times, for example) that can be organized on a daily, weekly or monthly basis. Events can occur, say, every Monday and Wednesday, or every second Thursday of the month. Is there a way to store this information in an RDBMS that adheres to 3NF?
EDIT: This is not homework; I'm building something with a friend for our own edification and we want it in 3NF.
To be specific, I'm trying to store the schedules for mass and confession times at RC parishes. These can be scheduled in a hell of a lot of ways, such as every Sunday at x time or every Tue/Thu at a different time. Sometimes it's only the third Friday of the month, and others are only offered at a certain time once a year. I need to not only store this information, but query it, so that I can quickly get a comprehensive list of available times in the next day or week or whatever.
I suppose that strictly speaking 3NF isn't a requirement, but it would be easier for us if it were and it's better to get it correct off the bat than to change our schema later.

To record the rules for "periodic repetition", you could take inspiration from crontab's format, except of course that you do not need constraints on minutes and hours, but rather day of week, day of month, and the like. Since more than one (e.g.) weekday could be in the schedule, for NF purposes you'll want typical intermediate tables as used to represent many to many relationships, i.e. one with just two foreign keys per row (one to the main table of events, one to a table of weekdays) -- and similarly of course for days-of-month, and the like.
Presumably each scheduled event would then also have a duration, a category, perhaps a location, and a name or description.
How "normal" the form is (once you've taken care of the "sets" with the many-to-many relationships mentioned above) depends mostly on whether and how these various attributes depend on each other. For example, if every event in a certain category has the same duration, you'll want a separate auxiliary table with id, category and duration, and foreign keys into this table rather than repeating the paired info. But from what you say I don't see any intrinsic violation of normal-form rules, save for such dependency possibilities (which are not inherent in what little you have specified about the event scheduling).
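As a rough illustration of the junction-table idea above, here is a minimal sketch using Python's built-in sqlite3 module. The table names (event, weekday, event_weekday), the sample events, and the helper events_on are all hypothetical, and only the weekday dimension is shown; a days-of-month or "nth weekday" rule would presumably get its own junction table in the same way.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    event_id   INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    start_time TEXT NOT NULL              -- 'HH:MM'
);
CREATE TABLE weekday (
    weekday_id INTEGER PRIMARY KEY,       -- 0 = Monday ... 6 = Sunday (date.weekday())
    name       TEXT NOT NULL UNIQUE
);
-- Junction table: one row per (event, weekday) pair, nothing else.
CREATE TABLE event_weekday (
    event_id   INTEGER NOT NULL REFERENCES event(event_id),
    weekday_id INTEGER NOT NULL REFERENCES weekday(weekday_id),
    PRIMARY KEY (event_id, weekday_id)
);
""")

day_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
conn.executemany("INSERT INTO weekday VALUES (?, ?)", list(enumerate(day_names)))

# Hypothetical sample data: Mass on Monday and Wednesday, Confession on Thursday.
conn.execute("INSERT INTO event VALUES (1, 'Morning Mass', '08:00')")
conn.execute("INSERT INTO event VALUES (2, 'Confession', '17:30')")
conn.executemany("INSERT INTO event_weekday VALUES (?, ?)", [(1, 0), (1, 2), (2, 3)])


def events_on(day: date) -> list[tuple[str, str]]:
    """Return (name, start_time) for every event scheduled on day's weekday."""
    return conn.execute(
        """SELECT e.name, e.start_time
           FROM event AS e
           JOIN event_weekday AS ew ON ew.event_id = e.event_id
           WHERE ew.weekday_id = ?
           ORDER BY e.start_time""",
        (day.weekday(),),
    ).fetchall()
```

Listing the next week is then a loop over seven dates calling events_on for each.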

Yes I have solved this problem with my co-worker in the following way:
CREATE TABLE [dbo].[Schedule](
[ID] [int] IDENTITY(1,1) NOT NULL,
[StartDate] [datetime] NOT NULL,
[EndDate] [datetime] NULL
)
CREATE TABLE [dbo].[ScheduleInterval](
[ID] [int] IDENTITY(1,1) NOT NULL,
[ScheduleID] [int] NOT NULL,
[ScheduleIntervalUnitID] [int] NOT NULL,
[Interval] [smallint] NOT NULL
)
CREATE TABLE [dbo].[ScheduleIntervalUnit](
[ID] [int] NOT NULL,
[Name] [varchar](50) NULL
)
INSERT INTO ScheduleIntervalUnit (ID, Name)
SELECT 1 AS [ID], 'Day' AS [Name] UNION ALL
SELECT 2 AS [ID], 'Week' AS [Name] UNION ALL
SELECT 3 AS [ID], 'Month' AS [Name]
A schedule spans a length of time, and intervals occur within that span. The schedule interval unit determines how the interval value is read: for days, it is a repeat count ("every other" is 2, "every third" is 3, etc.); for weeks, a day of the week (Monday, Tuesday, etc.); and for months, a month of the calendar year. Using this you can conduct queries and logic against your database to retrieve schedules.
If your schedules need better resolution (down to hours, minutes, seconds), look at the Unix implementation of cron. I originally started down that route, but found the above to be a much simpler and more maintainable approach.
A single date/time span - such as a defined school semester starting Sept 9th and ending Nov 4th - can contain multiple schedules (so every Monday for Art class, and "every other day" for Phys Ed - but you'll need to do more work for considering holidays and weekends!).
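To make the "queries and logic" part more concrete, here is a hedged sketch of how one ScheduleInterval row might be expanded into dates. The interpretation of Interval per unit is an assumption based on the description above (a repeat count for days, an ISO weekday number for weeks, a month number for months), and the function name occurrences is hypothetical. How several interval rows on one schedule combine is left open, as are holidays.

```python
from datetime import date, timedelta

# IDs as inserted into ScheduleIntervalUnit above.
DAY, WEEK, MONTH = 1, 2, 3


def occurrences(start: date, end: date, unit_id: int, interval: int):
    """Yield each date in [start, end] matched by one ScheduleInterval row.

    Assumed meaning of `interval` (not confirmed by the original schema):
      DAY   -> repeat every `interval` days, counted from `start`
      WEEK  -> ISO weekday number, 1 = Monday ... 7 = Sunday
      MONTH -> calendar month number, 1 = January ... 12 = December
    """
    current = start
    while current <= end:
        if unit_id == DAY:
            match = (current - start).days % interval == 0
        elif unit_id == WEEK:
            match = current.isoweekday() == interval
        elif unit_id == MONTH:
            match = current.month == interval
        else:
            raise ValueError(f"unknown interval unit: {unit_id}")
        if match:
            yield current
        current += timedelta(days=1)
```

For a schedule with a NULL EndDate, the caller would need to pass its own query horizon as end.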

Related

How to store day+time slots in sql database

Let's say I have application where owners of shops can add their shops. For each shop I have to store an information when the shop is open. For example
Monday 8:00 - 20:00
Tuesday 9:30 - 22:00
Sunday 11:00 - 18:00
I have a table with shops (id, name, address, e-mail, etc). What is a correct way to store information about these day+time slots when each shop is open in sql database?
There is no one correct way, there are several methods with various advantages and disadvantages.
If you're going to use the data to see if a shop is currently open, I think the most easily used is to define a second table with one day's worth of hours for one store per row:
CREATE TABLE StoreHours(
HourID int IDENTITY(1,1) NOT NULL,
StoreID int NOT NULL,
DayOfHours int NOT NULL,
--Could be char(2) if you wanted to store day as Su, M, Tu, ...
--but storing as an integer is useful with other queries
Opens time(0) NOT NULL,
Closes time(0) NOT NULL,
--And maybe LastUpdated datetime2(0) if you cared
CONSTRAINT [PK_StoreHours] PRIMARY KEY CLUSTERED ( HourID ASC )
)
Queries can use the date functions (like DatePart) to determine the day of week, which can then be used, along with StoreID, to look up the day's hours.
For example, to find open gas stations, a query might look like this (obviously more useful if you also have geolocation data ...)
SELECT *
FROM Store as S INNER JOIN StoreHours as H on S.StoreID = H.StoreID
WHERE S.TypeOfStore = 'Gas Station'
AND H.DayOfHours = DATEPART(weekday, GETDATE())
AND CONVERT(TIME(0), GETDATE()) < H.Closes
AND CONVERT(TIME(0), GETDATE()) > H.Opens
On the other hand, if you're just displaying them (like on a web site) without making any comparisons, then storing them as a text string would save you work. You could still do the secondary table and one row per day, or you could use one row per store with each day's hours in a dedicated column. That's really hard to use in queries, but very easy to just display.
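For experimenting with the one-row-per-day layout outside SQL Server, a minimal sketch with Python's sqlite3 module might look like this. SQLite has no DATEPART or time type, so the weekday is computed in Python and times are stored as zero-padded 'HH:MM' text; the ISO weekday numbering (1 = Monday) and the helper name open_stores are assumptions, not part of the original answer.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StoreHours (
    HourID     INTEGER PRIMARY KEY,
    StoreID    INTEGER NOT NULL,
    DayOfHours INTEGER NOT NULL,   -- assumed: 1 = Monday ... 7 = Sunday
    Opens      TEXT NOT NULL,      -- 'HH:MM', zero-padded so text comparison sorts by time
    Closes     TEXT NOT NULL
);
-- The sample hours from the question, all for a single shop with ID 1.
INSERT INTO StoreHours (StoreID, DayOfHours, Opens, Closes) VALUES
    (1, 1, '08:00', '20:00'),
    (1, 2, '09:30', '22:00'),
    (1, 7, '11:00', '18:00');
""")


def open_stores(now: datetime) -> list[int]:
    """Return IDs of stores open at `now` (open at Opens, closed at Closes)."""
    clock = now.strftime("%H:%M")
    rows = conn.execute(
        """SELECT StoreID FROM StoreHours
           WHERE DayOfHours = ? AND Opens <= ? AND ? < Closes
           ORDER BY StoreID""",
        (now.isoweekday(), clock, clock),
    ).fetchall()
    return [row[0] for row in rows]
```

This sketch does not handle hours that run past midnight; those would need either two rows or extra logic.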

SQL Interview: Prevent overlapping date range

Say there is an appointment_booking table for a list of Managers (or HRs) with startDatetime and endDatetime. How does one design the table so that it doesn't accept a new entry that overlaps an existing appointment for the same manager?
If
Manager: A
has an appointment from 2016-01-01 11:00 to 2016-01-01 14:00 with Employee-1
then if Employee-2 (or some other employee) tries to book an appointment from 2016-01-01 13:00 to 16:00, it shouldn't be allowed.
Note: It is about designing the table, so triggers/procedures aren't encouraged.
Instead of inserting ranges, you could insert slices of time. You could make the slices as wide as you want, but pretend you can book a manager for 30 minutes at a time. To book from 11:30 to 12:00, you'd insert a row with the time value at 11:30. To book from 11:30 to 12:30, you'd insert two rows, one at 11:30, the other at 12:00. Then you can just use a primary key constraint or unique constraint to prevent overbooking.
create table appointment_booking (
manager char not null,
startSlice DateTime,
visiting_employee varchar2(255),
primary key (manager, startSlice)
)
I know this doesn't exactly fit your premise of the table with a start and end time, but if you have control over the table structure, this would work.
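Here is a small sketch of the slice approach using Python's sqlite3 module, assuming 30-minute slices and appointments aligned to slice boundaries. The helper names slices and book are hypothetical. The whole booking goes in one transaction, so a collision on any slice rolls back the slices already inserted.

```python
import sqlite3
from datetime import datetime, timedelta

SLICE = timedelta(minutes=30)  # assumed slice width

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE appointment_booking (
    manager           TEXT NOT NULL,
    startSlice        TEXT NOT NULL,      -- 'YYYY-MM-DD HH:MM'
    visiting_employee TEXT NOT NULL,
    PRIMARY KEY (manager, startSlice)     -- one booking per manager per slice
)""")


def slices(start: datetime, end: datetime) -> list[str]:
    """Split [start, end) into slice start times; both ends must be slice-aligned."""
    if start >= end:
        raise ValueError("start must be before end")
    for moment in (start, end):
        if (moment - datetime.min) % SLICE != timedelta(0):
            raise ValueError("times must fall on a slice boundary")
    result = []
    current = start
    while current < end:
        result.append(current.isoformat(sep=" ", timespec="minutes"))
        current += SLICE
    return result


def book(manager: str, employee: str, start: datetime, end: datetime) -> bool:
    """Insert every slice atomically; return False if any slice is already taken."""
    rows = [(manager, s, employee) for s in slices(start, end)]
    try:
        with conn:  # commits on success, rolls back if the insert raises
            conn.executemany("INSERT INTO appointment_booking VALUES (?, ?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False
```

With the question's example, booking manager A from 11:00 to 14:00 succeeds, and a later attempt for 13:00 to 16:00 is rejected by the primary key alone.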
CHECK CONSTRAINT + FUNCTION (this is as close as I can get to a DDL answer)
You could create a scalar function -- "SCHEDULE_OPENING_EXISTS()" that takes begin, end, employeeID as inputs, and outputs true or false.
Then you could create a check constraint on the table
CREATE TABLE...
WITH CHECK ADD CONSTRAINT OPENING_EXISTS
CHECK (SCHEDULE_OPENING_EXISTS(begin, end, employeeID) = 'True')
TRIGGERS:
I try to avoid triggers where I can. They're not evil per se -- but they do add a new layer of complexity to your application. If you can't avoid it, you'll need an INSTEAD OF INSERT, and also an INSTEAD OF UPDATE (presumably). Technet Reference Here: https://technet.microsoft.com/en-us/library/ms179288%28v=sql.105%29.aspx
If you reject an insert/update attempt, keep in mind whether and how you need to communicate that back to the user.
STORED PROCEDURES / USER INTERFACE:
Would a Stored Procedure work for your situation? Sample scenario:
User Interface -- user needs to see the schedule of the person(s) they're scheduling an appointment with.
From the UI -- attempt an insert/update using a stored proc. Have it re-check (last-minute) the opening (return a failure if the opening no longer exists), and then conditionally insert/update if an opening still exists (return a success message).
If the proc returns a failure to the UI, handle that in the UI by re-querying the visible schedule of all parties, accompanied by an error message.
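The "re-check at the last minute, then conditionally insert" flow can be sketched in application code as well. The following uses Python's sqlite3 module as a stand-in for a stored procedure; the column names startDatetime and endDatetime come from the question, while try_book and the employee column are hypothetical. The check and the insert run inside one write transaction so another writer cannot slip in between them.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so that
# transactions can be started and ended explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("""
CREATE TABLE appointment_booking (
    manager       TEXT NOT NULL,
    employee      TEXT NOT NULL,
    startDatetime TEXT NOT NULL,   -- 'YYYY-MM-DD HH:MM', compares correctly as text
    endDatetime   TEXT NOT NULL,
    CHECK (startDatetime < endDatetime)
)""")


def try_book(manager: str, employee: str, start: str, end: str) -> bool:
    """Return True and insert if the opening still exists, else return False."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before checking
    try:
        clash = conn.execute(
            """SELECT 1 FROM appointment_booking
               WHERE manager = ? AND startDatetime < ? AND endDatetime > ?
               LIMIT 1""",
            (manager, end, start),
        ).fetchone()
        if clash is None:
            conn.execute(
                "INSERT INTO appointment_booking VALUES (?, ?, ?, ?)",
                (manager, employee, start, end),
            )
        conn.execute("COMMIT")
        return clash is None
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

A False return is the signal for the UI to re-query the visible schedule and show an error message.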
I think these types of questions are interesting because any time you are designing a database, it is important to know the requirements of the application that will be interacting with your database.
That being said, as long as the application can reference multiple tables, I think Chris Steele's answer is a great start that I will build upon...
I would want 2 tables. The first table divides a day into parts (slices), depending on the business needs of the organization. Each slice would be the primary key of this table. I personally would choose 15-minute slices, which equates to 96 day-parts. Each day-part in this table would have a "block start" and a "block end" time that would be referenced by the scheduling application when a user has selected an actual start time and an actual end time for the meeting. The application would need to apply logic such as two "OR" operators between 3 "AND" statements in order to see if a particular blockID will be inserted into your Appointments table:
(actual start >= block start AND actual start < block end)
OR (actual end > block start AND actual end <= block end)
OR (actual start < block start AND actual end > block end)
This varies slightly from Chris Steele's answer in that it uses two tables. The actual time stamps can still be inserted into your Appointments table, but logic is only applied to them when comparing against the TimeBlocks table. In my Appointments table, I prefer breaking dates into constituent parts for cross-platform analysis (our organization uses multiple RDBMSs as well as SAS for analytics):
CREATE TABLE TimeBlocks (
blockID Number(X) NOT NULL,
blockStart DateTime NOT NULL,
blockEnd DateTime NOT NULL,
primary key (blockID)
);
CREATE TABLE Appointments (
mgrID INT NOT NULL,
yr INT NOT NULL,
mnth INT NOT NULL,
day INT NOT NULL,
blockID INT NOT NULL,
ApptStart DateTime NOT NULL,
ApptEnd DateTime NOT NULL,
empID INT NOT NULL,
primary key (mgrID, yr, mnth, day, blockID),
CONSTRAINT timecheck
check (ApptStart < ApptEnd)
);
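On the application side, working out which blockIDs an appointment touches could look roughly like the sketch below. It assumes 15-minute blocks numbered 0 to 95 from midnight and an appointment that starts and ends on the same day; block_ids is a hypothetical helper. It uses the usual half-open overlap test (appointment starts before the block ends and ends after the block starts), which is a compact way of checking whether the two intervals share any time.

```python
from datetime import datetime, time, timedelta

BLOCK_MINUTES = 15                          # assumed block width
BLOCKS_PER_DAY = 24 * 60 // BLOCK_MINUTES   # 96 day-parts


def block_ids(appt_start: datetime, appt_end: datetime) -> list[int]:
    """Return the IDs of every block the appointment overlaps."""
    if appt_start >= appt_end:
        raise ValueError("ApptStart must be before ApptEnd")
    if appt_start.date() != appt_end.date():
        raise ValueError("this sketch only handles same-day appointments")
    midnight = datetime.combine(appt_start.date(), time.min)
    ids = []
    for block_id in range(BLOCKS_PER_DAY):
        block_start = midnight + timedelta(minutes=block_id * BLOCK_MINUTES)
        block_end = block_start + timedelta(minutes=BLOCK_MINUTES)
        # Overlap: the appointment begins before the block ends
        # and ends after the block begins.
        if appt_start < block_end and appt_end > block_start:
            ids.append(block_id)
    return ids
```

Each returned ID would become one row in Appointments, where the primary key on (mgrID, yr, mnth, day, blockID) rejects a second booking of the same block.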

Storing datetime or season or quarter in a relational database

I must save an event in a relational database.
This event has a time when it starts.
This will be precisely one of:
a datetime, for example: 05.05.2015 06:00:00
a quarter, for example: 4th quarter of the year 2015
a season, for example: Winter
What would be a good way to store this in a database, so I can distinguish the three types?
Should I create a col for datetype and three other cols for datetime, quarter, season? And what would you use for season and quarter?
Yes, your suggestion makes perfect sense. Create a column for datetype and three other cols for datetime, quarter, season. There are plenty of different ways to do this; here's one approach:
DateType char(1) not null, D = datetime, Q = quarter, S = season
DateTime datetime null
Quarter int null, valid values 1 to 4
Season char(2), valid values Wi, Sp, Su, Au
I would use column constraints to enforce the valid values per column, then a table constraint to enforce the rule that if DateType = D then DateTime must not be null and Quarter and Season must be null etc.
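As an illustration of those column and table constraints, here is a minimal sketch using Python's sqlite3 module. The table name Event, the EventID key, and the helper accepts are hypothetical; the four columns and their valid values follow the list above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Event (
    EventID  INTEGER PRIMARY KEY,
    DateType TEXT NOT NULL CHECK (DateType IN ('D', 'Q', 'S')),
    DateTime TEXT NULL,
    Quarter  INTEGER NULL CHECK (Quarter BETWEEN 1 AND 4),
    Season   TEXT NULL CHECK (Season IN ('Wi', 'Sp', 'Su', 'Au')),
    -- Table constraint: exactly the column named by DateType is filled in.
    CHECK (
        (DateType = 'D' AND DateTime IS NOT NULL AND Quarter IS NULL AND Season IS NULL)
     OR (DateType = 'Q' AND DateTime IS NULL AND Quarter IS NOT NULL AND Season IS NULL)
     OR (DateType = 'S' AND DateTime IS NULL AND Quarter IS NULL AND Season IS NOT NULL)
    )
)""")


def accepts(date_type, date_time, quarter, season) -> bool:
    """Try to insert one row; return False if any constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO Event (DateType, DateTime, Quarter, Season) "
                "VALUES (?, ?, ?, ?)",
                (date_type, date_time, quarter, season),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A row such as ('Q', '2015-01-01', 1, None) is rejected here, so a quarter can never be stored alongside a datetime.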
You could skip the Quarter and Season columns and use the DateTime column to store a value to represent quarters 1 to 4 or the seasons but this sort of approach almost always leads to mistakes later on. These values are sometimes called 'magic values' because they aren't what they seem, for example, does 2015-01-01 mean 1st Jan 2015 or 'Quarter 1'? When someone queries your table and forgets to look at the DateType column how will they know? I like to see schemas and data that describe themselves. With my suggestion above (or any similar approach) it would be hard to misinterpret the data in the table.
Saving a few bytes of storage or a few millionths of a second in processing are very rarely worth it - you should design something that will always work all of the time, not something that will work a little quicker, most of the time.

What is the best way to structure Days of the week in a db

This is a normalization thing, but I have to hold information about the days of the week, where the user is going to select each day and put in a start time and a finish time. I need this info to be stored in a db. I can simply add 14 fields to the table and it will work (MondayStart, MondayFinish, TuesdayStart, etc). This doesn't seem
Do NOT design your database to match the UI.
My time keeping system at my job has a place to enter data for each day of the week. That doesn't mean you store it that way.
You need a table for users and one for times
User_T
User_ID
Time_log_T
User_ID
Start_dt (datetime)
End_dt (Datetime)
Everything can be derived from this.
If you want to have one check-in per day, create a unique constraint on User_ID, TRUNC(start_DT). This will handle third shifts that wrap across days. Most RDBMSs cannot express declaratively that the next start_dt for a given User_ID is > MAX(End_DT) for that user... you'll have to do that in code. Of course, if you allow records from previous days to be entered or corrected, you'll need to validate that they are non-overlapping in a more complex way.
Think of all the queries you'd throw at these tables; This will beat the 14 columns 99% of the time.
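To show what "everything can be derived from this" might mean in practice, here is a small sketch that rebuilds a per-weekday view (the shape of the 14-field form) from plain start/end pairs. The function name week_view and the choice to file a shift under the day it starts on are assumptions.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def week_view(entries: list[tuple[datetime, datetime]]) -> dict[str, list[tuple[str, str]]]:
    """Group (Start_dt, End_dt) pairs by the weekday on which each one starts.

    Returns a mapping of weekday name to a list of ('HH:MM', 'HH:MM') pairs,
    in chronological order. A shift that wraps past midnight stays under the
    day it started on.
    """
    view = {name: [] for name in DAY_NAMES}
    for start, end in sorted(entries):
        view[DAY_NAMES[start.weekday()]].append(
            (start.strftime("%H:%M"), end.strftime("%H:%M"))
        )
    return view
```

The same rows can just as easily feed totals per week, per user, or per shift, which the 14-column layout makes awkward.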
Users
id
...etc...
Days
id
day nvarchar (Monday, Tuesday, etc)
start_time datetime
end_time datetime
user_id
You could also break out day in Days into a separate days-of-week table to enforce consistency, for example if you only want to allow specific days, so Days would become:
Days
id
day_of_week_id
...etc...
DaysOfWeek
id
name
I don't think moving the data to another table would accomplish anything. There would still be a one-to-one (main record to 14 fields) relationship. It would be more complex and run slower.
Your instincts are good but in this case I think you would be better off leaving the data in the table. Over-normalization is a bad thing.
You could create a table with 3 columns -- one for the day (this would be the primary key), one for the start time, and one for the finish time.
You would then have one row for each day of the week.
You could extend it with, say, a column for a user id, if you are storing the start and finish time for each user on each day (in this case, the primary key would be user id and day of the week)... or something similar to suit your needs.

Data modeling with levels of detail, some of which are absent

I'm doing a data model for a roller derby league to track their matches. I track things like lap times, penalties per lap, penalties per period, and penalties per match.
The problem is that in some cases, I will only have the overall data; I might have "penalties per match" for one match and "penalties per period" for another. So at the lowest level, for some matches, I'll have the very detailed data (penalties per lap), and at the highest level I'll have penalties per match.
I'm not sure how to model/use this to do reporting when I don't have a high detail for some records. I thought about something like this:
PenaltiesPerMatch
MatchID
PenaltyCount
PenaltiesPerPeriod
MatchID
PeriodID
PenaltyCount
PenaltiesPerLap
MatchID
PeriodID
LapID
PenaltyCount
But my concern is that the higher-level information can be derived from lower level. Do I duplicate records (e.g. fill in a record for penalties per period with data that is also in penalties per lap, summed by period?) or keep unique records (don't put in penalties per period for data that I have already in penalties per lap; calculate it by summing on period).
What I would do is record the information that you have. For some matches, record it in high detail, for others in low detail.
When you report on the matches:
Calculate the sums per match for the high detail matches
Use the sum per match from the low detail matches
Store data at the finest level of detail that you have; calculate the coarser levels from it.
You could save the information in one table, with NULL values indicating that you don't have the data down to that level. You wouldn't be able to put a primary key over that, so you would need a surrogate key, but you should be able to use a unique constraint.
For example:
CREATE TABLE PenaltyCounts
(
penalty_count_id INT NOT NULL,
match_id INT NOT NULL,
period TINYINT NULL CHECK (period BETWEEN 1 AND 3),
lap SMALLINT NULL,
penalty_count SMALLINT NOT NULL,
CONSTRAINT PK_PenaltyCounts PRIMARY KEY NONCLUSTERED (penalty_count_id),
CONSTRAINT UI_PenaltyCounts UNIQUE CLUSTERED (match_id, period, lap),
CONSTRAINT CK_lap_needs_period CHECK (lap IS NULL OR period IS NOT NULL)
)
One problem with this for which I don't see an easy solution yet is how to enforce that they ONLY can enter penalties at one level. For example, they could still do this:
INSERT INTO PenaltyCounts (penalty_count_id, match_id, period, lap, penalty_count)
VALUES (1, 1, NULL, NULL, 5)
INSERT INTO PenaltyCounts (penalty_count_id, match_id, period, lap, penalty_count)
VALUES (2, 1, 1, NULL, 3)
INSERT INTO PenaltyCounts (penalty_count_id, match_id, period, lap, penalty_count)
VALUES (3, 1, 2, NULL, 2)
The advantage of this single-table solution is that your statistics can all be found by querying one table and the GROUP BYs will roll everything up nicely.
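A hedged sketch of the single-table roll-up, using Python's sqlite3 module with made-up sample rows, follows. The helper names match_totals and mixed_level_matches are hypothetical. The UNIQUE constraint from the DDL above is left out because SQLite treats NULLs as distinct in unique constraints, so it would not behave the same way here. The second query does not prevent entries at multiple levels for one match, but it does make them easy to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PenaltyCounts (
    penalty_count_id INTEGER PRIMARY KEY,
    match_id         INTEGER NOT NULL,
    period           INTEGER NULL CHECK (period BETWEEN 1 AND 3),
    lap              INTEGER NULL,
    penalty_count    INTEGER NOT NULL,
    CHECK (lap IS NULL OR period IS NOT NULL)
);
INSERT INTO PenaltyCounts (match_id, period, lap, penalty_count) VALUES
    (1, NULL, NULL, 5),   -- match 1: only the match total is known
    (2, 1, NULL, 3),      -- match 2: known per period
    (2, 2, NULL, 2),
    (3, 1, 1, 1),         -- match 3: known per lap
    (3, 1, 2, 0),
    (3, 2, 1, 4);
""")


def match_totals() -> list[tuple[int, int]]:
    """Total penalties per match, whatever level the data was entered at."""
    return conn.execute(
        """SELECT match_id, SUM(penalty_count)
           FROM PenaltyCounts
           GROUP BY match_id
           ORDER BY match_id"""
    ).fetchall()


def mixed_level_matches() -> list[int]:
    """Matches with rows at more than one level, whose totals would double count."""
    rows = conn.execute(
        """SELECT match_id
           FROM PenaltyCounts
           GROUP BY match_id
           HAVING COUNT(DISTINCT CASE
                            WHEN lap IS NOT NULL THEN 'lap'
                            WHEN period IS NOT NULL THEN 'period'
                            ELSE 'match'
                        END) > 1
           ORDER BY match_id"""
    ).fetchall()
    return [row[0] for row in rows]
```

Running mixed_level_matches as a periodic data-quality check is one possible workaround until the constraint can be enforced at insert time.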
You could also use the separate table method but put views over them to pull everything together. This still allows the problem above though of putting numbers in at multiple levels.
I think it depends on what information is valuable to the customer. If they would like to have the information by period, then you should include that as a separate record. Penalty by period and by match must be separated.
If you always had the penalty by period information, then you could do a query that sums the data.
If the number of periods is always fixed, then you could probably just add two columns to the table instead of a new table to hold the period information.