Foreign key on part of composite PK - sql

I'm having trouble working out how to declare the following type of association:
Say I have a table "Weekly" as such:
Weekly {
Id : Int <= PK
Week : Int
Year : Int
}
And a table "Monthly" :
Monthly{
Id : Int <= PK
Month: Int
Year : Int
}
I also have a "WeekMonth" Table :
WeekMonth{
Week : Int <= PK
Month : Int <= PK
Year : Int <= PK
}
As you may have guessed, I want to be able to link Weekly with WeekMonth, and Monthly with WeekMonth too.
However, I can't seem to do this: a foreign key on part of the composite primary key. Nevertheless, in my WeekMonth table, both the year-and-week and the year-and-month combinations are obviously unique, so it should be able to work.
I've tried multiple approaches to this problem, but as the custom mapping of weeks to months is a business need, I'm a bit stuck with it.

"in my WeekMonth table, both the year and week and the year and month field are obviously unique"
That isn't true. 'Year and week' may be unique, but it depends what 'week' is here - if it's the week within the month (i.e. 1-5) then it is not unique. If it's the week within the year (1-53) then it is; but you don't have a unique or primary key on that combination. And 'year and month' is not unique, as you will have multiple entries - either 4 or 5 - for each combination.
If you have a composite primary (or unique) key then a foreign key has to refer to all of the columns in that PK - otherwise they would not necessarily be unique.
A natural key isn't really working for you here. As well as not allowing the relationships you want, you're duplicating data in the parent and child tables. It would be better to have a synthetic key, e.g. set from a sequence:
WeekMonth{
WeekMonth_Id : Int <= PK (synthetic, e.g. from sequence)
Week : Int <= }
Month : Int <= } UK
Year : Int <= }
}
Weekly {
Weekly_Id : Int <= PK
WeekMonth_Id : Int <= FK to WeekMonth
}
Monthly{
Monthly_Id : Int <= PK
WeekMonth_Id : Int <= FK to WeekMonth
}
You don't need to duplicate the year/month/week values in the child tables as you can get them from the parent. And you shouldn't duplicate them, as you can't easily guarantee that they match the related parent record, as well as for general normalisation reasons.
I'm assuming you have other data in the weekly and monthly tables, otherwise they would be a bit pointless; any other table that has an FK to one of those could use an FK to WeekMonth instead.
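As a rough, runnable sketch of the synthetic-key layout above: it uses Python's built-in sqlite3 module rather than the asker's database, and the sample rows are made up. The child table stores only WeekMonth_Id and recovers week/month/year through a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE WeekMonth (
    WeekMonth_Id INTEGER PRIMARY KEY,      -- synthetic key
    Week  INTEGER NOT NULL,
    Month INTEGER NOT NULL,
    Year  INTEGER NOT NULL,
    UNIQUE (Week, Month, Year)             -- the natural key stays unique
);
CREATE TABLE Weekly (
    Weekly_Id    INTEGER PRIMARY KEY,
    WeekMonth_Id INTEGER NOT NULL REFERENCES WeekMonth (WeekMonth_Id)
);
-- Made-up sample rows for illustration only.
INSERT INTO WeekMonth (WeekMonth_Id, Week, Month, Year) VALUES (1, 5, 2, 2024);
INSERT INTO Weekly (Weekly_Id, WeekMonth_Id) VALUES (10, 1);
""")

# The child row gets week/month/year from its parent instead of storing copies.
weekly_rows = conn.execute("""
    SELECT w.Weekly_Id, wm.Week, wm.Month, wm.Year
    FROM Weekly w
    JOIN WeekMonth wm ON wm.WeekMonth_Id = w.WeekMonth_Id
""").fetchall()

# A child row pointing at a non-existent parent is rejected by the FK.
try:
    conn.execute("INSERT INTO Weekly (Weekly_Id, WeekMonth_Id) VALUES (11, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same shape would apply to Monthly; only the table name changes.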
If you do want to have the individual year/month/week values duplicated in the child tables then you will need separate unique keys for those combinations, in addition to your current PK. So you'd modify WeekMonth to have a unique key on year and week (which may be possible, depending what 'week' represents), and another unique key on year and month - but as that is not a unique combination you can't create that key.

Assuming that the WeekMonth table has Week values 1 through 53 for the year then:
CREATE TABLE WeekMonth(
Week INT,
Month INT,
Year INT,
CONSTRAINT WeekMonth__W_M_Y__PK PRIMARY KEY ( Week, Month, Year ),
CONSTRAINT WeekMonth__W_Y__UK UNIQUE ( Week, Year )
);
CREATE TABLE Monthly(
ID INT PRIMARY KEY,
Month INT,
Year INT,
FirstWeek INT GENERATED ALWAYS
AS ( TO_NUMBER(
TO_CHAR(
NEXT_DAY(
TO_DATE( month||'-'||year, 'MM-YYYY' ) - 1,
'MONDAY'
),
'WW'
)
)
),
CONSTRAINT Monthly__W_M_Y__FK FOREIGN KEY ( FirstWeek, Month, Year )
REFERENCES WeekMonth( Week, Month, Year )
);
CREATE TABLE Weekly(
ID INT PRIMARY KEY,
Week INT,
Year INT,
CONSTRAINT Weekly__W_Y__FK FOREIGN KEY ( Week, Year )
REFERENCES WeekMonth( Week, Year )
);

RDBMS (SQL) storing time series with variable labels / extra column attributes?

I want to set up an RDBMS for structured time series data of limited size (about 6000 series, 50 MB of data) at various frequencies (daily, monthly, quarterly, annual CY and annual FY), and I want to run SQL queries on the database (mostly joining various tables by time). The database is updated once a month. The variable names of the tables in this database are rather technical and not very informative. The raw data is labeled as shown in the table below (example of a monthly table).
I started setting this up in MySQL and figured that just equipping tables with appropriate temporal identifiers gives me the join functionality I want. I could however not find out how to store the variable labels appropriately. Is it possible to somehow add attributes to the columns? Or can I link a table to the table mapping labels to the column names, such that it is carried along in joins? Or should I set this up using a different kind of database? (database must be easy to set up and host though, and SQL is strongly preferred). I am grateful for any advice.
Update:
I figured you can add comments to MySQL columns and tables, but it seems these cannot be queried in a standard way or carried along in joins. Is it possible to retrieve the information in the comments along with the queried data from a standard database connector (like this one for the R language: https://github.com/r-dbi/RMySQL)? Below a DDL example for tables with variable labels as comments.
-- Annual FY Table
CREATE TABLE IF NOT EXISTS BOU_MMI_AF (
FY VARCHAR(7) COMMENT "Fiscal Year (July - June)",
NFA DOUBLE COMMENT "Net Foreign Assets (NFA) (Shs billion)",
NDA DOUBLE COMMENT "Net Domestic Assets (NDA) (Shs billion)",
PRIMARY KEY (FY)
) COMMENT = "Annual FY";
-- Quarterly Table
CREATE TABLE IF NOT EXISTS BOU_FS (
Year INT CHECK (Year >= 1800 AND Year < 2100) COMMENT "Year",
Quarter VARCHAR(2) CHECK (Quarter IN ('Q1', 'Q2', 'Q3', 'Q4')) COMMENT "Quarter",
FY VARCHAR(7) COMMENT "Fiscal Year (July - June)",
QFY VARCHAR(2) CHECK (QFY IN ('Q1', 'Q2', 'Q3', 'Q4')) COMMENT "Quarter of Fiscal Year",
KA_RC_RWA DOUBLE COMMENT "Capital Adequacy (%): Regulatory capital to risk-weighted assets",
AQ_NPL_GL DOUBLE COMMENT "Asset quality (%): NPLs to total gross loans",
EP_RA DOUBLE COMMENT "Earnings & profitability (%): Return on assets",
L_BFA_TD DOUBLE COMMENT "Liquidity (%): Bank-funded advances to total deposits",
MS_FX_T1CA DOUBLE COMMENT "Market Sensitivity (%): Forex exposure to regulatory tier 1 capital",
PRIMARY KEY (Year, Quarter)
) COMMENT = "Quarterly";
-- Daily Table
CREATE TABLE IF NOT EXISTS BOU_I (
Date DATE CHECK (Date >= '1800-01-01' AND Date < '2100-01-01') COMMENT "Date",
Year INT CHECK (Year >= 1800 AND Year < 2100) COMMENT "Year",
Quarter VARCHAR(2) CHECK (Quarter IN ('Q1', 'Q2', 'Q3', 'Q4')) COMMENT "Quarter",
FY VARCHAR(7) COMMENT "Fiscal Year (July - June)",
QFY VARCHAR(2) CHECK (QFY IN ('Q1', 'Q2', 'Q3', 'Q4')) COMMENT "Quarter of Fiscal Year",
Month VARCHAR(9) CHECK (Month IN ('January' , 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December')) COMMENT "Month",
Day INT CHECK (Day > 0 AND Day < 32) COMMENT "Day",
I_Overnight DOUBLE COMMENT "Daily Interbank Money-Market Rates: Overnight (%)",
I_7day DOUBLE COMMENT "Daily Interbank Money-Market Rates: 7-day (%)",
I_Overall DOUBLE COMMENT "Daily Interbank Money-Market Rates: Overall (%)",
PRIMARY KEY (Date)
) COMMENT = "Daily";
So if I execute a query like
SELECT * FROM BOU_I NATURAL JOIN BOU_FS NATURAL JOIN BOU_MMI_AF;
using a statistical software environment like R or STATA connecting to the database using a MySQL connector, I'd like to see a table similar to the one shown in the figure, where I can retrieve both the names of the variables and the labels stored as comments in the DDL.
I would structure your data differently. I would put all your measures in a single table and have a single measure per row. I would then add a DATE table (so that you have the week/month/quarter/year values for each metric date) and a METRIC_TYPE table that holds the labels for each metric code.
By normalising the data like this I think you have a more flexible design and it'll allow you to do what you want.
This is only for illustration of what I mean - it is not meant to be a definitive design:
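A minimal sketch of the layout described above, written against SQLite through Python's built-in sqlite3 module so that it runs as-is. The table and column names (metric_type, date_dim, measure) and the sample value are hypothetical; only the label text is taken from the question's DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metric_type (          -- hypothetical: one row per metric code
    metric_code TEXT PRIMARY KEY,
    label       TEXT NOT NULL
);
CREATE TABLE date_dim (             -- hypothetical: calendar attributes per date
    metric_date TEXT PRIMARY KEY,
    year        INTEGER NOT NULL,
    month       INTEGER NOT NULL
);
CREATE TABLE measure (              -- a single measure per row
    metric_date TEXT NOT NULL REFERENCES date_dim (metric_date),
    metric_code TEXT NOT NULL REFERENCES metric_type (metric_code),
    value       REAL NOT NULL,
    PRIMARY KEY (metric_date, metric_code)
);
INSERT INTO metric_type VALUES ('NFA', 'Net Foreign Assets (NFA) (Shs billion)');
INSERT INTO date_dim VALUES ('2020-01-31', 2020, 1);
INSERT INTO measure VALUES ('2020-01-31', 'NFA', 1.5);   -- made-up value
""")

# The label now travels with the data through an ordinary join.
labelled = conn.execute("""
    SELECT m.metric_date, d.year, m.metric_code, t.label, m.value
    FROM measure m
    JOIN date_dim d    ON d.metric_date = m.metric_date
    JOIN metric_type t ON t.metric_code = m.metric_code
""").fetchall()
```

Because the label is a regular column, any connector that can run a SELECT returns it alongside the values.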
So I am pretty happy with the suggestion of @NickW. For reference I am sharing my final schema below. I still have some questions regarding it. I mostly query the DATA table directly (which has some 700,000 obs), and join information from the TIME, SERIES and DATASET tables as needed. I noticed that retrieving larger amounts of data can take some time. So I wondered: am I indexing this optimally?
Then, there are a few computed columns: the NDatasets column in DATASOURCE counts the number of DSID by Source in the DATASET table, and the Updated column in DATASET shows when data was last added to a particular dataset. DS_From, DS_To, S_From and S_To give the maximum time range where data is available for a given dataset and series. Currently, I am doing all these computations in R and inserting the data. I wonder if these computations could be done in MySQL, so as to have self-updating columns?
Grateful for any further comment on this.
DDL:
DROP SCHEMA IF EXISTS TSDB;
CREATE SCHEMA IF NOT EXISTS TSDB;
USE TSDB;
CREATE TABLE IF NOT EXISTS DATASOURCE (
Source VARCHAR(120),
Source_Url VARCHAR(200),
NDatasets INT NOT NULL,
Desription VARCHAR(3000) NOT NULL,
Access VARCHAR(3000) NOT NULL,
PRIMARY KEY (Source)
);
CREATE TABLE IF NOT EXISTS DATASET (
DSID VARCHAR(30), -- INT
Dataset VARCHAR(120) NOT NULL,
Frequency VARCHAR(9) NOT NULL CHECK (Frequency IN ('Daily' , 'Monthly', 'Quarterly', 'Annual CY', 'Annual FY')),
DS_From DATE CHECK (DS_From >= '1800-01-01' AND DS_From < '2100-01-01'),
DS_To DATE CHECK (DS_To >= '1800-01-01' AND DS_To < '2100-01-01'),
Updated DATE CHECK (Updated >= '1800-01-01' AND Updated < '2100-01-01'),
Desription VARCHAR(3000) NOT NULL,
Source VARCHAR(120), -- NOT NULL
DS_Url VARCHAR(200),
PRIMARY KEY (DSID),
FOREIGN KEY (Source) REFERENCES DATASOURCE (Source) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE INDEX idx_dataset_source ON DATASET (Source);
CREATE TABLE IF NOT EXISTS SERIES (
DSID VARCHAR(30), -- INT
Series VARCHAR(30) NOT NULL,
Label VARCHAR(120) NOT NULL,
S_From DATE CHECK (S_From >= '1800-01-01' AND S_From < '2100-01-01'),
S_To DATE CHECK (S_To >= '1800-01-01' AND S_To < '2100-01-01'),
S_Source VARCHAR(120),
S_Url VARCHAR(200),
PRIMARY KEY (DSID, Series),
FOREIGN KEY (DSID) REFERENCES DATASET (DSID) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE INDEX idx_series_DSID ON SERIES (DSID);
CREATE TABLE IF NOT EXISTS TIME (
Date DATE UNIQUE CHECK (Date >= '1800-01-01' AND Date < '2100-01-01'),
Year INT NOT NULL CHECK (Year >= 1800 AND Year < 2100),
Quarter INT NOT NULL CHECK (Quarter >= 1 AND Quarter <= 4),
FY CHAR(7) NOT NULL,
QFY INT NOT NULL CHECK (QFY >= 1 AND QFY <= 4),
Month INT NOT NULL CHECK (Month >= 1 AND Month <= 12),
Day INT NOT NULL CHECK (Day > 0 AND Day < 32),
PRIMARY KEY (Date)
);
CREATE TABLE IF NOT EXISTS DATA (
Date DATE,
DSID VARCHAR(30),
Series VARCHAR(30),
Value DOUBLE NOT NULL,
PRIMARY KEY (Date, DSID, Series),
FOREIGN KEY (DSID) REFERENCES DATASET (DSID) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (DSID, Series) REFERENCES SERIES (DSID, Series) ON DELETE CASCADE ON UPDATE CASCADE,
FOREIGN KEY (Date) REFERENCES TIME (Date) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE INDEX idx_data_DSID ON DATA (DSID);
CREATE INDEX idx_data_series ON DATA (DSID, Series);
CREATE INDEX idx_data_date ON DATA (Date);
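One possible way to get "self-updating" values for NDatasets and the DS_From / DS_To range is to derive them in views instead of storing them. The sketch below is only an illustration under assumptions: it runs on SQLite via Python's sqlite3 module, uses trimmed-down copies of the DATASET and DATA tables, and the view names and sample rows are made up. MySQL also supports CREATE VIEW, but the statements have not been checked against it here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Trimmed-down versions of the DATASET and DATA tables above.
CREATE TABLE DATASET (DSID TEXT PRIMARY KEY, Source TEXT);
CREATE TABLE DATA (
    Date TEXT, DSID TEXT, Series TEXT, Value REAL NOT NULL,
    PRIMARY KEY (Date, DSID, Series)
);

-- Hypothetical views standing in for NDatasets and DS_From / DS_To.
CREATE VIEW V_SOURCE_COUNTS AS
    SELECT Source, COUNT(DSID) AS NDatasets
    FROM DATASET
    GROUP BY Source;
CREATE VIEW V_DATASET_RANGE AS
    SELECT DSID, MIN(Date) AS DS_From, MAX(Date) AS DS_To
    FROM DATA
    GROUP BY DSID;

-- Made-up sample rows.
INSERT INTO DATASET VALUES ('BOU_I', 'Example Source'), ('BOU_FS', 'Example Source');
INSERT INTO DATA VALUES ('2020-01-01', 'BOU_I', 'I_Overall', 9.5),
                        ('2020-03-01', 'BOU_I', 'I_Overall', 9.7);
""")

# The views are recomputed on every query, so they never go stale.
source_counts = conn.execute("SELECT * FROM V_SOURCE_COUNTS").fetchall()
dataset_range = conn.execute("SELECT * FROM V_DATASET_RANGE").fetchall()
```

The trade-off is that the aggregates are evaluated at query time rather than at insert time.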
EER Diagram:

How to update a column based on an amount of time in SQL

I want to create a stored procedure to update a column based on an amount of time. For example, to update the interest generated column every 15 days.
Here is my code. Please help.
create table Loan(
Loan_ID int not null primary key,
Loan_custID int not null foreign key references Customers(Cust_ID),
Loan_Amount int not null,
Loan_Interest int not null,
Loan_Date date not null unique,
)
Create table Interestgenerated(
IG_ID int not null primary key,
Loan_ID int not null foreign key references Loan1(Loan_ID),
Loan_Date date null foreign key references Loan1(Loan_Date),
IG_Amount int not null,
IG_Date datetime not null
)
create procedure InsertINtoInterestgenerated1
@PresentDate Datetime
as
set @PresentDate=getdate()
select Loan_ID from Loan
set IG_Date=Loan_Date
IG_Date=dateadd(day,15, IG_Date)
if @PresentDate=IG_Date
begin
update Interestgenerated1 set IG_Date = @PresentDate, IG_Amount=IG_Amount*0.15
end
Since you want to automate the update of the IG_Amount column every 15 days,
you can schedule a job to run every 15 days at midnight, for example on the 1st and 16th of every month.
The following related question might help you:
"how to schedule a job for sql query to run daily?"
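As a rough idea of what such a scheduled job could run each day, here is a sketch using Python's built-in sqlite3 module instead of SQL Server. Several things are assumptions, not taken from the question: the function name accrue_interest is hypothetical, one Interestgenerated row is inserted per 15-day period (rather than updating a row in place), and Loan_Interest is treated as a percentage of Loan_Amount.

```python
import sqlite3
from datetime import date

def accrue_interest(conn, today):
    """Insert one row per loan whose age is a positive multiple of 15 days."""
    inserted = 0
    loans = conn.execute(
        "SELECT Loan_ID, Loan_Amount, Loan_Interest, Loan_Date FROM Loan"
    ).fetchall()
    for loan_id, amount, rate, loan_date in loans:
        age_days = (today - date.fromisoformat(loan_date)).days
        if age_days > 0 and age_days % 15 == 0:
            # Assumption: Loan_Interest is a percentage applied to the principal.
            conn.execute(
                "INSERT INTO Interestgenerated (Loan_ID, IG_Amount, IG_Date) "
                "VALUES (?, ?, ?)",
                (loan_id, amount * rate / 100, today.isoformat()),
            )
            inserted += 1
    conn.commit()
    return inserted

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Loan (
    Loan_ID INTEGER PRIMARY KEY, Loan_Amount INTEGER NOT NULL,
    Loan_Interest INTEGER NOT NULL, Loan_Date TEXT NOT NULL
);
CREATE TABLE Interestgenerated (
    IG_ID INTEGER PRIMARY KEY, Loan_ID INTEGER NOT NULL REFERENCES Loan (Loan_ID),
    IG_Amount REAL NOT NULL, IG_Date TEXT NOT NULL
);
INSERT INTO Loan VALUES (1, 1000, 15, '2024-01-01');   -- made-up sample loan
""")
```

Running the job daily and letting it decide which loans are due avoids tying the schedule to calendar dates such as the 1st and 16th.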

Storing Schedule Information

I need to create a table to hold a schedule for meetings.
A meeting can be scheduled to be:
Daily
'Every X days', where X can be between 1 and 6.
Ending after X sessions. Where 'sessions' is basically the number of repeats.
Weekly
Which days during the week it can occur. Mon, Tue, etc. Can select more than one day per week.
The Date on which it ends.
Monthly
User can select the day of the month on which it occurs (1st, 2nd etc)
OR they can select from a lookup of '1st, 2nd, 3rd, 4th or Last' and a Day 'Mon, Tues', saying, for example "The 2nd Friday" of the month.
How can I store all these scenarios in a single table?
I was thinking:
CREATE TABLE schedule
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
StartDate DATETIME NOT NULL,
EndTime TIME NULL,
RepeatTypeID INT NOT NULL, -- Daily, Weekly, Monthly, None
-- For Daily
EveryDayCount INT NULL, -- to handle 'every 3 days'
RepeatCount INT NULL, -- How many occurrences. Can be shared with different RepeatTypes
-- Weekly
IsMonday BIT,
IsTuesday BIT,
-- etc.: a field per day selection. Is there a better way?
-- Monthly
MonthlyDayNumber INT NULL,
MonthlyRepeatIntervalID INT, -- Lookup table with '1st, 2nd, 3rd, 4th, Last'
MonthlyDayRepeatSelection INT -- Lookup on Monday, Tuesday etc
)
But this seems inefficient. Is there a better design pattern for these sorts of requirements?
I once implemented the same functionality, and I found that ease of retrieval and of edit/update mattered far more than ease of storage.
You don't want to calculate all the dates every time you query the DB for meeting dates. Say you have a function like showAllMeetingsForADate(somedate date): you would not want to calculate the meeting dates at run time.
Overall, the best approach is to store the rules used to calculate the meetings in one table and all the resulting meeting dates in another table, like below.
For the storage of the meeting information itself, you should go with a normalized form.
Schedule Detail Tables
CREATE TABLE DailyScheduleDetails
(
ScheduleDetailsID INT PRIMARY KEY IDENTITY(1,1),
RecurrenceCount INT NOT NULL
)
CREATE TABLE WeeklyScheduleDetails
(
ScheduleDetailsID INT PRIMARY KEY IDENTITY(1,1),
OnMonday bit,
OnTuesday bit,
OnWednesday bit,
-- ...
OnSunday bit,
EndByDate Date NOT NULL
)
CREATE TABLE MonthlyScheduleDetails
(
ScheduleDetailsID INT PRIMARY KEY IDENTITY(1,1),
MonthlyDayNumber INT NULL,
MonthlyRepeatIntervalID INT, -- Lookup table with '1st, 2nd, 3rd, 4th, Last'
-- Here I'd suggest using 0 for Last
MonthlyDayRepeatSelection INT -- Lookup on Monday, Tuesday etc
)
Schedule
CREATE TABLE schedule
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
StartDateTime DATETIME NOT NULL,
EndDateTime DATETIME NULL,
RepeatTypeID INT NOT NULL, -- Daily, Weekly, Monthly, None
ScheduleDetailsID INT
)
MeetingDates
CREATE TABLE MeetingDates
(
ID INT NOT NULL PRIMARY KEY IDENTITY(1,1),
MeetingID int,
MeetingStartDate datetime,
MeetingEndDate datetime -- because you can have meeting spanning days like 11:00 PM to 1:00 AM
--,user or guest information too
,CONSTRAINT FK_MeetingDates_Schedule FOREIGN KEY (MeetingID)
REFERENCES Schedule(ID)
)
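To illustrate how rows for MeetingDates might be precomputed from a daily rule, here is a small sketch in Python. The function expand_daily and its parameters are hypothetical names introduced for this example; it covers only the 'every X days, ending after N sessions' case.

```python
from datetime import datetime, timedelta

def expand_daily(start, every_days, sessions, duration):
    """Return (MeetingStartDate, MeetingEndDate) pairs for a daily schedule."""
    occurrences = []
    for i in range(sessions):
        meeting_start = start + timedelta(days=i * every_days)
        # The end is start + duration, so a meeting may spill into the next day.
        occurrences.append((meeting_start, meeting_start + duration))
    return occurrences

# Every 3 days, 3 sessions, 23:00 to 01:00 (made-up example values).
meeting_dates = expand_daily(
    start=datetime(2024, 1, 1, 23, 0),
    every_days=3,
    sessions=3,
    duration=timedelta(hours=2),
)
```

Each resulting pair would become one row in MeetingDates, so a lookup for a given date is a plain range query rather than a recalculation.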
Use an existing standard. That standard is iCalendar RRULEs and EXDATEs.
Just store the recurrence rule in the DB as a varchar.
Use an existing library (C#) to calculate upcoming dates.
Even though you have daily, weekly, monthly etc., a meeting will still occur on some specific day, right?
Thus
CREATE TABLE schedule
(
ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
StartDate DATETIME NOT NULL,
EndDate DATETIME NOT NULL,
RepeatTypeID INT NOT NULL, -- Daily, Weekly, Monthly, None
RepeatCount INT NOT NULL,
DayOn INT NOT NULL -- can be a calculated field based on start date using DAY function
)
I believe this can capture all your schedule options.

SQL DDL - 2 CHECK Constraints on 1 Attribute

Below is the DDL for the table I want to create. However, I want the attribute 'Appointment_datetime' to be a future date and during working hours (between 8:00AM and 5:00PM). I can get the future date part with CHECK (Appointment_datetime >= GETDATE()), but how do I get between 8AM and 5PM on top of this constraint?
CREATE TABLE tAppointment
(
Appointment_ID int NOT NULL PRIMARY KEY,
Appointment_datetime datetime NOT NULL, -- CHECK CONSTRAINTS NEEDED
Appointment_week int NOT NULL,
Appointment_room varchar(5) NOT NULL,
Vet_ID int NOT NULL REFERENCES tVet(Vet_ID),
Owner_ID int NOT NULL REFERENCES tOwner(Owner_ID),
Pet_ID int NOT NULL REFERENCES tPet(Pet_ID)
)
You can just add it in. Here is a method using the hour:
CHECK (Appointment_datetime >= GETDATE() AND
DATEPART(HOUR, Appointment_datetime) BETWEEN 8 AND 16
)
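The working-hours half of the constraint can be tried out with a small sketch on SQLite via Python's sqlite3 module. This is an assumption-laden stand-in for SQL Server: strftime('%H', ...) plays the role of DATEPART(HOUR, ...), the helper try_insert is a hypothetical name, and the "future date" part is left out because SQLite does not allow the current time inside a CHECK constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE tAppointment (
    Appointment_ID INTEGER PRIMARY KEY,
    Appointment_datetime TEXT NOT NULL
        -- hour 8..16 covers 08:00:00 up to 16:59:59
        CHECK (CAST(strftime('%H', Appointment_datetime) AS INTEGER) BETWEEN 8 AND 16)
)
""")

def try_insert(appointment_id, when):
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO tAppointment VALUES (?, ?)", (appointment_id, when)
            )
        return True
    except sqlite3.IntegrityError:
        return False

morning_ok = try_insert(1, "2030-01-07 09:30:00")
evening_ok = try_insert(2, "2030-01-07 18:00:00")
late_afternoon_ok = try_insert(3, "2030-01-07 16:59:00")
```

The point to notice is that the hour is taken from the appointment column itself, not from the current time.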
Note: If you want to take weekends and holidays into account, that is more difficult and probably requires a user-defined function.

How to update the date for a specific ID in Oracle SQL?

I'm setting up a database for a scuba diving company and have a SQL table with values for Student ID (SID), Instructor ID (IID), Item Borrowed (ITEMID), Equipment Borrow Date (BorrowDate), and Equipment Return Date (ReturnDate). How do I change the Equipment Return Date for one of the students? I'd like to add an extra 2 days to the ReturnDate. I created the Borrows table like this:
CREATE TABLE BORROWS(
SID CHAR(15),
ITEMID CHAR(15),
IID CHAR(15),
BORROW_DATE DATE,
RETURN_DATE DATE,
PRIMARY KEY(SID, ITEMID),
FOREIGN KEY(SID) REFERENCES STUDENT(SID) ON DELETE CASCADE,
FOREIGN KEY(ITEMID) REFERENCES EQUIPMENT(ITEMID) ON DELETE CASCADE,
FOREIGN KEY(IID) REFERENCES INSTRUCTOR(SSN) ON DELETE CASCADE
);
I tried doing this in my SQL file:
SELECT SID, ADD_DATE(RETURN_DATE, INTERVAL 2 DAY)
FROM BORROWS
WHERE SID = '005' AND IID = '108';
I'm getting this error back:
SELECT SID, ADD_DATE(RETURN_DATE, INTERVAL 2 DAY)
*
ERROR at line 1:
ORA-00907: missing right parenthesis
Can't figure out where the error is in my code...
Adding two days to a date is as simple as + 2.
select return_date + 2
from borrows
where sid='005'
and iid='108'
The syntax you are looking for might be:
select (return_date + INTERVAL '2' DAY)
from borrows
where sid='005'
and iid='108'
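The queries above only display the shifted date. To actually change the stored value, an UPDATE is needed; in Oracle that would presumably be along the lines of UPDATE borrows SET return_date = return_date + 2 WHERE sid = '005' AND iid = '108'. The sketch below shows the same idea in runnable form using Python's sqlite3 module, so the date arithmetic uses SQLite's date() function rather than Oracle syntax, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE BORROWS (
    SID TEXT, ITEMID TEXT, IID TEXT, BORROW_DATE TEXT, RETURN_DATE TEXT,
    PRIMARY KEY (SID, ITEMID)
);
-- Made-up sample rows.
INSERT INTO BORROWS VALUES ('005', 'I1', '108', '2024-03-01', '2024-03-10');
INSERT INTO BORROWS VALUES ('006', 'I2', '108', '2024-03-01', '2024-03-10');
""")

# Push the return date back by two days for one student only.
conn.execute(
    "UPDATE BORROWS SET RETURN_DATE = date(RETURN_DATE, '+2 days') "
    "WHERE SID = ? AND IID = ?",
    ("005", "108"),
)
conn.commit()

return_dates = dict(conn.execute("SELECT SID, RETURN_DATE FROM BORROWS"))
```

Only the row matching the WHERE clause changes; the other student's return date is left as it was.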