Adding rules and assigning values in an ontology - conditional-statements

Consider a degree recommendation ontology that is supposed to recommend a degree programme based on the set of results a student gets in an exam.
E.g. a student chooses the stream he sat the exam in, and his results for chemistry, physics, and maths are C, C, C. Based on that, he should be recommended a degree from a particular university.
The class hierarchy of the ontology is as follows:
stream
subject
University
Faculty
Department
Degree
Object properties have been defined as well.
Should universities such as "University A" and "University B" have their properties assigned on the class or on the individual?
How can I assign a different result to each subject?
How can I add rules and conditions to the ontology?
E.g. the engineering stream consists of only three subjects: chemistry, physics, and maths.
E.g. if you get a result of 'F' in any one of those subjects, a degree won't be recommended.
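To make the two example rules concrete, here is a minimal sketch in plain Python. It is not an ontology rule language, and the names STREAM_SUBJECTS and is_eligible are hypothetical; it only spells out the conditions that the ontology's rules would have to encode.

```python
# Hypothetical rule data; only the engineering example comes from the question.
STREAM_SUBJECTS = {"engineering": {"chemistry", "physics", "maths"}}


def is_eligible(stream, results):
    """Return True if a degree may be recommended for these results."""
    allowed = STREAM_SUBJECTS[stream]
    # Rule 1: the results must cover exactly the subjects of the stream.
    if set(results) != allowed:
        return False
    # Rule 2: an 'F' in any subject blocks the recommendation.
    return all(grade != "F" for grade in results.values())


print(is_eligible("engineering", {"chemistry": "C", "physics": "C", "maths": "C"}))  # True
print(is_eligible("engineering", {"chemistry": "C", "physics": "F", "maths": "C"}))  # False
```

Keeping the result per (student, subject) pair, as the dictionary does here, is one way to give each subject its own result.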

Related

How should I design my RESTful API in this case?

I've got a DB with a lot of people of type (PLAYER, DOCTOR, TEACHER), where each person has an ID and a location ID. There are some common fields like first name and last name, but also some fields that are specific to a person's occupation: number of injuries / the most serious injury type for PLAYER, number of patients for DOCTOR, and can_teach_math for TEACHER.
I want to build an API that computes the total compensation of these people and accepts an optional list of person IDs and an optional list of location IDs. For example, if someone passes 3 person IDs, the API should return a response with an array where each row corresponds to a specific person. If someone passes location IDs, the API should return all people who live in those locations.
Originally, I was thinking I could just return the person IDs with a few basic fields:
request = {..., "person_id": ["person-123", "person-456"], "location_id": ["location-1", "location-2"]}
response = [
{"person_id": "person-123", "first_name": "Alex", "compensation": 100},
{"person_id": "person-456", "first_name": "Alex2", "compensation": 102},
# anyone who lives in location-1, location-2
{"person_id": "person-13", "first_name": "Alex3", "compensation": 50},
{"person_id": "person-12", "first_name": "Alex4", "compensation": 52},
]
However, a UI engineer showed up and said they also want to see
the fields that are specific to a person's occupation: number of injuries / the most serious injury type for PLAYER, number of patients for DOCTOR, and can_teach_math for TEACHER
in the response, even though it makes the API denormalized. That said, it makes sense to me, since loading all the object info through GET persons/{ID} might take quite a long time. Without going into too much detail, let's say we don't care about speed: is the proper way to design a RESTful API not to return those occupation-specific fields?
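For illustration only, the response shape the UI engineer is asking for could be sketched like this. The field names other than can_teach_math, the "details" key, and the build_row helper are assumptions made for this sketch, not part of any existing API.

```python
# Occupation-specific fields; names other than can_teach_math are hypothetical.
OCCUPATION_FIELDS = {
    "PLAYER": ("number_of_injuries", "most_serious_injury_type"),
    "DOCTOR": ("number_of_patients",),
    "TEACHER": ("can_teach_math",),
}


def build_row(person, compensation):
    """Build one response row: common fields plus a per-type 'details' object."""
    return {
        "person_id": person["person_id"],
        "first_name": person["first_name"],
        "type": person["type"],
        "compensation": compensation,
        # Only the fields that belong to this person's occupation are copied.
        "details": {f: person[f] for f in OCCUPATION_FIELDS[person["type"]]},
    }


doctor = {
    "person_id": "person-456",
    "first_name": "Alex2",
    "type": "DOCTOR",
    "number_of_patients": 40,
    "location_id": "location-1",
}
print(build_row(doctor, 102))
```

Nesting the occupation-specific values under one key keeps the common part of every row identical, so a client that only needs compensation can ignore the rest.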

What is the industry standard way to store country / state / city in a database of web APP?

For country and state there are ISO codes. For cities there are none.
Method 1:
Store in one column:
[Country ISO]-[State ISO]-[City Name]
Method 2:
Store in 3 separate columns.
Also, how should city names be handled if there is no unique identifier?
First and foremost, use three separate columns to store your data. If you want to create a unique identifier, the easiest way is to assign a random 3-10 digit code, depending on the size of your data set. However, I would suggest concatenating [country-code]-[state-code]-[code] if you have a small data set and want some human readability. The [code] part can be several things. Here are some ideas:
a random ID, or even the database row ID
the licence plate number/code, if one exists for the city
the phone area code of the city or of its center
the same logic may apply to zip codes
a combination of the latitude and longitude of the city center, up to a certain precision
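As a rough sketch of the three-column layout with a concatenated identifier, assuming SQLite and a hypothetical city table (the table name, column names, and the make_location_code helper are made up for illustration; the sample row uses a phone area code as the city code):

```python
import sqlite3


def make_location_code(country_code, state_code, city_code):
    """Concatenate the parts into a human-readable [country]-[state]-[code] ID."""
    return f"{country_code}-{state_code}-{city_code}"


def create_city_table(conn):
    # Country, state, and city live in separate columns; the UNIQUE
    # constraint guarantees that the concatenated code is unique too.
    conn.execute("""
        CREATE TABLE city (
            id           INTEGER PRIMARY KEY,
            country_code TEXT NOT NULL,
            state_code   TEXT NOT NULL,
            city_name    TEXT NOT NULL,
            city_code    TEXT NOT NULL,
            UNIQUE (country_code, state_code, city_code)
        )
    """)


conn = sqlite3.connect(":memory:")
create_city_table(conn)
conn.execute(
    "INSERT INTO city (country_code, state_code, city_name, city_code) "
    "VALUES (?, ?, ?, ?)",
    ("US", "CA", "San Francisco", "415"),
)
row = conn.execute("SELECT country_code, state_code, city_code FROM city").fetchone()
print(make_location_code(*row))  # US-CA-415
```

The concatenated code is derived on demand rather than stored, so the three columns remain the single source of truth.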
Here are some more references that can be used:
ISO 3166 is the standard for country codes. Its second part, ISO 3166-2, also defines codes for subdivisions such as states or provinces, and for some countries these include cities.
As mentioned, IATA has both airport and city code lists, but they are hard to obtain.
The UN location list (UN/LOCODE) is a good mention, but it can be difficult to tell the levels apart, since an airport code, a city code, and a borough code can all be on the same list. Each UN/LOCODE is unique, though. (ICAO also has airport codes; they are similar to IATA's but not the same.)
There are several data sets out there, such as OpenTravelData or GeoNames, that can be used as a start but may require some digging and converting. They provide unique codes for locations, and many others can be found.
Bonus:
I would suggest checking Schema.org's City schema and the other Place schemas for a well-thought-out setup.

SQL Database Design Timesheet Employee vs Crews

First question here, so please let me know how to ask the question better if the below is unhelpful.
TLDR - Should I have separate employee time tables for employees assigned to crews and for those who are not?
I'm trying to design a database, following the book 'Database Design for Mere Mortals', that tracks employee times. I'm trying to replace the weekly timesheets and paper crew sheets (with start & end times for the crew) currently in use. There are also individual weekly timesheets for employees not assigned to crews. Crew sheets also sometimes have an asterisk with a note if someone is sick etc.
Projects relate to crews 1:N, and individual employees not assigned to crews are assigned directly to the project.
Employees are assigned to crews, normally 1:1, but the headache comes when it is 1:N.
'Has' Relationships
So at the moment there are different types of crews, say A, B, C, D, E.
Crews D & E just fill in weekly timesheets (project, names and times, so crews D and E will both be on the same project), and their daily sheets don't include times. Sometimes, about 10% of the time, employees will be on both D & E on the same day.
A, B, C will have daily times on the daily sheet, but if an employee is on crew C these times take precedence over the times on sheets A or B (if they are also on A or B).
The obvious answer of {employee, datetimestart, datetimeend} won't work, as I care where the times have come from (crew, individual if an exception to the crew e.g. sick, individual not assigned to a crew).
I can extend it to {employee, crewtype, datetimestart, datetimeend}, but this doesn't handle the case where an employee is on both D & E. Can I put DE or F in this case?
Then how do I deal with those assigned to the project only?
If I have {employee, crewtype, projectref, datetimestart, datetimeend}, the projectref is redundant and can be derived from crewtype when that is not null. Is this a reasonable approach, or would having separate tables be better?
EDIT - or should I have one table {crewid, datetimestart, datetimeend}, derive the times for employees from the crew-employee relationship, and have a separate {employee, datetimestart, datetimeend, category} table, with category saying whether it is an exception (e.g. sick) or a non-assigned individual?
There are a lot of complicated rules in this description. If you want to create a schema that can never break these rules, it will likely be a very complicated normalised schema. I would suggest that some of the rules are best handled in your application logic.
In terms of how you store the data, remember that you always need to cater for the exception cases. Something that happens 10% of the time, or 1%, or one time in a thousand still needs to be catered for, otherwise you just can't save the data.
I would tend to design for the most detailed level, which might be employee, crew, project, date, times, and possibly also add an allocation column that defaults to 1 but could be 0.5 if the person is half allocated to each of two crews at the same time.
You could then write queries so that if a person is allocated to a crew, they get the crew time by default unless they have some overridden values, or whatever other rules you need.
Really, this is the sort of iterative modelling you would do with your business users as you tease out the design. It is not always easy in a real-world scenario.
Sorry there is no definitive model here, but these are a few tips to consider that might help along the way. Good luck with it.
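A minimal sketch of that idea, assuming SQLite and two hypothetical tables, crew_time and employee_time (all table and column names here are made up for illustration). An employee row with NULL times falls back to the crew's times, and a row with its own times acts as the override:

```python
import sqlite3


def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE crew_time (
            crew_id TEXT, work_date TEXT, start_time TEXT, end_time TEXT,
            PRIMARY KEY (crew_id, work_date)
        );
        -- Most detailed level: one row per employee, crew, project and date.
        CREATE TABLE employee_time (
            employee_id TEXT NOT NULL,
            crew_id     TEXT,              -- NULL: assigned to the project only
            project_id  TEXT NOT NULL,
            work_date   TEXT NOT NULL,
            start_time  TEXT,              -- NULL: use the crew's time
            end_time    TEXT,
            allocation  REAL NOT NULL DEFAULT 1.0
        );
    """)
    conn.execute("INSERT INTO crew_time VALUES ('A', '2024-01-08', '07:00', '15:00')")
    conn.executemany(
        "INSERT INTO employee_time "
        "(employee_id, crew_id, project_id, work_date, start_time, end_time) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("emp-1", "A", "proj-1", "2024-01-08", None, None),         # crew default
            ("emp-2", "A", "proj-1", "2024-01-08", "07:00", "11:00"),   # override
            ("emp-3", None, "proj-1", "2024-01-08", "09:00", "17:00"),  # no crew
        ],
    )
    return conn


def effective_times(conn):
    """Return each employee's times, falling back to the crew's times."""
    return conn.execute("""
        SELECT e.employee_id,
               COALESCE(e.start_time, c.start_time),
               COALESCE(e.end_time, c.end_time),
               e.allocation
        FROM employee_time e
        LEFT JOIN crew_time c
               ON c.crew_id = e.crew_id AND c.work_date = e.work_date
        ORDER BY e.employee_id
    """).fetchall()


for row in effective_times(build_demo_db()):
    print(row)
```

Rules such as "crew C takes precedence over A or B" are not modelled here; they would sit in a query like this one or in application logic.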

The Specificity of Insertion Anomalies

I'm currently trying to understand the nuances of Insertion/Deletion/Modification anomalies in SQL.
Currently, the example I'm trying to understand is as follows:
ENROLLMENT
StudentID (PK)  StudentName  ClassID  ClassName
111             Joe          E1       English1
222             Bob          E1       English1
333             Mary         H1       History1
The problem the example wants me to answer is:
Which of the following causes an insertion anomaly?
with the answers being
Inserting a Student without a Class
and
Inserting a Class without a Student
I don't really understand why one of these answers is more right than the other, why, or how. It seems to me like either could be acceptable. Thanks in advance.
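One way to experiment with the two cases is to rebuild the table and try both inserts. The sketch below uses SQLite and assumes that StudentID is a NOT NULL primary key while the other three columns are nullable. Those constraints are an assumption, not something the example states, and the outcome depends on them.

```python
import sqlite3


def try_insert(conn, row):
    """Attempt one insert and report whether the table's constraints allow it."""
    try:
        conn.execute("INSERT INTO enrollment VALUES (?, ?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


conn = sqlite3.connect(":memory:")
# Assumed constraints: only the primary key is mandatory.
conn.execute("""
    CREATE TABLE enrollment (
        student_id   TEXT PRIMARY KEY NOT NULL,
        student_name TEXT,
        class_id     TEXT,
        class_name   TEXT
    )
""")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [
        ("111", "Joe", "E1", "English1"),
        ("222", "Bob", "E1", "English1"),
        ("333", "Mary", "H1", "History1"),
    ],
)

# A student without a class: the key is present, the class columns are NULL.
print(try_insert(conn, ("444", "Ann", None, None)))
# A class without a student: there is no value for the primary key.
print(try_insert(conn, (None, None, "M1", "Math1")))
```

Under these assumed constraints the second insert is rejected because the key is missing. With different constraints (for example NOT NULL on ClassID) the first insert would be rejected as well.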
You need to think in terms of how data is added to a system in practice (i.e. what series of events occurs in the real world).
In this case you would create a set of classes prior to registration, and then create students and assign them to classes when they turn up to register.
You would be unlikely to create a set of students first and then create and assign classes to each one.
A class might only be able to hold 30 students. How do you deal with any extra students who want to be registered for that class?
If you register 100 students and then decide to create classes, which subjects do you create?
Why do students decide to turn up to register? [Presumably because of the classes on offer.]
You can create as many classes as you're able to fit into your time-table. The number of students that actually register might mean a class is cancelled, but it has to exist in the first instance.
In summary, "Inserting a Student without a Class" would be more likely to cause an insertion anomaly.

How to name my enum elements?

I have a problem naming the elements in my application's data model.
In the application, the user can create his own metamodel. He does so by creating entity types; a type defines which properties an entity has. However, there are three kinds of entity types:
There is always exactly one instance of the type.
For instance, I want to model the company I am working for. It has a name, a share price and a number of employees. These values change over time, but there is always exactly one company.
There are different instances of the type, each is unique.
Example: Cities. A city has a name and a population count, there are different cities and each city exists exactly once.
Each instance of the type defines multiple entities.
Example: Cars. A car has a color and a manufacturer. But there is not only one red Mercedes. And even though they are similar, red Mercedes #1 is different from red Mercedes #2.
So let's say you are a user of this tool and you understand the concept of these three flavors. You want to create a new entity type and are prompted to choose between options 1, 2 and 3. How would you name these options?
Edit:
Documentation and help are available to the user. Also, the user can be expected to have a technical/programming background, so understanding these three concepts should be no problem.
First of all, let me make sure I understand the problem.
Here's what you have (correct me if I'm wrong):
(# of instances, is/are unique)
(1, true)
(n, true)
(n, false)
If so,
for # of instances I would use single / plural,
and for is/are unique (or not unique) I would use unique / ununique.
So you'll get:
singleUnique
pluralUnique
pluralUnunique
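As a sketch, these three names could be written as a Python enum. The class name EntityKind and the two helper properties are assumptions made for illustration, and the member values reuse the names proposed above:

```python
from enum import Enum


class EntityKind(Enum):
    """The three flavors as (# of instances, is unique) combinations."""

    SINGLE_UNIQUE = "singleUnique"      # (1, true)
    PLURAL_UNIQUE = "pluralUnique"      # (n, true)
    PLURAL_UNUNIQUE = "pluralUnunique"  # (n, false)

    @property
    def allows_many_instances(self):
        return self is not EntityKind.SINGLE_UNIQUE

    @property
    def instances_are_unique(self):
        return self is not EntityKind.PLURAL_UNUNIQUE


kind = EntityKind("pluralUnique")
print(kind.name, kind.allows_many_instances, kind.instances_are_unique)
# PLURAL_UNIQUE True True
```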
That's the best I could think of. I don't know exactly who your users are or what the environment is, but if you have the option of adding tips (or documentation), you should definitely use it.