How to use FOAF as part of another ontology? - semantic-web

I want to have a class Professor which will have some properties: name, surname and nationality.
So far I have created a class Professor, a class Person and a class Nationality, some data properties for name and surname, and an object property hasNationality to relate a professor with a nationality.
Does it make sense to use FOAF for Person and maybe something like Group and member for the nationalities?
To do so I would need to import FOAF, right?
I guess my main question is: what reasons justify importing an upper ontology, and is this what people normally do?
In any case the ontology, in Turtle, is available here on GitHub.

Ontologies and more generally "Semantic web technologies" are dedicated to knowledge pooling.
As Sir Tim Berners-Lee put it in his 5-star scheme for Open Data, the highest level of openness is reached when you "link your data to other data to provide context". So it is a good thing!
As for the import, I don't think you need the whole FOAF ontology in your case. In my experience, importing every statement of an ontology matters when you need to work with many resources from the upper ontology (graph browsing, structure modification, ...). Otherwise, simply declaring a prefix can solve many problems without weighing down the application:
xmlns:foaf="http://xmlns.com/foaf/0.1/"
As for the nationality question, Group from FOAF can be a solution. Note that other ontologies, such as YAGO (Yet Another Great Ontology), may provide a better-suited answer. Some people create multiple imports/connections to increase the contextualisation of their knowledge base.
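To illustrate the prefix point: a prefix is only an abbreviation for a namespace IRI, so FOAF terms can be referenced without loading any FOAF axioms. A minimal Python sketch follows; the `ex` namespace and the `expand` helper are assumptions made for illustration, not part of the asker's ontology.

```python
# Prefix table. "foaf" and "rdfs" are the published namespaces; "ex" is a
# hypothetical namespace standing in for your own ontology.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "ex": "http://example.org/university#",
}

def expand(qname: str) -> str:
    """Expand a prefixed name such as 'foaf:Person' into a full IRI."""
    prefix, _, local = qname.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local

# ex:Professor is declared a subclass of foaf:Person purely by reference;
# nothing from the FOAF ontology itself is imported.
triple = tuple(expand(t) for t in ("ex:Professor", "rdfs:subClassOf", "foaf:Person"))
```

An actual import (owl:imports) would additionally pull in FOAF's own statements about Person, which only matters if your application or reasoner needs them.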

Related

Is it conventional to use verbs to describe relationships between classes in UML?

I've come across resources that depict UML diagrams with verbs like 'wrote' to describe how one class uses another. Does this convention exist in UML; is it overkill to add this convention to my designs?
ex:
Yes, this is a common convention: the name over the association (Wrote) is the name of the association. You may add the solid triangle to show the order of reading.
But often the associations are shown without name, or without the triangle, if this information is not important for the understanding of the diagram. Adding this systematically in the diagram might make it more difficult to read and give a feeling of information overload. So, up to you to find the right balance in your specific case.
Just trying to summarize a few experiences:
Using the name/triangle notation is often advantageous when working with business stakeholders. In that case the triangle is mandatory, because omitting it can lead to confusion. Not so in the above example, but it should be a modeling rule set in the domain.
Applying roles/multiplicities is practical when moving over to technical aspects. At that stage the label is no longer important, as it can be guessed from the role names. So the best approach is to have diagrams for business people with just the labels/triangles, and separate ones for techies containing roles/multiplicities.
If for any reason you want both notations, make sure you have enough space to distinguish between labels and role names. That rules out dense diagrams.
Like in a Chinese Restaurant: if there's all you can eat please listen to your stomach.

What is the difference between RDF Schema and Ontology?

I am new to Semantic Web and confused regarding RDFs and Ontology. Can someone explain the difference between RDF Schema and Ontology?
RDF Schema (RDFS) is a language for writing ontologies.
An ontology is a model of (a relevant part of) the world, listing the types of object, the relationships that connect them, and constraints on the ways that objects and relationships can be combined.
A simple example of an ontology (though not written in RDFS syntax):
class: Person
class: Project
property: worksOn
worksOn domain Person
worksOn range Project
which says that in our model of the world, we only care about People and Projects. People can work on Projects, but not the other way around.
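One detail worth knowing: in RDFS, domain and range declarations are used to infer types rather than to reject data. A minimal Python sketch of that inference for the example above, where the instance names `alice` and `apollo` and the `infer_types` helper are hypothetical:

```python
RDF_TYPE = "rdf:type"

# Schema from the example: worksOn has domain Person and range Project.
DOMAIN = {"worksOn": "Person"}
RANGE = {"worksOn": "Project"}

def infer_types(triples):
    """Return the rdf:type triples entailed by the domain/range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN:
            inferred.add((s, RDF_TYPE, DOMAIN[p]))  # subject gets the domain class
        if p in RANGE:
            inferred.add((o, RDF_TYPE, RANGE[p]))   # object gets the range class
    return inferred

data = [("alice", "worksOn", "apollo")]  # hypothetical instance data
types = infer_types(data)
```

Here `types` contains the statements that `alice` is a Person and `apollo` is a Project, even though neither was typed explicitly.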
Do you mean "what is the difference between RDF Schema and the Web Ontology Language (OWL2)"? If so, there are a few main differences. Both are ways to create vocabularies of terms to describe data represented as RDF. OWL2 and its variants (OWL DL, OWL Full, OWL Lite) contain all the terms contained in RDFS but allow for greater expressiveness, including quite sophisticated class and property expressions. In addition, one of the variants (OWL Full) is expressive enough that reasoning over it is undecidable. Both are representable as RDF and both are W3C Web Standards.
If you want to compare RDFS and ontology, not specifically in the context above but in the context of the Semantic Web, then my advice would be to be very careful. Careful because you will find several distinct and not necessarily mutually exclusive camps: those with an interest in ontology from a philosophical perspective, those from a computing perspective, those who think the philosophical perspective should be the only perspective, and those who don't. If you are inclined any of those ways, you can end up having great debates. But if you want to engage in Semantic Web development, then the fastest route is to study and understand the Web Standards mentioned initially.
Conceptually there is no difference, i.e., RDFS can be used to create a (e.g. domain-specific) vocabulary or ontology, where RDFS bootstraps itself in conjunction with RDF (everything is at least an rdfs:Resource). Furthermore, in the context of Semantic Web technologies you could use OWL to describe advanced semantics of your ontology/vocabulary. See also this definition of ontology.
As per the spec, RDF schema is purely that - a schema or structure for defining things semantically. It gives you the vocabulary (key words and properties) for describing things. Think of it like an XML schema as used in XML documents and web pages.
An ontology is a classification hierarchy (for example, the biological taxonomy of life) normally combined with instances of those classes. It is used for classifying and reasoning.
What is an instance depends on how you define a taxonomy. It might be that you have an ontology of living creatures and so a living, breathing person is an instance of the ontological class "Homo Sapiens", or it might be that you have an ontology of species and so the entire Homo Sapiens species is an instance of the ontological class "Species".
In non technical terms, I would say RDFS is a language that helps to represent information. And an ontology is the term used to refer to all the information about a domain.
Cheers

Model diagram doesn't seem right. How else can I relate the objects?

I have an entity diagram from some analysis that I'd like to have someone look over. For some reason the System object just doesn't seem right to me. Is there a better way to relate the objects?
It's basically a user authentication/management system in its infancy.
http://www.dumpt.com/img/viewer.php?file=zlh8ltbtho4mutbbb3yk.gif
Cheers,
Mike
User and Company should have a common base class (they both have names and mail addresses); then you can link the System to this base class. That's a common pattern for business modeling; see, for example, chapter one of Martin Fowler's book "Analysis Patterns".
EDIT: Or, if you think this makes more sense, use System as the base class itself, put the email address there (and perhaps give System a better name like LegalPerson, CorporateBody or something like that).
Considering the password has a 1-to-1 relationship with the User, and is not keyed to any other tables, I'd suggest saving yourself an inner join and just making it another column in the property table. Otherwise, looks pretty good.
It's hard to evaluate the "rightness" of something without some metrics of comparison. The easiest metrics for class designs are queries.
Think up as many of the queries that you will eventually want to ask of this data. Write them down and see how the design supports them. If you're unhappy, try another design and see how the queries look then.

How do I validate the class diagram for a given domain?

I am working on car dealership business domain model/UML class diagram.
I am new to modeling, so I would like to know how to validate the class diagram. It's very important for me to have an appropriate, if not 100 percent correct, class diagram to use in further development (use cases, etc.).
Is it possible to build a completely incorrect model? Or are there only appropriate and less appropriate models?
If I have a Customer associated with SalesTeam modeling a customer being served by SalesTeam, is that wrong? I have seen in examples of Customer being associated with Order, Order with ItemOrder and ItemOrder with ItemInventory. Where the SalesTeam or Staff is associated with Order.
How do I validate my model and relationships?
To validate domain models, do the following.
Write use cases. During the writing, make sure you're using nouns and verbs in a consistent way. To be sure that your nouns make sense, be sure to record notes in the domain model.
Walk through each use case, following along on your domain model. Are the entities there? The relationships required for navigation? The attributes of each entity?
Since it's a domain model, try to avoid describing things as classes -- they're usually real-world entities.
For example "customer entity in direct relationship with sales team entity" is something you'll learn from the use cases. For example, customers are associated with orders, but the order is created by the sales team. So, you have two navigation paths between customer and order: direct and via the sales team. Both appear (to me) to be true.
You must compare your domain model with your use cases to be sure both agree.
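The walkthrough can also be done mechanically once the model and the use-case steps are written down. A rough Python sketch under stated assumptions: the entity names come from the question, but the data layout and the `gaps` helper are hypothetical.

```python
# Domain model as a set of entities plus undirected relationships between them.
entities = {"Customer", "SalesTeam", "Order"}
relationships = {
    ("Customer", "Order"),
    ("SalesTeam", "Order"),
    ("Customer", "SalesTeam"),
}

# Each use-case step names the two entities it navigates between.
use_case = [("Customer", "SalesTeam"), ("SalesTeam", "Order"), ("Order", "ItemOrder")]

def gaps(steps):
    """Return the steps the model cannot support: unknown entity or missing link."""
    missing = []
    for a, b in steps:
        known = a in entities and b in entities
        linked = (a, b) in relationships or (b, a) in relationships
        if not (known and linked):
            missing.append((a, b))
    return missing
```

Any step returned by `gaps(use_case)` points at an entity or relationship that the use case needs but the model does not yet contain.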
The short answer is that this is not very important.
Use your domain class diagrams to keep a note of what you think is in the domain, that is all. It is not your god, and it will not hurt you to change it as you go.
Domain experts should help you to validate the domain model.
As far as validating the specific relationships, as you develop the model further and investigate the collaborations between objects you will discover more and different relationships. You will need to revisit the domain model often during your analysis and development.
I don't think it matters that it's 'correct' up front (i.e. before you move onto looking at use cases and further analysis), only that it is useful - it gives you a conceptual model of the problem and what the main classes involved are. It isn't going to be finished until the software is no longer being developed or maintained.
If it represents the way you view the problem right now, it's good enough for you to start further analysis. Revise it as your view of the problem changes and you learn more.

Modeling Geographic Locations in a Relational Database

I am designing a contact management system and have come across an interesting issue regarding modeling geographic locations in a consistent way. I would like to be able to record locations associated with a particular person (mailing address(es) for work, school, home, etc.) My thought is to create a table of locales such as the following:
Locales (ID, LocationName, ParentID) where autonomous locations (such as countries, e.g. USA) are parents of themselves. This way I can have an arbitrarily deep nesting of 'political units' (COUNTRY > STATE > CITY or COUNTRY > STATE > CITY > UNIVERSITY). Some queries will necessarily involve recursion.
I would appreciate any other recommendations or perhaps advice regarding predictable issues that I am likely to encounter with such a scheme.
You might want to have a look at Freebase.com as a site that's had some open discussion about what a "location" means and what it means when a location is included in another. These sorts of questions can generate a lot of discussion.
For example, there is the obvious "geographic nesting", but there are less obvious logical nestings. For example, in a strictly geographic sense, Vatican City is nested within Italy. But it's not nested politically. Similarly, if your user is located in a research center that belongs to a university, but isn't located on the University's property, do you model that relationship or not?
Sounds like a good approach to me. The one thing that I'm not clear on when reading your post is what "parents of themselves" means. If this is to indicate that the locale does not have a parent, you're better off using null than the ID of itself.
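A minimal sketch of the Locales table with null parents for root locales, using Python's built-in sqlite3 module and a recursive query to rebuild the full path. The sample rows and the `full_path` helper are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Locales ("
    " ID INTEGER PRIMARY KEY,"
    " LocationName TEXT NOT NULL,"
    " ParentID INTEGER REFERENCES Locales(ID))"  # NULL marks a root locale
)
conn.executemany(
    "INSERT INTO Locales (ID, LocationName, ParentID) VALUES (?, ?, ?)",
    [(1, "USA", None), (2, "Rhode Island", 1), (3, "Providence", 2)],
)

def full_path(locale_id):
    """Walk from a locale up to its root and return the names, root first."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(ID, LocationName, ParentID, depth) AS (
            SELECT ID, LocationName, ParentID, 0 FROM Locales WHERE ID = ?
            UNION ALL
            SELECT l.ID, l.LocationName, l.ParentID, c.depth + 1
            FROM Locales l JOIN chain c ON l.ID = c.ParentID
        )
        SELECT LocationName FROM chain ORDER BY depth DESC
        """,
        (locale_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

With a null parent the recursion stops naturally at the root, because the join finds no further row. A root that is its own parent would make this particular query recurse without end unless an extra stop condition were added.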
I think you might be overthinking this. There's a reason most systems just store addresses and maybe a table of countries. Here are some things to look out for:
Would an address in the Bronx include the borough as a level in the hierarchy? Would an address in an unincorporated area eliminate the "city" level of the hierarchy? How do you model an address within a university vs an address that's not within one? You'll end up with a ragged hierarchy which will force you to traverse the tree every time you need to display an address in your application. If you have an "address book" page the performance hit could be significant.
I'm not sure that you even have just one hierarchy. Brown University has facilities in Providence, RI and Bristol, RI. The only clean solution would be to have a double hierarchy with two campuses that each belong to their respective cities in one hierarchy but that both belong to Brown University on the other hierarchy. (A university is fundamentally unlike a political region. You shouldn't really mix them.)
What about zip codes? Some zip codes encompass multiple towns, other times a city is broken into multiple zip codes. And (rarely) some zip codes even cross state lines. (According to Wikipedia, at least...)
How will you enter the data? Building out the database by parsing conventionally-formatted addresses can be difficult when you take into account vanity addresses, alternate names for certain streets, different international formats, etc. And I think that entering every address hierarchically would be a PITA.
It sounds like you're trying to model the entire world in your application. Do you really want or need to maintain a table that could conceivably contain every city, state, province, postal code, and country in the world? (Or at least every one where you know somebody?) The only thing I can think of that this scheme would buy you is proximity, but if that's what you want I'd just store state and country separately (and maybe the zip code) and add latitude and longitude data from Google.
Sorry for the extreme pessimism, but I've gone down that road myself. It's logically beautiful and elegant, but it doesn't work so well in practice.
Here's a suggestion for a pretty flexible schema. An immediate warning: it could be too flexible/complex for what you actually need.
Location
(LocationID, LocationName)
-- Basic building block
LocationGroup
(LocationGroupID, LocationGroupName, ParentLocationGroupID)
-- This can effectively encapsulate multiple hierarchies. You have one root node and then you can create multiple independent branches. E.g. you can split by state first and then create several sub-hierarchies, e.g. ZIP/city/xxxx
LocationGroupLocation
(LocationID, LocationGroupID)
-- Here's how you link Location with one or more hierarchies. E.g. you can link your house to a ZIP, as well as a City... What you need to implement is a constraint that you should not be able to link up a location with any two hierarchies where one of them is a parent of the other (as the relationship is already implicit).
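The constraint described for LocationGroupLocation could be checked along these lines. This is a Python sketch only: the in-memory `parent_of` mapping stands in for the LocationGroup table, and the group names and helper functions are hypothetical.

```python
# parent_of maps each LocationGroup to its parent group (None for the root).
parent_of = {
    "World": None,
    "Rhode Island": "World",
    "Providence": "Rhode Island",
    "ZIP-A": "Rhode Island",
}

def ancestors(group):
    """Return the set of all ancestors of a group, walking up to the root."""
    seen = set()
    current = parent_of[group]
    while current is not None:
        seen.add(current)
        current = parent_of[current]
    return seen

def can_link(existing_groups, new_group):
    """True if new_group is neither an ancestor nor a descendant of any group
    the location is already linked to (and is not already linked itself)."""
    new_ancestors = ancestors(new_group)
    for g in existing_groups:
        if g == new_group or g in new_ancestors or new_group in ancestors(g):
            return False
    return True
```

A house already linked to "Providence" may also be linked to "ZIP-A", since neither contains the other, but not to "Rhode Island", because that membership is already implied.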
I would think carefully about this since it may not be a necessary feature.
Why not just use a text field and let users type in an address?
Remember the KISS principle (Keep It Simple, Stupid).
I agree with the other posts that you need to be very careful here about your requirements. Location can become a tricky issue, and this is why GIS systems are so complicated.
If you are sure you just need a basic hierarchy structure, I have the following suggestions:
I support the previous comment that root-level items should not have themselves as the parent. Root-level items should have a null value for the parent. Always be careful about putting data into a field that has no meaning (i.e. a "special" value to represent no data). This practice is rarely necessary and way overused in the developer community.
Consider XPath / XML. This is something to consider both for recording the hierarchy structure and for processing / parsing the data at retrieval. If you are using MSSQL Server, the XPath expressions in select statements are perfect for tasks such as returning the full location/hierarchy path of a record, as the code is simple and the results are fast.
For geographic locations you may wish to resolve an address to a latitude/longitude pair (perhaps using Google Maps etc.) to calculate proximities etc. For geopolitical nesting ... I'd go with the KISS response.
If you really want to model it, perhaps you need the types to be more generic ... Country -> State -> County -> Borough -> Locality -> City -> Suburb -> Street or PO Box -> Number -> Apartment etc. -> Institution (University or Employer) -> Division -> Subdivision-1 -> Subdivision-n ... Are you sure you can't do KISS?
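Once addresses are resolved to latitude and longitude, proximity reduces to a distance formula. A minimal Python sketch using the haversine great-circle distance; the function name and the spherical-Earth radius are assumptions, and the geocoding step itself is not shown.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; treats the Earth as a sphere

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

This needs only two numeric columns per address and no location hierarchy at all, which fits the KISS advice.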
I'm modeling an app for global users and I have the same problem, and I suspect this approach is already in use in many enterprises. But why doesn't this problem have a universal solution? Is there one best solution that can serve as a starting point, or does everyone in the world have to work out a solution from scratch?
In IT, unfortunately, we build the same things many times and in many places. For example, who has not built more than one user, customer or product database? Worse, every enterprise in the world has done it. I think there could be universal solutions for universal problems.