Related
I am going to attend the Instance Matching of OAEI, now I need to make my results to Alignment Format. In order to achieve it, I have learned official tutorials.(link:http://alignapi.gforge.inria.fr/tutorial/tutorial1/index.html).
But there are many differences between the method taught and the method I want. In other words, I can't understand the API.
This is my situation:
I have 2 rdf file(person11.rdf and person12.rdf respectively.data link is http://oaei.ontologymatching.org/2010/im/index.html, the PR dataset), each file has information of many person. I want to find the coreferent entities, the results must be printed in Alignment Format. I find the results by using SPARQL, but I don't know how to print it in Alignment Format.
So, I have three questions:
First, if I want to generate a Alignment Format file, is the method taught the only way?
Second, can you give me your method(code better) to generate the Alignment Format file? Maybe I am wrong from the beginning, can you give me some suggestions?
Third, if you attended OAEI or know something about Instance Matching, can you give me some advice? I want to find the coreferent entities.
Thank you!
First question: I guess that the "mentioned method" is the one in tutorial1. It is not the appropriate one since you have to write a program to output the alignment format and this is a command line interface tutorial. In this case, you'd better look at http://alignapi.gforge.inria.fr/tutorial/tutorial2/index.html
Then, there are basically two ways to do:
The advised one (for several reasons and for participating to OAEI) is to follow these tutorials, to create an empty alignment in it, to create the correspondences from the results of your SPARQL query and to render it. Everything is covered by the tutorials but the part concerning your SPARQL queries. This assumes that you are programming in Java.
The non-advised solution (primarily non advised because you will have to debug your own renderer), is to write, in any programming language that you want a program that output the format (which corresponds to what you cite).
Think about it: how would you expect that the Alignment API knows the results of your SPARQL query? If you come up with a nice solution, contact the API developers, they may integrate it and others could benefit.
Second question: I cannot do better than what is above.
Third question: too general. Read the OAEI results (http://oaei.ontologymatching.org) and look at the code of others.
Good luck!
What is the best way to implement a constructor for a record? It seems like a function should be able to return a record object in the instantiation of the record in some later model higher up the tree, but I can't get that to work. For now I just use a bunch of parameters at the top of the record that populate the variables stored in the record, but it seems like that will only work in simple cases.
Can anyone shed a little light? Perhaps I shouldn't be using a record but a model. Also does anyone know how the PDE functionality is coming? The book only says that it is coming, but I have seen some other things around.
I don't seem to have the clout to add tags (which makes sense, since my "reputation" is lower than yours) so sorry about that. I thought I had actually added one at one point, but perhaps I am mistaken.
I think you need to be clear what you mean by constructor since it has a very specific meaning in Modelica. If I understand your question correctly, it sounds like what you want to do is create an instance of a record that has some fields that are specified in the constructor arguments and from those arguments a bunch of other fields in the record are computed. Is that correct?
If so, there is a mechanism to do this. You mention "the book" but it isn't clear which one you mean. If it is mine, it definitely has no mention of these so called "record constructors" because it is too old. I do not know if Peter Fritzson's book mentions them either. However, they do exist and are documented in Section 12.6 of the Modelica 3.2 specification.
As for PDEs, there has been work into this kind of thing but nothing has really been done within the design group on this topic. I would add that if you want to solve either elliptical or parabolic PDEs on regular grids, this isn't too hard even with the current language. The only real drawback is that most tools probably don't handle sparsity very efficiently. Irregular grids would also be possible, but then you get into complicated basis functions. Finally, hyperbolic PDEs are, in my opinion, quite tricky (in any environment) due to the implicit physical constraints between time and space which are difficult to express (i.e. the CFL condition).
I hope that answers your questions so far.
I can only comment on your question regarding the book of Peter Fritzson. He confirmed that he's working on an update and he hopes to get it ready 'in the course of 2011'.
Original post here:
http://openmodelica.org/index.php/forum/topic?id=50
And thanks for initiating the modelica tag, I might be useful in the near future for me too... :-)
regards,
Roel
I have been working on sql server and front end coding and have usually faced problem formulating queries.
I do understand most of the concepts of sql that are needed in formulating queries but whenever some new functionality comes into the picture that can be dont using sql query, i do usually fails resolving them.
I am very comfortable with select queries using joins and all such things but when it comes to DML operation i usually fails
For every query that i never done before I usually finds uncomfortable with that while creating them. Whenever I goes for an interview I usually faces this problem.
Is it their some concept behind approaching on formulating sql queries.
Eg.
I need to create an sql query such that
A table contain single column having duplicate record. I need to remove duplicate records.
I know i can find the solution to this query very easily on Googling, but I want to know how everyone comes to the desired result.
Is it something like Practice Makes Man Perfect i.e. once you did it, next time you will be able to formulate or their is some logic or concept behind.
I could have get my answer of solving above problem simply by posting it on stackoverflow and i would have been with an answer within 5 to 10 minutes but I want to know the reason. How do you work on any new kind of query. Is it a major contribution of experience or some an implementation of concepts.
Whenever I learns some new thing in coding section I tries to utilize it wherever I can use it. But here scenario seems to be changed because might be i am lagging in some concepts.
EDIT
How could I test my knowledge and
concepts in Sql and related sql
queries ?
Typically, the first time you need to open a child proof bottle of pills, you have a hard time, but after that you are prepared for what it might/will entail.
So it is with programming (me thinks).
You find problems, research best practices, and beat your head against a couple of rocks, but in the process you will come to have a handy set of tools.
Also, reading what others tried/did, is a good way to avoid major obsticles.
All in all, with a lot of practice/coding, you will see patterns quicker, and learn to notice where to make use of what tool.
I have a somewhat methodical method of constructing queries in general, and it is something I use elsewhere with any problem solving I need to do.
The first step is ALWAYS listing out any bits of information I have in a request. Information is essentially anything that tells me something about something.
A table contain single column having
duplicate record. I need to remove
duplicate
I have a table (I'll call it table1)
I have a
column on table table1 (I'll call it col1)
I have
duplicates in col1 on table table1
I need to remove
duplicates.
The next step of my query construction is identifying the action I'll take from the information I have.
I'll look for certain keywords (e.g. remove, create, edit, show, etc...) along with the standard insert, update, delete to determine the action.
In the example this would be DELETE because of remove.
The next step is isolation.
Asnwer the question "the action determined above should only be valid for ______..?" This part is almost always the most difficult part of constructing any query because it's usually abstract.
In the above example you're listing "duplicate records" as a piece of information, but that's really an abstract concept of something (anything where a specific value is not unique in usage).
Isolation is also where I test my action using a SELECT statement.
Every new query I run gets thrown through a select first!
The next step is execution, or essentially the "how do I get this done" part of a request.
A lot of times you'll figure the how out during the isolation step, but in some instances (yours included) how you isolate something, and how you fix it is not the same thing.
Showing duplicated values is different than removing a specific duplicate.
The last step is implementation. This is just where I take everything and make the query...
Summing it all up... for me to construct a query I'll pick out all information that I have in the request. Using the information I'll figure out what I need to do (the action), and what I need to do it on (isolation). Once I know what I need to do with what I figure out the execution.
Every single time I'm starting a new "query" I'll run it through these general steps to get an idea for what I'm going to do at an abstract level.
For specific implementations of an actual request you'll have to have some knowledge (or access to google) to go further than this.
Kris
I think in the same way I cook dinner. I have some ingredients (tables, columns etc.), some cooking methods (SELECT, UPDATE, INSERT, GROUP BY etc.) then I put them together in the way I know how.
Sometimes I will do something weird and find it tastes horrible, or that it is amazing.
Occasionally I will pick up new recipes from the internet or friends, then use parts of these in my own.
I also save my recipes in handy repositories, broken down into reusable chunks.
On the "Delete a duplicate" example, I'd come to the result by googling it. This scenario is so rare if the DB is designed properly that I wouldn't bother keeping this information in my head. Why bother, when there is a good resource is available for me to look it up when I need it?
For other queries, it really is practice makes perfect.
Over time, you get to remember frequently used patterns just because they ARE frequently used. Rare cases should be kept in a reference material. I've simply got too much other stuff to remember.
Find a good documentation to your software. I am using Mysql a lot and Mysql has excellent documentation site with decent search function so you get many answers just by reading docs. If you do NOT get your answer at least you are learning something.
Than I set up an example database (or use the one I am working on) and gradually build my SQL. I tend to separate the problem into small pieces and solve it step by step - this is very successful if you are building queries including many JOINS - it is best to start with some particular case and "polute" your SQL with many conditions like WHEN id = "123" which you are taking out as you are working towards your solution.
The best and fastest way to learn good SQL is to work with someone else, preferably someone who knows more than you, but it is not necessarry condition. It can be replaced by studying mature code written by others.
Your example is a test of how well you understand the DISTINCT keyword and the GROUP BY clause, which are SQL's ways of dealing with duplicate data.
Examples and experience. You look at other peoples examples and you create your own code and once it groks, you don't need to think about it again.
I would have a look at the Mere Mortals book - I think it's the one by Hernandez. I remember that when I first started seriously with SQL Server 6.5, moving from manual ISAM databases and Access database systems using VB4, that it was difficult to understand the syntax, the joins and the declarative style. And the SQL queries, while powerful, were very intimidating to understand - because typically, I was looking at generated code in Microsoft Access.
However, once I had developed a relatively systematic approach to building queries in a consistent and straightforward fashion, my skills and confidence quickly moved forward.
From seeing your responses you have two options.
Have a copy of the specification for whatever your working on (SQL spec and the documentation for the SQL implementation (SQLite, SQL Server etc..)
Use Google, SO, Books, etc.. as a resource to find answers.
You can't formulate an answer to a problem without doing one of the above. The first option is to become well versed into the capabilities of whatever you are working on.
The second option allows you to find answers that you may not even fully know how to ask. You example is fairly simplistic, so if you read the spec/implementation documentaion you would know the answer right away. But there are times, where even if you read the spec/documentation you don't know the answer. You only know that it IS possible, just not how to do it.
Remember that as far as jobs and supervisors go, being able to resolve a problem is important, but the faster you can do it the better which can often be done with option 2.
There are a multitude of key-value stores available. Currently you need to choose one and stick with it. I believe an independent open API, not made by a key-value store vendor would make switching between stores much easier.
Therefore I'm building a datastore abstraction layer (like ODBC but focused on simpler key value stores) so that someone build an app once, and change key-value stores if necessary. Is this API too simple?
get(Key)
set(Key, Value)
exists(Key)
delete(Key)
As all the APIs I have seen so far seem to add so much I was wondering how many additional methods were necessary?
I have received some replies saying that set(null) could be used to delete an item and if get returns null then this means that an item doesn't exist. This is bad for two reasons. Firstly, is it not good to mix return types and statuses, and secondly, not all languages have the concept of null. See:
Do all programming languages have a clear concept of NIL, null, or undefined?
I do want to be able to perform many types of operation on the data, but as I understand it everything can be built up on top of a key value store. Is this correct? And should I provide these value added functions too? e.g: like mapreduce, or indexes
Internally we already have a basic version of this in Erlang and Ruby and it has saved us alot of time, and also enabled us to test performance for specific use cases of different key value stores
Do only what is absolute necessary, instead of asking if it is too simple, ask if it is too much, even if it only has one method.
Your API lacks some useful functions like "hasKey" and "clear". You might want to look at, say, Python's hack at it, http://docs.python.org/tutorial/datastructures.html#dictionaries, and pick and choose additional functions.
Everyone is saying, "simple is good" and that's true until "simple is too simple."
If all you are doing is getting, setting, and deleting keys, this is fine.
There is no such thing as "too simple" for an API. The simpler the better! If it solves the need the way it is, then leave it.
The delete method is unnecessary. You can just pass null to set.
Edited to add:
I'm only kidding! I would keep delete, and probably add Count, Contains, and maybe an enumerator (or two).
When creating an API, you need to ask yourself, what does my API provide the user. If your API is so simplistic that it is faster and easier for your client to write their own app, then your API has failed. Ask yourself, does my functionality give them specific benefits. If the answer is no, it is too simplistic and generic.
I am all for simplifying an interface to its bare minimum but without having more details about the requirements of the system, it is tough to tell if this interface is sufficient. Sure looks concise enough though.
Don't forget to document the semantics for "key non-existent" as it isn't clear from reading your API definition above. updated: I see you have added the exists method: is this necessary? you could use the get method and define a NIL of some sort, no?
Maybe worth thinking about: how about considering "freshness" of a value? i.e. an associated "last-modified" timestamp? Of course, it depends on your system requirements.
What about access control? Is it within scope of the API definition?
What about iterating through the keys? If there is a possibility of a large set, you might want to include some pagination semantics.
As mentioned, the simpler the better, but a simple iterator or key-listing method could be of use. I always end up needing to iterate through the set. A "size()" method too, if not taken care of by the iterator. It obviously depends on your usage, though.
It's not too simple, it's beautiful. If "exists(key)" is just a convenient shorthand for "get(Key) != null", you should consider removing it. I guess that depends on how large or complex the value you get() is.
I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
However, I do read comments that describe why something is done and how it's done (usually the single line comments in the code); those come in really handy when trying to understand someone else's code.
But when it comes to writing comments, I don't write that, rather, I use comments only when writing algorithms in programming contests, I'd think of how the algorithm will do what it does then I'd write each one in a comment, then write the code that corresponds to that comment.
An example would be:
//loop though all the names from n to j - 1
Other than that I can't imagine why anyone would waste valuable time writing comments when he could be writing code.
Am I right or wrong? Am I missing something? What other good use cases of comments am I not aware of?
Comments should express why you are doing something not what you are doing
It's an old adage, but a good metric to use is:
Comment why you're doing something, not how you're doing it.
Saying "loop through all the names from n to j-1" should be immediately clear to even a novice programmer from the code alone. Giving the reason why you're doing that can help with readability.
If you use something like Doxygen, you can fully document your return types, arguments, etc. and generate a nice "source code manual." I often do this for clients so that the team that inherits my code isn't entirely lost (or forced to review every header).
Documentation blocks are often overdone, especially is strongly typed languages. It makes a lot more sense to be verbose with something like Python or PHP than C++ or Java. That said, it's still nice to do for methods & members that aren't self explanatory (not named update, for instance).
I've been saved many hours of thinking, simply by commenting what I'd want to tell myself if I were reading my code for the first time. More narrative and less observation. Comments should not only help others, but yourself as well... especially if you haven't touched it in five years. I have some ten year old Perl that I wrote and I still don't know what it does anymore.
Something very dirty, that I've done in PHP & Python, is use reflection to retrieve comment blocks and label elements in the user interface. It's a use case, albeit nasty.
If using a bug tracker, I'll also drop the bug ID near my changes, so that I have a reference back to the tracker. This is in addition to a brief description of the change (inline change logs).
I also violate the "only comment why not what" rule when I'm doing something that my colleagues rarely see... or when subtlety is important. For instance:
for (int i = 50; i--; ) cout << i; // looping from 49..0 in reverse
for (int i = 50; --i; ) cout << i; // looping from 49..1 in reverse
I use comments in the following situations:
High-level API documentation comments, i.e. what is this class or function for?
Commenting the "why".
A short, high-level summary of what a much longer block of code does. The key word here is summary. If someone wants more detail, the code should be clear enough that they can get it from the code. The point here is to make it easy for someone browsing the code to figure out where some piece of logic is without having to wade through the details of how it's performed. Ideally these cases should be factored out into separate functions instead, but sometimes it's just not do-able because the function would have 15 parameters and/or not be nameable.
Pointing out subtleties that are visible from reading the code if you're really paying attention, but don't stand out as much as they should given their importance.
When I have a good reason why I need to do something in a hackish way (performance, etc.) and can't write the code more clearly instead of using a comment.
Comment everything that you think is not straightforward and you won't be able to understand the next time you see your code.
It's not a bad idea to record what you think your code should be achieving (especially if the code is non-intuitive, if you want to keep comments down to a minimum) so that someone reading it a later date, has an easier time when debugging/bugfixing. Although one of the most frustrating things to encounter in reading someone else's code is cases where the code has been updated, but not the comments....
I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
Some comments in that vein, not usually with formatting that extreme, actually exist to help tools like JavaDoc and Doxygen generate documentation for your code. This, I think, is a good form of comment, because it has both a human- and machine-readable format for documentation (so the machine can translate it to other, more useful formats like HTML), puts the documentation close to the code that it documents (so that if the code changes, the documentation is more likely to be updated to reflect these changes), and generally gives a good (and immediate) explanation to someone new to a large codebase of why a particular function exists.
Otherwise, I agree with everything else that's been stated. Comment why, and only comment when it's not obvious. Other than Doxygen comments, my code generally has very few comments.
Another type of comment that is generally useless is:
// Commented out by Lumpy Cheetosian on 1/17/2009
...uh, OK, the source control system would have told me that. What it won't tell me is WHY Lumpy commented out this seemingly necessary piece of code. Since Lumpy is located in Elbonia, I won't be able to find out until Monday when they all return from the Snerkrumph holiday festival.
Consider your audience, and keep the noise level down. If your comments include too much irrelevant crap, developers will just ignore them in practice.
BTW: Javadoc (or Doxygen, or equiv.) is a Good Thing(tm), IMHO.
I also use comments to document where a specific requirement came from. That way the developer later can look at the requirement that caused the code to be like it was and, if the new requirement conflicts with the other requirment get that resolved before breaking an existing process. Where I work requirments can often come from different groups of people who may not be aware of other requirements then system must meet. We also get frequently asked why we are doing a certain thing a certain way for a particular client and it helps to be able to research to know what requests in our tracking system caused the code to be the way it is. This can also be done on saving the code in the source contol system, but I consider those notes to be comments as well.
Reminds me of
Real programmers don't write documentation
I wrote this comment a while ago, and it's saved me hours since:
// NOTE: the close-bracket above is NOT the class Items.
// There are multiple classes in this file.
// I've already wasted lots of time wondering,
// "why does this new method I added at the end of the class not exist?".