How to create and populate a `SparqlQuery` object without a parser?

I am trying to model a SPARQL query with the SparqlQuery class.
It looks like I can use the RootGraphPattern property to specify the triple patterns that the results of my query must match.
Unfortunately, I have so far failed to create an instance of the SparqlQuery class, as its constructors are not publicly accessible and the class is sealed. Likewise, the query type can only be retrieved, but not set. Is there any factory method with an obscure name that creates instances of that class?
Forum postings and the documentation on the topic exclusively create their SparqlQuery instances from query strings with the SparqlQueryParser class. However, I don't have a query string yet, and I'd rather not concatenate strings to build my query when there's an object-oriented API available that lets me construct it in an OO way.
Hence, my question is: How can I instantiate the SparqlQuery class without using an initial query string and a SPARQL parser?

Right now you can't. Most of SparqlQuery is intentionally sealed because many of its properties and related classes, such as GraphPattern, represent the AST, and when we originally designed the class we didn't want people to intentionally or accidentally modify the AST in ways that create broken queries.
There is a fluent-query branch in the works which will eventually provide a Fluent API for building queries, but the developer behind it is currently on a month-long vacation and I haven't seen any activity on it for a while. You can take a look at the Fluent Query wiki for some examples of what this API is going to look like.
If this is an important feature to you we can push this up the priorities but as an open source project we are heavily constrained by the limited resources of our small developer team.
We could likely integrate what we have so far into our 1.0.0 release, but our recent release focus has been on bug fixes and stability to make 1.0.0 a stable, production-ready release, and introducing a new and relatively untested feature goes somewhat against this. Also, the API does not yet cover all of SPARQL, so it would be incomplete and potentially unstable.

Related

Automatically generating poco classes

Today I was checking out a few technologies: T4 templating, automapper
some mini orms: petapoco, sqlfu, ormlite
I understand the gist of what these technologies provide. I'm currently working on a 3-tier system, and I would have loved to replace the DAL (data access layer, located on its own data server) and have it integrated with a mini ORM as shown. However, I will be making no such plans for now. We currently use .NET Remoting (predates WCF).
So instead of replacing whatever is on the DataServer, I'd like to extend one of these new technologies on the application server.
I've done research on how Entity Framework can automatically generate POCO classes based on the context, which is done manually after building EF. I was wondering whether I can do the same without using EF.
So here are the facts on what currently happens:
Send a sql statement (or stored proc) to the DAL to execute
Retrieves a DataSet or a DataTable back to the application through TCP channel
My question is, is it possible to automatically generate a dynamic POCO class using keywords "var" and "dynamic" based on the values sent back from the DataSet and do dynamic mapping onto it during runtime? Would any of the technologies mentioned above help? Or do I have to manually create the POCO class first, and do a mapping on it?
It seems a bit redundant for me to manually create a POCO class and map it to a backend sql table if the application could be aware of what the POCO class is supposed to have. Like what happens if I update a table on the backend, then I'd have to update the POCO class associated with it as well. I'd love to have this to be automatic for me.
1. If you know the data sets at compile time, then T4 might be an option. You can write a T4 script that downloads the database schema and constructs strongly-typed entity classes and database read/write methods.
2. As for late-bound (runtime) classes, one option is to use the runtime typing provided by CustomTypeDescriptor. You can pass arrays of objects back and forth from the server, and use reflection or other techniques to infer the type.
I think it should be clear that #1 is preferable if you know the types at compile time (which it sounds like you do in your case). Runtime and dynamic typing should only be a last resort, as it circumvents a lot of valuable compile-time type checks.
Really, I would recommend using one of the micro ORMs like Dapper, etc, if you don't want to use the full Entity Framework. That is, unless you really want to re-invent the wheel.
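To make the late-bound option concrete, here is a minimal, language-agnostic sketch of building a record type at runtime from whatever columns a query returns. It uses Python with the standard `sqlite3` module only to stay self-contained; the `customer` table and the `fetch_as_records` helper are made up for illustration and are not part of any of the tools mentioned above.

```python
import sqlite3
from collections import namedtuple


def fetch_as_records(conn, sql, params=(), type_name="Row"):
    """Run a query and return each row as an instance of a type built at runtime."""
    cursor = conn.execute(sql, params)
    # cursor.description has one entry per result column; item 0 is the column name.
    columns = [col[0] for col in cursor.description]
    # rename=True replaces column names that are not valid identifiers (e.g. "count(*)").
    record_type = namedtuple(type_name, columns, rename=True)
    return [record_type(*row) for row in cursor.fetchall()]


# Hypothetical schema, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("INSERT INTO customer (name, city) VALUES (?, ?)", ("Ada", "London"))

customers = fetch_as_records(
    conn, "SELECT id, name, city FROM customer", type_name="Customer"
)
print(customers[0].name, customers[0].city)
```

The trade-off is the one described above: a renamed or dropped column only shows up as an error at runtime, when the attribute is accessed.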

Same business entity for identical tables?

I have a legacy database which has about 10 identical tables (only the names differ).
Is it possible to use the same business entity for all tables without having to create several classes/mapping files?
You can use the entity-name feature if you are using NHibernate v2.1 or higher. It is poorly documented but I am actively using the feature. It has gotten hard to find the documentation on it but look here:
Section 5.3 in
http://docs.jboss.org/hibernate/core/3.2/reference/en/html/mapping.html#mapping-entityname
A couple of things to be aware of. You must now use entity-name instead of class name to refer to the objects. In general it is not an entirely transparent change moving from class names to entity names.
Session actions now require two parameters, for example:
_session.Save("MyEntity", myobject)
The entity-name controls what table the data goes into.
Some HQL queries do not work right anymore; sometimes you must use Criteria instead.
If you need a set of sample code I may be able to post some, but I am far too busy at the moment. I suggest you look at the limited info you can find and set it up for a very simple object and multiple tables to learn how it all works. It does work.
You can create a base class with all the properties, but you still need to map them all.
For that, you can either use copy and paste, XML entities (see the example at http://nhibernate.info/doc/nh/en/index.html#inheritance-tableperconcreate-polymorphism), or a code-based mapping method (Fluent or ConfORM). They usually make reuse easier.
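The underlying idea of "one class, with the table chosen by a name passed alongside it" can be sketched without any ORM. The following is only an illustration in Python with the standard `sqlite3` module, not NHibernate's API; the table names, the `Order` class and the `OrderRepository` helper are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass

# Hypothetical names of the structurally identical legacy tables.
ALLOWED_TABLES = {"orders_2019", "orders_2020"}


@dataclass
class Order:
    id: int
    customer: str
    total: float


class OrderRepository:
    """Maps the single Order class onto whichever identical table is named."""

    def __init__(self, conn, table):
        # Table names cannot be bound as SQL parameters, so check a whitelist
        # before interpolating the name into the statement text.
        if table not in ALLOWED_TABLES:
            raise ValueError(f"unknown table: {table}")
        self._conn = conn
        self._table = table

    def save(self, customer, total):
        cur = self._conn.execute(
            f"INSERT INTO {self._table} (customer, total) VALUES (?, ?)",
            (customer, total),
        )
        return Order(cur.lastrowid, customer, total)

    def all(self):
        rows = self._conn.execute(
            f"SELECT id, customer, total FROM {self._table} ORDER BY id"
        )
        return [Order(*row) for row in rows]


conn = sqlite3.connect(":memory:")
for table in sorted(ALLOWED_TABLES):
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )

repo_2019 = OrderRepository(conn, "orders_2019")
repo_2020 = OrderRepository(conn, "orders_2020")
repo_2019.save("Ada", 10.0)
repo_2020.save("Grace", 25.5)
print(repo_2019.all(), repo_2020.all())
```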

Avoid loading unnecessary data from db into objects (web pages)

Really newbie question coming up. Is there a standard (or good) way to deal with not needing all of the information that a database table contains loaded into every associated object? I'm thinking in the context of web pages where you're only going to use the objects to build a single page rather than an application with longer-lived objects.
For example, let's say you have an Article table containing id, title, author, date, summary and fullContents fields. You don't need the fullContents to be loaded into the associated objects if you're just showing a page containing a list of articles with their summaries. On the other hand, if you're displaying a specific article you might want every field loaded for that one article and maybe just the titles for the other articles (e.g. for display in a recent articles sidebar).
Some techniques I can think of:
Don't worry about it, just load everything from the database every time.
Have several different, possibly inherited, classes for each table and create the appropriate one for the situation (e.g. SummaryArticle, FullArticle).
Use one class but set unused properties to null at creation if that field is not needed and be careful.
Give the objects access to the database so they can load some fields on demand.
Something else?
All of the above seem to have fairly major disadvantages.
I'm fairly new to programming, very new to OOP and totally new to databases so I might be completely missing the obvious answer here. :)
1) Loading the whole object is, unfortunately, what ORMs do by default. That is why hand-tuned SQL performs better. But most objects don't need this optimization, and you can always delay optimization until later. Don't optimize prematurely (but do write good SQL/HQL and use good DB design with indexes). But by and large, the ORM projects I've seen result in a lot of lazy approaches, pulling or updating way more data than needed.
2) Different Models (Entities), depending on operation. I prefer this one. May add more classes to the object domain, but to me, is cleanest and results in better performance and security (especially if you are serializing to AJAX). I sometimes use one model for serializing an object to a client, and another for internal operations. If you use inheritance, you can do this well. For example CustomerBase -> Customer. CustomerBase might have an ID, name and address. Customer can extend it to add other info, even stuff like passwords. For list operations (list all customers) you can return CustomerBase with a custom query but for individual CRUD operations (Create/Retrieve/Update/Delete), use the full Customer object. Even then, be careful about what you serialize. Most frameworks have whitelists of attributes they will and won't serialize. Use them.
3) Dangerous, special cases will cause bugs in your system.
4) Bad for performance. Hit the database once, not for each field (Except for BLOBs).
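Option 2 could look roughly like the sketch below: a small summary class for list pages and a fuller class for the detail page, each filled by a query that selects only the columns it needs. It is written in Python with the standard `sqlite3` module to stay self-contained; the class and function names are illustrative, and the table uses only a subset of the Article fields from the question.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ArticleSummary:
    id: int
    title: str
    summary: str


@dataclass
class FullArticle(ArticleSummary):
    full_contents: str


def list_summaries(conn):
    # List pages never touch the (potentially large) full_contents column.
    rows = conn.execute("SELECT id, title, summary FROM article ORDER BY id")
    return [ArticleSummary(*row) for row in rows]


def load_article(conn, article_id):
    # The detail page loads every field, but for one article only.
    row = conn.execute(
        "SELECT id, title, summary, full_contents FROM article WHERE id = ?",
        (article_id,),
    ).fetchone()
    return FullArticle(*row) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article "
    "(id INTEGER PRIMARY KEY, title TEXT, summary TEXT, full_contents TEXT)"
)
conn.executemany(
    "INSERT INTO article (title, summary, full_contents) VALUES (?, ?, ?)",
    [("First", "Short one", "Long body one"), ("Second", "Short two", "Long body two")],
)

for item in list_summaries(conn):
    print(item.title, "-", item.summary)
print(load_article(conn, 2).full_contents)
```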
You have a number of methods to solve your issue.
Use Stored Procedures in your database to remove the rows or columns you don't want. This can work great but takes up some space.
Use an ORM of some kind. For .NET you can use Entity Framework, NHibernate, or Subsonic. There are many other ORM tools for .NET. Ruby has it built in with Rails. Java uses Hibernate.
Write embedded queries in your website. Don't forget to parametrize them or you will open yourself up to hackers. This option is usually frowned upon because of the mingling of SQL and code. Also, it is the easiest to break.
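As a small illustration of what parametrizing an embedded query means, the sketch below binds the user-supplied value as a parameter instead of concatenating it into the SQL text. It uses Python's standard `sqlite3` module; the table and the `titles_by_author` helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO article (title, author) VALUES (?, ?)",
    [("Intro to ORMs", "alice"), ("Lazy loading", "bob")],
)


def titles_by_author(conn, author):
    # The "?" placeholder sends the value separately from the SQL text,
    # so quotes inside `author` cannot change the structure of the query.
    rows = conn.execute(
        "SELECT title FROM article WHERE author = ? ORDER BY id", (author,)
    )
    return [title for (title,) in rows]


# A value that would match every row if it were concatenated into the SQL.
hostile = "alice' OR '1'='1"

print(titles_by_author(conn, "alice"))
print(titles_by_author(conn, hostile))
```

With the placeholder, the hostile string is simply compared against the author column as a literal value.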
From your list, options 1, 2 and 4 are probably the most commonly used ones.
1. Don't worry about it, just load everything from the database every time: Well, unless your application is under heavy load or you have some extremely heavy fields in your tables, use this option and save yourself the hassle of figuring out something better.
2. Have several different, possibly inherited, classes for each table and create the appropriate one for the situation (e.g. SummaryArticle, FullArticle): Such classes would often be called "view models" or something similar, and depending on your data access strategy, you might be able to get hold of such objects without actually declaring any new class. E.g., using Linq-2-Sql the expression data.Articles.Select(a => new { a.Title, a.Author }) will give you a collection of anonymously typed objects with the properties Title and Author. The generated SQL will be similar to select Title, Author from Article.
4. Give the objects access to the database so they can load some fields on demand: The objects you describe here would usually be called "proxy objects" and/or their properties referred to as being "lazy loaded". Again, depending on your data access strategy, creating proxies might be hard or easy. E.g. with NHibernate, you can have lazy properties by simply throwing in lazy=true in your mapping, and proxies are automatically created.
Your question does not mention how you are actually mapping data from your database to objects now, but if you are not using any ORM framework at the moment, do have a look at NHibernate and Entity Framework - they are both pretty solid solutions.
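For option 4, a hand-rolled version of a lazily loaded field might look like the sketch below. It only illustrates the idea of loading on first access and caching the result; it is not how NHibernate builds its proxies. It uses Python's standard `sqlite3` module, and the `Article` class and `list_articles` helper are hypothetical.

```python
import sqlite3


class Article:
    """Holds the cheap fields eagerly and fetches full_contents on first access."""

    def __init__(self, conn, article_id, title):
        self._conn = conn
        self.id = article_id
        self.title = title
        self._full_contents = None
        self._loaded = False

    @property
    def full_contents(self):
        if not self._loaded:
            row = self._conn.execute(
                "SELECT full_contents FROM article WHERE id = ?", (self.id,)
            ).fetchone()
            self._full_contents = row[0] if row else None
            self._loaded = True  # cache, so later reads do not hit the database
        return self._full_contents


def list_articles(conn):
    rows = conn.execute("SELECT id, title FROM article ORDER BY id")
    return [Article(conn, article_id, title) for article_id, title in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, full_contents TEXT)"
)
conn.execute(
    "INSERT INTO article (title, full_contents) VALUES (?, ?)",
    ("First", "Long body text"),
)

articles = list_articles(conn)
print(articles[0].title)          # no extra query yet
print(articles[0].full_contents)  # triggers the single on-demand query
```

Note that accessing the lazy field for every item in a list issues one extra query per item, which is the performance concern raised in the other answer.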

Where is the api reference for nhibernate?

I may be going mental, but I can not find any api reference material for nhibernate. I've found plenty of manuals, tutorials, ebooks etc but no api reference. I saw the chm file on the nhibernate sourceforge page, but it doesn't seem to work on any of my PCs (different OSes)
Can someone please point me in the right direction?
I just found this one:
http://web.archive.org/web/20141001063046/http://elliottjorgensen.com/nhibernate-api-ref/index.html
It doesn't seem to be official, but at least it looks like an API reference... unlike the official reference, which mostly describes concepts and mappings without any information about classes and members.
If you're on Windows, get ILSpy and point it at NHibernate.dll. It's not quite the same as real API documentation, but it's not half bad.
There are no class references publicly available on the Internet as far as I know. You may build one from the source. Clone the sources, build the NHibernate.sln solution, then go into the doc folder, ensure you have the prerequisites indicated in the reference\readme.txt file, and run nant doc. This will generate the class reference in the build folder.
Otherwise, the most commonly used APIs are not extensive, and most of them are XML-documented, with IntelliSense working in Visual Studio. The reference documentation has the advantage of giving more context, which probably helps avoid pitfalls like believing ISession.Update is to be used for updating entities (this is wrong; you do not need it unless you use detached entities or entities coming from another session).
The official reference documentation is at https://nhibernate.info.
Sub-links:
Global documentation list
Reference (What I mostly use, especially following sub parts.)
Configuration
Mapping - basic / entities. (Add the mapping xsd definition file to any of your solution folders to let VS know about it and give you IntelliSense in your hbm mappings.)
Mapping - collections
Querying - general. Do not miss the named queries feature in The IQuery interface.
Querying APIs:
HQL. I mostly use HQL with named queries, in mappings, for queries not dynamically built. They get parsed and validated when building the session factory, which normally occurs at application startup, so it is almost as good as compile-time validation. Check the log4net logs to get detailed reasons for named query parsing failures.
Criteria API. I view it as the historical way of dynamically building queries in code, to be preferred over constructing HQL strings.
QueryOver API. Based on the Criteria API, with lambda expression support for compile-time validation of queried entity names. Should be preferred over the Criteria API in my opinion.
Linq API. Great for dynamically built queries. Bear in mind that its implementation translates your queries to HQL. With complex queries, it may generate unsupported HQL constructs. Having knowledge of HQL capabilities allows a better understanding of how to write a supported Linq query for complex cases. (For example, for a complex order by, it is better to use an explicit linq sub-query in the OrderBy rather than a collection mapped on your queried entity.)
Native SQL. Well, quite self-explanatory. To be used, for example, when you need some special SQL feature not available through the other querying APIs (SQL Server full-text, select for xml, ...) and you do not wish to extend those other APIs. You may also call stored procedures. When using native SQL, I favor SQL named queries.
Modifying data, from Updating objects to Flush, and Exception handling.
Performance.
Batch fetching. About this, you may read my post here for a detailed explanation of why lazy loading can be very efficient with NHibernate, thanks to batch fetching. This single feature will always cause me to prefer NHibernate over Entity Framework, until EF stops lacking it.
Second level cache. Another great NHibernate feature, lacking native support in EF. Beware, you must use transactions to leverage this. It allows NHibernate to automatically evict cached entries for you as you change data through your application process. Without transactions, NHibernate will disable the second level cache as soon as you start changing data, to avoid letting the cache yield stale data.
Interceptors. This is one way among many to customize NHibernate's inner workings. NHibernate is very strong at allowing you to extend it. You may also add your own HQL extensions as here, or your own linq2NH extension as here (all are answers from me). And there are other ways; see this list for linq2NH extensibility solutions.
Moreover, a class reference will very likely be close to the Hibernate one. There are so many internal APIs supporting its implementation that it is not very usable.
Why are such APIs not hidden (internal, private, ...)? Not hiding them is required to allow the great extensibility capabilities of NHibernate. Those capabilities are a must-have in my opinion. In contrast, it is so hard to fix the shortcomings of some other .NET projects, due to their lack of extensibility. (MVC FileResult and the TweakDispositionAsInline I had to use instead of just being able to override some method, or trying to extend linq-to-entities, see this.)
There is a good book that covers a lot, and there is the HTML documentation on the site (which also comes as a book).
(The book would be Manning's NHibernate in Action: a little outdated, but a good start.)
Here is the link to the online reference

Improving my data access layer

I am putting some heavy thought into rewriting the data access layer in my software (if you could even call it that). This was really my first project that uses, and things were done in an improper manner.
In my project all of the data that is being pulled is being stored in an ArrayList. Some of the data is converted from the ArrayList into a typed object before being put back into an ArrayList.
Also, there is no central set of queries in the application. This means that some queries are copied and pasted, which I want to eliminate as well. This application has some custom objects that are very standard to the application, and some queries that are very standard to those objects.
I am really just not sure if I should create a layer between my objects and the class that reads and writes to the database. This layer would take the data that comes from the database, type it as the proper object, and if there is a case of multiple objects being returned, return a list of those objects. Is this a good approach?
Also, if this is a good way of doing things, how should I return the data from the database? I am currently using SqlDataReader.Read and filling an ArrayList. I am sure that this is not the best method to use here; I am just not really clear on how to improve this.
The reason for all of this is that I want to centralize all of the database operations into a few classes, rather than have them spread out amongst all of the classes in the project.
You should use an ORM. "Not doing so is stealing from your customers" - Ayende
One thing comes to mind right off the bat. Is there a reason you use ArrayLists instead of generics? If you're using .NET 1.1 I could understand, but it seems that one area where you could gain performance is to remove ArrayLists from the picture and stop converting and casting between types.
Another thing you might think about which can help a lot when designing data access layers is an ORM. NHibernate and LINQ to SQL do this very well. In general, the N-tier approach works well for what it seems like you're trying to accomplish. For example, performing data access in a class library with specific methods that can be reused is far better than "copy-pasting" the same queries all over the place.
I hope this helps.
It really depends on what you are doing. If it is a growing application with user interfaces and the like, you're right, there are better ways.
I am currently developing in ASP.NET MVC, and I find Linq to SQL really comfortable. Linq to SQL uses code generation to create a collection of code classes that model your data.
ScottGu has a really nice introduction to Linq to SQL on his blog:
http://weblogs.asp.net/scottgu/archive/2007/05/19/using-linq-to-sql-part-1.aspx
Over the past few projects I have used a base class which does all my ADO.NET work and which all other data access classes inherit from. So my UserDB class inherits from the DataAccessBase class. At the moment, my UserDB class actually takes the data returned from the database and populates a User object, which is then returned to the calling business object. If multiple objects are returned, then a generic list, i.e. List<User>, is returned.
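The base-class arrangement described here can be sketched as follows. This is only an illustration of the shape, written in Python with the standard `sqlite3` module rather than ADO.NET; the `DataAccessBase`, `UserDB` and `User` names follow the description above, while the method names and the `users` table are assumptions.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    email: str


class DataAccessBase:
    """Owns the connection and the low-level query plumbing."""

    def __init__(self, conn):
        self._conn = conn

    def _query(self, sql, params=()):
        return self._conn.execute(sql, params).fetchall()

    def _execute(self, sql, params=()):
        with self._conn:  # commits on success, rolls back on an exception
            return self._conn.execute(sql, params).lastrowid


class UserDB(DataAccessBase):
    """Keeps all user queries in one place and returns typed User objects."""

    def add(self, name, email):
        new_id = self._execute(
            "INSERT INTO users (name, email) VALUES (?, ?)", (name, email)
        )
        return User(new_id, name, email)

    def get_by_id(self, user_id):
        rows = self._query(
            "SELECT id, name, email FROM users WHERE id = ?", (user_id,)
        )
        return User(*rows[0]) if rows else None

    def get_all(self):
        rows = self._query("SELECT id, name, email FROM users ORDER BY id")
        return [User(*row) for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

user_db = UserDB(conn)
user_db.add("Ada", "ada@example.com")
user_db.add("Grace", "grace@example.com")
print(user_db.get_all())
```

Callers only ever see `User` objects or lists of them, so the queries stay centralized and nothing outside the data access classes needs to cast untyped rows.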
There is a good article by Daemon Armstrong (search Google for Daemon Armstrong) which demonstrates how this can be achieved:
http://www.simple-talk.com/dotnet/.net-framework/.net-application-architecture-the-data-access-layer/
However, I have now started to move all of this over to the Entity Framework, as it performs much better and saves on all those manual CRUD operations. I was going to use LINQ to SQL, but as it seems it will be dead in the water very soon, I thought it would be best to invest my time in the next ORM.
"I am really just not sure if I should create a layer between my objects and the class that reads and writes to the database. This layer would take the data that comes from the database, type it as the proper object, and if there is a case of multiple objects being returned, return a list of those object. Is this a good approach?"
I'm a Java developer, but I believe that the language-agnostic answer is "yes".
Have a look at Martin Fowler's "Patterns Of Enterprise Application Architecture". I believe that technologies like LINQ were born for this.