Although I am pretty decent at PHP, I am new to frameworks.
I started with CI last week and found myself looking at Kohana this week.
I have a few questions in that regard:
1. Why ORM vs. traditional SQL or active queries?
2. If the model must fetch data from the DB, how come in ORM most of the action seems to happen in the controller, e.g. $data = $q->where('category', '=', 'articles')->find_all();?
3. How would I do a conditional query in ORM (something like if (isset($_GET['category'])) ... etc.)? Is the condition passed to the model, or should the controller handle all the conditions?
FYI, my queries tend to have numerous joins, and my limited knowledge tells me that I should have a query controller that passes query parameters to a query model, which runs the query and returns the results.
Please let me know if my understanding is correct.
Thank you very much.
1. ORM is a kind of wrapper over the DB layer. So, you just call $user->find($id) instead of $db->query('select * from users where id='.$id) or DB::select()->from('users')->where('id', '=', $id)->limit(1)->execute(). You declare model params (table name, relations etc.) and use only model methods to work with its data. You can easily change the DB structure or DB engine without modifying a lot of controller code.
2. Agree with Ikke: the controller should avoid model-specific details like query conditions. For example, create a method get_by_category($category).
3. See #2. All the args you want should be passed into a model method (this can be done using chaining, like $object->set_category($category)->set_time_limit(time())->limit(10)).
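The idea in the answers above, a model method that receives the condition as an argument while the controller only reads the request, can be sketched roughly as follows. This is a minimal illustration in Python with the standard sqlite3 module, not Kohana's ORM; the ArticleModel class, the list_articles function, the articles table and the request dictionary are all hypothetical.

```python
import sqlite3


class ArticleModel:
    """Hypothetical model: owns every query against the articles table."""

    def __init__(self, conn):
        self.conn = conn

    def get_by_category(self, category=None, limit=10):
        # The conditional part of the query lives in the model, not the controller.
        sql = "SELECT id, title, category FROM articles"
        params = []
        if category is not None:
            sql += " WHERE category = ?"
            params.append(category)
        sql += " ORDER BY id LIMIT ?"
        params.append(limit)
        return self.conn.execute(sql, params).fetchall()


def list_articles(request_params, model):
    """Hypothetical controller action: reads the request, delegates to the model."""
    return model.get_by_category(request_params.get("category"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO articles (title, category) VALUES (?, ?)",
    [("Intro to ORM", "articles"), ("Release notes", "news"), ("Joins explained", "articles")],
)
model = ArticleModel(conn)

filtered = list_articles({"category": "articles"}, model)  # category supplied
everything = list_articles({}, model)                      # no category in the request
```

The controller never builds SQL or decides how the filter is applied; it only forwards what it was given, so the same model method can be reused by other controllers.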
ORM is just another way to get at your data. The idea is that there are many common kinds of operations that can be automated. And because the relations between tables can easily be translated to objects referencing each other, ORM was created.
It's up to you if you want to use the supplied ORM module. There are others which are also commonly used (like sprig, jelly and auto-modeler).
My personal opinion is to limit that kind of operation in the controller to a minimum. Very simple operations can be done this way, because placing them in the model brings barely any advantage, but the best way is to put as much of the business logic in the models as possible.
Another point is that it should be the view that gets the data from the models. That way, when you want to reuse a view, very little code has to be duplicated. But to prevent too much logic getting into your views, it's recommended to use so-called view classes, which contain the logic for your views and are the interface your views talk to.
There is a Validation library to make sure that all the data for your model is correct. Your models shouldn't know about $_GET and $_POST, but the data from those arrays can be passed to your models.
And, if I can, does that mean I lose my advantage of treating the results as objects? I find complex queries confusing in many ORMs, not just Django's. But, it is probably because I have never really used an ORM. Does anyone use straight up SQL anymore?
edit: Am I defeating the purpose of having a framework if I bypass the ORM completely? They all have a "nifty" ORM, but when it comes to queries with lots of subqueries, derived tables, it doesn't look pretty.
Using Django's QuerySet API you have different possibilities:
You can use extra(), which returns a queryset that evaluates to model objects. As the name says, it is somewhat limited, because returning model instances requires, e.g., querying the model's table. But you can add additional SQL, e.g. to the WHERE or ORDER BY clause. Querysets that use extra() can still use the features of the ORM, like chaining multiple filter() calls.
raw() returns a RawQuerySet, which can also be iterated over to get model instances, but you lose a lot of features that the ORM would normally provide.
And of course you can execute SQL directly, using a low level connection cursor API (no model instances of course).
Study the documentation on raw queries. It also has a lot of information on, e.g., how to map a model's fields onto the data coming from a raw query, and it documents a few gotchas when passing parameters into the query.
To also answer your edited question: I wouldn't use raw SQL when you can do it with the ORM, but of course the ORM is limited, and if you need to do more complex stuff you will always have to switch to SQL (although sometimes using extra() is enough, so you can still use the advantages of the ORM). Don't forget that the ORM works with every DB backend, while custom SQL might not work with every database.
You can use raw SQL to either return objects; or if you want you can bypass the ORM completely.
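The two options in the last answer, hand-written SQL that still yields objects versus skipping objects entirely, can be illustrated with a small framework-free sketch. It uses Python's standard sqlite3 module rather than Django's API, and the Author class and the author/book tables are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Author:
    id: int
    name: str


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
conn.executemany("INSERT INTO author (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO book (author_id, title) VALUES (?, ?)",
    [(1, "A1"), (1, "A2"), (2, "B1")],
)

# Option 1: hand-written SQL (with a subquery) whose rows are still mapped to objects.
rows = conn.execute(
    "SELECT id, name FROM author WHERE id IN "
    "(SELECT author_id FROM book GROUP BY author_id HAVING COUNT(*) >= ?) ORDER BY id",
    (2,),
)
prolific = [Author(*row) for row in rows]

# Option 2: bypass objects completely and work with plain tuples.
counts = conn.execute(
    "SELECT a.name, COUNT(b.id) FROM author a "
    "LEFT JOIN book b ON b.author_id = a.id GROUP BY a.id ORDER BY a.id"
).fetchall()
```

Option 1 keeps the convenience of working with objects as long as the query returns the columns the class expects; option 2 is simpler for aggregate results that do not correspond to any class.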
In my project (ASP.NET MVC + NHibernate) I have all my entities, let's say Documents, described by a set of custom metadata. The metadata is contained in a structure that can have multiple tags, categories etc. These terms are the most important thing for users seeking the document they want, so they have an impact on views as well as on the underlying data structures, database querying etc.
From the view side of the application, what interests me the most are the string values for the terms. Ideally I would like to operate directly on collections of strings, like this:
class MetadataAsSeenInViews
{
    public IList<string> Categories;
    public IList<string> Tags;
    // etc.
}
From the model's perspective, I could use the same structure, do the simplest possible ORM mapping and use it in queries like "fetch all documents with metadata exactly like this".
But that kind of structure could turn out to be useless if the application needs to perform complex database queries like "fetch all documents for which at least one of the categories is IN (cat1, cat2, ..., catN) OR at least one of the tags is IN (tag1, ..., tagN)". In that case, for performance reasons, we would probably use numeric keys for categories and tags.
So one can imagine a structure opposite to MetadataAsSeenInViews that operates on numeric keys and provides complex mappings of integers to strings and the other way round. But that solution doesn't really satisfy me, for several reasons:
it smells like single responsibility violation, as we're dealing with database-specific issues when just wanting to describe Document business object
database keys are leaking through all layers
it adds unnecessary complexity in views
and I believe it doesn't take advantage of what a good ORM can do
Ideally I would like to have:
single, as simple as possible metadata structure (ideally like the one at the top) in my whole application
complex querying issues addressed only in the database layer (meaning DB + ORM + as little additional data-layer code as possible)
Do you have any ideas on how to structure the code and do the ORM mappings so that they are as elegant, effective and performant as possible?
I have found that it is problematic to use domain entities directly in the views. To help decouple things I apply two different techniques.
Most importantly I'm using separate ViewModel classes to pass data to views. When the data corresponds nicely with a domain model entity, AutoMapper can ease the pain of copying data between them, but otherwise a bit of manual wiring is needed. Seems like a lot of work in the beginning but really helps out once the project starts growing, and is especially important if you haven't just designed the database from scratch. I'm also using an intermediate service layer to obtain ViewModels in order to keep the controllers lean and to be able to reuse the logic.
The second option is mostly for performance reasons, but I usually end up creating custom repositories for fetching data that spans entities. That is, I create a custom class to hold the data I'm interested in, and then write custom LINQ (or whatever) to project the result into that. This can often dramatically increase performance over just fetching entities and applying the projection after the data has been retrieved.
Let me know if I haven't been elaborate enough.
The solution I've finally implemented doesn't fully satisfy me, but it'll do for now.
I've divided my Tags/Categories into "real entities", mapped in NHibernate as separate entities, and "references", mapped as components dependent on the entities they describe.
So in my C# code I have two separate classes, TagEntity and TagReference, which both carry the same information from the domain perspective. TagEntity knows its database id and is managed by NHibernate sessions, whereas TagReference carries only the tag name as a string, so it is quite handy to use throughout the application and, if needed, it is still easily convertible to a TagEntity using a static lookup dictionary.
That entity/reference separation allows me to query the database in a more efficient way, joining two tables only, like select from articles join articles_tags ... where articles_tags.tag_id = X, without joining the tags table, which would be joined too in simple, fully object-oriented NHibernate queries.
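The two-table query described above can be sketched like this. It is a rough illustration in Python with sqlite3, not NHibernate; the TAG_IDS dictionary stands in for the static lookup dictionary, and the table layout and the articles_with_tag helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE articles_tags (article_id INTEGER, tag_id INTEGER);
    INSERT INTO tags VALUES (1, 'orm'), (2, 'sql');
    INSERT INTO articles VALUES (10, 'Mapping basics'), (11, 'Query tuning');
    INSERT INTO articles_tags VALUES (10, 1), (11, 2), (11, 1);
""")

# Static lookup dictionary, loaded once: tag name (the "reference") -> database id.
TAG_IDS = {name: tag_id for tag_id, name in conn.execute("SELECT id, name FROM tags")}


def articles_with_tag(tag_name):
    """Find article titles by tag name without joining the tags table."""
    tag_id = TAG_IDS[tag_name]
    rows = conn.execute(
        "SELECT a.title FROM articles a "
        "JOIN articles_tags atg ON atg.article_id = a.id "
        "WHERE atg.tag_id = ? ORDER BY a.id",
        (tag_id,),
    )
    return [title for (title,) in rows]


sql_articles = articles_with_tag("sql")
orm_articles = articles_with_tag("orm")
```

Callers only ever see tag names; the numeric key is resolved inside the data layer, which is the point of the reference/entity split.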
A colleague of mine is currently designing SQL queries like the one below to produce reports, which are displayed in excel files through an external data query.
At present, only reporting processes on the DB are required (no CRUD operations).
I am trying to convince him that it would be better to use a ruby ORM in order to be able to display the data in a rails/sinatra app.
Apart from the obvious advantages in displaying the data, what advantages are there for him in learning to use an ORM like Sequel or DataMapper?
The SQL queries he is writing are clearly quite complex, and being relatively new to SQL, he often complains that it is very time-consuming and confusing.
Is it possible to write extremely complex queries with an ORM? If so, which is the most suitable (I have heard Sequel is good for legacy DBs)? And what are the advantages of learning Ruby and using an ORM, versus sticking with plain SQL, for making complex database queries?
I'm the DataMapper maintainer, and I think for complex reporting you should use SQL.
While I do think someday we'll have a DSL that provides the power and conciseness of SQL, everything I've seen so far requires you to write more Ruby code than SQL for complex queries. I would much rather maintain a 5 line SQL query than 10-15 lines of Ruby code to describe the same complex operation.
Please note I say complex. If you have something simple, use the ORM's built-in finders. However, I do believe there is a line you can cross where SQL becomes simpler. Now, most apps aren't just reporting. You may have a lot of CRUD-type operations, for which an ORM is perfectly suited and far better than doing those things by hand.
One thing that an ORM will usually provide is some sort of organization for your application logic. You can group code based around each model in the same file. It's usually there that I'll put the complex SQL query, rather than embedding it in the controller, e.g.:
class User
  include DataMapper::Resource

  property :id, Serial
  property :name, String, :length => 1..100, :required => true
  property :age, Integer, :min => 1, :max => 130

  def self.some_complex_query
    repository.adapter.select <<-SQL
      SELECT ...
      FROM ...
      WHERE ...
      ... more complex stuff here ...
    SQL
  end
end
Then I can just generate the report using User.some_complex_query. You could also push the SQL query into a view if you wanted to clean up this code further.
EDIT: By "view" in the above sentence I meant RDBMS view, rather than view in the MVC context. Just wanted to clear up any potential confusion.
If you are writing your queries by hand, you have the chance to optimize them. When I look at that query I see some potential for optimization (E.ICGROUPNAME LIKE '%san-fransisco%' or E.ICGROUPNAME LIKE '%bordeaux%' won't use an index, which means a table scan).
When using an OR mapper (the native objects/tables) for reporting, you have little or no control over the resulting SQL query.
But: you could put that query in a view or stored procedure and map that view/proc with an OR mapper. That way you can optimize your queries and still use all the features of your application framework.
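As a rough sketch of the view approach, the hand-tuned query is stored in the database and the application maps the view like any other table. The example uses Python with sqlite3 instead of a Ruby ORM, and the orders table, the sales_by_region view and the SalesByRegion class are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class SalesByRegion:
    """Hypothetical read-only object mapped to the view, one instance per row."""
    region: str
    total: float


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    INSERT INTO orders (region, amount) VALUES ('north', 10.0), ('north', 5.0), ('south', 7.5);

    -- The hand-tuned reporting query lives in the database as a view.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(amount) AS total FROM orders GROUP BY region;
""")

# The application selects from the view exactly as it would from a table.
report = [
    SalesByRegion(*row)
    for row in conn.execute("SELECT region, total FROM sales_by_region ORDER BY region")
]
```

The reporting SQL can then be tuned in one place without touching the application code that consumes it.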
Unless you're dealing with objects, an ORM is not necessary. It sounds like your friend simply needs to generate reports, in which case pure SQL is just fine so long as he knows what he's doing (e.g. avoiding SQL injection issues).
ORM stands for "Object-Relational Mapping". If you don't have the "O" (objects), then it's probably not a good fit for your app. Where ORMs really shine is in persisting objects to the database and loading them from a database.
ORM stands for Object Relational Mapping, but looking at the query, your friend seems to want a pretty specific table of sums and other items. I've not used Ruby's Sequel, but I've used Hibernate and Python's SQLAlchemy (for Django/TurboGears), and while you can do these sorts of queries, I don't believe that is their strength.
The power of an ORM comes from being able to find Foo->Bar object relationships: say you want all the Bar objects for Foos whose field is greater than X, that sort of thing. Therefore I would not classify an ORM as a "good" solution here, though moving to a real programming language like Ruby and doing the SQL through it instead of Excel is in itself a win.
Just my 2 cents.
In a situation like that, I'd probably write them by hand or use a View (if the DB you're using supports views)
ORMs are used when you have objects (Business Objects). I am therefore assuming that you have an application with which you create and manage the Business Objects that are ultimately saved into the database. If you have, then you have almost definitely got some representation of the relationships, and probably many of the calculations you are going to use in reports. The problem with using SQL to access your database directly for reports is simply maintainability.
You typically put a lot of effort into ensuring that your Business Objects hide any details of their database. You implement business rules and do common calculations in your Business Objects, build a common language for all members of the team, and so on. You then map to the database with an ORM such as Habanero or NHibernate. This is all great, and we do it all in the name of maintainability: you can migrate your application, change your design, etc.
You now go and write SQL to run reports, and over time you have hundreds of reports. Firstly, they often duplicate logic you already have in your Business Objects (usually without any tests). Even worse, maintainability is now stuffed: forget about moving that field from one table to another, splitting that table into two or changing that relationship, because you have a number of reports that are going to break unexpectedly.
The problem with querying through your Domain Objects/Business Objects is simply one of performance.
In summary, if you are using Domain Driven Design or Business Object concepts, try to use these for reports. (You will probably still run some reports directly from the DB using SQL or stored procs for performance reasons, but try to limit these: use your Business Objects first and then use SQL.)
The other option, of course, is using a separate reporting database (as in some of the BI concepts). The mapping from your transactional DB to your reporting DB is then in one place and easily changeable when you want to change your design.
Domain Objects (Business Objects) and ORMs have all the knowledge needed to start building high-performing queries that run directly on the database while using the domain terminology. Let's hope that these continue to evolve to a point where this is a reality.
Until then, if you are using Business Objects in your application, try to use them for reporting, and resort to SQL when performance is an issue.
I am a little bit confused about Dataset compared to ORM (NHibernate or Spring.Net). From my understanding the ORM sits between the application layer and the database layer. It will generate the SQL commands for the application layer. Is this the same as what Dataset does? What is the difference between the Dataset and ORM? What are the advantages and disadvantages for these two methods? Hope the experts in here can explain something.
Thanks,
Fakhrul
There is a BIG difference between them, first of all about the programming model they represent:
The Dataset is based on a Table Model
An ORM (without specifying a particular product or framework) is based on, and tends toward, a Domain Model.
There is another kind of tool that can be used in data scenarios: a Data Mapper (e.g. iBatis.NET).
As other answers before mine did, I think it's important to see what Microsoft says about the DataSet and, better, what Wikipedia says about ORM. But I think (this was the case for me at the beginning) it's more important to understand the difference between them in terms of model. Understanding that will not only clarify the choices behind them but will also make it easier to approach and understand the tools themselves.
As a short explanation, it's possible to say:
Table Model
is a model that represents tabular data in an in-memory structure as closely as possible (and as far as needed). So it's easy to find implementations of concepts such as Table, Columns and Relations. The model concentrates on the table structure, so object orientation is based on that structure, not on the data itself. This model has its own advantages, but in some cases it can be heavy to manage, and it can be difficult to apply domain concepts to the contained data. As previous answers say, implementations like the DataSet let you, or rather force you, to prepare (even if with a tool) the SQL instructions needed to perform actions on the data.
ORM
is a model in which (as mendelt said before me) objects are mapped directly to database objects, principally tables and views (although it's possible to map functions and procedures too). This is generally done in one of two ways: with a mapping file that describes the mapping, or (in the case of .NET or Java) with code attributes. This model is based on objects that represent the data, so object orientation can be applied to them as in normal programs. It requires more attention and caution in certain cases, but generally, once you are confident with it, an ORM is a really powerful tool. An ORM can also be heavy to manage if it is not designed well or, more to the point, not understood well, so it's important to understand the techniques. In an ORM, the tool is principally responsible for generating the SQL instructions needed as operations are performed in code, and in many cases ORMs have an intermediate language (like HQL) to perform operations on objects.
MAPPER
A mapper is a tool that doesn't do everything an ORM does; instead, it maps hand-written SQL instructions to an object model. This kind of tool can be a better solution when you need to write SQL instructions by hand but want to design an application object model to represent the data.
In this "model", objects are mapped to instructions described in a mapping file (generally an XML file, as iBatis.NET and iBATIS (Java) do). A mapper lets you define granular rules in the SQL instructions. In this scenario it is easy to find some ORM concepts too, for example session management.
ORMs and mappers let you apply some very interesting design patterns, which may not be as easy to apply to a Table Model and, in this case, to a DataSet.
First of all, excuse me for this long answer and for my poor English, but an answer like this is what helped me, in the past, to understand the difference between these models and then between their implementations.
The DataSet class is definitely not an ORM; an ORM maps relational data to an object-oriented representation.
It can be regarded as some kind of 'unit of work' though, since it keeps track of the rows that have to be deleted/updated/inserted.
ADO.NET DataSet =
http://msdn.microsoft.com/en-us/library/zb0sdh0b(VS.80).aspx
ORM =
http://en.wikipedia.org/wiki/Object-relational_mapping
(Examples: Developer Express XPO, DataObjects.NET)
An ORM is based on a mapping between objects and tables. That is not the case for a DataSet: a DataSet is itself, in a way, a direct representation of the tables. An ORM needs only a minimum of hand-written SQL, whereas to use a DataSet you write the SQL clauses yourself. So a DataSet is not an ORM.
Look at dataset and ORM.
No, DataSets are not ORMs. They may look like ORMs because DataSets map tables to objects, just as ORMs do; the main difference lies in what objects they map to.
Datasets have their own table and row object types that closely resemble the structure of the database. You're rebuilding part of the database's relational model in objects. Restricting these objects into something resembling a relational database gets around some of the problems inherent in mapping a database to an object model.
An ORM maps the tables and rows from the database into your own object model. The structure of your object model can be optimized for your application instead of resembling a relational database. The ORM takes care of the difficulties in transforming a relational model into an object model.
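The contrast between the two answers above can be made concrete with a small sketch. It is plain Python, not ADO.NET or NHibernate: the dataset dictionary merely imitates a table-shaped structure, and the Customer and Order classes and the map_customer function are hypothetical stand-ins for an application object model and the mapping an ORM would perform.

```python
from dataclasses import dataclass, field

# Table model: generic tables and rows that mirror the database structure.
dataset = {
    "customers": [{"id": 1, "name": "Ann"}],
    "orders": [
        {"id": 100, "customer_id": 1, "amount": 20.0},
        {"id": 101, "customer_id": 1, "amount": 5.0},
    ],
}
# Relations are followed by matching key columns, as in the database.
table_total = sum(r["amount"] for r in dataset["orders"] if r["customer_id"] == 1)


# Object model: classes shaped for the application; a mapper would fill them.
@dataclass
class Order:
    amount: float


@dataclass
class Customer:
    name: str
    orders: list = field(default_factory=list)

    def total_spent(self):
        return sum(order.amount for order in self.orders)


def map_customer(data, customer_id):
    """Hypothetical stand-in for what an ORM does: rows in, object graph out."""
    row = next(r for r in data["customers"] if r["id"] == customer_id)
    orders = [Order(r["amount"]) for r in data["orders"] if r["customer_id"] == customer_id]
    return Customer(row["name"], orders)


ann = map_customer(dataset, 1)
object_total = ann.total_spent()
```

Both halves compute the same number; the difference is that the second half hides keys and joins behind objects whose shape is chosen by the application rather than by the schema.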
DataSet is a DTO, a data transfer object. DataSet itself can't do anything. You can use a DataAdapter (of the provider used) to produce sql or call predefined queries, though it still isn't doing anything.
This is not specific to any language, it's just about best practices. I am using JPA/Hibernate (but it could be any other ORM solution) and I would like to know how you guys deal with this situation:
Let's suppose that you have a query returning something that is not represented by any of your domain classes.
Do you create a specific class to represent that specific query?
Do you return the query result in some other kind of object (array, map...)?
Some other solutions?
I would like to know about your experiences and best practices.
P.S.
Currently I am creating specific objects for specific queries.
We have a situation that sounds similar to yours.
We use separate objects for reporting data that spans several domain objects. Our convention is that these will be backed by a view in the database, so we have come to call them view objects. We generally use them for summarising complex data into a flat format.
I typically write a function that performs a query using SQL and then puts the results into either a list or dictionary (in Java, I'd use either an ArrayList or a HashMap).
If I found myself doing this a lot, I'd probably create a new file to hold all of these queries. Otherwise I'd just make them functions in whatever file they were needed/used.
Since we're talking Java specifically, I would certainly not create a new class in a separate file. However, for queries needed in only one class, you could create a private static inner class with only the function(s) needed to generate the query(s) needed by that class.
The idea of wrapping that functionality up in some sort of manager is always nice. It allows for better testing, and therefore better management of schema changes.
It also allows for easier reuse in the application. NEVER just put the SQL in directly! For Hibernate I have found HQL great for just this, in particular if you can use named queries. Also be careful when adding filter values etc.: don't use string appending, use parameters (can we say SQL injection?). Even if the SQL is dynamic in terms of the join or where criteria, having a function in some sort of manager is always best.
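A minimal sketch of that advice, with the query wrapped in one function, a dynamic WHERE clause, and values bound as parameters rather than appended to the string. It uses Python with sqlite3 rather than Hibernate/HQL; the find_tasks function and the task table are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, owner TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO task (owner, status) VALUES (?, ?)",
    [("ann", "open"), ("ann", "closed"), ("bob", "open")],
)


def find_tasks(owner=None, status=None):
    """Hypothetical manager function: dynamic WHERE clause, bound values."""
    clauses, params = [], []
    if owner is not None:
        clauses.append("owner = ?")
        params.append(owner)
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    sql = "SELECT id, owner, status FROM task"
    if clauses:
        # Only fixed SQL fragments are concatenated; values travel separately.
        sql += " WHERE " + " AND ".join(clauses)
    return conn.execute(sql + " ORDER BY id", params).fetchall()


open_for_ann = find_tasks(owner="ann", status="open")
# A hostile value is treated as data, so it simply matches nothing.
injected = find_tasks(owner="ann' OR '1'='1")
```

The structure of the statement may vary with the arguments, but user-supplied values never become part of the SQL text.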
@DrPizza
I will be more specific. We have three tables in a database
USER
PROJECT
TASK
USER to TASK 1:n
PROJECT to TASK 1:n
I have a query that returns a list of all projects but showing also some grouped information (all tasks, open tasks, closed tasks). When returned, the query looks like this
PROJECTID: 1
NAME: New Web Site
ALLTASK: 10
OPENTASK: 7
CLOSEDTASK: 3
I don't have any domain class that could represent this information, and I don't want to create specific methods in the Project class (like getAllTasks, getOpenTasks) because each of these methods would trigger a new query.
So the question is:
Do I create a new class (something like ProjectTasksQuery) just to hold that information?
Do I return the information in an array or map?
Something else?
You might feel better after reading about Data Transfer Objects. Some people plain don't like them, but if it feels like a good fit to you, it probably is.
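For the project/task example in the question, a Data Transfer Object could look roughly like the sketch below. It is written in Python with sqlite3 rather than JPA/Hibernate; the ProjectTaskSummary class and the status column on the task table are assumptions, since the question does not show the schema in detail.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectTaskSummary:
    """DTO for one row of the grouped query; not a domain class."""
    project_id: int
    name: str
    all_tasks: int
    open_tasks: int
    closed_tasks: int


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE task (id INTEGER PRIMARY KEY, project_id INTEGER, status TEXT);
    INSERT INTO project VALUES (1, 'New Web Site');
""")
conn.executemany(
    "INSERT INTO task (project_id, status) VALUES (1, ?)",
    [("open",)] * 7 + [("closed",)] * 3,
)

# One grouped query fills every counter, instead of one query per getter.
rows = conn.execute("""
    SELECT p.id, p.name,
           COUNT(t.id),
           COALESCE(SUM(CASE WHEN t.status = 'open' THEN 1 ELSE 0 END), 0),
           COALESCE(SUM(CASE WHEN t.status = 'closed' THEN 1 ELSE 0 END), 0)
    FROM project p LEFT JOIN task t ON t.project_id = p.id
    GROUP BY p.id, p.name
""")
summaries = [ProjectTaskSummary(*row) for row in rows]
```

Compared with an array or map, the named fields document what each column means, and the Project domain class stays free of reporting-only methods.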