For Doctrine ORM, why is DBAL needed in addition to PDO?

For Doctrine ORM, why is DBAL needed in addition to PDO? - orm

I've been working with Doctrine 2 ORM for some time, and there's something I've never quite understood.
What purpose does the Doctrine DBAL (database abstraction layer) serve? PDO itself a database abstraction layer, so why can't the ORM work directly with PDO?
I'm not trying to find a way around using DBAL or anything. I've just never understood why the extra layer is needed, and can't seem to find a clear answer in the documentation.

No, PDO is a "data-access layer", not a "database abstraction layer". Meaning that you can switch databases and still make the same method calls, but PDO will not re-write sql queries to match the selected database or emulate any database functionality.
Per PHP PDO docs:
PDO provides a data-access abstraction layer, which means that, regardless of which database you're using, you use the same functions to issue queries and fetch data. PDO does not provide a database abstraction; it doesn't rewrite SQL or emulate missing features. You should use a full-blown abstraction layer if you need that facility.

Doctrine2 actually supports some non-PDO databases so that is one reason.
It's also useful to look at the source code. The Connection class (for example) has a nice:
public function insert($tableName, array $data)
Which inserts a new record complete with escaping.

Related

ORM vs Query Builder vs Native driver

I am really confused because I have heard a lot about ORM, I guess the best practise is to used ORM to not write SQL queries.
I tried Django ORM, Laravel ORM, Hibernate and they take much longer time to return the data.
For example: I code a script to insert a dataset using native queries and Django ORM, Django takes 4 minutes and my script take 1 minutes, more or less.
I think the response time is the more important thing when you code an api.
So, is not a better option to use the query builder?

ORM is basically a way to introduce concepts of domain and type safety into work with databases. When you can have classes and properties with types, it's more easier to write logically structured and type safe code, also reuse existing functionality. Following these benefits, the main advantage of ORM is safety and clearness.
Also, ORM could be a way to introduce an abstraction layer, allowing semi-painless migration from one database to another as details of interaction are hidden inside ORM engine.
For small projects without many developers working on or for application with short lifetime, it's obviously not a necessity, and introduction of an ORM can be considered as a disadvantage.
Addressing your example with slow insertion of data and accusation of ORM, it can be said that probably you aren't using ORM correctly or you are using it not in the most optimized for the task way. For example, your script can use no transactions at all or probably use less restrictive isolation level whereas ORM can target safety over speed, creating a transaction on each insert or utilizing more restrictive isolation level.

Can I use straight SQL in Django models?

And, if I can, does that mean I lose my advantage of treating the results as objects? I find complex queries confusing in many ORMs, not just Django's. But, it is probably because I have never really used an ORM. Does anyone use straight up SQL anymore?
edit: Am I defeating the purpose of having a framework if I bypass the ORM completely? They all have a "nifty" ORM, but when it comes to queries with lots of subqueries, derived tables, it doesn't look pretty.

Using Django's QuerySet API you have different possibilities:
You can use extra() which will return a queryset which evaluates to model objects. Therefore it is, as the name says, somehow limited, because for returning model instances it is necessary to eg. query the model's table. But you have the possibility to add additional SQL eg. the WHERE or ORDER clause. Querysets that use extra() can still use the features of the ORM - like chaining multiple filter() for example.
raw() returns a RawQueryset which also can be iterated over to get model instances, but you loose a lot of features that the ORM would normally provide.
And of course you can execute SQL directly, using a low level connection cursor API (no model instances of course).
Study the documentation on raw queries, there's also a lot of information on eg. how to map a model's fields on the data coming from a raw query and documeting a few gotchas when passing parameters into the query.
To also answer your edited question: I wouldn't use raw SQL when you can do it with the ORM, but of course the ORM is limited and if you need to do some more complex stuff you will always have to switch to SQL (but sometimes using extra() is enough-so you can still use the advantages of the ORM). Don't forget that the ORM works with every DB backend, while the custom SQL might not work with every database.

You can use raw SQL to either return objects; or if you want you can bypass the ORM completely.

Simple Framework Mechanics

I'm building a simple framework and hoping it will help me better understand OOP. Well, I've come across my first hurdle.
I'm naturally using MVC pattern and I have a user model. I then have a 'table' which manages the collection of user objects. I have an abstract model and table. Similar to Zend Framework layout.
Anyway, I'm now wondering what is the best way to execute a query? I have a database object which manages the database layer. Currently using mysqli object.
Trouble is my connection is a member of my application object, how do I access that from within my table object to execute a query? I obviously can't create a new object because that will connect to the database again. I need to reuse this same db object.
Any ideas? I understand if this isn't optimal design. Right now I'm just looking for advice.
Edit: Should I just use Doctrine?

doctrine or not?
If your intention is to learn how to write SQL Queries and to create something for your own, use mysqli or PDO.
If you would like to learn how to use a real object oriented database layer, use doctrine. With 3 pages tutorial, you can do the most general things. However some very special cases are better coded in raw sql than with doctrine. But if you know how to do things with doctrine it is a very useful library and you end up writing better, consistenter, more readabler and extensiabler (:-)) code.

you should add such dependencies in constructor.
You are able to use a static db object too, but especially for a framework like project this will get you to hell, if you try to add unit-tests or you want to refactor.
In named Patterns, you want to use dependency injection for this

What will be the benefits of NHibernate in a data retrieval only scenario?

We have been suggested to use NHibernate for our project. But the point is our application will only be retrieving the data from the database so are there any benefits of using NHibernate in this scenario?
One more thing, what is the query execution plan in NHIbernate does it support something like prepared statements or the so called pre complied statements.?

I agree to the answers above, but there is one more reason for using nhibernate: Your completely independend of the underlaying database system. You can switch from mysql to oracle and the only thing you have to do, is to change the settings. the access to the database stays exactly the same.

NHibernate is useful is you need to map data from a database table into a .NET class. Even if you're only doing select queries, it still might be useful if you need to pass the data objects to a client tier (web page, desktop app, etc.) for display. Working with plain objects can be easier than working with a DataSet or other ADO.NET data class in a presentation layer.
NHibernate does have the ability to parse/pre-compile queries if you put them in the mapping file.

The benefit for using NHibernate in a read only scenario is that you would not need to map the results of queries back to .net objects as the runtime would do this for you. Also, it provides a more object oriented query syntax (you can also use LINQ), and you can take advantage of lazy loading.
I don't believe NHibernate can use prepared statements unless you are having it call stored procedures.

What are the advantages of using an ORM? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
As a web developer looking to move from hand-coded PHP sites to framework-based sites, I have seen a lot of discussion about the advantages of one ORM over another. It seems to be useful for projects of a certain (?) size, and even more important for enterprise-level applications.
What does it give me as a developer? How will my code differ from the individual SELECT statements that I use now? How will it help with DB access and security? How does it find out about the DB schema and user credentials?
Edit: #duffymo pointed out what should have been obvious to me: ORM is only useful for OOP code. My code is not OO, so I haven't run into the problems that ORM solves.

I'd say that if you aren't dealing with objects there's little point in using an ORM.
If your relational tables/columns map 1:1 with objects/attributes, there's not much point in using an ORM.
If your objects don't have any 1:1, 1:m or m:n relationships with other objects, there's not much point in using an ORM.
If you have complex, hand-tuned SQL, there's not much point in using an ORM.
If you've decided that your database will have stored procedures as its interface, there's not much point in using an ORM.
If you have a complex legacy schema that can't be refactored, there's not much point in using an ORM.
So here's the converse:
If you have a solid object model, with relationships between objects that are 1:1, 1:m, and m:n, don't have stored procedures, and like the dynamic SQL that an ORM solution will give you, by all means use an ORM.
Decisions like these are always a choice. Choose, implement, measure, evaluate.

ORMs are being hyped for being the solution to Data Access problems. Personally, after having used them in an Enterprise Project, they are far from being the solution for Enterprise Application Development. Maybe they work in small projects. Here are the problems we have experienced with them specifically nHibernate:
Configuration: ORM technologies require configuration files to map table schemas into object structures. In large enterprise systems the configuration grows very quickly and becomes extremely difficult to create and manage. Maintaining the configuration also gets tedious and unmaintainable as business requirements and models constantly change and evolve in an agile environment.
Custom Queries: The ability to map custom queries that do not fit into any defined object is either not supported or not recommended by the framework providers. Developers are forced to find work-arounds by writing adhoc objects and queries, or writing custom code to get the data they need. They may have to use Stored Procedures on a regular basis for anything more complex than a simple Select.
Proprietery binding: These frameworks require the use of proprietary libraries and proprietary object query languages that are not standardized in the computer science industry. These proprietary libraries and query languages bind the application to the specific implementation of the provider with little or no flexibility to change if required and no interoperability to collaborate with each other.
Object Query Languages: New query languages called Object Query Languages are provided to perform queries on the object model. They automatically generate SQL queries against the databse and the user is abstracted from the process. To Object Oriented developers this may seem like a benefit since they feel the problem of writing SQL is solved. The problem in practicality is that these query languages cannot support some of the intermediate to advanced SQL constructs required by most real world applications. They also prevent developers from tweaking the SQL queries if necessary.
Performance: The ORM layers use reflection and introspection to instantiate and populate the objects with data from the database. These are costly operations in terms of processing and add to the performance degradation of the mapping operations. The Object Queries that are translated to produce unoptimized queries without the option of tuning them causing significant performance losses and overloading of the database management systems. Performance tuning the SQL is almost impossible since the frameworks provide little flexiblity over controlling the SQL that gets autogenerated.
Tight coupling: This approach creates a tight dependancy between model objects and database schemas. Developers don't want a one-to-one correlation between database fields and class fields. Changing the database schema has rippling affects in the object model and mapping configuration and vice versa.
Caches: This approach also requires the use of object caches and contexts that are necessary to maintian and track the state of the object and reduce database roundtrips for the cached data. These caches if not maintained and synchrnonized in a multi-tiered implementation can have significant ramifications in terms of data-accuracy and concurrency. Often third party caches or external caches have to be plugged in to solve this problem, adding extensive burden to the data-access layer.
For more information on our analysis you can read:
http://www.orasissoftware.com/driver.aspx?topic=whitepaper

At a very high level: ORMs help to reduce the Object-Relational impedance mismatch. They allow you to store and retrieve full live objects from a relational database without doing a lot of parsing/serialization yourself.
What does it give me as a developer?
For starters it helps you stay DRY. Either you schema or you model classes are authoritative and the other is automatically generated which reduces the number of bugs and amount of boiler plate code.
It helps with marshaling. ORMs generally handle marshaling the values of individual columns into the appropriate types so that you don't have to parse/serialize them yourself. Furthermore, it allows you to retrieve fully formed object from the DB rather than simply row objects that you have to wrap your self.
How will my code differ from the individual SELECT statements that I use now?
Since your queries will return objects rather then just rows, you will be able to access related objects using attribute access rather than creating a new query. You are generally able to write SQL directly when you need to, but for most operations (CRUD) the ORM will make the code for interacting with persistent objects simpler.
How will it help with DB access and security?
Generally speaking, ORMs have their own API for building queries (eg. attribute access) and so are less vulnerable to SQL injection attacks; however, they often allow you to inject your own SQL into the generated queries so that you can do strange things if you need to. Such injected SQL you are responsible for sanitizing yourself, but, if you stay away from using such features then the ORM should take care of sanitizing user data automatically.
How does it find out about the DB schema and user credentials?
Many ORMs come with tools that will inspect a schema and build up a set of model classes that allow you to interact with the objects in the database. [Database] user credentials are generally stored in a settings file.

If you write your data access layer by hand, you are essentially writing your own feature poor ORM.
Oren Eini has a nice blog which sums up what essential features you may need in your DAL/ORM and why it writing your own becomes a bad idea after time:
http://ayende.com/Blog/archive/2006/05/12/25ReasonsNotToWriteYourOwnObjectRelationalMapper.aspx
EDIT: The OP has commented in other answers that his code base isn't very object oriented. Dealing with object mapping is only one facet of ORMs. The Active Record pattern is a good example of how ORMs are still useful in scenarios where objects map 1:1 to tables.

Top Benefits:
Database Abstraction
API-centric design mentality
High Level == Less to worry about at the fundamental level (its been thought of for you)
I have to say, working with an ORM is really the evolution of database-driven applications. You worry less about the boilerplate SQL you always write, and more on how the interfaces can work together to make a very straightforward system.
I love not having to worry about INNER JOIN and SELECT COUNT(*). I just work in my high level abstraction, and I've taken care of database abstraction at the same time.
Having said that, I never have really run into an issue where I needed to run the same code on more than one database system at a time realistically. However, that's not to say that case doesn't exist, its a very real problem for some developers.

I can't speak for other ORM's, just Hibernate (for Java).
Hibernate gives me the following:
Automatically updates schema for tables on production system at run-time. Sometimes you still have to update some things manually yourself.
Automatically creates foreign keys which keeps you from writing bad code that is creating orphaned data.
Implements connection pooling. Multiple connection pooling providers are available.
Caches data for faster access. Multiple caching providers are available. This also allows you to cluster together many servers to help you scale.
Makes database access more transparent so that you can easily port your application to another database.
Make queries easier to write. The following query that would normally require you to write 'join' three times can be written like this:
"from Invoice i where i.customer.address.city = ?" this retrieves all invoices with a specific city
a list of Invoice objects are returned. I can then call invoice.getCustomer().getCompanyName(); if the data is not already in the cache the database is queried automatically in the background
You can reverse-engineer a database to create the hibernate schema (haven't tried this myself) or you can create the schema from scratch.
There is of course a learning curve as with any new technology but I think it's well worth it.
When needed you can still drop down to the lower SQL level to write an optimized query.

Most databases used are relational databases which does not directly translate to objects. What an Object-Relational Mapper does is take the data, create a shell around it with utility functions for updating, removing, inserting, and other operations that can be performed. So instead of thinking of it as an array of rows, you now have a list of objets that you can manipulate as you would any other and simply call obj.Save() when you're done.
I suggest you take a look at some of the ORM's that are in use, a favourite of mine is the ORM used in the python framework, django. The idea is that you write a definition of how your data looks in the database and the ORM takes care of validation, checks and any mechanics that need to run before the data is inserted.

What does it give me as a developer?
Saves you time, since you don't have to code the db access portion.
How will my code differ from the individual SELECT statements that I use now?
You will use either attributes or xml files to define the class mapping to the database tables.
How will it help with DB access and security?
Most frameworks try to adhere to db best practices where applicable, such as parametrized SQL and such. Because the implementation detail is coded in the framework, you don't have to worry about it. For this reason, however, it's also important to understand the framework you're using, and be aware of any design flaws or bugs that may open unexpected holes.
How does it find out about the DB schema and user credentials?
You provide the connection string as always. The framework providers (e.g. SQL, Oracle, MySQL specific classes) provide the implementation that queries the db schema, processes the class mappings, and renders / executes the db access code as necessary.

Personally I've not had a great experience with using ORM technology to date. I'm currently working for a company that uses nHibernate and I really can't get on with it. Give me a stored proc and DAL any day! More code sure ... but also more control and code that's easier to debug - from my experience using an early version of nHibernate it has to be added.

Using an ORM will remove dependencies from your code on a particular SQL dialect. Instead of directly interacting with the database you'll be interacting with an abstraction layer that provides insulation between your code and the database implementation. Additionally, ORMs typically provide protection from SQL injection by constructing parameterized queries. Granted you could do this yourself, but it's nice to have the framework guarantee.
ORMs work in one of two ways: some discover the schema from an existing database -- the LINQToSQL designer does this --, others require you to map your class onto a table. In both cases, once the schema has been mapped, the ORM may be able to create (recreate) your database structure for you. DB permissions probably still need to be applied by hand or via custom SQL.
Typically, the credentials supplied programatically via the API or using a configuration file -- or both, defaults coming from a configuration file, but able to be override in code.

While I agree with the accepted answer almost completely, I think it can be amended with lightweight alternatives in mind.
If you have complex, hand-tuned SQL
If your objects don't have any 1:1, 1:m or m:n relationships with other objects
If you have a complex legacy schema that can't be refactored
...then you might benefit from a lightweight ORM where SQL is is not
obscured or abstracted to the point where it is easier to write your
own database integration.
These are a few of the many reasons why the developer team at my company decided that we needed to make a more flexible abstraction to reside on top of the JDBC.
There are many open source alternatives around that accomplish similar things, and jORM is our proposed solution.
I would recommend to evaluate a few of the strongest candidates before choosing a lightweight ORM. They are slightly different in their approach to abstract databases, but might look similar from a top down view.
jORM
ActiveJDBC
ORMLite

my concern with ORM frameworks is probably the very thing that makes it attractive to lots of developers.
nameley that it obviates the need to 'care' about what's going on at the DB level. Most of the problems that we see during the day to day running of our apps are related to database problems. I worry slightly about a world that is 100% ORM that people won't know about what queries are hitting the database, or if they do, they are unsure about how to change them or optimize them.
{I realize this may be a contraversial answer :) }

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas