As title says:
Should the query side of CQRS applications call the database directly in the controllers/handlers and skip application services, domains and repositories?
What if the query logic is complex and/or I also need to publish an event (related to the read operation) to a message broker? In what layer would that logic fit?
The query side will only contain the methods for getting data, so it can and should be really simple. The domain model from the command side is definitely not part of the query side. The queries are separate from the model we have in our domain. An abstraction on top of your persistence is not required either.
Simple query logic makes your life easier. The secret sauce of CQRS is polyglot persistence: you can maintain multiple denormalized representations of your data, also known as materialized views, each tailored to a query need. These projections can live in different databases depending on what each query requires. If you do that, the query side tends to become simple.
For example, if you have a projection of something that is an entity in your domain, like a customer, you can persist it in Mongo and query it by id: really simple and performant. If you have a report spanning multiple orders, you can persist those in a relational database and run SQL queries: again simple and performant. This way you end up with GET queries that run database queries and return the read models without any additional mapping.
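A minimal sketch of such a query handler, assuming a hypothetical read-model table customer_view that the command side keeps up to date. SQLite from the Python standard library stands in for the document store only so the example is self-contained; the table, column and function names are made up for illustration.

```python
import json
import sqlite3

# Hypothetical read model: one denormalized document per customer.
# In a real system a projection on the command side would maintain it.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_view (customer_id TEXT PRIMARY KEY, document TEXT)"
)
conn.execute(
    "INSERT INTO customer_view VALUES (?, ?)",
    ("c-1", json.dumps({"id": "c-1", "name": "Ada", "open_orders": 2})),
)


def get_customer(connection, customer_id):
    """Query handler: read the stored document and return it unchanged."""
    row = connection.execute(
        "SELECT document FROM customer_view WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


print(get_customer(conn, "c-1"))
```

The handler talks to the database directly: no repository, no domain object, and no mapping step beyond decoding the stored document.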
Having said that, this is the typical use case, but your read models can also be slightly different queries on the same table of a database. This makes the query a bit more complex, but it might be good enough too.
I also don't think that you should publish an event from the query side. What would that event be?
While most SQL Databases allow you to create a view, Microsoft Access has saved queries. I have read that Access Queries are not the same thing as SQL views, but that’s a sweeping statement.
I am aware that they have some differences in detail. For example Access saves SELECT * as is, while most saved views spell out the field list.
Aside from details such as these, is there a fundamental difference between the two?
Thanks
An Access "saved query" is more than a view in SQL Server (where "more" doesn't mean it is better). In SQL Server, a view is defined by its SQL text and its execution plan. You can also attach additional information, such as a description or user-defined variables with user-defined values, which has no influence on the view itself.
In Access you work with QueryDef objects, which are in fact the "saved queries". They contain a lot more than the SQL text, which is only one property of the QueryDef object. For example, you can define parameters with the PARAMETERS clause, which can be used much like @-variables in SQL Server stored procedures and functions. That is something SQL Server views don't have. A QueryDef object also has a saved execution plan, which is why Microsoft recommends using a QueryDef as, for example, a form's RecordSource instead of a dynamic SQL command in the same place. The JET/ACE query optimizer's result can also be made visible with some registry tricks; it just isn't part of the Access GUI, so most people don't know that Access has an execution plan as well.
QueryDef objects also contain formatting properties, captions to be displayed in Datasheet view, lookup definitions for combo boxes, ODBC connection strings and a lot more; you can find them all in the Access help.
So Access QueryDefs contain a lot that only affects how the result is displayed, which makes sense for a frontend developed with Access, but they don't have many advantages over SQL Server views. One simple difference is the SQL language: independent of the backend you use, you are working with Access SQL, and this is a very basic SQL. T-SQL on SQL Server, for example, is a far more powerful language. For instance, you can query a hierarchical structure like a BOM (Bill Of Materials) with a single statement, because T-SQL supports recursive queries, which Access SQL does not. In Access this would only be possible by calling VBA functions from Access SQL, which slows the whole query down a lot.
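To illustrate what "one statement" means here, the following sketch walks a small, made-up BOM with a recursive common table expression. It runs on SQLite through Python's standard library purely so it can be executed anywhere; T-SQL uses the same construct but writes WITH without the RECURSIVE keyword. The table bom and its contents are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE bom (parent TEXT, child TEXT, qty INTEGER);
    INSERT INTO bom VALUES
        ('bike',  'wheel', 2),
        ('bike',  'frame', 1),
        ('wheel', 'spoke', 32),
        ('wheel', 'tire',  1);
    """
)

# Walk the hierarchy in one statement, multiplying quantities on the way down.
rows = conn.execute(
    """
    WITH RECURSIVE parts(child, total_qty, depth) AS (
        SELECT child, qty, 1 FROM bom WHERE parent = ?
        UNION ALL
        SELECT b.child, p.total_qty * b.qty, p.depth + 1
        FROM bom AS b
        JOIN parts AS p ON b.parent = p.child
    )
    SELECT child, total_qty, depth FROM parts ORDER BY depth, child
    """,
    ("bike",),
).fetchall()

for child, total_qty, depth in rows:
    print("  " * depth + f"{child} x{total_qty}")
```

The first SELECT seeds the result with the direct components of the top-level item, and the second SELECT is applied repeatedly to whatever the previous step found until no new rows appear.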
Of course a QueryDef object in Access can also be a so-called pass-through query, which executes, for example, T-SQL directly without going through Access SQL. But because the SQL text is saved locally in Access, SQL Server treats it as dynamic SQL: the text is sent each time it is executed and SQL Server has no saved view for it, so all the advantages of a saved view are lost. It's better to avoid them, or to use them only to execute saved views, functions or stored procedures on SQL Server.
A QueryDef object is moreover a DAO object, which means you are always working with DAO data types. So even with a pass-through query, the data is always converted from SQL Server (or other database) data types to DAO data types.
Easier deployment, as mentioned above: there is no big difference between a view and a QueryDef in Access if both are used for frontend purposes such as a form's RecordSource property. The reason is that both have to be known to the frontend. The QueryDef needs to be changed locally. The view can be changed in the backend, but in most cases it is attached to the frontend as a linked table, so after a change you need to delete that link and recreate it, which means redeploying the frontend as well. (That's why I personally prefer ADPs to ACCDBs: in an ADP I work directly with the view and not with any QueryDef, so a change in the backend is enough to be reflected in the frontend.)
Independent of that, you would also need to redeploy the frontend if you change a field name in a view that the frontend uses.
It is a different matter if you use the view only in the backend to assemble data for other backend purposes, such as in a stored procedure or another view. If such views are not linked to the frontend, you don't need to redeploy the frontend. So, as with stored procedures and functions, views let you quickly change something in the backend when you need to fix something that doesn't affect the frontend directly. For example, if you concatenate two text fields with a "." under an alias name and someone tells you that it must now be a "-" instead, you can simply change the view and it's done, with nothing to change in the frontend (provided the frontend has no further logic that checks for the dot).
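The separator example can be sketched as follows. This is only an illustration on SQLite (Python standard library), with a made-up table article and view article_label; on SQL Server you would use ALTER VIEW and the + operator for concatenation instead of dropping and recreating the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE article (prefix TEXT, code TEXT);
    INSERT INTO article VALUES ('AB', '100');
    CREATE VIEW article_label AS
        SELECT prefix || '.' || code AS label FROM article;
    """
)


def load_labels(connection):
    """'Frontend' query: it only knows the view name and the alias."""
    return [row[0] for row in connection.execute("SELECT label FROM article_label")]


before = load_labels(conn)

# Backend-only fix: redefine the view with a different separator.
conn.executescript(
    """
    DROP VIEW article_label;
    CREATE VIEW article_label AS
        SELECT prefix || '-' || code AS label FROM article;
    """
)

after = load_labels(conn)
print(before, after)
```

The function load_labels is not touched between the two calls, which is the point: as long as the view name and alias stay the same, the consumer does not need to change.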
SELECT *
"Spells out the field list" is indeed what a SQL Server view does. The problem is that it saves the list of fields the table has at the time you save the view, but it doesn't save this list visibly. If you open the view's SQL text you will only ever see the "*", not the field list the view has stored. This is a big problem in SQL Server, because you would expect the view to return all fields the table has at any time (that's what Access does in a QueryDef object with SELECT *). So even with a tool like RedGate's free SQL Search you would not find that view when searching for the name of a changed table field, because the field list is not part of the SQL text.
In general, avoid "SELECT *" wherever possible, as it causes more problems than it has advantages. Always list the fields you really want in the result. In Access those field names are changed automatically if you rename a field in a table (if you have AutoRename on in the Access options); in SQL Server you can search for all objects using a specific field and change it.
One exception is a CTE in SQL Server where the final SELECT selects all fields of the previous SELECTs in the CTE. Here the asterisk is no problem, as the field names are (or should be) listed in the previous SELECTs of the same query. But in general it's better to avoid it in production code.
These are only examples of the differences between the two; there are a lot more, like those mentioned above (indexed views, security model) or things like schema names, but this should give you a picture.
Very broadly speaking Views offer:
Performance
Benefit from Execution Plans, which in the case of non-indexed Views analyse the query using the View together with the queries that make up the View definition. These plans are then stored so that repeated and/or similar queries can retrieve data faster. For reference: View Resolution
Indexed Views for situations such as when the underlying tables are not hugely transactional (OLTP) and dataset is large and requires aggregation such as in OLAP sources. For reference: Improving Performance with SQL Server 2008 Indexed Views
Security
Allows you to present a subset of data without granting access to the base Views or Tables that make up the View.
Easier Deployment
With Access, if you change a saved query, you will most likely have to roll out that Access database to all users. With a View, you make the change once and it affects all users.
Regarding the last point, that assumes you aren't changing identifiers in the View that are referenced elsewhere. A simple example would be changing a column name in the View. This would most likely require name changes in other dependent database objects or external tools that access it.
I am aware that they have some differences in detail. For example Access saves SELECT * as is, while most saved views spell out the field list.
That's not strictly true. It's generally preferred to identify columns by name in a View, since a View represents a subset of data and naming the columns restricts what the end user sees. That said, you can restrict the columns in any SQL query, whether it's an Access query or not. And you will still come across Views with SELECT *, so it's not a difference per se.
If I use SQLAlchemy's ORM to create objects and store them, does that mean I pretty much also only retrieve the data from the DB via SQLAlchemy? Will the underlying tables created by SQLAlchemy ORM still be sane? Can I still query the DB directly and have useful findings?
The ORM will only create and modify the database records as they're defined. You'll be able to query them just as you normally would. Using SQLAlchemy does not limit your normal database access in any way. SQLAlchemy can also log the queries it issues, so you can see exactly what they do. It's nothing like HTML generation, where you then don't want to look at the HTML it produced.
We are working on a sync application using the ColdFusion 9.0.1 ActionScript ORM Library for AIR applications. Since this application should also work smoothly offline, there is a list of clients that has to be loaded when a user logs in, so we fetch data from all the required tables when the application loads (is that the right way?). Once we have that data, we have to filter the clients based on the user who logs in. The query required for this is complex, with joins between 5-6 tables and a WHERE clause. What I found is that with the Coldfusion.Air.Session class we can only load objects of tables with a simple WHERE clause. There is a non-ORM way to load the data, but I don't think that is the right method. Is there any way to load data with such complex queries using this ORM framework?
Thanks,
Gaurav
Are you using any CF code to send data back to your application? Have you tried HQL?
In other words, you can write a standard cfquery with dbtype="hql".
This will let you do almost anything you can do with a standard cfquery.
I am not directly familiar with the ActionScript ORM Library for AIR.
I'm developing an iOS application that's a manager/viewer for another project. The idea is the app will be able to process the data stored in a database into a number of visualizations-- the overall effect being similar to cacti. I'm making the visualizations fully user-configurable: the user defines what she wants to see and adds restrictions.
She might specify, for instance, to graph a metric over the last three weeks with user accounts that are currently active and aren't based in the United States.
My problem is that the only design I can think of is more or less passing direct SQL from the iOS app to the backend server to be executed against the database. I know it's bad practice and everything should be written in terms of stored procedures. But how else do I maintain enough flexibility to keep fully user-defined queries?
While the application does compose the SQL, direct SQL is never visible or injectable by the user. That's all abstracted away in UIDateTimeChoosers, UIPickerViews, and the like.
Is all of the data in the database available to all of the users, or do you only permit each user to access a subset of the data? If the latter, simply restricting the database login to read-only access isn't enough to secure your data.
As a trivial example, a user could compromise your interface in order to submit the query SELECT password, salt FROM users WHERE login = 'admin', hijack the response to get at the raw data, and brute force your admin password. As the popularity of an app grows, the pool of malicious users grows more than linearly, until eventually their collective intelligence exceeds that of your team; you shouldn't put yourself in a situation where success will be your downfall.
You could take the SQL query sent by the client application and try to parse it server-side in order to apply appropriate restrictions on the query, to fence the user in, so to speak. But getting there would require you to write a mini SQL parser in your server code, and who wants to do all that work? It's much easier to write code that can write SQL than it is to write code that can read it.
My team solved a similar problem for a reporting interface in a rather complex web application, and our approach went something like this:
Since you already intend to use a graphical interface to build the query, it would be fairly easy to turn the raw data from the interface elements into a data structure that represents the user's input (and in turn, the query). For example, a user might specify, using your interface, the condition that they want the results to be confined to those collected on May 5, 2010 by everyone but John. (Suppose that John's UserID is 3.) Using a variant of the JSON format my team used, you would simply rip that data from the UI into something like:
{ "ConditionType": "AND",
"Clauses": [
{ "Operator": "Equals",
"Operands": [
{ "Column": "CollectedDate" },
{ "Value": "2010-05-05" }
]
},
{ "Operator": "NotEquals",
"Operands": [
{ "Column": "CollectedByUserID" },
{ "Value": 3 }
]
}
]
}
On the client side, creating this kind of data structure is pretty much isomorphic to the task of creating an SQL query, and is perhaps somewhat easier, since you don't have to worry about SQL syntax.
There are subtleties here that I'm glossing over. This only represents the WHERE part of the query, and would have to live in a larger object ({ "Select": ..., "From": ..., "Where": ..., "OrderBy": ... }). More complicated scenarios are possible, as well. For example, if you require the user to be able to specify multiple tables and how they JOIN together, you have to be more specific when specifying a column as an operand in a WHERE clause. But again, all of this is work you would have to do anyway to build the query directly.
The server would then deserialize this structure. (It's worth pointing out that the column names provided by the user shouldn't be taken dirty – we mapped them onto a list of allowed columns in our application; if the column wasn't on the list, deserialization failed and the user got an error message.) With a simple object structure to work with, making changes to the query is almost trivial. The server application can modify the list of WHERE clauses to apply appropriate data access restrictions. For example, you might say (in pseudo-code) Query.WhereClauses.Add(new WhereClause(Operator: 'Equals', Operands: { 'User.UserID', LoggedInUser.UserID } )).
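A rough sketch of those two server-side steps, working on the deserialized structure as plain Python dictionaries. The allow-list contents, the OwnerUserID column and both function names are assumptions made for the example, not part of the original system.

```python
# Hypothetical allow-list: the only column names a client may reference.
ALLOWED_COLUMNS = {"CollectedDate", "CollectedByUserID", "OwnerUserID"}


def validate_columns(node, allowed=ALLOWED_COLUMNS):
    """Raise ValueError if any operand names a column outside the allow-list."""
    for clause in node.get("Clauses", []):
        validate_columns(clause, allowed)
    for operand in node.get("Operands", []):
        column = operand.get("Column")
        if column is not None and column not in allowed:
            raise ValueError(f"column not allowed: {column!r}")


def restrict_to_user(where, user_id):
    """Wrap the user's condition so it can only match the caller's own rows."""
    owner_clause = {
        "Operator": "Equals",
        "Operands": [{"Column": "OwnerUserID"}, {"Value": user_id}],
    }
    return {"ConditionType": "AND", "Clauses": [where, owner_clause]}


user_where = {
    "ConditionType": "AND",
    "Clauses": [
        {"Operator": "NotEquals",
         "Operands": [{"Column": "CollectedByUserID"}, {"Value": 3}]},
    ],
}

validate_columns(user_where)           # raises if a column is not on the list
restricted = restrict_to_user(user_where, user_id=42)
print(restricted["Clauses"][1])
```

Wrapping the client's whole condition in a new AND node, rather than appending to its clause list, keeps the restriction in force even if the client's top-level condition is an OR.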
The server code then passes the object into a relatively simple query builder that walks the object and spits back an SQL query string. This is easier than it sounds, but make sure that all of the user-provided parameters are passed in cleanly. Don't sanitize – use parameterized queries.
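Such a builder might look roughly like this. It is a sketch under assumptions: the Samples table, the two supported operators and the function name build_where are invented for the example, and SQLite from the Python standard library is used only to show the generated statement running with bound parameters.

```python
import sqlite3

OPERATORS = {"Equals": "=", "NotEquals": "<>"}
ALLOWED_COLUMNS = {"CollectedDate", "CollectedByUserID"}


def build_where(node, params):
    """Return SQL text for `node`, appending bound values to `params`."""
    if "Clauses" in node:
        joiner = {"AND": " AND ", "OR": " OR "}[node["ConditionType"]]
        return "(" + joiner.join(build_where(c, params) for c in node["Clauses"]) + ")"
    sql_op = OPERATORS[node["Operator"]]
    parts = []
    for operand in node["Operands"]:
        if "Column" in operand:
            # Column names cannot be bound, so they must come from the allow-list.
            if operand["Column"] not in ALLOWED_COLUMNS:
                raise ValueError(f"column not allowed: {operand['Column']!r}")
            parts.append(operand["Column"])
        else:
            # Values never enter the SQL text; they travel as parameters.
            params.append(operand["Value"])
            parts.append("?")
    return f"{parts[0]} {sql_op} {parts[1]}"


where = {
    "ConditionType": "AND",
    "Clauses": [
        {"Operator": "Equals",
         "Operands": [{"Column": "CollectedDate"}, {"Value": "2010-05-05"}]},
        {"Operator": "NotEquals",
         "Operands": [{"Column": "CollectedByUserID"}, {"Value": 3}]},
    ],
}

params = []
sql = "SELECT SampleID FROM Samples WHERE " + build_where(where, params)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Samples (SampleID INTEGER, CollectedDate TEXT, CollectedByUserID INTEGER)"
)
conn.executemany(
    "INSERT INTO Samples VALUES (?, ?, ?)",
    [(1, "2010-05-05", 3), (2, "2010-05-05", 7), (3, "2010-05-06", 7)],
)
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

Only operator symbols and allow-listed column names are ever concatenated into the SQL text; everything the user typed ends up in the parameter list.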
This approach ultimately worked out really nicely for us, for a few reasons:
It allowed us to break up the complexity of composing a query from a graphical interface.
It ensured that user-generated queries were never executed dirty.
It enabled us to add arbitrary clauses to queries for various kinds of access restrictions.
It was extensible enough that we were able to do nifty things like allowing users to search on custom fields.
On the surface, it may seem like a complex solution, but my team found that the benefits were many and the implementation was clean and maintainable.
EDIT: I have come to dislike my answer here. I agree with some of the commenters below, and I would like to recommend that you build "Query" objects on the client and pass those to a web service which constructs the SQL statement using prepared statements. This is safe from SQL injection because you are using prepared statements, and you can control the security of what is being constructed in the web service which you control.
End of Edit
There is nothing wrong with executing SQL passed from the client. Especially in query building situations.
For example, you can add as many WHERE clauses as you like by joining them with "AND". However, what you should not do is allow a user to specify what the SQL is. You should instead provide an interface that allows your users to build the queries. There are a couple of reasons this is advantageous:
Better user experience (who wants to write SQL other than developers?)
Safer from injection. There is just no way you could possibly filter out all dangerous SQL strings.
Other than that, it's absolutely fine to execute dynamic SQL instead of using a stored procedure. Your view that everything should be written in terms of stored procedures seems misguided to me. Sure, stored procedures are nice in a lot of ways, but there are also many downsides to using them.
In fact, overuse of stored procs sometimes leads to performance problems since developers reuse the same stored procedure in multiple places even when they don't need all the data it returns.
One thing you might want to look into though is building the SQL on the server side and passing over some kind of internal representation of the built query. If you have some kind of web service which is exposed and allows your client to run whatever SQL it wants to run, then you have a security concern. This would also help in versioning. If you modify the database, you can modify the web service with it and not worry about people using old clients building invalid SQL.
I see these fully user-configurable visualizations more as building blocks.
I wouldn't pass direct SQL queries to the back-end. I would have the user send parameters (which view to use, filters for the WHERE clause, and so on). Letting the user inject SQL is a potential nightmare, both for security and for maintenance.
If you want to let users send over actual SQL, try filtering words like "drop" and "truncate". If you have to allow deletes, you can enforce that they use a primary key.
There is nothing wrong with an application sending SQL commands to a database, as long as you are aware of injection issues. So don't do this in your code:
(Pseudocode)
String sqlCommand = "SELECT something FROM YOURTABLE WHERE A='" + aTextInputFieldInYourGui + "'";
cmd.execute(sqlCommand);
Why not? See what happens if the user enters this line into aTextInputFieldInYourGui:
'; DELETE FROM YOURTABLE; --
(assuming your DB is MS SQL Server here; other RDBMS may need slightly different syntax)
Use prepared statements and parameter binding instead:
(Pseudocode)
String sqlCommand = "SELECT something FROM YOURTABLE WHERE A=?";
cmd.prepare(sqlCommand);
cmd.bindParam(1, aTextInputFieldInYourGui);
cmd.execute();
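The same contrast as a runnable sketch, using Python's standard sqlite3 module as a stand-in for whatever driver you actually use. The table contents and the particular injection string are chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE yourtable (a TEXT, something TEXT);
    INSERT INTO yourtable VALUES ('x', 'first'), ('y', 'second');
    """
)

# What a malicious user might type into the text field.
user_input = "x' OR '1'='1"

# Unsafe: the input becomes part of the SQL text and changes its meaning.
unsafe_sql = "SELECT something FROM yourtable WHERE a = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the input is bound as a value and is only ever compared to column a.
safe_rows = conn.execute(
    "SELECT something FROM yourtable WHERE a = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

With concatenation the condition collapses to "always true" and every row comes back; with binding the odd-looking string is just a value that matches nothing.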
Regards