ActiveRecord querying - ruby-on-rails-3

I have Task and Project tables; a project has many tasks.
How do I get the count of tasks in each project, ordered by task count descending?
I only got as far as Task.group('project_id').
Thanks

Just use count after group: Task.group(:project_id).count. This gives you a hash with project ids as keys and the number of tasks in each project as values. You can then sort it with Enumerable#sort_by.
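For illustration, here is what that hash and the sort look like; the hash literal below just stands in for a real Task.group(:project_id).count result, so the snippet runs without Rails:

```ruby
# Task.group(:project_id).count returns a plain Ruby hash of
# project_id => task count; this literal stands in for that result.
counts = { 1 => 5, 2 => 12, 3 => 3 }

# Sort descending by count; sort_by yields each [key, value] pair.
sorted = counts.sort_by { |_project_id, task_count| -task_count }
# sorted == [[2, 12], [1, 5], [3, 3]]
```

Depending on your Rails version you may also be able to push the ordering into SQL, e.g. Task.group(:project_id).order('count_all DESC').count, but sorting the resulting hash in Ruby is the simplest portable option.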

Related

Search large number of ID search in MongoDB

Thanks for looking at my query. I have 20k+ unique identification ids provided by the client, and I want to look up all of these ids in MongoDB with a single query. I tried $in, but it does not seem feasible to put all 20k ids into one $in and search. Is there a better way of achieving this?
If the id field is indexed, an $in query should be very fast, but I don't think it is a good idea to run a query with 20k ids at once, as it may consume quite a lot of resources such as memory. You can split the ids into groups of a reasonable size and run the queries separately; you can even run those queries in parallel at the application level.
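A minimal sketch of the batching idea in Ruby; the chunking itself is plain Ruby, while the actual driver call is hypothetical and shown commented out:

```ruby
# The 20k+ client-supplied ids (placeholder data for the sketch).
ids = (1..20_000).to_a

batch_size = 1_000
results = []

ids.each_slice(batch_size) do |batch|
  # Hypothetical MongoDB driver call -- one $in query per batch:
  # results.concat(collection.find(_id: { '$in' => batch }).to_a)
end

batch_count = ids.each_slice(batch_size).count  # 20 round trips instead of one huge query
```

The batches are independent of each other, so they can also be dispatched from a thread pool if latency matters more than load on the database.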
Consider importing your 20k+ ids into a collection (say, using mongoimport). Then perform a $lookup from your root collection to that search collection. Depending on whether the $lookup result is an empty array or not, you can proceed with your original operation that required $in.
Here is a Mongo playground for your reference.

Creating an SQL query in drupal

Hi, I'm not sure if this is the best way to do this. I am familiar with Views, but I'm interested in outputting to a specific variable, not a table, list, etc.
Here is what I'm trying to do:
for each taxonomy term, echo the term, count how many users have a node with that term, and echo that count.
This is not possible with the default Views module, as it doesn't offer this kind of aggregation. With the Views Aggregator Plus module, however, you can add the desired behaviour: where standard Views would give you a (grouped) table, this module can count the rows in each group (and do much more, such as sums and averages), so you get a single row per taxonomy term together with the number of nodes/users associated with that term.
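If you would rather compute the count by hand instead of through Views, the query would look roughly like the sketch below. The table names assume Drupal 6-era core (term_data, term_node, node); Drupal 7 renamed these to taxonomy_term_data and taxonomy_index, so adjust for your version:

```sql
-- One row per taxonomy term, with the number of distinct users
-- who authored at least one node carrying that term.
SELECT td.name, COUNT(DISTINCT n.uid) AS user_count
FROM term_data td
LEFT JOIN term_node tn ON tn.tid = td.tid
LEFT JOIN node n       ON n.nid = tn.nid
GROUP BY td.tid, td.name;
```

The LEFT JOINs keep terms with no nodes in the result with a count of 0, matching the "echo every term" requirement.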

Single Row Table in SQL : Is this a good implementation?

I am new to SQL. I have read that creating a single-row table is not really good practice, but I can't help finding it useful in my case. I am making a web app which balances the workload of employees in the organization. So apart from keeping track of how much work is assigned to each employee and how much work each task requires (there are 2 main task types), I also need to track the overall workload.
So I plan to make a single-row table for total workload, with three columns: one for each of the two task types, summed over all tasks, and a third for the sum of those two totals. I plan to use triggers to update the table whenever a new task is added or its requirements change, so that the totals stay current.
Please let me know if I am heading in the right direction. Thanks!
It will work, but it is not extensible: if tomorrow you need to add a third main task type, you will have to alter the table and add another column (not ideal). Instead, you could have a table with just two columns, task type and load, and calculate the sum with an SQL query whenever you need it.
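A sketch of that two-column alternative; table and column names are placeholders, not from the original post:

```sql
CREATE TABLE workload (
  task_type  VARCHAR(50) NOT NULL,  -- e.g. 'type_a', 'type_b'
  load_units INT         NOT NULL
);

-- Per-type totals and the grand total, computed on demand
-- instead of being maintained by triggers:
SELECT task_type, SUM(load_units) FROM workload GROUP BY task_type;
SELECT SUM(load_units) FROM workload;
```

Adding a third task type is then just another row, not a schema change, and there are no triggers to keep in sync.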

fetching multiple nested associations eagerly using nhibernate (and queryover)

I have a database which has multiple nested associates. Basically, the structure is as follows:
Order -> OrderItem -> OrderItemPlaylist -> OrderPlaylistItem -> Track -> Artist
I need to generate a report based on all orders sold in a certain date, which needs to traverse into ALL the mentioned associations in order to generate the required information.
Trying to join all the tables together would be overkill: it would produce an extremely large cartesian product with a lot of redundant data, since it joins six tables. Code below:
q.Left.JoinQueryOver<OrderItem>(order => order.OrderItems)
.Left.JoinQueryOver<OrderItemPlaylist>(orderItem => orderItem.Playlist)
.Left.JoinQueryOver<OrderItemPlaylistItem>(orderItemPlaylist => orderItemPlaylist.PlaylistItems)
.Left.JoinQueryOver<Track>(orderItemPlaylistItem => orderItemPlaylistItem.Track)
.Left.JoinQueryOver<Artist>(track => track.Artist)
The above works, but with even a few orders, each with a few order items and a playlist of multiple tracks, the results explode into thousands of records, growing multiplicatively with each extra order.
Any idea what would be the best and most efficient approach? I've tried enabling batch loading, which greatly reduces the number of database queries, but it still feels more like an easy workaround than a good approach.
Given the amount of data, there is no need to load everything in a single SQL query. One SQL query per association would be ideal, I guess: first fetch all the orders, then all the order items for those orders (loading them into the associated collections), then the playlists for each order item, and so on.
Also, this doesn't have to be specifically in QueryOver, as I can access the .RootCriteria and use the Criteria API.
Any help would be greatly appreciated !
I believe this is what you are looking for
http://ayende.com/blog/4367/eagerly-loading-entity-associations-efficiently-with-nhibernate
If you prefer one SQL query, what SQL syntax would you expect it to produce? I guess you can't avoid a long sequence of JOINs if you're going for a single SQL query.
I guess what I would do is get the entities level by level, using several queries.
You should probably start by defining the query as best you can in raw SQL and looking at the execution plans to find the best method (and to check whether your indexes are sufficient).
At that point you know what you're aiming for, and it becomes reasonably easy to code the query in HQL, QueryOver, or even LINQ, checking the generated SQL with NHibernate's logging or the excellent NHProfiler (http://www.nhprof.com).
You are probably right about ending up with several queries. Speed them up by batching as many as you can (those that do not depend on each other) into single round trips using the Future method in Criteria or QueryOver. You can read more about that here: http://ayende.com/blog/3979/nhibernate-futures

Methods to speed up specific query

I have an existing SQL query which works well but takes what I consider to be quite a bit of time and resources for such a small result set. I am trying to figure out whether the following query can be optimized, in ways I am perhaps unfamiliar with, for better performance.
Query
SELECT a.programname, COUNT(b.id)
FROM groups a
LEFT JOIN selections b
    ON (a.id_selection = b.id AND a.min_age = 18 AND a.max_age = 24)
LEFT JOIN member_info c
    ON (b.memberid = c.memberid AND (c.status = 1 OR c.term_date > '2011-01-31'))
WHERE a.flag = 3
GROUP BY a.programname
ORDER BY a.programid ASC;
There are three tables at work here:
Groups - A
Groups contains the list of possible program selections a member can make. A member can have multiple selections within the table, but only one selection per programname and only one age bracket. The overall program is determined by the flag, which limits the 400+ programs to only about 100 possible mixes. The program names grouped together are:
member only, member plus spouse, member plus child, family
The result set must return the count of all active members who have each particular selection, even if that count is 0 (i.e. I cannot limit the result set to 3 rows just because one has a zero count).
Selections
This table maps each member's selections to the groups selections. One member can have multiple ids from groups, but only one of each type.
Member_info
Contains information about each member, including their status (1 is active) and their termination date, used when they are no longer active.
My query takes nearly 3/4 of a full second, which I find to be way too much for this type of information, but maybe I am wrong given all the necessary joins.
Any help is greatly appreciated. I can further expand my question if necessary.
EXPLAIN details
1 SIMPLE a ALL 184 Using where; Using temporary; Using filesort
1 SIMPLE b index memberid_id 7 3845 Using index
1 SIMPLE c ALL 1551
EDIT REGARDING INDEX SUGGESTION
I have given much thought to using indexes for this query, but as many sources suggest, indexes in a case like this may actually be hurtful. The best summary I found was:
Indexes are something extra that you can enable on your MySQL tables to increase performance, but they do have some downsides. When you create a new index, MySQL builds a separate block of information that needs to be updated every time there are changes made to the table. This means that if you are constantly updating, inserting and removing entries in your table, this could have a negative impact on performance.
The member_info table will grow daily, the groups table will stay fairly constant, but the selections table can change drastically from day to day. As such, the use of indexes really seems like it could have a negative effect in this case.
Do you have indexes on the columns being joined? That would be an obvious first step.
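As a sketch, the join and filter columns in the query above would suggest indexes along these lines; the index names are placeholders, and you should verify the effect with EXPLAIN after adding them:

```sql
-- member_info is scanned in full (type ALL in the EXPLAIN output);
-- an index covering its join and filter columns is the first candidate.
-- Note that the OR on status/term_date may keep MySQL from using the
-- trailing columns, but the memberid prefix still serves the join.
CREATE INDEX idx_member_info_lookup ON member_info (memberid, status, term_date);

-- groups is also scanned in full, and the WHERE clause filters on flag:
CREATE INDEX idx_groups_flag ON groups (flag);
```

Given the concern about write-heavy tables, note that these are on member_info and groups, the tables described as growing slowly, not on the frequently changing selections table.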
There seems to be no problem with this query as written. Your options are:
using indexes, if you plan to read far more often than you write
using parameterized queries, so that the db engine can cache the execution plan for reuse
Beyond this, there must be some serious bottleneck in the system, or millions of rows in the tables, to cause such a long execution.
How does your query perform if you run it 100 times in parallel?
If you run this query often, try using bind parameters instead of concatenating the SQL; that way the db engine can cache the execution plan.