Limit nested relation results and get total number of rows in Laravel 4 ORM

I'm having some trouble limiting the results on a relation while still getting the original number of rows. Here's my scenario:
I have posts, post content, and comments. I want to select all posts and limit the comments to 5, but I also need to know how many comments each post has.
$post = Post::with(array('contentPost', 'commentPost' => function($query){
    $query->take(5);
}))->where('wall_id', '=', $team_info->id)->get();
With this relation I limit the comments to 5, which works. But if I then count all the comments with
$post->commentPost->count();
it shows just 5 comments, because of the limit. How can I get the real number of comments even though I limited them?

In your $post->commentPost->count() call, you're asking for the count() of the results associated with $post. Naturally, this will always be the actual number of rows provided by your query parameters.
"Do it all in one place" is rarely the answer in development. Even if you find a way to make it happen, what happens in the future when you need to change this query? You end up with fragile code, prone to breakage.
Keep your relation as-is, with the 5-comment limit. Run a separate query for the count.
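For example, a minimal sketch (assuming commentPost is a regular relation defined on the Post model): calling the relation as a method returns a query builder, so the count runs a fresh query against the database instead of counting the loaded, limited collection.
foreach ($post as $p) {
    $comments = $p->commentPost;            // the 5 eager-loaded comments
    $total    = $p->commentPost()->count(); // separate SELECT COUNT(*), ignores take(5)
}
Note that this issues one extra COUNT query per post; if that matters, you can fetch all the counts at once with a single grouped query instead.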

Query time for a specific entity is 10000 times higher [closed]

We've run into a problem: a SELECT filtering by a certain id takes a very long time. For every other id it takes about 5 ms; for this one, 10 seconds.
Here are the two EXPLAIN plans; left is the normal case, right is the slow one. It is exactly the same SQL query; the only difference is the value in 'where id = ...'.
[screenshot of the two EXPLAIN plans]
It is striking that a Filter step appears in the right-hand plan but not in the left, along with a huge 'Rows Removed' count. A number that large can only come from multiplying the row counts of the joined tables. To repeat: the SQL query is identical except for the entity id, and the amount of data retrieved per entity is comparable.
One of the tables also uses a btree index. The only special thing about this id is that it comes after a gap in the numbering (22, 23, 24, 30, for example), but I was not able to reproduce the problem based on that.
Unfortunately, I cannot show the code, but I hope this information is enough to advise something.
Update:
I found the reason. For some reason Postgres estimates that one of the tables will return only 1 row, when the real return is 10k+, and it therefore chooses the wrong join algorithm. For other entity ids it estimates correctly and picks better plans. How does Postgres compute the row estimates in a plan? What could be the problem?
If I understand correctly, your problem is the data histogram. We cannot diagnose it fully because you cannot provide example code. Briefly: one of your tables has an id column with a heavily skewed distribution. For example, your table has 1 billion records and most ids have about 500 records each, yet some ids have (say) 20 or 200 million records. If you search for one of these highly non-selective values, the database optimizer will not help you.
Check your data histogram!
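For example, in PostgreSQL you can inspect the statistics the planner keeps for the skewed column and give it a finer-grained histogram (table and column names here are placeholders):
-- What the planner currently believes about the column:
SELECT n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'my_table' AND attname = 'entity_id';
-- Sample more values for this column, then refresh the statistics:
ALTER TABLE my_table ALTER COLUMN entity_id SET STATISTICS 1000;
ANALYZE my_table;
A larger statistics target often fixes "expects 1 row, gets 10k+" misestimates like the one described in the update.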

RediSearch - searching for a particular word which occurs in many records takes a long time. How to improve it?

I have an address database (stored as hashes) with about 30 million records. I added a text index to all the address fields. Searching looks OK until I search for a word that occurs in many records. For example, searching for the word "London", which occurs in about 2,500,000 records, took 4.5 seconds (FT.SEARCH idx:a4 london LIMIT 0 2). Is there any way to improve this result, any changes to make? Thank you for your help.
If you do not care about getting the first 2 results sorted by score (calculated with TF-IDF), you can use FT.AGGREGATE, which returns as soon as it finds the first 2 results (without collecting all the results, calculating their scores, sorting them, and taking the first 2). It should look like this:
FT.AGGREGATE idx:a4 london LIMIT 0 2
Notice that you should use LOAD to decide which fields to return from the hash. Please refer here for the full FT.AGGREGATE documentation:
https://oss.redislabs.com/redisearch/Aggregations/
Again, if you choose to use it, know that you lose sorting by result score.
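For example, a sketch with LOAD (the field names @street and @city are placeholders for whatever your hashes actually contain):
FT.AGGREGATE idx:a4 london LIMIT 0 2 LOAD 2 @street @city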

Ruby Rails Complex SQL with aggregate function and DayOfWeek

Rails 2.3.4
I have searched google, and have not found an answer to my dilemma.
For this discussion, I have two models: Users and Entries. Users can have many Entries (one for each day).
Entries have values and sent_at dates.
I want to query and display the average value of entries for a user BY DAY OF WEEK. So if a user has entered values for, say, the past 3 weeks, I want to show the average value for Sundays, Mondays, etc. In MySQL, it is simple:
SELECT DAYOFWEEK(sent_at) as day, AVG(value) as average FROM entries WHERE user_id = ? GROUP BY 1
That query will return between 0 and 7 records, depending upon how many days a user has had at least one entry.
I've looked at find_by_sql, but while I am searching Entry, I don't want to return an Entry object; instead, I need an array of up to 7 days and averages...
Also, I am concerned a bit about the performance of this, as we would like to load this to the user model when a user logs in, so that it can be displayed on their dashboard. Any advice/pointers are welcome. I am relatively new to Rails.
You can query the database directly, no need to use an actual ActiveRecord object. For example:
ActiveRecord::Base.connection.execute "SELECT DAYOFWEEK(sent_at) as day, AVG(value) as average FROM entries WHERE user_id = #{user.id} GROUP BY DAYOFWEEK(sent_at);"
This will give you a Mysql::Result or Mysql2::Result, an enumerable you can call each or all on to view your results.
As for caching, I would recommend memcached, but any other Rails caching strategy will work as well. The nice benefit of memcached is that you can have your cache expire after a certain amount of time. For example:
result = Rails.cache.fetch("user/#{user.id}/averages", :expires_in => 1.day) do
  # The SQL query from above; its result is what gets cached
  ActiveRecord::Base.connection.execute "SELECT DAYOFWEEK(sent_at) as day, AVG(value) as average FROM entries WHERE user_id = #{user.id} GROUP BY DAYOFWEEK(sent_at);"
end
This would put your results into memcached for one day under the key "user/#{user.id}/averages". For example, if you were the user with id 10, your averages would be in memcached under "user/10/averages", and the next time you performed this query (within the same day) the cached version would be used instead of actually hitting the database.
Untested, but something like this should work:
@user.entries.select('DAYOFWEEK(sent_at) as day, AVG(value) as average').group('1').all
NOTE: When you use select to specify columns explicitly, the returned objects are read-only. Rails can't reliably determine which columns can and can't be modified. In this case you probably wouldn't try to modify the selected columns, but you can't modify your sent_at or value columns through the resulting objects either.
Check out the ActiveRecord Querying Guide for a breakdown of what's going on here in a fairly newb-friendly format. Oh, and if that query doesn't work, please post back so others that may stumble across this can see that (and I can possibly update).
Since that won't work due to entries returning an array, we can try using joins instead:
User.where(:user_id => params[:id]).joins(:entries).select('...').group('1').all
Again, I don't know if this will work. Usually you can specify where after joins, but I haven't seen select combined in there. A tricky bit here is that the select is probably going to eliminate returning any data about the user at all. It might make more sense just to eschew find_by_* methods in favor of writing a method in the Entry model that just calls your query with select_all (docs) and skips the association mapping.
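A sketch of that Entry-model method (untested; it assumes the entries table and columns from the question):
class Entry < ActiveRecord::Base
  # Returns an array of hashes like { "day" => 2, "average" => 3.5 },
  # skipping ActiveRecord object instantiation entirely.
  def self.averages_by_day_of_week(user_id)
    connection.select_all(
      "SELECT DAYOFWEEK(sent_at) AS day, AVG(value) AS average " +
      "FROM entries WHERE user_id = #{user_id.to_i} GROUP BY 1"
    )
  end
end
The to_i guards the interpolated id, and GROUP BY 1 matches the original MySQL query.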

YouTube API problem - when searching for playlists, start-index does not work past 100

I have been trying to get the full list of playlists matching a certain keyword. I have discovered, however, that using a start-index past 100 brings back the same set of results as start-index=1. It does not matter what the max-results parameter is: still the same results. The total results reported, however, is way above 100, so it cannot be that the query returned only 100 results.
What might the problem be? Is it a quota of some sort, or some other restriction?
As an example, these queries bring back the same result set whether you use start-index=1, start-index=101, or start-index=201, etc.:
http://gdata.youtube.com/feeds/api/playlists/snippets?q=%22Jan+Smit+Laura%22&max-results=50&start-index=1&v=2
Any idea will be much appreciated!
Regards
Christo
I made an interface for my site, and the way I avoided this problem was to do one query for a large number of results, then store them. Let your web page then break up the stored results and present them however is needed.
For example, if someone wants to browse more than 100 videos, do the search once and collect the results, but only present the first group, say 10. When the person wants to see the next ten, serve them from the stored list rather than doing a new query.
Not only does this make paging faster, but it cuts down on the constant queries to the YouTube database.
Hope this makes sense and helps.
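A rough sketch of that approach (Ruby, untested; the URL and the 50-per-request max-results come from the question, everything else is illustrative):
require 'net/http'
require 'uri'
# Fetch the first 100 playlist results in two 50-item pages and
# store them; your own pages then slice this array locally.
def fetch_playlist_pages(query)
  [1, 51].map do |start|
    uri = URI('http://gdata.youtube.com/feeds/api/playlists/snippets' +
              '?q=' + URI.encode_www_form_component(query) +
              '&max-results=50&start-index=' + start.to_s + '&v=2')
    Net::HTTP.get(uri) # raw Atom XML; parse and accumulate the entries
  end
end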

SQL or Ruby on Rails question: Limit results from a search to the greater of two conditions

Is it possible to constrain a SQL query to the greater of two conditions?
For example, imagine a site that has Posts. Is there a way to request posts such that the result is guaranteed to contain both at least all posts made in the past 24 hours and at least 10 posts, without unnecessarily exceeding either limit?
That is: if it already has all posts made in the past 24 hours but not 10 posts, it keeps adding posts until it reaches 10; if it already has 10 posts but not all posts made in the past 24 hours, it keeps adding until it covers the past 24 hours. But it will never contain both posts from 25 hours ago and more than 10 posts.
Edit: in response to the request to post my model. I'm not exactly sure how to post the model, since my models are created by Rails. In short, though, a Post has a created_at datetime column, and there is an index on it.
I think LIMIT requires a constant, but if you have an index on the created_at field, pulling a few records from one end will be pretty efficient, right? What about something like this (MySQL-flavored, using the created_at column from the question):
select * from posts
where created_at > least(
    now() - interval 1 day,
    (select created_at from posts
     order by created_at desc
     limit 1 offset 10));
The subquery finds the created_at of the 11th-newest post, so everything strictly newer than it is the latest 10; LEAST then picks whichever cutoff reaches further back. (If the table has fewer than 11 posts, the subquery returns NULL; in practice you would wrap it in COALESCE with a date far in the past.)
Probably the clearest solution is to write one select that returns the most recent 10, another select that returns everything from the last day, and take the union of the two.
One of these unioned selects will always completely contain the other, so it may not be the most efficient, but I think it's what I would do.
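In MySQL terms, a minimal sketch of that union (untested):
(select * from posts
 order by created_at desc
 limit 10)
union
(select * from posts
 where created_at > now() - interval 1 day);
Using UNION rather than UNION ALL also removes the duplicate rows where the two sets overlap.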