I am using a SQLite backend with a simple show - season - episode schema:
class Show(BaseModel):
name = CharField()
class Season(BaseModel):
show = ForeignKeyField(Show, related_name='seasons')
season_number = IntegerField()
class Episode(BaseModel):
season = ForeignKeyField(Season, related_name='episodes')
episode_number = IntegerField()
and I would need the following query :
seasons = (Season.select(Season, Episode)
.join(Episode)
.where(Season.show == SHOW_ID)
.order_by(Season.season_number.desc(), Episode.episode_number.desc())
.aggregate_rows())
SHOW_ID being the id of the show for which I want the list of seasons.
But when I iterate over the query with the following code :
for season in seasons:
for episode in season.episodes:
print(episode.episode_number)
... I get something which is not ordered at all, and which does not even follow the order I would get without using order_by(), i.e. the insertion order.
I activated the debug logs to see the outgoing query, and the query does contain the ORDER BY clause, and manually applying it returns the proper descending order.
I am new to peewee, and I have seen so many examples making use of a join() combines with an order_by(), but I can still not find out what I am doing wrong.
This was due to a bug in the processing of nested collections in the aggregate query result wrapper.
The github issue is: https://github.com/coleifer/peewee/issues/519
The fix has been merged here: https://github.com/coleifer/peewee/commit/ec0e87f1a480695d98bf1f0d7f2e63aed8dfc440
So, to get the fix you'll need to either clone master or wait til the next release which should be in the next week or two (2.4.7).
Related
I have three models:
class Customer(models.Model):
pass
class IssueType(models.Model):
pass
class IssueTypeConfigPerCustomer(models.Model):
customer=models.ForeignKey(Customer)
issue_type=models.ForeignKey(IssueType)
class Meta:
unique_together=[('customer', 'issue_type')]
How can I find all tuples of (custmer, issue_type) where there is no IssueTypeConfigPerCustomer object?
I want to avoid a loop in Python. A solution which solves this in the DB would be preferred.
Background: for every customer and for every issue-type, there should be a config in the DB.
If you can afford to make one database trip for each issue type, try something like this untested snippet:
def lacking_configs():
for issue_type in IssueType.objects.all():
for customer in Customer.objects.filter(
issuetypeconfigpercustomer__issue_type=None
):
yield customer, issue_type
missing = list(lacking_configs())
This is probably OK unless you have a lot of issue types or if you are doing this several times per second, but you may also consider having a sensible default instead of making a config object mandatory for each combination of issue type and customer (IMHO it is a bit of a design-smell).
[update]
I updated the question: I want to avoid a loop in Python. A solution which solves this in the DB would be preferred.
In Django, every Queryset is either a list of Model instances or a dict (values querysets), so it is impossible to return the format you want (a list of tuples of Model) without some Python (and possibly multiple trips to the database).
The closest thing to a cross product would be using the "extra" method without a where parameter, but it involves raw SQL and knowing the underlying table name for the other model:
missing = Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype']
)
As a result, each Customer object will have an extra attribute, "issue_type_id", containing the id of one IssueType. You can use the where parameter to filter based on NOT EXISTS (SELECT 1 FROM appname_issuetypeconfigpercustomer WHERE issuetype_id=appname_issuetype.id AND customer_id=appname_customer.id). Using the values method you can have something close to what you want - this is probably enough information to verify the rule and create the missing records. If you need other fields from IssueType just include them in the select argument.
In order to assemble a list of (Customer, IssueType) you need something like:
cross_product = [
(customer, IssueType.objects.get(pk=customer.issue_type_id))
for customer in
Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype'],
where=["""
NOT EXISTS (
SELECT 1
FROM appname_issuetypeconfigpercustomer
WHERE issuetype_id=appname_issuetype.id
AND customer_id=appname_customer.id
)
"""]
)
]
Not only this requires the same number of trips to the database as the "generator" based version but IMHO it is also less portable, less readable and violates DRY. I guess you can lower the number of database queries to a couple using something like this:
missing = Customer.objects.extra(
select={"issue_type_id": 'appname_issuetype.id'},
tables=['appname_issuetype'],
where=["""
NOT EXISTS (
SELECT 1
FROM appname_issuetypeconfigpercustomer
WHERE issuetype_id=appname_issuetype.id
AND customer_id=appname_customer.id
)
"""]
)
issue_list = dict(
(issue.id, issue)
for issue in
IssueType.objects.filter(
pk__in=set(m.issue_type_id for m in missing)
)
)
cross_product = [(c, issue_list[c.issue_type_id]) for c in missing]
Bottom line: in the best case you make two queries at the cost of legibility and portability. Having sensible defaults is probably a better design compared to mandatory config for each combination of Customer and IssueType.
This is all untested, sorry if some homework was left for you.
I've got a few tables, Deployment, Deployment_Report and Workflow. In the event that the deployment is being reviewed they join together so you can see all details in the report. If a revision is going out, the new workflow doesn't exist yet new workflow is going into place so I'd like the values to return null as the revision doesn't exist yet.
Complications aside, this is a sample of the SQL that I'd like to have run:
DECLARE #WorkflowID int
SET #WorkflowID = 399 -- Set to -1 if new
SELECT *
FROM Deployment d
LEFT JOIN Deployment_Report r
ON d.FSJ_Deployment_ID = r.FSJ_Deployment_ID
AND r.Workflow_ID = #WorkflowID
WHERE d.FSJ_Deployment_ID = 339
The above in SQL works great and returns the full record if viewing an active workflow, or the left side of the record with empty fields for revision details which haven't been supplied in the event that a new report is being generated.
Using various samples around S.O. I've produced some Entity to SQL based on a few multiple on statements but I feel like I'm missing something fundamental to make this work:
int Workflow_ID = 399 // or -1 if new, just like the above example
from d in context.Deployments
join r in context.Deployment_Reports.DefaultIfEmpty()
on
new { d.Deployment_ID, Workflow_ID }
equals
new { r.Deployment_ID, r.Workflow_ID }
where d.FSJ_Deployment_ID == fsj_deployment_id
select new
{
...
}
Is the SQL query above possible to create using LINQ to Entities without employing Entity SQL? This is the first time I've needed to create such a join since it's very confusing to look at but in the report it's the only way to do it right since it should only return one record at all times.
The workflow ID is a value passed in to the call to retrieve the data source so in the outgoing query it would be considered a static value (for lack of better terminology on my part)
First of all don't kill yourself on learning the intricacies of EF as there are a LOT of things to learn about it. Unfortunately our deadlines don't like the learning curve!
Here's examples to learn over time:
http://msdn.microsoft.com/en-us/library/bb397895.aspx
In the mean time I've found this very nice workaround using EF for this kind of thing:
var query = "SELECT * Deployment d JOIN Deployment_Report r d.FSJ_Deployment_ID = r.Workflow_ID = #WorkflowID d.FSJ_Deployment_ID = 339"
var parm = new SqlParameter(parameterName="WorkFlowID" value = myvalue);
using (var db = new MyEntities()){
db.Database.SqlQuery<MyReturnType>(query, parm.ToArray());
}
All you have to do is create a model for what you want SQL to return and it will fill in all the values you want. The values you are after are all the fields that are returned by the "Select *"...
There's even a really cool way to get EF to help you. First find the table with the most fields, and get EF to generated the model for you. Then you can write another class that inherits from that class adding in the other fields you want. SQL is able to find all fields added regardless of class hierarchy. It makes your job simple.
Warning, make sure your filed names in the class are exactly the same (case sensitive) as those in the database. The goal is to make a super class model that contains all the fields of all the join activity. SQL just knows how to put them into that resultant class giving you strong typing ability and even more important use-ability with LINQ
You can even use dataannotations in the Super Class Model for displaying other names you prefer to the User, this is a super nice way to keep the table field names but show the user something more user friendly.
I have a table KmRelationship which associates Keywords and Movies
In keyword index I would like to list all keywords that appear most frequently in the KmRelationships table and only take(20)
.order doesn't seem to work no matter how I use it and where I put it and same for sort_by
It sounds relatively straight forward but i just can't seem to get it to work
Any ideas?
Assuming your KmRelationship table has keyword_id:
top_keywords = KmRelationship.select('keyword_id, count(keyword_id) as frequency').
order('frequency desc').
group('keyword_id').
take(20)
This may not look right in your console output, but that's because rails doesn't build out an object attribute for the calculated frequency column.
You can see the results like this:
top_keywords.each {|k| puts "#{k.keyword_id} : #{k.freqency}" }
To put this to good use, you can then map out your actual Keyword objects:
class Keyword < ActiveRecord::Base
# other stuff
def self.most_popular
KmRelationship.
select('keyword_id, count(keyword_id) as frequency').
order('frequency desc').
group('keyword_id').
take(20).
map(&:keyword)
end
end
And call with:
Keyword.most_popular
#posts = Post.select([:id, :title]).order("created_at desc").limit(6)
I have this listed in my controller index method which allows the the order to show the last post with a limit of 6. It might be something similar to what you are trying to do. This code actually reflects a most recent post on my home page.
I am missing the SQL out of this to Bulk update attributes by SKU/UPC.
Running EE1.10 FYI
I have all the rest of the code working but I"m not sure the who/what/why of
actually updating our attributes, and haven't been able to find them, my logic
is
Open a CSV and grab all skus and associated attrib into a 2d array
Parse the SKU into an entity_id
Take the entity_id and the attribute and run updates until finished
Take the rest of the day of since its Friday
Here's my (almost finished) code, I would GREATLY appreciate some help.
/**
* FUNCTION: updateAttrib
*
* REQS: $db_magento
* Session resource
*
* REQS: entity_id
* Product entity value
*
* REQS: $attrib
* Attribute to alter
*
*/
See my response for working production code. Hope this helps someone in the Magento community.
While this may technically work, the code you have written is just about the last way you should do this.
In Magento, you really should be using the models provided by the code and not write database queries on your own.
In your case, if you need to update attributes for 1 or many products, there is a way for you to do that very quickly (and pretty safely).
If you look in: /app/code/core/Mage/Adminhtml/controllers/Catalog/Product/Action/AttributeController.php you will find that this controller is dedicated to updating multiple products quickly.
If you look in the saveAction() function you will find the following line of code:
Mage::getSingleton('catalog/product_action')
->updateAttributes($this->_getHelper()->getProductIds(), $attributesData, $storeId);
This code is responsible for updating all the product IDs you want, only the changed attributes for any single store at a time.
The first parameter is basically an array of Product IDs. If you only want to update a single product, just put it in an array.
The second parameter is an array that contains the attributes you want to update for the given products. For example if you wanted to update price to $10 and weight to 5, you would pass the following array:
array('price' => 10.00, 'weight' => 5)
Then finally, the third and final attribute is the store ID you want these updates to happen to. Most likely this number will either be 1 or 0.
I would play around with this function call and use this instead of writing and maintaining your own database queries.
General Update Query will be like:
UPDATE
catalog_product_entity_[backend_type] cpex
SET
cpex.value = ?
WHERE cpex.attribute_id = ?
AND cpex.entity_id = ?
In order to find the [backend_type] associated with the attribute:
SELECT
backend_type
FROM
eav_attribute
WHERE entity_type_id =
(SELECT
entity_type_id
FROM
eav_entity_type
WHERE entity_type_code = 'catalog_product')
AND attribute_id = ?
You can get more info from the following blog article:
http://www.blog.magepsycho.com/magento-eav-structure-role-of-eav_attributes-backend_type-field/
Hope this helps you.
I use jqGrid to display data which is retrieved using NHibernate. jqGrid does paging for me, I just tell NHibernate to get "count" rows starting from "n".
Also, I would like to highlight specific record. For example, in list of employees I'd like a specific employee (id) to be shown and pre-selected in table.
The problem is that this employee may be on non-current page. E.g. I display 20 rows from 0, but "highlighted" employee is #25 and is on second page.
It is possible to pass initial page to jqGrid, so, if I somehow use NHibernate to find what page the "highlighted" employee is on, it will just navigate to that page and then I'll use .setSelection(id) method of jqGrid.
So, the problem is narrowed down to this one: given specific search query like the one below, how do I tell NHibernate to calculate the page where the "highlighted" employee is?
A sample query (simplified):
var query = Session.CreateCriteria<T>();
foreach (var sr in request.SearchFields)
query = query.Add(Expression.Like(sr.Key, "%" + sr.Value + "%"));
query.SetFirstResult((request.Page - 1) * request.Rows)
query.SetMaxResults(request.Rows)
Here, I need to alter (calculate) request.Page so that it points to the page where request.SelectedId is.
Also, one interesting thing is, if sort order is not defined, will I get the same results when I run the search query twice? I'd say that SQL Server may optimize query because order is not defined... in which case I'll only get predictable result if I pull ALL query data once, and then will programmatically in C# slice the specified portion of query results - so that no second query occur. But it will be much slower, of course.
Or, is there another way?
Pretty sure you'd have to figure out the page with another query. This would surely require you to define the column to order by. You'll need to get the order by and restriction working together to count the rows before that particular id. Once you have the number of rows before your id, you can figure what page you need to select and perform the usual paging query.
OK, so currently I do this:
var iquery = GetPagedCriteria<T>(request, true)
.SetProjection(Projections.Property("Id"));
var ids = iquery.List<Guid>();
var index = ids.IndexOf(new Guid(request.SelectedId));
if (index >= 0)
request.Page = index / request.Rows + 1;
and in jqGrid setup options
url: "${Url.Href<MyController>(c => c.JsonIndex(null))}?_SelectedId=${Id}",
// remove _SelectedId from url once loaded because we only need to find its page once
gridComplete: function() {
$("#grid").setGridParam({url: "${Url.Href<MyController>(c => c.JsonIndex(null))}"});
},
loadComplete: function() {
$("#grid").setSelection("${Id}");
}
That is, in request I lookup for index of id and set page if found (jqGrid even understands to display the appropriate page number in the pager because I return the page number to in in json data). In grid setup, I setup url to include the lookup id first, but after grid is loaded I remove it from url so that prev/next buttons work. However I always try to highlight the selected id in the grid.
And of course I always use sorting or the method won't work.
One problem still exists is that I pull all ids from db which is a bit of performance hit. If someone can tell how to find index of the id in the filtered/sorted query I'd accept the answer (since that's the real problem); if no then I'll accept my own answer ;-)
UPDATE: hm, if I sort by id initially I'll be able to use the technique like "SELECT COUNT(*) ... WHERE id < selectedid". This will eliminate the "pull ids" problem... but I'd like to sort by name initially, anyway.
UPDATE: after implemented, I've found a neat side-effect of this technique... when sorting, the active/selected item is preserved ;-) This works if _SelectedId is reset only when page is changed, not when grid is loaded.
UPDATE: here's sources that include the above technique: http://sprokhorenko.blogspot.com/2010/01/jqgrid-mvc-new-version-sources.html