How to .update() an m2m field in Django (SQL)

I have:
class MyUser(Model):
    today_ref_viewed_ips = ManyToManyField(
        UniqAddress,
        related_name='today_viewed_users',
        verbose_name="Addresses that visited the referral link today")
    ...
In a daily cron job I do:
for u in MyUser.objects.all():
    u.today_ref_viewed_ips.clear()
Can it be done on the DB server with update()?
MyUser.objects.all().update(...)
OK, I see I can't use update(), thanks. But the only thing I need is to TRUNCATE the m2m intermediate table. Is it possible to do that from Django? And how do I find the table's name without "SHOW TABLES" in the MySQL console?

If you want to update the m2m field only and do not want to delete the related objects themselves, you can use the following:
# if you have a **list of pks** for the new m2m objects
today_ref_pk = [1, 2, 3]
u = MyUser.objects.get(pk=1)
u.today_ref_viewed_ips.clear()
u.today_ref_viewed_ips.add(*today_ref_pk)
For Django >= 1.11, per the documentation:
# if you have the **list of objects** for the new m2m and you don't
# have to worry about race conditions, you can do the following:
today_ref_objs = [obj1, obj2, obj3]
u = MyUser.objects.get(pk=1)
u.today_ref_viewed_ips.set(today_ref_objs, clear=True)

Query-1:
No, you cannot use the .update() method to update a ManyToManyField; Django's .update() does not support m2m fields.
As per the docs from the section on updating multiple objects at once:
You can only set non-relation fields and ForeignKey fields using this
method. To update a non-relation field, provide the new value as a
constant. To update ForeignKey fields, set the new value to be the new
model instance you want to point to.
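For contrast, a minimal sketch of what .update() does support (the field names here are hypothetical, not part of the model above):
# non-relation field: provide the new value as a constant
MyUser.objects.all().update(daily_view_count=0)
# ForeignKey field: provide the model instance to point to
MyUser.objects.filter(pk=1).update(default_address=some_uniq_address)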
Query-2:
If you want to delete all the rows of the m2m table, you can use the .delete() queryset method:
MyModel.objects.all().delete()  # deletes all the objects
Another option is to execute raw SQL directly, which is faster than the previous method:
from django.db import connection
cursor = connection.cursor()
cursor.execute("TRUNCATE TABLE table_name")
Query-3:
To get the table name of a model, you can use the db_table Meta option:
my_model_object._meta.db_table  # gives the db table name
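Putting Query-2 and Query-3 together for your case: a minimal sketch of truncating the auto-created m2m table, assuming a reasonably recent Django where the through model is reachable from the field descriptor:
from django.db import connection

# the auto-created intermediate (through) model of the m2m field
through_table = MyUser.today_ref_viewed_ips.through._meta.db_table
with connection.cursor() as cursor:
    # identifiers can't be parameterized, hence quote_name + interpolation
    cursor.execute("TRUNCATE TABLE %s" % connection.ops.quote_name(through_table))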

Neo4j APOC trigger and Manual Index on Relationship Properties

I'd like to set up a Neo4j APOC trigger that will add all relationship properties to a manual index, something like the following:
CALL apoc.trigger.add('HAS_VALUE_ON_INDEX',"UNWIND {createdRelationships} AS r MATCH (Decision)-[r:HAS_VALUE_ON]->(Characteristic) CALL apoc.index.addRelationship(r,['property_1','property_2']) RETURN count(*)", {phase:'after'})
The issue is that I don't know the exact set of HAS_VALUE_ON relationship properties, because I use the dynamic-properties approach with Spring Data Neo4j 5.
Is it possible to change this trigger declaration so that all of the HAS_VALUE_ON relationship properties (existing ones and ones that will be created in the future) are added to the manual index, instead of the preconfigured ones (like ['property_1','property_2'] in the example above)?
If you do not know the set of properties in advance, then you can use the keys function to add all properties of the created relationships to the index:
CALL apoc.trigger.add(
'HAS_VALUE_ON_INDEX',
'UNWIND {createdRelationships} AS r MATCH (Decision)-[r:HAS_VALUE_ON]->(Characteristic)
CALL apoc.index.addRelationship(r, keys(r)) RETURN count(*)',
{phase:'after'}
)

Implementing a "soft delete" system using sqlalchemy

We are creating a service for an app using Tornado and SQLAlchemy. The application is written in Django and uses a "soft delete" mechanism: nothing is ever deleted from the underlying MySQL tables; to mark a row as deleted we simply set its "deleted" attribute to True. However, in the service we are using SQLAlchemy. Initially we started to add the check for deleted to the queries made through SQLAlchemy itself, like:
customers = db.query(Customer).filter(not_(Customer.deleted)).all()
However, this leads to a lot of potential bugs because developers tend to miss the check for deleted in their queries. Hence we decided to override the default querying with our own query class that does a "pre-filter":
class SafeDeleteMixin(Query):
    def __iter__(self):
        return Query.__iter__(self.deleted_filter())

    def from_self(self, *ent):
        # override from_self() to automatically apply
        # the criterion too. this works with count() and
        # others.
        return Query.from_self(self.deleted_filter(), *ent)

    def deleted_filter(self):
        mzero = self._mapper_zero()
        if mzero is not None:
            crit = mzero.class_.deleted == False
            return self.enable_assertions(False).filter(crit)
        else:
            return self
This was inspired by a recipe in the SQLAlchemy wiki:
https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/PreFilteredQuery
However, we are still facing issues, for example when we filter and update together: the update below does not respect the deleted == False criterion when applying the filter (presumably because Query.update() goes through neither __iter__() nor from_self(), so the pre-filter is never applied).
db = CustomSession(with_deleted=False)()
result = db.query(Customer).filter(Customer.id == customer_id).update({Customer.last_active_time: last_active_time})
How can I implement the "soft delete" feature in SQLAlchemy?
I've done something similar here. We did it a bit differently: we made a service layer that all database access goes through, kind of like a controller, but only for db access. We called it a ResourceManager, and it's heavily inspired by "Domain Driven Design" (great book, invaluable for using SQLAlchemy well). A derived ResourceManager exists for each aggregate root, i.e. each resource class you want to get at things through (though sometimes for really simple ResourceManagers, the derived manager class itself is generated dynamically). It has a method that hands out your base query, and that base query gets filtered for your soft delete before it's handed out. From then on, you can add to that query generatively for filtering, and finally call it with query.one() or first() or all() or count().
Note, there is one gotcha I encountered with this kind of generative query handling: you can hang yourself if you join a table too many times. In some cases for filtering we had to keep track of which tables had already been joined. If your delete filter is on the primary table, just filter that first, and you can join willy-nilly after that.
so something like this:
class ResourceManager(object):
    # these will get filled in by the derived class
    # you could use ABC tools if you want, we don't bother
    model_class = None
    serializer_class = None

    # the resource manager gets instantiated once per request
    # and passed the current request's SQLAlchemy session
    def __init__(self, dbsession):
        self.dbs = dbsession

    # hand out the base query, assuming a boolean 'deleted' column
    @property
    def query(self):
        return self.dbs.query(self.model_class).filter(
            getattr(self.model_class, 'deleted') == False)

class UserManager(ResourceManager):
    model_class = User

# some client code might look like this
dbs = SomeSessionFactoryIHave()
user_manager = UserManager(dbs)
user = user_manager.query.filter_by(name_last="Duncan").first()
Now as long as I always start off by going through a ResourceManager, which has other benefits too (see aforementioned book), I know my query is pre-filtered. This has worked very well for us on a current project that has soft-delete and quite an extensive and thorny db schema.
hth!
I would create a function:
def customer_query():
    return db.session.query(Customer).filter(Customer.deleted == False)
I use query functions so as not to forget default flags, to set flags based on user permissions, to filter using joins, etc., so that these things won't be copy-pasted and forgotten in various places.
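In the same spirit, bulk updates can get their own helper so the soft-delete criterion is always applied (a minimal sketch, assuming the boolean deleted column from above):
def update_customers(criterion, values):
    # filtering on deleted == False first means the bulk UPDATE
    # can never touch soft-deleted rows
    return (db.session.query(Customer)
            .filter(Customer.deleted == False)
            .filter(criterion)
            .update(values, synchronize_session=False))

update_customers(Customer.id == customer_id,
                 {Customer.last_active_time: last_active_time})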

working of openerp-create & write orm methods

Can anyone explain how the create and write ORM methods work in OpenERP? I'm stuck on these methods: I don't understand how they work internally or how to use them in a simple program.
class dumval(osv.osv):
    _name = 'dum_val'
    _columns = {
        'state': fields.selection([('done','confirm'),('cancel','cancelled')],'position',readonly=True),
        'name': fields.char('Name',size=40,required=True,states={'done':[('required','False')]}),
        'lname': fields.char('Last name',size=40,required=True),
        'fname': fields.char('Full name',size=80,readonly=True),
        'addr': fields.char('Address',size=40,required=True,help='enter address'),
    }
    _defaults = {
        'state': 'done',
    }
It would be nice if you could explain using this example.
A couple of comments plus a bit more detail.
As Lukasz answered, the convention is to use periods in your model names: dum.val. Usually something like my_module.my_model, to ensure there are no name collisions (e.g. account.invoice, sale.order).
I am not sure if your conditional "required" in the model will work; this kind of thing is usually done in the view, but it would be worth checking how the field ends up defined in the SQL schema.
The create method creates new records (SQL INSERT). It takes a dict of values, applies any defaults you have specified, inserts the record and returns the new id. Note that you can do compound creates: if you are creating an invoice, you can add the invoice lines into the dictionary and do it all in one create, and OpenERP will take care of the related fields for you; a sketch follows below (ref the write method in https://doc.openerp.com/trunk/server/api_models/)
The write method updates existing records (SQL UPDATE). It takes a dict of values and applies it to all of the ids you pass. This is an important point: if you pass a list of ids, the values will be written to all of them. If you want to update a single record, pass a list with one entry; if you want to apply different updates to different records, you have to make multiple write calls. You can also manage related fields with a write.
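To illustrate such a compound create, here is a hedged sketch in the old osv style (field names follow the stock account.invoice and account.invoice.line models; the (0, 0, vals) triplet is the standard "create a linked record" command for one2many fields; partner_id is assumed to exist in the surrounding method):
invoice_obj = self.pool.get('account.invoice')
invoice_id = invoice_obj.create(cr, uid, {
    'partner_id': partner_id,
    'invoice_line': [  # lines created in the same call
        (0, 0, {'name': 'Widget', 'quantity': 2, 'price_unit': 9.99}),
        (0, 0, {'name': 'Gadget', 'quantity': 1, 'price_unit': 19.99}),
    ],
}, context=context)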
It's the convention to name models like dum.val instead of dum_val.
In dumval class you can write a method:
def abc(self, cr, uid, ids, context=None):
    create_dict = {'name':'xxx','lname':'xxx','fname':'xxx','addr':'xyz'}
    # create a new record and get its id
    new_id = self.create(cr, uid, create_dict, context=context)
    # write on the new record
    self.write(cr, uid, [new_id], {'lname':'yyy'}, context=context)
For more details look: https://www.openerp.com/files/memento/older_versions/OpenERP_Technical_Memento_v0.6.1.pdf

Django aggregate query

I have a model Page, which can have Posts on it. What I want to do is get every Page, plus the most recent Post on that page. If the Page has no Posts, I still want the page. (Sound familiar? This is a LEFT JOIN in SQL).
Here is what I currently have:
Page.objects.annotate(most_recent_post=Max('post__post_time'))
This only gets Pages, but it doesn't get Posts. How can I get the Posts as well?
Models:
class Page(models.Model):
    name = models.CharField(max_length=50)
    created = models.DateTimeField(auto_now_add=True)
    enabled = models.BooleanField(default=True)

class Post(models.Model):
    user = models.ForeignKey(User)
    page = models.ForeignKey(Page)
    post_time = models.DateTimeField(auto_now_add=True)
Depending on the relationship between the two, you should be able to follow the relationships quite easily, and increase performance by using select_related
Taking this:
class Page(models.Model):
    ...

class Post(models.Model):
    page = ForeignKey(Page, ...)
You can follow the forward relationship (i.e. get all the posts and their associated pages) efficiently using select_related:
Post.objects.select_related('page').all()
This will result in only one (larger) query where all the page objects are prefetched.
In the reverse situation (like you have), where you want to get all pages and their associated posts, select_related won't work. See this, this and this question for more information about what you can do.
Probably your best bet is to use the techniques described in the django docs here: Following Links Backward.
After you do:
pages = Page.objects.annotate(most_recent_post=Max('post__post_time'))
posts = [page.post_set.filter(post_time=page.most_recent_post) for page in pages]
Then posts[0] should have the most recent post for pages[0], etc. (note that each element of posts is itself a queryset, normally containing just the latest post). I don't know if this is the most efficient solution, but it was the one mentioned in another post about the lack of left joins in Django.
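If a handful of queries is acceptable, a related sketch using prefetch_related (available since Django 1.4, so this assumes a new enough version) avoids issuing one query per page:
pages = (Page.objects
         .annotate(most_recent_post=Max('post__post_time'))
         .prefetch_related('post_set'))
for page in pages:
    posts = list(page.post_set.all())  # served from the prefetch cache
    latest = max(posts, key=lambda p: p.post_time) if posts else None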
You can create a database view that will contain all Page columns alongside the necessary latest-Post columns:
CREATE VIEW `testapp_pagewithrecentpost` AS
SELECT testapp_page.*, testapp_post.* -- I suggest as few post columns as possible here
FROM `testapp_page` LEFT JOIN `testapp_post`
ON testapp_page.id = testapp_post.page_id
AND testapp_post.post_time =
    ( SELECT MAX(p.post_time)
      FROM testapp_post p
      WHERE testapp_page.id = p.page_id );
Then you need to create a model with managed = False (so that manage.py syncdb won't break). You can also use inheritance from an abstract Model to avoid column duplication:
class PageWithRecentPost(models.Model):  # or extend an abstract BasePost?
    # Page columns go here
    # Post columns go here
    # We use LEFT JOIN, so all columns from the
    # 'post' model will need blank=True, null=True

    class Meta:
        managed = False  # Django will not handle creation/reset automatically
By doing that you can do what you initially wanted: fetch from both tables in just one query:
pages_with_recent_post = PageWithRecentPost.objects.filter(...)
for page in pages_with_recent_post:
    print page.name       # Page column
    print page.post_time  # Post column
However, this approach is not drawback-free:
It's very DB-engine-specific.
You'll need to add the VIEW creation SQL to your project.
If your models are complex, it's very likely you'll need to resolve table column name clashes.
A model based on a database view will very likely be read-only (INSERT/UPDATE will fail).
It adds complexity to your project; allowing multiple queries is definitely a simpler solution.
Changes in Page/Post will require re-creating the view.

NHibernate Partial Update

Is there a way in NHibernate to start with an unproxied model:
var m = new Model() { ID = 1 };
m.Name = "test";
// Model also has .LastName and .Age
and save this model, updating only Name, without first selecting the model from the session?
If the model has properties other than Name, you need to initialize them with the original values from the database; otherwise they will be overwritten with null.
You can use HQL update operations; I have never tried it myself.
You could also use a native SQL statement ("UPDATE model SET name = ...").
Usually this optimization is not needed. The cases where you really need to avoid selecting the data are rare, so writing these SQL statements is usually just a waste of time. You are using an ORM; this means: write your software object-oriented! Otherwise you won't get much advantage from it.
What Stefan says looks like what you need. Please be aware that this is really an edge case and you should be happy with fully loading your entity unless you have some ultra-high-performance issues.
If you simply don't want to hit the database - try using caching - entity cache is very simple and efficient.
If your entity is a huge one - i.e. it contains a blob or something - think about splitting it in two (with many-to-one so that you can utilize lazy loading).
http://www.hibernate.org/hib_docs/nhibernate/html/mapping.html
dynamic-update (optional, defaults to false): Specifies that UPDATE SQL should be generated at runtime and contain only those columns whose values have changed.
Place dynamic-update on the class in the HBM. Then:
var m = new Model() { ID = 1 };
session.Update(m); // attach m to the session
m.Name = "test";
session.Flush(); // the UPDATE is issued on flush