Airflow: Best way to pass BigQuery result as XCom [duplicate] - google-bigquery

I'm using Airflow 1.8.1 and I want to push the result of a SQL request from a PostgresOperator.
Here are my tasks:
check_task = PostgresOperator(
    task_id='check_task',
    postgres_conn_id='conx',
    sql="check_task.sql",
    xcom_push=True,
    dag=dag)

def py_is_first_execution(**kwargs):
    value = kwargs['ti'].xcom_pull(task_ids='check_task')
    print 'count ----> ', value
    if value == 0:
        return 'next_task'
    else:
        return 'end-flow'

check_branch = BranchPythonOperator(
    task_id='is-first-execution',
    python_callable=py_is_first_execution,
    provide_context=True,
    dag=dag)
and here is my sql script:
select count(1) from table
When I check the XCom value from check_task, it is None.

If I'm correct, Airflow automatically pushes an operator's return value to XCom. However, when you look at the code of the PostgresOperator, you see that its execute method calls the run method of the PostgresHook (an extension of DbApiHook). Neither method returns anything, so nothing is pushed to XCom.
What we did to fix this is create a CustomPostgresSelectOperator, a copy of the PostgresOperator, but instead of 'hook.run(..)' it does 'return hook.get_records(..)'.
Hope that helps you.
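For reference, a minimal sketch of such an operator, assuming Airflow 1.8-era imports (the class name matches the one described above, but the exact constructor arguments are an assumption):

import logging

from airflow.hooks.postgres_hook import PostgresHook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class CustomPostgresSelectOperator(BaseOperator):
    """Copy of PostgresOperator whose execute() returns the query result,
    so Airflow pushes it to XCom."""

    template_fields = ('sql',)
    template_ext = ('.sql',)

    @apply_defaults
    def __init__(self, sql, postgres_conn_id='postgres_default',
                 parameters=None, *args, **kwargs):
        super(CustomPostgresSelectOperator, self).__init__(*args, **kwargs)
        self.sql = sql
        self.postgres_conn_id = postgres_conn_id
        self.parameters = parameters

    def execute(self, context):
        logging.info('Executing: %s', self.sql)
        hook = PostgresHook(postgres_conn_id=self.postgres_conn_id)
        # get_records(..) instead of run(..): the returned rows become the XCom value
        return hook.get_records(self.sql, parameters=self.parameters)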

Finally, I created a new operator, SqlExecuteOperator, in the plugin manager under $AIRFLOW_HOME/plugins.
I used CheckOperator as an example and modified the returned value: the basic behaviour of that operator was exactly the reverse of what I needed.
Here is the default operator I started from: CheckOperator.
And here is my customized operator:
import logging

from airflow.hooks.base_hook import BaseHook
from airflow.models import BaseOperator
from airflow.utils.decorators import apply_defaults


class SqlExecuteOperator(BaseOperator):
    """
    Executes a query against a db. Based on ``CheckOperator``, which expects
    a sql query that will return a single row. get_db_hook returns a hook
    that gets a single record from an external source.

    :param sql: the sql to be executed
    :type sql: string
    """
    template_fields = ('sql',)
    template_ext = ('.hql', '.sql',)
    ui_color = '#fff7e6'

    @apply_defaults
    def __init__(
            self, sql,
            conn_id=None,
            *args, **kwargs):
        super(SqlExecuteOperator, self).__init__(*args, **kwargs)
        self.conn_id = conn_id
        self.sql = sql

    def execute(self, context=None):
        logging.info('Executing SQL statement: ' + self.sql)
        records = self.get_db_hook().get_first(self.sql)
        logging.info("Record: " + str(records))
        records_int = int(records[0])
        print (records_int)
        return records_int

    def get_db_hook(self):
        return BaseHook.get_hook(conn_id=self.conn_id)
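With the plugin loaded, the original check_task could be declared with this operator instead of the PostgresOperator (a usage sketch; the exact import path depends on how the plugin registers the operator):

check_task = SqlExecuteOperator(
    task_id='check_task',
    conn_id='conx',
    sql='check_task.sql',
    dag=dag)

Because execute() returns the count, the xcom_pull in py_is_first_execution now receives an integer instead of None.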

Related

Odoo: Access field by its name (given as a string)

I have a model, where I want to access a field, given by a string. Example:
def test(self):
    field = 'name'
    name = getattr(self, field)
This works fine - name is set to self.name. But then I want to access a related field:
def test2(self):
    field = 'partner_id.name'
    name = getattr(self, field)
That doesn't work (because 'partner_id.name' does not exist on self). Any idea how to do it right?
getattr doesn't support the dot notation, only simple attribute names. You can however create a simple function that does:
def getfield(model, field_name):
    value = model
    for part in field_name.split('.'):
        value = getattr(value, part)
    return value
You would use it like this:
def test2(self):
    field = 'partner_id.name'
    name = getfield(self, field)
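As an aside, the standard library's operator.attrgetter already resolves dotted attribute paths, so an equivalent one-liner (using the same hypothetical test2 method as above) would be:

import operator

def test2(self):
    field = 'partner_id.name'
    # attrgetter('partner_id.name')(self) resolves self.partner_id.name step by step
    name = operator.attrgetter(field)(self)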
You need to use the object that contains partner_id.name:
def test2(self):
    field = 'name'
    object = self.pool.get('res.partner').browse(cr, uid, self.partner_id.id)  # v7
    # object = self.env['res.partner'].browse(self.partner_id.id)  # v8
    name = getattr(object, field)
I also came across another solution, inspired by the mail template system:
from openerp.tools.safe_eval import safe_eval as eval
def test2(self):
    field = 'partner_id.name'
    field = 'object.' + field
    name = eval(field, {'object': self})

How to standardize field values on create or write in Odoo 8?

In Odoo 8, is there a preferred method for standardizing field values on create or write? Several methods come to mind, but this functionality seems like it belongs in the API. Essentially, I want to define a field with a standardize function, somewhat like a compute field that only specifies an inverse function. Does this already exist somewhere in the API?
Method 0: Create a field that specifies a standardize function.
The only flaw that I can see with this method is that the API does not exist.
import openerp

class Model(openerp.models.Model):
    _name = 'addon.model'

    field = openerp.fields.Text(
        required=True,
        standardize='_standardize_field',
    )

    @openerp.api.one
    def _standardize_field(self):
        self.field = self.field.upper()
Method 1: Override the create and write methods to insert a call to standardize the field.
This works, but seems rather verbose for what could be done with a single function, above. Note that the constraint is required if required=True and the standardization might yield an empty field.
import openerp

class Model(openerp.models.Model):
    _name = 'addon.model'

    field = openerp.fields.Text(
        required=True,
    )

    @openerp.api.one
    @openerp.api.constrains('field')
    def _constrains_field(self):
        if len(self.field) == 0:
            raise openerp.exceptions.ValidationError('Field must be valid.')

    def _standardize(self, args):
        if 'field' in args:
            # Return standardized field or empty string.
            args['field'] = args['field'].upper()

    @openerp.api.model
    def create(self, args):
        self._standardize(args)
        return super(Model, self).create(args)

    @openerp.api.multi
    def write(self, args):
        self._standardize(args)
        super(Model, self).write(args)
        return True
Method 2: Use a computed field and a bit of magic.
This works but feels a bit contrived. In addition, this method requires that the standardization function is deterministic, or this may create an infinite loop. Note that the standardization function may be called twice, which could be a concern if standardization is an expensive operation.
import openerp

class Model(openerp.models.Model):
    _name = 'addon.model'

    field = openerp.fields.Text(
        compute=lambda x: x,
        inverse='_inverse_field',
        required=True,
        store=True,
    )

    @openerp.api.one
    @openerp.api.constrains('field')
    def _constrains_field(self):
        if self._standardize_field() is None:
            raise openerp.exceptions.ValidationError('Field must be valid.')

    def _inverse_field(self):
        field = self._standardize_field()
        # If the field is updated during standardization, this function will
        # run a second time, so use this check to prevent an infinite loop.
        if self.field != field:
            self.field = field

    def _standardize_field(self):
        # Return the standardized field.
        return self.field.upper()
Method 3: Use a regular field and a computed field, with only the computed field being exposed in the view.
The readonly flag and the constraints help to protect the underlying field, but I am not certain that this method would maintain data integrity, and the method as a whole feels contrived.
import openerp

class Model(openerp.models.Model):
    _name = 'addon.model'

    field = openerp.fields.Text(
        readonly=True,
        required=True,
    )
    field_for_view = openerp.fields.Text(
        compute='_compute_field_for_view',
        inverse='_inverse_field_for_view',
        required=True,
    )

    @openerp.api.one
    @openerp.api.depends('field')
    def _compute_field_for_view(self):
        self.field_for_view = self.field

    @openerp.api.one
    @openerp.api.constrains('field', 'field_for_view')
    def _constrains_field(self):
        if self._standardize_field() is None:
            raise openerp.exceptions.ValidationError('Field must be valid.')

    def _inverse_field_for_view(self):
        self.field = self._standardize_field()

    def _standardize_field(self):
        # Return the standardized field.
        return self.field_for_view.upper()
Maybe the 'default' attribute is an implementation of your approach #1?
Here's the example taken from the Odoo8 documentation at https://www.odoo.com/documentation/8.0/reference/orm.html#creating-models
# the default function must be defined before the field assignment references it
def compute_default_value(self):
    return self.get_value()

a_field = fields.Char(default=compute_default_value)
Another option is to override the write() method in your subclass to add your call like so:
def write(self, vals):
    for record in self:
        # do the cleanup here for each record, storing the result in
        # vals again
        pass
    # call the super:
    res = super(extendedProject, self).write(vals)
    return res
vals is a dictionary with the modified values to store; self is a recordset with all records to store the values to. Note that the transaction in Odoo may still be rolled back after returning from your call to write.
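To make the cleanup step concrete, here is a minimal sketch assuming the question's text field and an upper-casing rule (the inherited model name and the rule are placeholders):

import openerp

class Model(openerp.models.Model):
    _inherit = 'addon.model'  # hypothetical model from the question

    @openerp.api.multi
    def write(self, vals):
        # vals is shared by every record in self, so it can be normalized once
        if vals.get('field'):
            vals['field'] = vals['field'].upper()
        return super(Model, self).write(vals)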

How do I loop through each DB field to see if the range is correct

I have this response in soapUI:
<pointsCriteria>
    <calculatorLabel>Have you registered for inContact, signed up for marketing news from FNB/RMB Private Bank, updated your contact details and chosen to receive your statements</calculatorLabel>
    <description>Be registered for inContact, allow us to communicate with you (i.e. update your marketing consent to 'Yes'), receive your statements via email and keep your contact information up to date</description>
    <grades>
        <points>0</points>
        <value>No</value>
    </grades>
    <grades>
        <points>1000</points>
        <value>Yes</value>
    </grades>
    <label>Marketing consent given and Online Contact details updated in last 12 months</label>
    <name>c21_mrktng_cnsnt_cntct_cmb_point</name>
</pointsCriteria>
There are many pointsCriteria elements, and I use the XQuery below to give me the DB field and the range of values that field is meant to contain:
<return>
{
    for $x in //pointsCriteria
    return <DBRange>
        <db>{data($x/name/text())}</db>
        <points>{data($x//points/text())}</points>
    </DBRange>
}
</return>
And I get the below response:
<return><DBRange><db>c21_mrktng_cnsnt_cntct_cmb_point</db><points>0 1000</points></DBRange>
That last bit sits in a property transfer. I need SQL to bring back all rows where that DB field is not in that points range (the field can only be 0 or 1000 in this case). My problem is that I don't know how to loop through each DBRange in this manner. Please help.
I'm not sure that I really understand your question; however, I think that you want to query a specific table in your DB using the column name defined in the <db> field of your XML, with the values defined in the <points> field of the same XML.
So you can try using a Groovy TestStep: first parse your XML to get back your column name and your points. Since the point values are separated by a blank space, you can split(" ") to get a list and then use each() to iterate over the points in that list. Then, using groovy.sql.Sql, you can perform the queries against your DB.
One more thing: you need to put the JDBC drivers for your DB vendor in $SOAPUI_HOME/bin/ext and then restart SoapUI so that it can load the necessary driver classes.
So the following code approach can achieve your goal:
import groovy.sql.Sql
import groovy.util.XmlSlurper

// soapui groovy testStep requires that first register your
// db vendor drivers, as example I use oracle drivers...
com.eviware.soapui.support.GroovyUtils.registerJdbcDriver("oracle.jdbc.driver.OracleDriver")

// connection properties db (example for oracle data base)
def db = [
    url      : 'jdbc:oracle:thin:@db_host:db_port/db_name',
    username : 'yourUser',
    password : '********',
    driver   : 'oracle.jdbc.driver.OracleDriver'
]

// create the db instance
def sql = Sql.newInstance("${db.url}", "${db.username}", "${db.password}", "${db.driver}")

def result = '''<return>
    <DBRange>
        <db>c21_mrktng_cnsnt_cntct_cmb_point</db>
        <points>0 1000</points>
    </DBRange>
</return>'''

def resXml = new XmlSlurper().parseText(result)
// get the field
def field = resXml.DBRange.db.text()
// get the points
def points = resXml.DBRange.points.text()
// points are separated by blank space,
// so split to get an array with the points
def pointList = points.split(" ")
// for each point make your query
pointList.each {
    def sqlResult = sql.rows "select * from your_table where ${field} = ?", [it]
    log.info sqlResult
}
sql.close();
Hope this helps,
Thanks again for your help @albciff. I had to put this into a multidimensional array (I renamed field to column, and result is the large return from the XQuery above):
def resXml = new XmlSlurper().parseText(result)

// get the columns and points ranges
def Column = resXml.DBRange.db*.text()
def Points = resXml.DBRange.points*.text()

// sorting it all out into a multidimensional array (index per index)
count = 0
bigList = Column.collect {
    [it, Points[count++]]
}

// iterating through the array
bigList.each {
    // creating two smaller lists and making it readable for sql part later
    def column = it[0]
    def points = it[1]
    // further splitting the points to test each
    pointList = points.split(" ")
    pointList.each {
        // test each points range per column
        def sqlResult = sql.rows "select * from my_table where ${column} <> ?", [it]
        log.info sqlResult
    }
}
sql.close();
return;

OpenERP domain : char compare

Here is what I've tried to do:
<field name="of_num" domain="[('etat','=','Terminé')]"/>
where 'of_num' is a many2one field and 'etat' is a function field of char type.
But it does not seem to work: I still get all records in my dropdown list.
I have also tried with other text containing no unicode chars, but the result is the same.
I also tried the 'ilike' operator and tried to put the domain in Python code within the field definition, but with no luck.
EDITED
I've figured out the source of my problem:
the field 'etat' is computed but not stored, since I'm using store=False.
It works with store=True.
Still, I don't want to store it, because my value needs to be computed every time a view is loaded.
Could anyone please help me do that without having to store my value? Thank you.
The only solution I've found to get around my problem is to use a stored Boolean field that is updated every time the function of the function field 'etat' is computed.
Use fnct_search. For functional fields there is an argument called 'fnct_search' which returns a search domain condition.
For example:
_columns = {
    'a': fields.float('A'),
    'b': fields.float('B'),
    'total_fn': fields.function(_total, fnct_search=_total_search, string='Total'),
}

def _total(self, cr, uid, ids, name, arg, context=None):
    res = {}
    for obj in self.browse(cr, uid, ids, context):
        res[obj.id] = obj.a + obj.b
    return res

def _total_search(self, cursor, user, obj, name, args, domain=None, context=None):
    ids = set()
    for cond in args:
        amount = cond[2]
        cursor.execute("select id from your_table having sum(a,b) %s %%s" % (cond[1]), (amount,))
        res_ids = set(id[0] for id in cursor.fetchall())
        ids = ids and (ids & res_ids) or res_ids
    if ids:
        return [('id', 'in', tuple(ids))]
    return [('id', '=', '0')]
Here _total returns the value to display for the field total_fn, and the fnct_search function returns the list of tuples needed for searching. So whenever the domain [('total_fn','=',1500)] is given, _total_search is called and rewrites it into a concrete domain on the record ids.
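As a small usage sketch (the model name 'your.model' is a placeholder), a search with that domain is then routed through _total_search instead of scanning the non-stored values:

# old-API search; the functional field's condition is translated by _total_search
ids = self.pool.get('your.model').search(cr, uid, [('total_fn', '=', 1500)], context=context)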

Add extra field in Django QuerySet as timedelta type

I have the following model:
class UptimeManager(models.Manager):
    def with_length(self):
        """Get querySet of uptimes sorted by length including the current one."""
        extra_length = Uptime.objects.extra(select={'length':
            """
            SELECT
            IF (end is null,
                timestampdiff(second, begin, now()),
                timestampdiff(second, begin, end))
            FROM content_uptime c
            WHERE content_uptime.id = c.id
            """
        })
        return extra_length


class Uptime(models.Model):
    begin = models.DateTimeField('beginning')
    end = models.DateTimeField('end', null=True)
    host = models.ForeignKey("Host")
    objects = UptimeManager()
    ...

Then I call Uptime.objects.with_length().order_by('-length')[:10] to get the list of longest uptimes.
But the length in the template is of integer type. How can I modify my code so that the length of the objects returned by the manager is accessible in the template as a timedelta object?
I could almost do it by returning a list and converting the number of seconds to timedelta objects, but then I would have to do sorting, filtering, etc. in my Python code, which is rather inefficient compared to one well-done SQL query.
Add a property to the model that looks at the actual field and converts it to the appropriate type.
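A minimal sketch of that idea (it assumes the instance came from with_length(), so the extra integer 'length' attribute is present; the property name is an assumption):

import datetime

from django.db import models


class Uptime(models.Model):
    # ... fields and manager as in the question ...

    @property
    def length_timedelta(self):
        # 'length' is the number of seconds added by the .extra() select in with_length()
        return datetime.timedelta(seconds=self.length)

In the template you would then use {{ uptime.length_timedelta }} while still ordering by the raw 'length' column in SQL.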
My solution is to create a template filter that determines the type of the length variable and returns a timedelta in case it is an integer type:
import datetime
from io import UnsupportedOperation

from django import template

register = template.Library()


def timedelta(value):
    if isinstance(value, (long, int)):
        return datetime.timedelta(seconds=value)
    elif isinstance(value, datetime.timedelta):
        return value
    else:
        raise UnsupportedOperation

register.filter('timedelta', timedelta)
and using it in a template is trivial:
{{ uptime.length|timedelta }}