How to get task_id inside the task function in django-q?

I am using django-q to run background tasks in my Django application. I am now in a scenario where I need to get the task_id inside the task function.
A similar functionality in celery would look something like this:
@celery.task
def scan(host):
    print(scan.request.id)
I searched online and could not find a solution. Any help would be appreciated.
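As far as I can tell, django-q does not expose the running task's id inside the function the way celery does. What the documented API (django-q >= 1.0) does give you is the id returned by async_task at enqueue time, and a hook callback that receives the completed Task object, including its id. A minimal sketch of that workaround, reusing the scan example:

from django_q.tasks import async_task

def scan(host):
    # the task body itself only sees its own arguments
    return 'scanned %s' % host

def scan_done(task):
    # the hook receives the completed Task object, including its id
    print(task.id, task.success, task.result)

# async_task returns the task id immediately after queuing
task_id = async_task(scan, 'example.com', hook=scan_done)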

Related

Enable Impala Impersonation on Superset

Is there a way to make the logged-in user (on Superset) run the queries on Impala?
I tried to enable the "Impersonate the logged on user" option on Databases, but with no success: all the queries still run on Impala as the superset user.
I'm trying to achieve the same! This will not completely answer the question, since it does not fully work yet, but I want to share my research in order to maybe help another soul trying to use this tool outside very basic use cases.
I went deep into the code and found out that impersonation is not implemented for Impala, so you cannot achieve this from the UI. I found this PR https://github.com/apache/superset/pull/4699 that for whatever reason was never merged into the codebase, and I tried to copy and paste its code into my Superset version (1.1.0), but it didn't work. Adding some logs, I can see that the configuration with the impersonation is updated, but the actual Impala query still runs as the user I used to start the process.
As you can imagine, I am a complete noob at this. However, I found out that the impersonation happens when you create a cursor: there is a constructor parameter in which you can pass the impersonation configuration.
I managed to correctly (at least to my understanding) implement impersonation for the SQL Lab part.
In the sql_lab.py module you have to add the following lines to the execute_sql_statements method:
with closing(engine.raw_connection()) as conn:
    # closing the connection closes the cursor as well
    cursor = conn.cursor(**database.cursor_kwargs)
where cursor_kwargs is defined in db_engine_specs/impala.py as follows:
@classmethod
def get_configuration_for_impersonation(cls, uri, impersonate_user, username):
    logger.info(
        'Passing Impala execution_options.cursor_configuration for impersonation')
    return {'execution_options': {
        'cursor_configuration': {'impala.doas.user': username}}}

@classmethod
def get_cursor_configuration_for_impersonation(cls, uri, impersonate_user,
                                               username):
    logger.debug('Passing Impala cursor configuration for impersonation')
    return {'configuration': {'impala.doas.user': username}}
Finally, in models/core.py you have to add the following bits to the get_sqla_engine method:
params = extra.get("engine_params", {})  # that was already there, just to help you find the spot
self.cursor_kwargs = self.db_engine_spec.get_cursor_configuration_for_impersonation(
    str(url), self.impersonate_user, effective_username)  # this is the line I added
...
params.update(self.get_encrypted_extra())  # already there

# new stuff
configuration = {}
configuration.update(
    self.db_engine_spec.get_configuration_for_impersonation(
        str(url),
        self.impersonate_user,
        effective_username))
if configuration:
    params.update(configuration)
As you can see, I just shamelessly pasted the code from the PR. However, as I already said, this only works for SQL Lab. The dashboards use an entirely different way of querying Impala that I have not yet figured out.
This means that queries for the dashboards are handled in a different way, and there is nothing like this:
with closing(engine.raw_connection()) as conn:
    # closing the connection closes the cursor as well
    cursor = conn.cursor(**database.cursor_kwargs)
My gut (and debugging) feeling is that you first need to understand the SQLAlchemy part and extend a new ImpalaEngine class that uses a custom cursor with the impersonation configuration. Or something like that; in any case, it is not as simple (if we want to call this simple) as the sql_lab part. So the trick is to find out where the query is executed and create a cursor with the impersonation configuration. Easy, isn't it?
I hope this sheds some light for you and the others who have this issue. Let me know if you found another way to solve it, or if this comment was useful.
Update: something really useful.
A colleague of mine successfully implemented impersonation with Impala without touching anything Superset-related, instead working directly on the impyla lib. A PR was opened with the code changes. You can apply the patch directly to the impyla source used by Superset; you have to edit both dbapi.py and hiveserver2.py.
As a reminder: we are still testing this, and we do not know whether it works with different accounts using the same Superset instance.

Click: Test click.group commands without running their code

I'm currently writing tests for my application, and therefore I have to test some click.group commands I defined.
Let's say I defined them like:
@click.group(cls=MyGroup)
@click.pass_context
def myapp(ctx):
    init_stuff()

@myapp.command()
@click.option('--my-option')  # click maps this to the my_option parameter
def foo(my_option: str) -> None:
    do_stuff()  # change some files, print, create other files
I know that I could use the CliRunner from click.testing. However, I just want to make sure that the command is called, but I DON'T want it to execute any code (for example via CliRunner.invoke()).
How could this be done?
I couldn't come up with a solution using mocking, with foo for example. Or do I have to execute code, say using the isolated_filesystem() that CliRunner provides?
So the question is: what would be the most efficient way to test my commands when they are defined as shown above?
Many thanks in advance.
You could add a --dry-run flag to your group or to some commands and save it inside the context; if the flag is enabled, do not execute any code. Then you can use CliRunner.invoke() with --dry-run enabled and just check that your invocations happened, without actually executing the code.
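A minimal sketch of that pattern, reusing the myapp/foo names from the question (do_stuff stands in for the real side effects):

import click
from click.testing import CliRunner

def do_stuff():
    ...  # placeholder for the question's real implementation

@click.group()
@click.option('--dry-run', is_flag=True, default=False)
@click.pass_context
def myapp(ctx, dry_run):
    # stash the flag so every subcommand can see it
    ctx.obj = {'dry_run': dry_run}

@myapp.command()
@click.option('--my-option')
@click.pass_context
def foo(ctx, my_option):
    if ctx.obj['dry_run']:
        click.echo(f'would run foo with {my_option}')
        return
    do_stuff()  # the real side effects, skipped in dry-run tests

def test_foo_dry_run():
    runner = CliRunner()
    result = runner.invoke(myapp, ['--dry-run', 'foo', '--my-option', 'x'])
    assert result.exit_code == 0
    assert 'would run foo' in result.output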

dask jobqueue and scheduler_file

For dask_jobqueue, is it OK to pass both an SGECluster and scheduler_file when creating a Client?
Something like this:
client = Client(cluster, scheduler_file='shirley.json')
The real reason is that I want my workers from dask_jobqueue to run on a specified IP/port. Thanks so much for the help.
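For what it's worth, the two connection styles are normally alternatives rather than combined: a Client connects either through the cluster object or through a scheduler file. A sketch under that assumption, using scheduler_options to pin the scheduler port (the option keys here are assumptions to verify against your dask-jobqueue version):

from dask.distributed import Client
from dask_jobqueue import SGECluster

# Pin the scheduler to a fixed port; the 'port' key is an assumption
# to check against the dask-jobqueue docs for your version.
cluster = SGECluster(cores=4, memory='8GB',
                     scheduler_options={'port': 8786})
cluster.scale(jobs=2)

# Connect via the cluster object...
client = Client(cluster)
# ...or, from another process, via a scheduler file -- but not both at once:
# client = Client(scheduler_file='shirley.json')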

BigQuery API for getting the job information

Support team,
I am using the Google BigQuery API Python library to test some operations. One purpose is to get the job information, so we can better control all our queries from the API. I found there is a get() method mentioned in the REST reference here, which can get the job information. But in the API Python library here, I cannot find any doc about this get() method or anything that performs the same operation.
Can you point me to any guide doc about a method in the Python library that can get the job information?
Thanks
Zhihong
You are looking at the documentation for the translate API rather than BigQuery. See job_from_resource under the BigQuery client documentation.
Based on Elliott's suggestion, I got the job info I need after running a query, but I have not figured out how to fetch the job info for an existing job, which I think is no longer needed if I get the query info after each operation. The Python code is as below:
from google.cloud import bigquery

client = bigquery.Client()
query = client.run_sync_query(sql)
query.use_legacy_sql = False
query.use_query_cache = True
query.run()
trows = query.total_rows
billed_byte = query.total_bytes_processed
More query info parameters can be found here and more example code can be found here.
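As a side note, in current versions of google-cloud-bigquery the synchronous-query helpers above are gone; the job metadata hangs off the job object returned by Client.query(), and an existing job can be fetched by id with Client.get_job(). A rough sketch:

from google.cloud import bigquery

client = bigquery.Client()

# Run a query; the returned job object carries the job metadata.
query_job = client.query('SELECT 1')
query_job.result()  # wait for the query to finish
print(query_job.job_id, query_job.total_bytes_processed)

# Fetch info for an existing job by its id.
job = client.get_job(query_job.job_id)
print(job.state, job.created)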

getting result from a function running "deferToThread"

I have recently started working with Twisted and am not very familiar with its functions. I have a problem related to the deferToThread method; my code using this method is here:
from twisted.internet.threads import deferToThread
from twisted.internet import reactor

results = []

class Tool(object):
    def exectool(self, tool):
        # print "Test Class Exec tool running..........."
        exec tool
        return

    def getResult(self, tool):
        return results.append(deferToThread(self.exectool, tool))

to = Tool()
to.getResult(tools)

f = open(temp).read()
obj_tool = compile(f, 'a_filename', 'exec')
[<code object <module> at 0x8ce7020, file "a_filename", line 1>, <code object <module> at 0x8cd4e30, file "a_filename", line 2>]
I am passing tools one by one to the getResult() method; it executes successfully and prints the results of whatever script is written in the file objects.
I have to store the result of the tools' execution in some variable so that I can save it to a database. How can I achieve this? When I call re = to.getResult(tools) and print re, it prints None.
I have to store its results in a database; is there something I can do?
Thanks in advance.
There are two problems here.
First, deferToThread will not work if you never start the reactor. Hopefully this code snippet was actually extracted from a larger Twisted-using application where the reactor is running, so that won't be an actual problem for you. But you shouldn't expect this snippet to work unless you add a reactor.run() call to it.
Second, deferToThread returns a Deferred. The Deferred fires with the result of the callable you passed in. This is covered in the API documentation. Many APIs in Twisted return a Deferred, so you might want to read the documentation covering them. Once you understand how they work and how to use them, lots of things should be quite a bit easier.
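A minimal sketch of that pattern, with a hypothetical run_tool helper and a print standing in for the database write:

from twisted.internet import reactor
from twisted.internet.threads import deferToThread

def run_tool(tool):
    # run the tool's code in a scratch namespace and return what it produced
    namespace = {}
    exec(tool, namespace)
    return namespace.get('result')

def save_to_database(value):
    print('would save:', value)  # stand-in for a real database write
    return value

# deferToThread returns a Deferred; attach callbacks to receive the result
d = deferToThread(run_tool, 'result = 2 + 2')
d.addCallback(save_to_database)
d.addBoth(lambda _: reactor.stop())  # stop the reactor when done
reactor.run()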