Enable Impala Impersonation on Superset

Is there a way to make the logged-in user (on Superset) run the queries on Impala?
I tried enabling the "Impersonate the logged on user" option on Databases, but with no success: all the queries still run on Impala as the superset user.

I'm trying to achieve the same! This will not completely answer the question, since it still does not fully work, but I want to share my research in order to maybe help another soul trying to use this tool outside very basic use cases.
I went deep into the code and found out that impersonation is not implemented for Impala, so you cannot achieve this from the UI. I found this PR https://github.com/apache/superset/pull/4699 that for whatever reason was never merged into the codebase, and tried to copy and paste its code into my Superset version (1.1.0), but it didn't work. Adding some logs I can see that the configuration with the impersonation is updated, but the actual Impala query still runs as the user I used to start the process.
As you can imagine, I am a complete noob at this. However, I found out that the impersonation happens when you create a cursor: there is a constructor parameter through which you can pass the impersonation configuration.
I managed to correctly (at least to my understanding) implement impersonation for the SQL Lab part.
In the sql_lab.py module you have to add the following lines to the execute_sql_statements method:
with closing(engine.raw_connection()) as conn:
    # closing the connection closes the cursor as well
    cursor = conn.cursor(**database.cursor_kwargs)
where cursor_kwargs is defined in db_engine_specs/impala.py as follows:
@classmethod
def get_configuration_for_impersonation(cls, uri, impersonate_user, username):
    logger.info(
        'Passing Impala execution_options.cursor_configuration for impersonation')
    return {'execution_options': {
        'cursor_configuration': {'impala.doas.user': username}}}

@classmethod
def get_cursor_configuration_for_impersonation(cls, uri, impersonate_user,
                                                username):
    logger.debug('Passing Impala cursor configuration for impersonation')
    return {'configuration': {'impala.doas.user': username}}
Finally, in models/core.py you have to add the following bit to the get_sqla_engine method:
params = extra.get("engine_params", {})  # that was already there, just to help you find the spot
self.cursor_kwargs = self.db_engine_spec.get_cursor_configuration_for_impersonation(
    str(url), self.impersonate_user, effective_username)  # this is the line I added
...
params.update(self.get_encrypted_extra())  # already there
# new stuff
configuration = {}
configuration.update(
    self.db_engine_spec.get_configuration_for_impersonation(
        str(url),
        self.impersonate_user,
        effective_username))
if configuration:
    params.update(configuration)
As you can see, I just shamelessly pasted the code from the PR. However, as I already said, this only works for SQL Lab. For the dashboards there is an entirely different way of querying Impala that I have not yet figured out.
This means that queries for the dashboards are handled in a different way, and there is nothing like this:
with closing(engine.raw_connection()) as conn:
    # closing the connection closes the cursor as well
    cursor = conn.cursor(**database.cursor_kwargs)
My gut (and debugging) feeling is that you first need to understand the SQLAlchemy part and extend a new Impala engine/dialect class that uses a custom cursor with the impersonation configuration, or something like that. However, it is not as simple (if we want to call this simple) as the sql_lab part. So the trick is to find out where the query is executed and create the cursor with the impersonation configuration. Easy, isn't it? A rough sketch of what I mean is below.
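In case it helps, here is a rough, untested sketch. SQLAlchemy lets a dialect control how cursors are created through its execution context, so a custom Impala dialect could pass the doas configuration to every cursor. The names below (ImpalaDialect from impyla, execution_ctx_cls, create_cursor, _dbapi_connection, the "cursor_configuration" execution option) are assumptions based on impyla and SQLAlchemy 1.3 internals, not something I have verified inside Superset:
# Untested sketch: a custom Impala dialect whose execution context passes the
# impersonation configuration to every impyla cursor it creates.
from impala.sqlalchemy import ImpalaDialect  # impyla's SQLAlchemy dialect
from sqlalchemy.engine.default import DefaultExecutionContext


class ImpersonatingExecutionContext(DefaultExecutionContext):
    def create_cursor(self):
        # execution_options would be populated from get_configuration_for_impersonation
        conf = self.execution_options.get("cursor_configuration", {})
        # impyla's HiveServer2 cursor accepts a `configuration` dict,
        # which is where 'impala.doas.user' goes
        return self._dbapi_connection.cursor(configuration=conf)


class ImpersonatingImpalaDialect(ImpalaDialect):
    execution_ctx_cls = ImpersonatingExecutionContext
You would still have to register this dialect with SQLAlchemy and make Superset use it (and actually set the execution option), so take it as a direction to investigate rather than a working patch.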
I hope this sheds some light for you and the others who have this issue. Let me know if you found another way to solve it, or whether this comment was useful.
Update: something really useful
A colleague of mine successfully implemented impersonation with Impala without touching anything Superset-related, instead working directly with the impyla lib. A PR was opened with the code to change. You can apply the patch directly to the impyla source used by Superset; you have to edit both dbapi.py and hiveserver2.py.
As a reminder: we are still testing this and we do not yet know whether it works with different accounts using the same Superset instance.
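For reference, the impyla-level hook this relies on is the configuration argument of the HiveServer2 cursor. Outside Superset the idea looks roughly like this; the host is a placeholder, and the connecting account must be allowed to delegate (proxy) on the Impala side:
# Rough illustration (not the colleague's patch itself): pass the delegated
# user through the cursor's configuration dict.
from impala.dbapi import connect

conn = connect(host="impala-host.example.com", port=21050)  # placeholder host
cur = conn.cursor(configuration={"impala.doas.user": "end_user"})
cur.execute("SELECT effective_user()")  # should report the delegated user
print(cur.fetchall())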

Related

ClientCacheConfiguration is not saved to table

I was using CacheConfiguration in Ignite until I got stuck on the issue of how to authenticate.
Because of that I started changing the CacheConfiguration to ClientCacheConfiguration. However, after converting to ClientCacheConfiguration I noticed that it
is not able to save into a table because it lacks the setIndexedTypes method, e.g.
Before
CacheConfiguration<String, IgniteParRate> cacheCfg = new CacheConfiguration<>();
cacheCfg.setName(APIConstants.CACHE_PARRATES);
cacheCfg.setIndexedTypes(String.class, IgniteParRate.class);
New
ClientCacheConfiguration cacheCfg = new ClientCacheConfiguration();
cacheCfg.setName(APIConstants.CACHE_PARRATES);
//cacheCfg.setIndexedTypes(String.class, IgniteParRate.class); --> this is not provided
I still need the table to be populated so it is easier for us to verify (using a client IDE like DBeaver).
Is there any way to solve this issue?
If you need to create tables/caches dynamically using the thin client, you'll need to use the setQueryEntities() method to define the columns available to SQL "manually" (passing in the classes with annotations is basically a shortcut for defining the query entities); see the sketch below. I'm not sure why setIndexedTypes() isn't available in the thin client; maybe a question for the developer mailing list.
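As a rough sketch (the field and table names are made up, since I don't know your model, and igniteClient is your already-started IgniteClient), it could look like this:
// Hedged sketch: describe the SQL schema explicitly so the thin client can
// create a cache that is queryable from DBeaver.
import java.util.LinkedHashMap;
import org.apache.ignite.cache.QueryEntity;
import org.apache.ignite.client.ClientCache;
import org.apache.ignite.client.ClientCacheConfiguration;

LinkedHashMap<String, String> fields = new LinkedHashMap<>();
fields.put("currency", "java.lang.String");   // assumed field
fields.put("rate", "java.lang.Double");       // assumed field

QueryEntity entity = new QueryEntity(String.class.getName(), IgniteParRate.class.getName())
        .setTableName("PARRATES")             // assumed table name
        .setFields(fields);

ClientCacheConfiguration cacheCfg = new ClientCacheConfiguration()
        .setName(APIConstants.CACHE_PARRATES)
        .setQueryEntities(entity);

ClientCache<String, IgniteParRate> cache = igniteClient.getOrCreateCache(cacheCfg);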
Alternatively, you can define your caches/tables in advance using a thick client. They'll still be available when using the thin-client.
To add to the existing answer, you can also try to use cache templates for that.
https://apacheignite.readme.io/docs/cache-template
Pre-configure the templates, then use them when creating caches from the thin client.

Set variables in Javascript job entry at root level

I need to set variables at root scope in one job, to be used in a different job. The first job has a JavaScript job entry with the statements:
parent_job.setVariable("customers_full_path", "C:\\customers22.csv", "r");
true;
But the compilation fails with:
Couldn't compile javascript:
org.mozilla.javascript.EvaluatorException: Can't find method
org.pentaho.di.job.Job.setVariable(string,string,string). (#2)
How to set a variable at root level in a Javascript job entry?
Sorry for the passive-aggressive tone, but:
I don't know if you are new to Pentaho, but the most common mistake for new users with previous programming knowledge is to be sort of 'addicted' to known methods, and here you are using JavaScript for functionality that is built into the tool. Both transformations (KTR) and jobs (KJB) have a step for setting variables; you can manipulate this better in a KTR.
JavaScript steps slow down the flow considerably, so try to stay away from them as much as possible.
EDIT:
Reading this article, it seems the only thing you're doing wrong is the actual syntax of the command.
Correct usage:
parent_job.setVariable("name_of_variable", "desired value");
The command you described has 3 parameters when it should have 2 (the variable name, then its value). If you have more than one variable to set, call the command once per variable. Try it out and see if it works.
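Applied to the variable from the question, the JavaScript entry would then be (a sketch of the two-argument form; I have not run it against your jobs):
// Set the variable on the parent job: name first, then value.
parent_job.setVariable("customers_full_path", "C:\\customers22.csv");
true;
Downstream jobs can then read it as ${customers_full_path}; if you need the variable above the parent job, the built-in Set Variables step mentioned in the other answer lets you choose the scope (including the root job) explicitly.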

Asterisk dial command return dialed number

I'm looking for a variable that can tell me which number 'won' the call on a multi-target Dial command.
Example:
Dial(SIP/1000&SIP/1001&SIP/1002,30)
Set(the_unlucky_winner=${...})
I'm not getting anything from the ${DIALEDPEERx} variables. It sounds like these vars are broken, but I don't know if this is what I should be using.
Ancient version 1.2.14 is deployed at this site. All clients are SIP.
Thanks, anyone.
The only realistic ways to do that are to call via a Local channel like FreePBX does (check the freepbx.org source) or to use a Macro on answer (which I am afraid does not work in 1.2).
Parse the contents of the CDR record for the call. One of the fields is dstchannel, which will hold a value like SIP/1002-9786b0b0.
Also keep in mind that the call's variable stack is wiped on hangup unless you have an "h" (hangup) extension defined for the context, so that is where you can most easily handle your post-call processing.
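For example, something along these lines in the context that placed the call (untested, and the syntax may need tweaking on 1.2):
; hangup extension: the CDR is still available here
exten => h,1,NoOp(Call ended, dstchannel was ${CDR(dstchannel)})
exten => h,n,NoOp(do your post-call processing here)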
Further Reading:
http://www.asteriskdocs.org/en/3rd_Edition/asterisk-book-html-chunk/asterisk-SysAdmin-SECT-1.html
Please Note:
if this answer turns out to solve your problem, please "accept" it for the benefit of others trying to solve the same problem later
Hi all, I have a solution to this problem. It works fine for both a normal dial and a multi-target dial.
In the dial string add a macro; here I am adding the "followme" macro:
M(followme)
$agi->exec("dial", "SIP/6001#sip.example.com&SIP/6002#sip.example.com,rtTgM(followme)");
Then, after the call is answered, it will go to the context
[macro-followme]
In this context you write a script to get the connected call's information with:
$dstchannel=$agi->get_variable("DIALEDPEERNUMBER");
The way I managed to do it is as follows
Dial(SIP/1000&SIP/1001&SIP/1002,30,M(whoanswered))
[macro-whoanswered]
exten => s,1,NoOp(${CHANNEL})
You will see that the actual extension that answered is contained in ${CHANNEL}.
If 1001 answered, the channel will be something like SIP/1001-00017cf1.
Just use the CUT function to split it on / and -.
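For example (untested, especially on 1.2, but this is the idea; CUT takes a variable name, so do it in two steps):
[macro-whoanswered]
exten => s,1,NoOp(Answered by channel ${CHANNEL})
exten => s,n,Set(PEER=${CUT(CHANNEL,/,2)})    ; SIP/1001-00017cf1 -> 1001-00017cf1
exten => s,n,Set(WINNER=${CUT(PEER,-,1)})     ; 1001-00017cf1 -> 1001
exten => s,n,NoOp(The unlucky winner is ${WINNER})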

Determine request Uri from WCF Data Services LINQ query for FirstOrDefault against Azure without executing it?

Problem
I would like to trace the Uri that will be generated by a LINQ query executed against a Microsoft.WindowsAzure.StorageClient.TableServiceContext object. TableServiceContext just extends System.Data.Services.Client.DataServiceContext with a couple of properties.
The issue I am having is that the query executes fine against our Azure Table Storage instance when we run the web role on a dev machine in debug mode (we are connecting to Azure storage in the cloud not using Dev Storage). I can get the resulting query Uri using Fiddler or just hovering over the statement in the debugger.
However, when we deploy the web role to Azure, the query fails against the exact same Azure Table Storage source with a ResourceNotFound DataServiceClientException. We have had ResourceNotFound errors before that dealt with the behavior of FirstOrDefault() on empty tables; that is not the problem here.
As one approach to the problem, I wanted to compare the query Uri that is being generated when the web role is deployed versus when it is running on a dev machine.
Question
Does anyone know a way to get the query Uri for the request that will be sent when the FirstOrDefault() method is called? I know that you can call ToString() on the IQueryable returned from the TableServiceContext, but my concern is that when FirstOrDefault() is called the Uri might be further optimized, and ToString() on the IQueryable might not be what is ultimately sent to the server (see the snippet after the sample code below for the closest thing I have found).
If someone has another approach to the problem, I am open to suggestions. It seems to be a general problem with LINQ: trying to determine what will happen when the expression tree is finally evaluated. I am open to suggestions here as well because my LINQ skills could use some improvement.
Sample Code
public void AddSomething(string ProjectID, string Username) {
    TableServiceContext context = new TableServiceContext();
    var qry = context.Somethings.Where(m => m.RowKey == Username
                                         && m.PartitionKey == ProjectID);
    System.Diagnostics.Trace.TraceInformation(qry.ToString());
    // ^ Here I would like to trace the Uri that will be generated
    //   and sent to the server when the qry.FirstOrDefault() call below is executed.
    if (qry.FirstOrDefault() == null) {
        // ^ This statement generates an error when the web role is running
        //   in the fabric
        ...
    }
}
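For what it's worth, the closest thing I have found so far is casting the query to DataServiceQuery and reading its RequestUri, with the caveat that FirstOrDefault() may still reshape the request before it is sent (which is exactly what I am worried about; Something is the entity type behind the Somethings set):
// Sketch: log the URI of the base query before FirstOrDefault() runs.
var dsq = (DataServiceQuery<Something>)qry;
System.Diagnostics.Trace.TraceInformation(dsq.RequestUri.ToString());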
Edit Update and Answer
Steve provided the right answer. Our problem was exactly as described in this post, which describes an issue with PartitionKey/RowKey ordering in single-entity queries that was fixed by an update to the Azure OS. This explains the discrepancy between our dev machines and the web role deployed to Azure.
When I said we had dealt with the ResourceNotFound issue before in our existence checks, we had handled it in two ways in our code: one way was using exception handling to deal with the ResourceNotFound error, the other was putting the RowKey first in the LINQ query (as some MS people had indicated was appropriate).
It turns out we have several places where the RowKey was put first instead of using exception handling. We will address this by refactoring our code to target .NET 4 and using the IgnoreResourceNotFoundException = true property of the TableServiceContext.
Lesson learned (more than once): Don't depend on quirky undocumented behavior.
Aside
We were able to get the query Uris. They did turn out to be different (as the blog post indicated they would be). Here are the results:
Query Uri from Dev Fabric
https://ourproject.table.core.windows.net/Somethings()?$filter=(RowKey eq 'test19@gmail.com') and (PartitionKey eq '41e0c1ae-e74d-458e-8a93-d2972d9ea53c')
Query Uri from Azure Fabric
https://ourproject.table.core.windows.net/Somethings(RowKey='test19@gmail.com',PartitionKey='41e0c1ae-e74d-458e-8a93-d2972d9ea53c')
I can do one better... I think I know what the problem is. :)
See http://blogs.msdn.com/b/windowsazurestorage/archive/2010/07/26/how-wcf-data-service-changes-in-os-1-4-affects-windows-azure-table-clients.aspx.
Specifically, it used to be the case (in previous Guest OS builds) that if you wrote the query as you did (with the RowKey predicate before the PartitionKey predicate), it resulted in a filter query, while the reverse (PartitionKey preceding RowKey) resulted in the kind of query that raises an exception if the result set is empty.
I think the right fix for you (as indicated in the above blog post) is to set the IgnoreResourceNotFoundException to true on your context.
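Roughly, reusing the shape of the sample in the question (this assumes the .NET 4 data services client, where the property is exposed by the DataServiceContext base class):
// Sketch: suppress the exception on empty single-entity lookups.
TableServiceContext context = new TableServiceContext();
context.IgnoreResourceNotFoundException = true;

var qry = context.Somethings.Where(m => m.PartitionKey == ProjectID
                                     && m.RowKey == Username);
if (qry.FirstOrDefault() == null)
{
    // FirstOrDefault() now returns null instead of throwing ResourceNotFound
}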

Asterisk with new functions

I created a func_odbc write function to insert recorded-file details into a SQL table:
[R]
dsn=connector
write=INSERT INTO ast_records (filename,caller,callee,dtime) VALUES
('${ARG1}','${ARG2}','${ARG3}','${ARG4}')
prefix=M
and set it in the dialplan:
exten => _0X.,n,Set(M_R(${MIXMONITOR_FILENAME}\,${CUSER}\,${EXTEN}\,${DTIME})= )
When I execute it I get an error: ast_func_write: M_R Function not registered.
Note: this is Asterisk on Windows.
The first thing I saw was that you were calling the function incorrectly: you need to be assigning values, not arguments. Try this:
func_odbc.conf:
[R]
dsn=connector
prefix=M
writesql=INSERT INTO ast_records (filename,caller,callee,dtime) VALUES('${VAL1}','${VAL2}','${VAL3}','${VAL4}');
dialplan:
exten => _0X.,1,Set(M_R()=${MIXMONITOR_FILENAME}\,${CUSER}\,${EXTEN}\,${DTIME})
If that doesn't help you, continue on in my list :)
Make sure func_odbc.so is being loaded by Asterisk. (from the asterisk CLI: module show like func_odbc)... If it's not loaded, it can't "build" your custom odbc query function.
Make sure your DSN is configured in /etc/odbc.ini
Make sure that /etc/asterisk/res_odbc.conf is properly configured
Make sure you're calling the DSN by the right name (I see it happen all the time)
Enable verbose and debug in your Asterisk logging, do a logger reload, core set verbose 5, core set debug 5, and then try the call again. When the call finishes, review the log; you'll see much more output regarding what happened.
Regarding the answer from recluze: not to call you out here, but using a PHP AGI is serious overkill. The func_odbc function works just fine; why create more overhead and potential security issues by calling an external script (which itself has to run through an interpreter)?
You should call the func_odbc function as ODBC_connector, where connector is the section name used in func_odbc.conf ([connector]). In the dialplan it should be called like this:
exten=> _0x.,n,ODBC_connector(${arg1},${arg2})
I don't really understand the syntax you're trying to use, but how about using AGI (with PHP) for this? Just define your logic in a PHP script and call it from your dialplan as:
exten => _0X.,n,AGI(script-filename.php,${CUSER},${EXTEN},${DTIME})