WebHCat & Pig - how to pass a parameter file to the job?

I am using HCatalog's WebHCat API to run Pig jobs, such as documented here:
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Pig
I have no problem running a simple job, but I would like to attach a parameter file to the job, as one can with the pig command line's -param_file option.
I assume this is possible through the request's arg parameter, so I tried several things, such as passing:
'arg': '-param_file /path/to/param.file'
or:
'arg': {'param_file': '/path/to/param.file'}
Neither seems to work, and the error stack traces don't say much.
I would love to know if this is possible, and if so, how to correctly achieve this.
Many thanks

Correct usage:
'arg': ['-param_file', '/path/to/param.file']
Explanation:
Passing the value in arg as a dict,
'arg': {'-param_file': '/path/to/param.file'}
makes WebHCat generate only "-param_file" on the command line, with no path after it.
Pig then throws the following error:
ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Can not create a Path from a null string
Using a list (a comma instead of the colon) passes the path to the file as a second argument:
WebHCat will generate "-param_file" "/path/to/param.file"
P.S.: I am using the Requests library in Python to make the REST calls.
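A minimal sketch of the form body for the working request, assuming Python 3 and only the standard library. Requests form-encodes a list value as a repeated field, and urlencode(..., doseq=True) reproduces that encoding here so the result can be inspected without a running WebHCat server. The user name, script path, and status directory are placeholders, not values from the question.

```python
from urllib.parse import urlencode, parse_qs


def build_pig_job_form(user, script_path, param_file, status_dir):
    """Build the form body for a WebHCat Pig job submission.

    All argument values are placeholders chosen for illustration.
    """
    fields = {
        "user.name": user,
        "file": script_path,      # Pig script location (assumed to be on HDFS)
        "statusdir": status_dir,
        # A list, not a dict: each element becomes its own `arg` field,
        # so Pig receives the flag and the path as two separate arguments.
        "arg": ["-param_file", param_file],
    }
    # doseq=True expands the list into repeated `arg=...` pairs.
    return urlencode(fields, doseq=True)


body = build_pig_job_form("hadoop", "/user/hadoop/job.pig",
                          "/path/to/param.file", "/user/hadoop/pig.out")
print(body)
print(parse_qs(body)["arg"])
```

With Requests, passing the same dictionary as data= to requests.post should produce an equivalent body, with arg appearing twice.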

Related

Run a binary CGI file on the command line with GET params

How can I run a binary cgi file on the command line and provide GET parameters to it?
I understand this task may be straightforward for perl or php files, but I've got a binary cgi file and no documentation for it. I'd like to run it without a web server so that I can evaluate certain problems on some co-workers' machines.
I've tried the following, but to no avail:
QUERY_STRING="foo=bar" ./myfile.cgi
foo=bar ./myfile.cgi
./myfile.cgi foo=bar
./myfile.cgi <<< "foo=bar"
In each case, the script outputs Error in form found<br>Missing foo<br><b></b><br>. (When executed through apache on our server, it returns no error message, only the intended results.)
Specifying the environment variable REQUEST_METHOD=GET in addition to QUERY_STRING=... makes the difference.
Among the tokens output by strings myfile.cgi, there were a number of cgi-related variables which the web server might be expected to set, such as REMOTE_ADDR, SERVER_SOFTWARE, and SERVER_PROTOCOL, but the two aforementioned variables proved enough to get this executable to produce a non-error output.
(For a POST, I've read that the body/params are read from stdin, but I haven't substantiated that personally.)
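The same invocation can be scripted, which may be handy when repeating the check on several machines. This is a minimal sketch assuming Python 3; the helper names cgi_environment and run_cgi are made up for illustration, and only the two variables that the answer found sufficient are set.

```python
import os
import subprocess


def cgi_environment(query_string, base_env=None):
    """Return a copy of the environment with the two CGI variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["REQUEST_METHOD"] = "GET"
    env["QUERY_STRING"] = query_string
    return env


def run_cgi(command, query_string):
    """Run `command` (a list such as ["./myfile.cgi"]) as a GET request
    and return whatever it writes to standard output."""
    result = subprocess.run(
        command,
        env=cgi_environment(query_string),
        stdout=subprocess.PIPE,
        check=False,  # CGI programs may exit non-zero; keep the output anyway
    )
    return result.stdout.decode(errors="replace")
```

For the binary in the question, the call would be print(run_cgi(["./myfile.cgi"], "foo=bar")). Other CGI variables such as REMOTE_ADDR can be added to the dictionary in the same way if a particular executable turns out to need them.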

Pentaho PDI: passing a user variable on the command line

I am trying to run a transformation/job by passing a user variable on the command line.
I tried passing the variable value as below:
sh pan.sh -file='test.ktr' '-param:input_directory=/path/to/directory' -level=basic
where input_directory is a variable in the transformation, referenced as ${input_directory}.
But when I do this, Pan is unable to find the variable's value. It throws the error below:
Could not list the contents of "file:///home/user1/pdi8.1/data-integration8.1/${input_directory}" because it is not a folder.
Can someone help me with this? Thank you.
To pass named parameters to your job or transformation, the parameters need to be defined in its properties window. The default value is not needed, but works well for testing. Pay attention to capitalization.
So the pieces of the puzzle are:
From the command line, pass the parameter like -param:yourparam=yourvalue
Define this same parameter in the highest-level job or transformation
Use it as you would use any variable, with ${yourparam}
I think the parameter names used in the job should be referenced as ${PARAM_NAME1}.
On the command line I follow the convention below:
call "{Replace with kitchen.bat File Path}" /file:"{Replace with JOB File Path}" "-param:PARAM_NAME1=PARAM_VALUE1" "-param:PARAM_NAME2=PARAM_VALUE2"
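When the launcher is called from a script, building the argument list programmatically keeps each -param: pair in a single argument. This is a minimal sketch assuming Python 3.8+; the helper name pan_command is made up for illustration, and the file and parameter names are the ones from the question. The parameter still has to be declared in the transformation itself, as described above.

```python
import shlex


def pan_command(ktr_file, params, level="basic", launcher="pan.sh"):
    """Build the argument list for running a transformation with named parameters."""
    command = ["sh", launcher, f"-file={ktr_file}"]
    for name, value in params.items():
        # One argument per parameter, in the -param:NAME=VALUE form.
        command.append(f"-param:{name}={value}")
    command.append(f"-level={level}")
    return command


cmd = pan_command("test.ktr", {"input_directory": "/path/to/directory"})
print(shlex.join(cmd))
```

Passing the list directly to subprocess.run avoids a second round of shell quoting for values that contain spaces.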

How to pass shell variable to pig param file

How can we pass a shell variable to a Pig param file? As an example, I have a shell variable defined as DB_NAME. I would like to define my Pig parameter file as
p_db_nm=$DB_NAME
I tried the above, which does not work, and I also tried using echo $DB_NAME, which does not work either.
I'm aware that I can pass this using -param on the command line, but I have many variables that I would like to put in a param file, with the values defined in a shell script. I searched many topics on Google and didn't have any luck.
My question is similar to what was posted at http://grokbase.com/t/pig/user/09bdjeeftk/is-it-possible-to-use-an-env-variable-in-parameters-file but I see no workable solution posted there.
Can anyone help?
You can pass a parameter file using the -param_file option.
If the parameter file, named "pig.cfg", is defined like below,
p_db_nm=$DB_NAME
then in the shell, the pig command will look like this,
pig -param_file pig.cfg
and finally in your Pig script you can use those variables, named by the keys in the cfg file (in this case, $p_db_nm).
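If Pig does not expand the shell variable inside the param file (the question reports that it did not), one workaround is to render the param file from a template before calling pig. This is a different technique from the answer above, sketched under assumptions: Python 3 is available, and the helper name render_param_file and the value sales_db are made up for illustration.

```python
import os
from string import Template


def render_param_file(template_text, variables=None):
    """Replace $NAME / ${NAME} placeholders with values from `variables`
    (the process environment by default). Raises KeyError if a name is unset."""
    values = os.environ if variables is None else variables
    return Template(template_text).substitute(values)


rendered = render_param_file("p_db_nm=$DB_NAME\n", {"DB_NAME": "sales_db"})
print(rendered, end="")
```

The rendered text would be written to the file that is handed to pig -param_file. Note that string.Template treats every $ as the start of a placeholder, so a literal dollar sign in the template has to be written as $$.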

Jmeter dynamic URL property with variable not substituted

I have a simple JMeter test where I have a property to set the URL. The Path in the JMeter test is set to the following.
${__P(GET_URL,)}
This works well for all URLs that I have been working with, except for the ones where I need to pass a variable in the URL component.
For example, it works for http://server:port/getemployeelist when I run the test with -JGET_URL=/getemployeelist
Then I created a CSV config element to populate the variable EMP_ID.
Then if I run the test with -JGET_URL=/getemployee/${EMP_ID}, the EMP_ID variable is not substituted. The JMeter test gives me the following error:
java.net.URISyntaxException: Illegal character in path at index xx: https://://getemployee/${EMP_ID}
Appreciate any help/pointers.
It will not work this way: JMeter doesn't know anything about ${EMP_ID} at the time it is started. You need to append ${EMP_ID} to the HTTP Request sampler's "Path" at runtime:
Start JMeter as:
jmeter -JGET_URL=/getemployee/
Use CSV Data Set Config to read the EMP_ID from the CSV file.
In the HTTP Request sampler, use a construction like ${__P(GET_URL,)}${EMP_ID} to combine the JMeter property specified via the -J command-line argument with the JMeter variable originating from the CSV file. (GET_URL already starts and ends with a slash, so no extra slashes are needed.)
If anything goes wrong, first check the jmeter.log file; it normally contains enough troubleshooting information. If there is nothing suspicious there, use a Debug Sampler together with a View Results Tree listener to inspect request and response details, variable and property names and values, etc.
I asked this question a while back and thought I would post the solution I eventually implemented. I created a template jmx with a substitution token for HTTPSampler.path and replace the token at runtime. The key points of the scripting follow.
This turned out to be a simpler solution that worked for all kinds of API call patterns.
Created a template jmx (jmeter_test_template) with the following line.
<stringProp name="HTTPSampler.path">#PATH#</stringProp>
This jmx has a CSV config element to populate the variable "EMP_ID". To create this file, create a new test with any URL, save it as a template, and replace the URL with the substitution token #PATH#.
Created a wrapper script like run_any_api.sh with usage (single-quoted so that the shell does not expand ${EMP_ID}),
sh run_any_api.sh 'URL=http://server:port/myapp/employees/${EMP_ID}'
In the wrapper script, the token is replaced with this URL. The sed delimiter is | because the URL itself contains slashes.
sed "s|#PATH#|$URL|" jmeter_test_template.jmx > jmeter_test_template.current_test.jmx
jmeter -t jmeter_test_template.current_test.jmx
Last but not least, remember to clean up the temporary jmx,
rm jmeter_test_template.current_test.jmx
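The token replacement can also be done without sed, which sidesteps delimiter and & handling in the replacement text. This is a minimal sketch assuming Python 3; the helper name render_jmx and the query string in the example are made up for illustration. The path is XML-escaped because it lands inside an XML element.

```python
from xml.sax.saxutils import escape


def render_jmx(template_text, path, token="#PATH#"):
    """Return the template with every `token` replaced by the XML-escaped path."""
    return template_text.replace(token, escape(path))


template = '<stringProp name="HTTPSampler.path">#PATH#</stringProp>'
print(render_jmx(template, "/myapp/employees/${EMP_ID}?active=true&page=1"))
```

The ${EMP_ID} reference passes through untouched, so JMeter can still resolve it from the CSV config element at run time.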

ActiveTCL - Unable to run a batch file from an Expect Script

I was originally trying to run an executable (tftpd32.exe) from Expect with the following command, but for some unknown reason it hung the entire script:
exec c:/tftpd32.351/tftpd32.exe
So I decided to call a batch file that starts the executable instead.
I tried to call the batch file with the following command, but got an error message stating that Windows cannot find the file.
exec c:/tftpd32.351/start_tftp.bat
I also tried the following, but it does not start the executable:
spawn cmd.exe /c c:/tftpd32.351/start_tftp.bat
The batch file contains this, and it runs OK when I double-click it:
start tftpd32.exe
Any help would be very much appreciated.
Thanks
The right way to run that program from Tcl is to do:
set tftpd "c:/tftpd32.351/tftpd32.exe"
exec {*}[auto_execok start] "" [file nativename $tftpd]
Note that you should always have that extra empty argument when using start. This is due to the weird way start works: it takes an optional quoted string that specifies the title of the window to create, and it tends to treat the first quoted string as that title even if doing so leaves it with no mandatory arguments. You also need to use the native system name of the executable to run, hence the file nativename.
If you've got an older version of Tcl inside your expect program (8.4 or before) you'd do this instead:
set tftpd "c:/tftpd32.351/tftpd32.exe"
eval exec [auto_execok start] [list "" [file nativename $tftpd]]
The list command in that weird eval exec construction adds some necessary quoting that you'd have trouble generating otherwise. Use it exactly as above or you'll get very strange errors. (Or upgrade to something where you don't need nearly as much code gymnastics; the {*} syntax was added for a good reason!)