Unknown command line flag compression - google-bigquery

I am trying to export tables from BigQuery to Cloud Storage using the command-line tool.
I followed the instructions here: https://cloud.google.com/bigquery/docs/exporting-data
I tried to run the example script:
extract --compression=GZIP 'bigquery-public-data:samples.shakespeare' gs://my_bucket/shakespeare.zip
I keep getting:
Error parsing command: Unknown command line flag 'compression'
Any idea anyone?

I just ran the command below and it worked perfectly for me:
bq extract --compression=GZIP bigquery-public-data:samples.shakespeare gs://my_bucket/shakespeare.zip
It also worked with double quotes around the fully qualified table name, but not with single quotes (as used in the link in your question).
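For example, the double-quoted form (my_bucket is the placeholder bucket name from the question):
bq extract --compression=GZIP "bigquery-public-data:samples.shakespeare" gs://my_bucket/shakespeare.zip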

Related

How do I insert special characters into Azure table storage with the "az storage entity insert" command?

I have a PowerShell script that builds an "az storage entity insert" command dynamically. Basically, I have a CSV file that I use to create the content of a table by converting it to a long command, which the script then invokes. It worked fine until I added a field that contains a regexp.
I started to get strange "The system cannot find the path specified." errors. Not from accessing the CSV, as you would first suspect, but from running the generated command. I found out that some special characters in the field's value break the command, and it tries to execute what comes after them as some separate command.
I made the expression simpler and found that not many characters work. Even a command as simple as this does not work:
az storage entity insert --table-name table --account-name $StorageAccountName --if-exists replace --connection-string $StorageConnectionString --entity PartitionKey=ABC RowKey=DEF Field="(abc)" Field@odata.type=Edm.String
This causes a different error: "Field@odata.type was unexpected at this time."
The | character also causes problems; for example:
az storage entity insert --table-name table --account-name $StorageAccountName --if-exists replace --connection-string $StorageConnectionString --entity PartitionKey=ABC RowKey=DEF Field="|abc" Field@odata.type=Edm.String
gives "'abc' is not recognized as an internal or external command, operable program or batch file."
This instead works fine:
az storage entity insert --table-name table --account-name $StorageAccountName --if-exists replace --connection-string $StorageConnectionString --entity PartitionKey=ABC RowKey=DEF Field="abc" Field@odata.type=Edm.String
So why do those special characters break the command, and how can I fix it? I need both of those characters for the regexp, plus some others that don't work either.
These errors happen both when I run the command directly from PowerShell and in my script that uses Invoke-Expression.
I initially thought this had to do with the way that PowerShell handles single quotation marks vs. double quotation marks, but it turns out I was only halfway there. Octopus Deploy lists several solutions, including this one with wrapped single quotes:
'"(abc)"'
Applying that to your commands, with single quotes wrapped around the double quotes, the parse errors go away (the command now instead errors out on the missing account name):
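A sketch of the quoted versions, reconstructed from the commands in the question ($StorageAccountName and $StorageConnectionString remain placeholders):
az storage entity insert --table-name table --account-name $StorageAccountName --if-exists replace --connection-string $StorageConnectionString --entity PartitionKey=ABC RowKey=DEF Field='"(abc)"' Field@odata.type=Edm.String
az storage entity insert --table-name table --account-name $StorageAccountName --if-exists replace --connection-string $StorageConnectionString --entity PartitionKey=ABC RowKey=DEF Field='"|abc"' Field@odata.type=Edm.String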

How can I save my Selenium IDE results in .xml format?

selenium-side-runner C:\SeleniumIDE\MyProjectOne.side --output-directory C:\SeleniumIDE\Results --output-format=junit
I followed the syntax provided at https://www.seleniumhq.org/selenium-ide/docs/en/introduction/command-line-runner/#output-test-results-to-a-file but it gives me an error. In my command line I specify where the file should go, but it fails to execute. Please help. Sorry, I am new to this, and apologies if I am not clear on this issue.
I realize that when I put it this way in the command line: selenium-side-runner C:\SeleniumIDE\MyProjectOne.side --output-directory C:\SeleniumIDE\Results it does execute, but it stores the results in a .json file.
New here too.
According to the documentation URL you posted, --output-directory=results should work. Your command line seems to omit the = between --output-directory and the path.
Try this:
selenium-side-runner C:\SeleniumIDE\MyProjectOne.side --output-directory=C:\SeleniumIDE\Results --output-format=junit
Also, if it still doesn't work: since the value is the output directory's absolute or relative path, I wonder if putting quotation marks around it helps.
Try this:
selenium-side-runner C:\SeleniumIDE\MyProjectOne.side --output-directory="C:\SeleniumIDE\Results" --output-format=junit

In texinfo, how to specify a bash single quote?

I am writing a package using the GNU build system, so the documentation is in the texinfo format. As a result, executing make converts the texinfo file into the info format, and executing make pdf automatically produces a PDF file.
In the texinfo file, I have something like this:
@verbatim
awk '{...}' data.txt
@end verbatim
However, in the PDF, the "basic" single quotes (U+0027) in the awk command above are transformed into "curvy" single quotes (U+2019), so that if one copy-pastes the command from the PDF into a terminal, bash complains ("syntax error"). This forces the user to edit the command they just copy-pasted. The same problem occurs if I replace @verbatim with @example. I searched the texinfo manual but couldn't find a way to specify plain apostrophes. I am using texinfo version 5.2.
Karl Berry (via the bug-texinfo mailing list) told me to add two lines to my texi file:
@codequoteundirected on
@codequotebacktick on
as well as add the latest version of texinfo.tex to my package.
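For context, a minimal sketch of a .texi fragment with those lines in place (placing them near the top of the file is my assumption; the awk snippet is the example from the question):
@codequoteundirected on
@codequotebacktick on

@verbatim
awk '{...}' data.txt
@end verbatim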

WebHCat & Pig - how to pass a parameter file to the job?

I am using HCatalog's WebHCat API to run Pig jobs, as documented here:
https://cwiki.apache.org/confluence/display/Hive/WebHCat+Reference+Pig
I have no problem running a simple job, but I would like to attach a parameters file to the job, as one can do with the Pig command line's --param_file parameter.
I assume this is possible through the request's arg parameter, so I tried multiple things, such as passing:
'arg': '-param_file /path/to/param.file'
or:
'arg': {'param_file': '/path/to/param.file'}
Neither seems to work, and the error stacks don't say much.
I would love to know if this is possible, and if so, how to correctly achieve this.
Many thanks
Correct usage:
'arg': ['-param_file', '/path/to/param.file']
Explanation:
When the value is passed in arg as a dict,
'arg': {'-param_file': '/path/to/param.file'}
webhcat generates only "-param_file" for the command prompt.
Pig throws the following error
ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Can not create a Path from a null string
Using a comma instead of the colon operator (i.e., a list instead of a dict) passes the path to the file as a second argument.
webhcat will generate "-param_file" "/path/to/param.file"
P.S.: I am using the Requests library in Python to make the REST calls.
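For reference, here is a minimal curl equivalent of the working call (an untested sketch; the host, port, user name, and paths are placeholders). The list form simply makes arg appear twice in the request body:
curl -s -d user.name=myuser -d file=/path/to/script.pig -d arg=-param_file -d arg=/path/to/param.file 'http://webhcat-host:50111/templeton/v1/pig'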

How to force STORE (overwrite) to HDFS in Pig?

When developing Pig scripts that use the STORE command, I have to delete the output directory before every run or the script stops with:
2012-06-19 19:22:49,680 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 6000: Output Location Validation Failed for: 'hdfs://[server]/user/[user]/foo/bar More info to follow:
Output directory hdfs://[server]/user/[user]/foo/bar already exists
So I'm searching for an in-Pig solution to automatically remove the directory, one that also doesn't choke if the directory is non-existent at call time.
In the Pig Latin Reference I found the shell command invoker fs. Unfortunately, the Pig script breaks whenever an invoked command produces an error. So I can't use
fs -rmr foo/bar
(i.e., remove recursively), since it breaks if the directory doesn't exist. For a moment I thought I might use
fs -test -e foo/bar
which is only a test and shouldn't break, or so I thought. However, Pig again interprets test's return code on a non-existent directory as a failure and breaks.
There is a JIRA ticket for the Pig project addressing my problem and suggesting an optional parameter OVERWRITE or FORCE_WRITE for the STORE command. However, I'm using Pig 0.8.1 out of necessity, and that version has no such parameter.
At last I found a solution on grokbase. Since finding the solution took too long, I will reproduce it here and add to it.
Suppose you want to store your output using the statement
STORE Relation INTO 'foo/bar';
Then, in order to delete the directory, you can call at the start of the script
rmf foo/bar
No ";" or quotations required since it is a shell command.
I cannot reproduce it now but at some point in time I got an error message (something about missing files) where I can only assume that rmf interfered with map/reduce. So I recommend putting the call before any relation declaration. After SETs, REGISTERs and defaults should be fine.
Example:
SET mapred.fairscheduler.pool 'inhouse';
REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;
%default name 'foobar'
rmf foo/bar
Rel = LOAD 'something.tsv';
STORE Rel INTO 'foo/bar';
Once you use the fs command, there are a lot of ways to do this. For an individual file, I wound up adding this to the beginning of my scripts:
-- Delete a file (won't work for output, which will be a directory,
-- but will work for a file that gets copied or moved during
-- the script.)
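-- touchz creates a zero-length file when the path is absent,
-- so the rm that follows always has something to delete and never fails.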
fs -touchz top_100
rm top_100
For a directory:
-- Delete dir
fs -rm -r out
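If your Hadoop version supports it (the -f flag exists in newer fs shells; verify on your cluster), -f suppresses the error for a missing path, so the delete can't choke:
fs -rm -r -f out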