BigQuery bq command with asterisk (*) doesn't work in Compute Engine - google-bigquery

I have a directory with a file named file1.txt
And I run the command:
bq query "SELECT * FROM [publicdata:samples.shakespeare] LIMIT 5"
In my local machine it works fine but in Compute Engine I receive this error:
Waiting on bqjob_r2aaecf624e10b8c5_0000014d0537316e_1 ... (0s) Current status: DONE
BigQuery error in query operation: Error processing job 'my-project-id:bqjob_r2aaecf624e10b8c5_0000014d0537316e_1': Field 'file1.txt' not found.
If the directory is empty it works fine. I'm guessing the asterisk is expanding the file(s) into the query but I don't know why.

Apparently the bq command which is located at /usr/bin/bq has the following script:
#!/bin/sh
exec /usr/lib/google-cloud-sdk/bin/bq ${#}
which expands the asterisk.
As a current workaround I'm calling /usr/lib/google-cloud-sdk/bin/bq directly.

Related

How To Send Output To Terminal Window with Hive Script

I am familiar with storing output/results for a Hive Query to file, but what command do I use in the script to display the results of the HQL to the terminal?
Normally Hive prints results to the stdout, if not redirected it displays on console. You do not need any special command for this.
If you want to display results on the console screen and at the same time store them in a file, use tee command:
hive -e "use mydb; select * from test_t" | tee ./results.txt
OK
123 {"value(B)":"Bye"}
123 {"value(G)":"Jet"}
Time taken: 1.322 seconds, Fetched: 2 row(s)
Check file contains results
cat ./results.txt
123 {"value(B)":"Bye"}
123 {"value(G)":"Jet"}
See here: https://ru.wikipedia.org/wiki/Tee
This was my output:
There was no output, because I had yet to properly use the LOAD DATA INPATH command to my hdfs. After loading, I received output from the SELECT statement in the script.

Hi , Google big query - bq fail load display file number how to get the file name

I'm running the following bq command
bq load --source_format=CSV --skip_leading_rows=1 --max_bad_records=1000 --replace raw_data.order_20150131 gs://raw-data/order/order/2050131/* order.json
and
getting the following message when loading data into bq .
*************************************
Waiting on bqjob_r4ca10491_0000014ce70963aa_1 ... (412s) Current status: DONE
BigQuery error in load operation: Error processing job
'orders:bqjob_r4ca10491_0000014ce70963aa_1': Too few columns: expected
11 column(s) but got 1 column(s). For additional help: http://goo.gl/RWuPQ
Failure details:
- File: 844 / Line:1: Too few columns: expected 11 column(s) but got
1 column(s). For additional help: http://goo.gl/RWuPQ
**********************************
The message display only the file number .
checked the files content most of them are good .
gsutil ls and the cloud console on the other hand display file names .
how can I know which file is it according to the file number?
There seems to be some weird spacing introduced in the question, but if the desired path to ingest is "/order.json" - that won't work: You can only use "" at the end of the path when ingesting data to BigQuery.

Using sql within shell script

I am currently trying to integrate an sql statement into a shell script, But facing major syntax issue:
My statement in the script:
su - <sid>adm -c 'hdbsql -U SYSTEM export "'SCHEMA'"."'*'" as binary into "'Export Location'" with reconfigure'
I get the following error:
* 257: sql syntax error: incorrect syntax near "*": line 1 col 16 (at pos 16) SQLSTATE: HY000
Would really appreciate if anyone could help me with this.
Thanks and Regards,
AK
Your command line doesn't make much sense to me. It starts with
su - <sid>adm
which means that you are redirecting the contents of the file "sid" into "su" and then redirecting the result of that operation into the file "adm".
Second problem is that in the command you are giving to adm, the single quotes end right before the "" which means, that the "" will get interpreted by the shell as a file glob:
-c 'hdbsql -U SYSTEM export "'SCHEMA'"."'*'" as binary into "'Export Location'" with reconfigure'
You'll need to escape those single quotes like this: "\'".
But I think your problem solving approach is not good. Try to reduce to problem and only then start adding additional things to it. So first try to execute the SQL statement from the "hdbsql" shell. Does it work?
$ hdbsql
> YOUR SQL STATEMENT HERE
Once that works, try to execute the SQL statement from the unix shell as a user:
$ hdbsql -U SYSTEM export ...
Once that works, try to execute it via su
$ su - ...

bigquery pass property in commandline

I am getting an error following in bigquery :
Error: Response too large to return.
After couple of google search I found workaround is set configuration.query.allowLargeResults=true
But not sure how to pass this property value in bq command line tool.
any help ?
Thanks
$ bq help query
USAGE: bq.py [--global_flags] <command> [--command_flags] [args]
[...]
--[no]allow_large_results: Enables larger destination table sizes
--destination_table: Name of destination table for query results.
So:
$ bq query --allow_large_results --destination_table "dataset.table" "SELECT 1"

How to generate executable TPC-DS queries?

I have downloaded the DSGEN tool from the TPC-DS web site and already generated the tables and loaded the data into Oracle XE.
I am using the following command to generate the SQL statements :
dsqgen -input ..\query_templates\templates.lst -directory ..\query_templates -dialect oracle -scale 1
However, No matter how I adjust the command I always get this error message :
ERROR: A query template list must be supplied using the INPUT option
Can anybody help?
Apparently you need to use / rather than - for the flags for the Windows executable:
dsqgen /input ..\query_templates\templates.lst /directory ..\query_templates
/dialect oracle /scale 1