I use PuTTY to run Hive queries, and the query results come back without column names (that is how it typically looks for me).
The settings don't have an obvious check-box for "add column names", at least not that I can see.
Can you help, please?
Thanks
Execute this command once in your hive session:
set hive.cli.print.header=true;
Alternatively, add this command to a .hiverc file in your home directory. See this for reference: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli#LanguageManualCli-ThehivercFile
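For example, a minimal ~/.hiverc (the usual default location; adjust if your setup differs) can contain just this one line, and it will run automatically every time the Hive CLI starts:
set hive.cli.print.header=true;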
I am getting this error quite frequently while trying to create a scheduled query:
Error creating scheduled query: Cannot create a transfer in
JURISDICTION_US when destination dataset is located in
REGION_ASIA_SOUTHEAST_1
I just need a scheduled query to overwrite data in a table.
I had the same problem while trying to create a scheduled query with Python:
400 Cannot create a transfer in REGION_EUROPE_WEST_1 when destination dataset is located in JURISDICTION_EU
I figured out that even though my project is located in europe-west1, my destination dataset was in the multi-regional location EU. I had to update my parent path from parent=project_path to '{project_path}/locations/eu' so that it works.
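For illustration, a minimal sketch of what this looks like with the google-cloud-bigquery-datatransfer client; the project ID, dataset, table, and query are placeholders, and the key part is the /locations/eu suffix on the parent:
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# Parent must point at the dataset's multi-regional location ("eu"),
# not just the project, when the destination dataset lives in the EU multi-region.
parent = "projects/my-project/locations/eu"  # placeholder project ID

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",        # placeholder dataset in the EU multi-region
    display_name="my scheduled query",
    data_source_id="scheduled_query",
    schedule="every 24 hours",
    params={
        "query": "SELECT 1 AS x",                # placeholder query
        "destination_table_name_template": "my_table",
        "write_disposition": "WRITE_TRUNCATE",   # overwrite the table on each run
    },
)

transfer_config = client.create_transfer_config(
    parent=parent, transfer_config=transfer_config
)
print("Created scheduled query:", transfer_config.name)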
I hope that it helps someone.
It looks like a bug in BigQuery.
I got the same problem, with both the source and destination datasets located in the EU.
Just for testing purposes, I changed the destination to another EU dataset, and it worked.
I then updated the scheduled query back to my original destination choice and now it works. I can't explain why, but it seems to be a workaround.
You could try starting from the Scheduled Queries page in the BigQuery UI and clicking the "+ Create scheduled query" button; when I do that, I don't get the error. If I start directly from the BigQuery query editor, I get the same error.
From what I tried, it may happen because a table with the same ID as the destination table already exists. This happens even if that table is the result of manually running the same query and saving it.
I faced the same issue recently. I tried two things and they worked:
Try setting the query location to the destination dataset/table location, then try scheduling the query.
If that does not work, run the query and save the results to the intended table in BigQuery first, i.e. create the destination table by storing the results of the query you are trying to schedule (sketched below), then try scheduling the query.
Each of these worked for me in different cases.
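For the second option, a hedged sketch of creating the destination table by saving the query's results first; the dataset and table names are placeholders, and the statement should be run once with the processing location set to the dataset's region before scheduling the same query:
create or replace table my_dataset.my_table as
select * from my_dataset.my_source_table;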
I had this error and tried many of the solutions in this thread. I tried a new session in an incognito window and it worked, so I believe this is a transient issue as suggested.
I just scheduled the query select 1 and then edited it to the one I needed; it worked.
I think the trouble is with the start time of the schedule. If it is in the past relative to local time, then BigQuery tries to run the request on another server.
I had the same issue. The way I solved it was to disable the editor tabs (there is a button at the top). Then I opened the query settings and set the processing location to EU manually.
I was using the bq command when I came across this issue and was able to resolve it by adding the parameter --location='europe-west1'.
So my final command looked like this:
bq query \
--use_legacy_sql=false \
--display_name='my_table' \
--location='europe-west1' \
'''create or replace table my_dataset.my_table as (select * from external_query('projects/my_mysql_connection/locations/europe-west1/connections/bi', '(select * from my_table)'))'''
I am trying to load a CSV file into a table in pgAdmin 4 with COPY.
However, it does not work.
Permission is denied, and the error suggests using \copy from psql.
I tried replacing COPY with \copy, but that did not work either.
I guess that psql must be run in another way. I saw an example with \copy that was run from a shell (.sh) file, but I'm using Windows.
How do I run a psql request?
Thank you
Try the Import/Export option in pgAdmin 4 to import your CSV data into the table.
Ref: https://www.pgadmin.org/docs/pgadmin4/dev/import_export_data.html
To answer my own question: yes, psql is the SQL Shell program, and you can start it from the PostgreSQL folder (the same place you start pgAdmin from). You do not need psql if you use pgAdmin.
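For reference, this is roughly what the \copy meta-command looks like when run inside psql (the SQL Shell) on Windows; the table name and file path are placeholders:
-- Run inside psql, not in pgAdmin's query tool; \copy reads the file on the client side,
-- so it avoids the server-side permission error that COPY raises.
\copy my_table FROM 'C:/data/my_file.csv' WITH (FORMAT csv, HEADER true)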
On Ubuntu 14.04 LTS running this osqueryi command:
osquery> SELECT * FROM file LIMIT 10;
returns no rows. Other tables like users are populated.
Do I need to "activate" something to populate the file table? Is there another table, or something like the ls command?
There is no need to "activate" anything to populate the file table; test with
SELECT * FROM file WHERE path = '/etc/group';
It is just an ugly way to send parameters to tables like file, device_file, device_partitions, etc., which are flagged at osquery.io/docs/tables with the "required in WHERE clause" icon on some column.
They plan to fix the discoverability problem with an error message, and perhaps better documentation; see the issue discussion for more details.
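For example, something close to ls can be had with a wildcard, assuming your osquery version supports LIKE on the file table's path column:
-- rough equivalent of `ls /etc`: the % wildcard expands one directory level
SELECT path, size, type FROM file WHERE path LIKE '/etc/%';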
There is a case in my ETL where I am trying to take the "Table Output" table name from the command line. The table name does not correspond to any streaming field's name. Is there any way to get this done in Pentaho Kettle?
Pentaho DI is a metadata-based tool. I assume you are trying to pass the output table name from the command line, like below:
.../pan.sh -file:"/home/user/sample.ktr" -param:table_output=SOMETABLE
Assuming the command above is what you are trying to do:
First, change the transformation settings of sample.ktr (just an example) and add the parameter name "table_output" to the Parameters section.
Next, in the Table Output step, use this parameter in the format ${table_output} in place of the table name. This should solve your problem.
In case you are passing the parameters to a job: as mentioned above, the first part, adding the parameters, remains the same.
You can then use a separate transformation (.ktr) file inside the job; double-click on the ktr (from the job file) and you will find a Parameters section. Add the parameters there.
Third, inside the .ktr file, repeat the step from above (first section) and use a Set Variables or Table Output step. The Set Variables step will ensure that the parameter is available across the entire job. It mostly depends on your requirement.
Hope it helps :)
This should give you an idea of how to do it. Since transformations are just XML, you can read the metadata from them. Basically, you find the Table Output step and set its table to a variable, in this case "TABLE".
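For instance, an illustrative (and deliberately incomplete) fragment of what the Table Output step might look like inside the .ktr XML once the table name comes from the variable; the step name and variable name are just examples:
<step>
  <name>Table output</name>
  <type>TableOutput</type>
  <table>${TABLE}</table>
</step>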
I know that you can get column names from a table via the following trick in hive:
hive> set hive.cli.print.header=true;
hive> select * from tablename;
Is it also possible to just get the column names from the table?
I dislike having to change a setting for something I only need once.
My current solution is the following:
hive> set hive.cli.print.header=true;
hive> select * from tablename;
hive> set hive.cli.print.header=false;
This seems too verbose and against the DRY principle.
If you simply want to see the column names, this one line should provide them without changing any settings:
describe database.tablename;
However, if that doesn't work for your version of Hive, this will provide them, but your default database will now be the database you switched to:
use database;
describe tablename;
You could also do show columns in $table, or see "Hive, how do I retrieve all the database's tables columns" for access to the Hive metadata.
The solution is:
show columns in table_name;
This is simpler than using:
describe tablename;
Thanks a lot.
Use desc tablename from the Hive CLI or Beeline to get all the column names. If you want the column names in a file, then run the command below from the shell.
$ hive -e 'desc dbname.tablename;' > ~/columnnames.txt
where dbname is the name of the Hive database in which your table resides.
You can find the file columnnames.txt in your home directory.
$cd ~
$ls
The best way to do this is to set the properties below; the second one drops the table-name prefix, so headers show as column_name instead of tablename.column_name:
set hive.cli.print.header=true;
set hive.resultset.use.unique.column.names=false;