I'm using the Google Cloud SDK bq command. How do I change the name of a table? I'm not seeing this at https://cloud.google.com/bigquery/bq-command-line-tool
You have to copy to a new table and delete the original.
$ bq cp dataset.old_table dataset.new_table
$ bq rm -f -t dataset.old_table
I don't think there is a way to just rename a table.
What you can do is copy the table to a new table with the desired name (copying is free of charge) and then delete the original table.
The only drawback I see is that if you have long-term stored data, I think you will lose the long-term storage discount (50%) for that data, because the newly copied table starts its storage age from zero.
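A minimal sketch of that copy-then-delete flow (dataset and table names are placeholders):
$ bq cp mydataset.old_name mydataset.new_name
$ bq rm -f -t mydataset.old_name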
I need to play around with a data-syncing program I wrote, but I want to copy the structure of a table from the production database into a new table in my localhost Postgres database, without copying the data to my localhost db.
I was thinking along the lines of
CREATE TABLE new_table AS
TABLE existing_table
WITH NO DATA;
But I am unsure how to modify it to work with 2 different databases.
Any help would be appreciated.
This boils down to the question "how do I create the DDL script for a table?", which can easily be done using pg_dump on the command line.
pg_dump -d some_db -h production_server -t existing_table --schema-only -f create.sql
The file create.sql then contains the CREATE TABLE script that you can run on your local Postgres installation.
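The file can then be applied with psql (assuming a local database named local_db):
psql -d local_db -f create.sql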
I'm loading CSV data into BigQuery from the command line. I would like to prevent the operation from occurring if the table exists already. I do not want to truncate the table if it exists, and I do not want to append to it.
It seems that there is no command-line option for this.
However, I feel like I might be missing something. Is this truly an option that is impossible to use from the command line interface?
A possible workaround is to use bq cp as follows:
Upload your data to a side table, replacing (truncating) its contents on each upload:
bq --location=US load --replace --autodetect --source_format=CSV dataset.dataRaw ./dataRaw.csv
Copy the data to your target table using bq cp, which supports a no-clobber flag (-n) that refuses to overwrite an existing table:
bq --location=US cp -n dataset.dataRaw dataset.tableNotToOverWrite
If the table exists you get the following error:
Table 'project:dataset.table' already exists, skipping
I think you are right that the CLI doesn't support WRITE_EMPTY mode at the moment.
You may file a feature request to get it prioritized.
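As another rough shell-level sketch (reusing the dataset.table and dataRaw.csv names from the question), you could test for the table's existence with bq show before loading, since bq show exits non-zero when the table is missing:
# Only load if the target table does not already exist.
if ! bq show dataset.table > /dev/null 2>&1; then
  bq --location=US load --autodetect --source_format=CSV dataset.table ./dataRaw.csv
else
  echo "Table already exists, skipping load"
fi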
I don't want to delete the tables/partitions one by one.
What is the fastest way to do it?
Basically, you need to remove all partitions for the partitioned BQ table to be dropped.
Assuming you already have gcloud installed, do the following:
In a terminal, check/set the GCP project you are logged in to:
$> gcloud config list - to check if you are using proper GCP project.
$> gcloud config set project <your_project_id> - to set required project
Export variables:
$> export MY_DATASET_ID=dataset_name;
$> export MY_PART_TABLE_NAME=table_name_; - specify the table name without the partition date/value; the real partition table name in this example looks like "table_name_20200818"
Double-check that you are going to delete the correct table/partitions by running this (it will just list all partitions for your table):
for table in `bq ls --max_results=10000000 $MY_DATASET_ID | grep TABLE | grep $MY_PART_TABLE_NAME | awk '{print $1}'`; do echo $MY_DATASET_ID.$table; done
After checking, run almost the same command, plus a bq rm command parameterized by each iteration, to actually DELETE all partitions and, eventually, the table itself:
for table in `bq ls --max_results=10000000 $MY_DATASET_ID | grep TABLE | grep $MY_PART_TABLE_NAME | awk '{print $1}'`; do echo $MY_DATASET_ID.$table; bq rm -f -t $MY_DATASET_ID.$table; done
The process for deleting a time-partitioned table and all the partitions in it is the same as the process for deleting a standard table.
So if you delete a partitioned table without specifying the partition, it will delete all of its partitions; you don't have to delete them one by one.
DROP TABLE <tablename>
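From the bq CLI, the equivalent would be something along these lines (dataset/table names are placeholders):
bq rm -f -t mydataset.partitioned_table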
You can also delete tables programmatically (e.g., in Java).
Use the sample code of DeleteTable.java and change the flow to iterate over a list of all the tables and partitions you want to delete.
If you only need to delete specific partitions, you can refer to a table partition (e.g., a daily partition) in the following way:
String mainTable = "TABLE_NAME"; // i.e. my_table
String partitionId = "YYYYMMDD"; // i.e. 20200430
String decorator = "$";
String tableName = mainTable+decorator+partitionId;
Here is the guide to running the Java BigQuery samples; make sure to set your project in Cloud Shell:
gcloud config set project <project-id>
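The same $YYYYMMDD decorator also works from the bq CLI if you only want to drop a single partition (placeholder names; the single quotes keep the shell from expanding the $):
bq rm -f -t 'mydataset.my_table$20200430'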
Is there a way to copy a date-sharded table to another dataset via the bq utility?
My current solution is to generate a bash script that copies each day one by one, splitting up the work, but it would be more efficient to do everything in parallel:
#!/bin/sh
bq cp old_dataset.table_20140101 new_dataset.table_20140101
..
bq cp old_dataset.table_20171001 new_dataset.table_20171001
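As a rough sketch, the same per-day copies can be run in parallel as background jobs (assuming GNU date and the shard range from the script above):
#!/bin/bash
# Copy each daily shard in the background, then wait for all jobs to finish.
# In practice you may want to throttle how many jobs run at once.
start=20140101
end=20171001
d="$start"
while [ "$d" -le "$end" ]; do
  bq cp "old_dataset.table_$d" "new_dataset.table_$d" &
  d=$(date -d "$d + 1 day" +%Y%m%d)
done
wait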
You can specify multiple source tables but only a single destination table (refer to this question), so this may not work for you. However, if your data is date-partitioned (instead of sharded) then you can copy the table in one command.
I recommend you convert the sharded tables into a date-partitioned table, which effectively copies all the sharded tables into a single new table. You can do this with the following command:
bq partition old_dataset.table_ new_dataset.partitioned
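If you want to be explicit about the partitioning granularity, bq partition also accepts a --time_partitioning_type flag (shown with the same placeholder names):
bq partition --time_partitioning_type=DAY old_dataset.table_ new_dataset.partitioned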
I have a program which downloads some data from the web and saves it as a CSV, and then uploads that data to a Google Cloud Storage bucket. Next, that program uses the bq tool to create a new Google BigQuery table by loading all the files in the Google Cloud Storage bucket. To do this, I run the following command in a command prompt:
bq load --project_id=ib-17 da.hi gs://ib/hi/* da:TIMESTAMP,bol:STRING,bp:FLOAT,bg:FLOAT,bi:FLOAT,lo:FLOAT,en:FLOAT,kh:FLOAT,ow:FLOAT,ls:FLOAT
The issue is that for some reason this command appends to the existing table, so I get a lot of duplicate data. The question is how I can either delete the table first or overwrite the table on each run.
If I understood your question correctly, you should delete and recreate the table with:
bq rm -f -t da.hi
bq mk --schema da:TIMESTAMP,bol:STRING,bp:FLOAT,bg:FLOAT,bi:FLOAT,lo:FLOAT,en:FLOAT,kh:FLOAT,ow:FLOAT,ls:FLOAT -t da.hi
Another possibility is to use the --replace flag, such as:
bq load --replace --project_id=ib-17 da.hi gs://ib/hi/*
I think this flag corresponds to the writeDisposition setting (WRITE_TRUNCATE) in the underlying API, but the CLI exposes it as --replace.
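For completeness, a sketch of the full load combining --replace with the schema from the question:
bq load --replace --project_id=ib-17 da.hi gs://ib/hi/* da:TIMESTAMP,bol:STRING,bp:FLOAT,bg:FLOAT,bi:FLOAT,lo:FLOAT,en:FLOAT,kh:FLOAT,ow:FLOAT,ls:FLOAT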