Change clustered columns in an existing bigquery table

I have a partitioned and clustered table in BigQuery. I would like to add another column to the set of clustered columns. I found out that the way to do it is to create another table, as described in "Make existing bigquery table clustered", but I can't do that because my table is the source of a Data Studio dashboard where I have many calculated fields, and I don't want to lose these fields.
Any suggestion? Thanks a lot!
Gustavo.

You don't need a new table. Changing the clustering columns was not supported initially, but it has been supported since early 2020.
Please check this documentation: https://cloud.google.com/bigquery/docs/creating-clustered-tables#modifying-cluster-spec
Unfortunately, the feature is only available through the API right now.
If you're not familiar with the BigQuery API: it doesn't require you to write code, because you can call it from the API's web interface. For one-time maintenance, that may save you some time.
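For reference, here is a minimal sketch of the request body such an API call would carry. It assumes the tables.patch method and the Table resource's clustering.fields property. The helper name and the column names are hypothetical, and the four-column check reflects BigQuery's documented limit on clustering columns.

```python
MAX_CLUSTERING_COLUMNS = 4  # BigQuery allows at most four clustering columns


def build_clustering_patch(fields):
    """Build a tables.patch request body that sets the clustering columns.

    Hypothetical helper: it only builds the JSON body, it does not call the API.
    """
    fields = list(fields)
    if not fields:
        raise ValueError("at least one clustering column is required")
    if len(fields) > MAX_CLUSTERING_COLUMNS:
        raise ValueError("BigQuery supports at most 4 clustering columns")
    if len(set(fields)) != len(fields):
        raise ValueError("clustering columns must be unique")
    # The order of the list is the clustering order, so keep it as given.
    return {"clustering": {"fields": fields}}


# Hypothetical column names: the existing two plus the new one.
body = build_clustering_patch(["customer_id", "country", "event_type"])
print(body)
```

If you prefer the Python client (google-cloud-bigquery), the equivalent should be to fetch the table with client.get_table, set table.clustering_fields to the new list, and call client.update_table(table, ["clustering_fields"]). As far as I know, the new specification applies to data written after the change, so check the linked documentation for how existing data is handled.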

I don't think BigQuery allows renaming a table yet.
Can you use a view? Copy the data to another table with the modified clustering. Then, once the old table is removed, create a view with the old table's name on top of the new table, so that nothing breaks in Data Studio.
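The copy-then-view idea could be sketched roughly as follows. This only generates the three DDL statements, it does not run them. The dataset, table, and column names are hypothetical, and it assumes a column-partitioned table (an ingestion-time partitioned table would need a different copy step). Whether Data Studio keeps its calculated fields when the source changes from a table to a view is something to test on a copy first.

```python
def view_swap_statements(dataset, old_table, new_table, partition_by, cluster_by):
    """Return BigQuery DDL for: copy to a re-clustered table, drop old, add view.

    Hypothetical helper; all names passed in are assumptions for illustration.
    """
    old_ref = f"`{dataset}.{old_table}`"
    new_ref = f"`{dataset}.{new_table}`"
    cluster_cols = ", ".join(cluster_by)
    return [
        # 1. Copy the data into a new table with the modified clustering.
        f"CREATE TABLE {new_ref} PARTITION BY {partition_by} "
        f"CLUSTER BY {cluster_cols} AS SELECT * FROM {old_ref}",
        # 2. Free the old name (a view cannot share a name with a table).
        f"DROP TABLE {old_ref}",
        # 3. Expose the new table under the old name.
        f"CREATE VIEW {old_ref} AS SELECT * FROM {new_ref}",
    ]


statements = view_swap_statements(
    dataset="analytics",
    old_table="events",
    new_table="events_v2",
    partition_by="event_date",
    cluster_by=["customer_id", "country", "event_type"],
)
for statement in statements:
    print(statement + ";")
```

Check the row counts of the new table before running the DROP step, since that step is not reversible from this script.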

Related

BigQuery insert (not append) a new column into schema

Is there a convenient way (Python, web UI, or CLI) for inserting a new column into an existing BigQuery table (that already has 100 columns or so) and update the schema accordingly?
Say I want to insert it after column 49. If I do this via a query, I will have to type every single column name, will I not?
Update: the suggested answer does not make it clear how this applies to BigQuery. Furthermore, the documentation does not seem to cover syntax such as
ALTER TABLE `tablename` ADD `column_name1` TEXT NOT NULL AFTER `column_name2`;
A test confirmed that the AFTER keyword does not work in BigQuery.

I don't think this can be done in a simple way. I can think of a couple of workarounds:
Create a view after adding your column.
Create a table from a query result after adding your column.
On the other hand, I don't see how this is useful. The only scenario I can think of for this requirement is if you are using SELECT *, which is not recommended according to the BigQuery best practices. If that is not the case, please share your use case so we can understand it better.
Since this is not currently a feature of BigQuery, you can file a feature request for it.
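For either workaround you do not have to type 100 column names by hand. Here is a minimal sketch that builds the SELECT statement from a list of existing column names. The function, table, and column names are hypothetical. In practice the list could come from the dataset's INFORMATION_SCHEMA.COLUMNS view, ordered by ordinal_position.

```python
def build_reordered_select(table, columns, new_column, new_column_expr, after):
    """Build a SELECT that places a new column right after an existing one.

    Hypothetical helper: 'columns' is the current column order and
    'new_column_expr' is the SQL expression that produces the new column.
    """
    if after not in columns:
        raise ValueError(f"unknown column: {after}")
    position = columns.index(after) + 1
    select_items = [f"`{name}`" for name in columns]
    select_items.insert(position, f"{new_column_expr} AS `{new_column}`")
    select_list = ",\n  ".join(select_items)
    return f"SELECT\n  {select_list}\nFROM `{table}`"


query = build_reordered_select(
    table="my_dataset.my_table",
    columns=["a", "b", "c"],
    new_column="new_col",
    new_column_expr="CAST(NULL AS STRING)",
    after="b",
)
print(query)
```

The resulting query can be saved as a view or used as the source of a new table, which matches the two workarounds above.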

Updating partitioned and clustered table in BigQuery

I've created a partitioned and clustered BigQuery table for the time period of the year 2019, up to today. I can't seem to find if it is possible to update such a table (since I would need to add data for each new day). Is it possible to do it and if so, then how?
I've tried searching Stack Overflow and the BigQuery documentation for the answer, with no results.

You can use DML statements to modify the table: UPDATE to change existing rows, or INSERT (or a load job) to add the rows for each new day. Your partitioned table will maintain its properties across all operations that modify it, such as DML and DDL statements, load jobs, and copy jobs. For more information, see the BigQuery documentation on partitioned tables.
Hope it helps.

Google Big Query - Date-Partitioned Tables with Eventual Data

Our use case for BigQuery is a little unique. I want to start using Date-Partitioned Tables but our data is very much eventual. It doesn't get inserted when it occurs, but eventually when it's provided to the server. At times this can be days or even months before any data is inserted. Thus, the _PARTITION_LOAD_TIME attribute is useless to us.
My question: is there a way I can specify the column that would act like the _PARTITION_LOAD_TIME attribute and still have the benefits of a Date-Partitioned table? If I could emulate this manually and have BigQuery update accordingly, then I could start using Date-Partitioned tables.
Anyone have a good solution here?

You don't need to create your own column.
The _PARTITIONTIME pseudo column will still work for you!
The only thing you need to do is insert/load each data batch into its respective partition by referencing not just the table name but the table with a partition decorator, like yourtable$20160718.
This way you load the data into the partition it belongs to.
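As a small illustration, the decorated table name can be derived from the date the data actually refers to. The helper name is hypothetical. The table name and date are the ones from the example above.

```python
from datetime import date


def partition_decorator(table, day):
    """Return the table name with a $YYYYMMDD partition decorator appended."""
    return f"{table}${day:%Y%m%d}"


# Data that occurred on 2016-07-18 goes to that day's partition,
# regardless of when it is finally loaded.
target = partition_decorator("yourtable", date(2016, 7, 18))
print(target)  # yourtable$20160718
```

When passing such a name on a shell command line, remember to quote it, since $ has a special meaning in most shells.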

Best Practice for adding columns to a Table in Oracle database

I came across a scenario where a column needs to be added to a table. What are the industry best practices for adding a column to an existing table in a production system?
By default at the end
At appropriate position
Before the Audit fields of the table
Our data modeler has added the column and chose the default option. Is there any performance hit if the added column is used frequently?
How much effort would it take to develop a script that always adds the column before the audit fields, as a standard?
Any help will be appreciated.

In Oracle it is not possible to choose the position of the new column (well, unless you drop and recreate the table).
Note that the order of columns is not related to performance issues.

Add Column on SQL Server on Specific Place?

I would like to know if there's a way to add a column to a SQL Server table after it's created, and in a specific position.
Thanks.

You can do that in Management-Studio. You can examine the way this is accomplished by generating the SQL-script BEFORE saving the change. Basically it's achieved by:
removing all foreign keys
creating a new table with the added column
copying all data from the old into the new table
dropping the old table
renaming the new table to the old name
recreating all the foreign keys
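The core of these steps can be tried out safely with a throwaway database. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, so the syntax is SQLite's and the foreign key steps are left out. The table and column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Original table: the new column should end up between id and price.
CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL);
INSERT INTO product VALUES (1, 9.99), (2, 19.5);

-- Create a new table with the added column in the desired position.
CREATE TABLE product_new (id INTEGER PRIMARY KEY, partnumber TEXT, price REAL);
-- Copy all data from the old table into the new table.
INSERT INTO product_new (id, price) SELECT id, price FROM product;
-- Drop the old table.
DROP TABLE product;
-- Rename the new table to the old name.
ALTER TABLE product_new RENAME TO product;
""")

# PRAGMA table_info returns one row per column; index 1 is the column name.
rebuilt_columns = [row[1] for row in conn.execute("PRAGMA table_info(product)")]
rebuilt_rows = conn.execute(
    "SELECT id, partnumber, price FROM product ORDER BY id"
).fetchall()
print(rebuilt_columns)
print(rebuilt_rows)
```

On a real production table the copy step is the expensive one, and anything that references the table has to be recreated afterwards.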

In addition to all the other responses, remember that you can reorder and rename columns in VIEWs. So, if you find it necessary to store the data in one format but present it in another, you can simply add the column on to the end of the table and create a single table view that reorders and renames the columns you want to show. In almost every circumstance, this view will behave exactly like the original table.
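A minimal sketch of that view approach, again using Python's sqlite3 module as a stand-in for SQL Server. The table, view, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The new column (partnumber) was simply appended at the end of the table.
CREATE TABLE product_data (id INTEGER PRIMARY KEY, price REAL, partnumber TEXT);
INSERT INTO product_data VALUES (1, 9.99, 'A-100');

-- The view presents the columns in the preferred order and renames one.
CREATE VIEW product AS
SELECT id, partnumber AS part_no, price
FROM product_data;
""")

cursor = conn.execute("SELECT * FROM product")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```

Readers query the view and see the column order they expect, while the underlying table keeps whatever physical order it has.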

The safest way to do this is:
Create your new table with the correct column order
Copy the data from the old table.
Drop the Old Table.

The only safe way of doing that is creating a new table (with the column where you want it), migrating the data, dropping the original table, and renaming the new table to the original name.
This is what Management Studio does for you when you insert columns.

As others have pointed out, you can do this by creating a temp table, moving the data, dropping the original table, and then renaming the other table. This is a stupid thing to do, though. If your table is large, it could be very time-consuming and users will be locked out during the process. This is something you NEVER want to do to any table in production.
There is absolutely no reason to ever care what order the columns are in a table since you should not be relying on column order anyway (what if someone else did this same stupid thing?). No queries should use select * or ordinal positions to get columns. If you are doing this now, this is broken code and needs to be fixed immediately as the results are not always going to be as expected. For instance if you do insert a column where you want it and someone else is using select * for a report, suddenly the partnumber is showing up in the spot that used to hold the Price.
By doing what you want to do, you may break much more than you fix by putting the column where you personally want it. Column order in tables should always be irrelevant. You should not be doing this every time you want columns to appear in a different order.

With SQL Server Management Studio you can open the table in design mode and drag and drop the column wherever you want.

As Kane says, it's not possible in a direct way. You can see how Management Studio does it by adding a column in the design mode and checking out the change script.
If the column is not in the last position, the script basically drops the table and recreates it, with the new column in the desired position.

In databases, table columns don't have an order.
Write a proper SELECT statement and create a view.

No.
Basically, SSMS behind the scenes will copy the table, constraints, etc, drop the old table and rename the new.
The reason is simple - columns are not meant to be ordered (nor are rows), so you're always meant to list which columns you want in a result set (select * is a bit of a hack)
