Is it possible to convert external tables to native in BigQuery? - google-bigquery

I have created a table from Google Cloud Storage (filepath starts with gs://). I could not create it as native even after trying multiple times. I succeeded only after setting the table option as native. Later, I was able to query this table successfully. However, I need to do the following:
Add a column to the table
Append (union) two such external tables
Join the appended table with another external table
save the joined table in a new table so that I can later query this new table
Questions:
Since this is an external table, can I add a column? Can I save the joined table as native?

Yes, you can convert an external table (or federated source) to a native table in BigQuery.
To do this, simply read the external table using SQL and set the destination table for the results. BigQuery will then write the results of your query to a native table.

Related

Bigquery - Updating GCS path of Bigquery external tables

I have a requirement where I might have to update the Bigquery External tables on a periodic basis.
The GCS location has timestamp for every incremental run, I would like to update to the latest timestamp folder as the path of External table.
One way i see is only dropping the table and creating again by pointing it to latest folder. But, is there any other way to update it without dropping the table
As suggested by #Samuel , you can use the SQL statement CREATE or REPLACE EXTERNAL TABLES for your requirement. Scheduled queries support DML and DDL statements which can be used to create the new tables. You can use the below mentioned query parameter to create the table according to your schedule :
My_database_name.my_table_name.my_results_{run_date}
For more information you can refer to this documentation.

Azure SQL: Is it possible to use the view from another database?

Being new to Azure SQL, I am already successfully JOINing my table with the table from the other database (by defining the CREATE EXTERNAL TABLE and the related things). However, some queries contain views from the other database, not the tables. How can I join my table with that external view? Is it possible at all? Or do I have to copy the code from the remote view and work directly with the external tables?
with create external table you can access a table or view :
take a look at
https://medium.com/#fbeltrao/access-another-database-in-azure-sql-1afc526b7ad4

How to drop columns from a partitioned table in BigQuery

We can not use create or replace table statement for partitioned tables in BigQuery. I can export the table to GCS but BigQuery generates then multiple JSON files that can not be imported into a table in once. Is there a safe way to drop a column from a partitioned table? I use BigQuery's web interface.
Renaming a column is not supported by the Cloud Console, the classic BigQuery web UI, the bq command-line tool, or the API. If you attempt to update a table schema using a renamed column, the following error is returned: BigQuery error in update operation: Provided Schema does not match Table project_id:dataset.table.
There are two ways to manually rename a column:
Using a SQL query: choose this option if you are more concerned about simplicity and ease of use, and you are less concerned about costs.
Recreating the table: choose this option if you are more concerned about costs, and you are less concerned about simplicity and ease of use.
If you want to drop a column you can either:
Use a SELECT * EXCEPT query that excludes the column (or columns) you want to remove and use the query result to overwrite the table or to create a new destination table
You can also remove a column by exporting your table data to Cloud Storage, deleting the data corresponding to the column (or columns) you want to remove, and then loading the data into a new table with a schema definition that does not include the removed column(s). You can also use the load job to overwrite the existing table
There is a guide published for Manually Changing Table Schemas.
edit
In order to change a Partitioned table to a Non-partitioned table, you can use the Console to query your data and overwrite your current table or copy to a new one. As an example, I have a table in BigQuery partitioned by _PARTITIONTIME. I used the following query to create a non-partitioned table,
SELECT *, _PARTITIONTIME as pt FROM `project.dataset.table`
With the above code, you will query the data among all table's partitions and create an extra column to show which partition it came from. Then, before executing it, there are two options, save the view in a new non-partitioned table or overwrite the current table:
Creating a new table go to: More(under the query editor) > Query Settings > Check the box "Set a destination table for query results" > Choose your project, dataset and write your new table's name > Under Destination table write preference check Write if empty.
Overwriting the current table: More(under the query editor) > Query Settings > Check the box "Set a destination table for query results" > Choose the same project and dataset for your current table > Write the same table's name as the one you want to overwrite > Under Destination table write preference check Overwrite table.
credit

Create Partition table in Big Query

Can anyone please suggest how to create partition table in Big Query ?.
Example: Suppose I have one log data in google storage for the year of 2016. I stored all data in one bucket partitioned by year , month and date wise. Here I want create table with partitioned by date.
Thanks in Advance
Documentation for partitioned tables is here:
https://cloud.google.com/bigquery/docs/creating-partitioned-tables
In this case, you'd create a partitioned table and populate the partitions with the data. You can run a query job that reads from GCS (and filters data for the specific date) and writes to the corresponding partition of a table. For example, to load data for May 1st, 2016 -- you'd specify the destination_table as table$20160501.
Currently, you'll have to run several query jobs to achieve this process. Please note that you'll be charged for each query job based on bytes processed.
Please see this post for some more details:
Migrating from non-partitioned to Partitioned tables
There are two options:
Option 1
You can load each daily file into separate respective table with name as YourLogs_YYYYMMDD
See details on how to Load Data from Cloud Storage
After tables created, you can access them either using Table wildcard functions (Legacy SQL) or using Wildcard Table (Standar SQL). See also Querying Multiple Tables Using a Wildcard Table for more examples
Option 2
You can create Date-Partitioned Table (just one table - YourLogs) - but you still will need to load each daily file into respective partition - see Creating and Updating Date-Partitioned Tables
After table is loaded you can easily Query Date-Partitioned Tables
Having partitions for an External Table is not allowed as for now. There is a Feature Request for it:
https://issuetracker.google.com/issues/62993684
(please vote for it if you're interested in it!)
Google says that they are considering it.

Get query of the table with all the records in it in SQL

HI I have a table in SQL. I want to create that table into another machine and get the DATA inside that table as well.
I don't wanna use the backup method because I have existing Stored Procedures in the other machine. I just want to get one table with all the records in it. ANy idea on how to do it?
Can you create a linked server on the target database and query the source database from the target?