Native table creation on top of GCS failed in BigQuery - sql

I tried to create a BigQuery native table on top of a GCS bucket. The native table creation works when the table is created from the UI, but when I tried to run DDL to create a native table on top of GCS, it failed.
Below are the query used and the error produced:
create table sample_ds_001.native
options
(
format='csv'
,uris=["gs://test-bucket-003/File_Folder/sample.txt"]
)
Error:
Unknown option: format at [4:3]

You can create an external table for your requirement using the CSV file stored in the GCS bucket; the query below can be used:
CREATE EXTERNAL TABLE `project-id.dataset.table`
OPTIONS (
  format = 'CSV',
  uris = ["gs://file-location.csv"]
);
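If you prefer fixed column names and types over schema auto-detection, a variant with an explicit column list can also be used. This is only a sketch: the column names follow the sample CSV shown below, and the types are assumptions you should adjust to your data:
CREATE EXTERNAL TABLE `project-id.dataset.table`
(
  A INT64,
  B STRING,      -- kept as STRING to preserve the leading zeros in the sample
  C INT64,
  D INT64,
  E TIMESTAMP,
  F INT64,
  G TIMESTAMP
)
OPTIONS (
  format = 'CSV',
  uris = ["gs://file-location.csv"],
  skip_leading_rows = 1   -- assumes the file starts with the header row shown below
);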
CSV Data:
A,B,C,D,E,F,G
47803,000000000020030263,629,785,2021-01-12 23:26:37,986,2022-01-12 23:26:37
Output:

Related

Creating CETAS table returns multiple files with .text.deflate extension

I have created an external data source and a CSV file format. I am creating an external table using a CETAS script:
CREATE EXTERNAL TABLE test
WITH (
    LOCATION = 'test/data/',
    DATA_SOURCE = test_datasource,
    FILE_FORMAT = csv_format
)
AS SELECT * FROM dimcustomer;
But when I run this query, it generates many files with the extension .text.deflate.
Can we generate only one file, and can we give a name to the file that is generated?
Any input is appreciated. I am creating this external table to export Synapse data to a data lake container.
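For context, the external data source and file format referenced in the CETAS script above would typically be defined along these lines. This is only a sketch: the object names, storage URL, and format options are assumptions, and depending on your pool type a CREDENTIAL (and TYPE) clause may also be required:
-- Hypothetical definitions of the objects referenced by the CETAS script (names are assumptions)
CREATE EXTERNAL DATA SOURCE test_datasource
WITH (LOCATION = 'https://<storage-account>.dfs.core.windows.net/<container>');

CREATE EXTERNAL FILE FORMAT csv_format
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2)
);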

Can we create an external table on top of ADLS Gen2 from Databricks using an interactive cluster?

I'm trying to create an external table on top of ADLS Gen2 from Azure Databricks, and for the location I gave "abfss://......". This is not working and throws the error below:
Error in SQL statement: AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: shaded.databricks.xxxxx.org.apache.hadoop.fs.azurebfs.contracts.exceptions.ConfigurationPropertyNotFoundException Configuration property xxxxxx.dfs.core.windows.net not found.);;
If I give a mount point path as the location, it works as expected. Is there any other way to create the table without the mount point?
After some research I figured out that we can create an external table by giving the full path in the options.
Example of creating a Parquet table:
CREATE TABLE TABLE_NAME
(column1 int,
 column2 int
)
USING PARQUET OPTIONS (PATH "abfss://container@storageAccount/.....")
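An equivalent form uses LOCATION with the fully qualified ADLS Gen2 URI. This is only a sketch: the container, storage account, and path placeholders are assumptions, and the cluster still needs access to the storage account (for example via a configured account key or service principal):
CREATE TABLE TABLE_NAME
(column1 int,
 column2 int
)
USING PARQUET
LOCATION "abfss://<container>@<storage-account>.dfs.core.windows.net/<path>"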

Creating an external Hive table in Databricks

I am using Databricks Community Edition.
I am using a Hive query to create an external table. The query runs without any error, but the table is not getting populated with the file specified in the Hive query.
Any help would be appreciated.
From the official docs: make sure your S3/storage location path and schema (with respect to the file format [TEXT, CSV, JSON, JDBC, PARQUET, ORC, HIVE, DELTA, and LIBSVM]) are correct.
-- deletes the metadata
DROP TABLE IF EXISTS <example-table>;

-- deletes the data (dbutils call; run this from a notebook cell, not as SQL)
dbutils.fs.rm("<your-s3-path>", true)

CREATE TABLE <example-table>
USING org.apache.spark.sql.parquet
OPTIONS (PATH "<your-s3-path>")
AS SELECT <your-sql-query-here>;

-- alternative
CREATE TABLE <table-name> (id long, date string) USING PARQUET LOCATION "<storage-location>"

Loading Avro Data into BigQuery via command-line?

I have created an Avro Hive table and loaded data into it from another table using the Hive INSERT OVERWRITE command. I can see the data in the Avro Hive table, but when I try to load it into a BigQuery table, it gives an error.
Table schema:
CREATE TABLE `adityadb1.gold_hcth_prfl_datatype_acceptence`(
  `prfl_id` bigint,
  `crd_dtl` array<struct<
    cust_crd_id:bigint, crd_nbr:string, crd_typ_cde:string, crd_typ_cde_desc:string,
    crdhldr_nm:string, crd_exprn_dte:string, acct_nbr:string, cre_sys_cde:string,
    cre_sys_cde_desc:string, last_upd_sys_cde:string, last_upd_sys_cde_desc:string,
    cre_tmst:string, last_upd_tmst:string, str_nbr:int, lng_crd_nbr:string>>)
STORED AS AVRO;
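For reference, the INSERT OVERWRITE step mentioned above would typically look like this; it is only a sketch, and the source table name is hypothetical:
-- Hive: populate the Avro table from another table (source name is an assumption)
INSERT OVERWRITE TABLE adityadb1.gold_hcth_prfl_datatype_acceptence
SELECT * FROM adityadb1.source_table;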
Error that I am getting:
Error encountered during job execution:
Error while reading data, error message: The Apache Avro library failed to read data with the following error: Cannot resolve:
I am using the following command to load the data into BigQuery:
bq load --source_format=AVRO dataset.tableName avro-filePath
Make sure that there is data available in the GCS folder you are pointing at and that the data contains the schema (it should if you created it from Hive). Here is an example of how to load the data:
bq --location=US load --source_format=AVRO --noreplace my_dataset.my_avro_table gs://myfolder/mytablefolder/part-m-00001.avro
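If you prefer to stay in SQL, BigQuery's LOAD DATA statement can do the same job. This is a sketch reusing the dataset and table names from the example above; the wildcard path is an assumption:
-- Load all Avro part files from the folder into the existing table
LOAD DATA INTO my_dataset.my_avro_table
FROM FILES (
  format = 'AVRO',
  uris = ['gs://myfolder/mytablefolder/*.avro']
);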

Is it possible to convert external tables to native in BigQuery?

I have created a table from Google Cloud Storage (the file path starts with gs://). I could not create it as native even after trying multiple times; I succeeded only after setting the table type to external. Later, I was able to query this table successfully. However, I need to do the following:
Add a column to the table
Append (union) two such external tables
Join the appended table with another external table
Save the joined table in a new table so that I can later query this new table
Questions:
Since this is an external table, can I add a column? Can I save the joined table as native?
Yes, you can convert an external table (or federated source) to a native table in BigQuery.
To do this, simply read the external table using SQL and set the destination table for the results. BigQuery will then write the results of your query to a native table.
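As a concrete sketch of that approach (all table names here are placeholders), a CREATE TABLE ... AS SELECT over the external table(s) produces a native table, and it can also add a column and union two external tables in one step:
-- Materialize two external tables into one native table, adding a column along the way
-- (assumes both external tables share the same schema)
CREATE TABLE `project-id.dataset.native_combined` AS
SELECT *, 'source_a' AS source_name
FROM `project-id.dataset.external_a`
UNION ALL
SELECT *, 'source_b' AS source_name
FROM `project-id.dataset.external_b`;
The resulting native table can then be joined with other tables and queried like any regular BigQuery table.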