Error while creating a table from another one in Apache Spark - sql

I'm creating the table the following way:
spark.sql("CREATE TABLE IF NOT EXISTS table USING DELTA AS SELECT * FROM origin")
But I get this error:
Exception in thread "main" org.apache.spark.SparkException: Table implementation does not support writes: table

You get this SparkException because this way of creating a Delta Lake table, with a SQL query as the input data, is not implemented.
If you want to create a Delta Lake table and insert data at the same time, you should use the DataFrameWriter API, as explained in the Databricks documentation:
spark.sql("SELECT * FROM origin")
.write
.format("delta")
.save("/path/to/where/you/want/to/save/your/data")

Related

How can I write data to BigQuery with Spark SQL?

We are using a Google Dataproc cluster and the spark-sql shell,
and we are able to create a table as follows:
CREATE TABLE table_bq
USING bigquery
OPTIONS (
project 'project',
dataset 'dataset',
table 'bq_table'
);
This connects to BigQuery for all query purposes. However, when we try to run
INSERT OVERWRITE TABLE table_bq SELECT ....;
it fails with an error ending in:
) does not allow insertion.;;
Any pointers on how we can load data into BigQuery from spark-sql?
Note: I have seen examples of writing data to BigQuery with Spark using a DataFrame, but my question is whether there is any way to do it with spark-sql.
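As far as I know, the spark-bigquery connector only supports writes through the DataFrame API, so INSERT OVERWRITE from the spark-sql shell is not supported. A minimal DataFrame-based sketch is below; the connector jar being on the cluster, and the project, dataset, table, and staging bucket names, are all assumptions/placeholders.
// Minimal sketch: write a query result to BigQuery via the spark-bigquery connector
// (assumes the connector is available on the Dataproc cluster; names are placeholders).
val df = spark.sql("SELECT * FROM source_table")
df.write
  .format("bigquery")
  .option("table", "project.dataset.bq_table")        // placeholder fully qualified table
  .option("temporaryGcsBucket", "my-staging-bucket")   // placeholder GCS staging bucket
  .mode("overwrite")
  .save()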

Can we create an external table on top of ADLS Gen2 from Databricks using an interactive cluster?

I'm trying to create an external table on top of ADLS Gen2 from Azure Databricks, and for the location I gave "abfss://......". This is not working and throws the error below:
Error in SQL statement: AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got exception: shaded.databricks.xxxxx.org.apache.hadoop.fs.azurebfs.contracts.exceptions.ConfigurationPropertyNotFoundException Configuration property xxxxxx.dfs.core.windows.net not found.);;
If I give a mount point path as the location, it works as expected. Is there any other way to create the table without the mount point?
After some research I figured out that we can create an external table by giving the full path in the options.
Example of creating a Parquet table:
CREATE TABLE TABLE_NAME (
column1 int,
column2 int
)
USING PARQUET
OPTIONS (PATH "abfss://container@storageAccount.dfs.core.windows.net/.....")
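If you want to avoid the mount point entirely, the ConfigurationPropertyNotFoundException usually means the cluster or session has no credentials configured for that storage account. A minimal sketch of setting the storage account key at the session level is below; the storage account name, secret scope, and key are placeholders, and an OAuth/service principal configuration is an equally valid alternative.
// Provide the ADLS Gen2 account key for this session so abfss:// paths resolve
// without a mount point (account name, scope and key below are placeholders).
spark.conf.set(
  "fs.azure.account.key.mystorageaccount.dfs.core.windows.net",
  dbutils.secrets.get(scope = "my-scope", key = "my-key")
)
With this in place, the CREATE TABLE ... USING PARQUET OPTIONS (PATH "abfss://...") statement above should be able to reach the storage account directly.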

U-SQL External table error: 'Unable to cast object of type 'System.DBNull' to type 'System.Type'.'

I'm failing to create external tables for two specific tables from an Azure SQL DB;
I have already created a few external tables with no issues.
The only difference I can see between the failed and the successful external tables is that the tables that failed contain geography type columns, so I think this is the issue, but I'm not sure.
CREATE EXTERNAL TABLE IF NOT EXISTS [Data].[Devices]
(
[Id] int
)
FROM SqlDbSource LOCATION "[Data].[Devices]";
Failed to connect to data source: 'SqlDbSource', with error(s): 'Unable to cast object of type 'System.DBNull' to type 'System.Type'.'
I solved it with a workaround for the external table:
I created a view that selects from an external rowset using EXECUTE:
CREATE VIEW IF NOT EXISTS [Data].[Devices]
AS
SELECT Id FROM EXTERNAL SqlDbSource
EXECUTE "SELECT Id FROM [Data].[Devices]";
This makes the script completely ignore the geography type column, which is currently not supported as a REMOTEABLE_TYPE for data sources by U-SQL.
Please have a look at my answer on the other thread you opened. To add to that, I would also recommend looking at how to create a table using a query; you should be able to use "extractors" in that query to create the tables. To read more about extractors, please have a look at this doc.
Hope this helps.

Getting exception while updating table in Hive

I have created a table in Hive from an existing S3 file as follows:
create table reconTable (
entryid string,
run_date string
)
LOCATION 's3://abhishek_data/dump1';
Now I would like to update one entry as follows:
update reconTable set entryid='7.24E-13' where entryid='7.24E-14';
But I am getting following error:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
I have gone through a few posts here but can't figure out how to fix this.
I think you should create an external table when reading data from a source like S3.
Also, UPDATE and DELETE are only supported on tables stored as ORC with the table property 'transactional'='true', under a transaction manager that supports ACID operations.
Please refer to this for more info: attempt-to-do-update-or-delete-using-transaction-manager
You can refer to this Cloudera Community Thread:
https://community.cloudera.com/t5/Support-Questions/Hive-update-delete-and-insert-ERROR-in-cdh-5-4-2/td-p/29485

Hive Query Error - While copying existing table data to another table

I have loaded a web file into a table using a SerDe in Hive. I am able to view the table data. Now I want to copy the data to a new table. If I run
Create table new_xxx as select * from XXX;
the job fails.
Error in the log file:
Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
Runtime Exception: error in configuring object.
Since you are using a SerDe to load the web data into the first table, the data is serialized and deserialized by that SerDe on insert and select. So the second table, into which you are trying to insert the data, also needs to be aware of the SerDe that was used.
The following syntax (with the actual SerDe class you used in place of the placeholder) might help:
CREATE TABLE new_table_XX ROW FORMAT SERDE "org.apache.hadoop.hive.serde" AS SELECT .....