Insert If Table Exists in Databricks or Spark SQL

I am writing in a Databricks notebook where I need to say something like:
%sql
IF EXISTS myTable INSERT OVERWRITE TABLE myTable SELECT * FROM somethingElse
I know that the INSERT OVERWRITE clause works, but I'm not sure how to get the IF EXISTS part to work without breaking out of pure SQL and using Python (which would make the script messier).

Unfortunately, Databricks does not support a standalone "IF EXISTS" statement.
The closest supported construct is the IF EXISTS clause of the DROP TABLE command:
Drop a table and delete the directory associated with the table from the file system if this is not an EXTERNAL table. If the table to drop does not exist, an exception is thrown.
IF EXISTS
If the table does not exist, nothing happens.
DROP TABLE [IF EXISTS] [db_name.]table_name
Example:
DROP TABLE IF EXISTS diamonds;
CREATE TABLE diamonds USING CSV OPTIONS (path "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv", header "true")
Reference: SQL Language Manual, a complete list of the Data Definition Language (DDL) and Data Manipulation Language (DML) constructs supported in Databricks.
Hope this helps.
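If stepping outside pure SQL is acceptable, the existence check from the question can be sketched in a Python cell. This is a minimal sketch, not a Databricks-documented recipe: the function name `insert_overwrite_if_exists` is hypothetical, `spark` is assumed to be the SparkSession the notebook provides, and `Catalog.tableExists` is assumed to be available (PySpark 3.3 or later).

```python
def insert_overwrite_if_exists(spark, target, source):
    """Run INSERT OVERWRITE only when `target` is already in the catalog.

    `spark` is assumed to be the SparkSession a Databricks notebook provides;
    the function name and return value are illustrative, not a Databricks API.
    """
    # Catalog.tableExists is available in PySpark 3.3 and later.
    if not spark.catalog.tableExists(target):
        return False  # table is missing, so skip the insert
    # Table names are interpolated directly, so they must be trusted identifiers.
    spark.sql(f"INSERT OVERWRITE TABLE {target} SELECT * FROM {source}")
    return True
```

In a notebook this would be called as insert_overwrite_if_exists(spark, "myTable", "somethingElse"); it returns False and does nothing when myTable is missing.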

Related

Rename a table in Amazon Redshift

I've been trying to rename a table from "fund performance" to fund_performance in SQLWorkbench for a Redshift database. Commands I have tried are:
alter table schemaname."fund performance"
rename to fund_performance;
I received a message that the command executed successfully, and yet the table name did not change.
I then tried copying the table to rename it that way. I used
#CREATE TABLE fund_performance LIKE "schema_name.fund performance";
CREATE TABLE fund_performance AS SELECT * FROM schema_name."fund performance";
In both these cases I also received a message that the statements executed successfully, but nothing changed. Does anyone have any ideas?
Use the following; it may work for you:
SELECT * into schema_name.fund_performance FROM schema_name.[fund performance]
It will copy the data by creating a new table named fund_performance, but it won't create any constraints or identity columns.
To rename a specific table without disturbing existing constraints:
EXEC sp_rename 'schema_name.[fund performance]', 'schema_name.fund_performance';

Setting transactional-table properties results in external table

I am creating a managed table via Impala as follows:
CREATE TABLE IF NOT EXISTS table_name
STORED AS parquet
TBLPROPERTIES ('transactional'='false', 'insert_only'='false')
AS ...
This should result in a managed table which does not support HIVE-ACID.
However, when I run the command I still end up with an external table.
Why is this?
I found in the Cloudera documentation that omitting the EXTERNAL keyword when creating the table does not guarantee that the table will be managed:
When you use EXTERNAL keyword in the CREATE TABLE statement, HMS stores the table as an external table. When you omit the EXTERNAL keyword and create a managed table, or ingest a managed table, HMS might translate the table into an external table or the table creation can fail, depending on the table properties.
Thus, setting transactional=false and insert_only=false leads to an External Table in the interpretation of the Hive Metastore.
Interestingly, setting only TBLPROPERTIES ('transactional'='false') is completely ignored and still results in a managed table with transactional=true.

How do I make CREATE DATABASE command re-runnable in SQL?

I'd like to create a script that simply drops and creates a database over and over in PostgreSQL.
For a table this is not a problem with the following:
DROP TABLE IF EXISTS test CASCADE;
CREATE TABLE IF NOT EXISTS test ( Sample varchar );
The above code works, no problem.
However, when I try to do the same for a database, i.e.:
DROP DATABASE IF EXISTS sample;
CREATE DATABASE sample;
I get the following error:
ERROR: DROP DATABASE cannot run inside a transaction block
SQL state: 25001
Any idea how I can get the database to be created and dropped repetitively without doing it manually?
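This error typically appears when the client wraps the statements in a transaction or sends them together as one multi-statement string. One way to script the drop-and-create cycle is sketched below, under stated assumptions: `conn` is assumed to be a psycopg2 connection opened against a different database (for example postgres), and the function name `recreate_database` is hypothetical.

```python
def recreate_database(conn, name="sample"):
    """Drop and recreate a database outside any transaction block.

    `conn` is assumed to be a psycopg2 connection to a different database
    (for example `postgres`); the function name is illustrative.
    """
    # With autocommit on, psycopg2 does not open a transaction,
    # so each statement reaches the server on its own.
    conn.autocommit = True
    cur = conn.cursor()
    try:
        # Send the statements one at a time rather than as one string.
        # `name` is interpolated directly, so it must be a trusted identifier.
        cur.execute(f"DROP DATABASE IF EXISTS {name}")
        cur.execute(f"CREATE DATABASE {name}")
    finally:
        cur.close()
```

The same idea applies to other clients: enable autocommit and run DROP DATABASE and CREATE DATABASE as separate statements.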

How to create external table in db2 with basic DML operation

I created an external table with the following commands:
db2 "
CREATE EXTERNAL TABLE TEST(a int) using
(dataobject '/home/db2inst2/test.tbl' )
"
db2 "insert into TEST values(1)"
db2 "insert into TEST values(2)"
But it looks like each insert replaces the previous value. Is there any option to append to the file and do basic DML operations on an external table? Please let me know if any other option is available in Db2 V11.5.
It's not possible.
CREATE EXTERNAL TABLE statement
Restrictions
External tables cannot be used by a Db2 instance running on a Windows system.
Data being loaded must be properly formatted.
You cannot delete, truncate, or update an external table.
For remote external tables (that is, for external tables that are not located in a Swift or S3 object store and for which the REMOTESOURCE option is set to a value other than LOCAL):
A single query or subquery cannot select from more than one external table at a time, and cannot reference the same external table more than once. If necessary, combine data from several external tables into a single table and use that table in the query.
A union operation cannot involve more than one external table.
In addition:
For an unload operation, the following conditions apply:
If the file exists, it is overwritten.

How to create table like avro?

I created a table (test_load) based on the schema of another one (test). Then I inserted test_load into another table.
drop table if exists test_load;
create external table test_load
like test
location '/test_load_folder';
insert into warehouse
select * from test_load;
It works fine when everything is in parquet.
I then changed my test schema to Avro and recreated my test_load table, but when I try to insert into warehouse I receive an error:
Error while processing statement: FAILED: Execution Error, return code 2
from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I'm looking for the correct syntax to re-create the load table and specify that it is Avro. My hypothesis is that Hive still treats the files as Parquet.
I tried
drop table if exists test_load;
create external table test_load
like test
location '/test_load_folder'
stored as avro;
but I get a syntax error.