Create table statement and add condition to replace None values with NULL in IMPALA


Below is my CREATE TABLE statement.
Some values arrive as "None" from the data source. I want to add a condition to my CREATE TABLE statement so that any incoming value equal to "None" is replaced with NULL. Is that possible in Impala without loading an intermediate table and then running ETL into the final table with a CASE statement?
CREATE TABLE IF NOT EXISTS customer_db.customers_table(
`customer_id` BIGINT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,
`ts` BIGINT NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,
`customer_name` STRING NULL DEFAULT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,
PRIMARY KEY (customer_id, ts)
)
PARTITION BY RANGE (`ts`)(PARTITION VALUE = 0)
STORED AS KUDU

Update:
Export your data as CSV.
Use a text editor such as Notepad++ to replace None with null.
Import it:
https://docs.cloudera.com/machine-learning/cloud/import-data/topics/ml-loading-csv-data-into-an-impala-table.html
Alternatively, try a CREATE TABLE AS SELECT query.
None is data, so it cannot be converted to NULL by the CREATE TABLE command itself.
Import the data as-is into a temporary table.
Then use CREATE TABLE AS SELECT and, in the SELECT query, replace None with NULL in the affected columns.
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[PARTITIONED BY (col_name[, ...])]
[SORT BY ([column [, column ...]])]
[COMMENT 'table_comment']
[ROW FORMAT row_format]
[WITH SERDEPROPERTIES ('key1'='value1', 'key2'='value2', ...)]
[STORED AS ctas_file_format]
[LOCATION 'hdfs_path']
[CACHED IN 'pool_name' [WITH REPLICATION = integer] | UNCACHED]
[TBLPROPERTIES ('key1'='value1', 'key2'='value2', ...)]
AS
select_statement
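The staging approach described above can be sketched as follows. This is only an illustration under assumptions: the staging table name customer_db.customers_staging and its comma-delimited text format are hypothetical and not part of the original question. NULLIF(a, b) returns NULL when the two arguments are equal and returns a otherwise, so it is a shorter way of writing the CASE expression. Because the target here is an existing Kudu table, the sketch uses INSERT ... SELECT rather than CTAS.

```sql
-- Hypothetical staging table; name and file format are assumptions.
CREATE TABLE IF NOT EXISTS customer_db.customers_staging (
  customer_id BIGINT,
  ts BIGINT,
  customer_name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Copy into the Kudu table, turning the literal string 'None' into NULL.
INSERT INTO customer_db.customers_table (customer_id, ts, customer_name)
SELECT customer_id,
       ts,
       NULLIF(customer_name, 'None')
FROM customer_db.customers_staging;
```

The same NULLIF expression can be used in the SELECT list of a CTAS statement if the final table does not exist yet.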

Related

Populate snowflake table with default values without selecting default column values from the file data

I am trying to load a table (drop table and load the data - similar to truncate and load) dynamically. Let us assume that table needs to have 4 fields, ID, Name, SeqNo, and DtTimeStamp.
The data is being selected from an externally staged csv\text file that has only two fields (ID and Name). The below query gives an error for the nonmatching of a number of columns. How to resolve that issue?
CREATE OR REPLACE TABLE SOMETABLENAME(ID NUMBER(38,0), Name
VARCHAR(255), SeqNo NUMBER(38,0) NOT NULL AUTOINCREMENT, DtTimeStamp
TIMESTAMP_NTZ(9) NOT NULL DEFAULT CURRENT_TIMESTAMP()) AS SELECT A.$1
AS ID, A.$2 AS Name FROM @EXTERNALSTAGE/SOME_FILE.CSV A;
If you carefully look at the above SQL statement, my table has two extra fields that need to be auto-populated for every row it loads. But I am unable to make it work?
Any suggestions are highly appreciated.
Thanks in Advance!
Sathya

CREATE TABLE … AS SELECT (CTAS)
CREATE TABLE <table_name> ( <col1_name> , <col2_name> , ... ) AS SELECT ...
The number of column names specified must match the number of SELECT list items in the query; the types of the columns are inferred from the types produced by the query.
To resolve it, CTAS and INSERT INTO could be two separate steps:
CREATE OR REPLACE TABLE SOMETABLENAME(
ID NUMBER(38,0),
Name VARCHAR(255),
SeqNo NUMBER(38,0) NOT NULL AUTOINCREMENT,
DtTimeStamp TIMESTAMP_NTZ(9) NOT NULL DEFAULT CURRENT_TIMESTAMP()
);
-- here INSERT/SELECT have matching column list
INSERT INTO SOMETABLENAME(ID, Name)
SELECT A.$1 AS ID, A.$2 AS Name FROM @EXTERNALSTAGE/SOME_FILE.CSV A;

Database Table Constraint using LIKE operator

I am trying to set up constraints on my database table using the LIKE operator. Is this possible in Azure SQL Server?
I have a column FILE_NAME in which, for example, 'VID' is a common pattern in most of the records. I have another column, FILE_TYPE, on which I want to set up a constraint so that only permitted values can be inserted.
Table Definition:
CREATE TABLE dbo.CUST_LIBRARY
(
FILE_NAME VARCHAR(20),
FILE_TYPE VARCHAR(10)
);
Here is what my data looks like:
FILE_NAME
VID_GEO_1
IMG-ART_1
TER-VID_6
FIL-PAR_1
Now I want to set up a constraint on column FILE_TYPE so that the values 'MP4' and 'AVI' can only be inserted if FILE_NAME contains 'VID'. Otherwise, the remaining records should always default to 'NA' and nothing else should be inserted.

You want a check constraint:
CREATE TABLE dbo.CUST_LIBRARY (
FILE_NAME VARCHAR(20),
FILE_TYPE VARCHAR(10),
CONSTRAINT CHK_CUST_LIBRARY
CHECK ( (FILE_TYPE IN ('MP4', 'AVI') AND FILE_NAME LIKE '%VID%') OR
FILE_TYPE = 'NA' )
);
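Note that the constraint above still accepts 'NA' for file names that contain 'VID', and it does not supply 'NA' automatically. If the requirement is read strictly, a default plus a tighter check might look like the sketch below. The constraint names DF_CUST_LIBRARY_FILE_TYPE and CHK_CUST_LIBRARY_STRICT are hypothetical, and existing rows must already satisfy the check for it to be added.

```sql
-- Default FILE_TYPE to 'NA' when the column is omitted from an INSERT.
ALTER TABLE dbo.CUST_LIBRARY
ADD CONSTRAINT DF_CUST_LIBRARY_FILE_TYPE DEFAULT 'NA' FOR FILE_TYPE;

-- Stricter rule: 'VID' names must be MP4/AVI, all other names must be 'NA'.
ALTER TABLE dbo.CUST_LIBRARY
ADD CONSTRAINT CHK_CUST_LIBRARY_STRICT CHECK (
    (FILE_NAME LIKE '%VID%' AND FILE_TYPE IN ('MP4', 'AVI'))
    OR (FILE_NAME NOT LIKE '%VID%' AND FILE_TYPE = 'NA')
);

-- Accepted: name contains 'VID' and the type is allowed.
INSERT INTO dbo.CUST_LIBRARY (FILE_NAME, FILE_TYPE) VALUES ('VID_GEO_1', 'MP4');
-- Accepted: FILE_TYPE falls back to the default 'NA'.
INSERT INTO dbo.CUST_LIBRARY (FILE_NAME) VALUES ('IMG-ART_1');
-- Rejected by the check: name has no 'VID' but the type is 'MP4'.
INSERT INTO dbo.CUST_LIBRARY (FILE_NAME, FILE_TYPE) VALUES ('FIL-PAR_1', 'MP4');
```

A check constraint passes when it evaluates to UNKNOWN, so rows with a NULL FILE_NAME or FILE_TYPE are not blocked by either version; add NOT NULL to the columns if that matters.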

Adding a NOT NULL column to a Redshift table

I'd like to add a NOT NULL column to a Redshift table that has records, an IDENTITY field, and that other tables have foreign keys to.
In PostgreSQL, you can add the column as NULL, fill it in, then ALTER it to be NOT NULL.
In Redshift, the best I've found so far is:
ALTER TABLE my_table ADD COLUMN new_column INTEGER;
-- Fill that column
CREATE TABLE my_table2 (
id INTEGER IDENTITY NOT NULL SORTKEY,
(... all the fields ... )
new_column INTEGER NOT NULL,
PRIMARY KEY(id)
) DISTSTYLE all;
UNLOAD ('select * from my_table')
to 's3://blah' credentials '<aws-auth-args>' ;
COPY my_table2
from 's3://blah' credentials '<aws-auth-args>'
EXPLICIT_IDS;
DROP table my_table;
ALTER TABLE my_table2 RENAME TO my_table;
-- For each table that had a foreign key to my_table:
ALTER TABLE another_table ADD FOREIGN KEY(my_table_id) REFERENCES my_table(id)
Is this the best way of achieving this?

You can achieve this w/o having to load to S3.
modify the existing table to create the desired column w/ a default value
update that column in some way (in my case it was copying from another column)
create a new table with the column w/o a default value
insert into the new table (you must list out the columns rather than using (*), since the column order may not be the same, say if you want the new column in position 2)
drop the old table
rename the table
alter table to give correct owner (if appropriate)
ex:
-- first add the column w/ a default value
alter table my_table_xyz
add visit_id bigint NOT NULL default 0; -- not null but default value
-- now populate the new column with whatever is appropriate (the key in my case)
update my_table_xyz
set visit_id = key;
-- now create the new table with the proper constraints
create table my_table_xyz_new
(
key bigint not null,
visit_id bigint NOT NULL, -- here it is not null and no default value
adt_id bigint not null
);
-- select all from old into new
insert into my_table_xyz_new
select key, visit_id, adt_id
from my_table_xyz;
-- remove the orig table
DROP table my_table_xyz;
-- rename the newly created table to the desired table
alter table my_table_xyz_new rename to my_table_xyz;
-- adjust any views, foreign keys or permissions as required

Postgres returns SQL state: 22001 when copying data to another table

Searching on Google, SQL state 22001 is raised because:
the sql statement specifies a
string that is too long
but both tables have the same definition for the column.
This is the table I want to copy the data into:
CREATE TABLE wise_estado
(
id_estado serial NOT NULL,
>> cvgeo_estado character varying(2),<<
nombre_estado character varying
)
This is the table with the data I want to copy:
CREATE TABLE estados
(
gid serial NOT NULL,
>> "CVE_ENT" character varying(2),<<
"NOM_ENT" character varying(80),
geom geometry(MultiPolygon,4326),
CONSTRAINT estados_pkey PRIMARY KEY (gid)
)
My SQL statement:
INSERT INTO wise_estado ( cvgeo_estado, nombre_estado)
SELECT 'CVE_ENT', 'NOM_ENT'
FROM estados
What am I missing in my SQL statement?

You are sending string literals instead of field names in your SQL statement. Instead:
INSERT INTO wise_estado ( cvgeo_estado, nombre_estado)
SELECT "CVE_ENT", "NOM_ENT"
FROM estados;
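The double quotes matter here: the columns were created with quoted upper-case names, and PostgreSQL folds unquoted identifiers to lower case. The more succinct unquoted form below therefore works only if the columns had been created without quotes: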
INSERT INTO wise_estado ( cvgeo_estado, nombre_estado)
SELECT CVE_ENT, NOM_ENT
FROM estados

MS Access - sql expression for allow null?

I use MS Access (2003) database. Once I create a column I set NOT NULL using sql statement:
ALTER TABLE Table1
ALTER column myColumn INTEGER not null
Is there a way to change it back to allow null values? I already tried:
ALTER TABLE Table1
ALTER column myColumn INTEGER null
but nothing...

You can't specify NULL in ALTER TABLE (although NOT NULL is allowed).
See the documentation below and also this discussion on this topic.
Syntax
ALTER TABLE table {ADD {COLUMN field type[(size)] [NOT NULL] [CONSTRAINT index] | ALTER COLUMN field type[(size)] | CONSTRAINT multifieldindex} | DROP {COLUMN field | CONSTRAINT indexname} }
Old School Solution:-
create a new temporary field that allows NULL, with the same data type
update the new temporary field with the values of the existing NOT NULL field
drop the old NOT NULL field
re-create the dropped column with the same data type, this time without NOT NULL
update the re-created field with the values of the temporary field
if there have been indices on the existing field, recreate these
drop the temporary field
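The steps above might be written as the following statements, using Table1 and myColumn from the question. The temporary column name myColumn_tmp is hypothetical. Access runs one statement per query and does not accept SQL comments, so the comment lines are annotations only and each statement would be executed separately. Index re-creation is left out because the question does not mention any indexes.

```sql
-- 1. Temporary nullable column with the same data type.
ALTER TABLE Table1 ADD COLUMN myColumn_tmp INTEGER;

-- 2. Copy the existing values into the temporary column.
UPDATE Table1 SET myColumn_tmp = myColumn;

-- 3. Drop the old NOT NULL column.
ALTER TABLE Table1 DROP COLUMN myColumn;

-- 4. Re-create the column without NOT NULL.
ALTER TABLE Table1 ADD COLUMN myColumn INTEGER;

-- 5. Copy the values back.
UPDATE Table1 SET myColumn = myColumn_tmp;

-- 6. Drop the temporary column.
ALTER TABLE Table1 DROP COLUMN myColumn_tmp;
```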

Try something like this using MODIFY :-
ALTER TABLE Table1 MODIFY myColumn INT NULL;

The only way I've found is to use DAO directly on the table.
db.TableDefs(strTable1).Fields(strFieldName).Required = False
