Hive converts NULL to an empty string in a STRING column

Hive converts NULL to an empty string in a STRING column. What is the reason for that?
Our requirement is to see NULL in string columns rather than an empty string; otherwise IS NULL does not work for those columns.
To solve this, we set the following property on the table:
TBLPROPERTIES('serialization.null.format'='')
However, we still see an empty string instead of NULL, and SHOW TBLPROPERTIES does not list this property in its result, so I am not sure whether the property is actually set.
I tried setting this property in the DDL itself, and I also tried:
ALTER TABLE <TableName> SET TBLPROPERTIES ('serialization.null.format' = '');

If needed, create another table and store the values from this one in it using the format below. I used -1 as an example; you can use any value of your choice.
In your select query:
select
case when <col> is null then -1 else <col> end as <col>
from
table
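The CASE expression above can be tried out quickly with a small Python sketch. It uses an in-memory SQLite database purely as a stand-in for Hive (an assumption made for illustration; the table name src, the column name col, and the helper replace_nulls are hypothetical), so it shows only the NULL-substitution logic, not Hive's storage behavior.

```python
import sqlite3

def replace_nulls(rows, placeholder="-1"):
    """Load rows into an in-memory table and read them back with NULLs replaced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE src (col TEXT)")  # hypothetical table and column
    conn.executemany("INSERT INTO src (col) VALUES (?)", [(r,) for r in rows])
    cur = conn.execute(
        "SELECT CASE WHEN col IS NULL THEN ? ELSE col END AS col "
        "FROM src ORDER BY rowid",
        (placeholder,),
    )
    result = [row[0] for row in cur]
    conn.close()
    return result

# None is stored as NULL; the empty string is a distinct, non-NULL value.
print(replace_nulls(["a", None, ""]))
```

Note that only the NULL row is replaced; the empty string passes through unchanged, which is the distinction the question is about.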

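How are you moving the data?
If you are using Sqoop, you can try passing the arguments below. (The --input-* forms apply to sqoop export; for sqoop import the corresponding options are --null-string and --null-non-string.)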
--input-null-string '\\N'
--input-null-non-string '\\N'

Related

Empty string being stored as null and need to differentiate between null and empty string in Oracle [duplicate]

I am using Oracle DB. At the database level, when you set a column value to either NULL or '' (empty string), the fetched value is NULL in both cases. Is it possible to store '' (empty string) as a non NULL value in the database?
I execute this
UPDATE contacts SET last_name = '' WHERE id = '1001';
commit;
SELECT last_name, ID FROM contacts WHERE id ='1001';
LAST_NAME ID
------------ ------
null 1001
Is it possible to store the last_name as a non-NULL empty string ('')?
The only way to do this in Oracle is with some kind of auxiliary flag field that, when set, indicates that the value should be treated as an empty string.
As far as I know, Oracle does not distinguish between '' and NULL.
Oracle has a well-known behavior: it silently converts '' to NULL on INSERT and UPDATE statements.
You have to deal with this in your code, either by converting NULL to an empty string when you read the columns back in, or by not relying on null for these values in your program to begin with.
It has been a long time since I used Oracle, but I believe we used a single space ' ' to represent an empty string and then trimmed it after reading.
If you use a VARCHAR2 data type then NULL and '' are identical and you cannot distinguish between them; so, as mentioned in other answers, you would either need to:
Have an additional column containing a flag that distinguishes NULL from non-NULL values, so that if the flag says the value is non-NULL and the column contains NULL, you know it is an empty string; or
Use an alternate representation, such as a single space character, for an empty string. This means you cannot store a real string equal to that alternate representation; however, if trailing white-space is syntactically invalid for the strings you are storing, then using a single space character to represent an empty string is fine.
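The second option can be sketched on the application side as a pair of helpers that translate between program values and stored values. This is only a minimal illustration, assuming a single space is never a legitimate value; the names EMPTY_SENTINEL, to_db and from_db are hypothetical.

```python
EMPTY_SENTINEL = " "  # assumption: a lone space never occurs as real data

def to_db(value):
    """Map a program value to what would be stored in a VARCHAR2 column."""
    if value is None:
        return None                # stays NULL
    if value == "":
        return EMPTY_SENTINEL      # empty string is stored as the sentinel
    if value == EMPTY_SENTINEL:
        raise ValueError("value collides with the empty-string sentinel")
    return value

def from_db(stored):
    """Map a stored column value back to the program value."""
    if stored is None:
        return None
    if stored == EMPTY_SENTINEL:
        return ""
    return stored

for original in [None, "", "Smith"]:
    assert from_db(to_db(original)) == original
```

Rejecting the sentinel on the way in keeps the mapping reversible, at the cost of making one string unstorable, as the answer points out.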
If you are using a CLOB data type then you CAN store an empty string using the EMPTY_CLOB() function:
CREATE TABLE table_name (value CLOB);
INSERT INTO table_name (value) VALUES (NULL);
INSERT INTO table_name (value) VALUES (EMPTY_CLOB());
INSERT INTO table_name (value) VALUES ('A');
Then:
SELECT value, LENGTH(value) FROM table_name;
Outputs:
VALUE  LENGTH(VALUE)
-----  -------------
null   null
       0
A      1

Handling null for char(1) and varchar(2) in Hive

I am reading a flat file in Hive, and the file contains null values like this:
a|b|null|null|d
When I create a table on top of it with the data types below
a char(1),b char(1),c char(1),varchar2(2),char(1)
the values in the table come out like this:
a,b,n,nu,d
One way to handle this is to make the data type varchar2(4) and add a check for null.
Is there any other way to do this?
The SerDe treats 'null' strings as normal values; it sees no difference between the value 'a' and the value 'null'.
Try adding the 'serialization.null.format'='null' property to your table definition:
ALTER TABLE mytable SET tblproperties('serialization.null.format'='null');
Another approach is to use the STRING data type and a case expression in the select:
select case when col = 'null' then null else col end as col
...
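To make the effect of serialization.null.format concrete, here is a simplified Python model of how a delimited text row is read: a field becomes NULL only when its text equals the configured null marker, which for Hive text tables defaults to \N. The function parse_line is hypothetical and ignores escaping, type conversion, and truncation to char/varchar lengths.

```python
def parse_line(line, delimiter="|", null_format="\\N"):
    """Split a delimited text row; fields equal to null_format become None."""
    return [
        None if field == null_format else field
        for field in line.split(delimiter)
    ]

row = "a|b|null|null|d"

# Default marker (\N): the text 'null' is kept as an ordinary string.
print(parse_line(row))

# Marker set to 'null': those fields are now read as NULL.
print(parse_line(row, null_format="null"))

# Marker set to '': empty fields are read as NULL.
print(parse_line("a||c", null_format=""))
```

In this model, setting the marker to an empty string is what turns empty fields into NULL, which is the behavior the first question was trying to get.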

Trigger to convert empty string to 'null' before it posts in SQL Server decimal column

I've got a front-end table that essentially matches our SSMS database table t_myTable. The columns I'm having problems with are those with numeric data types in the db. They are set to allow null, but when the user deletes the numeric value in the front end and tries to send a blank value, it is not posted to the database. I suspect this is because the value is sent back as an empty string "", which does not convert to the nullable numeric data type.
Is there a trigger I can create to convert these empty strings into null on insert and update to the database? Or, perhaps a trigger would already happen too late in the process and I need to handle this on the front end or API portion instead?
We'll call my table t_myTable and the column myNumericColumn.
I could also be wrong and perhaps this 'empty string' issue is not the source of my problem. But I suspect that it is.
As @DaleBurrell noted, the proper place to handle data validation is in the application layer. Alternatively, you can wrap each of the potentially problematic values in a NULLIF function, which returns NULL if an empty string is passed to it.
The syntax would be along these lines:
SELECT
...
,NULLIF(ColumnName, '') AS ColumnName
select nullif(Column1, '') from tablename
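For the application-layer route mentioned above, a minimal sketch of the conversion might look like this. It assumes the API receives the field as text and passes the result on as a query parameter; the helper name to_nullable_decimal is hypothetical, and the actual database call is left out.

```python
from decimal import Decimal

def to_nullable_decimal(raw):
    """Convert form input to Decimal, mapping blank input to None (SQL NULL)."""
    if raw is None or raw.strip() == "":
        return None
    # Decimal raises decimal.InvalidOperation for text that is not a number.
    return Decimal(raw.strip())

print(to_nullable_decimal(""))         # blank input becomes None
print(to_nullable_decimal(" 12.50 "))  # numeric text becomes a Decimal
```

Passing None as a bound parameter is how most database drivers send NULL, so the empty string never reaches the numeric column.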
SQL Server does not allow converting an empty string to a numeric data type. Hence a trigger is useless in this case, even an INSTEAD OF one: SQL Server checks the conversion before inserting.
SELECT CAST('' AS numeric(18,2)) -- Error converting data type varchar to numeric
CREATE TABLE tab1 (col1 numeric(18,2) NULL);
INSERT INTO tab1 (col1) VALUES(''); -- Error converting data type varchar to numeric
Since you did not mention this error, the client is probably passing something other than ''. You can track down the problem with SQL Profiler: run it and look at the exact SQL statement being executed to insert data into the table.

Updating a JSON field replaces whole document?

In SQL Server 2016 I expect a document in a JSON column to have 3000+ fields. Can I update one field in the document without replacing the whole document? How can I do this?
You could use the JSON_MODIFY function:
Updates the value of a property in a JSON string and returns the
updated JSON string.
JSON_MODIFY ( expression , path , newValue )
Something like:
UPDATE table_name
SET json_column = JSON_MODIFY(json_column, '$.name', 'new_name')
WHERE id = 1;
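As the quoted description says, the function returns the updated JSON string, and the UPDATE then assigns that string back to the column. A rough Python emulation of that behavior, limited to simple '$.a.b' paths, is sketched below; the helper json_modify is a hypothetical stand-in and not SQL Server's implementation.

```python
import json

def json_modify(document, path, new_value):
    """Rough stand-in for JSON_MODIFY, supporting simple '$.a.b' paths only."""
    if not path.startswith("$."):
        raise ValueError("only simple '$.key' paths are supported in this sketch")
    keys = path[2:].split(".")
    data = json.loads(document)
    node = data
    for key in keys[:-1]:
        node = node[key]          # walk down to the parent object
    node[keys[-1]] = new_value    # set or replace the target property
    return json.dumps(data)       # a complete new JSON string is returned

doc = '{"name": "old_name", "age": 30}'
updated = json_modify(doc, "$.name", "new_name")
print(updated)
```

Only one property is named in the call, but the result is still a full document string, mirroring how the SQL statement targets a single path while writing the column value as a whole.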
