Setting maximum length for column field in HiveQL

I am very new to SQL/Hive and am trying to set a maximum length for the strings in a column when creating my table, as below:
hive> CREATE TABLE Persons
(
PersonID int,
Suffix string(5),
LastName string,
FirstName string
);
FAILED: ParseException line 3:15 mismatched input '(' expecting ) near 'string' in create table statement
Any ideas what I am doing wrong?

Up to Hive 0.11 you cannot restrict the length of a string column; you have to use the STRING data type.
From Hive 0.12 onward there is a VARCHAR data type, as in other RDBMSs, which lets you specify and restrict the length of a string column. You can check the available types in the Hive documentation on data types.
For the CREATE TABLE syntax in Hive, refer to the DDL section of the Hive Language Manual.
Hope this helps..!!!
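As a minimal sketch of the answer above, assuming Hive 0.12 or later, the table from the question could be declared with VARCHAR in place of the unsupported string(5):

```sql
-- Requires Hive 0.12+; VARCHAR(n) takes the length limit that STRING cannot.
CREATE TABLE Persons (
  PersonID  INT,
  Suffix    VARCHAR(5),   -- at most 5 characters
  LastName  STRING,
  FirstName STRING
);
```

Note that Hive does not reject longer values for a VARCHAR column; a string that exceeds the declared length is silently truncated.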

Related

Why is SQL converting Varchar value to Integer in a varchar column?

I am receiving an error
Conversion failed when converting the varchar value to data type int
while trying to insert data from one table into another. Both have the same table structure (table being inserted is an exact copy of the one used in the Select) and data types on the columns are the same.
INSERT INTO PS_PSOPRDEFN_BA
SELECT *
FROM PSOPRDEFN
Error:
Conversion failed when converting the varchar value '11000_600' to data type int.
The column that receives this value is a varchar(30) in both tables, so I don't know why SQL is trying to convert it to int. Any ideas are appreciated.
When doing inserts, always include the columns:
INSERT INTO PS_PSOPRDEFN_BA ( . . . ) -- column list here
SELECT . . . -- column list here
FROM PSOPRDEFN;
You have a string value that is being assigned to an integer column, and the value cannot be converted.
When doing an insert, the columns are aligned by order in the column list, not by name. So, merely having the same name doesn't mean that the code will work. The tables have to have exactly the same columns defined in the same order with compatible types for your code to work.
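The positional alignment can be illustrated with a small sketch. The tables src and dst and their columns are hypothetical, not the real PSOPRDEFN layout; they only show two tables with the same column names declared in a different order:

```sql
-- Hypothetical tables: same column names, different declaration order.
CREATE TABLE src (code VARCHAR(30), qty INT);
CREATE TABLE dst (qty INT, code VARCHAR(30));

INSERT INTO src (code, qty) VALUES ('11000_600', 1);

-- Positional alignment: src.code would be written to dst.qty, so the
-- engine tries to convert '11000_600' to INT and the statement fails.
-- INSERT INTO dst SELECT * FROM src;

-- Explicit column lists align the values by name instead.
INSERT INTO dst (code, qty)
SELECT code, qty
FROM src;
```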

Hive- issue with Create Table with column have space

I need some advice. In Hive, is it possible to create a table with a column name that contains a space, as below?
CREATE TABLE TEST2("Kod ASS" String)
I get the error below:
Error: Error while compiling statement: FAILED: ParseException line 1:19 cannot recognize input near '"Kod ASS"' 'String' ')' in column specification
SQLState: 42000
ErrorCode: 40000
The manual says this about column names:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
In Hive 0.12 and earlier, only alphanumeric and underscore characters are allowed in table and column names.
In Hive 0.13 and later, column names can contain any Unicode character (see HIVE-6013). Any column name that is specified within backticks (`) is treated literally. Within a backtick string, use double backticks (``) to represent a backtick character. Backtick quotation also enables the use of reserved keywords for table and column identifiers.
To revert to pre-0.13.0 behavior and restrict column names to alphanumeric and underscore characters, set the configuration property hive.support.quoted.identifiers to none. In this configuration, backticked names are interpreted as regular expressions. For details, see Supporting Quoted Identifiers in Column Names.
CREATE TABLE DB_Name.Table_name (
`First name` VARCHAR(64), `Last name` VARCHAR(64), `Location id` VARCHAR(64), age INT, gpa DECIMAL(3,2)) CLUSTERED BY (age) INTO 2 BUCKETS STORED AS ORC;
OR
CREATE TABLE TEST2(`Kod ASS` String) STORED AS TEXTFILE;
You can use backticks (`) and put the column name inside them.
I hope both worked for you.
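A short sketch of how such a column is then used, assuming Hive 0.13 or later with hive.support.quoted.identifiers left at its default value (column). The same backtick quoting is needed every time the column is referenced, not only in the CREATE TABLE statement:

```sql
-- The backticks are part of the quoting, not of the column name itself.
CREATE TABLE TEST2 (`Kod ASS` STRING) STORED AS TEXTFILE;

-- Every later reference to the column needs the same quoting.
SELECT `Kod ASS`
FROM TEST2
WHERE `Kod ASS` IS NOT NULL;
```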

How to convert an invalid number column to a number on HANA?

I have a table with a string column. I convert this column to a number using the function TO_INTEGER(), and that works fine. But if I aggregate the converted column with the function SUM, I get this error:
SAP DBTech JDBC: [339]: invalid number: not a valid number string ' ' at function to_int()
This is my sample SQL query:
select SUM(PARTICIPANT)
from (
select TO_INTEGER(STUDENT) as PARTICIPANT
from MyTable)
Column STUDENT is a varchar(50) in MyTable
What did I do wrong?
Thanks in advance
Without seeing your column values, it looks like you're trying to convert the numeric sequence at the end of your values list to a number, and the spaces that delimit it are throwing this error. But based on the information you've given us, it could be happening on any field.
E.g.:
Create table Table1 (tel_number number);
Insert into Table1 Values ('0419 853 694');
The above gives you a
invalid number
Check your filter / WHERE clause: you get this error message if you compare a string column with an integer value. In a HANA calculation view (SQLScript) I had written the WHERE clause as Bukrs (company code) = 1000. After I changed it to Bukrs = '1000', the issue was resolved.
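The error text in the question shows a value that consists of a single space (' '), which TO_INTEGER cannot parse. One possible workaround, sketched here under the assumption that rows without a plain digit string should simply be ignored by the sum, is to convert only the values that match a digits-only pattern. LIKE_REGEXPR is a HANA predicate; the pattern is an illustrative choice and does not cover signs or decimal points:

```sql
SELECT SUM(PARTICIPANT)
FROM (
  SELECT CASE
           -- Convert only values made of digits, optionally padded with spaces.
           WHEN STUDENT LIKE_REGEXPR '^ *[0-9]+ *$'
             THEN TO_INTEGER(TRIM(STUDENT))
           -- Everything else becomes NULL, which SUM skips.
           ELSE NULL
         END AS PARTICIPANT
  FROM MyTable
);
```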

Is there a way to define replacement of one string with another when creating an external table in Greenplum?

I need to create an external table for an HDFS location. In a few fields the data has null instead of an empty space. If the field length is less than 4 for such fields, selecting the data throws an error. Is there a way to define the replacement of all such nulls with an empty space when creating the table itself?
I am trying this in Greenplum; I tagged Hive just to see what can be done for such cases in Hive.
You could use the serialization property for mapping NULL string to empty string.
CREATE TABLE IF NOT EXISTS abc ( ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE TBLPROPERTIES ("serialization.null.format"="")
In this case, when you query it from Hive you get an empty value for that field, and HDFS has "\N".
Or
If you want to represent an empty string instead of '\N', you can use the COALESCE function:
INSERT OVERWRITE TABLE tabname SELECT NULL, COALESCE(NULL,"") FROM data_table;
The answer to the problem is to use the NULL AS 'null' clause in the create table syntax for Greenplum. As I mentioned, I wanted input from people who have faced such issues in Hive, so I tagged Hive as well. Greenplum's external table syntax supports a NULL AS clause, in which you specify the string that represents NULL in the data.
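A sketch of what that can look like. The table name, the columns, the delimiter, and the HDFS location are all hypothetical, and the gphdfs protocol is an assumption that applies only to Greenplum versions that still ship it (newer releases read HDFS through PXF instead):

```sql
-- Hypothetical external table; only the NULL AS option is the point here.
CREATE EXTERNAL TABLE ext_example (
  id   INT,
  code TEXT
)
LOCATION ('gphdfs://namenode:8020/path/to/data')
FORMAT 'TEXT' (DELIMITER '|' NULL AS 'null');
```

With this option, a field whose text is exactly null is read as SQL NULL instead of being parsed as a column value.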

Create hive timestamp from pig

How can I create a timestamp field in Pig from a string, so that Hive accepts it as a timestamp?
I have formatted the string in Pig to match the timestamp format in Hive, but after loading it is null instead of showing the date.
2014-04-10 09:45:56 is how the value looks in Pig. It matches the Hive timestamp format, but it does not load (it loads only into a string field).
Any ideas why?
Quick update: HCatalog is not available.
The problem is that in some cases the timestamp field contains null values, and all the fields become null when the timestamp data type is used. When loading into a timestamp column where every row is in the above format, it works fine. So the real question is how null values can be handled.
I suspect you have written your data to HDFS using PigStorage and want to load it into a Hive table. The problem is that Pig writes a missing tuple field as an empty string, which Hive 0.11 treats as null. So far so good.
But then all the subsequent fields are treated as null as well, even though they may hold different values. Hive 0.12 doesn't have this issue.
Depending on the SerDe type, Hive interprets different strings as null. In the case of LazySimpleSerDe it is \N.
You have two options:
1. set the table's null format property to the empty string, which is what Pig produces
2. or store \N in Pig for null fields
E.g.:
Given the following data in Pig 0.11:
A = load 'data' as (txt:chararray, ts:chararray);
dump A;
(a,2014-04-10 09:45:56)
(b,2014-04-11 10:45:56)
(,)
(e,2014-04-12 11:45:56)
Option 1:
store A into '/user/data';
Hive 0.11 :
CREATE EXTERNAL TABLE test (txt string, tms TimeStamp)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/data';
alter table test SET SERDEPROPERTIES('serialization.null.format' = '');
Option 2:
...
B = foreach A generate txt, (ts is null?'\\N':ts);
store B into '/user/data';
Then create the table in Hive without setting the serde property.
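For option 2, the Hive side could be sketched as below. This reuses the table definition from option 1 without the ALTER TABLE step, on the assumption that the default null format of LazySimpleSerDe (\N) is in effect:

```sql
-- Same definition as in option 1; no serialization.null.format override,
-- so the \N written by Pig is read back as NULL.
CREATE EXTERNAL TABLE test (txt STRING, tms TIMESTAMP)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/data';

-- Rows with a real timestamp keep their value; the \N rows come back as NULL.
SELECT txt, tms
FROM test
WHERE tms IS NOT NULL;
```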