Hive date functions within create table - hive

My hive version is 1.2.0
I am doing hive hbase integration where my hbase table already present.
While creating hive table, I was checking if I can use few of hive's built-in date functions as a candidate for virtual columns/derived columns, which is something like this -
create external table `Hive_Test`(
*existing hbase columns*,
*new_column* AS to_date(from_unixtime(unix_timestamp(*existing_column*,'yyyy/MM/dd HH:mm:ss')...
)CLUSTERED BY (..) SORTED BY (new_colulmn) INTO n BUCKETS
..
WITH SERDEPROPERTIES(
hbase.columns.mappings=':key,cf:*,:timestamp',
..
)
If there is any other way where I can use built-in functions capability in create table, then please let me know.
Thanks.

With reference to - Hive Computed Column, i think you are defining a logic when creating a table which is not possible with hive.
You can refer this article for Apache Hive Derived Column Support and Alternative

A better way is to create a view on top of the non-native table created for Hive-HBase integration, with which you can do almost any kind of mapping that facilitates your business.

Related

How to create a bucketed ORC transactional table in Hive that is modeled after a non-transactional table

Suppose I have a non-transactional table in Hive named 'ccm'. It has hundreds of columns and one partition field.
I know how to create a copy with "create table abc like ccm' but I would like abc to be bucketed, ORC, and have transaction support set on via TBLPROPERTIES.
I do not want to mention all the columns in ccm when I compose the HQL.
Can I do this?
This answer may have the correct way to proceed in your case, and it also explains some limitation of the method used.
Create hive table using "as select" or "like" and also specify delimiter
So, from the example, you should add the missing parts:
CLUSTER BY
TBLPROPERTIES ("transactional"="true")
I have some doubts that you can achieve exactly your expected results but i would consider it as a step forward

How to extract the SQL Create statement from ignite

Using Apache Ignite, is it possible to extract the CREATE statement used to create the table? You can do this in MySQL with the SHOW CREATE TABLE x command for example.
I don't think dumping DML (database structure) is possible currently. Especially since CREATE TABLE is only one way of making tables in Ignite out of three.
However, you can query tables, schemas and indexes via JDBC metadata introspection feature.

How to find when record was last updated?

How to find when table rows were last updated/inserted? Presto is ANSI-SQL compliant so even if you don't know Presto, maybe there's a generic SQL way that would point me in the right direction.
I'm using Hadoop. Presto queries are quicker than Hive. "Describe" just gives column names.
https://prestosql.io/docs/current/
Presto 309 added a hidden $properties table in the Hive connector for each table that exposes the Hive table properties. You can use it to find the last update time (replace example with your table name):
SELECT transient_lastddltime FROM "example$properties"

Add unique value in hive table

I want to add a unique value to my hive table whenever i enter any record, that value should not be repeated in the entire hive table. I am not able to find any solutions or any function for this. In my case i want to enter the record in hive using pig latin. Please help.
HIVE does not provide RDBMS database like constraints.
The suggested approch using PIG Script is as below.
1. Load data
2. Apply DISTINCT to data
3. Store data at a location
4. Create external hive table at the same location.
Step 3 and 4 can be combined if you can use HCATALOG which allows you to directly store data in Hive table.
Official documentation :Link 1 link 2
did you take a look to this? https://github.com/manojkumarvohra/hive-hilo it seems to provide a way to generate sequence numbers in hive using hi/lo algorithm

What is the default schema in Hive?

For Pig, the default schema is ByteArray. Is there a default schema for Hive if we don't mention a schema in Hive? I tried to look at some Hive documentation but couldn't find any.
Hive is schema on Read --- I am not sure this is the answer...If some one could give an insight on this that would be great
Hive does the best that it can to
read the data. You will get lots of null values if there aren’t enough fields in each record
to match the schema. If some fields are numbers and Hive encounters nonnumeric
strings, it will return nulls for those fields. Above all else, Hive tries to recover from all
errors as best it can.
There is not default schema in Hive, in order to query data in hive you have to first create a table explaining the content of your data (by using create external table ... location).
So you basically have to tell hive the "scheme" before querying the data.