Can't insert values into a transactional table in Apache HIVE - hive

I'm trying to insert values into a bucketed, transactional table via both Beeline and Apache Hue, but I receive an error saying that unlocking locks is not permitted.
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
USE classicmodels;
CREATE TABLE classicmodels.my_emp (ID INT, Name STRING, Salary INT)
CLUSTERED BY (ID) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES('transactional'='true');
INSERT INTO TABLE classicmodels.my_emp (id, name, salary)
VALUES (1, 'John', 10000),
(2, 'Sara', 12000),
(3, 'Adam', 8000);
Error in beeline
Existing Locks
Error in HUE:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(Error communicating with the metastore) at org.apache.hive.service.cli.operation.Operation.toSQ...

Related

Why is the DELETE statement not working in the Hive shell?

hive> delete from daily_case where num_casedaily=0;
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
thank you in advance.
Hive doesn't support ACID transactions in the conventional way. You need to meet some prerequisites and understand the limitations of ACID transactions in Hive.
You can review this article:
using-hive-acid-transactions-to-insert-update-and-delete-data
for more information on Hive Transactions.
Prerequisites
The Hive transaction manager should be set to DbTxnManager: SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
Concurrency needs to be enabled: SET hive.support.concurrency=true;
Once the above properties are set, we should be able to insert data into any table.
For updates and deletes, the table should be bucketed and the file format needs to be ORC or another ACID-compliant format.
The table property transactional also needs to be set to true: TBLPROPERTIES ('transactional'='true');
REVIEW PROPERTIES
$ cd /etc/hive/conf
$ grep -i txn hive-site.xml
$ hive -e "SET;" | grep -i txn
$ beeline -u jdbc:hive2://localhost:10000/training_retail
As an example, to create a transactional table in Hive:
SET hive.txn.manager;
-- output: hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.enforce.bucketing;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode;
-- output: hive.exec.dynamic.partition.mode=strict
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.compactor.initiator.on;
SET hive.compactor.initiator.on=true;
-- A positive number
SET hive.compactor.worker.threads;
SET hive.compactor.worker.threads=1;
CREATE TABLE orders_transactional (
order_id INT,
order_date STRING,
order_customer_id INT,
order_status STRING
) CLUSTERED BY (order_id) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES("transactional"="true");
INSERT INTO orders_transactional VALUES
(1, '2013-07-25 00:00:00.0', 1000, 'COMPLETE');
INSERT INTO orders_transactional VALUES
(2, '2013-07-25 00:00:00.0', 2001, 'CLOSED'),
(3, '2013-07-25 00:00:00.0', 1500, 'PENDING'),
(4, '2013-07-25 00:00:00.0', 2041, 'PENDING'),
(5, '2013-07-25 00:00:00.0', 2031, 'COMPLETE');
UPDATE orders_transactional
SET order_status = 'COMPLETE'
WHERE order_status = 'PENDING';
DELETE FROM orders_transactional
WHERE order_status <> 'COMPLETE';
SELECT *
FROM orders_transactional;
As @Chema explained in the answer on Hive ACID transactions, you can change the table property to allow transactions.
OR
You can do the following, which doesn't require changing the table properties:
INSERT OVERWRITE TABLE daily_case
SELECT * FROM daily_case WHERE num_casedaily <> 0;
-- Rows where num_casedaily is NULL are also dropped; add OR num_casedaily IS NULL to keep them.

Can you make inserting by Id optional?

I want to be able to insert something into my table at a specific ID, so I turned IDENTITY_INSERT on for the table. However, if I just want the auto increment to handle the ID, this error appears:
"Explicit value must be specified for identity column in table
'TsiList' either when IDENTITY_INSERT is set to ON or when a
replication user is inserting into a NOT FOR REPLICATION identity
column."
Is there a way to make the queries
INSERT INTO table (ID, something_else) VALUES (15, 'foo');
and
INSERT INTO table (something_else) VALUES ('foo');
work at the same time?
You can't do it without switching identity_insert on and off as required in between running each query.
Each version will only work when identity_insert is set to the relevant value within the session in which the query is being executed.
For example:
SET IDENTITY_INSERT TsiList ON;
INSERT INTO TsiList (ID, something_else) VALUES (15, 'foo');
SET IDENTITY_INSERT TsiList OFF;
INSERT INTO TsiList (something_else) VALUES ('foo');

Liquibase doesn't execute SQLs and returns Update Successful

I have created a new PostgreSQL DB on my PC,
CREATE DATABASE "Test_Liquibase_Versionin"
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'English_United States.1252'
LC_CTYPE = 'English_United States.1252'
CONNECTION LIMIT = -1;
I downloaded liquibase-3.4.2-bin and ran this command:
C:\LiquiBase\liquibase.bat --driver=org.postgresql.Driver --classpath="C:\LiquiBase\Driver\postgresql-9.4.1212.jar" --changeLogFile=C:\LiquiBase\changes\databaseChangeLog.sql --url="jdbc:postgresql://localhost:5432/Test_Liquibase_Versionin?user=test&password=Password" update
I got this response: Liquibase Update Successful
I checked the DB and noticed that I have 2 tables: databasechangelog and databasechangeloglock.
I then changed databaseChangeLog.sql to look like this:
--liquibase formatted sql
create table employees( uuid int, name Varchar(10));
insert into employees values(1, 'Mr');
insert into employees values(2, 'Mail');
create table depts( dept_id int, dep_name Varchar(10));
I executed C:\LiquiBase\liquibase.bat......... again.
I got this response: Liquibase Update Successful
I logged into my DB, but the new tables I expected are not there.
What could be the reason for that? What could I do to test what's gone wrong?
You need to add a changeset to the script.
This is because you are telling Liquibase that your sql script is Liquibase formatted sql. But since you do not have any changeset in it, it does nothing.
--liquibase formatted sql
--changeSet PeterH:Inserting-Values1 endDelimiter:; splitStatements:true stripComments:false runOnChange:false
create table employees( uuid int, name Varchar(10));
insert into employees values(1, 'Mr');
insert into employees values(2, 'Mail');
create table depts( dept_id int, dep_name Varchar(10));
In my case, I had the Liquibase header in the file, but the problem was a space between the username (the user that makes the change) and the changeset id:
--liquibase formatted sql
--changeset Jenkins: scriptname.sql stripComments:true splitStatements:true
As you can see, there is a space between "Jenkins:" and "scriptname.sql". I deleted that space and it runs OK.
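For reference, the header with the space removed would look like the sketch below. The author and id are the ones from the broken header above; the create table statement is only a placeholder borrowed from the question, not part of this answer's original script.

```sql
--liquibase formatted sql
--changeset Jenkins:scriptname.sql stripComments:true splitStatements:true
-- author and id are joined as author:id, with no space after the colon
create table employees( uuid int, name Varchar(10));
```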

SQL Server Merge WHEN NOT MATCHED clause customizations

When merge clause is used in SQL Server, I need to insert a row when it is not available. This is what I have tried:
drop table test;
create table test (col1 int, col2 varchar(20));
insert into test values(1, 'aaa');
insert into test values(2, 'bbb');
insert into test values(3, 'ccc');
--insert into test values(4, 'eee');
merge test as target
using (SELECT * from test where col1=4) as source
on (target.col1 = source.col1)
when matched then
update set target.col2='ddd'
when not matched by target then
insert values (4, 'ddd');
This updates on a match but fails to insert. I have two questions:
Is there a way to insert upon not matching in the above case?
Can I customize the not matching criteria to raise an error?
Thanks.
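The merge works; it's just that your source (SELECT * from test where col1=4) is empty, because there is no such row. The WHEN NOT MATCHED BY TARGET branch fires once per source row that has no match in the target, so an empty source gives it nothing to insert.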
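One way to get the insert to fire is to build the source from the values themselves rather than from the target table. The following is a minimal sketch under that assumption, reusing the test table from the question; the literal row (4, 'ddd') is taken from the question's insert clause.

```sql
-- The source is a one-row derived table, so it is never empty.
MERGE test AS target
USING (SELECT 4 AS col1, 'ddd' AS col2) AS source
ON (target.col1 = source.col1)
WHEN MATCHED THEN
    -- A row with col1 = 4 already exists: update it.
    UPDATE SET target.col2 = source.col2
WHEN NOT MATCHED BY TARGET THEN
    -- No row with col1 = 4 yet: insert the source row.
    INSERT (col1, col2) VALUES (source.col1, source.col2);
```

Run against the three-row table from the question, this should take the insert branch; run a second time, it should take the update branch instead.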
You can raise an error using this hack: the division by zero forces a runtime error when the branch is taken. For example:
when not matched by target then
insert values (0/0 /*ASSERT*/, NULL);

Running the same unmodified query in SQL Server and Oracle

I want to run a query on both Oracle and SQL Server. The problem I have is that the query inserts into a column called PERCENT which I believe is a keyword in SQL Server.
A straight insert like this fails on SQL Server
INSERT INTO testtable
(PERCENT,VALUE)
VALUES
(50,'test');
To overcome this, SQL Server accepts the statement if it is changed to one of the following:
INSERT INTO testtable
([PERCENT],[VALUE])
VALUES
(50,'test');
INSERT INTO testtable
("PERCENT","VALUE")
VALUES
(50,'test');
The problem now is that Oracle does not support any of the above formats. Oracle only allows this format:
INSERT INTO testtable
(PERCENT,VALUE)
VALUES
(50,'test');
Is there a way I can run the above query in both Oracle and SQL Server without any problems?
Actually Oracle does support this format:
insert into testtable("PERCENT","VALUE") values(50,'test');
Here is a direct paste from my SQL*Plus session:
SQL> create table testtable (percent number, value varchar2(20));
Table created.
SQL> insert into testtable ("PERCENT", "VALUE") values (50, 'test');
1 row created.
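On the SQL Server side, the double-quoted form depends on the QUOTED_IDENTIFIER session option, which ODBC and OLE DB connections normally turn on by default. A minimal sketch, assuming the testtable from the question: the SET statement is SQL Server only and would be issued once per session if the option is off, while the INSERT is the statement shared with Oracle.

```sql
-- SQL Server only: make double quotes delimit identifiers instead of string literals.
SET QUOTED_IDENTIFIER ON;

-- Same statement text as the one shown working in Oracle above.
INSERT INTO testtable ("PERCENT", "VALUE") VALUES (50, 'test');
```

Note that Oracle treats quoted identifiers as case-sensitive, so the names need to stay in upper case to match columns that were created without quotes.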