Why is the DELETE statement not working in the Hive shell? - hive

hive> delete from daily_case where num_casedaily=0;
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
thank you in advance.

Hive doesn't support ACID transactions in the conventional way. You need to meet some prerequisites and understand the limitations of ACID transactions in Hive.
You can review this article:
using-hive-acid-transactions-to-insert-update-and-delete-data
for more information on Hive Transactions.
Prerequisites
The Hive transaction manager should be set to DbTxnManager: SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
We need to enable concurrency: SET hive.support.concurrency=true;
Once we set the above properties, we should be able to insert data into any table.
For updates and deletes, the table should be bucketed and the file format needs to be ORC or another ACID-compliant format.
We also need to set the table property transactional to true: TBLPROPERTIES ('transactional'='true');
REVIEW PROPERTIES
$ cd /etc/hive/conf
$ grep -i txn hive-site.xml
$ hive -e "SET;" | grep -i txn
$ beeline -u jdbc:hive2://localhost:10000/training_retail
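The grep check above can also be scripted. This is only a minimal sketch, assuming the usual Hadoop-style hive-site.xml layout (property elements with name and value children). The function name check_acid_settings and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Settings named in the prerequisites above, with the values they should have.
REQUIRED = {
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.support.concurrency": "true",
}

def check_acid_settings(hive_site_xml):
    """Return {property: (expected, actual)} for settings that are missing or differ."""
    root = ET.fromstring(hive_site_xml)
    actual = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            actual[name.strip()] = (prop.findtext("value") or "").strip()
    problems = {}
    for name, expected in REQUIRED.items():
        if actual.get(name) != expected:
            problems[name] = (expected, actual.get(name))
    return problems

# Hypothetical sample: the transaction manager is still the default one.
SAMPLE = """
<configuration>
  <property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager</value>
  </property>
  <property>
    <name>hive.support.concurrency</name>
    <value>true</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    for name, (expected, found) in check_acid_settings(SAMPLE).items():
        print(f"{name}: expected {expected}, found {found}")
```

Note that this only inspects the file. Values changed with SET inside a session apply to that session and would not show up here.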
As an example, to create a transactional table in Hive:
SET hive.txn.manager;
hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.enforce.bucketing;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode;
hive.exec.dynamic.partition.mode=strict
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.compactor.initiator.on;
SET hive.compactor.initiator.on=true;
-- A positive number
SET hive.compactor.worker.threads;
SET hive.compactor.worker.threads=1;
CREATE TABLE orders_transactional (
order_id INT,
order_date STRING,
order_customer_id INT,
order_status STRING
) CLUSTERED BY (order_id) INTO 8 BUCKETS
STORED AS ORC
TBLPROPERTIES("transactional"="true");
INSERT INTO orders_transactional VALUES
(1, '2013-07-25 00:00:00.0', 1000, 'COMPLETE');
INSERT INTO orders_transactional VALUES
(2, '2013-07-25 00:00:00.0', 2001, 'CLOSED'),
(3, '2013-07-25 00:00:00.0', 1500, 'PENDING'),
(4, '2013-07-25 00:00:00.0', 2041, 'PENDING'),
(5, '2013-07-25 00:00:00.0', 2031, 'COMPLETE');
UPDATE orders_transactional
SET order_status = 'COMPLETE'
WHERE order_status = 'PENDING';
DELETE FROM orders_transactional
WHERE order_status <> 'COMPLETE';
SELECT *
FROM orders_transactional;

As @Chema explained, Hive does support ACID transactions, and you can change the table properties to allow them.
OR
You can do the following. With this you don't have to change table properties.
INSERT OVERWRITE TABLE daily_case
SELECT * FROM daily_case WHERE num_casedaily <> 0;

Related

Can't insert values into a transactional table in Apache HIVE

I'm trying to insert values into a bucketed, transactional table via both Beeline and Apache HUE, but I receive an error saying that unlocking locks is not permitted.
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
USE classicmodels;
CREATE TABLE classicmodels.my_emp (ID INT, Name STRING, Salary INT)
CLUSTERED BY (ID) INTO 5 BUCKETS
STORED AS ORC
TBLPROPERTIES('transactional'='true');
INSERT INTO TABLE classicmodels.my_emp (id, name, salary)
VALUES (1, 'John', 10000),
(2, 'Sara', 12000),
(3, 'Adam', 8000);
Error in beeline
Existing Locks
Error in HUE:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Hive Internal Error: org.apache.hadoop.hive.ql.lockmgr.LockException(Error communicating with the metastore) at org.apache.hive.service.cli.operation.Operation.toSQ...

How to use MERGE or Upsert Sql statement

How can I use a MERGE or UPDATE statement for the code below? I have a column called MachineName; the other column values change, but MachineName does not. If MachineName changes, I need to insert the new values in a second row. If not, I need to update the same row. How can I do this? Is this the right approach? Please help.
MERGE INTO [devLaserViso].[dbo].[Machine] WITH (HOLDLOCK)
USING [devLaserViso].[dbo].[Machine]
ON (MachineName = MachineName)
WHEN MATCHED
THEN UPDATE SET MachineName = L1,ProgramName= ancdh.pgm, TotalCount= 10, RightCount=4,
LeftCount= 3,ErrorCode=0,FinishingTime=fsefsefef
WHEN NOT MATCHED
THEN INSERT (MachineName, ProgramName, TotalCount, RightCount, LeftCount, ErrorCode, FinishingTime)
VALUES (L02, djiwdn.pgm, 11, 5, 4, 0, dnwdnwoin);
I've had success in the past "Selecting" the values to upsert into the USING section of the MERGE command:
MERGE INTO [devLaserViso].[dbo].[Machine] WITH (HOLDLOCK) AS Target
USING (SELECT 'L1' AS MachineName, 'ancdh.pgm' AS ProgramName, 10 AS TotalCount,
4 AS RightCount, 3 AS LeftCount, 0 AS ErrorCode, 'fsefsefef' AS FinishingTime) AS Source
ON (Target.MachineName = Source.MachineName)
WHEN MATCHED
THEN UPDATE SET ProgramName= Source.ProgramName, TotalCount= Source.TotalCount,
RightCount= Source.RightCount, LeftCount= Source.LeftCount,
ErrorCode= Source.ErrorCode, FinishingTime= Source.FinishingTime
WHEN NOT MATCHED
THEN INSERT (MachineName, ProgramName, TotalCount, RightCount, LeftCount, ErrorCode, FinishingTime)
VALUES (Source.MachineName, Source.ProgramName, Source.TotalCount, Source.RightCount,
Source.LeftCount, Source.ErrorCode, Source.FinishingTime);
You can load the new Machine data into a temporary table and then use a MERGE statement as follows. It updates the records that already exist in the Machine table and inserts a new record when one does not exist.
MERGE [devLaserViso].[dbo].[Machine] WITH (HOLDLOCK) AS t
USING [devLaserViso].[dbo].[TempMachine] AS s
ON (s.MachineName = t.MachineName)
WHEN MATCHED THEN
UPDATE SET t.MachineName = s.MachineName, t.ProgramName = s.ProgramName
WHEN NOT MATCHED BY TARGET THEN
INSERT (MachineName, ProgramName) VALUES (s.MachineName, s.ProgramName);

Liquibase doesn't execute SQLs and returns Update Successful

I have created a new PostgreSQL DB on my PC,
CREATE DATABASE "Test_Liquibase_Versionin"
WITH OWNER = postgres
ENCODING = 'UTF8'
TABLESPACE = pg_default
LC_COLLATE = 'English_United States.1252'
LC_CTYPE = 'English_United States.1252'
CONNECTION LIMIT = -1;
Downloaded liquibase-3.4.2-bin and ran this command:
C:\LiquiBase\liquibase.bat --driver=org.postgresql.Driver --classpath="C:\LiquiBase\Driver\postgresql-9.4.1212.jar" --changeLogFile=C:\LiquiBase\changes\databaseChangeLog.sql --url="jdbc:postgresql://localhost:5432/Test_Liquibase_Versionin?user=test&password=Password" update
Got this response: Liquibase Update Successful
Checked the DB and noticed that I now have 2 tables: databasechangelog and databasechangeloglock.
Then I changed databaseChangeLog.sql to look like this:
--liquibase formatted sql
create table employees( uuid int, name Varchar(10));
insert into employees values(1, 'Mr');
insert into employees values(2, 'Mail');
create table depts( dept_id int, dep_name Varchar(10));
Executed C:\LiquiBase\liquibase.bat......... again.
Got this response: Liquibase Update Successful
Logged into my DB, but the new tables I expected are not there.
What could be the reason for that? What could I do to test what's gone wrong?
You need to add a changeset to the script.
This is because you are telling Liquibase that your sql script is Liquibase formatted sql. But since you do not have any changeset in it, it does nothing.
--liquibase formatted sql
--changeSet PeterH:Inserting-Values1 endDelimiter:; splitStatements:true stripComments:false runOnChange:false
create table employees( uuid int, name Varchar(10));
insert into employees values(1, 'Mr');
insert into employees values(2, 'Mail');
create table depts( dept_id int, dep_name Varchar(10));
In my case, I was setting the Liquibase header on the file, but the problem was a space between the username (the user that makes the change) and the parameter:
--liquibase formatted sql
--changeset Jenkins: scriptname.sql stripComments:true splitStatements:true
As you can see, there is a space between "Jenkins:" and "scriptname.sql". I deleted that space and it runs OK.
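Both problems described in these answers (no changeset at all, or a malformed author:id pair) can be caught before running Liquibase. Here is a rough pre-flight check, assuming the header shape "--changeset author:id". The regular expression is my own approximation and not Liquibase's parser, and lint_formatted_sql is a hypothetical helper name.

```python
import re

# Approximation of a changeset header: "--changeset author:id [attributes...]".
# This is NOT Liquibase's own parser, only a rough pre-flight check.
CHANGESET = re.compile(r"^\s*--\s*changeset\s+(\S+?):(\S+)", re.IGNORECASE)
HEADER = re.compile(r"^\s*--\s*liquibase formatted sql", re.IGNORECASE)
ANY_CHANGESET = re.compile(r"^\s*--\s*changeset\b", re.IGNORECASE)

def lint_formatted_sql(text):
    """Return a list of warnings for a Liquibase formatted SQL changelog."""
    lines = text.splitlines()
    warnings = []
    if not lines or not HEADER.match(lines[0]):
        warnings.append("first line is not '--liquibase formatted sql'")
    found = 0
    for number, line in enumerate(lines, start=1):
        if ANY_CHANGESET.match(line):
            if CHANGESET.match(line):
                found += 1
            else:
                warnings.append(
                    f"line {number}: malformed changeset header (expected author:id)"
                )
    if found == 0:
        warnings.append("no valid changeset header found, so nothing would be executed")
    return warnings

if __name__ == "__main__":
    sample = (
        "--liquibase formatted sql\n"
        "--changeset Jenkins: scriptname.sql stripComments:true splitStatements:true\n"
        "create table depts( dept_id int, dep_name Varchar(10));\n"
    )
    for warning in lint_formatted_sql(sample):
        print(warning)
```

Run against the header with the stray space, it flags line 2 and also reports that no valid changeset was found.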

Conditional adding of records to a table

I have got this update script updating certain columns:
update oppar
set oppar_run_mode = 0,
oppar_run_time = 0,
oppar_interval_ind = 'N' ,
oppar_destination = '',
oppar_run_date ='',
oppar_run_interval=''
where ( oppar_job_name, oppar_job_rec )
in
( ('CSCLM' , 'XYZ')
, ('ARCLEVEXT' , 'LMN'));
But there are cases where the table oppar has no record in which the column
oppar_job_rec is XYZ or LMN.
So I first need to verify that oppar_job_name=CSCLM exists.
If it does, I need to check whether the job rec corresponding to CSCLM, i.e. oppar_job_rec=XYZ, exists,
and if it does not exist I need to add a new record with these details:
oppar_job_name=CSCLM
oppar_job_rec=XYZ
oppar_run_mode = 0
oppar_run_time = 0
oppar_interval_ind = 'N'
oppar_destination = ''
oppar_run_date =''
oppar_run_interval=''
If it exists then I need to update that row.
Please help, and tell me if you need more info.
How do I perform this check, if it can be done at all? I need to do this for about 100 records with different values for oppar_job_rec.
Oracle 9i Enterprise Edition release 9.2.8.0 - 64 bit Production
You can use a SQL Merge statement: http://psoug.org/reference/merge.html
Here's some example code:
Instead of hardcoding the job_name and job_rec, build a table (if they aren't already in some table):
CREATE TABLE oppar_jobs (oppar_job_name VARCHAR2(200),
oppar_job_rec VARCHAR2(200));
INSERT INTO oppar_jobs (oppar_job_name,oppar_job_rec)
VALUES ('CSCLM','XYZ');
INSERT INTO oppar_jobs (oppar_job_name,oppar_job_rec)
VALUES ('ARCLEVEXT','LMN');
Then you can run a MERGE as follows:
MERGE
INTO oppar
USING oppar_jobs
ON ( oppar_jobs.oppar_job_name = oppar.oppar_job_name
AND oppar_jobs.oppar_job_rec = oppar.oppar_job_rec)
WHEN MATCHED
THEN
UPDATE
SET oppar_run_mode = 0,
oppar_run_time = 0,
oppar_interval_ind = 'N' ,
oppar_destination = '',
oppar_run_date ='',
oppar_run_interval=''
WHEN NOT MATCHED
THEN
INSERT ( oppar_job_name,
oppar_job_rec,
oppar_run_mode,
oppar_run_time,
oppar_interval_ind,
oppar_destination,
oppar_run_date,
oppar_run_interval)
VALUES ( oppar_jobs.oppar_job_name,
oppar_jobs.oppar_job_rec,
0,
0,
'N',
'',
'',
'');
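MERGE was introduced in 9i (where both the WHEN MATCHED and WHEN NOT MATCHED clauses are required), so it should be available to you. If you would rather not use it, you have several other options, two of which involve PL/SQL.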
Option 1: update then insert
If you don't care about errors occurring you can just run your update then run your insert. The update may do nothing and the insert may cause a primary key violation but at least you know everything has been done. If you do this each insert would have to be done separately.
Option 2: update then insert with error catching
Using PL/SQL you could do something like the following,
begin
update my_table
set <col1> = :col1
where <blah>;
if SQL%ROWCOUNT = 0 then
-- nothing was updated, so the row does not exist yet
insert into my_table
values (<my values>);
elsif SQL%ROWCOUNT = 1 then
insert less...
end if;
end;
Option 3: insert then update with error catching
begin
insert into my_table
values (<my values>);
exception
when dup_val_on_index then
-- the row already exists, so update it instead
update my_table
set <col1> = :col1
where <blah>;
end;
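To see the "update, then insert if nothing was updated" option end to end, here is a small runnable sketch. It uses Python with an in-memory SQLite database as a stand-in for Oracle, with cursor.rowcount playing the role of SQL%ROWCOUNT. The primary key on (oppar_job_name, oppar_job_rec), the reduced column list, and the pre-existing row are assumptions made for the demo.

```python
import sqlite3

def upsert_oppar(conn, job_name, job_rec):
    """Update the row if it exists, otherwise insert it."""
    cur = conn.execute(
        "UPDATE oppar SET oppar_run_mode = 0, oppar_run_time = 0, "
        "oppar_interval_ind = 'N' "
        "WHERE oppar_job_name = ? AND oppar_job_rec = ?",
        (job_name, job_rec),
    )
    if cur.rowcount == 0:  # plays the role of SQL%ROWCOUNT = 0
        conn.execute(
            "INSERT INTO oppar (oppar_job_name, oppar_job_rec, oppar_run_mode, "
            "oppar_run_time, oppar_interval_ind) VALUES (?, ?, 0, 0, 'N')",
            (job_name, job_rec),
        )
        return "inserted"
    return "updated"

def demo():
    conn = sqlite3.connect(":memory:")
    # Assumed schema: only a subset of the columns from the question.
    conn.execute(
        "CREATE TABLE oppar (oppar_job_name TEXT, oppar_job_rec TEXT, "
        "oppar_run_mode INTEGER, oppar_run_time INTEGER, oppar_interval_ind TEXT, "
        "PRIMARY KEY (oppar_job_name, oppar_job_rec))"
    )
    # One pair already exists with different values; the other does not.
    conn.execute("INSERT INTO oppar VALUES ('CSCLM', 'XYZ', 5, 30, 'Y')")
    pairs = [("CSCLM", "XYZ"), ("ARCLEVEXT", "LMN")]
    results = [upsert_oppar(conn, name, rec) for name, rec in pairs]
    rows = conn.execute("SELECT * FROM oppar ORDER BY oppar_job_name").fetchall()
    conn.close()
    return results, rows

if __name__ == "__main__":
    print(demo())
```

For about 100 records, the list of pairs would come from a file or a helper table instead of being hardcoded.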

Simulate a deadlock using stored procedure

Does anyone know how to simulate a deadlock using a stored procedure that inserts or updates values? I could only do so in Sybase using individual commands.
Thanks,
Ver
Create two stored procedures.
The first should start a transaction, modify table 1 (and take a long time) and then modify table 2.
The second should start a transaction, modify table 2 (and take a long time) and then modify table 1.
Ideally, the modifications should affect the same rows, or create table locks.
Then, in a client application, start SP1, and immediately then also start SP2 (before SP1 has finished).
The short answer is to access the table's data in reverse order from two connections, which introduces a cyclic dependency between them and hence a deadlock. Let me show you the code:
Create table vin_deadlock (id int, Name Varchar(30))
GO
Insert into vin_deadlock values (1, 'Vinod')
Insert into vin_deadlock values (2, 'Kumar')
Insert into vin_deadlock values (3, 'Saravana')
Insert into vin_deadlock values (4, 'Srinivas')
Insert into vin_deadlock values (5, 'Sampath')
Insert into vin_deadlock values (6, 'Manoj')
GO
Now, with the table ready, update the rows in reverse order from two connections, like this:
-- Connection 1
Begin Tran
Update vin_deadlock
SET Name = 'Manoj'
Where id = 6
WAITFOR DELAY '00:00:10'
Update vin_deadlock
SET Name = 'Vinod'
Where id = 1
and from connection 2
-- Connection 2
Begin Tran
Update vin_deadlock
SET Name = 'Vinod'
Where id = 1
WAITFOR DELAY '00:00:10'
Update vin_deadlock
SET Name = 'Manoj'
Where id = 6
And this will result in a deadlock. You can see the deadlock graph from profiler.
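The same opposite-order pattern can be reproduced outside a database. Below is an analogy only, using Python threads and locks rather than SQL: the two locks stand in for the row locks on id 6 and id 1, a barrier stands in for WAITFOR DELAY, and a timeout stands in for the server's deadlock detection. All names are made up for illustration.

```python
import threading

def run_opposite_order(timeout=0.2):
    """Two workers take two locks in opposite order.

    Returns a dict saying whether each worker managed to get its second lock.
    """
    lock_a, lock_b = threading.Lock(), threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:                        # like the first UPDATE in the transaction
            both_hold_first.wait()         # stands in for WAITFOR DELAY
            ok = second.acquire(timeout=timeout)  # like the second UPDATE
            got_second[name] = ok
            both_tried_second.wait()       # keep holding 'first' until both have tried
            if ok:
                second.release()

    t1 = threading.Thread(target=worker, args=("conn1", lock_a, lock_b))
    t2 = threading.Thread(target=worker, args=("conn2", lock_b, lock_a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return got_second

if __name__ == "__main__":
    print(run_opposite_order())
```

Neither worker gets its second lock, because each one is waiting on a lock the other still holds. A database server would resolve the cycle by picking one connection as the deadlock victim instead of letting both time out.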
Start a process that continuously inserts into or updates a table, using a WHILE loop in a script, and then run your desired stored procedure.