This question already has answers here:
Using merge..output to get mapping between source.id and target.id
(3 answers)
Closed last month.
I have a table called TreeItemOption.
I need to insert new rows into TreeItemOption table from a temp table and save the mapped values in a different temp table
INSERT INTO ProductSetup.TreeItemOption (TreeItemHeaderId)
OUTPUT INSERTED.TreeItemOptionId, TI.TreeItemOptionId INTO #NewTreeItemOptionIds
SELECT NTI.NewTreeItemHeaderId
FROM #TTreeItem TI
INNER JOIN #NewTreeItemHeaderIds NTI ON NTI.TreeItemHeaderId = TI.TreeItemHeaderId;
I have a #NewTreeItemHeaderIds table from that table I'm inserting the new TreeItemHeaderId into TreeItemOption table and I'm trying output the newly inserted TreeItemOptionId as well as old TreeItemOptionId from TTreeItem table
But I'm getting the error
The multi-part identifier "TI.TreeItemOptionId" could not be bound
In conventional insert and update commands only Magic tables can only be used in the output clause. Try a merge statement instead if you want the value from your source table to also be mapped.
Something like this:
MERGE DestinationTable DEST
USING SourceTable SRC
ON DEST.ID = SRC.ID
WHEN NOT MATCHED THEN
INSERT
(
DestinationColumnName
)
VALUES
(
SRC.SourceColumnName
)OUTPUT inserted.ID,SRC.SOurceID INTO #MyTemp;
One the above has been executed you'll have the records in the Temp table #MyTemp with the SourceID and the respective DestinationID mapped
I'm trying to create a temporary table in Hive as follows:
CREATE TEMPORARY TABLE mydb.tmp2
AS SELECT * FROM (VALUES (0, 'abc'))
AS T (id , mystr);
But that gives me the following error:
SemanticException [Error 10296]: Values clause with table constructor not yet supported
Is there another way to create a temporary table by explicitly and directly providing the values in the same command?
My ultimate goal is to run a MERGE command, and the temporary table would be inserted after the USING command. So something like this:
MERGE INTO mydb.mytbl
USING <temporary table>
...
Use subquery instead of temporary table:
MERGE INTO mydb.mytbl t
USING (SELECT 0 as id, 'abc' as mystr) tmp on tmp.id = t.id
Hive does not support values constructor yet. You can achieve this using below query:
CREATE TEMPORARY TABLE mydb.tmp2
AS SELECT 0 as id, 'abc' as mystr;
For merge, you can use temporary table as below:
merge into target_table
using ( select * from mydb.tmp2) temp
on temp.id = target_table.id
when matched then update set ...
when not matched then insert values (...);
I use SQL Server 2014.
In my procedure, I have a MERGE statement and I have a question about it.
My MERGE statement has simple following structure:
MERGE dbo.T1 AS tgt
USING (SELECT ...) AS src ON ...
WHEN MATCHED THEN
UPDATE ...
WHEN NOT MATCHED THEN
INSERT ...
OUTPUT inserted.MyColumn
INTO #NewTable (MyColumnValue);
Just like how it populates a table for all inserts, I also need it to populate another table for all updates too.
Is is possible, and if yes then would you please let me know how?
No, it's not possible to direct the results to two tables. See this question.
You can make the table wider and output both the inserted and deleted columns on the same row:
MERGE dbo.T1 AS tgt
USING (SELECT ...) AS src ON ...
WHEN MATCHED THEN
UPDATE ...
WHEN NOT MATCHED THEN
INSERT ...
OUTPUT $action, inserted.col1, inserted.col2, deleted.col1, deleted.col2
INTO #NewTable (action, inserted_col1, inserted_col2, deleted_col1, deleted_col2);
Then you can split #NewTable however you want.
I am new to hive, and want to know if there is anyway to insert data into Hive table like we do in SQL. I want to insert my data into hive like
INSERT INTO tablename VALUES (value1,value2..)
I have read that you can load the data from a file to hive table or you can import data from one table to hive table but is there any way to append the data as in SQL?
Some of the answers here are out of date as of Hive 0.14
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingvaluesintotablesfromSQL
It is now possible to insert using syntax such as:
CREATE TABLE students (name VARCHAR(64), age INT, gpa DECIMAL(3, 2));
INSERT INTO TABLE students
VALUES ('fred flintstone', 35, 1.28), ('barney rubble', 32, 2.32);
You can use the table generating function stack to insert literal values into a table.
First you need a dummy table which contains only one line. You can generate it with the help of limit.
CREATE TABLE one AS
SELECT 1 AS one
FROM any_table_in_your_database
LIMIT 1;
Now you can create a new table with literal values like this:
CREATE TABLE my_table AS
SELECT stack(3
, "row1", 1
, "row2", 2
, "row3", 3
) AS (column1, column2)
FROM one
;
The first argument of stack is the number of rows you are generating.
You can also add values to an existing table:
INSERT INTO TABLE my_table
SELECT stack(2
, "row4", 1
, "row5", 2
) AS (column1, column2)
FROM one
;
Slightly better version of the unique2 suggestion is below:
insert overwrite table target_table
select * from
(
select stack(
3, # generating new table with 3 records
'John', 80, # record_1
'Bill', 61 # record_2
'Martha', 101 # record_3
)
) s;
Which does not require the hack with using an already exiting table.
You can use below approach. With this, You don't need to create temp table OR txt/csv file for further select and load respectively.
INSERT INTO TABLE tablename SELECT value1,value2 FROM tempTable_with_atleast_one_records LIMIT 1.
Where tempTable_with_atleast_one_records is any table with atleast one record.
But problem with this approach is that If you have INSERT statement which inserts multiple rows like below one.
INSERT INTO yourTable values (1 , 'value1') , (2 , 'value2') , (3 , 'value3') ;
Then, You need to have separate INSERT hive statement for each rows. See below.
INSERT INTO TABLE yourTable SELECT 1 , 'value1' FROM tempTable_with_atleast_one_records LIMIT 1;
INSERT INTO TABLE yourTable SELECT 2 , 'value2' FROM tempTable_with_atleast_one_records LIMIT 1;
INSERT INTO TABLE yourTable SELECT 3 , 'value3' FROM tempTable_with_atleast_one_records LIMIT 1;
No. This INSERT INTO tablename VALUES (x,y,z) syntax is currently not supported in Hive.
You could definitely append data into an existing table. (But it is actually not an append at the HDFS level). It's just that whenever you do a LOAD or INSERT operation on an existing Hive table without OVERWRITE clause the new data will be put without replacing the old data. A new file will be created for this newly inserted data inside the directory corresponding to that table. For example :
I have a file named demo.txt which has 2 lines :
ABC
XYZ
Create a table and load this file into it
hive> create table demo(foo string);
hive> load data inpath '/demo.txt' into table demo;
Now,if I do a SELECT on this table it'll give me :
hive> select * from demo;
OK
ABC
XYZ
Suppose, I have one more file named demo2.txt which has :
PQR
And I do a LOAD again on this table without using overwrite,
hive> load data inpath '/demo2.txt' into table demo;
Now, if I do a SELECT now, it'll give me,
hive> select * from demo;
OK
ABC
XYZ
PQR
HTH
Ways to insert data into Hive table:
for demonstration, I am using table name as table1 and table2
create table table2 as select * from table1 where 1=1;
or
create table table2 as select * from table1;
insert overwrite table table2 select * from table1;
--it will insert data from one to another. Note: It will refresh the target.
insert into table table2 select * from table1;
--it will insert data from one to another. Note: It will append into the target.
load data local inpath 'local_path' overwrite into table table1;
--it will load data from local into the target table and also refresh the target table.
load data inpath 'hdfs_path' overwrite into table table1;
--it will load data from hdfs location iand also refresh the target table.
or
create table table2(
col1 string,
col2 string,
col3 string)
row format delimited fields terminated by ','
location 'hdfs_location';
load data local inpath 'local_path' into table table1;
--it will load data from local and also append into the target table.
load data inpath 'hdfs_path' into table table1;
--it will load data from hdfs location and also append into the target table.
insert into table2 values('aa','bb','cc');
--Lets say table2 have 3 columns only.
Multiple insertion into hive table
Yes you can insert but not as similar to SQL.
In SQL we can insert the row level data, but here you can insert by fields (columns).
During this you have to make sure target table and the query should have same datatype and same number of columns.
eg:
CREATE TABLE test(stu_name STRING,stu_id INT,stu_marks INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
INSERT OVERWRITE TABLE test SELECT lang_name, lang_id, lang_legacy_id FROM export_table;
To insert entire data of table2 in table1. Below is a query:
INSERT INTO TABLE table1 SELECT * FROM table2;
You can't do insert into to insert single record. It's not supported by Hive. You may place all new records that you want to insert in a file and load that file into a temp table in Hive. Then using insert overwrite..select command insert those rows into a new partition of your main Hive table. The constraint here is your main table will have to be pre partitioned. If you don't use partition then your whole table will be replaced with these new records.
Enter the following command to insert data into the testlog table with some condition:
INSERT INTO TABLE testlog SELECT * FROM table1 WHERE some condition;
I think in such scenarios you should be using HBASE which facilitates such kind of insertion but it does not provide any SQL kind of query language. You need you use Java API of HBASE like the put method to do such kind of insertion. Moreover HBASE is column oriented no-sql database.
You still can insert into complex type in Hive - it works
(id is int, colleagues array)
insert into emp (id,colleagues) select 11, array('Alex','Jian') from (select '1')
you can add values to specific columns as well, just specify the column names in which you like to add corresponding values:
Insert into Table (Col1, Col2, Col4,col5,Col7) Values ('Va11','Va2','Val4','Val5','Val7');
Make sure the columns you skip dont have not null value type.
There are few properties to set to make a Hive table support ACID properties and to insert the values into tables as like in SQL .
Conditions to create a ACID table in Hive.
The table should be stored as ORC file. Only ORC format can support ACID prpoperties for now.
The table must be bucketed
Properties to set to create ACID table:
set hive.support.concurrency =true;
set hive.enforce.bucketing =true;
set hive.exec.dynamic.partition.mode =nonstrict
set hive.compactor.initiator.on = true;
set hive.compactor.worker.threads= 1;
set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set the property hive.in.test to true in hive.site.xml
After setting all these properties , the table should be created with tblproperty 'transactional' ='true'. The table should be bucketed and saved as orc
CREATE TABLE table_name (col1 int,col2 string, col3 int) CLUSTERED BY col1 INTO 4
BUCKETS STORED AS orc tblproperties('transactional' ='true');
Now its possible to inserte values into the table like SQL query.
INSERT INTO TABLE table_name VALUES (1,'a',100),(2,'b',200),(3,'c',300);
Yes we can use Insert query in Hive.
hive> create table test (id int, name string);
INSERT: INSERT...VALUES is available starting in version 0.14.
hive> insert into table test values (1,'mytest');
This is going to work for insert. We have to use values keyword.
Note: User cannot insert data into a complex datatype column (array, map, struct, union) using the INSERT INTO...VALUES clause.
This may be a simple query to some of you. But I am not strong in Sql, so expecting some solution for my problem.
I have 2 tables, ProductVenueImport and SupplierVenueImport.
We are dumping all the records from SupplierVenueImport to ProductVenueImport using MERGE clause and a Temp table. Temp will have valid records from SupplerVenuImport and from Temp table we are importing records to ProductVenueImport.
But before importing data to ProductVenueImport from Temp table I need to check for the duplicate records in my target (ProductVenueImport).
For example if I am importing a record with name as 'A', I need to look into ProductVenueImport whether 'A' already existing or not. If it is not existing then only I need to insert 'A' otherwise not.
Could somebody tell me how to do this?
Is using Cursors only the option?
Thanks,
Naresh
Assuming the Temp table itself doesn't have duplicates, you could use MERGE like this:
Insert non-existing products.
Do a NO-OP in case of an existing product.
Use $action in the OUTPUT clause to mark which rows were considered for insertion (and inserted) and which for update (but not really updated).
This is what I mean:
DECLARE #noop int; -- needed for the NO-OP below
MERGE INTO ProductVenueImport AS tgt
USING Temp AS src
ON src.ProductID = tgt.ProdutID
WHEN NOT MATCHED THEN
INSERT ( column1, column2, ...)
VALUES (src.column1, src.column2, ...)
WHEN MATCHED THEN
UPDATE SET #noop = #noop -- the NO-OP instead of update
OUTPUT $action, src.column1, src.column2, ...
INTO anotherTempTable
;
I think this would do this :
INSERT INTO PRODUCTTBL(FEILD1, FIELD2, FIELD3, FIELD4, FIELD5)
SELECT (FIELD1,FIELD2,FIELD3,FIELD4,FIELD5) FROM TEMP WHERE CRITERIAFIELD NOT IN(SELECT DISTINCT CRITERIAFIELD FROM PRODUCTTBL)
This should allow you to check for duplicates in a table
select columnname from tablename
group by columnname
having count(columnname) >1
sorry if I am not getting the question right, can't you use the merge statement on the source table with "When not matched Insert" to insert the new records alone
so in your case it should be like this
merge into ProductVenueImport using temp on (<condition for duplicate>)
when not matched then insert <clause>;
the merge clause will make sure that no duplicate records are inserted into your source table.