create a view in HIVE merge two tables - hive

How can I create a view to merge three tables?
The workflow is as follows: a single table was originally created in MySQL, and it has since been split into three tables that are stored in Hive,
so I need to create a view over them.
In MySQL the original table was named, for example, Initialtable.
Initialtable consists of col1, col2, col3, col4, col5.
This table has now been split into three tables in Hive, and I need to merge them using a view:
1)table1
2)table2
3)table3
Now this table1 consists of col1,col3,col5
table2 consists of col1,col2,col3
table3 consists of col1,col5
Now I have to create a view that merges table1, table2, and table3.
To do that, I will fill in the columns that a given table does not have as null,
like create view v1 select col1,col2 as null,col3,col4,col5 from table1 union select col1,col2,col3,col4 as null,col5 as null from table 2 union col1,col2 as null,col3 as null, col4 as null,col5 from table 3
Can someone provide the proper syntax to get this output in Hive?

Assuming table1, table2, table3 are the three tables which were split and the columns are as below:
table1: col1,col3,col5
table2: col1,col2,col3
table3: col1,col4,col3
and that col1 is the primary key across all three tables, you can create a view as below:
CREATE OR REPLACE VIEW initialtable AS
SELECT DISTINCT a.col1,
       b.col2,
       a.col3,
       c.col4,
       a.col5
FROM table1 AS a
JOIN table2 AS b
  ON ( a.col1 = b.col1 )
JOIN table3 AS c
  ON ( c.col1 = a.col1 )
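The join above returns one row per col1 value that exists in all three tables. If the three tables instead hold different rows that should be stacked, which is what the attempt in the question seems to aim for, a UNION ALL view with NULL placeholders is the other common shape. Here is a minimal sketch of that idea, run through Python's sqlite3 module purely as a stand-in (it is not Hive). The sample rows are made up, and the column lists follow the question (table3 has col1 and col5). In Hive the NULL placeholders may need a CAST to the intended column type.

```python
import sqlite3

# In-memory SQLite database used only to exercise the SQL; this is not Hive.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 INTEGER, col3 TEXT, col5 TEXT);
CREATE TABLE table2 (col1 INTEGER, col2 TEXT, col3 TEXT);
CREATE TABLE table3 (col1 INTEGER, col5 TEXT);

-- Hypothetical sample rows, one per table.
INSERT INTO table1 VALUES (1, 'c1', 'e1');
INSERT INTO table2 VALUES (2, 'b2', 'c2');
INSERT INTO table3 VALUES (3, 'e3');

-- Every branch lists all five columns in the same order;
-- columns a table does not have are filled with NULL.
CREATE VIEW v1 AS
SELECT col1, NULL AS col2, col3, NULL AS col4, col5 FROM table1
UNION ALL
SELECT col1, col2, col3, NULL AS col4, NULL AS col5 FROM table2
UNION ALL
SELECT col1, NULL AS col2, NULL AS col3, NULL AS col4, col5 FROM table3;
""")

rows = conn.execute("SELECT * FROM v1 ORDER BY col1").fetchall()
for row in rows:
    print(row)
```

Note that the placeholder is written as NULL AS col2, not col2 AS null: the expression comes first and the alias second.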

Related

append column from one table to another table in Oracle

Suppose I have a table named A with 3 columns (one of them an id column) and a table named B with two columns (one of them an id column). Table A and table B can be joined on the id column.
Now I want to append one column from table B to table A using SQL statements, so that after execution table A has 4 columns. Both table A and table B have millions of rows. How can I do this efficiently?
Assuming this is a one-off consolidation of tables and you have a reason to do this rather than using a join (with or without a view):
alter table a add (col4 varchar2(10)); -- or whatever data type you actually need
merge into a
using b
on (b.id = a.id)
when matched then update set a.col4 = b.col4;
You could do a correlated update:
update a set col4 = (
select col4 from b where b.id = a.id
);
but a merge is probably going to be quicker.
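To see what the correlated update does to rows of a that have no partner in b, here is a small sketch. It uses Python's sqlite3 module as a stand-in for Oracle (SQLite has no MERGE, so only the update form is shown), and the table contents are invented for illustration.

```python
import sqlite3

# SQLite stands in for Oracle here; the data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, col2 TEXT, col3 TEXT);
CREATE TABLE b (id INTEGER PRIMARY KEY, col4 TEXT);
INSERT INTO a VALUES (1, 'x1', 'y1'), (2, 'x2', 'y2'), (3, 'x3', 'y3');
INSERT INTO b VALUES (1, 'z1'), (3, 'z3');  -- no row for id 2

-- Step 1: add the new column. Step 2: fill it from b, row by row.
ALTER TABLE a ADD COLUMN col4 TEXT;
UPDATE a SET col4 = (SELECT b.col4 FROM b WHERE b.id = a.id);
""")

result = conn.execute("SELECT id, col4 FROM a ORDER BY id").fetchall()
print(result)
```

The update has no WHERE clause, so it touches every row of a and writes NULL where b has no match. That is harmless for a freshly added column, but it would overwrite existing values if the column were already populated.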
There are 2 steps to perform:
Changing the data model (ie. specifying the additional column for A)
Fill the column with suitable values.
These operations can be folded together if you can afford the resources to temporarily hold the data contained in A and B twice:
CREATE TABLE C AS (
SELECT a.id
, a.col2
, a.col3
, b.col2 AS col4
FROM A a
INNER JOIN B b ON ( b.id = a.id )
);
DROP TABLE A;
RENAME C TO A;
While Alex Poole's answer is more efficient, the above solution works on Oracle versions before 9i (esoteric) and on other RDBMSs (the syntax to rename the table might differ a bit, e.g. alter table C rename to A in PostgreSQL 9.x+).
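The create-and-swap approach can be tried out end to end with a small sketch. It runs on Python's sqlite3 module as a stand-in, so the rename uses SQLite's ALTER TABLE ... RENAME TO form rather than Oracle's RENAME. The data is invented, and the sketch deliberately uses LEFT JOIN instead of the INNER JOIN above so that rows of A without a match in B are kept. That is a variation, not what the answer wrote.

```python
import sqlite3

# SQLite stand-in; table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, col2 TEXT, col3 TEXT);
CREATE TABLE b (id INTEGER, col2 TEXT);
INSERT INTO a VALUES (1, 'x1', 'y1'), (2, 'x2', 'y2');
INSERT INTO b VALUES (1, 'z1');  -- id 2 has no match in b

-- Build the widened table in one pass. LEFT JOIN keeps unmatched rows of a;
-- an INNER JOIN would silently drop the row with id 2.
CREATE TABLE c AS
SELECT a.id AS id, a.col2 AS col2, a.col3 AS col3, b.col2 AS col4
FROM a
LEFT JOIN b ON b.id = a.id;

-- Swap the new table in under the old name.
DROP TABLE a;
ALTER TABLE c RENAME TO a;
""")

columns = [info[1] for info in conn.execute("PRAGMA table_info(a)")]
rows = conn.execute("SELECT * FROM a ORDER BY id").fetchall()
print(columns)
print(rows)
```

A table built with CREATE TABLE ... AS does not carry over the indexes of the original, so those would need to be recreated after the swap.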

Creation of pipe-delimited hive table - duplicate ids

I'm trying to create a pipe-delimited hive table using these commands:
CREATE TABLE IF NOT EXISTS tableA (
id string,
col1 double,
col2 double,
col3 double,
col4 double,
col5 double,
col6 double,
col7 double
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
tblproperties ("skip.header.line.count"="1");
INSERT INTO TABLE TABLEA
select a.id,
b.col1,
b.col2,
b.col3,
b.col4,
b.col5,
b.col6,
b.col7
FROM customerTable as a left join factTable as b on a.id = b.id;
I get duplicate records in the new table, tableA. I checked using
select count(distinct id) as cnt from tableA ;
Whereas if I create a normal hive table like this, I don't get any duplicate ids:
Create table if not exists tableA as
select a.id,
b.col1,
b.col2,
b.col3,
b.col4,
b.col5,
b.col6,
b.col7
FROM customerTable as a left join factTable as b on a.id = b.id;
The table has on the order of 80 million rows, but the difference in the number of records (the duplicate records) is only 58.
I'm not sure what's going on. I guess the problem is with how I'm creating the pipe-delimited Hive table. Any help would be appreciated.
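Remove the tblproperties ("skip.header.line.count"="1") property from your create table statement and run the insert statement again. That property makes Hive skip the first line of each data file when reading, and files written by INSERT contain no header line, so a data row is likely being dropped per file, which would explain a small difference in row counts.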
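The effect of a per-file header skip on files that have no header can be illustrated with a toy simulation in plain Python. This is not Hive code: the read_table helper, the file names, and the row contents are all hypothetical, and the sketch assumes the skip is applied once per data file.

```python
import os
import tempfile


def read_table(directory, skip_header_lines=0):
    """Read every file in the directory, dropping the first N lines of each."""
    rows = []
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as handle:
            lines = handle.read().splitlines()
        rows.extend(lines[skip_header_lines:])
    return rows


with tempfile.TemporaryDirectory() as table_dir:
    # Three pipe-delimited data files of four rows each, with no header line,
    # as an INSERT would produce them.
    for part in range(3):
        path = os.path.join(table_dir, f"part-{part:05d}")
        with open(path, "w") as handle:
            for i in range(4):
                handle.write(f"id{part}{i}|1.0|2.0\n")

    all_rows = read_table(table_dir)
    visible_rows = read_table(table_dir, skip_header_lines=1)

# One real data row disappears per file when a header skip is applied.
print(len(all_rows), len(visible_rows))
```

Under that assumption, the number of missing rows equals the number of data files, so a difference of 58 rows would correspond to 58 files.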

How to merge two tables in sql using inner join but without a common value in both tables

I have two tables in my database and I want to merge them using an inner join. Both tables have a unique column with the same name, but the values differ between the tables, so I can't join on that primary key column.
I also can't use a UNION operator, because I want most of the columns from both the first and the second table.
How can I merge the two tables?
Here is a code snippet of what I am trying to do:
Note: farmercode in table one has different values from farmercode in table two, although the column has the same name and is the primary key in both.
SELECT base.stationCode ,
base.Farmercode ,
base.Farmername ,
base.POcode ,
base.Sex ,
base.DateofContract ,
CONCAT(a.InspectionDate1, a.InspectionDate2) AS Internal_inspectionDate ,
a.UnderstandingOfCertification ,
a.TotFarmAcreage ,
a.CoffeePlots ,
a.CoffeeAcreage ,
a.ArabicTrees ,
a.ProductionEstimateFarm ,
a.Tel ,
a.InspectorName
FROM kcl_baseline_2015_final AS base
INNER JOIN main AS a ON ( base.farmerCode = a.farmerCode )
(Preliminary answer for what I guess the OP is asking. Will edit once the question is clarified.)
SELECT
a.x, a.y, a.z,
b.o, b.p, b.q
FROM my_first_table a
LEFT OUTER JOIN another_table b ON 1=2 -- never matches: every row of a is kept, with NULLs in b's columns
UNION
SELECT
NULL, NULL, NULL,
b.o, b.p, b.q -- only the data from the second table.
FROM another_table b
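A quick way to check what this query returns is to run it on two single-row tables. The sketch below does that with Python's sqlite3 module as a stand-in database. The column types and values are invented.

```python
import sqlite3

# SQLite stand-in; one hypothetical row per table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE my_first_table (x INTEGER, y INTEGER, z INTEGER);
CREATE TABLE another_table (o INTEGER, p INTEGER, q INTEGER);
INSERT INTO my_first_table VALUES (1, 2, 3);
INSERT INTO another_table VALUES (7, 8, 9);
""")

rows = conn.execute("""
SELECT a.x, a.y, a.z, b.o, b.p, b.q
FROM my_first_table a
LEFT OUTER JOIN another_table b ON 1=2   -- condition is never true
UNION
SELECT NULL, NULL, NULL, b.o, b.p, b.q
FROM another_table b
""").fetchall()

for row in rows:
    print(row)
```

The first branch yields the row from my_first_table padded with NULLs, because a left join keeps unmatched left rows. The second branch yields the row from another_table padded on the other side. UNION also removes duplicate rows; UNION ALL would keep them.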
As a basic rule, merging two tables requires a relation between them. In your case no relation can be established, because the two key fields hold different data.
So you can either use a cross join (without an ON condition) or the UNION ALL operator.
Edit 1:
If you want to merge two tables even though they do not have the same number of columns, you can define pseudo columns with aliases, as below:
SELECT Column1, '' Column2 FROM Table1
UNION ALL
SELECT Column1, Column2 FROM Table2;
Edit 2:
If you want to merge two tables that have different columns and display all of them, you can use pseudo columns like this:
SELECT Column1, Column2, '' Column3, '' Column4 FROM Table1
UNION ALL
SELECT '' Column1, '' Column2, Column3, Column4 FROM Table2;

create table to compare rows

In order to use UTL_MATCH.JARO_WINKLER_SIMILARITY in Oracle, I need a table (a temp table in my case) in the following format:
a ----------------abc
a ----------------bax
a ----------------tax
b ----------------abc
b ----------------bax
b ----------------tax
c ----------------abc
c ----------------bax
c ----------------tax
The column on the left comes from one table and the column on the right from another. How can I create a table in this format?
I would highly appreciate your help.
If you want only the empty structure:
create table test as
select table1.col1, table2.col2
from table1, table2
where 1=0
If you want the table with data:
create table test as
select distinct table1.col1, table2.col2
from table1, table2
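Listing two tables in FROM with no join condition produces a cross join, i.e. every row of the first paired with every row of the second. A small sketch with the values from the question shows the result. It uses Python's sqlite3 module as a stand-in for Oracle, writes the join as an explicit CROSS JOIN, and adds column aliases for clarity.

```python
import sqlite3

# SQLite stand-in; values are the ones shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 TEXT);
CREATE TABLE table2 (col2 TEXT);
INSERT INTO table1 VALUES ('a'), ('b'), ('c');
INSERT INTO table2 VALUES ('abc'), ('bax'), ('tax');

-- Cross join: 3 rows x 3 rows = 9 pairs.
CREATE TABLE test AS
SELECT DISTINCT table1.col1 AS col1, table2.col2 AS col2
FROM table1
CROSS JOIN table2;
""")

pairs = conn.execute(
    "SELECT col1, col2 FROM test ORDER BY col1, col2"
).fetchall()
for pair in pairs:
    print(pair)
```

Each pair can then be fed to the similarity function. With large inputs the result grows as the product of the two row counts, so it may be worth filtering before materialising the table.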

using sql create a new table with 2 fields, field1 from tableA and field1 from table B

I am new to SQL and am stuck on a very simple-looking query.
I have 2 tables with exactly the same structure (i.e. the same number of columns and the same number of rows), differing only in their contents. For example, tableA has 2 columns called col1 and col2; tableB also has 2 columns called col1 and col2. I want to create a third, new table whose 1st column is tableA's col1 and whose 2nd column is tableB's col1. Preferably the 1st column is named fromTableA and the 2nd column fromTableC. How do I achieve this? I tried all of the following, but I always get the same error: "number of query values and destination fields are not the same."
variation 1:
insert into newTable(fromTable1,fromTable2)
select col1 from table1
select col1 from table2
variation 2:
insert into newTable(fromTable1,fromTable2)
select col1 from table1,col1 from table2
variation 3:
insert into newTable(fromTable1,fromTable2)
select col1 from table1, table2
Presumably you have fields in the two tables that can be joined, so this:
insert into newtable (fromTable1,fromTable2)
select a.col1, b.col1
from table1 a, table2 b
where a.col1 = b.col1;
Here a and b are table aliases that distinguish the two col1 columns, one from each table. If you don't have fields to join on, then whatever you're trying to do probably needs a rethink.
You may try the following SQL query to achieve your purpose:
with OrderedTableA as (
select row_number() over (order by Col1) RowNum, *
from TableA (nolock)
),
OrderedTableB as (
select row_number() over (order by Col1) RowNum, *
from TableB (nolock)
)
select T1.Col1, T2.Col2 into TableC
from OrderedTableA T1
full outer join OrderedTableB T2 on T1.RowNum = T2.RowNum
The above query creates a new table, TableC, with column Col1 from TableA and Col2 from TableB. You may change the query to suit your needs.
Give it a try.
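The row-number pairing idea can be tried on a toy dataset. The sketch below is an adaptation, not the answer's exact query: it runs on Python's sqlite3 module (window functions need SQLite 3.25 or newer), takes col1 from both tables as the question asked, uses a plain join because the question states both tables have the same number of rows, and names the target columns fromTableA and fromTableB as an assumption. The sample values are invented.

```python
import sqlite3

# SQLite stand-in; requires SQLite 3.25+ for ROW_NUMBER().
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (col1 TEXT, col2 TEXT);
CREATE TABLE tableB (col1 TEXT, col2 TEXT);
INSERT INTO tableA VALUES ('a1', 'x'), ('a2', 'y');
INSERT INTO tableB VALUES ('b1', 'p'), ('b2', 'q');

-- Number the rows of each table, then pair rows with the same number.
CREATE TABLE newTable AS
WITH orderedA AS (
    SELECT ROW_NUMBER() OVER (ORDER BY col1) AS rn, col1 FROM tableA
),
orderedB AS (
    SELECT ROW_NUMBER() OVER (ORDER BY col1) AS rn, col1 FROM tableB
)
SELECT orderedA.col1 AS fromTableA, orderedB.col1 AS fromTableB
FROM orderedA
JOIN orderedB ON orderedA.rn = orderedB.rn;
""")

pairs = conn.execute(
    "SELECT fromTableA, fromTableB FROM newTable ORDER BY fromTableA"
).fetchall()
print(pairs)
```

The pairing depends entirely on the ORDER BY inside each ROW_NUMBER(), so it only makes sense when that ordering reflects how the rows are meant to line up.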