How to create a new Hive table from two existing ones having the same columns - hive

I have two Hive tables:
> T1exp
Col1 Col2 Col3
1 5 7
3 4 6
4 2 1
and the table
> T2exp
Col1 Col2 Col3
0 5 4
1 2 2
4 3 1
I need to get one table by merging both:
>FinalTable
Col1 Col2 Col3
1 5 7
3 4 6
4 2 1
0 5 4
1 2 2
4 3 1
I tried using this instruction :
create TableRDH as (select * from T2exp as t1 left.join FinalTable as t2 on t1.Col1 = t2.Col1 );
But it gives this error
FAILED: ParseException line 1:7 cannot recognize input near 'create'
'TableRDH' 'as' in ddl statement
How can I resolve this?

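There is a simple way to achieve your objective: create the new table from a union of the two source tables. UNION ALL keeps every row, whereas UNION would also remove duplicate rows:
create table FinalTable as
select * from T1exp
union all
select * from T2exp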

You need a union to merge the table data.
Try:
create table <new_table> as
select col1, col2, col3 from <table1>
union
select col1, col2, col3 from <table2>
Note: the error you are getting is due to a syntax issue. The statement needs the TABLE keyword after CREATE; also try the query without the parentheses around the SELECT.
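To try the union approach from the answers above without a Hive cluster, here is a minimal sketch in Python. It assumes SQLite (via the standard sqlite3 module) as a local stand-in for Hive, so it only illustrates the shape of the CREATE TABLE ... AS SELECT statement, not Hive-specific behavior. The table and column names come from the question.

```python
import sqlite3

# Assumption: SQLite stands in for Hive for local experimentation only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1exp (Col1 INTEGER, Col2 INTEGER, Col3 INTEGER);
    CREATE TABLE T2exp (Col1 INTEGER, Col2 INTEGER, Col3 INTEGER);
    INSERT INTO T1exp VALUES (1, 5, 7), (3, 4, 6), (4, 2, 1);
    INSERT INTO T2exp VALUES (0, 5, 4), (1, 2, 2), (4, 3, 1);

    -- UNION ALL keeps every row from both inputs (no duplicate removal).
    CREATE TABLE FinalTable AS
    SELECT Col1, Col2, Col3 FROM T1exp
    UNION ALL
    SELECT Col1, Col2, Col3 FROM T2exp;
""")

merged_rows = conn.execute("SELECT Col1, Col2, Col3 FROM FinalTable").fetchall()
print(len(merged_rows))  # 6
```

Row order in the merged table is not guaranteed, so sort the result if the order matters.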

Related

Union all in Vertica SQL based on tables with different number of columns?

Hello, I have two tables in Vertica SQL:
table 1
col1 col2 col3
1 3 5
2 4 6
table 2
col1 col2
11 33
22 44
And I would like to UNION these two tables, so that as a result I have:
col1 col2 col3
1 3 5
2 4 6
11 33 NULL
22 44 NULL
How can I do it in Vertica?
In general, you should use UNION ALL and define the extra column with whatever default value you want:
select col1, col2, col3
from table1
union all
select col1, col2, NULL as col3
from table2;
UNION incurs overhead for removing duplicates. In general, you should use UNION ALL unless you intend to remove duplicates.
Use NULL for the missing column, as follows:
select col1, col2, col3 from table1
union
select col1, col2, null from table2
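The NULL padding described above can be checked locally with a small Python sketch. It assumes SQLite as a stand-in for Vertica, so it shows the idea only; table names and data are taken from the question.

```python
import sqlite3

# Assumption: SQLite is used here instead of Vertica, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 INTEGER, col3 INTEGER);
    CREATE TABLE table2 (col1 INTEGER, col2 INTEGER);
    INSERT INTO table1 VALUES (1, 3, 5), (2, 4, 6);
    INSERT INTO table2 VALUES (11, 33), (22, 44);
""")

# The second SELECT supplies NULL for the column table2 does not have.
padded_rows = conn.execute("""
    SELECT col1, col2, col3 FROM table1
    UNION ALL
    SELECT col1, col2, NULL AS col3 FROM table2
    ORDER BY col1
""").fetchall()

for row in padded_rows:
    print(row)  # SQL NULL arrives in Python as None
```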

How to select a column that doesn't exist in a table and return a NULL result for all rows returned

I have a query in which I'd like to select a column that doesn't exist and just fill it with "NULL" in the results. The data is being exported to a .CSV file for import into another database, which has the column. For example:
My query is:
Select col1, col2, col3
from table1
Output is:
col1 col2 col3
1 5 9
2 6 10
3 7 11
4 8 12
I'd like the output to be:
col1 col2 col3 col4
1 5 9 NULL
2 6 10 NULL
3 7 11 NULL
4 8 12 NULL
You can select a null literal:
SELECT col1, col2, col3, NULL AS col4
FROM mytable
Select col1, col2, col3, null as col4 from table1

Removing rows in SQL that have a duplicate column value

I have looked high and low on SO for an answer to this question over the last couple of hours (subqueries, CTEs, left joins with derived tables), but none of the solutions really meet my criteria.
I have a table with data like this :
COL1 COL2 COL3
1 A 0
2 A 1
3 A 1
4 B 0
5 B 0
6 B 0
7 B 0
8 B 1
where column 1 is the primary key and is an int, column 2 is nvarchar(max), and column 3 is an int. I have determined that by using this query:
select name, COUNT(name) as 'count'
FROM [dbo].[AppConfig]
group by Name
having COUNT(name) > 3
I can return the total counts of "A, B and C" only if they have an occurrence of column C more than 3 times. I am now trying to remove all the rows that occur after the initial value of column 3. The sample table I provided would look like this now:
COL1 COL2 COL3
1 A 0
2 A 1
4 B 0
8 B 1
Could anyone assist me with this?
If all you want is the first row with a ColB-ColC combination, the following will do it:
select min(id) as id, colB, colC
from tbl
group by colB, colC
order by id
This should work:
;WITH numbered_rows as (
SELECT
Col1,
Col2,
Col3,
ROW_NUMBER() OVER(PARTITION BY Col2, Col3 ORDER BY Col1) as row
FROM AppConfig)
SELECT
Col1,
Col2,
Col3
FROM numbered_rows
WHERE row = 1
SELECT MIN(COL1) AS COL1, COL2, COL3
FROM TABLE
GROUP BY COL2,COL3
ORDER BY COL1
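The MIN/GROUP BY idea from the answers above can be tested against the sample data with the Python sketch below. It assumes SQLite in place of SQL Server and uses the AppConfig table name from the question. The DELETE step is an added assumption about what "removing" the rows might look like, not something the answers spell out.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; data is the question's sample table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AppConfig (Col1 INTEGER PRIMARY KEY, Col2 TEXT, Col3 INTEGER);
    INSERT INTO AppConfig VALUES
        (1, 'A', 0), (2, 'A', 1), (3, 'A', 1), (4, 'B', 0),
        (5, 'B', 0), (6, 'B', 0), (7, 'B', 0), (8, 'B', 1);
""")

# First (lowest Col1) row for each Col2/Col3 combination.
first_rows = conn.execute("""
    SELECT MIN(Col1) AS Col1, Col2, Col3
    FROM AppConfig
    GROUP BY Col2, Col3
    ORDER BY MIN(Col1)
""").fetchall()

# Hypothetical cleanup: delete every row that is not the first of its group.
conn.execute("""
    DELETE FROM AppConfig
    WHERE Col1 NOT IN (SELECT MIN(Col1) FROM AppConfig GROUP BY Col2, Col3)
""")
remaining_rows = conn.execute(
    "SELECT Col1, Col2, Col3 FROM AppConfig ORDER BY Col1"
).fetchall()
print(remaining_rows)
```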

Need help in Sql query joining operation

I have below test table with data:
Create table test
(
col1 int,
col2 int
)
Sample data:
col1 col2
-------------
1 4
1 5
2 4
3 5
3 4
Now I want all the col1 values that have both 4 and 5 as col2 values.
Expected output: 1, 3, since they have 4 as well as 5 in col2.
SELECT *
FROM test
WHERE col1 IN
(
SELECT col1
FROM test
WHERE col2 = 4
)
AND col2 = 5
If you only need the col1 value, you can do:
select col1 from test t1
where col2 in (4, 5)--take only results with 4 and 5 in col2
group by col1
having count(distinct col2) = 2 --be sure there's at least a 4 and a 5
Of course, the value in the having clause (2 in this case) depends on the number of elements in the IN clause (so if you have 3, 4, 5, you'll need to put 3 in the having clause).
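The link between the IN list and the HAVING count can be made explicit in code. The following Python sketch assumes SQLite as the database and introduces a hypothetical helper, col1_having_all, that derives the HAVING value from the number of required col2 values.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (col1 INTEGER, col2 INTEGER);
    INSERT INTO test VALUES (1, 4), (1, 5), (2, 4), (3, 5), (3, 4);
""")


def col1_having_all(connection, required_values):
    """Return col1 values that have every value in required_values as a col2.

    Hypothetical helper: the HAVING count is set to the number of distinct
    required values, so it always matches the IN list.
    """
    wanted = sorted(set(required_values))
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        f"SELECT col1 FROM test WHERE col2 IN ({placeholders}) "
        "GROUP BY col1 HAVING COUNT(DISTINCT col2) = ? ORDER BY col1"
    )
    return [row[0] for row in connection.execute(sql, (*wanted, len(wanted)))]


print(col1_having_all(conn, [4, 5]))  # [1, 3]
```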

Query without Union operator SQL

TABLE X
col1,col2
1 , 2
1 , 7
1 , 4
1 , 8
2 , 3
2 , 1
2 , 2
3 , 1
3 , 8
3 , 9
3 , 4
4 , 5
4 , 3
4 , 2
4 , 8
4 , 4
I want to retrieve the col1 values that have both 2 and 4 among their col2 values.
In this case it would retrieve the values 1 and 4.
How can I accomplish this without using the UNION ALL operator?
The query that I am using is:
select distinct col1
from X as A
where col1 = (
select col1 from (
select distinct col1
from X as B
where A.col1 = B.col1 and col2 = 2
union ALL
select distinct col1
from X as C
where A.col1 = C.col1 and col2 = 4
) D
group by col1
having count(col1) > 1
)
It returns the correct result, but I guess it is too expensive in terms of performance.
Can anyone give me ideas about how to achieve the same result but without unions ?
This problem is called relational division. Here is one way to do it:
SELECT col1
FROM tablex
WHERE col2 IN (2, 4)
GROUP BY col1
HAVING COUNT(DISTINCT col2) >=2
The HAVING COUNT(DISTINCT col2) >= 2 ensures that each selected col1 has both of the values 2 and 4.
I think the best performance will come from inner joining the table with itself:
SELECT DISTINCT X1.col1
FROM X X1 INNER JOIN X X2 ON X1.col1=X2.col1
WHERE X1.col2=2 AND X2.col2=4
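To check that the GROUP BY and self-join answers agree on the sample data, here is a short Python sketch. It assumes SQLite as the database and makes no claim about which query is faster; that depends on the engine, the indexes, and the data.

```python
import sqlite3

# Assumption: SQLite stands in for the asker's database; data is table X from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE X (col1 INTEGER, col2 INTEGER);
    INSERT INTO X VALUES
        (1, 2), (1, 7), (1, 4), (1, 8),
        (2, 3), (2, 1), (2, 2),
        (3, 1), (3, 8), (3, 9), (3, 4),
        (4, 5), (4, 3), (4, 2), (4, 8), (4, 4);
""")

# Relational division via GROUP BY / HAVING.
group_by_result = [row[0] for row in conn.execute("""
    SELECT col1 FROM X
    WHERE col2 IN (2, 4)
    GROUP BY col1
    HAVING COUNT(DISTINCT col2) >= 2
    ORDER BY col1
""")]

# Same question answered with a self-join.
self_join_result = [row[0] for row in conn.execute("""
    SELECT DISTINCT X1.col1
    FROM X X1 INNER JOIN X X2 ON X1.col1 = X2.col1
    WHERE X1.col2 = 2 AND X2.col2 = 4
    ORDER BY X1.col1
""")]

print(group_by_result, self_join_result)  # [1, 4] [1, 4]
```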