SQL Impala: create a partitioned table from a view

I want to turn a view into a table, and I want the table to be partitioned by one column.
My query is:
CREATE TABLE table_test AS
SELECT
*
FROM view_test;
I want the table to be partitioned by the column "time_period".
Thank you in advance.

You can do this directly with CREATE TABLE ... PARTITIONED BY ... AS SELECT:
CREATE TABLE table_test
PARTITIONED BY (time_period)
as
SELECT col1,col2,
time_period -- Make sure the partition column is the last column in the SELECT list.
FROM schema.view_test;
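If the column types of the target table need to be set explicitly, the same result can probably be reached in two steps: create the empty partitioned table, then fill it with a dynamic partition insert. The following is only a sketch. The column types, including STRING for time_period, are assumptions, because the schema of the view is not given in the question.

```sql
-- Step 1: create the empty partitioned table.
-- Column types are assumed; adjust them to match view_test.
CREATE TABLE table_test (
  col1 INT,
  col2 STRING
)
PARTITIONED BY (time_period STRING);

-- Step 2: dynamic partition insert.
-- The partition column must be the last column in the SELECT list.
INSERT INTO table_test PARTITION (time_period)
SELECT col1, col2, time_period
FROM view_test;

-- List the partitions that were created.
SHOW PARTITIONS table_test;
```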

Related

How can I create and query against temp table in HiveQL?

I am new to HiveQL and I need to create a temp table from the results of the following query:
SELECT * FROM `database`.`table` LIMIT 0,100;
Then I need to run a query against this temp table. How can I accomplish this, and what is the best practice?
So I guess the SQL Anywhere version would be something like:
create table #temp (foo int)
insert into #temp (foo)
select top (100) 1
from dbo.table t
select count(*) from #temp as c
drop table #temp
Any help would be greatly appreciated. Thanks in advance!
In Hive you can get similar functionality with CREATE TEMPORARY TABLE. The table can be managed or external, and it exists only while your Hive session is active. Also, please note that if you are running this as part of a batch process, it may not be a good idea.
CREATE TEMPORARY TABLE etl.tmp1 as select * from tab limit 10;
select count(*) from etl.tmp1 ;

Exclude partitions in a select query

I have a table in Hive that is partitioned by country.
I want to exclude 3 specific partitions, such as somalia and iraq.
I do not want to put this in the WHERE clause (not in ('somalia','iraq')).
Is there an option to exclude specific partitions, similar to how columns can be excluded from the SELECT statement?
Please suggest.
You can drop the partitions that are not needed. Note that on a managed table, dropping a partition also deletes its data.
hive> alter table <db_name>.<table_name> drop partition
(<partition_field>="somalia"),(<partition_field>="iraq");
(or)
Create a view on top of the table that excludes the partitions that are not needed.
hive> create view <db_name>.<view_name> as select * from <db_name>.<table_name>
where <partition_field> not in ("somalia","iraq");
hive> select * from <db_name>.<view_name>;

Split Hive table on subtables by field value

I have a Hive table foo with several fields. One of them is some_id. The number of unique values in this field is in the range 5,000-10,000. For each value (10385 in this example) I need to run a CTAS query like
CREATE TABLE bar_10385 AS
SELECT * FROM foo WHERE some_id=10385 AND other_id=10385;
What is the best way to perform this bunch of queries?
You can store all these tables in a single partitioned table. This approach lets you load all the data in a single query, and query performance will not be compromised.
Create table T (
... --columns here
)
partitioned by (id int); --new calculated partition key
Load the data using one query; it reads the source table only once:
insert overwrite table T partition(id)
select ..., --columns
case when some_id=10385 AND other_id=10385 then 10385
when some_id=10386 AND other_id=10386 then 10386
...
--and so on
else 0 --default partition for records not attributed
end as id --partition column
from foo
where some_id in (10385,10386) AND other_id in (10385,10386) --filter
Then you can use this table in queries by specifying the partition:
select * from T where id = 10385; --you can create a view named bar_10385 that acts the same as your table; partition pruning is fast
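Since every per-value query in the question filters on some_id = X AND other_id = X, the partition key can likely be taken from some_id directly instead of listing thousands of CASE branches. The sketch below assumes that reading of the question. The column names col_a and col_b are hypothetical stand-ins for the elided column list, and the partition limits are illustrative values chosen to exceed the 5,000-10,000 distinct values mentioned (the documented Hive defaults are, to my knowledge, 1000 in total and 100 per node).

```sql
-- Allow inserts where every partition value is determined dynamically.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Raise the dynamic partition limits; the numbers are illustrative.
SET hive.exec.max.dynamic.partitions=20000;
SET hive.exec.max.dynamic.partitions.pernode=20000;

INSERT OVERWRITE TABLE T PARTITION (id)
SELECT
  col_a,           -- hypothetical non-partition columns of foo
  col_b,
  some_id AS id    -- partition column, must come last
FROM foo
WHERE some_id = other_id;  -- same rows as some_id=X AND other_id=X for each X
```

With this form no default partition is needed, because rows where the two ids differ are filtered out rather than routed to id = 0.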

Copied data column is not partitioned in target table in hive

I have created a table in Hive from an existing partitioned table using the command
create table new_table As select * from old_table;
The record counts match in both tables, but when I run DESC on the new table, the column is not shown as a partition column.
You should explicitly specify the partition columns when creating the table.
create table new_table partitioned by (col1 datatype,col2 datatype,...) as
select * from old_table;
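Whether PARTITIONED BY is accepted in a CTAS depends on the Hive version, so a two-step alternative may be useful. This is only a sketch: the non-partition columns (id, name) and all column types are hypothetical, and the SET lines assume dynamic partitioning is not already enabled for the session.

```sql
-- Step 1: create the partitioned target table explicitly.
-- Column names and types are assumed; copy them from old_table.
CREATE TABLE new_table (
  id INT,
  name STRING
)
PARTITIONED BY (col1 STRING);

-- Step 2: copy the data, letting Hive create partitions from col1.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE new_table PARTITION (col1)
SELECT id, name, col1   -- partition column last
FROM old_table;

-- The output should now list col1 under the partition information.
DESCRIBE FORMATTED new_table;
```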

Temporary tables and AS statement usage

If I have a query, say:
SELECT *
FROM ( SELECT fields
FROM tables
WHERE condition
)
AS TEMP_TABLE
Are the results of the above query saved in a temporary table called TEMP_TABLE, so that I can perform another query on it later? Will the query below execute successfully on DB2?
SELECT fields
FROM TEMP_TABLE
WHERE condition
The answer is no; it's just an alias for a subquery.
If you want to use it later, you have to create it explicitly.
You can create a temporary table in the following way:
CREATE TEMPORARY TABLE temp_table AS (
SELECT fields
FROM tables
WHERE condition
);
Then you can retrieve data from the temp table as below:
SELECT * FROM temp_table;
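If the intermediate result is only needed within a single statement, a common table expression gives the subquery a name without creating any table. A minimal sketch, using hypothetical table and column names (orders, order_id, amount) in place of the placeholders above:

```sql
-- temp_table exists only for the duration of this one statement.
WITH temp_table AS (
    SELECT order_id, amount   -- hypothetical columns
    FROM orders               -- hypothetical table
    WHERE amount > 100
)
SELECT order_id
FROM temp_table
WHERE amount > 500;
```

As with the subquery alias, the name temp_table is not visible to later statements; for that, an explicitly created temporary table is still required.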