I have a Hive table and I want to update the struct. The table was created like this:
CREATE TABLE orig (columnname
struct<a:string, b:string,
c:array<struct<d:string, e:string>>,
f:array<struct<g:string>>,
h:string,
i:array<struct<j:string>>>);
I need to rename some fields of the struct:
change 'a' to 'new_a' and 'b' to 'new_b'.
I also need to add some fields inside the struct (b1, b2 and g3):
a:string,
b:string,
b1:string,
b2:string,
c:array<struct<d:string, e:string>>,
f:array<struct<g:string,g3:string>>,
h:string,
i:array<struct<j:string>>>);
How can I do this? Can someone help me?
I have a struct column in BigQuery called meta. Inside that struct, I have a field called join_at which is currently a FLOAT, and I'd like to change it to a TIMESTAMP.
I tried running this query:
ALTER TABLE `my-table`
ALTER COLUMN meta.join_at SET DATA TYPE TIMESTAMP
That doesn't work. It throws an error at the "." character. So, apparently I can't just change the struct field like that.
What would be the correct approach in this case?
If you want to change the type of a field inside a struct, you would normally redefine the whole struct, like this:
CREATE OR REPLACE TABLE so_test.alter_struct(s1 STRUCT<a FLOAT64, b STRING>);
ALTER TABLE so_test.alter_struct ALTER COLUMN s1
SET DATA TYPE STRUCT<a TIMESTAMP, b STRING>;
However, a FLOAT is not coercible to a TIMESTAMP, as listed in this table:
https://cloud.google.com/bigquery/docs/reference/standard-sql/conversion_rules#comparison_chart
Instead, you can take an approach similar to the one in this question:
How to delete a column in BigQuery that is part of a nested column
That is, define the new structure in a query and overwrite the table with its result.
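A minimal sketch of that overwrite, not a definitive implementation. It assumes that join_at holds Unix epoch seconds (the question does not say what the FLOAT represents) and reuses the table name `my-table` from the question, which in practice would need to be qualified with its dataset:

```sql
-- Assumption: meta.join_at stores Unix epoch seconds as a FLOAT64.
-- Rebuild the struct with join_at converted, then overwrite the table.
CREATE OR REPLACE TABLE `my-table` AS
SELECT
  * REPLACE (
    (SELECT AS STRUCT
       meta.* REPLACE (TIMESTAMP_SECONDS(CAST(meta.join_at AS INT64)) AS join_at)
    ) AS meta
  )
FROM `my-table`;
```

CAST rounds the FLOAT64 to a whole second here, so any fractional part is lost. If the original table is partitioned or clustered, those options would have to be restated in the CREATE OR REPLACE statement.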
I have a large table in which I'd like to replace personal information with dummy data. I've written functions for this, but I'm struggling with how to apply them.
I'd like to do something like:
ALTER TABLE SomeTable DROP COLUMN SomeName
ALTER TABLE SomeTable ADD COLUMN SomeName NVARCHAR(30) DEFAULT (SELECT * FROM dbo.FakeName)
Help would be appreciated.
Instead of dropping and adding a column, just do an UPDATE.
If you just want to overwrite the actual data with dummy data, you can use an UPDATE statement like the one below. We do something very similar in our day-to-day work. For example, when restoring data to a local or test machine, we sanitize the users' real email addresses in one column and overwrite another column with 'XXX'.
UPDATE SomeTable
SET Email_address = SUBSTRING(Email_address, 0, CHARINDEX('@', Email_address)) + '@mytest.com',
    SomeName2 = 'XXX';
I have a table with lots of columns.
I don't want to write something like
CREATE TABLE IF NOT EXISTS
table1(
col1 int,
col2 String,
etc....)
Is there a fast way to create a table with the same structure, but without any data?
Try this:
CREATE TABLE some_db.T1 LIKE some_db.T2
See this manual: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-CreateTableLike
select name into viewtable from stdinfo5
My error is:
There is already an object named 'viewtable' in the database.
Can someone explain? I want to add a column, with its data, from the stdinfo5 table into viewtable.
Thanks!
select ... into SomeTarget from SomeSource will create a physical table with the name SomeTarget!
You can use DROP TABLE SomeTarget to delete this table (careful with real data!) or, what might be better, use select ... into #SomeTarget ....
The # before the name creates the table as a temp table, which is deleted automatically when it goes out of scope.
In your case it seems that you do not want to delete the table, but just want to add one more column. For that you'd need something like ALTER TABLE viewtable ADD TheColumnName TheColumnType; and then an UPDATE statement to fill this column. If possible, it would be easier to delete the table and re-create it with the missing column...
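The ALTER TABLE plus UPDATE route could look roughly like this. It is only a sketch: the column age and its type are hypothetical, and it assumes that name is unique in both tables so it can serve as the join key:

```sql
-- Hypothetical: add a column "age" that also exists in stdinfo5.
ALTER TABLE viewtable ADD age INT NULL;
GO  -- new batch (SSMS/sqlcmd), so the UPDATE is compiled after the column exists

-- Fill the new column by matching rows on name (assumed unique).
UPDATE v
SET v.age = s.age
FROM viewtable AS v
JOIN stdinfo5 AS s
  ON s.name = v.name;
```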
I have a partitioned table Employee in Hive.
Now I want to copy all of its contents to another table without defining the schema again.
My first table looks like this:
create table Employee(Id String,FirstName String,Lastname String);
But I don't want to define the same schema again for the NewEmployee table:
create table Newemployee(Id String,FirstName String,LastName String);
Since you have not mentioned any partitioning details, I am assuming that the partitioning is not significant here. Please correct me if I am wrong.
The query that you are looking for would be like this:
create table Newemployee as select * from Employee;
You can also use the code below:
Create table dbname.tablename LIKE existing_table_or_Viewname LOCATION hdfs-path
CREATE TABLE NewEmployee
[ROW FORMAT SERDE] (if any)
[STORED AS] Format
AS
SELECT * FROM Employee [SORT BY];
Restrictions on CREATE TABLE AS SELECT (CTAS):
1. The target table cannot be a partitioned table.
2. The target table cannot be an external table.
3. The target table cannot be a list bucketing table.
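If the partitioning does need to be preserved, one possible route is to combine CREATE TABLE LIKE with a dynamic-partition insert. This is a sketch under an assumption: the question never names the partition column, so dept below is hypothetical:

```sql
-- Assumption: Employee is partitioned by a column named dept (hypothetical).
-- Copy only the table definition, including its partition columns.
CREATE TABLE Newemployee LIKE Employee;

-- Allow Hive to derive partition values from the rows being inserted.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- SELECT * returns the partition column last, which is where
-- dynamic partitioning expects to find it.
INSERT OVERWRITE TABLE Newemployee PARTITION (dept)
SELECT * FROM Employee;
```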