BigQuery table partition - google-bigquery

I have a table, e.g. emp, which is not partitioned and contains 200 TB of data.
I want to create a partitioned table from the emp table, but it should keep the name emp.
To do that, I have to first create a partitioned table emp_1 from emp, then drop emp, then re-create emp from emp_1.
This way I have to load 200 TB twice. Is there an alternate solution?

You can copy emp to emp_1. A copy job is a metadata-only operation, which is fast and free. Then you can drop emp, re-create it as a partitioned table, and load the data from emp_1 into emp.
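The steps above can be sketched as follows. The dataset name and the partitioning column are assumptions; substitute whatever column you actually want to partition on:

```sql
-- 1. Copy emp to emp_1 (metadata-only, fast and free); with the bq CLI:
--      bq cp mydataset.emp mydataset.emp_1
-- 2. Drop emp and re-create it as a partitioned table, loading from emp_1:
DROP TABLE mydataset.emp;
CREATE TABLE mydataset.emp
PARTITION BY DATE(created_at)   -- hypothetical partitioning column
AS
SELECT * FROM mydataset.emp_1;
-- 3. Once verified, drop the backup copy:
DROP TABLE mydataset.emp_1;
```

Note that the CREATE TABLE ... AS SELECT step still scans the data once; the metadata-only copy job is what avoids the second full load.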

Related

How to do update/delete operations on non-transactional table

I am a beginner with Hive and learned that update/delete operations are not supported on non-transactional tables. I don't have a clear picture of why those operations are not supported. I would also like to know whether there is a way to update a non-transactional table.
why those operations are not supported?
Transactional tables are special managed tables in Hive: they must be in ORC format, bucketed (Hive 2.0), with DbTxnManager and concurrency enabled. These properties ensure that ACID operations are possible. Every DML statement (DELETE/UPDATE/MERGE/INSERT) creates a delta file that tracks the changes, so Hive does a lot of internal work to maintain the table and its data.
wanted to know if there exists a way to update the non-transactional table.
Of course there is.
The idea is to truncate the table and re-insert the whole data set with the change applied.
You can update a field using the code below.
Assume you have an emp table and you want to update the salary to 5000 for empid = 10. For simplicity's sake, emp has only 3 columns: empid, empname, empsal.
insert overwrite table emp             -- truncate and reload emp
select empid, empname, empsal
from emp
where empid <> 10                      -- unchanged rows
union all
select empid, empname, 5000 as empsal
from emp
where empid = 10;                      -- the changed row
You can use similar SQL to delete employee id 10:
insert overwrite table emp             -- truncate and reload emp
select empid, empname, empsal
from emp
where empid <> 10;                     -- empid 10 is left out, i.e. deleted
You can do a merge too, and you can insert/update/delete using data from other sources the same way.
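A merge-style upsert on a non-transactional table follows the same truncate-and-reload pattern. A sketch, assuming a hypothetical staging table emp_updates(empid, empsal) holding the changed salaries:

```sql
insert overwrite table emp
select e.empid,
       e.empname,
       coalesce(u.empsal, e.empsal) as empsal   -- take the new salary if one exists
from emp e
left join emp_updates u
  on e.empid = u.empid;
```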

My question is how to create a VPD in Oracle with SQL that will also mask data

I am trying to create a VPD in Oracle using SQL statements. The goal is that an employee can ONLY view records for employees in the same department, with coworkers' salaries masked as NULL.
The code for the table being used is as follows
create table Employee
(
ID number primary key,
DEPT varchar2(25),
SALARY number(8,2),
NAME varchar2(25)
);
I am unsure of the best way to go about this. Would it be to create a package and use an application context? I believe I understand how to get the table to display only rows with the same "DEPT", but I'm unsure how to mask the data of those with the same DEPT but a different ID.
Native RLS will get you close but not all the way there. Using "sec_relevant_cols" gives you the option between
only seeing the rows that match your predicate, but with all values present, or
seeing all the rows, but masking the values that do not match your predicate,
whereas (if I'm reading correctly) you want to see only the predicate-matching rows AND mask out some values as well.
You could achieve this with a two-step method:
Your context contains two keys, say DEPT and YOUR_ID.
The RLS policy is "where dept = sys_context(ctx, 'DEPT')".
You have a view EMP, to which that policy is applied, defined as:
select
    id,
    dept,
    name,
    case when id = sys_context(ctx, 'YOUR_ID') then sal end sal
from EMP_TABLE
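A sketch of the plumbing around that view — all object names here are assumptions: a context package that stores the caller's id and department, a predicate function, and DBMS_RLS.ADD_POLICY to attach the policy to the view rather than the base table:

```sql
CREATE OR REPLACE CONTEXT emp_ctx USING emp_ctx_pkg;

CREATE OR REPLACE PACKAGE emp_ctx_pkg AS
  PROCEDURE set_user(p_id NUMBER);
END emp_ctx_pkg;
/
CREATE OR REPLACE PACKAGE BODY emp_ctx_pkg AS
  PROCEDURE set_user(p_id NUMBER) IS
    l_dept employee.dept%TYPE;
  BEGIN
    SELECT dept INTO l_dept FROM employee WHERE id = p_id;
    DBMS_SESSION.SET_CONTEXT('emp_ctx', 'YOUR_ID', p_id);
    DBMS_SESSION.SET_CONTEXT('emp_ctx', 'DEPT', l_dept);
  END set_user;
END emp_ctx_pkg;
/
-- Predicate function returning the RLS where-clause:
CREATE OR REPLACE FUNCTION emp_dept_predicate(p_schema VARCHAR2, p_object VARCHAR2)
  RETURN VARCHAR2 IS
BEGIN
  RETURN 'dept = SYS_CONTEXT(''emp_ctx'', ''DEPT'')';
END emp_dept_predicate;
/
BEGIN
  DBMS_RLS.ADD_POLICY(
    object_schema   => USER,
    object_name     => 'EMP',             -- the view, not the base table
    policy_name     => 'emp_dept_policy',
    function_schema => USER,
    policy_function => 'EMP_DEPT_PREDICATE');
END;
/
```

The masking of other people's salaries is then done by the CASE expression inside the view itself, while the RLS policy restricts which rows are visible at all.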

How can I alter a temporary table with 36 million rows to add a new column?

I am working with a temporary table in Netezza that contains the columns id, gender, start_date, and end_date. I want to add a new column to this table that contains a default date of 2019-01-01 for all rows. The table to which I want to add this column is a local temp table, so ALTER TABLE does not work ("Error: Operation not allowed on a temp table"). To get around this, I created a new temp table as follows:
DROP TABLE new_temp_table IF EXISTS;
GO
SELECT id, gender, start_date, end_date, '2019-01-01' default_date
INTO TEMP TABLE new_temp_table
FROM old_temp_table;
GO
This new table is limited to 1000 rows by the SELECT...INTO syntax. My old table has 36 million rows. Is there a solution that would allow me to directly modify the old table to add the new default date column, or some other way to get around the 1000-row limit with SELECT...INTO?
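One thing worth trying — a sketch, assuming CTAS is available in your Netezza environment: CREATE TABLE ... AS materializes the full result set and is not subject to the SELECT ... INTO row cap:

```sql
DROP TABLE new_temp_table IF EXISTS;
CREATE TEMP TABLE new_temp_table AS
SELECT id, gender, start_date, end_date,
       DATE '2019-01-01' AS default_date   -- explicit DATE keeps the column typed as a date, not varchar
FROM old_temp_table;
```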

How to retrieve a dropped table?

I accidentally dropped a table called DEPARTMENT from my Oracle database and I want to restore it. So I googled and found the solution.
Here is what I did:
SHOW RECYCLEBIN;
CRIMINALS BIN$hqnw1JViXO/gUwPAcgqn3A==$0 TABLE 2019-04-16:13:17:16
DEPARTMENT BIN$hqnw1JVjXO/gUwPAcgqn3A==$0 TABLE 2019-04-16:13:19:04
DEPARTMENT BIN$hqnw1JVkXO/gUwPAcgqn3A==$0 TABLE 2019-04-16:13:21:23
DEPARTMENT BIN$hqnw1JVnXO/gUwPAcgqn3A==$0 TABLE 2019-04-16:13:36:34
FLASHBACK table department TO BEFORE DROP;
Flashback succeeded.
As you can see from the SHOW RECYCLEBIN output, there is more than one DEPARTMENT table, and all of them have different content. My question is: how can I get the content of all 3 tables into one?
After each flashback, rename the restored DEPARTMENT table to a new name, e.g.
rename department to dept_1;
Do that for all of them but the last one (whose name will remain DEPARTMENT). Then insert the rest of the data into it:
insert into department
select * from dept_1
union all
select * from dept_2;
Note that uniqueness might be violated; and if the table's structure changed between versions, select * might not work (so you'll have to name all the columns, one by one).
But, generally speaking, that's the idea of how to do it.
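As a shortcut, FLASHBACK ... TO BEFORE DROP also accepts a RENAME TO clause, so each copy can be restored under a distinct name in one step. Each FLASHBACK restores the most recently dropped copy first:

```sql
FLASHBACK TABLE department TO BEFORE DROP RENAME TO department;  -- 13:36:34 copy
FLASHBACK TABLE department TO BEFORE DROP RENAME TO dept_2;      -- 13:21:23 copy
FLASHBACK TABLE department TO BEFORE DROP RENAME TO dept_1;      -- 13:19:04 copy
```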

Update multiple columns in a table which has millions of records

I have a table with millions of records, and I added two new columns:
alter table hr.employees add (ind char(1 byte), remove char(1 byte));
I have another view, hr.department, which has more data than this table and already contains these two columns.
So when I write an update for these records, it takes very long:
update hr.employees a
set (ind, remove) = (select b.ind, b.remove
                     from hr.department b
                     where a.dept_id = b.dept_id);
It's been an hour and the update is still running. Can someone help with this?
If your table has millions of rows, using an UPDATE will very likely take too much time.
I would rename the old table, create a new table with the new columns already filled, then add indexes, constraints, comments and gather statistics.
RENAME employees to employees_old;
CREATE TABLE employees AS
SELECT a.col1, a.col2, ... a.coln, b.ind, b.remove
FROM employees_old a
LEFT JOIN department b
ON a.dept_id = b.dept_id;
The simplest way to handle this update will likely be the following:
Find a time when no one else will be critically using the system.
Make a backup.
If you don't already have them, save off the DDL for re-creating all indexes/constraints on that table.
Drop all indexes/constraints on that table.
Update your two columns.
Recreate indexes/constraints (this may take some time, but usually orders of magnitude less than updating every row in a big table with multiple indexes).
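If you stay with an in-place update, a MERGE is often faster than a correlated-subquery UPDATE because it reads hr.department only once. A sketch, assuming dept_id matches each employee to at most one hr.department row:

```sql
MERGE INTO hr.employees a
USING hr.department b
ON (a.dept_id = b.dept_id)
WHEN MATCHED THEN
  UPDATE SET a.ind = b.ind, a.remove = b.remove;
```

Unlike the original correlated UPDATE, this also leaves rows with no matching dept_id untouched instead of setting their new columns to NULL.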
As usual, Burleson Consulting is a good resource: http://www.dba-oracle.com/t_efficient_update_sql_dml_tips.htm