BigQuery: Loop through tables in dataset and drop tables with a specific prefix - google-bigquery

Our BigQuery development environment is isolated to a single development dataset, dev. Within that dataset, environments are further separated by prefixing each table with its ticket number, e.g. DATA-100-change-table corresponds to ticket DATA-100.
I am aware that a TTL (table expiration) can be set in BigQuery; however, I would also like a query I can run by hand to delete the tables.
So far, I have the below:
begin
-- Create temporary table of tables to drop from `dev` with prefix
create temp table to_drop (drop_string STRING)
as
(
select concat("drop table if exists ", table_catalog, ".", table_schema, ".", table_name)
from dev.INFORMATION_SCHEMA.TABLES
where table_name like "DATA-100%"
);
-- Loop through table and execute drop_string statements
for drop_statement in (select drop_string from to_drop)
do
execute immediate drop_statement;
end for;
end
However, this fails with the following error:
Query error: Cannot coerce expression drop_statement to type STRING at [14:23]
Is my approach right here? How do I best delete all tables with a prefix in BigQuery?
Also, if possible, I would like this query to handle views as well.

The variable drop_statement in the for loop contains a struct. So you have to access the string with drop_statement.drop_string.
begin
-- Create temporary table of tables to drop from `dev` with prefix
create temp table to_drop (drop_string STRING)
as
(
select concat("drop table if exists ", table_catalog, ".", table_schema, ".", table_name)
from dev.INFORMATION_SCHEMA.TABLES
where table_name like "DATA-100%" and table_type = 'BASE TABLE'
);
-- Loop through table and execute drop_string statements
for drop_statement in (select drop_string from to_drop)
do
execute immediate drop_statement.drop_string;
end for;
end
To drop views, replace table_type = 'BASE TABLE' with table_type = 'VIEW' and use drop view if exists instead of drop table if exists.

Using p13rr0m's answer as inspiration, I've come up with the below:
begin
create temp table to_drop (drop_string STRING)
as
(
select
case
when table_type = "BASE TABLE" then concat("drop table if exists `", table_catalog, ".", table_schema, ".", table_name, "`")
when table_type = "VIEW" then concat("drop view if exists `", table_catalog, ".", table_schema, ".", table_name, "`")
end as drop_string
from
`dev.INFORMATION_SCHEMA.TABLES`
where
table_name like "DATA-100%"
);
for drop_statement in (select drop_string from to_drop)
do
execute immediate drop_statement.drop_string;
end for;
end
I had to wrap the table names in backticks, as the statements were otherwise failing when dropping views.
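The statement-building step can also be done outside BigQuery, which makes it easy to review the statements before running anything. Below is a minimal Python sketch, assuming the rows have already been fetched from INFORMATION_SCHEMA.TABLES as (table_catalog, table_schema, table_name, table_type) tuples. The function name build_drop_statements and the project name my-project are hypothetical, chosen only for illustration.

```python
def build_drop_statements(rows, prefix):
    """Return DROP statements for tables and views whose name starts with prefix.

    rows: iterable of (table_catalog, table_schema, table_name, table_type)
    tuples, shaped like the columns of INFORMATION_SCHEMA.TABLES.
    """
    keywords = {"BASE TABLE": "table", "VIEW": "view"}
    statements = []
    for catalog, schema, name, table_type in rows:
        if not name.startswith(prefix):
            continue
        keyword = keywords.get(table_type)
        if keyword is None:
            # Skip any other object type instead of emitting an empty statement.
            continue
        # Backticks are needed because names such as DATA-100-x contain hyphens.
        statements.append(f"drop {keyword} if exists `{catalog}.{schema}.{name}`")
    return statements


# Hypothetical sample rows; "my-project" is an assumed project id.
rows = [
    ("my-project", "dev", "DATA-100-change-table", "BASE TABLE"),
    ("my-project", "dev", "DATA-100-report", "VIEW"),
    ("my-project", "dev", "DATA-200-other", "BASE TABLE"),
]
for statement in build_drop_statements(rows, "DATA-100"):
    print(statement)
```

Note that the CASE expression in the query above yields NULL for any table type other than BASE TABLE or VIEW, whereas this sketch simply skips such rows.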

Related

Drop multiple tables in the same DB -starts with same prefix - Big Query

How can I drop multiple tables in the same database that starts with same prefix?
Ex:
Query to delete 1 table
drop table project_id.db.test_table_<some_random_string>
But how can I drop all tables that start with the same prefix test_table_ in the same db?
A possible workaround would be the following (the region is set to EU):
BEGIN
DECLARE drop_statements ARRAY<string>;
DECLARE len int64 default 1;
SET drop_statements = (SELECT ARRAY_AGG( 'drop table ' || table_schema ||'.' || table_name)
FROM `region-eu.INFORMATION_SCHEMA.TABLES`
WHERE table_schema = 'db' and table_name like 'Table_Prefix%'
);
WHILE ARRAY_LENGTH(drop_statements) >= len DO
EXECUTE IMMEDIATE drop_statements[offset(len-1)];
SET len = len +1;
END WHILE ;
END;
You can query either of the INFORMATION_SCHEMA views below.
-- Returns metadata for tables in a single dataset.
SELECT * FROM myDataset.INFORMATION_SCHEMA.TABLES;
-- Returns metadata for tables in a region.
SELECT * FROM region-us.INFORMATION_SCHEMA.TABLES;
Similar to narendra's solution, but using FOR...IN:
FOR drop_statement IN
(SELECT CONCAT("drop table ",table_schema,".", table_name, ";" ) AS value
FROM dataset.INFORMATION_SCHEMA.TABLES -- or region.INFORMATION_SCHEMA.TABLES
WHERE table_name LIKE "table_prefix%"
ORDER BY table_name DESC)
DO
EXECUTE IMMEDIATE(drop_statement.value); -- Here the table is dropped
END FOR;
It is also worth mentioning that you can change "table" to "view", depending on the table type returned from the INFORMATION_SCHEMA views.

How to delete every table in a specific schema in postgres?

How do I delete all the tables I have in a specific schema? Only the tables in the schema should be deleted.
I already have all the table names, which I fetched with the code below, but how do I delete all those tables?
The following is some psycopg2 code, and below that is the SQL generated
writeCon.execute("SELECT table_name FROM information_schema.tables WHERE table_schema='mySchema'")
SELECT table_name FROM information_schema.tables WHERE table_schema='mySchema'
You can use an anonymous code block for that.
WARNING: This code is playing with DROP TABLE statements, and they are really mean if you make a mistake ;) The CASCADE option drops all depending objects as well. Use it with care!
DO $$
DECLARE
row record;
BEGIN
FOR row IN SELECT * FROM pg_tables WHERE schemaname = 'mySchema'
LOOP
EXECUTE 'DROP TABLE ' || quote_ident(row.schemaname) || '.' || quote_ident(row.tablename) || ' CASCADE';
END LOOP;
END;
$$;
In case you want to drop everything in your schema, including wrappers, sequences, etc., consider dropping the schema itself and creating it again:
DROP SCHEMA mySchema CASCADE;
CREATE SCHEMA mySchema;
For a single-line command, you can use psql and its \gexec functionality:
SELECT format('DROP TABLE %I.%I', table_schema, table_name)
FROM information_schema.tables
WHERE table_schema = 'mySchema' \gexec
That will run the query and execute each result string as an SQL command.
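Both quote_ident() and the %I placeholder of format() exist to protect table names that need quoting. As a rough illustration of the quoting rule itself, here is a small Python sketch. The helpers quote_ident_always and drop_statement are hypothetical names, and unlike PostgreSQL's quote_ident this version always adds quotes rather than only when they are required.

```python
def quote_ident_always(name):
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


def drop_statement(schema, table):
    """Build a DROP TABLE statement with both identifiers quoted."""
    return f"DROP TABLE {quote_ident_always(schema)}.{quote_ident_always(table)}"


# A mixed-case schema and a table name containing a space both survive intact.
print(drop_statement("mySchema", "BASE TABLE"))
```

Quoting preserves the case of mySchema; an unquoted mySchema would be folded to myschema by PostgreSQL.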

Drop all tables in a Redshift schema - without dropping permissions

I would like to drop all tables in a Redshift schema. Even though this solution works
DROP SCHEMA public CASCADE;
CREATE SCHEMA public;
it is NOT good for me, since it drops the schema permissions as well.
A solution like
DO $$ DECLARE
r RECORD;
BEGIN
-- if the schema you operate on is not "current", you will want to
-- replace current_schema() in query with 'schematodeletetablesfrom'
-- *and* update the generate 'DROP...' accordingly.
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE IF EXISTS ' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
as reported in this thread How can I drop all the tables in a PostgreSQL database?
would be ideal. Unfortunately it doesn't work on Redshift (apparently there is no support for for loops).
Is there any other solution to achieve it?
Run this SQL, then copy and paste the result into your SQL client.
If you want to do it programmatically, you need to build a little bit of code around it.
SELECT 'DROP TABLE IF EXISTS ' || tablename || ' CASCADE;'
FROM pg_tables
WHERE schemaname = '<your_schema>'
I solved it through a procedure that deletes all records. Using this technique to truncate fails, but deleting works fine for my purposes.
create or replace procedure sp_truncate_dwh() as $$
DECLARE
tables RECORD;
BEGIN
FOR tables in SELECT tablename
FROM pg_tables
WHERE schemaname = 'dwh'
order by tablename
LOOP
EXECUTE 'delete from dwh.' || quote_ident(tables.tablename) ;
END LOOP;
RETURN;
END;
$$ LANGUAGE plpgsql;
--call sp_truncate_dwh()
In addition to demircioglu's answer, I had to add a COMMIT after every drop statement to drop all tables in my schema:
SELECT 'DROP TABLE IF EXISTS ' || tablename || ' CASCADE; COMMIT;' FROM pg_tables WHERE schemaname = '<your_schema>'
Using Python and psycopg2 locally on my PC, I came up with this script to delete all tables in a schema:
import logging
import psycopg2
from psycopg2 import sql

logger = logging.getLogger(__name__)
schema = "schema_to_be_deleted"
conn = None  # defined up front so the finally block works even if connect() fails
try:
    conn = psycopg2.connect("dbname='{}' port='{}' host='{}' user='{}' password='{}'".format("DB_NAME", "DB_PORT", "DB_HOST", "DB_USER", "DB_PWD"))
    cursor = conn.cursor()
    # Pass the schema name as a bound parameter instead of formatting it into the SQL
    cursor.execute("SELECT tablename FROM pg_tables WHERE schemaname = %s", (schema,))
    rows = cursor.fetchall()
    for row in rows:
        # sql.Identifier quotes the schema and table names safely
        cursor.execute(sql.SQL("DROP TABLE {}.{}").format(sql.Identifier(schema), sql.Identifier(row[0])))
    cursor.close()
    conn.commit()
except psycopg2.DatabaseError as error:
    logger.error(error)
finally:
    if conn is not None:
        conn.close()
Replace DB_NAME, DB_PORT, DB_HOST, DB_USER and DB_PWD with the correct values to connect to the Redshift DB.
The following recipe differs from the other answers in that it generates a single SQL statement for all the tables we are going to delete.
SELECT
'DROP TABLE ' ||
LISTAGG("table", ', ') ||
';'
FROM
svv_table_info
WHERE
"table" LIKE 'staging_%';
Example result:
DROP TABLE staging_077815128468462e9de8ca6fec22f284, staging_abc, staging_123;
As in other answers, you will need to copy the generated SQL and execute it separately.
References
|| operator concatenates strings
LISTAGG function concatenates every table name into a string with a separator
The table svv_table_info is used because LISTAGG doesn't want to work with pg_tables for me. Complaint:
One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT, etc
UPD. I just now noticed that SVV_TABLE_INFO page says:
The SVV_TABLE_INFO view doesn't return any information for empty tables.
...which means empty tables will not be in the list returned by this query. I usually delete transient tables to save disk space, so this does not bother me much; but in general this factor should be considered.
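If the table names are already available on the client side, the same single-statement idea can be sketched in Python. This is only an illustration of what the LISTAGG query produces; build_single_drop is a hypothetical helper, and the names are assumed to be plain identifiers that need no quoting, as in the query above.

```python
def build_single_drop(table_names):
    """Return one DROP TABLE statement covering all names, or None if there are none."""
    names = list(table_names)
    if not names:
        # Nothing matched, so there is no valid statement to generate.
        return None
    return "DROP TABLE " + ", ".join(names) + ";"


print(build_single_drop(["staging_abc", "staging_123"]))
```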

truncate script tables in schema does not work

I am trying to truncate all the tables in my PostgreSQL database.
I am using a simple SQL script:
SELECT 'TRUNCATE ' || table_name || ';'
FROM information_schema.tables WHERE table_schema='sda' AND table_type='BASE TABLE';
Unfortunately, it does not work because many relations do not exist.
Please help. (I am using PostgreSQL 9.2.)
prepare:
t=# create schema sda;
CREATE SCHEMA
t=# create table sda."BASE TABLE"();
CREATE TABLE
try:
t=# SELECT format('TRUNCATE %I;',table_name)
FROM information_schema.tables WHERE table_schema='sda' AND table_type = 'BASE TABLE';
format
------------------------
TRUNCATE "BASE TABLE";
(1 row)
t=# TRUNCATE "BASE TABLE";
TRUNCATE TABLE
So I assume you just did not treat the table name as an identifier, like:
t=# TRUNCATE BASE TABLE;
ERROR: syntax error at or near "TABLE"
LINE 1: TRUNCATE BASE TABLE;
Also, upper-case names containing spaces often lead to human error; it is better to use standard names that do not need quoting.

DB2 Drop table if exists equivalent

I need to drop a DB2 table if it exists, or drop and ignore errors.
Try this one:
IF EXISTS (SELECT name FROM sysibm.systables WHERE name = 'tab_name') THEN
DROP TABLE tab_name;
END IF;
The below worked for me in DB2 which queries the SYSCAT.TABLES view to check if the table exists. If yes, it prepares and executes the DROP TABLE statement.
BEGIN
IF EXISTS (SELECT TABNAME FROM SYSCAT.TABLES WHERE TABSCHEMA = 'SCHEMA_NAME' AND TABNAME = 'TABLE_NAME') THEN
PREPARE stmt FROM 'DROP TABLE SCHEMA_NAME.TABLE_NAME';
EXECUTE stmt;
END IF;
END
Which system table to query depends on the platform: if you are on AS/400 (Power i, System i), the system table name is QSYS2.SYSTABLES; otherwise try sysibm.systables or syscat.tables.
BEGIN
IF EXISTS (SELECT NAME FROM QSYS2.SYSTABLES WHERE TABLE_SCHEMA = 'YOURLIBINUPPER' AND TABLE_NAME = 'YOURTABLENAMEINUPPER') THEN
DROP TABLE YOURLIBINUPPER.YOURTABLENAMEINUPPER;
END IF;
END ;
This is simpler and works for me:
DROP TABLE SCHEMA.TEST IF EXISTS;
First query if the table exists, like
select tabname from syscat.tables where tabschema='myschema' and tabname='mytable'
and if it returns something issue your
drop table myschema.mytable
Another possibility is to just issue the drop command and catch the exception that is raised if the table does not exist. For that approach, put the code inside a try {...} catch (Exception e) { // Ignore } block.
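The "drop and ignore the error" pattern can be sketched in Python as follows. This is an illustration only: it uses the standard-library sqlite3 module as a stand-in because it runs anywhere, so the exception class and the "no such table" message are SQLite's, not Db2's. With a Db2 driver the check would instead be against the undefined-name error (SQLCODE -204). The function name drop_table_ignore_missing is hypothetical.

```python
import sqlite3


def drop_table_ignore_missing(conn, table_name):
    """Drop table_name; return True if it was dropped, False if it did not exist."""
    try:
        # table_name is assumed to be a trusted identifier, not user input.
        conn.execute(f"DROP TABLE {table_name}")
        return True
    except sqlite3.OperationalError as error:
        if "no such table" in str(error):
            return False
        raise  # any other error is a real problem and should not be swallowed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER)")
print(drop_table_ignore_missing(conn, "mytable"))  # table exists on the first call
print(drop_table_ignore_missing(conn, "mytable"))  # table is already gone
conn.close()
```

Catching only the "table does not exist" error, rather than every exception, avoids silently hiding permission or connection problems.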
To complement the other answers here, if you want to be ANSI compatible you could also use the queries below. They should work for IBM i and LUW:
SELECT * FROM information_schema.tables WHERE TABLE_SCHEMA = 'MY_SCHEMA' AND TABLE_NAME = 'MY_TABLE';
then if any result is returned:
DROP TABLE MY_SCHEMA.MY_TABLE;