Pypika vacuum and analyze redshift - pypika

How do I build PyPika SQL for the following statements in Redshift:
vacuum schema1.table1
analyze schema1.table1
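PyPika's query builder focuses on SELECT/INSERT/UPDATE/DELETE and does not appear to cover maintenance statements like VACUUM or ANALYZE, so a common approach is to emit them as plain strings. A minimal sketch, assuming simple string formatting is acceptable (the helper name is illustrative, not part of PyPika's API):

```python
# VACUUM and ANALYZE are maintenance commands rather than query-builder
# constructs, so here we simply format them as strings for Redshift.
def maintenance_sql(command: str, schema: str, table: str) -> str:
    """Render e.g. 'vacuum schema1.table1' for execution on Redshift."""
    if command.lower() not in ("vacuum", "analyze"):
        raise ValueError(f"unsupported maintenance command: {command}")
    return f"{command.lower()} {schema}.{table}"

vacuum_sql = maintenance_sql("vacuum", "schema1", "table1")
analyze_sql = maintenance_sql("analyze", "schema1", "table1")
```

The resulting strings can be passed to whatever cursor or connection you already use to execute PyPika-generated queries.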

Related

Oracle DDL statements (like CTAS) not shown in V$SQL after execution

Why are Oracle DDL statements (like CTAS) not shown in the V$SQL view after they are executed?
How can I get the SQL_ID of such a statement? I want to use the SQL_ID in SQL plan baselines. Thanks.
V$SQL shows just the first 20 characters for CTAS. It is a bug in Oracle Database 11g. For more details, see: https://stackoverflow.com/questions/27623646/oracle-sql-text-truncated-to-20-characters-for-create-alter-grant-statements/28087571#28087571
CTAS operations do appear in the v$sql view:
SQL> create table t1 as select * from dba_objects ;
Table created.
SQL> select sql_text,sql_id from v$sql where sql_text like '%create table t1 as select%' ;
SQL_TEXT
--------------------------------------------------------------------------------
SQL_ID
-------------
create table t1 as select * from dba_objects
4j5kv6x7cz5r7
select sql_text,sql_id from v$sql where sql_text like '%create table t1 as selec
t%'
5n4xnjkt3vz3h
SQL plan baselines are part of SPM (SQL Plan Management):
A SQL plan baseline is a set of accepted plans that
the optimizer is allowed to use for a SQL statement. In the typical
use case, the database accepts a plan into the plan baseline only
after verifying that the plan performs well. In this context, a plan
includes all plan-related information (for example, SQL plan
identifier, set of hints, bind values, and optimizer environment) that
the optimizer needs to reproduce an execution plan.
If you are using a CTAS recurrently, I guess you are doing it in batch mode, dropping the table and then recreating it afterwards with the mentioned CTAS command. I would rather look at what the problem is with the SELECT part of that statement.
But SQL plan baselines are more focused on queries whose plans can be fixed and evolved, since the optimizer does not always choose the best one.

count(*) on Avro table returns 0

I've recently moved to using AvroSerDe for my External tables in Hive.
Select col_name,count(*)
from table
group by col_name;
The above query gives me a count, whereas the query below does not:
Select count(*)
from table;
The reason is that Hive just looks at the table metadata and fetches the count from there. The statistics for the table were not updated in Hive, which is why count(*) returns 0.
The statistics are written with zero data rows at the time of table creation, and for any data appends or changes Hive needs to update them in the metadata.
Running the ANALYZE command gathers statistics and writes them into the Hive metastore.
ANALYZE TABLE table_name COMPUTE STATISTICS;
Visit the Apache Hive wiki for more details about the ANALYZE command.
Other methods to solve this issue:
Using a LIMIT or GROUP BY clause triggers a MapReduce job to count the rows and returns the correct value.
Setting fetch task conversion to none forces Hive to run a MapReduce job to count the rows:
hive> set hive.fetch.task.conversion=none;

sqlite3 DB doesn't use index unless vacuumed

Performing VACUUM on my DB significantly improves query performance. While trying to determine why, I found that sqlite3 isn't using the index on the DB in its original state, just a generic SCAN TABLE.
QUERY PLAN
|--SCAN TABLE data <--- no Index
|--USE TEMP B-TREE FOR GROUP BY
`--USE TEMP B-TREE FOR ORDER BY
After performing VACUUM, the QUERY PLAN shows a SEARCH USING INDEX as it should:
QUERY PLAN
|--SEARCH TABLE data USING INDEX index_name (name=?)
|--USE TEMP B-TREE FOR GROUP BY
`--USE TEMP B-TREE FOR ORDER BY
How can I determine why the index isn't being used before the vacuum operation?
I have the EXPLAIN results as well, but I'm not sure they'd be useful. They are clearly different (the original, non-vacuumed DB performs a Rewind/Loop, whereas the vacuumed DB OpenReads the index).
Thank you,

Oracle EXPLAIN PLAN FOR Returns Nothing

I run the following query on an Oracle database:
EXPLAIN PLAN FOR
SELECT *
FROM table_name
However, it's not returning any data. When I remove the EXPLAIN PLAN FOR clause, the query runs as expected. Thanks for the help!
In case it's relevant, I'm accessing the database through Teradata and also a Jupyter IPython notebook.
From Using EXPLAIN PLAN:
The PLAN_TABLE is automatically created as a global temporary table to hold the output of an EXPLAIN PLAN statement for all users. PLAN_TABLE is the default sample output table into which the EXPLAIN PLAN statement inserts rows describing execution plans
EXPLAIN PLAN FOR SELECT last_name FROM employees;
This explains the plan into the PLAN_TABLE table. You can then select the execution plan from PLAN_TABLE.
Displaying PLAN_TABLE Output
UTLXPLS.SQL
UTLXPLP.SQL
DBMS_XPLAN.DISPLAY table function
I suggest using:
EXPLAIN PLAN FOR SELECT * FROM table_name;
SELECT * FROM TABLE(dbms_xplan.display);

When using functions on PostgreSQL partitioned tables, it does a full table scan

The tables are partitioned in a PostgreSQL 9 database. When I run the following script:
select * from node_channel_summary
where report_date between '2012-11-01' AND '2012-11-02';
it fetches the data from the proper partitions without doing a full table scan. Yet if I run this script:
select * from node_channel_summary
where report_date between trunc(sysdate)-30 AND trunc(sysdate)-29;
in this case it does a full table scan, whose performance is unacceptable. The -30 and -29 will be replaced by parameters.
After doing some research, it appears Postgres doesn't handle partition pruning properly when the predicate uses functions.
Does somebody know a workaround to resolve this issue?
The issue is that PostgreSQL calculates and caches execution plans when you compile the function. This is a problem for partitioned tables because PostgreSQL uses the query planner to eliminate partitions. You can get around this by specifying your query as a string, forcing PostgreSQL to re-parse and re-plan your query at run time:
FOR row IN EXECUTE 'select * from node_channel_summary where report_date between trunc(sysdate)-30 AND trunc(sysdate)-29' LOOP
-- ...
END LOOP;
-- or
RETURN QUERY EXECUTE 'select * from ...'