Oracle 11g: Metadata query very slow - sql

I have this view that should display comments and constraints (including check conditions where applicable) for the columns of some tables in a schema.
Essentially I'm (left) joining ALL_COL_COMMENTS to ALL_CONS_COLUMNS to ALL_CONSTRAINTS.
However, this is really slow for some reason (it takes around 10 seconds), even though I have a very small number of tables (just 7) and a very small number of columns (58 in total). So the query returns few results, and it's still slow. What can I do?
CREATE OR REPLACE FORCE VIEW "MYDB"."COMMENTS_VIEW" ("TABLE_NAME", "COLUMN_NAME", "COMMENTS", "CONSTRAINT_TYPE", "CHECK_CONDITION") AS
SELECT r.TABLE_NAME, r.COLUMN_NAME, r.COMMENTS, DECODE(q.CONSTRAINT_TYPE,'P', 'Primary Key', 'C', 'Check Constraint', 'R', 'Referential Integrity Constraint' ), q.SEARCH_CONDITION AS CHECK_CONDITION
FROM ALL_COL_COMMENTS r -- ALL_COL_COMMENTS has the COMMENTS
LEFT JOIN ALL_CONS_COLUMNS p ON (p.TABLE_NAME = r.TABLE_NAME AND p.OWNER = 'MYDB' AND p.COLUMN_NAME = r.COLUMN_NAME) -- ALL_CONS_COLUMNS links COLUMNS to CONSTRAINTS
LEFT JOIN ALL_CONSTRAINTS q ON (q.OWNER = 'MYDB' AND q.CONSTRAINT_NAME = p.CONSTRAINT_NAME AND q.TABLE_NAME = p.TABLE_NAME AND (q.CONSTRAINT_TYPE = 'C' OR q.CONSTRAINT_TYPE = 'P' OR q.CONSTRAINT_TYPE = 'R' ) ) -- this gives us INFO on CONSTRAINTS
WHERE r.OWNER = 'MYDB'
AND
r.TABLE_NAME IN ('TABLE1', 'TABLE2', 'TABLE3', 'TABLE4', 'TABLE5', 'TABLE6', 'TABLE7')
AND
r.COLUMN_NAME NOT IN ('CREATED', 'MODIFIED', 'CREATED_BY', 'MODIFIED_BY')
ORDER BY r.TABLE_NAME, r.COLUMN_NAME, r.COMMENTS;

Ensure the dictionary and fixed object statistics are up-to-date. Checking for up-to-date statistics is a good first step for almost any SQL performance problem. The dictionary and fixed objects are unusual, and there's a good chance nobody has considered gathering statistics on them before.
begin
  dbms_stats.gather_fixed_objects_stats;
  dbms_stats.gather_dictionary_stats;
end;
/

Try to join on table and column IDs instead of names where possible, and even on the owner if you can. Example:
ON p.TABLE_ID = r.TABLE_ID
Also, you are selecting from objects that are themselves views over who knows how many underlying tables. The query optimizer is probably having a hard time (and maybe giving up in some respects). Try to translate your query to use the base tables instead.

I would either use a query profiler, or (simpler) just remove parts of your query until it gets fast again. For example, remove the DECODE() call; maybe that's the culprit.
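Another quick check before cutting the query apart is to look at the execution plan. A sketch using DBMS_XPLAN (COMMENTS_VIEW is the view name from the question; your schema name may differ):

```sql
-- Capture the optimizer's plan for selecting from the view
EXPLAIN PLAN FOR
SELECT * FROM MYDB.COMMENTS_VIEW;

-- Display the captured plan
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Steps with large estimated row counts or full scans of the underlying dictionary tables usually point at where the time is going.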

Related

Greenplum PSQL Format for Dynamic Query

Firstly, thank you in advance for any help with my relatively simple issue below. It's honestly driving me insane!
Simply put, I'm trying to select some metrics on all tables in a schema. However, this specifically includes partitioned tables in Greenplum (which, for those who don't know, have a single parent table named X and then child tables named X_1_prt_3, X_1_prt_4, etc.).
As a result, my query in trying to get the total table size for the single partitioned table X is as follows:
-- Part 1
select cast(sum(sotaidtablesize) as bigint) / 1024 / 1024 as "Table Size (MB)"
from gp_toolkit.gp_size_of_table_and_indexes_disk
where sotaidschemaname = 'Y'
and sotaidtablename like 'X%'
;
This sums up the table size for any table named X or similar, which is effectively what I want. But this is just part of a bigger query; I don't want to actually specify the schema and table. I want it to be:
-- Part 2
where sotaidschemaname = t4.nspname
and sotaidtablename like 't4.relname%'
but that sadly doesn't just work (what a world that would be!!). I've tried the following, which I think is close, but I cannot get it to return any value other than NULL:
-- Part 3
and sotaidtablename like quote_literal(format( '%I', tablename )::regclass)
where tablename is a column from another part of the query (I already use this column in another format() call that works correctly, so I know this bit in particular isn't the issue).
Thank you in advance to anyone for any help!
Regards,
Vinny
I find it easier to use gp_size_of_table_and_indexes_disk.sotaidoid in the join clause rather than (sotaidschemaname, sotaidtablename).
For example:
SELECT pg_namespace.nspname AS schema,
pg_class.relname AS relation,
pg_size_pretty(sotd.sotdsize::BIGINT) as tablesize,
pg_size_pretty(sotd.sotdtoastsize::BIGINT) as toastsize,
pg_size_pretty(sotd.sotdadditionalsize::BIGINT) as othersize,
pg_size_pretty(sotaid.sotaidtablesize::BIGINT) as tabledisksize,
pg_size_pretty(sotaid.sotaididxsize::BIGINT) as indexsize
FROM pg_class
LEFT JOIN pg_stat_user_tables
ON pg_stat_user_tables.relid = pg_class.oid
LEFT JOIN gp_toolkit.gp_size_of_table_disk sotd
ON sotd.sotdoid = pg_class.oid
LEFT JOIN gp_toolkit.gp_size_of_table_and_indexes_disk sotaid
ON sotaid.sotaidoid = pg_class.oid
LEFT JOIN pg_namespace
ON pg_namespace.oid = pg_class.relnamespace
WHERE
pg_class.relkind = 'r'
AND relstorage != 'x'
AND pg_namespace.nspname NOT IN ('information_schema', 'madlib', 'pg_catalog', 'gptext')
AND pg_class.relname NOT IN ('spatial_ref_sys');

Translating query from Firebird to PostgreSQL

I have a Firebird query which I should rewrite into PostgreSQL code.
SELECT TRIM(RL.RDB$RELATION_NAME), TRIM(FR.RDB$FIELD_NAME), FS.RDB$FIELD_TYPE
FROM RDB$RELATIONS RL
LEFT OUTER JOIN RDB$RELATION_FIELDS FR ON FR.RDB$RELATION_NAME = RL.RDB$RELATION_NAME
LEFT OUTER JOIN RDB$FIELDS FS ON FS.RDB$FIELD_NAME = FR.RDB$FIELD_SOURCE
WHERE (RL.RDB$VIEW_BLR IS NULL)
ORDER BY RL.RDB$RELATION_NAME, FR.RDB$FIELD_NAME
I understand SQL, but have no idea how to work with system tables like RDB$RELATIONS, etc. It would be really great if someone helped me with this, but even some links explaining these tables would be OK.
This query is embedded in C++ code, and when I try to run it:
pqxx::connection conn(serverAddress.str());
pqxx::work trans(conn);
pqxx::result res(trans.exec(/* the SQL query above */)); // this call fails
it reports that:
RDB$RELATIONS doesn't exist.
Postgres has another way of storing information about system content, called system catalogs.
In Firebird, your query basically returns a row for every column of every table in every schema, with an additional integer column that maps to a field datatype.
In Postgres, something similar can be achieved with the system tables in the pg_catalog schema, using this query:
SELECT
TRIM(c.relname) AS table_name, TRIM(a.attname) AS column_name, a.atttypid AS field_type
FROM pg_class c
LEFT JOIN pg_attribute a ON
c.oid = a.attrelid
AND a.attnum > 0 -- only ordinary columns, without system ones
WHERE c.relkind = 'r' -- only tables
ORDER BY 1,2
The query above returns system catalogs as well. If you'd like to exclude them, you need to add another JOIN to pg_namespace and a WHERE clause with pg_namespace.nspname <> 'pg_catalog', because that is the schema where the system catalogs are stored.
If you'd also like to see datatype names instead of their numeric representations, add a JOIN to pg_type.
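Putting both suggestions together, a sketch might look like the following (the column aliases are illustrative; format_type() would also work for rendering the type name):

```sql
SELECT n.nspname AS schema_name,
       c.relname AS table_name,
       a.attname AS column_name,
       t.typname AS type_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
LEFT JOIN pg_attribute a ON a.attrelid = c.oid
       AND a.attnum > 0          -- only ordinary columns
       AND NOT a.attisdropped    -- skip dropped columns
LEFT JOIN pg_type t ON t.oid = a.atttypid
WHERE c.relkind = 'r'            -- only tables
  AND n.nspname <> 'pg_catalog'  -- exclude system catalogs
ORDER BY 1, 2, a.attnum;
```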
The information schema consists of a collection of views. In most cases you don't need the entire SQL query that stands behind a view, so using the system tables directly will give you better performance. You can inspect the view definitions, though, just to get started on the tables and conditions used to form the output.
I think you are looking for the information_schema.
The tables are listed here: https://www.postgresql.org/docs/current/static/information-schema.html
So for example you can use:
select * from information_schema.tables;
select * from information_schema.columns;

Sample Oracle SQL randomly - in absence of ROWID

I have a weird problem using the SAMPLE clause. Why does the first SQL statement not work, while the second one works fine?
SELECT * FROM SYS.ALL_TABLES SAMPLE(10)
SELECT * FROM MIDAS.GERMPLASM SAMPLE(10)
I'm trying to SAMPLE a SQL query, not just a table, but I could not figure out how to use the SAMPLE clause for that. Is there any other way besides the SAMPLE clause? Note: I want to do this in a random manner, not just take the first N rows.
Update:
First of all, thank you for reading this question to help. But I already know that this SQL does not work because the SAMPLE clause uses a hidden column, ROWID. What I don't know is how to do this if ROWID does not exist in the table.
Here is a reproducible example SQL that I try to SAMPLE randomly:
SELECT cols.table_name, cols.column_name, cols.position, cons.status, cons.owner, cons.constraint_type
FROM all_constraints cons, all_cons_columns cols
WHERE cons.constraint_name = cols.constraint_name
AND cons.owner = cols.owner
ORDER BY cols.table_name, cols.position
I want to get a small random subset of data (from the query) to compute statistical properties of table columns before fetching everything from the DB.
Thank you
The error message you get when you try to run the first query is a pretty big clue:
ORA-01446: cannot select ROWID from, or sample, a view with DISTINCT, GROUP BY, etc.
It's pretty clear from this that the SAMPLE functionality requires access to ROWID to work. As ROWID is a pseudocolumn that the database uses to physically locate a row, any query where the ROWID is indeterminate (such as when the data is aggregated) cannot use SAMPLE on the outer query. In the case of ALL_ALL_TABLES, the fact that it is a view combining two tables via UNION blocks access to the ROWID.
From your revised question, the first thing that jumps out at me is that the SAMPLE clause must be in the FROM clause, between the table name and any alias. I was able to sample in a query with joins like this:
SELECT *
FROM table_a SAMPLE (10) a
JOIN table_b SAMPLE (10) b
ON a.column1 = b.column1
Regarding your actual query, I tried using the tables (again, actually views) that you're trying to sample one at a time:
select * from all_constraints sample(10)
ORA-01445: cannot select ROWID from, or sample, a join view without a key-preserved table
select * from all_cons_columns sample(10)
ORA-01445: cannot select ROWID from, or sample, a join view without a key-preserved table
This message is pretty clear: none of the tables in these views is key-preserved (i.e., guaranteed to return each row no more than once), so you can't sample them.
The following query should work to manually create a random sample, using DBMS_RANDOM.
SELECT *
FROM (SELECT cols.table_name,
cols.column_name,
cols.position,
cons.status,
cons.owner,
cons.constraint_type,
DBMS_RANDOM.VALUE rnd
FROM all_constraints cons
JOIN all_cons_columns cols
ON cons.constraint_name = cols.constraint_name
AND cons.owner = cols.owner)
WHERE rnd < .1
ORDER BY table_name, position

How to get table comments via SQL in Oracle?

I've tried:
select * from user_tab_comments;
and it returns 3 columns, "TABLE_NAME", "TABLE_TYPE", and "COMMENTS", but the "TABLE_NAME" column looks "encrypted"; I need clear table names:
TABLE_NAME TABLE_TYPE COMMENTS
BIN$IN1vjtqhTEKcWfn9PshHYg==$0 TABLE Résultat d'intégration d'une photo numérisée
BIN$PUwG3lb3QoazOc4QaC1sjw==$0 TABLE Motif de fin d'agrément de maître de stage
When I use select * from user_tables, TABLE_NAME is not "encrypted".
Since 10g, Oracle doesn't immediately drop tables when we issue a DROP TABLE statement. Instead, it renames them to something like BIN$IN1vjtqhTEKcWfn9PshHYg==$0 and puts them in the recycle bin. This allows us to recover tables we didn't mean to drop.
Tables in the recycle bin are still tables, so they show up in ALL_TABLES and similar views. So if you only want to see comments relating to live (non-dropped) tables, you need to filter by table name:
select * from all_tab_comments
where substr(table_name,1,4) != 'BIN$'
/
"I can't believe there isn't a flag column so you could do and is_recycled = 0 or something. "
You're right, it would be incredible. So I checked the documentation, and it turns out Oracle 10g added a column called DROPPED to the USER_/ALL_/DBA_TABLES views:
select tc.*
from all_tab_comments tc
join all_tables t
on tc.owner = t.owner
and tc.table_name = t.table_name
where t.dropped = 'NO'
/
Check out the documentation. Obviously the need to join to the ALL_TABLES view requires more typing than filtering on the name, so depending on your needs it might just be easier to keep the original WHERE clause.
SELECT t.table_name,t.comments FROM USER_TAB_COMMENTS t WHERE TABLE_NAME = 'SS_DEPT';

How can I identify unused/redundant columns given a list of tables?

[This is on an iSeries/DB2 database if that makes any difference]
I want to write a procedure to identify columns that are left as blank or zero (given a list of tables).
Assuming I can pull table and column definitions out of the central system tables, how should I check the above condition? My first guess is to generate a statement dynamically for each column, such as:
select count(*) from my_table where my_column != 0
and check whether the count is zero, but is there a better/faster/standard way to do this?
NB: This just needs to handle simple character and integer/decimal fields, nothing fancy!
To check for columns that contain only NULLs on DB2:
Execute RUNSTATS on your database (http://www.ibm.com/developerworks/data/library/techarticle/dm-0412pay/)
Check the database statistics by querying SYSSTAT.TABLES and SYSSTAT.COLUMNS. Comparing SYSSTAT.TABLES.CARD and SYSSTAT.COLUMNS.NUMNULLS will tell you what you need.
An example could be:
select t.tabschema, t.tabname, c.colname
from sysstat.tables t, sysstat.columns c
where ((t.tabschema = 'MYSCHEMA1' and t.tabname='MYTABLE1') or
(t.tabschema = 'MYSCHEMA2' and t.tabname='MYTABLE2') or
(...)) and
t.tabschema = c.tabschema and t.tabname = c.tabname and
t.card = c.numnulls
More on system stats e.g. here: http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0001070.htm and http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0001073.htm
Similarly, you can use SYSSTAT.COLUMNS.AVGCOLLEN to check for empty columns (just it doesn't seem to work for LOBs).
EDIT: And, to check for columns that contain only zeros, try comparing HIGH2KEY and LOW2KEY in SYSSTAT.COLUMNS.
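A rough sketch of that idea follows. Be aware this is assumption-laden: HIGH2KEY and LOW2KEY hold character representations of the second-highest and second-lowest values, so the exact literal to compare against may need adjusting for your data types, and the statistics must be current:

```sql
-- Candidate columns whose second-highest and second-lowest
-- recorded values are both zero (statistics-based guess only)
select t.tabschema, t.tabname, c.colname
from sysstat.tables t
join sysstat.columns c
  on c.tabschema = t.tabschema
 and c.tabname   = t.tabname
where c.high2key = '0'
  and c.low2key  = '0'
```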
Yes, typically, I would do something like this in SQL Server:
SELECT
REPLACE(REPLACE(REPLACE(
'
SELECT COUNT(*) AS [COUNT NON-EMPTY IN {TABLE_NAME}.{COLUMN_NAME}]
FROM [{TABLE_SCHEMA}].[{TABLE_NAME}]
WHERE [{COLUMN_NAME}] IS NOT NULL
OR [{COLUMN_NAME}] <> 0
'
, '{TABLE_SCHEMA}', c.TABLE_SCHEMA)
, '{TABLE_NAME}', c.TABLE_NAME)
, '{COLUMN_NAME}', c.COLUMN_NAME) AS [SQL]
FROM INFORMATION_SCHEMA.COLUMNS c
INNER JOIN INFORMATION_SCHEMA.TABLES t
ON t.TABLE_TYPE = 'BASE TABLE'
AND c.TABLE_CATALOG = t.TABLE_CATALOG
AND c.TABLE_SCHEMA = t.TABLE_SCHEMA
AND c.TABLE_NAME = t.TABLE_NAME
AND c.DATA_TYPE = 'int'
You can get a lot fancier by doing UNIONs of the entire query, checking IS_NULLABLE on each column, and obviously you might have different requirements for different data types, skipping identity columns, etc.
I'm assuming you mean you want to know whether there are any values at all in the rows of a given column. If your column can contain "blanks", you're probably going to need to add an OR ... IS NOT NULL style condition to your WHERE clause to get the correct answer.
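As an aside, if you only need to know whether any non-zero value exists (rather than how many), a probe that stops at the first matching row is typically faster than a full COUNT(*) on large tables. A DB2-flavoured sketch (my_table and my_column are the placeholders from the question):

```sql
-- Returns one row if any non-zero value exists, no rows otherwise;
-- the database can stop scanning as soon as it finds a match
select 1
from my_table
where my_column <> 0
fetch first 1 row only
```

The generated per-column statements from the answer above could be rewritten in this form instead of COUNT(*).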