Searching for a string in the whole schema - SQL

I have an Oracle database with many schemas. I'd like to find the string xyz in all the tables in one specific schema. I've tried the accepted answer here (Search All Fields In All Tables For A Specific Value (Oracle)). Unfortunately, I get this error:
ORA-00903: invalid table name
ORA-06512: at line 8
where line 8 is
SELECT COUNT(*) FROM ' || t.owner || '.' || t.table_name ||
How can I make this full search work? I've also tried other solutions I found, but none of them worked for me.

Adapted from the other answer, run this query:
SELECT
'SELECT ''' || owner || '.' || table_name || '.' || column_name || ''' as locn, COUNT(*) FROM "' || owner || '"."' || table_name || '" WHERE "' || column_name || '" = ''[VALUE YOURE LOOKING FOR HERE]'' UNION ALL'
FROM
all_tab_columns
WHERE owner <> 'SYS' AND data_type LIKE '%CHAR%'
Replace the [VALUE YOURE LOOKING FOR HERE] with the value you're looking for. Replace the square brackets too. Do not touch any apostrophes.
Then run the query; it will produce a huge number of SQL statements. Copy them out of your query tool's results grid, paste them into the query panel, delete the last UNION ALL, and run them. And wait - for a very long time. Eventually it will produce a list of every table and column name, together with a count of how many times the value you're looking for appears in that column.
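For illustration, with a hypothetical table APP.CUSTOMERS that has a VARCHAR2 column NAME, and 'xyz' as the search value, each generated statement looks like this (the final one needs its UNION ALL removed):
SELECT 'APP.CUSTOMERS.NAME' as locn, COUNT(*) FROM "APP"."CUSTOMERS" WHERE "NAME" = 'xyz' UNION ALL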

You can adapt the original code to cope with tables (and schemas, and columns) created with case-sensitive or otherwise invalid object identifiers, by treating everything as a quoted identifier.
EXECUTE IMMEDIATE
'SELECT COUNT(*) FROM "' || t.owner || '"."' || t.table_name || '"' ||
' WHERE "'||t.column_name||'" = :1'
INTO match_count
USING '1/22/2008P09RR8';
but using whatever string you're actually looking for, of course.
In the dynamic SQL this generates, the owner, table name and column name are all enclosed in double quotes - which is what @CaiusJard's query does too, but this version still executes each query inside an anonymous block rather than producing statements to copy and paste.
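Put together, a minimal sketch of the whole anonymous block (same ALL_TAB_COLUMNS filter as above; 'xyz' stands in for your actual search string):
DECLARE
    match_count INTEGER;
BEGIN
    FOR t IN (
        SELECT owner, table_name, column_name
        FROM all_tab_columns
        WHERE owner <> 'SYS' AND data_type LIKE '%CHAR%'
    ) LOOP
        EXECUTE IMMEDIATE
            'SELECT COUNT(*) FROM "' || t.owner || '"."' || t.table_name || '"' ||
            ' WHERE "' || t.column_name || '" = :1'
            INTO match_count
            USING 'xyz';
        IF match_count > 0 THEN
            dbms_output.put_line(t.owner || '.' || t.table_name || '.' || t.column_name || ': ' || match_count);
        END IF;
    END LOOP;
END;
/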

Related

I'd like to select all values from all databases

I'm using Oracle 12c with many databases that come and go. I can run "select * from all_users" and get a list of all of my users' databases. Now I can specify which user's table I want to query, but I'd really like to query them all. So something like "select * from all_users.client" would get all clients from all users. I know that won't work, and frankly there may be no way to do what I want here, but if there is, please point me in the right direction.
I created a SQL script that you can run to export all the information from all the tables you have access to. Using SQLcl, you can set sqlformat csv so that any select you run returns its columns comma-delimited. You can also modify the select statement that gathers the table names to limit which tables are exported.
Also note that I ran into some tables whose names contain characters that are not valid in a file name. For example, some of my tables have $ in the table name, which is not a valid file-name character, so if you run into that scenario you may want to add a correction for it.
set pagesize 0
set linesize 20000
set heading off
set feedback off
spool export_tables.sql
SELECT 'spool ' || owner || '.' || table_name || '.csv' || chr(10)
|| 'select * from ' || owner || '.' || table_name || ';' || chr(10)
|| 'spool off'
FROM all_tables
WHERE owner NOT IN ('SYS',
'SYSTEM',
'OUTLN',
'DBSFWUSER',
'AUDSYS',
'DBSNMP',
'APPQOSSYS',
'XDB',
'WMSYS',
'OJVMSYS',
'CTXSYS')
ORDER BY owner, table_name;
spool off;
set sqlformat csv
set heading on
@export_tables.sql
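For a hypothetical table HR.EMPLOYEES, the generated export_tables.sql contains a block like:
spool HR.EMPLOYEES.csv
select * from HR.EMPLOYEES;
spool off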
I think you want to get all the tables of the user:
select * from all_tables order by owner, table_name;
It returns the owner and table names.

Sum all numeric columns in database and log results

I have a query which gives me all numeric columns in my Postgres database:
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE table_schema in (
'datawarehouse_x',
'datawarehouse_y',
'datawarehouse_z',
'datawarehouse_w'
)
and udt_name not in
('date','timestamp','bool','varchar')
and column_name not like '%_id'
This gives me what I need:
table_schema | table_name | column_name
-------------+------------+------------
schema_1     | table_x    | column_z
schema_2     | table_y    | column_w
I checked it and it's fine.
What I now want to do is query each of these columns per table with a select sum(column_name), and then insert the schema name, table name, query result and current date into a log table on a daily basis.
Writing the results into a target table shouldn't be a big deal, but how in the world can I run queries according to the results of this query?
Thanks in advance.
EDIT: What I would write afterwards is a procedure that takes these schema/table/column values as input, queries the table and writes into the log table. I just don't know the part in between. This is roughly what I would do, but I don't yet know which types I should use for schema, table and column.
create or replace function sandbox.daily_routine_metrics(schema_name text, table_name text, column_name text)
returns void
language plpgsql
as $$
BEGIN
-- identifiers need quote_ident; quote_literal is only for the string values being inserted
EXECUTE
'INSERT INTO LOGGING.DAILY_ROUTINE_SIZE
SELECT
'|| QUOTE_LITERAL(schema_name) ||' schema_name,' ||
QUOTE_LITERAL(table_name) ||' table_name, ' ||
QUOTE_LITERAL(column_name) ||' column_name, ' ||
'current_timestamp, sum(' || QUOTE_IDENT(column_name) || ')
FROM ' || QUOTE_IDENT(schema_name) ||'.'|| QUOTE_IDENT(table_name);
END;
$$;
The feature you need is known as "dynamic SQL". Its implementation is RDBMS-specific; the Postgres documentation is here.
Whilst it's possible to achieve what you want in dynamic SQL, you might find it easier to use a scripting language like Python or Ruby to achieve this. Dynamic SQL is hard to code and debug - you find yourself concatenating lots of hardcoded strings with results from SQL queries, printing them to the console to see if they work, and realizing all sorts of edge cases blow up.
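That said, a minimal sketch of the dynamic-SQL route in plpgsql (the log table LOGGING.DAILY_ROUTINE_SIZE is the one named in the question; the assumption that it has exactly these five columns, in this order, is mine):
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN
        SELECT table_schema, table_name, column_name
        FROM information_schema.columns
        WHERE table_schema IN ('datawarehouse_x', 'datawarehouse_y',
                               'datawarehouse_z', 'datawarehouse_w')
          AND udt_name NOT IN ('date', 'timestamp', 'bool', 'varchar')
          AND column_name NOT LIKE '%_id'
    LOOP
        -- %I quotes identifiers, %L quotes literals, so odd names stay safe
        EXECUTE format(
            'INSERT INTO logging.daily_routine_size
             SELECT %L, %L, %L, current_timestamp, sum(%I) FROM %I.%I',
            r.table_schema, r.table_name, r.column_name,
            r.column_name, r.table_schema, r.table_name);
    END LOOP;
END
$$;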

Inserting a select statement into a table - ORA-06502

I have 6 TEST environments and 1 production environment. I create quite a few different reports as Oracle views, and need a way to sync these between environments.
I am trying to make a script that I can run, which will basically output a list of commands that I can copy and paste into my different environment to create the necessary views/public synonyms and privileges.
I have to put the resultant text into a database table as dbms_output.put_line has a certain limitation on how many characters it can show.
I have the following, but if I try to insert the data, I get ORA-06502: PL/SQL: numeric or value error. I am guessing this probably has to do with character literals not being escaped and the like.
CREATE OR REPLACE PROCEDURE EXPORT_REPORTS AS
statements CLOB;
tmp_statement CLOB;
CURSOR all_views IS
SELECT
OWNER,
VIEW_NAME,
TEXT
FROM
ALL_VIEWS
WHERE
OWNER = 'PAS'
;
BEGIN
FOR v IN all_views LOOP
tmp_statement := 'CREATE OR REPLACE FORCE VIEW "' || v.OWNER || '"."' || v.VIEW_NAME || '" AS ' || CHR(13) || CHR(10) || v.TEXT;
statements := statements || tmp_statement;
END LOOP;
EXECUTE IMMEDIATE 'INSERT INTO VIEW_EXPORTS VALUES ('''|| statements || ''')';
END EXPORT_REPORTS;
Any idea what I can do to try and fix this?
If it is because some of the text in the statements variable contains single quotes, how can I escape this before inserting the data into a table?
This sounds like a job for Data Pump.
Oracle Data Pump technology enables very high-speed movement of data and metadata from one database to another.
http://docs.oracle.com/cd/B28359_01/server.111/b28319/dp_overview.htm
For getting database object DDL I would recommend using the DBMS_METADATA package.
http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_metada.htm#BGBDJAHH
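For example, a quick sketch with GET_DDL, which returns each view's DDL as a CLOB:
SELECT dbms_metadata.get_ddl('VIEW', view_name, owner)
FROM all_views
WHERE owner = 'PAS';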
You cannot use CLOB as a datatype for the local variables; use VARCHAR2 instead.
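As an aside on the escaping question: if VIEW_EXPORTS is a table you can insert into directly, no escaping is needed at all - static SQL and bind variables pass the value as a value, never as quoted text. A sketch, assuming VIEW_EXPORTS has a single CLOB column:
-- static SQL inside the procedure, no dynamic string building:
INSERT INTO view_exports VALUES (statements);
-- or, if it must stay dynamic, bind instead of concatenating:
EXECUTE IMMEDIATE 'INSERT INTO view_exports VALUES (:1)' USING statements;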

Select a dynamic set of columns from a table and get the sum for each

Is it possible to do the following in Postgres:
SELECT column_name FROM information_schema.columns WHERE table_name = 'somereport' AND data_type = 'integer';
SELECT SUM(column_name[0]), SUM(column_name[1]), SUM(column_name[3]) FROM somereport;
In other words I need to select a group of columns from a table depending on certain criteria, and then sum each of those columns in the table.
I know I can do this in a loop, so I can count each column independently, but obviously that requires a query for each column returned from the information schema query. Eg:
FOR r IN select column_name from information_schema.columns where table_name = 'somereport' and data_type = 'integer'
LOOP
SELECT SUM(r.column_name) FROM somereport;
END LOOP;
This query creates the complete DML statement you are after:
WITH x AS (
SELECT 'public'::text AS _schema -- provide schema name ..
,'somereport'::text AS _tbl -- .. and table name once
)
SELECT 'SELECT ' || string_agg('sum(' || quote_ident(column_name)
|| ') AS sum_' || quote_ident(column_name), ', ')
|| E'\nFROM ' || quote_ident(x._schema) || '.' || quote_ident(x._tbl)
FROM x, information_schema.columns
WHERE table_schema = _schema
AND table_name = _tbl
AND data_type = 'integer'
GROUP BY x._schema, x._tbl;
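With hypothetical integer columns col_a and col_b in public.somereport, the generated statement would look like:
SELECT sum(col_a) AS sum_col_a, sum(col_b) AS sum_col_b
FROM public.somereport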
You can execute it separately or wrap this query in a plpgsql function and run the query automatically with EXECUTE:
Full automation
Tested with PostgreSQL 9.1.4
CREATE OR REPLACE FUNCTION f_get_sums(_schema text, _tbl text)
RETURNS TABLE(names text[], sums bigint[]) AS
$BODY$
BEGIN
RETURN QUERY EXECUTE (
SELECT 'SELECT ''{'
|| string_agg(quote_ident(c.column_name), ', ' ORDER BY c.column_name)
|| '}''::text[],
ARRAY['
|| string_agg('sum(' || quote_ident(c.column_name) || ')'
, ', ' ORDER BY c.column_name)
|| ']
FROM '
|| quote_ident(_schema) || '.' || quote_ident(_tbl)
FROM information_schema.columns c
WHERE table_schema = _schema
AND table_name = _tbl
AND data_type = 'integer'
);
END;
$BODY$
LANGUAGE plpgsql;
Call:
SELECT unnest(names) AS name, unnest(sums) AS col_sum
FROM f_get_sums('public', 'somereport');
Returns:
name | col_sum
---------------+---------
int_col1 | 6614
other_int_col | 8364
third_int_col | 2720642
Explain
The difficulty is to define the RETURN type for the function, while the number and names of the returned columns vary. One detail that helps a little: you only want integer columns.
I solved this by forming an array of bigint (sum(int_col) returns bigint). In addition I return an array of column names. Both sorted alphabetically by column name.
In the function call I split up these arrays with unnest() arriving at the handsome format displayed.
The dynamically created and executed query is advanced stuff. Don't get confused by multiple layers of quotes. Basically you have EXECUTE that takes a text argument containing the SQL query to execute. This text, in turn, is provided by a secondary SQL query that builds the query string for the primary query.
If this is too much at once or plpgsql is rather new for you, start with this related answer where I explain the basics dealing with a much simpler function and provide links to the manual for the major features.
If performance is essential, query the Postgres catalog directly (pg_catalog.pg_attribute) instead of the standardized (but slow) information_schema.columns. Here is a simple example with pg_attribute.
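A sketch of that catalog-based lookup, equivalent to the information_schema filter used above:
SELECT a.attname
FROM pg_catalog.pg_attribute a
JOIN pg_catalog.pg_class c ON c.oid = a.attrelid
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relname = 'somereport'
  AND a.atttypid = 'int4'::regtype
  AND a.attnum > 0
  AND NOT a.attisdropped;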

Check a whole table for a single value

Background: I'm converting a database table to a format that doesn't support null values. I want to replace the null values with an arbitrary number so my application can support null values.
Question: I'd like to search my whole table for a value ("999999", for example) to make sure that it doesn't appear in the table. I could write a script to test each column individually, but I wanted to know if there is a way I could do this in pure sql without enumerating each field. Is that possible?
You can use a special feature of the PostgreSQL type system:
SELECT *
FROM tbl t
WHERE t::text LIKE '%999999%';
There is a composite type of the same name for every table that you create in PostgreSQL. And there is a text representation for every type in PostgreSQL (to input / output values).
Therefore you can just cast the whole row to text and if the string '999999' is contained in any column (its text representation, to be precise) it is guaranteed to show in the query above.
You cannot rule out false positives completely, though, if separators and / or decorators used by Postgres for the row representation can be part of the search term. It's just very unlikely. And positively not the case for your search term '999999'.
There was a very similar question on codereview.SE recently. I added some more explanation in my answer there.
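To see what the cast produces, here is what a hypothetical three-column row looks like as text - the searched value shows up verbatim in the representation:
SELECT t::text FROM tbl t;
-- e.g. (1,"some text",999999)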
create or replace function test_values(real) returns setof record as
$$
declare
    query text;
    output record;
begin
    for query in
        select 'select distinct ''' || table_name || '''::text table_name, '''
               || column_name || '''::text column_name from '
               || quote_ident(table_name) || ' where '
               || quote_ident(column_name) || ' = ''' || $1::text || '''::' || data_type
        from information_schema.columns
        where table_schema = 'public'
          and numeric_precision is not null
    loop
        raise notice '%', query;
        execute query into output;
        return next output;
    end loop;
    return;
end;
$$ language plpgsql;

select distinct * from test_values(999999) as t(table_name text, column_name text);