Dynamic Oracle Table Function for use in Tableau - sql

We have a large amount of data in an Oracle 11g server. Most of the engineers use Tableau for visualizing data, but there is currently not a great solution for visualizing straight from the Oracle server because of the structure of the database. Unfortunately, this cannot be changed, as it's very deeply integrated with the rest of our systems. There is a "dictionary" table, let's call it tab_keys:
name | key
---------------
AB-7 | 19756
BG-0 | 76519
FY-10 | 79513
JB-2 | 18765
...
...
And there are also the tables actually containing the data. Each entry in tab_keys has a corresponding data table named by prefixing the key with an identifier, in this case, we'll use "dat_". So AB-7 will store all its data in a table called dat_19756. These keys are not known to the user, and are only used for tracking "behind the scenes". The user only knows the AB-7 moniker.
Tableau allows communication with Oracle servers using standard SQL select statements, but because the user doesn't know the key value, they cannot write a SQL statement to query the data.
Tableau recently added the ability for users to query Oracle Table Functions, so I started going down the road of writing a table function to query for the key, and return a table of the results for Tableau to use. The problem is that each dat_ table is basically unique: the number of columns, the column labels, the number of records, and the datatypes all differ from one dat_ table to the next.
What is the right way to handle this problem? Can I:
1) Write a function (which Tableau can call inline in regular SQL) to return a bona fide table name which is dynamically generated? I tried this:
create or replace FUNCTION TEST_FUNC
(
V_NAME IN VARCHAR2
) RETURN user_tables.table_name%type AS
V_KEY VARCHAR(100);
V_TABLE user_tables.table_name%type;
BEGIN
select KEY into V_KEY from my_schema.tab_keys where NAME = V_NAME;
V_TABLE := dbms_assert.sql_object_name('my_schema.dat_' || V_KEY);
RETURN V_TABLE;
END TEST_FUNC;
and then SELECT * from TABLE(TEST_FUNC('AB-7')); but I get:
ORA-22905: cannot access rows from a non-nested table item
22905. 00000 - "cannot access rows from a non-nested table item"
*Cause: attempt to access rows of an item whose type is not known at
parse time or that is not of a nested table type
*Action: use CAST to cast the item to a nested table type
I couldn't figure out a good way to CAST the table as the table type I needed. Could this be done in the function before returning?
2) Write a table function? Tableau can supposedly query these like tables, but then I run into the problem of dynamically generating types (which I understand isn't easy) but with the added complication of this needing to be used by multiple users simultaneously, so each user would need a data type generated for them each time they connect to a table (if I'm understanding this correctly).
I have to think I'm missing something simple. How do I cast the return of this query as this other table's datatype?

There is no simple way to have a single generic function return a dynamically configurable nested table. With other products you could use a Ref Cursor (which maps to an ODBC or JDBC ResultSet object), but my understanding is that Tableau does not support that option.
One thing you can do is generate views from your data dictionary. You can use this query to produce a one-off script.
select 'create or replace view "' || name || '" as select * from dat_' || key || ';'
from tab_keys;
The double-quotes are necessary because AB-7 is not a valid object name in Oracle, due to the dash.
This would allow your users to query their data like this:
select * from "AB-7";
Note they would have to use the double-quotes too.
Obviously, any time you inserted a row in tab_keys you'd need to create the required view. That could be done through a trigger.
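A minimal sketch of how the view generation could be automated, assuming the tab_keys layout from the question and that no name contains a double-quote character. Because DDL commits implicitly, a plain row trigger cannot run CREATE VIEW directly; the block below is meant to be run on demand or from a scheduled job instead (a trigger would need an autonomous transaction or would have to submit such a job).

```sql
-- Sketch only: (re)creates one view per row in tab_keys.
-- Assumes every key maps to an existing table named dat_<key>.
begin
  for r in (select name, key from tab_keys) loop
    execute immediate
      'create or replace view "' || r.name || '" as select * from '
      -- simple_sql_name raises an error if the generated table name
      -- is not a plain identifier, which guards against odd key values
      || dbms_assert.simple_sql_name('dat_' || r.key);
  end loop;
end;
/
```

Re-running the block is harmless, since CREATE OR REPLACE simply redefines views that already exist.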

You can build dynamic SQL in SQL using the open source program Method4:
select * from table(method4.dynamic_query(
q'[
select 'select * from dat_'||key
from tab_keys
where name = 'AB-7'
]'
));
A
-
1
The program combines Oracle Data Cartridge Interface with ANYDATASET to create a function that can return dynamic types.
There might be a way to further simplify the interface but I haven't figured it out yet. These Oracle Data Cartridge Interface functions are very picky and are not easy to repackage.
Here's the sample schema I used:
create table tab_keys(name varchar2(100), key varchar2(100));
insert into tab_keys
select 'AB-7' , '19756' from dual union all
select 'BG-0' , '76519' from dual union all
select 'FY-10', '79513' from dual union all
select 'JB-2' , '18765' from dual;
create table dat_19756 as select 1 a from dual;
create table dat_76519 as select 2 b from dual;
create table dat_79513 as select 3 c from dual;
create table dat_18765 as select 4 d from dual;

Related

How can I create a calculate column in the creation of table in POSTGRESQL, for example in sql server LineTotal AS Price * Quantity [duplicate]

Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx
Postgres 12 or newer
STORED generated columns are introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. Or the similar "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15, yet).
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
Can also be supported with a matching expression index - provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
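To make the skeleton above concrete, here is a hedged sketch of the emulation for a table with two integer columns. The table tbl, the function name product, and the index name are illustrative assumptions, not part of the original answer.

```sql
CREATE TABLE tbl (int1 int, int2 int);

-- The function takes the row type of tbl and must be IMMUTABLE
-- so that it can back an expression index.
CREATE FUNCTION product(tbl) RETURNS bigint
  LANGUAGE sql IMMUTABLE AS
$$ SELECT $1.int1::bigint * $1.int2 $$;

CREATE INDEX tbl_product_idx ON tbl (product(tbl));

-- Attribute notation: t.product is resolved as product(t).
-- It has to be listed explicitly; SELECT * alone does not include it.
SELECT t.*, t.product FROM tbl t;
```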
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.
YES you can!! The solution should be easy, safe, and performant...
I'm new to PostgreSQL, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) will use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works; they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note, however, that the computed column itself cannot be written through the view, and before PostgreSQL 9.3 views were not automatically updatable at all. In that case, I believe you can redirect inserts and updates to the source table using rules (I could be wrong on that last point as I've never tried it myself).
Edit: it seems that if the query involves competing indexes, the planner may sometimes not use the expression index at all. The choice seems to be data-dependent.
One way to do this is with a trigger!
CREATE TABLE computed(
one SERIAL,
two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
BEGIN
NEW.two = NEW.one * 2;
RETURN NEW;
END
$BODY$;
CREATE TRIGGER computed_500
BEFORE INSERT OR UPDATE
ON computed
FOR EACH ROW
EXECUTE PROCEDURE computed_two_trg();
The trigger fires before the row is inserted or updated. It sets the field we want to compute on the NEW record and then returns that record.
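A short usage sketch, assuming the computed table and trigger defined above have just been created, shows the trigger filling in the column on both code paths:

```sql
-- "one" is filled from the sequence; the trigger then sets "two" = "one" * 2
-- before the NOT NULL constraint on "two" is checked.
INSERT INTO computed DEFAULT VALUES;

-- Changing "one" makes the trigger recompute "two" for each updated row.
UPDATE computed SET one = 10;

SELECT one, two FROM computed;
```

Any value supplied for two by the caller is overwritten, which is what makes the column behave like a computed one.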
PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm / 2.54) STORED
);
Well, I'm not sure if this is what you mean, but Postgres supports plain "dummy" ETL syntax.
I created one empty column in the table and then needed to fill it with values calculated from the other values in each row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so basic that I suspect it is not what you are looking for.
Obviously it is not dynamic; you run it once. But nothing stops you from putting it into a trigger.
Example on creating an empty virtual column
,(SELECT *
From (values (''))
A("virtual_col"))
Example on creating two virtual columns with values
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc
I have code that works and uses the keyword calculated. I'm not on pure PostgreSQL, though; we run on PADB.
Here is how it's used:
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;
A lightweight solution with Check constraint:
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);

Big Query For-In not picking up table paths from a look-up table

I have a look-up table containing a list of fully qualified table paths in a Big Query table called all_tables. For example
|table_list|
|----------|
|project_name.dataset_name1.table_1|
|project_name.dataset_name2.table_1|
|project_name.dataset_name3.table_1|
|project_name.dataset_name4.table_1|
|project_name.dataset_name5.table_1|
I am trying to iterate through these tables to pull out elements I need for another procedure using the for-in syntax in Big Query. This is a simplified version of the query I am using
```
FOR table IN (select * from my_project.my_dataset.all_tables)
DO
select * from table;
END FOR;
```
This isn't working. It picks up the list of tables correctly, but when it substitutes the dataset name in the line 3 select statement, it says
**Invalid value: Table "table" must be qualified with a dataset (e.g. dataset.table)**
I know what the error is, but I am not sure how to make it 'see' the value of table as a table path.
All paths are correct, and I am doing it this way as I am querying multiple tables across multiple datasets for a table creation query.
You should use dynamic SQL to reference the table name through a variable; consider the query below:
FOR table IN (select * from my_project.my_dataset.all_tables)
DO
EXECUTE IMMEDIATE FORMAT("""
SELECT * FROM %s;
""", table.table_list);
END FOR;
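A small variation on the same idea, offered as a sketch: wrapping the substituted path in backticks lets it contain characters such as hyphens, which are common in project names. The loop variable name rec is an illustrative choice; the table and column names are the ones from the question.

```sql
FOR rec IN (SELECT table_list FROM my_project.my_dataset.all_tables)
DO
  -- rec.table_list holds a full path such as project_name.dataset_name1.table_1
  EXECUTE IMMEDIATE FORMAT("SELECT * FROM `%s`", rec.table_list);
END FOR;
```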

Querying an Oracle Database with Dynamic Table names

I'm stuck with some poor database design where I have to query tables that are named by date.
The following query works when the table names are hard coded with relevant dates.
SELECT
ajob.ORDER_ID
, ajob.JOB_NAME
, abim.SERVICE_ID
, shist.SERVICE_NAME
FROM
obscuredschema.A190129001_AJOB ajob --hardcoded YYMMDD table name
INNER JOIN obscuredschema.A190129001_ABIMSVC abim --hardcoded YYMMDD table name
ON (ajob.ORDER_ID = abim.ORDER_ID)
INNER JOIN obscuredschema.SERVICE_HIST shist
ON (abim.SERVICE_ID = shist.SERVICE_KEY)
WHERE shist.SERVICE_NAME LIKE '%BIM'
AND shist.BIM_AUTH_ID > 0
;
Note the two hardcoded table names (along with their aliases).
How would I execute this same query using dynamic table names? (There are two.)
The code for the dynamic date: TO_CHAR(trunc(sysdate - 7), 'YYMMDD')
If the first table name were a string, here's how I would build it:
'A'||TO_CHAR(trunc(sysdate - 7), 'YYMMDD')||'001_AJOB'
If the second table name were a string, here's how I would build it:
'A'||TO_CHAR(trunc(sysdate - 7), 'YYMMDD')||'001_ABIMSVC'
I don't think you can write a plain SQL query with dynamic table names.
You can write a PL/SQL procedure which uses execute immediate and returns a cursor or something; somebody asked about that just yesterday. If you're just trying to write this query to interact with some data, that might be your best bet.
In addition, you could modify that by turning your PL/SQL procedure into a pipelined function, and then you could call it from a SQL query using TABLE().
If it were me, I'd consider creating a synonym (or a standard view which just selects from the dynamically-named-tables), and scheduling a job to re-create it every time new tables are created. That might be simpler than dealing with pipelined functions.
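A rough sketch of the synonym approach, assuming the naming pattern from the question. The synonym names ajob_current and abimsvc_current are made up for illustration, and the block would be run by a scheduled job whenever new tables appear.

```sql
declare
  -- e.g. 'A190129001_' for a run date of 2019-02-05
  v_prefix varchar2(30) :=
    'A' || to_char(trunc(sysdate - 7), 'YYMMDD') || '001_';
begin
  execute immediate
    'create or replace synonym ajob_current for obscuredschema.'
    || v_prefix || 'AJOB';
  execute immediate
    'create or replace synonym abimsvc_current for obscuredschema.'
    || v_prefix || 'ABIMSVC';
end;
/
```

The original query can then stay static by selecting from ajob_current and abimsvc_current in place of the two hardcoded table names.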

Is it possible to look up a table-valued function's return columns in SAP HANA's dictionary views?

I've created a table-valued function in SAP HANA:
CREATE FUNCTION f_tables
RETURNS TABLE (
column_value INTEGER
)
LANGUAGE SQLSCRIPT
AS
BEGIN
RETURN SELECT 1 column_value FROM SYS.DUMMY;
END
Now I'd like to be able to discover the function's table type using the dictionary views. I can run this query here:
select *
from function_parameters
where schema_name = '[xxxxxxxxxx]'
and function_name = 'F_TABLES'
order by function_name, position;
Which will yield something like:
PARAMETER_NAME TABLE_TYPE_SCHEMA TABLE_TYPE_NAME
---------------------------------------------------------------------
_SYS_SS2_RETURN_VAR_ [xxxxxxxxxx] _SYS_SS_TBL_[yyyyyyy]_RET
Unfortunately, I cannot seem to be able to look up that _SYS_SS_TBL_[yyyyyyy]_RET table in SYS.TABLES (and TABLE_COLUMNS), SYS.VIEWS (and VIEW_COLUMNS), SYS.DATA_TYPES, etc. in order to find the definitions of the individual columns.
Note that explicitly named table types created using CREATE TYPE ... do appear in SYS.TABLES...
Is there any way for me to formally look up a table-valued function's return columns? I'm not interested in parsing the source, obviously.
These kinds of tables are internal row-store tables, so you can only find your _SYS_SS_TBL_[yyyyyyy]_RET table in SYS.RS_TABLES_. This will give you some basic information, including a column ID (CID). This value is needed to find the column information.
For example, if your CID is 100, you can find column information in the RS_COLUMNS_ table with this query:
SELECT * FROM SYS.RS_COLUMNS_ WHERE CID = 100

How to transform an Oracle SQL into a Stored Procedure that should iterate through some tables fetching a certain data field?

I need to transform an Oracle SQL statement into a Stored Procedure so that users with fewer privileges can access a certain data field:
SELECT
info_field, data_field
FROM
table_one
WHERE
some_id = '<id>' -- I need this <id> to be the procedure's parameter
UNION ALL
SELECT
info_field, data_field
FROM
table_two
WHERE
some_id = '<id>'
UNION ALL
SELECT
info_field, data_field
FROM
table_three
WHERE
some_id = '<id>'
UNION ALL
...
Given that I'm no SP expert, I've been unable to figure out a good solution to loop through all the involved tables (approx. 12).
Any ideas would be helpful. Thanks much!
If you just want to restrict users' access you could create a view and grant them select on the view but not the tables:
CREATE VIEW info_and_data AS
-- some_id must be exposed by the view so that users can filter on it
SELECT some_id, info_field, data_field
FROM table_one
UNION ALL
SELECT some_id, info_field, data_field
FROM table_two
UNION ALL
SELECT some_id, info_field, data_field
FROM table_three
...
The users could then type:
SELECT info_field, data_field
FROM info_and_data
WHERE some_id = <id>
There are other ways to achieve your goal besides my suggestions below, but I would warn against splitting up data that really belongs in one table just to implement a data access policy that may change in the future.
The simplest solution to limit which table columns a user sees is through views on those tables. Use different views that show or hide specific columns and grant access to those views to different users/roles.
If you don't know in advance which combination of columns a user may be allowed to see, then you could use dynamic SQL: you assemble the SQL statement in the stored procedure based on the access privileges of your user (looked up from some other table you create to hold this info), meaning that you only include the proper columns in the SELECT portion of your statement. See this document from Oracle for more info.
If you are using Oracle 10g, then you may find this Oracle article interesting. It introduces the topic of the Virtual Private Database, or VPD for short, where you can hide certain rows, or columns or even individual column values depending on who is accessing a table.
Is the expectation that, among all these tables, only one will have a match for a given ID?
If no: You need to explain what you want to do when there are multiple matches.
If yes: You simply do the same SQL query, selecting the result into a variable that you then return.
It would look something like this:
PROCEDURE get_fields( the_id NUMBER,
info_field_out OUT table_one.info_field%TYPE,
data_field_out OUT table_one.data_field%TYPE
)
IS
BEGIN
SELECT info_field, data_field
INTO info_field_out, data_field_out
FROM (
... put your full SQL query here, using 'the_id' as the value to match against ..
);
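EXCEPTION
WHEN no_data_found THEN
-- What do you want to do here? Set the outputs to NULL? Raise an error?
-- Placeholder so the handler compiles: return NULLs.
info_field_out := NULL;
data_field_out := NULL;
WHEN too_many_rows THEN
-- Is this an invalid condition?
-- Placeholder so the handler compiles: re-raise the error.
RAISE;
END;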
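If several tables can match the same ID (the "If no" case above), one option is to return all matches through a ref cursor rather than OUT scalars. This is a sketch under assumptions: the procedure name get_fields_cur is made up, only the three tables named in the question are shown, and some_id is taken to be a string as in the original query.

```sql
CREATE OR REPLACE PROCEDURE get_fields_cur (
  the_id     IN  VARCHAR2,
  result_out OUT SYS_REFCURSOR
)
IS
BEGIN
  -- the_id is used as a bind variable in every branch of the union
  OPEN result_out FOR
    SELECT info_field, data_field FROM table_one   WHERE some_id = the_id
    UNION ALL
    SELECT info_field, data_field FROM table_two   WHERE some_id = the_id
    UNION ALL
    SELECT info_field, data_field FROM table_three WHERE some_id = the_id;
END get_fields_cur;
/
```

Because stored procedures run with definer's rights by default, granting EXECUTE on the procedure is enough; the calling users need no privileges on the underlying tables.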