SQL Finding data from similar tables

I have 50 tables with similar structures, named TABLE_1, TABLE_2, TABLE_3, etc. I want to select some information, like SELECT * FROM TABLE WHERE CRM_ID = 100, but I don't know which table contains this ID. I guess I should make a union of these tables and query that union, but I am afraid that is not the best solution. Could anybody help me?

To find the table containing the field CRM_ID, you can use:
select TABLE_NAME
from SYS.ALL_TAB_COLUMNS
where COLUMN_NAME = 'CRM_ID'
From here, you can query the relevant tables with your union

select 'table_1', count(*) from table_1 where CRM_ID = 100
union all
select 'table_2', count(*) from table_2 where CRM_ID = 100
[.....]
If you have 50 tables, you may want to use some advanced text editing so you don't have to type the same line 50 times.
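Instead of a text editor, the 50 near-identical lines can also be generated by a small script. The following is only a sketch: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the real server, and the helper names build_count_query and find_tables_with_id, the payload column, and the sample row are hypothetical.

```python
import sqlite3


def build_count_query(table_names, column="CRM_ID"):
    """Build one UNION ALL query that counts matching rows per table."""
    parts = [
        f"SELECT '{name}' AS table_name, COUNT(*) AS hits "
        f"FROM {name} WHERE {column} = ?"
        for name in table_names
    ]
    return "\nUNION ALL\n".join(parts)


def find_tables_with_id(conn, table_names, crm_id):
    """Return the names of the tables that contain at least one matching row."""
    sql = build_count_query(table_names)
    # One positional placeholder per SELECT branch, all bound to the same value.
    params = [crm_id] * len(table_names)
    rows = conn.execute(sql, params).fetchall()
    return [name for name, hits in rows if hits > 0]


# Demo data (assumption): 50 empty tables and one row with CRM_ID = 100.
conn = sqlite3.connect(":memory:")
tables = [f"TABLE_{i}" for i in range(1, 51)]
for t in tables:
    conn.execute(f"CREATE TABLE {t} (CRM_ID INTEGER, payload TEXT)")
conn.execute("INSERT INTO TABLE_7 VALUES (100, 'hypothetical row')")

print(find_tables_with_id(conn, tables, 100))
```

The table names are interpolated into the SQL text, so they should come from a trusted list (for example the result of the ALL_TAB_COLUMNS query above), never from user input.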

I can't see any alternative to a UNION for what you need.
However, I would avoid repeating all the code every time you need to query your tables, for example by creating a view:
create view all_my_tables as
select * from table1
union
select * from table2
...
and then
select *
from all_my_tables
where crm_id = 100

I would create a view:
create view table_all as
select * from table_1 union all
select * from table_2 ...
but I guess you have 50 tables for some performance reason, such as emulating partitioning without a partitioned table. If that is the case, you need a table that simulates an index, and the data must be "sorted" (each table holding its own range of IDs). Then create a table that contains table_name, min_val, max_val:
table_1, 1, 1000
table_2, 1001, 2000
and the selecting procedure looks like:
procedure sel(crmid in number) is
a varchar2(30);
v_row table_1%rowtype; -- assumes every table has the same structure as table_1
begin
select table_name into a from lookup where crmid between min_val and max_val;
-- bind crmid instead of concatenating it into the SQL text
execute immediate 'select * from ' || a || ' where crm_id = :id' into v_row using crmid;
-- do logic with v_row
end;
but if the data is not ordered, you need to run the select against each table in a loop. In that case, use the view.
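The lookup-table routing described above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the real database; the note column, the sample rows, and the function name select_by_crm_id are hypothetical, while lookup, table_name, min_val and max_val follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (crm_id INTEGER, note TEXT);
    CREATE TABLE table_2 (crm_id INTEGER, note TEXT);
    CREATE TABLE lookup (table_name TEXT, min_val INTEGER, max_val INTEGER);
    INSERT INTO lookup VALUES ('table_1', 1, 1000), ('table_2', 1001, 2000);
    INSERT INTO table_1 VALUES (100, 'lives in table_1');
    INSERT INTO table_2 VALUES (1500, 'lives in table_2');
""")


def select_by_crm_id(conn, crm_id):
    """Find the table whose ID range covers crm_id, then query only that table."""
    row = conn.execute(
        "SELECT table_name FROM lookup WHERE ? BETWEEN min_val AND max_val",
        (crm_id,),
    ).fetchone()
    if row is None:
        return []  # no table covers this ID range
    table_name = row[0]
    # The table name comes from our own lookup table, not from user input;
    # the ID itself is passed as a bound parameter.
    return conn.execute(
        f'SELECT crm_id, note FROM "{table_name}" WHERE crm_id = ?',
        (crm_id,),
    ).fetchall()


print(select_by_crm_id(conn, 100))
print(select_by_crm_id(conn, 1500))
```

Only two statements run per lookup regardless of how many tables exist, which is the point of keeping the ranges in a separate table.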

Try this:
Select tmp.tablename from (
select 'table1' as tablename from table1 where CRM_ID=100
union all
select 'table2' as tablename from table2 where CRM_ID=100
) tmp

Related

BigQuery using Dynamic sql as the input source of a query

How can I use a dynamic query as an input source for a larger query?
I have a query that takes the union of values from different datasets/tables scattered around, and the list is growing, so I'm thinking of building the query dynamically rather than writing a query for each table, like this:
DECLARE query STRING DEFAULT "";
DECLARE tables ARRAY<STRING> DEFAULT ["table1", "table2"...];
DECLARE i INT64 DEFAULT 0;
DECLARE tables_size INT64;
SET tables_size = ARRAY_LENGTH(tables);
WHILE i < tables_size DO
IF (i = tables_size -1) THEN
BEGIN
SET query = CONCAT(query, " SELECT id, name FROM ", tables[OFFSET(i)]);
BREAK;
END;
ELSE
SET query = CONCAT(query, " SELECT id, name FROM ", tables[OFFSET(i)], ' UNION ALL ');
END IF;
SET i = i + 1;
END WHILE;
EXECUTE IMMEDIATE query;
My goal is to use the output of the executed query as a FROM clause for a larger query.
It will be something like
Select A, B, C, D ... From *EXECUTE IMMEDIATE query* LEFT JOIN ... ON..
Is there a way to inject an output of a dynamic query as a table for another query?
I don't see TABLE as a variable type for bigquery so that was not my option.
I'm getting a bit tired of copy pasting table names to the exact query every time a new table is introduced to this logic.
SELECT id, name FROM table1 UNION ALL
SELECT id, name FROM table2 UNION ALL
SELECT id, name FROM table3...
Is there a simple way to do this? Or maybe a reason not to use dynamic queries, for performance reasons?
I hope one of these helps:
1. Wildcard tables
If the tables you want to union share a common prefix, you can consider using a wildcard table, as below. I think this is more concise than UNION ALL:
-- Sample Tables
CREATE TABLE IF NOT EXISTS testset.table1 AS SELECT 1 AS id, 'aaa' AS name;
CREATE TABLE IF NOT EXISTS testset.table2 AS SELECT 2 AS id, 'bbb' AS name;
CREATE TABLE IF NOT EXISTS testset.table3 AS SELECT 3 AS id, 'ccc' AS name;
--- Wildcard tables
SELECT * FROM `testset.table*` WHERE _TABLE_SUFFIX IN ('1', '2', '3');
2. Dynamic SQL & Temp Table
You can't inject a dynamic SQL directly into another query but you can use a temp table to emulate it.
2.1 Dynamic SQL
More concise dynamic query to union all tables:
DECLARE tables DEFAULT ["testset.table1", "testset.table2", "testset.table3"];
SELECT ARRAY_TO_STRING(ARRAY_AGG(FORMAT('SELECT id, name FROM %s', t)), ' UNION ALL\n')
FROM UNNEST(tables) t;
2.2 Using a temp table
I think you can modify your larger query to use a dynamically generated temp table.
DECLARE tables DEFAULT ["testset.table1", "testset.table2", "testset.table3"];
CREATE TABLE IF NOT EXISTS testset.table1 AS SELECT 1 AS id, 'aaa' AS name;
CREATE TABLE IF NOT EXISTS testset.table2 AS SELECT 2 AS id, 'bbb' AS name;
CREATE TABLE IF NOT EXISTS testset.table3 AS SELECT 3 AS id, 'ccc' AS name;
EXECUTE IMMEDIATE (
SELECT "CREATE TEMP TABLE IF NOT EXISTS union_tables AS \n"
|| ARRAY_TO_STRING(ARRAY_AGG(FORMAT('SELECT id, name FROM %s', t)), ' UNION ALL\n') FROM UNNEST(tables) t
);
-- your larger query using a temp table
SELECT * FROM union_tables;

union unusual behavior

I am trying to union two tables with the same fields into one master table, but for some reason I'm getting a weird result.
select count(*)
from staging.sandoval_parcels
where parcel_id = 0;
returns 0
select count(*)
from staging.bernalillo_parcels
where parcel_id = 0;
returns 0
but when I merge the tables using
CREATE TABLE staging.master_parcels
AS
SELECT * FROM bernalillo_parcels
UNION ALL
SELECT * FROM sandoval_parcels
;
then
select count(*)
from staging.master_parcels
where parcel_id = 0;
returns 85553
Both tables have the same fields with the same data types. Also, none of the values for any field are missing, so there are no nulls. Why am I getting parcel_id = 0 when neither table has any rows with parcel_id = 0?
The order of the fields matters. Replace the * with explicit column names; otherwise the columns of the second query are matched to the columns of the first query by position, which is not necessarily the field you want.
CREATE TABLE staging.master_parcels
AS
SELECT parcel_id, field1 ... FROM bernalillo_parcels
UNION ALL
SELECT parcel_id, field1 ... FROM sandoval_parcels
;
UNION will merge the tables even if the column order is not the same, because it matches columns by position rather than by name. If all of the columns match and are in the same order, a plain UNION returns only distinct rows, so rows that are identical in both tables are not duplicated (UNION ALL keeps them). Having the same column order and data types is important.
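The positional matching is easy to reproduce on a small scale. The sketch below assumes Python's built-in sqlite3 module as a stand-in database; the tables county_a and county_b and their acres column are hypothetical, chosen so that the two tables have the same columns in a different order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Same columns, same types, but declared in a different order.
    CREATE TABLE county_a (parcel_id INTEGER, acres INTEGER);
    CREATE TABLE county_b (acres INTEGER, parcel_id INTEGER);
    INSERT INTO county_a VALUES (11, 0);   -- parcel_id = 11, acres = 0
    INSERT INTO county_b VALUES (0, 22);   -- acres = 0, parcel_id = 22
""")

# SELECT * pairs columns by position, so county_b.acres lands under parcel_id.
by_position = conn.execute(
    "SELECT * FROM county_a UNION ALL SELECT * FROM county_b"
).fetchall()

# Naming the columns in the same order in both branches keeps them aligned.
by_name = conn.execute(
    "SELECT parcel_id, acres FROM county_a "
    "UNION ALL SELECT parcel_id, acres FROM county_b"
).fetchall()

# Neither source table has parcel_id = 0, yet the SELECT * union does.
zero_ids = conn.execute(
    "SELECT COUNT(*) FROM ("
    "SELECT * FROM county_a UNION ALL SELECT * FROM county_b"
    ") WHERE parcel_id = 0"
).fetchone()[0]

print(by_position, by_name, zero_ids)
```

Comparing the column order of the two real tables (for example in the catalog or with a describe command) would confirm whether this is what happened with parcel_id.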

SELECT * FROM (SELECT)

There is a table "MAIN_TABLE" with columns "Table_Unique_Code" and "Table_name".
Also there are several tables with required data.
The task is to create SQL-query with parameter ("Table_Unique_Code"), which will select all data from the table determined by the "Table_Unique_code".
Something like
SELECT * FROM (*determine the name of the table by Table_Unique_code here*);
I tried
SELECT * FROM (SELECT table_name FROM MAIN_TABLE WHERE Table_Unique_Code=?)
but it doesn't work.
I work with OracleDB.

How would I store the result of a select statement so that I can reuse the results to join to different tables?

How would I store the result of a select statement so that I can reuse the results to join to different tables? This will also be inside a cursor.
Below is some pseudocode. In this example I have kept the SELECT statement simple, but in real life it is a long query with multiple joins. I have to use the identical SQL twice to join to 2 different tables, and since it is quite long and may change in the future, I want to be able to reuse it.
I have tried creating a view to store the results of the SELECT statement, but it seems I can't create a view inside the cursor loop; when I tried, I got an "Encountered the symbol "CREATE"" error.
DECLARE TYPE cur_type IS REF CURSOR;
CURSOR PT_Cursor IS
SELECT * FROM Table1;
PT_Cursor_Row PT_Cursor%ROWTYPE;
BEGIN
OPEN PT_Cursor;
LOOP
FETCH PT_Cursor INTO PT_Cursor_Row;
EXIT WHEN PT_Cursor%NOTFOUND;
Select ID From Table2 --this is actually a long complex query
INNER JOIN Table3 ON Table2.ID = Table3.ID
WHERE Table2.ID = PT_Cursor_Row.ID
Select * From Table2 --this is actually a long complex query
LEFT JOIN Table4 ON Table2.ID = Table4.ID
WHERE Table2.ID = PT_Cursor_Row.ID
END LOOP;
CLOSE PT_Cursor;
END;
One way to save the results from a query is via a temporary table.
Temp tables certainly are a viable option.
One can also use the WITH clause to 'reuse' result sets.
WITH
PEOPLE AS
(
SELECT 'FRED' NAME, 12 SHOE_SIZE FROM DUAL UNION ALL
SELECT 'WILMA' NAME, 4 SHOE_SIZE FROM DUAL UNION ALL
SELECT 'BARNEY' NAME, 10 SHOE_SIZE FROM DUAL UNION ALL
SELECT 'BETTY' NAME, 3 SHOE_SIZE FROM DUAL
),
WOMAN AS
(
SELECT 'BETTY' NAME FROM DUAL UNION ALL
SELECT 'WILMA' NAME FROM DUAL
)
SELECT 'WOMANS ', PEOPLE.NAME, PEOPLE.SHOE_SIZE
FROM PEOPLE, WOMAN
WHERE PEOPLE.NAME = WOMAN.NAME
UNION ALL
SELECT 'MENS ', PEOPLE.NAME, PEOPLE.SHOE_SIZE
FROM PEOPLE, WOMAN
WHERE PEOPLE.NAME = WOMAN.NAME(+)
AND WOMAN.NAME IS NULL

SQL select from either one or other table

Assume I have a table A with a lot of records (> 100'000) and a table B which has the same columns as A and about the same amount of data.
Is there a possibility with one clever select statement that I can either get all records of table A or all records of table B?
I am not so happy with the approach I currently use because of the performance:
select
column1
,column2
,column3
from (
select 'A' as tablename, a.* from table_a a
union
select 'B' as tablename, b.* from table_b b
) x
where
x.tablename = 'A'
Offhand, your approach seems like the only approach in standard SQL.
You will improve performance considerably by changing the UNION to UNION ALL. The UNION must read in the data from both tables and then eliminate duplicates, before returning any data.
The UNION ALL does not eliminate duplicates. How much better this performs depends on the database engine and possibly on tuning parameters.
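The difference in results between the two operators can be seen in a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database and hypothetical sample rows; it illustrates the semantics only and says nothing about how much faster UNION ALL is on a particular engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (column1 INTEGER, column2 TEXT);
    CREATE TABLE table_b (column1 INTEGER, column2 TEXT);
    INSERT INTO table_a VALUES (1, 'x'), (1, 'x'), (2, 'y');
    INSERT INTO table_b VALUES (2, 'y'), (3, 'z');
""")

# UNION must compare all rows from both inputs to drop duplicates.
union_rows = conn.execute(
    "SELECT column1, column2 FROM table_a "
    "UNION SELECT column1, column2 FROM table_b"
).fetchall()

# UNION ALL simply appends the second result to the first.
union_all_rows = conn.execute(
    "SELECT column1, column2 FROM table_a "
    "UNION ALL SELECT column1, column2 FROM table_b"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

Note that UNION also collapses the duplicate row inside table_a itself, so switching to UNION ALL can change the result as well as the cost when the inputs contain duplicates.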
Actually, there is another possibility. I don't know how well it will work, but you can try it:
-- :which is a hypothetical bind parameter holding 'A' or 'B'
select *
from ((select const.tableName, a.*
from A a cross join
(select 'A' as tableName where :which = 'A') const
) union all
(select const.tableName, b.*
from B b cross join
(select 'B' as tableName where :which = 'B') const
)
) t
No promises. But the idea is to cross join to a table with either 1 or 0 rows. This will not work in MySQL, because it does not allow WHERE clauses without a FROM. In other databases, you might need a tablename such as dual. This gives the query engine an opportunity to optimize away the read of the table entirely, when the subquery contains no records. Of course, just because you give a SQL engine the opportunity to optimize does not mean that it will.
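The cross join to a subquery with either 1 or 0 rows can be tried out in a small sketch. It assumes Python's built-in sqlite3 module as a stand-in database (SQLite accepts a WHERE clause without a FROM); the sample rows, the QUERY constant and the function name select_from are hypothetical. It demonstrates only that the result is correct, not that any given optimizer skips the unused table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (column1 INTEGER);
    CREATE TABLE table_b (column1 INTEGER);
    INSERT INTO table_a VALUES (1), (2);
    INSERT INTO table_b VALUES (10), (20), (30);
""")

# Each branch is cross joined to a one-row subquery that becomes empty
# when the requested table name does not match, emptying that branch.
QUERY = """
SELECT const.tablename, a.column1
FROM table_a AS a
CROSS JOIN (SELECT 'A' AS tablename WHERE ? = 'A') AS const
UNION ALL
SELECT const.tablename, b.column1
FROM table_b AS b
CROSS JOIN (SELECT 'B' AS tablename WHERE ? = 'B') AS const
"""


def select_from(conn, which):
    """Return rows from table_a when which == 'A', from table_b when 'B'."""
    # The same value is bound to both placeholders.
    return conn.execute(QUERY, (which, which)).fetchall()


print(select_from(conn, "A"))
print(select_from(conn, "B"))
```

Whether the engine actually avoids reading the other table has to be checked in the execution plan of the database in question.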
Also, the "*" is a bad idea, particularly in unions. But I've left it in because that is not the focus of the question.
You can try the following solution; it selects only from table tmp1 (because 'A' = 'A'):
select
*
from
tmp1
where
'A' = 'A'
union all
select
*
from
tmp2
where
'B' = 'A'
Check the execution plan to confirm that only tmp1 is read.
Hard to tell exactly what you want without a little more context, but perhaps something like this could work?
DECLARE @TableName nvarchar(15);
DECLARE @Query nvarchar(50);
SELECT @TableName = YourField
FROM YourTable
WHERE ...
SET @Query = 'SELECT * FROM ' + @TableName
EXEC (@Query)
Syntax might differ a bit depending on what RDBMS you are using, and more specifically what you are trying to accomplish, but might be a push in the right direction.
The proper way to do this and maintain performance requires some modification to your physical table design.
If you can add a column to each table that holds your indicator column and add a check constraint on that column, you can achieve "partition" elimination on your query.
DDL:
create table table_a (
c1 ...
,c2 ...
,c3 ...
,table_ind char(1) not null generated always as ('A')
,constraint ck_table_ind check (table_ind = 'A')
);
create table table_b (
c1 ...
,c2 ...
,c3 ...
,table_ind char(1) not null generated always as ('B')
,constraint ck_table_ind check (table_ind = 'B')
);
create view v1 as (
select * from table_a
union all
select * from table_b
);
If you execute the query select c1,c2,c3 from v1 where table_ind = 'A' the DB2 optimizer will use the check constraint to recognize that no rows in table_b can match the table_ind = 'A' predicate, so it will completely eliminate the table from the access plan.
This was used (and still is in some cases) before DB2 for Linux/UNIX/Windows supported Range Partitioning. You can read more about this technique in this research paper [PDF] written by some of the IBM DB2 developers back in 2002.