Oracle SQL Select from a Large Number of Tables

I apologize if this has been answered elsewhere (There has to be something out there on the topic), but I can't seem to find a concise answer to my question.
I am relatively new to SQL, and when I have worked with it I have only used basic statements. Now I am working with a pretty large database (in Oracle) and was asked to explore it a little bit on the development side to get more familiar.
One of the questions that was sent to me to explore the db involved finding a list of "Run Controls" that are associated with a particular user.
There is a single table that keeps track of the different types of "Run Controls" that exist via a field titled run_cntl_id. There are 18 rows in this table associated with the specific user, each with a unique run_cntl_id. For each of the values in the run_cntl_id field, there is at least one corresponding table with multiple rows (pretty standard database stuff). Unfortunately, I do not have any reference material to display the table relationships.
There are just under 3,000 tables that contain both the oprid (the user identifier) and the run_cntl_id (the type of "Run Control") fields (taken separately, there are 3,100 tables that contain the run_cntl_id field and 8,800 that contain the oprid field). There are approximately 65,000 tables in the database in total. Is there a way to search these 3,000 tables for the specific oprid and run_cntl_id?
If I wanted to perform this query on one table, I would use the following statement:
SELECT *
FROM PS_JRNL_COPY_REQ
WHERE oprid = 'jle0010'
AND run_cntl_id = 'Copy_Jrnl';
To rephrase the question:
Is there a way to perform this statement on the 3,000 tables mentioned above without running a single statement 3,000 times?

If all of the 3,000 tables have the same columns, then your query is going to look like:
SELECT *
FROM (
  SELECT * FROM PS_JRNL_COPY_REQ UNION ALL
  SELECT * FROM other_table_1 UNION ALL
  SELECT * FROM other_table_2 UNION ALL
  ... and so on for 2,997 more)
WHERE oprid = 'jle0010'
AND run_cntl_id = 'Copy_Jrnl';
No guarantees on it parsing or running though.
You can build that inline view by querying user_tables, of course.
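As a sketch of that dictionary-driven approach (assuming the columns are stored in the data dictionary as OPRID and RUN_CNTL_ID, and that selecting just those two shared columns is enough for your purposes, which also sidesteps the requirement that all 3,000 tables have identical columns):

```sql
-- Emit one "SELECT ... UNION ALL" line per table that has BOTH columns.
-- user_tab_columns covers your own schema; use all_tab_columns with an
-- OWNER filter if the tables belong to another schema.
SELECT 'SELECT ''' || table_name || ''' AS source_table, oprid, run_cntl_id FROM '
       || table_name || ' UNION ALL' AS stmt
FROM user_tab_columns
WHERE column_name IN ('OPRID', 'RUN_CNTL_ID')
GROUP BY table_name
HAVING COUNT(DISTINCT column_name) = 2;
```

Paste the generated lines into the inline view, delete the trailing UNION ALL from the last line, and wrap it in the outer WHERE clause shown above. Including a source_table literal in each branch also tells you which table each matching row came from.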

Related

Selecting a large number of rows by index using SQL

I am trying to select a number of rows by the value of a column called ID. I know you can do this pretty easily by:
SELECT col1, col2, col3 FROM mytable WHERE id IN (1,2,3,4,5...)
However, what if there are a few million IDs I want to select, and the IDs don't always follow a pattern (which means I can't use something like BETWEEN x AND y)? Does this select statement still work, or are there better ways of doing so?
The actual application is this: filters are specified by users and compared against attributes of the records. From those filters, we create a subset of the data that is of interest to a particular user. There are about 30 million records, each with roughly 3,000 attributes (stored across roughly 30 tables, every one of which has ID as its primary key), so every time someone queries their desired subset of records, we have to join many tables, apply the filters, and work out what the subset looks like. To avoid joining many tables every time, I thought it might be better to join the tables once, figure out the IDs of the selected subset, and then, each time a new query is made, simply select the relevant columns of the rows that match the filtered IDs.
This depends on the database and the interface you are using. For a few hundred or thousand values, no problem. But your question specifies millions. And that could start to get into limits on the length of the query -- either specified by the database, the tool you are using, or intermediate libraries.
If you have so many ids, I would strongly recommend that you load them into a table in the database with the id as the primary key. Then use join or exists to identify the rows in your table that match.
Often, such a list would be generated in the database anyway. In that case, you can use a subquery or CTE and just include that code in your final query.
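A rough sketch of that load-and-join approach (the table and column names here are illustrative, not from any particular schema):

```sql
-- Stage the ids once; the primary key gives the optimizer an index to join on.
CREATE TABLE wanted_ids (
    id INT PRIMARY KEY
);
-- Bulk-load the ids here (COPY, LOAD DATA, SQL*Loader, etc., depending on the DBMS).

-- Then join instead of passing a multi-million-item IN list:
SELECT t.col1, t.col2, t.col3
FROM mytable t
JOIN wanted_ids w ON w.id = t.id;

-- Or equivalently with EXISTS:
SELECT t.col1, t.col2, t.col3
FROM mytable t
WHERE EXISTS (SELECT 1 FROM wanted_ids w WHERE w.id = t.id);
```

Either form keeps the statement text a constant size regardless of how many IDs are staged, which is what avoids the query-length limits mentioned above.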

Need to create SQL query that calculates a field based on whether or not a linked record exists in a different table

I'm no database architect but my employer has saddled me with a problem that I'm not sure how to solve because I'm the "computer savvy geologist".
They have a database with a table called "wellExpenses". The expenses in that table are linked to a table called "wellInvoices". They want me to create a query that provides the subtotal of a well, subtracting the appropriate expense. However, there is not always an expense associated with each well.
So, while there may be 300 entries in the wellInvoices table, there are only about 75 entries in the wellExpenses table.
What I need to do is something like the pseudocode below:
If "wellNum" exists in "expenseTable":
    "wellSubtotal" (in wellTable) = "wellSubtotal" (in wellTable) - "expenseSubtotal" (in "expenseTable")
Else:
    "wellSubtotal" (in wellTable) = "wellSubtotal" (in wellTable)
How would I do this in SQL, or any other way in MS Access?
It is difficult to advise without knowing the structure of your tables, but since your wellExpenses table may not necessarily contain a corresponding record for each record in your wellInvoices table, a LEFT JOIN from wellInvoices to wellExpenses would be appropriate - something like:
SELECT
wellInvoices.wellNum,
wellInvoices.wellSubtotal-Nz(wellExpenses.expenseSubtotal,0) AS wellTotal
FROM
wellInvoices LEFT JOIN wellExpenses ON wellInvoices.wellNum = wellExpenses.wellNum

bigquery dataset design, multiple vs single tables for storing the same type of data

I'm planning to build a new ads system, and we are considering using Google BigQuery.
I'll quickly describe my data flow:
Each user will be able to create multiple ads (1 user, N ads).
I would like to store the ad impressions, and I thought of 2 options.
Option 1 is to create a single table for impressions, for example a table named Impressions with fields (userid, adsid, datetime, metadata fields...).
In this option, all of my impressions will be stored in one table.
Main pros: I'll be able to run big data queries quite easily.
Main cons: the table will be huge, and with multiple queries I'll end up paying too much.
Option 2 is to create a table per ad.
For example, ad id 1 will create Impression_1 with fields (datetime, metadata fields).
Pros: queries are cheaper, and each table is smaller.
Cons: to run big data queries I'll sometimes have to create a union, and things will get complex.
I wonder what your thoughts are regarding this?
In BigQuery it's easy to do this, because you can create a table per day and query only the tables you need.
And you have Table wildcard functions, which are a cost-effective way to query data from a specific set of tables. When you use a table wildcard function, BigQuery only accesses and charges you for tables that match the wildcard. Table wildcard functions are specified in the query's FROM clause.
Assuming you have some tables like:
mydata.people20140325
mydata.people20140326
mydata.people20140327
You can query like:
SELECT
name
FROM
(TABLE_DATE_RANGE(mydata.people,
TIMESTAMP('2014-03-25'),
TIMESTAMP('2014-03-27')))
WHERE
age >= 35
Also there are Table Decorators:
Table decorators support relative and absolute <time> values. Relative values are indicated by a negative number, and absolute values are indicated by a positive number.
To get a snapshot of the table at one hour ago:
SELECT COUNT(*) FROM [data-sensing-lab:gartner.seattle#-3600000]
There is also TABLE_QUERY, which you can use for more complex queries.
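TABLE_QUERY takes the dataset plus a string expression that is evaluated against each table_id in it; as a rough sketch against the mydata.people tables above:

```sql
SELECT
  name
FROM
  (TABLE_QUERY(mydata,
    'table_id CONTAINS "people"'))
WHERE
  age >= 35
```

Every table in mydata whose name matches the expression is scanned (and billed), so keep the table_id predicate as selective as you can.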

SQL Combine two tables in select statement

I have a situation where I want to combine two tables for queries in a select statement, and I haven't found a working solution yet.
The Situation:
Table A
Table B
Both A and B have identical fields but distinct populations. I have other queries that are pulling from each table separately.
I want to build a query that pulls from them as if they were one table. There are no instances of records being in both tables.
My research so far led me to think the FULL OUTER JOIN was what I wanted, but I'm not entirely sure how to do that when I'm not really joining them on any field and it failed in my tests. So I searched for append options thinking that might more accurately represent what I'm trying to do and the INSERT INTO looked promising but less so for select statements. Is there a way to do this or do I need to create a third table that's the combination of the first two to query from?
This is being done as an Excel VBA query to Access via DAO. I'm building up SQL statements piece by piece in my VBA code based on user-selected options and then pulling the results into Excel for use. As such, my hope is to be able to alter only the FROM clause (since I'm building up the queries piecemeal), so that the other parts of the select statement won't be impacted. Any suggestions or help will be much appreciated!
You can UNION the tables to do this:
SELECT StuffYouWant
FROM (SELECT *
FROM TableA
UNION ALL
SELECT *
FROM TableB) c
Something like this:
SELECT * FROM a
UNION
SELECT * FROM b
Make sure the a table and the b table have the same number of columns and that the corresponding columns have the same data types. Note that UNION also removes duplicate rows, which costs a sort; since the two populations are distinct, UNION ALL (as in the first answer) is cheaper.

How to select all fields in SQL joins without getting duplicate columns names?

Suppose I have one table A, with 10 fields. And Table B, with 5 fields.
B links to A via a column named "key", that exists both in A, and in B, with the same name ("key").
I am generating a generic piece of SQL that queries from a main table A, receives a table name parameter to join to, and selects all of A's fields plus B's.
In this case, I will get all the 15 fields I want, or more precisely - 16, because I get "key" twice, once from A and once from B.
What I want is to get only 15 fields (all fields from the main table + the ones existing in the generic table), without getting "key" twice.
Of course I could list the fields explicitly in the SELECT itself, but that thwarts my very objective of building generic SQL.
It really depends on which RDBMS you're using it against, and how you're assembling your dynamic SQL. For instance, if you're using Oracle and it's a PL/SQL procedure putting together your SQL, you're presumably querying USER_TAB_COLS or something like that. In that case, you could get your final list of columns names like
SELECT DISTINCT column_name
FROM user_tab_cols
WHERE table_name IN ('TABLEA', 'TABLEB');
but basically, we're going to need to know a lot more about how you're building your dynamic SQL.
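That said, if the dynamic SQL is being assembled in PL/SQL, one sketch (the names TABLEA, TABLEB, and the join column KEY are placeholders standing in for the question's tables, not known identifiers) is to take every column of A plus only the B columns that A does not already have:

```sql
-- Assemble "SELECT a.C1, ..., b.Cn, ... FROM tableA a JOIN tableB b ON a.key = b.key"
-- with A's columns in full and only the B columns absent from A, so the shared
-- join column appears once. LISTAGG is limited to 4000 bytes unless extended
-- data types are enabled, which is fine for a sketch of this size.
DECLARE
  l_cols VARCHAR2(4000);
  l_sql  VARCHAR2(4000);
BEGIN
  SELECT LISTAGG(col, ', ') WITHIN GROUP (ORDER BY col)
  INTO   l_cols
  FROM (
    SELECT 'a.' || column_name AS col
    FROM   user_tab_cols
    WHERE  table_name = 'TABLEA'
    UNION ALL
    SELECT 'b.' || column_name
    FROM   user_tab_cols
    WHERE  table_name = 'TABLEB'
    AND    column_name NOT IN (SELECT column_name
                               FROM   user_tab_cols
                               WHERE  table_name = 'TABLEA')
  );
  l_sql := 'SELECT ' || l_cols ||
           ' FROM tableA a JOIN tableB b ON a.key = b.key';
  DBMS_OUTPUT.PUT_LINE(l_sql);
END;
/
```

The generated statement can then be run with EXECUTE IMMEDIATE or handed back to the calling application.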
Re-thinking what I asked makes me conclude that this is not plausible. Selecting columns in a SELECT statement picks the columns we are interested in from the list of tables provided. In the cases my question is addressing, where the same column name exists in more than one of the tables involved, it would ideally be nice if the DB engine could return a unique list of fields. But for that it would have to decide by itself which column (and from which table) to choose among the matches, which is something the DB cannot do, because that choice depends solely on the user.