Load data into a Global Temporary Table - sql

I have two different Oracle sessions ("session A" and "session B") on the same Oracle user.
A Global Temporary Table is populated, in "session A", with about 320,000 records.
How can I quickly insert the same 320,000 records into the global temporary table of "session B"?
Thank you in advance for your kind suggestions!
EDIT: I forgot to specify that I am allowed to create ONLY GLOBAL TEMPORARY TABLES.
EDIT: I forgot to specify that I am not allowed to create database links.

The data within a temporary table is only ever visible to the current session, so I don't think there's a way to do exactly what you want; you'll need another approach.

DBMS_PIPE is the 'classic' mechanism for pushing information from one session to another. Session A would have to push data into the pipe and session B would have to pull it.
But generally the idea of databases is that sessions are independent and any commonality is in the preserved data. Going against this suggests you are using the wrong tool.
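If you do go the DBMS_PIPE route, a minimal sketch looks roughly like the following. It assumes the GTT is called MY_GTT with columns ID (NUMBER) and VAL (VARCHAR2); the pipe name and the -1 sentinel are purely illustrative, and the logic is folded into anonymous blocks rather than a packaged API.

-- Session A (hypothetical names): send one message per row, then a -1 sentinel.
DECLARE
  l_status INTEGER;
BEGIN
  FOR r IN (SELECT id, val FROM my_gtt) LOOP
    DBMS_PIPE.PACK_MESSAGE(r.id);
    DBMS_PIPE.PACK_MESSAGE(r.val);
    l_status := DBMS_PIPE.SEND_MESSAGE('my_gtt_pipe');         -- 0 = success
  END LOOP;
  DBMS_PIPE.PACK_MESSAGE(-1);                                   -- sentinel: no more rows
  l_status := DBMS_PIPE.SEND_MESSAGE('my_gtt_pipe');
END;
/

-- Session B: receive until the sentinel and insert into its own copy of the GTT.
DECLARE
  l_status INTEGER;
  l_id     my_gtt.id%TYPE;
  l_val    my_gtt.val%TYPE;
BEGIN
  LOOP
    l_status := DBMS_PIPE.RECEIVE_MESSAGE('my_gtt_pipe', 60);   -- wait up to 60 seconds
    EXIT WHEN l_status <> 0;                                    -- timeout or error
    DBMS_PIPE.UNPACK_MESSAGE(l_id);
    EXIT WHEN l_id = -1;                                        -- sentinel reached
    DBMS_PIPE.UNPACK_MESSAGE(l_val);
    INSERT INTO my_gtt (id, val) VALUES (l_id, l_val);
  END LOOP;
END;
/

Bear in mind this moves one row per message, so 320,000 rows will not be fast, and the pipe buffer is small, so session B needs to be consuming while session A sends. Treat it as a proof of concept rather than a recommendation.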

The data in a global temporary table is only visible to the session that inserted it. So you would have to run the same process that populated the table in session B.
Of course, the fact that you appear to want to access the same 320,000 rows in two different sessions would seem to imply that a global temporary table is not the appropriate data structure to be using. Perhaps you want to load that data into a permanent table (possibly along with some sort of identifier if you will have multiple Session A / Session B pairs). Or perhaps whatever logic Session B is running ought to be run by Session A.
And just taking a step back, since Oracle implements multi-version read consistency such that readers don't block writers and writers don't block readers, it would be very unusual to need to have a 320,000 row temporary table in the first place.
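As a hedged sketch of the permanent-table idea (the table, columns, and the run identifier 42 are invented, and it assumes the restriction on creating only GTTs can be relaxed for one shared table):

-- Ordinary heap table shared between sessions; run_id distinguishes batches.
CREATE TABLE shared_report_data (
  run_id NUMBER NOT NULL,
  id     NUMBER,
  val    VARCHAR2(100)
);

-- Session A publishes its 320,000 rows under an agreed run identifier.
INSERT INTO shared_report_data (run_id, id, val)
SELECT 42, id, val FROM my_gtt;
COMMIT;

-- Session B copies them into its own GTT, or simply queries them directly.
INSERT INTO my_gtt (id, val)
SELECT id, val FROM shared_report_data WHERE run_id = 42;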

Related

Avoiding write conflicts while re-sorting a table

I have a large table that I need to re-sort periodically. I am partly basing this on a suggestion I was given to stay away from using cluster keys since I am inserting data ordered differently (by time) from how I need it clustered (by ID), and that can cause re-clustering to get a little out of control.
Since I am writing to the table on an hourly basis, I am wary of causing problems with these two processes conflicting: if I CTAS to a newly sorted temp table and then swap the table name, it seems like I am opening the door to having a write on the source table not make it to the temp table.
I figure I can trigger a flag when I am re-sorting that causes the ETL to pause writing, but that seems a bit hacky and maybe fragile.
I was considering leveraging locking and transactions, but this doesn't seem to be the right use case for this since I don't think I'd be locking the table I am copying from while I write to a new table. Any advice on how to approach this?
I've asked some clarifying questions in the comments regarding the clustering that you are avoiding, but in regards to your sort, have you considered creating a nice 4XL warehouse and leveraging the INSERT OVERWRITE option back into itself? It'd look something like:
INSERT OVERWRITE INTO table SELECT * FROM table ORDER BY id;
Assuming that your table isn't hundreds of TB in size, this will complete rather quickly (inside an hour, I would guess), and any inserts into the table during that period will queue up and wait for it to finish.
There are some reasons to avoid automatic reclustering, but they're basically all the same reasons why you shouldn't set up a job to re-cluster frequently. You're making the database do all the same work, but without the built-in management of it.
If your table is big enough that you are seeing performance issues with the clustering by time, and you know that the ID column is the main way this table is filtered (in JOINs and WHERE clauses), then it is probably a good candidate for automatic clustering.
So I would recommend at least testing out a cluster key on the ID and then monitoring/comparing performance.
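In Snowflake that test is essentially a one-liner (the table name is illustrative), and the built-in clustering information function shows how well the key is working:

-- Define a clustering key on ID and let automatic clustering maintain it.
ALTER TABLE my_table CLUSTER BY (id);

-- Inspect clustering depth/overlap for the ID column.
SELECT SYSTEM$CLUSTERING_INFORMATION('my_table', '(id)');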
To give a brief answer to the question about resorting without conflicts as written:
I might recommend using a time column to re-sort records older than a given time (probably into a separate table). While it's sorting, you may get some new records, but you will be able to use that time column to marry up those new records with the now-sorted older records.
You might consider revoking INSERT, UPDATE, DELETE privileges on the original table within the same script or procedure that performs the CTAS creating the newly sorted copy of the table. After a successful swap you can re-enable the privileges for the roles that are used to perform updates.
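A rough sketch of that revoke-then-swap approach in Snowflake (role and table names are invented; any writes attempted during the window will fail with a permission error rather than be silently lost):

-- Block the ETL role while the sorted copy is built.
REVOKE INSERT, UPDATE, DELETE ON TABLE my_table FROM ROLE etl_role;

-- Build the newly sorted copy.
CREATE OR REPLACE TABLE my_table_sorted AS
SELECT * FROM my_table ORDER BY id;

-- Atomically exchange the two tables, then restore write access.
ALTER TABLE my_table SWAP WITH my_table_sorted;
GRANT INSERT, UPDATE, DELETE ON TABLE my_table TO ROLE etl_role;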

When this query is performed, do all the records get loaded into physical memory?

I have a table with millions of records; the table alone is around 6-7 GB in size. It is my application's log table, and it is growing really fast, which makes sense. Now I want to move records from the log table into a backup table. Here is the scenario and here is my question.
Table Log_A
INSERT INTO Log_B SELECT * FROM Log_A;
DELETE FROM Log_A;
I am using a PostgreSQL database. My questions are:
When this query is performed, do all the records from Log_A get loaded into physical memory? NOTE: both of the above queries run inside a stored procedure.
If not, then how does it work?
I hope this question applies to all databases.
I hope somebody can give me some idea about this.
In PostgreSQL, that's likely to execute a sequential scan, loading some records into shared_buffers, inserting them, writing the dirty buffers out, and carrying on.
All the records will pass through main memory, but they don't all have to be in memory at once. Because they all get read from disk using normal buffered reads (pread), this will affect the operating system's disk cache, potentially pushing other data out of the cache.
Other databases may vary. Some could execute the whole SELECT before processing the INSERT (though I'd be surprised if any serious ones did). Some use O_DIRECT reads or raw disk I/O to avoid affecting the OS cache, so the buffer cache effects might be different. I'd be amazed if any database relied on loading the whole SELECT into memory, though.
When you want to see what PostgreSQL is doing and how, the EXPLAIN and EXPLAIN (BUFFERS, ANALYZE) commands are quite useful. See the manual.
You may find writable common table expressions interesting for this purpose; they let you do all of this in one statement (sketched below). In this simple case there's probably little benefit, but they can be a big win in more complex data migrations.
BTW, make sure to run that pair of queries wrapped in BEGIN and COMMIT.
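For reference, a sketch of the writable-CTE version (PostgreSQL 9.1 or later); because it is a single statement, it is atomic on its own:

-- Delete from Log_A and insert the deleted rows into Log_B in one statement.
WITH moved AS (
    DELETE FROM Log_A
    RETURNING *
)
INSERT INTO Log_B
SELECT * FROM moved;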
Probably not.
Each record is individually processed; this particular query doesn't need to have knowledge of any of the other records to successfully execute. So the only record that needs to be in memory at any given moment is the one currently being processed.
But it really depends on whether or not the database thinks it can do it faster by loading up the whole table. Check the execution plan of the query.
If your setup allows it, just rename the old table and create a new empty one. Much faster, obviously, as no copying is done at all.
ALTER TABLE log_a RENAME TO log_b;
CREATE TABLE log_a (LIKE log_b INCLUDING ALL);
The LIKE clause copies the structure of the (now renamed) old table. INCLUDING ALL includes defaults, constraints, indexes, ...
Foreign key constraints or views depending on the table or other less common dependencies (but not queries in plpgsql functions) might be a hurdle for this route. You would have to recreate those to have them point to the new table. But a logging table like you describe probably carries no such dependencies.
This acquires an exclusive lock on the table. I assume typical write access will be INSERT-only in your case? One way to deal with concurrent access would then be to create the new table in a different schema and alter the search_path for your application user. Then the application starts writing to the new table without concurrency issues. Of course, you wouldn't schema-qualify the table name in your INSERT statements for this to take effect.
CREATE SCHEMA log20121018;
CREATE TABLE log20121018.log_a (LIKE log20121011.log_a INCLUDING ALL);
ALTER ROLE myrole SET search_path = app, log20121018, public;
Or alter the search_path setting at whatever level is effective for you:
globally, per database, per role, per session, per function ...
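For example (the database name mydb and the function log_insert() are invented), the same setting at a few of those levels:

SET search_path = app, log20121018, public;                              -- this session only
ALTER DATABASE mydb SET search_path = app, log20121018, public;          -- every new session in mydb
ALTER FUNCTION log_insert() SET search_path = app, log20121018, public;  -- inside one function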

Oracle - To Global Temp or NOT to Global Temp

So let's say I have a few million records to pull from in order to generate some reports, and instead of running my reports off the live table, I create a temp table which I can then index and use for further data extraction.
I know cached tables tend to be quicker, seeing as the data is stored in memory, but I'm curious to know if there are instances where using a physical temp table is better than Global Temporary Tables, and why. What kind of scenario would make one better than the other when dealing with larger volumes of data?
Global Temporary Tables in Oracle are not like temporary tables in SQL Server. They are not cached in memory; they are written to the temporary tablespace.
If you are handling a large amount of data and retaining it for a reasonable amount of time - which seems likely as you want to build additional indexes - I think you should use a regular table. This is even more the case if your scenario has a single session, perhaps a background job, working with the data.
I use Subquery Factoring before I consider temp tables. If there's a need for reuse in various functions or procedures, I turn it into a view (which can turn into a materialized view depending on the data returned).
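A small illustration of subquery factoring (table and column names are invented): the factored block acts like a statement-scoped staging area that the optimizer may choose to materialize.

-- Factor the expensive aggregation once and reuse it within the same statement.
WITH order_totals AS (
  SELECT customer_id, SUM(amount) AS total_amount
  FROM   orders
  GROUP  BY customer_id
)
SELECT c.customer_name, t.total_amount
FROM   customers c
JOIN   order_totals t ON t.customer_id = c.customer_id
WHERE  t.total_amount > 10000;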
According to asktom:
...temp table and global temp table are synonymous in Oracle.
For reporting, temporary tables are helpful in that data can only be seen by the session that created it, meaning that you shouldn't have to worry about any concurrency issues.
With a non-temporary table you need to add a session handle/identifier to the table in order to distinguish between sessions.
The primary difference between ordinary (heap) tables and global temp tables in Oracle is their visibility and volatility:
Once rows are committed to an ordinary table they are visible to other sessions and are retained until deleted.
Rows in a global temp table are never visible to other sessions, and are not retained after the session ends.
So the choice should primarily be down to what your application design needs, rather than just about performance (not to say performance isn't important).
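To make that volatility concrete, here is a sketch of the two ON COMMIT variants (the table name is invented). Either way, other sessions never see the rows, even after a COMMIT.

-- Rows survive a COMMIT but disappear when the session ends.
CREATE GLOBAL TEMPORARY TABLE report_stage (
  id  NUMBER,
  val VARCHAR2(100)
) ON COMMIT PRESERVE ROWS;

-- Alternative: rows are removed at the end of each transaction.
-- CREATE GLOBAL TEMPORARY TABLE report_stage (...) ON COMMIT DELETE ROWS;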
The contents of an Oracle temporary table are only visible within the session that created the data and will disappear when the session ends. So you will have to copy the data for every report.
Is this report you are doing a one-time operation, or will the report be run periodically? Copying large quantities of data just to run a report does not seem like a good solution to me. Why not run the report on the original data?
If you can't use the original tables, you may be able to create a materialized view so the latest data is available when you need it.

Persistent temp tables in SQL?

Is it possible to have a 'persistent' temp table in MS-SQL? What I mean is that I currently have a background task which generates a global temp table that is used by a variety of other tasks (which is why I made it global). Unfortunately, if the table becomes unused, it gets deleted by SQL Server automatically. This is gracefully handled by my system, since it just queues the table up to be rebuilt, but ideally I would like it to be built just once a day. So, ideally, I could set some timeout parameter, like "if nothing touches this for 1 hour, then delete it".
I really don't want it in my existing DB because it will cause loads more headaches related to managing the DB (fragmentation, log growth, etc.), since it's effectively rollup data that is only useful for a 24-hour period and takes up more than a gigabyte of HD space.
Worst case my plan is to create another DB on the same drive as tempdb, call it something like PseudoTempDB, and just handle the dropping myself.
Any insights would be greatly appreciated!
If you create a table as tempdb.dbo.TempTable, it won't get dropped until:
a - SQL Server is restarted
b - You explicitly drop it
If you would like to have it always available, you could create that table in model, so that it gets copied to tempdb during a restart (but it will also be created in any new database you create afterwards, so you would have to delete it manually), or use a startup stored procedure to have it created. There is no way of persisting the data through restarts, though.
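A hedged sketch of the startup-procedure variant (the procedure and table names are invented); sp_procoption marks a procedure in master to run at instance startup, and the procedure recreates the global temp table:

-- Procedure in master that rebuilds the global temp table at startup.
USE master;
GO
CREATE PROCEDURE dbo.RebuildGlobalTemp
AS
BEGIN
    CREATE TABLE ##DailyRollup (
        RollupDate date  NOT NULL,
        Amount     money NOT NULL
    );
END;
GO
-- Run it automatically every time SQL Server starts.
EXEC sp_procoption @ProcName = N'dbo.RebuildGlobalTemp',
                   @OptionName = 'startup',
                   @OptionValue = 'on';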
I would go with your plan B, "create another DB on the same drive as tempdb, call it something like PseudoTempDB, and just handle the dropping myself."
How about creating a permanent table? Say, MyTable. Once every 24 hours, refresh the data like this:
Create a new table MyTableNew and populate it
Within a transaction, drop MyTable and use sp_rename to rename MyTableNew to MyTable
This way, you're recreating the table every day.
If you're worried about log files, store the table in a different database and set it to Recovery Model: Simple.
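A sketch of that daily swap (object names are invented), with the rename done inside a transaction:

-- Build tomorrow's copy, then swap it in.
SELECT * INTO dbo.MyTableNew FROM dbo.RollupSource;   -- or INSERT ... SELECT from your rollup query

BEGIN TRANSACTION;
    DROP TABLE dbo.MyTable;
    EXEC sp_rename 'dbo.MyTableNew', 'MyTable';
COMMIT TRANSACTION;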
I have to admit to doing a double-take on this question: "persistent" and "temp" don't usually go together! How about a little out-of-the-box thinking? Perhaps your background task could periodically run a trivial query to keep SQL from marking the table as unused. That way, you'd take pretty direct control over creation and tear down.
After 20 years of experience dealing with all major RDBMS in existence, I can only suggest a couple of things for your consideration:
Note the oxymoronic concepts: "persistent" and "temp" are complete opposites. Choose one, and one only.
You're not doing your database any favors by writing data to the temp DB on a manual, semi-permanent, user-driven basis. Normal tablespaces (i.e. user ones) are already there for that purpose. The temp DB is for temporary things.
If you already know that such a table will be permanently used ("daily basis" IS permanent), then create it as a normal table on a user database/schema.
Every time you delete and recreate the very same table, you're fragmenting your whole database, and you get the perverse bonus of never giving the DB engine's optimizer a chance to assist you with any sort of optimization. Instead, try truncating the table. Your rollback segments will thank you for that small relief, and the disk space will probably still be allocated for when you repopulate it the next day. You can force that desired behavior by specifying a separate tablespace and datafile for that table alone.
Finally, and most important of all: stop mortifying yourself and your DB engine over a measly 1 GB of data. You're wasting CPU and I/O cycles, and adding latency, fragmentation, and so on, for the sake of saving literally 0.02 cents' worth of hardware real estate. Talk about dropping to the floor in a tuxedo to pick up a brown cent. 😂

Database history for client usage

I'm trying to figure out what would be the best way to have a history on a database, to track any Insert/Delete/Update that is done. The history data will need to be coded into the front-end since it will be used by the users. Creating "history tables" (a copy of each table used to store history) is not a good way to do this, since the data is spread across multiple tables.
At this point in time, my best idea is to create a few history tables whose structure reflects the output I want to show to the users. Whenever a change is made to the relevant tables, I would update these history tables with the data as well.
I'm trying to figure out what the best way to go about this would be. Any suggestions will be appreciated.
I am using Oracle + VB.NET
I have very successfully used a model where every table has an audit copy: the same table with a few additional fields (timestamp, user id, operation type), and 3 triggers on the original table for insert/update/delete.
I think this is a very good way of handling this, because tables and triggers can be generated from a model and there is little overhead from a management perspective.
The application can use the tables to show an audit history to the user (read-only).
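A minimal sketch of one such audit copy and its trigger (the table and column names are invented, and the three triggers are folded into one combined trigger here for brevity):

-- Assumed base table: customers(customer_id NUMBER, customer_name VARCHAR2(100)).
CREATE TABLE customers_aud (
  customer_id   NUMBER,
  customer_name VARCHAR2(100),
  aud_ts        TIMESTAMP,
  aud_user      VARCHAR2(30),
  aud_op        VARCHAR2(1)        -- 'I', 'U' or 'D'
);

CREATE OR REPLACE TRIGGER customers_aud_trg
AFTER INSERT OR UPDATE OR DELETE ON customers
FOR EACH ROW
BEGIN
  IF DELETING THEN
    INSERT INTO customers_aud VALUES (:OLD.customer_id, :OLD.customer_name, SYSTIMESTAMP, USER, 'D');
  ELSIF UPDATING THEN
    INSERT INTO customers_aud VALUES (:NEW.customer_id, :NEW.customer_name, SYSTIMESTAMP, USER, 'U');
  ELSE
    INSERT INTO customers_aud VALUES (:NEW.customer_id, :NEW.customer_name, SYSTIMESTAMP, USER, 'I');
  END IF;
END;
/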
We've got that requirement in our systems. We added two tables, a header and a detail, called AuditRow and AuditField. AuditRow contains one row per row changed in any other table, and AuditField contains one row per changed column, with the old value and the new value.
We have a trigger on every table that writes a header row (AuditRow) and the needed detail rows (one per changed column) on each insert/update/delete. This system does rely on the fact that we have a GUID on every table that can uniquely represent the row. It doesn't have to be the "business" or "primary" key, but it is a unique identifier for that row so we can identify it in the audit tables. Works like a champ. Overkill? Perhaps, but we've never had a problem with auditors. :-)
And yes, the Audit tables are by far the largest tables in the system.
If you are lucky enough to be on Oracle 11g, you could also use the Flashback Data Archive.
Personally, I would stay away from triggers. They can be a nightmare when it comes to debugging and not necessarily the best if you are looking to scale out.
If you are using a PL/SQL API to do the INSERTs/UPDATEs/DELETEs, you could manage this with a simple shift in design without the need (up front) for history tables.
All you need are 2 extra columns, DATE_FROM and DATE_THRU. When a record is INSERTed, the DATE_THRU is left NULL. If that record is UPDATEd or DELETEd, just "end date" the record by making DATE_THRU the current date/time (SYSDATE). Showing the history is as simple as selecting from the table, the one record where DATE_THRU is NULL will be your current or active record.
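A sketch of that end-dating inside such an API (the procedure, table, and column names are invented); an update becomes: close the current row, then insert the new version.

CREATE OR REPLACE PROCEDURE update_customer_name (
  p_customer_id IN NUMBER,
  p_new_name    IN VARCHAR2
) AS
BEGIN
  -- End-date the currently active row.
  UPDATE customers
     SET date_thru = SYSDATE
   WHERE customer_id = p_customer_id
     AND date_thru IS NULL;

  -- Insert the new, currently active version.
  INSERT INTO customers (customer_id, customer_name, date_from, date_thru)
  VALUES (p_customer_id, p_new_name, SYSDATE, NULL);
END;
/

-- Current state: the row where DATE_THRU is NULL; full history: every row for that key.
SELECT * FROM customers WHERE customer_id = 42 AND date_thru IS NULL;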
Now if you expect a high volume of changes, writing off the old record to a history table would be preferable, but I still wouldn't manage it with triggers, I'd do it with the API.
Hope that helps.