What are the benefits of a Make Table vs a Select query in Access?

I know you can run SELECT queries on top of SELECT queries in Access, but the application also provides the Make Table query type.
I'm wondering what the benefits/reasons for using Make Table might be?

You would usually use Make Table for performance reasons. If you have a fairly complex query that returns a subset of your table's data, and whose results you need to retrieve more than once, it can be expensive to re-run that query every time.
Using Make Table allows you to incur the cost of running the expensive query once, and make a copy of the query results into a table. Querying this copy would then be a lot less expensive than running your original expensive query.
This is usually a good option when you don't expect your original data to change frequently, or if you don't care that you are working off a copy of the data that may not be 100% up-to-date with the original.
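As a minimal sketch (the table, field, and query names here are invented for illustration), a make table query in Access SQL is just a SELECT ... INTO statement, so the expensive aggregation runs once and later queries read the snapshot:

SELECT c.CustomerID, c.CustomerName, Sum(o.OrderTotal) AS TotalSales
INTO tblSalesSnapshot
FROM Customers AS c INNER JOIN Orders AS o
  ON c.CustomerID = o.CustomerID
WHERE o.OrderDate < #1/1/2024#
GROUP BY c.CustomerID, c.CustomerName;

Subsequent reports can then select from tblSalesSnapshot directly, which is far cheaper than repeating the join and aggregation.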
Note what the Microsoft article "Create a make table query" has to say:
Typically, you create make table queries when you need to copy or archive data. For example, suppose you have a table (or tables) of past sales data, and you use that data in reports. The sales figures cannot change because the transactions are at least one day old, and constantly running a query to retrieve the data can take time — especially if you run a complex query against a large data store. Loading the data into a separate table and using that table as a data source can reduce workload and provide a convenient data archive. As you proceed, remember that the data in your new table is strictly a snapshot; it has no relationship or connection to its source table or tables.

The main issue here is that a make table query creates a table. And when you are done with that table, the effort and time to delete it, and to recover the VERY LARGE increase in the database file, still has to occur. For general reports, a plain query of the data makes much more sense. A comparison would be to build a NEW garage every time you want to park your car.
The database engine and query system can fetch and pull rows at a very high rate, and those results can then be rendered into a report or form without a temp table ever being created. It makes little sense to go through all the trouble of having the system create a WHOLE NEW table for results that can easily be sent straight to a report.
In other words, creating a whole table just to display or use data that the database engine has already fetched and returned makes little sense. A table is a set of rows that holds data that can be updated, and the results are permanent. A query is an "on the fly" result set, a subset of the data that only exists in memory and is discarded after you use it.
So for general reporting and display of data, it makes no sense to create a temp table. A MUCH WORSE issue is that if you have two users wanting to run a report, and they each need different results but you send both result sets to the SAME temp table, then you have a big mess and a collision between the two users. So the use of a temp table in Access for the most part makes little sense, and this is EVEN MORE so when working in a multi-user environment. And as noted, once the table is created, then after you are done you need to delete and remove it. With many users in a multi-user database this becomes even more of a problem.
However, as pointed out, in a multi-user environment where the resulting data needs additional processing, sending the results to a temp table can be of use. This approach, however, assumes that EACH USER has their own front end, their own copy of the application side. Better still, the temp table should be created in that local front end that resides on each computer. Since the application part (front end) is placed on each computer, creating a temp table then does not occur in the production database (back end), and as a result multiple users can work correctly without each of them creating a temp table in the production back-end database. So if one is to adopt a make table query in a multi-user application, it likely should target each local workstation and not the back-end database.
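As a rough sketch of that point (the query and table names are hypothetical): an Access make table query creates the new table in the current database by default, so when it runs inside each user's local front end, every user gets their own private copy:

SELECT * INTO tblTempReport
FROM qryExpensiveReport;

Only if you added the optional IN clause, e.g. SELECT * INTO tblTempReport IN 'C:\Shared\BackEnd.accdb' FROM qryExpensiveReport, would the table land in the shared back end, which is exactly the multi-user collision you want to avoid.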
Thus, for the most part, make table queries and the reporting/querying of data are VERY different goals and tasks. You don't want to, nor as a general rule should you, create a whole brand new table for a simple query. In a multi-user database system the users might run hundreds of reports in a given day, and FEW if any systems will send such data to a temp table in place of sending the query results directly to the report.

It creates a table, which is useful if you need one for temporary use: for example, when you have to modify the data for calculations or further processing without disturbing the original data.

Related

Is it possible to implement point in time recovery (PITR) in PostgreSQL for a single table?

Let's say I have a database with lots of tables, but there's one big table that's being updated regularly. At any given point in time this table contains billions of rows, and it is updated so regularly that we can expect a 100% refresh of the table by the end of each quarter. So the volume of data being moved around is on the order of tens of billions of rows. Because this table changes so constantly, I want to implement PITR, but only for this one table. I have two options:
Hack PostgreSQL's in-house PITR to apply only for one table.
Build it myself by creating a base backup, setting up continuous archiving, and using a Python script to execute the log of SQL statements up to a point in time (or using PostgreSQL's EXECUTE statement to loop through the archive). The big con with this is that it won't have the timeline functionality.
My problem is, I don't know if option 1 is even possible, and I don't know if option 2 even makes sense (looping through billions of rows sounds like it defeats the purpose of PITR, which is speed and convenience.) What other options do I have?
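A third option worth naming, not PITR proper but often a practical substitute when only one table matters, is trigger-based change capture: keep a history table for just that table and query or replay it "as of" a timestamp. A minimal sketch, with hypothetical table and column names (and note that at billions of rows the history table itself becomes very large):

CREATE TABLE big_table_history (
    changed_at  timestamptz NOT NULL DEFAULT now(),
    operation   text        NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    row_data    jsonb       NOT NULL    -- full row image
);

CREATE OR REPLACE FUNCTION big_table_audit() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO big_table_history (operation, row_data) VALUES (TG_OP, to_jsonb(OLD));
    ELSE
        INSERT INTO big_table_history (operation, row_data) VALUES (TG_OP, to_jsonb(NEW));
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER big_table_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON big_table
FOR EACH ROW EXECUTE FUNCTION big_table_audit();  -- EXECUTE PROCEDURE on PostgreSQL 10 and older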

When to use internal tables?

So, I have read that using internal tables increases the performance of the program and that we should perform as few operations on DB tables as possible. But I have started working on a project that does not use internal tables at all.
Some details:
It is a scanner that adds or removes products in/from a store. First the primary key is checked (to see if that type of product exists) and then the product is added or removed. We use ‘Insert Into’ and ‘Delete From’ to add/remove the products directly from the DB table.
I have not asked why they do not use internal tables because I do not have a better solution so far.
Here's what I have so far: insert all products into an internal table, and place the deleted products in another internal table.
FORM update.
  MODIFY zop_db_table FROM TABLE gt_table.              " add all new products
  LOOP AT gt_deleted INTO gs_deleted.
    DELETE FROM zop_db_table WHERE index_nr = gs_deleted-index_nr.
  ENDLOOP.                                              " delete removed products
ENDFORM.
But when can I perform this update?
I could add a 'Save' button to perform the update, but then there is the risk that the user forgets to save large amounts of data, or drops the scanner and shuts it down, or similar situations. So this is clearly not a good solution.
My final question is: Is there a (good) way to implement internal tables in a project like this?
Internal tables should be used for data processing, like lists or arrays in other languages (C#, Java, ...). From a performance and system load perspective it is preferable to first load all the data you need into an internal table, then process that internal table instead of loading individual records from the database.
But that is mostly true for reporting, which is probably the most common type of custom ABAP program. You often see developers use SELECT...ENDSELECT statements, which in effect loop over a database table, transferring row after row to the report, one at a time. That is extremely slow compared to reading all records at once into an itab and then looping over the itab. More than once I've cut the execution time of a report down to a fraction just by eliminating round trips to the database.
If you have a good reason to read from the database or update records immediately, you should do so. If you can safely delay updates and deletes to a point in time where you can process all of them together, without risking inconsistencies, I'd consider that an improvement. But if there is a good reason (like consistency or data loss) to update immediately, do it.
Update: as @vwegert mentioned regarding the SELECT...ENDSELECT statement, the statement doesn't actually create individual database queries for each row. The database interface of the application server optimizes the query, transferring rows in bulk to the application server. From there the records are transported to the ABAP report one by one (because in the report there is only the work area to store a single row), which has a significant performance impact, especially for queries with large result sets. A SELECT into an internal table can transport all rows directly to the ABAP report (as long as there is enough memory to hold them), since there is now an internal table in the report to hold those records.

Microsoft Access Equivalent of "Create Table #Tablename"?

I want to create a temporary table, purely to make a simpler user interface feature, and it has to be visible only to the current user. I can't see how to achieve this in Access: apparently CREATE TABLE in Access is not like standard SQL. There is a "TEMPORARY" option, however (according to http://msdn.microsoft.com/en-us/library/bb177893(v=office.12).aspx):
"When a TEMPORARY table is created it is visible only within the session in which it was created. It is automatically deleted when the session is terminated. Temporary tables can be accessed by more than one user."
It's that last sentence that I have a problem with. In "normal" databases you would use SQL like "CREATE TABLE #whatever"... so I want to imitate that with Access.
It's a bit long-winded to explain the whole situation, apologies if I'm not being clear enough as I'm trying to avoid writing a stupid amount of unnecessary detail: essentially what I have is an "employee" record with a number of "tasks" they perform. My "EmployeeTasks" table has a "percentage" field for each task (i.e., in plain English, "employee A (f.key) performs task B (f.key) X% of the day").
To maintain that information in the user interface, it's a bit "messy" to ask users to manually enter percentages... to my mind, people don't really think "well, I work 7.6 hours a day, I do 10 tasks, I do this task 3.528% of the time, this task 9.813% of the time..." etc. What I want to present to the user, in a continuous form, is their list of tasks with their task effort expressed as hours and minutes per day.
So my theory is: create a temporary table including the hours-and-minutes extrapolation, display a form based on that table, let the user edit those hours and minutes, and have an "update" function take those figures and convert them back to percentages based on the sum. This way the user doesn't have to worry about making sure all hours and minutes add up to exactly 7.6 hours, and they don't have to worry about all percentages adding to 1, etc. There's a large acceptable margin of error (because obviously most people don't perform tasks for a regimented amount of time; we're only gathering rough information).
It seems the easiest approach is to create a form based on a temp table [EDIT ADDITION]: but if more than one user edits a different employee, they would be overwriting each other's temporary tables unless I can create a user-unique table somehow [/EDIT]. Another method, I guess, would be to dynamically create a list of controls for each task and read from them, but that would get messy quickly when employees have a large number of tasks. Thanks for your help, Simon
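For what it's worth, both directions of that conversion are simple enough to express as Access SQL. Assuming (hypothetically) an EmployeeTasks table with EmployeeID, TaskID and Percentage, and a 7.6-hour day, the extrapolation into a temp table could look like:

SELECT EmployeeID, TaskID, Percentage * 7.6 AS EffortHours
INTO tblTempTaskHours
FROM EmployeeTasks
WHERE EmployeeID = 123;

and, after the user has edited the hours, converting back based on the sum could look something like:

UPDATE EmployeeTasks AS et INNER JOIN tblTempTaskHours AS t
  ON et.EmployeeID = t.EmployeeID AND et.TaskID = t.TaskID
SET et.Percentage = t.EffortHours /
    DSum("EffortHours", "tblTempTaskHours", "EmployeeID=" & et.EmployeeID);

(DSum is used here because a correlated aggregate subquery would make the Access update query non-updateable; this is a sketch against guessed names, not tested against your schema.)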
It is true that Access SQL does not support CREATE TABLE #TableName to create session-specific temporary tables like T-SQL does, but practically speaking it doesn't need to. Here's why:
For your Microsoft Access database application to support multiple concurrent users
you must split your database into a front-end database file (containing queries, forms, reports, code) linked to a back-end database file (containing just the data tables), and
each user must have their own (local) copy of the front-end database file.
No two users should ever directly open the same .mdb or .accdb file at the same time, e.g., by double-clicking it or doing File > Open in Access.
Your VBA code in the front-end can create a temporary table in the front-end and your application can use it. Access allows us to build queries that JOIN local tables with linked tables, so the (local) temporary table can be used like a #Temporary table in T-SQL.
Since each user has their own copy of the front-end file (point #2, above), they each have their own copy of any temporary tables that your application might create.
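As a sketch of that last point (names invented): code in the front end can build a local working table with a make table query and then join it against linked back-end tables in an ordinary query, much as you would join a #temp table in T-SQL:

SELECT * INTO tblWorkOrders
FROM qryExpensiveSelection;

SELECT w.OrderID, c.CustomerName, w.OrderTotal
FROM tblWorkOrders AS w INNER JOIN Customers AS c
  ON w.CustomerID = c.CustomerID;

Here tblWorkOrders lives in the user's local front end while Customers is a linked table in the shared back end.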

Oracle - To Global Temp or NOT to Global Temp

So let's say I have a few million records to pull from in order to generate some reports, and instead of running my reports off the live table, I create a temp table where I can then create my indexes and use it for further data extraction.
I know cached tables tend to be faster, seeing as the data is stored in memory, but I'm curious to know if there are instances where using a physical temp table is better than Global Temporary Tables, and why. What kind of scenario would make one better than the other when dealing with larger volumes of data?
Global Temporary Tables in Oracle are not like temporary tables in SQL Server. They are not cached in memory, they are written to the temporary tablespace.
If you are handling a large amount of data and retaining it for a reasonable amount of time - which seems likely as you want to build additional indexes - I think you should use a regular table. This is even more the case if your scenario has a single session, perhaps a background job, working with the data.
I use Subquery Factoring before I consider temp tables. If there's a need for reuse in various functions or procedures, I turn it into a view (which can turn into a materialized view depending on the data returned).
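Subquery factoring here means the WITH clause; a minimal sketch (table and column names invented):

WITH recent_sales AS (
    SELECT customer_id, SUM(amount) AS total_amount
    FROM   sales
    WHERE  sale_date >= DATE '2024-01-01'
    GROUP  BY customer_id
)
SELECT c.customer_name, r.total_amount
FROM   customers c
JOIN   recent_sales r ON r.customer_id = c.customer_id;

The factored subquery can be referenced several times within the one statement, which often removes the need for a temp table at all.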
According to asktom:
...temp table and global temp table are synonymous in Oracle.
For reporting, temporary tables are helpful in that data can only be seen by the session that created it, meaning that you shouldn't have to worry about any concurrency issues.
With a non-temporary table you need to add a session handle/identifier to the table in order to distinguish between sessions.
The primary difference between ordinary (heap) tables and global temp tables in Oracle is their visibility and volatility:
Once rows are committed to an ordinary table they are visible to other sessions and are retained until deleted.
Rows in a global temp table are never visible to other sessions, and are not retained after the session ends.
So the choice should primarily be down to what your application design needs, rather than just about performance (not to say performance isn't important).
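For reference, a minimal sketch of the global temporary table being discussed (names invented); the DDL is created once, and each session then sees only its own rows:

CREATE GLOBAL TEMPORARY TABLE report_stage (
    customer_id   NUMBER,
    total_amount  NUMBER
) ON COMMIT PRESERVE ROWS;

ON COMMIT PRESERVE ROWS keeps the rows for the life of the session; ON COMMIT DELETE ROWS would clear them at every commit.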
The contents of an Oracle temporary table are only visible within the session that created the data and will disappear when the session ends. So you will have to copy the data for every report.
Is this report you are doing a one time operation or will the report be run periodically? Copying large quantities of data just to run a report does not seem a good solution to me. Why not run the report on the original data?
If you can't use the original tables you may be able to create a materialized view so the latest data is available when you need it.
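A minimal sketch of that suggestion (names invented): a materialized view refreshed on demand before the report runs, assuming that refresh schedule fits your needs:

CREATE MATERIALIZED VIEW sales_report_mv
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT customer_id, SUM(amount) AS total_amount
FROM   sales
GROUP  BY customer_id;

-- refresh just before reporting (e.g. from SQL*Plus)
EXEC DBMS_MVIEW.REFRESH('SALES_REPORT_MV');

Indexes can be created on the materialized view just as on an ordinary table.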

How should I keep accurate records summarising multiple tables?

I have a normalized database and need to produce web-based reports frequently that involve joins across multiple tables. These queries are taking too long, so I'd like to keep the results computed so that I can load pages quickly. There are frequent updates to the tables I am summarising, and I need the summary to reflect all updates so far.
All tables have autoincrement primary integer keys, I almost always add new rows, and I can arrange to clear the computed results if they change.
I approached a similar problem where I needed a summary of a single table by arranging to iterate over each row in the table, keeping track of the iterator state and the highest primary key (i.e. the "high-water mark") seen. That's fine for a single table, but for multiple tables I'd end up keeping one high-water value per table, and that feels complicated. Alternatively I could denormalise down to one table (with fairly extensive application changes), which feels like a step backwards and would probably grow my database from about 5GB to about 20GB.
(I'm using sqlite3 at the moment, but MySQL is also an option).
I see two approaches:
You move the data into a separate, denormalized database with some precalculation, optimized for quick access and reporting (sounds like a small data warehouse). This implies you have to write some jobs (scripts, a separate application, etc.) that copy and transform the data from the source to the destination. Depending on the way you want the copying to be done (full/incremental), the frequency of copying, and the complexity of the data model (both source and destination), it might take a while to implement and then to optimize the process. It has the advantage that it leaves your source database untouched.
You keep the current database, but you denormalize it. As you said, this might imply changes to the logic of the application (but you might find a way to minimize the impact on the logic using the database; you know the situation better than me :) ).
Can the reports be refreshed incrementally, or is it a full recalculation to rework the report? If it has to be a full recalculation then you basically just want to cache the result set until the next refresh is required. You can create some tables to contain the report output (and a metadata table to define which report output versions are available), but most of the time this is overkill and you are better off just saving the query results off to a file or other cache store.
If it is an incremental refresh then you need the PK ranges to work with anyhow, so you would want something like your high water mark data (except you may want to store min/max pairs).
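A minimal sketch of that high-water-mark approach in SQLite-flavoured SQL (table and column names are invented, and summary is assumed to have customer_id as its primary key; MySQL would use ON DUPLICATE KEY UPDATE instead of the ON CONFLICT clause). Run the refresh inside one transaction so new rows cannot slip between the steps:

-- one row per source table, tracking the last processed id
CREATE TABLE refresh_state (
    source_table TEXT PRIMARY KEY,
    last_id      INTEGER NOT NULL DEFAULT 0
);

-- fold only the new order rows into the summary
INSERT INTO summary (customer_id, total_order_value)
SELECT o.customer_id, SUM(o.order_value)
FROM   orders o
WHERE  o.id > (SELECT last_id FROM refresh_state WHERE source_table = 'orders')
GROUP  BY o.customer_id
ON CONFLICT (customer_id)
DO UPDATE SET total_order_value = total_order_value + excluded.total_order_value;

-- advance the high-water mark
UPDATE refresh_state
SET    last_id = (SELECT MAX(id) FROM orders)
WHERE  source_table = 'orders';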
You can create triggers.
As soon as one of the calculated values changes, you can do one of the following:
Update the calculated field (Preferred)
Recalculate your summary table
Store a flag that a recalculation is necessary. The next time you need the calculated values check this flag first and do the recalculation if necessary
Example:
CREATE TRIGGER update_summary_table UPDATE OF order_value ON orders
BEGIN
  UPDATE summary
  SET total_order_value = total_order_value
                          - old.order_value
                          + new.order_value;
  -- OR: Do a complete recalculation
  -- OR: Store a flag
END;
More Information on SQLite triggers: http://www.sqlite.org/lang_createtrigger.html
In the end I arranged for a single program instance to make all database updates, and maintain the summaries in its heap, i.e. not in the database at all. This works very nicely in this case but would be inappropriate if I had multiple programs doing database updates.
You haven't said anything about your indexing strategy. I would look at that first - making sure that your indexes are covering.
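For example (column names invented), a covering index for a per-customer report includes every column the query touches, so the report can be answered from the index alone without visiting the table:

CREATE INDEX idx_orders_customer_value
    ON orders (customer_id, order_date, order_value);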
Then I think the trigger option discussed is also a very good strategy.
Another possibility is the regular population of a data warehouse with a model suitable for high performance reporting (for instance, the Kimball model).