I just have a general question about setting an Access query to Dynaset (Inconsistent Updates). I know it opens up the fields for editing, and that this increases the risk to data integrity, but what about in "controlled cases"?
For example, I have a table on the "one" side of 3 left outer joins. I want to allow edits (via a form) to any fields on the "one" side. The 3 outer joins are merely pulling information from these other tables to use in a calculated field in that query. So I need to show these calculations at this query level, but edit the primary table in the query. I know the changes I'm allowing in the form are just on the "one" side. Is this an allowable case for using Dynaset (Inconsistent Updates)? I just can't figure out an appropriate solution. I tried using a subquery for the one field instead of outer joins but that still left it locked.
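Here's roughly the shape of the query (all table and column names changed for illustration):

SELECT m.*, m.Qty * r.Rate + t.TaxAmt + f.FeeAmt AS TotalCost
FROM tblMain m
LEFT JOIN tblRates r ON r.RateID = m.RateID
LEFT JOIN tblTaxes t ON t.TaxID = m.TaxID
LEFT JOIN tblFees f ON f.FeeID = m.FeeID;

The form would edit only fields of tblMain; TotalCost is display-only.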
I'm writing a report that needs to pull data from a view that I'm not authorized to modify. The view is missing a column that I need for a report, so I attempted to join it against one of its source tables. However, this is causing it to take twice as long to execute.
A look at the execution plan shows that it performs two scans of the table and merge joins them together. Is there a hint I can use to convince the query optimizer to visit the table only once?
Abstracted fiddle: http://sqlfiddle.com/#!3/4a44d/1/0
No, because the optimizer will never eliminate a table access specified in the query unless nothing is actually referenced from that table.
There is no way to access a table fewer times than it is referenced in the query (as far as I know from 13 years of experience). There may be a few other cases, but the only one I know of where the query optimizer can do fewer accesses than the number of object references is when it can optimize away a left outer or right outer join: nothing is accessed from the outer table, and it is known from constraints that excluding the work will not change the number of rows or which rows are returned in the result.
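A minimal sketch of that join-elimination case (hypothetical tables, not from the question): because Customers.Id is the primary key, the left join can neither add nor remove Orders rows, and since no column of c is referenced, the optimizer can drop the join entirely.

-- Customers.Id is the primary key, so each Orders row matches at most one Customers row.
SELECT o.Id, o.OrderDate
FROM Orders o
LEFT JOIN Customers c ON c.Id = o.CustomerId;
-- The plan touches only Orders; the outer join is optimized away.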
I am developing a small web app (angularjs/jquery front end, postgresql 9.3 backend) in which I want to present a view of a largish (few million) set of records in a "grid" (read-only). I have a set of filters based on facets of the data that I would like the user to be able to apply serially; that is, one filter is applied and then the next filter is applied. The user can choose both the filters and the filter settings. This ends up being a set of logical AND operations (perhaps requiring SQL joins, as well).
I am interested in what folks do on the backend to improve the user experience. In particular, I can imagine:
Apply filters "dynamically" as a SQL query whenever pagination or additional filtering is applied
Create a cache at each level of filtering so that I can update data more quickly
There are clearly other options and I would like to hear what others would do in this situation.
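For concreteness, here is a sketch of option 1 with made-up table and column names: each user-applied filter becomes one AND condition, and pagination uses PostgreSQL's LIMIT/OFFSET.

-- Filters are ANDed serially; each predicate corresponds to one user-applied filter.
SELECT r.id, r.name, r.created_at
FROM records r
JOIN record_facets f ON f.record_id = r.id
WHERE r.category = 'books'          -- filter 1
  AND f.color = 'red'               -- filter 2
  AND r.created_at >= '2014-01-01'  -- filter 3
ORDER BY r.id
LIMIT 50 OFFSET 0;                  -- one page of the read-only grid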
Not sure the question is "answerable", but here's an example of what we've done. We have an application that does not do filtering but rather allows the user to select which data items they want to see. Like your application, these could include joins to multiple other tables. We have a "driver" table that has the "user" version of the field name, a string for the inner join to the table that contains it, optionally a string for the WHERE clause if the inner join condition is not sufficient, and the name of the column in the DB.
We build the base query, then look at the entries for all the items the user has selected. We add distinct inner join clauses from those fields (if three columns are coming from one table, we only want to join it once). We add AND clauses to the WHERE clause, encapsulating each in parentheses. And we add the column names to the select list.
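A rough sketch of the driver table and the kind of query the builder produces (all names are illustrative, not our actual schema):

-- One row per selectable data item.
CREATE TABLE ReportFieldDriver (
    UserFieldName varchar(100),      -- label shown to the user, e.g. 'City'
    JoinClause    varchar(400),      -- e.g. 'INNER JOIN Address a ON a.CustId = c.Id'
    WhereClause   varchar(400) NULL, -- extra predicate when the join alone is not sufficient
    DbColumnName  varchar(200)       -- e.g. 'a.City'
);

-- If the user picks City and Zip (both from Address), the builder emits the
-- Address join once and both columns, producing something like:
SELECT c.Id, a.City, a.Zip
FROM Customer c
INNER JOIN Address a ON a.CustId = c.Id
WHERE (c.Active = 1) AND (a.Country = 'US');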
I'm having some trouble in Access 2002...
I have two tables, one containing around 60k occurrences and one with a column and the foreign keys to make the join. In my form, I set the source to a query with these two tables joined (left join on the empty one). Basically, I end up with my 60k occurrences and my new column at the end.
Now, I need to allow my users to edit this field in my form. I found out that when the corresponding data exists in my empty table, I can edit the field just fine. However, since we need this empty table to contain only the occurrences where the new column is needed, I can't simply make a new entry for all of my occurrences.
Here is a schema of the two tables:
Table 1 (60k rows):
ID | Sequence | Col1 | Col2 | Col3 | Col5

Table 2 (0 rows):
ID | Sequence | Col6
And my query:
SELECT tblOne.*, tblTwo.Col6
FROM tblOne
LEFT JOIN tblTwo ON (tblOne.Sequence=tblTwo.Sequence) AND (tblOne.ID=tblTwo.ID);
If you're willing to consider a different approach, this could be easier with a form/subform approach.
Base the main form on tblOne and base the subform on tblTwo. Use Sequence and ID as the master/child link fields (you can find that setting in the property sheet of the subform control).
With that design, the subform will display existing tblTwo rows which match the main form's current tblOne row. And you can add a new matching tblTwo row at the subform's new record --- it will "inherit" the Sequence and ID values of the current main form row.
By the way, Sequence is a reserved word. Rename that field if possible. If you must keep that name, you can avoid the risk of confusing the db engine by enclosing the name in square brackets or by qualifying the field name with the table name (or alias) in your SQL statements.
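For example, the question's query with the reserved word bracketed:

SELECT tblOne.*, tblTwo.Col6
FROM tblOne
LEFT JOIN tblTwo ON (tblOne.[Sequence]=tblTwo.[Sequence]) AND (tblOne.ID=tblTwo.ID);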
If you have a query that is not updateable, check out Allen Browne's Tips on what might cause this: http://allenbrowne.com/ser-61.html
MS Access has a shortcoming. If you wish to edit data in list-like views, it generally works best for the data to be displayed in basically the same structure in which it's stored in your tables. Edit: In reference to the comment left by Yawar about this not being an Access shortcoming, I'd like to point out that when developing in .NET it isn't uncommon to have a database structure that is quite unlike the data model classes used inside your application. In that case the GUI is built on the data model, so the database may look somewhat (or even quite) different from your data models/GUI.
Back to MS Access: when you use a table join to create the recordsource/recordset for a datasheet form or continuous form, it's my understanding that only one of the tables is going to be updateable. In other words, only one side of the join is updateable. And in many cases, the recordset is not updateable at all, due to the DAO engine being confused. Update: I have deduced from the link below that what I wrote above seems to be more true of SQL Server than of a JET/ACE backend.
The most common solution is as HansUp has suggested, use a form/subform approach. You can actually have a datasheet subform as a child of another datasheet subform, which would work quite well in your case here. There will just be an expandable plus sign at the far left of each record so you can add/edit/delete the record(s) in tblTwo.
Another option is to use an ActiveX grid control such as the iGrid from 10tec, which means you'll write quite a bit of code for all kinds of things: loading the recordset, writing changes/additions/deletions back to the database, handling formatting of cells, etc.
Yet another option is to use a fabricated ADO recordset. This is a terribly clumsy approach and I can't say that I've really seen it in use; I've mostly just experimented with it and read about it in theory. The problem is that you have to create a fabricated recordset that is nearly identical to the one you generated, and then you have to loop through and copy all of the records from the generated recordset into your fabricated recordset. It's a tremendous amount of overhead, especially for that many records. And then you must write code once again to write all additions/changes/deletions back to the database. Handling the creation of new primary keys can be tricky. This particular approach is not easy or simple, and is not something I recommend a VBA beginner tackle.
If you're using SQL Server you should check out the following article at Microsoft's website. It covers a variety of material including updating multiple tables from a single recordsource/view. http://technet.microsoft.com/en-us/library/bb188204%28v=sql.90%29.aspx
I use OUTER JOIN to get values stored in rows and show them as columns. When there is no value, I show NULL in the column.
Source table:
Id|Name|Value
01|ABCG|,,,,,
01|ZXCB|.....
02|GHJK|;;;;;
View:
Id|ABCG|ZXCB|GHJK
01|,,,,|....|NULL
02|NULL|NULL|;;;;
The query looks like:
SELECT DISTINCT
b.Id,
bABCG.Value AS "ABCG",
bZXCB.Value AS "ZXCB",
bGHJK.Value AS "GHJK"
FROM
Bars b
LEFT JOIN Bars bABCG ON b.Id = bABCG.Id AND bABCG.Name = 'ABCG'
LEFT JOIN Bars bZXCB ON b.Id = bZXCB.Id AND bZXCB.Name = 'ZXCB'
LEFT JOIN Bars bGHJK ON b.Id = bGHJK.Id AND bGHJK.Name = 'GHJK'
I want to remove the LEFT JOINs because they're not allowed in an indexed view. I tried replacing them with an inner SELECT, but inner SELECTs are not allowed either, and neither is UNION. I can't use INNER JOIN because I want to show NULLs in the view. What should I use?
You may be able to implement something similar using an actual table to store the results, and a set of triggers against the base tables to maintain the internal data.
I believe that, under the covers, this is what SQL Server does (in spirit, if not in actual implementation) when you create an indexed view. However, by examining the rules for indexed views, it's clear that the triggers should only use the inserted and deleted tables, and should not be required to scan the base tables to perform the maintenance - otherwise, for large tables, maintaining this indexed view would impose a serious performance penalty.
As an example of the above, whilst you can easily write a trigger for insert to maintain a MAX(column) column in the view, deletion would be more problematic - if you're deleting the current max value, you'd need to scan the table to determine the new maximum. For many of the other restrictions, try writing the triggers by hand, and most times there'll come a point where you need to scan the base table.
Now, in your particular case, I believe it could be reasonably efficient for these triggers to perform the maintenance - but you need to carefully consider all of the insert/update/delete scenarios, and make sure that your triggers actually faithfully maintain this data - e.g. if you update any ids, you may need to perform a mixture of updates, inserts and deletes.
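As a sketch of what those hand-written triggers might look like here, assuming SQL Server, a hypothetical BarsGrid results table, and one Value per (Id, Name) pair (column types are guesses): note it only touches the Ids affected by each statement, so it avoids scanning all of Bars.

CREATE TABLE BarsGrid (
    Id   int PRIMARY KEY,
    ABCG varchar(100) NULL,
    ZXCB varchar(100) NULL,
    GHJK varchar(100) NULL
);
GO
CREATE TRIGGER trBars_MaintainGrid ON Bars
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Collect every Id touched by the triggering statement.
    DECLARE @touched TABLE (Id int PRIMARY KEY);
    INSERT INTO @touched (Id)
    SELECT Id FROM inserted
    UNION
    SELECT Id FROM deleted;

    -- Rebuild only the affected rows of the grid table.
    DELETE FROM BarsGrid WHERE Id IN (SELECT Id FROM @touched);

    INSERT INTO BarsGrid (Id, ABCG, ZXCB, GHJK)
    SELECT b.Id,
           MAX(CASE WHEN b.Name = 'ABCG' THEN b.Value END),
           MAX(CASE WHEN b.Name = 'ZXCB' THEN b.Value END),
           MAX(CASE WHEN b.Name = 'GHJK' THEN b.Value END)
    FROM Bars b
    WHERE b.Id IN (SELECT Id FROM @touched)
    GROUP BY b.Id;
END;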
The best you are going to be able to do is use inner joins to get the matches, then union that with the left joins filtered to only return the NULL rows. This probably won't solve your problem.
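Sketched for just the ABCG column (the others work the same way), that looks like the following; note it still uses a LEFT JOIN and a UNION, both of which indexed views reject, which is why it probably won't help:

SELECT b.Id, bABCG.Value AS "ABCG"
FROM Bars b
INNER JOIN Bars bABCG ON b.Id = bABCG.Id AND bABCG.Name = 'ABCG'
UNION
SELECT b.Id, NULL
FROM Bars b
LEFT JOIN Bars bABCG ON b.Id = bABCG.Id AND bABCG.Name = 'ABCG'
WHERE bABCG.Id IS NULL;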
I don't know the specifics of your system but I am assuming that you are dealing with performance issues, which is why you want to use the indexed view. There are a few alternatives, but I think the following is the most appropriate.
Since you commented this is for a DW, I am going to assume that your system is more intensive on reads than writes and that data is loaded into it on a schedule by an ETL process. In this kind of high read/low write* situation I would recommend you "materialize" this view: when the ETL process runs, generate the table with your initial select statement that includes the left joins. You will take the hit on the write, and then all your reads will be on par with the performance of the indexed view (you would be doing the same thing the indexed view would do, except in a batch instead of on a row-by-row basis). If your source DB and DW are on the same instance, this is a better choice than an indexed view because it won't affect the performance of the source system (indexed views slow down inserts). This is the same concept as the indexed view because you take the performance hit on the insert to speed up the select.
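A minimal sketch of that ETL step, assuming SQL Server (the target table name is made up):

-- Rebuild the materialized result on each ETL run.
IF OBJECT_ID('dbo.BarsReport') IS NOT NULL
    DROP TABLE dbo.BarsReport;

SELECT DISTINCT
    b.Id,
    bABCG.Value AS ABCG,
    bZXCB.Value AS ZXCB,
    bGHJK.Value AS GHJK
INTO dbo.BarsReport
FROM Bars b
LEFT JOIN Bars bABCG ON b.Id = bABCG.Id AND bABCG.Name = 'ABCG'
LEFT JOIN Bars bZXCB ON b.Id = bZXCB.Id AND bZXCB.Name = 'ZXCB'
LEFT JOIN Bars bGHJK ON b.Id = bGHJK.Id AND bGHJK.Name = 'GHJK';

-- Index it the way the indexed view would have been indexed
-- (unique is safe if there is one Value per (Id, Name) pair).
CREATE UNIQUE CLUSTERED INDEX IX_BarsReport_Id ON dbo.BarsReport (Id);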
I've been down this path before and come to the following conclusion:
An indexed view is more likely to be part of the solution than the entire solution.
*when I said "high read/low write" above you can also think of it as "high read/scheduled write"
SELECT DISTINCT
b.Id,
(SELECT bABCG.Value FROM Bars bABCG WHERE bABCG.Id = b.Id AND bABCG.Name = 'ABCG') AS "ABCG"
...
FROM
Bars b
You may have to add an aggregation on the value; I'm not sure how your data is organized.
Here's my query; it's fairly straightforward:
SELECT
INVOICE_ITEMS.II_IVNUM, INVOICE_ITEMS.IIQSHP
FROM
INVOICE_ITEMS
LEFT JOIN
INVOICES
ON
INVOICES.INNUM = INVOICE_ITEMS.II_INNUM
WHERE
INVOICES.IN_DATE
BETWEEN
'2010-08-29' AND '2010-08-30'
;
I have very limited knowledge of SQL, but I'm trying to understand some of the concepts like subqueries and the like. I'm not looking for a redesign of this code, but rather an explanation of why it is so slow (600+ seconds on my test database) and how I can make it faster.
From my understanding, the left join is creating a virtual table and populating it with every result row from the join, meaning that it is processing every row. How would I stop the query from reading the table completely and just finding the WHERE/BETWEEN clause first, then creating a virtual table after that (if it is possible)?
How is my logic? Are there any consistently recommended resources to get me to SQL ninja status?
Edit: Thanks everyone for the quick and polite responses. Currently, I'm connecting over ODBC to a proprietary database that is used in the rapid application development framework called OMNIS. Therefore, I really have no idea what sort of optimization is being run, but I believe it is based loosely on MSSQL.
I would rewrite it like this, and make sure you have indexes on i.INNUM, ii.II_INNUM, and i.IN_DATE. The LEFT JOIN is being turned into an INNER JOIN by your WHERE clause, so I rewrote it as such:
SELECT ii.II_IVNUM, ii.IIQSHP
FROM INVOICE_ITEMS ii
INNER JOIN INVOICES i ON i.INNUM = ii.II_INNUM
WHERE i.IN_DATE BETWEEN '2010-08-29' AND '2010-08-30'
Depending on what database you are using, what may be happening is that all of the records from INVOICE_ITEMS are being joined (due to the LEFT JOIN), regardless of whether there is a match with INVOICES or not, and then the WHERE clause is filtering down to the ones that matched and had a date within range. By switching to an INNER JOIN, you may make the query more efficient, by only needing to apply the WHERE clause to INVOICES records that have a matching INVOICE_ITEMS record.
Since that is a very basic query, the optimizer should do fine with it; likely your problem is incorrect indexing. Do you have indexes on the IN_DATE field and the INVOICE_ITEMS.II_INNUM field? If you have properly set up PK/FK relationships, INVOICES.INNUM should already be indexed, but FKs are not indexed automatically.
Your query is fine, it's the indexes you have to look at.
Are INVOICES.INNUM and INVOICE_ITEMS.II_INNUM indexed?
If not, SQL has to do something called a 'scan' - it searches every single record.
You can think of indexes as like the tabs on the side of a phone book - you know where to start looking for people based on the first letters of their surname. Without an index (say you want to look for names that end in '...son') you have to search the entire book.
There are different types of index - they can be ordered (like the phone book index - all ordered by surname) or not (like the index at the back of a book - there's an overhead in finding the index and then the actual page).
You should also be able to view the query plan - this is how the server executes the SQL statement. That can tell you all sorts of more advanced stuff - for instance, there are multiple ways to do the job: a merge join is possible if both tables are sorted by the join field, or a nested loop join will loop through the smaller table for every record in the larger table.
Well, there is no reason why this query should be slow... the only thing that comes to mind is: do you have indexes on INVOICES.INNUM and INVOICE_ITEMS.II_INNUM? Adding them could speed up the select, but it would slow down updates/inserts...
A join doesn't create a "virtual table" on anything more than just a conceptual level.
The performance issue with your query most likely lies in poor or insufficient indexing. You should have indexes on:
INVOICE_ITEMS.II_INNUM
INVOICES.IN_DATE
You should also have an index on INVOICES.INNUM, but if that's the primary key of the table then it already has one.
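If the backend accepts standard SQL DDL, creating them would look something like this (index names are arbitrary):

-- Index the join column on the child table and the filtered date column.
CREATE INDEX IX_INVOICE_ITEMS_II_INNUM ON INVOICE_ITEMS (II_INNUM);
CREATE INDEX IX_INVOICES_IN_DATE ON INVOICES (IN_DATE);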
Also, don't use a left join here. If there's a foreign key between INVOICE_ITEMS.II_INNUM and INVOICES.INNUM (and INVOICE_ITEMS.II_INNUM is not nullable), then you'll never encounter a record in INVOICE_ITEMS that won't match up to a record in INVOICES. Even if there were, your WHERE condition is using a value from INVOICES, so you'd eliminate any unmatched rows anyway. Just use a regular JOIN.