That was not the right table: Access wiped the wrong data - sql

I... don't quite know if I have the right idea about Access here.
I wrote the following, to grab some data that existed in two places:-
Select TableOne.*
from TableOne inner join TableTwo
on TableOne.[LINK] = TableTwo.[LINK]
Now, my interpretation of this is:
Find the table "TableOne"
Match the LINK field to the corresponding field in the table "TableTwo"
Show only records from TableOne that have a matching record in TableTwo
Just to make sure, I ran the query with some sample tables in SSMS, and it worked as expected.
So why, when I deleted the rows from within that query, did it delete the rows from TableTwo, and NOT from TableOne as expected? I've just lost ~3 days of work.
Edit: For clarity, I manually selected the rows in the query window and deleted them. I did not use a delete query - I've been stung by that a couple of times lately.

Since you have deleted the records manually, your query has to be updateable. This means that your query couldn't have been solely a cartesian join or a join without referential integrity, since these queries are non-updateable in ms access.
When I recreate your query based on two fields without indexes or primary keys, I am not even able to manualy delete records. This leads me to believe there was unknowingly a relationship established which deleted the records in table two. Perhaps you should take a look in the design view of your queries and relationships window, since the query itself should indeed select only records from table one.

Not sure why it got deleted, but I suggest to rewrite your query:
delete TableOne
where LINK in (select LINK from TableTwo)

This should work for you:
DELETE TableOne
FROM TableOne a
INNER JOIN
TableTwo b on b.Bid = a.Bid
and [my filter condition]

Related

MSSQL - Question about how insert queries run

We have two tables we want to merge. Say, table1 and table2.
They have the exact same columns, and the exact same purpose. The difference being table2 having newer data.
We used a query that uses LEFT JOIN to find the the rows that are common between them, and skip those rows while merging. The problem is this. both tables have 500M rows.
When we ran the query, it kept going on and on. For an hour it just kept running. We were certain this was because of the large number of rows.
But when we wanted to see how many rows were already inserted to table2, we ran the code select count(*) from table2, it gave us the exact same row count of table2 as when we started.
Our questions is, is that how it's supposed to be? Do the rows get inserted all at the same time after all the matches have been found?
If you would like to read uncommited data, than the count should me modified, like this:
select count(*) from table2 WITH (NOLOCK)
NOLOCK is over-used, but in this specific scenario, it might be handy.
No data are inserted or updated one by one.
I have no idea how it is related with "Select count(*) from table2 WITH (NOLOCK) "
Join condition is taking too long to produce Resultset which will be use by insert operator .So actually there is no insert because no resultset is being produce.
Join query is taking too long because Left Join condition produces very very high cardinality estimate.
so one has to fix Join condition first.
for that need other info like Table schema ,Data type and length and existing index,requirement.

Trying to use cursor on one database using select from another db

So I'm trying to wrap my head around cursors. I have task to transfer data from one database to another, but they have slightly diffrent schemas. Let's say I have TableOne (Id, Name, Gold) and TableTwo (Id, Name, Lvl). I want to take all records from TableTwo and insert it into TableOne, but it can be duplicated data on Name column. So if single record from TableTwo exist (on Name column comparison) in TableOne, I want to skip it, if don't - create record in TableOne with unique Id.
I was thinking about looping on each record in TableTwo, and for every record check if it's exist in TableOne. So, how do I make this check without making call to another database every time? I wanted first select all record from TableOne, save it into variable and in loop itself make check against this variable. Is this even possible in SQL? I'm not so familiar with SQL, some code sample would help a lot.
I'm using Microsoft SQL Server Management Studio if that matters. And of course, TableOne and TableTwo exists in diffrent databases.
Try this
Insert into table1(id,name,gold)
Select id,name,lvl from table2
Where table2.name not in(select t1.name from table1 t1)
If you want to add newId for every row you can try
Insert into table1(id,name,gold)
Select (select max(m.id) from table1 m) + row_number() over (order by t2.id) ,name,lvl from table2 t2
Where t2.name not in(select t1.name from table1 t1)
It is possible yes, but I would not recommend it. Looping (which is essentially what a cursor does) is usually not advisable in SQL when a set-based operation will do.
At a high level, you probably want to join the two tables together (the fact that they're in different databases shouldn't make a difference). You mention one table has duplicates. You can eliminate those in a number of ways such as using a group by or a row_number. Both approaches will require you understanding which rows you want to "pick" and which ones you want to "ignore". You could also do what another user posted in a comment where you do an existence check against the target table using a correlated subquery. That will essentially mean that if any rows exist in the target table that have duplicates you're trying to insert, none of those duplicates will be put in.
As far as cursors are concerned, to do something like this, you'd be doing essentially the same thing, except on each pass of the cursor you would be temporarily assigning and using variables instead of columns. This approach is sometimes called RBAR (for "Rob by Agonizing Row"). On every pass of the cursor or loop, it has to re-open the table, figure out what data it needs, then operate on it. Even if that's efficient and it's only pulling back one row, there's still lots of overhead to doing that query. So while, yes, you can force SQL to do what you've describe, the database engine already has an operation for this (joins) which does it far faster than any loop you could conceivably write

Poor performance with stacked joins

I'm not sure I can provide enough details for an answer, but my company is having a performance issue with an older mssql view. I've narrowed it down to the right outer joins, but I'm not familiar with the structure of joins following joins without a "ON" with each one, as in the code snippet below.
How do I write the joins below to either improve performance or to the simpler format of Join Tablename on Field1 = field2 format ?
FROM dbo.tblObject AS tblObject_2
JOIN dbo.tblProspectB2B PB ON PB.Object_ID = tblObject_2.Object_ID
RIGHT OUTER JOIN dbo.tblProspectB2B_CoordinatorStatus
RIGHT OUTER JOIN dbo.tblObject
INNER JOIN dbo.vwDomain_Hierarchy
INNER JOIN dbo.tblContactUser
INNER JOIN dbo.tblProcessingFile WITH ( NOLOCK )
LEFT OUTER JOIN dbo.enumRetentionRealization AS RR ON RR.RetentionRealizationID = dbo.tblProcessingFile.RetentionLeadTypeID
INNER JOIN dbo.tblLoan
INNER JOIN dbo.tblObject AS tblObject_1 WITH ( NOLOCK ) ON dbo.tblLoan.Object_ID = tblObject_1.Object_ID ON dbo.tblProcessingFile.Loan_ID = dbo.tblLoan.Object_ID ON dbo.tblContactUser.Object_ID = dbo.tblLoan.ContactOwnerID ON dbo.vwDomain_Hierarchy.Object_ID = tblObject_1.Domain_ID ON dbo.tblObject.Object_ID = dbo.tblLoan.ContactOwnerID ON dbo.tblProspectB2B_CoordinatorStatus.Object_ID = dbo.tblLoan.ReferralSourceContactID ON tblObject_2.Object_ID = dbo.tblLoan.ReferralSourceContactID
Your last INNER JOIN has a number of ON statements. Per this question and answer, such syntax is equivalent to a nested subquery.
That is one of the worst queries I have ever seen. Since I cannot figure out how it is supposed to work without the underlying data, this is what I suggest to you.
First find a good sample loan and write a query against this view to return where loan_id = ... Now you have a data set you chan check you changes against more easily than the, possibly, millions of records this returns. Make sure these results make sense (that right join to tbl_objects is bothering me as it makes no sense to return all the objects records)
Now start writing your query with what you think should be the first table (I would suggest that loan is the first table, if it not then the first table is Object left joined to loan)) and the where clause for the loan id.
Check your results, did you get the same loan information as teh view query with the where clause added?
Then add each join one at a time and see how it affects the query and whether the results appear to be going off track. Once you have figured out a query that gives the same results with all the tables added in, then you can try for several other loan ids to check. Once those have checked out, then run teh whole query with no where clause and check against the view results (if it is a large number you may need to just see if teh record counts match and visually check through (use order by on both things in order to make sure your results are in the same order). In the process try to use only left joins and not that combination of right and left joins (its ok to leave teh inner ones alone).
I make it a habit in complex queries to do all the inner joins first and then the left joins. I never use right joins in production code.
Now you are ready to performance tune.
I woudl guess the right join to objects is causing a problem in that it returns teh whole table and the nature of that table name and teh other joins to the same table leads me to believe that he probably wanted a left join. Without knowing the meaning of the data, it is hard to be sure. So first if you are returning too many records for one loan id, then consider if the real problem is that as tables have grown, returning too many records has become problematic.
Also consider that you can often take teh view and replace it with code to get the same results. Views calling views are a poor technique that often leads to performance issues. Often the views on top of the other views call teh same tables and thus you end up joining to them multiple times when you don;t need to.
Check your Explain plan or Execution plan depending on what database backend you have. Analysis of this should show where you might have missing indexes.
Also make sure that every table in the query is needed. This especially true when you join to a view. The view may join to 12 other tables but you only need the data from one of them and it can join to one of your tables. MAke sure that you are not using select * but only returning teh fields the view actually needs. You have inner joins so, by definition, select * is returning fields you don't need.
If your select part of teh view has a distinct in it, then consider if you can weed down the multiple records you get that made distinct needed by changing to a derived table or adding a where clause. To see what is causing the multiples, you may need to temporarily use select * to see all the columns and find out which one is not uniques and is causing the issue.
This whole process is not going to be easy or fun. Just take it slowly, work carefully and methodically and you will get there and have a query that is understandable and maintainable in the end.

Number of records discrepancy - only change is sorting

I have an Access 2003 database with a query that is a left outer join of a table to another query. If I didn't sort that final query, I got 42 records. If I sorted the final query by the 2 joined fields, I got 43 records. No other changes were made to the query.
To verify this, I took the query, copied it, applied the sort with no other changes, and the record count went up by one. Perplexed, I copied the results into Excel, sorted, and compared row by row. I discovered one record was duplicated (all fields were exactly the same), where there were actually no duplicate records in the source table and query.
I would think this is a bug, and I know there are a few in Access, but has anyone heard of this behavior before?
It is possible that you have a corrupt index. It may be worth taking a back up and then compacting and repairing the database, which should rebuild the indexes.
I had a similar problem on Access 2003 where I had duplicate records where 1 field was an autonumber which should up twice.
My query was :
SELECT qry_Tasks_with_Names.*, Location.Location, WorkRequests.Element, WorkRequests.Site_ID
FROM ((WorkRequests LEFT JOIN Location ON WorkRequests.Site_ID=Location.LocationID) INNER JOIN qry_Tasks_with_Names_by_Individual ON WorkRequests.Work_Request_ID=qry_Tasks_with_Names_by_Individual.Work_Request_ID) INNER JOIN qry_Tasks_with_Names ON WorkRequests.Work_Request_ID=qry_Tasks_with_Names.Work_Request_ID
WHERE (WorkRequests.Site_ID=Forms!TaskList!comboFilter_by_Site Or Forms!TaskList!comboFilter_by_Site Is Null) And (qry_Tasks_with_Names.Assigned_to_User_ID=forms!taskList!comboFilter_by_Person Or forms!taskList!comboFilter_by_Person Is Null) And (qry_Tasks_with_Names.Assigned_to_Team_ID=forms!taskList!comboFilter_by_Team Or forms!taskList!comboFilter_by_Team Is Null) And qry_Tasks_with_Names.Assigned_to_User_ID>0
ORDER BY qry_Tasks_with_Names.SLA_Due;
The above was initially created by the query designer, but it gets itself confused at this level and it also seems that I do.
Once I removed the inner join on qry_Tasks_with_Names_by_Individual all was OK.
No idea why, but hopefully this may save someone else some tears if they have the same problem.

Tips or tricks for translating sql joins from literal language to SQL syntax?

I often know exactly what I want, and know how tables are related in order to get it, but I have a real hard time translating that literal language knowledge to SQL syntax when it comes to joins. Do you have any tips or tricks you can share that have worked for you in the past?
This is a basic, but poor example:
"I have Categories, which have one-to-many Products, which have one-to-many Variants, which have one-to-many Sources. I need all Sources that belong to Category XYZ."
I imagine doing something where you cross out certain language terms and replace them with SQL syntax. Can you share how you formulate your queries based upon some concept similar to that? Thanks!
Use SQL Query Designer to easily buid Join queries from the visual table collection right there, then if you want to learn how it works, simply investigate it, that's how I learned it.
You won't notice how charming it is till you try it.
Visual Representation of SQL Joins - A walkthrough explaining SQL JOINs.
Complete ref of SQL-Server Join, Inner Join, Left Outer Join, Right Outer Join, Full Outer Join, in SQL-Server 2005 (View snapshot bellow).
ToTraceString of Entity Frameork' ObjectQuery (that you add Include shapings to it) is also a good way to learn it.
SQL-Server Join types (with detailed examples for each join type):
INNER JOIN - Match rows between the two tables specified in the INNER JOIN statement based on one or more columns having matching data. Preferably the join is based on referential integrity enforcing the relationship between the tables to ensure data integrity.
Just to add a little commentary to the basic definitions above, in general the INNER JOIN option is considered to be the most common join needed in applications and/or queries. Although that is the case in some environments, it is really dependent on the database design, referential integrity and data needed for the application. As such, please take the time to understand the data being requested then select the proper join option.
Although most join logic is based on matching values between the two columns specified, it is possible to also include logic using greater than, less than, not equals, etc.
LEFT OUTER JOIN - Based on the two tables specified in the join clause, all data is returned from the left table. On the right table, the matching data is returned in addition to NULL values where a record exists in the left table, but not in the right table.
Another item to keep in mind is that the LEFT and RIGHT OUTER JOIN logic is opposite of one another. So you can change either the order of the tables in the specific join statement or change the JOIN from left to right or vice versa and get the same results.
RIGHT OUTER JOIN - Based on the two tables specified in the join clause, all data is returned from the right table. On the left table, the matching data is returned in addition to NULL values where a record exists in the right table but not in the left table.
Self Join - In this circumstance, the same table is specified twice with two different aliases in order to match the data within the same table.
CROSS JOIN - Based on the two tables specified in the join clause, a Cartesian product is created if a WHERE clause does filter the rows. The size of the Cartesian product is based on multiplying the number of rows from the left table by the number of rows in the right table. Please heed caution when using a CROSS JOIN.
FULL JOIN - Based on the two tables specified in the join clause, all data is returned from both tables regardless of matching data.
I think most people approach its:
Look for substantives, as they can point to potential tables
Look for adjectives, cause they probably are fields
Relationships between substantives gives JOIN rules
Nothing better than to draw these structures in a paper sheet.
Write and debug a query which returns the fields from the table having the majority of—or the most important—data. Add constraints which depend only on that table, or which are independent of all tables.
Add a new where term which relates another table.
Repeat 2 until done.
I've yet to use the join operator in a query, even after 20+ years of writing SQL queries. One can almost always write them in the form
select field, field2, field3, <etc.>
from table
where field in (select whatever from table2 where whatever) and
field2 in (select whatever from table2 where whatever) and ...
or
select field, field2, field3, <etc.>
from table1, table2, ...
where table1.field = table2.somefield and
table1.field2 = table3.someotherfield and ...
Like someone else wrote, just be bold and practice. It will be like riding a bicycle after 4 or 5 times creating such a query.
One word: Practice.
Open up the query manager and start running queries until you get what you want. Look up similar examples and adapt them to your situation. You will always have to do some trial and error with the queries to get them right.
SQL is very different from imperative programming.
1) To design tables, consider Entities (the real THINGS in the world of concern), Relationships (between the Entities), and Attributes (the values associated with an Entity).
2) to write a Select statement consider a plastic extrusion press:
a) you put in raw records From tables, Where conditions exist in the records
b) you may have to join tables to get at the data you need
c) you craft extrusion nozzles to make the plastic into the shapes you want. These are the individual expressions of the select List.
d) you may want the n-ary sets of list data to come to you in a certain order, you can apply an Order By clause.
3) crafting the List expressions is the most like imperative programming after you discover the if(exp,true-exp,false-exp) function.
Look at the ERD.
Logical or physical version, it will show what tables are related to one another. This way, you can see what table(s) you need to join to in order to get from point/table a to point/table b, and what criteria.