Do I have to make sure each record matches only one row when using JOIN in a DELETE statement? - sql

I have a query like this:
DELETE A
FROM A
LEFT JOIN B ON
A.ID=B.ID
WHERE B.Status='OK'
The join might return more than one row per record, for example:
Table A
ID
1
2
Table B
ID Status
1 OK
1 OK
2 OK
Do I have to make sure each record matches only one row? In this example, ID 1 matches 2 rows.
Apologies for my bad English.

In SQL Server you don't have to ensure that each row to delete matches exactly one row. The engine deletes every row of the target table that satisfies your join and WHERE conditions, and each such row is deleted once even if the join returns it more than once.
The important part is which table you are deleting from; make sure you don't delete from the wrong one!
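A minimal sketch of that behaviour, using Python's built-in sqlite3 module only so that it can be run anywhere. SQLite does not accept the SQL Server `DELETE A FROM A JOIN B` form, so the join is rewritten here as an `EXISTS` subquery; that rewrite, and the extra row with ID 3, are assumptions of this sketch and not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (ID INTEGER);
    CREATE TABLE B (ID INTEGER, Status TEXT);
    -- IDs 1 and 2 come from the question; ID 3 is an added non-matching row.
    INSERT INTO A (ID) VALUES (1), (2), (3);
    INSERT INTO B (ID, Status) VALUES (1, 'OK'), (1, 'OK'), (2, 'OK'), (3, 'FAILED');
""")

# ID 1 matches two rows in B, but the row in A is still deleted only once.
cur = conn.execute("""
    DELETE FROM A
    WHERE EXISTS (SELECT 1 FROM B WHERE B.ID = A.ID AND B.Status = 'OK')
""")
deleted = cur.rowcount
remaining = [row[0] for row in conn.execute("SELECT ID FROM A ORDER BY ID")]
print(deleted, remaining)
```

Two rows are removed from A even though B holds three 'OK' rows, and the row whose only match has a different status is left alone.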

Related

Get the "most" optimal row in a JOIN

Problem
I have two tables, and I would like the entries from table 2 (let's call it table_2) to be matched up with the entries in table 1 (table_1) such that no row of table_2 is used more than once in the match-up.
Discussion
Specifically, in this case there are datetime stamps in each table (the field is utcdatetime). For each row in table_1, I want to find the row in table_2 that has the closest utcdatetime to the table_1 utcdatetime, such that the table_2 utcdatetime is older than the table_1 utcdatetime and within 30 minutes of it. Here is the catch: I do not want any repeats. If a row in table_2 gets gobbled up in a match on an earlier row in table_1, then I do not want it considered for a match later.
This has currently been implemented in a Python routine, but it is slow to iterate over all of the rows in table_1, as the table is large. I thought I was there with a single SQL statement, but I found that my current SQL results in duplicate table_2 rows in the output data.
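The matching rule described above can be sketched in Python roughly as follows. This is not the asker's routine: the function name `greedy_match`, the sample timestamps, and the choice to give table_1 rows first pick in ascending time order are all assumptions made for illustration.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def greedy_match(table_1_times, table_2_times, window=timedelta(minutes=30)):
    """Pair each table_1 time with the closest earlier, still-unused table_2 time."""
    unused = sorted(table_2_times)
    matches = []
    for t1 in sorted(table_1_times):
        # Index of the latest unused table_2 time strictly earlier than t1.
        i = bisect_left(unused, t1) - 1
        if i >= 0 and t1 - unused[i] <= window:
            matches.append((t1, unused.pop(i)))  # consume it so it cannot repeat
        else:
            matches.append((t1, None))  # nothing older within the window
    return matches


day = datetime(2020, 1, 1)
t1 = [day.replace(hour=10), day.replace(hour=10, minute=5), day.replace(hour=12)]
t2 = [day.replace(hour=9, minute=50), day.replace(hour=9, minute=58),
      day.replace(hour=10, minute=40)]
result = greedy_match(t1, t2)
for pair in result:
    print(pair)
```

Here 10:00 takes 09:58, 10:05 falls back to 09:50 because 09:58 is already used, and 12:00 gets no match because 10:40 is more than 30 minutes older.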
I would recommend using a nested select to get whatever results you're looking for.
For instance:
select *
from person p
where p.name_first = 'SCCJS'
and not exists (select 'x' from person p2 where p2.person_id != p.person_id
and p.name_first = 'SCCJS' and p.name_last = 'SC')

SQL select from two tables gives wrong result

I cannot understand what I am doing wrong.
My tables:
ps_product
id_product active
1 1
2 1
and
ps_product_sync
id_product status
1 0
2 1
and my SQL code
SELECT pr_product.id_product, pr_product.active
FROM pr_product, pr_product_sync
WHERE pr_product.active = pr_product_sync.status
I get a result like this:
id_product status
2 1
2 1
2 1
...
24 rows
I tried the same with an INNER JOIN, but the result is the same. I don't have duplicates in the tables... I don't understand why I get one row 24 times.
PS: all tables looked fine before posting/saving.
If you list two tables in the FROM clause without a join condition on their keys, you get a Cartesian product of these tables. In other words, if one table has 4 rows and the other 6 rows, the result is 24 rows (before the WHERE clause filters them).
It is better to create an INNER JOIN using the key of the first table and the foreign key of the second table.
Change your query accordingly:
SELECT pr_product.id_product, pr_product.active
FROM pr_product
INNER JOIN pr_product_sync
ON pr_product.id_product = pr_product_sync.id_product
WHERE pr_product.active = pr_product_sync.status
Of course, you could also compare the keys in the WHERE clause or eliminate duplicates using DISTINCT. IMHO the most understandable solution is an INNER JOIN.
I hope this solves your problem.
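To make the row counts concrete, here is a small sketch using Python's built-in sqlite3 module. The data is invented for illustration (4 products and 6 sync rows, all with active/status set to 1); only the table and column names are taken from the query in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pr_product (id_product INTEGER PRIMARY KEY, active INTEGER)")
conn.execute("CREATE TABLE pr_product_sync (id_product INTEGER, status INTEGER)")
# Hypothetical data: 4 products and 6 sync rows, every flag set to 1.
conn.executemany("INSERT INTO pr_product VALUES (?, 1)", [(i,) for i in range(1, 5)])
conn.executemany("INSERT INTO pr_product_sync VALUES (?, 1)", [(i,) for i in range(1, 7)])

# Comma join without a key condition: every product pairs with every sync row.
cross_rows = conn.execute("""
    SELECT p.id_product, p.active
    FROM pr_product p, pr_product_sync s
    WHERE p.active = s.status
""").fetchall()

# Joining on the key pairs each product only with its own sync row.
joined_rows = conn.execute("""
    SELECT p.id_product, p.active
    FROM pr_product p
    INNER JOIN pr_product_sync s ON p.id_product = s.id_product
    WHERE p.active = s.status
    ORDER BY p.id_product
""").fetchall()

print(len(cross_rows), joined_rows)
```

The first query returns 4 x 6 = 24 rows because the flag comparison is true for every pairing, while the keyed join returns one row per product.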
Missing the primary key join.
Add:
WHERE
pr_product.id_product=pr_product_sync.id_product
AND pr_product.active=pr_product_sync.status

Delete records from 2 tables with 1 query

I need to delete records from 2 different tables that are linked via job#.
One table contains completion dates, and I need to delete all records that were completed between 1999 and 2001. In the second table I need to delete all phases of each job whose job number in table 1 matches the job number in table 2.
I've researched a bit and came up with something like this, but when I run it I receive "Too Many Fields Selected":
DELETE a.,b.
FROM PUB_jc_job a
LEFT JOIN PUB_jc_phase b
ON b.jph_job = a.job_num
WHERE PUB_jc_job.job_compdate BETWEEN #3/31/1999# AND #12/31/2001#
Please remove the dot (.) from the table aliases "a., b." and retry:
DELETE a, b
FROM PUB_jc_job a
INNER JOIN PUB_jc_phase b
ON b.jph_job = a.job_num
WHERE PUB_jc_job.job_compdate BETWEEN #3/31/1999# AND #12/31/2001#
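Not every engine accepts a multi-table DELETE. Where it is rejected, one possible alternative (swapped in here, not what the answer above proposes) is two DELETE statements inside a single transaction. The sketch below uses Python's built-in sqlite3 module; the `phase` column, the sample rows, and the ISO-formatted date strings are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PUB_jc_job (job_num INTEGER PRIMARY KEY, job_compdate TEXT);
    CREATE TABLE PUB_jc_phase (jph_job INTEGER, phase TEXT);
    INSERT INTO PUB_jc_job VALUES (1, '2000-06-15'), (2, '2005-01-10');
    INSERT INTO PUB_jc_phase VALUES (1, 'design'), (1, 'build'), (2, 'design');
""")

date_range = ("1999-03-31", "2001-12-31")
with conn:  # commits both deletes together, or rolls both back on error
    # Delete the phases first, while the matching jobs still exist to look up.
    conn.execute("""
        DELETE FROM PUB_jc_phase
        WHERE jph_job IN (SELECT job_num FROM PUB_jc_job
                          WHERE job_compdate BETWEEN ? AND ?)
    """, date_range)
    conn.execute("DELETE FROM PUB_jc_job WHERE job_compdate BETWEEN ? AND ?",
                 date_range)

jobs = conn.execute("SELECT job_num FROM PUB_jc_job").fetchall()
phases = conn.execute("SELECT jph_job, phase FROM PUB_jc_phase").fetchall()
print(jobs, phases)
```

The order matters in this variant: once the jobs are gone, the subquery can no longer identify which phases belonged to them.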

Understanding this SQL Query

I'm new to Oracle Database. Can someone help me understand this query? It eliminates duplicates from a table.
DELETE FROM table_name A
WHERE ROWID > (SELECT min(rowid)
FROM table_name B
WHERE A.key_values = B.key_values);
Any suggestions for improving the query are welcome.
Edit: No, this is not homework. What I didn't understand is what the subquery does and what ROWID > applied to the subquery does.
This is the Source of the query
Dissecting the actual mechanics:
DELETE FROM table_name A
This is a standard query to delete records from the table named "table_name". Here, it has been aliased as "A" to be referred to in the subquery.
WHERE ROWID >
This places a condition on the deletion, such that for each row encountered, the ROWID must meet the condition of being greater than...
(SELECT min(rowid)
FROM table_name B
WHERE A.key_values = B.key_values)
This is a subquery that is correlated to the main DELETE statement. It uses the value A.key_values from the outside query. So given a record from the DELETE statement, it will run this subquery to find the minimum rowid (internal record id) for all records in the same table (aliased as B now) that bear the same key_values value.
So, to put it together, say you had these rows
rowid | key_values
======= ============
1 A
2 B
3 B
4 C
5 A
6 B
The subquery works out that the min(rowid) for each record based on ALL records with the same key_values is:
rowid | key_values | min(rowid)
======= ============ ===========
1 A 1
2 B 2
3 B 2 **
4 C 4
5 A 1 **
6 B 2 **
For the records marked with **, the condition
WHERE ROWID > { subquery }
becomes true, and they are deleted.
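The worked example above can be reproduced as a rough sketch with Python's built-in sqlite3 module. Note the assumption: SQLite's `rowid` is an integer key (1 to 6 here, because the rows are inserted into a fresh table), not a physical address like Oracle's ROWID, so this only illustrates how the correlated comparison selects rows, not Oracle's storage behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (key_values TEXT)")
# Same six rows as the example; a fresh SQLite table numbers them rowid 1..6.
conn.executemany("INSERT INTO table_name (key_values) VALUES (?)",
                 [("A",), ("B",), ("B",), ("C",), ("A",), ("B",)])

# For each row, the subquery finds the smallest rowid sharing its key_values;
# any row whose own rowid is larger than that minimum is a duplicate.
conn.execute("""
    DELETE FROM table_name
    WHERE rowid > (SELECT min(b.rowid)
                   FROM table_name AS b
                   WHERE b.key_values = table_name.key_values)
""")

survivors = conn.execute(
    "SELECT rowid, key_values FROM table_name ORDER BY rowid").fetchall()
print(survivors)
```

Rows 3, 5 and 6 (the ones marked ** above) are removed, leaving one row per key.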
EDIT - additional info
This answer previously stated that ROWID increased by insertion order. That is very untrue. The truth is that rowid is just a file.block.slot-on-block - a physical address.
http://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:53140678334596
Tom's Followup December 1, 2008 - 6am Central time zone:
it is quite possible that D will be "first" in the table - as it took over A's place.
If rowids always "grew", than space would never be reused (that would be an implication of rowids growing always - we would never be able to reuse old space as the rowid is just a file.block.slot-on-block - a physical address)
Rowid is a pseudo-column that uniquely identifies each row in a table; it encodes the row's physical address rather than a sequence number.
This query finds all rows in A where A.key_values = B.key_values and deletes all of them except the one with the minimal rowid. It's just a way to arbitrarily choose one duplicate to preserve.
Quote AskTom:
A rowid is assigned to a row upon insert and is immutable (never changing)... unless the row
is deleted and re-inserted (meaning it is another row, not the same row!)
The query you provided is relying on that rowid, and deletes all the rows with a rowid value higher than the minimum one on a per key_values basis. Hence, any duplicates are removed.
The subquery you provided is a correlated subquery, because there's a relationship between the table reference in the subquery, and one outside of the subquery.
ROWID is the physical address of a row, not a counter that increments with each insert, so a lower ROWID does not prove that a row was inserted earlier. Your delete statement deletes all duplicates, keeping for each key only the row with the lowest ROWID. Make sense?

SQL Query - Ensure a row exists for each value in ()

I'm currently struggling to find a way to validate 2 tables (efficiently, as Table A has lots of rows).
I have two tables
Table A
ID
A
B
C
Table matched
ID Number
A 1
A 2
A 9
B 1
B 9
C 2
I am trying to write a SQL Server query that checks that for every value in Table A there exists a row for each value in a variable set of values (1, 2, 9).
The example above is incorrect because every record in A should have a corresponding record in Table matched for each value (1, 2, 9). The end goal is:
Table matched
ID Number
A 1
A 2
A 9
B 1
B 2
B 9
C 1
C 2
C 9
I know it's confusing, but in general for every X in (some set) there should be a corresponding record in Table matched. I have obviously simplified things.
Please let me know if you all need clarification.
Use:
SELECT a.id
FROM TABLE_A a
JOIN TABLE_B b ON b.id = a.id
WHERE b.number IN (1, 2, 9)
GROUP BY a.id
HAVING COUNT(DISTINCT b.number) = 3
The DISTINCT in the COUNT prevents duplicates (i.e. A having two records in TABLE_B with the value "2") from being falsely counted as a complete match. It can be omitted if a unique or primary key constraint guarantees that each id/number combination appears only once.
The HAVING COUNT(...) must equal the number of values provided in the IN clause.
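As a quick check of the query above against the question's sample data, here is a sketch using Python's built-in sqlite3 module. The table names TABLE_A and TABLE_B follow the answer; building the placeholder list and the expected count from one `required` tuple is an assumption of this sketch, added so the two cannot drift apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (id TEXT);
    CREATE TABLE TABLE_B (id TEXT, number INTEGER);
    INSERT INTO TABLE_A VALUES ('A'), ('B'), ('C');
    INSERT INTO TABLE_B VALUES ('A', 1), ('A', 2), ('A', 9),
                               ('B', 1), ('B', 9), ('C', 2);
""")

required = (1, 2, 9)
placeholders = ", ".join("?" for _ in required)  # one "?" per required value
complete = conn.execute(f"""
    SELECT a.id
    FROM TABLE_A a
    JOIN TABLE_B b ON b.id = a.id
    WHERE b.number IN ({placeholders})
    GROUP BY a.id
    HAVING COUNT(DISTINCT b.number) = ?
    ORDER BY a.id
""", (*required, len(required))).fetchall()
print(complete)
```

Only A has all three values, so it is the only id returned; B is missing 2 and C is missing 1 and 9.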
Create a temp table of the values you want. You can do this dynamically if the values 1, 2 and 9 are in some table you can query from.
Then cross join Table A with the temp table and select the ID/Number pairs that have no matching row in TableMatched (for example with NOT EXISTS).
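One way the temp-table idea could look in practice, sketched with Python's built-in sqlite3 module and the question's sample data. The column name `Number` on tempTable and the use of CROSS JOIN with NOT EXISTS are assumptions about how the answer would be fleshed out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID TEXT);
    CREATE TABLE TableMatched (ID TEXT, Number INTEGER);
    CREATE TEMP TABLE tempTable (Number INTEGER);
    INSERT INTO TableA VALUES ('A'), ('B'), ('C');
    INSERT INTO TableMatched VALUES ('A', 1), ('A', 2), ('A', 9),
                                    ('B', 1), ('B', 9), ('C', 2);
    INSERT INTO tempTable VALUES (1), (2), (9);
""")

# Every ID paired with every wanted value, minus the pairs that already exist.
missing = conn.execute("""
    SELECT a.ID, t.Number
    FROM TableA a
    CROSS JOIN tempTable t
    WHERE NOT EXISTS (SELECT 1 FROM TableMatched m
                      WHERE m.ID = a.ID AND m.Number = t.Number)
    ORDER BY a.ID, t.Number
""").fetchall()
print(missing)
```

The result lists exactly the rows that would have to be added to reach the end goal shown in the question.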
I had this situation one time. My solution was as follows.
In addition to TableA and TableMatched, there was a table that defined the rows that should exist in TableMatched for each row in TableA. Let’s call it TableMatchedDomain.
The application then accessed TableMatched through a view that controlled the returned rows, like this:
create view TableMatchedView as
select a.ID,
d.Number,
m.OtherValues
from TableA a
cross join TableMatchedDomain d
left join TableMatched m on m.ID = a.ID and m.Number = d.Number
This way, the rows returned were always correct. If there were missing rows from TableMatched, then the Numbers were still returned but with OtherValues as null. If there were extra values in TableMatched, then they were not returned at all, as though they didn't exist. By changing the rows in TableMatchedDomain, this behavior could be controlled very easily. If a value were removed from TableMatchedDomain, then it would disappear from the view. If it were added back again in the future, then the corresponding OtherValues would appear again as they were before.
The reason I designed it this way was that I felt that establishing an invariant on the row configuration in TableMatched was too brittle and, even worse, introduced redundancy. So I removed the restriction from groups of rows (in TableMatched) and instead made the entire contents of another table (TableMatchedDomain) define the correct form of the data.
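The behaviour of such a view can be tried out with a small sketch in Python's built-in sqlite3 module. The sample rows are invented, and the view is written with an explicit CROSS JOIN between TableA and TableMatchedDomain, which is an assumption about how the two tables were meant to be combined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID TEXT);
    CREATE TABLE TableMatchedDomain (Number INTEGER);
    CREATE TABLE TableMatched (ID TEXT, Number INTEGER, OtherValues TEXT);
    INSERT INTO TableA VALUES ('A'), ('B');
    INSERT INTO TableMatchedDomain VALUES (1), (2);
    -- B is missing Number 2 and has an extra Number 7 outside the domain.
    INSERT INTO TableMatched VALUES ('A', 1, 'a1'), ('A', 2, 'a2'),
                                    ('B', 1, 'b1'), ('B', 7, 'extra');

    CREATE VIEW TableMatchedView AS
    SELECT a.ID, d.Number, m.OtherValues
    FROM TableA a
    CROSS JOIN TableMatchedDomain d
    LEFT JOIN TableMatched m ON m.ID = a.ID AND m.Number = d.Number;
""")

rows = conn.execute(
    "SELECT ID, Number, OtherValues FROM TableMatchedView ORDER BY ID, Number"
).fetchall()
print(rows)
```

The missing (B, 2) row shows up with a null OtherValues, and the out-of-domain (B, 7) row does not appear at all.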