Difference between DELETE and DELETE FROM in SQL? - sql

Is there one? I am researching some stored procedures, and in one place I found the following line:
DELETE BI_Appointments
WHERE VisitType != (
SELECT TOP 1 CheckupType
FROM BI_Settings
WHERE DoctorName = @DoctorName)
Would that do the same thing as:
DELETE FROM BI_Appointments
WHERE VisitType != (
SELECT TOP 1 CheckupType
FROM BI_Settings
WHERE DoctorName = @DoctorName)
Or is it a syntax error, or something entirely different?

Assuming this is T-SQL or MS SQL Server, there is no difference and the statements are identical. The first FROM keyword is syntactically optional in a DELETE statement.
http://technet.microsoft.com/en-us/library/ms189835.aspx
The keyword exists, yet is optional, for two reasons.
First, the SQL standard requires the FROM keyword in the clause, so SQL Server has to accept it for standards compliance.
Second, although the keyword is redundant, that's probably not why it's optional. I believe it's because SQL Server allows you to specify a JOIN in the DELETE statement, and making the first FROM mandatory would make that awkward.
For example, here's a normal delete:
DELETE FROM Employee WHERE ID = @value
And that can be shortened to:
DELETE Employee WHERE ID = @value
And SQL Server allows you to delete based on another table with a JOIN:
DELETE Employee
FROM Employee
JOIN Site
ON Employee.SiteID = Site.ID
WHERE Site.Status = 'Closed'
If the first FROM keyword were not optional, the query above would need to look like this:
DELETE FROM Employee
FROM Employee
JOIN Site
ON Employee.SiteID = Site.ID
WHERE Site.Status = 'Closed'
The query above is perfectly valid and does execute, but it's very awkward to read. It's hard to tell that it's a single query; it looks like two got mashed together because of the "duplicate" FROM clauses.
Side note: your example subqueries are potentially non-deterministic, since TOP 1 is used without an ORDER BY clause.

In Oracle Database there is no difference between DELETE and DELETE FROM either; the FROM keyword is optional. The standard form, however, is:
DELETE FROM table [ WHERE condition ]
This is the SQL-92 standard syntax. Prefer writing your code the standard way.

Related

update average/count from another table

I've been provided the below schema for this problem and I'm trying to do two things:
Update the ACCOUNT table's average_eval column with the average of the evaluation column from the POST_EVAL table per account_id.
Update the ACCOUNT table with a count of the number of posts per account_id, with default value 0 if the account_id has no post_id associated to it.
Here's the kicker: I MUST use the UPDATE statement and I'm not allowed to use triggers for these specific problems.
I've tried WITH clauses and GROUP BY but haven't gotten anywhere. I'm using PostgreSQL's pgAdmin, for reference.
Any help setting up these queries?
The first question can be done using something like this:
update account a
set average_eval = t.avg_eval
from (
select account_id, avg(evaluation) as avg_eval
from post_eval
group by account_id
) t
where t.account_id = a.account_id
The second question needs a correlated sub-query, as there is no way to express an outer join in an UPDATE statement like the one above:
update account a
set num_posts = (select count(*)
from post p
where p.account_id = a.account_id);
The count() will return zero (0) if there are no posts for that account. If a join were used (as in the first statement), those rows would not be updated at all, as the "join" condition wouldn't match.
I have not tested either of those statements, so they can contain typos (or even logical errors).
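To check the claim about count() returning zero, here is a minimal sketch that runs the second UPDATE against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for PostgreSQL here, the three sample accounts are made up, and the table alias is dropped in favour of full table names; treat it as an illustration under those assumptions rather than a tested PostgreSQL script.

```python
import sqlite3

# Hypothetical miniature of the schema; only the columns the UPDATE needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (account_id INTEGER PRIMARY KEY, num_posts INTEGER);
    CREATE TABLE post (post_id INTEGER PRIMARY KEY, account_id INTEGER);
    INSERT INTO account (account_id) VALUES (1), (2), (3);
    INSERT INTO post (post_id, account_id) VALUES (10, 1), (11, 1), (12, 2);
""")

# Correlated sub-query: evaluated once per account row, so every row is updated,
# including accounts that have no matching post.
conn.execute("""
    UPDATE account
    SET num_posts = (SELECT COUNT(*)
                     FROM post
                     WHERE post.account_id = account.account_id)
""")

counts = dict(conn.execute("SELECT account_id, num_posts FROM account"))
print(counts)  # account 3 has no posts and is set to 0 rather than left NULL
```

Account 3 has no rows in post, yet it is still updated, because the sub-query is evaluated for every row of account rather than only for rows that survive a join.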
Unrelated, but: I understand that this is some kind of assignment, so you have no choice. But as RiggsFolly has mentioned: in general you should avoid storing information in a relational database that can be derived from existing data. Both values can easily be calculated in a view and then will always be up-to-date.

Bizarre behavior of a query

I found this query from a developer:
DELETE FROM [MYDB].[dbo].[MYSIGN] where USERID in
(select USERID from [MYDB].[dbo].[MYUSER] where Surname = 'Rossi');
This query deletes every record in table MYSIGN.
The field USERID does not exist in table MYUSER. If I run only the subquery:
select USERID from [MYDB].[dbo].[MYUSER] where Surname = 'Rossi'
It throws the right error, because of the missing column.
We corrected the query using the right column, but we couldn't figure out:
Why does the first query work?
Why does it delete every record?
Specs: database is on a SQL SERVER 2016 SP1, CU3.
Apparently you have USERID in [MYDB].[dbo].[MYSIGN], so SQL Server resolves the unprefixed USERID in (select USERID from [MYDB].[dbo].[MYUSER] where Surname = 'Rossi') to [MYDB].[dbo].[MYSIGN].USERID.
Use aliases and it will fail:
DELETE FROM [MYDB].[dbo].[MYSIGN] where USERID in
(select t.USERID from [MYDB].[dbo].[MYUSER] t where Surname = 'Rossi');
This is sometimes referred to as an "accidental correlated sub-query", as @NenadZivkovic named it; I like the term.
The problem is the scoping rules of subqueries. If the column is not found in the subquery tables, then the SQL engine starts looking at the next level out -- and so on (in the case of SQL Server).
Whenever you have multiple tables in a query, always qualify the column names. This means putting the table name (or alias) in front of the column name. Then you have no ambiguity:
DELETE
FROM [MYDB].[dbo].[MYSIGN] m
WHERE m.USERID IN (SELECT u.USERID FROM [MYDB].[dbo].[MYUSER] u WHERE u.Surname = 'Rossi');
A simple rule to follow that makes your code more readable and less prone to error.
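The scoping behaviour can be reproduced outside SQL Server. The sketch below uses an in-memory SQLite database through Python's sqlite3 module as a stand-in (an assumption; the original report is about SQL Server 2016), with lower-cased table names and made-up sample rows. It runs the qualified statement first, which fails, and then the unqualified one, which empties the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mysign (userid INTEGER, doc TEXT);
    CREATE TABLE myuser (id INTEGER, surname TEXT);  -- deliberately has no userid column
    INSERT INTO mysign VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO myuser VALUES (1, 'Rossi'), (2, 'Bianchi');
""")

# Qualified reference: the missing column is reported instead of being
# silently resolved against the outer table.
try:
    conn.execute(
        "DELETE FROM mysign WHERE userid IN "
        "(SELECT u.userid FROM myuser u WHERE u.surname = 'Rossi')"
    )
    qualified_failed = False
except sqlite3.OperationalError:
    qualified_failed = True

rows_before = conn.execute("SELECT COUNT(*) FROM mysign").fetchone()[0]

# Unqualified reference: userid binds to mysign.userid, so for each outer row
# the sub-query returns that row's own userid as long as a 'Rossi' exists.
conn.execute(
    "DELETE FROM mysign WHERE userid IN "
    "(SELECT userid FROM myuser WHERE surname = 'Rossi')"
)
rows_after = conn.execute("SELECT COUNT(*) FROM mysign").fetchone()[0]

print(qualified_failed, rows_before, rows_after)
```

The only difference between the two statements is the alias prefix, which is what turns a silent mass delete into an immediate error.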

SQL Server: Using columns with identical names

I'm writing a migration script to move data from one data model to another in Microsoft SQL Server Management Studio. The problem I'm running into is that, in the source database, some tables have foreign key columns that I need to compare. A snippet of code:
INSERT INTO TargetDB.dbo.Encounter(EncounterID, PATID, DRG)
Select
visit_occurrence_id,
person_id,
(Select
Case when ((Select top 1 observation_concept_id from SourceDB.dbo.Observation where visit_occurrence_id = visit_occurrence_id) = 3040464)
Then (Select top 1 value_as_string from SourceDB.dbo.Observation where visit_occurrence_id = visit_occurrence_id)
Else NULL End
)
from SourceDB.dbo.Visit_occurrence
As you can see, I need to compare visit_occurrence_id in SourceDB.dbo.Observation to visit_occurrence_id in SourceDB.dbo.Visit_occurrence. As it is, it's just returning values from the first row in SourceDB.dbo.Observation, since visit_occurrence_id will always equal itself.
What's the proper way to do this? Can I assign the first visit_occurrence_id value to a variable within the query, so it has a distinct name? I'm pretty lost here.
I'm going to add a little more detail for you here in an answer. You can always refer to an object by its fully-qualified name, but it isn't always necessary:
Database.Schema.Table
or
Database.Schema.Table.Column
With SQL Server, the name can even include the server, for linked-server scenarios.
The same is true of other objects like views, procedures, functions, etc. Aliasing tables and/or columns can be a good strategy for shortening this qualification.
Qualification is necessary whenever there is ambiguity. However, it is good practice to be fairly explicit anyway, because it can save you future headaches. As an example, consider this view:
CREATE VIEW vwEmployeesWithLocation AS
SELECT
E.EmployeeId -- from employees
, LastName -- from employees
, Status -- from employees
, LocationName -- from locations
FROM
Employees AS E
INNER JOIN
EmployeeLocations AS EL ON E.EmployeeId = EL.EmployeeId
INNER JOIN
Locations AS L ON EL.LocationId = L.LocationId
Right now, everything is fine because other than EmployeeId, the column names are distinct. However, someone might add a Status column to the Locations table in the future and break this view. So, it would be better to explicitly include the table prefix for all columns in the select.
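As a rough illustration of that failure mode, the sketch below recreates the view in an in-memory SQLite database via Python's sqlite3 module and then adds a Status column to Locations. SQLite is only a stand-in for SQL Server here, and the column types and sample row are assumptions made for the example; the exact error text and the moment it surfaces may differ between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeId INTEGER, LastName TEXT, Status TEXT);
    CREATE TABLE Locations (LocationId INTEGER, LocationName TEXT);
    CREATE TABLE EmployeeLocations (EmployeeId INTEGER, LocationId INTEGER);
    CREATE VIEW vwEmployeesWithLocation AS
        SELECT E.EmployeeId, LastName, Status, LocationName
        FROM Employees AS E
        INNER JOIN EmployeeLocations AS EL ON E.EmployeeId = EL.EmployeeId
        INNER JOIN Locations AS L ON EL.LocationId = L.LocationId;
    INSERT INTO Employees VALUES (1, 'Smith', 'Active');
    INSERT INTO Locations VALUES (7, 'HQ');
    INSERT INTO EmployeeLocations VALUES (1, 7);
""")

# The view works while Status exists in only one of the joined tables.
before = conn.execute("SELECT * FROM vwEmployeesWithLocation").fetchall()

# Add a Status column to Locations and use the view again: the unqualified
# Status inside the view can now refer to two tables.
try:
    conn.execute("ALTER TABLE Locations ADD COLUMN Status TEXT")
    conn.execute("SELECT LastName, Status FROM vwEmployeesWithLocation").fetchall()
    view_broken = False
except sqlite3.OperationalError as exc:
    view_broken = True
    print("view no longer usable:", exc)
```

Writing E.LastName, E.Status and L.LocationName in the view definition would keep it working after the new column is added.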
In your case, your query is cross database, so again, be explicit about the database in all parts of your query.
Used snow_FFFFFF's answer in the comments: I just used SourceDB.dbo.Observation.visit_occurrence_id.
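The self-comparison from the question, and the effect of qualifying the column, can be seen in a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module instead of SQL Server (so LIMIT 1 replaces TOP 1), short aliases instead of the full SourceDB.dbo prefixes, and two made-up rows per table; all of that is assumed for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visit_occurrence (visit_occurrence_id INTEGER);
    CREATE TABLE observation (visit_occurrence_id INTEGER, value_as_string TEXT);
    INSERT INTO visit_occurrence VALUES (1), (2);
    INSERT INTO observation VALUES (1, 'first'), (2, 'second');
""")

# Unqualified: both sides of the comparison bind to observation's column,
# so the predicate holds for every row and the same first row is returned.
unqualified = conn.execute("""
    SELECT visit_occurrence_id,
           (SELECT value_as_string FROM observation
            WHERE visit_occurrence_id = visit_occurrence_id
            ORDER BY visit_occurrence_id LIMIT 1)
    FROM visit_occurrence
    ORDER BY visit_occurrence_id
""").fetchall()

# Qualified: the right-hand side now refers to the outer visit_occurrence row.
qualified = conn.execute("""
    SELECT v.visit_occurrence_id,
           (SELECT o.value_as_string FROM observation o
            WHERE o.visit_occurrence_id = v.visit_occurrence_id
            ORDER BY o.visit_occurrence_id LIMIT 1)
    FROM visit_occurrence v
    ORDER BY v.visit_occurrence_id
""").fetchall()

print(unqualified)
print(qualified)
```

Note that both sides of the comparison need a table prefix (or alias); prefixing only the Observation side leaves the other side bound to Observation as well.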

In an EXISTS can my JOIN ON use a value from the original select

I have an order system. Users can be attached to different orders as different types of user. They can download documents associated with an order. Documents are only given to certain types of users on the order. I'm having trouble writing the query to check a user's permission to view a document and select the info about the document.
I have the following tables and (applicable) fields:
Docs: DocNo, FileNo
DocAccess: DocNo, UserTypeWithAccess
FileUsers: FileNo, UserType, UserNo
I have the following query:
SELECT Docs.*
FROM Docs
WHERE DocNo = 1000
AND EXISTS (
SELECT * FROM DocAccess
LEFT JOIN FileUsers
ON FileUsers.UserType = DocAccess.UserTypeWithAccess
AND FileUsers.FileNo = Docs.FileNo /* Errors here */
WHERE DocAccess.UserNo = 2000 )
The trouble is that, inside the EXISTS subquery, it does not recognize Docs (at Docs.FileNo) as a valid table. If I move the second ON condition to the WHERE clause it works, but I would rather limit the initial join than filter the rows out after the fact.
I can get around this a couple ways, but this seems like it would be best. Anything I'm missing here? Or is it simply not allowed?
I think this is a limitation of your database engine. In most databases, docs would be in scope for the entire subquery, including both the where and on clauses.
However, you do not need to worry about where you put the particular condition. SQL is a descriptive language, not a procedural language. The purpose of SQL is to describe the output. The SQL engine, parser, and compiler should choose the optimal execution path. That is not always true in practice, but move the condition to the where clause and don't worry about it.
It is not clear to me why you need to join with FileUsers at all in your subquery.
What is the purpose and idea of the query (in plain English)?
In any case, if you do need to join with FileUsers, then I suggest using an inner join and moving the second filter to the WHERE condition. I don't think you can use it in a JOIN condition in a subquery; at least I've never seen it used this way before. I believe you can only correlate through the WHERE clause.
You have to use aliases to get this working:
SELECT
doc.*
FROM
Docs doc
WHERE
doc.DocNo = 1000
AND EXISTS (
SELECT
*
FROM
DocAccess acc
LEFT OUTER JOIN
FileUsers usr
ON
usr.UserType = acc.UserTypeWithAccess
AND usr.FileNo = doc.FileNo
WHERE
acc.UserNo = 2000
)
This also makes it more clear which table each field belongs to (think about using the same table twice or more in the same query with different aliases).
If you would only like to limit the output to one row you can use TOP 1:
SELECT TOP 1
doc.*
FROM
Docs doc
INNER JOIN
FileUsers usr
ON
usr.FileNo = doc.FileNo
INNER JOIN
DocAccess acc
ON
acc.UserTypeWithAccess = usr.UserType
WHERE
doc.DocNo = 1000
AND acc.UserNo = 2000
Of course, the second query works a bit differently from the first one (both JOINs are INNER). Depending on your data model, you might even leave the TOP 1 out of that query.

SQL - table alias scope

I've just learned ( yesterday ) to use "exists" instead of "in".
BAD
select * from table where nameid in (
select nameid from othertable where otherdesc = 'SomeDesc' )
GOOD
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
And I have some questions about this:
1) The explanation, as I understood it, was: "The reason why this is better is because only the matching values will be returned instead of building a massive list of possible results". Does that mean that while the first subquery might return 900 results, the second will return only 1 (yes or no)?
2) In the past I have had the RDBMS complaining: "only the first 1000 rows might be retrieved". Would this second approach solve that problem?
3) What is the scope of the alias in the second subquery? Does the alias only live inside the parentheses?
for example
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
AND
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeOtherDesc' )
That is, if I use the same alias ( o for table othertable ) In the second "exist" will it present any problem with the first exists? or are they totally independent?
Is this something Oracle only related or it is valid for most RDBMS?
Thanks a lot
It's specific to each DBMS and depends on the query optimizer. Some optimizers detect IN clause and translate it.
In all DBMSes I tested, alias is only valid inside the ( )
BTW, you can rewrite the query as:
select t.*
from table t
join othertable o on t.nameid = o.nameid
and o.otherdesc in ('SomeDesc','SomeOtherDesc');
And, to answer your questions:
1) Yes
2) Yes
3) Yes
You are treading into complicated territory, known as 'correlated sub-queries'. Since we don't have detailed information about your tables and the key structures, some of the answers can only be 'maybe'.
In your initial IN query, the notation would be valid whether or not OtherTable contains a column NameID (and, indeed, whether OtherDesc exists as a column in Table or OtherTable - which is not clear in any of your examples, but presumably is a column of OtherTable). This behaviour is what makes a correlated sub-query into a correlated sub-query. It is also a routine source of angst for people when they first run into it - invariably by accident. Since the SQL standard mandates the behaviour of interpreting a name in the sub-query as referring to a column in the outer query if there is no column with the relevant name in the tables mentioned in the sub-query but there is a column with the relevant name in the tables mentioned in the outer (main) query, no product that wants to claim conformance to (this bit of) the SQL standard will do anything different.
The answer to your Q1 is "it depends", but given plausible assumptions (NameID exists as a column in both tables; OtherDesc only exists in OtherTable), the results should be the same in terms of the data set returned, but may not be equivalent in terms of performance.
The answer to your Q2 is that in the past, you were using an inferior if not defective DBMS. If it supported EXISTS, then the DBMS might still complain about the cardinality of the result.
The answer to your Q3 as applied to the first EXISTS query is "t is available as an alias throughout the statement, but o is only available as an alias inside the parentheses". As applied to your second example box - with AND connecting two sub-selects (the second of which is missing the open parenthesis when I'm looking at it), then "t is available as an alias throughout the statement and refers to the same table, but there are two different aliases both labelled 'o', one for each sub-query". Note that the query might return no data if OtherDesc is unique for a given NameID value in OtherTable; otherwise, it requires two rows in OtherTable with the same NameID and the two OtherDesc values for each row in Table with that NameID value.
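The alias scoping described for Q3 can be tried out with a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module (an assumption; the question is about Oracle), names the outer table t because table is a reserved word, and uses made-up rows. Both sub-queries declare their own alias o, and a reference to o outside the parentheses is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (nameid INTEGER);
    CREATE TABLE othertable (nameid INTEGER, otherdesc TEXT);
    INSERT INTO t VALUES (1), (2), (3);
    INSERT INTO othertable VALUES
        (1, 'SomeDesc'), (1, 'SomeOtherDesc'),
        (2, 'SomeDesc'),
        (3, 'SomeOtherDesc');
""")

# Each EXISTS has its own, independent alias "o"; "t" is visible in both.
rows = conn.execute("""
    SELECT t.nameid FROM t
    WHERE EXISTS (SELECT 1 FROM othertable o
                  WHERE o.nameid = t.nameid AND o.otherdesc = 'SomeDesc')
      AND EXISTS (SELECT 1 FROM othertable o
                  WHERE o.nameid = t.nameid AND o.otherdesc = 'SomeOtherDesc')
    ORDER BY t.nameid
""").fetchall()

# Using "o" outside the parentheses fails: the alias does not exist there.
try:
    conn.execute("""
        SELECT o.nameid FROM t
        WHERE EXISTS (SELECT 1 FROM othertable o WHERE o.nameid = t.nameid)
    """)
    outer_use_failed = False
except sqlite3.OperationalError:
    outer_use_failed = True

print(rows, outer_use_failed)
```

Only nameid 1 has both descriptions in othertable, which matches the note above that the combined query needs two matching rows per NameID.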
Oracle-specific: When you write a query using the IN clause, you're telling the rule-based optimizer that you want the inner query to drive the outer query. When you write EXISTS in a where clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query. See "Difference between IN and EXISTS in subqueries".
Probably.
An alias declared inside a subquery lives only inside that subquery. By the way, I don't think your example with 2 ANDed subqueries is valid SQL. Did you mean UNION instead of AND?
Personally I would use a join, rather than a subquery for this.
SELECT t.*
FROM yourTable t
INNER JOIN otherTable ot
ON (t.nameid = ot.nameid AND ot.otherdesc = 'SomeDesc')
It is difficult to generalize that EXISTS is always better than IN. If that were the case, the SQL community would have replaced IN with EXISTS.
Also, please note that IN and EXISTS are not the same; the results may differ between the two.
With IN, the inner table is usually scanned in full once, without removing NULLs (so if you have NULLs in your inner table, IN will not remove them by default), while EXISTS removes NULLs and, in the case of a correlated subquery, runs the inner query for every row of the outer query.
Assuming there are no NULLs and it's a simple query (with no correlation), EXISTS might perform better if the row you are looking for is not the last row. If it happens to be the last row, EXISTS may need to scan to the end just like IN, so performance will be similar.
But IN and EXISTS are not interchangeable.
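One place where the two constructs visibly diverge is the negated form when the inner table contains a NULL. The sketch below is an illustration using an in-memory SQLite database through Python's sqlite3 module, with hypothetical tables t and othertable and made-up rows; it compares IN, EXISTS, NOT IN and NOT EXISTS on the same data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (nameid INTEGER);
    CREATE TABLE othertable (nameid INTEGER);
    INSERT INTO t VALUES (1), (2);
    INSERT INTO othertable VALUES (1), (NULL);
""")

def ids(sql):
    """Run a query and return the first column of each row as a list."""
    return [row[0] for row in conn.execute(sql)]

in_rows = ids(
    "SELECT nameid FROM t WHERE nameid IN (SELECT nameid FROM othertable)")
exists_rows = ids(
    "SELECT nameid FROM t WHERE EXISTS "
    "(SELECT 1 FROM othertable o WHERE o.nameid = t.nameid)")

# 2 NOT IN (1, NULL) evaluates to NULL rather than true, so no row qualifies.
not_in_rows = ids(
    "SELECT nameid FROM t WHERE nameid NOT IN (SELECT nameid FROM othertable)")
# NOT EXISTS only asks whether a matching row exists, so 2 is returned.
not_exists_rows = ids(
    "SELECT nameid FROM t WHERE NOT EXISTS "
    "(SELECT 1 FROM othertable o WHERE o.nameid = t.nameid)")

print(in_rows, exists_rows, not_in_rows, not_exists_rows)
```

On this data the positive forms agree, while NOT IN returns nothing and NOT EXISTS returns the unmatched row, which is the usual reason given for not treating them as drop-in replacements.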