Why do SQL INSERT and UPDATE Statements have Different Syntaxes?

While contemplating this question about a SQL INSERT statement, it occurred to me that the distinction in syntax between the two statements is largely artificial. That is, why can't we do:
INSERT INTO MyTable SET Field1=Value1, Field2=Value2, ...
or
UPDATE MyTable ( Field1, Field2 ...) VALUES ( Value1, Value2, ... )
WHERE some-key = some-value
Perhaps I'm missing something critical. But for those of us who have had to concatenate our SQL statements in the past, having comparable syntax for an INSERT and an UPDATE statement would have saved a significant amount of coding.

They're serving different grammatical functions. In an update you are specifying a filter that chooses a set of rows to which you will apply an update. And of course that syntax is shared with a SELECT query for the same purpose.
In an INSERT you are not choosing any rows, you are generating a new row which requires specifying a set of values.
In an UPDATE, the LHS=RHS stuff is specifying an expression which yields true or false (or maybe null :) and in an INSERT, the VALUES clause is about assignment of value. So while they are superficially similar, they are semantically quite different, imho. Although I have written a SQL parser, so that may influence my views. :)

SQL Server 2008 has introduced UPSERT functionality via the MERGE command. This is the logical equivalent of
IF FOUND THEN
UPDATE
ELSE
INSERT

I believe this is so that you may make an insert statement without being explicit about the values. If you are putting a value in every single column in the table you can write:
insert into my_table values ('value1', 2);
instead of:
insert into my_table (column1, column2) values ('value1', 2);
When importing and exporting entire (large) databases, this is invaluable for cutting down file size and processing time. Nowadays, with binary snapshots and the like, it may be "less invaluable" :-)

Related

Postgresql - INSERT INTO based on multiple SELECT

I am trying to write an INSERT INTO query in PostgreSQL based on several SELECTs, but I didn't succeed.
I have one table containing data I select (srctab), and another one where I insert data (dsttab).
Here is what I ran:
INSERT INTO dsttab (dstfld1, dstfld2) WITH
t1 AS (
SELECT srcfld1
FROM srctab
WHERE srcfld3 ='foo'
),
t2 AS (
SELECT srcfld5
FROM srctab
WHERE srcfld6 ='bar'
) select srcfld1, srcfld5 from srctab;
Could you please help me make this work? Thank you!
Note: I'm guessing about what you want to do here. My guess is that you want to insert a single row with the values from the CTEs (That's the WITH block.). Your query as written would insert a row into dsttab for every row in srctab, if it were valid syntax.
You don't really need a CTE here. CTEs should really only be used when you need to reference the same subquery more than once; that's what they exist for. (Occasionally, you can somewhat abuse them to control certain performance aspects in PostgreSQL, but that isn't the case in other DBs and is something to be avoided when possible anyway.)
Just put your queries in line:
INSERT INTO dsttab (dstfld1, dstfld2)
VALUES (
(SELECT srcfld1
FROM srctab
WHERE srcfld3 ='foo'),
(SELECT srcfld5
FROM srctab
WHERE srcfld6 ='bar')
);
The key point here is to surround the subqueries with parentheses.
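For anyone who wants to try the parenthesized-subquery form without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names come from the question; the sample rows and the in-memory database are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database with made-up sample rows (assumption, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE srctab (srcfld1 TEXT, srcfld3 TEXT, srcfld5 TEXT, srcfld6 TEXT);
    CREATE TABLE dsttab (dstfld1 TEXT, dstfld2 TEXT);
    INSERT INTO srctab VALUES ('a1', 'foo', 'a5', 'x');
    INSERT INTO srctab VALUES ('b1', 'y',   'b5', 'bar');
""")

# Each parenthesized SELECT is a scalar subquery used as one value.
# PostgreSQL raises an error if such a subquery returns more than one row;
# SQLite silently takes the first row instead.
conn.execute("""
    INSERT INTO dsttab (dstfld1, dstfld2)
    VALUES (
        (SELECT srcfld1 FROM srctab WHERE srcfld3 = 'foo'),
        (SELECT srcfld5 FROM srctab WHERE srcfld6 = 'bar')
    )
""")

rows = conn.execute("SELECT dstfld1, dstfld2 FROM dsttab").fetchall()
print(rows)
```

Exactly one row is inserted, built from the two independent lookups.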

Why do I need the 'match' part of a SQL merge, in this scenario?

Consider the following:
merge into T t1
using (select ID,Col1 from T where ID = 123) t2
on 1 = 0
when not matched then insert (Col1) values (t2.Col1);
Coming from a programming background, to me this translates to:
"Evaluate false (i.e. 1 = 0), and when it is false (i.e. all the time), insert."
Is it not possible to just omit the match condition?
Is it because of my select's where condition that I'm confused here? Should this condition be moved to the on?
NOTE:
Due to restrictions with output, I cannot use insert. I need to output the results of this merge into a temporary table for reasons outside of the scope of what I'm asking.
In the answer you've linked to in the comments, as I've hopefully made clear, we are abusing the MERGE statement.
The query you've shown here could trivially be replaced by:
insert into T(Col1) select Col1 from T where ID = 123
However, if you want to be able to add an OUTPUT clause, and that OUTPUT clause needs to reference both the newly inserted data and data from the source table, you're not allowed to write such a clause on an INSERT statement.
So, we instead use a MERGE statement, but not for its intended purpose. The entire purpose is to force it to perform an INSERT and write our OUTPUT clause.
If we examine the documentation for MERGE, we see that the only clause in which we can specify to perform an INSERT is in the WHEN NOT MATCHED [BY TARGET] clause - in both the WHEN MATCHED and WHEN NOT MATCHED BY SOURCE clauses, our only options are to UPDATE or DELETE.
So, we have to write the MERGE such that matching always fails - and the simplest way to do that is to say that matching should occur when 1 = 0 [1] - which, hopefully, is never.
[1] Since SQL Server doesn't support boolean literals.

SQL Server: Inserting from various sources

The simple version
What is the correct syntax of this?
INSERT INTO foo(IP, referer)
VALUES(SELECT bin FROM dbo.bar("foobar"),"www.foobar.com/test/")
I am getting syntax errors near 'SELECT' and ')'
The long version: I want to insert using a Function and a string (this is simplified, in reality there will be a few other values including datetime, ints, etc to insert along with the function result).
I have a function, itvfBinaryIPv4, which was set up to convert IPs to a binary(4) datatype to allow for easy indexing, I used this as a reference: Datatype for storing ip address in SQL Server.
So this is what I am trying to accomplish:
INSERT INTO foo (IP, referer)
VALUES(SELECT bin FROM dbo.itvfBinaryIPv4("192.65.68.201"), "www.foobar.com/test/")
However, I get syntax errors near 'SELECT' and ')'. What is the correct syntax to insert with function results and direct data?
It should be like this -
INSERT INTO foo (IP, referer)
SELECT bin, 'www.foobar.com/test/'
FROM dbo.itvfBinaryIPv4('192.65.68.201')
Here, dbo.itvfBinaryIPv4('192.65.68.201') is assumed to be a table-valued function.
The INSERT command comes in two flavors:
(1) either you have all your values available, as literals or SQL Server variables - in that case, you can use the INSERT .. VALUES() approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
VALUES(Value1, Value2, @Variable3, @Variable4, ...., ValueN)
Note: I would recommend always explicitly specifying the list of columns to insert data into - that way, you won't have any nasty surprises if suddenly your table has an extra column, or if your table has an IDENTITY or computed column. Yes - it's a tiny bit more work - once - but then you have your INSERT statement as solid as it can be and you won't have to constantly fiddle around with it if your table changes.
(2) if you don't have all your values as literals and/or variables, but instead you want to rely on another table, multiple tables, or views, to provide the values, then you can use the INSERT ... SELECT ... approach:
INSERT INTO dbo.YourTable(Col1, Col2, ...., ColN)
SELECT
SourceColumn1, SourceColumn2, @Variable3, @Variable4, ...., SourceColumnN
FROM
dbo.YourProvidingTableOrView
Here, you must define exactly as many items in the SELECT as your INSERT expects - and those can be columns from the table(s) (or view(s)), or those can be literals or variables. Again: explicitly provide the list of columns to insert into - see above.
You can use one or the other - but you cannot mix the two - you cannot use VALUES(...) and then have a SELECT query in the middle of your list of values - pick one of the two - stick with it.
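The two flavors can be sketched side by side with Python's built-in sqlite3 module. This is only an illustration under assumptions: a plain table named source stands in for the table-valued function from the question, and the sample addresses are made up. The shape of the statements is what matters, not the SQL Server specifics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source (bin BLOB);            -- hypothetical stand-in for the function result
    CREATE TABLE foo (IP BLOB, referer TEXT);
    INSERT INTO source VALUES (X'C04144C9');   -- 192.65.68.201 as four bytes
""")

# Flavor 1: INSERT ... VALUES, every value is a literal or a bound parameter.
conn.execute(
    "INSERT INTO foo (IP, referer) VALUES (?, ?)",
    (b"\x7f\x00\x00\x01", "www.foobar.com/direct/"),
)

# Flavor 2: INSERT ... SELECT, the literal simply becomes one item in the SELECT list.
conn.execute(
    "INSERT INTO foo (IP, referer) SELECT bin, ? FROM source",
    ("www.foobar.com/test/",),
)

foo_rows = conn.execute("SELECT IP, referer FROM foo ORDER BY referer").fetchall()
print(foo_rows)
```

In the second statement the constant rides along with the selected column, which is how a SELECT and direct data end up in the same INSERT.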
Haven't checked it, but the correct syntax would be:
INSERT INTO foo (IP, referer)
SELECT bin, 'www.foobar.com/test/' FROM dbo.itvfBinaryIPv4('192.65.68.201')

SQL: Use the same string for both INSERT and UPDATE?

The INSERT syntax I've been using is this
INSERT INTO TableName VALUES (...)
The UPDATE syntax I've been using is
UPDATE TableName SET ColumnName=Value WHERE ...
So in all my code, I have to generate 2 strings, which would result in something like this
insertStr = "(27, 'John Brown', 102)";
updateStr = "ID=27, Name='John Brown', ItemID=102";
and then use them separately
"UPDATE TableName SET " + updateStr + " WHERE ID=27 " +
"IF @@ROWCOUNT=0 "+
"INSERT INTO TableName VALUES (" + insertStr + ")"
It starts bothering me when I am working with tables with like 30 columns.
Can't we generate just one string to use on both INSERT and UPDATE?
eg. using insertStr above on UPDATE statement or updateStr on INSERT statement, or a whole new way?
I think you need a whole new approach. You are open to SQL Injection. Provide us with some sample code as to how you are getting your data inputs and sending the statements to the database.
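To make the injection point concrete, here is a minimal sketch of the same INSERT and UPDATE written with bound parameters instead of string concatenation. It uses Python's built-in sqlite3 module only because it is easy to run; the table layout is guessed from the question, and the hostile input is invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column layout assumed from the question's example values.
conn.execute("CREATE TABLE TableName (ID INTEGER PRIMARY KEY, Name TEXT, ItemID INTEGER)")

# Invented hostile input: with concatenation this could terminate the statement early.
name = "Robert'); DROP TABLE TableName;--"

# The SQL text contains only placeholders; the values travel separately.
conn.execute(
    "INSERT INTO TableName (ID, Name, ItemID) VALUES (?, ?, ?)",
    (27, name, 102),
)
after_insert = conn.execute("SELECT ID, Name, ItemID FROM TableName").fetchall()

conn.execute(
    "UPDATE TableName SET Name = ?, ItemID = ? WHERE ID = ?",
    ("John Brown", 103, 27),
)
after_update = conn.execute("SELECT ID, Name, ItemID FROM TableName").fetchall()
print(after_insert, after_update)
```

The hostile string is stored as plain data and the table survives. Most database drivers offer an equivalent placeholder mechanism, although the placeholder syntax differs between them.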
As far as I'm aware, what you're describing isn't possible in ANSI SQL, or any extension of it that I know. However, I'm mostly familiar with MySQL, and it likely depends completely upon what RDBMS you're using. For example, MySQL has "INSERT ... ON DUPLICATE KEY UPDATE ... " syntax, which is similar to what you've posted there, and combines an INSERT query with an UPDATE query. The upside is that you are combining two possible operations into a single query, however, the INSERT and UPDATE portions of the query are admittedly different.
Generally, this kind of thing can be abstracted away with an ORM layer in your application. As far as raw SQL goes, I'd be interested in any syntax that worked the way you describe.
Some DBMSs have an extension to do this, but why don't you just provide a function to do it for you? We've actually done this before.
I'm not sure what language you're using, but it's probably got associative arrays where you can write something like:
pk{"ID"} = "27"
val{"Name"} = "'John Brown'"
val{"ItemID"} = "102"
upsert ("MyTable", pk, val)
and, if it doesn't have associative arrays, you can emulate them with multiple integer-based arrays of strings.
In our upsert() function, we just constructed a string (update, then insert if the update failed) and passed it to our DBMS. We kept the primary keys separate from our other fields since that made construction of the update statement a lot easier (primary key columns went in the where clause, other columns were just set).
The calls above would result in the following SQL (we had a different check for a failed update, but I've put your @@rowcount in for this example):
update MyTable set
Name = 'John Brown',
ItemID = 102
where ID = 27
if @@rowcount=0
insert into MyTable (ID, Name, ItemID) values (
27,
'John Brown',
102
)
That's one solution which worked well for us. No doubt there are others.
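A rough sketch of such an upsert() helper, written in Python against the built-in sqlite3 module, might look like the following. This is not the original function described above; the signature, the use of dicts for pk and val, and the use of bound parameters instead of pre-quoted strings are all assumptions made for the example.

```python
import sqlite3

def upsert(conn, table, pk, val):
    """Update the row identified by pk; insert it if the update touched no row.

    Table and column names are interpolated into the SQL text, so they must
    come from trusted code, never from user input. Values are bound parameters.
    """
    set_clause = ", ".join(f"{col} = ?" for col in val)
    where_clause = " AND ".join(f"{col} = ?" for col in pk)
    cur = conn.execute(
        f"UPDATE {table} SET {set_clause} WHERE {where_clause}",
        [*val.values(), *pk.values()],
    )
    if cur.rowcount == 0:  # nothing matched the key, so insert instead
        columns = [*pk, *val]
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
            [*pk.values(), *val.values()],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Name TEXT, ItemID INTEGER)")
upsert(conn, "MyTable", {"ID": 27}, {"Name": "John Brown", "ItemID": 102})  # inserts
upsert(conn, "MyTable", {"ID": 27}, {"Name": "John Brown", "ItemID": 103})  # updates
result = conn.execute("SELECT ID, Name, ItemID FROM MyTable").fetchall()
print(result)
```

Keeping the key columns in their own mapping is what makes the WHERE clause trivial to build. Note that update-then-insert is not atomic, so concurrent writers would need a transaction or a native upsert statement.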
Well, how about no statements? You might want to look into an ORM to handle this for you...
Some databases have proprietary extensions that do exactly this.
I agree that the syntax of INSERT and UPDATE could be more consistent, but this is just a fact of life now -- it ain't gonna change now. For many scenarios, the best option is your "whole new way": use an object-relational mapping library (or even a weak-tea layer like .NET DataSets) to abstract away the differences, and stop worrying about the low-level SQL syntax. Not a viable option for every application, of course, but it would allow you to just construct or update an object, call a Save method and have the library figure out the SQL syntax for you.
If you think about it, INSERT and UPDATE are exactly the same thing. They map field names to values, except the UPDATE has a filter.
By creating an associative array, where the key is the field name and the value is the value you want to assign to the field, you have your mapping. You just need to convert it to the proper string format depending on INSERT or UPDATE.
You just need to create a function that will handle the conversion based on the parameters given.
SQL Server 2008:
MERGE dbo.MyTable AS T
USING
(SELECT
@mykey AS MyKey,
@myval AS MyVal
) AS S
ON (T.MyKey = S.MyKey)
WHEN MATCHED THEN
UPDATE SET
T.MyVal = S.MyVal
WHEN NOT MATCHED THEN
INSERT (MyKey, MyVal)
VALUES (S.MyKey, S.MyVal);
MySQL:
INSERT INTO MyTable (MyKey, MyVal)
VALUES ({$myKey}, {$myVal})
ON DUPLICATE KEY UPDATE MyVal = {$myVal}
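For comparison, SQLite has its own single-statement upsert, INSERT ... ON CONFLICT ... DO UPDATE (available from SQLite 3.24). The sketch below runs it through Python's built-in sqlite3 module; the table mirrors the MyKey/MyVal names used above, and the sample values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (MyKey INTEGER PRIMARY KEY, MyVal TEXT)")

# "excluded" refers to the row that the INSERT tried to add.
UPSERT_SQL = """
    INSERT INTO MyTable (MyKey, MyVal) VALUES (?, ?)
    ON CONFLICT (MyKey) DO UPDATE SET MyVal = excluded.MyVal
"""

conn.execute(UPSERT_SQL, (1, "first"))   # no row with key 1 yet: inserts
conn.execute(UPSERT_SQL, (1, "second"))  # key 1 exists: updates MyVal
conn.execute(UPSERT_SQL, (2, "other"))   # new key: inserts

upserted = conn.execute("SELECT MyKey, MyVal FROM MyTable ORDER BY MyKey").fetchall()
print(upserted)
```

As with MERGE and ON DUPLICATE KEY UPDATE, the INSERT part and the UPDATE part are still spelled differently, but the values are supplied only once.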

Copy Query Result to another mysql table

I am trying to import a large CSV file into a MySQL database. I have loaded the entire file into one flat table. I can select the data that needs to go into separate tables using SELECT statements; my question is how do I copy the results of those SELECT queries to different tables? I would prefer to do it completely in SQL and not have to worry about using a scripting language.
INSERT
INTO new_table_1
SELECT *
FROM existing_table
WHERE condition_for_table_1;
INSERT
INTO new_table_2
SELECT *
FROM existing_table
WHERE condition_for_table_2;
INSERT INTO anothertable (list, of , column, names, to, give, values, for)
SELECT list, of, column, names, of, compatible, column, types
FROM bigimportedtable
WHERE possibly you want a predicate or maybe not;
The answer from Quassnoi was the one I was looking for. Please note that if new_table_1 doesn't exist yet, the "INSERT INTO ... SELECT" statement has to be replaced with a "CREATE TABLE ... SELECT" statement.
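Both variants can be tried quickly with Python's built-in sqlite3 module. The sketch below is SQLite rather than MySQL, and the columns and sample rows are invented, but INSERT ... SELECT and CREATE TABLE ... AS SELECT behave the same way for this purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flat import table with made-up rows.
    CREATE TABLE existing_table (id INTEGER, kind TEXT, payload TEXT);
    INSERT INTO existing_table VALUES (1, 'a', 'x'), (2, 'b', 'y'), (3, 'a', 'z');

    -- Target already exists: copy the matching rows into it.
    CREATE TABLE new_table_1 (id INTEGER, kind TEXT, payload TEXT);
    INSERT INTO new_table_1 SELECT * FROM existing_table WHERE kind = 'a';

    -- Target does not exist yet: create it from the query result.
    CREATE TABLE new_table_2 AS SELECT * FROM existing_table WHERE kind = 'b';
""")

table_1 = conn.execute("SELECT id, payload FROM new_table_1 ORDER BY id").fetchall()
table_2 = conn.execute("SELECT id, payload FROM new_table_2 ORDER BY id").fetchall()
print(table_1, table_2)
```

A table created from a SELECT generally does not inherit indexes or key constraints from the source, so those may need to be added afterwards.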