JDBC getGeneratedKeys() - sql

I obtained the following code example from http://docs.oracle.com/javase/1.4.2/docs/guide/jdbc/getstart/preparedstatement.html
I have three questions:
What does 'keyColumn' refer to, considering that there are only three columns: LAST, FIRST, HOME?
Why is a loop used in iterating the generated keys? Are multiple rows returned for that one insert statement?
Which databases support multiple generated keys per table?
String sql = "INSERT INTO AUTHORS (LAST, FIRST, HOME) VALUES " +
        "(?, ?, ?, keyColumn)";
PreparedStatement addAuthor = con.prepareStatement(sql,
        Statement.RETURN_GENERATED_KEYS);
addAuthor.setString(1, "Wordsworth");
addAuthor.setString(2, "William");
addAuthor.setString(3, "England");
int rows = addAuthor.executeUpdate();
ResultSet rs = stmt.getGeneratedKeys();
if (rs.next()) {
    ResultSetMetaData rsmd = rs.getMetaData();
    int colCount = rsmd.getColumnCount();
    do {
        for (int i = 1; i <= colCount; i++) {
            String key = rs.getString(i);
            System.out.println("key " + i + "is " + key);
        }
    }
    while (rs.next();)
}
else {
    System.out.println("There are no generated keys.");
}

Question 1. I think that keyColumn in the query on your link is simply an error in the example. The second and third example in that paragraph also contain serious syntax errors. I wouldn't dwell on it. This documentation has been removed entirely from more recent Java versions.
Question 2. Think about statements like INSERT INTO ... SELECT ... FROM ... or INSERT INTO ... VALUES (...), (...), which can insert multiple rows. Some databases also support returning values from other DML (e.g. UPDATE or DELETE), which can likewise affect multiple rows. That is why you need to consider a loop. In this specific example it is not necessary, as you can be sure only one row will be inserted.
Question 3. This question is a bit more complex:
Some databases (or drivers) can't easily decide which column is the actual generated one, for example because the database doesn't have IDENTITY columns but uses triggers to generate keys. Identifying generated keys would then involve parsing all triggers on a table to check whether any of them assigns a generated value to a (primary key or other) column, which is not easily done and would be error-prone. And sometimes there are multiple generated columns (i.e. computed fields etc.). You as the developer should know which fields you can or want to get back.
As it is hard (or inefficient) to decide which fields to return, some drivers (by default) return all columns of the inserted (or deleted/updated) row. For example, the PostgreSQL and Firebird drivers do that. On the other hand, some drivers just return the last generated key even if the table does not contain an identity column (I believe MySQL does, though I'm not 100% sure). And I seem to remember that the Oracle driver simply returns the ROWID, leaving it up to the user to retrieve the actual values from the database using that ROWID.
If you specifically know which columns you want returned, you can specify that yourself using the alternative methods that accept an array of column ordinal indices or column names. Although again not all drivers support that.
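To make the last two points concrete, here is a minimal sketch of the column-name variant together with a loop that reads every returned row and every returned column. The class and method names are made up for illustration, and the "ID" column is an assumption (the quoted sample never says what the key column of AUTHORS is called). Whether the column-name form is honoured at all depends on the driver, as noted above.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GeneratedKeysSketch {

    // Reads every row and every column the driver returned as generated keys.
    // Each row becomes a map from column label to value (read as a String).
    static List<Map<String, String>> readKeys(ResultSet rs) throws SQLException {
        List<Map<String, String>> rows = new ArrayList<>();
        ResultSetMetaData meta = rs.getMetaData();
        int colCount = meta.getColumnCount();
        while (rs.next()) {
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 1; i <= colCount; i++) {
                row.put(meta.getColumnLabel(i), rs.getString(i));
            }
            rows.add(row);
        }
        return rows;
    }

    // Inserts one author and asks the driver for the "ID" column only.
    // "ID" is a hypothetical column name; adjust it to the real key column.
    static List<Map<String, String>> insertAuthor(Connection con, String last,
            String first, String home) throws SQLException {
        String sql = "INSERT INTO AUTHORS (LAST, FIRST, HOME) VALUES (?, ?, ?)";
        try (PreparedStatement ps = con.prepareStatement(sql, new String[] { "ID" })) {
            ps.setString(1, last);
            ps.setString(2, first);
            ps.setString(3, home);
            ps.executeUpdate();
            try (ResultSet rs = ps.getGeneratedKeys()) {
                return readKeys(rs);
            }
        }
    }
}
```

For a single-row insert the returned list should hold one row. Reading it through the metadata keeps the code working with drivers that return more columns than the one requested.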

Related

SQLite changes() counts non-changing UPDATEs

I have a question regarding SQLite's changes() function, which, according to the documentation, "returns the number of database rows that were changed or inserted or deleted by the most recently completed INSERT, DELETE, or UPDATE statement" (also see the documentation of the underlying C/C++ function).
I was hoping to use this function to check whether the execution of an UPDATE statement pertaining to a single row has really caused that row to be changed or not.
By changed I do not just mean that the row matched the statement's WHERE clause. No, instead what I mean is that, for the row in question, the value of at least 1 column is actually different after the execution compared to before. If you ask me this is the only proper definition of a change in this context.
So I was hoping to detect such changes by checking whether changes() returns 1 (row changed) or 0 (row unchanged) when called right after the execution of the UPDATE statement.
But much to my despair this does not seem to work as expected.
Allow me to illustrate:
CREATE TABLE People (Id INTEGER PRIMARY KEY AUTOINCREMENT, Name TEXT NOT NULL);
INSERT INTO People (Name) VALUES ("Astrid");
SELECT changes();
Here changes() returns 1, as expected because we just INSERTed 1 row.
UPDATE People SET Name = "Emma" WHERE Id = 1;
SELECT changes();
Here changes() returns 1, as expected because 1 row was UPDATEd (i.e. actually changed: the Name of the Person with Id = 1 was "Astrid" but is now "Emma").
UPDATE People SET Name = "John" WHERE Id = 200;
SELECT changes();
Here changes() returns 0, as expected because there is no row with Id = 200.
So far so good. But now have a look at the following UPDATE statement, which does indeed match an existing row, but does not actually change it at all (Name remains set to "Emma")...
UPDATE People SET Name = "Emma" WHERE Id = 1;
SELECT changes();
Here changes() returns 1, while I was of course hoping for 0 :-(.
Perhaps this would have made sense if the function was called something like matched_rows() or affected_rows(). But for a function called changes(), and documented as it is, this behaviour strikes me as illogical, or confusing at best.
So anyway, can somebody explain why this happens, or, even better, suggest an alternative strategy to achieve my goal in a reliable (and efficient) way?
All I can think of is to actually do something like SELECT * FROM People WHERE Id = x, compare all returned column values with the values I'm about to set in the UPDATE statement and thereby decide whether I need to execute the UPDATE at all. But that can't be very efficient, right?
Of course in this toy example it might not matter much, but in my actual application I'm dealing with tables with many more columns, some of which are (potentially big) BLOBs.
The database does not compare old and new values; any UPDATEd row always counts as "changed" even if the values happen to be the same.
The documentation says that
the UPDATE affects … those rows for which the result of evaluating the WHERE clause expression as a boolean expression is true.
If you want to check the old value, you have to do it explicitly:
UPDATE People SET Name = 'Emma' WHERE Id = 1 AND Name IS NOT 'Emma';
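For tables with many columns, the same guard can be generated rather than written by hand. The following is only a sketch under assumptions: the class and method names are invented, the table and column names must come from trusted code (they are concatenated into the SQL text), and the values are bound as parameters. A row is touched only if at least one listed column would receive a different value; otherwise the statement matches no row and changes() reports 0.

```java
import java.util.List;
import java.util.StringJoiner;

final class GuardedUpdateSketch {

    // Builds an UPDATE for SQLite that only touches the row when at least one
    // listed column would actually receive a different value.
    // Bind order: every column value, then the key, then every column value again.
    static String build(String table, String keyColumn, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("at least one column is required");
        }
        StringJoiner assignments = new StringJoiner(", ");
        StringJoiner differences = new StringJoiner(" OR ");
        for (String column : columns) {
            assignments.add(column + " = ?");
            // IS NOT is SQLite's NULL-safe "differs from" comparison.
            differences.add(column + " IS NOT ?");
        }
        return "UPDATE " + table + " SET " + assignments
                + " WHERE " + keyColumn + " = ? AND (" + differences + ")";
    }

    public static void main(String[] args) {
        // Prints: UPDATE People SET Name = ? WHERE Id = ? AND (Name IS NOT ?)
        System.out.println(build("People", "Id", List.of("Name")));
    }
}
```

The comparison still has to read the old values inside the database, so large BLOB columns are not free to compare, but no row data travels to the application first.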

"Kludge" of a result set in JDBC [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
rs.last() gives Invalid operation for forward only resultset : last
So I'm trying to understand the result set cursor and I'm having an issue with where the cursor is apparently.
I have a very small application that assigns a new integer id automatically, which will be the last entry into the database, therefore the highest integer. I'm trying to get to the last entry like this (rs is result set) so I can use its value:
rs.last();
and then assigning the_new_id to rs.getInt(1)...
However, I get the "Invalid operation for forward only resultset : last" sql exception.
Right now I have a big "kludge" to make this work:
while (rs.next())
    the_new_id = rs.getInt(1);
and then I just assign the new id that way. :-\
How can I implement this same behavior more elegantly using last?
Any help is appreciated.
By default, result sets are forward-only, meaning that the only thing you can do to change the position of the cursor is next() (which is all you need if you order by ID in descending order).
The javadoc explains it:
A default ResultSet object is not updatable and has a cursor that
moves forward only. Thus, you can iterate through it only once and
only from the first row to the last row. It is possible to produce
ResultSet objects that are scrollable and/or updatable. The following
code fragment, in which con is a valid Connection object, illustrates
how to make a result set that is scrollable and insensitive to updates
by others, and that is updatable. See ResultSet fields for other
options.
Statement stmt = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
ResultSet.CONCUR_UPDATABLE);
ResultSet rs = stmt.executeQuery("SELECT a, b FROM TABLE2");
// rs will be scrollable, will not show changes made by others,
// and will be updatable
Apparently you are trying to retrieve the ID that was generated by a previous INSERT statement. You should not use a separate SELECT statement for that (which is not transaction safe and does impose an unnecessary load on the database).
To retrieve a generated ID, use the following JDBC calls:
String insert = "insert into some_table (... ";
PreparedStatement pstmt = con.prepareStatement(insert, new String[] {"ID"});
int rowsInserted = pstmt.executeUpdate();
ResultSet idResult = pstmt.getGeneratedKeys();
int newId = -1;
// read the key from the generated-keys result set, not from another result set
if (idResult.next()) {
    newId = idResult.getInt(1);
}
This will retrieve the value that was generated for the ID column during the INSERT. This will be faster than doing a SELECT to get the latest ID, but more importantly it is transaction safe.

Retrieve id of record just inserted into a Java DB (Derby) database

I am connecting to a Java DB database with JDBC and want to retrieve the id (which is on auto increment) of the last record inserted.
I see this is a common question, but I see solutions using for example MS SQL Server, what is the equivalent for Java DB?
No need to use a DBMS specific SQL for that.
That's what getGeneratedKeys() is for.
When preparing your statement you pass the name(s) of the auto-generated columns which you can then retrieve using getGeneratedKeys()
PreparedStatement pstmt = connection.prepareStatement(
"insert into some_table (col1, col2, ..) values (....)",
new String[] { "ID_COLUMN"} );
pstmt.executeUpdate();
ResultSet rs = pstmt.getGeneratedKeys(); // will return the ID in ID_COLUMN
Note that column names are case sensitive in this case (in Derby and many other DBMS).
new String[] { "ID_COLUMN"} is something different than new String[] { "id_column"}
Alternatively you can also use:
connection.prepareStatement("INSERT ...", PreparedStatement.RETURN_GENERATED_KEYS);
You may be able to get what you're looking for using the IDENTITY_VAL_LOCAL function. (Derby Reference)
This function is supposed to return "the most recently assigned value of an identity column for a connection, where the assignment occurred as a result of a single row INSERT statement using a VALUES clause."
It's worth noting that this function will return DECIMAL(31,0), regardless of the actual data type of the corresponding identity column.
Also, this only works for single row inserts that contain a VALUES clause.
For those who have issues getting the generated autoincrement id like I used to for Java Derby, my answer can be of help.
stmt.executeUpdate("INSERT INTO xx(Name) VALUES ('Joe')", Statement.RETURN_GENERATED_KEYS);
ResultSet rs = stmt.getGeneratedKeys();
if (rs.next()) {
    int autoKey = rs.getInt(1); // this is the auto-generated key for your use
}
Answer copied from here

Select the last row in a SQL table

Is it possible to return the last row of a table in MS SQL Server?
I am using an auto increment field for the ID and I want to get the last one just added to join it with something else. Any idea?
Here's the code:
const string QUERY = @"INSERT INTO Questions (ID, Question, Answer, CategoryID, Permission) "
    + @"VALUES (@ID, @Question, @Answer, @CategoryID, @Permission) ";
using (var cmd = new SqlCommand(QUERY, conn))
{
    cmd.Parameters.AddWithValue("@Question", question);
    cmd.Parameters.AddWithValue("@Answer", answer);
    cmd.Parameters.AddWithValue("@CategoryID", lastEdited);
    cmd.Parameters.AddWithValue("@Permission", categoryID);
    cmd.ExecuteNonQuery();
}
Not safe - could have multiple inserts going on at the same time and the last row you'd get might not be yours. You're better off using SCOPE_IDENTITY() to get the last key assigned for your transaction.
using an auto increment field ... and i want to get the last one just added to join it with something else.
The key here is "just added". If you have a bunch of different users hit the db at the same time, I don't think you want user A to retrieve the record created by user B. That means you probably want to use the scope_identity() function to get that id rather than running a query on the table again right away.
Depending on the context you might also need @@identity (would include triggers) or ident_current('questions') (limited to a specific table, but not the specific scope). But scope_identity() is almost always the right one to use.
Here's an example:
DECLARE @NewOrderID int
INSERT INTO [Orders] (CustomerID) VALUES (1234)
SELECT @NewOrderID=scope_identity()
INSERT INTO [OrderLines] (OrderID, ProductID, Quantity)
    SELECT @NewOrderID, ProductID, Quantity
    FROM [ShoppingCart]
    WHERE CustomerID=1234 AND SessionKey=4321
Based on the code you posted, you can do something like this:
// don't list the ID column: it should be an identity column that sql server will handle for you
const string QUERY = "INSERT INTO Questions (Question, Answer, CategoryID, Permission) "
    + "VALUES (@Question, @Answer, @CategoryID, @Permission);"
    + "SELECT scope_identity();";
int NewQuestionID;
using (var cmd = new SqlCommand(QUERY, conn))
{
    cmd.Parameters.AddWithValue("@Question", question);
    cmd.Parameters.AddWithValue("@Answer", answer);
    cmd.Parameters.AddWithValue("@CategoryID", lastEdited);
    cmd.Parameters.AddWithValue("@Permission", categoryID);
    // scope_identity() returns numeric(38,0), which arrives as a boxed decimal,
    // so convert it instead of unboxing directly to int
    NewQuestionID = Convert.ToInt32(cmd.ExecuteScalar());
}
See my answer to another question here:
get new SQL record ID
The problem now is that you'll likely want subsequent sql statements to be in the same transaction. You could do this with client code, but I find keeping it all on the server to be cleaner. You could do that by building a very long sql string, but I tend to prefer a stored procedure at this point.
I'm also not a fan of the .AddWithValue() method — I prefer explicitly defining the parameter types — but we can leave that for another day.
Finally, it's kind of late now, but I want to emphasize that it's really better to try to keep this all on the db. It's okay to run multiple statements in one sql command, and you want to reduce the number of round trips you need to make to the db and the amount of data you need to pass back and forth between the db and your app. It also makes it easier to get the transactions right and keep things atomic where they need to be.
Use one of these:
scope_identity() returns the last identity value generated in this session and this scope
ident_current() returns the last identity value generated for a particular table in any session and any scope
select ident_current( 'yourTableName' )
will return the last identity created for that table, even if it was created by a different session.
Most of the time you should use scope_identity() right after an insert statement like so.
--insert statement
SET @id = CAST(SCOPE_IDENTITY() AS INT)
MSDN Link - Scope_Identity()
MSDN Link - Ident_Current
select top 1 * from yourtable order by id desc
I'm not sure of your version of SQL Server, but look for the OUTPUT clause of the INSERT statement. You can capture a set of rows with this clause.
Since the questioner is using .NET, here's a modified example of how to do it. (I removed ID from the insert list since it's autoincrement--the original example would fail. I also assume ID is an SQL int, not a bigint.)
const string QUERY = @"INSERT INTO Questions (Question, Answer, CategoryID, Permission) "
    + @"VALUES (@Question, @Answer, @CategoryID, @Permission);"
    + @"SELECT @ID = SCOPE_IDENTITY();";
using (var cmd = new SqlCommand(QUERY, conn))
{
    cmd.Parameters.AddWithValue("@Question", question);
    cmd.Parameters.AddWithValue("@Answer", answer);
    cmd.Parameters.AddWithValue("@CategoryID", lastEdited);
    cmd.Parameters.AddWithValue("@Permission", categoryID);
    cmd.Parameters.Add("@ID", System.Data.SqlDbType.Int).Direction = ParameterDirection.Output;
    cmd.ExecuteNonQuery();
    int id = (int)cmd.Parameters["@ID"].Value;
}
EDITED: I also suggest considering LINQ to SQL instead of hand-coding SqlCommand objects--it's much better (faster to code, easier to use) for many common scenarios.
With a simple select you can do something like this:
SELECT *
FROM table_name
WHERE IDColumn=(SELECT max(IDColumn) FROM table_name)

SQL: Use the same string for both INSERT and UPDATE?

The INSERT syntax I've been using is this
INSERT INTO TableName VALUES (...)
The UPDATE syntax I've been using is
UPDATE TableName SET ColumnName=Value WHERE ...
So in all my code, I have to generate 2 strings, which would result in something like this
insertStr = "(27, 'John Brown', 102)";
updateStr = "ID=27, Name='John Brown', ItemID=102";
and then use them separately
"UPDATE TableName SET " + updateStr + " WHERE ID=27 " +
"IF @@ROWCOUNT=0 " +
"INSERT INTO TableName VALUES " + insertStr
It starts bothering me when I am working with tables with like 30 columns.
Can't we generate just one string to use on both INSERT and UPDATE?
eg. using insertStr above on UPDATE statement or updateStr on INSERT statement, or a whole new way?
I think you need a whole new approach. You are open to SQL Injection. Provide us with some sample code as to how you are getting your data inputs and sending the statements to the database.
As far as I'm aware, what you're describing isn't possible in ANSI SQL, or any extension of it that I know. However, I'm mostly familiar with MySQL, and it likely depends completely upon what RDBMS you're using. For example, MySQL has "INSERT ... ON DUPLICATE KEY UPDATE ... " syntax, which is similar to what you've posted there, and combines an INSERT query with an UPDATE query. The upside is that you are combining two possible operations into a single query, however, the INSERT and UPDATE portions of the query are admittedly different.
Generally, this kind of thing can be abstracted away with an ORM layer in your application. As far as raw SQL goes, I'd be interested in any syntax that worked the way you describe.
Some DBMSs have an extension to do this, but why don't you just provide a function to do it for you? We've actually done this before.
I'm not sure what language you're using, but it's probably got associative arrays where you can write something like:
pk{"ID"} = "27"
val{"Name"} = "'John Brown'"
val{"ItemID"} = "102"
upsert ("MyTable", pk, val)
and, if it doesn't have associative arrays, you can emulate them with multiple integer-based arrays of strings.
In our upsert() function, we just constructed a string (update, then insert if the update failed) and passed it to our DBMS. We kept the primary keys separate from our other fields since that made construction of the update statement a lot easier (primary key columns went in the where clause, other columns were just set).
The calls above would produce the following SQL (we had a different check for a failed update, but I've put your @@rowcount in for this example):
update MyTable set
    Name = 'John Brown',
    ItemID = 102
where ID = 27
if @@rowcount=0
    insert into MyTable (ID, Name, ItemID) values (
        27,
        'John Brown',
        102
    )
That's one solution which worked well for us. No doubt there are others.
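A rough Java sketch of such a function follows. It is not our original code: the class, method, and field names are invented, and instead of splicing values into the string it emits ? placeholders and returns the values in bind order, which sidesteps the SQL injection concern raised elsewhere in this thread. The generated text uses the T-SQL @@ROWCOUNT check from the question, so it assumes SQL Server. Table and column names are still concatenated and must come from trusted code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

final class UpsertSketch {

    // The generated SQL text plus the values to bind, in placeholder order.
    static final class Built {
        final String sql;
        final List<Object> params;

        Built(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    // pk holds the key columns, val the remaining columns; both keep insertion order.
    static Built upsert(String table, Map<String, ?> pk, Map<String, ?> val) {
        if (pk.isEmpty() || val.isEmpty()) {
            throw new IllegalArgumentException("need at least one key and one value column");
        }
        StringJoiner set = new StringJoiner(", ");
        StringJoiner where = new StringJoiner(" AND ");
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner marks = new StringJoiner(", ");
        List<Object> params = new ArrayList<>();

        // UPDATE part: non-key columns go into SET, key columns into WHERE.
        for (Map.Entry<String, ?> e : val.entrySet()) {
            set.add(e.getKey() + " = ?");
            params.add(e.getValue());
        }
        for (Map.Entry<String, ?> e : pk.entrySet()) {
            where.add(e.getKey() + " = ?");
            params.add(e.getValue());
        }
        // INSERT part: key columns first, then the rest, bound a second time.
        for (Map.Entry<String, ?> e : pk.entrySet()) {
            columns.add(e.getKey());
            marks.add("?");
            params.add(e.getValue());
        }
        for (Map.Entry<String, ?> e : val.entrySet()) {
            columns.add(e.getKey());
            marks.add("?");
            params.add(e.getValue());
        }
        String sql = "UPDATE " + table + " SET " + set + " WHERE " + where + "; "
                + "IF @@ROWCOUNT = 0 "
                + "INSERT INTO " + table + " (" + columns + ") VALUES (" + marks + ");";
        return new Built(sql, params);
    }

    public static void main(String[] args) {
        Map<String, Object> pk = new LinkedHashMap<>();
        pk.put("ID", 27);
        Map<String, Object> val = new LinkedHashMap<>();
        val.put("Name", "John Brown");
        val.put("ItemID", 102);
        Built built = upsert("MyTable", pk, val);
        System.out.println(built.sql);
        System.out.println(built.params);
    }
}
```

The caller would prepare built.sql and bind built.params in order with setObject. Keeping the key columns in their own map is what lets one call produce both the WHERE clause and the INSERT column list.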
Well, how about no statements? You might want to look into an ORM to handle this for you...
Some databases have proprietary extensions that do exactly this.
I agree that the syntax of INSERT and UPDATE could be more consistent, but this is just a fact of life now -- it ain't gonna change now. For many scenarios, the best option is your "whole new way": use an object-relational mapping library (or even a weak-tea layer like .NET DataSets) to abstract away the differences, and stop worrying about the low-level SQL syntax. Not a viable option for every application, of course, but it would allow you to just construct or update an object, call a Save method and have the library figure out the SQL syntax for you.
If you think about it, INSERT and UPDATE are exactly the same thing. They map field names to values, except the UPDATE has a filter.
By creating an associative array, where the key is the field name and the value is the value you want to assign to the field, you have your mapping. You just need to convert it to the proper string format depending on INSERT or UPDATE.
You just need to create a function that will handle the conversion based on the parameters given.
SQL Server 2008:
MERGE dbo.MyTable AS T
USING
    (SELECT
        @mykey AS MyKey,
        @myval AS MyVal
    ) AS S
ON (T.MyKey = S.MyKey)
WHEN MATCHED THEN
    UPDATE SET
        T.MyVal = S.MyVal
WHEN NOT MATCHED THEN
    INSERT (MyKey, MyVal)
    VALUES (S.MyKey, S.MyVal);
MySQL:
INSERT INTO MyTable (MyKey, MyVal)
VALUES({$myKey}, {$myVal})
ON DUPLICATE KEY UPDATE myVal = {$myVal}