SQL: Use the same string for both INSERT and UPDATE? - sql

The INSERT syntax I've been using is this
INSERT INTO TableName VALUES (...)
The UPDATE syntax I've been using is
UPDATE TableName SET ColumnName=Value WHERE ...
So in all my code, I have to generate 2 strings, which would result in something like this
insertStr = "(27, 'John Brown', 102)";
updateStr = "ID=27, Name='John Brown', ItemID=102";
and then use them separately
"UPDATE TableName SET " + updateStr + " WHERE ID=27 " +
"IF @@ROWCOUNT=0 " +
"INSERT INTO TableName VALUES " + insertStr
It starts bothering me when I am working with tables with like 30 columns.
Can't we generate just one string to use on both INSERT and UPDATE?
eg. using insertStr above on UPDATE statement or updateStr on INSERT statement, or a whole new way?

I think you need a whole new approach. You are open to SQL Injection. Provide us with some sample code as to how you are getting your data inputs and sending the statements to the database.
[Image: "Bobby Drop Tables" comic, http://goose.ycp.edu/~weddins/440/S09%20IFS440%20Bobby%20Drop%20Tables.PNG]

As far as I'm aware, what you're describing isn't possible in ANSI SQL or in any extension of it that I know of. However, I'm mostly familiar with MySQL, and it likely depends entirely on which RDBMS you're using. For example, MySQL has the "INSERT ... ON DUPLICATE KEY UPDATE ..." syntax, which is similar to what you've posted and combines an INSERT query with an UPDATE query. The upside is that you combine two possible operations into a single query; however, the INSERT and UPDATE portions of the query are still written differently.
Generally, this kind of thing can be abstracted away with an ORM layer in your application. As far as raw SQL goes, I'd be interested in any syntax that worked the way you describe.

Some DBMSs have an extension to do this, but why don't you just provide a function to do it for you? We've actually done this before.
I'm not sure what language you're using, but it's probably got associative arrays where you can write something like:
pk{"ID"} = "27"
val{"Name"} = "'John Brown'"
val{"ItemID"} = "102"
upsert ("MyTable", pk, val)
and, if it doesn't have associative arrays, you can emulate them with multiple integer-based arrays of strings.
In our upsert() function, we just constructed a string (update, then insert if the update failed) and passed it to our DBMS. We kept the primary keys separate from our other fields since that made construction of the update statement a lot easier (primary key columns went in the where clause, other columns were just set).
The calls above would produce the following SQL (we had a different check for a failed update, but I've used your @@rowcount for this example):
update MyTable set
Name = 'John Brown',
ItemID = 102
where ID = 27
if @@rowcount=0
insert into MyTable (ID, Name, ItemID) values (
27,
'John Brown',
102
)
That's one solution which worked well for us. No doubt there are others.
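For what it's worth, here is a minimal sketch of what such an upsert() helper might look like. It is not the function described above: it assumes Python with the standard sqlite3 module as a stand-in DBMS, it checks the cursor's row count in place of the T-SQL row-count test, and it binds values as parameters instead of quoting them into the string. The helper names (upsert, _check) are made up for illustration.

```python
import re
import sqlite3

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _check(name):
    # Identifiers cannot be bound as parameters, so validate them strictly.
    if not _IDENT.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def upsert(conn, table, pk, val):
    """Update the row matching pk; insert it if no row was updated."""
    _check(table)
    set_clause = ", ".join(f"{_check(c)} = ?" for c in val)
    where_clause = " AND ".join(f"{_check(c)} = ?" for c in pk)
    cur = conn.execute(
        f"UPDATE {table} SET {set_clause} WHERE {where_clause}",
        [*val.values(), *pk.values()],
    )
    if cur.rowcount == 0:  # nothing updated, so the row does not exist yet
        row = {**pk, **val}
        cols = ", ".join(row)
        marks = ", ".join("?" for _ in row)
        conn.execute(
            f"INSERT INTO {table} ({cols}) VALUES ({marks})",
            list(row.values()),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER PRIMARY KEY, Name TEXT, ItemID INTEGER)")
upsert(conn, "MyTable", {"ID": 27}, {"Name": "John Brown", "ItemID": 102})  # inserts
upsert(conn, "MyTable", {"ID": 27}, {"Name": "John Brown", "ItemID": 103})  # updates
rows = conn.execute("SELECT ID, Name, ItemID FROM MyTable").fetchall()
```

Keeping the primary key in its own mapping is what lets the same two dictionaries feed both the WHERE clause of the UPDATE and the column list of the INSERT. The sketch assumes both mappings are non-empty.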

Well, how about no statements? You might want to look into an ORM to handle this for you...

Some databases have proprietary extensions that do exactly this.
I agree that the syntax of INSERT and UPDATE could be more consistent, but this is just a fact of life now -- it ain't gonna change now. For many scenarios, the best option is your "whole new way": use an object-relational mapping library (or even a weak-tea layer like .NET DataSets) to abstract away the differences, and stop worrying about the low-level SQL syntax. Not a viable option for every application, of course, but it would allow you to just construct or update an object, call a Save method and have the library figure out the SQL syntax for you.

If you think about it, INSERT and UPDATE are exactly the same thing. They map field names to values, except the UPDATE has a filter.
By creating an associative array, where the key is the field name and the value is the value you want to assign to the field, you have your mapping. You just need to convert it to the proper string format depending on whether you are doing an INSERT or an UPDATE.
You just need to create a function that will handle the conversion based on the parameters given.
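As a rough illustration of that conversion (a sketch only, assuming Python and "?" placeholders; the function names to_insert and to_update are invented here), one mapping can be rendered into either statement form:

```python
def to_insert(table, values):
    """Render a column->value mapping as a parameterized INSERT."""
    # Table and column names are assumed to come from trusted code, not users.
    cols = ", ".join(values)
    marks = ", ".join("?" for _ in values)
    return f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(values.values())


def to_update(table, values, where):
    """Render the same kind of mapping as a parameterized UPDATE with a filter."""
    sets = ", ".join(f"{col} = ?" for col in values)
    conds = " AND ".join(f"{col} = ?" for col in where)
    sql = f"UPDATE {table} SET {sets} WHERE {conds}"
    return sql, [*values.values(), *where.values()]


row = {"Name": "John Brown", "ItemID": 102}
insert_sql, insert_params = to_insert("TableName", {"ID": 27, **row})
update_sql, update_params = to_update("TableName", row, {"ID": 27})
```

The only structural difference between the two renderings is the filter, which is why it is passed as a separate mapping.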

SQL Server 2008:
MERGE dbo.MyTable AS T
USING
(SELECT
@mykey AS MyKey,
@myval AS MyVal
) AS S
ON (T.MyKey = S.MyKey)
WHEN MATCHED THEN
UPDATE SET
T.MyVal = S.MyVal
WHEN NOT MATCHED THEN
INSERT (MyKey, MyVal)
VALUES (S.MyKey, S.MyVal);
MySQL:
INSERT INTO MyTable (MyKey, MyVal)
VALUES({$myKey}, {$myVal})
ON DUPLICATE KEY UPDATE myVal = {$myVal}

Related

Add column with substring of other column in SQL (Snowflake)

I feel like this should be simple but I'm relatively unskilled in SQL and I can't seem to figure it out. I'm used to wrangling data in python (pandas) or Spark (usually pyspark) and this would be a one-liner in either of those. Specifically, I'm using Snowflake SQL, but I think this is probably relevant to a lot of flavors of SQL.
Essentially I just want to trim the first character off of a specific column. More generally, what I'm trying to do is replace a column with a substring of the same column. I would even settle for creating a new column that's a substring of an existing column. I can't figure out how to do any of these things.
One obvious solution would be to create a temporary table with something like
CREATE TEMPORARY TABLE tmp_sub AS
SELECT id_col, substr(id_col, 2, 10) AS id_col_sub FROM table1
and then join it back and write a new table
CREATE TABLE table2 AS
SELECT
b.id_col_sub as id_col,
a.some_col1, a.some_col2, ...
FROM table1 a
JOIN tmp_sub b
ON a.id_col = b.id_col
My tables have roughly a billion rows, though, and this feels extremely inefficient. Maybe I'm wrong? Maybe this is just the right way to do it? I guess I could replace the CREATE TABLE table2 AS ... with INSERT OVERWRITE INTO table1 ... and at least that wouldn't store an extra copy of the whole thing.
Any thoughts and ideas are most welcome. I come at this humbly from the perspective of someone who is baffled by a language that so many people seem to have mastery over.
I'm not sure the exact syntax/functions in Snowflake but generally speaking there's a few different ways of achieving this.
I guess the general approach that would work universally is using the SUBSTRING function that's available in any database.
Assuming you have a table called Table1 with the following data:
+-------+-----------------------------------------+
| Code  | Desc                                    |
+-------+-----------------------------------------+
| 0001  | 1First Character Will be Removed        |
| 0002  | xCharacter to be Removed                |
+-------+-----------------------------------------+
The SQL code to remove the first character would be:
select SUBSTRING(Desc,2,len(desc)) from Table1
Please note that the name of the "SUBSTRING" function may vary between databases. In Oracle, for example, the function is "SUBSTR". You just need to find the Snowflake equivalent.
Another approach that would work at least in SQLServer and MySQL would be using the "RIGHT" function
select RIGHT(Desc,len(Desc) - 1) from Table1
Based on your question I assume you actually want to update the actual data within the table. In that case you can use the same function above in an update statement.
update Table1 set Desc = SUBSTRING(Desc,2,len(desc))
You didn't try this?
UPDATE tableX
SET columnY = substr(columnY, 2, 10 ) ;
-Paul-
There is no need to specify the length, as is evidenced from the following simple test harness:
SELECT $1
,SUBSTR($1, 2)
,RIGHT($1, -2)
FROM VALUES
('abcde')
,('bcd')
,('cdef')
,('defghi')
,('e')
,('fg')
,('')
;
Both expressions here - SUBSTR(<col>, 2) and RIGHT(<col>, -2) - effectively remove the first character of the <col> column value.
As for the strategy of using UPDATE versus INSERT OVERWRITE, I do not believe that there will be any difference in performance or outcome, so I might opt for the UPDATE since it is simpler. So, in conclusion, I would use:
UPDATE tableX
SET columnY = SUBSTR(columnY, 2)
;
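If you want to convince yourself of the in-place UPDATE before running it on a billion rows, a small local experiment can help. The sketch below assumes Python's built-in sqlite3 module purely as a stand-in (SQLite also accepts a two-argument SUBSTR); it is not Snowflake, so treat it as a check of the idea and not of Snowflake's behavior or performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (columnY TEXT)")
conn.executemany(
    "INSERT INTO tableX VALUES (?)",
    [("abcde",), ("bcd",), ("e",), ("",)],
)

# Two-argument SUBSTR: start at position 2 and run to the end of the string.
conn.execute("UPDATE tableX SET columnY = SUBSTR(columnY, 2)")

result = [r[0] for r in conn.execute("SELECT columnY FROM tableX ORDER BY rowid")]
```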

Is there a syntax error with this SQL server query? Can I not use "target.@1"?

Hard coded, this works:
var insertCommand1 = ("MERGE INTO Leaderboard WITH (HOLDLOCK) AS target USING (SELECT * FROM Scores WHERE WeekNumber = 7) AS Source ON (target.id = source.id) WHEN MATCHED THEN UPDATE SET target.Id = source.Id, target.Week7 = source.weeklyScore WHEN NOT MATCHED THEN INSERT (Id, Week7) VALUES (source.Id, source.weeklyScore);");
db.Execute(insertCommand1);
This does not work:
var insertCommand1 = ("MERGE INTO Leaderboard WITH (HOLDLOCK) AS target USING (SELECT * FROM Scores WHERE WeekNumber = @0) AS Source ON (target.id = source.id) WHEN MATCHED THEN UPDATE SET target.Id = source.Id, target.@1 = source.weeklyScore WHEN NOT MATCHED THEN INSERT (Id, @2) VALUES (source.Id, source.weeklyScore);");
db.Execute(insertCommand1, weeknum, weekstring, weekstring);
The error says there's a syntax error near @1. What could this be?
I've already debugged to make sure the value to weeknum and weekstring were correct.
Working in SQL server on VS 2015.
Schema for the 2 tables-
Leaderboard(Id, Week1, Week2, Week3, Week4, Week5,
Week6, Week7, Week8, Week9, Week10)
with Id as the primary key
Scores(Id, WeekNumber, weeklyScore)
with Id and WeekNumber as the primary key
You are trying to set the field name using a parameter, but @parameters are for values.
, target.@1 = source.weeklyScore
Should be
, target.something = @1
It looks like you're trying to use a parameter as a schema object name instead of a value. This doesn't work, as you've discovered. Parameters are just for values.
If you need a dynamic schema object name, be aware of two things:
It could impact performance, though probably not by much.
SQL injection is a significant concern.
The first one you can measure if it becomes a problem, but I doubt it will. The second one can be handled just by being careful. The simple rule with SQL injection is not to always use parameters for everything, but to never execute user-modified values as code.
For schema objects, you already have a finite set of possible values. So you can build a list of known values in your code. This isn't user-modified, so it's safe. (Maybe it's hard-coded, maybe you auto-generate it from the DB schema, that's up to you.)
With the variable, check if it matches a value in the list. If it doesn't, that's an error and the code should simply raise the appropriate exception or in some other way handle that error case. If it does match an element from the known finite list of safe non-user-modifiable values, use that matched value from the list in your query:
var query = string.Format("SELECT SomeTable.{0} FROM SomeTable ...", knownList[x]);
(Or however you want to structure it, hopefully you get the idea.)
Then with that dynamically generated query, you can add your parameter values and you're all set.
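To make the allowlist idea concrete, here is a small sketch. It assumes Python with sqlite3 as a stand-in for SQL Server, uses a plain UPDATE in place of the MERGE from the question, and the function name set_week_score is hypothetical. The column name is only ever taken from a fixed set defined in code, while the score and Id stay ordinary parameters.

```python
import sqlite3

# Known-safe column names, fixed in code (never taken from user input).
WEEK_COLUMNS = {f"Week{n}" for n in range(1, 11)}


def set_week_score(conn, week_column, player_id, score):
    if week_column not in WEEK_COLUMNS:
        raise ValueError(f"unknown column: {week_column!r}")
    # The column name passed the allowlist check; the values stay parameters.
    sql = f"UPDATE Leaderboard SET {week_column} = ? WHERE Id = ?"
    return conn.execute(sql, (score, player_id)).rowcount


conn = sqlite3.connect(":memory:")
cols = ", ".join(f"Week{n} INTEGER" for n in range(1, 11))
conn.execute(f"CREATE TABLE Leaderboard (Id INTEGER PRIMARY KEY, {cols})")
conn.execute("INSERT INTO Leaderboard (Id) VALUES (1)")

updated = set_week_score(conn, "Week7", 1, 42)
week7 = conn.execute("SELECT Week7 FROM Leaderboard WHERE Id = 1").fetchone()[0]
```

Anything that is not an exact member of the set is rejected before any SQL text is built, which is the whole point of the check.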

SQL fixed-length insert statement formatter

Let's assume I have this long insert statement:
insert into table1 (id, name, phone, very_long_col_name, ...)
values (1, 'very long name indeed...', '555-555-5555', 1, ...)
As you can see above, it gets hard to match values to their columns, since their lengths are uneven.
I'm looking for something (e.g. a command-line utility) that formats the above (not just as generic SQL formatting) like this:
insert into table1 (id, name                      , phone         , very_long_col_name, ...)
values             (1 , 'very long name indeed...', '555-555-5555', 1                 , ...)
This way I can easily see which value goes with which column.
It can be a plugin for Notepad++, a Java utility, a plugin for an SQL IDE, whatever does the trick...
Prepared statements, T-SQL parameters, Hibernate, JPA, etc. are not an option right now.
Not suggesting a plugin, but I mostly see this kind of thing formatted this way:
insert into table1
(
id,
name,
phone,
very_long_col_name,
...
)
values
(
1,
'very long name indeed...',
'555-555-5555',
1,
...
)
I find this more readable than scrolling through a very long line.
Have you considered the following alternative syntax?
INSERT INTO `table` SET
`id` = 1,
`name` = 'very long name indeed...',
`phone` = '555-555-5555',
`very_long_col_name` = 1,
`...` = '...'
;
Your better bet is to use SQL prepared statements. This lets you separate the SQL query syntax from your data, so you'd first prepare the statement:
// $connection is assumed to be an already-open mysqli connection
$statement = mysqli_prepare($connection, "INSERT INTO `blah` (`id`,`phone`,`name`) VALUES(?,?,?)");
Then you bind the data to the statement:
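// bind_param takes its arguments by reference, so the values must be in variables
$id = 1234; $phone = "(555) 123-4567"; $name = "Kris";
$statement->bind_param('iss', $id, $phone, $name);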
I used PHP as an example, and the 'iss' in the above code says it's binding an Int and 2 strings in that order.
If the place that contains your SQL statement contains the DATA that you want to insert, then you are most probably doing something very, very, very wrong.
What do you want to achieve? Do you want to format the query, so that you can dump it in a pretty style for debugging purposes? Well, this is easy, just add strlen(some_string)-some_fixed_number number of whitespace at the appropriate places in your code. I can not suggest actual code here, because I do not know what language you use or what coding styles you prefer and so on...
But even if I wanted to, I do not see any value in this. You should separate SQL queries and the data that you use in your SQL queries (e.g. for inserting).
Building SQL query strings dynamically is out of fashion for some very good reasons (quoting, sql injection and so on...).
EDIT: If you want to format an SQL dump or some INSERT statements that prepare a database, then you can just use CSV formatted data. It is easier to read than SQL statements.
Variables?
insert into
table1 ( id, name, phone, very_long_col_name, ...)
values (@id, @name, @phone, @long_val, ...)
(obviously you need to declare and set / select these too)
In Oracle (since we can select from dual) I like to make these into insert into select from so I can alias the columns and make it easier to read:
insert into table1
(
id,
name,
phone,
very_long_col_name,
...
)
select 1 id,
'very long name indeed...' name,
'555-555-5555' phone,
1 very_long_col_name,
...
from dual;
I wanted the same thing, so I built this JavaScript tool:
SQL Insert Formatter
It does precisely what you ask and handles multiple statements. It is client-side only, so no need to worry about your data being uploaded.
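For anyone who would rather script it, the padding idea is simple enough to sketch. The following is not the tool linked above; it is a minimal Python illustration with a made-up function name (align_insert), and it assumes the column names and value literals are already available as lists instead of being parsed out of an SQL string.

```python
def align_insert(table, columns, values):
    """Pad each column name and value literal to a shared width."""
    widths = [max(len(c), len(v)) for c, v in zip(columns, values)]
    cols = ", ".join(c.ljust(w) for c, w in zip(columns, widths))
    vals = ", ".join(v.ljust(w) for v, w in zip(values, widths))
    head = f"insert into {table} ("
    # Indent the values line so its parenthesis sits under the column list's.
    return f"{head}{cols})\n{'values'.ljust(len(head) - 1)}({vals})"


formatted = align_insert(
    "table1",
    ["id", "name", "phone", "very_long_col_name"],
    ["1", "'very long name indeed...'", "'555-555-5555'", "1"],
)
print(formatted)
```

Parsing arbitrary INSERT statements (quoted commas, nested function calls) is the hard part and is deliberately left out here.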

Why do SQL INSERT and UPDATE Statements have Different Syntaxes?

While contemplating this question about a SQL INSERT statement, it occurred to me that the distinction in syntax between the two statements is largely artificial. That is, why can't we do:
INSERT INTO MyTable SET Field1=Value1, Field2=Value2, ...
or
UPDATE MyTable ( Field1, Field2 ...) VALUES ( Value1, Value2, ... )
WHERE some-key = some-value
Perhaps I'm missing something critical. But for those of us who have had to concatenate our SQL statements in the past, having comparable syntax for an INSERT and an UPDATE statement would have saved a significant amount of coding.
They're serving different grammatical functions. In an update you are specifying a filter that chooses a set of rows to which you will apply an update. And of course that syntax is shared with a SELECT query for the same purpose.
In an INSERT you are not choosing any rows, you are generating a new row which requires specifying a set of values.
In an UPDATE, the LHS=RHS in the filter is an expression which yields true or false (or maybe null :), whereas in an INSERT, the VALUES clause is about assigning values. So while they are superficially similar, they are semantically quite different, imho. Although I have written a SQL parser, so that may influence my views. :)
SQL Server 2008 has introduced UPSERT functionality via the MERGE command. This is the logical equivalent of
IF FOUND THEN
UPDATE
ELSE
INSERT
I believe this is so that you may make an insert statement without being explicit about the values. If you are putting a value in every single column in the table you can write:
insert into my_table values ('value1', 2);
instead of:
insert into my_table (column1, column2) values ('value1', 2);
When importing and exporting entire (large) databases, this is invaluable for cutting down file size and processing time. Nowadays, with binary snapshots and the like, it may be "less invaluable" :-)

Dynamic Query in SQL Server

I have a table with 10 columns named col_1, col_2, ..., col_10. I want to write a select statement that selects a value from one row and from one of these 10 columns. I have a variable that decides which column to select from. Can such a query be written, where the column name is decided dynamically from a variable?
Yes, using a CASE statement:
SELECT CASE @MyVariable
WHEN 1 THEN [Col_1]
WHEN 2 THEN [Col_2]
...
WHEN 10 THEN [Col_10]
END
Whether this is a good idea is another question entirely. You should use better names than Col_1, Col_2, etc.
You could also use a string substitution method, as suggested by others. However, that is an option of last resort because it can open up your code to sql injection attacks.
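Here is a small runnable sketch of the CASE approach. It assumes Python with sqlite3 as a stand-in for SQL Server, a three-column table to keep it short, and a hypothetical helper named pick_column. The selector is passed as an ordinary bound parameter, so no SQL text is ever built from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (Id INTEGER PRIMARY KEY, Col_1 TEXT, Col_2 TEXT, Col_3 TEXT)"
)
conn.execute("INSERT INTO MyTable VALUES (1, 'a', 'b', 'c')")

# The selector is an ordinary bound value; the query text itself never changes.
QUERY = """
    SELECT CASE ?
        WHEN 1 THEN Col_1
        WHEN 2 THEN Col_2
        WHEN 3 THEN Col_3
    END
    FROM MyTable
    WHERE Id = ?
"""


def pick_column(conn, which, row_id):
    return conn.execute(QUERY, (which, row_id)).fetchone()[0]


second = pick_column(conn, 2, 1)
missing = pick_column(conn, 9, 1)  # no WHEN branch matches, so CASE yields NULL
```

An out-of-range selector simply produces NULL here, so the caller may want to treat that as an error.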
Sounds like a bad, denormalized design to me.
I think a better one would have the table as parent, with rows that contain a foreign key to a separate child table that contains ten rows, one for each of those columns you have now. Let the parent table set the foreign key according to that magic value when the row is inserted or updated in the parent table.
If the child table is fairly static, this will work.
Since I don't have enough details, I can't give code. Instead, I'll explain.
Declare a string variable, something like:
declare @sql varchar(5000)
Set that variable to the completed SQL string you want (as a string, not actually querying yet... so you embed the column name you want using string concatenation).
Then call: exec(@sql)
All set.
I assume you are running purely within Transact-SQL. What you'll need to do is dynamically create the SQL statement with your variable as the column name and use the EXECUTE command to run it. For example:
EXECUTE('select ' + @myColumn + ' from MyTable')
You can do it with a T-SQL CASE statement:
SELECT 'The result' =
CASE
WHEN choice = 1 THEN col1
WHEN choice = 2 THEN col2
...
END
FROM sometable
IMHO, Joel Coehoorn's case statement is probably the best idea
... but if you really have to use dynamic SQL, you can do it with sp_executesql
I have no idea what platform you are using but you can use Dynamic LINQ pretty easily to do this.
var query = context.Table
.Where( t => t.Id == row_id )
.Select( "Col_" + column_id );
IEnumerator enumerator = query.GetEnumerator();
enumerator.MoveNext();
object columnValue = enumerator.Current;
Presumably, you'll know which actual type to cast this to depending on the column. The nice thing about this is you get the parameterized query for free, protecting you against SQL injection attacks.
This isn't something you should ever need to do if your database is correctly designed. I'd revisit the design of that element of the schema to remove the need to do this.