Does using NamedParameterJdbcTemplate prevent SQL injection?

I have the following method that uses a NamedParameterJdbcTemplate to execute an SQL query in Spring Boot:
import org.springframework.jdbc.core.namedparam.MapSqlParameterSource
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate
import org.springframework.stereotype.Service

@Service
class MyRepository(
    val jdbcTemplate: NamedParameterJdbcTemplate
) {
    fun loadData(myKey: List<Int>): List<MyRow> {
        // Bind the list to the :MYKEY named parameter
        val parameters = MapSqlParameterSource("MYKEY", myKey)
        return jdbcTemplate.query(
            """
            select
              io.KEY as itemKey,
              art.ARTICLE_NR as articleNumber,
              art.PRICE as price,
              concat(
                concat(
                  concat(art.BEST_B, BEST_A),
                  lpad(BEST_B, 2, '0')),
                lpad(BEST_A, 2, '0')) as group
            from
              BUY.OPTION io
              INNER JOIN BUYING.ART art ON (to_char(art.id) = io.keyb)
            where
              io.KEY IN (:MYKEY)
            """.trimIndent(),
            parameters
        ) { rs, rowNum ->
            MyRow(
                itemOption = ItemOption(rs.getString("ITEMKEY")),
                articleNumber = rs.getString("ARTICLENUMBER"),
                price = rs.getBigDecimal("PRICE"),
                group = rs.getString("GROUP")
            )
        }
    }
}
Is this method already protected against SQL injection since it's using NamedParameterJdbcTemplate? Or do I have to do some extra steps for that?

Using parameters in a NamedParameterJdbcTemplate will use a JDBC prepared statement with parameters, which will - in general - protect you against SQL injection.
I'm saying "in general" because the actual protection depends on the implementation of the specific JDBC driver used. Mainstream JDBC drivers will - bar any bugs - protect you against SQL injection, because they either keep statements and parameter values separate (the statement is prepared with parameter placeholders, and on execute only the values are sent), or otherwise properly escape values when generating the actual query (e.g. the default behaviour of the MySQL Connector/J driver). However, nothing stops someone from writing a naive JDBC driver that uses string interpolation to generate a query without actually preventing SQL injection.
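To make the contrast concrete, here is a minimal Java sketch - the users table, column names, and method names are made up for illustration - showing the unsafe string concatenation that prepared statements protect you from, next to named-parameter binding:

import org.springframework.jdbc.core.namedparam.MapSqlParameterSource;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

import java.util.List;

public class UserLookup {
    private final NamedParameterJdbcTemplate jdbcTemplate;

    public UserLookup(NamedParameterJdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // UNSAFE: the input becomes part of the SQL text itself, so a value
    // like "0 OR 1=1" changes the structure of the query.
    public List<String> findNamesUnsafe(String rawId) {
        return jdbcTemplate.getJdbcOperations()
                .queryForList("select name from users where id = " + rawId, String.class);
    }

    // SAFE: the statement is prepared with a placeholder; the value is
    // bound separately and is never interpreted as SQL by a well-behaved driver.
    public List<String> findNamesSafe(int id) {
        return jdbcTemplate.queryForList(
                "select name from users where id = :id",
                new MapSqlParameterSource("id", id),
                String.class);
    }
}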

Related

java/jdbc timeout in clojure

I am trying to add a timeout to jdbc/query and jdbc/execute!. Somewhere on the web I found that both functions take :timeout as an option. The documentation also says the options are passed to prepare-statement, which takes :timeout as an option.
My function calls look like,
(jdbc/query db-read-spec query {:timeout 2})
(jdbc/execute! db-write-spec query {:timeout 2})
Is this how it is done? If yes, how do I test this?
If there is a different way of doing this which is testable, that works too.
The :timeout option causes .setQueryTimeout to be called on the PreparedStatement used under the hood of clojure.java.jdbc. It is in seconds, not milliseconds, so {:timeout 2} means a two-second timeout; if you had passed 2000 thinking it was milliseconds, your query would have to be extremely slow for a timeout of 2,000 seconds (just over half an hour) to take effect.
JDBC supports several different timeouts across several of its classes. For example, javax.sql.DataSource supports .setLoginTimeout (also in seconds), as does java.sql.DriverManager.
There are also database-specific options you can add to the connection string (which you can add as additional key/value pairs in your "db-spec") to control lower-level timeouts. For example, MySQL supports connectTimeout and socketTimeout in the connection string -- and both of those are in milliseconds. clojure.java.jdbc allows for those to be provided in your "db-spec" hash map as :connectTimeout and :socketTimeout keys respectively.
Note that clojure.java.jdbc is considered "Stable" at this point, and all current and future development effort is focused on next.jdbc. next.jdbc makes it easier to use loginTimeout since it operates on JDBC objects directly, so the whole (Java) API is available as well. It also has built-in support for connection pooling and is, overall, simpler and faster than clojure.java.jdbc.
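For reference, the plain JDBC call that {:timeout 2} translates to looks like this; a minimal Java sketch, assuming a local Postgres database (URL and credentials are placeholders) and using pg_sleep to simulate a slow query:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class QueryTimeoutDemo {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "secret");
             PreparedStatement stmt = conn.prepareStatement("select pg_sleep(10)")) {
            stmt.setQueryTimeout(2); // seconds - the unit {:timeout 2} maps to
            try (ResultSet rs = stmt.executeQuery()) {
                rs.next();
            }
        } catch (SQLException e) {
            // Postgres reports "ERROR: canceling statement due to user request",
            // which is the message the pg_sleep test below asserts on.
            System.out.println("query cancelled: " + e.getMessage());
        }
    }
}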
You can leverage the MAX_EXECUTION_TIME query hint on MySQL SELECT queries (time in ms):
SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM t1 INNER JOIN t2 WHERE....
then you can just wrap your queries:
(defn timed-query [db query t]
  (j/query db [(str (subs query 0 6)
                    (format " /*+ MAX_EXECUTION_TIME(%s) */ " t)
                    (subs query 7))]))
and test:
(deftest test-query-timeout
  (is (thrown? Exception (timed-query db "select * from Employees where id>5" 1))))
You will need a fairly complex query for this to trigger with a 1 ms limit.
I figured out a workaround to test this. Since I use Postgres, I could leverage select pg_sleep(time-in-seconds).
And my test looks like
(is (thrown-with-msg? PSQLException #"ERROR: canceling statement due to user request"
      (fetch-or-save "select pg_sleep(3)")))

how to run multiple sql statements in airflow jinja template using jdbc hook

Trying to run Hive SQL using JdbcHook and a Jinja template through Airflow. The template works fine for a single SQL statement but throws a parsing error with multiple statements.
DAG
p1 = JdbcOperator(
    task_id=DAG_NAME + '_create',
    jdbc_conn_id='big_data_hive',
    sql='/mysql_template.sql',
    params={'env': ENVIRON},
    autocommit=True,
    dag=dag)
Template
create table {{params.env}}_fct.hive_test_templated
(cookie_id string
,sesn_id string
,load_dt string)
;
INSERT INTO {{params.env}}_fct.hive_test_templated
select * from {{params.env}}_fct.hive_test
;
Error: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 7:0 missing EOF at ';' near ')'
The templated queries work fine when I run them in Hue.
tobi is correct, the easiest way to do this is to parse your SQL into a list of statements and execute them sequentially.
The way that I do this is by using the sqlparse Python library to split the string into a list of SQL statements and then pass them down to the hook (which inherits from the dbapi hook) - the dbapi base class accepts a list of SQL statements and executes them sequentially; this could easily be implemented in the Hive hook too. In the following example, my CustomSnowflakeHook inherits from the dbapi hook, and the run method in the dbapi hook accepts a list of SQL statements:
hook = hooks.CustomSnowflakeHook(snowflake_conn_id=self.snowflake_conn_id)
sql = sqlparse.split(sqlparse.format(self.sql, strip_comments=True))
hook.run(
    sql,
    autocommit=self.autocommit,
    parameters=self.parameters)
From the dbapi hook:
def run(self, sql, autocommit=False, parameters=None):
    """
    Runs a command or a list of commands. Pass a list of sql
    statements to the sql parameter to get them to execute
    sequentially

    :param sql: the sql statement to be executed (str) or a list of
        sql statements to execute
    :type sql: str or list
    :param autocommit: What to set the connection's autocommit setting to
        before executing the query.
    :type autocommit: bool
    :param parameters: The parameters to render the SQL query with.
    :type parameters: mapping or iterable
    """
    if isinstance(sql, basestring):
        sql = [sql]

    with closing(self.get_conn()) as conn:
        if self.supports_autocommit:
            self.set_autocommit(conn, autocommit)

        with closing(conn.cursor()) as cur:
            for s in sql:
                if sys.version_info[0] < 3:
                    s = s.encode('utf-8')
                self.log.info(s)
                if parameters is not None:
                    cur.execute(s, parameters)
                else:
                    cur.execute(s)

        if not getattr(conn, 'autocommit', False):
            conn.commit()
It seems to me that Hue parses the statement differently: some clients implement statement separators, which allows multi-statement scripts to work.
Airflow does not appear to have those separators.
So the easiest way would be to separate the two statements and execute them in two separate tasks (or split them programmatically, as in the sketch below).
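For what it's worth, the same "split and execute sequentially" idea at the plain JDBC level looks like the following Java sketch (connection URL and statements are placeholders; note that splitting naively on ';' breaks on semicolons inside string literals, which is why the answer above uses sqlparse):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class SequentialStatements {
    public static void main(String[] args) throws Exception {
        // Many drivers (including Hive's) reject multi-statement strings,
        // so split the script and submit each statement on its own.
        String script = "create table t (id int); insert into t select 1";
        try (Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
             Statement stmt = conn.createStatement()) {
            for (String sql : script.split(";")) {
                if (!sql.trim().isEmpty()) {
                    stmt.execute(sql.trim());
                }
            }
        }
    }
}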

Weird timeout issues with Dapper.net

I started to use Dapper.net a while ago for performance reasons, and because I really like the named parameters feature compared to just running "ExecuteQuery" in LINQ to SQL.
It works great for most queries, but I get some really weird timeouts from time to time. The strangest thing is that this timeout only happens when the SQL is executed via Dapper. If I take the executed query copied from the profiler and just run it in Management Studio, it's fast and works perfectly. And it's not just a temporary issue. The query consistently times out via Dapper and consistently works fine in Management Studio.
exec sp_executesql N'SELECT Item.Name,dbo.PlatformTextAndUrlName(Item.ItemId) As PlatformString,dbo.MetaString(Item.ItemId) As MetaTagString, Item.StartPageRank,Item.ItemRecentViewCount,
NAME_SRCH.RANK as NameRank,
DESC_SRCH.RANK As DescRank,
ALIAS_SRCH.RANK as AliasRank,
Item.itemrecentviewcount,
(COALESCE(ALIAS_SRCH.RANK, 0)) + (COALESCE(NAME_SRCH.RANK, 0)) + (COALESCE(DESC_SRCH.RANK, 0) / 20) + Item.itemrecentviewcount / 4 + ((CASE WHEN altrank > 60 THEN 60 ELSE altrank END) * 4) As SuperRank
FROM dbo.Item
INNER JOIN dbo.License on Item.LicenseId = License.LicenseId
LEFT JOIN dbo.Icon on Item.ItemId = Icon.ItemId
LEFT OUTER JOIN FREETEXTTABLE(dbo.Item, name, @SearchString) NAME_SRCH ON
Item.ItemId = NAME_SRCH.[KEY]
LEFT OUTER JOIN FREETEXTTABLE(dbo.Item, namealiases, @SearchString) ALIAS_SRCH ON
Item.ItemId = ALIAS_SRCH.[KEY]
INNER JOIN FREETEXTTABLE(dbo.Item, *, @SearchString) DESC_SRCH ON
Item.ItemId = DESC_SRCH.[KEY]
ORDER BY SuperRank DESC OFFSET @Skip ROWS FETCH NEXT @Count ROWS ONLY',N'@Count int,@SearchString nvarchar(4000),@Skip int',@Count=12,@SearchString=N'box,com',@Skip=0
That is the query that I copy-pasted from SQL Profiler. I execute it like this in my code:
using (var connection = new SqlConnection(ConfigurationManager.ConnectionStrings["Conn"].ToString())) {
    connection.Open();
    var items = connection.Query<MainItemForList>(query, new { SearchString = searchString, PlatformId = platformId, _LicenseFilter = licenseFilter, Skip = skip, Count = count }, buffered: false);
    return items.ToList();
}
I have no idea where to start here. I suppose there must be something going on with Dapper, since the query works fine when I execute it manually.
As you can see in this screenshot, this is the same query executed first via code and then via Management Studio.
I can also add that this only happens (I think) when I have two or more words or a "stop" character in the search string. So it may have something to do with the full-text search, but I can't figure out how to debug it since it works perfectly from Management Studio.
And to make matters even worse, it works fine on my localhost with an almost identical database, both from code and from Management Studio.
Dapper is nothing more than a utility wrapper over ADO.NET; it does not change how ADO.NET operates. It sounds to me that the problem here is "works in SSMS, fails in ADO.NET". This is not unique: it is pretty common to find this occasionally. Likely candidates:
"SET" options: these have different defaults in ADO.NET - and can impact performance, especially if you have things like calculated + persisted + indexed columns - if the "SET" options aren't compatible, it can decide it can't use the stored value, hence not the index - and instead table-scan and recompute. There are other similar scenarios.
System load / transaction isolation level / blocking: running something in SSMS does not reproduce the entire system load at that moment in time.
Cached query plans: sometimes a duff plan gets cached and used; running from SSMS will usually force a new plan - which will naturally be tuned for the parameters you are using in your test. Update all your index stats etc., and consider adding the "optimise for" query hint.
In ADO.NET the default value for CommandTimeout is 30 seconds; in Management Studio it is infinite. Adjust the command timeout when calling Query<>, see below.
var param = new { SearchString = searchString, PlatformId = platformId, _LicenseFilter = licenseFilter, Skip = skip, Count = count };
var queryTimeoutInSeconds = 120;
using (var connection = new SqlConnection(ConfigurationManager.ConnectionStrings["Conn"].ToString()))
{
    connection.Open();
    var items = connection.Query<MainItemForList>(query, param, commandTimeout: queryTimeoutInSeconds, buffered: false);
    return items.ToList();
}
See also
SqlCommand.CommandTimeout Property on MSDN
For Dapper, the default timeout is 30 seconds, but we can increase it. Here we are increasing the timeout to 240 seconds (4 minutes).
public DataTable GetReport(bool isDepot, string fetchById)
{
    int? queryTimeoutInSeconds = 240;
    using (IDbConnection _connection = DapperConnection)
    {
        var parameters = new DynamicParameters();
        parameters.Add("@IsDepot", isDepot);
        parameters.Add("@FetchById", fetchById);
        var res = this.ExecuteSP<dynamic>(SPNames.SSP_GetSEPReport, parameters, queryTimeoutInSeconds);
        return ToDataTable(res);
    }
}
In the repository layer, we can call our custom ExecuteSP method for stored procedures with the additional parameter "queryTimeoutInSeconds".
And below is the "ExecuteSP" method for Dapper:
public virtual IEnumerable<TEntity> ExecuteSP<TEntity>(string spName, object parameters = null, int? parameterForTimeout = null)
{
    using (IDbConnection _connection = DapperConnection)
    {
        _connection.Open();
        return _connection.Query<TEntity>(spName, parameters, commandTimeout: parameterForTimeout, commandType: CommandType.StoredProcedure);
    }
}
Could be a matter of setting the command timeout in Dapper. Here's an example of how to adjust the command timeout in Dapper:
Setting Command Timeout in Dapper

ScalaQuery's query/queryNA several times slower than JDBC?

In the following performance tests of many queries, this timed JDBC code takes 500-600ms:
val ids = queryNA[String]("select id from account limit 1000").list
val stmt = session.conn.prepareStatement("select * from account where id = ?")
debug.time() {
  for (id <- ids) {
    stmt.setString(1, id)
    stmt.executeQuery().next()
  }
}
However, when using ScalaQuery, the time goes to >2s:
val ids = queryNA[String]("select id from account limit 1000").list
implicit val gr = GetResult(r => ())
val q = query[String,Unit]("select * from account where id = ?")
debug.time() {
  for (id <- ids) {
    q.first(id)
  }
}
After debugging with server logs, this turns out to be due to the fact that the PreparedStatements are being repeatedly prepared and not reused.
This is in fact a performance issue that we've been hitting in our application code, so we're wondering if we're missing something regarding how to reuse prepared statements properly in ScalaQuery, or if dropping down to JDBC is the suggested workaround.
Got an answer from the ScalaQuery mailing list. This is just how ScalaQuery is designed - it assumes that you're using something that provides statement pooling underneath:
Nowadays ScalaQuery always requests a new PreparedStatement from the Connection. There used to be a cache for PreparedStatements in early versions but I removed it because there are already good solutions for this problem. Every decent connection pool should have an option for PreparedStatement pooling. If you're using a Java EE server, it should have an integrated connection pool. For standalone applications, you can use something like http://sourceforge.net/projects/c3p0/
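For example, with c3p0 you enable its PreparedStatement cache when configuring the pool. A minimal Java sketch, with placeholder URL and credentials:

import com.mchange.v2.c3p0.ComboPooledDataSource;

public class PooledDataSource {
    public static ComboPooledDataSource create() {
        ComboPooledDataSource ds = new ComboPooledDataSource();
        ds.setJdbcUrl("jdbc:postgresql://localhost/test"); // placeholder
        ds.setUser("user");
        ds.setPassword("secret");
        // Enable statement pooling so repeated prepares of the same query
        // are served from the cache instead of hitting the database each time.
        ds.setMaxStatements(200);             // global cache size
        ds.setMaxStatementsPerConnection(20); // per-connection cache size
        return ds;
    }
}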

Grails/Hibernate Batch Insert

I am using STS + Grails 1.3.7 and doing batch insertion of thousands of instances of a domain class.
It is very slow because Hibernate sends the SQL statements one by one in separate JDBC calls instead of combining them into one batch.
How can I make them into one large statement?
What you can do is flush the Hibernate session every 20 inserts, like this:
int cpt = 0
mycollection.each { item ->
    cpt++
    if (cpt % 20 == 0) {
        item.save(flush: true)
    }
    else {
        item.save()
    }
}
Flushing the Hibernate session executes the pending SQL statements every 20 inserts.
This is the easiest method, but you can find more interesting ways to do it on Tomas Lin's blog. He explains exactly what you want to do: http://fbflex.wordpress.com/2010/06/11/writing-batch-import-scripts-with-grails-gsql-and-gpars/
Using the withTransaction() method on the domain classes makes the inserts much faster for batch scripts. You can build up all of the domain objects in one collection, then insert them in one block.
For example:
Player.withTransaction {
    for (p in players) {
        p.save()
    }
}
You can see this line in the Hibernate docs:
Hibernate disables insert batching at the JDBC level transparently if you use an identity identifier generator.
When I changed the type of generator, it worked.
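The reason is that identity generators force Hibernate to execute each INSERT immediately to learn the generated key, so switching to a sequence-based generator (together with the hibernate.jdbc.batch_size property) re-enables JDBC batching. A minimal Java/JPA sketch of the idea - the entity, sequence name, and allocation size are illustrative assumptions, not the poster's actual mapping:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

@Entity
public class Player {
    // GenerationType.IDENTITY would disable JDBC insert batching transparently;
    // a sequence generator lets Hibernate pre-allocate ids and batch the inserts
    // (also set hibernate.jdbc.batch_size, e.g. 20, in your configuration).
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "player_seq")
    @SequenceGenerator(name = "player_seq", sequenceName = "player_seq", allocationSize = 50)
    private Long id;

    private String name;
}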