I am using a RichSinkFunction to execute a SQL UPDATE query on an existing record.
This function assumes that the record already exists in the DB. However, in certain scenarios the record is late, i.e. it has not yet been written when the update arrives.
To work around this lateness, I have added a Thread.sleep() so the function waits and then retries the DB update.
Sample code is provided below for reference.
class RichSinkFact extends RichSinkFunction[FulfillmentUsagesOutput] {

  private def updateFactUpcoming(
      r: FulfillmentUsagesOutput,
      schemaName: String
  ): Unit = {
    var updateStmt: PreparedStatement = null
    val sqlStatement =
      s"""
         |UPDATE $schemaName.$factUpcomingTableName
         |SET unit_id = ?
         |WHERE pledge_id = ?
         """.stripMargin
    try {
      updateStmt = connection.prepareStatement(sqlStatement)
      updateStmt.setLong(1, r.unit_id)
      updateStmt.setString(2, r.pledge_id)
      val rows = updateStmt.executeUpdate()
      if (rows == 0) {
        // Record not there yet: wait, then retry the update once.
        logger.warn(s"Retrying update for ${r}")
        Thread.sleep(retrySleepTime)
        val retriedRows = updateStmt.executeUpdate()
        if (retriedRows == 0) {
          // Still no matching row: raise an error.
          logger.error(s"Unable to update row: ${r}")
        }
      }
    } finally {
      if (updateStmt != null) {
        updateStmt.close()
      }
    }
  }
}
Question: since Flink already implements timers and internal time-processing functions, is this the right way to retry a DB update?
Thanks
As you suspected, sleeping in a Flink user function can cause problems and should be avoided. In this case there is a better solution: take a look at Sink.ProcessingTimeService. It lets you register timers that invoke a callback of yours when they fire.
Thanks to David for the original idea behind this approach.
Sink.ProcessingTimeService is only present from Flink 1.12 onwards. So, for anyone on an earlier version of Flink looking to implement a similar solution, ProcessingTimeCallback can be used to implement timers in a sink application.
I have included a sample approach here:
https://gist.github.com/soumoks/f73694c64169c8b3494ba1842fa61f1b
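To make the idea concrete, here is a rough sketch of a timer-based retry in the spirit of that gist. It is not a drop-in implementation: the package and acquisition of the ProcessingTimeService differ across Flink versions (constructor injection below is a simplification), and connection, schemaName, factUpcomingTableName, logger, maxRetries and retrySleepTime are assumed to be defined elsewhere.

  // Package locations of ProcessingTimeService/ProcessingTimeCallback vary by Flink version.
  import org.apache.flink.streaming.api.functions.sink.RichSinkFunction
  import org.apache.flink.streaming.runtime.tasks.{ProcessingTimeCallback, ProcessingTimeService}

  class TimerRetrySink(timeService: ProcessingTimeService)
      extends RichSinkFunction[FulfillmentUsagesOutput] {

    override def invoke(r: FulfillmentUsagesOutput): Unit =
      if (!tryUpdate(r)) scheduleRetry(r, attempt = 1)

    // Registers a timer instead of calling Thread.sleep(), so the task thread
    // keeps processing other records while the retry is pending.
    private def scheduleRetry(r: FulfillmentUsagesOutput, attempt: Int): Unit =
      if (attempt > maxRetries) {
        logger.error(s"Unable to update row: ${r}")
      } else {
        timeService.registerTimer(
          timeService.getCurrentProcessingTime + retrySleepTime,
          new ProcessingTimeCallback {
            override def onProcessingTime(timestamp: Long): Unit =
              if (!tryUpdate(r)) scheduleRetry(r, attempt + 1)
          }
        )
      }

    // Runs the UPDATE once and reports whether any row was matched.
    private def tryUpdate(r: FulfillmentUsagesOutput): Boolean = {
      val stmt = connection.prepareStatement(
        s"UPDATE $schemaName.$factUpcomingTableName SET unit_id = ? WHERE pledge_id = ?")
      try {
        stmt.setLong(1, r.unit_id)
        stmt.setString(2, r.pledge_id)
        stmt.executeUpdate() > 0
      } finally stmt.close()
    }
  }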
Thanks for reading this question.
I created a simple Kotlin project and I want to learn Kotlin Exposed.
I use an H2 database.
I wrote the code below.
package learn.exposed.tables

import org.jetbrains.exposed.sql.*
import org.jetbrains.exposed.sql.transactions.transaction

object AuthorTable : Table("author") {
    val name = varchar("name", 30)
}

fun main() {
    // this url is based on http://www.h2database.com/html/features.html#execute_sql_on_connection
    val url = "jdbc:h2:mem:test;INIT=runscript from 'classpath:/create.sql'\\;runscript from 'classpath:/init.sql'"
    Database.connect(url, driver = "org.h2.Driver", user = "root", password = "")

    transaction {
        AuthorTable.insert {
            it[name] = "hoge"
        }
        println("insert done.") // this message shows on the console. I think the insert is successful.
    }

    transaction {
        AuthorTable.selectAll().firstOrNull()
    }
}
and the SQL files below.
create table author (name varchar(30));
insert into author values ('author1');
When I execute main(), the console shows insert done., so I think the insert is working. But when AuthorTable.selectAll().firstOrNull() executes, the Exception below is thrown:
Exception in thread "main" org.jetbrains.exposed.exceptions.ExposedSQLException: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.NullPointerException"
General error: "java.lang.NullPointerException" [50000-200]
SQL: [Failed on expanding args for SELECT: org.jetbrains.exposed.sql.Query#27406a17]
at org.jetbrains.exposed.sql.statements.Statement.executeIn$exposed_core(Statement.kt:62)
at org.jetbrains.exposed.sql.Transaction.exec(Transaction.kt:135)
at org.jetbrains.exposed.sql.Transaction.exec(Transaction.kt:121)
at org.jetbrains.exposed.sql.AbstractQuery.iterator(AbstractQuery.kt:65)
at kotlin.collections.CollectionsKt___CollectionsKt.firstOrNull(_Collections.kt:267)
at learn.exposed.MainKt$main$2.invoke(Main.kt:22)
at learn.exposed.MainKt$main$2.invoke(Main.kt)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.inTopLevelTransaction$run(ThreadLocalTransactionManager.kt:179)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.access$inTopLevelTransaction$run(ThreadLocalTransactionManager.kt:1)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt$inTopLevelTransaction$1.invoke(ThreadLocalTransactionManager.kt:205)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.keepAndRestoreTransactionRefAfterRun(ThreadLocalTransactionManager.kt:213)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.inTopLevelTransaction(ThreadLocalTransactionManager.kt:204)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt$transaction$1.invoke(ThreadLocalTransactionManager.kt:156)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.keepAndRestoreTransactionRefAfterRun(ThreadLocalTransactionManager.kt:213)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.transaction(ThreadLocalTransactionManager.kt:126)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.transaction(ThreadLocalTransactionManager.kt:123)
at org.jetbrains.exposed.sql.transactions.ThreadLocalTransactionManagerKt.transaction$default(ThreadLocalTransactionManager.kt:122)
at learn.exposed.MainKt.main(Main.kt:21)
at learn.exposed.MainKt.main(Main.kt)
How can I solve this? Do you know a way?
Thanks.
It seems you need at least one primary key (PK) or constraint, because of an H2 bug:
https://github.com/h2database/h2database/issues/2191
https://github.com/JetBrains/Exposed/issues/801
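For example, a minimal sketch of the workaround, with the key added to both the table object and create.sql. Choosing name as the key is arbitrary here, and the primaryKey override below is the syntax of recent Exposed versions; older ones use varchar("name", 30).primaryKey() instead.

  object AuthorTable : Table("author") {
      val name = varchar("name", 30)

      // Any primary key (or constraint) sidesteps the H2 NullPointerException above.
      override val primaryKey = PrimaryKey(name)
  }

with create.sql changed to match:

  create table author (name varchar(30) primary key);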
I have a very strange error with Dapper:
there is already an open DataReader associated with this Command which must be closed first
But I don't use a DataReader! I just run a select query in my server application and take the first result:
//How I run the query:
public static T SelectVersion(IDbTransaction transaction = null)
{
    return DbHelper.DataBase.Connection.Query<T>(
        "SELECT * FROM [VersionLog] WHERE [Version] = (SELECT MAX([Version]) FROM [VersionLog])",
        null, transaction, commandTimeout: DbHelper.CommandTimeout).FirstOrDefault();
}
//And how I call this method:
public Response Upload(CommitRequest message) //It is called on the server from the client
{
    //Preparing data from CommitRequest
    using (var tr = DbHelper.DataBase.Connection.BeginTransaction(IsolationLevel.Serializable))
    {
        int v = SelectQueries<VersionLog>.SelectVersion(tr) != null
            ? SelectQueries<VersionLog>.SelectVersion(tr).Version
            : 0; //Call my query here
        int newVersion = v + 1; //update version
        //Saving changes from CommitRequest to db
        //The updated version is saved to the base too, maybe that is the problem?
        return new Response
        {
            Message = String.Empty,
            ServerBaseVersion = versionLog.Version,
        };
    }
}
Saddest of all, this exception appears at random times. I think the problem is concurrent access to the server from two clients.
Please help.
This sometimes happens when the model and the database schema do not match and an exception is raised inside Dapper.
If you really want to get to the bottom of it, the best way is to include the Dapper source in your project and debug.
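To illustrate the kind of mismatch meant here, a purely hypothetical model for the [VersionLog] table (the Message column and both property types are invented):

  public class VersionLog
  {
      public int Version { get; set; }  // fine if [Version] really is an integer column
      public int Message { get; set; }  // throws inside Dapper if [Message] is nvarchar
  }

  // When materialization throws mid-read on a connection shared between calls,
  // the reader can be left open, and the next query then fails with the
  // "already an open DataReader" error instead of the real cause.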
In SQL we can do something like this:
INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);
Is there any way to do multiple/bulk/batch inserts or updates in Slick?
Can we do something similar, at least using plain SQL queries?
For inserts, as Andrew answered, you use insertAll.
def insertAll(items: Seq[MyCaseClass])(implicit session: Session): Option[MyCaseClass] =
  items.size match {
    case s if s > 0 =>
      try {
        // baseQuery is the TableQuery object
        baseQuery.insertAll(items: _*)
      } catch {
        case e: Exception => e.printStackTrace()
      }
      Some(items(0))
    case _ => None
  }
For updates, you're out of luck. Check out Scala slick 2.0 updateAll equivalent to insertALL? for what I ended up doing. To paraphrase, here's the code:
private def batchUpdateQuery = "update table set value = ? where id = ?"

/**
 * Dropping to JDBC because Slick doesn't support this batched update.
 */
def batchUpdate(batch: List[MyCaseClass])(implicit session: Session) = {
  val pstmt = session.conn.prepareStatement(batchUpdateQuery)
  batch foreach { myCaseClass =>
    pstmt.setString(1, myCaseClass.value)
    pstmt.setString(2, myCaseClass.id)
    pstmt.addBatch()
  }
  session.withTransaction {
    pstmt.executeBatch()
  }
}
In Slick, you are able to use the insertAll method for a Table. An example of insertAll is given in the Getting Started page on Slick's website.
http://slick.typesafe.com/doc/0.11.1/gettingstarted.html
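For reference, a minimal sketch of what that looks like in the old API. The Coffees table, its columns and the thread-local session style follow that Getting Started page; the values and column count are made up.

  // insertAll issues one batched INSERT for all rows.
  Database.forURL("jdbc:h2:mem:test1", driver = "org.h2.Driver") withSession {
    Coffees.insertAll(
      ("Colombian", 101, 7.99),
      ("French_Roast", 49, 8.99)
    )
  }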
How do I set a timeout for this operation?
def db = Sql.newInstance("jdbc:mysql://${mysql_host}:3306/${dbName}",
user, pass, 'com.mysql.jdbc.Driver')
db.eachRow(query) { row ->
// do something with the row
}
I believe the correct way would be something like this:
sql = Sql.newInstance("jdbc:oracle:thin:@localhost:1521:XE", "user",
    "pwd", "oracle.jdbc.driver.OracleDriver")
sql.withStatement {
    stmt -> stmt.queryTimeout = 10
}
sql.eachRow("select * from someTable", {
    println it
})
Of course, this is a case where I used Oracle, but I hope it gives you an idea.
I believe there might not be a general answer, but rather a database/driver-specific one via parameters in the connection URL.
E.g. for MySQL, I think adding connectTimeout=...&socketTimeout=... to the URL might do the trick.
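For example, a sketch with made-up values (connectTimeout and socketTimeout are MySQL Connector/J properties, in milliseconds):

  // Fail connection attempts after 10s and blocked reads after 30s.
  def db = Sql.newInstance(
      "jdbc:mysql://${mysql_host}:3306/${dbName}?connectTimeout=10000&socketTimeout=30000",
      user, pass, 'com.mysql.jdbc.Driver')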
I'm using the following method for all data insert and update processes in my application. I just need to pass an array of SQL queries to the method. Are there any disadvantages to using one common method? Does it cause any performance reduction in the application?
public int ExecuteCommand(string[] sqls)
{
    numberOfRecordsAffected = 0;
    IngresConnection ingresConnection = new IngresConnection(ConnStr);
    IngresTransaction ingresTransaction = null;
    try
    {
        ingresConnection.Open();
        ingresTransaction = ingresConnection.BeginTransaction();
        foreach (string sql in sqls)
        {
            IngresCommand ingresCommand = new IngresCommand(sql, ingresConnection, ingresTransaction);
            ingresCommand.CommandTimeout = 0;
            numberOfRecordsAffected += ingresCommand.ExecuteNonQuery();
        }
        ingresTransaction.Commit();
    }
    catch
    {
        if (ingresTransaction != null)
            ingresTransaction.Rollback();
        ingresConnection.Close();
        throw;
    }
    finally
    {
        if (ingresConnection != null)
            ingresConnection.Close();
    }
    return numberOfRecordsAffected;
}
See this opinionated article about dynamic SQL. You ask specifically about performance, which indeed suffers a lot, because your queries can't be cached by the database and each of them must be parsed. The real worry, though, should be security. It's so easy to get it wrong or incomplete at one point or another, and it's even harder to test whether it has been messed up somewhere or not.
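As a sketch of the alternative, here is roughly what one parameterized command could look like inside your loop. I am assuming the Ingres .NET provider follows the usual ADO.NET pattern with ? positional markers and an IngresParameter class; the table, columns and values are invented:

  using (var cmd = new IngresCommand(
      "UPDATE my_table SET name = ? WHERE id = ?", ingresConnection, ingresTransaction))
  {
      // The SQL text stays constant, so the server can cache the parsed plan,
      // and the values travel out-of-band, which blocks SQL injection.
      cmd.Parameters.Add(new IngresParameter { Value = "new name" });
      cmd.Parameters.Add(new IngresParameter { Value = 42 });
      numberOfRecordsAffected += cmd.ExecuteNonQuery();
  }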