Neo4j 3.5's embedded database does not seem to persist data - Kotlin

I am trying to build a small command line tool that will store data in a Neo4j graph. To do this I have started experimenting with Neo4j 3.5's embedded databases. After putting together the following example, I have found that either the nodes I am creating are not being saved to the database, or the method of database creation is overwriting my previous run.
The Example:
fun main() {
    // Spin up the database
    val graphDBFactory = GraphDatabaseFactory()
    val graphDB = graphDBFactory.newEmbeddedDatabase(File("src/main/resources/neo4j"))
    registerShutdownHook(graphDB)

    val tx = graphDB.beginTx()
    graphDB.createNode(Label.label("firstNode"))
    graphDB.createNode(Label.label("secondNode"))
    val result = graphDB.execute("MATCH (a) RETURN COUNT(a)")
    println(result.resultAsString())
    tx.success()
}
private fun registerShutdownHook(graphDb: GraphDatabaseService) {
    // Registers a shutdown hook for the Neo4j instance so that it
    // shuts down nicely when the VM exits (even if you "Ctrl-C" the
    // running application).
    Runtime.getRuntime().addShutdownHook(object : Thread() {
        override fun run() {
            graphDb.shutdown()
        }
    })
}
I would expect the resulting query count to increase by 2 every time I run main.
That is currently not the case, and I can find nothing in the docs that describes a different method for opening an already created embedded database. Am I using the embedded database incorrectly, or am I missing something? Any help or info would be appreciated.
Build info:
Kotlin/JVM 1.4.21
Neo4j Community 3.5.35

Transactions in Neo4j 3.x have a three-stage model:
1. create
2. success / failure
3. close
You are missing the third stage, which is what actually commits (or rolls back) the transaction.
You can use Kotlin's use, since Transaction is AutoCloseable.
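A minimal sketch of the fixed transaction block, reusing the graphDB from the question:

// use() closes the Transaction when the block exits; close() then commits
// because success() was called, or rolls back otherwise.
graphDB.beginTx().use { tx ->
    graphDB.createNode(Label.label("firstNode"))
    graphDB.createNode(Label.label("secondNode"))
    val result = graphDB.execute("MATCH (a) RETURN COUNT(a)")
    println(result.resultAsString())
    tx.success()
}

With the transaction closed, each run should commit its two new nodes, and the count should grow by 2 per run.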

Related

Quarkus: execute parallel unis

In a Quarkus / Kotlin application, I want to start multiple database requests concurrently. I am new to Quarkus and I am not sure if I am doing things right:
val uni1 = Uni.createFrom().item(repo1).onItem().apply { it.request() }
val uni2 = Uni.createFrom().item(repo2).onItem().apply { it.request() }
return Uni.combine().all()
    .unis(uni1, uni2)
    .asTuple()
    .onItem()
    .apply { tuple ->
        Result(tuple.item1, tuple.item2)
    }
    .await()
    .indefinitely()
Will the request() really be made in parallel? Is it the right way to do it in quarkus?
Yes, your code is right.
Uni.combine().all() runs all the passed Unis concurrently. You will get the tuple (containing the individual results) when all the Unis have completed (emitted a result).
From your code, you may remove the tuple step and use combinedWith instead.
Finally, note that await().indefinitely() blocks the caller thread, potentially forever if one of the Unis never completes (for whatever reason). I strongly recommend using await().atMost(...) instead.
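A rough sketch of what that could look like, reusing uni1, uni2 and Result from the question (exact method names may vary slightly with your Mutiny version):

// combinedWith replaces the asTuple()/apply step and combines both results directly;
// await().atMost bounds the blocking wait (Duration is java.time.Duration)
return Uni.combine().all()
    .unis(uni1, uni2)
    .combinedWith { r1, r2 -> Result(r1, r2) }
    .await()
    .atMost(Duration.ofSeconds(5))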

Is it possible to load a pre-populated database from local resource using sqldelight

I have a relatively large db that may take 1 to 2 minutes to initialise. Is it possible to load a pre-populated db when using SQLDelight (Kotlin Multiplatform) instead of initialising the db on app launch?
Yes, but it can be tricky. Not just for Multiplatform. You need to copy the db to the db folder before trying to initialize SQLDelight. That probably means I/O on the main thread when the app starts.
There is no standard way to do this now. You'll need to put the db file in assets on Android and in a bundle on iOS and copy them to their respective folders before initializing SQLDelight. Obviously you'll want to check if the db exists first, or have some way of knowing this is your first app run.
If you're planning on shipping updates that will have newer databases, you'll need to manage versions outside of just a check for the existence of the db.
Although not directly answering your question, 1 to 2 minutes is really, really long for sqlite. What are you doing? I would first make sure you're using transactions properly. 1-2 minutes of inserting data would (probably) result in a huge db file.
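As a rough illustration of that point (not from the original answer; the database, itemQueries and insertItem names are placeholders), batching bulk inserts in a single SQLDelight transaction looks something like this:

// Hypothetical sketch: wrap the bulk inserts in one transaction so SQLite
// commits once instead of once per INSERT statement.
database.itemQueries.transaction {
    items.forEach { item ->
        database.itemQueries.insertItem(item.id, item.name)
    }
}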
Sorry, but I can't add any comments yet, which would be more appropriate...
Although not directly answering your question, 1 to 2 minutes is really, really long for sqlite. What are you doing? I would first make sure you're using transactions properly. 1-2 minutes of inserting data would (probably) result in a huge db file.
In my case, the reason I had to use a pre-populated database was the large size of the .sq files (more than 30 MB of INSERT statements per table); SQLDelight silently aborted code generation without displaying any error messages.
You'll need to put the db file in assets on android and in a bundle on iOS and copy them to their respective folders before initializing sqldelight.
Having to load a db from resources on both android and ios feels a lot of work + it means the shared project wont be the only place where the data is initialised.
The Kotlin Multiplatform library moko-resources solves the issue of a single source for a database in a shared module. It works the same way for Android and iOS in KMM.
Unfortunately, this usage is barely covered in the library's samples. I added a second method (getDriver) to the expect class DatabaseDriverFactory to open the prepared database, and implemented it on each platform. For example, for androidMain:
actual class DatabaseDriverFactory(private val context: Context) {
    actual fun createDriver(schema: SqlDriver.Schema, fileName: String): SqlDriver {
        return AndroidSqliteDriver(schema, context, fileName)
    }

    actual fun getDriver(schema: SqlDriver.Schema, fileName: String): SqlDriver {
        val database: File = context.getDatabasePath(fileName)
        if (!database.exists()) {
            // Copy the pre-populated database from the bundled resource into the app's database folder
            val inputStream = context.resources.openRawResource(MR.files.dbfile.rawResId)
            val outputStream = FileOutputStream(database.absolutePath)
            inputStream.use { input: InputStream ->
                outputStream.use { output: FileOutputStream ->
                    input.copyTo(output)
                }
            }
        }
        return AndroidSqliteDriver(schema, context, fileName)
    }
}
MR.files.fullDb is the FileResource from the class generated by the library; it is associated with the name of the file located in the resources/MR/files directory of the commonMain module. Its rawResId property represents the platform-side resource ID.
The only thing you need to do is specify the path to the DB file when creating the driver.
Let's assume your DB lies in /mnt/my_best_app_dbs/super.db. Now, pass the path as the name property of the driver. Something like this:
val sqlDriver: SqlDriver = AndroidSqliteDriver(Schema, context, "/mnt/my_best_app_dbs/best.db")
Keep in mind that you might need to have permissions that allow you to read a given storage type.

Spark could not filter correctly?

I've encountered a weird problem where the result is not correct.
I have a class called A, and it has a value called keyword.
I want to filter an RDD[A] by checking whether its keyword is in a given set of words.
Spark environment:
version: 1.3.1
execution env: yarn-client
Here is the code:
class A ...

case class C(words: Set[String]) extends Serializable {
  def run(data: RDD[A])(implicit sc: SparkContext) = {
    data.collect { case x: A => x }.filter(y => words.contains(y.keyword)).foreach(println)
  }
}

// in main function
val data: RDD[A] = ....
val c = C(Set("abc"))
c.run(data)
The code above prints nothing. However, if I collect the RDD[A] locally first, then it prints something! E.g.
data.take(1000).collect { case x: A => x }.filter(y => words.contains(y.keyword)).foreach(println)
How could this happen?
Let me ask another related question: should I make case class C extend Serializable? I don't think it is necessary.
The reason is quite simple. When you collect the data locally and then call println, your data is transferred over the network to the machine you are using (let's call it the client of the Spark environment) and printed on your console. So far, everything behaves as expected. If, instead, you call println on a distributed RDD, the function is executed locally on the worker machines that hold your data. So the function is actually executed, but you won't see any result on the console of your client (unless the client is also a worker machine): everything is printed on the console of the respective worker nodes.
No, it's not necessary to make it Serializable; the only thing that gets serialized is your words: Set[String].

Grails - Recreate database schema for integration test

Is there a convenient way to force Grails / Hibernate to recreate the database schema from an integration test?
If you add the following to DataSource.groovy, an empty database will be created before the integration tests are run:
environments {
    test {
        dataSource {
            dbCreate = "create"
        }
    }
}
By default each integration test executes within a transaction that is rolled back at the end of the test, so unless you've disabled this default behaviour there shouldn't be any need to programmatically recreate the database.
Update
Based on your comment, it seems you really do want to recreate the schema before some integration tests. In that case, the only way I can think of is to:
drop and recreate the schema
use grails schema-export to import a fresh schema
class MyIntegrationTest {

    SessionFactory sessionFactory

    /**
     * Helper for executing SQL statements
     * @param jdbcWork A closure that is passed an <tt>Sql</tt> object that is used to execute the JDBC statements
     */
    private doJdbcWork(Closure jdbcWork) {
        sessionFactory.currentSession.doWork(
            new Work() {
                @Override
                void execute(Connection connection) throws SQLException {
                    // do not close this Sql instance ourselves
                    Sql sql = new Sql(connection)
                    jdbcWork(sql)
                }
            }
        )
    }

    private recreateSchema() {
        doJdbcWork { Sql sql ->
            // use the sql object to drop the database and create a new blank database
            // something like the following might work for MySQL
            sql.execute("drop database my-schema")
            sql.execute("create database my-schema")
        }
        // generate the DDL and import it
        // there must be a better way to execute a grails command from within an
        // integration test, but unfortunately I don't know what it is
        'grails test schema-export export'.execute()
    }

    @Test
    void myTestMethod() {
        recreateSchema()
        // now do the test
    }
}
First and foremost, this code is completely untested, so treat it with deep suspicion and low expectations. Secondly, you may need to change the default transactional behaviour of integration tests (with @Transactional) in order for this to work.
This seems to work fine, but it's obviously very tightly coupled to H2, so it would have been nice if the Hibernate plugin had exposed an API to take care of this.
http://h2database.com/html/grammar.html#script
class SomethingTestingTransactionsSpec extends IntegrationSpec {
    static transactional = false // Why I need this

    SessionFactory sessionFactory // Injected by Spring
    DataSource dataSource // Also injected

    File schemaDump
    Sql sql

    void setup() {
        sql = new Sql(dataSource)
        schemaDump = File.createTempFile("test-database-dump", ".sql") // Java 7 API
        sql.execute("script drop to ${schemaDump.absolutePath}")
    }

    void cleanup() {
        sql.execute("runscript from ${schemaDump.absolutePath}")
        sessionFactory.currentSession.clear()
        schemaDump.delete()
    }

    // Spock tests ...
}
It should be trivial to extract this code into a bean registered only for test environments. That should clean up the test code a bit and improve efficiency by only having to dump the schema once.
Well, you have access to executing arbitrary SQL via sessionFactory, so you could run a grails schema-export at the beginning of your tests and then just re-import the schema into your DB when needed.
Alternatively, I wonder if calling the database migration plugin externally would accomplish the same.
Or you can trick Grails into thinking your domain class has changed and force a reload via https://github.com/grails/grails-core/blob/v2.1.1/grails-hibernate/src/main/groovy/org/codehaus/groovy/grails/plugins/orm/hibernate/HibernatePluginSupport.groovy#L340 (don't ask me how).

How can I encapsulate the session/transaction acquisition into the lazy-init of relations in Squeryl?

I am trying to implement a One-To-Many relation using Squeryl, and following the instructions on their site.
The documentation gives the following example:
object SchoolDb extends Schema {
  val courses = table[Course]
  val subjects = table[Subject]

  val subjectToCourses =
    oneToManyRelation(subjects, courses).
      via((s, c) => s.id === c.subjectId)
}

class Course(val subjectId: Long) extends SchoolDb2Object {
  lazy val subject: ManyToOne[Subject] = SchoolDb.subjectToCourses.right(this)
}

class Subject(val name: String) extends SchoolDb2Object {
  lazy val courses: OneToMany[Course] = SchoolDb.subjectToCourses.left(this)
}
I find that any call to Course.subject or Subject.courses needs to be wrapped in a transaction. However, one of my goals in using an ORM is to hide these details from callers. As such, I don't want the calling code to have to wrap access to these fields in a transaction.
It seems that if I modify the example to wrap the lazy init function in a transaction, like so:
class Subject(val name: String) extends SchoolDb2Object {
  lazy val courses: OneToMany[Course] = {
    inTransaction {
      SchoolDb.subjectToCourses.left(this)
    }
  }
}
I get the following exception:
Exception in thread "main" java.lang.RuntimeException: no session is bound to current thread, a session must be created via Session.create
and bound to the thread via 'work' or 'bindToCurrentThread'
at scala.Predef$.error(Predef.scala:58)
at org.squeryl.Session$$anonfun$currentSession$1.apply(Session.scala:111)
at org.squeryl.Session$$anonfun$currentSession$1.apply(Session.scala:111)
at scala.Option.getOrElse(Option.scala:104)
at org.squeryl.Session$.currentSession(Session.scala:110)
at org.squeryl.dsl.AbstractQuery.org$squeryl$dsl$AbstractQuery$$_dbAdapter(AbstractQuery.scala:116)
at org.squeryl.dsl.AbstractQuery$$anon$1.<init>(AbstractQuery.scala:120)
at org.squeryl.dsl.AbstractQuery.iterator(AbstractQuery.scala:118)
at org.squeryl.dsl.DelegateQuery.iterator(DelegateQuery.scala:9)
But, like I said, if I wrap the caller in a transaction, then everything works.
So, how can I encapsulate the fact that this object is backed by a database in the object itself?
I assume you get this error in calls on the courses object?
I don't know very much about how Squeryl works, but I believe that the OneToMany[Course] is a live object. That means that the calls on the courses object need a session since any call may lazily go to the database to fetch data.
How you organise this depends on what type of application you have. In a web application it often makes sense to add a filter (first point of entry) to start and stop the transaction. In a GUI client, say a Swing application, it's a good solution to start the transaction at the point where you receive the user interaction. That way you get transactions that are not too long and that also stretch over calls which you expect to be performed atomically (either fully or not at all).