Neo4j procedure testing and test server

I'm creating a Neo4j procedure to import RDF data. The RDF data has some complex structural information, and I wrote tests to cover every case (some triples create labels, some properties, some relationships, etc.).
The procedures are written in Kotlin.
It works fine, and each test succeeds when executed individually, but when I run the whole test class at once, I get one success and then all the other tests fail with this exception:
org.neo4j.kernel.impl.core.ThreadToStatementContextBridge$BridgeDatabaseShutdownException: This database is shutdown.
I'm new to Neo4j and I'm struggling to find good examples. Here is the structure of a test case:
package mypackage
import org.junit.Assert
import org.junit.Rule
import org.junit.Test
import org.neo4j.driver.internal.value.NullValue
import org.neo4j.driver.v1.Config
import org.neo4j.driver.v1.GraphDatabase
import org.neo4j.harness.junit.Neo4jRule
import java.io.File
import java.net.URI
class PropertyParserTest {
// this rule starts a Neo4j instance
@Rule
@JvmField
var neo4j: Neo4jRule = Neo4jRule()
// This is the procedure/function to test
.withProcedure(Mypackage::class.java)
@Test
@Throws(Throwable::class)
fun shouldSetTheNameCorrectly() {
GraphDatabase.driver(neo4j.boltURI(), Config.build().withoutEncryption().toConfig()).use({ driver ->
driver.session().use({ session ->
// Given
val path: String = File("src/test/resources/test_rdf__1.ttl").getAbsolutePath()
val testFile = File(path)
val urlTestFile: URI = testFile.toURI()
session.run("CALL mypackage.import('${urlTestFile}')")
// When
val result = session.run("MATCH (n) WHERE n:Person RETURN n.name as name")
// Then
var rec = result.next()
Assert.assertEquals("Manuel, Niklaus (Niclaus)", rec.get("name").asString())
rec = result.next()
Assert.assertEquals("Fischli / Weiss", rec.get("name").asString())
rec = result.next()
Assert.assertEquals("Hodler, Ferdinand", rec.get("name").asString())
})
})
}
@Test
@Throws(Throwable::class)
fun shouldSetTheAlternateNameCorrectly() {
GraphDatabase.driver(neo4j.boltURI(), Config.build().withoutEncryption().toConfig()).use({ driver ->
driver.session().use({ session ->
// Given
val path: String = File("src/test/resources/test_rdf_2.ttl").absolutePath
val testFile = File(path)
val urlTestFile: URI = testFile.toURI()
session.run("CALL mypackage.import('${urlTestFile}')")
// When
val result = session.run("MATCH (n) WHERE n:Person RETURN n.name as name, n.alternate_names as alternate_names")
// Then
var rec = result.next()
Assert.assertEquals("Holbein, Hans", rec.get("name").asString())
var alternateNames = rec.get("alternate_names").asList()
Assert.assertEquals(9, alternateNames.size)
Assert.assertEquals("Holpenius, Joannes", alternateNames[0])
Assert.assertEquals("Olpenius, Hans", alternateNames[8])
rec = result.next()
Assert.assertEquals("Manuel, Niklaus (Niclaus)", rec.get("name").asString())
alternateNames = rec.get("alternate_names").asList()
Assert.assertEquals(8, alternateNames.size)
rec = result.next()
Assert.assertEquals("Fischli / Weiss", rec.get("name").asString())
Assert.assertTrue(rec.get("alternate_names") is NullValue)
rec = result.next()
Assert.assertEquals("Hodler, Ferdinand", rec.get("name").asString())
Assert.assertTrue(rec.get("alternate_names") is NullValue)
rec = result.next()
Assert.assertEquals("Holbein", rec.get("name").asString())
alternateNames = rec.get("alternate_names").asList()
Assert.assertEquals(3, alternateNames.size)
})
})
}
}
Any ideas? I'm using this code base as a starting point: https://github.com/jbarrasa/neosemantics/blob/3.3/src/test/java/semantics/RDFImportTest.java

OK, I've actually found code in my procedure that executes outside a transaction, and I think this was the cause of the problem. When I group everything into a single transaction, I don't have this problem anymore.
I'm not entirely sure why it works for individual tests and fails when running the whole test class, but it works now.

Related

Get or Insert within a Transaction on Doobie in Scala

I'm reading through the Doobie documentation and trying to do a simple get-or-create within a transaction. I get an Option from the first query and attempt to do a getOrElse, running an insert in the else case; however, I keep getting "value map is not a member of Any" within the getOrElse call. What's the correct way to either get an existing row or create a new one in instances, and return that result, in a transaction?
import doobie._
import doobie.implicits._
import cats._
import cats.effect._
import cats.implicits._
import org.joda.time.DateTime
import scala.concurrent.ExecutionContext
case class Instance(id : Int, hostname : String)
case class User(id : Int, instanceId: Int, username : String, email : String, created : DateTime)
class Database(dbUrl : String, dbUser: String, dbPass: String) {
implicit val cs = IO.contextShift(ExecutionContext.global)
val xa = Transactor.fromDriverManager[IO](
"org.postgresql.Driver", dbUrl, dbUser, dbPass
)
def getOrCreateInstance(hostname: String) = for {
existingInstance <- sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option
ensuredInstance <- existingInstance.getOrElse(sql"INSERT INTO instances(hostname) VALUES(?)".update.withGeneratedKeys[Instance]("id", "hostname"))
} yield ensuredInstance
}
I got the following answer thanks to the people in the #scala channel on Freenode. I'm posting it here for completeness and for people interested in doing this without the for comprehension used in the other answer.
def getOrCreateInstance(hostname: String): ConnectionIO[Instance] =
OptionT(sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option)
.getOrElseF(sql"INSERT INTO instances(hostname) VALUES($hostname)".update.withGeneratedKeys[Instance]("id", "hostname").compile.lastOrError)
I believe something like this should work for you,
def getOrCreateInstance(hostname: String): ConnectionIO[Instance] = for {
existingInstance <- sql"SELECT id, hostname FROM instances i WHERE i.hostname = $hostname".query[Instance].option
ensuredInstance <- existingInstance.fold(sql"INSERT INTO instances(hostname) VALUES($hostname)".update.withGeneratedKeys[Instance]("id", "hostname").take(1).compile.lastOrError)(_.pure[ConnectionIO])
} yield ensuredInstance
Here you are compiling the fs2 Stream, and also lifting the existing instance into a ConnectionIO in the case where it does already exist.
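For what it's worth, here is a minimal usage sketch (the connection details and the hostname are placeholders, and it assumes the corrected getOrCreateInstance above). Because the whole get-or-insert is a single ConnectionIO, handing it to the Transactor runs it in one database transaction:
import doobie.implicits._

val db = new Database("jdbc:postgresql://localhost/mydb", "dbUser", "dbPass")

// transact interprets the ConnectionIO against db.xa as one transaction;
// unsafeRunSync() then executes the resulting IO.
val instance: Instance = db.getOrCreateInstance("example.org").transact(db.xa).unsafeRunSync()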

Is it possible to parameterize queries or parameters for an Acolyte ScalaCompositeHandler?

Background:
I have attempted to accomplish the question defined here, and I have not been able to succeed. Acolyte requires you to define the queries and parameters you want to handle within a match expression, and the values used in match expressions must be known at compile time. (Note, however, that this StackOverflow answer appears to provide a way around this limitation).
If this is indeed not possible, the inability to dynamically define the parameters and queries for Acolyte would be, for my use case, a severe limitation of the framework. I suspect this would be a limitation for others as well.
One SO user who has advocated for the use of Acolyte across a handful of questions stated in this comment that it is possible to dynamically define queries and their responses. So, I have opened this question as an invitation for someone to show that to be the case.
Question:
Using Acolyte, I want to be able to encapsulate the logic for matching queries and generating their responses. This is a desired feature because I want to keep my code DRY. In other words, I am looking for something like the following pseudo-code:
def generateHandler(query: String, accountId: Int, parameters: Seq[String]): ScalaCompositeHandler = AcolyteDSL.handleQuery {
parameters.foreach(p =>
// Tell the handler to handle this specific parameter
case acolyte.jdbc.QueryExecution(query, ExecutedParameter(accountId) :: ExecutedParameter(p) :: Nil) =>
someResultFunction(p)
)
}
Is this possible in Acolyte? If so, please provide an example.
It is indeed possible to parameterize queries and/or parameters by utilizing pattern matching.
See the code below for an example:
import java.sql.DriverManager
import acolyte.jdbc._
import acolyte.jdbc.Implicits._
import org.scalatest.FunSpec
class AcolyteTest extends FunSpec {
describe("Using pattern matching to extract a query parameter") {
it("should extract the parameter and make it usable for dynamic result returning") {
val query = "SELECT someresult FROM someDB WHERE id = ?"
val rows = RowLists.rowList1(classOf[String] -> "someresult")
val handlerName = "testOneHandler"
val handler = AcolyteDSL.handleQuery {
case acolyte.jdbc.QueryExecution(`query`, ExecutedParameter(id) :: _) =>
rows.append(id.toString)
}
Driver.register(handlerName, handler)
val connection = DriverManager.getConnection(s"jdbc:acolyte:anything-you-want?handler=$handlerName")
val preparedStatement = connection.prepareStatement(query)
preparedStatement.setString(1, "hello world")
val resultSet = preparedStatement.executeQuery()
resultSet.next()
assertResult(resultSet.getString(1))("hello world")
}
it("should support a slightly more complex example") {
val firstResult = "The first result"
val secondResult = "The second result"
val query = "SELECT someresult FROM someDB WHERE id = ?"
val rows = RowLists.rowList1(classOf[String] -> "someresult")
val results: Map[String, RowList1.Impl[String]] = Map(
"one" -> rows.append(firstResult),
"two" -> rows.append(secondResult)
)
def getResult(parameter: String): QueryResult = {
results.get(parameter) match {
case Some(row) => row.asResult()
case _ => acolyte.jdbc.QueryResult.Nil
}
}
val handlerName = "testTwoHandler"
val handler = AcolyteDSL.handleQuery {
case acolyte.jdbc.QueryExecution(`query`, ExecutedParameter(id) :: _) =>
getResult(id.toString)
}
Driver.register(handlerName, handler)
val connection = DriverManager.getConnection(s"jdbc:acolyte:anything-you-want?handler=$handlerName")
val preparedStatement = connection.prepareStatement(query)
preparedStatement.setString(1, "one")
val resultSetOne = preparedStatement.executeQuery()
resultSetOne.next()
assertResult(resultSetOne.getString(1))(firstResult)
preparedStatement.setString(1, "two")
val resultSetTwo = preparedStatement.executeQuery()
resultSetTwo.next()
assertResult(resultSetTwo.getString(1))(secondResult)
}
}
}

Kafka Spark streaming HBase insert issues

I'm using Kafka to send a file with 3 columns, and Spark Streaming 1.3 to insert it into HBase.
This is what my HBase table looks like:
ROW COLUMN+CELL
zone:bizert column=travail:call, timestamp=1491836364921, value=contact:numero
zone:jendouba column=travail:Big data, timestamp=1491835836290, value=contact:email
zone:tunis column=travail:info, timestamp=1491835897342, value=contact:num
3 row(s) in 0.4200 seconds
And this is how I read the data with Spark Streaming (I'm using spark-shell):
import org.apache.spark.streaming.{ Seconds, StreamingContext }
import org.apache.spark.streaming.kafka.KafkaUtils
import kafka.serializer.StringDecoder
val ssc = new StreamingContext(sc, Seconds(10))
val topicSet = Set ("zed")
val kafkaParams = Map[String, String]("metadata.broker.list" -> "xx.xx.xxx.xx:9092")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicSet)
val lines = stream.map(_._2) // the message values from the (key, value) pairs
lines.saveAsTextFiles("hdfs://xxxxx:8020/user/admin/zed/steams3/")
This code works when saving the data into HDFS, even though it also saves a lot of empty data to HDFS.
Before writing this question I searched here and through some other questions like mine, but I didn't find a good solution.
Could you propose the best way to do this?
This is how my code looks now:
val sc = new SparkContext("local", "Hbase spark")
val tableName = "notz"
val conf = HBaseConfiguration.create()
conf.addResource(new Path("file:///opt/cloudera/parcels/CDH-5.4.7-1.cdh5.4.7.p0.3/etc/hbase/conf.dist/hbase-site.xml"))
conf.set(TableInputFormat.INPUT_TABLE, tableName)
val admin = new HBaseAdmin(conf)
lines.foreachRDD(rdd => {
if(!admin.isTableAvailable(tableName)) {
print("Creating GHbase Table")
val tableDesc = new HTableDescriptor(tableName)
tableDesc.addFamily(new HColumnDescriptor("zone"
.getBytes()))
admin.createTable(tableDesc)
}else{
print("Table already exists!!")
}
val myTable = new HTable(conf, tableName)
// i'm blocked here
})
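Since the code above stops at the "i'm blocked here" comment, here is a minimal, untested sketch of one common way to finish it: parse each message and write one Put per record, creating the HTable inside foreachPartition so that nothing unserializable is captured by the closure. It assumes each Kafka message is a comma-separated rowkey,column,value triple and uses the "zone" column family created above; adjust the parsing and the family/qualifier layout to your actual data.
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

lines.foreachRDD(rdd => {
  if (!rdd.partitions.isEmpty) {
    rdd.foreachPartition(records => {
      // create the HBase client on the executor, once per partition
      val hConf = HBaseConfiguration.create()
      val table = new HTable(hConf, tableName)
      records.foreach(line => {
        // assumed message layout: rowkey,column,value
        val Array(rowKey, column, value) = line.split(",", 3)
        val put = new Put(Bytes.toBytes(rowKey))
        put.add(Bytes.toBytes("zone"), Bytes.toBytes(column), Bytes.toBytes(value))
        table.put(put)
      })
      table.close()
    })
  }
})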

Avro specific vs generic record types - which is best or can I convert between?

We’re trying to decide between providing generic vs specific record formats for consumption by our clients
with an eye to providing an online schema registry clients can access when the schemas are updated.
We expect to send out serialized blobs prefixed with a few bytes denoting the version number so schema
retrieval from our registry can be automated.
Now, we’ve come across code examples illustrating the relative adaptability of the generic format for
schema changes but we’re reluctant to give up the type safety and ease-of-use provided by the specific
format.
Is there a way to obtain the best of both worlds? I.e., could we work with and manipulate the specific generated
classes internally, and then have them converted to generic records automatically just before serialization?
Clients would then deserialize the generic records (after looking up the schema).
Also, could clients convert these generic records they received to specific ones at a later time? Some small code examples would be helpful!
Or are we looking at this all the wrong way?
What you are looking for is the Confluent Schema Registry service and the libraries that help integrate with it.
Here is a sample that serializes and deserializes Avro data with an evolving schema. Please note the sample is from Kafka.
import io.confluent.kafka.serializers.KafkaAvroDeserializer;
import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.apache.avro.generic.GenericRecord;
import org.apache.commons.codec.DecoderException;
import org.apache.commons.codec.binary.Hex;
import java.util.HashMap; import java.util.Map;
public class ConfluentSchemaService {
public static final String TOPIC = "DUMMYTOPIC";
private KafkaAvroSerializer avroSerializer;
private KafkaAvroDeserializer avroDeserializer;
public ConfluentSchemaService(String conFluentSchemaRigistryURL) {
//PropertiesMap
Map<String, String> propMap = new HashMap<>();
propMap.put("schema.registry.url", conFluentSchemaRigistryURL);
// Output afterDeserialize should be a specific Record and not Generic Record
propMap.put("specific.avro.reader", "true");
avroSerializer = new KafkaAvroSerializer();
avroSerializer.configure(propMap, true);
avroDeserializer = new KafkaAvroDeserializer();
avroDeserializer.configure(propMap, true);
}
public String hexBytesToString(byte[] inputBytes) {
return Hex.encodeHexString(inputBytes);
}
public byte[] hexStringToBytes(String hexEncodedString) throws DecoderException {
return Hex.decodeHex(hexEncodedString.toCharArray());
}
public byte[] serializeAvroPOJOToBytes(GenericRecord avroRecord) {
return avroSerializer.serialize(TOPIC, avroRecord);
}
public Object deserializeBytesToAvroPOJO(byte[] avroBytearray) {
return avroDeserializer.deserialize(TOPIC, avroBytearray);
} }
The following classes have all the code you are looking for:
io.confluent.kafka.serializers.KafkaAvroDeserializer;
io.confluent.kafka.serializers.KafkaAvroSerializer;
Please follow the link for more details:
http://bytepadding.com/big-data/spark/avro/avro-serialization-de-serialization-using-confluent-schema-registry/
Can I convert between them?
I wrote the following Kotlin code to convert from a SpecificRecord to a GenericRecord and back, via JSON.
PositionReport is a class generated from Avro with the Avro plugin for Gradle; it is:
@org.apache.avro.specific.AvroGenerated
public class PositionReport extends org.apache.avro.specific.SpecificRecordBase implements org.apache.avro.specific.SpecificRecord {
...
The functions used are below
/**
* Encodes a record in AVRO Compatible JSON, meaning union types
* are wrapped. For prettier JSON just use the Object Mapper
* @param pos PositionReport
* @return String
*/
private fun PositionReport.toAvroJson() : String {
val writer = SpecificDatumWriter(PositionReport::class.java)
val baos = ByteArrayOutputStream()
val jsonEncoder = EncoderFactory.get().jsonEncoder(this.schema, baos)
writer.write(this, jsonEncoder)
jsonEncoder.flush()
return baos.toString("UTF-8")
}
/**
* Converts from a GenericRecord into JSON. It would seem smarter, however,
* to unify this function and the one above, but whatevs
* @param record GenericRecord
* @param schema Schema
*/
private fun GenericRecord.toAvroJson(): String {
val writer = GenericDatumWriter<Any>(this.schema)
val baos = ByteArrayOutputStream()
val jsonEncoder = EncoderFactory.get().jsonEncoder(this.schema, baos)
writer.write(this, jsonEncoder)
jsonEncoder.flush()
return baos.toString("UTF-8")
}
/**
* Takes a Generic Record of a position report and hopefully turns
* it into a position report... maybe it will work
* @param gen GenericRecord
* @return PositionReport
*/
private fun toPosition(gen: GenericRecord) : PositionReport {
if (gen.schema != PositionReport.getClassSchema()) {
throw Exception("Cannot convert GenericRecord to PositionReport as the Schemas do not match")
}
// We will convert into JSON - and use that to then convert back to the SpecificRecord
// Probably there is a better way
val json = gen.toAvroJson()
val reader: DatumReader<PositionReport> = SpecificDatumReader(PositionReport::class.java)
val decoder: Decoder = DecoderFactory.get().jsonDecoder(PositionReport.getClassSchema(), json)
val pos = reader.read(null, decoder)
return pos
}
/**
* Converts a Specific Record to a Generic Record (I think)
* @param pos PositionReport
* @return GenericData.Record
*/
private fun toGenericRecord(pos: PositionReport): GenericData.Record {
val json = pos.toAvroJson()
val reader : DatumReader<GenericData.Record> = GenericDatumReader(pos.schema)
val decoder: Decoder = DecoderFactory.get().jsonDecoder(pos.schema, json)
val datum = reader.read(null, decoder)
return datum
}
There are, however, a couple of differences between the two:
Fields in the SpecificRecord that are of Instant type will be encoded in the GenericRecord as long, and enums are slightly different.
So, for example, in my unit test of this function, time fields are tested like this:
val gen = toGenericRecord(basePosition)
assertEquals(basePosition.getIgtd().toEpochMilli(), gen.get("igtd"))
And enums are validated by string:
val gen = toGenericRecord(basePosition)
assertEquals(basePosition.getSource().toString(), gen.get("source").toString())
So to convert between the two you can do:
val gen = toGenericRecord(basePosition)
val newPos = toPosition(gen)
assertEquals(newPos, basePosition)

Slick 2 - Update columns in a table and return whole table object

How would you update a few columns in a table while returning the entire updated row (the mapped table object) when using Slick?
Assuming SomeTables is some TableQuery, you would typically write a query like this if you want to, for example, add an item to the table (and returning the newly added item)
val returnedItem = SomeTables returning SomeTables += someTable
How would you do the same if you want to update an item and get back the whole updated item? I suspect you would do something like this:
val q = SomeTables.filter(_.id === id).map(x => (x.someColumn,x.anotherColumn)) returning SomeTables
val returnedItem = q.update((3,"test"))
This code, however, does not work, and I can't find any documentation on how to do this.
Note that I am aware you can just query the item beforehand, update it, and then use copy on the original object; however, this requires a lot of boilerplate (and extra DB round trips).
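For reference, the query-then-copy workaround mentioned above looks roughly like this. This is only a sketch: it assumes the driver's simple._ is in scope, an implicit Session, and that SomeTables maps to a case class (here called SomeTable) with someColumn and anotherColumn fields:
def updateAndFetch(id: Int, newSome: Int, newAnother: String)(implicit session: Session): Option[SomeTable] = {
  val q = SomeTables.filter(_.id === id)
  q.firstOption.map { existing =>  // first round trip: read the current row
    val updated = existing.copy(someColumn = newSome, anotherColumn = newAnother)
    q.update(updated)              // second round trip: write the whole row back
    updated
  }
}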
This feature is not supported in Slick (v2 or v3-M1); although I don't see any specific reason prohibiting its implementation, UPDATE ... RETURNING is not a standard SQL feature (for example, H2 does not support it: http://www.h2database.com/html/grammar.html#update). I'll leave it as an exercise to the reader to explore how one might safely and efficiently emulate the feature for RDBMSes lacking UPDATE ... RETURNING.
When you call "returning" on a scala.slick.lifted.Query, it gives you a JdbcInsertInvokerComponent$ReturningInsertInvokerDef. You'll find no update method, although there is an insertOrUpdate method; however, insertOrUpdate only returns the returning expression result if an insert occurs, None is returned for updates, so no help here.
From this we can conclude that if you want to use the UPDATE ... RETURNING SQL feature, you'll either need to use StaticQuery or roll your own patch to Slick. You can manually write your queries (and re-implement your table projections as GetResult / SetParameter serializers), or you can try this snippet of code:
package com.spingo.slick
import scala.slick.driver.JdbcDriver.simple.{queryToUpdateInvoker, Query}
import scala.slick.driver.JdbcDriver.{updateCompiler, queryCompiler, quoteIdentifier}
import scala.slick.jdbc.{ResultConverter, CompiledMapping, JdbcBackend, JdbcResultConverterDomain, GetResult, SetParameter, StaticQuery => Q}
import scala.slick.util.SQLBuilder
import slick.ast._
object UpdateReturning {
implicit class UpdateReturningInvoker[E, U, C[_]](updateQuery: Query[E, U, C]) {
def updateReturning[A, F](returningQuery: Query[A, F, C], v: U)(implicit session: JdbcBackend#Session): List[F] = {
val ResultSetMapping(_,
CompiledStatement(_, sres: SQLBuilder.Result, _),
CompiledMapping(_updateConverter, _)) = updateCompiler.run(updateQuery.toNode).tree
val returningNode = returningQuery.toNode
val fieldNames = returningNode match {
case Bind(_, _, Pure(Select(_, col), _)) =>
List(col.name)
case Bind(_, _, Pure(ProductNode(children), _)) =>
children map { case Select(_, col) => col.name } toList
case Bind(_, TableExpansion(_, _, TypeMapping(ProductNode(children), _, _)), Pure(Ref(_), _)) =>
children map { case Select(_, col) => col.name } toList
}
implicit val pconv: SetParameter[U] = {
val ResultSetMapping(_, compiled, CompiledMapping(_converter, _)) = updateCompiler.run(updateQuery.toNode).tree
val converter = _converter.asInstanceOf[ResultConverter[JdbcResultConverterDomain, U]]
SetParameter[U] { (value, params) =>
converter.set(value, params.ps)
}
}
implicit val rconv: GetResult[F] = {
val ResultSetMapping(_, compiled, CompiledMapping(_converter, _)) = queryCompiler.run(returningNode).tree
val converter = _converter.asInstanceOf[ResultConverter[JdbcResultConverterDomain, F]]
GetResult[F] { p => converter.read(p.rs) }
}
val fieldsExp = fieldNames map (quoteIdentifier) mkString ", "
val sql = sres.sql + s" RETURNING ${fieldsExp}"
val unboundQuery = Q.query[U, F](sql)
unboundQuery(v).list
}
}
}
I'm certain the above can be improved; I've written it based on my somewhat limited understanding of Slick internals, and it works for me and can leverage the projections / type-mappings you've already defined.
Usage:
import com.spingo.slick.UpdateReturning._
val tq = TableQuery[MyTable]
val st = tq filter(_.id === 1048003) map { e => (e.id, e.costDescription) }
st.updateReturning(tq map (identity), (1048003, Some("such cost")))