Sorted table with a map in Apache Ignite - sql

I initially want to accomplish something simple with Ignite. I have a type like this (simplified):
case class Product(version: Long, attributes: Map[String, String])
I have a key for each one to store it by (it's one of the attributes).
I'd like to store them such that I can retrieve a subset of them between two version numbers or, at the very least, WHERE version > n. The problem is that the cache API only seems to support either retrieval by key or table scan. On the other hand, SQL99 doesn't seem to have any kind of map type.
I was thinking I'd need to use a binary marshaller, but the docs say:
There is a set of 'platform' types that includes primitive types, String, UUID, Date, Timestamp, BigDecimal, Collections, Maps and arrays of thereof that will never be represented as a BinaryObject.
So... maps are supported?
Here's my test code. It fails with java.lang.IllegalArgumentException: Cache is not configured: ignite-sys-cache, though. Any help getting a simple test working would really aid my understanding of how this is supposed to work.
Oh, and also, do I need to configure the schema in the Ignite config file? Or are the field attributes a sufficient alternative to that?
import scala.annotation.meta.field
import scala.collection.JavaConverters._

import org.apache.ignite.Ignition
import org.apache.ignite.cache.query.SqlQuery
import org.apache.ignite.cache.query.annotations.QuerySqlField
import org.apache.ignite.configuration.CacheConfiguration

case class Product(
  @(QuerySqlField @field)(index = true) version: Long,
  attributes: java.util.Map[String, String]
)
object Main {
  val TestProduct = Product(2L, Map("pid" -> "123", "foo" -> "bar", "baz" -> "quux").asJava)

  def main(args: Array[String]): Unit = {
    Ignition.setClientMode(true)
    val ignite = Ignition.start()
    val group = ignite.cluster.forServers
    val cacheConfig = new CacheConfiguration[String, Product]
    cacheConfig.setName("inventory1")
    cacheConfig.setIndexedTypes(classOf[String], classOf[Product])
    val cache = ignite.getOrCreateCache(cacheConfig)
    cache.put("P123", TestProduct)
    val query = new SqlQuery(classOf[Product], "select * from Product where version > 1")
    val resultSet = cache.query(query)
    println(resultSet)
  }
}

Ignite supports querying by indexed fields. Since version is a regular indexed field, the queries you describe should be feasible.
I've checked your code and it works on my side.
Please check that the Ignite version is consistent across all the nodes.
If you provide the full logs, I can take a look.
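For the range part of the question, here is a minimal sketch (in Kotlin rather than Scala, and assuming Ignite 2.x) of a range query on the indexed field. The @QuerySqlField annotation plus setIndexedTypes should be enough on their own, with no separate schema in the Ignite config file; the attributes map is stored inside the binary value but is not visible to SQL.

import org.apache.ignite.Ignition
import org.apache.ignite.cache.query.SqlFieldsQuery
import org.apache.ignite.cache.query.annotations.QuerySqlField
import org.apache.ignite.configuration.CacheConfiguration

// Kotlin counterpart of the question's Product: only the annotated `version`
// field is queryable; the map travels inside the binary value.
data class Product(
    @field:QuerySqlField(index = true) val version: Long,
    val attributes: Map<String, String>
)

fun main() {
    Ignition.start().use { ignite ->
        val cacheConfig = CacheConfiguration<String, Product>("inventory1")
            .setIndexedTypes(String::class.java, Product::class.java)
        val cache = ignite.getOrCreateCache(cacheConfig)

        cache.put("P123", Product(2L, mapOf("pid" to "123", "foo" to "bar")))
        cache.put("P456", Product(7L, mapOf("pid" to "456")))

        // Range query on the indexed field: everything with 1 < version < 10.
        val query = SqlFieldsQuery(
            "select _key, version from Product where version > ? and version < ?"
        ).setArgs(1L, 10L)

        cache.query(query).use { cursor ->
            cursor.forEach { row -> println("key=${row[0]} version=${row[1]}") }
        }
    }
}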

Controller returning reactive paged result from total item count as Mono, items as Flux

I've got two endpoints returning the same data in two different JSON formats.
The first endpoint returns a JSON-array, and starts the response right away.
#Get("/snapshots-all")
fun allSnapshots(
#Format("yyyy-MM-dd") cutoffDate: LocalDate
): Flux<PersonVm> = snapshotDao.getAllSnapshots(cutoffDate)
The next endpoint, which returns a paged result, is more sluggish. It starts the response only when both streams have completed. It also requires a whole lot more memory than the previous endpoint, even though the previous endpoint returns all records from BigQuery.
#Get("/snapshots")
fun snapshots(
#Format("yyyy-MM-dd") cutoffDate: LocalDate,
pageable: Pageable
): Mono<Page<PersonVm>> = Mono.zip(
snapshotDao.getSnapshotCount(cutoffDate),
snapshotDao.getSnapshots(
cutoffDate,
pageable.size,
pageable.offset
).collectList()
).map {
CustomPage(
items = it.t2,
totalNumberOfItems = it.t1,
pageable = pageable
)
}
(Question update) BigQuery is at the bottom of this endpoint. The strength of BigQuery compared to e.g. Postgres is querying huge tables; the weakness is relatively high latency for simple queries. Hence I'm running the queries in parallel in order to keep the endpoint's latency at a minimum. Running the queries in sequence adds at least a second to the total processing time.
Question is: Is there a possible rewrite of the chain that will speed up the /snapshots endpoint?
Solution requirements (question update after suggested approaches)
The consumer of this endpoint is external to the project, and every endpoint in this project is documented at a detailed level. Hence, pagination information may occur only once in the returned JSON. Otherwise, feel free to suggest new types for returning pagination along with the PersonVm collection.
If it turns out that another solution is impossible, that's an answer as well.
SnapshotDao#getSnapshotCount returns a Mono<Long>
SnapshotDao#getSnapshots returns a Flux<PersonVm>
PersonVm is defined like this:
@Introspected
data class PersonVm(
    val volatilePersonId: UUID,
    val cases: List<PublicCaseSnapshot>
)
CustomPage is defined like this:
@Introspected
data class CustomPage<T>(
    private val items: List<T> = listOf(),
    private val totalNumberOfItems: Long,
    private val pageable: Pageable
) : Page<T> {
    override fun getContent(): MutableList<T> = items.toMutableList()
    override fun getTotalSize(): Long = totalNumberOfItems
    override fun getPageable(): Pageable = pageable
}
PublicCaseSnapshot is a complex structure, and left out for brevity. It should not be required for solving this issue.
Code used while testing the approach suggested by @Denis
In this approach, the chain starts with SnapshotDao#getSnapshotCount and is mapped into an HttpResponse instance whose body contains the Flux<PersonVm>, with the total item count in a header.
The queries now run in sequence, and numerous comparison tests between the code below and the existing code showed that the original code performs better (by approximately 1 second). Different page sizes were used during the tests, and BigQuery was warmed up by running the same query multiple times. The best results were recorded.
Please note that in cases where the time spent on the total item count query is negligible (or the total item count is cacheable) and pagination is not required to be part of the JSON, this should be considered a viable approach.
#Get("/snapshots-with-total-count-in-header")
fun snapshotsWithTotalCountInHeader(
#Format("yyyy-MM-dd") cutoffDate: LocalDate,
pageable: Pageable
): Mono<HttpResponse<Flux<PersonVm>>> = snapshotDao.getSnapshotCount(cutoffDate)
.map { totalItemCount ->
HttpResponse.ok(
snapshotDao.getSnapshots(
cutoffDate,
pageable.size,
pageable.offset
)
).apply {
headers.add("total-item-count", totalItemCount.toString())
}
}
You need to rewrite the method to return a publisher of the items. I can see a few options here:
Return the pagination information in the header. Your method will have return type Mono<HttpResponse<Flux<PersonVm>>>.
Return the pagination information on every item: Flux<Tuple<PageInfo, PersonVm>>
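A minimal sketch of the second option, reusing the question's controller and DAO. PagedItem is a hypothetical wrapper standing in for the tuple, and note that the count query still completes before the items start streaming here, so this trades the original parallelism for a streaming response body:

// Hypothetical wrapper: the total count is repeated on every streamed item.
@Introspected
data class PagedItem<T>(val totalNumberOfItems: Long, val item: T)

@Get("/snapshots-streaming")
fun snapshotsStreaming(
    @Format("yyyy-MM-dd") cutoffDate: LocalDate,
    pageable: Pageable
): Flux<PagedItem<PersonVm>> =
    snapshotDao.getSnapshotCount(cutoffDate)
        .flatMapMany { total ->
            snapshotDao.getSnapshots(cutoffDate, pageable.size, pageable.offset)
                .map { item -> PagedItem(total, item) }
        }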

Gson api parsing issue Kotlin

I'm trying to parse the JSON returned by the following API call (recipe and ingredientLines only):
https://api.edamam.com/search?q=khachapuri&app_id=xxx&app_key=yyy
My model for GSON looks like this:
class FoodModel {
    var label: String = "Yummy"
    var image: String = "https://agenda.ge/files/khachapuri.jpg"
    var ingredientLines = ""
}
After launching the app, I'm facing the following error:
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at line 1 column 2 path $
I think I'm writing the model class incorrectly, because the structure of the JSON is not clear to me. This is how I'm trying to use Gson: val foodItems = Gson().fromJson(response, Array<FoodModel>::class.java). Can anyone help?
The JSON object returned by the API has a slightly different structure compared to your model.
In particular, the API returns a complex object that you need to traverse in order to extract the information you are interested in. A high-level example (I'm not able to test it, but hopefully you'll get the gist of it):
data class Response(
    val hits: List<Hit>
)

data class Hit(
    val recipe: Recipe
)

data class Recipe(
    val label: String,
    val image: String
)
val foodItems = Gson().fromJson(response, Response::class.java)
Just be aware that Gson may create instances in an unsafe manner, which means you may experience NullPointerExceptions thrown apparently for no reason. If you want to prove it, just rename image to anything else (you can try other fields too; it doesn't matter), and you'll see its value is null even though the type is non-nullable.
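Since the question also wants ingredientLines, here is a hedged extension of the same idea with nullable fields, so a missing or renamed key shows up as an explicit null instead of a surprise NullPointerException. I'm assuming ingredientLines is a JSON array of strings, as the question's model suggests:

import com.google.gson.Gson

// Nullable fields make Gson's behaviour on missing or renamed keys explicit.
data class Response(val hits: List<Hit>?)
data class Hit(val recipe: Recipe?)
data class Recipe(
    val label: String?,
    val image: String?,
    val ingredientLines: List<String>?   // assumed to be an array of strings
)

fun parseRecipes(response: String): List<Recipe> =
    Gson().fromJson(response, Response::class.java)
        .hits.orEmpty()
        .mapNotNull { it.recipe }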

How to prepare a nested data structure for a data-driven test in Karate?

I currently use JUnit 5, WireMock and REST Assured for my integration tests. Karate looks very promising, yet I am struggling a bit with the setup of data-driven tests, as I need to prepare a nested data structure which, in the current setup, looks like the following:
abstract class StationRequests(val stations: Collection<String>) : ArgumentsProvider {
    override fun provideArguments(context: ExtensionContext): java.util.stream.Stream<out Arguments> {
        val now = LocalDateTime.now()
        val samples = mutableListOf<Arguments>()
        stations.forEach { station ->
            Subscription.values().forEach { subscription ->
                listOf(
                    *Device.values(),
                    null
                ).forEach { device ->
                    Stream.Protocol.values().forEach { protocol ->
                        listOf(
                            null,
                            now.minusMinutes(5),
                            now.minusHours(2),
                            now.minusDays(1)
                        ).forEach { startTime ->
                            samples.add(
                                Arguments.of(
                                    subscription, device, station, protocol, startTime
                                )
                            )
                        }
                    }
                }
            }
        }
        return java.util.stream.Stream.of(*samples.toTypedArray())
    }
}
Is there a preferred way to set up such nested data structures with Karate? I initially thought about defining 5 different arrays with sample values for subscription, device, station, protocol and startTime, and then combining and merging them into a single array to be used in the Examples: section.
I have not succeeded so far, though, and I am wondering whether there is a better way to prepare such nested data-driven tests.
I don't recommend nesting unless absolutely necessary. You may be able to "flatten" your permutations into a single table, something like this: https://github.com/intuit/karate/issues/661#issue-402624580
That said, look out for the alternate option to Examples: which just might work for your case: https://github.com/intuit/karate#data-driven-features
EDIT: In version 1.3.0, a new @setup life cycle was introduced that changes the example below a bit.
Here's a simple example:
Feature:
Scenario:
* def data = [{ rows: [{a: 1},{a: 2}] }, { rows: [{a: 3},{a: 4}] }]
* call read('called.feature@one') data
And this is called.feature:
@ignore
Feature:
@one
Scenario:
* print 'one:', __loop
* call read('called.feature@two') rows
@two
Scenario:
* print 'two:', __loop
* print 'value of a:', a
This is how it looks in the new HTML report (which is in 0.9.6.RC2 and may need more fine-tuning), and it shows off how Karate can support "nesting" even in the report, which Cucumber cannot do. Maybe you can provide feedback and let us know if it is ready for release :)
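If you do want to pre-compute the flattened permutation table mentioned above outside of Karate, here is a minimal Kotlin sketch. The enums are placeholders for the question's Subscription, Device and Stream.Protocol types, and the resulting permutations.json file could then be loaded in a feature with read():

import com.fasterxml.jackson.databind.ObjectMapper
import java.io.File
import java.time.LocalDateTime

// Placeholder enums standing in for the question's Subscription, Device and
// Stream.Protocol types, which are not shown in full.
enum class Subscription { FREE, PREMIUM }
enum class Device { ANDROID, IOS }
enum class Protocol { HLS, DASH }

fun main() {
    val now = LocalDateTime.now()
    val stations = listOf("station-1", "station-2")
    val startTimes = listOf<LocalDateTime?>(null, now.minusMinutes(5), now.minusHours(2), now.minusDays(1))

    // One flat row per combination; nulls are kept so the edge cases stay covered.
    val rows = mutableListOf<Map<String, Any?>>()
    for (station in stations)
        for (subscription in Subscription.values())
            for (device in Device.values().toList() + null)
                for (protocol in Protocol.values())
                    for (startTime in startTimes)
                        rows += mapOf(
                            "station" to station,
                            "subscription" to subscription.name,
                            "device" to device?.name,
                            "protocol" to protocol.name,
                            "startTime" to startTime?.toString()
                        )

    // A Karate feature could load this file with read('permutations.json').
    File("permutations.json").writeText(ObjectMapper().writeValueAsString(rows))
}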

Querying GemFire Region by partial key

When the key in a GemFire Region is a composite of id1 and id2, and the Region is partitioned by id1, what is the best way of getting all the entries whose key matches id1?
Couple of options that we are thinking of:
Create another index on id1. If we do that, we are wondering whether queries would go against all partitions of the Region.
Write a data-aware Function and filter by (id1, null) to target the specific partition, then use an index in the local Region via the QueryService?
Can you please let me know if there is any other way to achieve a query by partial key?
Well, it could be implemented (optimally) by using a combination of #1 and #2 in your "options" above (depending on whether your application domain object also stores/references the key, which would be the case if you were using SD[G] Repositories).
This might be best explained with the docs and an example, particularly using the PartitionResolver interface Javadoc.
Say your "composite" Key was implemented as follows:
class CompositeKey implements PartitionResolver<CompositeKey, Object> {

  private final Object idOne;
  private final Object idTwo;

  CompositeKey(Object idOne, Object idTwo) {
    // argument validation as necessary
    this.idOne = idOne;
    this.idTwo = idTwo;
  }

  public String getName() {
    return "MyCompositeKeyPartitionResolver";
  }

  // Route on idOne only, so all entries sharing idOne land in the same bucket.
  public Object getRoutingObject(EntryOperation<CompositeKey, Object> opDetails) {
    return idOne;
  }
}
Then, you could invoke a Function that queries the results you desire by using...
Execution execution = FunctionService.onRegion("PartitionRegionName");
Optionally, you could use the returned Execution to filter on just the (complex) Keys you wanted to query (further qualify) when invoking the Function...
CompositeKey[] filter = { .. };
execution.withFilter(Arrays.stream(filter).collect(Collectors.toSet()));
Of course, this is problematic if you do not know your keys in advance.
Then you might prefer to use the CompositeKey to identify your application domain object, which is necessary when using SD[G]'s Repository abstraction/extension:
#Region("MyPartitionRegion")
class ApplicationDomainObject {
#Id
CompositeKey identifier;
...
}
And then, you can code your Function to operate on the "local data set" of the Partition Region. That is, on each data node in the cluster that hosts the Partition Region (PR), the Function will only operate on the data set in the "buckets" of that PR hosted by that node, which is accomplished by doing the following:
class QueryPartitionRegionFunction implements Function {

  public void execute(FunctionContext<Object> functionContext) {

    RegionFunctionContext regionFunctionContext =
      (RegionFunctionContext) functionContext;

    Region<CompositeKey, ApplicationDomainObject> localDataSet =
      PartitionRegionHelper.getLocalDataForContext(regionFunctionContext);

    SelectResults<?> resultSet =
      localDataSet.query(String.format("identifier.idTwo = %s",
        regionFunctionContext.getArguments()));

    // process the result set and use ResultSender to send results
  }
}
Of course, all of this is much easier to do using SDG's Function annotation support (i.e. for implementing and invoking your Function).
Note that you invoke the Function onRegion using GemFire's FunctionService or, more conveniently, with SDG's annotation support for Function Executions, like so:
#OnRegion("MyPartitionRegion")
interface MyPartitionRegionFunctions {
#FunctionId("QueryPartitionRegion")
<return-type> queryPartitionRegion(..);
}
Then..
Object resultSet = myPartitionRegionFunctions.queryPartitionRegion(..);
Then, the FunctionContext will be a RegionFunctionContext (because you executed the Function on the PR, which executes on all nodes in the cluster hosting the PR).
Additionally, you use PartitionRegionHelper.getLocalDataForContext(:RegionFunctionContext) to get the local data set of the PR (i.e. the buckets, or just the shard of data in the entire PR (across all nodes) hosted by that node, which would be based on your "custom" PartitionResolver).
You can then query to further qualify, or filter, the data of interest. You can see that I queried (or further qualified) by idTwo, which was not part of the PartitionResolver implementation. Additionally, this would only be required in the (OQL) query predicate if you did not specify Keys in your Filter with the Execution (since, I think, that would take the entire "Key" (idOne & idTwo) into account, based on a properly implemented Object.equals() method on your CompositeKey class).
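As an aside, here is a hedged Kotlin sketch of the composite key (assuming the org.apache.geode packages; older GemFire releases use com.gemstone.gemfire instead): a data class gives you the equals()/hashCode() that the key-based Filter relies on for free, while still routing on idOne.

import org.apache.geode.cache.EntryOperation
import org.apache.geode.cache.PartitionResolver

// Data class: structural equals()/hashCode() come for free, which the Filter needs.
data class CompositeKey(val idOne: Any, val idTwo: Any) : PartitionResolver<CompositeKey, Any> {

    override fun getName(): String = "MyCompositeKeyPartitionResolver"

    // Route on idOne only, so entries sharing idOne end up in the same bucket.
    override fun getRoutingObject(opDetails: EntryOperation<CompositeKey, Any>): Any =
        opDetails.key.idOne

    override fun close() {}
}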
But, if you did not know the keys in advance, and/or (especially) if you are using SD[G]'s Repositories, then the CompositeKey would be part of your application domain object, which you could then index and query on (as shown above: identifier.idTwo = ?).
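To make that last point concrete, here is a hedged Spring Data GemFire sketch (the repository name, the payload field and the SDG 2.x package locations are assumptions) of querying on part of the embedded composite key:

import org.springframework.data.annotation.Id
import org.springframework.data.gemfire.mapping.annotation.Region
import org.springframework.data.gemfire.repository.Query
import org.springframework.data.repository.CrudRepository

// Domain type mirroring the answer's example; `payload` is a stand-in field.
@Region("MyPartitionRegion")
data class ApplicationDomainObject(
    @field:Id val identifier: CompositeKey,
    val payload: String
)

interface ApplicationDomainObjectRepository :
    CrudRepository<ApplicationDomainObject, CompositeKey> {

    // OQL against the embedded composite key; $1 binds the first method argument.
    @Query("SELECT * FROM /MyPartitionRegion x WHERE x.identifier.idOne = \$1")
    fun findByPartialKey(idOne: Any): List<ApplicationDomainObject>
}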
Hope this helps!
NOTE: I have not tested any of this, but hopefully it will point you in the right direction and/or give you further ideas.