OptaPlanner constraint streams - how to check multiple resources - Kotlin

My domain is:
A Requirement has several Variants of how it can be realized.
Each Variant needs multiple resources, each with an amount.
Each Resource has a limit.
I am trying to write a constraint that detects conflicts:
constraintFactory
.from(Requirement::class.java)
.groupBy(Requirement::variant::resourceUsageList, sum(Resource::amount))
...
but it doesn't work.
How can I get every used resource with its total used amount and compare that against the resource's limit?
I think I need something like flatMap after from.

First make sure that the List<Resource> on your @PlanningSolution has a @ProblemFactCollectionProperty, so that from(Resource.class) works.
Then I see multiple ways of doing it:
Proposal A)
from(Requirement.class)
.join(Resource.class) // Bi<Requirement, Resource>
.groupBy((requirement, resource) -> resource, sum((requirement, resource) -> requirement.variant.resourceUsage[resource.index]))
...
The downside of proposal A is that it creates a Cartesian product, so it can be costly memory-wise: 100 resources and 10 000 requirements already give 1 000 000 pairs.
Proposal B)
.from(Requirement::class.java)
.groupBy(Requirement::variant::resourceUsageList, new MyResourceUsageSumCollector(...))
...
For MyResourceUsageSumCollector, which has to keep one sum per resource, take inspiration from this sumLong implementation, which sums a single long:
public static <A> UniConstraintCollector<A, ?, Long> sumLong(ToLongFunction<? super A> groupValueMapping) {
return new DefaultUniConstraintCollector<>(
() -> new long[1],
(resultContainer, a) -> {
long value = groupValueMapping.applyAsLong(a);
resultContainer[0] += value;
return () -> resultContainer[0] -= value;
},
resultContainer -> resultContainer[0]);
}
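The key idea in sumLong is that the accumulator adds a value and returns an action that undoes exactly that addition. A minimal plain-Java sketch of the same pattern with one sum per resource could look like the following. It deliberately uses no OptaPlanner types; the class name ResourceUsageSums and the use of String resource names as keys are assumptions made for illustration, and wiring this into an actual collector is not shown.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the accumulate/undo pattern; no OptaPlanner types are used.
class ResourceUsageSums {
    // Running total per resource name (plays the role of the result container).
    private final Map<String, Long> totals = new HashMap<>();

    // Adds one requirement's usage and returns the action that reverts it.
    // The usage map must not be modified between add() and the undo call.
    Runnable add(Map<String, Long> usage) {
        usage.forEach((resource, amount) -> totals.merge(resource, amount, Long::sum));
        return () -> usage.forEach((resource, amount) ->
                // Returning null removes the entry once its total is back to zero.
                totals.computeIfPresent(resource,
                        (r, sum) -> sum - amount == 0 ? null : sum - amount));
    }

    // Snapshot of the current totals.
    Map<String, Long> result() {
        return new HashMap<>(totals);
    }

    public static void main(String[] args) {
        ResourceUsageSums sums = new ResourceUsageSums();
        sums.add(Map.of("cpu", 2L, "ram", 8L));
        Runnable undo = sums.add(Map.of("cpu", 3L));
        System.out.println(sums.result().get("cpu")); // 5
        undo.run();
        System.out.println(sums.result().get("cpu")); // 2
    }
}
```

The per-resource totals can then be compared against each resource's limit in a later step.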


mapping custom object kotlin

I have a custom object:
data class MoneyTransaction(
val amount: Double,
val category: String
)
I have a list of MoneyTransaction. I want to create a map out of that list where keys are categories, and the values are the total amount according to the category. Kotlin has functions like groupBy, groupByTo, groupingBy. But there is no tutorial or documentation about those, so I can't figure it out. So far I got this:
val map = transactionList.groupBy({it.category},{it.amount})
But this doesn't give the total amount, just separate amounts on each category
Any help would be much appreciated.
So first of all, you group your transactions by category:
transactionList.groupBy { it.category }
This gives you a Map<String, List<MoneyTransaction>>. After that, you need to sum up the amounts:
transactionList.groupBy { it.category }
.mapValues { (_, transactionsInCategory) ->
transactionsInCategory.sumOf { it.amount }
}
This will give you a Map<String, Double> with the value representing the sum of all transactions in the category.
You can use groupingBy and then fold:
transactions.groupingBy(MoneyTransaction::category)
.fold(0.0) { acc, next -> acc + next.amount }
groupingBy here would return a Grouping<MoneyTransaction, String>, which is an intermediate collection of the groups. Then you fold each of the groups by starting from 0, and adding the next transaction's amount.
Looking at the implementation, the groupingBy call doesn't do any actual grouping; it just creates a lazy Grouping object. So effectively, you go through the collection only once.
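For readers who know Java streams better, roughly the same aggregation can be sketched with Collectors.groupingBy and a downstream summing collector. This is a comparison only, not part of the Kotlin answer; the Java MoneyTransaction class below is a hypothetical stand-in for the Kotlin data class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CategoryTotals {
    // Hypothetical Java counterpart of the Kotlin MoneyTransaction data class.
    static final class MoneyTransaction {
        private final double amount;
        private final String category;

        MoneyTransaction(double amount, String category) {
            this.amount = amount;
            this.category = category;
        }

        double getAmount() { return amount; }
        String getCategory() { return category; }
    }

    // Groups by category and sums the amounts of each group while collecting.
    static Map<String, Double> totalsByCategory(List<MoneyTransaction> transactions) {
        return transactions.stream()
                .collect(Collectors.groupingBy(
                        MoneyTransaction::getCategory,
                        Collectors.summingDouble(MoneyTransaction::getAmount)));
    }

    public static void main(String[] args) {
        List<MoneyTransaction> transactions = Arrays.asList(
                new MoneyTransaction(10.0, "food"),
                new MoneyTransaction(700.0, "rent"),
                new MoneyTransaction(5.5, "food"));
        Map<String, Double> totals = totalsByCategory(transactions);
        System.out.println(totals.get("food")); // 15.5
        System.out.println(totals.get("rent")); // 700.0
    }
}
```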

Feedback on Lambdas

I was hoping someone could provide me some feedback on better/cleaner ways to do the following:
val driversToIncome = trips
.map { trip ->
// associate every driver to a cost (NOT UNIQUE)
trip.driver to trip.cost }
.groupBy (
// aggregate all costs that belong to a driver
keySelector = { (driver, _) -> driver },
valueTransform = { (_, cost) -> cost }
)
.map { (driver, costs) ->
// sum all costs for each driver
driver to costs.sum() }
.toMap()
You can do it like this:
val driversToIncome = trips
.groupingBy { it.driver }
.fold(0) { acc, trip -> acc + trip.cost }
It groups trips by driver and while grouping it sums costs per each driver separately.
Note that groupingBy() does not do anything on its own, it only prepares for the grouping operation. This solution avoids creating intermediary collections, it does everything in a single loop.
Then fold() calls the provided lambda sequentially on each item belonging to a specific group. The lambda receives the result of the previous call (the accumulator) and returns a new result. In this way, fold() reduces each group's items to a single value.
You can read more about these kinds of transformations in the documentation on Grouping and Aggregation. They aren't Kotlin inventions either: such operations exist in other languages and data transformation tools, so you can even read about them on Wikipedia.
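To make the mechanics concrete, here is a rough sketch of what a per-group fold does, written as an explicit single loop in Java. It is not Kotlin's actual implementation; the helper foldByKey and the Trip class are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

class GroupFold {
    // Hypothetical trip with a driver name and an integer cost.
    static final class Trip {
        final String driver;
        final int cost;

        Trip(String driver, int cost) {
            this.driver = driver;
            this.cost = cost;
        }
    }

    // Visits every item once; each key keeps its own accumulator, starting at `initial`.
    static <T, K, R> Map<K, R> foldByKey(List<T> items, Function<T, K> keyOf,
                                         R initial, BiFunction<R, T, R> step) {
        Map<K, R> result = new LinkedHashMap<>();
        for (T item : items) {
            K key = keyOf.apply(item);
            R acc = result.getOrDefault(key, initial); // previous result for this group
            result.put(key, step.apply(acc, item));    // new result for this group
        }
        return result;
    }

    public static void main(String[] args) {
        List<Trip> trips = Arrays.asList(
                new Trip("ann", 10), new Trip("bob", 7), new Trip("ann", 5));
        Map<String, Integer> income =
                foldByKey(trips, t -> t.driver, 0, (acc, t) -> acc + t.cost);
        System.out.println(income); // {ann=15, bob=7}
    }
}
```

No intermediate list of costs per driver is ever built, which is the point the answer makes about avoiding intermediary collections.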

Comparing and removing object from ArrayLists using Java 8

My apologies if this is basic information that I should already know. This is the first time I am trying to use Java 8 streams and other features.
I have two ArrayLists containing the same type of objects, say list1 and list2. Let's say the lists hold Person objects with a property "employeeId".
The scenario is that I need to merge these lists. However, list2 may contain some objects that are the same as in list1. So I am trying to remove from list2 the objects that also appear in list1, to get a result list that I can then merge into list1.
I am trying to do this with the Java 8 removeIf() and stream() features. The following is my code:
public List<PersonDto> removeDuplicates(List<PersonDto> list1, List<PersonDto> list2) {
List<PersonDto> filteredList = list2.removeIf(list2Obj -> {
list1.stream()
.anyMatch( list1Obj -> (list1Obj.getEmployeeId() == list2Obj.getEmployeeId()) );
} );
}
The above code is giving compile error as below:
The method removeIf(Predicate) in the type Collection is not applicable for the arguments (( list2Obj) -> {})
So I changed the list2Obj at the start of "removeIf()" to (<PersonDto> list2Obj) as below:
public List<PersonDto> removeDuplicates(List<PersonDto> list1, List<PersonDto> list2) {
List<PersonDto> filteredList = list2.removeIf((<PersonDto> list2Obj) -> {
list1.stream()
.anyMatch( list1Obj -> (list1Obj.getEmployeeId() == list2Obj.getEmployeeId()) );
} );
}
This gives me an error as below:
Syntax error on token "<", delete this token for the '<' in (<PersonDto> list2Obj) and Syntax error on token(s), misplaced construct(s) for the part from '-> {'
I am at a loss as to what I need to do to make it work.
I would appreciate it if somebody could help me resolve this issue.
I've simplified your function just a little bit to make it more readable:
public static List<PersonDto> removeDuplicates(List<PersonDto> left, List<PersonDto> right) {
left.removeIf(p -> {
return right.stream().anyMatch(x -> (p.getEmployeeId() == x.getEmployeeId()));
});
return left;
}
Also notice that you are modifying the left parameter; you are not creating a new List.
You could also use: left.removeAll(right), but you need equals and hashcode for that and it seems you don't have them; or they are based on something else than employeeId.
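As a sketch of what the removeAll route would require, the following assumes a minimal PersonDto whose equals and hashCode are based only on employeeId. The real PersonDto is not shown in the question, so its fields and the long id type are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RemoveAllById {
    // Hypothetical minimal PersonDto; the real class is not shown in the question.
    static final class PersonDto {
        private final long employeeId;

        PersonDto(long employeeId) { this.employeeId = employeeId; }

        long getEmployeeId() { return employeeId; }

        // Two DTOs are considered equal when their employee ids match.
        @Override
        public boolean equals(Object o) {
            return o instanceof PersonDto && ((PersonDto) o).employeeId == employeeId;
        }

        // Must be consistent with equals so hash-based collections also work.
        @Override
        public int hashCode() { return Long.hashCode(employeeId); }
    }

    public static void main(String[] args) {
        List<PersonDto> left = new ArrayList<>(Arrays.asList(
                new PersonDto(1), new PersonDto(2), new PersonDto(3)));
        List<PersonDto> right = Arrays.asList(new PersonDto(2), new PersonDto(4));

        left.removeAll(right); // relies on the equals() defined above
        for (PersonDto p : left) {
            System.out.println(p.getEmployeeId()); // prints 1, then 3
        }
    }
}
```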
Another option would be to collect those lists to a TreeSet and use removeAll:
TreeSet<PersonDto> leftTree = left.stream()
.collect(Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(PersonDto::getEmployeeId))));
TreeSet<PersonDto> rightTree = right.stream()
.collect(Collectors.toCollection(() -> new TreeSet<>(Comparator.comparing(PersonDto::getEmployeeId))));
leftTree.removeAll(rightTree);
I understand you are trying to merge both lists without duplicating the elements that belong to the intersection. There are many ways to do this. One is the way you've tried, i.e. remove elements from one list that belong to the other, then merge. And this, in turn, can be done in several ways.
One of these ways would be to keep the employee ids of one list in a HashSet and then use removeIf on the other list, with a predicate that checks whether each element has an employee id that is contained in the set. This is better than using anyMatch on the second list for each element of the first list, because HashSet.contains runs in O(1) amortized time. Here's a sketch of the solution:
// Determine larger and smaller lists
boolean list1Smaller = list1.size() < list2.size();
List<PersonDto> smallerList = list1Smaller ? list1 : list2;
List<PersonDto> largerList = list1Smaller ? list2 : list1;
// Create a Set with the employee ids of the larger list
// Assuming employee ids are long
Set<Long> largerSet = largerList.stream()
.map(PersonDto::getEmployeeId)
.collect(Collectors.toSet());
// Now remove elements from the smaller list
smallerList.removeIf(dto -> largerSet.contains(dto.getEmployeeId()));
The logic behind this is that HashSet.contains will take the same time for both a large and a small set, because it runs in O(1) amortized time. However, traversing a list and removing elements from it will be faster on smaller lists.
Then, you are ready to merge both lists:
largerList.addAll(smallerList);
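Putting the set-based idea together, here is a minimal sketch of a variant that leaves both input lists untouched and returns a new merged list instead of mutating one of them. The method name mergeById and the minimal PersonDto with a long employeeId are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListMerge {
    // Hypothetical minimal PersonDto; only the id matters for this sketch.
    static final class PersonDto {
        private final long employeeId;

        PersonDto(long employeeId) { this.employeeId = employeeId; }

        long getEmployeeId() { return employeeId; }
    }

    // Returns all of list1 followed by the elements of list2 whose id is not in list1.
    static List<PersonDto> mergeById(List<PersonDto> list1, List<PersonDto> list2) {
        Set<Long> seenIds = new HashSet<>();
        List<PersonDto> merged = new ArrayList<>(list1);
        for (PersonDto p : list1) {
            seenIds.add(p.getEmployeeId());
        }
        for (PersonDto p : list2) {
            // HashSet lookup instead of scanning list1 for every element of list2.
            if (!seenIds.contains(p.getEmployeeId())) {
                merged.add(p);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<PersonDto> list1 = Arrays.asList(new PersonDto(1), new PersonDto(2));
        List<PersonDto> list2 = Arrays.asList(new PersonDto(2), new PersonDto(3));
        for (PersonDto p : mergeById(list1, list2)) {
            System.out.println(p.getEmployeeId()); // prints 1, 2, 3
        }
    }
}
```

Returning a new list costs one extra copy but avoids surprising callers who still hold references to the original lists.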

How to query an index with a subcollection that has date ranges in RavenDB?

I prepared the full test case here: https://gist.github.com/pkrakowiak/cc8addf5725193a01f2d
There are Location documents. Each location can have zero or more sponsors during some time periods (represented by the IList<Sponsorship> Sponsors property). I need to return only those locations that are sponsored on a particular day (say 15th of March in my example). So such location must have at least one Sponsorship instance that matches the following query: .Where(x => x.Sponsors.Any(s => s.From <= today && s.To >= today))
I prepared two tests, one is not using an index explicitly: CanGetCurrentlySponsoredLocations, and one which uses a static index that I created: CanGetCurrentlySponsoredLocationsUsingStaticIndex. The first one will pass, the second one will fail. The question is - how do I make the second test pass? What sort of modifications do I need to apply to my Locations_ByCoordinates index?
In case you are wondering where the index name came from or what the reviews are - just ignore them. :) They are leftovers from other things that I was testing.
Update
I took this question first to the official RavenDB Google group: https://groups.google.com/forum/?fromgroups=#!topic/ravendb/ySUPXqkpTA8 Sadly, it did not bring me a solution.
The simplest index that will pass your unit test is:
private class Locations_ByCoordinates : AbstractIndexCreationTask<Location>
{
public Locations_ByCoordinates()
{
Map = locations => from l in locations
from s in l.Sponsors
select new
{
Sponsors_From = s.From,
Sponsors_To = s.To
};
}
}
You might want to pick a better name, since the coordinates aren't indexed.
I'm not sure what your other test CanSortOnSponsorshipStatus is all about though.
UPDATE
To include locations that have no sponsors, use the DefaultIfEmpty LINQ extension method. This makes sure that every location has at least one index entry.
private class Locations_ByCoordinates : AbstractIndexCreationTask<Location>
{
public Locations_ByCoordinates()
{
Map = locations => from l in locations
from s in l.Sponsors
.DefaultIfEmpty(new Sponsorship
{
From = DateTime.MinValue,
To = DateTime.MaxValue
})
select new
{
Sponsors_From = s.From,
Sponsors_To = s.To
};
}
}

Proper Way to Retrieve More than 128 Documents with RavenDB

I know variants of this question have been asked before (even by me), but I still don't understand a thing or two about this...
It was my understanding that one could retrieve more documents than the 128 default setting by doing this:
session.Advanced.MaxNumberOfRequestsPerSession = int.MaxValue;
And I've learned that a WHERE clause should be an ExpressionTree instead of a Func, so that it's treated as Queryable instead of Enumerable. So I thought this should work:
public static List<T> GetObjectList<T>(Expression<Func<T, bool>> whereClause)
{
using (IDocumentSession session = GetRavenSession())
{
return session.Query<T>().Where(whereClause).ToList();
}
}
However, that only returns 128 documents. Why?
Note, here is the code that calls the above method:
RavenDataAccessComponent.GetObjectList<Ccm>(x => x.TimeStamp > lastReadTime);
If I add Take(n), then I can get as many documents as I like. For example, this returns 200 documents:
return session.Query<T>().Where(whereClause).Take(200).ToList();
Based on all of this, it would seem that the appropriate way to retrieve thousands of documents is to set MaxNumberOfRequestsPerSession and use Take() in the query. Is that right? If not, how should it be done?
For my app, I need to retrieve thousands of documents (that have very little data in them). We keep these documents in memory and use them as the data source for charts.
** EDIT **
I tried using int.MaxValue in my Take():
return session.Query<T>().Where(whereClause).Take(int.MaxValue).ToList();
And that returns 1024. Argh. How do I get more than 1024?
** EDIT 2 - Sample document showing data **
{
"Header_ID": 3525880,
"Sub_ID": "120403261139",
"TimeStamp": "2012-04-05T15:14:13.9870000",
"Equipment_ID": "PBG11A-CCM",
"AverageAbsorber1": "284.451",
"AverageAbsorber2": "108.442",
"AverageAbsorber3": "886.523",
"AverageAbsorber4": "176.773"
}
It is worth noting that since version 2.5, RavenDB has an "unbounded results API" to allow streaming. The example from the docs shows how to use this:
var query = session.Query<User>("Users/ByActive").Where(x => x.Active);
using (var enumerator = session.Advanced.Stream(query))
{
while (enumerator.MoveNext())
{
User activeUser = enumerator.Current.Document;
}
}
There is support for standard RavenDB queries and Lucene queries, and there is also async support.
The documentation can be found here. Ayende's introductory blog article can be found here.
The Take(n) function will only give you up to 1024 by default. However, you can change this default in Raven.Server.exe.config:
<add key="Raven/MaxPageSize" value="5000"/>
For more info, see: http://ravendb.net/docs/intro/safe-by-default
The Take(n) function will only give you up to 1024 by default. However, you can pair it with Skip(n) to page through all results:
var points = new List<T>();
var nextGroupOfPoints = new List<T>();
const int ElementTakeCount = 1024;
int i = 0;
int skipResults = 0;
RavenQueryStatistics stats; // filled in by Statistics(out stats) on each query
do
{
nextGroupOfPoints = session.Query<T>().Statistics(out stats).Where(whereClause).Skip(i * ElementTakeCount + skipResults).Take(ElementTakeCount).ToList();
i++;
skipResults += stats.SkippedResults;
points = points.Concat(nextGroupOfPoints).ToList();
}
while (nextGroupOfPoints.Count == ElementTakeCount);
return points;
RavenDB Paging
The number of requests per session is a separate concept from the number of documents retrieved per call. Sessions are short-lived and are expected to have few calls issued over them.
If you are getting more than 10 of anything from the store (even fewer than the default 128) for human consumption, then something is wrong, or your problem requires different thinking than a truckload of documents coming from the data store.
RavenDB indexing is quite sophisticated. Good article about indexing here and facets here.
If you need to perform data aggregation, create a map/reduce index that produces the aggregated data, e.g.:
Index:
from post in docs.Posts
select new { post.Author, Count = 1 }
from result in results
group result by result.Author into g
select new
{
Author = g.Key,
Count = g.Sum(x=>x.Count)
}
Query:
session.Query<AuthorPostStats>("Posts/ByUser/Count")(x=>x.Author)();
You can also use a predefined index with the Stream method. You may use a Where clause on indexed fields.
// Either stream the whole index:
//   var query = session.Query<User, MyUserIndex>();
// or filter on an indexed field:
var query = session.Query<User, MyUserIndex>().Where(x => !x.IsDeleted);
using (var enumerator = session.Advanced.Stream<User>(query))
{
while (enumerator.MoveNext())
{
var user = enumerator.Current.Document;
// do something
}
}
Example index:
public class MyUserIndex: AbstractIndexCreationTask<User>
{
public MyUserIndex()
{
this.Map = users =>
from u in users
select new
{
u.IsDeleted,
u.Username,
};
}
}
Documentation: What are indexes?
Session : Querying : How to stream query results?
Important note: the Stream method will NOT track objects. If you change objects obtained from this method, SaveChanges() will not be aware of any change.
Other note: you may get the following exception if you do not specify the index to use.
InvalidOperationException: StreamQuery does not support querying dynamic indexes. It is designed to be used with large data-sets and is unlikely to return all data-set after 15 sec of indexing, like Query() does.