Why are Hash Converters of redis repository so slow? - redis

I would like to know why the classes that are in charge of converting an object to a redis hash are so slow.
In particular I have tried to use these 2 classes with StringRedisTemplate:
private final HashMapper<Item, String, String> hashMapper = new DecoratingStringHashMapper(new Jackson2HashMapper(true));
private final HashMapper<Object, byte[], byte[]> mapper = new ObjectHashMapper();
Saving 27000 records takes 24 seconds with the first converter and 2 seconds with the second (faster, but still a lot).
Finally I have tested the conversion with the following class:
private final ObjectMapper objectMapper = new ObjectMapper();
That takes 200 milliseconds, but this approach is not one that spring-data-redis offers.
It should be noted that the benchmark has been performed by iterating over the elements and performing only the conversion, without saving anything in redis.
I do not understand how it is possible that the conversion from an object to a map is so slow, and even less using the classes offered by spring.
Maybe I have left out some important configuration parameter; I don't know. Please help.

Related

Key-value store with only explicitly allowed keys

I need a key-value store (e.g. a Map or a custom class) which only allows keys out of a previously defined set, e.g. only the keys ["apple", "orange"]. Is there anything like this built into Kotlin? Otherwise, how could one do this? Maybe like the following code?
class KeyValueStore(val allowedKeys: List<String>){
private val map = mutableMapOf<String,Any>()
fun add(key: String, value: Any) {
if(!allowedKeys.contains(key))
throw Exception("key $key not allowed")
map.put(key, value)
}
// code for reading keys, like get(key: String) and getKeys()
}
The best solution for your problem would be to use an enum, which provides exactly the functionality that you're looking for. According to the docs, you can declare an enum like so:
enum class AllowedKeys {
APPLE, ORANGE
}
then, you could declare the keys with your enum!
Since the keys are known at compile time, you could simply use an enum instead of String as the keys of a regular Map:
enum class Fruit {
APPLE, ORANGE
}
val fruitMap = mutableMapOf<Fruit, String>()
Instead of Any, use whatever type you need for your values, otherwise it's not convenient to use.
If the types of the values depend on the key (a heterogeneous map), then I would first seriously consider using a regular class with your "keys" as properties. You can access the list of properties via reflection if necessary.
Another option is to define a generic key class, so the get function returns a type that depends on the type parameter of the key (see how CoroutineContext works in Kotlin coroutines).
For reference, it's possible to do this if you don't know the set of keys until runtime. But it involves writing quite a bit of code; I don't think there's an easy way.
(I wrote my own Map class for this. We needed a massive number of these maps in memory, each with the same 2 or 3 keys, so I ended up writing a Map implementation pretty much from scratch: it used a passed-in array of keys — so all maps could share the same key array — and a private array of values, the same size. The code was quite long, but pretty simple. Most operations meant scanning the list of keys to find the right index, so the theoretic performance was dire; but since the list was always extremely short, it performed really well in practice. And it saved GBs of memory compared to using HashMap. I don't think I have the code any more, and it'd be far too long to post here, but I hope the idea is interesting.)
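For anyone curious what the shared-key-array idea above might look like, here is a minimal sketch. It is written in Java (callable from Kotlin) and is not the original code, which was not posted; the class name SharedKeyMap and its methods are made up for illustration. It assumes String keys and rejects any key that is not in the shared array.

```java
/** Fixed-key map: many instances share one key array; each holds only its own values. */
final class SharedKeyMap<V> {
    private final String[] keys;   // shared between instances, never modified here
    private final Object[] values; // one slot per key, in the same order as keys

    SharedKeyMap(String[] sharedKeys) {
        this.keys = sharedKeys;
        this.values = new Object[sharedKeys.length];
    }

    // Linear scan: O(n) in theory, but cheap when there are only 2 or 3 keys.
    private int indexOf(String key) {
        for (int i = 0; i < keys.length; i++) {
            if (keys[i].equals(key)) {
                return i;
            }
        }
        throw new IllegalArgumentException("key " + key + " not allowed");
    }

    void put(String key, V value) {
        values[indexOf(key)] = value;
    }

    @SuppressWarnings("unchecked")
    V get(String key) {
        return (V) values[indexOf(key)];
    }

    public static void main(String[] args) {
        String[] allowed = {"apple", "orange"}; // key set decided at runtime
        SharedKeyMap<Integer> a = new SharedKeyMap<>(allowed);
        SharedKeyMap<Integer> b = new SharedKeyMap<>(allowed);
        a.put("apple", 3);
        b.put("orange", 5);
        System.out.println(a.get("apple") + " " + b.get("orange")); // prints "3 5"
        try {
            a.put("banana", 1);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "key banana not allowed"
        }
    }
}
```

Each instance costs one small value array plus a reference to the shared keys, which is where the memory saving over a HashMap per instance would come from.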

Joining (union) of Sets inside a Set in Java

I have a map where the values are sets of integers. What I'd like to do is get, in the best way possible (using only the Java API would be great), the union of all the sets of Integers.
Map<Long, Set<Integer>> map;
What I thought so far is to loop through the values() of the map and manually add to the big Set:
Set<Integer> bigSet = new HashSet<>();
Iterator<Set<Integer>> iter = map.values().iterator();
while(iter.hasNext())
bigSet.addAll(iter.next());
Also a collection for the union backed by the map would be great.
Unfortunately I am stuck with Java 7.
On the one hand you could use the new Java 8 fluent interface
import static java.util.stream.Collectors.toSet;
Set<Integer> myUnion = map
.values()
.stream()
.flatMap(set -> set.stream())
.collect(toSet());
On the other hand I would suggest taking a look at Guava's SetMultimap if you can use external libraries.
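Since the question says Java 7, here is a small sketch that avoids streams and lambdas entirely. The class and method names (SetUnion, unionOfValues) are made up for the example. Note that it returns a copy, not a view backed by the map, so later changes to the map are not reflected.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class SetUnion {
    // Union of all value sets, using only Java 7 language features.
    static Set<Integer> unionOfValues(Map<Long, Set<Integer>> map) {
        Set<Integer> union = new HashSet<Integer>();
        for (Set<Integer> values : map.values()) {
            union.addAll(values); // duplicates across sets collapse automatically
        }
        return union;
    }

    public static void main(String[] args) {
        Map<Long, Set<Integer>> map = new HashMap<Long, Set<Integer>>();
        map.put(1L, new HashSet<Integer>(Arrays.asList(1, 2, 3)));
        map.put(2L, new HashSet<Integer>(Arrays.asList(3, 4)));
        System.out.println(unionOfValues(map)); // contains 1, 2, 3 and 4
    }
}
```

This is essentially the loop from the question with a for-each instead of an explicit Iterator; on Java 7 there is probably nothing much shorter in the plain JDK.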

Limiting Parameters

Probably going to get shot down for this, but I have an issue with my parameters.
Say I need to store a race (Which I do)
During planning, I realized I needed to store things like:
Terrain of the race
location of the race
time the race starts
time admission ends.
Name of the Race
Types of member permitted to join
etc
In short, it's a ton of underivable data that can't really come from elsewhere,
and all in all I have about 22 parameters for my JuniorRace object and about 26 parameters for my SeniorRace object. I've already coded it, but it's messy and I don't like my work.
This wouldn't be a massive problem, and it actually won't be a problem AT ALL for the users since they won't see the business model, just the view model, but it is for me having to constantly comment these same parameters multiple times.
What is the best way I can stop using so many parameters every time I make a constructor, and every time I create a new object instance?
Do I just try to use fewer and store data elsewhere? If so, where?
Do I use more classes, like Person would have Address and Details?
I'm really stumped here. I will post my code, but yeah, it's a ton of parameters pretty much everywhere -- I'm not a very experienced OO programmer.
You could store all the parameters as a map and just pass in the map, something like:
Map<String, Object> myParams = new HashMap<>();
myParams.put("Terrain", "terrible");
myParams.put("Location", "Bobs back yard");
myParams.put("Length (yards)", 100);
myParams.put("Hazards", new String[] {"Bob's cat", "The old tire", "the fence"});
Then you could call your routine like this:
SaveRaceCourse(myParams);
Maps and suchlike are great for passing around data.
As I can't see all 22 parameters this is a guess but most probably a correct one.
Are some of those 22 parameters related? If so, group the related ones in another class and make SeniorRace a composition of all these classes.
For example: location and terrain seem related, admission period and allowed member types seems related to admission (maybe a fee is part of it too).
This way you will end up with a limited set of objects to pass, all related info lives together and evolve together.
Break it down by encapsulating similar properties in objects. It's called decomposition.
For example, your Race can accept a TimeCard encapsulating all the timing details, a Location which has the terrain and whatnot (maybe directions), an object encapsulating the requirements, etc.
class RaceTimeCard {
private final Timestamp admissionStart;
private final Timestamp admissionEnd;
private final Timestamp raceStart;
private Timestamp raceEnd;
public RaceTimeCard(Timestamp admissionStart, Timestamp admissionEnd, Timestamp raceStart) {
//init final fields
}
public void endRace() {
//clock the time that the race ended
}
}
class RaceLocation {
private final Terrain terrain;
private final Directions directions;
private final GPSCoordinates coordinates;
public RaceLocation(Terrain terrain, Directions directions, GPSCoordinates coordinates) {
//init final fields
}
}
class Race {
private RaceTimeCard timeCard;
private RaceLocation location;
public Race(RaceTimeCard timeCard, RaceLocation location) {
//init fields
}
}
If you'd like, you could subclass Location and TimeCard to create specific instances:
final class Mountains extends RaceLocation {
public Mountains() {
super(Terrain.ROCKY, new Directions(...), new GPSCoordinates(...));
}
}
final class EarlyBirdTimeCard extends RaceTimeCard {
public EarlyBirdTimeCard() {
//specify super constructor with params
}
}
Now instantiating your Race object is as simple as:
RaceTimeCard timeCard = new EarlyBirdTimeCard();
RaceLocation location = new Mountains();
...
Race race = new Race(timeCard, location, ...);
If it's still too long, you can probably decompose more. The way I see it, you could have a RaceDetails object containing all the (already decomposed) details, then pass that to Race. Make sure to profile your application, make sure the overhead from object creation doesn't get too bad.
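To make the RaceDetails idea a bit more concrete, here is a rough sketch. The classes below are simplified stand-ins for the ones above (plain String fields instead of Timestamp, Terrain and so on), and the field names and the summary() method are invented just for the example.

```java
final class RaceDetailsDemo {
    public static void main(String[] args) {
        RaceTimeCard timeCard = new RaceTimeCard("08:00", "09:00");
        RaceLocation location = new RaceLocation("rocky", "Mountains");
        RaceDetails details = new RaceDetails("Spring Cup", timeCard, location);
        Race race = new Race(details); // one argument instead of a long list
        System.out.println(race.summary()); // prints "Spring Cup at Mountains (rocky), starts 09:00"
    }
}

// Simplified stand-in: timing information only.
final class RaceTimeCard {
    final String admissionEnd;
    final String raceStart;

    RaceTimeCard(String admissionEnd, String raceStart) {
        this.admissionEnd = admissionEnd;
        this.raceStart = raceStart;
    }
}

// Simplified stand-in: where the race takes place.
final class RaceLocation {
    final String terrain;
    final String name;

    RaceLocation(String terrain, String name) {
        this.terrain = terrain;
        this.name = name;
    }
}

// Bundles the already decomposed parts so Race takes a single parameter.
final class RaceDetails {
    final String name;
    final RaceTimeCard timeCard;
    final RaceLocation location;

    RaceDetails(String name, RaceTimeCard timeCard, RaceLocation location) {
        this.name = name;
        this.timeCard = timeCard;
        this.location = location;
    }
}

final class Race {
    private final RaceDetails details;

    Race(RaceDetails details) {
        this.details = details;
    }

    String summary() {
        return details.name + " at " + details.location.name
                + " (" + details.location.terrain + "), starts " + details.timeCard.raceStart;
    }
}
```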

Map with ArrayList as the value in Java - Why use a third party library for this?

I've recently found the need to use a Map with a Long as the key and an ArrayList of objects as the value, like this:
Map<Long, ArrayList<Object>>
But I just read here that using a third-party library like Google Guava is recommended for this. In particular, a Multimap is recommended instead of the above.
What are the main benefits of using a library to do something so simple?
I like the ArrayList analogy given above. Just as ArrayList saves you from array-resizing boilerplate, Multimap saves you from list-creation boilerplate.
Before:
Map<String, List<Connection>> map =
new HashMap<>();
for (Connection connection : connections) {
String host = connection.getHost();
if (!map.containsKey(host)) {
map.put(host, new ArrayList<Connection>());
}
map.get(host).add(connection);
}
After:
Multimap<String, Connection> multimap =
ArrayListMultimap.create();
for (Connection connection : connections) {
multimap.put(connection.getHost(), connection);
}
And that leads into the next advantage: Since Guava has committed to using Multimap, it includes utilities built around the class. Using them, the "after" in "before and after" should really be:
Multimap<String, Connection> multimap =
Multimaps.index(connections, Connection::getHost);
Multimaps.index is one of many such utilities. Plus, the Multimap class itself provides richer methods than Map<K, List<V>>.
"What is Guava" details some reasoning and benefits.
For me the biggest reason would be reliability and testing. As mentioned it has been battle-tested at Google, is now very widely used elsewhere and has extensive unit testing.
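For comparison, if you are on Java 8 or later, Map.computeIfAbsent removes most of the list-creation boilerplate from the "before" version without any library. The sketch below is not Guava; it uses only java.util, and it stands in "host:port" strings for the Connection class, which is an assumption made so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class GroupByHost {
    // Groups "host:port" strings by host using only java.util (Java 8+).
    static Map<String, List<String>> byHost(List<String> connections) {
        Map<String, List<String>> map = new HashMap<>();
        for (String connection : connections) {
            String host = connection.substring(0, connection.indexOf(':'));
            // Creates the list on first use of a host, then appends to it.
            map.computeIfAbsent(host, k -> new ArrayList<>()).add(connection);
        }
        return map;
    }

    public static void main(String[] args) {
        List<String> connections = Arrays.asList("a.example:80", "b.example:80", "a.example:443");
        Map<String, List<String>> grouped = byHost(connections);
        System.out.println(grouped.get("a.example")); // prints "[a.example:80, a.example:443]"
        System.out.println(grouped.get("c.example")); // prints "null"
    }
}
```

The last line shows one difference that remains: a plain Map returns null for a missing key, whereas Multimap.get returns an empty collection.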

VB.NET EF - Creating a generic function that inspects an entity & permits entity save? Code First

I know, this sounds strange... But this is what I'm trying to do, how far along I am, and why I'm even doing it in the first place:
The class is configured, as an instance, with the name of the class. The context class is also available for preparing the batch class.
Pass a generic list of objects (entities, really) to a class.
That class (which can be pre-configured for this particular class) does just one thing: Adds a new entity to the backend database via DBContext.
The same class can be used for any entity described in the Context class, but each instance of the class is for just one entity class.
I want to write a blog article on my blog showing the performance of dynamically adjusting the batch size when working with EF persistence, and how constantly looking for the optimum batch size can be done.
When I say "DBContext" class, I mean a class similar to this:
Public Class CarContext
Inherits DbContext
Public Sub New()
MyBase.New("name=vASASysContext")
End Sub
Public Property Cars As DbSet(Of Car)
Protected Overrides Sub OnModelCreating(modelBuilder As DbModelBuilder)
modelBuilder.Configurations.Add(Of Car)(New CarConfiguration())
End Sub
End Class
That may sound confusing. Here is a use-case (sorta):
Need: I need to add a bunch of entities to a database. Let's call them cars, for lack of an easier example. Each entity is an instantiation of a car class, which is configured via Code First EF6 to be manipulated like any other class that is well defined in DBContext. Just like in real world classes, not all attributes are mapped to a database column.
High Level:
-I throw all the entities into a generic list, the kind of list supported by our 'batch class'.
-I create an instance of our batch class, and configure it to know that it is going to be dealing with car entities. Perhaps I even pass the context class that has the line:
-Once the batch class is ready (may only need the dbcontext instance and the name of the entity), it is given the list of entities via a function something like:
Public Function AddToDatabase(TheList as List(Of T)) As Double
The idea is pretty simple. My batch class is set to add these cars to the database, and will add the cars in the list it was given. The same rules apply to adding the entities in batch as they do when adding via DBContext normally.
All I want to happen is that the batch class itself does not need to be customized for each entity type it would deal with. Configuration is fine, but not customization.
OK... But WHY?
Adding entities via a list is easy. The secret sauce is the batch class itself. I've already written all the logic that determines the rate at which the entities are added (in something akin to "Entities per Second"). It also keeps track of the best performance, and varies the batch size occasionally to verify.
The reason the function above (called AddToDatabase() for illustrative purposes) returns a double is that the double represents the amount of entities that were added per second.
Ultimately, the batch class returns to the calling method the number of entities to put in the next batch to maintain peak performance.
So my question for you all is how I can determine the entity fields, and use that to save the entities in a given list.
Can it be as simple as copying the entities? Do I even need to use reflection at all, and have the class use 'GetType' to figure out the entity class in the list (cars)?
How would you go about this?
Thank you very much in advance for reading this far, and for your thoughtful response.
[Don't read further unless you are into this kind of thing!]
The performance of a database operation isn't linear, and is dependent on several factors (memory, CPU load, DB connectivity, etc.), and the DB is not always on the same machine as the application. It may even involve web services.
Your first instinct is to say that more entities in a single batch is best, but that is probably not true in most cases. When you add entities to a batch add, at first you see an increase in performance (increase in entities/second). But as the batch size increases, the performance may reach a maximum, then start to decrease (for a lot of reasons, not excluding environmental, such as memory). For non-memory issues, the batch performance may start to level off, and we haven't even discussed the impact of the batch on the system itself.
So in the case of a leveling off, I don't want my batch size any larger than it needs to be to be in the neighborhood of peak performance. Also, with smaller batch sizes, the class is able to evaluate the system's performance more frequently.
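To illustrate the kind of batch-size tuning described above, here is a rough sketch of one possible approach, a simple hill climb. It is not my actual batch class, it is written in Java rather than VB.NET purely for illustration, and the names (BatchSizeTuner, report) and the simulated throughput curve in main are invented for the example.

```java
final class BatchSizeTuner {
    private final int min;
    private final int max;
    private final int step;
    private int size;
    private int direction = 1;    // +1 = grow the batch, -1 = shrink it
    private double lastRate = -1; // entities/second of the previous batch; -1 = none yet

    BatchSizeTuner(int initial, int min, int max, int step) {
        this.size = initial;
        this.min = min;
        this.max = max;
        this.step = step;
    }

    int currentSize() {
        return size;
    }

    /** Takes the rate measured for the batch just saved; returns the size to use next. */
    int report(double entitiesPerSecond) {
        if (lastRate >= 0 && entitiesPerSecond < lastRate) {
            direction = -direction; // throughput dropped: turn around
        }
        lastRate = entitiesPerSecond;
        size = Math.max(min, Math.min(max, size + direction * step));
        return size;
    }

    public static void main(String[] args) {
        BatchSizeTuner tuner = new BatchSizeTuner(100, 50, 1000, 50);
        for (int i = 0; i < 20; i++) {
            int n = tuner.currentSize();
            // Simulated measurement with a peak at 300; a real run would time SaveChanges.
            double rate = 1000 - Math.abs(n - 300);
            tuner.report(rate);
        }
        System.out.println("batch size after 20 batches: " + tuner.currentSize());
    }
}
```

With this simulated curve the size climbs to the peak and then keeps oscillating one step either side of it, which also gives the occasional re-check of neighbouring sizes for free. Real measurements are noisy, so a practical version would probably average several batches before changing direction.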
Being new to Code First and EF6, I can see that there must be some way to use reflection to determine how to take the given list of entities, break them apart into the entity attributes, and persist them via the EF itself.
So far, I do it this way because I need to manually configure each parameter in the INSERT INTO...
For Each d In TheList
s = "INSERT INTO BDTest (StartAddress, NumAddresses, LastAddress, Duration) VALUES (@StartAddress, @NumAddresses, @LastAddress, @Duration)"
Using cmd As SqlCommand = New SqlCommand(s, conn)
conn.Open()
cmd.Parameters.Add("@StartAddress", Data.SqlDbType.NVarChar).Value = d.StartIP
cmd.Parameters.Add("@NumAddresses", Data.SqlDbType.Int).Value = d.NumAddies
cmd.Parameters.Add("@LastAddress", Data.SqlDbType.NVarChar).Value = d.LastAddie
singleRate = CDbl(Me.TicksPerSecond / swSingle.Elapsed.Ticks)
cmd.Parameters.Add("@Duration", Data.SqlDbType.Int).Value = singleRate
cmd.ExecuteNonQuery()
conn.Close()
End Using
Next
I need to steer away in this test code from using SQL, and closer toward EF6...
What are your thoughts?
TIA!
So, there are two issues I see that you should tackle. First is your question about creating a "generic" method to add a list of entities to the database.
Public Function AddToDatabase(TheList as List(Of T)) As Double
If you are using POCO entities, then I would suggest you create an abstract base class or interface for all entity classes to inherit from/implement. I'll go with IEntity.
Public Interface IEntity
End Interface
So your Car class would be:
Public Class Car
Implements IEntity
' all your class properties here
End Class
That would handle the generic issue.
The second issue is one of batch inserts. A possible implementation of your method could be as follows. This will insert in batches of 100; modify the parameter inputs as needed. Also replace MyDbContext with the actual type of your DbContext.
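' Needs Imports System.Diagnostics and System.Transactions
Public Function AddToDatabase(ByVal entities As List(Of IEntity)) As Double
Dim sw = Stopwatch.StartNew()
Using scope As New TransactionScope()
Dim context As MyDbContext = Nothing
Try
context = New MyDbContext()
context.Configuration.AutoDetectChangesEnabled = False
Dim count = 0
For Each entityToInsert In entities
count += 1
context = AddToContext(context, entityToInsert, count, 100, True)
Next
' Save whatever is left over from the last partial batch
context.SaveChanges()
Finally
If context IsNot Nothing Then
context.Dispose()
End If
End Try
scope.Complete()
End Using
' Assumption: return entities per second, as described in the question
Return entities.Count / sw.Elapsed.TotalSeconds
End Function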
Private Function AddToContext(ByVal context As MyDbContext, ByVal entity As IEntity, ByVal count As Integer, ByVal commitCount As Integer, ByVal recreateContext As Boolean) As MyDbContext
context.Set(entity.GetType()).Add(entity)
' Commit every commitCount entities (VB.NET uses Mod, not %)
If count Mod commitCount = 0 Then
context.SaveChanges()
If recreateContext Then
' A fresh context keeps the change tracker small
context.Dispose()
context = New MyDbContext()
context.Configuration.AutoDetectChangesEnabled = False
End If
End If
Return context
End Function
Also, apologies if this is not 100% perfect, as I mentally converted it from C# to VB.NET while typing. It should be very close.