Defining the structure of hierarchical data versus actually storing the data - oop

I'm trying to develop a survey application that can deal with various types of responses, such as boolean, multiple choice, as well as ranges from 1-5, 1-10, 1-100, -2 to +2, even decimal values. So, I've created a hierarchy of Response types:
class Response {
String name
}
class BooleanResponse extends Response {
boolean score
}
class SimpleGradeResponse extends Response {
char score // A-F
}
class ComplexGradeResponse extends Response {
char score // A-F
char modifier // +, -, blank
}
class IntegerResponse extends Response {
int min, max, score
}
class DecimalResponse extends Response {
double min, max, score
}
I think of that as the metadata. It describes the survey response types. You could have question 1 that's an IntegerResponse from 1 to 10, question 2 that's an IntegerResponse from 0-100, question 3 that's a DecimalResponse, etc.
Where I'm uncertain is where to store the actual responses from users. Do you mingle the scores in with the metadata, as I've done with the score field above? That seems awkward, since every range-type response carries with it the min and max.
What I really want (I think) is to be able to curry the range response parameters to create a new type, and then reflect that type in a table of actual responses. So, a question that takes a 1-10 range would be a row in IntegerResponse with min=1 and max=10. But how does the ActualResponse table (I need to improve the naming convention, once I figure this out!) hold a score and refer to the curried IntegerResponse with min and max set?
If there is a better way to approach this whole problem, I'm eager to hear it.
Thanks,
Lee

min and max should rather be stored in a Question object, or even in QuestionType - depending on how flexible the system should be in runtime.
Will there be arbitrary limitations to different kinds of Responses? Data structure hugely depends on it. You can either hardcode different Response types or support them in runtime. First approach is one order of magnitude simpler.

Related

Kotlin: Why is Sequence more performant in this example?

Currently, I am looking into Kotlin and have a question about Sequences vs. Collections.
I read a blog post about this topic and there you can find this code snippets:
List implementation:
val list = generateSequence(1) { it + 1 }
.take(50_000_000)
.toList()
measure {
list
.filter { it % 3 == 0 }
.average()
}
// 8644 ms
Sequence implementation:
val sequence = generateSequence(1) { it + 1 }
.take(50_000_000)
measure {
sequence
.filter { it % 3 == 0 }
.average()
}
// 822 ms
The point here is that the Sequence implementation is about 10x faster.
However, I do not really understand WHY that is. I know that with a Sequence, you do "lazy evaluation", but I cannot find any reason why that helps reducing the processing in this example.
However, here I know why a Sequence is generally faster:
val result = sequenceOf("a", "b", "c")
.map {
println("map: $it")
it.toUpperCase()
}
.any {
println("any: $it")
it.startsWith("B")
}
Because with a Sequence you process the data "vertically", when the first element starts with "B", you don't have to map for the rest of the elements. It makes sense here.
So, why is it also faster in the first example?
Let's look at what those two implementations are actually doing:
The List implementation first creates a List in memory with 50 million elements.  This will take a bare minimum of 200MB, since an integer takes 4 bytes.
(In fact, it's probably far more than that.  As Alexey Romanov pointed out, since it's a generic List implementation and not an IntList, it won't be storing the integers directly, but will be ‘boxing’ them — storing references to Int objects.  On the JVM, each reference could be 8 or 16 bytes, and each Int could take 16, giving 1–2GB.  Also, depending how the List gets created, it might start with a small array and keep creating larger and larger ones as the list grows, copying all the values across each time, using more memory still.)
Then it has to read all the values back from the list, filter them, and create another list in memory.
Finally, it has to read all those values back in again, to calculate the average.
The Sequence implementation, on the other hand, doesn't have to store anything!  It simply generates the values in order, and as it does each one it checks whether it's divisible by 3 and if so includes it in the average.
(That's pretty much how you'd do it if you were implementing it ‘by hand’.)
You can see that in addition to the divisibility checking and average calculation, the List implementation is doing a massive amount of memory access, which will take a lot of time.  That's the main reason it's far slower than the Sequence version, which doesn't!
Seeing this, you might ask why we don't use Sequences everywhere…  But this is a fairly extreme example.  Setting up and then iterating the Sequence has some overhead of its own, and for smallish lists that can outweigh the memory overhead.  So Sequences only have a clear advantage in cases when the lists are very large, are processed strictly in order, there are several intermediate steps, and/or many items are filtered out along the way (especially if the Sequence is infinite!).
In my experience, those conditions don't occur very often.  But this question shows how important it is to recognise them when they do!
Leveraging lazy-evaluation allows avoiding the creation of intermediate objects that are irrelevant from the point of the end goal.
Also, the benchmarking method used in the mentioned article is not super accurate. Try to repeat the experiment with JMH.
Initial code produces a list containing 50_000_000 objects:
val list = generateSequence(1) { it + 1 }
.take(50_000_000)
.toList()
then iterates through it and creates another list containing a subset of its elements:
.filter { it % 3 == 0 }
... and then proceeds with calculating the average:
.average()
Using sequences allows you to avoid doing all those intermediate steps. The below code doesn't produce 50_000_000 elements, it's just a representation of that 1...50_000_000 sequence:
val sequence = generateSequence(1) { it + 1 }
.take(50_000_000)
adding a filtering to it doesn't trigger the calculation itself as well but derives a new sequence from the existing one (3, 6, 9...):
.filter { it % 3 == 0 }
and eventually, a terminal operation is called that triggers the evaluation of the sequence and the actual calculation:
.average()
Some relevant reading:
Kotlin: Beware of Java Stream API Habits
Kotlin Collections API Performance Antipatterns

Linking Text to an Integer Objective C

The goal of this post is to find a more efficient way to create this method. Right now, as I start adding more and more values, I'm going to have a very messy and confusing app. Any help is appreciated!
I am making a workout app and assign an integer value to each workout. For example:
Where the number is exersiceInt:
01 is High Knees
02 is Jumping Jacks
03 is Jog in Place
etc.
I am making it so there is a feature to randomize the workout. To do this I am using this code:
-(IBAction) setWorkoutIntervals {
exerciseInt01 = 1 + (rand() %3);
exerciseInt02 = 1 + (rand() %3);
exerciseInt03 = 1 + (rand() %3);
}
So basically the workout intervals will first be a random workout (between high knees, jumping jacks, and jog in place). What I want to do is make a universal that defines the following so I don't have to continuously hard code everything.
Right now I have:
-(void) setLabelText {
if (exerciseInt01 == 1) {
exercise01Label.text = [NSString stringWithFormat:#"High Knees"];
}
if (exerciseInt01 == 2) {
exercise01Label.text = [NSString stringWithFormat:#"Jumping Jacks"];
}
if (exerciseInt01 == 3) {
exercise01Label.text = [NSString stringWithFormat:#"Jog in Place"];
}
}
I can already tell this about to get really messy once I start specifying images for each workout and start adding workouts. Additionally, my plan was to put the same code for exercise02Label, exercise03Label, etc. which would become extremely redundant and probably unnecessary.
What I'm thinking would be perfect if there would be someway to say
exercise01Label.text = exercise01Int; (I want to to say that the Label's text equals Jumping Jacks based on the current integer value)
How can I make it so I only have to state everything once and make the code less messy and less lengthy?
Three things for you to explore to make your code easier:
1. Count from zero
A number of things can be easier if you count from zero. A simple example is if your first exercise was numbered 0 then your random calculation would just be rand() % 3 (BTW look up uniform random number, there are much better ways to get a random number).
2. Learn about enumerations
An enumeration is a type with a set of named literal values. In (Objective-)C you can also think of them as just a collection of named integer values. For example you might declare:
typedef enum
{
HighKnees,
JumpingJacks,
JogInPlace,
ExerciseKindCount
} ExerciseCount;
Which declares ExerciseCount as a new type with 4 values. Each of these is equivalent to an integer, here HighKnees is equivalent to 0 and ExerciseKindCount to 3 - this should make you think of the first thing, count from zero...
3. Discover arrays
An array is an ordered collection of items where each item has an index - which is usually an integer or enumeration value. In (Objective-)C there are two basic kinds of arrays: C-style and object-style represented by NSArray and NSMutableArray. For example here is a simple C-style array:
NSString *gExerciseLabels[ExerciseKindCount] =
{ #"High Knees",
#"Jumping Jacks",
#"Jog in Place"
}
You've probably guessed by now, the first item of the above array has index 0, back to counting from zero...
Exploring these three things should quickly show you ways to simplify your code. Later you may wish to explore structures and objects.
HTH
A simple way to start is by putting the exercise names in an array. Then you can access the names by index. eg - exerciseNames[exerciseNumber]. You can also make the list of exercises in an array (of integers). So you would get; exerciseNames[exerciseTable[i]]; for example. Eventually you will want an object to define an exercise so that you can include images, videos, counts, durations etc.

How can I ask Lucene to do simple, flat scoring?

Let me preface by saying that I'm not using Lucene in a very common way and explain how my question makes sense. I'm using Lucene to do searches in structured records. That is, each document, that is indexed, is a set of fields with short values from a given set. Each field is analysed and stored, the analysis producing usually no more than 3 and in most cases just 1 normalised token. As an example, imagine files for each of which we store two fields: the path to the file and a user rating in 1-5. The path is tokenized with a PathHierarchyTokenizer and the rating is just stored as-is. So, if we have a document like
path: "/a/b/file.txt"
rating: 3
This document will have for its path field the tokens "/a", "/a/b" and "/a/b/file.ext", and for rating the token "3".
I wish to score this document against a query like "path:/a path:/a/b path:/a/b/different.txt rating:1" and get a value of 2 - the number of terms that match.
My understanding and observation is that the score of the document depends on various term metrics and with many documents with many fields each, I most definitely am not getting simple integer scores.
Is there some way to make Lucene score documents in the outlined fashion? The queries that are run against the index are not generated by the users, but are built by the system and have an optional filter attached, meaning they all have a fixed form of several TermQuerys joined in a BooleanQuery with nothing like any fuzzy textual searches. Currently I don't have the option of replacing Lucene with something else, but suggestions are welcome for a future development.
I doubt there's something ready to use, so most probably you will need to implement your own scorer and use it when searching. For complicated cases you may want to play around with queries, but for simple case like yours it should be enough to overwrite DefaultSimilarity setting tf factor to raw frequency (number of specified terms in document in question) and all other components to 1. Something like this:
public class MySimilarity extends DefaultSimilarity {
#Override
public float computeNorm(String field, FieldInvertState state) {
return 1;
}
#Override
public float queryNorm(float sumOfSquaredWeights) {
return 1;
}
#Override
public float tf(float freq) {
return freq;
}
#Override
public float idf(int docFreq, int numDocs) {
return 1;
}
#Override
public float coord(int overlap, int maxOverlap) {
return 1;
}
}
(Note, that tf() is the only method that returns something different than 1)
And the just set similarity on IndexSearcher.

Best Pattern Applyable to Sudoku?

I'm trying to make a Sudoku game, and I gathered the following validations to each number inserted:
Number must be between 1 and 9;
Number must be unique in the line;
Number must be unique in the column;
Number must be unique in the sub-matrix.
As I'm repeating too much the "Number must be unique in..." rule, I made the following design:
There are 3 kinds of groups, ColumnGroup, LineGroup, and SubMatrixGroup (all of them implement the GroupInterface);
GroupInterface has a method public boolean validate(Integer number);
Each cell is related to 3 groups, and it must be unique between the groups, if any of them doesn't evaluate to true, number isn't allowed;
Each cell is an observable, making the group an observer, that reacts to one Cell change attempt.
And that s*cks.
I can't find what's wrong with my design. I just got stuck with it.
Any ideas of how I can make it work?
Where is it over-objectified? I can feel it too, maybe there is another solution that would be more simple than that...
Instead of having 3 validator classes, an abstract GroupInterface, an observable, etc., you can do it with a single function.
Pseudocode ahead:
bool setCell(int cellX, int cellY, int cellValue)
{
m_cells[x][y] = cellValue;
if (!isRowValid(y) || !isColumnValid(x) || !isSubMatrixValid(x, y))
{
m_cells[x][y] = null; // or 0 or however you represent an empty cell
return false;
}
return true;
}
What is the difference between a ColumnGroup, LineGroup and SubMatrixGroup? IMO, these three should simply be instances of a generic "Group" type, as the type of the group changes nothing - it doesn't even need to be noted.
It sounds like you want to create a checker ("user attempted to write number X"), not a solver. For this, your observable pattern sounds OK (with the change mentioned above).
Here (link) is an example of a simple sudoku solver using the above-mentioned "group" approach.

Game Design: Data structures for Stackable Attributes (DFP, HP, MP, etc) in an RPG (VB.Net)

I'm wrangling with issues regarding how character equipment and attributes are stored within my game.
My characters and equippable items have 22 different total attributes (HP, MP, ATP, DFP). A character has their base-statistics and can equip up to four items at one time.
For example:
BASE ATP: 55
WEAPON ATP: 45
ARMOR1 ATP: -10
ARMOR2 ATP: -5
ARMOR3 ATP: 3
Final character ATP in this case would be 88.
I'm looking for a way to implement this in my code. Well, I already have this implemented but it isn't elegant. Right now, I have ClassA that stores these attributes in an array. Then, I have ClassB that takes five ClassA and will add up the values and return the final attribute.
However, I need to emphasize that this isn't an elegant solution.
There has to be some better way to access and manipulate these attributes, including data structures that I haven't thought of using. Any suggestions?
EDIT: I should note that there are some restrictions on these attributes that I need to be put in place. E.g., these are the baselines.
For instance, the character's own HP and MP cannot be more than the baseline and cannot be less than 0, whereas the ATP and MST can be. I also currently cannot enforce these constraints without hacking what I currently have :(
Make an enum called CharacterAttributes to hold each of STR, DEX, etc.
Make an Equipment class to represent any equippable item. This class will have a Dictionary which is a list of any stats modified by this equipment. For a sword that gives +10 damage, use Dictionary[CharacterAttributes.Damage] = 10. Magic items might influence more than one stat, so just add as many entries as you like.
The equipment class might also have an enum representing which inventory it slots to (Boots, Weapon, Helm).
Your Character class will have a List to represent current gear. It will also have a dictionary of CharacterAttributes just like the equipment class, which represents the character's base stats.
To calculate final stats, make a method in your Character class something like this:
int GetFinalAttribute(CharacterAttributes attribute)
{
int x = baseStats[attribute];
foreach (Equipment e in equipment)
{
if (e.StatModifiers[attribute] != null)
{
x += e.StatModifiers[attribute];
}
}
// do bounds checking here, e.g. ensure non-negative numbers, max and min
return x;
}
I know this is C# and your post was tagged VB.NET, but it should be easy to understand the method. I haven't tested this code so apologies if there's a syntax error or something.