X out of Y people recommend - microformat markup

I'm in the process of marking up a site with microformats, using hcard, hreview and hreview-aggregate.
The documentation on http://microformats.org/ and http://www.google.com/support/webmasters/bin/topic.py?topic=21997 is pretty good.
However, one thing has me totally stumped: can anybody explain to me how I can express something of the form:
X out of Y people recommend...
Thanks

That sounds like hreview-aggregate to me. A simple recommendation doesn't necessarily have a numeric value, but you can arbitrarily assign one based on the number of people. The scale will keep changing, but the proportions will stay the same. Example:
<div class="hreview-aggregate">
<span class="item">...</span>
<span class="rating"><span class="average">90</span> out of <span class="best count">100</span> people recommend.</span>
</div>
So every time you add a new person, both best and count would change. That makes sense, since the best scenario would be everyone recommending. That may be a bit misleading, since people aren't really rating on a 100-point scale, but the average of 90% is still accurate.


Accessing chain position or multi-entity subchain in OptaPlanner?

We are using Optaplanner to come up with music playlists that follow a set of sound musical principles and rules (with respect to key changes, etc):
https://www.youtube.com/watch?v=Iqya6xqc1jY
We’re using chained planning variables to avoid disrupting an otherwise solid “streak” of playlist tracks, but many of our rules involve some aspect of temporal reasoning across a subchain of X previous tracks. As an example, I’m trying to implement a rule that requires the key to change at least every five songs (to keep the playlist from getting too boring/monotonous). What I’ve come up with works, but I’m wondering if there’s a less awkward way of doing it.
Here’s the rule as we have it right now, which I feel is ugly from a DRY and configurability perspective:
https://github.com/spotfire-io/spotfire-solver/blob/1c0fcda5256c337e214b33043a27fc25f615d0ef/src/main/resources/io/spotfire/solver/rules/rules.drl#L79-L88
rule "Should change key at least once every five songs"
when
$t0: RestPlaylistTrack(keyDistance == 0) // previousTrack is a chained variable
$t1: RestPlaylistTrack(keyDistance == 0, previousTrack == $t0)
$t2: RestPlaylistTrack(keyDistance == 0, previousTrack == $t1)
$t3: RestPlaylistTrack(keyDistance == 0, previousTrack == $t3)
$t4: RestPlaylistTrack(keyDistance == 0, previousTrack == $t3)
then
scoreHolder.addSoftConstraintMatch(kcontext, 0, new BigDecimal(-2));
end
Another example would be implementing a rule that batches tracks of the same genre together (e.g. play 4 jazz tracks in a row followed by 4 rock tracks), or ensuring that we avoid playing the same artist 5 tracks from the last time we played that artist.
In this example, is there a better way to keep track of the distance between two tracks and then specify a constraint on that? Some potential options we’ve considered include…
Provide a way to extract X-length sub-chains programmatically and apply the rules to that subchain.
Create a shadow variable that represents the position of the track relative to the anchor. Then we could create constraints like RestPlaylistTrack(position < $t.position, position > $t.position - 5) to apply to any tracks within 5 tracks of $t (a rough sketch of this follows the list).
Using some sort of Drools aggregate expression that accumulates previous tracks via a map-reducey thing until reaching a certain maximum number of tracks.
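For the shadow variable idea, here is a rough, untested sketch of what it might look like (our own illustration, not an official OptaPlanner recipe; it assumes a custom shadow variable, an inverse nextTrack shadow variable on the chain, and hypothetical names like PositionVariableListener; imports from org.optaplanner.core.api.domain.* omitted):
@PlanningEntity
public class RestPlaylistTrack {

    // the chained planning variable referenced in the rule above
    @PlanningVariable(valueRangeProviderRefs = "trackRange",
            graphType = PlanningVariableGraphType.CHAINED)
    private RestPlaylistTrack previousTrack;

    // hypothetical shadow variable: number of steps from the anchor
    @CustomShadowVariable(variableListenerClass = PositionVariableListener.class,
            sources = @PlanningVariableReference(variableName = "previousTrack"))
    private Integer position;

    // getters, setters and the nextTrack inverse shadow variable omitted
}

public class PositionVariableListener implements VariableListener<RestPlaylistTrack> {

    @Override
    public void afterVariableChanged(ScoreDirector scoreDirector, RestPlaylistTrack track) {
        // renumber the tail of the chain, starting from the changed track
        int previous = positionOf(track.getPreviousTrack());
        for (RestPlaylistTrack t = track; t != null; t = t.getNextTrack()) {
            scoreDirector.beforeVariableChanged(t, "position");
            t.setPosition(previous + 1);
            scoreDirector.afterVariableChanged(t, "position");
            previous = t.getPosition();
        }
    }

    private int positionOf(RestPlaylistTrack t) {
        return t == null ? 0 : t.getPosition(); // simplification: treat the anchor as position 0
    }

    // the remaining VariableListener callbacks (entity added/removed,
    // beforeVariableChanged) need similar handling and are omitted here
}
A rule could then match RestPlaylistTrack(position < $t.position, position > $t.position - 5) as described above, at the cost of the cascading position updates on chain moves that we worry about below.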
The challenge we perceive with the first two solutions is that a chain swap move involves changes to three planning variables. If we have a chain that looks like A <- B <- C <- D, a swap between B and D involves changes to point D to A, B to C, and C to D. At the Drools or shadow variable level, I think there’s a risk of doing a bunch of intermediate calculations before the move is complete. This might make score calculation pretty inefficient. For the third option, we’re just not sure how something like that would work mechanically.
If anyone (especially #geoffrey-de-smet) has examples of how this could be done, that would be greatly appreciated. If this is legitimately tricky in the current version of OptaPlanner, we think adding a native position mechanism to chained planning models would be super helpful as a future feature.
It sounds like the consecutive shift constraints in nurse rostering. Detecting "no more than n shifts in a row" is non-trivial in hand-written DRL. In nurse rostering we use insertLogical to deal with those, but I would recommend not using it (it kills performance). I guesstimate that approach 1) (which gives up incremental calculation) is still faster than any insertLogical approach, unless you're queuing up thousands of songs.
In ConstraintStreams, approach 1 could maybe one day look like this:
constraintFactory.from(Shift.class)
    .groupBy(Shift::getEmployee,
        sort(Comparable.from(Shift::getStartDateTime, Shift::getId))) // BiConstraintStream<Employee, List<Shift>>
    .penalize((employee, sortedShiftList) -> ...); // one match for all bad subsequences of one employee
Approach 2) is interesting. Try it out and let us know if it works well enough for you.
Approach 3) is what I am thinking of for ConstraintStreams at some point. This is incremental. Something like:
constraintFactory.from(Shift.class)
    .forEachSortedSubList(Shift::getEmployee,
        Comparable.from(Shift::getStartDateTime, Shift::getId),
        (employee, sortedShiftSubList) -> ...)
    .penalize(...); // one match per bad subsequence
If you have any suggestions on how you'd like to use the API for approach 3), or on what using it could look like, please put them on our Google Group discussion forum. It could help move the work along.

Understanding Google Code Jam 2013 - X Marks the Spot

I was trying to solve Google Code Jam problems and there is one of them that I don't understand. Here is the question (World Finals 2013 - problem C): https://code.google.com/codejam/contest/2437491/dashboard#s=p2&a=2
And here follows the problem analysis: https://code.google.com/codejam/contest/2437491/dashboard#s=a&a=2
I don't understand why we can use binary search. In order to use binary search, the elements have to be sorted. In other words: for a given element e, we can't have any element less than e to its right. But that is not the case in this problem. Let me give you an example:
Suppose we do what the analysis tells us to do: we start with a left bound angle of 90° and a right bound angle of 0°. Our first probe will be at an angle of 45°. Suppose we find that, for this angle, X < N. In this case, the analysis tells us to make our left bound 45°. At this point, we may have discarded a viable solution (at, let's say, 75°), and at the same time there may be no more solutions between 0° and 45°, leading us to say (wrongly) that there's no solution.
I don't think Google's solution is wrong =P. But I can't figure out why we can use a binary search in this case. Does anyone know?
I don't understand why we can use binary search. In order to use binary search, the elements have to be sorted. In other words: for a given element e, we can't have any element less than e to its right. But that is not the case in this problem.
A binary search works in this case because:
adjacent values differ by at most 1
we only need to find one solution, not all of them
the first and last value straddle the desired value (X .. N .. 2N-X)
I don't quite follow your counter-example, but here's an example of a binary search on a sequence with the above constraints. Looking for 3:
1 2 1 1 2 3 2 3 4 5 4 4 3 3 4 5 4 4
[                                 ]
[               ]
        [       ]
            [   ]
              *
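To make that concrete, here is a minimal sketch of such a search in Java (our illustration, not from the original answer); it relies only on the two properties above:
static int findIndex(int[] a, int target) {
    // preconditions: a[0] <= target <= a[a.length - 1],
    // and adjacent values differ by at most 1
    int lo = 0, hi = a.length - 1;
    while (lo <= hi) {
        int mid = (lo + hi) >>> 1;
        if (a[mid] == target) return mid;  // found one occurrence (any will do)
        if (a[mid] < target) lo = mid + 1; // a[mid+1] <= a[mid] + 1 <= target, so the
                                           // invariant still holds for the right half
        else hi = mid - 1;                 // symmetric argument for the left half
    }
    return -1; // unreachable while the preconditions hold
}
On the sequence above, findIndex(a, 3) happens to return index 5 (a different 3 than the starred one); since we only need one solution, either is fine.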
I read the problem and thought about a solution before reading theirs. When I read their solution, I saw that they had mostly done the same as I would have; however, I had not thought of some of the minor optimizations they used, as I was still digesting the task.
Solution:
Step 1: They choose a median point so that each of the lines splits the set in half; therefore two provinces will have x mines each, while the other two provinces will have 2 * N - x mines each, because the two lines each split the set in half and
2 * x + 2 * (2 * N - x) = 2 * x + 4 * N - 2 * x = 4 * N.
If x = N, then we were lucky and accidentally found a solution.
Step 2: They take advantage of the "fact" that no three mines are collinear. I believe they are wrong here, as the task did not tell us this is the case; they took advantage of this "fact" because they assumed that the task is solvable, yet the task clearly asks us to report when the placement is impossible for the given input. I find this part smelly. The task is not necessarily solvable, not to mention that there might be a solution even in cases where three mines are collinear.
Thus, somewhere in between X had to be exactly equal to N!
Not true either, as they have stated in the task that
You should output IMPOSSIBLE instead if there is no good placement of borders.
Step 3: They are still using the "fact" shown to be untrue in the previous step.
So let us close the book and think for ourselves. Their solution is not bad, but they assume something which is not necessarily true. I believe that all their inputs contained mines conforming to their assumption, but this is not necessarily the case, as the task did not clearly state this, and I can easily create a solvable input having three collinear mines.
Their idea for the median choice is correct, so we must follow this procedure; the problem gets more complicated if we do not do this step. Now, we could search for a solution by modifying the angle until we find a solution or reach the border of the period (this was my idea initially). However, we know which provinces have too many mines and which provinces do not have enough. Also, we know that the period is pi/2, or in other terms 90 degrees, because if we rotate alpha by pi/2 in either the positive (counter-clockwise) or negative (clockwise) direction, then we have the same problem, but each child gets a different province, which is irrelevant from our point of view; they will still be rivals, I guess, but this does not concern us.
Now, we try and see what happens if we rotate the lines by pi/4. We will see that some mines may have changed provinces. Either we have not yet reached a solution, or we have gone too far and poor provinces became rich and rich provinces became poor. In either case we know in which half the solution should be, so we rotate back/forward by pi/8; then, with the same logic, by pi/16, until we have found a solution or shown that there is none.
Back to the question: we cannot arrive at the situation you describe, because if there was a valid solution at 75 degrees, then we would see that we had not rotated the lines enough when rotating only 45 degrees; based on the number of mines which have changed provinces, we would be able to determine the right angle interval. Remember that we have two rich provinces and two poor provinces. Each rich province has two poor bordering provinces and vice versa. So the poor provinces should gain mines and the rich provinces should lose mines. If, when rotating by 45 degrees, we see that the poor provinces did not gain enough mines, then we will choose to rotate more until we see they have gained enough. If they have gained too many mines, then we change direction.
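Expressed as code, the rotation search in the last two paragraphs might look roughly like this (our own illustration; minesInPoorProvinces is a hypothetical helper that draws both borders through the median point at the given angle and counts the mines in the two provinces that started out poor, whose target total is 2 * n; the direction of the comparison is an assumption of the sketch):
static double findBorderAngle(int n, double eps) {
    double lo = 0.0, hi = Math.PI / 2; // rotating by pi/2 gives the same partition, relabelled
    while (hi - lo > eps) {
        double mid = (lo + hi) / 2;
        long poor = minesInPoorProvinces(mid);
        if (poor == 2L * n) return mid; // every province now holds exactly n mines
        if (poor < 2L * n) lo = mid;    // poor provinces still short: rotate further
        else hi = mid;                  // rotated too far: back off
    }
    return Double.NaN; // nothing within tolerance: report IMPOSSIBLE
}

static long minesInPoorProvinces(double angle) {
    // hypothetical, problem-specific geometry: place the two perpendicular borders
    // through the median point at this angle and count the mines in the two
    // initially-poor provinces
    throw new UnsupportedOperationException("geometry omitted in this sketch");
}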

100 is showing as 10,000% NumberTextBox

I set the constraints type to 'percent', and a value of 100 is showing up as 10,000% once the NumberTextBox renders:
<div id="Percentage" data-dojo-type="ourcompay.NumberTextBox" data-dojo-props="constraints:{type: 'percent'}" title="Percentage" required="true"></div>
Not sure what to do to fix this. Is this a bug in Dojo, or am I not doing something right?
I would assume that, given a percentage is a representation of an amount in hundredths of a unit, it's doing exactly what it intends. I.e. if you are supplying a value of 100 and expecting that to mean 100%, you're misunderstanding what percentages are.
1 would be 100%.

How can I compare two NSImages for differences?

I'm attempting to gauge the percentage difference between two images.
Having done a lot of reading I seem to have a number of options, but I'm not sure which method would be best for:
Ease of coding
Performance.
The methods I've seen are:
Non-language-specific (academic): Image comparison - fast algorithm
Mac-specific direct pixel access: http://www.markj.net/iphone-uiimage-pixel-color/
Does anyone have any advice about what solutions make most sense for the above two cases and have code samples to show how to apply them?
I've had success calculating the difference between two images using the histogram technique mentioned here. redmoskito's answer in the SO question you linked to was actually my inspiration!
The following is an overview of the algorithm I used:
Convert the images to grayscale—compare one channel instead of three.
Divide each image into an n * n grid of "subimages". Then, for each subimage pair:
Calculate their colour composition histograms.
Calculate the absolute difference between the two histograms.
The maximum difference found between two subimages is a measure of the two images' difference. Other metrics could also be used (e.g. the average difference between subimages).
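For concreteness, a minimal sketch of those steps (our own illustration, in Java rather than Cocoa since the structure is the same; both images are assumed to have the same dimensions, and n is the grid dimension):
import java.awt.image.BufferedImage;

public final class GridHistogramDiff {

    // overall difference = the largest per-cell histogram distance
    static long difference(BufferedImage a, BufferedImage b, int n) {
        long max = 0;
        for (int gy = 0; gy < n; gy++) {
            for (int gx = 0; gx < n; gx++) {
                max = Math.max(max, distance(histogram(a, gx, gy, n),
                                             histogram(b, gx, gy, n)));
            }
        }
        return max;
    }

    // grayscale histogram of one grid cell
    static int[] histogram(BufferedImage img, int gx, int gy, int n) {
        int[] h = new int[256];
        int x0 = gx * img.getWidth() / n, x1 = (gx + 1) * img.getWidth() / n;
        int y0 = gy * img.getHeight() / n, y1 = (gy + 1) * img.getHeight() / n;
        for (int y = y0; y < y1; y++) {
            for (int x = x0; x < x1; x++) {
                int rgb = img.getRGB(x, y);
                int gray = ((rgb >> 16 & 0xFF) + (rgb >> 8 & 0xFF) + (rgb & 0xFF)) / 3;
                h[gray]++;
            }
        }
        return h;
    }

    // sum of absolute bucket differences between two histograms
    static long distance(int[] h1, int[] h2) {
        long sum = 0;
        for (int i = 0; i < 256; i++) sum += Math.abs(h1[i] - h2[i]);
        return sum;
    }
}
Normalizing each distance by the cell's pixel count would make scores comparable across image sizes.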
As tskuzzy noted in his answer, if your ultimate goal is a binary "yes, these two images are (roughly) the same" or "no, they're not", you need some meaningful threshold value. You could produce such a value by passing images into the algorithm and tweaking the threshold based on its output and how similar you think the images are. A form of machine learning, I suppose.
I recently wrote a blog post on this very topic, albeit as part of a larger goal. I also created a simple iPhone app to demonstrate the algorithm. You can find the source on GitHub; perhaps it will help?
It is really difficult to suggest something when you don't tell us more about the images or the variations. Are they shapes? Are they the different objects and you want to know what class of objects? Are they the same object and you want to distinguish the object instance? Are they faces? Are they fingerprints? Are the objects in the same pose? Under the same illumination?
When you say performance, what exactly do you mean? How large are the images? All in all, it really depends. Given what you've said, if it is only about ease of coding and performance, I would suggest just taking the absolute value of the difference of pixels. That is super easy to code and about as fast as it gets, but it is really unlikely to work for anything other than the most synthetic examples.
That being said I would like to point you to: DHOG, GLOH, SURF and SIFT.
You can use the fairly basic subtraction technique that the lads above suggested. #carlosdc has hit the nail on the head with regard to the type of image this basic technique can be used for. I have attached an example so you can see the results for yourself.
The first image shows a frame from a simulation at some time t. The second, taken some (simulation) time later t + dt, was subtracted from the first. The resulting subtraction image (in black and white for clarity) then shows how the simulation has changed in that time. This was done as described above and is very powerful and easy to code.
Hope this aids you in some way
This is some old nasty FORTRAN, but it should give you the basic approach. It is not that difficult at all. Because I am doing it on a two-colour palette, you would do this operation for R, G and B. That is: compute the intensities or values in each cell/pixel and store them in an array. Do the same for the other image, and subtract one array from the other; this will leave you with a colourful subtraction image. My advice would be to do as the lads suggest above: compute the magnitude of the sum of the R, G and B components so you just get one value. Write that to an array, do the same for the other image, then subtract. Then create a new range for either R, G or B and map the resulting subtracted array to it; this will give a much clearer picture as a result.
* =============================================================
      SUBROUTINE SUBTRACT(FNAME1,FNAME2,IOS)
* This routine writes a model to files
* =============================================================
* Common :
      INCLUDE 'CONST.CMN'
      INCLUDE 'IO.CMN'
      INCLUDE 'SYNCH.CMN'
      INCLUDE 'PGP.CMN'
* Input :
      CHARACTER fname1*(sznam),fname2*(sznam)
* Output :
      integer IOS
* Variables:
      logical glue
      character fullname*(szlin)
      character dir*(szlin),ftype*(3)
      integer i,j,nxy1,nxy2
      real si1(2*maxc,2*maxc),si2(2*maxc,2*maxc)
* =================================================================
      IOS = 1
      nomap=.true.
      ftype='map'
      dir='./pictures'
      ! reading first image
      if(.not.glue(dir,fname2,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',
     &     err=10,iostat=ios)
      read(unit2,err=11)nxy2
      read(unit2,err=11)rad,dxy
      do i=1,nxy2
        do j=1,nxy2
          read(unit2,err=11)si2(i,j)
        enddo
      enddo
      CLOSE(unit2)
      ! reading second image
      if(.not.glue(dir,fname1,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',
     &     err=10,iostat=ios)
      read(unit2,err=11)nxy1
      read(unit2,err=11)rad,dxy
      do i=1,nxy1
        do j=1,nxy1
          read(unit2,err=11)si1(i,j)
        enddo
      enddo
      CLOSE(unit2)
      ! subtracting images
      if(nxy1.eq.nxy2)then
        nxy=nxy1
        do i=1,nxy1
          do j=1,nxy1
            si(i,j)=si2(i,j)-si1(i,j)
          enddo
        enddo
      else
        print *,'SUBTRACT: Different sizes of image arrays'
        IOS=0
        return
      endif
* normal finishing
      IOS=0
      nomap=.false.
      return
* exceptional finishing
   10 write (*,30) fullname
      return
   11 write (*,32) fullname
      return
   30 format('Cannot open file ',72A)
   31 format('Improper filename ',72A)
   32 format('Error reading from file ',72A)
      end
! =============================================================
Hope this is of some use. All the best.
Out of the methods described in your first link, the histogram comparison method is by far the simplest to code and the fastest. However, key point matching will provide far more accurate results, since you want a precise number describing the difference between two images.
To implement the histogram method, I would do the following:
Compute the red, green, and blue histograms of each image
Add up the differences between corresponding buckets
If the total difference is above a certain threshold, then the images aren't similar and the percentage is 0%
Otherwise the colors found in the images are similar, so do a pixel-by-pixel comparison and convert the difference into a percentage.
I don't know of any precise algorithms for finding the key points of an image. However, once you find them for each image, you can do a pixel-by-pixel comparison for each of the key points.

Process to pass from problem to code. How did you learn?

I'm teaching/helping a student to program.
I remember the following process always helped me when I started; it seems pretty intuitive, and I wonder if someone else has had a similar approach.
Read the problem and understand it (of course).
Identify possible "functions" and variables.
Write down how I would do it step by step (the algorithm).
Translate it into code; if there is something you cannot do, create a function that does it for you and keep moving.
With time and practice I seem to have forgotten how hard it was to pass from a problem description to a coded solution, but by applying this method I managed to learn how to program.
So for a project description like:
A system has to calculate the price of an item based on the following rules (a description of the rules... client, discounts, availability, etc.)
My first step is to understand what the problem is.
Then I identify the item, the rules, the variables, etc.
Then I pseudo-code something like:
function getPrice( itemPrice, quantity, clientAge, hourOfDay ) : int
    if( hourOfDay > 18 ) then
        discount = 5%
    if( quantity > 10 ) then
        discount = 5%
    if( clientAge > 60 or clientAge < 18 ) then
        discount = 5%
    return itemPrice - discounts...
end
And then translate it into the programming language:
public class Problem1 {
    public int getPrice( int itemPrice, int quantity, int hourOfDay ) {
        int discount = 0;
        if( hourOfDay > 18 ) {
            // uh uh.. U don't know how to calculate percentage...
            // create a function and move on.
            discount += percentOf( 5, itemPrice );
            .
            .
            .
            you get the idea..
        }
    }
    public int percentOf( int percent, int i ) {
        // ....
    }
}
Did you go about it in a similar way? Did someone teach you a similar approach, or did you discover it yourself (as I did :( )
I go via the test-driven approach.
1. I write down (on paper or in a plain text editor) a list of tests or specifications that would satisfy the needs of the problem.
- simple calculations (no discounts and concessions) with:
- single item
- two items
- maximum number of items that doesn't have a discount
- calculate for discounts based on number of items
- buying 10 items gives you a 5% discount
- buying 15 items gives you a 7% discount
- etc.
- calculate based on hourly rates
- calculate morning rates
- calculate afternoon rates
- calculate evening rates
- calculate midnight rates
- calculate based on buyer's age
- children
- adults
- seniors
- calculate based on combinations
- buying 10 items in the afternoon
2. Look for the item that I think would be the easiest to implement and write a test for it. E.g. single items look easy.
A sample using NUnit and C#:
[Test]
public void SingleItems()
{
    Assert.AreEqual(5, GetPrice(5, 1));
}
Implement that using:
public decimal GetPrice(decimal amount, int quantity)
{
    return amount * quantity; // easy!
}
Then move on to the two items.
[Test]
public void TwoItems()
{
    Assert.AreEqual(10, GetPrice(5, 2));
}
The implementation still passes the test so move on to the next test.
3. Always be on the lookout for duplication and remove it. You are done when all the tests pass and you can no longer think of any more tests.
This doesn't guarantee that you will create the most efficient algorithm, but as long as you know what to test for and it all passes, it will guarantee that you are getting the right answers.
the old-school OO way:
write down a description of the problem and its solution
circle the nouns, these are candidate objects
draw boxes around the verbs, these are candidate messages
group the verbs with the nouns that would 'do' the action; list any other nouns that would be required to help
see if you can restate the solution using the form noun.verb(other nouns); a small example follows this list
code it
[this method predates CRC cards, but it's been so long (over 20 years) that I don't remember where I learned it]
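Applied to the pricing problem from the question, that restatement might come out as something like this (our illustration, with hypothetical names; discounts are naively added here just to keep the sketch short):
// noun: Item; verb/message: priceFor; helper nouns become arguments
class Item {
    private final int basePrice;

    Item(int basePrice) { this.basePrice = basePrice; }

    // "the system calculates the price of an item for a client at an hour of day"
    int priceFor(int clientAge, int quantity, int hourOfDay) {
        int discount = 0;
        if (hourOfDay > 18) discount += 5;
        if (quantity > 10) discount += 5;
        if (clientAge > 60 || clientAge < 18) discount += 5;
        return basePrice * quantity * (100 - discount) / 100;
    }
}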
When learning programming, I don't think TDD is helpful. TDD is good later on, when you have some concept of what programming is about; but for starters, having an environment where you write code and see the results with the quickest possible turnaround time is the most important thing.
I'd go from problem statement to code instantly. Hack it around. Help the student see different ways of composing software / structuring algorithms. Teach the student to change their minds and rework the code. Try and teach a little bit about code aesthetics.
Once they can hack around code.... then introduce the idea of formal restructuring in terms of refactoring. Then introduce the idea of TDD as a way to make the process a bit more robust. But only once they are feeling comfortable manipulating code to do what they want. Being able to specify tests is somewhat easier at that stage. The reason is that TDD is about design. When learning, you don't really care so much about design but about what you can do: what toys you have to play with, how they work, and how you combine them. Once you have a sense of that, then you want to think about design, and that's when TDD really kicks in.
From there I'd start introducing micro-patterns, leading into design patterns.
I did something similar.
Figure out the rules/logic.
Figure out the math.
Then try and code it.
After doing that for a couple of months it just gets internalized. You don't realize you're doing it until you come up against a complex problem that requires you to break it down.
I start at the top and work my way down. Basically, I'll start by writing a high level procedure, sketch out the details inside of it, and then start filling in the details.
Say I had this problem (yoinked from Project Euler):
The sum of the squares of the first ten natural numbers is 1^2 + 2^2 + ... + 10^2 = 385.
The square of the sum of the first ten natural numbers is (1 + 2 + ... + 10)^2 = 55^2 = 3025.
Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 - 385 = 2640.
Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.
So I start like this:
(display (- (sum-of-squares (list-to 10))
            (square-of-sums (list-to 10))))
Now, in Scheme, there are no sum-of-squares, square-of-sums or list-to functions, so the next step is to build each of those. In building each of those functions, I may find I need to abstract out more. I try to keep things simple, so that each function only really does one thing. When I build some piece of functionality that is testable, I write a unit test for it. When I start noticing a logical grouping for some data, and the functions that act on it, I may push it into an object.
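In Java rather than Scheme (just for concreteness), those helpers might bottom out like this (hypothetical names mirroring the snippet above):
// sum-of-squares: 1^2 + 2^2 + ... + n^2
static long sumOfSquares(int n) {
    long sum = 0;
    for (int i = 1; i <= n; i++) sum += (long) i * i;
    return sum;
}

// square-of-sums: (1 + 2 + ... + n)^2
static long squareOfSums(int n) {
    long sum = (long) n * (n + 1) / 2;
    return sum * sum;
}

// squareOfSums(10) - sumOfSquares(10) == 3025 - 385 == 2640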
I've enjoyed TDD ever since it was introduced to me. It helps me plan out my code, and it just puts me at ease having all my tests return "success" every time I modify my code, letting me know I'm going home on time today!
Wishful thinking is probably the most important tool to solve complex problems. When in doubt, assume that a function exists to solve your problem (create a stub, at first). You'll come back to it later to expand it.
A good book for beginners looking for a process: Test Driven Development: By Example
My dad had a bunch of flow chart stencils that he made me use when he was first teaching me about programming. To this day I draw squares and diamonds to build out a logical process for how to analyze a problem.
I think there are about a dozen different heuristics I know of when it comes to programming and so I tend to go through the list at times with what I'm trying to do. At the start, it is important to know what is the desired end result and then try to work backwards to find it.
I remember an Algorithms class covering some of these ways like:
Reduce it to a known problem or trivial problem
Divide and conquer (MergeSort being a classic example here)
Use Data Structures that have the right functions (HeapSort being an example here)
Recursion (Knowing trivial solutions and being able to reduce to those)
Dynamic programming
Organizing a solution, as well as testing it for odd situations (e.g. if someone thinks L should be a number), is what I'd usually use to test out the idea in pseudo code before writing it up.
Design patterns can be a handy set of tools to use for specific cases like where an Adapter is needed or organizing things into a state or strategy solution.
Yes... well, TDD didn't exist (or was not that popular) when I began. Would TDD be the way to go to pass from problem description to code? Isn't that a little bit advanced? I mean, when a "future" developer hardly understands what a programming language is, wouldn't it be counterproductive?
What about Hamcrest, to make the transition from algorithm to code?
I think there's a better way to state your problem.
Instead of defining it as 'a system,' define what is expected in terms of user inputs and outputs.
"On a window, a user should select an item from a list, and a box should show him how much it costs."
Then, you can give him some of the factors determining the costs, including sample items and what their costs should end up being.
(this is also very much a TDD-like idea)
Keep in mind: if you get 5% off and then another 5% off, you don't get 10% off. Rather, you pay 95% of 95%, which is 90.25%, i.e. 9.75% off. So you shouldn't add the percentages.
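A two-line check of that arithmetic (in Java):
double price = 100.0;
price *= 0.95; // first 5% off  -> 95.00
price *= 0.95; // second 5% off -> 90.25, i.e. 9.75% off in total, not 10%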