What is the Benjamini-Yekutieli test?

According to the documentation, the tsfresh algorithm makes use of the Benjamini-Yekutieli procedure in its final step. To give you a little bit of context,
tsfresh automatically calculates a large number of time series characteristics, the so-called features. Furthermore, the package contains methods to evaluate the explanatory power and importance of such characteristics for regression or classification tasks.
I have tried to read the linked references but I found them very technical. Can anybody provide a high-level description of the Benjamini-Yekutieli procedure and explain why it is needed? I would like to understand what its main purpose is.
If you don’t know what FRESH is, I would still be happy to read an explanation of the Benjamini-Yekutieli test.

Related

How are customized heuristics implemented for optimization problems (in IDEs) and solvers?

I have read several papers that describe the use of heuristics in various routing problems that result in a faster run time for the solvers used (e.g. Gecode, CBC). For example, in the CPLEX/MiniZinc IDE, we have a constraint model for the Vehicle Routing Problem (VRP) and a data file that contains the locations the vehicle needs to visit (a .dzn file in MiniZinc). Then, the authors in these papers implemented various kinds of routing heuristics to obtain a solution (which may not be optimal) for each constraint model.
Thus my question is: how are the heuristics (possibly customized ones that you designed yourself, not built into the solver) implemented? Are the heuristics written in another IDE?
I have been looking through the online literature but have yet to find out how this is actually done, especially for the case of MiniZinc and CPLEX with the Gecode solver, for example. It would be great if some insight could be shared on this issue! :)

Optimizing assignments based on many variables

I was recently talking with someone in Resource Management and we discussed the problem of assigning developers to projects when there are many variables to consider (of possibly different weights), e.g.:
The developer's skills & the technology/domain of the project
The developer's travel preferences & the location of the project
The developer's interests and the nature of the project
The basic problem the RM person had to deal with on a regular basis was this: given X developers, where each developer has a unique set of attributes/preferences, assign them to Y projects, where each project has its own set of unique attributes/requirements.
It seems clear to me that this is a very mathematical problem; it reminds me of old optimization problems from algebra and/or calculus (I don't remember which) back in high school: you know, find the optimal dimensions for a container to hold the maximum volume given this amount of material—that sort of thing.
My question isn't about the math, but rather whether there are any software projects/libraries out there designed to address this kind of problem. Does anyone know of any?
In my humble opinion, I think that this is putting the cart before the horse. You first need to figure out what problem you want to solve. Then, you can look for solutions.
For example, if you formulate the problem by assigning some kind of numerical compatibility score to every developer/project pair with the goal of maximizing the total sum of compatibility scores, then you have a maximum-weight matching problem which can be solved with the Hungarian algorithm. Conveniently, this algorithm is implemented as part of Google's or-tools library.
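For a concrete illustration of that first formulation, here is a minimal sketch using SciPy's linear_sum_assignment (a Hungarian-style solver); the compatibility scores below are made-up values, and the or-tools library mentioned above offers equivalent solvers.
    import numpy as np
    from scipy.optimize import linear_sum_assignment
    # Hypothetical compatibility scores: rows are developers, columns are projects.
    # Higher means a better developer/project fit; the numbers are invented.
    scores = np.array([
        [7.0, 2.0, 5.0],   # developer 0
        [3.0, 8.0, 1.0],   # developer 1
        [4.0, 4.0, 6.0],   # developer 2
    ])
    # Find the one-to-one assignment that maximizes total compatibility.
    dev_idx, proj_idx = linear_sum_assignment(scores, maximize=True)
    for d, p in zip(dev_idx, proj_idx):
        print(f"developer {d} -> project {p} (score {scores[d, p]})")
    print("total score:", scores[dev_idx, proj_idx].sum())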
On the other hand, let's say that you find that computing compatibility scores to be infeasible or unreasonable. Instead, let's say that each developer ranks all the projects from best to worst (e.g.: in terms of preference) and, similarly, each project ranks each developer from best to worst (e.g.: in terms of suitability to the project). In this case, you have an instance of the Stable Marriage problem, which is solved by the Gale-Shapley algorithm. I don't have a pointer to an established library for G-S, but it's simple enough that it seems that lots of people just code their own.
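Since the answer notes that many people just code Gale-Shapley themselves, here is a minimal, hypothetical sketch of the developer-proposing variant; the names and preference lists are invented for illustration.
    def gale_shapley(dev_prefs, proj_prefs):
        """Developer-proposing Gale-Shapley; returns a dict project -> developer."""
        # Precompute each project's ranking so comparisons are O(1).
        rank = {p: {d: r for r, d in enumerate(prefs)} for p, prefs in proj_prefs.items()}
        free = list(dev_prefs)                   # developers not yet matched
        next_choice = {d: 0 for d in dev_prefs}  # index of next project to propose to
        match = {}                               # project -> developer
        while free:
            d = free.pop()
            p = dev_prefs[d][next_choice[d]]
            next_choice[d] += 1
            if p not in match:
                match[p] = d                         # project was unmatched
            elif rank[p][d] < rank[p][match[p]]:     # project prefers the new proposer
                free.append(match[p])
                match[p] = d
            else:
                free.append(d)                       # proposal rejected, try next project
        return match
    # Invented example data.
    dev_prefs = {"alice": ["crm", "etl"], "bob": ["crm", "etl"]}
    proj_prefs = {"crm": ["bob", "alice"], "etl": ["alice", "bob"]}
    print(gale_shapley(dev_prefs, proj_prefs))   # {'crm': 'bob', 'etl': 'alice'}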
Yes, there are mathematical methods for solving a type of problem which this problem can be shoehorned into. It is the natural consequence of thinking of developers as "resources", like machine parts, largely interchangeable, their individuality easily reduced to simple numerical parameters. You can make up rules such as
The fitness value is equal to the subject skill parameter multiplied by the square root of the reliability index.
and never worry about them again. The same rules can be applied to different developers, different subjects, different scales of projects (with a SLOC scaling factor of, say, 1.5). No insight or real leadership is needed, the equations make everything precise and "assured". The best thing about this approach is that when the resources fail to perform the way your equations say they should, you can just reduce their performance scores to make them fit. And if someone has already written the tool, then you don't even have to worry about the math.
(It is interesting to note that Resource Management people always seem to impose such metrics on others in an organization, thereby making their own jobs easier, and never on themselves...)

Looking for ideas/references/keywords: adaptive-parameter-control of a search algorithm (online-learning)

I'm looking for ideas/experiences/references/keywords regarding an adaptive-parameter-control of search algorithm parameters (online-learning) in combinatorial-optimization.
A bit more detail:
I have a framework which is responsible for optimizing a hard combinatorial-optimization problem. This is done with the help of some "small heuristics" which are used in an iterative manner (large-neighborhood-search; ruin-and-recreate approach). Each of these "small heuristics" takes some external parameters, which control the heuristic's logic to some extent (at the moment: just random values; some kind of noise; diversify the search).
Now I want to have a control framework for choosing these parameters in a convergence-improving way, as general as possible, so that later additions of new heuristics are possible without changing the parameter control.
There are at least two general decisions to make:
A: Choose the algorithm-pair (one destroy- and one rebuild-algorithm) which is used in the next iteration.
B: Choose the random parameters of the algorithms.
The only feedback is an evaluation-function of the new-found-solution. That leads me to the topic of reinforcement-learning. Is that the right direction?
Not really learning-like behavior, but my simplistic ideas at the moment are (see the sketch after these ideas):
A: A roulette-wheel selection according to some performance value collected during the iterations (the near past is valued more than older iterations).
So if heuristic 1 found all the new global best solutions -> high probability of choosing this one.
B: No idea yet. Maybe it's possible to use some non-uniform random values in the range (0,1) while I collect some momentum of the changes.
So if heuristic 1 last used alpha = 0.3 and found no new best solution, then used 0.6 and found a new best solution -> there is momentum towards 1
-> the next random value is likely to be bigger than 0.3. Possible problem: oscillation!
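For idea A, a minimal, hypothetical sketch of a recency-weighted roulette-wheel selection over heuristic pairs could look like this (the scores, decay factor, and reward rule are assumptions for illustration, not part of the framework):
    import random
    class RouletteSelector:
        """Pick an operator (e.g. a destroy/rebuild pair) with probability
        proportional to a recency-weighted performance score."""
        def __init__(self, operators, decay=0.9, floor=0.05):
            self.scores = {op: 1.0 for op in operators}  # optimistic start
            self.decay = decay   # how quickly old performance is forgotten
            self.floor = floor   # keep every operator selectable
        def select(self):
            ops = list(self.scores)
            weights = [self.scores[op] for op in ops]
            return random.choices(ops, weights=weights, k=1)[0]
        def feedback(self, op, reward):
            # Exponential forgetting: recent iterations count more than old ones.
            for o in self.scores:
                self.scores[o] = max(self.floor, self.decay * self.scores[o])
            self.scores[op] += reward  # e.g. 1.0 for a new global best, 0 otherwise
    # Hypothetical usage inside the large-neighborhood-search loop:
    selector = RouletteSelector(["random_ruin+greedy", "worst_ruin+regret"])
    pair = selector.select()
    # ... run the chosen destroy/rebuild pair, evaluate the new solution ...
    selector.feedback(pair, reward=1.0)  # reward it if it found a new best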
Things to remark:
- The parameters needed for good convergence of one specific algorithm can change dramatically -> maybe more diversify-operations needed at the beginning, more intensify-operations needed at the end.
- There is a possibility of good synergistic-effects in a specific pair of destroy-/rebuild-algorithm (sometimes called: coupled neighborhoods). How would one recognize something like that? Is that still in the reinforcement-learning-area?
- The different algorithms are controlled by a different number of parameters (some taking 1, some taking 3).
Any ideas, experiences, references (papers), keywords (ml-topics)?
If there are ideas regarding decision (B) in an offline-learning manner, don't hesitate to mention them.
Thanks for all your input.
Sascha
You have a set of parameter variables which you use to control your set of algorithms. Selection of your algorithms is just another variable.
One approach you might like to consider is to evolve your 'parameter space' using a genetic algorithm. In short, GA uses an analogue of the processes of natural selection to successively breed ever better solutions.
You will need to develop an encoding scheme to represent your parameter space as a string, and then create a large population of candidate solutions as your starting generation. The genetic algorithm itself takes the fittest solutions in your set and then applies various genetic operators to them (mutation, reproduction etc.) to breed a better set which then become the next generation.
The most difficult part of this process is developing an appropriate fitness function: something to quantitatively measure the quality of a given parameter space. Your search problem may be too complex to measure for each candidate in the population, so you will need a proxy model function which might be as hard to develop as the ideal solution itself.
Without understanding more of what you've written, it's hard to say whether this approach is viable or not. GAs are usually well suited to multi-variable optimisation problems like this, but they're not a silver bullet. For a reference, start with Wikipedia.
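As a rough illustration only (not the questioner's framework), here is a minimal sketch of evolving a real-valued parameter vector with a toy fitness function; the encoding, operators, and fitness below are all assumptions.
    import random
    def evolve(fitness, dim, pop_size=30, generations=50, mut_sigma=0.1):
        """Tiny genetic algorithm over real-valued parameter vectors in [0, 1]."""
        pop = [[random.random() for _ in range(dim)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)   # fittest candidates first
            parents = pop[: pop_size // 2]        # truncation selection
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, dim) if dim > 1 else 1
                child = a[:cut] + b[cut:]         # one-point crossover
                child = [min(1.0, max(0.0, g + random.gauss(0, mut_sigma)))
                         for g in child]          # Gaussian mutation, clipped to [0, 1]
                children.append(child)
            pop = parents + children
        return max(pop, key=fitness)
    # Toy fitness: in the real setting this would run the search framework with the
    # candidate parameters and return the evaluation of the best solution found.
    best = evolve(lambda p: -sum((g - 0.7) ** 2 for g in p), dim=3)
    print(best)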
This sounds like hyper-heuristics, which is what you're trying to do. Try looking for that keyword.
In Drools Planner (open source, Java) I have support for tabu search and simulated annealing out of the box.
I haven't implemented the ruin-and-recreate-approach (yet), but that should be easy, although I am not expecting better results. Challenge: Prove me wrong and fork it and add it and beat me in the examples.
Hyper heuristics are on my TODO list.

Software Metrics in Agile Methodologies [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Agile methodologies are rather prevalent these days, but I cannot seem to find much documentation on what metrics are most useful and why. I have found many more things saying that some traditional metrics like LOC and code coverage of tests are not appropriate, leaving two main questions:
Why are those two (and other) metrics inappropriate?
What metrics are best for Agile and why?
Even with an Agile process, wouldn't you want to know how much code coverage you have with your unit tests? Or is it simply that this and other metrics just aren't as useful as metrics like cyclomatic complexity and velocity?
Agile is a business-oriented thing: it is about maximizing customer value while minimizing waste to provide the best possible ROI. That is what should get measured. And to do so, I use the system that Mary Poppendieck recommends. This system is based on three holistic measurements that must be taken as a package:
Cycle time
From product concept to first release or
From feature request to feature deployment or
From bug detection to resolution
Business Case Realization (without this, everything else is irrelevant)
P&L or
ROI or
Goal of investment
Customer Satisfaction
e.g. Net Promoter Score
Sure, at the team level you can track things like test coverage, cyclomatic complexity, conformance to coding standards, etc., but high quality is not an end in itself, it's just a means. Don't misinterpret me: I'm not saying high quality doesn't matter; high quality is mandatory to achieve a sustainable pace (and we include "no increase of the technical debt" in our Definition of Done), but still, the goal is to deliver value to the customer in a fast and profitable way.
Irrespective of methodology, there are some basic metrics that can and should be used.
According to S. Kahn, the most important are the following three:
size of product
number of defects found in final phase of testing
and number of defects found in the field.
If those are all you track, there are at least five ways they can be used (a small worked example follows the list):
calculate product defect rate (A)
calculate test defect rate (B)
determine a desirable goal for A and monitor the performance
determine a desirable goal for B and monitor the performance
assess correlation between A and B
if correlation is found, form metric of test effectiveness (B/A * 100%)
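As a small, made-up worked example of those rates (the numbers and the per-KLOC normalization are assumptions, not from the book):
    # Hypothetical release figures, purely for illustration.
    size_kloc = 100        # size of product, in thousands of lines of code
    test_defects = 40      # defects found in the final phase of testing
    field_defects = 10     # defects found in the field
    product_defect_rate = field_defects / size_kloc  # A = 0.1 defects per KLOC
    test_defect_rate = test_defects / size_kloc      # B = 0.4 defects per KLOC
    # If A and B turn out to be correlated across releases, the answer's
    # test-effectiveness metric is B / A * 100%: here 400%, i.e. testing caught
    # four defects for every one that escaped to the field.
    print(product_defect_rate, test_defect_rate,
          test_defect_rate / product_defect_rate * 100)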
Although not necessarily fun to read, Metrics and Models in Software Quality Engineering provides an excellent in-depth software engineering and metrics overview.
1.1) LOC is easy to answer
It really depends on the language you use! The same feature might differ greatly in size when written in Java or in Ruby, for example.
Poorly written software might have more lines than well-written software!
1.2) Code coverage
IMHO you should use this metric; although it's not perfect, it should give you a good understanding of where your code needs more tests.
One point you should take care of here is that it is also language-dependent. There could be situations where you have a class or method that you really don't need to test! For example, a class with only getters and setters.
2) In (1) you just mentioned code metrics, but judging from your question about velocity, you are interested in metrics for the whole creation process, so I would list some:
Velocity: the classic one and, if used well, it can improve an agile team's performance quite a bit, since you will know what your team can really do in a fixed amount of time.
Burn-up and burn-down charts: they can give you a good notion of how the team is performing during the iteration (sprint).
There are some articles on InfoQ about this. Here and here.
As for question 1, I don't see any reason those metrics would be bad in an Agile process.
LOC provides you with a relative size measurement. While it may not always be useful to compare numbers between projects, it can provide you with a rate of growth within the project. If you can get it, the number of lines changed within a sprint may be useful as well to track a rate of refactoring.
Code coverage (of lines of code) gives you a general sense of whether or not your team is meeting a minimum bar of automated testing within a project.
As for question 2, keep the items above and here are a few more:
LOC versus test count. If you can, maintain separate ratios for unit, integration and system tests.
Average number of acceptance criteria versus test scenarios (or tests) for each story. It can help provide a better sense of whether or not you're testing against the story's intent.
Number of defects discovered
Amount of work discovered (this is often captured by Agile tracking software) that wasn't part of the original estimates. It will help you judge whether you are doing 'enough' planning.
Tracking consistency, or lack thereof, of velocity from sprint to sprint
While probably not popular and potentially dangerous, tracking estimates versus work completed for each developer. While teams are supposed to be self-organized and driven, not all teams are capable of dealing with human problems.
Just to add
Why LOC and Code Coverage of Tests are less than ideal:
Agile emphasizes outcome, not output (see Agile Manifesto). These two simply track output. Also, they do not properly measure refactoring, which is a vital aspect of Agile processes.
Another metric to consider would be Running Tested Features. I can't describe any better than this: http://xprogramming.com/articles/jatrtsmetric/
I'm going to answer to this very old question...
LOC and test coverage are, in my opinion, good metrics, but they have one big problem: if you push them, you can make them grow fast, but the result will be terrifying: tons of nonsense code; or, for test coverage, you can invoke all your code in a try-catch block and not write a single assert... Or even worse, write one just for "compliance" reasons, but without any business-facing or code-facing meaning...
So, these kinds of metrics are very good if they help the team to honestly evaluate their outcome, but are an evil tool if they form part of some "compliance" rules, as using them in that way causes more harm (dead code, bad tests!) than the good you originally wanted to achieve.
So, with every metric, think about how you would trick it if you were forced to achieve a certain value, and think of the consequences... This is not an issue of LOC or test coverage only; many other metrics can have a similar outcome, even cyclomatic complexity... If you divide your code badly, you can reduce cyclomatic complexity, but that doesn't mean you get better or more readable code!
So, these kinds of metrics are quite good for seeing what's happening inside a team, but any measure you take should be based on concrete goals, not on the metric itself... For example:
Test coverage is low: you implement coding dojos once a month to help train people to write testable code, you find out what code has the worst test coverage and try to implement a better / more testable architecture that helps / motivates developers to write tests, etc.
As you can see, you never tell the team to achieve a certain value of test coverage; you just use the metric to see where you can improve and then look for measures that benefit your process. After a while you would expect test coverage to increase, but you are not pushing people to do so! You are evaluating changes in order to see if the measures are helping. If after a time you find out that test coverage has not changed with your measures, then it's time to look for other ideas, and so on...

How to test numerical analysis routines?

Are there any good online resources for how to create, maintain and think about writing test routines for numerical analysis code?
One of the limitations I can see for something like testing matrix multiplication is that the obvious tests (like having one matrix be the identity) may not fully test the functionality of the code.
Also, there is the fact that you are usually dealing with large data structures as well. Does anyone have some good ideas about ways to approach this, or have pointers to good places to look?
It sounds as if you need to think about testing in at least two different ways:
Some numerical methods allow for some meta-thinking. For example, invertible operations allow you to set up test cases to see whether the result is within acceptable error bounds of the original: matrix M-inverse times the matrix M times a random vector V should result in V again, to within some acceptable measure of error.
Obviously, this example exercises matrix inverse, matrix multiplication and matrix-vector multiplication. I like chains like these because you can generate quite a lot of random test cases and get statistical coverage that would be a slog to have to write by hand. They don't exercise single operations in isolation, though.
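A minimal sketch of that inverse chain as a randomized test (NumPy stands in for whatever library is under test, and the matrix size, condition-number cutoff, and tolerance are arbitrary choices):
    import numpy as np
    rng = np.random.default_rng(seed=0)
    def test_inverse_roundtrip(trials=100, n=8, base_tol=1e-10):
        """inv(M) @ (M @ v) should give v back, with a tolerance scaled by cond(M)."""
        for _ in range(trials):
            M = rng.standard_normal((n, n))
            v = rng.standard_normal(n)
            cond = np.linalg.cond(M)
            if cond > 1e6:
                continue  # nearly singular matrices deserve their own, looser test
            w = np.linalg.inv(M) @ (M @ v)
            assert np.allclose(w, v, atol=base_tol * cond), (cond, w - v)
        print("inverse round-trip tests passed")
    test_inverse_roundtrip()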
Some numerical methods have a closed-form expression of their error. If you can set up a situation with a known solution, you can then compare the difference between the solution and the calculated result, looking for a difference that exceeds these known bounds.
Fundamentally, this question illustrates the problem that testing complex methods well requires quite a lot of domain knowledge. Specific references would require a little more specific information about what you're testing. I'd definitely recommend that you at least have Steve Yegge's recommended book list on hand.
If you're going to be doing matrix calculations, use LAPACK. This is very well-tested code. Very smart people have been working on it for decades. They've thought deeply about issues that the uninitiated would never think about.
In general, I'd recommend two kinds of testing: systematic and random. By systematic I mean exploring edge cases etc. It helps if you can read the source code. Often algorithms have branch points: calculate this way for numbers in this range, this other way for numbers in another range, etc. Test values close to the branch points on either side because that's where approximation error is often greatest.
Random input values are important too. If you rationally pick all the test cases, you may systematically avoid something that you don't realize is a problem. Sometimes you can make good use of random input values even if you don't have the exact values to test against. For example, if you have code to calculate a function and its inverse, you can generate 1000 random values and see whether applying the function and its inverse put you back close to where you started.
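As a small, hypothetical example of that random round-trip idea for a scalar function and its inverse (the input range and tolerances are arbitrary choices):
    import math
    import random
    random.seed(0)
    def test_exp_log_roundtrip(trials=1000):
        """log(exp(x)) should return x almost exactly for x where exp() is safe."""
        for _ in range(trials):
            x = random.uniform(-50.0, 50.0)
            y = math.log(math.exp(x))
            assert math.isclose(y, x, rel_tol=1e-12, abs_tol=1e-12), (x, y)
        print("exp/log round-trip tests passed")
    test_exp_log_roundtrip()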
Check out a book by David Gries called The Science of Programming. It's about proving the correctness of programs. If you want to be sure that your programs are correct (to the point of proving their correctness), this book is a good place to start.
Probably not exactly what you're looking for, but it's the computer science answer to a software engineering question.