Designing a minor circuit - input

I've got this assignment that demands designing a minor circuit of 3 inputs so that the circuit returns the value of the less seen input.I've tried numerous times,but it seems like I'm stuck. Any hint would be appreciated.

You should drawing up a truth table of the three inputs, and the expected/ desired output.
To get the correct result, probably you could use an ADDER (or perhaps 2). You could sum A, B and C inputs. If more than 2 inputs are true, the "least seen" will be false. If less than 2 inputs are true, the "least seen" will be true.
Look up what gates are needed to implement an ADDER (possibly with carry) and then test its output.
The optimized solution might be simpler than this, but that shoudl give you some ideas.

Related

What is the most conclusive way to evaluate an n-way split test where n > 2?

I have plenty of experience designing, running and evaluating two-way split tests (A/B Tests). Those are by far the most common in digital marketing, where I do most of my work.
However, I'm wondering if anything about the methodology needs to change when more variants are introduced into an experiment (creating, say, a 3-way test (A/B/C Test)).
My instinct tells me I should just run n-1 evaluations against the control group.
If I run a 3-way split test, for example, instinct says I should find significance and power twice:
Treatment A vs Control
Treatment B vs Control
So, in that case, I'm finding out which, if any, treatment performed better than the control (1-tailed test, alt: treatment - control > 0, the basic marketing hypothesis).
But, I'm doubting my instinct. It's occurred to me that running a third test contrasting Treatment A with Treatment B could yield confusing results.
For example, what if there's not enough evidence to reject a null that treatment B = treatment A?
That would lead to a goofy conclusion like this:
Treatment A = Control
Treatment B > Control
Treatment B = Treatment A
If treatments A and B are likely only different due to random chance, how could only one of them outperform the control?
And that's making me wonder if there's a more statistically sound way to evaluate split tests with more than one treatment variable. Is there?
Your instincts are correct, and you can feel less goofy by rewording your statements:
We could find no statistically significant difference between Treatment A and Control.
Treatment B is significantly better than Control.
However it remains inconclusive whether Treatment B is better than Treatment A.
This would be enough to declare Treatment B a winner, with the possible followup of retesting A vs B. But depending on your specific situation, you may have a business need to actually make sure Treatment B is better than Treatment A before moving forward and you can make no such decision with your data. You must gather more data and/or restart a new test.
What I've found is a far more common scenario is Treatment A and Treatment B both soundly beat control (as they're often closely related and have related hypotheses), but there is no statistically significant difference between Treatment A or Treatment B. This is an interesting scenario where if you are required to pick a winner, it's okay throwing significance out the window and picking the one that has the strongest effect. The reason why is that the significance level (eg. 95%) is set to avoid false positives and making unnecessary changes. There's an assumption that there are switching costs. In this case, you must pick A or B and throw out control, so in my opinion it's okay picking the best one until you have more data.

Multi-objective optimization but the function equation is unknown?

Firstly, I am totally out of my expertise zone so please bear with me.
I developed a fluid dynamic engine with 5 exposed parameters (say A,B,C,D,E). When you give this engine these 5 parameters, it does magic and give out a value 'Z'.
I want to write a script which can explore which combinations of A-E give lowest (or close to lowest) value of Z.
I know optimization algorithm exists, but from all of my search for examples, they use some function.
So I guess my function would simply be minimize Z? But where do A-E go?
Not really an answer, but some questions and ideas that might help you think through the best way to address this. We have no understanding of how big a range of values needs to be explored for those parameters, or how Z behaves, so this is very vague...
If you look at the values of Z for given values of A...E, does the value of Z jump around a lot for small changes on the parameter values, or does the Z value change reasonably smoothly?
If the Z value is not too eratic you could try some kind of gradient descent approach using calculated values of Z for some values of the parameters to approximate the gradient - suppose changing the value of 'A' from 1 to 2 gives a better change in the Z value than a similar size change in the other parameters, then try other values of A while keeping the other parameters fixed until you find a value of A that gives the best value of Z. Then try changing the other parameter values to see which one gives the steepest descent and try to find some better value for that parameter. Repeat this process until you can't find any improvement and you will have found a (local) minimum. You could then start at a different place in your parameter space and try again - you will probably find several local minima, and may just choose the best of those. Not provably optimal but may be good enough. Of course you can get clever and use things like conjugate gradients, Newton-Raphson or similar if Z is smooth enough.
If the Z values are very eratic, then you might have to just do some sampling of the possible combinations of A...E to get values of Z and choose the best you can find. Again you might do that in some systematic way (e.g. points on a grid in your parameter space) or entirely at random, or a combination of both.
If you find that there are 'clusters' of good solutions with similar values of the parameters then maybe some kind of local search would help - the idea is that there is often a better solution in the local neighbourhood of a known good solution. So maybe try perturbing your parameter values a bit from a known solution to see if that can lead to a better solution - either by some gradient descent method or by random sampling.
Unfortunately, if your Z calculation is complex, then any method using it as a black box will likely be slow as it will need to be re-evaluated many times.
You could use a Genetic Algorithm, where your chromosomes are formed with the 5 candidate values of the variables you have to optimize, to minimize Z, and your optimization/fitness "function" is the simulation itself outputting Z.
Other viable alternatives are Particle Swarm Optimization algorithm or Ant Colony Optimization. All of those are usable algortihms for that kind of optimization problem.

I have a few question about MC/DC and piarwise testing

Recently, I started working on software testing, and I had some questions.
Pairwise testing is the combination of all the values that this parameter can have, and is it also applicable to Boolean Expression?
For example,
boolean expression is (A || B) && C
(It is assumed that each parameter has only 0 and 1.)
Here, Is it applicable to Boolean Exp ??
The second question is about MC/DC.
I had learned how to make test case through MC/DC
But, I wondered how MC/DC could prove to cover almost 90% code coverage ?
In (A || B) && C, there are 4 combination test case in my guess.
But, All combination is 8. how could MC/DC reduce cases ?
Is it applicable in Boolean expressions?
Yes. It's applicable to Boolean expressions.
Getting all possible combinations of your Boolean expression, we can have the truth table above.
How could MC/DC prove to cover almost 90% code coverage?
MC/DC can't assure over 90% code coverage. However, it can assure the decision, branch, and condition coverages, which are integral parts of a piece of code.
But, how could it prove to cover?
The answer will lie in the properties of the MC/DC criterion:
Each condition in a Boolean expression should take every possible
outcome.
Each decision takes every possible outcome.
Each condition is shown to independently affect the outcome of the decision.
So each condition's Boolean outcome has been considered (TRUE/FALSE) and the combination of these condition's Boolean outcome (decision) will result in every possible value (TRUE/FALSE).
How could MC/DC reduce cases?
When you identify MC/DC pairs, you will come up with this table:
Some of these pairs are similar. Why? Because when you evaluate the Boolean expression, you can short-circuit some of the conditions. This means that your expression can have a decision even if at least one of the conditions are not being evaluated.
This will be the final result. Notice that some of the rows have (-) empty values. This means that it wasn't evaluated, but the decision can be inferred.
Although relevant, but unrelated, I wrote an article here: How MC/DC Can Speedup Unit Test Creation
Hope this helps despite being late. :D

multiplying many numbers, but stopping at zero

Suppose one has a function, prod(x), that returns the product of a sequence of numbers, x.
If the number of operands in x can be arbitrarily large, how might one think about reducing the time to compute the product of x by ceasing multiplication when zero is encountered? Is this something that mature compilers and interpreters already do?
If one is writing one's own prod(x) function, what is the best approach? Does it ever make sense to do something like if 0 in x then return(0) else multiply(x)?
For example, if x = 1,0,3,4,...,-4,9, one shouldn't need to continue multiplying past the second term, right?
Any multiplication with 0 returns 0, so you'd check if there's a 0 in a sequence and just return 0.
In general: Compiler optimizations are mostly done on the given source code. You're talking about optimization on run-time data, which compilers (normally) won't do. This means that you have to write these optimizations yourself.
In this case: You can indeed stop when there's a 0.
Since the multiplication operation commutes, you can evaluate the terms in any order. If you were to have a sorted sequence of numbers, you could simply see if the first item is zero.

Building ranking with genetic algorithm,

Question after BIG edition :
I need to built a ranking using genetic algorithm, I have data like this :
P(a>b)=0.9
P(b>c)=0.7
P(c>d)=0.8
P(b>d)=0.3
now, lets interpret a,b,c,d as names of football teams, and P(x>y) is probability that x wins with y. We want to build ranking of teams, we lack some observations P(a>d),P(a>c) are missing due to lack of matches between a vs d and a vs c.
Goal is to find ordering of team names, which the best describes current situation in that four team league.
If we have only 4 teams than solution is straightforward, first we compute probabilities for all 4!=24 orderings of four teams, while ignoring missing values we have :
P(abcd)=P(a>b)P(b>c)P(c>d)P(b>d)
P(abdc)=P(a>b)P(b>c)(1-P(c>d))P(b>d)
...
P(dcba)=(1-P(a>b))(1-P(b>c))(1-P(c>d))(1-P(b>d))
and we choose the ranking with highest probability. I don't want to use any other fitness function.
My question :
As numbers of permutations of n elements is n! calculation of probabilities for all
orderings is impossible for large n (my n is about 40). I want to use genetic algorithm for that problem.
Mutation operator is simple switching of places of two (or more) elements of ranking.
But how to make crossover of two orderings ?
Could P(abcd) be interpreted as cost function of path 'abcd' in assymetric TSP problem but cost of travelling from x to y is different than cost of travelling from y to x, P(x>y)=1-P(y<x) ? There are so many crossover operators for TSP problem, but I think I have to design my own crossover operator, because my problem is slightly different from TSP. Do you have any ideas for solution or frame for conceptual analysis ?
The easiest way, on conceptual and implementation level, is to use crossover operator which make exchange of suborderings between two solutions :
CrossOver(ABcD,AcDB) = AcBD
for random subset of elements (in this case 'a,b,d' in capital letters) we copy and paste first subordering - sequence of elements 'a,b,d' to second ordering.
Edition : asymetric TSP could be turned into symmetric TSP, but with forbidden suborderings, which make GA approach unsuitable.
It's definitely an interesting problem, and it seems most of the answers and comments have focused on the semantic aspects of the problem (i.e., the meaning of the fitness function, etc.).
I'll chip in some information about the syntactic elements -- how do you do crossover and/or mutation in ways that make sense. Obviously, as you noted with the parallel to the TSP, you have a permutation problem. So if you want to use a GA, the natural representation of candidate solutions is simply an ordered list of your points, careful to avoid repitition -- that is, a permutation.
TSP is one such permutation problem, and there are a number of crossover operators (e.g., Edge Assembly Crossover) that you can take from TSP algorithms and use directly. However, I think you'll have problems with that approach. Basically, the problem is this: in TSP, the important quality of solutions is adjacency. That is, abcd has the same fitness as cdab, because it's the same tour, just starting and ending at a different city. In your example, absolute position is much more important that this notion of relative position. abcd means in a sense that a is the best point -- it's important that it came first in the list.
The key thing you have to do to get an effective crossover operator is to account for what the properties are in the parents that make them good, and try to extract and combine exactly those properties. Nick Radcliffe called this "respectful recombination" (note that paper is quite old, and the theory is now understood a bit differently, but the principle is sound). Taking a TSP-designed operator and applying it to your problem will end up producing offspring that try to conserve irrelevant information from the parents.
You ideally need an operator that attempts to preserve absolute position in the string. The best one I know of offhand is known as Cycle Crossover (CX). I'm missing a good reference off the top of my head, but I can point you to some code where I implemented it as part of my graduate work. The basic idea of CX is fairly complicated to describe, and much easier to see in action. Take the following two points:
abcdefgh
cfhgedba
Pick a starting point in parent 1 at random. For simplicity, I'll just start at position 0 with the "a".
Now drop straight down into parent 2, and observe the value there (in this case, "c").
Now search for "c" in parent 1. We find it at position 2.
Now drop straight down again, and observe the "h" in parent 2, position 2.
Again, search for this "h" in parent 1, found at position 7.
Drop straight down and observe the "a" in parent 2.
At this point note that if we search for "a" in parent one, we reach a position where we've already been. Continuing past that will just cycle. In fact, we call the sequence of positions we visited (0, 2, 7) a "cycle". Note that we can simply exchange the values at these positions between the parents as a group and both parents will retain the permutation property, because we have the same three values at each position in the cycle for both parents, just in different orders.
Make the swap of the positions included in the cycle.
Note that this is only one cycle. You then repeat this process starting from a new (unvisited) position each time until all positions have been included in a cycle. After the one iteration described in the above steps, you get the following strings (where an "X" denotes a position in the cycle where the values were swapped between the parents.
cbhdefga
afcgedbh
X X X
Just keep finding and swapping cycles until you're done.
The code I linked from my github account is going to be tightly bound to my own metaheuristics framework, but I think it's a reasonably easy task to pull the basic algorithm out from the code and adapt it for your own system.
Note that you can potentially gain quite a lot from doing something more customized to your particular domain. I think something like CX will make a better black box algorithm than something based on a TSP operator, but black boxes are usually a last resort. Other people's suggestions might lead you to a better overall algorithm.
I've worked on a somewhat similar ranking problem and followed a technique similar to what I describe below. Does this work for you:
Assume the unknown value of an object diverges from your estimate via some distribution, say, the normal distribution. Interpret your ranking statements such as a > b, 0.9 as the statement "The value a lies at the 90% percentile of the distribution centered on b".
For every statement:
def realArrival = calculate a's location on a distribution centered on b
def arrivalGap = | realArrival - expectedArrival |
def fitness = Σ arrivalGap
Fitness function is MIN(fitness)
FWIW, my problem was actually a bin-packing problem, where the equivalent of your "rank" statements were user-provided rankings (1, 2, 3, etc.). So not quite TSP, but NP-Hard. OTOH, bin-packing has a pseudo-polynomial solution proportional to accepted error, which is what I eventually used. I'm not quite sure that would work with your probabilistic ranking statements.
What an interesting problem! If I understand it, what you're really asking is:
"Given a weighted, directed graph, with each edge-weight in the graph representing the probability that the arc is drawn in the correct direction, return the complete sequence of nodes with maximum probability of being a topological sort of the graph."
So if your graph has N edges, there are 2^N graphs of varying likelihood, with some orderings appearing in more than one graph.
I don't know if this will help (very brief Google searches did not enlighten me, but maybe you'll have more success with more perseverance) but my thoughts are that looking for "topological sort" in conjunction with any of "probabilistic", "random", "noise," or "error" (because the edge weights can be considered as a reliability factor) might be helpful.
I strongly question your assertion, in your example, that P(a>c) is not needed, though. You know your application space best, but it seems to me that specifying P(a>c) = 0.99 will give a different fitness for f(abc) than specifying P(a>c) = 0.01.
You might want to throw in "Bayesian" as well, since you might be able to start to infer values for (in your example) P(a>c) given your conditions and hypothetical solutions. The problem is, "topological sort" and "bayesian" is going to give you a whole bunch of hits related to markov chains and markov decision problems, which may or may not be helpful.