How often x occurs after y happens in 3 different experimental conditions - frequency

I want to find out how often x happens when y occurs between 3 different experimental conditions.
For example:
The behaviour (positive reaction, negative reaction) of 10 people was observed in 3 different places. Each person experienced each place once.
Place 1. basketball court
Place 2. tennis court
Place 3. netball court
How can I find out the proportion of people who had the same reaction in places 1 and 2, and the same reaction in places 1 and 3?
Please help if possible.
Thank you.
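One way to compute the two proportions asked about is to store each person's three reactions as one row and count the rows where the reactions match. The sketch below is only an illustration: the data values, the "pos"/"neg" labels, and the function name `proportion_same` are made up, not taken from the question.

```python
# Hypothetical data: one row per person, giving the reaction observed in
# place 1 (basketball), place 2 (tennis) and place 3 (netball).
reactions = [
    ("pos", "pos", "neg"),
    ("pos", "neg", "pos"),
    ("neg", "neg", "neg"),
    ("pos", "pos", "pos"),
    ("neg", "pos", "neg"),
    ("neg", "neg", "pos"),
    ("pos", "pos", "pos"),
    ("neg", "pos", "pos"),
    ("pos", "pos", "neg"),
    ("neg", "neg", "neg"),
]


def proportion_same(rows, i, j):
    """Share of people whose reaction in column i equals that in column j."""
    matches = sum(1 for row in rows if row[i] == row[j])
    return matches / len(rows)


same_1_2 = proportion_same(reactions, 0, 1)  # places 1 and 2
same_1_3 = proportion_same(reactions, 0, 2)  # places 1 and 3
print(same_1_2, same_1_3)
```

Because every person was observed in every place, the rows are paired, so any comparison of the two proportions would need a method for repeated measures rather than one for independent groups.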

Related

Is there an algorithm for near optimal partition of the Travelling salesman problem, creating routes that need the same time to complete?

I have a problem to solve. I need to visit 7962 places with a vehicle. The vehicle travels at 10 km/h, and each time I visit a place I stay there for 1 minute. I want to divide those 7962 places into subsets that each take up to 8 hours. So let's say 200 places take 8 hours: I visit them and come back the next day to visit another subset of maybe 250 places (the 200-place subset requires more distance travelled). For the distance I only care about Euclidean distances; there is no need to take the road network into account.
A map of the 7962 places
What I have done so far is use the k-means clustering algorithm to get good-enough subsets and then the Lin-Kernighan heuristic (in the Concorde program) to find the distance, and then compute the times. But my results range from 4 hours to 12 hours. Any ideas to make this better? Or code that does the whole task in one go? Propose anything, but I am not a programmer; I just use Python sometimes.
Set of coordinates :
http://www.filedropper.com/wholesetofcoordinates
Coordinate subsets (40 clusters produced with the k-means algorithm):
http://www.filedropper.com/kmeans40clusters
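One alternative to clustering first and timing afterwards is to grow each day's route until the 8-hour budget is used up. The sketch below does this with a plain nearest-neighbour rule, which is a swapped-in simplification and not the Lin-Kernighan heuristic. It assumes coordinates are planar and already in kilometres, and that a route is an open path with no return trip to a depot (the question does not mention one). The function names are hypothetical.

```python
import math

SPEED_KMH = 10.0          # vehicle speed from the question
STOP_HOURS = 1.0 / 60.0   # one minute spent at each place
BUDGET_HOURS = 8.0        # maximum duration of one day's route


def route_hours(route):
    """Travel time along the route in visiting order plus one minute per stop."""
    dist = sum(math.dist(a, b) for a, b in zip(route, route[1:]))
    return dist / SPEED_KMH + len(route) * STOP_HOURS


def greedy_routes(points, budget=BUDGET_HOURS):
    """Split (x, y) points given in km into routes that each fit the budget."""
    remaining = list(points)
    routes = []
    while remaining:
        route = [remaining.pop(0)]   # assumption: start at an arbitrary place
        hours = STOP_HOURS
        while remaining:
            last = route[-1]
            nxt = min(remaining, key=lambda p: math.dist(last, p))
            extra = math.dist(last, nxt) / SPEED_KMH + STOP_HOURS
            if hours + extra > budget:
                break                # the day is full, start a new route
            route.append(nxt)
            remaining.remove(nxt)
            hours += extra
        routes.append(route)
    return routes
```

Every route produced this way stays within the budget by construction, although the last routes may be much shorter than 8 hours. Each route's visiting order could still be improved afterwards with a stronger tour heuristic such as Lin-Kernighan.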

Generate a series of pseudo-random sequences to remove sequential bias

I have n things that are going to be tested by t testers. Each tester is subject to order bias (that is, they may give higher ratings to things they test earlier - or the reverse), so I want to eliminate that. One way to do that would be to generate a random testing sequence for each tester.
For example, with n=5, t=2:
Tester 1    Tester 2
4           2
2           5
5           4
1           3
3           1
Notice that in this generated sequence however, both testers test thing 5 immediately after thing 2. Also, things 3 and 1 both appear towards the end of the testing, for both testers. This is sub-optimal.
My question: how can I generate an optimal set of sequences, maximising the chance of each thing appearing at every different position, and minimising the repetition of individual sequences?
Harder question: how much more optimal would that be than the naive (pseudo-)random generation? Can that be quantified?
This isn't a homework question, although it may sound like it :) (It's for a wine tasting...)
I really wasn't sure whether this should go in math.stackexchange, cs.stackexchange, or here. Ultimately I actually want to implement this, so...
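One standard construction that addresses both concerns (position balance and repeated "X right after Y" pairs) is a Williams design, a Latin square that is also balanced for immediate predecessors. The sketch below is a minimal version under the assumption that the number of testers is a multiple of the number of rows it produces (n rows for even n, 2n rows for odd n); with fewer testers, using the first t rows gives only partial balance. The function name is hypothetical.

```python
def williams_design(n):
    """Return tasting orders for n things, numbered 1..n.

    For even n there are n rows; for odd n the mirrored rows are added,
    giving 2n rows. Each thing appears equally often in every position, and
    every ordered pair (a immediately before b) occurs equally often.
    """
    # First row follows the pattern 0, 1, n-1, 2, n-2, ...
    first = [0]
    lo, hi = 1, n - 1
    while len(first) < n:
        first.append(lo)
        lo += 1
        if len(first) < n:
            first.append(hi)
            hi -= 1
    # The remaining rows are cyclic shifts of the first one.
    rows = [[(x + shift) % n + 1 for x in first] for shift in range(n)]
    if n % 2 == 1:
        rows += [row[::-1] for row in rows]
    return rows


for tester, order in enumerate(williams_design(5), start=1):
    print("Tester", tester, order)
```

Randomly assigning testers to rows, and randomly relabelling the things, keeps the balance while avoiding a fixed pattern.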

Performing a chi-square test for independence in SAS - Simpson's paradox

I want to find out whether there is a relationship between how well students did on a particular test and the level of dropout from education. I have a 2×2 table with the variable "level in test", which takes the values level 1 and level 2, and the variable "dropout", which takes the values not active and active. (You can say that level 1 = passed the test and level 2 = not passed.)
I think I have run into what is called "Simpson's paradox": for every single education in the faculty I get a high p-value, indicating that there is no relationship between level in test and dropout. BUT when I pool the data and perform the analysis for the whole faculty, I get a low p-value, indicating that there is a significant relationship between the variables.
I have tried to read about Simpson's paradox, but I can't find out how to deal with this problem.
I have read in one place that one should not perform the test on aggregated data, but that cannot be true?
I really hope that someone can help me!
Kind Regards Maria
For the cross-tabs labeled education 2 and education 5 you have cell counts below 5, which violates the usual assumptions for running a chi-square test. There are arguments that the chi-square test is robust enough to withstand these limitations, but I would still reconsider your grouping methodology.
Since the total number of cases in 'Faculty' is higher, the data is enough to refute the independence hypothesis, hence low p-values. When the number of cases is small (your education 1 to education 5 tables), there is not enough data to show significance. A higher p-value here just says that the differences could be by chance.
This is not an example of Simpson's paradox.
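The sample-size point above can be illustrated with a small calculation. The sketch below computes the Pearson chi-square statistic for a 2×2 table without continuity correction, and gets the p-value for 1 degree of freedom from the identity p = erfc(sqrt(chi2 / 2)). The counts are made up for illustration and are not the poster's data; the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper tail is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical single education: 60% vs 40% active, 40 students in total.
small_stat, small_p = chi_square_2x2(12, 8, 8, 12)

# Hypothetical faculty: the same proportions with five times as many students.
pooled_stat, pooled_p = chi_square_2x2(60, 40, 40, 60)

print(small_stat, small_p)
print(pooled_stat, pooled_p)
```

The proportions are identical in both tables, yet only the larger table gives a p-value below 0.05, which matches the explanation that the small per-education tables lack the data to show significance.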

Access Query - Compare Multiple User Selections Against Each Other

I'm running into a conceptual problem that I cannot seem to conquer in my mind.
Let's say I want a user to enter what they're currently wearing into a database via a form. Throwing 'T-Shirt' and 'Blue' in a new row is incredibly easy. However, let's say I want to compare one user against others, and rank them in order from most similar to least.
This becomes a huge nightmare when you consider the amount of options available.
Undershirt
Overshirt
Jacket
Scarf/Necklaces
Headwear
Pants
Underwear
Leggings
Socks
Footwear
Accessories
As I see it, I could hard-code in the 11 categories above and let a user make selections from drop-down boxes tailored to each category. Now, let's use an example of 'Undershirt' and 'Overshirt'. Depending on the person, a long-sleeved shirt could be used as either; they're still wearing one. If I make users put values in categories, User A might put it in one category and User B might put it in another. And they wouldn't get compared because of that, since the categories are separate.
Now, instead of hard-coding in categories (and thus limiting how much a user can enter), I could put each item into its own row and search by User ID. But let's say a person enters shorts one day, and the next day enters jeans and a shirt. How can I make sure that they're compared separately (e.g., dress compared to shorts, dress compared to jeans+shirt) and not together (dress compared to shorts+jeans+shirt)?
As to actually comparing, each item vs. each other could be performed via a 2d lookup table. (Row Dress vs. Column Jeans would net a zero, Row Dress vs. Column Dress would net a one)
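The lookup-table idea, together with keeping each day's outfit separate, can be sketched as follows. This is written in Python rather than Access only to show the logic; the table contents, user names, dates, and function names are all made up. The key assumption is that every entry is stored under a (user, date) pair, so outfits from different days are never merged.

```python
# Hypothetical item-vs-item scores; identical items always score 1.0 and
# pairs that are not listed score 0.0.
SIMILARITY = {
    ("jeans", "shorts"): 0.5,
    ("shirt", "t-shirt"): 0.5,
}


def item_similarity(a, b):
    """Look up the score for two items, in either order."""
    if a == b:
        return 1.0
    return SIMILARITY.get((a, b), SIMILARITY.get((b, a), 0.0))


def outfit_similarity(outfit_a, outfit_b):
    """Average best-match score over the items of both outfits."""
    if not outfit_a or not outfit_b:
        return 0.0
    best_a = [max(item_similarity(a, b) for b in outfit_b) for a in outfit_a]
    best_b = [max(item_similarity(a, b) for a in outfit_a) for b in outfit_b]
    return (sum(best_a) + sum(best_b)) / (len(outfit_a) + len(outfit_b))


# One outfit per (user, date) key, so each day is compared on its own.
entries = {
    ("bob", "day 1"): {"shorts"},
    ("bob", "day 2"): {"jeans", "shirt"},
}
target = {"jeans", "t-shirt"}

ranking = sorted(
    entries,
    key=lambda key: outfit_similarity(target, entries[key]),
    reverse=True,
)
print(ranking)
```

Because items are matched by best score rather than by category, a long-sleeved shirt entered as an undershirt by one user and as an overshirt by another would still be compared.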
The appropriate design for this would depend on the acceptable margin of error. If there is zero acceptable error, then you must present the users with the categories and they specify true/false yes/no for each one or select from a limited set of possible answers.
HANDS:
gloves
mittens
brass knuckles
[Caveat: the user could be wearing brass knuckles inside the mittens. You have to take into account
whether values are mutually exclusive or not. Barefoot <> no socks.
Someone who is barefoot is not wearing socks, but someone not wearing socks may be wearing docksiders]
FEET1:
anklet socks
sheer stockings
fishnet stockings
ragg wool hiking socks
kneesocks
gym socks
no socks
FEET2:
moccasins
running shoes
sandals
wing-tips
uggs
spike heels
...
HEAD:
sombrero
beret
baseball hat
pirate's hat
beanie
knitted cap
NECK:
scarf
mock turtleneck aka dickie
Et cetera et cetera ad nauseam.
Or, if the margin of error is very generous, you could allow simple freeform text entry and match or partial-match on words. For slightly less error, you could set up a synonyms table and match on the synonyms of the supplied words.
As a general rule, get the database design right and worry about reporting later. If this is not just a thought exercise, you may like to say what you are actually comparing, because with the above, a person is quite likely to say "tuxedo" or "evening dress", and let the details be inferred, whereas in some other area, this may not be possible. Even so, it seems that you would need a minimum of three columns (fields) for each item:
Timestamp
Major category (jeans, trousers, skirt)
Item (Levi's, tweeds, mini)
If accuracy is particularly important, you will need a trained interviewer :)
I have just noticed underwear in that list, which is even more complicated, because what would qualify as full underwear for a lady of a certain age is by no means the same as that for a gentleman of ten years.

How to programmatically solve the 15 (moving numbers) puzzle?

All of you have probably seen the moving number/picture puzzle. The one where you have numbers from 1 to 15 in a 4x4 grid, and are trying to get them from a random starting position to
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15
My girlfriend or some of my non-programmer friends can solve this with some mumbo-jumbo magic, that they can't explain to me. I can't solve the puzzle.
The most promising approach I have found out is to solve first row, then I'd get
1 2 3 4
X X X X
X X X X
X X X
then first column without touching solved cells
1 2 3 4
5 X X X
9 X X X
13 X X
then second row to
1 2 3 4
5 6 7 8
9 X X X
13 X X
then second column
1 2 3 4
5 6 7 8
9 10 X X
13 14 X
The problem is that the remaining X (random) tiles are sometimes in an unsolvable position, and this is where my solution fails. But I feel as if I'm on the right path.
My program solves the specified row/column by trying to get number X to the specified position without messing up the correct cells, if possible. But it can't do the last 3 tiles on the 2x2 grid. What am I missing?
Make sure that your puzzle is solvable first. Not all are.
Otherwise your strategy looks sound.
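A solvability check can be sketched with the usual inversion-count rule. The code assumes the board is given as a row-major list with 0 for the blank, and that the goal has the blank in the last cell, as in the question; the function name is hypothetical.

```python
def is_solvable(tiles, width=4):
    """Return True if the sliding puzzle can reach 1, 2, ..., blank-last."""
    numbers = [t for t in tiles if t != 0]
    inversions = sum(
        1
        for i in range(len(numbers))
        for j in range(i + 1, len(numbers))
        if numbers[i] > numbers[j]
    )
    if width % 2 == 1:
        # Odd width: no legal move changes the parity of the inversion count.
        return inversions % 2 == 0
    # Even width: a vertical move flips the inversion parity and also moves
    # the blank by one row, so the parity of their sum never changes.
    blank_row = tiles.index(0) // width
    height = len(tiles) // width
    return (inversions + blank_row) % 2 == (height - 1) % 2


solved = list(range(1, 16)) + [0]
swapped = list(range(1, 14)) + [15, 14, 0]   # tiles 14 and 15 exchanged
print(is_solvable(solved), is_solvable(swapped))
```

Running this on a scrambled board before solving tells you whether a failure at the final 2x2 block is a bug in the solver or an impossible starting position.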
You're definitely on the right track, but rather than solving by row/column iteratively to the point of being left with a 2x2, solve until you have a minimum 3x3 and then solve just that grid. 3x3 is the smallest size you need to properly re-order the grid (while 2x2 doesn't give you the complete flexibility you may need as you've already discussed). This approach is scalable too - you can solve 5x5, 10x10 etc.
I think the most effective way of solving this is using additive patterns, with an admissible heuristic and the IDA* algorithm, as described here - http://www.aaai.org/Papers/JAIR/Vol22/JAIR-2209.pdf. (I think Felner told us he found a way which is quite a bit better, but I don't remember exactly what it was (bidirectional A*?), but anyhow this should be sufficient (-: ).
Anyhow, this course was long ago, so I recommend reading the article.
HTH. Take care.
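To show the shape of IDA*, here is a minimal sketch that uses the plain Manhattan-distance heuristic. That is a simpler admissible heuristic swapped in for illustration, not the additive pattern heuristic from the linked paper, so it will be much slower on hard instances. It assumes the start position is solvable (it does not terminate otherwise) and that 0 marks the blank; the function names are hypothetical.

```python
def manhattan(state, width):
    """Sum of each tile's row and column distance from its goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile:
            goal = tile - 1
            total += abs(idx // width - goal // width)
            total += abs(idx % width - goal % width)
    return total


def neighbours(state, width):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, width)
    height = len(state) // width
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = row + dr, col + dc
        if 0 <= nr < height and 0 <= nc < width:
            swap = nr * width + nc
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)


def ida_star(start, width=4):
    """Return a shortest list of states from start to the goal."""
    start = tuple(start)
    goal = tuple(list(range(1, len(start))) + [0])
    path = [start]

    def search(cost, bound):
        node = path[-1]
        estimate = cost + manhattan(node, width)
        if estimate > bound:
            return estimate
        if node == goal:
            return True
        smallest = float("inf")
        for nxt in neighbours(node, width):
            if len(path) > 1 and nxt == path[-2]:
                continue  # do not undo the previous move
            path.append(nxt)
            result = search(cost + 1, bound)
            if result is True:
                return True
            smallest = min(smallest, result)
            path.pop()
        return smallest

    bound = manhattan(start, width)
    while True:
        result = search(0, bound)
        if result is True:
            return path
        bound = result  # retry with the next larger cost bound
```

Each pass is a depth-first search limited by the current bound, so memory use stays proportional to the solution length rather than to the number of states visited.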
This site has a nice explanation about 3x3 grids, you could probably extend it to 4x4 quite easily.
By reduction the only possible case you can't solve must be of the form
1 3
2 X
and you want to get it to
1 2
3 X
by using an additional row and column you can move those to the proper positions with a simple precomputed sequence
The solution strategy described by the original poster will always work for a standard solvable 15-puzzle. If Axarydax can reduce a 15-puzzle to the state s/he described and still be unable to solve it, then it was impossible to begin with. Let me explain.
If we treat the blank space in the puzzle as one of the tiles, then each legal move involves swapping that blank "tile" for an adjacent tile. This allows us to regard motions on the puzzle as permutations on 16 characters. That is, elements of the symmetric group S16. Each primitive move is a "swap" or transposition between only two elements (one of which is the blank).
Because the puzzle begins and ends with the blank tile in the lower right, the blank tile must move an even number of times for the puzzle to be solved. (This is easiest to see by imagining an overlaid checkerboard pattern on top of the puzzle -- after an odd number of moves the blank would be on a different color square.) That means that the solution enacted must be a product of an even number of transpositions, so it must be an element of the alternating group A16, which contains exactly half of the elements of S16. (Of the 16! permutations of S16, 16!/2 permutations are even, and 16!/2 are odd. Moreover even*even=even, even*odd = odd, and odd*odd=even.)
If the necessary correcting permutation happens to be odd, it's not possible to solve the puzzle, no matter what you do. If the necessary correcting permutation is even, and if Axarydax follows the strategy described, then the permutation required for the remaining 2x2 block will necessarily be an even permutation fixing the blank square. Apart from the identity (the block is already solved), the only even permutations of three elements are the rotations 1->2->3->1 (cycle notation (123)) and 1->3->2->1 (cycle notation (132)). These are easily performed on the remaining four squares without disturbing the others.
Since it's implausible that Axarydax cannot figure out these trivial solutions of the 2x2 blocks, I suspect that either s/he has been pranked, or the 15-puzzle being attempted is nonstandard in some way.
There are always between 2 and 4 possible moves from any given position. I wonder whether the simple algorithm that goes over all options, building a tree with 2 to 4 branches per node, will reach the "solved" position or a stack overflow :)
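The "try every option" idea can be sketched as a breadth-first search that remembers visited positions. Using a queue instead of recursion avoids stack overflow, but memory is the real limit: the 4x4 puzzle has 16!/2 = 10,461,394,944,000 reachable positions, so the sketch below is only practical for smaller boards such as 3x3 (9!/2 = 181,440 positions). The function name is hypothetical, and 0 marks the blank.

```python
from collections import deque


def bfs_solve(start, width):
    """Return a shortest list of states to the goal, or None if unreachable."""
    start = tuple(start)
    goal = tuple(list(range(1, len(start))) + [0])
    height = len(start) // width
    parent = {start: None}          # doubles as the visited set
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        blank = state.index(0)
        row, col = divmod(blank, width)
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = row + dr, col + dc
            if 0 <= nr < height and 0 <= nc < width:
                swap = nr * width + nc
                nxt = list(state)
                nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
                nxt = tuple(nxt)
                if nxt not in parent:
                    parent[nxt] = state
                    queue.append(nxt)
    return None


print(bfs_solve([1, 2, 3, 4, 5, 6, 0, 7, 8], width=3))
```

For an unsolvable start the search visits every reachable position and then returns None, which is feasible on a 3x3 board but not on a 4x4 one.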