Pseudo-randomization constraints - PsychoPy

I am programming a task-switching experiment with 3 tasks. The aim of the experiment is to investigate sequential effects: triplets in which a task repeats after a switch (e.g. ABA or CAC) will be compared with triplets in which the task always switches (e.g. CBA or BAC).
To this end, it is important that the 3 tasks never repeat and that each block contains (roughly) the same number of repeat and switch sequences.
Each block has 108 trials, resulting in 106 triplets (the first two trials cannot be classified as repeat or switch, of course).
I have tried to find a solution with several programs (PsychoPy, Conan, Excel), but I haven't found one, and I have no clue how to do it.
Any help would be much appreciated.

There are only six possible orders in which the task doesn't repeat:
A B C
A C B
B A C
B C A
C A B
C B A
And six where it does:
A B A
A C A
B A B
B C B
C A C
C B C
So to get to 108 trials, you just need to present each of those twelve orders three times (12 orders × 3 presentations × 3 trials each = 108 trials). But this might conflict with your requirement that the tasks don't repeat, since a repeat can still occur at the junction between two triplets (and that phrasing is ambiguous anyway; you should be more specific about the meaning of that constraint).
Also, phrases like "there are (roughly) the same number of repeat and switch sequences" aren't great when defining an experimental design. Strive for as much precision as possible.
Having said all that, I'm not sure this is yet an actual programming question. You'll need to say exactly what the issue is with implementing it. The programs you mention have wildly different purposes (PsychoPy is for implementing experiments; Excel is not; I don't know what Conan is).
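That said, if you do end up generating such blocks in PsychoPy, plain Python with a rejection-sampling loop is usually enough. Here is a rough sketch, not anything from the question itself: the function names and the tolerance of two triplets are my own choices.

import random

TASKS = "ABC"
N_TRIALS = 108

def count_triplets(seq):
    """Count repeat (XYX) and switch (XYZ) triplets in a sequence."""
    repeats = switches = 0
    for a, b, c in zip(seq, seq[1:], seq[2:]):
        if c == a:
            repeats += 1
        else:
            switches += 1
    return repeats, switches

def make_block(tolerance=2, max_tries=100_000):
    """Draw no-immediate-repeat sequences until the repeat and switch
    triplet counts are within `tolerance` of each other."""
    for _ in range(max_tries):
        seq = [random.choice(TASKS)]
        while len(seq) < N_TRIALS:
            seq.append(random.choice([t for t in TASKS if t != seq[-1]]))
        repeats, switches = count_triplets(seq)
        if abs(repeats - switches) <= tolerance:
            return "".join(seq), repeats, switches
    raise RuntimeError("no block found; try a larger tolerance")

block, repeats, switches = make_block()
print(block)
print(repeats, switches)  # e.g. 53 and 53 of the 106 triplets

Because the third task of any no-repeat triplet is a 50/50 choice between a repeat and a switch, the expected counts are already 53/53, so the loop usually succeeds within a few tries.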

How to sample rows from a table with a specific probability?

I'm using BigQuery at my new position, and I'm totally new to SQL/BigQuery.
I'm testing a machine learning model and monitoring an A/B test with an uneven ratio, e.g., 3 vs. 10. To compare the A/B results, e.g., the number of page views, I want to make the ratios equal first so that I can compare easily. For example, say we have a table with 13 records (3 from A and 10 from B), and each row contains an id field. What I want to do is extract only 3 samples out of the 10 for B, to match the sample size of A.
I'm trying to use the FARM_FINGERPRINT function to map fields to integers, then taking ABS and calculating MOD to convert the integers to a specific range, e.g., [0, 10). Eventually, I would like to keep 3 in 10 items using the following line:
MOD(ABS(FARM_FINGERPRINT(field)), 10) < 3
However, I found that even if I run A and B with exactly the same ML model and only the ratio differing, the results differ between A and B (they should be the same, because A and B are running the same ML model with just a different ratio). This made me suspect that the above implementation may introduce biased sampling. I also read this post and confirmed that FARM_FINGERPRINT might not produce a uniformly distributed result.
*There's a critical reason why I cannot simply multiply B by 3/10, which is confidential and cannot be disclosed here.
Is there a better way to accomplish equally distributed sampling?
Thank you in advance. (I'm sorry if the question is vague; I'm hiding the confidential parts.)
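For what it's worth, the fingerprint-and-mod trick is sound only when the hash is uniform over the actual ids, so skew like this usually points at the hashed field rather than the modulo. One way to sanity-check the idea outside BigQuery is to replay it in Python; sha256 below is just a stand-in for FARM_FINGERPRINT, and the id format is made up:

import hashlib

def fingerprint(value: str) -> int:
    """Deterministic signed 64-bit hash, standing in for FARM_FINGERPRINT."""
    digest = hashlib.sha256(value.encode()).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

def keep_row(row_id: str, keep: int = 3, out_of: int = 10) -> bool:
    """Keep roughly keep/out_of of the rows, deterministically per id."""
    return abs(fingerprint(row_id)) % out_of < keep

ids = [f"user_{i}" for i in range(100_000)]  # hypothetical id format
kept = sum(keep_row(i) for i in ids)
print(kept / len(ids))  # close to 0.3 if the hash spreads the ids uniformly

If a test like this gives the right fraction but your production metrics still diverge, the bias is more likely in which rows land in B than in the modulo itself; ORDER BY RAND() LIMIT n is a non-deterministic alternative that sidesteps hash skew entirely.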

What is the most conclusive way to evaluate an n-way split test where n > 2?

I have plenty of experience designing, running and evaluating two-way split tests (A/B Tests). Those are by far the most common in digital marketing, where I do most of my work.
However, I'm wondering if anything about the methodology needs to change when more variants are introduced into an experiment, creating, say, a 3-way A/B/C test.
My instinct tells me I should just run n-1 evaluations against the control group.
If I run a 3-way split test, for example, instinct says I should find significance and power twice:
Treatment A vs Control
Treatment B vs Control
So, in that case, I'm finding out which, if any, treatment performed better than the control (1-tailed test, alt: treatment - control > 0, the basic marketing hypothesis).
But, I'm doubting my instinct. It's occurred to me that running a third test contrasting Treatment A with Treatment B could yield confusing results.
For example, what if there's not enough evidence to reject a null that treatment B = treatment A?
That would lead to a goofy conclusion like this:
Treatment A = Control
Treatment B > Control
Treatment B = Treatment A
If treatments A and B are likely only different due to random chance, how could only one of them outperform the control?
And that's making me wonder if there's a more statistically sound way to evaluate split tests with more than one treatment variable. Is there?
Your instincts are correct, and you can feel less goofy by rewording your statements:
We could find no statistically significant difference between Treatment A and Control.
Treatment B is significantly better than Control.
However, it remains inconclusive whether Treatment B is better than Treatment A.
This would be enough to declare Treatment B a winner, with the possible followup of retesting A vs B. But depending on your specific situation, you may have a business need to actually make sure Treatment B is better than Treatment A before moving forward and you can make no such decision with your data. You must gather more data and/or restart a new test.
A far more common scenario, in my experience, is that Treatment A and Treatment B both soundly beat control (as they're often closely related and have related hypotheses), but there is no statistically significant difference between Treatment A and Treatment B. This is an interesting scenario: if you are required to pick a winner, it's okay to throw significance out the window and pick the one with the strongest effect. The reason is that the significance level (e.g., 95%) is set to avoid false positives and unnecessary changes; there's an assumption that there are switching costs. In this case you must pick A or B and throw out control, so in my opinion it's okay to pick the better one until you have more data.
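As a concrete illustration of the n-1 comparisons against control, here is a minimal Python sketch using the two-proportion z-test from statsmodels; all of the counts are invented for the example:

from statsmodels.stats.proportion import proportions_ztest

control = (200, 10_000)  # (conversions, visitors), hypothetical numbers
treatments = {"A": (215, 10_000), "B": (260, 10_000)}

for name, (conv, n) in treatments.items():
    # one-tailed marketing hypothesis: treatment rate > control rate
    stat, p = proportions_ztest(
        count=[conv, control[0]],
        nobs=[n, control[1]],
        alternative="larger",
    )
    print(f"Treatment {name} vs Control: z={stat:.2f}, p={p:.4f}")

The same call with count=[treatments["B"][0], treatments["A"][0]] gives the B-versus-A comparison discussed above, which may well come back inconclusive.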

Building Turing machine graph

I have been trying to make a Turing machine graph recognizing the language:
{(ab)^n (ba)^n | n > 0}
How do I build the Turing machine graph for this language?
find the midpoint by locating the substring bb, i.e., the two consecutive instances of b
cross these off by replacing them with a tape symbol X
bounce across the growing section of X symbols, crossing off the matching symbols on either side in alternating fashion (first cross off matching instances of a, then b, then a, etc.); see the sketch after this answer
halt-accept if the tape holds only X symbols after crossing off matching instances of a
halt-reject if you run out of symbols early, or if the tape holds only X symbols right after crossing out instances of b
I'll leave defining the states as an exercise, but if you need help with that I can revisit this answer later. As a hint: you will need one or a couple of states to handle each of the above steps.
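If it helps to convince yourself the crossing-off strategy is right before drawing states, here is a small Python simulation of the same algorithm on a list standing in for the tape; this is my own test harness, not the machine itself:

def accepts(s: str) -> bool:
    """Simulate the crossing-off strategy for {(ab)^n (ba)^n | n > 0}."""
    tape = list(s)
    mid = s.find("bb")               # step 1: locate the bb midpoint
    if mid == -1:
        return False
    tape[mid] = tape[mid + 1] = "X"  # step 2: cross off the two b's
    left, right = mid - 1, mid + 2
    expect = "a"                     # step 3: match a, then b, then a, ...
    while left >= 0 and right < len(tape):
        if tape[left] != expect or tape[right] != expect:
            return False             # mismatch: reject
        tape[left] = tape[right] = "X"
        left, right = left - 1, right + 1
        expect = "b" if expect == "a" else "a"
    # steps 4-5: accept iff both ends are reached together and the last
    # symbols crossed off were a's
    return left < 0 and right == len(tape) and expect == "b"

for w in ["abba", "ababbaba", "abab", "abbaab"]:
    print(w, accepts(w))  # True, True, False, False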

Generate a series of pseudo-random sequences to remove sequential bias

I have n things that are going to be tested by t testers. Each tester is subject to order bias (that is, they may give higher ratings to things they test earlier - or the reverse), so I want to eliminate that. One way to do that would be to generate a random testing sequence for each tester.
For example, with n=5, t=2:
Tester 1   Tester 2
    4          2
    2          5
    5          4
    1          3
    3          1
Notice that in this generated sequence however, both testers test thing 5 immediately after thing 2. Also, things 3 and 1 both appear towards the end of the testing, for both testers. This is sub-optimal.
My question: how can I generate an optimal set of sequences, maximising the chance of each thing appearing at every different position, and minimising repetition (of both whole sequences and local orderings) across testers?
Harder question: how much more optimal would that be than the naive (pseudo-)random generation? Can that be quantified?
This isn't a homework question, although it may sound like it :) (It's for a wine tasting...)
I really wasn't sure whether this should go in math.stackexchange, cs.stackexchange, or here. Ultimately I actually want to implement this, so...
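One standard construction that hits both goals (offered here as an illustration, not the definitive method) is a Williams balanced Latin square: every item appears exactly once in each position across the rows, and ordered adjacencies are spread as evenly as the combinatorics allow. A rough Python sketch, with items numbered 1..n:

def balanced_sequences(n: int) -> list[list[int]]:
    """Williams design on items 0..n-1: each item appears once per
    position within a square; for even n every ordered adjacency occurs
    exactly once, for odd n the reversed rows are appended so each
    adjacency occurs exactly twice."""
    first, lo, hi = [], 0, n - 1
    for k in range(n):
        if k % 2 == 0:
            first.append(lo)
            lo += 1
        else:
            first.append(hi)
            hi -= 1
    rows = [[(x + r) % n for x in first] for r in range(n)]
    if n % 2:
        rows += [row[::-1] for row in rows]
    return rows

n, t = 5, 2
rows = balanced_sequences(n)
for tester in range(t):
    seq = [item + 1 for item in rows[tester % len(rows)]]  # 1-based labels
    print(f"Tester {tester + 1}: {seq}")

As for the harder question: the gain over naive shuffling can be quantified by simulating many independent random draws and counting how often an item recurs at the same position, or an ordered pair recurs as an adjacency, across testers; the Williams design drives the positional imbalance to zero by construction rather than merely in expectation.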

Find minimal key for relation scheme

I have a database for an investment firm:
B (broker)
O (office of broker)
I (investor)
S (stock)
Q (quantity of stock owned by investor)
D (dividend paid by stock)
Functional dependencies
S ⟶ D
I ⟶ B
IS ⟶ Q
B ⟶ O
I need to find a minimal key for the relation scheme R = BOSQID and prove it.
I have no idea how to solve this problem.
Can you give me any ideas?
Jay, the way I understand this is the following: you need to find the minimal set of fields that allows you to identify all the fields of BOSQID. There is an algorithm (which I don't remember right now) for doing this analysis properly, but the exercise seems simple enough not to need it.
Take B ⟶ O. As B determines O, we can keep B and remove O from the candidate key. Current possible key fields: BSQID.
Take I ⟶ B. As I determines B, we can keep I and remove B. Notice that, by transitivity, I also determines O. Current possible key fields: SQID.
Take S ⟶ D. As S determines D, we can keep S and remove D. Current possible key fields: SQI.
Take IS ⟶ Q. As I and S together determine Q, we can keep I and S and remove Q. Current possible key fields: IS.
As there are no more functional dependencies to apply, we can't go on, so the result is IS. To prove it, check that the closure of IS is all of BOSQID (IS gives Q directly, S gives D, I gives B, and B gives O), while neither attribute alone suffices (I alone gives only IBO, S alone only SD). There are more complex examples on which this simple technique won't help you (it'll drive you crazy), which is why I recommend you look up the proper algorithm.
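The by-hand argument above is easy to mechanize with the standard attribute-closure computation; here is a short Python sketch using the identifiers from the question:

FDS = [("S", "D"), ("I", "B"), ("IS", "Q"), ("B", "O")]
ATTRS = set("BOSQID")

def closure(attrs: str) -> set[str]:
    """Compute the closure attrs+ under FDS."""
    closed = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in FDS:
            if set(lhs) <= closed and not set(rhs) <= closed:
                closed |= set(rhs)
                changed = True
    return closed

print(closure("IS") == ATTRS)  # True: IS determines every attribute
print(closure("I"))            # {'I', 'B', 'O'}: I alone is not a key
print(closure("S"))            # {'S', 'D'}: S alone is not a key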