Finding the largest number of candidate keys that a relation can have - sql

I am trying to solve this question which has to do with candidate keys in a relation.
This is the question:
Consider table R with attributes A, B, C, D, and E. What is the largest number of
candidate keys that R could simultaneously have?
The answer is 10, but I have no clue how it was derived, nor how the word "simultaneously" comes into play when calculating the answer.

The candidate keys must be sets of attributes where no set is a subset of another.
For example, {A,B} and {A,B,C} can't be candidate keys simultaneously, because {A,B} is a subset of {A,B,C}.
Combinations of 2 attributes or of 3 attributes generate the maximum number of simultaneous candidate keys.
Note that the 3-attribute sets are actually the complements of the 2-attribute sets, e.g. {C,D,E} is the complement of {A,B}.
2-attribute sets - 3-attribute sets
1. {A,B} - {C,D,E}
2. {A,C} - {B,D,E}
3. {A,D} - {B,C,E}
4. {A,E} - {B,C,D}
-
5. {B,C} - {A,D,E}
6. {B,D} - {A,C,E}
7. {B,E} - {A,C,D}
-
8. {C,D} - {A,B,E}
9. {C,E} - {A,B,D}
-
10. {D,E} - {A,B,C}
If I took sets of a single attribute I would have only 5 options:
{A},{B},{C},{D},{E}
Any set with more than 1 element will contain one of the above and therefore will not qualify.
If I took sets of 4 attributes I would have only 5 options:
{A,B,C,D},{A,B,C,E},{A,B,D,E},{A,C,D,E},{B,C,D,E}
Any set with more than 4 elements will contain one of the above and therefore will not qualify.
Any set with fewer than 4 elements will be contained in one of the above and therefore will not qualify.
etc.

For 5 attributes, it is probably best to do this by brute force. Understanding the ideas is more important than the calculation (DuDu/David gives a good example of 10 candidate keys, showing that a set of 10 keys is possible, so the maximum is at least this large).
What is the idea? A candidate key is a minimal combination of attributes that is unique. If A is unique, then A together with any other column is also unique, but that larger combination is not minimal. One set of candidate keys is simply:
A
B
C
D
E
If each of these is unique, then any combination of attributes is going to contain at least one of them and will also be unique. Hence, the uniqueness of these five implies the uniqueness of every other combination, so no other combination can be a candidate key.
5 is not the largest number of candidate keys with this property.
It gets a bit more complicated. If {A, B, C, D, E} is unique (and no subset is a candidate key), then there is exactly 1 candidate key. Rearranging the columns doesn't change the set (sets are unordered).
One thing we might postulate is that the biggest set of candidate keys has keys all of the same length. This is in fact true. Why? Well, if we have a set of keys that are of different lengths, we can lengthen the shorter ones by adding arbitrary attributes and still have a maximal set.
So, you only need to consider subsets of exactly 1, 2, 3, 4, and 5 attributes. When you work it out, you will find that the maximum numbers are:
5 10 10 5 1
You can add a "1" to the beginning and you may recognize the pattern. This is a row from Pascal's Triangle. This observation (well, and the related proof) actually makes it easy to determine the maximum value for any given n.
Incidentally, the sets of length 3 are:
A B C
A B D
A B E
A C D
A C E
A D E
B C D
B C E
B D E
C D E
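The counts above can be checked with a short script. This is a minimal Python sketch (the names ATTRS, is_antichain and same_size_counts are illustrative, not from the question): it builds every k-attribute subset of {A, B, C, D, E}, confirms that no subset of a given size contains another, and counts them. It does not search families of mixed sizes; that a mixed family cannot beat the best same-size family is Sperner's theorem, which is taken as given here.

```python
from itertools import combinations
from math import comb

ATTRS = "ABCDE"  # the five attributes of R


def is_antichain(family):
    """True if no set in the family is a proper subset of another."""
    return not any(a < b for a in family for b in family)


def same_size_counts(attrs):
    """Number of k-attribute subsets for k = 1..len(attrs)."""
    counts = []
    for k in range(1, len(attrs) + 1):
        family = [frozenset(c) for c in combinations(attrs, k)]
        # Distinct sets of equal size can never contain one another.
        assert is_antichain(family)
        counts.append(len(family))
    return counts


counts = same_size_counts(ATTRS)
print(counts)
# The largest count is the middle binomial coefficient C(5, 2).
print(max(counts) == comb(len(ATTRS), len(ATTRS) // 2))

# A mixed family such as {A,B} together with {A,B,C} fails the check.
print(is_antichain([frozenset("AB"), frozenset("ABC")]))
```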

Related

Randomisation - Partial Incomplete Block Design

I'm looking to replicate this randomisation in RStudio.
Key features:
A and B are the primary comparison and must have a 2 by 2 crossover design (i.e., occur together in each sequence)
The incomplete block design should include C, D, E and F. The comparisons of interest are C vs D and E vs F. These comparisons need to occur the same number of times within the whole design, and one comparison must occur in each sequence
C, D, E and F need to be balanced so that they occur the same number of times in a sequence
C, D, E and F need to be balanced so that they occur the same number of times across periods
Any help would be greatly appreciated.
Many thanks.
The code I tried below is just for the incomplete block design C,D,E,F but I can't get it to balance across periods.
library(crossdes)
out <- find.BIB(4, 20, 3, iter = 1)  # each dosed 6 times achieving first order balance
out
isGYD(out)
I had then planned to join this onto the A and B randomisation.

Reordering rows in sql database - idea

I was thinking about a simple way to reorder rows in a relational database table.
I would like to avoid the method described here:
How can I reorder rows in sql database
My simple idea is to use a ListOrder column of type double precision (64-bit IEEE 754 floating point).
When inserting a row between two existing rows, we calculate its listOrder value as the average of the values of the two neighbouring rows.
Example:
1. Starting state:
value, listOrder
a 1
b 2
c 3
d 4
e 5
f 6
2. Moving "e" two rows up
One simple sql update on e-row: update mytable set listorder=2.5 where value='e'
value, listOrder
a 1
b 2
e 2.5
c 3
d 4
f 6
3. Moving "a" one position down
value, listOrder
b 2
a 2.25
e 2.5
c 3
d 4
f 6
I have a question: how many insertions can I perform (in the worst case, always inserting at the same place) before the list can no longer be ordered properly?
With a 64-bit integer, fewer than 64 insertions in the same place are possible.
Do floating-point types allow more insertions?
Are there other problems with the described approach?
Do you see any patches or adjustments that would make this idea safe and usable in applications?
This is similar to a lexical order, which can also be done with varchar columns:
A
B
C
D
E
F
becomes
A
B
BM
C
D
F
becomes
B
BF
BM
C
D
F
I prefer the two-step process, where you update every row in the table after the one you move to be one larger. SQL is efficient about this, so updating the rows following a change is not as bad as it seems. You preserve something that's more human readable, the storage size for your ordinal values scales linearly with your data size, and you don't risk reaching a point where you don't have enough precision to put an item between two values.
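The precision limit mentioned above can be measured directly. This is a minimal Python sketch (the function name midpoint_insertions is made up for illustration) that keeps inserting a new value immediately after the same row, using the average of the two neighbours as in the question, and stops when the average is no longer strictly between them. Python floats are IEEE 754 doubles on common platforms, which is assumed here.

```python
def midpoint_insertions(low, high):
    """Count how many values can be inserted directly after `low`,
    each one placed at the average of `low` and its current successor."""
    count = 0
    while True:
        mid = (low + high) / 2
        if not (low < mid < high):
            # No representable double is left between the two neighbours.
            break
        high = mid  # the new row becomes the successor of `low`
        count += 1
    return count


# Worst case from the question: always insert between 2 and the row after it.
n = midpoint_insertions(2.0, 3.0)
print(n)
```

For neighbours such as 2 and 3 this comes out at roughly 50 insertions, the same order of magnitude as the 64-bit integer case, so some renumbering step is still needed once the gap is used up.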

Repetition while copying data to SQL table from multiple sheets

I have to copy data from multiple Excel sheets to a single SQL table.
Excel inputs:
Sheet1's columns: fname (a, b), lname (c, d). (2 rows)
Sheet2's columns: city (boston, austin), state (ma, tx). (2 rows)
My output (tMSSqlOutput) has 4 rows instead of 2:
a c boston ma, a c austin tx, b d boston ma, b d austin tx.
Desired output: a c boston ma, b d austin tx. (2 rows only)
How do I manage this?
As per the comments, you don't have a natural key to join the two data sets. Instead you could generate a sequence for each data set that would increment equally for both data sets and would equate to being your row number on each data set.
First of all, this should set alarm bells ringing about the state of your data and how you can be sure that row n in one data set definitely corresponds to row n in another data set. It smacks of something being badly normalised out without proper keys being added and it can be very dangerous to assume that the resulting data set from this is going to be accurate.
If you absolutely must do this, however, then you should assign a Numeric.sequence to each of your data sets. You can do this in a tMap that precedes your joining tMap:
Notice the "s1" parameter to the Numeric.sequence. If you reuse this elsewhere then it will increment this one rather than starting from 1 so typically you would want to choose a unique name for each sequence you have in your job (although there are obviously occasions where incrementing a previously defined sequence is what you desire).
Once you have defined a unique sequence with the same starting numbers (the second parameter) and the same increment numbers (the third parameter) then you should be able to create a join on these instances:
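Outside Talend, the same idea can be illustrated in a few lines. This is only a Python sketch of the technique, not the tMap configuration itself; the helper with_sequence is hypothetical and simply plays the role of a sequence with a start value and an increment. It shows why the unkeyed combination yields 4 rows and how joining on a generated row number yields the desired 2.

```python
from itertools import product

sheet1 = [("a", "c"), ("b", "d")]               # fname, lname
sheet2 = [("boston", "ma"), ("austin", "tx")]   # city, state

# Without a join key every row pairs with every other row: 2 x 2 = 4 rows.
cross = [left + right for left, right in product(sheet1, sheet2)]


def with_sequence(rows, start=1, step=1):
    """Attach a running number to each row (hypothetical helper)."""
    return {start + i * step: row for i, row in enumerate(rows)}


# Both sequences use the same start and increment, so row n gets key n.
left_rows = with_sequence(sheet1)
right_rows = with_sequence(sheet2)

# Join on the generated key: only rows with matching numbers are combined.
joined = [left_rows[k] + right_rows[k] for k in left_rows if k in right_rows]

print(len(cross))
print(joined)
```

As the answer warns, this is only as reliable as the assumption that row n of one sheet really belongs with row n of the other.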

Deterministic/non-deterministic state system mapping

I read in a book that for a non-deterministic machine M = (Q, Σ, trans, q0, F) the transition function maps from Q × Σ to 2^Q,
where Q is the set of states.
But I am not able to understand why it is 2^Q;
if there are 3 states a, b, c, how does it map to 8 states?
I always found that the easiest way to think about these (since the set of states is finite) is as having each of those subsets be an encoding of a base-2 number that ranges from 0 (all bits zero) to 2^|Q| - 1 (all bits one), where there are as many bits in the number as there are members in the state set, Q. Then, you can just take one of these numbers and map it into a subset by using whether a particular bit in the number is set. Easy!
Here's a worked example where Q = {a,b,c}. In this case, |Q| is 3 (there are three elements) and so 2^3 is 8. That means we get this if we say that the leading bit is for element a, the next bit is for b, and the trailing bit for c:
0 = 000 = {}
1 = 001 = {c}
2 = 010 = {b}
3 = 011 = {b,c}
4 = 100 = {a}
5 = 101 = {a,c}
6 = 110 = {a,b}
7 = 111 = {a,b,c}
See? Those initial three states have been transformed into 8, and we have a natural numbering of them that we could use to create the labels of those states if we chose.
Now, to the interpretations of this within a non-deterministic context. Basically, the non-determinism means that we're uncertain about what state we're in. We represent this by using a pseudo-state that is the set of “real” states that we might be in; if we have total non-determinism then we are in the pseudo-state where all real-states are possible (i.e., {a,b,c}) whereas the pseudo-state where no real-states are possible (i.e., {}) is the converse (and really ought to be impossible to reach in the transition system). In a real system, you're usually not dealing with either of those extremes.
The logic of how you convert the non-deterministic transition system into a deterministic one is rather more complex than I want to go into here. (I had to read a substantial PhD thesis to learn it so it's definitely more than an SO answer's worth!)
2^Q means the set of all subsets of Q. For each state q and each letter x from Σ, there is a subset of Q containing the states you can go to from q with letter x. So yes, if there are three states a, b, c, the set 2^Q consists of 8 elements: {{}, {a}, {b}, {c}, {a,b}, {a,c}, {b,c}, {a,b,c}}. The transition doesn't map to 8 states; it maps to one of these 8 sets. HTH
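Both answers can be sketched in a few lines of Python. The code below is an illustration only: subset_from_number implements the bit encoding from the first answer (leading bit for a, trailing bit for c), and trans is a made-up transition table, not taken from any book, showing that each (state, symbol) pair maps to one element of 2^Q.

```python
STATES = ["a", "b", "c"]  # Q, in leading-bit-first order


def subset_from_number(number, states):
    """Decode a number in 0 .. 2^|Q| - 1 into a subset of the states.

    The leading bit stands for the first state, the trailing bit for the last.
    """
    n = len(states)
    return {s for i, s in enumerate(states) if (number >> (n - 1 - i)) & 1}


# All 2^|Q| subsets, indexed by their number.
power_set = [subset_from_number(k, STATES) for k in range(2 ** len(STATES))]

for k, subset in enumerate(power_set):
    print(k, format(k, "03b"), sorted(subset))

# Hypothetical non-deterministic transition function: Q x Sigma -> 2^Q.
trans = {
    ("a", "0"): {"a", "b"},  # from a on 0 we may end up in a or in b
    ("a", "1"): set(),       # no move possible
    ("b", "0"): {"c"},
}
```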

Algorithm - combine multiple lists, resulting in unique list and retaining order

I want to combine multiple lists of items into a single list, retaining the overall order requirements. i.e.:
1: A C E
2: D E
3: B A D
result: B A C D E
Above, starting with list 1, we have A C E; we then know from list 2 that D must come before E, and from list 3 that B must come before A and that D must come after B and A.
If there are conflicting orderings, the first ordering should be used. i.e.
1: A C E
2: B D E
3: F D B
result: A C F B D E
3 conflicts with 2 (B D vs D B), therefore requirements for 2 will be used.
If ordering requirements mean an item must come before or after another, it doesn't matter if it comes immediately before or after, or at the start or end of the list, as long as overall ordering is maintained.
This is being developed using VB.Net, so a LINQy solution (or any .Net solution) would be nice - otherwise pointers for an approach would be good.
Edit: Edited to make example 2 make sense (a last minute change had made it invalid)
The keyword you are probably interested in is "Topological sorting". The solution based on that would look as follows:
Create an empty directed graph.
Process the sequences in order: for each pair of consecutive elements X, Y in a sequence, add an edge X->Y to the graph, unless this would form a cycle.
Perform a topological sort on the vertices of the graph. The resulting sequence should satisfy your requirements.
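The three steps above can be sketched as follows. This is a minimal Python illustration rather than VB.Net (the function name merge_ordered is made up for the example): it rejects any edge that would close a cycle, so earlier lists win conflicts, and then runs Kahn's algorithm, breaking ties by the order in which items were first seen.

```python
from collections import defaultdict


def merge_ordered(lists):
    """Merge lists into one order; on conflict, earlier lists take priority."""
    successors = defaultdict(set)  # edge X -> Y means X must precede Y
    nodes, seen = [], set()        # items in first-seen order

    def reachable(src, dst):
        """Depth-first search: is there already a path from src to dst?"""
        stack, visited = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in visited:
                visited.add(node)
                stack.extend(successors[node])
        return False

    for seq in lists:
        for item in seq:
            if item not in seen:
                seen.add(item)
                nodes.append(item)
        for x, y in zip(seq, seq[1:]):
            # Adding X -> Y forms a cycle exactly when Y already reaches X.
            if x != y and not reachable(y, x):
                successors[x].add(y)

    # Kahn's algorithm: repeatedly emit an item with no unprocessed predecessor.
    indegree = {node: 0 for node in nodes}
    for x in nodes:
        for y in successors[x]:
            indegree[y] += 1
    ready = [node for node in nodes if indegree[node] == 0]
    result = []
    while ready:
        node = ready.pop(0)
        result.append(node)
        for other in nodes:  # iterate in first-seen order for determinism
            if other in successors[node]:
                indegree[other] -= 1
                if indegree[other] == 0:
                    ready.append(other)
    return result


print(merge_ordered([list("ACE"), list("DE"), list("BAD")]))
print(merge_ordered([list("ACE"), list("BDE"), list("FDB")]))
```

Items that are not constrained relative to each other may come out in a different order than in the question's second example; any order that respects the kept edges satisfies the stated requirements.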