How to describe a array of integers whose available digits are positive integers - sql

It is more of a language question as I am working on translating a technical document into English. The current description is as below:
The value of this parameter can be an array of integers or an integer range. The integers start from 0. Examples: '0-2' and '1,2,3'.
My question is: does this description cause confusion to imply that the value must start from 0 which makes the second example look incorrect on first sight?

Related

The set of atomic irrational numbers used to express the character table and corresponding (unitary) representations

I want to calculate the irrational number, expressed by the following formula in gap:
3^(1/7). I've read through the related description here, but still can't figure out the trick. Will numbers like this appear in the computation of the character table and corresponding (unitary) representations?
P.S. Basically, I want to figure out the following question: For the computation of the character table and corresponding (unitary) representations, what is the minimum complete set of atomic irrational numbers used to express the results?
Regards,
HZ
You can't do that with GAP's standard cyclotomic numbers, as seventh roots of 3 are not cyclotomic. Indeed, suppose $r$ is such a root, i.e. a rot of the polynomial $f = x^7-3 \in \mathbb{Q}[x]$. Then $r$ is cyclotomic if and only if the field extension \mathbb{Q}[x] is a subfield of a cyclotomic field. By Kronecker-Weber this is equivalent to that field being an abelian extension, i.e., the Galois group is abelian. One can check that this is not the case here (the Galois group is a semidirect product of C_7 with C_6).
So, $r$ is not cyclotomic.

How to treat numbers inside text strings when vectorizing words?

If I have a text string to be vectorized, how should I handle numbers inside it? Or if I feed a Neural Network with numbers and words, how can I keep the numbers as numbers?
I am planning on making a dictionary of all my words (as suggested here). In this case all strings will become arrays of numbers. How should I handle characters that are numbers? how to output a vector that does not mix the word index with the number character?
Does converting numbers to strings weakens the information i feed the network?
Expanding your discussion with #user1735003 - Lets consider both ways of representing numbers:
Treating it as string and considering it as another word and assign an ID to it when forming a dictionary. Or
Converting the numbers to actual words : '1' becomes 'one', '2' as 'two' and so on.
Does the second one change the context in anyway?. To verify it we can find similarity of two representations using word2vec. The scores will be high if they have similar context.
For example,
1 and one have a similarity score of 0.17, 2 and two have a similarity score of 0.23. They seem to suggest that the context of how they are used is totally different.
By treating the numbers as another word, you are not changing the
context but by doing any other transformation on those numbers, you
can't guarantee its for better. So, its better to leave it untouched and treat it as another word.
Note: Both word-2-vec and glove were trained by treating the numbers as strings (case 1).
The link you provide suggests that everything resulting from a .split(' ') is indexed -- words, but also numbers, possibly smileys, aso. (I would still take care of punctuation marks). Unless you have more prior knowledge about your data or your problem you could start with that.
EDIT
Example literally using your string and their code:
corpus = {'my car number 3'}
dictionary = {}
i = 1
for tweet in corpus:
for word in tweet.split(" "):
if word not in dictionary: dictionary[word] = i
i += 1
print(dictionary)
# {'my': 1, '3': 4, 'car': 2, 'number': 3}
The following paper can be helpful: http://people.csail.mit.edu/mcollins/6864/slides/bikel.pdf
Specifically, page 7.
Before they use an <unknown> tag they try to replace alphanumeric symbol combination with common pattern names tags, such as:
FourDigits (good for years)
I've tried to implement it and it gave great results.

Is format ####0.000000 different to 0.000000?

I am working on some legacy code at the moment and have come across the following:
FooString = String.Format("{0:####0.000000}", FooDouble)
My question is, is the format string here, ####0.000000 any different from simply 0.000000?
I'm trying to generalize the return type of the function that sets FooDouble and so checking to make sure I don't break existing functionality hence trying to work out what the # add to it here.
I've run a couple tests in a toy program and couldn't see how the result was any different but maybe there's something I'm missing?
From MSDN
The "#" custom format specifier serves as a digit-placeholder symbol.
If the value that is being formatted has a digit in the position where
the "#" symbol appears in the format string, that digit is copied to
the result string. Otherwise, nothing is stored in that position in
the result string.
Note that this specifier never displays a zero that
is not a significant digit, even if zero is the only digit in the
string. It will display zero only if it is a significant digit in the
number that is being displayed.
Because you use one 0 before decimal separator 0.0 - both formats should return same result.

How can I construct finite automata

I have to create a deterministic finite automata accepting the set of strings with an even number of 1 and ends with 0.Should I include 0 as a string from this set? and how can I do this?
Should I include 0 as a string from this set?
Yes
And how do I do this?
To construct a finite automaton, you need to identify the states and transitions. The Myhill-Nerode theorem allows you to find the necessary (and sufficient!) states of for a finite automaton if you are able to identify the equivalence classes of "indistinguishable" strings.
Two strings x and y are indistinguishable, in this sense, if for any other string z, either both xz and yz are in the language, or neither is.
In your case, let's try to identify equivalence classes. The empty string is in some equivalence class. The string 0 is in a different equivalent class, since you can add the empty string to 0 and get a string in the language (whereas you can't add the empty string to the empty string to get a string in the language). We have found two distinct equivalence classes so far - one for the empty string, one for 0. Both of these will need different states in our FA.
What about the string 1? It's distinguishable from both 0 and the empty string, since you can add 10 to 1 to get 110, a string in the language, but you can't add it to 0 or the empty string to get a string in the language. So we have yet another state.
What about the string 00? This string is not in the language, and no other string can be added to this string to get a string in the language. This is another equivalence class. It turns out that the next strings, 01 and 10, are also in this class.
The string 11 ends up being in the same class as the empty string: you can add any string in the language to 11 and get another string in the language. If you try all strings of length 3, you will find that all of those already fall into one of the above classes, and you can stop checking at that point.
So we have four states - let's call them [-], [0], [1], and [00]. Now we figure out transitions.
If you get a 0 in [-], you need to go to [0]... and if you get a 1, you need to go to [1]. For the rest, just figure out what string you'd get by adding to the canonical one, and which class the resulting string would be in... and go to that state.
Given Question is to construct a Finite Automata with even number of 1's and ends with 0.
So the alphabet of the language is {0,1}
These are the the strings that are accepted by the language.
The Language always consists of '0' before its final state as it is the end of the string and we reach the final state when we reach the last '0' in the string.
Here in the normal procedure of conversion of it into the finite automata we get NFA
Then we need to convert the NFA to DFA by combining 2 states into single and simplifying them.
New transition diagram
Here we had drawn the new transition diagram based on the states reached by a specific state at a given input. Then the new states formed by joining 2 states [ here {q0,q2} state is formed]
This new state {q0,q1} on 0 as input goes to itself (as q0 on 0 goes to q0 and q2 on 0 goes to q2).
So let us conside this new state {q0,q2} as a new state q2'
So by using the Transition state diagram we can easily make the required DFA
Deterministic Finite Automata
The above diagram is the constructed finite automata accepting the set of strings with an even number of 1's and ending with 0.
q0 - is the Initial state
q2'- is the Final state

What is the technical term for the input used to calculate a checkdigit?

For example:
code = '7777-5';
input = code.substring(0, 4); // Returns '7777'
checkdigit = f(input); // f() produces a checkdigit
assert.areEqual(code, input + "-" + checkdigit)
Is there a technical term for input used above?
Specifically I'm calculating checkdigits for ISBNs, but that shouldn't effect the answer.
Is "original number excluding the check digit" technical enough? :)
Actually, it's often the case, as in the link you posted, that the check digit or checksum ensures a property about the full input:
...[the check digit] must be such that the sum of all the ten digits, each multiplied by the integer weight, descending from 10 to 1, is a multiple of the number 11.
Thus, you'd check the full number and see if it meets this property.
It's "backwards" when you're initially generating the check digit. In that case, the function would be named generate_check_digit or similar, and I'd just name its parameter as "input".
Although I am not sure if there is a well-known specific technical term for the input, what LukeH suggested (message/data) seems common enough.
Wiki for checksum:
With this checksum, any transmission error that flips a single bit of the message, or an odd number of bits, will be detected as an incorrect checksum
Wiki for check digit:
A check digit is a form of redundancy check used for error detection, the decimal equivalent of a binary checksum. It consists of a single digit computed from the other digits in the message.