Convert an alphanumeric string to integer format - ruby-on-rails-3

I need to store an alphanumeric string in an integer column on one of my models.
I have tried:
@result.each do |i|
  hex_id = []
  i["id"].split(//).each { |c| hex_id.push(c.hex) }
  hex_id = hex_id.join
  ...
  Model.create(:origin_id => hex_id)
  ...
end
When I run this in the console using puts hex_id in place of the create line, it returns the correct values, however the above code results in the origin_id being set to "2147483647" for every instance. An example string input is "t6gnk3pp86gg4sboh5oin5vr40" so that doesn't make any sense to me.
Can anyone tell me what is going wrong here or suggest a better way to store a string like the aforementioned example as a unique integer?
Thanks.

Answering by request from OP
The hex_id.join call itself does concatenate: c.hex returns 0 for any character that is not a hexadecimal digit, and join turns the resulting integers into one long string of decimal digits. The issue is that hex_id is a string of digits rather than a number, and it has to be converted when it is assigned to the integer column. What seems to happen at that point is that the value exceeds the maximum positive value for the integer type, 2147483647, and is most likely clamped to it, which would explain why every row ends up with the same origin_id.
In any case, the desired result 060003008600401100500050040 is too large to be stored as an integer. A better approach would be to keep it as a string, or to use a different algorithm for producing a number from the original string. Perhaps aggregating the hex values with an arithmetic operation would do better than join?
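To make the size problem concrete, here is a minimal sketch in Python (not the Ruby from the question). The helper ruby_like_hex is a hypothetical stand-in that imitates how Ruby's String#hex treats a single character; the base-36 reading at the end is just one possible "different algorithm", not something the question or answer prescribes.

```python
import string


def ruby_like_hex(ch: str) -> int:
    """Imitate Ruby's String#hex for one character: non-hex characters give 0."""
    return int(ch, 16) if ch in string.hexdigits else 0


source = "t6gnk3pp86gg4sboh5oin5vr40"

# Convert character by character, then join the decimal strings,
# as the code in the question does.
joined = "".join(str(ruby_like_hex(c)) for c in source)
print(joined)                    # 060003008600401100500050040

INT32_MAX = 2**31 - 1            # 2147483647
print(int(joined) > INT32_MAX)   # True: far too big for a 32-bit column

# Alternative: read the whole string as a single base-36 number
# (digits 0-9 plus letters a-z). It is unique per string, but still huge.
as_base36 = int(source, 36)
print(as_base36 > INT32_MAX)     # True
print(as_base36.bit_length())    # number of bits an integer column would need
```

Either way the number does not fit in a 32-bit integer, which is why keeping the value in a string column is the simpler option.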

Related

How to query for a zero-byte char?

According to the documentation, pg_attribute.attgenerated is typed as char and has a value of "a zero byte" if the column is not generated, and there is at least one other possible value, with potentially more in the future.
I want to query for all non-generated columns. Since I would prefer to not be tripped up by additions in future versions, the query predicate needs to be WHERE attgenerated = ZERO BYTE rather than an inequality, but I have no idea how to represent that value correctly in SQL.
What's the correct way to write this? In most programming languages you'd say '\0', and you can use escape sequences by prepending an e to the string literal, but if I say e'\0' it errors out with "invalid byte sequence for encoding "UTF8": 0x00". So I'm not quite sure what the right way to do this is.
It's simply an empty string:
WHERE attgenerated = ''

Regex match SQL values string with multiple rows and same number of columns

I tried to match the sql values string (0),(5),(12),... or (0,11),(122,33),(4,51),... or (0,121,12),(31,4,5),(26,227,38),... and so on with the regular expression
\(\s*\d+\s*(\s*,\s*\d+\s*)*\)(\s*,\s*\(\s*\d+\s*(\s*,\s*\d+\s*)*\))*
and it works. But...
How can I ensure that the regex does not match a values string like (0,12),(1,2,3),(56,7) with different number of columns?
Thanks in advance...
As I mentioned in a comment on the question, the best way to check that the input string is valid (that is, that every pair of brackets contains the same count of numbers) is to use a client-side program rather than pure SQL.
Implementation:
using System;
using System.Collections.Generic;
using System.Linq;

var s = new List<string>
{
    "(0),(5),(12)", "(0,11),(122,33),(4,51)",
    "(0,121,12),(31,4,5),(26,227,38)", "(0,12),(1,2,3),(56,7)"
};
var qry = s.Select(a => new
    {
        orig = a,
        // Split into row groups, e.g. "(0,12),(1,2,3)" -> "0,12", "1,2,3"
        newst = a.Split(new[] { "),(", "(", ")" },
            StringSplitOptions.RemoveEmptyEntries)
    })
    .Select(a => new
    {
        a.orig,
        // Valid only if every group holds the same number of values
        isValid = a.newst
            .Select(b => b.Split(new[] { ',' },
                StringSplitOptions.RemoveEmptyEntries).Length)
            .Distinct().Count() == 1
    });
Result:
orig                               isValid
(0),(5),(12)                       True
(0,11),(122,33),(4,51)             True
(0,121,12),(31,4,5),(26,227,38)    True
(0,12),(1,2,3),(56,7)              False
Note: The second Select counts the values in each group returned by Split and checks that all groups have the same count. If more than one distinct count is found, the input string is invalid. Taking the total number of values modulo the number of groups is not enough: "(1),(1,2,3)" has 4 values in 2 groups, so 4 % 2 == 0 even though the rows have different lengths.
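For comparison, a minimal Python sketch of the same client-side idea (Python is not the language used in this answer). The function name same_column_count is hypothetical, and the sketch assumes the overall syntax has already been checked, for example with the regular expression from the question; it only compares the number of values per group.

```python
import re


def same_column_count(values: str) -> bool:
    """Return True if every parenthesised group has the same number of values."""
    # Capture the text between each pair of brackets, e.g. "0,12" and "1,2,3".
    groups = re.findall(r"\(([^()]*)\)", values)
    # Collect the distinct value counts; exactly one distinct count means valid.
    counts = {len(group.split(",")) for group in groups}
    return len(counts) == 1


for candidate in ["(0),(5),(12)",
                  "(0,11),(122,33),(4,51)",
                  "(0,12),(1,2,3),(56,7)"]:
    print(candidate, same_column_count(candidate))
```

An input with no groups at all yields an empty set of counts and is therefore reported as invalid.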
I strongly believe there's a simpler way to achieve that, but at this moment I don't know how ;)
:(
Unless you add some more constraints, I don't think you can solve this problem only with regular expressions.
Regular expressions can't solve every string problem, just as they cannot be used to check that opening and closing brackets (like "((())()(()(())))") are balanced. That's a more complicated issue.
That's what I learnt in class :P If someone knows a way then that'd be sweet!
I'm sorry, I spent a bit of time looking into how we could turn this string into an array and do more work on it with SQL, but the built-in functionality is lacking and the solution would end up being very hacky.
I'd recommend trying to handle this situation differently, as large-scale string computation isn't the best way to go if your database is going to fill up gradually.
A combination of client-side and server-side validation can be used to help prevent bad data (like the strings with mismatched numbers of values) from getting into the database.
If you need to keep those numbers then you could rework your schema to include some metadata which you can use in your queries, like how many numbers there are and whether it all matches nicely. This information can be computed inexpensively from your server and provided to the database.
Good luck!

Dataframes NAtype to binary Julia

I'm trying to write binary text files from a data frame in Julia using something along the lines of:
for x in RICT["$i"]["Sick"]
    write(f9, convert(Int16, x))
end
and everything works nicely except for when it comes to NA values. Missing values are treated as NA it seems, and I know that there are different ways of handling such values using the data frames package. Does anyone have any experience with these NAtypes? Should I convert the NAtypes to a more conventional type and then write them in? As always any help is much appreciated.
If you are writing a 16-bit integer value, there's no canonical representation of "blank", so you'd have to pick a special 16-bit integer value that represents NA. A common choice for this kind of thing is the smallest representable value, in this case typemin(Int16) == -32768. You can generalize this to other signed integer types.
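The sentinel idea can be sketched as follows, in Python rather than Julia. The names NA_SENTINEL, pack_int16 and unpack_int16 are hypothetical, None stands in for NA, and little-endian byte order is an assumption; whatever reads the file has to agree on both the sentinel and the byte order.

```python
import struct

NA_SENTINEL = -(2**15)  # -32768, the smallest signed 16-bit integer


def pack_int16(values):
    """Pack values as little-endian 16-bit integers, writing the sentinel for None."""
    return b"".join(
        struct.pack("<h", NA_SENTINEL if v is None else v) for v in values
    )


def unpack_int16(data):
    """Inverse of pack_int16: the sentinel is read back as None."""
    return [
        None if v == NA_SENTINEL else v
        for (v,) in struct.iter_unpack("<h", data)
    ]


data = pack_int16([3, None, 7])
print(len(data))           # 6 (three values, two bytes each)
print(unpack_int16(data))  # [3, None, 7]
```

The cost of this scheme is that -32768 can no longer be stored as a real data value.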

Enumerating Strings as bytes?

I was looking for a way to enumerate String types in (vb).NET, but .NET enums only accept numeric type values.
The first alternative I came across was to create a dictionary of my enum values and the string I want to return. This worked, but was hard to maintain because if you changed the enum you would have to remember to also change the dictionary.
The second alternative was to set field attributes on each enum member and retrieve them using reflection. Sure enough, this worked as well and also solved the maintenance problem, but it uses reflection and I've always read that reflection should be a last resort.
So I started thinking and I came up with this: every ASCII character can be represented as a hexadecimal value, and you can assign hexadecimal values to enum members.
You could get rid of the attributes and assign the hexadecimal values to the enum members. Then, when you need the text value, convert the value to a byte array and use System.Text.Encoding.ASCII.GetString(enumMemberBytes) to get the string value.
Now speaking out of experience, anything I come up with is usually either flawed or just plain wrong. What do you guys think about this approach? Is there any reason not to do it like that?
Thanks.
EDIT
As pointed out by David W, enum member values are limited in length, depending on the underlying type (integer by default). So yes, I believe my method works but you are limited to characters in the ASCII table, with a maximum length of 4 or 8 characters using integers or longs respectively.
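The packing idea and its length limit can be illustrated with a short sketch, written in Python rather than VB.NET. The helper names text_to_int and int_to_text are hypothetical, and big-endian byte order is an assumption chosen so that the hex digits read in the same order as the text.

```python
def text_to_int(text: str) -> int:
    """Pack ASCII text into one integer, one byte per character."""
    return int.from_bytes(text.encode("ascii"), "big")


def int_to_text(value: int) -> str:
    """Unpack an integer produced by text_to_int back into ASCII text."""
    length = (value.bit_length() + 7) // 8  # bytes needed to hold the value
    return value.to_bytes(length, "big").decode("ascii")


packed = text_to_int("ABCD")
print(hex(packed))                    # 0x41424344
print(int_to_text(packed))            # ABCD

# Five characters need five bytes, which no longer fits in 32 bits.
print(text_to_int("ABCDE") < 2**32)   # False
```

This is the limit described in the edit above: four characters fit in a 32-bit value and eight in a 64-bit one.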
The easiest way I have found to dynamically parse a String representation of an Enumeration into the actual Enumeration type was to do the following:
Private Enum EnumObject
    [Undefined]
    ValueA
    ValueB
End Enum
Dim enumVal As EnumObject = DirectCast([Enum].Parse(GetType(EnumObject), "ValueA"), EnumObject)
This removes the need to maintain a dictionary and allows you to just handle strings instead of converting to an Int or a Long. This does use reflection, but I have not come across any issues as long as you catch and handle any exceptions with the String Parse.

Store an NSString as a fixed length integer?

having a bit of trouble finding a solution to this.
I want to take a large ordered text file of words and create - in the same order - a text file of fixed length numeric values.
For example:
Input File Output File
AAA -> 00000001
AAH -> 00002718
AAZ -> 71827651
Initially it seemed a hash function would do the trick. However they are one way. Also perhaps they are a bit "heavyweight" for this. After all, I don't need any cryptography. Plus, it's a reference file. It will never change.
Any compression is a bonus not essential. That said, I don't want the file to get any bigger than it already is. Which is why I don't just want to write out the words as text but with fixed lengths.
So, bottom line; input is a NSString of variable length, output is an integer of fixed length. And, I must be able to take the integer and figure out the string.
Any help much appreciated!
Thanks!
xj
Well, this would be a bit of a brute force method, but here's my guess.
Start by making a custom function to convert one letter of text to an integer less than 100. (I'm not sure if such a function already exists; if so, then great!) You might need to resort to something like if ([input isEqual:@"a"]) { return 1; } for each letter.
Then, run that function on each letter of text, and get the final integer by combining the previous results.
For example:
int myVal1 = [intConverter firstLetter];
int myVal2 = [intConverter secondLetter];
int myVal3 = [intConverter thirdLetter];
int finalValue = 1000000 + 10000*myVal1 + 100*myVal2 + myVal3; // written out because ^ is XOR in C, not exponentiation
Then, finalValue would be of the form 1(myVal1)(myVal2)(myVal3), which is what I think you're looking for.
To get back the original string, simply use the mod (%) and division functions to get the individual values back, then run the intConverter function backwards. (This would probably mean writing a new function that basically runs those if statements in reverse, but oh well.)
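A minimal sketch of this round trip, in Python rather than Objective-C, assuming the words contain only the letters A to Z. The names letter_code, encode_word and decode_word are hypothetical; each letter becomes a two-digit code from 01 to 26, and the leading 1 keeps the first letter's leading zero from being lost.

```python
def letter_code(ch: str) -> int:
    """Map 'A'..'Z' (case-insensitive) to 1..26."""
    return ord(ch.upper()) - ord("A") + 1


def encode_word(word: str) -> int:
    """Build 1(code1)(code2)... with two decimal digits per letter."""
    value = 1
    for ch in word:
        value = value * 100 + letter_code(ch)
    return value


def decode_word(value: int) -> str:
    """Peel off two digits at a time with mod and division, then reverse."""
    letters = []
    while value > 1:
        value, code = divmod(value, 100)
        letters.append(chr(code - 1 + ord("A")))
    return "".join(reversed(letters))


print(encode_word("AAH"))      # 1010108
print(decode_word(1010108))    # AAH
```

Note that the result grows by two digits per letter, so its length is only fixed if every word has the same number of letters.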
I hope this helps.