Dataframes NAtype to binary Julia - dataframe

I'm trying to write binary text files from a data frame in Julia using something along the lines of:
for x in RICT["$i"]["Sick"]
write(f9, convert(Int16, x ))
and everything works nicely except for when it comes to NA values. Missing values are treated as NA it seems, and I know that there are different ways of handling such values using the data frames package. Does anyone have any experience with these NAtypes? Should I convert the NAtypes to a more conventional type and then write them in? As always any help is much appreciated.

If you are writing a 16-bit integer value, there's no canonical representation of "blank", so you'd have to pick a special 16-bit integer value that represents NA. A common choice for this kind of thing is the smallest representable value – in this case typemin(Int16) == -32768. You can generalize this to other signed integer types.
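As a rough illustration of the sentinel idea (written in Python rather than Julia, with None standing in for NA), the sketch below packs a column as little-endian 16-bit integers. The helper names and the choice of byte order are assumptions made for the example, not part of the DataFrames package.

```python
import struct

NA_SENTINEL = -32768  # smallest 16-bit signed value, same as Julia's typemin(Int16)

def pack_int16_with_na(values):
    """Pack values as little-endian 16-bit integers, writing the sentinel for None."""
    out = bytearray()
    for v in values:
        if v is None:
            v = NA_SENTINEL
        elif v == NA_SENTINEL:
            # Real data must never use the value reserved for "missing".
            raise ValueError("data collides with the NA sentinel")
        out += struct.pack("<h", v)
    return bytes(out)

def unpack_int16_with_na(buf):
    """Inverse of pack_int16_with_na: the sentinel is read back as None."""
    return [None if v == NA_SENTINEL else v for (v,) in struct.iter_unpack("<h", buf)]
```

Whatever reads the file back has to agree on the same sentinel, which is why the reader is shown alongside the writer.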

The set of atomic irrational numbers used to express the character table and corresponding (unitary) representations

I want to calculate the irrational number, expressed by the following formula in gap:
3^(1/7). I've read through the related description here, but still can't figure out the trick. Will numbers like this appear in the computation of the character table and corresponding (unitary) representations?
P.S. Basically, I want to figure out the following question: For the computation of the character table and corresponding (unitary) representations, what is the minimum complete set of atomic irrational numbers used to express the results?
Regards,
HZ
You can't do that with GAP's standard cyclotomic numbers, as seventh roots of 3 are not cyclotomic. Indeed, suppose $r$ is such a root, i.e. a root of the polynomial $f = x^7-3 \in \mathbb{Q}[x]$. Then $r$ is cyclotomic if and only if the field $\mathbb{Q}(r)$ is a subfield of a cyclotomic field. By Kronecker-Weber this is equivalent to that field being an abelian extension of $\mathbb{Q}$, i.e., a Galois extension whose Galois group is abelian. One can check that this is not the case here (the Galois group of $f$ is a semidirect product of $C_7$ with $C_6$).
So, $r$ is not cyclotomic.

Kotlin: Convert Hex String to signed integer via signed 2's complement?

Long story short, I am trying to convert strings of hex values to signed 2's complement integers. I was able to do this in a single line of code in Swift, but for some reason I can't find anything analogous in Kotlin. String.toInt or String.toUInt just give the straight base 16 to base 10 conversion. That works for some positive values, but not for any negative numbers.
How do I know I want the signed 2's complement? I've used this online converter and according to its output, what I want is the decimal from signed 2's complement, not the straight base 16 to base 10 conversion that's easy to do by hand.
So, "FFD6" should go to -42 (correct, confirmed in Swift and C#), and "002A" should convert to 42.
I would appreciate any help or even any leads on where to look. Because yes I've searched, I've googled the problem a bunch and, no I haven't found a good answer.
I actually tried writing my own code to do the signed 2's complement but so far it's not giving me the right answers and I'm pretty at a loss. I'd really hope for a built in command that does it instead; I feel like if other languages have that capability Kotlin should too.
For 2's complement, you need to know how big the type is.
Your examples of "FFD6" and "002A" both have 4 hex digits (i.e. 2 bytes).  That's the same size as a Kotlin Short.  So a simple solution in this case is to parse the hex to an Int and then convert that to a Short.  (You can't parse it directly as a Short, as that would give an out-of-range error for the negative numbers.)
"FFD6".toInt(16).toShort() // gives -42
"002A".toInt(16).toShort() // gives 42
(You can then convert back to an Int if needed.)
You could similarly handle 8-digit (4-byte) values by parsing to a Long and converting to an Int, and 2-digit (1-byte) values by parsing to an Int and converting to a Byte.
For other sizes, you'd need to do some bit operations.  Based on this answer for Java, if you have e.g. a 3-digit hex number, you can do:
("FD6".toInt(16) xor 0x800) - 0x800 // gives -42
(Here 0x800 is the three-digit number with the top bit (i.e. sign bit) set.  You'd use 0x80000 for a five-digit number, and so on.  Also, for 9–16 digits, you'd need to start with a Long instead of an Int.  And if you need >16 digits, it won't fit into a Long either, so you'd need an arbitrary-precision library that handled hex…)
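The xor-and-subtract trick generalises to any width. Here is a minimal sketch in Python (not Kotlin), assuming the width is simply the number of hex digits supplied; hex_to_signed is a made-up helper name.

```python
def hex_to_signed(hex_str):
    """Interpret hex_str as a two's complement number whose width is its digit count."""
    bits = 4 * len(hex_str)           # each hex digit carries 4 bits
    sign_bit = 1 << (bits - 1)        # e.g. 0x8000 for 4 digits, 0x800 for 3
    return (int(hex_str, 16) ^ sign_bit) - sign_bit

print(hex_to_signed("FFD6"))  # -42
print(hex_to_signed("002A"))  # 42
print(hex_to_signed("FD6"))   # -42
```

Python integers are unbounded, so the Int/Long distinction from the Kotlin version does not arise here; the width still has to be known, which is why leading zeros in the input matter.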

Reading Fortran binary file in Python

I'm having trouble reading an unformatted F77 binary file in Python.
I've tried the scipy.io.FortranFile method and the numpy.fromfile method, both to no avail. I have also read the file in IDL, which works, so I have a benchmark for what the data should look like. I'm hoping that someone can point out a silly mistake on my part -- there's nothing better than having an idiot moment and then washing your hands of it...
The data, bcube1, have dimensions 101x101x101x3, and is r*8 type. There are 3090903 entries in total. They are written using the following statement (not my code, copied from source).
open (unit=21, file=bendnm, status='new'
. ,form='unformatted')
write (21) bcube1
close (unit=21)
I can successfully read it in IDL using the following (also not my code, copied from colleague):
bcube=dblarr(101,101,101,3)
openr,lun,'bcube.0000000',/get_lun,/f77_unformatted,/swap_if_little_endian
readu,lun,bcube
free_lun,lun
The returned data (bcube) is double precision, with dimensions 101x101x101x3, so the header information for the file is aware of its dimensions (not flattened).
Now I try to get the same effect using Python, but no luck. I've tried the following methods.
In [30]: f = scipy.io.FortranFile('bcube.0000000', header_dtype='uint32')
In [31]: b = f.read_record(dtype='float64')
which returns the error Size obtained (3092150529) is not a multiple of the dtypes given (8). Changing the dtype changes the size obtained but it remains indivisible by 8.
Alternatively, using fromfile results in no errors but returns one more value than is in the array (a footer perhaps?) and the individual array values are wildly wrong (should all be of order unity).
In [38]: f = np.fromfile('bcube.0000000')
In [39]: f.shape
Out[39]: (3090904,)
In [42]: f
Out[42]: array([ -3.09179121e-030, 4.97284231e-020, -1.06514594e+299, ...,
8.97359707e-029, 6.79921640e-316, -1.79102266e-037])
I've tried using byteswap to see if this makes the floating point values more reasonable but it does not.
It seems to me that the np.fromfile method is very close to working but there must be something wrong with the way it's reading the header information. Can anyone suggest how I can figure out what should be in the header file that allows IDL to know about the array dimensions and datatype? Is there a way to pass header information to fromfile so that it knows how to treat the leading entry?
I played around with it a bit, and I think I have an idea.
How Fortran stores unformatted data is not standardized, so you have to experiment a little, but you need three pieces of information:
1. The format of the data. You suggest that it is 64-bit reals, or 'f8' in Python.
2. The type of the header. That is an unsigned integer, but you need the length in bytes. If unsure, try 4. The header usually stores the length of the record in bytes, and is repeated at the end. Then again, it is not standardized, so no guarantees.
3. The endianness, little or big. Technically this applies to both the header and the values, but I assume they're the same. NumPy defaults to the machine's native byte order, which is little endian on most systems, so if that were the correct setting for your data, I think you would have already solved it.
When you open the file with scipy.io.FortranFile, you need to give the data type of the header. So if the data is stored big-endian, and you have a 4-byte unsigned integer header, you need this:
from scipy.io import FortranFile
ff = FortranFile('data.dat', 'r', '>u4')
When you read the data, you need the data type of the values. Again, assuming big-endian, you want type >f8:
vals = ff.read_reals('>f8')
Look here for a description of the syntax of the data type.
If you have control over the program that writes the data, I strongly suggest you write it with stream access (access='stream' in Fortran 2003 and later), which omits the record markers and is more easily read by Python.
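The number in the question's error message fits the big-endian guess: 101*101*101*3 doubles occupy 24727224 bytes, and those four header bytes misread as little-endian give exactly 3092150529. The sketch below checks that arithmetic and reads one record by hand with NumPy. It assumes 4-byte big-endian markers and a single record in the file; read_single_record is a made-up helper, not part of SciPy.

```python
import struct
import numpy as np

# Record length of the question's array, and the same four bytes misread as little-endian.
n_bytes = 101 * 101 * 101 * 3 * 8
misread = struct.unpack("<I", struct.pack(">I", n_bytes))[0]
print(n_bytes, misread)  # 24727224 3092150529

def read_single_record(path, shape, marker=">u4", dtype=">f8"):
    """Read one unformatted record: length marker, payload, repeated length marker."""
    with open(path, "rb") as fh:
        head = int(np.fromfile(fh, dtype=marker, count=1)[0])
        count = head // np.dtype(dtype).itemsize
        data = np.fromfile(fh, dtype=dtype, count=count)
        tail = int(np.fromfile(fh, dtype=marker, count=1)[0])
    if head != tail:
        raise ValueError("record markers disagree; wrong marker size or byte order?")
    # Fortran stores arrays column-major, so reshape in Fortran order.
    return data.reshape(shape, order="F")
```

For the file in the question the call would presumably be read_single_record('bcube.0000000', (101, 101, 101, 3)). The trailing marker also accounts for the one extra value that np.fromfile returned: two 4-byte markers add up to the size of one float64.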
Fortran has record demarcations which are poorly documented, even in binary files.
So each record written to an unformatted file, for example:
integer*4 Test1
real*4 Matrix(3,3)
open(78,form='unformatted')
write(78) Test1
write(78) Matrix
close(78)
ends up padded on both sides by an np.int32 value. (I've seen references saying that this value is the record length, but haven't verified it personally.)
The above could be read in Python via numpy as:
import numpy as np
file_location = 'fort.78'  # placeholder: path to the file written above
input_file = open(file_location,'rb')
# P1..P4 are the record markers surrounding Test1 and Matrix
datum = np.dtype([('P1',np.int32),('Test1',np.int32),('P2',np.int32),
                  ('P3',np.int32),('MatrixT',(np.float32,(3,3))),('P4',np.int32)])
data = np.fromfile(input_file,datum)
input_file.close()
This should fully populate the data array with the individual data sets in the format above. Do note that numpy expects data to be packed in C order (row major) while Fortran data is column major. For square matrices like the one above, this means that the matrix must be transposed before use. For non-square matrices, you will need to reshape and transpose:
Matrix = np.transpose(data[0]['MatrixT'])
Transposing your 4-D data structure is going to need to be done carefully. You might look into SciPy for automated ways to do so; the SciPy package seems to have Fortran related utilities which I have not fully explored.
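One way to sidestep the manual transpose is NumPy's order='F' argument to reshape, which interprets a flat buffer as column-major. The small sketch below compares the two routes on a 2x3 example; the array contents are made up for illustration.

```python
import numpy as np

# Six values in the order Fortran would have written a 2x3 array (column by column).
flat = np.arange(6, dtype=np.float32)

# Route 1: let NumPy interpret the buffer as column-major.
by_order = flat.reshape((2, 3), order="F")

# Route 2: reshape with the dimensions reversed, then transpose.
by_transpose = flat.reshape((3, 2)).T

print(by_order)
print(np.array_equal(by_order, by_transpose))
```

The same equivalence holds in higher dimensions: reshaping with the reversed shape and calling transpose() with no arguments reverses all axes, which matches reshape(shape, order='F').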

Convert an alphanumeric string to integer format

I need to store an alphanumeric string in an integer column on one of my models.
I have tried:
@result.each do |i|
hex_id = []
i["id"].split(//).each{|c| hex_id.push(c.hex)}
hex_id = hex_id.join
...
Model.create(:origin_id => hex_id)
...
end
When I run this in the console using puts hex_id in place of the create line, it returns the correct values, however the above code results in the origin_id being set to "2147483647" for every instance. An example string input is "t6gnk3pp86gg4sboh5oin5vr40" so that doesn't make any sense to me.
Can anyone tell me what is going wrong here or suggest a better way to store a string like the aforementioned example as a unique integer?
Thanks.
Answering by request from OP
The hex_id.join call does concatenate: c.hex returns an integer for each character (0 for anything that is not a hex digit), and join turns the resulting array into one long string of decimal digits. The problem most likely appears when that string is saved into the integer column: the number it represents is far larger than 2147483647, the maximum positive value of a 32-bit signed integer, so the stored value ends up at that maximum for every record.
In other words, the desired result 060003008600401100500050040 is too large to be recorded as an integer. A better approach would be to keep it as a string, or to use a different algorithm for producing a number from the original string. Perhaps aggregating the hex values by an arithmetic operation will do better than join?
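To make the size problem concrete, here is a short Python sketch (not Ruby) that mimics the per-character conversion from the question and compares the result with the 32-bit limit. ruby_hex is a stand-in written for this example, and the base-36 reading at the end is only one possible "different algorithm", shown to illustrate that a unique integer for this string does not fit in 64 bits either.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the value seen in the question

source = "t6gnk3pp86gg4sboh5oin5vr40"

def ruby_hex(ch):
    """Stand-in for Ruby's String#hex on one character: non-hex characters give 0."""
    return int(ch, 16) if ch in "0123456789abcdefABCDEF" else 0

# Same steps as the question: convert each character, then join the numbers.
joined = "".join(str(ruby_hex(c)) for c in source)
print(joined)
print(int(joined) > INT32_MAX)

# A lossless alternative: read the whole string as one base-36 number.
as_base36 = int(source, 36)
print(as_base36.bit_length() > 64)
```

The per-character conversion also maps every letter above "f" to 0, so different strings can produce the same digits; keeping the original string in a string column avoids both problems.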

How to get float value as it is from the text box in objective c

Can anyone please help me get a float value exactly as it was entered in a text box?
For example, I entered 40.7:
float rate = [[rateField text] floatValue];
I am getting a rate value of 40.7000008, but I want 40.7 only.
Please help me.
Thanks in advance
Thanks everybody,
I tried all the possibilities but I am not able to get what I want. I am not looking to print the value or to convert it into a string. I want to use that value for computation. If I use a number formatter, I get the same problem again when I convert from the number to a float. So I want a float value only, but it should be exactly what I entered in the text box, without any extra digits. This is my requirement. Please help me.
Thanks & regards, Balu
This is OK. It is not guaranteed that you will get exactly 40.7 even if you use a double.
If you want to output 40.7 you can use %.1f or NSNumberFormatter
Try using a double instead. Usually solves that issue. Has to do with the storage precision.
double dbl = [rateField.text doubleValue];
When using floating point numbers, these things can happen because of the way the numbers are stored in binary format in the computer's memory.
It's similar to the way 1/3 = 0.33333333333333... in decimal numbers.
The best way to deal with this is to use number formatters in the text box that displays the value.
You have already obtained the correct float value.
Floating point numbers have limited precision. Although it depends on
the system, the relative error due to rounding will be around 6e-8 for
a float (and around 1.1e-16 for a double).
Non-elementary arithmetic operations may give larger errors, and, of
course, error propagation must be considered when several operations
are compounded.
Additionally, rational numbers that are exactly representable as
floating point numbers in base 10, like 0.1 or 0.7, do not have an
exact representation as floating point numbers in base 2, which is
used internally, no matter the size of the mantissa. Hence, they
cannot be converted into their internal binary counterparts without a
small loss of precision. This can lead to confusing results: for
example, floor((0.1+0.7)*10) will usually return 7 instead of the
expected 8, since the internal representation will be something like
7.9999999999999991118....
So if you're using those numbers for output, you should use some rounding mechanism, even for double values.
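The effects described in these answers are not specific to Objective-C and can be reproduced anywhere that uses IEEE 754 arithmetic. The sketch below uses Python: the struct round trip forces 40.7 through a 32-bit float, and the decimal module stands in for a decimal number type (on the Objective-C side, Foundation's NSDecimalNumber plays a similar role, though that suggestion is an addition here and not something the answers propose).

```python
import math
import struct
from decimal import Decimal

# Squeeze 40.7 through a 32-bit float and widen it again, like floatValue does.
as_float32 = struct.unpack("f", struct.pack("f", 40.7))[0]
print(f"{as_float32:.7f}")   # the stored value, shown to 7 decimal places
print(f"{as_float32:.1f}")   # rounding for display gives back 40.7

# The classic example from the quoted text: the sum falls just short of 0.8.
floored = math.floor((0.1 + 0.7) * 10)
print(floored)

# Decimal arithmetic keeps the digits exactly as typed.
exact = Decimal("40.7") * 3
print(exact)
```

If the computation must use exactly the digits the user typed, a decimal type is the option that delivers that; a binary float or double can only be rounded at the point of display.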