Change data type in Numpy and Nibabel

I'm trying to convert numpy arrays into Nifti file format using Nibabel. Some of my Numpy arrays have dtype('<i8') when it should be dtype('uint8') when Nibabel calls for the data type.
arr.get_data_dtype()
Does anyone know how to convert and save Numpy arrays' data type?

The question of the title is slightly different than the question in the text. So...
If you want to change the data-type of a numpy array arr to np.int8, you are looking for arr.astype(np.int8).
Keep in mind that casting may lose precision or overflow (see the astype documentation).
To save it afterwards you may want to see ?np.save and ?np.savetxt (or to check the library pickle, to save more general objects than numpy array).
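One detail worth checking first: '<i8' is NumPy shorthand for a little-endian 8-byte integer, that is int64, not int8. The sketch below uses a hypothetical helper, cast_checked (not part of NumPy), that refuses to cast when values fall outside the target integer range, because astype on integer input wraps around silently in that case.

```python
import numpy as np

def cast_checked(arr, dtype):
    """Cast arr to an integer dtype, raising if any value would not fit.

    Hypothetical helper for illustration; not part of NumPy.
    """
    info = np.iinfo(dtype)
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise ValueError(f"values outside the range of {np.dtype(dtype).name}")
    return arr.astype(dtype)

arr = np.array([0, 128, 255], dtype="<i8")  # '<i8' is an 8-byte integer
small = cast_checked(arr, np.uint8)
print(arr.dtype.name, arr.itemsize)      # int64 8
print(small.dtype.name, small.itemsize)  # uint8 1
```

Calling cast_checked(np.array([300]), np.uint8) raises ValueError instead of returning a wrapped value.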
If you want to change the data type of a NIfTI image saved in my_image.nii.gz, you can do the following:
import nibabel as nib
import numpy as np
image = nib.load('my_image.nii.gz')
# to be extra sure of not overwriting data:
new_data = np.copy(image.get_fdata())  # get_data() was removed in nibabel 5.0
hd = image.header
# in case you want to remove nan:
new_data = np.nan_to_num(new_data)
# update data type:
new_dtype = np.int8 # for example to cast to int8.
new_data = new_data.astype(new_dtype)
image.set_data_dtype(new_dtype)
# if NIfTI-1
if hd['sizeof_hdr'] == 348:
    new_image = nib.Nifti1Image(new_data, image.affine, header=hd)
# if NIfTI-2
elif hd['sizeof_hdr'] == 540:
    new_image = nib.Nifti2Image(new_data, image.affine, header=hd)
else:
    raise IOError('Input image header problem')
nib.save(new_image, 'my_image_new_datatype.nii.gz')
Finally, if you have a numpy array my_arr and you want to save it as a NIfTI image with a given data type np.my_dtype (a placeholder for a real dtype), you can do:
import nibabel as nib
import numpy as np
new_image = nib.Nifti1Image(my_arr, np.eye(4))
new_image.set_data_dtype(np.my_dtype)
nib.save(new_image, 'my_arr.nii.gz')
Hope it helps!
NOTE: If you are using ITK-SNAP you may want to use np.float32, np.float64, np.uint16, np.uint8, np.int16 or np.int8. Other choices may not produce images that can be opened with this software.
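A small sketch of how one might choose among those types: the hypothetical helper pick_itk_dtype below returns the first integer type from the list that can hold an integer array's value range, and falls back to np.float64 otherwise. The ordering and the fallback are my own assumptions, not something ITK-SNAP prescribes.

```python
import numpy as np

# Integer types from the note above, smallest first (ordering is an assumption).
ITK_INT_DTYPES = (np.uint8, np.int8, np.uint16, np.int16)

def pick_itk_dtype(arr):
    """Return a dtype from the list above that can represent arr.

    Hypothetical helper for illustration only.
    """
    arr = np.asarray(arr)
    if np.issubdtype(arr.dtype, np.integer) and arr.size:
        lo, hi = int(arr.min()), int(arr.max())
        for dt in ITK_INT_DTYPES:
            info = np.iinfo(dt)
            if info.min <= lo and hi <= info.max:
                return np.dtype(dt)
    # Floats, empty arrays, and integers too large for 16 bits end up here.
    return np.dtype(np.float64)

labels = np.array([0, 1, 2, 255], dtype=np.int64)
print(pick_itk_dtype(labels))  # uint8
```

The result can then be passed to set_data_dtype before saving.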

Seems like you could also do
import nibabel
img = nibabel.load(filename)
img.set_data_dtype(dtype)
img.to_filename(new_filename)

You can use nilearn for a tidy solution. Here is an example that changes the data type of a NIfTI image to int16:
from nilearn import image
import numpy as np
vol = image.load_img(input_file)
vol = image.new_img_like(vol, np.int16(vol.get_fdata()))
vol.to_filename(output_file)

Datatypes for .nii files can also be specified in the .to_filename() function:
import nibabel as nib
import numpy as np
new_image = nib.Nifti2Image(my_arr, affine)
new_image.to_filename(fn, dtype=np.uint8)

Related

Convert np.array of PIL image to binary

I'm trying to convert the numpy array of a PIL image to a binary one, but nothing I have tried works.
This is what I have so far:
from PIL import Image
import numpy as np
pixels=np.array(Image.open("covid_encrypted_new.png").getdata())
def to_bin(pixels):
    return [format(i,"08b") for i in pixels]
Also, when I tried to iterate over the array and convert each value to binary, that didn't work either.
What else can I try?
Thanks
This could be what you're looking for.
Original here: How to read the file and convert it to a binary image in Python
# Read Image
img= Image.open(file_path)
# Convert Image to Numpy as array
img = np.array(img)
# Put threshold to make it binary
binarr = np.where(img > 128, 255, 0).astype(np.uint8)  # cast to uint8 so Image.fromarray gets an 8-bit array
# Convert numpy array back to image
binimg = Image.fromarray(binarr)
You could even use OpenCV to convert:
import cv2
img = np.array(Image.open(file_path))
_, bin_img = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
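The answers above read "binary" as black and white. If it instead means the bit pattern of each pixel value, as the format(i, "08b") attempt in the question suggests, the likely problem is that each item of the 2-D pixel array is a whole row rather than a single number. A minimal sketch, using a small made-up array in place of the real image data:

```python
import numpy as np

# Stand-in for np.array(Image.open(...).getdata()): two RGB pixels.
pixels = np.array([[0, 5, 255], [128, 64, 1]], dtype=np.uint8)

# Flatten first so each item is a single integer rather than a row.
bit_strings = [format(int(v), "08b") for v in pixels.ravel()]
print(bit_strings[:3])  # ['00000000', '00000101', '11111111']

# Alternative: one row of eight 0/1 values per channel value.
bits = np.unpackbits(pixels.reshape(-1, 1), axis=1)
print(bits.shape)  # (6, 8)
```

np.unpackbits only accepts uint8 input, so other integer types would need a cast first.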

MATLAB .mat in Pandas DataFrame to be used in Tensorflow

I have gone days trying to figure this out, hopefully someone can help.
I am uploading a .mat file into python using scipy.io, placing the struct into a dataframe, which will then be used in Tensorflow.
from scipy.io import loadmat
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
#import TF
path = '/home/anthony/PycharmProjects/Deep_Learning_MATLAB/circuit-data/for tinghao/template1-lib5-eqns-CR-RESULTS-SET1-FINAL.mat'
raw_data = loadmat(path, squeeze_me=True)
data = raw_data['Graphs']
df = pd.DataFrame(data, dtype=int)
df.pop('transferFunc')
print(df.dtypes)
The output is:
A object
Ln object
types object
nz int64
np int64
dtype: object
Process finished with exit code 0
The struct is (43249x6). Each cell in the 'A' column is a matrix of a different size, e.g. 18x18 or 16x16. Each cell in 'Ln' is a row of letters, each in its own separate cell. Each cell in 'Types' contains 12 columns of numbers. I have no issues with 'nz' and 'np'.
I want to put all columns into a dataframe and use column A, Ln or Types as the 'labels' and nz and np as 'features' (again, I have no issues with the latter). Can anyone help with this, or suggest some kind of workaround?
The end goal is to have TensorFlow train on nz and np and give me either a matrix, Ln, or Type.
What type of data does your .mat file contain? Is your application very time critical?
If you can collect all your data in a struct, you could give jsonencode a try: write the struct to a JSON file and load it back into Python via json (see the json documentation on loading data).
Then you can create a pandas dataframe via
pd.DataFrame.from_dict()
Of course, this would only be a workaround. You would still have to ensure that the data in the MATLAB struct is correctly ordered before it is imported and transferred to a dataframe.
raw_data = loadmat(path, squeeze_me=True)
data = raw_data['Graphs']
graph_labels = pd.DataFrame()
graph_labels['perf'] = raw_data['Objective'][0:1000]
graph_labels['np'] = data['np'][0:1000]
The code above helped out. It's very simple and drawn out, but it got the job done. However, it does not work in TensorFlow, because TensorFlow does not accept this format, and that was my main issue. I have to convert adjacency matrices to networkx graphs, then load them into stellargraph.
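As a rough illustration of the field indexing used above: with squeeze_me=True, loadmat typically returns a MATLAB struct array as a NumPy structured array, so each field can be pulled out by name. The sketch below builds a small stand-in array by hand (the values are made up, and the field names follow the question) and stacks the two numeric fields into a feature matrix.

```python
import numpy as np

# Stand-in for raw_data['Graphs']; real data would come from loadmat.
struct_dtype = [("A", object), ("nz", np.int64), ("np", np.int64)]
data = np.empty(3, dtype=struct_dtype)
for i, matrix in enumerate([np.eye(2), np.eye(3), np.eye(4)]):
    data["A"][i] = matrix  # matrices of different sizes, stored as objects
data["nz"] = [1, 2, 3]
data["np"] = [4, 5, 6]

# Numeric fields stack into a regular 2-D array, one row per graph.
features = np.column_stack([data["nz"], data["np"]])
print(features.shape)                  # (3, 2)
print([m.shape for m in data["A"]])    # [(2, 2), (3, 3), (4, 4)]
```

The variable-sized matrices in 'A' stay as separate objects here; they do not form a regular array without padding or some other conversion.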

Convert an image format from 32FC1 to 16UC1

I need to encode an image in 16UC1 format, but I receive the error:
cv_bridge.core.CvBridgeError:encoding specified as 16UC1, but image has incompatible type 32FC1
I tried to use the skimage function img_as_uint, but since my image values are not between -1 and 1 it doesn't work. I also tried to "normalize" my values by dividing all of them by the value obtained from np.amax, but then the skimage function only returns a blank image.
Is there a way of achieving this conversion?
This is the original 32FC1 image
With numpy you should be able to:
import numpy as np
img = np.random.normal(0, 1, (300, 300, 3)).astype(np.float32) # simulated image
uimg = img.astype(np.uint16)
You will probably first want to do some kind of normalization if the data isn't already in an unsigned range. Probably something like:
img_normalized = (img-img.min())/(img.max()-img.min())*(2**16-1) # 65535 is the largest uint16 value; 256**2 would overflow
But your normalization strategy will depend on what you want to accomplish.
Thanks for sharing an image. I can visualize as follows:
import numpy as np
import matplotlib.pyplot as plt
arr = np.load('32FC1_image.npz')
img = arr['arr_0']
img = np.squeeze(img) # this gets rid of the extra dimensions that are causing matplotlib to not recognize it as an image, the extra dimensions also may be causing your problems
img_normalized = (img-img.min())/(img.max()-img.min())*(2**16-1) # scale to [0, 65535] so the maximum fits in uint16
img_normalized = img_normalized.astype(np.uint16)
plt.imshow(img_normalized)
Try using the normalized 16 bit image.
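A minimal sketch of one such normalization strategy, wrapped in a hypothetical helper to_uint16. It assumes the image contains no NaN or infinite values and scales to 65535, the largest value uint16 can hold. The sample values are made up.

```python
import numpy as np

def to_uint16(img):
    """Linearly rescale a float image to the full uint16 range [0, 65535]."""
    img = np.asarray(img, dtype=np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Constant image: avoid dividing by zero.
        return np.zeros(img.shape, dtype=np.uint16)
    scaled = (img - lo) / (hi - lo) * np.iinfo(np.uint16).max
    return np.round(scaled).astype(np.uint16)

depth = np.array([[0.5, 2.0], [3.5, 8.0]], dtype=np.float32)  # made-up values
out = to_uint16(depth)
print(out.dtype, out.min(), out.max())  # uint16 0 65535
```

Min-max scaling discards the original units, so if the values are physical quantities (depth in metres, say) a fixed scale factor may be more appropriate.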

Slicing the channels of image and storing the channels into numpy array(same size as image). Plotting the numpy array not giving the original image

I separated the 3 channels of a colour image. I created a new NumPy array of the same size as the image, and stored the 3 channels of the image in 3 slices of the 3D NumPy array. After plotting the NumPy array, the plotted image is not the same as the original image. Why is this happening?
Both the img and new_img arrays have the same elements, but the image is different.
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
img=mpimg.imread('/storage/emulated/0/1sumint/kali5.jpg')
new_img=np.empty(img.shape)
new_img[:,:,0]=img[:,:,0]
new_img[:,:,1]=img[:,:,1]
new_img[:,:,2]=img[:,:,2]
plt.imshow(new_img)
plt.show()
I expected the same image as the original.
The problem is that your new image will be created with the default data type of float64 on this line:
new_img=np.empty(img.shape)
unless you specify a different dtype.
You can either (best) copy the original image's dtype like this:
new_img = np.empty(img.shape, dtype=img.dtype)
or use something like this:
new_img = np.zeros_like(img)
or (worst) specify one you happen to know matches your data, like this:
new_img = np.empty(img.shape, dtype=np.uint8)
I presume you have some reason for copying one channel at a time, but if not, you can avoid all the foregoing issues and just do:
new_img = np.copy(img)
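A short sketch of why the two arrays can hold the same values yet display differently. It uses a random stand-in for the JPEG and assumes imread returned 8-bit data. matplotlib's imshow interprets float RGB arrays as values in [0, 1] and integer ones as values in [0, 255], so the float64 copy gets clipped.

```python
import numpy as np

# Stand-in for an 8-bit RGB image such as imread returns for a JPEG.
img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)

as_float = np.empty(img.shape)   # dtype defaults to float64
as_float[:] = img
same_type = np.empty_like(img)   # inherits uint8 from img
same_type[:] = img

print(as_float.dtype, same_type.dtype)      # float64 uint8
print(np.array_equal(as_float, same_type))  # True: same values, different dtype
```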

reading arrays from netCDF, why I get a size of (1,1,n)

I am trying to read, and later plot, data from a netCDF file. Some of the arrays in the .nc file that I am trying to store as variables are created with shape (1,1,n). When printing them I see [[[ numbers, numbers,....]]]. Why are these three [[[ created? How can I read these variables as a simple (n,1) array?
Here is my code
import pandas as pd
import netCDF4 as nc
import matplotlib.pyplot as plt
from tkinter import filedialog
import numpy as np
file_path=filedialog.askopenfilename(title = "Select files", filetypes = (("all files","*.*"),("txt files","*.txt")))
file=nc.Dataset(file_path)
print(file.variables.keys()) # get all variable names
read_alt=file.variables['altitude'][:]
alt=np.array(read_alt)
read_b355=file.variables['backscatter'][:]
read_error_b355=file.variables['error_backscatter'][:]
b355=np.array(read_b355)
error_b355=np.array(read_error_b355)
The variable alt is fine; for the other two I have the aforementioned problem.
Is it possible that your variables (altitude, backscatter and error_backscatter) have more than one dimension? Whenever you load that kind of data, the netCDF library keeps all of the variable's dimensions, including those of length 1.
Nevertheless, what I usually do is remove the dimensions I do not need by squeezing the arrays:
read_alt = np.squeeze(file.variables['altitude'][:])
read_b355 = np.squeeze(file.variables['backscatter'][:])
read_error_b355 = np.squeeze(file.variables['error_backscatter'][:])
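Note that np.squeeze gives shape (n,), not (n,1). If a column shape is really needed, reshape does that directly. A small sketch with a made-up array standing in for the backscatter variable:

```python
import numpy as np

# Stand-in for a variable read from the file with shape (1, 1, n).
b355 = np.arange(5, dtype=np.float64).reshape(1, 1, 5)

flat = np.squeeze(b355)        # drops every length-1 axis
column = b355.reshape(-1, 1)   # one column, n rows
print(flat.shape, column.shape)  # (5,) (5, 1)
```

For plotting with matplotlib the 1-D (n,) form is usually the more convenient one.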