NumPy for dummies using Jupyter

I am learning how to use NumPy, and after creating a 1-D ndarray like this:
import numpy as np
x = np.array([1.5, 2.3, 4, 5.8], dtype = np.int64)
when printing x:
print(x)
the dtype = np.int64 argument does not work in the web-based Jupyter notebook (Anaconda). Can someone help me, please? Thanks a lot, guys!
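If the surprise is that the decimals disappear, that is what an integer dtype does: converting floats to np.int64 drops the fractional part instead of rounding. A minimal sketch, assuming that is the behaviour being asked about (it uses astype for the conversion, which is a swap-in for passing dtype to np.array):

```python
import numpy as np

# Without an explicit dtype, NumPy infers float64 and keeps the decimals.
x_float = np.array([1.5, 2.3, 4, 5.8])

# Converting to a 64-bit integer type discards the fractional parts.
x_int = x_float.astype(np.int64)

print(x_float, x_float.dtype)
print(x_int, x_int.dtype)
```

To keep the decimals, leave dtype out or use a float type such as np.float64.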

Related

Weird behavior about numpy.mean()

I am using numpy 1.21.
import numpy as np
a = [(1, 1), (2,)]
np.mean(a)
It returns:
array([0.5, 0.5, 1. ])
It's not the mean of the flattened array. Can anyone explain why this is returned?
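A likely explanation, offered as a sketch rather than a confirmed account of NumPy internals: the ragged list cannot form a 2-D array, so it is treated as two tuple objects. "Summing" tuples concatenates them, and dividing the result by the element count gives the three values shown. The steps below reproduce that arithmetic by hand and also show one way to get the flattened mean; the variable names are illustrative only.

```python
import numpy as np

a = [(1, 1), (2,)]

# Adding tuples concatenates them instead of summing element-wise.
total = a[0] + a[1]                      # (1, 1, 2)

# Dividing the concatenated values by the number of tuples (2)
# reproduces the array from the question.
apparent_mean = np.asarray(total) / len(a)

# Flatten explicitly to get the mean over all individual numbers.
flat = [value for row in a for value in row]
flat_mean = np.mean(flat)

print(apparent_mean)
print(flat_mean)
```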

How do I convert generated Python lists of numbers into a TensorFlow dataset for feeding into an artificial neural network model on Colab

import random
import numpy as np
import pandas as pd
minWallThickness = 0.250
maxWallThickness = 0.375
minCurrentFlowRate = 320.0 #m3/HR
maxCurrentFlowRate = 600.0 #m3/HR
mycurrentFlRTList=[]
mywallTkList=[]
for x in range(5):
    wallThickness = round(random.uniform(minWallThickness, maxWallThickness), 4)
    mywallTkList.append(wallThickness)
    currentFlowRate = random.randint(320, 600)
    mycurrentFlRTList.append(currentFlowRate)
ls = [[mywallTkList],[mycurrentFlRTList]]
df = pd.DataFrame(ls)
print(df)
I would like to arrange mywallTkList and mycurrentFlRTList into a single TensorFlow dataset which can be fed into an ANN model.
If I understood correctly what you want, you could use the function from_tensor_slices.
You can see more about that here.
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices(mywallTkList)
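The line above only covers one of the two lists. One option for combining both, sketched here under the assumption that each sample should be a (wall thickness, flow rate) pair, is to stack the lists as columns of a NumPy array first. from_tensor_slices accepts a NumPy array and slices it along the first axis, so each row becomes one dataset element. The TensorFlow call is left as a comment so the sketch runs without TensorFlow installed; the name features is illustrative.

```python
import random
import numpy as np

# Regenerate the two lists as in the question.
mywallTkList = [round(random.uniform(0.250, 0.375), 4) for _ in range(5)]
mycurrentFlRTList = [random.randint(320, 600) for _ in range(5)]

# One row per sample, one column per feature: shape (5, 2).
features = np.column_stack([mywallTkList, mycurrentFlRTList])
print(features.shape)

# With TensorFlow available, each row would become one dataset element:
# import tensorflow as tf
# dataset = tf.data.Dataset.from_tensor_slices(features)
```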

sklearn random_state is not working properly

I have read everything related to this but still do not understand what the problem really is. Basically, I use TruncatedSVD with random_state and then print explained_variance_ratio_.sum() for it. It changes every time I run the code. Is this normal?
from sklearn.decomposition import TruncatedSVD
SVD = TruncatedSVD(n_components=40, n_iter=7, random_state=42)
XSVD = SVD.fit_transform(X)
print(SVD.explained_variance_ratio_.sum())
The problem is that later I use UMAP and plot the resulting graph, and I get a different graph every time I run the code. I do not understand whether this is due to TruncatedSVD or UMAP. I use random_state=42 to stop things from changing, but it seems to have no effect.
You are probably doing something wrong, because I cannot reproduce your issue with scikit-learn 0.22:
In [16]: import numpy as np
    ...: from sklearn.decomposition import TruncatedSVD
    ...:
    ...: rng = np.random.RandomState(42)
    ...: X = rng.randn(10000, 100)
    ...: def func(X):
    ...:     SVD = TruncatedSVD(n_components=40, n_iter=7, random_state=42)
    ...:     XSVD = SVD.fit_transform(X)
    ...:     print(SVD.explained_variance_ratio_.sum())
    ...: func(X);func(X);func(X);
0.43320350603512425
0.43320350603512425
0.43320350603512425
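To find out which stage varies between runs, it can help to compare outputs numerically instead of judging plots by eye. The sketch below uses a hypothetical helper, is_deterministic, and two NumPy stand-ins for a fit step (a seeded and an unseeded random projection). These stand-ins are assumptions for illustration; the same check could wrap the SVD.fit_transform call and the UMAP step separately.

```python
import numpy as np

def is_deterministic(fit, X, runs=3):
    """Return True if fit(X) gives the same output on every run."""
    first = np.asarray(fit(X))
    return all(np.allclose(first, np.asarray(fit(X))) for _ in range(runs - 1))

def seeded_projection(X):
    # Stand-in for a step with a fixed random_state.
    rng = np.random.RandomState(42)
    return X @ rng.randn(X.shape[1], 2)

def unseeded_projection(X):
    # Stand-in for a step that draws fresh random numbers each call.
    return X @ np.random.randn(X.shape[1], 2)

X = np.random.RandomState(0).randn(50, 5)
seeded_ok = is_deterministic(seeded_projection, X)
unseeded_ok = is_deterministic(unseeded_projection, X)
print(seeded_ok, unseeded_ok)
```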

passing numpy array as parameter in theano function

As a beginner, I was trying to simply compute the dot product of two matrices using Theano.
My code is very simple.
import theano
import theano.tensor as T
import numpy as np
from theano import function
def covarience(array):
    input_array=T.matrix('input_array')
    deviation_matrix = T.matrix('deviation_matrix')
    matrix_filled_with_1s=T.matrix('matrix_filled_with_1s')
    z = T.dot(input_array, matrix_filled_with_1s)
    identity=np.ones((len(array),len(array)))
    f=function([array,identity],z)
    # print(f)
covarience(np.array([[2,4],[6,8]]))
But the problem is that each time I run this code, I get an error message like "TypeError: Unknown parameter type: "
Can anyone tell me what's wrong with my code?
You cannot pass a NumPy array as an input when defining a Theano function; the inputs must be theano.tensor (symbolic) variables. Define the computation in terms of symbolic variables, compile it with theano.function, and then call the compiled function with real data such as NumPy arrays.
This should work:
import theano
import theano.tensor as T
import numpy as np
a = T.matrix('a')
b = T.matrix('b')
z = T.dot(a, b)
f = theano.function([a, b], z)
a_d = np.asarray([[2, 4], [6, 8]], dtype=theano.config.floatX)
b_d = np.ones(a_d.shape, dtype=theano.config.floatX)
print(f(a_d, b_d))
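As a sanity check on the values to expect, the same product can be computed in plain NumPy. This is only a comparison sketch and not part of the Theano graph; float64 is an assumption here, since theano.config.floatX may be float32 on some setups.

```python
import numpy as np

# Same inputs as the Theano example, with an assumed float64 dtype.
a_d = np.asarray([[2, 4], [6, 8]], dtype=np.float64)
b_d = np.ones(a_d.shape, dtype=np.float64)

# Multiplying by a matrix of ones sums each row of a_d into every column.
expected = np.dot(a_d, b_d)
print(expected)
```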

How do I enable the REFS_OK flag in nditer in numpy in Python 3.3?

Does anyone know how one goes about enabling the REFS_OK flag in numpy? I cannot seem to find a clear explanation online.
My code is:
import sys
import string
import numpy as np
import pandas as pd
SNP_df = pd.read_csv('SNPs.txt',sep='\t',index_col = None ,header = None,nrows = 101)
output = open('100 SNPs.fa','a')
for i in SNP_df:
    data = SNP_df[i]
    data = np.array(data)
    for j in np.nditer(data):
        if j == 0:
            output.write(("\n>%s\n")%(str(data(j))))
        else:
            output.write(data(j))
I keep getting the error message: Iterator operand or requested dtype holds references, but the REFS_OK was not enabled.
I cannot work out how to enable the REFS_OK flag so the program can continue...
I have isolated the problem. There is no need to use np.nditer. The main problem was that I misunderstood how Python handles the loop variable in a for loop. The corrected code is below.
import sys
import string
import fileinput
import numpy as np
import pandas as pd
SNP_df = pd.read_csv('datafile.txt',sep='\t',index_col = None ,header = None,nrows = 5000)
output = open('outputFile.fa','a')
for i in range(1,51):
    data = SNP_df[i]
    data = np.array(data)
    # the first entry of the column becomes the FASTA header line
    for j in range(0,1):
        output.write(("\n>%s\n")%(str(data[j])))
    # the remaining entries are written out as the sequence
    for k in range(1,len(data)):
        output.write(str(data[k]))
output.close()
If you really want to enable the flag, I have a working example.
Python 2.7, numpy 1.14.2, pandas 0.22.0
import pandas as pd
import numpy as np
# get all data as a pandas DataFrame
data = pd.read_csv("./monthdata.csv")
print(data)
# get values as numpy array
data_ar = data.values # numpy.ndarray, every element is a row
for row in data_ar:
    print(row)
    sum = 0
    count = 0
    # the flag name is lowercase: "refs_ok"
    for month in np.nditer(row, flags=["refs_ok"], op_flags=["readwrite"]):
        print(month)
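A self-contained variant that needs no CSV file, as a minimal sketch: the object array below is made-up sample data, standing in for a DataFrame row that mixes strings and numbers. The flag is passed as the lowercase string "refs_ok" in the flags list, and each value yielded by nditer is a zero-dimensional array, so item() is used to get the underlying Python object.

```python
import numpy as np

# Made-up row with mixed Python objects, as a DataFrame row might hold.
row = np.array(["A", "C", 3, None], dtype=object)

seen = []
# "refs_ok" allows iteration over arrays that hold object references.
for item in np.nditer(row, flags=["refs_ok"]):
    seen.append(item.item())  # unwrap the zero-dimensional array

print(seen)
```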