What is the difference between flatten and ravel in numpy? [duplicate]

This question already has answers here:
What is the difference between flatten and ravel functions in numpy?
(3 answers)
Closed 5 years ago.
NumPy v1.9 contains two seemingly identical functions: flatten and ravel.
What is the difference, and when might I pick one over the other for converting a 2-D np.array to 1-D?

Aha:
The primary functional difference is that flatten is a method of an ndarray object and hence can only be called on true NumPy arrays. In contrast, ravel() is a library-level function and can be called on any object that can successfully be converted to an array. For example, ravel() will work on a list of ndarrays, while flatten (obviously) won't.
In addition, as @jonrsharpe pointed out in his comment, the flatten method always returns a copy, while ravel only does so "if needed": ravel returns a view when the array's elements are already contiguous in memory in the requested order, and a copy otherwise.
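A minimal sketch of the copy-vs-view behavior (the array values are illustrative):

    import numpy as np

    a = np.arange(6).reshape(2, 3)

    f = a.flatten()  # always a copy
    r = a.ravel()    # a view here, because a is C-contiguous

    f[0] = 99        # does not affect a
    r[0] = -1        # writes through to a
    print(a[0, 0])   # -1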

Tensorflow: Using one tensor to index slices of another [duplicate]

This question already has answers here:
Get the last output of a dynamic_rnn in TensorFlow
(4 answers)
Closed 4 years ago.
As motivation for this question, I'm trying to use variable length sequences with tf.nn.dynamic_rnn. When I was training with batch_size=1 (one element at a time), everything was going swimmingly, but now I'm trying to increase the batch size, which means zero-padding sequences to the same length.
I've zero-padded (or truncated) all of my sequences up to the max length of 15000.
outputs (from the RNN) has shape [batch_size, max_seq_length, num_units], which for concreteness is right now [16, 15000, 64].
I also create a seq_lengths tensor, which is [batch_size], so [16], corresponding to the actual (pre-padding) length of each sequence.
I've added a fully connected layer to multiply what was previously outputs[:,-1,:] by W and then add a bias term, since ultimately I'm just trying to predict a single value (or rather, batch_size values). However, now I can't just naively use -1 as the index, because the sequences have all been padded by different amounts! I have seq_lengths, but I'm not sure exactly how to use it to index outputs. I've searched around, and I think the answer is some clever use of tf.gather_nd, but I can't quite figure it out. I can easily see how to take individual values, but I want to preserve entire slices. Do I need to create some sort of enormous 3D mask?
Here's what I want in terms of a Python comprehension (outputs is an np.array): outputs = np.array([outputs[i, seq_lengths[i] - 1, :] for i in range(batch_size)]) (the - 1 because seq_lengths holds lengths, not indices).
I'd appreciate any help! Thank you.
Actually, Alex, it turns out you've already answered my question for me :).
After some more research, I came across the following, which covers exactly my use case: https://stackoverflow.com/a/43298689/5526865 . I won't copy the code here; just check that out.
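For reference, a hedged sketch of the idea (TF 1.x API; not the exact code from the linked answer) using tf.gather_nd to select the last valid timestep of each sequence. The placeholder shapes follow the question:

    import tensorflow as tf

    # outputs:     [batch_size, max_seq_length, num_units]
    # seq_lengths: [batch_size], the pre-padding length of each sequence
    outputs = tf.placeholder(tf.float32, [None, 15000, 64])
    seq_lengths = tf.placeholder(tf.int32, [None])

    batch_range = tf.range(tf.shape(outputs)[0])
    # Pair each batch index with its last valid time index (length - 1).
    indices = tf.stack([batch_range, seq_lengths - 1], axis=1)  # [batch, 2]
    last_outputs = tf.gather_nd(outputs, indices)               # [batch, 64]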

Are the eigenvectors returned in numpy.linalg.eig orthogonal? [duplicate]

This question already has answers here:
eigenvectors from numpy.eig not orthogonal
(2 answers)
Closed 5 years ago.
Are the eigenvectors returned in numpy.linalg.eig orthogonal? If not, how can I get orthogonal, normalized eigenvectors and the corresponding eigenvalues?
I tried some simple examples myself; in general I get something like v0*v1 = 0.0001xxxxxxxxxxxxxxx. Can I treat that result as orthogonal?
Documentation for numpy.linalg.eig clearly states:
The array v of eigenvectors may not be of maximum rank, that is, some of the columns may be linearly dependent, although round-off error may obscure that fact. If the eigenvalues are all different, then theoretically the eigenvectors are linearly independent.
However, they are not required to be orthogonal.
Are the eigenvectors returned in numpy.linalg.eig orthogonal?
NumPy does not make any such promise.
If not, how can I get orthogonal and normalized eigenvectors and the corresponding eigenvalues?
There is no guarantee the eigenspaces of a matrix are even orthogonal; it may not be possible to choose orthogonal eigenvectors.
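However, if your matrix is real symmetric (or, more generally, Hermitian), np.linalg.eigh returns orthonormal eigenvectors along with the eigenvalues. A minimal sketch (the matrix is an illustrative example):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])        # symmetric, so eigh applies

    w, v = np.linalg.eigh(A)          # eigenvalues ascending; columns of v orthonormal
    print(np.allclose(v.T @ v, np.eye(2)))  # True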

what is the difference between series/dataframe and ndarray?

Setting aside that they come from two different libraries:
I know that a Series/DataFrame can hold any data type, and an ndarray can hold heterogeneous data too (via the object dtype).
Also, all of NumPy's slicing operations are applicable to a Series.
Is there any other difference between them?
After some research I found the answer to the question I asked above. For anyone who needs it, here it is, from the pandas docs:
A key difference between Series and ndarray is that operations between
Series automatically align the data based on the label. Thus, you can
write computations without giving consideration to whether the Series
involved have the same labels.
An example:
s[1:] + s[:-1]
The result above produces NaN for both the first and last index: if a label is not found in one Series or the other, the result is marked as missing (NaN).
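A minimal sketch of that behavior (the values and labels are illustrative):

    import pandas as pd

    s = pd.Series([1.0, 2.0, 3.0, 4.0], index=["a", "b", "c", "d"])

    # s[1:] lacks label "a" and s[:-1] lacks label "d", so those
    # positions come out as NaN after alignment.
    print(s[1:] + s[:-1])
    # a    NaN
    # b    4.0
    # c    6.0
    # d    NaN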

What is a seed in TensorFlow? [duplicate]

This question already has answers here:
What does 'seeding' mean?
(4 answers)
Closed 6 years ago.
I'm a beginner in TensorFlow, and I came across a parameter called seed in most of the functions. It also appears as the only parameter in some functions, such as tf.set_random_seed(seed). Is the term seed specific to TensorFlow? I believe I've searched the TensorFlow documentation thoroughly but couldn't find a solid answer.
The term "seed" is an abbreviation of the standard term "random seed".
TensorFlow operators that produce random results accept an optional seed parameter. If you pass the same number to two instances of the same operator, they will produce the same sequence of results. If you do not pass a number to such an operator, it will produce different results on each execution.
This is not a TensorFlow-specific term; in fact, almost every programming language has seeds for its random generators. With a seed you make sure you can reproduce your results when using random generators: using the same seed twice results in the same sequence of random numbers.
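A minimal sketch using the TF 1.x API from the question (the seed values are arbitrary):

    import tensorflow as tf

    tf.set_random_seed(42)              # graph-level seed
    x = tf.random_uniform([3], seed=7)  # op-level seed

    with tf.Session() as sess:
        print(sess.run(x))  # identical values on every fresh run of the script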

Numpy: How to unitfy a vector? [duplicate]

This question already has answers here:
How to normalize a NumPy array to a unit vector?
(15 answers)
Closed 6 years ago.
I'm not sure "unitfy" is the right word for a vector.
What I mean is: for the vector (4,3) -> (4/5,3/5), i.e., just divide the vector by its length.
I can do this as vv = v / np.linalg.norm(v)
What is the right word for "unitfy", and what is the standard way of doing it?
The word is "normalize":
http://mathworld.wolfram.com/NormalizedVector.html
Dividing by the norm is a pretty standard way of doing this. Watch out for the case when the norm is very close to zero (you may want to compare it against an epsilon and handle that case specially, or throw an exception).
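A minimal sketch with the zero-norm guard (eps is an assumed tolerance, and normalize is a hypothetical helper name):

    import numpy as np

    def normalize(v, eps=1e-12):
        norm = np.linalg.norm(v)
        if norm < eps:                 # guard the (near-)zero-vector case
            raise ValueError("cannot normalize a near-zero vector")
        return v / norm

    print(normalize(np.array([4.0, 3.0])))  # [0.8 0.6]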
See also:
how to normalize array numpy?