How do I combine 2 ArrayLists into a list of lists in Java

I want to convert 2 ArrayLists into a single ArrayList of two-element lists:
newList3 = [-50, 30, -20, 0, 20, -30, 50]
newList4 = [1, 1, 1, 1, 1, 1, 1]
I want to return:
[[-50, 1], [30, 1], [-20, 1], [0, 1], [20, 1], [-30, 1], [50, 1]]
The only result I can get is:
[-50, 1, 30, 1, -20, 1, 0, 1, 20, 1, -30, 1, 50, 1]
I have tried
a = newList3.get(0);
b = newList4.get(0);
newList.add(a);
newList.add(b);
newList.add(newList2);
newList.clear();
a = newList3.get(1);
b = newList4.get(1);
newList.add(a);
newList.add(b);
newList.add(newList);

The operation you are looking for is called a zip operation.
import java.util.*;
import java.util.stream.*;
// assumes list1 and list2 are List<Integer>
List<List<Integer>> zipped = IntStream
        .range(0, Math.min(list1.size(), list2.size()))
        .mapToObj(i -> Arrays.asList(list1.get(i), list2.get(i)))
        .collect(Collectors.toList());
Because elements are paired by position, we need their indices: IntStream.range generates them, and mapToObj zips the 2 lists by building a two-element list for each index.
The range runs from 0 to the size of the shorter list, so any extra elements in the longer list are ignored.

Identify vectors being a multiple of another in rectangular matrix

Given an n×m matrix (n > m) of integers, I'd like to identify rows that are a multiple of a single other row, as opposed to a linear combination of several other rows.
I could scale all rows to unit length and find the unique rows, but that is prone to floating-point issues and would also not detect vectors that are opposite to each other (pointing in the other direction).
Any ideas?
Example
A = array([[-1, -1, 0, 0],
[-1, -1, 0, 1],
[-1, 0, -1, 0],
[-1, 0, 0, 0],
[-1, 0, 0, 1],
[-1, 0, 1, 1],
[-1, 1, -1, 0],
[-1, 1, 0, 0],
[-1, 1, 1, 0],
[ 0, -1, 0, 0],
[ 0, -1, 0, 1],
[ 0, -1, 1, 0],
[ 0, -1, 1, 1],
[ 0, 0, -1, 0],
[ 0, 0, 0, 1],
[ 0, 0, 1, 0],
[ 0, 1, -1, 0],
[ 0, 1, 0, 0],
[ 0, 1, 0, 1],
[ 0, 1, 1, 0],
[ 0, 1, 1, 1],
[ 1, -1, 0, 0],
[ 1, -1, 1, 0],
[ 1, 0, 0, 0],
[ 1, 0, 0, 1],
[ 1, 0, 1, 0],
[ 1, 0, 1, 1],
[ 1, 1, 0, 0],
[ 1, 1, 0, 1],
[ 1, 1, 1, 0]])
For example, rows 0 and -3 point in opposite directions (multiply one by -1 to make them equal).
You can normalize each row by dividing it by its GCD:
import numpy as np
def normalize(a):
    return a // np.gcd.reduce(a, axis=1, keepdims=True)
And you can define a distance that considers opposite vectors as equal:
def distance(a, b):
    equal = np.all(a == b) or np.all(a == -b)
    return 0 if equal else 1
Then you can use standard clustering methods:
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster
def cluster(a):
    norm_a = normalize(a)
    distances = pdist(norm_a, metric=distance)
    return fcluster(linkage(distances), t=0.5)
For example:
>>> A = np.array([( 1, 2, 3, 4),
... ( 0, 2, 4, 8),
... (-1, -2, -3, -4),
... ( 0, 1, 2, 4),
... (-1, 2, -3, 4),
... ( 2, -4, 6, -8)])
>>> cluster(A)
array([2, 3, 2, 3, 1, 1], dtype=int32)
Interpretation: cluster 1 is formed by rows 4 and 5, cluster 2 by rows 0 and 2, and cluster 3 by rows 1 and 3.
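As a clustering-free variant of the same GCD idea, the rows can be brought to a canonical form (divided by their GCD, with the sign flipped so that the first nonzero entry is positive) and then grouped by exact equality. This is only a minimal sketch under the assumption that the input is an integer array; the names canonical_rows and group_labels are made up for illustration and are not part of the answer above.

```python
import numpy as np

def canonical_rows(a):
    """Divide each row by its GCD and make its first nonzero entry positive."""
    a = np.asarray(a)
    g = np.gcd.reduce(a, axis=1, keepdims=True)
    g[g == 0] = 1  # an all-zero row has GCD 0; leave it unchanged
    reduced = a // g
    # Column of the first nonzero entry in each row (0 for all-zero rows).
    first = np.argmax(reduced != 0, axis=1)
    sign = np.sign(reduced[np.arange(len(reduced)), first])
    sign[sign == 0] = 1
    return reduced * sign[:, None]

def group_labels(a):
    """One integer label per row; equal labels mean the rows are multiples."""
    _, labels = np.unique(canonical_rows(a), axis=0, return_inverse=True)
    return labels.ravel()

A = np.array([( 1,  2,  3,  4),
              ( 0,  2,  4,  8),
              (-1, -2, -3, -4),
              ( 0,  1,  2,  4),
              (-1,  2, -3,  4),
              ( 2, -4,  6, -8)])
labels = group_labels(A)
print(labels)
```

Rows that share a label are integer multiples of the same primitive vector, so rows 0 and 2, rows 1 and 3, and rows 4 and 5 end up together, matching the clusters above. Everything stays in integer arithmetic, so no floating-point tolerance is involved.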
Alternatively, you can take advantage of the fact that the inner product of two normalized, linearly dependent vectors is 1 or -1, so the code could look like this:
>>> A_normalized = (A.T/np.linalg.norm(A, axis=-1)).T
>>> M = np.absolute(np.einsum('ix,jx->ij', A_normalized, A_normalized))
>>> i, j = np.where(np.isclose(M, 1))
>>> i, j = i[i < j], j[i < j] # Remove repetitions
>>> print(i, j)
output: [ 0 2 3 6 7 9 11 13] [27 25 23 22 21 17 16 15]

Set all elements left of index to one, right of index to zero, for a list of indices

Say I have a list of indices:
np.array([1, 3, 2, 4])
How do I create the following matrix, where each row has ones up to and including its index and zeros to the right of it?
[[1, 1, 0, 0, 0, 0],
[1, 1, 1, 1, 0, 0],
[1, 1, 1, 0, 0, 0],
[1, 1, 1, 1, 1, 0]]
1*(np.arange( 6 ) <= arr[:,None])
# array([[1, 1, 0, 0, 0, 0],
# [1, 1, 1, 1, 0, 0],
# [1, 1, 1, 0, 0, 0],
# [1, 1, 1, 1, 1, 0]])
This broadcasts the array of 6 elements across the rows and the array of indices across the columns. The 1* converts boolean to int.
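The broadcasting step can be spelled out as a small function. This is a sketch only: the helper name ones_up_to and its width parameter are made up here, and astype(int) is used in place of the 1* trick.

```python
import numpy as np

def ones_up_to(indices, width):
    """Row k holds ones in columns 0..indices[k] (inclusive), zeros after."""
    indices = np.asarray(indices)
    # Shape (width,) compared with shape (n, 1) broadcasts to (n, width).
    mask = np.arange(width) <= indices[:, None]
    return mask.astype(int)

arr = np.array([1, 3, 2, 4])
result = ones_up_to(arr, 6)
print(result.shape)  # (4, 6)
```

Replacing <= with < would give ones strictly to the left of each index instead.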

numpy array to data frame and vice versa

I'm new to Python!
I'd like to put the sequences and the anomaly flags together,
select only the normal sequences (if the value in the anomaly column is 0, it's a normal sequence),
and turn the normal sequences into a NumPy array (without the anomaly column).
Each row (sequence) is one session, so in this case there are 6 independent sequences.
Each element represents a specific activity.
sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])
anomaly = np.array((0,0,0,0,0,1))
I have these two variables and need to select only the normal sequences.
Here is the code I tried:
# sequence to dataframe
empty_df = pd.DataFrame(columns = ['Sequence'])
empty_df.reset_index()
for i in range(sequence.shape[0]):
    empty_df = empty_df.append({"Sequence":sequence[i]},ignore_index = True)
# concat anomaly
anomaly_df = pd.DataFrame(anomaly)
df = pd.concat([empty_df,anomaly_df],axis = 1)
df.columns = ['Sequence','anomaly']
df
I didn't want to pass the array straight to pd.DataFrame, because that puts each element in its own column:
pd.DataFrame(sequence)
Anyway, after making df, I tried to select the normal sequences:
# select normal sequences
normal = df[df['anomaly'] == 0]['Sequence']
# back to numpy, sequence column only
normal = normal.to_numpy()
normal.shape
This array has a different shape from the variable sequence:
sequence.shape is (6, 6), but normal.shape is (5,).
I want (5, 6). I tried reshape, but it didn't work.
Can someone help me with this?
If anything in my question is unclear, please leave a comment. I appreciate it.
I am not quite sure what you need, but you could do:
import pandas as pd
df = pd.DataFrame({'sequence':sequence.tolist(), 'anomaly':anomaly})
df
                   sequence  anomaly
0        [5, 1, 1, 0, 0, 0]        0
1        [5, 1, 1, 0, 0, 0]        0
2        [5, 1, 1, 0, 0, 0]        0
3        [5, 1, 1, 0, 0, 0]        0
4        [5, 1, 1, 0, 0, 0]        0
5  [5, 1, 1, 300, 200, 100]        1
Convert the column into a list, then create an array from it.
Try:
normal = df.loc[df['anomaly'].eq(0), 'Sequence']
normal = np.array(normal.tolist())
print(normal.shape)
# (5, 6)
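If the DataFrame is not needed for anything else, the same selection can be made on the two arrays from the question with a boolean mask. A minimal sketch, assuming sequence and anomaly have one entry per row in the same order:

```python
import numpy as np

sequence = np.array([[5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 0, 0, 0],
                     [5, 1, 1, 300, 200, 100]])
anomaly = np.array([0, 0, 0, 0, 0, 1])

# Boolean mask: True for rows whose anomaly flag is 0 (normal sequences).
normal = sequence[anomaly == 0]
print(normal.shape)  # (5, 6)
```

The result stays a regular 2-D array, which avoids the (5,) object array that comes from storing each sequence as a single DataFrame cell.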

Numpy Vectorization: add row above to current row on ndarray

I would like to add the values in the above row to the row below using vectorization. For example, if I had the ndarray,
[[0, 0, 0, 0],
[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3]]
Then after one iteration through this method, it would result in
[[0, 0, 0, 0],
[1, 1, 1, 1],
[3, 3, 3, 3],
[5, 5, 5, 5]]
One can simply do this with a for loop:
import numpy as np
def addAboveRow(arr):
    cpy = arr.copy()
    r, c = arr.shape
    for i in range(1, r):
        for j in range(c):
            cpy[i][j] += arr[i - 1][j]
    return cpy
ndarr = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]).reshape(4, 4)
print(addAboveRow(ndarr))
I'm not sure how to approach this using vectorization, though. I think slices should be used? Also, I'm not really sure how to deal with the top border, because nothing should be added to the first row. Any help would be appreciated. Thanks!
Note: I am really new to vectorization, so an explanation would be great!
You can use indexing directly:
b = np.zeros_like(a)
b[0] = a[0]
b[1:] = a[1:] + a[:-1]
>>> b
array([[0, 0, 0, 0],
[1, 1, 1, 1],
[3, 3, 3, 3],
[5, 5, 5, 5]])
An alternative:
b = a.copy()
b[1:] += a[:-1]
Or:
b = a.copy()
np.add(b[1:], a[:-1], out=b[1:])
You could try the following, which modifies arr in place:
np.put(arr, np.arange(arr.shape[1], arr.size), arr[1:]+arr[:-1])
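To see why the slicing works, note that arr[1:] is every row except the first and arr[:-1] is every row except the last, so adding them pairs each row with the one above it and leaves row 0 untouched. The sketch below checks a sliced version against the loop from the question; the function names are made up for illustration.

```python
import numpy as np

def add_above_row_loop(arr):
    """Reference version: element-by-element loop, as in the question."""
    out = arr.copy()
    rows, cols = arr.shape
    for i in range(1, rows):
        for j in range(cols):
            out[i][j] += arr[i - 1][j]
    return out

def add_above_row_sliced(arr):
    """Vectorized version: add rows 0..n-2 of the original to rows 1..n-1."""
    out = arr.copy()
    out[1:] += arr[:-1]  # reads from the original, so no row is added twice
    return out

ndarr = np.arange(4).repeat(4).reshape(4, 4)
looped = add_above_row_loop(ndarr)
sliced = add_above_row_sliced(ndarr)
print(np.array_equal(looped, sliced))  # True
```

Reading from the untouched original while writing into the copy is what keeps this a single step rather than a running sum.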

Convert string to integer pandas dataframe index

I have a pandas DataFrame with a MultiIndex. Unfortunately, one of the index levels stores the years as strings,
e.g. '2010', '2011'.
How do I convert these to integers?
More concretely
MultiIndex(levels=[[u'2010', u'2011'], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]],
labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, , ...]], names=[u'Year', u'Month'])
df_cbs_prelim_total.index.set_levels(df_cbs_prelim_total.index.get_level_values(0).astype('int'))
seems to do it, but not in place. Is there a proper way of changing them?
Cheers,
Mike
It will probably be cleaner to do this before you assign it as the index (as @EdChum points out), but when you already have it as the index, you can indeed use set_levels to alter the labels of one level of your MultiIndex. It is a bit cleaner than your code (you can use index.levels[..]):
In [165]: idx = pd.MultiIndex.from_product([[1,2,3], ['2011','2012','2013']])
In [166]: idx
Out[166]:
MultiIndex(levels=[[1, 2, 3], [u'2011', u'2012', u'2013']],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])
In [167]: idx.levels[1]
Out[167]: Index([u'2011', u'2012', u'2013'], dtype='object')
In [168]: idx = idx.set_levels(idx.levels[1].astype(int), level=1)
In [169]: idx
Out[169]:
MultiIndex(levels=[[1, 2, 3], [2011, 2012, 2013]],
           labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]])
You have to reassign it to save the changes (as is done above; in your case this would be df_cbs_prelim_total.index = df_cbs_prelim_total.index.set_levels(...)).
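Applied to a DataFrame, the reassignment could look like the sketch below. The frame, its value column, and the level contents are invented for illustration; only the level names Year and Month come from the question.

```python
import pandas as pd

# Hypothetical frame whose 'Year' level is stored as strings.
idx = pd.MultiIndex.from_product(
    [["2010", "2011"], [0, 1, 2]], names=["Year", "Month"]
)
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# set_levels returns a new MultiIndex, so assign it back to the frame.
df.index = df.index.set_levels(df.index.levels[0].astype(int), level="Year")

years = df.index.get_level_values("Year")
print(years.dtype)
```

Using index.levels[0] rather than get_level_values(0) passes only the unique level values, which is what set_levels expects.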