What's the difference between summation and concatenation in a neural network like a CNN? - tensorflow

What's the difference between summation and concatenation in neural networks like CNNs?
For example, GoogLeNet's Inception module uses concatenation, while ResNet's residual learning uses summation.
Please teach me.

Concatenation means joining two blobs, so after the concat we have a bigger blob that contains the previous blobs in contiguous memory. For example:
blob1:
1
2
3
blob2:
4
5
6
blob_res:
1
2
3
4
5
6
Summation means element-wise summation: blob1 and blob2 must have exactly the same shape, and the resulting blob has the same shape, with elements a1+b1, a2+b2, ..., ai+bi, ..., an+bn.
For the example above,
blob_res:
(1+4) = 5
(2+5) = 7
(3+6) = 9
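Since the question is tagged tensorflow, here is a minimal TensorFlow sketch of the two operations on toy tensors:

import tensorflow as tf

a = tf.constant([[1., 2., 3.]])        # shape (1, 3)
b = tf.constant([[4., 5., 6.]])        # shape (1, 3)

# Concatenation (Inception-style): the blobs are stacked, so the shape grows.
concat = tf.concat([a, b], axis=-1)    # [[1. 2. 3. 4. 5. 6.]], shape (1, 6)

# Element-wise summation (ResNet-style): shapes must match, shape is preserved.
summed = tf.add(a, b)                  # [[5. 7. 9.]], shape (1, 3)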

Related

Order-independent Deep Learning Model

I have a dataset with parallel time series. The column 'A' depends on columns 'B' and 'C'. The order (and the number) of dependent columns can change. For example:
A B C
2022-07-23 1 10 100
2022-07-24 2 20 200
2022-07-25 3 30 300
How should I transform this data, or how should I build the model, so that the order of columns 'B' and 'C' ('A', 'B', 'C' vs 'A', 'C', 'B') doesn't change the result? I know about GCN, but I don't know how to implement it. Maybe there are other ways to achieve it.
UPDATE:
I want to generalize my question and give one more example. Let's say we have a matrix as a single observation (no time series data):
col1 col2 target
0 1 a 20
1 2 a 30
2 3 b 30
3 4 b 40
I would like to predict one value 'target' for each row/instance. Each instance depends on the other instances. The order of rows is irrelevant, and the number of rows in each observation can change.
You are looking for a permutation invariant operation on the columns.
One way of achieving this would be to apply a column-wise operation, followed by a global pooling operation.
How that achieves your goal:
Column-wise operations are permutation equivariant; that is, applying the operation on the columns and then permuting the output is the same as permuting the columns and then applying the operation.
A global pooling operation (e.g., max-pool, avg-pool) across the columns is permutation invariant: the result of an average pool does not depend on the order of the columns.
Applying a permutation invariant operation on top of a permutation equivariant one results in an overall permutation invariant function.
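A minimal TensorFlow sketch of this idea (the shapes, layer sizes, and random data are arbitrary assumptions):

import tensorflow as tf

x = tf.random.normal([4, 3, 8])            # 4 observations, 3 columns, 8 features each

# Column-wise (shared) dense layer: applied to each column independently,
# so it is permutation *equivariant* over the column axis.
column_mlp = tf.keras.layers.Dense(16, activation="relu")
h = column_mlp(x)                          # shape (4, 3, 16)

# Global average pooling over the column axis: permutation *invariant*.
pooled = tf.reduce_mean(h, axis=1)         # shape (4, 16)

# Permuting the columns leaves the pooled representation unchanged.
perm = tf.gather(x, [2, 0, 1], axis=1)
pooled_perm = tf.reduce_mean(column_mlp(perm), axis=1)
tf.debugging.assert_near(pooled, pooled_perm)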
Additionally, you should look at self-attention layers, which are also permutation equivariant.
What I would try is:
Learn a representation (RNN/Transformer) for a single time series. Apply this representation to A, B and C.
Learn a transformer layer between the representation of A and those of B and C: that is, use the representation of A as the "query" and those of B and C as the "keys" and "values".
This will give you a representation of A that is permutation invariant in B and C.
Update (Aug 3rd, 2022):
For the case of "observations" with varying number of rows, and fixed number of columns:
I think you can treat each row as a "token" (with a fixed dimension = number of columns), and apply a Transformer encoder to predict the target for each "token", from the encoded tokens.
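A rough Keras sketch of that idea (hypothetical layer sizes; a categorical column such as col2 would first need a numeric encoding):

import tensorflow as tf

num_cols, d_model = 2, 32                       # assumed fixed number of numeric columns

inputs = tf.keras.Input(shape=(None, num_cols)) # (batch, rows, cols); rows can vary
h = tf.keras.layers.Dense(d_model)(inputs)      # embed each row ("token")

# One self-attention block over the rows: permutation equivariant in the rows.
attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=d_model)(h, h)
h = tf.keras.layers.LayerNormalization()(h + attn)
ff = tf.keras.layers.Dense(d_model, activation="relu")(h)
h = tf.keras.layers.LayerNormalization()(h + tf.keras.layers.Dense(d_model)(ff))

outputs = tf.keras.layers.Dense(1)(h)           # one target prediction per row
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")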

Efficient element-wise vector times matrix multiplication in MKL

I have a vector
[2 3 4]
That I need to multiply with a matrix
1 1 1
2 2 2
3 3 3
to get
2 3 4
4 6 8
6 9 12
Now, I can make the vector into a matrix and do an element-wise multiplication, but is there also an efficient way to do this in MKL / CBLAS?
Yes, there is a function in oneMKL called cblas_?gemv which computes the product of a matrix and a vector.
You can refer to the link below for more details on the usage of the function.
https://www.intel.com/content/www/us/en/develop/documentation/onemkl-developer-reference-c/top/blas-and-sparse-blas-routines/blas-routines/blas-level-2-routines/cblas-gemv.html
If you have oneMKL installed on your system, you can also take a look at the bundled examples, which will help you better understand the usage of the functions available in the library.
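For reference, the result described in the question is a plain broadcast; a NumPy sketch (not an MKL call) to check the expected output:

import numpy as np

v = np.array([2., 3., 4.])
M = np.array([[1., 1., 1.],
              [2., 2., 2.],
              [3., 3., 3.]])

# Broadcasting multiplies each row of M element-wise by v: result[i, j] = M[i, j] * v[j].
result = M * v
# [[ 2.  3.  4.]
#  [ 4.  6.  8.]
#  [ 6.  9. 12.]]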

pandas create Cross-Validation based on specific columns

I have a dataframe of a few hundred rows that can be grouped into IDs as follows:
df = Val1 Val2 Val3 Id
2 2 8 b
1 2 3 a
5 7 8 z
5 1 4 a
0 9 0 c
3 1 3 b
2 7 5 z
7 2 8 c
6 5 5 d
...
5 1 8 a
4 9 0 z
1 8 2 z
I want to use GridSearchCV, but with a custom CV that will ensure that all the rows from the same ID are always in the same set.
So either all the rows of a are in the test set, or all of them are in the train set, and so on for all the different IDs.
I want to have 5 folds, so 80% of the IDs will go to the train set and 20% to the test set.
I understand that it can't be guaranteed that all folds will have exactly the same number of rows, since one ID might have more rows than another.
What is the best way to do so?
As stated, you can provide cv with an iterator. You can use GroupShuffleSplit(). For example, once you use it to split your dataset, you can pass the result to GridSearchCV() via the cv parameter.
As mentioned in the sklearn documentation, there's a parameter called "cv" where you can provide "An iterable yielding (train, test) splits as arrays of indices."
Do check out the documentation first in future.
As mentioned previously, GroupShuffleSplit() splits data based on group labels. However, the test sets aren't necessarily disjoint (i.e., across multiple splits, an ID may appear in multiple test sets). If you want each ID to appear in exactly one test fold, you can use GroupKFold(). It is also available in sklearn.model_selection and directly extends KFold to take group labels into account.
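A rough sketch of how this could look (the estimator, parameter grid, and target column y are illustrative assumptions, not part of the original data):

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, GroupKFold

X = df[["Val1", "Val2", "Val3"]]
y = df["y"]                      # hypothetical target column
groups = df["Id"]                # rows with the same Id stay together

cv = GroupKFold(n_splits=5)      # 5 folds, so roughly 80% / 20% of the IDs per split

search = GridSearchCV(
    RandomForestRegressor(),
    param_grid={"n_estimators": [50, 100]},
    cv=cv.split(X, y, groups=groups),   # an iterable yielding (train, test) index arrays
)
search.fit(X, y)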

Splitting a data frame into test and train data sets

Use pandas to create two data frames: train_df and test_df, where
train_df has 80% of the data chosen uniformly at random without
replacement.
Here, what does "data chosen uniformly at random without replacement" mean?
Also, how can I do it?
Thanks
"chosen uniformly at random" means that each row has an equal probability of being selected into the 80%
"without replacement" means that each row is only considered once. Once it is assigned to a training or test set it is not
For example, consider the data below:
A B
0 5
1 6
2 7
3 8
4 9
If this dataset is being split into an 80% training set and 20% test set, then we will end up with a training set of 4 rows (80% of the data) and a test set of 1 row (20% of the data)
Without Replacement
Assume the first row is assigned to the training set. Now the training set is:
A B
0 5
When the next row is assigned to training or test, it will be selected from the remaining rows:
A B
1 6
2 7
3 8
4 9
With Replacement
Assume the first row is assigned to the training set. Now the training set is:
A B
0 5
But the next row will be assigned using the entire dataset (i.e., the first row has been placed back into the original dataset):
A B
0 5
1 6
2 7
3 8
4 9
How you can do this:
You can use the train_test_split function from scikit-learn: http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
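For example (assuming df is your DataFrame):

from sklearn.model_selection import train_test_split

# 80% train, 20% test, rows chosen uniformly at random without replacement.
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)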
Or you could do this using pandas and NumPy:
import numpy as np

# Draw a uniform random number in [0, 1) for each row, then split at 0.8.
df['random_number'] = np.random.rand(len(df))
train = df[df['random_number'] <= 0.8]
test = df[df['random_number'] > 0.8]
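Or, with pandas alone, since sample() draws without replacement by default:
train = df.sample(frac=0.8, random_state=42)
test = df.drop(train.index)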

SPSS Compute Variable

Below is some data:
Test Day1 Day2 Score
A 1 2 100
B 1 3 62
C 3 4 90
D 2 4 20
E 4 5 80
I am trying to take the values from columns 'Day1' and 'Day2' and use them as row numbers into the column Score. For example, for Test A I would like to find the sum of 100 and 62, because those are the values in the first and second rows of Score. For Test B I would like to find the sum of 100, 62 and 90.
Is there any way to do this in the Compute Variable window, found in the menu Transform > Compute Variable?
I tried the following:
Score(MEAN(VALUE(Day1), VALUE(DAY2)))
This is not the proper way to reference the cell location of Score, and I received an error.
Can anyone help?
Thank you!
You really have two different datasets here. One is a dataset of scores numbered 1 through 5.
The other is a dataset that includes indexes into the score dataset. So the steps would be something like this.
First take the scores dataset and transpose it so that it has one row and 5 columns (Data>Transpose)
Then match that dataset to each case in the main dataset (Data>Merge Files>Add Variables).
Next you have to resort to using syntax directly.
You would declare a vector for the scores (VECTOR)
Finally, you use COMPUTE to index into the scores.
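For illustration only, here is the same indexing logic sketched in pandas rather than SPSS syntax (the data frame just reproduces the example above):

import pandas as pd

df = pd.DataFrame({
    "Test":  ["A", "B", "C", "D", "E"],
    "Day1":  [1, 1, 3, 2, 4],
    "Day2":  [2, 3, 4, 4, 5],
    "Score": [100, 62, 90, 20, 80],
})

scores = df["Score"].tolist()   # the "transposed" score vector
# For each row, sum the scores whose positions run from Day1 through Day2.
df["SumScore"] = [sum(scores[d1 - 1:d2]) for d1, d2 in zip(df["Day1"], df["Day2"])]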
For your real problem, I suppose that you might have batches of scores and maybe there are some gaps. The Restructure Data Wizard can help you generalize this - convert cases into variables, but let's not go there yet.
HTH,
Jon Peck