How to select Q value in DQN where Q is a multi-dimensional array - tensorflow

I'm implementing a DQN to trade in the stock market (for educational purposes only).
I have the following data, along with its shape. It is one state of a time series, and I'm going to pass it to a neural network. The first column is the closing price of a stock, and the second column is the volume (already normalized):
array([[[-0.39283217,  3.96508668],
        [-0.39415516,  0.04931261],
        [-0.38271683, -0.34029827],
        [-0.39283217, -0.42384451],
        [-0.4332384 , -0.11795849],
        [-0.41201548, -0.47441503],
        [-0.41739012, -0.51788375],
        [-0.42210326, -0.60101319],
        [-0.43660099, -0.596672  ],
        [-0.43660099, -0.64244935]]])
(1, 10, 2)
Now I pass this data to a neural network. It's essentially a policy network, but to simplify the question I'll write it like this here (the loss is Q minus the target Q value):
model = keras.Sequential([
    keras.layers.Input(shape=(10, 2)),
    keras.layers.Dense(10, activation='relu'),
    keras.layers.Dense(3, activation='linear')
])
model.compile(loss=count_the_loss(),
              optimizer='adam',
              metrics=['mse'])
Now I get this by using the predict function:
array([[[-0.79352564, -0.22876596,  2.309589  ],
        [-0.10996505,  0.01430818,  0.22286436],
        [-0.17374574,  0.03645202,  0.10073717],
        [-0.19824156,  0.07159233,  0.08594725],
        [-0.12234195,  0.03734204,  0.19439939],
        [-0.21589771,  0.088783  ,  0.08315123],
        [-0.22866695,  0.10703149,  0.07550874],
        [-0.25188142,  0.1436682 ,  0.05827002],
        [-0.25386256,  0.13714936,  0.06612003],
        [-0.26608405,  0.1581351 ,  0.05540368]]], dtype=float32)
I'm supposed to get q(s,a1), q(s,a2) and q(s,a3) (where a1, a2 and a3 stand for the actions short, flat and long, respectively), and then find the q for the action sampled from the experience replay.
But instead I get a 1x10x3 array.
My questions are:
How am I supposed to get the q?
And once that is done, it's time to find the target Q. The process is similar: suppose the array above is what I get by passing next_state to a target network. I have to find the max q. How can I find the max q in a 1x10x3 array?
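For what it's worth, here is a minimal sketch of the usual DQN indexing, assuming the network is first changed to emit one row of three Q-values per state, e.g. by adding a Flatten layer so that predict returns shape (batch, 3) instead of (batch, 10, 3). All names, sizes and the random batch below are illustrative, not taken from the code above:
import numpy as np
import tensorflow as tf

def make_net():
    # Same layers as above, plus Flatten so each state yields one Q-value per action.
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(10, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation='relu'),
        tf.keras.layers.Dense(3, activation='linear'),
    ])

q_net, target_net = make_net(), make_net()
gamma = 0.99

# Illustrative replay-buffer batch.
states = np.random.randn(32, 10, 2).astype('float32')
actions = np.random.randint(0, 3, size=32)          # 0=short, 1=flat, 2=long
rewards = np.random.randn(32).astype('float32')
next_states = np.random.randn(32, 10, 2).astype('float32')

# q(s, a) for the sampled actions: pick one of the 3 columns per row.
q_all = q_net(states)                                # shape (32, 3)
q_sa = tf.gather(q_all, actions, batch_dims=1)       # shape (32,)

# Target: r + gamma * max_a' Q_target(s', a').
q_next = target_net(next_states)                     # shape (32, 3)
target = rewards + gamma * tf.reduce_max(q_next, axis=-1)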

Related

Order-independent Deep Learning Model

I have a dataset with parallel time series. Column 'A' depends on columns 'B' and 'C'. The order (and the number) of dependent columns can change. For example:
            A   B    C
2022-07-23  1  10  100
2022-07-24  2  20  200
2022-07-25  3  30  300
How should I transform this data, or how should I build the model, so that the order of columns 'B' and 'C' ('A', 'B', 'C' vs 'A', 'C', 'B') doesn't change the result? I know about GCNs, but I don't know how to implement one. Maybe there are other ways to achieve this.
UPDATE:
I want to generalize my question and give one more example. Let's say we have a matrix as a single observation (no time series data):
   col1 col2  target
0     1    a      20
1     2    a      30
2     3    b      30
3     4    b      40
I would like to predict one value, 'target', for each row/instance. Each instance depends on the other instances. The order of rows is irrelevant, and the number of rows in each observation can change.
You are looking for a permutation invariant operation on the columns.
One way of achieving this is to apply a column-wise operation, followed by a global pooling operation.
How that achieves your goal:
Column-wise operations are permutation equivariant; that is, applying the operation to the columns and then permuting the output is the same as permuting the columns and then applying the operation.
A global pooling operation (e.g., max-pool, avg-pool) across the columns is permutation invariant: the result of an average pool does not depend on the order of the columns.
Applying a permutation invariant operation on top of a permutation equivariant one results in an overall permutation invariant function.
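A minimal sketch of that recipe in Keras (the shared per-column MLP is the equivariant part, the pooling is the invariant part; all layer sizes are illustrative assumptions):
import tensorflow as tf

n_features = 8  # assumed size of each column's feature vector

# Input: one feature vector per column; the column axis has variable length.
inp = tf.keras.layers.Input(shape=(None, n_features))
# Permutation-equivariant part: the same Dense weights are applied to every column.
h = tf.keras.layers.Dense(32, activation='relu')(inp)
h = tf.keras.layers.Dense(32, activation='relu')(h)
# Permutation-invariant part: pool across the column axis.
pooled = tf.keras.layers.GlobalAveragePooling1D()(h)
out = tf.keras.layers.Dense(1)(pooled)
model = tf.keras.Model(inp, out)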
Additionally, you should look at self-attention layers, which are also permutation equivariant.
What I would try is:
1. Learn a representation (RNN/Transformer) for a single time series. Apply this representation to A, B and C.
2. Learn a Transformer from the representation of A to those of B and C: that is, use the representation of A as the "query" and those of B and C as the "keys" and "values".
This will give you a representation of A that is permutation invariant in B and C.
Update (Aug 3rd, 2022):
For the case of "observations" with a varying number of rows and a fixed number of columns:
I think you can treat each row as a "token" (with a fixed dimension equal to the number of columns) and apply a Transformer encoder to predict the target for each "token" from the encoded tokens.
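A rough sketch of that idea in Keras, assuming numeric row features and one predicted value per row; the sizes and the single encoder block are illustrative choices:
import tensorflow as tf

n_cols = 2      # fixed number of columns per row ("token" dimension)
d_model = 32    # illustrative embedding size

rows = tf.keras.layers.Input(shape=(None, n_cols))   # (batch, n_rows, n_cols)
x = tf.keras.layers.Dense(d_model)(rows)             # embed each row/token

# One self-attention encoder block; no positional encoding, since row order is irrelevant.
attn = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=d_model)(x, x)
x = tf.keras.layers.LayerNormalization()(x + attn)
ff = tf.keras.layers.Dense(d_model, activation='relu')(x)
x = tf.keras.layers.LayerNormalization()(x + ff)

per_row_target = tf.keras.layers.Dense(1)(x)         # one 'target' per row
model = tf.keras.Model(rows, per_row_target)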

Group by Regression in TensorFlow

I am very new to TensorFlow - so please bear with me if this is a trivial question.
I'm coding in Python + TensorFlow. I have a dataframe with the following structure:
Y | X_1 | X_2 | ... | X_p | Grp
where Y is the continuous response, X_1 through X_p are features, and Grp is a categorical value indicating the group. I want to fit a separate linear regression of Y on (X_1, ..., X_p) for each Grp and save the weights/coefficients. I do not want to use the off-the-shelf tf.estimator.LinearRegressor. Instead I want to go the loss function-optimizer-session.run() route.
The relevant tutorials on the internet cover linear regression, but not per group. I would appreciate any suggestions. I am thinking of doing this:
For each g in Grps:
1. Call the optimizer, passing the data for group g through the placeholders.
2. Get the estimated weights (for group g) and save them in a dataframe: Grp | weights
Another approach that sounds reasonable is to have a separate graph for each group and kick them all off together using various "sessions".
Are these reasonable and feasible in TF? Which one is easier, or are there better approaches?
Thank you,
Sai
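Not an authoritative answer, but a minimal sketch of the first approach above, in the TF1 graph/session style the question asks for: one graph, reused across groups, feeding each group's slice through placeholders. The toy data, column names and iteration count are all assumptions made for illustration:
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf  # TF1-style graph/session API
tf.disable_v2_behavior()

p = 3  # number of features (illustrative)

# Assumed toy data with the Y | X_1..X_p | Grp structure from the question.
rng = np.random.RandomState(0)
feature_cols = ['X_{}'.format(i + 1) for i in range(p)]
df = pd.DataFrame(rng.randn(200, p), columns=feature_cols)
df['Grp'] = rng.choice(['a', 'b'], size=200)
df['Y'] = 2 * df['X_1'] - df['X_2'] + 0.1 * rng.randn(200)

# One graph for all groups: placeholders are fed each group's data in turn.
X = tf.placeholder(tf.float32, shape=[None, p])
Y = tf.placeholder(tf.float32, shape=[None, 1])
W = tf.Variable(tf.zeros([p, 1]))
b = tf.Variable(tf.zeros([1]))
loss = tf.reduce_mean(tf.square(tf.matmul(X, W) + b - Y))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

weights = {}
with tf.Session() as sess:
    for g, sub in df.groupby('Grp'):
        sess.run(tf.global_variables_initializer())  # reset W, b for each group
        feed = {X: sub[feature_cols].values, Y: sub[['Y']].values}
        for _ in range(2000):
            sess.run(train_op, feed_dict=feed)
        weights[g] = np.ravel(sess.run(W))

weights_df = pd.DataFrame(weights, index=feature_cols).T  # Grp | weights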

Customizing tables in Stata

Using Stata 14 on Windows, I am wondering how to build customized tables from several regression results. Here is an example. We have:
reg y x1
predict resid1, residuals
summarize resid1
Which gives:
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
      resid1 |  5,708,529    4.83e-11    .7039736  -3.057633   3.256382
Then I run another regression and similarly obtain the residuals:
reg y x2
predict resid2, residuals
I would like to create a table containing the two standard deviations of the two residuals, and ideally output it to LaTeX. I am familiar with the esttab and estout commands for outputting regression results to LaTeX, but these do not work for customized tables like the one above.
You need to use estpost. This should get you started.
sysuse auto, clear
regress price weight
predict error1, residuals
regress price trunk
predict error2, residuals
eststo clear
estpost summarize error1 error2
eststo
esttab, cells("count mean sd min max") noobs nonum
esttab using so.tex, cells("count mean sd min max") noobs nonum replace
More here.

torch7: Unexpected 'counts' in k-Means Clustering

I am trying to apply k-means clustering to a set of images (loaded as float torch.Tensors) using the following segment of code:
print('[Clustering all samples...]')
local points = torch.Tensor(trsize, 3, 221, 221)
for i = 1,trsize do
   points[i] = trainData.data[i]:clone() -- don't want to modify the original tensors
end
points:resize(trsize, 3*221*221) -- to convert it to a 2-D tensor
local centroids, counts = unsup.kmeans(points, total_classes, 40, total_classes, nil, true)
print(counts)
When I inspect the values in the counts tensor, I see unexpected values: some entries are larger than trsize, whereas the documentation says that counts stores the counts per centroid. I expected this to mean that counts[i] equals the number of samples out of trsize belonging to the cluster with centroid centroids[i]. Am I wrong in assuming so?
If that is indeed the case, shouldn't sample-to-centroid be a hard assignment (i.e., shouldn't the entries of counts sum to trsize, which is clearly not the case with my clustering)? Am I missing something here?
Thanks in advance.
In the current version of the code, counts are accumulated after each iteration:
for i = 1,niter do
   -- k-means computations...
   -- total counts
   totalcounts:add(counts)
end
So in the end counts:sum() is a multiple of niter: each iteration assigns all trsize samples to some centroid, so the accumulated counts sum to niter * trsize rather than trsize.
As a workaround you can use the callback to obtain the final (non-accumulated) counts, by zeroing the accumulator on every iteration except the last:
local maxiter = 40
local centroids, counts = unsup.kmeans(
   points,
   total_classes,
   maxiter,
   total_classes,
   -- callback: discard the running counts until the final iteration
   function(i, _, totalcounts) if i < maxiter then totalcounts:zero() end end,
   true
)
Alternatively, you can use vlfeat.torch and explicitly quantize your input points after k-means to obtain these counts:
local assignments = kmeans:quantize(points)
local counts = torch.zeros(total_classes):int()
for i = 1,total_classes do
   counts[i] = assignments:eq(i):sum()
end

Pandas groupby for k-fold cross-validation with aggregation

Say I have a data frame, df, with columns: id | site | time | clicks | impressions.
I want to use k-fold cross-validation (split the data randomly into k=10 equal-sized partitions, based on e.g. the id column). I think of this as a mapping from id to {0, 1, ..., 9} (so a new column 'fold' running from 0 to 9),
then iteratively take 9/10 of the partitions as training data and the remaining 1/10 partition as validation data
(so first fold==0 is validation and the rest is training, then fold==1 is validation and the rest is training, and so on)
[so I am thinking of this as a generator based on grouping by the fold column].
Finally, I want to group all the training data by site and time (and similarly for the validation data); in other words, sum over the fold index while keeping the site and time indices.
What is the right way of doing this in pandas?
The way I thought of doing it at the moment is:
df_sum = df.groupby(['fold', 'site', 'time']).sum()
# so df_sum has indices fold, site, time
# create a new Series object, dat, name='cross', by mapping fold indices
# to 'training'/'validation'
df_train_val = df_sum.groupby([dat, 'site', 'time']).sum()
df_train_val.xs('validation', level='cross')
The immediate problem I run into is that groupby on columns will accept a mix that includes a Series object, but groupby on a MultiIndex doesn't [the df_train_val assignment above doesn't work]. Obviously I could use reset_index, but given that I want to group over site and time [to aggregate over folds 1 to 9, say] this seems wrong. (I assume grouping is much faster on indices than on 'raw' columns.)
So, question 1: is this the right way to do cross-validation followed by aggregation in pandas? More generally, how do I group and then regroup based on MultiIndex values?
Question 2: is there a way of mixing arbitrary mappings with multilevel indices?
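On question 2: recent pandas versions do accept a mix of an index-aligned Series and index level names in groupby, so the dat mapping sketched above can work directly. A hedged sketch on assumed toy data (column and level names as in the question):
import numpy as np
import pandas as pd

# assumed toy frame with the fold/site/time structure described above
df = pd.DataFrame({
    'fold': np.repeat(range(10), 4),
    'site': list('ab') * 20,
    'time': np.tile(range(2), 20),
    'clicks': np.arange(40),
})
df_sum = df.groupby(['fold', 'site', 'time']).sum()

# map the fold level to 'training'/'validation', holding out fold 0
fold_vals = df_sum.index.get_level_values('fold')
dat = pd.Series(np.where(fold_vals == 0, 'validation', 'training'),
                index=df_sum.index, name='cross')

df_train_val = df_sum.groupby([dat, 'site', 'time']).sum()
print(df_train_val.xs('validation', level='cross'))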
This generator seems to do what I want. You pass in the grouped data (with one index level corresponding to the fold, 0 to n_folds - 1).
def split_fold2(fold_data, n_folds, new_fold_col='fold'):
    i_fold = 0
    indices = list(fold_data.index.names)
    slicers = [slice(None)] * len(fold_data.index.names)
    fold_index = fold_data.index.names.index(new_fold_col)
    indices.remove(new_fold_col)
    while i_fold < n_folds:
        # select every fold except the hold-out fold, then aggregate the fold level away
        slicers[fold_index] = [i for i in range(n_folds) if i != i_fold]
        slicers_tuple = tuple(slicers)
        train_data = fold_data.loc[slicers_tuple, :].groupby(level=indices).sum()
        val_data = fold_data.xs(i_fold, level=new_fold_col)
        yield train_data, val_data
        i_fold += 1
On my data set this takes :
CPU times: user 812 ms, sys: 180 ms, total: 992 ms Wall time: 991 ms
(to retrieve one fold)
Replacing the train_data assignment with
train_data = fold_data.select(lambda x: x[fold_index] != i_fold).groupby(level=indices).sum()
takes
CPU times: user 2.59 s, sys: 263 ms, total: 2.85 s Wall time: 2.83 s
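For completeness, a hedged usage sketch of the generator above on assumed toy data (the fold assignment by id modulo n_folds is one arbitrary choice):
import numpy as np
import pandas as pd

n_folds = 10
rng = np.random.RandomState(0)
df = pd.DataFrame({
    'id': rng.randint(0, 1000, size=400),
    'site': rng.choice(['a', 'b'], size=400),
    'time': rng.randint(0, 5, size=400),
    'clicks': rng.poisson(3, size=400),
    'impressions': rng.poisson(30, size=400),
})
df['fold'] = df['id'] % n_folds
fold_data = df.groupby(['fold', 'site', 'time'])[['clicks', 'impressions']].sum()

for train_data, val_data in split_fold2(fold_data, n_folds):
    # train_data: clicks/impressions summed over the 9 training folds, indexed by (site, time)
    # val_data:   the held-out fold, indexed by (site, time)
    print(train_data.shape, val_data.shape)
    break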