Advice on how to structure LSTM input Data [closed] - tensorflow

I've built a model to predict the price of a particular stock. I have all the hourly candle data for this stock for the last three years, as well as additional features.
Right now, the input vector shape is [206,72,9]: the 72 is three days of hourly candles, and the 9 is the number of features.
My first question is: is there an optimal number of candles to pass in for the second dimension? Would [618,24,9] potentially improve the results?
My second question is, right now the data [1,2,3,4,5,6] is passed in as [1,2,3],[4,5,6], which contains no overlapping hours. Would changing this to [1,2,3],[2,3,4],[3,4,5],[4,5,6] also potentially improve the results?
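For reference, a minimal sketch (with made-up random data) of how the non-overlapping [206, 72, 9] windows described above can be built:

    import numpy as np

    # Hypothetical hourly candles: 206 * 72 hours, 9 features per hour.
    hours = np.random.rand(206 * 72, 9)
    window = 72  # three days of hourly candles

    # Non-overlapping windows, as currently used: one sample per 72-hour block.
    n_windows = len(hours) // window
    X = hours[: n_windows * window].reshape(n_windows, window, 9)
    print(X.shape)  # (206, 72, 9)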

Let me try to answer both your questions concurrently.
It is possible that more data (both in terms of longer time steps and overlapping series) might improve the results; however, there are situations where too much data can also be a detriment to your forecasts.
One of the disadvantages of using LSTM models for time series forecasting is that they tend to carry forward too much volatility from previous time steps into subsequent forecasts, which can make them an unsuitable candidate for analyzing trending data; they are best used on time series that are highly volatile. So, in answer to your question, too much data could be as bad as not having enough data. It all depends on the time series under analysis.
In this regard, you should consider the price trend of your stock. If it is a highly volatile stock, e.g. a small-cap, then an LSTM model might work well. However, if it is a large-cap stock, or one with a clear trend in the data over time, then an LSTM might prove unsuitable.
You might find the following article on using LSTM to forecast oil prices useful: it shows that with a strong trend in the data, LSTM proves too volatile to forecast effectively.

Question 1: The optimal amount is like any other model hyperparameter: you need to find it yourself. Every model and every dataset is different, so there is no ready-made answer.
But in general:
Too short: not enough data, the model won't learn.
Too long: a lot more processing for very little gain (or even a loss).
Question 2: Yes, you'd get an improvement from using sliding windows, because you have more data, which helps generalization (unless your original dataset was already long enough).
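As a hedged illustration of the sliding-window variant (array names and data are made up; tf.keras.utils.timeseries_dataset_from_array offers the same behaviour via its sequence_stride argument):

    import numpy as np

    hours = np.random.rand(206 * 72, 9)  # hypothetical hourly candles, 9 features
    window = 72                          # sequence length fed to the LSTM

    # Overlapping windows with stride 1: one sample per starting hour,
    # i.e. [1,2,3], [2,3,4], [3,4,5], ... instead of [1,2,3], [4,5,6].
    X = np.stack([hours[i : i + window] for i in range(len(hours) - window + 1)])
    print(X.shape)  # (14761, 72, 9) -- far more samples than the original 206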

Related

Efficiently blocking invalid solutions [closed]

What's the best way to block invalid solutions in Optaplanner? I know you can provide a negative hard score with HardSoftScore, but it might still take a long time exploring invalid solutions before arriving at a valid one.
For example, if you're seeing how many packages will fit in a truck, if the sum size of all packages exceeds the capacity of the truck, you don't want to explore any solutions in that space at all.
I think this runs counter to the way Optaplanner is expected to work, in which you have a lot of bad solutions and slowly converge towards a good one. Vetoing a solution doesn't give Optaplanner any information about why it was vetoed, and it's also possible that a better solution can only be reached by traversing through a vetoable one.
Instead, consider whether your score constraints are causing a score trap. Rather than using a fixed -1 hard score for a vetoable solution, use a score that's proportional to how bad that solution is.
In my example, this means that instead of marking over-capacity solutions as hard -1, I should penalize them in proportion to how far over capacity they are, using the matchWeigher form of penalize.
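A plain-Python illustration of that scoring idea (this is not OptaPlanner API, just the difference between a fixed veto and a proportional penalty):

    # A fixed -1 hard score treats "slightly over capacity" and "massively over
    # capacity" the same; a proportional penalty tells the solver how bad it is.

    def fixed_hard_score(total_size: int, capacity: int) -> int:
        return -1 if total_size > capacity else 0

    def proportional_hard_score(total_size: int, capacity: int) -> int:
        # Penalty grows with the overcapacity, guiding the search toward
        # "less overloaded" and eventually feasible solutions.
        return -(total_size - capacity) if total_size > capacity else 0

    print(fixed_hard_score(1050, 1000), proportional_hard_score(1050, 1000))  # -1 -50
    print(fixed_hard_score(2000, 1000), proportional_hard_score(2000, 1000))  # -1 -1000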

What should be the value for learning rate and momentum? [closed]

I am using around 250K images in total with 6 different labels, and I am using VGG with its last layer changed to accommodate the 6 categories. What should the values of the learning rate and momentum be if the SGD optimiser is used?
It depends on a lot of factors, including your training data, batch size, and network. You should try different learning rates and see how fast they converge. The Keras LearningRateScheduler callback is also helpful.
Generally, in fine-tuning, the learning rate is kept small. The convention is to use a learning rate about 10x smaller than the one used to train the model from scratch.
Momentum is used to dampen the oscillations in the optimization procedure (when the reduction in one dimension is much larger than in another). The higher the momentum, the more forcefully the optimization is pushed in directions where the gradient is consistent in sign, and the more movement is dampened in directions where the gradient changes sign. The default values are fine.
Commonly used values are lr = 1e-4 and momentum = 0.9.
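For concreteness, a minimal Keras sketch using those values; the input size, freezing strategy and decay schedule are assumptions for illustration, not recommendations:

    import tensorflow as tf

    # VGG16 backbone with the top replaced by a 6-class head.
    base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                       input_shape=(224, 224, 3))
    base.trainable = False  # or unfreeze the top blocks when fine-tuning further

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(6, activation="softmax"),  # 6 labels
    ])

    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Example schedule: halve the learning rate every 10 epochs.
    scheduler = tf.keras.callbacks.LearningRateScheduler(
        lambda epoch, lr: lr * 0.5 if epoch > 0 and epoch % 10 == 0 else lr
    )
    # model.fit(train_ds, validation_data=val_ds, epochs=30, callbacks=[scheduler])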

Any way to manually make a variable more important in a machine learning model? [closed]

Sometimes you know, from experience or expert knowledge, that some variable will play a key role in a model. Is there a way to manually make that variable count more, so the training process can speed up and the method can incorporate some human knowledge/wisdom/intelligence?
I still think machine learning combined with human knowledge is the strongest weapon we have right now.
This might work by scaling your input data accordingly. On the other hand, the strength of a neural network is to figure out from the data which features are in fact important and which combinations with other features are important.
You might argue that you'll decrease training time. Somebody else might argue that you're biasing your training in such a way that it might even take more time.
Anyway, if you want to do this, assuming a fully connected layer, you could initialize the weights of the input feature you found important to larger values.
Another way could be to first pretrain the model on a training loss that has your feature as an output, then keep those weights and switch to the actual loss. I have never tried this, but it could work.
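A hedged sketch of the first two ideas above; the feature index (3) and the scaling factor (5.0) are arbitrary assumptions:

    import numpy as np
    import tensorflow as tf

    X = np.random.rand(1000, 8).astype("float32")  # 8 input features, made up

    # Idea 1: scale the "important" feature so it dominates the input range.
    X_scaled = X.copy()
    X_scaled[:, 3] *= 5.0

    # Idea 2: initialize the weights connected to that feature with larger values.
    layer = tf.keras.layers.Dense(16, activation="relu")
    layer.build((None, 8))              # creates a kernel of shape (8, 16)
    kernel, bias = layer.get_weights()
    kernel[3, :] *= 5.0                 # boost the row fed by feature 3
    layer.set_weights([kernel, bias])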

What do these questions mean and how do I approach them? [closed]

I am currently writing documentation for my finished product; however, I do not understand what is being asked for by:
Qualitative assessment of performance
Quantitative assessment of performance
A qualitative assessment of performance is an assessment that doesn't use specific measurements but compares the performance with the expectations or needs of the user. So something like:
The performance of the import is low, but acceptable for the intended use.
The application reacts to user input in most cases so fast that no waiting time is perceived.
A quantitative assessment is based on measurements:
The import processes 1 million records per hour
98% of all user interactions are processed within 0.2 seconds
More detailed information, like standard deviations or plotting a measure against some variable, would also count as a quantitative assessment.
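As a small worked example of turning raw measurements into quantitative statements like those above (the latency values are invented):

    # Ten measured response times in seconds.
    latencies = [0.05, 0.12, 0.18, 0.09, 0.35, 0.11, 0.19, 0.08, 0.22, 0.10]

    within_budget = sum(1 for t in latencies if t <= 0.2) / len(latencies)
    print(f"{within_budget:.0%} of interactions processed within 0.2 seconds")

    records, hours = 1_000_000, 1.0
    print(f"Import throughput: {records / hours:,.0f} records per hour")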
Note that both assessments are important. The quantitative is great for comparisons, for example if you want to compare the performance of two versions of an application.
The qualitative assessment is what really matters. In the end it often doesn't matter how many millions of records you process per ms. The question is: is the user satisfied? And in most cases the user doesn't base their feelings on some measurement, but on ... well ... their feelings.

SQL summing vs running totals [closed]

I'm currently in disagreement with my colleague regarding the best design of our database.
We need to frequently access the total user balance from our database of transactions; we will potentially need to access this information several times a second.
He says that SQL is fast and all we need to do is SUM() the transactions. I, on the other hand, believe that eventually, with enough users and a large database, our server will spend most of its time summing the same records over and over. My solution is to keep a separate table that records the totals.
Which one of us is right?
That is an example of database denormalization. It makes the code more complex and introduces potential for inconsistencies, but the query will be faster. Whether that's worth it depends on how badly you need the performance boost.
The sum could also be quite fast (i.e. fast enough) if the table can be indexed properly.
A third way would be to use cached aggregates that are periodically recalculated. This works best if you don't need real-time data (for example, account activity up until yesterday, which you can perhaps augment with real-time data from the much smaller set of today's transactions).
Again, the tradeoff is between making things fast and keeping things simple (don't forget that complexity also tends to introduce bugs and increase maintenance costs). It's not a matter of one approach being "right" for all situations.
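A hedged sketch of both designs using an in-memory SQLite database; the table and column names are invented for illustration:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE transactions (user_id INTEGER, amount REAL)")
    db.execute("CREATE INDEX idx_tx_user ON transactions(user_id)")  # helps SUM()
    db.execute("CREATE TABLE balances (user_id INTEGER PRIMARY KEY, balance REAL)")

    def add_transaction(user_id, amount):
        db.execute("INSERT INTO transactions VALUES (?, ?)", (user_id, amount))
        # Denormalized design: keep the running total up to date on every write.
        db.execute(
            "INSERT INTO balances VALUES (?, ?) ON CONFLICT(user_id) "
            "DO UPDATE SET balance = balance + excluded.balance",
            (user_id, amount),
        )

    for amt in (100.0, -25.0, 40.0):
        add_transaction(1, amt)

    # Normalized design: recompute the total on every read.
    print(db.execute("SELECT SUM(amount) FROM transactions WHERE user_id = 1").fetchone())
    # Denormalized design: read the precomputed total.
    print(db.execute("SELECT balance FROM balances WHERE user_id = 1").fetchone())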
I don't think that one solution fits all.
You can go very far with a good set of indexes and well-written queries. I would start by querying in real time until that no longer holds up, and then jump to the next solution.
From there, you can store aggregates for all non-changing data (for example, from the beginning of time up to the prior month) and just query the sum over whatever data changes in the current month.
You can save aggregated tables, but how many different kinds of aggregates are you going to save? At some point you have to look into some kind of multidimensional structure.
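And a hedged sketch of that hybrid approach, assuming a hypothetical monthly_totals(user_id, month, total) table that is refreshed whenever a month closes:

    # Pre-aggregated closed months plus a live sum over the small current month.
    balance_sql = """
    SELECT
        (SELECT COALESCE(SUM(total), 0) FROM monthly_totals
          WHERE user_id = :uid AND month < :current_month)
      + (SELECT COALESCE(SUM(amount), 0) FROM transactions
          WHERE user_id = :uid AND month = :current_month) AS balance
    """
    # db.execute(balance_sql, {"uid": 1, "current_month": "2016-04"}).fetchone()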