So I hit the wall pretty hard on my current implementation.
I have two planning entities: a machine and a task.
The task has a custom shadow variable which calculates the task start time.
It was all working well; however, I needed to add a span for when the task cannot start at the calculated time because there are no employees available (there is a fixed number of employees per machine).
To implement this last feature, after calculating the start time, if the task can't be started at that time, it searches for the next available time at which there are enough employees for this task to start. It does this by looping through the ordered planned tasks of all machines and checking whether, at the end of each of those tasks, enough employees become free for this task.
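In pseudocode, the search looks roughly like this (hypothetical names, plain Python rather than my actual OptaPlanner listener code; enough_employees_free() stands in for the per-machine availability check):

    # Rough outline of the search described above; names are placeholders.
    def find_next_feasible_start(task, machines, calculated_start):
        """Scan the ordered end times of planned tasks for the first moment
        at which enough employees are free for `task`."""
        candidate_ends = sorted(
            planned.end_time
            for machine in machines
            for planned in machine.planned_tasks
            if planned.end_time >= calculated_start
        )
        for candidate in [calculated_start] + candidate_ends:
            if enough_employees_free(task, candidate):
                return candidate
        return None  # no feasible start found among the planned tasks

    def enough_employees_free(task, time):
        # Placeholder for the availability check: are enough of the machine's
        # (fixed number of) employees free at `time` to start `task`?
        raise NotImplementedError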
The problem with this is that the span does not go away if the task whose end it was spanned to changes position.
I'll leave an image trying to explain this:
Is there a better way to add these spans? Or, if I'm on the right track, is there a way to make sure OptaPlanner invalidates the start times and recalculates them when such a move occurs?
Thank you in advance!
Update: I've managed to trigger the update for every entity after one changes; however, this gets slow very quickly. I don't see any other way around it, since a change to one entity may cause a lack of employees for an entity on another machine. Does anyone have something else in mind for this issue?
I'm in a situation where I have a server running SQL Server 2012 with roughly two hundred scheduled jobs (all of them SSIS package executions). I'm facing a directive from management that I need to run some custom software to create a bug-report ticket whenever a job fails. Right now I'm relying on half the jobs notifying an operator on failure, while the other half have a "go to step X: send failure email" action on each step's failure, where "step X" is some SQL that queries the database and sends out an email saying which job failed at which step.
So what I'm looking for is some universal solution where I can have every job do the same thing when it fails (in this case, run some program that creates a bug-tracking ticket). I am trying to avoid the situation where I manually go into every single job and add a new step at the end, with all previous steps changing to "go to step Y on failure", where step Y is the thing that creates the bug report.
My first thought was to create a new job that queries the execution history tables, looks for unhandled failures, and then does the bug-report creation itself. However, I already made the mistake of presenting this idea to the manager and was told it's not a viable solution because it's "reactive and not proactive" and doesn't create tickets in real time. I should know better than to brainstorm with non-programming management, but it's too late, so that option is off the table and I haven't been able to uncover any other methods.
Any suggestions?
I'm proposing this as an answer, though it's not a technical solution. Present the possible solutions and let the manager decide:
Update all the Agent Jobs - This will take a lot of time and every job will need to be tested, which will also take a lot of time. I'd guess 2-8 weeks depending on how it's done.
Create an error-handler job that monitors the logs and creates tickets based on those errors. This has two drawbacks: it is not "real-time" (as desired by the manager), and something will need to be put in place to ensure errors are only reported once. It has the upside of being one change to manage, and it can be made near real-time if it is run every minute.
A third option, which would be more of a preliminary step, is to create an error report based on the logs. This will help you understand the quantity and types of failures, and it may help shape the ultimate solution: do we want all these tickets, can they be broken up into different categories, and do we want tickets for errors that are self-healing (e.g. connection errors that have built-in retries)?
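For the second or third option, the check itself is just a query against the standard msdb history tables. A rough sketch (the connection string, the create_ticket call, and the high-water-mark bookkeeping are placeholders to adapt):

    import pyodbc  # illustrative only; the same query could run from an Agent job or SSIS task

    CONN_STR = "Driver={ODBC Driver 17 for SQL Server};Server=myserver;Database=msdb;Trusted_Connection=yes;"

    # run_status = 0 means the step failed; instance_id is used as a high-water mark
    # so each failure is reported only once.
    FAILED_STEPS_SQL = """
    SELECT h.instance_id, j.name, h.step_id, h.step_name, h.message
    FROM msdb.dbo.sysjobhistory AS h
    JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
    WHERE h.run_status = 0 AND h.instance_id > ?
    ORDER BY h.instance_id
    """

    def report_new_failures(last_reported_id, create_ticket):
        """Create one ticket per newly failed step; return the new high-water mark."""
        conn = pyodbc.connect(CONN_STR)
        try:
            rows = conn.cursor().execute(FAILED_STEPS_SQL, last_reported_id).fetchall()
        finally:
            conn.close()
        for r in rows:
            create_ticket(f"Job '{r.name}' failed at step {r.step_id} ({r.step_name}): {r.message}")
        return max((r.instance_id for r in rows), default=last_reported_id)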
I'm trying to make sense of Process Default behavior on SSAS 2017 Enterprise Edition.
My cube is processed daily in this standard sequence:
Loop through 30 dimensions, performing Process Add or Process Update as required.
Process approximately 80 partitions for the previous day.
Execute a Process Default as the final step.
Everything works just fine and, for the amount of data involved, performs really well. However, I have observed that after the Process Default completes, if I re-run the Process Default step manually (with no other activity having occurred whatsoever), it takes exactly the same time as the first run.
My understanding was that this step basically scans the cube looking for unprocessed objects and will process any objects found to be unprocessed. Given the flow of dimension processing, and subsequent partition processing, I'd certainly expect some objects to be unprocessed on the first run - particularly aggregations and indexes.
The end to end processing time is around 65 mins, but 10 mins of this is the final process default step.
What would explain this is if the Process Default isn't actually finding anything to do, and the elapsed time is just the cost of scanning the metadata. But firstly, that seems an excessive amount of time, and secondly, if I don't run the step, the cube doesn't come online, which suggests it is definitely doing something.
I've had a trawl through Profiler to try to find events to capture what process default is doing, but I'm not able to find anything that would capture the event specifically. I've also monitored the server performance during the step, and nothing is under any real load.
Any suggestions or clarifications?
I'm really new to Rabbit, and I'm not sure how to search for the exact terms related to this question or what the best way to execute this is.
I have a task that gets called when a user reaches a page. I just want that task to run one time for that day. If more people reach that page, the task is not executed again, since it has already run. If no one ever goes to that page, then the task is never run.
Can someone please point me in the right direction as to what I should be looking for?
You can store a key in Redis (or any other DB) and set its value to the timestamp of the last task run: on each page view, check that value and compare it to the current timestamp; if it refers to the same day, ignore it. If it is older than the current day, update the value and trigger your task.
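For example, a minimal sketch with the redis-py client (the key name and date format are just examples):

    import time
    import redis  # assumes the redis-py client and a reachable Redis server

    r = redis.Redis(host="localhost", port=6379, db=0)
    KEY = "daily_task:last_run"  # example key name

    def maybe_run_daily_task(run_task):
        """Call run_task() at most once per calendar day, from any page view."""
        today = time.strftime("%Y-%m-%d")
        last_run = r.get(KEY)
        if last_run is not None and last_run.decode() == today:
            return  # already ran today; ignore this page view
        r.set(KEY, today)  # record today's run, then trigger the task
        run_task()

If many page views can arrive at the same instant, writing a per-day key with SET NX (for example r.set("daily_task:" + today, "1", nx=True)) closes the small race between the get and the set.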
If your task consumes too much time, you can wrap it in a small web server (Flask/Bottle/...) so that the original page view only triggers an HTTP request.
I am running hyperparameter tuning using Google Cloud ML. I am wondering if it is possible to benefit from (possibly partial) previous runs.
One application would be:
I launch a hyperparameter tuning job
I stop it because I want to change the type of cluster I am using
I want to restart my hypertune job on a new cluster, but I want to benefit from previous runs I already paid for.
Or another application:
I launch a hypertune campaign
I want to extend the number of trials afterwards, without starting from scratch
And then, for instance, I want to remove one degree of freedom (e.g. training_rate), focusing on the other parameters.
Basically, what I need is "how can I have a checkpoint for hypertune?"
Thanks!
Yes, this is an interesting workflow. It's not exactly possible with the current set of APIs, so it's something we'll need to consider in future planning.
However, I wonder if there are some workarounds that could approximate your intended workflow right now.
Start with a higher number of trials, given that you can cancel a job but not extend one.
Finish a training job early based on some external input - e.g. once you've arrived at a fixed training_rate, you could record that in a file in GCS and mark subsequent trials with a different training_rate as infeasible, so those trials end fast.
To go further, e.g. to launch another job (to add runs, or to change the scale tier), you could potentially try using the same output directory, and this time look up the previous result for a given set of hyperparameters along with its objective metric (you'll need to record them somewhere you can look them up - e.g. create GCS files to track the trial runs), so that a particular trial completes early and training moves on to the next trial. Essentially, rolling your own "checkpoint for hypertune".
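For example, a rough sketch of that bookkeeping with the google-cloud-storage client (the bucket and file names are just placeholders):

    import json
    from google.cloud import storage  # assumes the google-cloud-storage client library

    BUCKET = "my-hypertune-bucket"   # hypothetical bucket
    PREFIX = "trial-results"         # one small JSON file per hyperparameter combination

    def _blob_for(params, client):
        key = "-".join(f"{k}={params[k]}" for k in sorted(params))
        return client.bucket(BUCKET).blob(f"{PREFIX}/{key}.json")

    def previous_result(params):
        """Return the recorded objective metric for these hyperparameters, or None."""
        client = storage.Client()
        blob = _blob_for(params, client)
        if blob.exists():
            return json.loads(blob.download_as_text())["objective"]
        return None

    def record_result(params, objective):
        """Record the objective so a later job can skip (or quickly finish) this trial."""
        client = storage.Client()
        _blob_for(params, client).upload_from_string(
            json.dumps({"params": params, "objective": objective}))

At the start of each trial, the training code would call previous_result() with its assigned hyperparameters; if a result already exists, it can report that metric and exit immediately, so the trial completes early and tuning moves on to the next one.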
As I mentioned, all of these are workarounds, and exploratory thoughts on what might be possible from your end with current capabilities.
I have in my application a ListView that shows the transport events. This ListView should be updated every second to follow up on the events.
I simply do that with a timer (1000 ms interval) that declares one connection object, a DataReader, etc., then fills the ListView; finally, I dispose of the connection and the other objects (this happens on every timer tick).
Now, is there any better way to do that? Maybe better for performance, memory, or something else?
I'm not an expert, so I thought that declaring many connections every second might cause some memory problems :) (correct me if that is wrong).
Database: Access 2007
VS 2012
Thank you.
Assuming that you are using ADO.NET to access your database, your access model should be fine, because .NET uses connection pooling to minimize performance impacts of closing and re-opening DB connections.
Your overall architecture may be questioned, however: polling for updates on a timer is usually not the best option. A better approach would be maintaining an update sequence in a separate table. The table would have a single row, with a single int column initially set to zero. Every time an update to the "real" data is made, this number is bumped up by one. With this table in place, your program could read just this one number every second, rather than re-reading your entire data set. If your program detects that the number is the same as it was the previous time, it stops and waits for the next timer interval; otherwise, it re-reads the data set.
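For illustration, a minimal sketch of that polling pattern (shown in Python with pyodbc just to keep it short; the same logic maps directly to ADO.NET, and the UpdateSequence/SeqNo names and connection string are hypothetical):

    import pyodbc  # illustrative only; the pattern maps directly to ADO.NET in the real app

    # Hypothetical schema: a one-row table UpdateSequence with a single int column SeqNo,
    # which every writer bumps by one after changing the transport-event data.
    CONN_STR = r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:\data\transport.accdb;"

    last_seen = -1

    def on_timer_tick(reload_listview):
        """Runs every second: re-read the full data set only when SeqNo has changed."""
        global last_seen
        conn = pyodbc.connect(CONN_STR)
        try:
            seq = conn.cursor().execute("SELECT SeqNo FROM UpdateSequence").fetchval()
        finally:
            conn.close()
        if seq != last_seen:
            last_seen = seq
            reload_listview()  # only now hit the database for the transport events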