OpenStack get vm cpu_util with Stein version - virtual-machine

In the Stein version, Ceilometer removed polling for cpu_util.
According to this doc:
https://docs.openstack.org/ceilometer/stein/admin/telemetry-measurements.html#openstack-compute
the only remaining measurements are cpu (CPU time used) and vcpus (number of virtual CPUs allocated to the instance).
This release note in the Ceilometer repository https://github.com/openstack/ceilometer/blob/4ae919c96e4116ab83e5d83f2b726ed44d165278/releasenotes/notes/save-rate-in-gnocchi-66244262bc4b7842.yaml says the cpu_util meters are deprecated,
and this commit removes transformer support from Ceilometer:
https://github.com/openstack/ceilometer/commit/9db5c6c9bfc66018aeb78c4a262e1bfa9b326798#diff-4161ff0e1519a6226d1117b428fc831a
According to the commit message, Gnocchi now handles what the transformers used to do.
So, how can I use Gnocchi to aggregate cpu and vcpus and calculate CPU usage?

Although this question is some three months old, I hope my answer still helps somebody.
It seems that Ceilometer's pipeline processing has never worked correctly. As the original poster noticed, the Ceilometer development team took the somewhat drastic measure of deprecating and then removing this feature. Consequently, the only CPU meter that remains in the Ceilometer arsenal is the accumulated CPU time of an instance, expressed in nanoseconds.
To calculate the CPU utilization of a single instance based on this meter, you can use Gnocchi's rate aggregation. If rate:mean is one of the aggregation methods in your archive policy, you can do this:
gnocchi measures show --resource-id <uuid> --aggregation rate:mean cpu
Or use the dynamic aggregation feature for the same result:
gnocchi aggregates '(metric cpu rate:mean)' id=<uuid>
The first parameter of the aggregates command is the operation that defines what figures you want. You find operations explained in the Gnocchi API documentation, especially the section that lists supported operations, and the examples section. The second parameter is a search expression that limits the calculation to instances with this particular UUID (of course, there is only one such instance).
So far, the command just pulls pre-aggregated figures from the Gnocchi database. You can, however, also aggregate the stored data on the fly. This technique is (or used to be) called re-aggregation. This way, you don't need to include rate:mean in the archive policy:
gnocchi aggregates '(aggregate rate:mean (metric cpu mean))' id=<uuid>
The numbers are expressed in nanoseconds, which is a bit unwieldy. Good news: Gnocchi aggregate operations also support arithmetic. To convert nanoseconds to seconds, divide them by one billion:
gnocchi aggregates '(/ (aggregate rate:mean (metric cpu mean)) 1000000000)' id=<uuid>
And to convert them to percentages, divide them by (granularity times one billion), then multiply the result by 100. Assuming a granularity of 60 seconds:
gnocchi aggregates '(* (/ (aggregate rate:mean (metric cpu mean)) 60000000000) 100)' id=<uuid>
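As a sanity check, the same conversion can be sketched in plain Python (the sample value and the 60-second granularity are illustrative assumptions, not figures from a real deployment):

```python
GRANULARITY = 60  # seconds per aggregation period (assumed archive policy)

def cpu_util_percent(rate_ns, granularity=GRANULARITY):
    """Convert a rate:mean CPU sample (nanoseconds of CPU time accumulated
    per granularity period) into a utilization percentage.
    One vCPU fully busy yields granularity * 1e9 ns per period."""
    return rate_ns / (granularity * 1e9) * 100

# Hypothetical sample: 30 s of CPU time accumulated within a 60 s period
print(cpu_util_percent(30e9))  # 50.0
```

Note that for a multi-vCPU instance this yields a value relative to one vCPU; divide by vcpus to normalize to 100 %.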

Related

Check number of slots used by a query in BigQuery

Is there a way to check how many slots were used by a query over the period of its execution in BigQuery? I checked the execution plan but I could just see the Slot Time in ms but could not see any parameter or any graph to show the number of slots used over the period of execution. I even tried looking at Stackdriver Monitoring but I could not find anything like this. Please let me know if it can be calculated in some way or if I can see it somewhere I might've missed seeing.
A BigQuery job will report the total number of slot-milliseconds from the extended query stats in the job metadata, which is analogous to computational cost. Each stage of the query plan also indicates input stats for the stage, which can be used to indicate the number of units of work each stage dispatched.
More details about the representation can be found in the REST reference for jobs. See statistics.query.totalSlotMs and statistics.query.queryPlan[].parallelInputs for more information.
BigQuery now provides a key in the Jobs API JSON called "timeline". This structure provides "statistics.query.timeline[].completedUnits" which you can obtain either during job execution or after. If you choose to pull this information after a job has executed, "completedUnits" will be the cumulative sum of all the units of work (slots) utilised during the query execution.
The question might have two parts though: (1) Total number of slots utilised (units of work completed) or (2) Maximum parallel number of units used at a point in time by the query.
For (1), the answer is as above, given by "completedUnits".
For (2), you might need to consider the maximum value of queryPlan.parallelInputs across all query stages, which would indicate the maximum "number of parallelizable units of work for the stage" (https://cloud.google.com/bigquery/query-plan-explanation)
If, after this, you additionally want to know if the 2000 parallel slots that you are allocated across your entire on-demand query project is sufficient, you'd need to find the point in time across all queries taking place in your project where the slots being utilised is at a maximum. This is not a trivial task, but Stackdriver monitoring provides the clearest view for you on this.
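Both calculations above can be sketched against the job-statistics payload. The dictionary below is a hypothetical, trimmed-down excerpt of the statistics.query section of a Jobs API response, not real output; only the field names follow the REST representation:

```python
# Hypothetical excerpt of a Jobs API response (statistics.query section).
stats = {
    "totalSlotMs": "125000",
    "queryPlan": [
        {"name": "Stage 1", "parallelInputs": "57"},
        {"name": "Stage 2", "parallelInputs": "12"},
    ],
    "timeline": [
        {"elapsedMs": "10000", "completedUnits": "57"},
        {"elapsedMs": "20000", "completedUnits": "69"},
    ],
}

# (1) Total units of work completed: the last cumulative timeline entry.
total_units = int(stats["timeline"][-1]["completedUnits"])

# (2) Peak parallelism: the largest parallelInputs across all stages.
max_parallel = max(int(s["parallelInputs"]) for s in stats["queryPlan"])

print(total_units, max_parallel)  # 69 57
```

The API returns these counters as strings, hence the int() conversions.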

BigQuery GUI - CPU Resource Limit

Is there a way to set the CPU Resource Limit on BigQuery with Python or the GUI?
I'm getting an error of:
Query exceeded resource limits. 2147706.163729571 CPU seconds were used, and this query must use less than 46300.0 CPU seconds.
Looking at the BigQuery's Python reference page: http://google-cloud-python.readthedocs.io/en/latest/bigquery/reference.html
It looks like there's:
1. maximum_billing_tier
2. maximum_bytes_billed
that can be set, but there is no CPU-seconds option.
You can no longer set maximum_billing_tier - it is obsolete. As long as you stay below tier 100 you are billed as if it were tier 1; if you exceed tier 100, the query simply fails.
As for CPU - check the concept of slots:
Maximum concurrent slots per project for on-demand pricing — 2,000
The default number of slots for on-demand queries is shared among all queries in a single project. As a rule, if you're processing less than 100 GB of queries at once, you're unlikely to be using all 2,000 slots.
To check how many slots you're using, see Monitoring BigQuery Using Stackdriver. If you need more than 2,000 slots, contact your sales representative to discuss whether flat-rate pricing meets your needs.
See more at https://cloud.google.com/bigquery/quotas#query_jobs

What is the meaning of totalSlotMs for a BigQuery job?

What is the meaning of the statistics.query.totalSlotMs value returned for a completed BigQuery job? Except for giving an indication of relative cost of one job vs the other, it's not clear how else one should interpret the number. For example, how does the slot-milliseconds number relate to the stack driver reported total slot usage for a given project (which needs to stay below 2000 for on demand BigQuery usage)?
The docs are a bit terse ('[Output-only] Slot-milliseconds for the job.')
The idea is to have a 'slots' metric in the same units in which slot reservations are sold to customers.
For example, imagine that you have a 20-second query that is continuously consuming 4 slots. In that case, your query is using 80,000 totalSlotMs (4 * 20,000).
This way you can determine the average number of slots even if the peak number of slots differs as, in practice, the number of workers will fluctuate over the runtime of a query.
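The arithmetic from the example above, as a tiny sketch (the numbers are the illustrative ones from the answer, not real job output):

```python
def average_slots(total_slot_ms, elapsed_ms):
    """Average number of slots consumed over the job's runtime."""
    return total_slot_ms / elapsed_ms

# A 20-second query continuously consuming 4 slots
# reports 4 * 20,000 = 80,000 totalSlotMs.
print(average_slots(80_000, 20_000))  # 4.0
```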

Calculate % Processor Utilization in Redis

Using INFO CPU command on Redis, I get the following values back (among other values):
used_cpu_sys:688.80
used_cpu_user:622.75
Based on my understanding, the value indicates the CPU time (expressed in seconds) accumulated since the launch of the Redis instance, as reported by the getrusage() call (source).
What I need to do is calculate the % CPU utilization based on these values. I looked extensively for an approach to do so but unfortunately couldn't find a way.
So my questions are:
Can we actually calculate the % CPU utilization based on these 2 values? If the answer is yes, then I would appreciate some pointers in that direction.
Do we need some extra data points for this calculation? If the answer is yes, I would appreciate if someone can tell me what those data points would be.
P.S. If this question should belong to Server Fault, please let me know and I will post it there (I wasn't 100% sure if it belongs here or there).
You need to read the value twice, calculate the delta, and divide by the time elapsed between the two reads. That should give you the cpu usage in % for that duration.
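A minimal sketch of that calculation, using two hypothetical INFO CPU snapshots taken 10 seconds apart (with a live server you would issue INFO CPU twice through your client library and record the wall-clock time between the calls):

```python
# Two hypothetical snapshots of Redis INFO CPU, 10 seconds apart.
t0 = {"used_cpu_sys": 688.80, "used_cpu_user": 622.75}
t1 = {"used_cpu_sys": 689.35, "used_cpu_user": 623.40}
elapsed = 10.0  # wall-clock seconds between the two reads

def cpu_percent(first, second, elapsed):
    """CPU utilization (%) of one core over the sampling window."""
    delta = (second["used_cpu_sys"] - first["used_cpu_sys"]
             + second["used_cpu_user"] - first["used_cpu_user"])
    return delta / elapsed * 100

print(round(cpu_percent(t0, t1, elapsed), 1))  # 12.0
```

On a multi-core host this is utilization relative to a single core, matching how tools like top report a single process.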

Q-learning value update

I am working on the power management of a device using Q-learning algorithm. The device has two power modes, i.e., idle and sleep. When the device is asleep, the requests for processing are buffered in a queue. The Q-learning algorithm looks for minimizing a cost function which is a weighted sum of the immediate power consumption and the latency caused by an action.
c(s,a)=lambda*p_avg+(1-lambda)*avg_latency
In each state, the learning algorithm takes an action (executing time-out values) and evaluates the effect of the taken action in the next state (using the above formula). The actions are taken by executing certain time-out values from a pool of pre-defined time-out values. The parameter lambda in the above equation is a power-performance parameter (0 <= lambda <= 1). It defines whether the algorithm should look for power saving (lambda --> 1) or should look for minimizing latency (lambda --> 0). The latency for each request is calculated as queuing-time + execution-time.
The problem is that the learning algorithm always favors small time-out values in the sleep state. It is because the average latency for small time-out values is always lower, and hence their cost is also small. When I change the value of lambda from lower to higher, I don't see any effect in the final output policy. The policy always selects small time-out values as best actions in each state. Instead of average power and average latency for each state, I have tried using overall average power consumption and overall average latency for calculating cost for a state-action pair, but it doesn't help. I also tried using total energy consumption and total latency experienced by all the requests for calculating cost in each state-action pair, but it doesn't help either. My question is: what could be a better cost function for this scenario? I update the Q-value as follows:
Q(s,a) = Q(s,a) + alpha*[c(s,a) + gamma*min_a' Q(s',a') - Q(s,a)]
Where alpha is a learning rate (decreased slowly) and gamma=0.9 is a discount factor.
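That update rule can be sketched as follows. Since c(s,a) is a cost to be minimized, the target uses the minimum over next-state actions, exactly as in the formula above; the state and action names and all numeric values are purely illustrative:

```python
def q_update(Q, s, a, cost, s_next, alpha, gamma=0.9):
    """One Q-learning step for a cost-minimization problem:
    Q(s,a) += alpha * [c(s,a) + gamma * min_a' Q(s',a') - Q(s,a)]"""
    target = cost + gamma * min(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])
    return Q[s][a]

# Illustrative table: two power states, two time-out actions each.
Q = {"idle":  {"short": 0.0, "long": 0.0},
     "sleep": {"short": 0.0, "long": 0.0}}

# One step: in "sleep" we chose "long", paid cost 1.0, landed in "idle".
q_update(Q, "sleep", "long", cost=1.0, s_next="idle", alpha=0.5)
print(Q["sleep"]["long"])  # 0.5
```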
To answer the questions posed in the comments:
shall I use the entire power consumption and entire latency for all
the requests to calculate the cost in each state (s,a)?
No. In Q-learning, reward is generally considered an instantaneous signal associated with a single state-action pair. Take a look at Sutton and Barto's page on rewards. As shown, the instantaneous reward function (r_t+1) is subscripted by time step - indicating that it is indeed instantaneous. Note that R_t, the expected return, considers the history of rewards (from time t back to t_0). Thus, there is no need for you to explicitly keep track of accumulated latency and power consumption (and doing so is likely to be counter-productive).
or shall I use the immediate power consumption and average latency
caused by an action a in state s?
Yes. To underscore the statement above, see the definition of an MDP on page 4 here. The relevant bit:
The reward function specifies expected instantaneous reward as a
function of the current state and action
As I indicated in a comment above, problems in which reward is being "lost" or "washed out" might be better solved with a Q(lambda) implementation because temporal credit assignment is performed more effectively. Take a look at Sutton and Barto's chapter on TD(lambda) methods here. You can also find some good examples and implementations here.