Cloudera Manager API to fetch the list of Flume agents

Cloudera Manager API to fetch the list of Flume agents - api

I am trying to fetch the number of Flume agents are running on My CDH5.8 cluster using Cloudera Manager API .
https://cloudera.github.io/cm_api/
Till now i could not figure out which RESTful Model I should consider or the related Java class. If any one can help or to inform the the referenced Java class to look into that will be great
Regards

If you use the following API:
https://cloudera.github.io/cm_api/apidocs/v13/path__clusters_-clusterName-services-serviceName-_roles.html
The size of the items array in the JSON object returned will be the number of Flume agents. To find the number of running agents, for each item, check that roleState equals STARTED.
The Java class ApiRole is probably what you need. This code snippet from the whirr-cm example is close to what you want.
https://github.com/cloudera/whirr-cm/blob/edb38ca7faa3e4bb2c23450ff0183c2dd631dcf4/src/main/java/com/cloudera/whirr/cm/server/impl/CmServerImpl.java#L486
for (ApiService apiService : apiResourceRootV3.getClustersResource().getServicesResource(getName(cluster))
.readServices(DataView.SUMMARY)) {
for (ApiRole apiRole : apiResourceRootV3.getClustersResource().getServicesResource(getName(cluster))
.getRolesResource(apiService.getName()).readRoles()) {
if (apiRole.getRoleState().equals(ApiRoleState.STARTED)) {
servicesNotStarted.remove(apiRole.getName());
}
}
}
You would just need to limit this to the Flume service.
https://cloudera.github.io/cm_api/javadoc/5.11.0/index.html

Related

Elastic APM show total number of SQL Queries executed on .Net Core API Endpoint

Currently have Elastic Apm setup with: app.UseAllElasticApm(Configuration); which is working correctly. I've just been trying to find a way to record exactly how many SQL Queries are run via Entity Framework for each transaction.
Ideally when viewing the Apm data in Kibana the metadata tab could just include an EntityFramework.ExecutedSqlQueriesCount.
Currently on .Net Core 2.2.3

One thing you can use is the Filter API for this.
With that you have access to all transactions and spans before they are sent to the APM Server.
You can't run through all the spans on a given transaction, so you need some tweaking - for this I use a Dictionary in my sample.
var numberOfSqlQueries = new Dictionary<string, int>();
Elastic.Apm.Agent.AddFilter((ITransaction transaction) =>
{
if (numberOfSqlQueries.ContainsKey(transaction.Id))
{
// We make an assumption here: we assume that all SQL requests on a given transaction end before the transaction ends
// this in practice means that you don't do any "fire and forget" type of query. If you do, you need to make sure
// that the numberOfSqlQueries does not leak.
transaction.Labels["NumberOfSqlQueries"] = numberOfSqlQueries[transaction.Id].ToString();
numberOfSqlQueries.Remove(transaction.Id);
}
return transaction;
});
Elastic.Apm.Agent.AddFilter((ISpan span) =>
{
// you can't relly filter whether if it's done by EF Core, or another database library
// but you have all sorts of other info like db instance, also span.subtype and span.action could be helpful to filter properly
if (span.Context.Db != null && span.Context.Db.Instance == "MyDbInstance")
{
if (numberOfSqlQueries.ContainsKey(span.TransactionId))
numberOfSqlQueries[span.TransactionId]++;
else
numberOfSqlQueries[span.TransactionId] = 1;
}
return span;
});
Couple of thing here:
I assume you don't do "fire and forget" type of queries, if you do, you need to handle those extra
The counting isn't really specific to EF Core queries, but you have info like db name, database type (mssql, etc.) - hopefully based on that you'll be able filter the queries you want.
With transaction.Labels["NumberOfSqlQueries"] we add a label to the given transction, and you'll be able to see this data on the transaction in Kibana.

Changing the GemFire query ResultSender batch size

I am experiencing a performance issue related to the default batch size of the query ResultSender using client/server config. I believe the default value is 100.
If I run a simple query to get keys (with some order by columns due to the PARTITION Region type), this default batch size causes too many chunks being sent back for even 1000 records. In my tests, even the total query time is only less than 100 ms, however, the app takes more than 10 seconds to process those chunks.

Reading between the lines in your problem statement, it seems you are:
Executing an OQL query on a PARTITION Region (PR).
Running the query inside a Function as recommended when executing queries on a PR.
Sending batch results (as opposed to streaming the results).
I also assume since you posted exclusively in the #spring-data-gemfire channel, that you are using Spring Data GemFire (SDG) to:
Execute the query (e.g. by using the SDG GemfireTemplate; Of course, I suppose you could also be using the GemFire Query API inside your Function directly, too)?
Implemented the server-side Function using SDG's Function annotation support?
And, are possibly (indirectly) using SDG's BatchingResultSender, as described in the documentation?
NOTE: The default batch size in SDG is 0, NOT 100. Zero means stream the results individually.
Regarding #2 & #3, your implementation might look something like the following:
#Component
class MyApplicationFunctions {
#GemfireFunction(id = "MyFunction", batchSize = "1000")
public List<SomeApplicationType> myFunction(FunctionContext functionContext) {
RegionFunctionContext regionFunctionContext =
(RegionFunctionContext) functionContext;
Region<?, ?> region = regionFunctionContext.getDataSet();
if (PartitionRegionHelper.isPartitionRegion(region)) {
region = PartitionRegionHelper.getLocalDataForContext(regionFunctionContext);
}
GemfireTemplate template = new GemfireTemplate(region);
String OQL = "...";
SelectResults<?> results = template.query(OQL); // or `template.find(OQL, args);`
List<SomeApplicationType> list = ...;
// process results, convert to SomeApplicationType, add to list
return list;
}
}
NOTE: Since you are most likely executing this Function "on Region", the FunctionContext type will actually be a RegionFunctionContext in this case.
The batchSize attribute on the SDG #GemfireFunction annotation (used for Function "implementations") allows you to control the batch size.
Of course, instead of using SDG's GemfireTemplate to execute queries, you can, of course, use the GemFire Query API directly, as mentioned above.
If you need even more fine grained control over "result sending", then you can simply "inject" the ResultSender provided by GemFire to the Function, even if the Function is implemented using SDG, as shown above. For example you can do:
#Component
class MyApplicationFunctions {
#GemfireFunction(id = "MyFunction")
public void myFunction(FunctionContext functionContext, ResultSender resultSender) {
...
SelectResults<?> results = ...;
// now process the results and use the `resultSender` directly
}
}
This allows you to "send" the results however you see fit, as required by your application.
You can batch/chunk results, stream, whatever.
Although, you should be mindful of the "receiving" side in this case!
The 1 thing that might not be apparent to the average GemFire user is that GemFire's default ResultCollector implementation collects "all" the results first before returning them to the application. This means the receiving side does not support streaming or batching/chunking of the results, allowing them to be processed immediately when the server sends the results (either streamed, batched/chunked, or otherwise).
Once again, SDG helps you out here since you can provide a custom ResultCollector on the Function "execution" (client-side), for example:
#OnRegion("SomePartitionRegion", resultCollector="myResultCollector")
interface MyApplicationFunctionExecution {
void myFunction();
}
In your Spring configuration, you would then have:
#Configuration
class ApplicationGemFireConfiguration {
#Bean
ResultCollector myResultCollector() {
return ...;
}
}
Your "custom" ResultCollector could return results as a stream, a batch/chunk at a time, etc.
In fact, I have prototyped a "streaming" ResultCollector implementation that will eventually be added to SDG, here.
Anyway, this should give you some ideas on how to handle the performance problem you seem to be experiencing. 1000 results is not a lot of data so I suspect your problem is mostly self-inflicted.
Hope this helps!

John,
Just to clarify, I use client/server topology(actually wan, but that is not important in here). My client is a spring boot web app which has kendo grid as ui. Users can filter/sort on any combination of the columns, which will be passed to the spring boot app for generating dynamic OQL and create the pagination. Till now, except for being dynamic, my OQL queries are quite straight forward. I do not want to introduce server side functions due to the complexity of our global deployment process. But I can if you think that is something I have to do.
Again, thanks for your answers.

Is there a way to get the number of registered nodes by Selenium Grid other than with http://localhost:4444/grid/console

In order to run multiple tests in parallel, I would like to know how many nodes are already running at some point.
I have looked into many posts on this subject, but all of them include using http://localhost:4444/grid/console : I don't want to check this page.
I was thinking about sending a message to the hub each time a node is created. so the hub increments its count. But I can't find a way to do that.
Does anyone have a different solution ? Maybe using seleniumgrid parameters or command, I'm surprised this number is not stored somewhere?

The selenium grid has an API. You can do this:
http://hub_ip_address:4444/grid/api/hub
and parse the json it returns for "slotCounts"
{
"success":true,
"capabilityMatcher":"org.openqa.grid.internal.utils.DefaultCapabilityMatcher",
"newSessionWaitTimeout":-1,
"throwOnCapabilityNotPresent":true,
"registry":"org.openqa.grid.internal.DefaultGridRegistry",
"cleanUpCycle":5000,
"custom":{
},
"host":"XX.XXX.XX.XXX",
"maxSession":10,
"servlets":[
"ConsoleServlet"
],
"withoutServlets":[
],
"browserTimeout":0,
"debug":false,
"port":4444,
"role":"hub",
"timeout":300000,
"enablePassThrough":true,
"newSessionRequestCount":0,
"slotCounts":{
"free":9,
"total":12
}
}

Is it possible to pass to lettuce redis library MasterSlave connection only slaves uris?

my aim is to add only slaves URIs, because master is not available in my case. But lettuce library returns
io.lettuce.core.RedisException: Master is currently unknown: [RedisMasterSlaveNode [redisURI=RedisURI [host='127.0.0.1', port=6382], role=SLAVE], RedisMasterSlaveNode [redisURI=RedisURI [host='127.0.0.1', port=6381], role=SLAVE]]
So the question is: Is it possible so avoid this exception somehow? Maybe configuration. Thank you in advance
UPDATE: Forgot to say that after borrowing object from pool I set connection.readFrom(ReadFrom.SLAVE) before running commands.
GenericObjectPoolConfig config = fromRedisConfig(properties);
List<RedisURI> nodes = new ArrayList<>(properties.getUrl().length);
for (String url : properties.getUrl()) {
nodes.add(RedisURI.create(url));
}
return ConnectionPoolSupport.createGenericObjectPool(
() -> MasterSlave.connect(redisClient, new ByteArrayCodec(), nodes), config);

The problem was that I tried to set data, which is possible only with master node. So there is no problem with MasterSlave. Get data works perfectly

GemFire getRegion() returns null whereas OQL query gives result

I am using Pivotal GemFire 9.0.0 with 1 Locator and 1 Server. The Server has a Region called "submissions", like below -
<gfe:replicated-region id="submissionsRegion" name="submissions"
statistics="true" template="replicateRegionTemplate">
...
</gfe:replicated-region>
I am getting Region as null when executing the following code -
Region<K, V> region = clientCache.getRegion("submissions");
Surprisingly, the same ClientCache returns all the records when I query using OQL and QueryService as shown below -
String queryString = "SELECT * FROM /submissions";
QueryService queryService = clientCache.getQueryService();
Query query = queryService.newQuery(queryString);
SelectResults results = (SelectResults) query.execute();
I am initializing my ClientCache like this -
ClientCache clientCache = new ClientCacheFactory()
.addPoolLocator("localhost", 10479)
.set("name", "MyClientCache")
.set("log-level", "error")
.create();
I am really baffled by this. Any pointer or help would be great.

You need to configure your ClientCache (either through a cache.xml or pure GemFire API) with the regions as well. Using your example:
ClientRegionFactory regionFactory = clientCache.createClientRegionFactory(ClientRegionShortcut.PROXY);
Region region = regionFactory.create("submissions");
The ClientRegionShortcut.PROXY is used just for the sake of simplicity, you should use the shortcut that meets your needs.
The OQL works as expected because you are obtaining the QueryService through the ClientCache.getQueryService() method (instead of ClientCache.getLocalQueryService()), so the query is actually executed on Server Side.
You can get more information about how to configure the Client/Server topology in
Client/Server Configuration.
Hope this helps.
Cheers.

Yes, you need to "define" the corresponding client-side Region, matching the server-side REPLICATE Region by name (i.e. "submissions"). Actually this is a requirement independent of the server Regions' DataPolicy type (e.g. REPLICATE or PARTITION).
This is necessary since not every client wants to know about or even needs have data/events from every possible server Region. Of course, this is also configurable through subscription and "Interests Registration" (with Client/Server Event Messaging, or alternatively, CQs).
Anyway, you can completely avoid the use of the GemFire API directly or even GemFire's native cache.xml (highly recommend avoiding) by using either SDG's XML namespace...
<gfe:client-cache properties-ref="gemfireProperties" ... />
<gfe:client-region id="submissions" shortcut="PROXY"/>
Or by using Spring JavaConfig with SDG's API...
#Configuration
class GemFireConfiguration {
Properties gemfireProperties() {
Properties gemfireProperties = new Properties();
gemfireProperties.setProperty("log-level", "config");
...
return gemfireProperties;
}
#Bean
ClientCacheFactoryBean gemfireCache() {
ClientCacheFactoryBean gemfireCache = new ClientCacheFactoryBean();
gemfireCache.setClose(true);
gemfireCache.setProperties(gemfireProperties());
...
return gemfireCache;
}
#Bean(name = "submissions");
ClientRegionFactoryBean submissionsRegion(GemFireCache gemfireCache) {
ClientRegionFactoryBean submissions = new ClientRegionFactoryBean();
submissions.setCache(gemfireCache);
submissions.setClose(false);
submissions.setShortcut(ClientRegionShortcut.PROXY);
...
return submissions;
}
...
}
The "submissions" Region can be wrapped with SDG's GemfireTemplate, which will handle getting the "correct" QueryService on your behalf when running queries using the find(..) method.
Of course, you may be interested in making your client "submissions" Region a CACHING_PROXY" too. Of course, you will then need to register "interests" in the keys or data of interests. CQs are the best way to do this as it uses query criteria to define the data of "interests".
CACHING_PROXY is exactly as it sounds, caching data locally in the client based on the interests policies. This also gives you the ability to use the "local" QueryService to query data locally, avoiding the network hop.
Anyway, many options here.
Cheers,
John

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Cloudera Manager API to fetch the list of Flume agents - api

Related

Elastic APM show total number of SQL Queries executed on .Net Core API Endpoint

Changing the GemFire query ResultSender batch size

Is there a way to get the number of registered nodes by Selenium Grid other than with http://localhost:4444/grid/console

Is it possible to pass to lettuce redis library MasterSlave connection only slaves uris?

GemFire getRegion() returns null whereas OQL query gives result

Categories

Resources