How to check if Cloudera services like Hive and Impala are running through Java code? - hive

I want to run some Hive queries and then collect different metrics, like HDFS bytes read/written. For this I have written Java code. But before running the code, I want to check whether Cloudera services like Hive, Impala, and YARN are running. If they are running, the code should execute; otherwise it should just exit. Is there any way to check the status of these services from Java code?

Sampson S gave you a correct answer, but it's not trivial to implement. The information is available via the REST API of Cloudera Manager (CM). You would have your Java program make a GET request to CM, parse the JSON result, and use that to make a decision. Alternatively, you could look at the code behind their APIs to make a more direct query.
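For illustration, here is a minimal sketch of that approach using only the JDK. The CM host, port, API version, cluster name, and credentials are assumptions you would have to adapt, and the JSON is scanned naively rather than parsed with a real JSON library:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.Base64;

    public class ServiceCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical CM host, API version, cluster name and credentials.
            URL url = new URL("http://cm-host:7180/api/v11/clusters/MyCluster/services");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes("UTF-8"));
            conn.setRequestProperty("Authorization", "Basic " + auth);

            StringBuilder json = new StringBuilder();
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) json.append(line);
            }

            // CM reports a serviceState per service; this is a naive string scan,
            // in practice you would parse the JSON properly.
            if (json.toString().contains("STOPPED")) {
                System.err.println("Some services are not running -- exiting.");
                System.exit(1);
            }
            // ... proceed to run the Hive queries here ...
        }
    }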
But I think you should ask "Why?" What are you trying to accomplish? Are you replicating functionality already provided by CM? When asking questions here on SO, it's always helpful to provide some context. It seems like you may be new to the environment; perhaps CM already does what you want.

Related

Does Informatica PowerCenter provide an API to access session logs

Question - Does Informatica PowerCenter provide an API to access session logs? I believe no, but I wanted to throw it out to the forum to be sure.
Objective - I actually want to extract session logs, process them through Logstash, and perform reactive analytics periodically.
Alternate - The same could be solved using a Logstash input plugin for Informatica, but I did not find one either.
Usage - This will be used to determine common causes, analyze cache usage at the session level, throughput, and any performance bottlenecks.
You can call Informatica Web Services' getSessionLog operation. Here's a sample blog post with details: http://www.kpipartners.com/blog/bid/157919/Accessing-Informatica-Web-Services-from-3rd-Party-Apps
I suppose the correct answer is 'yes', since there is a command-line tool to convert log files to txt or even XML format.
The tool for session/workflow logs is called infacmd with the 'getsessionlog' argument. You can look it up in the help section of your PowerCenter clients or here:
https://kb.informatica.com/proddocs/Product%20Documentation/5/IN_101_CommandReference_en.pdf
That has always been enough for my needs.
But there is more to look into: when you run this command-line tool (which is really a BAT file), a java.exe does the bulk of the processing in a subprocess. The JAR files used by that process could potentially be used directly by somebody else, but I don't know whether that has been documented anywhere publicly...?
Perhaps someone else knows the answer to that.
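If you do need to drive that tool from Java rather than a shell script, a rough sketch with ProcessBuilder is below. The flag names and values shown are placeholders; take the exact arguments from the Command Reference linked above for your PowerCenter version:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class SessionLogFetcher {
        public static void main(String[] args) throws Exception {
            // Hypothetical invocation -- substitute the real flags from the
            // infacmd Command Reference for your version.
            ProcessBuilder pb = new ProcessBuilder(
                    "infacmd.bat", "getsessionlog",
                    "-dn", "MyDomain", "-un", "user", "-pd", "password",
                    "-is", "MyIntegrationService", "-fd", "MyFolder",
                    "-wf", "MyWorkflow", "-ss", "MySession",
                    "-fm", "xml", "-lo", "C:\\logs\\session_log");
            pb.redirectErrorStream(true);
            Process p = pb.start();
            try (BufferedReader out = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = out.readLine()) != null) System.out.println(line);
            }
            System.exit(p.waitFor()); // non-zero exit code means the fetch failed
        }
    }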

Cosmos on WireCloud

Taking the public documentation as a reference (https://wirecloud.conwet.etsiinf.upm.es/slides/1.2_Integration%20with%20other%20GEs.html#slide16), I wonder whether at this point there has been any progress on connecting WireCloud and Cosmos in order to retrieve historical data and visualise it in mashup setups.
If not, could you give me any directions so I can try implementing something around this?
Note: I have already checked some of the available documentation, and it looks to me like my desired feature could be tackled by a simple Python implementation that retrieves HDFS files and converts them to the appropriate NGSI format. Is that right?
Nevertheless, I believe that would be a dirty mechanism. What would be the recommended way?
I honestly hope I am not cheating by answering my own question and marking it as correct, but I would like to leave a record of a solution for those folks who might be experiencing the same troubles as me.
I have developed a quick-and-dirty mechanism to retrieve HDFS files in NGSI format, so we can retrieve historical data like we do with Orion widgets.
https://github.com/netzahdzc/cloudCos
Please note that this is very much a work in progress, so there is some hardcoding that I hope to eventually fix.
Official Cosmos-WireCloud integration is currently not available, although there are third-party widgets using Cosmos out there.
In my opinion, the best option for accessing the HDFS filesystem is using WebHDFS (you will need to add a FIWARE token to the request for authentication).
It should also be possible to connect to Hive (see this ticket for more info).
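As a rough sketch of the WebHDFS route, assuming a Cosmos instance that exposes WebHDFS/HttpFS and accepts a FIWARE OAuth2 token in the X-Auth-Token header (the host, port, HDFS path, and token below are all placeholders):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class WebHdfsRead {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint, path and token -- adapt to your Cosmos account.
            URL url = new URL("http://cosmos.example.org:14000/webhdfs/v1/user/me/data.txt?op=OPEN&user.name=me");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("X-Auth-Token", "your-fiware-oauth2-token");
            conn.setInstanceFollowRedirects(true); // WebHDFS may redirect OPEN to a datanode

            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // each line would then be mapped to NGSI
                }
            }
        }
    }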

How to access results of Sonar metrics for use with applications like PowerPivot

I'm trying to run a number of applications with known failure rates through Sonar, in the hope of deciding which metrics are most valuable in determining whether a particular application will fail. Ultimately I'll be making some sort of algorithm that looks at the outputs of whatever metrics I'm using and generates a score from 1 to 100. I've put about 21 applications through Sonar, and the results have been stored in a MySQL database. I originally planned to use PowerPivot to find relationships in the data, but it seems like the formatting of the tables doesn't lend itself well to that. Other questions on Stack Overflow have told me that Sonar's tables are unformatted and that I should instead use the Web Service API to get the information. I'm unfamiliar with APIs and was unsuccessful in trying to do what I wanted by looking at Sonar's API documentation.
From an answer to another question:
http://nemo.sonarsource.org/api/timemachine?resource=org.apache.cxf:cxf&format=csv&metrics=ncloc,violations_density,comment_lines_density,public_documented_api_density,duplicated_lines_density,blocker_violations,critical_violations,major_violations,minor_violations
This looks very similar to what I'd like to have, except that I'm only looking at each application once (I'm analyzing a sample of all the live applications on a grid), which means the timemachine service isn't really what I'm looking for. Would it be possible to generate a similar table, except that instead of the stats for a particular application per date, it showed the statistics for an application and all of its classes, etc.?
If you're not familiar with the WS API, you can also create your own Sonar plugin to achieve whatever you want: it is written in Java and executes on every analysis you run. This way, in the code of this custom plugin, you can do whatever you want: flush the metrics you need to an output file, push them into a third-party system, etc.
Just take a look at how to write a plugin (most probably you will create a Decorator). There are also concrete examples to get you started faster.
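For example, here is a minimal sketch of such a Decorator, written against the old open-source Sonar plugin API (class names may differ between versions). It just dumps a couple of measures for every resource visited during analysis:

    import org.sonar.api.batch.Decorator;
    import org.sonar.api.batch.DecoratorContext;
    import org.sonar.api.measures.CoreMetrics;
    import org.sonar.api.measures.Measure;
    import org.sonar.api.resources.Project;
    import org.sonar.api.resources.Resource;

    public class MetricsDumpDecorator implements Decorator {

        public boolean shouldExecuteOnProject(Project project) {
            return true; // run on every analysis
        }

        public void decorate(Resource resource, DecoratorContext context) {
            Measure ncloc = context.getMeasure(CoreMetrics.NCLOC);
            Measure violations = context.getMeasure(CoreMetrics.VIOLATIONS);
            if (ncloc != null && violations != null) {
                // Printing here for brevity; a real plugin would append to a CSV file.
                System.out.println(resource.getKey() + ";" + ncloc.getValue() + ";" + violations.getValue());
            }
        }
    }

Writing one CSV row per project and per class that way would give you exactly the flat, one-row-per-entity shape that PowerPivot works well with.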

How do I access SQL from XPages

What is the process to access data from a SQL data source and have it fill in a list box control so that the user may select one of the values?
I have been given the name of the database and server, the login ID and password.
Code samples would really be appreciated, as I have never done any SQL coding.
The latest Extension Library on OpenNTF (extlib.openntf.org) has a whole bunch of relational database extensions.
You'll need to get the JDBC drivers for whatever SQL server you're going to be accessing, and then take a look at the ExtLib demo application for how to create the JDBC connector from your application. Once the connector is in place, you can then just use the new controls in ExtLib to easily create a view pane, etc.
You will also need more than the SQL server, username, and password: you'll need to find out which tables you'll be accessing so that you can reference them from your XPages application.
I've created a video showing JDBC access from XPages: http://www.youtube.com/watch?v=p6oRCsTsVqc
Wait for the book about the ExtLib that will be released soon. I know Jeremy Hodge wrote the chapter, so you might be able to get some info from him.
From an answer I gave earlier: you might want to check out the blog post announcing the JDBC support. It has an excellent video explanation and a link to a slide deck.
Also, take a look at XPages101 lesson 61. It's paid-for content, but well worth it if you're serious about XPages development.
If you want to combine Upgrade Pack 1 (UP1) with the Extension Library JDBC parts, then make sure to use the Extension Library that exactly matches the UP1 version. This is version 853-20111215 of the Extension Library. Then you can use the update site method to deploy only the experimental parts of the Extension Library (com.ibm.xsp.extlibx.feature_8.5.3.20111215-0914.jar).
For newer releases of the Extension Library, things might (will) have changed, so that UP1 and the Extension Library cannot work together.
When UP2 is released, you will need to remove the Extension Library package and deploy UP2. At that point in time, UP2 might contain the JDBC support.
Roy,
As the previous posters said, the Extension Library stuff will make it a little more "drag and drop", but you can use a regular JDBC connection to get the data you want. It's pretty simple, but a lot more code than using Domino as a backend. You might want to look at this John Mackey blog post about doing a very similar thing: http://www.jmackey.net/groupwareinc/johnblog/johnblog.nsf/d6plinks/GROC-7G9GT4
Keep in mind that you need the actual Extension Library for this. The upgrade pack does not contain the JDBC stuff.
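For reference, a bare-bones sketch of that plain-JDBC approach. The driver class, connection string, credentials, and table/column names are placeholders for whatever your DBA gives you:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.ArrayList;
    import java.util.List;

    public class ListBoxValues {
        public static List<String> fetchValues() throws Exception {
            // Placeholder driver/URL/credentials -- substitute your SQL server's.
            Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
            List<String> values = new ArrayList<String>();
            Connection con = DriverManager.getConnection(
                    "jdbc:sqlserver://sqlhost:1433;databaseName=MyDb", "user", "password");
            try {
                PreparedStatement ps = con.prepareStatement("SELECT name FROM SomeTable ORDER BY name");
                ResultSet rs = ps.executeQuery();
                while (rs.next()) {
                    values.add(rs.getString("name")); // each row becomes one list box entry
                }
                rs.close();
                ps.close();
            } finally {
                con.close();
            }
            return values; // bind this list to the XPages list box's values
        }
    }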
Edit:
Keep in mind that if you don't need "live" data access, and the information you want is fairly static, you could always just use a LotusScript agent to pull the data down into Notes documents. Run that once a day or whatever. No fancy XPages stuff needed. That's fairly common coding and practice, with examples available.
Then just have the list box pull from the documents you brought down.

Shell Scripting and InterSystems Caché: Extracting Information?

I would like to be able to execute a script to pull out the current Caché process information. Has anybody done much scripting with Caché? Is there an easier way to log the process information? The end goal is to present this information in a form that I can feed into Splunk.
You are trying to solve an easy problem the hard way. Just use the built-in SNMP provider.
The documentation for Caché 2008.2.6 contains a document, Monitoring Caché Using SNMP.
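If you then want to pull the SNMP values from a script or program, here is a minimal sketch using the open-source SNMP4J library. The host and community string are placeholders, and the starting OID should be confirmed against the MIB files that ship with Caché:

    import org.snmp4j.CommunityTarget;
    import org.snmp4j.PDU;
    import org.snmp4j.Snmp;
    import org.snmp4j.event.ResponseEvent;
    import org.snmp4j.mp.SnmpConstants;
    import org.snmp4j.smi.GenericAddress;
    import org.snmp4j.smi.OID;
    import org.snmp4j.smi.OctetString;
    import org.snmp4j.smi.VariableBinding;
    import org.snmp4j.transport.DefaultUdpTransportMapping;

    public class CacheSnmpPoll {
        public static void main(String[] args) throws Exception {
            Snmp snmp = new Snmp(new DefaultUdpTransportMapping());
            snmp.listen();

            CommunityTarget target = new CommunityTarget();
            target.setCommunity(new OctetString("public"));                // placeholder community
            target.setAddress(GenericAddress.parse("udp:cachehost/161"));  // placeholder host
            target.setVersion(SnmpConstants.version2c);
            target.setRetries(2);
            target.setTimeout(1500);

            PDU pdu = new PDU();
            // Starting OID is an assumption -- confirm against the Caché MIB.
            pdu.add(new VariableBinding(new OID("1.3.6.1.4.1.16563")));
            pdu.setType(PDU.GETNEXT);

            ResponseEvent event = snmp.send(pdu, target);
            if (event.getResponse() != null) {
                // One "oid = value" line per binding -- easy to append to a file Splunk watches.
                for (VariableBinding vb : event.getResponse().getVariableBindings()) {
                    System.out.println(vb.getOid() + " = " + vb.getVariable());
                }
            }
            snmp.close();
        }
    }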