SPARQLWRapper provides us a user-friendly interface for issuing SPARQL queries, for example, you can execute a SPARQL query simply with the following two lines of code.
sparql.setQuery(query)
results = sparql.query().convert()
However, one thing I notice is that, sometimes the query might crash the sparql endpoint, however, sparql.query().convert() does not raise any exception, it just returns an empty set as answer. And for the next query, because the sparql endpoint was crashed by the previous query, it will report connection refused. This reported error does not really help us to know what the real problem is; what we do want to know is what is happening when the server is crashed. I expect sparql.query().convert() to raise some exception if the sparql endpoint is crashed during processing it, but unfortunately, it does not. So is there any way for me to catch such exception when using SPARQLWrapper?
Related
I have a script I'm trying to write to process a large amount of data. There are, of course, potential for errors. In the script I need to connect to databases. If the script encounters an error, the code never reaches the point where the connection to the database is terminated. I'd like to have something in my python code that will recognize an error occurs, not matter where, and if nothing else at least close those databases. Does something like this exist? I know I can use try/except, but that would only work if I know exactly where I could get the error? I'm basically looking for a catchall to close my databases in the event an error occurs in a location I didn't anticipate.
To run certain cleanup code even if there is an error, use the finally block:
try:
# do stuff, possible exception
except:
# run this if exception
finally:
# always run this, even if exception
Reference: https://docs.python.org/3/tutorial/errors.html#defining-clean-up-actions
I'm trying to write a script to test all DataSources of a WebSphere Cell/Node/Cluster. While this is possible from the Admin Console a script is better for certain audiences.
So I found the following article from IBM https://www.ibm.com/support/knowledgecenter/en/SSAW57_8.5.5/com.ibm.websphere.nd.multiplatform.doc/ae/txml_testconnection.html which looks promising as it describles exactly what I need.
After having a basic script like:
ds_ids = AdminConfig.list("DataSource").splitlines()
for ds_id in ds_ids:
AdminControl.testConnection(ds_id)
I experienced some undocumented behavior. Contrary to the article above the testConnection function does not always return a String, but may also throw a exception.
So I simply use a try-catch block:
try:
AdminControl.testConnection(ds_id)
except: # it actually is a com.ibm.ws.scripting.ScriptingException
exc_type, exc_value, exc_traceback = sys.exc_info()
now when I print the exc_value this is what one gets:
com.ibm.ws.scripting.ScriptingException: com.ibm.websphere.management.exception.AdminException: javax.management.MBeanException: Exception thrown in RequiredModelMBean while trying to invoke operation testConnection
Now this error message is always the same no matter what's wrong. I tested authentication errors, missing WebSphere Variables and missing driver classes.
While the Admin Console prints reasonable messages, the script keeps printing the same meaningless message.
The very weird thing is, as long as I don't catch the exception and the script just exits by error, a descriptive error message is shown.
Accessing the Java-Exceptions cause exc_value.getCause() gives None.
I've also had a look at the DataSource MBeans, but as they only exist if the servers are started, I quickly gave up on them.
I hope someone knows how to access the error messages I see when not catching the Exception.
thanks in advance
After all the research and testing AdminControl seems to be nothing more than a convinience facade to some of the commonly used MBeans.
So I tried issuing the Test Connection Service (like in the java example here https://www.ibm.com/support/knowledgecenter/en/SSEQTP_8.5.5/com.ibm.websphere.base.doc/ae/cdat_testcon.html
) directly:
ds_id = AdminConfig.list("DataSource").splitlines()[0]
# other queries may be 'process=server1' or 'process=dmgr'
ds_cfg_helpers = __wat.AdminControl.queryNames("WebSphere:process=nodeagent,type=DataSourceCfgHelper,*").splitlines()
try:
# invoke MBean method directly
warning_cnt = __wat.AdminControl.invoke(ds_cfg_helpers[0], "testConnection", ds_id)
if warning_cnt == "0":
print = "success"
else:
print "%s warning(s)" % warning_cnt
except ScriptingException as exc:
# get to the root of all evil ignoring exception wrappers
exc_cause = exc
while exc_cause.getCause():
exc_cause = exc_cause.getCause()
print exc_cause
This works the way I hoped for. The downside is that the code gets much more complicated if one needs to test DataSources that are defined on all kinds of scopes (Cell/Node/Cluster/Server/Application).
I don't need this so I left it out, but I still hope the example is useful to others too.
I'm executing a query against my SPARQL endpoint and getting a strange error (see below). But if I add a pretty low LIMIT in the number of results my query returns a result as expected. As a test I queried DBpedia and had no issues even for retrieving millions of results. I suspect that it has something to do with some of the data that is being returned from my endpoint is causing the issue. Any ideas on how to resolve this?
QueryExecution qe = QueryExecutionFactory.sparqlService(sparqlEndpoint,
sparqlQuery);
ResultSet results = qe.execSelect();
Exception in thread "main" org.apache.jena.atlas.AtlasException:
org.apache.http.TruncatedChunkException: Truncated chunk ( expected
size: 4096; actual size: 3151)
The other end is sending broken data (at the HTTP level).
Check the service at sparqlEndpoint.
Using the Java SDK I am creating a load job for just a single record with a fairly complicated schema. When monitoring the status of the load job, it takes a surprisingly long time (but perhaps this is due to working out the schema), but then says:
11:21:06.975 [main] INFO xxx.GoogleBigQuery - Job status (21694ms) create_scans_1384744805079_172221126: DONE
11:24:50.618 [main] ERROR xxx.GoogleBigQuery - Job create_scans_1384744805079_172221126 caused error (invalid) with message
Too many errors encountered. Limit is: 0.
11:24:50.810 [main] ERROR xxx.GoogleBigQuery - {
"message" : "Too many errors encountered. Limit is: 0.",
"reason" : "invalid"
?}
BTW - how do I tell the job that it can have more than zero errors using Java?
This load job does not appear in the list of recent jobs in the console, and as far as I can see, none of the Java objects contains any more details about the actual errors encountered. So how can I pro-grammatically find out what is going wrong? All I can find is:
if (err != null) {
log.error("Job {} caused error ({}) with message\n{}", jobID, err.getReason(), err.getMessage());
try {
log.error(err.toPrettyString());
}
...
In general I am having a difficult time finding good documentation for some of these things and am working it out by trial and error and short snippets of code found on here and older groups. If there is a better source of information than the getting started guides, then I would appreciate any pointers to that information. The Javadoc does not really help and I cannot find any complete examples of loading, querying, testing for errors, cataloging errors and so on.
This job is submitted via a NEWLINE_DELIMITIED_JSON record, supplied to the job via:
InputStream dummy = getClass().getResourceAsStream("/googlebigquery/xxx.record");
final InputStreamContent jsonIn = new InputStreamContent("application/octet-stream", dummy);
createTableJob = bigQuery.jobs().insert(projectId, loadJob, jsonIn).execute();
My authentication and so on seems to work correctly as separate Java code to list the projects, and the datasets in the project all works correctly. So I just need help in working what the actual error is - does it not like the schema (I have records nested within records for instance), or does it think that there is an error in the data I am submitting.
Thanks in advance for any help. The job number cited above is an actual failed load job if that helps any Google staffers who might read this.
It sounds like you have a couple of questions, so I'll try to address them all.
First, the way to get the status of the job that failed is to call jobs().get(jobId), which returns a job object that has an errorResult object that has the error that caused the job to fail (e.g. "too many errors"). The errorStream list is a lost of all of the errors on the job, which should tell you which lines hit errors.
Note if you have the job id, it may be easier to use bq to lookup the job -- you can run bq show <job_id> to get the job error information. If you add the --format=prettyjson it will print out all of the information in the job.
A hint you also might want to consider is to supply your own job id when you create the job -- then even if there is an error starting the job (i.e. the insert() call fails, perhaps due to a network error) you can look up the job to see what actually happened.
To tell BigQuery that some errors are allowed during import, you can use the maxBadResults setting in the load job. See https://developers.google.com/resources/api-libraries/documentation/bigquery/v2/java/latest/com/google/api/services/bigquery/model/JobConfigurationLoad.html#getMaxBadRecords().
Let's suppose I want to inform the application about what happened / returned the SQL server. Let's have this code block:
BEGIN TRY
-- Generate divide-by-zero error.
SELECT 1/0;
END TRY
BEGIN CATCH
SELECT
ERROR_NUMBER() AS ErrorNumber,
ERROR_SEVERITY() AS ErrorSeverity,
ERROR_STATE() as ErrorState,
ERROR_PROCEDURE() as ErrorProcedure,
ERROR_LINE() as ErrorLine,
ERROR_MESSAGE() as ErrorMessage;
END CATCH;
GO
and Let's have this code block:
SELECT 1/0;
My question is:
Both return the division by zero error. What I don't understand clearly is that why I should surround it with the try catch clausule when I got that error in both cases ?
Isn't it true that this error will be in both cases propagated to the client application ?
Yes, the only reason for a Try Catch, (as in ordinary code) is if you can "Handle" the error, i.e., you can correct for the error and successfully complete whatever function the procedure was tasked to do, or, if want to do something with the error before returning it to the client (like modify the message, or store it in an error log table or send someone an email, etc. (althought i'd prefer to do most of those things from the DAL layer )
Technically, however, the catch clause is not returning an error. it is just returning a resultset with error information. This is very different, as it will not cause an exception in client code. This is why your conclusion is correct, ou should just let the original error propagate directly back to the client code.
As you have written it, no error will be returned to the client. As in ordinary code, if you do not handle (correct for) the error in a catch clause, you should always rethrow it (in sql that means Raiserror function) in a catch clause. What you have done above, in general is bad, the client code may or may not have any capability to properly deal with
a completely different recordset (one with error info) from what it was expecting. Some calls (like Inserts updates or deletes) may not be expecting or looking for a returned recordset at all... Instead, if you want or need to do something with the error in the procedure before returning it to the client, use Raiserror() function
BEGIN TRY
-- Generate divide-by-zero error.
SELECT 1/0;
END TRY
BEGIN CATCH
-- Other code to do logging, whatever ...
Raiserror(ERROR_MESSAGE(), ERROR_NUMBER(), ERROR_STATE() )
END CATCH;
Both return the division by zero
error.
Yes, but using different return paths.
The difference is that in the first example, you are anticipating the error and dealing with it in some way. The error enters the application as a regular result - it is not propagated via the error handling mechanism. In fact, if the application doesn't look specifically as the shape of the result, then it may be unaware that an error has occurred.
In the second instance, the error will propagate to your application typically via an error reporting mechanism, such as an exception. This will abort the operation. How big an impact this has will depend upon the application's exception handling. Maybe it will abort just the current operation, or the entire app may fail, depending upon the app's design and tolerance to exceptions.
You choose what makes sense for your application. Can the app meaningfully handle the error - if so, propagate the error (2nd example), or is it best handled in the query (1st example), with errors being "smoothed over" by returning default results, such as an empty rowset.
Try Catch is not as useful when all you have in the try portion is a select. However if you have a transaction with multiple steps, the catch block is used to roll all the steps back and possibly to record details about what caused the problem in a log. But the most important part is the rollback to ensure data integrity.
If you are creating dynamic SQl within the Try block, it is also helpful to log the dynamic SQl variable that failed and any parameters passed in. This can help resolve some hard-to-catch, "we don't have any idea what the user actually did to cause the problem" errors.
No, by executing Select 1/0 in a TRY/CATCH block the select statement returns nothing and the select statement in the catch block displays the error details gracefully. The query completes successfully - no errors are thrown.
If you run Select 1/0 on it's own the query does not complete successfully - it bombs out with an error.
Using a catch block within SQL gives you the chance to do something about it there and then not just let the error bubble up to the application.
The only reason you see the error details is because you are selecting them. If there was no code within the Catch block you wouldn't see any error information.
Using the first method, you wont get the error from SQL Server directly
The second method may stop the execution of the statements that follow it
So it is better you catch it in advance