What is the IN_HAS_NEXT state for GraphDB queries?

I noticed that I had a query stuck in the IN_HAS_NEXT state and I'm curious what this status means.
The state is listed in the GraphDB SE 7.0 documentation,
but I'm not entirely sure what it amounts to.

IN_HAS_NEXT means that the engine is evaluating the solutions from the binding-set iterator (hasNext()). In simple terms, this is the WHERE part of the update query, which prepares the results before the commit. It might seem stuck if there are many returned results. If you are still experiencing problems with this query, you can send an email describing the problem to graphdb-support#ontotext.com

Related

Trigger action in real time based on keyword in logs

I have a requirement to trigger an action (like calling a RESTful service) when a keyword is found in the logs. The trigger would have to be fairly real-time. I was evaluating open-source solutions like Graylog2, the ELK stack (which I believe can't analyse in real time), fluentd, etc., but wanted to know your opinion on them. It would be great if the tool also allowed setting up rules against keywords to eliminate false positives, and was easy to set up.
I hope this makes sense and apologies if this has been discussed elsewhere!
You can try Massalyzer. It's a real-time analyzer too, very fast (up to 10 million lines per second), and you can analyze logs of unlimited size with the free demo version.
So, I tried the Logstash + Graylog2 combination for the scenario I described in the question, and it works quite well. I had to tweak a few things to make Logstash work with Graylog2, especially around capturing the right log levels. I will try this out in a highly loaded clustered environment and update my findings here.

incremental query vs. continuous query

I know that a continuous query is a query which is registered once and then evaluated continuously over a data stream, but I don't understand what an incremental query means. I am reading about continuous data streams and the ways we query for a specific pattern in a stream.
Can anyone explain what an incremental query is? An explanation with an example would be really helpful.
After a lot of googling I have found some definitions, but none of them explains it clearly.
UPDATE:
I can't find the exact paper in which I first saw this term, but in this paper it appears on page 6.
You might already have researched incremental algorithms; I think that is what you're looking for.
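For intuition, here is a hedged sketch in SQL (the schema is hypothetical: events(id, amount) is an append-only stream, totals(total_amount, last_id) is a materialized result). A continuous query is re-evaluated as the stream grows; an incremental query updates the previous answer using only the newly arrived rows instead of recomputing from scratch.

-- Full re-evaluation: recompute over every row each time.
SELECT SUM(amount) FROM events;

-- Incremental evaluation: fold in only the rows that arrived since the last run.
UPDATE totals
SET total_amount = total_amount + (SELECT COALESCE(SUM(e.amount), 0)
                                   FROM events e
                                   WHERE e.id > totals.last_id),
    last_id = (SELECT COALESCE(MAX(e2.id), totals.last_id) FROM events e2);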
I have never heard of an 'incremental' query. However, that sounds a lot like Doctrine's schema update command, described here in Symfony's docs.
Food for thought until someone comes up with a better answer :)

Duplicated edges with the same #rid in OrientDB

I've discovered a strange behaviour when querying an Edge class using OrientDB (community-2.1-rc5). The database is returning the exact same edge with the exact same #rid and the exact same data, twice. My instinct says that this is a bug...
This is the query
SELECT FROM E WHERE @class = 'LIKES' AND (out IN [#12:0, #12:221]) AND in = #36:1913
And this is what OrientDB Studio returns:
http://s29.postimg.org/hwruv0zif/Captura.png
This makes no sense. If I go to the vertex and query for the LIKES relationship, only one record is returned... Has anyone faced a problem like this?
This is the database I'm using, if it helps:
https://www.dropbox.com/sh/pkm28cfer1pwpqb/AAAVGeL1eftOGR4o0todTiAha?dl=0
To get help with this bug, you should request to join the Google group; Stack Overflow is not the best place to get help with this kind of bug.
The problem is that you somehow duplicated your edge by mistake, and OrientDB lets you do this for some reason.
Here is the bug discussion on the OrientDB Google group: https://groups.google.com/forum/#!topic/orient-database/cAR7yUjCZcI
In the discussion, Luca (the creator of OrientDB) says this:
"the problem is that without a transaction the creation of edge could be dirty. OrientDB tries to fix dirty reference, so maybe that's the reason why the next time the exception is raised. I've changed the default behavior of all SQL commands against Graphs to be always transactional"
Upgrading to the most recent version of OrientDB would also be a good idea; the bug may already have been fixed.

How do you think while formulating SQL queries? Is it experience or concept?

I have been working on SQL Server and front-end coding, and I have usually had trouble formulating queries.
I understand most of the SQL concepts needed to formulate queries, but whenever some new functionality comes into the picture that can be done with a SQL query, I usually fail to work it out.
I am very comfortable with SELECT queries using joins and the like, but when it comes to DML operations I usually fail.
Every query I have never written before makes me uncomfortable to create, and whenever I go for an interview I face this problem.
Is there some concept behind the approach to formulating SQL queries?
E.g., I need to create a SQL query such that:
A table contains a single column with duplicate records; I need to remove the duplicate records.
I know I can find the solution to this query very easily by googling, but I want to know how everyone arrives at the desired result.
Is it something like "practice makes perfect", i.e. once you have done it, next time you will be able to formulate it, or is there some logic or concept behind it?
I could have got my answer to the above problem simply by posting it on Stack Overflow, and I would have had an answer within 5 to 10 minutes, but I want to know the reasoning. How do you work on any new kind of query? Is it mostly a matter of experience, or an application of concepts?
Whenever I learn something new in coding, I try to use it wherever I can. But here the scenario seems different, because maybe I am lacking some concepts.
EDIT:
How could I test my knowledge and concepts in SQL and related SQL queries?
Typically, the first time you need to open a child proof bottle of pills, you have a hard time, but after that you are prepared for what it might/will entail.
So it is with programming (me thinks).
You find problems, research best practices, and beat your head against a couple of rocks, but in the process you will come to have a handy set of tools.
Also, reading what others tried/did is a good way to avoid major obstacles.
All in all, with a lot of practice/coding, you will see patterns quicker, and learn to notice where to make use of what tool.
I have a somewhat methodical method of constructing queries in general, and it is something I use elsewhere with any problem solving I need to do.
The first step is ALWAYS listing out any bits of information I have in a request. Information is essentially anything that tells me something about something.
"A table contains a single column having duplicate records. I need to remove duplicates."
- I have a table (I'll call it table1)
- I have a column on table table1 (I'll call it col1)
- I have duplicates in col1 on table table1
- I need to remove duplicates.
The next step of my query construction is identifying the action I'll take from the information I have.
I'll look for certain keywords (e.g. remove, create, edit, show, etc...) along with the standard insert, update, delete to determine the action.
In the example this would be DELETE, because of the word "remove".
The next step is isolation.
Answer the question "the action determined above should only be valid for ______..?" This part is almost always the most difficult part of constructing any query, because it's usually abstract.
In the above example, "duplicate records" is listed as a piece of information, but it's really an abstract concept of something (anything where a specific value is not unique in usage).
Isolation is also where I test my action using a SELECT statement.
Every new query I run gets thrown through a select first!
The next step is execution, or essentially the "how do I get this done" part of a request.
A lot of times you'll figure out the "how" during the isolation step, but in some instances (yours included) how you isolate something and how you fix it are not the same thing.
Showing duplicated values is different than removing a specific duplicate.
The last step is implementation. This is just where I take everything and make the query...
Summing it all up... for me to construct a query, I'll pick out all the information in the request. Using that information I'll figure out what I need to do (the action) and what I need to do it on (isolation). Once I know what I need to do with what, I figure out the execution.
Every single time I'm starting a new "query" I'll run it through these general steps to get an idea for what I'm going to do at an abstract level.
For specific implementations of an actual request, you'll have to have some knowledge (or access to Google) to go further than this.
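For the duplicate-removal request above, a minimal sketch of the isolation and execution steps might look like this (MySQL-flavoured syntax; table1 and col1 are the hypothetical names from the list above):

-- Isolation: SELECT first, to see exactly which values are duplicated.
SELECT col1, COUNT(*) AS cnt
FROM table1
GROUP BY col1
HAVING COUNT(*) > 1;

-- Execution: with no unique key to lean on, rebuild the distinct rows
-- through a scratch table, then swap the data back in.
CREATE TABLE table1_dedup AS SELECT DISTINCT col1 FROM table1;
DELETE FROM table1;
INSERT INTO table1 (col1) SELECT col1 FROM table1_dedup;
DROP TABLE table1_dedup;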
Kris
I think in the same way I cook dinner. I have some ingredients (tables, columns etc.), some cooking methods (SELECT, UPDATE, INSERT, GROUP BY etc.) then I put them together in the way I know how.
Sometimes I will do something weird and find it tastes horrible, or that it is amazing.
Occasionally I will pick up new recipes from the internet or friends, then use parts of these in my own.
I also save my recipes in handy repositories, broken down into reusable chunks.
On the "Delete a duplicate" example, I'd come to the result by googling it. This scenario is so rare if the DB is designed properly that I wouldn't bother keeping this information in my head. Why bother, when there is a good resource is available for me to look it up when I need it?
For other queries, it really is practice makes perfect.
Over time, you get to remember frequently used patterns just because they ARE frequently used. Rare cases should be kept in a reference material. I've simply got too much other stuff to remember.
Find good documentation for your software. I use MySQL a lot, and MySQL has an excellent documentation site with a decent search function, so you get many answers just by reading the docs. Even if you do NOT get your answer, at least you are learning something.
Then I set up an example database (or use the one I am working on) and gradually build up my SQL. I tend to separate the problem into small pieces and solve it step by step. This works very well when building queries with many JOINs: it is best to start from some particular case and "pollute" your SQL with many conditions like WHERE id = '123', which you take out as you work towards your solution.
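As an illustration of that scaffolding technique (every table and column name here is hypothetical):

-- Pin the query to one known row while verifying each join...
SELECT o.id, c.name, p.title
FROM orders o
JOIN customers c ON c.id = o.customer_id
JOIN products p ON p.id = o.product_id
WHERE o.id = 123;
-- ...then drop the WHERE o.id = 123 scaffolding once every join is right.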
The best and fastest way to learn good SQL is to work with someone else, preferably someone who knows more than you, but that is not a necessary condition. It can be replaced by studying mature code written by others.
Your example is a test of how well you understand the DISTINCT keyword and the GROUP BY clause, which are SQL's ways of dealing with duplicate data.
Examples and experience. You look at other people's examples, you create your own code, and once it clicks, you don't need to think about it again.
I would have a look at the Mere Mortals book - I think it's the one by Hernandez. I remember that when I first started seriously with SQL Server 6.5, moving from manual ISAM databases and Access database systems using VB4, that it was difficult to understand the syntax, the joins and the declarative style. And the SQL queries, while powerful, were very intimidating to understand - because typically, I was looking at generated code in Microsoft Access.
However, once I had developed a relatively systematic approach to building queries in a consistent and straightforward fashion, my skills and confidence quickly moved forward.
From seeing your responses, you have two options:
1. Have a copy of the specification for whatever you're working on (the SQL spec and the documentation for your SQL implementation: SQLite, SQL Server, etc.).
2. Use Google, Stack Overflow, books, etc. as resources to find answers.
You can't formulate an answer to a problem without doing one of the above. The first option is to become well versed in the capabilities of whatever you are working on.
The second option allows you to find answers that you may not even fully know how to ask for. Your example is fairly simplistic, so if you read the spec/implementation documentation you would know the answer right away. But there are times when, even if you read the spec/documentation, you don't know the answer. You only know that it IS possible, just not how to do it.
Remember that as far as jobs and supervisors go, being able to resolve a problem is important, but the faster you can do it the better, and that can often be done with option 2.

SQL With A Safety Net

My firm has a talented and smart operations staff who work very hard. I'd like to give them a SQL-execution tool that helps them avoid common, easily detected SQL mistakes that are easy to make when they are in a hurry. Can anyone suggest such a tool? Details follow.
Part of the operations team remit is writing very complex ad-hoc SQL queries. Not surprisingly, operators sometimes make mistakes in the queries they write because they are so busy.
Luckily, their queries are all SELECTs, not data-changing SQL, and they are running on a copy of the database anyway. Still, we'd like to prevent errors in the SQL they run. For instance, sometimes a mistake leads to a long-running query that slows down the duplicate system they're using and inconveniences others until we find the culprit query and kill it. Worse, occasionally a mistake leads to apparently correct answers that we don't catch until much later, with consequent embarrassment.
Our developers also make mistakes in the complex code they write, but they have Eclipse and various plugins (such as FindBugs) that catch errors as they type. I'd like to give the operators something similar; ideally it would see
SELECT U.NAME, C.NAME FROM USER U, COMPANY C WHERE U.NAME = 'ibell';
and before you executed it, it would say "Hey, did you realise that's a Cartesian product? Are you sure you want to do that?" It doesn't have to be very smart; finding obviously missing join conditions and similar evident errors would be fine.
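For contrast, the presumably intended query would carry an explicit join condition, something like this sketch (U.COMPANY_ID is an assumed foreign key, since the real schema isn't shown):

SELECT U.NAME, C.NAME
FROM USER U
JOIN COMPANY C ON C.ID = U.COMPANY_ID  -- the join condition the original is missing
WHERE U.NAME = 'ibell';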
It looks like TOAD should do this but I can't seem to find anything about such a feature. Are there other tools like TOAD that can provide this kind of semi-intelligent error correction?
Update: I forgot to mention that we're using MySQL.
If your people are using the mysql(1) program to run queries, you can use the safe-updates option (aka i-am-a-dummy) to get part of what you need. Its name is somewhat misleading: it not only prevents UPDATE and DELETE without a WHERE clause (which you're not worried about), but also adds an implicit LIMIT 1000 to SELECT statements, and aborts SELECTs with joins that are estimated to examine over 1,000,000 tuples, which is perfect for discouraging Cartesian joins.
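For reference, the same guards can be switched on per session; this sketch mirrors the option's documented defaults (the values are tunable):

SET sql_safe_updates = 1;     -- reject UPDATE/DELETE without a keyed WHERE or LIMIT
SET sql_select_limit = 1000;  -- implicit LIMIT on SELECT results
SET max_join_size = 1000000;  -- abort SELECTs estimated to examine more rows than this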
..."writing very complex ad-hoc SQL queries.... they are so busy"
Danger Will Robinson!
Automate Automate Automate.
Ideally, the ops team should not be put in a position where they have to write queries on the fly in a high-stress situation; it's a recipe for disaster! Better for them to build up a library of pre-written scripts that have undergone appropriate testing to make sure each one a) does what you want, b) provides an audit trail, and c) has a possible 'undo'-type function.
Failing that, giving them a user ID that only has SELECT permissions might help :-)
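In MySQL, that might look like the following sketch (the account and database names are made up):

-- A read-only account: SELECT only, no data-changing privileges.
CREATE USER 'ops_readonly'@'%' IDENTIFIED BY 'choose-a-password';
GRANT SELECT ON reports_copy.* TO 'ops_readonly'@'%';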
You might find SQL Prompt from Red Gate useful. I'm not sure what database engine you're using, though, as it's only for MS SQL Server.
I'm not expecting anything like this to exist. The tool would have to first implement everything that the SQL parser in your database implements, and then it would have to do a data model analysis to predict "bad" queries.
Your best bet might be to write a plugin for a text editor that did some basic checking for suspicious patterns and highlighted them differently than the standard .sql mode. But even that would be quite difficult.
I would be happy with a tool that set off alarm bells whenever I typed in an update statement without a where clause. And perhaps administered a mild electric shock, since it's usually about 1 in the morning after a long day when mistakes like that happen.
It would be pretty easy to build this by setting up a sample database with an extremely small amount of dummy data, which would receive the query first. A couple of things could happen:
You might get a SQL syntax error, which would not load the database much since it's a small database.
You might get back a response which could clearly be shown to contain every row in one or more tables, which is probably not what they want.
Things which pass the above conditions are likely to be okay, so you can run them against the copy of the production database.
Assuming your schema doesn't change much and is not particularly weird, writing the above is likely the quickest solution to your problem.
I'd start with some coding standards. For instance, never use the type of join in your example: it often produces bad results (especially in SQL Server, if you try to do an outer join that way, you will get wrong results). Require explicit joins.
If you have complex relationships, you might consider putting them in views and then writing the ad-hoc queries against the views. Then at least they will never make the mistake of getting the joins wrong.
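A minimal sketch of that idea, reusing the hypothetical USER/COMPANY schema from the question:

-- Encode the join once, correctly, in a view...
CREATE VIEW user_company AS
SELECT U.NAME AS user_name, C.NAME AS company_name
FROM USER U
JOIN COMPANY C ON C.ID = U.COMPANY_ID;

-- ...so ad-hoc queries no longer need to know the join condition.
SELECT user_name, company_name FROM user_company WHERE user_name = 'ibell';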
Can't you just limit the amount of time a query can run for? I'm not sure about MySQL, but in SQL Server even the default Query Analyzer can restrict how long queries run before they time out. Couple that with limited rights so they can only run SELECT queries, and you should be pretty much covered.
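For what it's worth, newer MySQL versions (5.7.8 and later, as far as I know) can cap SELECT time per session:

-- Abort any SELECT in this session that runs longer than 30 seconds.
SET SESSION max_execution_time = 30000;  -- milliseconds; SELECT statements only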