Complex query in elastic using Kibana - lucene

Say I have many documents, one document may indicate a door opening, another a door closing, each of these has a doorID field.
I would like to write a query that would return all the cases where a door both opened and closed.
There are doors that "just" opened, or "just closed - those are not interesting to me.
I want only open-closed pairs, based on ID.
How would I create that "connection" between both documents in lucene syntax?

Related

How to store tasks that can edit/add/delete without impact the history of the documents in sql?

It maybe seems a trivial question, but how to design the problem to SQL?
I have a document containing tasks inside.
Each document is per day and each task the have description, complete properties.
I want to be able to provide a page to the user where he can see the tasks of the document and be able to mark if it's complete.
Also, I want to provide the user the ability to edit/add or delete tasks without impacting the previous documents.
Of course, I start from the basic create a tasks table and docs table and docs_tasks for the relationship, but how to design the edit/add or delete without impact the previous docs (for search and archive purposes)?

How to find Tickets I worked on during a given timespan?

We are using HP ALM 12.53.193 for issue tracking.
For a progress report of our development activity, I am trying to find out what tickets I worked on within the past month. Primarily, this includes tickets for which my user name was added to the Editors field, or that were marked as Fixed or Closed by me in the given timespan.
I am seeing some promising hints in this forum post that uses SQL to query the AUDIT_LOG table. Likewise, this [sqa.se] post suggests an SQL script for an Excel report.
Unfortunately, when I click New Business View Excel Report in ALM, I am asked to upload an Excel file for some reason, so I am not sure how to proceed from there to entering my SQL.
I have also tried creating a report with the New Project Report command (i.e. I am also ok with a non-scripted solution, if that is the way to go in ALM). However, it seems I can only show the filtered contents of single tables (e.g. Defect) there (the "Cross Filter" feature does not let me choose another entity (e.g. the Audit Log) for cross-filtering my defects with).
How can I retrieve that data from ALM?
I am not entirely sure SO is the right site for this question, although I consider HP ALM a "software tool(...) commonly used by programmers". Please migrate if it is deemed a better fit for another site.

Lotus Notes: Replication conflict caused by agent and user running on the document at same time

In one of the lotus notes db, too frequent replication/save conflicts are caused reason being a scheduled agent and any user working on the document at the same time.
is there any way to avoid this.
Thanks,
H.P.
Several options in addition to merging conflicts:
Change the schedule The best way to avoid it is to have your scheduled agents running at times when users are not likely to be accessing the system. If the LastContact field on a Client document is updated by an agent every hour as it checks all Contact documents, maybe the agent should run overnight instead.
Run the agent on user action It may also be the case that the agent shouldn't be running on a schedule, but should be running when the user takes some action. For example, run the agent to update the Client document when the user saves the supporting Contact document.
Break the form into smaller bits A third thing to consider is redesigning your form so that not every piece of data is on a main form. For example, if comments on recent contacts with a client are currently held in a field on the Client document, you might change the design to have a separate ClientMeeting form from which the comments on the meeting are displayed in an embedded view or computed text (or designed using Xpages).
Despite the fact that I am a developer, I think rep/saves are far more often the result of design decisions than anything else.
You can use the Conflict Handling option on the form(s) in question and select either Merge Conflicts or Merge/No Conflicts in order to have Notes handle merging of edit conflicts.
From the Help database:
At the "Conflict Handling" section of the Form Info tab, choose one of the following options for the form:
Create Conflicts -- Creates conflicts so that a replication conflict appears as a response document to the main document. The main document is selected based on the time and date of the changes and an internal document sequence number.
Merge Conflicts -- If a replication conflict occurs, saves the edits to each field in a single document. However, if two users edit the same field in the same document, Notes saves one document as a main document and the other document as a response document marked as a replication conflict. The main document is selected based on the criteria listed in the bullet above.
Merge/No Conflicts -- If replication occurs, saves the edits to each field in a single document. If two users edit the same field in the same document, Notes selects the field from the main document, based on time and date, and an internal document sequence number. No conflict document is generated, instead conflicting documents are merged into a single document at the field level.
Do Not Create Conflicts -- No merge occurs. IBM® Lotus® Domino(TM) simply chooses one document over another. The other document is lost.
In later versions of Notes there is the concept of document locking, and used properly that can prevent conflicts (but also add complexity).
Usually most conflicts can be avoided by planning to run the agents late at night when users aren't on the system. If that's not an option, then locking may be the best solution. If the conflicts aren't too many, you might benefit from adding a view filtered to show only conflicts, which would make findind and resolving them easier.
IMHO, the best answer to conflicts between users and agents is to make sure that they are operating on different documents. I.e., there are two documents with a common key. They can be parent and child if it would be convenient to show them that way in a view, but they don't have to be. Just call them DocA and DocB for the purposes of this discussion.
DocA is read and updated by users. When a user is viewing DocA, computed field formulas can pull information from DocB via DbLookup or GetDocField. Users never update DocB.
DocB, on the other hand, is read and updated by agents, but agents only read DocA. They never update them.
If you design your application any other way, then you either have to use locking -- which can create the possibility of not being able to update something when you need to, or accepting the fact that conflicts can happen occasionally and will need to be resolved.
Note that even with this strategy, you can still have conflicts if you have multiple replicas of the database. You can use the 'Conflict Handling' section of the Form properties to help minimize replication conflicts, as per #Per Henrik Lausten's answer, but since you are talking about an existing please also see my comment to his answer for additional info about what you would have to do in order to use this feature.
If this is a mission critical application, consider creating a database with lock-documents. That means, every time a user opens a document, a separate lock-document is created.
Then code the agent to see if lock-documents exist for every document that the agent wants to modify. If it does, skip that document.
Document-close should remove the doc-lock.
The lock-doc should be created on document open, not just read. This way, when a user has the document open in read mode, the agent will not be able to modify as well. This is to prohibit, that the user might change to editmode afterwards and make changes.
If the agent has a long modification time, it should create lock-docs as well.

How to find the document visitior's count?

Actually I am in need of counting the visitors count for a particular document.
I can do it by adding a field, and increasing its value.
But the problem is following.,
I have 10 replication copies in different location. It is being replicated by scheduled manner. So replication conflict is happening because of document count is editing the same document in different location.
I would use an external solution for this. Just search for "visitor count" in your favorite search engine and choose a third party tool. You can then display the count on the page if that is important.
If you need to store the value in the database for some reason, perhaps you could store it as a new doc type that gets added each time (and cleaned up later) to avoid the replication issues.
Otherwise if storing it isn't required consider Google Analytics too.
Also I faced this problem. I can not say that it has a easy solution. Document locking is the only solution that i had found. But the visitor's count is not possible.
It is possible, but not by updating the document. Instead have an AJAX call to an agent or form with parameters on the URL identifying the document being read. This call writes a document into a tracking DB with one or two views and then determines from those views how many reads you have had. The number of reads is the return value of the AJAX form.
This can be written in LS, Java or #Formulas. I would try to do it 100% in #Formulas to make it as efficient as possible.
You can also add logic to exclude reads from the same user or same source IP address.
The tracking database then replicates using the same schedule as the other database.
Daily or Hourly agents can run to create summary documents and delete the detail documents so that you do not exceed the limits for #DBLookup.
If you do not need very nearly real time counts (and that is the best you can get with replicated system like this) you could use the web logs that domino generates by finding the reads in the logs and building the counts in a document per server.
/Newbs
Back in the 90s, we had a client that needed to know that each person had read a document without them clicking to sign or anything.
The initial solution was to add each name to a text field on a separate tracking document. This ran into problems when it got over 32k real fast. Then, one of my colleagues realized you could just have it create a document for each user to record that they'd read it.
Heck, you could have one database used to track all reads for all users of all documents, since one user can only open one document at a time -- each time they open a new document, either add that value to a field or create a field named after the document they've read on their own "reader tracker" document.
Or you could make that a mail-in database, so no worries about replication. Each time they open a document for which you want to track reads, it create a tiny document that has only their name and what document they read which gets mailed into the "read counter database". If you don't care who read it, you have an agent that runs on a schedule that updates the count and deletes the mailed-in documents.
There really are a lot of ways to skin this cat.

How does enterprise search display results for the user and hide unauthorized results?

I am looking to understand how enterprise search solutions tackle the issue of user-permissions.
My question is on displaying the search results for users. The naive approach would display the search results to the user, and then if the user clicks a document he is not authorized to see, he will fail to open it. However, it is even forbidden to display a document's title or excerpt if the user does not have permission to read it. So do the various enterprise earch engines:
index each document together with its ACL?
index all documents with no permission info, but check each link in every search result to see whether the querying user has permission to view this link?
Option #2 makes more sense to me, but also seems much slower than option #1.
Option #1 suffers from the need to constantly update the changes in permissions on the indexed documents.
I am looking to understand what is the common approach in the existing solutions in the market today. Is there a third option?
I'm surprised to see that this 5 year old question hasn't got any answers, as I think it's quite a common and important problem in enterprise search.
As outlined in the question there are two common approaches to deal with document-level security:
early-binding-security: indexing ACL's along with the content, and
late-binding-security: handling security at query-time, by filtering out protected results
Handling security on content side only is never recommended as at that point in time confidential information might already have been revealed (e.g. title or preview of a document in the search result).
The advantage of implementing security with a late-binding approach is, that it's very flexible, because there is no need to re-index content upon changed ACLs. The biggest drawback however is, that by doing so, confidential information might be leaked via facet values, and it's not possible to retrieve and display correct facet counts. It also more difficult to properly populate the result list and handle pagination. Last but not least, this approach can significantly slow down the performance.
The advantage of implementing security with an early-binding approach is, that it addresses all of the above disadvantages for the price of re-indexing the content as soon as ACLs change. However, leaks are still possible, e.g. when a group membership or ACL just got changed and isn't reflected yet in the search index. To address this gap the two approaches early-binding and late-binding are often combined.
Last but not least there might be a third option, depending on the Enterprise Search Platform you are using: Attivio's Active Security is based on query time joins, which allows to index security information independent from the document itself, but at query time merges the two documents to ensure that only authorised content makes it into the search results.