I've googled this a lot. Most of these issues are caused by a lock being left around after a JVM crash. This is not my case.
I have an index with multiple readers and writers. I am trying to do a mass index update (delete and add -- that's how Lucene does updates). I'm using Solr's embedded server (org.apache.solr.client.solrj.embedded.EmbeddedSolrServer). Other writers are using the remote, non-streaming server (org.apache.solr.client.solrj.impl.CommonsHttpSolrServer).
I kick off this mass update, it runs fine for a while, then dies with a
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock#/.../lucene-ff783c5d8800fd9722a95494d07d7e37-write.lock
I've adjusted my lock timeouts in solrconfig.xml
<writeLockTimeout>20000</writeLockTimeout>
<commitLockTimeout>10000</commitLockTimeout>
I'm about to start reading the lucene code to figure this out. Any help so I don't have to do this would be great!
EDIT: All my updates go through the following code (Scala):
val req = new UpdateRequest
req.setAction(AbstractUpdateRequest.ACTION.COMMIT, false, false)
req.add(docs)
val rsp = req.process(solrServer)
solrServer is an instance of org.apache.solr.client.solrj.impl.CommonsHttpSolrServer, org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer, or org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.
ANOTHER EDIT:
I stopped using EmbeddedSolrServer and it works now. I have two separate processes that update the Solr search index:
1) Servlet
2) Command line tool
The command line tool was using the EmbeddedSolrServer and it would eventually crash with the LockObtainFailedException. When I started using StreamingUpdateSolrServer, the problems went away.
I'm still a little confused that the EmbeddedSolrServer would work at all. Can someone explain this? I thought it would play nicely with the servlet process and that each would wait while the other was writing.
I'm assuming that you're doing something like:
writer1.writeSomeStuff();
writer2.writeSomeStuff(); // this one doesn't write
The reason this won't work is that the writer stays open until you close it. So writer1 writes and holds on to the lock, even after it's done writing. (Once a writer gets the lock, it never releases it until the writer is closed.) writer2 can't get the lock, since writer1 is still holding it, so it throws a LockObtainFailedException.
If you want to use two writers, you'd need to do something like:
writer1.writeSomeStuff();
writer1.close();
writer2 = new IndexWriter(...); // a writer is opened via its constructor
writer2.writeSomeStuff();
writer2.close();
Since you can only have one writer open at a time, this pretty much negates any benefit you would get from using multiple writers. (It's actually much worse to open and close them all the time since you'll be constantly paying a warmup penalty.)
So the answer to what I suspect is your underlying question is: don't use multiple writers. Use a single writer with multiple threads accessing it (IndexWriter is thread safe). If you're connecting to Solr via REST or some other HTTP API, a single Solr writer should be able to handle many requests.
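To illustrate, here is a minimal sketch of the single-shared-writer pattern in plain Lucene (Java, 3.x-era API); the index path, analyzer, and document fields are placeholders I've assumed for the example:

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import java.io.File;

public class SharedWriterExample {
    public static void main(String[] args) throws Exception {
        // One writer for the whole process; it holds write.lock until close().
        FSDirectory dir = FSDirectory.open(new File("/path/to/index"));   // placeholder path
        IndexWriterConfig cfg =
                new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
        final IndexWriter writer = new IndexWriter(dir, cfg);

        // Several threads share the same writer; IndexWriter is thread-safe.
        Runnable job = new Runnable() {
            public void run() {
                try {
                    Document doc = new Document();
                    doc.add(new Field("id", Thread.currentThread().getName(),
                                      Field.Store.YES, Field.Index.NOT_ANALYZED));
                    writer.addDocument(doc);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        };
        Thread t1 = new Thread(job), t2 = new Thread(job);
        t1.start(); t2.start();
        t1.join(); t2.join();

        writer.close();   // commits pending changes and releases write.lock
    }
}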
I'm not sure what your use case is, but another possible answer is to see Solr's Recommendations for managing multiple indices. In particular, the ability to hot-swap cores might be of interest.
>> But you have multiple Solr servers writing to the same location, right?
No, wrong. Solr uses the Lucene libraries, and "Lucene in Action"* states that only one process/thread can write to the index at a time. That is why the writer takes a lock.
Your concurrent processes that are trying to write could, perhaps, check for org.apache.lucene.store.LockObtainFailedException when instantiating the writer.
You could, for instance, put the process that instantiates writer2 in a waiting loop until the active writing process finishes and calls writer1.close(), which releases the lock and makes the Lucene index available for writing again. Alternatively, you could have multiple Lucene indexes (in different locations) being written to concurrently; when searching, you would then need to search across all of them.
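As a rough sketch of that waiting-loop idea (Java, Lucene 3.x-era API; the retry count and sleep interval are arbitrary choices for illustration):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.LockObtainFailedException;
import org.apache.lucene.util.Version;

public class WriterWithRetry {
    /** Try to open a writer, sleeping and retrying while another process holds write.lock. */
    static IndexWriter openWithRetry(Directory dir, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return new IndexWriter(dir,
                        new IndexWriterConfig(Version.LUCENE_36,
                                              new StandardAnalyzer(Version.LUCENE_36)));
            } catch (LockObtainFailedException e) {
                if (attempt >= maxAttempts) {
                    throw e;            // give up after maxAttempts tries
                }
                Thread.sleep(2000);     // another process is writing; wait and retry
            }
        }
    }
}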
* "In order to enforce a single writer at a time, which means an IndexWriter or an IndexReader doing deletions or changing norms, Lucene uses a file-based lock: If the lock file (write.lock, by default) exists in your index directory, a writer currently has the index open. Any attempt to create another writer on the same index will hit a LockObtainFailedException. This is a vital protection mechanism, because if two writers are accidentally created on a single index, it will very quickly lead to index corruption."
Section 2.11.3, Lucene in Action, Second Edition, Michael McCandless, Erik Hatcher, and Otis Gospodnetić, 2010
Related
I'm starting to assess RavenDB for our company, for storing some stuff that doesn't really belong in a relational database (we're traditionally a SQL Server shop). I installed RavenDB locally on my machine, created a database, added a document. Nice!
Being a DBA, I decided to see how backups/restores work. I backed up my database, deleted it, then restored it from the backup. After refreshing my admin screen, I saw my database. I clicked on it, and got a message that the database doesn't exist.
After a couple of hours, I tried again. Still doesn't exist. A full day later, I walk into work and try again. This time the database works. I've had similar situations with updating documents. An update seems to take anywhere from one second to several hours to show up...
Is this normal for RavenDB?? Am I completely misconfigured?? I run SQL Server on my local machine and it's lightning-fast, so I can't imagine updating a single document could take that long. As-is, I can't imagine recommending we use RavenDB for anything.
Are you querying using indexes or getting documents by ID? Documents should be updated immediately (ACID). If indexes are slow to update (check their status using RavenDB Studio), it could be a configuration problem, or something external like anti-virus software can cause them to update slowly.
Apparently, at least for the document-update latency, query caching is enabled by default, so I was getting cached results.
Jeffery,
No, that isn't normal by a long shot. You should be able to see what was changed immediately.
Note that certain AV products will interfere with the HTTP pipeline and can affect RavenDB's usage. The Studio will also auto-update things only every 5 seconds (to reduce UI jitter), but that is about it.
Restoring a database (from the same machine) should take only as long as it takes to copy the files (a purely I/O-bound operation).
If this is from another machine using a different version of Windows, we might need to run a check on the file, which can take a bit of time, but that doesn't sound like your scenario.
My organization has recently had trouble with some SQL Server blocking processes. dbWarden has successfully reported blocking to us, but we often have the blocking SQL text reported as 'FETCH API_CURSOR'.
So, we're looking to alter the blocking alerts trigger in dbWarden to use sys.dm_exec_cursors and sys.dm_exec_sql_text to retrieve the text in the case where we find 'FETCH API_CURSOR' reported.
Trouble is, I cannot seem to come up with a way to recreate/simulate a blocking situation on our development server that will report as 'FETCH API_CURSOR'. I've started from the VB script here on SQL Authority to recreate the open cursor, but I cannot for the life of me figure out how to make it cause blocking.
I've seen many methods for recreating blocking transactions (open a transaction in one window but do not commit/close, then try an update on the same table in another), but none that would utilize FETCH API_CURSOR in a way that would allow us to test successfully. I'm somewhat at a loss here.
Has anyone had success in simulating blocking cursors in the past and can offer suggestions?
I'd suggest using the Profiler tool to capture the actual code that creates and fetches the cursor. That way you'd see exactly what's going on in the application. It is not so difficult to reproduce similar blocking on a development server.
Let's say one thread fetches rows from a cursor and another thread tries to UPDATE the same rows. Here's what's going on under the hood. The reading thread creates a cursor to fetch the result of a SELECT back to the application. This technique is ancient and extremely slow; nowadays mostly only some old (often Java) applications use cursors for this purpose. Rows get fetched one by one, with the client driving the process, so it takes time. During this time, the reading thread holds shared locks on the data it reads. That is by design: SQL Server is a locking database and uses locks to function properly. If another thread tries to update a row protected by a shared lock, it gets blocked: the updating thread takes update locks while searching for rows to modify and must convert them to exclusive locks, but an exclusive lock can't be granted while another thread still owns a shared lock on the same row. So I'd create a cursor, fetch several rows, and try to update those rows in another tab. If you have trouble reproducing the block, try raising the reading transaction's isolation level.
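If you'd rather reproduce it from code than from Profiler, here is a rough Java/JDBC sketch of the idea. It assumes the Microsoft JDBC driver's selectMethod=cursor connection property (which makes the driver use API server cursors, the source of the FETCH API_CURSOR text) and a hypothetical dbo.Orders table; the connection string is a placeholder:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorBlockingRepro {
    public static void main(String[] args) throws Exception {
        // Session 1: read through an API server cursor and keep its locks.
        // selectMethod=cursor makes the driver fetch via sp_cursorfetch,
        // which shows up in blocking reports as FETCH API_CURSOR...
        String url = "jdbc:sqlserver://localhost;databaseName=TestDb;"
                   + "user=sa;password=...;selectMethod=cursor";   // placeholder connection string
        Connection reader = DriverManager.getConnection(url);
        reader.setAutoCommit(false);
        // Raise the isolation level so shared locks are held until commit.
        reader.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);

        Statement stmt = reader.createStatement();
        stmt.setFetchSize(1);                                         // fetch row by row
        ResultSet rs = stmt.executeQuery("SELECT * FROM dbo.Orders"); // hypothetical table
        rs.next();                                                    // fetch a row, then stop;
                                                                      // do NOT commit or close yet

        // Session 2 (another connection or query window): this UPDATE should now block,
        // and dbWarden should report the blocker's text as FETCH API_CURSOR:
        // UPDATE dbo.Orders SET Status = 'X' WHERE OrderId = <a row the cursor has read>;
    }
}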
But seriously, I don't think you have to reproduce this or a similar scenario on a development server. Stop using cursors for recordset fetching! Reads will be much faster and blocking issues will be reduced a lot. It's been a while since 1989, when cursors had their heyday. Client DB-access libraries have evolved a lot, and it is worth picking up the fruits of that progress. Even in Java, whether or not to use them is a configuration option.
I apologize if the cursor is being used on purpose in this case. That is improbable, but possible. I haven't seen such 'proper' cursor usage in ages! I'd be delighted to run into one more genuine case.
My program was running so slowly that I had to terminate it halfway through to optimize some of the code, right after 40,000 documents had been inserted into the database but before Lucene's IndexWriter.close was called. Now I can't find any results for some of the records, and the missing ones seem to be limited to the 40,000 documents from that particular run.
Does that mean the records I indexed during that run were lost? Must IndexWriter always be closed properly for the data to be written to the index?
Thanks in advance!
It's not close that you need to call, but commit. addDocument only analyzes the document and buffers data in memory, while commit flushes pending changes and performs an fsync.
close calls commit internally; I think this is why you assume close is required.
However, be careful not to call commit too often, as this operation is very costly compared to addDocument.
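A minimal sketch of that pattern (Java, Lucene 3.x-era API; the index path and batch size are arbitrary choices for illustration):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;
import java.io.File;
import java.util.List;

public class BatchIndexer {
    public static void index(List<Document> docs) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(new File("/path/to/index")),   // placeholder path
                new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36)));
        int sinceLastCommit = 0;
        for (Document doc : docs) {
            writer.addDocument(doc);        // analyzed and buffered in memory only
            if (++sinceLastCommit >= 10000) {
                writer.commit();            // flush + fsync: durable even if the JVM dies now
                sinceLastCommit = 0;
            }
        }
        writer.close();                     // commits whatever is still pending
    }
}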
I have been googling for two days... I really need help.
I have an application with multiple threads trying to update a Lucene index, opening and closing an IndexWriter for each update. Threads can start at any time.
Yeah, the problem is with write.lock! So there may be two or more solutions:
1) Check IndexWriter.IsLocked(index) and, if it is locked, put the thread to sleep.
2) Open an IndexWriter and never close it. The problem is that I have another application using the same index. Also, when should I close the index and finalize the whole process?
Here are some interesting postings:
Lucene IndexWriter thread safety
Lucene - open a closed IndexWriter
Update:
Exactly two years later I built a REST API which wraps this, and all writes and reads are routed through the API.
Multiple threads, same process:
Keep your IndexWriter open and share it across multiple threads. IndexWriter is thread-safe.
For multiple processes, your solution #1 is prone to race conditions. You would need to use a named Mutex to implement it safely:
http://msdn.microsoft.com/en-us/library/bwe34f1k.aspx
That being said, I'd personally opt for a process dedicated to writing to the index, and communicate with it using something like WCF.
I have a Solr/Lucene index of approximately 700 GB. The documents that I need to index are being read in real time; roughly 1,000 docs are submitted every 30 minutes and need to be indexed. In my scenario, a script runs every 30 minutes and indexes the documents that are not yet indexed, since it is a requirement that new documents be searchable as soon as possible, but this process slows down searching.
Is this the best way to index the latest documents, or is there a better way?
First, remember that Solr is not a real-time search engine (yet). There is still work to be done.
You can use a master/slave setup, where indexing is done on the master and searching on the slave. That way, indexing does not affect search performance. After the commit is done on the master, force the slave to fetch the latest index from the master. While the new index is being replicated to the slave, the slave is still serving queries with the previous index.
Also, check your cache-warming settings. Remember that these might slow down searches if they are too aggressive. Also check the queries launched on the newSearcher event.
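As a rough illustration of forcing the slave to fetch the latest index: Solr's ReplicationHandler accepts a fetchindex command, so something like the following SolrJ call should work (the slave URL is a placeholder, and in practice slaves usually just poll the master on their own):

import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;

public class ForceReplication {
    public static void main(String[] args) throws Exception {
        CommonsHttpSolrServer slave =
                new CommonsHttpSolrServer("http://slave-host:8983/solr");  // placeholder URL

        // Ask the slave's ReplicationHandler to pull the latest commit from its master.
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("command", "fetchindex");
        QueryRequest req = new QueryRequest(params);
        req.setPath("/replication");
        req.process(slave);
    }
}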
You can do this with Lucene easily. Split the index into multiple parts (or, to be precise, create "smaller" parts while building the indexes). Create a searcher for each part and store references to them. You can create a MultiSearcher on top of these individual parts.
Now, only one index will receive the new documents. At regular intervals, add documents to this index, commit, and re-open its searcher.
After the last index is updated, you can create a new multi-searcher again, using the previously opened searchers.
Thus, at any point, you will be re-opening only one searcher and that will be quite fast.
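Here is a rough sketch of that arrangement in plain Lucene (Java, 3.x-era API). The directory handling is simplified, and I'm using MultiReader plus a single IndexSearcher, which is equivalent to the MultiSearcher idea described above; only the reader over the "live" part is ever reopened:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.MultiReader;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.store.FSDirectory;
import java.io.File;

public class PartitionedSearch {
    private IndexReader[] stableParts;   // readers over the parts that never change
    private IndexReader liveReader;      // reader over the one index receiving new docs
    private IndexSearcher searcher;      // searcher over all parts combined

    public PartitionedSearch(File[] stableDirs, File liveDir) throws Exception {
        stableParts = new IndexReader[stableDirs.length];
        for (int i = 0; i < stableDirs.length; i++) {
            stableParts[i] = IndexReader.open(FSDirectory.open(stableDirs[i]));
        }
        liveReader = IndexReader.open(FSDirectory.open(liveDir));
        rebuildSearcher();
    }

    /** Call after each commit on the live index: only the live reader is reopened. */
    public synchronized void refresh() throws Exception {
        IndexReader newLive = IndexReader.openIfChanged(liveReader);
        if (newLive != null) {
            liveReader.close();
            liveReader = newLive;
            rebuildSearcher();
        }
    }

    private void rebuildSearcher() {
        IndexReader[] all = new IndexReader[stableParts.length + 1];
        System.arraycopy(stableParts, 0, all, 0, stableParts.length);
        all[stableParts.length] = liveReader;
        // MultiReader presents all the parts as one logical index to the searcher.
        searcher = new IndexSearcher(new MultiReader(all));
    }

    public synchronized IndexSearcher searcher() {
        return searcher;
    }
}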
Check out http://code.google.com/p/zoie/, a wrapper around Lucene that makes it real-time -- code donated by LinkedIn.
^^ I do this with plain Lucene (not Solr) and it works really nicely. However, I'm not sure if there is a Solr way to do it at the moment. Twitter recently went with Lucene for searching and has effectively real-time search by simply writing to its index on every update. Its index resides completely in memory, so updating/reading the index is of no consequence and happens instantly. A Lucene index can always be read while being written to, as long as there is only one writer at a time.
Check out this wiki page