How to corrupt a Raven Index - ravendb

I am building a script that checks for corrupted indexes and resets them, but I am having trouble producing a corrupted index locally.
Does anyone know how to force an index corruption for RavenDB?

To cause corruption, you can delete one of the header files (headers.one, headers.two, or both) or one of the journal files (while the database is offline).
The files are located under the relevant index folder.
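With the database offline, a minimal sketch of forcing this locally might look like the following (the data directory, database name, and index folder name below are assumptions; adjust them to your local setup):

// Sketch only: corrupt a local index by deleting one of its header files.
// The paths below are assumptions - point them at your own server's data
// directory and the folder of the index you want to corrupt, and make sure
// the database (or the whole server) is offline first.
using System;
using System.IO;

class ForceIndexCorruption
{
    static void Main()
    {
        var indexFolder = @"C:\RavenData\Databases\Northwind\Indexes\Orders_ByCompany";
        var headerFile = Path.Combine(indexFolder, "headers.one");

        if (File.Exists(headerFile))
        {
            File.Delete(headerFile);
            Console.WriteLine($"Deleted {headerFile}; the index should be reported as corrupted on the next startup.");
        }
    }
}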

You can simply divide by zero in the index definition and you will get index errors.
For example, define an index with:
from order in docs.Orders
select new
{
    order.Company,
    Total = order.Lines.Sum(l => (l.Quantity / 0))
}
Update:
See the Debugging Index Errors article to learn how you can generate:
Index Compilation Errors and/or
Index Execution Errors
https://ravendb.net/docs/article-page/4.1/Csharp/indexes/troubleshooting/debugging-index-errors
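As for the script that checks for errored indexes and resets them, a minimal sketch against the RavenDB 4.x C# client could look like the one below (the URL and database name are placeholders; verify the operation names against your client version):

// Sketch only: find indexes that report errors and reset them.
// Uses the RavenDB 4.x C# client; the URL and database name are placeholders.
using System;
using Raven.Client.Documents;
using Raven.Client.Documents.Operations.Indexes;

class ResetErroredIndexes
{
    static void Main()
    {
        using (var store = new DocumentStore
        {
            Urls = new[] { "http://localhost:8080" },
            Database = "Northwind"
        })
        {
            store.Initialize();

            // Collect the recorded indexing errors for every index.
            var allErrors = store.Maintenance.Send(new GetIndexErrorsOperation());

            foreach (var indexErrors in allErrors)
            {
                if (indexErrors.Errors.Length == 0)
                    continue;

                Console.WriteLine($"Resetting index '{indexErrors.Name}' ({indexErrors.Errors.Length} errors)");
                // Reset drops the index data and rebuilds it from scratch.
                store.Maintenance.Send(new ResetIndexOperation(indexErrors.Name));
            }
        }
    }
}

Note that this only catches indexes with recorded indexing errors (like the divide-by-zero example above); an index corrupted on disk may instead surface as errored/faulty in the index statistics, so checking the index state there as well is worth considering.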

Related

Snowflake COPY INTO from JSON - ON_ERROR = CONTINUE - Weird Issue

I am trying to load a JSON file from a staging area (S3) into a stage table using the COPY INTO command.
Table:
create or replace TABLE stage_tableA (
RAW_JSON VARIANT NOT NULL
);
Copy Command:
copy into stage_tableA from @stgS3/filename_45.gz file_format = (format_name = 'file_json')
I got the error below when executing the above (sample provided):
SQL Error [100069] [22P02]: Error parsing JSON: document is too large, max size 16777216 bytes. If you would like to continue loading when an error is encountered, use other values such as 'SKIP_FILE' or 'CONTINUE' for the ON_ERROR option. For more information on loading options, please run 'info loading_data' in a SQL client.
When I set "ON_ERROR=CONTINUE", records were partially loaded, i.e. only up to the record that exceeded the max size; no records after the error record were loaded.
Wasn't "ON_ERROR=CONTINUE" supposed to skip only the record that exceeds the max size and load the records before and after it?
Yes, ON_ERROR=CONTINUE skips the offending line and continues to load the rest of the file.
To help us provide more insight, can you answer the following:
How many records are in your file?
How many got loaded?
At what line was the error first encountered?
You can find this information using the COPY_HISTORY() table function
Try setting the option strip_outer_array = true on the file format and attempt the load again.
The considerations for loading large semi-structured data are documented in the article below:
https://docs.snowflake.com/en/user-guide/semistructured-considerations.html
I partially agree with Chris. The ON_ERROR=CONTINUE option only helps if there are in fact multiple JSON objects in the file. If it's one massive object, then with ON_ERROR=CONTINUE you would simply get neither an error nor the record loaded.
If you know your JSON payload is smaller than 16 MB, then definitely try strip_outer_array = true. Also, if your JSON has a lot of nulls ("NULL") as values, use STRIP_NULL_VALUES = TRUE, as this will slim your payload as well. Hope that helps.

Exceeded max configured index size while indexing document while running LS Agent

In our project we have a LotusScript agent which is supposed to delete and re-create the FT index. We haven't figured out yet why the index cannot be updated automatically (although in our database options this is set to update the index daily), so we decided to do roughly the same thing with our own very simple agent. The agent looks like this and runs nightly:
Option Public
Option Declare

Dim s As NotesSession
Dim ndb As NotesDatabase

Sub Initialize
    Set s = New NotesSession
    Set ndb = s.CurrentDatabase
    Print("BEFORE REMOVING INDEXES")
    ' Drop the existing full-text index
    Call ndb.Removeftindex()
    Print("INDEXES HAVE BEEN REMOVED SUCCESSFULLY")
    ' Re-create the full-text index with the FTINDEX_ALL_BREAKS option
    Call ndb.Createftindex(FTINDEX_ALL_BREAKS, True)
    Print("INDEXES HAVE BEEN CREATED SUCCESSFULLY")
End Sub
In most cases it works very well, but sometimes, when somebody creates a document which exceeds 12 MB (we really don't know how that is possible), the agent fails to create the index (and the old index has already been deleted).
Error message is:
31.05.2018 03:01:25 Full Text Error (FTG): Exceeded max configured index
size while indexing document NT000BD992 in database index *path to FT file*.ft
My question is: how do we avoid this problem? We've already raised the 6 MB limit with the command SET CONFIG FTG_INDEX_LIMIT=12582912. Can we raise it even more? And in general, how do we solve the problem? Thanks in advance.
Raising FTG_INDEX_LIMIT is one option to avoid this error, yes. But it will impact server performance in two ways: FT index update processes will take more time and more memory.
In theory there is no maximum for this limit, but since the update processes take memory from the common heap, setting it too high can lead to out-of-memory/overheaping errors and a server crash.
You can try to exclude attachments from the index. I don't think anyone can put more than 1 MB of text in a single document, but users can attach big text files, and that will produce the error you are writing about.
P.S. And yes, I agree with Scott: why do you need such an agent anyway? The built-in FT indexing usually works fine.

Sitecore Lucene indexing - file not found exception using advanced database crawler

I'm having a problem with Sitecore/Lucene on our Content Management environment; we have two Content Delivery environments where this isn't a problem. I'm using the Advanced Database Crawler to index a number of items of defined templates. The index is pointing to the master database.
The index will remain 'stable' for a few hours or so, and then I will start to see this error appearing in the logs, as well as when I try to open a Searcher.
ManagedPoolThread #17 16:18:47 ERROR Could not update index entry. Action: 'Saved', Item: '{9D5C2EAC-AAA0-43E1-9F8D-885B16451D1A}'
Exception: System.IO.FileNotFoundException
Message: Could not find file 'C:\website\www\data\indexes\__customSearch\_f7.cfs'.
Source: Lucene.Net
at Lucene.Net.Index.SegmentInfos.FindSegmentsFile.Run()
at Sitecore.Search.Index.CreateReader()
at Sitecore.Search.Index.CreateSearcher(Boolean close)
at Sitecore.Search.IndexSearchContext.Initialize(ILuceneIndex index, Boolean close)
at Sitecore.Search.IndexDeleteContext..ctor(ILuceneIndex index)
at Sitecore.Search.Crawlers.DatabaseCrawler.DeleteItem(Item item)
at Sitecore.Search.Crawlers.DatabaseCrawler.UpdateItem(Item item)
at System.EventHandler.Invoke(Object sender, EventArgs e)
at Sitecore.Data.Managers.IndexingProvider.UpdateItem(HistoryEntry entry, Database database)
at Sitecore.Data.Managers.IndexingProvider.UpdateIndex(HistoryEntry entry, Database database)
From what I read this can be due to an update on the index while there is an open reader, so that when a merge operation happens the reader still has a reference to the deleted segment, or something to that effect (I'm not an expert on Lucene).
I have tried a few things with no success, including subclassing the Sitecore.Search.Index object and overriding CreateWriter(bool recreate) to change the merge scheduler/policy and tweak the merge factor. See below.
protected override IndexWriter CreateWriter(bool recreate)
{
    IndexWriter writer = base.CreateWriter(recreate);
    LogByteSizeMergePolicy policy = new LogByteSizeMergePolicy();
    policy.SetMergeFactor(20);
    policy.SetMaxMergeMB(10);
    writer.SetMergePolicy(policy);
    writer.SetMergeScheduler(new SerialMergeScheduler());
    return writer;
}
When I'm reading the index I call SearchManager.GetIndex(Index).CreateSearchContext().Searcher, and when I'm done getting the documents I need I call .Close(), which I thought would have been sufficient.
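In code, that read path looks roughly like this (a simplified sketch; the index name, field, and query are just placeholders):

// Simplified sketch of how the index is read; the index name, field and query are placeholders.
using Lucene.Net.Index;
using Lucene.Net.Search;
using Sitecore.Search;

public class IndexReadExample
{
    public void ReadIndex()
    {
        Index index = SearchManager.GetIndex("customSearch");
        IndexSearchContext context = index.CreateSearchContext();
        var searcher = context.Searcher;
        try
        {
            // Run a query and read the documents we need.
            Hits hits = searcher.Search(new TermQuery(new Term("_name", "home")));
            int count = hits.Length();
            // ... process the hits ...
        }
        finally
        {
            // Close the searcher when done - this is what I assumed was sufficient.
            searcher.Close();
        }
    }
}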
I was thinking I could perhaps try overriding CreateSearcher(bool close) as well, to ensure I'm opening a new reader each time, which I will give a go after this. I don't really know enough about how Sitecore handles Lucene and its readers/writers.
I also tried playing around with the UpdateInterval value in the web.config to see if that would help; alas, it didn't.
I would greatly appreciate input from anyone who a) knows of any situations in which this could occur, and b) has any advice or potential solutions, as I'm starting to bang my head against a rather large wall :)
We're running Sitecore 6.5 rev111123 with Lucene 2.3.
Thanks,
James.
It seems like Lucene freaks out when you try to re-index something that is already in the process of being indexed. To verify that, try the following:
Set the updateinterval of your index to a really high value (e.g. 8 hours).
Then stop w3wp.exe and delete the index.
After deleting the index, rebuild it in Sitecore and wait for the rebuild to finish.
Test again and see if the error still occurs.
If it no longer occurs, the cause is an updateinterval set too low, which lets your index (that is probably still being constructed) be overwritten with a new one (that won't be finished either), leaving your segments.gen file with the wrong segment information.
The segments.gen file tells your IndexReader which segments are part of your index and is recreated when the index is rebuilt.
That's why I suggest disabling the updates for a long period and rebuilding the index manually.

Compass/Lucene in clustered environment

I get the following error in a clustered environment where one node is indexing the objects and the other node is confused about the segments that are in its cache. The node never recovers by itself, even after a server restart. The node that's indexing might be merging and deleting segments, which the other node is not aware of. I did not touch the invalidateCacheInterval setting, and I added the compass.engine.globalCacheIntervalInvalidation property set to 500 ms. It didn't help.
This is happening while searching on one node and indexing on the other node.
Can someone help me resolve this issue? Maybe there is a way to ask Compass to reload the cache or start from scratch, without having to reindex all the objects?
org.compass.core.engine.SearchEngineException: Failed to search with query [+type:...)]; nested exception is org.apache.lucene.store.jdbc.JdbcStoreException: No entry for [_6ge.tis] table index_objects
org.apache.lucene.store.jdbc.JdbcStoreException: No entry for [_6ge.tis] table index_objects
at org.apache.lucene.store.jdbc.index.FetchOnBufferReadJdbcIndexInput$1.execute(FetchOnBufferReadJdbcIndexInput.java:68)
at org.apache.lucene.store.jdbc.support.JdbcTemplate.executeSelect(JdbcTemplate.java:112)
at org.apache.lucene.store.jdbc.index.FetchOnBufferReadJdbcIndexInput.refill(FetchOnBufferReadJdbcIndexInput.java:58)
at org.apache.lucene.store.ConfigurableBufferedIndexInput.readByte(ConfigurableBufferedIndexInput.java:27)
at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:78)
at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:127)
at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:158)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:250)
at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218)
at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:752)
at org.apache.lucene.index.MultiSegmentReader.docFreq(MultiSegmentReader.java:377)
at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:86)
at org.apache.lucene.search.Similarity.idf(Similarity.java:457)
at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java:44)
at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:146)
at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:185)
at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:360)
at org.apache.lucene.search.Query.weight(Query.java:95)
at org.apache.lucene.search.Hits.<init>(Hits.java:85)
at org.apache.lucene.search.Searcher.search(Searcher.java:61)
at org.compass.core.lucene.engine.transaction.support.AbstractTransactionProcessor.findByQuery(AbstractTransactionProcessor.java:146)
at org.compass.core.lucene.engine.transaction.support.AbstractSearchTransactionProcessor.performFind(AbstractSearchTransactionProcessor.java:59)
at org.compass.core.lucene.engine.transaction.search.SearchTransactionProcessor.find(SearchTransactionProcessor.java:50)
at org.compass.core.lucene.engine.LuceneSearchEngine.find(LuceneSearchEngine.java:352)
at org.compass.core.lucene.engine.LuceneSearchEngineQuery.hits(LuceneSearchEngineQuery.java:188)
at org.compass.core.impl.DefaultCompassQuery.hits(DefaultCompassQuery.java:199)

Drupal is looking for a field that no longer exists

Warning: Cannot modify header information - headers already sent by (output started at /home/sites/superallan.com/public_html/includes/common.inc:2561) in drupal_send_headers() (line 1040 of /home/sites/superallan.com/public_html/includes/bootstrap.inc).
PDOException: SQLSTATE[42S02]: Base table or view not found: 1146 Table 'web247-sa_admin.field_data_field_embedcode' doesn't exist: SELECT field_data_field_embedcode0.entity_type AS entity_type, field_data_field_embedcode0.entity_id AS entity_id, field_data_field_embedcode0.revision_id AS revision_id, field_data_field_embedcode0.bundle AS bundle FROM {field_data_field_embedcode} field_data_field_embedcode0 WHERE (field_data_field_embedcode0.deleted = :db_condition_placeholder_0) AND (field_data_field_embedcode0.bundle = :db_condition_placeholder_1) LIMIT 10 OFFSET 0; Array ( [:db_condition_placeholder_0] => 1 [:db_condition_placeholder_1] => blog ) in field_sql_storage_field_storage_query() (line 569 of /home/sites/superallan.com/public_html/modules/field/modules/field_sql_storage/field_sql_storage.module).
As I understand it, Drupal is looking for a data field that I deleted. I thought maybe it had become corrupted and Drupal couldn't find it to delete it properly. In phpMyAdmin it doesn't exist, so how can I get Drupal to recognize it's no longer there and stop showing this error at the bottom of every page?
You can see it on this page: http://superallan.com/404
Have you tried clearing the site cache, or uninstalling the module that provides the field? It looks like a reference to the field's SQL data is sticking around in your database, which is of course causing the error you posted.
This worked for me:
DELETE FROM field_config WHERE deleted = 1;
DELETE FROM field_config_instance WHERE deleted = 1;
I saw no adverse effects from removing the things that were already marked as deleted.
Source:
http://digcms.com/remove-field_deleted_data-drupal-database/