RavenDB: remove multiple versions of the same document

I have a database with the versioning bundle turned on, so the documents were saved like:
user/1/revision/1, user/1/revision/2, etc.
What I didn't expect is that a search returns all the versions of the same user (or of whatever other document I deal with).
I tried restoring this database to a new one with the versioning bundle turned on and then off, and I still get all the versions in search results.
My search looks like this:
session.Query<Entity>().Search(x => x.Name, query, options: SearchOptions.And, escapeQueryOptions: EscapeQueryOptions.AllowPostfixWildcard)
Maybe I should use some specific parameter to work with the latest document version only?
UPDATE:
What I did so far:
reinstalled RavenDB and installed it as a service (it was already running as a service; I just made sure I didn't break anything)
imported the data from the old database into a new database
deleted all the indexes related to the entity
I still get all the revisions in my search results. Also, my Raven.Server.config doesn't have anything related to bundles. My Raven build is 2750, which seems to be the latest recommended production release.
UPDATE 2: When I try to import data into the new database from the old dump, I get the following error:
Client side exception:
System.Exception: Server Error:
/bulk_docs
Raven.Abstractions.Exceptions.OperationVetoedException: PUT vetoed by Raven.Bundles.Versioning.Triggers.VersioningPutTrigger because: Modifying a historical revision is not allowed
at Raven.Database.DocumentDatabase.AssertPutOperationNotVetoed(String key, RavenJObject metadata, RavenJObject document, TransactionInformation transactionInformation)
at Raven.Database.DocumentDatabase.<>c__DisplayClass4b.b__43(IStorageActionsAccessor actions)
at Raven.Storage.Esent.TransactionalStorage.Batch(Action`1 action)
at Raven.Database.DocumentDatabase.Put(String key, Etag etag, RavenJObject document, RavenJObject metadata, TransactionInformation transactionInformation)
at Raven.Database.Extensions.CommandExtensions.Execute(ICommandData self, DocumentDatabase database, BatchResult batchResult)
at Raven.Database.DocumentDatabase.ProcessBatch(IList`1 commands)
at Raven.Database.DocumentDatabase.<>c__DisplayClass10c.b__108(IStorageActionsAccessor actions)
at Raven.Storage.Esent.TransactionalStorage.ExecuteBatch(Action`1 action, EsentTransactionContext transactionContext)
at Raven.Storage.Esent.TransactionalStorage.Batch(Action`1 action)
at Raven.Database.DocumentDatabase.Batch(IList`1 commands)
at Raven.Database.Server.Responders.DocumentBatch.Batch(IHttpContext context)
at Raven.Database.Server.HttpServer.DispatchRequest(IHttpContext ctx)
at Raven.Database.Server.HttpServer.HandleActualRequest(IHttpContext ctx)
Any ideas how to fix it?

Revisions shouldn't be indexed. As long as the versioning bundle is active on the database and the revision documents carry the metadata key Raven-Document-Revision-Status with the value Historical, they should be ignored by all indexes.
Check that the bundle is active on that database and that the metadata mentioned above exists.
This holds true for 2.0, 2.5 and, IIRC, 1.0 as well.
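If you want to verify that metadata from code rather than in the Studio, here is a minimal sketch (assuming the RavenDB 2.x client; the document id is taken from the question and the User type is hypothetical):
var revision = session.Load<User>("user/1/revision/1");
var metadata = session.Advanced.GetMetadataFor(revision);
var status = metadata.Value<string>("Raven-Document-Revision-Status");
// status == "Historical" means the versioning bundle created this revision,
// and the indexes should be skipping it.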

Related

RavenDB: Document Refresh feature does not run at or after the time specified by the @refresh flag

I need to mark documents as expired after some time, so I am trying to use the @refresh feature to re-run a subscription and compute my 'expired' flag. I know there is a 'Document expiration' feature, but that one removes the data, which I don't want.
I have turned on the Refresh feature in the settings and added a @refresh UTC datetime to the metadata of the required documents. For example, I manually added this document:
{
    "Name": "My data",
    "@metadata": {
        "@collection": "Testing",
        "@refresh": "2021-04-30T07:41:35.4845961Z"
    }
}
It looks like I am facing non-deterministic behavior: sometimes the refresh is processed, sometimes not. I tried different combinations of times, set both through code and via Raven Studio.
The refresh interval is set, but it still says "in less than a minute".
I am using:
Community license (Document Refresh is not mentioned there, but I don't see it mentioned for any other license either)
community license extensions
tried several versions of RavenDB with the same result (5.1.7 looked more promising, as it worked for some time but stopped after a while):
4.2.111 server/studio version in Docker on Windows 10
5.1.7 server/studio version
C# RavenDB.Client 5.1.6
I did not find a related issue in the bug tracker:
https://issues.hibernatingrhinos.com/issues/RavenDB?q=document%20refresh
Any ideas what to check or what might be the cause?
EDIT: After turning on logging to the console, I found an error entry. It looks like this:
RavendbProject, Raven.Server.Documents.Expiration.ExpiredDocumentsCleaner, Failed to refresh documents on RavendbProject which are older than 05/17/2021 09:48:47, EXCEPTION: System.NullReferenceException: Object reference not set to an instance of an object.
RavendbProject | at Sparrow.Server.ByteStringContext`1.From(String value, ByteStringType type, ByteString& str) in C:\Builds\RavenDB-Stable-5.1\51024\src\Sparrow.Server\ByteString.cs:line 1297
RavendbProject | at Raven.Server.Documents.DocumentPutAction.PutDocument(DocumentsOperationContext context, String id, String expectedChangeVector, BlittableJsonReaderObject document, Nullable`1 lastModifiedTicks, String changeVector, DocumentFlags flags, NonPersistentDocumentFlags nonPersistentFlags) in C:\Builds\RavenDB-Stable-5.1\51024\src\Raven.Server\Documents\DocumentPutAction.cs:line 190
Also worth mentioning: my document was stored in a cluster-wide transaction, and thus I can see the corresponding flag in one of my documents:
"@flags": "FromClusterTransaction",
My current suspicion is that one of these documents prevented the other documents from being refreshed. After deleting the cluster-transaction document, the other documents in the collection were refreshed.
The bug is related to documents that were added via a cluster transaction; the workaround for now would be to not use cluster transactions.
I have opened an issue on the bug tracker:
https://issues.hibernatingrhinos.com/issue/RavenDB-16710
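For reference, a minimal sketch of the workaround (assuming RavenDB.Client 5.x; the store variable and TestDoc type are hypothetical): set @refresh from code and store the document through a regular session, i.e. the default single-node transaction mode rather than a cluster-wide transaction:
using (var session = store.OpenSession())   // default TransactionMode, not cluster-wide
{
    var doc = new TestDoc { Name = "My data" };
    session.Store(doc);
    var metadata = session.Advanced.GetMetadataFor(doc);
    metadata["@refresh"] = DateTime.UtcNow.AddMinutes(5).ToString("o");   // ISO 8601 UTC timestamp
    session.SaveChanges();
}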

Migrating from Microsoft.Azure.Storage.Blob to Azure.Storage.Blobs - directory concepts missing

These are great guides for migrating between the different versions of the NuGet package:
https://github.com/Azure/azure-sdk-for-net/blob/Azure.Storage.Blobs_12.6.0/sdk/storage/Azure.Storage.Blobs/README.md
https://elcamino.cloud/articles/2020-03-30-azure-storage-blobs-net-sdk-v12-upgrade-guide-and-tips.html
However I am struggling to migrate the following concepts in my code:
// Return if a directory exists:
container.GetDirectoryReference(path).ListBlobs().Any();
where GetDirectoryReference is not understood and there appears to be no direct translation.
Also, the concept of a CloudBlobDirectory does not appear to have made it into Azure.Storage.Blobs e.g.
private static long GetDirectorySize(CloudBlobDirectory directoryBlob)
{
    long size = 0;
    foreach (var blobItem in directoryBlob.ListBlobs())
    {
        if (blobItem is CloudBlockBlob)
            size += ((CloudBlockBlob) blobItem).Properties.Length;
        if (blobItem is CloudBlobDirectory)
            size += GetDirectorySize((CloudBlobDirectory) blobItem);
    }
    return size;
}
where CloudBlobDirectory does not appear anywhere in the API.
There's no such thing as physical directories or folders in Azure Blob Storage. The "directories" you sometimes see are part of the blob name (e.g. folder1/folder2/file1.txt). The List Blobs request allows you to pass a prefix and a delimiter, which is what the Azure Portal and Azure Storage Explorer use to create a visualization of folders. For example, the prefix folder1/ with the delimiter / lets you see the content as if folder1 were opened.
That's exactly what happens in your code: GetDirectoryReference() adds a prefix, ListBlobs() fires the request, and Any() checks whether any items were returned.
For v12, the method that lets you do the same is GetBlobsByHierarchy (and its async version). In your particular case, where you only want to know whether any blobs exist under the "directory", GetBlobs with a prefix would also suffice.
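To illustrate, a minimal sketch against Azure.Storage.Blobs v12 (assuming a BlobContainerClient named container and a virtual directory path such as "folder1/"; the method names are from the v12 API, everything else is illustrative):
// "Does the directory exist?" becomes "is there at least one blob with this prefix?"
bool directoryExists = container.GetBlobs(prefix: path).Any();

// Rough equivalent of the old GetDirectorySize: GetBlobs returns a flat listing of
// every blob under the prefix, so no recursion is needed.
static long GetDirectorySize(BlobContainerClient container, string prefix)
{
    long size = 0;
    foreach (BlobItem blobItem in container.GetBlobs(prefix: prefix))
        size += blobItem.Properties.ContentLength ?? 0;
    return size;
}

// Hierarchical listing (prefix + delimiter), the closest match to the old "directory" view:
foreach (BlobHierarchyItem item in container.GetBlobsByHierarchy(prefix: path, delimiter: "/"))
{
    if (item.IsPrefix)
        Console.WriteLine($"sub-directory: {item.Prefix}");
    else
        Console.WriteLine($"blob: {item.Blob.Name} ({item.Blob.Properties.ContentLength} bytes)");
}
Note that GetBlobs(...).Any() only enumerates until the first item, so the existence check stays a single, cheap request.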

Unable to upgrade/import a dacpac file

ORIGINAL QUESTION:
I am trying to upgrade a blank database created in a test VM using a .dacpac file, but I get the following error message:
Error SQL72014: .Net SqlClient Data Provider: Msg 15401, Level 16, State 1, Line 1 Windows NT user or group 'SOURCE_DOMAIN\SOURCE SQL Readers' not found. Check the name again.
Error SQL72045: Script execution error. The executed script:
CREATE LOGIN [SOURCE_DOMAIN\SOURCE SQL Readers]
FROM WINDOWS WITH DEFAULT_LANGUAGE = [us_english];
(Microsoft.SqlServer.Dac)
------------------------------
Program Location:
at Microsoft.SqlServer.Dac.DeployOperation.ThrowIfErrorManagerHasErrors()
at Microsoft.SqlServer.Dac.DeployOperation.<>c__DisplayClass14.<>c__DisplayClass16.<CreatePlanExecutionOperation>b__13()
at Microsoft.Data.Tools.Schema.Sql.Dac.OperationLogger.Capture(Action action)
at Microsoft.SqlServer.Dac.DeployOperation.<>c__DisplayClass14.<CreatePlanExecutionOperation>b__12(Object operation, CancellationToken token)
at Microsoft.SqlServer.Dac.Operation.Microsoft.SqlServer.Dac.IOperation.Run(OperationContext context)
at Microsoft.SqlServer.Dac.ReportMessageOperation.Microsoft.SqlServer.Dac.IOperation.Run(OperationContext context)
at Microsoft.SqlServer.Dac.OperationExtension.CompositeOperation.Microsoft.SqlServer.Dac.IOperation.Run(OperationContext context)
at Microsoft.SqlServer.Dac.OperationExtension.CompositeOperation.Microsoft.SqlServer.Dac.IOperation.Run(OperationContext context)
at Microsoft.SqlServer.Dac.DeployOperation.Microsoft.SqlServer.Dac.IOperation.Run(OperationContext context)
at Microsoft.SqlServer.Dac.OperationExtension.Execute(IOperation operation, DacLoggingContext loggingContext, CancellationToken cancellationToken)
at Microsoft.SqlServer.Dac.DacServices.InternalDeploy(IPackageSource packageSource, Boolean isDacpac, String targetDatabaseName, DacDeployOptions options, CancellationToken cancellationToken, DacLoggingContext loggingContext, Action`3 reportPlanOperation, Boolean executePlan)
at Microsoft.SqlServer.Dac.DacServices.Deploy(DacPackage package, String targetDatabaseName, Boolean upgradeExisting, DacDeployOptions options, Nullable`1 cancellationToken)
at Microsoft.SqlServer.Management.Dac.DacWizard.UpgradeModel.RunAction()
at Microsoft.SqlServer.Management.Dac.DacWizard.ExecuteDacPage.backgroundWorker1_DoWork(Object sender, DoWorkEventArgs e)
at System.ComponentModel.BackgroundWorker.OnDoWork(DoWorkEventArgs e)
at System.ComponentModel.BackgroundWorker.WorkerThreadStart(Object argument)
I assume that user existed in the source but not in the destination. Will creating that user on the VM fix this issue, or will I need a different approach to re-create the schema from the source in the VM destination for testing purposes?
UPDATE TO QUESTION 1:
The .dacpac file is generated on a server which is on a totally different domain and it will not be possible for the test VM to ever be on the same domain. With that in mind, how do I get the .dacpac file to work on the test VM?
If you still have access to the source, you could generate the .dacpac again, this time ignoring the logins. Depending on which tool you use, you should have an option like "Include User Login Mapping".
The most robust option is Visual Studio; see "How to create DACPAC file?" by Kamil Nowinski:
Image source: https://sqlplayer.net/wp-content/uploads/2018/10/visual-studio-extract-dacpac-options.png
You could recreate the proper logins and users afterwards with your own SQL script.
Related: Using Publish Profiles to Deploy a DACPAC Database Without User Accounts
The solution to this problem lies in defining an appropriate publish profile for your DACPAC, which then instructs your chosen deployment tool (SqlPackage.exe, Visual Studio, or Azure DevOps) on how to carry out the deployment.
The profile is defined as an XML file and supports options such as:
ExcludeUsers
ExcludeLogins
ExcludeDatabaseRoles
By setting these options to True within the publish profile, creation or modification of these objects is skipped entirely during any database deployment, as sketched below.
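A fragment of such a publish profile might look like the following (a sketch using the option names listed above; verify the exact property names against the SqlPackage/publish profile documentation):
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <ExcludeUsers>True</ExcludeUsers>
    <ExcludeLogins>True</ExcludeLogins>
    <ExcludeDatabaseRoles>True</ExcludeDatabaseRoles>
  </PropertyGroup>
</Project>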
One more option is to use dbatools.io's Export-DbaDacPackage.
Key point here is:
$exportProperties = "/p:IgnorePermissions=True /p:IgnoreUserLoginMappings=True" # Ignore
and publish.xml:
...
<ExcludeLogins>True</ExcludeLogins>
<IgnorePermissions>True</IgnorePermissions>
<IgnoreLoginSids>True</IgnoreLoginSids>
<IgnoreRoleMembership>True</IgnoreRoleMembership>
Summary:
create a dacpac without logins
create a publish.xml file that will ignore permissions
Creating the user inside the VM is one way to solve this issue, but you will need to change 'SOURCE_DOMAIN' to the VM hostname, as the user will be part of the local user database.
Probably the best solution is to fix VM communication to the Domain Controller, so authentication will work and user accounts end up being actually visible within the VM.
Take a look at this:
This error usually occurs because of COMPATIBILITY_LEVEL.
I would recommend trying this query out:
ALTER DATABASE database_name
SET COMPATIBILITY_LEVEL = 130;
Hope it helps!
If a dacpac contains users or groups that aren’t on the domain where the dacpac is being deployed, then one way to deploy it is using the SqlPackage command line tool, as this allows you to explicitly list the object types you want to exclude.
To exclude users and groups, the PowerShell command would be something like this:
.\SqlPackage.exe `
/a:Publish `
/tsn:"(localdb)\mssqllocaldb" `
/tdn:YourDatabaseName `
/p:ExcludeObjectTypes="Users;RoleMembership;Logins;ServerRoles;ServerRoleMembership;Permissions" `
/sf:YourFile.dacpac
This command uses the following switches:
/a (Action): the action to run, in this case Publish
/tsn (TargetServerName): the name of the server to deploy to
/tdn (TargetDatabaseName): the name of the database to deploy to
/p (Properties): name value pair of action-specific properties, in this case:
ExcludeObjectTypes: a semicolon-delimited list of object types that should be ignored
/sf (SourceFile): the dacpac file to deploy
More details of the syntax for Publish (including a list of the object types that can be excluded) are available in the docs for the publish action.

Cannot view document in RavenDB Studio

When I try to view my document I get this error:
Client side exception:
System.InvalidOperationException: Document's property: "DocumentData" is too long to view in the studio (property length: 699.608, max allowed length: 500.000)
at Raven.Studio.Models.EditableDocumentModel.AssertNoPropertyBeyondSize(RavenJToken token, Int32 maxSize, String path)
at Raven.Studio.Models.EditableDocumentModel.AssertNoPropertyBeyondSize(RavenJToken token, Int32 maxSize, String path)
at Raven.Studio.Models.EditableDocumentModel.<LoadModelParameters>b__2a(DocumentAndNavigationInfo result)
at Raven.Studio.Infrastructure.InvocationExtensions.<>c__DisplayClass17`1.<>c__DisplayClass19.<ContinueOnSuccessInTheUIThread>b__16()
at AsyncCompatLibExtensions.<>c__DisplayClass55.<InvokeAsync>b__54()
I am saving a pdf in that field.
I want to be able to edit the other fields.
Is it possible for it to ignore the field that's too big?
Thanks!
Don't save large binary (or base64-encoded) data in the JSON document; that's a poor use of the database. Instead, you should consider one of these two options:
Option 1
Write the binary data to disk (or cloud storage) yourself.
Save a file path (or url) to it in your document.
Option 2
Use Raven's attachments feature. This is a separate area in the database meant specifically for storing binary files.
The advantage is that your binary documents are included in database backups, and if you like you can take advantage of features like my Indexed Attachments Bundle, or write your own custom bundles that use attachment triggers.
The disadvantage is that your database can grow very large. For this reason, many prefer option 1.
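For Option 2, a minimal sketch (assuming the RavenDB 2.x client; the store variable, attachment key, file path, and metadata entry are just examples) of storing the PDF as an attachment and keeping only its key on the document:
using (var pdfStream = File.OpenRead(@"C:\docs\report.pdf"))
{
    documentStore.DatabaseCommands.PutAttachment(
        "documents/1/pdf",                 // attachment key
        null,                              // etag: null = don't check for concurrency
        pdfStream,
        new RavenJObject { { "ContentType", "application/pdf" } });
}
// On the document itself, drop the huge DocumentData property and keep just the key,
// e.g. doc.DocumentDataAttachmentKey = "documents/1/pdf";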

RavenDB, RavenHQ and Appharbor - document size error with very first document

I have a completely empty RavenHQ database that's linked to my Appharbor application. The database is currently using 1.1 MB out of the 25 MB available on my bronze account. The database previously had records in it, but I have deleted them using "delete collection" in the management studio.
The very first time I call session.Store(myobject), and BEFORE I call .SaveChanges(), I get the following error.
System.InvalidOperationException: Url: "/docs/Raven/Hilo/AccItems"
Raven.Database.Exceptions.OperationVetoedException: PUT vetoed by Raven.Bundles.Quotas.Triggers.DatabaseSizeQoutaForDocumetsPutTrigger because: Database size is 45,347 KB, which is over the allowed quota of 25,600 KB. No more documents are allowed in.
Now, the document is definitely not that big, so I don't know what this error can mean, especially as I don't think I've even hit the database at that point since I haven't closed the session by calling SaveChanges(). Any ideas? Here's the code itself.
XDocument doc = XDocument.Parse(rawXml);
var accItems = ExtractItemsFromFeed(doc);
using (IDocumentSession session = _store.OpenSession())
{
    var dbItems = session.Query<AccItem>().ToList();
    foreach (var item in accItems)
    {
        var existingRecord = dbItems.SingleOrDefault(x => x.SourceId == item.SourceId);
        if (existingRecord == null)
        {
            session.Store(item);
            _logger.Info("Saved new item {0}.", item.ShortName);
        }
        else
        {
            existingRecord.ShortName = item.ShortName;
            _logger.Info("Updated item {0}.", item.ShortName);
        }
        session.SaveChanges();
    }
}
Any other comments about the style of this code would be most welcome, as I was unsure of the best way to approach the "update existing item or create if it isn't there" scenario.
The answer here was as follows.
RavenHQ support found that the database was indeed oversized, but it seemed that the size reported in the Appharbor-branded RavenHQ control panel was incorrect. I had filled up the database way over the limit with a previous faulty version of the code posted above, so the error message I received was actually correct.
Fixing this problem without paying to upgrade the database wasn't straightforward, as it's not possible to shrink the database. Since I also wasn't able to delete my single Appharbor/RavenHQ database or create another one, that left me with the choice of creating an entirely new Appharbor application or registering directly with RavenHQ for a new account. I chose the latter. The RavenHQ-branded control panel is slightly different from the Appharbor one, in that it can create and delete databases.
So to summarize: there doesn't seem to be any benefit to using RavenHQ as an add-on to Appharbor; you might as well go and get a proper free RavenHQ account.