Update Document with external object - ravendb

i have a database containing Song objects. The song class has > 30 properties.
My Music Tagging application is doing changes on a song on the file system.
It then does a lookup in the database using the filename.
Now i have a Song object, which i created in my Tagging application by reading the physical file and i have a Song object, which i have just retrieved from the database and which i want to update.
I thought i just could grab the ID from the database object, replace the database object with my local song object, set the saved id and store it.
But Raven claims that i am replacing the object with a different object.
Do i really need to copy every single property over, like this?
dbSong.Artist = songfromFilesystem.Artist;
dbSong.Album = songfromFileSystem.Album;
Or are there other possibilities.
thanks,
Helmut
Edit:
I was a bit too positive. The suggestion below works only in a test program.
When doing it in my original code i get following exception:
Attempted to associate a different object with id 'TrackDatas/3452'
This is produced by following code:
try
{
originalFileName = Util.EscapeDatabaseQuery(originalFileName);
// Lookup the track in the database
var dbTracks = _session.Advanced.DocumentQuery<TrackData, DefaultSearchIndex>().WhereEquals("Query", originalFileName).ToList();
if (dbTracks.Count > 0)
{
track.Id = dbTracks[0].Id;
_session.Store(track);
_session.SaveChanges();
}
}
catch (Exception ex)
{
log.Error("UpdateTrack: Error updating track in database {0}: {1}", ex.Message, ex.InnerException);
}
I am first looking up a song in the database and get a TrackData object in dbTracks.
The track object is also of type TrackData and i just put the ID from the object just retrieved and try to store it, which gives the above error.
I would think that the above message tells me that the objects are of different types, which they aren't.
The same error happens, if i use AutoMapper.
any idea?

You can do what you're trying: replace an existing object using just the ID. If it's not working, you might be doing something else wrong. (In which case, please show us your code.)
When it comes to updating existing objects in Raven, there are a few options:
Option 1: Just save the object using the same ID as an existing object:
var song = ... // load it from the file system or whatever
song.Id = "Songs/5"; // Set it to an existing song ID
DbSession.Store(song); // Overwrites the existing song
Option 2: Manually update the properties of the existing object.
var song = ...;
var existingSong = DbSession.Load<Song>("Songs/5");
existingSong.Artist = song.Artist;
existingSong.Album = song.Album;
Option 3: Dynamically update the existing object:
var song = ...;
var existingSong = DbSession.Load<Song>("Songs/5");
existingSong.CopyFrom(song);
Where you've got some code like this:
// Inside Song.cs
public virtual void CopyFrom(Song other)
{
var props = typeof(Song)
.GetProperties(System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance)
.Where(p => p.CanWrite);
foreach (var prop in props)
{
var source = prop.GetValue(other);
prop.SetValue(this, source);
}
}
If you find yourself having to do this often, use a library like AutoMapper.
Automapper can automatically copy one object to another with a single line of code.

Now that you've posted some code, I see 2 things:
First, is there a reason you're using the Advanced.DocumentQuery syntax?
// This is advanced query syntax. Is there a reason you're using it?
var dbTracks = _session.Advanced.DocumentQuery<TrackData, DefaultSearchIndex>().WhereEquals("Query", originalFileName).ToList();
Here's how I'd write your code using standard LINQ syntax:
var escapedFileName = Util.EscapeDatabaseQuery(originalFileName);
// Find the ID of the existing track in the database.
var existingTrackId = _session.Query<TrackData, DefaultSearchIndex>()
.Where(t => t.Query == escapedFileName)
.Select(t => t.Id);
if (existingTrackId != null)
{
track.Id = existingTrackId;
_session.Store(track);
_session.SaveChanges();
}
Finally, #2: what is track? Was it loaded via session.Load or session.Query? If so, that's not going to work, and it's causing your problem. If track is loaded from the database, you'll need to create a new object and save that:
var escapedFileName = Util.EscapeDatabaseQuery(originalFileName);
// Find the ID of the existing track in the database.
var existingTrackId = _session.Query<TrackData, DefaultSearchIndex>()
.Where(t => t.Query == escapedFileName)
.Select(t => t.Id);
if (existingTrackId != null)
{
var newTrack = new Track(...);
newTrack.Id = existingTrackId;
_session.Store(newTrack);
_session.SaveChanges();
}

This means you already have a different object in the session with the same id. The fix for me was to use a new session.

Related

Why does SQLite in memory returns different result when using AsNoTracking?

I was writing tests using SQLite in-memory database with XUnit and ASP.NET Core 3.1 and found strange behavior.
Lets say that we have User model and we want to change property IsActive to false:
var u = new User {Id = Guid.NewGuid(), IsActive = true};
_db.Users.Add(u);
_db.SaveChanges();
u.IsActive = false;
// Returns false
var isActive = _db.Users.Single(x => x.Id == u.Id).IsActive;
// Returns true
var isActiveNoTracking = _db.Users.AsNoTracking().Single(x => x.Id == u.Id).IsActive;
// Fails.
Assert.Equal(isActive, isActiveNoTracking);
I get different result depending if AsNoTracking() is called or not. Why is this happening? Isn't AsNoTracking() supposed to stop tracking changes made on fetched object, not to mess with data that was already changed?
If I call SaveChanges() after changing the property then it is all good (as expected):
var u = new User {Id = Guid.NewGuid(), IsActive = true};
_db.Users.Add(u);
_db.SaveChanges();
u.IsActive = false;
_db.SaveChanges();
// Returns false
var isActive = _db.Users.Single(x => x.Id == u.Id).IsActive;
// Returns false
var isActiveNoTracking = _db.Users.AsNoTracking().Single(x => x.Id == u.Id).IsActive;
// Success.
Assert.Equal(isActive, isActiveNoTracking);
So I am confused, I'm not sure when SQLite in-memory actually commits changes. Sometimes you can fetch changes from db without calling SaveChanges() but sometimes you cannot.
Here is code related to db
public class SqliteInMemoryAppDbContext : AppDbContext
{
public SqliteInMemoryAppDbContext(IConfiguration configuration) : base(configuration)
{
}
protected override void OnConfiguring(DbContextOptionsBuilder options)
{
var connection = new SqliteConnection("DataSource=:memory:");
connection.Open();
options.UseSqlite(connection);
}
}
// I create db context for each test like this and dispose it after each test.
var _db = new SqliteInMemoryAppDbContext(null);
_db.Database.EnsureDeleted();
_db.Database.EnsureCreated();
So I am confused, I'm not sure when SQLite in-memory actually commits changes. Sometimes you can fetch changes from db without calling SaveChanges() but sometimes you cannot.
This is impossible. In order for something to be saved to the DB you need to call SaveChanges. What happens here is that you see local objects and you assume that they are stored in your DB. I generally suggest that you use a a DB query tool to learn how it works because it can be difficult at first.
Entity Framework has some local objects that it stores. For example at your first query.
// returns true, because it checks the db due to no tracking
_db.Users.AsNoTracking().Where(x => x.IsActive).OrderBy(x=>x.Username).ToList()[0].IsActive
// returns false, it finds the local reference
_db.Users.Where(x => x.IsActive).OrderBy(x=>x.Username).ToList()[0].IsActive
As you can see from the comments above it has different behavior based on the commands. It's not about when changes are saved to the db. This happens only if you call SaveChanges. What you are confused is for when the 'queries' you write with EF look at the DB or locally.
Generally for SQL at least I like to work with SQL profiler to see what queries EF sends to the Database. For example in your case you will have a query with where and order by send to the db.
EDIT:
About how to understand when the db is called or not i suggest reading here.
To summarize AsNoTracking always creates the new entity which means that it will look in the db for it. Instead the other commands in your example first look locally for the object.

Renaming models in database with existing data when using RavenDb

Is there any 'easy' way to rename models in RavenDb when the database already has existing data? I have various models which were originally created in another language, and now I would like to rename them to English as the codebase is becoming quite unmaintainable. If I just rename them, then the data won't be loaded because the properties don't match anymore.
I would like the system to automatically do it on first load. Is there any best way how to approach this? My solution would be:
Check if a document exists to determine if the upgrade has been done or not
If upgrade has not been done, execute patch scripts to update fields
Update document to know that the upgrade has been done
I'd recommend you create new documents from the old documents.
This can be done pretty easily using patching via docStore.UpdateByIndex.
Suppose I had an old type name, Foo, and wanted to rename it to the new type name, Bar. And I wanted all the IDs to change from Foos/123 to Bars/123.
It would look something like this:
var patchScript = #"
// Copy all the properties from the old document
var newDoc = {};
for (var prop in this) {
if (prop !== '#metadata') {
newDoc[prop] = this[prop];
}
}
// Create the metadata.
var meta = {};
meta['Raven-Entity-Name'] = newCollection;
meta['Raven-Clr-Type'] = newType;
// Store the new document.
var newId = __document_id.replace(oldCollection, newCollection);
PutDocument(newId, newDoc, meta);
";
var oldCollection = "Foos";
var newCollection = "Bars";
var newType = "KarlCassar.Bar, KarlCassar"; // Where KarlCassar is your assembly name.
var query = new IndexQuery { Query = $"Tag:{oldCollection}" };
var options = new BulkOperationOptions { AllowStale = false };
var patch = new ScriptedPatchRequest
{
Script = patchScript,
Values = new Dictionary<string, object>
{
{ nameof(oldCollection), oldCollection },
{ nameof(newCollection), newCollection },
{ nameof(newType), newType }
}
};
var patchOperation = docStore.DatabaseCommands.UpdateByIndex("Raven/DocumentsByEntityName", query, patch, options);
patchOperation.WaitForCompletion();
Run that code once at startup, and then your app will be able to work with the new name entities. Your old entities are still around - those can be safely deleted via the Studio.

Google diff-match-patch : How to unpatch to get Original String?

I am using Google diff-match-patch JAVA plugin to create patch between two JSON strings and storing the patch to database.
diff_match_patch dmp = new diff_match_patch();
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB
Now is there any way to use this patch to re-create the originalString by passing the latestString?
I google about this and found this very old comment # Google diff-match-patch Wiki saying,
Unpatching can be done by just looping through the diff, swapping
DIFF_INSERT with DIFF_DELETE, then applying the patch.
But i did not find any useful code that demonstrates this. How could i achieve this with my existing code ? Any pointers or code reference would be appreciated.
Edit:
The problem i am facing is, in the front-end i am showing a revisions module that shows all the transactions of a particular fragment (take for example an employee details), like which user has updated what details etc. Now i am recreating the fragment JSON by reverse applying each patch to get the current transaction data and show it as a table (using http://marianoguerra.github.io/json.human.js/). But some JSON data are not valid JSON and I am getting JSON.parse error.
I was looking to do something similar (in C#) and what is working for me with a relatively simple object is the patch_apply method. This use case seems somewhat missing from the documentation, so I'm answering here. Code is C# but the API is cross language:
static void Main(string[] args)
{
var dmp = new diff_match_patch();
string v1 = "My Json Object;
string v2 = "My Mutated Json Object"
var v2ToV1Patch = dmp.patch_make(v2, v1);
var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch); // Persist text to db
string v3 = "Latest version of JSON object;
var v3ToV2Patch = dmp.patch_make(v3, v2);
var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch); // Persist text to db
// Time to re-hydrate the objects
var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // .get(0) in Java I think
var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString();
}
I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved. As the audited objects have become more complex the storage requirements have increased dramatically. I haven't yet applied this to the complex large objects, but it is possible to check if the patch was successful by checking the second object in the array returned by the patch_apply method. This is an array of boolean values, all of which should be true if the patch worked correctly. You could write some code to check this, which would help check if the object can be successfully re-hydrated from the JSON rather than just getting a parsing error. My prototype C# method looks like this:
private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
patchedString = patchResult[0] as string;
var successArray = patchResult[1] as bool[];
foreach (var b in successArray)
{
if (!b)
return false;
}
return true;
}

given a list of objects using C# push them to ravendb without knowing which ones already exist

Given 1000 documents with a complex data structure. for e.g. a Car class that has three properties, Make and Model and one Id property.
What is the most efficient way in C# to push these documents to raven db (preferably in a batch) without having to query the raven collection individually to find which to update and which to insert. At the moment I have to going like so. Which is totally inefficient.
note : _session is a wrapper on the IDocumentSession where Commit calls SaveChanges and Add calls Store.
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
var page = 0;
const int total = 30;
do
{
var paged = sales.Skip(page*total).Take(total);
if (!paged.Any()) return;
foreach (var sale in paged)
{
var current = sale;
var existing = _session.Query<Sale>().FirstOrDefault(s => s.Id == current.Id);
if (existing != null)
existing = current;
else
_session.Add(current);
}
_session.Commit();
page++;
} while (true);
}
Your session code doesn't seem to track with the RavenDB api (we don't have Add or Commit).
Here is how you do this in RavenDB
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
sales.ForEach(session.Store);
session.SaveChanges();
}
Your code sample doesn't work at all. The main problem is that you cannot just switch out the references and expect RavenDB to recognize that:
if (existing != null)
existing = current;
Instead you have to update each property one-by-one:
existing.Model = current.Model;
existing.Make = current.Model;
This is the way you can facilitate change-tracking in RavenDB and many other frameworks (e.g. NHibernate). If you want to avoid writing this uinteresting piece of code I recommend to use AutoMapper:
existing = Mapper.Map<Sale>(current, existing);
Another problem with your code is that you use Session.Query where you should use Session.Load. Remember: If you query for a document by its id, you will always want to use Load!
The main difference is that one uses the local cache and the other not (the same applies to the equivalent NHibernate methods).
Ok, so now I can answer your question:
If I understand you correctly you want to save a bunch of Sale-instances to your database while they should either be added if they didn't exist or updated if they existed. Right?
One way is to correct your sample code with the hints above and let it work. However that will issue one unnecessary request (Session.Load(existingId)) for each iteration. You can easily avoid that if you setup an index that selects all the Ids of all documents inside your Sales-collection. Before you then loop through your items you can load all the existing Ids.
However, I would like to know what you actually want to do. What is your domain/use-case?
This is what works for me right now. Note: The InjectFrom method comes from Omu.ValueInjecter (nuget package)
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
var ids = sales.Select(i => i.Id);
var existingSales = _ravenSession.Load<Sale>(ids);
existingSales.ForEach(s => s.InjectFrom(sales.Single(i => i.Id == s.Id)));
var existingIds = existingSales.Select(i => i.Id);
var nonExistingSales = sales.Where(i => !existingIds.Any(x => x == i.Id));
nonExistingSales.ForEach(i => _ravenSession.Store(i));
_ravenSession.SaveChanges();
}

LINQ SQL Attach, Update Check set to Never, but still Concurrency conflicts

In the dbml designer I've set Update Check to Never on all properties. But i still get an exception when doing Attach: "An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported." This approach seems to have worked for others on here, but there must be something I've missed.
using(TheDataContext dc = new TheDataContext())
{
test = dc.Members.FirstOrDefault(m => m.fltId == 1);
}
test.Name = "test2";
using(TheDataContext dc = new TheDataContext())
{
dc.Members.Attach(test, true);
dc.SubmitChanges();
}
The error message says exactly what is going wrong: You are trying to attach an object that has been loaded from another DataContext, in your case from another instance of the DataContext. Dont dispose your DataContext (at the end of the using statement it gets disposed) before you change values and submit the changes. This should work (all in one using statement). I just saw you want to attach the object again to the members collection, but it is already in there. No need to do that, this should work just as well:
using(TheDataContext dc = new TheDataContext())
{
var test = dc.Members.FirstOrDefault(m => m.fltId == 1);
test.Name = "test2";
dc.SubmitChanges();
}
Just change the value and submit the changes.
Latest Update:
(Removed all previous 3 updates)
My previous solution (removed it again from this post), found here is dangerous. I just read this on a MSDN article:
"Only call the Attach methods on new
or deserialized entities. The only way
for an entity to be detached from its
original data context is for it to be
serialized. If you try to attach an
undetached entity to a new data
context, and that entity still has
deferred loaders from its previous
data context, LINQ to SQL will thrown
an exception. An entity with deferred
loaders from two different data
contexts could cause unwanted results
when you perform insert, update, and
delete operations on that entity. For
more information about deferred
loaders, see Deferred versus Immediate
Loading (LINQ to SQL)."
Use this instead:
// Get the object the first time by some id
using(TheDataContext dc = new TheDataContext())
{
test = dc.Members.FirstOrDefault(m => m.fltId == 1);
}
// Somewhere else in the program
test.Name = "test2";
// Again somewhere else
using(TheDataContext dc = new TheDataContext())
{
// Get the db row with the id of the 'test' object
Member modifiedMember = new Member()
{
Id = test.Id,
Name = test.Name,
Field2 = test.Field2,
Field3 = test.Field3,
Field4 = test.Field4
};
dc.Members.Attach(modifiedMember, true);
dc.SubmitChanges();
}
After having copied the object, all references are detached, and all event handlers (deferred loading from db) are not connected to the new object. Just the value fields are copied to the new object, that can now be savely attached to the members table. Additionally you do not have to query the db for a second time with this solution.
It is possible to attach entities from another datacontext.
The only thing that needs to be added to code in the first post is this:
dc.DeferredLoadingEnabled = false
But this is a drawback since deferred loading is very useful. I read somewhere on this page that another solution would be to set the Update Check on all properties to Never. This text says the same: http://complexitykills.blogspot.com/2008/03/disconnected-linq-to-sql-tips-part-1.html
But I can't get it to work even after setting the Update Check to Never.
This is a function in my Repository class which I use to update entities
protected void Attach(TEntity entity)
{
try
{
_dataContext.GetTable<TEntity>().Attach(entity);
_dataContext.Refresh(RefreshMode.KeepCurrentValues, entity);
}
catch (DuplicateKeyException ex) //Data context knows about this entity so just update values
{
_dataContext.Refresh(RefreshMode.KeepCurrentValues, entity);
}
}
Where TEntity is your DB Class and depending on you setup you might just want to do
_dataContext.Attach(entity);