Strategy for generating a unique identifier for a document in RavenDB

Say I have something like a support ticket system (simplified as an example). It has many users and organizations. Each user can be a member of several organizations, but the typical case is one org => many users, and most of them belong only to this organization. Each organization has a "tag" which is used to construct "ticket numbers" for that organization. Let's say we have an org called StackExchange that wants the tag SES.
So if I open the first ticket of today, I want it to be SES140407-01. The next is SES140407-02 and so on. Doesn't have to be two digits after the dash.
How can I make sure this is generated in a way that makes sure it is 100% unique across the organization (no orgs will have the same tag)?
Note: This does not have to be the document ID in the database - that will probably just be a Guid or similar. This is just a ticket reference - kinda like a slug - that will appear in related emails etc. So it has to be unique, and I would prefer if we didn't "waste" the sequential case numbers hilo style.
Is there a practical way to ensure I get a unique ticket number even if two or more people report a new one at almost the same time?
EDIT: Each Organization is a document in RavenDB, and can easily hold a property like LastIssuedTicketId. My challenge is basically to find the best way to read this field, generate a new one, and store this back in a way that is "race condition safe".
Another edit: To be clear - I intend to generate the ticket ID in my own software. What I am looking for is a way to ask RavenDB "what was the last ticket number", and then when I generate the next one after that, "am I the only one using this?" - so that I give my ticket a unique case id, not necessarily related to what RavenDB considers the document id.

For that I use a generic sequence generator written for RavenDB:
public class SequenceGenerator
{
    private static readonly object Lock = new object();
    private readonly IDocumentStore _docStore;

    public SequenceGenerator(IDocumentStore docStore)
    {
        _docStore = docStore;
    }

    public int GetNextSequenceNumber(string sequenceKey)
    {
        lock (Lock)
        {
            using (new TransactionScope(TransactionScopeOption.Suppress))
            {
                while (true)
                {
                    try
                    {
                        var document = GetDocument(sequenceKey);
                        if (document == null)
                        {
                            PutDocument(new JsonDocument
                            {
                                // sending an empty etag means: ensure that the document does NOT exist yet
                                Etag = Etag.Empty,
                                Metadata = new RavenJObject(),
                                DataAsJson = RavenJObject.FromObject(new { Current = 0 }),
                                Key = sequenceKey
                            });
                            return 0;
                        }
                        var current = document.DataAsJson.Value<int>("Current");
                        current++;
                        document.DataAsJson["Current"] = current;
                        PutDocument(document);
                        return current;
                    }
                    catch (ConcurrencyException)
                    {
                        // expected, we need to retry
                    }
                }
            }
        }
    }

    private void PutDocument(JsonDocument document)
    {
        _docStore.DatabaseCommands.Put(
            document.Key,
            document.Etag,
            document.DataAsJson,
            document.Metadata);
    }

    private JsonDocument GetDocument(string key)
    {
        return _docStore.DatabaseCommands.Get(key);
    }
}
It generates an incremental unique sequence based on sequenceKey. Uniqueness is guaranteed by Raven's optimistic concurrency based on the Etag: each sequence has its own document, which we update when we generate a new sequence number. There is also a lock, which reduces extra DB calls when several threads execute at the same moment in the same process (AppDomain).
For your case you can use it this way:
var sequenceKey = string.Format("{0}{1:yyMMdd}", yourCompanyPrefix, DateTime.Now);
var nextSequenceNumber = new SequenceGenerator(yourDocStore).GetNextSequenceNumber(sequenceKey);
var nextSequenceKey = string.Format("{0}-{1:00}", sequenceKey, nextSequenceNumber);
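If you would rather keep the counter on the Organization document itself (as the question's edit suggests), the same optimistic-concurrency pattern works at the session level. A minimal sketch, assuming an Organization class with Tag and LastIssuedTicketId properties (names taken from the question; note this counter never resets, so for per-day numbers like SES140407-01 you would still key a counter by date as above):
public string IssueTicketNumber(IDocumentStore store, string orgId)
{
    while (true)
    {
        try
        {
            using (var session = store.OpenSession())
            {
                // Fail with a ConcurrencyException if someone else
                // updated the organization since we loaded it.
                session.Advanced.UseOptimisticConcurrency = true;

                var org = session.Load<Organization>(orgId);
                org.LastIssuedTicketId++;
                session.SaveChanges();

                // e.g. "SES140407-57"
                return string.Format("{0}{1:yyMMdd}-{2:00}",
                    org.Tag, DateTime.Now, org.LastIssuedTicketId);
            }
        }
        catch (ConcurrencyException)
        {
            // another writer won the race - reload and try again
        }
    }
}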


A confusing situation on my way to understand DDD

Thank you in advance for your help and attention!
My project is dedicated only for learning purposes and I'm totally confused with DDD and have the following situation:
There is the ubiquitous language of my domain where I have users and documents. It says the following:
- A user can create a document. One of the main purposes of my project is to give users the ability to create different documents. I mean that documents cannot exist without users, so I think that the process of document creation belongs to my domain.
- A user can send a document for approval. This is one more thing that belongs to the domain. The approval process is one of the most important parts of the project. It has steps that other users must confirm.
- A user can approve a step of approval process.
- A user can reject a step of approval process.
That's enough to understand and answer my question:
Is it normal that a User can contain such methods as: CreateDocument(params), SendDocumentForApproval(docId), ApproveApprovalStepOfDocument(stepId)?
I'm confused by it because it looks a bit strange in code.
For example, for the document creation process we have something like this:
public async Task<bool> CreateDocumentCommandHandler(CreateDocumentCommand command)
{
    // We have our injected repositories
    User user = await _userRepository.UserOfId(command.UserId);
    Document document = user.CreateDocument(/* params for the document */);
    _documentRepository.Add(document);
    // It raises the event before it makes a commit to the database.
    // It gets the event from an entity; the entity keeps it as a readonly collection.
    // Here it raises DocumentCreatedEvent. This event contains logic which concerns
    // creating some additional entities for the document and logging actions.
    await _documentRepository.UnitOfWork.SaveEntitiesAsync();
}
The approval process:
// The first try at modeling this process:
public async Task<bool> SendDocumentForApprovalCommandHandler(SendDocumentForApprovalCommand command)
{
    // We have our injected repositories
    User user = await _userRepository.UserOfId(command.UserId);
    // Here I have some problems.
    // Is it okay that the method returns the document?
    // The method that is placed inside the User has this logic:
    //
    // public Document SendDocumentForApproval(int docId)
    // {
    //     Document document = this.GetDocument(docId);
    //
    //     // Inside this method ChangedStatusToApproving is created
    //     document.SetStatusToApproving();
    //     return document;
    // }
    Document document = user.SendDocumentForApproval(command.DocId);
    _documentRepository.Update(document);
    // It raises the event before it makes a commit to the database.
    // It gets the event from an entity; the entity keeps it as a readonly collection.
    // Here it raises ChangedStatusToApproving. This event contains logic which concerns
    // creating some additional entities for the document and logging actions.
    await _documentRepository.UnitOfWork.SaveEntitiesAsync();
}
// Is it okay to do something like the command handler above?
// The second one:
public async Task<bool> SendDocumentForApprovalCommandHandler(SendDocumentForApprovalCommand command)
{
    // We have our injected repositories
    User user = await _userRepository.UserOfId(command.UserId);
    // The same as in the previous method,
    // but here I don't want to put the logic about changing the status of the document inside it.
    Document document = user.SendDocumentForApproval(command.DocId);
    // I see that this breaks the method above (SendDocumentForApproval).
    // Now it doesn't mean anything for our domain, does it?
    // It is only a getter like User.GetDocument, or we could even do it
    // using the repository - documentRepository.DocumentOfId(docId)
    document.SetStatusToApproving();
    _documentRepository.Update(document);
    await _documentRepository.UnitOfWork.SaveEntitiesAsync();
}
// So, I think the first one is better, isn't it? It follows the ubiquitous language.
// And here is the final question: why can't I do it like this:
public async Task<bool> SendDocumentForApprovalCommandHandler(SendDocumentForApprovalCommand command)
{
    // Here we don't want to use the userRepository. We don't need it at all.
    // As a consequence we also don't need a user entity.
    // Everything we need is:
    Document document = _documentRepository.DocOfId(command.DocId);
    document.ForApproval();
    _documentRepository.Update(document);
    await _documentRepository.UnitOfWork.SaveEntitiesAsync();
}
// I think that the last approach breaks the ubiquitous language and we're on our way to an anemic model.
// But here we have only two queries to the database because we don't need a user.
// Which of the approaches is better? Why? How can I do it more correctly if I want to apply DDD?
I want to explain my thoughts in more detail.
Let's have a look at the user. Users manage documents; a document cannot exist without a user. Does that mean the User is an aggregate root through which we need to create, update, and delete its aggregates?
The Document is also an aggregate root, because it contains an approval process; the ApprovalProcess cannot exist without the document.
Does it mean that I need to do something like this:
public async Task<bool> SendDocumentForApprovalCommandHandler(SendDocumentForApprovalCommand command)
{
    Document document = _documentRepository.DocumentOfId(command.DocId);
    document.SendForApproval();
    _documentRepository.SaveChangesAsync(); // Raises a domain event - SentDocumentForApprovalEvent
}

// Here we have a handler for the event SentDocumentForApprovalEvent
public async Task SentDocumentForApprovalEventHandler(SentDocumentForApprovalEvent sentDocumentForApprovalEvent)
{
    // Now I want to create an approval process for the document.
    // Can I do the next thing:
    ApprovalProcess process = new ApprovalProcess(sentDocumentForApprovalEvent.DocId);
    _approvalProcessRepository.Add(process);
    _approvalProcessRepository.SaveEntitiesAsync(); // Raises a domain event - InitiatedApprovalProcessEvent

    // Or should I create the approval process through Document?
    // Looks terrible, because we need to call the repository and a static method:
    ApprovalProcess process = Document.InitiateApprovalProcess(sentDocumentForApprovalEvent.DocId); // Static method
    _approvalProcessRepository.Add(process);
    _approvalProcessRepository.SaveEntitiesAsync();
    // log
}

// Here we have a handler for the event InitiatedApprovalProcessEvent
public async Task InitiatedApprovalProcessEventHandler(InitiatedApprovalProcessEvent initiatedApprovalProcessEvent)
{
    // The same question as with the handler above:
    // should I create steps through the approval process, or just with the help of the constructor of the step?
    // log
}
Thank you so much and sorry for my terrible English!
Best regards
Is it normal that a User can contain such methods as: CreateDocument(params), SendDocumentForApproval(docId), ApproveApprovalStepOfDocument(stepId)?
In most domain models, the method belongs with the entity that manages the state that is going to change.
Document document = user.SendDocumentForApproval(command.DocId);
_documentRepository.Update(document);
The fact that your sample is updating the document repository here is a big hint that it is the document that is changing, and therefore we would normally expect to see SendDocumentForApproval as a method on the document.
document.SendDocumentForApproval(command.UserId)
_documentRepository.Update(document);
(Yes, the code doesn't read like written or spoken English.)
When creating a new document... creation patterns are weird. Udi Dahan suggests that there should always be some entity in your domain model that is responsible for creating the other entities, but I'm not convinced that the result is actually easier to work with in the long term.
How can we model the approval business process
General answer: business processes are protocols, which is to say that you can normally model them as a state machine. Here's the state we are in right now, here is some new information from the outside world, compute the consequences.
(Often, the data model for a process will just look like a history of events; the domain model's job is to then take the new information and compute the right events to store in the history. You don't have to do it that way, but there are interesting possibilities available when you can).
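To make that concrete, here is a minimal sketch of the approval flow as a state machine in C#. The state names and event types are invented for illustration; only the idea of validating the current state before computing the consequence comes from the answer above.
using System;
using System.Collections.Generic;

public record SentForApprovalEvent(Guid UserId);
public record ApprovedEvent(Guid UserId);

public enum DocumentStatus { Draft, Approving, Approved, Rejected }

public class Document
{
    public DocumentStatus Status { get; private set; } = DocumentStatus.Draft;

    private readonly List<object> _events = new List<object>();
    public IReadOnlyCollection<object> Events => _events;

    // Each transition checks the current state, moves to the next state,
    // and records a domain event describing what just happened.
    public void SendForApproval(Guid userId)
    {
        if (Status != DocumentStatus.Draft)
            throw new InvalidOperationException("Only a draft can be sent for approval.");
        Status = DocumentStatus.Approving;
        _events.Add(new SentForApprovalEvent(userId));
    }

    public void Approve(Guid userId)
    {
        if (Status != DocumentStatus.Approving)
            throw new InvalidOperationException("Document is not awaiting approval.");
        Status = DocumentStatus.Approved;
        _events.Add(new ApprovedEvent(userId));
    }
}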
You are headed in the right direction: User and Document are both aggregates, as they are created in separate transactions. When it comes to who references whom, the IDDD principle of scalability says that aggregates should refer to other aggregates only by their IDs.
Sticking to the ubiquitous language, your code should look something like this:
class User {
    private UserId id;
    private String name;

    User(String name) {
        this.id = new UserId();
        this.name = name;
    }

    Document createDocument(String name) {
        Document document = new Document(name);
        document.createdBy(this.id);
        return document;
    }

    Document approve(Document document) {
        document.approved();
        return document;
    }
}

class Document {
    private DocumentId id;
    private String name;
    private UserId userId;
    private Boolean isApproved;

    Document(String name) {
        this.id = new DocumentId();
        this.name = name;
    }

    void createdBy(UserId userId) {
        this.userId = userId;
    }

    void approved() {
        this.isApproved = true;
    }
}

// User creation
User user = new User("Someone");
userRepository.save(user);

// Document creation
User user = userRepository.find(new UserId("some-id"));
Document document = user.createDocument("important-document");
documentRepository.save(document);

// Approval
User user = userRepository.find(new UserId("some-id"));
Document document = documentRepository.find(new DocumentId("some-id"));
document = user.approve(document);
I would highly recommend reading Vaughn Vernon's three-part paper series on effective aggregate design.

Adding key value to RavenDB document based on another document's values (For all documents in large collection)

I am new to RavenDB and could really use some help.
I have a collection of ~20M documents, and I need to add a key to each document. The challenge is that the value of the key needs to be derived from another document.
For instance, given the following document:
{
    "Name": "001A",
    "Date": "09-09-2013T00:00:00.0000000",
    "Related": [
        "002B",
        "003B"
    ]
}
The goal is to add a key that holds the dates for the related documents, i.e. 002B and 003B, by looking up the related documents in the collection and returning their date. E.g.:
{
    "Name": "001A",
    "Date": "09-09-2013T00:00:00.0000000",
    "Related": [
        "002B",
        "003B"
    ],
    "RelatedDates": [
        "08-10-2013T00:00:00.0000000",
        "08-15-2013T00:00:00.0000000"
    ]
}
I realize that I'm trying to treat the collection somewhat like a relational database, but this is the form that my data is in to begin with. I would prefer not to put everything into a relational dataset first in order to structure the data for RavenDB.
I first tried doing this on the client side, by paging through the collection and updating the records. However, I quickly reached the maximum number of requests per session.
I then tried patching on the server side with JavaScript, but I'm not sure if this is possible.
At this point I would greatly appreciate some strategic guidance on the right way to approach this problem, as well as, more tactical guidance on how to implement it.
The recommended way of doing this is via a console application that loops through all your records, similar to what you have already done, but in a way that pages the data so you don't hit the maximum number of requests per session.
See this example from the RavenDB source code example application. You need to do something like this:
using (var store = new DocumentStore { ConnectionStringName = "RavenDB" }.Initialize())
{
    int start = 0;
    while (true)
    {
        using (var session = store.OpenSession())
        {
            var posts = session.Query<Post>()
                .OrderBy(x => x.CreatedAt)
                .Include(x => x.CommentsId)
                .Skip(start)
                .Take(128)
                .ToList();
            if (posts.Count == 0)
                break;
            foreach (var post in posts)
            {
                session.Load<PostComments>(post.CommentsId).Post = new PostComments.PostReference
                {
                    Id = post.Id,
                    PublishAt = post.PublishAt
                };
            }
            session.SaveChanges();
            start += posts.Count;
            Console.WriteLine("Migrated {0}", start);
        }
    }
}
I've done this sort of thing with about ~1.5M records, and the migration wasn't exactly quick. If your records are small you can just Load<> and SaveChanges on each one; from experience, programmatically patching the documents did not speed things up materially.
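Adapted to your documents, the loop could look roughly like this. This is only a sketch: the Item class and its property names are guesses based on your sample JSON, and it assumes the values in Related are document IDs, so Include can pre-load them in the same round trip.
public class Item
{
    public string Id { get; set; }
    public string Name { get; set; }
    public DateTime Date { get; set; }
    public List<string> Related { get; set; }
    public List<DateTime> RelatedDates { get; set; }
}

using (var store = new DocumentStore { ConnectionStringName = "RavenDB" }.Initialize())
{
    int start = 0;
    while (true)
    {
        using (var session = store.OpenSession())
        {
            // Include pre-loads all related documents in the same request,
            // so the Load calls below don't hit the server again.
            var items = session.Query<Item>()
                .Include(x => x.Related)
                .Skip(start)
                .Take(128)
                .ToList();
            if (items.Count == 0)
                break;
            foreach (var item in items)
            {
                item.RelatedDates = item.Related
                    .Select(id => session.Load<Item>(id).Date)
                    .ToList();
            }
            session.SaveChanges();
            start += items.Count;
        }
    }
}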
As a side note, the ravendb google groups is very active if you want to ask specifically about doing this from the studio

Delaying writes to SQL Server

I am working on an app and need to keep track of how many views a page has, almost like how SO does it. It is a value used to determine how popular a given page is.
I am concerned that writing to the DB every time a new view needs to be recorded will impact performance. I know this is borderline premature optimization, but I have experienced the problem before. Anyway, the value doesn't need to be real time; it is OK if it is delayed by 10 minutes or so. I was thinking that caching the data and doing one large write every X minutes should help.
I am running on Windows Azure, so the Appfabric cache is available to me. My original plan was to create some sort of compound key (PostID:UserID), and tag the key with "pageview". Appfabric allows you to get all keys by tag. Thus I could let them build up, and do one bulk insert into my table instead of many small writes. The table looks like this, but is open to change.
int PageID | guid userID | DateTime ViewTimeStamp
The website would still get the value from the database; writes would just be delayed. Make sense?
I just read that the Windows Azure Appfabric cache does not support tag based searches, so it pretty much negates my idea.
My question is, how would you accomplish this? I am new to Azure, so I am not sure what my options are. Is there a way to use the cache without tag based searches? I am just looking for advice on how to delay these writes to SQL.
You might want to take a look at http://www.apathybutton.com (and the Cloud Cover episode it links to), which talks about a highly scalable way to count things. (It might be overkill for your needs, but hopefully it gives you some options.)
You could keep a queue in memory and, on a timer, drain the queue, collapse the queued items by totaling the counts by page, and write them in one SQL batch/round trip. For example, using a table-valued parameter (TVP) you could write the queued totals with one sproc call.
That of course doesn't guarantee the view counts get written, since they sit in memory and are written latently, but page counts shouldn't be critical data and crashes should be rare.
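A minimal sketch of that TVP round trip: the table type dbo.PageViewCounts and the sproc dbo.AddPageViews are assumptions, created with something like the DDL in the comment.
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Assumed SQL-side setup (not from the question):
//   CREATE TYPE dbo.PageViewCounts AS TABLE (PageId int, ViewCount int);
//   CREATE PROCEDURE dbo.AddPageViews @counts dbo.PageViewCounts READONLY AS
//     UPDATE h SET h.Count = h.Count + c.ViewCount
//     FROM dbo.Hits h JOIN @counts c ON c.PageId = h.PageId;
static void WriteCounts(string connectionString, IEnumerable<KeyValuePair<int, int>> countsByPage)
{
    var table = new DataTable();
    table.Columns.Add("PageId", typeof(int));
    table.Columns.Add("ViewCount", typeof(int));
    foreach (var pair in countsByPage)
        table.Rows.Add(pair.Key, pair.Value);

    using (var con = new SqlConnection(connectionString))
    using (var cmd = new SqlCommand("dbo.AddPageViews", con))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        var p = cmd.Parameters.AddWithValue("@counts", table);
        p.SqlDbType = SqlDbType.Structured;
        p.TypeName = "dbo.PageViewCounts";
        con.Open();
        cmd.ExecuteNonQuery(); // one round trip for all pages
    }
}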
You might want to have a look at how the "diagnostics" feature in Azure works. Not because you would use diagnostics for what you are doing, but because it deals with a similar problem and may provide some inspiration. I am just about to implement a data auditing feature, and I want to log it to table storage, so I also want to delay and batch the updates together; I have taken a lot of inspiration from diagnostics.
Now, the way diagnostics in Azure works is that each role starts a little background "transfer" thread. Whenever you write any traces, they get stored in a list in local memory, and the background thread will (by default) bunch the requests up and transfer them to table storage every minute.
In your scenario, I would let each role instance keep track of a count of hits and then use a background thread to update the database every minute or so.
I would probably use something like a static ConcurrentDictionary (or one hanging off a singleton) on each web role, with each hit incrementing the counter for the page identifier. You'd need some thread-handling code to allow multiple requests to update the same counter in the list. Alternatively, just allow each "hit" to add a new record to a shared thread-safe list.
Then have a background thread once per minute increment the database with the number of hits per page since last time, and reset the local counter to 0 or empty the shared list if you are going with that approach (again, be careful about the multithreading and locking).
The important thing is to make sure your database update is atomic; If you do a read-current-count from the database, increment it and then write it back then you may have two different web role instances doing this at the same time and thus losing one update.
EDIT:
Here is a quick sample of how you could go about this.
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // You would put this in your Application_Start for the web role.
        // You'd probably want the transfer to happen once a minute rather than once a second.
        Thread hitTransfer = new Thread(() => HitCounter.Run(new TimeSpan(0, 0, 1)));
        hitTransfer.Start();

        // Testing code - this just simulates various web threads being hit and adding hits to the counter
        RunTestWorkerThreads(5);
        Thread.Sleep(5000);

        // You would put the following line in your Application shutdown.
        // You could do cleverer stuff with aborting threads, joining the thread etc., but you probably won't need to.
        HitCounter.StopRunning();
        Console.WriteLine("Finished...");
        Console.ReadKey();
    }

    private static void RunTestWorkerThreads(int workerCount)
    {
        Thread[] workerThreads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++)
        {
            workerThreads[i] = new Thread(
                (tagname) =>
                {
                    Random rnd = new Random();
                    for (int j = 0; j < 300; j++)
                    {
                        HitCounter.LogHit(tagname.ToString());
                        Thread.Sleep(rnd.Next(0, 5));
                    }
                });
            workerThreads[i].Start("TAG" + i);
        }
        foreach (var t in workerThreads)
        {
            t.Join();
        }
        Console.WriteLine("All threads finished...");
    }
}

public static class HitCounter
{
    private static ConcurrentQueue<string> hits;
    private static object transferlock = new object();
    private static volatile bool stopRunning = false;

    static HitCounter()
    {
        hits = new ConcurrentQueue<string>();
    }

    public static void LogHit(string tag)
    {
        hits.Enqueue(tag);
    }

    public static void Run(TimeSpan transferInterval)
    {
        while (!stopRunning)
        {
            Transfer();
            Thread.Sleep(transferInterval);
        }
    }

    public static void StopRunning()
    {
        stopRunning = true;
        Transfer();
    }

    private static void Transfer()
    {
        lock (transferlock)
        {
            var tags = GetPendingTags();
            var hitCounts = from tag in tags
                            group tag by tag into g
                            select new KeyValuePair<string, int>(g.Key, g.Count());
            WriteHits(hitCounts);
        }
    }

    private static void WriteHits(IEnumerable<KeyValuePair<string, int>> hitCounts)
    {
        // NOTE: I don't usually use sql commands directly and have not tested the below.
        // The idea is that the update should be atomic, so even though you have multiple
        // web servers all issuing similar update commands, potentially at the same time,
        // they should all commit. I do urge you to test this part as I cannot promise this code
        // will work as-is.
        //using (SqlConnection con = new SqlConnection("xyz"))
        //{
        //    foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
        //    {
        //        var cmd = con.CreateCommand();
        //        cmd.CommandText = "update hits set count = count + @count where tag = @tag";
        //        cmd.Parameters.AddWithValue("@count", hitCount.Value);
        //        cmd.Parameters.AddWithValue("@tag", hitCount.Key);
        //        cmd.ExecuteNonQuery();
        //    }
        //}
        Console.WriteLine("Writing....");
        foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
        {
            Console.WriteLine(String.Format("{0}\t{1}", hitCount.Key, hitCount.Value));
        }
    }

    private static IEnumerable<string> GetPendingTags()
    {
        List<string> hitlist = new List<string>();
        var currentCount = hits.Count;
        for (int i = 0; i < currentCount; i++)
        {
            string tag = null;
            if (hits.TryDequeue(out tag))
            {
                hitlist.Add(tag);
            }
        }
        return hitlist;
    }
}
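One gap in the commented-out SQL above: the very first hit for a tag updates zero rows, so that count is silently lost. Here is a hedged variant of the WriteHits body using a T-SQL MERGE upsert instead; the hits table schema in the comment is an assumption, and this is just as untested as the original.
private static void WriteHits(IEnumerable<KeyValuePair<string, int>> hitCounts)
{
    // Assumed schema: CREATE TABLE hits (tag nvarchar(50) PRIMARY KEY, count int NOT NULL)
    using (SqlConnection con = new SqlConnection("xyz"))
    {
        con.Open();
        foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
        {
            var cmd = con.CreateCommand();
            // MERGE inserts the row the first time a tag is seen and
            // increments it atomically on every later write.
            cmd.CommandText =
                @"merge hits with (holdlock) as target
                  using (select @tag as tag) as source on target.tag = source.tag
                  when matched then update set count = count + @count
                  when not matched then insert (tag, count) values (@tag, @count);";
            cmd.Parameters.AddWithValue("@tag", hitCount.Key);
            cmd.Parameters.AddWithValue("@count", hitCount.Value);
            cmd.ExecuteNonQuery();
        }
    }
}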

Relation many-to-one retrieved from custom cache

It's more of a theoretical question.
I have one table that holds dictionary items, and another that holds user data.
The User table contains a lot of many-to-one reference columns pointing at the dictionary item table. It looks like:
public class User
{
    public int Id;
    public Dictionary Status;
    public Dictionary Type;
    public Dictionary OrganizationUnit;
    ......
}
I want to retrieve all dictionaries on application startup, and then, when I retrieve a user and access a reference property to a dictionary, the dictionary object should be taken from the cache.
I know I can use a second-level cache in this scenario, but I'm interested in other solutions. Are there any?
Is it possible to make my own custom type and say: use my custom cache to retrieve the value of the dictionary?
Across multiple sessions the second-level cache is the best answer. The only other solutions I can think of for populating objects from a cache without using the second-level cache would be to use an onLoad interceptor (and simply leave your dictionaries unmapped) or to do it manually somewhere in your application.
But why don't you want to use the second-level cache? If your views on caching are very different from the storages NHibernate ships providers for, you can implement your own provider.
Why not store it in the session? Just pull the record set one time, push it into the session, and retrieve it each time you want it. I do something similar for other stuff and I believe my method should work for you. In my code I have a session manager that I call directly from any piece of code that needs the session values. I chose this method since I can query the results and I can manipulate the storage and retrieval methods. When relying on NHibernate to do the caching for me, I don't have the granularity of control to make specific record sets available only to specific sessions. I also find that NHibernate is not as efficient as using the session directly: when profiling the CPU and memory usage, this method is faster and uses a little less memory. If you want to do it at site level instead of per session, look into HttpContext.Current.Cache.
The following example works perfectly for storing and retrieving record sets:
// Set the session
SessionManager.User = (Some code to pull the user record with relationships. Set the fetch mode to eager for each relationship else you will just have broken references.)
// Get the session
User myUser = SessionManager.User;
public static class SessionManager
{
    public static User User
    {
        get { return GetSession("MySessionUser") as User; }
        set { SetSession("MySessionUser", value); }
    }

    private static object GetSession(string key)
    {
        // Fix null reference error when there is no web context or session
        if (System.Web.HttpContext.Current == null || System.Web.HttpContext.Current.Session == null)
        {
            return null;
        }
        else
        {
            return System.Web.HttpContext.Current.Session[key];
        }
    }

    private static void SetSession(string key, object valueIn)
    {
        // Add the key if it doesn't exist yet, otherwise overwrite it
        if (System.Web.HttpContext.Current.Session[key] == null)
        {
            System.Web.HttpContext.Current.Session.Add(key, valueIn);
        }
        else
        {
            System.Web.HttpContext.Current.Session[key] = valueIn;
        }
    }
}
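If you go the site-wide route mentioned above, the same manager pattern works against HttpContext.Current.Cache. A minimal sketch; the cache key and the one-hour expiration are arbitrary choices, not from the answer:
using System;
using System.Web;
using System.Web.Caching;

public static class AppCacheManager
{
    private const string DictionariesKey = "SiteDictionaries";

    // One copy shared by all users of the site, refreshed when it expires.
    public static object Dictionaries
    {
        get { return HttpContext.Current.Cache[DictionariesKey]; }
        set
        {
            HttpContext.Current.Cache.Insert(
                DictionariesKey,
                value,
                null,                        // no cache dependency
                DateTime.UtcNow.AddHours(1), // absolute expiration, arbitrary
                Cache.NoSlidingExpiration);
        }
    }
}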

Given a list of objects in C#, push them to RavenDB without knowing which ones already exist

Given 1000 documents with a complex data structure, e.g. a Car class that has three properties: Make, Model, and an Id property.
What is the most efficient way in C# to push these documents to RavenDB (preferably in a batch) without having to query the Raven collection individually to find which to update and which to insert? At the moment I am going about it like so, which is totally inefficient.
Note: _session is a wrapper over IDocumentSession where Commit calls SaveChanges and Add calls Store.
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
    var page = 0;
    const int total = 30;
    do
    {
        var paged = sales.Skip(page * total).Take(total);
        if (!paged.Any()) return;
        foreach (var sale in paged)
        {
            var current = sale;
            var existing = _session.Query<Sale>().FirstOrDefault(s => s.Id == current.Id);
            if (existing != null)
                existing = current;
            else
                _session.Add(current);
        }
        _session.Commit();
        page++;
    } while (true);
}
Your session code doesn't seem to track with the RavenDB API (we don't have Add or Commit).
Here is how you do this in RavenDB
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
    // ForEach here assumes an extension method on IEnumerable<T>;
    // with plain LINQ, write a foreach loop instead.
    sales.ForEach(session.Store);
    session.SaveChanges();
}
Your code sample doesn't work at all. The main problem is that you cannot just switch out the references and expect RavenDB to recognize that:
if (existing != null)
    existing = current;
Instead you have to update each property one by one:
existing.Model = current.Model;
existing.Make = current.Make;
This is the way you facilitate change-tracking in RavenDB and many other frameworks (e.g. NHibernate). If you want to avoid writing this uninteresting piece of code, I recommend using AutoMapper:
existing = Mapper.Map<Sale>(current, existing);
Another problem with your code is that you use Session.Query where you should use Session.Load. Remember: if you fetch a document by its ID, you always want to use Load!
The main difference is that one uses the local cache and the other not (the same applies to the equivalent NHibernate methods).
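For illustration, with a made-up document ID:
// Query: goes through an index, may be stale, bypasses the session's local cache
var viaQuery = session.Query<Sale>().FirstOrDefault(s => s.Id == "sales/1");

// Load: fetches by ID directly and reuses the session's local cache
var viaLoad = session.Load<Sale>("sales/1");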
Ok, so now I can answer your question:
If I understand you correctly you want to save a bunch of Sale-instances to your database while they should either be added if they didn't exist or updated if they existed. Right?
One way is to correct your sample code with the hints above and let it work. However, that will issue one unnecessary request (Session.Load(existingId)) per iteration. You can easily avoid that if you set up an index that selects the IDs of all documents inside your Sales collection. Before you loop through your items, you can then load all the existing IDs.
However, I would like to know what you actually want to do. What is your domain/use-case?
This is what works for me right now. Note: the InjectFrom method comes from Omu.ValueInjecter (NuGet package).
private void PublishSalesToRaven(IEnumerable<Sale> sales)
{
    var ids = sales.Select(i => i.Id);
    var existingSales = _ravenSession.Load<Sale>(ids);
    existingSales.ForEach(s => s.InjectFrom(sales.Single(i => i.Id == s.Id)));
    var existingIds = existingSales.Select(i => i.Id);
    var nonExistingSales = sales.Where(i => !existingIds.Any(x => x == i.Id));
    nonExistingSales.ForEach(i => _ravenSession.Store(i));
    _ravenSession.SaveChanges();
}