Problem with simultaneous read and save to database - asp.net-core

I have a problem with joining users to game room.
Controller
[HttpPut("join/")]
public async Task<ActionResult<string>> JoinRoom([FromQuery] int leagueId, [FromQuery] int userId)
{
var data = await classicGameService.JoinRoom(leagueId, userId);
if (data == "")
{
return NotFound();
}
else
{
return Ok(data);
}
}
Service
public async Task<string> JoinRoom(int leaguePosition, int userId)
{
var gameRoom = await
context.ClassicGames.Include(x => x.League)
.FirstOrDefaultAsync(x => x.League.Position == leaguePosition && x.User2 == 0 && x.User2State == (int)EClassicGameUserState.OBSERVER);
if (gameRoom is null)
{
return "";
}
else
{
gameRoom.User2 = userId;
gameRoom.User2State = (int)EClassicGameUserState.STAGNATION;
await context.SaveChangesAsync();
return $"{gameRoom.Id},{gameRoom.User1}";
}
}
When users send request to join simultaneously they getting correct respose for join.
It is a big problem for my game.
How to make response for the first user and then for the second?
I tried to change the methods to synchronous and there was the same problem.

If your database supports row versioning, such as Timestamps within SQL Server, you can configure your entities to observe these and reject concurrent changes.
For example to reproduce this kind of issue with an Update statement I have an entity called Game with a Player 1 and Player 2 value which I intend should only be updated once at a time. Concurrent access is a problem within web applications as two requests can come in simultaneously and both "capture" data in the same effective state which is perfectly valid for both to try and update. To simulate this you can use the following code:
using (var context = new TestDbContext())
{
var gameA = context.Games.SingleOrDefault(x => x.GameId == 1 && x.Player2 == null);
using (var context2 = new TestDbContext())
{
var gameB = context2.Games.SingleOrDefault(x => x.GameId == 1 && x.Player2 == null);
if (gameA != null)
gameA.Player2 = "Roy";
if (gameB != null)
gameB.Player2 = "George";
context.SaveChanges();
context2.SaveChanges();
}
}
In this example we use 2 separate DbContext instances representing our two simultaneous requests. Each loads our desired game satisfied that Player2 is empty. We now have 2 object references, one tracked by each DbContext and we tell both instances to set Player2's name. We then tell the contexts to SaveChanges(). The resulting output will be "George". If we reverse the SaveChanges() call order, the output would be "Roy". We don't want to allow the 2nd call to update. We cannot change the fact that both concurrent reads will get the game and be satisfied that Player2 has not been set unless we were to do something drastic like lock the table or row when trying to read the Games, and only unlock it after saving/aborting. (Pessimistic locking) This would potentially lead to big issues with timeouts or deadlocks.
The alternative is optimistic locking. We update our table to include a Timestamp column (in this example named RowVersion), then configure that column in our EF entity:
public class Game
{
[Key]
public int GameId { get; set; }
public string Player1 { get; set; }
public string Player2 { get; set; }
[Timestamp]
public byte[] RowVersion { get; set; }
}
Now if you run the above code, without any changes at all, the first SaveChanges() call will succeed, while the second SaveChanges() will fail with a DbUpdateConcurrencyException which you will need to handle. Basically in your case you'd likely want to return to the client that their game selection failed, refresh the list, and they'd see that the game was no longer available.
If your storage doesn't support optimistic concurrency then things get a bit more tricky. You would need to develop something like a marshal of sorts where join requests are queued to be performed by a single process responsible for updating player state. The initial call would return a status of something like "Joining" along with a Queue ID which would result in a user seeing a spinner while their client continued to poll with that Queue ID for an update from the marshal. The marshal processes the requests on a first come, first serve basis, and evaluates the rules. When a game is empty and allows Player 2 to join, that queued job gets a "Join Successful" status which comes back to the client on the next poll.. The duplicate request processes and finds Player 2 is filled so that Queued job gets a "Join Failed" response for that client on it's next poll. (Serializing the join operation)

Related

Save complex object to session ASP .NET CORE 2.0

I am quite new to ASP .NET core, so please help. I would like to avoid database round trip for ASP .NET core application. I have functionality to dynamically add columns in datagrid. Columns settings (visibility, enable, width, caption) are stored in DB.
So I would like to store List<,PersonColumns> on server only for actual session. But I am not able to do this. I already use JsonConvert methods to serialize and deserialize objects to/from session. This works for List<,Int32> or objects with simple properties, but not for complex object with nested properties.
My object I want to store to session looks like this:
[Serializable]
public class PersonColumns
{
public Int64 PersonId { get; set; }
List<ViewPersonColumns> PersonCols { get; set; }
public PersonColumns(Int64 personId)
{
this.PersonId = personId;
}
public void LoadPersonColumns(dbContext dbContext)
{
LoadPersonColumns(dbContext, null);
}
public void LoadPersonColumns(dbContext dbContext, string code)
{
PersonCols = ViewPersonColumns.GetPersonColumns(dbContext, code, PersonId);
}
public static List<ViewPersonColumns> GetFormViewColumns(SatisDbContext dbContext, string code, Int64 formId, string viewName, Int64 personId)
{
var columns = ViewPersonColumns.GetPersonColumns(dbContext, code, personId);
return columns.Where(p => p.FormId == formId && p.ObjectName == viewName).ToList();
}
}
I would like to ask also if my approach is not bad to save the list of 600 records to session? Is it better to access DB and load columns each time user wants to display the grid?
Any advice appreciated
Thanks
EDIT: I have tested to store in session List<,ViewPersonColumns> and it is correctly saved. When I save object where the List<,ViewPersonColumns> is property, then only built-in types are saved, List property is null.
The object I want to save in session
[Serializable]
public class UserManagement
{
public String PersonUserName { get; set; }
public Int64 PersonId { get; set; }
public List<ViewPersonColumns> PersonColumns { get; set; } //not saved to session??
public UserManagement() { }
public UserManagement(DbContext dbContext, string userName)
{
var person = dbContext.Person.Single(p => p.UserName == userName);
PersonUserName = person.UserName;
PersonId = person.Id;
}
/*public void PrepareUserData(DbContext dbContext)
{
LoadPersonColumns(dbContext);
}*/
public void LoadPersonColumns(DbContext dbContext)
{
LoadPersonColumns(dbContext, null);
}
public void LoadPersonColumns(DbContext dbContext, string code)
{
PersonColumns = ViewPersonColumns.GetPersonColumns(dbContext, code, PersonId);
}
public List<ViewPersonColumns> GetFormViewColumns(Int64 formId, string viewName)
{
if (PersonColumns == null)
return null;
return PersonColumns.Where(p => p.FormId == formId && p.ObjectName == viewName).ToList();
}
}
Save columns to the session
UserManagement userManagement = new UserManagement(_context, user.UserName);
userManagement.LoadPersonColumns(_context);
HttpContext.Session.SetObject("ActualPersonContext", userManagement);
HttpContext.Session.SetObject("ActualPersonColumns", userManagement.PersonColumns);
Load columns from the session
//userManagement build-in types are set. The PersonColumns is null - not correct
UserManagement userManagement = session.GetObject<UserManagement>("ActualPersonContext");
//The cols is filled from session with 600 records - correct
List<ViewPersonColumns> cols = session.GetObject<List<ViewPersonColumns>>("ActualPersonColumns");
Use list for each column is better than use database.
you can't create and store sessions in .net core like .net framework 4.0
Try Like this
Startup.cs
public void ConfigureServices(IServiceCollection services)
{
//services.AddDbContext<GeneralDBContext>(options => options.UseSqlServer(Configuration.GetConnectionString("DefaultConnection")));
services.AddMvc().AddSessionStateTempDataProvider();
services.AddSession();
}
Common/SessionExtensions.cs
sing Microsoft.AspNetCore.Http;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
namespace IMAPApplication.Common
{
public static class SessionExtensions
{
public static T GetComplexData<T>(this ISession session, string key)
{
var data = session.GetString(key);
if (data == null)
{
return default(T);
}
return JsonConvert.DeserializeObject<T>(data);
}
public static void SetComplexData(this ISession session, string key, object value)
{
session.SetString(key, JsonConvert.SerializeObject(value));
}
}
}
Usage
==> Create Session*
public IActionResult Login([FromBody]LoginViewModel model)
{
LoggedUserVM user = GetUserDataById(model.userId);
//Create Session with complex object
HttpContext.Session.SetComplexData("loggerUser", user);
return Json(new { status = result.Status, message = result.Message });
}
==> Get Session data*
public IActionResult Index()
{
//Get Session data
LoggedUserVM loggedUser = HttpContext.Session.GetComplexData<LoggedUserVM>("loggerUser");
}
Hope this is helpful. Good luck.
This is an evergreen post, and even though Microsoft has recommended serialisation to store the object in session - it is not a correct solution unless your object is readonly, I have a blog explaining all scenario here and i have even pointed out the issues in GitHub of Asp.Net Core in issue id 18159
Synopsis of the problems are here:
A. Serialisation isn't same as object, true it will help in distributed server scenario but it comes with a caveat that Microsoft have failed to highlight - that it will work without any unpredictable failures only when the object is meant to be read and not to be written back.
B. If you were looking for a read-write object in the session, everytime you change the object that is read from the session after deserialisation - it needs to be written back to the session again by calling serialisation - and this alone can lead to multiple complexities as you will need to either keep track of the changes - or keep writing back to session after each change in any property. In one request to the server, you will have scenarios where the object is written back multiple times till the response is sent back.
C. For a read-write object in the session, even on a single server it will fail, as the actions of the user can trigger multiple rapid requests to the server and not more than often system will find itself in a situation where the object is being serialised or deserialised by one thread and being edited and then written back by another, the result is you will end up with overwriting the object state by threads - and even locks won't help you much since the object is not a real object but a temporary object created by deserialisation.
D. There are issues with serialising complex objects - it is not just a performance hit, it may even fail in certain scenario - especially if you have deeply nested objects that sometimes refer back to itself.
The synopsis of the solution is here, full implementation along with code is in the blog link:
First implement this as a Cache object, create one item in IMemoryCache for each unique session.
Keep the cache in sliding expiration mode, so that each time it is read it revives the expiry time - thereby keeping the objects in cache as long as the session is active.
Second point alone is not enough, you will need to implement heartbeat technique - triggering the call to session every T minus 1 min or so from the javascript. (This we anyways used to do even to keep the session alive till the user is working on the browser, so it won't be any different
Additional Recommendations
A. Make an object called SessionManager - so that all your code related to session read / write sits in one place.
B. Do not keep very high value for session time out - If you are implementing heartbeat technique, even 3 mins of session time out will be enough.

Atomic Read and Write with Entity Framework

I have two different processes (on different machines) that are reading and updating a database record.
The rule I need to ensure is that the record must only be updated if the value of it, lets say is "Initial". Also, after the commit I would want to know if it actually got updated from the current process or not (in case if value was other than initial)
Now, the below code performs something like:
var record = context.Records
.Where(r => (r.id == id && r.State == "Initial"))
.FirstOrDefault();
if(record != null) {
record.State = "Second";
context.SaveChanges();
}
Now couple of questions
1) From looking at the code it appears that after the record is fetched with state "Initial", some other process could have updated it to state "Second" before this process performs SaveChanges.
In this case we are unnecessarily overwriting the state to the same value. Is this the case happening here ?
2) If case 1 is not what happens then EntityFramework may be translating the above to something like
update Record set State = "Second" where Id = someid and State = "Initial"
and performing this as a transaction. This way only one process writes the value. Is this the case with EF default TransactionScope ?
In both cases again how do I know for sure that the update was made from my process as opposed to some other process ?
If this were in-memory objects then in code it would translate to something like assuming multiple threads accessing same data structure
Record rec = FindRecordById(id);
lock (someobject)
{
if(rec.State == "Initial")
{
rec.State = "Second";
//Now, that I know I updated it I can do some processing
}
}
Thanks
In general there are 2 main concurrency patterns that can be used:
Pessimistic concurrency: You lock a row to prevent others from unexpectedly changing the data you are currently attempting to update. EF does not provide any native support for this type of concurrency pattern.
Optimistic concurrency: Citing from EF's documentation: "Optimistic concurrency involves optimistically attempting to save your entity to the database in the hope that the data there has not changed since the entity was loaded. If it turns out that the data has changed then an exception is thrown and you must resolve the conflict before attempting to save again." This pattern is supported by EF, and can be used rather simply.
Focusing on the optimistic concurrency option, which EF does support, let's compare how your example behaves with and without EF's optimistic concurrency control handling. I'll assume you are using SQL Server.
No concurrency control
Let's start with the following script in the database:
create table Record (
Id int identity not null primary key,
State varchar(50) not null
)
insert into Record (State) values ('Initial')
And here is the code with the DbContext and Record entity:
public class MyDbContext : DbContext
{
static MyDbContext()
{
Database.SetInitializer<MyDbContext>(null);
}
public MyDbContext() : base(#"Server=localhost;Database=eftest;Trusted_Connection=True;") { }
public DbSet<Record> Records { get; set; }
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
base.OnModelCreating(modelBuilder);
modelBuilder.Conventions.Remove<PluralizingTableNameConvention>();
modelBuilder.Configurations.Add(new Record.Configuration());
}
}
public class Record
{
public int Id { get; set; }
public string State { get; set; }
public class Configuration : EntityTypeConfiguration<Record>
{
public Configuration()
{
this.HasKey(t => t.Id);
this.Property(t => t.State)
.HasMaxLength(50)
.IsRequired();
}
}
}
Now, let's test your concurrent update scenario with the following code:
static void Main(string[] args)
{
using (var context = new MyDbContext())
{
var record = context.Records
.Where(r => r.Id == 1 && r.State == "Initial")
.Single();
// Insert sneaky update from a different context.
using (var sneakyContext = new MyDbContext())
{
var sneakyRecord = sneakyContext.Records
.Where(r => r.Id == 1 && r.State == "Initial")
.Single();
sneakyRecord.State = "Sneaky Update";
sneakyContext.SaveChanges();
}
// attempt to update row that has just been updated and committed by the sneaky context.
record.State = "Second";
context.SaveChanges();
}
}
If you trace the SQL, you will see that the update statement looks like this:
UPDATE [dbo].[Record]
SET [State] = 'Second'
WHERE ([Id] = 1)
So, in effect, it doesn't care that another transaction sneaked in an update. It just blindly writes over whatever the other update did. And so, the final value of State in the database for that row is 'Second'.
Optimistic concurrency control
Let's adjust our initial SQL script to include a concurrency control column to our table:
create table Record (
Id int identity not null primary key,
State varchar(50) not null,
Concurrency timestamp not null -- add this row versioning column
)
insert into Record (State) values ('Initial')
Let's also adjust our Record entity class (the DbContext class stays the same):
public class Record
{
public int Id { get; set; }
public string State { get; set; }
// Add this property.
public byte[] Concurrency { get; set; }
public class Configuration : EntityTypeConfiguration<Record>
{
public Configuration()
{
this.HasKey(t => t.Id);
this.Property(t => t.State)
.HasMaxLength(50)
.IsRequired();
// Add this config to tell EF that this
// property/column should be used for
// concurrency checking.
this.Property(t => t.Concurrency)
.IsRowVersion();
}
}
}
Now, if we try to re-run the same Main() method we used for the previous scenario, you will notice a change in how the update statement is generated and executed:
UPDATE [dbo].[Record]
SET [State] = 'Second'
WHERE (([Id] = 1) AND ([Concurrency] = <byte[]>))
SELECT [Concurrency]
FROM [dbo].[Record]
WHERE ##ROWCOUNT > 0 AND [Id] = 1
In particular, notice how EF automatically includes the column defined for concurrency control in the where clause of the update statement.
In this case, because there was in fact a concurrent update, EF detects it, and throws a DbUpdateConcurrencyException exception on this line:
context.SaveChanges();
And so, in this case, if you check the database, you'll see that the State value for the row in question will be 'Sneaky Update', because our 2nd update failed to pass the concurrency check.
Final thoughts
As you can see, there isn't much that needs to be done to activate automatic optimistic concurrency control in EF.
Where it gets tricky though is, how do you handle the DbUpdateConcurrencyException exception when it gets thrown? It will largely be up to you to decide what you want to do in this case. But for further guidance on the topic, you'll find more information here: EF - Optimistic Concurrency Patterns.

Loading Contacts on Key Down in ListView

I am using following code to load contacts using ContactStore. My requirement is to search contacts in ContactStore when key is pressed in a textbox. Below is the code in my View Model.
private async void LoadAllContacts(string searchQuery)
{
ContactStore contactStore = await ContactManager.RequestStoreAsync();
IReadOnlyList<Contact> contacts = null;
// Find all contacts
contacts = await contactStore.FindContactsAsync(searchQuery);
foreach (var item in contacts)
{
if (!string.IsNullOrEmpty(item.FirstName) && !string.IsNullOrEmpty(item.LastName))
{
var contact = new Member
{
FirstName = item.FirstName,
LastName = item.LastName,
FullName = item.DisplayName, //item.HonorificNamePrefix ?? "" + item.FirstName + item.MiddleName ?? "" + item.LastName,
Bio = item.Notes
};
if (item.DataSuppliers != null)
{
foreach (var dataSupplier in item.DataSuppliers)
{
switch (dataSupplier.ToLower())
{
case "facebook":
break;
case "hotmail2":
case "hotmail":
break;
}
}
}
if (item.Thumbnail != null)
{
var thumnailStream = await item.Thumbnail.OpenReadAsync();
BitmapImage thumbImage = new BitmapImage();
thumbImage.SetSource(thumnailStream);
contact.ImageSource = thumbImage;
}
this.Insert(0, contact);
}
}
}
The contacts are loading perfectly in my listview but the problem is that loading contacts from contact store is an extensive task and when user I press text in the textbox quickly then application throws exception.
My question is how can I load contacts efficiently? Means if User pressed 'a' and I call this method and quickly user pressed 'c' so if the results are loaded for 'a' then application should cancel continuing with previous operation and load 'ac' related contacts.
Thanks.
You should read up on the concept of CancellationTokens. That's the idea you're looking for. Lots of examples online.
What is "cancellationToken" in the TaskFactory.StartNew() used for?
http://dotnetcodr.com/2014/01/31/suspending-a-task-using-a-cancellationtoken-in-net-c/
CancellationToken/CancellationTokenSource is C#'s way of managing and cancelling Tasks.
In your case, you would create a CancellationTokenSource, and pass its Token object in your ContactManager.RequestStoreAsync(CancellationToken token) and contactStore.FindContactsAsync(CancellationToken token) methods (any async methods where you're awaiting).
Those two methods should accept a CancellationToken, and sporadically do a token.ThrowIfCancellationRequested().
If at some point you find that you want to stop the current Task and start a new one (for your case, when the user types something new), call CancellationTokenSource.Cancel(), which will kill the running Task thread because of the ThrowIfCancellationRequested().
One thing I want to point out though, is that your code can be optimized even further before going through CancellationTokens. You call ContactStore contactStore = await ContactManager.RequestStoreAsync(); every time. Could you not store that as a member variable?
You can also do tricks such as not running your load contacts method unless the user has stopped typing for 1 second, and I would suggest that you force concurrent Task execution in this case by using a ConcurrentExclusiveSchedular.

Strategy for generating unique identifier for document in RavenDB

Say I have something like a support ticket system (simplified as example). It has many users and organizations. Each user can be a member of several organizations, but the typical case would be one org => many users, and most of them belong only to this organization. Each organization has a "tag" which is to be used to construct "ticket numbers" for this organization. Lets say we have an org called StackExchange that wants the tag SES.
So if I open the first ticket of today, I want it to be SES140407-01. The next is SES140407-02 and so on. Doesn't have to be two digits after the dash.
How can I make sure this is generated in a way that makes sure it is 100% unique across the organization (no orgs will have the same tag)?
Note: This does not have to be the document ID in the database - that will probably just be a Guid or similar. This is just a ticket reference - kinda like a slug - that will appear in related emails etc. So it has to be unique, and I would prefer if we didn't "waste" the sequential case numbers hilo style.
Is there a practical way to ensure I get a unique ticket number even if two or more people report a new one at almost the same time?
EDIT: Each Organization is a document in RavenDB, and can easily hold a property like LastIssuedTicketId. My challenge is basically to find the best way to read this field, generate a new one, and store this back in a way that is "race condition safe".
Another edit: To be clear - I intend to generate the ticket ID in my own software. What I am looking for is a way to ask RavenDB "what was the last ticket number", and then when I generate the next one after that, "am I the only one using this?" - so that I give my ticket a unique case id, not necessarily related to what RavenDB considers the document id.
I use for that generic sequence generator written for RavenDB:
public class SequenceGenerator
{
private static readonly object Lock = new object();
private readonly IDocumentStore _docStore;
public SequenceGenerator(IDocumentStore docStore)
{
_docStore = docStore;
}
public int GetNextSequenceNumber(string sequenceKey)
{
lock (Lock)
{
using (new TransactionScope(TransactionScopeOption.Suppress))
{
while (true)
{
try
{
var document = GetDocument(sequenceKey);
if (document == null)
{
PutDocument(new JsonDocument
{
Etag = Etag.Empty,
// sending empty guid means - ensure the that the document does NOT exists
Metadata = new RavenJObject(),
DataAsJson = RavenJObject.FromObject(new { Current = 0 }),
Key = sequenceKey
});
return 0;
}
var current = document.DataAsJson.Value<int>("Current");
current++;
document.DataAsJson["Current"] = current;
PutDocument(document);
{
return current;
}
}
catch (ConcurrencyException)
{
// expected, we need to retry
}
}
}
}
}
private void PutDocument(JsonDocument document)
{
_docStore.DatabaseCommands.Put(
document.Key,
document.Etag,
document.DataAsJson,
document.Metadata);
}
private JsonDocument GetDocument(string key)
{
return _docStore.DatabaseCommands.Get(key);
}
}
It generates incremental unique sequence based on sequenceKey. Uniqueness is guaranteed by raven optimistic concurrency based on Etag. So each sequence has its own document which we update when generate new sequence number. Also, there is lock which reduced extra db calls if several threads are executing at the same moment at the same process (appdomain).
For your case you can use it this way:
var sequenceKey = string.Format("{0}{1:yyMMdd}", yourCompanyPrefix, DateTime.Now);
var nextSequenceNumber = new SequenceGenerator(yourDocStore).GetNextSequenceNumber(sequenceKey);
var nextSequenceKey = string.Format("{0}-{1:00}", sequenceKey, nextSequenceNumber);

Delaying writes to SQL Server

I am working on an app, and need to keep track of how any views a page has. Almost like how SO does it. It is a value used to determine how popular a given page is.
I am concerned that writing to the DB every time a new view needs to be recorded will impact performance. I know this borderline pre-optimization, but I have experienced the problem before. Anyway, the value doesn't need to be real time; it is OK if it is delayed by 10 minutes or so. I was thinking that caching the data, and doing one large write every X minutes should help.
I am running on Windows Azure, so the Appfabric cache is available to me. My original plan was to create some sort of compound key (PostID:UserID), and tag the key with "pageview". Appfabric allows you to get all keys by tag. Thus I could let them build up, and do one bulk insert into my table instead of many small writes. The table looks like this, but is open to change.
int PageID | guid userID | DateTime ViewTimeStamp
The website would still get the value from the database, writes would just be delayed, make sense?
I just read that the Windows Azure Appfabric cache does not support tag based searches, so it pretty much negates my idea.
My question is, how would you accomplish this? I am new to Azure, so I am not sure what my options are. Is there a way to use the cache without tag based searches? I am just looking for advice on how to delay these writes to SQL.
You might want to take a look at http://www.apathybutton.com (and the Cloud Cover episode it links to), which talks about a highly scalable way to count things. (It might be overkill for your needs, but hopefully it gives you some options.)
You could keep a queue in memory and on a timer drain the queue, collapse the queued items by totaling the counts by page and write in one SQL batch/round trip. For example, using a TVP you could write the queued totals with one sproc call.
That of course doesn't guarantee the view counts get written since its in memory and latently written but page counts shouldn't be critical data and crashes should be rare.
You might want to have a look at how the "diagnostics" feature in Azure works. Not because you would use diagnostics for what you are doing at all, but because it is dealing with a similar problem and may provide some inspiration. I am just about to implement a data auditing feature and I want to log that to table storage so also want to delay and bunch the updates together and I have taken a lot of inspiration from diagnostics.
Now, the way Diagnostics in Azure works is that each role starts a little background "transfer" thread. So, whenever you write any traces then that gets stored in a list in local memory and the background thread will (by default) bunch all the requests up and transfer them to table storage every minute.
In your scenario, I would let each role instance keep track of a count of hits and then use a background thread to update the database every minute or so.
I would probably use something like a static ConcurrentDictionary (or one hanging off a singleton) on each webrole with each hit incrementing the counter for the page identifier. You'd need to have some thread handling code to allow multiple request to update the same counter in the list. Alternatively, just allow each "hit" to add a new record to a shared thread-safe list.
Then, have a background thread once per minute increment the database with the number of hits per page since last time and reset the local counter to 0 or empty the shared list if you are going with that approach (again, be careful about the multi threading and locking).
The important thing is to make sure your database update is atomic; If you do a read-current-count from the database, increment it and then write it back then you may have two different web role instances doing this at the same time and thus losing one update.
EDIT:
Here is a quick sample of how you could go about this.
using System.Collections.Concurrent;
using System.Data.SqlClient;
using System.Threading;
using System;
using System.Collections.Generic;
using System.Linq;
class Program
{
static void Main(string[] args)
{
// You would put this in your Application_start for the web role
Thread hitTransfer = new Thread(() => HitCounter.Run(new TimeSpan(0, 0, 1))); // You'd probably want the transfer to happen once a minute rather than once a second
hitTransfer.Start();
//Testing code - this just simulates various web threads being hit and adding hits to the counter
RunTestWorkerThreads(5);
Thread.Sleep(5000);
// You would put the following line in your Application shutdown
HitCounter.StopRunning(); // You could do some cleverer stuff with aborting threads, joining the thread etc but you probably won't need to
Console.WriteLine("Finished...");
Console.ReadKey();
}
private static void RunTestWorkerThreads(int workerCount)
{
Thread[] workerThreads = new Thread[workerCount];
for (int i = 0; i < workerCount; i++)
{
workerThreads[i] = new Thread(
(tagname) =>
{
Random rnd = new Random();
for (int j = 0; j < 300; j++)
{
HitCounter.LogHit(tagname.ToString());
Thread.Sleep(rnd.Next(0, 5));
}
});
workerThreads[i].Start("TAG" + i);
}
foreach (var t in workerThreads)
{
t.Join();
}
Console.WriteLine("All threads finished...");
}
}
public static class HitCounter
{
private static System.Collections.Concurrent.ConcurrentQueue<string> hits;
private static object transferlock = new object();
private static volatile bool stopRunning = false;
static HitCounter()
{
hits = new ConcurrentQueue<string>();
}
public static void LogHit(string tag)
{
hits.Enqueue(tag);
}
public static void Run(TimeSpan transferInterval)
{
while (!stopRunning)
{
Transfer();
Thread.Sleep(transferInterval);
}
}
public static void StopRunning()
{
stopRunning = true;
Transfer();
}
private static void Transfer()
{
lock(transferlock)
{
var tags = GetPendingTags();
var hitCounts = from tag in tags
group tag by tag
into g
select new KeyValuePair<string, int>(g.Key, g.Count());
WriteHits(hitCounts);
}
}
private static void WriteHits(IEnumerable<KeyValuePair<string, int>> hitCounts)
{
// NOTE: I don't usually use sql commands directly and have not tested the below
// The idea is that the update should be atomic so even though you have multiple
// web servers all issuing similar update commands, potentially at the same time,
// they should all commit. I do urge you to test this part as I cannot promise this code
// will work as-is
//using (SqlConnection con = new SqlConnection("xyz"))
//{
// foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
// {
// var cmd = con.CreateCommand();
// cmd.CommandText = "update hits set count = count + #count where tag = #tag";
// cmd.Parameters.AddWithValue("#count", hitCount.Value);
// cmd.Parameters.AddWithValue("#tag", hitCount.Key);
// cmd.ExecuteNonQuery();
// }
//}
Console.WriteLine("Writing....");
foreach (var hitCount in hitCounts.OrderBy(h => h.Key))
{
Console.WriteLine(String.Format("{0}\t{1}", hitCount.Key, hitCount.Value));
}
}
private static IEnumerable<string> GetPendingTags()
{
List<string> hitlist = new List<string>();
var currentCount = hits.Count();
for (int i = 0; i < currentCount; i++)
{
string tag = null;
if (hits.TryDequeue(out tag))
{
hitlist.Add(tag);
}
}
return hitlist;
}
}