"Shard 0 is already allocated" exception - Akka.NET

We are getting an exception after every restart of a 3-pod service (3 shards) inside an AWS cluster. We are using the Akka 1.4.16 NuGet package for our .NET service with AWS DocumentDB. Attaching the code for your reference; please let us know if we are missing anything.
Exception in ReceiveRecover when replaying event type [Akka.Cluster.Sharding.PersistentShardCoordinator+ShardHomeAllocated] with sequence number [26] for persistenceId
Cause: System.ArgumentException: Shard 0 is already allocated (Parameter 'e')
   at Akka.Cluster.Sharding.PersistentShardCoordinator.State.Updated(IDomainEvent e)
   at Akka.Cluster.Sharding.PersistentShardCoordinator.ReceiveRecover(Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass91_0.g__RecoveryBehavior|0(Object message)
   at Akka.Actor.ActorBase.AroundReceive(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>n__0(Receive receive, Object message)
   at Akka.Persistence.Eventsourced.<>c__DisplayClass92_0.b__1(Receive receive, Object message)
private void ConfigureActorSystem(IServiceCollection services)
{
    var conf = new Config();
    var setUp = conf.BootstrapApplication(
        Configuration["ConnectionString"]
            .Replace("{UserName}", Configuration["UserName"])
            .Replace("{Password}", Configuration["Password"]));
    var dockerConfigSetUp = setUp?.BootstrapFromDocker();
    var bootstrap = BootstrapSetup.Create().WithConfig(dockerConfigSetUp);
    var di = ServiceProviderSetup.Create(services.BuildServiceProvider());
    var actorSystemSetup = bootstrap.And(di);
    var actorSystem = ActorSystem.Create("user-actor-system", actorSystemSetup);
    var shards = 3;

    Cluster.Get(actorSystem).RegisterOnMemberUp(() =>
    {
        var provider = Akka.DependencyInjection.ServiceProvider.For(actorSystem);
        var sharding = ClusterSharding.Get(actorSystem);
        var shardRegion = sharding.Start(
            typeName: nameof(UserActor),
            entityPropsFactory: s => provider.Props<UserActor>(s),
            settings: ClusterShardingSettings.Create(actorSystem),
            messageExtractor: new MessageExtractor(shards));
        Startup.ShardRegion = shardRegion;
    });

    MongoDbPersistence.Get(actorSystem);
}

Ah, in this case the issue is that your Akka.Cluster.Sharding allocation state was saved in a dirty fashion - usually the result of not allowing the cluster to shut itself down properly - so the sharding system is recovering older, no-longer-valid shard state data.
There are a couple of ways to fix this:
Run https://github.com/petabridge/Akka.Cluster.Sharding.RepairTool - that will purge old ShardCoordinator data out of Akka.Persistence.
Switch the backing state store for Akka.Cluster.Sharding from Akka.Persistence to DistributedData, which doesn't have this problem even when the cluster shuts down without cleanup: https://getakka.net/articles/clustering/cluster-sharding.html explains how to configure this at the top of the page; a config sketch follows.
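For the second option, the switch is a single HOCON setting - a minimal sketch, assuming the standard Akka.Cluster.Sharding configuration keys:

akka.cluster.sharding {
  # Keep coordinator state in Distributed Data (in-memory, replicated across
  # the cluster) instead of Akka.Persistence, so a dirty shutdown leaves no
  # stale allocation records behind to replay on restart.
  state-store-mode = ddata
}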

Related

EF Core 6 - The instance of entity type 'AppCase' cannot be tracked because another instance with the same key value for {} is already being tracked

I'm using ASP.NET Core 6, EF Core 6, and the Repository/Unit of Work pattern. I have some trouble with my project: the error "The instance of entity type 'AppCase' cannot be tracked because another instance with the same key value for {} is already being tracked" is thrown when I call UpdateAsync in my repository.
I think I understand the issue description, and I used AsNoTracking(), but the issue is still not solved.
How can I solve it, please?
Thanks for your support, and sorry for my bad English!
Here is my source code:
// MicroOrderBusiness.cs
public async Task<dynamic> CreateOrder(MicroOrderRequest requestObject, string caseNo)
{
    string processCode = "CEC1D433-C376-498B-8A80-B23DC3772024";
    var caseModel = await _appCaseBusiness.PrepareAppCase(processCode, caseNo);
    if (caseModel == null)
        throw new Exception("Create request failed!");

    // Add new case using EF Core 6 via Repository and UnitOfWork
    var appCaseEntity = await _appCaseBusiness.AddNewCase(caseModel);
    if (appCaseEntity == null)
        throw new Exception("Adding failed!");

    var taskResponse = await _processTaskBusiness.ExecuteTask<MicroOrderRequest>(caseModel.FirstTaskCode, requestObject);
    if (Equals(taskResponse, null))
        throw new Exception("Response task is not found!");

    caseModel.CaseObjects.Add(taskResponse);

    // Update case state
    appCaseEntity.State = STATE.DONE;
    appCaseEntity.CaseObject = JsonConvert.SerializeObject(caseModel.CaseObjects);
    var updateCase = await _appCaseRepository.UpdateAsync(appCaseEntity); // => Error thrown here
    if (updateCase == null)
        throw new Exception("Update case state failed!");

    return caseModel;
}
// Program.cs
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseOracle(connectionString,
        o => o.MigrationsAssembly(typeof(AppDbContext).Assembly.FullName)
              .UseOracleSQLCompatibility("12")));
builder.Services.AddScoped(typeof(IUnitOfWork), typeof(AppDbContext));
The error is explained in the documentation: https://learn.microsoft.com/en-us/ef/core/change-tracking/identity-resolution#attaching-a-serialized-graph
I checked your code; I think the error may be caused by one of two things.
Tracking a serialized graph of entities:
appCaseEntity.CaseObject = JsonConvert.SerializeObject(caseModel.CaseObjects);
var updateCase = await _appCaseRepository.UpdateAsync(appCaseEntity); // => Error thrown here
You can learn more details and find the solution in that part of the document; a workaround sketch follows.
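If the same AppCase is still tracked from the earlier AddNewCase call, one possible workaround (a sketch only - the posted repository does not show how it exposes its DbContext, so _dbContext here is hypothetical) is to reset the change tracker before re-attaching the entity:

// _dbContext is a hypothetical reference to the context behind the repository.
_dbContext.ChangeTracker.Clear();    // EF Core 5+: detach everything tracked so far
_dbContext.Update(appCaseEntity);    // re-attach this single instance as Modified
await _dbContext.SaveChangesAsync();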
Reusing a DbContext instance for multiple units of work:
var appCaseEntity = await _appCaseBusiness.AddNewCase(caseModel);
........
var updateCase = await _appCaseRepository.UpdateAsync(appCaseEntity);
You registered the DbContext as below (the lifetime is Scoped by default):
builder.Services.AddDbContext<AppDbContext>(options =>
    options.UseOracle(connectionString,
        o => o.MigrationsAssembly(typeof(AppDbContext).Assembly.FullName)
              .UseOracleSQLCompatibility("12")));
You could try setting the lifetime to Transient, or use a DbContextFactory to create a different instance for each unit of work:
builder.Services.AddDbContext<AppDbContext>((options => ........), ServiceLifetime.Transient);
Check this document about using DbContextFactory; a sketch of that approach follows.
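A minimal sketch of the factory approach (the options lambda is elided as in your registration; the repository shape is illustrative, based on the question's _appCaseRepository):

// Register a factory instead of a scoped context.
builder.Services.AddDbContextFactory<AppDbContext>(options => { /* same Oracle options as above */ });

// Each unit of work then creates and disposes its own context:
public class AppCaseRepository
{
    private readonly IDbContextFactory<AppDbContext> _factory;

    public AppCaseRepository(IDbContextFactory<AppDbContext> factory) => _factory = factory;

    public async Task<AppCase> UpdateAsync(AppCase entity)
    {
        // A fresh context tracks nothing, so there is no duplicate-key conflict.
        await using var db = _factory.CreateDbContext();
        db.Update(entity);
        await db.SaveChangesAsync();
        return entity;
    }
}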
If you have further issues with this case, please provide minimal code that reproduces the error.

Apache Ignite performance problem on Azure Kubernetes Service

I'm using Apache Ignite on Azure Kubernetes Service as a distributed cache.
I also have a web API on Azure based on .NET 6.
The Ignite service works very stably on AKS.
But on the first request, the API tries to connect to Ignite, and that takes around 3 seconds. After that, Ignite responses take around 100 ms, which is great. Here are my Web API performance outputs for the GetProduct function.
At first I tried adding the Ignite service as a singleton, but it sometimes failed with 'connection closed'. How can I keep the Ignite connection open at all times? Or does anyone have a better idea?
Here is my latest GetProduct code:
[HttpGet("getProduct")]
public IActionResult GetProduct(string barcode)
{
Stopwatch _stopWatch = new Stopwatch();
_stopWatch.Start();
Product product;
CacheManager cacheManager = new CacheManager();
cacheManager.ProductCache.TryGet(barcode, out product);
if(product == null)
{
return NotFound(new ApiResponse<Product>(product));
}
cacheManager.DisposeIgnite();
_logger.LogWarning("Loaded in " + _stopWatch.ElapsedMilliseconds + " ms...");
return Ok(new ApiResponse<Product>(product));
}
Also, I add CacheManager class here;
public CacheManager()
{
ConnectIgnite();
InitializeCaches();
}
public void ConnectIgnite()
{
_ignite = Ignition.StartClient(GetIgniteConfiguration());
}
public IgniteClientConfiguration GetIgniteConfiguration()
{
var appSettingsJson = AppSettingsJson.GetAppSettings();
var igniteEndpoints = appSettingsJson["AppSettings:IgniteEndpoint"];
var igniteUser = appSettingsJson["AppSettings:IgniteUser"];
var ignitePassword = appSettingsJson["AppSettings:IgnitePassword"];
var nodeList = igniteEndpoints.Split(",");
var config = new IgniteClientConfiguration
{
Endpoints = nodeList,
UserName = igniteUser,
Password = ignitePassword,
EnablePartitionAwareness = true,
SocketTimeout = TimeSpan.FromMilliseconds(System.Threading.Timeout.Infinite)
};
return config;
}
Make it a singleton. An Ignite node, even in client mode, is supposed to be running for the entire lifetime of your application. All Ignite APIs are thread-safe. If you get a connection error, please provide more details (exception stack trace, how you create the singleton, etc.).
You can also try the Ignite thin client, which consumes fewer resources and connects instantly: https://ignite.apache.org/docs/latest/thin-clients/dotnet-thin-client.
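A minimal sketch of the singleton registration in ASP.NET Core DI, reusing the GetIgniteConfiguration helper from the question's CacheManager (the controller is shown only to illustrate injection):

// Program.cs - create one shared client for the app's lifetime.
builder.Services.AddSingleton<IIgniteClient>(_ =>
    Ignition.StartClient(GetIgniteConfiguration()));

// Controllers take the shared client via constructor injection instead of
// constructing (and reconnecting) a CacheManager on every request.
public class ProductController : ControllerBase
{
    private readonly IIgniteClient _ignite;

    public ProductController(IIgniteClient ignite) => _ignite = ignite;
}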

TriggeredFunctionData null when entering WebJobs SDK function

I am trying to use a RabbitMQ extension to the WebJobs SDK (https://github.com/Sarmaad/WebJobs.Extensions.RabbitMQ) to have it trigger when something is put on the queue.
The triggering works fine, but the content is never passed into my function.
I downloaded the source for the extension so I could debug inside it, and I can see that the content of the queue is delivered successfully and the extension repackages it into a TriggeredFunctionData object. The object is then passed to my function through the WebJobs executor.
However, as I step into my function, this object is null.
Listener method from extension lib:
_consumer.Received += (sender, args) =>
{
var triggerValue = new RabbitQueueTriggerValue {MessageBytes = args.Body};
if (args.BasicProperties != null)
{
triggerValue.MessageId = args.BasicProperties.MessageId;
triggerValue.ApplicationId = args.BasicProperties.AppId;
triggerValue.ContentType = args.BasicProperties.ContentType;
triggerValue.CorrelationId = args.BasicProperties.CorrelationId;
triggerValue.Headers = args.BasicProperties.Headers;
}
var result = _executor.TryExecuteAsync(new TriggeredFunctionData{TriggerValue = triggerValue}, CancellationToken.None).Result;
When debugging I can see that Triggervalue contains my message data.
My function being executed:
public static async Task ProcessRabbitMqTopicStatusMessage(
    [RabbitQueueTrigger("tempq")]
    [RabbitQueueBinder("myexchange", "myroutingkey", "myerrorq", autoDelete: true, durable: true, exclusive: false)]
    TriggeredFunctionData message,
    TextWriter logger)
{
    if (message != null)
    {
    }
}
This method is triggered successfully, but message is always null.
Any suggestions?
Your user function shouldn't bind directly to TriggeredFunctionData. That's an intermediate object used by the triggering infrastructure, which gets converted into final destination objects to match your function's signature.
The binding author (in this case, the RabbitMQ extension at the GitHub site you linked to) defines the set of objects that it can bind to.
From http://www.sarmaad.com/2016/11/azure-webjobs-and-rabbitmq/, here is an example of their usage:
public void IntegrateApprovedProductToMarketPlace(
    [RabbitQueueBinder("product", "product.approved", "error")]
    [RabbitQueueTrigger("integration-product-approved")]
    ProductApproved message, TextWriter log)
{
    // [handle message here]
}

Akka.Net cluster singleton - handover does not occur when the current singleton node shuts down unexpectedly

I'm trying Akka.Net Cluster Tools in order to use the singleton behavior, and it seems to work perfectly, but only when the current singleton node "host" leaves the cluster gracefully. If I suddenly shut down the host node, the handover does not occur.
Background
I'm building a system that will be composed of four nodes (initially). One of those nodes will be the "workers coordinator": it will be responsible for monitoring some data from a database and, when necessary, submitting jobs to the other workers. I was thinking of subscribing to cluster events and using the role-leader-changed event to make an actor (on the leader node) become the coordinator, but I think the Cluster Singleton is a better choice in this case.
Working sample (but only if I gracefully leave the cluster)
private void Start()
{
    Console.Title = "Worker";
    var section = (AkkaConfigurationSection)ConfigurationManager.GetSection("akka");
    var config = section.AkkaConfig;

    // Create a new actor system (a container for your actors)
    var system = ActorSystem.Create("SingletonActorSystem", config);
    var cluster = Cluster.Get(system);
    cluster.RegisterOnMemberRemoved(() => MemberRemoved(system));

    var settings = new ClusterSingletonManagerSettings("processorCoordinatorInstance",
        "worker", TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(1));

    var actor = system.ActorOf(ClusterSingletonManager.Props(
            singletonProps: Props.Create<ProcessorCoordinatorActor>(),
            terminationMessage: PoisonPill.Instance,
            settings: settings),
        name: "processorCoordinator");

    string line = Console.ReadLine();
    if (line == "g") // handover works
    {
        cluster.Leave(cluster.SelfAddress);
        _leaveClusterEvent.WaitOne();
        system.Shutdown();
    }
    else // doesn't work
    {
        system.Shutdown();
    }
}

private async void MemberRemoved(ActorSystem actorSystem)
{
    await actorSystem.Terminate();
    _leaveClusterEvent.Set();
}
Configuration
akka {
  suppress-json-serializer-warning = on
  actor {
    provider = "Akka.Cluster.ClusterActorRefProvider, Akka.Cluster"
  }
  remote {
    helios.tcp {
      port = 0
      hostname = localhost
    }
  }
  cluster {
    seed-nodes = ["akka.tcp://SingletonActorSystem@127.0.0.1:4053"]
    roles = [worker]
  }
}
Have you tried to set akka.cluster.auto-down-unreachable-after to some timeout (e.g. 10 sec)? – Horusiath
Thank you @Horusiath, your answer is totally right! I wasn't able to find this configuration in the Akka.NET documentation, and I didn't realize that I was supposed to take a look at the Akka (JVM) documentation. Thank you very much!
Posting this as an answer, as a caution for those who find this post:
Using auto-downing is NOT recommended in a clustered environment, because different parts of the system might decide after some time that the other part is down, splitting the cluster into two clusters, each with its own cluster singleton.
Related Akka docs: https://doc.akka.io/docs/akka/current/split-brain-resolver.html
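For reference, the setting from Horusiath's comment is a one-line HOCON change - shown here only as a sketch, with the caveat above in mind:

akka.cluster {
  # Marks unreachable members as down after the timeout, allowing the
  # singleton to hand over even after an abrupt shutdown.
  # CAUTION: auto-downing can split the cluster into two, each running
  # its own singleton; prefer a split-brain resolver in production.
  auto-down-unreachable-after = 10s
}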

Redis on AppHarbor - Booksleeve GetString exception

I am trying to set up Redis on AppHarbor. I have followed their instructions, and again I have an issue with the Booksleeve API. Here is the code I am using to make it work initially:
var connectionUri = new Uri(url);
using (var redis = new RedisConnection(connectionUri.Host, connectionUri.Port,
    password: connectionUri.UserInfo.Split(new[] { ':' }, 2)[1]))
{
    redis.Strings.Set(1, "greeting", "welcome to remember your stuff!");
    try
    {
        var task = redis.Strings.GetString(1, "greeting");
        redis.Wait(task);
        ViewBag.Message = task.Result;
    }
    catch (Exception)
    {
        // It throws an exception trying to wait for the task?
    }
}
However, the issue is that it sets the string correctly, but when trying to retrieve the same string from the key-value store, it throws a timeout exception while waiting for the task to execute. This code works on my local Redis server connection, though.
Am I using the API in a wrong way, or is this something related to AppHarbor?
Thanks
Like a SqlConnection, you need to call Open() (otherwise your messages are queued for delivery).
Unlike SqlConnection, you should not fire up a RedisConnection each time you need it - it is intended to be used as a shared, thread-safe, multiplexer - i.e. a single connection is held somewhere and used by lots and lots of unrelated callers. Unless of course you only need to do one thing!
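A minimal sketch of both points together, assuming Booksleeve's RedisConnection.Open() returns a task that completes once the connection is established (the SharedRedis holder class and the url constant are illustrative, not part of the library):

// One shared, thread-safe connection for the whole application.
public static class SharedRedis
{
    private const string url = "redis://user:password@host:6379"; // your AppHarbor Redis URL

    public static readonly RedisConnection Connection = Create();

    private static RedisConnection Create()
    {
        var uri = new Uri(url);
        var conn = new RedisConnection(uri.Host, uri.Port,
            password: uri.UserInfo.Split(new[] { ':' }, 2)[1]);
        conn.Wait(conn.Open()); // without Open(), commands sit in the outbound queue
        return conn;
    }
}

// Callers then reuse it instead of creating a connection per request:
var task = SharedRedis.Connection.Strings.GetString(1, "greeting");
SharedRedis.Connection.Wait(task);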