Use multiple instances of Hangfire with a single database - hangfire

Has anyone used multiple instances of Hangfire (in different applications) with the same SQL database for configuration? Instead of creating a new SQL database for each Hangfire instance, I would like to share the same database across multiple instances.
According to the Hangfire documentation here, this has been supported since v1.5. However, the forum discussions here and here show that there are still issues running multiple instances against the same database.
Update 1
Based on the suggestions and documentation, I configured Hangfire to use a queue:
public void Configure(IApplicationBuilder app, IHostingEnvironment env, ILoggerFactory loggerFactory)
{
    app.UseHangfireServer(new BackgroundJobServerOptions()
    {
        Queues = new string[] { "instance1" }
    });
}
The method to invoke:
[Queue("instance1")]
public async Task Start(int requestID)
{
}
This is how I enqueue the job:
_backGroundJobClient.Enqueue<IPrepareService>(x => x.Start(request.ID));
However, when I check the [JobQueue] table, the new job has the queue name default, and because of that Hangfire will never pick up that job, since the server only processes jobs from its configured queues.
I think this is a bug.
Update 2
I found one more thing. I am using an instance of IBackgroundJobClient, which is automatically injected by .NET Core's built-in container.
If I use the instance to enqueue the job, Hangfire creates the new job with the default queue name:
_backGroundJobClient.Enqueue<IPrepareService>(x => x.Start(request.ID));
However, if I use the static method, Hangfire creates the new job with the configured queue name instance1:
BackgroundJob.Enqueue<IPrepareService>(x => x.Start(prepareRequest.ID));
How do I configure Hangfire in .NET Core so that the injected IBackgroundJobClient instance uses the configured queue name?

This is possible by simply setting the SQL Server storage options with a different schema name for each instance.
Instance 1:
configuration.UseSqlServerStorage(
    configuration.GetConnectionString("Hangfire"),
    new SqlServerStorageOptions { SchemaName = "PrefixOne" }
);
Instance 2:
configuration.UseSqlServerStorage(
    configuration.GetConnectionString("Hangfire"),
    new SqlServerStorageOptions { SchemaName = "PrefixTwo" }
);
Both instances use the same connection string, and each will create its own set of the required tables under the schema name specified in the options.
Queues are used for having separate queues within the same Hangfire storage. If you want to use different queues, you'll need to specify the queues you want each server to listen to and then specify that queue when creating jobs. This doesn't sound like what you're trying to accomplish, though.
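If you do want to route jobs to a named queue from the injected client, one workaround (a minimal sketch, assuming Hangfire 1.6+ and the same IPrepareService as above) is to create the job directly in an EnqueuedState that names the queue, so the queue assignment does not depend on the [Queue] attribute being applied by the client's filter pipeline:
// Sketch only: create the job in an explicit EnqueuedState (Hangfire.States).
// "instance1" must match the Queues option of the server meant to process it.
var jobId = _backGroundJobClient.Create<IPrepareService>(
    x => x.Start(request.ID),
    new EnqueuedState("instance1"));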

Related

Configuring and Using Geode Regions and Locks for Atomic Data Structures

I am currently using Spring Boot Starter 1.4.2.RELEASE and Geode Core 1.0.0-incubating via Maven, against a local Docker configuration consisting of a Geode locator and two cache nodes.
I've consulted the documentation here:
http://geode.apache.org/docs/guide/developing/distributed_regions/locking_in_global_regions.html
I have configured a cache.xml file for use with my application like so:
<?xml version="1.0" encoding="UTF-8"?>
<client-cache
xmlns="http://geode.apache.org/schema/cache"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://geode.apache.org/schema/cache
http://geode.apache.org/schema/cache/cache-1.0.xsd"
version="1.0">
<pool name="serverPool">
<locator host="localhost" port="10334"/>
</pool>
<region name="testRegion" refid="CACHING_PROXY">
<region-attributes pool-name="serverPool"
scope="global"/>
</region>
</client-cache>
In my Application.java I have exposed the region as a bean via:
@SpringBootApplication
public class Application {

    @Bean
    ClientCache cache() {
        return new ClientCacheFactory().create();
    }

    @Bean
    Region<String, Integer> testRegion(final ClientCache cache) {
        return cache.<String, Integer>getRegion("testRegion");
    }

    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}
And in my "service" DistributedCounter.java:
@Service
public class DistributedCounter {

    @Autowired
    private Region<String, Integer> testRegion;

    /**
     * Using a fine-grained lock on the modifier.
     * @param counterKey {@link String} containing the key whose value should be incremented.
     */
    public void incrementCounter(String counterKey) {
        if (testRegion.getDistributedLock(counterKey).tryLock()) {
            try {
                Integer old = testRegion.get(counterKey);
                if (old == null) {
                    old = 0;
                }
                testRegion.put(counterKey, old + 1);
            } finally {
                testRegion.getDistributedLock(counterKey).unlock();
            }
        }
    }
}
I have used gfsh to configure a region named /testRegion; however, there is no option to indicate that its scope should be GLOBAL, only a variety of other options. Ideally this should be a persistent, replicated cache, hence the following command:
create region --name=/testRegion --type=REPLICATE_PERSISTENT
Using the how-to at http://geode.apache.org/docs/guide/getting_started/15_minute_quickstart_gfsh.html, it is easy to see persistence and replication working on my two-node configuration.
However, the locking in DistributedCounter above does not cause any errors; it just does not work when two processes attempt to acquire a lock on the same key - the second process is not blocked from acquiring the lock. There is an earlier code sample from the GemFire forums that uses the DistributedLockService, which the current documentation warns against using for locking region entries.
Is fine-grained locking to support a "map" of atomically incremented longs a supported use case, and if so, how should it be configured?
The Region APIs for DistributedLock and RegionDistributedLock only support Regions with Global scope. These DistributedLocks are scoped to the name of the DistributedLockService (which is the full path name of the Region), and only within the cluster. For example, if the Global Region exists on a Server, then the DistributedLocks for that Region can only be used on that Server or on other Servers within that cluster.
Cache Clients were originally a form of hierarchical caching, which means that one cluster could connect to another cluster as a Client. If a Client created an actual Global region, then the DistributedLock within the Client would only have scope within that Client and the cluster it belongs to. DistributedLocks do not propagate in any way to the Servers that such a Client is connected to.
The correct approach would be to write Function(s) that utilize the DistributedLock APIs on Global regions that exist on the Server(s). You would deploy those Functions to the Server and then invoke them on the Server(s) from the Client.
In general, use of Global regions is avoided because every individual put acquires a DistributedLock within the Server's cluster, and this is a very expensive operation.
You could do something similar with a non-Global region by creating a custom DistributedLockService on the Servers and then use Functions to lock/unlock around code that you need to be globally synchronized within that cluster. In this case, the DistributedLock and RegionDistributedLock APIs on Region (for the non-Global region) would be unavailable and all locking would have to be done within a Function on the Server using the DistributedLockService API.
This only works for server side code (in Functions for example).
From client code you can implement locking semantics using "region.putIfAbsent".
If 2 (or more) clients call this API on the same region and key, only one will successfully put, which is indicated by a return value of null. This client is considered to hold the lock. The other clients will get the object that was put by the winner. This is handy because, if the value you "put" contains a unique identifier of the client, then the losers even know who is holding the lock.
Having a region entry represent a lock has other nice benefits. The lock survives across failures. You can use region expiration to set the maximum lease time for a lock, and, as mentioned previously, it's easy to tell who is holding the lock.
Hope this helps.
It seems that gfsh does not provide an option to set the correct scope=GLOBAL.
Maybe you could start a server with the --cache-xml-file option, which points to a cache.xml file.
The cache.xml file should look like this:
<?xml version="1.0" encoding="UTF-8"?>
<cache xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns="http://schema.pivotal.io/gemfire/cache"
       xsi:schemaLocation="http://schema.pivotal.io/gemfire/cache http://schema.pivotal.io/gemfire/cache/cache-8.1.xsd"
       version="8.1" lock-lease="120" lock-timeout="60" search-timeout="300"
       is-server="true" copy-on-read="false">
<cache-server port="0"/>
<region name="testRegion">
<region-attributes data-policy="persistent-replicate" scope="global"/>
</region>
</cache>
Also, the client configuration does not need to define the scope in region-attributes.

can we use Spark sql for reporting queries in REST web services

A few basic questions regarding Spark. Can we use Spark only in the context of processing jobs? In our use case we have a stream of position and motion data which we refine and save to Cassandra tables; that is done with Kafka and Spark Streaming. But for a web user who wants to view a report with some search criteria, can we use Spark (Spark SQL), or should we restrict ourselves to CQL for this purpose? If we can use Spark, how can we invoke Spark SQL from a web service deployed in a Tomcat server?
Well, you can do it by passing a SQL request via an HTTP address, like:
http://yourwebsite.com/Requests?query=WOMAN
At the receiving point, the architecture will be something like:
Tomcat+Servlet --> Apache Kafka/Flume --> Spark Streaming --> Spark SQL inside a SS closure
In the servlet (if you don't know what a servlet is, it's worth looking up) in the web application folder of your Tomcat installation, you will have something like this:
public class QueryServlet extends HttpServlet {

    @Override
    public void doGet(HttpServletRequest request, HttpServletResponse response) {
        String requestChoice = request.getQueryString().split("=")[0];
        String requestArgument = request.getQueryString().split("=")[1];

        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "localhost:9092");
        properties.setProperty("acks", "all");
        properties.setProperty("retries", "0");
        properties.setProperty("batch.size", "16384");
        properties.setProperty("auto.commit.interval.ms", "1000");
        properties.setProperty("linger.ms", "0");
        properties.setProperty("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.setProperty("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        properties.setProperty("block.on.buffer.full", "true");

        KafkaProducer<String, String> producer = new KafkaProducer<>(properties);
        producer.send(new ProducerRecord<String, String>(requestChoice, requestArgument));
    }
}
In the running Spark Streaming application (which needs to be running already in order to catch the queries; otherwise you know how long Spark takes to start), you need to have a Kafka receiver:
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(batchInt * 1000));
Map<String, Integer> topicMap = new HashMap<>();
topicMap.put("wearable", 1);
// First DStream is a pair made of the topic and the value written to the topic
JavaPairReceiverInputDStream<String, String> kafkaStream =
    KafkaUtils.createStream(jssc, "localhost:2181", "test", topicMap);
After this, what happens is:
1. You do a GET, setting either the GET body or the query argument.
2. The GET is caught by your servlet, which immediately creates, uses, and closes a Kafka producer (it is possible to avoid the Kafka step entirely by sending your Spark Streaming app the information in some other way; see Spark Streaming receivers).
3. Spark Streaming runs your Spark SQL code like any other submitted Spark application, but it keeps running, waiting for other queries to come in.
Of course, in the servlet you should check the validity of the request, but this is the main idea - or at least the architecture I've been using.

Running hangfire single threaded "mode"

Is there any way of configuring Hangfire to run single-threaded? I'd like the jobs to be processed sequentially, rather than concurrently.
Something like:
app.UseHangfire(config =>
{
    config.RunSingleThreaded();
    config.UseServer();
});
Either this or the ability to "chain" jobs together so they happen in sequence.
Something like:
BackgroundJob
    .Enqueue(() => taskContainer.PublishBatch(batchId, accountingPeriodId, currentUser, filePath))
    .WithDependentJobId(23); // does not run until this job has finished...
Should have read the docs obviously...
http://docs.hangfire.io/en/latest/background-processing/configuring-degree-of-parallelism.html
To configure a single worker thread, use the BackgroundJobServerOptions type and specify WorkerCount:
var server = new BackgroundJobServer(new BackgroundJobServerOptions
{
    WorkerCount = 1
});
Also, it appears batch chaining is a feature of the Hangfire Pro version, although simple continuations (BackgroundJob.ContinueWith, added in Hangfire 1.4) are available in the free version.
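For the sequential case in the question, a continuation is usually enough. A minimal sketch, assuming the same taskContainer as above and a hypothetical NotifyBatchComplete follow-up method:
// Enqueue the parent job, then register a continuation that runs only after
// the parent has finished successfully. NotifyBatchComplete is hypothetical,
// not part of the original code.
var parentId = BackgroundJob.Enqueue(
    () => taskContainer.PublishBatch(batchId, accountingPeriodId, currentUser, filePath));
BackgroundJob.ContinueWith(
    parentId,
    () => taskContainer.NotifyBatchComplete(batchId));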

Read SQL Server Broker messages and publish them using NServiceBus

I am very new to NServiceBus, and in one of our projects we want to accomplish the following:
1. Whenever table data is modified in SQL Server, construct a message and insert it in a SQL Server broker queue.
2. Read the broker queue message using NServiceBus.
3. Publish the message again as another event so that other subscribers can handle it.
It is point 2 that I do not have much of a clue about how to get done.
I have referred to the following posts, after which I was able to get the message into the broker queue, but I have been unable to integrate this with NServiceBus in our project, as the NServiceBus libraries used there are of an older version and many of the methods are deprecated; using them with current versions is getting very troublesome, unless I was going about it the wrong way.
http://www.nullreference.se/2010/12/06/using-nservicebus-and-servicebroker-net-part-2
https://github.com/jdaigle/servicebroker.net
Any help on the correct way of doing this would be invaluable.
Thanks.
I'm using the current version of NServiceBus (5), VS2013 and SQL Server 2008. I created a Database Change Listener using this tutorial, which uses SQL Server Service Broker and SqlDependency to monitor the changes to a specific table. (NB: this may be deprecated in later versions of SQL Server.)
SqlDependency allows you to use a broad selection of the basic SQL functionality, although there are some restrictions that you need to be aware of. I modified the code from the tutorial slightly to provide better error information:
void NotifyOnChange(object sender, SqlNotificationEventArgs e)
{
    // Check for any errors
    if (@"Subscribe|Unknown".Contains(e.Type.ToString())) { throw _DisplayErrorDetails(e); }

    var dependency = sender as SqlDependency;
    if (dependency != null) dependency.OnChange -= NotifyOnChange;
    if (OnChange != null) { OnChange(); }
}

private Exception _DisplayErrorDetails(SqlNotificationEventArgs e)
{
    var message = "useful error info";
    var messageInner = string.Format("Type:{0}, Source:{1}, Info:{2}", e.Type.ToString(), e.Source.ToString(), e.Info.ToString());
    if (@"Subscribe".Contains(e.Type.ToString()) && @"Invalid".Contains(e.Info.ToString()))
        messageInner += "\r\n\nThe subscriber says that the statement is invalid - check your SQL statement conforms to the specified requirements (http://stackoverflow.com/questions/7588572/what-are-the-limitations-of-sqldependency/7588660#7588660).\n\n";
    return new Exception(message, new Exception(messageInner));
}
I also created a project with a "database first" Entity Framework data model to allow me to do something with the changed data.
[The relevant part of] My NServiceBus project comprises two "Run as Host" endpoints, one of which publishes event messages. The second endpoint handles the messages. The publisher has been set up with IWantToRunWhenBusStartsAndStops, which instantiates the DBListener and passes it the SQL statement I want to run as my change monitor. The OnChange() function is passed an anonymous function to read the changed data and publish a message:
using statements
namespace Sample4.TestItemRequest
{
    public partial class MyExampleSender : IWantToRunWhenBusStartsAndStops
    {
        private string NOTIFY_SQL = @"SELECT [id] FROM [dbo].[Test] WITH(NOLOCK) WHERE ISNULL([Status], 'N') = 'N'";

        public void Start() { _StartListening(); }
        public void Stop() { throw new NotImplementedException(); }

        private void _StartListening()
        {
            var db = new Models.TestEntities();
            // Instantiate a new DBListener with the specified connection string
            var changeListener = new DatabaseChangeListener(ConfigurationManager.ConnectionStrings["TestConnection"].ConnectionString);
            // Assign the code within the braces to the DBListener's OnChange event
            changeListener.OnChange += () =>
            {
                /* START OF EVENT HANDLING CODE */
                // This uses LINQ against the EF data model to get the changed records
                IEnumerable<Models.TestItems> _NewTestItems = DataAccessLibrary.GetInitialDataSet(db);
                while (_NewTestItems.Count() > 0)
                {
                    foreach (var qq in _NewTestItems)
                    {
                        // Do some processing, if required
                        var newTestItem = new NewTestStarted() { ... set properties from qq object ... };
                        Bus.Publish(newTestItem);
                    }
                    // Because there might be a number of new rows added, I grab them in small batches until finished.
                    // Probably better to use Rx to do this, but this will do for proof of concept
                    _NewTestItems = DataAccessLibrary.GetNextDataChunk(db);
                }
                changeListener.Start(string.Format(NOTIFY_SQL));
                /* END OF EVENT HANDLING CODE */
            };
            // Now everything has been set up.... start it running.
            changeListener.Start(string.Format(NOTIFY_SQL));
        }
    }
}
Important: the OnChange event firing causes the listener to stop monitoring; it is basically a single-shot notifier. After you have handled the event, the last thing to do is restart the DBListener. (You can see this in the line preceding the END OF EVENT HANDLING comment.)
You need to add a reference to System.Data and possibly System.Data.DataSetExtensions.
The project at the moment is still a proof of concept, so I'm well aware that the above can be somewhat improved. Also bear in mind I had to strip out company-specific code, so there may be bugs. Treat it as a template rather than a working example.
I also don't know if this is the right place to put the code - that's partly why I'm on StackOverflow today; to look for better examples of ServiceBus host code. Whatever the failings of my code, the solution works pretty effectively - so far - and meets your goals, too.
Don't worry too much about the ServiceBroker side of things. Once you have set it up, per the tutorial, SQLDependency takes care of the details for you.
The ServiceBroker Transport is very old and not supported anymore, as far as I can remember.
A possible solution would be to "monitor" the interesting tables from the endpoint code using something like a SqlDependency (http://msdn.microsoft.com/en-us/library/62xk7953(v=vs.110).aspx) and then push messages into the relevant queues.
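A minimal sketch of that SqlDependency approach; the connection string, table, and what you do inside the handler are assumptions, not verified project code:
// Assumes Service Broker is enabled on the database and the query meets
// SqlDependency's requirements (explicit column list, two-part table names, ...).
SqlDependency.Start(connectionString);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT [id] FROM [dbo].[Test]", conn))
{
    var dependency = new SqlDependency(cmd);
    dependency.OnChange += (sender, e) =>
    {
        // Notifications are single-shot: re-subscribe here, read the changed
        // rows, and push a message into the relevant NServiceBus queue.
    };
    conn.Open();
    using (var reader = cmd.ExecuteReader())
    {
        // The command must be executed for the subscription to become active.
    }
}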
.m

SessionFactory - one factory for multiple databases

We have a situation where we have multiple databases with identical schema, but different data in each. We're creating a single session factory to handle this.
The problem is that we don't know which database we'll connect to until runtime, when we can provide that. But on startup, to get the factory built, we need to connect to a database with that schema. We currently do this by creating the schema in a known location and using that, but we'd like to remove that requirement.
I haven't been able to find a way to create the session factory without specifying a connection. We don't expect to be able to use the OpenSession method with no parameters, and that's ok.
Any ideas?
Thanks
Andy
Either implement your own IConnectionProvider or pass your own connection to ISessionFactory.OpenSession(IDbConnection) (but read the method's comments about connection tracking)
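A minimal sketch of the second option, assuming SQL Server and a hypothetical GetConnectionStringForClient resolver; the factory is built once and the connection is chosen per call:
// Route to the right database at runtime by handing OpenSession a connection
// we open ourselves. Note NHibernate will not track or close a user-supplied
// connection, per the method's remarks.
using (var connection = new SqlConnection(GetConnectionStringForClient(clientId)))
{
    connection.Open();
    using (var session = sessionFactory.OpenSession(connection))
    {
        // Queries here run against whichever database was routed to.
    }
}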
The solution we came up with was to create a class which manages this for us. The class can use some information in the method call to do some routing logic to figure out where the database is, and then call OpenSession, passing a connection built from that connection string.
You could also use the great NuGet package from Brady Gaster for this. I made my own implementation from his NHQS package and it works very well.
You can find it here:
http://www.bradygaster.com/Tags/nhqs
good luck!
I came across this and thought I'd add my solution for future readers. It is basically what Mauricio Scheffer suggested: it encapsulates the 'switching' of the connection string and provides a single point of management (I like this better than having to pass a connection into each session call - less to 'miss' and go wrong).
I obtain the connection string during authentication of the client and set it on the context; then, using the following IConnectionProvider implementation, that value is used as the connection string whenever a session is opened:
/// <summary>
/// Provides the ability to switch connection strings of an NHibernate session factory
/// (use the same factory for multiple, dynamically specified, database connections).
/// </summary>
public class DynamicDriverConnectionProvider : DriverConnectionProvider, IConnectionProvider
{
    protected override string ConnectionString
    {
        get
        {
            var cxnObj = IsWebContext ?
                HttpContext.Current.Items["RequestConnectionString"] :
                System.Runtime.Remoting.Messaging.CallContext.GetData("RequestConnectionString");

            if (cxnObj != null)
                return cxnObj.ToString();

            // Catch on app startup when there is no request connection string set yet
            return base.ConnectionString;
        }
    }

    private static bool IsWebContext
    {
        get { return (HttpContext.Current != null); }
    }
}
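For completeness, the context value the provider reads has to be set per request before any session is opened. A hedged sketch; where you hook this in, and the resolver, are assumptions:
// Early in the request pipeline, once the client is authenticated:
HttpContext.Current.Items["RequestConnectionString"] =
    GetConnectionStringForClient(clientId); // hypothetical resolver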
Then wire it in during NHConfig:
var configuration = Fluently.Configure()
.Database(MsSqlConfiguration.MsSql2005
.Provider<DynamicDriverConnectionProvider>() //Like so