Fetch all regions from Gemfire with Spring-data-gemfire - gemfire

I am developing a very simple dashboard to clear Gemfire regions for testing purposes, mainly so that testers have a tool to do this themselves.
I would like to dynamically fetch the names of the currently available Regions to clear.
I have searched the spring-data-gemfire documentation but couldn't find a way to get all region names.
The best hint I have so far is <gfe:auto-region-lookup/>, but I guess I would still need a cache.xml with all region names, and I am also not sure how to dynamically display their names or how to remove all data from those regions.
Thanks

<gfe:auto-region-lookup> is meant to automatically create beans in the Spring ApplicationContext for all GemFire Regions that have been explicitly created outside the Spring context (i.e. cache.xml or using GemFire's relatively new Cluster-based Configuration Service). However, a developer must use and/or enable those mechanisms to employ the auto-region-lookup functionality.
To get a list of all Region names in the GemFire "cluster", you need something equivalent to Gfsh's 'list region' command, which employs a Function to gather up all the Regions defined in the GemFire (Cache) cluster.
Note that members can define different Regions, i.e. all members participating in the cluster do not necessarily have to define the same Regions. In most cases they do, since it is beneficial for replication and HA purposes. Still, some members may define local Regions that only that member will use.
To go on to clear the Regions from the list, you would again need to employ a GemFire Function to "clear" the other Regions in the cluster that the inquiring, acting member does not currently define.
Of course, this problem is real simple if you only want to clear Regions defined on the member itself...
@Autowired
private Cache gemfireCache;
...
public void clearRegions() {
    for (Region<?, ?> rootRegion : gemfireCache.rootRegions()) {
        // subregions(true) returns all nested subregions recursively.
        for (Region<?, ?> subRegion : rootRegion.subregions(true)) {
            subRegion.clear();
        }
        rootRegion.clear();
    }
}
See rootRegions() and subregions(recursive:boolean) for more details.
Note, GemFire's Cache interface implements the RegionService interface.
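For the dashboard in the question, the same traversal can also collect the Region names to display. The following is only a minimal, self-contained sketch: RegionNode and SimpleRegion are hypothetical stand-ins (not GemFire types) so the example runs without GemFire on the classpath; only getFullPath() and subregions(boolean) mirror methods on GemFire's Region.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for GemFire's Region; the two methods mirror
// Region.getFullPath() and Region.subregions(boolean).
interface RegionNode {
    String getFullPath();
    Set<RegionNode> subregions(boolean recursive);
}

// Illustrative in-memory implementation, present only so the sketch runs.
class SimpleRegion implements RegionNode {
    private final String fullPath;
    private final Set<RegionNode> children = new LinkedHashSet<>();

    SimpleRegion(String fullPath) {
        this.fullPath = fullPath;
    }

    SimpleRegion addChild(String name) {
        SimpleRegion child = new SimpleRegion(fullPath + "/" + name);
        children.add(child);
        return child;
    }

    @Override
    public String getFullPath() {
        return fullPath;
    }

    @Override
    public Set<RegionNode> subregions(boolean recursive) {
        Set<RegionNode> result = new LinkedHashSet<>(children);
        if (recursive) {
            for (RegionNode child : children) {
                result.addAll(child.subregions(true));
            }
        }
        return result;
    }
}

class RegionNames {
    // Collects the full path of every root Region and all of its subregions.
    static List<String> listAll(Set<? extends RegionNode> rootRegions) {
        List<String> names = new ArrayList<>();
        for (RegionNode root : rootRegions) {
            names.add(root.getFullPath());
            for (RegionNode sub : root.subregions(true)) {
                names.add(sub.getFullPath());
            }
        }
        return names;
    }
}
```

In a real application the input would presumably come from gemfireCache.rootRegions(), and, as noted above, it would only cover Regions defined on the local member.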
Hope this helps.
Cheers!

Related

Gemfire spring example

The example at https://spring.io/guides/gs/caching-gemfire/ shows that if there is a cache miss, we have to fetch the data from a server and store in the cache.
Is this an example of Gemfire running as the Gemfire server or is it a Gemfire client? I thought a client would automatically fetch the data from a Server if there is a cache miss. If that is the case, would there ever be a cache miss for the client?
Regards,
Yash
First, I think you are missing the point of the core Spring Framework's Cache Abstraction. I encourage you to read more about the Cache Abstraction's intended purpose here.
In a nutshell, if one of your application objects makes a call to some "external", "expensive" service to access a resource, then caching may be applicable, especially if the inputs passed result in the exact same output every single time.
So, for a moment, let's imagine your application makes a call to the Geocoding API in the Google Maps API to translate between addresses and (the inverse) latitude/longitude coordinates.
You might have an application Spring @Service component like so...
@Service("AddressService")
class MyApplicationAddressService {

    @Autowired
    private GoogleGeocodingApiDao googleGeocodingApiDao;

    @Cacheable("Address")
    public Address getAddressFor(Point location) {
        return googleGeocodingApiDao.convert(location);
    }
}

@Region("Address")
class Address {
    private Point location;
    private State state;
    private String street;
    private String city;
    private String zipCode;
    ...
}
Clearly, given a latitude/longitude (input), it should produce the same Address (result) every time. Also, since making a (network) call to an external API like Google's Geocoding service can be very expensive, both to access the resource and to perform the conversion, this type of service call is a perfect candidate for caching in our application.
Among many other caching providers (e.g. EhCache, Hazelcast, Redis, etc.), you can, of course, use Pivotal GemFire, or the open source alternative, Apache Geode, to back Spring's Caching Abstraction.
In your Pivotal GemFire/Apache Geode setup, you can of course use either the peer-to-peer (P2P) or client/server topology, it doesn't really matter, and GemFire/Geode will do the right thing, once "called upon".
But, as the Spring Cache Abstraction documentation states, when you call one of your application component's methods (e.g. getAddressFor(:Point)) that supports caching (with @Cacheable), the interceptor will first "consult" the cache before making the method call. If the value is present in the cache, then that value is returned and the "expensive" method call (e.g. getAddressFor(:Point)) will not be invoked.
However, if there is a cache miss, then Spring will proceed to invoke the method and, upon successful return from the method invocation, cache the result of the call in the backing cache provider (such as GemFire/Geode) so that the next time the method is invoked with the same input, the cached value will be returned.
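The flow described above can be sketched in plain Java. This is only an illustration of the look-aside pattern, not Spring's actual interceptor; LookAsideCache and its invocation counter are hypothetical names, and a ConcurrentHashMap stands in for the backing cache provider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper illustrating the look-aside flow; not Spring code.
class LookAsideCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> expensiveCall;
    private final AtomicInteger invocations = new AtomicInteger();

    LookAsideCache(Function<K, V> expensiveCall) {
        this.expensiveCall = expensiveCall;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: the expensive call is skipped
        }
        // Cache miss: invoke the "method", then store its result.
        invocations.incrementAndGet();
        V result = expensiveCall.apply(key);
        if (result != null) {
            cache.put(key, result);
        }
        return result;
    }

    // Number of times the expensive call was actually made.
    int invocations() {
        return invocations.get();
    }
}
```

Calling get twice with the same key invokes the wrapped function only once; the second call is answered from the map.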
Now, if your application is using the client/server topology, then of course, the client cache will forward the request on to the server if...
The corresponding client Region is a PROXY, or...
The corresponding client Region is a CACHING_PROXY, and the client's local client-side Region does not contain the requested Point for the Address.
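To make the two cases above concrete, here is a simplified model of the lookup path. It is not GemFire code: ClientRegionModel is a hypothetical class, and plain maps stand in for the server-side Region and the client's local storage.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the two client Region policies.
class ClientRegionModel {
    enum Policy { PROXY, CACHING_PROXY }

    private final Policy policy;
    private final Map<String, String> server;                  // stands in for the server Region
    private final Map<String, String> local = new HashMap<>(); // client-side storage
    private int serverRequests = 0;

    ClientRegionModel(Policy policy, Map<String, String> server) {
        this.policy = policy;
        this.server = server;
    }

    String get(String key) {
        // A CACHING_PROXY answers from local storage when it can.
        if (policy == Policy.CACHING_PROXY && local.containsKey(key)) {
            return local.get(key);
        }
        // Otherwise (PROXY, or a local miss) the request goes to the server.
        serverRequests++;
        String value = server.get(key);
        if (policy == Policy.CACHING_PROXY && value != null) {
            local.put(key, value);
        }
        return value;
    }

    int serverRequests() {
        return serverRequests;
    }
}
```

If the server has no entry either, the lookup returns null. That is the cache miss on which Spring goes on to invoke the @Cacheable method, so a client can still see cache misses.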
I encourage you to read more about different client Region data management policies here.
To see another working example of Spring's Caching Abstraction backed by Pivotal GemFire in Action, have a look at...
caching-example
I used this example in my SpringOne-2015 talk to explain caching with GemFire/Geode as the caching provider. This particular example makes an external request to a REST API to get the "Quote of the Day".
Hope this helps!
Cheers,
John

SpringData Gemfire inserting fake data on Dev env

I am developing some app using Gemfire and it would be great to be able to provide some fake data while in Dev environment.
So instead of doing it in the code like I do today, I was thinking about using the Spring application-context.xml to pre-load some dummy data in the region I am currently working on. Something close to what DBUnit does, but for DEV rather than Test scope.
Later I could just switch envs on Spring and that data would not be loaded.
Is it possible to add data using SpringData Gemfire to a local data grid?
Thanks!
There is no direct support in Spring Data GemFire to load data into a GemFire cluster. However, there are several options afforded to a SDG/GemFire developer to load data.
The most common approach is to define a GemFire CacheLoader attached to the Region. However, this approach is "lazy" and only loads data from a (potentially) external data source on a cache miss. Of course, you could program the logic in the CacheLoader to "prefetch" a number of entries in a somewhat "predictive" manner based on data access patterns. See GemFire's User Guide for more details.
Still, we can do better than this since it is more likely that you want to "preload" a particular data set for development purposes.
Another, more effective technique, is to use a Spring BeanPostProcessor registered in your Spring ApplicationContext that post processes your "Region" bean after initialization. For instance...
Where the RegionPutAllBeanPostProcessor is implemented as...
package example;

import java.util.Collections;
import java.util.Map;

import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanPostProcessor;
import org.springframework.util.Assert;
import org.springframework.util.StringUtils;

// Package used by Pivotal GemFire 8.x and earlier; Apache Geode uses org.apache.geode.cache.Region.
import com.gemstone.gemfire.cache.Region;

public class RegionPutAllBeanPostProcessor implements BeanPostProcessor {

    private Map regionData;
    private String targetRegionBeanName;

    protected Map getRegionData() {
        return (regionData != null ? regionData : Collections.emptyMap());
    }

    public void setRegionData(final Map regionData) {
        this.regionData = regionData;
    }

    protected String getTargetRegionBeanName() {
        Assert.state(StringUtils.hasText(targetRegionBeanName), "The target Region bean name was not properly specified!");
        return targetRegionBeanName;
    }

    public void setTargetRegionBeanName(final String targetRegionBeanName) {
        Assert.hasText(targetRegionBeanName, "The target Region bean name must be specified!");
        this.targetRegionBeanName = targetRegionBeanName;
    }

    @Override
    public Object postProcessBeforeInitialization(final Object bean, final String beanName) throws BeansException {
        return bean;
    }

    // Once the target Region bean is initialized, load the configured data into it.
    @Override
    @SuppressWarnings("unchecked")
    public Object postProcessAfterInitialization(final Object bean, final String beanName) throws BeansException {
        if (beanName.equals(getTargetRegionBeanName()) && bean instanceof Region) {
            ((Region) bean).putAll(getRegionData());
        }
        return bean;
    }
}
It is not too difficult to imagine that you could inject a DataSource of some type to pre-populate the Region. The RegionPutAllBeanPostProcessor was designed to accept a specific Region (based on the Region bean's ID) to populate. So you could define multiple instances, each taking a different Region and (perhaps) a different DataSource to populate the Region(s) of choice. This BeanPostProcessor just takes a Map as the data source, but of course, it could be any Spring managed bean.
Finally, it is a simple matter to ensure that one or more instances of the RegionPutAllBeanPostProcessor are only used in your DEV environment by taking advantage of Spring bean profiles...
<beans>
  ...
  <beans profile="DEV">
    <bean class="example.RegionPutAllBeanPostProcessor">
      ...
    </bean>
    ...
  </beans>
</beans>
Usually, loading pre-defined data sets is very application-specific in terms of the "source" of the pre-defined data. As my example illustrates, the source could be as simple as another Map. However, it could also be a JDBC DataSource, or perhaps a Properties file, or, well, anything for that matter. It is usually up to the developer's preference.
Though, one thing that might be useful to add to Spring Data GemFire would be to load data from a GemFire Cache Region Snapshot. I.e. data that may have been dumped from a QA or UAT environment, or perhaps even scrubbed from PROD for testing purposes. See GemFire Snapshot Service for more details.
Also see the JIRA ticket (SGF-408) I just filed to add this support.
Hopefully this gives you enough information and/or ideas to get going. Later, I will add first-class support into SDG's XML namespace for preloading data sets.
Regards,
John

Using Redis as a cache storage for multiple applications on the same server

I want to use Redis as a cache storage for multiple applications on the same physical machine.
I know at least two ways of doing it:
by running several Redis instances on different ports;
by using different Redis databases for different applications.
But I don't know which one is better for me.
What are advantages and disadvantages of these methods?
Is there any better way of doing it?
Generally, you should prefer the 1st approach, i.e. dedicated Redis servers. Shared databases are managed by the same Redis process and can therefore block each other. Additionally, shared databases share the same configuration (although in your case this may not be an issue since all databases are intended for caching). Lastly, shared databases are not supported by Redis Cluster.
For more information refer to this blog post: https://redislabs.com/blog/benchmark-shared-vs-dedicated-redis-instances
We solved this problem by namespacing the keys. Initially we tried using databases, where each database ID would be used by a specific application. However, that idea was not scalable since there is a limited number of databases; plus, in Premium offerings (like Azure Cache for Redis Premium instances with Sharding enabled), the concept of a database is not used.
The solution we used is attaching a unique prefix to all keys. Each application is assigned a unique moniker, which is prefixed in front of each key.
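As a rough illustration of the prefixing idea (not the URP implementation), the sketch below wraps a shared store so that every key is prefixed with the application's moniker. NamespacedCache, the ":" separator and the Map standing in for the shared Redis instance are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper; the Map stands in for a shared Redis instance.
class NamespacedCache {
    private final String prefix;
    private final Map<String, String> store;

    NamespacedCache(String appName, Map<String, String> store) {
        this.prefix = appName + ":"; // separator choice is an assumption
        this.store = store;
    }

    // Builds the physical key actually written to the shared store.
    String key(String logicalKey) {
        return prefix + logicalKey;
    }

    void put(String logicalKey, String value) {
        store.put(key(logicalKey), value);
    }

    String get(String logicalKey) {
        return store.get(key(logicalKey));
    }
}
```

Two applications can then use the same logical key, such as "user:1", against the same store without overwriting each other's entries.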
To reduce churn, we have built a framework (URP). If you are using StackExchange.Redis then you will be able to use the URP SDK directly. In case it helps, I have added some references.
Source Code and Documentation - https://github.com/microsoft/UnifiedRedisPlatform.Core/wiki/Management-Console
Blog Post (idea) - https://www.devcompost.com/post/__urp
Using a different cache manager for each application also works; this is the approach I am using.
For example:
@Bean(name = "myCacheManager")
public CacheManager cacheManager(RedisTemplate<String, Object> redisTemplate) {
    RedisCacheManager cacheManager = new RedisCacheManager(redisTemplate);
    return cacheManager;
}

@Bean(name = "customKeyGenerator")
public KeyGenerator keyGenerator() {
    return new KeyGenerator() {
        @Override
        public Object generate(Object o, Method method, Object... objects) {
            // This will generate a unique key of the class name, the method name,
            // and all method parameters appended.
            StringBuilder sb = new StringBuilder();
            sb.append(o.getClass().getName());
            sb.append(method.getName());
            for (Object obj : objects) {
                sb.append(obj.toString());
            }
            return sb.toString();
        }
    };
}
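One caveat with appending the parts directly: different parameter lists can produce the same string, for example ("ab", "c") and ("a", "bc"). The sketch below contrasts that with a delimited variant. CacheKeys and both method names are hypothetical, and the ":" delimiter is an assumption; parameters that themselves contain the delimiter could still collide.

```java
import java.util.StringJoiner;

// Hypothetical helper contrasting plain concatenation with delimited keys.
class CacheKeys {

    // Mirrors the approach above: parts are appended with no separator.
    static String concatenated(String className, String methodName, Object... params) {
        StringBuilder sb = new StringBuilder();
        sb.append(className).append(methodName);
        for (Object p : params) {
            sb.append(String.valueOf(p));
        }
        return sb.toString();
    }

    // Joins the parts with ":" so that parameter boundaries are preserved.
    static String delimited(String className, String methodName, Object... params) {
        StringJoiner joiner = new StringJoiner(":");
        joiner.add(className).add(methodName);
        for (Object p : params) {
            joiner.add(String.valueOf(p)); // String.valueOf tolerates null parameters
        }
        return joiner.toString();
    }
}
```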

Using javaconfig to create regions in gemfire

Is it possible to use JavaConfig (i.e. annotations) in Spring instead of XML to create client regions in Spring GemFire?
I also need to plug a cache loader and a cache writer into the regions created. How can that be done?
I want to configure the client pool as well. How can that be done?
There is a good example of this in the spring.io guides. However, GemFire APIs are factories, wrapped by Spring FactoryBeans in Spring Data Gemfire, so I find XML actually more straightforward for configuring Cache and Regions.
Regarding... "how can I create a client region in a distributed environment?"
In the same way the Spring IO guides demonstrate Regions defined in a peer cache on a GemFire Server, something similar to...
@Bean
public ClientRegionFactoryBean<Long, Customer> clientRegion(final ClientCache clientCache) {
    return new ClientRegionFactoryBean<Long, Customer>() {
        {
            setCache(clientCache);
            setName("Customers");
            // Or just PROXY if the client is not required to store data, or perhaps another shortcut type.
            setShortcut(ClientRegionShortcut.CACHING_PROXY);
            ...
        }
    };
}
Disclaimer, I did not test this code snippet, so it may need minor tweaking along with additional configuration as necessary by the application.
You will, of course, also define a ClientCache along with a Pool in your Spring config, or use the corresponding XML element abstraction on the client side.

Best way to share data between .NET application instance?

I have created a WCF service (hosted in a Windows Service) on load-balanced servers. Each service instance maintains a list of its current users. E.g. instance A has users A001, A002, A005; instance B has users A003, A004, A008; and so on.
Each service has an interface method to get the user list, and I expect this method to return all users across all service instances. E.g. getting the user list from instance A or instance B will return A001, A002, A003, A004, A005 and A008.
Currently I am thinking of storing the list of current users in a database, but this list seems to be updated very often.
Is there another way to share data between WCF services that suits my situation?
Personally, the database option sounds like overkill to me just based on the notion of storing current users. If you are actually storing more than that, then using a database may make sense. But assuming you simply want a list of current users from both instances of your WCF service, I would use an in-memory solution, something like a static generic dictionary. As long as the services can be uniquely identified, I'd use the unique service ID as the key into the dictionary and just pair each key with a generic list of user names (or some appropriate user data structure) for that service. Something like:
private static Dictionary<Guid, List<string>> _currentUsers;
Since this dictionary would be shared between two WCF services, you'll need to synchronize access to it. Here's an example.
public class MyWCFService : IMyWCFService
{
    private static Dictionary<Guid, List<string>> _currentUsers =
        new Dictionary<Guid, List<string>>();

    private void AddUser(Guid serviceID, string userName)
    {
        // Synchronize access to the collection via the SyncRoot property.
        lock (((ICollection)_currentUsers).SyncRoot)
        {
            // Check if the service's ID has already been added.
            if (!_currentUsers.ContainsKey(serviceID))
            {
                _currentUsers[serviceID] = new List<string>();
            }
            // Make sure to only store the user name once for each service.
            if (!_currentUsers[serviceID].Contains(userName))
            {
                _currentUsers[serviceID].Add(userName);
            }
        }
    }

    private void RemoveUser(Guid serviceID, string userName)
    {
        // Synchronize access to the collection via the SyncRoot property.
        lock (((ICollection)_currentUsers).SyncRoot)
        {
            // Check if the service's ID has already been added.
            if (_currentUsers.ContainsKey(serviceID))
            {
                // See if the user name exists.
                if (_currentUsers[serviceID].Contains(userName))
                {
                    _currentUsers[serviceID].Remove(userName);
                }
            }
        }
    }
}
Given that you don't want users listed twice for a specific service, it would probably make sense to replace the List<string> with HashSet<string>.
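The set-based variant can be sketched as follows, written in Java to match the other examples on this page rather than in C#. UserRegistry and its method names are hypothetical; a concurrent map of concurrent sets replaces the explicit lock, and the set makes the "store each user once" check implicit.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory registry: one set of user names per service ID.
class UserRegistry {
    private final Map<UUID, Set<String>> usersByService = new ConcurrentHashMap<>();

    void addUser(UUID serviceId, String userName) {
        // Creates the per-service set on first use; adding a duplicate is a no-op.
        usersByService
            .computeIfAbsent(serviceId, id -> ConcurrentHashMap.newKeySet())
            .add(userName);
    }

    void removeUser(UUID serviceId, String userName) {
        Set<String> users = usersByService.get(serviceId);
        if (users != null) {
            users.remove(userName);
        }
    }

    // Returns the sorted union of the users of all services.
    Set<String> allUsers() {
        Set<String> all = new TreeSet<>();
        for (Set<String> users : usersByService.values()) {
            all.addAll(users);
        }
        return all;
    }
}
```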
A database would seem to offer a persistent store which may be useful or important for your application. In addition it supports transactions etc which may be useful to you. Lots of updates could be a performance problem, but it depends on the exact numbers, what the query patterns are, database engine used, locality etc.
An alternative to this option might be some sort of in-memory caching server like memcached. Whilst this can be shared and accessed in a similar (sort of) way to a database server there are some caveats. Firstly, these platforms are generally not backed by some sort of permanent storage. What happens when the memcached server dies? Second they may not be ACID-compliant enough for your use. What happens under load in terms of additions and updates?
I like the in-memory way. Actually, I am designing the same mechanism for one of the projects I'm working on now. This is good for scenarios where you don't have the opportunity to access a database, or where people are reluctant to create a table to store simple info like a list of users against a machine name.
The only change I'd make is that a node returns only the list of its own users to its peer, and the peer combines that with its existing list, then returns its own list to the peer that called. That's how all the peers stay in sync with the same list.
The DB option sounds good. If there are no performance issues, it is a simple design that should work. If you can afford to be semi-realtime and non-persistent, one way would be to maintain the list in memory in each service and have each service update the others when a new user joins. This can be done as some kind of broadcast via a centralised service, or using MSMQ, etc.
If you reconsider and host using IIS you will find that with a single line in a config file you can make the ASP Global, Application and Session objects available. This trick is also very handy because it means you can share session state between an ASP application and a WCF service.