Implementing an RMW operation in Redis

I would like to maintain comma-separated lists of entries of the form <ip>:<app>, indexed by an account ID. There would be one such list per user, indexed by their account ID, with the number of users in the millions. This is mainly to track which server in a cluster a user of a certain application is connected to.
Since all servers are written in Java, with Redisson I'm currently doing:
RSet<String> set = client.getSet(accountKey);
and then I can modify the set using the typical Java container APIs supported by Redisson. I basically need three types of updates to these comma-separated lists:
Client connects to a new application = append
Client reconnects with existing application to new endpoint = modify
Client disconnects = remove
A new connection would require a change to a field like:
1.1.1.1:foo,2.2.2.2:bar -> 1.1.1.1:foo,2.2.2.2:bar,3.3.3.3:baz
A reconnect would require an update like:
1.1.1.1:foo,2.2.2.2:bar -> 3.3.3.3:foo,2.2.2.2:bar
A disconnect would require an update like:
1.1.1.1:foo,2.2.2.2:bar -> 2.2.2.2:bar
As mentioned the fields would be keyed by the account ID of the user.
My question is the following: without using Redisson, how can I implement this "directly" on top of Redis commands? The goal is to allow rewriting certain components in languages other than Java. The cluster handles close to a million requests per second.
I'm actually quite curious how Redisson implements an RSet under the hood, but I haven't had time to dig into it. I guess one option would be to use Lua, but I've never used it with Redis. Any ideas how to efficiently implement these operations on top of Redis in a manner that is easily supported by multiple languages, i.e. without relying on a specific library?

Having actually thought about the problem properly, it can be solved directly with a Redis hash (HSET), where <app> is the field name, the value is the IP, and the keys are user account IDs.
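A minimal sketch of that approach using the Jedis client (any client exposing the raw hash commands would do; the key prefix and class name here are illustrative):

    import redis.clients.jedis.Jedis;

    public class ConnectionTracker {
        private final Jedis jedis = new Jedis("localhost", 6379); // assumed endpoint

        // Connect and reconnect are the same command: HSET overwrites the
        // field if it already exists, so a reconnect simply replaces the IP.
        public void connect(String accountId, String app, String ip) {
            jedis.hset("conn:" + accountId, app, ip);
        }

        // Disconnect: HDEL removes only that application's field.
        public void disconnect(String accountId, String app) {
            jedis.hdel("conn:" + accountId, app);
        }

        // Which endpoint is this user's app connected to?
        public String lookup(String accountId, String app) {
            return jedis.hget("conn:" + accountId, app);
        }
    }

Since each of the three updates touches a single hash field and every Redis command executes atomically, no Lua scripting or transaction is needed, and the same HSET/HDEL/HGET calls are available from any language's client.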

Related

Is there any object-storage standardized protocol for storing data+metadata?

When it comes to querying databases you can rely on SQL, and when exchanging emails you can rely on SMTP.
They are both standards across vendors. We could generalize this to many other "standardized services".
This allows the backing provider of the service to be switched easily: I can change a local MySQL to a managed one, or swap one SMTP server for another, just by changing the application's configuration without changing its code.
Question:
When it comes to "store binaries", like for example say a company that needs to store photos sent by users, PDFs sent by providers and phone-call recordings from the quality department...
...when it's about "storing binaries over the net" is there any industry-standard in the form of "key-value-metadata" where the value is the binary, the metadata is a free-object (with possibly info about creation time, who is the creator, context of the creation, etc); and the ID is a function of the content (for example a Sha1 or whatever), that ensures anti-duplication measures by design?
When I say industry-standard I mean that I don't care if "behind" my connecting API, the "real" storage is implemented by a redis or a mysql or a plain-file-system or an S3 from AWS... just something "standard" given by multiple parties, to where I connect to and I can "easily change what's behind" by configuration and change from one storage-provider to another with the same simplicity we change our SMTP for example not caring if there's a sendmail or a postfix behind.

Prevent entry of GemFire cache being accessed by more than one request

I have an application using Spring Boot, GemFire, and MySQL. The Spring Boot application serves as a REST API. I want to "lock" a cache entry so that only one request sent to the REST API can access a certain entry in GemFire at a time; others cannot do CRUD on that entry until the entry's owner releases possession. I have two approaches as of now.
Approach 1 - Create a GemFire function which performs a lock/unlock on the entry when invoked by the REST API (at different times), using org.apache.geode.cache.Region.getDistributedLock.
Approach 2 - Create a region (e.g. Lock) where an entry is created when an entry of the target region (e.g. Customer) is accessed for the first time. When a second request wants to access the same entry, the REST API checks the Lock region first: it retrieves and returns the entry from the Customer region only if the key does not exist in the Lock region; otherwise, no entry is returned. Once the first requester finishes, the REST API removes the entry from the Lock region.
I am wondering if there are any alternatives besides these two options.
If you want a more space-efficient solution, you could add a boolean field to the value to indicate whether it is locked. You can then use region.replace(K,V,V) to efficiently set the "lock" on the entry as well. This will, however, leak your locking concerns into your business objects.
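A minimal sketch of that replace-based idea, assuming a hypothetical Customer value class (Region.replace(K,V,V) compares the old value using equals(), which a Java record provides for free):

    import org.apache.geode.cache.Region;

    public class EntryLockService {
        // Hypothetical value type carrying the suggested boolean "locked" field.
        record Customer(String name, boolean locked) {
            Customer withLocked(boolean l) { return new Customer(name, l); }
        }

        private final Region<String, Customer> customers;

        public EntryLockService(Region<String, Customer> customers) {
            this.customers = customers;
        }

        // replace() succeeds only if the entry still equals the unlocked
        // snapshot we read, so two concurrent requests cannot both acquire it.
        public boolean tryLock(String key) {
            Customer current = customers.get(key);
            if (current == null || current.locked()) {
                return false;
            }
            return customers.replace(key, current, current.withLocked(true));
        }

        public void unlock(String key) {
            Customer current = customers.get(key);
            if (current != null && current.locked()) {
                customers.replace(key, current, current.withLocked(false));
            }
        }
    }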

Access control of objects in Julia Web Platform

We are creating an online platform and exposing a Julia API via an embedded code editor. The user can access the API and run some analysis on our web app. I have a question related to controlling access to the API and its objects.
The API right now contains a database handle and other objects that are exposed to the user and could be used to hack the internal system.
Below is the current architecture:
UserProgram.jl:

    function doanalysis()
        data = getdata()
        # some analysis on data
    end

InternalProgram.jl:

    const client = MongoClient()
    const collection = MongoCollection(client, "dbname", "collectionName")

    function getdata()
        data = # some function to get data from the collection
        return data
    end

    # after parsing the user program
    doanalysis()
To run the user analysis, we pass the user program as a command-line argument (using the ArgParse module) and run the internal program as follows:

    $ julia InternalProgram.jl --file UserProgram.jl
With this architecture, the user potentially gets access to "client" and "collection" and can modify the internal databases.
Is there a better way to solve this problem without exposing the objects?
I hope someone has an answer to this.
You will be exposing yourself to multiple types of vulnerabilities - as a general rule, executing user-supplied code is a VERY BAD IDEA.
1/ Like you said, you'll potentially allow users to execute arbitrary code against your database.
2/ Your users will have access to all the power of Julia to do things on your server (for example, download files they can later execute, or access other servers and services on the machine [MySQL, email, etc.]). Depending on the level of access of the Julia process, think unauthorized access to your file system, installing key loggers, running spam servers, etc.
3/ They will be able to use Julia packages and get you into a lot of trouble - for example, add the Requests.jl package and execute DoS attacks on other servers.
If you really want to go this way, I recommend that you:
A/ set proper (minimal) permissions for the MongoDB user configured to be used in the app (ex: http://blog.mlab.com/2016/07/mongodb-tips-tricks-collection-level-access-control/)
B/ execute each user's code in a separate sandbox / container that exposes only the minimum necessary software
C/ have your containers running on a managed platform where tooling exists (firewalls) to monitor incoming and outgoing traffic (for example to block spam or DoS attacks)
In order to achieve B/ and C/, my recommendation is to use JuliaBox. I haven't used it myself, but it seems to be exactly what you need: https://github.com/JuliaCloud/JuliaBox
Once you get that running, you can also use https://github.com/JuliaWeb/JuliaWebAPI.jl

Event Processing - Module channel

Could you explain the usage of modules with a select query?
For example, if I write (as shown on this page https://cumulocity.com/guides/users-guide/administration/):

    select * from MeasurementCreated

Is it useful for getting real-time notifications by subscribing to the related channel? Is the module reachable from an AngularJS module? Can this module be used in other CEL statements?
Just selecting data without putting it into another stream can make sense in case you want to make this data available via a real-time channel to some external application (which could of course be an AngularJS app).
Take a look at this section in the docs: http://cumulocity.com/guides/reference/real-time-statements/#notifications
This particular example, though, does not make much sense, because raw measurement data is already provided on a real-time channel:
http://www.cumulocity.com/guides/reference/measurements/#notifications
As for the second part of the question:
Yes, it is possible to communicate with other modules within your tenant.
E.g., you can declare a stream in module A and it will be available in module B.
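For illustration, a sketch in Esper-style EPL (Cumulocity's CEL is Esper-based; the stream name MyMeasurements is made up):

    // Module A: route events from the built-in stream into a named stream
    insert into MyMeasurements
    select * from MeasurementCreated;

    // Module B: statements in another module of the same tenant can now
    // read from that stream
    select * from MyMeasurements;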

Securely storing variables without session in zope

I want to store values in variables to access from another page (a.k.a. state management).
I cannot use sessions, since I have multiple Zope instances: if one fails, the user needs to be redirected to another Zope instance, and a session is valid only for one Zope instance.
Now my remaining options are
submitting a hidden input tag using the POST method
passing values through the URL with the GET method
using cookies
using a database (which I think is "making simple things complex")
I am not even considering the first two methods, and I think using cookies is not secure.
So, is there a commercial or open-source module that can securely (with encryption, etc.) do cookie management?
If not, I will have to use a database.
Please inform me if I am missing something.
Version - Zope 2.11.1
The SESSION support built into Zope 2 actually keeps the session in a temporary partition of the ZODB, so I think it is in fact valid for multiple Zope clients connecting to the same ZEO server. The cost of this is that all session changes invoke the transaction machinery and result in a commit, so just make sure you're not using the SESSION in something very low-level like PAS auth, or you'll have commits hitting your ZODB for every image, CSS file, and JS file.