Use Multiple DBs With One Redis Lua Script? - redis

Is it possible to have one Redis Lua script hit more than one database? I currently have information of one type in DB 0 and information of another type in DB 1. My normal workflow is doing updates on DB 1 based on an API call along with meta information from DB 0. I'd love to do everything in one Lua script, but can't figure out how to hit multiple dbs. I'm doing this in Python using redis-py:
lua_script(keys=some_keys,
args=some_args,
client=some_client)
Since the client implies a specific db, I'm stuck. Ideas?

It is usually a wrong idea to put related data in different Redis databases. There is almost no benefit compared to defining namespaces by key naming conventions (no extra granularity regarding security, persistence, expiration management, etc ...). And a major drawback is the clients have to manually handle the selection of the correct database, which is error prone for clients targeting multiple databases at the same time.
Now, if you still want to use multiple databases, there is a way to make it work with redis-py and Lua scripting.
redis-py does not define a wrapper for the SELECT command (normally used to switch the current database), because of the underlying thread-safe connection pool implementation. But nothing prevents you to call SELECT from a Lua script.
Consider the following example:
$ redis-cli
SELECT 0
SET mykey db0
SELECT 1
SET mykey db1
The following script displays the value of mykey in the 2 databases from the same client connection.
import redis
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.Redis(connection_pool=pool)
lua1 = """
redis.call("select", ARGV[1])
return redis.call("get",KEYS[1])
"""
script1 = r.register_script(lua1)
lua2 = """
redis.call("select", ARGV[1])
local ret = redis.call("get",KEYS[1])
redis.call("select", ARGV[2])
return ret
"""
script2 = r.register_script(lua2)
print r.get("mykey")
print script2( keys=["mykey"], args = [1,0] )
print r.get("mykey"), "ok"
print
print r.get("mykey")
print script1( keys=["mykey"], args = [1] )
print r.get("mykey"), "misleading !!!"
Script lua1 is naive: it just selects a given database before returning the value. Its usage is misleading, because after its execution, the current database associated to the connection has changed. Don't do this.
Script lua2 is much better. It takes the target database and the current database as parameters. It makes sure that the current database is reactivated before the end of the script, so that next command applied on the connection still run in the correct database.
Unfortunately, there is no command to guess the current database in the Lua script, so the client has to provide it systematically. Please note the Lua script must reset the current database at the end whatever happens (even in case of previous error), so it makes complex scripts cumbersome and awkward.

Related

Redis how to make EVAL script behave like MULTI / EXEC?

One thing I noticed when playing around with Lua scripts is that, in a script containing multiple operations, if an error is thrown halfway through the execution of the script, the operations that completed before the error will actually be reflected in the database. This is in contrast to MULTI / EXEC, where either all operations succeed or fail.
For example, if I have a script like the following:
redis.call("hset", "mykey", "myfield", "val")
local expiry = someFunctionThatMightThrow()
redis.call("expire", "mykey", expiry)
I tested this and the results of the first hset call were reflected in redis. Is there any way to make the lua script behave so that if any error is thrown during the script, then all actions performed during that script execution are reverted?
Sample script for my comment above, on error manually rollback. Note: Syntax is not verified.
redis.call("hset", "mykey", "myfield", "val")
local expiry,error = pcall(someFunctionThatMightThrow())
if expiry ~= nil then
redis.call("expire", "mykey", expiry)
else
redis.call("hdel", "mykey", "myfield")
end

attempt to call field 'replicate_commands' (a nil value)

I use jedis + lua to eval script, here is my lua script:
redis.replicate_commands()
local second = redis.call('TIME')[1]
local currentKey = KEYS[1]..second
if redis.call('EXISTS', currentKey) == 0 then
redis.call('SETEX', currentKey, 1, 1)
return 1
else
return redis.call('INCR', currentKey)
end
As I use 'Time', it reports error:Write commands not allowed after non deterministic commands.
after searching on internet, I add 'redis.replicate_commands()' as first line of lua script, but it still reports error:ERR Error running script (call to f_c89a6ee8ad732a325e530f4a69226851cde302e2): #user_script:1: user_script:1: attempt to call field 'replicate_commands' (a nil value)
Does replicate_commands need arguments or is there a way to solve my problem?
redis version:3.0
jedis version:2.9
lua version: I don't know where to find
The error attempt to call field 'replicate_commands' (a nil value) means replicate_commands() doesn't exists in the redis object. It is a Lua-side error message.
replicate_commands() was introduced until Redis 3.2. See EVAL - Replicating commands instead of scripts. Consider upgrading.
The first error message (Write commands not allowed after non deterministic commands) is a redis-side message, you cannot call write-commands (like SET, SETEX, INCR, etc) after calling non-deterministic commands (like SPOP, SCAN, RANDOMKEY, TIME, etc).
A very important part of scripting is writing scripts that are pure functions.
Scripts executed in a Redis instance are, by default, propagated to
replicas and to the AOF file by sending the script itself -- not the
resulting commands.
This is so if the Redis server is restarted, playing again the AOF log, or also if replicated in a slave, the script should deliver the same dataset.
This is why in Redis 3.2 replicate_commands() was introduced. And starting with Redis 5 scripts are always replicated as effects -- as if replicate_commands() was called when the script started. But for versions before 3.2, you simply cannot do this.
Therefore, either upgrade to 3.2 or later, or pass currentKey already calculated to the script from the client instead.
Note that creating currentKey dynamically makes your script single-instance-only.
All Redis commands must be analyzed before execution to determine
which keys the command will operate on. In order for this to be true
for EVAL, keys must be passed explicitly. This is useful in many ways,
but especially to make sure Redis Cluster can forward your request to
the appropriate cluster node.
Note this rule is not enforced in order to provide the user with
opportunities to abuse the Redis single instance configuration, at the
cost of writing scripts not compatible with Redis Cluster.
Finally, the Lua version at Redis 3.0.0 is Lua 5.1.5, same as all the way up to Redis 6 RC1.

Is this redis lua script that deals with key expire race conditions a pure function?

I've been playing around with redis to keep track of the ratelimit of an external api in a distributed system. I've decided to create a key for each route where a limit is present. The value of the key is how many request I can still make until the limit resets. And the reset is made by setting the TTL of the key to when the limit will reset.
For that I wrote the following lua script:
if redis.call("EXISTS", KEYS[1]) == 1 then
local remaining = redis.call("DECR", KEYS[1])
if remaining < 0 then
local pttl = redis.call("PTTL", KEYS[1])
if pttl > 0 then
--[[
-- We would exceed the limit if we were to do a call now, so let's send back that a limit exists (1)
-- Also let's send back how much we would have exceeded the ratelimit if we were to ignore it (ramaning)
-- and how long we need to wait in ms untill we can try again (pttl)
]]
return {1, remaining, pttl}
elseif pttl == -1 then
-- The key expired the instant after we checked that it existed, so delete it and say there is no ratelimit
redis.call("DEL", KEYS[1])
return {0}
elseif pttl == -2 then
-- The key expired the instant after we decreased it by one. So let's just send back that there is no limit
return {0}
end
else
-- Great we have a ratelimit, but we did not exceed it yet.
return {1, remaining}
end
else
return {0}
end
Since a watched key can expire in the middle of a multi transaction without aborting it. I assume the same is the case for lua scripts. Therefore I put in the cases for when the ttl is -1 or -2.
After I wrote that script I looked a bit more in depth at the eval command page and found out that a lua script has to be a pure function.
In there it says
The script must always evaluates the same Redis write commands with
the same arguments given the same input data set. Operations performed
by the script cannot depend on any hidden (non-explicit) information
or state that may change as script execution proceeds or between
different executions of the script, nor can it depend on any external
input from I/O devices.
With this description I'm not sure if my function is a pure function or not.
After Itamar's answer I wanted to confirm that for myself so I wrote a little lua script to test that. The scripts creates a key with a 10ms TTL and checks the ttl untill it's less then 0:
redis.call("SET", KEYS[1], "someVal","PX", 10)
local tmp = redis.call("PTTL", KEYS[1])
while tmp >= 0
do
tmp = redis.call("PTTL", KEYS[1])
redis.log(redis.LOG_WARNING, "PTTL:" .. tmp)
end
return 0
When I ran this script it never terminated. It just went on to spam my logs until I killed the redis server. However time dosen't stand still while the script runs, instead it just stops once the TTL is 0.
So the key ages, it just never expires.
Since a watched key can expire in the middle of a multi transaction without aborting it. I assume the same is the case for lua scripts. Therefore I put in the cases for when the ttl is -1 or -2.
AFAIR that isn't the case w/ Lua scripts - time kinda stops (in terms of TTL at least) when the script's running.
With this description I'm not sure if my function is a pure function or not.
Your script's great (without actually trying to understand what it does), don't worry :)

How redis pipe-lining works in pyredis?

I am trying to understand, how pipe lining in redis works? According to one blog I read, For this code
Pipeline pipeline = jedis.pipelined();
long start = System.currentTimeMillis();
for (int i = 0; i < 100000; i++) {
pipeline.set("" + i, "" + i);
}
List<Object> results = pipeline.execute();
Every call to pipeline.set() effectively sends the SET command to Redis (you can easily see this by setting a breakpoint inside the loop and querying Redis with redis-cli). The call to pipeline.execute() is when the reading of all the pending responses happens.
So basically, when we use pipe-lining, when we execute any command like set above, the command gets executed on the server but we don't collect the response until we executed, pipeline.execute().
However, according to the documentation of pyredis,
Pipelines are a subclass of the base Redis class that provide support for buffering multiple commands to the server in a single request.
I think, this implies that, we use pipelining, all the commands are buffered and are sent to the server, when we execute pipe.execute(), so this behaviour is different from the behaviour described above.
Could someone please tell me what is the right behaviour when using pyreids?
This is not just a redis-py thing. In Redis, pipelining always means buffering a set of commands and then sending them to the server all at once. The main point of pipelining is to avoid extraneous network back-and-forths-- frequently the bottleneck when running commands against Redis. If each command were sent to Redis before the pipeline was run, this would not be the case.
You can test this in practice. Open up python and:
import redis
r = redis.Redis()
p = r.pipeline()
p.set('blah', 'foo') # this buffers the command. it is not yet run.
r.get('blah') # pipeline hasn't been run, so this returns nothing.
p.execute()
r.get('blah') # now that we've run the pipeline, this returns "foo".
I did run the test that you described from the blog, and I could not reproduce the behaviour.
Setting breakpoints in the for loop, and running
redis-cli info | grep keys
does not show the size increasing after every set command.
Speaking of which, the code you pasted seems to be Java using Jedis (which I also used).
And in the test I ran, and according to the documentation, there is no method execute() in jedis but an exec() and sync() one.
I did see the values being set in redis after the sync() command.
Besides, this question seems to go with the pyredis documentation.
Finally, the redis documentation itself focuses on networking optimization (Quoting the example)
This time we are not paying the cost of RTT for every call, but just one time for the three commands.
P.S. Could you get the link to the blog you read?

Data is not properly stored to hsqldb when using pooled data source by dbcp

I'm using hsqldb to create cached tables and indexed tables.
The data being stored has pretty high frequency so I need to use a connection pool.
Also because there is a lot of data I do not call checkpoint on every commit, but rather expect the data to be flushed after 50,000 rows are inserted.
So the thing is that I can see the .data file is growing but when I connect with hsqldb client I don't see the tables and the data.
So I had 2 simple tests, one inserted single row and one inserted 60,000 rows to new table. In both cases I couldn't see the result in any hsqldb client.
(Note that I use shutdown=true)
So when I add checkpoint after each commit, it solve the problem.
Also if specify in the connection string to use log, it solves the problem (I don't want the log in production though). Also not using pooled connection solved the problem and last is using pooled data source and explicitly close it before shutdown.
So I guess that some connections in the connection pool are not being closed, preventing from the db to somehow commit the changes and make them available for the client. But then, why couldn't I see the result even with 60,000 rows?
I also would expect the pool to be closed automatically...
What am I doing wrong? What is happening behind the scene?
The code to get the data source looks like this:
Class.forName("org.hsqldb.jdbcDriver");
String url = "jdbc:hsqldb:" + m_dbRoot + dbName + "/db" + ";hsqldb.log_data=false;shutdown=true;hsqldb.nio_data_file=false";
ConnectionFactory connectionFactory = new DriverManagerConnectionFactory(url, user, password);
GenericObjectPool connectionPool = new GenericObjectPool();
KeyedObjectPoolFactory stmtPool = new GenericKeyedObjectPoolFactory(null);
new PoolableConnectionFactory(connectionFactory, connectionPool, stmtPool, null, false, true);
DataSource ds = new PoolingDataSource(connectionPool);
And I'm using this Pooled data source to create table:
Connection c = m_dataSource.getConnection();
Statement st = c.createStatement();
String script = String.format("CREATE CACHED TABLE IF NOT EXISTS %s (id %s NOT NULL, entity %s NOT NULL, PRIMARY KEY (id));", m_tableName, m_idGenerator.getIdType(), TABLE_ENTITY_TYPE);
st.execute(script);
c.close;
st.close();
And insert rows:
Connection c = m_dataSource.getConnection();
c.setAutoCommit(false);
Statement stmt = c.prepareStatement(m_sqlInsert);
stmt.setObject(1, id);
stmt.setBinaryStream(2, Serializer.Helper.serialize(m_serializer, entity));
stmt.executeUpdate();
stmt.close();
stmt = null;
c.commit();
c.close();
stmt.close();
so the above seems to add data but it cannot be seen.
When I explicitly called
connectionPool.close();
Then and only then I could see the result.
I also tried to use JDBCDataSource and it worked as well.
So what is going on? And what is the right way to do this?
Your method of accessing the database from outside your application process is simply wrong.
Only one java process is supposed to connect to the file: database.
In order to achieve your aim, launch an HSQLDB server within your application, using exactly the same JDBC URL. Then connect to this server from the external client.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/listeners-chapt.html#lsc_app_start
Update: The OP commented that the external client was used after the application had stopped. Because you have turned the log off with hsqldb.log_data=false, nothing is persisted permanently. You need to perform an explicit CHECKPOINT or SHUTDOWN when your application completes its work. You cannot rely on shutdown=true at all, even without connection pooling.
See the Guide:
http://www.hsqldb.org/doc/2.0/guide/deployment-chapt.html#dec_bulk_operations