I can increment the key "days" with this command:
$ redis-cli
127.0.0.1:6379> set days 1
OK
127.0.0.1:6379> incr days
(integer) 2
127.0.0.1:6379> get days
"2"
How can I increment it automatically every 24 hours?
First you need to add a Celery configuration (see the Celery docs). Something like this:
import os
from celery import Celery
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'proj.settings')
app = Celery('allunac', broker='redis://localhost:6379/0')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
I chose Redis as the broker because you already use it in your project, but you can choose another broker such as RabbitMQ (see the docs).
Because you need the task to run at regular intervals, you also need celery beat (see the docs).
Add your task:
from datetime import timedelta
from django.core.cache import cache
from celery.decorators import periodic_task

# 30 seconds is handy for testing; use timedelta(hours=24) for a daily run
@periodic_task(run_every=timedelta(seconds=30))
def redis_add():
    if not cache.get('days'):
        cache.set('days', 1)  # set initial value
    else:
        cache.incr('days', 2)  # increase by 2
Run celery with beat:
celery -A proj worker -l info -B
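For the 24-hour interval the question asks about, a possible alternative is to declare the schedule in the Django settings instead of on the decorator. This is only a sketch: it assumes Celery 4 or later, and it assumes redis_add has been registered as a regular task under the hypothetical name proj.tasks.redis_add. Because the app above is configured with namespace='CELERY', the beat schedule is read from the CELERY_BEAT_SCHEDULE setting:

```python
from datetime import timedelta

# Goes into the Django settings module (proj/settings.py in the example above).
# "proj.tasks.redis_add" is a hypothetical task name; use the name under which
# your task is actually registered.
CELERY_BEAT_SCHEDULE = {
    "increment-days-daily": {
        "task": "proj.tasks.redis_add",
        "schedule": timedelta(hours=24),  # run once every 24 hours
    },
}
```

The worker is still started with the -B option (or a separate beat process) so that the schedule is actually executed.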
I am a newbie to Redis and am trying to understand the concept of Redis Pub/Sub.
Step 1:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "foo"
3) (integer) 1
In step 1, I subscribed to channel foo while connected to database 1.
Step 2:
root@01a623a828db:/data# redis-cli -n 4
127.0.0.1:6379[4]> publish foo 2
(integer) 1
In step 2, I published a message to channel foo while connected to database 4.
Step 3:
root@01a623a828db:/data# redis-cli -n 1
127.0.0.1:6379[1]> subscribe foo
Reading messages... (press Ctrl-C to quit)
..........................................
1) "message"
2) "foo"
3) "2"
In step 3, the subscriber on database 1 received the message that was published on database 4 in step 2.
I tried to find out the reason behind this, but I found the same answer everywhere: "Pub/Sub has no relation to the key space. It was made to not interfere with it on any level, including database numbers. Publishing on db 10, will be heard by a subscriber on db 1. If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)." This is from the official Redis Pub/Sub documentation.
Questions:
Why is the Redis Pub/Sub architecture independent of the database?
How do I implement "If you need scoping of some kind, prefix the channels with the name of the environment (test, staging, production)"?
"Publishing on db 10, will be heard by a subscriber on db 1." This does not seem to be in line with the statement
"It was made to not interfere with it on any level, including database numbers."
It's really a matter of design choice.
If you need scoping, you can always prefix the channel name. For example, a channel productupdate would be test:productupdate in the test environment and staging:productupdate in the staging environment.
It is actually in line with the statement: the database number simply doesn't matter for Pub/Sub.
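The prefixing convention can be sketched in a few lines of Python. The helper names scoped_channel and is_in_scope are made up for illustration, and the ":" separator is just the convention used in the example above, not something Redis enforces:

```python
def scoped_channel(env, name):
    """Build an environment-scoped channel name, e.g. 'test:productupdate'."""
    if not env or ":" in env:
        raise ValueError("env must be a non-empty name without ':'")
    return f"{env}:{name}"


def is_in_scope(env, channel):
    """Return True if the channel name belongs to the given environment."""
    return channel.startswith(f"{env}:")
```

With the redis-py client, for instance, the publisher would call r.publish(scoped_channel("test", "productupdate"), "2"), and a subscriber interested in every test channel could use psubscribe("test:*") on its pubsub object. Both sides can be connected to any database number.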
For my task I need to load a bulk of data into Redis as fast as possible. It looks like this article covers my case: https://redis.io/topics/mass-insert
The article starts with an example of feeding multiple inline SET commands to redis-cli. It then moves on to generating the Redis protocol and feeding that to redis-cli instead. It doesn't explain the reasons for, or the benefits of, using the Redis protocol.
Using the Redis protocol is a bit harder and it generates a bit more traffic. What are the reasons to use the Redis protocol rather than simple one-line commands? Is it perhaps that, although the data is larger, it is easier (and faster) for Redis to parse?
Good point. The article gives this reason for not using a normal client:
Only a small percentage of clients support non-blocking I/O, and not
all the clients are able to parse the replies in an efficient way in
order to maximize throughput. For all this reasons the preferred way
to mass import data into Redis is to generate a text file containing
the Redis protocol, in raw format, in order to call the commands
needed to insert the required data.
What I understood is that you emulate a client when you use the Redis protocol directly, which avoids the client limitations mentioned in the quote above.
Based on the docs you provided, I tried these scripts:
test.rb
def gen_redis_proto(*cmd)
  proto = ""
  proto << "*"+cmd.length.to_s+"\r\n"
  cmd.each{|arg|
    proto << "$"+arg.to_s.bytesize.to_s+"\r\n"
    proto << arg.to_s+"\r\n"
  }
  proto
end
(0...100000).each{|n|
  STDOUT.write(gen_redis_proto("SET","Key#{n}","Value#{n}"))
}
test_no_protocol.rb
(0...100000).each{|n|
  STDOUT.write("SET Key#{n} Value#{n}\r\n")
}
ruby test.rb > 100k.txt
ruby test_no_protocol.rb > 100k_no_prot.txt
time cat 100k.txt | redis-cli --pipe
time cat 100k_no_prot.txt | redis-cli --pipe
I've got these results:
teixeira: ~/stackoverflow $ time cat 100k.txt | redis-cli --pipe
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 100000
real 0m0.168s
user 0m0.025s
sys 0m0.015s
(5 file(s), 6.6 MB)
teixeira: ~/stackoverflow $ time cat 100k_no_prot.txt | redis-cli --pipe
All data transferred. Waiting for the last reply...
Last reply received from server.
errors: 0, replies: 100000
real 0m0.433s
user 0m0.026s
sys 0m0.012s
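For reference, here is a Python sketch equivalent to gen_redis_proto in test.rb. It shows what the protocol looks like for a single command: every argument is preceded by its length in bytes, so the server knows how much to read instead of having to split an inline line into arguments, which may be part of why the protocol file loaded faster in the timings above. The function name simply mirrors the Ruby one:

```python
def gen_redis_proto(*args):
    """Encode one command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]       # number of arguments
    for arg in args:
        data = str(arg).encode("utf-8")
        # "$<length in bytes>\r\n<data>\r\n" for each argument
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


encoded = gen_redis_proto("SET", "Key0", "Value0")
print(encoded)
```

For SET Key0 Value0 this yields *3, then $3 SET, $4 Key0 and $6 Value0, each terminated by \r\n.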
To create and start a cluster in Redis, I use the create-cluster.sh script in
/redis-3.04/utils/create-cluster
With it I can create as many nodes as I want by changing the settings:
# Settings
PORT=30000
TIMEOUT=2000
NODES=10
REPLICAS=1
I wonder whether I can create, for example, 10 nodes (5 masters with 5 slaves) at the beginning but start only 4 masters and 4 slaves (meet and join).
Thanks in advance.
Yes. You can add more nodes later if the load on your existing cluster increases.
Basic steps:
1. Start the new Redis instances. Let's say you want to add 2 more masters and their slaves (4 Redis instances in total).
2. Then use the redis-trib utility to add a new master:
./redis-trib.rb add-node <new master node:port> <any existing master>
e.g. ./redis-trib.rb add-node 192.168.1.16:7000 192.168.1.15:7000
3. After this, the new node is assigned an ID. Note that ID and run the following command to add a slave to the node added in the previous step:
./redis-trib.rb add-node --slave --master-id <master-node-id> <new-node> <master-node>
e.g. ./redis-trib.rb add-node --slave --master-id 6f9db976c3792e06f9cd252aec7cf262037bea4a 192.168.1.17:7000 192.168.1.16:7000
where 6f9db976c3792e06f9cd252aec7cf262037bea4a is the ID of 192.168.1.16:7000.
4. Using similar steps, you can add 1 more master-slave pair.
5. Since these nodes do not hold any slots to serve, you have to move some of the slots from the existing masters to the new masters (resharding).
6. To do that, run the following resharding steps:
6.1 ./redis-trib.rb reshard <any-master-ip>:<master-port>
6.2 It will ask: How many slots do you want to move (from 1 to 16384)? Enter the number of slots you want to move.
6.3 Then it will ask: What is the receiving node ID?
6.4 Enter the ID of the node the slots should be moved to (one of the new masters).
6.5 It will prompt:
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: (enter a source node ID, or all)
6.6 It will then list the slots it is going to move, like this:
Moving slot 10960 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10961 from 37d10f18f349a6a5682c791bff90e0188ae35e49
Moving slot 10962 from 37d10f18f349a6a5682c791bff90e0188ae35e49
6.7 It will ask: Do you want to proceed with the proposed reshard plan (yes/no)? Type yes, press Enter, and you are done.
Note: If the data set is large, resharding might take some time.
A few useful commands:
To list all nodes in the cluster together with their node IDs:
redis-cli -h node-ip -p node-port cluster nodes
e.g. redis-cli -h 127.0.0.1 -p 7000 cluster nodes
To list all slots in the cluster:
redis-cli -h 127.0.0.1 -p 7000 cluster slots
Ref: https://redis.io/commands/cluster-nodes
Hope this helps.
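If you want to pick the node IDs out of the cluster nodes output with a script rather than by eye, a rough Python sketch could look like this. It relies on the documented field order of CLUSTER NODES (ID, address, flags, master ID, ping sent, pong received, config epoch, link state, then slots). The sample text is hypothetical: it reuses the IDs and addresses from the steps above, but the remaining fields are invented for illustration:

```python
def parse_cluster_nodes(text):
    """Parse the text output of 'redis-cli cluster nodes' into dicts."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip lines that do not look like node entries
        flags = fields[2].split(",")
        nodes.append({
            "id": fields[0],
            "addr": fields[1].split("@")[0],  # newer versions append @cluster-port
            "is_master": "master" in flags,
            "master_id": None if fields[3] == "-" else fields[3],
            "slots": fields[8:],              # empty for a freshly added master
        })
    return nodes


def masters_without_slots(nodes):
    """IDs of masters that still need slots moved to them (resharding)."""
    return [n["id"] for n in nodes if n["is_master"] and not n["slots"]]


# Hypothetical sample output, for illustration only.
SAMPLE = """\
37d10f18f349a6a5682c791bff90e0188ae35e49 192.168.1.15:7000 myself,master - 0 0 1 connected 0-16383
6f9db976c3792e06f9cd252aec7cf262037bea4a 192.168.1.16:7000 master - 0 1426238316232 2 connected
"""

print(masters_without_slots(parse_cluster_nodes(SAMPLE)))
```

A master that shows up in this list is a candidate for the receiving node ID in step 6.3.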
Can someone tell me how to use redis-benchmark to benchmark HMSET and HGETALL with a fixed data size (the -d option in redis-benchmark)? I am using Redis 3.2.5.
I have gone through this answer and tried the command below:
root@cache-server1:~# redis-benchmark -h a.b.c.d -p XXXX hmset hgetall myhash rand_int rand_string -d 2048
====== hmset hgetall myhash rand_int rand_string -d 2048 ======
10000 requests completed in 0.11 seconds
50 parallel clients
3 bytes payload
keep alive: 1
99.64% <= 1 milliseconds
100.00% <= 1 milliseconds
89285.71 requests per second
But looking at the output, it seems to be using a payload of only 3 bytes.
If this is not possible with redis-benchmark, can someone suggest an alternative?
The payload is only 3 bytes (the default) because the -d is taken as part of the command. The command must be the last argument, and all switches must precede it.
Besides that, you can't use redis-benchmark to run two custom commands. Also, the -d option is only applicable to predefined tests (the ones that run by default or with the -t option) and has no meaning if the user specifies the command used in the benchmark.
If you have a specific benchmarking flow that you want to test, the best thing you can do is mock it with any client that you're comfortable with.
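As a rough illustration of mocking the flow with a client, here is a Python sketch. It assumes a client object with hmset and hgetall methods, as the redis-py client provides (newer redis-py versions emit a deprecation warning for hmset). The function name run_benchmark and its parameters are made up for this example. Note that it issues requests sequentially from a single connection, so its numbers are not comparable to redis-benchmark with 50 parallel clients:

```python
import time


def run_benchmark(client, requests=10000, payload_size=2048, fields=1, key="myhash"):
    """Time HMSET and HGETALL on one hash with fixed-size field values."""
    payload = "x" * payload_size
    mapping = {"field%d" % i: payload for i in range(fields)}

    start = time.perf_counter()
    for _ in range(requests):
        client.hmset(key, mapping)       # same fields every time: hash size stays fixed
    hmset_seconds = max(time.perf_counter() - start, 1e-9)

    start = time.perf_counter()
    for _ in range(requests):
        client.hgetall(key)
    hgetall_seconds = max(time.perf_counter() - start, 1e-9)

    return {
        "hmset_per_second": requests / hmset_seconds,
        "hgetall_per_second": requests / hgetall_seconds,
    }
```

With redis-py installed, it could be called as run_benchmark(redis.Redis(host="a.b.c.d", port=XXXX), payload_size=2048), using the same host and port as in the question.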
I have some fairly simple Hadoop streaming jobs that look like this:
yarn jar /usr/lib/hadoop-mapreduce/hadoop-streaming-2.2.0.2.0.6.0-101.jar \
-files hdfs:///apps/local/count.pl \
-input /foo/data/bz2 \
-output /user/me/myoutput \
-mapper "cut -f4,8 -d," \
-reducer count.pl \
-combiner count.pl
The count.pl script is just a simple script that accumulates counts in a hash and prints them out at the end - the details are probably not relevant but I can post it if necessary.
The input is a directory containing 5 files encoded with bz2 compression, roughly the same size as each other, for a total of about 5GB (compressed).
When I look at the running job, it has 45 mappers, but they're all running on one node. The particular node changes from run to run, but always only one node. Therefore I'm achieving poor data locality as data is transferred over the network to this node, and probably achieving poor CPU usage too.
The entire cluster has 9 nodes, all the same basic configuration. The blocks of the data for all 5 files are spread out among the 9 nodes, as reported by the HDFS Name Node web UI.
I'm happy to share any requested info from my configuration, but this is a corporate cluster and I don't want to upload any full config files.
It looks like the previous thread "why map task always running on a single node" is relevant but not conclusive.
EDIT: at @jtravaglini's suggestion I tried the following variation and saw the same problem, with all 45 map tasks running on a single node:
yarn jar \
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.2.0.2.0.6.0-101.jar \
wordcount /foo/data/bz2 /user/me/myoutput
At the end of the output of that task in my shell, I see:
Launched map tasks=45
Launched reduce tasks=1
Data-local map tasks=18
Rack-local map tasks=27
which is the number of data-local tasks you'd expect to see on one node just by chance alone.
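A back-of-the-envelope check of that last claim, assuming the HDFS default replication factor of 3 (not confirmed for this cluster) and assuming that each map task reads one block whose replicas sit on 3 of the 9 nodes chosen uniformly at random:

```python
nodes = 9
replication = 3   # assumption: HDFS default replication factor
map_tasks = 45

# A given node holds a replica of a given block with probability replication / nodes,
# so the expected number of data-local tasks on any single node is:
expected_local = map_tasks * replication / nodes
print(expected_local)
```

That gives an expectation of 15 data-local tasks, which is in the same range as the 18 reported above.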