What does the "trib" in the redis-trib utility stand for?
I think the programmer was having a bit of fun with that name and meant it to mean Redistribute.
From the page https://redis.io/topics/cluster-tutorial
... the Redis Cluster command line utility called redis-trib, a Ruby program executing special commands on instances in order to create new clusters, check or reshard an existing cluster, and so forth.
Since clustering is essentially distributed computing, the program deals entirely with cluster creation and maintenance, and there is no formal explanation of the name, I believe "redistribute" is what the programmer was hinting at.
If I want to remap the process-to-core placement of an MPI program, can I migrate processes after they are spawned? For example: Node 1 has P0, P3, P6 and Node 2 has P1, P4, P7. Can I migrate P1 to Node 1? Research papers on topology-aware MPI suggest remapping, which hints at picking a process and placing it on whichever node gives the best result.
Is this possible to do?
No. MPI does not have any migration functionality. Topology-aware MPI (which, as you remark, is pretty much research-level, not production) uses knowledge of how the application communicates to map ranks to nodes. Normally ranks are placed on successive nodes; if you know which ranks communicate often, they can be mapped closer together.
To go off of what Victor said:
MPI libraries (MPICH, Open MPI, MVAPICH2, etc.) do allow you to place processes manually via a hostfile and/or mpirun flags. Profile your application with something like TAU and inspect its communication matrix (see tau.uoregon.edu for documentation) before choosing the "best" process mapping for your application.
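For instance, with Open MPI you might launch with something like mpirun -np 8 --hostfile hosts ... (flag names differ between implementations). To check where each rank actually landed after applying a mapping, a small sketch in Python using mpi4py (assuming mpi4py is installed) can print the rank-to-node placement:

    # verify_mapping.py -- run under mpirun, e.g.:
    #   mpirun -np 8 --hostfile hosts python verify_mapping.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    node = MPI.Get_processor_name()

    # Gather (rank, node) pairs on rank 0 and print the resulting placement.
    pairs = comm.gather((rank, node), root=0)
    if rank == 0:
        for r, n in sorted(pairs):
            print("rank %d -> %s" % (r, n))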
As a student learning operating systems, I was taught that there is a data structure in kernel space called the "process table", which maintains information about all processes.
Later, when I got to the topic of scheduling, I was told that every process entering the system is first put into a data structure called the "job queue", which also seems to hold general information about processes.
This got me thinking: is the "process table" the same thing as the "job queue"? Maybe this is a trivial question; I just want to make sure I understand things correctly. I know I might need to look into the Linux kernel source code to figure this out, but could anyone give me a quick insight?
Is Process Table the same thing as the Job Queue in UNIX?
No, they are not the same.
Read this textbook about operating systems, and the wiki page on processes.
The Linux OS is mostly open source. You can download (here) and study its source code, and look inside the source code of GNU bash or GNU make to understand better how syscalls(2) are used. You could also play with strace(1).
See also the kernelnewbies and OSDEV websites.
Of course, the kernel has a lot of data structures (and millions of lines of source code). Read more about the kernel scheduler.
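On Linux, for example, the kernel's process table is exposed to user space under /proc: each numeric directory corresponds to one live process. A quick, Linux-specific Python sketch to list those entries:

    import os

    # Each numeric directory under /proc corresponds to one entry in the kernel's
    # process table; /proc/<pid>/comm holds the name of the executable.
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open("/proc/%s/comm" % entry) as f:
                    print(entry, f.read().strip())
            except OSError:
                pass  # the process may have exited while we were iterating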
I'm using RabbitMQ as a message broker for the first time, and I have a question: when should queues and exchanges be declared using Rabbit's own management tool, and when should that be done in the software's code? My own opinion is that it is much better to create queues and exchanges using the management tool, because it is a centralized place to add new queues or remove useless ones without having to modify the actual software. I'm asking for advice and opinions.
Thank you.
The short answer is: whatever works best for you.
I've worked with message brokers that required external tools for defining the topology (exchanges, queues, bindings, etc) and with RabbitMQ that allows me to define them at runtime, as needed.
I don't think either scenario is "the right way". Rather, it depends entirely on your situation.
Personally, I see a lot of value in letting my software define the topology at runtime with RabbitMQ. But there are still times when it gets frustrating because I often end up duplicating my definitions between producers and consumers.
But then, moving from development to production is easier when the software itself defines the topology. No need to pre-configure things before moving code to production.
It's all tradeoffs.
Try it however you're comfortable. Then try it the other way. See what happens, and learn which you prefer and when. Just remember that you don't have to do one or the other. You can do both if you want.
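For example, with the Python pika client the topology can be declared at application startup by both producers and consumers. Declarations in RabbitMQ are idempotent, so re-declaring an existing exchange or queue with the same arguments is a no-op; the exchange and queue names below are just placeholders. A minimal sketch:

    import pika

    # Both producers and consumers can call this at startup.
    params = pika.ConnectionParameters(host="localhost")
    connection = pika.BlockingConnection(params)
    channel = connection.channel()

    # Idempotent declarations: safe to repeat with the same arguments.
    channel.exchange_declare(exchange="orders", exchange_type="topic", durable=True)
    channel.queue_declare(queue="order-processing", durable=True)
    channel.queue_bind(queue="order-processing", exchange="orders", routing_key="order.*")

    connection.close()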
I want to make an application that executes a remote script. The user can create a script (probably a Lua script) and store it on the server. Then he can use an API to execute the script. I was thinking that the API could be a web service.
So my questions are:
I need high performance to execute the script, so my first choice was a Lua script. Does anyone have another suggestion?
Because I need high performance, I am wondering whether a web service is the best solution. Maybe I could create a TCP/IP Windows service that handles the users' requests. It is important to say that I will have many users executing scripts at the same time, so I will have a concurrency problem.
My scripts will query a database. I will use Tokyo Cabinet or Tokyo Tyrant. I think Tokyo Tyrant is the only workable solution because I will have many requests. For performance, do I need connection pooling? Is there any way to share variables between web service requests?
To build the web service or the Windows service I was thinking of using C++.
Can someone help with these questions?
Thanks.
Lua is pretty high performance for a scripting language, especially if you use LuaJIT or something similar.
You speak of high performance, but how much are we talking about? Say you have a very simple web service that executes scripts it receives via POST; the HTTP overhead is then probably small compared to the Lua compile, environment setup and execution time.
About the database I cannot tell you anything. There are many ways to do pooling, and it also depends on how you execute the Lua scripts. Are they running in a common environment? One per session? One per request?
C++ surely is a good choice to host Lua, because Lua fits in pretty well. Though there are other good language bindings as well.
But keep in mind that your job is not over just by sandboxing the scripts. User-submitted scripts can do a lot of other Bad Things(TM), intentionally or by mistake, such as allocating a lot of memory or hogging the CPU. In Lua (and I think this is true of many, if not all, sandboxed environments) you cannot do much about this, except killing the offending instance or, if you disallowed coroutines in your sandbox, yielding out of the offending coroutine and doing something smarter.
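One blunt but portable version of "killing the offending instance" is to run each user script in its own OS process with hard resource caps. A minimal Python sketch of that idea (it assumes a lua interpreter on PATH and POSIX resource limits; the numbers are arbitrary):

    import resource
    import subprocess

    def run_untrusted(script_path, cpu_seconds=2, mem_bytes=64 * 1024 * 1024):
        # Applied in the child just before exec: hard caps on CPU time and memory.
        def limit():
            resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
            resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

        try:
            return subprocess.run(
                ["lua", script_path],      # assumes a `lua` interpreter on PATH
                preexec_fn=limit,          # POSIX only
                capture_output=True,
                timeout=cpu_seconds + 1,   # wall-clock backstop
            )
        except subprocess.TimeoutExpired:
            return None                    # the offending instance was killed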
Almost every application out there performs i/o operations, either with the disk or over the network.
My applications work fine in the development environment, but I want to be sure they will still work when the Internet connection is slow or unstable, or when the user attempts to read data from a badly written CD.
What tools would you recommend to simulate:
slow i/o (opening files, closing files, reading and writing, enumeration of directory items)
occasional i/o errors
occasional 'access denied' responses
packet loss in tcp/ip
etc...
EDIT:
Windows:
The closest solution for doing the job as described seems to be Holodeck, a commercial product (>$900).
Linux:
No open-source solution has been found so far, but the same effect can be achieved as described by smcameron and krosenvold.
The decorator pattern is a good idea.
It would require wrapping my i/o classes, but it would result in a testing framework.
The only remaining untested code would be in third-party libraries.
Still, I decided not to go this way, but to leave my code as it is and simulate i/o errors from the outside.
I now know that what I need is called 'fault injection'.
I thought it was a common, well-supported practice with plenty of existing solutions I just didn't know about.
(By the way, another similar good idea is 'fuzz testing' - thanks to Lennart.)
To my mind, the problem is still not worth $900.
I'm going to implement my own open-source tool based on hooks (targeting win32).
I'll update this post when I'm done with it. Come back in 3 or 4 weeks or so...
What you need is a fault injection testing system. James Whittaker's 'How to Break Software' is a good read on this subject and includes a CD with many of the tools needed.
If you're on Linux you can do tons of magic with iptables:
iptables -I OUTPUT -p tcp --dport 7991 -j DROP
That rule silently drops all outgoing TCP traffic to port 7991; deleting it again (iptables -D with the same arguments) brings the "connection" back, so you can simulate connections going up and down as well. There are lots of tutorials out there.
Check out "Fuzz testing": http://en.wikipedia.org/wiki/Fuzzing
At a programming level, many frameworks will let you wrap the I/O stream classes and delegate calls to the wrapped instance. I'd do this and add a couple of wait calls in the key methods (writing bytes, closing the stream, throwing I/O exceptions, etc.). You could write a few of these with different failure or issue types and use the decorator pattern to combine them as needed, as in the sketch below.
This should give you quite a lot of flexibility in tweaking which operations get slowed down, inserting "random" errors every so often, and so on.
The other advantage is that you could develop it in the same codebase as your software, so maintenance wouldn't require any new skills.
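A minimal Python sketch of such a wrapper (the delay and error-rate defaults are arbitrary):

    import random
    import time

    class FlakyFile:
        """Wraps a file-like object, delegating calls while injecting delays and errors."""

        def __init__(self, inner, delay=0.2, error_rate=0.05):
            self._inner = inner
            self._delay = delay
            self._error_rate = error_rate

        def read(self, size=-1):
            time.sleep(self._delay)                     # simulate a slow device
            if random.random() < self._error_rate:
                raise IOError("injected read failure")  # simulate an occasional i/o error
            return self._inner.read(size)

        def write(self, data):
            time.sleep(self._delay)
            if random.random() < self._error_rate:
                raise IOError("injected write failure")
            return self._inner.write(data)

        def __getattr__(self, name):
            # Everything else (close, seek, flush, ...) passes straight through.
            return getattr(self._inner, name)

    # usage: f = FlakyFile(open("data.bin", "rb"), delay=0.5, error_rate=0.1)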
You don't say which OS, but if it's Linux or Unix-ish, you can wrap open(), read(), write(), or any other library or system call with an LD_PRELOAD-able library to inject faults.
Along these lines:
http://scaryreasoner.wordpress.com/2007/11/17/using-ld_preload-libraries-and-glibc-backtrace-function-for-debugging/
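The LD_PRELOAD shim itself has to be native code; purely as an in-process analogue of the same idea (intercepting calls without touching the call sites), here is a Python sketch that monkeypatches the built-in open(). The 10% failure rate is an arbitrary illustration:

    import builtins
    import random

    _real_open = builtins.open

    def _faulty_open(file, *args, **kwargs):
        # Fail a fraction of open() calls to exercise the callers' error handling.
        if random.random() < 0.1:
            raise OSError("injected open() failure")
        return _real_open(file, *args, **kwargs)

    builtins.open = _faulty_open   # every later open() in this process is intercepted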
I didn't end up writing my own file-system filter, as I initially thought I would, because there is a simpler solution.
1. Network i/o
I've found at least 2 ways to simulate i/o errors here.
a) Running a virtual machine (such as VMware) lets you configure bandwidth and packet-loss rate. VMware supports on-machine debugging.
b) Running a proxy on the local machine and tunneling all the traffic through it. For UDP/TCP communications a proxifier (e.g. WideCap) can be used.
2. File i/o
I managed to reduce this scenario to the previous one by mapping a drive letter to a network share that resides inside the virtual machine. The file i/o will then be slow.
A cheaper alternative exists: set up a local FTP server (e.g. FileZilla), configure its speed limits, and use Novell's NetDrive to access it.
You'll want to set up a test lab for this. What type of application are you building, anyway? Are you really expecting the application to be fed corrupt data?
A test technique I know the Microsoft Exchange Server people tried was sending noise to the server: basically feeding every possible input with seemingly random data. They managed to crash the server quite often this way.
But still, if you can't trust input that hasn't been signed, then the general rules apply. Track every operation that could potentially be untrusted (the result of corrupt data) and you should be able to handle most problems gracefully.
Just test your application's behavior on random input; that should catch most problems, but you'll never be able to fully protect yourself from corrupt data. That's just not possible, as the data could be part of some internal buffer being handed off within the application itself.
Be mindful of when and how you decode data. That is all.
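If you want to try the random-input approach described above, a minimal fuzzing loop is easy to write. A Python sketch (parse / my_decoder stand in for whatever routine decodes your untrusted data):

    import os
    import random

    def fuzz(parse, rounds=1000, max_len=4096):
        """Feed random byte blobs to `parse` and report inputs that make it blow up."""
        for _ in range(rounds):
            blob = os.urandom(random.randint(0, max_len))
            try:
                parse(blob)
            except Exception as exc:
                print("crash on %d-byte input: %r" % (len(blob), exc))

    # usage: fuzz(my_decoder)   # `my_decoder` is whatever routine parses untrusted data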
The first thing you'll need to do is define what "correct" means under these circumstances. You can only test against a definition of what behaviour is intended.
The tactics of testing will depend on the technology. In the context of automated unit testing, I have found it very useful, in OO languages such as Java, to use various flavors of "mocking" or "stubbing" to pass, for example, misbehaving InputStreams to the parts of my code that use file I/O.
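The answer above refers to Java InputStreams; as an illustration of the same idea with a mocking library, here is a Python sketch using unittest.mock (the "partial data, then error" behaviour is just an example):

    import io
    from unittest import mock

    # A stream whose second read() fails, for exercising error-handling paths.
    broken = mock.Mock(spec=io.RawIOBase)
    broken.read.side_effect = [b"partial data", IOError("simulated disk error")]

    # Pass `broken` wherever your code expects a readable stream and assert that it
    # copes (retries, reports the error, cleans up, ...).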
Consider Holodeck for some of the fault injection. If you have access to spare hardware, you can simulate network impairment using netem, or a commercial product based on it, the Mini-Maxwell, which is much more expensive than free but possibly easier to use.