Can we implement mid-scale broadcasting (maybe up to a few hundred viewers) over WebRTC using a star topology?
Here one peer would take the role of a streaming server (and could even be placed in a server-like setup where more bandwidth is available).
Wouldn't this kind of setup scale reasonably well (since the central peer can take advantage of server infrastructure if needed), say for 100-200 users or even more?
Can we consider it a viable alternative to a dedicated MCU solution? Or, if you know its limitations, can you point them out?
Can somebody point me to any implementation code for this?
That would be quite difficult. I think that in theory you could pass a stream from one peer connection to multiple other ones (see the sketch below). That said, even if it works, I really doubt it will work at any reasonable scale.
The best way is to use an MCU. You'll get a real star topology.
I can only suggest using our services - AddLive - http://www.addlive.com (usual disclaimer - I work there).
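For what it's worth, the "pass a stream to multiple other peer connections" idea amounts to running a small media relay on the central peer. Below is a minimal sketch of that using the Python aiortc library (my choice for illustration; it is not mentioned in the question or the answers), with all signaling assumed to happen elsewhere:

```python
# Sketch of the one-to-many relay idea: the central peer terminates the
# broadcaster's PeerConnection and hands relayed copies of its tracks to
# every viewer's PeerConnection. Signaling (offer/answer exchange over a
# WebSocket or similar) is assumed to exist elsewhere and is not shown.
from aiortc import RTCPeerConnection
from aiortc.contrib.media import MediaRelay

relay = MediaRelay()       # fans one incoming track out to many consumers
viewer_pcs = set()         # one RTCPeerConnection per connected viewer
broadcast_tracks = []      # tracks received from the broadcasting peer

def handle_broadcaster():
    """PeerConnection that terminates the broadcaster's stream."""
    pc = RTCPeerConnection()

    @pc.on("track")
    def on_track(track):
        broadcast_tracks.append(track)
        # Attach the new track to viewers that are already connected.
        for viewer in viewer_pcs:
            viewer.addTrack(relay.subscribe(track))

    return pc   # the caller completes the offer/answer exchange

def handle_viewer():
    """PeerConnection for a new viewer, fed with relayed track copies."""
    pc = RTCPeerConnection()
    viewer_pcs.add(pc)
    for track in broadcast_tracks:
        pc.addTrack(relay.subscribe(track))
    return pc   # the caller completes the offer/answer exchange
```

Even with a relay like this, the central peer still encrypts and sends a separate copy of the media to every viewer (and has to renegotiate with viewers that connected before the broadcast started), so its uplink bandwidth and CPU remain the bottleneck; that is essentially why the answer above points at an MCU for anything large.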
In the talk "Beyond DevOps: How Netflix Bridges the Gap," around 29:10 Josh Evans mentions squeeze testing as something that can help them understand system drift. What is squeeze testing and how is it implemented?
It seems to be a term used by the folks at Netflix.
It means running tests/benchmarks to observe changes in performance and to find the breaking point of the application, then checking whether the last change introduced an inefficiency, or determining the recommended auto-scaling parameters before deploying it.
There is a little more information here and here:
One practice which isn’t yet widely adopted but is used consistently by our edge teams (who push most frequently) is automated squeeze testing. Once the canary has passed the functional and ACA analysis phases the production traffic is differentially steered at an increased rate against the canary, increasing in well-defined steps. As the request rate goes up key metrics are evaluated to determine effective carrying capacity; automatically determining if that capacity has decreased as part of the push.
As someone who helped with the development of squeeze testing at Netflix: it uses the large volume of stateless requests from actual production traffic to test the system. One approach is to put inordinately more load on one instance of a service until it breaks, monitor the key performance metrics of that instance, and use that information to inform how to configure auto-scaling policies. It eliminates the problem of fake traffic not stressing the system in the right way.
The reasons it might not work for everyone:
you need more traffic than any one instance can handle.
the requests need to be somewhat uniform in demand on the service under test.
clients & protocol need to be resilient to errors should things go wrong.
The way it is set up: a proxy is put in front of the service and configured to send a specific RPS to one instance. I used Bresenham's line algorithm to evenly spread the fluctuation in incoming traffic over time into a precise outgoing RPS. Turn up the dial on the RPS, watch it burn.
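The pacing trick can be illustrated with a small sketch: spread a target RPS evenly across fixed ticks by carrying an error term, the same way Bresenham's line algorithm decides when to take a step. This is a hypothetical illustration of the idea, not the actual Netflix proxy code:

```python
# Hypothetical sketch of Bresenham-style pacing: spread a target
# requests-per-second evenly over fixed scheduling ticks by carrying a
# fractional error term. Names and tick size are made up.

def paced_request_counts(target_rps, tick_seconds=0.05):
    """Yield how many requests to forward on each tick so the long-run
    rate converges exactly on target_rps without bursts."""
    per_tick = target_rps * tick_seconds   # usually fractional
    error = 0.0                            # accumulated "requests owed"
    while True:
        error += per_tick
        send_now = int(error)              # whole requests owed so far
        error -= send_now
        yield send_now

if __name__ == "__main__":
    # 130 RPS over 50 ms ticks: alternates 6 and 7 per tick, 130 per second.
    counts = paced_request_counts(target_rps=130, tick_seconds=0.05)
    print(sum(next(counts) for _ in range(20)))   # 20 ticks = 1 s -> 130
```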
I have a small cluster of servers I need to keep in sync. My initial thought on this was to have one server be the "master" and publish updates using redis's pub/sub functionality (since we are already using redis for storage) and letting the other servers in the cluster, the slaves, poll for updates in a long running task. This seemed to be a simple method to keep everything in sync, but then I thought of the obvious issue: What if my "master" goes down? That is where I started looking into techniques to make sure there is always a master, which led me to reading about ideas like leader election. Finally, I stumbled upon Apache Zookeeper (through python binding, "pettingzoo"), which apparently takes care of a lot of the fault tolerance logic for you. I may be able to write my own leader selection code, but I figure it wouldn't be close to as good as something that has been proven and tested, like Zookeeper.
My main issue with using zookeeper is that it is just another component that I may be adding to my setup unnecessarily when I could get by with something simpler. Has anyone ever used redis in this way? Or is there any other simple method I can use to get the type of functionality I am trying to achieve?
More info about pettingzoo (slideshare)
I'm afraid there is no simple method to achieve high availability. It is usually tricky to set up and tricky to test. There are multiple ways to achieve HA, which fall into two categories: physical clustering and logical clustering.
Physical clustering is about using hardware, network, and OS level mechanisms to achieve HA. On Linux, you can have a look at Pacemaker which is a full-fledged open-source solution coming with all enterprise distributions. If you want to directly embed clustering capabilities in your application (in C), you may want to check the Corosync cluster engine (also used by Pacemaker). If you plan to use commercial software, Veritas Cluster Server is a well established (but expensive) cross-platform HA solution.
Logical clustering is about using fancy distributed algorithms (like leader election, PAXOS, etc ...) to achieve HA without relying on specific low level mechanisms. This is what things like Zookeeper provide.
Zookeeper is a consistent, ordered, hierarchical store built on top of the ZAB protocol (quite similar to PAXOS). It is quite robust and can be used to implement some HA facilities, but it is not trivial, and you need to install the JVM on all nodes. For good examples, you may have a look at some recipes and the excellent Curator library from Netflix. These days, Zookeeper is used well beyond pure Hadoop contexts, and IMO it is the best solution for building an HA logical infrastructure.
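To make the "recipes" point concrete, here is a minimal leader-election sketch in Python using the kazoo client (my substitution for the Java Curator library mentioned above); the ensemble address, path, and identifier are placeholders:

```python
# Minimal leader-election sketch with the kazoo Python client.
# The ensemble address, election path and identifier are placeholders.
import time

from kazoo.client import KazooClient

def lead():
    # Only the elected leader runs this: publish updates, coordinate the
    # cluster, etc. Returning (or crashing) gives up leadership and lets
    # another node's election.run() take over.
    while True:
        print("I am the master, publishing updates...")
        time.sleep(5)

zk = KazooClient(hosts="zk1:2181,zk2:2181,zk3:2181")
zk.start()

# Every node runs the same code; run() blocks until this node wins.
election = zk.Election("/myapp/election", identifier="server-1")
election.run(lead)
```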
The Redis pub/sub mechanism is not reliable enough to implement a logical cluster, because unread messages are simply lost (there is no queuing of items with pub/sub). To achieve HA of a collection of Redis instances, you can try Redis Sentinel, but it does not extend to your own software.
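If you would rather stay Redis-only and can accept much weaker guarantees, a common workaround is an expiring key used as a leadership lease instead of pub/sub. A hedged sketch with the redis-py client (my choice; not something this answer recommends over Zookeeper):

```python
# Poor-man's leadership lease on a single Redis instance using SET NX EX.
# Much weaker than Zookeeper: Redis itself is a single point of failure,
# and the refresh below has a check-then-set race (a Lua script would be
# needed to make it atomic). Sketch only; key name and TTL are made up.
import socket
import time

import redis

r = redis.Redis(host="localhost", port=6379)
NODE_ID = socket.gethostname()
LEASE_KEY = "cluster:leader"
LEASE_TTL = 10  # seconds

def i_am_leader():
    # Succeeds only if nobody currently holds the lease.
    if r.set(LEASE_KEY, NODE_ID, nx=True, ex=LEASE_TTL):
        return True
    # Refresh the TTL if this node already holds the lease.
    if r.get(LEASE_KEY) == NODE_ID.encode():
        r.expire(LEASE_KEY, LEASE_TTL)
        return True
    return False

while True:
    if i_am_leader():
        print("acting as master: publish updates here")
    else:
        print("follower: apply updates / wait")
    time.sleep(LEASE_TTL / 2)
```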
If you are ready to program in C, an HA framework which is often forgotten (but can be quite useful IMO) is the one coming with BerkeleyDB. It is quite basic but supports off-the-shelf leader election, and it can be integrated into any environment. Documentation can be found here and here. Note: you do not have to store your data with BerkeleyDB to benefit from the HA mechanism (only the topology data - the same data you would put in Zookeeper).
What would be the best existing Linux device driver to use for a generic device that requires 2-way communication (custom protocol)? Preferably bulk transfer, as fairly large blocks will need to be transmitted.
I have considered using mass storage, but I'm unsure whether it requires file system handling.
I have also considered modem, but I can't seem to find much information on it (most people who ask are just told "this is not how you connect to the Internet"; since I'm not going to connect to any "internets", this is rather unhelpful to me). If anyone can point me to more detailed information on this one, preferably with C or C++ examples, I'd be grateful.
Linux also seems to have a generic serial communications driver, though it does not seem to support bulk transfer. I am also unsure whether it provides the speed of the other drivers, since it is apparently targeted at USB-to-serial converters.
Bulk transfer is the correct choice for large transfers "as fast as the device/PC can handle it".
Well, you could get away with just the CDC ACM profile, but this has some issues: you need the user to select the correct serial device (/dev/ttyACMx).
If your device only needs to talk to your own application, I recommend using libusb. This way you don't need a kernel driver, and you can talk to the individual bulk endpoints of your device.
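The question asks for C or C++ examples; the corresponding libusb calls there would be libusb_open_device_with_vid_pid() and libusb_bulk_transfer(). To keep the sketch short, here is the same idea through PyUSB, the Python binding over libusb, with placeholder IDs and endpoint addresses:

```python
# Bulk-transfer sketch with PyUSB (a Python binding over libusb).
# 0x1234/0x5678 and endpoints 0x01/0x81 are placeholders: substitute the
# vendor/product IDs and endpoint addresses from your device's descriptors.
import usb.core

dev = usb.core.find(idVendor=0x1234, idProduct=0x5678)
if dev is None:
    raise RuntimeError("device not found")

dev.set_configuration()                  # use the first configuration

# Bulk OUT endpoint 0x01: host -> device.
payload = bytes(64 * 1024)               # 64 KiB of zeros as a stand-in
written = dev.write(0x01, payload, timeout=5000)
print(f"wrote {written} bytes")

# Bulk IN endpoint 0x81: device -> host.
data = dev.read(0x81, 64 * 1024, timeout=5000)
print(f"read {len(data)} bytes")
```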
How exactly does a Unison star topology work?
I sort of understand the concept that one machine acts as the hub and every spoke syncs to it, but is this just a concept I have to implement on my own, or is it some kind of feature built into Unison?
If I have to script this myself, how exactly would I do it? What are the sync steps?
Unison is a bidirectional syncing system that you can use any way you want. To avoid synchronization conflicts, however, a star topology is often preferred, but there is nothing that forces you to do it that way, nor is there any node that has to be designated as the 'hub' or that needs a special implementation. As far as the protocol is concerned, all nodes are peers (unless you run in socket mode, which is insecure and only intended for specific needs).
I use Unison in a star topology, and I don't need any special scripting. Mostly I initiate the sync from the clients, but nothing prevents me from initiating it from the server, or from syncing two clients directly when the server is down. That latter "unstructured" approach has a higher risk of becoming unmanageable, though, especially if you have a lot of clients.
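To make "initiate the sync from the clients" concrete: each spoke simply runs an ordinary two-root sync against the hub, and nothing hub-specific is required on the server. A hypothetical wrapper (host name and paths are made up) might look like this:

```python
# Hypothetical spoke-side wrapper: each client runs an ordinary two-root
# sync against the hub over ssh. Host name and paths are placeholders;
# the hub needs nothing special beyond sshd and the unison binary.
import subprocess

HUB = "hub.example.com"           # placeholder hub host
LOCAL_ROOT = "/home/user/data"    # placeholder local replica
REMOTE_ROOT = "/srv/data"         # placeholder absolute path on the hub

def sync_with_hub():
    # -batch runs non-interactively; conflicts are skipped and reported
    # instead of prompting. REMOTE_ROOT starting with "/" yields
    # ssh://host//srv/data, unison's syntax for an absolute remote path.
    subprocess.run(
        ["unison", LOCAL_ROOT, f"ssh://{HUB}/{REMOTE_ROOT}", "-batch"],
        check=True,
    )

if __name__ == "__main__":
    sync_with_hub()
```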
Does any other provider offer a cloud computing + storage layer like S3/EC2, with free data transfer between the two layers?
I have looked at:
Softlayer CloudLayer Storage -- no free transfer between the cloudlayer storage and cloudlayer computing instances.
Rackspace CloudFiles - Quite a bit of marketing mumbo-jumbo, and something about Cloud Connect, I gave up on the site once the Live Chat CSS Popup started following me around.
Does anyone know of any others?
I'm looking to store some large (non-random-access) files on a storage solution for constant re-processing, and to process them nearby, without paying transfer costs daily (looking to store in the 500-2000 GB range, re-processing it all daily).
Re-processing requires a (Linux) server with a "decent" (weasel word alert) configuration.
Thanks!
'Cloud computing' is a bit of a myth.
They're all just, essentially, virtual private servers. 'Cloud' instances tend to have the flexibility to be billed by the hour rather than monthly, but they're still just a VPS.
Persistent storage is a useful feature from a very limited number of VPS providers, but one that can easily be emulated by having two or more VPSs in the same data centre (Linode is an excellent VPS provider with free local data transfer; sadly they're rather limited by capacity). I don't know of any other VPS/cloud providers who offer their own persistent storage solution.
It is something you can easily achieve yourself. VPS servers tend to be a little restrictive on hard drive capacity if you're looking for 500-2000 GB. Perhaps you could consider a dedicated server and handle storage and processing on the same machine; you can't get data more local than the same machine!
First, the short version: stop looking for “free”.
Now, in more detail: you're looking to consume some somewhat-non-trivial computing, data storage and networking resources. Presumably you've got a good reason for doing this; if you truly have, you'll have the ability to also purchase the resources required for what you want to do. There are a few options on this front, none of which are free:
Buy and host your own hardware.
Buy the hardware and host it in a colocation facility.
Hire the hardware:
  Long-term hire
  Short-term hire
All Amazon are doing is short-term, easy-to-set-up hiring of resources. Their prices are quite keen (if some other option is cheaper, it's because it is missing something significant that Amazon do; maybe it's something you don't need, but that's up to you to figure out). You can host the core of the Amazon API quite easily on whatever resources you've hired (see Eucalyptus), but be aware that going from having the software and the API to having everything work smoothly is a really big step; the more I work with Eucalyptus installations, the more impressed with Amazon I become. And that's despite being also pretty impressed with Eucalyptus itself.
But none of this is free. It takes real resources to provide – e.g., electricity to power the machines and keep them cool and a building to house them in – and ultimately, that's got to be paid for somewhere. To expect otherwise is to believe that others should have to pay for things for you; it's pretty rare that that happens, and the more you need to consume, the rarer it is (especially if the economy isn't doing too good). So stop thinking in terms of how you can get it for nothing (“freeload”) and instead take a good look at what it really costs to provide through various routes and seek to minimize your costs. If you can't afford even that, your #1 problem isn't hosting but funding; fix that first.
Rest assured you're not alone in this matter. This is what lots of other people worldwide have to do to make their projects into reality. Good luck!
GoGrid has external storage with free transfer and access over typical protocols like SMB, NFS, rsync, and FTP. The first two allow mounting it as a normal drive.
Note also that many providers will let you create cloud servers with 2 TB of instance storage. I'm certainly not able to name all of them, but you can find some with cloudorado.com.