How to collect network statistics using floodlight? - sdn

Using the Floodlight controller in Mininet, how can I collect network statistics?
The documentation only states how to collect bandwidth data: https://floodlight.atlassian.net/wiki/spaces/floodlightcontroller/pages/1343539/Floodlight+REST+API
Is there an API already available in Floodlight for collecting packet statistics, flow statistics, and other network statistics?
Thank you for your help.
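For what it's worth, the REST API page linked above also lists a per-switch statistics endpoint of the form /wm/core/switch/<switchId>/<statType>/json, where the switch ID can also be "all". Below is a minimal Python sketch of how it could be queried. It assumes the controller listens on localhost:8080 and that the stat types in STAT_TYPES match your Floodlight version. The helper names stats_url and fetch_stats are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: stat types as listed on the Floodlight REST API page;
# verify them against the version you are running.
STAT_TYPES = {"port", "queue", "flow", "aggregate", "desc", "table", "features"}


def stats_url(stat_type, switch_id="all", base="http://localhost:8080"):
    """Build the URL for one statistics type (switch_id is a DPID or 'all')."""
    if stat_type not in STAT_TYPES:
        raise ValueError(f"unknown stat type: {stat_type!r}")
    return f"{base}/wm/core/switch/{switch_id}/{stat_type}/json"


def fetch_stats(stat_type, switch_id="all", base="http://localhost:8080"):
    """GET the statistics and decode the JSON body (needs a running controller)."""
    with urlopen(stats_url(stat_type, switch_id, base), timeout=5) as resp:
        return json.load(resp)
```

With the controller running, fetch_stats("flow") or fetch_stats("port") could be polled from a script next to Mininet. The shape of the returned JSON depends on the Floodlight version, so inspect it before parsing.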

Related

CoTURN Usage Statistics

I am still a bit new to the WebRTC world and trying to find my way through. I have successfully set up CoTURN and been able to route calls behind a firewall with it. Now I am wondering whether it is possible to inspect, and possibly visualize, CoTURN usage statistics. I would love to know how many users are using the server at any given time, how much bandwidth and CPU it consumes, and so on. I saw details on how to optimize bandwidth and CPU usage in the official docs, but I haven't found any info on actually monitoring the usage. Any help would be highly appreciated.
If you want to monitor standard usage statistics like CPU usage, load, bandwidth, etc., you can focus on what's available for your infrastructure. For example in AWS you could have CloudWatch, or in generic Linux deployments export the usage stats with Prometheus and have them presented with Grafana.
For coturn/TURN-specific statistics, coturn can store some metrics in Redis, as described in https://github.com/coturn/coturn/blob/master/turndb/schema.stats.redis
Total traffic information is also reported when the allocation is deleted. The keys are
"turn/user/<username>/allocation/<id>/total_traffic" or "turn/user/<username>/allocation/<id>/total_traffic/peer".
Applications interested in the total amount of traffic per allocation can subscribe to these events as:
psubscribe turn/realm/*/user/*/allocation/*/total_traffic
psubscribe turn/realm/*/user/*/allocation/*/total_traffic/peer
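As a rough illustration, the two subscriptions above could be consumed from Python with the redis-py client. This is only a sketch: it assumes Redis is reachable on localhost:6379 and that channel names follow the turn/realm/... layout of the psubscribe patterns. It makes no assumption about the message payload and just prints it. The function names are hypothetical.

```python
import re

# Channel layout taken from the psubscribe patterns quoted above.
CHANNEL_RE = re.compile(
    r"^turn/realm/(?P<realm>[^/]+)/user/(?P<user>[^/]+)"
    r"/allocation/(?P<allocation>[^/]+)/total_traffic(?P<peer>/peer)?$"
)


def parse_channel(channel):
    """Split a total_traffic channel name into its parts, or return None."""
    if isinstance(channel, bytes):
        channel = channel.decode()
    m = CHANNEL_RE.match(channel)
    if m is None:
        return None
    return {
        "realm": m["realm"],
        "user": m["user"],
        "allocation": m["allocation"],
        "peer": m["peer"] is not None,
    }


def watch_total_traffic(host="localhost", port=6379):
    """Print every total_traffic event as it arrives; blocks forever."""
    import redis  # third-party client: pip install redis

    pubsub = redis.Redis(host=host, port=port).pubsub()
    pubsub.psubscribe(
        "turn/realm/*/user/*/allocation/*/total_traffic",
        "turn/realm/*/user/*/allocation/*/total_traffic/peer",
    )
    for message in pubsub.listen():
        if message["type"] != "pmessage":
            continue  # skip the subscribe confirmations
        info = parse_channel(message["channel"])
        if info is not None:
            print(info, message["data"])
```

From there the parsed events could be pushed into whatever you already use for dashboards, for example a Prometheus exporter feeding Grafana.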

What exactly is Gemfire?

I have been studying 'in-memory data grids' and saw the term 'gemfire'. I'm confused. It seems that gemfire is a term to refer to technologies that store and manipulate data like a database but in the computer memory, isn't it? What exactly is gemfire?
Which technologies can I use to work with 'in-memory data grids' in Node.js?
I saw some applications, like 'Apache Geode' and 'Pivotal GemFire'. How do I work with them? Is it like working with cache technologies such as Redis or Memcached? In Geode's case, is the data only accessible through an API, or are there other ways to access it?
Many products qualify as an "in-memory data grid"; GemFire is one of the leading ones. According to this article, the main ones are:
VMware GemFire (Java)
Oracle Coherence (Java)
Alachisoft NCache (.Net)
Gigaspaces XAP Elastic Caching Edition (Java)
Hazelcast (Java)
Scaleout StateServer (.Net)
Most of these products have drivers in many languages. You can access data in GemFire over REST or through the native Node.js client.
Apache Geode is the open source version of GemFire. It does considerably more than memcached or Redis: you can use Geode not only as a cache but also as a store of record (it has native persistence). It has a built-in Object Query Language (OQL) engine that lets you query nested objects, and it offers features such as Continuous Queries and replication over WAN, among others. Geode also has protocol adapters for memcached and Redis, allowing your memcached and Redis clients to connect to Geode.
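To give an idea of what REST access can look like, here is a small Python sketch. It assumes the developer REST API is enabled on a Geode server and served under /geode/v1, as described in the Apache Geode documentation. The host, port, region name and helper names are placeholders, so check the paths against the version you run.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: developer REST API enabled; host and port are placeholders.
BASE = "http://localhost:8080/geode/v1"


def entry_url(region, key, base=BASE):
    """URL of a single entry in a region."""
    return f"{base}/{quote(region, safe='')}/{quote(str(key), safe='')}"


def adhoc_query_url(oql, base=BASE):
    """URL that runs an ad hoc OQL query passed in the q parameter."""
    return f"{base}/queries/adhoc?{urlencode({'q': oql})}"


def get_json(url):
    """GET a URL and decode the JSON body (needs a running server)."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)
```

For example, get_json(entry_url("customers", 42)) would read one entry, and get_json(adhoc_query_url("SELECT * FROM /customers")) would run an OQL query, where "customers" is a made-up region.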
I would add to the list of "In memory data grid" solutions:
Apache Ignite
Infinispan
They also provide powerful features.
For feature comparison you can use this website: https://db-engines.com/en/system/Hazelcast%3BIgnite .
Last note: GemFire is now a Pivotal solution.
GemFire is a high-performance distributed data management infrastructure that sits between the application cluster and back-end data sources.
With GemFire, data can be managed in memory, which makes access faster.
See the link below for further details:
https://www.baeldung.com/spring-data-gemfire

Google Compute Engine Load Balancer limits

I'm thinking of using Google Compute Engine to run a LOT of instances in a target pool behind a network load balancer. Each of those instances will end up processing many large data streams in real time, so at full scale and peak times multiple terabytes per second might go through.
Question:
Is there a quota or limit on the data you can push through those load balancers? Is there a limit on the number of instances you can have in a target pool? (The documentation does not seem to specify this.)
It seems like load balancers have a dedicated IP. Does that mean it's a single machine?
There's no limit on the amount of data that you can push through an LB. As for instances, there are default limits on CPUs and on persistent or SSD disks; you can see those quotas in the Developers Console under 'Compute' > 'Compute Engine' > 'Quotas', and you can always request a quota increase at this link. You can have as many instances as you need in a target pool. Take a look at the Compute Engine Autoscaler, which will help you spin up machines as your service needs them. The single IP provided for your LB is in charge of distributing incoming traffic across your multiple instances.

VisualVM collect performance data over a period of time

For Java, I am using VisualVM to monitor CPU, memory, and thread info. Is there a way to have VisualVM collect this information over a range of time so that I can present it in a graph?
In VisualVM, under the Monitor tab, I can see the CPU, Classes, Heap, and Threads graphs. I would like to collect this data over a period of time while I run my load test, and later present it in a graph for analysis.
If VisualVM is not the tool please suggest alternate option.
Thanks
You can use the Tracer plugin for monitoring. Select the probes that suit your needs, and you should be able to export the monitored data, which you can then use to build the graph of your choice.

Is there a way to leverage Hadoop tools to manage parallel REST API calls to external sources?

I am writing software that creates a large graph database. The software needs to access dozens of different REST APIs with millions of total requests. The data will then be processed by the Hadoop cluster. Each of these APIs has rate limits that vary by requests/second, per window, per day and per user (typically via OAuth).
Does anyone have any suggestions on how I might use either a Map function or another Hadoop-ecosystem tool to manage these queries? The goal would be to leverage the parallel processing in Hadoop.
Because of the varied rate limits, it often makes sense to switch to a different API query while waiting for the first limit to reset. An example would be one API call that creates nodes in the graph and another that enriches the data for that node. I could have the system go out and enrich the data for the new nodes while waiting for the first API limit to reset.
I have tried using SQS queuing on EC2 to manage the various API limits and states (creating a queue for each API call), but have found it to be ridiculously slow.
Any ideas?
It looks like the best option for my scenario will be using Storm, or specifically the Trident abstraction. It gives me the greatest flexibility for both workload management and process management.
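Independent of the framework, the idea from the question of switching to another API while one limit resets can be sketched in a few lines of Python. This is a single-process illustration, not Storm or Hadoop code. WindowLimit and run are hypothetical names, and only a simple "N calls per sliding window" limit is modelled. Daily or per-user limits would need additional limit objects checked the same way.

```python
import time
from collections import deque


class WindowLimit:
    """Allow at most `max_calls` calls in any sliding window of `period` seconds."""

    def __init__(self, max_calls, period):
        self.max_calls = max_calls
        self.period = period
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self, now):
        """Seconds until the next call is allowed (0.0 if allowed now)."""
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.period - (now - self.calls[0])

    def record(self, now):
        self.calls.append(now)


def run(queues, limits, call, clock=time.monotonic, sleep=time.sleep):
    """Drain per-API work queues, preferring an API whose limit is open.

    queues: {api_name: deque of requests}
    limits: {api_name: WindowLimit}
    call(api_name, request) performs the actual request.
    """
    while any(queues.values()):
        now = clock()
        waits = {api: limits[api].wait_time(now) for api, q in queues.items() if q}
        api = min(waits, key=waits.get)
        if waits[api] > 0:
            sleep(waits[api])  # every pending API is throttled: wait for the soonest
            continue
        limits[api].record(now)
        call(api, queues[api].popleft())
```

In a Storm topology the same bookkeeping would live inside whichever component issues the requests. The clock and sleep parameters exist only so the loop can be tested without real waiting.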