How to add more cars into the simulation, limit speed and edit traffic lights - SUMO

I want to add more vehicles and change the speed.
I used the command --max-num-vehicles 30 to try to start the simulation with more cars, but for some reason the number of active cars in the running simulation never exceeds 50 or 60.
Also, my simulation has traffic lights, but they seem not to work properly because they only have 2 stages (green light and yellow light).

The purpose of --max-num-vehicles is to limit the number of cars, not to increase it. The easiest way to get more cars is usually to define more traffic input, either by running randomTrips.py with a lower period parameter (e.g. -p 0.5, which inserts a new vehicle every half second) or by introducing flows with high counts into your route file; see also http://sumo.dlr.de/wiki/FAQ#How_do_I_get_high_flows.2Fvehicle_densities.3F.
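For the flow-based approach, a route file sketch might look like the following (the edge IDs and counts are placeholders for your own network):

    <!-- map.rou.xml: a hypothetical flow inserting 1000 vehicles over 1000 s -->
    <routes>
        <flow id="flow0" begin="0" end="1000" number="1000" from="edgeA" to="edgeB"/>
    </routes>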
Concerning the traffic light, you probably created a traffic light at a junction with just two streets. It is a known issue that SUMO's netconvert does not add red phases there, because there are no conflicting flows demanding green time. If you are using the nightly svn version, you can use the --tls.red.time option to force a red phase; if not, you can edit the program manually in netedit.

You may have to regenerate your map.rou.xml file in order to have more vehicles. As the answer above says, pass a lower -p value (e.g. -p 0.5) to your randomTrips.py command as an option.


Baselining internal network traffic (corporate)

We are collecting network traffic from switches using Zeek in the form of ‘connection logs’. The connection logs are then stored in Elasticsearch indices via filebeat. Each connection log is a tuple with the following fields: (source_ip, destination_ip, port, protocol, network_bytes, duration) There are more fields, but let’s just consider the above fields for simplicity for now. We get 200 million such logs every hour for internal traffic. (Zeek allows us to identify internal traffic through a field.) We have about 200,000 active IP addresses.
What we want to do is digest all these logs and create a graph where each node is an IP address and a directed edge (source → destination) represents traffic between two IP addresses. There will be one unique edge for each distinct (port, protocol) tuple. Each edge will have properties: average duration, average bytes transferred, and a histogram of log counts by hour of day.
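Sketched in Python, the per-edge aggregation we are after is roughly this (field names as above; the log source is assumed, and averages are derived at read time):

    # Fold connection logs into edges keyed by (source_ip, destination_ip, port, protocol)
    from collections import defaultdict

    def new_edge():
        return {"logs": 0, "bytes": 0, "duration": 0.0, "by_hour": [0] * 24}

    edges = defaultdict(new_edge)

    def ingest(log):  # log: dict with the fields above plus a timestamp
        key = (log["source_ip"], log["destination_ip"], log["port"], log["protocol"])
        e = edges[key]
        e["logs"] += 1
        e["bytes"] += log["network_bytes"]
        e["duration"] += log["duration"]
        e["by_hour"][log["timestamp"].hour] += 1

    # avg_bytes = e["bytes"] / e["logs"], avg_duration = e["duration"] / e["logs"]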
I have tried using Elasticsearch's aggregations and also the newer Transform feature. While both work in theory, and I have tested them successfully on a very small subset of IP addresses, the processes simply cannot keep up with our entire internal traffic. E.g. digesting 1 hour of logs (about 200M logs) using Transform takes about 3 hours.
My question is:
Is post-processing Elasticsearch data the right approach to building this graph? Or is there some product that we can use upstream to do this job? Someone suggested looking into ntopng, but I did not find this specific use case in their product description. (Not sure if it is relevant, but we use ntop's PF_RING product as a front end for Zeek.) Are there other products that do the job out of the box? Thanks.
What problems or root causes are you attempting to elicit with a graph of Zeek east-west traffic?
Seems that a more tailored use case, such as a specific type of authentication, or even a larger problem set such as endpoint access expansion, might be a better use of storage, compute, memory, and your other valuable time and resources, no?
Even if you did want to correlate or group on Zeek data, try to normalize it to OSSEM, and there would be no reason to, say, collect the full tuple when you can collect a community-id instead. You could correlate Zeek in the large to Suricata in the small. Perhaps a better data architecture would be VAST.
Kibana, in its latest iterations, does have Graph, and even older versions can leverage the third-party kbn_network plugin. I could see you hitting a wall with 200k active IP addresses and Elasticsearch aggregations, or even summary indexes.
Many orgs will build data architectures beyond the simple Serving layer provided by Elasticsearch. What I have heard of is a Kappa architecture streaming directly into a graph database such as Dgraph, with perhaps just those edges of the graph made available from a Serving layer.
There are other ways of asking questions from IP address data, such as the ML options in AWS SageMaker IP Insights or the Apache Spot project.
Additionally, I'm a huge fan of getting the right data only as the situation arises, although in an automated way so that the puzzle pieces bubble up for me and I can simply lock them into place. If I were working with Zeek data especially, I could leverage a platform such as SecurityOnion and its orchestrated Playbook engine to kick off other tasks for me, such as querying out with one of the Velocidex tools, or even cross-correlating using the built-in Sigma sources.

Optimal EC2 Instance to handle/edit PDF files

So I have two applications that are processing PDF files:
The first one combines two files and then captures a digital signature and fingerprint, returning a single file with the images of the captures.
The second one is a cloud signer that inserts QR codes on all pages of the PDFs and then digitally signs the document, also placing some bar codes and a legend. This application will eventually receive requests with 100 documents.
Both applications currently work on t2.micro instances (some operations take a while, e.g. signing 200 docs takes around 4 minutes), but I would like to know the optimal instance type to run smoothly without trouble. Would M or C instances make a significant difference?
Which one do you recommend: M, C, or other?
Thanks,
Felipe
t-type instances are "burstable performance" instances, which means you cannot really utilize them at 100% over a long period of time.
I worked on a project where I extracted text from PDFs, and it was pretty much a CPU-bound task, so c-type instances may be a good choice. However, what you really should take into account (especially if the workload is uneven over the day) is elasticity.
It may take some effort, but consider using either Lambda functions or an Auto Scaling group so you can start more servers when you need them and stop them later. It may be significantly more work, but it can save a lot of money if you use this workload for a long time.
Also, I would invest some time in making it simple to deploy with various parameters. For example, if you create a CloudFormation template and only occasionally need to sign 200 documents fast, you can deploy 24 t2.micro instances, run them for the period you need (say 1 hour), and then shut them down. It will cost you about the same as running 1 t2.micro for the full day.
One service which may help with this is SQS: you can put documents into a queue and then just let the machines pick up a document, sign it, and put it at its destination.
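A minimal sketch of such a worker loop with boto3 (the queue URL and sign_document are hypothetical stand-ins for your own resources and code):

    import boto3

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/pdf-jobs"  # placeholder

    def sign_document(body):
        print("signing", body)  # stand-in for your actual PDF-signing logic

    while True:
        resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1,
                                   WaitTimeSeconds=20)  # long polling
        for msg in resp.get("Messages", []):
            sign_document(msg["Body"])
            # delete only after successful processing
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])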

How to delete an instance if CPU is low?

I am running managed instance groups whose overall CPU usage is always below 30%, but if I check instances individually I find some running above 70% and others as low as 15%.
Keep in mind that managed instance groups don't look at individual instances when deciding whether a machine should be removed from the pool. GCP's MIGs keep a running average of the last 10 minutes of activity of all instances in the group and use that metric to make scaling decisions. You can find more details here.
Identifying instances with lower CPU usage than the group doesn't seem like the right goal here; instead I would suggest focusing on why some machines have 15% usage while others have 70%. How is work distributed to your instances? Are you using the correct load-balancing strategy for your workload?
Maybe your application has specific endpoints that cause large amounts of CPU usage while the majority of requests are basic CRUD operations; having one machine generating a report and showing higher usage is fine. If all instances render HTML pages from templates and return the results, one machine performing much less work than the others is a distribution issue. Maybe you're using a requests-per-second algorithm when you want a CPU-utilization one.
In your use case, the best option is to create an alert notification that fires when an instance goes over the desired CPU usage. Once you receive the notification, you can manually delete the VM instance. As it is part of a managed instance group, the VM instance will be recreated automatically.
I have attached an article on how to create an alert notification here.
There is no metric within Stackdriver that will call the GCE API to delete a VM instance.
There is currently no such automation in place. It shouldn't be too difficult to implement yourself, though. You can write a small script that runs on all your machines (started from cron or something) and monitors CPU usage. If it decides the usage is too low, the instance can delete itself from the MIG (using e.g. gcloud compute instance-groups managed delete-instances MIG_NAME --instances=INSTANCE_NAME).
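A rough sketch of such a script (the MIG name and threshold are hypothetical; it assumes gcloud and psutil are available on the instance and that it runs from cron):

    import subprocess
    import psutil
    import requests

    CPU_THRESHOLD = 20.0  # percent; pick what "too low" means for you

    def metadata(path):
        # The GCE metadata server exposes the instance's own name and zone
        return requests.get(
            "http://metadata.google.internal/computeMetadata/v1/" + path,
            headers={"Metadata-Flavor": "Google"}).text

    # Sample CPU over one minute; delete ourselves from the MIG if it's too low
    if psutil.cpu_percent(interval=60) < CPU_THRESHOLD:
        name = metadata("instance/name")
        zone = metadata("instance/zone").rsplit("/", 1)[-1]
        subprocess.run(["gcloud", "compute", "instance-groups", "managed",
                        "delete-instances", "my-mig",  # placeholder MIG name
                        "--zone", zone, "--instances", name], check=True)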

Debugging a slow site -> long delay between connection and data sending

I ran a test from Pingdom Tools to check the loading time of my website. The result is that I have a lot of files that, in spite of being very small (5 kB), take a long time (1 second or more) to load, because there is a big delay between the start of the connection and the start of the data download (in Pingdom Tools this shows up as a very long green bar).
Have a look at this for example: http://tools.pingdom.com/default.asp?url=http%3a%2f%2fwww.giochigratis-online.net%2f&id=5691308
How can I reduce the "green bar" time? Is this an Apache problem (like, I don't know, the number of max parallel connections, or something similar) or a hardware problem? CPU-limited, bandwidth-limited, or something else?
I see that many other websites have very short green bars... how do they reduce the delay between the connection and the actual data sending?
Thanks!
PS: the site is built with Drupal. Homepage generation takes about 700 ms.
PPS: I tested 3 other websites on the same server: same problem.
I think it could be a problem with the max number of parallel connections, as you mentioned - either on the server or the client side. For instance, Firefox has a default of network.http.max-connections-per-server = 15 (see here), while you have >70 files to download from your domain and another 40 from Facebook.
You can reduce the number of loaded images by generating sprites, i.e. a single image consisting of multiple small images, and then using CSS to display the right part in each place you want. This is widely used, e.g. by Google - see http://www.google.com/images/nav_logo83.png
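The CSS side of a sprite is just a shared background image shifted per class; a sketch with hypothetical class names and a two-icon vertical sprite:

    /* icons.png is a hypothetical 16x32 sprite holding two 16x16 icons */
    .icon        { display: inline-block; width: 16px; height: 16px;
                   background: url(icons.png) no-repeat; }
    .icon-home   { background-position: 0 0; }     /* top icon */
    .icon-search { background-position: 0 -16px; } /* bottom icon */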

Keep time in sync among servers without internet connection

I have 5 servers on a LAN without an Internet connection. I need them to keep their clocks in sync.
I could configure them as NTP peers and set a high stratum for the local clock of one of them. That way, the other four would sync to that clock.
What I actually want is for them to agree on a time using all 5 local clocks (i.e. doing some kind of average), for reasons of robustness and precision. Is this possible with NTP?
PS: I do not want to use an external clock source.
EDIT: and no scripting outside NTP's own features; that could only make precision worse :)
If you average 5 drifting clocks, the only thing you get is another drifting clock that's harder to correct. It won't be more precise. NTP uses multiple servers to increase precision because it takes network latency into account. Since all your systems are on a fast local network, you just need one server.
Set up two systems to be NTP servers: one primary and, if you feel the need, one backup. Have all other systems synchronize to them. This will be significantly easier to set up than the clock-averaging solution, and you won't have to develop any crazy scripts.
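A sketch of the classic ntpd setup with the local clock driver (the server IP is a placeholder for your own LAN):

    # /etc/ntp.conf on the designated server: fall back to the local,
    # undisciplined clock at a deliberately high stratum
    server 127.127.1.0
    fudge  127.127.1.0 stratum 10

    # /etc/ntp.conf on the other four machines
    server 192.168.0.10 iburst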
You might be able to have one of them listen for the times from each computer, compute an average, set the average as its own time, and broadcast that time to all the other computers. It seems a little excessive, though.
You can set up one of them as an NTP server that broadcasts its time on the local network, and the others as clients that listen on the local network.
EDIT:
I missed the averaging part. Well, in that case, you can probably write a script on the local server to collect times from all the clients, compute the average, and update its own time with that value.
You may even want to get rid of NTP in that case and just use the script to update the time on all the servers.
I wish I could give a definitive proposal, but I don't know enough about your environment. No matter what, you'll likely be doing some sort of script kung fu.
If it's Unix/Linux, I would set up SSH authorized keys everywhere and poll each machine's date +%s (to get the epoch), average those times with awk or something, and then set the machine's own local date.
Or perhaps it would be more secure (and reliable) to have one authoritative machine check everyone's time in the same manner, average it, and then provision itself and every other host with that average.
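A rough sketch of that authoritative-machine variant (hostnames are placeholders; it assumes passwordless SSH and sudo rights to set the clock):

    import subprocess

    HOSTS = ["srv1", "srv2", "srv3", "srv4", "srv5"]  # placeholders

    def epoch(host):
        # Read the remote clock as seconds since the epoch
        return int(subprocess.check_output(["ssh", host, "date", "+%s"]))

    avg = sum(epoch(h) for h in HOSTS) // len(HOSTS)

    for host in HOSTS:
        # GNU date accepts @EPOCH to set the clock from an epoch timestamp
        subprocess.run(["ssh", host, "sudo", "date", "-s", "@%d" % avg], check=True)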
On Windows you'll probably be looking into VBScript and WMI.
EDIT:
You may run into some weird problems if anyone's clock drifts forward of the average - and my guess is that about half of them will ;). Timestamps from the future can be rather strange. It will be up to you to determine how frequently this synchronization needs to occur.