Domain Name Re-resolution issue Firefox - selenium

I have four identical servers hosted on AWS EC2, divided into two groups, with each group located in a different AWS region. There is one ELB in front of each group, and in AWS Route 53 I configured two weighted alias records (not latency based) pointing to the ELB of each group.
Each server has a simple Apache2 server installed which displays a simple page with different words, to distinguish the servers from each other. I started a browser client (built with the Selenium library) that frequently reloads the page at the URL of the shared domain name (pausing for 1 second), but I found that the browser (Firefox) always returns pages from servers in one group, instead of returning pages from both groups 50% of the time each, as Weighted Round Robin should work.
I also found that if I pause for a relatively longer time, pages from the other group do get returned; but as long as I refresh the page frequently, it never changes: requests always hit one group and not the other.
I also wrote a simple Java program that keeps querying the domain name against AWS Route 53, and the address I get back does alternate between the two groups, but the browser seems stuck on the group it first connected to (as long as I refresh frequently).
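For reference, my checker looks roughly like the sketch below (hostname replaced with a placeholder). Note that the JVM caches successful lookups itself, so its cache has to be disabled to observe the Route 53 rotation:

    import java.net.InetAddress;
    import java.security.Security;

    public class Route53Probe {
        public static void main(String[] args) throws Exception {
            // Disable the JVM's own positive DNS cache so every iteration
            // performs a fresh lookup and can see Route 53's weighted answers.
            Security.setProperty("networkaddress.cache.ttl", "0");

            String host = "www.example.com"; // placeholder for the weighted record
            for (int i = 0; i < 20; i++) {
                InetAddress addr = InetAddress.getByName(host);
                System.out.println(host + " -> " + addr.getHostAddress());
                Thread.sleep(1000); // same 1-second pause as the browser test
            }
        }
    }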
I suspect the problem is a TCP connection being kept alive, but I am not sure. BTW, I have already disabled the browser cache, and I am using Mac OS X 10.9 (this happens on Ubuntu as well).
Any ideas would be really appreciated. This issue is really important to my work, and its deadline is approaching. Many thanks in advance.

Unfortunately, that's normal.
Many, perhaps most, browsers cache the DNS response they get from the OS, and the lifetime of this cache is unrelated to the DNS TTL; it's at the discretion of the browser developers.
For Firefox, the default appears to be 60 seconds, so it is not likely to be directly related to keepalives, though there is certainly some potential for that too, albeit over a shorter interval in some cases: many servers will tear down an idle kept-alive connection well before 60 seconds, since a connection idle for that long is a potentially expensive waste of a resource.
Firefox: http://kb.mozillazine.org/Network.dnsCacheExpiration
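Since your client is driven by Selenium, you can lower that preference when the browser is launched. A minimal sketch with the Java bindings (the 0-second expiration is just for this experiment, the keep-alive preference is included only to rule out a pinned TCP connection, and the URL is a placeholder):

    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.firefox.FirefoxDriver;
    import org.openqa.selenium.firefox.FirefoxOptions;

    public class LowDnsCacheFirefox {
        public static void main(String[] args) {
            FirefoxOptions options = new FirefoxOptions();
            // Expire Firefox's internal DNS cache immediately (default: 60s),
            // so each reload asks the OS resolver again.
            options.addPreference("network.dnsCacheExpiration", 0);
            // Drop idle kept-alive connections right away as well.
            options.addPreference("network.http.keep-alive.timeout", 0);

            WebDriver driver = new FirefoxDriver(options);
            driver.get("http://www.example.com/"); // placeholder URL
        }
    }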
For a discussion of the issue and observations made of the behavior of different browsers, see also: http://dyn.com/blog/web-browser-dns-caching-bad-thing/

Related

load testing through wifi/lan + vpn

I really could not find an answer to my question on the web. I am currently doing load tests for a web service, for example: how the service handles 15 threads in 1 second; for that I use JMeter. I always get different average response times for 15 threads. When I'm on my company's internal network, I get wonderful results, but when I am at home, using LAN/Wi-Fi + VPN to access the web service, I get horrible results. When I test through the VPN, the web service cannot handle 30 threads in 1 second: the average response time is around 13 seconds, whereas from the company's network the average response time is 4-5 seconds. Also, that web service will itself be called from a system using a VPN.
My question is: which is the correct result, and what is the correct way to test, from the company's network or through the VPN?
Response time consists of the following metrics:
Connect time
Latency (also known as Time To First Byte)
Time to last byte
So my expectation is that the issue is not really high response time; it's more about the bandwidth of your ISP and VPN connections. Theoretically you can subtract the connect time and the time for the packets to travel back and forth to get the "real" response time. However, a better idea would be setting up a remote JMeter slave that is "local" to the system under test and orchestrating it from your "remote" JMeter master host; this way you will obtain "clean" results without these network-related slowdowns.
More information: Apache JMeter Glossary
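For example, assuming a slave machine at 10.0.0.5 sitting next to the system under test (host and file names are placeholders), the distributed run would look roughly like this:

    # on the remote slave, next to the system under test:
    jmeter-server

    # on your local master, driving the plan on the slave and collecting results:
    jmeter -n -t loadtest.jmx -R 10.0.0.5 -l results.jtl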
Arguably, the correct way to test it should be the way your users are accessing your web service.
If the majority of users are accessing it through a VPN from outside, then test it that way; if it is the other way around, test it from the company's network.
In the case of mixed access, you might want to test both at the same time.

Single request to a specific API stalled for a long time

I've built up an API application with ASP.NET Core 2.2.
Everything has been fine, except one PATCH API, which takes an ID and a list and replaces the list on the corresponding item.
This API works fine with Postman too: simple and fast, it works just as expected.
In browsers, however, the request stalls for 1 minute before being sent.
I've tried to simplify things by rewriting the app as a single jQuery function, to check whether the problem is in my frontend app; however, it still stalls for 1 minute.
I've looked up "stalled"; people say it can be Chrome's policy of loading at most six requests to the same origin at a time, but that's not my case: there is only this one request at that moment, and every other API works fine except this one.
Also, I've tried with other browsers: Firefox and Edge, but it's still the same.
According to the article Chrome provides:
Queueing. The browser queues requests when:
There are higher priority requests.
There are already six TCP connections open for this origin, which is the limit. Applies to HTTP/1.0 and HTTP/1.1 only.
The browser is briefly allocating space in the disk cache
Stalled. The request could be stalled for any of the reasons described in Queueing.
It seems that being "stalled" for a long time means the request wasn't even sent. Does that mean I can rule out fixing the backend API?
Also, since there is no other request in flight at the same time, does it mean the most likely reason is "The browser is briefly allocating space in the disk cache", or is there some other reason?
I also wonder why only this API has the issue. Is there anything special about the PATCH method?
First, use a stopwatch and measure the response time of your code in the browser and in Postman, and see how long each takes.
If both are the same, don't touch your code, because your problem isn't in your method.
If you can, test the same action with the POST HTTP attribute, to learn whether the problem is specific to PATCH.
However, my guess is that the cause lies in your setup rather than in the action itself.
It is also possible your problem can be resolved by changing the pipeline (Startup.cs). There are also problems, such as CORS, that occur only in browsers and not in Postman: unlike a simple GET or POST, a cross-origin PATCH makes the browser send a preflight OPTIONS request first, which Postman never does, and a preflight that hangs can show up as exactly this kind of stall.
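To separate the backend from the browser entirely, a neutral timed client helps. A minimal sketch with Java 11's HttpClient (the URL and payload are placeholders); if this comes back quickly, the stall is almost certainly on the browser side:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class PatchTimer {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            String json = "{\"items\":[1,2,3]}"; // placeholder payload

            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:5000/api/items/42")) // placeholder
                    .header("Content-Type", "application/json")
                    // HttpClient has no PATCH shortcut, but arbitrary verbs work:
                    .method("PATCH", HttpRequest.BodyPublishers.ofString(json))
                    .build();

            long start = System.nanoTime();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;

            System.out.println("Status " + response.statusCode()
                    + " in " + elapsedMs + " ms");
        }
    }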

TURN servers: always or never needed for a given network, or needed unpredictably?

I am currently just using a STUN server and am wondering whether TURN is necessary for an MVP. The number of users accessing the website from a workplace with super secure firewalls should be near-zero.
Let's say 4 people are testing WebRTC connection reliability. Sometimes they all successfully connect and can see/hear one another, but other times they cannot see/hear someone and refresh the page to try again.
Does the fact that they can sometimes all see/hear each other rule out the possibility that a TURN server would make a difference?
In other words, is it possible for a STUN server to discover my IP so I can connect one second, but fail if I try again a few seconds later? Or is it purely network-based, so that if STUN doesn't work for me on my network now, it 100% will not work in 5 minutes either?
If the latter (and TURN is either always or never needed for a given network), I guess that tells me the problem in my code is elsewhere...

Long polling blocking multiple windows?

Long polling has solved 99% of my problems. There is now just one other problem. Imagine a penny auction site, where people bid. On the frontpage, there are several Auctions.
If the user opens three of these auctions, and given that JavaScript is not multithreaded, how would you get the other pages to ever load? Won't they always get bogged down and fail to load because they are waiting for the long poll to end? In practice I've experienced exactly this, and I can't think of a way around it. Any ideas?
There are two ways that JavaScript gets around some of this.
While JavaScript is conceptually single-threaded, it does its I/O in separate threads using completion handlers. This means other pieces of JavaScript can be running while you are waiting for your network request to complete.
The JavaScript for each page (or even each frame in each page) is isolated from the JavaScript on the other pages/frames. This means each copy of JavaScript can be running in its own thread.
A bigger issue for you is likely to be that browsers often limit the number of concurrent connections to a given site, and it sounds like you want to make many concurrent connections to the same site. In this case you will get a lock-up.
If you control both the server and the client, you will need to combine the multiple long-poll requests from the client into a single long-poll request to the server, as sketched below.
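As an illustration, a stripped-down combined endpoint using the Servlet 3.0 async API; all names and the wire format are made up, and real code would need extra bookkeeping (for example, a context parked for several auctions must only be completed once):

    import java.io.IOException;
    import java.util.Map;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import javax.servlet.AsyncContext;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // One browser connection watches many auctions: GET /poll?auctions=12,34,56
    @WebServlet(urlPatterns = "/poll", asyncSupported = true)
    public class CombinedPollServlet extends HttpServlet {

        // auctionId -> requests currently parked waiting on it (in-memory, single node)
        private static final Map<String, Queue<AsyncContext>> waiters =
                new ConcurrentHashMap<>();

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
            AsyncContext ctx = req.startAsync();
            ctx.setTimeout(30_000); // after 30s the client simply re-polls
            for (String id : req.getParameter("auctions").split(",")) {
                waiters.computeIfAbsent(id, k -> new ConcurrentLinkedQueue<>())
                       .add(ctx);
            }
        }

        // Called from the bidding logic whenever an auction changes.
        public static void notifyBid(String auctionId, String json) throws IOException {
            Queue<AsyncContext> parked = waiters.remove(auctionId);
            if (parked == null) return;
            for (AsyncContext ctx : parked) {
                ctx.getResponse().setContentType("application/json");
                ctx.getResponse().getWriter().write(json);
                ctx.complete(); // frees the browser's single connection
            }
        }
    }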

using BOSH/similar technique for existing application/system

We have an existing system which connects to the back end via HTTP (Apache/SSL) and polls the server for new messages; needless to say, we have scalability issues.
I'm researching how to remove this polling and have come across BOSH/XMPP, but I'm not sure how we should adopt the BOSH technique (using a long-lived HTTP connection).
I've seen there are a few libraries available, but the entire thing seems bloated, since we do not need buddy lists etc. and simply want to notify the clients of available messages.
The client is written in C/C++ and works across most OSes, so that is an important factor. The server is in Java.
Does BOSH result in a huge number of httpd processes? Since it has to keep all the clients connected, what would the limit on that be? We are also planning to move to a 64-bit JVM/Apache; what would the maximum number of clients be in that case?
Any hints?
I would note that BOSH is separate from XMPP, so there are no buddy lists involved. XMPP-over-BOSH is what you're thinking of there.
Take a look at collecta.com and associated blog posts (probably by Jack Moffitt) about how they use BOSH (and also XMPP) to deliver real-time information to large numbers of users.
As for the scaling issues with Apache, I don't know; presumably each connection uses few resources, so you can increase the number of connections per Apache process. But you could also check out some of the connection manager technologies (like punjab) mentioned on the BOSH page above.