Random ECONNREFUSED errors accessing Azure Storage - azure-storage

We are currently facing connection-refused errors in our production environment when downloading files stored in our Azure Storage account.
Node gives us this error randomly:
Error: connect ECONNREFUSED 52.239.194.36:443
at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1191:14)
The random aspect of this issue makes it hard to find a cause.
Some clues we have gathered so far:
Every refusal comes from IP 52.239.194.36.
Our relevant firewall rules:
ACCEPT tcp -- anywhere anywhere tcp dpt:http
ACCEPT tcp -- anywhere anywhere tcp dpt:https
The original requests are issued by our customers and our server acts as a proxy for the Azure files, so all connections to Azure come from our IP. Could we be hitting some DDoS protection?
Any ideas are welcome!
Feel free to ask for more details.
Thanks!

After contacting Azure's support without much result, I ran into this document:
https://learn.microsoft.com/en-us/azure/architecture/best-practices/transient-faults
TL;DR:
It says network errors are bound to happen sooner or later, so any cloud-based application should be resilient to such transient faults.
The solution here is to implement a retry procedure for our cloud interactions.
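For illustration, here is a minimal sketch of such a retry procedure in Python (the original application is Node, so this only shows the shape of the idea). The function name `retry_transient` and its parameters are hypothetical, not part of any Azure SDK, and the choice of exponential backoff with jitter is an assumption about what a reasonable retry policy looks like:

```python
import random
import time


def retry_transient(operation, attempts=5, base_delay=0.5, max_delay=8.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation() until it succeeds or the attempts are exhausted.

    Only exceptions listed in retry_on are treated as transient.
    ConnectionRefusedError is a subclass of ConnectionError, so a refused
    connection is retried by default.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Exponential backoff capped at max_delay, with random jitter so
            # that many clients do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

It would be used by wrapping the download call, for example `retry_transient(lambda: download_blob(name))`, where `download_blob` stands for whatever function performs the storage request. Errors that are not in `retry_on` are raised immediately, so genuine bugs are not hidden by the retries.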

Related

connection refused for news.google.com when I try to scrape it

I am trying to scrape news.google.com to get some information. It works locally, but when I deploy to our datacenter it fails with "connection refused", which I take to mean we are blocked.
Get https://news.google.com?ceid=en%3Agb&gl=en-gb&hl=en-gb&hs=en-gb&pz=1: dial tcp 172.217.5.206:443: connect: connection refused
Is there an alternative, such as passing a header, to get unblocked? Or is a paid API the only option if I have to use Google for some of my testing?
Has anyone encountered and resolved this?
This is pretty common. Somebody else used the same IP address for scraping or even something worse :) so it's blocked.
You can use some proxy services. There are some with free tier so it will do the job for testing.
And before you ask... free proxy services are super slow and probably already blocked :)

Error occurred during the pre-login handshake, due to AntiVirus?

I need your expertise on an intermittent connectivity issue between our Power BI on-premises gateway and SQL Server.
Error from the gateway log:
Error: A connection was successfully established with the server, but
then an error occurred during the pre-login handshake. (provider: SSL
Provider, error: 0 - The wait operation timed out.)
The hard part is that it is very difficult to reproduce ☹️ Whenever I test connectivity from the gateway to SQL Server it succeeds, but in some very rare cases it fails.
Steps we took to find the root cause:
Checked that only TLS 1.2 is enabled on both the gateway server and the SQL Server; other TLS versions are disabled.
Created a .udl file and tested connectivity, but got an error like:
[DBNETLIB][ConnectionOpen (SECCreateCredentials()).]SSL Security error.
Finally, we contacted our internal support team, who told us to run a network trace, so we did.
After a long time, we were lucky enough to capture the error in the network trace.
The support team said:
We see that the client (gateway server) sends the Client Hello for the TLS handshake after 14 seconds. This delay causes the connection to fail, as the connection needs to be established within 15 seconds.
We see the same pattern, where the client causes the delay, in multiple instances of the communication.
Such a delay is generally caused by antivirus software.
My question:
Is this really an antivirus issue? If so, why doesn't it happen every time?
P.S. I know this question has already been asked on SO and may be a duplicate, but my real question is whether antivirus software could be a cause of this.
The issue was finally resolved after many attempts. Here is the solution that worked for us:
• Azure AD join: connections head to "login.microsoft.com" and delay the handshake. A few registry and GPO settings need to be applied to disable this automatic Azure Workplace Join.
https://learn.microsoft.com/en-us/azure/active-directory/device-management-troubleshoot-hybrid-join-windows-current
It talks about restricting the server from joining AzureAD through a GPO, which resolves to:
HKLM\SOFTWARE\Policies\Microsoft\Windows\WorkplaceJoin\ key:
autoWorkplaceJoin = 0
• Connections headed to http://ctldl.windowsupdate.com. Refer to the article below, which discusses this issue.
https://blogs.technet.microsoft.com/askds/2018/04/10/tls-handshake-errors-and-connection-timeouts-maybe-its-the-ctl-engine/
To disable it:
• Create a backup of this registry key (export and save a copy):
HKEY_LOCAL_MACHINE\SOFTWARE\Policies\Microsoft\SystemCertificates\AuthRoot
• Then create the following DWORD registry values under the key:
"EnableDisallowedCertAutoUpdate"=dword:00000000
"DisableRootAutoUpdate"=dword:00000001
I hope this helps someone in the future !

BGP peers established in SDN

I downloaded a program from GitHub that interconnects a traditional network with an SDN. The program establishes iBGP peers. When I run it, an error occurs (the peer closes the connection). How can I deal with this?
Since the peer closed the connection, you should check the logs and/or debug on the other side of the connection. The log file will probably explain why it didn't like your connection attempt / why it refused you.
You tagged quagga, so I assume that at least one of the sides is quagga. It should be simple enough to enable some debugging in the quagga CLI to see what's going on.
Additionally, BGP can send notifications, notifying the peer of an error. So the connecting side should be aware of the error. This implies that the connection (TCP) was established and that the first BGP exchanges (BGP OPEN MSG) happened.
Maybe start with a packet capture (replace <interface> with the interface facing the peer): tcpdump -vvv -i <interface> -s0 'host 10.10.10.1 and port 179'

Cannot connect to RDS SQL Server Database using Management Studio

I created a SQL Server RDS instance in AWS and it seems to be up and running, but when I try to connect to it using Management Studio I get an error.
Here is the text of the error:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 0 - The wait operation timed out.) (Microsoft SQL Server, Error: 258)
I initially tried with the default security group that was created with the instance, but when that didn't work I created a new security group and modified the instance to use it.
Here you can see the details:
I tried this connection setup to connect:
Server Type: Database Engine
Server Name: valuationdlsdev.ck1qvjqhglyg.us-west-2.rds.amazonaws.com,1433
Authentication: SQL Server Authentication
Login: the Master User Login I created when creating the RDS Instance
Password: the Master User Password I created when creating the RDS Instance
I was at my wits' end, so I changed the security group setting to All traffic just to see if that would work. Here are all the settings on the security group:
At this point I'm wondering whether port 1433 is simply not open, because I feel like I've tried everything. Could someone please help me?
Thanks.
In my case, I opened the VPC security group associated with my database.
In the EC2 Security Groups dashboard, I chose Edit Inbound Rules from the Actions dropdown.
At first, I looked at the inbound rules and thought everything was OK since this was the current setup
After all - if it was allowing all traffic, then what could possibly be wrong?
On a whim I added a rule for TCP port 1433, ending up with this simple setup
Then it immediately started working for me.
Make sure the instance is publicly accessible; there is a radio button you have to select to make it so.
Also add an MS SQL inbound rule in the Inbound tab. After making the change, wait a while so that the settings are applied to the instance.
In my experience this was counter-intuitive. With the options I selected, all ports and IPs seemed to be open, but after editing the inbound and outbound rules in the security group to have MS SQL for anywhere, I was able to connect.
For inbound rules, go to the VPC security group of your database instance.
In the Inbound tab, click Modify.
In the Source column, replace the IP 0.0.0.0 with your own IP by choosing "My IP", or choose "Anywhere".
I had the same issue.
I ended up deleting the security group inbound rules, and just added a new inbound rule for port 1433, source being: 0.0.0.0
Thanks for the discussion here. I am posting my findings in case anyone needs help in the future.
I initially followed this guide: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ConnectToMicrosoftSQLServerInstance.html
Then I got some ideas from this post and eventually figured out my particular issue: https://forums.aws.amazon.com/thread.jspa?messageID=845682 The poster walks through thorough troubleshooting steps that should fix most general Error 258 problems. In the end I used the answerer's suggestion to find my problem.
In my case, I hit Error 258 when trying to connect to RDS SQL Server 2016 from inside a secure network at my workplace. When I switched to a public network provided by a telecom vendor, the connection succeeded.
If you want to access the instance from a network other than the one where it was created, you need to open access to the IP range you will connect from: go to the security group assigned to your DB instance and add a rule for your IP range.
P.S. By default, AWS only allows access from the IP range of the machine where you activated "public access" for the instance.
I was also not able to access it from my office laptop, but I was able to access it from my personal laptop. I think it is because of some company firewall rules.
In case anyone comes across this post looking for an answer, I wanted to post an update. The issue turned out to be that I misunderstood how "Publicly Accessible" works and set it to "Yes". Apparently it should have been set to "No". "Yes" does, however, work for the SQL Server Express edition.
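Several of the answers above come down to one question: is TCP port 1433 reachable from the network you are connecting from? Here is a minimal Python sketch for checking that before blaming Management Studio. The helper name `probe_tcp` is hypothetical, and the reading of the results is a rule of thumb rather than a definitive diagnosis:

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try a plain TCP connect and classify the outcome.

    Returns 'open', 'refused', 'timeout', or 'error: <details>'.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # the TCP handshake completed
    except ConnectionRefusedError:
        return "refused"  # host reachable, but nothing accepts on that port
    except socket.timeout:
        return "timeout"  # no reply at all: packets are probably being dropped
    except OSError as exc:
        return f"error: {exc}"  # e.g. DNS failure or unreachable network
```

Calling `probe_tcp("<your-instance-endpoint>", 1433)` from the office network and again from another network shows whether the difference is in the network path. A "timeout" result matches the "wait operation timed out" text in Error 258 and usually points at a security group or corporate firewall silently dropping the traffic, while "open" suggests the problem is elsewhere (login, instance name, encryption settings).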

NSURLSession - The request timed out

I'm posting data from my app to my server using NSURLSession when a button is pressed. The first two times, the data is sent successfully and inserted into a database, but every request after that times out.
I've tried: changing session configuration (connections per host, timeoutInterval etc), session configuration types, changing the way the data is posted.
Has anyone seen this sort of behaviour before and know how I can fix this issue?
Or is it a server issue? I thought my server was down initially. I couldn't connect to it, nor load certain pages. However, it was only down for me. After rebooting my modem, I could connect back to the server. I didn't have any issues connecting to phpMyAdmin.
If the problem was reproducible after a reboot of the router, then I would look into whether Apple's captive portal test servers were down at the time.
Otherwise, my suspicion is that it is a network problem rather than anything specific to your app.
It is quite possible that the pages you were loading successfully were coming from cache.
Because you said that rebooting your modem fixed the problem, that likely means that your modem stopped responding to either DHCP requests or DNS lookups (either for your domain or for one of the captive portal test domains).
It is also possible that you have a packet loss problem, and that it gets worse the longer your router has been up and running. This could cause some requests to complete and others to fail.
Occasionally, I've seen weird behavior vaguely similar to this when ICMP is getting blocked too aggressively.
I've also seen this when a stateful firewall loses its mind and forgets the state.
This can also be caused by keeping HTTP/HTTPS connections alive past the point at which the server gives up and drops the connection, if your firewall is blocking the packet that tells you that the connection was closed by the remote end.
But without a packet trace, there's no way to be certain. To get one:
If your network code is running on OS X, you can just do this with tcpdump on your network interface.
If you are doing this on iOS, you can do this by connecting your computer via wired Ethernet, enabling network sharing over Wi-Fi, pointing tcpdump at the bridge interface, and pointing your iPhone at that Wi-Fi network.
Either way, that will tell you if there are requests going out that never come back, and more importantly, what type of requests they are, and who is responsible for replying to them. Once you have that information, if the source of the problem isn't obvious, post a link to the packet trace and we'll add more suggestions.