Azure Container Instance Can't Connect to Internet (outbound http GET request failed) - azure-container-service

I have an Azure Container Instance created from the base image microsoft/windowsservercore:ltsc2016. The image has Mercurial installed and checks out a private repo using hg clone, but the clone fails with abort: error: getaddrinfo failed. When run on my workstation using Docker for Windows, the container checks out the repo successfully.
I believe this is a network connectivity issue, because if I run powershell Invoke-WebRequest http://microsoft.com the container also logs an error saying the request could not be completed because it failed to connect to the server.

Windows containers on ACI have a known issue with outbound network connectivity. The suggested workaround is to add retry logic to every network request, or to add a 30-second delay before you start your application.
https://learn.microsoft.com/en-us/azure/container-instances/container-instances-troubleshooting#windows-containers-slow-network-readiness
The issue affects only Windows Server 2016 and is fixed in Windows Server 2019. Once ACI adopts Windows Server 2019, the workaround will no longer be needed.
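The retry workaround could look roughly like the following. This is a minimal sketch in Python, not code from the answer: the names call_with_retry and fetch, the attempt count, and the 10-second delay are assumptions chosen for illustration, and the URL is the one from the question.

```python
import time
import urllib.request


def call_with_retry(operation, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Run operation(), retrying on network errors until it succeeds.

    attempts and delay_seconds are illustrative defaults (assumptions),
    not values recommended by the ACI documentation.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError as error:
            # URLError, socket.gaierror and socket timeouts all derive from OSError.
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)
    raise last_error


def fetch(url="http://microsoft.com", timeout=10):
    """Issue one HTTP GET and return the response status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status
```

Calling call_with_retry(fetch) at container startup would keep trying the request until the network is ready, instead of failing on the first attempt.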

Related

Repos missing suddenly in azure container registry instance

The repositories in my Azure Container Registry (ACR) instance have suddenly gone missing.
There were many repositories in the instance, and now they are no longer listed.
The portal says "unable to send request to fetch repositories".
Can you help?
If your repositories or tags are not shown in the Azure portal, I would suggest using Firefox or Chrome to list your registry's repositories.
About "unable to send request to fetch repositories":
The repositories may still be there, but the browser may be unable to send the request that fetches them to the server.
If the registry allows only private access, make sure you are not opening the registry in the portal from a public network.
To rule out these issues, check that your network connectivity is sufficient, and check for DNS mistakes and ad blockers.
Also check whether you have a firewall enabled.
Try running az acr check-health -n yourRegistry with the Azure CLI to determine whether your environment can connect to the Container Registry. To rule out an expired cookie, try an incognito or private session in your browser.
Reference: Azure Container Registry | Microsoft Docs

Impossible to login to my azure container registry with docker login

I created an Azure Container Registry some days ago, and now I cannot log in to it with the docker login command. I always get this error message:
Error response from daemon: Get https://XXXXXXXXX.azurecr.io/v2/: dial tcp: lookup XXXXXXXXX.azurecr.io on [::1]:53: read udp [::1]:52627->[::1]:53: read: connection refused
The Docker client may throw this error when it cannot connect to the local Docker daemon properly, so restarting or reinstalling Docker should fix it in most cases.
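The error text itself shows a name lookup being refused by a resolver at [::1]:53, so it can also help to check whether the machine can resolve the registry host name at all. The following is a minimal sketch in Python, not part of the original answer; can_resolve is a hypothetical helper name, and the host name passed to it would be your own registry's login server.

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the local resolver returns at least one address for hostname."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Raised when name resolution fails (no such host, resolver unreachable, ...).
        return False
```

If can_resolve returns False for the registry name while other names resolve, the problem is more likely in the local DNS configuration than in the registry.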

Mesos Failed to connect error to IP:5050

I am new to Mesos and have just finished setting up Mesos along with ZooKeeper on my test server.
Unfortunately, I keep getting an error message in the Mesos console saying it is unable to connect to Mesos on port 5050, and I can't figure out why.
I have included the error in the screenshot below.
The Mesos log files don't indicate why the error is showing either.
I resolved the problem by starting the master like this:
./bin/mesos-master.sh --ip=x.x.x.x --work_dir=/var/lib/mesos --hostname=x.x.x.x
We can avoid this problem by starting mesos-master with the following options:
--ip=xx.xx.xx.xx --hostname_lookup=false
I have resolved this problem. Open the web page in Chrome and open the developer tools; you will see that Chrome accesses the site by domain name. In my case the domain name was "mesosphere", and because there was no mesosphere entry in DNS, the requests failed.
I solved the problem by adding mesosphere to the hosts file, C:\Windows\System32\drivers\etc\hosts.
If you use a domain name for the Mesos cluster, you must add that domain name to the Windows hosts file.
There can be multiple issues here.
Is your mesos-master running and healthy?
If so, has the leader election process completed?
Check whether you are able to run
ping leader.mesos
If the ping doesn't work, a leader has not been elected. Fix that first.
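Alongside the ping check, you can test directly whether anything is accepting TCP connections on port 5050. This is a minimal sketch in Python and not part of the original answers; port_is_open is a hypothetical helper name, and the host and port are the ones mentioned in this thread.

```python
import socket


def port_is_open(host, port=5050, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, port_is_open("leader.mesos") returning False while the master process is running would point at name resolution or the address the master is bound to, rather than at the web UI.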
I had this problem too. Luckily, I also have a working Mesos server, so I could compare my demo setup with it. I captured the packets between client and server in my demo and found that the browser did not send fresh requests, only some keepalive packets.
But when I captured packets on the working Mesos server, I found that the browser sent GET requests frequently, as in the image.
I think that if you run some tasks or add some agents, it may trigger the browser to send requests frequently. Then the "Failed to connect" message will disappear.
I was having the same issue, and what fixed it for me was the ZooKeeper configuration. In my case I was using the EC2 public IP address rather than the private one. Once I changed the /etc/mesos/zk file to zk://<private IP>:2181/mesos I was able to connect without the constant error messages. In other words, ZooKeeper was reporting that it was running on one IP while mesos-master was trying to connect using a different IP.
My configuration was correct as suggested, but the mesos-master service failed to start. There is an alternative way to start the mesos-master node with exactly the same configuration. Commands to start mesos-master:
$ cd /usr/sbin [or mesos_installation directory/bin]
$ sudo ./mesos-master --work_dir=/var/lib/mesos --log_dir=/home/rajeev/logs/mesos/
This started the mesos-master service successfully for me.

Not able to add worker after successful installation of Website Controller, Management server, Front end server, Publication server and File server

After successfully installing Windows Azure Pack, I am trying to install Windows Azure Pack: Web Sites v2 U6.
I am able to add most of the servers to the Website controller. All of the servers below are in the Ready state, so there was no installation error.
Management server
Front end server
Publication server
File server.
But when I add the Worker server, it is added correctly and starts the installation process. At one stage of the installation, it seems that it is not receiving a connection string. The log output is below.
Start service: rsfilter.
Service rsfilter is running.
Configure Idle Pageout feature.
Completed configuration of Idle Pageout feature.
Take ownership for file C:\Windows\system32\Drivers\http.sys.
Configure DWAS Files location to path 'C:\DWASFiles'
File caching is turned off
Execute command 'powershell.exe Import-Module NetQoS; $policy = Get-NetQosPolicy -PolicyStore ActiveStore | Where-Object { $_.Name -eq 'udplimit' }; if (!$policy) { New-NetQosPolicy -name 'udplimit' -ThrottleRateActionBitsPerSecond 65536 -IpProtocol UDP -PolicyStore ActiveStore -ea Stop }'
Setup database connection string for server WAPSQL .
Setup data service credentials.
Stop service: WAS.
Service 'WAS' is stopped.
Set IPv4 dynamic port range, with starting port 30000, and number of ports 35536.
Execute command 'netsh.exe int ipv4 set dynamicport tcp start=30000 num=35536'
Start service: dwassvc.
Service dwassvc is running.
WorkerManagementService started. Ready to receive ConnectionString and DataServiceCredentials.
Waiting for worker connection string. Attempt number is 12.
Waiting for worker connection string. Attempt number is 24.
Waiting for worker connection string. Attempt number is 36.
Waiting for worker connection string. Attempt number is 48.
I also tried to repair the Front end server as mentioned on the Microsoft forum, but it did not help.
Here's the Microsoft forum link
Any tip or guidance will be appreciated.
Thanks,
Dharmendra
I found that my Front end server was not accessible from the management server and the website controller. The DNS server was not resolving its name correctly (it was reachable by IP address but not by server name). Also, using klist tickets, I found that the front end server had no Kerberos ticket in the list.
To fix it, I removed the front end server from the Website controller and then removed it from the domain controller. Then I rejoined it to the domain. After that I tried to access it by name from the management server and the website controller, for example \\FRONTEND\C$. Then I added the Front end server first and the Web Worker server afterwards, and now the worker gets its connection string properly and is added with "Ready" status.
Hope this will help someone.
Thank you very much for pointing this out. It had been hectic for more than a week.
Regards,
Dharmendra

Teamcity - Selenium Grid environment - Unable to connect to the remote server

I am running a small Selenium project using a grid setup. All is fine when I run it locally, with the hub and the node running on the same machine. Next, I kept the hub running on my local machine and tried running the tests through TeamCity. I presume the tests run on one of the build agents. When I kicked off the build job, I got the following exception. I am not sure what the issue is.
Test(s) failed. System.Net.WebException : Unable to connect to the remote server ---->
System.Net.Sockets.SocketException : A connection attempt failed because the connected
party did not properly respond after a period of time, or established connection failed
because connected host has failed to respond 192.168.6.121:80 at
System.Net.HttpWebRequest.GetResponse()
Also, the TeamCity build agent IP is 192.168.7.132, whereas the message says it is unable to contact 192.168.6.121.
Is there something happening that I am not aware of?
The Selenium server is up and running on my local machine, and all is fine at the server end.
Any pointers would be very helpful.