Get FQDN from domain - selenium

this is my first question here, so I will try my best.
I am trying to get the protocol and the FQDN (fully qualified domain name) from a bunch of domains, i.e. get https://es.aliexpress.com from aliexpress.com.
I have tried Selenium webdriver, but it takes too long to compute all the domains (even with short timeouts and blocking images).
I am asking if someone knows a way to do this without loading the content, something like wget but only for the URL.
Thank you for reading.

Not really...
First of all, http and https have nothing to do with domain names. Those are transfer protocols.
Ignoring that part, what you are calling FQDN are often generated at the time you access them.
For instance, many websites redirect the browser from a desktop site to a mobile version (the typical m.something.com) based on your User Agent string. Which mean www.something.com and m.something.com are both valid answers
In the example you gave, aliexpress.com, prepended es. which means there is most likely some code on the server that reads in either your location (based on IP address) or a locale setting in your browser to direct you to the es version of the website as opposed to the en or dk version.
These changes can be done via an .htaccess file in the root folder of the website, or via back end code.
Google Chrome itself automatically tries to add www. if it looks like you typed a URL into the everything bar.
It's also possible that the URL is one giant redirect. Some websites buy up extra domain names that all redirect to their core site. So even if you input xyz.com you'll end up at abcd.com.
There is no algorithmic way to go from a base URL to what you're calling the FQDN.
P.S. Here is an article about what FQDN means.

Related

Domain URL masking

I am currently hosting the contents of a site with ProviderA. I have a domain registered with ProviderB. I want users to access the contents (www.providerA.com/sub/content) by visiting www.providerB.com. A domain forward is easy enough and works as intended, however, unless I embed the site in a frame (which is a big no-no), the actual URL reads www.providerA.com/sub/content despite the user inputting www.providerB.com.
I really need a solution for this. A domain masking without the use of a frame. I'm sure this has been done before. An .htaccess domain rewrite?
Your help would be hugely appreciated! I'm going nuts trying to find a solution.
For Apache
Usual way: setup mod_proxy. The apache on providerB becomes a client to providerA's apache. It gets the content and sends it back to the client.
But looks like you only have .htaccess. So no proxy, you need full configuration access for that.
So you cannot, see: How to set up proxy in .htaccess
If you have PHP on providerB
Setup a proxy written in PHP. All requests to providerB are intercepted by that PHP proxy. It gets the content from providerA and sends it back. So it does the same thing as the Apache module. However, depending on the quality of the implementation, it might fail on some requests, types, sizes, timeouts, ...
Search for "php proxy" on the web, you will see a couple available on GitHub and others. YMMV as to how difficult it is to setup, and the reliability.
No PHP but some other server side language
Obviously that could be done in another language, I checked PHP because that is what I use the most.
The best solution would be to transfer the content to providerB :-)

Using the '.localhost' TLD searches in browsers instead of showing the site associated with the address

According to RFC 2606 (1999) the TLD .localhost is reserved for use for testing locally.
The goal is to configure a preview site to run locally using the TLD .localhost, e.g. http://example.localhost
The problem is that when I use Chrome or Safari to access a '.localhost' TLD it searches google for example.localhost instead of treating it as a proper address. This is after configuring the hosts file to point back to 127.0.0.1.
Am I misunderstanding the usage of this reserved TLD? Is there a way to configure this to work properly?
.localhost is not an existing, delegated TLD, which is why your browser doesn't find it.
What RFC 2606 says is that .localhost (along with .test, .invalid and .example) will never be a delegated TLD, so you can safely use that name for your own, local, purposes. That is, if you want to set up a private TLD for internal use, that TLD can be safely named .localhost without the risk of a future collision with a globally assigned name.
You can add http:// first. Write http://yolo.localhost in your address bar and not yolo.localhost, then it will work.
See answers here for more information: Chrome browser doesn't like a domain with .loc TLD (for localhost domain testing) without http:// - how to fix?

How to create a friendly url in Tomcat?

I want to modify my application URL from //localhost:8080/monitor/index.html to just monitor , so that on putting monitor on browser, my application should open. Is there a way to achieve this, can someone suggest the configuration changes which will be required for this.
Can I map my short URL to the existing one may be somewhere in web.xml. I am not sure about the approach any suggestions will be great.
Thanks and regards
Deb
You're mixing up several different protocol layers in your question.
If you just enter nothing but "monitor" in the browser URL bar the browser is going to first lookup "monitor" in DNS and finding nothing it will then probably send a query to Google or your configured search engine. In the past browsers have taken other steps, such as appending ".com" and prepending "www." but I don't think modern browsers do that any more.
So far, your server is not even remotely involved.
If you're a large ISP user (TimeWarner, Comcast) and use their DNS it's also possible the ISP will intercept your failed DNS lookup and route the request to a "helpful" search page (i.e. SPAM) of their own.
At this point the request is still nowhere near your server.
I suppose you could mess with the /etc/hosts file on your local system to resolve "monitor" to the proper hostname, but that's an extremely brittle solution that has to be hard coded on each machine you want to have this "shortcut" link (and which breaks when the hostname changes).
You're much better off just setting up a web shortcut in your browser that points to the right place.

301 web forwarding on main domain - Can subdomain point somewhere else?

Excuse the potential noobishness of the question, but I'm, well, a bit of a noob when it comes to this domain architecture lark. If "domain architecture" is even the technical term. Anyway, I digress...
So, I've googled this question , but I can't see the answer I'm looking for (maybe it doesn't exist, who knows!?) The situation is that I host a .com top level domain which does a 301 forward to another site on the net not hosted by me. Can I set up a subdomain that then points somewhere else whether that be on my host itself or just some other site elsewhere on the net?
Essentially, if I set up a subdomain, will it too inherit the web forwarding, and if so, can I directly affect where that subdomain points?
Any answers gratefully appreciated!
Before I try to answer your question, let me be a little fussy :)
First thing first you are confusing and mixing together two different protocols ([DNS] and [HTTP]), actually there is even a dedicated page to the Wikipedia for HTTP 301 responses: http://en.wikipedia.org/wiki/HTTP_301 (but you should read the whole shebang: ([Wikipedia, search for HTTP] is always a good start, and the [RFC 2616] is absolutely a must, IETF RFCs are not easy reading but the Internet is built on them).
DNS is used to translate a name, like www.example.com into an IP address, like 192.168.0.1, in order to locate a machine on the Internet. So DNS is involved as the one of the very first steps a browser takes in order to resolve an URL: but once the "machine name" is translated by the separate DNS Service, and it has become an IP address, DNS job is over and it is used/involved no more.
Then when a browser, using HTTP, contacts the Web Server located on that machine (in this example the machine www.example.com, which the DNS Service has kindly translated to an IP address, in our example 192.168.0.1, because the operating system can only use an IP address as the argument for an [internet socket]) and only at that moment the web server, instead of serving a page, answers whith an "error" code (which, actually is a "response header" with a numeric code that does not start with "2").
Only that this error code is actually used to tell something else: that the browser should try again an "HTTP request", this time connecting to another machine (and, as long as this redirection is "permanent" instead of "temporary" ([HTTP_307]), the new address should be remembered by the browser, its cache and history).
So, if you can setup [redirection response header] on the first machine, it means that there is a Web Server on that first machine that is programmed (given a certain URL pattern) to spit out a Redirection Header, and as long as you can control these redirections, you can as well send the browser wherever you want, not merely sending them to another machine on the Internet but to another URL as well, even on the same website (actually this is the original intended use of code 301, as a measure against [link rot]).
Basically you are free to do whatever you want, or better, to send them wherever you want.
The pros are obvious... the cons are that you must have control over the first web server, and that the visiting browsers will have to perform two "GET request" in order to land at the intended page (this is not grim as it looks, since the [RFC 2616] suggests that the browser (they call it User Agent) caches and remembers the redirection (because it is
permanent)).
Disclaimer: I am being prevented to post hyperlinks, but they where basically all from the Wikipedia so, if you will, you can look the words in brackets "[...]" on the Wikipedia...

Mask redirect to temporary domain with mod_rewrite

We are putting up a company blog at companyname.com/blog but for now the blog is a Wordpress installation that lives on a different server (blog.companyname.com).
The intention is to have the blog and web site both on the same server in a month or two, but that leaves a problem in the interim.
At the moment I am using mod_rewrite to do the following:
http://companyname.com/blog/article-name redirects to http://blog.companyname.com/article-name
Can I somehow keep the address bar displaying companyname.com/blog even though the content is coming from the latter blog.companyname.com?
I can see how to do this if it is on the same server and vhost, but not across a different server?
Thanks
Rather than using mod_rewrite, you could use mod_proxy to set up a reverse proxy on companyname.com, so that requests to http://companyname.com/blog/article-name are proxied (rather than redirected) to http://blog.companyname.com/article-name.
Here are more instructions and examples.
There is functionality with ZoneEdit called webforwards which could probably do this and hide what you are actually doing (unless someone looked into it).
The only thing that mod_rewrite can do is send HTTP header redirects, and those redirects (across servers) always result in the browser address bar reflecting the reality.
You should instead consider writing a 404 script that 'reflects' the blog. This would essentially be a transparent proxy, and many are already written.
The script would find if the requested page (that was 404'd) started with http://mycompany.com/blog/ . If it did, it would download and then send onto the client the blog page and associated files (probably caching them as well).
So requesting http://mycompany.com/blog/article_xyz would cause the 404 script to download and send http://blog.companyname.com/article_xyz.
It's probably more work than it's worth, but you might be able to design a simple enough 404 script that it's worthwhile.
-Adam