Schema markup for machine reading and non-existent (imaginary) URLs

Is it possible to use non-existent URLs in schema markup meant for machine reading? For example, URLs that do not exist, such as made-up subdomains:
"@id": "https://visualsubdomain.sample.com/#personlogo",
"url": "https://sample.com/app/uploads/2022/02/avatar_user_1.png",
or some other imaginary, non-existent URL?
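For context, a fuller JSON-LD block around those two properties might look like the sketch below (the @type is only an assumption for illustration; visualsubdomain.sample.com is an invented subdomain that does not resolve):
{
  "@context": "https://schema.org",
  "@type": "ImageObject",
  "@id": "https://visualsubdomain.sample.com/#personlogo",
  "url": "https://sample.com/app/uploads/2022/02/avatar_user_1.png"
}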

Related

.htaccess rewrite domain but keep directory structure and preserve url in address bar

I have copied a Joomla site from one domain to a new domain.
I want to rewrite only the domain name and keep the directory structure.
And I want to keep the original URL in the address bar to preserve SEO ranking.
Joomla uses relative URLs, so the new server's real domain name will not itself be referenced by Joomla.
How do I do this in .htaccess on Apache?
And I want to keep the original URL in the address bar to preserve SEO ranking.
That won't really help you. Just add proper 301 redirects and make sure you catch as many of the indexed URLs as possible with the redirects component within Joomla, to prevent dead links (Google hates those and will penalize your domain for them). Also add a sitemap, upload it to Google Search Console (Webmaster Tools) and ask Google to index it.
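A minimal .htaccess sketch of such a 301 redirect, assuming olddomain.example and newdomain.example as placeholder names and an identical directory structure on both hosts:
# On the old domain: permanently redirect every path to the same path on the new domain
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?olddomain\.example$ [NC]
RewriteRule ^(.*)$ https://newdomain.example/$1 [R=301,L]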

hosting multiple sites on S3 bucket serving index.html from directory path

I'm new to AWS S3. I want to know if it's possible to host multiple static websites in one bucket using the website routing metadata option. I am planning to have multiple folders, each with its own index.html, but how can I configure the bucket settings to route to each individual site when a user types the address?
For example by typing
http://<bucket-name>.s3-website-<AWS-region>.amazonaws.com/folder1
will take them to website 1
and
http://<bucket-name>.s3-website-<AWS-region>.amazonaws.com/folder2
will take them to website 2
If this is possible, is there any way to also achieve the configuration using the AWS CLI?
This is possible with a slight modification to the URL. You need to use the URLs as follows with the trailing slash to serve the index.html document inside folder1 and folder2.
http://<bucket-name>.s3-website-<AWS-region>.amazonaws.com/folder1/
http://<bucket-name>.s3-website-<AWS-region>.amazonaws.com/folder2/
If you create such a folder structure in your bucket, you must have an index document at each level. When a user specifies a URL that resembles a folder lookup, the presence or absence of a trailing slash determines the behavior of the website. For example, the following URL, with a trailing slash, returns the photos/index.html index document.
Reference: Index Document Support
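As for the AWS CLI part of the question, here is a hedged sketch (bucket name and local folders are placeholders, and credentials are assumed to be configured already) of enabling website hosting and uploading each site under its own prefix:
# Enable static website hosting on the bucket, serving index.html as the index document
aws s3 website s3://<bucket-name>/ --index-document index.html --error-document error.html
# Upload each site into its own folder (key prefix)
aws s3 sync ./site1 s3://<bucket-name>/folder1/
aws s3 sync ./site2 s3://<bucket-name>/folder2/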

sitemap for multiple domains of same site

Here is the situation: I have a website that can be accessed from multiple domains, let's say www.domain1.com, www.domain2.net and www.domain3.com. The domains access the exact same code base, but depending on the domain, different CSS, graphics, etc. are loaded.
Everything works fine, but now my question is: how do I deal with the sitemap.xml?
I wrote the sitemap.xml for the default domain (www.domain1.com), but what about when the site is accessed from the other domains? The content of the sitemap.xml will contain the wrong domain.
I read that I can add multiple sitemap files to robots.txt, so does that mean I can, for example, create sitemap-domain2.net.xml and sitemap-domain3.com.xml (containing the links with the matching domains) and simply add them to robots.txt?
Somehow I have doubts that this would work, so I turn to you experts to shed some light on the subject :)
Thanks
You should use server-side code (or a rewrite rule, as sketched below) to send the correct sitemap based on the domain name for requests to /sitemap.xml.
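A minimal Apache sketch of that idea, assuming a sitemaps/ directory under the docroot with one XML file named after each hostname (similar to the robots.txt approach in the next answer):
# Serve a per-domain sitemap, e.g. sitemaps/www.domain2.net.xml
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/sitemaps/%{HTTP_HOST}.xml -f
RewriteRule ^sitemap\.xml$ sitemaps/%{HTTP_HOST}.xml [L]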
Apache rewrite rules for /robots.txt requests
If you're using Apache as a webserver, you can create a directory called robots and put a robots.txt for each website you run on that VHOST by using Rewrite Rules in your .htaccess file like this:
# URL rewrite solution for robots.txt for multiple domains on a single docroot
RewriteEngine On
# not an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# not an existing file
RewriteCond %{REQUEST_FILENAME} !-f
# and the domain-specific robots file exists
RewriteCond %{DOCUMENT_ROOT}/robots/%{HTTP_HOST}.txt -f
RewriteRule ^robots\.txt$ robots/%{HTTP_HOST}.txt [L]
Nginx mapping for /robots.txt requests
When using Nginx as a webserver (taking yourdomain1.tld and yourdomain2.tld as example domains), you can achieve the same goal as the Apache example above with the following map variable (place this outside your server directive):
map $host $robots_file {
    default /robots/default.txt;
    yourdomain1.tld /robots/yourdomain1.tld.txt;
    yourdomain2.tld /robots/yourdomain2.tld.txt;
}
This way you can use this variable in a try_files statement inside your server directive:
location = /robots.txt {
    try_files $robots_file =404;
}
Content of /robots/*.txt
After setting up the aliases to the domain-specific robots.txt-files, add the sitemap to each of the robots files (e.g.: /robots/yourdomain1.tld.txt) using this syntax at the bottom of the file:
# Sitemap for this specific domain
Sitemap: https://yourdomain1.tld/sitemaps/yourdomain1.tld.xml
Do this for all domains you have, and you'll be set!
You have to make sure the URLs in each XML sitemap match the domain/subdomain they are served from. But if you really want to, you can host all the sitemaps on one domain using "Sitemaps & Cross Submits".
I'm not an expert on this, but I have a similar situation: one domain with three subdomains. In my case each subdomain has its own directory and contains its own sitemap.xml, but I'm pretty sure the sitemap.xml can be specified per domain.
The easiest method I have found is to use an XML sitemap generator to create a sitemap for each domain name.
Place each /sitemap.xml in the root directory of the corresponding domain or subdomain.
Go to Google Search Console and create a separate property for each domain name.
Submit the appropriate sitemap for each domain in Search Console. The submission will show as successful.
I'm facing a similar situation for a project I'm working on right now, and Google Search Central actually has the following answer:
If you have multiple websites, you can simplify the process of creating and submitting sitemaps by creating one or more sitemaps that include URLs for all your verified sites, and saving the sitemap(s) to a single location. All sites must be verified in Search Console.
So it seems that, as long as you have added the different domains as properties in Google Search Console, Google at least will know how to deal with the rest, even if you upload the sitemaps for the other domains to only one of your properties in Search Console.
For my use case, I then use server side code to generate sitemaps where all the dynamic pages with English content end up getting a location on my .io domain, and my pages with German content end up with a location on the .de domain:
<url>
  <loc>https://www.mydomain.io/page/some-english-content</loc>
  <changefreq>weekly</changefreq>
</url>
<url>
  <loc>https://www.mydomain.de/page/some-german-content</loc>
  <changefreq>weekly</changefreq>
</url>
And then Google handles the rest. See docs.
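For completeness, those <url> entries have to sit inside the standard sitemap wrapper; a minimal full file (same domains as above, only the wrapper added) looks roughly like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.mydomain.io/page/some-english-content</loc>
    <changefreq>weekly</changefreq>
  </url>
  <!-- ...remaining <url> entries... -->
</urlset>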

Map multiple subdomains to same S3-bucket

Is there some way to map multiple (thousands) of subdomains to one s3-bucket?
If so is it also possible to map it to a specific path in the bucket for each subdomain?
I want test1.example.com to map to mybucket/test1 and test2.example.com to map to mybucket/test2.
I know the last part isn't possible with normal dns-records but maybe there is some nifty Route 53 feature?
It's not possible with S3 directly. You can only use 1 subdomain with an S3 bucket.
However, you can map multiple subdomains to a CloudFront distribution.
Update (thanks to @SimonHutchison's comment below)
You can now map up to 100 alternate domains to a CloudFront distribution - see http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html#limits_cloudfront
You can also use a wildcard to map any subdomain to your distribution:
Using the * Wildcard in Alternate Domain Names
When you add alternate domain names, you can use the * wildcard at the beginning of a domain name instead of specifying subdomains individually. For example, with an alternate domain name of *.example.com, you can use any domain name that ends with example.com in your object URLs, such as www.example.com, product-name.example.com, and marketing.product-name.example.com. The name of an object is the same regardless of the domain name, for example:
www.example.com/images/image.jpg
product-name.example.com/images/image.jpg
marketing.product-name.example.com/images/image.jpg
Starting in October 2012, Amazon introduced a feature to handle redirects (HTTP 301) for S3 buckets. You can read the release notes here and refer to this link for configuration via Console / API.
From AWS S3 docs :
Redirects all requests: If your root domain is example.com and you want to serve requests for both http://example.com and http://www.example.com, you can create two buckets named example.com and www.example.com, maintain website content in only one bucket, say, example.com, and configure the other bucket to redirect all requests to the example.com bucket.
Advanced conditional redirects: You can conditionally route requests according to specific object key names or prefixes in the request, or according to the response code. For example, suppose that you delete or rename an object in your bucket. You can add a routing rule that redirects the request to another object. Suppose that you want to make a folder unavailable. You can add a routing rule to redirect the request to another page, which explains why the folder is no longer available. You can also add a routing rule to handle an error condition by routing requests that return the error to another domain, where the error will be processed.
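As a hedged illustration of such a conditional rule (the key prefixes here are made up), an S3 website RoutingRules configuration can look roughly like this:
<RoutingRules>
  <RoutingRule>
    <Condition>
      <!-- requests whose key starts with this (hypothetical) prefix -->
      <KeyPrefixEquals>old-folder/</KeyPrefixEquals>
    </Condition>
    <Redirect>
      <!-- send them to the corresponding key under another prefix, with a 301 -->
      <ReplaceKeyPrefixWith>new-folder/</ReplaceKeyPrefixWith>
      <HttpRedirectCode>301</HttpRedirectCode>
    </Redirect>
  </RoutingRule>
</RoutingRules>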

Multiple domain names

I have a customer that has been on the web for some time. They bought a domain name that describes their product, and later a second, more up-to-date one. Now the company has evolved into something more general and has bought a third domain - something like:
vegetables.com (2005)
ecolo-vegetables.com (2006)
good-health-eating.com (2009)
Here are my questions:
What is the best way to get all those domains under the new name?
The new name is unknown to search engines and other sites linking in; I don't want to lose the ranking, so what is the best way to keep it?
Can I point URLs to the "best" ranked domain?
What happens to the backlinks? Which domain should they link to?
The new domain has a "-" in the name... which is really good for SEO but a little unnatural to type; should I get the no-dash version too?
N.B. It makes sense to redirect all the domains to the same one, but would you choose the oldest (with mod_rewrite) or the newest, which has no history behind it (so it doesn't yet exist anywhere in search engines)?
Another P.S. Some will tell me to redirect with .htaccess, but should I instead change the DNS to point to the newest .com? Which solution is better?
Are all three sites "Different" or do they point to the same website/content?
Use 301 Redirects to redirect your old domain names to the new domain names. If all domains are pointing to the same website, make sure you also use the Canonical Tag on all your pages.
If you 301 redirect from the old domain names / URLs, your rankings will be transferred to your new domain/pages. (The only exception may be any extra points you get from embedded keywords in your old domain names.)
You should point old URLs to your "new" URLs/domain. Rankings and link juice should/will be transferred to the new URLs/domain.
Ideally all your backlinks should update their links to the new domain, but it doesn't really matter: if the old domains are 301 redirecting to the new domain anyway, pointing to the old domain is just like pointing to the new domain.
Definitely get the no-dash version of the domain as well and just have it 301 redirect to the actual domain you want to target.
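As a hedged sketch (reusing the made-up domains from the question and assuming good-health-eating.com is the target), the .htaccess on the old and no-dash domains could 301 redirect everything like this:
# Redirect any host other than the preferred one, keeping the requested path
RewriteEngine On
RewriteCond %{HTTP_HOST} !^www\.good-health-eating\.com$ [NC]
RewriteRule ^(.*)$ https://www.good-health-eating.com/$1 [R=301,L]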
I'll give this a go.
1. You could possibly have redirects or just allow the DNS of the domain to point to the new (desired) website.
2. It's not hard to understand SEO (Search Engine Optimization) nowadays - ensuring you have the correct meta tags and other SE info will give you a big helping hand. There isn't any way of transferring SE ranks.
3. That's possible. You could have ABCDEF.COM at number 3 on google, but then set ABCDEF.COM to redirect to GHIJKL.COM.
4. If you set up redirects, and the new site has the same content as the old one, there is the possibility of setting up your DNS and your redirect to redirect to the new version of the previous page on the new website.
(I don't think I worded that very well, hope you catch my drift.)
5. Out of pure experience I'd say yes, get both. That way you can market to your customer audience as ABCDEF.com, but show to SEs as AB-CD-EF.COM.
Here is the best answer I got, from this link:
302 and 301 Redirects
When a request for a page or URL is made by a browser, agent or spider, the web server where the page is hosted checks a file called '.htaccess'. This file contains instructions on how to handle specific requests and also plays a key role in security. The '.htaccess' file can be modified so that it instructs browsers, agents or spiders that the page has either temporarily moved (302 redirect) or permanently moved (301 redirect). It is usually possible to implement this redirect without messing with the '.htaccess' file directly, using your web host's control panel instead.
From a search engine perspective, 301 redirects are the only acceptable way to redirect URLs. In the case of moved pages, search engines will index only the new URL, but will transfer link popularity from the old URL to the new one so that search engine rankings are not affected. The same behavior occurs when additional domains are set to point to the main domain through a 301 redirect.
And the last word, from this link, which just confirms what I now know:
First off, ensure you're using "301 redirects" rather than "302 redirects" or the link juice (PageRank) won't transfer to the destination URL. You can verify that 301s (not 302s) are in place by using a "server header checker" like this one. Only a 301 tells engines the previous URL has moved permanently and thus forwards the page's link equity to the new location.
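As a small hedged aside (the URL is just a placeholder built from the question's example domain), you can also verify the redirect type from the command line rather than a web-based checker:
# Fetch only the response headers; the status line should say 301 (not 302)
# and the Location header should point to the new domain
curl -sI http://www.vegetables.com/old-page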