My webmaster tools list lots of (malicious) inbound links containing invalid characters:
%EE%80%80
and
%EE%80%81
so my links often look like this:
domain dot com/pdf/%EE%80%80sometext%EE%80%81moretext.pdf
or
domain dot com/pdf/sometext%EE%80%80moretext%EE%80%81moretext.pdf
Edit: I need to remove these invalid characters to get:
domain dot com/pdf/sometextmoretext.pdf
or
domain dot com/pdf/sometextmoretextmoretext.pdf
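To make the wanted transformation concrete, here is a minimal Python sketch (not an .htaccess solution). %EE%80%80 and %EE%80%81 are the UTF-8 encodings of the private-use code points U+E000 and U+E001. The function name and the sample path are only illustrative, and the sketch assumes these two sequences are the only ones to strip.

```python
from urllib.parse import quote, unquote

# Assumption: only these two private-use code points need to be removed.
UNWANTED = ("\ue000", "\ue001")


def strip_private_use_markers(path: str) -> str:
    """Return the path with U+E000 and U+E001 removed (hypothetical helper)."""
    decoded = unquote(path)  # "%EE%80%80" decodes to "\ue000"
    for ch in UNWANTED:
        decoded = decoded.replace(ch, "")
    # Re-encode in case other characters in the path still need escaping.
    return quote(decoded)


cleaned = strip_private_use_markers(
    "/pdf/sometext%EE%80%80moretext%EE%80%81moretext.pdf"
)
print(cleaned)
```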
I'm having some issues with a few 301 redirects in htaccess. The original filenames/URLs were given special characters that I'm not quite sure how to properly escape. The URLs are structured like:
company%E2%80%99s-person-of-interest-aman%E2%80%99s-most-prestigious-%E2%80%9Cacademy-of-leaders-award%E2%80%9D
which equates to:
company’s-person-of-interest-aman’s-most-prestigious-“academy-of-leaders-award”
I've tried some things like
company\'-person-of-interest-aman\'s-most-prestigious-\"Cacademy-of-leaders-award\"
but that didn't work. What am I missing?
The ’ in your URLs is a non-ASCII character, percent-encoded as its UTF-8 bytes (%E2%80%99). It doesn't equate to \' or \" on the server side because ' and ’ are different characters according to the encoding spec. You could do one of two things:
1) You could simply rename the files, substituting the ASCII compatible characters for the UTF-8 ones
2) Use the percent encoded values in your redirect string directly.
Instead of
company\'-person-of-interest-aman\'s-most-prestigious-\"Cacademy-of-leaders-award\"
do
company%E2%80%99s-person-of-interest-aman%E2%80%99s-most-prestigious-%E2%80%9Cacademy-of-leaders-award%E2%80%9D
EDIT: while writing the answer, I also realized that your original expression for the redirect url isn't quite matching up even if your characters were ASCII:
company\'-person-of-interest-aman\'s-most-prestigious-\"Cacademy-of-leaders-award\"
should be
company\'s-person-of-interest-aman\'s-most-prestigious-\"academy-of-leaders-award\"
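A quick way to check which characters the percent-encoded bytes stand for is to decode them. The sketch below uses Python's urllib purely as an illustration; the variable names are made up, and it says nothing about how Apache handles the redirect itself.

```python
from urllib.parse import quote, unquote

encoded = "company%E2%80%99s"
decoded = unquote(encoded)  # percent-decoding assumes UTF-8 by default

# The character after "company" is U+2019, not the ASCII apostrophe U+0027.
apostrophe = decoded[7]
print(hex(ord(apostrophe)))  # code point of the decoded character
print(apostrophe == "'")     # compares against the ASCII apostrophe

# Going the other way gives the values to paste into the redirect string.
right_single = quote("\u2019")
left_double = quote("\u201c")
right_double = quote("\u201d")
print(right_single, left_double, right_double)
```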
I have changed web platforms and have old URLs that I cannot and do not want to match on the new platform where the old content is now living.
I have an array of old product URLs that all have '-p-' in the URL, followed by a string of numbers and ending in .html (osCommerce platform URLs).
I would like to know how to rewrite:
/x/[rest-of-url]-p-[random numbers].html
to
/x/[rest-of-url]
I would like the end result to look something like this:
http://www.shop.com/shop/versace-black-snakeskin-pony-hair-hobo-p-2214.html
redirects to:
http://www.shop.com/shop/versace-black-snakeskin-pony-hair-hobo
Does anyone know if this is doable in the htaccess file as a rewrite rule?
My managed hosting service provider, BeepWeb, answered my question.
RewriteRule ^/?shop/(.*)-p-(.*)\.html$ http://www.shop.com/product/$1/ [R=302]
The first argument is the pattern matched against the requested URI path. Each (.*) matches any run of characters. The second argument is the destination URL. The $1 corresponds to the first (.*), $2 would be the second (.*), and so on. The [R=302] tells the rewrite to be a 302 redirect (use [R=301] for a 301 redirect). The /? makes the leading slash optional, because in an .htaccess file the path is matched without it, and \. matches a literal dot rather than any character.
Using (.*) is essentially like using a wildcard. You can instead narrow this down by specifying which characters you want to match as opposed to all characters (instead of using (.*) you could use ([abc]*), which would match only a, b and c characters).
Also, be careful that you do not match other URLs unintentionally (i.e. you need to make sure that the pattern matches are unique to the URLs being rewritten).
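To see how the capture groups behave, and why an overly broad pattern can catch URLs it shouldn't, the rule can be mimicked with Python's re module. This is only a sketch: the function names and the /shop/about-p-us.html path are invented for illustration, the dot before html is escaped, and Apache's handling of the leading slash in .htaccess files is not modelled.

```python
import re
from typing import Optional

# Same shape as the rule: two wildcard groups around "-p-".
BROAD = re.compile(r"^/shop/(.*)-p-(.*)\.html$")
# Narrower variant: the second group must be digits only.
NARROW = re.compile(r"^/shop/(.*)-p-([0-9]+)\.html$")


def redirect_target(pattern: "re.Pattern[str]", path: str) -> Optional[str]:
    """Return the destination URL, or None if the path does not match."""
    match = pattern.match(path)
    if match is None:
        return None
    # group(1) plays the role of $1 in the RewriteRule substitution.
    return "http://www.shop.com/product/" + match.group(1) + "/"


product = "/shop/versace-black-snakeskin-pony-hair-hobo-p-2214.html"
other = "/shop/about-p-us.html"  # hypothetical page that is not a product

print(redirect_target(BROAD, product))
print(redirect_target(BROAD, other))   # the broad pattern also matches this
print(redirect_target(NARROW, other))  # the narrow pattern does not
```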
If you need the source reference, see the following:
https://httpd.apache.org/docs/current/rewrite/intro.html
Thanks again to http://www.beepweb.com for their detailed response.
Hope it helps others.
If a user is trying to access www.example.com/local
I want to send him to www.example.com/home if he comes from a certain IP address, and to www.example.com/work if he is not in that IP range.
What would be the best way to do that using mod_rewrite?
RewriteCond %{REMOTE_ADDR} ^123\.\123\.123\.123$
Also in the above example, what is the purpose of the backslashes and the $ sign?
I thought a backslash was an escape character, but then I'm not sure why you would be escaping the 1 in the 2nd group of digits
Thanks
(more questions in one is usually frowned upon, as I have seen it)
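Escaping a 1 does not simply give a 1, I think: a backslash followed by digits is read by the regex engine as a backreference or an octal character code, so \123 looks like a typo for 123.
$ means the end of the string,
and ^ is the start, except inside [], where it negates the character class.
The backslashes before the dots are meant to say that we look for a real . and not just any character.
You can test your regexes on your IP address ranges using, for example, regexr.com.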
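The points above can be checked with a small Python sketch. It uses Python's re module rather than the PCRE engine behind mod_rewrite, so details may differ, but the anchors and the escaped dot behave the same way. The variable names are invented for the example.

```python
import re

# Intended pattern: anchored, with every dot escaped.
strict = re.compile(r"^123\.123\.123\.123$")
# Dots left unescaped: each "." matches any character.
loose = re.compile(r"^123.123.123.123$")
# Pattern as written in the question, with a backslash before the second 123.
# Python's re reads \123 as the octal code for the letter "S".
typo = re.compile(r"^123\.\123\.123\.123$")

ip = "123.123.123.123"

print(bool(strict.match(ip)))                  # exact address matches
print(bool(strict.match("123.123.123.1234")))  # $ rejects trailing characters
print(bool(loose.match("123x123y123z123")))    # unescaped dots are too permissive
print(bool(typo.match(ip)))                    # the typo pattern misses the real address
print(bool(typo.match("123.S.123.123")))       # and matches this string instead
```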
I want to know if there are any restrictions on attribute names in Amazon's SimpleDB.
I tried the following attribute name
my.attribute.name
Running the following query
select * from mydomain where my.attribute.name is not null
results in an error: "The specified query expression syntax is not valid.".
Surrounding the name with single quotes ('my.attribute.name') also results in an error, because that is invalid select syntax.
Changing the dots to underscores makes everything work:
my_attribute_name
and the query runs fine
select * from mydomain where my_attribute_name is not null
Now my question: What are the allowed characters for attributes?
According to the Amazon developer manual, names are restricted to characters that are valid in XML documents. What exactly does this mean? The linked W3C documents do not seem to answer this. In domain names the dot "." is allowed.
Currently I use the sdbTool. I hope this doesn't affect the behaviour.
Inserting some other characters in attribute names is working, like this one: 'my:attribute-name.with other%20chars'.
Any ideas?
Can you please enclose your attribute name in back-tick quotes and try again?
Attribute and domain names may appear without quotes if they contain only letters, numbers, underscores (_), or dollar symbols ($). You must quote all other attribute and domain names with the back-tick (`).
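A small helper can apply that quoting rule when building select expressions. The sketch below is plain string handling, not part of any SimpleDB client library; quote_name is a made-up name. Two things in it are assumptions beyond the rule quoted above: names starting with a digit are quoted too (quoting more than necessary should be harmless), and a back-tick inside a name is escaped by doubling it.

```python
import re

# Names made only of letters, numbers, underscores or dollar signs.
# Assumption: a leading digit is treated as needing quotes as well.
_UNQUOTED = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def quote_name(name: str) -> str:
    """Back-tick quote an attribute or domain name when needed (hypothetical helper)."""
    if _UNQUOTED.fullmatch(name):
        return name
    # Assumption: a back-tick inside the name is escaped by doubling it.
    return "`" + name.replace("`", "``") + "`"


query = "select * from mydomain where {} is not null".format(
    quote_name("my.attribute.name")
)
print(query)
```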
Users are uploading files with names like "abc #1", "abc #2". I am uploading these files to S3. When I try to download these files I get an error like this
InvalidArgument
Header value contained an open quoted span.
I am creating the link by wrapping the file name using "Uri.EscapeUriString".
Any suggestions?
From AWS documentation:
The name for a key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.
So "abc #1" and "abc #2" are valid key names. The problem is then probably in your client code; check the documentation of your HTTP client.
AWS also warns about using special characters:
You can use any UTF-8 character in an object key name. However, using certain characters in key names may cause problems with some applications and protocols. The following guidelines help you maximize compliance with DNS, web-safe characters, XML parsers, and other APIs.
Alphanumeric characters: 0-9, a-z, A-Z
Special characters: !, -, _, ., *, ', (, )
So either restrict the set of available characters in your app to only allow the recommended ones, or fix the issue at your client level.
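If you go the route of restricting characters in your app, a check along these lines could run before upload. It is only a sketch in Python: the function names are invented, and the safe set is copied from the guideline quoted above rather than from any AWS API.

```python
import string

# Safe characters as listed in the guideline quoted above.
SAFE_CHARS = set(string.ascii_letters + string.digits + "!-_.*'()")
MAX_KEY_BYTES = 1024  # limit on the UTF-8 encoded key length


def unsafe_characters(key: str) -> list:
    """Return the sorted list of characters outside the recommended set."""
    return sorted({ch for ch in key if ch not in SAFE_CHARS})


def fits_length_limit(key: str) -> bool:
    """Check the key length in UTF-8 bytes, not in characters."""
    return len(key.encode("utf-8")) <= MAX_KEY_BYTES


print(unsafe_characters("abc #1"))     # the space and the # are flagged
print(unsafe_characters("abc-1.txt"))  # nothing flagged
```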
You should use Uri.EscapeDataString instead of Uri.EscapeUriString for 3 reasons:
Uri.EscapeUriString has been deprecated as of .NET 6 - https://learn.microsoft.com/en-us/dotnet/api/system.uri.escapeuristring?view=net-6.0
Uri.EscapeUriString can corrupt the Uri string in some cases
Uri.EscapeUriString only escapes the spaces - not the #
Uri.EscapeUriString("abc #1") returns "abc%20#1" whereas Uri.EscapeDataString("abc #1") returns "abc%20%231" which is preferable.
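The difference matters because an unescaped # starts the URL fragment, so everything after it is cut off from the path. The sketch below reproduces the two outputs with Python's urllib as an analogue; it is not the .NET API, and example.com is a placeholder host.

```python
from urllib.parse import quote, urlsplit

name = "abc #1"

# Escapes every reserved character, like the EscapeDataString output above.
data_style = quote(name, safe="")
# Leaves "#" alone, like the EscapeUriString output above.
uri_style = quote(name, safe="#")
print(data_style, uri_style)

# With the raw "#", the file name is split into a path and a fragment.
broken = urlsplit("https://example.com/" + uri_style)
intact = urlsplit("https://example.com/" + data_style)
print(broken.path, broken.fragment)
print(intact.path, intact.fragment)
```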