How can I change the pattern in PHP? - sql

I got this pattern, used for searching in MySQL, from a third-party JavaScript. I'm assuming it's an HTML pattern that can be converted. Does anyone have any idea about this?
pattern A%5B%D9%8B-%D9%93%D9%B0%5D*L%5B%D9%8B-%D9%93%D9%B0%5D*L%5B%D9%8B-%D9%93%D9%B0%5D*

It looks URL-encoded and has some Unicode characters in it.
A[ً-ٰٓ]*L[ً-ٰٓ]*L[ً-ٰٓ]*

It looks like a query string whose values are URL-encoded.
See http://www.blooberry.com/indexdot/html/tagpages/text.htm for more information.
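To check what the pattern contains, it can be percent-decoded. The sketch below uses Python's standard urllib.parse.unquote rather than PHP (where urldecode() plays the same role); the variable names are illustrative only.

```python
from urllib.parse import unquote

encoded = "A%5B%D9%8B-%D9%93%D9%B0%5D*L%5B%D9%8B-%D9%93%D9%B0%5D*L%5B%D9%8B-%D9%93%D9%B0%5D*"

# unquote() turns each %hh escape back into a byte and decodes the result as UTF-8
decoded = unquote(encoded)

# List the code points found inside the first pair of square brackets
inside = decoded[decoded.index("[") + 1 : decoded.index("]")]
code_points = [f"U+{ord(ch):04X}" for ch in inside]

print(decoded)
print(code_points)
```

The bracket holds the range U+064B to U+0653 plus U+0670, which are Arabic combining marks, so the decoded text appears to be a regular expression that allows optional diacritics after each letter rather than an HTML pattern.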

Related

.NET Entity Framework: What is the Correct Escape Character to Pass a URL via the SQL Method?

I think this may be a pretty basic question, but I can't word it specifically to generate a helpful answer.
I'm going to pass some data via Entity Framework migration to an existing table with Sql(). The column, Url, takes strings. What escape character(s) do I need to use to pass the Url Value as a string?
In other words, in the code below, do I need to use any escape characters in conjunction with http://www.someurl.com?
Thank you!
Sql("INSERT INTO Videos (Name, Url, VideoGenreId, ArtistId) VALUES ('SomeName', 'http://www.someurl.com', 1, 1)");
Answered my own question: the data was passed to the database without any escape characters and with no errors.
I did not anticipate that :)
Thanks.
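This matches how SQL string literals generally behave: inside single quotes, the only character that normally needs escaping is the single quote itself, so a plain URL goes through unchanged. A minimal sketch of that behaviour, using Python's built-in sqlite3 module and an in-memory table as stand-ins for the Entity Framework migration and the real database (both are assumptions, as is the second URL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Videos (Name TEXT, Url TEXT, VideoGenreId INTEGER, ArtistId INTEGER)"
)

# The statement from the question: ':', '/' and '.' have no special meaning
# inside a single-quoted SQL string, so nothing needs escaping
conn.execute(
    "INSERT INTO Videos (Name, Url, VideoGenreId, ArtistId) "
    "VALUES ('SomeName', 'http://www.someurl.com', 1, 1)"
)

# Hypothetical URL containing a single quote: bind it as a parameter
# instead of escaping it by hand
tricky_url = "http://www.someurl.com/?q=it's"
conn.execute(
    "INSERT INTO Videos (Name, Url, VideoGenreId, ArtistId) VALUES (?, ?, ?, ?)",
    ("OtherName", tricky_url, 1, 1),
)

urls = [row[0] for row in conn.execute("SELECT Url FROM Videos ORDER BY rowid")]
print(urls)
```

Binding values as parameters avoids the escaping question altogether when the data is not a fixed literal.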

System.Web.HttpUtility.UrlEncode method gives wrong result with different language value

I am using the System.Web.HttpUtility.UrlEncode method in my project. When I encode a name in English, I get the correct result. For example,
string temp = System.Web.HttpUtility.UrlEncode("Jewelry");
then I get the exact same text in the temp variable. But if I write the name in Russian, I get a different result.
string temp = System.Web.HttpUtility.UrlEncode("ювелирные изделия");
then the temp variable contains "%d1%8e%d0%b2%d0%b5%d0%bb%d0%b8%d1%80%d0%bd%d1%8b%d0%b5+%d0%b8%d0%b7%d0%b4%d0%b5%d0%bb%d0%b8%d1%8f"
Can anyone help me how to achieve exact name as per language?
Thank you!
Actually, the method has "done the right thing" for you!
It encodes non-ASCII characters so that the value is valid in all cases and can be transmitted over the Internet. If you put your temp variable in a URL as a parameter, you will get the correct result on the server side. That is what UrlEncode is meant for, so there is no actual problem here.
Please have a look at this link for further reading on URL encoding: http://www.w3schools.com/tags/ref_urlencode.asp
If you enter that Russian phrase in the "URL Encoding Functions" part of that page, it will return the same result as the System.Web.HttpUtility.UrlEncode method does.
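The round trip described above can be sketched as follows. This uses Python's urllib.parse.quote_plus and unquote_plus as stand-ins for the .NET UrlEncode/UrlDecode pair, which is an assumption made for illustration; Python emits uppercase hex digits where the output in the question is lowercase, and the two forms are equivalent.

```python
from urllib.parse import quote_plus, unquote_plus

name = "ювелирные изделия"

# Each non-ASCII character becomes its UTF-8 bytes written as %hh; the space becomes '+'
encoded = quote_plus(name)

# Decoding on the receiving side restores the original text
decoded = unquote_plus(encoded)

print(encoded)
print(decoded == name)
```

Plain ASCII input such as "Jewelry" contains nothing that needs encoding, which is why it comes back unchanged.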
Can anyone help me how to achieve exact name as per language?
In short: not with that method, but it might depend on what your exact goal is.
In detail:
In general, URIs as defined by RFC 3986 (see Section 2: Characters) may contain any of the following characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=. Any other character needs to be percent-encoded (%hh).
This is why UrlEncode produces
UrlEncode("Jewelry") -> "Jewelry"
UrlEncode("ювелирные изделия") -> "%d1%8e%d0%b2%d0%b5%d0%bb%d0%b8%d1%80%d0%bd%d1%8b%d0%b5+%d0%b8%d0%b7%d0%b4%d0%b5%d0%bb%d0%b8%d1%8f"
The string "ювелирные изделия" contains characters that are not allowed in a URL as per RFC 3986.
Today, modern browsers can work with UTF-8 in URLs, so it might not be necessary to use UrlEncode(). See example: http://jsfiddle.net/ybgt96ms/

Regex to replace asterisk characters with html bold tag

Does anyone have a good regex to do this? For example:
This is *an* example
should become
This is <b>an</b> example
I need to run this in Objective C, but I can probably work that bit out on my own. It's the regex that's giving me trouble (so rusty...). Here's what I have so far:
s/\*([0-9a-zA-Z ])\*/<b>$1<\/b>/g
But it doesn't seem to be working. Any ideas? Thanks :)
EDIT: Thanks for the answer :) If anyone is wondering what this looks like in Objective-C, using RegexKitLite:
NSString *textWithBoldTags = [inputText stringByReplacingOccurrencesOfRegex:@"\\*([0-9a-zA-Z ]+?)\\*" withString:@"<b>$1<\\/b>"];
EDIT AGAIN: Actually, to encompass more characters for bolding I changed it to this:
NSString *textWithBoldTags = [inputText stringByReplacingOccurrencesOfRegex:@"\\*([^\\*]+?)\\*" withString:@"<b>$1<\\/b>"];
Why don't you just do \*([^*]+)\* and replace it with <b>$1</b> ?
You're only matching one character between the *s. Try this:
s/\*([0-9a-zA-Z ]*?)\*/<b>$1<\/b>/g
or to ensure there's at least one character between the *s:
s/\*([0-9a-zA-Z ]+?)\*/<b>$1<\/b>/g
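The difference between these patterns can be checked quickly. The sketch below uses Python's re module rather than sed or RegexKitLite, so the replacement is written with \1 instead of $1; the second sample sentence is made up for illustration.

```python
import re

text = "This is *an* example"

# Original attempt: the group matches exactly one character, so "*an*" is not matched
one_char = re.sub(r"\*([0-9a-zA-Z ])\*", r"<b>\1</b>", text)

# With +? the group matches one or more characters, as few as possible
lazy = re.sub(r"\*([0-9a-zA-Z ]+?)\*", r"<b>\1</b>", text)

# The negated class accepts anything except an asterisk, including punctuation
negated = re.sub(r"\*([^*]+)\*", r"<b>\1</b>", "Mix *bold, text!* here")

print(one_char)
print(lazy)
print(negated)
```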
I wrote a slightly more complex version that ensures the asterisk is always at the boundary so it ignores hanging star characters:
/\*([^\s][^\*]+?[^\s])\*/
Test phrases with which it works and doesn't:
This one regexp works for me (JavaScript)
x.match(/\B\*[^*]+\*\B/g)

What is the impact of escaped characters on SEO-friendly URLs?

I have a site that displays products - in the simplest sense the url of the page for a particular product is:
site.com/products/manufacturer_model - so for example if I was displaying a Dell Latitude D700 laptop my URL would look like:
site.com/products/dell_latitude_d700
I have a number of products that contain characters that I would need to URL escape - so for example a Dell Latitude 12?34. Obviously I cannot include the '?' character in the URL. For the purpose of being SEO-friendly - should I ignore that character? e.g.
site.com/products/dell_latitude_1234
Or should I escape it? e.g.
site.com/products/dell_latitude_12%3F34
Seems like escaping it would be the most logical approach - but do crawlers understand this?
Well, using "_" is not so friendly to users, so I think using "-" is better (check the SEOmoz beginner's guide).
Also, you may want to check which characters really need escaping according to RFC 3986. If you are using PHP, check out the urlencode function page at php.net. I wrote a function to make this updated conversion a few months ago ;)
But getting back to your main question: do use escaped characters (when needed per RFC 3986) when writing your URLs. It is the safe path to avoid getting stuck or penalized.
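As a rough illustration of escaping only what needs it, the sketch below builds the product path from the question with Python's urllib.parse.quote. The product_path helper and its lowercasing and underscore rules are hypothetical and only mirror the URLs shown in the question; this is not the PHP function mentioned above.

```python
from urllib.parse import quote, unquote

def product_path(manufacturer, model):
    """Hypothetical helper: build /products/<manufacturer>_<model> with percent-escapes."""
    slug = f"{manufacturer}_{model}".lower().replace(" ", "_")
    # safe="" escapes every reserved character; letters, digits and "_.-~" stay as they are
    return "/products/" + quote(slug, safe="")

path = product_path("Dell", "Latitude 12?34")
print(path)
print(unquote(path))
```

The escaped form keeps the "?" recoverable on the server side, whereas dropping the character loses the distinction between "12?34" and "1234".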

Rails ActiveRecord: Inserting text containing unprintable/weird characters

I am inserting some text scraped from the web into my database. Some of the fields in the string have unprintable/weird characters. For example,
if text is "C__O__?__P__L__E__T__E",
then the text in the database is stored only as "C__O__"
I know about h(), strip_tags(), sanitize, etc. But I do not want to sanitize this SQL. ActiveRecord logs the SQL correctly, and when run in phpMySQL, the query is executed correctly. Something happens between the SQL query being generated and it being executed.
Help is much appreciated.
Just replace the question mark in the string with a string containing a question mark; I haven't found any other way either:
["C__O__?__P__L__E__T__E", '?']
works perfectly.
Can you escape the question mark using "\?"?
Hmm, using CGI escape, I found out that the character coming into the system is not what I expected it to be. It is not a real question mark (%3F) but a different byte (%D5) that is merely displayed as a question mark.
C__%D5__M__P__L__%80___T__%80__
C__%3F__M__P__L__%3F___T__%3F__
Eventually I gsubbed out the non-printable characters before saving.
gsub(/[^[:print:]]/, '')
Only after removing the invalid characters from my string was I able to save the item properly.
None of the other solutions worked, partially because the problem was not understood clearly upfront.
I know this is way late, but I ran into the same problem when we were trying to process a file as UTF-8 that actually used the ISO-8859-1 character encoding. I suspect you had a similar issue in your scraping where you assumed the wrong encoding and it ended up causing things to fail.
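The encoding mismatch suggested here can be reproduced with the escaped string from the question. The following is only a sketch in Python, not the original Rails code: the decode_scraped helper is hypothetical, and falling back to ISO-8859-1 is an assumption about the real source encoding.

```python
from urllib.parse import unquote_to_bytes

# Raw bytes recovered from the CGI-escaped string shown in the question
raw = unquote_to_bytes("C__%D5__M__P__L__%80___T__%80__")

def decode_scraped(data, fallback="iso-8859-1"):
    """Hypothetical helper: try UTF-8 first, then fall back to a single-byte encoding."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # 0xD5 and 0x80 do not form valid UTF-8 sequences here, so this branch runs
        return data.decode(fallback)

text = decode_scraped(raw)

# Rough counterpart of gsub(/[^[:print:]]/, ''): drop non-printable characters
cleaned = "".join(ch for ch in text if ch.isprintable())

print(repr(text))
print(cleaned)
```

Decoding with the right encoding keeps the accented character, while the filter only removes the control characters that cannot be displayed.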