single, double quotes, or tokens? - yacc

Online tutorials on writing yacc code use single quotes for semicolons and other characters:
';' '+' '-' (etc)
However, when using:
'<' or '>'
I got errors until I changed it to double quotes:
"<" or ">"
Similarly,
'>=' '=<' '==' '!='
do not seem to be the same as
">=" "=<" "==" "!="
How does yacc treat single quotes? Double quotes?
And when should tokens be used instead of putting things in quotes?
i.e.: '!=' vs "!=" vs TOKNOTEQUALS

You can use either ' or " around literals -- they're equivalent. HOWEVER, in general you can only put a single character between the quotes and get a sensible result -- a parser that accepts that single-character token. Putting multiple characters in the quotes gives you a single token, but there's no way for your lexer to return that token, so it's not useful. For multi-character operators such as !=, declare a named token (for example TOKNOTEQUALS) and have the lexer return it.


How to get only the numbers that start with a $ in Kotlin

I scanned a document into Kotlin and it has words, numbers, values, etc., but I only want the values that start with a $ and have 2 decimal places after the . (i.e. the prices). Do I use a combination of substring and other string parsing?
Edit: I have looked into Regex, and the problem I am having now is with this line:
val reg = Regex("\$([0-9]*\.[0-9]*)")
It is meant to grab all the prices, but the \. portion is flagged as an invalid escape. In other languages this works just fine.
You have to use a double backslash (\\) instead of a single one (\). The backslash is an escape character both in Regex and in Kotlin/Java strings, so when \ appears in a String, Kotlin expects it to be followed by a character that needs escaping. But you aren't trying to escape a String character; you're trying to escape a Regex character. So you have to escape the backslash itself with another backslash, so that the backslash becomes part of the resulting String and can be understood by Regex.
You also need a double backslash (\\) before your dollar sign for it to behave correctly. Technically, I think it should be a triple backslash (\\\$), because $ is a special character in both Kotlin and Regex and you want to escape it in both. However, Kotlin seems smart enough to guess what you're trying to do with a double escape if no variable name or expression follows the dollar sign. Rather than rely on that, I would use the triple escape:
val reg = Regex("\\\$([0-9]*\\.[0-9]*)")
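For comparison, here is a minimal sketch of the same idea in Python rather than Kotlin. Python raw strings pass backslashes through to the regex engine unchanged, so only the Regex-level escaping is needed. The pattern is tightened to require exactly two decimal places, as the question asks; the names PRICE, find_prices and the sample text are made up for illustration.

```python
import re

# Raw string: backslashes reach the regex engine as written.
# \$ matches a literal dollar sign, \. a literal dot, {2} requires exactly
# two digits after the dot, and (?![0-9]) rejects a third decimal digit.
PRICE = re.compile(r"\$[0-9]+\.[0-9]{2}(?![0-9])")


def find_prices(text):
    """Return every substring that looks like a price such as $12.99."""
    return PRICE.findall(text)


# Hypothetical sample input, not taken from the question.
sample = "Total 3 items: $4.50, $12.999, 7.25 and $100.00"
print(find_prices(sample))
```

Only $4.50 and $100.00 qualify here: $12.999 has three decimal places and 7.25 has no dollar sign.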

What regular expression characters have to be escaped in SQL?

To prevent SQL injection attacks, the book "Building Scalable Web Sites" has a function to replace regular expression characters with their escaped versions:
function db_escape_str_rlike($string) {
    return preg_replace("/([().\[\]*^\$])/", '\\\$1', $string);
}
Does this function escape ( ) . [ ] * ^ $? Why are only those characters escaped in SQL?
I found an excerpt from the book you mention, and found that the function is not for escaping to protect against SQL injection vulnerabilities. I assumed it was, and temporarily answered your question with that in mind. I think other commenters are making the same assumption.
The function is actually about escaping characters that you want to use in regular expressions. There are several characters that have special meaning in regular expressions, so if you want to search for those literal characters, you need to escape them (precede with a backslash).
This has little to do with SQL. You would need to escape the same characters if you wanted to search for them literally using grep, sed, perl, vim, or any other program that uses regular expression searches.
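To make the point concrete, here is a rough Python equivalent of the PHP function from the question. It escapes the same eight characters and nothing else; the function name escape_regex_chars is invented for this sketch. Python's standard library also offers re.escape, which covers the other metacharacters (such as + and ?) that the hand-written version leaves alone.

```python
import re


def escape_regex_chars(text):
    """Put a backslash in front of ( ) . [ ] * ^ $, like the PHP function."""
    # In the replacement, \\ is a literal backslash and \1 is the matched character.
    return re.sub(r"([().\[\]*^$])", r"\\\1", text)


# Unescaped, the dot matches any character, so "1.5" also matches "115".
print(re.search("1.5", "115") is not None)
# Escaped, the dot is literal and "115" no longer matches.
print(re.search(escape_regex_chars("1.5"), "115") is not None)
```

In new code, re.escape is probably the safer choice, since it does not depend on remembering every special character.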
Unfortunately, which characters are special inside SQL string literals is an open issue. Each database vendor has its own rules (notably Oracle's MySQL, which uses \ escape sequences).
The standard SQL way to escape a ', the string delimiter used for values, is to double it, as in ''.
That should be the only way to ensure transparency in SQL statements, and the only way to put a literal ' into a string. As soon as a vendor accepts \' as a synonym for a quote, you have to support all the extra escape sequences when delimiting strings. Suppose you have:
'Mac O''Connor' (which should become the string "Mac O'Connor")
and assume that doubling is the only way to escape a '. Then, when you see a ', you check the next character and either:
you get another ', and you convert the '' into a single '; or
you get anything else, and you terminate the string literal and process that character as the first of the next token.
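The two cases above can be sketched as a small scanner. This is an illustrative Python version that assumes doubling is the only escape mechanism; the function name read_sql_literal and its return convention are made up for the example.

```python
def read_sql_literal(source, start=0):
    """Scan a string literal that begins at source[start].

    Assumes '' is the only escape. Returns the decoded value and the
    index just past the closing quote.
    """
    if source[start] != "'":
        raise ValueError("expected an opening quote")
    chars = []
    i = start + 1
    while i < len(source):
        ch = source[i]
        if ch != "'":
            # Ordinary character: copy it through.
            chars.append(ch)
            i += 1
        elif source[i + 1:i + 2] == "'":
            # Doubled quote: emit one quote and skip both characters.
            chars.append("'")
            i += 2
        else:
            # Lone quote: the literal ends here.
            return "".join(chars), i + 1
    raise ValueError("unterminated string literal")


value, end = read_sql_literal("'Mac O''Connor' AND x = 1")
print(value, end)
```

The returned index lets the caller resume scanning at the first character of the next token.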
But if you also accept \ as an escape, then you have to check for \', for \\', for \\\' (this last one should be converted to \' on input), and so on. You can run into trouble if you don't detect special cases such as:
\'' (should the '' be processed as SQL mandates, or does the \ escape the first ' so that the second is the closing quote?)
\\'' (should the \\ be converted into a single \ and the next ' terminate the string, or do we switch to the SQL encoding and treat '' as a single quote?)
etc.
Check your database documentation to see whether \ escapes only affect the encoding of special characters (control characters and the like) or also change how the quote character is interpreted; if they do not, you have to escape ' by doubling it.
That is why vendors provide functions that escape and unescape character literals for values to be embedded in a SQL statement. Attackers put escape sequences into the data they post to you, hoping you do not handle them properly, so that they can alter the text of the SQL command, for example by adding a semicolon ; followed by a complete SQL statement of their own that gives them free access to your database.

What is the difference between double and single quotes in pig?

I always thought that '' and "" were the same in pig, but today I got the
Unexpected character '"'
error on
register datafu-pig-1.2.1.jar
define Coalesce datafu.pig.util.Coalesce;
...
Coalesce(x,"a")
while
Coalesce(x,'a')
works just fine.
So, what is the difference between single and double quotes?
Pig doesn't support double quotes for string literals (i.e. chararray). Every chararray constant must be enclosed in single quotes.
A chararray is a string or character array; chararrays are represented in interfaces by java.lang.String.
Constant chararrays are expressed as string literals with single quotes, for example, 'fred'.
Reference: http://chimera.labs.oreilly.com/books/1234000001811/ch04.html#scalar_types

ANTLR String LEXER token

I am trying to write a STRING lexer token. My problem is that, apart from \n, \r and \t,
any escaped character is itself (for example, \c is c). That being said, I have the following examples:
"This is a valid \
string."
"This is
not valid."
"This is al\so a valid string"
After searching the internet to no avail, I concluded that I must use an #after clause. Unfortunately, I don't understand how to do this. If I am not mistaken, I can't use a syntactic predicate because this is a lexer rule, not a parser rule.
How about something like this:
STRING
: '"' ( '\\' ('\\'|'\t'|'\r\n'|'\r'|'\n'|'"') | ~('\\'|'\t'|'\r'|'\n'|'"') )* '"'
;
where '\\' ('\\'|'\t'|'\r\n'|'\r'|'\n'|'"') is an escaped backslash, tab, line break or quote, and ~('\\'|'\t'|'\r'|'\n'|'"') matches any character other than a backslash, tab, line break or quote.
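To check a rule of this shape against the three examples in the question, here is a Python regex sketch. It is a variant of the grammar above, not a translation of it: it assumes a backslash may be followed by any character at all (including a line break), which is what the question asks for, and it rejects an unescaped line break inside the string. The names STRING and is_valid_string are made up for this sketch.

```python
import re

# Opening quote, then any number of:
#   a backslash followed by CRLF or by any single character, or
#   any character that is not a quote, backslash, CR or LF;
# then the closing quote.
STRING = re.compile(r'"(?:\\(?:\r\n|[\s\S])|[^"\\\r\n])*"')


def is_valid_string(text):
    """True if text is exactly one string literal under the assumed rules."""
    return STRING.fullmatch(text) is not None


print(is_valid_string('"This is a valid \\\nstring."'))      # escaped line break
print(is_valid_string('"This is\nnot valid."'))              # bare line break
print(is_valid_string('"This is al\\so a valid string"'))    # \s is just s
```

Turning the matched text into its value (dropping the backslashes and mapping \n, \r and \t) would be a separate step, for example in a lexer action.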

what characters should be escaped in sql string parameters

I need a complete list of characters that should be escaped in sql string parameters to prevent exceptions. I assume that I need to replace all the offending characters with the escaped version before I pass it to my ObjectDataSource filter parameter.
No, the ObjectDataSource will handle all the escaping for you. Any parametrized query will also require no escaping.
As others have pointed out, in 99% of the cases where someone thinks they need to ask this question, they are doing it wrong. Parameterization is the way to go. If you really need to escape yourself, try to find out if your DB access library offers a function for this (for example, MySQL has mysql_real_escape_string).
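As an illustration of why parameterization removes the need for escaping, here is a small Python sketch using the standard sqlite3 module in place of ObjectDataSource. The table name company, the column legal_name and the helper find_companies are hypothetical. The value is passed separately from the SQL text, so an apostrophe in it is treated as data.

```python
import sqlite3

# In-memory database with one hypothetical table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (legal_name TEXT)")
conn.execute("INSERT INTO company VALUES (?)", ("Mac O'Connor",))


def find_companies(connection, pattern):
    """Return legal names matching a LIKE pattern, bound as a parameter."""
    cursor = connection.execute(
        "SELECT legal_name FROM company WHERE legal_name LIKE ?",
        (pattern,),  # the driver binds this value; no manual escaping
    )
    return [row[0] for row in cursor.fetchall()]


# The apostrophe needs no doubling because it never becomes part of the SQL text.
print(find_companies(conn, "%O'Connor"))
# A classic injection attempt is just an unmatched search string here.
print(find_companies(conn, "x' OR '1'='1"))
```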
SQL Books online:
Search for String Literals:
String Literals
A string literal consists of zero or more characters surrounded by quotation marks. If a string contains quotation marks, these must be escaped in order for the expression to parse. Any two-byte character except \x0000 is permitted in a string, because the \x0000 character is the null terminator of a string.
Strings can include other characters that require an escape sequence. The following table lists escape sequences for string literals.
\a        Alert
\b        Backspace
\f        Form feed
\n        New line
\r        Carriage return
\t        Horizontal tab
\v        Vertical tab
\"        Quotation mark
\\        Backslash
\xhhhh    Unicode character in hexadecimal notation
Here's a way I used to get rid of apostrophes. You could do the same thing with other offending characters that you run into. (example in VB.Net)
Dim companyFilter = Trim(Me.ddCompany.SelectedValue)
If (Me.ddCompany.SelectedIndex > 0) Then
    ' Double each apostrophe so it is read as a literal quote inside the filter string
    filterString += String.Format("LegalName like '{0}'", companyFilter.Replace("'", "''"))
End If
Me.objectDataSource.FilterExpression = filterString
Me.displayGrid.DataBind()