As the Oracle docs describe:
ESCAPE_REFERENCE Function converts a text string to its character reference counterparts for characters that fall outside the character set used by the current document.
However, why does "utl_i18n.escape_reference('啊我鵝覺〇喆','zhs16cgb231280')" return "啊我鹅觉?喆"? It seems that some other conversion is applied besides the character reference conversion.
(P.S. NLS_NCHAR_CHARACTERSET: AL16UTF16)
In my BigQuery table I have some string values that, for some reason unknown to me, show up like this:
BIQUÃ\215NI or BRASÃ\u008dLIA.
I know Ã\215 and Ã\u008d are equivalent to "Í", but I can't find a way to convert them to their equivalent inside my query. I don't want to do a replace for each value that appears like that in my database, and I can't find a way to convert them to their text equivalent in the BigQuery documentation.
I already tried FORMAT('%o', 215), but it only converts octal to byte and it only works with numeric values.
I tried REGEXP_REPLACE too, but I can't find a way to refer to all octal forms inside the strings.
According to this online tool, Ã\215 and Ã\u008d are both equivalent to "Í". But when you put them in BigQuery, both give an "Ã" value, because it reads only the Ã; \215 and \u008d are not used or simply don't have an equivalent.
The CAST() function could simply be used to convert these UTF-8 encoded values, but the query reads them with the ISO 8859-1 (Latin-1) Unicode mappings and, as I stated earlier, it will only return a null value.
My take on this case: first convert the value using the tool that I mentioned, then find the right Unicode hex code in the Unicode mappings.
SELECT CAST('BIQU\u00cdNI' AS STRING) AS Converted
Here, \u00cd is equivalent to Í.
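To illustrate why Ã\215 and Ã\u008d both stand for Í, here is a minimal Python sketch (not BigQuery SQL). It assumes the stored strings contain the backslash escapes literally, exactly as shown in the question; the helper names unescape and repair_mojibake are made up for this example. The idea is that Í is the two bytes C3 8D in UTF-8, and reading those bytes as Latin-1 yields Ã followed by the control character 0x8D (octal 215).

```python
import re


def unescape(text):
    """Turn literal \\u008d (hex) and \\215 (octal) escapes into the characters they name."""
    text = re.sub(r"\\u([0-9a-fA-F]{4})", lambda m: chr(int(m.group(1), 16)), text)
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


def repair_mojibake(text):
    """Re-read text that was UTF-8 but got decoded as Latin-1."""
    raw = unescape(text)
    try:
        # Recover the original bytes, then decode them with the right codec.
        return raw.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Assumption: values that do not round-trip were not mojibake, so leave them alone.
        return raw


print(repair_mojibake("BIQU\u00c3\\215NI"))
print(repair_mojibake("BRAS\u00c3\\u008dLIA"))
```

If the values are repaired this way before loading, no per-value replace is needed; whether the same round trip can be expressed directly in a query depends on the functions available there.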
There is a numeric field in a legacy application that I am trying to change to alphanumeric with a field length of about 15. The field is for data entry of account information. In the code, it's referenced in numerous places:
.BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
and
!BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
The Format is: ####-##-##-##-## and the Mask is ####-##-##-##-##. What I am wondering is what Format (and code) changes should I make to get the field to become alphanumeric? I tried using ##########, however that has not worked.
As BobRodes commented, you can use @ to mask characters that are not limited to numbers. There are other options (ignore spaces, force left-to-right filling, upper/lower case).
Have a look at the Format function documentation on MSDN for details. This link is for VBA, but the Format strings should be compatible.
Please note that you still need to validate the input; the Format function is not strict about input.
Let's say I have a regular expression that validates the input value as a whole. For example, it is an email input box and when the user hits enter, I check it against ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$ to see if it is a valid email address.
What I want to achieve is to intercept the character input too, and check every single input character to see if that character is also valid. I can do this by adding an extra regular expression, e.g. [A-Z0-9._%+-], but that is not what I want.
Is there a way to extract the widest possible range of acceptable characters from a given regular expression? So in the example above, can I programmatically extract all the valid characters that are defined by the original regular expression (i.e. ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$)?
I would appreciate any help or hint.
P.S. This is a project for iOS written in Objective-C.
If you don't mind writing half a regex parser, certainly. You would have to be able to distinguish literals from meta-characters and to unroll/merge all character classes (including negated character classes, and nested negated character classes, if your regex flavor supports them).
If NSRegularExpression doesn't come with some convenience method, I cannot imagine how it would be possible otherwise. Just think about ^. When it is outside of a character class, it's a meta-character that you can ignore. If it is inside a character class, it's a meta-character that negates the class, but only if it is the first character. - is a meta-character inside character classes, unless it is the first character, the last character, or right after another character range (depending on regex flavor). And I'm not even speaking about escaped characters.
I don't know about NSRegularExpression, but some flavors also support nested character classes (like [a-z[^aeiou]] for all consonants). I think you get where I am going with this.
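To give a feel for the "half a regex parser" idea, here is a rough Python sketch rather than Objective-C. It is only an illustration for a small subset of regex syntax: literals, backslash-escaped literals, simple bracket classes with ranges and leading ^ negation, and {m,n} quantifiers. It does not handle nested classes, shorthand escapes such as \d or \w, or the . wildcard. The function name allowed_chars and the universe parameter (the set a negated class is subtracted from) are assumptions made for this example.

```python
import string


def allowed_chars(pattern, universe=string.printable):
    """Collect every character that can appear in a match (small regex subset only)."""
    allowed = set()
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            # Assumption: an escape outside a class is an escaped literal such as \.
            allowed.add(pattern[i + 1])
            i += 2
        elif c == "[":
            i += 1
            negated = i < n and pattern[i] == "^"
            if negated:
                i += 1
            members = set()
            first = True  # a ] right after [ or [^ is a literal
            while i < n and (pattern[i] != "]" or first):
                first = False
                if pattern[i] == "\\" and i + 1 < n:
                    lo = pattern[i + 1]
                    i += 2
                else:
                    lo = pattern[i]
                    i += 1
                # A - forms a range only when it is followed by something other than ]
                if i + 1 < n and pattern[i] == "-" and pattern[i + 1] != "]":
                    hi = pattern[i + 1]
                    i += 2
                    members.update(chr(k) for k in range(ord(lo), ord(hi) + 1))
                else:
                    members.add(lo)
            i += 1  # skip the closing ]
            allowed |= (set(universe) - members) if negated else members
        elif c == "{":
            # Skip a quantifier such as {2,4}
            j = pattern.find("}", i)
            i = j + 1 if j != -1 else n
        elif c in "^$.*+?()|":
            i += 1  # meta-characters contribute no literal (the . wildcard is ignored here)
        else:
            allowed.add(c)
            i += 1
    return allowed


email_pattern = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"
print("".join(sorted(allowed_chars(email_pattern))))
```

Even this toy version needs special cases for ^, - and ], which is the point made above: a general solution has to understand the full grammar of the regex flavor in use.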
I'm working on a translator that will take English language text (as user input into a UITextView) and (with a button press) replace specific words with alternatives. I have both the English words in scope plus their alternatives in separate Arrays (englishArray and alternativeArray), indexed correspondingly.
My challenge is finding an algorithm that will allow me to identify a word in the input text (a UITextView) ignoring characters like <",.()>, look up the word in englishArray (case insensitive), locate the corresponding word in alternativeArray and then use that word in place of the original, writing it back to the UITextView.
Any help greatly appreciated.
NB. I have created a Category extending the NSArray functionality with an indexOfCaseInsensitiveString method that ignores case when doing an indexOfObject type lookup, if that helps.
Tony.
I think that using an NSScanner would be best to parse the string into separate words, which you could then pass to your indexOfCaseInsensitiveString method. scanCharactersFromSet:intoString:, using a set of all the characters you want to ignore (including whitespace and newline characters), should get you to the start of a word, and then you could use scanUpToCharactersFromSet:intoString: with the same set to scan to the end of the word. Reading scanLocation at the beginning and end of each scan should allow you to get the range of that word, so if you find a match in your array, you will know where in your string to make the replacement.
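The overall flow (split into words, look each one up ignoring case, substitute, keep everything else untouched) can be sketched in Python as follows. This is not NSScanner code: it swaps in a regular expression to find the words, and the function name translate and the word pattern (letters and apostrophes) are assumptions for the example. The two parallel lists mirror englishArray and alternativeArray from the question.

```python
import re


def translate(text, english_words, alternative_words):
    """Replace each known word with its alternative, leaving punctuation and spacing as they are."""
    # Build a case-insensitive lookup from the two index-aligned lists.
    lookup = {word.lower(): alt for word, alt in zip(english_words, alternative_words)}

    def swap(match):
        word = match.group(0)
        # Unknown words are returned unchanged.
        return lookup.get(word.lower(), word)

    # Assumption: a "word" is a run of ASCII letters and apostrophes.
    return re.sub(r"[A-Za-z']+", swap, text)


print(translate('He said, "Hello (world)."', ["hello", "world"], ["howdy", "planet"]))
```

Because only the matched words are rewritten, punctuation and whitespace between them survive without having to be captured and reassembled separately.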
Thanks for your suggestion. It's working with one exception.
I want to capture all punctuation so I can recreate the original input but with the substituted words. Even though I have a 'space' in my Character Set, the scanner is not putting the spaces into the 'intoString'. Other characters I specify in the Character Set such as '(' and ';' are represented in the 'intoString'.
The net result is that when I recreate the input, it's perfect except that individual words run into each other.
UPDATE: I fixed that issue by including:
[theScanner setCharactersToBeSkipped:nil];
Thanks again.
I am interested in searching a string (using objective-c) starting from a specific character in the middle of the string. I could split the string, but is there another way? I don't see an obvious option for that in the definition of NSString. I want to be able to search either backwards or forwards in the string starting from a defined character.
Use -[NSString rangeOfString:options:range:]. The range parameter tells the string to look for the search string only in a sub-portion of the original string, and passing NSBackwardsSearch in the options makes the search run backwards within that range.
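The same idea of restricting the search to a sub-range instead of splitting the string can be shown with a small Python analogue. This is not the NSString API; the helper name find_from is made up for the example.

```python
def find_from(text, needle, start, backwards=False):
    """Return the index of needle when searching from start, or -1 if it is not found."""
    if backwards:
        # Search only text[0:start] and report the last match before start.
        return text.rfind(needle, 0, start)
    # Search only text[start:] and report the first match at or after start.
    return text.find(needle, start)


sample = "abcXdefXghi"
print(find_from(sample, "X", 5))                  # forward from index 5
print(find_from(sample, "X", 5, backwards=True))  # backward from index 5
```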