BigQuery: Convert oct/hex/dec codes to their text equivalent - google-bigquery

In my BigQuery table I have some string values that, for some reason unknown to me, show up like:
BIQUÃ\215NI or BRASÃ\u008dLIA.
I know Ã\215 and Ã\u008d are equivalent to "Í", but I can't find a way to convert them to their text equivalent inside my query. I don't want to do a replace for each value that appears like that in my database, and I can't find anything in the BigQuery documentation that covers this conversion.
I already tried FORMAT('%o', 215), but it only formats a number in octal and only works with numeric values.
I also tried REGEXP_REPLACE, but I can't find a way to refer to all the octal forms inside the strings.

Using this online tool, Ã\215 and Ã\u008d are both equivalent to "Í". But when you put them in BigQuery, both give an "Ã" value: only the Ã is read, and \215 and \u008d are either not used or simply don't have an equivalent.
The CAST() function can be used to convert these UTF-8 encoded values, but the query reads them with the ISO 8859-1 (Latin-1) Unicode mappings, and, as I stated earlier, it will only return a null value for them.
My take on this case: first convert the value using the tool I mentioned, then find the right Unicode hex code in the Unicode mappings and use it in the query:
SELECT CAST('BIQU\u00cdNI' AS STRING) AS Converted
Here, \u00cd is equivalent to Í.
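
The pairing of Ã with \215 (octal) or \u008d is what the two UTF-8 bytes of Í (0xC3 0x8D) look like when each byte is read as a separate Latin-1 character. A minimal sketch of undoing that outside BigQuery, assuming the values can be post-processed in Python (the function name repair_mojibake is made up for illustration and is not a BigQuery feature):

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 bytes that were mis-decoded as Latin-1.

    Assumes every character in `text` fits in Latin-1 (code point <= 0xFF).
    If the round trip fails, the text is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# "\u00c3" is Ã; "\215" (octal) and "\u008d" are the same control character.
print(repair_mojibake("BIQU\u00c3\215NI"))     # BIQUÍNI
print(repair_mojibake("BRAS\u00c3\u008dLIA"))  # BRASÍLIA
```

Strings that are already correct fail the round trip and come back untouched, so the function can be applied to a whole column without a per-value replace list.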

Related

What does utl_i18n.escape_reference really do?

As Oracle docs describe:
ESCAPE_REFERENCE Function converts a text string to its character reference counterparts for characters that fall outside the character set used by the current document.
However, why does "utl_i18n.escape_reference('啊我鵝覺〇喆','zhs16cgb231280')" return "啊我鹅觉?喆"? It seems that there are some other conversions besides character references.
(P.S. NLS_NCHAR_CHARACTERSET: AL16UTF16)

How to migrate string format function with "!###" from vb6.0 to vb.net?

I'm migrating source from vb6.0 to vb.net and am struggling with this format function:
VB6.Format(text, "!#########")
VB6.Format(text, "00000")
I don't understand the meaning of "!#########" and "00000", and how to do the equivalent in VB.Net. Thanks
This:
VB6.Format(text, "!#########")
indicates that the specified text should be left-aligned within a nine-character string. Using standard .NET functionality, that would look like this:
String.Format("{0,-9}", text)
or, using the newer string interpolation, like this:
$"{text,-9}"
The second one is a little bit trickier. It indicates that the specified text should be formatted as a number, zero-padded to five digits. In .NET, only actual numbers can be formatted as numbers; strings containing digit characters are not numbers. You could convert the String to a number and then format it:
String.Format("{0:00000}", CInt(text))
or:
String.Format("{0:D5}", CInt(text))
If you were going to do that then it's simpler to just call ToString on the number:
CInt(text).ToString("D5")
If you don't want to do the conversion then you can pad the String explicitly instead:
text.PadLeft(5, "0"c)
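
For comparison, here are the same three operations sketched in Python rather than VB.NET. This is illustrative only; the sample value "42" is made up.

```python
text = "42"

# Left-align within nine characters, like String.Format("{0,-9}", text).
left_aligned = "{:<9}".format(text)

# Convert to a number, then zero-pad to five digits, like CInt(text).ToString("D5").
zero_padded_number = format(int(text), "05d")

# Pad the string itself without converting it, like text.PadLeft(5, "0"c).
zero_padded_string = text.rjust(5, "0")

print(repr(left_aligned), zero_padded_number, zero_padded_string)
```

The last two only differ for input that is not a plain non-negative integer: the numeric route raises an error on non-digit text, while the string route pads whatever it is given.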

VB6 Change Numeric Field To Alphanumeric

There is a numeric field in a legacy application that I am trying to change to alphanumeric with a field length of about 15. The field is for data entry of account information. In the code, it's referenced in numerous places:
.BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
and
!BANK_accno = Format(Me.txtBANK, "####-##-##-##-##")
The Format is: ####-##-##-##-## and the Mask is ####-##-##-##-##. What I am wondering is what Format (and code) changes should I make to get the field to become alphanumeric? I tried using ##########, however that has not worked.
As BobRodes commented, you can use # to mask characters that are not limited to numbers. There are other options as well (ignoring spaces, forcing left-to-right filling, upper/lower case).
Have a look at the Format function documentation on MSDN for details. This link is for VBA, but the format strings should be compatible.
Please note that you still need to validate the input; the Format function is not strict about what it accepts.

c# Comparing strings from Oracle and SQL server

I think I have an encoding problem that needs to be fixed.
Is there a way to compare strings across code pages?
Oracle returns a string "TEST - My String" with the minus sign encoded as ASCII 63.
SQL Server quite correctly returns the string with the minus encoded as 45.
Is there a way to compare these strings?
Does the framework contain a comparison that is capable of ignoring code page mismatches?
Use one of the overloads of string.Compare or string.Equals, probably:
if (string.Equals(value1, value2, StringComparison.OrdinalIgnoreCase))
{
...
}
More useful info here:
http://msdn.microsoft.com/en-us/library/dd465121.aspx
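
ASCII 63 is the question mark, which is the usual substitute when a string is converted to a character set that lacks one of its characters. A minimal Python sketch of that effect, assuming (this is a guess, not something stated in the question) that the stored character was an en dash rather than a plain hyphen:

```python
# Assumption: the source string holds an en dash (U+2013), not "-" (ASCII 45).
original = "TEST \u2013 My String"

# A lossy conversion to a character set without the en dash substitutes "?".
lossy = original.encode("ascii", errors="replace").decode("ascii")

print(lossy)          # TEST ? My String
print(ord(lossy[5]))  # 63
```

If that is what happened, the two strings genuinely differ at that position, which is worth checking before choosing a comparison mode.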

Extract terms from query for highlighting

I'm extracting terms from the query by calling ExtractTerms() on the Query object that I get as the result of QueryParser.Parse(). I get a Hashtable, but each item is presented as:
Key - term:term
Value - term:term
Why are the key and the value the same? Moreover, why is the term value duplicated and separated by a colon?
Do highlighters only insert tags, or do they do anything else? I want not only to get text fragments but also to highlight the source text (it's quite big). I am trying to get the terms and insert tags by hand using their offsets, but I am not sure this is the right solution.
I think the answer to this question may help.
It is because .NET 2.0 doesn't have an equivalent to Java's HashSet. The conversion to .NET uses Hashtables with the same value as both key and value. The colon you see is just the result of Term.ToString(): a Term is a field name plus the term text, and your field name is probably "term".
To highlight an entire document using the Highlighter contrib, use the NullFragmenter.