Illegal input character "\342" in Google BigQuery

I'm getting an illegal character \342 error on the "29" at the end of the query below and can't find any documentation explaining why. Note that store_num is a string in the source table.
SELECT * FROM table_1
WHERE store_num = '27082​29'

It looks like you copied the code from another document and picked up an extra character with the copy/paste. Just go to that position and delete the character (it will be invisible).

This worked for me. I copied the query from Slack. After pasting it into the BigQuery console I received the described error message.
To work around this, I first pasted the query into Sublime, which showed "<0x200>" in the empty lines. I deleted these tags, copied the query again, and inserted it into my console. The query then ran without any issues.
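For what it's worth, \342 is the octal form of byte 0xE2, which is the first byte of the UTF-8 encoding of several invisible characters, including the zero-width space U+200B. Below is a minimal Python sketch for locating and stripping such characters before pasting a query. The function names are made up for illustration, and the sample string assumes the stray character is U+200B.

```python
import unicodedata

def find_suspect_chars(query: str):
    """Return (index, code point, Unicode name) for every non-ASCII character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(query)
        if ord(ch) > 127
    ]

def strip_format_chars(query: str) -> str:
    """Drop characters in Unicode category Cf (format), e.g. zero-width spaces."""
    return "".join(ch for ch in query if unicodedata.category(ch) != "Cf")

# Assumption: the pasted query hides a zero-width space between "27082" and "29".
query = "SELECT * FROM table_1\nWHERE store_num = '27082\u200b29'"

for hit in find_suspect_chars(query):
    print(hit)

print(strip_format_chars(query))
```

Note that strip_format_chars removes every format character, so it is only a reasonable choice when the query is not supposed to contain any.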

Related

TRIM or REPLACE in Netsuite Saved Search

I've looked at lots of examples for TRIM and REPLACE on the internet and for some reason I keep getting errors when I try.
I need to strip suffixes from my Netsuite item record names in a saved item search. There are three possible suffixes: -T, -D, -S. So I need to turn 24335-D into 24335, and 24335-S into 24335, and 24335-T into 24335.
Here's what I've tried and the errors I get:
Can you help me please? Note: I can't assume a specific character length of the starting string.
Use case: We already have a field on item records called Nickname with the suffixes stripped. But I've run into cases where Nickname is incorrect compared to Name. Ex: Name is 24335-D but Nickname is 24331-D. I'm trying to build a saved search alert that tells me any time the Nickname does not equal the suffix-stripped Name.
PS: is there anywhere I can pay for quick a la carte Netsuite saved search questions like this? I feel bad relying on free technical internet advice but I greatly appreciate any help you can give me!
You are including too much SQL: a formula is a single result-field expression, not a full statement, so no FROM or AS. There is another place to set the result column/field name. One option here is REGEXP_REPLACE().
REGEXP_REPLACE({name},'\-[TDS]$', '')
Regex meaning:
\- : a literal -
[TDS] : one of T D or S
$ : end of line/string
To compare fields, a Formula (Numeric) using a CASE statement can be useful, as it makes it easy to compare the result to a number in a filter: a simple "equal to 1", for example.
CASE WHEN {custitem_nickname} <> REGEXP_REPLACE({name},'\-[TDS]$', '') then 1 else 0 end
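To sanity-check the pattern outside NetSuite, here is a small Python sketch of the same idea. It is only an illustration: strip_suffix and mismatch_flag are hypothetical helper names, and the sample values are made up.

```python
import re

# Same pattern as in the formula: a literal "-", then T, D or S, at the end.
SUFFIX = re.compile(r"-[TDS]$")

def strip_suffix(name: str) -> str:
    """Remove a trailing -T, -D or -S; leave anything else untouched."""
    return SUFFIX.sub("", name)

def mismatch_flag(name: str, nickname: str) -> int:
    """Mirror the CASE expression: 1 when nickname differs from the stripped name."""
    return 1 if nickname != strip_suffix(name) else 0

for name in ("24335-D", "24335-S", "24335-T", "24335-X"):
    print(name, "->", strip_suffix(name))
```

Because the pattern is anchored at the end, the length of the leading part does not matter.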
You are getting an error because TRIM can trim only one character: see the Oracle doc
https://docs.oracle.com/javadb/10.8.3.0/ref/rreftrimfunc.html (last example).
So try using something like this
TRIM(TRAILING '-' FROM TRIM(TRAILING 'D' FROM {entityid}))
And always keep in mind that saved searches run as Oracle SQL queries, so the Oracle SQL documentation can help you understand how to use the available functions.

ERROR: extra data after last expected column on PostgreSQL while the number of columns is the same

I am new to PostgreSQL and I need to import a set of csv files, but some of them weren't imported successfully. I got the same error with all of these files: ERROR: extra data after last expected column. I have investigated this error and learned that it can occur when the number of columns in the table is not equal to the number in the file. But I don't think that is my situation.
For example, I create this table:
CREATE TABLE cast_info (
id integer NOT NULL PRIMARY KEY,
person_id integer NOT NULL,
movie_id integer NOT NULL,
person_role_id integer,
note character varying,
nr_order integer,
role_id integer NOT NULL
);
And then I want to copy the csv file:
COPY cast_info FROM '/private/tmp/cast_info.csv' WITH CSV HEADER;
Then I got the error:
ERROR: extra data after last expected column
CONTEXT: COPY cast_info, line 8801: "612,207,2222077,1,"(segments \"Homies\" - \"Tilt A Whirl\" - \"We don't die\" - \"Halls of Illusions..."
The complete row in this csv file is as follows:
612,207,2222077,1,"(segments \"Homies\" - \"Tilt A Whirl\" - \"We don't die\" - \"Halls of Illusions\" - \"Chicken Huntin\" - \"Another love song\" - \"How many times?\" - \"Bowling balls\" - \"The people\" - \"Piggy pie\" - \"Hokus pokus\" - \"Let\"s go all the way\" - \"Real underground baby\")/Full Clip (segments \"Duk da fuk down\" - \"Real underground baby\")/Guy Gorfey (segment \"Raw deal\")/Sugar Bear (segment \"Real underground baby\")",2,1
You can see that there are exactly 7 columns, the same as in the table.
The strange thing is that the failing lines in all of these files contain a backslash followed by a quotation mark (\"). However, these are not the only rows that contain \" in the files, and I wonder why the error doesn't appear for the other rows. Because of that, I am not sure whether this is the problem.
After modifying these rows (e.g. replacing the \" or deleting the content while keeping the commas), there are new errors: ERROR: invalid input syntax, for line 2 of every file. The errors occur because three semicolons (;;;) have been appended to the data in the last column of these rows for no apparent reason. But when I open these csv files, I can't see the three semicolons in those rows.
For example, after deleting the content in the fifth column of this row:
612,207,2222077,1,,2,1
I got the error:
ERROR: invalid input syntax for type integer: "1;;;"
CONTEXT: COPY cast_info, line 2, column role_id: "1;;;"
But line 2 doesn't contain three semicolons, as shown here:
2,2,2163857,1,,25,1
In principle, I hope the problem can be solved without any modification to the data itself. Thank you for your patience and help!
The CSV format protects quotation marks by doubling them, not by backslashing them. You could use the text format instead, except that that doesn't support HEADER, and also it would then not remove the outer quote marks. You could instead tweak the files on the fly with a program:
COPY cast_info FROM PROGRAM 'sed s/\\\\/\"/g /private/tmp/cast_info.csv' WITH CSV;
This works with the one example you gave, but might not work for all cases.
"ERROR: invalid input syntax for line 2 of every file. And the errors occur because the data in the last column of these rows have been added three semicolons(;;;) for no reason. But when I open these csv files, I can't see the three semicolons in those rows"
How are you editing and viewing these files? Sounds like you are using something that isn't very good at preserving formatting, like Excel.
Try actually naming the columns you want processed in the copy statement:
copy cast_info (id, person_id, movie_id, person_role_id, note, nr_order, role_id) from ...
Following a friend's suggestion, I needed to specify the backslash as the escape character:
copy <table_name> from '<csv_file_path>' csv escape '\';
and then the problem is solved.
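The effect of declaring the backslash as the escape character can be reproduced outside PostgreSQL. The following Python sketch uses the standard csv module, not PostgreSQL's parser, so it only illustrates the idea; the sample line is a shortened version of the failing row.

```python
import csv
import io

# Shortened version of the failing row; \" is meant as an escaped quote.
line = r'612,207,2222077,1,"(segments \"Homies\" - \"Let\"s go all the way\")",2,1'

# Treat backslash as the escape character instead of relying on doubled quotes.
reader = csv.reader(
    io.StringIO(line),
    quotechar='"',
    escapechar="\\",
    doublequote=False,
)
row = next(reader)

print(len(row))  # number of fields parsed
print(row[4])    # the note column with the backslashes removed
```

With the escape character set, the embedded quotes stay inside the fifth field and the row parses into seven fields, matching the seven table columns.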

Teradata SQL - Replacing special characters

I'm using Report Builder 3.0 for my reports. My report runs; however, if a user exports the results to Excel (xlsx) instead of Excel 2003 (xls), they get an "illegal xml character" message when the file is opened.
Four of the columns contain "&" and/or "'", so I'm trying to replace these special characters, which I believe are causing the issue.
I've tried to update this line:
j.journal_desc AS "Jrnl Description",
with this line:
oreplace(oreplace(j.journal_desc, '&', 'and'),'''','') AS "Jrnl Description",
and it works fine. However when I do this on a second line I get the message: "SELECT Failed. [9804] Response Row size or Constant Row size overflow".
I've tried "otranslate" and it works on 2 columns. However, when I try it on the 3rd column, I get the same overflow message.
Is it possible to use oreplace or otranslate on multiple columns? Am I doing something wrong? Is there a better way to replace these special characters?
Thanks for the help......
When oreplace or otranslate is used, the result string has a length of 8000 in the Unicode character set, so each additional call makes the row longer by another 8000. Casting the result to a smaller length should fix the problem.
CAST(oreplace(journal_desc,'&','and') AS VARCHAR(100))

Is there a tool for finding the Char code of a character?

I am trying to write a VB function to strip unwanted characters from a string. It is for generating a 'clean' url from data that has been inputted into a CMS. Someone has copied and pasted from a Word document and so there appears to be an mdash or ndash in the product title. This results in ─ appearing instead of -
I have tried a Replace(text, Chr(196), Chr(45)) but it isn't working so it can't be 196. Is there a tool or something where I can copy this character and paste it into the tool and it will tell me what char code it is?
Thanks.
You can make your program write out the character code using the function Asc()
Response.write Asc("-") would write out
45
for example.
Try here or here. From 2nd link I can see that your char is alt150
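If no lookup tool is at hand, a few lines of code can report the code of a pasted character. Below is a minimal Python sketch. It assumes the pasted character is an en dash (U+2013) or em dash (U+2014), which is what Word typically inserts; describe and to_ascii_hyphen are hypothetical helper names.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, Unicode name and decimal code of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')} (decimal {ord(ch)})"

def to_ascii_hyphen(text: str) -> str:
    """Replace en and em dashes with a plain hyphen-minus (code 45)."""
    return text.replace("\u2013", "-").replace("\u2014", "-")

for ch in "-\u2013\u2014":
    print(describe(ch))

# In the Windows-1252 code page the en dash is a single byte with value 150.
print("\u2013".encode("cp1252")[0])
```

The Windows-1252 byte value is the one that matches the "alt150" finding above; the Unicode code point of the same character is different.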

Replace character in SQL results

This is from an Oracle SQL query. The database has weird skinny rectangle shapes in places where apostrophes should be. (I wish we could paste screenshots in here.)
It looks like this when I copy and paste the results.
spouse�s
Is there a way to write a SQL SELECT statement that searches for this character in the field and replaces it with an apostrophe in the results?
Edit: I need to change only the results in a SELECT statement for reporting purposes, I can't change the Database.
I ran this
select dump('�') from dual;
which returned
Typ=96 Len=3: 239,191,189
This seems to work so far
select translate('What is your spouse�s first name?', '�', '''') from dual;
but this doesn't work
select translate(Fieldname, '�', '''') from TableName
Select FN from TN
What is your spouse�s first name?
SELECT DUMP(FN, 1016) from TN
Typ=1 Len=33 CharacterSet=US7ASCII: 57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,73,20,66,69,72,73,74,20,6e,61,6d,65,3f
EDIT:
So I have established that is the backquote character. I can't get the DB updated so I'm trying this code
SELECT REGEX_REPLACE(FN,"\0092","\0027") FROM TN
and I'm getting ORA-00904: "Regex_Replace": invalid identifier
This looks like a problem with your charset configuration. Check your NLS_LANG and the other NLS_xxx environment/registry values. You have to check the Oracle server, your client, and the client that inserted the data.
Try to DUMP the value. you can do it with a select as simple as:
SELECT DUMP(the_column)
FROM xxx
WHERE xxx
UPDATE: I think that before trying to replace the character, you should look for the root of the problem. If this happens because of a charset issue, you can end up with serious bad-data problems.
UPDATE 2: Answering the comments. The problem may not be on the database server side; it may be on the client side. The problem (if this is the problem) can be a translation in the server-to/from-client communication, caused by a mismatched server/client configuration. For instance, if the server uses the UTF8 charset and your client uses US7ASCII, then all accented characters will appear as ?.
Another possibility is that the server and your client both use UTF8 but the application is not able to show UTF8 chars, in which case the problem is on the application side.
UPDATE 3: On your examples:
select translate('What...: it works because the � is exactly the same char, which you pasted on both sides.
select translate(Fieldname...: it does not work because the � is not what is stored in the database; it's the char that the client receives, possibly because some translation occurs between the table data and what is shown to you.
Next step: Look in DUMP syntax and try to extract the codes for the mysterious char (from the table not pasting �!).
I would say there's a good chance the character is a single-tick "smart quote" (I hate the name). The smart quotes are characters 0x91-0x94 (using a Windows encoding), or Unicode U+2018, U+2019, U+201C, and U+201D.
I'm going to propose a front-end application-based, client-side approach to the problem:
I suspect that this problem has more to do with a mismatch between the font you are trying to display the word spouse�s with, and the character �. That icon appears when you are trying to display a character in a Unicode font that doesn't have the glyph for the character's code.
The Oracle database will dutifully return whatever characters were INSERTed into its column. It's up to you, and your application, to interpret what they will look like given the font you are using to display your data, so I suggest investigating what this mysterious � character is that is replacing your apostrophes. Start by using FerranB's recommended DUMP().
Try running the following query to get the character code:
SELECT DUMP(<column with weird character>, 1016)
FROM <your table>
WHERE <column with weird character> like '%spouse%';
If that doesn't grab your actual text from the database, you'll need to modify the WHERE clause to actually grab the offending column.
Once you've found the code for the character, you could just replace the character by using the REGEXP_REPLACE() built-in function by determining the raw hex code of the character and then supplying the ASCII / C0 Controls and Basic Latin character 0x0027 ('), using code similar to this:
UPDATE <table>
SET <column with offending character>
= REGEXP_REPLACE(<column with offending character>,
'<character code of �>',
'''')
WHERE REGEXP_LIKE(<column with offending character>, '<character code of �>');
If you aren't familiar with Unicode and different ways of character encoding, I recommend reading Joel's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). I wasn't until I read that article.
EDIT: If you're seeing 0x92, there's likely a charset mismatch here:
0x92 in CP-1252 (the default Windows code page) is a right single quotation mark (a curly quote), which looks much like an apostrophe. This code isn't a valid ASCII character, and it isn't a printable character in ISO-8859-1 either. So probably either the database is in CP-1252 encoding (I don't find that likely), or a database connection that spoke CP-1252 inserted it, or the apostrophe somehow got converted to 0x92. The database is returning values that are valid in CP-1252 (or some other charset where 0x92 is valid), but your db client connection isn't expecting CP-1252. Hence the weird question mark.
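The byte values quoted in the question can be checked directly. The following Python sketch decodes the hex bytes from the DUMP(FN, 1016) output as CP-1252 and swaps the curly quote for a plain apostrophe. It is only a client-side illustration of the mismatch, not a fix for the database, and the variable names are made up.

```python
# Hex bytes copied from the DUMP(FN, 1016) output in the question.
dump_hex = (
    "57,68,61,74,20,69,73,20,79,6f,75,72,20,73,70,6f,75,73,65,92,"
    "73,20,66,69,72,73,74,20,6e,61,6d,65,3f"
)
raw = bytes(int(h, 16) for h in dump_hex.split(","))

# Interpreted as CP-1252, byte 0x92 is U+2019 (right single quotation mark).
text = raw.decode("cp1252")
fixed = text.replace("\u2019", "'")
print(fixed)

# The bytes 239,191,189 from the first DUMP are the UTF-8 encoding of U+FFFD,
# the replacement character, i.e. what was pasted was already a placeholder.
placeholder = bytes([239, 191, 189]).decode("utf-8")
print(f"U+{ord(placeholder):04X}")
```

This also suggests why translating the pasted � had no effect on the column: the stored byte is 0x92, while the pasted character is the replacement character the client displayed in its place.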
And FerranB is likely right. I would talk with your DBA or some other admin about this to get the issue straightened out. If you can't, I would try either doing the update above (seems like you can't), or doing this:
INSERT INTO <table> (<normal table columns>,...,<column with offending character>)
SELECT <all normal columns>, REGEXP_REPLACE(<column with offending character>,
CHR(146), -- byte 0x92
CHR(39)) -- for ASCII/ISO-8859-1 apostrophe
FROM <table>
WHERE REGEXP_LIKE(<column with offending character>, CHR(146));
DELETE FROM <table> WHERE REGEXP_LIKE(<column with offending character>, CHR(146));
Before you do this you need to understand what actually happened. It looks to me that someone inserted non-ascii strings in the database. For example Unicode or UTF-8. Before you fix this, be very sure that this is actually a bug. The apostrophe comes in many forms, not just the "'".
TRANSLATE() is a useful function for replacing or eliminating known single character codes.