BigQuery Table Creation Confusion - google-bigquery

I have to create a BigQuery table with the following schema:
snippet:STRING,comment_date:TIMESTAMP
And I have data as follows:
"Love both of these brands , but the "" buy a $100k car , get or give a pair of $40 shoes "" message seems .",2015-06-22 00:00:00
"All Givens Best Commercial Ever",2015-06-22 00:00:00
I was confused because both rows were accepted and inserted into the table, even though in the first line the whole snippet string sits between double quotes yet also contains double quotes and commas inside.
Why doesn't BigQuery get confused there?

When parsing CSV, BigQuery splits only on unquoted commas, and inside a quoted string it treats a doubled double quote ("") as a single escaped quote character ("). So your input is valid CSV as far as BigQuery is concerned.
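You can see the same parsing behaviour with Python's csv module, which follows the same quoting convention (a sketch for illustration, not BigQuery itself):

```python
import csv
import io

# The first row from the question: the snippet field is quoted,
# and the embedded quotes are doubled ("").
raw = '"Love both of these brands , but the "" buy a $100k car , get or give a pair of $40 shoes "" message seems .",2015-06-22 00:00:00'

row = next(csv.reader(io.StringIO(raw)))
print(len(row))   # 2 -- the commas inside the quotes do not split the field
print(row[0])     # each "" pair comes back as a single " character
```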

Replace comma (,) with dot (.) in SQL and Float Datatype

I created a table with numeric values like 9,35 or 10,5 in it. The datatype is float. In short, the table looks like this:
Currency | Euro | 2018
USD | 1 | 9,35
Now I want to update my table and replace every comma (,) with a dot (.).
I tried it with this code:
update dbo.[Table]
set [2018] = replace([2018], ',','.')
It says that 24 rows are affected, but when I look at my table nothing has changed.
If I use this code:
select replace ([2018],',','.') from dbo.[Table]
Then it works fine, but it doesn't update my table...
Numeric columns do not contain a separator; a separator is only used when the data is displayed. The SQL Server was probably set up with a culture that uses a comma rather than a dot as the decimal separator when it displays data. The comma is not stored with the value.
All you need to do is specify the format when you display the data, meaning in a report, form, app, whatever. That's where you specify how to format the values.
I would not format the data in the SQL query itself (e.g. by converting it to a string in a specific format), since that makes it harder to do aggregations and other numeric operations on the client, and takes up more space in memory (which may not be a problem until you get to a massive scale).
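The same idea in a Python sketch (a stand-in for illustration, not SQL Server itself): the stored value carries no separator, and the comma only appears when you format the number for display.

```python
# The stored value is just a number; no separator lives in it.
value = 9.35

# Formatting with a dot or a comma is a presentation decision,
# made at display time (report, form, app).
dot_style = f"{value:.2f}"                     # '9.35'
comma_style = f"{value:.2f}".replace(".", ",")  # '9,35'

print(dot_style, comma_style)
```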

How do I load <file name>.csv.gz from snowflake stage into a snowflake table?

I have successfully loaded 1000 files into a Snowflake stage=MT_STAGE.
Every file has exact same schema.
Every file has exact same naming convention (filename).csv.gz
Every file is about 50 megs (+/- a couple megs).
Every file has between 115k-120k records.
Every file has 184 columns.
I have created a Snowflake table=MT_TABLE.
I keep on getting errors trying to do a "COPY INTO" to move files from stage into a single table.
I've tried countless variations of the command, with and without different options. I've spent three days reading documentation and watching videos, and I have failed. Can anyone help?
copy into MT_TABLE from #MT_STAGE;
Copy executed with 0 files processed
copy into MT_TABLE from #MT_STAGE (type=csv field_delimiter=”,” skip_header=1);
Syntax error: unexpected '('. (line 1)
copy into MT_TABLE from #MT_STAGE type=csv field_delimiter=”,” skip_header=1;
Syntax error: unexpected '”,'. (line 1)
So, as per Mike's comment: if there are commas in your data, suppose three columns
col_a, col_b, col_c
hold the values
no comma
one, comma
two,, commas
The unquoted CSV file then reads
col_a, col_b, col_c
no comma, one, comma, two,, commas
How can anything tell which value belongs in which column?
All of these parses are equally plausible:
col_a | col_b | col_c
no comma | one, comma | two,, commas
no comma, one | , comma | two,, commas
no comma | one, comma, two | , commas
no comma, one | , comma, two | , commas
no comma | one, comma, two, | commas
no comma, one | , comma, two, | commas
Which of these is the correct one?
So you either change the field delimiter from comma (,) to pipe (|), or you quote the data:
no comma| one, comma| two,, commas
double quotes
"no comma","one, comma"," two,, commas"
single quotes
'no comma','one, comma',' two,, commas'
The cool thing is: if you change your column delimiter, it must not appear in the data, OR the data has to be quoted. And if you switch to quoting, the quote character must not appear in the field, OR it has to be escaped. OR you can encode the values in some safe form like base64; it takes more space, but now it's transport-safe:
bm8gY29tbWE,IG9uZSwgY29tbWE,IHR3bywsIGNvbW1hcw
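Both escapes can be sketched in Python, using the three field values from the example above (csv quoting and base64 here are stand-ins for what the loader's quoting options do):

```python
import base64
import csv
import io

fields = ["no comma", "one, comma", "two,, commas"]

# Option 1: quote the fields so embedded commas are unambiguous.
buf = io.StringIO()
csv.writer(buf, quoting=csv.QUOTE_ALL).writerow(fields)
print(buf.getvalue().strip())   # "no comma","one, comma","two,, commas"

# Reading it back recovers the original three fields exactly.
assert next(csv.reader(io.StringIO(buf.getvalue()))) == fields

# Option 2: base64-encode each field -- larger, but delimiter-safe.
encoded = [base64.urlsafe_b64encode(f.encode()).decode().rstrip("=")
           for f in fields]
print(",".join(encoded))        # starts with bm8gY29tbWE, as above
```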

How to search by SQL while doing "a cut of trailing zeros" on a number field?

I have a db table in Oracle with a column defined as a number.
The column contains numbers like:
MyColumn
12540000000
78590000000
I want to find the records by searching MyColumn=12540000000 as well as MyColumn=1254 (without trailing zeros).
What could I try? TO_CHAR and some cutting logic, or is there something simpler?
rtrim(MyColumn, '0') = '1254'
Note that on the right-hand side I enclosed the value in quotes, so it is really compared as a string, not a number. Apparently you are treating these as strings anyway, right? Even if MyColumn is a number, it will be implicitly converted to a string before RTRIM is applied.
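A Python stand-in for the comparison (str.rstrip plays the role of Oracle's RTRIM in this sketch):

```python
# Strip trailing zeros from the string form of the number,
# then compare against the search term as a string.
def matches(value, term):
    return str(value).rstrip("0") == term

print(matches(12540000000, "1254"))  # True
print(matches(78590000000, "1254"))  # False
print(matches(1254, "1254"))         # True -- no trailing zeros to strip
```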

How to query for special characters

I have a large table filled with vendor information.
I need to split this list into two separate lists based on column VENDOR_NAME. One list where VENDOR_NAME is all regular characters and numbers, another list where VENDOR_NAME is special/foreign characters.
I am not sure what the SELECT statements would be to view this information off of the existing master table. Then I could just create two new tables.
VENDOR_NAME only numbers and regular characters
VENDOR_NAME only foreign characters
Example:
Regular: BLUE RIBBON TAG & LABEL CORP
Foreign: 俞章平
Regular: ULSTER-SOCIETY OF GASTROENTEROLOGY/1
Foreign: 马建忠
You could use the function ASCIISTR():
ASCIISTR takes as its argument a string, or an expression that resolves to a string, in any character set, and returns an ASCII version of the string in the database character set. Non-ASCII characters are converted to the form \xxxx, where xxxx represents a UTF-16 code unit.
To get all strings without special characters:
SELECT * FROM table
WHERE INSTR(ASCIISTR(vendor_name),'\') = 0
You have to take care, of course, that strings containing a literal '\' would be filtered out by this as well, since the backslash itself is translated to '\005C' by ASCIISTR. Maybe like this:
WHERE INSTR(REPLACE(ASCIISTR(vendor_name),'\005C','_' ),'\') = 0
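The same partition can be sketched in Python, using a plain is-it-ASCII test in place of ASCIISTR():

```python
# Split vendor names into "regular" (pure ASCII) and
# "foreign" (contains any non-ASCII character).
vendors = [
    "BLUE RIBBON TAG & LABEL CORP",
    "俞章平",
    "ULSTER-SOCIETY OF GASTROENTEROLOGY/1",
    "马建忠",
]

regular = [v for v in vendors if v.isascii()]
foreign = [v for v in vendors if not v.isascii()]

print(regular)  # the two Latin-script names
print(foreign)  # the two Chinese names
```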

Escaping a single quote in Oracle regex query

This is really starting to hurt!
I'm attempting to write a query in Oracle SQL Developer using a regex condition.
My objective is to find all last names that contain characters not commonly found in names, i.e. anything other than alphabetic characters, spaces, hyphens and single quotes.
i.e.
I need to find
J00ls
McDonald "Macca"
Smithy (Smith)
and NOT find
Smith
Mckenzie-Smith
El Hassan
O'Dowd
My present query is
select * from dm_name
WHERE regexp_like(last_name, '([^A-Za-z -])')
and batch_id = 'ATEST';
which excludes everything expected except the single quote. When it comes to putting the single quote character in, the Oracle SQL Developer parser treats it as the end of the string literal.
I've tried:
\' -- but got a "missing right parenthesis" error
||chr(39)|| -- but the search returned nothing
'' -- negated the previous character in the matching group, e.g. '([^A-Za-z -''])' made names containing '-' come back.
I'd appreciate anything you could offer.
Just double the single quote to escape it. So:
select *
from dm_name
where regexp_like(last_name, '[^A-Za-z ''-]')
and batch_id = 'ATEST'
See also this sqlfiddle. Note, I tried a similar query in SQL developer and that worked as well as the fiddle.
Note also that, for this to work, the - character has to be the last character in the group, as otherwise the regex engine tries to build a range from SPACE to ' rather than match the literal character -.
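You can try the character class out in Python, where the same hyphen-last rule applies (a sketch, not Oracle's regexp engine):

```python
import re

# Same class as the Oracle query: anything outside letters, space,
# apostrophe and hyphen counts as "unusual". The hyphen sits last
# in the class so it is a literal, not a range.
unusual = re.compile(r"[^A-Za-z '-]")

names = ["J00ls", 'McDonald "Macca"', "Smithy (Smith)",
         "Smith", "Mckenzie-Smith", "El Hassan", "O'Dowd"]

flagged = [n for n in names if unusual.search(n)]
print(flagged)  # ['J00ls', 'McDonald "Macca"', 'Smithy (Smith)']
```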
The following works:
select *
from dm_name
WHERE regexp_like(last_name, '([^A-Za-z ''-])');
See this SQLFiddle.
Whether SQL Developer will like it or not is something I cannot attest to as I don't have that product installed.
Share and enjoy.