Does anyone know how to include special characters such as $ in a table name, or, if a table already exists with a $ in its name, how to ingest it into BigQuery?
Thank you in advance,
That's not possible.
BigQuery column names must contain only letters, numbers, and underscores.
They must start with either a letter or an underscore.
Link to relevant doc: https://cloud.google.com/bigquery/docs/schemas#column_names
If a table contains special characters, you need to change its name before ingesting to BigQuery ($COLUMN_NAME -> DOLLAR_COLUMN_NAME, maybe?)
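As a rough sketch of the renaming step suggested above, the Python helper below rewrites a name so it uses only letters, numbers, and underscores and starts with a letter or underscore. The function name `sanitize_name` and the `REPLACEMENTS` mapping are hypothetical choices for illustration, not part of any BigQuery API.

```python
import re

# Hypothetical mapping of special characters to readable words.
REPLACEMENTS = {"$": "DOLLAR_"}


def sanitize_name(name: str) -> str:
    """Return a name made only of ASCII letters, digits, and underscores."""
    # Spell out the characters we have an explicit word for.
    for char, word in REPLACEMENTS.items():
        name = name.replace(char, word)
    # Replace every remaining disallowed character with an underscore.
    name = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # The name must start with a letter or an underscore.
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name
    return name


print(sanitize_name("$COLUMN_NAME"))
print(sanitize_name("2024 sales"))
```

Different source names can collapse to the same result (for example "a-b" and "a.b"), so it is worth checking for duplicates before loading.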
Related
I'm trying to pull data from a column called file name, in which users are supposed to enter the file name using only digits, e.g. 245654, 346595, 700542. In a few cases, however, users have entered special characters and letters, e.g. 245654 / Abc, 654658-cgds, 78345|ghj. I need to extract all entries that contain such special characters or letters alongside the digits.
You can use a regular expression, like this:
SELECT *
FROM yourTable
WHERE filename ~ '[^0-9]';
The above query will return any record whose file name has one or more non digit characters in it.
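If the values are already in Python rather than in the database, the same check can be sketched with the `re` module. The sample values are taken from the question; the function name `has_non_digits` is made up for this example.

```python
import re

# Matches any single character that is not an ASCII digit.
NON_DIGIT = re.compile(r"[^0-9]")


def has_non_digits(file_name: str) -> bool:
    """True if the value contains at least one non-digit character."""
    return NON_DIGIT.search(file_name) is not None


samples = ["245654", "346595", "700542", "245654 / Abc", "654658-cgds", "78345|ghj"]
flagged = [s for s in samples if has_non_digits(s)]
print(flagged)
```

Note that leading or trailing spaces also count as non-digit characters, so you may want to strip the values first if stray whitespace should be ignored.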
I am trying to add a label to my bigquery table/view using the following bq command.
bq update --set_label primary_keys:a,b project-id:dataset.tablename
The command works perfectly fine if I have only one key (a) as the primary key. However, when I try to set multiple keys (a,b) separated by a comma, it throws an invalid characters error. Is there a way to add multiple keys within the same label, separated by commas?
I don't think this is feasible, since the comma character is not accepted there, according to the documentation:
Keys and values can contain only lowercase letters, numeric
characters, underscores, and dashes. All characters must use UTF-8
encoding, and international characters are allowed.
According to the documentation, labels are key-value pairs that help you organize your Google Cloud BigQuery resources.
Being a key-value pair is a requirement as per the documentation, and this is not compatible with your intention of giving two different values to the same key.
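One possible workaround, offered only as a sketch, is to spread the values over several labels, one value per key. The numbered key scheme (primary_key_1, primary_key_2) and both helper names below are assumptions made for this example; the script only builds the argument list for bq, repeating --set_label once per label, and does not run the command.

```python
from typing import Dict, List


def split_label(key: str, values: List[str]) -> Dict[str, str]:
    """Turn one multi-valued label into several single-valued labels."""
    # Hypothetical naming scheme: key_1, key_2, ...
    return {f"{key}_{i}": value for i, value in enumerate(values, start=1)}


def bq_update_args(labels: Dict[str, str], table: str) -> List[str]:
    """Build a bq update argument list with one --set_label per label."""
    args = ["bq", "update"]
    for key, value in labels.items():
        args += ["--set_label", f"{key}:{value}"]
    args.append(table)
    return args


labels = split_label("primary_key", ["a", "b"])
print(" ".join(bq_update_args(labels, "project-id:dataset.tablename")))
```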
BigQuery column names (fields) can only contain English letters, numbers, and underscores.
I am using Python and I want to create a script to migrate my data from Postgres to BigQuery, and the Postgres tables have many non-English column names.
I will probably need to encode the column names in some format that BigQuery accepts, but I will need to be able to decode them back to the original later.
What is the best way to do this?
You can encode the column names with something like base64 and replace the +, / and = characters with some kind of placeholder.
If you don't care about field length, you can encode to base32 instead (it's about 20% longer than base64, but it doesn't use '+' or '/', and '=' is used only for padding, so you can discard it without affecting the string).
Alternatively, you can build a small conversion table that maps each non-English character in your language to some combination of English characters; this only works if you have a small number of non-English characters.
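A minimal sketch of the base32 approach in Python, using the standard `base64` module. The `c_` prefix is an assumption added here so that the encoded name never starts with a digit; the function names are likewise made up for this example. The encoded names get long quickly, so check them against BigQuery's column name length limit.

```python
import base64

# Hypothetical prefix so the result always starts with a letter.
PREFIX = "c_"


def encode_column_name(name: str) -> str:
    """Encode any column name using only lowercase letters, digits, and '_'."""
    raw = base64.b32encode(name.encode("utf-8")).decode("ascii")
    # '=' is only padding, so it can be dropped and restored later.
    return PREFIX + raw.rstrip("=").lower()


def decode_column_name(encoded: str) -> str:
    """Reverse encode_column_name and return the original name."""
    body = encoded[len(PREFIX):].upper()
    # base32 text must be a multiple of 8 characters long; restore the padding.
    padding = "=" * (-len(body) % 8)
    return base64.b32decode(body + padding).decode("utf-8")


original = "prénom_client"
encoded = encode_column_name(original)
print(encoded)
print(decode_column_name(encoded) == original)
```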
I am facing an issue while fetching the data via query from a redshift table. For example:
table name: test_users
column names: user_id, userName, userLastName
When the test_users table is created, the capital letters in the userName column are converted to lowercase (username), and likewise userLastName becomes userlastname.
I have found ways to convert all column names to uppercase or to lowercase, but no way to keep them exactly as written.
Unfortunately, AWS Redshift does not support case-sensitive identifiers at the time of writing (Feb 2020). And, while Redshift is based on PostgreSQL, AWS has heavily modified it to the point where many assumptions that would be correct for PostgreSQL 8 are not correct for Redshift.
The documentation at https://docs.aws.amazon.com/redshift/latest/dg/r_names.html explicitly states that it downcases identifiers. The relevant paragraph is below; the critical sentence is the one saying that ASCII letters are folded to lowercase:
Names identify database objects, including tables and columns, as well as users and passwords. The terms name and identifier can be used interchangeably. There are two types of identifiers, standard identifiers and quoted or delimited identifiers. Identifiers must consist of only UTF-8 printable characters. ASCII letters in standard and delimited identifiers are case-insensitive and are folded to lowercase in the database. In query results, column names are returned as lowercase by default. To return column names in uppercase, set the describe_field_name_in_uppercase configuration parameter to true.
To preserve case (the identifiers must also be enclosed in double quotes):
SET enable_case_sensitive_identifier TO true;
https://docs.aws.amazon.com/redshift/latest/dg/r_enable_case_sensitive_identifier.html
To force column names to be returned in uppercase (for anyone else curious):
SET describe_field_name_in_uppercase TO on;
https://docs.aws.amazon.com/redshift/latest/dg/r_describe_field_name_in_uppercase.html
I want to create a SQL Server database that will hold thousands of tables whose names will reflect stock ticker names. For example, '0099-OL.HK' is a company's ticker name. Many of the stocks I'm creating tables for have special characters in them just like that.
I've read that special characters in table names should be avoided, but I still don't know why. SQL Server lets you use special characters in table names if you enclose the name with brackets, e.g., 'CREATE TABLE [0099-OL.HK] ...'.
Should I use the ticker names as their table names, or should I avoid using their special characters?
This will lead to no end of problems. The reason SQL Server allows names with spaces and special characters is that people migrate from databases that allow these characters in their names. If you must do this, replace all special characters with _, like so: TN0099_OL_HK (TN for ticker name), so users can type SQL without using the brackets.
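The renaming rule above could be sketched in Python as follows. The function name `ticker_to_table_name` is made up for this example, and the `TN` prefix follows the suggestion in this answer.

```python
import re


def ticker_to_table_name(ticker: str, prefix: str = "TN") -> str:
    """Replace every non-alphanumeric character with '_' and add a prefix."""
    return prefix + re.sub(r"[^A-Za-z0-9]", "_", ticker)


print(ticker_to_table_name("0099-OL.HK"))  # TN0099_OL_HK
```

Two tickers that differ only in their special characters would map to the same table name, so keeping a lookup table from ticker to table name is a sensible precaution.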
It is bad practice to do so, since not every library might be able to process the table name correctly.
Avoid using special characters, spaces, and leading numbers in database names, table names, and column names.
For the full rules for regular identifiers, see the Database Identifiers page in the SQL Server docs.