I'm trying to import a CSV to a BigQuery table from the user interface. My import job fails with the message:
Too many errors encountered. (error code: invalid)
gs://foo/bar.csv: CSV table references column position 15, but line starting at position:29350998 contains only 15 columns. (error code: invalid)
I'm assuming this means the importer doesn't like null fields in source data without an actual null string. Is there a way to make the UI allow jagged rows on import? Or, if not, what CLI command should I use to import my CSV file to a table this way?
The UI has an Allow jagged rows checkbox that you can select. Did you try that? It's part of the Options for the Create Table wizard.
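If you'd rather do it from the command line, a minimal sketch using the bq tool's --allow_jagged_rows flag (the dataset, table, and schema file names here are placeholders; gs://foo/bar.csv is taken from your error message):

bq load --source_format=CSV --skip_leading_rows=1 --allow_jagged_rows \
    mydataset.mytable gs://foo/bar.csv ./schema.json

--skip_leading_rows=1 assumes the CSV has a header row; drop it if yours doesn't.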
I am trying to create a table from JSON files in BigQuery, and I want just one column, representing only the first key, 'id'.
Creating a schema with only one column causes errors because all of the JSON keys in the input files are considered.
Is there a way to create a table that corresponds to only specific JSON keys?
Unfortunately, you can't create a table from a JSON file in BigQuery with just one column taken from the JSON file. You can file a feature request for this with the BigQuery team.
You have these options:
Option 1
Don't import as JSON, but as CSV instead (define the null character as the separator)
Each line has only one column - the full JSON string
Parse inside BigQuery with maximum flexibility (JSON parsing functions and even JS)
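A minimal sketch of Option 1 with the bq CLI, assuming a one-column STRING schema and a delimiter character that never appears in your data (the dataset, table, and bucket names are placeholders, and the \x01 control character is only an assumption for the delimiter):

bq load --source_format=CSV --field_delimiter=$(printf '\x01') \
    mydataset.raw_json gs://mybucket/data.json json_line:STRING

SELECT JSON_EXTRACT_SCALAR(json_line, '$.id') AS id
FROM mydataset.raw_json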
Option 2
Do a 2-step import:
Import as a new table with all the columns.
Append "SELECT column1 FROM [newtable]" into the existing table.
I'm attempting to upload a CSV file (the output of a BCP command) to BigQuery using the bq load command from the Cloud SDK. I'm supplying a custom schema file, since I was having major issues with autodetect.
One resource suggested this could be a datatype mismatch. However, the source table in the SQL database lists the column as a decimal, so in my schema file I listed it as FLOAT, since decimal is not a supported data type.
I couldn't find any documentation for what the error means and what I can do to resolve it.
What does this error mean? In this context, it means a value is REQUIRED for a given column index but none was found. (Note that columns are usually 0-indexed, so a fault at column index 8 most likely refers to column number 9.)
This can be caused by a myriad of different issues; I ran into two of them.
Incorrectly categorizing NULL columns as NOT NULL. After exporting the schema from SSMS as JSON, I needed to clean it up for BQ, and in doing so I mapped IS_NULLABLE: NO to MODE: NULLABLE and IS_NULLABLE: YES to MODE: REQUIRED. These values should have been reversed. This caused the error because there were NULL columns where BQ expected a REQUIRED value.
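For reference, IS_NULLABLE: YES should become "mode": "NULLABLE" and IS_NULLABLE: NO should become "mode": "REQUIRED". A minimal sketch of a cleaned-up BigQuery schema file with that mapping (the column names are made up for illustration):

[
  {"name": "order_id", "type": "INTEGER", "mode": "REQUIRED"},
  {"name": "unit_price", "type": "FLOAT", "mode": "NULLABLE"}
]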
Using the wrong delimiter. The file I was outputting was not only comma-delimited but also tab-delimited. I was only able to confirm this by importing the data with the Get Data tool in Excel, which revealed the tabs inside the cells.
After switching the output to a pipe ( | ) delimiter, I was finally able to load the file into BigQuery without any errors.
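A sketch of what the load command looks like with an explicit pipe delimiter, via the --field_delimiter flag (the dataset, table, and file names are placeholders):

bq load --source_format=CSV --field_delimiter='|' mydataset.mytable ./export.csv ./schema.json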
I am using Teradata SQL Assistant to import a CSV file. I clicked Import Data to activate the import operation, then typed the following:
insert into databasename.tablename values(?,?,?,...)
I made sure to specify the database name as well as the name I want for the table, and I put in 13 question marks, one for each of the 13 columns in my CSV file.
It gives me the following error:
Query contains 13 parameters but Import file contains 1 data values
I have no idea what the issue is.
The default delimiter used by your SQL Assistant doesn't match the one used in the CSV, so it doesn't recognise all the columns.
In SQL Assistant, go to Tools >> Options >> Export/Import and choose the proper delimiter so it matches the one in your CSV.
I have a NEWLINE_DELIMITED_JSON file on my computer and I would like to load it into a BigQuery table.
I have 3 keys in each line. One of those is a timestamp: I would like to drop it and not get a "timestamp" column in my BigQuery table.
Another has the wrong name: the key in the JSON file is called "special_id", but I would like to load it into a column named "main_id".
I can't find a way to do that while specifying the schema of the table created during the load. Is there a way to do this?
Thank you
For that level of flexibility:
Don't import as JSON
Import as CSV (define null character as separator)
Each line has only one column - the full JSON string
Parse inside BigQuery with maximum flexibility (JSON parsing functions and even JS)
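A minimal sketch under those constraints: load each JSON line as a single string, then select only the keys you want, renaming as you go. The staging table name and the \x01 delimiter are assumptions (the delimiter must never appear in the JSON), and other_key stands in for your third key:

bq load --source_format=CSV --field_delimiter=$(printf '\x01') \
    mydataset.raw_lines ./data.json json_line:STRING

SELECT
  JSON_EXTRACT_SCALAR(json_line, '$.special_id') AS main_id,
  JSON_EXTRACT_SCALAR(json_line, '$.other_key') AS other_key  -- stand-in for your third key; the timestamp key is simply not selected
FROM mydataset.raw_lines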
I have the following table in xlsx format which I would like to import into my MySQL database:
The table is pretty complicated, and I only want the records after '1)HEADING'.
I have been looking at PHP libraries to import into SQL, but they only seem to handle simple Excel files.
You have two ways to do this:
First method:
1) Export it into some text format. The easiest will probably be a tab-delimited version, but CSV can work as well.
2) Use the load data capability. See http://dev.mysql.com/doc/refman/5.1/en/load-data.html
3) Look halfway down the page, as it gives a good example for tab-separated data (a fuller sketch follows after this list):
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
4) Check your data. Sometimes quoting or escaping has problems, and you need to adjust your source or your import command, or it may just be easier to post-process via SQL.
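For example, a minimal sketch of such a statement, assuming a tab-delimited export named data.txt and a target table you have already created (the file and table names are placeholders; IGNORE 1 LINES skips a header row if your export has one):

LOAD DATA LOCAL INFILE 'data.txt'
INTO TABLE my_table
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;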
Second method:
There's a simple online tool that can do this called sqlizer.io.
You upload an XLSX file to it, enter a sheet name and cell range, and it will generate a CREATE TABLE statement and a bunch of INSERT statements to import all your data into a MySQL database.