We are using a table TABLE_NAME as the source for a dbt snapshot run, and this table has a few columns named 'abc[l2]' and 'xyz[l1]'. Since the column names themselves contain square brackets '[]', we get the below error when trying to run an incremental load for the dbt snapshot:
Runtime Error in snapshot TABLE_NAME (snapshots/dse/facts/TABLE_NAME.sql)
06:24:03 [UNRESOLVED_COLUMN.WITH_SUGGESTION] A column or function parameter with name `snapshotted_data`.`abc` cannot be resolved. Did you mean one of the following?
I tried adding a backslash, single quotes, double quotes, and backticks as escape characters, but nothing seems to work.
I cannot imagine there is such an issue in BigQuery.
Let's say I drop a column from the User table using the below command in the BQ console:
Alter table User drop column name -> successful
I am aware this column is preserved for 7 days (for time travel purposes).
But now I cannot add any column by running the below command in the BQ console:
ALTER TABLE User add column first_name STRING
Because it gives an error like the one below, even though the two columns have totally different names:
Column name was recently deleted in the table User. Deleted column name is reserved for up to the time travel duration, use a different column name instead.
The above error is the same as when I try to drop the same column again, even with IF EXISTS:
Alter table User drop IF EXISTS column name
My question:
Why does this issue happen? After 7 days, can I add new columns as usual?
I have recreated your issue wherein I dropped a column named employee_like_2 and then tried to add a new column named new_column.
There is already a bug filed for this issue. You may click +1 to bring more attention to it and star the issue so that you are notified of updates.
In the meantime, a possible workaround is to manually add columns through the BigQuery UI.
Apart from the UI solution suggested by Scott B, we can also do it using the bq command.
Basically, bq query --use_legacy_sql=false 'ALTER TABLE User add column first_name STRING' will fail to add a column, but I found a workaround:
I can run the bq update command instead, like below:
bq show --schema --format=prettyjson DATASET.User > user_schema.json
Add the new column I want to the file user_schema.json (see the sketch after these commands)
bq update DATASET.User user_schema.json
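For reference, here is a minimal sketch of the manual edit of user_schema.json done with a small Python script instead of a text editor. It assumes the schema file name and the first_name column from the example above; adjust the type and mode as needed.

import json

# Load the schema previously exported with:
#   bq show --schema --format=prettyjson DATASET.User > user_schema.json
with open("user_schema.json") as f:
    schema = json.load(f)

# Append the new column definition (name/type taken from the example above)
schema.append({"name": "first_name", "type": "STRING", "mode": "NULLABLE"})

with open("user_schema.json", "w") as f:
    json.dump(schema, f, indent=2)

# Then apply the updated schema with: bq update DATASET.User user_schema.json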
So this basically means it is 100% a bug in the BigQuery SQL command.
I'm attempting to upload a CSV file (the output of a BCP command) to BigQuery using the bq load CLI command. I have already supplied a custom schema file (I was having major issues with autodetect).
One resource suggested this could be a datatype mismatch. However, the table from the SQL DB lists the column as a decimal, so in my schema file I have listed it as FLOAT since decimal is not a supported data type.
I couldn't find any documentation for what the error means and what I can do to resolve it.
What does this error mean? It means, in this context, that a value is REQUIRED for a given column index and one was not found. (By the way, columns are usually 0-indexed, meaning a fault at column index 8 most likely refers to column number 9.)
This can be caused by a myriad of different issues, of which I experienced two:
Incorrectly categorizing NULL columns as NOT NULL. After exporting the schema from SSMS as JSON, I needed to clean it up for BQ, and in doing so I mapped IS_NULLABLE:NO to MODE:NULLABLE and IS_NULLABLE:YES to MODE:REQUIRED. These values should have been reversed (see the mapping sketch after these two causes). This caused the error because there were NULL values in columns where BQ expected a REQUIRED value.
Using the wrong delimiter. The file I was outputting was not only comma-delimited but also tab-delimited. I was only able to confirm this by using the Get Data tool in Excel and importing the data that way, after which I could see the tabs inside the cells.
After outputting with a pipe ( | ) delimiter, I was finally able to successfully load the file into BigQuery without any errors.
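To illustrate the first cause, here is a minimal sketch of the corrected IS_NULLABLE-to-MODE mapping when converting an SSMS schema export into a BigQuery JSON schema. The column definitions below are hypothetical; only the mapping rule comes from the answer above.

# IS_NULLABLE "YES" (column may be NULL)  -> BigQuery MODE "NULLABLE"
# IS_NULLABLE "NO"  (column is mandatory) -> BigQuery MODE "REQUIRED"
def to_bq_mode(is_nullable: str) -> str:
    return "NULLABLE" if is_nullable.strip().upper() == "YES" else "REQUIRED"

# Hypothetical columns as they might come out of an SSMS export
ssms_columns = [
    {"COLUMN_NAME": "id", "DATA_TYPE": "INTEGER", "IS_NULLABLE": "NO"},
    {"COLUMN_NAME": "comment", "DATA_TYPE": "STRING", "IS_NULLABLE": "YES"},
]

bq_schema = [
    {"name": c["COLUMN_NAME"], "type": c["DATA_TYPE"], "mode": to_bq_mode(c["IS_NULLABLE"])}
    for c in ssms_columns
]
print(bq_schema)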
I am trying to list all the tables in a database in Amazon Athena via a Python script.
Here is my script:
import pandas as pd

# conn is an existing Athena connection (its setup is not shown here)
data = {'name': ['database1', 'database-name', 'database2']}

# Create DataFrame
df = pd.DataFrame(data)

for index, schema in df.iterrows():
    tables_in_schema = pd.read_sql("SHOW TABLES IN " + schema[0], conn)
There is an error when running this.
When I run the same query in the Athena query editor, I get an error:
SHOW TABLES IN database-name
Here is the error
DatabaseError: Execution failed on sql: SHOW TABLES IN database-name
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line
1:19: mismatched input '-'. Expecting: '.', 'LIKE', <EOF>
unable to rollback
I think the issue is with the hyphen "-" in the database name.
How do I escape this in the query?
You can use the Glue client instead. It provides a function get_tables(), which returns a list of all the tables in a specific database.
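For example, here is a minimal sketch using boto3's Glue client (it assumes default AWS credentials and region are configured, and uses the database name from the question):

import boto3

glue = boto3.client("glue")

# get_tables is paginated, so iterate over all pages and collect the table names
tables = []
paginator = glue.get_paginator("get_tables")
for page in paginator.paginate(DatabaseName="database-name"):
    tables.extend(table["Name"] for table in page["TableList"])

print(tables)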
Database, table, or column names cannot contain any special character other than the underscore "_". Any other special character will cause an issue when querying. Athena does not stop you from creating an object with special characters in its name, but it will cause issues when you use that object.
The only way around this is to re-create the database with a name that does not contain the special character, the hyphen "-" in this case.
https://docs.aws.amazon.com/athena/latest/ug/tables-databases-columns-names.html
I am trying to do a basic data flow task in SSIS 2012 where the destination is a Netezza table. I have a column named "Transaction_ID" that I am planning to use as a primary key, and I have a sequence ready to populate it (Seq_Transaction_ID). However, I am not sure exactly how to assign the sequence.
For instance, I tried creating the table with the column defaulting to the next value of the sequence and got the error "you can only use a next value function within a target list". I'm also not sure how I would get the sequence into SSIS.
Any ideas?
Thank you,
My source file is a pipe (|) delimited text file (.txt). I am trying to load the file into SQL Server 2012 using SSIS (SQL Server Data Tools 2012). I have three columns. Below is an example of how the data in the file looks.
I expected my package to fail since the file is pipe (|) delimited; instead, the package succeeds and, in the last row, the extra pipes end up inside the last column.
My question is: why isn't the package failing? I believe the data is corrupt because, going by the delimiter, it has more columns than expected.
If I want the package to fail when the number of delimiters is greater than the number of columns, what are my options?
You can tell what is happening if you look at the Advanced page of the Flat File Connection Manager. For all but the last field, the delimiter is '|'; for the last field it is CRLF.
So by design, all data between the last defined pipe and the end of the line (CRLF) is imported into your last field.
What I would do is add another column to the connection manager and to your staging table. Map the new 'TestColumn' in the destination. When the import is complete, you want to ensure that this column is NULL in every row; if not, throw an error.
You could use a script task instead, but this way you will not need to code in C# and you will not have to process the file twice. If you are comfortable coding a script task, or you cannot use a staging table with an extra column, then that is the only other route I can think of.
A suggestion for checking for NULL would be to use an Execute SQL Task with a single-row result set mapped to an integer. If the value is > 0, then fail the package.
The query would be Select Count(*) NotNullCount From Table Where TestColumn Is Not Null.
You can write a script task that reads the file, counts the pipes, and raises an error if the number of pipes is not what you want.
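As a rough illustration of that check (not an actual SSIS script task, which would be written in C# or VB.NET), here is the counting logic sketched in Python; the file name and the expected delimiter count are assumptions based on the question:

EXPECTED_PIPES = 2  # three columns -> two '|' delimiters per row

# Hypothetical path to the pipe-delimited source file
with open("source_file.txt") as f:
    for row_number, line in enumerate(f, start=1):
        pipes = line.rstrip("\r\n").count("|")
        if pipes != EXPECTED_PIPES:
            raise ValueError(
                f"Row {row_number}: expected {EXPECTED_PIPES} delimiters, found {pipes}"
            )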