I have a staging table with around 200 columns in Redshift. I first copy data from S3 to this table and then copy data from this table to another table using a large insert into select from query. Most of the fields in staging table are varchar, which I convert to the proper datatype in the query.
Some field in the staging table is causing a numeric overflow:
org.postgresql.util.PSQLException: ERROR: Numeric data overflow (addition)
Detail:
-----------------------------------------------
error: Numeric data overflow (addition)
code: 1058
context:
query: 9620240
location: numeric.hpp:112
process: query1_194 [pid=680]
How can I find which field is causing this overflow, so that I can sanitize my input or correct my query?
I use Netezza, which can also use regex functions to grep out rows. Fortunately, Redshift supports regular expressions as well. Please take a look at
http://docs.aws.amazon.com/redshift/latest/dg/REGEXP_COUNT.html
The idea in your case is to use a regular expression in the WHERE clause to find the values that exceed the numeric type they are cast to during the insert. The harder part will be finding identifying data that lets you determine which rows in the physical file are causing the issue. You could create another copy of the data in a temporary table with row numbers added, and use that temporary table as the source for your analysis. How large is the numeric field you are inserting into? You may need to run this analysis against more than one column if several columns are being cast to numeric.
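As a rough sketch of that regex idea, the Python helper below generates one probe query per column. The table name `staging`, the column names, and the target precisions are hypothetical, and the pattern assumes plain decimal strings (no exponents, no padding) and that Redshift's regex dialect accepts `{m,n}` intervals. It only flags values that do not look like they fit the target type, so treat its hits as candidates rather than proof.

```python
import re

def numeric_pattern(precision, scale=0):
    """Regex for strings that fit NUMERIC(precision, scale) without overflow."""
    int_digits = precision - scale
    # [.] is used instead of \. to avoid backslash escaping inside SQL literals.
    frac = f"([.][0-9]{{0,{scale}}})?" if scale else ""
    return f"^-?[0-9]{{1,{int_digits}}}{frac}$"

def overflow_probe(table, column, precision, scale=0):
    """Build a query counting non-null values that do not match the pattern."""
    pattern = numeric_pattern(precision, scale)
    return (
        f"SELECT '{column}' AS column_name, COUNT(*) AS bad_rows "
        f"FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"AND REGEXP_COUNT({column}, '{pattern}') = 0"
    )

# Hypothetical staging columns mapped to (precision, scale) of their targets.
targets = {"amount": (10, 2), "quantity": (9, 0)}
for col, (p, s) in targets.items():
    print(overflow_probe("staging", col, p, s))
```

Running each generated query separately (or joining them with UNION ALL) shows which columns have a non-zero `bad_rows` count; those are the ones to inspect first.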
I am trying to insert data from a staging table into the master table. The table has nearly 300 columns, and is a mix of data-typed Varchars, Integers, Decimals, Dates, etc.
Snowflake gives the unhelpful error message of "Numeric value '' is not recognized"
I have gone through and cut out various parts of the query to try and isolate where it is coming from. After several hours and cutting every column, it is still happening.
Does anyone know of a Snowflake diagnostic query (like Redshift has) which can tell me a specific column where the issue is occurring?
Unfortunately not at the point you're at. If you went back to the COPY INTO that loaded the data, you'd be able to use the VALIDATE() function to get better information, down to the record and byte-offset level.
I would query your staging table for just the numeric fields and look for blanks, or you can wrap all of the fields destined for numeric columns in try_to_number() calls. A bit tedious, but it might not be too bad if you don't have a lot of numbers.
https://docs.snowflake.com/en/sql-reference/functions/try_to_decimal.html
As a note, when you stage, you should try and use the NULL_IF options to get rid of bad characters and/or try to load them into stage using the actual datatypes in your stage table, so you can leverage the VALIDATE() function to make sure the data types are correct before loading into Snowflake.
Query your staging table using try_to_number() and/or try_to_decimal() for the number and decimal fields, then use MINUS to get the difference:
Select $1,$2,...$300 from #stage
minus
Select $1,try_to_number($2)...$300 from #stage
If any number field holds a string that cannot be converted, it becomes NULL, and the MINUS should return the rows that have a problem. Once you have those rows, analyze the columns in the result set for errors.
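With 300 columns, wrapping each field by hand is tedious, so one option is to generate the diagnostic query. The sketch below is only an illustration: the table name `stage_table` and the column names are hypothetical, and it assumes the staged columns are strings, since TRY_TO_NUMBER expects string input. It counts, per column, the non-null values that TRY_TO_NUMBER cannot parse.

```python
def conversion_failures_query(table, numeric_columns):
    """Build one query with a per-column count of unparseable values."""
    counts = [
        # A non-null input that converts to NULL is a value that failed to parse.
        f"COUNT_IF({col} IS NOT NULL AND TRY_TO_NUMBER({col}) IS NULL) AS {col}_bad"
        for col in numeric_columns
    ]
    return "SELECT\n  " + ",\n  ".join(counts) + f"\nFROM {table}"

# Hypothetical columns that are destined for numeric fields in the master table.
print(conversion_failures_query("stage_table", ["price", "qty"]))
```

Any column whose `_bad` count is above zero holds at least one value (such as an empty string) that would trigger the "is not recognized" error on insert.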
I have an integer type column in my BigQuery table and now I need to convert it to a float column. I also have to keep all records. What I want to do is change the column type, not cast it.
I've read that it's possible to do it just by exporting results of a query on a table to itself.
How do I do it?
Using SELECT with writing result back to table
SELECT
CAST(int_field AS FLOAT) AS float_field,
<all_other_fields>
FROM YourTable
This approach will cost you a scan of the whole table.
To execute this, use the Show Options button in the BQ Web UI and set the options so that the query result is written back to the same table. After you run this, your table will have that column with the float data type instead of the original integer, as you wanted. Note: you should use the same field name for both int_field and float_field.
If you just needed to add a new column, I would point you to the Tables: patch GBQ API.
I am not sure whether this API allows changing the type of a column (I doubt it), but it is easy to check.
Using Jobs: insert EXTRACT and then LOAD
Here you can extract the table to GCS and then load it back into GBQ with an adjusted schema.
The above approach will a) eliminate the cost of querying (scanning) the table and b) help with limitations you can run into if you have a complex schema (with records, repeated or nested fields, etc.).
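For the SELECT approach, the `<all_other_fields>` placeholder has to be expanded by hand, which gets error-prone on wide tables. A small sketch of generating the statement is shown below; the table and column names are hypothetical, and the `FLOAT` type name is taken from the query above (adjust it if your SQL dialect names the type differently).

```python
def cast_select(table, columns, column_to_cast, new_type="FLOAT"):
    """Build a SELECT that casts one column and passes the others through."""
    select_list = [
        # Keep the original name as the alias so the schema only changes in type.
        f"CAST({c} AS {new_type}) AS {c}" if c == column_to_cast else c
        for c in columns
    ]
    return "SELECT\n  " + ",\n  ".join(select_list) + f"\nFROM {table}"

# Hypothetical table with three columns; only "amount" changes type.
print(cast_select("YourTable", ["id", "amount", "label"], "amount"))
```

The generated query keeps the column order and names intact, so writing its result back over the table changes nothing except the type of the cast column.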
In my SQL, I need to compare data between two tables in SQL Server 2008 R2, returning the rows where a mismatch is present (using EXCEPT) and, likewise, the matching rows in other cases (using INTERSECT). The problem is that some of the columns have the NTEXT data type, and SQL Server raises an error when tables with NTEXT columns are compared this way.
Example:
SELECT * FROM table_pre
EXCEPT
SELECT * FROM table_post
The above operation gives an error -
'The ntext data type cannot be selected as DISTINCT because it is not comparable.'
I believe that tables (table_pre, table_post) have at least one column of datatype = NTEXT that is causing the comparison to fail.
Question -
1. Is there some way to exclude these NTEXT columns from the above comparison, without me having to explicitly list out the column names and excluding the problem column? There's a large number of columns involved and explicitly listing is not easy.
2. Can I just explicitly cast/convert the NTEXT column alone to say VARCHAR, and still go by not having to list down the rest of the columns?
3. Or, in general, can I somehow exclude certain columns by listing those out during such comparisons?
Any suggestions, really appreciated! Thanks.
Question - 1. Is there some way to exclude these NTEXT columns from the above comparison,
Yes, use explicitly the column names.
without me having to explicitly list out the column names and excluding the problem column?
Using * is a bad habit, you well deserve the error for abusing it.
There's a large number of columns involved and explicitly listing is not easy
It is actually trivial: build the statement dynamically.
Can I just explicitly cast/convert the NTEXT column alone to say VARCHAR
No. You have to convert to NVARCHAR, the N is very important. But, yes you can convert.
Or, in general, can I somehow exclude certain columns by listing those out during such comparisons
Fortunately no. SQL does not randomly decide what columns are or are not part of a result, so you get the predictability you desire.
So, in conclusion:
never use *
build complex statements dynamically. SELECT ... FROM sys.columns is your friend, you can easily build it in a few seconds
ditch the deprecated TEXT, NTEXT and IMAGE types
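To make the "build it dynamically" advice concrete, here is a hedged sketch that assembles the EXCEPT statement from a list of (column name, type name) pairs, such as one you might read from sys.columns joined to sys.types. The column list below is made up for illustration, and the helper names are not from any library. NTEXT columns are cast to NVARCHAR(MAX), as the answer recommends, so they become comparable.

```python
# Legacy LOB types mapped to comparable replacements.
LEGACY_CASTS = {"ntext": "NVARCHAR(MAX)", "text": "VARCHAR(MAX)"}

def comparable_select(table, columns):
    """Build an explicit column list, casting legacy types where needed."""
    parts = []
    for name, type_name in columns:
        cast_to = LEGACY_CASTS.get(type_name.lower())
        if cast_to:
            parts.append(f"CAST([{name}] AS {cast_to}) AS [{name}]")
        else:
            parts.append(f"[{name}]")
    return f"SELECT {', '.join(parts)} FROM {table}"

def except_query(pre, post, columns):
    """Rows present in `pre` but not in `post`, comparing all listed columns."""
    return (
        comparable_select(pre, columns)
        + "\nEXCEPT\n"
        + comparable_select(post, columns)
    )

# Hypothetical column metadata for table_pre / table_post.
cols = [("id", "int"), ("notes", "ntext")]
print(except_query("table_pre", "table_post", cols))
```

Swapping the "EXCEPT" keyword for "INTERSECT" in the same template gives the matching-rows variant, and dropping a pair from the list excludes that column from the comparison.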
Can you please help me with this matter (I didn't find it in the Teradata documentation, which is honestly a little overwhelming): my table had this column -BAN DECIMAL(9,0)-, and now I want to change it to - BAN DECIMAL(15,0) COMPRESS 0.- How can I do it? What does the COMPRESS constraint 0, or any other, mean anyway?
I hope this is possible and I don't have to create a new table and then copy the data from the old table. The table is very, very big: when I do COUNT(*) from that table I get this error: 2616 numeric overflow occurred during computation
The syntax diagram for ALTER TABLE doesn't seem to support directly changing a column's data type. (Teradata SQL DDL Documentation). COMPRESS 0 compresses zeroes. Teradata supports a lot of different kinds of compression.
Numeric overflow here probably means you've exceeded the range of an integer. To make that part work, just try casting to a bigger data type. (You don't need to change the column's data type to do this.)
select cast(count(*) as bigint)
from table_name;
You asked three different questions:
You cannot change the data type of a column from DECIMAL(9,0) to DECIMAL(15,0). Your best bet would be to create a new column (NEW_BAN), assign values from your old column, drop the old column and rename NEW_BAN back to BAN.
COMPRESS 0 is not a constraint. It means that values of "zero" are compressed from the table, saving disk space.
Your COUNT(*) is returning that error because the table has more than 2,147,483,647 rows (the max value of an INTEGER). Cast the result as BIGINT (as shown by Catcall).
And I agree, the documentation can be overwhelming. But be patient and focus only on the SQL titles for your exact release. They really are well written.
You cannot use ALTER TABLE to change the data type from DECIMAL(9,0) to DECIMAL(15,0) because it crosses the byte boundary required to store the values in the table. For Teradata 13.10, see the Teradata manual for SQL Data Definition Language Detailed Topics pages 61-65 for more details on using ALTER TABLE to change column data types.
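The byte boundary and the INTEGER limit can be illustrated with a little arithmetic. The storage sizes in the sketch below are an assumption based on my recollection of how Teradata sizes DECIMAL columns (1, 2, 4, 8 or 16 bytes depending on the number of digits); please check them against the manual for your release before relying on them.

```python
# Assumed Teradata DECIMAL storage: (max digits, bytes). Verify for your release.
DECIMAL_STORAGE = ((2, 1), (4, 2), (9, 4), (18, 8), (38, 16))

def decimal_storage_bytes(precision):
    """Return the assumed storage size in bytes for DECIMAL(precision, s)."""
    for max_digits, size in DECIMAL_STORAGE:
        if precision <= max_digits:
            return size
    raise ValueError(f"unsupported precision: {precision}")

def crosses_byte_boundary(old_precision, new_precision):
    """True when the two precisions need different storage sizes."""
    return decimal_storage_bytes(old_precision) != decimal_storage_bytes(new_precision)

# Largest value a 32-bit signed INTEGER can hold, the limit COUNT(*) ran into.
INTEGER_MAX = 2**31 - 1

print(crosses_byte_boundary(9, 15))   # DECIMAL(9,0) -> DECIMAL(15,0)
print(INTEGER_MAX)
```

Under these assumed sizes, DECIMAL(9,0) and DECIMAL(15,0) fall into different storage classes, which matches the explanation above for why the change cannot be done in place.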