Invalid digits on Redshift - sql

I'm trying to load some data from stage to the relational environment, and something is happening that I can't figure out.
I'm trying to run the following query:
SELECT
CAST(SPLIT_PART(some_field,'_',2) AS BIGINT) cmt_par
FROM
public.some_table;
some_field is a column that holds two numbers joined by an underscore, like this:
some_field -> 38972691802309_48937927428392
And I'm trying to get the second part.
That said, here is the error I'm getting:
[Amazon](500310) Invalid operation: Invalid digit, Value '1', Pos 0,
Type: Long
Details:
-----------------------------------------------
error: Invalid digit, Value '1', Pos 0, Type: Long
code: 1207
context:
query: 1097254
location: :0
process: query0_99 [pid=0]
-----------------------------------------------;
Execution time: 2.61s
Statement 1 of 1 finished
1 statement failed.
It's literally saying some numbers are not valid digits. I've already tried to isolate the exact data that is throwing the error, and it appears to be a normal value, just like I was expecting. It happens even if I throw out NULL fields.
I thought it might be an encoding error, but I haven't found any references to support that.
Does anyone have any idea?
Thanks everybody.

I just ran into this problem and did some digging. It seems the error Value '1' is the misleading part; the problem is actually that these fields are simply not valid as numerics.
In my case they were empty strings. I found the solution to my problem in a blog post, which is essentially to find any fields that aren't numeric and fill them with NULL before casting.
select cast(colname as integer)
from (
    select case
               when colname ~ '^[0-9]+$' then colname
               else null
           end as colname
    from tablename
) t;
Bottom line: this Redshift error is completely confusing and really needs to be fixed.
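Applied to the original query, the same guard might look like this (a sketch: the regex only admits plain digit strings, so empty strings and anything else non-numeric become NULL before the cast):

```sql
SELECT CAST(CASE
              WHEN SPLIT_PART(some_field, '_', 2) ~ '^[0-9]+$'
              THEN SPLIT_PART(some_field, '_', 2)
            END AS BIGINT) AS cmt_par
FROM public.some_table;
```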

When you are using a Glue job to upsert data from any data source to Redshift:
Glue may rearrange the columns before the copy, which can cause this issue. It happened to me even after using ApplyMapping.
In my case, the data types were not an issue at all; in the source they were typecast to exactly match the fields in Redshift.
Glue was rearranging the columns into alphabetical order by column name before copying the data into the Redshift table (which obviously throws an error, because my first column is an ID key, an integer, unlike the other string columns).
To fix the issue, I used a SQL query within Glue to run a SELECT with the correct order of the columns in the table.
It's weird that Glue did that even after using ApplyMapping, but the workaround helped.
For example: the source table has fields ID|EMAIL|NAME with values 1|abcd#gmail.com|abcd, and the target table has fields ID|EMAIL|NAME. But when Glue upserts the data, it rearranges the columns by name before writing, so it tries to write abcd#gmail.com|1|abcd into ID|EMAIL|NAME. This throws an error because ID expects an int value and EMAIL expects a string. I used a SQL query transform with the query "SELECT ID, EMAIL, NAME FROM data" to rearrange the columns before writing the data.

Hmmm. I would start by investigating the problem. Are there any non-digit characters?
SELECT some_field
FROM public.some_table
WHERE SPLIT_PART(some_field, '_', 2) ~ '[^0-9]';
Is the value too long for a bigint? A bigint tops out at 9223372036854775807, so anything longer than 18 digits may not fit:
SELECT some_field
FROM public.some_table
WHERE LEN(SPLIT_PART(some_field, '_', 2)) > 18;
If you need more than 18 digits of precision, consider a decimal rather than a bigint.
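For example, a sketch of the original query using DECIMAL instead (Redshift's DECIMAL/NUMERIC type supports up to 38 digits of precision):

```sql
SELECT CAST(SPLIT_PART(some_field, '_', 2) AS DECIMAL(38, 0)) AS cmt_par
FROM public.some_table;
```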

If you get an error message like “Invalid digit, Value ‘O’, Pos 0, Type: Integer”, try executing your COPY command without the header row. Use the IGNOREHEADER parameter in your COPY command to skip the first line of the data file.
The COPY command will then look like this:
COPY orders FROM 's3://sourcedatainorig/order.txt' credentials 'aws_access_key_id=<your access key id>;aws_secret_access_key=<your secret key>' delimiter '\t' IGNOREHEADER 1;

For my Redshift SQL, I had to wrap my columns with Cast(col As Datatype) to make this error go away.
For example, casting my columns to Char with a specific length worked:
Cast(COLUMN1 As Char(xx)) = Cast(COLUMN2 As Char(xxx))

Related

oracle add column with value based on condition

I would like to add a column "tag" based on the value of "LEASE_ID_count" with Oracle.
But I get this error:
value too large for column "CUSTOM_LIFETIME_VALUE_TAG"."tag" (actual: 7, maximum: 3), caused by: OracleDatabaseException: ORA-12899: value too large for column "CUSTOM_LIFETIME_VALUE_TAG"."tag" (actual: 7, maximum: 3)
select "COMPANY_CODE", "LEASE_ID_count",
(CASE WHEN "LEASE_ID_count" IN ('3', '4', '5') THEN '3 à 5vh'
      WHEN "LEASE_ID_count" = '1' THEN '1vh'
      WHEN "LEASE_ID_count" = '2' THEN '2vh'
 END) "tag"
from "CUSTOM_LIFETIME_VALUE_TESR"
Any idea to help me, please? Thanks.
This is too long for a comment. The error message refers to "CUSTOM_LIFETIME_VALUE_TAG"."tag", a table that has no obvious reference in the query. Perhaps CUSTOM_LIFETIME_VALUE_TESR is a view that references that table; that is possible.
However, the error message is about storing data into that column, not referencing it. So my best guess is that you have a query like this:
INSERT INTO CUSTOM_LIFETIME_VALUE_TAG (COMPANY_CODE, LEASE_ID_count, tag)
<your select here>;
And the column tag in this table is defined as 3 characters. Clearly, '3 à 5vh' has 7 characters, which is more than 3. Hence the error.
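If that is the case, one fix is to widen the column so the longest label fits. A sketch, assuming the table and column names from the error message (sizing in CHAR units, since 'à' can occupy more than one byte):

```sql
ALTER TABLE CUSTOM_LIFETIME_VALUE_TAG MODIFY (tag VARCHAR2(10 CHAR));
```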
Oracle does have a lot of functionality lurking around. Even so, it is hard for me to think of how a SELECT could cause this error with no DML involved.
As Alex Poole very correctly notes: write the queries without double quotes. Quoted identifiers just make queries harder to write and read.

How do I escape reserved words used as column names on HIVE?

When I execute the following query in Hive, it gives me the current date instead of the column values from the USER_INFO table.
SELECT CURRENT_DATE
FROM USER_INFO
LIMIT 1;
How do I escape reserved words used as column names on HIVE?
Thanks & Regards,
Kamlesh
Got it.
There are two ways
1. You can use a backtick before and after the keyword, as shown below.
SELECT CURRENT_DATE
FROM `USER_INFO`
LIMIT 1;
2. There is a setting, shown below, that you can use if you are executing it from a Unix shell (i.e. with the sh command):
hive.support.sql11.reserved.keywords=false
Hope this helps.
Thanks & Regards,
Kamleshkumar Gujarathi
In order to query reserved words used as columns (for example "users" or "current_date"):
SELECT `CURRENT_DATE`
FROM USER_INFO
LIMIT 1;
Notice the quote character ` (as Wheezil points out, this character is a 'backtick'; I want to point out that on international keyboards this character is relatively hard to type).
The self-answer is OK but misleading due to the wrong query (the backticks belong around the reserved column name, not the table name); adding this answer here for future reference.
I am facing the same issue with the keyword "application". Even a backtick is not working.
select application
from abc.def
where transaction is not null
and load_date = '20221115';
Error -
Error while compiling statement: FAILED: SemanticException [Error 10004]: line 1:7 Invalid table alias or column reference 'application': (possible column names are: date, transaction, status, direction, message_type, smi, medium, medium_type, aircraft, flight, callsign, dep_icao, arr_icao, application_str, from_to, machine, msg_length, msg_text, attached_files, load_date)
Any other solution?
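One thing worth checking: the list of possible column names in that error message includes application_str but not application, so this looks like a missing column rather than a reserved-word problem. A sketch, assuming application_str is the column actually intended:

```sql
select application_str
from abc.def
where transaction is not null
  and load_date = '20221115';
```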

Locate Cause of IBM DB2 CAST Failure

I need to work on an IBM DB2 database.
The LOCATION field is a CHARACTER(8) field of numbers.
To sort the table, the column is cast to an INTEGER:
SELECT LOCATION, PARTNO, INSTOCK
FROM INVENTORY
ORDER BY CAST(LOCATION AS INTEGER)
Currently, this fails with:
ERROR [22018] [IBM][DB2/AIX64] SQL0420N Invalid character found in a character string argument of the function "INTEGER".
Is there a quick way to determine which row is failing?
IBM's solution is to "Insure that the results set for the query item that the cast is being applied to does not contain non numeric SQL constants when casting to a numeric type."
That wasn't really helpful.
Thinking someone had inserted a letter O or a lowercase L, I tried this:
SELECT DISTINCT LOCATION
FROM LOCATIONS
WHERE LOCATION LIKE '%l%' OR LOCATION LIKE '%O%'
ORDER BY LOCATION
Zero records returned.
That wasn't really helpful.
That's IBM error messages and documentation in a nutshell.
One place to start is the TRANSLATE() function.
SELECT LOCATION, PARTNO, INSTOCK
FROM INVENTORY
WHERE TRANSLATE(LOCATION, '', ' 0123456789') <> ''
You can add other characters, like -, ., etc. depending on what you find.

find maximum difference in a postgres row

I have a table called compression that lists the initial size and compressed size of each item.
I'd like a query that shows me the best compression stored so far, something like:
select max(
cast(uncompressed_size as int) -
cast(compressed_size as int)
)
from compression
The problem is, this code won't execute because of this error:
ERROR: Invalid digit, Value '2', Pos 0, Type: Integer
Detail:
-----------------------------------------------
error: Invalid digit, Value '2', Pos 0, Type: Integer
code: 1207
context:
query: 362794
location: :0
process: query0_21 [pid=0]
-----------------------------------------------
Something must be going on with the casting, but I'm not sure what is causing this.
It's a Postgres database (technically Amazon Redshift), and I'm really confused why an operation like this might fail.
Check for any extra spaces at the beginning or end of these character field values, and try to trim them. If the value is an empty string, casting it to integer will fail, so you may want to use NULLIF(compressed_size, '') and cast the resulting NULL, which will succeed.
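Putting that together, a sketch of the original query with the empty-string guard (TRIM added in case of stray spaces; a NULL from NULLIF casts cleanly to NULL):

```sql
SELECT MAX(
         CAST(NULLIF(TRIM(uncompressed_size), '') AS INT) -
         CAST(NULLIF(TRIM(compressed_size), '') AS INT)
       )
FROM compression;
```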

Searching for a specific text value in a column in SQLite3

Suppose I have a table named 'Customer' with many columns, and I want to display all customers whose name ends with 'Thomas' (Lastname = 'Thomas'). The following query shows an empty result (no rows). It also doesn't show any error.
SELECT * FROM Customer WHERE Lastname = 'Thomas';
While executing the following query give me correct result.
SELECT * FROM Customer WHERE Lastname LIKE '%Thomas%';
I would like to know what the problem with my first query is. I am using sqlite3 with npm. Below is the result of the '.show' command (just in case the problem is with the config).
sqlite> .show
echo: off
explain: off
headers: on
mode: column
nullvalue: ""
output: stdout
separator: "|"
stats: off
width:
1. Use LIKE instead of =.
2. Trim, to ensure there aren't spaces messing things up.
So the query will be:
SELECT * FROM Customer WHERE trim(Lastname) LIKE 'Thomas';
Depending on your types, you probably don't need point 2 since, as can be read in the MySQL manual:
All MySQL collations are of type PADSPACE. This means that all CHAR and VARCHAR values in MySQL are compared without regard to any trailing spaces.
But point 1 could be the solution. Actually, if you want to avoid problems, you should compare strings with LIKE instead of =.
If you still have problems, you will probably have to use collations.
SELECT *
FROM t1
WHERE k LIKE _latin1 'Müller' COLLATE latin1_german2_ci; #using your real table collation
More information here. But specifically with 'Thomas' you shouldn't need it, since it hasn't got any special characters.