find maximum difference in a postgres row - sql

I have a table called compression that lists the initial size and compressed size of each item.
I'd like a query that shows me the best compression stored so far, something like:
select max(
cast(uncompressed_size as int) -
cast(compressed_size as int)
)
from compression
The problem is, this code won't execute because of this error:
ERROR: Invalid digit, Value '2', Pos 0, Type: Integer
Detail:
-----------------------------------------------
error: Invalid digit, Value '2', Pos 0, Type: Integer
code: 1207
context:
query: 362794
location: :0
process: query0_21 [pid=0]
-----------------------------------------------
Something must be going on with the casting, but I'm not sure what is causing this.
It's a postgres database (technically amazon redshift), and I'm really confused why an operation like this might fail.

Check the values in these character columns for leading or trailing spaces and trim them. An empty string also fails the cast to integer, so wrap the column in NULLIF(compressed_size, '') first; casting the resulting NULL succeeds.
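The cleanup described above can be sketched as follows. This is a minimal illustration, not the asker's actual setup: it uses Python's built-in sqlite3 as a stand-in for Redshift, and the sample rows are made up. SQLite is lenient about casts (it would silently turn '' into 0 rather than raise an error), so the sketch only shows what the TRIM and NULLIF wrapping does to the result.

```python
import sqlite3

# Hypothetical sample rows; sizes are stored as text, as in the question.
rows = [
    ("1000", "400"),
    (" 2000 ", "500"),  # padded with spaces
    ("", "10"),         # empty uncompressed_size
    ("5000", ""),       # empty compressed_size
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE compression (uncompressed_size TEXT, compressed_size TEXT)"
)
conn.executemany("INSERT INTO compression VALUES (?, ?)", rows)

# TRIM strips the padding, NULLIF turns '' into NULL, and casting NULL
# yields NULL, so rows with a missing size drop out of MAX.
query = """
SELECT MAX(
    CAST(NULLIF(TRIM(uncompressed_size), '') AS INTEGER) -
    CAST(NULLIF(TRIM(compressed_size), '') AS INTEGER)
)
FROM compression
"""
best_saving = conn.execute(query).fetchone()[0]
print(best_saving)
```

With these rows the two incomplete records are ignored, so the best saving comes from the padded row.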

Related

Invalid digits on Redshift

I'm trying to load some data from stage to relational environment and something is happening I can't figure out.
I'm trying to run the following query:
SELECT
CAST(SPLIT_PART(some_field,'_',2) AS BIGINT) cmt_par
FROM
public.some_table;
The some_field is a column that has data with two numbers joined by an underscore like this:
some_field -> 38972691802309_48937927428392
And I'm trying to get the second part.
That said, here is the error I'm getting:
[Amazon](500310) Invalid operation: Invalid digit, Value '1', Pos 0,
Type: Long
Details:
-----------------------------------------------
error: Invalid digit, Value '1', Pos 0, Type: Long
code: 1207
context:
query: 1097254
location: :0
process: query0_99 [pid=0]
-----------------------------------------------;
Execution time: 2.61s
Statement 1 of 1 finished
1 statement failed.
It's literally saying some digits are not valid digits. I've already looked at the exact data that throws the error, and it appears to be a normal field, as I expected. It happens even if I filter out NULL fields.
I thought it might be an encoding error, but I haven't found any references on how to solve that.
Does anyone have any idea?
Thanks everybody.
I just ran into this problem and did some digging. It seems the Value '1' in the error is the misleading part, and the real problem is that some of these fields are simply not valid numbers.
In my case they were empty strings. I found the solution to my problem in this blogpost, which is essentially to find any fields that aren't numeric and replace them with null before casting.
select cast(colname as integer) from
(select
case when colname ~ '^[0-9]+$' then colname
else null
end as colname
from tablename);
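The same "null out anything that isn't all digits, then cast" logic can be sketched outside the database, for example to pre-check values before loading them. This is only an illustration in Python; the helper name to_int_or_none is made up here, and it mirrors the regex '^[0-9]+$' from the query above.

```python
import re

def to_int_or_none(text):
    """Return int(text) if text consists only of ASCII digits, else None."""
    # fullmatch anchors at both ends, like ^...$ in the SQL pattern.
    if text is not None and re.fullmatch(r"[0-9]+", text):
        return int(text)
    return None

for value in ("48937927428392", "", " 12", "12a", None):
    print(repr(value), "->", to_int_or_none(value))
```

Empty strings, padded values, and anything with a stray character all come back as None instead of raising an error.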
Bottom line: this Redshift error is completely confusing and really needs to be fixed.
This can also happen when you use a Glue job to upsert data from a data source into Redshift:
Glue may rearrange the columns before copying, which can cause this error. It happened to me even after using apply-mapping.
In my case, the datatype was not an issue at all. In the source, the fields were typecast to exactly match the fields in Redshift.
Glue was rearranging the columns by the alphabetical order of their names and then copying the data into the Redshift table (which obviously throws an error, because my first column is an ID key, not a string column like the others).
To fix the issue, I used a SQL query within Glue to run a select command with the columns in the same order as in the table.
It's odd that Glue did that even after using apply-mapping, but the workaround helped.
For example: the source table has fields ID|EMAIL|NAME with values 1|abcd#gmail.com|abcd, and the target table has fields ID|EMAIL|NAME. When Glue upserts the data, it rearranges the data by column name before writing, so it tries to write abcd#gmail.com|1|abcd into ID|EMAIL|NAME. This throws an error because ID expects an int value and EMAIL expects a string. I added a SQL query transform using the query "SELECT ID, EMAIL, NAME FROM data" to put the columns back in order before writing the data.
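To make the ordering problem concrete, here is a small Python sketch using the example record above. It does not call any Glue API; the record, the TARGET_COLUMNS list, and the helper are hypothetical and only show how alphabetical ordering shifts the values relative to the target table.

```python
# Column order of the target table (taken from the example above).
TARGET_COLUMNS = ["ID", "EMAIL", "NAME"]

def values_in_order(record, columns):
    """Return the record's values in the given column order."""
    return [record[name] for name in columns]

record = {"ID": 1, "EMAIL": "abcd#gmail.com", "NAME": "abcd"}

# Sorting the column names alphabetically puts EMAIL before ID,
# so the email string would land in the integer ID column.
alphabetical = values_in_order(record, sorted(record))

# Selecting the columns explicitly keeps values aligned with the target.
explicit = values_in_order(record, TARGET_COLUMNS)

print(alphabetical)
print(explicit)
```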
Hmmm. I would start by investigating the problem. Are there any non-digit characters?
SELECT some_field
FROM public.some_table
WHERE SPLIT_PART(some_field, '_', 2) ~ '[^0-9]';
Is the value too long for a bigint?
SELECT some_field
FROM public.some_table
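WHERE LEN(SPLIT_PART(some_field, '_', 2)) > 18;
A bigint holds at most 19 digits (its maximum is 9223372036854775807), so anything longer than 18 digits may not fit. If you need more digits than that, consider a decimal rather than a bigint.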
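As a cross-check on the range question, a signed 64-bit bigint tops out at 9223372036854775807, which is 19 digits. The following Python sketch tests whether a digit string would fit; the function name fits_bigint is made up for illustration.

```python
import re

BIGINT_MAX = 2**63 - 1  # 9223372036854775807, the largest signed 64-bit value

def fits_bigint(digits):
    """True if digits is a non-empty run of ASCII digits within bigint range."""
    if re.fullmatch(r"[0-9]+", digits) is None:
        return False
    return int(digits) <= BIGINT_MAX

for candidate in ("48937927428392", "9223372036854775807", "9223372036854775808"):
    print(candidate, fits_bigint(candidate))
```

The sample value from the question is far below the limit, so an overflow is unlikely to be the cause there.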
If you get an error message like “Invalid digit, Value ‘O’, Pos 0, Type: Integer”, try running your COPY command without the header row. Use the IGNOREHEADER parameter in your COPY command to skip the first line of the data file.
So the COPY command will look like below:
COPY orders FROM 's3://sourcedatainorig/order.txt' credentials 'aws_access_key_id=<your access key id>;aws_secret_access_key=<your secret key>' delimiter '\t' IGNOREHEADER 1;
For my Redshift SQL, I had to wrap my columns with Cast(col As Datatype) to make this error go away.
For example, setting my columns datatype to Char with a specific length worked:
Cast(COLUMN1 As Char(xx)) = Cast(COLUMN2 As Char(xxx))

ORA-01722: invalid number in column with numbers only?

I created a rather simple view to return only rows where there is a number in the CONTRACT_ID column. CONTRACT_ID has data type number(8).
CREATE OR REPLACE VIEW cid AS
SELECT *
FROM transactions
WHERE contract_id IS NOT NULL
AND LENGTH(contract_id) > 0;
View works just fine until I scroll down to row ~2950 where I get ORA-01722. Same thing happens if I want to export data to Excel, my file gets only ~2950 rows instead of expected ~20k.
Any idea what might be causing this and how to resolve this issue?
Many thanks!
You wrote too much SQL. The following will provide all the results you require:
CREATE OR REPLACE VIEW cid AS
SELECT *
FROM transactions
WHERE contract_id IS NOT NULL
You don't need to call LENGTH() on a number: a number is either null or it has a value, so this kind of check is unnecessary.
Passing a number to LENGTH() turns it into a string first, i.e. LENGTH(TO_CHAR(numbercolumn)). You don't even need a LENGTH() check for null strings: to Oracle, a NULL string and a zero-length string are equivalent, and calling LENGTH() on an empty string or a null returns null, not 0. So LENGTH(myNullStr) = 0 doesn't work; it's not comparing 0 = 0, it's comparing null = 0, and a comparison with null never evaluates to true.
The only time this seems to cause confusion is when the string columns in the table are CHAR types rather than VARCHAR types, and people forget that assigning an empty string to a CHAR pads it with spaces out to the CHAR length, so it is no longer a zero-length string.
First of all, you should remove the redundant length() condition; it serves no purpose. I'm not sure how it could produce this error, but check whether the error disappears once it's gone.
If not, replace the star (*) with specific field names, say, contract_id. If that fixes the error, it points to the error source being somewhere in the removed fields (say, if a generated column is used).
I can't imagine how the error could survive after that, but if it does, I'd try moving the table into another tablespace and adding to the field list a call to a logging function that stores the rowids of the rows read, to check which row produces the error.

ERROR: operator does not exist: numeric ~* unknown

I need to create a domain in PostgreSQL for a price. The price must be NUMERIC(9,2), where 9 is the precision and 2 is the scale. When trying to create the domain, I get:
ERROR: operator does not exist: numeric ~* unknown
Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
QUERY:
CREATE DOMAIN d_price AS NUMERIC(9, 2) NOT NULL
CONSTRAINT Product_price_can_contain_only_double_precision_value
CHECK(VALUE ~*'^(([[:digit:]])+\.([[:digit:]]){2})$');
You need your numeric value as a string before you can use a string operator. Change VALUE to CAST(VALUE AS TEXT).
Your CHECK constraint is nonsensical, because it applies after the value has already been converted to NUMERIC by the database engine's number parser.
VALUE ~*'^(([[:digit:]])+\.([[:digit:]]){2})$'
appears to say "one or more leading digits, a period, and exactly two trailing digits". You can't do that check in any useful way once the number has already been parsed. Observe:
regress=> SELECT NUMERIC(18,2) '1', NUMERIC(18,2) '1.12345';
numeric | numeric
---------+---------
1.00 | 1.12
(1 row)
No matter what the input is, if it fits inside the NUMERIC you've specified, it'll be padded or rounded to the declared scale. If it doesn't fit the NUMERIC size you've given, it'll produce an error before your CHECK constraint ever runs.
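The effect can be imitated outside the database. The sketch below uses Python's decimal module as a rough stand-in for a two-decimal NUMERIC column (an assumption for illustration, not how PostgreSQL is implemented): once a non-negative value has been fitted to two decimal places, its text form always matches the "digits, period, two digits" pattern, so the check can no longer reject anything. The helper name as_numeric_2 is made up.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

def as_numeric_2(text):
    """Parse text and fit it to two decimal places, like a scale-2 numeric."""
    return Decimal(text).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# The same shape the CHECK constraint tries to enforce.
PATTERN = re.compile(r"[0-9]+\.[0-9]{2}")

for raw in ("1", "1.12345", "7.5"):
    stored = as_numeric_2(raw)
    print(raw, "->", stored, bool(PATTERN.fullmatch(str(stored))))
```

Every stored value passes the pattern, whatever the original input looked like.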

PostgreSQL sum typecasting as a bigint

I am doing the sum() of an integer column and I want to typecast the result to be a bigint - to avoid an error. However when I try to use sum(myvalue)::bigint it still gives me an out of range error.
Is there anything that I can do to the query to get this to work? Or do I have to change the column type to a bigint?
The current manual is more explicit than it used to be in 2013:
sum ( integer ) → bigint
If your column myvalue indeed has the type integer like you say, then the result is bigint anyway, and the added cast in sum(myvalue)::bigint is just noise.
Either way, to get an "out of range" error, the result would have to be bigger than what bigint can hold:
-9223372036854775808 to +9223372036854775807
One would have to aggregate a huge number of big integer values (>= 2^32 * 2^31). If so, cast the base column to bigint, thereby forcing the result to be numeric:
SELECT sum(myvalue::int8) ...
The more likely explanation is that your column has, in fact, a different data type, or the error originates from something else. Not enough detail in the question.
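To get a feel for how unlikely a genuine bigint overflow is when summing an integer column, here is a quick back-of-the-envelope calculation in Python. It assumes the worst case, where every row holds the largest possible integer value.

```python
INT_MAX = 2**31 - 1     # largest value of a 4-byte integer column
BIGINT_MAX = 2**63 - 1  # largest value a bigint sum can hold

# Smallest number of rows, each holding INT_MAX, whose sum exceeds BIGINT_MAX.
rows_needed = BIGINT_MAX // INT_MAX + 1

print(f"{rows_needed:,} rows of {INT_MAX:,} are needed to overflow bigint")
```

That is more than four billion rows at the maximum value, which is why a different column type or another source for the error is the more plausible explanation.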
I solved my problem using following statement
SUM(CAST(gross_amount AS Integer))
This gives the result of the column sum as a bigint.
Note: My column gross_amount was of type double.
You need to cast it before doing the operation:
SUM(myvalue::bigint)

Error converting data type varchar to float

Searched and searched on SO and can't figure it out
Tried CASTING each field as FLOAT to no avail, convert didn't get me any further
How can I get the below case clause to return the value stated in the THEN section?
Error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type varchar to float.
The section of my SQL query that causes the error:
When cust_trendd_w_costsv.terms_code like '%[%]%' and (prod.dbo.BTYS2012.average_days_pay) - (substring(cust_trendd_w_costsv.terms_code,3,2)) <= 5 THEN prod.dbo.cust_trendd_w_costsv.terms_code
average_days_pay = float
terms_code = char
Cheers!
Try to use ISNUMERIC to handle strings which can't be converted:
When cust_trendd_w_costsv.terms_code like '%[%]%'
and (prod.dbo.BTYS2012.average_days_pay) -
(case when isnumeric(substring(cust_trendd_w_costsv.terms_code,3,2))=1
then cast(substring(cust_trendd_w_costsv.terms_code,3,2) as float)
else 0
end)
<= 5 THEN prod.dbo.cust_trendd_w_costsv.terms_code
The issue that you're having is that you're specifically searching for strings that contain a % character, and then converting them (implicitly or explicitly) to float.
But strings containing % signs can't be converted to float whilst they still have a % in them. This also produces an error:
select CONVERT(float,'12.5%')
If you're wanting to convert to float, you'll need to remove the % sign first, something like:
CONVERT(float,REPLACE(terms_code,'%',''))
will just eliminate it. I'm not sure if there are any other characters in your terms_code column that may also trip it up.
You also need to be aware that SQL Server can quite aggressively re-order operations and so may attempt the above conversion on other strings in terms_code, even those not containing %. If that's the source of your error, then you need to prevent this aggressive re-ordering. Provided there are no aggregates involved, a CASE expression can usually avoid the worst of the issues: make sure that all strings that you don't want to deal with are eliminated by earlier WHEN clauses before you attempt your conversion.
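The percent-sign problem described above is not specific to SQL Server; most number parsers reject a trailing %. As a small illustration in Python (the helper name percent_to_float is made up), stripping the sign before converting is the equivalent of the REPLACE call:

```python
def percent_to_float(text):
    """Remove any % characters, then convert the remainder to a float."""
    return float(text.replace("%", ""))

try:
    float("12.5%")  # fails, like CONVERT(float, '12.5%')
except ValueError as exc:
    print("direct conversion failed:", exc)

print(percent_to_float("12.5%"))
```

Any other non-numeric character left in the string would still make the conversion fail, so it is worth checking what else the column contains.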
If you are sure that the substring part returns a numeric value, you can cast the substring(....) to float:
.....and (prod.dbo.BTYS2012.average_days_pay) - CAST(substring(cust_trendd_w_costsv.terms_code,3,2) as float) <= 5 ....