Create DOUBLE PRECISION column from VARCHAR that is 99% numbers... (but includes some strings) - sql

I use PostgreSQL. I have a varchar column RAW_RET, mostly numbers, and I'm trying to populate a new column RET of type DOUBLE PRECISION. The problem is that a few rows of RAW_RET contain text (e.g. the character 'B' instead of a number). Hence the following command:
update q_stock.daily set RET = cast(RAW_RET as double precision) ;
returns an error, invalid input syntax for type double precision: "B"
It seems like I need to do something like select only numeric rows, but I'm struggling at the moment to figure out how to do that...

For real numbers, you can use a regular expression:
update q_stock.daily
set RET = cast(RAW_RET as double precision)
where RAW_RET ~ '^[-]?[0-9]+\.?[0-9]*$';
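To see which strings the filter above lets through, here is a minimal Python sketch that applies the same pattern with the `re` module. The sample strings are made up for illustration. Note that the pattern is stricter than the cast: values such as '.5' or '1e5', which the cast would probably accept, are skipped.

```python
import re

# Same pattern as in the WHERE clause above: optional minus sign, at least
# one digit, an optional decimal point, then optional further digits.
NUMERIC = re.compile(r'^[-]?[0-9]+\.?[0-9]*$')

def looks_numeric(raw: str) -> bool:
    """Return True when the whole string matches the pattern."""
    return NUMERIC.match(raw) is not None

# Hypothetical sample values, not taken from the real table.
samples = ["1.5", "-3", "12.", "B", "", ".5", "1e5", " 7"]
accepted = [s for s in samples if looks_numeric(s)]
print(accepted)  # ['1.5', '-3', '12.']
```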

Related

Truncating in Presto when zeroes appears after the decimal point

I am trying to truncate a decimal that has two digits after the decimal point in Presto. The number should be displayed without the fractional part when that part is all zeroes, and in full when any digit from 1 to 9 appears after the decimal point. I have used the following query, but it does not do the job: I still end up with numbers that have zeroes after the decimal point.
select column1,case when right(cast(column1 as varchar),7)='.000000' then truncate(column1) else column1 end from table1;
Using varchar pads extra zeroes to the right and hence are the extra zeroes I have used in the above expression after the decimal point
Please let me know what has to be done to truncate the decimal only when its fractional part is all zeroes.
The thing is that truncate(x) → double returns x rounded to an integer by dropping the digits after the decimal point, but the result is still a double, not an integer. Displaying a double with or without non-significant zeroes is the job of the GUI: it either shows all of them or hides them. For example, Presto on Qubole does not display .000000 when there is nothing but zeroes after the dot. So the problem is probably in the tool you are using.
For example, this works fine in Presto on Qubole:
with mydata as (
select 123.00000 as figure union all
select 123.0123 )
select case when regexp_like(cast(figure as varchar),'\d+\.0+$') then truncate(figure) else figure end
from mydata
Result:
123.0123
123
But in your GUI it may not work the same way, because the value in the second line is not an integer; it is decimal(8,5). Wrap it in the typeof() function and you will see. The GUI decides how to display decimal(8,5).
You said:
Using varchar pads extra zeroes to the right and hence are the extra
zeroes I have used in the above expression after the decimal point
No, the result of your expression is not varchar; the varchar is implicitly converted to decimal or double. Check using typeof().
If you want it to work regardless of the tool you are using, convert to varchar and transform explicitly:
select case when regexp_like(cast(figure as varchar),'\d\.0+$') --all zeroes, change according to your requirements
then regexp_replace(cast(figure as varchar),'\.0+$','') --remove fractional part
else cast(figure as varchar) --we need same type in case
end as result
from mydata
This is guaranteed to work because the result is varchar and is displayed as is.
The whole expression can be simplified:
--remove .0+ if no 1-9 after dot:
select regexp_replace(cast(figure as varchar),'\.0+$','')
from mydata
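The effect of the `'\.0+$'` replacement can be checked outside Presto. The following is a minimal Python sketch that uses the same pattern with `re.sub`. The helper name `strip_zero_fraction` and the sample strings are assumptions for illustration, and Python's regex engine stands in for Presto's here.

```python
import re

def strip_zero_fraction(text: str) -> str:
    """Drop a fractional part that consists only of zeroes."""
    return re.sub(r'\.0+$', '', text)

# Hypothetical string renderings of decimal values.
for figure in ["123.00000", "123.01230", "123"]:
    print(figure, "->", strip_zero_fraction(figure))
```

Only the first value changes: a fraction such as `.01230` contains non-zero digits, so the pattern does not match and the string is left as is.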

Minimum length of double value in a column in Impala table

I am trying to find the minimum length of the values in a double column by taking the length of each value and running the min function on top of it.
This works well when the column is a string type, but the length function does not work on the double data type in Impala. Is there another way to do this?
min(length(columnname))
All double columns are 8 bytes, as explained in the documentation. LENGTH() is a string function and it doesn't really make sense on a numeric value (although you can convert to a string and then measure the length).
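The distinction between storage size and text length can be sketched in Python. This is only an illustration with made-up values: Python's `str()` rendering of a double is not necessarily the same as the string Impala would produce from a cast.

```python
import struct

# Hypothetical sample values standing in for the column contents.
values = [1.5, 22.25, 3.125]

# Every IEEE 754 double occupies 8 bytes, whatever its value.
storage_sizes = {len(struct.pack("d", v)) for v in values}

# A "length" only varies once each value is rendered as text.
text_lengths = [len(str(v)) for v in values]

print(storage_sizes)      # {8}
print(min(text_lengths))  # 3, from "1.5"
```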

MariaDB comparison between char and number

I am so confused by the following query
SELECT
'6217001180007179362' = 6217001180007179156 -- 1
, 6217001180007179362 = 6217001180007179156 ; -- 0
How come 「'6217001180007179362' = 6217001180007179156」 is 1?
My MariaDB version is 10.2.11-MariaDB.
When you enter a series of digits in MariaDB/MySQL, it is interpreted as a numeric constant. In your case, this does what you want.
When you enter a series of digits surrounded by single quotes, then it is interpreted as a string.
When you compare two numeric constants, there isn't a problem. The result is what you expect.
However, if one is a string, then the values are implicitly converted. They fall under this condition, as described in the documentation:
In all other cases, the arguments are compared as floating-point (real) numbers.
Your values have more precision than a floating-point number can represent, so the values look equal to MySQL.
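The loss of precision can be reproduced outside the database. The sketch below uses Python, whose `float` is an IEEE 754 double, on the assumption that the server's floating-point comparison behaves the same way for these two values.

```python
as_text = "6217001180007179362"
as_int = 6217001180007179156

# Compared as exact integers, the two values differ.
print(int(as_text) == as_int)           # False

# Converted to doubles (53 bits of precision), both round to the same
# representable value, so the comparison reports equality.
print(float(as_text) == float(as_int))  # True
```

At this magnitude adjacent doubles are 1024 apart, and the two integers differ by only 206.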

How to avoid PG::NumericValueOutOfRange when using sum function

I have a method like this:
def self.weighted_average(column)
  sql = "SUM(#{column} * market_cap) / SUM(market_cap) as weighted_average"
  Company.select(sql).to_a.first.weighted_average
end
When the column is a decimal, it returns a value without problem.
But when the column is integer, the method ends up with a PG::NumericValueOutOfRange error.
Should I change column type integer to decimal, or is there a way to get the result of sum without changing column type?
You can always turn your integer into a float.
def self.weighted_average(column)
  # Cast inside the SQL: column holds the column name, so calling to_f on it
  # would not convert the stored values.
  sql = "SUM(CAST(#{column} AS float) * market_cap) / SUM(market_cap) as weighted_average"
  Company.select(sql).to_a.first.weighted_average
end
You can cast the value so that it is always a decimal, so there is no need to change the column type:
sql = "SUM(#{column} * CAST(market_cap as decimal(53,8))) / SUM(CAST(market_cap as decimal(53,8))) as weighted_average"
P.S. I would go with changing the column type; it is more consistent.
I would suggest changing the data type to decimal. When SUM raises PG::NumericValueOutOfRange, it means that your data type is not sufficient. Changing it handles this scenario gracefully instead of relying on a workaround.
Postgres documentation says this about SUM() return type:
bigint for smallint or int arguments, numeric for bigint arguments,
otherwise the same as the argument data type
This means that you will somehow need to change the data type that you pass to SUM. You can do it in one of the following ways:
Alter table to change column datatype.
Cast column to other datatype in your method.
Create a view that casts all integer columns to numeric and use that in your method.
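Why a wider type helps can be sketched with plain range checks. The Python example below assumes the error comes from integer arithmetic exceeding the 32-bit range (the question does not confirm the exact cause). The row values and the helper `fits` are hypothetical.

```python
from decimal import Decimal

INT4_MAX = 2**31 - 1   # upper bound of a PostgreSQL integer
INT8_MAX = 2**63 - 1   # upper bound of a PostgreSQL bigint

def fits(value: int, upper: int) -> bool:
    """Check whether value lies in the signed range ending at upper."""
    return -upper - 1 <= value <= upper

# Hypothetical row: both operands fit in an integer column...
column_value, market_cap = 50_000, 60_000
product = column_value * market_cap

print(fits(column_value, INT4_MAX), fits(market_cap, INT4_MAX))  # True True
print(fits(product, INT4_MAX))  # False: 3,000,000,000 exceeds the integer range
print(fits(product, INT8_MAX))  # True: a wider type holds it

# An arbitrary-precision type (numeric in PostgreSQL, Decimal here)
# is not limited by these bounds.
print(Decimal(column_value) * Decimal(market_cap))  # 3000000000
```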
You are trying to place a decimal value into an integer parameter. That will not be possible unless you use ABS(), or unless you are 100% sure that the fractional part will always be 0.
Use type Float, or the ABS() function if you HAVE to have an INT.
You could try casting the column to decimal:
sql = "SUM(CAST(#{column} AS DECIMAL) * market_cap) / SUM(market_cap) as weighted_average"

I'm confused about Sqlite comparisons on a text column

I've got an SQLite database where one of the columns is defined as "TEXT NOT NULL". Some of the values are strings, some can be cast to a DOUBLE and some can be cast to INTEGER. Once I've narrowed it down to DOUBLE values, I want to do a query that gets a range of data. Suppose my column is named "Value". Can I do this?
SELECT * FROM Tbl WHERE ... AND Value >= 23 AND Value < 42
Is that going to do some kind of ASCII comparison or a numeric comparison? INTEGER or REAL? Does the BETWEEN operator work the same way?
And what happens if I do this?
SELECT MAX(Value) FROM Tbl WHERE ...
Will it do string or integer or floating-point comparisons?
It is all explained in the Datatypes In SQLite Version 3 article. For example, the answer to the first portion of questions is
An INTEGER or REAL value is less than any TEXT or BLOB value. When an INTEGER or REAL is compared to another INTEGER or REAL, a numerical comparison is performed.
This is why SELECT 9 < '1' and SELECT 9 < '11' both give 1 (true).
The expression "a BETWEEN b AND c" is treated as two separate binary comparisons "a >= b AND a <= c"
The most important point to know is that column type is merely an annotation; SQLite is dynamically typed so each value can have any type.
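These rules can be tried out with Python's built-in `sqlite3` module. The sketch below uses a throwaway in-memory table with made-up values. The helper `values` is hypothetical, and the explicit `CAST` is one possible way to force a numeric comparison, not something the quoted article prescribes for this case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl (Value TEXT NOT NULL)")
conn.executemany("INSERT INTO Tbl VALUES (?)",
                 [("3",), ("30",), ("240",), ("abc",)])

def values(where: str) -> list:
    """Return the Value column for rows matching the given WHERE clause."""
    query = f"SELECT Value FROM Tbl WHERE {where} ORDER BY rowid"
    return [row[0] for row in conn.execute(query)]

# Bare literals: the integer 9 sorts before any TEXT value.
print(conn.execute("SELECT 9 < '1', 9 < '11'").fetchone())  # (1, 1)

# TEXT column against numeric literals: the literals are converted to
# text, so the comparison is character by character.
print(values("Value >= 23 AND Value < 42"))  # ['3', '30', '240']

# Explicit cast: the comparison is numeric.
print(values("CAST(Value AS REAL) >= 23 AND CAST(Value AS REAL) < 42"))  # ['30']
```

With the plain column, '3' and '240' fall inside the range because they sort between '23' and '42' as strings. With the cast, only 30 remains, and 'abc' casts to 0.0.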
You can't convert text to integer or double, so you won't be able to do what you want.
If the column were varchar you could have a chance by doing:
select *
from Tbl
WHERE ISNUMERIC(Value ) = 1 --condition to avoid a conversion from string to int for example
and cast(value as integer) > 1 --rest of your conditions