I have PostgreSQL 9.4 installed on my laptop, and my database contains a version number in the format A.B.C.D (for example, 1.2.13.6). How can I apply the MAX aggregate to my column "version", which is of type text? Thank you very much.
If the parts are always numeric, you can do something like this:
select max(string_to_array(version, '.')::int[])
from your_table;
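By converting the string into an array of integers, the comparison is done element by element on numbers, so [1,12,1] is bigger than [1,1,1] (and also bigger than [1,2,1], which a plain text comparison would get wrong).
This will, however, fail with a cast error if you have values like 1.2.13.6a in that column.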
SQLFiddle: http://sqlfiddle.com/#!15/d41d8/4608
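If the column may contain such non-numeric values, one possible workaround is to filter them out before casting. This is only a sketch: the inline sample rows and the regular expression are assumptions for illustration, and rows that do not match the pattern are simply ignored.

```sql
-- Hypothetical sample rows standing in for your_table.
WITH your_table(version) AS (
  VALUES ('1.2.13.6'), ('1.12.1.0'), ('1.2.13.6a')
)
-- Keep only dot-separated digit groups, compare them as integer arrays,
-- then turn the winning array back into text.
SELECT array_to_string(max(string_to_array(version, '.')::int[]), '.') AS max_version
FROM your_table
WHERE version ~ '^[0-9]+([.][0-9]+)*$';
```

With these sample rows the result should be 1.12.1.0, whereas a plain text max() would pick 1.2.13.6a.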
Related
I have a table with a column that has some variable data. I would like to select only the rows whose values consist entirely of numerical characters [0-9].
The column would look something like this:
time
1545123
none
1565543
1903-294
I would want the rows with the first and third values only (1545123 and 1565543). None of my approaches have worked.
I've tried:
WHERE time NOT LIKE '%[^0-9]+%'
WHERE NOT regexp_like(time, '%[^0-9]+%')
WHERE regexp_like(time, '[0-9]+')
I've also tried these expressions in a CASE statement, but that was also a no go. Am I missing something here?
This is on Amazon Athena, which uses an older version of Presto
Thanks in advance
You can use a regular expression that matches only digits and is anchored at both ends, like '^[0-9]+$' or '^\d+$':
-- sample data
WITH dataset (time) AS (
VALUES
('1545123'),
('none'),
('1565543'),
('1903-294')
)
--query
select *
from dataset
WHERE regexp_like(time, '^[0-9]+$')
Output:
time
1545123
1565543
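As for why the earlier attempts did not work: LIKE has no character classes, so '[^0-9]' is matched literally, and regexp_like returns true as soon as the pattern is found anywhere in the string. A small sketch with inline sample values (not your real table) shows the difference the anchors make:

```sql
-- Inline sample values for illustration only.
SELECT time,
       regexp_like(time, '[0-9]+')   AS contains_a_digit,  -- true if any digit occurs
       regexp_like(time, '^[0-9]+$') AS only_digits        -- true only if every character is a digit
FROM (VALUES ('1545123'), ('none'), ('1903-294')) AS t(time);
```

Only '1545123' is true in both columns. '1903-294' contains digits, so the unanchored pattern matches it, while the anchored one does not.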
Another option, which I would not recommend in this case but which can be helpful in others, is to use try with cast:
--query
select *
from (
select try(cast(time as INTEGER)) time
from dataset
)
where time is not null
I am using Redshift.
I have a table like this, where metrics is a SUPER type column built with the array() function in Redshift:
user    metrics
red     array(2021, 120)
red     array(2020, 99)
blue    array(2021, 151)
I would like to do:
select user, max(metrics) from table group by user
and get this:
user    metrics
red     array(2021, 120)
blue    array(2021, 151)
Sadly, with this query I only get null values.
Do you know how to handle that?
Thanks
If you are familiar with Redshift Spectrum, the logic is very similar to unnesting an array field when you query an external schema.
In your case, the query is pretty simple:
SELECT t.user, max(metric)
FROM my_schema.my_table as t, t.metrics as metric
GROUP BY 1
If the array contains types other than numerical ones, you can simply cast each element to int or double, like:
max(metric::int)
This way, a pure string such as "hello world" is treated as null, but a string like "33333" is converted to int.
I am trying to extract only the numeric values from a column that contains cells that are exclusively numbers, and cells that are exclusively letter values, so that I can multiply the column with another that contains only numeric values. I have tried
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME)
and also
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME) = 1
and neither works. I get the error Function ISNUMERIC(VARCHAR) does not exist. Can someone advise? Thank you!
It depends heavily on the DBMS.
In SQL Server the built-in features for this are limited, so the following query may not work with all variants of your data:
select CAST(INTENT_VOLUME AS DECIMAL(10, 4))
from A
where INTENT_VOLUME LIKE '%[0-9.-]%'
and INTENT_VOLUME NOT LIKE '%[^0-9.-]%';
In Oracle you can use a regular expression directly:
select to_number(INTENT_VOLUME)
from A
where REGEXP_LIKE(INTENT_VOLUME,'^[-+]?[0-9]+(\.[0-9]+)?$');
MySQL also has built-in regular expression support.
Try this, which tests if that text value can be cast as numeric...
select intent_volume
from a
where (intent_volume ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$') = 't'
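Since the goal was to multiply the column by another one, the same test can guard the cast inside a CASE expression. This is a rough sketch assuming a PostgreSQL-compatible database; the unit_price column and the inline rows are hypothetical and only stand in for your real data.

```sql
-- Hypothetical sample rows; unit_price is an assumed second column.
WITH a(intent_volume, unit_price) AS (
  VALUES ('12', 2.5), ('abc', 1.0), ('3.5', 4.0)
)
SELECT intent_volume,
       -- Cast and multiply only when the text looks numeric; otherwise yield NULL.
       CASE WHEN intent_volume ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$'
            THEN intent_volume::numeric * unit_price
       END AS product
FROM a;
```

The 'abc' row comes back with a NULL product instead of raising a cast error.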
I have some data in a postgres table that is a string representation of an array of json data, like this:
[
{"UsageInfo"=>"P-1008366", "Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0},
{"Role"=>"Text", "ProjectCode"=>"", "PublicationCode"=>"", "RetailPrice"=>2},
{"Role"=>"Abstract", "RetailPrice"=>2, "EffectivePrice"=>0, "ParentItemId"=>"396487"}
]
This is the data in one cell from a single column of similar data in my database.
The datatype of this stored in the db is varchar(max).
My goal is to find the average RetailPrice of EVERY json item with "Role"=>"Abstract", including all of the json elements in the array, and all of the rows in the database.
Something like:
SELECT avg(json_extract_path_text(json_item, 'RetailPrice'))
FROM (
SELECT cast(json_items to varchar[]) as json_item
FROM my_table
WHERE json_extract_path_text(json_item, 'Role') like 'Abstract'
)
Now, obviously this particular query wouldn't work for a few reasons. Postgres doesn't let you directly convert a varchar to a varchar[]. Even after I had an array, this query would do nothing to iterate through the array. There are probably other issues with it too, but I hope it helps to clarify what it is I want to get.
Any advice on how to get the average retail price from all of these arrays of json data in the database?
It does not seem like Redshift would support the json data type per se. At least, I found nothing in the online manual.
But I found a few JSON functions in the manual, which should be instrumental:
JSON_ARRAY_LENGTH
JSON_EXTRACT_ARRAY_ELEMENT_TEXT
JSON_EXTRACT_PATH_TEXT
Since generate_series() is not supported, we have to substitute for that ...
SELECT tbl_id
, round(avg((json_extract_path_text(elem, 'RetailPrice'))::numeric), 2) AS avg_retail_price
FROM (
SELECT *, json_extract_array_element_text(json_items, pos) AS elem
FROM (VALUES (0),(1),(2),(3),(4),(5)) a(pos)
CROSS JOIN tbl
) sub
WHERE json_extract_path_text(elem, 'Role') = 'Abstract'
GROUP BY 1;
I substituted with a poor man's solution: a dummy table counting from 0 to n (the VALUES expression). Make sure you count up to the maximum number of possible elements in your array. If you need this on a regular basis, create an actual numbers table.
Modern Postgres has much better options, like json_array_elements() to unnest a json array. Compare to your sibling question for Postgres:
Can get an average of values in a json array using postgres?
I tested in Postgres with the related operator ->>, where it works:
SQL Fiddle.
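For reference, a minimal sketch of what the modern Postgres variant could look like. It assumes that the '=>' separators can be safely replaced with ':' to obtain valid JSON (that is, '=>' never occurs inside a string value), and the inline row is made-up sample data, not the original table.

```sql
-- Hypothetical sample row; json_items holds the hash-style text from the question.
WITH tbl(tbl_id, json_items) AS (
  VALUES (1, '[{"Role"=>"Abstract", "RetailPrice"=>2}, {"Role"=>"Text", "RetailPrice"=>4}, {"Role"=>"Abstract", "RetailPrice"=>3}]')
)
SELECT round(avg((elem ->> 'RetailPrice')::numeric), 2) AS avg_retail_price
FROM tbl
-- Turn the text into JSON, then unnest the array into one row per element.
CROSS JOIN LATERAL json_array_elements(replace(json_items, '=>', ':')::json) AS elem
WHERE elem ->> 'Role' = 'Abstract';
```

For the sample row this averages the two "Abstract" prices (2 and 3) and should return 2.50.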
I need to order a select query using a varchar column, using numerical and text order. The query will be done in a java program, using jdbc over postgresql.
If I use ORDER BY in the select clause I obtain:
1
11
2
abc
However, I need to obtain:
1
2
11
abc
The problem is that the column can also contain text.
This question is similar (but targeted for SQL Server):
How do I sort a VARCHAR column in SQL server that contains words and numbers?
However, the solution proposed did not work with PostgreSQL.
Thanks in advance, regards,
I had the same problem and the following code solves it:
SELECT ...
FROM table
order by
CASE WHEN column < 'A'
THEN lpad(column, size, '0')
ELSE column
END;
Here, size is the declared length of the varchar column, e.g. 255 for varchar(255).
You can use a regular expression to do this kind of thing:
select THECOL from ...
order by
case
when substring(THECOL from '^\d+$') is null then 9999
else cast(THECOL as integer)
end,
THECOL
First you use regular expression to detect whether the content of the column is a number or not. In this case I use '^\d+$' but you can modify it to suit the situation.
If the regexp doesn't match, return a big number (it must be larger than any numeric value in the column) so this row will fall to the bottom of the order.
If the regexp matches, convert the string to number and then sort on that.
After this, sort regularly with the column.
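A variation on the same idea, sketched here for PostgreSQL, avoids the magic number by sorting first on whether the value is numeric at all. The inline values are sample data only, and the cast to bigint assumes the digit strings are short enough to fit.

```sql
-- Sample values for illustration; 20000 would sort after 'abc' with a 9999 sentinel.
WITH t(thecol) AS (
  VALUES ('1'), ('11'), ('2'), ('abc'), ('20000')
)
SELECT thecol
FROM t
ORDER BY
  (thecol !~ '^\d+$'),                                  -- false (numeric rows) sorts before true (text rows)
  CASE WHEN thecol ~ '^\d+$' THEN thecol::bigint END,   -- numeric order among the numeric rows
  thecol;                                               -- plain text order for the rest
```

With these values the rows come back as 1, 2, 11, 20000, abc.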
I'm not aware of any database having a "natural sort" like the one known to exist in PHP. All I've found is various functions:
Natural order sort in Postgres
Comment in the PostgreSQL ORDER BY documentation