Order by a value of an arbitrary attribute in hstore - sql

I have records like these:
id, hstore_col
1, {a: 1, b: 2}
2, {c: 3, d: 4}
3, {e: 1, f: 5}
How can I order them by the maximum or minimum value inside the hstore, regardless of the attribute name?
The result should look like this (ordered by lowest value):
id, hstore_col
1, {a: 1, b: 2}
3, {e: 1, f: 5}
2, {c: 3, d: 4}
I know I can order them by a specific attribute, like this: my_table.hstore_fields -> 'a', but that doesn't solve my problem.

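Convert the values to an array using avals and cast the resulting array from text[] to int[]. Then sort the array and order the results by the first element of the sorted array. Note that sort() on integer arrays comes from the intarray extension, so that extension must be installed.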
select * from mytable
order by (sort(avals(attributes)::int[]))[1]
http://sqlfiddle.com/#!15/84f31/5
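To make the ordering rule concrete outside the database, here is a minimal Python sketch of the same idea: take each row's hstore values, cast them to integers, and sort the rows by the smallest one. The `rows` list and the `min_value` helper are illustrative names, not part of the answer above, and the hstore is modeled as a plain dict of strings.

```python
# Rows modeled as (id, hstore) pairs; hstore values are text, as in PostgreSQL.
rows = [
    (1, {"a": "1", "b": "2"}),
    (2, {"c": "3", "d": "4"}),
    (3, {"e": "1", "f": "5"}),
]


def min_value(hstore):
    """Smallest value in the hstore, compared as integers (hypothetical helper)."""
    return min(int(v) for v in hstore.values())


# sorted() is stable, so rows with equal minimums keep their input order.
ordered = sorted(rows, key=lambda row: min_value(row[1]))
ordered_ids = [row_id for row_id, _ in ordered]
print(ordered_ids)
```

Swapping `min` for `max` in the helper gives the "order by highest value" variant.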

If you know all of the keys, you can just piece them together. The -> operator returns text, so cast each value to int to get a numeric comparison rather than a string comparison:
ORDER BY greatest(
(my_table.hstore_fields -> 'a')::int, (my_table.hstore_fields -> 'b')::int,
(my_table.hstore_fields -> 'c')::int, (my_table.hstore_fields -> 'd')::int,
(my_table.hstore_fields -> 'e')::int, (my_table.hstore_fields -> 'f')::int)
or
ORDER BY least(
(my_table.hstore_fields -> 'a')::int, (my_table.hstore_fields -> 'b')::int,
(my_table.hstore_fields -> 'c')::int, (my_table.hstore_fields -> 'd')::int,
(my_table.hstore_fields -> 'e')::int, (my_table.hstore_fields -> 'f')::int)
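A caveat worth checking with the greatest/least approach is the difference between comparing hstore values as text and as integers. The sketch below models that difference in Python. The `least_of` helper is a hypothetical name, and it skips missing keys on the assumption that PostgreSQL's least/greatest ignore NULL arguments.

```python
def least_of(hstore, keys):
    """Numeric minimum over the listed keys, skipping keys the row lacks."""
    present = [int(hstore[k]) for k in keys if k in hstore]
    return min(present) if present else None


row = {"a": "10", "b": "9"}  # hstore values are stored as text

# Comparing the raw text values sorts character by character: "1" < "9".
text_least = min(row.values())

# Casting first gives the numeric answer.
numeric_least = least_of(row, ["a", "b", "c", "d", "e", "f"])

print(text_least, numeric_least)
```

With single-digit values, as in the question's sample data, both comparisons agree, which is why the problem is easy to miss.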

By using svals you can explode hstore_col into one row per value, then aggregate per record and sort on the smallest (or largest) value. There is doubtless a more efficient way to do this, but here's a first pass:
select my_table.id, my_table.hstore_col
from my_table
join (
-- one row per (id, value) pair
select id, svals(hstore_col) as hstore_val
from my_table
) exploded_table
on my_table.id = exploded_table.id
group by my_table.id, my_table.hstore_col
-- order by the lowest value per record; use max(...) desc for the highest
order by min(exploded_table.hstore_val::int), my_table.id

Related

prestoSQL aggregate columns and rows into one column

I would like to aggregate some columns and rows into one column in prestoSQL table.
with example_table as (
select * from (
values ('A', 'nh', 7), ('A', 'mn', 4), ('A', 'sv', 3),
('B', 'tb', 6), ('B', 'ty', 5), ('A', 'rw', 2),
('C', 'op', 9), ('C', 'au', 8)
) example_table("id", "time", "value")
)
select id, array_agg(value, time) -- fails: Unexpected parameters (integer, VARCHAR(2)) for function array_agg. Expected: array_agg(T) T
from example_table
group by id
I would like to combine the "time" and "value" columns into one column and then aggregate all rows by "id", such that:
id  time_value_agg
A   [['nh', 7], ['mn', 4], ['sv', 3], ['rw', 2]]
B   [['tb', 6], ['ty', 5]]
C   [['op', 9], ['au', 8]]
The time_value_agg column should be an array of str. If the "time" column is not a str, cast it to str.
I am not sure which function can be used for this.
Thanks
array_agg accepts only a single column. If the times are unique per id, you can turn the data into a map:
select id, map(array_agg(time), array_agg(value)) time_value_agg
from example_table
group by id
Output:
id  time_value_agg
C   {op=9, au=8}
A   {mn=4, sv=3, rw=2, nh=7}
B   {ty=5, tb=6}
Or turn data into ROW type (or map) before aggregation:
select id,
array_agg(arr) time_value_agg
from (
select id, cast(row(time, value) as row(time varchar, value integer)) as arr
from example_table
)
group by id
Output:
id  time_value_agg
C   [{time=op, value=9}, {time=au, value=8}]
A   [{time=nh, value=7}, {time=mn, value=4}, {time=sv, value=3}, {time=rw, value=2}]
B   [{time=tb, value=6}, {time=ty, value=5}]
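The "combine the columns first, then aggregate" idea behind the ROW-based query can be sketched in Python as follows. This only models the logic and is not Presto code. `aggregate_pairs` is a hypothetical helper name, and the pair is represented as a tuple where the query uses a ROW.

```python
from collections import defaultdict

# Same sample data as the example_table CTE in the question.
example_table = [
    ("A", "nh", 7), ("A", "mn", 4), ("A", "sv", 3),
    ("B", "tb", 6), ("B", "ty", 5), ("A", "rw", 2),
    ("C", "op", 9), ("C", "au", 8),
]


def aggregate_pairs(rows):
    """Pack (time, value) into one element per row, then collect per id."""
    grouped = defaultdict(list)
    for row_id, time, value in rows:
        # str(time) mirrors the request to cast a non-string time column.
        grouped[row_id].append((str(time), value))
    return dict(grouped)


time_value_agg = aggregate_pairs(example_table)
for row_id in sorted(time_value_agg):
    print(row_id, time_value_agg[row_id])
```

Unlike the map variant, this keeps duplicate times for the same id, because nothing is used as a key inside each group.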

SQL ARRAY: Select ID from my_table where "arrayvalue" = "defined_arrayvalue"

This is a beginner question about arrays. I hope the answer is simple.
The example is taken from Oracle Spatial, but I think it applies to arrays in general.
I have this SELECT:
SELECT
D.FID
, D.GEOM.SDO_ELEM_INFO -- column GEOM contains spatial data
FROM
my_table D
I get this result:
73035 MDSYS.SDO_ELEM_INFO_ARRAY(1, 2, 1)
73036 MDSYS.SDO_ELEM_INFO_ARRAY(1, 4, 3, 1, 2, 1, 11, 2, 2, 19, 2, 1)
73037 MDSYS.SDO_ELEM_INFO_ARRAY(1, 2, 1)
Now I want to SELECT all rows where (1,2,1) is defined:
SELECT
D.FID
, D.GEOM.SDO_ELEM_INFO
FROM
my_table D
WHERE
-- pseudocode follows
D.GEOM.SDO_ELEM_INFO is "(1, 2, 1)";
So, in simple words: "array_from_row = defined_array".
I found a lot about IMPLODE and TABLE and COLLECT etc. But how to define a clause on two arrays?
Thanks for help!
Try an IN clause; you can also combine it with explicit equality checks:
SELECT
D.FID
, D.GEOM.SDO_ELEM_INFO
FROM
my_table D
WHERE
D.GEOM.SDO_ELEM_INFO in (1, 2, 1) or ( D.GEOM.SDO_ELEM_INFO = 1 or D.GEOM.SDO_ELEM_INFO = 2 or D.GEOM.SDO_ELEM_INFO = 3);

Extract last N elements of an array in SQL (hive)

I have a column with arrays and I want to extract the X last elements in an array.
Example trying to extract the last two elements:
Column A
['a', 'b', 'c']
['d', 'e']
['f', 'g', 'h', 'i']
Expected output:
Column A
['b', 'c']
['d', 'e']
['h', 'i']
The best case would be to do this without using a UDF.
One method uses reverse, explode, filtering, and re-assembling the array:
with your_table as (
select stack (4,
0, array(), --empty array to check it works if no elements or less than n
1, array('a', 'b', 'c'),
2, array('d', 'e'),
3, array('f', 'g', 'h', 'i')
) as (id, col_A)
)
select s.id, collect_list(s.value) as col_A
from
(select s.id, reverse(a.value) as value, a.pos --undo the string reversal so multi-character elements come back intact
from your_table s
lateral view outer posexplode(split(reverse(concat_ws(',',s.col_A)),',')) a as pos, value
where a.pos between 0 and 1 --last two (use n-1 instead of 1 if you want last n)
distribute by s.id sort by a.pos desc --keep original order
)s
group by s.id
Result:
s.id col_a
0 []
1 ["b","c"]
2 ["d","e"]
3 ["h","i"]
More elegant way using brickhouse numeric_range UDF in this answer
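The reverse, explode, filter, and re-assemble steps of the query above can be traced in a small Python sketch. It mirrors the logic only and is not Hive code. `last_n` is a hypothetical helper name, and the positions correspond to the `pos` column produced by posexplode.

```python
def last_n(items, n):
    """Return the last n elements of items, keeping their original order."""
    # Step 1: reverse, so the last elements get the lowest positions.
    reversed_items = list(reversed(items))
    # Step 2: explode with positions and keep positions 0 .. n-1.
    kept = [(pos, value) for pos, value in enumerate(reversed_items) if pos <= n - 1]
    # Step 3: sort by position descending to restore the original order.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    # Step 4: re-assemble the array (the collect_list step).
    return [value for _, value in kept]


for column_a in ([], ["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"]):
    print(last_n(column_a, 2))
```

Arrays shorter than n come back unchanged, and an empty array stays empty, which matches the edge cases the sample data is meant to check.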

pivot from multiple rows to multiple columns in hive

I have a Hive table like the following:
(id:int, vals: Map<String, int> , type: string)
id, vals, type
1, {"foo": 1}, "a"
1, {"foo": 2}, "b"
2, {"foo": 3}, "a"
2, {"foo": 1}, "b"
Now, there are only two types.
I want to change this to the following schema:
id, type_a_vals, type_b_vals
1, {"foo": 1}, {"foo": 2}
2, {"foo": 3}, {"foo": 1}
If any "type" is missing for an id, its column can be null.
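An easy way, keeping in mind the map column, would be a self join. Because it is a full join, take the id from whichever side is present:
select coalesce(ta.id, tb.id) as id, ta.vals as type_a_vals, tb.vals as type_b_vals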
from (select * from tbl where type = 'a') ta
full join (select * from tbl where type = 'b') tb on ta.id = tb.id
You can use conditional aggregation to solve questions like these as below. However, doing so on a map column would produce an error.
select id
,max(case when type = 'a' then vals end) as type_a_vals
,max(case when type = 'b' then vals end) as type_b_vals
from tbl
group by id
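As a rough model of what the pivot should produce, including the null for a missing type, here is a Python sketch. It mirrors the result of the full self join rather than Hive's execution. `pivot_by_type` is a hypothetical helper, the row with id 3 is an invented example of an id that has no type "b" row, and the sketch assumes the only types are "a" and "b", as stated in the question.

```python
# (id, vals, type) rows; the id 3 row is a made-up case with no type "b".
rows = [
    (1, {"foo": 1}, "a"),
    (1, {"foo": 2}, "b"),
    (2, {"foo": 3}, "a"),
    (2, {"foo": 1}, "b"),
    (3, {"foo": 7}, "a"),
]


def pivot_by_type(rows):
    """One entry per id with a column per type; missing types stay None."""
    pivoted = {}
    for row_id, vals, row_type in rows:
        entry = pivoted.setdefault(
            row_id, {"type_a_vals": None, "type_b_vals": None}
        )
        entry[f"type_{row_type}_vals"] = vals
    return pivoted


pivoted = pivot_by_type(rows)
for row_id, columns in sorted(pivoted.items()):
    print(row_id, columns["type_a_vals"], columns["type_b_vals"])
```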

Idiomatic equivalent to map structure

My analytics requires aggregating rows and storing, for a field someField, the number of occurrences of each distinct value across all the rows.
Sample data structure
[someField, someKey]
I'm trying to GROUP BY someKey and then, for each result, know how many times each someField value occurred.
Example:
[someField: a, someKey: 1],
[someField: a, someKey: 1],
[someField: b, someKey: 1],
[someField: c, someKey: 2],
[someField: d, someKey: 2]
What I would like to achieve:
[someKey: 1, fields: {a: 2, b: 1}],
[someKey: 2, fields: {c: 1, d: 1}],
Does it work for you?
WITH data AS (
select 'a' someField, 1 someKey UNION all
select 'a', 1 UNION ALL
select 'b', 1 UNION ALL
select 'c', 2 UNION ALL
select 'd', 2)
SELECT
someKey,
ARRAY_AGG(STRUCT(someField, freq)) fields
FROM(
SELECT
someField,
someKey,
COUNT(someField) freq
FROM data
GROUP BY 1, 2
)
GROUP BY 1
The result is one row per someKey, with fields holding an array of (someField, freq) structs.
It won't give exactly the output you are looking for, but it may support the same queries your desired format would. As you said, for each key you can retrieve how many times (column freq) each someField value occurred.
I've been looking for a way on how to aggregate structs and couldn't find one. But retrieving the results as an ARRAY of STRUCTS turned out to be quite straightforward.
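For comparison, the map-like shape from the question (`{a: 2, b: 1}` per key) is straightforward to express outside SQL. The Python sketch below follows the same two steps as the query above: count per (someKey, someField), then collect per someKey. `count_fields` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Same sample rows as the data CTE: (someField, someKey).
data = [("a", 1), ("a", 1), ("b", 1), ("c", 2), ("d", 2)]


def count_fields(rows):
    """Map each someKey to a {someField: occurrences} dict."""
    fields = defaultdict(Counter)
    for some_field, some_key in rows:
        fields[some_key][some_field] += 1
    return {key: dict(counter) for key, counter in fields.items()}


result = count_fields(data)
print(result)
```

Each inner dict corresponds to one array of (someField, freq) structs in the query result.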
There's probably a smarter way to do this (and get it in the format you want e.g. using an Array for the 2nd column), but this might be enough for you:
with sample as (
select 'a' as someField, 1 as someKey UNION all
select 'a' as someField, 1 as someKey UNION ALL
select 'b' as someField, 1 as someKey UNION ALL
select 'c' as someField, 2 as someKey UNION ALL
select 'd' as someField, 2 as someKey)
SELECT
someKey,
SUM(IF(someField = 'a', 1, 0)) AS a,
SUM(IF(someField = 'b', 1, 0)) AS b,
SUM(IF(someField = 'c', 1, 0)) AS c,
SUM(IF(someField = 'd', 1, 0)) AS d
FROM
sample
GROUP BY
someKey order by somekey asc
Results:
someKey a b c d
---------------------
1 2 1 0 0
2 0 0 1 1
This is a widely used technique in BigQuery.
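The conditional-sum pivot is not specific to BigQuery. As a quick way to try it locally, the sketch below runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for BigQuery here purely as an assumption for illustration, and CASE WHEN is used as a portable spelling of IF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (someField TEXT, someKey INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [("a", 1), ("a", 1), ("b", 1), ("c", 2), ("d", 2)],
)

# One SUM(CASE ...) column per known field value, grouped by key.
pivot_rows = conn.execute(
    """
    SELECT someKey,
           SUM(CASE WHEN someField = 'a' THEN 1 ELSE 0 END) AS a,
           SUM(CASE WHEN someField = 'b' THEN 1 ELSE 0 END) AS b,
           SUM(CASE WHEN someField = 'c' THEN 1 ELSE 0 END) AS c,
           SUM(CASE WHEN someField = 'd' THEN 1 ELSE 0 END) AS d
    FROM sample
    GROUP BY someKey
    ORDER BY someKey
    """
).fetchall()
conn.close()

for row in pivot_rows:
    print(row)
```

The column list has to be written out in advance, so this only fits cases where the set of someField values is known and small.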
For the first part of the question (grouping by someKey and counting how many times each someField value occurs), a plain GROUP BY is enough:
#standardSQL
SELECT
someKey,
someField,
COUNT(someField) freq
FROM yourTable
GROUP BY 1, 2
-- ORDER BY someKey, someField
As for the output you said you would like to achieve:
[someKey: 1, fields: {a: 2, b: 1}],
[someKey: 2, fields: {c: 1, d: 1}],
This is different from what you expressed in words. It is called pivoting, and based on your comment that "the a, b, c, and d keys are potentially infinite", it is most likely not what you need. That said, pivoting is easily doable too if you have a finite number of field values, and you can find plenty of related posts.