I have a Presto table with a column of string arrays, and I would like to convert it to a table that maps each array element to its number of occurrences.
A, B, C, D, E, F are all strings
set
---------
[A,B,C,D]
[A,C,E,F]
string|count
-------------
A 2
B 1
C 2
D 1
E 1
F 1
Well, you can use unnest() and aggregate:
select char, count(*)
from t cross join
unnest(t.set) as u(char)
group by char
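To make it clearer what the unnest-then-aggregate query computes, here is a minimal sketch of the same logic in plain Python rather than Presto. The `rows` list is just the sample data from the question, and the variable names are made up for illustration.

```python
from collections import Counter

# Sample data from the question: one array of strings per row.
rows = [
    ["A", "B", "C", "D"],
    ["A", "C", "E", "F"],
]

# "unnest": flatten every array into one stream of elements.
# "group by ... count(*)": count how often each element appears.
counts = Counter(element for array in rows for element in array)

for element, n in sorted(counts.items()):
    print(element, n)
```

Each array element becomes its own row before counting, which is what the cross join with unnest() does in the query.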
Using SQL I'd like to convert a table that looks like this
id  col11  col2
---------------
1   a      b
1   c      d
2   e      f
2   g      h
Into something that looks like this:
id  combined
------------
1   [{col1: a, col2:b}, {col1: c, col2:d}]
2   [{col1: e, col2:f}, {col1: g, col2:h}]
In PostgreSQL, you can use the json_build_object function to build a JSON object from a variadic argument list, then use json_agg to aggregate those objects per id.
SELECT id,
json_agg(json_build_object('col1',col11,
'col2',col2)) combined
FROM t
GROUP BY id
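As a rough illustration of what the query produces, the same grouping can be sketched in plain Python. This is not how PostgreSQL implements json_agg; it only mirrors the result shape, using the sample rows from the question and made-up variable names.

```python
import json
from collections import defaultdict

# Sample rows from the question: (id, col11, col2).
rows = [
    (1, "a", "b"),
    (1, "c", "d"),
    (2, "e", "f"),
    (2, "g", "h"),
]

# Build one object per row (the json_build_object step) and
# collect the objects per id (the json_agg ... GROUP BY id step).
objects_by_id = defaultdict(list)
for row_id, col11, col2 in rows:
    objects_by_id[row_id].append({"col1": col11, "col2": col2})

combined = {row_id: json.dumps(objs) for row_id, objs in objects_by_id.items()}

for row_id, payload in combined.items():
    print(row_id, payload)
```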
I am not sure how to state the problem precisely, but let's say the table below is History, and I want to find the rows that form a pair.
I define a pair as two rows where columns a and b have the same values, c is False, and d differs between the two rows.
If I were using Java, I would set column c of row 3 to true when I hit a pair, or save rows 1 and 3 into a separate list, so that row 2 can be excluded. But I don't know how to do the same thing in SQL.
Table - History
a  b   c (Boolean)  d
---------------------
1  bb  F            d
1  bb  F            d
1  bb  F            c
Query ? ----
Result - rows 1 and 3.
Assuming the table is called test:
SELECT
*
FROM
test
WHERE id IN (
SELECT
MIN(id)
FROM
test
WHERE
!c
AND a = b
AND d != a
GROUP BY a, d
)
The subquery takes the smallest id among the rows matching your conditions. Because it groups the results by a, d, it returns one id per unique combination of a and d. The outer query then uses these ids to select the rows we want.
Update: without existing id
# add PK afterwards
ALTER TABLE test ADD COLUMN id INT PRIMARY KEY AUTO_INCREMENT FIRST;
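For comparison, here is a hedged sketch of a variant of this idea, run against SQLite through Python's sqlite3 module rather than the database used above. It reads the condition as "a and b match across the two rows", which is what the sample data suggests, so it is not the same query as above. The integer encoding of c (0 for false) and the variable names are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY gives each row an auto-assigned id (1, 2, 3).
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, a INTEGER, b TEXT, c INTEGER, d TEXT)"
)
conn.executemany(
    "INSERT INTO test (a, b, c, d) VALUES (?, ?, ?, ?)",
    [(1, "bb", 0, "d"), (1, "bb", 0, "d"), (1, "bb", 0, "c")],
)

# Keep the smallest id per (a, b, d), but only for rows that have a
# partner with the same a and b, c false, and a different d.
pair_rows = conn.execute(
    """
    SELECT * FROM test
    WHERE id IN (
        SELECT MIN(x.id)
        FROM test x
        WHERE NOT x.c
          AND EXISTS (
              SELECT 1 FROM test y
              WHERE y.a = x.a AND y.b = x.b AND NOT y.c AND y.d <> x.d
          )
        GROUP BY x.a, x.b, x.d
    )
    ORDER BY id
    """
).fetchall()

print(pair_rows)
```

On the sample data this keeps rows 1 and 3 and drops the duplicate row 2.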
All the rows match the condition you specified. A "pair" happens when:
column a and b will have same value, and
c should have False, and
d should be different for both rows.
Rows 1 and 3 match, and so do rows 2 and 3. In the other direction, 3 and 1 match, and so do 3 and 2. So there are four matching pairs.
You don't say which database, so I'll assume PostgreSQL. The query that can search using your criteria is:
select *
from t x
where exists (
select null from t y
where y.a = x.a
and y.b = x.b
and not y.c
and y.d <> x.d
);
Result:
a b c d
-- --- ------ -
1 bb false d
1 bb false d
1 bb false c
That is... the whole table.
See running example at DB Fiddle.
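If you want to reproduce this locally without PostgreSQL, the following sketch runs the same EXISTS query against an in-memory SQLite database via Python's sqlite3 module. SQLite has no boolean type, so storing c as 0 for false is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, "bb", 0, "d"), (1, "bb", 0, "d"), (1, "bb", 0, "c")],
)

# A row qualifies if some other row shares a and b, has c false,
# and carries a different d.
matches = conn.execute(
    """
    SELECT * FROM t x
    WHERE EXISTS (
        SELECT NULL FROM t y
        WHERE y.a = x.a
          AND y.b = x.b
          AND NOT y.c
          AND y.d <> x.d
    )
    """
).fetchall()

print(len(matches), "of 3 rows returned")
```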
My input is:
a b c
-------
A 5 3
A 4 2
B 3 1
B 5 3
I would like to get all a values having the same values in b and c, so the output should be as:
{A,B} 5 3
I am using GROUP BY, but I am not getting the result I want.
In standard SQL, this would look like:
select b, c, listagg(a, ',') within group (order by a)
from t
group by b, c;
Not all databases support listagg(), but most have a method for concatenating strings.
In Hive, you would use collect_list() or collect_set():
select b, c, collect_list(a)
from t
group by b, c;
You can convert the array back to a string, but I recommend keeping it as an array.
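As a small runnable sketch of the same grouping, here it is against SQLite through Python's sqlite3 module, where the string-concatenation aggregate is group_concat(). The HAVING clause is an addition of this example, not part of the queries above: it keeps only the (b, c) combinations shared by more than one row, which matches the single output row the question asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("A", 5, 3), ("A", 4, 2), ("B", 3, 1), ("B", 5, 3)],
)

# group_concat() is SQLite's counterpart to listagg(); the order of the
# concatenated values is not guaranteed without an explicit ordering.
grouped = conn.execute(
    """
    SELECT b, c, group_concat(a, ',') AS a_values
    FROM t
    GROUP BY b, c
    HAVING COUNT(*) > 1
    """
).fetchall()

print(grouped)
```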
I am trying to make a sum of the count of different values.
Here's an example :
a
a
a
b
b
b
c
c
c
d
d
d
d
e
e
e
e
The output would be :
5
Because there are 5 different values in that column.
Use COUNT(DISTINCT ...), which gives the number of distinct values in a column directly; there is no need to sum anything here.
select count(distinct colname)
from yourtable
After some searching, thinking, and testing, here's the query that works:
SELECT SUM(uniqueValues) FROM (SELECT COUNT(DISTINCT values) as uniqueValues FROM tablename GROUP BY values) AS t
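To compare the two approaches above, here is a minimal sketch using Python's sqlite3 module with the sample column from the question. The table and column names (`yourtable`, `colname`) are placeholders, and the derived-table alias `per_value` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (colname TEXT)")
sample = ["a"] * 3 + ["b"] * 3 + ["c"] * 3 + ["d"] * 4 + ["e"] * 4
conn.executemany("INSERT INTO yourtable VALUES (?)", [(v,) for v in sample])

# Direct distinct count.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT colname) FROM yourtable"
).fetchone()[0]

# Sum of per-group distinct counts: each group contributes exactly 1.
summed_count = conn.execute(
    """
    SELECT SUM(uniqueValues) FROM (
        SELECT COUNT(DISTINCT colname) AS uniqueValues
        FROM yourtable
        GROUP BY colname
    ) AS per_value
    """
).fetchone()[0]

print(distinct_count, summed_count)
```

Both queries return the same number on this data; the first one does it in a single aggregate.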
I am trying to return values from a query with an extra column for the row position, without adding an identity column. I don't want the absolute position in the table, but the position in the query result.
Suppose I have a table like this
My_TBL
-----------------------
FLD_A FLD_B FLD_C
a A t
b B t
c C p
d D p
.. ..
and the select query is
select FLD_A,FLD_B from My_Tbl where FLD_C='p'
FLD_A FLD_B
-----------------------
c C
d D
What can I add to my query in Db2 to number each row in that output?
POS FLD_A FLD_B
-----------------------
1 c C
2 d D
Use row_number(). It will only count the rows that are actually returned.
select row_number() over (order by FLD_A,FLD_B) as POS,
FLD_A,
FLD_B
from My_Tbl
where FLD_C='p'
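Here is a quick way to try this out without Db2: a sketch that runs the same query against SQLite through Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since that is when window functions were added; the final ORDER BY is added only to make the printed order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE My_Tbl (FLD_A TEXT, FLD_B TEXT, FLD_C TEXT)")
conn.executemany(
    "INSERT INTO My_Tbl VALUES (?, ?, ?)",
    [("a", "A", "t"), ("b", "B", "t"), ("c", "C", "p"), ("d", "D", "p")],
)

# row_number() is evaluated after the WHERE filter, so numbering
# starts at 1 for the rows that are actually returned.
numbered = conn.execute(
    """
    SELECT row_number() OVER (ORDER BY FLD_A, FLD_B) AS POS,
           FLD_A,
           FLD_B
    FROM My_Tbl
    WHERE FLD_C = 'p'
    ORDER BY POS
    """
).fetchall()

for row in numbered:
    print(row)
```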