How can I aggregate values inside a jsonb array using SQL?

My data looks something like this:
tags             | fullName
-----------------+---------
["tag1", "tag2"] | John
["tag3", "tag1"] | Jane
["tag1", "tag3"] | Bob
tags is a jsonb column and fullName is a text column in a Postgres database.
What I'm struggling to do is create a view like this:
tags | count
-----+------
tag1 | 3
tag2 | 1
tag3 | 2

You may use the jsonb_array_elements function to expand the array into one row per element before grouping and counting.
See the example below.
SELECT
  tag_names AS tag,
  COUNT(1) AS count
FROM (
  SELECT jsonb_array_elements(tags) AS tag_names
  FROM my_table
) t
GROUP BY tag_names;
tag  | count
-----+------
tag1 | 3
tag2 | 1
tag3 | 2
Or, shorter:
SELECT
  jsonb_array_elements(tags) AS tag,
  COUNT(1) AS count
FROM my_table
GROUP BY tag;
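Note that jsonb_array_elements() yields jsonb values, so the tags keep their JSON quoting (e.g. "tag1"). If you would rather have plain text, jsonb_array_elements_text() can be used in the same pattern; a minimal sketch, assuming the same my_table:
SELECT
  tag,
  COUNT(*) AS count
FROM (
  -- the _text variant returns text instead of jsonb, so no surrounding quotes
  SELECT jsonb_array_elements_text(tags) AS tag
  FROM my_table
) t
GROUP BY tag;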

Related

Aggregate rows according to JSON array content

I have a PSQL table with json tags that are always strings stored in a json array:
id | tags (json)
--------------------------
1 | ["tag1", "tag2"]
2 | ["tag12", "tag2"]
122 | []
I would like to query the count of entries in the table containing each tag.
For instance, I'd like to get:
tag | count
--------------------------
tag1 | 1
tag2 | 2
tag12 | 1
I tried
SELECT tags::text AS tag, COUNT(id) AS cnt FROM my_table GROUP BY tag;
but it does not work, since it gives:
tag | cnt
--------------------------
["tag1", "tag2"] | 1
["tag12", "tag2"] | 1
I guess I need to get the list of all tags in an inner query, and then for each tag count the rows that contain this tag, but I can't find how to do that. Can you help me with that?
Use json[b]_array_elements_text() and a lateral join to unnest the array:
select x.tag, count(*) cnt
from mytable t
cross join lateral json_array_elements_text(t.tags) as x(tag)
group by x.tag
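Rows whose tags array is empty (like id 122) simply contribute nothing to the counts, because the lateral call produces no rows for them. If you ever want to keep such rows visible, a LEFT JOIN LATERAL variant would do it; a sketch, assuming the same mytable:
-- empty arrays yield a single row with a NULL tag instead of disappearing
select t.id, x.tag
from mytable t
left join lateral json_array_elements_text(t.tags) as x(tag) on true;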

BigQuery: Query to GroupBy Array Column

I have two columns in a BigQuery table:
1. url
2. tags
URL is a single value, and TAGS is an array (example below):
row | URL   | TAGS
----+-------+--------
1   | x.com | donkey
    |       | donkey
    |       | lives
    |       | here
How can I group by TAGS array in BigQuery?
What's the trick to get the following query working?
SELECT TAGS FROM `URL_TAGS_TABLE`
group by unnest(TAGS)
I have tried GROUP BY with TO_JSON_STRING, but it does not give me the desired results.
I'd like to see the following output:
x.com | donkey | count 2
x.com | lives | count 1
x.com | here | count 1
Below is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'x.com' url, ['donkey','donkey','lives','here'] tags UNION ALL
  SELECT 'y.com' url, ['abc','xyz','xyz','xyz'] tags
)
SELECT url, tag, COUNT(1) AS `count`
FROM `project.dataset.table`, UNNEST(tags) tag
GROUP BY url, tag
with result
Row | url   | tag    | count
----+-------+--------+------
1   | x.com | donkey | 2
2   | x.com | lives  | 1
3   | x.com | here   | 1
4   | y.com | abc    | 1
5   | y.com | xyz    | 3
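The WITH clause above only fabricates sample data for the demo; against your real table the same shape applies directly (assuming the url and TAGS columns from your question):
#standardSQL
SELECT url, tag, COUNT(1) AS `count`
FROM `URL_TAGS_TABLE`, UNNEST(TAGS) AS tag  -- implicit join with the unnested array
GROUP BY url, tag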

Extract fields from postgres jsonb column

I have a Postgres table with a jsonb column, which has values as follows:
id | messageStatus | payload
-----|----------------------|-------------
1 | 123 | {"commissionEvents":[{"id":1,"name1":"12","name2":15,"name4":"apple","name5":"fruit"},{"id":2,"name1":"22","name2":15,"name4":"sf","name5":"fdfjkd"}]}
2 | 124 | {"commissionEvents":[{"id":3,"name1":"32","name2":15,"name4":"sf","name5":"fdfjkd"},{"id":4,"name1":"42","name2":15,"name4":"apple","name5":"fruit"}]}
3 | 125 | {"commissionEvents":[{"id":5,"name1":"42","name2":15,"name4":"apple","name5":"fdfjkd"},{"id":6,"name1":"52","name2":15,"name4":"sf","name5":"fdfjkd"},{"id":7,"name1":"62","name2":15,"name4":"apple","name5":"fdfjkd"}]}
Here the payload column is of jsonb type. I want to write a Postgres query to fetch name1 from commissionEvents where name4 = apple.
So my result would be the name1 values of those matching events.
Since I'm new to jsonb, can anyone please suggest a solution?
You need to unnest all the array elements; then you can apply a WHERE condition on that to keep only the elements with the desired name4 value.
select t.id, x.o ->> 'name1'
from the_table t
cross join lateral jsonb_array_elements(t.payload -> 'commissionEvents') as x(o)
where x.o ->> 'name4' = 'apple'
Online example: https://rextester.com/XWHG26387
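If the table is large, a jsonb containment test can cheaply pre-filter rows before unnesting, and it can be backed by a GIN index on payload; a sketch along the same lines, assuming the same table and keys:
select t.id, x.o ->> 'name1' as name1
from the_table t
cross join lateral jsonb_array_elements(t.payload -> 'commissionEvents') as x(o)
where t.payload @> '{"commissionEvents": [{"name4": "apple"}]}'  -- GIN-indexable pre-filter on whole rows
  and x.o ->> 'name4' = 'apple';                                 -- still needed to pick the matching elements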

postgres - pivot query with array values

Suppose I have this table:
Content
+----+-------+
| id | title |
+----+-------+
| 1  | lorem |
+----+-------+
And this one:
Fields
+----+------------+----------+-------+
| id | id_content | name     | value |
+----+------------+----------+-------+
| 1  | 1          | subtitle | ipsum |
| 2  | 1          | tags     | tag1  |
| 3  | 1          | tags     | tag2  |
| 4  | 1          | tags     | tag3  |
+----+------------+----------+-------+
The thing is: I want to query the content, transforming all the rows from "Fields" into columns, having something like:
+----+-------+----------+------------------+
| id | title | subtitle | tags             |
+----+-------+----------+------------------+
| 1  | lorem | ipsum    | [tag1,tag2,tag3] |
+----+-------+----------+------------------+
Also, subtitle and tags are just examples. I can have as many fields as I want, whether they are arrays or not.
But I haven't found a way to collect the repeated "name" values into an array, let alone without turning "subtitle" into an array as well. If that's not possible, "subtitle" could also become an array and I could handle it later in the code, but I need at least to group everything somehow. Any ideas?
You can use array_agg, e.g.
SELECT id_content, array_agg(value)
FROM fields
WHERE name = 'tags'
GROUP BY id_content
If you need the subtitle too, use a self-join. The subselect copes with contents that have a subtitle but no tags, so they don't end up with arrays filled with NULLs, i.e. {NULL}.
SELECT f1.id_content, f1.value, f2.value
FROM fields f1
LEFT JOIN (
  SELECT id_content, array_agg(value) AS value
  FROM fields
  WHERE name = 'tags'
  GROUP BY id_content
) f2 ON (f1.id_content = f2.id_content)
WHERE f1.name = 'subtitle';
See http://www.postgresql.org/docs/9.3/static/functions-aggregate.html for details.
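On PostgreSQL 9.4 or newer, aggregate FILTER clauses are another way to get both columns in a single pass without the self-join; a sketch, assuming the same fields table and at most one subtitle per content:
SELECT id_content,
       max(value) FILTER (WHERE name = 'subtitle') AS subtitle,  -- fine as long as there is one subtitle per content
       array_agg(value) FILTER (WHERE name = 'tags') AS tags     -- NULL (not {NULL}) when there are no tags
FROM fields
GROUP BY id_content;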
If you have access to the tablefunc module, another option is to use crosstab as pointed out by Houari. You can make it return arrays and non-arrays with something like this:
SELECT id_content, unnest(subtitle), tags
FROM crosstab('
SELECT id_content, name, array_agg(value)
FROM fields
GROUP BY id_content, name
ORDER BY 1, 2
') AS ct(id_content integer, subtitle text[], tags text[]);
However, crosstab requires that the values always appear in the same order. For instance, if the first group (with the same id_content) doesn't have a subtitle and only has tags, the tags will be unnested and will appear in the same column with the subtitles.
See also http://www.postgresql.org/docs/9.3/static/tablefunc.html
If the subtitle value is the only "constant" that you want to separate, you can do:
SELECT * FROM crosstab
(
  'SELECT content.id, name, array_to_string(array_agg(value), '','')::character varying
   FROM content
   INNER JOIN
   (
     SELECT * FROM fields WHERE fields.name = ''subtitle''
     UNION ALL
     SELECT * FROM fields WHERE fields.name <> ''subtitle''
   ) fields_ordered ON fields_ordered.id_content = content.id
   GROUP BY content.id, name'
)
AS
(
  id integer,
  content_name character varying,
  tags character varying
);
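Note that crosstab() comes from the tablefunc extension, so it must be enabled once per database before either crosstab query above will run (assuming you have the required privileges):
CREATE EXTENSION IF NOT EXISTS tablefunc;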

Get all column entries for each unique id in a table

I have a table structured something like this:
profile | tag
-------------------------
p1 | tag1
p1 | tag2
p1 | tag3
p2 | tag1
p2 | tag3
I want to run a query that returns something like:
tag1, tag2, tag3
tag1, tag3
Basically, for each unique profile, a list of the tags that exist in the table.
Is that possible, or do I need to have a unique column in my result for each tag I care about?
Thanks!
This returns what you want:
SELECT array_agg(tag)
FROM table1
GROUP BY profile
Or:
SELECT string_agg(tag, ',')
FROM table1
GROUP BY profile
string_agg lets you specify the delimiter you want and ORDER the results within each row.
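For example, to get each profile's tags comma-separated and alphabetically sorted (a small sketch against the same table1, with profile included for readability):
SELECT profile,
       string_agg(tag, ', ' ORDER BY tag) AS tags  -- custom delimiter plus per-row ordering
FROM table1
GROUP BY profile;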