Search entities by weighted keywords and spelling correction - sql

To start, an entity-relationship diagram: http://img11.hostingpics.net/pics/32979039DB.png
And now, a dataset
Archive
create :
CREATE TABLE archive (
id integer NOT NULL,
parent_id integer,
code character varying(15) NOT NULL,
label text NOT NULL
);
ALTER TABLE ONLY archive ADD CONSTRAINT archive_pkey PRIMARY KEY (id);
CREATE INDEX idx_142 ON archive USING btree (parent_id);
CREATE UNIQUE INDEX uniq_14242 ON archive USING btree (code);
ALTER TABLE ONLY archive ADD CONSTRAINT fk_14242 FOREIGN KEY (parent_id) REFERENCES archive(id);
insert :
INSERT INTO archive VALUES (1, NULL, 'B28', 'Confidential');
INSERT INTO archive VALUES (2, 1, 'B28.0', 'Nuclear zone');
Keyword
create :
CREATE TABLE keyword (
id integer NOT NULL,
label text NOT NULL,
label_double_metaphone text NOT NULL
);
ALTER TABLE ONLY keyword ADD CONSTRAINT keyword_pkey PRIMARY KEY (id);
CREATE UNIQUE INDEX uniq_242 ON keyword USING btree (label);
insert :
INSERT INTO keyword VALUES (1, 'SECURITY', 'SKRT');
INSERT INTO keyword VALUES (2, 'AREA', 'AR');
INSERT INTO keyword VALUES (3, 'NUCLEAR', 'NKLR');
Assoc_kw_archive
create :
CREATE TABLE assoc_kw_archive (
id integer NOT NULL,
keyword_id integer,
archive_id integer,
weight integer NOT NULL
);
ALTER TABLE ONLY assoc_kw_archive ADD CONSTRAINT assoc_kw_archive_pkey PRIMARY KEY (id);
CREATE INDEX idx_3421 ON assoc_kw_archive USING btree (archive_id);
CREATE INDEX idx_3422 ON assoc_kw_archive USING btree (keyword_id);
ALTER TABLE ONLY assoc_kw_archive ADD CONSTRAINT fk_3421 FOREIGN KEY (archive_id) REFERENCES archive(id);
ALTER TABLE ONLY assoc_kw_archive ADD CONSTRAINT fk_3422 FOREIGN KEY (keyword_id) REFERENCES keyword(id);
insert :
INSERT INTO assoc_kw_archive VALUES (1, 1, 1, 10);
INSERT INTO assoc_kw_archive VALUES (2, 1, 2, 20);
INSERT INTO assoc_kw_archive VALUES (3, 2, 2, 30);
INSERT INTO assoc_kw_archive VALUES (4, 3, 2, 30);
The target
The goal here is to search the database. The search is based on a string typed by a user, and the output is a list of archives sorted by relevance. The relevance of an archive depends on three factors:
People can make mistakes in the spelling of a word
The weight of a keyword, which gives it importance
A bonus for archives that contain all x keywords typed by the user
I have worked on different versions of the SQL query, but now I can't step back and look at the overall problem.
The archive table contains 100,000 tuples, the keyword table 80,000, and there are 1,000,000 associations between these two entities.
This is my latest version; it works, but it is very slow:
select f.id, f.code, f.label, min(f.dist) as distF, max(f.poid) as poidF
from (
    select
        a.id,
        a.code,
        a.label,
        ( levenshtein(lower('Security'), lower(k1.label)) + 1 )
          + ( levenshtein(lower('Nuclear'), lower(k2.label)) + 1 ) as dist,
        ka1.weight + ka2.weight as poid
    from archive a
    inner join assoc_kw_archive ka1 on ka1.archive_id = a.id
    inner join keyword k1 on k1.id = ka1.keyword_id
    inner join assoc_kw_archive ka2 on ka2.archive_id = a.id
    inner join keyword k2 on k2.id = ka2.keyword_id
    where levenshtein(dmetaphone('Security'), k1.label_double_metaphone) < 2
      and levenshtein(dmetaphone('Nuclear'), k2.label_double_metaphone) < 2
) as f
group by f.id, f.code, f.label
order by distF asc, poidF desc
limit 10;
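Note that in PostgreSQL, levenshtein and dmetaphone are not built in; they come from the fuzzystrmatch extension, which must be enabled once per database before these queries will run:

CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;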
I do one join per keyword, and that is what makes it slow! But I can't find another solution.

I think the problem is doing the full join together with the distance calculation. Here is an alternative approach: filter the keywords first, keeping the information from the where clause available by using a subquery, and then use conditional aggregation to get the information you want.
The query ends up looking something like this:
select a.id, a.code, a.label,
       min( (levenshtein(lower('Security'), lower(case when securityl < 2 then k.label end)) + 1) +
            (levenshtein(lower('Nuclear'),  lower(case when nuclearl  < 2 then k.label end)) + 1)
          ) as mindist,
       sum(weight) as poid
from archive a
inner join assoc_kw_archive ka on ka.archive_id = a.id
inner join (select *
            from (select k.*,
                         levenshtein(dmetaphone('Security'), k.label_double_metaphone) as securityl,
                         levenshtein(dmetaphone('Nuclear'),  k.label_double_metaphone) as nuclearl
                  from keyword k) kd
            -- filter in a wrapped subquery: PostgreSQL does not allow
            -- referencing select aliases in HAVING without a GROUP BY
            where securityl < 2
               or nuclearl < 2
           ) k on k.id = ka.keyword_id
group by a.id, a.code, a.label
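To reproduce the top-10 output of the original query, you would still append the same ranking tail (assuming the same relevance criteria):

order by mindist asc, poid desc
limit 10;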

Related

Select records that do not have at least one child element

How can I make an SQL query to select records that do not have at least one child element?
I have 3 tables: article (~40K rows), calendar (~450K rows) and calendar_cost (~500K rows).
I need to select the entries of the article table for which:
there are no entries in the calendar table, or
if there are entries in the calendar table, none of them has any entries in the calendar_cost table.
create table article (
id int PRIMARY KEY,
name varchar
);
create table calendar (
id int PRIMARY KEY,
article_id int REFERENCES article (id) ON DELETE CASCADE,
number varchar
);
create table calendar_cost (
id int PRIMARY KEY,
calendar_id int REFERENCES calendar (id) ON DELETE CASCADE,
cost_value numeric
);
insert into article (id, name) values
(1, 'Article 1'),
(2, 'Article 2'),
(3, 'Article 3');
insert into calendar (id, article_id, number) values
(101, 1, 'Point 1-1'),
(102, 1, 'Point 1-2'),
(103, 2, 'Point 2');
insert into calendar_cost (id, calendar_id, cost_value) values
(400, 101, 100.123),
(401, 101, 400.567);
As a result, "Article 2" (condition 2) and "Article 3" (condition 1) should be returned.
My SQL query is very slow (the second condition part); how can I do it optimally? Is it possible to do without the "union all" operator?
-- First condition
select a.id from article a
left join calendar c on a.id = c.article_id
where c.id is null
union all
-- Second condition
select a.id from article a
where id not in(
select aa.id from article aa
join calendar c on aa.id = c.article_id
join calendar_cost cost on c.id = cost.calendar_id
where aa.id = a.id limit 1
)
UPDATE
This is how you can fill my tables with random data of roughly the same volume. @Bohemian's query was very fast and the rest were very slow, but as soon as I added the 2 indexes that @nik advised, all the queries became very, very fast!
do $$
declare
    article_id int;
    calendar_id bigint;
    i int; j int;
begin
    create table article (
        id int PRIMARY KEY,
        name varchar
    );
    create table calendar (
        id serial PRIMARY KEY,
        article_id int REFERENCES article (id) ON DELETE CASCADE,
        number varchar
    );
    create index on calendar (article_id);
    create table calendar_cost (
        id serial PRIMARY KEY,
        calendar_id bigint REFERENCES calendar (id) ON DELETE CASCADE,
        cost_value numeric
    );
    create index on calendar_cost (calendar_id);
    for article_id in 1..45000 loop
        insert into article (id, name) values (article_id, 'Article ' || article_id);
        for i in 0..floor(random() * 25) loop
            insert into calendar (article_id, number)
            values (article_id, 'Number ' || article_id || '-' || i)
            returning id into calendar_id;
            for j in 0..floor(random() * 2) loop
                insert into calendar_cost (calendar_id, cost_value)
                values (calendar_id, round((random() * 100)::numeric, 3));
            end loop;
        end loop;
    end loop;
end $$;
@Bohemian — Planning Time: 0.405 ms, Execution Time: 1196.082 ms
@nbk — Planning Time: 0.702 ms, Execution Time: 165.129 ms
@Chris Maurer — Planning Time: 0.803 ms, Execution Time: 800.000 ms
@Stu — Planning Time: 0.446 ms, Execution Time: 280.842 ms
So which query to choose now as the right one is a matter of taste.
No need to split the conditions: The only condition you need to check for is that there are no calendar_cost rows whatsoever, which is the case if there are no calendar rows.
The trick is to use outer joins, which still return the parent table but have all null values when there is no join. Further, count() does not count null values, so requiring that the count of calendar_cost is zero is all you need.
select a.id
from article a
left join calendar c on c.article_id = a.id
left join calendar_cost cost on cost.calendar_id = c.id
group by a.id
having count(cost.calendar_id) = 0
If there are indexes on the id columns (the usual case), this query will perform quite well given the small table sizes.
Your second condition should start just like your first one: find all the calendar entries without a calendar_cost row, and only afterwards join to article.
select a.id
from article a
inner join (
    select article_id
    from calendar c
    left join calendar_cost cc on c.id = cc.calendar_id
    where cc.calendar_id is null
) cnone
on a.id = cnone.article_id
This approach is based on the thought that calendar entries without calendar_cost is relatively rare compared to all the calendar entries.
Your query is not valid, as IN clauses don't support LIMIT.
Adding indexes on article_id and calendar_id will help performance, as you can see in the query plan.
create table article (
id int PRIMARY KEY,
name varchar(100)
);
create table calendar (
id int PRIMARY KEY,
article_id int REFERENCES article (id) ON DELETE CASCADE,
number varchar(100)
,index(article_id)
);
create table calendar_cost (
id int PRIMARY KEY,
calendar_id int REFERENCES calendar (id) ON DELETE CASCADE,
cost_value numeric
,INDEX(calendar_id)
);
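The inline INDEX(...) clauses above are MySQL syntax; since the question uses PostgreSQL, the equivalent supporting indexes would be created as separate statements (the index names here are illustrative):

CREATE INDEX idx_calendar_article_id ON calendar (article_id);
CREATE INDEX idx_calendar_cost_calendar_id ON calendar_cost (calendar_id);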
insert into article (id, name) values
(1, 'Article 1'),
(2, 'Article 2'),
(3, 'Article 3');
insert into calendar (id, article_id, number) values
(101, 1, 'Point 1-1'),
(102, 1, 'Point 1-2'),
(103, 2, 'Point 2');
insert into calendar_cost (id, calendar_id, cost_value) values
(400, 101, 100.123),
(401, 101, 400.567);
Records: 3 Duplicates: 0 Warnings: 0
Records: 3 Duplicates: 0 Warnings: 0
Records: 2 Duplicates: 0 Warnings: 2
select a.id from article a
left join calendar c on a.id = c.article_id
where c.id is null
id
3
-- First condition
EXPLAIN
select a.id from article a
left join calendar c on a.id = c.article_id
where c.id is null
union all
-- Second condition
select a.id from article a
JOIN (
select aa.id from article aa
join calendar c on aa.id = c.article_id
join calendar_cost cost on c.id = cost.calendar_id
LIMIT 1
) t1 ON t1.id <> a.id
id | select_type | table      | partitions | type   | possible_keys      | key         | key_len | ref                     | rows | filtered | Extra
---+-------------+------------+------------+--------+--------------------+-------------+---------+-------------------------+------+----------+--------------------------------------
1  | PRIMARY     | a          | null       | index  | null               | PRIMARY     | 4       | null                    | 3    | 100.00   | Using index
1  | PRIMARY     | c          | null       | ref    | article_id         | article_id  | 5       | fiddle.a.id             | 3    | 33.33    | Using where; Not exists; Using index
2  | UNION       | <derived3> | null       | system | null               | null        | null    | null                    | 1    | 100.00   | null
2  | UNION       | a          | null       | index  | null               | PRIMARY     | 4       | null                    | 3    | 66.67    | Using where; Using index
3  | DERIVED     | cost       | null       | index  | calendar_id        | calendar_id | 5       | null                    | 2    | 100.00   | Using where; Using index
3  | DERIVED     | c          | null       | eq_ref | PRIMARY,article_id | PRIMARY     | 4       | fiddle.cost.calendar_id | 1    | 100.00   | Using where
3  | DERIVED     | aa         | null       | eq_ref | PRIMARY            | PRIMARY     | 4       | fiddle.c.article_id     | 1    | 100.00   | Using index
Try the following, using a combination of EXISTS criteria.
Usually, with supporting indexes, this is more performant than simply joining tables, as it offers a short-circuit to get out as soon as a match is found, whereas joining typically filters after all rows are joined.
select a.id
from article a
where not exists (
select * from calendar c
where c.article_id = a.id
)
or (exists (
select * from calendar c
where c.article_id = a.id
)
and not exists (
select * from calendar_cost cc
where cc.calendar_id in (select id from calendar c where c.article_id = a.id)
)
);
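Since a calendar_cost row can only exist via a calendar row, the two branches above collapse logically into a single NOT EXISTS; a simpler equivalent sketch:

select a.id
from article a
where not exists (
    select *
    from calendar c
    join calendar_cost cc on cc.calendar_id = c.id
    where c.article_id = a.id
);

Against the sample data this returns Articles 2 and 3, the same result as the two-branch version.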

Get column values from mapping tables "id | value" binding

I am trying to get all the columns associated with my item; some columns are "key | value" paired and that's where my problem is. My idea for a structure looks like this.
I can retrieve one item from Posts along with all associated tag names with this query, but the problem is that I can only get one post:
SELECT TOP(10)
bm.title, bm.post_id,
a.name AS tag1, b.name AS tag2, c.name AS tag3, d.name AS tag4
FROM
Posts AS bm
INNER JOIN
Tagmap AS tm
INNER JOIN
Tag AS a ON a.tag_id = tm.tag_id1
INNER JOIN
Tag AS b ON b.tag_id = tm.tag_id2
INNER JOIN
Tag AS c ON c.tag_id = tm.tag_id3
INNER JOIN
Tag AS d ON d.tag_id = tm.tag_id4
ON bm.post_id = tm.post_id
Here is the DDL for the table, or you can get it from this PasteBin link:
CREATE TABLE Tag
(
tag_id int NOT NULL identity(0,1) primary key,
name nvarchar(30) NOT NULL
);
CREATE TABLE Tagmap
(
id int NOT NULL identity(0,1) primary key,
post_id int FOREIGN KEY REFERENCES Posts(post_id),
tag_id1 int FOREIGN KEY REFERENCES Tag(tag_id),
tag_id2 int FOREIGN KEY REFERENCES Tag(tag_id),
tag_id3 int FOREIGN KEY REFERENCES Tag(tag_id),
tag_id4 int FOREIGN KEY REFERENCES Tag(tag_id)
);
CREATE TABLE Posts
(
post_id int NOT NULL identity(0,1) primary key,
title nvarchar(50) not null
);
INSERT INTO Posts VALUES ('Title1');
INSERT INTO Posts VALUES ('Title2');
INSERT INTO Tag VALUES ('Tag number one');
INSERT INTO Tag VALUES ('Tag number two');
INSERT INTO Tag VALUES ('Tag number three');
INSERT INTO Tag VALUES ('Tag number four');
INSERT INTO Tagmap VALUES (0, 0, 1, 2, 3);
My question: is my approach totally off? Should I change the structure or is it good?
If so how can it be better and how can I retrieve all these "key | value" columns along with my posts?
First, you should fix your data structure so that you have one row in Tagmap per post_id and tag_id -- not four!
But even with your current structure, I imagine that not all posts have four tags. So, with your current data model you should be using LEFT JOIN rather than INNER JOIN.
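With one row per (post, tag) pair, the structure and a query might be sketched like this (T-SQL, as in the question; STRING_AGG requires SQL Server 2017+, and the index/alias names are illustrative):

CREATE TABLE Tagmap
(
    post_id int NOT NULL REFERENCES Posts(post_id),
    tag_id  int NOT NULL REFERENCES Tag(tag_id),
    PRIMARY KEY (post_id, tag_id)
);

-- one row per post, with all its tag names concatenated
SELECT TOP(10)
    p.post_id, p.title,
    STRING_AGG(t.name, ', ') AS tags
FROM Posts AS p
LEFT JOIN Tagmap AS tm ON tm.post_id = p.post_id
LEFT JOIN Tag AS t ON t.tag_id = tm.tag_id
GROUP BY p.post_id, p.title;

The LEFT JOINs keep posts that have no tags at all, which the original four-column INNER JOIN design cannot express cleanly.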

PostgreSQL | Need values from right table where there is no match in m:n index

I have a three-table issue in PostgreSQL
table_left, table_index, table_right
table_index is m:n ...
I want to get all values from right table matching (m:n) and not-matching (NULL) values based on the values of left table.
SELECT field_left, field_index1, field_index2, field_right
FROM table_left
LEFT JOIN table_index ON left_id = index_left
LEFT JOIN table_right ON index_right = right_id
Using this query I get all values from left to right, but I do not get the values from table_right that have no match in the m:n table_index.
If I do something like this ...
SELECT field_left, field_index1, field_index2, field_right
FROM table_left
LEFT JOIN table_index ON left_id = index_left
LEFT JOIN table_right ON index_right = right_id OR right_id NOT IN (1,2,3)
... I get some strange results: field_index1 and field_index2 contain values from the m:n table, but they should be NULL because there is no association.
Any suggestions?
EDIT:
I have added some data. Thanks to @jarlh.
DROP TABLE IF EXISTS "table_index";
CREATE TABLE "public"."table_index" (
"index_left" integer NOT NULL,
"index_right" integer NOT NULL,
"index_data1" character varying NOT NULL,
"index_data2" character varying NOT NULL,
CONSTRAINT "table_index_index_left_index_right" PRIMARY KEY ("index_left", "index_right"),
CONSTRAINT "table_index_index_left_fkey" FOREIGN KEY (index_left) REFERENCES table_left(left_id) NOT DEFERRABLE,
CONSTRAINT "table_index_index_right_fkey" FOREIGN KEY (index_right) REFERENCES table_right(right_id) NOT DEFERRABLE
) WITH (oids = false);
INSERT INTO "table_index" ("index_left", "index_right", "index_data1", "index_data2") VALUES
(1, 1, 'index-Left-A', 'index-Right-A'),
(1, 2, 'index-Left-A', 'index-Right-B'),
(1, 3, 'index-Left-A', 'index-Right-C'),
(2, 1, 'index-Left-B', 'index-Right-A');
DROP TABLE IF EXISTS "table_left";
DROP SEQUENCE IF EXISTS table_left_left_id_seq;
CREATE SEQUENCE table_left_left_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 2147483647 START 1 CACHE 1;
CREATE TABLE "public"."table_left" (
"left_id" integer DEFAULT nextval('table_left_left_id_seq') NOT NULL,
"left_data" character varying NOT NULL,
CONSTRAINT "table_left_left_id" PRIMARY KEY ("left_id")
) WITH (oids = false);
INSERT INTO "table_left" ("left_id", "left_data") VALUES
(1, 'Left-A'),
(2, 'Left-B'),
(3, 'Left-C');
DROP TABLE IF EXISTS "table_right";
DROP SEQUENCE IF EXISTS table_right_right_id_seq;
CREATE SEQUENCE table_right_right_id_seq INCREMENT 1 MINVALUE 1 MAXVALUE 2147483647 START 1 CACHE 1;
CREATE TABLE "public"."table_right" (
"right_id" integer DEFAULT nextval('table_right_right_id_seq') NOT NULL,
"right_data" character varying NOT NULL,
CONSTRAINT "table_right_right_id" PRIMARY KEY ("right_id")
) WITH (oids = false);
INSERT INTO "table_right" ("right_id", "right_data") VALUES
(1, 'Right-A'),
(2, 'Right-B'),
(3, 'Right-C');
Using this query ....
SELECT left_id, left_data, index_left, index_right, index_data1, index_data2, right_id, right_data
FROM table_left
LEFT JOIN table_index ON left_id = index_left
LEFT JOIN table_right ON index_right = right_id
... I get some NULL values as expected...
Using the original database I was not getting these kinds of values. I noticed there is an id column within the index table, and the primary key was not set on both the left/right id values as in my test. I changed this in my local database and got the same result as in my test before: I now get these NULL values as expected.
I want to get all values from right table matching (m:n) and not-matching (NULL) values based on the values of left table.
If you want all values from the right table, then left join is the right way to go. However the right table should be the first table in the from clause:
SELECT field_left, field_index1, field_index2, field_right
FROM table_right r LEFT JOIN
table_index i
ON i.index_right = r.right_id LEFT JOIN
table_left l
ON l.left_id = i.index_left;
Notes:
There is a bit of cognitive dissonance (in English) because the roles of left/right are reversed.
I recommend that you use table aliases for your actual tables.
I strongly, strongly recommend that you qualify all column references so it is clear what tables they come from.
You could also use RIGHT JOIN. However, I also recommend using LEFT JOIN for this type of logic. It is easier to follow query logic that says: "Keep all rows in the table you just read" rather than "Keep all rows in some table that you will see much further down in the FROM clause."
EDIT:
Based on your comments, I suspect you want a Cartesian product of all left and right values along with a flag that indicates if it is in the junction table. Something like this:
SELECT field_left, field_index1, field_index2, field_right
FROM table_right r CROSS JOIN
table_left l LEFT JOIN
table_index i
ON i.index_right = r.right_id AND
i.index_left = l.left_id;
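If an explicit flag is preferred over inspecting the NULLs, one sketch against the posted DDL (the is_linked alias is illustrative):

SELECT l.left_data, r.right_data,
       (i.index_left IS NOT NULL) AS is_linked
FROM table_right r
CROSS JOIN table_left l
LEFT JOIN table_index i
       ON i.index_right = r.right_id
      AND i.index_left = l.left_id
ORDER BY l.left_id, r.right_id;

Each (left, right) pair appears exactly once, with is_linked true only for pairs present in table_index.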

How can I improve performance of a many-to-many SQL query?

I have a many-to-many relationship between Books and Genres. For example "The Hobbit" Book may have the Genres "Kids", "Fiction" and "Fantasy".
Here's the schema:
CREATE TABLE "genre" (
"id" integer NOT NULL PRIMARY KEY,
"name" varchar(50) NOT NULL
)
;
CREATE TABLE "book_genres" (
"book_id" integer NOT NULL REFERENCES "book" ("id"),
"genre_id" integer NOT NULL REFERENCES "genre" ("id"),
CONSTRAINT book_genres_pkey PRIMARY KEY (book_id, genre_id)
)
;
CREATE TABLE "book" (
"id" integer NOT NULL PRIMARY KEY,
"name" varchar(255) NOT NULL,
"price" real NOT NULL
)
;
And the indexes:
CREATE INDEX "book_genres_36c249d7" ON "book_genres" ("book_id");
CREATE INDEX "book_genres_33e6008b" ON "book_genres" ("genre_id");
CREATE INDEX "book_5a5255da" ON "book" ("price");
Row counts:
genre: 30
book_genres: 800,000
book: 200,000
I am trying to write a query in SQL which brings back all the Books for specific Genres ordered by price without duplicates.
Here's my query which does this:
SELECT name, price
FROM book
WHERE book.id
IN
(SELECT book_id
FROM book_genres
WHERE genre_id = 1
OR genre_id = 2)
ORDER BY price LIMIT 10
My problem is performance. This query can take up to 2000ms to execute. How can I improve the performance?
I have full control over the database (Postgres 9.3), so I can add views, indexes, or denormalise. I am also using Django, so I could perform multiple queries and combine the results in memory using Python/Django.
SELECT b.name, b.price
FROM book b
WHERE EXISTS (
SELECT *
FROM book_genres bg
WHERE bg.book_id = b.id
AND bg.genre_id IN( 1 , 2)
)
ORDER BY b.price
LIMIT 10
;
The order by price+LIMIT can be a performance killer: check the query plan.
PLUS: replace the one-column indices by a "reversed" index:
make book_id a FK into books.id
and (maybe) omit the surrogate key id
CREATE TABLE book_genres
( book_id integer NOT NULL REFERENCES book (id)
, genre_id integer NOT NULL REFERENCES genre (id)
, PRIMARY KEY (book_id, genre_id)
) ;
CREATE INDEX ON book_genres (genre_id,book_id);
In most cases you can improve performance by using JOIN instead of subqueries (although it depends on many factors):
SELECT *
FROM
(
    SELECT b.name, b.price
    FROM book b JOIN book_genres g ON b.id = g.book_id
        AND g.genre_id = 1
    UNION
    SELECT b.name, b.price
    FROM book b JOIN book_genres g ON b.id = g.book_id
        AND g.genre_id = 2
) AS t
ORDER BY price LIMIT 10

Including a set of rows in a view column

Design:
A main table where each entry in it can have zero or more of a set of options “checked”. It seems to me that it would be easier to maintain (adding/removing options) if the options were part of a separate table and a mapping was made between the main table and an options table.
Goal:
A view that contains the information from the main table, as well as all options to which that row has been mapped. However that information is represented in the view, it should be possible to extract the option’s ID and its description easily.
The implementation below is specific to PostgreSQL, but any paradigm that works across databases is of interest.
The select statement that does what I want is:
WITH tmp AS (
SELECT
tmap.MainID AS MainID,
array_agg(temp_options) AS options
FROM tstng.tmap
INNER JOIN (SELECT id, description FROM tstng.toptions ORDER BY description ASC) AS temp_options
ON tmap.OptionID = temp_options.id
GROUP BY tmap.MainID
)
SELECT tmain.id, tmain.contentcolumns, tmp.options
FROM tstng.tmain
INNER JOIN tmp
ON tmain.id = tmp.MainID;
However, attempting to create a view from this select statement generates an error:
column "options" has pseudo-type record[]
The solution that I’ve found is to cast the array of options (record[]) to a text array (text[][]); however, I’m interested in knowing if there is a better solution out there.
For reference, the create instruction:
CREATE OR REPLACE VIEW tstng.vsolution AS
WITH tmp AS (
SELECT
tmap.MainID AS MainID,
array_agg(temp_options) AS options
FROM tstng.tmap
INNER JOIN (SELECT id, description FROM tstng.toptions ORDER BY description ASC) AS temp_options
ON tmap.OptionID = temp_options.id
GROUP BY tmap.MainID
)
SELECT tmain.id, tmain.contentcolumns, CAST(tmp.options AS text[][])
FROM tstng.tmain
INNER JOIN tmp
ON tmain.id = tmp.MainID;
Finally, the DDL in case my description has been unclear:
CREATE TABLE tstng.tmap (
mainid INTEGER NOT NULL,
optionid INTEGER NOT NULL
);
CREATE TABLE tstng.toptions (
id INTEGER NOT NULL,
description text NOT NULL,
unwanted_column text
);
CREATE TABLE tstng.tmain (
id INTEGER NOT NULL,
contentcolumns text
);
ALTER TABLE tstng.tmain ADD CONSTRAINT main_pkey PRIMARY KEY (id);
ALTER TABLE tstng.toptions ADD CONSTRAINT toptions_pkey PRIMARY KEY (id);
ALTER TABLE tstng.tmap ADD CONSTRAINT tmap_pkey PRIMARY KEY (mainid, optionid);
ALTER TABLE tstng.tmap ADD CONSTRAINT tmap_optionid_fkey FOREIGN KEY (optionid)
REFERENCES tstng.toptions (id);
ALTER TABLE tstng.tmap ADD CONSTRAINT tmap_mainid_fkey FOREIGN KEY (mainid)
REFERENCES tstng.tmain (id);
You could create a composite type, e.g. temp_options_type:
DROP TYPE IF EXISTS temp_options_type;
CREATE TYPE temp_options_type AS (id integer, description text);
After that just cast temp_options to that type within array_agg, so it returns temp_options_type[] instead of record[]:
DROP VIEW IF EXISTS tstng.vsolution;
CREATE OR REPLACE VIEW tstng.vsolution AS
WITH tmp AS
(
SELECT
tmap.MainID AS MainID,
array_agg(CAST(temp_options AS temp_options_type)) AS options
FROM
tstng.tmap INNER JOIN
(
SELECT id, description
FROM tstng.toptions
ORDER BY description
) temp_options
ON tmap.OptionID = temp_options.id
GROUP BY tmap.MainID
)
SELECT tmain.id, tmain.contentcolumns, tmp.options
FROM tstng.tmain
INNER JOIN tmp ON tmain.id = tmp.MainID;
Example result:
TABLE tstng.vsolution;
id | contentcolumns | options
----+----------------+-----------------------
1 | aaa | {"(1,xxx)","(2,yyy)"}
2 | bbb | {"(3,zzz)"}
3 | ccc | {"(1,xxx)"}
(3 rows)