Mapping array to composite type to a different row type - sql

I want to map an array of key value pairs of GroupCount to a composite type of GroupsResult mapping only specific keys.
I'm using unnest to turn the array into rows, and then use 3 separate select statements to pull out the values.
This feels like a lot of code for something so simple.
Is there an easier / more concise way to do the mapping from the array type to the GroupsResult type?
create type GroupCount AS (
Name text,
Count int
);
create type GroupsResult AS (
Cats int,
Dogs int,
Birds int
);
WITH unnestedTable AS (WITH resultTable AS (SELECT ARRAY [ ('Cats', 5)::GroupCount, ('Dogs', 2)::GroupCount ] resp)
SELECT unnest(resp)::GroupCount t
FROM resultTable)
SELECT (
(SELECT (unnestedTable.t::GroupCount).count FROM unnestedTable WHERE (unnestedTable.t::GroupCount).name = 'Cats'),
(SELECT (unnestedTable.t::GroupCount).count FROM unnestedTable WHERE (unnestedTable.t::GroupCount).name = 'Dogs'),
(SELECT (unnestedTable.t::GroupCount).count FROM unnestedTable WHERE (unnestedTable.t::GroupCount).name = 'Birds')
)::GroupsResult
fiddle
http://sqlfiddle.com/#!17/56aa2/1

A bit simpler. :)
SELECT (min(u.count) FILTER (WHERE name = 'Cats')
, min(u.count) FILTER (WHERE name = 'Dogs')
, min(u.count) FILTER (WHERE name = 'Birds'))::GroupsResult
FROM unnest('{"(Cats,5)","(Dogs,2)"}'::GroupCount[]) u;
db<>fiddle here
See:
Aggregate columns with additional (distinct) filters
Subtle difference: our original raises an exception if one of the names pops up more than once, while this will just return the minimum count. May or may not be what you want - or be irrelevant if duplicates can never occur.
For many different names, crosstab() is typically faster. See:
PostgreSQL Crosstab Query

Related

Query data from one column depending on other values on the table

So, I have a table table with three columns I am interested in : value, entity and field_entity.
There are other columns that are not important for this question. There are many different types of entity, and some of them can have the same field_entity, but those two columns determine what the column value refers to (if it is an id number for a person or the address or some other thing)
If I need the name of a person I would do something like this:
select value from table where entity = 'person' and field_entity = 'person_name';
My problem is I need to get a lot of different values (names, last names, address, documents, etc.), so the way I am doing it now is using a left join like this:
select
doc_type.value as doc_type,
doc.value as doc,
status.value as status
from
table doc
-- Get doc type
left join table doc_type
on doc.entity = doc_type.entity
and doc.transaction_id = doc_type.transaction_id
and doc_type.field_entity = 'document_type'
-- Get Status
left join table status
on doc.entity = status.entity
and doc.transaction_id = status.transaction_id
and status.field_entity = 'status'
where doc.entity = 'users' and doc.field_entity = 'document' and doc.transaction_id = 11111;
There are 16 values or more, and this can get a bit bulky and difficult to maintain, so I was wondering if some of you can point out a better way to do this?
Thanks!
I assume that you are not in position to alter the table structure, but can you add views to the database? If so, you can create views based on the different types of entities in your table.
For example:
CREATE VIEW view_person AS
SELECT value AS name
FROM doc
WHERE doc.entity = 'person'
AND doc.field_entity = 'name';
Then you can write clearer queries:
SELECT view_person.name FROM view_person;

SQL: combine two tables for a query

I want to query two tables at a time to find the key for an artist given their name. The issue is that my data is coming from disparate sources and there is no definitive standard for the presentation of their names (e.g. Forename Surname vs. Surname, Forename) and so to this end I have a table containing definitive names used throughout the rest of my system along with a separate table of aliases to match the varying styles up to each artist.
This is PostgreSQL but apart from the text type it's pretty standard. Substitute character varying if you prefer:
create table Artists (
id serial primary key,
name text,
-- other stuff not relevant
);
create table Aliases (
artist integer references Artists(id) not null,
name text not null
);
Now I'd like to be able to query both sets of names in a single query to obtain the appropriate id. Any way to do this? e.g.
select id from ??? where name = 'Bloggs, Joe';
I'm not interested in revising my schema's idea of what a "name" is to something more structured, e.g. separate forename and surname, since it's inappropriate for the application. Most of my sources don't structure the data, sometimes one or the other name isn't known, it may be a pseudonym, or sometimes the "artist" may be an entity such as a studio.
I think you want:
select a.id
from artists a
where a.name = 'Bloggs, Joe' or
exists (select 1
from aliases aa
where aa.artist = a.id and
aa.name = 'Bloggs, Joe'
);
Actually, if you just want the id (and not other columns), then you can use:
select a.id
from artists a
where a.name = 'Bloggs, Joe'
union all -- union if there could be duplicates
select aa.artist
from aliases aa
where aa.name = 'Bloggs, Joe';

SQL query to get conflicting values in JSONB from a group

I have a table defined similar to the one below. location_id is a FK to another table. Reports are saved in an N+1 fashion: for a single location, N reporters are available, and there's one report used as the source of truth, if you will. Reports from reporters have a single-letter code (let's say R), the source of truth has a different code (let's say T). The keys for the JSONB column are regular strings, values are any combination of strings, integers and integral arrays.
create table report (
id integer not null primary key,
location_id integer not null,
report_type char(1),
data jsonb
)
Given all the information above, how can I get all location IDs where the data values for a given set of keys (supplied at query time) are not all the same for the report_type R?
There are at least two solid approaches, depending on how complex you want to get and how numerous and/or dynamic the keys are. The first is very straightforward:
select location_id
from report
where report_type = 'R'
group by location_id
having count(distinct data->'key1') > 1
or count(distinct data->'key2') > 1
or count(distinct data->'key3') > 1
The second construction is more complex, but has the advantage of providing a very simple list of keys:
--note that we also need distinct on location id to return one row per location
select distinct on(location_id) location_id
--jsonb_each returns the key, value pairs with value in type JSON (or JSONB) so the value field can handle integers, text, arrays, etc
from report, jsonb_each(data)
where report_type = 'R'
and key in('key1', 'key2', 'key3')
group by location_id, key
having count(distinct value) > 1
order by location_id

SQL Server - matching attributes query

SQL Server Gurus ...
Currently using MS SQL Server 2016
I know Joe Celko and all SQL purists are squirming at the thought of using bitmasks, but I have a use case in which I need to query for all widgets that contain a set of given attributes.
Each widget may contain several hundred attributes.
The attributes of a widget are either present or not (1 = present, 0 = not
present)
One way I thought to do this is via bitmasks – the attributes to be found (a bitmask) could be ANDed with the attributes of each widget to find matches in a single operation. For example, the widgets table might be:
widets table:
widget_uid Uniqueidentifier
attributes BigInt
SELECT widget_uid
FROM widgets
WHERE ( attributes & bitmask ) = bitmask;
Problem is, using a BigInt for the attributes limits the number of attributes to 64 (a widget can have several hundred attributes), I could group the attributes in chunks of 64 bits, ie:
widets table:
widget_uid Uniqueidentifier
attributes0 BigInt -- Attributes 0-63
attributes1 BigInt -- Attributes 64-127
attributes2 BigInt -- Attributes 128-191
SELECT widget_uid
FROM widgets
WHERE ( attributes0 & bitmask0 ) = bitmask0
AND ( attributes1 & bitmask1 ) = bitmask1
AND ( attributes2 & bitmask2 ) = bitmask2
... but was wondering if anyone has come up with a solution for bit operations using bitmasks with greater than 64 bits – or if other (more efficient?) solutions would exist?
In the use case, the widgets table does contain other columns, but I am only concerned with the attributes matching portion of the query at the moment.
Any and all ideas are welcome - would be interested in knowing how others tackle this particular problem.
Thanks in advance.
We had a similar use case, on a significantly large data set. This was for an e-commerce site with products and attributes. Our case was a bit more complex than here, where we had any possible number of attributes and then values assigned to those attributes. e.g. Color - Red/Green/Blue, Size - S/M/L etc.
We found that associated tables with good indexing was the key in our case. While this may not be an option for you we found this to be the optimal solution for a dynamic data set.
I can code you up an example if you feel it will be helpful.
Edited to add example:
DROP TABLE IF EXISTS #Widgets
DROP TABLE IF EXISTS #Attributes
DROP TABLE IF EXISTS #WidgetAttributes
CREATE TABLE #Widgets (widget_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #Attributes (Attribute_UID UNIQUEIDENTIFIER PRIMARY KEY CLUSTERED, Name NVARCHAR(255))
CREATE TABLE #WidgetAttributes (widget_UID UNIQUEIDENTIFIER,Attribute_UID UNIQUEIDENTIFIER)
CREATE NONCLUSTERED INDEX ix_WidgetAttribute ON #WidgetAttributes (Attribute_UID) INCLUDE (widget_UID)
INSERT INTO #Widgets (widget_UID, Name) values
( '{c63bea73-2331-4698-82c9-f71845ab8601}', N'Widget 1' ),
( '{a0865b8f-606b-4273-9207-39a8a26016c4}', N'Widget 2' ),
( '{211fe27e-ab98-4b61-83a3-3d006d66db5a}', N'Widget 3' )
INSERT INTO #Attributes (Attribute_UID, Name)
VALUES
( '{99354dc0-d0b2-4919-a887-edf115eeb1bd}', N'Height' ),
( '{136bbe4c-497d-472f-a905-670e4a7805d0}', N'Width' ),
( '{f006f950-30d1-453e-8e09-4f7d140fa3cb}', N'Depth' ),
( '{0d190639-677f-4b75-8d36-1bdac00de132}', N'Colour' )
-- Set links
-- Widget 1 All attributes
-- Widget 2 Height Width
-- Widget 3 Colour
INSERT INTO #WidgetAttributes (widget_UID, Attribute_UID)
SELECT '{c63bea73-2331-4698-82c9-f71845ab8601}',Attribute_UID FROM #Attributes
UNION ALL
SELECT TOP (2) '{a0865b8f-606b-4273-9207-39a8a26016c4}',Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
UNION ALL
SELECT '{211fe27e-ab98-4b61-83a3-3d006d66db5a}',Attribute_UID FROM #Attributes WHERE Name = 'Colour'
-- #SearchAttributes to hold list of attributes you are trying to find
DECLARE #SearchAttributes TABLE (Attribute_UID UNIQUEIDENTIFIER)
INSERT INTO #SearchAttributes
SELECT Attribute_UID FROM #Attributes WHERE Name<> 'Colour'
;WITH cte AS (
SELECT WA.widget_UID, COUNT(1) AttributesPresent FROM #WidgetAttributes WA
JOIN #SearchAttributes SA ON SA.Attribute_UID = WA.Attribute_UID
GROUP BY WA.widget_UID
)
SELECT cte.AttributesPresent
, W.widget_UID
, W.Name
FROM cte
JOIN #Widgets W ON W.widget_UID = cte.widget_UID
ORDER BY cte.AttributesPresent DESC
Gives an output of:
AttributesPresent widget_UID Name
----------------- ------------------------------------ ----------
3 C63BEA73-2331-4698-82C9-F71845AB8601 Widget 1
2 A0865B8F-606B-4273-9207-39A8A26016C4 Widget 2
We used an approach of counting how many attributes were present for each so we not only had the option of "exact match" but also "closest fit".
Using bitmask in databases is wrong approach. Even if you somewhow manage it to work, you will not be able to use indexes to speed up execution.
Use standard solution, this is standard situation. There is standard M:N relationship between Widgets and Attributes (both should be tables, of course). You will add another table that will assign Attributes to Widgets - you can call it WidgetAttributes.
It will have 3 columns: Id, WidgetId, AttributeId
Then you can simply for example get list of Widgets that have Attribute:
select w.*
from Widgets w
inner join WidgetAttributes wa on wa.WidgetId = w.Id
inner join Attributes a on a.Id = wa.AttributeId
where a.AttributeName='xxx'

Excluding results from an SQL search based on the contents of another column

The table I'm using has 3 important columns on which to exclude entries in a table. I'll call them:
SampleTable:
Thing_Type (varchar)
Thing_ID (int)
Parent_ID (int)
I want to find all 'leaf' entries of a specific type. However, it is possible that an entry of that type has a child that is not of that type, and thus I don't want to exclude it without first filtering the table to only that type. Then I want to include all entries which have an ID that is not present anywhere in the ParentID column.
There's no EXISTS in the tool I'm using (not that I'm sure it would help).
Ignoring the fact that the tool I'm using doesn't like the following, and that it might not be syntactically correct, here's what I feel it should be like.
SELECT * FROM (
SELECT *
FROM SampleTable
WHERE SampleTable.Thing_Type = 'DesiredType'
)
WHERE Thing_ID NOT IN Parent_ID
I'm pretty sure this is wrong but I'm not sure how to make it right.
First, NOT IN has to go against a set, not a single value, so NOT IN parent_id doesn't actually make sense. This is one way to approach this problem:
SELECT
thing_type,
thing_id,
parent_id
FROM
Sample_Table T1
WHERE
thing_type = 'Desired Type' AND
thing_id NOT IN (
SELECT parent_id
FROM Sample_Table T2
WHERE T2.thing_type = 'Desired Type'
)