Querying on EAV SQL Design - sql

I have 3 tables like this.
Entity_Table
|e_id|e_name|e_type |e_tenant|
|1 | Bob | bird | owner_1|
|2 | Joe | cat | owner_1|
|3 | Joe | cat | owner_2|
AttributeValue_Table
|av_id|prop_name |prop_value|
|1 | color | black |
|2 | color | white |
|3 | wing size| 7" |
|4 | whiskers | long |
|5 | whiskers | short |
|6 | random | anything |
Entity_AttrVal
|e_id|av_id|
| 1 | 1 |
| 1 | 3 |
| 2 | 2 |
| 2 | 5 |
| 3 | 1 |
| 3 | 4 |
| 3 | 6 |
What I want to be able to do is something like "find entity where e_name='Joe' and color=black and whiskers=short".
I can obtain a result set where each row has one prop/value pair along with the entity information, so querying on one property works. But I need to be able to query on an arbitrary number N of properties. How do I do something like this?
Can I build a join table with all properties as columns, or something similar?
edit2: Looks like I can do something like this
SELECT et.e_id, et.e_name, et.e_type
FROM Entity_Table et
JOIN Entity_AttrVal j ON et.e_id = j.e_id
JOIN AttributeValue_Table av ON av.av_id = j.av_id
WHERE (av.prop_name='color' AND av.prop_value='white') OR (av.prop_name='whiskers' AND av.prop_value='long')
GROUP BY et.e_id, et.e_name, et.e_type
HAVING COUNT(*) = 2;

You have to add a predicate for each name/value combination:
SELECT <whatever you need>
FROM Entity_Table et
WHERE et.e_name = 'Joe'
AND EXISTS (SELECT 1
FROM Entity_AttrVal ea
JOIN AttributeValue_Table avt ON avt.av_id = ea.av_id
WHERE ea.e_id = et.e_id
AND avt.prop_name = 'color'
AND avt.prop_value = 'black')
AND EXISTS (SELECT 1
FROM Entity_AttrVal ea
JOIN AttributeValue_Table avt ON avt.av_id = ea.av_id
WHERE ea.e_id = et.e_id
AND avt.prop_name = 'whiskers'
AND avt.prop_value = 'short')
(I apologize if my SQL Server dialect shines through.)
To do an arbitrary number of comparisons, you'd have to generate the SQL and execute it.
As said in a comment, this goes to show that EAV is a pain (an anti-pattern, really), but I know from experience that sometimes there's simply no alternative if we're bound to a relational database.
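As a rough sketch of what "generate the SQL and execute it" could look like, the snippet below builds one EXISTS predicate per property and passes all values as bound parameters. It uses Python's sqlite3 module with an in-memory copy of the sample tables purely for illustration; the helper name find_entities is hypothetical, and the question does not say which database or client language is actually in use.

```python
import sqlite3

def find_entities(conn, e_name, props):
    """Return e_ids of entities named e_name that have every (name, value) pair in props."""
    sql = ["SELECT et.e_id FROM Entity_Table et WHERE et.e_name = ?"]
    params = [e_name]
    # One EXISTS predicate per requested property; values are bound, never interpolated.
    for prop_name, prop_value in props.items():
        sql.append(
            "AND EXISTS (SELECT 1 FROM Entity_AttrVal ea "
            "JOIN AttributeValue_Table avt ON avt.av_id = ea.av_id "
            "WHERE ea.e_id = et.e_id AND avt.prop_name = ? AND avt.prop_value = ?)"
        )
        params += [prop_name, prop_value]
    sql.append("ORDER BY et.e_id")
    return [row[0] for row in conn.execute(" ".join(sql), params)]

# In-memory copy of the sample data from the question (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entity_Table (e_id INTEGER PRIMARY KEY, e_name TEXT, e_type TEXT, e_tenant TEXT);
CREATE TABLE AttributeValue_Table (av_id INTEGER PRIMARY KEY, prop_name TEXT, prop_value TEXT);
CREATE TABLE Entity_AttrVal (e_id INTEGER, av_id INTEGER);
INSERT INTO Entity_Table VALUES
  (1,'Bob','bird','owner_1'),(2,'Joe','cat','owner_1'),(3,'Joe','cat','owner_2');
INSERT INTO AttributeValue_Table VALUES
  (1,'color','black'),(2,'color','white'),(3,'wing size','7"'),
  (4,'whiskers','long'),(5,'whiskers','short'),(6,'random','anything');
INSERT INTO Entity_AttrVal VALUES (1,1),(1,3),(2,2),(2,5),(3,1),(3,4),(3,6);
""")

print(find_entities(conn, "Joe", {"color": "black", "whiskers": "long"}))   # [3]
print(find_entities(conn, "Joe", {"color": "black", "whiskers": "short"}))  # []
```

With the sample data, no Joe is both black and short-whiskered, so the query from the question correctly returns nothing. Only the number of predicates varies with the input; the property names and values never become part of the SQL text.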

Related

Trying to use SQL to group accounts by number of sub-types

I'm using a table that houses account info. Each account can have between 1 and 6 unique sub-types. Currently the table only distinguishes single from multi sub-type accounts; it doesn't show how many accounts have 2 sub-types vs. 3 sub-types and so on. I'm looking for a pure SQL way to count the accounts in each of these groupings. There are a LOT of accounts in the table, so pulling it manually isn't really an option. Is there a way to get a count of accounts for each number of sub-types?
| account | Sub-Type | Single_V_Multi |
|---------|--------- | -------------- |
|123456789|123456789 | Multi |
|123456789|123456790 | Multi |
|123456789|123456791 | Multi |
|123456792|123456792 | Single |
|123456793|123456793 | Multi |
|123456793|123456794 | Multi |
|123456795|123456795 | Single |
|123456796|123456796 | Single |
|123456797|123456797 | Single |
|123456798|123456798 | Single |
|123456799|123456799 | Multi |
|123456799|123456800 | Multi |
|123456799|123456801 | Multi |
|123456799|123456802 | Multi |
From this example I'd be looking to get separate counts of the Account column based on the number of unique Sub-Type. What I've done so far is a query that groups the Sub-Types:
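-- "Sub-Type" needs quoting because of the hyphen (backticks in MySQL, brackets in SQL Server)
SELECT account, COUNT(DISTINCT "Sub-Type") as BAN_SUB_COUNT
FROM Table
GROUP BY account
Which gives the output: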
| account | BAN_SUB_COUNT |
| ------- | ------------- |
|123456789| 3 |
|123456792| 1 |
|123456793| 2 |
|123456795| 1 |
|123456796| 1 |
|123456797| 1 |
|123456798| 1 |
|123456799| 4 |
What I need from this is a way to get a separate count of accounts for each of the distinct BAN_SUB_COUNT entries. Ideally it would be along the lines of:
| BAN_SUB_COUNT |count of Accounts|
| ------------- | --------------- |
| 1 | 5 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
Sorry for any confusion and I hope I'm explaining myself better here!
You just need to wrap your query with another one:
select ban_sub_count, count(distinct account) as count_of_accounts
from (
    SELECT account, COUNT(DISTINCT "Sub-Type") as BAN_SUB_COUNT
    FROM Table
    group by account
) z
group by ban_sub_count
Output:
| BAN_SUB_COUNT |count of Accounts|
| ------------- | --------------- |
| 1 | 5 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
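To check the two-level grouping against the sample rows, here is a small sketch using Python's sqlite3 module. The table name accounts and column name sub_type are stand-ins chosen for this example (the question's real table name is not given), and the Single_V_Multi column is left out because the query does not need it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names standing in for the question's table.
conn.execute("CREATE TABLE accounts (account TEXT, sub_type TEXT)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [
    ("123456789", "123456789"), ("123456789", "123456790"), ("123456789", "123456791"),
    ("123456792", "123456792"),
    ("123456793", "123456793"), ("123456793", "123456794"),
    ("123456795", "123456795"), ("123456796", "123456796"),
    ("123456797", "123456797"), ("123456798", "123456798"),
    ("123456799", "123456799"), ("123456799", "123456800"),
    ("123456799", "123456801"), ("123456799", "123456802"),
])

query = """
SELECT ban_sub_count, COUNT(*) AS count_of_accounts
FROM (
    -- inner level: one row per account with its number of distinct sub-types
    SELECT account, COUNT(DISTINCT sub_type) AS ban_sub_count
    FROM accounts
    GROUP BY account
) per_account
-- outer level: how many accounts share each sub-type count
GROUP BY ban_sub_count
ORDER BY ban_sub_count
"""
counts = conn.execute(query).fetchall()
print(counts)  # [(1, 5), (2, 1), (3, 1), (4, 1)]
```

Because the inner query already yields exactly one row per account, COUNT(*) in the outer query gives the same result as COUNT(DISTINCT account).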
Here is my attempt at an answer:
select a2.*,a.`count_sub_type`
FROM (
select count(`sub-type`) as count_sub_type,`sub-type` from account group by `sub-type`
) a
left join account a2 on a2.`sub-type` = a.`sub-type`;
output :
|account |sub-type|single_v_multi|count_sub_type|
|--------|--------|--------------|--------------|
|account6|type1 |multiview |3 |
|account5|type1 |single |3 |
|account1|type1 |single |3 |
|account4|type2 |single |2 |
|account2|type2 |single |2 |
|account6|type3 |single |2 |
|account3|type3 |single |2 |
Best regards,

How to select table with a concatenated column?

I have the following data:
select * from art_skills_table;
+----+------+---------------------------+
| ID | Name | skills |
+----+------+---------------------------+
| 1 | Anna | ["painting","photography"]|
| 2 | Bob | ["drawing","sculpting"] |
| 3 | Cat | ["pastel"] |
+----+------+---------------------------+
select * from computer_table;
+------+------+-------------------------+
| ID | Name | skills |
+------+------+-------------------------+
| 1 | Anna | ["word","typing"] |
| 2 | Cat | ["code","editing"] |
| 3 | Bob | ["excel","code"] |
+------+------+-------------------------+
I would like to write an SQL statement which results in the following table.
+------+------+-----------------------------------------------+
| ID | Name | skills |
+------+------+-----------------------------------------------+
| 1 | Anna | ["painting","photography","word","typing"] |
| 2 | Bob | ["drawing","sculpting","excel","code"] |
| 3 | Cat | ["pastel","code","editing"] |
+------+------+-----------------------------------------------+
I've tried something like SELECT * from art_skills_table LEFT JOIN computer_table ON name. However it doesn't give what I need. I've read about array_cat but I'm having a bit of trouble implementing it.
If the skills columns in both tables are arrays, you should be able to get away with this:
SELECT a.ID, a.name, array_cat(a.skills, c.skills) AS skills
FROM art_skills_table a LEFT JOIN computer_table c
ON c.name = a.name
That said, while you used a LEFT join in your sample, I think either an INNER or FULL (OUTER) join might serve you better.
First, I wondered why the data is stored in such a model.
I was of the opinion that NoSQL databases lack the ability to do joins, and that
... a semantic triple would be in the form subject–predicate–object,
... a key-value (KV) store uses associative arrays,
... a relational database would be normalized.
Some information about the use case would have helped.
Nevertheless, you can select the data in the desired form with CONCAT and REPLACE.
SELECT art_skills_table.ID, computer_table.name,
CONCAT(
REPLACE(art_skills_table.skills, '}',','),
REPLACE(computer_table.skills, '{','')
)
FROM art_skills_table JOIN computer_table ON art_skills_table.ID = computer_table.ID
The query returns the following result:
+----+------+--------------------------------------------+
| ID | Name | Skills |
+----+------+--------------------------------------------+
| 1 | Anna | {"painting","photography","word","typing"} |
| 2 | Cat | {"drawing","sculpting","code","editing"} |
| 3 | Bob | {"pastel","excel","code"} |
+----+------+--------------------------------------------+
I've used the ID for the JOIN, even though Bob and Cat have different IDs in the two tables, which is why their rows above are mixed up.
The JOIN should probably be done on the name:
JOIN computer_table ON art_skills_table.Name = computer_table.Name
BTW, you need to tell us what SQL engine you're running on.
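Since the engine is unknown, here is an engine-neutral sketch of the same idea: join the two tables on the name in SQL and concatenate the lists in client code. It assumes the skills columns hold JSON text exactly as displayed in the question, and it uses Python's sqlite3 and json modules only as a stand-in for whatever stack is really in use.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE art_skills_table (ID INTEGER, Name TEXT, skills TEXT);
CREATE TABLE computer_table (ID INTEGER, Name TEXT, skills TEXT);
INSERT INTO art_skills_table VALUES
  (1,'Anna','["painting","photography"]'),
  (2,'Bob','["drawing","sculpting"]'),
  (3,'Cat','["pastel"]');
INSERT INTO computer_table VALUES
  (1,'Anna','["word","typing"]'),
  (2,'Cat','["code","editing"]'),
  (3,'Bob','["excel","code"]');
""")

# Join on Name, not ID: Bob and Cat have different IDs in the two tables.
query = """
SELECT a.ID, a.Name, a.skills, c.skills
FROM art_skills_table a
LEFT JOIN computer_table c ON c.Name = a.Name
ORDER BY a.ID
"""
merged = [
    # A missing computer_table row (NULL from the LEFT JOIN) counts as an empty list.
    (id_, name, json.loads(art) + json.loads(comp or "[]"))
    for id_, name, art, comp in conn.execute(query)
]
for row in merged:
    print(row)
# (1, 'Anna', ['painting', 'photography', 'word', 'typing'])
# (2, 'Bob', ['drawing', 'sculpting', 'excel', 'code'])
# (3, 'Cat', ['pastel', 'code', 'editing'])
```

If the columns are native PostgreSQL arrays, array_cat in the database is the more direct route; this sketch only covers the case where the values are stored as JSON text.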

When Querying Many-To-Many Relationship in SQL, Return Multiple Connections As an Array In Single Row?

Basically, I have 3 tables: title, provider, and provider_title.
Let's say they look like this:
| title_id | title_name |
|------------|----------------|
| 1 | San Andreas |
| 2 |Human Centipede |
| 3 | Zoolander 2 |
| 4 | Hot Pursuit |
| provider_id| provider_name |
|------------|----------------|
| 1 | Hulu |
| 2 | Netflix |
| 3 | Amazon_Prime |
| 4 | HBO_GO |
| provider_id| title_id |
|------------|----------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 3 | 1 |
| 3 | 3 |
| 4 | 4 |
So, clearly there are titles with multiple providers, yeah? Typical many-to-many so far.
So what I'm doing to query it is with a JOIN like the following:
SELECT * FROM provider_title JOIN provider ON provider_title.provider_id = provider.provider_id JOIN title ON title.title_id = provider_title.title_id WHERE provider.provider_name IN ('Netflix', 'HBO_GO', 'Hulu', 'Amazon_Prime')
Ok, now to the actual issue. I don't want repeated title names back, but I do want all of the providers associated with the title. Let me explain with another table. Here is what I am getting back with the current query, as is:
| provider_id| provider_name | title_id | title_name |
|------------|---------------|----------|---------------|
| 1 | Hulu | 1|San Andreas |
| 1 | Hulu | 2|Human Centipede|
| 2 | Netflix | 1|San Andreas |
| 3 | Amazon_Prime | 1|San Andreas |
| 3 | Amazon_Prime | 3|Zoolander 2 |
| 4 | HBO_GO | 4|Hot Pursuit |
But what I really want would be something more like
| provider_id| provider_name |title_id| title_name|
|------------|-----------------------------|--------|-----------|
| [1, 2, 3] |[Hulu, Netflix, Amazon_Prime]| 1|San Andreas|
Meaning I only want distinct titles back, but I still want each title's associated providers. Is this only possible to do post-sql query with logic iterating through the returned rows?
Depending on your database engine, there may be an aggregation function to help achieve this.
For example, this SQLfiddle demonstrates the postgres array_agg function:
SELECT t.title_id,
t.title_name,
array_agg( p.provider_id ),
array_agg( p.provider_name )
FROM provider_title as pt
JOIN
provider as p
ON pt.provider_id = p.provider_id
JOIN title as t
ON t.title_id = pt.title_id
GROUP BY t.title_id,
t.title_name
Other database engines have equivalents. For example:
MySQL has group_concat
Oracle has listagg
SQLite has group_concat (as well!)
If your database isn't covered by the above, you can google '[Your database engine] aggregate comma delimited string'
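For a quick way to try the aggregation approach without a PostgreSQL server, the sketch below runs the same join with SQLite's group_concat through Python's sqlite3 module, using the sample rows from the question. SQLite returns a comma-separated string rather than an array and does not promise an order within each group, so the example splits and sorts the names on the client side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE title (title_id INTEGER PRIMARY KEY, title_name TEXT);
CREATE TABLE provider (provider_id INTEGER PRIMARY KEY, provider_name TEXT);
CREATE TABLE provider_title (provider_id INTEGER, title_id INTEGER);
INSERT INTO title VALUES
  (1,'San Andreas'),(2,'Human Centipede'),(3,'Zoolander 2'),(4,'Hot Pursuit');
INSERT INTO provider VALUES
  (1,'Hulu'),(2,'Netflix'),(3,'Amazon_Prime'),(4,'HBO_GO');
INSERT INTO provider_title VALUES (1,1),(1,2),(2,1),(3,1),(3,3),(4,4);
""")

query = """
SELECT t.title_id,
       t.title_name,
       group_concat(p.provider_id)   AS provider_ids,
       group_concat(p.provider_name) AS provider_names
FROM provider_title AS pt
JOIN provider AS p ON p.provider_id = pt.provider_id
JOIN title AS t ON t.title_id = pt.title_id
GROUP BY t.title_id, t.title_name
ORDER BY t.title_id
"""
# One row per title; split the concatenated names back into a sorted list.
providers_by_title = {
    title_name: sorted(names.split(","))
    for _title_id, title_name, _ids, names in conn.execute(query)
}
print(providers_by_title["San Andreas"])  # ['Amazon_Prime', 'Hulu', 'Netflix']
```

Splitting on a comma is only safe here because none of the provider names contain one; group_concat accepts a different separator as a second argument if that matters.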

Pivot table using flat table structure in SQL Server without aggregation

I have a flat table structure which I've turned into a column based table. I'm struggling with getting the rowId from my raw data to appear in my column based table. Any help greatly appreciated.
Raw data in table derived from three different tables:
| rowId |columnName |ColumnValue |
| ---------------- |:---------------:| -----------:|
| 1 |itemNo |1 |
| 1 |itemName |Polo Shirt |
| 1 |itemDescription |Green |
| 1 |price1 |4.2 |
| 1 |price2 |5.3 |
| 1 |price3 |7.5 |
| 1 |displayOrder |1 |
| 1 |rowId |[NULL] |
| 2 |itemNo |12 |
| 2 |itemName |Digital Watch|
| 2 |itemDescription |Red Watch |
| 2 |price1 |4.0 |
| 2 |price2 |2.0 |
| 2 |price3 |1.5 |
| 2 |displayOrder |3 |
| 2 |rowId |[NULL] |
SQL using pivot to give me the column structure:
select [displayOrder],[itemDescription],[itemName],[itemNo],[price1],[price2],[price3],[rowId]
from
(
SELECT [columnName], [columnValue] , row_number() over(partition by c.columnName order by cv.rowId) as rn
FROM tblFlatTable AS t
JOIN tblFlatColumns c
ON t.flatTableId = c.flatTableId
JOIN tblFlatColumnValues cv
ON cv.flatColumnId = c.flatColumnId
WHERE (t.flatTableId = 1) AND (t.isActive = 1)
AND (c.isActive = 1) AND (cv.isActive = 1)
) as S
Pivot
(
MIN([columnValue])
FOR columnName IN ([displayOrder],[itemDescription],[itemName],[itemNo],[price1],[price2],[price3],[rowId])
) as P
Result:
|displayOrder|itemDescription|itemName |price1|price2|price3|rowId |
| ---------- |:-------------:|:------------:|:----:|:----:|:----:|-----:|
|1 |Green |Polo Shirt |4.2 |5.3 |7.5 |[NULL]|
|3 |Red watch |Digital Watch |4.0 |2.0 |1.5 |[NULL]|
I understand why I'm getting the NULL value for rowId. What I'm stuck on is how to pull the value for rowId from the raw data and add it to my structure. I'm not sure it's even possible, as I've looked at many examples and none seem to do this.
It looks obvious now!
I'm now not including rowId as part of my flat structure.
| rowId |columnName |ColumnValue |
| ---------------- |:---------------:| -----------:|
| 1 |itemNo |1 |
| 1 |itemName |Polo Shirt |
| 1 |itemDescription |Green |
| 1 |price1 |4.2 |
| 1 |price2 |5.3 |
| 1 |price3 |7.5 |
| 1 |displayOrder |1 |
| 2 |itemNo |12 |
| 2 |itemName |Digital Watch|
| 2 |itemDescription |Red Watch |
| 2 |price1 |4.0 |
| 2 |price2 |2.0 |
| 2 |price3 |1.5 |
| 2 |displayOrder |3 |
I've updated the SQL, you can see I'm pulling in the rowId from tblFlatColumnValues
select [rowId],[displayOrder],[itemDescription],[itemName],[itemNo],[price1],[price2],[price3]
from
(
SELECT cv.rowId, [columnName], [columnValue] , row_number() over(partition by c.columnName order by cv.rowId) as rn
FROM tblFlatTable AS t
JOIN tblFlatColumns c
ON t.flatTableId = c.flatTableId
JOIN tblFlatColumnValues cv
ON cv.flatColumnId = c.flatColumnId
WHERE (t.flatTableId = 1) AND (t.isActive = 1)
AND (c.isActive = 1) AND (cv.isActive = 1)
) as S
Pivot
(
MIN([columnValue])
FOR columnName IN ([displayOrder],[itemDescription],[itemName],[itemNo],[price1],[price2],[price3])
) as P
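The reason this works is that rowId is now an ordinary column of the source query, so it acts as a grouping key rather than a pivoted value. The same idea can be sketched without the PIVOT operator by using conditional aggregation (MAX over a CASE expression per column), which is a different technique from the query above but runs on engines that lack PIVOT. The example uses Python's sqlite3 module, a hypothetical single table named flat in place of the three joined tables, and only a subset of the columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the three joined source tables.
conn.execute("CREATE TABLE flat (rowId INTEGER, columnName TEXT, columnValue TEXT)")
conn.executemany("INSERT INTO flat VALUES (?, ?, ?)", [
    (1, "itemNo", "1"), (1, "itemName", "Polo Shirt"), (1, "price1", "4.2"),
    (2, "itemNo", "12"), (2, "itemName", "Digital Watch"), (2, "price1", "4.0"),
])

# Trusted, hard-coded column list; never build this from user input.
columns = ["itemNo", "itemName", "price1"]
cases = ",\n       ".join(
    f"MAX(CASE WHEN columnName = '{c}' THEN columnValue END) AS {c}" for c in columns
)
# rowId is the grouping key, so each output row corresponds to one source rowId.
query = f"SELECT rowId,\n       {cases}\nFROM flat\nGROUP BY rowId\nORDER BY rowId"
pivoted = conn.execute(query).fetchall()
print(pivoted)  # [(1, '1', 'Polo Shirt', '4.2'), (2, '12', 'Digital Watch', '4.0')]
```

As with MIN([columnValue]) in the PIVOT version, the MAX here does not really aggregate anything: each rowId has at most one value per column name, so it simply picks that value.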

Grouped string aggregation / LISTAGG for SQL Server

I'm sure this has been asked but I can't quite find the right search terms.
Given a schema like this:
| CarMakeID | CarMake
------------------------
| 1 | SuperCars
| 2 | MehCars
| CarModelID | CarMakeID | CarModel
-----------------------------------------
| 1 | 1 | Zoom
| 2 | 1 | Wow
| 3 | 1 | Awesome
| 4 | 2 | Mediocrity
| 5 | 2 | YoureSettling
I want to produce a dataset like this:
| CarMakeID | CarMake | CarModels
---------------------------------------------
| 1 | SuperCars | Zoom, Wow, Awesome
| 2 | MehCars | Mediocrity, YoureSettling
What do I do in place of 'AGG' for strings in SQL Server in the following style query?
SELECT *,
(SELECT AGG(CarModel)
FROM CarModels model
WHERE model.CarMakeID = make.CarMakeID
GROUP BY make.CarMakeID) as CarModels
FROM CarMakes make
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/
It is an interesting problem in Transact SQL, for which there are a number of solutions and considerable debate. How do you go about producing a summary result in which a distinguishing column from each row in each particular category is listed in a 'aggregate' column? A simple, and intuitive way of displaying data is surprisingly difficult to achieve. Anith Sen gives a summary of different ways, and offers words of caution over the one you choose...
On SQL Server 2017 or later, or on Azure SQL Database, you can use STRING_AGG as below:
SELECT make.CarMakeId, make.CarMake,
CarModels = string_agg(model.CarModel, ', ')
FROM CarModels model
INNER JOIN CarMakes make
ON model.CarMakeId = make.CarMakeId
GROUP BY make.CarMakeId, make.CarMake
Output:
+-----------+-----------+---------------------------+
| CarMakeId | CarMake | CarModels |
+-----------+-----------+---------------------------+
| 1 | SuperCars | Zoom, Wow, Awesome |
| 2 | MehCars | Mediocrity, YoureSettling |
+-----------+-----------+---------------------------+