SQL SELECT criteria in another table - sql

I have 2 related tables:
messages
--------
mid subject
--- -----------------
1 Hello world
2 Bye world
3 The third message
4 Last one
properties
----------
pid mid name value
--- --- ---------------- -----------
1 1 read false
2 1 importance high
3 2 read false
4 2 importance low
5 3 read true
6 3 importance low
7 4 read false
8 4 importance high
And I need to get from messages using the criteria on the properties table.
Eg: if I have a criteria like return unread (read=false) high prio (importance=high) messages it should return
mid subject
--- -----------------
1 Hello world
4 Last one
How could I get this with a SELECT clause (MySQL dialect)?

In SQL, any expression in a WHERE clause can only reference one row at a time. So you need some way of getting multiple rows from your properties table onto one row of result. You do this with self-joins:
SELECT ...
FROM messages AS m
JOIN properties AS pRead
ON m.mid = pRead.mid AND pRead.name = 'read'
JOIN properties AS pImportance
ON m.mid = pImportance.mid AND pImportance.name = 'importance'
WHERE pRead.value = 'false' AND pImportance.value = 'high';
This shows how awkward it is to use the EAV antipattern. Compare with using conventional attributes, where one attribute belongs in one column:
SELECT ...
FROM messages AS m
WHERE m.read = 'false' AND m.importance = 'high';
By the way, both answers from #Abe Miessler and #Thomas match more mid's than you want. They match all mid's where read=false OR where importance=high. You need to combine these properties with the equivalent of AND.

I believe the query below will work.
UPDATE: #Gratzy is right, this query won't work, take a look at the structure changes I suggested.
SELECT DISTINCT m.id as mid, m.subject
FROM message as m
INNER JOIN properties as p
ON m.mid = p.mid
where (p.name = 'read' and p.value = 'false') or (p.name = 'importance' AND p.value = 'high')
The structure of your properties table seems a little off to me though...
Would it be possible to structure the table like this:
messages
--------
mid subject Read Importance
--- ----------------- --------- ------------
1 Hello world false 3
2 Bye world false 1
3 The third message true 1
4 Last one false 3
importance
----------
iid importanceName
--- --------------
1 low
2 medium
3 high
and use this query:
SELECT m.id as mid, m.subject
FROM message as m
where m.read = false AND m.importance = 3

Clearly, you are using an EAV (Entity-Attribute-Value) schema. One of the many reasons for avoiding such a structure is that it makes queries more difficult. However, for the example you gave, you could do something like:
Select ...
From messages As M
Where Exists (
Select 1
From Properties As P1
Where P1.mid = M.mid
And P1.name = 'unread' And P1.value = 'false'
)
And Exists (
Select 1
From Properties As P2
Where P2.mid = M.mid
And P2.name = 'importance' And P2.value = 'high'
)
A more succinct solution would be:
Select ...
From messages As M
Where Exists (
Select 1
From Properties As P1
Where P1.mid = M.mid
And ((P1.name = 'unread' And P1.value = 'false')
Or (P1.name = 'importance' And P1.value = 'high'))
Having Count(*) = 2
)

Select m.mid, m.subject
from properties p
inner join properties p1 on p.mid = p1.mid
inner join messages m on p.mid = m.mid
where
p.name = 'read'
and p.value = 'false'
and p1.name = 'importance'
and p2.value = 'high'
I prefer to put my filter criteria in the where clause and leave my join's to elements that are in both tables and are the actual criteria for the join.

Another way might be (untested) to use a derived table to hold the criteria that all messages must meet then use the standard relational division technique of double NOT EXISTS
SELECT mid,
subject
FROM messages m
WHERE NOT EXISTS
( SELECT *
FROM ( SELECT 'read' AS name,
'false' AS value
UNION ALL
SELECT 'importance' AS name,
'high' AS value
)
c
WHERE NOT EXISTS
(SELECT *
FROM properties P
WHERE p.mid = m.mid
AND p.name =c.name
AND p.value=c.value
)
)

If you want to keep your existing data model, then go with Bill Karwin's first suggestion. Run it with this select clause to understand what it's doing:
select m.*, r.value as read, i.value as importance
from message m
join properties r
on r.mid = m.mid and r.name = 'read'
join properties i
on i.mid = m.mid and i.name = 'importance'
where r.value = 'false' and i.value = 'high';
But if you go this way, there are a few constraints you should put in place to avoid storing and retrieving bad data:
A unique index on message(mid) and a unique index on properties(pid), both of which I'm sure you have already.
A unique index on properties(mid, name) so that each property can only be defined once for a message -- otherwise you may get duplicate results from your query. This will also help your query performance by allowing an index access for both joins.

Related

SELECT from 2 differents tables

I would like to select all possible brands for different products where the fk_category_id is for example equal to "2".
produits :
id titre fk_category_id fk_marque_id is_active is_delete
1 Swoke 2 1 1 0
2 Café 2 2 1 0
3 Fraise 2 3 1 0
4 Fruits 2 4 1 0
manufacturers :
id name
1 Swoke
2 Liqua
3 Alfaliquid
4 TJuice
5 otherBrands
I already tried a lot of things and for example this request :
SELECT m.name, m.id, p.fk_category_id
FROM produits p
INNER JOIN manufacturers m
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.fk_marque_id = m.id
AND p.is_delete = 0;
But it doesn't works.
The expected result is :
result :
id name
1 Swoke
2 Liqua
3 Alfaliquid
4 TJuice
It's the same as the table "manufacturers" but I have to sort by fk_category_id because I only want the brand with the fk_category = 2.
So if someone could explain me or help me to understand how to solve my "problem" ? Thanks you in advance, I continue my research by my side :).
If you need something else i can give you anything.
I think what you need to do is have a condition on your join, to say how the data from the 2 table should join together.
On the assumption that fk_marque_id is your reference in produits to an item in manufacturers (assumed from looking at your where clause), your sql could look like this:
SELECT
p.id, p.titre, m.name, m.id, p.fk_category_id
FROM
produits p
INNER JOIN manufacturers m ON p.fk_marque_id = m.id
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.is_delete = 0;
The naming convention of your fields in produits is a little weird however, if one of the the FK fields is a link to an ID in manufacturers. You'd normally expect to see something like FK_manufacturers_Id so it's clear that this column is the reference to that field (Id) in that table (manufacturers)
If you are just looking to join products and manufacturers table where fk_category_id = 2, it can be done something like this
SELECT m.id, m.name
FROM manufacturers m
INNER JOIN produits p
ON m.id = p.fk_marque_id
WHERE p.fk_category_id = 2
AND p.is_active = 1
AND p.is_delete = 0;

Join Tables to return 1 or 0 based on multiple conditions

I am working on a project management website and have been asked for a new feature in a review meeting section.
A meeting is held to determine whether to proceed to the next phase, and I need to maintain a list of who attended each phase review meeting. I need to write an SQL query to return all people, with an additional column that states they have already been added before.
There are two tables involved to get my desired result, with the relevant columns listed below:
Name: PersonList
ID | Name | Division
Name: reviewParticipants
ProjectID | PersonID | GateID
The query I am looking for is something that returns all people in PersonList, with an additional "hasAttended" bit that is TRUE if reviewParticipants.ProjectID = 5 AND reviewParticpants.CurrentPhase = 'G0' ELSE FALSE.
PersonName | PersonID | hasAttended
Mr Smith | 1 | 1
Mr Jones | 2 | 0
I am not sure how to structure such a query with multiple conditions in a (left?) join, that would return as a different column name and data type, so I would appreciate if anybody can point me in the right direction?
With the result of this query I am going to add a series of checkboxes, and use this additional bit to mark it checked, or not, for page refreshes.
You can use LEFT JOIN as well:
SELECT DISTINCT p.*
,CASE WHEN rp.id IS NOT NULL THEN 1 ELSE 0 END AS hasAttended
FROM personlist p
LEFT JOIN reviewParticipants rp ON rp.personid = p.id
AND rp.projectid = 5
AND rp.currentphase = 'GO'
I agree with Gordon Linoff: I would prefer an int or tinyint over a bit value,
You can use exists to see if there is a matching row.
select p.*,
(case when exists (select 1
from reviewParticipants rp
where rp.personid = p.id and
rp.projectid = 5 and
rp.currentphase = 'GO'
)
then 1 else 0 end)
from personlist p;
I see no reason to prefer a bit over an integer, but you can return a bit if you really prefer.
This will do :
select a.* from PersonList a where a.hasAttended=1 and
a.Id in (select b.PersonId from reviewParticipants b
where b.ProjectID =5 and exists (
select 1 from reviewParticipants c where c.CurrentPhase = 'G0'and
c.Project =b.projectId
)
)

Is there a simpler way to write this query? [MS SQL Server]

I'm wondering if there is a simpler way to accomplish my goal than what I've come up with.
I am returning a specific attribute that applies to an object. The objects go through multiple iterations and the attributes might change slightly from iteration to iteration. The iteration will only be added to the table if the attribute changes. So the most recent iteration might not be in the table.
Each attribute is uniquely identified by a combination of the Attribute ID (AttribId) and Generation ID (GenId).
Object_Table
ObjectId | AttribId | GenId
32 | 2 | 3
33 | 3 | 1
Attribute_Table
AttribId | GenId | AttribDesc
1 | 1 | Text
2 | 1 | Some Text
2 | 2 | Some Different Text
3 | 1 | Other Text
When I query on a specific object I would like it to return an exact match if possible. For example, Object ID 33 would return "Other Text".
But if there is no exact match, I would like for the most recent generation (largest Gen ID) to be returned. For example, Object ID 32 would return "Some Different Text". Since there is no Attribute ID 2 from Gen 3, it uses the description from the most recent iteration of the Attribute which is Gen ID 2.
This is what I've come up with to accomplish that goal:
SELECT attr.AttribDesc
FROM Attribute_Table AS attr
JOIN Object_Table AS obj
ON obj.AttribId = obj.AttribId
WHERE attr.GenId = (SELECT MIN(GenId)
FROM(SELECT CASE obj2.GenId
WHEN attr2.GenId THEN attr2.GenId
ELSE(SELECT MAX(attr3.GenId)
FROM Attribute_Table AS attr3
JOIN Object_Table AS obj3
ON obj3.AttribId = attr3.AttribId
WHERE obj3.AttribId = 2
)
END AS GenId
FROM Attribute_Table AS attr2
JOIN Object_Table AS obj2
ON attr2.AttribId = obj2.AttribId
WHERE obj2.AttribId = 2
) AS ListOfGens
)
Is there a simpler way to accomplish this? I feel that there should be, but I'm relatively new to SQL and can't think of anything else.
Thanks!
The following query will return the matching value, if found, otherwise use a correlated subquery to return the value with the highest GenId and matching AttribId:
SELECT obj.Object_Id,
CASE WHEN attr1.AttribDesc IS NOT NULL THEN attr1.AttribDesc ELSE attr2.AttribDesc END AS AttribDesc
FROM Object_Table AS obj
LEFT JOIN Attribute_Table AS attr1
ON attr1.AttribId = obj.AttribId AND attr1.GenId = obj.GenId
LEFT JOIN Attribute_Table AS attr2
ON attr2.AttribId = obj.AttribId AND attr2.GenId = (
SELECT max(GenId)
FROM Attribute_Table AS attr3
WHERE attr3.AttribId = obj.AttribId)
In the case where there is no matching record at all with the given AttribId, it will return NULL. If you want to get no record at all in this case, make the second JOIN an INNER JOIN rather than a LEFT JOIN.
Try this...
Incase the logic doesn't find a match for the Object_table GENID it maps it to the next highest GENID in the ON clause of the JOIN.
SELECT AttribDesc
FROM object_TABLE A
INNER JOIN Attribute_Table B
ON A.AttrbId = B.AttrbId
AND (
CASE
WHEN A.Genid <> B.Genid
THEN (
SELECT MAX(C.Genid)
FROM Attribute_Table C
WHERE A.AttrbId = C.AttrbId
)
ELSE A.Genid
END
) -- Selecting the right GENID in the join clause should do the job
= B.Genid
This should work:
with x as (
select *, row_number() over (partition by AttribId order by GenId desc) as rn
from Attribute_Table
)
select isnull(a.attribdesc, x.attribdesc)
from Object_Table o
left join Attribute_Table a
on o.AttribId = a.AttribId and o.GenId = a.GenId
left join x on o.AttribId = x.AttribId and rn = 1

Exclude results with join conditions

I'm trying to make an sql request with join exclusion.
Explains:
Table element
id # name #
1 Sea
2 tree
Table colour
id # name #
1 green
2 blue
3 brown
Table relation
element_id # colour_id
1 2
2 1
2 3
I have my working request for "get elements for one of these colours".
Exemple with green and blue:
SELECT element.name, colour.name FROM element
LEFT JOIN relation
ON (element.id = relation.element_id)
LEFT JOIN colour
ON (colour.id = relation.colour_id)
WHERE (relation.colour_id = 1 OR relation.colour_id = 2)
I would like make request for "get elements where they have a relation with all listed colors". Where for green and brown it returns tree.
I've tried to change the 'OR' to 'AND' but request return 0 results :/
General way to solve this problem is to filter values and count how many times they appear in result. If equal, all elements are found.
select element_id
from relation
where colour_id in (1, 2)
group by element_id
having count (distinct colour_id) = 2
Having this table one might join it to original tables to produce full column set:
SELECT element.name, colour.name
FROM relation
INNER JOIN
(
select element_id
from relation
where colour_id in (1, 2)
group by element_id
having count (distinct colour_id) = 2
) matches
ON relation.element_id = matches.element_id
INNER JOIN element
ON element.id = relation.element_id
INNER JOIN colour
ON colour.id = relation.colour_id
This type of query can be handled with some of SQL's set-based operators:
Which elements have relations for all colours?
Using the ALL operator (syntax may vary slightly by database):
SELECT element.name
FROM element
WHERE ( SELECT colour.id FROM relation
INNER JOIN colour ON colour.id = relation.colour_id
WHERE relation.element_id = element.id )
= ALL ( SELECT colour.id from colour)
;
Using the EXCEPT operator:
SELECT element.name
FROM element
WHERE NOT EXISTS
( SELECT colour.id from colour
EXCEPT
SELECT colour.id FROM relation
INNER JOIN colour ON colour.id = relation.colour_id
WHERE relation.element_id = element.id
)
;
There is a typo in your WHERE clause - one of the ids is 2 ("blue") instead of 3 ("brown"). It should be
WHERE (relation.colour_id = 1 OR relation.colour_id = 3)
(or in shorter form:
WHERE relation.colour_id IN (1, 3)
).
Note however, that your current query - although after this fix it should work for your sample data - won't give you correct results in general. It will give you the elements associated with any of the specified colors. The correct solution to this is given in #Nikola's answer though.
without subselects, I would suggest:
SELECT
e.id AS id
FROM
element AS e
LEFT OUTER JOIN relation AS r ON r.element_id = e.id
GROUP BY e.id HAVING SUM(CASE WHEN r.colour_id = 1 THEN 1 ELSE 0 END ) = 0
ORDER BY e.id ASC;
after this you can select the elements by id.

A Simple but complex SQL query

I have a very simple MS SQL table with the following data(with column name and datatype):
TableId PersonName Attribute AttributeValue
(int) (nvarchar 50) (nvarchar 50) (bit)
----------- ----------------------- ------------------- --------------
1 A IsHuman 1
2 A CanSpeak 1
3 A CanSee 1
4 A CanWalk 0
5 B IsHuman 1
6 B CanSpeak 1
7 B CanSee 0
8 B CanWalk 0
9 C IsHuman 0
10 C CanSpeak 1
11 C CanSee 1
12 C CanWalk 0
Now, What I need as a result is the unique PersonName that have both Attribute IsHuman and CanSpeak with AttributeValue = 1.
The expected result should be (Must not include C as this one has IsHuman = 0)
PersonName
------------
A
B
Please can any expert help me in writting a SQL Query for this.
SELECT PersonName
FROM MyTable
WHERE AttributeName = 'IsHuman'
AND AttributeValue = 1
INTERSECT
SELECT PersonName
FROM MyTable
WHERE AttributeName = 'CanSpeak'
AND AttributeValue = 1;
Obviously this approach doesn't 'scale' if the criteria can vary. It could be that the relational operator you require is division, popularly known as "the supplier who supplies all parts", specifically division with remainder.
SELECT PersonName
FROM MyTable
WHERE (AttributeName = 'IsHuman' AND AttributeValue = 1) OR
(AttributeName = 'CanSpeak' AND AttributeValue = 1)
GROUP BY PersonName
HAVING COUNT(*) > 1
or
SELECT PersonName
FROM MyTable
WHERE AttributeValue = 1 AND AttributeName IN ('IsHuman', 'CanSpeak')
GROUP BY PersonName
HAVING COUNT(*) > 1
SELECT PersonName FROM MyTable
WHERE PersonName IN
(SELECT T1.PersonName FROM MyTable T1 WHERE T1.Attribute = 'IsHuman' and T1.AttributeValue='1')
AND (Attribute = 'CanSpeak' AND AttributeValue='1')
I think two inner joins may give you alright performance depending on indexing and table sizes.
SELECT t.PersonName FROM table t
INNER JOIN table t2 ON t.PersonName=t2.PersonName AND t3.Attribute = 'IsHuman' AND t2.AttributeValue = 1
INNER JOIN table t3 ON t2.PersonName=t3.PersonName AND t3.Attribute = 'CanSpeak' AND t3.AttributeValue = 1
or
SELECT t.PersonName FROM table t
INNER JOIN table t2 ON t.PersonName=t2.PersonName
INNER JOIN table t3 ON t2.PersonName=t3.PersonName
WHERE t2.Attribute = 'IsHuman' AND t2.AttributeValue = 1 AND t3.Attribute = 'CanSpeak' AND t3.AttributeValue = 1
This solution could be simplified significantly however should the properties IsHuman and CanSpeak were in separate tables with an linking ID table between them. Sounds like this table could possibly benefit from some normalization.
If you cant progress that, a view may assist in performance. I am at home without SQL installed so I cant verify any performance aspects.
select personname from yourtablename where personname in ('a','b') group by personname
I actually use this as a screening question for interviews. None of you people would get the job.
OK, maybe you would, but while the strategies you use might or might not work, they aren't generalizable and they miss a basic notion of relational algebra, to wit, aliasing.
The right answer (in the sense that it would make me more likely to employ you as well as the less importance senses that the RDMS's optimizer understands it and it can be extended to other, arbitrarily complex, cases) is
SELECT t1.PersonName
FROM MyTable t1, MyTable t2
WHERE t2.AttributeName = 'CanSpeak'
AND t2.AttributeValue = 1
AND t1.AttributeName = 'IsHuman'
AND t1.AttributeValue = 1
AND t1.PersonName = t2.PersonName;