Collecting entity attribute values with null for missing attributes in SQL

I have some entities in one table and their attributes and values in another. I would like to write a select that shows the value of a specific attribute for every entity, or null if that attribute is missing. How can I do this using standard SQL?
This is the setup:
create table person (id int not null, nick varchar(32) not null);
insert into person (id, nick) values (1, 'John');
insert into person (id, nick) values (2, 'Peter');
create table req_attributes (name varchar(32));
create table person_attributes (id int not null,
person_id int not null,
attribute varchar(32) not null,
value varchar(64) not null);
insert into person_attributes values (1, 1, 'age', '21');
insert into person_attributes values (2, 1, 'hair', 'brown');
insert into person_attributes values (3, 2, 'age', '32');
insert into person_attributes values (4, 2, 'music', 'jazz');
And this is my current select statement:
select * from person join person_attributes on
person.id = person_attributes.person_id
where attribute = 'hair';
Obviously Peter is not in the result set because we have no information about his hair. I would like to get him into the result set as well, but with null value.
The best would be if the result set was like
Person, Hair color
John, brown
Peter, null
I would like to avoid subqueries if possible, but if it is impossible to do with joins then they are welcome.

An outer join will do this:
select p.*, pa.value
from person p
left join person_attributes pa
on p.id = pa.person_id
and pa.attribute = 'hair';
Note that the condition for the "outer joined" table needs to go into the JOIN clause, not into the where clause. If the condition was in the where clause it would effectively turn the outer join into an inner join. This is because pa.attribute would be null due to the outer join, and the where would not match the null value thus eliminating all the rows that should actually stay in the result.
SQLFiddle based on your example: http://sqlfiddle.com/#!12/d0342/1
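To see the difference between the two placements of the condition, here is a minimal sketch that runs the question's setup through Python's built-in sqlite3 module. SQLite is only a stand-in engine here (an assumption, since the question asks about standard SQL in general); the tables and data are the ones from the question.

```python
import sqlite3

# In-memory database with the tables and rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table person (id int not null, nick varchar(32) not null);
    insert into person (id, nick) values (1, 'John');
    insert into person (id, nick) values (2, 'Peter');
    create table person_attributes (id int not null,
        person_id int not null,
        attribute varchar(32) not null,
        value varchar(64) not null);
    insert into person_attributes values (1, 1, 'age', '21');
    insert into person_attributes values (2, 1, 'hair', 'brown');
    insert into person_attributes values (3, 2, 'age', '32');
    insert into person_attributes values (4, 2, 'music', 'jazz');
""")

# Condition in the JOIN clause: persons without a 'hair' row are kept,
# with NULL in the value column.
filter_in_on = conn.execute("""
    select p.nick, pa.value
    from person p
    left join person_attributes pa
      on p.id = pa.person_id
     and pa.attribute = 'hair'
    order by p.id
""").fetchall()

# Condition in the WHERE clause: rows that are not 'hair' are removed
# after the join, so the outer join behaves like an inner join.
filter_in_where = conn.execute("""
    select p.nick, pa.value
    from person p
    left join person_attributes pa
      on p.id = pa.person_id
    where pa.attribute = 'hair'
    order by p.id
""").fetchall()

print(filter_in_on)     # [('John', 'brown'), ('Peter', None)]
print(filter_in_where)  # [('John', 'brown')]
```

Python shows SQL NULL as None, so the first result is the "Peter, null" row the question asks for.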

Related

How to select rows in a many-to-many relationship? (SQL)

I have a Students table and a Courses table.
They have a many to many relationship between them and the StudentCourses table is the intermediary.
Now, I have a list of Course ids and want to select the Students that follow all Courses in my list.
How??
--CREATE TYPE CourseListType AS TABLE
--(
-- CourseID INT
--)
DECLARE
@CourseList CourseListType
CREATE TABLE #Students
(
ID INT
,Name CHAR(10)
)
CREATE TABLE #Courses
(
ID INT
,Name CHAR(10)
)
CREATE TABLE #StudentCourses
(
StudentID INT
,CourseID INT
)
INSERT INTO @CourseList (CourseID)
VALUES
(1) --English
,(2) --Math
INSERT INTO #Students (ID, Name)
VALUES
(1, 'John')
,(2, 'Jane')
,(3, 'Donald')
INSERT INTO #Courses (ID, Name)
VALUES
(1, 'English')
,(2, 'Math')
,(3, 'Geography')
INSERT INTO #StudentCourses (StudentID, CourseID)
VALUES
(1, 1)
,(1, 2)
,(2, 1)
,(2, 2)
,(3, 1)
,(3, 3)
In this example, I only want the result to be John and Jane, because they both have the two courses in my CourseList.
I don't want Donald, because he only has one of them.
I have tried this JOIN construction, but it does not eliminate students that only have some of my desired courses.
SELECT
*
FROM
@CourseList CRL
INNER JOIN #Courses CRS ON CRS.ID = CRL.CourseID
INNER JOIN #StudentCourses STC ON STC.CourseID = CRS.ID
INNER JOIN #Students STD ON STD.ID = STC.StudentID
If you want students with all your required courses, you can use aggregation and having:
SELECT sc.StudentID
FROM #StudentCourses sc JOIN
@CourseList cl
ON sc.CourseID = cl.CourseID
GROUP BY sc.StudentID
HAVING COUNT(DISTINCT sc.CourseID) = (SELECT COUNT(*) FROM @CourseList);
If you want additional information about students, you can join in the Students table (or use an IN or a similar construct).
Note that this only needs the StudentCourses table. It has the matching ids. There is no need to join in the reference tables.
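As a quick check of the aggregation approach, here is a minimal sketch using Python's sqlite3 module. SQLite has no # temp tables or table-typed variables, so plain tables named Students, StudentCourses and CourseList stand in for them (those names are an assumption for this sketch); the rows are the ones from the question, and the Students table is joined in only to show names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (ID INT, Name CHAR(10));
    CREATE TABLE StudentCourses (StudentID INT, CourseID INT);
    CREATE TABLE CourseList (CourseID INT);  -- stand-in for the table variable

    INSERT INTO Students VALUES (1, 'John'), (2, 'Jane'), (3, 'Donald');
    INSERT INTO StudentCourses VALUES (1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 3);
    INSERT INTO CourseList VALUES (1), (2);  -- English, Math
""")

# Keep only the enrolments that are in the list, then require that each
# student matched as many distinct courses as the list contains.
rows = conn.execute("""
    SELECT s.Name
    FROM StudentCourses sc
    JOIN CourseList cl ON sc.CourseID = cl.CourseID
    JOIN Students s ON s.ID = sc.StudentID
    GROUP BY s.ID, s.Name
    HAVING COUNT(DISTINCT sc.CourseID) = (SELECT COUNT(*) FROM CourseList)
    ORDER BY s.ID
""").fetchall()

names = [name for (name,) in rows]
print(names)  # ['John', 'Jane']
```

Donald matches only one of the two listed courses, so his group fails the HAVING comparison and is dropped.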

Sql query select all messages from everybody inside every group where this one user is at

I'm trying to get all of the messages, from EVERY user, from 2 groups, where the user is located. But I don't know how to get all messages from every group. This is my code so far:
SELECT DISTINCT m.*
FROM `message` m
INNER JOIN users u
ON u.id = m.idUser
LEFT JOIN whats_app w
ON w.idUser= u.id
WHERE u.id = w.idUser
So there is ONLY ONE user in 2 groups. I want to get all messages from everybody inside the groups where the ONE user is located.
Here is some simple SQL as an example:
create table users (
id int PRIMARY KEY NOT NULL,
name varchar(60)
);
create table whatsapp(
idUser int,
idGroup int
);
create table allGroups(
id int PRIMARY KEY NOT NULL,
name varchar(60)
);
create table message_send(
id int,
idUser int,
message text
);
INSERT INTO users(id, name) VALUES
(1, 'John'),
(2, 'Martijn'),
(3, 'Rick'),
(4, 'Vera'),
(5, 'Leon');
INSERT INTO allGroups(id, name) VALUES
(1, 'School'),
(2, 'Friends'),
(3, 'moreFriends'),
(4, 'secretmeeting');
INSERT INTO message_send(id, idUser, message) VALUES
(1, 2, 'How are you feeling today?'),
(2, 1, 'What up?'),
(3, 4, 'I am fine, you?'),
(4, 1, 'hi!');
create table message_send(
id int,
idUser int,
idGroup int,
message text
);
Create the message table like this and then join it directly with the user and group tables to get the output; there is no need for the whatsapp table.
select b.name,message
from
message_send as a,
users as b
where
a.idUser=b.id
Similarly join the group table
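Here is one possible sketch of how the restructured message_send table could be queried for "all messages in the groups that one user belongs to", run through Python's sqlite3 module. It rests on assumptions that are not in the thread: group membership still has to be recorded somewhere, so the question's whatsapp table is kept for that, and the membership rows, the idGroup values and the fifth message are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (id int PRIMARY KEY NOT NULL, name varchar(60));
    create table whatsapp (idUser int, idGroup int);   -- membership (assumed)
    create table message_send (id int, idUser int, idGroup int, message text);

    INSERT INTO users(id, name) VALUES
        (1, 'John'), (2, 'Martijn'), (3, 'Rick'), (4, 'Vera'), (5, 'Leon');

    -- Hypothetical memberships: John is in groups 1 and 2.
    INSERT INTO whatsapp(idUser, idGroup) VALUES
        (1, 1), (1, 2), (2, 1), (4, 2), (3, 3);

    -- Messages from the question, with hypothetical idGroup values added.
    INSERT INTO message_send(id, idUser, idGroup, message) VALUES
        (1, 2, 1, 'How are you feeling today?'),
        (2, 1, 1, 'What up?'),
        (3, 4, 2, 'I am fine, you?'),
        (4, 1, 2, 'hi!'),
        (5, 3, 3, 'posted in a group John is not in');
""")

# All messages, from any sender, posted in a group that user 1 belongs to.
messages = conn.execute("""
    select u.name, m.message
    from message_send m
    join users u on u.id = m.idUser        -- who sent the message
    join whatsapp w on w.idGroup = m.idGroup
    where w.idUser = ?                     -- groups of the one user
    order by m.id
""", (1,)).fetchall()

for name, message in messages:
    print(name, "-", message)
```

With this sample data the message in group 3 is left out, because user 1 has no membership row for that group.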

How to convert a varchar value to datatype int in SQL Server 2008 with inner join

I have two lookup tables MyProviders and MyGroups. In my stored procedure, I have a temp table (replaced with an actual table for this example) with data. One column EntityId refers to either provider or a group. EntityTypeId tells me in that temp table if the entity is 1 = Provider or 2 = Group. EntityId can either have numeric GroupId or alphanumeric ExternalProviderId.
I want to check if there is any record in my temp table that has an invalid combination of clientOid + entityid from myprovider and mygroup table.
create table MyProviders
(
id int,
clientoid varchar(20),
externalproviderid varchar(20),
name varchar(25)
)
create table MyGroups
(
id int,
clientoid varchar(20),
name varchar(25)
)
create table MyJobDetails
(
clientoid varchar(20),
entityid varchar(20),
entitytypeid int,
entityname varchar(30)
)
insert into MyJobDetails values ('M.OID', 'MONYE', 1, 'Mark')
insert into MyJobDetails values ('M.OID', 2, 1, 'Lori')
insert into MyJobDetails values ('M.OID', 2, 2, 'Group 1')
insert into MyJobDetails values ('M.OID', 44444, 2, 'Group 2')
insert into MyProviders values (1, 'M.OID', 'MONY', 'Richard')
insert into MyProviders values (2, 'M.OID', '2', 'Mike')
insert into MyProviders values (3, 'M.OID', '3', 'Lori')
insert into MyGroups values (1, 'M.OID', 'Group 1')
insert into MyGroups values (2, 'M.OID', 'Group 2')
I tried the following query to determine if there is an invalid entity or not.
select
COUNT(*)
from
MyJobDetails as jd
where
not exists (select 1
from MyProviders as p
where p.ClientOID = jd.ClientOID
and p.ExternalProviderID = CAST(jd.EntityId as varchar(20))
and jd.EntityTypeId = 1)
and not exists (select 1
from MyGroups as g
where g.ClientOID = jd.ClientOID
and g.Id = jd.EntityId
and jd.EntityTypeId = 2)
This works as expected until I get alphanumeric data in my temp table that doesn't exist in the provider table. I get the following error message:
Conversion failed when converting the varchar value 'MONYE' to data type int.
I have tried to update the solutions mentioned in other threads to use IsNumeric but it didn't work either. In this example, I need to return 1 for one invalid entry of MONYE which doesn't exist either in MyProvider or MyGroup table.
Also, can I optimize the query to achieve what I want in a better way?
This is a really bad design in my opinion.
Since you're referencing one out of two tables, you cannot enforce referential integrity.
And having different datatypes for your keys makes things even more horrible.
I would:
- use two separate foreign keys in MyJobDetails: one to MyProviders (varchar(20)) and another one to MyGroups (int)
- make them both nullable
- establish a proper foreign key relationship to the referenced table for each of those two
This way, both can be the correct datatype for each referenced table, and you won't need the EntityTypeId column anymore.
As a side note: whenever you use varchar in SQL Server, whether you're defining a parameter, a variable, or using it in a CAST expression, I recommend always explicitly defining a length for that varchar.
Or do you know what length this varchar in your conversion here is going to be?
CAST(jd.EntityId as varchar)
Use an explicit length - always - it's just a good, safe practice to employ:
CAST(jd.EntityId as varchar(15))
In the second AND NOT EXISTS section you compare g.Id, an int, with jd.EntityId, a varchar. Cast the g.Id as a varchar.
and not exists (select 1
from #MyGroups as g
where g.ClientOID = jd.ClientOID
and CAST(g.Id AS VARCHAR(20)) = jd.EntityId
and jd.EntityTypeId = 2)
Try this
select count(*)
from (
select clientoid,entityid from #MyJobDetails where entitytypeid=1
except
select p.ClientOID ,convert(varchar(200),p.ExternalProviderID) from #MyProviders p inner join #MyJobDetails jd on p.ClientOID = jd.ClientOID and p.ExternalProviderID = CAST(jd.EntityId as varchar(20)) where jd.EntityTypeId = 1
except
select g.ClientOID,convert(varchar(200),g.Id) from #MyGroups g inner join #MyJobDetails jd on g.ClientOID = jd.ClientOID and convert(varchar(200),g.Id) = jd.EntityId where jd.EntityTypeId = 2
)a

SQL Join tables - detecting presence of some tuples but not others

I've got two primary tables: codes and categories.
I've also got a join table code_mappings which associates codes with categories.
I need to be able to determine which codes are mapped to one group of categories, but not mapped to another. Been banging my head against this for a while, but am completely stuck.
Here's the schema:
create table codes(
id int,
name varchar(256));
create table code_mappings(
id int,
code_id int,
category_id int);
create table categories(
id int,
name varchar(256));
And some seed data:
INSERT INTO categories VALUES(1, 'Dental');
INSERT INTO categories VALUES(2, 'Weight');
INSERT INTO categories VALUES(3, 'Other');
INSERT INTO categories VALUES(4, 'Acme Co');
INSERT INTO categories VALUES(5, 'No Name');
INSERT INTO codes VALUES(100, 'big bag of cat food');
INSERT INTO codes VALUES(200, 'healthy doggie treatz');
INSERT INTO code_mappings VALUES(50, 200, 1);
INSERT INTO code_mappings VALUES(51, 100, 4);
INSERT INTO code_mappings VALUES(52, 100, 3);
How would I write a query that will give me the codes that are mapped to one of categories (1,2,3) but not to one of categories (4,5)?
This is an example of a set-within-sets query. I like to approach these using group by and having, because I find that the most flexible approach:
select cm.code_id
from code_mappings cm
group by cm.code_id
having sum(case when cm.category_id in (1, 2, 3) then 1 else 0 end) = 1 and
sum(case when cm.category_id in (4, 5) then 1 else 0 end) = 0;
Each condition in the having clause implements exactly one of the conditions. You said one code of 1, 2, or 3, hence the = 1 (if you wanted at least one of these three, it would be > 0). You said no 4 or 5, hence = 0.
SELECT *
FROM codes co
WHERE EXISTS (
SELECT *
FROM code_mappings ex
WHERE ex.code_id = co.id
AND ex.category_id IN (1,2,3)
)
AND NOT EXISTS (
SELECT *
FROM code_mappings nx
WHERE nx.code_id = co.id
AND nx.category_id IN (4,5)
)
;
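Both approaches can be checked against the seed data with a short sketch using Python's sqlite3 module (SQLite is just an assumed stand-in engine). To make the two queries comparable, the HAVING version below uses > 0, the "at least one of (1,2,3)" reading, which is what the EXISTS version tests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table codes(id int, name varchar(256));
    create table code_mappings(id int, code_id int, category_id int);

    INSERT INTO codes VALUES(100, 'big bag of cat food');
    INSERT INTO codes VALUES(200, 'healthy doggie treatz');
    INSERT INTO code_mappings VALUES(50, 200, 1);
    INSERT INTO code_mappings VALUES(51, 100, 4);
    INSERT INTO code_mappings VALUES(52, 100, 3);
""")

# Aggregation: count the wanted and unwanted mappings per code.
via_having = [code_id for (code_id,) in conn.execute("""
    select cm.code_id
    from code_mappings cm
    group by cm.code_id
    having sum(case when cm.category_id in (1, 2, 3) then 1 else 0 end) > 0
       and sum(case when cm.category_id in (4, 5) then 1 else 0 end) = 0
    order by cm.code_id
""")]

# EXISTS / NOT EXISTS: one correlated subquery per condition.
via_exists = [code_id for (code_id,) in conn.execute("""
    SELECT co.id
    FROM codes co
    WHERE EXISTS (
        SELECT 1 FROM code_mappings ex
        WHERE ex.code_id = co.id AND ex.category_id IN (1, 2, 3)
    )
    AND NOT EXISTS (
        SELECT 1 FROM code_mappings nx
        WHERE nx.code_id = co.id AND nx.category_id IN (4, 5)
    )
    ORDER BY co.id
""")]

print(via_having)  # [200]
print(via_exists)  # [200]
```

Code 100 is mapped to category 3 but also to category 4, so both queries exclude it; code 200 is mapped only to category 1 and is returned.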

How to select a value in the same table as the value for an update for each row

I have a table structure with columns like this
[ID]
[Name]
[ParentId]
[ParentName]
The parents are contained in the same table, and I would like to populate the parent name column using a statement like:
UPDATE Table
SET ParentName = (select Name
from Table
where Id = ParentId)
When I do this, all the ParentNames are set to null. Thoughts?
I would go with the update from statement.
UPDATE tb
SET
tb.ParentName = parent.Name
FROM Table tb
INNER JOIN Table parent ON parent.Id = tb.ParentId
This is T-SQL specific, but it should work pretty well.
Here's another T-SQL syntax you can use :
(BTW, I agree with cletus about the denormalization concerns.)
-- create dummy table
create table test (id int, name varchar(20),
parentid int, parentname varchar(20))
go
-- add some rows
insert test values (1, 'parent A', null, null)
insert test values (2, 'parent B', null, null)
insert test values (3, 'parent C', null, null)
insert test values (11, 'child A 1', 1, null)
insert test values (12, 'child A 2', 1, null)
insert test values (33, 'child C 1', 3, null)
go
-- perform update
update c set parentname = p.name from test c join test p on c.parentid = p.id
go
-- check result
select * from test
Here is a solution that I have working
UPDATE TABLE
SET ParentName = b.Name from
(
select t.name as name, t.id as id
from TABLE t
) b
where b.id = parentid
Note I refuse to believe that it has to be this ugly, I'm sure that something very similar to what OMG Ponies posted should work but try as I might I couldn't make it happen.
Here, the subquery returns no value for any row, so null is assigned to ParentName.
UPDATE
T
SET
parentname = PT.name
FROM
MyTable T
JOIN
MyTable PT ON t.parentid = PT.id
Your error occurs because you have no correlation in the subquery. You get zero rows unless "Id = ParentId" holds in some row of the table
select Name from Table where Id = ParentId -- = no rows
You can't use an alias like UPDATE TABLE T ... so push the JOIN/correlation into the FROM clause (or a CTE or derived table)
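The missing correlation can be reproduced with a minimal sketch using Python's sqlite3 module. SQLite is an assumption here (the thread is about T-SQL), so instead of UPDATE ... FROM the sketch uses a correlated subquery with an alias on the inner table, which is a different but portable way to express the same correlation. The test table and rows are the ones from the dummy-table answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table test (id int, name varchar(20),
                       parentid int, parentname varchar(20));
    insert into test values (1, 'parent A', null, null);
    insert into test values (2, 'parent B', null, null);
    insert into test values (3, 'parent C', null, null);
    insert into test values (11, 'child A 1', 1, null);
    insert into test values (12, 'child A 2', 1, null);
    insert into test values (33, 'child C 1', 3, null);
""")

# Uncorrelated: inside the subquery both id and parentid refer to the
# inner copy of the table. No row has id = parentid, so the subquery is
# empty and every parentname becomes NULL.
conn.execute("""
    update test
    set parentname = (select name from test where id = parentid)
""")
uncorrelated = conn.execute(
    "select id, parentname from test order by id").fetchall()

# Correlated: the inner table gets the alias p, and test.parentid refers
# to the row currently being updated.
conn.execute("""
    update test
    set parentname = (select p.name from test p where p.id = test.parentid)
""")
correlated = conn.execute(
    "select id, parentname from test order by id").fetchall()

print(uncorrelated)
print(correlated)
```

After the second update the three child rows carry their parent's name, while the parent rows keep NULL because they have no parentid.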