MySQL query help (involving joins?) - sql

Although I've figured out several queries that almost do this, I can't quite get it perfectly and I'm getting frustrated. Here is the setup:
Table: Issue
| id | name | value |
+-------------------+
| 1 | a | 10 |
| 2 | b | 3 |
| 3 | c | 4 |
| 4 | d | 9 |
Table: Link
| source | dest |
+---------------+
| 1 | 2 |
| 1 | 3 |
The link table sets up a source/dest relationship between rows in the issue table. Yes, I know this is normalized terribly, but I did not create this schema even though I now have to write queries against it :(.
What I want is results that look like this:
| name | value |
+--------------+
| a | 17 |
| d | 9 |
The values in the results should be the sum of the values in the issue table when you aggregate together a source with all its dests along with the name of the source.
Some notes
(1) A source->dest is a one->many relationship.
(2) The best answer will not have any hardcoded id's or names in the query (meaning, it will be generalized for all setups like this).
(3) This is in MySQL
Thank you and let me know if I should include any more information

It's fairly simple, but the sticking point is that A is not a destination of A, yet its own value must be included in the total. The robust solution would involve modifying the data to add a self-link:
Table: Link
| source | dest |
+---------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
Then a simple
SELECT a.name, SUM(d.value) FROM
Issue AS a
JOIN Link AS b ON a.id=b.source
JOIN Issue AS d ON b.dest=d.id
GROUP BY a.name;
would do it. If you can't modify the data,
SELECT a.name, SUM(d.value)+a.value FROM
Issue AS a
JOIN Link AS b ON a.id=b.source
JOIN Issue AS d ON b.dest=d.id
GROUP BY a.name,a.value;
MAY work, although the inner joins drop any issue that has no links at all (such as d).

SELECT S.name, S.value + SUM(D.value) as value
FROM Link AS L
LEFT JOIN Issue AS S ON L.source = S.id
LEFT JOIN Issue AS D ON L.dest = D.id
GROUP BY S.name

You could use a double join to find all linked rows, and add the sum to the value of the source row itself (the second join has to go back to Issue, and coalesce keeps rows without links from turning into NULL):
select src.name, src.value + coalesce(sum(dest.value), 0)
from Issue src
left join Link l
on l.source = src.id
left join Issue dest
on dest.id = l.dest
group by src.name, src.value

This one should return the SUM of both source and dests, and only return items which are source.
SELECT s.name, COALESCE( SUM(d.value), 0 ) + s.value value
FROM Issue s
LEFT JOIN Link l ON ( l.source = s.id )
LEFT JOIN Issue d ON ( d.id = l.dest )
WHERE s.id NOT IN ( SELECT dest FROM Link )
GROUP BY s.name, s.value
ORDER BY s.name;
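To check the last query against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only standing in for MySQL here (an assumption for the sake of a runnable example); the table contents are the ones from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Issue (id INTEGER PRIMARY KEY, name TEXT, value INTEGER);
    CREATE TABLE Link (source INTEGER, dest INTEGER);
    INSERT INTO Issue VALUES (1, 'a', 10), (2, 'b', 3), (3, 'c', 4), (4, 'd', 9);
    INSERT INTO Link VALUES (1, 2), (1, 3);
""")

# Sum each source's own value plus the values of its dests.
# Rows that appear as a dest are excluded, so only sources
# (and unlinked issues such as d) are returned.
query = """
    SELECT s.name, COALESCE(SUM(d.value), 0) + s.value AS value
    FROM Issue s
    LEFT JOIN Link l ON l.source = s.id
    LEFT JOIN Issue d ON d.id = l.dest
    WHERE s.id NOT IN (SELECT dest FROM Link)
    GROUP BY s.name, s.value
    ORDER BY s.name
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('a', 17), ('d', 9)]
```

The COALESCE matters for d: it has no links, so SUM(d.value) is NULL and the addition would otherwise return NULL instead of 9.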

Related

Multiple select from CTE with different number of rows in a StoredProcedure

How can I run two SELECTs with joins against the same CTEs and get back two separate result sets, each with its own columns?
I tried doing union but that appends to the same list and there is no way to differentiate for further use.
WITH campus AS
(SELECT DISTINCT CampusName, DistrictName
FROM dbo.file
),creditAcceptance AS
(SELECT CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal, COUNT(id) AS N
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%') AND (CollegeCreditEarnedFinal = 'Yes') AND (CollegeCreditAcceptedFinal = 'Yes')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
),eligibility AS
(SELECT CampusName, EligibilityStatusFinal, COUNT(id) AS N, CollegeCreditAcceptedFinal
FROM dbo.file
WHERE (EligibilityStatusFinal LIKE 'Eligible%')
GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal
)
SELECT a.CampusName, c.[EligibilityStatusFinal], SUM(c.N) AS creditacceptCount
FROM campus as a FULL OUTER JOIN creditAcceptance as c ON a.CampusName=c.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName ,c.EligibilityStatusFinal
Union ALL
SELECT a.CampusName , b.[EligibilityStatusFinal], SUM(b.N) AS eligible
From Campus as a FULL OUTER JOIN eligibility as b ON a.CampusName = b.CampusName
WHERE (a.DistrictName = 'xy')
group by a.CampusName,b.EligibilityStatusFinal
Expected output:
+------------+------------------------+--------------------+
| CampusName | EligibilityStatusFinal | creditacceptCount |
+------------+------------------------+--------------------+
| M | G | 1 |
| E | NULL | NULL |
| A | G | 4 |
| B | G | 8 |
+------------+------------------------+--------------------+
+------------+------------------------+----------+
| CampusName | EligibilityStatusFinal | eligible |
+------------+------------------------+----------+
| A | G | 8 |
| C | G | 9 |
| A | T | 9 |
+------------+------------------------+----------+
CTEs can be used in a single statement only, so you can't get the expected output with CTEs alone.
Here is an excerpt from Microsoft docs:
A CTE must be followed by a single SELECT, INSERT, UPDATE, or DELETE
statement that references some or all the CTE columns. A CTE can also
be specified in a CREATE VIEW statement as part of the defining SELECT
statement of the view.
You can use table variables (declare @campus table(...)) or temp tables (create table #campus (...)) instead.
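The temp table idea can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 module instead of SQL Server, the sample rows are invented, and the `campus_counts` temp table name is hypothetical. The original query's campus join, district filter and CollegeCreditEarnedFinal condition are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, made-up stand-in for dbo.file.
    CREATE TABLE file (
        id INTEGER PRIMARY KEY,
        CampusName TEXT,
        EligibilityStatusFinal TEXT,
        CollegeCreditAcceptedFinal TEXT
    );
    INSERT INTO file (CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal) VALUES
        ('A', 'Eligible G', 'Yes'),
        ('A', 'Eligible G', 'No'),
        ('B', 'Eligible G', 'Yes'),
        ('E', 'Not eligible', 'No');

    -- Materialize the shared aggregation once, instead of in a CTE.
    CREATE TEMP TABLE campus_counts AS
        SELECT CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal,
               COUNT(id) AS N
        FROM file
        WHERE EligibilityStatusFinal LIKE 'Eligible%'
        GROUP BY CampusName, EligibilityStatusFinal, CollegeCreditAcceptedFinal;
""")

# First result set: only rows where credit was accepted.
accepted = conn.execute("""
    SELECT CampusName, EligibilityStatusFinal, SUM(N) AS creditacceptCount
    FROM campus_counts
    WHERE CollegeCreditAcceptedFinal = 'Yes'
    GROUP BY CampusName, EligibilityStatusFinal
    ORDER BY CampusName
""").fetchall()

# Second result set: all eligible rows, read from the same temp table.
eligible = conn.execute("""
    SELECT CampusName, EligibilityStatusFinal, SUM(N) AS eligible
    FROM campus_counts
    GROUP BY CampusName, EligibilityStatusFinal
    ORDER BY CampusName
""").fetchall()

print(accepted)  # [('A', 'Eligible G', 1), ('B', 'Eligible G', 1)]
print(eligible)  # [('A', 'Eligible G', 2), ('B', 'Eligible G', 1)]
```

Unlike a CTE, the temp table outlives the statement that created it, so any number of later SELECTs in the same session can read from it and return their own result sets.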

Create a flag in a left join

I am trying to create a new column (a sort of identifier flag) for the "NULL" rows resulting from the following left join:
with CTE (...) as (
... unrelated code
) select * from CTE
left join (select columnID from table1) Pu
on CTE.columnID = Pu.columnID
left join (select case when bz.column2 is null then 'null test is working' else columnID2, column2 end FROM table2) Bz
ON CTE.columnID2 = Bz.columnID2
This code works properly when I don't try to use a 'case when'. You can ignore the first left join.
My goal is to test the left join result while doing it, and act depending on the result:
If the left join gives a null row: a flag is created for that row.
If the left join gives a normal row: the left join is done normally, and the flag column is empty (as I suspect it can't be nonexistent).
I'd be glad if you could give me a hand!
EDIT : tables example:
CTE
| columnID | columnID2 | InformationsCTE |
| ab | mp | randominfo1 |
| ac | ma | randominfo2 |
| ae | me | randominfo3 |
| ad | mb | randominfo4 |
table2
| columnID2 | InformationsTable2 |
| mp | randominfo5 |
| ma | randominfo6 |
| me | randominfo7 |
Result after the second left join :
new CTE
| columnID | columnID2 | InformationsCTE | InformationsTable2| FLAG |
| ab | mp | randominfo1 | randominfo5 | OK |
| ac | ma | randominfo2 | randominfo6 | OK |
| ae | me | randominfo3 | randominfo7 | OK |
| ad | mb | randominfo4 | NULL | NOK |
Just use
T-SQL:
SELECT ISNULL(Column_to_check,'flag') FROM SomeTable
PL/SQL:
SELECT NVL(Column_to_check,'flag') FROM SomeTable
Also use NVL2 as below if you want to return other value from the Column_to_check:
NVL2(Column_to_check, value_if_NOT_null, value_if_null )
Would it not be more practical to SELECT this column, wrap it in the ISNULL function, and just use a straightforward LEFT JOIN? I feel like you're over-complicating it a bit.
Something like:
with CTE (...) as (
... unrelated code
)
SELECT CTE.*, NVL(bz.InformationsTable2, 'TEST OK')
FROM CTE
LEFT JOIN table2 Bz ON CTE.columnID2 = Bz.columnID2
EDIT: Based on your example table, if you join on the ID and then use NVL on the other column, it should work for you.
Here is an example I prepared for a previous question: SQL Fiddle
The example was built in MySQL, so beware of syntax differences, but logically it works the same way.
Why the joins? It seems you only want to look up data in other table, for which you'd use EXISTS or IN:
with cte (...) as (
... unrelated code
)
select
cte.*,
case when columnid in (select columnid from table1) then 'okay' else 'fail' end as test1,
case when columnid2 in (select columnid2 from table2) then 'okay' else 'fail' end as test2
from cte;
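For the OK/NOK flag shown in the question's expected result, a CASE expression over the joined key is another option. The sketch below runs it with Python's sqlite3 module (an assumption, chosen only so the example is runnable); the `cte_rows` table is a hypothetical stand-in for the asker's CTE, filled with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- cte_rows stands in for the CTE from the question.
    CREATE TABLE cte_rows (columnID TEXT, columnID2 TEXT, InformationsCTE TEXT);
    CREATE TABLE table2 (columnID2 TEXT, InformationsTable2 TEXT);
    INSERT INTO cte_rows VALUES
        ('ab', 'mp', 'randominfo1'),
        ('ac', 'ma', 'randominfo2'),
        ('ae', 'me', 'randominfo3'),
        ('ad', 'mb', 'randominfo4');
    INSERT INTO table2 VALUES
        ('mp', 'randominfo5'),
        ('ma', 'randominfo6'),
        ('me', 'randominfo7');
""")

# The join key on the right side is NULL exactly when no row matched,
# so testing it yields the flag.
rows = conn.execute("""
    SELECT c.columnID, c.columnID2, c.InformationsCTE, Bz.InformationsTable2,
           CASE WHEN Bz.columnID2 IS NULL THEN 'NOK' ELSE 'OK' END AS FLAG
    FROM cte_rows c
    LEFT JOIN table2 Bz ON c.columnID2 = Bz.columnID2
    ORDER BY c.rowid
""").fetchall()

for row in rows:
    print(row)
# ('ab', 'mp', 'randominfo1', 'randominfo5', 'OK')
# ('ac', 'ma', 'randominfo2', 'randominfo6', 'OK')
# ('ae', 'me', 'randominfo3', 'randominfo7', 'OK')
# ('ad', 'mb', 'randominfo4', None, 'NOK')
```

Testing the join key rather than InformationsTable2 avoids a false 'NOK' when a matching row exists but its data column happens to be NULL.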

T-SQL Select Join 3 Tables

I'm currently working on a select query in T-SQL on SQL Server 2012. It's a complex query, I want to query a list from 3 tables. The result should look something like this:
Desired Output:
ProjectId | Title | Manager | Contact | StatusId
----------+-------------+-----------+-----------+-----------
1 | projectX | 1123 | 4453 | 1
2 | projectY | 2245 | 5567 | 1
3 | projectZ | 3335 | 8899 | 1
My 3 Tables:
1) Project: ProjectId, ProjectDataId, MemberVersionId
2) ProjectData: ProjectDataId, Title, StatusId
3) Members: MemberId, MemberVersionId, MemberTypeId, EmployeeId
The tricky part is implementing versioning. Over time the project Members can change, and it should always be possible to return to a previous version; that's why I use MemberVersionId as a foreign key between Project and Members. The tables Project and ProjectData are linked by ProjectDataId.
Hence, 1 Project has 1 ProjectData and 1 Project has N Members.
Some sample data:
Project
ProjectId | ProjectDataId | MemberVersionId |
----------+---------------+-----------------+
1 | 2 | 1 |
2 | 3 | 1 |
3 | 4 | 1 |
ProjectData
ProjectDataId | Title | StatusId
--------------+-------------+-----------
2 | projectX | 1
3 | projectY | 1
4 | projectZ | 1
Members: MemberTypeId 1 = Manager, MemberTypeId 2 = Contact, 3 = Other
MemberId | MemberVersionId | MemberTypeId | EmployeeId |
---------+-----------------+--------------+------------+
1 | 1 | 1 | 1123 |
2 | 1 | 2 | 4453 |
3 | 1 | 3 | 9999 |
4 | 2 | 1 | 2245 |
5 | 2 | 2 | 5567 |
6 | 2 | 3 | 9999 |
7 | 3 | 1 | 3335 |
8 | 3 | 2 | 8899 |
9 | 3 | 3 | 9999 |
My current query looks like this:
SELECT ProjectId, Title, EmployeeId AS Manager, EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a,
[MySchema].[ProjectData] b,
[MySchema].[Members] c
WHERE a.ProjectDataId = b.ProjectDataId
AND a.MemberVersionId = c.MemberVersionId
Unfortunately this doesn't work yet. Do you know how to solve this issue?
Thanks
Something like this?
SELECT
p.ProjectId,
pd.Title,
mm.EmployeeId AS Manager,
mc.EmployeeId AS Contact,
pd.StatusId
FROM
[MySchema].[Project] p
INNER JOIN [MySchema].[ProjectData] pd ON pd.ProjectDataId = p.ProjectDataId
INNER JOIN [MySchema].[Members] mm ON mm.MemberVersionId = p.MemberVersionId AND mm.MemberTypeId = 1
INNER JOIN [MySchema].[Members] mc ON mc.MemberVersionId = p.MemberVersionId AND mc.MemberTypeId = 2;
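Here is a small runnable sketch of that double join on Members, using Python's sqlite3 module in place of SQL Server (an assumption; the schema prefix is dropped). Note one more assumption: the sample Project rows in the question all carry MemberVersionId 1, which would give every project the same manager, so this sketch assigns versions 1, 2 and 3 to match the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Project (ProjectId INTEGER, ProjectDataId INTEGER, MemberVersionId INTEGER);
    CREATE TABLE ProjectData (ProjectDataId INTEGER, Title TEXT, StatusId INTEGER);
    CREATE TABLE Members (MemberId INTEGER, MemberVersionId INTEGER,
                          MemberTypeId INTEGER, EmployeeId INTEGER);

    -- Assumption: each project points at its own member version (1, 2, 3).
    INSERT INTO Project VALUES (1, 2, 1), (2, 3, 2), (3, 4, 3);
    INSERT INTO ProjectData VALUES (2, 'projectX', 1), (3, 'projectY', 1), (4, 'projectZ', 1);
    INSERT INTO Members VALUES
        (1, 1, 1, 1123), (2, 1, 2, 4453), (3, 1, 3, 9999),
        (4, 2, 1, 2245), (5, 2, 2, 5567), (6, 2, 3, 9999),
        (7, 3, 1, 3335), (8, 3, 2, 8899), (9, 3, 3, 9999);
""")

# Members is joined twice: once filtered to managers (type 1),
# once filtered to contacts (type 2).
rows = conn.execute("""
    SELECT p.ProjectId, pd.Title,
           mm.EmployeeId AS Manager,
           mc.EmployeeId AS Contact,
           pd.StatusId
    FROM Project p
    INNER JOIN ProjectData pd ON pd.ProjectDataId = p.ProjectDataId
    INNER JOIN Members mm ON mm.MemberVersionId = p.MemberVersionId AND mm.MemberTypeId = 1
    INNER JOIN Members mc ON mc.MemberVersionId = p.MemberVersionId AND mc.MemberTypeId = 2
    ORDER BY p.ProjectId
""").fetchall()

for row in rows:
    print(row)
# (1, 'projectX', 1123, 4453, 1)
# (2, 'projectY', 2245, 5567, 1)
# (3, 'projectZ', 3335, 8899, 1)
```

With INNER JOIN, a project lacking either a manager or a contact disappears from the result; switching both Members joins to LEFT JOIN would keep it with NULL in the missing column.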
You can try this:
SELECT ProjectId, Title, C.EmployeeId AS Manager, d.EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a
INNER JOIN [MySchema].[ProjectData] b ON A.ProjectDataId=B.ProjectDataId
LEFT JOIN (SELECT * FROM [MySchema].[Members] WHERE MemberTypeID=1) c ON a.MemberVersionId=c.MemberVersionId
LEFT JOIN (SELECT * FROM [MySchema].[Members] WHERE MemberTypeID=2) d ON a.MemberVersionId=d.MemberVersionId
You must select members two times, one for the manager and another for the contact:
SELECT ProjectId, Title, m.EmployeeId AS Manager, c.EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a,
[MySchema].[ProjectData] b,
[MySchema].[Members] m,
[MySchema].[Members] c
WHERE a.ProjectDataId = b.ProjectDataId
AND a.MemberVersionId = m.MemberVersionId and m.MemberTypeId = 1
AND a.MemberVersionId = c.MemberVersionId and c.MemberTypeId = 2
try this,
SELECT ProjectId, Title, cmanager.EmployeeId AS Manager, ccon.EmployeeId AS
Contact, StatusId
from [MySchema].[ProjectData] b
inner join [MySchema].[Project] a on b.ProjectDataId=a.ProjectDataId
left join [MySchema].[Members] cmanager on cmanager.MemberVersionId =
a.MemberVersionId and cmanager.MemberTypeId=1
left join [MySchema].[Members] ccon on ccon.MemberVersionId =
a.MemberVersionId and ccon.MemberTypeId=2
The simplest solution to your problem would be to introduce an additional field in the Project table. You could either call it LatestMemberVersion (int, holds the currently highest MemberVersionId), which would be the most up-to-date version of the relationship, or you could add an even simpler IsLatestMemberVersion (bit, holds 1 if the record is the latest/active one). You can compute both of them using a ROW_NUMBER() OVER clause.
Then, the query would change to:
SELECT ProjectId, Title, EmployeeId AS Manager, EmployeeId AS Contact, StatusId
FROM [MySchema].[Project] a
INNER JOIN [MySchema].[ProjectData] b ON a.ProjectDataId = b.ProjectDataId
INNER JOIN [MySchema].[Members] c ON a.MemberVersionId = c.MemberVersionId
WHERE
a.[IsLatestMemberVersion] = 1 -- alternative is a.[LatestMemberVersion] = a.[MemberVersionId]
Additionally, there are two more things you can try:
You might want to borrow ideas from data warehousing, namely a combination of Slowly Changing Dimension Type 1 and Type 2.
You can try SQL Server features such as Change Tracking or Change Data Capture. I have no experience with them, though, so it's possible this leads nowhere.
And one last piece of advice, if you can, never write join conditions into the WHERE clause. It is not readable and can lead to problems when you suddenly change JOIN to LEFT JOIN. Microsoft itself recommends using ON instead of WHERE when applicable.

Left Joining table with values in lookup table

I have two tables on SQL-Server. One containing clients, and one a client profile lookup table. So a bit like this (note that Fred doesn't have any values in the lookup table):
Table: Clients Table: Profile
ID | Name | Status ClientID | Type | Value
----------------------- -----------------------
1 | John | Current 1 | x | 1
2 | Peter | Past 1 | y | 2
3 | Fred | Current 2 | x | 3
2 | y | 4
I then am trying to create a tmp table that needs to contain all current clients like this:
ID | Name | TypeY
==================
1 | John | 2
3 | Fred |
My knowledge of SQL is limited, but I think I should be able to do this with a Left Join, so I tried this (#tmpClient is already created):
insert into #tmpClient
select a.ID, a.Name, b.Value
from Clients a
left join Profile b
on a.ID = b.ClientID
where a.Status = 'Current' and b.Type = 'y'
However this will always miss Fred out of the temporary table. I am probably doing something very simple wrong, but as I said I am missing the SQL skills to work this one out. Please can someone help me with getting this query right.
You have to move the predicate concerning the second table of the LEFT JOIN operation from WHERE to ON clause:
insert into #tmpClient
select a.ID, a.Name, b.Value
from Clients a
left join Profile b
on a.ID = b.ClientID and b.Type = 'y'
where a.Status = 'Current'
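To see the difference between the two placements of the predicate, here is a short sketch with Python's sqlite3 module standing in for SQL Server (an assumption made only so it can be run as-is), using the sample rows from the question and plain SELECTs instead of the insert into #tmpClient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clients (ID INTEGER, Name TEXT, Status TEXT);
    CREATE TABLE Profile (ClientID INTEGER, Type TEXT, Value INTEGER);
    INSERT INTO Clients VALUES (1, 'John', 'Current'), (2, 'Peter', 'Past'), (3, 'Fred', 'Current');
    INSERT INTO Profile VALUES (1, 'x', 1), (1, 'y', 2), (2, 'x', 3), (2, 'y', 4);
""")

# Predicate on Profile in WHERE: Fred's unmatched row has b.Type = NULL,
# so the filter removes it and the LEFT JOIN behaves like an INNER JOIN.
in_where = conn.execute("""
    SELECT a.ID, a.Name, b.Value
    FROM Clients a
    LEFT JOIN Profile b ON a.ID = b.ClientID
    WHERE a.Status = 'Current' AND b.Type = 'y'
    ORDER BY a.ID
""").fetchall()

# Predicate on Profile in ON: it only limits which Profile rows can match,
# so clients without a 'y' row are kept with a NULL Value.
in_on = conn.execute("""
    SELECT a.ID, a.Name, b.Value
    FROM Clients a
    LEFT JOIN Profile b ON a.ID = b.ClientID AND b.Type = 'y'
    WHERE a.Status = 'Current'
    ORDER BY a.ID
""").fetchall()

print(in_where)  # [(1, 'John', 2)]
print(in_on)     # [(1, 'John', 2), (3, 'Fred', None)]
```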

MySQL - Join tables, retrieve only Max ID

I've seen solutions for something similar on other posts, but I've been having an issue applying it to my specific problem.
Here is my initial join:
SELECT service_note_task, comment_id, comment FROM service_note_task LEFT JOIN service_note_task_comments ON service_note_task.service_note_task_id = service_note_task_comments.service_note_task_id;
Which results in:
+-----------------------------+------------+--------------+
| service_note_task | comment_id | comment |
+-----------------------------+------------+--------------+
| This is service note task 3 | 25 | Comment |
| This is service note task 3 | 26 | Comment Blah |
| This is service note task 3 | 36 | aaa |
| This is service note task 2 | 13 | Awesome comm |
| This is service note task 1 | 12 | Cool Comm |
+-----------------------------+------------+--------------+
But for each service_note_task, I really only need one row representing the comment with the highest comment_id, like this:
+-----------------------------+------------+--------------+
| service_note_task | comment_id | comment |
+-----------------------------+------------+--------------+
| This is service note task 3 | 36 | aaa |
| This is service note task 2 | 13 | Awesome comm |
| This is service note task 1 | 12 | Cool Comm |
+-----------------------------+------------+--------------+
I figure I could use MAX in a sub-select statement to narrow down the results as I want them. How can I incorporate that into my statement to get these results?
For reference, this is known as "groupwise-maximum"
http://dev.mysql.com/doc/refman/5.0/en/example-maximum-column-group-row.html
Since you haven't mentioned the RDBMS you are using, the query below should work on most RDBMSs (not all):
SELECT a.*, b.* -- select only the columns you want.
FROM service_note_task a
INNER JOIN service_note_task_comments b
ON a.service_note_task_id = b.service_note_task_id
INNER JOIN
(
SELECT service_note_task_id, MAX(comment_id) max_ID
FROM service_note_task_comments
GROUP BY service_note_task_id
) c ON b.service_note_task_id = c.service_note_task_id AND
b.comment_id = c.max_ID
If your RDBMS supports analytic (window) functions, you can use this instead:
SELECT a.service_note_task, b.comment_id, b.comment
FROM service_note_task a
INNER JOIN
(
SELECT service_note_task_id, comment_id, comment,
ROW_NUMBER() OVER (PARTITION BY service_note_task_id
ORDER BY comment_id DESC) rn
FROM service_note_task_comments
) b ON a.service_note_task_id = b.service_note_task_id AND
b.rn = 1
try:
SELECT service_note_task, comment_id, comment
FROM service_note_task SNT1
LEFT JOIN service_note_task_comments SNTC1 ON SNT1.service_note_task_id = SNTC1.service_note_task_id
WHERE SNTC1.comment_id = (SELECT MAX(SNTC2.comment_id) FROM service_note_task_comments SNTC2 WHERE SNTC2.service_note_task_id = SNT1.service_note_task_id);
SELECT service_note_task, comment_id, comment
FROM service_note_task s LEFT JOIN service_note_task_comments sc
ON s.service_note_task_id = sc.service_note_task_id
WHERE EXISTS (
SELECT 1
FROM service_note_task_comments s2
WHERE s.service_note_task_id = s2.service_note_task_id
HAVING MAX(s2.comment_id) = sc.comment_id
)
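The join-on-MAX variant of the groupwise maximum can be tried out with the sketch below. It uses Python's sqlite3 module rather than MySQL, and the service_note_task_id values 1 to 3 are an assumption, since the question only shows the joined output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service_note_task (service_note_task_id INTEGER, service_note_task TEXT);
    CREATE TABLE service_note_task_comments (comment_id INTEGER,
                                             service_note_task_id INTEGER,
                                             comment TEXT);
    -- Task ids are assumed; the comment rows come from the question.
    INSERT INTO service_note_task VALUES
        (1, 'This is service note task 1'),
        (2, 'This is service note task 2'),
        (3, 'This is service note task 3');
    INSERT INTO service_note_task_comments VALUES
        (25, 3, 'Comment'), (26, 3, 'Comment Blah'), (36, 3, 'aaa'),
        (13, 2, 'Awesome comm'), (12, 1, 'Cool Comm');
""")

# The derived table m holds the highest comment_id per task;
# joining back on it keeps only that one comment row per task.
rows = conn.execute("""
    SELECT t.service_note_task, c.comment_id, c.comment
    FROM service_note_task t
    INNER JOIN service_note_task_comments c
        ON c.service_note_task_id = t.service_note_task_id
    INNER JOIN (
        SELECT service_note_task_id, MAX(comment_id) AS max_id
        FROM service_note_task_comments
        GROUP BY service_note_task_id
    ) m ON m.service_note_task_id = c.service_note_task_id
       AND m.max_id = c.comment_id
    ORDER BY c.comment_id DESC
""").fetchall()

for row in rows:
    print(row)
# ('This is service note task 3', 36, 'aaa')
# ('This is service note task 2', 13, 'Awesome comm')
# ('This is service note task 1', 12, 'Cool Comm')
```

Because both joins are inner joins, a task with no comments at all is dropped; the original query used LEFT JOIN, so if such tasks should still appear, both joins would need to become LEFT JOINs with the MAX condition kept in the ON clause.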