Using data from tables in a group by

Using data from tables in a group by - sql

names:
id, first, last
879 Scotty Anderson
549 Melvin Anderson
554 Freddy Appleton
321 Grace Appleton
112 Milton Appleton
189 Jackson Black
99 Elizabeth Black
298 Jordan Frey
parents:
id, student_id
549 879
321 554
112 554
99 189
298 189
Expected Output
(without the 'Student:' / 'Parent:')
Student: Anderson, Scotty
Parent: Anderson, Melvin
Student: Appleton, Freddy
Parent: Appleton, Grace
Parent: Appleton, Milton
Student: Black, Jackson
Parent: Black, Elizabeth
Parent: Frey, Jordan
Using the data above, how can I achieve the expected output?
I currently use SQL similar to this to get a list of current students and names.
select b.last, b.first
from term a, names b
where a.id = b.id(+)
order by b.last
Which returns:
Anderson, Scotty
Appleton, Freddy
Black, Jackson
My question is how to take the parents table and add to this query so it has this output:
Anderson, Scotty
Anderson, Melvin
Appleton, Freddy
Appleton, Grace
Appleton, Milton
Black, Jackson
Black, Elizabeth
Frey, Jordan

The idea in a query like this is to break the data down into something that helps you solve the problem, and then put it back together as needed. In this case I'm going to make use of common table expressions, which allows me to treat queries as tables and then recombine them handily.
Looking at the desired results it looks like we want to have the students appear first, followed by their mothers (ladies first :-), and then their fathers. So, OK, let's figure out how to extract the needed data. We can get the students and their associated data pretty simply:
select distinct p.student_id as student_id,
n.first,
n.last,
0 as type
from parents p
inner join names n
on n.id = p.student_id
The type column, with its constant value of zero, is just used to identify that this is a student. (You'll see why in a minute).
Now, getting the mother's is a bit more difficult because we don't have any gender information to use. However, we'll use what we have, which is names. We know that names like Melvin, Milton, and Jordan are "guy" names. (Yes, I know Jordan can be a girls name too. My daughter has a male coach named Jordan, and a female teammate named Jordan. Just go with it - for purposes of argument in this case Jordan is a guys name, 'K? 'K :-). So we'll use that information to help us identify the mom's:
select p.student_id, n.first, n.last, 1 as type
from parents p
inner join names n
on n.id = p.id
where first not in ('Melvin', 'Milton', 'Jordan')
Notice here that we assign the value of 1 to the type column for mothers.
Similarly, we'll find the dads:
select p.student_id, n.first, n.last, 2 as type
from parents p
inner join names n
on n.id = p.id
where first in ('Melvin', 'Milton', 'Jordan')
And here we assign a value of 2 for the type.
OK - given the above we just need to combine the data properly. We don't want to use a JOIN, however, because we want the names to get spit out one after the other from the query - and the way we do THAT in SQL is with the UNION or UNION ALL operator. (Generally, you're going to want to use UNION ALL, because UNION will check the result set to ensure there are no duplicates - which in the case of a large result set takes, oh, more or less FOREVER!). And so, the final query looks like:
with all_students as (select distinct p.student_id as student_id,
n.first,
n.last,
0 as type
from parents p
inner join names n
on n.id = p.student_id),
all_mothers as (select p.student_id, n.first, n.last, 1 as type
from parents p
inner join names n
on n.id = p.id
where first not in ('Melvin', 'Milton', 'Jordan')),
all_fathers as (select p.student_id, n.first, n.last, 2 as type
from parents p
inner join names n
on n.id = p.id
where first in ('Melvin', 'Milton', 'Jordan'))
select last || ', ' || first as name from
(select * from all_students
union all
select * from all_mothers
union all
select * from all_fathers)
order by student_id desc, type;
We just take the student data, followed by the mom data, followed by the dad data, then sort it by the student ID from highest to lowest (I just looked at the desired results to figure out that this should be a descending sort), and then by the type (which results in the student (type=0) being first, following by their mother (type=1) and then their father (type=2)).
SQLFiddle here
Share and enjoy.

generic SQL, mmmmm I'd like there to be A generic SQL :)
First off you want to stop using the antique (+) join syntax that is exclusive to Oracle
select b.last, b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
That is way more generic! (nb: You can abbreviate to just LEFT JOIN)
Now to concatenate (Last Name comma space First Name) there are options some not generic
SQL Server/MySQL and others supporting CONCAT()
select CONCAT(b.last , ', ', b.first)
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
not all versions of Oracle or SQL Server support CONCAT()
Oracle's concat() only takes 2 parameters; grrrrr
ORACLE
select b.last || ', ' || b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
In this form Oracle generally handles data type conversions automatically (I think, please check on date/timestamps maybe others)
TSQL (Sybase, MS SQL Server)
select b.last + ', ' + b.first
from term a
LEFT OUTER JOIN names b ON a.id = b.id
order by b.last
In this form you must explicitly cast/convert data types to n|var|char for concatenation if not already a string type
For your list of concatenated names:
You need in addition to the last name a method to retain the family group together, plus distinguish between student and parent. As you want just one column of names this indicates you need a column of id's that point to the last and first names. So making some assumptions about the table TERM my guess is you list the students from that, then append the parents that relate to that group of students, and finally to output the required list in the required order.
select
case when type = 1 then 'Student' else 'Parent' end as who
, names.last || ', ' || names.first as Name
from (
select
STUDENT_ID as name_id
, STUDENT_ID as family_id
, 1 as TYPE
from term
union all
select
PARENTS.ID as name_id
, PARENTS.STUDENT_ID as family_id
, 2 as TYPE
from PARENTS
inner join term on PARENTS.STUDENT_ID = term.STUDENT_ID
) sq
inner join NAMES ON sq.name_id = NAMES.ID
order by
names.last
, sq.family_id
, sq.type
see: http://sqlfiddle.com/#!4/01804/6

This is too long for a comment.
Your question doesn't make sense. The easy answer to the question is:
select last, first
from names;
But it seems unlikely that is what you want.
Your sample query mentions a table term. That is not mentioned elsewhere in the question. Please clarify the question or delete this one and ask another.

I think I see what you're trying to do. I think you could set up a derived table and then query it. Set up something like: case when student id= id then 1 else 0 as match or whatever. Then query your derived table and group by match.

I would do it like that in SQL:
Select last +', '+ first as fullname from names;

Related

Query to select a row from a column that matches "x but not y" where y is everything else thats not x

I'm trying to write a query that will select rows that match only what I'm looking for. If the row has other stuff, then I don't want it. The column is a varchar field and the values in the column are a comma delimited string.
So here is the dilemma:
The table has a recipe column and an ingredients column. Like this:
Muffin | "salt"
Cake | "salt,pepper"
Pie | "salt,pepper,butter"
In my query I want to find all of the recipes that contain ANY COMBINATION of salt and/or pepper but nothing else.
If I write the query like this:
select recipe
from mytable
where ingredients like "%pepper%" and/or ingredients like "%salt%"
I want the Muffin and the Cake be returned but not the Pie (because it has additional ingredients that are not specifically listed in the search criteria). How do I write the exclusion?
I'm using SQL server 2008

You've already received comments encouraging you to consider a different design for your schema and the rationale for this so I'll only focus on a suggestion for your schema here.
You may consider using REPLACE to determine if the column of ingredients will be empty or whether this recipe has no other ingredients. The LIKE was used to determine whether the recipe had the desired ingredients.
Approach 1
SELECT
recipe,
ingredients
FROM mytable
WHERE (
CONCAT(',',ingredients,',') LIKE '%,salt,%' OR
CONCAT(',',ingredients,',') LIKE '%,pepper,%'
) AND
REPLACE(REPLACE(REPLACE(ingredients,'salt',','),'pepper',','),',','')=''
View working demo online
Approach 2
Ingredients that are only desired are placed in a subquery and filtered using the left join. The having clause is then used to determine whether the list of ingredients only has these ingredients .
SELECT recipe
FROM (
SELECT
recipe,
ingredients,
desired
FROM
mytable m
LEFT JOIN (
SELECT 'salt' as desired UNION ALL
SELECT 'pepper'
) d ON CONCAT(',',ingredients,',') LIKE CONCAT('%,',d.desired,',%')
) t
GROUP BY
recipe
HAVING
LEN(
MAX(
REPLACE(ingredients,',','')
)
) <= SUM(LEN(desired))
View working demo online

select recipe
from T cross apply string_split(ingredients, ',')
group by recipe
having count(case when value in (<my list>) then 1 end) > 0
and count(case when value not in (<my list>) then 1 end) = 0

You really should have 3 tables for this solution. You are suffering from the X Y Problem
Your solution for example should look something like this:
product
product_id
name
1
Muffin
2
Cake
3
Pie
ingredient
ingredient_id
name
1
Salt
2
Pepper
3
Butter
ingredient_to_product
product_id
ingredient_id
1
1
2
1
2
2
3
1
3
2
3
3
Then you can simply write your query based on your positive ingredient list and not worry about what they DON'T HAVE: According to your original statement: "in my query I want to find all of the recipes that contain ANY COMBINATION of salt and/or pepper but nothing else." That can be accomplished using the IN operator
SELECT a.name FROM product a
LEFT JOIN ingredient_to_product b
ON a.product_id = b.product_id
LEFT JOIN ingredient c
ON b.ingredient_id = c.ingredient_id
WHERE c.name IN ('salt','pepper');
And conversely you can exclude with NOT IN --
WHERE c.name IN ('salt','pepper') AND c.name NOT IN ('milk', 'butter');
OR to include salt and pepper -- But exclude every other possibility .. Use a nested SELECT to exclude everything BUT salt, and pepper
WHERE c.name IN ('salt','pepper')
AND c.name NOT IN (
SELECT c.name FROM product a
LEFT JOIN ingredient_to_product b
ON a.product_id = b.product_id
LEFT JOIN ingredient c
ON b.ingredient_id = c.ingredient_id
WHERE c.name NOT IN ('salt','pepper')
GROUP BY c.name
)
GROUP BY a.name;

How to Limit Results Per Match on a Left Join - SQL Server

I have a table with student info [STU] and a table with parent info [PAR]. I want to return an email address for each student, but just one. So I run this query:
SELECT [STU].[ID], [PAR].[EM]
FROM (SELECT [STU].* FROM DB1.STU)
STU LEFT JOIN (SELECT [PAR].* FROM DB1.PAR) PAR ON [STU].[ID] = [PAR].[ID]
This gives me the below table:
Student ID ParentEmail
1 jim#email.com
1 sarah#email.com
2 paul#email.com
2 tim#email.com
3 bill#email.com
3 frank#email.com
3 joyce#email.com
4 greg#email.com
5 tony#email.com
5 sam#email.com
Each student has multiple parent emails, but I only want one. In other words, I want the output to look like this:
Student ID ParentEmail
1 jim#email.com
2 paul#email.com
3 frank#email.com
4 greg#email.com
5 sam#email.com
I've tried so many things. I've tried using GROUP BY and MIN/MAX and I've tried complex CASE statements, and I've tried COALESCE but I just can't seem to figure it out.

I think OUTER APPLY is the simplest method:
SELECT [STU].[ID], [PAR].[EM]
FROM DB1.STU OUTER APPLY
(SELECT TOP (1) [PAR].*
FROM DB1.PAR
WHERE [STU].[ID] = [PAR].[ID]
) PAR;
Normally, there would be an ORDER BY in the subquery, to give you control over which email you want -- the longest, shortest, oldest, or whatever. Without an ORDER BY it returns just one email, which is what you are asking for.

If you just want one column from the parent table, a simple approach is a correlated subquery:
select
s.id student_id,
(select max(p.em) from db1.par p where p.id = s.id) parent_email
from db1.stu s
This gives you the greatest parent email per student.

Postgresql join tables to several columns

I have a list of students' name called table Names and I want to find their categories from another table called Categories as below:
Class_A Class_B Class_C Class_D Category
Sam Adam High
Sarah Medium
James High
Emma Simon Nick Low
My solution is to do a left join but students name from first table should be matching with one of four columns so I am not sure how to write queries. At the moment my query is just matching to Class_A while I need to check all categories and if the student's name exist, return category.
(Note: some rows have more than one student's name)
SELECT Names.name, Categories.Category
FROM Names
LEFT JOIN Categories ON Names.name = Categories.Class_A;
Table Names looks like this:
Name
----
Emma
Nick
James
Adam
Jack
Sarah
And I am expecting an output as below:
Name Category
---- ----
Emma Low
Nick Low
James High
Adam High
Jack -
Sarah Medium

I would be inclined to unpivot the first table. This looks like:
select n.name, c.category
from name n left join
(categories c cross join lateral
(values (c.class_a), (c.class_b), (c.class_c), (c.class_d)
) v(name)
)
on n.name = v.name
where v.name is not null;
Although you can also solve this using in (or or) in the on clause, that may produce a much less efficient execution plan.

Try this using OR in on clause:
SELECT Names.name, coalesce(Categories.Category,'-') as category
FROM Names
LEFT JOIN Categories ON Names.name = Categories.Class_A or Names.name = Categories.Class_B or Names.name = Categories.Class_C or Names.name = Categories.Class_D

Will this left join on same table ever return data?

In SQL Server, on a re-engineering project, I'm walking through some old sprocs, and I've come across this bit. I've hopefully captured the essence in this example:
Example Table
SELECT * FROM People
Id | Name
-------------------------
1 | Bob Slydell
2 | Jim Halpert
3 | Pamela Landy
4 | Bob Wiley
5 | Jim Hawkins
Example Query
SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL
Please disregard formatting, style, and query efficiency issues here. This example is merely an attempt to capture the exact essence of the real query I'm working with.
After looking over the real, more complex version of the query, I burned it down to this above, and I cannot for the life of me see how it would ever return any data. The LEFT JOIN should always exclude everything that was just selected because of the b.Name IS NULL check, right? (and it being the same table). If a row from People was found where b.Name IS NULL evals to true, then shouldn't that mean that data found in People a was never found? (impossible?)
Just to be very clear, I'm not looking for a "solution". The code is what it is. I'm merely trying to understand its behavior for the purpose of re-engineering it.
If this code indeed never returns results, then I'll conclude it was written incorrectly and use that knowledge during the re-engineering.
If there is a valid data scenario where it would/could return results, then that will be news to me and I'll have to go back to the books on SQL Joins! #DrivenCrazy

Yes. There are circumstances where this query will retrieve rows.
The query
SELECT a.*
FROM (
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.PName = b.PName
WHERE b.PName IS NULL;
is roughly (maybe even exactly) equivalent to...
select distinct Id, PName
from People
where Id > 3 and PName is null;
Why?
Tested it using this code (mysql).
create table People (Id int, PName varchar(50));
insert into People (Id, Pname)
values (1, 'Bob Slydell'),
(2, 'Jim Halpert'),
(3,'Pamela Landy'),
(4,'Bob Wiley'),
(5,'Jim Hawkins');
insert into People (Id, PName) values (6,null);
Now run the query. You get
6, Null
I don't know if your schema allows null Name.
What value can P.Name have such that a.PName = b.PName finds no match and b.PName is Null?
Well it's written right there. b.PName is Null.
Can we prove that there is no other case where a row is returned?
Suppose there is a value for (Id,PName) such that PName is not null and a row is returned.
In order to satisfy the condition...
where b.PName is null
...such a value must include a PName that does not match any PName in the People table.
All a.PName and all b.PName values are drawn from People.PName ...
So a.PName may not match itself.
The only scalar value in SQL that does not equal itself is Null.
Therefore if there are no rows with Null PName this query will not return a row.
That's my proposed casual proof.
This is very confusing code. So #DrivenCrazy is appropriate.

The meaning of the query is exactly "return people with id > 3 and a null as name", i.e. it may return data but only if there are null-values in the name:
SELECT DISTINCT Id, PName
FROM People
WHERE Id > 3 and PName is null
The proof for this is rather simple, if we consider the meaning of the left join condition ... LEFT JOIN People b ON a.PName = b.PName together with the (overall) condition where p.pname is null:
Generally, a condition where PName = PName is true if and only if PName is not null, and it has exactly the same meaning as where PName is not null. Hence, the left join will match only tuples where pname is not null, but any matching row will subsequently be filtered out by the overall condition where pname is null.
Hence, the left join cannot introduce any new rows in the query, and it cannot reduce the set of rows of the left hand side (as a left join never does). So the left join is superfluous, and the only effective condition is where PName is null.

LEFT JOIN ON returns the rows that INNER JOIN ON returns plus unmatched rows of the left table extended by NULL for the right table columns. If the ON condition does not allow a matched row to have NULL in some column (like b.NAME here being equal to something) then the only NULLs in that column in the result are from unmatched left hand rows. So keeping rows with NULL for that column as the result gives exactly the rows unmatched by the INNER JOIN ON. (This is an idiom. In some cases it can also be expressed via NOT IN or EXCEPT.)
In your case the left table has distinct People rows with a.Id > 3 and the right table has all People rows. So the only a rows unmatched in a.Name = b.Name are those where a.Name IS NULL. So the WHERE returns those rows extended by NULLs.
SELECT * FROM
(SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL) a
LEFT JOIN People b ON 1=0;
But then you SELECT a.*. So the entire query is just
SELECT DISTINCT * FROM People WHERE Id > 3 AND Name IS NULL;

sure.left join will return data even if the join is done on the same table.
according to your query
"SELECT a.*
FROM (
SELECT DISTINCT Id, Name
FROM People
WHERE Id > 3
) a
LEFT JOIN People b
ON a.Name = b.Name
WHERE b.Name IS NULL"
it returns null because of the final filtering "b.Name IS NULL".without that filtering it will return 2 records with id > 3

How to fetch the non matching rows in Oracle

Can anyone help me fetch the non matching rows from two tables in Oracle?
Table: Names
Class_id Stud_name
S001 JAMES
S001 PETER
S002 MARK
Table: Course
Course_id Stud_name
S001 JAMES
S001 KEITH
S002 MARK
Output
I need the rows to display as
CLASS ID STUD_NAME_FROM_NAME_TABLE STUD_NAME_FROM_COURSE_TABLE
---------------------------------------------------------------------
S001 PETER KEITH
I have used Oracle joins to fetch the non matching names:
SELECT *
FROM Names, Course
WHERE Names.Class_id=Course.Course_id
AND Names.Stud_name<>Course.Stud_name
This query is returning duplicate rows.

If you insist on Join you can use this one:
SELECT *
FROM Names
FULL OUTER JOIN Course ON Names.Class_id=Course.Course_id
AND Names.Stud_name = Course.Stud_name
WHERE Names.Stud_name IS NULL or Course.Stud_name IS NULL

Fetches unmatched rows in Names table
SELECT * FROM Names
WHERE
NOT EXISTS
(SELECT 'x' from Course
WHERE
Names.Class_id = Course.Course_id AND
Names.Stud_name = Course.Stud_name)
Fetches unmatched rows in Names and Course too!
SELECT Names.Class_id,Names.Stud_name,C1.Stud_name
FROM Names , Course C1
WHERE Names.Class_id = C1.Course_id AND
NOT EXISTS
(SELECT 'x' from Course C2
WHERE
Names.Class_id = C2.Course_id AND
Names.Stud_name = C2.Stud_name);

When you ask for unmatching rows I assume that you want rows that exist in names but not in course.
If this is the case you're probably after
select * from names
where (class_id, stud_name ) not in
(select course_id, stud_name from course);
Your query returned duplicate rows beacuse for each row in names it selected all rows in course that satisfied the where condition.
So, for the row S001, PETER in names it faound that S001, JAMES and S001, KEITH matched that condition, thus, that row was "returned" twice.
EDIT Since it is not clear if stud_name is a primary key, or unique (and on second sight I think it's not), you'd probably want a
select * from names
where not exists (
select 1 from course where
names.class_id = course.course_id and
names.stud_name <> course.stud_name
)
Edit II if you insist on using a join (as per your comment) you might want to try a
select distinct names.* from...

Hope it helps you
with not_in_class as
(select a.*
from Names a
where not exists ( select 'x'
from course b
where b.Course_id = a.class_id
and a.Stud_name = b.Stud_name)),
not_in_course as
(select b.*
from course b
where not exists ( select 'x'
from Names a
where b.Course_id = a.class_id
and a.Stud_name = b.Stud_name))
select x.class_id,
x.Stud_name NOT_IN_CLASS,
y.stud_name NOT_IN_COURSE
from not_in_class x, not_in_course y
where x.class_id = y.course_id
Output
| CLASS_ID | NOT_IN_CLASS | NOT_IN_COURSE |
|----------|--------------|---------------|
| S001 | PETER | KEITH |
Only problem is that if multiple mismatches are there in both the tables for a given id, it works for single mismatch for a particular id. You need to rework if multiple mismatches are there for the same id.

Well, I am not sure if I understand correctly what you are asking. I think you want a list of all IDs where the student list in class table and course table differs. Then you want to show the id and the students that are in class but not in course and the students that are in course but not in class.
To do so you would full outer join the tables. That gives you students that are both in class and course, students that are in class and not in course, and students that are in course and not in class. Filter your results where either class_id or course_id is null then to get the students missing in course or class. At last group by id and list the students.
select coalesce(class.class_id, course.course_id) as id
, listagg(class.stud_name, ',') within group (order by class.stud_name) as missing_in_course
, listagg(course.stud_name, ',') within group (order by course.stud_name) as missing_in_class
from class
full outer join course
on (class.class_id = course.course_id and class.stud_name = course.stud_name)
where class.class_id is null or course.course_id is null
group by coalesce(class.class_id, course.course_id);
Here is the SQL fiddle showing how it works: http://sqlfiddle.com/#!4/8aaaa/2
EDIT: In Oracle 9i there is no listagg. You can use the inofficial function wm_concat instead:
select coalesce(class.class_id, course.course_id) as id
, wm_concat(class.stud_name) as missing_in_course
, wm_concat(course.stud_name) as missing_in_class
from class
full outer join course
on (class.class_id = course.course_id and class.stud_name = course.stud_name)
where class.class_id is null or course.course_id is null
group by coalesce(class.class_id, course.course_id);

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas