Relational algebra "grouping" - sql

Sorry for the vague question topic!
I've got a particular relational-algebra problem that has me and a couple of friends stumped.
Now, here's the question:
For each department, find the maximum salary of instructors in that
department. You may assume that every department has at least one
instructor.
I'll upload the schema as well, as a visual aid.
I've worked this out to a point;
I need a relation that includes all the instructors in any department, we've got that. It's the instructor relation.
I need to split that relation up on a per-department basis, and once I have that I just take max(salary) and return it.
Problem is, the only way I can think of to do that is something like this:
π_{max(salary)}(σ_{dept_name = x}(instructor))
Where x is whatever dept_name I'm looking for. But if I did it this way, I'd have to build a new relation for every department!
How would you do it?
(Note: I just copied and pasted the symbols from Wikipedia, in case you want to use them in your answer.)

My relational algebra might be a bit rusty but I think that
{dept_name}_G_{max(salary)}(
σ_{ddept_name = idept_name}(
ρ_{dept_name/ddept_name}(department) ⨯
ρ_{dept_name/idept_name}(instructor)
)
)
is what you are looking for.
Remember that all projections are just operations on sets. The first thing you would do to
connect the information of department and instructor is to bring the information together.
So you want to join department and instructor, basically a cross product (⨯):
department = {(depA, 100$), (depB, 200$)}
instructor = {(will, depA, 10$), (bob, depB, 20$), (will, depB, 9$)}
department ⨯ instructor = {
(depA, 100$, will, depA, 10$),
(depA, 100$, bob, depB, 20$),
...,
(depB, 200$, will, depA, 10$),
...
}
So what you would want now is to filter the tuples where the dept_name of the instructor equals the
dept_name of the department. But you also notice that you now have a naming collision,
namely the column dept_name comes up twice.
As you can't simply write σ_{dept_name = dept_name}(department ⨯ instructor), you need to rename at
least one of the dept_name fields. I renamed both to make clear which one belongs to which relation.
So what you now have is
σ_{ddept_name = idept_name}(
ρ_{dept_name/ddept_name}(department) ⨯
ρ_{dept_name/idept_name}(instructor)
)
giving you:
{
(depA, 100$, will, depA, 10$),
(depB, 200$, bob, depB, 20$),
(depB, 200$, will, depB, 9$)
}
The whole process is a natural join and can be written more concisely as:
department ⋈ instructor
Now the final step is to project the maximum salary per department. A simple projection can't do that
but the aggregation operator can:
{dept_name}_G_{max(salary)}(department ⋈ instructor)
results in
{
(depA, 10$),
(depB, 20$)
}
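The same steps (cross product, selection, aggregation) can be mimicked in plain Python on the toy relations. This is only a sketch: the dollar amounts are stored as integers, and the tuple layouts (dept_name, budget) and (name, dept_name, salary) are assumed from the example data.

```python
from itertools import product

# Toy relations from the example; amounts are plain integers here.
department = [("depA", 100), ("depB", 200)]  # (dept_name, budget)
instructor = [("will", "depA", 10), ("bob", "depB", 20), ("will", "depB", 9)]  # (name, dept_name, salary)

# Cross product followed by the selection ddept_name = idept_name.
joined = [d + i for d, i in product(department, instructor) if d[0] == i[1]]

# Aggregation: group by department name and keep the maximum salary.
max_salary = {}
for dept_name, _budget, _name, _idept_name, salary in joined:
    max_salary[dept_name] = max(salary, max_salary.get(dept_name, salary))

print(sorted(max_salary.items()))  # [('depA', 10), ('depB', 20)]
```

The dictionary plays the role of the grouping operator: one key per dept_name, one aggregated value per key.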


Cypher - Add multiple connections

I have 2 nodes:
Students and Subjects.
I want to be able to add multiple student names to multiple subjects at the same time using cypher query.
So far I have done it by iterating through the lists of student and subject names and executing the query for each pair, but is there a way to do the same in the query itself?
This is the query I use for adding 1 student to 1 subject:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = $studentId
AND c.id = $classroomId
AND u.name = $subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
So I want to be able to receive multiple subjectNames and studentIds at once to create these connections. Any guidance on creating multiple relationships in Cypher?
I think what you are looking for is UNWIND. If you have an array as parameter to your query:
studentList :
[
{ studentId: "sid1", classroomId: "cid1", subjectNames: ['s1','s2'] },
{ studentId: "sid2", classroomId: "cid2", subjectNames: ['s1','s3'] },
...
]
You can UNWIND that parameter in the beginning of your query:
UNWIND $studentList as student
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WHERE
s.id = student.studentId
AND c.id = student.classroomId
AND u.name IN student.subjectNames
AND NOT (s)-[:IN_SUBJECT]->(u)
CREATE (s)-[:IN_SUBJECT]->(u)
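To see which rows the query works through, here is a rough plain-Python picture of the expansion: one row per list entry from UNWIND, then one candidate per name in that entry's subjectNames. The data is hypothetical and no database is involved, so this shows only the candidate (student, classroom, subject) combinations, not whether matching nodes exist.

```python
# Hypothetical parameter, shaped like the studentList above.
student_list = [
    {"studentId": "sid1", "classroomId": "cid1", "subjectNames": ["s1", "s2"]},
    {"studentId": "sid2", "classroomId": "cid2", "subjectNames": ["s1", "s3"]},
]

# One candidate per (entry, subject name) combination.
candidates = [
    (entry["studentId"], entry["classroomId"], subject)
    for entry in student_list
    for subject in entry["subjectNames"]
]

for candidate in candidates:
    print(candidate)
```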
You probably need to use UNWIND.
I haven't tested the code, but something like this might work:
MATCH
(s:Student)-[:STUDENT_BELONGS_TO]->(c:Classroom),
(u:Subjects)-[:SUBJECTS_TAUGHT_IN]->(c:Classroom)
WITH
s AS student, COLLECT(u) AS subjects
UNWIND subjects AS subject
CREATE (student)-[:IN_SUBJECT]->(subject)

Return results from more than one database table in Django

Suppose I have 3 hypothetical models;
class State(models.Model):
    name = models.CharField(max_length=20)
class Company(models.Model):
    name = models.CharField(max_length=60)
    state = models.ForeignKey(State, on_delete=models.CASCADE)
class Person(models.Model):
    name = models.CharField(max_length=60)
    state = models.ForeignKey(State, on_delete=models.CASCADE)
I want to be able to return results in a Django app, where the results, if using SQL directly, would be based on a query such as this:
SELECT a.name as 'personName',b.name as 'companyName', b.state as 'State'
FROM Person a, Company b
WHERE a.state=b.state
I have tried using the select_related() method as suggested here, but I don't think this is quite what I am after, since I am trying to join two tables that have a common foreign-key, but have no key-relationships amongst themselves.
Any suggestions?
Since a Person can have multiple Companys in the same state, it is not a good idea to do the JOIN at the database level. That would mean that the database will (likely) return the same Company multiple times, making the output quite large.
We can prefetch the related companies, with:
qs = Person.objects.select_related('state').prefetch_related('state__company_set')
Then we can query the Companys in the same state with:
for person in qs:
    print(person.state.company_set.all())
You can use a Prefetch object [Django-doc] to prefetch the list of related companies into a custom attribute, for example:
from django.db.models import Prefetch
qs = Person.objects.prefetch_related(
    Prefetch('state__company_set', Company.objects.all(), to_attr='same_state_companies')
)
The list is stored on the related State object, so you can print the companies with:
for person in qs:
    print(person.state.same_state_companies)
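Conceptually, prefetching amounts to fetching the companies once, grouping them by state in memory, and letting every person look up the list for their state. A minimal sketch of that idea without Django, using hypothetical (name, state) rows in place of the real tables:

```python
from collections import defaultdict

# Hypothetical rows standing in for the Person and Company tables: (name, state).
people = [("Alice", "Ohio"), ("Bob", "Texas")]
companies = [("Acme", "Ohio"), ("Globex", "Ohio"), ("Initech", "Texas")]

# One pass over the companies builds the per-state lookup.
companies_by_state = defaultdict(list)
for company_name, state in companies:
    companies_by_state[state].append(company_name)

# Each person reads the shared list for their state; no company row is duplicated.
same_state_companies = {person: companies_by_state[state] for person, state in people}
print(same_state_companies)
```

This is why the result stays small: each company is held once per state rather than once per matching person.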

Relational Algebra check for error

Hi, could someone please verify my work? I'm not sure if I'm doing any of this correctly and would greatly appreciate any help. I am not allowed to use the bow tie (join) operator. Thank you.
Question:
Books (ISBN, Title, Authors, Publisher, Ed, Year, Genre)
Patron (MemberNumber, FirstName, LastName, AddressLn1, AddressLn2, City, State, Zipcode)
Loan (MemberNumber,ISBN,DateLoaned,DateDue, DateReturned)
Business Logic
• You may assume that the library only has one copy of each book.
• Each book may have many authors. If a particular book has multiple authors, they are listed as a comma separated string. You may assume that the same author always uses the same exact name and no two authors will have the same name.
• Year is stored as an integer.
• DateLoaned, DateDue, and DateReturned are stored as a date.
• When a book is initially lent out, DateReturned is set to be NULL, upon its return, the value is updated.
1.1. Find all books that were loaned out after 12/22/2012. Show the ISBN, Title, and DateDue.
1.2. Find all library patrons who have borrowed a book titled "Database Systems". Show their FirstName, LastName, and DateLoaned.
1.3. Find all books that were ever loaned out. Display the ISBN.
1.4. Find all books returned before 12/22/2012. Display the ISBN.
1.5. Find all books returned on or after 12/22/2012. Display the ISBN.
1.6. Find all books returned either (before 12/22/2012) or (on or after 12/22/2012) Display the ISBN.
1.7. In 1 sentence explain the difference between 1.3 and 1.6.
1.8. Find all patrons who have never borrowed a book.
1.9. Find all books with Genre "Mystery" that have NEVER been loaned out.
1.10. Create a new attribute ImportantDates. A date is important if it is in the Loan relation either as a DateLoaned or a DateDue. Display ImportantDates.
1.11. Find all library patrons who have borrowed a book with an author "James Stewart". You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
1.12. Find all library patrons who have never borrowed a book with an author "James Stewart". You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
1.13. Find all library patrons who have only borrowed a book with an author "James Stewart". If they have ever borrowed a book without the author "James Stewart" they should be excluded. You may use the expression LIKE "%James Stewart%" in your Relational Algebra. 
Answer:
1.1) πISBN,TITLE,DATEDUE(σDATELOANED > 12222012 AND BOOKS.ISBN = LOAN.ISBN(LOAN X BOOKS)
1.2) πFIRSTNAME,LASTNAME,DATELOANED(σTITLE = "DATABASE SYSTEMS" AND BOOKS.ISBN = LOAN.ISBN AND PATRON.MEMBERNUMBER = LOAN.MEMBERNUMBER(BOOKS X PATRON X LOAN))
1.3) πISBN(σDATELOANED <> "NULL" AND BOOK.ISBN = LOAN.ISBN(LOAN X BOOKS))
1.4) πISBN(σDATELOANED < 12222012 AND BOOK.ISBN = LOAN.ISBN(LOAN X BOOKS))
1.5) πISBN(σDATELOANED >= 12222012 AND BOOK.ISBN = LOAN.ISBN(LOAN X BOOKS))
1.6) πISBN(σDATELOANED >= 12222012 OR DATELOANED <12222012 AND BOOK.ISBN = LOAN.ISBN(LOAN X BOOKS))
1.7) 1.3 AND 1.6 are the same as they both find books that have been loaned.
1.8) σDATELOANED = "NULL" AND PATRON.MEMBERNUMBER = LOAN.MEMBERNUMBER(LOAN X PATRON)
1.9) σGENRE = "MYSTERY" AND BOOKS.ISBN = LOAN.ISBN AND DATELOANED = "NULL"
1.10) LOAN(DATELOANED,DATEDUE) -> IMPORTANTDATE
Could you please give me an example for any of 1.11, 1.12, or 1.13? I have no clue how to use the LIKE expression.
The professor told you how to do it:
You may use the expression LIKE "%James Stewart%" in your Relational Algebra.
It has been about a year since I have had to use relational algebra, but it will be whatever your pre-conditions are (this is a task for you) followed by the line:
LIKE "%James Stewart%"
The SQL statement would look something like this:
SELECT p.* FROM Patron p, Loan l, Books b
WHERE p.MemberNumber = l.MemberNumber AND l.ISBN = b.ISBN AND b.Authors LIKE '%James Stewart%'
You will find in your studies that relational algebra does not deal with SQL functions; it looks only at the purely mathematical side of things.
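For 1.11 to 1.13, one way to think about it is as selections plus set difference. The sketch below uses Python sets with hypothetical sample data; a substring test stands in for LIKE "%James Stewart%". It assumes that "only borrowed" in 1.13 means patrons who borrowed at least one such book and nothing else, which is one reading of the question.

```python
# Hypothetical sample data; attribute names follow the schema in the question.
books = {
    ("111", "Calculus", "James Stewart"),
    ("222", "Database Systems", "A. Silberschatz, H. Korth"),
}  # (ISBN, Title, Authors)
patrons = {1, 2, 3}  # MemberNumber only, for brevity
loans = {(1, "111"), (2, "111"), (2, "222")}  # (MemberNumber, ISBN)

# Selection on Authors, standing in for LIKE "%James Stewart%".
stewart_isbns = {isbn for isbn, _title, authors in books if "James Stewart" in authors}

# 1.11: patrons with at least one loan of such a book.
borrowed_stewart = {member for member, isbn in loans if isbn in stewart_isbns}

# 1.12: all patrons minus the 1.11 result (set difference).
never_stewart = patrons - borrowed_stewart

# 1.13: the 1.11 result minus patrons who borrowed any other book.
borrowed_other = {member for member, isbn in loans if isbn not in stewart_isbns}
only_stewart = borrowed_stewart - borrowed_other

print(sorted(borrowed_stewart), sorted(never_stewart), sorted(only_stewart))
```

The point is that "never" and "only" cannot be expressed with a negated selection alone; they need a difference against the set of patrons who did match.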

nHibernate collections and alias criteria

I have a simple test object model in which there are schools, and a school has a collection of students.
I would like to retrieve a school and all its students who are above a certain age.
I carry out the following query, which obtains a given school and the children which are above a certain age:
public School GetSchoolAndStudentsWithDOBAbove(int schoolid, DateTime dob)
{
    var school = this.Session.CreateCriteria(typeof(School))
        .CreateAlias("Students", "students")
        .Add(Expression.And(Expression.Eq("SchoolId", schoolid), Expression.Gt("students.DOB", dob)))
        .UniqueResult<School>();
    return school;
}
This all works fine and I can see the query going to the database and returning the expected number of rows.
However, when I carry out either of the following, it gives me the total number of students in the given school (regardless of the preceding request) by running another query:
foreach (Student st in s.Students)
{
    Console.WriteLine(st.FirstName);
}
Assert.AreEqual(s.Students.Count, 3);
Can anyone explain why?
You made your query on the School class and you restricted your results on it, not on the mapped related objects.
Now there are many ways to do this.
You can make a static filter as IanL said; however, it's not really flexible.
You can just iterate the collection as mxmissile suggested, but that is ugly and slow (especially with lazy loading).
I would provide 2 different solutions:
In the first, you keep the query you have and apply a dynamic filter to the collection (which stays lazy-loaded), at the cost of one more round-trip to the database:
var school = GetSchoolAndStudentsWithDOBAbove(5, dob);
IQuery qDob = nhSession.CreateFilter(school.Students, "where DOB > :dob").SetDateTime("dob", dob);
IList<Student> dobedSchoolStudents = qDob.List<Student>();
In the second solution just fetch both the school and the students in one shot:
IList result = nhSession.CreateQuery(
    @"select ss, st from School ss, Student st
      where ss.Id = st.School.Id and ss.Id = :schId and st.DOB > :dob")
    .SetInt32("schId", 5).SetDateTime("dob", dob).List();
Each element of the result is an object[] whose first entry is the School (ss) and whose second entry is one matching Student (st).
And this can definitely be done using the criteria query you use now (using Projections)
Unfortunately s.Students will not contain your "queried" results. You will have to create a separate query for Students to reach your goal.
foreach (var st in s.Students.Where(x => x.DOB > dob))
    Console.WriteLine(st.FirstName);
Warning: That will still make a second trip to the db depending on your mapping, and it will still retrieve all students.
I'm not sure but you could possibly use Projections to do all this in one query, but I am by no means an expert on that.
You do have the option of filtering data. If there is only a single instance of the query, mxmissile's option would be the better choice.
Nhibernate Filter Documentation
Filters do have their uses, but depending on the version you are using there can be issues where filtered collections are not cached correctly.

NHibernate Multiple Criteria Difficulties

I'm having difficulties with multiple joins in the NHibernate Criteria search. Say I had a pet table, and I wanted to return all pets where the pet category was Dog, and the owner gender was female, ordered by the pet birthday. I've tried a bunch of permutations for how I can get this but haven't been able to figure it out. My latest iteration is as follows:
var recentPets = session.CreateCriteria(typeof(Pet))
    .AddOrder(Order.Desc("PetBirthday"))
    .CreateCriteria("PetType", "pt", JoinType.InnerJoin)
    .CreateCriteria("PetOwnerId", "po", JoinType.InnerJoin)
    .Add(Expression.Eq("pt.PetTypeName", petType))
    .Add(Expression.Eq("po.PersonGender", gender))
    .List<Pet>();
Thanks so much for the help!
Is there a reason you are not using the Hibernate/Java persistence query language to perform the query?
select p from Pet p
join p.owner o
where o.gender = :gender
and p.type.name = :petType
order by p.birthday