Retrieving column value in table2 via same ID in table1 - sql

I have this SQL query that returns overdue assignments
SELECT DUE_DATE,
SUBJECT,
ASSIGNMENT,
STUDENT_NAME,
TEACHER_NAME
FROM(SELECT DISTINCT
a.due_date AS due_date,
a.subject AS subject,
a.assignment AS assignment,
a.student_name AS student_name,
a.student_id AS student_id,
a.teacher_name AS teacher_name,
a.teacher_id AS teacher_id
FROM DB.ASSIGNMENT a,
DB.ALL b,
WHERE (trunc(a.DATE_CREATED) >= trunc(db.utc_sysdate)))
WHERE((trunc(due_date) < trunc(db.utc_sysdate));
I want to include both the teacher and student emails as additional columns in this query. How can I use their IDs in table ASSIGNMENT to look up their respective emails in table ALL within the existing query?

We lack some information, but wouldn't your query look like this?
select distinct
a.due_date,
a.subject,
a.assignment,
a.student_name,
a.student_email,
a.teacher_name,
a.teacher_email
from db.assignment a join db.all b
on trunc(a.date_created) >= trunc(b.utc_sysdate)
and trunc(a.due_date) < trunc(b.utc_sysdate);
How does it differ from your query?
Your query is invalid:
there is a stray comma after db.all b
the final where clause references db. as if it were an alias (although, judging by the inline view's from clause, it is probably a schema name)
There's no point in aliasing a column with exactly the same name; what's the difference between a.due_date as due_date and a.due_date itself? None. So don't do it, you're just causing confusion.
As you want to include the student's and teacher's e-mail addresses, why don't you just do that? Add those columns to the query.
It seems that you don't need an inline view; put both where conditions into the same query and remove the columns you don't need (both IDs).
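If the e-mail addresses are stored only in the second table, as the question suggests, one option is to join that table twice: once on the student ID and once on the teacher ID. Below is a minimal sketch of that idea, run through Python's sqlite3 module so it is self-contained. The table person is a hypothetical stand-in for DB.ALL, its id and email column names are assumptions, and a fixed cutoff date replaces utc_sysdate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-ins for DB.ASSIGNMENT and DB.ALL.
# The id/email column names of the lookup table are assumptions.
conn.executescript("""
    CREATE TABLE assignment (
        due_date TEXT, subject TEXT, assignment TEXT,
        student_name TEXT, student_id INTEGER,
        teacher_name TEXT, teacher_id INTEGER
    );
    CREATE TABLE person (id INTEGER PRIMARY KEY, email TEXT);

    INSERT INTO person VALUES (1, 'sam@example.com'), (2, 'tina@example.com');
    INSERT INTO assignment VALUES
        ('2020-01-10', 'Math', 'Essay 1', 'Sam', 1, 'Tina', 2);
""")

# The lookup table is joined twice under two aliases:
# s resolves the student's e-mail, t resolves the teacher's e-mail.
rows = conn.execute("""
    SELECT a.due_date, a.subject, a.assignment,
           a.student_name, s.email AS student_email,
           a.teacher_name, t.email AS teacher_email
    FROM assignment a
    JOIN person s ON s.id = a.student_id
    JOIN person t ON t.id = a.teacher_id
    WHERE a.due_date < '2020-02-01'
""").fetchall()

print(rows)
```

Each alias behaves like an independent copy of the lookup table, which is why the same table can supply two different e-mail columns in one row.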

Related

how do I make a good subquery in SQL in this code

I need a bit of help with an SQL statement, pretty much a beginner so just go easy on me.
The program wants me to output every student who studied at a school for less than 7 years.
Select schoolid, characterid, firstname, lastname, count(year) as num
from schoolhouse natural join student natural join character
group by schoolid, characterid, firstname, lastname
So far, so good, with this code I can already see a relation with the counted years but I can't make a where statement which includes the "num" count from the select statement.
WHERE clauses are applied to the individual rows after the tables are joined, but before they are grouped/aggregated. HAVING clauses are used to assert conditions on the results of the aggregation. Just add HAVING count(year) < 7
Select schoolid, characterid, firstname, lastname, count(year) as num
from schoolhouse natural join student natural join character
group by schoolid, characterid, firstname, lastname
having count(year) < 7
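To see the HAVING filter in action, here is a small sketch using Python's sqlite3 module. The attendance table and its rows are made up for illustration and are not the asker's schema; each row stands for one year a student spent at the school.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up data: Ann attended for 3 years, Bob for 7.
conn.executescript("""
    CREATE TABLE attendance (student TEXT, school_year INTEGER);
    INSERT INTO attendance VALUES
        ('Ann', 1), ('Ann', 2), ('Ann', 3),
        ('Bob', 1), ('Bob', 2), ('Bob', 3), ('Bob', 4),
        ('Bob', 5), ('Bob', 6), ('Bob', 7);
""")

# HAVING is evaluated after grouping, so it can test the aggregate.
short_stays = conn.execute("""
    SELECT student, COUNT(school_year) AS num
    FROM attendance
    GROUP BY student
    HAVING COUNT(school_year) < 7
""").fetchall()

print(short_stays)  # [('Ann', 3)]
```

Bob is dropped because his group has exactly 7 rows, which fails the < 7 test.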
But also always qualify which table your columns come from. In this query it's not clear if the year column is from the schoolhouse table, the student table or the character table.
It should look more like...
SELECT TABLE.schoolid, TABLE.characterid, TABLE.firstname, TABLE.lastname, COUNT(TABLE.year) AS num
FROM schoolhouse NATURAL JOIN student NATURAL JOIN character
GROUP BY TABLE.schoolid, TABLE.characterid, TABLE.firstname, TABLE.lastname
HAVING COUNT(TABLE.year) < 7
(Replacing each occurrence of TABLE with the correct table name for each case.)
Finally, using words such as character and year as column or table names is usually frowned upon. Such words often appear as keywords in SQL and can cause errors or ambiguity. In general, if something even might appear as an SQL keyword, don't use it as a column or table name.

Retrieving duplicate and original rows from a table using sql query

Say I have a student table with the following fields: student id, student name, age, gender, marks, class. Assume that due to some error, there are multiple entries corresponding to each student. My requirement is to identify the duplicate rows in the table, and the filter criterion is the student name and the class. But in the query result, in addition to identifying the duplicate records, I also need to find the original student record that got duplicated. Is there any method to do this? I went through this answer: SQL: How to find duplicates based on two fields?. But it only specifies how to find the duplicate rows, not a means to identify the actual row that was duplicated. Kindly throw some light on a possible solution. Thanks.
First of all: if the columns you've listed are all in the same table, it looks like your database structure could use some normalization.
In terms of your question: I'm assuming your StudentID field is a database generated, primary key and so has not been duplicated. (If this is not the case, I think you have bigger problems than just duplicates).
I'm also assuming the duplicate row has a higher value for StudentID than the original row.
I think the following should work (Note: I haven't created a table to verify this so it might not be perfect straight away. If it doesn't it should be fairly close)
select dup.StudentID as DuplicateStudentID,
dup.StudentName, dup.Age, dup.Gender, dup.Marks, dup.Class,
orig.StudentID as OriginalStudentId
from StudentTable dup
inner join (
-- Find first student record for each unique combination
select Min(StudentId) as StudentID, StudentName, Age, Gender, Marks, Class
from StudentTable t
group by StudentName, Age, Gender, Marks, Class
) orig on dup.StudentName = orig.StudentName
and dup.Age = orig.Age
and dup.Gender = orig.Gender
and dup.Marks = orig.Marks
and dup.Class = orig.Class
and dup.StudentID > orig.StudentID -- Don't identify the original record as a duplicate
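The same self-join idea can be narrowed to the two columns the question names as the duplicate criterion (student name and class). This is only a sketch with invented rows, run through Python's sqlite3 module; the table and column names are assumptions, and it keeps the assumption above that the original row has the lowest ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented sample data: ('Ann', '5A') appears three times, ('Bob', '5A') twice.
conn.executescript("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        student_name TEXT,
        class TEXT
    );
    INSERT INTO student VALUES
        (1, 'Ann', '5A'), (2, 'Bob', '5A'), (3, 'Ann', '5A'),
        (4, 'Ann', '6B'), (5, 'Bob', '5A'), (6, 'Ann', '5A');
""")

# The derived table picks the lowest ID per (name, class) as the original.
# Every other row with the same name and class is reported as its duplicate.
pairs = conn.execute("""
    SELECT dup.student_id AS duplicate_id,
           orig.student_id AS original_id,
           dup.student_name, dup.class
    FROM student dup
    JOIN (
        SELECT MIN(student_id) AS student_id, student_name, class
        FROM student
        GROUP BY student_name, class
    ) orig
      ON dup.student_name = orig.student_name
     AND dup.class = orig.class
     AND dup.student_id > orig.student_id
    ORDER BY dup.student_id
""").fetchall()

print(pairs)  # [(3, 1, 'Ann', '5A'), (5, 2, 'Bob', '5A'), (6, 1, 'Ann', '5A')]
```

Row 4 does not show up because ('Ann', '6B') occurs only once.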

How does GROUP BY use COUNT(*)

I have this query which finds the number of properties handled by each staff member along with their branch number:
SELECT s.branchNo, s.staffNo, COUNT(*) AS myCount
FROM Staff s, PropertyForRent p
WHERE s.staffNo=p.staffNo
GROUP BY s.branchNo, s.staffNo
The two relations are:
Staff{staffNo, fName, lName, position, sex, DOB, salary, branchNO}
PropertyForRent{propertyNo, street, city, postcode, type, rooms, rent, ownerNo, staffNo, branchNo}
How does SQL know what COUNT(*) is referring to? Why does it count the number of properties and not (say for example), the number of staff per branch?
This is a bit long for a comment.
COUNT(*) is counting the number of rows in each group. It is not specifically counting any particular column. Instead, the join produces one row per matching property, so the properties are what cause multiple rows for given values of s.branchNo and s.staffNo.
It gets even a little more "confusing" if you include a column name. The following would all typically return the same value:
COUNT(*)
COUNT(s.branchNo)
COUNT(s.staffNo)
COUNT(p.propertyNo)
With a column name, COUNT() determines the number of rows that do not have a NULL value in the column.
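A quick way to check this is to put a NULL into one column and compare the counts. The sketch below uses Python's sqlite3 module with cut-down versions of the two tables; the rows, and the NULL rent for property P2, are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Cut-down tables with invented rows; P2 deliberately has no rent.
conn.executescript("""
    CREATE TABLE staff (staffNo TEXT, branchNo TEXT);
    CREATE TABLE property (propertyNo TEXT, staffNo TEXT, rent INTEGER);
    INSERT INTO staff VALUES ('S1', 'B1'), ('S2', 'B1');
    INSERT INTO property VALUES
        ('P1', 'S1', 400), ('P2', 'S1', NULL), ('P3', 'S2', 350);
""")

# Columns 3 to 5 count rows per group; the last one skips the NULL rent.
counts = conn.execute("""
    SELECT s.branchNo, s.staffNo,
           COUNT(*), COUNT(s.staffNo), COUNT(p.propertyNo), COUNT(p.rent)
    FROM staff s
    JOIN property p ON s.staffNo = p.staffNo
    GROUP BY s.branchNo, s.staffNo
    ORDER BY s.staffNo
""").fetchall()

print(counts)  # [('B1', 'S1', 2, 2, 2, 1), ('B1', 'S2', 1, 1, 1, 1)]
```

For S1 the first three counts agree at 2, while COUNT(p.rent) is 1 because one of the two joined rows has a NULL rent.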
And finally, you should learn to use proper, explicit join syntax in your queries. Put join conditions in the on clause, not the where clause:
SELECT s.branchNo, s.staffNo, COUNT(*) AS myCount
FROM Staff s JOIN
PropertyForRent p
ON s.staffNo = p.staffNO
GROUP BY s.branchNo, s.staffNo;
GROUP BY clauses partition your result set. These partitions are all the sql engine needs to know - it simply counts their sizes.
Try your query with only count(*) in the select part.
In particular, COUNT(*) does not produce the number of distinct rows/columns in your result set!
Some people might think that COUNT(*) really counts all the columns; however, the SQL optimizer is smarter than that.
COUNT(*) returns the number of rows in the specified table without eliminating duplicates, which means that you can't use DISTINCT with COUNT(*).
COUNT(*) will return the cardinality (number of elements in the table) of the specified mapping.
What you have to remember is that COUNT over a specific column skips rows where that column is NULL, while COUNT(*) counts every row, whether or not any of its fields are NULL.
How does SQL know what COUNT(*) is referring to?
I'm pretty sure, though not 100% sure as I can't find it in the docs, that the SQL optimizer simply does a count on the primary key (which is not null) instead of trying to handle nulls in rows.

I'm not sure what is the purpose of "group by" here

I'm struggling to understand what this query is doing:
SELECT branch_name, count(distinct customer_name)
FROM depositor, account
WHERE depositor.account_number = account.account_number
GROUP BY branch_name
What's the need of GROUP BY?
You must use GROUP BY in order to use an aggregate function like COUNT in this manner (using an aggregate function to aggregate data corresponding to one or more values within the table).
The query essentially selects distinct branch_names using that column as the grouping column, then within the group it counts the distinct customer_names.
You couldn't use COUNT to get the number of distinct customer_names per branch_name without the GROUP BY clause (at least not with a simple query specification - you can use other means, joins, subqueries etc...).
It's giving you the total number of distinct customers for each branch; GROUP BY defines the groups over which the COUNT function is computed.
It could be written also as:
SELECT branch_name, count(distinct customer_name)
FROM depositor INNER JOIN account
ON depositor.account_number = account.account_number
GROUP BY branch_name
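To make the effect of the grouping visible, the sketch below runs the query on a few invented rows using Python's sqlite3 module, and compares it with the same count taken without GROUP BY. The sample customers and branches are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Invented data: Hayes holds two accounts at the same branch.
conn.executescript("""
    CREATE TABLE account (account_number TEXT, branch_name TEXT);
    CREATE TABLE depositor (customer_name TEXT, account_number TEXT);
    INSERT INTO account VALUES
        ('A1', 'Downtown'), ('A2', 'Downtown'), ('A3', 'Uptown');
    INSERT INTO depositor VALUES
        ('Hayes', 'A1'), ('Hayes', 'A2'), ('Jones', 'A2'), ('Smith', 'A3');
""")

# With GROUP BY: one row per branch, each with its own distinct-customer count.
per_branch = conn.execute("""
    SELECT branch_name, COUNT(DISTINCT customer_name)
    FROM depositor
    JOIN account ON depositor.account_number = account.account_number
    GROUP BY branch_name
    ORDER BY branch_name
""").fetchall()

# Without GROUP BY (and without branch_name): a single count over all rows.
overall = conn.execute("""
    SELECT COUNT(DISTINCT customer_name)
    FROM depositor
    JOIN account ON depositor.account_number = account.account_number
""").fetchone()[0]

print(per_branch)  # [('Downtown', 2), ('Uptown', 1)]
print(overall)     # 3
```

Hayes is counted once for Downtown despite holding two accounts there, which is the job of DISTINCT; splitting the count by branch is the job of GROUP BY.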
Let's take a step away from SQL for a moment and look at the relational training language Tutorial D.
Because the two relations (tables) are joined on the common attribute (column) name account_number, we can use a natural join:
depositor JOIN account
(Because the result is a relation, which by definition has only distinct tuples (rows), we don't need a DISTINCT keyword.)
Now we just need to aggregate using SUMMARIZE..BY:
SUMMARIZE (depositor JOIN account)
BY { branch_name }
ADD ( COUNT ( customer_name ) AS customer_tally )
Back in SQLland, the GROUP BY branch_name is doing the same as SUMMARIZE..BY { branch_name }. Because SQL has a very rigid structure, the branch_name column must be repeated in the SELECT clause.
If you want to COUNT something per value of another column (see the SELECT part of the statement), you have to use GROUP BY in order to tell the query what to aggregate over. The GROUP BY statement is used in conjunction with the aggregate functions to group the result set by one or more columns.
Neglecting it will lead to SQL errors in most RDBMS, or senseless results in others.
Useful link:
http://www.w3schools.com/sql/sql_groupby.asp

Can't get unique values

I'm using MySQL to get unique values from a database.
My query looks as follows, but somehow I'm not able to get unique results
SELECT
DISTINCT(company_name),
sp_id
FROM
Student_Training
ORDER BY
company_name
sp_id is the primary key, and company_name is the company's name, which needs to be unique in the result.
The data looks as follows:
sp_id, company_name
1 comp1
2 comp2
3 comp2
4 comp3
The query just isn't returning unique company names.
DISTINCT works globally, on all the columns you SELECT. Here you're getting distinct pairs of (sp_id, company_name) values, but the individual values of each column may show duplicates.
That being said, it's extremely deceiving that MySQL authorizes the syntax SELECT DISTINCT(company_name), sp_id when it really means SELECT DISTINCT(company_name, sp_id). You don't need the parentheses at all, by the way.
Edit
Actually there's a reason why DISTINCT(company_name), sp_id is valid syntax: adding parentheses around an expression is always legal although it can be overkill: company_name is the same as (company_name) or even (((company_name))). Hence what that piece of SQL means is really: “DISTINCT [company_name in parentheses], [sp_id]”. The parentheses are attached to the column name, not the DISTINCT keyword, which, unlike aggregate function names, for example, does not need parentheses (AVG sp_id is not legal even if unambiguous for a human reader, it's always AVG(sp_id).)
For that matter, you could write SELECT DISTINCT company_name, (sp_id) or SELECT DISTINCT (company_name), (sp_id), it's exactly the same as the plain syntax without parentheses. Putting the list of columns inside parentheses – (company_name, sp_id) – is not legal SQL syntax, though, you can only SELECT “plain” lists of columns, unparenthesized (the form's spell-checker tells me this last expression is not an English word but I don't care. It's Friday afternoon after all).
Therefore, any database engine should accept this confusing syntax :-(
DISTINCT will make unique rows, meaning the unique combination of the field values in your query.
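The following query will return a list of all the unique company_name values, each together with one sp_id. Note that MySQL only accepts it when the ONLY_FULL_GROUP_BY mode is disabled, and the sp_id it returns for each group is then not guaranteed to be the first match.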
SELECT sp_id, company_name
FROM Student_Training
GROUP BY company_name
And as Arthur Reutenauer has suggested, it is indeed pretty deceiving that MySQL allows the DISTINCT(fieldname) syntax when it actually means DISTINCT(field1, field2, ..., fieldn)
Which id do you want when a single company_name has two or more ids?
This:
SELECT DISTINCT company_name
FROM Student_Training
will select only the company_name.
This:
SELECT company_name, MIN(sp_id)
FROM Student_Training
GROUP BY
company_name
will select minimal id for each company name.
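For comparison, here are the three variants run against the sample data from the question. This is a sketch using Python's sqlite3 module rather than MySQL, purely to keep it self-contained; the queries themselves are plain standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample rows copied from the question.
conn.executescript("""
    CREATE TABLE Student_Training (sp_id INTEGER PRIMARY KEY, company_name TEXT);
    INSERT INTO Student_Training VALUES
        (1, 'comp1'), (2, 'comp2'), (3, 'comp2'), (4, 'comp3');
""")

# DISTINCT applies to the whole (company_name, sp_id) pair, so nothing is removed.
pairs = conn.execute("""
    SELECT DISTINCT company_name, sp_id
    FROM Student_Training
    ORDER BY company_name, sp_id
""").fetchall()

# DISTINCT on the name alone gives one row per company.
names = conn.execute("""
    SELECT DISTINCT company_name
    FROM Student_Training
    ORDER BY company_name
""").fetchall()

# GROUP BY with MIN states explicitly which id to keep per company.
first_ids = conn.execute("""
    SELECT company_name, MIN(sp_id)
    FROM Student_Training
    GROUP BY company_name
    ORDER BY company_name
""").fetchall()

print(len(pairs))  # 4
print(names)       # [('comp1',), ('comp2',), ('comp3',)]
print(first_ids)   # [('comp1', 1), ('comp2', 2), ('comp3', 4)]
```

The first query still returns comp2 twice because the pairs (comp2, 2) and (comp2, 3) are different rows.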
DISTINCT does not work the way you think it does. It gives you a distinct record. Stop and think about how it could return what you expect it to return. If you had a table as follows
id name
1 joe
2 joe
3 james
If it only returned distinct names, which id would it return for joe?
You may want
SELECT company_name, Min(sp_id) FROM Student_Training GROUP BY company_name
or perhaps (as above) just
SELECT DISTINCT company_name from student_training