How to write Relational Algebra - sql

Here is the relation: WORKS(emp_name, company_name, salary)
Q. Write a relational algebra expression to find the name of the company that has the highest number of employees.
I have tried to solve it in many ways but cannot find the correct one.

Here is a query which should work across most RDBMS:
SELECT company_name
FROM WORKS
GROUP BY company_name
HAVING COUNT(*) = (
    SELECT MAX(empCount)
    FROM (
        SELECT COUNT(*) AS empCount
        FROM WORKS
        GROUP BY company_name
    ) t
)
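Since the question asks for relational algebra rather than SQL: the basic operators cannot count, so an expression needs the grouping/aggregation operator from extended relational algebra (written γ here; some textbooks use a calligraphic G instead). The following is only a sketch of how the query above might be transcribed; the names C, M, n and m are introduced here for illustration and are not part of the original relation.

```latex
\begin{align*}
C &\leftarrow {}_{company\_name}\,\gamma_{\,\mathrm{count}(emp\_name)\,\to\,n}(\mathit{WORKS}) \\
M &\leftarrow \gamma_{\,\max(n)\,\to\,m}(C) \\
\mathit{Result} &\leftarrow \Pi_{company\_name}\bigl(C \bowtie_{\,n = m} M\bigr)
\end{align*}
```

C holds one tuple per company with its employee count, M holds the single largest count, and the join keeps every company whose count equals that maximum, so ties are all returned.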
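If you are using MySQL, or any database which has a LIMIT keyword or something like it (such as TOP in SQL Server), then the query gets easier, although it returns only one company even when several are tied for the highest count: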
SELECT company_name, COUNT(*) AS empCount
FROM WORKS
GROUP BY company_name
ORDER BY empCount DESC
LIMIT 1
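To see how the two queries behave when companies tie, here is a minimal sketch that runs both against an in-memory SQLite database through Python's sqlite3 module. The sample rows and the helper name top_companies are made up for illustration; only the WORKS schema comes from the question.

```python
import sqlite3


def top_companies(rows):
    """Return (companies tied for the highest headcount, rows from the LIMIT 1 query)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE WORKS (emp_name TEXT, company_name TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO WORKS VALUES (?, ?, ?)", rows)

    # Tie-safe form: keep every group whose count equals the maximum count.
    tied_rows = conn.execute("""
        SELECT company_name
        FROM WORKS
        GROUP BY company_name
        HAVING COUNT(*) = (SELECT MAX(empCount)
                           FROM (SELECT COUNT(*) AS empCount
                                 FROM WORKS
                                 GROUP BY company_name) t)
        ORDER BY company_name
    """).fetchall()

    # LIMIT form: always exactly one row, so one of the tied companies is dropped.
    limited_rows = conn.execute("""
        SELECT company_name, COUNT(*) AS empCount
        FROM WORKS
        GROUP BY company_name
        ORDER BY empCount DESC
        LIMIT 1
    """).fetchall()
    conn.close()
    return [r[0] for r in tied_rows], limited_rows


# Hypothetical sample data: Acme and Globex both have two employees.
sample = [
    ("Ann", "Acme", 100), ("Bob", "Acme", 120),
    ("Cy", "Globex", 90), ("Di", "Globex", 95),
    ("Ed", "Initech", 80),
]
tied, limited = top_companies(sample)
print(tied)     # every company sharing the highest count
print(limited)  # a single (company_name, empCount) row
```

Which of the tied companies the LIMIT 1 query returns is not guaranteed, so the first form is the safer one when ties matter.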

Related

SQL What do I need to group by?

I am trying to get a better understanding of group by and count in SQL and tried to find the student who has been studying for the second longest time.
I also need to group by s.semester for it to work; grouping by s.name alone (which is what I had done initially) does not work. Why is this? I know this is right, but I am trying to understand why, for future practice questions.
select s.name
from students s
group by s.name, s.semester
having 1 = (select count (gold.name)
from students gold
where gold.name <> s.name
and gold.semester > s.semester
)
Thanks in advance!
To solve this task, grouping alone is not enough. You need to use a ranking (window) function.
It assigns each student a rank based on the number of semesters.
You can then pick students by rank with a "where" clause on the outer query.
select *
from (
    select
        -- rank students by semester count, longest first
        row_number() over (order by semestr_count desc) r_n,
        name_students,
        semestr_count
    from (
        -- assumes studies holds one row per student per semester
        select name_students, count(semestr) as semestr_count
        from studies
        group by name_students
    ) agregate_studies
) ranked
where r_n = 1
Use r_n to select other ranks: r_n = 2 gives the student in second place, and r_n <= 3 gives the top three.
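As a runnable illustration of the ranking idea, here is a small sketch against the students(name, semester) table from the question, using Python's sqlite3 module. It assumes SQLite 3.25 or later (needed for window functions) and swaps in DENSE_RANK so that students with the same semester share a rank; the sample rows and the helper name students_at_rank are invented for the example.

```python
import sqlite3


def students_at_rank(rows, rank):
    """Return the names of the students whose semester has the given dense rank."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (name TEXT, semester INTEGER)")
    conn.executemany("INSERT INTO students VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT name
        FROM (
            SELECT name,
                   -- highest semester gets rank 1; equal semesters share a rank
                   DENSE_RANK() OVER (ORDER BY semester DESC) AS r_n
            FROM students
        ) ranked
        WHERE r_n = ?
        ORDER BY name
    """, (rank,)).fetchall()
    conn.close()
    return [r[0] for r in result]


# Hypothetical data: Bob and Cy are tied for the second-longest time.
sample = [("Ann", 9), ("Bob", 7), ("Cy", 7), ("Di", 3)]
print(students_at_rank(sample, 2))
```

With ROW_NUMBER instead of DENSE_RANK, only one of the two tied students would be numbered 2, so the choice depends on whether ties should be reported.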

SQL Query: Find the name of the company that has been assigned the highest number of patents

Using this query I can find the Company Assignee number for company with most patents but I can't seem to print the company name.
SELECT count(*), patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) =
(SELECT max(count(*))
FROM Patent
Group by patent.assignee);
COUNT(*) --- ASSIGNEE
9 19715
9 27895
Nesting above query into
SELECT company.compname
FROM company
WHERE ( company.assignee = ( *above query* ) );
would give an error "too many values" since there are two companies with most patents but above query takes only one assignee number in the WHERE clause. How do I solve this problem? I need to print name of BOTH companies with assignee number 19715 and 27895. Thank you.
You have started down the path of using nested queries. All you need to do is remove COUNT(*) from the subquery's select list and compare with IN instead of =:
SELECT company.compname
FROM company
WHERE company.assignee IN
(SELECT patent.assignee
FROM Patent
GROUP BY patent.assignee
HAVING count(*) = (SELECT max(count(*))
FROM Patent
GROUP BY patent.assignee
)
);
I wouldn't write the query this way. The use of max(count(*)) is particularly jarring, but it is valid Oracle syntax.
Applying an aggregate function on another aggregate function (like max(count(*))) is illegal in many databases but I believe using the ALL operator instead and a join to get the company name would solve your problem.
Try this:
SELECT COUNT(*), p.assignee, c.compname
FROM Patent p
JOIN Company c ON c.assignee = p.assignee
GROUP BY p.assignee, c.compname
HAVING COUNT(*) >= ALL -- this predicate will return those rows
( -- for which the comparison holds true
SELECT COUNT(*) -- for all instances.
FROM Patent -- it can only be true for the highest count
GROUP BY assignee
);
Assuming you have Oracle, I thought about this a bit differently:
select
c.compname
from
company c
join
(
select
assignee,
dense_rank() over (order by count(1) desc) rnk
from
patent
group by
assignee
) p
on p.assignee = c.assignee
where
p.rnk = 1
;
I like this because it lets you find any rank. For example, if you want the top 3 you would just change p.rnk = 1 to p.rnk <= 3. If you want 10th place, you just change it to p.rnk = 10. Adding the total count and rank to the results would be easy from here too. Overall I think it's more versatile.
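For anyone who wants to try the ranking approach without an Oracle instance, here is a rough sketch in Python's sqlite3 (assuming SQLite 3.25+ for window functions). It is slightly restructured compared with the Oracle query: the counts are computed in an inner subquery and ranked one level up. The company names and the extra assignee 11111 are made up; only the two tied assignee numbers come from the question.

```python
import sqlite3


def companies_at_rank(patent_assignees, companies, rank=1):
    """Return names of the companies whose patent count has the given dense rank."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE company (assignee INTEGER, compname TEXT)")
    conn.execute("CREATE TABLE patent (assignee INTEGER)")
    conn.executemany("INSERT INTO company VALUES (?, ?)", companies)
    conn.executemany("INSERT INTO patent VALUES (?)", [(a,) for a in patent_assignees])
    rows = conn.execute("""
        SELECT c.compname
        FROM company c
        JOIN (
            SELECT assignee,
                   DENSE_RANK() OVER (ORDER BY n DESC) AS rnk
            FROM (SELECT assignee, COUNT(*) AS n   -- patents per assignee
                  FROM patent
                  GROUP BY assignee) counts
        ) p ON p.assignee = c.assignee
        WHERE p.rnk = ?
        ORDER BY c.compname
    """, (rank,)).fetchall()
    conn.close()
    return [r[0] for r in rows]


# Hypothetical data: two assignees with 9 patents each, one with 3.
companies = [(19715, "Alpha Corp"), (27895, "Beta Ltd"), (11111, "Gamma Inc")]
patents = [19715] * 9 + [27895] * 9 + [11111] * 3
print(companies_at_rank(patents, companies, 1))
print(companies_at_rank(patents, companies, 2))
```

Because DENSE_RANK gives tied counts the same rank, both companies with 9 patents come back for rank 1, which is what the question needs.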

Simple definition: query or sub-query?

I have seen sources saying that an SQL statement such as
SELECT first_name, last_name, subject
FROM student_details
WHERE games NOT IN ('Cricket', 'Football');
is an example of a subquery, but is it not a simple query? I was under the impression that subqueries demand a second call of SELECT, is this correct?
A subquery is a query within a query - your example is just a query.
Your source, http://beginner-sql-tutorial.com/sql-subquery.htm, is incorrect in some ways, I think.
This is a query that contains a subquery:-
USE AdventureWorks2008R2;
GO
SELECT Ord.SalesOrderID, Ord.OrderDate,
(SELECT MAX(OrdDet.UnitPrice)
FROM AdventureWorks2008R2.Sales.SalesOrderDetail AS OrdDet
WHERE Ord.SalesOrderID = OrdDet.SalesOrderID) AS MaxUnitPrice
FROM AdventureWorks2008R2.Sales.SalesOrderHeader AS Ord
This statement contains a subquery:
Select First_Name, Last_Name, Subject
From Student_Details
Where GameID not in (Select GameID from Games where RequiresHelmet = 1)
Yours does not.
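To make the distinction concrete, here is a small sketch in Python's sqlite3 that runs a plain query with a literal NOT IN list next to one that uses a subquery. The table layouts and rows are guesses made up for the example (the thread never shows the real schema), and the helper name run_queries is hypothetical.

```python
import sqlite3


def run_queries():
    """Run a plain NOT IN query and a NOT IN subquery over the same made-up data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Games (GameID INTEGER PRIMARY KEY, Name TEXT, RequiresHelmet INTEGER)")
    conn.execute("CREATE TABLE Student_Details (First_Name TEXT, Last_Name TEXT, Subject TEXT, GameID INTEGER)")
    conn.executemany("INSERT INTO Games VALUES (?, ?, ?)",
                     [(1, "Cricket", 1), (2, "Football", 1), (3, "Chess", 0)])
    conn.executemany("INSERT INTO Student_Details VALUES (?, ?, ?, ?)",
                     [("Ann", "Lee", "Math", 1), ("Bob", "Ray", "Art", 3),
                      ("Cy", "Poe", "Science", 2), ("Di", "Fox", "Music", 3)])

    # Plain query: the NOT IN list is a set of literals, there is no inner SELECT.
    plain = conn.execute("""
        SELECT First_Name, Last_Name
        FROM Student_Details
        WHERE GameID NOT IN (1, 2)
        ORDER BY First_Name
    """).fetchall()

    # Query with a subquery: the list is produced by a second, nested SELECT.
    with_subquery = conn.execute("""
        SELECT First_Name, Last_Name
        FROM Student_Details
        WHERE GameID NOT IN (SELECT GameID FROM Games WHERE RequiresHelmet = 1)
        ORDER BY First_Name
    """).fetchall()
    conn.close()
    return plain, with_subquery


plain, with_subquery = run_queries()
print(plain)
print(with_subquery)
```

Both return the same rows here, but only the second contains a nested SELECT, which is what makes it a subquery.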

How to optimize an SQLite3 query

I'm learning SQLite3 by means of a book ("Using SQLite") and the Northwind database. I have written the following code to order the customers by the number of customers in their city, then alphabetically by their name.
SELECT ContactName, Phone, City as originalCity
FROM Customers
ORDER BY (
SELECT count(*)
FROM Customers
WHERE city=originalCity)
DESC, ContactName ASC
It takes about 50-100ms to run. Is there a standard procedure to follow to optimize this query, or more generally, queries of its type?
In the most general case, query optimization starts with reading the query optimizer's execution plan. In SQLite, you just use
EXPLAIN QUERY PLAN statement
In your case,
EXPLAIN QUERY PLAN
SELECT ContactName, Phone, City as originalCity
FROM Customers
ORDER BY (
SELECT count(*)
FROM Customers
WHERE city=originalCity)
DESC, ContactName ASC
You might also need to read the output of
EXPLAIN statement
which goes into more low-level detail.
In general (not only SQLite), it's better to do the count for all values (cities) at once, and a join to construct the query:
SELECT ContactName, Phone, Customers.City as originalCity
FROM Customers
JOIN (SELECT city, count(*) cnt
FROM Customers
GROUP BY city) Customers_City_Count
ON Customers.city = Customers_City_Count.city
ORDER BY Customers_City_Count.cnt DESC, ContactName ASC
This prevents the count from being computed many times for the same city, which is what happens in your query.
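A quick way to check that the join form is a drop-in replacement is to run both queries on the same data and compare the results. The sketch below uses Python's sqlite3 with a tiny made-up Customers table rather than Northwind, and writes the correlated version with explicit table aliases (c1, c2) instead of the originalCity column alias; both of those are assumptions of this example.

```python
import sqlite3

# Correlated form: the count subquery is evaluated per outer row.
CORRELATED = """
    SELECT c1.ContactName, c1.Phone, c1.City
    FROM Customers c1
    ORDER BY (SELECT count(*) FROM Customers c2 WHERE c2.City = c1.City) DESC,
             c1.ContactName ASC
"""

# Join form: counts are computed once per city, then joined back.
JOINED = """
    SELECT Customers.ContactName, Customers.Phone, Customers.City
    FROM Customers
    JOIN (SELECT City, count(*) AS cnt
          FROM Customers
          GROUP BY City) Customers_City_Count
      ON Customers.City = Customers_City_Count.City
    ORDER BY Customers_City_Count.cnt DESC, Customers.ContactName ASC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (ContactName TEXT, Phone TEXT, City TEXT)")
conn.executemany("INSERT INTO Customers VALUES (?, ?, ?)", [
    ("Zed", "555-0001", "London"), ("Bob", "555-0002", "London"),
    ("Cy", "555-0003", "London"), ("Al", "555-0004", "Paris"),
    ("Ed", "555-0005", "Paris"), ("Abe", "555-0006", "Oslo"),
])

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
# The plan rows show how SQLite intends to execute the join form.
plan = conn.execute("EXPLAIN QUERY PLAN " + JOINED).fetchall()
conn.close()

print([r[0] for r in joined_rows])
for step in plan:
    print(step)
```

The exact text of the plan rows varies between SQLite versions, so it is best read as a hint about scans and temporary structures rather than compared literally.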

Sql - Use the biggest value to select other data

I'm unsure how to retrieve the biggest value of a table and use it in another query.
Consider this :
CREATE TABLE people
(
peopleID int NOT NULL,
cityID int NOT NULL
)
The following request gives me the number of people per city
SELECT cityID, COUNT(*)
FROM people
GROUP BY cityID
Suppose I want the list of people in the biggest city. I would write the request like this:
SELECT people.peopleID, people.cityID
FROM people,
(
SELECT cityID, COUNT(*) AS "people_count"
FROM people
GROUP BY cityID
) g
WHERE people.cityID = g.cityID
HAVING people_count = MIN(people_count)
but it doesn't work. What's the best way to write this query?
Thanks :)
This technique should work in most databases:
SELECT peopleID
FROM people
WHERE cityID =
(
SELECT cityID
FROM people
GROUP BY cityID
ORDER BY COUNT(*) DESC
LIMIT 1
)
LIMIT 1 is not standard SQL (the standard states that you should use FETCH FIRST 1 ROWS ONLY). For a list of how to fetch only the first row in a variety of databases, see "Select (SQL) - Result limits".
Edit: I misunderstood your question. I thought you meant what is a sensible way to perform this query that can easily be modified to work in almost any SQL database. But it turns out what you actually want to know is how to write the query so that it will work using exactly the same syntax in all databases. Even random databases that no-one uses that don't even properly support the SQL standard. In which case you can try this but I'm sure you can find a database where even this doesn't work:
SELECT peopleID, cityID
FROM people
WHERE cityID = (
SELECT MAX(cityID)
FROM (
SELECT cityID
FROM people
GROUP BY cityID
HAVING COUNT(*) =
(
SELECT MAX(cnt) FROM
(
SELECT cityID, COUNT(*) AS cnt
FROM people
GROUP BY cityID
) T1
)
) T2
)
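Here is a small sketch that runs both variants through Python's sqlite3 so the difference on ties is visible. The people rows and the helper name people_in_biggest_city are made up for illustration, and an ORDER BY peopleID is added to each query only to make the output order predictable.

```python
import sqlite3


def people_in_biggest_city(rows):
    """Return (IDs from the LIMIT 1 variant, IDs from the fully nested variant)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (peopleID int NOT NULL, cityID int NOT NULL)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)

    # Variant 1: pick the biggest city with ORDER BY ... LIMIT 1.
    with_limit = conn.execute("""
        SELECT peopleID
        FROM people
        WHERE cityID = (SELECT cityID
                        FROM people
                        GROUP BY cityID
                        ORDER BY COUNT(*) DESC
                        LIMIT 1)
        ORDER BY peopleID
    """).fetchall()

    # Variant 2: no LIMIT; MAX(cityID) breaks ties between equally big cities.
    nested = conn.execute("""
        SELECT peopleID, cityID
        FROM people
        WHERE cityID = (
            SELECT MAX(cityID)
            FROM (SELECT cityID
                  FROM people
                  GROUP BY cityID
                  HAVING COUNT(*) = (SELECT MAX(cnt)
                                     FROM (SELECT cityID, COUNT(*) AS cnt
                                           FROM people
                                           GROUP BY cityID) T1)
                 ) T2
        )
        ORDER BY peopleID
    """).fetchall()
    conn.close()
    return [r[0] for r in with_limit], [r[0] for r in nested]


# Hypothetical data: city 10 is clearly the biggest.
clear_winner = [(1, 10), (2, 10), (3, 10), (4, 20), (5, 20), (6, 30)]
# Hypothetical data: cities 10 and 20 are tied.
tie = [(1, 10), (2, 10), (3, 20), (4, 20)]

print(people_in_biggest_city(clear_winner))
print(people_in_biggest_city(tie)[1])
```

On the tied data the nested variant always picks the city with the larger cityID, whereas the LIMIT 1 variant may return either city.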