SQL queries: how to get the value which appears the most in the total of two different tables? - sql

Context: I want to know which vehicle brand appears the most in different accidents.
I have the table vehicle (v_number, brand).
Problem is, I have two different accident tables:
One refers to driven cars involved in an accident, let's call it acc_drive (v_number, acc_number, driver) [v_number FK vehicle]
The other refers to parked cars which are involved in an accident, let's call it acc_park (v_number, acc_number) [v_number FK vehicle, acc_number FK acc_drive]
Now, I'm trying to get the vehicle brand which appears the most in the total of the two tables. For example, if Audi cars appeared 2 times in acc_drive and 3 times in acc_park, the total number of appearances would be 5.
I'm having a really hard time trying to figure this out, so a helping hand would be much appreciated!

UNION ALL can be used to bring the tables together for the JOIN:
select v.brand, count(a.v_number)
from vehicle v left join
((select v_number
from acc_drive
) union all
(select v_number
from acc_park
)
) a
on v.v_number = a.v_number
group by v.brand
order by count(a.v_number) desc; -- put the biggest numbers first
Note that this uses a left join. So brands with no accidents will be included in the results.
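As a rough illustration of the UNION ALL approach, here is a minimal sketch run against an in-memory SQLite database through Python's sqlite3 module. The sample rows (including a brand with no accidents) are invented for the example, and LIMIT 1 is SQLite's way of keeping only the top row; other databases spell that differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle   (v_number INTEGER PRIMARY KEY, brand TEXT);
    CREATE TABLE acc_drive (v_number INTEGER, acc_number INTEGER, driver TEXT);
    CREATE TABLE acc_park  (v_number INTEGER, acc_number INTEGER);

    -- Hypothetical sample data: Audi appears 2x driven and 3x parked
    INSERT INTO vehicle VALUES (1, 'Audi'), (2, 'Audi'), (3, 'BMW'), (4, 'Fiat');
    INSERT INTO acc_drive VALUES (1, 100, 'Ann'), (2, 101, 'Bob'), (3, 102, 'Cy');
    INSERT INTO acc_park VALUES (1, 101), (1, 102), (2, 100);
""")

query = """
    SELECT v.brand, COUNT(a.v_number) AS appearances
    FROM vehicle v
    LEFT JOIN (
        SELECT v_number FROM acc_drive
        UNION ALL              -- keep duplicates: every appearance counts
        SELECT v_number FROM acc_park
    ) a ON v.v_number = a.v_number
    GROUP BY v.brand
    ORDER BY appearances DESC
"""
per_brand = conn.execute(query).fetchall()
top_brand = conn.execute(query + " LIMIT 1").fetchone()
print(per_brand)
print(top_brand)
```

Because of the left join, the brand without any accident still shows up, with a count of 0.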

Try this-
SELECT TOP 1 A.brand, COUNT(*) AS appearances
FROM vehicle A
INNER JOIN (
SELECT v_number FROM acc_drive
UNION ALL
SELECT v_number FROM acc_park
) B ON A.v_number = B.v_number
GROUP BY A.brand
ORDER BY COUNT(*) DESC

Related

How to include zero results when querying one single table?

I have a table called Apartments that has three columns: apartment_type, person, date. It includes the apartment type selected by a certain person and date. I need to count how many people picked each of the apartment types. Some apartment type have 0 population.
Here is my query:
SELECT apartment_type, COUNT(*) AS TOTAL
FROM Apartments
GROUP BY apartment_type
It works great, but it doesn't include apartment types with a value of 0. Please, help me to correct this query.
If some apartment_type has 0 population, your table will not contain any record with that type, so you must join against another table where all apartment types exist, or use a union to create the zero-populated entries.
Something like:
SELECT apartment_type, COUNT(person) AS TOTAL
FROM (SELECT apartment_type, person FROM Apartments UNION ALL SELECT apartment_type, NULL AS person FROM SomeTableWithFullListOfTypes GROUP BY apartment_type) AS tmp
GROUP BY apartment_type
I generally agree with Nosyara's answer, but I don't agree with his sample query with the union all. I'm not sure it works, and it's certainly too complicated.
As stated already, if you don't have a table with all the possible apartment types, create one. Then you can write your query using a simple left join:
select t.apartment_type, count(a.apartment_type) as total
from apartment_types t
left join apartments a
on a.apartment_type = t.apartment_type
group by t.apartment_type
Note how count(*) was replaced by count(a.apartment_type). That change is necessary to have an accurate count in the case where you don't have apartments for a certain apartment type.
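To make the difference between count(*) and count(a.apartment_type) concrete, here is a small sketch using Python's sqlite3 module. The apartment type names and rows are made up, and the helper `totals` is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apartment_types (apartment_type TEXT PRIMARY KEY);
    CREATE TABLE apartments (apartment_type TEXT, person TEXT, date TEXT);

    -- Hypothetical data: nobody picked 'penthouse'
    INSERT INTO apartment_types VALUES ('studio'), ('loft'), ('penthouse');
    INSERT INTO apartments VALUES
        ('studio', 'Ann', '2020-01-01'),
        ('studio', 'Bob', '2020-01-02'),
        ('loft',   'Cy',  '2020-01-03');
""")

def totals(count_expr):
    # count_expr is only ever one of the two literals passed below
    sql = f"""
        SELECT t.apartment_type, {count_expr} AS total
        FROM apartment_types t
        LEFT JOIN apartments a ON a.apartment_type = t.apartment_type
        GROUP BY t.apartment_type
    """
    return dict(conn.execute(sql).fetchall())

with_star = totals("COUNT(*)")                   # counts the NULL-extended row too
with_column = totals("COUNT(a.apartment_type)")  # skips NULLs from the right side
print(with_star)
print(with_column)
```

With count(*), the unmatched type is reported as 1 because the left join still produces one row for it; counting a column from the right-hand table gives the expected 0.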
SELECT t.apartment_type, COUNT(a.apartment_type) AS TOTAL
FROM apartment_type t
LEFT JOIN apartment a
ON t.apartment_type = a.apartment_type
GROUP BY t.apartment_type
Using a left join will give you everything from the left side of the join (so all your types) and anything from the right that matches.

SQL Server count() returns different results when using joins

I am new to SQL Server and I am not looking for a solution (but it may help others), rather, I would like to understand the behaviour / why I get two different results from the two pseudo queries below.
The reason I have joined two other tables is because I will need to count all items in Vehicle against the 'Date_Recorded' in 'Garage'. All recorded since 2010. So before I did this I wanted to be sure I was getting the same total count from the table 'Car' and also get the same result with the joins, before I added the 'isDate' on 'Garage' condition, that is when I noticed the difference in the results.
I would have thought the joins would have been ignored?
Hope someone can advise? Thanks in advance!
SELECT count(Car.CAR_ID) AS Car_ID
FROM Vehicle Car
INNER JOIN Road Rd
ON Car.CAR_ID = Rd.CAR_ID
JOIN Garage g
ON Rd.GARAGE_ID = g.GARAGE_ID
----------------------------------------------
Car_ID
----------------------------------------------
226923
SELECT count(Car.CAR_ID) AS Car_ID
FROM Vehicle Car
----------------------------------------------
Car_ID
----------------------------------------------
203417
INNER JOIN: Returns all rows when there is at least one match in BOTH tables.
LEFT JOIN: Return all rows from the left table, and the matched rows from the right table.
RIGHT JOIN: Return all rows from the right table, and the matched rows from the left table.
FULL JOIN: Return all rows when there is a match in ONE of the tables.
You are using an inner join, which is why the count differs from the plain count() on the table.
A left join keeps every record from the left table, but it still repeats a left row once per matching row, so count() only equals the count of the main (left) table when each row has at most one match.
You either have
multiple records in the Road table with the same Car_ID or
multiple records in the Garage table with the same Garage_ID or
Both of the above
You may be able to run the following to get what you want (assuming there is always a match in the road and garage tables):
SELECT count(Distinct Car.CAR_ID) AS Car_ID
FROM Vehicle Car
INNER JOIN Road Rd
ON Car.CAR_ID = Rd.CAR_ID
JOIN Garage g
ON Rd.GARAGE_ID = g.GARAGE_ID
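The row multiplication described above can be reproduced on a tiny data set. This is only a sketch with Python's sqlite3 module; the table contents are invented, and one car is deliberately given two rows in Road.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Vehicle (CAR_ID INTEGER PRIMARY KEY);
    CREATE TABLE Garage  (GARAGE_ID INTEGER PRIMARY KEY);
    CREATE TABLE Road    (CAR_ID INTEGER, GARAGE_ID INTEGER);

    -- Hypothetical data: car 1 appears twice in Road
    INSERT INTO Vehicle VALUES (1), (2), (3);
    INSERT INTO Garage  VALUES (10), (20);
    INSERT INTO Road    VALUES (1, 10), (1, 20), (2, 10), (3, 20);
""")

joined = """
    FROM Vehicle Car
    INNER JOIN Road Rd ON Car.CAR_ID = Rd.CAR_ID
    JOIN Garage g ON Rd.GARAGE_ID = g.GARAGE_ID
"""
plain_count = conn.execute("SELECT COUNT(CAR_ID) FROM Vehicle").fetchone()[0]
joined_count = conn.execute("SELECT COUNT(Car.CAR_ID) " + joined).fetchone()[0]
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT Car.CAR_ID) " + joined
).fetchone()[0]
print(plain_count, joined_count, distinct_count)
```

The join is not ignored: count() runs over the joined rows, so the car with two Road rows is counted twice unless DISTINCT is used.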

Constructing a query, for selecting a table with limit of associations

I have spent far too many hours trying to construct this SQL query, and I just can't wrap my head around it.
I have three tables, with the following relations, i have removed the rest of the columns for simplicity.
- Jobs
id
- Company
id
- Offer
job_id
company_id
offer_type (either 'single' or 'voucher')
- Reservation
job_id
company_id
Context.
A user creates a job. Companies can make one or two offers (one of each type) on a job, a job is closed when a job gets offers from 3 different companies. Also a reservation can take one of the spots.
So I am trying to fetch all open jobs for a listing shown to the company, that is, all jobs which have received offers from at most 2 different companies.
As mentioned, I have tried to come up with a query for this. So far I have:
;WITH company_offers AS
(
SELECT
DISTINCT ON(offers.company_id) offers.company_id,
count(offers.company_id) as total,
offers.job_id
FROM offers
GROUP BY offers.company_id, offers.job_id
),
counts AS
(
SELECT jobs.*,
(SELECT count(*) FROM company_offers) as offer_count,
(SELECT count(*) FROM reservations WHERE reservations.job_id = jobs.id) as reservation_count
FROM jobs
JOIN company_offers ON company_offers.job_id = jobs.id
GROUP BY jobs.id
)
SELECT offer_count+reservation_count as total
FROM counts
I have tried to fetch the offers by unique company id in the first CTE, then used the second CTE to count the results of the first and also find the reservations. Finally I add them together, and the last step should be a condition that the total is less than 3.
But this doesn't return the expected result; in fact, it is far from it.
I would appreciate it if someone could help me out, and explain as well.
Let me know if you have any questions.
Some generic SQL could look like this:
select Jobs.id
from Jobs
left outer join Offer on Offer.job_id = Jobs.id
left outer join Reservation on Reservation.job_id = Jobs.id
group by Jobs.id
having count(distinct Offer.company_id) + count(distinct Reservation.company_id) < 3
If PostgreSQL does not like that count(distinct ...), you may have to include an equivalent sub-query.
By the way:
SELECT DISTINCT ... GROUP BY ..., i.e. DISTINCT and GROUP BY, usually does not work out.
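Here is a small sketch of the generic query above, run on SQLite via Python's sqlite3 module (SQLite accepts count(distinct ...) inside HAVING). The jobs, company ids, and rows are made up to cover the cases from the question: two offers from the same company, three offering companies, and a reservation taking a spot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Jobs (id INTEGER PRIMARY KEY);
    CREATE TABLE Offer (job_id INTEGER, company_id INTEGER, offer_type TEXT);
    CREATE TABLE Reservation (job_id INTEGER, company_id INTEGER);

    INSERT INTO Jobs VALUES (1), (2), (3), (4);
    -- Hypothetical data
    INSERT INTO Offer VALUES
        (1, 1, 'single'), (1, 1, 'voucher'), (1, 2, 'single'),  -- 2 companies
        (2, 1, 'single'), (2, 2, 'single'), (2, 3, 'voucher'),  -- 3 companies
        (3, 1, 'single'), (3, 2, 'voucher');                    -- 2 companies
    INSERT INTO Reservation VALUES (3, 3);                      -- third spot on job 3
""")

open_jobs = [row[0] for row in conn.execute("""
    SELECT Jobs.id
    FROM Jobs
    LEFT OUTER JOIN Offer ON Offer.job_id = Jobs.id
    LEFT OUTER JOIN Reservation ON Reservation.job_id = Jobs.id
    GROUP BY Jobs.id
    HAVING COUNT(DISTINCT Offer.company_id)
         + COUNT(DISTINCT Reservation.company_id) < 3
    ORDER BY Jobs.id
""")]
print(open_jobs)
```

The two left joins multiply offer rows by reservation rows for each job, which is why the counts have to be DISTINCT. Note that a company holding both an offer and a reservation on the same job would be counted twice by this query.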

Uses of unequal joins

Of all the thousands of queries I've written, I can probably count on one hand the number of times I've used a non-equijoin. e.g.:
SELECT * FROM tbl1 INNER JOIN tbl2 ON tbl1.date > tbl2.date
And most of those instances were probably better solved using another method. Are there any good/clever real-world uses for non-equijoins that you've come across?
Bitmasks come to mind. In one of my jobs, we had permissions for a particular user or group on an "object" (usually corresponding to a form or class in the code) stored in the database. Rather than including a row or column for each particular permission (read, write, read others, write others, etc.), we would typically assign a bit value to each one. From there, we could then join using bitwise operators to get objects with a particular permission.
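A minimal sketch of what such a bitmask join might look like, using Python's sqlite3 module (SQLite supports the & operator). The tables `permission` and `object_acl`, their columns, and the bit values are all hypothetical; the original schema is not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one bit per permission, one mask per (object, user)
    CREATE TABLE permission (name TEXT, bit INTEGER);
    CREATE TABLE object_acl (object_name TEXT, user_name TEXT, mask INTEGER);

    INSERT INTO permission VALUES ('read', 1), ('write', 2), ('read_others', 4);
    INSERT INTO object_acl VALUES
        ('invoice_form', 'alice', 3),   -- read + write
        ('report_form',  'alice', 1),   -- read
        ('invoice_form', 'bob',   4);   -- read_others
""")

# Non-equijoin: a permission row matches when its bit is set in the mask
granted = conn.execute("""
    SELECT a.object_name, a.user_name, p.name
    FROM object_acl a
    JOIN permission p ON (a.mask & p.bit) <> 0
    ORDER BY a.object_name, a.user_name, p.bit
""").fetchall()

# Objects on which a given user holds a particular permission
alice_can_write = [row[0] for row in conn.execute("""
    SELECT a.object_name
    FROM object_acl a
    JOIN permission p ON (a.mask & p.bit) <> 0
    WHERE a.user_name = 'alice' AND p.name = 'write'
""")]
print(granted)
print(alice_can_write)
```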
How about for checking for overlaps?
select ...
from employee_assignments ea1
, employee_assignments ea2
where ea1.emp_id = ea2.emp_id
and ea1.end_date >= ea2.start_date
and ea1.start_date <= ea2.end_date
Whole-day intervals in date_time fields:
date_time_field >= begin_date and date_time_field < end_date_plus_1
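The half-open comparison above can be used directly as a join condition. The sketch below uses Python's sqlite3 module with ISO-8601 text timestamps, which compare correctly as strings; the tables `events` and `periods` and their rows are invented, and date(x, '+1 day') is SQLite's way of computing end_date_plus_1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: periods are stored as whole days (inclusive end)
    CREATE TABLE periods (name TEXT, begin_date TEXT, end_date TEXT);
    CREATE TABLE events  (id INTEGER, ts TEXT);

    INSERT INTO periods VALUES
        ('January',  '2024-01-01', '2024-01-31'),
        ('February', '2024-02-01', '2024-02-29');
    INSERT INTO events VALUES
        (1, '2024-01-01 00:00:00'),
        (2, '2024-01-31 23:59:59'),
        (3, '2024-02-01 00:00:00');
""")

# Half-open range: ts >= begin_date AND ts < end_date + 1 day
half_open = conn.execute("""
    SELECT e.id, p.name
    FROM events e
    JOIN periods p
      ON e.ts >= p.begin_date
     AND e.ts <  date(p.end_date, '+1 day')
    ORDER BY e.id
""").fetchall()

# For comparison: BETWEEN against the bare end date drops the late-evening event
naive = conn.execute("""
    SELECT e.id, p.name
    FROM events e
    JOIN periods p ON e.ts BETWEEN p.begin_date AND p.end_date
    ORDER BY e.id
""").fetchall()
print(half_open)
print(naive)
```

Event 2 falls on the last day of January but after midnight, so only the half-open form assigns it to a period.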
Just found another interesting use of an unequal join on the MCTS 70-433 (SQL Server 2008 Database Development) Training Kit book. Verbatim below.
By combining derived tables with unequal joins, you can calculate a variety of cumulative aggregates. The following query returns a running aggregate of orders for each salesperson (my note - with reference to the ubiquitous AdventureWorks sample db):
select
SH3.SalesPersonID,
SH3.OrderDate,
SH3.DailyTotal,
SUM(SH4.DailyTotal) RunningTotal
from
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH3
join
(select SH1.SalesPersonID, SH1.OrderDate, SUM(SH1.TotalDue) DailyTotal
from Sales.SalesOrderHeader SH1
where SH1.SalesPersonID IS NOT NULL
group by SH1.SalesPersonID, SH1.OrderDate) SH4
on SH3.SalesPersonID = SH4.SalesPersonID AND SH3.OrderDate >= SH4.OrderDate
group by SH3.SalesPersonID, SH3.OrderDate, SH3.DailyTotal
order by SH3.SalesPersonID, SH3.OrderDate
The derived tables are used to combine all orders for salespeople who have more than one order on a single day. The join on SalesPersonID ensures that you are accumulating rows for only a single salesperson. The unequal join allows the aggregate to consider only the rows for a salesperson where the order date is earlier than the order date currently being considered within the result set.
In this particular example, the unequal join is creating a "sliding window" kind of sum on the daily total column in SH4.
Duplicates:
SELECT
*
FROM
table a, (
SELECT
id,
min(rowid) AS min_rowid
FROM
table
GROUP BY
id
) b
WHERE
a.id = b.id
and a.rowid > b.min_rowid;
If you wanted to get all of the products to offer to a customer and don't want to offer them products that they already have:
SELECT
C.customer_id,
P.product_id
FROM
Customers C
INNER JOIN Products P ON
P.product_id NOT IN
(
SELECT
O.product_id
FROM
Orders O
WHERE
O.customer_id = C.customer_id
)
Most often though, when I use a non-equijoin it's because I'm doing some kind of manual fix to data. For example, the business tells me that a person in a user table should be given all access roles that they don't already have, etc.
If you want to do a dirty join of two not really related tables, you can join with a <>.
For example, you could have a Product table and a Customer table. Hypothetically, if you want to show a list of every product with every customer, you could do something like this:
SELECT *
FROM Product p
JOIN Customer c on p.SKU <> c.SSN
It can be useful. Be careful, though, because it can create ginormous result sets.

Select based on the number of appearances of an id in another table

I have a table B with cids and cities. I also have a table C that has these cids with extra information. I want to list all the cids in table C that are associated with ALL appearances of a given city in Table B.
My current solution relies on counting the number of times the given city appears in Table B and selecting only the cids that appear that many times. I don't know all the SQL syntax yet, but is there a way to select for this kind of pattern?
My current solution:
SELECT Agents.aid
FROM Agents, Customers, Orders
WHERE (Customers.city='Duluth')
AND (Agents.aid = Orders.aid)
AND (Customers.cid = Orders.cid)
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
It only works because I hard-coded the count I happen to know right now into the HAVING clause.
Thanks for the help. I wasn't sure how to google this problem, since it's pretty specific.
EDIT: I'm pinpointing my problem a bit. I need to know how to determine if EVERY row in a table has a certain value for a field. Declaring a variable, counting the rows in a sub-selection, and filtering my results down to the IDs that appear that many times works, but it's really ugly.
There HAS to be a way to do this without explicitly count()ing rows. I hope.
Not an answer to your question, but a general improvement.
I'd recommend using JOIN syntax to join your tables together.
This would change your query to be:
SELECT Agents.aid
FROM Agents
INNER JOIN Orders
ON Agents.aid = Orders.aid
INNER JOIN Customers
ON Customers.cid = Orders.cid
WHERE Customers.city='Duluth'
GROUP BY Agents.aid
HAVING count(Agents.aid) > 1
What variant of SQL are you using?
To start with, you can (and should) use JOIN instead of doing it in the WHERE clause, e.g.,
select Agents.aid
from Agents
join Orders on Agents.aid = Orders.aid
join Customers on Customers.cid = Orders.cid
where Customers.city = 'Duluth'
group by Agents.aid
having count(Agents.aid) > 1
After that, I'm afraid I might be a little lost. Using the table names in your example query, what (in English, not pseudocode) are you trying to retrieve? For example, I think your sample query is retrieving the PK for all Agents that have been involved in at least 2 Orders involving Customers in Duluth.
Also, some table definitions for Agents, Orders, and Customers might help (then again, they might be irrelevant).
I'm not sure if I understood your problem, but I think the following query is what you want:
SELECT *
FROM customers b
INNER JOIN orders c USING (cid)
WHERE b.city = 'Duluth'
AND NOT EXISTS (SELECT 1
FROM customers b2
WHERE b2.city = b.city
AND b2.cid <> b.cid);
Probably you will need some indexes on these columns.
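One way to avoid counting rows altogether is a doubly nested NOT EXISTS (often called relational division). The sketch below assumes the question means "agents who have placed an order for every customer in Duluth"; that reading, the `ordno` column, and the sample rows are assumptions, not something the question confirms. It runs on SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Agents    (aid TEXT PRIMARY KEY);
    CREATE TABLE Customers (cid TEXT PRIMARY KEY, city TEXT);
    CREATE TABLE Orders    (ordno INTEGER PRIMARY KEY, aid TEXT, cid TEXT);

    -- Hypothetical data: c1 and c2 are the Duluth customers
    INSERT INTO Agents VALUES ('a1'), ('a2'), ('a3');
    INSERT INTO Customers VALUES ('c1', 'Duluth'), ('c2', 'Duluth'), ('c3', 'Dallas');
    INSERT INTO Orders (aid, cid) VALUES
        ('a1', 'c1'), ('a1', 'c2'), ('a1', 'c1'),   -- a1 covers both (one repeat)
        ('a2', 'c1'), ('a2', 'c3'),                 -- a2 misses c2
        ('a3', 'c1'), ('a3', 'c2'), ('a3', 'c3');   -- a3 covers both
""")

# "There is no Duluth customer for whom this agent has no order"
agents_covering_city = [row[0] for row in conn.execute("""
    SELECT a.aid
    FROM Agents a
    WHERE NOT EXISTS (
        SELECT 1
        FROM Customers c
        WHERE c.city = 'Duluth'
          AND NOT EXISTS (
              SELECT 1
              FROM Orders o
              WHERE o.aid = a.aid
                AND o.cid = c.cid
          )
    )
    ORDER BY a.aid
""")]
print(agents_covering_city)
```

Repeated orders for the same customer do not affect the result, which is the main advantage over comparing counts. One caveat: if the city has no customers at all, every agent satisfies the condition.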