Understanding JOINs, Subqueries and Aggregate Functions

In MySQL, I am trying to understand aggregate functions and am trying some examples in the northwind schema.
The employees table has the following description:
+-----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| company | varchar(50) | YES | MUL | NULL | |
| last_name | varchar(50) | YES | MUL | NULL | |
| first_name | varchar(50) | YES | MUL | NULL | |
| email_address | varchar(50) | YES | | NULL | |
| job_title | varchar(50) | YES | | NULL | |
| business_phone | varchar(25) | YES | | NULL | |
| home_phone | varchar(25) | YES | | NULL | |
| mobile_phone | varchar(25) | YES | | NULL | |
| fax_number | varchar(25) | YES | | NULL | |
| address | longtext | YES | | NULL | |
| city | varchar(50) | YES | MUL | NULL | |
| state_province | varchar(50) | YES | MUL | NULL | |
| zip_postal_code | varchar(15) | YES | MUL | NULL | |
| country_region | varchar(50) | YES | | NULL | |
| web_page | longtext | YES | | NULL | |
| notes | longtext | YES | | NULL | |
| attachments | longblob | YES | | NULL | |
+-----------------+-------------+------+-----+---------+----------------+
The data in the table is:
mysql> select city , first_name,last_name from employees;
+----------+------------+----------------+
| city     | first_name | last_name      |
+----------+------------+----------------+
| Seattle  | Nancy      | Freehafer      |
| Bellevue | Andrew     | Cencini        |
| Redmond  | Jan        | Kotas          |
| Kirkland | Mariya     | Sergienko      |
| Seattle  | Steven     | Thorpe         |
| Redmond  | Michael    | Neipper        |
| Seattle  | Robert     | Zare           |
| Redmond  | Laura      | Giussani       |
| Seattle  | Anne       | Hellung-Larsen |
+----------+------------+----------------+
I am trying to find the average number of people per city.
So far I have:
mysql> select city,count(city) from employees group by city;
+----------+-------------+
| city     | count(city) |
+----------+-------------+
| Bellevue |           1 |
| Kirkland |           1 |
| Redmond  |           3 |
| Seattle  |           4 |
+----------+-------------+
I also have:
SELECT SUM(inner_count_city) from
(
SELECT city AS inner_city,
COUNT(*) AS inner_count_city
FROM employees
GROUP BY inner_city
) temp_table;
+-----------------------+
| SUM(inner_count_city) |
+-----------------------+
|                     9 |
+-----------------------+
I am struggling to go further, for the following reasons:
I cannot do AVG(COUNT(city)), because two aggregate functions cannot be nested.
I am also not sure how to divide each count by the sum of the per-city counts (= 9).
I am not sure whether I should use unions, joins or subqueries.
I am trying to do something like:
select city, inner_count_city / sum(inner_count_city) from ..

Aggregate functions cannot be nested directly, so AVG(COUNT(...)) is not allowed; each level of aggregation needs its own subquery or CTE. If you want the average number of people per city, you are almost there:
SELECT AVG(inner_count_city * 1.0)
FROM (SELECT city AS inner_city, COUNT(*) AS inner_count_city
      FROM employees
      GROUP BY inner_city
     ) c;
Note that some databases, SQL Server for example, do integer arithmetic on integers, so the average of 1 and 2 comes out as 1, not 1.5. Hence the * 1.0. MySQL's AVG() and / already return a decimal, so there the * 1.0 is harmless but not required.
You can do this without a subquery as well:
SELECT COUNT(*) * 1.0 / COUNT(DISTINCT city)
FROM employees;
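To check that the two queries above agree, here is a minimal sketch that loads the nine sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, and the trimmed-down table definition is an assumption made for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL northwind schema (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, city TEXT, first_name TEXT)"
)
rows = [
    ("Seattle", "Nancy"), ("Bellevue", "Andrew"), ("Redmond", "Jan"),
    ("Kirkland", "Mariya"), ("Seattle", "Steven"), ("Redmond", "Michael"),
    ("Seattle", "Robert"), ("Redmond", "Laura"), ("Seattle", "Anne"),
]
conn.executemany("INSERT INTO employees (city, first_name) VALUES (?, ?)", rows)

# Average of the per-city counts, computed over a derived table.
avg_subquery = conn.execute(
    """
    SELECT AVG(inner_count_city)
    FROM (SELECT city AS inner_city, COUNT(*) AS inner_count_city
          FROM employees
          GROUP BY city) AS c
    """
).fetchone()[0]

# Same value without a derived table: total rows / number of distinct cities.
avg_direct = conn.execute(
    "SELECT COUNT(*) * 1.0 / COUNT(DISTINCT city) FROM employees"
).fetchone()[0]

print(avg_subquery, avg_direct)  # 9 employees over 4 cities
```

Both values are 9 / 4, which is why the version without a subquery is a valid shortcut for this particular average.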

You can apply the AVG aggregate function in an outer query:
select avg(cnt)
FROM (select count(city) as cnt
from employees
group by city) as t
To get the percentage of people per city you can use the following query:
select city, count(city) * 100.0 / total_count as cnt
from employees
cross join (select count(*) as total_count from employees) AS t
group by city, total_count
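The percentage query can be tried the same way. This is a sketch under the assumption that an in-memory SQLite table with only a city column is a fair stand-in for the MySQL employees table; the column alias pct is chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (city TEXT)")
# Same city distribution as the sample data: 4 + 3 + 1 + 1 = 9 rows.
cities = ["Seattle"] * 4 + ["Redmond"] * 3 + ["Bellevue", "Kirkland"]
conn.executemany("INSERT INTO employees (city) VALUES (?)", [(c,) for c in cities])

# The one-row derived table t carries the overall total to every group.
pct = dict(
    conn.execute(
        """
        SELECT city, COUNT(city) * 100.0 / total_count AS pct
        FROM employees
        CROSS JOIN (SELECT COUNT(*) AS total_count FROM employees) AS t
        GROUP BY city, total_count
        """
    ).fetchall()
)

for city, share in sorted(pct.items()):
    print(f"{city}: {share:.1f}%")
```

Because t has exactly one row, the cross join does not multiply the employee rows; it only attaches total_count to each of them.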

Related

Select all child record but need values of parent or grandparent record

I'm working on a project that uses a MariaDB v5.5 database to keep track of employees in a tree-based hierarchy. Each person in this tree can have various 'flags' associated with them. In this case, these flags are stored using a bitmask.
My objects look like the following:
Employee Table description
+--------------+-------------+--------------------------------------+
| Name         | Type        | Description                          |
+--------------+-------------+--------------------------------------+
| employee_id  | INT         | Unique key                           |
| name         | VARCHAR(45) | Employee's name                      |
| flags        | INT(4)      | Bitmask of employee attributes       |
| parent_id    | INT         | the employee_id of the parent record |
+--------------+-------------+--------------------------------------+
'flag' bitmap description
+-----+--------------+
| Bit | Flag         |
+-----+--------------+
| 0   | CEO          |
| 1   | MANAGER      |
| 2   | PROJECT_LEAD |
| 3   | SALES_PERSON |
| 4   | MAINTANCE    |
+-----+--------------+
Employee Table Data
+----+--------+------------+---------------------------+
| id | name   | parent_id  | flags                     |
+----+--------+------------+---------------------------+
|  1 | Lisa   | NULL       | CEO                       |
|  2 | Steve  |          1 | MANAGER                   |
|  3 | Pat    |          1 | MANAGER                   |
|  4 | Mary   |          2 | SALES_PERSON,PROJECT_LEAD |
|  5 | Phil   |          4 | SALES_PERSON,MAINTANCE    |
|  6 | Jim    |          3 | SALES_PERSON,MAINTANCE    |
|  7 | Anna   |          3 | SALES_PERSON,MAINTANCE    |
+----+--------+------------+---------------------------+
Let's say I want to query all employees who have the "MAINTANCE" flag, but I also need to return the id of the ancestor record (parent, grandparent, and so on) that has the "MANAGER" flag. My result should look like this:
> SELECT id, name, manager_id, manager_name FROM ...
+----+--------+------------+--------------+
| ID | Name   | manager_id | manager_name |
+----+--------+------------+--------------+
|  5 | Phil   |          2 | Steve        |
|  6 | Jim    |          3 | Pat          |
|  7 | Anna   |          3 | Pat          |
+----+--------+------------+--------------+
So how do I build my query to give me what I want?
Note that the tree can be more than 3 levels deep, and the query still needs to work.

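SELECT
    a.employee_id AS id
    ,a.name
    ,b.employee_id AS manager_id
    ,b.name AS manager_name
FROM
    EmployeeTable AS a
JOIN
    EmployeeTable AS b
ON
    a.parent_id = b.employee_id
WHERE
    (a.flags & 16) <> 0 -- bit 4: MAINTANCE
AND
    (b.flags & 2) <> 0 -- bit 1: MANAGER
This gives you every MAINTANCE employee whose direct parent is a MANAGER (Jim and Anna in the sample data). A single self-join only looks one level up, so Phil, whose manager Steve is two levels up, is not returned; for a tree of arbitrary depth you need one join per level or a recursive approach.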
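For a tree of arbitrary depth, one option is to walk upwards from each MAINTANCE employee until a MANAGER is found. The sketch below does that with a recursive CTE in an in-memory SQLite database driven from Python. It is an illustration, not a drop-in answer: as far as I know MariaDB only gained recursive CTEs in 10.2, so on 5.5 the same walk would have to live in a stored procedure or in application code. The table name employee, the CTE name chain and the mapping of bit n to the value 1 << n are assumptions made for the example.

```python
import sqlite3

# Bit values derived from the flag table in the question (bit n -> 1 << n, an assumption).
CEO, MANAGER, PROJECT_LEAD, SALES_PERSON = 1 << 0, 1 << 1, 1 << 2, 1 << 3
MAINTANCE = 1 << 4  # spelling kept from the question

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT, flags INTEGER, parent_id INTEGER)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (1, "Lisa", CEO, None),
    (2, "Steve", MANAGER, 1),
    (3, "Pat", MANAGER, 1),
    (4, "Mary", SALES_PERSON | PROJECT_LEAD, 2),
    (5, "Phil", SALES_PERSON | MAINTANCE, 4),
    (6, "Jim", SALES_PERSON | MAINTANCE, 3),
    (7, "Anna", SALES_PERSON | MAINTANCE, 3),
])

query = """
WITH RECURSIVE chain(id, name, ancestor_id) AS (
    -- seed rows, every MAINTANCE employee pointing at its parent
    SELECT employee_id, name, parent_id
    FROM employee
    WHERE (flags & :maint) <> 0
    UNION ALL
    -- while the current ancestor is not a MANAGER, move one level up
    SELECT c.id, c.name, e.parent_id
    FROM chain AS c
    JOIN employee AS e ON e.employee_id = c.ancestor_id
    WHERE (e.flags & :mgr) = 0
)
SELECT c.id, c.name, m.employee_id AS manager_id, m.name AS manager_name
FROM chain AS c
JOIN employee AS m ON m.employee_id = c.ancestor_id
WHERE (m.flags & :mgr) <> 0
ORDER BY c.id
"""
result = conn.execute(query, {"maint": MAINTANCE, "mgr": MANAGER}).fetchall()

for row in result:
    print(row)
```

The walk stops at the first MANAGER it meets, so each employee is reported with the nearest manager above them, and an employee with no manager anywhere up the chain simply drops out of the result.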

SQL count results on left join

I'm trying to get a total count from a LEFT JOIN where the same id appears multiple times. Here's my example:
Table 1 (invitations):
+-------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| project_id | int(11) | NO | | NULL | |
| token | varchar(32) | NO | | NULL | |
| email | varchar(255) | NO | | NULL | |
| status | char(1) | NO | | 0 | |
| permissions | varchar(255) | YES | | NULL | |
| created | datetime | NO | | NULL | |
| modified | datetime | NO | | NULL | |
+-------------+--------------+------+-----+---------+----------------+
Table 2 (projects):
+------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(32) | NO | | NULL | |
| account_id | int(11) | NO | | NULL | |
| created | datetime | NO | | NULL | |
| modified | datetime | NO | | NULL | |
| active | tinyint(1) | YES | | 1 | |
+------------+-------------+------+-----+---------+----------------+
I have this statement so far -
SELECT account_id, (SELECT COUNT(invitations.id)
FROM invitations WHERE invitations.project_id = projects.id) AS inv_count
FROM projects order by account_id;
And here's a sample of the results:
+------------+-----------+
| account_id | inv_count |
+------------+-----------+
| 1 | 0 |
| 2 | 2 |
| 2 | 0 |
| 3 | 4 |
| 3 | 0 |
| 3 | 4 |
| 3 | 0 |
| 4 | 6 |
| 4 | 3 |
| 4 | 3 |
| 4 | 5 |
| 4 | 3 |
| 4 | 9 |
| 5 | 6 |
| 5 | 0 |
| 5 | 4 |
| 5 | 2 |
| 5 | 2 |
How do I get each account_id to show once, on one line, with the sum of its inv_count values? I should see:
+------------+-----------+
| account_id | inv_count |
+------------+-----------+
| 1 | 0 |
| 2 | 2 |
| 3 | 8 |

You only need to put your query in a derived table (and name it, say tmp) and then group by the account_id:
SELECT account_id,
SUM(inv_count) AS inv_count
FROM
( SELECT account_id,
(SELECT COUNT(invitations.id)
FROM invitations
WHERE invitations.project_id = projects.id
) AS inv_count
FROM projects
) AS tmp
GROUP BY account_id
ORDER BY account_id ;
To simplify it further, you can convert the inline subquery to a LEFT JOIN. This way, no derived table is needed. I've also added aliases and removed the ORDER BY. Before version 8.0, MySQL sorted implicitly by the GROUP BY columns, so the ORDER BY was not needed (unless you wanted to order by some other expression, different from the one you group by). From 8.0 on that implicit sort is gone, so add ORDER BY p.account_id if you rely on the order:
SELECT
p.account_id,
COUNT(i.id) AS inv_count
FROM
projects AS p
LEFT JOIN
invitations AS i
ON i.project_id = p.id
GROUP BY
p.account_id ;
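A small sketch of the LEFT JOIN version, run against an in-memory SQLite database from Python. The rows are made up so that the totals match the first three lines of the desired output; the table definitions are trimmed to the columns the query touches. It also runs the same query with COUNT(*) to show why COUNT(i.id) is the right choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample rows: account 1 has no invitations, account 2 has 2,
# account 3 has 4 + 4 spread over two of its three projects.
conn.executescript("""
CREATE TABLE projects (id INTEGER PRIMARY KEY, account_id INTEGER);
CREATE TABLE invitations (id INTEGER PRIMARY KEY, project_id INTEGER);
INSERT INTO projects (id, account_id)
    VALUES (1, 1), (2, 2), (3, 2), (4, 3), (5, 3), (6, 3);
INSERT INTO invitations (project_id)
    VALUES (2), (2), (4), (4), (4), (4), (6), (6), (6), (6);
""")

# COUNT(i.id) skips the NULLs produced for projects without invitations.
counts = conn.execute("""
    SELECT p.account_id, COUNT(i.id) AS inv_count
    FROM projects AS p
    LEFT JOIN invitations AS i ON i.project_id = p.id
    GROUP BY p.account_id
    ORDER BY p.account_id
""").fetchall()

# COUNT(*) counts joined rows instead, so every empty project adds 1.
counts_star = conn.execute("""
    SELECT p.account_id, COUNT(*) AS row_count
    FROM projects AS p
    LEFT JOIN invitations AS i ON i.project_id = p.id
    GROUP BY p.account_id
    ORDER BY p.account_id
""").fetchall()

print(counts)
print(counts_star)
```

The explicit ORDER BY is kept in the sketch so that the output order does not depend on how the engine happens to group.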

Error: The used SELECT statements have a different number of columns

Why am I getting ERROR 1222 (21000): The used SELECT statements have a different number of columns from the following?
SELECT * FROM friends
LEFT JOIN users AS u1 ON users.uid = friends.fid1
LEFT JOIN users AS u2 ON users.uid = friends.fid2
WHERE (friends.fid1 = 1) AND (friends.fid2 > 1)
UNION SELECT fid2 FROM friends
WHERE (friends.fid2 = 1) AND (friends.fid1 < 1)
ORDER BY RAND()
LIMIT 6;
users:
+------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+----------------+
| uid | int(11) | NO | PRI | NULL | auto_increment |
| first_name | varchar(50) | NO | | NULL | |
| last_name | varchar(50) | NO | | NULL | |
| email | varchar(128) | NO | UNI | NULL | |
| mid | varchar(40) | NO | | NULL | |
| active | enum('N','Y') | NO | | NULL | |
| password | varchar(64) | NO | | NULL | |
| sex | enum('M','F') | YES | | NULL | |
| created | datetime | YES | | NULL | |
| last_login | datetime | YES | | NULL | |
| pro | enum('N','Y') | NO | | NULL | |
+------------+---------------+------+-----+---------+----------------+
friends:
+---------------+--------------------------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------------------------------+------+-----+---------+----------------+
| friendship_id | int(11) | NO | MUL | NULL | auto_increment |
| fid1 | int(11) | NO | PRI | NULL | |
| fid2 | int(11) | NO | PRI | NULL | |
| status | enum('pending','accepted','ignored') | NO | | NULL | |
+---------------+--------------------------------------+------+-----+---------+----------------+

UNIONs (UNION and UNION ALL) require that all the queries being UNION'd have:
The same number of columns in the SELECT clause
The column data type has to match at each position
Your query has:
SELECT f.*, u1.*, u2.* ...
UNION
SELECT fid2 FROM friends
The easiest re-write I have is:
SELECT f.*, u.*
FROM FRIENDS AS f
JOIN USERS AS u ON u.uid = f.fid2
WHERE f.fid1 = 1
AND f.fid2 > 1
UNION
SELECT f.*, u.*
FROM FRIENDS AS f
JOIN USERS AS u ON u.uid = f.fid1
WHERE f.fid2 = 1
AND f.fid1 < 1
ORDER BY RAND()
LIMIT 6;
You've LEFT JOIN'd to the USERS table twice, but don't appear to be using the information.
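The column-count rule is easy to reproduce. The sketch below uses an in-memory SQLite database from Python as a stand-in for MySQL; the error text differs from MySQL's ERROR 1222, but the cause is the same. The tables are cut down to a few columns, the sample rows are invented, and the fid > 1 / fid < 1 filters from the question are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (uid INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE friends (friendship_id INTEGER PRIMARY KEY, fid1 INTEGER, fid2 INTEGER);
INSERT INTO users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO friends (fid1, fid2) VALUES (1, 2), (3, 1);
""")

# Three columns on the left of the UNION, one on the right: rejected.
try:
    conn.execute("SELECT * FROM friends UNION SELECT fid2 FROM friends").fetchall()
    mismatch_error = None
except sqlite3.Error as exc:
    mismatch_error = str(exc)

# Both branches select the same two columns, so the UNION is accepted.
friend_rows = conn.execute("""
    SELECT u.uid, u.first_name
    FROM friends AS f JOIN users AS u ON u.uid = f.fid2
    WHERE f.fid1 = 1
    UNION
    SELECT u.uid, u.first_name
    FROM friends AS f JOIN users AS u ON u.uid = f.fid1
    WHERE f.fid2 = 1
    ORDER BY 1
""").fetchall()

print(mismatch_error)
print(friend_rows)
```

Listing the columns explicitly in each branch, rather than using *, makes it much harder for the two sides to drift apart when a table changes.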

mysql three joins

I have a problem with mysql
I have 3 tables:
Deposit
+-------------------+-------------+------+-----+
| Field | Type | Null | Key |
+-------------------+-------------+------+-----+
| id | bigint(20) | NO | PRI |
| status | int(2) | NO | |
| depositDate | datetime | NO | MUL |
| reversePayment_id | bigint(20) | YES | UNI |
| claim_id | int(2) | NO | UNI |
| payment_id | bigint(20) | YES | UNI |
+-------------------+-------------+------+-----+
Payment
+--------------------------+---------------+------+-----+
| Field | Type | Null | Key |
+--------------------------+---------------+------+-----+
| id | int(10) | NO | PRI |
| paymentDate | timestamp | NO | MUL |
| pin | int(10) | NO | MUL |
| balanceChange | decimal(15,2) | YES | |
Claim
+------------------------+--------------+------+-----+
| Field | Type | Null | Key |
+------------------------+--------------+------+-----+
| id | int(11) | NO | PRI |
| fullName | varchar(100) | NO | |
| depositSum | blob | NO | |
| ip | varchar(39) | NO | |
| status | int(2) | NO | |
+------------------------+--------------+------+-----+
I am trying to select deposits (with their claims) whose payment or reversePayment falls between two dates. I run this query with three joins:
EXPLAIN SELECT this_.id AS id60_3_, ..., fcpayment2_.id AS id59_0_, ..., reversepay3_.id AS id59_1_, ..., cl1_.id AS id61_2_, ...
FROM Deposit this_
INNER JOIN Payment fcpayment2_ ON this_.payment_id = fcpayment2_.id
LEFT OUTER JOIN Payment reversepay3_ ON this_.reversePayment_id = reversepay3_.id
INNER JOIN Claim cl1_ ON this_.claim_id = cl1_.id
WHERE (
(
fcpayment2_.paymentDate >= '2010-08-04 21:00:00'
AND fcpayment2_.paymentDate <= '2010-08-05 08:01:00'
)
OR (
reversepay3_.paymentDate >= '2010-08-04 21:00:00'
AND reversepay3_.paymentDate <= '2010-08-05 08:01:00'
)
)
ORDER BY this_.depositDate DESC
The result is:
+----+-------------+--------------+--------+--------------------------------------------------------------------+----------+---------+-----------------------------------------+--------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+--------------------------------------------------------------------+----------+---------+-----------------------------------------+--------+---------------------------------+
| 1 | SIMPLE | cl1_ | ALL | PRIMARY | NULL | NULL | NULL | 426588 | Using temporary; Using filesort |
| 1 | SIMPLE | this_ | eq_ref | claim_id,payment_id,FKDB5A0548511B6CDD,FKDB5A054867BA4108 | claim_id | 4 | portal.cl1_.id | 1 | |
| 1 | SIMPLE | fcpayment2_ | eq_ref | PRIMARY,paymentDate,date | PRIMARY | 4 | portal.this_.payment_id | 1 | Using where |
| 1 | SIMPLE | reversepay3_ | eq_ref | PRIMARY | PRIMARY | 4 | portal.this_.reversePayment_id | 1 | Using where |
+----+-------------+--------------+--------+--------------------------------------------------------------------+----------+---------+-----------------------------------------+--------+---------------------------------+
Why is cl1_ the first table in the result, and why doesn't MySQL use a key for it?

Because you used the keyword 'Explain', and because cl1_ is the alias you gave the table in your query.
I don't understand your question about the key.

MYSQL select statement conditions

query:
SELECT u.deviceID, u.userName, u.contactNo, u.rating
FROM User u
INNER JOIN TaxiQuery t ON u.deviceID = t.seat1
OR u.deviceID = t.seat2
OR u.deviceID = t.seat3
OR u.deviceID = t.seat4
WHERE t.queryID = 3;
+--------------------------------------+----------+-----------+--------+
| deviceID | userName | contactNo | rating |
+--------------------------------------+----------+-----------+--------+
| 00000000-0000-1000-8000-0016CB8B3C8E | uuuuuu | 55555 | 5 |
+--------------------------------------+----------+-----------+--------+
describe user;
+-----------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+--------------+------+-----+---------+-------+
| deviceID | varchar(100) | NO | PRI | NULL | |
| userName | varchar(100) | YES | | NULL | |
| contactNo | int(11) | YES | | NULL | |
| emailAddr | varchar(100) | YES | | NULL | |
| rating | int(11) | YES | | NULL | |
+-----------+--------------+------+-----+---------+-------+
mysql> describe taxiQuery;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| queryID | int(11) | NO | PRI | NULL | auto_increment |
| destination | varchar(100) | YES | | NULL | |
| deptTime | varchar(100) | YES | | NULL | |
| startingPt | varchar(100) | YES | | NULL | |
| boardingPass | varchar(100) | YES | | NULL | |
| miscInfo | varchar(100) | YES | | NULL | |
| seat1 | varchar(100) | YES | | NULL | |
| seat2 | varchar(100) | YES | | NULL | |
| seat3 | varchar(100) | YES | | NULL | |
| seat4 | varchar(100) | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
What I want is to display each user's information if their deviceID appears in seat1, seat2, seat3 or seat4 of TaxiQuery. But I only get one row when there are supposed to be three.
How do I modify the MySQL statement to display the information of every user whose deviceID is in seat1, seat2, seat3 or seat4 (seat1-4 are foreign keys to deviceID in the User table)?

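As far as I can tell, the INNER keyword is not the problem: INNER JOIN and plain JOIN mean the same thing in MySQL, and neither limits how often a row is used. The single TaxiQuery row is paired with every User row that satisfies the ON condition, so the query as written can return up to four users (one per seat). If only one comes back, the values stored in the other seat columns probably do not match any User.deviceID exactly (NULL, a truncated value or stray whitespace, for example), so check that data first.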
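A quick way to see how the join from the question behaves is to run it on a tiny data set. The sketch below does so in an in-memory SQLite database from Python, as a stand-in for MySQL. The device IDs and user names are invented, and the tables are reduced to the columns the query uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented data: three of the four users sit in taxi query 3, seat4 is empty.
conn.executescript("""
CREATE TABLE User (deviceID TEXT PRIMARY KEY, userName TEXT);
CREATE TABLE TaxiQuery (
    queryID INTEGER PRIMARY KEY,
    seat1 TEXT, seat2 TEXT, seat3 TEXT, seat4 TEXT
);
INSERT INTO User VALUES
    ('dev-a', 'alice'), ('dev-b', 'bob'), ('dev-c', 'carol'), ('dev-d', 'dave');
INSERT INTO TaxiQuery VALUES (3, 'dev-a', 'dev-b', 'dev-c', NULL);
""")

# Same join shape as in the question: one TaxiQuery row, OR across the seats.
names = [
    row[0]
    for row in conn.execute("""
        SELECT u.userName
        FROM User AS u
        INNER JOIN TaxiQuery AS t
            ON u.deviceID = t.seat1
            OR u.deviceID = t.seat2
            OR u.deviceID = t.seat3
            OR u.deviceID = t.seat4
        WHERE t.queryID = 3
        ORDER BY u.userName
    """)
]

print(names)
```

With matching data, the one TaxiQuery row pairs with three users, which suggests looking at the stored seat values when the real query returns fewer rows than expected.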
