Aggregate function in inline view - sql

I am trying to write some SQL for a Pervasive database, but I can't get my query working.
Let me present a simple example. Imagine I have the following table:
PURCHASES:
OrderNumber CustomerName
55 Amy
56 Dan
57 Bob
58 Dan
59 Dan
60 Bob
61 Amy
62 Cindy
63 Dan
Now I can use the query select count(OrderNumber) as "Number of orders placed", CustomerName from PURCHASES group by CustomerName order by count(OrderNumber) desc to get this result for the inline view:
Number of orders placed CustomerName
4 Dan
2 Amy
2 Bob
1 Cindy
But I don't want to stop here. I want to know how many customers exist for each "Number of orders placed", but I can't get this query right.
I want to use this as a subquery, something like this:
select x."Number of orders placed",count(x.CustomerName) as
"Number of customers that have purchased this many orders" from
(
select count(OrderNumber) as "Number of orders placed", CustomerName from PURCHASES
group by CustomerName
) x
group by x."Number of orders placed"
My query is failing miserably, I think because column labels are not the proper way to refer to the subquery's columns.
The result I want to get should look like this:
Number of orders placed Number of customers...
4 1
2 2
1 1
Any help and explanation is appreciated.

I see nothing wrong in your query except for order by in the inline view.
select x."Number of orders placed",count(x.CustomerName) as
"Number of customers that have purchased this many orders"
from
(select count(OrderNumber) as "Number of orders placed", CustomerName
from PURCHASES
group by CustomerName
) x
group by x."Number of orders placed"
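For anyone who wants to check the logic outside Pervasive, here is a minimal sketch that runs the same aggregate-over-aggregate pattern against the sample data. It assumes Python's built-in sqlite3 module as a stand-in engine, and it uses the plain aliases orders_placed and customers (my own names, not from the question) so that quoting rules for labels with spaces do not get in the way.

```python
import sqlite3

def customers_per_order_count():
    # In-memory SQLite database used as a stand-in for Pervasive (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PURCHASES (OrderNumber INTEGER, CustomerName TEXT)")
    rows = [(55, "Amy"), (56, "Dan"), (57, "Bob"), (58, "Dan"), (59, "Dan"),
            (60, "Bob"), (61, "Amy"), (62, "Cindy"), (63, "Dan")]
    conn.executemany("INSERT INTO PURCHASES VALUES (?, ?)", rows)
    # Inner query: orders per customer. Outer query: customers per order count.
    query = """
        SELECT x.orders_placed, COUNT(x.CustomerName) AS customers
        FROM (
            SELECT COUNT(OrderNumber) AS orders_placed, CustomerName
            FROM PURCHASES
            GROUP BY CustomerName
        ) x
        GROUP BY x.orders_placed
        ORDER BY x.orders_placed DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

print(customers_per_order_count())
```

If the plain-alias version works on your engine but the quoted-label version does not, the problem is probably in how the engine handles quoted identifiers rather than in the grouping itself.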

Related

Why did the 'NOT IN' work but not the 'NOT EXISTS'?

I've been trying to improve my SQL and was playing around with a 'NOT EXISTS' function. I needed to find the names of salespeople who did not have any sales to company 'RED'.
I tried this and it did not work:
SELECT DISTINCT
sp.name
FROM salesperson sp
WHERE NOT EXISTS (
SELECT
ord.sales_id
FROM
company cmp
LEFT JOIN orders ord
on cmp.com_id=ord.com_id
WHERE cmp.name = 'RED')
This query ran but returned a NULL. Then I changed it to this and it worked fine:
SELECT DISTINCT
sp.name
FROM salesperson sp
WHERE sp.sales_id NOT IN (
SELECT
ord.sales_id as sales_id
FROM
company cmp
left join orders ord
on cmp.com_id=ord.com_id
WHERE cmp.name = 'RED')
Can someone explain why 'NOT EXISTS' did not work in this instance?
Just in case, here is the exercise in full:
Given three tables: salesperson, company, orders
Output all the names in the table salesperson, who didn’t have sales to company 'RED'.
Table: salesperson
sales_id name salary commission_rate hire_date
1 John 100000 6 4/1/2006
2 Amy 120000 5 5/1/2010
3 Mark 65000 12 12/25/2008
4 Pam 25000 25 1/1/2005
5 Alex 50000 10 2/3/2007
The table salesperson holds the salesperson information. Every salesperson has a sales_id and a name.
Table: company
com_id name city
1 RED Boston
2 ORANGE New York
3 YELLOW Boston
4 GREEN Austin
The table company holds the company information. Every company has a com_id and a name.
Table: orders
order_id order_date com_id sales_id amount
1 1/1/2014 3 4 100000
2 2/1/2014 4 5 5000
3 3/1/2014 1 1 50000
4 4/1/2014 1 4 25000
The table orders holds the sales record information, salesperson and customer company are represented by sales_id and com_id.
expected output
name
Amy
Mark
Alex
Explanation:
According to order '3' and '4' in table orders, it is easy to tell only salesperson 'John' and 'Pam' have sales to company 'RED', so we need to output all the other names in the table salesperson.
I think your two queries are not equivalent.
NOT EXISTS - this returns rows only when the subquery returns no rows. Your subquery is not tied to the outer query, so it returns the same rows for every salesperson. Because it always returns some rows, NOT EXISTS is always false and you get an empty result. You need to correlate this subquery with the main query using WHERE sp.sales_id = ord.sales_id AND cmp.name = 'RED'
NOT IN - this is what you need for your purpose. It compares each sp.sales_id against the list of sales_id values returned by the subquery, so it filters per salesperson.
The equivalent NOT EXISTS requires a correlation clause:
SELECT sp.name
FROM salesperson sp
WHERE NOT EXISTS (SELECT ord.sales_id
FROM company cmp JOIN
orders ord
ON cmp.com_id = ord.com_id
WHERE sp.sales_id = ord.sales_id AND
cmp.name = 'RED'
);
Neither the NOT IN nor NOT EXISTS versions requires a LEFT JOIN in the subquery. In fact, the LEFT JOIN somewhat defeats the purpose of the logic.
Without the correlation clause, the subquery runs and it will return rows if any cmp.name is 'RED'. That appears to be the case and so NOT EXISTS always returns false.
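To see the difference concretely, here is a small sketch that loads the exercise data and runs both forms of NOT EXISTS. It assumes Python's built-in sqlite3 module as the engine, and the tables keep only the columns the queries touch; the function names are mine.

```python
import sqlite3

def build_db():
    # Exercise data, trimmed to the columns the queries actually use.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE salesperson (sales_id INTEGER, name TEXT);
        CREATE TABLE company (com_id INTEGER, name TEXT);
        CREATE TABLE orders (order_id INTEGER, com_id INTEGER, sales_id INTEGER);
        INSERT INTO salesperson VALUES (1,'John'),(2,'Amy'),(3,'Mark'),(4,'Pam'),(5,'Alex');
        INSERT INTO company VALUES (1,'RED'),(2,'ORANGE'),(3,'YELLOW'),(4,'GREEN');
        INSERT INTO orders VALUES (1,3,4),(2,4,5),(3,1,1),(4,1,4);
    """)
    return conn

def names_uncorrelated(conn):
    # Subquery ignores sp, so it returns rows for every salesperson
    # and NOT EXISTS is false every time.
    sql = """
        SELECT DISTINCT sp.name FROM salesperson sp
        WHERE NOT EXISTS (
            SELECT ord.sales_id
            FROM company cmp LEFT JOIN orders ord ON cmp.com_id = ord.com_id
            WHERE cmp.name = 'RED')
    """
    return [r[0] for r in conn.execute(sql)]

def names_correlated(conn):
    # Subquery is evaluated per salesperson via sp.sales_id = ord.sales_id.
    sql = """
        SELECT sp.name FROM salesperson sp
        WHERE NOT EXISTS (
            SELECT 1
            FROM company cmp JOIN orders ord ON cmp.com_id = ord.com_id
            WHERE sp.sales_id = ord.sales_id AND cmp.name = 'RED')
        ORDER BY sp.sales_id
    """
    return [r[0] for r in conn.execute(sql)]

conn = build_db()
print(names_uncorrelated(conn))
print(names_correlated(conn))
```

The first call comes back empty, which matches what the question describes; the second returns the three salespeople with no RED orders.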

Must I use inner join if I want to use MAX() value as a "where" condition?

My table is like this:
ProductID ProductName SupplierID CategoryID Unit Price
1 Chais 1 1 10 boxes x 20 bags 18
2 Chang 1 1 24 - 12 oz bottles 19
3 Aniseed Syrup 1 2 12 - 550 ml bottles 10
4 Chef Anton's
Cajun Seasoning 2 2 48 - 6 oz jars 21.35
5 Chef Anton's
Gumbo Mix 2 2 36 boxes 25
I copied it from https://www.w3schools.com/sql/sql_func_max.asp
I tried a simple test of the MAX() function and it works. But when I use HighestPrice in the WHERE condition as follows:
SELECT
MAX(Price) AS HighestPrice,
SupplierID
FROM Products
GROUP BY SupplierID
WHERE HighestPrice>20;
The system reports an error:
Error: misuse of aggregate: MAX()
Does it mean I must use inner join to get what I want?
Following query should work:
SELECT SupplierID, MAX(Price) AS HighestPrice
FROM Products
GROUP BY SupplierID
HAVING MAX(Price) > 20;
The following is the general order of clauses in a SQL query:
SELECT column_name1,
SUM(column_name2)
FROM table_name
WHERE [CONDITION]
GROUP BY column_name1
HAVING (aggregate function condition);
Use HAVING instead of WHERE.
WHERE is always written before the GROUP BY clause. It filters the rows that already exist in the table, whereas HAVING is written after GROUP BY because it is applied to the grouped rows that the query is producing.
SELECT MAX(Price) AS HighestPrice, SupplierID
FROM Products
Group By SupplierID
having MAX(Price) > 20;
Let me know in case of any queries.
Just for fun: if you specifically want to use a WHERE condition on the highest price, instead of the HAVING clause that repeats the MAX function as given by G.arima, do this:
SELECT *
FROM
(
SELECT
MAX(Price) AS HighestPrice,
SupplierID
FROM Products
GROUP BY SupplierID
) a
WHERE HighestPrice>20;
Hope it helps :-)
By way of explanation of G.arima’s answer above:
When you use GROUP BY, you effectively create a new virtual table which contains only the GROUP BY fields as well as summaries.
There are two filter clauses, WHERE and HAVING, but they have a distinct role.
WHERE filters the original table. This gives you the formula FROM … WHERE …
HAVING filters the groups. This gives you the formula GROUP BY … HAVING …
What you ask is OK, but the clause is the wrong one. As G.arima says, you should use HAVING.
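A small sketch of the two filters side by side, assuming Python's built-in sqlite3 module and only the three columns needed from the sample table. The WHERE Price < 20 filter is my own illustration and is not part of the question; it is there only to show that WHERE acts on rows before grouping.

```python
import sqlite3

def build_products():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Products (ProductID INTEGER, SupplierID INTEGER, Price REAL)")
    conn.executemany(
        "INSERT INTO Products VALUES (?, ?, ?)",
        [(1, 1, 18), (2, 1, 19), (3, 1, 10), (4, 2, 21.35), (5, 2, 25)],
    )
    return conn

def filter_rows_then_group(conn):
    # WHERE runs first: rows priced 20 or more are dropped before grouping,
    # so supplier 2 has no rows left and disappears from the result.
    sql = """
        SELECT SupplierID, MAX(Price) AS HighestPrice
        FROM Products
        WHERE Price < 20
        GROUP BY SupplierID
    """
    return conn.execute(sql).fetchall()

def group_then_filter_groups(conn):
    # HAVING runs after grouping: only groups whose maximum exceeds 20 survive.
    sql = """
        SELECT SupplierID, MAX(Price) AS HighestPrice
        FROM Products
        GROUP BY SupplierID
        HAVING MAX(Price) > 20
    """
    return conn.execute(sql).fetchall()

conn = build_products()
print(filter_rows_then_group(conn))
print(group_then_filter_groups(conn))
```

The two results differ because the first query never sees supplier 2's rows, while the second evaluates every group and then discards supplier 1.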

Merge output as one

Select Category, Books.ISBN,
Orderitems.Quantity * (Books.Retail - Books.Cost) AS Category_Profit
From BOoks
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
Example output:
Category ISBN Category_Profit
Family life 1234 50
Family Life 1234 50
Family Life 1234 100
Fitness 4321 10
Fitness 4321 20
And so forth.
How can I make the output combine all the values for each category into one line?
I.e.
Family Life 1234 200
Fitness 4321 30
Because you already have this as a starting point, use exactly what you have as a temp table, and pull data from that:
Select Category, ISBN, Sum(Category_Profit) From
(
select Category, Books.ISBN as ISBN,
Orderitems.Quantity * (Books.Retail - Books.Cost) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
) temp
group by Category, ISBN
You organized the data really well, so implementing a sum on the Profit is easy. You group by Category and ISBN to get all unique pairs of those columns with the corresponding Profit.
If you do not want to use a sub-query, you can sum in your query directly (but I find it helpful to use the existing query as a sub-query before altering it, just to make sure it works):
select Category, Books.ISBN,
SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
group by Category, Books.ISBN
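Here is a rough sketch that runs both forms (sub-query and direct SUM) and shows they agree. It assumes Python's built-in sqlite3 module, and the Retail, Cost and Quantity values are hypothetical numbers I picked so that the per-row profits match the example output in the question (50, 50, 100 and 10, 20).

```python
import sqlite3

def build_books():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Books (ISBN INTEGER, Category TEXT, Retail INTEGER, Cost INTEGER);
        CREATE TABLE Orderitems (ISBN INTEGER, Quantity INTEGER);
        -- Hypothetical prices: margin 25 for ISBN 1234, margin 10 for ISBN 4321.
        INSERT INTO Books VALUES (1234, 'Family Life', 30, 5), (4321, 'Fitness', 25, 15);
        -- Hypothetical quantities giving profits 50, 50, 100 and 10, 20.
        INSERT INTO Orderitems VALUES (1234, 2), (1234, 2), (1234, 4), (4321, 1), (4321, 2);
    """)
    return conn

def profit_via_subquery(conn):
    # Wrap the original per-row query and aggregate its output.
    sql = """
        SELECT Category, ISBN, SUM(Category_Profit)
        FROM (
            SELECT Category, Books.ISBN AS ISBN,
                   Orderitems.Quantity * (Books.Retail - Books.Cost) AS Category_Profit
            FROM Books
            INNER JOIN Orderitems ON Books.ISBN = Orderitems.ISBN
        ) t
        GROUP BY Category, ISBN
        ORDER BY Category
    """
    return conn.execute(sql).fetchall()

def profit_direct(conn):
    # Aggregate in a single query, no derived table.
    sql = """
        SELECT Category, Books.ISBN,
               SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
        FROM Books
        INNER JOIN Orderitems ON Books.ISBN = Orderitems.ISBN
        GROUP BY Category, Books.ISBN
        ORDER BY Category
    """
    return conn.execute(sql).fetchall()

conn = build_books()
print(profit_via_subquery(conn))
print(profit_direct(conn))
```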
Group by can be used to solve your problem.
Note: In a GROUP BY clause, table rows are grouped based on certain columns, and the SELECT clause can show either the GROUP BY columns or aggregate functions (SUM, MIN, MAX, COUNT, etc.) on the other columns.
Reference :
http://www.dofactory.com/sql/group-by
Use group by as is done below. Hope this solves the issue.
Select Category, Books.ISBN,
SUM(Orderitems.Quantity * (Books.Retail - Books.Cost)) AS Category_Profit
From Books
INNER JOIN Orderitems
ON BOOKS.ISBN=ORDERITEMS.ISBN
Group by Category, Books.ISBN
Use GROUP BY and SUM. Syntax:
SELECT column_name, SUM(column_name)
FROM table_name
WHERE column_name operator value
GROUP BY column_name;
Refer: SQL GROUP BY
On your table you may run:
Select Category, ISBN, Sum(Category_Profit) From Table1
group by Category, ISBN;
Table1:
Category ISBN Category_Profit
Family life 1234 50
Family Life 1234 50
Family Life 1234 100
Fitness 4321 10
Fitness 4321 20
Output:
| Category | ISBN | Sum(Category_Profit) |
|-------------|------|----------------------|
| Family life | 1234 | 200 |
| Fitness | 4321 | 30 |

Postgresql : Check if the last number is the highest

I have a large database and one field should be an incrementing number, but it sometimes resets and I must detect those resets (the bold rows)
Table 1:
Shop #Sell DATE
EC1 56 1/10/2015
EC1 57 2/10/2015
**EC1 11 3/10/2015
EC1 12 4/10/2015**
AS2 20 1/10/2015
AS2 21 2/10/2015
AS2 22 3/10/2015
AS2 23 4/10/2015
To solve this problem I thought to find the highest number of each SHOP and check if it is the number with the highest DATE. Do you know another easier way to do it?
My concern is that it can be a problem to do the way I am thinking since I have a large database.
Do you know how I can do the query I am thinking of or do you have any others ideas?
The query you have in mind will give you all Shop values having a discontinuity in Sell number.
If you want to get the offending record you can use the following query:
SELECT Shop, Sell, DATE
FROM (
SELECT Shop, Sell, DATE,
LAG(Sell) OVER (PARTITION BY Shop ORDER BY DATE) AS prevSell
FROM Shops ) t
WHERE Sell < prevSell
ORDER BY DATE
LIMIT 1
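The above query compares each row with the previous row of the same Shop and, because of ORDER BY DATE LIMIT 1, returns the earliest discontinuity found. Remove LIMIT 1 to list every reset.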
Output:
Shop Sell DATE
---------------------
EC1 11 2015-03-10
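A quick way to try the LAG approach without a PostgreSQL instance is the sketch below. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (window functions are not available before that), stores the dates as ISO strings so they sort correctly, and uses SaleDate as a hypothetical column name in place of DATE. It omits LIMIT 1 so that every reset is listed.

```python
import sqlite3

def build_shops():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Shops (Shop TEXT, Sell INTEGER, SaleDate TEXT)")
    conn.executemany("INSERT INTO Shops VALUES (?, ?, ?)", [
        ("EC1", 56, "2015-01-10"), ("EC1", 57, "2015-02-10"),
        ("EC1", 11, "2015-03-10"), ("EC1", 12, "2015-04-10"),
        ("AS2", 20, "2015-01-10"), ("AS2", 21, "2015-02-10"),
        ("AS2", 22, "2015-03-10"), ("AS2", 23, "2015-04-10"),
    ])
    return conn

def find_resets(conn):
    # LAG fetches the previous Sell within the same Shop, ordered by date.
    # A row whose Sell is lower than its predecessor marks a reset.
    sql = """
        SELECT Shop, Sell, SaleDate
        FROM (
            SELECT Shop, Sell, SaleDate,
                   LAG(Sell) OVER (PARTITION BY Shop ORDER BY SaleDate) AS prevSell
            FROM Shops
        ) t
        WHERE Sell < prevSell
        ORDER BY Shop, SaleDate
    """
    return conn.execute(sql).fetchall()

conn = build_shops()
print(find_resets(conn))
```

The first row of each shop has no predecessor, so prevSell is NULL there and the comparison filters it out without any extra condition.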
EDIT:
In case you cannot use window functions and you only want the id of the shop having the discontinuity, then you can use the following query:
SELECT s.Shop
FROM Shops AS s
INNER JOIN (
SELECT Shop, MAX(Sell) AS Sell, MAX(DATE) AS DATE
FROM Shops
GROUP BY Shop ) t
ON s.Shop = t.Shop AND s.DATE = t.DATE
WHERE t.Sell <> s.Sell
The above will work provided that you have unique DATE values per Shop.
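The join-based version can be tried the same way. This is only a sketch under the same assumptions as the question's sample data: Python's built-in sqlite3 module as the engine, ISO date strings, and SaleDate as a hypothetical column name in place of DATE.

```python
import sqlite3

def build_shops_table():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Shops (Shop TEXT, Sell INTEGER, SaleDate TEXT)")
    conn.executemany("INSERT INTO Shops VALUES (?, ?, ?)", [
        ("EC1", 56, "2015-01-10"), ("EC1", 57, "2015-02-10"),
        ("EC1", 11, "2015-03-10"), ("EC1", 12, "2015-04-10"),
        ("AS2", 20, "2015-01-10"), ("AS2", 21, "2015-02-10"),
        ("AS2", 22, "2015-03-10"), ("AS2", 23, "2015-04-10"),
    ])
    return conn

def shops_with_reset(conn):
    # t holds, per shop, the highest Sell and the latest date.
    # Joining back on the latest date gives the most recent row;
    # if its Sell is not the maximum, the counter was reset at some point.
    sql = """
        SELECT s.Shop
        FROM Shops AS s
        INNER JOIN (
            SELECT Shop, MAX(Sell) AS Sell, MAX(SaleDate) AS SaleDate
            FROM Shops
            GROUP BY Shop
        ) t ON s.Shop = t.Shop AND s.SaleDate = t.SaleDate
        WHERE t.Sell <> s.Sell
    """
    return conn.execute(sql).fetchall()

conn = build_shops_table()
print(shops_with_reset(conn))
```

Note that this only tells you which shop had a reset, not where it happened, and it would miss a shop whose counter reset and then climbed past its old maximum.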
I think the following is the type of query you want:
select distinct shop, maxsell, lastsell
from (select shop,
max(sell) over (partition by shop) as maxsell,
first_value(sell) over (partition by shop order by date desc) as lastsell
from shops
) s
where maxsell <> lastsell;

How can I SELECT the max row in a table SQL?

I have a little problem.
My table is:
Bill Product ID Units Sold
----|-----------|------------
1 | 10 | 25
1 | 20 | 30
2 | 30 | 11
3 | 40 | 40
3 | 20 | 20
I want to SELECT the product which has sold the most units; in this sample case, it should be the product with ID 20, showing 50 units.
I have tried this:
SELECT
SUM(pv."Units sold")
FROM
"Products" pv
GROUP BY
pv.Product ID;
But this shows all the products, how can I select only the product with the most units sold?
Leaving aside for the moment the possibility of having multiple products with the same number of units sold, you can always sort your results by the sum, highest first, and take the first row:
SELECT pv."Product ID", SUM(pv."Units sold")
FROM "Products" pv
GROUP BY pv."Product ID"
ORDER BY SUM(pv."Units sold") DESC
LIMIT 1
I'm not quite sure whether the double-quote syntax for column and table names will work - exact syntax will depend on your specific RDBMS.
Now, if you do want to get multiple rows when more than one product has the same sum, then the SQL will become a bit more complicated:
SELECT pv.`Product ID`, SUM(pv.`Units sold`)
FROM `Products` pv
GROUP BY pv.`Product ID`
HAVING SUM(pv.`Units sold`) = (
select max(sums)
from (
SELECT SUM(pv2.`Units sold`) as "sums"
FROM `Products` pv2
GROUP BY pv2.`Product ID`
) as subq
)
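To illustrate why the tie case needs the longer query, here is a sketch using Python's built-in sqlite3 module (an assumption; the question does not name its RDBMS). It loads the question's rows plus one hypothetical extra row (bill 4, product 40, 10 units) that I added so that products 20 and 40 both total 50.

```python
import sqlite3

def build_sales():
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "Products" ("Bill" INTEGER, "Product ID" INTEGER, "Units Sold" INTEGER)')
    conn.executemany('INSERT INTO "Products" VALUES (?, ?, ?)', [
        (1, 10, 25), (1, 20, 30), (2, 30, 11), (3, 40, 40), (3, 20, 20),
        (4, 40, 10),  # hypothetical row, added only to create a tie at 50
    ])
    return conn

def top_one(conn):
    # Sort by the per-product total and keep a single row.
    sql = """
        SELECT pv."Product ID", SUM(pv."Units Sold") AS total
        FROM "Products" pv
        GROUP BY pv."Product ID"
        ORDER BY total DESC
        LIMIT 1
    """
    return conn.execute(sql).fetchall()

def top_with_ties(conn):
    # Keep every product whose total equals the highest total.
    sql = """
        SELECT pv."Product ID", SUM(pv."Units Sold") AS total
        FROM "Products" pv
        GROUP BY pv."Product ID"
        HAVING SUM(pv."Units Sold") = (
            SELECT MAX(total) FROM (
                SELECT SUM(pv2."Units Sold") AS total
                FROM "Products" pv2
                GROUP BY pv2."Product ID"
            ) AS subq
        )
        ORDER BY pv."Product ID"
    """
    return conn.execute(sql).fetchall()

conn = build_sales()
print(top_one(conn))
print(top_with_ties(conn))
```

With the tie in place, the LIMIT 1 version returns just one of the two products (which one is not guaranteed), while the HAVING version returns both.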
Use ORDER BY plus LIMIT 1:
SELECT pv."Product ID", SUM(pv."Units sold") AS total_units
FROM "Products" pv
GROUP BY pv."Product ID"
ORDER BY total_units DESC
LIMIT 1
The best and most effective way to do this is the MAX function.
Here's the general syntax of the MAX function:
SELECT MAX(ID) AS id
FROM Products;
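and in your case:
SELECT MAX("Units Sold") FROM Products;
Note that this returns the largest single "Units Sold" value in the table, not a per-product total.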