Northwind - How many countries one shipper handled in one year - sql

I need some help with my shipping query.
I need to count how many different countries each shipping company delivered to in the last year. My current query looks like this:
SELECT CompanyName, Count(o.CountryID) as Shipments, o.CountryID as Countries
FROM Shippers s
INNER JOIN Orders o ON s.ShipperID = o.ShipVia
WHERE DATEPART(year, o.OrderDate)=1997
GROUP BY CompanyName, o.CountryID
ORDER BY Shipments DESC;
It gives me a list of companies and how many times each company shipped to the country with a given CountryID:
United Package 26 9
United Package 26 20
Speedy Express 23 9
Speedy Express 19 20
United Package 17 4
Speedy Express 16 4
What I need is to count how many distinct countries each shipping company delivered to. So, for example, it should give me:
United Package 120 4
Speedy Express 90 3
United Package sent 120 orders to 4 different countries.
How can I change my query to get that result?

SELECT CompanyName, SUM(Shipments) AS Shipments,COUNT(DISTINCT Countries) AS Countries
FROM
(
SELECT CompanyName, Count(o.CountryID) as Shipments, o.CountryID as Countries
FROM Shippers s
INNER JOIN Orders o ON s.ShipperID = o.ShipVia
WHERE DATEPART(year, o.OrderDate)=1997
GROUP BY CompanyName, o.CountryID
) AS T
GROUP BY CompanyName
ORDER BY Shipments DESC

I would recommend:
SELECT s.CompanyName, Count(*) as Shipments,
COUNT(DISTINCT o.CountryID) as Countries
FROM Shippers s INNER JOIN
Orders o
ON s.ShipperID = o.ShipVia
WHERE o.OrderDate >= '1997-01-01' AND o.OrderDate < '1998-01-01'
GROUP BY s.CompanyName
ORDER BY Shipments DESC;
Notes:
Qualify all column names, especially in a query that has more than one table reference.
Don't use datepart(). That prevents the use of indexes. Instead, do direct comparisons on dates.
COUNT(DISTINCT) appears to do what you want.
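The notes above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and a handful of made-up rows (not real Northwind data), so the table contents and the resulting numbers are assumptions for illustration only. Dates are stored as ISO text, which lets the half-open range comparison work in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Shippers (ShipperID INTEGER PRIMARY KEY, CompanyName TEXT);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, ShipVia INTEGER,
                         CountryID INTEGER, OrderDate TEXT);
    INSERT INTO Shippers VALUES (1, 'Speedy Express'), (2, 'United Package');
    -- Made-up rows; two of them fall outside 1997 on purpose.
    INSERT INTO Orders (ShipVia, CountryID, OrderDate) VALUES
        (2, 9,  '1997-01-15'), (2, 9,  '1997-03-02'), (2, 20, '1997-06-30'),
        (2, 4,  '1997-12-31'), (1, 9,  '1997-02-11'), (1, 20, '1997-08-19'),
        (1, 4,  '1996-12-31'), (2, 7,  '1998-01-01');
""")

# One row per shipper: all 1997 shipments and the distinct countries among them.
shipper_stats = conn.execute("""
    SELECT s.CompanyName,
           COUNT(*)                    AS Shipments,
           COUNT(DISTINCT o.CountryID) AS Countries
    FROM Shippers s
    INNER JOIN Orders o ON s.ShipperID = o.ShipVia
    WHERE o.OrderDate >= '1997-01-01' AND o.OrderDate < '1998-01-01'
    GROUP BY s.CompanyName
    ORDER BY Shipments DESC
""").fetchall()

for row in shipper_stats:
    print(row)
```

With these sample rows, United Package has 4 shipments to 3 countries and Speedy Express has 2 shipments to 2 countries; the 1996 and 1998 orders are excluded by the date range.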

Related

SQL find the maximum value

I have 2 tables:
Customer table
CustomerId  FirstName  LastName   Country
1           Luís       Gonçalves  Brazil
2           Leonie     Köhler     Germany
3           François   Tremblay   Canada
4           Bjørn      Hansen     Norway
52          Emma       Jones      United Kingdom
53          Phil       Hughes     United Kingdom
Invoice table
InvoiceId  CustomerId  Total
1          2           1.98
2          4           3.96
3          8           5.94
140        52          23.76
369        52          13.86
283        53          28.71
109        53          8.91
I have to write a query that returns the country along with the top customer and how much they spent. For countries where the top amount spent is shared, provide all customers who spent this amount.
I wrote a query like:
SELECT c.CustomerId, c.FirstName, c.LastName, c.Country, SUM(i.Total) AS TotalSpent
FROM Customer c
JOIN Invoice i
ON c.CustomerId = i.CustomerId
GROUP BY c.FirstName, c.LastName
HAVING i.Total >= MAX(i.Total)
ORDER BY c.Country;
The query is not finding the maximum values; it returns all available values.
I am not sure which DBMS is used, as these are my first steps in SQL and the example above is from a Udacity learning platform lab (maybe they are using SQLite in the lab).
Any help is appreciated. Thank you!
You did not share your database with us.
Also, you need to add the expected results to your question, based on the data you provided.
But let's say you use SQLite; then I think this would work:
select CustomerId
, FirstName
, LastName
, Country
, max(tot)
from ( select sum(i1.Total) as tot
, i1.CustomerId
, c1.Country
, c1.FirstName
, c1.LastName
FROM Customer c1
JOIN Invoice i1
ON c1.CustomerId = i1.CustomerId
group by i1.CustomerId) TAB
group by Country
DEMO
After the comment from the OP I have edited the code:
select c.CustomerId, c.FirstName, c.LastName, c.Country, sum(Total)
from Customer c
JOIN Invoice i ON c.CustomerId = i.CustomerId
group by country, c.CustomerId
having sum(Total) in (select max(tot) as tot2
from (select sum(i1.Total) as tot
, country
FROM Customer c1
JOIN Invoice i1
ON c1.CustomerId = i1.CustomerId
group by i1.CustomerId) TAB
group by country)
DEMO2
As the accepted (at the time of writing this answer) solution would fail on at least PostgreSQL (for not including a selected value in either the group by clause or an aggregate function) I provide another variant:
WITH t AS
(
SELECT
c.CustomerId AS i, c.LastName AS l, c.FirstName AS f,
c.Country AS c, SUM(i.Total) AS s
FROM Customer c JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY c.CustomerId, c.Country
-- note: last and first name do not need to be included in the group by clause
-- here as the (assumed!) primary key 'CustomerId' already is
)
SELECT t1.c, t1.l, t1.f, t1.s FROM t AS t1
-- the two references to t need distinct aliases; an unqualified c inside the
-- subquery would bind to the inner t and compare every row to the global maximum
WHERE t1.s = (SELECT MAX(t2.s) FROM t AS t2 WHERE t2.c = t1.c)
(tested on PostgreSQL)
Below code worked fine to fulfill all the requirements:
WITH tab1 AS
( SELECT c.CustomerId, c.FirstName, c.LastName,c.Country, SUM(i.Total) as TotalSpent
FROM Customer AS c
JOIN Invoice i ON c.CustomerId = i.CustomerId
GROUP BY C.CustomerId
)
SELECT tab1.*
FROM tab1
JOIN
( SELECT CustomerId, FirstName, LastName, Country, MAX(TotalSpent) AS TotalSpent
FROM tab1
GROUP BY Country
)tab2
ON tab1.Country = tab2.Country
WHERE tab1.TotalSpent = tab2.TotalSpent
ORDER BY Country;
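As a portable variant of the answers above, here is a minimal sketch using Python's sqlite3 module and the sample rows from the question. Two things are assumptions of this sketch, not part of the original schema: amounts are stored as integer cents in a hypothetical TotalCents column (so that equal totals compare exactly instead of depending on floating-point sums), and only customers that appear in the Invoice sample are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, FirstName TEXT,
                           LastName TEXT, Country TEXT);
    -- TotalCents is a hypothetical column: the question's Total, times 100.
    CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER,
                          TotalCents INTEGER);
    INSERT INTO Customer VALUES
        (2, 'Leonie', 'Köhler', 'Germany'), (4, 'Bjørn', 'Hansen', 'Norway'),
        (52, 'Emma', 'Jones', 'United Kingdom'),
        (53, 'Phil', 'Hughes', 'United Kingdom');
    INSERT INTO Invoice VALUES
        (1, 2, 198), (2, 4, 396), (140, 52, 2376), (369, 52, 1386),
        (283, 53, 2871), (109, 53, 891);
""")

top_customers = conn.execute("""
    WITH per_customer AS (          -- step 1: total spent per customer
        SELECT c.CustomerId, c.FirstName, c.LastName, c.Country,
               SUM(i.TotalCents) AS SpentCents
        FROM Customer c
        JOIN Invoice i ON i.CustomerId = c.CustomerId
        GROUP BY c.CustomerId, c.FirstName, c.LastName, c.Country
    ),
    per_country AS (                -- step 2: highest total per country
        SELECT Country, MAX(SpentCents) AS MaxCents
        FROM per_customer
        GROUP BY Country
    )
    SELECT pc.Country, pc.FirstName, pc.LastName, pc.SpentCents
    FROM per_customer pc            -- step 3: keep every customer at the maximum
    JOIN per_country m
      ON m.Country = pc.Country AND m.MaxCents = pc.SpentCents
    ORDER BY pc.Country, pc.CustomerId
""").fetchall()

for row in top_customers:
    print(row)
```

Because every selected column is either grouped or aggregated, the same SQL should also be accepted by stricter engines such as PostgreSQL. Both United Kingdom customers come back, since they tie at 3762 cents.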

How to select customers that didn't place an order in the last 7 days

I am trying to find out the customers that didn't place an order in the last seven days. Basically I have 3 tables: customers, orders and help_desk_agents.
I'm trying to figure out the best way to get this information.
The SQL below retrieves the customers' info, the help desk agent 111 and the date of the last order of each customer:
SELECT DISTINCT customers.customer_id,
customers.customer_name,
agents.help_desk_agent,
Max(orders.order_date)
FROM customers
LEFT JOIN (SELECT DISTINCT customers.customer_id,
orders.order_date
FROM orders
GROUP BY 1,
2) orders2
ON customers.customer_id = orders2.customer_id
LEFT JOIN help_desk_agents
ON customers.help_desk_agent_id =
help_desk_agents.help_desk_agent_id
WHERE customer.help_desk_agent_id = 111
GROUP BY 1,
2,
3
I would like to somehow filter for the customers that didn't place an order in the last seven days.
Try adding this at the end of your query:
having max(orders.order_date) < dateadd(day, -7, getdate())
You can try Datediff(dd, <datecolumn>, getdate()) and use >= 7 as a condition.
The query that you want should look like this:
SELECT c.customer_id, c.customer_name, c.help_desk_agent_id,
o.max_order_date
FROM customers c JOIN
(SELECT o.customer_id, MAX(o.order_date) as max_order_date
FROM orders o
GROUP BY o.customer_id
) o
ON c.customer_id = o.customer_id
WHERE c.help_desk_agent_id = 111 AND
o.max_order_date < dateadd(day, -7, getdate());
Your query has multiple issues:
The alias customers.customer_id is not understood in the subquery.
The select distinct is unnecessary.
LEFT JOIN is unnecessary because presumably customers have at least one order and you require a match to the agent table.
You don't need the agent table, because the information you want is in the customer table.
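For experimenting with the idea, here is a small sketch in Python's sqlite3 module with made-up customers and orders. It deliberately differs from the answer above in one respect, which is an assumption about what is wanted: it keeps the LEFT JOIN so that customers who have never ordered are also reported as inactive. The reference date is passed in as a parameter instead of calling getdate(), purely to keep the result reproducible; SQLite's date() function does the seven-day subtraction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_name TEXT,
                            help_desk_agent_id INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         order_date TEXT);
    -- Made-up sample data; Cy has no orders at all.
    INSERT INTO customers VALUES (1, 'Ann', 111), (2, 'Bob', 111),
                                 (3, 'Cy', 111), (4, 'Dee', 222);
    INSERT INTO orders (customer_id, order_date) VALUES
        (1, '2024-03-01'), (1, '2024-03-14'),  -- Ann ordered recently
        (2, '2024-02-20'),                     -- Bob's last order is old
        (4, '2024-01-05');                     -- Dee belongs to another agent
""")

today = "2024-03-15"  # fixed reference date, an assumption for reproducibility
inactive = conn.execute("""
    SELECT c.customer_id, c.customer_name, MAX(o.order_date) AS last_order
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.customer_id
    WHERE c.help_desk_agent_id = 111
    GROUP BY c.customer_id, c.customer_name
    HAVING MAX(o.order_date) IS NULL                 -- never ordered
        OR MAX(o.order_date) < date(?, '-7 days')    -- last order too old
    ORDER BY c.customer_id
""", (today,)).fetchall()

for row in inactive:
    print(row)
```

If customers without any orders should be left out instead, dropping the IS NULL branch (or switching to an inner join) gives the behaviour of the answer above.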

how to select duplicated column value in sql

Write a query that determines the customer that has spent the most on music for each country. Write a query that returns the country along with the top customer and how much they spent. For countries where the top amount spent is shared, provide all customers who spent this amount.
You should only need to use the Customer and Invoice tables.
I want to select the customer with the maximum amount spent for each country. Two customers in the same country have spent the same amount, so when I group by country I get only one of them. What should I do?
select c.CustomerId,c.FirstName,c.LastName, c.Country , max(c.Invoices) as TotalSpent
from
(select * , sum(i.Total) as 'Invoices'
from Customer d
join Invoice i on i.CustomerId = d.CustomerId
group by i.CustomerId
) c
group by c.Country
The table I got matches the expected table except for one missing customer.
Consider joining unit level with two aggregate queries: 1) first to calculate total amount by CustomerId and Country and 2) second to calculate max total amount by Country.
The query below assumes your database supports Common Table Expressions (CTEs) using the WITH clause (supported by nearly all major commercial and open-source RDBMSs). The CTE here avoids the need to repeat sum_agg as a subquery.
with sum_agg AS (
select i.CustomerId, sub_c.Country, sum(i.Total) as Sum_Amount
from Customer sub_c
join Invoice i on i.CustomerId = sub_c.CustomerId
group by i.CustomerId, sub_c.Country
)
select c.CustomerId, c.FirstName, c.LastName, c.Country, max_agg.Max_Sum
from Customer c
join sum_agg
on c.CustomerId = sum_agg.CustomerId and c.Country = sum_agg.Country
join
(select Country, max(Sum_Amount) as Max_Sum
from sum_agg
group by Country
) max_agg
on max_agg.Country = sum_agg.Country and max_agg.Max_Sum = sum_agg.Sum_Amount
Your inner query is almost correct. It should be
select d.*, sum(i.Total) as Invoices
from Customer d
join Invoice i on i.CustomerId = d.CustomerId
group by d.CustomerId
It is allowed to use d.* here, as we can assume d.CustomerId to be the table's primary key, so all columns in the table are functionally dependent on it. If we grouped by d.country instead for instance, that would not be the case and d.* would be forbidden in the select clause (as well as d.firstname etc.). We can only select columns we grouped by (directly or indirectly) and aggregates such as MIN, MAX, SUM etc.
This query gives you the totals per customer along with the customers' countries.
But then you take this result and group by country. If you do this, you can only access country and its aggregates. Selecting c.CustomerId, for instance, is invalid, as there is no single customer ID per country. If your DBMS allows this, it is flawed in this regard and you get a more or less arbitrary result.
If your DBMS features window functions, you can get the maximum amounts per country on-the-fly:
select customerid, firstname, lastname, country, invoices
from
(
select
c.*,
sum(i.total) as invoices,
max(sum(i.total)) over (partition by c.country) as max_sum
from customer c
join invoice i on i.customerid = c.customerid
group by c.customerid
) per_customer
where invoices = max_sum
order by country, customerid;
Otherwise you'd have to use your inner query twice, once to get the country totals, once to get the customers matching these totals:
select
c.customerid, c.firstname, c.lastname, c.country,
sum(i.total) as invoices
from customer c
join invoice i on i.customerid = c.customerid
group by c.customerid
having (c.country, sum(i.total)) in
(
select country, max(invoices)
from
(
select
--c.customerid, <- optional, it may make this query more readable
c.country,
sum(i.total) as invoices
from customer c
join invoice i on i.customerid = c.customerid
group by c.customerid
) per_customer
group by country
);

Microsoft SQL: include dummy row(s) in results when value doesn't exist

I'm returning total sales for a period of time for each country. Sometimes a country will not appear in the results because it hasn't had any orders during that time period. For these countries with no sales, I would like the results to include the country's abbreviated name and a sales total of '0'. For example, NL and IS should also be included in the results, both with a Sales_Total of '0'. How would I include those dummy rows in the results when the country hasn't had any sales for the period?
**QUERY:**
SELECT
Country,
SUM(TOTAL) AS Sales_Total
FROM Orders
WHERE OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY Country
**RESULTS**
Country Total_Sales
AU 7646
CA 13773
KR 13976
NZ 1831
US 69421
**Required Results:**
Country Total_Sales
AU 7646
CA 13773
KR 13976
NZ 1831
US 69421
NL 0
IS 0
This should do it:
SELECT c.Country
, Sales_Total=ISNULL(o.Sales_Total,0)
FROM
(SELECT Country
, SUM(TOTAL) AS Sales_Total
FROM Orders
WHERE OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY Country) AS o
RIGHT OUTER JOIN
(SELECT DISTINCT
Country
FROM Orders) AS c ON o.Country = c.Country;
I would use (create if needed) a country table you could outer join from. Then you can write it like so:
SELECT
c.CountryCode,
ISNULL(SUM(o.TOTAL), 0) AS Sales_Total
FROM Country c
LEFT JOIN Orders o
ON c.CountryCode = o.Country AND o.OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY c.CountryCode
Giorgos has one way, but you can also left join your query to the list of Countries that you have in another table (or the same table). Something like this:
SELECT c.Country, ISNULL(s.Sales_Total,0) FROM
Countries AS c LEFT JOIN
(SELECT
Country,
SUM(TOTAL) AS Sales_Total
FROM Orders
WHERE OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
GROUP BY Country) s
ON c.Country = s.Country
You can place the predicate of the WHERE clause inside SUM:
SELECT
Country,
SUM(CASE WHEN OrderDate BETWEEN '2014-01-01' AND '2014-12-31'
THEN TOTAL
ELSE 0
END) AS Sales_Total
FROM Orders
GROUP BY Country
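The country-table approach can be checked with a short sketch using Python's sqlite3 module. The Countries table, its CountryCode column and all rows are hypothetical, chosen only to show the behaviour. The key points are that the date filter sits in the ON clause (a WHERE filter on Orders would discard the unmatched countries again) and that COALESCE, the portable counterpart of ISNULL, turns the missing sum into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical lookup table listing every country that should be reported.
    CREATE TABLE Countries (CountryCode TEXT PRIMARY KEY);
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, Country TEXT,
                         Total INTEGER, OrderDate TEXT);
    INSERT INTO Countries VALUES ('AU'), ('CA'), ('NL'), ('IS');
    INSERT INTO Orders (Country, Total, OrderDate) VALUES
        ('AU', 5000, '2014-02-01'), ('AU', 2646, '2014-11-30'),
        ('CA', 13773, '2014-07-04'),
        ('NL', 900, '2013-12-31');   -- outside the reporting period
""")

sales = conn.execute("""
    SELECT c.CountryCode, COALESCE(SUM(o.Total), 0) AS Sales_Total
    FROM Countries c
    LEFT JOIN Orders o
           ON o.Country = c.CountryCode
          AND o.OrderDate >= '2014-01-01' AND o.OrderDate < '2015-01-01'
    GROUP BY c.CountryCode
    ORDER BY Sales_Total DESC, c.CountryCode
""").fetchall()

for row in sales:
    print(row)
```

NL has an order, but not in 2014, and IS has none at all; both are returned with a total of 0.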

select based on calculated value, optimization

I need to select, among other fields, the age of a customer at the time he/she bought a product of a specific brand, where the customer was, for example, between 30 and 50 years old. I wrote this query (getAge just uses DATEDIFF to return the age in years):
SELECT DISTINCT customers.FirstName, customers.LastName,
products.ProductName,
dbo.getAge(customers.BirthDate,sales.Datekey)
AS Age_when_buying
FROM sales
INNER JOIN dates ON sales.Datekey=dates.Datekey
INNER JOIN customers ON sales.CustomerKey=customers.CustomerKey
INNER JOIN products ON sales.ProductKey=products.ProductKey
INNER JOIN stores ON sales.StoreKey=stores.StoreKey
WHERE stores.StoreName = 'DribleCom Europe Online Store' AND
products.BrandName = 'Proseware' AND
dbo.getAge(customers.BirthDate, sales.Datekey) >= 30 AND
dbo.getAge(customers.BirthDate, sales.Datekey) <=50
It works, but I calculate the age three times. I tried to assign Age_when_buying to a variable, but it didn't work. My next thought was to use a cursor, but I feel there is a simpler way that I am missing. The question is: what is the appropriate way to solve this, or what are my options?
Assuming that you only have a limited number of filters you'd like to apply, you could use a Common Table Expression to restructure your query.
I personally find it easier to see all the joins and such in one place, while the filters are similarly grouped together at the bottom...
WITH CTE AS(
select customers.FirstName
, customers.LastName
, dbo.getAge(customers.BirthDate,sales.Datekey) AS Age_when_buying
, stores.StoreName
, products.BrandName
, products.ProductName
from sales
INNER JOIN customers on sales.CustomerKey=customers.CustomerKey
INNER JOIN products ON sales.ProductKey = products.ProductKey
INNER JOIN stores ON sales.StoreKey = stores.StoreKey
)
SELECT DISTINCT FirstName, LastName, ProductName, Age_when_buying
FROM CTE
WHERE StoreName = 'DribleCom Europe Online Store'
AND BrandName = 'Proseware'
AND Age_when_buying BETWEEN 30 AND 50
You should use Cross Apply.
SELECT DISTINCT customers.FirstName, customers.LastName,
products.ProductName,
age.age AS Age_when_buying
FROM sales
INNER JOIN dates ON sales.Datekey=dates.Datekey
INNER JOIN customers ON sales.CustomerKey=customers.CustomerKey
INNER JOIN products ON sales.ProductKey=products.ProductKey
INNER JOIN stores ON sales.StoreKey=stores.StoreKey
CROSS APPLY
(select dbo.getAge(customers.BirthDate, sales.Datekey) as age) age
WHERE stores.StoreName = 'DribleCom Europe Online Store' AND
products.BrandName = 'Proseware' AND
age.age >= 30 AND
age.age <=50
You could use a WITH clause:
WITH Customers_Info (FirstName, LastName, ProductKey, StoreKey, AgeWhenBuying)
AS
(
SELECT customers.FirstName,
customers.LastName,
sales.ProductKey,
sales.StoreKey,
dbo.getAge(customers.BirthDate, sales.DateKey) AS AgeWhenBuying
FROM customers
JOIN sales ON sales.CustomerKey = customers.CustomerKey
)
SELECT Customers_Info.FirstName,
Customers_Info.LastName,
products.ProductName,
Customers_Info.AgeWhenBuying
FROM Customers_Info
JOIN products ON products.ProductKey = Customers_Info.ProductKey
JOIN stores ON stores.StoreKey = Customers_Info.StoreKey
WHERE stores.StoreName = 'DribleCom Europe Online Store'
AND products.BrandName = 'Proseware'
AND Customers_Info.AgeWhenBuying >= 30
AND Customers_Info.AgeWhenBuying <= 50;
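The "name the calculation once, filter on the name" pattern shared by these answers can be tried in a rough sketch with Python's sqlite3 module. Everything here is a stand-in: get_age is a hypothetical Python function registered with the connection in place of dbo.getAge, the tables are cut down to two, and SaleDate is an assumed ISO date column. Note that the CTE only guarantees the expression is written once; whether the engine also evaluates it once per row is up to its optimizer.

```python
import sqlite3
from datetime import date

def get_age(birth_date, on_date):
    """Whole years between two ISO dates (hypothetical stand-in for dbo.getAge)."""
    b, d = date.fromisoformat(birth_date), date.fromisoformat(on_date)
    # Subtract one year if the birthday has not yet occurred in that year.
    return d.year - b.year - ((d.month, d.day) < (b.month, b.day))

conn = sqlite3.connect(":memory:")
conn.create_function("get_age", 2, get_age)  # expose the function to SQL
conn.executescript("""
    CREATE TABLE customers (CustomerKey INTEGER PRIMARY KEY, FirstName TEXT,
                            BirthDate TEXT);
    CREATE TABLE sales (SaleKey INTEGER PRIMARY KEY, CustomerKey INTEGER,
                        SaleDate TEXT);
    INSERT INTO customers VALUES (1, 'Ann', '1980-06-15'),
                                 (2, 'Bob', '1995-01-01'),
                                 (3, 'Cy',  '1950-12-31');
    INSERT INTO sales (CustomerKey, SaleDate) VALUES
        (1, '2015-06-14'), (1, '2015-06-15'), (2, '2015-03-01'),
        (3, '2000-12-31');
""")

buyers = conn.execute("""
    WITH aged AS (
        SELECT c.FirstName, s.SaleDate,
               get_age(c.BirthDate, s.SaleDate) AS AgeWhenBuying
        FROM sales s
        JOIN customers c ON c.CustomerKey = s.CustomerKey
    )
    SELECT FirstName, SaleDate, AgeWhenBuying
    FROM aged
    WHERE AgeWhenBuying BETWEEN 30 AND 50   -- the alias is usable here
    ORDER BY FirstName, SaleDate
""").fetchall()

for row in buyers:
    print(row)
```

Bob was 20 at the time of his purchase and is filtered out; Ann appears twice because her two purchases fall on either side of her birthday.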