SQL Server Grouping issue - sql

I have two tables.
Customer ID, NAME, MonthlyIncome
Orders ID, CustomerKey, TotalCost
I want join those two table into one, in the way that having total amount the bought in the shop.
CustomerKey, MonthlyIncome, TotalCost.
I have tried to group by CustomerKey, but I am not able to include the rest of the columns. Which query should I be using?
SELECT
fs.CustomerKey, SUM(fs.TotalCost) as TotalBought
FROM
FactOnlineSales fs
INNER JOIN
DimCustomer c on c.CustomerKey = fs.CustomerKey
GROUP BY
fs.CustomerKey

I have tried to group by CustomerKey, but not able to manage how to include rest of the columns.
Just do it:
SELECT fs.CustomerKey,
c.MonthlyIncome,
SUM(fs.TotalCost) as TotalBought
FROM FactOnlineSales fs
INNER JOIN DimCustomer c
ON c.CustomerKey = fs.CustomerKey
GROUP BY fs.CustomerKey,
c.MonthlyIncome
Presumably CustomerKey is unique, so grouping by a less unique column isn't going to change the SUM total.

You didn't mention what version of SQL Server you're using - but if you're on SQL Server 2005 or newer, you could use the OVER() clause:
SELECT
fs.CustomerKey,
c.MonthlyIncome,
TotalBought = SUM(fs.TotalCost) OVER(PARTITION BY fs.CustomerKey)
FROM
FactOnlineSales fs
INNER JOIN
DimCustomer c on c.CustomerKey = fs.CustomerKey
That basically gives you the CustomerKey, the MonthlyIncome and then sums the TotalCost for each CustomerKey and displays the appropriate value in your output. No need to do a GROUP BY in this case

You may have an error in the ON clause of the join.
You used CustomerKey instead of Customer ID from the DimCustomer table.
For gouping: you must group by every column you don't aggregate on, so you can include NAME (so you can have descriptive customer) and MonthlyIncome.
SELECT
c."Customer ID",
c.NAME,
c.MonthlyIncome,
SUM(fs.TotalCost) as TotalBought
FROM
FactOnlineSales fs INNER JOIN DimCustomer c
on c."Customer ID" = fs.CustomerKey
GROUP BY
c."Customer ID",
c.NAME,
c.MonthlyIncome

Related

Group By and Inner Join Together To Get Unique Values By Maximum Date

I have a table here in which I want to write a SELECT query in SQL Server that allows me to get the following:
For each unique combination of SalesPerson x Country, get only the rows with the latest Upload_DateTime
However, I am trying to do a group-by and inner join, but to no avail. My code is something like this:
SELECT t1.[SalesPerson], t1.[Country], MAX(t1.[Upload_DateTime]) as [Upload_DateTime]
FROM [dbo].[CommentTable] AS t1
GROUP BY t1.[SalesPerson], t1.[Country]
INNER JOIN SELECT * FROM [dbo].[CommentTable] as t2 ON t1.[SalesPerson] = t2.[SalesPerson], t1.[Country] = t2.[Country]
It seems like the GROUP BY needs to be done outside of the INNER JOIN? How does that work? I get an error when I run the query and it seems my SQL is not right.
Basically, this subquery will fetch the person, the country and the latest date:
SELECT
SalesPerson, Country, MAX(uplodaed_datetime)
FROM CommentTable
GROUP BY SalesPerson, Country;
This can be used on a lot of ways (for example with JOIN or with an IN clause).
The main query will add the remaing columns to the result.
Since you tried a JOIN, here the JOIN option:
SELECT
c.id, c.SalesPerson, c.Country,
c.Comment, c.uplodaed_datetime
FROM
CommentTable AS c
INNER JOIN
(SELECT
SalesPerson, Country,
MAX(uplodaed_datetime) AS uplodaed_datetime
FROM CommentTable
GROUP BY SalesPerson, Country) AS sub
ON c.SalesPerson = sub.SalesPerson
AND c.Country = sub.Country
AND c.uplodaed_datetime = sub.uplodaed_datetime
ORDER BY c.id;
Try out: db<>fiddle

SQL dividing a count from one table by a number from a different table

I am struggling with taking a Count() from one table and dividing it by a correlating number from a different table in Microsoft SQL Server.
Here is a fictional example of what I'm trying to do
Lets say I have a table of orders. One column in there is states.
I have a second table that has a column for states, and second column for each states population.
I'd like to find the order per population for each sate, but I have struggled to get my query right.
Here is what I have so far:
SELECT Orders.State, Count(*)/
(SELECT StatePopulations.Population FROM Orders INNER JOIN StatePopulations
on Orders.State = StatePopulations.State
WHERE Orders.state = StatePopulations.State )
FROM Orders INNER JOIN StatePopulations
ON Orders.state = StatePopulations.State
GROUP BY Orders.state
So far I'm contending with an error that says my sub query is returning multiple results for each state, but I'm newer to SQL and don't know how to overcome it.
If you really want a correlated sub-query, then this should do it...
(You don't need to join both table in either the inner or outer query, the correlation in the inner query's where clause does the 'join'.)
SELECT
Orders.state,
COUNT(*) / (SELECT population FROM StatePopulation WHERE state = Orders.state)
FROM
Orders
GROUP BY
Orders.state
Personally, I'd just join them and use MAX()...
SELECT
Orders.state,
COUNT(*) / MAX(StatePopulation.population)
FROM
Orders
INNER JOIN
StatePopulation
StatePopulation.state = Orders.state
GROUP BY
Orders.state
Or aggregate your orders before you join...
SELECT
Orders.state,
Orders.order_count / StatePopulation.population
FROM
(
SELECT
Orders.state,
COUNT(*) AS order_count
FROM
Orders
GROUP BY
Orders.state
)
Orders
INNER JOIN
StatePopulation
StatePopulation.state = Orders.state
(Please forgive typos and smelling pistakes, I'm doing this on a phone.)

SQL Server INNER JOIN and GROUP BY

In sql server I'm trying to group by each sales people some infos as follow:
I have 2 tables: Positions and Clients
In Positions table, I have the following columns: Client_Id, Balance, Acquisition_Cost and in the Clients table I use the following columns: Client_Id and Sales_person.
I want to group by Sales_person (Clients table) the Client_id, Balance, Acquisition_Cost (Positions table)
I tried this:
SELECT Positions.Client_ID, Positions.Balance, Positions.Acquisition_cost
FROM Positions
INNER JOIN Clients ON Positions.Client_ID = Clients.Client_ID
GROUP BY Sales_person
It gives me "Positions.Client_ID is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause".
I precise I'm pretty new on SQL so that does not ring that much a bell to me.
You need to add Positions.Client_ID in the GROUP BY Clause, since the select expects this column if you really do not want Positions.Client_ID in the select, then no need to add in the GROUP BY Clause
For any column in your SELECT that you aren't including in the GROUP BY, you need to use some kind of aggregate function (MAX, SUM, etc.) on the column. So you could write it like this:
SELECT Positions.Client_ID
, Clients.Sales_person
, SUM(Positions.Balance) Balance_Sum,
, SUM(Positions.Acquisition_cost) Acquisition_Cost_Sum
FROM Positions
INNER JOIN Clients ON Positions.Client_ID = Clients.Client_ID
GROUP BY Positions.Client_ID
, Clients.Sales_person
If you only want the totals by Sales_person and not client_ID, you could just remove client_id from the SELECT AND GROUP BY:
SELECT Clients.Sales_person
, SUM(Positions.Balance) Balance_Sum,
, SUM(Positions.Acquisition_cost) Acquisition_Cost_Sum
FROM Positions
INNER JOIN Clients ON Positions.Client_ID = Clients.Client_ID
GROUP BY Clients.Sales_person
SELECT a.Client_ID, b.Sales_person, SUM(a.Balance) as Balance_Sum,SUM(a.Acquisition_cost) as Acquisition_Cost_Sum FROM Positions as a INNER JOIN Clients as b ON a.Client_ID = b.Client_ID GROUP BY a.Client_ID, b.Sales_person
Thanks all of you guys, here's the code that works for me in this case:
SELECT Positions.Client_Id, Clients.Sales_person, SUM(Positions.Balance) as sum_balance_impacted, SUM(Positions.Acquisition_cost) as sum_acquisition_cost
FROM Positions
INNER JOIN Clients ON Positions.Client_Id= Clients.Client_Id
GROUP BY Clients.Sales_person, Positions.Client_Id```

Finding max - SQL

I needed to print ( show ) the max of the table created by the following code:
SELECT name, SUM(cost)+SUM(DISTINCT stock*cost) AS result
FROM publishers
NATURAL JOIN editions
NATURAL JOIN shipments
NATURAL JOIN stock
GROUP BY name
I have tried using DESC but am not allowed to use it
SELECT MAX(result) from (SELECT name, SUM(cost)+SUM(DISTINCT stock*cost) AS result FROM publishers NATURAL JOIN editions NATURAL JOIN shipments NATURAL JOIN stock group by name )
This nested query should do if you only need max of result.

Using SQL query to find details of customers who ordered > x types of products

Please note that I have seen a similar query here, but think my query is different enough to merit a separate question.
Suppose that there is a database with the following tables:
customer_table with customer_ID (key field), customer_name
orders_table with order_ID (key field), customer_ID, product_ID
Now suppose I would like to find the names of all the customers who have ordered more than 10 different types of product, and the number of types of products they ordered. Multiple orders of the same product does not count.
I think the query below should work, but have the following questions:
Is the use of count(distinct xxx) generally allowed with a "group by" statement?
Is the method I use the standard way? Does anybody have any better ideas (e.g. without involving temporary tables)?
Below is my query
select T1.customer_name, T1.customer_ID, T2.number_of_products_ordered
from customer_table T1
inner join
(
select cust.customer_ID as customer_identity, count(distinct ord.product_ID) as number_of_products_ordered
from customer_table cust
inner join order_table ord on cust.customer_ID=ord.customer_ID
group by ord.customer_ID, ord.product_ID
having count(distinct ord.product_ID) > 10
) T2
on T1.customer_ID=T2.customer_identity
order by T2.number_of_products_ordered, T1.customer_name
Isn't that what you are looking for? Seems to be a little bit simpler. Tested it on SQL Server - works fine.
SELECT customer_name, COUNT(DISTINCT product_ID) as products_count FROM customer_table
INNER JOIN orders_table ON customer_table.customer_ID = orders_table.customer_ID
GROUP BY customer_table.customer_ID, customer_name
HAVING COUNT(DISTINCT product_ID) > 10
You could do it more simply:
select
c.id,
c.cname,
count(distinct o.pid) as `uniques`
from o join c
on c.id = o.cid
group by c.id
having `uniques` > 10