Explanation of PostgreSQL Query - sql

I'm having trouble unpacking this postgreSQL query:
select name, revenue from (
select facs.name, sum(case
when memid = 0 then slots * facs.guestcost
else slots * membercost
end) as revenue
from cd.bookings bks
inner join cd.facilities facs
on bks.facid = facs.facid
group by facs.name
) as agg where revenue < 1000
order by revenue;
Here are my questions.
the outer query is pulling "revenue" from the table returned by the inner query, which has a column name "revenue" as well. Is the outer query simply pulling the column from the inner query with this reference?
what is "as agg" doing in this query? It isn't referenced anywhere.

Yes. The outer query just references the value of "revenue" of the inner query which, in this case, is the result of the sum() function.
It's an alias for the result. If, for any reason, you want to use the result of your outer query, in another query, you reference it as 'agg'. For example, lets say I want to get the revenue < 500 of the initial result, I would create the following query:
SELECT revenue FROM agg WHERE revenue < 500

The inner query is querying the table 'bookings' (and naming it bks) it is joining the table 'facilities' (and naming it facs). The join is occurring on bookings.facid being equal to facilities.facid
The generated results of that join are being called "agg". There must be multiple entries with the same name in the 'facilities', as the results are grouped by facilities.'name'. 'revenue' is being calculated by the sum() function, which is calculating a total using all of the entries consisting of the same 'facilities'.'name'
"agg" is the name given to the results of inner query, as all generated results must be named, but does not need to be referenced in the outer query because 'name' and 'revenue' are unique in the outer query. So, "SELECT name, revenue" is the same as "SELECT agg.name, agg.revenue"

Read it like this
There is inner query which fetches you two columns name and revenew from joins
Below part is your revenew from bookings table
sum(case when memid = 0 then slots * facs.guestcost else slots * membercost end) as revenue
The booking table is then joined with facilities table like this
inner join cd.facilities facs on bks.facid = facs.facid
Since this is inner query so alias is must to give like agg
after when the inner query returns the data , the main select query
select name, revenue extracts the data <1000
the outer query is pulling "revenue" from the table returned by the inner query, which has a column name "revenue" as well. Is the outer query simply pulling the column from the inner query with this reference?
YES
what is "as agg" doing in this query? It isn't referenced anywhere.
its just an alias to tag that from as a table.

Related

SQL dividing a count from one table by a number from a different table

I am struggling with taking a Count() from one table and dividing it by a correlating number from a different table in Microsoft SQL Server.
Here is a fictional example of what I'm trying to do
Lets say I have a table of orders. One column in there is states.
I have a second table that has a column for states, and second column for each states population.
I'd like to find the order per population for each sate, but I have struggled to get my query right.
Here is what I have so far:
SELECT Orders.State, Count(*)/
(SELECT StatePopulations.Population FROM Orders INNER JOIN StatePopulations
on Orders.State = StatePopulations.State
WHERE Orders.state = StatePopulations.State )
FROM Orders INNER JOIN StatePopulations
ON Orders.state = StatePopulations.State
GROUP BY Orders.state
So far I'm contending with an error that says my sub query is returning multiple results for each state, but I'm newer to SQL and don't know how to overcome it.
If you really want a correlated sub-query, then this should do it...
(You don't need to join both table in either the inner or outer query, the correlation in the inner query's where clause does the 'join'.)
SELECT
Orders.state,
COUNT(*) / (SELECT population FROM StatePopulation WHERE state = Orders.state)
FROM
Orders
GROUP BY
Orders.state
Personally, I'd just join them and use MAX()...
SELECT
Orders.state,
COUNT(*) / MAX(StatePopulation.population)
FROM
Orders
INNER JOIN
StatePopulation
StatePopulation.state = Orders.state
GROUP BY
Orders.state
Or aggregate your orders before you join...
SELECT
Orders.state,
Orders.order_count / StatePopulation.population
FROM
(
SELECT
Orders.state,
COUNT(*) AS order_count
FROM
Orders
GROUP BY
Orders.state
)
Orders
INNER JOIN
StatePopulation
StatePopulation.state = Orders.state
(Please forgive typos and smelling pistakes, I'm doing this on a phone.)

Get row from one table, plus COUNT from a related table

I'm trying to build an SQL query where I grab one table's information (WHERE shops.shop_domain = X) along with the COUNT of the customers table WHERE customers.product_id = 4242451.
The shops table DOES NOT have product.id in it, but the customers table DOES HAVE the shop_domain in it, hence my attempt to do some sort of join.
I essentially want to return the following:
shops.id
shops.name
shops.shop_domain
COUNT OF CUSTOMERS WHERE customers.product_id = '4242451'
Here is my not so lovely attempt at the query.
I think I have the idea right (maybe...) but I can't wrap my head around building this query.
SELECT shops.id, shops.name, shops.shop_domain, COUNT(customers.customer_id)
FROM shops
LEFT JOIN customers ON shops.shop_domain = customers.shop_domain
WHERE shops.shop_domain = 'myshop.com' AND
customers.product_id = '4242451'
GROUP BY shops.shop_id
Relevant database schemas:
shops:
id, name, shop_domain
customers:
id, name, product_id, shop_domain
You are close. The condition on customers needs to go in the ON clause, because this is a LEFT JOIN and customers is the second table:
SELECT s.id, s.name, s.shop_domain, COUNT(c.customer_id)
FROM shops s LEFT JOIN
customers c
ON s.shop_domain = c.shop_domain AND c.product_id = '4242451'
WHERE s.shop_domain = 'myshop.com'
GROUP BY s.id, s.name, s.shop_domain;
I am also inclined to include all three columns in the GROUP BY, although Postgres (and ANSI/ISO standards) are happy with just id if it is declared as the primary key in the table.
A correlated subquery should be substantially cheaper (and simpler) for the purpose:
SELECT id, name, shop_domain
, (SELECT count(*)
FROM customers
WHERE shop_domain = s.shop_domain
AND product_id = 4242451) AS special_count
FROM shops s
WHERE shop_domain = 'myshop.com';
This way you only need to aggregate in the subquery, and need not worry about undesired effects on the outer query.
Assuming product_id is a numeric data type, so I use a numeric literal (4242451) instead of a string literal '4242451' - which might cause problems otherwise.

SQL Server 2016 Sub Query Guidance

I am currently working on an assignment for my SQL class and I am stuck. I'm not looking for full code to answer the question, just a little nudge in the right direction. If you do provide full code would you mind a small explanation as to why you did it that way (so I can actually learn something.)
Here is the question:
Write a SELECT statement that returns three columns: EmailAddress, ShipmentId, and the order total for each Client. To do this, you can group the result set by the EmailAddress and ShipmentId columns. In addition, you must calculate the order total from the columns in the ShipItems table.
Write a second SELECT statement that uses the first SELECT statement in its FROM clause. The main query should return two columns: the Client’s email address and the largest order for that Client. To do this, you can group the result set by the EmailAddress column.
I am confused on how to pull in the EmailAddress column from the Clients table, as in order to join it I have to bring in other tables that aren't being used. I am assuming there is an easier way to do this using sub Queries as that is what we are working on at the time.
Think of SQL as working with sets of data as opposed to just tables. Tables are merely a set of data. So when you view data this way you immediately see that the query below returns a set of data consisting of the entirety of another set, being a table:
SELECT * FROM MyTable1
Now, if you were to only get the first two columns from MyTable1 you would return a different set that consisted only of columns 1 and 2:
SELECT col1, col2 FROM MyTable1
Now you can treat this second set, a subset of data as a "table" as well and query it like this:
SELECT
*
FROM (
SELECT
col1,
col2
FROM
MyTable1
)
This will return all the columns from the two columns provided in the inner set.
So, your inner query, which I won't write for you since you appear to be a student, and that wouldn't be right for me to give you the entire answer, would be a query consisting of a GROUP BY clause and a SUM of the order value field. But the key thing you need to understand is this set thinking: you can just wrap the ENTIRE query inside brackets and treat it as a table the way I have done above. Hopefully this helps.
You need a subquery, like this:
select emailaddress, max(OrderTotal) as MaxOrder
from
( -- Open the subquery
select Cl.emailaddress,
Sh.ShipmentID,
sum(SI.Value) as OrderTotal -- Use the line item value column in here
from Client Cl -- First table
inner join Shipments Sh -- Join the shipments
on Sh.ClientID = Cl.ClientID
inner join ShipItem SI -- Now the items
on SI.ShipmentID = Sh.ShipmentID
group by C1.emailaddress, Sh.ShipmentID -- here's your grouping for the sum() aggregation
) -- Close subquery
group by emailaddress -- group for the max()
For the first query you can join the Clients to Shipments (on ClientId).
And Shipments to the ShipItems table (on ShipmentId).
Then group the results, and count or sum the total you need.
Using aliases for the tables is usefull, certainly when you select fields from the joined tables that have the same column name.
select
c.EmailAddress,
i.ShipmentId,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
order by i.ShipmentId, c.EmailAddress;
Using that grouped query in a subquery, you can get the Maximum total per EmailAddress.
select EmailAddress,
-- max(TotalShipItems) as MaxTotalShipItems,
max(TotalPriceDiscounted) as MaxTotalPriceDiscounted
from (
select
c.EmailAddress,
-- i.ShipmentId,
-- count(*) as TotalShipItems,
SUM((i.ShipItemPrice - i.ShipItemDiscountAmount) * i.Quantity) as TotalPriceDiscounted
from ShipItems i
join Shipments s on (s.ShipmentId = i.ShipmentId)
left join Clients c on (c.ClientId = s.ClientId)
group by i.ShipmentId, c.EmailAddress
) q
group by EmailAddress
order by EmailAddress
Note that an ORDER BY is mostly meaningless inside a subquery if you don't use TOP.

Joining 2 tables error

I'm trying to join 2 tables to get an output report. The tables involved are the stock and dailysales table.
Stock and Dailysales tables:
Desired output format:
I am trying to join 2 tables by using the below query
Select item,article,sold,stockonhand
from stock S
left join dailysales as D on S.item=D.item
group by item
I want the output to include all rows from the stock table. Like there is stock, but not sold, also to be included in the report. Currently my report is does not show, (stocked but not sold).
Hope you can understand my context or point me to a tutorial which I could read up. I tried to search for few days and couldn't find an exact question to mine.
Not tested -
select item,article,sum(sold),sum(stockonhand)-sum(sold) from (
select a.item,a.article,a.stockonhand,case when b.sold is null then 0 else b.sold end as sold
from stock a left join dailysales b on (a.item = b.item))
group by item,article;
It's basically does the left join and put 0 on the null column(for sum after)
and then summing up the results grouping by all the columns in the select(that what was wrong with your query)
Simply LEFT JOIN the two tables (to get the flute too), do a GROUP BY, and SUM the sold:
select s.item, s.article, coalesce(SUM(d.sold),0) as "qty sold", s.stockonhand
from stock S
left join dailysales as D on S.item=D.item
group by s.item, s.article, s.stockonhand
The coalesce is there to replace NULL with 0, for items not sold. (Thanks sagi!)
General GROUP BY tip: If a GROUP BY clause is specified, each column reference in the SELECT list must either identify a grouping column or be the argument of a set function.
Also, you can remove the article column from the dailysales table. That data is already stored in the stock table. Never store same data twice! (Normalization!) Risk for data inconsistency if you keep that column.
You can sum the item in a separated table, it would be clearer than GROUP BY 2 tables
SELECT
s.item,
s.article,
ISNULL(ds.sold, 0) AS qtysold,
s.stockonhand
FROM
stock s OUTER APPLY
(SELECT SUM(sold) AS sold FROM dailysales WHERE item = s.item GROUP BY item) ds

Sum SQL statement outputs many rows when I expect only one

So I have 3 tables joined as shown:
What I want to do is query for the sum of all the holdings that fall into the criteria specified for the clients in my query. Here is what I have:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
GROUP BY Holdings.HoldingDate, Clients.Active, Clients.RiskCode, Clients.NewClient, Clients.BaseCurrency, Clients.ClientID
HAVING (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));
Here's an example of what I get as the result:
SumOfHoldingValue
1056071.96
466595.6
1074459.38
371142.54
814874.42
458203.65
8308697.09
254733.94
583796.33
443897.76
203787.11
1057445.84
1058751.26
317507.43
So there are quite a few criteria for the client table but the result is a list of SumOfHoldingValue when what I want is just one number. I.e. the sum of all the holding values. Why is it not grouping them all together to form one total?
Since you're not computing any aggregates on the values in the HAVING clause, I think you just want this:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
WHERE (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));
Which, with no GROUP clause will produce a single GROUP (over the entire set) and produce a single row.
If you just want totals - remove the group by. With the group by clause it gives you totals for every group separately.
If you need to filter data put the condition into Where clause instead
Your query contains a group by clause which returns each group on its own line.
You are also using a having clause. The having clause is applied after the group by. Usually, it would contain aggregation functions -- such as having count(*) > 1. In your case, it is used as a where clause.
Try rewriting the query like this:
SELECT Sum(Holdings.HoldingValue) AS SumOfHoldingValue
FROM (Clients INNER JOIN Accounts
ON Clients.ClientID = Accounts.ClientID)
INNER JOIN Holdings
ON Accounts.AccountID = Holdings.AccNum
WHERE (((Holdings.HoldingDate)=#3/31/2013#)
AND ((Clients.Active)=True)
AND ((Clients.RiskCode) In (1,2))
AND ((Clients.NewClient)=True)
AND ((Clients.BaseCurrency)='GBP')
AND ((Clients.ClientID) Not In (10022,10082,10083)));