SQL joining, summing from sub queries to calculate a total aggregate amount - sql

I've done all of this in ActiveRecord, but it won't work with pagination, so I'd like to switch to a SQL query with OFFSET so I can query efficiently. Rather than getting User.all, working out the calculations on related objects, compiling it all into a bundled array and finally sending it up to the view, I'd like to handle the calculations in a find_by_sql call so that pagination is easier to manage.
I'm trying to work out a user's total amount invested plus their residual uninvested amount in a little stock market simulator I'm playing with.
I have a share prices table that holds one entry for each new share price of each company, so I want to:
a) Select the last entry from that table to get the latest share price.
b) Use the ownerships table, which shows which users from the users table own which company shares.
c) Multiply shares_owned from the ownerships table by the latest share price from a) to get the total amount invested.
d) Once we have the total amount invested across all companies, add the uninvested dollars the user has, so total invested + u.dollars gives the total valuation for a given user.
e) What trips me up is that a user might not own anything at a particular time, which means he will have no entries in the ownerships table. In that case the query needs to return only his uninvested u.dollars amount.
Trying to get that total valuation per customer and order by 'richest' user:
select u.id,
sum(
((
select s.price from shareprices s
WHERE s.company_id = o.company_id and u.id = o.user_id
ORDER BY id DESC LIMIT 1) * o.shares_owned
))
+ u.dollars as totalAmount
from users as u full outer join ownerships as o on u.id = o.user_id group by u.id order by totalamount ASC
That works where there are ownerships, and the calculations for total invested come out right. But how can I get users who only have uninvested dollars to show up in that summed column? For them it should essentially be 0 (no invested amount, because they own shares in no companies) + u.dollars, but I don't understand how to make SQL do that.
I am hoping to avoid needing a pgsql function() to achieve this logic but if the above didn't make it obvious, I'm terrible at SQL and only learning it now. I hope my explanation is even understandable!

You can add a COALESCE around the part of the calculation that needs to treat NULLs as zeros (it's not clear to me exactly which part that is):
sum(COALESCE((
select s.price from shareprices s
WHERE s.company_id = o.company_id and u.id = o.user_id
ORDER BY id DESC LIMIT 1) * o.shares_owned
, 0))
+ u.dollars totalAmount
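To see the COALESCE approach end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL. The sample rows are made up, and I have assumed a LEFT JOIN from users is enough (every ownership belongs to a user), so treat it as an illustration of the idea rather than the asker's exact schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, dollars REAL);
CREATE TABLE ownerships (user_id INTEGER, company_id INTEGER, shares_owned INTEGER);
CREATE TABLE shareprices (id INTEGER PRIMARY KEY, company_id INTEGER, price REAL);
-- Hypothetical sample data: user 2 owns no shares at all.
INSERT INTO users VALUES (1, 100.0), (2, 50.0);
INSERT INTO ownerships VALUES (1, 10, 5);
-- Two prices for company 10; the row with the highest id is the latest.
INSERT INTO shareprices VALUES (1, 10, 2.0), (2, 10, 3.0);
""")

rows = conn.execute("""
SELECT u.id,
       SUM(COALESCE(
           (SELECT s.price FROM shareprices s
             WHERE s.company_id = o.company_id
             ORDER BY s.id DESC LIMIT 1) * o.shares_owned, 0))
       + u.dollars AS total_amount
FROM users AS u
LEFT JOIN ownerships AS o ON o.user_id = u.id
GROUP BY u.id, u.dollars
ORDER BY total_amount DESC
""").fetchall()

for user_id, total in rows:
    print(user_id, total)
```

For the user with no ownerships the LEFT JOIN produces a single row of NULLs, the product is NULL, and COALESCE turns it into 0 before the sum, so only u.dollars remains.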

Related

SQL Server 2008 : running totals and null records

I have several queries that get account balances from our ERP, but there are several issues I am trying to work around and I am curious if there are better ways or if more recent versions of SQL Server have functions to address any of these problems.
Our ERP generates a balance record only in periods where there is activity associated with the account. The ERP applications and reports summarize values by period but no record is added to the database so custom processes that need a balance by period require a query/view to calculate this info.
My workaround for this has been to use a global variable to intentionally create duplicates from the Account table and the pseudo period table I created, see below
Our Account Period table does not contain a period index. (I suppose it should be the Row ID; however, at some point a fiscal period was added incorrectly and the index was thrown out of order. I have been advised by the ERP provider not to update this without a full reimplementation.) I created a workaround table for this.
So I have several queries that work around these issues but they run slowly with just a handful of accounts so a full pseudo table for account balances has not been practical with my methods at least.
I have included an example below for calculating the balance by period for accounts that are not summarized to retained earnings annually (assets, liabilities, equity)
SELECT
ID AS ACCOUNT_ID, ind.Month_Index, ind.Period,
(SELECT
ISNULL(SUM(CASE WHEN A3.TYPE IN ('e','r') THEN NULL
WHEN A3.TYPE = 'a' THEN ISNULL(AB3.DEBIT_AMOUNT, 0) - ISNULL(AB3.CREDIT_AMOUNT, 0)
ELSE ISNULL(AB3.CREDIT_AMOUNT, 0) - ISNULL(AB3.DEBIT_AMOUNT, 0) END), 0)
FROM ACCOUNT_BALANCE AS AB3
LEFT OUTER JOIN ACCOUNT AS A3 ON AB3.ACCOUNT_ID = A3.ID
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind2 ON AB3.ACCT_YEAR = ind2.YEAR AND AB3.ACCT_PERIOD = ind2.Month_Num
WHERE A.ID = AB3.ACCOUNT_ID
AND A3.CURRENCY_ID = '(USA) $'
AND ind2.Month_Index <= ind.Month_Index) AS BALANCE_AQL
FROM
ACCOUNT AS A
LEFT OUTER JOIN
ACCOUNT_PERIOD AS per ON 'UCC' = per.SITE_ID
LEFT OUTER JOIN
ACCOUNT_BALANCE AS AB ON A.ID = AB.ACCOUNT_ID
AND per.ACCT_YEAR = AB.ACCT_YEAR
AND per.ACCT_PERIOD = AB.ACCT_PERIOD
AND AB.CURRENCY_ID = '(USA) $'
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind ON per.ACCT_YEAR = ind.YEAR AND per.ACCT_PERIOD = ind.Month_Num
WHERE
ID IN ('120-1140-0000', '120-1190-1190', '120-1190-1193',
'120-1190-1194', '210-2100-0000', '210-2101-0000')
GROUP BY
ID, ind.Month_Index, ind.Period
ORDER BY
ind.Month_Index DESC, ACCOUNT_ID DESC
Any suggestions that might improve the performance of this query will be greatly appreciated.
My high level recommendations are the following:
Avoid using the IN clause. If possible (assuming the account table isn't too big), create a temporary table holding only the columns you need, load it with the IDs you are working with, and then join to it in the query above.
(Not a performance thing, more of a slight change.) The ISNULL(SUM(... part is mostly there because of "A3.TYPE IN ('e','r') THEN NULL". If you had said THEN 0, you could drop the null check, as long as the subquery always matches at least one row (SUM over zero rows is still NULL).
A correlated subquery within the SELECT is okay, but it's the multi-part join inside it that is most likely slowing things down. I'm not 100% sure how you would break this apart into two separate logical grabs of data and then join them back together, but that's the best I've got from what I'm seeing here.
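On the question of newer functions: SQL Server 2012 and later accept ORDER BY inside an aggregate's OVER clause, which allows a running total without the correlated subquery. Below is a rough sketch of that shape, run through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions). The tables are simplified stand-ins with a single month_index key, not the ERP schema, and it assumes at most one balance row per account and period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical stand-ins for ACCOUNT, the period index table
-- and ACCOUNT_BALANCE.
CREATE TABLE account (id TEXT PRIMARY KEY);
CREATE TABLE period (month_index INTEGER PRIMARY KEY);
CREATE TABLE account_balance (account_id TEXT, month_index INTEGER,
                              debit_amount REAL, credit_amount REAL);
INSERT INTO account VALUES ('A1');
INSERT INTO period VALUES (1), (2), (3);
-- No activity in period 2, so no balance row exists for it.
INSERT INTO account_balance VALUES ('A1', 1, 100.0, 0.0), ('A1', 3, 0.0, 30.0);
""")

rows = conn.execute("""
SELECT a.id, p.month_index,
       -- Running total per account, ordered by period.
       SUM(COALESCE(ab.debit_amount, 0) - COALESCE(ab.credit_amount, 0))
           OVER (PARTITION BY a.id ORDER BY p.month_index) AS balance
FROM account AS a
CROSS JOIN period AS p            -- one row per account and period
LEFT JOIN account_balance AS ab   -- activity, where there is any
       ON ab.account_id = a.id AND ab.month_index = p.month_index
ORDER BY a.id, p.month_index
""").fetchall()

for row in rows:
    print(row)
```

The CROSS JOIN supplies the periods that have no balance record, and the window sum carries the previous balance forward through them. On SQL Server 2008 itself this form is not available, so it only helps after an upgrade.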

Summing up the price of production and selling prices

I need to write a query that calculates the total cost price and total sell price for the material with ID = 1. To calculate total costs I have to include the Quantity column from the Materials table, right?
SELECT SUM((CostPrice) * M.Quantity) AS 'TotalCostExclusive', SUM((SellPrice) * M.Quantity) AS 'SellPriceExclusive'
FROM Materials AS M INNER JOIN Warehouse AS W
ON M.Warehouse_id = W.Id
WHERE M.Material_Id = '1';
I tried this, but I'm not sure the result is equal to the required task.
Now I want to write new query to include the Discount on both cost and selling prices.
Since I'm a beginner and this data model is for student purposes, I would also like to know why I need the table WarehouseIdentities. What exactly is this table doing here, and what role does it play?

Multiple of same result even with group by

Alright so say I have a 'product_catalog', and 'orders' tables. Each order has the product_catalog_id as a foreign key. What I want to return as the query results is the product_code (name of the product associated with a specific product_catalog_id) + a count of how many of each product_code have been ordered. That's easy enough with something like this (Oracle SQL):
SELECT pc.product_code,
COUNT(*) as count
FROM orders o
join product_catalog pc on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
but I also want to print various pieces of information from the order table such as total of all monthly charges for that product_code. That would seem easy enough with something like this:
(o.monthly_base_charge*count(*)) as "Monthly Fee"
but the problem is that there have been various monthly fees for the same product_code over time. If I add the above line in and add 'o.monthly_base_charge' to the group by statement, then it will print out a unique row for every variation of pricing for that product_code. How do I get it to ignore those price variations and just add together every entry with that product code?
It is a little unclear what you are asking. My best guess is that you want the sum of the monthly base charge:
SELECT pc.product_code,
COUNT(*) as count,
sum(o.monthly_base_charge) as "Monthly Fee"
FROM orders o join
product_catalog pc
on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
I'm not sure if this is exactly what you want. What happens if you have two orders in the same month for the same product?
You may need to do something like this since SQL will not be able to know which monthly base charge to multiply by the count.
SELECT pc.product_code,
COUNT(*) as count,
(min(o.monthly_base_charge)*count(*)) as "Monthly Fee"
FROM orders o
join product_catalog pc on pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY count DESC;
Or you will need to add o.monthly_base_charge to the group by in order for sql to know how to determine the count()
GROUP BY pc.product_code, o.monthly_base_charge
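To make the difference between these options concrete, here is a small sketch run through Python's sqlite3 module instead of Oracle. The rows are invented, with one product sold at two different monthly charges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_catalog (product_catalog_id INTEGER PRIMARY KEY,
                              product_code TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                     product_catalog_id INTEGER,
                     monthly_base_charge REAL);
-- Hypothetical data: BASIC was sold at 10.0 twice and at 12.0 once.
INSERT INTO product_catalog VALUES (1, 'BASIC'), (2, 'PRO');
INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 12.0), (3, 1, 10.0), (4, 2, 30.0);
""")

rows = conn.execute("""
SELECT pc.product_code,
       COUNT(*)                             AS order_count,
       SUM(o.monthly_base_charge)           AS sum_of_charges,
       MIN(o.monthly_base_charge) * COUNT(*) AS min_times_count
FROM orders o
JOIN product_catalog pc ON pc.product_catalog_id = o.product_catalog_id
GROUP BY pc.product_code
ORDER BY order_count DESC
""").fetchall()

for row in rows:
    print(row)
```

SUM adds up every order's actual charge, so price variations are absorbed into one row per product_code, while MIN times COUNT prices every order at the lowest charge seen and gives a different figure for BASIC.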

Finding drop off rate from membership table in SQL Server 2005

We have a view that stores the history of membership plans held by our members and we have been running a half price direct debit offer for some time. We've been asked to report on whether people are allowing the direct debit to renew (at full price) but I'm no SQL expert!
The view in effect is
memberRef, historyRef, validFrom, validTo,MembershipType,PaymentType,totalAmount
Here
memberRef identifies the person (int)
historyRef identifies this row (int)
validFrom and validTo are the start and end of the plan (datetime)
MembershipType is the type of plan (int)
PaymentType is direct debit or credit card (a string - DD or EFT)
totalAmount is the price of the plan (decimal)
I'm wondering if there is a query as opposed to a cursor I can use to count the number of policies which are at half price and have another direct debit policy that follows on from it.
If we can also capture if that person first joined at half price or if there was a gap where membership had lapsed before they took the half price incentive that would be great.
Thanks in advance for any help!
For example
select count(MemberRef), max(vhOuter.validFrom) "most recent plan start",
(select top(1) vh2.validFrom
from v_Membershiphistory vh2
where (vh2.totalamount = 14.97 or vh2.totalamount = 25.50)
and vh2.memberref = vhOuter.memberref
order by createdat desc
) "half price plan start"
from v_membershiphistory vhOuter
where vhOuter.memberref in (select vh1.memberref from v_membershiphistory vh1 where vh1.totalamount = 14.97 or vh1.totalamount = 25.50)--have taken up offer
group by memberref
having max(vhOuter.validFrom) > (select top(1) vh2.validFrom
from v_Membershiphistory vh2
where (vh2.totalamount = 14.97 or vh2.totalamount = 25.50)
and vh2.memberref = vhOuter.memberref
order by createdat desc
)
This will display the members who have a half price plan and have a valid from date that is greater than the valid from date of that plan.
Not quite right as we should be testing that it is the same plan but...
if I change the select here to just count(memberRef) I get the count of memberRef for the member I'm grouping for each member I'm grouping i.e. for 5220 results I'd get 5220 rows returned each with in effect the number of plans I've selected
But I need to count the number of people taking the offer and proportion that renew. Also that renewal rate in the population that aren't taking the offer (which I'm guessing is a trivial change once I've got one set sorted)
I suppose I'm looking at how one operates on the set but compares multiple rows for each distinct person without using a cursor. But I might be wrong :)
Try something like this (YourTable, a.Cost and the quoted values are placeholders for your view and your half-price test):
SELECT
a.*, b.*
FROM YourTable a
INNER JOIN YourTable b On a.memberRef=b.memberRef and a.validTo<b.validFrom
WHERE b.PaymentType='?direct debit?' and a.Cost='?half price?'
To get just counts, use something like:
SELECT
COUNT(a.memberRef) AS TotalCount
FROM YourTable a
INNER JOIN YourTable b On a.memberRef=b.memberRef and a.validTo<b.validFrom
WHERE b.PaymentType='?direct debit?' and a.Cost='?half price?'
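As a rough illustration of the self-join, here is a sketch run through Python's sqlite3 module rather than SQL Server 2005. The table name, dates and the 29.94 full price are invented; 14.97 is taken from the question as one of the half-price amounts. It uses a LEFT JOIN with COUNT(DISTINCT ...) so that members who did not renew still count towards the offer total, and members with several later plans are counted once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical stand-in for the membership history view.
CREATE TABLE membership_history (memberRef INTEGER, historyRef INTEGER,
                                 validFrom TEXT, validTo TEXT,
                                 PaymentType TEXT, totalAmount REAL);
INSERT INTO membership_history VALUES
  (1, 1, '2020-01-01', '2020-12-31', 'DD', 14.97),  -- half price
  (1, 2, '2021-01-01', '2021-12-31', 'DD', 29.94),  -- renewed once
  (2, 3, '2020-01-01', '2020-12-31', 'DD', 14.97),  -- never renewed
  (3, 4, '2020-01-01', '2020-12-31', 'DD', 14.97),  -- half price
  (3, 5, '2021-01-01', '2021-12-31', 'DD', 29.94),  -- renewed twice
  (3, 6, '2022-01-01', '2022-12-31', 'DD', 29.94);
""")

took_offer, renewed = conn.execute("""
SELECT COUNT(DISTINCT a.memberRef) AS took_offer,
       COUNT(DISTINCT b.memberRef) AS renewed
FROM membership_history a
LEFT JOIN membership_history b
       ON b.memberRef = a.memberRef
      AND b.validFrom > a.validTo      -- a later plan
      AND b.PaymentType = 'DD'         -- paid by direct debit
WHERE a.totalAmount = 14.97            -- the half-price plan
""").fetchone()

print(took_offer, renewed, renewed / took_offer)
```

This gives both numbers in one pass over the set, with no cursor, and the renewal proportion is just the ratio of the two counts.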

SQL sub queries - is there a better way

This is an SQL efficiency question.
A while back I had to write a collection of queries to pull data from an ERP system. Most of these were simple enough, but one of them resulted in a rather inefficient query, and it's been bugging me ever since as there's got to be a better way.
The problem is not complex. You have rows of sales data. In each row you have quantity, sales price and the salesman code, among other information.
Commission is paid based on a stepped sliding scale. The more they sell, the better the commission. Steps might be 1000, 10000, 10000$ and so forth. The real world problem is more complex, but that's essentially it.
The only way I found of doing this was to do something like this (obviously not the real query)
select qty, price, salesman,
(select top 1 percentage from comissions
where comissions.salesman = saleslines.salesman
and saleslines.qty > comissions.qty
order by comissions.qty desc
) percentage
from saleslines
this results in the correct commission but is horrendously heavy.
Is there a better way of doing this? I'm not looking for someone to rewrite my SQL, more a 'take a look at foobar queries' and I can take it from there.
The real life commission structure can be specified for different salesmen, articles and clients and even sales dates. It also changes from time to time, so everything has to be driven by the data in the tables... i.e I can't put fixed ranges in the sql. The current query returns some 3-400000 rows and takes around 20-30 secs. Luckily its only used monthly but the slowness is kinda bugging me.
This is on mssql.
Ian
edit:
I should have given a more complex example from the beginning. I realize now that my initial example is missing a few essential elements of the complexity, apologies to all.
This may better capture it
select client-code, product, product-family, qty, price, discount, salesman,
(select top 1 percentage from comissions
where comissions.salesman = saleslines.salesman
and saleslines.qty > comissions.qty
and [
a collection of conditions which may or may not apply:
Exclude rows if the salesman has offered discounts above max discounts
which appear in each row in the commissions table
There may be a special scale for the product family
There may be a special scale for the product
There may be a special scale for the client
A few more cases
]
order by [
The user can control the order though a table
which can prioritize by client, family or product
It normally goes from most to least specific.
]
) percentage
from saleslines
Needless to say, the real query is not easy to follow. Just to make life more interesting, its naming is multi-language.
Thus for every row of salesline the commission can be different.
It may sound overly complex but if you think of how you would pay commission it makes sense. You don't want to pay someone for selling stuff at high discounts, you also want to be able to offer a particular client a discount on a particular product if they buy X units. The salesman should earn more if they sell more.
In all the above I'm excluding date limited special offers.
I think partitions may be the solution, but I need to explore this in more depth as I know nothing about partitions. It's given me a few ideas.
If you are using a version of SQL Server that supports common-table expressions such as SQL Server 2005 and later, a more efficient solution might be:
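With RankedCommissions As
(
Select SL.qty, SL.price, SL.salesman, C.percentage
-- Partitioning by salesman alone ranks all of a salesman's lines together,
-- so only one row per salesman gets CommissionRank = 1; if SalesLines has a
-- unique line key, partition by that key instead to keep one row per line.
, Row_Number() Over ( Partition By SL.salesman Order By C.Qty Desc ) As CommissionRank
From SalesLines As SL
Join Commissions As C
On SL.salesman = C.salesman
And SL.qty > C.qty
)
Select qty, price, salesman, percentage
From RankedCommissions
Where CommissionRank = 1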
If you needed to account for the possibility that there are no Commissions values for a given salesman where the SalesLine.Qty > Commission.Qty, then you could do something like:
With RankedCommissions As
(
Select SL.qty, SL.price, SL.salesman, C.percentage
, Row_Number() Over ( Partition By SL.salesman Order By C.Qty Desc ) As CommissionRank
From SalesLines As SL
Join Commissions As C
On SL.salesman = C.salesman
And SL.qty > C.qty
)
Select SL.qty, SL.price, SL.salesman, RC.percentage
From SalesLines As SL
Left Join RankedCommissions As RC
On RC.salesman = SL.salesman
And RC.CommissionRank = 1
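Another option, assuming the percentage never decreases as the quantity threshold rises, is a plain join with max():
select
saleslines.qty, saleslines.price, saleslines.salesman,
max(comissions.percentage) as percentage
from saleslines
inner join comissions on comissions.salesman = saleslines.salesman and
saleslines.qty > comissions.qty
group by
saleslines.qty, saleslines.price, saleslines.salesman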
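For comparison, here is a small sketch of the ranking idea applied per sales line, run through Python's sqlite3 module (SQLite 3.25 or later for ROW_NUMBER) rather than SQL Server. The line_id column is an assumption on my part, since the question does not show a unique key on saleslines, and the thresholds and percentages are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- line_id is a hypothetical unique key for each sales line.
CREATE TABLE saleslines (line_id INTEGER PRIMARY KEY, salesman TEXT,
                         qty INTEGER, price REAL);
CREATE TABLE commissions (salesman TEXT, qty INTEGER, percentage REAL);
INSERT INTO commissions VALUES ('ann', 0, 1.0), ('ann', 1000, 2.0),
                               ('ann', 10000, 3.0);
INSERT INTO saleslines VALUES (1, 'ann', 500, 9.5), (2, 'ann', 5000, 9.0),
                              (3, 'ann', 20000, 8.5),
                              (4, 'bob', 700, 9.5);  -- bob has no scale
""")

rows = conn.execute("""
WITH ranked AS (
    SELECT sl.line_id, sl.qty, c.percentage,
           -- Rank the matching steps for each line, highest threshold first.
           ROW_NUMBER() OVER (PARTITION BY sl.line_id
                              ORDER BY c.qty DESC) AS commission_rank
    FROM saleslines AS sl
    LEFT JOIN commissions AS c
           ON c.salesman = sl.salesman AND sl.qty > c.qty
)
SELECT line_id, qty, percentage
FROM ranked
WHERE commission_rank = 1
ORDER BY line_id
""").fetchall()

for row in rows:
    print(row)
```

Partitioning by the line key keeps exactly one row per sales line, and the LEFT JOIN leaves the percentage NULL for a line with no matching step instead of dropping it. Whether this is actually faster than the correlated subquery on the real data would need to be measured.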