Can I divide an amount across multiple parties and round to the 'primary' party in a single SQL query? - sql

I am working on an Oracle PL/SQL process which divides a single monetary amount across multiple involved parties in a particular group. Assuming 'pGroupRef' is an input parameter, the current implementation first designates a 'primary' involved party, and then splits the amount across all the secondaries as follows:
INSERT INTO ActualValue
SELECT
...
pGroupRef AS GroupRef,
ROUND(Am.Amount * P.SplitPercentage / 100, 2) AS Amount,
...
FROM
Amount Am,
Party P
WHERE
Am.GroupRef = pGroupRef
AND P.GroupRef = Am.GroupRef
...
AND P.PrimaryInd = 0;
Finally, it runs a second procedure to insert whatever amount is left over to the primary party, i.e.:
INSERT INTO ActualValue
SELECT
...
pGroupRef AS GroupRef,
Am.Amount - S.SecondaryAmounts AS Amount
FROM
Amount Am,
Party P,
(SELECT SUM(Amount) AS SecondaryAmounts FROM ActualValue WHERE GroupRef = pGroupRef) S
WHERE
Am.GroupRef = pGroupRef
AND P.GroupRef = Am.GroupRef
...
AND P.PrimaryInd = 1;
However, the full query here is very large and I am making this area more complex by adding subgroups, each of which will have their own primary member, and the possibility of overrides - hence if I continued to use this implementation then it would mean a lot of duplicated SQL.
I suppose I could always calculate the correct amounts into an array before running a single unified insert - but I feel like there has to be an elegant mathematical way to capture this logic in a single SQL Query.

You can use analytic functions to get what you are looking for. Since I don't know your exact structure, this is only an example:
SELECT s.party_id, s.member_id,
s.portion + DECODE(s.prime, 1, s.total - SUM(s.portion) OVER (PARTITION BY s.party_id),0)
FROM (SELECT p.party_id, p.member_id,
ROUND(a.amt*(p.split/100), 2) AS PORTION,
a.amt AS TOTAL, p.prime
FROM party p
INNER JOIN amount a ON p.party_id = a.party_id) s
So in the query you have a subquery that gathers the required information, then the outer query puts everything together, only applying the remainder to the record marked as prime.
Here is a DBFiddle showing how this works (LINK)
N.B.: Interestingly in the example in the DBFiddle, there is a 0.01 overpayment, so the primary actually pays less.
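The same idea can be tried end to end with a small, self-contained sketch. It uses Python's built-in sqlite3 module rather than Oracle, so DECODE becomes CASE, and it stores amounts as integer cents to sidestep floating-point noise. The table layout (amount, party, group_ref, amount_cents, split, prime) and the sample data are assumptions made for illustration, not the asker's real schema.

```python
import sqlite3

# Hypothetical schema: one amount per group, several parties per group,
# exactly one party flagged prime = 1. Amounts are held in integer cents.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE amount (group_ref INTEGER, amount_cents INTEGER);
CREATE TABLE party  (group_ref INTEGER, member_id TEXT, split REAL, prime INTEGER);
INSERT INTO amount VALUES (1, 1000);
INSERT INTO party VALUES (1, 'A', 33.34, 1), (1, 'B', 33.33, 0), (1, 'C', 33.33, 0);
""")

rows = conn.execute("""
SELECT s.member_id,
       -- every party gets its rounded share; the prime party also absorbs
       -- whatever is left after summing all rounded shares in the group
       s.portion + CASE WHEN s.prime = 1
                        THEN s.total - SUM(s.portion) OVER (PARTITION BY s.group_ref)
                        ELSE 0 END AS final_cents
FROM (SELECT p.group_ref, p.member_id, p.prime,
             a.amount_cents AS total,
             CAST(ROUND(a.amount_cents * p.split / 100.0) AS INTEGER) AS portion
      FROM party p
      JOIN amount a ON a.group_ref = p.group_ref) s
ORDER BY s.member_id
""").fetchall()

print(rows)
```

Each party's rounded share here is 333 cents, which sums to 999, so the single leftover cent lands on the prime party and the group total still equals the original 1000 cents. Subgroups with their own primary member would presumably only need the subgroup key added to the PARTITION BY.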

Related

Add a number value to column in SQL query using SELECT method

I am working on a query that calculates tuition costs. It should do this using the Tuition table, which only includes FullTimeCost (a static number for the student fees) and PerUnitCost (the cost per credit hour).
I am trying to use a SELECT to return 3 more columns: 1 constant value of 12 called Units, and 2 more that calculate the rest based on simple math.
The problem I am having is that I cannot seem to make the column Units have a default value of 12.
This is my code. The issue is that when I use this approach, the later formulas do not recognize the columns created in the previous lines.
All I need is for the 3rd line to recognize Units so it can multiply by 12 as intended. Also, this is for school, so a comment saying "just change it to 12" is not useful.
SELECT
FullTimeCost, PerUnitCost,
12 AS Units,
PerUnitCost * Units AS TotalPerUnitCost,
FullTimeCost + TotalPerUnitCost AS TotalTuition
FROM
Tuition
You cannot re-use a column alias within the same SELECT list. However, SQL Server gives you a convenient way to define the alias in the FROM clause, so you can use it:
SELECT t.FullTimeCost, t.PerUnitCost, v.Units,
v2.TotalPerUnitCost,
(t.FullTimeCost + v2.TotalPerUnitCost) AS TotalTuition
FROM Tuition t CROSS APPLY
(VALUES (12)) v(units) CROSS APPLY
(VALUES (t.PerUnitCost * v.Units)) v2(TotalPerUnitCost);
Use a CTE to "add" your constant as a column and then apply the calculation. Without context, a variable would also be just as simple and useful.
with cte as (select FullTimeCost, PerUnitCost, 12 as Units
from dbo.Tuition
)
SELECT
FullTimeCost, PerUnitCost,
Units,
PerUnitCost * Units AS TotalPerUnitCost,
FullTimeCost + PerUnitCost * Units AS TotalTuition
FROM cte
order by ...;
There are, of course, other ways to accomplish this. Not certain what your coursework has covered but I assume that recent topics should have provided techniques to do this.
Using apply, as shown in Gordon's answer, is the most elegant solution; another way, also noted in the comments, is a derived table.
As you have no doubt gathered, the problem is that when the query is compiled, the SELECT list does not "see" column aliases defined in that same list; it can (generally) only access columns available from the sources in the FROM clause, including, as Gordon shows, those produced by an apply.
What you can also do is use a derived table, by first selecting the columns you need from your table and also adding your additional columns.
You then wrap this in parentheses. It's now a derived table, i.e. the result of the parenthesized query is itself a table available to an outer select.
You then use this as the source for an outer select, which has visibility of any additional columns you have added.
A complication with your query is that you want to add a constant value Units and then reference it, and also reference a second calculated column that makes use of Units.
I would simply use a single derived table to calculate TotalPerUnitCost; you don't need Units since it's used only once.
select
FullTimeCost, PerUnitCost, TotalPerUnitCost,
FullTimeCost + TotalPerUnitCost as TotalTuition
from (
select FullTimeCost, PerUnitCost, PerUnitCost * 12 as TotalPerUnitCost
from Tuition
)t
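If Units does need to stay a visible column, derived tables can be nested so that each level only references aliases defined one level below. The sketch below runs this against SQLite through Python's sqlite3 module instead of SQL Server; the sample row (1000 and 50) and the derived-table aliases base and priced are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Tuition (FullTimeCost REAL, PerUnitCost REAL);
INSERT INTO Tuition VALUES (1000, 50);  -- made-up sample data
""")

row = conn.execute("""
SELECT FullTimeCost, PerUnitCost, Units, TotalPerUnitCost,
       FullTimeCost + TotalPerUnitCost AS TotalTuition   -- alias from 'priced'
FROM (
    SELECT FullTimeCost, PerUnitCost, Units,
           PerUnitCost * Units AS TotalPerUnitCost        -- alias from 'base'
    FROM (SELECT FullTimeCost, PerUnitCost, 12 AS Units FROM Tuition) AS base
) AS priced
""").fetchone()

print(row)
```

Each alias is defined in an inner query before an outer query uses it, which is the same reason the CTE and apply versions work.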

My Joins in query not pulling through correctly

Good evening. Could someone please help me with the following? I am trying to join two tables. The first is wbr_global.gl_ap_details, which stores historic GL information. The second table, sandbox.utr_fixed_mapping_na, is where account mapping is stored. For example, account number 60820 is mapped as Employee relation. The first table needs the mapping from the second table, linked on the account number. The output I am getting is not right and way too big. Any help would be appreciated!
Output
select sandbox.utr_fixed_mapping_na.new_mapping_1,sum(wbr_global.gl_ap_details.amount)
from wbr_global.gl_ap_details
LEFT JOIN sandbox.utr_fixed_mapping_na ON wbr_global.gl_ap_details.account_number = sandbox.utr_fixed_mapping_na.account_number
Where gl_ap_details.cost_center = '1172'
and gl_ap_details.period_name = 'JUL-21'
and gl_ap_details.ledger_name = 'Amazon.com, Inc.'
Group by 1;
I tried adding the cast function but after 5000 seconds of the query running I canceled it.
The query itself appears OK, but I would make some minor changes. Learn to use table aliases. That way you don't have to keep typing long database.table.column references all over, and the SQL is easier to read.
Notice the aliases "gl" and "fm" declared after the table names; these aliases are then used to qualify the columns. Easier to read, would you agree?
I also added the GL account number, for the reason described below the query.
select
gl.account_number,
fm.new_mapping_1,
sum(gl.amount)
from
wbr_global.gl_ap_details gl
LEFT JOIN sandbox.utr_fixed_mapping_na fm
ON gl.account_number = fm.account_number
Where
gl.cost_center = '1172'
and gl.period_name = 'JUL-21'
and gl.ledger_name = 'Amazon.com, Inc.'
Group by
gl.account_number,
fm.new_mapping_1
Now, as for your query and getting null: this just means that there are records within the gl_ap_details table with an account number that is not found in the utr_fixed_mapping_na table. So, to see WHICH GL account numbers do NOT exist, I have added the account number to the query. It's possible there are MULTIPLE records in gl_ap_details that are not found in the mapping table. So, you may get
GLAccount     Description    SumOfAmount
glaccount1    null           $someAmount
glaccount37   null           $someAmount
glaccount49   null           $someAmount
glaccount72   Depreciation   $someAmount
glaccount87   Real Estate    $someAmount
glaccount92   Building       $someAmount
glaccount99   Salaries       $someAmount
I obviously made-up glaccounts just to show the purpose. You may have multiple where the null's total amount is actually masking how many different gl account numbers were NOT found.
Once you find which are missing, you can check / confirm they SHOULD be in the mapping table.
FEEDBACK.
Since you are already aware of the missing numbers, let's consider a Cartesian result. If there are multiple entries in the mapping table for the same G/L account number, you will get a Cartesian result, thus bloating your numbers. To clarify, let's say your mapping table has
Mapping file:
GL   Descr1     NewMapping
1    test       Salaries
1    testView   Buildings
1    Another    Depreciation
And your GL_AP_Details has
GL   Amount
1    $100
Your total for the query would result in $300 because the query is trying to join the AP Details GL #1 to EACH of the entries in the mapping file thus bloating the amount. You could also add a COUNT(*) as NumberOfEntries to the query to see how many transactions it THINKS it is processing. Is there some "unique ID" in the GL_AP_Details table? If so, then you could also do a count of DISTINCT ID values. If they are different (distinct is lower than # of entries), I think THAT is your culprit.
select
fm.new_mapping_1,
sum(gl.amount),
count(*) as NumberOfEntries,
count( distinct gl.UniqueIdField ) as DistinctTransactions
from
wbr_global.gl_ap_details gl
LEFT JOIN sandbox.utr_fixed_mapping_na fm
ON gl.account_number = fm.account_number
Where
gl.cost_center = '1172'
and gl.period_name = 'JUL-21'
and gl.ledger_name = 'Amazon.com, Inc.'
Group by
fm.new_mapping_1
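The bloating effect is easy to reproduce on toy data. The sketch below uses Python's sqlite3 module with the made-up example above (one $100 detail row, three mapping rows for the same account); the id column and the simplified table name mapping are hypothetical stand-ins for whatever unique key and mapping table the real schema has.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 'id' is a hypothetical unique key on the detail table
CREATE TABLE gl_ap_details (id INTEGER PRIMARY KEY, account_number TEXT, amount INTEGER);
CREATE TABLE mapping (account_number TEXT, new_mapping_1 TEXT);
INSERT INTO gl_ap_details VALUES (1, '1', 100);
INSERT INTO mapping VALUES ('1', 'Salaries'), ('1', 'Buildings'), ('1', 'Depreciation');
""")

# One detail row joins to three mapping rows, so the amount is counted three times.
row = conn.execute("""
SELECT gl.account_number,
       SUM(gl.amount)        AS total,
       COUNT(*)              AS NumberOfEntries,
       COUNT(DISTINCT gl.id) AS DistinctTransactions
FROM gl_ap_details gl
LEFT JOIN mapping fm ON gl.account_number = fm.account_number
GROUP BY gl.account_number
""").fetchone()

# Direct check for account numbers that appear more than once in the mapping table.
dupes = conn.execute("""
SELECT account_number, COUNT(*) AS n
FROM mapping
GROUP BY account_number
HAVING COUNT(*) > 1
""").fetchall()

print(row, dupes)
```

When NumberOfEntries is larger than DistinctTransactions, the join is multiplying rows, and the second query lists the mapping keys responsible.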
Might you also need to limit the mapping table for a specific prophecy or mec view?
If you "think" that the result of an aggregate is wrong, then the easiest way to verify this is to select the individual rows that correlate to 1 record in the aggregate output and inspect the records, looking for duplications.
For instance, pick 'Building Management':
SELECT fixed.new_mapping_1,details.amount,*
FROM wbr_global.gl_ap_details details
LEFT JOIN sandbox.utr_fixed_mapping_na fixed ON details.account_number = fixed.account_number
WHERE details.cost_center = '1172'
AND details.period_name = 'JUL-21'
AND details.ledger_name = 'Amazon.com, Inc.'
AND fixed.new_mapping_1 = 'Building Management'
Notice that we tack a ,* onto the end of the projection. This shows you everything that the query has access to. Look for repeating sections of data that you were not expecting; then, depending on which table they originate from, you might add additional criteria to the JOIN or to the WHERE, or you might need to group by additional columns.
This type of issue is really hard to comment on in a forum like this because it is highly specific to your schema and the data contained within it, which makes solutions depend on criteria you are not likely to publish online.
Generally, if you think a calculation is wrong, you need to compute it manually to verify. The advice above helps you inspect the data your query is using. You should either construct your own query or use other tools to build the data set that lets you compute the correct values by hand, then work them back into, or replace, your original query.
The speed issues are out of scope here. We can comment on the poor schema design, but I suspect you don't have a choice. In the utr_fixed_mapping_na table you should make account_number have the same column type as the source data, or add a new column that holds the data in the original type; then you can set up indexes on the columns to improve the speed of the join.

How to calculate a bank's deposit growth from one call report to the next, as a percentage?

I downloaded the entire FDIC bank call reports dataset, and uploaded it to BigQuery.
The table I currently have looks like this:
What I am trying to accomplish is adding a column showing the deposit growth rate since the last quarter for each bank:
Note: The first reporting date for each bank (e.g. 19921231) will not have a "Quarterly Deposit Growth", hence the two empty cells for the two banks.
I would like to know if a bank is increasing or decreasing its deposits each quarter/call report (viewed as a percentage).
e.g. "On their last call report (19921231), First National Bank had deposits of 456789 (in 1000's). In their next call report (19930331), First National Bank had deposits of 567890 (in 1000's). What is the percentage increase (or decrease) in deposits?"
This "_%_Change_in_Deposits" column would be displayed as a new column.
This is the code I have written so far:
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1992.All_Reports_19921231_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1992.All_Reports_19921231_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
UNION ALL
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1993.All_Reports_19930331_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1993.All_Reports_19930331_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
The table looks like this:
Additional notes:
I would also like to view the last column (SFR2TotalLoanRatio) as a percentage.
This code runs correctly, however, previously I was getting a "division by zero" error when attempting to run 50,000 rows (1992 to the present).
Addressing each of your questions individually.
First) Retrieving SFR2TotalLoanRatio as a percentage: I assume you want to see 9.88% instead of 0.0988 in your results. Currently, in BigQuery you can achieve this by casting the field to a STRING and then concatenating the % sign. Below is an example with sample data:
WITH data as (
SELECT 0.0123 as percentage UNION ALL
SELECT 0.0999 as percentage UNION ALL
SELECT 0.3456 as percentage
)
SELECT CONCAT(CAST(ROUND(percentage*100, 2) AS STRING),"%") as formatted_percentage FROM data
And the output,
Row formatted_percentage
1 1.23%
2 9.99%
3 34.56%
Second) Regarding your question about the division by zero error: in IEEE_DIVIDE(arg1, arg2), arg1 is the dividend and arg2 is the divisor. IEEE_DIVIDE itself returns NaN or infinity rather than raising an error when the divisor is zero; either way, I would advise you to explore your data in order to figure out which records have a divisor equal to zero. After gathering these results, you can determine what to do with them. If you decide to discard them, you can simply add AL.lnlsnet != 0 to the WHERE clause of each of the joined queries. Alternatively, you can handle the records where lnlsnet = 0 using a CASE WHEN or IF expression.
UPDATE:
In order to add this piece of code to your query, you have to wrap your code within a temporary table (a WITH clause). Then, I will make two adjustments: first, a temporary function to calculate the percentage and format it with the % sign; second, retrieving the previous number of deposits to calculate the desired percentage. I am also assuming that cert is the unique id for each bank. The modifications will be as follows:
#the following function MUST be the first thing within your query
CREATE TEMP FUNCTION percent(dep INT64, prev_dep INT64) AS (
Concat(Cast((dep-prev_dep)/prev_dep*100 AS STRING), "%")
);
#followed by the query you have created so far as a temporary table; notice the comma I added after the last parenthesis
WITH data AS(
#your query
),
#within this second part you need to select all the columns from data; the LAG function is used to retrieve the previous number of deposits for each bank (cert)
data_2 as (
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans,Deposits, loans_and_leases, SFR2TotalLoanRatio,
CASE WHEN cert = LAG(cert) OVER (PARTITION BY cert ORDER BY repdte) THEN LAG(Deposits) OVER (PARTITION BY cert ORDER BY repdte) ELSE NULL END AS prev_dep FROM data
)
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans,Deposits, loans_and_leases, SFR2TotalLoanRatio, percent(Deposits,prev_dep) as dept_growth_rate FROM data_2
Note that the built-in function LAG is used together with CASE WHEN in order to retrieve the previous amount of deposits per bank.
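The LAG step can be checked on a tiny data set before running it over the full history. This is a minimal sketch using Python's sqlite3 module rather than BigQuery; the table name deposits and the sample figures are invented, and NULLIF is an added guard (not part of the answer above) so that a previous deposit of zero yields NULL instead of a division problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- invented sample: two banks (cert 1 and 2), two call reports each
CREATE TABLE deposits (repdte TEXT, cert INTEGER, dep INTEGER);
INSERT INTO deposits VALUES
  ('19921231', 1, 1000), ('19930331', 1, 1100),
  ('19921231', 2, 2000), ('19930331', 2, 1500);
""")

rows = conn.execute("""
WITH with_prev AS (
  SELECT repdte, cert, dep,
         -- previous report's deposits for the same bank, NULL on the first report
         LAG(dep) OVER (PARTITION BY cert ORDER BY repdte) AS prev_dep
  FROM deposits
)
SELECT cert, repdte,
       ROUND((dep - prev_dep) * 100.0 / NULLIF(prev_dep, 0), 2) AS growth_pct
FROM with_prev
ORDER BY cert, repdte
""").fetchall()

for r in rows:
    print(r)
```

The first report of each bank comes back with no growth value, matching the empty cells described in the question; bank 1 grows by 10 percent and bank 2 shrinks by 25 percent.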

SQL Server 2008 : running totals and null records

I have several queries that get account balances from our ERP, but there are several issues I am trying to work around and I am curious if there are better ways or if more recent versions of SQL Server have functions to address any of these problems.
Our ERP generates a balance record only in periods where there is activity associated with the account. The ERP applications and reports summarize values by period but no record is added to the database so custom processes that need a balance by period require a query/view to calculate this info.
My workaround for this has been to use a global variable to intentionally create duplicates from the Account table and the pseudo period table I created, see below
Our Account Period table does not contain a period index. (I suppose it should be the Row ID; however, at some point a fiscal period was added incorrectly and the index was thrown out of order. I have been advised by the ERP provider not to update this without a full reimplementation.) I created a workaround table for this.
So I have several queries that work around these issues but they run slowly with just a handful of accounts so a full pseudo table for account balances has not been practical with my methods at least.
I have included an example below for calculating the balance by period for accounts that are not summarized to retained earnings annually (assets, liabilities, equity)
SELECT
ID AS ACCOUNT_ID, ind.Month_Index, ind.Period,
(SELECT
ISNULL(SUM(CASE WHEN A3.TYPE IN ('e','r') THEN NULL
WHEN A3.TYPE = 'a' THEN ISNULL(AB3.DEBIT_AMOUNT, 0) - ISNULL(AB3.CREDIT_AMOUNT, 0)
ELSE ISNULL(AB3.CREDIT_AMOUNT, 0) - ISNULL(AB3.DEBIT_AMOUNT, 0) END), 0)
FROM ACCOUNT_BALANCE AS AB3
LEFT OUTER JOIN ACCOUNT AS A3 ON AB3.ACCOUNT_ID = A3.ID
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind2 ON AB3.ACCT_YEAR = ind2.YEAR AND AB3.ACCT_PERIOD = ind2.Month_Num
WHERE A.ID = AB3.ACCOUNT_ID
AND A3.CURRENCY_ID = '(USA) $'
AND ind2.Month_Index <= ind.Month_Index) AS BALANCE_AQL
FROM
ACCOUNT AS A
LEFT OUTER JOIN
ACCOUNT_PERIOD AS per ON 'UCC' = per.SITE_ID
LEFT OUTER JOIN
ACCOUNT_BALANCE AS AB ON A.ID = AB.ACCOUNT_ID
AND per.ACCT_YEAR = AB.ACCT_YEAR
AND per.ACCT_PERIOD = AB.ACCT_PERIOD
AND AB.CURRENCY_ID = '(USA) $'
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind ON per.ACCT_YEAR = ind.YEAR AND per.ACCT_PERIOD = ind.Month_Num
WHERE
ID IN ('120-1140-0000', '120-1190-1190', '120-1190-1193',
'120-1190-1194', '210-2100-0000', '210-2101-0000')
GROUP BY
ID, ind.Month_Index, ind.Period
ORDER BY
ind.Month_Index DESC, ACCOUNT_ID DESC
Any suggestions that might improve the performance of this query will be greatly appreciated.
My high level recommendations are the following:
Avoid using the IN clause. If possible (assuming the account table isn't too big), create a temporary table with only the columns you need, load it with the IDs you are working with, and then use that in your code above.
(Not a performance thing, more of a slight change.) The ISNULL(SUM(...)) wrapper is only needed because you have "A3.TYPE IN ('e', 'r') THEN NULL". If you had said THEN 0, you could avoid the null check for those rows, although SUM over no rows at all still returns NULL.
A correlated subquery within the SELECT is okay, but it's the multi-part join that is most likely causing it to slow down. I'm not 100% confident how you would break this apart into two separate logical grabs of data that are then joined back together, but that's the best I've got with what I'm seeing here.
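On the question of newer versions: from SQL Server 2012 onward, aggregates accept ORDER BY inside the OVER clause, which gives a running total without a correlated subquery. The following is only a sketch of that pattern, run against SQLite via Python's sqlite3 module; the tables account, period and activity are simplified, hypothetical stand-ins for ACCOUNT, the calendar index table and ACCOUNT_BALANCE, with one net amount per account and period.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- simplified, hypothetical stand-ins for the ERP tables
CREATE TABLE account  (id TEXT);
CREATE TABLE period   (month_index INTEGER);
CREATE TABLE activity (account_id TEXT, month_index INTEGER, net INTEGER);
INSERT INTO account  VALUES ('A'), ('B');
INSERT INTO period   VALUES (1), (2), (3);
-- balance rows exist only for periods with activity
INSERT INTO activity VALUES ('A', 1, 100), ('A', 3, 50), ('B', 2, 20);
""")

rows = conn.execute("""
SELECT a.id, p.month_index,
       -- running total per account; periods without activity contribute 0
       SUM(COALESCE(b.net, 0)) OVER (PARTITION BY a.id ORDER BY p.month_index) AS balance
FROM account a
CROSS JOIN period p            -- one row per account and period, active or not
LEFT JOIN activity b
       ON b.account_id = a.id AND b.month_index = p.month_index
ORDER BY a.id, p.month_index
""").fetchall()

for r in rows:
    print(r)
```

The CROSS JOIN produces the missing period rows deliberately, and the windowed SUM carries the balance forward through periods that have no balance record. On SQL Server 2008 this form is not available, so the correlated subquery or a self-join remains necessary there.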

SQL - Getting the max effective date less than a date in another table

I'm currently working on a conversion script to transfer a bunch of old data out of a SQL Server 2000 database and onto SQL Server 2008. One of the things I'm trying to accomplish during this conversion is to eliminate all of the composite keys and replace them with a "proper" primary key. Obviously, when I transfer the data I need to inject the foreign key values into the new table structures.
I'm currently stuck with one data set though, and I can't seem to get my head around it in a set-based fashion. The two tables with which I am working are called Charge and Law. They have a 1:1 relationship and "link" on three columns. The first two are an equal link on the LawSource and LawStatute columns, but the third column is causing me problems. The ChargeDate column should link to the LawDate column where LawDate <= ChargeDate.
My current query is returning more than one row (in some cases) for a given Charge because the Law may have more than one LawDate that is less than or equal to the ChargeDate.
Here's what I currently have:
select LawId
from Law a
join Charge b on b.LawSource = a.LawSource
and b.LawStatute = a.LawStatute
and b.ChargeDate >= a.LawDate
Is there any way I can rewrite this to get the most recent entry in the Law table that has the same (or an earlier) date as the ChargeDate?
This would be easier in SQL 2008 with the partitioning functions (so, it should be easier in the future for you).
The usual caveats of "I don't have your schema, so this isn't tested" apply, but I think it should do what you need.
select
l.LawID
from
law l
join (
select
a.LawSource,
a.LawStatute,
max(a.LawDate) LawDate
from
Law a
join Charge b on b.LawSource = a.LawSource
and b.LawStatute = a.LawStatute
and b.ChargeDate >= a.LawDate
group by
a.LawSource, a.LawStatute
) d on l.LawSource = d.LawSource and l.LawStatute = d.LawStatute and l.LawDate = d.LawDate
If performance is not an issue, cross apply provides a very readable way:
select *
from Charge c
cross apply
(
select top 1 *
from Law
where LawSource = c.LawSource
and LawStatute = c.LawStatute
and LawDate <= c.ChargeDate
order by
LawDate desc
) l
For each row in Charge, this looks up the row in the Law table with the latest LawDate on or before the ChargeDate.
To include rows from Charge without a matching Law, change cross apply to outer apply.
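The "partitioning functions" route mentioned earlier can be sketched with ROW_NUMBER, which numbers the candidate laws for each charge from newest to oldest so that only the first one is kept. The example runs on SQLite through Python's sqlite3 module; ChargeId is a hypothetical key for the Charge table (the question does not say what its key is), and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Law (LawId INTEGER, LawSource TEXT, LawStatute TEXT, LawDate TEXT);
-- ChargeId is a hypothetical key; dates are ISO strings so they sort correctly
CREATE TABLE Charge (ChargeId INTEGER, LawSource TEXT, LawStatute TEXT, ChargeDate TEXT);
INSERT INTO Law VALUES
  (1, 'S', '10', '2000-01-01'),
  (2, 'S', '10', '2005-01-01'),
  (3, 'S', '10', '2010-01-01');
INSERT INTO Charge VALUES
  (100, 'S', '10', '2006-06-01'),
  (101, 'S', '10', '2000-01-01'),
  (102, 'S', '10', '2012-01-01');
""")

rows = conn.execute("""
SELECT ChargeId, LawId
FROM (
    SELECT c.ChargeId, l.LawId,
           -- rank each charge's candidate laws, most recent LawDate first
           ROW_NUMBER() OVER (PARTITION BY c.ChargeId
                              ORDER BY l.LawDate DESC) AS rn
    FROM Charge c
    JOIN Law l
      ON l.LawSource  = c.LawSource
     AND l.LawStatute = c.LawStatute
     AND l.LawDate   <= c.ChargeDate
) ranked
WHERE rn = 1
ORDER BY ChargeId
""").fetchall()

print(rows)
```

Each charge ends up with exactly one LawId: the law in force on the charge date. If two laws could share the same LawDate for one source and statute, a tie-breaker column would need to be added to the ORDER BY.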