How to calculate a bank's deposit growth from one call report to the next, as a percentage? - google-bigquery

I downloaded the entire FDIC bank call reports dataset, and uploaded it to BigQuery.
The table I currently have looks like this:
What I am trying to accomplish is adding a column showing the deposit growth rate since the last quarter for each bank:
Note: The first reporting date for each bank (e.g. 19921231) will not have a "Quarterly Deposit Growth" value, hence the two empty cells for the two banks.
I would like to know if a bank is increasing or decreasing its deposits each quarter/call report (viewed as a percentage).
e.g. "On their last call report (19921231), First National Bank had deposits of 456789 (in 1000's). In their next call report (19930331), First National Bank had deposits of 567890 (in 1000's). What is the percentage increase (or decrease) in deposits?"
This "_%_Change_in_Deposits" column would be displayed as a new column.
This is the code I have written so far:
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1992.All_Reports_19921231_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1992.All_Reports_19921231_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
UNION ALL
select
SFRNLL.repdte, SFRNLL.cert, SFRNLL.name, SFRNLL.city, SFRNLL.county, SFRNLL.stalp, SFRNLL.specgrp AS `Loan_Specialization`, SFRNLL.lnreres as `_1_to_4_Residential_Loans`, AL.dep as `Deposits`, AL.lnlsnet as `loans_and_leases`,
IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) as SFR2TotalLoanRatio
FROM usa_fdic_call_reports_1993.All_Reports_19930331_1_4_Family_Residential_Net_Loans_and_Leases as SFRNLL
JOIN usa_fdic_call_reports_1993.All_Reports_19930331_Assets_and_Liabilities as AL
ON SFRNLL.cert = AL.cert
where SFRNLL.specgrp = 4 and IEEE_DIVIDE(SFRNLL.lnreres, AL.lnlsnet) <= 0.10
The table looks like this:
Additional notes:
I would also like to view the last column (SFR2TotalLoanRatio) as a percentage.
This code runs correctly; however, I was previously getting a "division by zero" error when attempting to run it on 50,000 rows (1992 to the present).

I will address each of your questions individually.
First) Retrieving SFR2TotalLoanRatio as a percentage: I assume you want to see 9.88% instead of 0.0988 in your results. Currently, in BigQuery you can achieve this by casting the field to a STRING and then concatenating the % sign. Below is an example with sample data:
WITH data as (
SELECT 0.0123 as percentage UNION ALL
SELECT 0.0999 as percentage UNION ALL
SELECT 0.3456 as percentage
)
SELECT CONCAT(CAST(percentage*100 as String),"%") as formatted_percentage FROM data
And the output,
Row formatted_percentage
1 1.23%
2 9.99%
3 34.56%
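For comparison, the same formatting idea can be sketched outside BigQuery. This is only an illustration in Python, using the same three sample values; the helper name format_percentage is made up for this example.

```python
def format_percentage(fraction, decimals=2):
    """Format a fraction such as 0.0999 as a percentage string such as '9.99%'."""
    # The '%' format type multiplies by 100 and appends the percent sign.
    return f"{fraction:.{decimals}%}"


samples = [0.0123, 0.0999, 0.3456]
formatted = [format_percentage(value) for value in samples]
print(formatted)
```

Fixing the number of decimals keeps the strings the same width, which is worth considering in the SQL version as well (for example by rounding before the cast).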
Second) Regarding your question about the division by zero error. I am assuming IEEE_DIVIDE(arg1, arg2) is the function you use to perform the division, in which arg1 is the dividend and arg2 is the divisor. Therefore, I would advise you to explore your data in order to figure out which records have a divisor equal to zero. After gathering these results, you can determine what to do with them. In case you decide to discard them, you can simply add the condition AL.lnlsnet != 0 to the WHERE clause of each of your SELECT statements. On the other hand, you can also modify the records where lnlsnet = 0 using CASE WHEN or IF statements.
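The two options (discarding rows with a zero divisor, or keeping them with a substitute value) can be sketched as follows. This is a minimal Python illustration, not BigQuery code; the sample rows and the helper name sfr_ratio are assumptions made up for the example.

```python
def sfr_ratio(lnreres, lnlsnet):
    """Return lnreres / lnlsnet, or None when the divisor is zero or missing."""
    if not lnlsnet:
        return None
    return lnreres / lnlsnet


# Hypothetical sample rows; the second bank reports zero net loans and leases.
rows = [
    {"cert": 1, "lnreres": 50, "lnlsnet": 1000},
    {"cert": 2, "lnreres": 10, "lnlsnet": 0},
]

# Option 1: discard the rows whose divisor is zero.
kept = [row for row in rows if row["lnlsnet"] != 0]

# Option 2: keep every row and store None where the ratio is undefined.
ratios = {row["cert"]: sfr_ratio(row["lnreres"], row["lnlsnet"]) for row in rows}
print(kept, ratios)
```

Which option is appropriate depends on whether a bank with no loans should disappear from the report or show up with an empty ratio.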
UPDATE:
In order to add this piece of code to your query, you have to wrap your code within a temporary table (a WITH clause). Then, I will make two adjustments: first, a temporary function to calculate the percentage and format it with the % sign; second, retrieving the previous number of deposits to calculate the desired percentage. I am also assuming that cert is the unique id of each bank. The modifications will be as follows:
#the following function MUST be the first thing within your query
CREATE TEMP FUNCTION percent(dep INT64, prev_dep INT64) AS (
Concat(Cast((dep-prev_dep)/prev_dep*100 AS STRING), "%")
);
#followed by the query you have created so far as a temporary table; notice the comma I added after the last parenthesis
WITH data AS(
#your query
),
#within this second part you need to select all the columns from data; the LAG function is used to retrieve the previous number of deposits for each bank
data_2 as (
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans, Deposits, loans_and_leases, SFR2TotalLoanRatio,
#partitioning by cert and ordering by repdte makes LAG return the same bank's previous report, or NULL for its first report
LAG(Deposits) OVER (PARTITION BY cert ORDER BY repdte) AS prev_dep FROM data
)
SELECT repdte, cert, name, city, county, stalp, Loan_Specialization, _1_to_4_Residential_Loans, Deposits, loans_and_leases, SFR2TotalLoanRatio, percent(Deposits,prev_dep) as dept_growth_rate FROM data_2
Note that the built-in function LAG, partitioned by cert and ordered by repdte, retrieves the previous amount of deposits per bank. The first report of each bank has no previous row, so prev_dep and the growth rate are NULL there; this makes a separate CASE WHEN check unnecessary.
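To make the arithmetic concrete, here is a rough Python sketch of the same "previous report per bank" logic, using the deposit figures from the question. The cert values 1001 and 2002, the second bank's figures and the function name deposit_growth are invented for illustration.

```python
from itertools import groupby
from operator import itemgetter


def deposit_growth(reports):
    """Given (cert, repdte, deposits) tuples, return
    (cert, repdte, deposits, growth_pct) tuples, where growth_pct is the
    percentage change versus the same bank's previous report."""
    result = []
    # Sort by bank, then by report date, so each bank's reports are in order.
    ordered = sorted(reports, key=itemgetter(0, 1))
    for cert, group in groupby(ordered, key=itemgetter(0)):
        prev = None
        for _, repdte, deposits in group:
            if prev is None or prev == 0:
                growth = None  # first report, or no meaningful base value
            else:
                growth = round((deposits - prev) / prev * 100, 2)
            result.append((cert, repdte, deposits, growth))
            prev = deposits
    return result


reports = [
    (1001, "19921231", 456789),
    (1001, "19930331", 567890),
    (2002, "19921231", 100000),
    (2002, "19930331", 90000),
]
growth_rows = deposit_growth(reports)
print(growth_rows)
```

With these inputs the first bank grows by about 24.32% and the second shrinks by 10%, while each bank's first report has no growth value.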

Related

Wrapping a range of data

How would I select a rolling/wrapping* set of rows from a table?
I am trying to select a number of records (per type, 2 or 3) for each day, wrapping when I 'run out'.
Eg.
2018-03-15: YyBiz, ZzCo, AaPlace
2018-03-16: BbLocation, CcStreet, DdInc
These are rendered within an SSRS report for Dynamics CRM, so I can do light post-query operations.
Currently I get to:
2018-03-15: YyBiz, ZzCo
2018-03-16: AaPlace, BbLocation, CcStreet
First, getting a number for each record with:
SELECT name, ROW_NUMBER() OVER (PARTITION BY type ORDER BY name) as RN
FROM table
Within SSRS, I then adjust RN to reflect the number of each type I need:
OnPageNum = FLOOR((RN+num_of_type-1)/num_of_type)-1
--Shift RN to be 0-indexed.
Resulting in AaPlace, BbLocation and CcStreet having a PageNum of 0, DdInc of 1, ... YyBiz and ZzCo of 8.
Then using an SSRS Table/Matrix linked to the dataset, I set the row filter to something like:
RowFilter = MOD(DateNum, NumPages(type)) == OnPageNum
Where DateNum is essentially days since epoch, and each page has a separate table and day passed in.
At this point, it shows only N records of each type per page, but if the total number of records of a type isn't a multiple of the number of records per page for that type, there will be pages with fewer records than required.
Is there an easier way to approach this/what's the next step?
*Wrapping such as Wraparound found in videogames, seamless resetting to 0.
To achieve this effect, I found that offsetting the row number by -DateNum*num_of_type (negative for positive ordering) and then taking it modulo COUNT(type) provides the correct "wrap around" effect.
To get the desired pagination, the result then just has to be divided by num_of_type and floored, as below:
RowFilter: FLOOR(((RN-DateNum*num_of_type) % count(type))/num_of_type) == 0
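The filter can be checked with a small Python sketch. It assumes a 0-indexed row number and relies on Python's % always returning a non-negative result for a positive modulus, which may not hold for the modulo operator in other environments. The function name rows_for_day and the last four sample names are made up for the example.

```python
def rows_for_day(names, date_num, per_page):
    """Return the names shown on a given day, wrapping around the list."""
    ordered = sorted(names)
    total = len(ordered)
    picked = []
    for rn, name in enumerate(ordered):  # rn is a 0-indexed row number
        # Shift each row by the number of rows already shown, then wrap.
        offset = (rn - date_num * per_page) % total
        if offset // per_page == 0:
            picked.append((offset, name))
    # Sort by the wrapped offset so the rows appear in display order.
    return [name for _, name in sorted(picked)]


names = ["AaPlace", "BbLocation", "CcStreet", "DdInc",
         "EeShop", "FfLtd", "GgCorp", "HhBiz"]
day_two = rows_for_day(names, date_num=2, per_page=3)
print(day_two)
```

With 8 names and 3 per page, day 2 wraps from the end of the list back to the start instead of showing a short page.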

SAP BO - how to get 1/0 distinct values per week in each row

The problem I am trying to solve is having a SAP Business Objects query calculate a variable for me, because calculating it in a large Excel file crashes the process.
I have a bunch of columns with daily/weekly data. I would like to get a "1" for the first instance of a Name/Person/certain identifier within a single week and "0" for all the rest.
So, for example, if the item "Glass" was sold 5 times in week 4, then in this variable/column the first sale will get "1" and the next 4 sales will get "0". This will allow me to have the number of distinct items sold in a particular week.
I am aware there are Count and Count distinct functions in Business Objects; however, I would prefer to have this 1/0 system for the entire raw table of data, because I am using it as a source for a whole dashboard and there are lots of metrics where the distinct count will be a part/slicer.
The way I was doing it previously is with an Excel formula: =IF(SUMPRODUCT(($A$2:$A5000=$A2)*($G$2:$G5000=$G2))>1,0,1)
This does the trick and gives a "1" for the first instance of a value in column G appearing in a certain value range in column A (column A is the week) and gives "0" when the same value reappears for the same week value in column A. It gives "1" again when the week value changes.
Since it compares 2 cells in each row against the entire columns of data, this tends to crash as the data gets bigger.
So far I have been unable to emulate this in Business Objects, and I think I have exhausted my abilities and googling.
Could anyone share their two cents on this, please?
Assuming you have an object in the query that uniquely identifies a row, you can do this in a couple of simple steps.
Let's assume your query contains the following objects:
Sale ID
Name
Person
Sale Date
Week #
Price
etc.
You want to show a 1 for the first occurrence of each Name/Week #.
Step 1: Create a variable with the following definition. Let's call it [FirstOne]
=Min([Sale ID]) In ([Name];[Week #])
Step 2: In the report block, add a column with the following formula:
=If [FirstOne] = [Sale ID] Then 1 Else 0
This should produce a 1 in the row that represents the first occurrence of Name within a Week #. If you then wanted to show a 1 on the first occurrence of Name/Person/Week #, you could just modify the [FirstOne] variable accordingly:
=Min([Sale ID]) In ([Name];[Person];[Week #])
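The logic of the two steps (find the smallest Sale ID per Name/Week #, then flag the row that carries it) can be sketched in Python. This is only an illustration of the idea, not Business Objects syntax; the sample sales and the name flag_first_sales are made up.

```python
def flag_first_sales(sales):
    """Return a list of 1/0 flags, one per sale: 1 for the sale with the
    smallest sale_id within its (name, week) group, 0 otherwise."""
    first_id = {}
    # Step 1: smallest sale_id per (name, week), like Min([Sale ID]) In (...).
    for sale in sales:
        key = (sale["name"], sale["week"])
        if key not in first_id or sale["sale_id"] < first_id[key]:
            first_id[key] = sale["sale_id"]
    # Step 2: a row gets 1 only if it is that first sale.
    return [
        1 if sale["sale_id"] == first_id[(sale["name"], sale["week"])] else 0
        for sale in sales
    ]


sales = [
    {"sale_id": 1, "name": "Glass", "week": 4},
    {"sale_id": 2, "name": "Glass", "week": 4},
    {"sale_id": 3, "name": "Glass", "week": 4},
    {"sale_id": 4, "name": "Glass", "week": 5},
    {"sale_id": 5, "name": "Plate", "week": 4},
]
flags = flag_first_sales(sales)
print(flags)
```

Summing the flags for a given week then gives the number of distinct items sold in that week, which is the use described in the question.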
I think you want logic around row_number():
select t.*,
(case when 1 = row_number() over (partition by name, person, week, identifier
order by ??
)
then 1 else 0
end) as new_indicator
from t;
Note the ??. SQL tables represent unordered sets. There is no "first" row in a table or group of rows, unless a column specifies that ordering. The ?? is for such a column (perhaps a date/time column, perhaps an id).
If you only want one row to be marked, you can put anything there, such as order by (select null) or order by week.

SQL: Calculate Percentage in new column using another column

I found it hard to describe what I wanted to do in the title, but I will be more specific here.
I have a reasonably long query:
SELECT
/*Amount earned with validation to remove outlying figures*/
Case When SUM(t2.[ActualSalesValue])>=0.01 OR SUM(t2.[ActualSalesValue])<0 Then SUM(t2.[ActualSalesValue]) ELSE 0 END AS 'Amount',
/*Profit earned (is already calculated then input into db, this just pulls that figure*/
SUM(t2.[Profit]) AS 'Profit',
/*Product Type - pulls the product type so that we can sort by product*/
t1.[ucIIProductType] AS 'Product Type',
/*Profit Percentage - This is to calculate the percentage of profit based on the sales price which uses 2 different columns - Case ensures that there are no wild values appearing in the reports as previously experienced*/
Case When SUM(t2.[ActualSalesValue])>=0.01 OR SUM(t2.[ActualSalesValue])<0 THEN (SUM(t2.[Profit])/SUM(t2.[ActualSalesValue])) ELSE 0 END AS 'Profit Percentage',
/*Percentage of Turnover*/
*SUM(t2.[ActualSalesValue])/(Select SUM(t2.[ActualSalesValue]) OVER() FROM [_bvSTTransactionsFull]) AS 'PoT'
/*The join is to connect the product type with the profit and the amount*/
FROM [dbo].[StkItem] AS t1
INNER JOIN [dbo].[_bvSTTransactionsFull] AS t2
/*These attributes are the links between the tables*/
ON t1.[StockLink]=t2.[AccountLink]
WHERE t2.[TxDate] BETWEEN '1/Aug/2014' AND '31/Aug/2014' AND ISNUMERIC(t2.[Account]) = 1
Group By t1.[ucIIProductType]
The 'Percentage of Turnover' part is what I am having trouble with. I am trying to calculate the percentage of the Amount based on the total amount, using the same column. E.g., I want to take the Amount value in row 1, divide it by the total amount of the entire column, and then have that value listed in a new column. But I keep getting errors, or I keep getting 1 (because it wants to divide the value by the same value). Can anyone please advise me on the proper syntax for solving this:
/*Percentage of Turnover*/
*SUM(t2.[ActualSalesValue])/(Select SUM(t2.[ActualSalesValue]) OVER() FROM [_bvSTTransactionsFull]) AS 'PoT'
I think you want one of the following:
SUM(t2.[ActualSalesValue])/(Select SUM(t.[ActualSalesValue]) FROM [_bvSTTransactionsFull] t) AS PoT
or:
SUM(t2.[ActualSalesValue])/(SUM(SUM(t2.[ActualSalesValue])) OVER() ) AS PoT
Note: you should use single quotes only for string and date constants, not for column and table names. If you need to escape names, use square brackets.
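The calculation behind the second form (each group's sum divided by the sum of all group sums) can be illustrated with a short Python sketch. The product types, amounts and the function name turnover_share are invented for the example. Note that the first form, as written, totals the whole table in its subquery without the date filter, so the two forms may not return the same figure.

```python
from collections import defaultdict


def turnover_share(transactions):
    """Given (product_type, amount) pairs, return each product type's
    share of the grand total as a fraction between 0 and 1."""
    totals = defaultdict(float)
    for product_type, amount in transactions:
        totals[product_type] += amount  # per-group SUM
    grand_total = sum(totals.values())  # SUM of the group sums
    if grand_total == 0:
        return {product_type: None for product_type in totals}
    return {
        product_type: total / grand_total
        for product_type, total in totals.items()
    }


transactions = [("A", 300.0), ("B", 100.0), ("A", 100.0)]
shares = turnover_share(transactions)
print(shares)
```

Here product type A accounts for 400 of the 500 total, so its share is 0.8 rather than the 1 that results from dividing a group's sum by itself.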

Select Average of Top 25% of Values in SQL

I'm currently writing a stored procedure for my client to populate some tables that will be used to generate SSRS reports later on. Some of the data is based on specific stock formulas that are run on each of their clients' quarterly data (sent to them by their clients). The other part of the data is generated by comparing those results against those from other, similar sized clients. One of the things that they want tracked in their reports is the average of the top 25% of formula results for that particular comparison group.
To give a better picture of it, imagine the following fields that I have in a temp table:
FormulaID int
Value decimal (18,6)
I want to do the following: Given a specific FormulaID return the average of the top 25% of Value.
I know how to take an average in SQL, but I don't know how to do it against only the top 25% of a specific group.
How would I write this query?
I guess you can do something like this...
SELECT AVG(Q.ColA) Avg25Prec
FROM (
SELECT TOP 25 Percent ColA
FROM Table_Name
ORDER BY SomeCOlumn
) Q
Here's what I did, given the table shown above:
select AVG(t.Value)
from (select top 25 percent Value
from #TempGroupTable
where FormulaID = @PassedInFormulaID
order by Value desc) as t
The desc must be there because TOP 25 PERCENT does not compare values itself. It simply grabs the first x records in the order given, with x being 25% of the count of records being queried. Therefore, order by Value desc makes it grab the 25% of records with the highest Value, which are then averaged.
As a side note, this also means that if you wanted to grab the bottom 25% instead, or if your formula results are like a golf score (i.e. lowest is best), all you would need to do is remove the desc and you would be good to go.
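The same "sort, take the first quarter, average" idea can be sketched in Python. The sketch assumes the row count is rounded up when 25% is not a whole number, which is my understanding of how TOP ... PERCENT behaves in SQL Server; the function name average_top_percent and the sample values are made up.

```python
import math


def average_top_percent(values, percent=25, highest_first=True):
    """Average the top `percent` of values after sorting them.
    Returns None when there is nothing to average."""
    # Assumption: a fractional row count is rounded up to the next integer.
    count = math.ceil(len(values) * percent / 100)
    if count == 0:
        return None
    # The sort direction decides whether "top" means highest or lowest.
    top = sorted(values, reverse=highest_first)[:count]
    return sum(top) / len(top)


values = [10, 20, 30, 40, 50, 60, 70, 80]
top_quarter_avg = average_top_percent(values)
print(top_quarter_avg)
```

Passing highest_first=False corresponds to removing desc in the query, which averages the lowest quarter instead.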

Create one query with sum and count with each value pulled from a different table

I am trying to create a query that pulls two different aggregated values from three different tables during a specific date range. I am working in Access 2003.
I have:
tblPO, which has the high-level purchase order description (company name, shop order #, date of order, etc.)
tblPODescriptions, which has the dollar values of the individual line items from the customer's purchase order
tblCostSheets, which has a breakdown of the individual pieces that we need to manufacture to satisfy the customer's purchase order.
I am looking to create a query that will allow me, based on the Shop Order #, to get both the sum of the dollar values from tblPODescriptions and the count of the different types of pieces we need to make from tblCostSheets.
A quick caveat: the purchase order may have 5 line items for a sum of, say, $1560, but it might take us making 8 or 9 different parts to satisfy those 5 line items. I can easily create a query that pulls either the sum or the count by itself, but when I create a query with both, I end up with numbers that are multiplied versions of what I want. I believe it is multiplying my piece counts and dollar values.
SELECT DISTINCTROW tblPO.CompanyName, tblPO.ShopOrderNo, tblPO.OrderDate, Sum(tblPODescriptions.ItemAmount) AS SumOfItemAmount, Count(tblCostSheets.Description) AS CountOfDescription
FROM (tblPO INNER JOIN tblPODescriptions ON (tblPO.CompanyName = tblPODescriptions.CompanyName) AND (tblPO.PurchaseOrderNo = tblPODescriptions.PurchaseOrderNo) AND (tblPO.PODate = tblPODescriptions.PODate)) INNER JOIN tblCostSheets ON tblPO.ShopOrderNo = tblCostSheets.ShopOrderNo
GROUP BY tblPO.CompanyName, tblPO.ShopOrderNo, tblPO.OrderDate
HAVING (((tblPO.OrderDate) Between [Enter Start Date:] And [Enter End Date:]));
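The multiplication described above can be reproduced with a small Python sketch. It assumes that joining both child tables to the same order pairs every line item with every cost sheet row; the individual amounts, part names and function names are invented, chosen so that 5 line items sum to $1560 and there are 8 parts.

```python
def naive_joined_summary(item_amounts, part_descriptions):
    """Aggregate after pairing every line item with every part, which is
    what joining both child tables to one order effectively does."""
    joined = [(amount, part) for amount in item_amounts for part in part_descriptions]
    return sum(amount for amount, _ in joined), len(joined)


def separate_summary(item_amounts, part_descriptions):
    """Aggregate each child table on its own, then combine the results."""
    return sum(item_amounts), len(part_descriptions)


item_amounts = [500, 400, 300, 200, 160]        # 5 line items, $1560 in total
part_descriptions = [f"part {n}" for n in range(1, 9)]  # 8 parts

inflated = naive_joined_summary(item_amounts, part_descriptions)
expected = separate_summary(item_amounts, part_descriptions)
print(inflated, expected)
```

The joined version reports the sum multiplied by the number of parts and the count multiplied by the number of line items, which matches the symptom in the question. One possible direction is therefore to compute the sum and the count in separate queries per Shop Order # and only then join the two results.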