Structuring Databases for Financial Statements - sql

I am looking for the best way to structure my database. I have quarterly financial statements for thousands of companies from 1997-2012. Each company has three different statements: an income statement, a balance sheet, and a cash flow statement.
I want to be able to perform calculations on the data such as adding up each quarter to get a yearly total for each line item on each statement.
I have tried two ways so far:
1) Storing each line item for each statement in its own table, i.e. Sales would be one table and have only sales data for all companies I am tracking, with company as the primary key and each quarter's data as a separate column. This seems like the easiest way to work with the data, but updating the data each quarter is time-consuming as there are hundreds of tables.
Sales Table
Company q32012 q22012 q12012
ABC Co. 500 100 202
XYZ Co. 230 302 202
2) The other option, which is a little easier to update but harder to work with, is to have a separate table for each company for each statement. For example, the income statement for Royal Bank would have its own table, with the primary column being the line item.
Income Statement for Royal Bank
Line_Item q32012 q22012 q12012
Sales
Net Profit
The problem here is that when I try to annualize this data, I get really ugly output because of the GROUP BY:
SELECT
(CASE WHEN Line_Item = 'Sales' THEN SUM(q4 + q3 + q2 + q1) ELSE '0' END) AS Sales2012,
(CASE WHEN Line_Item = 'NetProfit' THEN SUM(q4 + q3 + q2 + q1)
ELSE '0' END) AS Inventories2012
FROM dbo.[RoyalBankIncomeStatement]
GROUP BY Line_Item
Any help would be appreciated.

Whenever I've had to build a database for fiscal reports by fiscal quarter, month, or year or whatever, I've found it convenient to borrow a concept from star schema design and data warehousing, even if I'm not really building a DW.
The borrowed concept is to have a table, let's call it ALMANAC, that has one row for each date, keyed by the date. In this case a natural key works out well. Dependent attributes in here can be what fiscal month and quarter the date belongs to, whether the date was one where the enterprise was open for business (TRUE or FALSE), and whatever other quirks are in the company calendar.
Then, you need a computer program that just generates this table out of nothing. All the strange rules for the company calendar are embedded in this one program. The ALMANAC can cover a ten-year period in a little over 3,650 rows. That's a tiny table.
Now every date in the operational data can be used like a foreign key into the ALMANAC table, provided you consistently use the Date datatype for dates. For example, each sale has a date of sale. Then aggregating by fiscal quarter, or fiscal year, or whatever you like is just a matter of joining operational data with the ALMANAC, and using GROUP BY and SUM() to get the aggregate you want.
It's simple, and it makes generating a whole raft of time period reports a breeze.
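To make the ALMANAC idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (almanac, sales, fiscal_quarter and so on) are placeholders of mine, and I'm assuming the fiscal year simply matches the calendar year; a real generator would encode the company's actual calendar rules.

```python
import sqlite3
from datetime import date, timedelta


def build_almanac(conn, start, end):
    """Generate one row per calendar date, keyed by the date itself."""
    conn.execute(
        "CREATE TABLE almanac ("
        " day TEXT PRIMARY KEY,"
        " fiscal_year INTEGER NOT NULL,"
        " fiscal_quarter INTEGER NOT NULL)"
    )
    rows = []
    d = start
    while d <= end:
        # Assumption: fiscal year == calendar year. Real calendar quirks go here.
        rows.append((d.isoformat(), d.year, (d.month - 1) // 3 + 1))
        d += timedelta(days=1)
    conn.executemany("INSERT INTO almanac VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
build_almanac(conn, date(2012, 1, 1), date(2012, 12, 31))

# Hypothetical operational table: each sale just carries a date.
conn.execute(
    "CREATE TABLE sales (sale_date TEXT REFERENCES almanac(day), amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2012-01-15", 100.0), ("2012-03-31", 50.0), ("2012-04-01", 25.0)],
)

# Aggregating by fiscal quarter is a join plus GROUP BY and SUM().
quarterly_totals = conn.execute(
    "SELECT a.fiscal_year, a.fiscal_quarter, SUM(s.amount)"
    " FROM sales s JOIN almanac a ON a.day = s.sale_date"
    " GROUP BY a.fiscal_year, a.fiscal_quarter"
    " ORDER BY a.fiscal_year, a.fiscal_quarter"
).fetchall()
```

Switching the report to fiscal year or fiscal month only changes which ALMANAC column appears in the GROUP BY.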

My advice to you is to think about not using a SQL database to do this. Instead, think of using something like SQL Server Analysis Services (SSAS). If you want to get a quick start with SSAS, I recommend getting up to speed on PowerPivot for Excel. You can take the model you develop in PowerPivot and import it into SSAS when you're ready.
Why don't I recommend SQL? Because you're going to have a problem aggregating accounts in SQL Server. For example, your balance sheets aren't going to be something you can aggregate easily in SQL. Asking SQL Server for the "Cash" for 2010, for example, means that you want to get the entry for the end of December 2010, not that you want to SUM all of the entries for Cash for that year (which would be a nonsense number). On the other hand, with income and expense accounts, such as those which would appear on your income statements, you would want to SUM those values up. To make matters worse, some reports are going to have a mix of account types on them, which is going to make reporting quite difficult.
SSAS has provisions inside the product where it "knows" how to aggregate for your reports based on account type, and there are many tutorials out there which can show you how to set this up.
Either way, you're going to need to store your data somewhere before it goes into your reporting system or Analysis Services cube. In order to do that, you should structure your data something like this. Let's say you're storing your data in a table called Reports:
Reports
--------
[Effective Date]
[CompanyID]
[AccountID]
[Amount]
Your Account table would have the description of what you're trying to store (income, expenses, etc). Your [Effective Date] column would link back into a Dates table which describes to which year, quarter, etc., your data belongs. In essence, what I'm describing is a classic shape for reporting databases, called a star schema.
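For illustration only, here is a rough sketch of that Reports layout in SQLite (via Python's sqlite3), including the aggregation problem described above. The kind column on the account table and its 'flow'/'stock' values are my own assumption for telling summed accounts apart from closing-balance accounts; the figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (
    account_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    -- Assumption: 'flow' accounts are summed, 'stock' accounts use the closing value.
    kind TEXT NOT NULL CHECK (kind IN ('flow', 'stock'))
);
CREATE TABLE reports (
    effective_date TEXT NOT NULL,   -- quarter-end date in ISO format
    company_id INTEGER NOT NULL,
    account_id INTEGER NOT NULL REFERENCES accounts(account_id),
    amount REAL NOT NULL,
    PRIMARY KEY (effective_date, company_id, account_id)
);
INSERT INTO accounts VALUES (1, 'Sales', 'flow'), (2, 'Cash', 'stock');
INSERT INTO reports VALUES
    ('2010-03-31', 1, 1, 100), ('2010-06-30', 1, 1, 120),
    ('2010-09-30', 1, 1, 130), ('2010-12-31', 1, 1, 150),
    ('2010-03-31', 1, 2, 40),  ('2010-06-30', 1, 2, 55),
    ('2010-09-30', 1, 2, 50),  ('2010-12-31', 1, 2, 70);
""")

# Per company/account/year: the sum of all periods and the date of the last one.
# Flow accounts report the sum; stock accounts report the amount on the last date.
annual = conn.execute("""
    WITH yearly AS (
        SELECT company_id, account_id,
               substr(effective_date, 1, 4) AS yr,
               SUM(amount) AS total,
               MAX(effective_date) AS last_date
        FROM reports
        GROUP BY company_id, account_id, substr(effective_date, 1, 4)
    )
    SELECT a.name, y.yr,
           CASE a.kind WHEN 'flow' THEN y.total ELSE closing.amount END
    FROM yearly AS y
    JOIN accounts AS a ON a.account_id = y.account_id
    JOIN reports AS closing
      ON closing.company_id = y.company_id
     AND closing.account_id = y.account_id
     AND closing.effective_date = y.last_date
    ORDER BY a.name
""").fetchall()
```

Here Sales for 2010 comes out as the sum of the four quarters, while Cash comes out as the December figure. This is the per-account-type logic that SSAS handles for you; in plain SQL you have to carry it yourself.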

I would probably go with the following structure in one data table:
Company
StatementType
LineItem
FiscalYear
Q1, Q2, Q3, Q4
StatementType would be Income Statement, Balance Sheet or Cash Flow Statement. Line Item would be the coded/uncoded text of the item on the statement, Fiscal Year is 2012, 2011 and so on. You'd still need to make sure that Line Items are consistent across companies.
This structure would let you query for a flat statement:
select
LineItem, Q1, Q2, Q3, Q4
from Data
where
Company = 'Royal Bank'
and FiscalYear = 2012
and StatementType = 'Income Statement'
or, to compare the same quarter across years (QoQ):
select
FiscalYear,
Q1
from Data
where
Company = 'Royal Bank'
and
StatementType = 'Income Statement'
and
LineItem = 'Sales'
order by FiscalYear
in addition to aggregates. You'd probably want to have another table for line items with some kind of an index reference to make sure you can pull the statement back in the original order of line items.
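As a quick sketch of the aggregate case, here is the same single-table layout in SQLite via Python's sqlite3. The figures are made up, and the composite primary key is my assumption. With the four quarters on one row, the yearly total is plain column arithmetic and needs no GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Data ("
    " Company TEXT, StatementType TEXT, LineItem TEXT, FiscalYear INTEGER,"
    " Q1 REAL, Q2 REAL, Q3 REAL, Q4 REAL,"
    " PRIMARY KEY (Company, StatementType, LineItem, FiscalYear))"
)
# Made-up sample figures, for illustration only.
conn.executemany(
    "INSERT INTO Data VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("Royal Bank", "Income Statement", "Sales", 2011, 90, 95, 100, 105),
        ("Royal Bank", "Income Statement", "Sales", 2012, 100, 110, 120, 130),
        ("Royal Bank", "Income Statement", "Net Profit", 2012, 10, 12, 14, 16),
    ],
)

# One row per line item and fiscal year, with the four quarters added up.
annual = conn.execute(
    "SELECT LineItem, FiscalYear, Q1 + Q2 + Q3 + Q4 AS Total"
    " FROM Data"
    " WHERE Company = ? AND StatementType = 'Income Statement'"
    " ORDER BY LineItem, FiscalYear",
    ("Royal Bank",),
).fetchall()
```

Adding quarters like this only makes sense for income statement and cash flow items; for balance sheet items you would normally read Q4 on its own.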

Related

Filter PowerPivot based on multiple Date Criteria

I am trying to apply some Time Intelligence functions in my PowerPivot workbook concerning projects and money received for them. I have three relevant tables; Matters, Payments, and a Date Table.
Each matter has a creationDate and a closureDate (from a linked table). Likewise, each payment has a date. I have reporting set up decently, but am now trying to use Time Intelligence to filter this a bit more clearly.
How can I set a PowerPivot Pivot Table up so that the only Matters which show are those which existed within the period selected? E.g. if I select a slicer for 2014, I don't want to show a matter created in 2015, or one which was closed in 2013. The matter should have been active during the period specified.
Is this possible?
You want to show all the matters EXCEPT those where the CreationDate is after the upper limit of the date range you are looking at or the ClosureDate is before the lower limit of the date range you are looking at.
Assuming you have a data structure like this, where the left-hand table is the Matters and the right-hand one is the Payments:
If you have a calculated field called [Total Payments] that just adds up all the payments in the Payments table, a formula similar to this would work:
[Payment in Range]:=IF(OR(MIN(Matters[Creation Date])>MAX('Reporting Dates'[Date]),MAX(Matters[Closure Date])<MIN('Reporting Dates'[Date])),BLANK(),[Total Payments])
Here is the result with one month selected in the timeline:
Or with one year selected in the year slicer:
NOTE: in my example, I have used a disconnected date table.
Also, you will see that the Grand Total adds up all the payments because it takes the lowest of all the creation dates and the highest of all the closure dates to determine whether to show a total payment value. If it is important that the Grand Total shows correctly, then an additional measure is required:
[Fixed Totals Payment in Range]:=IF(COUNTROWS(VALUES(Matters[Matter]))=1,[Payment in Range],SUMX(VALUES(Matters[Matter]),[Payment in Range]))
Replace the [Payment in Range] in your pivot table with this new measure and the totals will show correctly, however, this will only work if Matters[Matter] is used as one of the fields in the pivot table.
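If it helps to see the logic outside of DAX, here is a small Python sketch of the same overlap test and of why the grand total needs the per-matter version. The matter names, dates and amounts are invented for the example.

```python
from datetime import date


def active_in_period(created, closed, period_start, period_end):
    """True unless the matter was created after the period or closed before it."""
    return not (created > period_end or closed < period_start)


# Hypothetical matters: name -> (creation date, closure date, total payments)
matters = {
    "A": (date(2013, 3, 1), date(2013, 11, 30), 500),
    "B": (date(2013, 6, 1), date(2014, 2, 15), 700),
    "C": (date(2015, 1, 10), date(2015, 9, 1), 900),
}

period = (date(2014, 1, 1), date(2014, 12, 31))  # e.g. the 2014 slicer

# Per-matter check, like [Payment in Range] evaluated on each pivot row.
payments_in_range = {
    name: total
    for name, (created, closed, total) in matters.items()
    if active_in_period(created, closed, *period)
}

# Summing the per-matter results, like the SUMX in [Fixed Totals Payment in Range].
grand_total = sum(payments_in_range.values())

# What the unfixed grand total does: one test on the earliest creation date and
# the latest closure date, which lets every matter's payments through.
earliest = min(created for created, _, _ in matters.values())
latest = max(closed for _, closed, _ in matters.values())
naive_total = (
    sum(total for _, _, total in matters.values())
    if active_in_period(earliest, latest, *period)
    else 0
)
```

Only matter B overlaps 2014, so the fixed total is 700, while the naive total adds up all three matters.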
Use filters & the calculate function.
So, if you're summing payments, it would look like this:
Payments 2014:= CALCULATE( SUM([Payments]), DateTable[Year]=2014)
SUM adds up all of the payments, and the filter argument to CALCULATE restricts that sum to payments within 2014, based on the data connected to your date table.

Finding Outliers In SQL

I am very new to SQL and have my data in an Access database (~50k rows) with the following structure
State Year Date Price
CA 2012 1/2/13 5.00
NY 2013 1/2/13 6.00
NY 2013 1/7/13 7.00
A (State, Year) pair, though held in different columns here, represents a vintage (like a wine). So we talk about how the price of "CA 2012" moves throughout the year.
Because some of our data is entered manually into this database, there is opportunity for error. We would like to write a query that flags any suspicious entries for further review.
I have read many different questions and threads on the subject but have not found anything that addresses my main concern, which is how to find local outliers: the price can move up and down, so a price that is fine for one date range may be an outlier earlier in the year.
Update: I chunked my data into buckets of months so finding local outliers might be easier as a result of that. I'm still looking for good outlier detection methods I can implement in SQL.
Sometimes simple is best. No need for an intro to statistics yet. I would recommend starting with simple grouping. Within a grouping query you can get the average, the minimum, the maximum, and other useful bits of data. Here are a couple of examples to get you started:
SELECT Table1.State, Table1.Yr, Count(Table1.Price) AS CountOfPrice, Min(Table1.Price) AS MinOfPrice, Max(Table1.Price) AS MaxOfPrice, Avg(Table1.Price) AS AvgOfPrice
FROM Table1
GROUP BY Table1.State, Table1.Yr;
Or (in case you want month data included)
SELECT Table1.State, Table1.Yr, Month([Dt]) AS Mnth, Count(Table1.Price) AS CountOfPrice, Min(Table1.Price) AS MinOfPrice, Max(Table1.Price) AS MaxOfPrice
FROM Table1
GROUP BY Table1.State, Table1.Yr, Month([Dt]);
Obviously you'll need to modify the table and field names. (Just so you know, though: 'Year' and 'Date' are both reserved words and are best not used for field names.)
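If you want to go one step past eyeballing the grouped output, one possible approach (my suggestion, not the only way) is to compare each price with the median of its (State, Yr, month) bucket and flag anything too far away. The sketch below uses Python with SQLite purely so it is runnable; the median is computed in Python for simplicity, and the 50% tolerance and the sample rows are arbitrary assumptions you would tune against your own data.

```python
import sqlite3
from collections import defaultdict
from statistics import median

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (State TEXT, Yr INTEGER, Dt TEXT, Price REAL)")
# Made-up rows; the 65.00 stands in for a mistyped 6.50.
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?, ?)",
    [
        ("NY", 2013, "2013-01-02", 6.00),
        ("NY", 2013, "2013-01-07", 7.00),
        ("NY", 2013, "2013-01-15", 6.50),
        ("NY", 2013, "2013-01-21", 65.00),
        ("NY", 2013, "2013-01-28", 6.20),
        ("CA", 2012, "2013-01-02", 5.00),
        ("CA", 2012, "2013-01-09", 5.20),
    ],
)

rows = conn.execute(
    "SELECT State, Yr, CAST(strftime('%m', Dt) AS INTEGER) AS Mnth, Dt, Price"
    " FROM Table1 ORDER BY State, Yr, Dt"
).fetchall()

# Bucket the entries by vintage and month so the comparison stays local.
buckets = defaultdict(list)
for state, yr, mnth, dt, price in rows:
    buckets[(state, yr, mnth)].append((dt, price))

TOLERANCE = 0.5  # assumption: flag prices more than 50% away from the bucket median

suspicious = []
for key, entries in buckets.items():
    mid = median(price for _, price in entries)
    for dt, price in entries:
        if abs(price - mid) > TOLERANCE * mid:
            suspicious.append((*key, dt, price))
```

The median is used rather than the average because a single bad entry drags the average of a small bucket toward itself, which can make the good entries look suspicious instead.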

MS Access query to convert values from one currency to another currency

Alright, I need a little assistance on a problem that I am facing.
I am working on a database project and have run into a problem regarding converting money values from a variety of different currencies into US Dollars.
The reason for my difficulty is that I need to maintain the original records in their original currency format but I also have to be able to convert these values to US Dollars, then perform a number of dynamic queries to sum up values of specific records and then output the final outcomes into a series of Reports.
I already have a table which contains all of my transactions (including a currency type field and several monetary value fields, 12 per record).
I have a second table which contains the reference list of currency types along with the necessary conversion rates over a 12-month period (so again 12 numeric fields), based on their relation to the US dollar (i.e. the entry for the US dollar would be followed by 12 fields all containing a value of 1 for a 1-to-1 exchange rate).
I would like to be able to run a query which copies the records from my transactions table to a new table after converting them all to their US dollar equivalent value. However, I am not an expert in writing such a query and would like some assistance. Is it possible to write a where clause into an expression within a query so that it takes each record from transactions, finds the correct conversion rate for the correct month, does the math, and outputs that same record with the modified values to another table?
Or is there a way to perform this same function using a VBA script? If so what kind of recommendations would you make for that code?
UPDATE OF PROGRESS/SOLUTION
So after reviewing the solutions and comments here is the solution I came up with.
I built my exchange rates table (ExRates) in the format I had intended: CurrencyName, followed by the conversion rate for each of the 12 months (this is due to having to work with existing database elements).
I then built the following two queries, Match (saved as qryExRatematch) and Convert:
SELECT ForcastTrans.*, ExRates.JanRate, ExRates.FebRate, ExRates.MarRate, ExRates.AprRate, ExRates.MayRate, ExRates.JunRate, ExRates.JulRate, ExRates.AugRate, ExRates.SepRate, ExRates.OctRate, ExRates.NovRate, ExRates.DecRate
FROM ForcastTrans, ExRates
WHERE ForcastTrans.Currency=ExRates.CurrencyName;
SELECT qryExRatematch.EntityID, qryExRatematch.Account, qryExRatematch.Currency,
[qryExRatematch]![Month1]*[qryExRatematch]![JanRate] AS Jan,
[qryExRatematch]![Month2]*[qryExRatematch]![FebRate] AS Feb,
[qryExRatematch]![Month3]*[qryExRatematch]![MarRate] AS Mar,
[qryExRatematch]![Month4]*[qryExRatematch]![AprRate] AS Apr,
[qryExRatematch]![Month5]*[qryExRatematch]![MayRate] AS May,
[qryExRatematch]![Month6]*[qryExRatematch]![JunRate] AS Jun,
[qryExRatematch]![Month7]*[qryExRatematch]![JulRate] AS Jul,
[qryExRatematch]![Month8]*[qryExRatematch]![AugRate] AS Aug,
[qryExRatematch]![Month9]*[qryExRatematch]![SepRate] AS Sep,
[qryExRatematch]![Month10]*[qryExRatematch]![OctRate] AS Oct,
[qryExRatematch]![Month11]*[qryExRatematch]![NovRate] AS Nov,
[qryExRatematch]![Month12]*[qryExRatematch]![DecRate] AS [Dec]
FROM qryExRatematch
ORDER BY qryExRatematch.EntityID, qryExRatematch.Account, qryExRatematch.Currency;
These got me the conversions that I needed and I can reconnect my reporting queries to these tables instead of the original ones I had done without the conversion.
Thank you everyone for your help, suggestions, and opinions. I credit Johnny Bones with this answer because his answer led me to the line of experimentation that helped me reach my solution.
Thanks again for all your help
Are your table layouts set in stone? The easiest way to do this is to set up your currency table with 3 fields:
CurrencyDate - The date of the new currency exchange rate
CurrencyName - The name of the currency (Yen, Pound, Franc, etc...)
CurrencyRate - The exchange rate on that day
Then you would set up a query called qryCurrentExchange where you would take the Max(CurrencyDate) for each currency. This will give you one query that holds the current exchange rate for each currency.
Create another query with your transaction table, and Inner Join the above query by the CurrencyName, and you should be able to pull in the exchange rate, which you would multiply by your currency field in your transaction table. You can either leave the query as-is or turn it into a Make Table query if you want to output the results to a table.
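Here is a rough sketch of that layout, run against SQLite from Python so it can be tried without Access. The Transactions table is a simplified stand-in of mine with a single amount per record, the rates are invented, and I'm assuming CurrencyRate means US dollars per unit of the currency. The inner subquery plays the role of qryCurrentExchange.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ExRates (
    CurrencyDate TEXT NOT NULL,
    CurrencyName TEXT NOT NULL,
    CurrencyRate REAL NOT NULL,   -- assumption: US dollars per unit of currency
    PRIMARY KEY (CurrencyName, CurrencyDate)
);
CREATE TABLE Transactions (      -- hypothetical, simplified to one amount per row
    TransID INTEGER PRIMARY KEY,
    Currency TEXT NOT NULL,
    Amount REAL NOT NULL
);
INSERT INTO ExRates VALUES
    ('2013-01-01', 'US Dollar', 1.0),
    ('2013-01-01', 'Pound', 1.60),
    ('2013-02-01', 'Pound', 1.50);
INSERT INTO Transactions VALUES (1, 'US Dollar', 100.0), (2, 'Pound', 200.0);
""")

# Latest rate per currency (the qryCurrentExchange step), joined to the
# transactions by currency name. The original amounts stay untouched.
converted = conn.execute("""
    SELECT t.TransID, t.Currency, t.Amount,
           t.Amount * r.CurrencyRate AS AmountUSD
    FROM Transactions AS t
    INNER JOIN ExRates AS r ON r.CurrencyName = t.Currency
    INNER JOIN (
        SELECT CurrencyName, MAX(CurrencyDate) AS LatestDate
        FROM ExRates
        GROUP BY CurrencyName
    ) AS cur
      ON cur.CurrencyName = r.CurrencyName
     AND cur.LatestDate = r.CurrencyDate
    ORDER BY t.TransID
""").fetchall()
```

If you need the rate that applied in each transaction's own month rather than the current one, the same table works: join on the rate date that matches the transaction month instead of on the maximum date.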

Best practice for keeping historical data in SQL (for SSAS Cube use)

I am working on a hotel DB, and the booking table changes a lot since people book and cancel reservations all the time. I'm trying to find the best way to convert the booking table to a fact table in SSAS. I want to be able to get the right statistics from it.
For example: if a client X booked a room on Sep 20th for Dec 20th and canceled the order on Oct 20th. If I run the cube on the month of September (run it in Nov) and I want to see how many rooms got booked in the month of Sep, the order X made should be counted in the sum.
However, if I run the cube for YTD calculation (run it in Nov), the order shouldn't be counted in the sum.
I was thinking about inserting the updates into the same fact table every night and, in addition to the booking number (unique key), adding a revision column to the table. So going back to the example, let's say client X's booking number is 1234. The first time I enter it into the table it will get revision 0; in Oct, when I add the cancellation record, it will get revision 1 (with a timestamp on the row, of course).
Now, if I want to look at any period of time, I can select it by the timestamp and look at the MAX(revision).
Does it make sense? Any ideas?
NOTE: I gave the example of cancelling the order, but we want to track other statistics as well.
Another option I read about is partitioning the cubes, but do I partition the entire table? I want to be able to add changes every night. Will I need to partition the entire table every night? It's a huge table.
One way to handle this is to insert records in your fact table for bookings and cancellations. You don't need to look at the max(revision) - cubes are all about aggregation.
If your table looks like this:
booking number, date, rooms booked
You can enter data like this:
00001, 9/10, 1
00002, 9/12, 1
00001, 10/5, -1
Then your YTDs will always have information accurate as of whatever month you're looking at. Simply sum up the booked rooms.
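A tiny sketch of this, using SQLite from Python. The table and column names are placeholders, and since the sample rows above only give month and day, the year 2013 is an assumption of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_booking ("
    " booking_number TEXT, event_date TEXT, rooms_booked INTEGER)"
)
# The three sample rows: two bookings in September, one cancellation in October.
conn.executemany(
    "INSERT INTO fact_booking VALUES (?, ?, ?)",
    [
        ("00001", "2013-09-10", 1),
        ("00002", "2013-09-12", 1),
        ("00001", "2013-10-05", -1),
    ],
)


def rooms_booked(conn, start, end):
    """Net rooms booked between two dates (inclusive), cancellations included."""
    row = conn.execute(
        "SELECT COALESCE(SUM(rooms_booked), 0) FROM fact_booking"
        " WHERE event_date BETWEEN ? AND ?",
        (start, end),
    ).fetchone()
    return row[0]


september = rooms_booked(conn, "2013-09-01", "2013-09-30")
ytd_november = rooms_booked(conn, "2013-01-01", "2013-11-30")
```

September on its own still counts booking 00001 (two rooms), while the year-to-date figure as of November nets the cancellation out (one room). No revision lookup is needed because the rows are only ever summed.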

SQL query question

I'm trying to do something in a query that I've never done before. It probably requires variables, but I've never used them, and I'm not sure that it does.
What I want is to get a list of sales, grouped first by affiliate, then by its month.
I can do that, but here's the twist... I don't want the month, but month 1, month 2, month 3...
And those aren't Jan, Feb, March, but the number of months since the day of first sale.
Is this possible in a query at all, or do I need to do this in my code?
Oh, MySQL 5.1.something...
Sure, just write an expression in SQL that generates the number of months since the first sale. (Do you mean the first sale for that affiliate? If so, you'll need a subquery.)
And since you say you want a list of sales, I assume you don't really want to "Group By" affiliate and month count; you just want to sort, or Order By, those values.
If you wanted the average sales amount, or the count of sales, or some other aggregate function of sales data, then you would be doing a "Group By"...
And I don't think you need to worry about sorting by the number of months; you can simply sort by the difference between each sales date and the earliest sale date for each affiliate. (If you wanted to apply a third sorting rule, after the sales date sort, then you would need to be more careful.)
Select * From Sales S
Order By Affiliate,
SalesDate - (Select Min(SalesDate)
From Sales
Where Affiliate = S.Affiliate)
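Or, if you really need it to be by the difference in months (subtracting Month() values alone goes wrong once the sales span a year boundary, so compare year and month together):
Select * From Sales S
Order By Affiliate,
Period_Diff(Extract(Year_Month From SalesDate),
(Select Extract(Year_Month From Min(SalesDate))
From Sales
Where Affiliate = S.Affiliate))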
This is possible in standard SQL if you use what I like to call "SQL gymnastics". It can be done with subqueries.
But it looks incredibly ugly, is hard to maintain and it's really not worth it. You're far better off using one of the many programming languages that wrap SQL (such as PL/SQL) or even a general purpose language that can call SQL (such as Python).
The result will be in two languages but will be all the more understandable than the same thing written in just SQL.
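For what it's worth, here is a hedged sketch of the grouped version called from Python. It runs against SQLite rather than MySQL so that it is self-contained, which means the month arithmetic uses strftime instead of MySQL's date functions. The Sales columns and sample rows are invented, and "month 1" is taken to mean the calendar month of the affiliate's first sale.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Affiliate TEXT, SalesDate TEXT, Amount REAL)")
# Made-up sales; acme's sales cross a year boundary on purpose.
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        ("acme", "2012-11-20", 10.0),
        ("acme", "2012-12-05", 20.0),
        ("acme", "2013-01-15", 30.0),
        ("zeta", "2013-03-01", 5.0),
        ("zeta", "2013-03-28", 7.0),
    ],
)

# Month number = calendar months between the sale and the affiliate's first
# sale, plus one. year * 12 + month keeps it correct across year boundaries.
monthly = conn.execute("""
    SELECT s.Affiliate,
           (CAST(strftime('%Y', s.SalesDate) AS INTEGER) * 12
            + CAST(strftime('%m', s.SalesDate) AS INTEGER))
         - (CAST(strftime('%Y', f.FirstSale) AS INTEGER) * 12
            + CAST(strftime('%m', f.FirstSale) AS INTEGER))
         + 1 AS MonthNo,
           COUNT(*) AS SalesCount,
           SUM(s.Amount) AS Total
    FROM Sales AS s
    JOIN (SELECT Affiliate, MIN(SalesDate) AS FirstSale
          FROM Sales
          GROUP BY Affiliate) AS f
      ON f.Affiliate = s.Affiliate
    GROUP BY s.Affiliate, MonthNo
    ORDER BY s.Affiliate, MonthNo
""").fetchall()
```

Joining to a derived table of first-sale dates does the work of the correlated subquery, and the month number becomes an ordinary column you can group and sort on.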