Newb help in designing query to subtract results of two queries in same table - sql

I have seen other questions like this one but feel mine is a bit different, or didn't quite understand the SQL in the other questions...so my apologies if this one is redundant or very easy..
Anyway, I have an accounting transaction DB that stores every transaction posting within our financial system on one line. What I am trying to do is net the sum of the debits and the credits for each GL account.
Here are the two basic queries I am executing to get the results that I would like to net.
Query 1 gives me the sum of all debit transactions posting to each gl account:
Select gl_debit, sum (amt) from FISC_YEAR2014 where fund = 'XXX'
group by gl_debit
Query 2 gives me the sum of all credit transactions posting to each gl account:
select gl_credit, sum (amt) from FISC_YEAR2014 where fund = 'XXX'
group by gl_credit
Now I would like to subtract the credit amounts from the debit amounts to get net totals for each GL account. Make sense?
Thanks.

There are two ways to do this depending on your table definition. I think your situation is the first.
This is the normal way assuming credits and debits are in separate columns:
SELECT sum(gl_debit)-sum(gl_credit) as net_debit
FROM FISC_YEAR2014
WHERE fund = 'XXX'
This is the other way assuming direction is indicated by a separate column:
SELECT SUM(IF(is_debit=1,amount,-1*amount)) as net_debit
FROM FISC_YEAR2014
WHERE fund = 'XXX'
See also:
MySQL 'IF' in 'SELECT' statement
Can't calculate totals in general ledger report
What's a good way to store a financial ledger?

I believe this is what you need:
select
gl_account,
sum(amt)
from
(
select gl_debit gl_account,
sum(-amt) amt
from fisc_year2014
where fund = 'XXX'
group by gl_debit
union all
select gl_credit,
sum(amt)
from fisc_year2014
where fund = 'XXX'
group by gl_credit
)
group by
gl_account
There are two SELECTs: one to get the (negative) debits and another to get the credits. They are UNIONed to create a two-column result. The outer SELECT then aggregates the total sum by the gl_account code. If there is a mismatch (a gl_debit without a gl_credit, or vice-versa), then its amount would still be displayed.
SQLFiddle here (I added another row to show the effect of mismatched IDs)
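The UNION ALL approach above can be tried end to end with a small sketch. This one uses Python's built-in sqlite3 module; the sample rows are made up, and the signs are flipped relative to the query above so the result is debits minus credits, as the question asks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fisc_year2014 "
    "(fund TEXT, gl_debit INTEGER, gl_credit INTEGER, amt INTEGER)"
)
# Hypothetical postings: each row debits one GL account and credits another.
conn.executemany(
    "INSERT INTO fisc_year2014 VALUES (?, ?, ?, ?)",
    [
        ("XXX", 101, 201, 100),
        ("XXX", 101, 301, 50),
        ("XXX", 201, 101, 30),
        ("YYY", 101, 201, 999),  # other fund, filtered out by the WHERE clause
    ],
)

# Stack debits (positive) and credits (negative) into one column, then sum per account.
net_by_account = conn.execute(
    """
    SELECT gl_account, SUM(amt) AS net
    FROM (
        SELECT gl_debit AS gl_account, amt FROM fisc_year2014 WHERE fund = 'XXX'
        UNION ALL
        SELECT gl_credit, -amt FROM fisc_year2014 WHERE fund = 'XXX'
    ) AS postings
    GROUP BY gl_account
    ORDER BY gl_account
    """
).fetchall()

for account, net in net_by_account:
    print(account, net)
```

Because every posting appears once as a debit and once as a credit, the nets across all accounts in a fund should add up to zero, which makes a handy sanity check.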

To do this you should SUM the debits and credits separately in subqueries, then join those subqueries on gl_credit = gl_debit.
SELECT COALESCE(gl_credit, gl_debit) AS Id
,COALESCE(d.amt,0)-COALESCE(c.amt,0) AS Net
FROM (
SELECT gl_debit, SUM(amt) AS amt
FROM FISC_YEAR2014
GROUP BY gl_debit
) d
FULL OUTER JOIN (
SELECT gl_credit, SUM(amt) AS amt
FROM FISC_YEAR2014
GROUP BY gl_credit
) c ON d.gl_debit = c.gl_credit
ORDER BY COALESCE(gl_credit, gl_debit)
SQLFiddle
Outputs:
ID Net
-----------
101 -475
201 225
301 500
501 -250
If I were you, rather than using a FULL OUTER JOIN, I'd select the IDs from the accounts table (or wherever you store them) and then LEFT JOIN both of the subqueries to it. You haven't shown any other tables, though, so I can only speculate.
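A minimal sketch of that LEFT JOIN variant, using Python's sqlite3 module. The gl_accounts table and all sample rows are assumptions for illustration; the question never shows an accounts table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical chart of accounts; not part of the original question.
    CREATE TABLE gl_accounts (gl_account INTEGER PRIMARY KEY);
    CREATE TABLE fisc_year2014 (gl_debit INTEGER, gl_credit INTEGER, amt INTEGER);
    INSERT INTO gl_accounts VALUES (101), (201), (301), (401);
    INSERT INTO fisc_year2014 VALUES (101, 201, 100), (101, 301, 50), (201, 101, 30);
    """
)

# Start from the account list so every account appears once,
# then attach the pre-summed debits and credits.
rows = conn.execute(
    """
    SELECT a.gl_account,
           COALESCE(d.amt, 0) - COALESCE(c.amt, 0) AS net
    FROM gl_accounts AS a
    LEFT JOIN (SELECT gl_debit, SUM(amt) AS amt
               FROM fisc_year2014 GROUP BY gl_debit) AS d
           ON d.gl_debit = a.gl_account
    LEFT JOIN (SELECT gl_credit, SUM(amt) AS amt
               FROM fisc_year2014 GROUP BY gl_credit) AS c
           ON c.gl_credit = a.gl_account
    ORDER BY a.gl_account
    """
).fetchall()

for account, net in rows:
    print(account, net)
```

Account 401 has no postings at all and still shows up with a net of 0, which a FULL OUTER JOIN of the two subqueries would not give you.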

Related

SQL Server 2008 : running totals and null records

I have several queries that get account balances from our ERP, but there are several issues I am trying to work around and I am curious if there are better ways or if more recent versions of SQL Server have functions to address any of these problems.
Our ERP generates a balance record only in periods where there is activity associated with the account. The ERP applications and reports summarize values by period but no record is added to the database so custom processes that need a balance by period require a query/view to calculate this info.
My workaround for this has been to use a global variable to intentionally create duplicates from the Account table and the pseudo period table I created, see below
Our Account Period table does not contain a period index. (I suppose it should be the Row ID; however, at some point a fiscal period was added incorrectly and the index was thrown out of order. I have been advised by the ERP provider not to update this without a full reimplementation.) I created a workaround table for this.
So I have several queries that work around these issues but they run slowly with just a handful of accounts so a full pseudo table for account balances has not been practical with my methods at least.
I have included an example below for calculating the balance by period for accounts that are not summarized to retained earnings annually (assets, liabilities, equity)
SELECT
ID AS ACCOUNT_ID, ind.Month_Index, ind.Period,
(SELECT
ISNULL(SUM(CASE WHEN A3.TYPE IN ('e','r') THEN NULL
WHEN A3.TYPE = 'a' THEN ISNULL(AB3.DEBIT_AMOUNT, 0) - ISNULL(AB3.CREDIT_AMOUNT, 0)
ELSE ISNULL(AB3.CREDIT_AMOUNT, 0) - ISNULL(AB3.DEBIT_AMOUNT, 0) END), 0)
FROM ACCOUNT_BALANCE AS AB3
LEFT OUTER JOIN ACCOUNT AS A3 ON AB3.ACCOUNT_ID = A3.ID
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind2 ON AB3.ACCT_YEAR = ind2.YEAR AND AB3.ACCT_PERIOD = ind2.Month_Num
WHERE A.ID = AB3.ACCOUNT_ID
AND A3.CURRENCY_ID = '(USA) $'
AND ind2.Month_Index <= ind.Month_Index) AS BALANCE_AQL
FROM
ACCOUNT AS A
LEFT OUTER JOIN
ACCOUNT_PERIOD AS per ON 'UCC' = per.SITE_ID
LEFT OUTER JOIN
ACCOUNT_BALANCE AS AB ON A.ID = AB.ACCOUNT_ID
AND per.ACCT_YEAR = AB.ACCT_YEAR
AND per.ACCT_PERIOD = AB.ACCT_PERIOD
AND AB.CURRENCY_ID = '(USA) $'
LEFT OUTER JOIN
(SELECT YEAR, Month_Num, Month_Index, Period
FROM UFC_Calander
GROUP BY YEAR, Month_Num, Month_Index, Period) AS ind ON per.ACCT_YEAR = ind.YEAR AND per.ACCT_PERIOD = ind.Month_Num
WHERE
ID IN ('120-1140-0000', '120-1190-1190', '120-1190-1193',
'120-1190-1194', '210-2100-0000', '210-2101-0000')
GROUP BY
ID, ind.Month_Index, ind.Period
ORDER BY
ind.Month_Index DESC, ACCOUNT_ID DESC
Any suggestions that might improve the performance of this query will be greatly appreciated.
My high level recommendations are the following:
Avoid using the IN clause. If possible (assuming the account table isn't too big), create a temporary table with only the columns you need, load it with the IDs you are working with, and use that in your code above.
Not a performance point, more of a small simplification: the ISNULL(SUM(...)) wrapper is only needed because of "A3.TYPE IN ('e', 'r') THEN NULL". If you had written THEN 0, you could avoid the null check.
A correlated subquery within the SELECT is okay, but it's a multi-part join that is most likely causing the slowdown. I'm not 100% confident how you could break this apart into two separate logical grabs of data that are then joined back together, but it's the best I've got with what I'm seeing here.
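To make the shape of the problem concrete, here is a stripped-down sketch of a balance-by-period query, run through Python's sqlite3 module rather than SQL Server. The table and column names are simplified stand-ins for the ERP tables, and the rows are invented. One caveat on the null check: SUM over zero rows returns NULL, so for periods before an account's first activity a COALESCE (ISNULL in SQL Server) is still needed, whatever the CASE branches return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified, hypothetical stand-ins for the ERP tables.
    CREATE TABLE account (id TEXT PRIMARY KEY);
    CREATE TABLE period (month_index INTEGER PRIMARY KEY);
    CREATE TABLE account_balance (
        account_id TEXT, month_index INTEGER,
        debit_amount INTEGER, credit_amount INTEGER
    );
    INSERT INTO account VALUES ('A1'), ('A2');
    INSERT INTO period VALUES (1), (2), (3);
    -- Balance rows exist only for periods with activity.
    INSERT INTO account_balance VALUES
        ('A1', 1, 100, 0), ('A1', 3, 0, 40), ('A2', 2, 0, 25);
    """
)

# CROSS JOIN builds one row per account per period; the correlated
# subquery sums all activity up to and including that period.
balances = conn.execute(
    """
    SELECT a.id, p.month_index,
           (SELECT COALESCE(SUM(ab.debit_amount - ab.credit_amount), 0)
            FROM account_balance AS ab
            WHERE ab.account_id = a.id
              AND ab.month_index <= p.month_index) AS balance
    FROM account AS a
    CROSS JOIN period AS p
    ORDER BY a.id, p.month_index
    """
).fetchall()

for row in balances:
    print(row)
```

Periods without activity (A1 in period 2) carry the previous balance forward, and A2 in period 1 falls back to 0 only because of the COALESCE.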

SQL joining two tables with different levels of details

So I have two tables of sales, budget and actual.
"budget" has two columns: location and sales. For example,
location sales
24 $20000
36 $100300
40 $24700
Total $145000
"actual" has three columns: invoice_number, location, and sales. For example,
invoice location sales
10000 36 $5000
10001 40 $6000
10002 99 $7000
and so forth
Total $110000
In summary, "actual" records transactions at the invoice level, whereas "budget" is done at the location level only (no individual invoices).
I'm trying to create a summary table that lists actual and budget sales side by side, grouped by location. The total of the actual column should be $110000, and $145000 for budget. This is my attempt at it (on pgAdmin/ postgresql):
SELECT actual.location, SUM(actual.sales) AS actual_sales, SUM(budget.sales) AS budget_sales
FROM actual LEFT JOIN budget
ON actual.location = budget.location
GROUP BY actual.location;
I used LEFT JOIN because "actual" has locations that "budget" doesn't have (e.g. location 99).
I ended up with some gigantic numbers ($millions) on both the actual_sales and budget_sales columns, far exceeding the total actual ($110000) or budget sales ($145,000).
Is this because the way I wrote my query is basically asking SQL to join each invoice in "actual" to each line in "budget," therefore duplicating things many times over? If so how should I have written this?
Thanks in advance!
Based on your description, you seem to have duplicates in both tables. There are various ways to solve this problem. Here is one using union all and group by:
select Location,
sum(actual_sales) as actual_sales,
sum(budget_sales) as budget_sales
from ((select a.location, a.sales as actual_sales, null as budget_sales
from actual a
) union all
(select b.location, null, b.sales
from budget b
)
) ab
group by location;
This structure guarantees that each value is counted only once, regardless of the table.
The query looks fine to me. However, it is difficult to find out why the figures are wrong. My suggestion is that you do the sum by location separately for budget and actual into 2 temporary tables, and later put them together using LEFT JOIN.
Yes, you're joining the budget in once for each actual sales row. However, your Actual Sales sum shouldn't have been larger unless there were multiple budget rows for the same location. You should check for that, because it doesn't sound like there should be.
What you need to do in a case like this is sum the actual sales first in a CTE or subquery, then later join the result to the budget. That way you only have one row for each location. This does it for the actual sales. If you really do have more than one row for a location for budget as well, you might need to subquery the budget as well the same way.
Select Act.Location, Act.actual_sales, budget.sales as budget_sales
From
(
SELECT actual.location, SUM(actual.sales) AS actual_sales
FROM actual
GROUP BY actual.location
) Act
left join budget on Act.location = budget.location
Gordon's suggestion is good; an alternative using WITH statements is:
WITH aloc AS (
SELECT location, SUM(sales) FROM actual GROUP BY 1
), bloc AS (
SELECT location, SUM(sales) FROM budget GROUP BY 1
)
SELECT location, a.sum AS actual_sales, b.sum AS budget_sales
FROM aloc a LEFT JOIN bloc b USING (location)
This is equivalent to:
SELECT location, a.sum AS actual_sales, b.sum AS budget_sales
FROM (SELECT location, SUM(sales) FROM actual GROUP BY 1) a LEFT JOIN
(SELECT location, SUM(sales) FROM budget GROUP BY 1) b USING (location)
but I find WITH statements more readable.
The purpose of the subqueries is to get tables into a state where a row means something relevant, i.e. aloc contains a row per location, and hence cause the join to evaluate to what you want.
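The difference between joining first and aggregating first is easy to see on a few rows. This sketch uses Python's sqlite3 module with invented sample data (the real tables obviously hold different figures) and runs both the original query and the pre-aggregated version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE budget (location INTEGER, sales INTEGER);
    CREATE TABLE actual (invoice_number INTEGER, location INTEGER, sales INTEGER);
    -- Made-up sample rows.
    INSERT INTO budget VALUES (36, 1000), (40, 500);
    INSERT INTO actual VALUES
        (10000, 36, 100), (10001, 36, 200), (10002, 40, 50), (10003, 99, 70);
    """
)

# Join first, aggregate second: each budget row repeats once per invoice.
naive = conn.execute(
    """
    SELECT a.location, SUM(a.sales), SUM(b.sales)
    FROM actual AS a LEFT JOIN budget AS b ON a.location = b.location
    GROUP BY a.location ORDER BY a.location
    """
).fetchall()

# Aggregate first, join second: one row per location on each side.
fixed = conn.execute(
    """
    WITH aloc AS (SELECT location, SUM(sales) AS total FROM actual GROUP BY location),
         bloc AS (SELECT location, SUM(sales) AS total FROM budget GROUP BY location)
    SELECT a.location, a.total, b.total
    FROM aloc AS a LEFT JOIN bloc AS b ON a.location = b.location
    ORDER BY a.location
    """
).fetchall()

print("naive:", naive)
print("fixed:", fixed)
```

In the naive result the budget for location 36 is doubled because that location has two invoices; the actual figures only inflate if budget itself has several rows per location.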

Using SUM() with multiple tables in sql

I am trying to write a query that uses the SUM function to add up all values in one column and then divides by the count of tuples in another table. For some reason, when I run the sum query by itself I get the correct number back, but when I use it in my query below the value is wrong.
This is what I'm trying to do, but the numbers are coming out wrong.
select (sum(adonated) / count(p.pid)) as "Amount donated per Child"
from tsponsors s, player p;
I found out the issue is in the sum. The query below returns 650,000 when it should return 25000:
select (sum(adonated)) as "Amount donated per Child"
from tsponsors s, player p;
If I remove the from player p it gets the correct amount. However, I need the player table to get the number of players.
I have 3 tables that are related to this query.
player(pid, tid(fk))
team(tid)
tsponsors(tid(fk), adonated, sid(fk)) this is a joining table
What I want to get is the sum of all the amounts donated to each team, sum(adonated), divided by the number of players in the database, count(pid).
I guess your sponsors are giving amounts to teams. You then want to know the proportion of donations per child in the sponsored team.
You would then need something like this:
SELECT p.tid,(SUM(COALESCE(s.adonated,0)) / COUNT(p.pid)) AS "Amount donated per Child"
FROM player p
LEFT OUTER JOIN tsponsors s ON s.tid=p.tid
GROUP BY p.tid
I also used a LEFT OUTER JOIN in order to show 0$ if a team has no sponsors.
Try
select sum(s.adonated) / (SELECT count(p.pid) FROM player p)
as "Amount donated per Child"
from tsponsors s;
Your original query joins 2 tables without any condition, which results in cross join.
UPDATE
SELECT ts.tid, SUM(ts.adonated),num_plyr
FROM tsponsors ts
INNER JOIN
(
SELECT tid, COUNT(pid) as num_plyr
FROM player
GROUP BY tid
)a ON (a.tid = ts.tid)
GROUP BY ts.tid,num_plyr
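To see the cross join effect in isolation, here is a small sketch with Python's sqlite3 module. The rows are invented: two sponsorships totalling 300 and three players.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE player (pid INTEGER, tid INTEGER);
    CREATE TABLE tsponsors (tid INTEGER, adonated INTEGER, sid INTEGER);
    -- Made-up sample rows.
    INSERT INTO player VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO tsponsors VALUES (1, 100, 1), (2, 200, 2);
    """
)

# Comma join with no condition: every sponsor row is paired with every player,
# so the sum is multiplied by the number of players.
inflated = conn.execute(
    "SELECT SUM(adonated) FROM tsponsors s, player p"
).fetchone()[0]

# Scalar subquery: the player count is computed on its own, no row multiplication.
# The 1.0 * factor avoids integer division in SQLite.
per_child = conn.execute(
    """
    SELECT 1.0 * SUM(s.adonated) / (SELECT COUNT(p.pid) FROM player p)
    FROM tsponsors s
    """
).fetchone()[0]

print(inflated, per_child)
```

The inflated total is the true total times the player count, which matches the symptom in the question: 650,000 is 25,000 multiplied by 26.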

MySQL - Max() return wrong result

I tried this query on MySQL server (5.1.41)...
SELECT max(volume), dateofclose, symbol, volume, close, market FROM daily group by market
I got this result:
max(volume) dateofclose symbol volume close market
287031500 2010-07-20 AA.P 500 66.41 AMEX
242233000 2010-07-20 AACC 16200 3.98 NASDAQ
1073538000 2010-07-20 A 4361000 27.52 NYSE
2147483647 2010-07-20 AAAE.OB 400 0.01 OTCBB
437462400 2010-07-20 AAB.TO 31400 0.37 TSX
61106320 2010-07-20 AA.V 0 0.24 TSXV
As you can see, the maximum volume is VERY different from the 'real' value of the volume column?!?
The volume column is defined as int(11) and I have 2 million rows in this table, but that's very far from the maximum of MyISAM storage, so I cannot believe this is the problem!? What is also strange is that the data shown all come from the same date (dateofclose). If I force a specific date with a WHERE clause, the same symbol comes out with a different max(volume) result. This is pretty weird...
Need some help here!
UPDATE :
Here's my edited "working" request:
SELECT a.* FROM daily a
INNER JOIN (
SELECT market, MAX(volume) AS max_volume
FROM daily
WHERE dateofclose = '20101108'
GROUP BY market
) b ON
a.market = b.market AND
a.volume = b.max_volume
So this gives me, by market, the stock with the highest volume (for Nov 8, 2010).
"As you can see, the maximum volume is VERY different from the 'real' value of the volume column?!?"
This is because MySQL rather bizarrely doesn't GROUP things in a sensical way.
Selecting MAX(column) will get you the maximum value for that column, but selecting other columns (or column itself) will not necessarily select the entire row that the found MAX() value is in. You essentially get an arbitrary (and usually useless) row back.
Here's a thread with some workarounds using subqueries:
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?
This is a subset of the "greatest n per group" problem. (There is a tag with that name but I am a new user so I can't retag).
This is usually best handled with an analytic function, but can also be written with a join to a sub-query using the same table. In the sub-query you identify the max value, then join to the original table on the keys to find the row that matches the max.
Assuming that {dateofclose, symbol, market} is the grain at which you want the maximum volume, try:
select
a.*, b.max_volume
from daily a
join
(
select
dateofclose, symbol, market, max(volume) as max_volume
from daily
group by
dateofclose, symbol, market
) b
on
a.dateofclose = b.dateofclose
and a.symbol = b.symbol
and a.market = b.market
Also see this post for reference.
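For a quick check of the join-to-subquery idea, here is a sketch using Python's sqlite3 module at the per-market grain used in the question's update. The rows are invented and the table is cut down to three columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE daily (symbol TEXT, market TEXT, volume INTEGER);
    -- Made-up sample rows.
    INSERT INTO daily VALUES
        ('A', 'NYSE', 4361000), ('B', 'NYSE', 9000000),
        ('AACC', 'NASDAQ', 16200), ('MSFT', 'NASDAQ', 5000);
    """
)

# The subquery finds the max volume per market; the join pulls back
# the complete row(s) that actually carry that volume.
top_rows = conn.execute(
    """
    SELECT a.market, a.symbol, a.volume
    FROM daily AS a
    JOIN (SELECT market, MAX(volume) AS max_volume
          FROM daily GROUP BY market) AS b
      ON a.market = b.market AND a.volume = b.max_volume
    ORDER BY a.market
    """
).fetchall()

for row in top_rows:
    print(row)
```

If two symbols tie for the highest volume in a market, both rows come back, so add a tie-breaker if you need exactly one row per market.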
Did you try adjusting your query to include Symbol in the grouping?
SELECT max(volume), dateofclose, symbol,
volume, close, market FROM daily group by market, symbol

SUM() Problem in Paradox Database

I'm currently working with a paradox database that was implemented before I started working at my current job at an insurance firm.
Long story short: when I try to compile a query of the debit/credit balances of all the active clients, it gives me a different balance per client than if I query each client individually. With a client base of 100K and over 2 million transactions, querying clients one by one isn't viable. So here is what I do for an individual client:
Code:
SELECT COUNT(Debit) as NumberOfDebits
, COUNT(Credit) as NumberOfCredit
, SUM(Debit) as DebitTotal
, SUM(Credit) as CreditTotal
FROM MemberTransactions
WHERE MemberID = '####000094';
As I mentioned above, this gives the right balances for the member, but if I do the following:
SELECT MemberID
, COUNT(Debit) as NumberOfDebits
, COUNT(Credit) as NumberOfCredit
, SUM(Debit) as DebitTotal
, SUM(Credit) as CreditTotal
FROM MemberTransactions
GROUP BY MemberID;
It gives me both a different count and sum results for most the members in the table.
Here is the table structure so you can understand what I have to work with and what I want to accomplish. Every row is a single transaction with either a debit or a credit to the member's account. What I want to do is sum every debit and every credit into a single cell each, for every member. That is why I used the GROUP BY, thinking that it would add up every credit and debit for every member, but it won't do that. How would I go about this? I've tried an outer join on the MemberNr from the member details, but I still need to group by, which gives me the same result in the end.
Table Structure:
PeriodNr I
EffectiveDate D
Entrynr +
MemberNr A
Date D
JournalNr A
ReferenceNr A
DtAmount N
CtAmount N
Narration A
ModifyUserId A
ModifStamp #
One thing I did notice is that after I run the following query:
SELECT COUNT(A.CtAmount) as CreditCount
, Sum(A.CtAmount) as Credit
, COUNT(A.DtAmount) as DebitCount
, SUM(DtAmount) as Debit
, M.MemberNr
, M.Premium
FROM MemAcc as A
LEFT OUTER JOIN Member as M on A.MemberNr = M.MemberNr
GROUP BY M.MemberNr, M.Premium;
There is a single row at the top with no MemberNr and very high counts, debits and credits, much higher than any account should have, so I'm guessing that the missing transactions are going into this row for some reason.
For example, if I query, say, member X on its own, I get a debit and credit of 3094, a debit count of 55 and a credit count of 18, which matches the number of records in the table for that member; but when I run the above query I get a credit count of 2, a debit count of 19, a credit of 1590 and a debit of 2090.
So I am stumped. I don't know if this is a Paradox problem, or rather my inept understanding of SQL.
Oh yeah the blank member has a credit count of 273, debit count of 341, credit of 19030 and debit of 17168.
"I don't know if this is a Paradox problem, or rather my inept understanding of SQL."
I would expect that the resultset for the "single member" query and the equivalent line in the "all members" query would return the same counts and sums. If that was your expectation too then I wouldn't describe your understanding of SQL as "inept".
Diagnosing these sorts of problems is hard. The one clue you have is this:
"It gives me both a different count and sum results for most the members in the table." (emphasis mine)
What you need to do is pick a couple of members where both queries return the same result and discover what distinguishes them from members which have different results.
The results you see from the third query suggest that you have a bunch of records in the MemAcc table where the MemberNr is null. Since there is no way to attach them to the proper member, they would all get grouped together, and the members would appear to have fewer MemAcc records.
The MemberNr might not be NULL in the MemAcc table. With the left outer join, the matching row may simply be missing from the Member table, and because you group by the Member table's columns, it shows as NULL: the member referenced by the MemAcc entry no longer exists in the Member table.
e.g. if you do this :
SELECT COUNT(A.CtAmount) as CreditCount
, Sum(A.CtAmount) as Credit
, COUNT(A.DtAmount) as DebitCount
, SUM(DtAmount) as Debit
, A.MemberNr
, M.Premium
FROM MemAcc as A
LEFT OUTER JOIN Member as M on A.MemberNr = M.MemberNr
GROUP BY A.MemberNr, M.Premium;
you will see different results - at least the value of MemberNr which no longer exists on Member.
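The effect is easy to reproduce on a few rows. This sketch uses Python's sqlite3 module instead of Paradox, with a cut-down schema and invented data: two MemAcc rows belong to members M8 and M9, who have no row in Member.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Member (MemberNr TEXT, Premium INTEGER);
    CREATE TABLE MemAcc (MemberNr TEXT, DtAmount INTEGER, CtAmount INTEGER);
    -- Made-up sample rows; M8 and M9 are missing from Member.
    INSERT INTO Member VALUES ('M1', 50);
    INSERT INTO MemAcc VALUES
        ('M1', 10, NULL), ('M1', NULL, 5), ('M9', 20, NULL), ('M8', NULL, 7);
    """
)

query = """
    SELECT {key}, COUNT(*), SUM(A.DtAmount), SUM(A.CtAmount)
    FROM MemAcc AS A
    LEFT OUTER JOIN Member AS M ON A.MemberNr = M.MemberNr
    GROUP BY {key}
    ORDER BY {key}
"""

# Grouping by the Member side lumps all unmatched transactions under NULL.
by_member = conn.execute(query.format(key="M.MemberNr")).fetchall()
# Grouping by the MemAcc side keeps each transaction's own member number.
by_memacc = conn.execute(query.format(key="A.MemberNr")).fetchall()

print("GROUP BY M.MemberNr:", by_member)
print("GROUP BY A.MemberNr:", by_memacc)
```

The NULL row in the first result collects the M8 and M9 transactions, which is the same pattern as the blank member with large totals in the question. It does not by itself explain why a member that does exist in Member would show lower counts.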
As for your strange results: I seem to recall a limit on the number of rows in a Paradox table, and you could be approaching that limit. Of course, you might not be; it depends on which version of Paradox you are using and how you are accessing the data.
In worst-case scenarios, I have seen the need to UNION a few of those queries together, e.g.:
SELECT COUNT(A.CtAmount) as CreditCount
, Sum(A.CtAmount) as Credit
, COUNT(A.DtAmount) as DebitCount
, SUM(DtAmount) as Debit
, A.MemberNr
, M.Premium
FROM MemAcc as A
LEFT OUTER JOIN Member as M on A.MemberNr = M.MemberNr
WHERE A.MemberNr <= 100000
GROUP BY A.MemberNr, M.Premium
UNION
SELECT COUNT(A.CtAmount) as CreditCount
, Sum(A.CtAmount) as Credit
, COUNT(A.DtAmount) as DebitCount
, SUM(DtAmount) as Debit
, A.MemberNr
, M.Premium
FROM MemAcc as A
LEFT OUTER JOIN Member as M on A.MemberNr = M.MemberNr
WHERE A.MemberNr > 100000
GROUP BY A.MemberNr, M.Premium;