I am totally new to Postgres and I can't find any example of what I am trying to do.
I have a table of transactions for a year:
amount | date
12 | 1980-02-12
-200 | 1980-03-06
30 | 1980-03-14
Positive transactions for incoming money.
Negative transactions for credit payments.
And I need to return a total for the year that also reflects charging a $20 fee for every month where less than $400 in credit was used, like so:
total
-------
80,401
My thought was that I would first find total credit for all months like this...
WITH month_credit_totals
AS
(SELECT
SUM(amount) AS total_credit
FROM transactions
WHERE amount < 0
GROUP BY DATE_TRUNC('month', date))
and from there I would find the amount of months with less than $400 in credit payments like this...
SELECT
COUNT(*)
FROM month_credit_totals
WHERE total_credit <= -400
I wanted to save off this number, subtract it from 12, multiply that result by -20, and get the total amount owed in credit fees for the year that way.
Then, I thought I could just total up the amount column, apply that total fee amount, and that would be my result.
But as a complete beginner I am having a lot of trouble with the syntax. I can't find a way to save off variables, do the math, and apply that to the total. As soon as I try to declare a variable at all, I get syntax errors that point me in no useful direction.
Maybe I'm going about this the completely wrong way?
I would appreciate any advice. Thanks!
You're on the right track, but you don't really need to save these variables anywhere; just calculate the fee on the fly (dt is the date column here).
WITH credit_by_month AS
(
SELECT DATE_TRUNC('month', dt), SUM(amount) AS total_credit
FROM transactions
WHERE amount < 0
GROUP BY DATE_TRUNC('month', dt)
)
SELECT COUNT(*) * 20 AS credit_fee
FROM credit_by_month
WHERE total_credit > -400
Here's a working demo on dbfiddle. It uses random data, so it will return a different number every time.
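To get the single yearly total the question asks for, the fee can be folded into the same statement. The following is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module so it can be executed anywhere, so strftime('%Y-%m', dt) stands in for Postgres's DATE_TRUNC('month', dt). It also follows the asker's "12 minus the months that reached $400" plan, which charges the fee for months with no credit rows at all; whether that is the intended rule is an assumption. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (amount INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1000, "1980-01-05"),
        (-500, "1980-01-20"),  # January: 500 in credit, no fee
        (12, "1980-02-12"),    # February: no credit at all, fee applies
        (-200, "1980-03-06"),  # March: 200 in credit, fee applies
        (30, "1980-03-14"),
    ],
)

query = """
WITH credit_by_month AS (
    SELECT strftime('%Y-%m', dt) AS month, SUM(amount) AS total_credit
    FROM transactions
    WHERE amount < 0
    GROUP BY strftime('%Y-%m', dt)
)
SELECT
    (SELECT SUM(amount) FROM transactions)
    - 20 * (12 - (SELECT COUNT(*)
                  FROM credit_by_month
                  WHERE total_credit <= -400)) AS total
"""

# 342 in net transactions, minus 11 fee months at 20 each
total = conn.execute(query).fetchone()[0]
print(total)  # 122
```

In Postgres the same shape should work with DATE_TRUNC in place of strftime; the scalar subqueries simply take the place of the variables the question was trying to declare.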
I'm new to SQL and am currently learning and this is probably a fairly basic question:
I have 4 tables
1) Customer (customerName, street, customerCity)
2) Deposit (customerName, branchName, accountNumber, balance)
3) Loan (customerName, branchName, loanNumber, amount)
4) Branch (branchName, branchCity, assets)
Each table has some data inserted into it.
I have been asked to find the customerName with the highest deposit amount. (So I'm guessing I will be using just the Deposit table)?
However, here is the catch, the sheet I am learning from is requesting that I MUST use either ALL or ANY, I can't simply use the MAX function to achieve this.
How would I achieve this? I've tried query after query and simply can't find a way to get it to work (bearing in mind that I've only been learning this for a week).
The things I've been trying have been along the lines of:
SELECT customerName
FROM Deposit
WHERE balance > ALL;
The query should return 1 value, which would be the customerName with the highest balance value.
Thanks a lot for your help :)!
You are looking for the customer(s) whose balance is greater than or equal to all balances in the table, so you just need to use >= instead of > and give ALL a subquery to compare against:
SELECT customerName
FROM Deposit
WHERE balance >= ALL (SELECT balance
FROM Deposit);
Or you can use a correlated subquery and look for customers whose balances are greater than all other balance values:
SELECT customerName
FROM Deposit d1
WHERE balance > ALL (SELECT balance
FROM Deposit d2
WHERE d2.balance <> d1.balance);
In the event of ties both queries will return all customers with the highest amount.
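If it helps to see why both forms behave the same on ties, here is a small Python sketch that mimics the two predicates with all(). The names and balances are invented for illustration, and NULL balances (which change how ALL behaves in SQL) are deliberately not modelled.

```python
# Hypothetical rows from Deposit: (customerName, balance)
deposits = [("Ann", 500), ("Bob", 900), ("Cy", 900), ("Dee", 100)]
balances = [balance for _, balance in deposits]

# Mirrors: balance >= ALL (SELECT balance FROM Deposit)
ge_all = [
    name for name, bal in deposits
    if all(bal >= other for other in balances)
]

# Mirrors: balance > ALL (SELECT balance FROM Deposit d2
#                         WHERE d2.balance <> d1.balance)
gt_all_others = [
    name for name, bal in deposits
    if all(bal > other for other in balances if other != bal)
]

print(ge_all)         # ['Bob', 'Cy']
print(gt_all_others)  # ['Bob', 'Cy']
```

With plain > ALL over the whole table nobody would qualify, because the top balance is never strictly greater than itself; that is why the first query needs >= and the second one filters out equal balances.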
I have a table called Transaction which contains some columns: [TransactionID, Type(credit or debit), Amount, Cashout, CreditPaid, EndTime]
Customers can get lots of credit and these transactions are stored in the transactions table. If a customer pays at the end of the month an amount which covers some or all of the credit transactions, I want those transactions to be updated. If the total payment covers some transactions, then the transactions should be updated.
For example, a customer pays in 300. If the transaction 'Amount' is 300 and 'Type' is credit then the 'CreditPaid' amount should be 300. (This is a simple update statement) but...
If there are two transactions i.e. one 300 and another 400 and are both credit and the monthly payment amount is 600, then the oldest transaction should be paid 300 in full, and the next transaction 300 leaving 100 outstanding.
Any ideas how to do this?
TrID | Buyin | Type   | Cashout | CustID | StartTime  | EndTime | AddedBy | CreditPaid
72   | 200   | Credit | 0       | 132    | 2013-05-21 | NULL    | NULL    | NULL
73   | 300   | Credit | 0       | 132    | 2013-05-22 | NULL    | NULL    | NULL
75   | 400   | Credit | 0       | 132    | 2013-05-23 | NULL    | NULL    | NULL
Desired results after the customer pays 600:
TrID | Buyin | Type   | Cashout | CustID | StartTime  | EndTime    | AddedBy | CreditPaid
72   | 200   | Credit | 0       | 132    | 2013-05-21 | 2013-05-24 | NULL    | 200
73   | 300   | Credit | 0       | 132    | 2013-05-22 | 2013-05-24 | NULL    | 300
75   | 400   | Credit | 0       | 132    | 2013-05-23 | NULL       | NULL    | 100
Here's a SQL 2008 version:
CREATE PROCEDURE dbo.PaymentApply
    @CustID int,
    @Amount decimal(11, 2),
    @AsOfDate datetime
AS
WITH Totals AS (
    SELECT
        T.*,
        -- Unpaid credit on all earlier rows (the current row is excluded)
        RunningTotal =
            Coalesce(
                (SELECT Sum(S.Buyin - Coalesce(S.CreditPaid, 0))
                 FROM dbo.Trans S
                 WHERE
                     T.CustID = S.CustID
                     AND S.Type = 'Credit'
                     AND S.Buyin > Coalesce(S.CreditPaid, 0)
                     AND (
                         T.StartTime > S.StartTime
                         OR (
                             T.StartTime = S.StartTime
                             AND T.TrID > S.TrID
                         )
                     )
                ),
            0)
    FROM
        dbo.Trans T
    WHERE
        T.CustID = @CustID
        AND T.Type = 'Credit'
        AND T.Buyin > Coalesce(T.CreditPaid, 0)
)
UPDATE T
SET
    T.EndTime = P.EndTime,
    T.CreditPaid = Coalesce(T.CreditPaid, 0) + P.CreditPaid
FROM
    Totals T
    CROSS APPLY (
        -- Lesser of: this row's full remainder, or what is left of the payment
        SELECT TOP 1
            V.*
        FROM
            (VALUES
                (T.Buyin - Coalesce(T.CreditPaid, 0), @AsOfDate),
                (@Amount - T.RunningTotal, NULL)
            ) V (CreditPaid, EndTime)
        ORDER BY
            V.CreditPaid,
            V.EndTime DESC
    ) P
WHERE
    T.RunningTotal < @Amount
    AND @Amount > 0;
See a Live Demo at SQL Fiddle
Or, for anyone using SQL 2012, you can replace the contents of the CTE with a better-performing and simpler query using the new windowing functions:
SELECT
    *,
    RunningTotal =
        Sum(Buyin - Coalesce(CreditPaid, 0)) OVER(
            ORDER BY StartTime, TrID  -- TrID breaks ties, as in the 2008 version
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) - (Buyin - Coalesce(CreditPaid, 0))  -- remove the current row's remainder
FROM dbo.Trans
WHERE
    CustID = @CustID
    AND Type = 'Credit'
    AND Buyin - Coalesce(CreditPaid, 0) > 0
See a Live Demo at SQL Fiddle
Here's how they work:
We calculate the running total for all the prior rows where the CreditPaid amount is less than the Buyin amount. Note this does NOT include the current row.
From this we can determine what portion of the payment will apply to each row and which rows will be involved in the payment. If the sum of all the credits for all the prior rows is at least as high as the payment, then this row will NOT be included, hence the filter comparing T.RunningTotal against @Amount. That's because all the prior rows will have fully consumed the payment by this point, so we can stop applying it.
For each row where we will apply a payment, we want to pay as much as possible, but we have to watch the last row, where we may not be paying the full amount (as is the case with the third credit in the example). So we'll be paying one of two amounts: the full credit amount (with more rows to receive payments), or only the portion left over, which could be less than the full credit for that row (and this is the last row). We accomplish this by taking the lesser of either 1) the full remaining Buyin - CreditPaid amount, or 2) what's left of the payment, @Amount - RunningTotalOfPriorRows. I could have done this as a CASE expression, but I like taking the minimum with TOP 1 ... ORDER BY over a VALUES list, especially because we would have needed two CASE expressions to also determine whether to update the EndTime column (per your requirements).
The SQL 2012 version simply calculates the same thing as the 2008 version: the sum of Buyin - CreditPaid for all the prior rows, using a windowing function instead of a correlated subquery.
Finally, we perform the update to all rows where the RunningTotal is less than the amount to be applied (since if it were equal to the amount, there would be no payment left for the current row).
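The steps above can be traced outside the database with a few lines of Python. This is only an illustration of the logic, not a replacement for the set-based UPDATE; the function name apply_payment and the dict layout are made up for the sketch, and rows are assumed to arrive oldest first.

```python
def apply_payment(rows, amount, as_of_date):
    """Apply one payment to credit rows ordered oldest first (in place)."""
    running_total = 0  # unpaid credit on all prior rows
    for row in rows:
        remaining = row["Buyin"] - row["CreditPaid"]
        if remaining <= 0:
            continue  # already fully paid, not part of the running total
        if running_total < amount:
            # Lesser of the row's remainder and what is left of the payment
            paid = min(remaining, amount - running_total)
            row["CreditPaid"] += paid
            if paid == remaining:
                row["EndTime"] = as_of_date  # closed only when fully paid
        running_total += remaining
    return rows


rows = [
    {"TrID": 72, "Buyin": 200, "CreditPaid": 0, "EndTime": None},
    {"TrID": 73, "Buyin": 300, "CreditPaid": 0, "EndTime": None},
    {"TrID": 75, "Buyin": 400, "CreditPaid": 0, "EndTime": None},
]
apply_payment(rows, 600, "2013-05-24")

for row in rows:
    print(row["TrID"], row["CreditPaid"], row["EndTime"])
# 72 200 2013-05-24
# 73 300 2013-05-24
# 75 100 None
```

The third row is the "last row" case: its running total of prior rows is 500, so only 100 of the payment is left and the row stays open.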
Now, there are some larger considerations that you should think about.
Some of your scheme I like--I am not convinced that, as some commenters have said, you should ignore the individual transactions. I think that handling the individual transactions can be very important. It's much like how hospitals have one medical record number for each patient (MRN) but open a new account / file / visit each time the patient has a service performed. Each account is treated separately, and this is for many reasons, including--and this is where it seems important for you, too--the need for the customer to understand what exactly is comprising the total. It can be shocking to see the total all added up, but when this is broken out into individual transactions on individual dates, this makes a lot more sense to people and they can begin to understand exactly how they spent more money than they remembered at the time. "You owe me 600 bucks" can be harder to face than "your transactions for $100, $300, and $200 are still unpaid". :)
So, on to some big considerations here.
If you go with the theory that a transactional or balance-based account starts at 0 as a sort of "anchor", and to find the current balance you simply have to add up all the transactions: well, this does indeed satisfy relational theory, but in practice it is completely unworkable because it does not provide a fast, accurate way to get the current balance. It is imperative to have the current balance saved as a discrete value. If you were a bank, how would you know how much money you had, without adding up perhaps dozens of years of transaction history each time? Instead, it may be better to think of the current balance as the "anchor" (instead of 0) and think of the transactions as going backward in time. Additionally, there is no harm in recording periodic balances. Banks do this by closing out periods into statements, with a defined balance as of each statement closing date. There is no need to go all the way back to zero, since you don't care too much about the balance at the old, unanchored end of the history. You know that eventually every account started at 0. Big deal!
Given these thoughts, it is important for you to have a table where the customer's total account balance is simply stated. You also need a place to record his payments, refunds, cancellations, and so on. This should be separate from the accounts (in your case, transactions) themselves, because there is not a one-to-one correspondence between payment transactions and credit transactions. Already in your current scheme you have partially paid transactions with no date recorded--this is a huge gap in the system that will come back to bite you. What if a customer paid $10 a day toward a $200 credit for 20 days? 19 of those payments would show no date paid.
What I recommend, then, is that you create a stored procedure (SP) that applies payments to totals first, and then create another one that will "rewrite" the payments into the transactions in an on-demand way. Think about what a credit card company has to do if they "re-rate" your account. Perhaps they acted on incorrect information and increased your interest rate on a certain date. (This actually happened to me. I proved to them that the collections activity they were responding to was not my fault--it had been retracted by the original company after I showed them that one of their staff had mistakenly changed my mailing address, and I had never received a bill to be able to pay. So they had to be able to re-run all the purchase/debit/interest rate calculations on my account retroactively, to recalculate everything after the original change date based on the correct interest rate.) Think about this a bit and you will see that it is quite possible to operate this way, as long as you design your system properly. Your SP is given a date range or set of transactions within which it is allowed to work, and then "rewrites" history as if the old history had never existed.
But, you don't actually want to destroy history, so this is further complicated by the fact that at one point in time, your best knowledge of the customer's account balance for a time period was a different amount than your current best knowledge of their account balance for that time period--both are true data and need to be kept.
Let's say you discover that your system occasionally doubled up Credit transactions mistakenly. When you fix the customer data, you need to be able to see the fact that they had the problem, even though they don't have it now. This is done by using additional date columns EffectiveDate and ExpirationDate--or whatever you want to call them. Then, these need to be part of the clustered index, and used on every query to always get the current values. I highly recommend using 9999-12-31 instead of NULL as your ExpirationDate value for current rows--this will have a huge positive impact on performance when querying for current data. I also recommend putting the ExpirationDate as the first column in the clustered index (or at least, before the EffectiveDate column), since history will always potentially have many more records than the future, so it will be more selective than EffectiveDate being first (think a bit: all past knowledge will have EffectiveDate <= GetDate() but only current or future data will have ExpirationDate > GetDate()). To drive the point home: you don't delete. You expire old rows by setting a column to the date the knowledge became obsolete, and you insert new rows representing the new knowledge, with a column showing the date you learned this information and having an indefinitely-open "to the future" value in the other date column.
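As a rough sketch of the expire-and-insert idea, here is a toy version on SQLite via Python's sqlite3 module. The table is cut down to the columns needed, the helper names correct_buyin and rows_as_of are invented for the example, and dates are stored as ISO text (which compares correctly as strings); none of this is meant as the actual schema.

```python
import sqlite3

OPEN_END = "9999-12-31"  # sentinel for "still current" instead of NULL

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Trans (
        TrID INTEGER,
        Buyin INTEGER,
        EffectiveDate TEXT NOT NULL,
        ExpirationDate TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO Trans VALUES (72, 400, '2013-05-21', ?)", (OPEN_END,))


def correct_buyin(conn, tr_id, new_buyin, learned_on):
    # Expire the current version instead of deleting it ...
    conn.execute(
        "UPDATE Trans SET ExpirationDate = ? "
        "WHERE TrID = ? AND ExpirationDate = ?",
        (learned_on, tr_id, OPEN_END),
    )
    # ... then insert the new knowledge with an open-ended expiration.
    conn.execute(
        "INSERT INTO Trans VALUES (?, ?, ?, ?)",
        (tr_id, new_buyin, learned_on, OPEN_END),
    )


def rows_as_of(conn, day):
    # Start is inclusive, end is exclusive.
    return conn.execute(
        "SELECT TrID, Buyin FROM Trans "
        "WHERE EffectiveDate <= ? AND ? < ExpirationDate",
        (day, day),
    ).fetchall()


# The 400 turned out to be a doubled-up 200, discovered on 2013-06-01.
correct_buyin(conn, 72, 200, "2013-06-01")

print(rows_as_of(conn, "2013-05-25"))  # [(72, 400)]
print(rows_as_of(conn, "2013-06-02"))  # [(72, 200)]
```

Both versions of the row survive, so you can still answer "what did we believe on 2013-05-25?" after the correction.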
And finally a couple of single points:
The CreditPaid column should be NOT NULL with a default of 0. I had to throw in a bunch of Coalesces to deal with the NULLs.
You need to handle overpayments somehow. Either by preventing them, or by storing the overpaid portion value and applying it later. You could OUTPUT the results of the UPDATE statement into a table, then select the Sum from this and make the SP return any unused payment value. There are many ways to handle this. If you build the "re-rate" SP as I suggested, then this won't be too much of a problem, as you can rerun it after receiving new transactions (then immediately (re)apply all payments for any open periods).
At this point I can't go on too much more, but I hope that these thoughts help you. Your design is a good start, but it needs some work to get it to the point where it will function well as an enterprise-quality system.
UPDATE
I corrected a glitch in the 2008 version (adding the conditions from the outer query to the subquery).
And here's my last edit (all: please do not edit this answer again or it will be converted to community wiki).
If you do go with a scheme where rows are marked with the dates they are understood to be true (EffectiveDate and ExpirationDate), you can make coding in your system a little easier by creating inline table functions that select only the active rows from the table WHERE EffectiveDate <= GetDate() AND GetDate() < ExpirationDate. Pay careful attention to the comparison operators you're using (e.g., <= vs <), and use date ranges that are inclusive at the start and exclusive at the end. If you aren't sure what that means, please do look these terms up and understand them before proceeding. You want to be able to change the resolution of your date data type in the future, without breaking any of your queries. If you use an inclusive end date, this will not be possible. There are many posts online talking about how to properly query for dates in SQL.
Something like this:
CREATE FUNCTION dbo.TransCurrent ()
RETURNS TABLE
AS
RETURN (
    SELECT *
    FROM dbo.Trans
    WHERE
        EffectiveDate <= GetDate()
        AND GetDate() < ExpirationDate --make clustered index have this first!
);
Do NOT confuse this with a multi-statement table-valued function. That will NOT perform well. This function type will perform well because it can be inlined into the query: the engine takes the logical intent of what the function is doing and dispenses with the function call entirely. Using any other kind of function will defeat this, and your performance will go to pot as your table grows in size.
I have a requirement where I need to update the balance of an account once a purchase is made. Sometimes the balance is not sufficient, so I want to update only if the balance is sufficient to make the purchase. How do I write the right SQL for this?
I need something like
Update account set balance = balance - amount only if balance >= amount.
How can I write something like this in SQL?
You are looking for the WHERE keyword. It lets you filter a command to a subset of the table. Example:
Update account
SET balance = balance - amount
WHERE balance >= amount
You can use the CASE expression like so:
Update account
SET balance = CASE
WHEN balance >= amount THEN balance - amount
ELSE balance
END;
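For what it's worth, the WHERE form also tells the caller whether the purchase went through, because the statement reports how many rows it changed. A minimal sketch using SQLite through Python's sqlite3 module; the id column and the purchase helper are assumptions added for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100)")


def purchase(conn, account_id, amount):
    """Deduct amount only if the balance covers it; return True on success."""
    cur = conn.execute(
        "UPDATE account SET balance = balance - ? "
        "WHERE id = ? AND balance >= ?",
        (amount, account_id, amount),
    )
    return cur.rowcount == 1  # 0 rows changed means insufficient balance


ok_first = purchase(conn, 1, 80)   # 100 >= 80, balance becomes 20
ok_second = purchase(conn, 1, 50)  # 20 < 50, nothing changes
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]

print(ok_first, ok_second, balance)  # True False 20
```

The CASE form leaves the balance untouched in the same situation, but it still counts every row as updated, so the caller cannot tell a refused purchase from a successful one that way.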
I have a rather strange problem:
When certain sales are made (completed), a record is inserted with the event and the ID of the salesperson. Currently this table is queried and the count is used (along with the usual date boundaries) to calculate the number of 'closed' sales. This is working fine.
The problem now is that some sales are 'shared' sales. Products are linked via leads, and usually the lead and the product belong to the same salesperson. However, in some rare cases the lead is created by one salesperson and the product is 'sold' by another; in such cases the 'sales' calculation needs to award 0.5 (half) a point to each salesperson.
What would be the best SQL approach to solve this?
(SQL Server 2005)
SELECT
    SUM(CASE WHEN SaleUserID = LeadUserID THEN 1 ELSE 0.5 END)
FROM
    sales
WHERE
    (SaleUserID = @targetID OR LeadUserID = @targetID)
    -- AND dateCriteria
Just use SUM() instead of COUNT(), i.e.
SELECT SUM(CASE WHEN shared=1 THEN 0.5 ELSE 1 END) FROM sales WHERE ...
I guess you have some way to distinguish between a "full sale" and a "half sale".
Count separately the number of "full sales" and the number of "half sales".
Then add full + 0.5*half.
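Here is a quick way to sanity-check the SUM with CASE approach on a handful of rows. It is a sketch on SQLite through Python's sqlite3 module rather than SQL Server 2005; the sample data and the points helper are made up, and the date criteria are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (SaleUserID INTEGER, LeadUserID INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        (1, 1),  # user 1 owns lead and sale: full point
        (1, 1),
        (1, 2),  # shared between users 1 and 2: half a point each
        (2, 1),  # shared again
        (2, 2),  # user 2 owns lead and sale
    ],
)


def points(conn, user_id):
    return conn.execute(
        """
        SELECT SUM(CASE WHEN SaleUserID = LeadUserID THEN 1.0 ELSE 0.5 END)
        FROM sales
        WHERE SaleUserID = ? OR LeadUserID = ?
        """,
        (user_id, user_id),
    ).fetchone()[0]


print(points(conn, 1))  # 3.0
print(points(conn, 2))  # 2.0
```

The five sales add up to five points in total (3.0 + 2.0), which is a handy check that shared sales are not being counted twice.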