I have a table of records like the following:
Month  Amount  Sum
1      100     100
2      50      150
3      NULL    NULL
4      NULL    NULL
5      50      200
etc.
How do I keep a running total in the Sum column, carrying the previous valid sum forward into the NULL records, in a single SQL statement? The result should look like this:
Month  Amount  Sum
1      100     100
2      50      150
3      0       150
4      0       150
5      50      200
Any ideas?
This isn't something you'd typically store in the database, but rather get with a query. You would do a subquery on the table to get a sum:
SELECT
    t1.Month,
    COALESCE(t1.Amount, 0) AS Amount,
    (SELECT SUM(t2.Amount)
     FROM my_table t2
     WHERE t2.Month <= t1.Month) AS RunningSum
FROM my_table t1
In this way I use the table twice, once as t1 and once as t2.
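If the database supports window functions (SQL Server 2012+, PostgreSQL, Oracle and others), a windowed SUM avoids the correlated subquery entirely. This is only a sketch against the assumed my_table above:
SELECT Month,
       COALESCE(Amount, 0) AS Amount,
       SUM(Amount) OVER (ORDER BY Month
                         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningSum
FROM my_table
Because SUM ignores NULLs, the previous total simply carries forward through the NULL months.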
Assuming the new month and amount being inserted are represented by variables #month and #amount:
INSERT INTO t (Month, Amount, [Sum])
SELECT #month,
CASE WHEN #amount IS NULL THEN 0 ELSE #amount END,
CASE WHEN #amount IS NULL THEN SUM(Amount) ELSE SUM(Amount) + #amount END
FROM t
If the months are always going to be consecutive, you could use MAX(Month) + 1 instead of #month as the inserted value.
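As a sketch of that variant (still assuming the #amount variable, and that the table already has at least one row):
INSERT INTO t (Month, Amount, [Sum])
SELECT MAX(Month) + 1,
       COALESCE(#amount, 0),
       SUM(Amount) + COALESCE(#amount, 0)
FROM t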
(Though I agree with #JHolyHead's caveat that I'd be hesitant to store a running total inside the table...)
Either store the running sum value somewhere else where you can read and update it on every transaction, or do some clever logic in a SP that calculates it during the transaction.
If I'm honest, I wouldn't store it. I'd probably prefer to calculate it in the application when I need it. That way you can filter by date/whatever other criteria you like.
This works from SQL Server 2005 onward (recursive CTE):
;with a as (
    select month,
           coalesce(amount, 0) amount,
           row_number() over (order by /*year?,*/ month) rn
    from yourtable
),
cte as
(
    select month, amount, amount [sum1]
    from a
    where rn = 1
    union all
    select a.month, a.amount, cte.[sum1] + a.amount
    from cte
    join a on a.rn = cte.rn + 1
)
select month, amount, [sum1] from cte
I suggest you do not use column names like 'sum', not even for computed columns.
Don't waste a table column on a sum; imagine what happens when someone updates an amount and every stored total after it is wrong. Just use a computed column in a view.
I am trying to write a query in which I update a counter based on other conditions. For example:
with table 1 as (select *, count from table1)
select box_type,
case when box_type = lag(box_type) over (order by time)
then
count, update table1 set count = count + 1
else
count
end as identifier
Here's the basic gist of what I'm trying to do. I want a table that looks like this:
box_type  identifier
small     1
small     1
small     1
medium    2
medium    2
large     3
large     3
small     4
I just want to increment that identifier value every time the box_type changes
Thank you!
Your question only makes sense if you have a column that specifies the ordering. Let me assume such a column exists -- based on your code, I'll call it time.
Then, you can use lag() and a cumulative sum:
select t1.*,
       count(*) filter (where box_type is distinct from prev_box_type)
                over (order by time) as identifier
from (select t1.*,
             lag(box_type) over (order by time) as prev_box_type
      from table1 t1
     ) t1
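If your database doesn't support FILTER on window functions (SQL Server, MySQL 8, etc.), the same idea can be written as a conditional running SUM. A sketch, using the same assumed time column:
select t1.*,
       sum(case when box_type = prev_box_type then 0 else 1 end)
           over (order by time) as identifier
from (select t1.*,
             lag(box_type) over (order by time) as prev_box_type
      from table1 t1
     ) t1
The first row has a NULL prev_box_type, so the CASE falls through to the else branch and the counter starts at 1.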
Struggling with this subquery - it should be basic, but I'm missing something. I need to make these available as part of a larger query.
I have customers, and for each customer I want to get the ONE transaction with the HIGHEST timestamp.
Customer
customer  foo
1         val1
2         val2
Transaction
tx_key  customer  timestamp  value
1       1         11/22      10
2       1         11/23      15
3       2         11/24      20
4       2         11/25      25
The desired output of the query:
customer  foo   timestamp  value
1         val1  11/23      15
2         val2  11/25      25
I managed to calculate what I needed by using multiple subqueries, but it is very slow on a larger data set.
I did it like this:
(select timestamp from transaction where transaction.customer = customer.customer order by timestamp desc limit 1) as tx_timestamp,
(select value from transaction where transaction.customer = customer.customer order by timestamp desc limit 1) as tx_value
So how do I reduce this down to only calculating it once? In my real data set, I have 15 columns joined over 100k rows, so doing this over and over is not performant enough.
In Postgres, the simplest method is distinct on:
select distinct on (customer) c.*, t.timestamp, t.value
from transaction t
join customer c
using (customer)
order by customer, t.timestamp desc;
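If you're not on Postgres, or want something portable, the usual alternative is ROW_NUMBER(). A sketch against the assumed table and column names above:
select customer, foo, timestamp, value
from (select c.customer, c.foo, t.timestamp, t.value,
             row_number() over (partition by t.customer
                                order by t.timestamp desc) as rn
      from transaction t
      join customer c on c.customer = t.customer
     ) x
where rn = 1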
Try this query please:
SELECT C.customer, C.foo, T.timestamp, T.value
FROM Transaction T
JOIN
    (SELECT customer, MAX(timestamp) AS timestamp
     FROM Transaction
     GROUP BY customer) MT
    ON T.customer = MT.customer
    AND T.timestamp = MT.timestamp
JOIN Customer C
    ON C.customer = T.customer
I have a scenario in which, if the system date falls within days 1 to 5 of the current quarter, the calculation should not include the current quarter's data; if it is later than day 5, it should include all the data.
I am trying to include this condition in the WHERE clause, but I am not able to achieve the result.
Could you please help me with this condition?
SELECT
Dense_Rank() over(order by AMOUNT desc)as RANK,
FISCAL,
AMOUNT
FROM
T1 INNER JOIN T2 ON 1=1
WHERE ( FISCAL < ( CASE WHEN t2.SYSDATE BETWEEN t2.CURRENTQUARTER_START_DATE
                             AND ADD_DAYS(t2.CURRENTQUARTER_START_DATE, 4)
                        THEN CURRENT_QUARTER
                   END ) OR (NULL) )
I am not sure how to include that condition.
I think this may be what you're after:
SELECT
Dense_Rank() over (order by AMOUNT desc) as RANK,
FISCAL,
AMOUNT
FROM
T1
WHERE
FISCAL <= (
SELECT
CASE
WHEN ADD_DAYS(SYSDATE, -5) >= CURRENTQUARTER_START_DATE
THEN SYSDATE /* or maybe CURRENTQUARTER_END_DATE ? */
ELSE ADD_DAYS(CURRENTQUARTER_START_DATE, -1)
END
FROM T2
)
While you can do this with a join, I think it makes sense to break the query into logical pieces: the end-date lookup is isolated in a subquery, and the optimizer understands that it should only ever see a single row/value.
Try:
CASE
    WHEN t2.SYSDATE BETWEEN t2.CURRENTQUARTER_START_DATE
                        AND ADD_DAYS(t2.CURRENTQUARTER_START_DATE, 4)
    THEN CURRENT_QUARTER
    ELSE NULL
END
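Note that when the CASE returns NULL (from day 6 of the quarter onward), FISCAL < NULL evaluates to UNKNOWN and every row is filtered out, which is the opposite of the intent. One way to express the requirement directly in the WHERE clause, as a sketch assuming ADD_DAYS and the columns above behave as in the original query:
WHERE t2.SYSDATE > ADD_DAYS(t2.CURRENTQUARTER_START_DATE, 4)  -- day 6 or later: keep everything
   OR FISCAL < CURRENT_QUARTER                                -- days 1 to 5: exclude the current quarter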
Pretty simple question I suppose, but I can't find the answer on here!
I am trying to calculate the volume of records based on a WHERE clause, as well as return a percentage of those records based on the same where clause. The percentage would be calculated against the total amount of records in the database. For example, I count all my records that meet "MyCondition":
SELECT COUNT(*) FROM [MyTable]
WHERE Condition='MyCondition'
This works fine. However, how does one take that count, and return the percentage it equates to when put against all the records in the database? In other words, I want to see the percentage of how many records meet WHERE Condition='MyCondition' in regards to the total record count.
Sorry for the simple question and TIA! I am using MS SQL 2012.
Here is another method that only hits the base table once.
SELECT COUNT(*) as TotalCount
,SUM(case when Condition = 'MyCondition' then 1 else 0 end) as ConditionalCount
FROM [MyTable]
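If you also want the percentage in that same single pass, you can divide the two expressions. A sketch (the 100.0 literal forces decimal arithmetic, and NULLIF guards against an empty table):
SELECT COUNT(*) AS TotalCount,
       SUM(CASE WHEN Condition = 'MyCondition' THEN 1 ELSE 0 END) AS ConditionalCount,
       100.0 * SUM(CASE WHEN Condition = 'MyCondition' THEN 1 ELSE 0 END)
             / NULLIF(COUNT(*), 0) AS Percentage
FROM [MyTable]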
You can simply divide the count of matching rows by the total number of records.
Sample Data:
create table test (MatchColumn int)
insert into test (MatchColumn)
values (1),(1),(1),(2),(3),(4)
Match Condition:
SELECT COUNT(*) MatchValues,
(SELECT COUNT(*) FROM test) TotalRecords,
CAST(COUNT(*) AS FLOAT)/CAST((SELECT COUNT(*) FROM test) AS FLOAT)*100 Percentage
FROM [test]
WHERE MatchColumn=1
Returns:
| MATCHVALUES | TOTALRECORDS | PERCENTAGE |
|-------------|--------------|------------|
| 3 | 6 | 50 |
SQL Fiddle Demo
Using a CTE:
Another option is to do the same with a CTE and reference the columns it creates:
;WITH CTE AS
(SELECT COUNT(*) MatchValues,
(SELECT COUNT(*) FROM test) TotalRecords
FROM [test]
WHERE MatchColumn=1)
SELECT MatchValues, TotalRecords,
CAST(MatchValues AS FLOAT)/CAST(TotalRecords AS FLOAT)*100 Percentage
FROM CTE
SQL Fiddle Demo
NOTE: Casting the counts to float before dividing is required because dividing two int values returns an int; here the ratio is a fractional value less than 1, which integer division would simply truncate to 0.
Reference:
SQL Server, division returns zero
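A quick illustration of that truncation, runnable on its own:
SELECT 3 / 6                      AS IntDivision,    -- 0: both operands are int, so the result is truncated
       CAST(3 AS FLOAT) / 6       AS FloatDivision,  -- 0.5
       CAST(3 AS FLOAT) / 6 * 100 AS Percentage      -- 50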
Or something like this.
create table test (MatchColumn int)
insert into test (MatchColumn)
values (1),(1),(1),(2),(3),(4)
SELECT
    (SELECT COUNT(*) FROM test WHERE MatchColumn = 1) MatchValues,
    (SELECT COUNT(*) FROM test) TotalRecords,
    (SELECT CAST(COUNT(*) AS FLOAT) FROM test WHERE MatchColumn = 1)
        / CAST((SELECT COUNT(*) FROM test) AS FLOAT) * 100 Percentage
I have a co-worker who is working on a table with an 'amount' column.
They would like to get the top 5 amounts and the sum of the amounts in the same query.
I know you could do this:
SELECT TOP 5 amount FROM table
UNION SELECT SUM(amount) FROM table
ORDER BY amount DESC
But this produces results like this:
1000 (sum)
100
70
50
30
20
When what they really need is this:
100 | 1000
70 | 1000
50 | 1000
30 | 1000
20 | 1000
My intuitive attempts to achieve this tend to run into grouping problems. That isn't such an issue when you are aggregating a different column, but it is when you want to use an aggregate function based on the same column you are selecting.
You can use a CROSS JOIN for this:
SELECT TOP 5 a.amount, b.TotalAmount
FROM table a
CROSS JOIN (SELECT SUM(amount) AS TotalAmount FROM table) b
ORDER BY a.amount DESC
This might work
SELECT TOP 5 amount, (SELECT SUM(amount) FROM table) AS TotalAmount
FROM table
ORDER BY amount DESC
Not really pretty, but this should do it:
SELECT TOP 5 amount, SAmount
FROM table
CROSS JOIN (SELECT SUM(amount) AS SAmount FROM table) s
ORDER BY amount DESC
As said by others, I'd probably use two queries.
Another approach using analytic functions (SQL Server 2005+):
SELECT TOP 5 amount, SUM(amount) OVER()
FROM table
ORDER BY
amount DESC