Running Balance/Total on 2 columns - sum

I have the following Table in Azure MSSQL:
ID,
Charge,
Payment
ID is the Primary Key and unique, Charge and Payment are Numeric.
ID charge payment
1 10 null
2 null 10
3 40 null
4 null 30
I want to do the following query:
select *,
SUM(charge)-sum(payment) OVER (order by id) AS Balance
from Table T
order by id asc
Which in the above data sample would look like this:
ID charge payment balance
1 10 null 10
2 null 10 0
3 40 null 40
4 null 30 10
However, that query fails, complaining that I need to add a GROUP BY clause. If I run the following instead:
select *,
SUM(charge) OVER (order by id) AS totalCharged
from table
order by id
That works fine - I feel like I've missed something obvious.
I should also note there are other columns that are in the final query but are omitted from here since they aren't relevant to the logic.

As it turns out, I was missing something obvious.
I changed the source query to return 0 instead of NULL and then did the following:
select *,
SUM(charge - payment) OVER (order by id) AS balance
from table
order by id
So I figured it out myself whilst thinking upon the porcelain throne.
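For reference, the original attempt failed because only sum(payment) carried the OVER clause. SUM(charge) was therefore an ordinary aggregate, which is why the engine asked for a GROUP BY. Replacing NULL with 0 matters too, since charge - payment is NULL whenever either side is NULL. The sketch below gets the same result with COALESCE instead of changing the source data. It assumes a hypothetical temp table #Ledger loaded with the sample rows from the question.

```sql
-- #Ledger is a hypothetical stand-in for the question's table.
CREATE TABLE #Ledger (
    ID      int PRIMARY KEY,
    Charge  numeric(10, 2) NULL,
    Payment numeric(10, 2) NULL
);

INSERT INTO #Ledger (ID, Charge, Payment)
VALUES (1, 10, NULL), (2, NULL, 10), (3, 40, NULL), (4, NULL, 30);

-- One windowed SUM over the per-row difference; COALESCE turns NULL into 0
-- so the subtraction never yields NULL.
SELECT ID, Charge, Payment,
       SUM(COALESCE(Charge, 0) - COALESCE(Payment, 0))
           OVER (ORDER BY ID ROWS UNBOUNDED PRECEDING) AS Balance
FROM #Ledger
ORDER BY ID;
-- Balance per row: 10.00, 0.00, 40.00, 10.00
```

The explicit ROWS UNBOUNDED PRECEDING frame is optional here because ID is unique, but it makes the running-total intent visible.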

Related

Updating column according to index within group

In our databases we have a table called conditions which references a table called attributes.
So it looks like this (ignoring some other columns that aren't relevant to the question):
id attribute_id execution_index
1 1000 1
2 1000 2
3 1000 1
4 2000 1
5 2000 2
6 2000 2
In theory the combination of attribute_id and execution_index should always be unique, but in practice they're not, and the software ends up essentially using the id to decide which comes first between two conditions with the same execution index. We want to add a uniqueness constraint to the table, but before we do that we need to update the execution indexes. So essentially we want to group them by attribute_id, order them by execution_index then id, and give them new execution indexes so that it becomes
id attribute_id execution_index
1 1000 1
2 1000 3
3 1000 2
4 2000 1
5 2000 2
6 2000 3
I'm not sure how to do this without just ordering by attribute_id, execution_index, id and then iterating through incrementing the execution_index by 1 each time and resetting it to be 1 whenever the attribute_id changes. (That would work but it'd be slow and someone is going to have to run this script on several dozen databases so I'd rather it didn't take more than a couple of seconds per database.)
Really I'd like to do something along the lines of
UPDATE c
SET c.execution_index = [this needs to be the index within the group somehow]
FROM conditions c
GROUP BY c.attribute_id
ORDER BY c.execution_index asc, c.id asc
But I don't know how to make that actually work.
It looks like you can use an updatable CTE:
with cte as (
select *,
Row_Number() over(partition by attribute_id order by execution_index, id) new
from conditions
)
update cte set execution_index = new
I would suggest adding a new column, updating that first, and checking that the results are as expected.
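A read-only preview is another way to check the numbering before touching execution_index. This is only a sketch: #conditions is a hypothetical temp table loaded with the question's sample rows, and old_index and new_index are illustrative aliases.

```sql
-- #conditions is a hypothetical copy of the question's sample data.
CREATE TABLE #conditions (
    id              int PRIMARY KEY,
    attribute_id    int NOT NULL,
    execution_index int NOT NULL
);

INSERT INTO #conditions (id, attribute_id, execution_index)
VALUES (1, 1000, 1), (2, 1000, 2), (3, 1000, 1),
       (4, 2000, 1), (5, 2000, 2), (6, 2000, 2);

-- Show the current index next to the one ROW_NUMBER() would assign.
SELECT id,
       attribute_id,
       execution_index AS old_index,
       ROW_NUMBER() OVER (PARTITION BY attribute_id
                          ORDER BY execution_index, id) AS new_index
FROM #conditions
ORDER BY id;
-- new_index for ids 1..6: 1, 3, 2, 1, 2, 3
```

Those values match the desired table in the question, so the same ROW_NUMBER() expression should be safe to use in the updatable CTE.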
WITH cte AS
(
SELECT
*,
ROW_NUMBER() OVER
(
PARTITION BY attribute_id
ORDER BY execution_index, id
) AS RowNum
FROM conditions
)
UPDATE cte
SET execution_index = RowNum

Last record per transaction

I am trying to select the last record per sales order.
My query in SQL Server Management Studio is simple:
SELECT *
FROM DOCSTATUS
The problem is that this database has tens of thousands of records, as it tracks all SO steps.
ID SO SL Status Reason Attach Name Name Systemdate
22 951581 3 Processed Customer NULL NULL BW 2016-12-05 13:33:27.857
23 951581 3 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
24 947318 1 Hold Customer NULL NULL bw 2016-12-05 13:54:27.173
25 947318 1 Invoices Submit Customer NULL NULL bw 2016-13-05 13:54:27.300
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440
I would like to see the most recent record per SO:
ID SO SL Status Reason Attach Name Name Systemdate
23 951581 4 Submitted Customer NULL NULL BW 2016-17-05 13:33:27.997
26 947318 1 Ship Customer NULL NULL bw 2016-14-05 13:54:27.440
Well I'm not sure how that table has two Name columns, but one easy way to do this is with ROW_NUMBER():
;WITH cte AS
(
SELECT *,
rn = ROW_NUMBER() OVER (PARTITION BY SO ORDER BY Systemdate DESC)
FROM dbo.DOCSTATUS
)
SELECT ID, SO, SL, Status, Reason, ..., Systemdate
FROM cte WHERE rn = 1;
Also please always reference the schema, even if today everything is under dbo.
I think you can keep it this simple:
SELECT *
FROM DOCSTATUS
WHERE ID IN (SELECT MAX(ID)
FROM DOCSTATUS
GROUP BY SO)
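You want only the maximum ID from each SO. This assumes that ID values increase over time, so that the highest ID for an SO is also its most recent record.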
An efficient method with the right index is a correlated subquery:
select t.*
from t
where t.systemdate = (select max(t2.systemdate) from t t2 where t2.so = t.so);
The index is on (so, systemdate).
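As a rough illustration of that index and query together, the sketch below uses a hypothetical temp table #DOCSTATUS with only four of the question's columns. The index name is made up, and the dates are the question's values read as year-day-month, which is an assumption.

```sql
-- #DOCSTATUS is a trimmed-down, hypothetical copy of the question's table.
CREATE TABLE #DOCSTATUS (
    ID         int PRIMARY KEY,
    SO         int NOT NULL,
    Status     varchar(50) NOT NULL,
    Systemdate datetime2(3) NOT NULL
);

-- Index on (SO, Systemdate) so MAX(Systemdate) per SO can be found by a seek.
CREATE INDEX IX_DOCSTATUS_SO_Systemdate ON #DOCSTATUS (SO, Systemdate);

INSERT INTO #DOCSTATUS (ID, SO, Status, Systemdate)
VALUES (22, 951581, 'Processed',       '2016-05-12T13:33:27.857'),
       (23, 951581, 'Submitted',       '2016-05-17T13:33:27.997'),
       (24, 947318, 'Hold',            '2016-05-12T13:54:27.173'),
       (25, 947318, 'Invoices Submit', '2016-05-13T13:54:27.300'),
       (26, 947318, 'Ship',            '2016-05-14T13:54:27.440');

-- Keep each row whose Systemdate is the latest one for its SO.
SELECT d.ID, d.SO, d.Status, d.Systemdate
FROM #DOCSTATUS AS d
WHERE d.Systemdate = (SELECT MAX(d2.Systemdate)
                      FROM #DOCSTATUS AS d2
                      WHERE d2.SO = d.SO)
ORDER BY d.ID;
-- Returns IDs 23 and 26
```

Note that if two rows for the same SO share the latest Systemdate, this form returns both of them, whereas the ROW_NUMBER() version returns exactly one.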

SQL aggregate rows with same id, specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) when a reference has any status other than 1. The idea is to sum the amount column only for references whose only status is 1. The query should not select a reference at all if it also has a status of 2, or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
1 100 1
2 120 1
2 -120 2
3 200 1
3 -200 2
4 450 1
Result:
amount | status
550 1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists to sum all rows where the status is 1 and other rows with the same reference and a non 1 status do not exist.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM table
WHERE reference NOT IN (
SELECT reference
FROM table
WHERE status<>1
)
The subquery selects all references that must be excluded, then the main query sums everything except them.
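One caveat with NOT IN is worth knowing: if the subquery ever returns a NULL reference, the NOT IN predicate is never true and no rows are summed. The sketch below shows this on a hypothetical temp table tx_null, which is the question's data shortened and extended with one made-up row whose reference is NULL.

```sql
-- tx_null is hypothetical; the last row is invented to show the NULL case.
CREATE TEMP TABLE tx_null (reference int, amount numeric, status int);

INSERT INTO tx_null (reference, amount, status)
VALUES (1, 100, 1), (2, 120, 1), (2, -120, 2), (NULL, 50, 3);

-- The subquery yields 2 and NULL. "reference NOT IN (2, NULL)" is never TRUE,
-- so no row qualifies and SUM returns NULL.
SELECT SUM(amount) AS amount
FROM tx_null
WHERE reference NOT IN (SELECT reference
                        FROM tx_null
                        WHERE status <> 1);

-- NOT EXISTS is not affected by the NULL reference and returns 100.
SELECT SUM(amount) AS amount
FROM tx_null t1
WHERE status = 1
  AND NOT EXISTS (SELECT 1
                  FROM tx_null t2
                  WHERE t2.reference = t1.reference
                    AND t2.status <> 1);
```

If reference is declared NOT NULL, the two forms behave the same on this kind of data.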
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
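The same HAVING idea can also be phrased positively with bool_and, which some readers may find easier to follow than a negated bool_or. This is a sketch on a hypothetical temp table tx holding the question's sample rows.

```sql
-- tx is a hypothetical stand-in for the question's table.
CREATE TEMP TABLE tx (reference int, amount numeric, status int);

INSERT INTO tx (reference, amount, status)
VALUES (1, 100, 1), (2, 120, 1), (2, -120, 2),
       (3, 200, 1), (3, -200, 2), (4, 450, 1);

-- Keep only references where every row has status 1, then add up the
-- per-reference totals.
SELECT SUM(amount) AS amount
FROM (
    SELECT SUM(amount) AS amount
    FROM tx
    GROUP BY reference
    HAVING bool_and(status = 1)
) s;
-- Returns 550 (references 1 and 4)
```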
You could use window functions to count occurrences of a status other than 1 within each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;

SQL - Set field value based on count of previous rows values

I have the following table structure in Microsoft SQL:
ID Name Number
1 John
2 John
3 John
4 Mark
5 Mark
6 Anne
7 Anne
8 Luke
9 Rachael
10 Rachael
I am looking to set the 'Number' field to the number of times the 'Name' field has appeared previously, using SQL.
Desired output as follows:
ID Name Number
1 John 1
2 John 2
3 John 3
4 Mark 1
5 Mark 2
6 Anne 1
7 Anne 2
8 Luke 1
9 Rachael 1
10 Rachael 2
The table is ordered by 'Name', so there is no worry of 'John' appearing under ID 11 again, using my example.
Any help would be appreciated. I'm not sure if I can do this with a simple SELECT statement, or whether I will need an UPDATE statement, or something more advanced.
Use ROW_NUMBER:
SELECT ID, Name,
ROW_NUMBER() OVER (PARTITION BY Name
ORDER BY ID) AS Number
FROM mytable
There is no need to add a field for this, as the value can be easily calculated using window functions.
You should be able to use the ROW_NUMBER() function in SQL Server to partition the rows by Name and number the rows within each partition:
SELECT ID,
Name,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS Number
FROM YourTable
ORDER BY ID
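If the Number column really does need to be stored rather than computed on the fly, SQL Server allows updating through a CTE. The following is a sketch under that assumption. #Names is a hypothetical temp table holding the question's rows, and rn is an illustrative alias.

```sql
-- #Names is a hypothetical copy of the question's table.
CREATE TABLE #Names (
    ID     int PRIMARY KEY,
    Name   varchar(50) NOT NULL,
    Number int NULL
);

INSERT INTO #Names (ID, Name)
VALUES (1, 'John'), (2, 'John'), (3, 'John'), (4, 'Mark'), (5, 'Mark'),
       (6, 'Anne'), (7, 'Anne'), (8, 'Luke'), (9, 'Rachael'), (10, 'Rachael');

-- Number each row within its Name group and write the result back.
WITH numbered AS (
    SELECT Number,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY ID) AS rn
    FROM #Names
)
UPDATE numbered
SET Number = rn;

SELECT ID, Name, Number
FROM #Names
ORDER BY ID;
-- Number for IDs 1..10: 1, 2, 3, 1, 2, 1, 2, 1, 1, 2
```

A stored value like this goes stale as rows are inserted or deleted, which is one reason to prefer the plain SELECT when possible.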
If your system doesn't support OVER (PARTITION BY ...), you can use the following code:
SELECT
ID,
Name,
(
SELECT
SUM(counterTable.nameCount)
FROM
mytable innerTable
JOIN (SELECT 1 as nameCount) as counterTable
WHERE
innerTable.ID <= outerTable.ID
AND outerTable.Name = innerTable.Name
) AS cumulative_sum
FROM
mytable outerTable
ORDER BY outerTable.ID
I used the following CREATE TABLE statement and then filled in your data:
CREATE TABLE `mytable` (
`ID` INT(11) NULL DEFAULT NULL,
`Name` VARCHAR(50) NULL DEFAULT NULL
);
This should work on database systems that do not support OVER (PARTITION BY ...), such as older versions of MySQL and MariaDB.
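The same idea can probably be written more compactly with COUNT(*) in the correlated subquery, which avoids the join against a one-row counter table. This is a sketch using plain SQL without window functions. The primary key and NOT NULL constraints are assumptions added for the example.

```sql
-- mytable as in the answer above, with assumed constraints.
CREATE TABLE mytable (
    ID   INT PRIMARY KEY,
    Name VARCHAR(50) NOT NULL
);

INSERT INTO mytable (ID, Name)
VALUES (1, 'John'), (2, 'John'), (3, 'John'), (4, 'Mark'), (5, 'Mark'),
       (6, 'Anne'), (7, 'Anne'), (8, 'Luke'), (9, 'Rachael'), (10, 'Rachael');

-- For each row, count the rows with the same Name and an ID up to this one.
SELECT o.ID,
       o.Name,
       (SELECT COUNT(*)
        FROM mytable i
        WHERE i.Name = o.Name
          AND i.ID <= o.ID) AS Number
FROM mytable o
ORDER BY o.ID;
-- Number for IDs 1..10: 1, 2, 3, 1, 2, 1, 2, 1, 1, 2
```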

Using GROUP BY, select ID of record in each group that has lowest ID

I am creating a file organization system where you can add content items to multiple folders.
I am storing the data in a table that has a structure similar to the following:
ID TypeID ContentID FolderID
1 101 1001 1
2 101 1001 2
3 102 1002 3
4 103 1002 2
5 103 1002 1
6 104 1001 1
7 105 1005 2
I am trying to select the first record for each unique TypeID and ContentID pair. For the above table, I would want the results to be:
ID
1
3
4
6
7
As you can see, the pairs 101 1001 and 103 1002 were each added to two folders, yet I only want the record with the first folder they were added to.
When I try the following query, however, I only get results that have at least two entries with the same TypeID and ContentID:
select MIN(ID)
from table
group by TypeID, ContentID
results in
ID
1
4
If I change MIN(ID) to MAX(ID) I get the correct amount of results, yet I get the record with the last folder they were added to and not the first folder:
ID
2
3
5
6
7
Am I using GROUP BY or the MIN wrong? Is there another way that I can accomplish this task of selecting the first record of each TypeID ContentID pair?
MIN() and MAX() should return the same number of rows. Changing the function should not change the number of rows returned by the query.
Is this query part of a larger query? From looking at the sample data provided, I would assume that this code is only a snippet from a larger action you are trying to do. Do you later try to join TypeID, ContentID or FolderID with the tables the IDs are referencing?
If yes, this error is likely being caused by another part of your query and not this select statement. If you are using joins or multi-level select statements, you can get a different number of results if the reference tables do not contain a record for all the foreign IDs.
Another suggestion: check whether any of the values in your records are NULL. Although this should not affect the GROUP BY, I have sometimes encountered strange behavior when dealing with NULL values.
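To check the claim about row counts, the question's query can be run on its own against the sample data. The sketch below does that with a hypothetical table name, folder_content, and an illustrative alias, FirstID.

```sql
-- folder_content is a hypothetical name for the question's table.
CREATE TABLE folder_content (
    ID        int PRIMARY KEY,
    TypeID    int NOT NULL,
    ContentID int NOT NULL,
    FolderID  int NOT NULL
);

INSERT INTO folder_content (ID, TypeID, ContentID, FolderID)
VALUES (1, 101, 1001, 1), (2, 101, 1001, 2), (3, 102, 1002, 3),
       (4, 103, 1002, 2), (5, 103, 1002, 1), (6, 104, 1001, 1),
       (7, 105, 1005, 2);

-- One row per (TypeID, ContentID) pair, carrying the lowest ID in the pair.
SELECT MIN(ID) AS FirstID
FROM folder_content
GROUP BY TypeID, ContentID
ORDER BY FirstID;
-- Returns 1, 3, 4, 6, 7
```

On this data the standalone query gives the five rows the question asks for, which supports the suspicion that the missing rows come from some other part of a larger query.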
Use ROW_NUMBER
WITH CTE AS
(SELECT ID,TypeID,ContentID,FolderID,
ROW_NUMBER() OVER (PARTITION BY TypeID,ContentID ORDER BY ID) as rn FROM t
)
SELECT ID FROM CTE WHERE rn=1
Use it with ORDER BY:
select *
from table
group by TypeID, ContentID
order by id
SQLFiddle: http://sqlfiddle.com/#!9/024016/12
Try first(id) instead of min(id):
select first(id)
from table
group by TypeID, ContentID
Does it work?