SQL Server sum field from previous calculation - sql

In SQL Server, I have a table with 4 columns:
artid num A B
46 1 417636000 0
47 1 15024000 0
102 1 3418105650 0
226 1 1160601286 0
60 668 260000 0
69 668 5500000 0
I want to add a new column to the result set that holds a calculated value.
This column should have a value like this:
artid num a b newColumnValue
----------- ----------- ---------------------- ---------------------- ----------------------
46 1 417636000 0 a-b+previous newColumnValue
I wrote this query, but I can't get the previous newColumnValue:
select *, (a- b+ lag(a- b, 1, a- b) over (order by num,artid)) as newColumnValue
FROM MainTbl
ORDER BY num,artid
I get this result:
artid num a b newColumnValue
----------- ----------- ---------------------- ---------------------- ----------------------
46 1 417636000 0 417636000
47 1 15024000 0 432660000
102 1 3418105650 0 3433129650
226 1 1160601286 0 4578706936
60 668 260000 0 1160861286
69 668 5500000 0 5760000
I want to get this result:
artid num a b newColumnValue
----------- ----------- ---------------------- ---------------------- ----------------------
46 1 417636000 0 417636000
47 1 15024000 0 432660000
102 1 3418105650 0 3850765650
226 1 1160601286 0 5011366936
60 668 260000 0 5011626936
69 668 5500000 0 5017126936

You want a cumulative sum of the difference a - b:
select artid, num, a, b, sum(a - b) over (order by num, artid) as newColumnValue
from MainTbl;
Note: SQL tables represent unordered sets. You need a column to specify the ordering to define previous. If you really only have two columns, then I might assume the ordering is based on a, and the query would be:
select a, b, sum(a - b) over (order by a)
from mytbl;
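As a quick check of the cumulative-sum approach, here is a minimal sketch that loads the question's six rows into an in-memory SQLite database and runs the same window expression. SQLite (3.25 or later) stands in for SQL Server here, which is an assumption of this sketch; the `SUM(...) OVER (ORDER BY ...)` syntax is the same in both.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MainTbl (artid INTEGER, num INTEGER, a INTEGER, b INTEGER)")
conn.executemany(
    "INSERT INTO MainTbl VALUES (?, ?, ?, ?)",
    [
        (46, 1, 417636000, 0),
        (47, 1, 15024000, 0),
        (102, 1, 3418105650, 0),
        (226, 1, 1160601286, 0),
        (60, 668, 260000, 0),
        (69, 668, 5500000, 0),
    ],
)

# Running total of a - b, ordered by num and then artid.
running = conn.execute(
    """
    SELECT artid, num, a, b,
           SUM(a - b) OVER (ORDER BY num, artid) AS newColumnValue
    FROM MainTbl
    ORDER BY num, artid
    """
).fetchall()

for row in running:
    print(row)
```

The last column reproduces the result the question asks for, ending at 5017126936 for artid 69.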

Given the following example data,
+----+---+---+
| Id | A | B |
+----+---+---+
| 1 | 2 | 3 |
+----+---+---+
| 2 | 3 | 4 |
+----+---+---+
| 3 | 4 | 5 |
+----+---+---+
| 4 | 5 | 6 |
+----+---+---+
| 5 | 6 | 7 |
+----+---+---+
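the following short SQL statement adds each row's A - B to the previous row's A - B (a pairwise sum, which is not the same as a running total over all earlier rows):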
select A - B + lag(A - B, 1, 0) over (order by id)
from test
+----+
| -1 |
+----+
| -2 |
+----+
| -2 |
+----+
| -2 |
+----+
| -2 |
+----+
Note that the LAG function takes three arguments: the first is the expression to evaluate on the "lagged" row, the second is the offset (it defaults to 1), and the third is the default value to return when no row exists at that offset (e.g. for the first row).
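To make the difference between LAG and a running total concrete, the following sketch runs both expressions side by side on the five example rows. It uses an in-memory SQLite database (3.25 or later) as a stand-in engine, and the column aliases `with_lag` and `running_total` are names chosen only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, A INTEGER, B INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7)],
)

rows = conn.execute(
    """
    SELECT id,
           -- current A - B plus the previous row's A - B (0 for the first row)
           A - B + LAG(A - B, 1, 0) OVER (ORDER BY id) AS with_lag,
           -- sum of A - B over all rows up to and including the current one
           SUM(A - B) OVER (ORDER BY id)               AS running_total
    FROM test
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

LAG only ever looks one row back, so `with_lag` settles at -2, while `running_total` keeps accumulating down to -5.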

Related

How to increment grouping number in query if consecutive values don't satisfy conditions defined?

I will describe problem briefly.
----------------------------------------------------------------------------------------------------
| Total UnitName UnitValue PartlyStatus PartlyValue CountMetric CountValue | RowNo
| |
| 79 A 7654 B 0 C 360 | 1
| 79 A 7656 B 0 C 360 | 2
| 79 A 7657 B 0 C 360 | 2
| 79 A 7658 B 0 C 360 | 2
| 79 A 7659 B 1 C 240 | 3
| 79 A 7660 B 0 C 360 | 4
| 79 A 7662 B 1 C 240 | 5
| 79 A 7663 B 1 C 240 | 5
| 79 A 7664 B 1 C 240 | 5
| 79 A 7665 B 1 C 240 | 5
| 79 A 7667 B 1 C 240 | 6
| 79 A 7668 B 1 C 240 | 6
| 79 A 7669 B 1 C 240 | 6
| 79 A 7670 B 0 C 360 | 7
| 79 A 7671 B 0 C 360 | 7
| 79 A 7672 B 0 C 360 | 7
---------------------------------------------------------------------------------------------------
I have to start a new row in my table in SQL Server Reporting Services (SSRS) whenever a constraint is not satisfied.
The rules I have to apply:
If the UnitValue numbers are not consecutive, move to the next row.
If the binary value of PartlyValue changes, move to the next row.
I have to write a query that produces a RowNo which increments whenever these conditions are not satisfied.
The table shown is the derived result of a long query, included to demonstrate the problem. The RowNo column is written out to show the intended result.
I am asking in order to understand the problem and find elegant approaches to it,
so conceptual query examples or solutions are fine as long as they point me in the right direction.
I think you just want window functions. It is a little hard to follow the logic but this does what you want:
select t.*,
sum(case when prev_uv = unitvalue - 1 and
prev_pv = partlyvalue
then 0 -- no new group
else 1
end) over (order by unitvalue) as rowno
from (select t.*,
lag(unitvalue) over (order by unitvalue) as prev_uv,
lag(partlyvalue) over (order by unitvalue) as prev_pv
from t
) t;
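The query above can be checked against the sample rows with a short sketch. It keeps only the two columns the logic depends on (`unitvalue` and `partlyvalue`) and runs in an in-memory SQLite database (3.25 or later), which is an assumption of this sketch; the question itself targets SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (unitvalue INTEGER, partlyvalue INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (7654, 0), (7656, 0), (7657, 0), (7658, 0), (7659, 1), (7660, 0),
        (7662, 1), (7663, 1), (7664, 1), (7665, 1), (7667, 1), (7668, 1),
        (7669, 1), (7670, 0), (7671, 0), (7672, 0),
    ],
)

result = conn.execute(
    """
    SELECT unitvalue, partlyvalue,
           -- add 1 whenever the row breaks the run; the running sum is the group number
           SUM(CASE WHEN prev_uv = unitvalue - 1 AND prev_pv = partlyvalue
                    THEN 0 ELSE 1 END) OVER (ORDER BY unitvalue) AS rowno
    FROM (SELECT t.*,
                 LAG(unitvalue)   OVER (ORDER BY unitvalue) AS prev_uv,
                 LAG(partlyvalue) OVER (ORDER BY unitvalue) AS prev_pv
          FROM t) AS s
    ORDER BY unitvalue
    """
).fetchall()

row_numbers = [r[2] for r in result]
print(row_numbers)
```

For the first row both LAG values are NULL, so the comparison is not true and the CASE falls through to 1, which starts the numbering at 1.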
You need to write functions in your solution explorer.

Postgres width_bucket() not assigning values to buckets correctly

In postgresql 9.5.3 I can't get width_bucket() to work as expected, it appears to be assigning values to the wrong buckets.
Dataset:
1
2
4
32
43
82
104
143
232
295
422
477
Expected output (bucket ranges and zero-count rows added to help analysis):
bucket | bucketmin | bucketmax | Expect | Actual
--------+-----------+-----------+--------|--------
1 | 1 | 48.6 | 5 | 5
2 | 48.6 | 96.2 | 1 | 2
3 | 96.2 | 143.8 | 2 | 1
4 | 143.8 | 191.4 | 0 | 0
5 | 191.4 | 239 | 1 | 1
6 | 239 | 286.6 | 0 | 1
7 | 286.6 | 334.2 | 1 | 0
8 | 334.2 | 381.8 | 0 | 1
9 | 381.8 | 429.4 | 1 | 0
10 | 429.4 | 477 | 1 | 1
Actual output:
wb | count
----+-------
1 | 5
2 | 2
3 | 1
5 | 1
6 | 1
8 | 1
10 | 1
Code to generate actual output:
create temp table metrics (val int);
insert into metrics (val) values(1),(2),(4),(32),(43),(82),(104),(143),(232),(295),(422),(477);
with metric_stats as (
select
cast(min(val) as float) as minV,
cast(max(val) as float) as maxV
from metrics m
),
hist as (
select
width_bucket(val, s.minV, s.maxV, 9) wb,
count(*)
from metrics m, metric_stats s
group by 1 order by 1
)
select * from hist;
Your calculations appear to be off. The following query:
with metric_stats as (
select cast(min(val) as float) as minV,
cast(max(val) as float) as maxV
from metrics m
)
select g.n,
s.minV + ((s.maxV - s.minV) / 9) * (g.n - 1) as bucket_start,
s.minV + ((s.maxV - s.minV) / 9) * g.n as bucket_end
from generate_series(1, 9) g(n) cross join
metric_stats s
order by g.n
Yields the following bins:
1 1 53.8888888888889
2 53.8888888888889 106.777777777778
3 106.777777777778 159.666666666667
4 159.666666666667 212.555555555556
5 212.555555555556 265.444444444444
6 265.444444444444 318.333333333333
7 318.333333333333 371.222222222222
8 371.222222222222 424.111111111111
9 424.111111111111 477
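I think you intend for the "9" to be a "10", if you want 10 buckets. Note also that width_bucket() treats the upper bound as exclusive, so a value equal to the upper bound (477 here) is placed in bucket count + 1. That is why bucket 10 shows up in your output even though only 9 buckets were requested.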
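The bucket assignment can be reproduced outside the database. The function below is a minimal Python sketch of how `width_bucket(operand, low, high, count)` behaves for integer inputs with `low < high`; it is an illustration written for this answer, not PostgreSQL's implementation.

```python
from collections import Counter

def width_bucket(operand, low, high, count):
    """Sketch of width_bucket for integer inputs with low < high.

    Values below low go to bucket 0, and values at or above high go to
    bucket count + 1, because the upper bound is exclusive.
    """
    if operand < low:
        return 0
    if operand >= high:
        return count + 1
    # Integer arithmetic avoids floating-point rounding at bucket edges.
    return (operand - low) * count // (high - low) + 1

vals = [1, 2, 4, 32, 43, 82, 104, 143, 232, 295, 422, 477]

hist9 = dict(sorted(Counter(width_bucket(v, 1, 477, 9) for v in vals).items()))
hist10 = dict(sorted(Counter(width_bucket(v, 1, 477, 10) for v in vals).items()))

print(hist9)   # bucket counts with 9 buckets, as in the question's query
print(hist10)  # bucket counts with 10 buckets
```

With 9 buckets this matches the "Actual output" in the question. With 10 buckets the counts match the "Expect" column except that 477 lands in bucket 11 rather than 10, so the upper bound passed to width_bucket() would need to be slightly above the maximum to keep it in the last bucket.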

PostgreSQL - finding and updating multiple records

I have a table:
ID | rows | dimensions
---+------+-----------
1 | 1 | 15 x 20
2 | 3 | 2 x 10
3 | 5 | 23 x 33
3 | 7 | 15 x 23
4 | 2 | 12 x 32
And I want to have something like that:
ID | rows | dimensions
---+------+-----------
1 | 1 | 15 x 20
2 | 3 | 2 x 10
3a | 5 | 23 x 33
3b | 7 | 15 x 23
4 | 2 | 12 x 32
How can I find the duplicated ID values and make them unique?
How can I update the parent table afterwards?
Thanks for your help!
with stats as (
SELECT "ID",
"rows",
row_number() over (partition by "ID" order by rows) as rn,
count(*) over (partition by "ID") as cnt
FROM Table1
)
UPDATE Table1
SET "ID" = CASE WHEN s.cnt > 1 THEN s."ID" || '-' || s.rn
ELSE s."ID"
END
FROM stats s
WHERE S."ID" = Table1."ID"
AND S."rows" = Table1."rows"
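I'm assuming you can't have two rows with the same ID and the same rows value; otherwise you need to include "dimensions" in the WHERE clause too.
In this case the duplicated IDs become 3-1 and 3-2 (a numeric suffix rather than the letters a and b), and this requires "ID" to be a text column.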
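If letter suffixes such as 3a and 3b are wanted, the row number can be mapped to a letter. The sketch below shows the idea as a SELECT in an in-memory SQLite database (3.25 or later), assuming the ID column is stored as text and no ID repeats more than 26 times. SQLite's `char()` is used here; the PostgreSQL counterpart is `chr()`. The alias `new_id` is a name chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Table1 (id TEXT, "rows" INTEGER, dimensions TEXT)')
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("1", 1, "15 x 20"),
        ("2", 3, "2 x 10"),
        ("3", 5, "23 x 33"),
        ("3", 7, "15 x 23"),
        ("4", 2, "12 x 32"),
    ],
)

renamed = conn.execute(
    """
    SELECT CASE WHEN cnt > 1
                THEN id || char(96 + rn)   -- 97 is 'a', 98 is 'b', ...
                ELSE id
           END AS new_id,
           "rows", dimensions
    FROM (SELECT id, "rows", dimensions,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY "rows") AS rn,
                 COUNT(*)     OVER (PARTITION BY id)                 AS cnt
          FROM Table1) AS s
    ORDER BY id, "rows"
    """
).fetchall()

new_ids = [r[0] for r in renamed]
print(new_ids)
```

Only IDs that occur more than once receive a suffix; unique IDs are returned unchanged.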

Window running function except current row

I have a theoretical question, so I'm not interested in alternative solutions. Sorry.
Q: Is it possible to get the window running function values for all previous rows, except current?
For example:
with
t(i,x,y) as (
values
(1,1,1),(2,1,3),(3,1,2),
(4,2,4),(5,2,2),(6,2,8)
)
select
t.*,
sum(y) over (partition by x order by i) - y as sum,
max(y) over (partition by x order by i) as max,
count(*) filter (where y > 2) over (partition by x order by i) as cnt
from
t;
Actual result is
i | x | y | sum | max | cnt
---+---+---+-----+-----+-----
1 | 1 | 1 | 0 | 1 | 0
2 | 1 | 3 | 1 | 3 | 1
3 | 1 | 2 | 4 | 3 | 1
4 | 2 | 4 | 0 | 4 | 1
5 | 2 | 2 | 4 | 4 | 1
6 | 2 | 8 | 6 | 8 | 2
(6 rows)
I want the max and cnt columns to behave like the sum column (that is, to exclude the current row), so the result should be:
i | x | y | sum | max | cnt
---+---+---+-----+-----+-----
1 | 1 | 1 | 0 | | 0
2 | 1 | 3 | 1 | 1 | 0
3 | 1 | 2 | 4 | 3 | 1
4 | 2 | 4 | 0 | | 0
5 | 2 | 2 | 4 | 4 | 1
6 | 2 | 8 | 6 | 4 | 1
(6 rows)
It can be achieved using simple subquery like
select t.*, lag(y,1) over (partition by x order by i) as yy from t
but is it possible using only window function syntax, without subqueries?
Yes, you can. This does the trick:
with
t(i,x,y) as (
values
(1,1,1),(2,1,3),(3,1,2),
(4,2,4),(5,2,2),(6,2,8)
)
select
t.*,
sum(y) over w as sum,
max(y) over w as max,
count(*) filter (where y > 2) over w as cnt
from t
window w as (partition by x order by i
rows between unbounded preceding and 1 preceding);
The frame_clause restricts each row's window frame to just the rows of its partition that you are interested in: here, every row before the current one.
Note that in the sum column you'll get null rather than 0 because of the frame clause: the first row of each partition has no row before it, so its frame is empty. You can coalesce() this away if needed.
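The frame clause can be tried out with the question's six rows. This sketch runs in an in-memory SQLite database (3.25 or later) rather than PostgreSQL, and it swaps `count(*) filter (where y > 2)` for `COUNT(CASE WHEN y > 2 THEN 1 END)`, an equivalent form chosen here only so that older SQLite builds without FILTER support can run it. The `_prev` aliases are names invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute(
    """
    WITH t(i, x, y) AS (
        VALUES (1,1,1),(2,1,3),(3,1,2),
               (4,2,4),(5,2,2),(6,2,8)
    )
    SELECT i, x, y,
           COALESCE(SUM(y) OVER w, 0)            AS sum_prev,  -- NULL on an empty frame, turned into 0
           MAX(y) OVER w                         AS max_prev,  -- stays NULL on an empty frame
           COUNT(CASE WHEN y > 2 THEN 1 END) OVER w AS cnt_prev
    FROM t
    WINDOW w AS (PARTITION BY x ORDER BY i
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
    ORDER BY i
    """
).fetchall()

for row in rows:
    print(row)
```

COUNT returns 0 on an empty frame while SUM and MAX return NULL, which is why only the sum needs COALESCE to match the requested output.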

Reduce rows in SQL

I have a select query that will return something like the following table:
start | stop | id
------------------
0 | 100 | 1
1 | 101 | 1
2 | 102 | 1
2 | 102 | 2
5 | 105 | 1
7 | 107 | 2
...
300 | 400 | 1
370 | 470 | 1
450 | 550 | 1
Where stop = start + n; n = 100 in this case.
I would like to merge the overlaps for each id:
start | stop | id
------------------
0 | 105 | 1
2 | 107 | 2
...
300 | 550 | 1
id 1 does not give 0 - 550 because the start 300 is after stop 105.
There will be hundreds of thousands of records returned by the first query and n can go up to tens of thousands, so the faster it can be processed the better.
Using PostgreSQL btw.
WITH bounds AS
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY start) AS rn
FROM (
SELECT id, LAG(stop) OVER (PARTITION BY id ORDER BY start) AS pstop, start
FROM q
UNION ALL
SELECT id, MAX(stop), NULL
FROM q
GROUP BY
id
) q2
WHERE start > pstop OR pstop IS NULL OR start IS NULL
)
SELECT b2.start, b1.pstop AS stop, b1.id
FROM bounds b1
JOIN bounds b2
ON b1.id = b2.id
AND b1.rn = b2.rn + 1
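The rule the query relies on (a new range starts whenever a row's start is greater than the previous stop for the same id) can also be shown in memory. The following is a plain Python sketch of that rule, not a translation of the SQL above, and it uses only the rows that are written out in the question; the function name `merge_overlaps` is invented for this illustration.

```python
def merge_overlaps(rows):
    """Merge overlapping (start, stop, id) ranges per id.

    Rows are sorted by id and start; a row extends the current range when
    its start is not greater than the range's stop, otherwise it opens a
    new range.
    """
    merged = []
    for start, stop, id_ in sorted(rows, key=lambda r: (r[2], r[0])):
        if merged and merged[-1][2] == id_ and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], stop)
        else:
            merged.append([start, stop, id_])
    return [tuple(m) for m in merged]

sample = [
    (0, 100, 1), (1, 101, 1), (2, 102, 1), (2, 102, 2),
    (5, 105, 1), (7, 107, 2),
    (300, 400, 1), (370, 470, 1), (450, 550, 1),
]

merged_ranges = merge_overlaps(sample)
print(merged_ranges)
```

The result is ordered by id and then start, so id 1 yields the ranges 0 to 105 and 300 to 550, and id 2 yields 2 to 107.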