Get last value of a column in SQL Server - sql

I want to get the last value of a column (it is not an identity column) and increment it by the corresponding generated row number.
Select isnull(LAST_VALUE(ColumnA) over(order by ColumnA), 0) +
ROW_NUMBER() OVER (ORDER BY ColumnA)
from myTable
I am calling my stored procedure recursively, which is why I thought of this logic.
But it is not working.
I basically wanted 1-9 for the first run, 10-19 for the second run (if called recursively twice), and so on.

Total stab in the dark, but I suspect "not working" means "returning the current row's value." Don't forget that an OVER clause defaults to the frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW when one isn't explicitly specified and there is an ORDER BY (see SELECT - OVER Clause (Transact-SQL) - ORDER BY).
ORDER BY
ORDER BY *order_by_expression* [COLLATE *collation_name*] [ASC|DESC]
Defines the logical order of the rows within each partition of the result set. That is, it specifies the logical order in which the window function calculation is performed.
If it is not specified, the default order is ASC and the window function will use all rows in the partition.
If it is specified, and ROWS/RANGE is not specified, then the default RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as the window frame by functions that can accept an optional ROWS/RANGE specification (for example, MIN or MAX).
As you haven't defined the window, that's what your LAST_VALUE function is using. Define that you want the whole lot for the partition:
SELECT ISNULL(LAST_VALUE(ColumnA) OVER (ORDER BY ColumnA ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), 0) +
ROW_NUMBER() OVER (ORDER BY ColumnA)
FROM dbo.myTable;
Though what Gordon says in their comment is the real solution:
You should be using an identity column or sequence.
This type of solution can (and will) end up suffering race conditions, as well as reusing "identities" when it shouldn't.
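To see the default-frame behaviour concretely, here is a runnable sketch using SQLite via Python's sqlite3 module. SQLite follows the same rule as SQL Server (ORDER BY without an explicit frame implies RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW); the table name and sample data are made up, and IFNULL stands in for T-SQL's ISNULL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE myTable (ColumnA INT)")
con.executemany("INSERT INTO myTable VALUES (?)", [(10,), (20,), (30,)])

# Default frame (ORDER BY without ROWS/RANGE): LAST_VALUE only sees rows up to
# the current row, so each row is its own "last value".
default_frame = con.execute("""
    SELECT LAST_VALUE(ColumnA) OVER (ORDER BY ColumnA)
    FROM myTable ORDER BY ColumnA
""").fetchall()

# Explicit full frame: LAST_VALUE sees the whole partition, so every row gets
# the true last value (30), plus its row number.
full_frame = con.execute("""
    SELECT IFNULL(LAST_VALUE(ColumnA) OVER
               (ORDER BY ColumnA
                ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), 0)
           + ROW_NUMBER() OVER (ORDER BY ColumnA)
    FROM myTable ORDER BY ColumnA
""").fetchall()

print(default_frame)  # [(10,), (20,), (30,)] - current row's value each time
print(full_frame)     # [(31,), (32,), (33,)] - 30 + 1, 30 + 2, 30 + 3
```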

Related

ERROR: Aggregate window functions with an ORDER BY clause require a frame clause

I am getting an 'ERROR: Aggregate window functions with an ORDER BY clause require a frame clause' message when entering the following query on Redshift. Please help - I am trying to view the growth of members from day 1 until today. Thanks.
select date(timestampregistered), count(distinct(memberid)),
(SUM(count(distinct(memberid))) OVER (ORDER BY date(timestampregistered)))
AS total_users
from table
order by date(timestampregistered);
You have a couple of things going on. First, you seem to be missing a GROUP BY clause for the proper operation of COUNT() by date.
Next, you need to specify the range of "counts" you want to SUM(). Specifically, you want to sum counts for previous dates up to and including the current row's date, but not later dates.
select date(timestampregistered), count(distinct(memberid)),
(SUM(count(distinct(memberid))) OVER (ORDER BY date(timestampregistered) ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
AS total_users
from table
group by date(timestampregistered)
order by date(timestampregistered);
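Here is a runnable sketch of the same running total using SQLite via Python's sqlite3 module. The table and data are invented, and the daily count is computed in a subquery for portability (some engines are stricter than Redshift about nesting an aggregate inside a window function):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE registrations (memberid INT, timestampregistered TEXT)")
con.executemany(
    "INSERT INTO registrations VALUES (?, ?)",
    [(1, "2023-01-01 09:00"), (2, "2023-01-01 10:00"),
     (2, "2023-01-01 11:00"),   # duplicate member, counted once per day
     (3, "2023-01-02 08:00"), (4, "2023-01-03 12:00")],
)

rows = con.execute("""
    SELECT day, new_users,
           SUM(new_users) OVER (
               ORDER BY day
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS total_users
    FROM (SELECT DATE(timestampregistered) AS day,
                 COUNT(DISTINCT memberid) AS new_users
          FROM registrations
          GROUP BY day) AS daily
    ORDER BY day
""").fetchall()

print(rows)  # [('2023-01-01', 2, 2), ('2023-01-02', 1, 3), ('2023-01-03', 1, 4)]
```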

Redshift SQL - Running Sum using Unbounded Preceding and Following

When we use a window function to calculate a running sum, like SUM(sales) OVER (PARTITION BY dept ORDER BY date), if we don't specify the range/window, is the default setting BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, basically from the first row until the current row?
According to this doc it seems to be the case, but I wanted to double check.
Thanks!
The problem you are running into is: what does the database engine assume in ambiguous circumstances? I've run into this exact case before when porting from SQL Server to Redshift. SQL Server assumes that if you order but don't specify a frame, you want unbounded preceding to current row. Other databases do not make the same assumption: in some, an unspecified frame means unbounded preceding to unbounded following, and yet others will throw an error if you specify an ORDER BY but don't specify a frame. Bottom line - don't let the DB engine guess what you want; be specific.
Gordon is correct that this is based on rows, not ranges. If you want a running sum by date (not row), you can group by date and run the window function - windows execute after group by in a single query.
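The "be specific" advice above can be sketched as follows, using SQLite via Python's sqlite3 module (the dept/date/sales table is invented). Spelling out the frame makes the running sum behave identically regardless of each engine's default:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE dept_sales (dept TEXT, sale_date TEXT, sales INT)")
con.executemany("INSERT INTO dept_sales VALUES (?, ?, ?)",
    [("a", "2023-01-01", 5), ("a", "2023-01-02", 7), ("b", "2023-01-01", 3)])

# Explicit frame: running sum per dept, from the first row to the current row.
rows = con.execute("""
    SELECT dept, sale_date,
           SUM(sales) OVER (
               PARTITION BY dept ORDER BY sale_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_sales
    FROM dept_sales
    ORDER BY dept, sale_date
""").fetchall()

print(rows)  # [('a', '2023-01-01', 5), ('a', '2023-01-02', 12), ('b', '2023-01-01', 3)]
```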

SQL create increment field based on the values of another field

I need to generate an increment field based on the difference between the current and previous value of another field:
So for example, this table would look like this:
I have this data in postgresql and my query is currently generating the table in first image, but I need it to create the second one.
Would be thankful for any hints.
I would recommend using lag():
select t.*,
(totalreply -
lag(totalreply, 1, totalreply) over (order by month)
) as incremental_totalreply
from t;
Note that this uses the 3-argument form of lag(), so the first row's value is 0 rather than NULL.
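A runnable sketch of this lag() approach, using SQLite via Python's sqlite3 module with made-up month/totalreply data; the 3-argument form makes the first row's difference 0 instead of NULL:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (month INT, totalreply INT)")
con.executemany("INSERT INTO t VALUES (?, ?)", [(1, 10), (2, 15), (3, 22)])

# LAG(expr, offset, default): the third argument is returned when there is no
# previous row, so the first row yields totalreply - totalreply = 0.
rows = con.execute("""
    SELECT t.*,
           totalreply - LAG(totalreply, 1, totalreply) OVER (ORDER BY month)
               AS incremental_totalreply
    FROM t
    ORDER BY month
""").fetchall()

print(rows)  # [(1, 10, 0), (2, 15, 5), (3, 22, 7)]
```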
You can use a window function; try this:
select month, totalread, (totalread -
lead(totalread, -1, totalread) over(order by totalread))
from table1;
From the documentation for lead:
returns value evaluated at the row that is offset rows after the current row within the partition; if there is no such row, instead return default (which must be of the same type as value). Both offset and default are evaluated with respect to the current row. If omitted, offset defaults to 1 and default to null

Cumulative count for calculating daily frequency using SQL query (in Amazon Redshift)

I have a dataset containing 'UI' (unique id), time, and frequency (the frequency of the given value in the UI column), as shown here:
What I would like is to add a new column named 'daily_frequency' which simply counts each unique value in the UI column for a given day sequentially, as I show in the image below.
For example if UI=114737 and it is repeated 2 times in one day, we should have 1, and 2 in the daily_frequency column.
I could do that with Python and the pandas package using the groupby and cumcount methods as follows ...
df['daily_frequency'] = df.groupby(['UI','day']).cumcount()+1
However, for some reason, I must do this via SQL queries (Amazon Redshift).
I think you want a running count, which could be calculated as:
COUNT(*) OVER (PARTITION BY ui, TRUNC(time) ORDER BY time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS daily_frequency
Although Salman's answer seems to be correct, I think ROW_NUMBER() is simpler:
ROW_NUMBER() OVER (PARTITION BY ui, time::date
                   ORDER BY time
                  ) AS daily_frequency
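Here is a runnable sketch of the ROW_NUMBER() approach, using SQLite via Python's sqlite3 module (the events table and ids are invented; DATE(time) stands in for Redshift's time::date):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (ui INT, time TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?)",
    [(114737, "2023-05-01 09:00"), (114737, "2023-05-01 14:00"),
     (114737, "2023-05-02 10:00"), (999,    "2023-05-01 11:00")])

# Partition by (ui, day): the counter restarts at 1 for each id on each day.
rows = con.execute("""
    SELECT ui, time,
           ROW_NUMBER() OVER (PARTITION BY ui, DATE(time) ORDER BY time)
               AS daily_frequency
    FROM events
    ORDER BY ui, time
""").fetchall()

print(rows)
# [(999, '2023-05-01 11:00', 1),
#  (114737, '2023-05-01 09:00', 1),
#  (114737, '2023-05-01 14:00', 2),
#  (114737, '2023-05-02 10:00', 1)]
```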

LAST_VALUE() rows between unbounded preceding and unbounded following

What is the use of this statement?
Please elaborate with an example. I came across it while using the LAST_VALUE function.
from https://forums.oracle.com/forums/thread.jspa?threadID=1018352
When you ORDER a set of records in analytic functions, you can specify a range of rows to consider, ignoring the others.
You can do this using the ROWS clause:
UNBOUNDED PRECEDING
The range starts at the first row of the partition.
UNBOUNDED FOLLOWING
The range ends at the last row of the partition.
CURRENT ROW
The range begins or ends at the current row.
n PRECEDING or n FOLLOWING
The range starts or ends n rows before or after the current row.
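The frame bounds above can be illustrated side by side with a runnable sketch, using SQLite via Python's sqlite3 module (table and values are invented). With the full frame, FIRST_VALUE and LAST_VALUE see the whole partition; with `1 PRECEDING`, SUM only sees a two-row moving window:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE nums (x INT)")
con.executemany("INSERT INTO nums VALUES (?)", [(1,), (2,), (3,), (4,)])

rows = con.execute("""
    SELECT x,
           -- full frame: every row sees the entire partition
           FIRST_VALUE(x) OVER (ORDER BY x
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
               AS first_in_frame,
           LAST_VALUE(x) OVER (ORDER BY x
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
               AS last_in_frame,
           -- n PRECEDING: a moving window of the previous row and this one
           SUM(x) OVER (ORDER BY x
               ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) AS moving_sum
    FROM nums
    ORDER BY x
""").fetchall()

print(rows)  # [(1, 1, 4, 1), (2, 1, 4, 3), (3, 1, 4, 5), (4, 1, 4, 7)]
```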
This is explained quite well in the manual:
http://www.postgresql.org/docs/current/static/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS