Generating row number without ordering any column

I want to generate row numbers in the same order the data are added.
The below query is working fine for SQL Server.
SELECT *,ROW_NUMBER() OVER (ORDER BY (SELECT 100)) AS SNO FROM TestTable
I need a standard query that achieves the same in Firebird. Can anyone suggest one?

You can't use row_number() over (order by (select 100)) with Firebird, because Firebird - as required by the SQL standard - requires a from clause for a select. The equivalent in Firebird would be row_number() over (order by (select 100 from rdb$database)).
The best solution would be to use an actual column for the order by to ensure a deterministic order.
According to the SQL:2016 standard, an order by is not required for row_number() (but it is for rank() and dense_rank()). Unfortunately, it looks like Microsoft applied that requirement to row_number() as well, possibly for uniformity with the rank functions, and maybe because row_number() without an order does not make a lot of sense. Using row_number() over () with SQL Server yields an error "The function 'row_number' must have an OVER clause with ORDER BY.", but works with Firebird.
SQL Server also enforces that the order by in a window function is not a numeric column reference. Using row_number() over (order by 1) with SQL Server yields an error "Windowed functions, aggregates and NEXT VALUE FOR functions do not support integer indices as ORDER BY clause expressions.", but works with Firebird (although the 1 is taken as a literal 1, and not as column reference, contrary to an order by on select level).
SQL Server also does not support using constants or literals in the order by in a window function. Using row_number() over (order by '1') with SQL Server yields an error "Windowed functions, aggregates and NEXT VALUE FOR functions do not support constants as ORDER BY clause expressions.", but works with Firebird.
I did find a trick that worked for both Firebird 3 and SQL Server 2017, but it is a dirty hack:
row_number() over (order by current_user)
This works because SQL Server doesn't consider current_user as a constant, but as a function, which means it doesn't fall under the 'no constants allowed'-rule.
Be aware that this trick may yield inconsistent row numbers (e.g. in Firebird, multiple window functions evaluated with different constants will yield different values, and the window function is evaluated before an order by on select level), so you may want to consider simply tracking a row index in your application.
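As a minimal sketch of the "actual column" suggestion: assuming TestTable has a column that reflects insertion order (the name id is hypothetical, for example an identity or generator-fed column), the following should run unchanged on both Firebird 3 and SQL Server:

```sql
-- Assumes a hypothetical column "id" that increases with each insert.
SELECT t.*, ROW_NUMBER() OVER (ORDER BY t.id) AS SNO
FROM TestTable t;
```

Without such a column, neither database guarantees that rows come back in insertion order.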

One way could be to use ORDER BY RAND():
CREATE TABLE TestTable(i INT);
INSERT INTO TestTable(i) VALUES (10);
INSERT INTO TestTable(i) VALUES (20);
INSERT INTO TestTable(i) VALUES (30);
SELECT TestTable.*,ROW_NUMBER() OVER (ORDER BY RAND()) AS SNO
FROM TestTable;
db<>fiddle demo - Firebird
db<>fiddle demo - SQL Server

Related

How to reuse a computed value multiple times?

Basically I just want a simple way of finding the most recent date in a table, saving it as a variable, and reusing that variable in the same query.
Right now this is how I'm doing it:
with recent_date as (
select max(date)
from mytable
)
select *
from mytable
where date = (select * from recent_date)
(For this simple example, a variable is overkill, but in my real-world use-case I reuse the recent date multiple times in the same query.)
But that feels cumbersome. It would be a lot cleaner to save the recent date to a variable rather than a table and having to select from it.
In pseudo-code, something like this would be nice:
$recent_date = (select max(date) from mytable)
select *
from mytable
where date = $recent_date
Is there something like that in Postgres?
Better for the simple case
For the scope of a single query, CTEs are a good tool. In my hands the query would look like this:
WITH recent(date) AS (SELECT max(date) FROM mytable)
SELECT m.*
FROM recent r
JOIN mytable m USING (date)
Except that the actual example query would boil down to this in my hands:
SELECT *
FROM mytable
ORDER BY date DESC NULLS LAST
FETCH FIRST 1 ROWS WITH TIES;
NULLS LAST only if there can be NULL values. See:
Sort by column ASC, but NULL values first?
WITH TIES only if date isn't UNIQUE NOT NULL. See:
Get top row(s) with highest value, with ties
In combination with an index on mytable (date) (or more specific), this produces the best possible query plan. Look no further.
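A sketch of a matching index, assuming the NULLS LAST variant of the query is used; the index name is made up. A plain index on (date) is sorted ASC NULLS LAST, so scanned backwards it yields DESC NULLS FIRST, which does not match the ORDER BY above:

```sql
-- Hypothetical index name; sort order matches ORDER BY date DESC NULLS LAST.
CREATE INDEX mytable_date_desc_idx ON mytable (date DESC NULLS LAST);
```

If date is NOT NULL and NULLS LAST is dropped from the query, a plain index on (date) serves just as well.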
No, I need variables!
If you positively need variables scoped for the same command, transaction, session or more, there are various options.
The closest thing to "variables" in SQL in Postgres are "customized options". See:
User defined variables in PostgreSQL
You can only store text, any other type has to be cast (and cast back on retrieval).
To set and retrieve a value from within a query, use the Configuration Settings Functions set_config() and current_setting():
SELECT set_config('foo.recent', max(date)::text, false) FROM mytable;
SELECT *
FROM mytable
WHERE date = current_setting('foo.recent')::date;
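The third argument of set_config() is is_local. A sketch of the same pattern scoped to a single transaction, so that the value reverts when the transaction ends (the option name foo.recent is the same arbitrary example as above):

```sql
BEGIN;
-- is_local = true: the value only lasts until the end of this transaction
SELECT set_config('foo.recent', max(date)::text, true) FROM mytable;

SELECT *
FROM   mytable
WHERE  date = current_setting('foo.recent')::date;
COMMIT;
```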
Typically, there are more efficient ways.
If you need that "recent date" a lot, consider a simple function as "global variable", usable by all transactions in all sessions (but each new command sees its own current state):
CREATE FUNCTION f_recent_date()
RETURNS date
LANGUAGE sql STABLE PARALLEL SAFE AS
'SELECT max(date) FROM mytable';
STABLE is a valid volatility setting as the function returns the same result within the same query. Be sure to actually make it STABLE, so Postgres does not evaluate repeatedly. In Postgres 9.6 or later, also make it PARALLEL SAFE. Then your query becomes:
SELECT * FROM mytable WHERE date = f_recent_date();
More options:
Is there a way to define a named constant in a PostgreSQL query?
Passing user id to PostgreSQL triggers
Typically, if I need variables in Postgres, I use a PL/pgSQL code block in a function, a procedure, or a DO statement for ad-hoc use without the need to return rows:
DO
$do$
DECLARE
_recent_date date := (SELECT max(date) FROM mytable);
BEGIN
PERFORM * FROM mytable WHERE date = _recent_date;
-- more queries using _recent_date ...
END
$do$;
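Since a DO statement cannot return rows, here is a sketch of the same idea wrapped in a PL/pgSQL function that does. The function name f_recent_rows is hypothetical:

```sql
CREATE FUNCTION f_recent_rows()
  RETURNS SETOF mytable
  LANGUAGE plpgsql STABLE AS
$func$
DECLARE
   -- computed once, then reused as often as needed below
   _recent_date date := (SELECT max(date) FROM mytable);
BEGIN
   RETURN QUERY
   SELECT * FROM mytable WHERE date = _recent_date;
END
$func$;

-- Usage:
SELECT * FROM f_recent_rows();
```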
PL/pgSQL may be what you should be using to begin with. Further reading:
When to use stored procedure / user-defined function?
Keep in mind that in SQL you cannot directly declare a variable. Basically, a CTE creates a variable (or a set of them), and in SQL you use a variable by selecting it. However, if you want to avoid that structure, you can just get the value directly from a subquery.
select *
from mytable
where date = (select max(date) from mytable);

OVER clause for VARCHAR

I can use the over clause for numeric and date columns with an aggregate function, but I'm stuck on using it for a varchar column. In the example below, I can reproduce the FIRST_FILL_DT column using the following line:
MIN(FILL_DATE) OVER(PARTITION BY ID) AS FIRST_FILL_DT
However, when trying to produce the FIRST_BP_MED column, I am not sure if I can use similar syntax because I don't know if the aggregate function works correctly with VARCHAR Columns.
Can anyone please offer insights or guidance on how to solve this?
My data is like this:
My desired data should look like this:
If your database supports the FIRST_VALUE window function, you can use something like this:
FIRST_VALUE(BP_MED) OVER (PARTITION BY ID ORDER BY FILL_DATE) AS first_bp_med
Docs for FIRST_VALUE:
MySQL, SQL Server, PostgreSQL, SQLite
This is pretty straightforward. Use FIRST_VALUE over your window clause to pick the first value emitted by your partition, irrespective of the condition.
https://learn.microsoft.com/en-us/sql/t-sql/functions/first-value-transact-sql?view=sql-server-ver15
SELECT
ID, FILL_DATE, BP_MED,
MIN (FILL_DATE) OVER (PARTITION BY ID ORDER BY FILL_DATE) AS FIRST_FILL_DT,
FIRST_VALUE (BP_MED) OVER (PARTITION BY ID ORDER BY FILL_DATE) AS FIRST_BP_MED
FROM
YOURTABLE;
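To illustrate why MIN() is not the right tool for the varchar column, here is a small sketch in PostgreSQL syntax. The sample rows are invented for illustration and are not the asker's data. MIN(bp_med) picks the alphabetically smallest value, while FIRST_VALUE picks the value from the earliest fill:

```sql
WITH fills (id, fill_date, bp_med) AS (
   VALUES (1, DATE '2020-01-05', 'Lisinopril')   -- made-up sample rows
        , (1, DATE '2020-02-01', 'Amlodipine')
        , (2, DATE '2020-03-10', 'Losartan')
)
SELECT id, fill_date, bp_med
     , MIN(bp_med) OVER (PARTITION BY id) AS alphabetic_min
     , FIRST_VALUE(bp_med) OVER (PARTITION BY id ORDER BY fill_date) AS first_bp_med
FROM   fills;
```

For id 1, alphabetic_min is 'Amlodipine' but first_bp_med is 'Lisinopril'. If two fills for the same ID can share a date, adding a tie-breaker to the ORDER BY keeps the result deterministic.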

How do I update this query so as to use listagg instead of wm_concat?

I have a query that looks like this, which I inherited from another developer. This is its select statement:
select distinct
wm_concat(
nvl(
listagg(USER_CODE,',') within group (order by USER_CODE),
USER_CODE
)
)
How can I update this to work using listagg?
I understand what listagg does and how it operates, but I'm not sure what the outcome of wrapping this nvl-wrapped listagg in wm_concat was to begin with, and since we're on 12c now, I can't test what their old output was supposed to look like.
WM_CONCAT() is an undocumented Oracle function, that does pretty much the same thing as LISTAGG(), and whose usage is discouraged. Since it is not officially supported, it may break anytime when you upgrade.
You did not show the whole query, so this is still to be confirmed, but:
I do not see the logic of using WM_CONCAT() as a wrapper around LISTAGG()
the use of NVL(LISTAGG(user_code ...) ..., user_code) does not seem to make sense: LISTAGG() is an aggregate function, so using it implies that column user_code is aggregated. Since this column is aggregated, you cannot use it as a second argument to NVL()...
Bottom line, I would simply suggest to drop all that fancy (and probably invalid) stuff and use a simple aggregate expression:
SELECT LISTAGG(user_code, ',') WITHIN GROUP (ORDER BY user_code) ...
WM_CONCAT(some_field) is essentially the same thing as LISTAGG(some_field, ',') WITHIN GROUP (ORDER BY some_field), so since LISTAGG(...) returns a single value the WM_CONCAT is effectively a no-op. Eliminate it from your query and move on.
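If the select distinct in the original was meant to remove duplicate codes, one sketch that works on 12c is to deduplicate in a subquery first (LISTAGG(DISTINCT ...) only arrived in Oracle 19c). The table name user_table is a placeholder, since the original FROM clause was not shown:

```sql
-- user_table is a placeholder for the real source of USER_CODE
SELECT LISTAGG(user_code, ',') WITHIN GROUP (ORDER BY user_code) AS user_codes
FROM  (SELECT DISTINCT user_code FROM user_table);
```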

create primary key ROW_NUMBER() over function sql teradata

I want to create a primary key in my select statement. I read that I can use the ROW_NUMBER() over function. But as it is going to be the primary key, I don't have any columns for order by or partition by. I tried using just select row_number() as PK, but that throws error [3706] syntax error: expected something between ( and as keyword.
how could I resolve the issue?
You would need an over clause. I'm not sure if the order by is optional in Teradata (I don't have a version on hand):
row_number() over ()
row_number() over (order by <some column here>)
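Put into a complete statement, the second form might look like this. The table and column names are placeholders; if the ordering column contains duplicates, the numbers are still unique, but their assignment among ties is arbitrary:

```sql
-- source_table and columnABC are placeholder names
SELECT ROW_NUMBER() OVER (ORDER BY src.columnABC) AS PK
     , src.columnABC
FROM   source_table src;
```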
Are you trying to create an auto-generated number that you can then use as a primary index for good distribution across the AMPs in Teradata nodes (which you refer to as PK in your select statement)?
If so, and if you don't want to use the IDENTITY COLUMN data type to do that for you (pros and cons exist), then you could generate such an auto number to be used as PI in Teradata by simply using a csum function. (Mind you, your target table must not be too large, i.e. more than a few hundred thousand to a million rows.)
SELECT
mx.max_id + csum(1,1) as PI_column
,src.columnABC
from
source_table src
cross join
(SELECT max(id) as max_id from target_table) as mx
group by 1,2
order by 1;
This will generate a new PI/PK/Unique ID column to be used for PI with good distribution for every unique combination of ColumnABC.
Hope this helps.
If my "if" statement at the beginning was not true, then please explain further what you are trying to do and I will be happy to help you with that.

SQL-92 (Filemaker): How can I UPDATE a list of sequential numbers?

I need to re-assign all SortID's, starting from 1 until MAX (SortID) from a subset of records of table Beleg, using SQL-92, after one of the SortID's has changed (for example from 444 to 444.1). I have tried several ways (for example SET #a:=0; UPDATE table SET field=#a:=#a+1 WHERE whatever='whatever' ORDER BY field2), but it didn't work, as these solutions all need a special kind of SQL, like SQLServer or Oracle, etc.
The SQL that I use is SQL-92, implemented in FileMaker (INSERT and UPDATE are available, though, but nothing fancy).
Thanks for any hint!
Gary
From what I know, SQL-92 is a standard and not a language. So you can say you are using T-SQL, which is mostly SQL-92 compliant, but you can't say I program SQL Server in SQL-92. The same applies to FileMaker.
I suppose you are trying to update your table through ODBC? The Update statement looks OK, but there are no variables in FileMaker SQL (and I am not sure using a variable inside a query will give you the result you expect; I think you will set SortId in every row to 1). You are thinking about doing something like window functions with row() in TSQL, but I do not think this functionality is available.
The easiest solution is to use FileMaker, resetting the numbering for a column is really a trivial task which takes seconds. Do you need help with this?
Edit:
I was referring to the TSQL functions rank() and row_number(); there is no row() function in TSQL.
I finally got the answer from Ziggy Crueltyfree Zeitgeister on the Database Administrators copy of my question.
He suggested to break this down into multiple steps using a temporary table to store the results:
CREATE TABLE sorting (sid numeric(10,10), rn int);
INSERT INTO sorting (sid, rn)
SELECT SortID, RecordNumber FROM Beleg
WHERE Year ( Valuta ) = 2016
AND Ursprungskonto = 1210
ORDER BY SortID;
UPDATE Beleg SET SortID = (SELECT rn FROM sorting WHERE sid=Beleg.SortID)
WHERE Year ( Valuta ) = 2016
AND Ursprungskonto = 1210;
DROP TABLE sorting;
Of course! I just keep the table definition in FileMaker (letting FileMaker do the type coercion this way), and fill and delete from it with my function: RenumberSortID ().