Random record selection - VMS / RDB - sql

I have been asked for a SQL statement to retrieve an arbitrary number of rows, randomly selected, from a data table in a database hosted on Oracle RDB on VMS.
In MS SQL, it would simply be:
SELECT TOP 5 *
FROM MyTable
ORDER BY NEWID()
But I cannot find an equivalent method for RDB/VMS.
"Proper" Oracle would be:
ORDER BY dbms_random.VALUE
However, that does not appear to be supoprted in RDB on VMS.
Any insights would be greatly appreciated.

SELECT *
FROM MyTable
ORDER BY NEWID()
LIMIT to 5

I don't know the "RDB way" to do it, but if the records have sequential IDs, find out the hightest-number ID and then by scripting generate a random list of IDs to fetch. Something like:
SELECT TOP 1 FROM mytable ORDER BY id DESC;
Then in Python: records = random.sample(range(topId), 5)

Related

SQL Server - Pagination Without Order By Clause

My situation is that a SQL statement which is not predictable, is given to the program and I need to do pagination on top of it. The final SQL statement would be similar to the following one:
SELECT * FROM (*Given SQL Statement*) b
OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY;
The problem here is that the *Given SQL Statement* is unpredictable. It may or may not contain order by clause. I am not able to change the query result of this SQL Statement and I need to do pagination on it.
I searched for solution on the Internet, but all of them suggested to use an arbitrary column, like primary key, in order by clause. But it will change the original order.
The short answer is that it can't be done, or at least can't be done properly.
The problem is that SQL Server (or any RDBMS) does not and can not guarantee the order of the records returned from a query without an order by clause.
This means that you can't use paging on such queries.
Further more, if you use an order by clause on a column that appears multiple times in your resultset, the order of the result set is still not guaranteed inside groups of values in said column - quick example:
;WITH cte (a, b)
AS
(
SELECT 1, 'a'
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 2, 'a'
UNION ALL
SELECT 2, 'b'
)
SELECT *
FROM cte
ORDER BY a
Both result sets are valid, and you can't know in advance what will you get:
a b
-----
1 b
1 a
2 b
2 a
a b
-----
1 a
1 b
2 a
2 b
(and of course, you might get other sorts)
The problem here is that the *Given SQL Statement" is unpredictable. It may or may not contain order by clause.
your inner query(unpredictable sql statement) should not contain order by,even if it contains,order is not guaranteed.
To get guaranteed order,you have to order by some column.for the results to be deterministic,the ordered column/columns should be unique
Please note: what I'm about to suggest is probably horribly inefficient and should really only be used to help you go back to the project leader and tell them that pagination of an unordered query should not be done. Having said that...
From your comments you say you are able to change the SQL statement before it is executed.
You could write the results of the original query to a temporary table, adding row count field to be used for subsequent pagination ordering.
Therefore any original ordering is preserved and you can now paginate.
But of course the reason for needing pagination in the first place is to avoid sending large amounts of data to the client application. Although this does prevent that, you will still be copying data to a temp table which, depending on the row size and count, could be very slow.
You also have the problem that the page size is coming from the client as part of the SQL statement. Parsing the statement to pick that out could be tricky.
As other notified using anyway without using a sorted query will not be safe, But as you know about it and search about it, I can suggest using a query like this (But not recommended as a good way)
;with cte as (
select *,
row_number() over (order by (select 0)) rn
from (
-- Your query
) t
)
select *
from cte
where rn between (#pageNumber-1)*#pageSize+1 and #pageNumber*#pageSize
[SQL Fiddle Demo]
I finally found a simple way to do it without any order by on a specific column:
declare #start AS INTEGER = 1, #count AS INTEGER = 5;
select * from (SELECT *,ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS fakeCounter
FROM (select * from mytable) AS t) AS t2 order by fakeCounter OFFSET #start ROWS
FETCH NEXT #count ROWS ONLY
where select * from mytable can be any query

SQL for getting each category data in maria db

I need to fetch 4 random values from each category. What should be the correct sql syntax for maria db. I have attached one image of table structure.
Please click here to check the structure
Should i write some procedure or i can do it with basic sql syntax?
You can do that with a SQL statement if you only have a few rows:
SELECT id, question, ... FROM x1 ORDER BY rand() LIMIT 1
This works fine if you have only a few rows - as soon as you have thousands of rows the overhead for sorting the rows becomes important, you have to sort all rows for getting only one row.
A trickier but better solution would be:
SELECT id, question from x1 JOIN (SELECT CEIL(RAND() * (SELECT(MAX(id)) FROM x1)) AS id) as id using(id);
Running EXPLAIN on both SELECTS will show you the difference...
If you need random value for different categories combine the selects via union and add a where clause
http://mysql.rjweb.org/doc.php/groupwise_max#top_n_in_each_group
But then ORDER BY category, RAND(). (Your category is the blog's province.)
Notice how it uses #variables to do the counting.
If you have MariaDB 10.2, then use one of its Windowing functions.
SELECT column FROM table WHERE category_id = XXX
ORDER BY RAND()
LIMIT 4
do it for all categories

Reverse initial order of SELECT statement

I want to run a SQL query in Postgres that is exactly the reverse of the one that you'd get by just running the initial query without an order by clause.
So if your query was:
SELECT * FROM users
Then
SELECT * FROM users ORDER BY <something here to make it exactly the reverse of before>
Would it just be this?
ORDER BY Desc
You are building on the incorrect assumption that you would get rows in a deterministic order with:
SELECT * FROM users;
What you get is really arbitrary. Postgres returns rows in any way it sees fit. For simple queries typically in order of their physical storage, which typically is the order in which rows were entered. But there are no guarantees, and the order may change any time between two calls. For instance after any UPDATE (writing a new physical row version), or when any background process reorders rows - like VACUUM. Or a more complex query might return rows according to an index or a join. Long story short: there is no reliable order for table rows in a relational database unless you specify it with ORDER BY.
That said, assuming you get rows from the above simple query in the order of physical storage, this would get you the reverse order:
SELECT * FROM users
ORDER BY ctid DESC;
ctid is the internal tuple ID signifying physical order. Related:
In-order sequence generation
How list all tables with data changes in the last 24 hours?
here is a tsql solution, thid might give you an idea how to do it in postgres
select * from (
SELECT *, row_number() over( order by (select 1)) rowid
FROM users
) x
order by rowid desc

How to retrieve the last 2 records from table?

I have a table with n number of records
How can i retrieve the nth record and (n-1)th record from my table in SQL without using derived table ?
I have tried using ROWID as
select * from table where rowid in (select max(rowid) from table);
It is giving the nth record but i want the (n-1)th record also .
And is there any other method other than using max,derived table and pseudo columns
Thanks
You cannot depend on rowid to get you to the last row in the table. You need an auto-incrementing id or creation time to have the proper ordering.
You can use, for instance:
select *
from (select t.*, row_number() over (order by <id> desc) as seqnum
from t
) t
where seqnum <= 2
Although allowed in the syntax, the order by clause in a subquery is ignored (for instance http://docs.oracle.com/javadb/10.8.2.2/ref/rrefsqlj13658.html).
Just to be clear, rowids have nothing to do with the ordering of rows in a table. The Oracle documentation is quite clear that they specify a physical access path for the data (http://docs.oracle.com/cd/B28359_01/server.111/b28318/datatype.htm#i6732). It is true that in an empty database, inserting records into a newtable will probably create a monotonically increasing sequence of row ids. But you cannot depend on this. The only guarantees with rowids are that they are unique within a table and are the fastest way to access a particular row.
I have to admit that I cannot find good documentation on Oracle handling or not handling order by's in subqueries in its most recent versions. ANSI SQL does not require compliant databases to support order by in subqueries. Oracle syntax allows it, and it seems to work in some cases, at least. My best guess is that it would probably work on a single processor, single threaded instance of Oracle, or if the data access is through an index. Once parallelism is introduced, the results would probably not be ordered. Since I started using Oracle (in the mid-1990s), I have been under the impression that order bys in subqueries are generally ignored. My advice would be to not depend on the functionality, until Oracle clearly states that it is supported.
select * from (select * from my_table order by rowid) where rownum <= 2
and for rows between N and M:
select * from (
select * from (
select * from my_table order by rowid
) where rownum <= M
) where rownum >= N
Try this
select top 2 * from table order by rowid desc
Assuming rowid as column in your table:
SELECT * FROM table ORDER BY rowid DESC LIMIT 2

How to read the last row with SQL Server

What is the most efficient way to read the last row with SQL Server?
The table is indexed on a unique key -- the "bottom" key values represent the last row.
If you're using MS SQL, you can try:
SELECT TOP 1 * FROM table_Name ORDER BY unique_column DESC
select whatever,columns,you,want from mytable
where mykey=(select max(mykey) from mytable);
You'll need some sort of uniquely identifying column in your table, like an auto-filling primary key or a datetime column (preferably the primary key). Then you can do this:
SELECT * FROM table_name ORDER BY unique_column DESC LIMIT 1
The ORDER BY column tells it to rearange the results according to that column's data, and the DESC tells it to reverse the results (thus putting the last one first). After that, the LIMIT 1 tells it to only pass back one row.
If some of your id are in order, i am assuming there will be some order in your db
SELECT * FROM TABLE WHERE ID = (SELECT MAX(ID) FROM TABLE)
I think below query will work for SQL Server with maximum performance without any sortable column
SELECT * FROM table
WHERE ID not in (SELECT TOP (SELECT COUNT(1)-1
FROM table)
ID
FROM table)
Hope you have understood it... :)
I tried using last in sql query in SQl server 2008 but it gives this err:
" 'last' is not a recognized built-in function name."
So I ended up using :
select max(WorkflowStateStatusId) from WorkflowStateStatus
to get the Id of the last row.
One could also use
Declare #i int
set #i=1
select WorkflowStateStatusId from Workflow.WorkflowStateStatus
where WorkflowStateStatusId not in (select top (
(select count(*) from Workflow.WorkflowStateStatus) - #i ) WorkflowStateStatusId from .WorkflowStateStatus)
You can use last_value: SELECT LAST_VALUE(column) OVER (PARTITION BY column ORDER BY column)...
I test it at one of my databases and it worked as expected.
You can also check de documentation here: https://msdn.microsoft.com/en-us/library/hh231517.aspx
OFFSET and FETCH NEXT are a feature of SQL Server 2012 to achieve SQL paging while displaying results.
The OFFSET argument is used to decide the starting row to return rows from a result and FETCH argument is used to return a set of number of rows.
SELECT *
FROM table_name
ORDER BY unique_column desc
OFFSET 0 Row
FETCH NEXT 1 ROW ONLY
SELECT TOP 1 id from comission_fees ORDER BY id DESC
In order to retrieve the last row of a table for MS SQL database 2005, You can use the following query:
select top 1 column_name from table_name order by column_name desc;
Note: To get the first row of the table for MS SQL database 2005, You can use the following query:
select top 1 column_name from table_name;
If you don't have any ordered column, you can use the physical id of each lines:
SELECT top 1 sys.fn_PhysLocFormatter(%%physloc%%) AS [File:Page:Slot],
T.*
FROM MyTable As T
order by sys.fn_PhysLocFormatter(%%physloc%%) DESC
SELECT * from Employees where [Employee ID] = ALL (SELECT MAX([Employee ID]) from Employees)
This is how you get the last record and update a field in Access DB.
UPDATE compalints SET tkt = addzone &'-'& customer_code &'-'& sn where sn in (select max(sn) from compalints )
If you have a Replicated table, you can have an Identity=1000 in localDatabase and Identity=2000 in the clientDatabase, so if you catch the last ID you may find always the last from client, not the last from the current connected database.
So the best method which returns the last connected database is:
SELECT IDENT_CURRENT('tablename')
Well I'm not getting the "last value" in a table, I'm getting the Last value per financial instrument. It's not the same but I guess it is relevant for some that are looking to look up on "how it is done now". I also used RowNumber() and CTE's and before that to simply take 1 and order by [column] desc. however we nolonger need to...
I am using SQL server 2017, we are recording all ticks on all exchanges globally, we have ~12 billion ticks a day, we store each Bid, ask, and trade including the volumes and the attributes of a tick (bid, ask, trade) of any of the given exchanges.
We have 253 types of ticks data for any given contract (mostly statistics) in that table, the last traded price is tick type=4 so, when we need to get the "last" of Price we use :
select distinct T.contractId,
LAST_VALUE(t.Price)over(partition by t.ContractId order by created ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
from [dbo].[Tick] as T
where T.TickType=4
You can see the execution plan on my dev system it executes quite efficient, executes in 4 sec while the exchange import ETL is pumping data into the table, there will be some locking slowing me down... that's just how live systems work.
It is very simple
select top 10 * from TableName order by 1 desc
SELECT * FROM TABLE WHERE ID = (SELECT MAX(ID) FROM TABLE)
I am pretty sure that it is:
SELECT last(column_name) FROM table
Becaause I use something similar:
SELECT last(id) FROM Status