Randomly choosing a row - SQL

I have a table s1, which has 3 rows. How can I randomly pick a row from s1 and insert its corresponding value into d1?
I don't want a hard-coded solution. Can ROWNUM be used together with dbms_random? Let's say I want 10 rows in d1.
An example would be appreciated.
Create table s1(
    val NUMBER(4)
);
INSERT into s1 (val) VALUES (30);
INSERT into s1 (val) VALUES (40);
INSERT into s1 (val) VALUES (50);
Create table d1(
    val NUMBER(4)
);

You can sort by a random value and select one row:
insert into d1 (val)
    select val
    from (select s1.*
          from s1
          order by dbms_random.value
         ) s1
    where rownum = 1;
In Oracle 12c and later, you don't need the subquery:
insert into d1 (val)
select val
from s1
order by dbms_random.value
fetch first 1 row only;
Note: This assumes that you really mean random and not arbitrary. A random row means that any row in the table has an equal chance of being chosen in any given invocation of the query.
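For readers who want to experiment without an Oracle instance, here is a rough SQLite analogue of the same "sort by a random value" pattern (table names s1/d1 as in the question; SQLite's random() stands in for dbms_random.value, and each of the 10 picks is an independent draw):

```python
import sqlite3

# Recreate the question's tables in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE s1 (val INTEGER);
    INSERT INTO s1 (val) VALUES (30), (40), (50);
    CREATE TABLE d1 (val INTEGER);
""")

# Insert 10 randomly chosen rows into d1, one pick per statement
# so every inserted row is an independent random draw from s1.
for _ in range(10):
    conn.execute("""
        INSERT INTO d1 (val)
        SELECT val FROM s1 ORDER BY random() LIMIT 1
    """)

rows = [r[0] for r in conn.execute("SELECT val FROM d1")]
print(len(rows), set(rows) <= {30, 40, 50})
```

Running the loop once per desired row, rather than a single `LIMIT 10`, is what makes repeats possible when the source table is smaller than the target row count.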

For huge tables, the standard approach of sorting by dbms_random.value is not efficient, because it has to scan the whole table, and dbms_random.value is a fairly slow function that requires context switches. For such cases, there are 2 well-known methods:
Use sample clause:
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/SELECT.html#GUID-CFA006CA-6FF1-4972-821E-6996142A51C6
for example:
select *
from s1 sample block(1)
order by dbms_random.value
fetch first 1 rows only
i.e. get 1% of all blocks, then sort those rows randomly and return just 1 row.
If you have an index/primary key on a column with a normal distribution, you can get the min and max values, generate a random value in that range, and fetch the first row with a value greater than or equal to that randomly generated value.
Example:
-- big table with 1 million rows, with primary key on ID with normal distribution:
Create table s1(id primary key, padding) as
    select level, rpad('x', 100, 'x')
    from dual
    connect by level <= 1e6;
select *
from s1
where id>=(select
dbms_random.value(
(select min(id) from s1),
(select max(id) from s1)
)
from dual)
order by id
fetch first 1 rows only;
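The min/max probe can be sketched outside Oracle as well. Here is an illustrative SQLite version, assuming a densely packed integer primary key (1..1000 here) and doing the random draw client-side; the names and row count are invented for the demo:

```python
import random
import sqlite3

# Build a table with a densely packed integer primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s1 (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO s1 (id) VALUES (?)",
                 [(i,) for i in range(1, 1001)])

# Probe: pick a random value between MIN(id) and MAX(id),
# then fetch the first id at or above it (an index range scan
# in a real database, rather than a full sort).
lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM s1").fetchone()
probe = random.uniform(lo, hi)
(picked,) = conn.execute(
    "SELECT id FROM s1 WHERE id >= ? ORDER BY id LIMIT 1", (probe,)
).fetchone()
print(picked)
```

Note the caveat from the answer above: this is only close to uniform when the key values are evenly distributed; gaps in the key make the rows after each gap more likely to be chosen.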
Update
And a third variant: get a random table block, generate a rowid, and fetch the row from the table by that rowid:
select *
from s1
where rowid = (
select
DBMS_ROWID.ROWID_CREATE (
1,
objd,
file#,
block#,
1)
from
(
select /*+ rule */ file#, block#, objd
from v$bh b
where b.objd in (select o.data_object_id from user_objects o where object_name='S1' /* table_name */)
order by dbms_random.value
fetch first 1 rows only
)
);

Related

How to add a specific number of empty rows in SQLite?

I have a SQLite file and I want to add 2550 empty (NULL) rows.
I am able to add one empty row with this code
INSERT INTO my_table DEFAULT VALUES
But I need 2550 rows. Is there any shortcut for it? I don't want to execute the same code 2550 times.
If your version of SQLite supports it, you could use a recursive CTE to generate a series from 1 to 2550, and then insert "empty" records along that sequence:
WITH RECURSIVE generate_series(value) AS (
SELECT 1
UNION ALL
SELECT value + 1
FROM generate_series
WHERE value + 1 <= 2550
)
INSERT INTO yourTable (col1, col2, ...)
SELECT NULL, NULL, ...
FROM generate_series;
It is not clear which values, if any, you want to specify for the actual insert. If you omit a column from the insert, SQLite will assign NULL or whatever default value is defined for that column.
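The recursive-CTE approach above can be tried end to end with Python's built-in SQLite bindings. This sketch uses a throwaway two-column table and generates 5 rows instead of 2550, just to keep the output small (raise the bound in the WHERE clause for the real count):

```python
import sqlite3

# Throwaway table standing in for "yourTable" from the answer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1, col2)")

# Recursive CTE generating 1..5, one NULL row inserted per value.
conn.execute("""
    WITH RECURSIVE generate_series(value) AS (
        SELECT 1
        UNION ALL
        SELECT value + 1 FROM generate_series WHERE value + 1 <= 5
    )
    INSERT INTO my_table (col1, col2)
    SELECT NULL, NULL FROM generate_series
""")

(count,) = conn.execute("SELECT COUNT(*) FROM my_table").fetchone()
print(count)  # 5
```

SQLite allows a WITH clause directly in front of an INSERT statement, which is what makes this a single round trip rather than 2550 separate inserts.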
If your table is empty, then use a recursive CTE to get 2550 rows each consisting of the integers 1 to 2550 and use them to insert 2550 rows:
WITH cte AS (
SELECT 1 nr
UNION ALL
SELECT nr + 1
FROM cte
WHERE nr < 2550
)
INSERT INTO my_table(rowid)
SELECT nr FROM cte
This way, you use the column rowid where the integer values of the CTE will be stored and there is no need to enumerate all the columns of your table in the INSERT statement. These columns will get their default values.
If your table is not empty you can do it in a similar way by starting the integer numbers from the max rowid value in the table +1:
WITH cte AS (
SELECT MAX(rowid) + 1 nr FROM my_table
UNION ALL
SELECT nr + 1
FROM cte
WHERE nr < (SELECT MAX(rowid) + 2550 FROM my_table)
)
INSERT INTO my_table(rowid)
SELECT nr FROM cte
But since you also tagged android-sqlite you can use a for loop:
for (int i = 1; i <= 2550; i++) {
db.execSQL("INSERT INTO my_table DEFAULT VALUES");
}
where db is a valid non null instance of SQLiteDatabase.
You can generate numbers using a recursive CTE and then insert . . . but you need to be more explicit about the values being inserted:
with cte as (
select 1 as n
union all
select n + 1
from cte
where n < 2550
)
insert into mytable (<something>)
select <something>
from cte;
I think you need to specify the value for at least one column in SQLite.

SQL clone/replicate records within same table with a condition

I have a table and I would like to replicate/clone records within the same table, but with a condition. The condition involves a numeric column called recordcount. For example, if row 1 has a recordcount value of 7, I would like row 1 to be replicated 7 times; if row 2 has a value of 9, I would like row 2 to be replicated 9 times.
Any help is appreciated. Thank you.
What you can do (and I'm pretty sure it's not a best practice) is to keep a table of just numbers, containing each number as many times as its value.
Join that with your table, and project your table's columns only.
Example:
create table nums(x int);
insert into nums select 1;
insert into nums select 2;
insert into nums select 2;
insert into nums select 3;
insert into nums select 3;
insert into nums select 3;
create table t (txt varchar(10) , recordcount int);
insert into t select 'A',1;
insert into t select 'B',2;
insert into t select 'C',3;
select t.*
from t
inner join nums
on t.recordcount = nums.x
order by 1
;
Will project:
"A",1
"B",2
"B",2
"C",3
"C",3
"C",3
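The numbers-table trick can be verified quickly in SQLite; this sketch reproduces the example data from the answer above and checks the fan-out:

```python
import sqlite3

# nums holds each number as many times as its value, so the join
# fans each row of t out recordcount times.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nums (x INTEGER);
    INSERT INTO nums (x) VALUES (1), (2), (2), (3), (3), (3);
    CREATE TABLE t (txt TEXT, recordcount INTEGER);
    INSERT INTO t (txt, recordcount) VALUES ('A', 1), ('B', 2), ('C', 3);
""")

rows = conn.execute("""
    SELECT t.txt FROM t
    JOIN nums ON t.recordcount = nums.x
    ORDER BY 1
""").fetchall()
print([r[0] for r in rows])  # ['A', 'B', 'B', 'C', 'C', 'C']
```

The join on equality works because nums contains n copies of each number n; a cleaner variant keeps one copy of each number and joins on `nums.x <= t.recordcount` instead.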

Oracle SQL - Efficient Join N Rows Per Match

The basic idea is to join two tables, let's call them MYTABLE1 and MYTABLE2 on a field JOINID. There will be lots of matches per JOINID (one row from MYTABLE1 corresponding to many rows in MYTABLE2, and for the purposes of testing, MYTABLE1 has 50 rows), but we want to only select up to N matches per JOINID value. I have seen lots of inefficient solutions, for example:
select t1.*, t2.*
from MYTABLE1 t1 inner join
(select MYTABLE2.*,
row_number() over (partition by MYTABLE2.JOINKEY order by 1) as seqnum
from MYTABLE2) t2
on t1.joinkey = t2.joinkey and seqnum <= 2;
which takes over 5 minutes for me to run and returns less than 100 results, whereas something like
select t1.*, t2.*
from MYTABLE1 t1 inner join MYTABLE2 t2 on t1.JOINKEY = t2.JOINKEY
where rownum <= 100;
returns 100 results in ~60 milliseconds.
(To be sure of the validity of these results, I selected a different test table and performed the second query above on a specific single JOINKEY until I got a result set with less than 100 results, meaning it did in fact search through all of MYTABLE2. The total query time was 30 milliseconds. Afterwards, I started the original query, but this time getting 50 joins per row of MYTABLE1, which again took over 5 minutes to complete.)
How can I approach this in a not-so-terribly-inefficient manner?
It seems so simple: all we need to do is go through the rows of MYTABLE1, matching the JOINKEY field to rows of MYTABLE2, and move on to the next row of MYTABLE1 once we have matched the desired number for that row.
In the worst-case scenario for my second example, we would have to spend 30ms searching the full TABLE2 per row of TABLE1, of which there are 50 rows, for a total execution time of 1.5 seconds.
I wouldn't call the below approach efficient by any means and it cheats a little and has some clunkiness, but it comes in under the 1500ms limit you provided so I'll add as something to consider.
This example cheats in that it compiles a TYPE, so it can TABLE() the result of an anonymous function.
This approach just iteratively probes MYTABLE2 with each JOINKEY from MYTABLE1 using an anonymous subquery-factoring-clause function and accumulates the results as it goes.
I don't know the real structure of the tables involved, so this example pretends MYTABLE2 has one additional CHAR attribute called OTHER_DATA that is the target of the SELECT.
First, setup the test tables:
CREATE TABLE MYTABLE1 (
JOINKEY NUMBER NOT NULL
);
CREATE TABLE MYTABLE2 (
JOINKEY NUMBER NOT NULL,
OTHER_DATA CHAR(1) NOT NULL
);
CREATE INDEX MYTABLE2_I
ON MYTABLE2 (JOINKEY);
Then add the test data. 50 rows to MYTABLE1 and 100M rows to MYTABLE2:
INSERT INTO MYTABLE1
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL < 51;
BEGIN
<<COMMIT_LOOP>>
FOR OUTER_POINTER IN 1..4000 LOOP
<<DATA_LOOP>>
FOR POINTER IN 1..10 LOOP
INSERT INTO MYTABLE2
SELECT
JOINKEY, OTHER_DATA
FROM
(SELECT LEVEL AS JOINKEY FROM DUAL CONNECT BY LEVEL < 51)
CROSS JOIN
(SELECT CHR(64 + LEVEL) AS OTHER_DATA FROM DUAL CONNECT BY LEVEL < 51);
END LOOP DATA_LOOP;
COMMIT;
END LOOP COMMIT_LOOP;
END;
/
Then gather stats...
Verify the table counts:
SELECT COUNT(*) FROM MYTABLE1;
50
SELECT COUNT(*) FROM MYTABLE2;
100000000
Then create a TYPE that includes the desired data:
CREATE OR REPLACE TYPE JOINKEY_OTHER_DATA IS OBJECT (JOINKEY1 NUMBER, OTHER_DATA CHAR(1));
/
CREATE OR REPLACE TYPE JOINKEY_OTHER_DATA_LIST IS TABLE OF JOINKEY_OTHER_DATA;
/
And then run a query that uses an anonymous subquery-factoring-block function that imposes a rowcount per JOINKEY to be returned. In this first example, it fetches two MYTABLE2 rows per JOINKEY:
SELECT SYSTIMESTAMP FROM DUAL;
WITH FUNCTION FETCH_N_ROWS
(P_MATCHES_LIMIT IN NUMBER)
RETURN JOINKEY_OTHER_DATA_LIST
AS
V_JOINKEY_OTHER_DATAS JOINKEY_OTHER_DATA_LIST;
BEGIN
V_JOINKEY_OTHER_DATAS := JOINKEY_OTHER_DATA_LIST();
FOR JOINKEY_POINTER IN (SELECT MYTABLE1.JOINKEY
FROM MYTABLE1)
LOOP
DECLARE
V_MYTABLE2_JOINKEYS JOINKEY_OTHER_DATA_LIST;
BEGIN
SELECT JOINKEY_OTHER_DATA(MYTABLE2.JOINKEY, MYTABLE2.OTHER_DATA)
BULK COLLECT INTO V_MYTABLE2_JOINKEYS
FROM MYTABLE2 WHERE MYTABLE2.JOINKEY = JOINKEY_POINTER.JOINKEY
FETCH FIRST P_MATCHES_LIMIT ROWS ONLY;
V_JOINKEY_OTHER_DATAS := V_JOINKEY_OTHER_DATAS MULTISET UNION ALL V_MYTABLE2_JOINKEYS;
END;
END LOOP;
RETURN V_JOINKEY_OTHER_DATAS;
END;
SELECT *
FROM TABLE (FETCH_N_ROWS(2));
/
SELECT SYSTIMESTAMP FROM DUAL;
Results in:
SYSTIMESTAMP
18-APR-17 03.32.10.623056000 PM -06:00
JOINKEY1 OTHER_DATA
1 A
1 B
2 A
2 B
3 A
3 B
...
49 A
49 B
50 A
50 B
100 rows selected.
SYSTIMESTAMP
18-APR-17 03.32.11.014554000 PM -06:00
By changing the number passed to FETCH_N_ROWS, different data volumes can be fetched with fairly consistent performance.
...
SELECT * FROM TABLE (FETCH_N_ROWS(13));
Returns:
...
50 K
50 L
50 M
650 rows selected.
You cannot compare the two queries. The second query simply returns whatever rows come first. The first has to go through all the data before returning anything. A more apt comparison would use:
select . . .
from (select t1.*, t2.*
from MYTABLE1 t1 inner join
MYTABLE2 t2
on t1.JOINKEY = t2.JOINKEY
order by t1.JOINKEY
) t1
where rownum <= 100;
This has to read all the data before returning anything, so it is more analogous to the row_number() version.
But, start with this query:
select t1.*, t2.*
from MYTABLE1 t1 inner join
(select t2.*,
row_number() over (partition by t2.JOINKEY order by 1) as seqnum
from MYTABLE2 t2
) t2
on t1.joinkey = t2.joinkey and seqnum <= 2;
For this query, you want an index on MYTABLE2(JOINKEY). If the ORDER BY has another key, that should be in the query as well.
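As an aside, the row_number() pattern itself is portable across databases. Here is a small SQLite sketch (window functions require SQLite 3.25+) with made-up tables t1/t2 that keeps at most two t2 rows per key:

```python
import sqlite3

# Two made-up tables: t1 holds the keys, t2 holds many rows per key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (joinkey INTEGER);
    INSERT INTO t1 (joinkey) VALUES (1), (2);
    CREATE TABLE t2 (joinkey INTEGER, other_data TEXT);
    INSERT INTO t2 (joinkey, other_data)
    VALUES (1,'a'), (1,'b'), (1,'c'), (2,'d'), (2,'e'), (2,'f');
""")

# Number t2's rows within each joinkey, then keep only seqnum <= 2.
rows = conn.execute("""
    SELECT t1.joinkey, t2.other_data
    FROM t1
    JOIN (SELECT t2.*,
                 ROW_NUMBER() OVER (PARTITION BY joinkey
                                    ORDER BY other_data) AS seqnum
          FROM t2) t2
      ON t1.joinkey = t2.joinkey AND seqnum <= 2
    ORDER BY 1, 2
""").fetchall()
print(rows)  # [(1, 'a'), (1, 'b'), (2, 'd'), (2, 'e')]
```

The performance characteristics differ from Oracle's, of course; the point is only to make the top-N-per-key shape of the query easy to experiment with.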

How to insert columns with adding sequence column?

In an Oracle database (11gR2), I have a table my_table with columns (sequence, col1, col2, col3). I want to insert values into the table that are queried from other tables, i.e. insert into my_table select <query from other tables>. The problem is that the primary key is the four columns, hence I need to add a sequence starting from 0 up to the count of the rows to be inserted (order is not a problem).
I tried using a loop like this:
DECLARE
j NUMBER;
r_count number;
BEGIN
select count(1) into r_count from <my query to be inserted>;
FOR j IN 0 .. r_count
LOOP
INSERT INTO my_table
select <my query, incorporating r_count as sequence column> ;
END LOOP;
END;
But it didn't work: it actually looped r_count times, trying to insert the entire row set every time, as logically it would. How can I achieve the expected goal and insert rows while adding a sequence column?
Don't do this in a loop. Just use row_number():
INSERT INTO my_table(seq, . . .)
select row_number() over (order by NULL) - 1, . . .
from . . .;
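A small runnable sketch of this row_number() numbering, using SQLite and invented tables src/my_table (the demo orders by a real column rather than NULL so the result is deterministic):

```python
import sqlite3

# src stands in for "my query"; my_table gets a 0-based seq column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (a TEXT);
    INSERT INTO src (a) VALUES ('x'), ('y'), ('z');
    CREATE TABLE my_table (seq INTEGER, a TEXT);
""")

# row_number() starts at 1, so subtract 1 to number rows from 0.
conn.execute("""
    INSERT INTO my_table (seq, a)
    SELECT ROW_NUMBER() OVER (ORDER BY a) - 1, a FROM src
""")

result = conn.execute("SELECT seq, a FROM my_table ORDER BY seq").fetchall()
print(result)  # [(0, 'x'), (1, 'y'), (2, 'z')]
```

Unlike the Oracle-sequence answers below, this restarts numbering at 0 on every insert, which matches the question's requirement of a per-load sequence rather than a globally unique id.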
Let's create table with sample data (to simulate your source of data)
-- This is your source query table (can be anything)
CREATE TABLE source_table
(
source_a VARCHAR(255),
source_b VARCHAR(255),
source_c VARCHAR(255)
);
insert into source_table (source_a, source_b, source_c) values ('A', 'B', 'C');
insert into source_table (source_a, source_b, source_c) values ('D', 'E', 'F');
insert into source_table (source_a, source_b, source_c) values ('G', 'H', 'I');
Then create target table, with id and 3 data columns.
-- This is your target_table
CREATE TABLE target_table
(
id NUMBER(9,0),
target_a VARCHAR2(255),
target_b VARCHAR2(255),
target_c VARCHAR2(255)
);
-- This is sequence used to ensure unique number in 1st column
CREATE sequence target_table_id_seq start with 0 minvalue 0 increment BY 1;
Finally, perform insert, loading id from sequence, rest of the data from source table.
INSERT INTO target_table
SELECT target_table_id_seq.nextval,
source_a,
source_b,
source_c
FROM source_table;
Results might look like (the sequence starts with 0, so the first id is 0)
0 A B C
1 D E F
2 G H I
If you added some values later, they will continue with numbering 3, 4, 5 etc. Or do you want to get the order only inside each group? Thus if you added 2 more rows JKL and MNO, the target table would look like this
1 A B C
2 D E F
3 G H I
1 J K L
2 M N O
For that you need different solution (don't even need sequencer)
SELECT
RANK() OVER (ORDER BY source_a, source_b, source_c),
source_a,
source_b,
source_c
FROM source_table;
Technically you could use ROWNUM directly, but I opted for the RANK() OVER analytic function for a consistent result. Please note that this will violate your composite primary key if you try to insert the same rows twice (my first solution doesn't).
Clearly, you should use an Oracle sequence.
First, create a sequence:
create sequence seq_my_table start with 0 minvalue 0 increment by 1;
Then use it:
INSERT INTO my_table (sequence, ...)
select seq_my_table.nextval, <the rest of my query>;
Sequence numbers will be inserted in succession.
So, you already have the table, it has the required number of rows, and now you want to add numbers from 0 to total number of rows minus one in the column named sequence? (perhaps not "sequence" but something less likely to clash with Oracle reserved words?)
Then this should work:
update my_table set seq = rownum - 1;

How can I select multiple copies of the same row?

I have a table in MS Access with rows which have a column called "repeat"
I want to SELECT all the rows, duplicated by their "repeat" column value.
For example, if repeat is 4, then I should return 4 rows of the same values. If repeat is 1, then I should return only one row.
This is very similar to this answer:
https://stackoverflow.com/a/6608143
Except I need a solution for MS Access.
First create a "Numbers" table and fill it with numbers from 1 to 1000 (or up to whatever value the "Repeat" column can have):
CREATE TABLE Numbers
( i INT NOT NULL PRIMARY KEY
) ;
INSERT INTO Numbers
(i)
VALUES
(1), (2), ..., (1000) ;
then you can use this:
SELECT t.*
FROM TableX AS t
JOIN
Numbers AS n
ON n.i <= t.repeat ;
If repeat has only small values you can try:
select id, col1 from table where repeat > 0
union all
select id, col1 from table where repeat > 1
union all
select id, col1 from table where repeat > 2
union all
select id, col1 from table where repeat > 3
union all ....
What you can do is retrieve the single 'unique' row and then, in application code, make however many copies of it you need using a for loop.