Random records for a row - sql

I have the following table, strvals, with data. I'm looking to randomly choose 2 rows from the strvals table and populate table s1 in a loop.
I want something like this
Create table s1(id primary key,strval1, strval2) as
select level,random_rec(strvals), random_rec(strvals)
from dual
connect by level<=10;
The caveat is that strval1 has to be different from strval2 in each row.
Valid output
1, 'AAAA', 'BBBB'
2, 'CCCC', 'BBBB'
3, 'CCCC', 'AAAA'
Not valid
1, 'AAAA', 'AAAA'
Create table strvals(
strval varchar2(4),
constraint pk_strval primary key (strval)
);
insert into strvals
values(
'AAAA'
);
insert into strvals
values(
'BBBB'
);
insert into strvals
values(
'CCCC'
);

Getting a random string from a table can be tricky. One method is to use correlated subqueries -- the correlation clause ensures that the subquery is not "optimized" to run only once.
So, here is one method:
select id, strval,
(select s2.strval
from strvals s2
where s2.strval <> x.strval and x.id > 0
order by dbms_random.random fetch first 1 row only
) as strval2
from (select id,
(select strval
from strvals
where x.id > 0
order by dbms_random.random fetch first 1 row only
) as strval
from (select level as id
from dual
connect by level < 25
) x
) x;
And here is a db<>fiddle.

To reduce the number of slow dbms_random.value calls when there are many rows, I would suggest this technique:
with
rand as (select row_number()over(order by dbms_random.value()) n,strval from strvals)
,cnt as (select count(*) m from rand)
,generator as (select level id, ceil(dbms_random.value()*cnt.m) rnd from cnt connect by level<=10)
select generator.id, rand.strval
from generator
join rand on rand.n=generator.rnd;
As you can see, I sort the table strvals once by dbms_random.value (the rand view), so each row gets its own random integer n. Then I count the rows to get the maximum n (cnt.m). Finally, I compute a random number between 1 and m for each generated row, so the rand view can be joined with a hash join.
Update: for 2 strvals: DBFiddle
with
rand as (select row_number()over(order by dbms_random.value()) n,strval from strvals)
,cnt as (select count(*) m from rand)
,generator as (
select
level id,
ceil(dbms_random.value()*cnt.m) rnd1,
ceil(dbms_random.value()*cnt.m) rnd2
from cnt connect by level<=10
)
select
generator.id,
rnd1.strval,
rnd2.strval
from generator
join rand rnd1 on rnd1.n=generator.rnd1
join rand rnd2 on rnd2.n=generator.rnd2;
Update2: now with 2 different strvals per row: DBFiddle
with
rand as (
select
row_number()over(order by dbms_random.value()) n
,s1.strval strval1
,s2.strval strval2
from strvals s1
join strvals s2
on s1.strval!=s2.strval
)
,cnt as (select count(*) m from rand)
,generator as (
select
level id,
ceil(dbms_random.value()*cnt.m) rnd
from cnt connect by level<=10
)
select
generator.id,
rand.strval1,
rand.strval2
from generator
join rand on rand.n=generator.rnd
order by 1
;
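The same idea (enumerate every ordered pair of different values once, number the pairs, then draw one random pair number per generated row) can be tried outside Oracle. The sketch below is a port to SQLite driven from Python's sqlite3 module, which is an assumption and not the database used in the question. The pairs table, the gen CTE and the bit mask that keeps random() non-negative are illustrative names and choices, and the window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE strvals (strval TEXT PRIMARY KEY);
    INSERT INTO strvals VALUES ('AAAA'), ('BBBB'), ('CCCC');

    -- Every ordered pair of different values, numbered 1..m.
    -- Stored as a table so the numbering is computed only once.
    CREATE TABLE pairs AS
    SELECT row_number() OVER (ORDER BY s1.strval, s2.strval) AS n,
           s1.strval AS strval1,
           s2.strval AS strval2
    FROM strvals s1
    JOIN strvals s2 ON s1.strval <> s2.strval;
""")

rows = conn.execute("""
    WITH RECURSIVE gen(id, rnd) AS (
        -- random() is a signed 64-bit integer; the mask keeps it non-negative
        SELECT 1, (random() & 2147483647) % (SELECT count(*) FROM pairs) + 1
        UNION ALL
        SELECT id + 1,
               (random() & 2147483647) % (SELECT count(*) FROM pairs) + 1
        FROM gen
        WHERE id < 10
    )
    SELECT gen.id, pairs.strval1, pairs.strval2
    FROM gen
    JOIN pairs ON pairs.n = gen.rnd
    ORDER BY gen.id
""").fetchall()

for row in rows:
    print(row)
```

Because only pairs of different values are ever stored, no generated row can end up with strval1 equal to strval2, whatever random numbers are drawn.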

Related

How to get last record from Master-Details tables

I have a table that has 3 columns.
create table myTable
(
ID int Primary key,
Detail_ID int references myTable(ID) null, -- reference to self
Master_Value varchar(50) -- references to master table
)
This table has the following records:
insert into myTable select 100,null,'aaaa'
insert into myTable select 101,100,'aaaa'
insert into myTable select 102,101,'aaaa'
insert into myTable select 103,102,'aaaa' ---> last record
insert into myTable select 200,null,'bbbb'
insert into myTable select 201,200,'bbbb'
insert into myTable select 202,201,'bbbb' ---> last record
The records are linked to each other through the ID and Detail_ID columns.
I need to select the last record for each Master_Value. The expected output:
lastRecordID Master_Value Path
202 bbbb 200=>201=>202
103 aaaa 100=>101=>102=>103
Tips:
The records are not listed in order in the table.
I cannot use max(ID), because the data is not sorted (the ID column may be updated manually).
Attempts:
I was able to prepare the following query, and it works well:
with Q as
(
select ID ,Detail_ID, Master_Value , 1 RowOrder, CAST(id as varchar(max)) [Path] from myTable where Detail_ID is null
union all
select R.id,R.Detail_ID , r.Master_Value , (q.RowOrder + 1) RowOrder , (q.[Path]+'=>'+CAST(r.id as varchar(max))) [Path] from myTable R inner join Q ON Q.ID=R.Detail_ID --where r.Dom_ID_RowType=1010
)
select * into #q from Q
select Master_Value, MAX(RowOrder) lastRecord into #temp from #Q group by Master_Value
select
q.ID lastRecordID,
q.Master_Value,
q.[Path]
from #temp t
join #q q on q.RowOrder = t.lastRecord
where
q.Master_Value = t.Master_Value
But I need a simple (single select) and optimal method.
Can anyone help me?
One method uses a correlated subquery to get the last value (which is how I interpreted your question):
select t.*
from mytable t
where not exists (select 1
from mytable t2
where t2.master_value = t.master_value and
t2.detail_id = t.id
);
This returns rows that are not referred to by another row.
For the path, you need a recursive CTE:
with cte as (
select master_value, id as first_id, id as child_id, convert(varchar(max), id) as path, 1 as lev
from mytable t
where detail_id is null
union all
select cte.master_value, cte.first_id, t.id, concat(path, '=>', t.id), lev + 1
from cte join
mytable t
on t.detail_id = cte.child_id and t.master_value = cte.master_value
)
select cte.*
from (select cte.*, max(lev) over (partition by master_value) as max_lev
from cte
) cte
where max_lev = lev
Here is a db<>fiddle.
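For readers who want to check the logic against the sample rows, here is a minimal sketch that combines the two ideas (walk the chain recursively to build the path, then keep only rows that no other row refers to). It runs on SQLite through Python's sqlite3 module, which is an assumption: the question targets SQL Server, and the chain CTE name and the || concatenation are SQLite-flavoured stand-ins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (
        ID           INTEGER PRIMARY KEY,
        Detail_ID    INTEGER REFERENCES myTable(ID),  -- previous row in the chain
        Master_Value TEXT
    );
    INSERT INTO myTable VALUES
        (100, NULL, 'aaaa'), (101, 100, 'aaaa'),
        (102, 101, 'aaaa'), (103, 102, 'aaaa'),
        (200, NULL, 'bbbb'), (201, 200, 'bbbb'), (202, 201, 'bbbb');
""")

last_records = conn.execute("""
    WITH RECURSIVE chain(ID, Master_Value, Path) AS (
        -- anchor: rows that start a chain
        SELECT ID, Master_Value, CAST(ID AS TEXT)
        FROM myTable
        WHERE Detail_ID IS NULL
        UNION ALL
        -- step: append each row that points at the current end of the chain
        SELECT t.ID, t.Master_Value, chain.Path || '=>' || t.ID
        FROM myTable t
        JOIN chain ON t.Detail_ID = chain.ID
    )
    SELECT c.ID, c.Master_Value, c.Path
    FROM chain c
    -- keep only rows that no other row refers to, i.e. the last in each chain
    WHERE NOT EXISTS (SELECT 1 FROM myTable child WHERE child.Detail_ID = c.ID)
    ORDER BY c.Master_Value
""").fetchall()

for row in last_records:
    print(row)
```

The NOT EXISTS filter replaces the max(lev) window function, so the ordering of ID values never matters.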

How can I transform my N little queries into one query?

I have a query that gives me the first available value for a given date and pair.
SELECT
TOP 1 value
FROM
my_table
WHERE
date >= 'myinputdate'
AND key = 'myinputkey'
ORDER BY date
I have N pairs of keys and dates, and I am trying to find a way to avoid querying each pair one by one. Both the table and N are rather large, so the current approach is heavy and slow.
How can I query all the pairs in one query?
A solution is to use APPLY like a "function" created on the fly with one or many columns from another set:
DECLARE @inputs TABLE (
myinputdate DATE,
myinputkey INT)
INSERT INTO @inputs(
myinputdate,
myinputkey)
VALUES
('2019-06-05', 1),
('2019-06-01', 2)
SELECT
I.myinputdate,
I.myinputkey,
R.value
FROM
@inputs AS I
CROSS APPLY (
SELECT TOP 1
T.value
FROM
my_table AS T
WHERE
T.date >= I.myinputdate AND
T.key = I.myinputkey
ORDER BY
T.date ) AS R
You can use OUTER APPLY if you also want rows with NULL result values to be shown. This supports fetching multiple columns and using ORDER BY with TOP to control the number of rows.
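The per-pair lookup can be tried on a small data set as follows. This is only a sketch under assumptions: SQLite (used here through Python's sqlite3 module) has no APPLY, so a correlated scalar subquery with LIMIT 1 stands in for it and behaves like OUTER APPLY for a single column. The column names item_key, item_date and val, and all the sample rows, are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for my_table and the list of input pairs.
    CREATE TABLE my_table (item_key INTEGER, item_date TEXT, val INTEGER);
    INSERT INTO my_table VALUES
        (1, '2019-06-01', 10), (1, '2019-06-07', 11),
        (2, '2019-05-30', 20), (2, '2019-06-03', 21);

    CREATE TABLE inputs (myinputdate TEXT, myinputkey INTEGER);
    INSERT INTO inputs VALUES
        ('2019-06-05', 1), ('2019-06-01', 2), ('2019-07-01', 2);
""")

results = conn.execute("""
    SELECT i.myinputdate,
           i.myinputkey,
           (SELECT t.val
            FROM my_table t
            WHERE t.item_key = i.myinputkey
              AND t.item_date >= i.myinputdate  -- ISO dates compare correctly as text
            ORDER BY t.item_date
            LIMIT 1) AS val
    FROM inputs i
    ORDER BY i.myinputkey, i.myinputdate
""").fetchall()

for row in results:
    print(row)
```

The third input pair has no row on or after its date, so it comes back with a NULL value, which is what OUTER APPLY would show and CROSS APPLY would drop.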
This solution does not use variables. You control your N by setting the right value in the row_num predicate.
There are plenty of ways to do what you want, and it all depends on your specific needs. As already answered, you can use a temp table or table variable to store these conditions and then join on the same conditions you use as predicates. You can also create a user-defined table type and pass it as a parameter to a function or procedure. You might also use CROSS APPLY with a VALUES clause to build that list and then join it.
DROP TABLE IF EXISTS #temp;
CREATE TABLE #temp ( d DATE, k VARCHAR(100) );
GO
INSERT INTO #temp
VALUES ( '20180101', 'a' ),
( '20180102', 'b' ),
( '20180103', 'c' ),
( '20180104', 'd' ),
( '20190101', 'a' ),
( '20190102', 'b' ),
( '20180402', 'c' ),
( '20190103', 'c' ),
( '20190104', 'd' );
SELECT a.d ,
a.k
FROM ( SELECT d ,
k ,
ROW_NUMBER() OVER ( PARTITION BY k ORDER BY d DESC ) row_num
FROM #temp
WHERE (d >= '20180401'
AND k = 'a')
OR (d > '20180401'
AND k = 'b')
OR (d > '20180401'
AND k = 'c')
) a
WHERE a.row_num <= 1;
-- VALUES way
SELECT a.d ,
a.k
FROM ( SELECT t.d ,
t.k ,
ROW_NUMBER() OVER ( PARTITION BY t.k ORDER BY t.d DESC ) row_num
FROM #temp t
CROSS APPLY (VALUES('20180401','a'), ('20180401', 'b'), ('20180401', 'c')) f(d,k)
WHERE t.d >= f.d AND f.k = t.k
) a
WHERE a.row_num <= 1;
If all the keys are using the same date, then use window functions:
SELECT key, value
FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY key ORDER BY date) as seqnum
FROM my_table t
WHERE date >= @input_date AND
key IN ( . . . )
) t
WHERE seqnum = 1;
SELECT key, date,value
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY key,date ORDER BY date) as rownum,key,date,value
FROM my_table
WHERE
date >= 'myinputdate'
) as d
WHERE d.rownum = 1;

insert random strings from a list using sql

I've created a table like this: region(id, name), and I want to insert some rows into it. The name should be chosen randomly from a list like ['east', 'west']. How can I do that?
Insert 10 random rows with...
Oracle:
INSERT INTO region(name)
SELECT name
FROM (SELECT rndm.name
FROM (SELECT 1 n
FROM dual
CONNECT BY LEVEL <= 365
) gen
CROSS
JOIN (SELECT 'west' name FROM dual
UNION ALL
SELECT 'east' FROM dual
UNION ALL
SELECT 'north' FROM dual
) rndm
ORDER BY DBMS_RANDOM.VALUE
)
-- rownum is applied after the inner ORDER BY, so the 10 rows kept are random
WHERE rownum <= 10
SQL Server:
INSERT INTO region (name)
SELECT TOP 10 rndm.name
FROM sys.all_objects a1
CROSS
JOIN (VALUES ('east'),
('west')) rndm(name)
ORDER BY CHECKSUM(NEWID())
MySQL:
INSERT INTO region (name)
SELECT rndm.name
FROM information_schema.columns v
CROSS
JOIN (SELECT 'east' as name UNION ALL
SELECT 'west') rndm
ORDER BY RAND()
LIMIT 10
Postgres:
INSERT INTO region(NAME)
SELECT unnest(ARRAY['west','east'])
FROM generate_series(1, 100)
ORDER BY random()
LIMIT 10
This is an Oracle solution. The first subquery generates ten random integers between 1 and 3, using Oracle's CONNECT BY syntax. The second subquery associates the region name with a number between 1 and 3. These are joined in the main query to populate the table:
insert into region (id, name)
with rnd as (
select level as lvl, round(DBMS_RANDOM.VALUE(0.5,3.4999999999), 0) as rnd
from dual
connect by level <= 10
) , regn as (
select 1 as rno, 'west' as rname from dual union all
select 2 as rno, 'east' as rname from dual union all
select 3 as rno, 'north' as rname from dual
)
select rnd.lvl
, regn.rname
from rnd
join regn on rnd.rnd = regn.rno
/
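The two-subquery approach described above (one random number per generated row, joined to a numbered list of names) can be sketched in a runnable form like this. It assumes SQLite via Python's sqlite3 module instead of Oracle, so CONNECT BY becomes a recursive CTE and DBMS_RANDOM.VALUE becomes a masked random(). The CTE name picks is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE region (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

conn.execute("""
    WITH RECURSIVE
    picks(lvl, rno) AS (
        -- ten rows, each carrying one random integer between 1 and 3
        SELECT 1, (random() & 2147483647) % 3 + 1
        UNION ALL
        SELECT lvl + 1, (random() & 2147483647) % 3 + 1
        FROM picks
        WHERE lvl < 10
    ),
    regn(rno, rname) AS (
        VALUES (1, 'west'), (2, 'east'), (3, 'north')
    )
    INSERT INTO region (id, name)
    SELECT picks.lvl, regn.rname
    FROM picks
    JOIN regn ON picks.rno = regn.rno
""")

rows = conn.execute("SELECT id, name FROM region ORDER BY id").fetchall()
for row in rows:
    print(row)
```

The random number is computed as a column of the generator, not inside the join condition. Putting random() in the ON clause would re-roll it for every candidate name and could return zero or several names per row.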

Query to return dynamic number of rows using SQL in SQL Server 2012

I have a unique requirement to return number of result rows in multiples of 10. Example, if actual data rows are 3, I must add another 7 blank rows to make it 10. If actual data rows are 16, I must add another 4 blank rows to make it 20, and so on.
Without using a procedure, is it possible to achieve this using SELECT statement?
The blank rows can simply contain NULL values or spaces or zeroes.
You can assume any simple query for data rows; the objective is to understand how to return rows dynamically in multiples of 10.
Example:
Select EmpName FROM Employees
If there are 3 employees, I should still return 10 rows, with the balance 7 rows containing either NULL value or blanks.
I am using SQL Server 2012.
This is a very rough idea of how it can be achieved:
WITH data(r) AS (
SELECT 1 r FROM dual
UNION ALL
SELECT r+1 r FROM data WHERE r < 10
)
SELECT sd.*
FROM data d
left join some_data sd on d.r = sd.id
This is the structure of the dual table:
create table dual (dummy varchar(1));
insert into dual values ('x');
Fiddle: http://sqlfiddle.com/#!6/5ffcc/4
One of the possible options is this:
WITH data(r) AS (
SELECT 1 r FROM dual
UNION ALL
SELECT r+1 r FROM data WHERE r < 10
)
SELECT sd.*
FROM
(select r, row_number() over (order by r) rn from data) d
left join (
select id, name, row_number() over (order by id) rn from some_data sd
) sd
on d.rn = sd.rn
The obvious disadvantages of this solution:
The rule for generating the 'r' values is most probably not as simple in your case.
The number of rows must be known before query execution.
But maybe it will help you find a better solution.
Here's another, fairly easy, way to handle it...
IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL
DROP TABLE #TestData;
CREATE TABLE #TestData (
EmpID INT NOT NULL,
EmpName VARCHAR(20) NOT NULL
);
INSERT #TestData(EmpID, EmpName) VALUES
(47, 'Bob'),(33, 'Mary'), (88, 'Sue');
-- data as it exists...
SELECT
td.EmpID,
td.EmpName
FROM
#TestData td;
-- the desired output...
WITH
cte_AddRN AS (
SELECT
td.EmpID,
td.EmpName,
RN = ROW_NUMBER() OVER (ORDER BY td.EmpName)
FROM
#TestData td
),
cte_TenRows AS (
SELECT n.RN FROM ( VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10) ) n (RN)
)
SELECT
ar.EmpID,
ar.EmpName
FROM
cte_TenRows tr
LEFT JOIN cte_AddRN ar
ON tr.RN = ar.RN
ORDER BY
tr.RN;
Results...
-- data as it exists...
EmpID EmpName
----------- --------------------
47 Bob
33 Mary
88 Sue
-- the desired output...
EmpID EmpName
----------- --------------------
47 Bob
33 Mary
88 Sue
NULL NULL
NULL NULL
NULL NULL
NULL NULL
NULL NULL
NULL NULL
NULL NULL
Based on the above 2 answers, here is what I did:
WITH DATA AS
(SELECT EmpName FROM Employees),
DataSummary AS
(SELECT COUNT(*) AS NumDataRows FROM DATA),
ReqdDataRows AS
(SELECT CEILING(NumDataRows/10.0)*10 AS NumRowsReqd FROM DataSummary),
FillerRows AS
(
SELECT 1 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 2 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 3 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 4 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 5 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 6 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 7 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 8 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 9 AS SLNO, '00000' AS FillerCol
UNION ALL
SELECT 10 AS SLNO, '00000' AS FillerCol
)
SELECT * FROM DATA
--UNION ALL
--SELECT CONVERT(VARCHAR(10), NumDataRows) FROM DataSummary
--UNION ALL
--SELECT CONVERT(VARCHAR(10), NumRowsReqd) FROM ReqdDataRows
UNION ALL
SELECT FillerCol FROM FillerRows
WHERE (SELECT NumDataRows FROM DataSummary) + SLNO <= (SELECT NumRowsReqd FROM ReqdDataRows)
This gives me the output I want. It avoids ROW_NUMBER and ordering. The FillerRows table can be simplified further using SELECT * FROM (VALUES...), and the 2nd and 3rd CTEs, DataSummary and ReqdDataRows, can be merged into a single SELECT statement.
This is a step-by-step approach that is easy to understand and debug:
Get the actual data rows
Get count of the data rows
Calculate the required number of data rows
UNION the actual data rows with filler rows
Any suggestions on further simplifying this are welcome.
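One possible simplification of the steps above is to generate only as many filler rows as are missing, so no hard-coded list of ten rows is needed. The sketch below is an assumption-heavy illustration, not a SQL Server solution: it runs on SQLite through Python's sqlite3 module, and the cnt and filler CTE names and the padded_names helper are hypothetical.

```python
import sqlite3

PAD_QUERY = """
    WITH RECURSIVE
    cnt(n) AS (SELECT count(*) FROM Employees),
    filler(i) AS (
        -- start only when the row count is not already a multiple of 10
        SELECT 1 FROM cnt WHERE n % 10 <> 0
        UNION ALL
        -- keep going until the missing 10 - n % 10 rows are produced
        SELECT i + 1 FROM filler, cnt WHERE i < 10 - n % 10
    )
    SELECT EmpName FROM Employees
    UNION ALL
    SELECT NULL FROM filler
"""


def padded_names(names):
    """Load names into a scratch Employees table and return the padded result."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Employees (EmpName TEXT)")
    conn.executemany("INSERT INTO Employees VALUES (?)", [(n,) for n in names])
    return [row[0] for row in conn.execute(PAD_QUERY)]


print(padded_names(["Bob", "Mary", "Sue"]))
```

With 3 names this adds 7 NULL rows, with 16 names it adds 4, and with exactly 10 names the filler CTE is empty.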

SQL Puzzle: How to generate row numbers ? (a classic puzzle with a cruel twist)

* A more refined version of this challenge can be found here.
The Puzzle
We have a table t:
create table t (i int not null);
The goal is to write, under the requirements specified below, a query that returns the same results as -
select t.i,row_number() over (order by t.i) as rn from t;
It might help you to know that there is another table in the database -
create table s (i int not null unique);
All I can tell you about table s, except for its definition, is that it has the same number of rows as table t, or maybe more.
Requirements
The solution should be a single SQL query (sub-queries are fine).
The use of any table other than t (and perhaps s), including table functions, is not allowed.
Only the following clauses are allowed: SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY and WITH (but not recursive!!).
The use of analytic functions is not allowed.
The use of rownum, rowid, guid and their like is not allowed.
The use of T-SQL, PL/SQL etc. is not allowed.
The use of UDF (User Defined Functions) is not allowed.
The use of variables is not allowed.
Data Sample
create table t (i int not null);
insert into t (i) values (1);
insert into t (i) values (2);
insert into t (i) values (3);
insert into t (i) values (3);
insert into t (i) values (4);
insert into t (i) values (5);
insert into t (i) values (5);
insert into t (i) values (5);
insert into t (i) values (6);
insert into t (i) values (7);
create table s (i int not null unique);
insert into s (i) values (3);
insert into s (i) values (12);
insert into s (i) values (13);
insert into s (i) values (28);
insert into s (i) values (41);
insert into s (i) values (52);
insert into s (i) values (56);
insert into s (i) values (57);
insert into s (i) values (83);
insert into s (i) values (91);
insert into s (i) values (97);
insert into s (i) values (99);
Requested result
I RN
---------- ----------
1 1
2 2
3 3
3 4
4 5
5 6
5 7
5 8
6 9
7 10
The following will work almost all the time:
with tt as (
select t.i, t.i + rand() as new_i
from t
)
select tt.i,
(select count(*)
from tt tt2
where tt2.new_i <= tt.new_i
) as rn
from tt;
Note: The function for rand() exists in all databases that support with, although the exact function (or combination) varies by database.
EDIT:
It is much more complicated to get something that works all the time. But:
with n as (
select (select count(*) from s s2 where s2.i <= s.i) as n
from s
),
tt as (
select i, count(*) as num
from t
group by i
),
ttt as
(select tt.*,
(select sum(num) from tt tt2 where tt2.i < tt.i
) as cume_num
from tt
)
select ttt.i, coalesce(cume_num, 0) + n.n
from ttt join
n
on n.n <= ttt.num;
I sort of like the first way better ;)
I would suggest this:
with
grp as
(select i,
count(*) cnt,
(select count(*) from t where i < t1.i) cntBefore
from t t1
group by i),
r as
(select (select count(*) from s where i <= s1.i) rn
from s s1)
select grp.i, r.rn
from r
inner join grp on r.rn between grp.cntBefore + 1 and grp.cntBefore + grp.cnt;
My variation:
with seq_num (i)
as
(
select (select count (*) as i from s s2 where s2.i <= s1.i)
from s s1
)
,t_with_seq (i,i_seq)
as
(
select t.i as i
,s.i as i_seq
from (select i
,count (*) as occurrences
from t
group by i
)
t
join seq_num s
on s.i <= t.occurrences
)
select ts1.i
,(select count (*) from t_with_seq ts2 where ts2.i < ts1.i or (ts2.i = ts1.i and ts2.i_seq <= ts1.i_seq)) as rn
from t_with_seq ts1
order by rn
;
with
num_gen as (
select (select count(*) from s where i <= s1.i) n
from s s1
),
groups as (
select i, count(*) as ct
from t
group by i
)
select g.i, ng.n + (select count(*) from t where t.i < g.i) rn
from num_gen ng inner join groups g on ng.n <= g.ct;
Note: Initially I offered a different solution, shown below for historical perspective (the first two comments refer to it). The OP is right, of course; I butchered a good idea. In the solution above I restored the idea to its proper simplicity.
-- OLD solution (replaced by the one above)
with
num_gen as (
select (select count(*) from s where i <= s1.i) n
from s s1
),
groups as (
select i, count(*) as ct
from t
group by i
),
new_numbers as (
select g.i i, g.i + power(10, -ng.n) new_i
from num_gen ng inner join groups g on ng.n <= g.ct
)
select nn.i, (select count(*) from new_numbers where new_i <= nn.new_i) rn
from new_numbers nn;
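To check the num_gen/groups solution (the one that replaced the old version) against the sample data, a small harness such as the one below can help. It is a sketch that assumes SQLite via Python's sqlite3 module. The groups CTE is renamed grp to stay clear of SQLite keywords, an ORDER BY is added for readable output, and the reference query uses row_number() only to verify the result, which the puzzle itself would not allow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (i INT NOT NULL)")
conn.execute("CREATE TABLE s (i INT NOT NULL UNIQUE)")
conn.executemany("INSERT INTO t (i) VALUES (?)",
                 [(v,) for v in (1, 2, 3, 3, 4, 5, 5, 5, 6, 7)])
conn.executemany("INSERT INTO s (i) VALUES (?)",
                 [(v,) for v in (3, 12, 13, 28, 41, 52, 56, 57, 83, 91, 97, 99)])

puzzle = conn.execute("""
    WITH
    num_gen AS (
        -- rank of each value in s: yields 1, 2, 3, ... up to the size of s
        SELECT (SELECT count(*) FROM s WHERE i <= s1.i) AS n
        FROM s s1
    ),
    grp AS (
        SELECT i, count(*) AS ct FROM t GROUP BY i
    )
    -- each value of t is repeated ct times and offset by the rows before it
    SELECT g.i, ng.n + (SELECT count(*) FROM t WHERE t.i < g.i) AS rn
    FROM num_gen ng
    JOIN grp g ON ng.n <= g.ct
    ORDER BY rn
""").fetchall()

# Reference result, computed with the window function the puzzle forbids.
reference = conn.execute("""
    SELECT t.i, row_number() OVER (ORDER BY t.i) AS rn
    FROM t
    ORDER BY rn
""").fetchall()

print(puzzle == reference)
for row in puzzle:
    print(row)
```

The solution relies on s having at least as many rows as the largest group of duplicates in t, which the puzzle guarantees because s has at least as many rows as t.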