Using Random function in the WHERE Clause - sql

In my WHERE clause I'm using the random function to get a random number between 1 and 5. However, the result is always empty, and no error is raised.
Here it is:
Select Question._id, question_text, question_type, topic, favorite,
picture_text, picture_src, video_text, video_src, info_title, info_text,
info_picture_src, topic_text
FROM Question
LEFT JOIN Question_Lv ON Question._id = Question_Lv.question_id
LEFT JOIN Info ON Question._id = Info.question_id
LEFT JOIN Info_Lv ON Question._id = Info_Lv.question_id
LEFT JOIN Picture ON Question._id = Picture.question_id
LEFT JOIN Picture_Lv ON Question._id = Picture_Lv.question_id
LEFT JOIN Video ON Question._id = Video.question_id
LEFT JOIN Video_Lv ON Question._id = Video_Lv.question_id
LEFT JOIN Topic ON Question.topic = Topic._id
LEFT JOIN Topic_Lv ON Topic._id = Topic_Lv.topic_id
LEFT JOIN Exam ON Question._id = Exam.question_id
WHERE Exam.exam = (random() * 5+ 1)
What is the random function doing in this case, and how do I use it correctly?

From Docs
random()
The random() function returns a pseudo-random integer between
-9223372036854775808 and +9223372036854775807.
Hence your random value is not between 0 and 1 as you assumed and hence no rows.
You can get it between 0 and 1 by dividing it by 2×9223372036854775808 (that is, 18446744073709551616) and adding 0.5 to it.
random() / 18446744073709551616 + 0.5
So, your where clause becomes:
WHERE Exam.exam = ((random() / 18446744073709551616 + 0.5) * 5 + 1)
which is same as:
WHERE Exam.exam = 5 * random() / 18446744073709551616 + 3.5
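Also, you'll probably need to turn the right side into an integer. round() is not a good fit for that: the expression lies between 1 and just under 6, so rounding can return 6 and returns 1 only half as often as 2 through 5. Truncating avoids both problems:
WHERE Exam.exam = CAST(5 * random() / 18446744073709551616 + 3.5 AS INTEGER)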
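To check the value ranges involved, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the database is SQLite, which the quoted documentation suggests. The integer-only expression abs(random() % 5) + 1 is an alternative added here for comparison, not something taken from the answer above, and the helper name sample is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def sample(expr, n=2000):
    """Evaluate a scalar SQL expression n times; return the set of results."""
    return {conn.execute(f"SELECT {expr}").fetchone()[0] for _ in range(n)}

# Raw random() values are 64-bit signed integers, not fractions in [0, 1).
raw = sample("random()", 20)

# Integer-only way to get 1..5: the remainder lies in -4..4,
# abs() folds it to 0..4, and adding 1 shifts it to 1..5.
picks = sample("abs(random() % 5) + 1")
print(sorted(picks))
```

Taking the remainder before abs() keeps the arithmetic in integers, so no rounding or truncation step is needed.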

I'll answer this question using Vertica as a database.
Vertica has the function RANDOM(), which returns a random double precision number between 0 and 1, and the function RANDOMINT(<integer>), which returns an integer number between 0 and <integer>-1. I'll use RANDOMINT(5) for this example.
As a general suggestion - Isolate your specific problem in your question. The joins in your query are not part of the problem. And use a sample table, like I do in the code below.
As some of the previous answers suggested, RANDOMINT(5) will return a new random integer between 0 and 4 for each of the rows that are read from the exam table.
See here:
WITH exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT * FROM exam WHERE exam=RANDOMINT(5)+1
;
id|exam|exam_res
3| 3|exam_res_3
4| 4|exam_res_4
5| 5|exam_res_5
6| 1|exam_res_1
7| 2|exam_res_2
9| 4|exam_res_4
12| 2|exam_res_2
What you need to do is make sure that you call your random number generator only once.
If your database conforms to the ANSI SQL:1999 standard and supports the WITH clause (the common table expression, which I also use to generate the sample data), do that in a common table expression too. I call it search_exam:
WITH search_exam(exam) AS (
SELECT RANDOMINT(5)+1
)
, exam(id,exam,exam_res) AS (
SELECT 1,1,'exam_res_1'
UNION ALL SELECT 2,2,'exam_res_2'
UNION ALL SELECT 3,3,'exam_res_3'
UNION ALL SELECT 4,4,'exam_res_4'
UNION ALL SELECT 5,5,'exam_res_5'
UNION ALL SELECT 6,1,'exam_res_1'
UNION ALL SELECT 7,2,'exam_res_2'
UNION ALL SELECT 8,3,'exam_res_3'
UNION ALL SELECT 9,4,'exam_res_4'
UNION ALL SELECT 10,5,'exam_res_5'
UNION ALL SELECT 11,1,'exam_res_1'
UNION ALL SELECT 12,2,'exam_res_2'
UNION ALL SELECT 13,3,'exam_res_3'
UNION ALL SELECT 14,4,'exam_res_4'
UNION ALL SELECT 15,5,'exam_res_5'
)
SELECT id,exam,exam_res FROM exam
WHERE exam=(SELECT exam FROM search_exam)
;
id|exam|exam_res
1| 1|exam_res_1
6| 1|exam_res_1
11| 1|exam_res_1
Alternatively, you can go SELECT id,exam,exam_res FROM exam INNER JOIN search_exam USING(exam) .
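The same "draw once, then compare" idea can also be done in two steps from application code. The following is a minimal sketch using Python's built-in sqlite3 module rather than Vertica; it is a variation on the approach above, not the CTE itself, and the expression abs(random() % 5) + 1 stands in for RANDOMINT(5)+1 as an assumption for SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE exam (id INTEGER PRIMARY KEY, exam INTEGER, exam_res TEXT)")
# Same 15 sample rows as above: exam cycles through 1..5 three times.
conn.executemany(
    "INSERT INTO exam VALUES (?, ?, ?)",
    [(i, (i - 1) % 5 + 1, f"exam_res_{(i - 1) % 5 + 1}") for i in range(1, 16)],
)

# Step 1: draw the random exam number exactly once.
(picked,) = conn.execute("SELECT abs(random() % 5) + 1").fetchone()

# Step 2: bind the drawn value, so every row is compared with the same number.
rows = conn.execute(
    "SELECT id, exam, exam_res FROM exam WHERE exam = ? ORDER BY id", (picked,)
).fetchall()
print(picked, rows)
```

Because the value is bound as a parameter, the filter always returns exactly the three rows that share the drawn exam number.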
Happy playing -
Marco the Sane

Related

SQL Nested Sum(probably ?)

I have a select that groups and totals a column, but I also need each sum expressed as a percentage of the total of all sums.
As an example of today's return:
id|sum
1| 10
2| 50
3| 80
4| 20
5| 60
What I'm looking for: (the total is 220)
id|sum| %
1| 10|10/220
2| 50|50/220
3| 80|80/220
4| 20|20/220
5| 60|60/220
I am not sure if there's an easy way to get that result. I could do this with a subselect, but I thought that was not a good solution and that there should be a better way.
The select is so simple:
Select
id,
sum(value)
From
table
Group By
id
But, that's the real select:
Select
OrcaItem.Cd_Produto,
OrcaItem.Ds_Produto,
OrcaItem.Cd_Produto || ' - ' || OrcaItem.Ds_Produto CdDs_Produto,
Estoque.Qt_Disponivel,
Sum(OrcaItem.Qt_Vendida) Qt_Vendida,
Sum(OrcaItem.Vr_TotalLiquido) Vr_Liquido, -- HERE!
Cast(Sum(
Case :piTipoCusto
When 0 Then (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) As Numeric(15,2)) Vr_Custo,
Cast(Sum(
Case :piTipoCusto
When 0 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) As Numeric(15,2)) Vr_Lucro,
Sum(
Case :piTipoCusto
When 0 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 1 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoAtual * OrcaItem.Qt_Vendida)
When 2 Then OrcaItem.Vr_TotalLiquido - (OrcaItem.Vr_CustoFin * OrcaItem.Qt_Vendida)
When 3 Then OrcaItem.Vr_TotalLiquido - (Produto.Vr_UltPcoCompra * OrcaItem.Qt_Vendida)
End
) / Sum(OrcaItem.Vr_TotalLiquido) * 100 Pc_Lucro,
Produto.Cd_Linha,
Linha.Ds_Linha,
Produto.Cd_Grupo,
Grupo.Ds_Grupo
From
OrcaItem
Inner Join Orca On Orca.Nr_Orcamento = OrcaItem.Nr_Orcamento
Inner Join Estoque On Estoque.Cd_Produto = OrcaItem.Cd_Produto
Inner Join Produto On Produto.Cd_Produto = OrcaItem.Cd_Produto
Inner Join Linha On Linha.Cd_Linha = Produto.Cd_Linha
Inner Join Grupo On Grupo.Cd_Linha = Produto.Cd_Linha And
Grupo.Cd_Grupo = Produto.Cd_Grupo
Where
Orca.Fg_Situacao In ('F', 'R') And
Orca.Dt_Atendido Between :piDt_Inicio And :piDt_Final
Group By
OrcaItem.Cd_Produto,
OrcaItem.Ds_Produto,
CdDs_Produto,
Estoque.Qt_Disponivel,
Produto.Cd_Linha,
Linha.Ds_Linha,
Produto.Cd_Grupo,
Grupo.Ds_Grupo
Order By
Produto.Cd_Linha,
Produto.Cd_Grupo,
OrcaItem.Ds_Produto
Sum(OrcaItem.Vr_TotalLiquido) Vr_Liquido = total of each
The 'global' total is the sum of Vr_Liquido.
Assuming the simple select you show, you can use a window function, assuming you're using Firebird 3.0 or higher:
select id, sum_value, cast(sum_value as numeric(18,2)) / sum(sum_value) over()
from (
select id, sum("VALUE") as sum_value
from testdata
group by id
)
The over() will aggregate over all rows. The cast to NUMERIC(18,2) is needed because, if the sums are integers, the division would otherwise be an integer division and truncate every ratio to zero. Use a higher scale if you need more digits (or cast to double precision).
For the more complex select, you take the same approach. Make the original query a derived table (or common table expression), and use the window function in the enclosing select list.
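As a rough illustration of the derived-table-plus-window-function pattern, here is a sketch run through Python's built-in sqlite3 module instead of Firebird (it needs SQLite 3.25 or later for window functions). The column name amount and the individual input rows are made up so that the per-id sums match the example in the question; multiplying by 100.0 plays the role of the cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testdata (id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO testdata VALUES (?, ?)",
    [(1, 4), (1, 6), (2, 50), (3, 30), (3, 50), (4, 20), (5, 60)],
)

rows = conn.execute(
    """
    SELECT id,
           sum_value,
           100.0 * sum_value / SUM(sum_value) OVER () AS pct  -- OVER () spans all groups
    FROM (
        SELECT id, SUM(amount) AS sum_value
        FROM testdata
        GROUP BY id
    )
    ORDER BY id
    """
).fetchall()

for id_, sum_value, pct in rows:
    print(id_, sum_value, round(pct, 2))
```

The inner query produces one row per id; the outer query only adds the ratio, so the grouping logic does not have to be repeated.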

How to SELECT / Query Whole Numbers only

Do you know if I can select only whole numbers?
I don't want to round off the values.
For example this is select * from Table_1 below
Numbers|Team
10.5|A
12.12|B
23|C
I would like to do something like
select * from Table_1
where NUMBERS is ;
Expected output below
Numbers|Team
23|C
Thank you very much!
The function TRUNC() truncates a number without rounding:
SELECT *
FROM Table_1
WHERE TRUNC("Numbers") = "Numbers";
You can use the ROUND function to filter (rather than to display) the values:
SELECT *
FROM table_1
WHERE ROUND(numbers) = numbers;
You could also use the FLOOR, CEIL or TRUNC functions instead of ROUND.
Or, you could use the MOD function:
SELECT *
FROM table_1
WHERE MOD(numbers, 1) = 0;
(And you could create a function-based index on MOD(numbers, 1) if you wanted to improve performance.)
For the sample data:
CREATE TABLE table_1 (Numbers, Team) AS
SELECT 10.5, 'A' FROM DUAL UNION ALL
SELECT 12.12, 'B' FROM DUAL UNION ALL
SELECT 23, 'C' FROM DUAL;
All the options output:
NUMBERS|TEAM
23|C
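The ROUND-based filter can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch on SQLite, not Oracle, and it reuses the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (numbers REAL, team TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?)",
    [(10.5, "A"), (12.12, "B"), (23, "C")],
)

# A value is whole exactly when rounding leaves it unchanged.
whole = conn.execute(
    "SELECT numbers, team FROM table_1 WHERE ROUND(numbers) = numbers"
).fetchall()
print(whole)
```

The function is used only in the filter, so the stored values themselves are returned unmodified.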

Return five rows of random DNA instead of just one

This is the code I have to create a string of DNA:
prepare dna_length(int) as
with t1 as (
select chr(65) as s
union select chr(67)
union select chr(71)
union select chr(84) )
, t2 as ( select s, row_number() over() as rn from t1)
, t3 as ( select generate_series(1,$1) as i, round(random() * 4 + 0.5) as rn )
, t4 as ( select t2.s from t2 join t3 on (t2.rn=t3.rn))
select array_to_string(array(select s from t4),'') as dna;
execute dna_length(20);
I am trying to figure out how to re-write this to give a table of 5 rows of strings of DNA of length 20 each, instead of just one row. This is for PostgreSQL.
I tried:
CREATE TABLE dna_table(g int, dna text);
INSERT INTO dna_table (1, execute dna_length(20));
But this does not seem to work. I am an absolute beginner. How to do this properly?
PREPARE creates a prepared statement that can only be used "as is". If your prepared statement returns one string, then one string is all you can get. You can't use it inside other statements, such as an INSERT.
In your case you may create a function:
create or replace function dna_length(int) returns text as
$$
with t1 as (
select chr(65) as s
union
select chr(67)
union
select chr(71)
union
select chr(84))
, t2 as (select s,
row_number() over () as rn
from t1)
, t3 as (select generate_series(1, $1) as i,
round(random() * 4 + 0.5) as rn)
, t4 as (select t2.s
from t2
join t3 on (t2.rn = t3.rn))
select array_to_string(array(select s from t4), '') as dna
$$ language sql;
And use it in a way like this:
insert into dna_table(g, dna) select generate_series(1,5), dna_length(20)
From the official doc:
PREPARE creates a prepared statement. A prepared statement is a server-side object that can be used to optimize performance. When the PREPARE statement is executed, the specified statement is parsed, analyzed, and rewritten. When an EXECUTE command is subsequently issued, the prepared statement is planned and executed. This division of labor avoids repetitive parse analysis work, while allowing the execution plan to depend on the specific parameter values supplied.
About functions.
This can be much simpler and faster:
SELECT string_agg(CASE ceil(random() * 4)
WHEN 1 THEN 'A'
WHEN 2 THEN 'C'
WHEN 3 THEN 'T'
WHEN 4 THEN 'G'
END, '') AS dna
FROM generate_series(1,100) g -- 100 = 5 rows * 20 nucleotides
GROUP BY g%5;
random() produces a random value in the range 0.0 <= x < 1.0. Multiply by 4 and take the mathematical ceiling with ceil() (cheaper than round()), and you get a random distribution of the numbers 1-4. Convert each number to one of the letters A, C, T, G, and aggregate with GROUP BY g%5, where % is the modulo operator.
About string_agg():
Concatenate multiple result rows of one column into one, group by another column
As prepared statement, taking
$1 ... the number of rows
$2 ... the number of nucleotides per row
PREPARE dna_length(int, int) AS
SELECT string_agg(CASE ceil(random() * 4)
WHEN 1 THEN 'A'
WHEN 2 THEN 'C'
WHEN 3 THEN 'T'
WHEN 4 THEN 'G'
END, '') AS dna
FROM generate_series(1, $1 * $2) g
GROUP BY g%$1;
Call:
EXECUTE dna_length(5,20);
Result:
| dna |
| :------------------- |
| ATCTTCGACACGTCGGTACC |
| GTGGCTGCAGATGAACAGAG |
| ACAGCTTAAAACACTAAGCA |
| TCCGGACCTCTCGACCTTGA |
| CGTGCGGAGTACCCTAATTA |
If you need it a lot, consider a function instead. See:
What is the difference between a prepared statement and a SQL or PL/pgSQL function, in terms of their purposes?
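The group-by-remainder trick can be tried without a PostgreSQL server. The sketch below uses Python's built-in sqlite3 module, with several substitutions that are assumptions for SQLite and not part of the answer above: a recursive CTE stands in for generate_series, group_concat for string_agg, and substr over 'ACGT' for the CASE expression. The function name random_dna is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def random_dna(n_rows, length):
    """Return n_rows random DNA strings, each `length` nucleotides long."""
    sql = """
        WITH RECURSIVE seq(g) AS (          -- stands in for generate_series
            SELECT 1
            UNION ALL
            SELECT g + 1 FROM seq WHERE g < :total
        )
        SELECT group_concat(substr('ACGT', abs(random() % 4) + 1, 1), '') AS dna
        FROM seq
        GROUP BY g % :n_rows                -- one output row per remainder
    """
    params = {"total": n_rows * length, "n_rows": n_rows}
    return [row[0] for row in conn.execute(sql, params)]

strands = random_dna(5, 20)
for s in strands:
    print(s)
```

Since the series has n_rows * length members, each remainder class contains exactly length members, which is why every string comes out the same length.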

can't use JOIN with generate_series on Redshift

generate_series function on Redshift works as expected, when used in a simple select statement.
WITH series AS (
SELECT n as id from generate_series (-10, 0, 1) n
) SELECT * FROM series;
-- Works fine
As soon as I add a JOIN condition, Redshift throws
com.amazon.support.exceptions.ErrorException: Function "generate_series(integer,integer,integer)" not supported.
DROP TABLE testing;
CREATE TABLE testing (
id INT
);
WITH series AS (
SELECT n as id from generate_series (-10, 0, 1) n
) SELECT * FROM series S JOIN testing T ON S.id = T.id;
-- Function "generate_series(integer,integer,integer)" not supported.
Redshift Version
SELECT version();
-- PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.1485
Are there any workarounds to make this work?
generate_series is not supported by Redshift. It works only standalone on a leader node.
A workaround would be using row_number against any table that has sufficient number of rows:
with
series as (
select (row_number() over ()) - 11 as id from some_table limit 11
) ...
also, this question was asked multiple times already
You are correct that this does not work on Redshift.
See here.
The easiest workaround is to create a permanent table "manually" beforehand that holds the values you need. For example, the table could have one row for each number from -1000 to +1000, and you then select the range you need from it.
So for your example you would have something like
WITH series AS (
SELECT n as id from (select num as n from newtable where num between -10 and 0) n
) SELECT * FROM series S JOIN testing T ON S.id = T.id;
Does that work for you?
Alternatively, if you cannot create the table beforehand or prefer not to, you could use something like this
with ten_numbers as (select 1 as num union select 2 union select 3 union select 4 union select 5 union select 6 union select 7 union select 8 union select 9 union select 0)
,generated_numbers AS
(
SELECT (1000*t1.num) + (100*t2.num) + (10*t3.num) + t4.num-5000 as gen_num
FROM ten_numbers AS t1
JOIN ten_numbers AS t2 ON 1 = 1
JOIN ten_numbers AS t3 ON 1 = 1
JOIN ten_numbers AS t4 ON 1 = 1
)
select gen_num from generated_numbers
where gen_num between -10 and 0
order by 1;
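To see what the digit cross join produces, here is a sketch that runs essentially the same query through Python's built-in sqlite3 module. It demonstrates the technique on SQLite only; it does not verify anything about Redshift. The helper name series and the parameters lo and hi are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

sql = """
    WITH ten_numbers(num) AS (
        SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
    ),
    generated_numbers(gen_num) AS (
        -- four digits give 0..9999; subtracting 5000 shifts that to -5000..4999
        SELECT 1000 * t1.num + 100 * t2.num + 10 * t3.num + t4.num - 5000
        FROM ten_numbers AS t1
        CROSS JOIN ten_numbers AS t2
        CROSS JOIN ten_numbers AS t3
        CROSS JOIN ten_numbers AS t4
    )
    SELECT gen_num FROM generated_numbers
    WHERE gen_num BETWEEN :lo AND :hi
    ORDER BY gen_num
"""

def series(lo, hi):
    """Return the integers lo..hi (inclusive) taken from the generated range."""
    return [row[0] for row in conn.execute(sql, {"lo": lo, "hi": hi})]

print(series(-10, 0))
```

Each additional self-join multiplies the available range by ten, so the pattern scales to larger series without needing a helper table.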

How can I treat a UNION query as a sub query

I have a set of tables that are logically one table split into pieces for performance reasons. I need to write a query that effectively combines all the tables so that I can apply a single WHERE clause to the result. I have successfully used a UNION of the results of applying the WHERE clause to each subtable explicitly, as in the following
SELECT * FROM FRED_1 WHERE CHARLIE = 42
UNION
SELECT * FROM FRED_2 WHERE CHARLIE = 42
UNION
SELECT * FROM FRED_3 WHERE CHARLIE = 42
but as there are ten separate subtables updating the WHERE clause each time is a pain. What I want is something like this
SELECT *
FROM (
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3)
WHERE CHARLIE = 42
If it makes a difference the query needs to run against a DB2 database.
Here is a more comprehensive (sanitised) version of what I need to do.
select *
from ( select * from FRD_1 union select * from FRD_2 union select * from FRD_3 ) as FRD,
( select * from REQ_1 union select * from REQ_2 union select * from REQ_3 ) as REQ,
( select * from RES_1 union select * from RES_2 union select * from RES_3 ) as RES
where FRD.KEY1 = 123456
and FRD.KEY1 = REQ.KEY1
and FRD.KEY1 = RES.KEY1
and REQ.KEY2 = RES.KEY2
NEW INFORMATION:
It looks like the problem has more to do with the number of fields in the union than anything else. If I greatly restrict the fields I can get most of the syntax variations below working. Unfortunately, restricting the fields so much means the resulting query, while potentially useful, is not giving me the result I wanted. I've managed to get an additional 3 fields from one of the tables in addition to the 2 keys. Any more than that and the query fails.
I believe you have to give a name to your subquery result. I don't know db2 so I'm taking a shot in the dark, but I know this works on several other platforms.
SELECT *
FROM (
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3) AS T1
WHERE CHARLIE = 42
If the logical implementation is a single table but the physical implementation is multiple tables then how about creating a view that defines the logical model.
CREATE VIEW VW_FRED AS
SELECT * FROM FRED_1
UNION
SELECT * FROM FRED_2
UNION
SELECT * FROM FRED_3
then it's a simple matter of
SELECT * FROM VW_FRED WHERE CHARLIE = 42
Again, I'm not familiar with db2 syntax but this gives you the general idea.
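The view approach can be sketched with Python's built-in sqlite3 module. This runs on SQLite, not DB2, so it only illustrates the general idea; the column payload and the sample rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for i in (1, 2, 3):
    conn.execute(f"CREATE TABLE fred_{i} (charlie INTEGER, payload TEXT)")
conn.execute("INSERT INTO fred_1 VALUES (42, 'a'), (7, 'b')")
conn.execute("INSERT INTO fred_2 VALUES (42, 'c')")
conn.execute("INSERT INTO fred_3 VALUES (42, 'a'), (9, 'd')")  # (42, 'a') duplicates fred_1

conn.execute(
    """
    CREATE VIEW vw_fred AS
        SELECT * FROM fred_1
        UNION
        SELECT * FROM fred_2
        UNION
        SELECT * FROM fred_3
    """
)

# One WHERE clause now covers all three physical tables.
rows = conn.execute(
    "SELECT charlie, payload FROM vw_fred WHERE charlie = 42 ORDER BY payload"
).fetchall()
print(rows)
```

Note that UNION removes duplicate rows across the subtables, as the repeated (42, 'a') row shows; UNION ALL would keep them and skip the de-duplication work.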
with
FRD as ( select * from FRD_1 union select * from FRD_2 union select * from FRD_3 ),
REQ as ( select * from REQ_1 union select * from REQ_2 union select * from REQ_3 ),
RES as ( select * from RES_1 union select * from RES_2 union select * from RES_3 )
SELECT * from FRD, REQ, RES
WHERE FRD.KEY1 = 123456
and FRD.KEY1 = REQ.KEY1
and FRD.KEY1 = RES.KEY1
and REQ.KEY2 = RES.KEY2
I'm not familiar with DB2 syntax but why aren't you doing this as an INNER JOIN or LEFT JOIN?
SELECT *
FROM FRED_1
INNER JOIN FRED_2
ON FRED_1.Charlie = FRED_2.Charlie
INNER JOIN FRED_3
ON FRED_1.Charlie = FRED_3.Charlie
WHERE FRED_1.Charlie = 42
If the values don't exist in FRED_2 or FRED_3 then use a LEFT OUTER JOIN. I'm assuming that FRED_1 is a master table, and if a record exists then it will be in this table.
maybe:
SELECT * FROM
(select * from FRD_1
union
select * from FRD_2
union
select * from FRD_3) FRD
INNER JOIN (select * from REQ_1 union select * from REQ_2 union select * from REQ_3) REQ
on FRD.KEY1 = REQ.KEY1
INNER JOIN (select * from RES_1 union select * from RES_2 union select * from RES_3) RES
on FRD.KEY1 = RES.KEY1
WHERE FRD.KEY1 = 123456 and REQ.KEY2 = RES.KEY2