I want to stop making the same performance mistakes and need someone way steadier on SQL statements than me on this one.
Basically I want my function:
create or replace FUNCTION SEQGEN(vinp in varchar2, iSeq in INTEGER)
RETURN VARCHAR2 is vResult VARCHAR2(32);
iBas INTEGER; iRem INTEGER; iQuo INTEGER; lLen CONSTANT INTEGER := 2;
BEGIN
iBas := length(vInp);
iQuo := iSeq;
WHILE iQuo > 0 LOOP
iRem := iQuo mod iBas;
--dbms_output.put_line('Now we divide ' || lpad(iQuo,lLen,'0') || ' by ' || lpad(iBas,lLen,'0') || ', yielding a quotient of ' || lpad( TRUNC(iQuo / iBas) ,lLen,'0') || ' and a remainder of ' || lpad(iRem,lLen,'0') || ' giving the char: ' || substr(vInp, iRem, 1)); end if;
iQuo := TRUNC(iQuo / iBas);
If iRem < 1 Then iRem := iBas; iQuo := iQuo - 1; End If;
vResult := substr(vInp, iRem, 1) || vResult;
END LOOP;
RETURN vResult;
END SEQGEN;
to be written with SQL statements only.
Something like:
WITH sequence ( vResult, lSeq ) AS
(
SELECT str, length(str) base FROM (SELECT 'abc' str FROM DUAL)
)
SELECT vResult FROM sequence WHERE lSeq < 13
seq   if str = 'aB' output:   if str = 'abC' output:
  1   a                       a
  2   B                       b
  3   aa                      C
  4   aB                      aa
  5   Ba                      ab
  6   BB                      aC
  7   aaa                     ba
  8   aaB                     bb
  9   aBa                     bC
 10   aBB                     Ca
 11   Baa                     Cb
 12   BaB                     CC
 13   BBa                     aaa
and if str = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' then:
SELECT vResult FROM sequence
WHERE vResult in ('0001','0002','0009','000A','000Z',
'0010','0011','001A','ZZZZ') --you get the idea...
I have found some questions at Stackoverflow that are fairly related.
Base 36 to Base 10 conversion using SQL only and
PL/SQL base conversion without functions.
But with my current knowledge of SQL I am not quite able to hack this one...
EDITED:
Or well, it is almost like this one:
select sum(position_value) pos_val from (
select power(base,position-1) * instr('abc', digit) as position_value from (
select substr(input_string,length(input_string)+1-level,1) digit, level position, length(input_string) base
from (select 'cc' input_string from dual)
connect by level <= length(input_string)
)
)
Except that I want to give the pos_val as a parameter and get the input_string out...
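For reference, the loop in SEQGEN is bijective base-k numeration: the digits run from 1 to k instead of 0 to k-1, which is why a zero remainder is replaced by the base and the quotient is reduced by one. Below is a minimal sketch of that logic in Python rather than SQL, useful only for checking the expected outputs listed above. The name seqgen and its parameters simply mirror the PL/SQL function, and it returns an empty string where the PL/SQL returns NULL.

```python
def seqgen(alphabet: str, seq: int) -> str:
    """Return the seq-th string over `alphabet` in bijective base-len(alphabet)."""
    base = len(alphabet)
    result = ""
    while seq > 0:
        seq, rem = divmod(seq, base)
        if rem == 0:
            # Digits run 1..base, so a zero remainder means "digit = base"
            # and one unit has to be borrowed from the quotient.
            rem = base
            seq -= 1
        result = alphabet[rem - 1] + result
    return result


for n in (1, 2, 3, 4, 13):
    print(n, seqgen("aB", n), seqgen("abC", n))
```

A SQL-only version has to perform the same divide, adjust and prepend step once per output character, whichever row-generating technique it uses.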
Here is one way. Another cool usage of the model function in Oracle SQL.
I need a function that takes two parameters, a string and a sort order (Asc or Desc), and returns the characters of the string sorted in that order.
IN : dgtak
OUT: adgkt
I tried this, but it doesn't seem to work:
CREATE OR REPLACE FUNCTION order_string(my_string IN VARCHAR2)
RETURN VARCHAR2 IS
ret_string VARCHAR2(4000);
BEGIN
SELECT LISTAGG(regexp_substr(my_string, '\w', 1, level), '') WITHIN
GROUP(
ORDER BY 1)
INTO ret_string
FROM dual
CONNECT BY regexp_substr(my_string, '\w', 1, level) IS NOT NULL;
RETURN ret_string;
END;
select order_string('dgtak') as RESULT from dual;
Here's one option:
SQL> create or replace function order_string (par_string in varchar2, par_order in varchar2)
return varchar2
is
retval varchar2(100);
begin
with temp (val) as
-- split PAR_STRING to rows
(select substr(par_string, level, 1)
from dual
connect by level <= length(par_string)
)
-- aggregate characters back in ascending or descending order
select case when par_order = 'Asc' then listagg(val, '') within group (order by val asc)
when par_order = 'Desc' then listagg(val, '') within group (order by val desc)
else null
end
into retval
from temp;

return retval;
end;
/
Function created.
Testing:
SQL> select order_string('dfag', 'Asc') result_asc,
order_string('dfag', 'Desc') result_desc
from dual;
RESULT_ASC RESULT_DESC
-------------------- --------------------
adfg gfda
SQL>
Just for fun, here's a procedural version. It has more lines of code than the SQL version but in my tests it's slightly faster.
create or replace function order_string
( p_string varchar2
, p_reverse varchar2 default 'N' )
return varchar2
as
pragma udf;
type letter_tt is table of number index by varchar2(1);
letters letter_tt := letter_tt();
letter varchar2(1);
sorted_string long;
string_length integer := length(p_string);
begin
-- Store all characters of p_string as indices of array:
for i in 1..string_length loop
letter := substr(p_string,i,1);
if letters.exists(letter) then
letters(letter) := letters(letter) +1;
else
letters(letter) := 1;
end if;
end loop;
-- Loop through array appending each array index to sorted_string
for i in indices of letters loop
for r in 1..letters(i) loop
sorted_string := sorted_string || i;
end loop;
end loop;
if p_reverse = 'Y' then
select reverse(sorted_string) into sorted_string from dual;
end if;
return sorted_string;
end order_string;
I've used the 21c indices of loop iterator, but you can write a conventional loop in earlier versions. You might also use two alternative loops for ascending and descending order in place of my hack.
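The idea behind the procedural version (count each character, then walk the distinct characters in order) is a counting sort. Here is a rough sketch of the same idea in Python, given only to illustrate the logic. It assumes plain code-point ordering, which may differ from the database's sort settings, and the reverse flag is a stand-in for p_reverse.

```python
from collections import Counter


def order_string(s: str, reverse: bool = False) -> str:
    """Sort the characters of s by counting occurrences of each character."""
    counts = Counter(s)                      # character -> number of occurrences
    ordered = sorted(counts, reverse=reverse)  # distinct characters in order
    return "".join(ch * counts[ch] for ch in ordered)


print(order_string("dgtak"))               # ascending
print(order_string("dfag", reverse=True))  # descending
```

Walking the keys in descending order directly, as done here, is the "two alternative loops" variant mentioned above and avoids reversing the finished string.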
I have a table named "F_ParqueInfra", and I'd like to count every value in it that is equal to -1.
The table has 11 columns and 833 rows, so 9,163 values in total.
I'd like to know how many "-1" values there are in the whole table (across all columns), in the simplest way possible.
I'll also need to do this for a lot of other tables in my Data Warehouse.
Thanks a lot!
One option is to use dynamic SQL. For example:
SQL> select * from f_parqueinfra;
ID_USUARIO ID_EMPRESA ID_DEPARTAMENTO
---------- ---------- ---------------
250 32 12
-1 -1 -1
0 -1 1
5 2 -1
SQL> set serveroutput on;
SQL> declare
2 l_table_name varchar2(30) := 'F_PARQUEINFRA';
3 l_value number := -1; -- search value
4 l_str varchar2(200); -- to compose SELECT statement
5 l_cnt number := 0; -- number of values in one column
6 l_sum number := 0; -- total sum of values
7 begin
8 for cur_r in (select table_name, column_name
9 from user_tab_columns
10 where table_name = l_table_name
11 and data_type = 'NUMBER'
12 )
13 loop
14 l_str := 'select count(*) from ' ||cur_r.table_name ||
15 ' where ' || cur_r.column_name || ' = ' || l_value;
16 execute immediate l_str into l_cnt;
17 l_sum := l_sum + l_cnt;
18 end loop;
19 dbms_output.put_line('Number of ' || l_value ||' values = ' || l_sum);
20 end;
21 /
Number of -1 values = 5
PL/SQL procedure successfully completed.
SQL>
If you change l_value (line #3), you can search for some other value, e.g.
SQL> l3
3* l_value number := -1; -- search value
SQL> c/-1/250
3* l_value number := 250; -- search value
SQL> /
Number of 250 values = 1
PL/SQL procedure successfully completed.
SQL>
Or, you can change table name (line #2) and search some other table.
Basically, you'd probably want to turn that anonymous PL/SQL code into a function which would accept table name and search value and return number of appearances. That shouldn't be too difficult so I'll leave it to you, for practice.
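The shape of such a function (look up the column names, build one count query per column, add up the results) can be sketched outside Oracle as well. The example below uses Python's built-in sqlite3 module purely as a stand-in database, which is an assumption for illustration only: SQLite's PRAGMA table_info plays the role of user_tab_columns, and count_value is a hypothetical helper name.

```python
import sqlite3


def count_value(conn, table, value):
    """Count the cells equal to `value` across all columns of `table`."""
    if not table.isidentifier():
        # The table name is concatenated into SQL, so reject anything odd.
        raise ValueError("unexpected table name")
    # Row layout of PRAGMA table_info: (cid, name, type, notnull, dflt_value, pk)
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    total = 0
    for col in columns:
        sql = f'SELECT COUNT(*) FROM {table} WHERE "{col}" = ?'
        total += conn.execute(sql, (value,)).fetchone()[0]
    return total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE f_parqueinfra "
    "(id_usuario INTEGER, id_empresa INTEGER, id_departamento INTEGER)"
)
conn.executemany(
    "INSERT INTO f_parqueinfra VALUES (?, ?, ?)",
    [(250, 32, 12), (-1, -1, -1), (0, -1, 1), (5, 2, -1)],
)
print(count_value(conn, "f_parqueinfra", -1))
```

The search value is passed as a bind parameter rather than concatenated into the statement; in PL/SQL the corresponding form is execute immediate ... using.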
[EDIT: converting it into a function]
Although far from being perfect, something like this will let you search for some numeric and string values in tables in current schema. Dates are more complex, depending on formats etc. but - for simple cases - this code might be OK for you. Have a look:
SQL> create or replace function f_cnt (par_table_name in varchar2,
par_data_type in varchar2,
par_value in varchar2)
return sys.odcivarchar2list
is
l_data_type varchar2(20) := upper(par_data_type);
l_retval sys.odcivarchar2list := sys.odcivarchar2list();
l_str varchar2(200); -- to compose SELECT statement
l_out varchar2(200); -- return value
l_cnt number := 0; -- number of values in one column
l_sum number := 0; -- total sum of values
begin
-- loop through all tables in current schema
for cur_t in (select table_name
from user_tables
where table_name = upper(par_table_name)
or par_table_name is null
)
loop
-- reset the counter
l_sum := 0;
-- loop through all columns in that table
for cur_c in (select column_name
from user_tab_columns
where table_name = cur_t.table_name
and data_type = l_data_type
)
loop
-- pay attention to search value's data type
if l_data_type = 'VARCHAR2' then
l_str := 'select count(*) from ' ||cur_t.table_name ||
' where ' || cur_c.column_name || ' = ' ||
chr(39) || par_value ||chr(39);
elsif l_data_type = 'NUMBER' then
l_str := 'select count(*) from ' ||cur_t.table_name ||
' where ' || cur_c.column_name || ' = ' || par_value;
end if;

execute immediate l_str into l_cnt;
l_sum := l_sum + l_cnt;
end loop;

if l_sum > 0 then
l_out := cur_t.table_name ||' has ' || l_sum ||' search values';
l_retval.extend;
l_retval(l_retval.count) := l_out;
end if;
end loop;
return l_retval;
end;
/
Function created.
Testing:
SQL> select * From table(f_cnt(null, 'number', -1));
COLUMN_VALUE
-----------------------------------------------------------------
F_PARQUEINFRA has 5 search values
SQL> select * From table(f_cnt(null, 'varchar2', 'KING'));
COLUMN_VALUE
-----------------------------------------------------------------
EMP has 1 search values
SQL>
This might be a good place to use the unpivot syntax. This still requires you to type all the column names once - but not more.
Here is an example for 4 columns:
select count(*) cnt
from mytable
unpivot(myval for mycol in (col1, col2, col3, col4))
where myval = -1
As a bonus, you can easily modify the query to get the number of -1 per column:
select mycol, count(*) cnt
from mytable
unpivot(myval for mycol in (col1, col2, col3, col4))
where myval = -1
group by mycol
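Conceptually, unpivot turns each row into one (column name, value) pair per listed column, after which an ordinary filter and group by do the counting. Here is a rough Python sketch of that reshaping, with made-up column names col1 to col3 and made-up rows, only to show what the two queries compute.

```python
from collections import Counter

# Made-up sample rows; None stands in for NULL.
rows = [
    {"col1": 250, "col2": 32, "col3": 12},
    {"col1": -1, "col2": -1, "col3": -1},
    {"col1": 0, "col2": -1, "col3": None},
    {"col1": 5, "col2": 2, "col3": -1},
]


def unpivot(rows, columns):
    """Yield one (column name, value) pair per non-NULL cell."""
    for row in rows:
        for col in columns:
            if row[col] is not None:  # unpivot excludes NULLs by default
                yield col, row[col]


pairs = unpivot(rows, ["col1", "col2", "col3"])
per_column = Counter(col for col, val in pairs if val == -1)  # group by mycol
total = sum(per_column.values())                              # count(*)
print(per_column, total)
```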
This should give you what you need.
Notes:
performs true numeric comparison (for example would match -1.00 also) using an inline function to prevent error should the value in a compared cell be non-numeric. (if all your compared values are guaranteed to be numeric the inline function can be simplified dramatically)
searches only varchar2 and number column types (this can be changed if desired).
The code follows:
set serveroutput on size 10000
declare
vMyTableName varchar2(200) := 'F_ParqueInfra';
vMyValue number := -1;
vSQL varchar2(4000);
vTotal pls_integer;
vGrandTotal number(18) := 0;
cursor c1 is
select *
from user_tab_columns
where table_name = vMyTableName;
vInlineFn varchar2(4000) := 'with
function matchesMyValue(value varchar2) return pls_integer
is
begin
return case
when to_number(value) = '||vMyValue||' then
1
else
0
end;
exception
when value_error then
return 0;
end;
';
begin
for x in c1 loop
if x.data_type in ('VARCHAR2','NUMBER') then -- only looking in these column data types for -1 but you can flex this to suit your column type definitions
vSQL := 'select sum(matchesMyValue('||x.column_name||')) from '||vMyTableName;
execute immediate vInlineFn||vSQL into vTotal;
vGrandTotal := vGrandTotal + vTotal;
end if;
end loop;
dbms_output.put_line('Total cells containing -1 = '||vGrandTotal);
end;
/
I want to print in Oracle.
Input string : 'Tprintthisstring'
Output string: 'T,pri,ntt,his,str,ing'
Use a regular expression to prepend a comma before every block of 3 lower-case letters.
Query:
SELECT REGEXP_REPLACE( 'Tprintthisstring', '([a-z]{3})', ',\1' )
FROM DUAL;
Output:
| REGEXP_REPLACE('TPRINTTHISSTRING','([A-Z]{3})',',\1') |
| :---------------------------------------------------- |
| T,pri,ntt,his,str,ing |
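The same pattern and replacement can be tried outside the database. This is a small Python sketch (the function name group_by_three is made up), which also shows a limitation of the approach: a string that starts with lower-case letters gets a leading comma.

```python
import re


def group_by_three(s: str) -> str:
    """Prepend a comma to every block of three lower-case letters."""
    return re.sub(r"([a-z]{3})", r",\1", s)


print(group_by_three("Tprintthisstring"))
print(group_by_three("abcdef"))  # note the leading comma
```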
Well, this returns the result you want, but I have no idea whether it'll work always as you didn't explain rules that lead from source to target.
SQL> with test (col) as
(select 'Tprintthisstring' from dual
),
temp as
-- c1 is the first letter
-- then split the rest into groups of 3 letters (rows)
(select substr(substr(col, 2), 3 * (level - 1) + 1, 3) c2,
level lvl,
substr(col, 1, 1) c1
from test
connect by level <= length(substr(col, 2))
)
-- aggregate the c2 string back, separated by comma
select c1 ||','||
listagg(c2, ',') within group (order by lvl) result
from temp
where c2 is not null
group by c1;
RESULT
-------------------------------------------------------------------------------
T,pri,ntt,his,str,ing
SQL>
I'm not sure why you tagged it as PL/SQL and what kind of PL/SQL should it be; an anonymous block? Stored procedure? Whatever it is, the above query can easily be converted to PL/SQL.
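The splitting rule this query assumes (keep the first character, then cut the remainder into groups of three, with a shorter last group allowed) can be written compactly outside SQL. This is a Python sketch with a made-up function name, assuming the input has at least two characters.

```python
def split_after_first(s: str, size: int = 3) -> str:
    """First character, then the rest in comma-separated chunks of `size`."""
    head, rest = s[0], s[1:]
    chunks = [rest[i:i + size] for i in range(0, len(rest), size)]
    return head + "," + ",".join(chunks)


print(split_after_first("Tprintthisstring"))
```

Unlike the regular expression, this version does not care whether the characters are lower-case letters; it splits purely by position.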
set serveroutput ON;
DECLARE
l VARCHAR2 (256);
l1 VARCHAR2 (256);
len NUMBER;
str1 VARCHAR (20);
str2 VARCHAR (20);
a NUMBER (10);
counter NUMBER (10);
i NUMBER (10);
p_string VARCHAR2(1000) := 'aaasasdasd,rrt';
decml NUMBER (10) := 3;
BEGIN
a := 1;
i := 1;
l := Substr (p_string, Instr (p_string, ',') + 1);
l1 := Substr (p_string, 0, Instr (p_string, ',') - 1);
len := Length (l1);
IF len <= decml THEN
str1 := l1
||','
||l;
ELSE
counter := Floor (len / decml);
FOR a IN REVERSE i .. counter LOOP
str1 := str1
|| '.'
|| Substr (l1, -decml * a, decml);
END LOOP;
IF ( counter * decml = len ) THEN
str1 := Substr (str1, 2, Length (str1))
|| ','
|| l;
ELSE
str2 := Substr (l1, 1, ( len - ( counter * decml ) ));
str1 := str2
|| str1
|| ','
|| l;
END IF;
END IF;
dbms_output.Put_line(str1);
END;
I have an assignment asking me to rewrite this PL/SQL code I wrote for a previous assignment:
DECLARE
-- Variables used to count a, b, c, d, and f grades:
na integer := 0;
nb integer := 0;
nc integer := 0;
nd integer := 0;
nf integer := 0;
BEGIN
select count(*) into na
from gradeReport1
where grade = 'A';
select count(*) into nb
from gradeReport1
where grade = 'B';
select count(*) into nc
from gradeReport1
where grade = 'C';
select count(*) into nd
from gradeReport1
where grade = 'D';
select count(*) into nf
from gradeReport1
where grade = 'F';
if na > 0 then
DBMS_OUTPUT.PUT_LINE('There are total ' || na || ' A''s');
else
DBMS_OUTPUT.PUT_LINE('There are no As');
end if;
if nb > 0 then
DBMS_OUTPUT.PUT_LINE('There are total ' || nb || ' B''s');
else
DBMS_OUTPUT.PUT_LINE('There are no Bs');
end if;
if nc > 0 then
DBMS_OUTPUT.PUT_LINE('There are total ' || nc || ' C''s');
else
DBMS_OUTPUT.PUT_LINE('There are no Cs');
end if;
if nd > 0 then
DBMS_OUTPUT.PUT_LINE('There are total ' || nd || ' D''s');
else
DBMS_OUTPUT.PUT_LINE('There are no Ds');
end if;
if nf > 0 then
DBMS_OUTPUT.PUT_LINE('There are total ' || nf || ' F''s');
else
DBMS_OUTPUT.PUT_LINE('There are no Fs');
end if;
END;
All it does is search a table I made called gradeReport that stores studentID's and associates them with a grade. The PL/SQL counts all instances of a grade A through F. The question wants me to rewrite this solution using looping and VARRAYS. Could anyone give me a hint to help get the ball rolling for me? I've only been using PL/SQL for a few weeks and don't have much more than a basic understanding of the syntax so I'm completely lost and have no idea where to start.
Not looking for any answers here, just some ideas.
Thank You
How about starting with the doc. http://docs.oracle.com/database/122/LNPLS/plsql-control-statements.htm#LNPLS004
DECLARE
-- Need to ensure the array size will hold all the grades
TYPE grade_tab IS VARRAY(200) OF gradeReport1.grade%TYPE;
-- variable used to store the grades:
t_grades grade_tab;
-- Variables used to count a, b, c, d, and f grades:
-- (initialised to 0, otherwise na + 1 would stay NULL)
na INTEGER := 0;
nb INTEGER := 0;
nc INTEGER := 0;
nd INTEGER := 0;
nf INTEGER := 0;
BEGIN
-- Store the grades in an array:
SELECT grade
BULK COLLECT INTO t_grades
FROM gradeReport1
WHERE grade IN ( 'A', 'B', 'C', 'D', 'F' );
-- Loop through the grades and count how many of each:
FOR i IN 1 .. t_grades.COUNT LOOP
IF t_grades(i) = 'A' THEN na := na + 1;
ELSIF t_grades(i) = 'B' THEN nb := nb + 1;
ELSIF t_grades(i) = 'C' THEN nc := nc + 1;
ELSIF t_grades(i) = 'D' THEN nd := nd + 1;
ELSIF t_grades(i) = 'F' THEN nf := nf + 1;
END IF;
END LOOP;
-- Output grade counts
END;
/
However, a much simpler solution would be to do the counting in a single SQL query (although this doesn't meet the assessment's requirements of using a VARRAY):
DECLARE
-- Variables used to count a, b, c, d, and f grades:
na INTEGER;
nb INTEGER;
nc INTEGER;
nd INTEGER;
nf INTEGER;
BEGIN
SELECT COUNT( CASE grade WHEN 'A' THEN 1 END ),
COUNT( CASE grade WHEN 'B' THEN 1 END ),
COUNT( CASE grade WHEN 'C' THEN 1 END ),
COUNT( CASE grade WHEN 'D' THEN 1 END ),
COUNT( CASE grade WHEN 'F' THEN 1 END )
INTO na,
nb,
nc,
nd,
nf
FROM gradeReport1;
-- Output grade counts...
END;
/
Edit: as the requirement is specifically for varrays, see replies by AmmoQ and MTO. As they both point out, though, you'd be unlikely to need arrays for this type of task in practice, and even if you did, you would use a nested table or an associative array and not a varray.
You'll want a Cursor FOR loop, along the lines of
for r in (
select grade from gradereport1
)
loop
...
end loop;
In real code you'd probably make that a group by query and have SQL do the counting for you.
Then just conditionally increment the counters in the loop depending on the value of r.grade.
You can rationalise all of the if statements for reporting the totals by writing a procedure that takes a grade and a total, as the logic is the same for all of them.
procedure showgrade
( p_grade gradereport1.grade%type
, p_count integer )
is
begin
...
end showgrade;
I'll leave the details as an exercise.
Just for fun, here is another approach, using arrays and looping (but not a varray - they really are a bit useless):
declare
type gradereport_tt is table of pls_integer index by gradereport.grade%type;
gradecounts gradereport_tt;
g gradereport.grade%type;
begin
-- Initialise counts:
gradecounts('A') := 0;
gradecounts('B') := 0;
gradecounts('C') := 0;
gradecounts('D') := 0;
gradecounts('E') := 0;
gradecounts('F') := 0;
-- Count grades:
for r in (
select grade from gradereport
)
loop
gradecounts(r.grade) := gradecounts(r.grade) +1;
end loop;
-- Report counts:
g := gradecounts.first;
while g is not null loop
dbms_output.put_line(g || ': ' || gradecounts(g));
g := gradecounts.next(g);
end loop;
end;
btw there is no need to put brackets after if as in some other languages, unless the condition contains a mixture of and and or conditions that need separating.
There is also no need to write anything in uppercase. It's quite common and Steven Feuerstein does it all the time, but they had this debate in the HTML/CSS world and settled on lowercase for readability. And if you are going to have an uppercase rule, at least use it consistently. Your code example has end if; but END; not to mention Select (which I've fixed). Some people seem to be able to read code like this without it driving them nuts, but I'm afraid I'm not one of them.
set SERVEROUTPUT ON
declare
type number_array is VARRAY(5) OF integer;
total integer :=0;
i number :=1;
begin
numbers :=number_array(14,45,67,89,21);
arr_size := numbers.count;
FOR i in 1..arr_size loop
total :=total+numbers(i);
end loop;
dbms_output.put_line('total-' || total);
end;
Here I want to count the number_array elements, but I can't get the correct answer. What is the problem with this?
A solution using VARRAYs could look like that:
DECLARE
type chararray IS VARRAY(6) OF CHAR(1);
type numarray IS VARRAY(6) OF INTEGER;
grades chararray;
cnt numarray;
BEGIN
select grade, count(*)
bulk collect into grades, cnt
from gradeReport1
group by grade
order by grade;
for i in 1..grades.count loop
DBMS_OUTPUT.PUT_LINE('There are total ' || cnt(i) || ' ' ||grades(i)||'s');
end loop;
END;
/
But honestly, it's pointless to use VARRAYs in that case. Just use a cursor loop:
BEGIN
for c in ( select grade, count(*) cnt
from gradeReport1
group by grade
order by grade ) loop
DBMS_OUTPUT.PUT_LINE('There are total ' || c.cnt || ' ' ||c.grade||'s');
end loop;
END;
/
Finding the missing marks (those with a count of 0) is a bit more difficult, though.
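One way to report the missing grades is to start from the fixed list of grades rather than from the rows, so that a grade with no rows still shows up with a count of zero. Here is a small Python sketch of that idea; the sample grades are invented and the messages copy the ones in the question.

```python
from collections import Counter

GRADES = ["A", "B", "C", "D", "F"]  # the fixed list drives the report


def grade_report(grades):
    """Return one message per grade in GRADES, including grades never seen."""
    counts = Counter(grades)  # missing keys count as 0
    lines = []
    for g in GRADES:
        n = counts[g]
        if n > 0:
            lines.append(f"There are total {n} {g}'s")
        else:
            lines.append(f"There are no {g}s")
    return lines


for line in grade_report(["A", "B", "A", "F"]):
    print(line)
```

In SQL the same effect usually comes from outer-joining the fixed list of grades to the grouped counts.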
I currently have a script that calculates the Tanimoto coefficient on the fingerprints of a chemical library. However, during testing I found that my implementation cannot feasibly be scaled up because of the way I compare the bit strings (it just takes far too long). See below; this is the loop I need to improve. I've simplified it so that it only looks at two structures. The real script works through permutations of the whole dataset of structures, but showing that would overcomplicate the issue here.
LOOP
-- Find the NA bit
SELECT SUBSTR(qsar_kb.fingerprint.fingerprint, var_fragment_id ,1) INTO var_na FROM qsar_kb.fingerprint where project_id = 1 AND structure_id = 1;
-- FIND the NB bit
SELECT SUBSTR(qsar_kb.fingerprint.fingerprint, var_fragment_id ,1) INTO var_nb FROM qsar_kb.fingerprint where project_id = 1 AND structure_id = 2;
-- Test for both bits the same
IF var_na > 0 AND var_nb > 0 then
var_tally := var_tally + 1;
END IF;
-- Test for bit in A on and B off
IF var_na > var_nb then
var_tna := var_tna + 1;
END IF;
-- Test for bit in B on and A off.
IF var_nb > var_na then
var_tnb := var_tnb + 1;
END IF;
var_fragment_id := var_fragment_id + 1;
EXIT WHEN var_fragment_id > var_maxfragment_id;
END LOOP;
For a simple example
Structure A = '101010'
Structure B = '011001'
In my real data set the length of the binary is 500 bits and up.
I need to know:
1)The number of bits ON common to Both
2)The number of bits ON in A but off in B
3)The number of bits ON in B but off in A
So in this case
1) = 1
2) = 2
3) = 2
Ideally I want to change how I'm doing this. I don't want to be stepping through each bit in each string; it's just too time-expensive when I scale the whole system up to thousands of structures, each with a fingerprint bit string 500-1000 bits long.
My logic to fix this would be to:
Take the total number of bits ON in both
A) = 3
B) = 3
Then perform an AND operation and find how many bits are on in both
= 1
Then just subtract this from the totals to find the number of bits on in one but not the other.
So how can I perform an AND like operation on two strings of 0's and 1's to find the number of common 1's?
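The arithmetic described here (count the ON bits in each string, AND the two, subtract) can be checked with a short Python sketch before looking for an Oracle equivalent. Python integers have arbitrary precision, so 500-bit strings are not a problem in this sketch; the function name is made up.

```python
def bit_overlap(a: str, b: str):
    """Return (common ON bits, ON only in a, ON only in b) for two bit strings."""
    x, y = int(a, 2), int(b, 2)
    common = bin(x & y).count("1")        # bits ON in both
    only_a = bin(x).count("1") - common   # ON in a, off in b
    only_b = bin(y).count("1") - common   # ON in b, off in a
    return common, only_a, only_b


print(bit_overlap("101010", "011001"))
```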
Check out the BITAND function.
The BITAND function treats its inputs and its output as vectors of bits; the output is the bitwise AND of the inputs.
However, according to the documentation, this only works for values up to roughly 2^128, so a 500-bit fingerprint would not fit in a single call.
You should move the SELECT out of the loop. I'm pretty sure you're spending 99% of the time selecting 1 bit 500 times where you could select 500 bits in one go and then loop through the string:
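DECLARE
l_structure_a LONG;
l_structure_b LONG;
var_na VARCHAR2(1);
var_nb VARCHAR2(1);
-- counters and loop bounds used below (declarations assumed here,
-- the original script declares them elsewhere)
var_fragment_id PLS_INTEGER := 1;
var_maxfragment_id PLS_INTEGER;
var_tally PLS_INTEGER := 0;
var_tna PLS_INTEGER := 0;
var_tnb PLS_INTEGER := 0;
BEGIN
-- fetch both fingerprints with a single query
SELECT MAX(decode(structure_id, 1, fingerprint)),
MAX(decode(structure_id, 2, fingerprint))
INTO l_structure_a, l_structure_b
FROM qsar_kb.fingerprint
WHERE project_id = 1
AND structure_id IN (1,2);
-- assumes both fingerprints have the same length
var_maxfragment_id := length(l_structure_a);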
LOOP
var_na := substr(l_structure_a, var_fragment_id, 1);
var_nb := substr(l_structure_b, var_fragment_id, 1);
-- Test for both bits the same
IF var_na > 0 AND var_nb > 0 THEN
var_tally := var_tally + 1;
END IF;
-- Test for bit in A on and B off
IF var_na > var_nb THEN
var_tna := var_tna + 1;
END IF;
-- Test for bit in B on and A off.
IF var_nb > var_na THEN
var_tnb := var_tnb + 1;
END IF;
var_fragment_id := var_fragment_id + 1;
EXIT WHEN var_fragment_id > var_maxfragment_id;
END LOOP;
END;
Edit:
You could also do it in a single SQL statement:
SQL> WITH DATA AS (
SELECT '101010' fingerprint,1 project_id, 1 structure_id FROM dual
UNION ALL SELECT '011001', 1, 2 FROM dual),
transpose AS (SELECT ROWNUM fragment_id FROM dual CONNECT BY LEVEL <= 1000)
SELECT COUNT(CASE WHEN var_na = 1 AND var_nb = 1 THEN 1 END) nb_1,
COUNT(CASE WHEN var_na > var_nb THEN 1 END) nb_2,
COUNT(CASE WHEN var_na < var_nb THEN 1 END) nb_3
FROM (SELECT to_number(substr(struct_a, fragment_id, 1)) var_na,
to_number(substr(struct_b, fragment_id, 1)) var_nb
FROM (SELECT MAX(decode(structure_id, 1, fingerprint)) struct_a,
MAX(decode(structure_id, 2, fingerprint)) struct_b
FROM DATA
WHERE project_id = 1
AND structure_id IN (1, 2))
CROSS JOIN transpose);
NB_1 NB_2 NB_3
---------- ---------- ----------
1 2 2
I'll sort of expand on the answer from Lukas with a little bit more information.
A little bit of an internet search revealed code from Tom Kyte (via Jonathan Lewis) to convert between bases. There is a function to_dec which will take a string and convert it to a number. I have reproduced the code below:
Convert base number to decimal:
create or replace function to_dec(
p_str in varchar2,
p_from_base in number default 16) return number
is
l_num number default 0;
l_hex varchar2(16) default '0123456789ABCDEF';
begin
for i in 1 .. length(p_str) loop
l_num := l_num * p_from_base + instr(l_hex,upper(substr(p_str,i,1)))-1;
end loop;
return l_num;
end to_dec;
Convert decimal to base number:
create or replace function to_base( p_dec in number, p_base in number )
return varchar2
is
l_str varchar2(255) default NULL;
l_num number default p_dec;
l_hex varchar2(16) default '0123456789ABCDEF';
begin
if ( trunc(p_dec) <> p_dec OR p_dec < 0 ) then
raise PROGRAM_ERROR;
end if;
loop
l_str := substr( l_hex, mod(l_num,p_base)+1, 1 ) || l_str;
l_num := trunc( l_num/p_base );
exit when ( l_num = 0 );
end loop;
return l_str;
end to_base;
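As a quick cross-check of the two functions above, the same digit-by-digit arithmetic can be written in Python. This sketch mirrors the PL/SQL, including the '0123456789ABCDEF' digit string, with one difference: it raises ValueError on a bad digit or a negative number, whereas the PL/SQL raises PROGRAM_ERROR in to_base and does not validate digits in to_dec.

```python
DIGITS = "0123456789ABCDEF"


def to_dec(s: str, from_base: int = 16) -> int:
    """Convert a digit string in the given base to an integer."""
    n = 0
    for ch in s:
        n = n * from_base + DIGITS.index(ch.upper())  # ValueError on a bad digit
    return n


def to_base(n: int, base: int) -> str:
    """Convert a non-negative integer to a digit string in the given base."""
    if n < 0:
        raise ValueError("n must be a non-negative integer")
    out = ""
    while True:
        out = DIGITS[n % base] + out
        n //= base
        if n == 0:
            break
    return out


print(to_dec("101010", 2), to_base(42, 2))
```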
This function can be called to convert the string bitmap into a number which can then be used with bitand. An example of this would be:
select to_dec('101010', 2) from dual
Oracle only really provides BITAND (and BIN_TO_NUM, which isn't really relevant here) as a way of doing logical operations, but the operations required here are (A AND B), (A AND NOT B) and (NOT A AND B). So we need a way of converting A to NOT A. A simple way of doing this is to use translate.
So.... the final outcome is:
select
length(translate(to_base(bitand(data_A, data_B),2),'10','1')) as nb_1,
length(translate(to_base(bitand(data_A, data_NOT_B),2),'10','1')) as nb_2,
length(translate(to_base(bitand(data_NOT_A, data_B),2),'10','1')) as nb_3
from (
select
to_dec(data_A,2) as data_A,
to_dec(data_b,2) as data_B,
to_dec(translate(data_A, '01', '10'),2) as data_NOT_A,
to_dec(translate(data_B, '01', '10'),2) as data_NOT_B
from (
select '101010' as data_A, '011001' as data_B from dual
)
)
This is somewhat more complicated than I was hoping when I started writing this answer but it does seem to work.
Can be done pretty simply with something like this:
SELECT utl_raw.BIT_AND( t.A, t.B ) SET_IN_A_AND_B,
length(replace(utl_raw.BIT_AND( t.A, t.B ), '0', '')) SET_IN_A_AND_B_COUNT,
utl_raw.BIT_AND( t.A, utl_raw.bit_complement(t.B) ) ONLY_SET_IN_A,
length(replace(utl_raw.BIT_AND( t.A, utl_raw.bit_complement(t.B) ),'0','')) ONLY_SET_IN_A_COUNT,
utl_raw.BIT_AND( t.B, utl_raw.bit_complement(t.A) ) ONLY_SET_IN_B,
length(replace(utl_raw.BIT_AND( t.B, utl_raw.bit_complement(t.A) ),'0','')) ONLY_SET_IN_B_COUNT
FROM (SELECT '1111100000111110101010' A, '1101011010101010100101' B FROM dual) t
Make sure your bit string has an even length (just pad it with a zero if it has an odd length).