I'm trying to write an analytic function in PL/SQL that, when applied to a column of a table, returns for each row the median of that column excluding the row itself.
An example to clarify: suppose I have a table TABLE with a single column X that takes the following values:
1
2
3
4
5
I want to define an analytic function LOOM() such that:
SELECT LOOM(X)
FROM TABLE
delivers the following:
3.5
3.5
3
2.5
2.5
i.e., for each row, the median of X, excluding the given row. I've been struggling to build the desired LOOM() function.
I'm not sure if there is a "clever" way to do this. You can do the calculation with a correlated subquery.
Assuming the x values are unique, as in your example:
with t as (
select 1 as x from dual union all
select 2 as x from dual union all
select 3 as x from dual union all
select 4 as x from dual union all
select 5 as x from dual
)
select t.*,
(select median(x)
from t t2
where t2.x <> t.x
) as loom
from t;
EDIT:
A more efficient method uses analytic functions but requires more direct calculation of the median. For instance:
with t as (
select 1 as x from dual union all
select 2 as x from dual union all
select 3 as x from dual union all
select 4 as x from dual union all
select 5 as x from dual
)
select t.*,
(case when mod(cnt, 2) = 0
then (case when x <= candidate_1 then candidate_2 else candidate_1 end)
else (case when x <= candidate_1 then (candidate_2 + candidate_3)/2
when x = candidate_2 then (candidate_1 + candidate_3)/2
else (candidate_1 + candidate_2) / 2
end)
end) as loom
from (select t.*,
max(case when seqnum = floor(cnt / 2) then x end) over () as candidate_1,
max(case when seqnum = floor(cnt / 2) + 1 then x end) over () as candidate_2,
max(case when seqnum = floor(cnt / 2) + 2 then x end) over () as candidate_3
from (select t.*,
row_number() over (order by x) as seqnum,
count(*) over () as cnt
from t
) t
) t;
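As a cross-check on the candidate logic above, here is a minimal Python sketch (Python rather than SQL so it runs without a database). The function names are made up for illustration, and it assumes unique values and at least two rows. It compares the three-candidate calculation against a brute-force median that is recomputed for every row.

```python
from statistics import median


def loom_brute_force(values):
    """Median of all other values, recomputed from scratch for each row."""
    return [median(values[:i] + values[i + 1:]) for i in range(len(values))]


def loom_candidates(values):
    """Same result from the middle candidates (assumes unique values, len >= 2)."""
    ordered = sorted(values)
    cnt = len(ordered)
    half = cnt // 2
    # seqnum is 1-based in the SQL, so seqnum = floor(cnt / 2) is index half - 1.
    c1, c2 = ordered[half - 1], ordered[half]
    c3 = ordered[half + 1] if half + 1 < cnt else None  # only used when cnt is odd
    result = []
    for x in values:
        if cnt % 2 == 0:
            # Removing one row leaves an odd count: a single middle value.
            result.append(c2 if x <= c1 else c1)
        elif x <= c1:
            result.append((c2 + c3) / 2)
        elif x == c2:
            result.append((c1 + c3) / 2)
        else:
            result.append((c1 + c2) / 2)
    return result


print(loom_candidates([1, 2, 3, 4, 5]))  # [3.5, 3.5, 3.0, 2.5, 2.5]
```

The brute-force version does the same work as the correlated subquery, while the candidate version only needs one sort, which is the reason the analytic form is cheaper.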
I am executing this query in SQL Server and it works fine, but when I try to execute it in Oracle, it does not give the same results.
As the attached photo shows, one customer has the codes 1, 2, 4 and 8. He should get 0.70 for having codes 1, 2, 4 and 0.75 for having code 8, so after multiplication the query should return 0.52. I tried it in Oracle, replacing ISNULL with NVL, but it returned 1 instead of 0.52. Please help me convert this into a query that Oracle supports and that returns the same results.
Here is my query:
SELECT [id] ,[name],r = isnull(nullif(
max(CASE WHEN [code] IN (1,2,4) then 0.70 else 0 end)
,0),1)
* isnull(nullif(
min(CASE WHEN [code] IN (1,2) then 0 else 1 end)
* max(CASE WHEN [code] IN (4) then 0.20 else 0 end)
,0),1)
* isnull(nullif(
max(CASE WHEN [code] IN (8) then 0.75 else 0 end)
,0),1)
FROM (values (1, 'ali',4)
,(1, 'ali',1)
,(1, 'ali',8)
,(1, 'ali',2)
,(2, 'sunny',1)
,(4, 'arslan',4)) as t(id, name,code)
GROUP BY id, name;
Since you are now multiplying scores, we first need to decide what the score is if none of the codes match. I suppose it should be 0.
Next, we should break all possible codes into independent groups, that is, groups whose results do not depend on the members of other groups. Here they are (1,2,4) and (8). Then define a rule for every group.
So:
SELECT [id] ,[name],r =
-- At least one of values needed to get score > 0
MAX(CASE WHEN code IN (1,2,4, 8) THEN 1.0 ELSE 0.0 END) *
-- Now rules for every independent set of codes. Rule should return score if matched or 1.0 if not matched
-- (1,2,4)
coalesce(MAX(CASE WHEN [code] IN (1,2,4) THEN 0.70 END), 1.0 ) *
-- (8)
coalesce(MAX(CASE WHEN [code] IN (8) THEN 0.75 END), 1.0)
-- more ?
FROM (values (1, 'ali',4)
,(1, 'ali',1)
,(1, 'ali',8)
,(1, 'ali',2)
,(2, 'sunny',1)
,(4, 'arslan',4)) as t(id, name,code)
GROUP BY id, name;
There are some SQL Server things in the query that are not standard SQL:
[] around column names - remove them; you don't need them here (otherwise you would use standard SQL quotes "")
r = expression - for an alias name. Change this to standard SQL expression AS r
ISNULL(expression, value) - Change this to standard SQL COALESCE(expression, value) or Oracle's NVL(expression, value)
NULLIF(expression, value) - this you can keep; Oracle supports it, too
values (), (), ... - replace with a SELECT FROM DUAL UNION ALL subquery
You get:
select
id,
name,
coalesce(nullif( max(case when code in (1,2,4) then 0.70 else 0 end), 0), 1) *
coalesce(nullif( min(case when code in (1,2) then 0 else 1 end) *
max(case when code in (4) then 0.20 else 0 end) , 0), 1) *
coalesce(nullif( max(case when code in (8) then 0.75 else 0 end), 0), 1) as r
from
(
select 1 as id, 'ali' as name, 4 as code from dual
union all
select 1 as id, 'ali' as name, 1 as code from dual
union all
select 1 as id, 'ali' as name, 8 as code from dual
union all
select 1 as id, 'ali' as name, 2 as code from dual
union all
select 2 as id, 'sunny' as name, 1 as code from dual
union all
select 4 as id, 'arslan' as name, 4 as code from dual
)
group by id, name;
The calculation, however, is unnecessarily complicated:
coalesce(nullif( max(case when code in (1,2,4) then 0.70 else 0 end), 0), 1)
means: if there is at least one match the result is 0.70, otherwise it is 0, which NULLIF turns into null, which COALESCE turns into 1. So it is the same as
min(case when code in (1,2,4) then 0.70 else 1 end)
So if I am not mistaken, the whole calculation becomes:
case when max(case when code in (1,2) then 1 end) = 1
then 0.7 else min(case when code = 4 then 0.14 else 1 end) end *
min(case when code = 8 then 0.75 else 1 end) as r
or
case when max(case when code in (1,2) then 1 end) = 1 then 0.7
when max(case when code = 4 then 1 end) = 1 then 0.14
else 1
end *
min(case when code = 8 then 0.75 else 1 end) as r
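To check a simplification like this without a database, the rules can be replayed in Python. The sketch below is only an illustration: the function names are invented, `Decimal` stands in for exact SQL numerics, and code 16 is an arbitrary "unmatched" code added for the test. It compares a literal translation of the three NULLIF/COALESCE factors with the second CASE form above over every non-empty combination of codes.

```python
from decimal import Decimal
from itertools import combinations


def coalesce_nullif(value, default=1):
    """COALESCE(NULLIF(value, 0), default): a zero becomes the default."""
    return default if value == 0 else value


def score_original(codes):
    """Literal translation of the three NULLIF/COALESCE factors."""
    f1 = coalesce_nullif(max(Decimal("0.70") if c in (1, 2, 4) else 0 for c in codes))
    f2 = coalesce_nullif(
        min(0 if c in (1, 2) else 1 for c in codes)
        * max(Decimal("0.20") if c == 4 else 0 for c in codes)
    )
    f3 = coalesce_nullif(max(Decimal("0.75") if c == 8 else 0 for c in codes))
    return f1 * f2 * f3


def score_simplified(codes):
    """CASE form: 0.7 for code 1 or 2, else 0.14 for code 4, else 1; times 0.75 for code 8."""
    if any(c in (1, 2) for c in codes):
        first = Decimal("0.7")
    elif any(c == 4 for c in codes):
        first = Decimal("0.14")
    else:
        first = 1
    return first * min(Decimal("0.75") if c == 8 else 1 for c in codes)


def forms_agree(all_codes=(1, 2, 4, 8, 16)):
    """True if both forms give the same score for every non-empty subset."""
    return all(
        score_original(subset) == score_simplified(subset)
        for size in range(1, len(all_codes) + 1)
        for subset in combinations(all_codes, size)
    )


print(forms_agree())
print(score_original([4, 1, 8, 2]))  # customer 1 from the question
```

A group holding only codes 4 and 8 is the interesting case: both forms should give 0.7 * 0.2 * 0.75 = 0.105 there.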
Well, there are many ways to write this.
The code below should give you the answer you expect:
CREATE TABLE #TestData (ID int, Name varchar(10), Code int)
INSERT INTO #TestData (ID, Name, Code)
VALUES
(1,'ali',4)
,(1,'ali',1)
,(1,'ali',8)
,(1,'ali',2)
,(2,'sunny',1)
,(4,'arslan',4)
SELECT DISTINCT
a.id
,a.Name
,COALESCE(b.HasCode1, b.HasCode2, b.HasCode4,1) * COALESCE(b.HasCode8,1) Result
FROM (SELECT ID, Name FROM #TestData GROUP BY ID, Name) a
LEFT JOIN
(
SELECT
ID
,Name
,SUM(CASE WHEN CODE = 1 THEN 0.7 END) HasCode1
,SUM(CASE WHEN CODE = 2 THEN 0.7 END) HasCode2
,SUM(CASE WHEN CODE = 4 THEN 0.7 END) HasCode4
,SUM(CASE WHEN CODE = 8 THEN 0.75 END) HasCode8
FROM #TestData
GROUP BY
ID
,Name
) b
ON a.ID = b.ID
AND a.Name = b.Name
DROP TABLE #TestData
If I understand what you're after (i.e., for each of the cases, the id/name combination needs to have all the codes specified), then this will probably do it. You may want to add some sort of trunc/floor/round function on the val column if you need the answer to 2 decimal places, though:
with t as (select 1 id, 'ali' name, 4 code from dual union all
select 1 id, 'ali' name, 1 code from dual union all
select 1 id, 'ali' name, 8 code from dual union all
select 1 id, 'ali' name, 2 code from dual union all
select 2 id, 'ali' name, 4 code from dual union all
select 2 id, 'ali' name, 8 code from dual union all
select 3 id, 'bob' name, 1 code from dual union all
select 3 id, 'bob' name, 2 code from dual union all
select 3 id, 'bob' name, 8 code from dual),
res as (select id,
name,
case when count(distinct case when code in (1, 2, 4) then code end) = 3 then 0.7
when count(distinct case when code in (1, 2) then code end) = 2 then 0.5
else 1
end case_1_2_and_poss_4,
case when count(distinct case when code = 8 then code end) = 1 then 0.75 else 1 end case_8
from t
group by id, name)
select id,
name,
case_1_2_and_poss_4 * case_8 val
from res;
        ID NAME        VAL
---------- ---- ----------
         1 ali       0.525
         2 ali        0.75
         3 bob       0.375
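The "all codes required" rule in that query can be restated as a small Python sketch, which may make the COUNT(DISTINCT ...) = 3 and = 2 tests easier to read. The function name and the `sample` dictionary are illustrative only; the rows and factors are the ones from the query above, and `Decimal` is used so the products are exact.

```python
from decimal import Decimal


def score(codes):
    """0.7 needs codes 1, 2 and 4; 0.5 needs codes 1 and 2; code 8 adds a 0.75 factor."""
    present = set(codes)  # plays the role of COUNT(DISTINCT ...)
    if {1, 2, 4} <= present:
        first = Decimal("0.7")
    elif {1, 2} <= present:
        first = Decimal("0.5")
    else:
        first = Decimal("1")
    second = Decimal("0.75") if 8 in present else Decimal("1")
    return first * second


# Same sample rows as the WITH clause above, keyed by (id, name).
sample = {
    (1, "ali"): [4, 1, 8, 2],
    (2, "ali"): [4, 8],
    (3, "bob"): [1, 2, 8],
}
for key, codes in sample.items():
    print(key, score(codes))
```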
I have written the query below, which divides the results of two SELECT queries to calculate a percentage, but I am getting the error "not a single-group group function":
select CASE WHEN COUNT(*) = 0 THEN 0 ELSE round((r.cnt / o.cnt)*100,3) END from
(Select count(*) as cnt from O2_CDR_HEADER WHERE STATUS NOT IN(0,1) and DATE_CREATED > (SYSDATE - 1)) r cross join
(Select count(*) as cnt from O2_CDR_HEADER WHERE DATE_CREATED > (SYSDATE - 1)) o;
You don't need to use joins. If I were you, I'd do:
select case when count(*) = 0 then 0
else round(100 * count(case when status not in (0, 1) then 1 end) / count(*), 3)
end non_0_or_1_status_percentage
from o2_cdr_header
where date_created > sysdate - 1;
Here's a simple demo:
with t as (select 1 status from dual union all
select 2 status from dual union all
select 3 status from dual union all
select 2 status from dual union all
select 4 status from dual union all
select 5 status from dual union all
select 6 status from dual union all
select 7 status from dual union all
select 1 status from dual union all
select 0 status from dual union all
select 1 status from dual)
select case when count(*) = 0 then 0
else round(100 * count(case when status not in (0, 1) then 1 end) / count(*), 3)
end col1
from t
where 1=0;
COL1
----------
0
And just in case you aren't sure that doing the filtering of the count in the case statement returns the same as when you filter in the where clause, here's a demo that proves it:
with t as (select 1 status from dual union all
select 2 status from dual union all
select 3 status from dual union all
select 2 status from dual union all
select 4 status from dual union all
select 5 status from dual union all
select 6 status from dual union all
select 7 status from dual union all
select 1 status from dual union all
select 0 status from dual union all
select 1 status from dual)
select 'using case statement' how_count_filtered,
count(case when status not in (0, 1) then 1 end) cnt
from t
union all
select 'using where clause' how_count_filtered,
count(*) cnt
from t
where status not in (0, 1);
HOW_COUNT_FILTERED          CNT
-------------------- ----------
using case statement          7
using where clause            7
You are referencing an aggregate function (COUNT(*)) and an individual column expression (r.cnt and o.cnt) in the same SELECT query. This is not valid SQL unless a GROUP BY clause is added for the relevant individual columns.
It would be easier to provide a valid alternative if you could clarify what you'd like this query to return (given a sample schema and set of data). As a guess, I'd say you can simply substitute COUNT(*) with o.cnt to avoid the division by 0 issue. If there's some other logic expected to be present here, you'd need to clarify what that is.
It looks like you want to get the percentage of rows whose status is not in (0, 1), or 0 if there are no results.
Maybe this is what you want for the first line?
SELECT CASE WHEN (R.CNT = 0 AND O.CNT = 0) THEN 0 ELSE ROUND((R.CNT *100.0 / O.CNT),3) END
You don't need a cross join. Select the counts and do a division later on.
select case when ocnt > 0 then round((rcnt / ocnt)*100,3)
else 0 end
from
(
select
-- conditional aggregation: a single row holding both counts
COUNT(CASE WHEN STATUS NOT IN(0,1) THEN 1 END) as rcnt,
COUNT(*) as ocnt
from O2_CDR_HEADER
where DATE_CREATED > (SYSDATE - 1)
) t;
Boneist's answer is fine, but I would write it as:
select coalesce(round(100 * avg(case when status not in (0, 1) then 1.0 else 0
end), 3), 0) as non_0_or_1_status_percentage
from o2_cdr_header
where date_created > sysdate - 1;
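For readers who want to convince themselves that the COUNT-based form and the AVG-based form agree, here is a minimal Python sketch of both. The function names are hypothetical, and the status list is the one from the demo earlier in this thread. It also covers the empty case, where AVG over no rows is NULL and COALESCE turns it into 0.

```python
def pct_by_count(statuses):
    """100 * COUNT(CASE WHEN status NOT IN (0, 1) THEN 1 END) / COUNT(*), or 0 for no rows."""
    if not statuses:
        return 0
    matching = sum(1 for s in statuses if s not in (0, 1))
    return round(100 * matching / len(statuses), 3)


def pct_by_avg(statuses):
    """COALESCE(ROUND(100 * AVG(indicator), 3), 0), with a 1.0/0.0 indicator per row."""
    if not statuses:
        return 0  # AVG over no rows is NULL; COALESCE maps it to 0
    indicators = [1.0 if s not in (0, 1) else 0.0 for s in statuses]
    return round(100 * sum(indicators) / len(indicators), 3)


demo = [1, 2, 3, 2, 4, 5, 6, 7, 1, 0, 1]  # 7 of 11 statuses are not 0 or 1
print(pct_by_count(demo), pct_by_avg(demo))  # 63.636 63.636
print(pct_by_count([]), pct_by_avg([]))      # 0 0
```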
Here is the answer that works perfectly for me:
select CASE WHEN (o.cnt = 0) THEN 0 ELSE round((r.cnt / o.cnt)*100,3) END from
(Select count(*) as cnt from O2_CDR_HEADER WHERE STATUS NOT IN(0,1) and DATE_CREATED > (SYSDATE - 1)) r cross join
(Select count(*) as cnt from O2_CDR_HEADER WHERE DATE_CREATED > (SYSDATE - 1)) o
I have a table, sort of like this:
Items
-----------
ID  Value1  Value2  Value3  Value4  Value5  Value6
1   345895  435234  342534  678767  5455    423555
2   3245    549238  230944  923948  234488  234997
3   490458  49349   234234  87810   903481  3940102
4   849545  435234  67678   98741   99084   978897
How would I write a query that finds all the items that have at least 3 values (just an example; it could be more than 3) in common with a specific item? For example, I have an item
345895 435234 67678 98741 5455 423555
and running this query would give me
1 345895 435234 342534 678767 5455 423555
4 849545 435234 67678 98741 99084 978897
Any help would be greatly appreciated. Thank you.
You can use CASE expressions in the WHERE clause to calculate the number of matches. Note that this compares the values position by position (Value1 with v1, Value2 with v2, and so on):
SELECT i.*
FROM Items AS i
CROSS JOIN ( VALUES ( 345895, 435234, 67678, 98741, 5455, 423555) ) AS Item(v1, v2, v3, v4, v5, v6)
WHERE (CASE WHEN i.Value1 = Item.v1 THEN 1 ELSE 0 END) +
(CASE WHEN i.Value2 = Item.v2 THEN 1 ELSE 0 END) +
(CASE WHEN i.Value3 = Item.v3 THEN 1 ELSE 0 END) +
(CASE WHEN i.Value4 = Item.v4 THEN 1 ELSE 0 END) +
(CASE WHEN i.Value5 = Item.v5 THEN 1 ELSE 0 END) +
(CASE WHEN i.Value6 = Item.v6 THEN 1 ELSE 0 END) >= 3
This is one way:
; with sub as(
select 345895 as mynum
union all select 435234
union all select 67678
union all select 98741
union all select 5455
union all select 423555
)
select i.*
from items i
join
(
select x.id
from(
select id, value1 as val from items union all
select id, value2 from items union all
select id, value3 from items union all
select id, value4 from items union all
select id, value5 from items union all
select id, value6 from items
) x join sub s on x.val = s.mynum
group by x.id
having count(*) >= 3
) x on x.id = i.id
Fiddle: http://sqlfiddle.com/#!6/1dff3/2/0
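The two answers count matches differently: the first compares column by column, the second unpivots the columns and matches a value wherever it appears. A minimal Python sketch of both counts, using the sample table from the question (the names `ITEMS`, `TARGET` and the function names are made up for illustration), shows that they agree on this data:

```python
ITEMS = {
    1: (345895, 435234, 342534, 678767, 5455, 423555),
    2: (3245, 549238, 230944, 923948, 234488, 234997),
    3: (490458, 49349, 234234, 87810, 903481, 3940102),
    4: (849545, 435234, 67678, 98741, 99084, 978897),
}
TARGET = (345895, 435234, 67678, 98741, 5455, 423555)


def positional_matches(row, target):
    """Like the CASE sum: Value1 vs v1, Value2 vs v2, and so on."""
    return sum(1 for a, b in zip(row, target) if a == b)


def any_position_matches(row, target):
    """Like the unpivot + join: a value counts wherever it appears in the row."""
    wanted = set(target)
    return sum(1 for v in row if v in wanted)


def similar_items(count_matches, minimum=3):
    """IDs of items with at least `minimum` matching values."""
    return [item_id for item_id, row in ITEMS.items()
            if count_matches(row, TARGET) >= minimum]


print(similar_items(positional_matches))    # [1, 4]
print(similar_items(any_position_matches))  # [1, 4]
```

The two counts would differ if a shared value sat in a different column of the two items, so which one to use depends on whether position matters for your data.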