How to compare values with lookup table - sql

I have a table "Marks" that holds marks for different subjects. If a mark falls within a particular range, I need to pick up the corresponding rank and store it in the Marks table itself, in column 'Rank_Sub_1'. Could you please help me look up the rank in the lookup table and update the column? Below is my table structure.
**Marks**
Subject1_Marks Subject2_Marks
------------------------------
71 22
10 40
**LookupTable**
Rank range1 range2
----------------------
9 10 20
8 21 30
7 31 40
6 41 50
5 51 60
4 61 70
3 71 80
2 81 90
1 91 100
Now I want to check marks of each subject with lookup table which contains the ranges and ranks for different marks obtained.
**Marks**
Subject1_Marks Subject2_Marks Rank_Sub_1 Rank_Sub_2
------------------------------------------------------
71 22
10 40

(Assuming the ranges do not overlap.)
Join two instances of the lookup table: the first against subject1_marks and the second against subject2_marks. I haven't used LEFT JOINs here because I am assuming every mark falls within exactly one range. If you are not sure about that, use left joins and handle the resulting NULLs in RANK_SUB_1 and RANK_SUB_2 as your requirements dictate.
WITH LOOKUPTABLE_TMP AS (SELECT * FROM LOOKUPTABLE)
SELECT M.*, L1.RANK AS RANK_SUB_1, L2.RANK AS RANK_SUB_2
FROM MARKS M , LOOKUPTABLE_TMP L1, LOOKUPTABLE_TMP L2
WHERE M.SUBJECT1_MARKS BETWEEN L1.RANGE1 AND L1.RANGE2
AND M.SUBJECT2_MARKS BETWEEN L2.RANGE1 AND L2.RANGE2
Then MERGE the data into table MARKS.
Solution:
MERGE INTO MARKS MS
USING
(
SELECT M.SUBJECT1_MARKS, M.SUBJECT2_MARKS, L1.RNK AS RANK_SUB_1, L2.RNK AS RANK_SUB_2
FROM MARKS M , LOOKUPTABLE L1, LOOKUPTABLE L2
WHERE M.SUBJECT1_MARKS BETWEEN L1.RANGE1 AND L1.RANGE2
AND M.SUBJECT2_MARKS BETWEEN L2.RANGE1 AND L2.RANGE2
GROUP BY M.SUBJECT1_MARKS, M.SUBJECT2_MARKS, L1.RNK, L2.RNK
) SUB
ON (MS.SUBJECT1_MARKS=SUB.SUBJECT1_MARKS AND MS.SUBJECT2_MARKS =SUB.SUBJECT2_MARKS)
WHEN MATCHED THEN UPDATE
SET MS.RANK_SUB_1=SUB.RANK_SUB_1, MS.RANK_SUB_2=SUB.RANK_SUB_2;
Tested on the schema and data below, based on the details in your question.
CREATE TABLE MARKS (SUBJECT1_MARKS NUMBER, SUBJECT2_MARKS NUMBER, RANK_SUB_1 NUMBER, RANK_SUB_2 NUMBER);
INSERT INTO MARKS (SUBJECT1_MARKS , SUBJECT2_MARKS ) VALUES (71, 22);
INSERT INTO MARKS (SUBJECT1_MARKS , SUBJECT2_MARKS ) VALUES (10, 40);
CREATE TABLE LOOKUPTABLE (RNK NUMBER, RANGE1 NUMBER, RANGE2 NUMBER);
INSERT INTO LOOKUPTABLE VALUES (9, 10, 20);
INSERT INTO LOOKUPTABLE VALUES (8, 21, 30);
INSERT INTO LOOKUPTABLE VALUES (7, 31, 40);
INSERT INTO LOOKUPTABLE VALUES (6, 41, 50);
INSERT INTO LOOKUPTABLE VALUES (5, 51, 60);
INSERT INTO LOOKUPTABLE VALUES (4, 61, 70);
INSERT INTO LOOKUPTABLE VALUES (3, 71, 80);
INSERT INTO LOOKUPTABLE VALUES (2, 81, 90);
INSERT INTO LOOKUPTABLE VALUES (1, 91, 100);
Thanks!!
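As a rough sketch of the LEFT JOIN variant mentioned above, the following script uses Python's standard sqlite3 module (not Oracle) so it can be run anywhere. The id column and the extra sample row (5, 95) are assumptions added for illustration. A mark that falls outside every range produces NULL (None in Python) instead of dropping the row.

```python
import sqlite3

LOOKUP = [(9, 10, 20), (8, 21, 30), (7, 31, 40), (6, 41, 50), (5, 51, 60),
          (4, 61, 70), (3, 71, 80), (2, 81, 90), (1, 91, 100)]


def rank_marks_left_join(marks):
    """Return (subject1, subject2, rank1, rank2); a rank is None when no range matches."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE lookuptable (rnk INTEGER, range1 INTEGER, range2 INTEGER)")
    con.executemany("INSERT INTO lookuptable VALUES (?, ?, ?)", LOOKUP)
    # The id column is hypothetical; it only keeps the output in insertion order.
    con.execute("CREATE TABLE marks (id INTEGER PRIMARY KEY, "
                "subject1_marks INTEGER, subject2_marks INTEGER)")
    con.executemany(
        "INSERT INTO marks (subject1_marks, subject2_marks) VALUES (?, ?)", marks)
    rows = con.execute("""
        SELECT m.subject1_marks, m.subject2_marks, l1.rnk, l2.rnk
        FROM marks m
        LEFT JOIN lookuptable l1
               ON m.subject1_marks BETWEEN l1.range1 AND l1.range2
        LEFT JOIN lookuptable l2
               ON m.subject2_marks BETWEEN l2.range1 AND l2.range2
        ORDER BY m.id
    """).fetchall()
    con.close()
    return rows


for row in rank_marks_left_join([(71, 22), (10, 40), (5, 95)]):
    print(row)
```

The mark 5 is below the lowest range, so its rank comes back as None while the row itself is still returned.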

I think this update statement should do what you want:
UPDATE Marks m
SET Rank_Sub_1 = (SELECT l.Rank
FROM LookupTable l
WHERE m.Subject1_Marks BETWEEN l.range1 AND l.range2)
WHERE EXISTS (
SELECT 1
FROM LookupTable l
WHERE m.Subject1_Marks BETWEEN l.range1 AND l.range2
);
If you want to update the value for Rank_Sub_2 at the same time you can do this:
UPDATE Marks m
SET Rank_Sub_1 = (SELECT l.Rank
FROM LookupTable l
WHERE m.Subject1_Marks BETWEEN l.range1 AND l.range2)
,Rank_Sub_2 = (SELECT l.Rank
FROM LookupTable l
WHERE m.Subject2_Marks BETWEEN l.range1 AND l.range2);
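One way to try the two-column UPDATE without a database server is Python's built-in sqlite3 module. This is only a sketch under that assumption: the rank column is called rnk here (my choice), the table is referenced by name instead of an alias, and SQLite syntax may differ slightly from your database.

```python
import sqlite3

LOOKUP = [(9, 10, 20), (8, 21, 30), (7, 31, 40), (6, 41, 50), (5, 51, 60),
          (4, 61, 70), (3, 71, 80), (2, 81, 90), (1, 91, 100)]


def update_ranks(marks):
    """Fill rank_sub_1 and rank_sub_2 with correlated subqueries and return all rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE lookuptable (rnk INTEGER, range1 INTEGER, range2 INTEGER)")
    con.executemany("INSERT INTO lookuptable VALUES (?, ?, ?)", LOOKUP)
    con.execute("CREATE TABLE marks (subject1_marks INTEGER, subject2_marks INTEGER, "
                "rank_sub_1 INTEGER, rank_sub_2 INTEGER)")
    con.executemany(
        "INSERT INTO marks (subject1_marks, subject2_marks) VALUES (?, ?)", marks)
    # Each subquery looks up the single range that contains the row's mark.
    con.execute("""
        UPDATE marks
        SET rank_sub_1 = (SELECT l.rnk FROM lookuptable l
                          WHERE marks.subject1_marks BETWEEN l.range1 AND l.range2),
            rank_sub_2 = (SELECT l.rnk FROM lookuptable l
                          WHERE marks.subject2_marks BETWEEN l.range1 AND l.range2)
    """)
    rows = con.execute(
        "SELECT subject1_marks, subject2_marks, rank_sub_1, rank_sub_2 "
        "FROM marks ORDER BY rowid").fetchall()
    con.close()
    return rows


print(update_ranks([(71, 22), (10, 40)]))
```

Note that without a WHERE EXISTS guard, a row whose mark matches no range has its rank set to NULL.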

Consider the design below, which eliminates the possibility of overlaps or gaps. Although I usually use this technique with dates, any data that defines an unbroken sequence works the same way. The idea is that you only define where each range starts; a range implicitly ends just below the next higher cutoff. Notice that I added a tenth rank in case values less than 10 are possible. Any value greater than 100 will, of course, show as rank 1.
with
Lookup( Rank, Cutoff )as(
select 1, 91 union all
select 2, 81 union all
select 3, 71 union all
select 4, 61 union all
select 5, 51 union all
select 6, 41 union all
select 7, 31 union all
select 8, 21 union all
select 9, 10 union all
select 10, 0
),
Marks( Mark1, Mark2 )as(
select 71, 22 union all
select 10, 40 union all
select 21, 101
)
select Mark1, l1.Rank as Rank1, Mark2, l2.Rank as Rank2
from Marks m
join Lookup l1
on l1.Cutoff =(
select Max( Cutoff )
from Lookup
where Cutoff <= m.Mark1 )
join Lookup l2
on l2.Cutoff =(
select Max( Cutoff )
from Lookup
where Cutoff <= m.Mark2 );
The output:
Mark1 Rank1 Mark2 Rank2
----------- ----------- ----------- -----------
71 3 22 8
10 9 40 7
21 8 101 1
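The same "store only the start of each range" idea can be sketched outside SQL. The snippet below is an illustration in Python, not part of the query above: it keeps the cutoffs sorted and uses a binary search to find the largest cutoff that does not exceed the mark. The names CUTOFFS and rank_for are made up for this example.

```python
from bisect import bisect_right

# (cutoff, rank) pairs: a range starts at its cutoff and ends just below the next one.
CUTOFFS = [(0, 10), (10, 9), (21, 8), (31, 7), (41, 6),
           (51, 5), (61, 4), (71, 3), (81, 2), (91, 1)]


def rank_for(mark, table=CUTOFFS):
    """Return the rank whose cutoff is the largest one <= mark, or None if none is."""
    table = sorted(table)
    starts = [cutoff for cutoff, _ in table]
    i = bisect_right(starts, mark) - 1  # index of the last cutoff <= mark
    if i < 0:
        return None  # mark lies below the lowest cutoff
    return table[i][1]


for mark1, mark2 in [(71, 22), (10, 40), (21, 101)]:
    print(mark1, rank_for(mark1), mark2, rank_for(mark2))
```

Because only the start of each range is stored, two ranges can never overlap and no value between the lowest cutoff and infinity can fall into a gap.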

Related

In Postgres SQL how to convert values as range and get the range with maximum records

Given a table of students with their name and age (shown below), how can I convert the age values into ranges of
7-9
9-11
11-13
13-15
15-17
17-19
and get the age range with the most students?
Creating table:
CREATE TABLE students (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
age FLOAT NOT NULL
);
Inserting values:
INSERT INTO students
VALUES
(1, 'Ryan', 12),
(2, 'Joanna', 12.5),
(3, 'James', 11),
(4, 'Karen', 10),
(5, 'Holmes', 11.2),
(6, 'Garry', 12.1),
(7, 'Justin', 14.5),
(8, 'Emma', 15),
(9, 'Andy', 10),
(10, 'Claren', 9.5),
(11, 'Dennis', 9),
(12, 'Henna', 16),
(13, 'Iwanka', 15.4),
(14, 'June', 8.1),
(15, 'Kamila', 7.5),
(16, 'Lance', 17);
Expected Output should be range with max count of records:
Range | Count
10-12 | 5
--construct numrange table.
CREATE TABLE age_range (
id serial,
agerange numrange
);
INSERT INTO age_range (agerange)
VALUES ('[7,9]'), ('[10,12]'), ('[13,15]'), ('[15,17]'), ('[17,19]');
--cte with window function.
WITH a AS (
SELECT
age,
name,
agerange
FROM
students s,
age_range b
WHERE
age::numeric <@ agerange -- <@ tests whether the value lies inside the range; age is FLOAT, so cast it to numeric
)
SELECT
*,
count(agerange) OVER (PARTITION BY agerange)
FROM
a
ORDER BY
agerange,
name;
You can use an aggregate function with a CASE WHEN expression to bucket the ages, then ORDER BY COUNT(*) DESC with LIMIT 1 to get the range with the most records:
SELECT (CASE WHEN age BETWEEN 7 AND 9 THEN '7-9'
WHEN age BETWEEN 10 AND 12 THEN '10-12'
WHEN age BETWEEN 13 AND 15 THEN '13-15'
WHEN age BETWEEN 15 AND 17 THEN '15-17'
WHEN age BETWEEN 17 AND 19 THEN '17-19' END) as range,
COUNT(*) cnt
FROM students
GROUP BY CASE WHEN age BETWEEN 7 AND 9 THEN '7-9'
WHEN age BETWEEN 10 AND 12 THEN '10-12'
WHEN age BETWEEN 13 AND 15 THEN '13-15'
WHEN age BETWEEN 15 AND 17 THEN '15-17'
WHEN age BETWEEN 17 AND 19 THEN '17-19' END
ORDER BY COUNT(*) DESC
LIMIT 1
Edit:
If your ranges follow a regular pattern and you want a generic solution, you can use generate_series to generate the range boundaries and then join against them.
For your sample data I would use generate_series(7,17,2), which produces the start of each range; each range ends at start + 2.
SELECT CONCAT(t1.startnum,'-',t1.endnum) as range,
COUNT(*) cnt
FROM students s
INNER JOIN (
SELECT v startnum,v+2 endnum
FROM generate_series(7,17,2) v
) t1 ON s.age BETWEEN t1.startnum AND t1.endnum
GROUP BY CONCAT(t1.startnum,'-',t1.endnum)
ORDER BY COUNT(*) DESC
LIMIT 1
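Both BETWEEN-based queries count an age that sits exactly on a shared boundary (for example 15 or 17) in two ranges. A possible alternative, sketched here with Python's sqlite3 module, is to compute the bucket arithmetically so that each range is half-open, [start, start + width). This is a different technique from the answers above, the start = 7 and width = 2 defaults are assumptions taken from the question's list, and the resulting buckets do not match the 10-12 label in the expected output.

```python
import sqlite3

STUDENTS = [
    (1, 'Ryan', 12), (2, 'Joanna', 12.5), (3, 'James', 11), (4, 'Karen', 10),
    (5, 'Holmes', 11.2), (6, 'Garry', 12.1), (7, 'Justin', 14.5), (8, 'Emma', 15),
    (9, 'Andy', 10), (10, 'Claren', 9.5), (11, 'Dennis', 9), (12, 'Henna', 16),
    (13, 'Iwanka', 15.4), (14, 'June', 8.1), (15, 'Kamila', 7.5), (16, 'Lance', 17),
]


def busiest_age_range(students, start=7, width=2):
    """Return (label, count) for the half-open bucket holding the most students."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE students "
                "(id INTEGER PRIMARY KEY, name TEXT NOT NULL, age REAL NOT NULL)")
    con.executemany("INSERT INTO students VALUES (?, ?, ?)", students)
    # CAST(... AS INTEGER) truncates, which floors here because age >= start.
    row = con.execute("""
        SELECT ? + ? * CAST((age - ?) / ? AS INTEGER) AS lo, COUNT(*) AS cnt
        FROM students
        GROUP BY lo
        ORDER BY cnt DESC
        LIMIT 1
    """, (start, width, start, width)).fetchone()
    con.close()
    if row is None:
        return None  # empty table
    lo, cnt = row
    return f"{lo}-{lo + width}", cnt


print(busiest_age_range(STUDENTS))
```

With the sample data every student lands in exactly one bucket, and the fullest one is the range starting at 11.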

T-SQL Select to compute a result row on preceding group/condition

How to achieve this result using a T-SQL select query.
Given this sample table :
create table sample (a int, b int)
insert into sample values (999, 10)
insert into sample values (16, 11)
insert into sample values (10, 12)
insert into sample values (25, 13)
insert into sample values (999, 20)
insert into sample values (14, 12)
insert into sample values (90, 45)
insert into sample values (18, 34)
I'm trying to achieve this output:
a b result
----------- ----------- -----------
999 10 10
16 11 10
10 12 10
25 13 10
999 20 20
14 12 20
90 45 20
18 34 20
The rule is fairly simple: if column 'a' has the special value of 999 the result for that row and following rows (unless the value of 'a' is again 999) will be the value of column 'b'. Assume the first record will have 999 on column 'a'.
Any hint how to implement, if possible, the select query without using a stored procedure or function?
Thank you.
António
You can do what you want if you add a column to specify the ordering:
create table sample (
id int identity(1, 1),
a int,
b int
);
Then you can do what you want by finding the "999" version that is most recent and copying that value. Here is a method using window functions:
select a, b, max(case when a = 999 then b end) over (partition by id_999) as result
from (select s.*,
max(case when a = 999 then id end) over (order by id) as id_999
from sample s
) s;
You need an id column to define the row order. A correlated subquery can then fetch b from the most recent 999 row:
select cn.id, cn.a
, (select top (1) b from sample where sample.id <= cn.id and a = 999 order by id desc) as result
from sample as cn
order by id
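To check the correlated-subquery idea against the sample data without SQL Server, here is a small sketch using Python's sqlite3 module. SQLite has no TOP, so LIMIT 1 stands in for top (1), and the auto-assigned id column plays the role of the identity column. Both substitutions are assumptions of this sketch.

```python
import sqlite3

SAMPLE = [(999, 10), (16, 11), (10, 12), (25, 13),
          (999, 20), (14, 12), (90, 45), (18, 34)]


def fill_result(pairs):
    """Return (a, b, result), where result is b of the latest row with a = 999."""
    con = sqlite3.connect(":memory:")
    # id is assigned in insertion order and defines what "preceding" means.
    con.execute("CREATE TABLE sample (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
    con.executemany("INSERT INTO sample (a, b) VALUES (?, ?)", pairs)
    rows = con.execute("""
        SELECT s.a, s.b,
               (SELECT p.b
                FROM sample p
                WHERE p.id <= s.id AND p.a = 999
                ORDER BY p.id DESC
                LIMIT 1) AS result
        FROM sample s
        ORDER BY s.id
    """).fetchall()
    con.close()
    return rows


for row in fill_result(SAMPLE):
    print(row)
```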

Another approach to percentiles?

I have a dataset which essentially consists of a list of job batches, the number of jobs contained in each batch, and the duration of each job batch. Here is a sample dataset:
CREATE TABLE test_data
(
batch_id NUMBER,
job_count NUMBER,
duration NUMBER
);
INSERT INTO test_data VALUES (1, 37, 9);
INSERT INTO test_data VALUES (2, 47, 4);
INSERT INTO test_data VALUES (3, 66, 6);
INSERT INTO test_data VALUES (4, 46, 6);
INSERT INTO test_data VALUES (5, 54, 1);
INSERT INTO test_data VALUES (6, 35, 1);
INSERT INTO test_data VALUES (7, 55, 9);
INSERT INTO test_data VALUES (8, 82, 7);
INSERT INTO test_data VALUES (9, 12, 9);
INSERT INTO test_data VALUES (10, 52, 4);
INSERT INTO test_data VALUES (11, 3, 9);
INSERT INTO test_data VALUES (12, 90, 2);
Now, I want to calculate some percentiles for the duration field. Typically, this is done with something like the following:
SELECT
PERCENTILE_DISC( 0.75 )
WITHIN GROUP (ORDER BY duration ASC)
AS third_quartile
FROM
test_data;
(Which gives the result of 9)
My problem is that I don't want percentiles based on batches; I want them based on individual jobs. I can figure this out by hand quite easily by generating a running total of the job_count:
SELECT
batch_id,
job_count,
SUM(
job_count
)
OVER (
ORDER BY duration
ROWS UNBOUNDED PRECEDING
)
AS total_jobs,
duration
FROM
test_data
ORDER BY
duration ASC;
BATCH_ID JOB_COUNT TOTAL_JOBS DURATION
6 35 35 1
5 54 89 1
12 90 179 2
2 47 226 4
10 52 278 4
3 66 344 6
4 46 390 6
8 82 472 7
9 12 484 9
1 37 521 9
11 3 524 9
7 55 579 9
Since I have 579 jobs, then the 75th percentile would be job 434. Looking at the above result set, that corresponds with a duration of 7, different from what the standard function does.
Essentially, I want to consider each job in a batch as a separate observation, and determine percentiles based on those, instead of on the batches.
Is there a relatively simple way to accomplish this?
I would think of this as "weighted" percentiles. I don't know if there is a built-in analytic function for this in Oracle, but it is easy enough to calculate. And you are on the way there.
The additional idea is to calculate the total number of jobs, and then use arithmetic to select the value you want. For the 75th percentile, the value is the smallest duration such that the cumulative number of jobs is greater than or equal to 0.75 times the total number of jobs.
Here is the example in SQL:
select pcs.percentile, min(case when cumjobs >= totjobs * percentile then duration end)
from (SELECT batch_id, job_count,
SUM(job_count) OVER (ORDER BY duration) as cumjobs,
sum(job_count) over () as totjobs,
duration
FROM test_data
) t cross join
(select 0.25 as percentile from dual union all
select 0.5 from dual union all
select 0.75 from dual
) pcs
group by pcs.percentile;
This example returns the percentile values (for three different percentiles, as an added bonus), one per row. If you want a percentile value attached to every row of the original data, you need to join the result back to your original table.
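The arithmetic behind the weighted percentile can be checked with a few lines of Python. This is a sketch of the rule described above (smallest duration whose cumulative job count reaches p times the total), cross-checked against a brute-force version that expands every batch into individual jobs. The function names are invented for this example, and the brute-force index assumes discrete-percentile semantics.

```python
import math

# (job_count, duration) per batch, taken from the test_data inserts.
BATCHES = [(37, 9), (47, 4), (66, 6), (46, 6), (54, 1), (35, 1),
           (55, 9), (82, 7), (12, 9), (52, 4), (3, 9), (90, 2)]


def weighted_percentile(batches, p):
    """Smallest duration whose cumulative job count is >= p * total jobs."""
    total = sum(count for count, _ in batches)
    running = 0
    for count, duration in sorted(batches, key=lambda b: b[1]):
        running += count
        if running >= p * total:
            return duration
    return None  # only reached for an empty input


def expanded_percentile(batches, p):
    """Brute force: one observation per job, then pick the discrete percentile."""
    jobs = sorted(duration for count, duration in batches for _ in range(count))
    return jobs[max(math.ceil(p * len(jobs)) - 1, 0)]


for p in (0.25, 0.5, 0.75):
    print(p, weighted_percentile(BATCHES, p), expanded_percentile(BATCHES, p))
```

For the sample data both functions agree, and the 75th percentile comes out as 7, matching the hand calculation in the question.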
OK. I think I have your answer. Idea is mine. Implementation is borrowed from this Ask Tom article
SELECT PERCENTILE_DISC( 0.75 )
WITHIN GROUP (ORDER BY duration ASC)
AS third_quartile
FROM(
with data as
(select level l
from dual, (select max(job_count) max_jobs from test_data)
connect by level <= max_jobs
)
select *
from test_data, data
where l <= job_count
--ORDER BY duration, batch_id
) inner
;

Get values between ranges

The more I think about it, the more confused I get, possibly because it has been a while since I wrote any complex SQL.
I have a table that holds ranges for a value. Let's call it RANGE:
RANGE
RANGE_ID  RANGE_SEQ  MIN  MAX  FACTOR
1         1          0    10   1
1         2          11   100  1.5
1         3          101       2.5
2         1          0    18   1
2         2          19        2
And I have another table that uses these ranges. Let's call it APPLICATION:
APPLICATION
APP_ID  RAW_VALUE  RANGE_ID  FINAL_VALUE
1       20.0       1         30.0   /* In range 1, 20 falls between 11 and 100, so factor 1.5 is applied */
2       25.0       2         50.0
3       18.5       2         18.5
I want to get those RAW_VALUEs that fall into the gaps between the ranges. So for range 2, I want the APP_IDs that have a RAW_VALUE between 18 and 19. Similarly, for range 1, I want the APP_IDs that have a RAW_VALUE between 10 and 11 or between 100 and 101.
I want to know whether this is possible with SQL, and some pointers on what I can try. I don't need the sql itself, just some pointers to the approach.
Try this to get you close
select app_id,raw_value,aa.range_id,raw_value * xx.factor as FinaL_Value
from Application_table aa
join range_table xx on (aa.raw_value between xx.min and xx.max)
and (aa.range_id=xx.range_id)
To get non-matches (i.e. raw_values that fall within no range), try this
select app_id,raw_value,aa.range_id
from Application_table aa
left join range_table xx on (aa.raw_value between xx.min and xx.max)
and (aa.range_id=xx.range_id)
where xx.range_id is null
create table tq84_range (
range_id number not null,
range_seq number not null,
min_ number not null,
max_ number,
factor number not null,
--
primary key (range_id, range_seq)
);
insert into tq84_range values (1, 1, 0, 10, 1.0);
insert into tq84_range values (1, 2, 11, 100, 1.5);
insert into tq84_range values (1, 3,101,null, 2.5);
insert into tq84_range values (2, 1, 0, 18, 1.0);
insert into tq84_range values (2, 2, 19,null, 2.0);
create table tq84_application (
app_id number not null,
raw_value number not null,
range_id number not null,
primary key (app_id)
);
insert into tq84_application values (1, 20.0, 1);
insert into tq84_application values (2, 25.0, 2);
insert into tq84_application values (3, 18.5, 2);
You want to use a left join.
With a left join, each record of the left table (the table appearing before left join in the select statement) is returned at least once, even if the join condition finds no matching record in the right table.
If tq84_range.range_id is null, you know that the join condition didn't find a record in tq84_range, so the value falls into a gap and you print Missing:.
Since tq84_range.max_ can be null, and null appears to mean "no upper bound", you test the upper limit with nvl(tq84_range.max_, tq84_application.raw_value).
Thus, the select statement will become something like:
select
case when tq84_range.range_id is null then 'Missing: '
else ' '
end,
tq84_application.raw_value
from
tq84_application left join
tq84_range
on
tq84_application.range_id = tq84_range.range_id
and
tq84_application.raw_value between
tq84_range.min_ and nvl(tq84_range.max_, tq84_application.raw_value);
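The left-join gap check can be tried quickly with Python's sqlite3 module. This is a sketch, not Oracle code: COALESCE replaces nvl, the table names rng and app are shortened stand-ins, range 1 uses the bounds from the question (11 to 100), and the fourth application row (10.5 in range 1) is an extra value added to show a second gap.

```python
import sqlite3

RANGES = [(1, 1, 0, 10, 1.0), (1, 2, 11, 100, 1.5), (1, 3, 101, None, 2.5),
          (2, 1, 0, 18, 1.0), (2, 2, 19, None, 2.0)]
APPS = [(1, 20.0, 1), (2, 25.0, 2), (3, 18.5, 2), (4, 10.5, 1)]


def values_in_gaps(ranges, apps):
    """Return (app_id, raw_value) for applications that match no range row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE rng (range_id INTEGER, range_seq INTEGER, "
                "min_ REAL, max_ REAL, factor REAL)")
    con.execute("CREATE TABLE app (app_id INTEGER PRIMARY KEY, "
                "raw_value REAL, range_id INTEGER)")
    con.executemany("INSERT INTO rng VALUES (?, ?, ?, ?, ?)", ranges)
    con.executemany("INSERT INTO app VALUES (?, ?, ?)", apps)
    # A NULL max_ means "no upper bound": COALESCE makes the upper test always pass.
    rows = con.execute("""
        SELECT a.app_id, a.raw_value
        FROM app a
        LEFT JOIN rng r
               ON r.range_id = a.range_id
              AND a.raw_value BETWEEN r.min_ AND COALESCE(r.max_, a.raw_value)
        WHERE r.range_id IS NULL
        ORDER BY a.app_id
    """).fetchall()
    con.close()
    return rows


print(values_in_gaps(RANGES, APPS))
```

Only the rows whose raw value sits between two ranges (18.5 for range 2, 10.5 for range 1) come back.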
From what I understand you're saying you only want results from the application table that don't fit in any range? This, for example, would return only the row for app_id = 3 (my own column names and guess at real minimum and maximum amounts):
select *
from APP1 A
where not exists
(select null
from RANGE1 R
where R.RANGE_ID = A.RANGE_ID and A.RAW_VALUE between nvl(R.MINNUM, 0) and nvl(R.MAXNUM, 999999));
Of course, it won't return a factor, since the row matches nothing in the range table. So why does app_id = 3 in your example above end up with factor = 1? If your raw_value column is going to be decimal, then I would expect the range bounds to be decimal too.

Oracle - natural sort rows on multiple levels

Using Oracle 10.2.0.
I have a table that consists of a line number, an indent level, and text. I need to write a routine to 'natural' sort the text within an indent level [that is a child of a lower indent level]. I have limited experience with analytic routines and connect by/prior, but from what I've read here and elsewhere, it seems like they could be put to use to help my cause, but I can't figure out how.
CREATE TABLE t (ord NUMBER(5), indent NUMBER(3), text VARCHAR2(254));
INSERT INTO t (ord, indent, text) VALUES (10, 0, 'A');
INSERT INTO t (ord, indent, text) VALUES (20, 1, 'B');
INSERT INTO t (ord, indent, text) VALUES (30, 1, 'C');
INSERT INTO t (ord, indent, text) VALUES (40, 2, 'D');
INSERT INTO t (ord, indent, text) VALUES (50, 2, 'Z');
INSERT INTO t (ord, indent, text) VALUES (60, 2, 'E');
INSERT INTO t (ord, indent, text) VALUES (70, 1, 'F');
INSERT INTO t (ord, indent, text) VALUES (80, 2, 'H');
INSERT INTO t (ord, indent, text) VALUES (90, 2, 'G');
INSERT INTO t (ord, indent, text) VALUES (100, 3, 'J');
INSERT INTO t (ord, indent, text) VALUES (110, 3, 'H');
This:
SELECT ord, indent, LPAD(' ', indent, ' ') || text txt FROM t;
...returns:
ORD INDENT TXT
---------- ---------- ----------------------------------------------
10 0 A
20 1 B
30 1 C
40 2 D
50 2 Z
60 2 E
70 1 F
80 2 H
90 2 G
100 3 J
110 3 H
11 rows selected.
In the case I've defined for you, I need my routine to set ORD 60 = 50 and ORD 50 = 60 [flip them] because E is after D and before Z.
Same with ORD 80 and 90 [with 90 bringing 100 and 110 with it because they belong to it], 100 and 110. The final output should be:
ORD INDENT TXT
10 0 A
20 1 B
30 1 C
40 2 D
50 2 E
60 2 Z
70 1 F
80 2 G
90 3 H
100 3 J
110 2 H
The result is that each indent level is sorted alphabetically, within its indent level, within the parent indent level.
Here's what I got to work. No idea how efficient it might be on larger sets. The hard part for me was identifying the "parent" for a given row based solely on indent and original order.
WITH
a AS (
SELECT
t.*,
( SELECT MAX( ord )
FROM t t2
WHERE t2.ord < t.ord AND t2.indent = t.indent-1
) AS parent_ord
FROM
t
)
SELECT
ROWNUM*10 AS ord,
indent,
rpad( ' ', LEVEL-1, ' ' ) || text
FROM
a
CONNECT BY
PRIOR ord = parent_ord
START WITH
parent_ord IS NULL
ORDER SIBLINGS BY
text
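The parent rule used in the query (the nearest earlier row that is exactly one indent level shallower) can also be sketched procedurally. The Python below is an illustration of that rule, not a replacement for the CONNECT BY query: it links each row to its parent, sorts siblings by text, and renumbers ord in steps of 10. The function name and the tuple layout are assumptions of this sketch.

```python
ROWS = [(10, 0, 'A'), (20, 1, 'B'), (30, 1, 'C'), (40, 2, 'D'), (50, 2, 'Z'),
        (60, 2, 'E'), (70, 1, 'F'), (80, 2, 'H'), (90, 2, 'G'), (100, 3, 'J'),
        (110, 3, 'H')]


def sort_within_parents(rows):
    """Sort siblings by text inside each parent and renumber ord as 10, 20, ..."""
    rows = sorted(rows)            # process in original ord order
    children = {}                  # parent row index (or None) -> child row indexes
    last_at_indent = {}            # indent -> index of the latest row at that indent
    for i, (_, indent, _) in enumerate(rows):
        parent = last_at_indent.get(indent - 1)  # None for top-level rows
        children.setdefault(parent, []).append(i)
        last_at_indent[indent] = i

    out = []

    def visit(parent):
        # sorted() is stable, so equal texts keep their original relative order.
        for i in sorted(children.get(parent, []), key=lambda i: rows[i][2]):
            out.append(((len(out) + 1) * 10, rows[i][1], rows[i][2]))
            visit(i)               # a row's subtree follows the row itself

    visit(None)
    return out


for ord_, indent, text in sort_within_parents(ROWS):
    print(ord_, indent, ' ' * indent + text)
```

On the sample rows this reproduces the expected output from the question, with G moving ahead of H and bringing its two children along.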
Okay, here you go. The hard part in your data structure is that the parent is not (explicitly) known, so the first part of the query does nothing but identify the parent according to the rules (for each node, it gets all subnodes one level deep, stopping as soon as the indentation is smaller than or equal to that of the start node).
The rest is easy, basically just some recursion with connect by to get the items in the order you want them (renumbering them dynamically).
WITH OrdWithParentInfo AS
(SELECT ID,
INDENT,
TEXT,
MIN(ParentID) ParentID
FROM (SELECT O.*,
CASE
WHEN (CONNECT_BY_ROOT ID = ID) THEN
NULL
ELSE
CONNECT_BY_ROOT ID
END ParentID
FROM (SELECT ROWNUM ID,
INDENT,
TEXT
FROM T
ORDER BY ORD) O
WHERE (INDENT = CONNECT_BY_ROOT INDENT + 1)
OR (CONNECT_BY_ROOT ID = ID)
CONNECT BY ((ID = PRIOR ID + 1) AND (INDENT > CONNECT_BY_ROOT INDENT)))
GROUP BY ID,
INDENT,
TEXT)
SELECT ROWNUM * 10 ORD, O.INDENT, O.TEXT
FROM OrdWithParentInfo O
START WITH O.ParentID IS NULL
CONNECT BY O.ParentID = PRIOR ID
ORDER SIBLINGS BY O.Text;