SQL Query for the closest value (HELP!!) - sql

I have created a table in the database and loaded a list of calibers with their corresponding prices.
I need a query that, given a caliber (which is not necessarily in the caliber table), displays the price of the nearest caliber in the table.
For example, based on the table of calibers by price: if I enter 1.47 as the caliber, it must return the price for the 1.5 caliber, and if I enter 1.41, it must return the price for the 1.4 caliber.

I would consider something like the following:
SELECT *
FROM
(
    SELECT mt.*, RANK() OVER (ORDER BY ABS(caliber-1.41)) rn
    FROM mytable mt
)
WHERE rn = 1
This calculates the difference between caliber and 1.41 using ABS for absolute value (to get closest without caring whether it is bigger or smaller). The WHERE rn = 1 then limits to the rows with the smallest difference.
Note that this assumes that if there are two rows that are equally far from your number, you want to return them both. If you want to arbitrarily pick one in the event of a tie I would replace RANK with ROW_NUMBER.
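To see the difference between RANK and ROW_NUMBER on a tie, here is a minimal sketch that runs the query above from Python against an in-memory SQLite database (window functions need SQLite 3.25 or later). The prices, the extra calibers 1.75 and 2.0, and the `closest` helper are assumptions made up for illustration; they are not part of the question or the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (caliber REAL, price REAL)")
# Hypothetical sample data; the real caliber/price list is not shown in the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1.4, 12.0), (1.5, 15.0), (1.75, 20.0), (2.0, 25.0)],
)


def closest(target, ranking="RANK"):
    """Return the (caliber, price) rows nearest to target, using RANK or ROW_NUMBER."""
    if ranking not in ("RANK", "ROW_NUMBER"):
        raise ValueError("ranking must be RANK or ROW_NUMBER")
    sql = f"""
        SELECT caliber, price
        FROM (
            SELECT mt.*, {ranking}() OVER (ORDER BY ABS(caliber - ?)) AS rn
            FROM mytable mt
        ) ranked
        WHERE rn = 1
        ORDER BY caliber
    """
    return conn.execute(sql, (target,)).fetchall()


print(closest(1.47))                 # nearest caliber is 1.5
print(closest(1.41))                 # nearest caliber is 1.4
print(closest(1.875))                # 1.75 and 2.0 are equally far, RANK returns both
print(closest(1.875, "ROW_NUMBER"))  # ROW_NUMBER keeps only one of the tied rows
```

With ROW_NUMBER, which of the tied rows survives is arbitrary unless a second sort key (for example `caliber`) is added to the window's ORDER BY.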

Related

Calculating the mode/median/most frequent observation in categorical variables in SQL impala

I would like to calculate the mode/median, or better, the most frequent observation of a categorical variable within my query.
E.g., if the variable has the following string values:
dog, dog, dog, cat, cat, I want to get dog since it's 3 vs 2.
Is there any function that does that? I tried APPX_MEDIAN() but it only returns the first 10 characters as the median and I do not want that.
Also, if there is a tie, I would like to pick the most frequent observation with respect to date.
Thank you!
The most frequent observation is the mode, and you can calculate it like this.
A single-value mode can be calculated on a value column like this: get the count per value and pick the row with the max count.
select count(*),value from mytable group by value order by 1 desc limit 1
Now, in case you have multiple modes, you need to join back to the per-value counts to find all matches.
select orig.v from
(select count(*) c, value v from mytable group by value) orig
join (select count(*) cmode from mytable group by value order by 1 desc limit 1) cmode
ON orig.c = cmode.cmode
This gets the count of every value and then matches on that count. If one value's count matches the max count, you get 1 row; if two values' counts match the max count, you get 2 rows, and so on.
Calculating the median is a little trickier: it gives you the middle value, which is not necessarily the most frequent one.
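As a rough check of the join-back approach, the sketch below runs it against an in-memory SQLite database from Python rather than Impala, so treat it as an illustration of the SQL logic only. The `modes` helper and the `bird` value are assumptions added for the example, and the counting subquery carries an explicit GROUP BY so that it produces one count per value.

```python
import sqlite3


def modes(values):
    """Return every value whose frequency equals the maximum frequency."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (value TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        SELECT orig.v
        FROM (SELECT COUNT(*) AS c, value AS v FROM mytable GROUP BY value) orig
        JOIN (SELECT COUNT(*) AS cmode FROM mytable GROUP BY value
              ORDER BY 1 DESC LIMIT 1) top
          ON orig.c = top.cmode
        ORDER BY orig.v
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(modes(["dog", "dog", "dog", "cat", "cat"]))   # single mode
print(modes(["dog", "dog", "cat", "cat", "bird"]))  # two values tie for the top count
```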

Query smallest number of rows to match a given value threshold

I would like to create a query that operates similar to a cash register. Imagine a cash register full of coins of different sizes. I would like to retrieve a total value of coins in the fewest number of coins possible.
Given this table:
id value
1 100
2 100
3 500
4 500
5 1000
How would I query for a list of rows that:
has a total value of AT LEAST a given threshold
with the minimum excess value (value above the threshold)
in the fewest possible rows
For example, if my threshold is 1050, this would be the expected result:
id value
1 100
5 1000
I'm working with postgres and elixir/ecto. If it can be done in a single query great, if it requires a sequence of multiple queries no problem.
I had a go at this myself, using answers from previous questions:
Using ABS() to order by the closest value to the threshold
Select rows until a sum reduction of a single column reaches a threshold
Based on @TheImpaler's comment above, this prioritises the minimum number of rows over the minimum excess. It's not 100% what I was looking for, so I'm open to improvements if anyone has one, but if not I think this is going to be good enough:
-- outer query selects all rows underneath the threshold
-- inner subquery adds a running total column
-- window function orders by the difference between value and threshold
SELECT
    *
FROM (
    SELECT
        i.*,
        SUM(i.value) OVER (
            ORDER BY
                ABS(i.value - $THRESHOLD),
                i.id
        ) AS total
    FROM
        inputs i
) t
WHERE
    t.total - t.value < $THRESHOLD;
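To make the trade-off concrete, here is a small sketch that runs the same query on the sample table from the question, using Python with an in-memory SQLite database instead of Postgres (an assumption for the sake of a self-contained example; window functions need SQLite 3.25 or later). The `pick` helper is hypothetical and `$THRESHOLD` is passed as a named parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inputs (id INTEGER PRIMARY KEY, value INTEGER)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO inputs VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 500), (4, 500), (5, 1000)],
)


def pick(threshold):
    """Return (id, value, running total) rows chosen by the running-total query."""
    sql = """
        SELECT id, value, total
        FROM (
            SELECT i.*,
                   SUM(i.value) OVER (
                       ORDER BY ABS(i.value - :threshold), i.id
                   ) AS total
            FROM inputs i
        ) t
        WHERE t.total - t.value < :threshold
        ORDER BY t.total
    """
    return conn.execute(sql, {"threshold": threshold}).fetchall()


print(pick(1050))  # [(5, 1000, 1000), (3, 500, 1500)]
```

For a threshold of 1050 this picks ids 5 and 3 (total 1500) rather than the ideal ids 1 and 5 (total 1100): the row count is minimal, but the excess is not.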

fetch aggregate value along with data

I have a table with the following fields
ID,Content,QuestionMarks,TypeofQuestion
350, What is the symbol used to represent Bromine?,2,MCQ
758,What is the symbol used to represent Bromine? ,2,MCQ
2425,What is the symbol used to represent Bromine?,3,Essay
2080,A quadrilateral has four sides, four angles ,1,MCQ
2614,A circular cone has a curved surface area of ,2,MCQ
2520,Two triangles have sides 5 cm, 11 cm, 2 cm . ,2,MCQ
2196,Life supporting process mediated by water? ,2,Essay
I would like to get random questions where total marks is an input number.
For example if I say 25, the result should be all the random questions whose Sum(QuestionMarks) is 25(+/-1)
Is this really possible using SQL?
select content,id,questionmarks,sum(questionmarks) from quiz_question
group by content,id,questionmarks;
Expected Input 25
Expected Result (Sum of Question Marks =25)
Update:
How do I ensure I get at least 2 Essay type questions? (This is just an example; I would extend it to other conditions.) Thank you for all the help.
S-Man's cumulative sum is the right approach. For your logic, though, I think you want to get up to the first row that is 24 or more. That logic is:
where total - questionmark < 24
If you have enough questions, then you could get exactly 25 using:
with q25 as (
      select *
      from (select t.*,
                   sum(questionmarks) over (order by random()) as running_questionmark
            from t
           ) t
      where running_questionmark < 25
     )
select q.ID, q.Content, q.QuestionMarks, q.TypeofQuestion
from q25 q
union all
(select t.ID, t.Content, t.QuestionMarks, t.TypeofQuestion
 from t cross join
      (select sum(questionmarks) as questionmark_25 from q25) x
 where not exists (select 1 from q25 where q25.id = t.id)
 order by abs(questionmarks - (25 - questionmark_25))
 limit 1
)
This selects questions up to 25 but not at 25. It then tries to find one more to make the total 25.
Suppose questionmark is of type integer and you want to get some records in random order whose questionmark sum is not more than 25:
You can use the cumulative SUM() window function. The order is random. The cumulative SUM() adds every current value to the previous sum. So, you could filter where SUM() <= <your value>:
SELECT
    *
FROM (
    SELECT
        *,
        SUM(questionmark) OVER (ORDER BY random()) as total
    FROM
        t
) s
WHERE total <= 25
Note:
This returns a list of records in random order whose sum is no more than 25: it stops at the last row that keeps the running total at or below 25.
Finding an exact match for your value is a combinatorial problem which shouldn't be solved in a database, especially when there's a random factor. What if your current SUM is 22 and the next randomly chosen value is 4? Would you retry, maybe indefinitely, to randomly find a value = 3? Or would you try to remove an already counted record with value = 1?
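The two filters discussed in this thread (`total <= 25` and `total - questionmark < 24`) can be compared with a short sketch. It assumes Python with an in-memory SQLite database (3.25 or later) in place of Postgres, and a made-up table of 30 questions worth 1 to 3 marks each. The helper names are hypothetical, and `id` is added to the window ordering only as a tie-breaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, questionmark INTEGER)")
# Hypothetical data: 30 questions worth 1, 2 or 3 marks each (60 marks in total).
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(i, i % 3 + 1) for i in range(1, 31)]
)

# Running total over a random ordering of the questions.
RUNNING = """
    SELECT id, questionmark,
           SUM(questionmark) OVER (ORDER BY random(), id) AS total
    FROM t
"""


def at_most(limit):
    """Rows whose running total stays at or below limit."""
    sql = f"SELECT id, questionmark FROM ({RUNNING}) s WHERE total <= ?"
    return conn.execute(sql, (limit,)).fetchall()


def up_to_first_reaching(floor):
    """Rows up to and including the first one whose running total reaches floor."""
    sql = f"SELECT id, questionmark FROM ({RUNNING}) s WHERE total - questionmark < ?"
    return conn.execute(sql, (floor,)).fetchall()


print(sum(mark for _, mark in at_most(25)))               # never more than 25
print(sum(mark for _, mark in up_to_first_reaching(24)))  # 24, 25 or 26 with marks of 1 to 3
```

Neither filter guarantees exactly 25 on every run; the second one lands within 25 +/- 1 here only because no question in the sample is worth more than 3 marks.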

Data sorting and grouping in Oracle

Someone mistakenly entered negative values into a unique key column long ago, and now I have to group the data, selecting the max ID per category, to extract a report. The ID column now has both positive and negative values.
The Max(ID) function does not give the result I need with negative values.
ID Category
45678 A
234567 B
-4578 A
-45798 A
-7890 C
-8904 C
-7654 C
Expected output is
ID Category
45678 A
234567 B
-8904 C
"So ID with largest negative values will have latest data before 2010
and id with positive values are created after 2010"
That means in case there are positive IDs for a category you want the maximum (e.g. 45678 for category A) and otherwise the minimum (e.g. -8904 for category C). You can use Oracle's KEEP FIRST/LAST for this:
select
    category,
    max(id) keep (dense_rank last order by sign(id), abs(id))
from mytable
group by category
order by category;
This sorts your IDs by sign (negative before positive ones, so if there are positive ones you'd prefer these) and then by absolute amount (so you get the highest negative or positive as the last row, which is the one you pick with KEEP LAST).
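KEEP (DENSE_RANK LAST ...) is Oracle-specific, so it cannot be tried outside Oracle. As a hedged illustration of the same ordering idea, the sketch below swaps in ROW_NUMBER with the sort reversed (positive before negative, then largest absolute value first) and runs it from Python on an in-memory SQLite database (3.25 or later). It is an equivalent formulation for this data, not the answer's own query; a CASE expression stands in for sign(id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, category TEXT)")
# Sample rows taken from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        (45678, "A"), (234567, "B"), (-4578, "A"), (-45798, "A"),
        (-7890, "C"), (-8904, "C"), (-7654, "C"),
    ],
)

rows = conn.execute(
    """
    SELECT id, category
    FROM (
        SELECT id, category,
               ROW_NUMBER() OVER (
                   PARTITION BY category
                   ORDER BY CASE WHEN id > 0 THEN 1 WHEN id < 0 THEN -1 ELSE 0 END DESC,
                            ABS(id) DESC
               ) AS rn
        FROM mytable
    ) ranked
    WHERE rn = 1
    ORDER BY category
    """
).fetchall()

print(rows)  # [(45678, 'A'), (234567, 'B'), (-8904, 'C')]
```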

Suppress repeating values on subsequent rows

item_no_parent item_no_child item_name text
123 3 xxx the item is resistant to water
123 5 yyy The item is resistant to heat
123 6 zzz The item is ....
I will be giving the parent item_no as input and retrieving the child item numbers. Now I have to check each child item's text: if two items have the same text, I should not display the item_name again; otherwise I should.
The row_number() analytic function is a neat way of implementing such distinct queries:
SELECT item_name
FROM (SELECT item_name,
             ROW_NUMBER() OVER (PARTITION BY text ORDER BY 1) AS rn
      FROM items
      WHERE item_no_parent = 123)
WHERE rn = 1
EDIT:
Some explanation, as requested in the comments: row_number is an analytic function (sometimes also referred to as a windowing function). It returns one result per row of input (like a row function), but takes into account all the other rows too (like an aggregate function). In this case, row_number simply returns the number of the current row (i.e., a simple counter). This counting is done separately for each distinct value of text (the partition by clause). row_number requires an order by clause so it knows in which order to count these rows. Since here we don't care which row (per distinct value of text) comes first, I simply order by a constant 1.
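A minimal sketch of this de-duplication, run from Python on an in-memory SQLite database (3.25 or later), might look as follows. The sample rows, including the duplicate "water" text and the second parent 456, are made up for illustration, and `distinct_item_names` is a hypothetical helper. One deliberate deviation from the query above: the window is ordered by item_no_child instead of a constant, so that the surviving row per text is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items ("
    "item_no_parent INTEGER, item_no_child INTEGER, item_name TEXT, text TEXT)"
)
# Hypothetical rows: children 3 and 6 of parent 123 share the same text.
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (123, 3, "xxx", "The item is resistant to water"),
        (123, 5, "yyy", "The item is resistant to heat"),
        (123, 6, "zzz", "The item is resistant to water"),
        (456, 7, "other", "The item is resistant to heat"),
    ],
)


def distinct_item_names(parent):
    """Return one item_name per distinct text among the children of parent."""
    rows = conn.execute(
        """
        SELECT item_name
        FROM (SELECT item_name,
                     ROW_NUMBER() OVER (PARTITION BY text
                                        ORDER BY item_no_child) AS rn
              FROM items
              WHERE item_no_parent = ?) numbered
        WHERE rn = 1
        ORDER BY item_name
        """,
        (parent,),
    ).fetchall()
    return [row[0] for row in rows]


print(distinct_item_names(123))  # ['xxx', 'yyy']
```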