How to calculate the ratio of this column with 2 rows - sql

I am very new to SQL and am having difficulty figuring out how to divide row 1 (101) by row 2 (576).
COUNT
101
576
I want the output to be a single value expressed to 2 decimal places.
Any tips?
Thanks for the help

For two rows, it's easy.
If you have a big input table and you want to divide the first row by the second, the third row by the fourth, and so on, then you need an ordering column to define which row comes first.
With just a two-row table (remember, tables are never ordered), you can simply rely on the fact that you are dividing the smaller number by the bigger one.
Here goes:
WITH
-- your input ...
input(counter) AS (  -- count is a reserved word, use another name ...
  SELECT 101
  UNION ALL SELECT 576
)
-- cheat and just divide the smaller by the bigger,
-- as @Gordon Linoff suggests;
-- force a float division by adding a non-integer operand
-- and hard-cast the result to DECIMAL(5,2)
SELECT
  CAST(
    MIN(counter) * 1.00 / MAX(counter)
    AS DECIMAL(5,2)
  ) AS result
FROM input;
-- out result
-- out ----------
-- out 0.18
If, however, you have many rows, and you always need to divide the first row by the second, the third by the fourth, and so on (each odd row in the order by the next even row), then you need an ordering column.
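For that many-rows case, here is a minimal sketch using SQLite's window functions (available in SQLite 3.25+), with a made-up table `input_rows` and an explicit ordering column `ord`; `LEAD()` pulls the next row's value alongside each row, and keeping only the odd-`ord` rows pairs row 1 with row 2, row 3 with row 4, and so on:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE input_rows (ord INTEGER, counter INTEGER)")
con.executemany("INSERT INTO input_rows VALUES (?, ?)",
                [(1, 101), (2, 576), (3, 50), (4, 200)])

# LEAD() fetches the following row's counter in ord order; the outer
# filter keeps only odd-ord rows, so each ratio is odd-row / next-even-row.
rows = con.execute("""
    SELECT ord, ratio
    FROM (SELECT ord,
                 ROUND(counter * 1.0 / LEAD(counter) OVER (ORDER BY ord), 2)
                     AS ratio
          FROM input_rows)
    WHERE ord % 2 = 1
""").fetchall()
print(rows)  # [(1, 0.18), (3, 0.25)]
```

The same `ROUND(... * 1.0 / ..., 2)` trick as above forces float division and trims the result to 2 decimal places.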
Is your problem just what you suggested, or is there more to it?

There is no such thing as row "1" or "2" in a table. Tables represent unordered sets, so without a column specifying the ordering, there is no first or second row.
You can use aggregation to divide the min by the max:
select min(count) * 1.0 / max(count)
from t;
Note the * 1.0. Postgres does integer division, so you want to convert to something with a decimal point.
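A quick way to see why the `* 1.0` matters, using SQLite as a stand-in (it truncates integer division just as Postgres does):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Both operands are integers, so the division truncates to 0
assert con.execute("SELECT 101 / 576").fetchone()[0] == 0
# Multiplying by 1.0 first promotes the expression to floating point
ratio = con.execute("SELECT ROUND(101 * 1.0 / 576, 2)").fetchone()[0]
print(ratio)  # 0.18
```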

Related

Query smallest number of rows to match a given value threshold

I would like to create a query that operates similar to a cash register. Imagine a cash register full of coins of different sizes. I would like to retrieve a total value of coins in the fewest number of coins possible.
Given this table:
id | value
---+------
 1 |   100
 2 |   100
 3 |   500
 4 |   500
 5 |  1000
How would I query for a list of rows that:
has a total value of AT LEAST a given threshold
with the minimum excess value (value above the threshold)
in the fewest possible rows
For example, if my threshold is 1050, this would be the expected result:
id | value
---+------
 1 |   100
 5 |  1000
I'm working with postgres and elixir/ecto. If it can be done in a single query great, if it requires a sequence of multiple queries no problem.
I had a go at this myself, using answers from previous questions:
Using ABS() to order by the closest value to the threshold
Select rows until a sum reduction of a single column reaches a threshold
Based on @TheImpaler's comment above, this prioritises the minimum number of rows over the minimum excess. It's not 100% what I was looking for, so I'm open to improvements if anyone has any, but if not I think this is going to be good enough:
-- outer query selects all rows underneath the threshold
-- inner subquery adds a running total column
-- window function orders by the difference between value and threshold
SELECT *
FROM (
  SELECT
    i.*,
    SUM(i.value) OVER (
      ORDER BY
        ABS(i.value - $THRESHOLD),
        i.id
    ) AS total
  FROM inputs i
) t
WHERE t.total - t.value < $THRESHOLD;
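The query can be exercised end to end in SQLite (3.25+ for window functions), substituting a bound parameter for `$THRESHOLD`:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE inputs (id INTEGER PRIMARY KEY, value INTEGER)")
con.executemany("INSERT INTO inputs VALUES (?, ?)",
                [(1, 100), (2, 100), (3, 500), (4, 500), (5, 1000)])

THRESHOLD = 1050
# Running total in order of closeness to the threshold; a row is kept
# while the total *before* it (total - value) is still under the threshold.
rows = con.execute("""
    SELECT id, value FROM (
        SELECT i.*,
               SUM(i.value) OVER (ORDER BY ABS(i.value - ?), i.id) AS total
        FROM inputs i
    ) t
    WHERE t.total - t.value < ?
""", (THRESHOLD, THRESHOLD)).fetchall()
```

With a threshold of 1050 this picks ids 3 and 5 (total 1500): the same row count as the asker's ideal of ids 1 and 5, but with more excess, which is exactly the trade-off the answer acknowledges.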

SQL Server : execution plan explanation

I have this execution plan, but I can only guess at what's happening here. I think the plan divides my query into 2 intervals, which are inner joined, but I have no clue what the upper row starting with the Merge Interval means.
My plan on Brentozar
My query is:
select Price, ID
from product
where price <> 800
ID is my primary key, and Price has an index.
Thank you in advance
This apparatus is used for "dynamic seeks" in SQL Server. Often you will see this due to mismatched datatypes.
In your case the literal 800 is auto parameterised to an int parameter and the plan later has a CONVERT_IMPLICIT(float(53),[#1],0) to convert it to the datatype of the price column. I replace this with 800e0 below (one way of declaring that value as a float literal)
(sidenote: price should not be float - you should use a precise datatype such as decimal)
Node 6 outputs a single row with no columns. The compute scalar in Node 5 adds three columns to the row with values (NULL, 800e0, 10).
Node 8 outputs a single row with no columns. The compute scalar in Node 7 adds three columns to the row with values (800e0, NULL, 6).
Node 4 is a concatenation operator that UNION ALLs the above two rows together. The resultant columns are aliased as (Expr1009, Expr1010, Expr1011) - these correspond to (startOfRange, endOfRange, Flags) - NULL here means "unbounded"
Nodes 3, 2 and 1 are concerned with ordering the ranges so that overlapping ones can be collapsed; they have no effect in this case.
Node 9 is an index seek that is executed twice (for the two rows on the outside of the join). This has a seek predicate of Price > Expr1009 AND Price < Expr1010 - i.e. Price > startOfRange AND Price < endOfRange. So it is called for range (NULL, 800) and range (800, NULL)
So the net effect of all this is that the <> 800 predicate gets converted into two index seeks: one on < 800 and the other on > 800.
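That rewrite can be sanity-checked in any database; here is a tiny SQLite sketch (table contents invented for illustration) showing that `<> 800` and the pair of open-ended ranges select the same rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE product (ID INTEGER PRIMARY KEY, Price REAL)")
con.executemany("INSERT INTO product VALUES (?, ?)",
                [(1, 500.0), (2, 800.0), (3, 900.0), (4, 800.0)])

# The optimizer's rewrite: <> becomes two open-ended ranges
neq = con.execute(
    "SELECT ID FROM product WHERE Price <> 800 ORDER BY ID").fetchall()
ranges = con.execute(
    "SELECT ID FROM product WHERE Price < 800 OR Price > 800 ORDER BY ID"
).fetchall()
assert neq == ranges == [(1,), (3,)]
```

The difference in SQL Server is only that each range becomes a separate seek on the Price index rather than a scan.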

SQLite3 Order by highest/lowest numerical value

I am trying to do a query in SQLite3 to order a column by numerical value. Instead of getting the rows ordered by the numerical value of the column, the rows are ordered alphabetically by the first digit's numerical value.
For example in the query below 110 appears before 2 because the first digit (1) is less than two. However the entire number 110 is greater than 2 and I need that to appear after 2.
sqlite> SELECT digit,text FROM test ORDER BY digit;
1|one
110|One Hundred Ten
2|TWO
3|Three
sqlite>
Is there a way to make 110 appear after 2?
It seems like digit is stored as a string, not as a number. You need to convert it to a number to get the proper ordering. A simple approach uses:
SELECT digit, text
FROM test
ORDER BY digit + 0
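A small reproduction with Python's sqlite3 module shows both orderings side by side:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE test (digit TEXT, text TEXT)")
con.executemany("INSERT INTO test VALUES (?, ?)",
                [("1", "one"), ("110", "One Hundred Ten"),
                 ("2", "TWO"), ("3", "Three")])

# Text affinity sorts character by character ...
as_text = [r[0] for r in con.execute(
    "SELECT digit FROM test ORDER BY digit")]
# ... while `+ 0` coerces each value to a number first
as_number = [r[0] for r in con.execute(
    "SELECT digit FROM test ORDER BY digit + 0")]
print(as_text)    # ['1', '110', '2', '3']
print(as_number)  # ['1', '2', '3', '110']
```

The cleaner long-term fix is to declare the column with INTEGER affinity so the values are stored as numbers in the first place.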

Storing extremely small values in Amazon Redshift

I am creating a table in Amazon Redshift using the following command:
CREATE TABLE asmt.incorrect_question_pairs_unique AS
SELECT question1,
       question2,
       occurrences,
       occurrences / (SUM(occurrences)::FLOAT) OVER () AS prob_q1_q2
FROM (SELECT question1,
             question2,
             SUM(occurrences) AS occurrences
      FROM asmt.incorrect_question_pairs
      GROUP BY question1,
               question2
      HAVING SUM(occurrences) >= 50) t;
I also tried an alternate:
CREATE TABLE asmt.incorrect_question_pairs_unique AS
SELECT question1,
       question2,
       occurrences,
       occurrences::FLOAT / SUM(occurrences) OVER () AS prob_q1_q2
FROM (SELECT question1,
             question2,
             SUM(occurrences) AS occurrences
      FROM asmt.incorrect_question_pairs
      GROUP BY question1,
               question2
      HAVING SUM(occurrences) >= 50) t;
I'd like the column prob_q1_q2 to be a float column, which is why I am converting the denominator/numerator to float. But in the resulting table, I get all zeros in that column.
I would like to point out that the SUM(occurrences) would amount to about 10 Billion, so the column prob_q1_q2 will contain extremely small values. Is there a way to store such small values in Amazon Redshift?
How do I make sure that all the values in the column are non-zero float?
Any help would be appreciated.
METHOD 1 - I have had the same problem! In my case it was millions of rows, so I multiplied the result by 10,000; whenever I wanted to select values from that column I would divide by 10,000 in the select statement to even it out. I know it's not the perfect solution, but it works for me.
METHOD 2 - I created a sample table with a NUMERIC(12,6) datatype, and when I imported a result set similar to yours, I could see the float values up to 6 decimal places of precision.
I guess the conversion does not work when you use the CREATE TABLE AS command; you need to create the table first, specifying a datatype that forces the result set to be stored at a certain precision. It's odd how the same select returns 0.00, but when inserted into a table with an enforced column type, it returns 0.00333.
If I’ve made a bad assumption please comment and I’ll refocus my answer.
Patthebug,
You might be getting a number far too small to be stored in Amazon Redshift's FLOAT type. Try using DECIMAL instead; it is a 128-bit type with plenty of room for your value.
The way it works is the following: if the value is too big, or in your case too small, and it exceeds the max/min of your type, the last digits are trimmed and the trimmed value is stored in the variable/column.
When a big value is trimmed you lose almost nothing; say you trim 20 cents out of 20 billion dollars, you won't be hurt much. But when the number is very small, you can lose everything once the last digits are trimmed to fit the type.
(E.g. a type can store up to 5 digits and you want to store the value 0.000009 in a variable/column of this type. Your value cannot fit, so the last 2 digits are trimmed, and you end up with a new value of 0.0000.)
So, if you followed my reasoning, just changing the ::float to ::decimal should fix your issue.
P.S. DECIMAL may require an explicit size, e.g. DECIMAL(38,37), which is the maximum Redshift allows.
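As a rough illustration of the point, here is Python's decimal module standing in for Redshift's DECIMAL, configured with Redshift's 38-digit maximum precision: a value at the very bottom of the DECIMAL(38,37) range survives exactly, with no digits trimmed away.

```python
from decimal import Decimal, getcontext

# Mirror Redshift's maximum DECIMAL precision of 38 digits
getcontext().prec = 38
tiny = Decimal("0.0000000000000000000000000000000000001")  # 1E-37
assert tiny > 0            # the value is preserved ...
assert tiny * 10**37 == 1  # ... exactly, to the last digit
print(tiny)  # 1E-37
```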
Try:
select cast(num1 as float) / cast(num2 as float);
This will give you results up to 2 decimal places (by default), but takes up some of your processing time. Doing anything else will round off the decimal part.
You can have up to 38 digits in a DECIMAL/NUMERIC column, with up to 37 digits of scale.
CREATE TEMP TABLE precision_test (test NUMERIC(38,37)) DISTSTYLE ALL
;
INSERT INTO precision_test
SELECT CAST( 0.0000000000000000000000000000000000001 AS NUMERIC(38,37)) test
;
SELECT * FROM precision_test
;
--Returns 0.0000000000000000000000000000000000001

Select results from the middle of a sorted list?

I've got a simple table with 300 rows, and after ordering them I want to select rows 11-50. Do I limit by 50 and remove the top 10 rows somehow?
SELECT *
FROM table
ORDER BY somecolumn
LIMIT 10,40
From MySQL's manual:
The LIMIT clause can be used to constrain the number of rows returned by the SELECT statement. LIMIT takes one or two numeric arguments, which must both be nonnegative integer constants (except when using prepared statements).
With two arguments, the first argument specifies the offset of the first row to return, and the second specifies the maximum number of rows to return. The offset of the initial row is 0 (not 1)
The LIMIT syntax includes an offset value, so you'd use:
LIMIT 10, 40
...to get rows 11 - 50, because the initial offset row is zero (not 1).
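The same two-argument offset syntax happens to work in SQLite, which makes the zero-based offset easy to verify from Python:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (somecolumn INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 301)])

# LIMIT 10, 40: skip the first 10 rows, then return the next 40
rows = [r[0] for r in con.execute(
    "SELECT somecolumn FROM t ORDER BY somecolumn LIMIT 10, 40")]
print(rows[0], rows[-1], len(rows))  # 11 50 40
```

Note that the more portable spelling is `LIMIT 40 OFFSET 10`, which standard-leaning databases such as Postgres also accept.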