I'm trying to query a database, I need to get a list of customers where their weight is equal to 60.5. The problem is that 60.5 is a real I've never query a database with a real in a where clause before.
I've tried this:
SELECT Name FROM Customers WHERE Weight=60.5
SELECT Name FROM Customers WHERE Weight=cast(60.5 as real)
SELECT Name FROM Customers WHERE Weight=cast(60.5 as decimal)
SELECT Name FROM Customers WHERE Weight=convert(real,'60.5')
SELECT Name FROM Customers WHERE Weight=convert(decimal,'60.5')
These queries return 0 values but in the Customers table their are 10 rows with Weight=60.5
Your problem is that floating point numbers are inaccurate by definition. Comparing what seems to be 60.5 to a literal 60.5 might not work as you've noticed.
A typical solution is to measure the difference between 2 values, and if it's smaller then some predefined epsilon, consider them equal:
SELECT Name FROM Customers WHERE ABS(Weight-60.5) < 0.001
For better performance, you should actually use:
SELECT Name FROM Customers WHERE Weight BETWEEN 64.999 AND 65.001
If you need equality comparison, you should change the type of the column to DECIMAL. Decimal numbers are stored and compared exactly, while real and float numbers are approximations.
#Amit's answer will work, but it will perform quite poorly in comparison to my approach. ABS(Weight-60.5) < 0.001 is unable to use index seeks. But if you convert the column to DECIMAL, then Weight=60.5 will perform well and use index seeks.
Related
I'm new to this.
I have a column: (chocolate_weight) On the table : (Chocolate) which has g at the end of every number, so 30x , 2x5g,10g etc.
I want to remove the letter at the end and then query it to show any that weigh greater than 35.
So far I have done
Select *
From Chocolate
Where chocolate_weight IN
(SELECT
REPLACE(chocolote_weight,'x','') From Chocolate) > 35
It is coming back with 0 , even though there are many that weigh more than 35.
Any help is appreciated
Thanks
If 'g' is always the suffix then your current query is along the right lines, but you don't need the IN you can do the replace in the where clause:
SELECT *
FROM Chocolate
WHERE CAST(REPLACE(chocolate_weight,'g','') AS DECIMAL(10, 2)) > 35;
N.B. This works in both the tagged DBMS SQL-Server and MySQL
This will fail (although only silently in MySQL) if you have anything that contains units other than grams though, so what I would strongly suggest is that you fix your design if it is not too late, store the weight as an numeric type and lose the 'g' completely if you only ever store in grams. If you use multiple different units then you may wish to standardise this so all are as grams, or alternatively store the two things in separate columns, one as a decimal/int for the numeric value and a separate column for the weight, e.g.
Weight
Unit
10
g
150
g
1000
lb
The issue you will have here though is that you will have start doing conversions in your queries to ensure you get all results. It is easier to do the conversion once when the data is saved and use a standard measure for all records.
In SQL if I've the following select statement:
SELECT * FROM Products WHERE ProductName BETWEEN 'Geitost' AND 'Pavlova'
It run successfully and display the output records, but I'm a little bit confusing about how we can search between two strings and how the SQL perform it,
for example the following select statement:
SELECT * FROM Products WHERE Price BETWEEN 100 AND 300
It's clear that we are trying to find all the prices which has a price equals to 100$ or greater but less than or equals 300$
Can somebody explain what is the selecting mechanism between two strings in SQL?
Thanks in advance.
Your question is rather baffling. This code works:
WHERE Price BETWEEN 100 AND 300
It is turned into:
WHERE Price >= 100 AND Price <= 300
Similarly, your first condition is equivalent to:
WHERE ProductName >= 'Geitost' AND
ProductName <= 'Pavlova'
So the question is about how strings are compared. The short answer is "how they appear in the dictionary". Of course, there are lots of dictionaries in the world. The more complete answer is based on collations, which describe the alphabetic ordering of strings.
Here is the MySQL documentation for collations and character sets. Although some details such as the names of the collations and the syntax for functions may vary, the idea is similar across databases.
I'm trying to show that there are no significant differences between two tables using an EXCEPT query. I'm only including the fields that I care about comparing. The problem is that tons of differences are being picked up in the result set due to one of the tables having data in a float format and the other having Decimal(24,7). I want my EXCEPT query to only include differences that are greater than 1.
I tried casting the float to Decimal(24,7) as well as casting both to Decimal(24,2) but due to rounding there are still differences flagged. For example, one table might show 2.55 and the other 2.5499999. That gets flagged as a difference. If I truncate the values, I still get differences (2.55 vs 2.54). If I round them or cast as Decimal(24,2) this particular instance is fixed, but others show up (e.g. rounding 2.355 vs 2.35499999 causes 2.36 vs 2.35).
How can I cast or round the decimal values such that any differences less than 1 are not returned by my EXCEPT query?
Sample code:
SELECT name, weight FROM Table1
EXCEPT
SELECT name, weight FROM Table2
/* That returns thousands of differences. If I cast both weights as Decimal(24,2) I get far fewer differences, but I want to only show differences greater than 1. */
This is probably not appropriate for EXCEPT. But you can try using a smaller number of decimal places:
SELECT name, CAST(weight as DECIMAL(24, 3)) FROM Table1
EXCEPT
SELECT name, CAST(weight as DECIMAL(24, 3)) FROM Table2;
Alternatively, you can use NOT EXISTS:
SELECT name, weight
FROM Table1 t1
WHERE NOT EXISTS (SELECT 1
FROM table2 t2
WHERE t2.name = t1.name AND
ABS(t2.weight - t1.weight) < 0.00001
);
I'm trying to find geometric average of values from a table with millions of rows. For those that don't know, to find the geometric average, you mulitply each value times each other then divide by the number of rows.
You probably already see the problem; The number multiplied number will quickly exceed the maximum allowed system maximum. I found a great solution that uses the natural log.
http://timothychenallen.blogspot.com/2006/03/sql-calculating-geometric-mean-geomean.html
However that got me to wonder wouldn't the same problem apply with the arithmetic mean? If you have N records, and N is very large the running sum can also exceed the system maximum.
So how do RDMS calculate averages during queries?
I don't know an exact implementation for arithmetic mean in an RDBMS, nor did you specify one in your original question. But the RDBMS does not need to sum a million rows in a column in order to obtain the arithmetic mean. Consider the following summation:
sum = (x1 + x2 + x3 + ... + x1000000)
Then the mean can be written as
mean = sum / N = (x1 + x2 + x3 + ... + x1000000) / N, for N = 1,000,000
But this expression can be broken up into pieces like this:
mean = [(x1 + x2 + x3) / N ] + [(x4 + x5 + x6) / N] + ...
In other words, the RDBMS can simply scan down the million rows in a column and find the mean section by section, without running the risk of an overflow. And since each number in the column is presumably within range for the type storing it, there is no chance of the mean value itself overflowing.
Most databases don't support a product() function the way they support an average.
However, you can use do what you want with logs. The product (simplified) is like:
select exp(sum(ln(x)) as product
The average would be:
select power(exp(sum(ln(x))), 1.0 / count(*)) as geoaverage
or
select EXP(AVG(LN(x))) as geoaverage
The LN() function might be LOG() on some platforms...
These are schematics. The functions for exp() and ln() and power() vary, depending on the database. Plus, if you have to take into account zero or negative numbers, the logic is more complicated.
Very easy to check. For example, SQL Server 2008.
DECLARE #T TABLE(i int);
INSERT INTO #T(i) VALUES
(2147483647),
(2147483647);
SELECT AVG(i) FROM #T;
result
(2 row(s) affected)
Msg 8115, Level 16, State 2, Line 7
Arithmetic overflow error converting expression to data type int.
There is no magic. Column type is int, server adds values together using internal variable of the same type int and intermediary result exceeds range for int.
You can run the similar check for any other DBMS that you use. Different engines may behave differently, but I would expect all of them to stick to the original type of the column. For example, averaging two int values 100 and 101 may result in 100 or 101 (still int), but never 100.5.
For SQL Server this behavior is documented. I would expect something similar for all other engines:
AVG () computes the average of a set of values by dividing the sum of
those values by the count of nonnull values. If the sum exceeds the
maximum value for the data type of the return value an error will be
returned.
So, you have to be careful when calculating simple average as well, not just product.
Here is extract from SQL 92 Standard:
6) Let DT be the data type of the < value expression >.
9) If SUM or AVG is specified, then:
a) DT shall not be character string, bit string, or datetime.
b) If SUM is specified and DT is exact numeric with scale S, then the
data type of the result is exact numeric with implementation-defined
precision and scale S.
c) If AVG is specified and DT is exact numeric, then the data type of
the result is exact numeric with implementation- defined precision not
less than the precision of DT and implementation-defined scale not
less than the scale of DT.
d) If DT is approximate numeric, then the data type of the result is
approximate numeric with implementation-defined precision not less
than the precision of DT.
e) If DT is interval, then the data type of the result is inter- val
with the same precision as DT.
So, DBMS can convert int to larger type when calculating AVG, but it has to be an exact numeric type, not floating-point. In any case, depending on the values you can still get arithmetic overflow.
Some DBMS — specifically, the Informix DBMS — convert from an INT type to a floating point type to do the calculation:
SQL[2148]: create table t(i int);
SQL[2149]: insert into t values(214748347);
SQL[2150]: insert into t values(214748347);
SQL[2151]: insert into t values(214748347);
SQL[2152]: select avg(i) from t;
214748347.0
SQL[2153]: types on;
SQL[2154]: select i from t;
INTEGER
214748347
214748347
214748347
SQL[2155]: select avg(i) from t;
DECIMAL(32)
214748347.0
SQL[2156]:
Similarly with other types. This can still end with an overflow under some circumstances; you then get a runtime error. However, it is rather seldom that you exceed the precision — it typically takes a very large number of rows for the sum to exceed the limits, even if you're counting the US deficit over the next century in atto-Zimbabwean dollars circa 2009.
I am trying to pull data from a table with the filter weight < 25 kgs , but my table has weight in pounds, I tried using below sql can some one please tell me is this the right way to do it or is there any other way .
select * from dbo.abc
where (round((WEIGHT * 0.453592 ),0) < 25)
Your solution would work, but it's not sargaeble. A better solution would be to convert your 25kgs to lbs. That way, if you have an index on your WEIGHT column, the query analyzer could make use of it.
One additional note: Why round to 0 decimal places? You'll lose accuracy that way. Unless you have some requirement to do so, I'd drop the rounding. It's unnecessary overhead.
As other people mentioned, you don't want to convert weight as it will cause SQL Server not to use your index. So try this instead:
SELECT *
FROM dbo.acb
WHERE WEIGHT < ROUND(25/.453592,4)