In SQl its common to take the SUM of all values from a column
SELECT SUM(column_name)
FROM table_name
WHERE condition;
Can the same thing be done but instead of summing each value in an attribute, each value is "OR'd" together? This would only work if the attribute was boolean of course
MAX(booleancolumn) will return OR'd together. (Since TRUE > FALSE.)
Note that this will not work if null values are involved, because FALSE OR NULL evaluates to NULL.
There are two challenges here:
SQL doesn't have boolean type
This isn't part of any of the built-in aggregate functions.
However, some databases will allow you to define your own functions, including aggregate functions. So it is at least possible.
Related
a row of my data looks like this
[someId, someBool, someInt]
I'm looking for a way to aggregate someInt (to put them in an array specifically).
I use a GROUP BY clause to group by the someId field, then I can aggregate all the someInt using ARRAY_AGG but I only want to include rows where someBool=TRUE. How to approach this the right way ?
PS: It might be relevant to note what I got several booleans like someBool and would like to output to a different array each time
You can use ARRAY_AGG with IGNORE NULLS, e.g.:
ARRAY_AGG(IF(someBool IS NOT TRUE, NULL, someId) IGNORE NULLS)
This will only aggregate the IDs for which someBool is true. If you have multiple boolean columns that you want to use in the condition, you can AND them together or use a CASE WHEN ... or whatever other kind of condition you want that produces NULL in order to exclude a value.
Is there any generic command or syntax in SQL to allow one to have multiple values in a where statement without using OR?
OR just gets tedious when you have many values to choose from and you only want say half of them.
I want to return only columns that contain certain values. I am using Cache SQL, but as I said, a generic syntax might be helpful as well because most people are unfamiliar with Cache SQL. Thanks!
You should use IN:
... where column_name in ('val1', 'val2', ...);
Use the 'IN' clause.
SELECT * FROM product WHERE productid IN (1,2,3)
I believe you are looking for the IN clause.
Determines whether a specified value matches any value in a subquery or a list.
I would like the aggregates of an empty result set to be 0. I have tried the following:
SELECT SUM(COALESCE(capacity, 0))
FROM objects
WHERE null IS NOT NULL;
Result:
sum
-----
(1 row)
Subquestion: wouldn't the above work in Oracle, using SUM(NVL(capacity, 0))?
From the documentation page about aggregate functions:
It should be noted that except for count, these functions return a null value when no rows are selected. In particular, sum of no rows returns null, not zero as one might expect. The coalesce function may be used to substitute zero for null when necessary.
So, if you want to guarantee a value returned, apply COALESCE to the result of SUM, not to its argument:
SELECT COALESCE(SUM(capacity), 0) …
As for the Oracle 'subquestion', well, I couldn't find any notion of NULLs at the official doc page (the one for 10.2, in particular), but two other sources are unambiguous:
Oracle SQL Functions:
SUM([DISTINCT] n) Sum of values of n, ignoring NULLs
sum aggregate function [Oracle SQL]:
…if a sum() is created over some numbers, nulls are disregarded, as the following example shows…
That is, you needn't apply NVL to capacity. (But, like with COALESCE in PostgreSQL, you might want to apply it to SUM.)
The thing is, the aggregate always returns a row, even if no rows were aggregated (as is the case in your query). You summed an expression over no rows. Hence the null value you're getting.
Try this instead:
select coalesce(sum(capacity),0)
from objects
where false;
Just do this:
SELECT COALESCE( SUM(capacity), 0)
FROM objects
WHERE null IS NOT NULL;
By the way, COALESCE inside of SUM is redundant, even if capacity is NULL, it won't make the summary null.
To wit:
create table objects
(
capacity int null
);
insert into objects(capacity) values (1),(2),(NULL),(3);
select sum(capacity) from objects;
That will return a value of 6, not null.
And a coalesce inside an aggregate function is a performance killer too, as your RDBMS engine cannot just rip through all the rows, it has to evaluate each row's column if its value is null. I've seen a bit OCD query where all the aggregate queries has a coalesce inside, I think the original dev has a symptom of Cargo Cult Programming, the query is way very sloooowww. I removed the coalesce inside of SUM, then the query become fast.
Although this post is very old, but i would like to update what I use in such cases
SELECT NVL(SUM(NVL(capacity, 0)),0)
FROM objects
WHERE false;
Here external NVL avoids the cases when there is no row in the result set. Inner NVL is used for null column values, consider the case of (1 + null) and it will result in null. So inner NVL is also necessary other wise in alternate set default value 0 to the column.
I have a Postgresql function which returns a composite type defined as (location TEXT, id INT). When I run "SELECT myfunc()", My output is a single column of type text, formatted as:
("locationdata", myid)
This is pretty awful. Is there a way to select my composite so that I get 2 columns back - a TEXT column, and an INT column?
Use:
SELECT *
FROM myfunc()
You can read more about the functionality in this article.
Answer has already been accepted, but I thought I'd throw this in:
It may help to think about the type of the data and where those types fit into an overall query. SQL queries can return essentially three types:
A single scalar value
A list of values
A table of values
(Of course, a list is just a one-column table, and a scalar is just a one-value list.)
When you look at the types, you see that an SQL SELECT query has the following template:
SELECT scalar(s)
FROM table
WHERE boolean-scalar
If your function or subquery is returning a table, it belongs in the FROM clause. If it returns a list, it could go in the FROM clause or it could be used with the IN operator as part of the WHERE clause. If it returns a scalar, it can go in the SELECT clause, the FROM clause, or in a boolean predicate in the WHERE clause.
That's an incomplete view of SELECT queries, but I've found it helps to figure out where my subqueries should go.
Are there any constants in T-SQL like there are in some other languages that provide the max and min values ranges of data types such as int?
I have a code table where each row has an upper and lower range column, and I need an entry that represents a range where the upper range is the maximum value an int can hold(sort of like a hackish infinity). I would prefer not to hard code it and instead use something like SET UpperRange = int.Max
There are two options:
user-defined scalar function
properties table
In Oracle, you can do it within Packages - the closest SQL Server has is Assemblies...
I don't think there are any defined constants but you could define them yourself by storing the values in a table or by using a scalar valued function.
Table
Setup a table that has three columns: TypeName, Max and Min. That way you only have to populate them once.
Scalar Valued Function
Alternatively you could use scalar valued functions GetMaxInt() for example (see this StackOverflow answer for a real example.
You can find all the max/min values here: http://msdn.microsoft.com/en-us/library/ms187752.aspx
Avoid Scalar-Functions like the plague:
Scalar UDF Performance Problem
That being said, I wouldn't use the 3-Column table another person suggested.
This would cause implicit conversions just about everywhere you'd use it.
You'd also have to join to the table multiple times if you needed to use it for more than one type.
Instead have a column for each Min and Max of each Data Type (defined using it's own data type) and call those directly to compare to.
Example:
SELECT *
FROM SomeTable as ST
CROSS JOIN TypeRange as TR
WHERE ST.MyNumber BETWEEN TR.IntMin AND TR.IntMax