I wanted to make something like:
insert into Table(a, b, c)
select a, b, 1 from OtherTable
where a > 0
But instead of the constant 1, c should be an enum value chosen by a condition on the selected row.
I would like the final statement to look something like this (even though it does not work):
insert into Table(a, b, c)
select a, b, x from OtherTable, YetAnotherTable
where a > 0 AND (IF a=b THEN x = 'Enum1' ELSE x = 'Enum2' ENDIF)
Do you think something like this is possible in a single statement?
You could use a case expression:
insert into Table(a, b, c)
select a,
b,
CASE WHEN a = b THEN 'Enum1' ELSE 'Enum2' END
from OtherTable, YetAnotherTable
where a > 0
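A minimal sketch of the same idea, run against an in-memory SQLite database so it can be tried without a server. The table names come from the question; the column types and sample rows are assumptions, and YetAnotherTable is left out because the question does not say how it joins to OtherTable.

```python
import sqlite3

# Assumed schema: only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OtherTable (a INTEGER, b INTEGER);
    CREATE TABLE "Table" (a INTEGER, b INTEGER, c TEXT);
    INSERT INTO OtherTable (a, b) VALUES (1, 1), (2, 5), (-3, -3);
""")

# CASE chooses the value of c row by row; WHERE only filters the rows.
conn.execute("""
    INSERT INTO "Table" (a, b, c)
    SELECT a, b, CASE WHEN a = b THEN 'Enum1' ELSE 'Enum2' END
    FROM OtherTable
    WHERE a > 0
""")

inserted = conn.execute('SELECT a, b, c FROM "Table" ORDER BY a').fetchall()
print(inserted)
```

The row with a = -3 is filtered out by the WHERE clause, and the two remaining rows get 'Enum1' or 'Enum2' depending on whether a equals b.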
Related
I have a table Table_A with columns A, B and C whereby column C needs to be summed, but only if column B is a certain value. Otherwise column C may not always contain a value to be summed.
So the normal SQL of:
SELECT A, B, SUM(C)
FROM Table_A
WHERE B = 'value condition'
GROUP BY A,B
Works well. However, I thought I could use CASE WHEN to return zero for the groups that do not match the condition, like this:
SELECT
A, B,
CASE WHEN B = 'value condition' THEN SUM(C) ELSE 0 END
FROM Table_A
GROUP BY A, B
I get an error, referring to the fact that it is still trying to SUM a value which is not in the condition. Am I missing something? Or have I misinterpreted CASE WHEN?
You want conditional aggregation. The CASE expression should appear inside SUM:
SELECT A, SUM(CASE WHEN B = 'value condition' THEN C ELSE 0 END) AS total
FROM Table_A
GROUP BY A;
Note that B probably does not belong in the GROUP BY clause, given that you want to conditionally aggregate using its values.
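To see the difference between filtering in WHERE and aggregating conditionally, here is a small sketch on an in-memory SQLite database. The sample rows and column types are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (A TEXT, B TEXT, C INTEGER);
    -- Hypothetical sample data.
    INSERT INTO Table_A (A, B, C) VALUES
        ('x', 'value condition', 10),
        ('x', 'value condition', 5),
        ('x', 'other', 99),
        ('y', 'other', 7);
""")

# Conditional aggregation: every A is kept, non-matching rows contribute 0.
totals = conn.execute("""
    SELECT A, SUM(CASE WHEN B = 'value condition' THEN C ELSE 0 END) AS total
    FROM Table_A
    GROUP BY A
    ORDER BY A
""").fetchall()

# Filtering in WHERE instead: groups with no matching rows disappear.
filtered = conn.execute("""
    SELECT A, SUM(C) AS total
    FROM Table_A
    WHERE B = 'value condition'
    GROUP BY A
    ORDER BY A
""").fetchall()

print(totals)
print(filtered)
```

With this data, 'y' shows up with a total of 0 in the first query and is missing from the second, which is usually the reason to prefer the CASE inside SUM.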
Here's a simple view I'd like to create, but I'd like only FOUR columns in the final view: a, b, c, e. I would like to define d for temporary use in determining the value of e, but then I do not want d to be part of the resulting view.
create view v as
select a, b, c, a+b+c as d,
case when d > 1000 then 1
when d > 100 then 2
when d > 10 then 3
else 4 END as e
from tbl;
Is there any way in Netezza SQL to define such a temporary value?
In this simple example, certainly I could replace each of my "when d" statements with "when a+b+c" every time, but my real-world scenario is more complex than illustrated here.
You can use a subquery:
create view v as
select a, b, c,
(case when d > 1000 then 1
when d > 100 then 2
when d > 10 then 3
else 4
end) as e
from (select tbl.*, (a + b + c) as d
from tbl
) t;
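As a quick check that d really stays out of the view, the following sketch builds the same view on an in-memory SQLite database. SQLite is only a stand-in here, not Netezza, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (a INTEGER, b INTEGER, c INTEGER);
    -- Hypothetical rows, one per CASE branch.
    INSERT INTO tbl VALUES (1000, 1, 1), (50, 50, 1), (5, 5, 1), (1, 1, 1);

    CREATE VIEW v AS
    SELECT a, b, c,
           CASE WHEN d > 1000 THEN 1
                WHEN d > 100 THEN 2
                WHEN d > 10 THEN 3
                ELSE 4
           END AS e
    FROM (SELECT tbl.*, (a + b + c) AS d
          FROM tbl
         ) t;
""")

cursor = conn.execute("SELECT * FROM v ORDER BY e")
columns = [col[0] for col in cursor.description]  # column names of the view
rows = cursor.fetchall()
print(columns)
print(rows)
```

The view exposes only a, b, c and e; d exists only inside the subquery, so the sum is written once however many WHEN branches use it.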
I have a table
a b c
1 2
1 3
1 4 1
2 1 2
Columns a and c should be combined if their values are the same. If they are not the same, one of them is always empty.
So the result should be:
a b
1 2
1 3
1 4
2 1
Is there any function that can be applied in PostgreSQL?
According to your description:
Columns a and c should be combined if their values are the same. If
they are not the same, one of them is always empty.
all you need is an unconditional COALESCE.
SELECT COALESCE(a, c) AS a, b FROM tbl;
Assuming that by "empty" you mean NULL, not an empty string (''), in which case you'd add NULLIF:
SELECT COALESCE(NULLIF(a, ''), c) AS a, b FROM tbl;
COALESCE works for multiple parameters:
SELECT COALESCE(a, c, d, e, f, g) AS a, b FROM tbl;
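The difference between the plain COALESCE and the NULLIF variant can be seen in a small sketch. It uses SQLite in place of PostgreSQL (both functions behave the same way for this case), and the sample rows, including which cells are NULL or empty strings, are assumptions, since the table in the question does not show that.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (a TEXT, b INTEGER, c TEXT);
    -- Hypothetical data: NULL in a, NULL in c, and an empty string in a.
    INSERT INTO tbl VALUES
        ('1', 2, NULL),
        (NULL, 3, '1'),
        ('',  4, '1'),
        ('2', 1, '2');
""")

# COALESCE returns the first non-NULL argument; '' is not NULL, so it is kept.
plain = conn.execute(
    "SELECT COALESCE(a, c) AS a, b FROM tbl ORDER BY b"
).fetchall()

# NULLIF(a, '') turns '' into NULL first, so COALESCE falls through to c.
with_nullif = conn.execute(
    "SELECT COALESCE(NULLIF(a, ''), c) AS a, b FROM tbl ORDER BY b"
).fetchall()

print(plain)
print(with_nullif)
```

Only the row with b = 4 differs: the plain version returns the empty string, the NULLIF version returns the value from c.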
Are you looking for something like this?
SELECT COALESCE(c, a), b
FROM your_table
WHERE COALESCE(c, a) = a
Hive rejects this code:
select a, b, a+b as c
from t
where c > 0
with the error: Invalid table alias or column reference 'c'.
Do I really need to write something like this?
select * from
(select a, b, a+b as c
from t)
where c > 0
EDIT:
The computation of c is complex enough that I do not want to repeat it, as in where a + b > 0.
I need a solution that works in Hive.
Use a Common Table Expression if you want to use derived columns.
with x as
(
select a, b, a+b as c
from t
)
select * from x where c >0
Alternatively, you can skip the Common Table Expression and repeat the expression in the WHERE clause:
select a, b, a+b as c
from t
where a+b > 0
Refer to the logical order of query processing below to see whether a derived column can be used in another clause: an alias defined in SELECT is only visible to the phases that run after SELECT.
Keyed-In Order
SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
Logical Query Processing Phases
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
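A small sketch comparing the two rewrites on sample data. It runs on SQLite rather than Hive, and SQLite happens to be more lenient about aliases in WHERE than Hive is, so this only shows that the CTE form and the repeated-expression form return the same rows. The sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER);
    -- Hypothetical rows: a + b is 3, -2, 0 and 3.
    INSERT INTO t VALUES (1, 2), (-5, 3), (0, 0), (4, -1);
""")

# The alias c is defined inside the CTE, so the outer WHERE can see it.
cte_rows = conn.execute("""
    WITH x AS (
        SELECT a, b, a + b AS c
        FROM t
    )
    SELECT a, b, c FROM x WHERE c > 0 ORDER BY a
""").fetchall()

# Same result with the expression repeated in WHERE.
repeated_rows = conn.execute("""
    SELECT a, b, a + b AS c
    FROM t
    WHERE a + b > 0
    ORDER BY a
""").fetchall()

print(cte_rows)
print(repeated_rows)
```

Both queries keep only the rows whose sum is positive; the CTE version has the advantage that a complicated expression for c is written once.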
You are close. You can do this:
select a, b, a+b as c
from t
where a+b > 0
It would have to look like this:
select a, b, a+b as c
from t
where a+b > 0
An easy way to remember this: a WHERE clause cannot reference a column alias assigned in the SELECT list of the same query. It would, however, work if you did this:
SELECT a,b,c
FROM(
select a, b, a+b as c
from t) calc
WHERE c > 0
This syntax would work because the alias is assigned in a subquery.
no
just:
select a, b, a+b as c
from t
where a+b > 0
Note: in MySQL at least, ORDER BY and GROUP BY accept column (or expression) positions.
E.g. group by 2 order by 1 groups the rows by the second item in the select list (whether a column name or an expression) and orders the result by the first item.
Also, some RDBMSs do let you refer to the column alias in WHERE, as you first attempted.
Given a table, t:
a b c d e
1 2 3 4 7
1 2 3 5 7
3 2 4 6 7
3 2 4 6 8
What SQL query can identify the columns that have one or more instances of varying values associated with each tuple from columns a and b?
In table t above, columns d and e would satisfy this criterion but not column c.
For tuples <1,2> and <3,2> that come from columns a and b, column c doesn't have varying values for each tuple.
Column d has one instance of varying values for tuple <1,2> -- values 4 and 5.
Column e also has one instance of varying values for tuple <3,2> -- values 7 and 8.
Something like this should work for you using CASE, COUNT and GROUP BY:
select
a, b,
case when count(distinct c) > 1 then 'yes' else 'no' end colc,
case when count(distinct d) > 1 then 'yes' else 'no' end cold,
case when count(distinct e) > 1 then 'yes' else 'no' end cole
from t
group by a, b
Slightly indirectly:
SELECT a, b,
COUNT(DISTINCT c) AS num_c,
COUNT(DISTINCT d) AS num_d,
COUNT(DISTINCT e) AS num_e
FROM t
GROUP BY a, b;
This yields:
1 2 1 2 1
3 2 1 1 2
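The counts above can be reproduced with a short sketch that loads the table t from the question into an in-memory SQLite database. The integer column types are an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER, e INTEGER);
    -- The four rows from the question.
    INSERT INTO t VALUES
        (1, 2, 3, 4, 7),
        (1, 2, 3, 5, 7),
        (3, 2, 4, 6, 7),
        (3, 2, 4, 6, 8);
""")

# One row per (a, b) with the number of distinct values seen in c, d and e.
counts = conn.execute("""
    SELECT a, b,
           COUNT(DISTINCT c) AS num_c,
           COUNT(DISTINCT d) AS num_d,
           COUNT(DISTINCT e) AS num_e
    FROM t
    GROUP BY a, b
    ORDER BY a, b
""").fetchall()

print(counts)
```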
If the num_c or num_d or num_e column has a value greater than 1, then there are varying values. You can vary the query to list whether the column is varying for a given value of (a, b) by using a CASE expression like this:
-- v for varying, n for non-varying
SELECT a, b,
CASE WHEN COUNT(DISTINCT c) > 1 THEN 'v' ELSE 'n' END AS num_c,
CASE WHEN COUNT(DISTINCT d) > 1 THEN 'v' ELSE 'n' END AS num_d,
CASE WHEN COUNT(DISTINCT e) > 1 THEN 'v' ELSE 'n' END AS num_e
FROM t
GROUP BY a, b;
This yields:
1 2 n v n
3 2 n n v
If you really want just to know whether any set of values in the given column varies for any values of (a, b) — and not which values of (a, b) it varies for — you can use the query above as a sub-query in the FROM clause and organize things as you want.
SELECT MAX(num_c) AS num_c,
MAX(num_d) AS num_d,
MAX(num_e) AS num_e
FROM (SELECT a, b,
CASE WHEN COUNT(DISTINCT c) > 1 THEN 'v' ELSE 'n' END AS num_c,
CASE WHEN COUNT(DISTINCT d) > 1 THEN 'v' ELSE 'n' END AS num_d,
CASE WHEN COUNT(DISTINCT e) > 1 THEN 'v' ELSE 'n' END AS num_e
FROM t
GROUP BY a, b
) s;
This relies on 'v' sorting after 'n' in a string comparison, so MAX returns 'v' whenever any group varies; it is easy enough (and convenient enough) for this binary decision, but not necessarily convenient or easy if there are, say, 4 states to map.
This yields:
n v v
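For completeness, a sketch that runs the collapsing query on the table t from the question and turns the flags into a list of column names. It uses an in-memory SQLite database; the subquery alias s and the final Python step are additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER, e INTEGER);
    -- The four rows from the question.
    INSERT INTO t VALUES
        (1, 2, 3, 4, 7),
        (1, 2, 3, 5, 7),
        (3, 2, 4, 6, 7),
        (3, 2, 4, 6, 8);
""")

# The inner query flags each (a, b) group; MAX picks 'v' if any group has it,
# because 'v' compares greater than 'n'.
flags = conn.execute("""
    SELECT MAX(num_c) AS num_c,
           MAX(num_d) AS num_d,
           MAX(num_e) AS num_e
    FROM (SELECT a, b,
                 CASE WHEN COUNT(DISTINCT c) > 1 THEN 'v' ELSE 'n' END AS num_c,
                 CASE WHEN COUNT(DISTINCT d) > 1 THEN 'v' ELSE 'n' END AS num_d,
                 CASE WHEN COUNT(DISTINCT e) > 1 THEN 'v' ELSE 'n' END AS num_e
          FROM t
          GROUP BY a, b
         ) s
""").fetchone()

# Names of the columns that vary for at least one (a, b) pair.
varying = [name for name, flag in zip(("c", "d", "e"), flags) if flag == "v"]
print(flags)
print(varying)
```

This matches the expectation stated in the question: d and e vary within some (a, b) pair, c does not.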