In PromQL, I want to write:
If metric_a>=bool 3:
return metric_b
else:
return 1
I am thinking of writing it as:
(metric_b and metric_a>=3) or metric_a<bool 3
But I found that when I swap the operand order (A or B versus B or A), the query result changes. I am also not sure whether what I have actually expresses my if-else.
Why do the or/and operators give inconsistent results? And what is the best way to express an if-else statement here?
I checked your proposed solution:
(metric_b and metric_a>=3) or metric_a<bool 3
and it worked as expected, returning the value of metric_b when metric_a is >= 3, and 1 otherwise.
It's important to note that "VECTOR1 and VECTOR2" is not necessarily equal to "VECTOR2 and VECTOR1". Take a look at what the Prometheus documentation says about this:
vector1 and vector2 results in a vector consisting of the elements of
vector1 for which there are elements in vector2 with exactly matching
label sets. Other elements are dropped.
The results are always from the first vector of the "and" clause.
For example, swapping the two operands of "and" gives a different result, because the returned elements and their values come from whichever vector is written first.
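To make the ordering effect concrete, here is a minimal Python sketch that models the documented "and"/"or" matching on label sets. It is a toy model and not Prometheus code: the helper names (`vec_and`, `vec_or`, `ge_filter`, `lt_bool`), the `job` label, and the sample values are all assumptions made for illustration, and it assumes metric_a and metric_b carry identical label sets.

```python
# Toy model of PromQL instant vectors: {label_set: value}.
# Metric names are left out, since the set operators match on labels only.

def labels(**kv):
    return frozenset(kv.items())

def vec_and(v1, v2):
    """Elements of v1 whose label set also appears in v2."""
    return {k: v for k, v in v1.items() if k in v2}

def vec_or(v1, v2):
    """All elements of v1, plus elements of v2 whose label set is not in v1."""
    out = dict(v1)
    for k, v in v2.items():
        out.setdefault(k, v)
    return out

def ge_filter(v, threshold):
    """Model of `v >= threshold` (no bool): drop elements that fail."""
    return {k: x for k, x in v.items() if x >= threshold}

def lt_bool(v, threshold):
    """Model of `v <bool threshold`: keep every element, value is 1.0 or 0.0."""
    return {k: (1.0 if x < threshold else 0.0) for k, x in v.items()}

# Hypothetical sample data.
metric_a = {labels(job="x"): 5.0, labels(job="y"): 1.0}
metric_b = {labels(job="x"): 42.0, labels(job="y"): 7.0}

# (metric_b and metric_a>=3) or metric_a<bool 3
if_else = vec_or(vec_and(metric_b, ge_filter(metric_a, 3)), lt_bool(metric_a, 3))

# metric_a<bool 3 or (metric_b and metric_a>=3)
swapped = vec_or(lt_bool(metric_a, 3), vec_and(metric_b, ge_filter(metric_a, 3)))
```

In this model, `if_else` takes the metric_b value where metric_a >= 3 and 1 elsewhere, while `swapped` only ever contains the 0/1 comparison results, because "or" prefers whatever is on its left-hand side.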
As an example:
select
pre, sze, fdm_pre, val
from
form_data_stage
where
fdm_pre in (1,2,3,4)
order by
pre;
This will return values for all the columns pre, sze, fdm_pre, val for any of the fdm_pre values listed, i.e. (1,2,3,4). However, I only care about the pre and sze values when fdm_pre is 1.
I could write a query such as
select
case when fdm_pre = 1 then pre else null end as pre,
case when fdm_pre = 1 then sze else null end as sze,
fdm_pre,
val
from
form_data_stage
where
fdm_pre in (1,2,3,4)
order by
pre;
But, is there some standard way of dealing with this situation? Is it generally more efficient to return all the columns, even if they aren't used? Or, would it be better to do some conditional checking as in the second query? The pre and sze columns are integer values.
It's not efficient to return all the columns when they are not used, especially if the unused columns are large (BLOB, CLOB, TEXT, ARRAY, etc.).
In your particular example the columns "not returned" are small ones (measured in bytes), so it won't really matter if you produce nulls instead.
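As a quick check of what the CASE version returns, here is a small sketch using Python's built-in sqlite3 module. The table contents are made up for illustration, and the rows are ordered by fdm_pre (rather than pre) only to keep the output deterministic; other databases may differ in details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE form_data_stage (pre INTEGER, sze INTEGER, fdm_pre INTEGER, val TEXT)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO form_data_stage VALUES (?, ?, ?, ?)",
    [(10, 100, 1, "a"), (20, 200, 2, "b"), (30, 300, 5, "c")],
)

# CASE without ELSE yields NULL, so pre and sze are only filled when fdm_pre = 1.
rows = conn.execute(
    """
    SELECT CASE WHEN fdm_pre = 1 THEN pre END AS pre,
           CASE WHEN fdm_pre = 1 THEN sze END AS sze,
           fdm_pre,
           val
    FROM form_data_stage
    WHERE fdm_pre IN (1, 2, 3, 4)
    ORDER BY fdm_pre
    """
).fetchall()
```

The row with fdm_pre = 2 comes back with NULL (Python `None`) for pre and sze, and the row with fdm_pre = 5 is filtered out by the WHERE clause.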
You could build the query as a string and execute it dynamically instead of using this kind of script. Put the desired columns into the string and use an IF statement to decide which columns are shown and which are not needed.
I can put together a sample for you if you need one.
I think keeping the query simple and clear to read is important. I would suggest you return all the relevant columns (that might be useful) and on the app logic side deal with these 2 options of column usage.
What does this query try to achieve?
SELECT * FROM X WHERE (X.Y in (select Y from X))
As far as I can tell, it yields the same result as
SELECT * FROM X WHERE Y is not NULL
Is there anything more to the first query? The first query is actually very slow with a large dataset and hence I want to know whether I can replace it with the second query.
You are right, the two queries are equivalent.
It is unclear why the first query was written this way. Maybe it looked different at one point.
As is, your second query is better, because it is easier to read and understand (and even faster as you say).
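The equivalence can be checked on a small table. This is only a sketch using Python's sqlite3 module with made-up rows; it shows the behaviour for this sample, not a proof for every database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE X (id INTEGER, Y INTEGER)")
# Hypothetical rows, including one NULL in Y.
conn.executemany(
    "INSERT INTO X VALUES (?, ?)",
    [(1, 10), (2, None), (3, 10), (4, 20)],
)

in_rows = conn.execute(
    "SELECT id FROM X WHERE Y IN (SELECT Y FROM X) ORDER BY id"
).fetchall()

not_null_rows = conn.execute(
    "SELECT id FROM X WHERE Y IS NOT NULL ORDER BY id"
).fetchall()
```

Every non-NULL Y trivially appears in the subquery, and the row with a NULL Y is dropped because `NULL IN (...)` never evaluates to true, so both queries return the same ids.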
Your second query is better than the first one.
In the first query, a row whose Y is NULL never satisfies the IN condition, because comparing NULL with anything does not evaluate to true, so the exclusion of NULLs is only implicit there. The second query states the NULL check explicitly.
So how a row is treated depends on whether its Y value is NULL.
I just stumbled over jOOQ's maxDistinct SQL aggregation function.
What does MAX(DISTINCT x) do differently from just MAX(x)?
maxDistinct and minDistinct were defined in order to keep consistency with the other aggregate functions where having a distinct option actually makes a difference (e.g., countDistinct, sumDistinct).
Since the maximum (or minimum) calculated over the distinct values of a dataset is mathematically equivalent to the simple maximum (or minimum) of the same set, these functions are essentially redundant.
In short, there will be no difference. In the case of MySQL, this is even stated in the manual:
Returns the maximum value of expr. MAX() may take a string argument;
in such cases, it returns the maximum string value. See Section 8.5.3,
“How MySQL Uses Indexes”. The DISTINCT keyword can be used to find the
maximum of the distinct values of expr, however, this produces the
same result as omitting DISTINCT.
The reason it is allowed is to keep compatibility with other platforms. Internally, there is no difference: MySQL simply ignores DISTINCT here. It does not do anything extra with the set of rows (i.e. it does not produce a distinct set first). For indexed columns the plan will show "Select tables optimized away" (reading one value from the index rather than the table); for non-indexed columns it will be a full scan.
If I'm not mistaken, there is no difference.
For a column ID with the values:
ID
1
2
2
3
3
4
5
5
The output of both queries is the same: 5
MAX(DISTINCT x)
// ID = 1,2,2,3,3,4,5,5
// DISTINCT = 1,2,3,4,5
// MAX = 5
// 1 row
and for
MAX(x)
// ID = 1,2,2,3,3,4,5,5
// MAX = 5
// 1 row
Theoretically, DISTINCT x ensures that every element of the set is unique. The max operator selects the highest value from a set. In plain SQL there should be no difference between the two.
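The same ID values can be run through a quick check. This is a sketch using Python's sqlite3 module rather than MySQL or jOOQ, with a hypothetical table name `t`; SUM is included only to contrast an aggregate where DISTINCT does change the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(v,) for v in (1, 2, 2, 3, 3, 4, 5, 5)],
)

# DISTINCT makes no difference for MAX ...
max_plain = conn.execute("SELECT MAX(id) FROM t").fetchone()[0]
max_distinct = conn.execute("SELECT MAX(DISTINCT id) FROM t").fetchone()[0]

# ... but it does for SUM, where duplicates contribute to the total.
sum_plain, sum_distinct = conn.execute(
    "SELECT SUM(id), SUM(DISTINCT id) FROM t"
).fetchone()
```

Both MAX variants return 5, while SUM returns 25 with duplicates and 15 over the distinct values.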
What am I doing wrong with my SQL query? It always returns an empty result, even though matching values exist.
Here is my query:
SELECT *
FROM users
WHERE user_theme_id IN ( 9735, 9325, 4128 )
AND ( user_date_created BETWEEN '2013-06-04' AND '2013-06-10' );
When I split my original query into its parts and ran them one at a time, each returned results. Here is the first one:
SELECT * FROM users WHERE user_theme_id IN (9735, 9325, 4128 );
I got 3 rows for this result.
Now, the next query that I ran is this:
SELECT *
FROM users
WHERE user_date_created BETWEEN '2013-06-04' AND '2013-06-10';
I do get 3 results for this.
By the way, this query that uses BETWEEN should return 4 rows, but it only returns 3. It doesn't return the row whose created date is 2013-06-10 08:27:43.
What am I doing wrong with my original query? Why does it always return an empty result?
Getting results from each WHERE condition separately doesn't guarantee that combining the two conditions with AND will return anything.
There has to be an intersection of rows for AND to return a result.
You should check your data and see whether any overlap exists.
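The point about intersection can be illustrated with a small sketch in Python's sqlite3 module. The rows below are invented for the example (the original data was not shown), and the `user_id` column is an assumption added to identify rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER, user_theme_id INTEGER, user_date_created TEXT)"
)
# Hypothetical data: the theme matches and the date matches are different rows.
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, 9735, "2013-05-01 10:00:00"),
        (2, 9325, "2013-05-02 11:00:00"),
        (3, 1111, "2013-06-05 12:00:00"),
        (4, 2222, "2013-06-06 13:00:00"),
    ],
)

theme_cond = "user_theme_id IN (9735, 9325, 4128)"
date_cond = "user_date_created BETWEEN '2013-06-04' AND '2013-06-10'"

def ids(where):
    """Return the set of user_id values matching a WHERE clause."""
    return {r[0] for r in conn.execute(f"SELECT user_id FROM users WHERE {where}")}

by_theme = ids(theme_cond)
by_date = ids(date_cond)
both = ids(f"{theme_cond} AND ({date_cond})")
```

Each condition matches two rows on its own, but no row satisfies both, so the combined query is empty; `both` is exactly the set intersection of `by_theme` and `by_date`.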
I was able to make it work by not using the SQL BETWEEN operator and using comparison operators (>= and <=) instead.
I read on W3schools.com that SQL BETWEEN can produce different results in different databases.
This is what it says:
Notice that the BETWEEN operator can produce different result in different databases!
In some databases, BETWEEN selects fields that are between and excluding the test values.
In other databases, BETWEEN selects fields that are between and including the test values.
And in other databases, BETWEEN selects fields between the test values, including the first test value and excluding the last test value.
Therefore: Check how your database treats the BETWEEN operator!
That is what happened in the issue I was facing: the first test value was included and the second test value was excluded. Using the comparison operators gives an accurate result.
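One likely explanation for the missing 2013-06-10 08:27:43 row is that the upper bound '2013-06-10' carries no time part, so a timestamp later on that same day compares as greater than the bound even though BETWEEN itself is inclusive. The sketch below reproduces this with Python's sqlite3 module, where the values are compared as text; in MySQL the date-only literal is treated as midnight, which gives the same outcome for this case. The rows and the `user_id` column are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, user_date_created TEXT)")
# Hypothetical rows; the last one is created during the final day of the range.
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "2013-06-04 09:00:00"),
        (2, "2013-06-07 15:30:00"),
        (3, "2013-06-10 08:27:43"),
    ],
)

# Inclusive BETWEEN with a date-only upper bound misses the 08:27:43 row.
between_ids = [
    r[0]
    for r in conn.execute(
        "SELECT user_id FROM users "
        "WHERE user_date_created BETWEEN '2013-06-04' AND '2013-06-10' "
        "ORDER BY user_id"
    )
]

# Half-open range: from the start day up to, but excluding, the day after the end.
half_open_ids = [
    r[0]
    for r in conn.execute(
        "SELECT user_id FROM users "
        "WHERE user_date_created >= '2013-06-04' "
        "AND user_date_created < '2013-06-11' "
        "ORDER BY user_id"
    )
]
```

A half-open range with comparison operators is one way to cover the whole last day without depending on how the time part of the bound is filled in.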
A co-worker just came to me with a puzzling SQL query:
(essentially)
SELECT LEAST(id) FROM tableA A, tableB B WHERE a.name = b.name(+)
The result set returned lists three numbers however:
LEAST(id)
--------------
621
644
689
(all being IDs that meet the query as if it lacked the LEAST function altogether)
Why? =)
LEAST(x,y,...) is not an aggregate function. It works only on its parameters. The function you want is MIN(x).
For each record, you're running LEAST(id), which will always return id. If you were passing LEAST more parameters, you would see different results. For example, LEAST(5,6,7) = 5. LEAST always returns the smallest of its parameters, whereas MIN returns the smallest of every record.
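The difference between a per-row scalar function and an aggregate can be sketched in plain Python. The `least` helper below is a hypothetical stand-in for a scalar LEAST, not Oracle's implementation, and the ids are the ones from the result set above.

```python
def least(*args):
    """Stand-in for a scalar LEAST: smallest of the arguments of one call."""
    return min(args)

ids = [621, 644, 689]

# LEAST(id) is evaluated once per row with a single argument,
# so every row simply gets its own id back.
per_row = [least(row_id) for row_id in ids]

# MIN(id) is an aggregate: one value computed across all rows.
aggregate = min(ids)

# With several arguments, LEAST compares them within a single row.
example = least(5, 6, 7)
```

Here `per_row` still has three values, `aggregate` collapses them to the single smallest id, and `example` is 5.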