Replacing a query with SELECT * FROM X WHERE Y is not NULL - sql

What does this query try to achieve?
SELECT * FROM X WHERE (X.Y in (select Y from X))
As far as I figured, it is yielding me the same result as
SELECT * FROM X WHERE Y is not NULL
Is there anything more to the first query? The first query is actually very slow with a large dataset and hence I want to know whether I can replace it with the second query.

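You are right, the two queries are equivalent. For a non-NULL Y the value always appears in the subquery result, and for a NULL Y the IN comparison is never true, so both queries return exactly the rows where Y is not NULL.
It is unclear why the first query was written this way. Maybe it looked different once.
As is, your second query is better because it is easier to read and understand (and, as you say, faster).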
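A quick way to check the equivalence on a small sample is sketched below. It uses Python's built-in sqlite3 module with a hypothetical table X whose id column and sample values are assumptions made for the illustration; other database engines should follow the same NULL semantics for IN, but that is not verified here.

```python
import sqlite3

# In-memory database with a hypothetical table X; some Y values are NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE X (id INTEGER PRIMARY KEY, Y INTEGER)")
conn.executemany(
    "INSERT INTO X (Y) VALUES (?)",
    [(1,), (None,), (2,), (None,), (2,)],
)

# The original, slow form: compare each Y against all Y values in the table.
with_subquery = conn.execute(
    "SELECT * FROM X WHERE (X.Y IN (SELECT Y FROM X)) ORDER BY id"
).fetchall()

# The proposed replacement.
with_not_null = conn.execute(
    "SELECT * FROM X WHERE Y IS NOT NULL ORDER BY id"
).fetchall()

# Both queries skip the rows where Y is NULL and return the same rows.
print(with_subquery == with_not_null)
```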

Your second query is better than the first one.
In the first query, a NULL in column Y makes the IN comparison evaluate to unknown (NULL) for that row, whereas the second query tests for NULL explicitly.
Either way, the rows where Y is NULL are filtered out, so both queries return the same rows regardless of the values in your table.

Related

Is it possible to concisely tell SQL SELECT to omit some temporary/dummy columns?

Suppose I have this query:
select
x + y as _total,
abs(x - y) / _total as _err,
round(100 * _err) as pct_err,
x,
y
from foo;
This assumes I have a table with x and y, and calculates an error between them. Note that columns I prefixed with _ are dummy columns - they're only there to show the steps of the calculation more clearly. Is there a way to omit them from the result?
I don't want to simply collapse the three columns into a single expression. It would be messier, and consider also a calculation with 10 steps and much longer field names.
I don't want to make this a CTE and then re-select only the columns I want. That seems too much hassle for such a simple thing.
It would be okay if I could just put the dummy columns at the end where they would be out of the way, but SQL doesn't seem to allow referencing a column that comes after.
Note that "no" is an acceptable answer, if you have reasonably comprehensive knowledge of SQL syntax :)

Sql column value as formula in select

Can I select a column based on another column's value being listed as a formula? So I have a table, something like:
column_name  formula    val
one          NULL       1
two          NULL       2
three        one + two  NULL
And I want to do
SELECT
column_name,
CASE WHEN formula IS NULL THEN
val
ELSE
(Here's where I'm confused - How do I evaluate the formula?)
END as result
FROM
table
And end up with a result set like
column_name  result
one          1
two          2
three        3
You keep saying column, and column name, but you're actually talking about rows, not columns.
The problem is that you (potentially) want different formulas for each row. For example, row 4 might be (two - one) = 1 or even (three + one) = 4, where you'd have to calculate row three before you could do row 4. This means that a simple select query that parses the formulas is going to be very hard to do, and it would have to be able to handle each type of formula, and even then if the formulas reference other formulas that only makes it harder.
If you have to be able to handle expressions like (two + one) * five = 15 and two + one * five = 7, then you'd basically be re-implementing a full-blown eval function. You might be better off returning the SQL table to another language that has an eval function built in, or you could use something like SQL Eval.net if it has to be in SQL.
Either way, though, you've still got to change "two + one" to "2 + 1" before you can do the eval with it. Because these values are in other rows, you can't see those values in the row you're looking at. To get the value for "one" you have to do something like
Select val from table where column_name = 'one'
And even then if the val is null, that means it hasn't been calculated yet, and you have to come back and try again later.
If I had to do something like this, I would create a temporary table and load the basic table into it. Then I'd iterate over the rows with null values, trying to replace the names with their literal values. I'd run the eval over any formulas that no longer contain symbols, setting the val for those rows. If there were still rows with no val (i.e. they were waiting for another row to be done first), I'd go back and iterate again. At the end, you should have a val for every row, at which point it is a simple query to get your results.
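The iterative approach described above could be sketched roughly as follows, outside the database. This is only an illustration under stated assumptions: the rows list, the extra row "four", and the helper names eval_arithmetic and resolve are hypothetical, only + - * / and parentheses are supported, and every row without a val is assumed to have a formula.

```python
import ast
import operator
import re

# Hypothetical rows mirroring the question's table: (column_name, formula, val).
# The row "four" is an added example of a formula that depends on another formula.
rows = [
    ("one", None, 1),
    ("two", None, 2),
    ("three", "one + two", None),
    ("four", "three + one", None),
]

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def eval_arithmetic(expr):
    """Evaluate + - * / and parentheses over numeric literals only."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expr!r}")
    return walk(ast.parse(expr, mode="eval"))


def resolve(rows):
    """Repeatedly substitute known values into formulas until all rows have a val."""
    values = {name: val for name, _, val in rows if val is not None}
    pending = {name: formula for name, formula, val in rows if val is None}
    while pending:
        progressed = False
        for name, formula in list(pending.items()):
            # Replace every name whose value is already known with its literal.
            substituted = re.sub(
                r"[A-Za-z_]\w*",
                lambda m: f"({values[m.group()]})" if m.group() in values else m.group(),
                formula,
            )
            # Evaluate only once no symbols are left in the formula.
            if not re.search(r"[A-Za-z_]", substituted):
                values[name] = eval_arithmetic(substituted)
                del pending[name]
                progressed = True
        if not progressed:
            # Remaining formulas reference unknown names or each other.
            raise ValueError(f"unresolvable formulas: {sorted(pending)}")
    return values


result = resolve(rows)
print(result)
```

The small AST walker is used here instead of Python's built-in eval so that only plain arithmetic is accepted. A pass that resolves nothing raises an error rather than looping forever, which covers circular references.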
A possible solution would look like this. Since you gave very few details, it only covers the case shown above; I am not sure it works for anything else.
GO
SELECT
t1.column_name,
CASE WHEN t1.formula IS NULL THEN
t1.val
ELSE
-- sum the literal values of the rows that have no formula
(select sum(t2.val) from table as t2 where t2.formula is null)
END as result
FROM
table as t1
GO
If this does not work, feel free to discuss it further.

SQL to find the matching row between two tables of same schema

I have two tables, say X and X_STAGING.
They have exactly the same columns, i.e. the schema is the same. However, the number of rows is different. I know that the first row of X is present in X_STAGING, because the data was partially copied over from X_STAGING to X. However, I need to know exactly which row of X_STAGING contains the data that went into the first row of X.
At the moment I am using this
SELECT
SUM(MATCH)
FROM
(
SELECT
CASE WHEN X_STAGING.KEY_ID='KEY_FROM_THE_FIRST_ROW_OF_X' THEN 1 ELSE 0 END AS MATCH
FROM
X_STAGING
WHERE ROWNUM<2550000
)
By changing the ROWNUM limit, I can find out at which ROWNUM the count gets to 1. Then, by adjusting ROWNUM, I can eventually get to the particular row.
This will work, but I am sure there has to be a quicker and more clever way of doing this.
Please help.
Note: I am working on Linux, DB2 environment.
I don't understand what you are trying to accomplish, but the following does what you are asking for:
SELECT
MAX(MATCH)
FROM
(
SELECT
CASE WHEN X_STAGING.KEY_ID='KEY_FROM_THE_FIRST_ROW_OF_X' THEN ROWNUM ELSE 0 END AS MATCH
FROM
X_STAGING
)
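The idea of that query can be tried out on a toy table, as in the sketch below. It runs on SQLite through Python's sqlite3 module rather than on DB2, so SQLite's rowid stands in for ROWNUM, the alias is renamed to match_row, and the sample keys are made up; all of these are assumptions for the illustration only.

```python
import sqlite3

# Toy stand-in for X_STAGING; only the key column matters here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE X_STAGING (KEY_ID TEXT)")
conn.executemany(
    "INSERT INTO X_STAGING (KEY_ID) VALUES (?)",
    [("a",), ("b",), ("KEY_FROM_THE_FIRST_ROW_OF_X",), ("d",)],
)

# Every non-matching row contributes 0 and the matching row contributes its
# row number, so MAX() returns the position of the match in one pass.
match_position = conn.execute(
    """
    SELECT MAX(match_row)
    FROM (
        SELECT CASE WHEN KEY_ID = ? THEN rowid ELSE 0 END AS match_row
        FROM X_STAGING
    )
    """,
    ("KEY_FROM_THE_FIRST_ROW_OF_X",),
).fetchone()[0]

print(match_position)
```

A result of 0 would mean that the key was not found at all.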

SQL BETWEEN return empty rows even if value exist

What am I doing wrong with my SQL query? It always returns an empty result, even though matching values exist.
Here is my query:
SELECT *
FROM users
WHERE user_theme_id IN ( 9735, 9325, 4128 )
AND ( user_date_created BETWEEN '2013-06-04' AND '2013-06-10' );
When I split my original query into its parts and run them one by one, I do get results. Here is the first one:
SELECT * FROM users WHERE user_theme_id IN (9735, 9325, 4128 );
I got 3 rows for this result.
The next query that I ran is this:
SELECT *
FROM users
WHERE user_date_created BETWEEN '2013-06-04' AND '2013-06-10';
I get 3 results for this one as well.
By the way, this query that uses BETWEEN should return 4 rows, but it only returns 3. It doesn't return the row whose created date is 2013-06-10 08:27:43.
What am I doing wrong with my original query? Why does it always return an empty result?
Getting results when you run each WHERE condition separately doesn't guarantee that combining the two conditions with AND will return any rows.
AND only returns rows that satisfy both conditions, so the two result sets have to overlap.
You should check your data to see whether such an overlap exists.
I was able to make it work by using comparison operators (>= and <=) instead of the SQL BETWEEN operator.
I read on W3schools.com that BETWEEN can produce different results in different databases.
This is the content:
Notice that the BETWEEN operator can produce different result in different databases!
In some databases, BETWEEN selects fields that are between and excluding the test values.
In other databases, BETWEEN selects fields that are between and including the test values.
And in other databases, BETWEEN selects fields between the test values, including the first test value and excluding the last test value.
Therefore: Check how your database treats the BETWEEN operator!
That is what happened in the issue I was facing: the first test value was included and the second test value was excluded. Using the comparison operators gives an accurate result.
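One likely reason for the missing 2013-06-10 08:27:43 row is that a date-only upper bound is compared as the very start of that day, so any later time on 2013-06-10 falls outside the range. The sketch below reproduces this with Python's sqlite3 module. The table layout and all timestamps except the one from the question are hypothetical, and SQLite stores these values as text, so this only approximates what a database with a real DATETIME type does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_date_created TEXT)"
)
conn.executemany(
    "INSERT INTO users (user_date_created) VALUES (?)",
    [
        ("2013-06-04 10:00:00",),
        ("2013-06-07 12:30:00",),
        ("2013-06-10 08:27:43",),  # the row reported missing in the question
        ("2013-06-12 09:00:00",),
    ],
)

# The date-only upper bound sorts before any timestamp later that day,
# so the 08:27:43 row is not matched.
between_rows = conn.execute(
    "SELECT user_date_created FROM users "
    "WHERE user_date_created BETWEEN '2013-06-04' AND '2013-06-10' "
    "ORDER BY user_id"
).fetchall()

# Half-open range: from the start date up to, but excluding, the next day.
half_open_rows = conn.execute(
    "SELECT user_date_created FROM users "
    "WHERE user_date_created >= '2013-06-04' "
    "AND user_date_created < '2013-06-11' "
    "ORDER BY user_id"
).fetchall()

print(len(between_rows), len(half_open_rows))
```

Writing the upper bound as "less than the next day" avoids having to spell out the last possible time of day.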

Is there a performance difference between HAVING on alias, vs not using HAVING

Ok, I'm learning, bit by bit, about what HAVING means.
Now, my question is whether these two queries have different performance characteristics:
Without HAVING
SELECT x + y AS z, t.* FROM t
WHERE
x = 1 and
x+y = 2
With HAVING
SELECT x + y AS z, t.* FROM t
WHERE
x = 1
HAVING
z = 2
Yes, it should be different: the first query (without HAVING) is expected to be faster.
HAVING means that the main query is run first and the HAVING filter is applied afterwards, so it basically works on the dataset returned by the query without the HAVING clause.
The first query should be preferable, since it does not select those records at all.
HAVING is used for queries that contain GROUP BY or that return a single row containing the result of aggregate functions. For example, SELECT SUM(scores) FROM t HAVING SUM(scores) > 100 returns either one row or no row at all.
The second query is considered invalid by the SQL Standard and is not accepted by some database systems.
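To illustrate the usual division of labour, here is a small sketch using Python's sqlite3 module: WHERE filters individual rows, while HAVING filters groups after aggregation. The player column and the sample scores are hypothetical, and GROUP BY is added because some SQLite versions reject HAVING without it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (player TEXT, scores INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("ann", 70), ("ann", 50), ("bob", 30), ("bob", 40)],
)

# WHERE is evaluated per row, before any grouping takes place.
large_rows = conn.execute(
    "SELECT player, scores FROM t WHERE scores > 45 ORDER BY scores"
).fetchall()

# HAVING is evaluated per group, after SUM() has been computed.
high_totals = conn.execute(
    "SELECT player, SUM(scores) FROM t "
    "GROUP BY player HAVING SUM(scores) > 100"
).fetchall()

print(large_rows, high_totals)
```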