Simplifying a SQL Select Statement in Oracle SQL Developer - sql

I'm working with a SQL query of over 200 lines, and in many places I repeat the same sub-expression within one SELECT statement. In the code below, two of the select-list items, "avail_qty" and "pct_avail", each contain an equation, and inside the "LOW_CNT_&%" expression I repeat both of them over and over (this is just one example from my code). I would like to write each equation once and assign it to a variable. Is there any way of doing this? I have tried using the WITH clause but for that you need to use a FROM clause, my FROM clause is massive and would look just as ugly if I were to use a WITH clause (plus instead of repeating the SELECT statement now I would be just repeating the FROM statement).
I don't want to type out the whole equation multiple times for two reasons. First, avoiding the repetition makes the code easier to read. Second, multiple people edit this query, and if someone edits the equation in one spot but forgets to edit it in another, that could be bad. It also doesn't feel like good code etiquette to repeat code over and over.
SELECT
all_nbr.total_qty,
NVL (avail_nbr.avail_qty, 0) AS avail_qty,
100 * TRUNC ( (NVL (avail_nbr.avail_qty, 0) / all_nbr.total_qty), 2) AS pct_avail,
CASE
WHEN ((NVL (avail_nbr.avail_qty, 0)) < 35)
THEN CASE
WHEN ((100 * TRUNC ( (NVL (avail_nbr.avail_qty, 0) / all_nbr.total_qty), 2)) < 35)
THEN (35 - (NVL (avail_nbr.avail_qty, 0)))
ELSE 0
END
ELSE 0
END AS "LOW_CNT_&%"
FROM
...
Any help would be awesome!!

If the subquery is exactly the same one, you can pre-compute it as a Common Table Expression (CTE). For example:
with
cte1 as (
select ... -- long, tedious, repetitive SELECT here
),
cte2 as (
select ... -- you can reference/use cte1 here
)
select ...
from cte1 -- you can use cte1 here, multiple times if you want
join cte2 -- you can also reference/use cte2 here, also multiple times
join ... -- all other joins
cte1 (you can use any name) is a precomputed table expression that can be used multiple times. You can also have multiple CTEs, each one with different names; also each CTE can reference previous ones.

I have tried using the WITH clause but for that you need to use a FROM clause, my FROM clause is massive and would look just as ugly if I were to use a WITH clause (plus instead of repeating the SELECT statement now I would be just repeating the FROM statement).
You shouldn't need to repeat the from clause. You move all of the query, including that clause, into the CTE; you just pull out the bits that rely on earlier calculations into the main query, which avoids the code repetition.
The structure would be something like:
WITH cte AS (
SELECT
all_nbr.total_qty,
NVL (avail_nbr.avail_qty, 0) AS avail_qty,
100 * TRUNC ( (NVL (avail_nbr.avail_qty, 0) / all_nbr.total_qty), 2) AS pct_avail
FROM
...
)
SELECT
cte.total_qty,
cte.avail_qty,
cte.pct_avail,
CASE
WHEN cte.avail_qty < 35
THEN CASE
WHEN cte.pct_avail < 35
THEN 35 - cte.avail_qty
ELSE 0
END
ELSE 0
END AS "LOW_CNT_&%"
FROM
cte;
Your main query only needs to refer to the CTE (again, based on what you've shown), and can (only) refer to the projection of the CTE, including the calculated columns. It can't see the underlying tables, but shouldn't need to.
Or with an inline view instead; the principle is the same:
SELECT
total_qty,
avail_qty,
pct_avail,
CASE
WHEN avail_qty < 35
THEN CASE
WHEN pct_avail < 35
THEN 35 - avail_qty
ELSE 0
END
ELSE 0
END AS "LOW_CNT_&%"
FROM
(
SELECT
all_nbr.total_qty,
NVL (avail_nbr.avail_qty, 0) AS avail_qty,
100 * TRUNC ( (NVL (avail_nbr.avail_qty, 0) / all_nbr.total_qty), 2) AS pct_avail
FROM
...
);
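To make the pattern concrete, here is a minimal runnable sketch using Python's built-in sqlite3 module as a stand-in for Oracle. The item_id column, the join between all_nbr and avail_nbr, and the sample rows are assumptions made up for illustration, since the real FROM clause is elided above. SQLite has no NVL or two-argument TRUNC, so COALESCE and an integer CAST are swapped in.

```python
import sqlite3

# Hypothetical miniature schema standing in for the elided FROM clause.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE all_nbr   (item_id INTEGER PRIMARY KEY, total_qty INTEGER);
    CREATE TABLE avail_nbr (item_id INTEGER PRIMARY KEY, avail_qty INTEGER);
    INSERT INTO all_nbr   VALUES (1, 100), (2, 40), (3, 200);
    INSERT INTO avail_nbr VALUES (1, 20), (2, 30);   -- item 3 has no row
""")

QUERY = """
WITH cte AS (
    -- Each equation is written exactly once, here.
    SELECT t.item_id,
           t.total_qty,
           COALESCE(a.avail_qty, 0) AS avail_qty,
           CAST(100.0 * COALESCE(a.avail_qty, 0) / t.total_qty AS INTEGER) AS pct_avail
    FROM all_nbr t
    LEFT JOIN avail_nbr a ON a.item_id = t.item_id
)
-- The outer query only sees the CTE's columns and reuses them by name.
SELECT item_id, avail_qty, pct_avail,
       CASE WHEN avail_qty < 35 AND pct_avail < 35
            THEN 35 - avail_qty
            ELSE 0
       END AS low_cnt
FROM cte
ORDER BY item_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The two nested CASE expressions are collapsed into a single condition joined with AND, which gives the same result because both ELSE branches return 0.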

Related

SQL CASE WHEN- can I do a function within a function? New to SQL

SELECT
SP.SITE,
SYS.COMPANY,
SYS.ADDRESS,
SP.CUSTOMER,
SP.STATUS,
DATEDIFF(MONTH,SP.MEMBERSINCE, SP.EXPIRES) AS MONTH_COUNT
CASE WHEN(MONTH_COUNT = 0 THEN MONTH_COUNT = DATEDIFF(DAY,SP.MEMBERSINCE, SP.EXPIRES) AS DAY_COUNT)
ELSE NULL
END
FROM SALEPASSES AS SP
INNER JOIN SYSTEM AS SYS ON SYS.SITE = SP.SITE
WHERE STATUS IN (7,27,29);
I am still trying to understand SQL. Is this the right order to have everything? I'm assuming my datediff() is unable to work because it's inside case when. What I am trying to do, is get the day count if month_count is less than 1 (meaning it's less than one month and we need to count the days between the dates instead). I need month_count to run first to see if doing the day_count would even be necessary. Please give me feedback, I'm new and trying to learn!
CASE is an expression: it returns a value. It looks like you should be doing this:
DAY_COUNT =
CASE WHEN DATEDIFF(MONTH,SP.MEMBERSINCE, SP.EXPIRES) = 0
THEN DATEDIFF(DAY,SP.MEMBERSINCE, SP.EXPIRES)
ELSE NULL END
You shouldn't actually need ELSE NULL, as NULL is the default.
Note also that you [usually] cannot refer to a derived column in the same SELECT.
It appears that what you are trying to do is define the MonthCount column's value, and then reuse that value in another column's definition. (The Don't Repeat Yourself principle.)
In most dialects of SQL, including MS SQL Server, you can't do that.
That's because SQL is a "declarative" language. This means that SQL Server is free to calculate the column values in any order that it likes. In turn, that means you're not allowed to do anything that would rely on one column being calculated before another.
There are two basic ways around that...
First, use CTEs or sub-queries to create two different "scopes", allowing you to define MonthCount before DayCount, and so reuse the value without retyping the definition.
SELECT
*,
CASE WHEN MonthCount = 0 THEN foo ELSE NULL END AS DayCount
FROM
(
SELECT
*,
bar AS MonthCount
FROM
x
)
AS derive_month
The second main way is to somehow derive the value before the SELECT block is evaluated. In this case, using APPLY to 'join' a single value on to each input row...
SELECT
x.*,
MonthCount,
CASE WHEN MonthCount = 0 THEN foo ELSE NULL END AS DayCount
FROM
x
CROSS APPLY
(
SELECT
bar AS MonthCount
)
AS derive_month
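The first approach (a derived table that creates an inner scope) can be tried end to end with the sketch below. It uses Python's sqlite3 module rather than SQL Server, so DATEDIFF is not available: the month count is approximated from strftime year and month parts (which counts calendar-month boundaries, as I understand DATEDIFF(MONTH, ...) to do) and the day count from julianday. The sale_passes table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for SALEPASSES with ISO-8601 date strings.
    CREATE TABLE sale_passes (customer TEXT, member_since TEXT, expires TEXT);
    INSERT INTO sale_passes VALUES
        ('a', '2023-01-05', '2023-01-25'),
        ('b', '2023-01-05', '2023-04-05');
""")

QUERY = """
SELECT customer,
       month_count,
       -- month_count is visible here because the inner query defined it.
       CASE WHEN month_count = 0
            THEN CAST(julianday(expires) - julianday(member_since) AS INTEGER)
       END AS day_count
FROM (
    SELECT customer, member_since, expires,
           (strftime('%Y', expires) - strftime('%Y', member_since)) * 12
           + (strftime('%m', expires) - strftime('%m', member_since)) AS month_count
    FROM sale_passes
) AS derive_month
ORDER BY customer
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The month expression is written once in the inner query; the outer CASE has no ELSE branch, so rows with a non-zero month count get NULL for day_count.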

How to check if a float is between multiple ranges in Postgres?

I'm trying to write a query like this:
SELECT * FROM table t
WHERE ((long_expression BETWEEN -5 AND -2) OR
(long_expression BETWEEN 0 AND 2) OR
(long_expression BETWEEN 4 and 6))
Where long_expression is approximately equal to this:
(((t.s <#> (SELECT s FROM user WHERE user.user_id = $1)) / (SELECT COUNT(DISTINCT cluster_id) FROM cluster) * -1) + 1)
t.s and s are the CUBE datatypes and <#> is the indexed distance operator.
I could just repeat this long expression multiple times in the body, but this would be extremely verbose. An alternative might be to save it in a variable somehow (with a CTE?), but I think this might remove the possibility of using an index in the WHERE clause?
I also found int4range and numrange, but I don't believe they would work here either, because the distance operator returns float8's, not integer or numerics.
You can use a lateral join:
SELECT t.*
FROM table t CROSS JOIN LATERAL
(VALUES (long_expression)) v(x)
WHERE ((v.x BETWEEN -5 AND -2) OR
(v.x BETWEEN 0 AND 2) OR
(v.x BETWEEN 4 and 6)
);
Of course, a CTE or subquery could be used as well; I like lateral joins because they make it easy to write multiple expressions that depend on previous values.

Nested case when parent depends on child in SQL Server [closed]

I wanted to check, when it comes to a nested CASE where the parent depends on the child, whether there is any better way than to throw the query in the FROM clause.
Please mind the details of the logic, but this is the blueprint of what I want to accomplish and it works. The code just looks sloppy. Is there a better/neater way to accomplish this?
SELECT
*,
CASE
WHEN b.NewCalcColRate IS NULL
THEN 0
ELSE b.NewCalcColRate * 1000
END AS FinalCalcColRate
FROM
(SELECT
a.*,
CASE
WHEN a.calcColRate IS NULL
THEN 0
ELSE a.calcColRate * 100
END AS NewCalcColRate
FROM
(SELECT
*,
CASE
WHEN f.AnnualCost IS NULL
THEN 0
ELSE f.AnnualCost / 12
END AS calcColRate
FROM
Rates) AS a
) AS b
First, NewCalcColRate and calcColRate will never be NULL, because the nested CASE already sets the value to 0 when it is NULL (CASE WHEN f.AnnualCost IS NULL THEN 0), so that logic is pointless.
Also, you have this column with an f reference, though the Rates table isn't aliased as f in that sub-query.
From what I can gather, this can be simplified to:
SELECT
*,
CASE
WHEN AnnualCost IS NULL
THEN 0
ELSE
(AnnualCost / 12) * 100000
END AS calcColRate
FROM
Rates
Or, as Jabs pointed out...
SELECT
*,
(ISNULL(AnnualCost,0)/12) * 100000
from
Rates
Another note: depending on the datatype of AnnualCost, you may want to consider dividing by a decimal so that you aren't doing INTEGER division and falling victim to lost precision.
(AnnualCost / 12.0) * 100000.0
EXAMPLE
select
(1 / 12) * 1000000 --Returns 0
,(1 / 12.0) * 1000000 --Returns 83333.000000
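The same integer-versus-decimal behaviour can be reproduced outside SQL Server. The sketch below runs the two expressions through Python's sqlite3 module; SQLite returns a double for the second expression rather than a fixed-scale decimal, so the digits after the decimal point differ from the SQL Server output shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both operands are integers in the first expression, so 1 / 12 truncates to 0
# before the multiplication happens. Writing 12.0 forces real division.
int_result, dec_result = conn.execute(
    "SELECT (1 / 12) * 1000000, (1 / 12.0) * 1000000"
).fetchone()

print("integer division:", int_result)
print("decimal division:", dec_result)
```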
In the future, if your code works and you are only looking for improvements, I would post it on Code Review as it is more tailored towards this kind of request.
There seems to be some important information missing from your question. If all you really want is the FinalCalcColRate for each row in your Rates table, then you can do it in one step like @scsimon suggested in his/her answer:
select
r.*,
FinalCalcColRate = case when r.AnnualCost is null then 0 else r.AnnualCost / 12 * 100000 end
from
dbo.Rates r;
Or a similar implementation using coalesce (or isnull in SQL Server; see this question for details on the differences):
select
r.*,
FinalCalcColRate = coalesce(r.AnnualCost / 12 * 100000, 0)
from
dbo.Rates r;
However, one difference between @scsimon's query and your original is that the former outputs only the final calculated value while the latter also yields all of the intermediate values. It's not clear from your question whether consumers of this query will require those values or not. If they will, then you can include them simply enough:
select
r.*,
CalcColRate = coalesce(r.AnnualCost / 12, 0),
NewCalcColRate = coalesce(r.AnnualCost / 12 * 100, 0),
FinalCalcColRate = coalesce(r.AnnualCost / 12 * 100000, 0)
from
dbo.Rates r;
There is some repetition of logic here—for instance, if you were to ever change the definition of CalcColRate, you would also have to manually change the expressions for NewCalcColRate and FinalCalcColRate—but it's small enough that I doubt it's worth worrying about. Nonetheless, if the construction in your original query was motivated by a desire to avoid such repetition, you can refactor the query to use CTEs instead of nested queries:
with CalcCTE as
(
select
r.*,
CalcColRate = coalesce(r.AnnualCost / 12, 0)
from
dbo.Rates r
),
NewCalcColCTE as
(
select
c.*,
NewCalcColRate = c.CalcColRate * 100
from
CalcCTE c
)
select
n.*,
FinalCalcColRate = n.NewCalcColRate * 1000
from
NewCalcColCTE n;
This is obviously longer and arguably more difficult to understand than my previous query that defined all the values independently, but it does have the advantage that each step is built atop the last. The CTE formulation also tends to be a lot more readable than the equivalent set of nested queries, since the steps are written in the order in which they're evaluated; with a nested query you have to find the innermost point and work outward, which can get confusing in a hurry.

SQL Server - Multiplying row values for a given column value [duplicate]

I'm looking for something like SELECT PRODUCT(table.price) FROM table GROUP BY table.sale, similar to how SUM works.
Have I missed something in the documentation, or is there really no PRODUCT function?
If so, why not?
Note: I looked for the function in postgres, mysql and mssql and found none, so I assumed no SQL dialect supports it.
For MSSQL you can use this. It can be adapted for other platforms: it's just maths and aggregates on logarithms.
SELECT
GrpID,
CASE
WHEN MinVal = 0 THEN 0
WHEN Neg % 2 = 1 THEN -1 * EXP(ABSMult)
ELSE EXP(ABSMult)
END
FROM
(
SELECT
GrpID,
--log of +ve row values
SUM(LOG(ABS(NULLIF(Value, 0)))) AS ABSMult,
--count of -ve values. Even = +ve result.
SUM(SIGN(CASE WHEN Value < 0 THEN 1 ELSE 0 END)) AS Neg,
--anything * zero = zero
MIN(ABS(Value)) AS MinVal
FROM
Mytable
GROUP BY
GrpID
) foo
Taken from my answer here: SQL Server Query - groupwise multiplication
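The bookkeeping in that query (sum the logs of the absolute values, count the negatives, short-circuit on zero) can be checked with a small Python sketch. This is an illustration of the same arithmetic, not a translation of the T-SQL; the function name product_via_logs is made up here.

```python
import math

def product_via_logs(values):
    """Multiply a list of numbers via exp(sum(log(abs(v)))) with sign and zero handling."""
    # Anything times zero is zero, and log(0) is undefined, so handle it first.
    if any(v == 0 for v in values):
        return 0.0
    # An odd count of negative factors makes the product negative.
    negatives = sum(1 for v in values if v < 0)
    magnitude = math.exp(sum(math.log(abs(v)) for v in values))
    return -magnitude if negatives % 2 == 1 else magnitude

print(product_via_logs([2, 5, 7, 8]))
print(product_via_logs([-2, -3, -5]))
print(product_via_logs([1, 0, 2]))
```

Because the result comes back through exp, it is a float and only approximately equal to the exact integer product, so comparisons should use a tolerance.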
I don't know why there isn't one, but (taking more care over negative numbers) you can use logs and exponents to do:
select exp (sum (ln (table.price))) from table ...
There is no PRODUCT set function in the SQL Standard. It would appear to be a worthy candidate, though (unlike, say, a CONCATENATE set function: it's not a good fit for SQL e.g. the resulting data type would involve multivalues and pose a problem as regards first normal form).
The SQL Standards aim to consolidate functionality across SQL products circa 1990 and to provide 'thought leadership' on future development. In short, they document what SQL does and what SQL should do. The absence of a PRODUCT set function suggests that in 1990 no vendor thought it worthy of inclusion and there has been no academic interest in introducing it into the Standard.
Of course, vendors have always sought to add their own functionality, these days usually as extensions to Standards rather than tangentially. I don't recall seeing a PRODUCT set function (or even demand for one) in any of the SQL products I've used.
In any case, the workaround is fairly simple using log and exp scalar functions (and logic to handle negatives) with the SUM set function; see @gbn's answer for some sample code. I've never needed to do this in a business application, though.
In conclusion, my best guess is that there is no demand from SQL end users for a PRODUCT set function; further, that anyone with an academic interest would probably find the workaround acceptable (i.e. would not value the syntactic sugar a PRODUCT set function would provide).
Out of interest, there is indeed demand in SQL Server Land for new set functions but for those of the window function variety (and Standard SQL, too). For more details, including how to get involved in further driving demand, see Itzik Ben-Gan's blog.
You can perform a product aggregate function, but you have to do the maths yourself, like this...
SELECT
Exp(Sum(IIf(Abs([Num])=0,0,Log(Abs([Num])))))*IIf(Min(Abs([Num]))=0,0,1)*(1-2*(Sum(IIf([Num]>=0,0,1)) Mod 2)) AS P
FROM
Table1
Source: http://productfunctionsql.codeplex.com/
There is a neat trick in T-SQL (not sure if it's ANSI) that lets you concatenate string values from a set of rows into one variable. It looks like it works for multiplying as well:
declare @Floats as table (value float)
insert into @Floats values (0.9)
insert into @Floats values (0.9)
insert into @Floats values (0.9)
declare @multiplier float = null
select
@multiplier = isnull(@multiplier, 1) * value
from @Floats
select @multiplier
This can potentially be more numerically stable than the log/exp solution.
I think that is because no numbering system is able to accommodate many products. As databases are designed for large number of records, a product of 1000 numbers would be super massive and in case of floating point numbers, the propagated error would be huge.
Also note that using log can be a dangerous solution. Although mathematically log(a*b) = log(a) + log(b), it might not hold exactly on computers, as we are not dealing with real numbers. If you calculate exp(log(a)+log(b)) instead of a*b, you may get unexpected results. For example:
SELECT 9999999999*99999999974482, EXP(LOG(9999999999)+LOG(99999999974482))
in Sql Server returns
999999999644820000025518, 9.99999999644812E+23
So my point is: when you are trying to do the product, do it carefully and test it heavily.
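The same two operands can be compared in Python, where integers have arbitrary precision, to see how far the log/exp route drifts from the exact product. This is only a sketch of the effect; the exact digits of the floating-point result depend on the platform's math library, so none are quoted here.

```python
import math

a, b = 9999999999, 99999999974482

exact = a * b                                  # exact integer arithmetic
approx = math.exp(math.log(a) + math.log(b))   # the log/exp workaround

print("exact :", exact)
print("approx:", approx)
print("relative error:", abs(approx - exact) / exact)
```

The relative error is tiny, but the low-order digits are lost, which matters if the product is later compared for equality or cast back to an integer.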
One way to deal with this problem (if you are working in a scripting language) is to use the group_concat function.
For example, SELECT group_concat(table.price) FROM table GROUP BY table.sale
This will return a string with all prices for the same sale value, separated by a comma.
Then with a parser you can get each price, and do a multiplication. (In php you can even use the array_reduce function, in fact in the php.net manual you get a suitable example).
Cheers
Another approach, based on the fact that the cardinality of a Cartesian product is the product of the cardinalities of the individual sets ;-)
⚠ WARNING: This example is just for fun and is rather academic, don't use it in production! (apart from the fact it's just for positive and practically small integers)⚠
with recursive t(c) as (
select unnest(array[2,5,7,8])
), p(a) as (
select array_agg(c) from t
union all
select p.a[2:]
from p
cross join generate_series(1, p.a[1])
)
select count(*) from p where cardinality(a) = 0;
The problem can be solved using modern SQL features such as window functions and CTEs. Everything is standard SQL and, unlike logarithm-based solutions, does not require switching from the integer world to the floating-point world or handling nonpositive numbers. Just number the rows and evaluate the product in a recursive query until no rows remain:
with recursive t(c) as (
select unnest(array[2,5,7,8])
), r(c,n) as (
select t.c, row_number() over () from t
), p(c,n) as (
select c, n from r where n = 1
union all
select r.c * p.c, r.n from p join r on p.n + 1 = r.n
)
select c from p where n = (select max(n) from p);
As your question involves grouping by the sale column, things get a little more complicated, but it's still solvable:
with recursive t(sale,price) as (
select 'multiplication', 2 union
select 'multiplication', 5 union
select 'multiplication', 7 union
select 'multiplication', 8 union
select 'trivial', 1 union
select 'trivial', 8 union
select 'negatives work', -2 union
select 'negatives work', -3 union
select 'negatives work', -5 union
select 'look ma, zero works too!', 1 union
select 'look ma, zero works too!', 0 union
select 'look ma, zero works too!', 2
), r(sale,price,n,maxn) as (
select t.sale, t.price, row_number() over (partition by sale), count(1) over (partition by sale)
from t
), p(sale,price,n,maxn) as (
select sale, price, n, maxn
from r where n = 1
union all
select p.sale, r.price * p.price, r.n, r.maxn
from p
join r on p.sale = r.sale and p.n + 1 = r.n
)
select sale, price
from p
where n = maxn
order by sale;
Result:
sale,price
"look ma, zero works too!",0
multiplication,560
negatives work,-30
trivial,8
Tested on Postgres.
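For readers without a Postgres instance, the grouped recursive query can also be tried with Python's sqlite3 module. This is a sketch under a few assumptions: the bundled SQLite must be 3.25 or newer for window functions, the table t and its rows are made up, and the window is ordered by rowid so the row numbering is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (sale TEXT, price INTEGER);
    INSERT INTO t VALUES
        ('multiplication', 2), ('multiplication', 5),
        ('multiplication', 7), ('multiplication', 8),
        ('negatives', -2), ('negatives', -3), ('negatives', -5),
        ('zero', 1), ('zero', 0), ('zero', 2);
""")

QUERY = """
WITH RECURSIVE
r(sale, price, n, maxn) AS (
    -- Number the rows inside each group and record the group size.
    SELECT sale, price,
           row_number() OVER (PARTITION BY sale ORDER BY rowid),
           count(*)     OVER (PARTITION BY sale)
    FROM t
),
p(sale, price, n, maxn) AS (
    -- Start from the first row of each group ...
    SELECT sale, price, n, maxn FROM r WHERE n = 1
    UNION ALL
    -- ... and multiply in the next row until the group is exhausted.
    SELECT p.sale, r.price * p.price, r.n, r.maxn
    FROM p JOIN r ON p.sale = r.sale AND p.n + 1 = r.n
)
SELECT sale, price FROM p WHERE n = maxn ORDER BY sale
"""

products = dict(conn.execute(QUERY).fetchall())
print(products)
```

Everything stays in integer arithmetic, so zero and negative prices need no special handling.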
Here is an Oracle solution for anyone who needs it:
with data(id, val) as(
select 1,1.0 from dual union all
select 2,-2.0 from dual union all
select 3,1.0 from dual union all
select 4,2.0 from dual
),
neg(val , modifier) as(
select exp(sum(ln(abs(val)))), case when mod(count(*),2) = 0 then 1 Else -1 end
from data
where val <0
)
,
pos(val) as (
select exp(sum(ln(val)))
from data
where val >=0
)
select (select val*modifier from neg)*(select val from pos) product from dual

Is it possible to use the result of a subquery in a case statement of the same outer query?

I am writing a search routine with a ranking algorithm and would like to get this in one pass.
My Ideal query would be something like this....
select *, (select top 1 wordposition
from wordpositions
where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502
) as WordPos,
case when WordPos<11 then 1 else case WordPos<50 then 2 else case WordPos<100 then 3 else 4 end end end end as rank
from items
Is it possible to use WordPos in a case right there? It's giving me the error: Invalid column name 'WordPos'.
I know I can redo the subquery for each case but I think it would actually re-run the case wouldn't it?
For example:
select *, case when (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<11 then 1 else case (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<50 then 2 else case (select top 1 wordposition from wordpositions where recordid=items.pk_itemid and wordid=79588 and nextwordid=64502)<100 then 3 else 4 end end end end as rank from items
That works....but is it really re-running the identical query each time?
It's hard to tell from the tests as the first time it runs it's slow but subsequent runs are quick....it's caching...so would that mean that the first time it ran it for the first row, the subsequent three times it would get the result from cache?
Just curious what the best way to do this would be...
Thank you!
Ryan
You can do this using a subquery. I will stick with your SQL Server syntax, even though the question is tagged mysql:
select i.*,
(case when WordPos < 11 then 1
when WordPos < 50 then 2
when WordPos < 100 then 3
else 4
end) as rank
from (select i.*,
(select top 1 wp.wordposition
from wordpositions wp
where recordid=i.pk_itemid and wordid=79588 and nextwordid=64502
) as WordPos
from items i
) i;
This also simplifies the case statement. You do not need nested case statements to handle multiple conditions, just multiple when clauses.
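A runnable sketch of this shape, using Python's sqlite3 module in place of SQL Server: LIMIT 1 replaces TOP 1, an ORDER BY is added inside the subquery so the chosen row is deterministic (the original has none), the alias item_rank is used instead of rank, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (pk_itemid INTEGER PRIMARY KEY);
    CREATE TABLE wordpositions (recordid INTEGER, wordid INTEGER,
                                nextwordid INTEGER, wordposition INTEGER);
    INSERT INTO items VALUES (1), (2), (3);
    INSERT INTO wordpositions VALUES
        (1, 79588, 64502, 5),
        (1, 79588, 64502, 40),
        (2, 79588, 64502, 60),
        (3, 11111, 22222, 7);   -- different word pair, so item 3 has no match
""")

QUERY = """
SELECT pk_itemid, word_pos,
       -- One flat CASE: the first WHEN that is true wins.
       CASE WHEN word_pos < 11  THEN 1
            WHEN word_pos < 50  THEN 2
            WHEN word_pos < 100 THEN 3
            ELSE 4
       END AS item_rank
FROM (
    -- The correlated subquery is written once and named here.
    SELECT i.pk_itemid,
           (SELECT wp.wordposition
            FROM wordpositions wp
            WHERE wp.recordid = i.pk_itemid
              AND wp.wordid = 79588
              AND wp.nextwordid = 64502
            ORDER BY wp.wordposition
            LIMIT 1) AS word_pos
    FROM items i
) AS ranked
ORDER BY pk_itemid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Items with no matching word pair get a NULL word_pos; every WHEN comparison against NULL is unknown, so they fall through to the ELSE branch.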
No. Identifiers introduced in the output clause (the fact that it comes from a sub-query is irrelevant) cannot be used within the same SELECT statement.
Here are some solutions:
1. Rewrite the query using a JOIN [1]. This will eliminate the issue entirely and fits well with relational algebra (RA).
2. Wrap the entire SELECT with the sub-query within another SELECT with the case. The outer select can access identifiers introduced by the inner SELECT's output clause.
3. Use a CTE (if SQL Server). This is similar to #2 in that it allows an identifier to be introduced.
While "re-writing" the sub-query for each case is very messy, it should still result in an equivalent plan (but view the query profile!), as the results of the query are non-volatile. As such the equivalent sub-queries can be safely moved by the query planner, which should move the sub-query/sub-queries to a JOIN to avoid any "re-running" in the first place.
[1] Here is a conversion to use a JOIN, which is my preferred method. (I find that if a query can't be written in terms of a JOIN "easily" then it might be asking for the wrong thing or otherwise be showing issues with schema design.)
select
wp.wordposition as WordPos,
case wp.wordposition .. as Rank
from items i
left join wordpositions wp
on wp.recordid = i.pk_itemid
where wp.wordid = 79588
and wp.nextwordid = 64502
I've made assumptions about the multiplicity here (i.e. that wordid is unique) which should be verified. If this multiplicity is not valid and not correctable otherwise (and you're indeed using SQL Server), then I'd recommend using ROW_NUMBER() and a CTE.