In asymptotic notation for order of growth, is the form
Theta(N ^ (LOGb(a / b) + 1))
equivalent to
Theta(N ^ (LOGb(a)))?
Here LOGb(a) means the log of a to base b.
Since log(a/b) = log a - log b and LOGb(b) = 1, we have LOGb(a/b) + 1 = LOGb(a) - LOGb(b) + 1 = LOGb(a) - 1 + 1 = LOGb(a). No mention of asymptotics is necessary: this equality is exact for all a > 0 and every valid base b (b > 0, b != 1).
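As a quick numeric sanity check of this identity (not a proof), the Python sketch below evaluates both exponents for a few sample pairs. The helper name log_base and the sample values are assumptions chosen only for illustration.

```python
import math

def log_base(value, base):
    """Logarithm of value to the given base (base > 0, base != 1)."""
    return math.log(value) / math.log(base)

# Arbitrary sample pairs (a, b), chosen only for illustration.
samples = [(8, 2), (9, 3), (7, 2), (2, 5)]

for a, b in samples:
    lhs = log_base(a / b, b) + 1   # exponent in Theta(N ^ (LOGb(a / b) + 1))
    rhs = log_base(a, b)           # exponent in Theta(N ^ (LOGb(a)))
    print(a, b, lhs, rhs, math.isclose(lhs, rhs))
```

The two exponents agree up to floating-point rounding for every pair, which is what the algebra predicts.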
I am working with manufacturing costs and the value of the fabrication still left to go (the "Net Work In Process"). The SQL is straight arithmetic but the query doesn't result in a value if there is a minus sign (subtraction) in the denominator. The database columns:
A = Material issued cost
B = Miscellaneous cost adds
C = Labor
D = Overhead
E = Setup cost
F = Scrap cost
G = Received cost (cost of assemblies completed already)
H = Original Quantity ordered
I = Quantity deviation
J = Quantity split from order
K = Quantity received (number of assemblies completed already)
The Net WIP cost is nothing more than the total cost remaining divided by the total quantity remaining. So in short, I'm simply trying to do this:
select (A + B + C + D + E - F - G) / (H + I - J - K) from MyTable
The subtractions work fine in the numerator but as soon as I subtract in the denominator, the query simply returns no value (blank). I've tried stuff like this:
select (A + B + C + D + E - F - G) / (H + I - (J + K)) from MyTable
select (A + B + C + D + E - F - G) / (H + I + (-J) + (-K)) from MyTable
select (A + B + C + D + E - F - G) / (H + I + (J * -1) + (K * -1)) from MyTable
None of these work. Just curious if anyone has come across this on IBM's DB2 database?
Thanks.
If you are returning "blank" in a numeric calculation, then you have a NULL value somewhere. Try using coalesce():
nullif(coalesce(H, 0) + coalesce(I, 0) - coalesce(J, 0) - coalesce(K, 0), 0)
You have nulls in one of the columns H, I, J, or K. Search for the offending rows using:
select H, I, J, K
from MyTable
where H is null
or I is null
or J is null
or K is null;
Then you can treat those special cases according to your own logic. Typically you'll replace those nulls with zeroes or other values using COALESCE().
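To see how a single NULL makes the whole denominator NULL, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for DB2 (an assumption: NULL propagation in arithmetic is standard SQL behavior, but this was not run against DB2). The sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (H, I, J, K)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [(10, 0, 0, 4),      # all quantities present
     (10, None, 0, 4)],  # quantity deviation is NULL
)

# Plain arithmetic: one NULL operand makes the whole expression NULL.
raw = [r[0] for r in conn.execute(
    "SELECT H + I - J - K FROM MyTable ORDER BY rowid")]

# COALESCE replaces each NULL with 0 before the arithmetic runs.
safe = [r[0] for r in conn.execute(
    "SELECT COALESCE(H, 0) + COALESCE(I, 0) - COALESCE(J, 0) - COALESCE(K, 0) "
    "FROM MyTable ORDER BY rowid")]

# Locate the offending rows, as in the search query above.
null_rows = [r[0] for r in conn.execute(
    "SELECT rowid FROM MyTable "
    "WHERE H IS NULL OR I IS NULL OR J IS NULL OR K IS NULL")]

print(raw, safe, null_rows)
```

Whether 0 is the right replacement for a missing quantity is a business decision; the sketch only shows the mechanics.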
Thanks all for your comments. I did scour the columns for NULLs and there aren't any. There are then plenty of conditions where, let's say, the factory sets the order to 10 and completes (receives) 10 with no splits and no deviation. In that case:
H + I - J - K = (+10 +0 -0 -10) = 0
and I can't divide by zero. So I have a different workaround for that and thanks for everyone's help.
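One common way to guard a zero denominator is the NULLIF() wrapper already suggested above. The sketch below uses Python's sqlite3 module as a stand-in; the cost column is a hypothetical placeholder for the numerator (A + B + C + D + E - F - G), and the rows are invented. Note the assumption: SQLite already returns NULL for division by zero, whereas other databases may raise an error instead, so the portable part here is the NULLIF() result itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "cost" is a hypothetical stand-in for the full numerator expression.
conn.execute("CREATE TABLE MyTable (cost REAL, H, I, J, K)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)",
    [(120.0, 10, 0, 0, 4),    # 6 assemblies still open
     (0.0, 10, 0, 0, 10)],    # order fully received: 0 remaining
)

rows = conn.execute(
    "SELECT H + I - J - K, "                     # remaining quantity
    "       NULLIF(H + I - J - K, 0), "          # NULL when nothing remains
    "       cost / NULLIF(H + I - J - K, 0) "    # Net WIP per unit, or NULL
    "FROM MyTable ORDER BY rowid"
).fetchall()

for remaining, guarded, net_wip in rows:
    print(remaining, guarded, net_wip)
```

A completed order then yields NULL for the per-unit value instead of attempting a division by zero.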
Why does the following query not trigger a "cannot compare record types with different numbers of columns" error in PostgreSQL 11.6?
with
s AS (SELECT 1)
, main AS (
SELECT (a) = (b) , (a) = (a), (b) = (b), a, b -- I expect (a) = (b) fails
FROM s
, LATERAL (select 1 as x, 2 as y) AS a
, LATERAL (select 5 as x) AS b
)
select * from main;
While this one does:
with
x AS (SELECT 1)
, y AS (select 1, 2)
select (x) = (y) from x, y;
See the note in the docs on row comparison:
Errors related to the number or types of elements might not occur if the comparison is resolved using earlier columns.
In this case, because a.x=1 and b.x=5, it returns false without ever noticing that the number of columns doesn't match. Change them to match, and you will get the same exception (which is also why the 2nd query does have that exception).
testdb=# with
s AS (SELECT 1)
, main AS (
SELECT a = b , (a) = (a), (b) = (b), a, b -- I expect (a) = (b) fails
FROM s
, LATERAL (select 5 as x, 2 as y) AS a
, LATERAL (select 5 as x) AS b
)
select * from main;
ERROR: cannot compare record types with different numbers of columns
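The behavior described above can be modeled with a small Python sketch. This is a simplified illustration of left-to-right row comparison, not PostgreSQL's actual implementation; the names rows_equal and RowShapeError are hypothetical, and NULL handling is left out.

```python
class RowShapeError(Exception):
    """Raised when two rows with different column counts must be compared fully."""

def rows_equal(left, right):
    # Compare column by column; the first mismatch settles the result.
    for l_value, r_value in zip(left, right):
        if l_value != r_value:
            return False
    # Only when every shared column matched does the column count matter.
    if len(left) != len(right):
        raise RowShapeError(
            "cannot compare record types with different numbers of columns")
    return True

print(rows_equal((1, 2), (5,)))   # resolved by the first column
try:
    rows_equal((5, 2), (5,))      # first column matches, so the shapes are checked
except RowShapeError as exc:
    print("ERROR:", exc)
```

With a = (1, 2) and b = (5) the comparison is decided by the first column, so the mismatch in column count is never reached; with a = (5, 2) it is.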
I'm running sql statements on a huge db for the first time and I have code as such.
Select x, sum(y), sum(z) from db
where n = 'xxx' or n = 'yyy' and m = int
group by x
Now if I do this
Select x, sum(y), sum(z) from db
where n = 'xxx' and m = int
group by x
Select x, sum(y), sum(z) from db
where n = 'yyy' and m = int
group by x
and manually add the grouped values from the two result sets together, I get different results from the single query, with the separate queries being more accurate.
E.g. the result for row 1 in the first query is 20 million, while adding the row 1 values from the two separate queries gives about 18 million. Not sure what the issue is.
Best to use parentheses when OR's are used with AND's.
select x, sum(y), sum(z) from db
where (n = 'xxx' or n = 'yyy') and m = int
group by x
In SQL, an AND takes precedence over an OR.
So this:
where n = 'xxx' or n = 'yyy' and m = int
Is actually processed as:
where n = 'xxx' or (n = 'yyy' and m = int)
And that gets the n that are 'xxx' with any m.
Anyway, Gordon has a point.
Using an IN for this is better. Even if it's only 2.
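A small reproduction makes the difference visible. This sketch uses Python's sqlite3 module with invented rows (the values, and m = 1 standing in for "m = int", are assumptions); AND binding tighter than OR is standard SQL, so the same effect is expected in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (x TEXT, n TEXT, m INTEGER, y INTEGER)")
conn.executemany(
    "INSERT INTO db VALUES (?, ?, ?, ?)",
    [("row1", "xxx", 1, 10),
     ("row1", "xxx", 2, 100),    # n matches, m does not
     ("row1", "yyy", 1, 1),
     ("row1", "yyy", 2, 1000)],  # n matches, m does not
)

def total(where):
    """Return SUM(y) for the given WHERE clause."""
    return conn.execute(f"SELECT SUM(y) FROM db WHERE {where}").fetchone()[0]

unparenthesized = total("n = 'xxx' OR n = 'yyy' AND m = 1")
parenthesized = total("(n = 'xxx' OR n = 'yyy') AND m = 1")
with_in = total("n IN ('xxx', 'yyy') AND m = 1")

print(unparenthesized, parenthesized, with_in)
```

The unparenthesized filter also counts the 'xxx' row with the wrong m, which is why the combined query in the question reports a larger total than the two separate queries added together.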
Use in. Your code doesn't really make sense:
where n in ('xxx', 'yyy') and m = int
This query:
where n = 'xxx' or 'yyy' and m = int
should return an error in SQL Server, because of the dangling 'yyy'. MySQL accepts this syntax. In that database, it would be processed as:
where n = 'xxx' or 'yyy' and m = int
-- AND has higher precedence than `or`
where n = 'xxx' or ('yyy' and m = int)
-- `'yyy'` is converted to an integer
where n = 'xxx' or (0 and m = int)
-- which is treated as a boolean
where n = 'xxx' or (false and m = int)
-- which is grouped like this
where n = 'xxx' or (false and (m = int))
-- which is equivalent to
where n = 'xxx'
I need a function to calculate a trend line. I have a query (part of the function):
select round(sum(nvl(vl_indice, vl_meta))/12, 2) from (
SELECT
SUM (vl_indice) vl_indice, SUM (vl_meta) vl_meta
FROM
(SELECT cd_mes, vl_indice, NULL vl_meta, dt.id_tempo,
fi.id_multi_empresa, fi.id_setor, fi.id_indice
FROM dbadw.fa_indice fi , dbadw.di_tempo dt ,
dbadw.di_multi_empresa dme , dbaportal.organizacao o ,
dbadw.di_indice di
WHERE fi.id_tempo = dt.id_tempo
AND DT.CD_MES BETWEEN TO_NUMBER(TO_CHAR(ADD_MONTHS(TO_DATE(TO_CHAR(PCD_MES),'YYYYMM'),- 11),'YYYYMM'))
AND PCD_MES
AND DT.ANO = TO_NUMBER(TO_CHAR(TO_DATE(TO_CHAR(PCD_MES),'YYYYMM'),'YYYY'))
AND fi.id_multi_empresa = dme.id_multi_empresa
AND dme.cd_multi_empresa = NVL(o.cd_multi_empresa_mv2000, o.cd_organizacao)
AND o.cd_organizacao = PCD_ORG
AND fi.id_setor IS NULL
AND fi.id_indice = di.id_indice
AND di.cd_indice = PCD_IVM
UNION ALL
SELECT cd_mes, NULL vl_indice, vl_meta, dt.id_tempo,
fm.id_multi_empresa, fm.id_setor, fm.id_indice
FROM dbadw.fa_meta_indice fm , dbadw.di_tempo dt ,
dbadw.di_multi_empresa dme , dbaportal.organizacao o ,
dbadw.di_indice di
WHERE fm.id_tempo = dt.id_tempo
AND DT.ANO = TO_NUMBER(TO_CHAR(TO_DATE(TO_CHAR(PCD_MES),'YYYYMM'),'YYYY'))
AND fm.id_multi_empresa = dme.id_multi_empresa
AND dme.cd_multi_empresa = NVL(o.cd_multi_empresa_mv2000, o.cd_organizacao)
AND o.cd_organizacao = PCD_ORG
AND fm.id_setor IS NULL
AND fm.id_indice = di.id_indice
AND di.cd_indice = PCD_IVM
)
GROUP BY cd_mes, id_tempo, id_multi_empresa, id_setor, id_indice
ORDER BY cd_mes);
I tried to calculate the trend line in the first line of the query, but the result is not correct. Can anybody help me?
It's very difficult to work out from a query what you want to fit a "trend line" to. By "trend line" I assume you mean using least-squares linear regression to find a best fit to the data.
So an example with test data:
Oracle Setup:
CREATE TABLE data ( x, y ) AS
SELECT LEVEL,
230 + DBMS_RANDOM.VALUE(-5,5) - 3.14159 * DBMS_RANDOM.VALUE( 0.95, 1.05 ) * LEVEL
FROM DUAL
CONNECT BY LEVEL <= 1000;
As you can see, the data is random, but it's approximately y = -3.14159x + 230.
Query - Get the Least Square Regression y-intercept and gradient:
SELECT REGR_INTERCEPT( y, x ) AS best_fit_y_intercept,
REGR_SLOPE( y, x ) AS best_fit_gradient
FROM data
This will get something like:
best_fit_y_intercept best_fit_gradient
-------------------- -----------------
230.531799878168 -3.143190435415
Query - Get the y co-ordinate of the line of best fit:
SELECT x,
y,
REGR_INTERCEPT( y, x ) OVER () + x * REGR_SLOPE( y, x ) OVER () AS best_fit_y
FROM data
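For readers who want to see what the slope and intercept represent, here is a minimal Python sketch of the standard least-squares formulas (slope = covariance of x and y divided by variance of x, intercept = mean of y minus slope times mean of x). The function name least_squares and the noise-free sample data are assumptions for illustration, not part of the Oracle example.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Noise-free illustrative data lying exactly on y = -3x + 230.
xs = list(range(1, 11))
ys = [230 - 3.0 * x for x in xs]

slope, intercept = least_squares(xs, ys)
best_fit = [intercept + slope * x for x in xs]  # y value of the trend line at each x
print(slope, intercept)
```

With noisy data, as in the Oracle setup above, the recovered values would only approximate the generating slope and intercept.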
The solution is:
SELECT valor, mes,
((mes * SLOPE) + INTERCEPT) TENDENCIA, SLOPE, INTERCEPT from
( select valor, mes, ROUND(REGR_SLOPE(valor,mes) over (partition by id_multi_empresa),4)SLOPE,
ROUND(REGR_INTERCEPT(valor,mes) over (PARTITION by id_multi_empresa),4) INTERCEPT from( --the initial select
I have the where condition in the sql:
WHERE
( Spectrum.access.dim_member.centene_ind = 0 )
AND
(
Spectrum.access.Client_List_Groups.Group_Name IN ( 'Centene Health Plan Book of Business' )
AND
Spectrum.access.dim_member.referral_route IN ( 'Claims Data' )
AND
***(
Spectrum.access.fact_task_metrics.task = 'Conduct IHA'
AND
Spectrum.access.fact_task_metrics.created_by_name <> 'BMU, BMU'
AND
Spectrum.access.fact_task_metrics.created_date BETWEEN '01/01/2015 00:0:0' AND '06/30/2015 00:0:0'
)***
AND
***(
Spectrum.access.fact_outreach_metrics.outreach_type IN ( 'Conduct IHA' )
AND
(
Spectrum.dbo.ufnTruncDate(Spectrum.access.fact_outreach_metrics.metric_date) >= Spectrum.access.fact_task_metrics.metric_date
OR
Spectrum.access.fact_outreach_metrics.metric_date >= Spectrum.access.fact_task_metrics.created_date
)
)***
AND
Spectrum.access.fact_outreach_metrics.episode_seq = 1
AND
Spectrum.access.dim_member.reinstated_date Is Null
)
I have marked two of the conditions in the above code.
The 1st condition has 2 AND operators.
The 2nd condition has an AND and an OR operator.
Question 1: Does removing the outer brackets "(" in the 1st condition impact the results?
Question 2: Does removing the outer brackets "(" in the 2nd condition impact the results?
After removing the outer bracket the filters will look like:
Spectrum.access.dim_member.referral_route IN ( 'Claims Data' )
AND
Spectrum.access.fact_task_metrics.task = 'Conduct IHA'
AND
Spectrum.access.fact_task_metrics.created_by_name <> 'BMU, BMU'
AND
Spectrum.access.fact_task_metrics.created_date BETWEEN '01/01/2015 00:0:0' AND '06/30/2015 00:0:0'
AND
Spectrum.access.fact_outreach_metrics.outreach_type IN ( 'Conduct IHA' )
AND
(
Spectrum.dbo.ufnTruncDate(Spectrum.access.fact_outreach_metrics.metric_date) >= Spectrum.access.fact_task_metrics.metric_date
OR
Spectrum.access.fact_outreach_metrics.metric_date >= Spectrum.access.fact_task_metrics.created_date
)
AND
Spectrum.access.fact_outreach_metrics.episode_seq = 1
Appreciate your help.
Regards,
Jude
Order of operations dictates that AND is processed before OR when these expressions are evaluated within the same set of parentheses.
WHERE (A AND B) OR (C AND D)
Is equivalent to:
WHERE A AND B OR C AND D
But the example below:
WHERE (A OR B) AND (C OR D)
Is not equivalent to:
WHERE A OR B AND C OR D
Which really becomes:
WHERE A OR (B AND C) OR D
Technically, you should be able to safely remove the parentheses in question in both of your examples, as long as the inner parentheses around the OR in your second condition stay in place (as they do in your rewritten filter). With AND, you are combining all of your conditions into one large condition, so regrouping them does not change the result. When using OR, you should place the parentheses carefully so that the groups are properly separated.
Take the following examples into consideration:
a) where y = 1 AND n = 2 AND x = 3 or x = 5
b) where y = 1 AND n = 2 AND (x = 3 or x = 5)
c) where (y = 1 AND n = 2 AND x = 3) or x = 5
In example A, the intended outcome is unclear to a reader. Because AND binds tighter than OR, it is evaluated the same way as example C.
In example B, the intended outcome states that all of the conditions must be met and X can be either 3 or 5.
In example C, the intended outcome states that either Y=1, N=2 and X=3 OR x=5. As long as X = 5, it doesn't matter what Y and N equal.
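The three examples can be checked exhaustively with a short Python sketch. Python's `and` also binds tighter than `or`, so it mirrors the SQL grouping here; SQL's three-valued logic with NULLs is not modeled, and the tested values are arbitrary choices.

```python
from itertools import product

def example_a(y, n, x):
    return y == 1 and n == 2 and x == 3 or x == 5

def example_b(y, n, x):
    return y == 1 and n == 2 and (x == 3 or x == 5)

def example_c(y, n, x):
    return (y == 1 and n == 2 and x == 3) or x == 5

# Every combination of a matching and a non-matching value for y and n,
# plus three values for x.
cases = list(product([0, 1], [0, 2], [3, 5, 7]))

a_matches_c = all(example_a(*case) == example_c(*case) for case in cases)
a_differs_from_b = [case for case in cases if example_a(*case) != example_b(*case)]

print(a_matches_c)
print(a_differs_from_b)
```

Examples A and C agree on every case, while A and B disagree exactly when x = 5 and the conditions on y and n are not both met.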