I have an Access SQL query pulling back results from a Latitude & Longitude input (similar to a store locator). It works perfectly fine until I add a WHERE clause limiting the results to those within XXX miles (3 in my case).
The following query works fine without the WHERE distCalc < 3 clause being added in:
PARAMETERS
[selNum] Long
, [selCBSA] Long
, [cosRadSelLAT] IEEEDouble
, [radSelLONG] IEEEDouble
, [sinRadSelLAT] IEEEDouble;
SELECT B.* FROM (
SELECT A.* FROM (
SELECT
CERT
, RSSDHCR
, NAMEFULL
, BRNUM
, NAMEBR
, ADDRESBR
, CITYBR
, STALPBR
, ZIPBR
, simsLAT
, simsLONG
, DEPDOM
, DEPSUMBR
, 3959 * ArcCOS(
cosRadSelLAT
* cosRadSimsLAT
* cos(radSimsLONG - radSelLONG)
+ sinRadSelLAT
* sinRadSimsLAT
) AS distCalc
FROM aBRc
WHERE CBSA = selCBSA
AND cosRadSimsLAT IS NOT NULL
AND UNINUMBR <> selNum
) AS A
ORDER BY distCalc
) AS B
WHERE B.distCalc < 3
ORDER BY B.DEPSUMBR DESC;
When I add the WHERE distCalc < 3 clause, I get the dreaded
This expression is typed incorrectly, or it is too complex to be evaluated.
error.
Given that the value is created in the A sub-query, I thought it would be available in the outer B query for comparisons. I could recalculate distCalc in the WHERE clause, but I'm trying to avoid that since I'm using a custom function (ArcCOS). I'm already doing one hit on each row, and there is significant overhead in doing it a second time, so I'd like to avoid that if I can.
The way you have it written, you are filtering on B.distCalc, which requires the calculation of A.distCalc, on which you are also asking for a sort. Even if this worked, it would require n^2 calculations to compute.
Try putting the filter on distCalc in the inner query (using the formula for distCalc, not distCalc itself).
This is not an answer. It's a formatted comment. What happens when you do this:
select *
from (
select somefield, count(*) records
from sometable
group by somefield) temp
If that runs successfully, try it with
where records > 0
at the end. If that fails, you probably need another approach. If it succeeds, start building your real query using baby steps like this. Test early and test often.
I was able to "solve" the problem by pushing the complicated formula into the function and then returning the value (which was then able to be used in the WHERE clause).
Instead of:
3959 * ArcCOS( cosRadSelLAT * cosRadSimsLAT * cos(radSimsLONG - radSelLONG) + sinRadSelLAT * sinRadSimsLAT) AS distCalc
I went with:
ArcCOS2(cosRadSelLAT,cosRadSimsLAT,radSimsLONG, radSelLONG,sinRadSelLAT,sinRadSimsLAT) AS distCalc
The ArcCOS2 function contains the full formula. The upside is that it works; the downside is that it appears to be slightly slower. I appreciate everyone's help on this. Thank you.
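For anyone who wants to try the same pattern outside Access, here is a minimal sketch using Python's sqlite3 module rather than Access/VBA. The whole formula lives in one scalar function, and the outer query filters on the alias it produces. The branch table, its column names, and the branches_within helper are hypothetical stand-ins for the real schema, and the clamp before acos is my own addition, not something taken from the original function.

```python
import math
import sqlite3

EARTH_RADIUS_MILES = 3959  # same constant the original query multiplies by


def arccos2(cos_sel_lat, cos_lat, rad_long, rad_sel_long, sin_sel_lat, sin_lat):
    """Whole distance formula in one call; returns miles."""
    x = cos_sel_lat * cos_lat * math.cos(rad_long - rad_sel_long) + sin_sel_lat * sin_lat
    # Assumption: clamp to [-1, 1], because rounding can push x just past 1
    # for near-identical points, where acos is undefined.
    return EARTH_RADIUS_MILES * math.acos(max(-1.0, min(1.0, x)))


def make_branch_db(points):
    """points: iterable of (name, lat_deg, long_deg); hypothetical 'branch' table."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("arccos2", 6, arccos2)
    conn.execute(
        "CREATE TABLE branch (name TEXT, cosRadLat REAL, sinRadLat REAL, radLong REAL)"
    )
    for name, lat, lon in points:
        r = math.radians(lat)
        conn.execute(
            "INSERT INTO branch VALUES (?, ?, ?, ?)",
            (name, math.cos(r), math.sin(r), math.radians(lon)),
        )
    return conn


def branches_within(conn, sel_lat_deg, sel_long_deg, max_miles):
    """Compute distCalc in a derived table, then filter on the alias outside it."""
    lat = math.radians(sel_lat_deg)
    params = (math.cos(lat), math.radians(sel_long_deg), math.sin(lat), max_miles)
    sql = """
        SELECT name, distCalc FROM (
            SELECT name,
                   arccos2(?, cosRadLat, radLong, ?, ?, sinRadLat) AS distCalc
            FROM branch
        ) AS a
        WHERE distCalc < ?
        ORDER BY distCalc
    """
    return conn.execute(sql, params).fetchall()
```

Calling branches_within(conn, 40.0, -75.0, 3) returns only the rows whose computed distance is under 3 miles. Whether the engine evaluates the function once or twice per row is up to its planner, so this says nothing about the speed difference seen in Access.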
In AWS Timestream I am trying to get the average heart rate for the first month since we have received heart rate samples for a specific user and the average for the last week. I'm having trouble with the query to get the first month part. When I try to use MIN(time) in the where clause I get the error: WHERE clause cannot contain aggregations, window functions or grouping operations.
SELECT * FROM "DATABASE"."TABLE"
WHERE measure_name = 'heart_rate' AND time < min(time) + 30
If I add it as a column and try to query on the column, I get the error: Column 'first_sample_time' does not exist
SELECT MIN(time) AS first_sample_time FROM "DATABASE"."TABLE"
WHERE measure_name = 'heart_rate' AND time > first_sample_time
Also if I try to add to MIN(time) I get the error: line 1:18: '+' cannot be applied to timestamp, integer
SELECT MIN(time) + 30 AS first_sample_time FROM "DATABASE"."TABLE"
Here is what I finally came up with but I'm wondering if there is a better way to do it?
WITH first_month AS (
SELECT
Min(time) AS creation_date,
From_milliseconds(
To_milliseconds(
Min(time)
) + 2628000000
) AS end_of_first_month,
USER
FROM
"DATABASE"."TABLE"
WHERE
USER = 'xxx'
AND measure_name = 'heart_rate'
GROUP BY
USER
),
first_month_avg AS (
SELECT
Avg(hm.measure_value :: DOUBLE) AS first_month_average,
fm.USER
FROM
"DATABASE"."TABLE" hm
JOIN first_month fm ON hm.USER = fm.USER
WHERE
measure_name = 'heart_rate'
AND hm.time BETWEEN fm.creation_date
AND fm.end_of_first_month
GROUP BY
fm.USER
),
last_week_avg AS (
SELECT
Avg(measure_value :: DOUBLE) AS last_week_average,
USER
FROM
"DATABASE"."TABLE"
WHERE
measure_name = 'heart_rate'
AND time > ago(14d)
AND USER = 'xxx'
GROUP BY
USER
)
SELECT
lwa.last_week_average,
fma.first_month_average,
lwa.USER
FROM
first_month_avg fma
JOIN last_week_avg lwa ON fma.USER = lwa.USER
Is there a better or more efficient way to do this?
I can see you've run into a few challenges along the way to your solution, and hopefully I can clear these up for you and also propose a cleaner way of reaching your solution.
Filtering on aggregates
As you've experienced first hand, SQL doesn't allow aggregates in the WHERE clause, and you also cannot filter there on new columns you've created in the SELECT list, such as aggregates or CASE expressions, because WHERE is evaluated against the rows of the table you're querying, before those columns exist.
Fortunately there are ways around this, such as:
Making your main query a subquery, and then filtering on the result of that query, like below
Select * from (select *,count(that_good_stuff) as total_good_stuff from tasty_table group by 1,2,3) where total_good_stuff > 69
This works because the aggregate column (count) is no longer an aggregate at the time it's called in the where statement, it's in the result of the subquery.
Having clause
If a subquery isn't your cup of tea, you can use the having clause straight after your group by statement, which acts like a where statement except exclusively for handling aggregates.
This is better than resorting to a subquery in most cases, as it's more readable and I believe more efficient.
select *,count(that_good_stuff) as total_good_stuff from tasty_table group by 1,2,3 having count(that_good_stuff) > 69
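To make the two options concrete, here is a small sketch that runs both forms against an in-memory SQLite database from Python. This is not Timestream, and the category column and the sample rows are made up for illustration; the point is only that the subquery filter and the HAVING filter select the same groups.

```python
import sqlite3


def make_tasty_db():
    """Hypothetical table: one grouping column plus the counted column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tasty_table (category TEXT, that_good_stuff INTEGER)")
    conn.executemany(
        "INSERT INTO tasty_table VALUES (?, ?)",
        [("a", 1), ("a", 2), ("a", 3), ("b", 4)],
    )
    return conn


# Option 1: aggregate in a subquery, filter on its result in the outer WHERE.
SUBQUERY_FILTER = """
    SELECT category, total_good_stuff
    FROM (SELECT category, COUNT(that_good_stuff) AS total_good_stuff
          FROM tasty_table
          GROUP BY category) AS t
    WHERE total_good_stuff > ?
"""

# Option 2: filter the groups directly with HAVING, repeating the aggregate
# expression because not every engine accepts the alias there.
HAVING_FILTER = """
    SELECT category, COUNT(that_good_stuff) AS total_good_stuff
    FROM tasty_table
    GROUP BY category
    HAVING COUNT(that_good_stuff) > ?
"""


def groups_above(conn, sql, threshold):
    """Run one of the two queries with the given threshold."""
    return conn.execute(sql, (threshold,)).fetchall()
```

With the sample rows above, both queries keep only category "a" when the threshold is 2.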
Finally, window functions are fantastic...they've really helped condense many queries I've made in the past by removing the need for subqueries/CTEs. If you could share some example raw data (remove any PII of course) I'd be happy to share an example for your use case.
Nevertheless, hope this helps!
Tom
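As a rough illustration of the subquery idea applied to the "first month" part of the question, the sketch below computes the earliest sample time in a scalar subquery and compares against it in the WHERE clause. It runs on SQLite from Python, with timestamps stored as integer epoch seconds; the heart_rate table, its columns, and first_window_average are assumptions made for the example. Timestream has its own timestamp and interval functions, so treat this as the shape of the query only.

```python
import sqlite3

# MIN(ts) sits inside a scalar subquery, so the outer WHERE contains no
# aggregate. The window is half-open: [first sample, first sample + window).
FIRST_WINDOW_AVG = """
    SELECT AVG(value)
    FROM heart_rate
    WHERE user_id = :user_id
      AND ts < (SELECT MIN(ts) FROM heart_rate WHERE user_id = :user_id)
               + :window_seconds
"""


def make_heart_rate_db(samples):
    """samples: iterable of (user_id, ts_epoch_seconds, value); hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE heart_rate (user_id TEXT, ts INTEGER, value REAL)")
    conn.executemany("INSERT INTO heart_rate VALUES (?, ?, ?)", samples)
    return conn


def first_window_average(conn, user_id, window_seconds):
    """Average of the user's samples within window_seconds of their first sample."""
    params = {"user_id": user_id, "window_seconds": window_seconds}
    return conn.execute(FIRST_WINDOW_AVG, params).fetchone()[0]
```

This avoids the join back to a first_month CTE, at the cost of reading the table twice; which form is cheaper depends on the engine.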
I discovered this strange behavior with this query:
-- TP4N has stock_class = 'Bond'
select lot.symbol
, round(sum(lot.qty_left), 4) as "Qty"
from ( select symbol
, qty_left
-- , amount
from trade_lot_tbl t01
where t01.symbol not in (select symbol from stock_tbl where stock_class = 'Cash')
and t01.qty_left > 0
and t01.trade_date <= current_date -- only current trades
union
select 'CASH' as symbol
, sum(qty_left) as qty_left
-- , sum(amount) as amount
from trade_lot_tbl t11
where t11.symbol in (select symbol from stock_tbl where stock_class = 'Cash')
and t11.qty_left > 0
and t11.trade_date <= current_date -- only current trades
group by t11.symbol
) lot
group by lot.symbol
order by lot.symbol
;
Run as is, the Qty for TP4N is 1804.42
Run with the two 'amount' lines un-commented, which as far as I can tell should NOT affect the result, yet Qty for TP4N = 1815.36. Only ONE of the symbols (TP4N) has a changed value, all others remain the same.
Run with the entire 'union' statement commented out results in Qty for TP4N = 1827.17
The correct answer, as far as I can tell, is 1827.17.
So, to summarize, I get three different values by modifying parts of the query that, as far as I can tell, should NOT affect the answer.
I'm sure I'm going to kick myself when the puzzle is solved, this smells like a silly mistake.
Likely, what you are seeing is caused by the use of union. This set operator deduplicates the resultsets that are returned by both queries. So adding or removing columns in the unioned sets may affect the final resultset (by default, adding more columns reduces the risk of duplication).
As a rule of thumb: unless you do want deduplication, you should use union all (which is also more efficient, since the database does not need to search for duplicates).
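Here is a small reproduction of that effect, sketched with SQLite from Python instead of the original database. The trade_lot table and its three rows are invented: two lots of the same symbol share the same qty_left but have different amounts, so UNION collapses them when only symbol and qty_left are selected and keeps them apart once amount is added.

```python
import sqlite3


def make_lot_db():
    """Hypothetical lots: the first two rows differ only in amount."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trade_lot (symbol TEXT, qty_left REAL, amount REAL)")
    conn.executemany(
        "INSERT INTO trade_lot VALUES (?, ?, ?)",
        [("TP4N", 10.0, 100.0), ("TP4N", 10.0, 105.0), ("TP4N", 5.0, 50.0)],
    )
    return conn


def qty_for(conn, symbol, set_op="UNION", with_amount=False):
    """Sum qty_left for one symbol over the combined result of both branches."""
    if set_op not in ("UNION", "UNION ALL"):
        raise ValueError("set_op must be 'UNION' or 'UNION ALL'")
    extra = ", amount" if with_amount else ""
    cash_extra = ", 0.0" if with_amount else ""
    sql = f"""
        SELECT SUM(qty_left) FROM (
            SELECT symbol, qty_left{extra} FROM trade_lot
            {set_op}
            SELECT 'CASH', 0.0{cash_extra}
        ) AS lot
        WHERE symbol = ?
    """
    return conn.execute(sql, (symbol,)).fetchone()[0]
```

Note that the duplicates being removed come from within the first branch: UNION deduplicates the whole combined result, not just rows that appear on both sides.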
I want to extract only rows that match exactly, not rows that match partially.
But I can't get that result when I execute the SQL below.
I expected this SQL to return no rows, but it returns all rows.
【SQL code】
WITH a AS(
SELECT
001 AS id_a,
112345678901234567 AS x
UNION ALL
SELECT
002,
112345678901233567
UNION ALL
SELECT
003,
112345678901232568
),
comp_a AS(
SELECT
*
FROM
a
WHERE
x IN(112345678901234000, 112345678901233000, 112345678901232000)
),
comp_b AS(
SELECT
004 AS id_b
UNION ALL
SELECT
005
)
SELECT
id_a,
id_b
FROM
comp_a
LEFT OUTER JOIN
comp_b
ON (
comp_a.id_a = comp_b.id_b
)
WHERE
comp_b.id_b IS NULL
;
I thought the "in" clause was used for exact matching.
But it looks as if this SQL is being evaluated like a "like" clause rather than an "in" clause.
I would be glad if you could tell me how to solve this.
■Further notes:
・I cleared the browser cache and the BigQuery cache, but that did not solve it.
・This SQL is sample code, because I can't share the real SQL.
・I can reproduce this problem in one BigQuery environment,
but I can't reproduce it in another BigQuery environment.
The problem may lie not in the SQL but in the environment
or its settings.
Thank you for answering my question.
I solved my problem.
The cause was not BigQuery but the cell format in Excel.
Detail:
I checked the data in Excel, because the result set was large.
Unfortunately, because the Excel cells were formatted as a numeric type, some of the numbers were rounded. So I mistook the correct result for a wrong one.
Sorry about my misunderstanding.
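To show what happened, here is a small Python sketch. It assumes that Excel keeps only 15 significant digits in a numeric cell and replaces the remaining digits with zeros; keep_15_significant_digits is a hypothetical helper that imitates this, not an Excel function. The values are the ones from the sample query.

```python
# IDs as stored in the table, and the values used in the IN list.
ORIGINAL_IDS = [112345678901234567, 112345678901233567, 112345678901232568]
IN_LIST = {112345678901234000, 112345678901233000, 112345678901232000}


def keep_15_significant_digits(n: int) -> int:
    """Imitate a numeric Excel cell: digits after the 15th become zeros (assumption)."""
    digits = str(abs(n))
    if len(digits) <= 15:
        return n
    kept = int(digits[:15]) * 10 ** (len(digits) - 15)
    return kept if n >= 0 else -kept


def matches(values):
    """Exact membership test, which is how IN compares integers."""
    return [v for v in values if v in IN_LIST]
```

With the exact 18-digit values, matches(ORIGINAL_IDS) is empty, so IN behaves as an exact match. Only after the values pass through keep_15_significant_digits do they appear to be in the list. Exporting such IDs as text rather than as numbers is one way to avoid this.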
I have a table order, which is very straightforward; it stores order data.
I have a view that holds currency pairs and currency rates. The view is created as below:
create or replace view view_currency_rate as (
select c.* from currency_rate c, (
select curr_from, curr_to, max(rate_date) max_rate_date from currency_rate
where system_rate > 0
group by curr_from, curr_to) r
where c.curr_from = r.curr_from
and c.curr_to = r.curr_to
and c.rate_date = r.max_rate_date
and c.system_rate > 0
);
Nothing fancy here; this view returns the latest currency rate (curr_from -> curr_to) from the currency_rate table.
When I run the query below, it returns 80k rows (all data), because I have plenty of records in the order table. The time spent is less than 5 seconds.
First Query:
select * from
VIEW_CURRENCY_RATE c, order a
where
c.curr_from = A.CURRENCY;
I wanted to add another filter, thinking it would make the query faster, so I added this:
Second Query:
select * from
VIEW_CURRENCY_RATE c, order a
where
a.id = 'xxxx'
and c.curr_from = A.CURRENCY;
And now it runs for over 1 minute! I have no idea what is happening. I thought the Oracle optimizer might be going wrong, so I tried another way: since the 80K rows can be returned quite fast, I tried to filter that result by nesting the SQL as below:
select * from (
select * from
VIEW_CURRENCY_RATE c, order a
where
c.curr_from = A.CURRENCY
)
where id = 'xxxx';
It runs very slowly as well! I am running out of ideas. Can anyone explain what is happening with my query?
Updated on 6-Sep-2016
After learning how to use 'explain plan', I captured the plans:
First query (fast one with 80K data):
Second query (slow one):
The slow one completely breaks the view apart and forms a new SQL statement! It is very strange; how can Oracle optimize it like that?
The problem seems to be related to the plan of the second query, because it uses nested loops in place of a hash join.
First, check whether _hash_join_enabled is true; if it isn't, change it to true. If it is already true, there is some problem with the Oracle optimizer. To test this, use the USE_HASH(tab2 tab1) hint.
Regards
mohsen
I am using Mike's solution: I rewrote the script and it is running fast now, although the root cause has not been determined. It is probably due to the Oracle optimizer working differently from what I expected.
I have a MS-Access query that interprets dates and provides an appropriate status for given projects. However, this query provides data on only one division at a time. I am trying to eliminate the sub-query (highlighted in red below) so I can more easily repurpose the query for other division reports.
Below is the sub-query, named qryProjectStatusDPLphase1:
SELECT tblProject_HIF_FCF_CBH.HifFcfCbh
, tblProject_HIF_FCF_CBH.ProjectNum
, tblProject_HIF_FCF_CBH.Stat_CondCommitDt AS CondCommit
, tblProject_HIF_FCF_CBH.Stat_FirmCommitDt AS FirmCommit
, tblProject_HIF_FCF_CBH.Stat_FundAgtRecdDt AS FundAgt
, tblProject_HIF_FCF_CBH.Stat_InDisbursemtDt AS Disbursemt
, tblProject_HIF_FCF_CBH.Stat_ServicingDt AS Servicing
FROM tblProject_HIF_FCF_CBH
WHERE (tblProject_HIF_FCF_CBH.HifFcfCbh) Like "FCF";
The problem I run into is that when I insert the red subquery directly into the larger operation, Access still insists on finding the subquery. So when I forcibly remove the stand-alone subquery from Access entirely, I get an error, "The Microsoft Access database engine cannot find the input table or query 'qryProjectStatusDPLphase1'..."
In an effort to identify the problem, I built up the query by adding one piece at a time. When I run the red subquery all by itself, I get no errors. When I run the red + blue sections of the query, again I get no errors. But when I run red + blue + teal sections, then I get the error. My suspicion is that there is something wrong with the way the tables are joined, that prevents red + blue + teal working together properly.
Unfortunately I've spent days on this and can't seem to crack the code, so I was hoping for some wisdom from the cloud.
You are getting the error because you aliased the inner qryProjectStatusDPLphase1 query, and you are trying to call it again in the outer query. Unfortunately, it is out of scope for the outer query, so it won't work.
If you don't want to save qryProjectStatusDPLphase1 as an Access query, you can simply paste the query's sql code into parentheses for the INNER JOIN:
...
FROM(SELECT tblProject_HIF_FCF_CBH.HifFcfCbh
, tblProject_HIF_FCF_CBH.ProjectNum
, tblProject_HIF_FCF_CBH.Stat_CondCommitDt AS CondCommit
, tblProject_HIF_FCF_CBH.Stat_FirmCommitDt AS FirmCommit
, tblProject_HIF_FCF_CBH.Stat_FundAgtRecdDt AS FundAgt
, tblProject_HIF_FCF_CBH.Stat_InDisbursemtDt AS Disbursemt
, tblProject_HIF_FCF_CBH.Stat_ServicingDt AS Servicing
FROM tblProject_HIF_FCF_CBH
WHERE (tblProject_HIF_FCF_CBH.HifFcfCbh) Like "FCF"
) AS qryProjectStatusDPLphase1, zProjectStatusDPL
)as t
INNER JOIN (SELECT tblProject_HIF_FCF_CBH.HifFcfCbh
, tblProject_HIF_FCF_CBH.ProjectNum
, tblProject_HIF_FCF_CBH.Stat_CondCommitDt AS CondCommit
, tblProject_HIF_FCF_CBH.Stat_FirmCommitDt AS FirmCommit
, tblProject_HIF_FCF_CBH.Stat_FundAgtRecdDt AS FundAgt
, tblProject_HIF_FCF_CBH.Stat_InDisbursemtDt AS Disbursemt
, tblProject_HIF_FCF_CBH.Stat_ServicingDt AS Servicing
FROM tblProject_HIF_FCF_CBH
WHERE (tblProject_HIF_FCF_CBH.HifFcfCbh) Like "FCF") AS a
...
Looking at your screen-shot, you're still trying to join the nested query t to the subquery qryProjectStatusDPLphase1:
AS t
INNER JOIN qryProjectStatusDPLphase1 As a
ON t.ProjectNum = a.ProjectNum
You'll need to replace that second instance of qryProjectStatusDPLphase1 with the nested query.
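The scoping rule can be reproduced in a few lines. The sketch below uses SQLite from Python rather than Access, with a cut-down two-column version of tblProject_HIF_FCF_CBH and an inner alias called phase1, all of which are made up for the example. The first query refers to the inner alias from the outer FROM clause and fails; the second repeats the nested query and works.

```python
import sqlite3

# Stand-in for the saved query qryProjectStatusDPLphase1 (simplified columns).
PHASE1 = (
    "SELECT ProjectNum, HifFcfCbh FROM tblProject_HIF_FCF_CBH "
    "WHERE HifFcfCbh = 'FCF'"
)

# 'phase1' is only an alias inside the derived table t, so the outer
# INNER JOIN cannot see it and the engine looks for a real table or query.
OUT_OF_SCOPE = f"""
    SELECT t.ProjectNum
    FROM (SELECT phase1.ProjectNum FROM ({PHASE1}) AS phase1) AS t
    INNER JOIN phase1 AS a ON t.ProjectNum = a.ProjectNum
"""

# Fix: paste the nested query again wherever it is needed.
REPEATED = f"""
    SELECT t.ProjectNum
    FROM (SELECT phase1.ProjectNum FROM ({PHASE1}) AS phase1) AS t
    INNER JOIN ({PHASE1}) AS a ON t.ProjectNum = a.ProjectNum
"""


def make_project_db():
    """Hypothetical two-row table, just enough to run the joins."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tblProject_HIF_FCF_CBH (ProjectNum INTEGER, HifFcfCbh TEXT)")
    conn.executemany(
        "INSERT INTO tblProject_HIF_FCF_CBH VALUES (?, ?)", [(1, "FCF"), (2, "HIF")]
    )
    return conn


def run(conn, sql):
    """Execute a query and return all rows."""
    return conn.execute(sql).fetchall()
```

Running OUT_OF_SCOPE raises sqlite3.OperationalError because no table named phase1 exists, which is the SQLite counterpart of the Access "cannot find the input table or query" message.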