I'm having an issue with converting a view from PostgreSQL to Oracle when a sub-query is referencing a column in the outer query.
This issue seems to have been discussed here several times but I have been unable to get any of the fixes to work with my specific query.
The query's purpose is to get a mobile device's last recorded position and the distance in km from its closest checkpoint/geo-boundary. It references three separate tables: devices, device_locations and checkpoints.
SELECT
d.id,
dl.latitude AS last_latitude,
dl.longitude AS last_longitude,
(SELECT * /* Get closest 'checkpoint' to the last device position by calculating the Great-circle distance */
FROM (
SELECT
6371 * acos(cos(dl.latitude / (180/acos(-1))) * cos(checkpoints.latitude / (180/acos(-1))) * cos((checkpoints.longitude / (180/acos(-1))) - (dl.longitude / (180/acos(-1)))) + sin(dl.latitude / (180/acos(-1))) * sin(checkpoints.latitude / (180/acos(-1)))) AS distance
FROM checkpoints
ORDER BY distance)
WHERE ROWNUM = 1) AS distance_to_checkpoint
FROM devices d
LEFT JOIN ( /* Get the last position of the device */
SELECT l.id,
l.time,
l.latitude,
l.longitude,
l.accuracy
FROM device_locations l
WHERE l.ROWID IN (SELECT MAX(ROWID) FROM device_locations GROUP BY id)
ORDER BY l.id, l.time DESC) dl
ON dl.id = d.id;
I've been stuck on this for a while and am hoping someone can put me on the right path. Thanks.
This is a follow-up to my other answer. In order to get the checkpoints record with the minimum distance, you'd join with the table and use window functions again to pick the best record. E.g.:
select
device_id,
last_latitude,
last_longitude,
checkpoint_latitude,
checkpoint_longitude,
distance
from
(
select
device_id,
last_latitude,
last_longitude,
checkpoint_latitude,
checkpoint_longitude,
distance,
min(distance) over (partition by device_id) as min_distance
from
(
select
d.id as device_id,
dl.latitude as last_latitude,
dl.longitude as last_longitude,
cp.latitude as checkpoint_latitude,
cp.longitude as checkpoint_longitude,
6371 *
acos(cos(dl.latitude / (180/acos(-1))) *
cos(cp.latitude / (180/acos(-1))) *
cos((cp.longitude / (180/acos(-1))) - (dl.longitude / (180/acos(-1))))
+
sin(dl.latitude / (180/acos(-1))) *
sin(cp.latitude / (180/acos(-1)))
) as distance
from devices d
left join
(
select
id as device_id, latitude, longitude, time,
max(time) over (partition by id) as max_time
from device_locations
) dl on dl.device_id = d.id and dl.time = dl.max_time
cross join checkpoints cp
)
)
where (distance = min_distance) or (distance is null and min_distance is null);
Such queries are easier to write with CROSS APPLY and OUTER APPLY, available as of Oracle 12c.
I see two issues:
Extra comma after your final select column: AS distance_to_checkpoint,
Outer select columns reference an inner table device_locations l, instead of the derived table dl - example: l.latitude should be dl.latitude
First of all: The query doesn't get the last device positions. It gets the records with the highest ROWID per ID which may happen to be the latest entry, but is not at all guaranteed to be.
Then you most probably have an issue with scope. Unfortunately, names are only valid one level deep, which is an annoying limitation. dl.latitude etc. are probably not valid in your subquery, because it's actually a subquery within a subquery. Anyway, what you are trying to get is the minimum distance, which you can easily get with MIN.
An ORDER BY in a subquery is superfluous in standard SQL. Oracle makes an exception for their ROWNUM technique, but I wouldn't make use of this. (And as mentioned, it's even clumsy for getting a minimum value.) The ORDER BY in the outer join is superfluous anyway.
This is how I would approach the problem:
select
d.id as device_id,
dl.latitude as last_latitude,
dl.longitude as last_longitude,
(
select min(6371 *
acos(cos(dl.latitude / (180/acos(-1))) *
cos(cp.latitude / (180/acos(-1))) *
cos((cp.longitude / (180/acos(-1))) - (dl.longitude / (180/acos(-1))))
+
sin(dl.latitude / (180/acos(-1))) *
sin(cp.latitude / (180/acos(-1)))
)
)
from checkpoints cp
) as distance
from devices d
left join
(
select
id as device_id, latitude, longitude, time,
max(time) over (partition by id) as max_time
from device_locations
) dl on dl.device_id = d.id and dl.time = dl.max_time;
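As a cross-check of the distance expression used in these queries, here is a minimal Python sketch of the same spherical law of cosines calculation, with 6371 as the Earth's radius in km. The function names and sample coordinates are made up for illustration, and the `min` call plays the role of the `MIN` subquery. It is a sketch for checking numbers, not a replacement for the SQL.

```python
import math

EARTH_RADIUS_KM = 6371  # same constant as in the SQL


def great_circle_km(lat1, lon1, lat2, lon2):
    """Spherical law of cosines; inputs in degrees, result in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.cos(p1) * math.cos(p2) * math.cos(dlon)
                 + math.sin(p1) * math.sin(p2))
    # Rounding can push the value slightly outside [-1, 1], where acos fails.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


def distance_to_closest_checkpoint(device_pos, checkpoints):
    """Minimum distance, or None when there is no position or no checkpoint."""
    if device_pos is None or not checkpoints:
        return None
    lat, lon = device_pos
    return min(great_circle_km(lat, lon, c_lat, c_lon)
               for c_lat, c_lon in checkpoints)


# Hypothetical sample data
checkpoints = [(52.0, 4.0), (48.0, 2.0)]
print(distance_to_closest_checkpoint((52.0, 4.0), checkpoints))
print(distance_to_closest_checkpoint(None, checkpoints))  # device without a position
```

The clamp before `acos` is worth noting: for two identical points the cosine can come out marginally above 1 because of rounding, which is also something to watch for in the SQL version.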
I'm trying to build a query that provides me a list of five jobs for a weekly promotion. The query works fine and gives the right result. There is only one factor that needs a filter.
We want to promote different jobs from different companies. The ORDER BY makes it possible to select the jobs with the highest need for applicants. It could be that one company has the five most urgent jobs, in which case the query selects five jobs from that one company. I want to add a filter so that the query selects a maximum of two or three jobs per company, but I couldn't find out how.
I've tried different variations of DISTINCT, but without results. I think the underlying problem has something to do with a wrong group function on job.id (just a thought), but I can't find a solution.
SELECT
job.id,
company_name,
city,
job.title,
hourly_rate_amount,
created_at,
count(work_intent.id),
number_of_contractors,
(count(work_intent.id)/number_of_contractors) AS applicants,
(3959 * acos(cos(radians(52.370216)) * cos( radians(address.latitude))
* cos(radians(longitude) - radians(4.895168)) + sin(radians(52.370216)) * sin(radians(latitude)))) AS distance
FROM job
INNER JOIN client on job.client_id = client.id
INNER JOIN address on job.address_id = address.id
LEFT JOIN work_intent on job.id = work_intent.job_id
INNER JOIN job_title on job.job_title_id = job_title.id
WHERE job_title.id = ANY
(SELECT job_title.id FROM job_title WHERE job.job_title_id = '28'
or job.job_title_id = '30'
or job.job_title_id = '31'
or job.job_title_id = '32'
)
AND job.status = 'open'
AND convert(job.starts_at, date) = '2019-09-19'
AND hourly_rate_amount > 1500
GROUP BY job.id
HAVING distance < 20
ORDER BY applicants, distance
LIMIT 5
I expect the output would be:
job.id - company_name - applicants
14842 - company_1 - 0
46983 - company_6 - 0
45110 - company_5 - 0
95625 - company_1 - 1
12055 - company_3 - 2
One quite simple solution, which can be applied without essentially modifying the logic of the query, is to wrap the query and use ROW_NUMBER() to rank the records. Then, you can filter on the row number to limit the number of records per company.
Consider:
SELECT *
FROM (
SELECT
x.*,
row_number() over(partition by company_name order by applicants, distance) rn
FROM (
-- your query, without ORDER BY and LIMIT
) x
) y
WHERE rn <= 3
ORDER BY applicants, distance
LIMIT 5
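To make the effect of the row-number filter concrete, here is a small Python sketch of the same logic: sort by applicants and distance, count rows per company, drop rows beyond the cap, then take the first five. The job data and the `pick_jobs` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def pick_jobs(jobs, per_company=3, limit=5):
    """Keep at most `per_company` jobs per company, then take the first `limit`."""
    ranked = sorted(jobs, key=lambda j: (j["applicants"], j["distance"]))
    seen = defaultdict(int)  # plays the role of row_number() per company
    picked = []
    for job in ranked:
        seen[job["company_name"]] += 1
        if seen[job["company_name"]] <= per_company:
            picked.append(job)
    return picked[:limit]


# Hypothetical rows: company_1 holds the four most urgent jobs
jobs = [
    {"id": 1, "company_name": "company_1", "applicants": 0, "distance": 1.0},
    {"id": 2, "company_name": "company_1", "applicants": 0, "distance": 2.0},
    {"id": 3, "company_name": "company_1", "applicants": 0, "distance": 3.0},
    {"id": 4, "company_name": "company_1", "applicants": 0, "distance": 4.0},
    {"id": 5, "company_name": "company_5", "applicants": 1, "distance": 5.0},
    {"id": 6, "company_name": "company_6", "applicants": 2, "distance": 1.5},
    {"id": 7, "company_name": "company_3", "applicants": 3, "distance": 2.5},
]

print([j["id"] for j in pick_jobs(jobs, per_company=2)])  # [1, 2, 5, 6, 7]
```

With a cap of two, the third and fourth jobs of company_1 are skipped and the remaining slots go to other companies.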
Finding distances on the surface of the earth means using great-circle distances, which can be computed with the haversine formula or, as in the query below, with the spherical law of cosines.
The problem is this: Given a table of locations with latitudes and longitudes, which of those locations are nearest to a given location?
I have the following query:
SELECT z.id,
z.latitude, z.longitude,
p.radius,
p.distance_unit
* DEGREES(ACOS(COS(RADIANS(p.latpoint))
* COS(RADIANS(z.latitude))
* COS(RADIANS(p.longpoint - z.longitude))
+ SIN(RADIANS(p.latpoint))
* SIN(RADIANS(z.latitude)))) AS distance
FROM doorbots as z
JOIN ( /* these are the query parameters */
SELECT 34.0480698 AS latpoint, -118.3589196 AS longpoint,
2 AS radius, 111.045 AS distance_unit
) AS p ON 1=1
WHERE z.latitude between ... and
z.longitude between ...
How can I use the earthdistance extension to replace the complicated formula in my query?
Is the following an equivalent change?
SELECT z.id,
z.latitude, z.longitude,
p.radius,
round(earth_distance(ll_to_earth(p.latpoint, p.longpoint), ll_to_earth(z.latitude, z.longitude))::NUMERIC,0) AS distance
FROM doorbots as z
JOIN ( /* these are the query parameters */
SELECT 34.0480698 AS latpoint, -118.3589196 AS longpoint,
2 AS radius, 111.045 AS distance_unit
) AS p ON 1=1
WHERE z.latitude between ... and
z.longitude between ...
You can get the most out of earthdistance with the following queries:
Locations close enough (i.e. within 1000000.0 meters -- 621.371192 miles) to (34.0480698, -118.3589196):
select *
from doorbots z
where earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(34.0480698, -118.3589196)) < 1000000.0; -- in meters
select *
from doorbots z
where point(z.longitude, z.latitude) <@> point(-118.3589196, 34.0480698) < 621.371192; -- in miles
Top 5 locations closest to (34.0480698, -118.3589196):
select *
from doorbots z
order by earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(34.0480698, -118.3589196))
limit 5;
select *
from doorbots z
order by point(z.longitude, z.latitude) <@> point(-118.3589196, 34.0480698)
limit 5;
To use indexes, apply the following one to your table:
create index idx_doorbots_latlong
on doorbots using gist (earth_box(ll_to_earth(latitude, longitude), 0));
Use index for: locations close enough (i.e. within 1000000.0 meters -- 621.371192 miles) to (34.0480698, -118.3589196):
with p as (
select 34.0480698 as latitude,
-118.3589196 as longitude,
1000000.0 as max_distance_in_meters
)
select z.*
from p, doorbots z
where earth_box(ll_to_earth(z.latitude, z.longitude), 0) <@ earth_box(ll_to_earth(p.latitude, p.longitude), p.max_distance_in_meters)
and earth_distance(ll_to_earth(z.latitude, z.longitude), ll_to_earth(p.latitude, p.longitude)) < p.max_distance_in_meters;
Use index for: top 5 locations closest to (34.0480698, -118.3589196):
select z.*
from doorbots z
order by earth_box(ll_to_earth(z.latitude, z.longitude), 0) <-> earth_box(ll_to_earth(34.0480698, -118.3589196), 0)
limit 5;
http://rextester.com/WQAY4056
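On the question of whether the two queries are equivalent, one thing worth checking is the unit and the assumed Earth radius. The Python sketch below evaluates the original expression (111.045 per degree, which I am assuming means kilometres per degree) next to a spherical distance using a radius of 6378168 m, the value I believe earthdistance's `earth()` returns by default. The second coordinate pair is made up for illustration.

```python
import math

KM_PER_DEGREE = 111.045     # distance_unit from the original query (assumed km per degree)
EARTH_RADIUS_M = 6378168.0  # assumed default radius of earthdistance's earth(), in metres


def central_angle_deg(lat1, lon1, lat2, lon2):
    """Angle between two points as seen from the Earth's centre, in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    c = math.cos(p1) * math.cos(p2) * math.cos(dlon) + math.sin(p1) * math.sin(p2)
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def original_distance_km(lat1, lon1, lat2, lon2):
    """Mirrors distance_unit * DEGREES(ACOS(...)) from the first query."""
    return KM_PER_DEGREE * central_angle_deg(lat1, lon1, lat2, lon2)


def sphere_distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance on a sphere of radius EARTH_RADIUS_M."""
    return EARTH_RADIUS_M * math.radians(central_angle_deg(lat1, lon1, lat2, lon2))


point_a = (34.0480698, -118.3589196)  # from the question
point_b = (34.1, -118.3)              # hypothetical nearby location

km = original_distance_km(*point_a, *point_b)
m = sphere_distance_m(*point_a, *point_b)
print(km, m, m / km)
```

Under these assumptions the ratio comes out a little above 1000: the results are in metres rather than kilometres, and the radius differs by roughly a quarter of a percent. So the rewritten query would need its result divided by 1000 before it can be compared with `radius` in the same way as the original.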
This post is related to another question of mine. I came up with a recursive query that does basically what I want. The recursion continues as long as the counter x has not exceeded the dist_calc_points attribute. But this works only for one entry (see the WHERE v2_channel.id=2 clause). How can I apply this query to the whole table?
WITH RECURSIVE dist(x, the_geom, d) AS (
SELECT
0::double precision,
the_geom,
0::double precision
FROM v2_channel where v2_channel.id=2
UNION ALL
SELECT
x+1,
v2_channel.the_geom AS gm,
d+(1/v2_channel.dist_calc_points) AS dist_calc_pnts
FROM v2_channel, dist
WHERE dist.x<v2_channel.dist_calc_points AND v2_channel.id=2
)
SELECT *, ST_AsText(ST_LineInterpolatePoint(the_geom, d)) FROM dist;
To allow the CTE to apply to multiple rows, you have to be able to identify these rows. So just add the ID:
WITH RECURSIVE dist(id, x, the_geom, d) AS (
SELECT
id,
0::double precision,
the_geom,
0::double precision
FROM v2_channel
UNION ALL
SELECT
dist.id,
x+1,
v2_channel.the_geom AS gm,
d+(1/v2_channel.dist_calc_points) AS dist_calc_pnts
FROM v2_channel JOIN dist
ON dist.x < v2_channel.dist_calc_points
AND dist.id = v2_channel.id
)
SELECT *, ST_AsText(ST_LineInterpolatePoint(the_geom, d)) FROM dist;
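To see which rows the recursion produces once the id is carried along, here is a rough Python equivalent. It assumes each channel is described only by its id and its dist_calc_points value (the geometry is left out), and the function name is invented for this sketch.

```python
def interpolation_fractions(channels):
    """channels: iterable of (id, dist_calc_points). Yields (id, x, d) rows."""
    for channel_id, n in channels:
        x, d = 0, 0.0  # anchor member: one starting row per channel
        yield channel_id, x, d
        while x < n:   # recursive member: dist.x < dist_calc_points
            x, d = x + 1, d + 1 / n
            yield channel_id, x, d


# Hypothetical channels: id 2 with 4 calculation points, id 3 with 2
for row in interpolation_fractions([(2, 4), (3, 2)]):
    print(row)
```

Each channel gets its own run from d = 0 up to d = 1 in steps of 1/dist_calc_points, and the id keeps the runs apart, which is what the join condition `dist.id = v2_channel.id` does in the query.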
There are four tables:
T_SALES has columns like
CUST_KEY,
ITEM_KEY,
SALE_DATE,
SALES_DLR_SALES_QTY,
ORDER_QTY.
T_CUST has columns like
CUST_KEY,
CUST_NUM,
PEER_GRP_ID
T_PEER_GRP has columns like
PEER_GRP_ID,
PEER_GRP_DESC,
PRNT_PEER_GRP_ID
T_PRNT_PEEER has columns like
PRNT_PEER_GRP_ID,
PRNT_PEER_DESC
For the above tables, I need to generate a percentile rank of the customer based on the computation fillrate = SALES_QTY / ORDER_QTY * 100, by peer group within a parent peer.
Could someone please help with this?
You can use the analytic function PERCENT_RANK() to calculate the percentile rank, as below:
SELECT
t_s.cust_key,
t_c.cust_num,
PERCENT_RANK() OVER (ORDER BY (t_s.SALES_DLR_SALES_QTY / ORDER_QTY) DESC) as pr
FROM t_sales t_s
INNER JOIN t_cust t_c ON t_s.cust_key = t_c.cust_key
ORDER BY pr;
Reference:
PERCENT_RANK on Oracle® Database SQL Reference
If by "percentile rank" you mean "percent rank" (documented here), then the harder part is the joins. I think this is the basic data that you want for the percentile rank:
select t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID,
sum(SALES_DLR_SALES_QTY) / sum(ORDER_QTY) * 100 as fillrate
from t_sales s join
t_cust c
on s.CUST_KEY = c.cust_key join
t_peer_grp t
on t.PEER_GRP_ID = c.PEER_GRP_ID
group by t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID;
You can then calculate the percent rank (a value from 0 to 1; multiply by 100 for a percentage) as:
select t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID,
sum(SALES_DLR_SALES_QTY) / sum(ORDER_QTY) * 100 as fillrate,
percent_rank() over (partition by t.PRNT_PEER_GRP_ID
order by sum(SALES_DLR_SALES_QTY) / sum(ORDER_QTY)
) as pr
from t_sales s join
t_cust c
on s.CUST_KEY = c.cust_key join
t_peer_grp t
on t.PEER_GRP_ID = c.PEER_GRP_ID
group by t.PEER_GRP_ID, t.PRNT_PEER_GRP_ID;
Note that this mixes analytic functions with aggregation functions. This can look awkward when you first learn about it.
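For readers unsure what PERCENT_RANK returns, it is (rank - 1) / (rows in partition - 1), with 0 for a single-row partition. The Python sketch below applies that definition per parent group; the sample fill rates, group names and helper names are invented for illustration.

```python
from collections import defaultdict


def percent_rank(values):
    """(rank - 1) / (n - 1) for each value in ascending order; ties share a rank."""
    n = len(values)
    if n <= 1:
        return [0.0] * n
    ordered = sorted(values)
    # ordered.index(v) counts the strictly smaller values, which is rank - 1
    return [ordered.index(v) / (n - 1) for v in values]


def percent_rank_by_partition(rows):
    """rows: (parent_group, peer_group, fillrate). Returns {peer_group: percent rank}."""
    partitions = defaultdict(list)
    for parent, peer, rate in rows:
        partitions[parent].append((peer, rate))
    result = {}
    for members in partitions.values():
        ranks = percent_rank([rate for _, rate in members])
        for (peer, _), pr in zip(members, ranks):
            result[peer] = pr
    return result


# Hypothetical fill rates for four peer groups under two parents
rows = [("P1", "A", 80.0), ("P1", "B", 95.0), ("P1", "C", 60.0), ("P2", "D", 70.0)]
print(percent_rank_by_partition(rows))  # {'A': 0.5, 'B': 1.0, 'C': 0.0, 'D': 0.0}
```

The partition step corresponds to `partition by t.PRNT_PEER_GRP_ID`, and the ranking inside each partition corresponds to the `order by` in the window clause.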
My problem requires me to query data from the table and include a column that calculates the % increase as well. I need to pull only the records with the highest % increase using MAX. I think I'm on the right track, but for some reason it's returning all records despite the HAVING clause calling for just the max.
Select
O.Grocery_Item,
TO_CHAR(sum(g.Price_IN_2000), '$99,990.00') TOTAL_IN_2000,
TO_CHAR(sum(g.Estimated_Price_In_2025), '$99,990.00') TOTAL_IN_2025,
TO_CHAR(Round(O.MY_OUTPUT),'9,990') || '%' as My_Output
From
GROCERY_PRICES g,
(SELECT
GROCERY_ITEM,
(((sum(Estimated_Price_In_2025) -
sum(Price_IN_2000))/sum(Price_IN_2000))*100) MY_OUTPUT
FROM
GROCERY_PRICES
GROUP BY GROCERY_ITEM) O
Where
G.GROCERY_ITEM = O.GROCERY_ITEM
GROUP BY
O.GROCERY_ITEM, O.MY_OUTPUT
Having
my_output IN (select Max(O.MY_OUTPUT) from GROCERY_PRICES);
Results:
GROCERY_ITEM TOTAL_IN_2000 TOTAL_IN_2025 MY_OUTPUT
------------------------------ ------------- ------------- ---------
M_004 $2.70 $5.65 109%
B_001 $0.80 $2.64 230%
T_006 $5.70 $6.65 17%
B_002 $2.72 $7.36 171%
E_001 $0.62 $1.78 187%
R_003 $4.00 $13.20 230%
6 rows selected
You can simplify your query so that it selects from the GROCERY_PRICES table only once: since your My_Output column is only a function of numbers you are already producing, the self join is not necessary. I've then used RANK to get the top records (although if you are not concerned about ties, ROWNUM will work better):
SELECT g.Grocery_Item,
g.TOTAL_IN_2000,
g.TOTAL_IN_2025,
g.My_Output
FROM ( SELECT Grocery_Item,
TO_CHAR(TOTAL_IN_2000, '$99,990.00') TOTAL_IN_2000,
TO_CHAR(TOTAL_IN_2025, '$99,990.00') TOTAL_IN_2025,
TO_CHAR(ROUND(((TOTAL_IN_2025 / TOTAL_IN_2000) - 1) * 100), '9,990') || '%' as My_Output,
RANK() OVER(ORDER BY (TOTAL_IN_2025 / TOTAL_IN_2000) - 1 DESC) AS GroceryRank
FROM ( SELECT g.Grocery_Item,
SUM(g.Price_IN_2000) TOTAL_IN_2000,
SUM(g.Estimated_Price_In_2025) TOTAL_IN_2025
FROM GROCERY_PRICES g
GROUP BY g.Grocery_Item
) g
) g
WHERE GroceryRank = 1;
I've also simplified your percentage calculation.
Try this instead:
select *
from (Select O.Grocery_Item, TO_CHAR(sum(g.Price_IN_2000), '$99,990.00') TOTAL_IN_2000,
TO_CHAR(sum(g.Estimated_Price_In_2025), '$99,990.00') TOTAL_IN_2025,
TO_CHAR(Round(O.MY_OUTPUT),'9,990') || '%' as My_Output
From GROCERY_PRICES g join
(SELECT GROCERY_ITEM,
(((sum(Estimated_Price_In_2025) -
sum(Price_IN_2000))/sum(Price_IN_2000))*100
) MY_OUTPUT
FROM GROCERY_PRICES
GROUP BY GROCERY_ITEM
) O
on G.GROCERY_ITEM = O.GROCERY_ITEM
GROUP BY O.GROCERY_ITEM, O.MY_OUTPUT
ORDER BY O.MY_OUTPUT desc
) t
where rownum = 1
The problem is that your subquery only has outer references. So, the o.my_output is coming from the outer query, not from the FROM clause in the subquery. You are comparing a value to itself, which for non-NULL values is always true.
Since you want the maximum value, the easiest way is to order the list and take the first row. You can also do this with analytic functions, but rownum is usually more efficient.
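The difference between keeping ties and taking a single first row shows up in the result set from the question itself. The Python sketch below recomputes the percentage increase from the displayed totals, using exact fractions so that equal values compare as equal; the variable names are my own.

```python
from fractions import Fraction

# Totals copied from the result set in the question: (total in 2000, total in 2025)
totals = {
    "M_004": ("2.70", "5.65"),
    "B_001": ("0.80", "2.64"),
    "T_006": ("5.70", "6.65"),
    "B_002": ("2.72", "7.36"),
    "E_001": ("0.62", "1.78"),
    "R_003": ("4.00", "13.20"),
}

# Exact fractions avoid floating-point noise when comparing for ties.
increase = {
    item: (Fraction(new) - Fraction(old)) / Fraction(old) * 100
    for item, (old, new) in totals.items()
}

best = max(increase.values())
# Comparable to filtering on rank 1: every item tied for the maximum is kept
top_with_ties = sorted(item for item, pct in increase.items() if pct == best)
# Comparable to taking the first row after ordering: exactly one item
top_single = max(increase, key=increase.get)

print(top_with_ties)  # ['B_001', 'R_003']
print(top_single)
```

B_001 and R_003 both come out at exactly 230%, so a rank-based filter returns both rows, while taking only the first row returns just one of them.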