MDX geographic distance calculation - sql-server-2005

I'm using SQL Server 2005 Analysis Services and I'm trying to calculate distance inside of an MDX query - so that I can get counts of the items that are near my current location. I've created a dimension with Latitude & Longitude, and have also created a .NET assembly to do the math - but am having a hard time getting it all to work out in the query.
My query to find items in a 100 mile radius looks something like this:
select FILTER([DimProducts].[Product].[Product],
ZipCalculatorLib.GetDistance(43.474208, [Zip].[Latitude], 96.687689, [Zip].[Longitude]) < 100) on rows,
[Measures].[RowCount] on columns
from MyCube;
And my distance code in .NET looks like this:
public static double GetDistance(double startLat, double endLat,
double startLong, double endLong)
{
return Math.Sqrt(Math.Pow(69.1 * (startLat - endLat), 2) + Math.Pow(Math.Cos(endLat / 57.3) * 69.1 * (startLong - endLong), 2));
}
However, when I run that query, I come up with zero records. If I change the distance from 100 to 10000 - I get counts similar to what should be in the 100 mile radius. It looks like the .NET class isn't doing the square root - but I've tested that code many times over, and it looks right.
Does anyone have any suggestions as to where I should look to fix my problem?
EDIT:
I started to wonder if maybe the latitude and longitude weren't being passed into my GetDistance function correctly - so I added a line of code there to throw an exception to show me what they were. I added the following:
throw new ArgumentException("endLat", string.Format("endLat: {0}, endLong: {1}", endLat, endLong));
And now when I run my query, I get the following error:
Execution of the managed stored procedure GetDistance failed with the following error: Exception has been thrown by the target of an invocation.endLat Parameter name: endLat: 1033, endLong: 1033.
So now the question becomes: how do I get my actual latitude and longitude values passed into that function in the filter? It looks like only a language code is being passed in now.

[Zip].[Latitude] is a member expression. In order to pass this to a function as a numeric value SSAS will return the equivalent of ([Zip].[Latitude], Measures.CurrentMember). And you can't directly filter one dimension based on another in MDX. By definition, different dimensions are completely independent and in OLAP terms every product could potentially exist at every Zip location.
I suspect the logic would have to look like the following:
select
NONEMPTY([DimProducts].[Product].[Product] *
FILTER([Zip].[Zip].[Zip] , ZipCalculatorLib.GetDistance(43.474208, [Zip].[Latitude].CurrentMember.MemberValue, 96.687689, [Zip].[Longitude].CurrentMember.MemberValue) < 100),[Measures].[RowCount]) on rows, [Measures].[RowCount] on columns
from MyCube;
This gets all of the Zip members that have a latitude/longitude within 100 miles, cross joins that with products and then returns those that have a nonempty RowCount.
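The set logic described above (filter the Zip members, cross join with products, keep the non-empty combinations) can be sketched outside MDX. This is only an illustration in Python: the zip codes, product names, distances, and row counts are hypothetical, and the distances stand in for whatever ZipCalculatorLib.GetDistance would return per Zip member.

```python
from itertools import product

# Hypothetical distances (miles) from the search point to each zip,
# standing in for the GetDistance result per Zip member.
zip_distance_miles = {"57101": 3.0, "57005": 12.5, "98101": 1200.0}

# Hypothetical product members.
products = ["Widget", "Gadget"]

# Hypothetical fact data: (product, zip) -> RowCount.
row_count = {
    ("Widget", "57101"): 4,
    ("Gadget", "57005"): 2,
    ("Widget", "98101"): 9,
}


def nearby_nonempty(radius_miles):
    # FILTER step: keep only the zips inside the radius.
    near_zips = [z for z, d in zip_distance_miles.items() if d < radius_miles]
    # Cross join products with the filtered zips, then the NONEMPTY step:
    # keep only the pairs that actually have a RowCount.
    return {
        (p, z): row_count[(p, z)]
        for p, z in product(products, near_zips)
        if (p, z) in row_count
    }


result = nearby_nonempty(100)
```

The point of the sketch is that the distance test is applied to the Zip set itself, not to the product set; products only drop out because they have no rows at the surviving zips.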

How are you sure that you are calculating this in miles?
I'm not sure if SQL Server 2008 is available; if it is, you should use its geography data type to calculate distances.
If not, check libraries like SharpMap and Proj.Net - They will let you build a true geographic point and calculate accurate distances between those objects.
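To get a feel for how far the question's formula is from a spherical calculation, here is a rough Python comparison. It is a sketch under assumptions: flat_distance_miles is a direct port of the question's GetDistance, haversine_miles is the standard haversine formula (swapped in here as a reference, not something the original posts use), and the Earth radius constant is an assumed mean value.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def flat_distance_miles(start_lat, end_lat, start_long, end_long):
    """Port of the question's GetDistance: 69.1 miles per degree,
    57.3 degrees per radian, flat-plane Pythagoras."""
    dy = 69.1 * (start_lat - end_lat)
    dx = math.cos(end_lat / 57.3) * 69.1 * (start_long - end_long)
    return math.sqrt(dx * dx + dy * dy)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance on a sphere (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Two arbitrary sample points one degree of latitude apart.
flat = flat_distance_miles(43.0, 44.0, -96.0, -96.0)
sphere = haversine_miles(43.0, -96.0, 44.0, -96.0)
```

Note the different argument order of the two functions: the port keeps the question's (startLat, endLat, startLong, endLong) signature. For short distances the two results agree closely; the flat approximation drifts as the points move further apart.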

I ended up solving my problem using a subcube. By creating the subcube, I was able to filter for the distance, and after that I just did a select for the dimension that I was looking for. Here's what I ended up with:
create subcube MyCube as
select Filter
(
(
[DimLocation].[Latitude].[Latitude],
[DimLocation].[Longitude].[Longitude]
),
(
ZipCalculatorLib.GetDistance(
43.474208,
[DimLocation].[Latitude].CurrentMember.MemberValue,
96.687689,
DimLocation.Longitude.CurrentMember.MemberValue) < 100
)
) on 0 from MyCube;
select DimProducts.Product.Product on rows,
Measures.RowCount on columns
from MyCube;

Related

How can I return all the rows in a PostgreSQL/PostGIS table within a radius of Xkm provided by a longitude and latitude value?

I'm trying to have a go at learning about PostgreSQL and, in particular, its PostGIS extension and the benefits it provides for geographic spatial features. I've loaded a PostgreSQL DB with a table that contains 30,000 records of latitude, longitude and a price value (for houses), and I want to start querying the DB to return all the rows that fall within a radius of X km of a particular latitude and longitude.
I've hit a brick wall as to how I might run this type of query as I've found the documentation to be quite limited online and I've found no similar attempts at this method of querying online.
Some methods I've tried:
SELECT *
FROM house_prices
WHERE ST_DWithin( ST_MakePoint(53.3348279,-6.269547099999954)) <= radius_mi *
1609.34;
This prompts the following error:
ERROR: function st_dwithin(geometry) does not exist
Another attempt:
SELECT * FROM house_prices ST_DWithin( 53.3348279, -6.269547099999954, 5); <-- A latitude value, longitude value and 5 miles radius
This prompts the following error:
ERROR: syntax error at or near "53.3348279"
Could anyone point me in the right direction/ know of some documentation I could look at?
** Edit **
Structure and set up of database and table in pgAdmin4
The first query has an invalid number of parameters. The function ST_DWithin expects at least two geometries and a distance expressed in the units of their spatial reference system (SRID),
and optionally a Boolean parameter indicating the usage of a spheroid (see documentation).
The second query is missing a WHERE clause and has the same problem as the first query.
Example from documentation:
SELECT s.gid, s.school_name
FROM schools s
LEFT JOIN hospitals h ON ST_DWithin(s.the_geom, h.the_geom, 3000)
WHERE h.gid IS NULL;
Perhaps something like this would be what you want to achieve:
SELECT *
FROM house_prices h
WHERE ST_DWithin(ST_MakePoint(-6.2695, 53.3348), h.geom, h.radius_mi * 1609.34)
Also pay attention to the order of the coordinate pair: ST_MakePoint expects x (longitude) first, then y (latitude); otherwise you might easily land in the sea with these coordinates ;-)
EDIT: Taking into account that there is no geometry column on the table and the points are stored in two separate columns, longitude and latitude, and casting to geography so that the distance is measured in meters:
SELECT *
FROM house_prices
WHERE ST_DWithin(ST_SetSRID(ST_MakePoint(longitude, latitude), 4326)::geography, ST_SetSRID(ST_MakePoint(-6.2695, 53.3348), 4326)::geography, 1609.34)

MDX Query SUM PROD to do Weighted Average

I'm building a cube in MS BIDS. I need to create a calculated measure that returns the weighted-average of the rank value weighted by the number of searches. I want this value to be calculated at any level, no matter what dimensions have been applied to break-down the data.
I am trying to do something like the following:
I have one measure called [Rank Search Product] which I want to apply at the lowest level possible and then sum all values of it
IIf([Measures].[Searches] IS NOT NULL, [Measures].[Rank] * [Measures].[Searches], NULL)
And then my weighted average measure uses this:
IIf([Measures].[Rank Search Product] IS NOT NULL AND SUM([Measures].[Searches]) <> 0,
SUM([Measures].[Rank Search Product]) / SUM([Measures].[Searches]),
NULL)
I'm totally new to writing MDX queries and so this is all very confusing to me. The calculation should be
([Rank][0]*[Searches][0] + [Rank][1]*[Searches][1] + [Rank][2]*[Searches][2] ...)
/ SUM([searches])
I've also tried to follow what is explained in this link http://sqlblog.com/blogs/mosha/archive/2005/02/13/performance-of-aggregating-data-from-lower-levels-in-mdx.aspx
Currently, loading my data into a pivot table in Excel returns #VALUE! for all calculations of my custom measures.
Please help!
First of all, you would need an intermediate measure, let's say Rank times Searches, in the cube. The most efficient way to implement this would be to calculate it when processing the measure group. You would extend your fact table by a column, e.g. in a view, or add a named calculation in the data source view. The SQL expression for this column would be something like Searches * Rank. In the cube definition, you would set the aggregation function of this measure to Sum and make it invisible. Then just define your weighted average as
[Measures].[Rank times Searches] / [Measures].[Searches]
or, to avoid irritating results for zero/null values of searches:
IIf([Measures].[Searches] <> 0, [Measures].[Rank times Searches] / [Measures].[Searches], NULL)
Since Analysis Services 2012 SP1, you can abbreviate the latter to
Divide([Measures].[Rank times Searches], [Measures].[Searches], NULL)
Then the MDX engine will apply everything automatically across all dimensions for you.
In the second expression, the <> 0 test includes a <> null test, as in numerical contexts, NULL is evaluated as zero by MDX - in contrast to SQL.
Finally, as I interpret the link you have in your question, you could leave your measure Rank times Searches on SQL/Data Source View level to be anything, maybe just 0 or null, and would then add the following to your calculation script:
({[Measures].[Rank times Searches]}, Leaves()) = [Measures].[Rank] * [Measures].[Searches];
From my point of view, this solution is not as clear as to directly calculate the value as described above. I would also think it could be slower, at least if you use aggregations for some partitions in your cube.
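A small numeric sketch may help show why the product has to be taken at the leaf level before summing. The rows below are made-up (rank, searches) pairs, and the function names are purely illustrative; the logic mirrors the Sum-aggregated Rank times Searches measure divided by Searches.

```python
# Hypothetical leaf-level fact rows: (rank, searches).
rows = [(1, 100), (5, 10), (10, 1)]


def weighted_average_rank(rows):
    total_searches = sum(s for _, s in rows)
    if total_searches == 0:
        return None  # mirrors returning NULL when Searches is zero
    # Leaf-level product, then Sum: the "Rank times Searches" measure.
    rank_times_searches = sum(r * s for r, s in rows)
    return rank_times_searches / total_searches


def product_of_totals(rows):
    # What you would get if Rank and Searches were each summed first
    # and only multiplied afterwards, at the aggregated level.
    return sum(r for r, _ in rows) * sum(s for _, s in rows)


weighted = weighted_average_rank(rows)   # 160 / 111
wrong_numerator = product_of_totals(rows)  # 16 * 111 = 1776, not 160
```

Multiplying already-aggregated values gives 1776 as the numerator instead of 160, which is why the multiplication needs to happen per fact row (or via the Leaves() assignment) rather than in a plain calculated measure.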

SQL - View column that calculates the percentage from other columns

I have a query from Access where I calculated the percentage score from three separate numbers. Example:
AFPercentageMajor: [AFNumberOfMajors]/([AFTotalMajor]-[AFMajorNA])
which could have values of 20/(23-2) = 95%
I have imported this table into my SQL database and tried to write an expression in the view (I changed the names of the columns a bit):
AF_Major / (AF_Major_Totals - AF_Major_NA)
I tried adding *100 to the end of the statement, but it only works if the calculation comes out at 100%. If it is anything less than that, it shows a 0.
I have a feeling it just doesn't like the combination of the three separate column names. But like I said, I'm still learning, so I could be going at this completely wrong!
SQL Server does integer division. You need to change one of the values to a floating point representation. The following will work:
cast([AFNumberOfMajors] as float)/([AFTotalMajor]-[AFMajorNA])
You can multiply this by 100 to get the percentage value.
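The effect is easy to reproduce outside SQL Server. The following Python sketch uses the numbers from the question; the variable names are illustrative, and Python's // operator is used as a stand-in for SQL Server's integer division (the two agree for positive operands like these).

```python
# Values from the question: 20 / (23 - 2).
af_major, af_major_totals, af_major_na = 20, 23, 2

# Integer division discards the fraction before the * 100 is applied.
integer_result = af_major // (af_major_totals - af_major_na) * 100

# Converting one operand to floating point first keeps the fraction.
float_result = float(af_major) / (af_major_totals - af_major_na) * 100

# Only an exact 100% survives integer division.
full_score = 21 // (af_major_totals - af_major_na) * 100
```

Here integer_result is 0 while float_result is roughly 95.2, which matches the behavior described in the question: anything below 100% collapses to 0.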

Microsoft Access - SQL-generated field evaluated as Text instead of Single data type

I have a SQL statement (saved as "LocationSearch" in Access) that calculates distance between two points and returns the "Distance" as a generated field.
SELECT Int((3963*(Atn(-(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))/Sqr(-(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))*(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))+1))+2*Atn(1)))*10)/10 AS Distance, *
FROM Locations
ORDER BY (3963*(Atn(-(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))/Sqr(-(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))*(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))+1))+2*Atn(1)));
All the nasty math code you see is what calculates the distance (in miles) in the SQL statement using Latitude and Longitude coordinates.
However, the problem is that the Distance field that is generated by the SQL statement seems to be returned as a string. If I then add SQL code that asks for locations between a distance of 0 and 45 miles, it returns ANY Distance value that starts between "0" and "45". This includes a location with a distance of "1017" miles. Apparently, the Distance field is a text field, not a number field. So I can't use the "BETWEEN" statement. I also can't evaluate using "<" and ">" because it has the same problem.
I saved the SQL query above as a saved query called "LocationSearch". This way I can run secondary queries against it, like this:
SELECT * FROM LocationSearch WHERE Distance < #MaxDistance
Access will ask for the #lat, #lng and #MaxDistance parameters, then the locations will be returned in a recordset, ordered by distance. However, the problem occurs when I enter a MaxDistance of 45. With a table containing locations on the West Coast of the US, and a #lat of 47 and a #lng of -122 (near Seattle), Access returns the following:
Notice also that the "Distance" field is right-formatted so it appears to be a numeric field, yet for some reason the query returns a location in San Diego, which is 1,017 miles away. My guess is that it was evaluating the Distance field as a text field, and in an ASCII comparison, I believe that "1017" lies between "0" and "45".
One other thing: I'm using ASP 3.0 (classic) to access this query using JET OLEDB 4.0.
Anyone know how to define the Distance field as a number?
Thanks!
--- EDIT ---
Using HansUp's idea from his answer below, I tried this query to force Access to consider the Distance field as a Single precision number:
SELECT * FROM LocationSearch WHERE CSng(Distance) < #MaxDistance
Even this returned the exact same results as before which included the location in San Diego, 1017 miles away.
If you can't find a way to return numerical values instead of text from that Distance field expression, use your query as a subquery, then cast Distance in the containing query.
SELECT CSng(sub.Distance) AS Distance_as_single
FROM
(
-- your existing query --
) AS sub
WHERE CSng(sub.Distance) BETWEEN 0 AND 45
ORDER BY 1;
That approach also makes for a nicer ORDER BY ... if that counts for anything. :-)
I tried your query without the select *, and without the FROM and ORDER BY clauses.
I added an extra column to the SELECT to prove that strings are displayed left-justified in Access's grid.
SELECT Int((3963*(Atn(-(
Sin(LATITUDE/57.2958)*Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*
Cos([#lat]/57.2958)*Cos([#lng]/57.2958-LONGITUDE/57.2958))/Sqr(-(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))*(Sin(LATITUDE/57.2958)*
Sin([#lat]/57.2958)+Cos(LATITUDE/57.2958)*Cos([#lat]/57.2958)*
Cos([#lng]/57.2958-LONGITUDE/57.2958))+1))+2*Atn(1)))*10)/10 AS Distance,
'test' as test
I was prompted for four parameters, but in the end, I got back a two-column table:
Since the first column is right-justified, and the second (clearly a string) is left-justified, it appears that Access is indeed returning it as a numeric value for me. This was in Access 2010.
--EDIT--
I just created a new two-column table called Locations. It has a field id (autonumber) and a field Field1 (text). I ran the original query provided by OP and it works fine (distance is returned as a number).
This leads me to wonder... Does the OP's Locations table have its own Distance field that is a string? Otherwise, the problem has got to be in the code calling the SQL statement, not in the statement or the Jet engine itself.
Okay, solved it!
HansUp, your idea turned out to be the solution. I tried adding the CSng() function on the #MaxDistance parameter in the SQL query and that was what fixed it.
Here's the modified secondary SQL query:
SELECT * FROM LocationSearch WHERE CSng(Distance) < CSng(#MaxDistance)
Thanks for your help, everybody! You all rock.
Happy New Year.
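The symptom in this thread is what a text comparison would produce when one side of the test is treated as a string. As an illustration only, here is the same comparison in Python; the helper names are hypothetical, and Python's string ordering is used as a stand-in for however Jet compares text (the two agree for plain digit strings like these).

```python
def less_than_as_text(value, limit):
    # Character-by-character comparison: "1" sorts before "4",
    # so "1017" counts as less than "45".
    return str(value) < str(limit)


def less_than_as_number(value, limit):
    # Numeric comparison, the equivalent of wrapping both sides in CSng().
    return float(value) < float(limit)


text_result = less_than_as_text(1017, 45)
numeric_result = less_than_as_number(1017, 45)
```

With the text comparison the San Diego row at 1017 miles passes a limit of 45, while the numeric comparison rejects it, which is consistent with the fix of applying CSng() to the parameter as well as the field.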

How to average values based on location proximity

I have an SQL table with geo-tagged values (Longitude, Latitude, value). The table grows quickly and has thousands of entries. Therefore, querying the table for values in some area returns a very large data set.
I would like to know how to average values in close proximity into a single value. Here is an illustration:
Table:
Long lat value
10.123001 53.567001 10
10.123002 53.567002 12
10.123003 53.567003 18
10.124003 53.568003 13
Let's say my current location is 10.123004, 53.567004. If I query for the values nearby, I will get the four rows with values 10, 12, 18, and 13. This works if the data set is relatively small. If the data is large, I would like to query SQL for a rounded location (10.123, 53.567) and need SQL to return something like
Long lat value
10.123 53.567 13.33 (this is the average of 10, 12, and 18)
10.124 53.568 13
Is this possible? How can we average a large data set based on location?
Is an SQL database the right choice in the first place?
GROUP BY rounded columns, and the AVG aggregate function should work fine for this:
SELECT ROUND(Long, 3) Long,
ROUND(Lat, 3) Lat,
AVG(value)
FROM Table
GROUP BY ROUND(Long, 3), ROUND(Lat, 3)
Add a WHERE clause to filter as needed.
Here's some rough pseudocode that might be a start. You need to provide the proper precision arguments for the round function in the dialect of SQL you are using for your project, so understand that the 3 I provide as the second argument to Round is the number of decimals of precision to which the number is rounded, as indicated by your original post.
Select round(lat,3),round(long,3),avg(value)
From YourTable
Group by round(lat,3),round(long,3)
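For readers who want to check the grouping logic without a database, here is a minimal Python sketch of the same idea, using the sample rows from the question. The function name is illustrative, and three decimals is simply the precision the question asks for.

```python
from collections import defaultdict

# Sample rows from the question: (longitude, latitude, value).
rows = [
    (10.123001, 53.567001, 10),
    (10.123002, 53.567002, 12),
    (10.123003, 53.567003, 18),
    (10.124003, 53.568003, 13),
]


def average_by_rounded_location(rows, decimals=3):
    # Bucket values by their rounded coordinates (the GROUP BY step).
    buckets = defaultdict(list)
    for lon, lat, value in rows:
        buckets[(round(lon, decimals), round(lat, decimals))].append(value)
    # Average each bucket (the AVG step).
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


averages = average_by_rounded_location(rows)
```

The first three rows land in the (10.123, 53.567) bucket and are averaged together; the fourth row forms its own bucket at (10.124, 53.568).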
The problem with the rounding approach is the boundary conditions: what happens when points are close to the boundary?
However, for the neighborhood of a given point it is better to use something like:
select *
from table
where long between #MyLong - #DeltaLong and #MyLong + #DeltaLong and
lat between #MyLat - #DeltaLat and #MyLat + #DeltaLat
For this, you need to define #DeltaLong and #DeltaLat.
Rounding works fine for summarization, if that is your problem.
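One way to pick the two deltas, sketched here in Python, is to derive them from a search radius. This is an approximation under stated assumptions: roughly 111.32 km per degree of latitude, a degree of longitude shrinking with the cosine of the latitude, and a hypothetical function name. It breaks down near the poles and does not handle the wrap-around at longitude ±180.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # assumed approximate length of one degree of latitude


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) around a point."""
    delta_lat = radius_km / KM_PER_DEGREE_LAT
    # A degree of longitude covers less ground away from the equator.
    delta_lon = radius_km / (KM_PER_DEGREE_LAT * math.cos(math.radians(lat)))
    return (lat - delta_lat, lat + delta_lat, lon - delta_lon, lon + delta_lon)


# Example: a 1 km box around the location used in the question.
min_lat, max_lat, min_lon, max_lon = bounding_box(53.567004, 10.123004, 1.0)
```

The four returned values map directly onto the two BETWEEN ranges in the query above. The box is slightly larger than the circle it encloses, so an exact distance check can be applied afterwards to the rows it returns if that matters.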