Redis Search unusual behaviour when using FT.SEARCH

We are using the Redis Search utility to cache our data in JSON format. We have stored approximately 4.3 million records, where the id is the key and properties such as color, latitude, longitude, and a Google-polyline geometry (string) are attributes.
We have also created an index on the numeric latitude and longitude fields using the FT.CREATE command.
We search the data based on a min-max latitude and longitude range.
It works perfectly and returns results very fast, but after 2-3 hours, once we update the color property, either the index gets deleted automatically or Redis Search returns a null result even though data matching the search criteria is present.
If we don't update anything, it works fine, but after updating a property it starts behaving weirdly.
It either returns fewer records than expected or returns null.
When we reload all the data and recreate the index, it starts behaving normally again.
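For reference, a minimal sketch of the kind of setup described, using the redis-py client (the key prefix, index name, and field paths are illustrative, not our actual schema):

import redis
from redis.commands.search.field import NumericField, TagField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

r = redis.Redis()

# Index JSON documents stored under keys prefixed "vehicle:".
r.ft("vehicle-idx").create_index(
    [
        TagField("$.color", as_name="color"),
        NumericField("$.latitude", as_name="latitude"),
        NumericField("$.longitude", as_name="longitude"),
    ],
    definition=IndexDefinition(prefix=["vehicle:"], index_type=IndexType.JSON),
)

r.json().set("vehicle:1", "$", {"color": "red", "latitude": 37.77, "longitude": -122.41})

# Min-max range search on the two numeric fields.
res = r.ft("vehicle-idx").search(Query("@latitude:[37.0 38.0] @longitude:[-123.0 -122.0]"))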

Can you provide more details about the Redis and Search deployment and version that you're using?
By the way, you can use the GEO field type when working with lat/long, which lets you query by range or by distance from a specific location.
From the FT.CREATE documentation:
GEO allows geographic range queries against the value in this
attribute. The value of the attribute must be a string containing a
longitude (first) and latitude separated by a comma.
You can then query it with FT.SEARCH, for example:
FT.SEARCH restaurants-idx "colour @location:[-122.41 37.77 5 km]"
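A rough redis-py sketch of the same idea, assuming hash documents with a "location" field stored as "longitude,latitude" (key and index names are illustrative):

import redis
from redis.commands.search.field import GeoField, TextField
from redis.commands.search.indexDefinition import IndexDefinition
from redis.commands.search.query import Query

r = redis.Redis()

r.ft("restaurants-idx").create_index(
    [TextField("name"), GeoField("location")],
    definition=IndexDefinition(prefix=["restaurant:"]),
)
r.hset("restaurant:1", mapping={"name": "colour cafe", "location": "-122.41,37.77"})

# All restaurants within 5 km of the given longitude/latitude.
res = r.ft("restaurants-idx").search(Query("@location:[-122.41 37.77 5 km]"))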

Related

Can I save multiple GeoEntry with the same member in Redis or there are alternatives?

My aim is to create a cache of road speed limits (taken from OSM) in Redis, where searching a position by latitude and longitude returns the speed limit within a certain radius using GEORADIUS.
The problem is that using:
GEOADD speed-limits -45.000000 10.000000 "90"
if I add a new position that also has a limit of 90, the previous one is overwritten.
You can either
(1) use a compound string as the member,
so it is GEOADD speed-limits -45.000000 10.000000 90:timestamp:location, and the query would be something like GEORADIUS speed-limits ... WITHCOORD, after which you use .split(":")[0] to get the speed;
or
(2) store the speed separately:
GEOADD speed-limits -45.000000 10.000000 timestamp:location and SET timestamp:location 90, so it is a two-step query. A sketch of this option follows below.
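A short redis-py sketch of option (2), with illustrative member names (note that GEOADD takes longitude first):

import redis

r = redis.Redis(decode_responses=True)

member = "1606757564:road-42"                    # timestamp:location (illustrative)
r.geoadd("speed-limits", (-45.0, 10.0, member))  # longitude, latitude, member
r.set(member, 90)                                # the speed, stored separately

# Step 1: members within 5 km; step 2: fetch their speeds.
members = r.georadius("speed-limits", -45.0, 10.0, 5, unit="km")
speeds = r.mget(members) if members else []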
Yes, it would get overwritten, because a member named "90" already exists and GEOADD simply updates its position.
Generally, you need to choose your members carefully. Instead of storing just the speed limit, you can combine it, using delimiters, with extra information such as a timestamp, a random hash, or even some other useful data (say, the city) along with the limit.
For example, "90" could be transformed to 90#1606757564#abcde#city_name.
This way, when you query the radius, you get back the entire member string. Use a simple split on the delimiter (or a startsWith() check) to recover the original limit.
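Continuing the redis-py sketch above, decoding such a compound member after a radius query might look like:

import redis

r = redis.Redis(decode_responses=True)

for member in r.georadius("speed-limits", -45.0, 10.0, 5, unit="km"):
    limit = int(member.split("#")[0])  # "90#1606757564#abcde#city_name" -> 90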

How to perform geospatial operations on redis hashes

I am reading the docs for redis geospatial and I see that I can store only a key, latitude, longitude, and name.
I have some hashes stored, such as events:id, listings:id, etc. The events hash, for example, holds the JSON for an event object. I cache these items in Redis because they don't change much.
How can I find the events within some radius?
Would I have to do something like this?
GEOADD [event:id] {event.latitude} {event.longitude} {event.id}
and then map these against the events:id hash?
GEOADD key longitude latitude member adds an entry (a member with its longitude and latitude) to the sorted set key, which is the geospatial index. The sorted set is created if it doesn't exist.
To be able to query for events within some radius, you want to have all your events in the same geospatial sorted set.
This means you add all to the same key using:
GEOADD events:location {event.longitude} {event.latitude} {event.id}
You can add more than one event at a time:
GEOADD events:location 13.361389 38.115556 "event:1" 15.087269 37.502669 "event:2"
Note that longitude goes first.
Then you can get all events near a given location using GEORADIUS:
> GEORADIUS events:location 15 37 200 km WITHDIST
1) 1) "event:1"
2) "190.4424"
2) 1) "event:2"
2) "56.4413"
Other commands available are:
GEODIST - the distance between two members
GEOHASH - Geohash strings representing the positions of the specified members
GEOPOS - the positions (longitude, latitude) of the specified members
GEORADIUSBYMEMBER - like GEORADIUS, but it takes the name of a member already in the geospatial index instead of explicit coordinates
You can also use the sorted set commands on the geospatial index. For example, to get how many events you have on the geospatial index:
> ZCARD events:location
(integer) 2
You can use your whole JSON-encoded event as the member of the geospatial index, or just the event:id, which then doubles as the name of the key holding the event data; your call.
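To tie this back to the question's events:id hashes, here is a two-step lookup sketch with redis-py (assuming each event body is stored as a hash under the same name as the geo member):

import redis

r = redis.Redis(decode_responses=True)

# The geospatial index stores only positions and member names.
r.geoadd("events:location", (13.361389, 38.115556, "event:1"))
r.hset("event:1", mapping={"name": "Festival", "city": "Palermo"})

# Step 1: find nearby members; step 2: fetch the event data for each.
members = r.georadius("events:location", 15, 37, 200, unit="km")
events = [r.hgetall(m) for m in members]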

Find out the amount of space each field takes in Google Big Query

I want to optimize the space used by my BigQuery and Google Storage tables. Is there an easy way to find out the cumulative space that each field in a table takes? This is not straightforward in my case, since I have a complicated hierarchy with many repeated records.
You can do this in the Web UI by simply typing (not running) the query below, substituting the field you're interested in for <column_name>:
SELECT <column_name>
FROM YourTable
and looking at the validation message, which reports the respective size.
Important: you do not need to run the query; just check the validation message for bytesProcessed, and that is the size of the respective column.
Validation is free and invokes a so-called dry run.
If you need to do such column profiling for many tables, or for a table with many columns, you can code it in your preferred language: use the Tables.get API to get the table schema, loop through all the fields, build the respective SELECT statement for each, dry-run it, and read totalBytesProcessed, which, as noted above, is the size of the respective column.
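A minimal sketch of that loop with the google-cloud-bigquery Python client (project, dataset, and table names are illustrative; nested/repeated fields would need their paths expanded):

from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my-project.my_dataset.my_table")  # Tables.get

for field in table.schema:
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(
        f"SELECT {field.name} FROM `my-project.my_dataset.my_table`",
        job_config=config,
    )
    # A dry run is free; total_bytes_processed is the size of the column.
    print(field.name, job.total_bytes_processed)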
I don't think this is exposed in any of the metadata.
However, you may be able to get good approximations based on your needs. The number of rows is provided, so for fixed-width data types you can calculate the size directly:
https://cloud.google.com/bigquery/pricing
For types such as STRING, you could get the average length by querying, e.g., the first 1000 values, and use that in your storage calculations.
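A sketch of that approximation with the same Python client (column and table names are illustrative; per the pricing page, INT64/FLOAT64 take 8 bytes per value and STRING takes 2 bytes plus the UTF-8 encoded length):

from google.cloud import bigquery

client = bigquery.Client()
table = client.get_table("my-project.my_dataset.my_table")
rows = table.num_rows

# Fixed-width column, e.g. INT64 or FLOAT64: rows * 8 bytes per value.
int_col_bytes = rows * 8

# STRING column: sample an average length, then scale up.
sql = """
SELECT AVG(LENGTH(name)) AS avg_len
FROM (SELECT name FROM `my-project.my_dataset.my_table` LIMIT 1000)
"""
avg_len = list(client.query(sql).result())[0].avg_len
string_col_bytes = rows * (2 + avg_len)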

How to add a filter to match a value in a map bin in Aerospike

I have a requirement where I have to find a record in Aerospike based on an attributeId. The data in Aerospike is in the format below:
{
name=ABC,
id=xyz,
ts=1445879080423,
inference={2601=0.6}
}
Now I will be getting the value "2601" programmatically, and I need to find the record based on that value. The problem is that the value is a key in a map, and the map may hold more than one entry, like:
inference={{2601=0.6},{2830=0.9},{2931=0.8}}
So how can I find this record using the attributeId in Java? Any suggestions are much appreciated.
A little-known feature of Aerospike is that, in addition to an index on a bin's value, you can define an index on:
List values
Map keys
Map values
Using an index defined on the map keys of your "inference" bin, you will be able to query (filter) based on the key's name, as shown in the sketch below.
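A sketch using the Aerospike Python client for illustration (namespace, set, and index names are made up); in Java the equivalents are IndexCollectionType.MAPKEYS for the index and Filter.contains(...) for the query:

import aerospike
from aerospike import predicates

client = aerospike.client({"hosts": [("127.0.0.1", 3000)]}).connect()

# One-time: secondary index on the string keys of the "inference" map bin.
client.index_map_keys_create("test", "demo", "inference",
                             aerospike.INDEX_STRING, "inference_keys_idx")

# Find records whose inference map contains the key "2601".
query = client.query("test", "demo")
query.where(predicates.contains("inference", aerospike.INDEX_TYPE_MAPKEYS, "2601"))
for key, meta, bins in query.results():
    print(bins)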
I hope this helps

CoreData + Magical Record running select query

I have an application with an SQLite database that contains 7000+ records with city names, longitudes, and latitudes; these cities are also connected to the relevant city fields in the database.
What my app does is query the current location with Core Location, fetch the lon and lat values, and then find the closest location in the database.
The result doesn't have to be super accurate (I just want to match cities), so I want to use the hypotenuse formula to find the closest point:
closest city in db: min( sqrt((x1 - x2)^2 + (y1 - y2)^2) )
x1, y1: lon and lat of the user
x2, y2: lon and lat of the points in the database
If I were using MS SQL or the SQLite database directly, I could easily create a query, but when it comes to Core Data, I'm out of ideas.
I don't want to fetch all the data (and fill up memory) and then apply this formula to every record, so is there a way to create a query and get the result from the DB?
Am I overthinking this problem, and missing a simple solution?
If I'm understanding your problem correctly, you want to find the closest n cities to your current location.
I had something similar and here's how I approached it.
In essence, you probably need to take each city's lat/lon and hash it into some index. We use a Mercator Projection to convert the lat/lon to x/y, then hash that value in a manner similar to how Google/Bing/Apple Maps hash their map tiles. Fortunately, MapKit has a built-in Mercator Projection function.
In pseudocode:
for each city's lat/lon {
    CLLocationCoordinate2D coordinate = CLLocationCoordinate2DMake(lat, lon);
    MKMapPoint point = MKMapPointForCoordinate(coordinate);
    // 256 is the size of a map tile at zoomLevel 20. You can use whatever
    // zoomLevel you want, but we need something to quickly look up close-by
    // cities. This formula determines how granular your index is:
    // tileSize = 256 * pow(2, (20 - zoomLevel))
    NSInteger x = point.x / 256.0;
    NSInteger y = point.y / 256.0;
    // save x & y in a CityHashIndex table
}
Now you take the current location's lat/lon, hash it into the index as above, and simply write a query against this CityHashIndex table.
Say, for simplicity's sake, your current location is indexed at 1000, 1000. To find close-by cities, you might search for cities with indexes in the range 900-1100, 900-1100.
From there, you're pulling in a much smaller set of cities, and the memory requirements for processing your hypotenuse formula aren't so bad.
I can elaborate more if you're interested.
This is directly related to a commonly asked Core Data question:
Searching for surrounding suburbs based on latitude & longitude using Objective-C
Calculate a bounding box around the point you need (min lat/long, max lat/long), then use an NSPredicate against those values to find everything within the box. From there you can run a distance calculation on the results that come back and sort them.
I would suggest setting this up so that it can search at multiple distances; then you can check whether a city is within 10 miles, 100 miles, etc., slowly increasing the bounding box until you get one or more results back. A sketch of the bounding-box math follows below.
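The box itself is simple spherical math; here is a sketch (in Python just for illustration), whose min/max values you would then plug into an NSPredicate comparing the latitude and longitude attributes:

import math

def bounding_box(lat, lon, radius_km):
    # Roughly 111 km per degree of latitude; a degree of longitude
    # shrinks by cos(latitude).
    d_lat = radius_km / 111.0
    d_lon = radius_km / (111.0 * math.cos(math.radians(lat)))
    return (lat - d_lat, lat + d_lat), (lon - d_lon, lon + d_lon)

(min_lat, max_lat), (min_lon, max_lon) = bounding_box(37.77, -122.41, 10.0)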
I would use NSPredicate to define the search criteria; it acts as a filter. I'm not sure how optimized this is or whether it will pull in all your records, but I assume Core Data has some kind of indexing mechanism that will optimize the search.
You can take a look at this document:
https://developer.apple.com/library/mac/documentation/Cocoa/Conceptual/CoreData/Articles/cdFetching.html
Check the section named "Retrieving Specific Objects".