Creating a workable Redis store with several filters - redis

I am working on a system to display information about real estate. It runs in angular with the data stored as a json file on the server, which is updated once a day.
I have filters on number of bedrooms, bathrooms, price and a free text field for the address. It's all very snappy, but the problem is the load time of the app. This is why I am looking at Redis. Trouble is, I just can't get my head round how to get data with several different filters running.
Let's say I have some data like this: (missing off lots of fields for simplicity)
id beds price
0 3 270000
1 2 130000
2 4 420000
etc...
I am thinking I could set up three sets, one to hold the whole dataset, one to create an index on bedrooms and another for price:
beds id
2 1
3 0
4 2
and the same for price:
price id
130000 1
270000 0
420000 2
Then I was thinking I could use SINTER to return the overlapping sets.
Let's say I am looking for a house with more than 2 bedrooms that costs less than 300000.
From the bedrooms set I get IDs 0,2 for beds > 2.
From the prices set I get IDs 0,1 for price < 300000
So the common id is 0, which I would then lookup in the main dataset.
It all sounds good in theory, but being a Redis newbie, I have no clue how to go about achieving it!
Any advice would be gratefully received!

You're on the right track; sets + sorted sets is the right answer.
Two sources for all of the information that you could ever want:
Chapter 7 of my book, Redis in Action - http://bitly.com/redis-in-action
My Python/Redis object mapper - https://github.com/josiahcarlson/rom (it uses ideas directly from chapter 7 of my book to implement sql-like indices)
Both of those resources use Python as the programming language, though chapter 7 has been translated into Java: https://github.com/josiahcarlson/redis-in-action/ (go to the java path to see the code).
... That said, a normal relational database (especially one with built-in Geo handling like Postgres) should handle this data with ease. Have you considered a relational database?
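To make the "sets + sorted sets" idea concrete, here is a minimal pure-Python sketch of the index logic from the question. It does not talk to Redis: each sorted list of (score, id) pairs stands in for a sorted set, and the range lookup plays the role of ZRANGEBYSCORE with exclusive bounds (for example `ZRANGEBYSCORE beds (2 +inf`). The names `listings`, `build_index` and `ids_in_range` are made up for illustration.

```python
from bisect import bisect_left, bisect_right

# Main dataset, keyed by id (sample rows from the question).
listings = {
    0: {"beds": 3, "price": 270000},
    1: {"beds": 2, "price": 130000},
    2: {"beds": 4, "price": 420000},
}

def build_index(field):
    # Sorted list of (score, id) pairs; stands in for one Redis sorted set.
    return sorted((row[field], id_) for id_, row in listings.items())

def ids_in_range(index, low, high):
    # Ids whose score s satisfies low < s < high (both bounds exclusive).
    start = bisect_right(index, (low, float("inf")))
    stop = bisect_left(index, (high, float("-inf")))
    return {id_ for _, id_ in index[start:stop]}

beds_index = build_index("beds")
price_index = build_index("price")

# More than 2 bedrooms AND cheaper than 300000.
by_beds = ids_in_range(beds_index, 2, float("inf"))          # ids 0 and 2
by_price = ids_in_range(price_index, float("-inf"), 300000)  # ids 0 and 1
matches = by_beds & by_price                                  # set intersection

for id_ in matches:
    print(id_, listings[id_])
```

In Redis itself, the two range lookups would run against sorted sets and the final step would be an intersection of the resulting id sets, after which the matching ids are looked up in the main dataset.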

Related

Venn diagrams as database model - is there something similar like MPTT?

I am wondering how to save Venn diagrams in a database. Actually, I don't need to implement the complete Venn diagram logic; it's more like a tree in which one child can be the child of many parents at the same time.
My first guess was to use an n:m self-relation on the table, with a small helper table.
But I am used to building trees with nested sets (see MPTT - Modified Preorder Tree Traversal). This helps me a lot, because the query does not have to iterate through all the children.
Is there something similar to MPTT, or a way to extend MPTT to my problem, that would help me avoid iterating through all the children?
Thank you for any remarks or comments in advance.
There is a general ordering of the regions of a Venn diagram. You just need to express each region as a binary number. The number of bits is the number of sets, and each bit shows whether the region belongs to a given set. For instance, for four sets,
Id Binary Interpretation
======= ====== ==============
0 0000 Does not belong to any set
1 0001 Only belongs to set 4
2 0010 Only belongs to set 3
3 0011 Belongs to sets 3 and 4, and not to set 1 or set 2
4 0100 Only belongs to set 2
...
14 1110 Belongs to sets 1, 2 and 3, and not to set 4
15 1111 Belongs to all four sets
With this strategy, once you have defined the region you can immediately retrieve its Id, no tree is necessary.
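The numbering in the table can be sketched in a few lines of Python. This assumes, as in the table, that set 1 is the leftmost (most significant) bit; the function names `region_id` and `sets_of_region` are hypothetical.

```python
def region_id(member_of, n_sets=4):
    # member_of: the 1-based numbers of the sets the region belongs to.
    rid = 0
    for s in member_of:
        rid |= 1 << (n_sets - s)  # set 1 is the leftmost bit
    return rid

def sets_of_region(rid, n_sets=4):
    # Inverse mapping: which sets does region `rid` belong to?
    return [s for s in range(1, n_sets + 1) if rid & (1 << (n_sets - s))]

print(region_id({3, 4}))              # 3  (binary 0011)
print(region_id({1, 2, 3}))           # 14 (binary 1110)
print(sets_of_region(15))             # [1, 2, 3, 4]
print(format(region_id({2}), "04b"))  # 0100
```

Storing the region Id as an integer column also means that "all regions inside set 2" becomes a simple bitwise test on that column.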

Doing multiple queries in Postgresql - conditional loop

Let me first start by stating that in the last two weeks I have received ENORMOUS help from just about all of you (ok ok not all... but I think perhaps two dozen people commented, and almost all of these comments were helpful). This is really amazing and I think it shows that the stackoverflow team really did something GREAT altogether. So thanks to all!
Now as some of you know, I am working at a campus right now and I have to use a windows machine. (I am the only one who has to use windows here... :( )
Now I managed to set up (ok, the IT department did that for me) and populate a Postgres database (this I did on my own) with about 400 MB of data. That is perhaps not much for most of you heavy Postgres users, but I was more used to SQLite databases for personal use, which rarely exceeded 2 MB.
Anyway, sorry for being so chatty - now the queries from that database work nicely. I use ruby to do queries actually.
The entries in the Postgres database are interconnected, in as far as they are like "pointers" - each entry has one field that points to another entry.
Example:
entry 3667 points to entry 35785 which points to entry 15566. So it is quite simple.
The main entry is 1, so the end of all these queries is 1. So, from any other number, we can reach 1 in the end as the last result.
I am using ruby to make as many individual queries to the database until the last result returned is 1. This can take up to 10 individual queries. I do this by logging into psql with my password and data, and then performing the SQL query via -c. This probably is not ideal, it takes a little time to do these logins and queries, and ideally I would have to login only once, perform ALL queries in Postgres, then exit with a result (all these entries as result).
Now here comes my question:
- Is there a way to make conditional queries all inside of Postgres?
I know how to do it in a shell script and in ruby but I do not know if this is available in postgresql at all.
I would need to make the query, in literal english, like so:
"Please give me all the entries that point to the parent entry, until the last found entry is eventually 1, then return all of these entries."
I already "solved" it by using ruby to make several queries until 1 is eventually returned, but this strikes me as fairly inelegant and possibly not effective.
Any information is very much appreciated - thanks!
Edit (argh, I fail at pasting...):
Example dataset, the table would be like this:
 id | parent
----+--------
  1 |      1
  2 | 131567
  6 | 335928
  7 |      6
  9 |      1
 10 | 135621
 11 |      9
I hope that works, I tried to narrow it down solely on example.
For instance, id 11 points to id 9, and id 9 points to id 1.
It would be great if one could use SQL to return:
11 -> 9 -> 1
Unless you give some example table definitions, I can only guess, but what you're asking for vaguely reminds me of a tree structure, which could be traversed with recursive queries: http://www.postgresql.org/docs/8.4/static/queries-with.html .
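As a rough sketch of such a recursive query, the following Python script runs a WITH RECURSIVE statement against the sample rows from the question. To keep it self-contained it uses an in-memory SQLite database as a stand-in for Postgres; the table name `entries` and the function `path_to_root` are assumptions, and in Postgres the placeholder syntax would depend on the driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, parent INTEGER)")
conn.executemany(
    "INSERT INTO entries (id, parent) VALUES (?, ?)",
    [(1, 1), (2, 131567), (6, 335928), (7, 6), (9, 1), (10, 135621), (11, 9)],
)

CHAIN_SQL = """
WITH RECURSIVE chain(id, parent, depth) AS (
    -- anchor row: the entry we start from
    SELECT id, parent, 0 FROM entries WHERE id = ?
    UNION ALL
    -- step: follow the parent pointer; stop once an entry points to itself
    SELECT e.id, e.parent, c.depth + 1
    FROM entries e JOIN chain c ON e.id = c.parent
    WHERE c.id <> c.parent
)
SELECT id FROM chain ORDER BY depth
"""

def path_to_root(start_id):
    # One round trip to the database instead of one query per hop.
    return [row[0] for row in conn.execute(CHAIN_SQL, (start_id,))]

print(" -> ".join(str(i) for i in path_to_root(11)))  # 11 -> 9 -> 1
```

The `c.id <> c.parent` condition is what ends the recursion at entry 1, which points to itself in the sample data.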

Use Cases for Redis' "Score" and "Ranking" Features for Sets

What are some use cases for Redis' "score" and "ranking" features for sets (outside of the typical "leaderboard" examples for games)? I'm trying to figure out how to make use of these dynamic new features as I anticipate moving from using a traditional relational database to Redis as a persistent data store.
ZSETs are great for selections or ranges based on scores, but scores can be any numerical value, like a timestamp.
We store daily stock prices for all US stocks in redis. Here's an example for ebay...
ZADD key score member [score member ...]
...
ZADD stocks:ebay 1 30.39 2 32.70 3 31.25 4 31.75 5 29.12 6 29.87 7 29.93
The score values in this case would normally be long timestamps; the small integers here just keep the example short. If we want daily prices for a span of 3 days, we simply convert the two dates to timestamps and pull from redis using that range (here 1 to 3)...
zrangebyscore stocks:ebay 1 3
1) "30.39"
2) "32.70"
3) "31.25"
The query is very fast and works well for our needs.
Hope it helps!
A zset is the only Redis type that is kept sorted by score.
For example, you could put the comment IDs of a specific article in a zset.
Users vote each comment up or down, which changes its score.
Then, when you need to render the comments, you can fetch them in order, with the best comments first (like here).
Using ZREMRANGEBYSCORE, you could delete all the really bad comments each day.
But like every Redis type, zsets are quite basic; it is hard to name one dedicated use case, because there can be many. :-)
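The comment-ranking idea can be sketched without a Redis server. In this illustration a plain dict stands in for the zset, and each helper is only loosely analogous to the Redis command named in its comment; the function names and comment ids are invented.

```python
scores = {}  # comment id -> score; stands in for one zset per article

def vote(comment_id, delta):
    # Roughly what ZINCRBY does: adjust a member's score.
    scores[comment_id] = scores.get(comment_id, 0) + delta

def best_first():
    # Roughly ZREVRANGE key 0 -1: members from highest to lowest score.
    return sorted(scores, key=lambda c: scores[c], reverse=True)

def prune(max_score):
    # Roughly ZREMRANGEBYSCORE key -inf max_score: drop low-scoring members.
    for comment_id in [c for c, s in scores.items() if s <= max_score]:
        del scores[comment_id]

vote("c1", 1)
vote("c2", 1)
vote("c2", 1)
for _ in range(3):
    vote("c3", -1)

print(best_first())  # ['c2', 'c1', 'c3']
prune(-3)            # the daily clean-up of very bad comments
print(best_first())  # ['c2', 'c1']
```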

help with tree-like structure

I've got some financial data to store and manipulate. Let's say I have 2 divisions, with offices in 2 cities, in 2 currencies, and 4 bank accounts. (It's actually more complex than that.) I want to show a list like this:
Electronics
    Chicago
        Dollars
            Account 2 -> transactions in acct2 in $ in chicago/electronics
        Euros
            Account 1 -> transactions in acct1 in E in chicago/electronics
            Account 3 -> etc.
            Account 4
    Brussels
        Dollars
            Account 1
        Euros
            Account 3
            Account 4
Dessert Toppings
    Chicago
        Dollars
            Account 1
            Account 4
        Euros
            Account 2
            Account 4
    Brussels
        Dollars
            Account 2
        Euros
            Account 3
            Account 4
So at each level except the top, the category can appear in multiple places. I've been reading around about the various methods, but none of the examples seem to address my particular use case, where nodes can appear in more than one place in the hierarchy. (Maybe there's a different name for this than "tree" or "hierarchy".)
I guess my hierarchy is actually something like Division > City > Currency with 'Electronics' and 'Euros' merely instances of each level, but I'm not quite sure how that helps or hurts.
A few notes: this is for a demo site, so the dataset won't be large -- ease of set-up and maintenance is more important than query efficiency. (I'm actually considering just building a data object by hand, though I'd much rather do it the right way.) Also, FWIW, we're working in php with an ms access back-end, so any libraries out there that make this easy in that environment would be helpful. (I've found a couple of implementations of the nested set pattern already.)
Are you sure you want to use a hierarchical design for this? To me, the hierarchy seems more a consequence of the desired output format than something intrinsic to your data structure.
And what if you have to display the data in a different order, like City > Currency > Division? Wouldn't that be very cumbersome?
You could use a plain structure instead, with a table for Branches, one for Cities, one for Currencies, and then one Account table with Branch_ID, City_ID, and Currency_ID as foreign keys.
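To illustrate why the flat layout is flexible, here is a small Python sketch. It assumes the Account rows have already been joined with their lookup tables, so each row carries names rather than IDs (a simplification for brevity); `accounts` and `nest` are hypothetical names, and only a few rows from the question are included.

```python
# Flat account rows, as they might come back from a join over the four tables.
accounts = [
    {"division": "Electronics", "city": "Chicago", "currency": "Dollars", "account": "Account 2"},
    {"division": "Electronics", "city": "Chicago", "currency": "Euros", "account": "Account 1"},
    {"division": "Electronics", "city": "Chicago", "currency": "Euros", "account": "Account 3"},
    {"division": "Electronics", "city": "Chicago", "currency": "Euros", "account": "Account 4"},
    {"division": "Dessert Toppings", "city": "Chicago", "currency": "Dollars", "account": "Account 1"},
    {"division": "Dessert Toppings", "city": "Chicago", "currency": "Dollars", "account": "Account 4"},
]

def nest(rows, levels, leaf="account"):
    # Group flat rows into nested dicts, one dict level per entry in `levels`.
    tree = {}
    for row in rows:
        node = tree
        for level in levels[:-1]:
            node = node.setdefault(row[level], {})
        node.setdefault(row[levels[-1]], []).append(row[leaf])
    return tree

by_division = nest(accounts, ["division", "city", "currency"])
by_city = nest(accounts, ["city", "currency", "division"])

print(by_division["Electronics"]["Chicago"]["Euros"])
print(by_city["Chicago"]["Dollars"])
```

The hierarchy is built at display time, so switching to City > Currency > Division only means passing the levels in a different order.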
I'm not sure what database platform you're using. But if you're using MS SQL Server, then you should check out recursive queries using common table expressions (CTEs). They're easy to use and are designed for exactly the type of situation you've illustrated (a bill of materials, for instance). Check out this website for more detail: http://www.mssqltips.com/tip.asp?tip=1520
Good luck!

Generate breadcrumbs of categories stored in MySQL

In MySQL, I store categories this way:
categories:
- category_id
- category_name
- parent_category_id
What would be the most efficient way to generate the trail / breadcrumb for a given category_id?
For example
breadcrumbs(category_id):
General > Sub 1 > Sub 2
In theory, there could be unlimited levels.
I'm using php.
UPDATE:
I saw this article (http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/) about the Nested Set Model.
It looks interesting, but how would you go about dynamically managing categories?
It looks easier on paper, like when you know the categories ahead of time, but not when the user can create/delete/edit categories on the fly ...
What do you think?
I like to use the Materialized Path method, since it essentially contains your breadcrumb trail, and makes it easy to do things like select all descendants of a node without using recursive queries.
Materialized Path model
The idea with the Materialized Path model is to link each node in the hierarchy with its position in the tree. This is done with a concatenated list of all the node's ancestors. This list is usually stored in a delimited string. Note the "Lineage" field below.
CAT_ID  NAME      CAT_PARENT  Lineage
1       Home                  .
2       product   1           .1
3       CD’s      2           .1.2
4       LP’s      2           .1.2
5       Artists   1           .1
6       Genre     5           .1.5
7       R&B       6           .1.5.6
8       Rock      6           .1.5.6
9       About Us  1           .1
Traversing the table
Select lpad('-',length(t1.lineage))||t1.name listing
From category t1, category t2
Where t1.lineage like t2.lineage ||'%'
And t2.name = 'Home'
Order by t1.lineage;
Listing
Home
-product
--CD’s
--LP’s
-Artists
--Genre
---R&B
---Rock
-About Us
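Since the question asks about breadcrumbs specifically, here is a minimal Python sketch of how the Lineage column yields the trail directly. The rows mirror the table above (with plain apostrophes); the `categories` dict and the `breadcrumb` function are illustrative names, and in a real application the ancestor names would be fetched with a single query on the ids in the lineage.

```python
# cat_id -> (name, lineage), mirroring the table above.
categories = {
    1: ("Home", "."),
    2: ("product", ".1"),
    3: ("CD's", ".1.2"),
    4: ("LP's", ".1.2"),
    5: ("Artists", ".1"),
    6: ("Genre", ".1.5"),
    7: ("R&B", ".1.5.6"),
    8: ("Rock", ".1.5.6"),
    9: ("About Us", ".1"),
}

def breadcrumb(cat_id):
    name, lineage = categories[cat_id]
    # The lineage already lists every ancestor id, root first.
    ancestor_ids = [int(part) for part in lineage.split(".") if part]
    names = [categories[a][0] for a in ancestor_ids] + [name]
    return " > ".join(names)

print(breadcrumb(7))  # Home > Artists > Genre > R&B
print(breadcrumb(1))  # Home
```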
Generate it (however you like) from a traditional parent model and cache it. It's too expensive to be generating it on the fly and the changes to the hierarchy are usually several orders of magnitude less frequent than other changes ever are. I wouldn't bother with the nested sets model since the hierarchy will be changing and then you have to go fooling around with the lefts and rights. (Note that the article only included recipes for adding and deleting - not re-parenting - which is very simple in the parent model).
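One possible shape for "generate it from the parent model and cache it", sketched in Python rather than PHP. The category names come from the question's example; the `parents` dict stands in for the categories table, and using `functools.lru_cache` as the cache is just one assumption about how the caching could be done.

```python
from functools import lru_cache

# category_id -> (category_name, parent_category_id); None marks the root.
parents = {
    1: ("General", None),
    2: ("Sub 1", 1),
    3: ("Sub 2", 2),
}

@lru_cache(maxsize=None)
def breadcrumb(category_id):
    # Walk up the parent pointers; the finished trail is cached per category.
    names = []
    current = category_id
    while current is not None:
        name, parent_id = parents[current]
        names.append(name)
        current = parent_id
    return " > ".join(reversed(names))

print(breadcrumb(3))  # General > Sub 1 > Sub 2

# After a category is renamed or re-parented, drop the cached trails.
parents[2] = ("Renamed", 1)
breadcrumb.cache_clear()
print(breadcrumb(3))  # General > Renamed > Sub 2
```

Re-parenting is a one-field update in this model, which is why invalidating the cache on the rare hierarchy change is cheap.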
The beauty of nested sets is that you can easily add/remove nodes from the graph with just a few simple SQL statements. It's really not all that expensive, and can be coded pretty quickly.
If you happen to be using PHP (or even if you don't), you can look at this code to see a fairly straight-forward implementation of adding nodes to a nested set model (archive.org backup). Removing (or even moving) is similarly straightforward.
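For comparison, adding a node in the nested set model can be sketched in Python like this. It is an in-memory illustration, not the linked code: `nodes` maps a category name to its left/right values, and the two loop conditions correspond to the two UPDATE statements you would typically run in SQL before inserting the new row.

```python
# name -> [lft, rgt]; a single root to start with.
nodes = {"General": [1, 2]}

def add_child(parent, name):
    # Insert `name` as the last child of `parent`.
    right = nodes[parent][1]
    for bounds in nodes.values():
        if bounds[0] > right:    # UPDATE ... SET lft = lft + 2 WHERE lft > :right
            bounds[0] += 2
        if bounds[1] >= right:   # UPDATE ... SET rgt = rgt + 2 WHERE rgt >= :right
            bounds[1] += 2
    nodes[name] = [right, right + 1]

def breadcrumb(name):
    # Ancestors are exactly the nodes whose interval encloses this node's.
    lft, rgt = nodes[name]
    trail = [n for n, (l, r) in nodes.items() if l <= lft and r >= rgt]
    return " > ".join(sorted(trail, key=lambda n: nodes[n][0]))

add_child("General", "Sub 1")
add_child("Sub 1", "Sub 2")
print(nodes)
print(breadcrumb("Sub 2"))  # General > Sub 1 > Sub 2
```

The breadcrumb falls out of a single range comparison, which is the main attraction of nested sets for this question.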