Do indexes in RavenDB work the same as relational database indexes?

I am quite new to RavenDB and I am trying to understand how indexes work in it. I found a very good explanation of relational database indexing in "How does database indexing work?". But I was wondering: do indexes in RavenDB work the same way as a relational database's indexes?
Thanks.

Kekewong,
No, they work in a drastically different fashion.
They are built in the background, offer full-text search support, and are stored in a different format than most database indexes.

Related

Index data structure for RavenDB

Both MongoDB and CouchDB use B-trees as the underlying data structure for storing indexes. Does anyone know what the equivalent is for RavenDB? There is nothing mentioned about this in the documentation. Thanks!
RavenDB uses Lucene indexes.
In order to allow fast queries over your indexes, RavenDB processes them in the background, executing the queries against the stored documents and persisting the results to a Lucene index. Lucene is a full-text search engine library (Raven uses the .NET version) which allows us to perform lightning-fast full-text searches.
You can read more about indexing in the documentation: How the indexes work
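To make the "processed in the background" part concrete, here is a toy Python sketch of that pattern - not RavenDB's actual code, just the shape of it: writes return immediately, a background worker keeps a separate inverted index current, and a query can be momentarily stale until indexing catches up.

```python
import threading
import queue
from collections import defaultdict

documents = {}                       # stands in for the document store
inverted_index = defaultdict(set)    # stands in for the Lucene index: term -> doc ids
pending = queue.Queue()              # documents waiting to be indexed

def put_document(doc_id, text):
    documents[doc_id] = text         # the write completes immediately...
    pending.put(doc_id)              # ...indexing happens later, in the background

def index_worker():
    while True:
        doc_id = pending.get()
        for term in documents[doc_id].lower().split():
            inverted_index[term].add(doc_id)
        pending.task_done()

threading.Thread(target=index_worker, daemon=True).start()

put_document(1, "full text search with Lucene")
put_document(2, "background indexing in RavenDB")

# A query issued right now may be stale; waiting for the queue to drain
# is this toy's analogue of RavenDB waiting for non-stale results.
pending.join()
print(inverted_index["lucene"])      # {1}
```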

Slow Mongo queries from Rails app taking > 1000 ms ... ideas for optimization?

We built a Rails app on Mongo 2.2 and Rails 3.2.12. We're new to Mongo and would appreciate any tips on how we should optimize very slow queries, ones that take longer than 1000 ms.
We're using MongoMapper as the interface to Mongo.
We are indexing these tables, but is there a way to confirm whether our queries are using these indexes properly? How else can we pinpoint the cause of slowness?
Here is one day's worth of slow Mongo queries: https://gist.github.com/panabee/2876e833002f3151eeda
Here is explain on three of those queries: https://gist.github.com/panabee/358bd87ba7b954018dab
Your issue is schema design and indexing design, and possibly also insufficient resources to handle the current load (it's impossible to tell without fixing the queries first).
When designing your schema you must consider how you will be accessing the data for reads and writes. The structure of your documents and your indexing strategy must reflect the needs of those reads and writes. You currently have queries which must scan thousands of documents to select very few - this is a reflection of a poor indexing strategy. With effective indexing, your queries should be able to use indexes to zero in on just the documents they need. In addition, if sorts must be done, those should also be supported by indexes.
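To confirm whether a query is actually using an index, run explain() on it. A minimal sketch with pymongo (collection and field names here are invented; the same idea applies through MongoMapper): on Mongo 2.2, a healthy indexed query reports a BtreeCursor and scans roughly as many documents as it returns.

```python
from pymongo import MongoClient, ASCENDING, DESCENDING

db = MongoClient()["myapp"]        # connection details are placeholders
posts = db["posts"]

# Compound index supporting both the filter and the sort.
posts.create_index([("user_id", ASCENDING), ("created_at", DESCENDING)])

plan = posts.find({"user_id": 42}).sort("created_at", -1).explain()
print(plan)
# On Mongo 2.2 you want to see "cursor": "BtreeCursor ..." and "nscanned"
# close to "n"; "BasicCursor" means a full collection scan.
```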
This would not be an appropriate venue for a full answer, since your problem isn't that you are missing a single index or using the wrong type of query. You need to reconsider your schema design and your indexing strategy based on the needs of your application.

How to choose a database for my purposes? I want to store file metadata.

I'm building a web app that requires me to store metadata about files, approximately 15-20 "characteristics" for each file, including some shared ones (e.g. user1 & user2 should have access).
Would you recommend using a relational database for this? Or is one of the newer, more scalable NoSQL databases a better option?
It should be something that scales quickly - and allows us to read and write fast.
I'm not sure how that would work with a relational DB in terms of performance (say I'm trying to find all the files that are owned by user1 and shared with user2 that have a certain property - I would essentially have to join 3-4 tables together... which is probably bad for performance?)
Thanks for your feedback!
I don't think JOINing 3 or 4 tables would cause bad performance. If you are considering open-source relational solutions, I would suggest PostgreSQL, which is currently the richest SQL implementation. But MySQL will work too, or even SQLite. They all have decent performance.
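To make that concrete, the query from the question ("files owned by user1 and shared with user2 that have a certain property") is a routine indexed join. A minimal sketch using Python's built-in sqlite3, with an invented schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE files  (id INTEGER PRIMARY KEY,
                         owner_id INTEGER REFERENCES users(id),
                         mime_type TEXT);
    CREATE TABLE shares (file_id INTEGER REFERENCES files(id),
                         user_id INTEGER REFERENCES users(id));
    -- Index the foreign keys so the join seeks instead of scans.
    CREATE INDEX idx_files_owner ON files(owner_id);
    CREATE INDEX idx_shares_user ON shares(user_id, file_id);
""")

rows = con.execute("""
    SELECT f.id, f.mime_type
    FROM files f
    JOIN shares s ON s.file_id = f.id
    WHERE f.owner_id = ?          -- owned by user1
      AND s.user_id  = ?          -- shared with user2
      AND f.mime_type = 'application/pdf'
""", (1, 2)).fetchall()
print(rows)
```

With the foreign-key indexes in place, the database resolves this with index lookups rather than table scans, even with another join to users for display names.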
On the other hand, if the metadata you need to store will expand in the future, a schema-based database will be a hassle. In that case I would instead suggest a schema-less (aka document-based, NoSQL, etc.) database, like the open-source MongoDB. With indexes, it will also have excellent query performance. CouchDB is a richer implementation, but its developers don't pay as much attention to speed.
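As a sketch of the document approach with pymongo (field names invented for illustration): each file's 15-20 characteristics live in one document, new characteristics need no schema migration, and an index covering owner and shared_with keeps the user1/user2 query fast.

```python
from pymongo import MongoClient, ASCENDING

files = MongoClient()["myapp"]["files"]   # placeholder connection/database

files.insert_one({
    "name": "report.pdf",
    "owner": "user1",
    "shared_with": ["user2", "user5"],    # arrays get one index entry per element
    "mime_type": "application/pdf",
    "size_bytes": 48213,
    # ...further characteristics can be added per document, no migration needed
})

files.create_index([("owner", ASCENDING), ("shared_with", ASCENDING)])

# "Owned by user1, shared with user2, with a certain property":
for doc in files.find({"owner": "user1",
                       "shared_with": "user2",
                       "mime_type": "application/pdf"}):
    print(doc["name"])
```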
I think a relational database is a good fit for this. NoSQL databases typically don't allow easy and flexible querying. That is a strength of good old SQL databases.
Storing documents and some info about them isn't the strength of SQL databases.
I wouldn't choose MySQL, because of its license (or rather that of its data providers), and because you cannot say what Oracle is going to do with it in the future.
You are looking for a NoSQL database that is optimized for storing documents, that is extremely fast, and easy to setup (and use).
One that was written in C++ and not Java, and that nonetheless has bindings for .NET and Java, I assume.
I would say MongoDB would be the ideal choice.
Why not use a VCS such as SVN or Hg, where you can assign attributes to files?
This all depends upon what you want to do with the information.

Emulating join behavior with Rails and Mongoid

I just wanted to ask for some advice when building a database with MongoDB. I have been reading a lot that if you have a database with many joins, it's better to go with, say, PostgreSQL.
So if I wanted flexibility and needed my data to join multiple times, should I go with PostgreSQL? I know MongoDB has fast reads/writes but needs to query multiple times to emulate joins. So when would this become a performance hit? Does MongoDB limit your ability to create new complex relationships on your data that did not previously exist?
I guess the attractiveness of MongoDB is its JavaScript syntax and similarity to JSON :)
I will start from the end:
I guess the attractiveness of MongoDB is its JavaScript syntax and similarity to JSON :)
Not only this, and the JSON style is not the main advantage. The main advantages of MongoDB are the ability to embed documents, high performance and full scalability, full index support, map/reduce, etc.
So if I wanted flexibility and needed my data to join multiple times, should I go with PostgreSQL?
It depends on the concrete task; for example, if you are designing a reporting system, I would prefer a relational database. But sometimes, instead of joins and separate collections, you can embed documents. MongoDB is also a good fit for data denormalization (and in many situations you can denormalize in the background to avoid joins).
I know MongoDB has fast reads/writes but needs to query multiple times to emulate joins. So when would this become a performance hit?
If you use MongoDB as a regular relational database (without embedding and denormalization), you will never achieve the best performance.
Does MongoDB limit your ability to create new complex relationships on your data that did not previously exist?
No, MongoDB does not limit you, because it does not contain any constraints between collections (like foreign keys in a SQL database), and it allows you to embed and easily denormalize data to fit your business needs and achieve the best performance.
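A small pymongo sketch of the difference (collection layout invented for illustration): with separate collections you need two round trips to "join" manually, while the embedded version answers the same question in one read.

```python
from pymongo import MongoClient

db = MongoClient()["blog"]   # placeholder database

# Relational-style: two collections, "joined" by a second query.
post_id = db.posts.insert_one({"title": "Hello"}).inserted_id
db.comments.insert_one({"post_id": post_id, "author": "anna", "text": "Nice post"})

post = db.posts.find_one({"_id": post_id})            # round trip 1
comments = list(db.comments.find({"post_id": post_id}))  # round trip 2

# Embedded style: the comments travel inside the post document,
# so a single query returns everything.
db.posts_embedded.insert_one({
    "title": "Hello",
    "comments": [{"author": "anna", "text": "Nice post"}],
})
post = db.posts_embedded.find_one({"title": "Hello"})
print(post["comments"])
```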
Another alternative would be to denormalize your data.
You store copies of data in multiple tables/collections. In doing so, you avoid the need for JOINs and lookups needed to stitch together related pieces of data.
You're storing more data, but your overall application can be faster.
In Mongoid there are two great gems to make this easier:
mongoid_alize &
mongoid_denormalize
http://blog.joshdzielak.com/blog/2012/05/03/releasing-mongoid-alize-comprehensive-field-denormalization-for-mongoid/
You can always use:
http://www.mongodb.org/display/DOCS/MapReduce
Or
http://www.mongodb.org/display/DOCS/Aggregation#Aggregation-Group
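For instance, a rollup that would be a GROUP BY (or a join plus application-side summing) in SQL can be a single server-side aggregation with pymongo (collection and field names invented):

```python
from pymongo import MongoClient

orders = MongoClient()["shop"]["orders"]   # placeholder collection

# Total spent per customer, computed server-side instead of
# pulling rows back and summing in the application.
totals = orders.aggregate([
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$customer_id", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
])
for row in totals:
    print(row["_id"], row["total"])
```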

Can MS SQL Server 2005+ table index creation be automated?

I was reading Ian Stirk's "Uncover Hidden Data to Optimize Application Performance" article in MSDN Magazine, and now I am wondering if table index creation could be automated, as Google App Engine does for its BigTable.
Is there any tool or SQL Server feature that automates table index creation?
No, as far as I know, there's no feature in SQL Server that enables automatic table index creation.
I wouldn't think it to be a good idea, either, because getting the right indexes in place will depend on a multitude of factors, hardly any of which can be really truly automated.
Sure - you can place a primary key on any column called "ID" - until you run into a case where you need a primary key on something else....
Sure, it makes sense to index foreign key columns in the child table - but sometimes, the added overhead for INSERTs can more than offset the gains of having the index.
Getting the right indexes in place is just way too dependent on your actual usage, a lot of dynamic usage parameters, and design decisions on your part, too, so I'd be surprised if any solution would really work all that well...
Marc
I am not aware of any tools, and the best way to create indexes is to actually check the queries and their execution plans manually. I don't think that an automated tool will ever be as good as a few good DBAs analyzing the data together with the Profiler.
But if you feel like giving it a shot yourself, I recommend that you start looking at the performance views in SQL Server.
Start with the function sys.dm_db_missing_index_columns; that should give you a hint of which columns could benefit from being indexed.
sys.dm_db_index_usage_stats could show you which indexes are useless, or could be optimized as well.
sys.dm_exec_cached_plans, sys.dm_exec_query_plan, sys.dm_exec_query_stats and sys.dm_exec_sql_text should show you which queries are performed and how they perform, and together with information from the other views, you could probably find out which ones need more work.
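As a starting point, here is the classic missing-index query over the related DMVs, run from Python via pyodbc (the connection string is a placeholder; the DMV and column names are standard in SQL Server 2005+):

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=master;Trusted_Connection=yes"
)

# These are suggestions only -- review each one before creating anything,
# since the DMVs know nothing about your write workload.
sql = """
SELECT TOP 20
       d.statement          AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact    -- estimated % cost reduction
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats s
  ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_user_impact DESC;
"""

for row in conn.cursor().execute(sql):
    print(row.table_name, row.equality_columns, row.avg_user_impact)
```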
Actually, I vaguely recall some wizard that can help you analyze performance in the Profiler - probably not automatic, but it might be possible to put that in a maintenance plan.
I'm glad you enjoyed my article!
In my new book about SQL Server performance via DMVs (www.manning.com/stirk), in chapter 10, you'll see a script that allows you to automatically create indexes... there's also a script to remove unused indexes... you can see how these could be used together.
Thanks
Ian
I know it's an old thread, but I thought someone might find this interesting:
http://blogs.msdn.com/b/queryoptteam/archive/2006/06/01/613516.aspx
Even though it sounds awesome, I'd urge caution - indexes can cripple a system as much as they can make it fly. Only use this if you know what you are doing!