RavenDB Hierarchical Data Manipulation - ravendb

We have a Blog engine stored in RavenDB. The Blog Posts are stored as separate documents from the Comments. We need to create an Index that retrieves our blog posts as normal but also includes a field for the number of comments belonging to each Blog Post. Each comment document has the Blog Post Id as a foreign key.
Many thanks

The easiest way to handle that is to create a map/reduce index that counts the number of comments per blog post. Then you query the index for the count as part of loading the blog post (you can do both in one round trip using Lazy).
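To illustrate the idea only, here is a minimal in-memory sketch of the map and reduce steps in Python. It is not RavenDB index syntax; the document shapes and the names `map_comment` and `reduce_counts` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical comment documents; "post_id" plays the role of the foreign key.
comments = [
    {"id": "comments/1", "post_id": "posts/1"},
    {"id": "comments/2", "post_id": "posts/1"},
    {"id": "comments/3", "post_id": "posts/2"},
]


def map_comment(comment):
    # Map step: emit one entry with count=1 for every comment.
    return {"post_id": comment["post_id"], "count": 1}


def reduce_counts(entries):
    # Reduce step: group the mapped entries by post_id and sum their counts.
    totals = defaultdict(int)
    for entry in entries:
        totals[entry["post_id"]] += entry["count"]
    return [{"post_id": pid, "count": n} for pid, n in totals.items()]


# The "index": one row per blog post, looked up when the post is loaded.
index = {row["post_id"]: row["count"]
         for row in reduce_counts(map(map_comment, comments))}
```

A post with no comments has no row in this index, so the lookup should default to zero, for example `index.get("posts/3", 0)`.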

Related

Creating a SOLR index for activity stream or newsfeed

I am trying to index the activity feed of a social portal I am building. The portal allows users to follow each other and get updates from the people they follow as an activity feed sorted by date.
For example, user A will be following users B, C, D, E & F. So user A should see all the posts from B, C, D, E & F on his/her activity feed.
Let's assume a post consists of just two fields.
1. The text of the post. (text_field)
2. The name/UID of the user who posted it. (user_field)
Currently, I am creating an index for all the posts and indexing the text_field & user_field. At scale, there can be 1,000,000+ posts. A user may follow 100s if not 1000s of users. What will be the best way to create an index for this scenario?
Should I also index a person's followers, so that they can be looked up quickly and then passed to a second query that gets the posts of all those users sorted by date?
What is the best way to query the index consisting of all these posts, by passing the UIDs of all the users that are followed? Considering there may be 100s or more.
Update:
The motivation for using Solr for the news feed was mainly inspired by this detailed slide and my brief discussion with OpenSocial team.
When starting off with a social portal, fan out on write seems like overkill and is more expensive, so fan out on read is the better fit. Both the slide and the OpenSocial team suggested using a search backend for fan out on read. The slide mentioned above also has data on how it helped them.
At present, the feed is going to be flat and the only sort criterion will be the date (recency). We won't be considering relevance or posts from closer groups.
It's kind of abstract, but I will do my best here. Based on what you mentioned, I am not sure Solr is really the right tool for the job. You can still use Solr for full-text search, but I am not sure about generating a news feed from it in this scenario. Remember that although Solr is pretty impressive, it is a search engine. For the rest of this post I will assume that you stick with Solr, but keep in mind that we are trying to put a square peg in a round hole here.
Here are a few additional questions you should think about.
You will probably want to add a timestamp of the post to the data element.
You need to figure out how to properly sort the results. Is it in order of recency? Or based on posts that the user is more likely to interact with?
If a user has 1000+ connections, would he want to see an update from every one of them in the main feed? Or should posts from a closer group of friends show up higher?
Here are some comments about your questions:
1) If you index a person's followers, it may be hard to keep the index up to date. I am assuming followers are going to change often, and re-indexing in this scenario would not really be practical.
2) That sounds closer to the mark, but again, you need to figure out the sorting. You can get the list of connections for the user, then run a search for the top posts from all of them.
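As a rough sketch of point 2, the snippet below builds the request parameters for such a search in Python. `q`, `sort` and `rows` are standard Solr query parameters and `user_field` comes from the question; the `timestamp` field and the `build_feed_params` helper are assumptions made for illustration.

```python
def build_feed_params(followed_uids, rows=20):
    """Build Solr query parameters for the newest posts by the followed users."""
    if not followed_uids:
        raise ValueError("the user does not follow anyone")

    def quote(uid):
        # Quote each UID so spaces or colons are not parsed as query syntax.
        escaped = uid.replace("\\", "\\\\").replace('"', '\\"')
        return '"{}"'.format(escaped)

    clause = " OR ".join(quote(uid) for uid in followed_uids)
    return {
        "q": "user_field:({})".format(clause),
        "sort": "timestamp desc",  # hypothetical date field, newest first
        "rows": rows,
    }


params = build_feed_params(["B", "C", "D"])
```

The resulting dictionary can be passed as the query string of a request to the core's select handler. With hundreds of followed users the boolean clause grows accordingly, so it is worth checking how the server copes with queries of that size.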

Piranha-CMS How do you get the Post to display on the Page?

I need to get the posts to display on one of my pages...
How do I do this? Can I create the link via the database? I would prefer this if possible, as I'm way better at SQL than C#.
Cheers
There are two regions you can add to your page type, Posts & Post Models. The first only loads the general information, while the second loads the entire post models including all extensions.
Then in your page you get to select the post type, the number of posts, whether to include related entities & the sort order for the posts.
You can read about the included standard regions here:
http://piranhacms.org/docs/pages/regions
Regards
/Håkan

How Facebook organizes posts in news feed page

I have always wondered how Facebook organizes posts in the news feed page. Facebook doesn't order posts by date and time alone. This is obvious when some posts acquire many likes or comments: these posts are displayed first even though they may be older.
Let's suppose a simple database table for posts:
Post_Id
Post_Owner_Id
Post_Text
Post_Image
Post_Date
So what field (or fields) must be added to organize posts the way Facebook does?
From what I've heard, the algorithm Facebook uses to sort the news feed isn't public, but what the algorithm looks for isn't completely secret.
Have a look at these articles for a slight idea on what they do and why.
Bufferapp - Decoding the Facebook newsfeed
Forbes - Facebook Changes News Feed Algorithm To Prioritize Content From Friends Over Pages
Everything You Need To Know About Facebook’s News Feed Algorithm
So if you want to recreate their algorithm, you could get a very rough imitation by sorting first on date rounded to the closest week, second on the type of post (message, page, etc.), and then perhaps on the number of likes it got.
Which means you would need the number of likes and a Post_Type attribute.
You would also need to have it sort them based on friend status (direct or friends-of-friends) and whether or not the post comes from someone verified such as a celebrity.
There is so much to it.
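A minimal Python sketch of the rough imitation described above: sort by week, then by post type, then by likes. It is in no way Facebook's actual algorithm. The `TYPE_PRIORITY` table, the `likes` field and the use of calendar weeks (rather than true rounding to the nearest week) are assumptions for the example.

```python
from datetime import date

# Hypothetical priority per post type; a lower number sorts first.
TYPE_PRIORITY = {"message": 0, "page": 1}


def week_bucket(d):
    # ISO year and week number, so all dates in one calendar week compare equal.
    iso = d.isocalendar()
    return (iso[0], iso[1])


def feed_order(posts):
    return sorted(posts, key=lambda p: (
        tuple(-part for part in week_bucket(p["post_date"])),    # newest week first
        TYPE_PRIORITY.get(p["post_type"], len(TYPE_PRIORITY)),   # unknown types last
        -p["likes"],                                             # most liked first
    ))


posts = [
    {"post_id": 1, "post_date": date(2024, 1, 1), "post_type": "message", "likes": 5},
    {"post_id": 2, "post_date": date(2024, 1, 3), "post_type": "message", "likes": 50},
    {"post_id": 3, "post_date": date(2024, 1, 10), "post_type": "page", "likes": 1},
    {"post_id": 4, "post_date": date(2024, 1, 4), "post_type": "page", "likes": 100},
]
ordered_ids = [p["post_id"] for p in feed_order(posts)]
```

Friend status and verified authors could be added as further elements of the sort key in the same way.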

Storing Blog Comments/Upvotes - Tracking Users?

I am working on a blog-type website in ASP.NET MVC 3. I am trying to figure out how I will deal with post upvotes/downvotes (I will have to know which users have already voted where, to prevent spam voting). Comments on a blog post are another issue.
My thoughts so far (I am sure they are pretty far off the mark):
Votes:
Store a list of UserIDs in a voted field of my Blog table.
For each user in my Users table, store a list of all PostIDs they have voted on.
Comments:
Make a separate Comments table and in that table have a field referencing the parent blog post.
Store a list of CommentIDs in a Comment field in my Blogs table.
I know there are several other ways to go about this but I am trying to set this up so that I won't have to rewrite the whole thing should I get an influx of users.
You might wanna consider creating a Votes table like
User|Post|Type?
john|43 |Up
mary|43 |Down
making User + Post a composite primary key, and thus indexing by both... Then you can easily check if a user has already voted for a post or not... You can also create additional indexes by user or post if needed...
It'd also be a good idea then to keep the "Current Ups and Current Downs" in the blogs table, so you don't have to count them each time...
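A small sketch of the Votes table idea, using Python's built-in sqlite3 module with an in-memory database. The table and column names and the `cast_vote` helper are assumptions for illustration; the cached up/down counters are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE votes (
        user_id   TEXT    NOT NULL,
        post_id   INTEGER NOT NULL,
        vote_type TEXT    NOT NULL CHECK (vote_type IN ('Up', 'Down')),
        PRIMARY KEY (user_id, post_id)  -- one vote per user per post
    )
""")


def cast_vote(conn, user_id, post_id, vote_type):
    """Return True if the vote was recorded, False if the user already voted."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO votes (user_id, post_id, vote_type) VALUES (?, ?, ?)",
                (user_id, post_id, vote_type),
            )
        return True
    except sqlite3.IntegrityError:
        # The composite primary key rejected a second vote by the same user.
        return False


first = cast_vote(conn, "john", 43, "Up")
second = cast_vote(conn, "john", 43, "Down")
other = cast_vote(conn, "mary", 43, "Down")
```

Here the database itself refuses the duplicate, so no separate "has this user voted" query is needed before inserting.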

Database layout tagging system

I am creating a web site for a customer and they want to be able to create articles. My idea is to tag them so I am going to implement the system.
What is the best design, both from an architectural and a performance perspective:
1. To have table with all tags and then have a one to many relationship table that links a tag like this:
articles table with ID
tags table with ID
one to many table with columns Article.ID and Tags.ID
2. To have one table with articles and one with tags for articles like this:
articles table with ID
tags table with Article.ID and tag text
Thanks in advance!
Your first option is the most appropriate and theoretically right.
I would guess your clients don't want tags merely as a nice-to-have feature that everybody else has; they want to search by tags. Even if they don't yet understand their needs and only want tags because everybody around has them, they will realize their needs soon.
The first option will give you better search performance.
Implement separate tables for articles and tags, and a many-to-many table between them.
Definitely the first option.
Apart from the other benefits, you could enforce some regularity in using tags, by checking if the tag (or a similar one) is already present before adding it, allowing users to select from existing tags, and/or allowing only superusers to add new tags.
This way you avoid misspellings or alternate spellings of the same tag (e.g. US, USA, USofA, U.S.A., U.S, US., America, Amerika, Amrica and so on when labelling something about the United States).
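The first option, together with the check for an existing tag, might look like the following sketch using Python's built-in sqlite3 module. All table, column and function names are assumptions for illustration, and the normalization here only catches differences in case and surrounding whitespace, not variants such as "U.S.A." versus "USA".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Note: SQLite only enforces REFERENCES when "PRAGMA foreign_keys = ON" is set.
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
    CREATE TABLE article_tags (
        article_id INTEGER NOT NULL REFERENCES articles(id),
        tag_id     INTEGER NOT NULL REFERENCES tags(id),
        PRIMARY KEY (article_id, tag_id)
    );
""")


def tag_article(conn, article_id, tag_name):
    """Attach a tag to an article, reusing the tag row if it already exists."""
    name = tag_name.strip().lower()
    with conn:
        conn.execute("INSERT OR IGNORE INTO tags (name) VALUES (?)", (name,))
        tag_id = conn.execute(
            "SELECT id FROM tags WHERE name = ?", (name,)).fetchone()[0]
        conn.execute(
            "INSERT OR IGNORE INTO article_tags (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )
    return tag_id


def articles_with_tag(conn, tag_name):
    """Return the ids of all articles carrying the given tag."""
    rows = conn.execute("""
        SELECT a.id
        FROM articles a
        JOIN article_tags link ON link.article_id = a.id
        JOIN tags t ON t.id = link.tag_id
        WHERE t.name = ?
        ORDER BY a.id
    """, (tag_name.strip().lower(),))
    return [row[0] for row in rows]
```

Searching by tag is then a join through the link table, and each tag's text is stored once in the tags table rather than repeated per article.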