Storing Blog Comments/Upvotes - Tracking Users? - sql

I am working on a blog-type website in ASP.NET MVC 3. I am trying to figure out how to handle post upvotes/downvotes (I will need to know which users have already voted where, to prevent spam voting). Comments on a blog post are another issue.
My thoughts so far (I am sure they are pretty far off the mark):
Votes:
Store a list of UserIDs in a voted field of my Blog table.
For each user in my Users table, store a list of all PostIDs they have voted on.
Comments:
Make a separate Comments table and in that table have a field referencing the parent blog post.
Store a list of CommentIDs in a Comment field in my Blogs table.
I know there are several other ways to go about this but I am trying to set this up so that I won't have to rewrite the whole thing should I get an influx of users.

You might wanna consider creating a Votes table like
User|Post|Type?
john|43 |Up
mary|43 |Down
making User + Post a composite primary key, and thus indexing by both... Then you can easily check if a user has already voted for a post or not... You can also create additional indexes by user or post if needed...
It'd also be a good idea to keep the "Current Ups and Current Downs" in the blogs table, so you don't have to count them each time...
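A minimal sketch of that idea, using Python's built-in sqlite3 module rather than the asker's ASP.NET stack. The table and column names (posts, votes, ups, downs) and the cast_vote helper are assumptions made for illustration. The composite primary key rejects a second vote by the same user on the same post, and the cached counters are updated in the same transaction as the vote.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id    INTEGER PRIMARY KEY,
    ups   INTEGER NOT NULL DEFAULT 0,   -- cached "Current Ups"
    downs INTEGER NOT NULL DEFAULT 0    -- cached "Current Downs"
);
CREATE TABLE votes (
    user_name TEXT    NOT NULL,
    post_id   INTEGER NOT NULL REFERENCES posts(id),
    type      TEXT    NOT NULL CHECK (type IN ('Up', 'Down')),
    PRIMARY KEY (user_name, post_id)    -- one vote per user per post
);
INSERT INTO posts (id) VALUES (43);
""")


def cast_vote(conn, user_name, post_id, vote_type):
    """Record a vote; return False if this user already voted on this post."""
    column = {"Up": "ups", "Down": "downs"}[vote_type]  # whitelist the column name
    try:
        with conn:  # one transaction: insert the vote and bump the counter
            conn.execute(
                "INSERT INTO votes (user_name, post_id, type) VALUES (?, ?, ?)",
                (user_name, post_id, vote_type),
            )
            conn.execute(
                f"UPDATE posts SET {column} = {column} + 1 WHERE id = ?",
                (post_id,),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key (or CHECK) rejected the row; nothing was changed.
        return False
```

Calling cast_vote(conn, "john", 43, "Up") twice would store only the first vote, since the second insert violates the primary key and the whole transaction is rolled back.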

Related

Database Schema for Individual and Group accounts(Like Facebook Users and Pages)

I'm trying to create a schema that is very similar to part of what Facebook does, so I will use that as an example, as it makes things easier to explain.
Just like Facebook, we have Users and Pages. Both can publish posts and follow each other, and users have a feed page where they can see the posts of the entities they follow (Users or Pages).
A page is managed by a group of accounts (page moderators); the posts they make will appear as posts from the page.
How do I go about this?
What I thought of:
Create a User table and a Page table. The User table has a column called group_id which maps to the Page table id for the moderator accounts; since both are kind of similar, this makes sense.
Make separate User and Moderator tables.
However, with the above approach I would probably have to make two separate tables for posts, as author_id will be different: users will have user_id and pages will have page_id.
I am also thinking of doing something like this for the Follow table:
follower|user_followed|page_followed
1       |2            |NULL
1       |NULL         |5
I know this is not an ideal solution: it will be problematic to retrieve the user_id and page_id from the Follow table, query the user post table, query the brand post table, merge everything, and show it on the feed. But I'm unable to wrap my head around this any other way. I searched for the Facebook schema but was unable to find one with Pages in it. Any pointers would be helpful.
You should use different tables for different entities:
1. Users
2. Groups
3. Moderators (contains the IDs of users and the IDs of the groups they moderate)
4. Followers (contains the IDs of users and the IDs of the groups they follow)
Tables 3 and 4 provide a so-called "many-to-many" relationship (multiple users can follow multiple groups), since databases only support one-to-many or one-to-one relationships out of the box.
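The four tables could be sketched as follows with Python's sqlite3 module. All table and column names are assumptions chosen for illustration (the Groups table is called page_groups here), and groups_followed_by is a hypothetical helper. Each junction table holds one row per (user, group) pair, which is what makes the relationship many-to-many.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE page_groups (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Junction tables: one row per (user, group) pair.
CREATE TABLE moderators (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    group_id INTEGER NOT NULL REFERENCES page_groups(id),
    PRIMARY KEY (user_id, group_id)
);
CREATE TABLE followers (
    user_id  INTEGER NOT NULL REFERENCES users(id),
    group_id INTEGER NOT NULL REFERENCES page_groups(id),
    PRIMARY KEY (user_id, group_id)
);

INSERT INTO users       VALUES (1, 'alice'), (2, 'bob');
INSERT INTO page_groups VALUES (5, 'Travel'), (6, 'Cooking');
INSERT INTO moderators  VALUES (2, 5);
INSERT INTO followers   VALUES (1, 5), (1, 6), (2, 6);
""")


def groups_followed_by(conn, user_id):
    """Return the names of all groups a user follows, sorted by name."""
    rows = conn.execute(
        """
        SELECT g.name
        FROM followers AS f
        JOIN page_groups AS g ON g.id = f.group_id
        WHERE f.user_id = ?
        ORDER BY g.name
        """,
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]
```

The same join pattern against moderators would answer "which groups does this user moderate", so no extra column on the users table is needed.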

Querying an implicit re-orderable list

I was searching for a way to re-order my records, like blog posts, for instance.
One of the solutions I have found is to have each record reference the previous (or next) record, like in a linked list (https://softwareengineering.stackexchange.com/a/375246). However, this requires the client side (a web service or perhaps a mobile app) to implement the linked-list traversal logic to derive the order.
Is there a way to do this at the database level?
The reason for this is that if you are deriving the order at the client-side, then if you want to display only the first 10 records, you would have to retrieve all the records anyway.
EDIT
It seems the blog posts example was a very bad example, sorry. I was thinking of blog posts as they are displayed on an admin dashboard, and the user can re-order the position they are displayed by dragging and dropping. Hope this is more clear.
EDIT 2
I guess, generally, what I'm really asking is: how can one implement and query a tree-like structure in SQL?
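For illustration, one way the traversal might be pushed into the database is a recursive common table expression. This is only a sketch under assumptions: the posts table, its prev_id column (NULL for the first record), and the first_n helper are all hypothetical, and it uses SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (
    id      INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    prev_id INTEGER REFERENCES posts(id)   -- NULL marks the head of the list
);
-- Stored out of order on purpose; the intended order is A, B, C, D.
INSERT INTO posts (id, title, prev_id) VALUES
    (1, 'C', 3),
    (2, 'A', NULL),
    (3, 'B', 2),
    (4, 'D', 1);
""")


def first_n(conn, n):
    """Walk the linked list inside the database and return the first n titles."""
    rows = conn.execute(
        """
        WITH RECURSIVE ordered(id, title, position) AS (
            SELECT id, title, 1 FROM posts WHERE prev_id IS NULL
            UNION ALL
            SELECT p.id, p.title, o.position + 1
            FROM posts AS p
            JOIN ordered AS o ON p.prev_id = o.id
            WHERE o.position < ?            -- stop once n rows are produced
        )
        SELECT title FROM ordered ORDER BY position
        """,
        (n,),
    ).fetchall()
    return [title for (title,) in rows]
```

Because the recursion stops at position n, asking for the first 10 records does not require reading the whole list on the client.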

What is the permalink to a blog post on Shopify?

Given a product id (PRODUCTID), the permalink to the published product page on Shopify is https://SHOP.myshopify.com/products/ID.
For a blog post, there are two ids, id of the blog post, and id of the blog. How do I get the permalink to the blog post?
I tried https://SHOP.myshopify.com/articles/BLOGPOSTID, but it did not work.
Not sure what you mean by permalink. If you want a longer-term, solid reference to a product, I think the handle serves as a better "permalink" than the ID. The handle is used for search engines and the site map. IDs are more for an administrative view of things, and note that an ID can change if you accidentally delete the product and recreate it. Happens all the time, I bet. But the handle, that stays.
As for referencing blog articles, yes, they remain a bit tougher than products, since they do have that extra reference ID in the path. The reference of blogs/name_of_the_blog/ID_article_handle is awkward for sure. Shopify still keeps the article ID in there because of some really longstanding old code that no one has seen a real reason to fix.
It used to be that a lot of pseudo-SEO-smart people dissed the whole Shopify URL scheme as unworkable for SEO, but I think in the end their complaints proved to be a hefty lot of nothing: nothing to see here, move along.

Creating a SOLR index for activity stream or newsfeed

I am trying to index the activity feed of a social portal I am building. The portal allows users to follow each other and get updates from the people they follow as an activity feed sorted by date.
For example, user A will be following users B, C, D, E & F. So user A should see all the posts from B, C, D, E & F on his/her activity feed.
Let's assume the post consist of just two fields.
1. The text of the post. (text_field)
2. The name/UID of the user who posted it. (user_field)
Currently, I am creating an index for all the posts and indexing the text_field & user_field. At scale, there can be 1,000,000+ posts. A user may follow 100s if not 1000s of users. What would be the best way to create an index for this scenario?
Should I also index a person's followers, so that they can be looked up quickly and then passed to a second query that gets the posts of all those users sorted by date?
What is the best way to query the index consisting of all these posts by passing the UIDs of all the followed users, considering there may be 100s or more?
Update:
The motivation for using Solr for the news feed was mainly inspired by this detailed slide and my brief discussion with the OpenSocial team.
When starting off with a social portal, fan-out on write seems like overkill and more expensive, whereas fan-out on read is a better fit. Both the slide and the OpenSocial team suggested using a search backend for fan-out on read. The slide mentioned above also has data on how it helped them.
At present, the feed is going to be flat and the only sort criterion will be the date (recency). We won't be considering relevance or prioritizing posts from closer groups.
It's kind of abstract, but I will do my best here. Based on what you mentioned, I am not sure if Solr is really the right tool for the job here. You can still have Solr for full text search, but I am not sure about generating a news feed from it in this scenario. Remember that although Solr is pretty impressive, it is a search engine. I will pretend that you will stick with Solr for the rest of the post, but keep in mind that we are trying to put a square peg through a round hole here.
Here are a few additional questions you should think about.
You will probably want to add a timestamp of the post to the data element
You need to figure out how to properly sort the results. Is it in order of recency? Or based on posts that the user is more likely to interact with?
If a user has 1000+ connections, would he want to see an update from every one of them in the main feed? Or should posts from a closer group of friends show up higher?
Here are some comments about your questions:
1) If you index a person's followers, it may be hard to keep up. I am assuming followers are going to change often, and re-indexing in this scenario would not really be practical.
2) That sounds more on par, but again, you need to figure out the sorting. You can get a list of connections for the user, then run a search for top posts from all of them.
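The second approach (look up the connections first, then search for their latest posts) might translate into request parameters like the ones below. This is a sketch, not a tested Solr setup: user_field comes from the question, while the timestamp field, the build_feed_params helper, and the idea of fetching the followed UIDs from a separate store are assumptions. It only builds the standard q, fq, sort, and rows parameters and does not contact a server.

```python
from urllib.parse import urlencode


def build_feed_params(followed_uids, rows=10):
    """Build Solr query parameters for the newest posts by the given users."""
    if not followed_uids:
        raise ValueError("followed_uids must not be empty")

    def quote(uid):
        # Quote each UID as a phrase so special characters are not parsed.
        return '"' + uid.replace("\\", "\\\\").replace('"', '\\"') + '"'

    user_filter = "user_field:(" + " OR ".join(quote(u) for u in followed_uids) + ")"
    return {
        "q": "*:*",                 # match everything; the filter narrows it
        "fq": user_filter,          # restrict to posts by followed users
        "sort": "timestamp desc",   # assumed date field, newest first
        "rows": rows,
    }


params = build_feed_params(["B", "C", "D"])
query_string = urlencode(params)    # ready to append to a /select URL
```

With thousands of followed users the OR list gets long, and Solr's maxBooleanClauses setting (1024 by default) may need attention, so this is worth measuring before committing to it.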

Activity streams / feeds / news in social network database schema

My goal is to implement a database schema for a simple/typical social network.
I have read many threads and answers but still have a couple of open questions.
So we have a User table (userId, name, etc.). Users can perform Actions (reply, like, follow, etc.). I want to implement a log of all activities using the pull model, so we write an entry in the Activity table for every action. The schema for this table is (id, ownerId, actionType, targetId, time), where ownerId is the ID of the user who performed the action, actionType is reply, follow, or another action, and targetId is the ID of a user or post, depending on actionType. When a user fetches his activities, we just query by his friends' IDs. That much is clear to me. My questions are:
1) If I follow a user and then unfollow him, what should I do? Should I make two entries in the Activity table, or should I remove the first followAction entry? What is the best practice?
2) It is clear to me how to query by friend IDs to get all activities of my friends. But if someone who is not my friend likes my photo, I must still get an event like "Someone who is not your friend liked your photo". What are good solutions for this case? Maybe I must change my current schema?
Related questions:
How to implement the activity stream in a social network
Database Design - "Push" Model, or Fan-out-on-write
What's the best manner of implementing a social activity stream?
Thank you all for the good answers.
First, it may be better to split each kind of action into its own table, rather than having all actions in one table, distinguished by types. This makes your metadata about each action more flexible; as you say, the target ID depends on the action; without splitting them out into other tables, it's harder to write constraints on what the data should be.
Second - on your question #1, I think you're confusing a log of user actions with user status. You may need both; you might want two separate data structures. For example, if a user follows and then unfollows, the status is that they aren't following, but the log of actions is that they followed, then unfollowed. So I think you should be careful to have a separate data structure that captures current status of certain relationships, apart from actions. Then the problem becomes simpler, you log all actions as they happen, and update status accordingly.
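The split between an action log and current status could look roughly like this, sketched with Python's sqlite3 module. The follow_log and follows tables and the record_follow_action helper are assumptions for illustration; the log columns mirror the (id, ownerId, actionType, targetId, time) schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Append-only log: every action is kept, including unfollows.
CREATE TABLE follow_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    owner_id    INTEGER NOT NULL,
    action_type TEXT    NOT NULL CHECK (action_type IN ('follow', 'unfollow')),
    target_id   INTEGER NOT NULL,
    time        TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Current status: a row exists only while the relationship holds.
CREATE TABLE follows (
    follower_id INTEGER NOT NULL,
    followed_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followed_id)
);
""")


def record_follow_action(conn, owner_id, action_type, target_id):
    """Log the action and update the current status in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO follow_log (owner_id, action_type, target_id) VALUES (?, ?, ?)",
            (owner_id, action_type, target_id),
        )
        if action_type == "follow":
            conn.execute(
                "INSERT OR IGNORE INTO follows (follower_id, followed_id) VALUES (?, ?)",
                (owner_id, target_id),
            )
        else:
            conn.execute(
                "DELETE FROM follows WHERE follower_id = ? AND followed_id = ?",
                (owner_id, target_id),
            )
```

After a follow followed by an unfollow, the log holds two rows while the follows table is empty again, which matches the distinction drawn above.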
For question #2, the photo should be its own data object, with "likes" split out into a different table; users like posts. The users who like a post can then easily be grouped into two categories: friends (those who have a friend relationship to the poster) and non-friends.
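As a rough sketch of that grouping, again with Python's sqlite3 module: the friendships, photos, and likes tables and the likers_by_friendship helper are hypothetical names, and friendship is stored here as one directed row per (user, friend) pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friendships (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE photos (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
CREATE TABLE likes (
    user_id  INTEGER NOT NULL,
    photo_id INTEGER NOT NULL REFERENCES photos(id),
    PRIMARY KEY (user_id, photo_id)
);
INSERT INTO friendships VALUES (1, 2);        -- user 2 is a friend of user 1
INSERT INTO photos      VALUES (10, 1);       -- photo 10 belongs to user 1
INSERT INTO likes       VALUES (2, 10), (3, 10);
""")


def likers_by_friendship(conn, photo_id):
    """Split the users who liked a photo into friends and non-friends of its owner."""
    rows = conn.execute(
        """
        SELECT l.user_id,
               EXISTS (SELECT 1 FROM friendships AS f
                       WHERE f.user_id = p.owner_id
                         AND f.friend_id = l.user_id) AS is_friend
        FROM likes AS l
        JOIN photos AS p ON p.id = l.photo_id
        WHERE l.photo_id = ?
        ORDER BY l.user_id
        """,
        (photo_id,),
    ).fetchall()
    groups = {"friends": [], "non_friends": []}
    for user_id, is_friend in rows:
        groups["friends" if is_friend else "non_friends"].append(user_id)
    return groups
```

Since likes are looked up by photo rather than by friend ID, a like from a non-friend is found by the same query and can be surfaced as its own kind of event.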