Parent/Child Design For a Basic Social Network Using SQL - sql

I'm trying to build a simple structure for a social-network-style application, but I'm a little confused about how to design the relationships between posts, comments, and media.
Simply put, a media item can be an image or a video (distinguished by a mediaType enumeration). It contains size and URL info for a standard image, a thumbnail, and a video (depending on mediaType). A post may have multiple media items attached to it, and so may a comment. But once a media item is used in one place, it cannot be used in another: no other post can have it, and no other comment can have it. Also, when I implement users, they can refer to an image-type media item as their profilePic. When there is a messaging feature, media might be attached to a message, and so on. So I want things to be a little flexible.
I didn't want to add specific columns such as thumbnailWidth, thumbnailSize, thumbnailURL, etc. to multiple tables, because that would be too much repetition. So I've decided to use a centralized media table to hold all the main information about an uploaded image or video.
I've decided to put the thumbnail and standard image info in the same row; otherwise it felt too complicated to handle. I may split images and videos into separate tables later.
Note: I don't have a structure for comments replying to each other. That is a later concern :)
Here is the current design, without the connection between media and the other tables.
Media
----------
id
thumbnail_width
thumbnail_height
thumbnail_URL
standard_width
standard_height
standard_URL
media_type ("video" or "image")
source_URL (only used if media_type is "video")
(maybe other columns to be used with "video" type)
user_id (who uploaded the media)
Post
----------
id
title
body
user_id (who sent the post.)
Comment
----------
id
body
post_id (which post has this comment)
user_id
Option 1
So one option is to add nullable commentId and postId fields to the media table.
If a media item is attached to a post, put the postId there. If it is attached to a comment, do the same with commentId. If one of them has a value, the others must be null. But this may result in too many reference columns in the media table, because media might be used in a lot of places as the project grows.
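A minimal sketch of Option 1, written with Python's built-in sqlite3 module so it can be run as-is. The snake_case names, the trimmed-down media columns, and the helper `attach_media` are assumptions for illustration, not part of the design above. A CHECK constraint is one way to push the "if one has a value, the others must be null" rule into the database itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE post    (id INTEGER PRIMARY KEY, title TEXT, body TEXT);
CREATE TABLE comment (id INTEGER PRIMARY KEY, body TEXT,
                      post_id INTEGER NOT NULL REFERENCES post(id));
CREATE TABLE media (
    id           INTEGER PRIMARY KEY,
    media_type   TEXT NOT NULL CHECK (media_type IN ('image', 'video')),
    standard_url TEXT NOT NULL,
    post_id      INTEGER REFERENCES post(id),
    comment_id   INTEGER REFERENCES comment(id),
    -- At most one owner column may be set (each IS NOT NULL yields 0 or 1).
    CHECK ((post_id IS NOT NULL) + (comment_id IS NOT NULL) <= 1)
);
INSERT INTO post (id, title, body) VALUES (1, 'Hello', 'First post');
INSERT INTO comment (id, body, post_id) VALUES (1, 'Nice post', 1);
""")


def attach_media(url, post_id=None, comment_id=None):
    """Hypothetical helper: insert an image row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO media (media_type, standard_url, post_id, comment_id) "
                "VALUES ('image', ?, ?, ?)",
                (url, post_id, comment_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Every new owner type (message, profile picture, and so on) would need another nullable column and an updated CHECK, which is the growth concern raised above.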
Option 2
Another option is creating a table for each relationship, like:
PostMedia
----------
id
post_id
media_id (unique. one-to-one relationship with Media table)
CommentMedia
----------
id
comment_id
media_id (unique. one-to-one relationship with Media table)
But now it becomes harder to check whether a media item is already used in a post before saving a comment, or the other way around. We need to check the whole PostMedia table each time.
Another situation: when a user sets an image-type media item as their profile picture, we need to check whether it was used in a post or a comment. I'm not sure about this constraint, but it might come in handy in some situations.
I might add an ownerType enumeration to the media table, with values such as post, comment, profilePic, etc. The PostMedia table could then reference a media item only if its ownerType is post.
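One way the ownerType idea could be enforced without scanning other tables is a composite foreign key: the link table carries its own owner_type column, pinned to 'post', and references the pair (id, owner_type) in media. This is only a sketch under assumed names (`post_media`, `owner_type`, `attach_to_post` are illustrative), again using Python's sqlite3:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE media (
    id         INTEGER PRIMARY KEY,
    owner_type TEXT NOT NULL
               CHECK (owner_type IN ('post', 'comment', 'profile_pic')),
    UNIQUE (id, owner_type)  -- target of the composite foreign key below
);
CREATE TABLE post_media (
    post_id    INTEGER NOT NULL REFERENCES post(id),
    media_id   INTEGER NOT NULL UNIQUE,  -- a media row joins at most one post
    owner_type TEXT NOT NULL DEFAULT 'post' CHECK (owner_type = 'post'),
    FOREIGN KEY (media_id, owner_type) REFERENCES media (id, owner_type)
);
INSERT INTO post (id, title) VALUES (1, 'Hello'), (2, 'World');
INSERT INTO media (id, owner_type) VALUES (1, 'post'), (2, 'comment');
""")


def attach_to_post(post_id, media_id):
    """Hypothetical helper: link a media row to a post; False if the database refuses."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO post_media (post_id, media_id) VALUES (?, ?)",
                (post_id, media_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

A comment_media table would follow the same pattern with owner_type pinned to 'comment'. Because a media row has exactly one owner_type, it cannot appear in both link tables.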
Option 3
The centralized Media table idea is cool, but I think it comes with a lot of complexity. With an object-oriented design, I might just create an abstract media class, put all of the required columns and methods in that class, and extend it as PostMedia, CommentMedia, etc. That would be much easier to handle, but it ends up with a lot of identical columns and similar tables across the db. I don't know if it is a good design.
What would be the best practice here? I might be overcomplicating things, and there might be simpler solutions. I'm open to any advice :)
Thanks!

Related

Querying an implicit re-orderable list

I was searching for a way to re-order my records, like blog posts, for instance.
One of the solutions I have found is to use a self-reference that points to the previous (or next) record, like in a linked list (https://softwareengineering.stackexchange.com/a/375246). However, this requires the client side (a web service or perhaps a mobile app) to implement the linked-list traversal logic to derive the order.
Is there a way to do this at the database level?
The reason for this is that if you are deriving the order at the client-side, then if you want to display only the first 10 records, you would have to retrieve all the records anyway.
EDIT
It seems the blog posts example was a very bad example, sorry. I was thinking of blog posts as they are displayed on an admin dashboard, and the user can re-order the position they are displayed by dragging and dropping. Hope this is more clear.
EDIT 2
I guess, generally, what I'm really asking is, how can one implement and query a tree-like structure in SQL
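Databases that support recursive common table expressions (WITH RECURSIVE) can walk a self-referencing list or tree on the server side. The sketch below assumes a hypothetical `item` table whose `prev_id` column points at the preceding record, and uses SQLite through Python's sqlite3 module. The recursion stops once the requested number of rows is reached, so the client does not have to fetch the whole list:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id      INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    prev_id INTEGER REFERENCES item(id)  -- NULL marks the head of the list
);
-- Stored out of order on purpose: the list reads A, B, C, D.
INSERT INTO item (id, title, prev_id) VALUES
    (1, 'C', 3), (2, 'A', NULL), (3, 'B', 2), (4, 'D', 1);
""")


def first_n(n):
    """Return the titles of the first n items in linked-list order."""
    rows = conn.execute(
        """
        WITH RECURSIVE ordered(id, title, position) AS (
            SELECT id, title, 1 FROM item WHERE prev_id IS NULL
            UNION ALL
            SELECT i.id, i.title, o.position + 1
            FROM item AS i
            JOIN ordered AS o ON i.prev_id = o.id
            WHERE o.position < ?      -- stop walking after n rows
        )
        SELECT title FROM ordered ORDER BY position
        """,
        (n,),
    ).fetchall()
    return [title for (title,) in rows]
```

The same pattern applies to a tree: replace `prev_id` with a `parent_id` column and start the recursion from the root rows.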

Activity streams / feeds / news in social network database schema

My goal is to implement a database schema for a simple, typical social network.
I have read many threads and answers but still have a couple of open questions.
So we have a User table (userId, name, etc.). Users can perform Actions (reply, like, follow, etc.). I want to log all activities and do it as a PULL model, so we write an entry in an Activity table for every action. The schema for this table is (id, ownerId, actionType, targetId, time), where ownerId is the id of the User who performed the action, actionType is reply, follow, or another action, and targetId is the id of a user or post, depending on actionType. When a User fetches his activity feed, we just query by his friends' ids. That much is clear to me. My questions are:
1) If I follow a User and then unfollow him, what should I do? Should I make two entries in the Activity table, or should I remove the first followAction entry? What is the best practice?
2) It is clear to me how to query by friend ids to get all activities of my friends. But if someone who is not my friend likes my photo, I must still get an event like "Someone who is not your friend liked your photo". What are good solutions for this case? Maybe I need to change my current schema?
Related questions:
How to implement the activity stream in a social network
Database Design - "Push" Model, or Fan-out-on-write
What's the best manner of implementing a social activity stream?
Thank you all for the good answers.
First, it may be better to split each kind of action into its own table, rather than having all actions in one table, distinguished by types. This makes your metadata about each action more flexible; as you say, the target ID depends on the action; without splitting them out into other tables, it's harder to write constraints on what the data should be.
Second - on your question #1, I think you're confusing a log of user actions with user status. You may need both; you might want two separate data structures. For example, if a user follows and then unfollows, the status is that they aren't following, but the log of actions is that they followed, then unfollowed. So I think you should be careful to have a separate data structure that captures current status of certain relationships, apart from actions. Then the problem becomes simpler, you log all actions as they happen, and update status accordingly.
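As a rough illustration of keeping the log and the status separate, the sketch below uses two hypothetical tables, `follow_log` (append-only) and `follow_status` (current state), with Python's sqlite3. All names are assumptions; the point is only that every action is logged while the status table is updated in the same transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follow_log (          -- append-only history of actions
    id          INTEGER PRIMARY KEY,
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    action      TEXT NOT NULL CHECK (action IN ('follow', 'unfollow')),
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE follow_status (       -- who currently follows whom
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)
);
""")


def record(follower_id, followee_id, action):
    """Log the action and bring the current status in line, atomically."""
    with conn:  # one transaction: both statements succeed or neither does
        conn.execute(
            "INSERT INTO follow_log (follower_id, followee_id, action) "
            "VALUES (?, ?, ?)",
            (follower_id, followee_id, action),
        )
        if action == "follow":
            conn.execute(
                "INSERT OR IGNORE INTO follow_status VALUES (?, ?)",
                (follower_id, followee_id),
            )
        else:
            conn.execute(
                "DELETE FROM follow_status "
                "WHERE follower_id = ? AND followee_id = ?",
                (follower_id, followee_id),
            )


def is_following(follower_id, followee_id):
    row = conn.execute(
        "SELECT 1 FROM follow_status WHERE follower_id = ? AND followee_id = ?",
        (follower_id, followee_id),
    ).fetchone()
    return row is not None
```

After a follow and then an unfollow, the log holds two rows and the status table holds none, which matches the distinction drawn above.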
For question #2, the photo should be its own data object, with "likes" split out into a different table; users like posts. Then of all of the users who like a post, they can easily be grouped into two categories; friends (those who have a friend relationship to the poster) and non-friends.
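The friend / non-friend grouping can be expressed as a LEFT JOIN from the likes to the photo owner's friendships. The tables `photo`, `photo_like`, and `friendship` below are assumed names, and the sketch assumes a friendship is stored as a row from the owner to each friend:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE friendship (
    user_id   INTEGER NOT NULL,
    friend_id INTEGER NOT NULL,
    PRIMARY KEY (user_id, friend_id)
);
CREATE TABLE photo (id INTEGER PRIMARY KEY, owner_id INTEGER NOT NULL);
CREATE TABLE photo_like (
    photo_id INTEGER NOT NULL REFERENCES photo(id),
    user_id  INTEGER NOT NULL,
    PRIMARY KEY (photo_id, user_id)
);
-- Sample data: user 1 owns photo 10 and is friends with user 2 only.
INSERT INTO friendship VALUES (1, 2);
INSERT INTO photo VALUES (10, 1);
INSERT INTO photo_like VALUES (10, 2), (10, 3);
""")


def likers_by_friendship(photo_id):
    """Return (friend_ids, non_friend_ids) of users who liked the photo."""
    rows = conn.execute(
        """
        SELECT l.user_id, f.friend_id IS NOT NULL AS is_friend
        FROM photo_like AS l
        JOIN photo AS p ON p.id = l.photo_id
        LEFT JOIN friendship AS f
               ON f.user_id = p.owner_id AND f.friend_id = l.user_id
        WHERE l.photo_id = ?
        ORDER BY l.user_id
        """,
        (photo_id,),
    ).fetchall()
    friends = [user_id for user_id, is_friend in rows if is_friend]
    others = [user_id for user_id, is_friend in rows if not is_friend]
    return friends, others
```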

Embed Ektron smartform in another Ektron smartform

(Using Ektron version 8.6.1)
Say I have a smartform ContactInfo, something like:
<ContactInfo>
<Name></Name>
<Email></Email>
</ContactInfo>
I would like to create another smartform (e.g. NewsArticle) and "embed" ContactInfo inside
<NewsArticle>
<Title></Title>
<Summary></Summary>
...
<ContactInfo>
<Name></Name>
<Email></Email>
</ContactInfo>
</NewsArticle>
My solution thus far has been to include a Resource Selector field to add a reference to an existing smartform instance. I would prefer to make the association at the configuration level, to make the data entry workflow more intuitive.
I'm using Bill Cava's ContentTypes and generating classes from smartform XSDs, so it would also make the presentation code more natural and type-safe in that embedded fields could be accessed directly (rather than having to make another request based on a reference ID, which may or may not be an ID to the smartform I'm expecting).
I gather this is not possible out of the box; I'm not opposed to hacking Workarea code to make something like this work. Does anyone have experience with a scenario like this?
I heard from an Ektron rep that they are planning on elevating the role of smartforms in an upcoming summer release - can anyone offer some more info to that point? Perhaps smartform composition like I've described will be supported?
Currently it isn't possible to do smartform composition. Depending on why/if you actually need a second smartform definition, you could just define the contact info in the news article.
If the contact info smartforms are related to the news articles in a one to many or many to many fashion, then using the resource selector as you have is the only way that I know of to create the relationship you are looking for.
If the relationship is one-to-one or many-to-one, then I'd suggest doing away with the separate smartform definition.
If you can clarify the workflow you are trying to achieve for the content authors, I might be able to respond better.
The Content Types would represent the data in the CMS. Suppose, as in your example, a NewsArticle contains a reference to a ContactInfo. Embedding the ContactInfo inside your NewsArticle might make sense from a presentation perspective, but it turns your ContentTypes into a one-way data model. You would lose the ability to construct a new NewsArticle and persist it into the CMS.
What might work well for you is to leave the content types as-is, with the id of the ContactInfo from the resource selector. Then create a NewsArticleDisplayModel... essentially a view model that contains the news article data plus ContactName and ContactEmail.
Now, if you need the contact info to be searchable, you could get really fancy with CMS Extensions and hook into the OnBeforePublish event to update searchable metadata with the name from the ContactInfo, so that the NewsArticle can be searched for using the values from the other "embedded" resource. That could get kinda tricky, though... ideally you'd have to also hook into the publish events of the ContactInfo objects in case something changes on that side, too. Then do you create a custom database table to track which NewsArticle content ids are using a particular ContactInfo?
Your solution can get as complex as it needs to, but I would keep the content blocks separate. If nothing else, you'll end up with a more maintainable and upgradable solution.

Draft / Live Content System Database Design

I've been working on a project that requires draft/live versions of content and have thought of a design such as below:
Article
ID
Creator
CreationDate
DraftContent(fk to ArticleContent)
PublicContent(fk to ArticleContent)
IsPendingApproval
ArticleContent
Title
Body
I am wondering if it would be better to change the foreign keys upon an article being published or if it is better to just copy the contents from the draft table to the live table.
Any suggestions?
Edit: Both draft and live versions exist at once although the live version is the only one that is visible to the public. There can only be one draft and one live table
Part of the reason for this design is to force users to have their articles approved before they go live.
Update:
We decided to use Kieren's solution with a slight modification. Instead of using separate columns such as IsPublished or IsLive, we decided to use a single state column. Otherwise the design remained the same.
Draft articles that become live and then are 'published'
The usual thing would be to have a status/type flag on the article table - IsLive.
Using separate tables is unnecessary and redundant; changing foreign keys doesn't make much sense either. Think of the article as a valid object, whether it's draft or live. The only difference is that in most cases you only want to display live articles. In some cases in the future, you might want to display both.
Articles that might be edited and have a new draft version after initially becoming live
In terms of one article having both a live and a draft version, the most common pattern would be to have a master Article entity/object, and then, say, an ArticleVersion hanging off it. The ArticleVersion would have the IsLive property, or even better, the Article itself would have a property, CurrentLiveVersionId. That way there can be live and draft versions lying around, but you'd usually only join Article onto ArticleVersion by that CurrentLiveVersionId to get the current live version.
Advantages of having the ArticleVersion table include the fact that the entire history of an article, a changelog, can be stored, so you can revert to previous versions if needed, or review changes. All for a very low implementation cost.
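A small sketch of the Article / ArticleVersion pattern, using Python's sqlite3. The snake_case column names and the helpers `create_article`, `save_version`, `publish`, and `live_version` are assumptions made for illustration. Publishing only moves the pointer, so older versions stay available as history:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE article (
    id                      INTEGER PRIMARY KEY,
    creator                 TEXT NOT NULL,
    current_live_version_id INTEGER REFERENCES article_version(id)
);
CREATE TABLE article_version (
    id         INTEGER PRIMARY KEY,
    article_id INTEGER NOT NULL REFERENCES article(id),
    title      TEXT NOT NULL,
    body       TEXT NOT NULL
);
""")


def create_article(creator):
    with conn:
        cur = conn.execute("INSERT INTO article (creator) VALUES (?)", (creator,))
    return cur.lastrowid


def save_version(article_id, title, body):
    """Store a new version (a draft until it is published)."""
    with conn:
        cur = conn.execute(
            "INSERT INTO article_version (article_id, title, body) VALUES (?, ?, ?)",
            (article_id, title, body),
        )
    return cur.lastrowid


def publish(article_id, version_id):
    """Point the article at the given version; older versions are kept."""
    with conn:
        conn.execute(
            "UPDATE article SET current_live_version_id = ? WHERE id = ?",
            (version_id, article_id),
        )


def live_version(article_id):
    """Return (title, body) of the live version, or None if nothing is live."""
    return conn.execute(
        """
        SELECT v.title, v.body
        FROM article AS a
        JOIN article_version AS v ON v.id = a.current_live_version_id
        WHERE a.id = ?
        """,
        (article_id,),
    ).fetchone()
```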
Let me know if I can clarify this method.
Your design looks appropriate to me. When a new version goes live, I would:
UPDATE the PublicContent key to point to the (formerly) draft article.
DELETE the no-longer-referenced formerly-published article.
NULL the DraftContent key or, if your model calls for always having a draft version, INSERT a new, empty draft into ArticleContent and point the DraftContent key to it.
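The three steps above could be wrapped in one transaction, roughly as follows. This is a sketch against the question's original two-foreign-key design; the `id` column on article_content, the snake_case names, and the `publish` helper are assumptions. It takes the "NULL the DraftContent key" branch rather than inserting a new empty draft:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE article_content (
    id    INTEGER PRIMARY KEY,
    title TEXT,
    body  TEXT
);
CREATE TABLE article (
    id                INTEGER PRIMARY KEY,
    draft_content_id  INTEGER REFERENCES article_content(id),
    public_content_id INTEGER REFERENCES article_content(id)
);
""")


def publish(article_id):
    """Promote the draft to live, then delete the previously published content."""
    with conn:  # all three steps commit together or not at all
        row = conn.execute(
            "SELECT draft_content_id, public_content_id FROM article WHERE id = ?",
            (article_id,),
        ).fetchone()
        if row is None or row[0] is None:
            raise ValueError("article has no draft to publish")
        draft_id, old_public_id = row
        # Steps 1 and 3: repoint the public key and clear the draft key.
        conn.execute(
            "UPDATE article SET public_content_id = ?, draft_content_id = NULL "
            "WHERE id = ?",
            (draft_id, article_id),
        )
        # Step 2: remove the content row that is no longer referenced.
        if old_public_id is not None:
            conn.execute(
                "DELETE FROM article_content WHERE id = ?", (old_public_id,)
            )
```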

Database layout tagging system

I am creating a web site for a customer and they want to be able to create articles. My idea is to tag them so I am going to implement the system.
What is the best design, from both an architectural and a performance perspective:
1. To have table with all tags and then have a one to many relationship table that links a tag like this:
articles table with ID
tags table with ID
one to many table with columns Article.ID and Tags.ID
2. To have one table with articles and one with tags for articles like this:
articles table with ID
tags table with Article.ID and tag text
Thanks in advance!
Your first option is the most appropriate and theoretically right.
I'd guess your clients don't want tags merely as a nice feature that everybody else has; they want to be able to search by tag. Even if they don't yet understand their needs and only want tags because everyone around them has them, they will realize that need soon.
The first option will give you better search performance.
Implement separate tables for articles and tags, plus a many-to-many table between them.
Definitely the first option.
Apart from the other benefits, you could enforce some regularity in using tags, by checking if the tag (or a similar one) is already present before adding it, allowing users to select from existing tags, and/or allowing only superusers to add new tags.
This way you avoid misspellings or alternate spellings of the same tag (e.g. US, USA, USofA, U.S.A., U.S, US., America, Amerika, Amrica and so on when labelling something about the United States).
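A minimal sketch of the first option, with Python's sqlite3: separate article and tag tables plus a linking table, and a lookup-before-insert step so the same tag is reused. Table names and the helpers `tag_article` and `articles_with_tag` are assumptions. The case-insensitive UNIQUE constraint only catches differences in letter case (and SQLite's NOCASE covers ASCII letters only); catching "similar" tags such as USA vs. U.S.A. would need extra normalization that is not shown here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE tag (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL COLLATE NOCASE UNIQUE  -- 'SQL' and 'sql' are one tag
);
CREATE TABLE article_tag (                    -- the linking table
    article_id INTEGER NOT NULL REFERENCES article(id),
    tag_id     INTEGER NOT NULL REFERENCES tag(id),
    PRIMARY KEY (article_id, tag_id)
);
INSERT INTO article (id, title) VALUES (1, 'Intro to joins'), (2, 'Indexing');
""")


def tag_article(article_id, name):
    """Attach a tag to an article, reusing an existing tag row when present."""
    name = name.strip()
    with conn:
        conn.execute("INSERT OR IGNORE INTO tag (name) VALUES (?)", (name,))
        (tag_id,) = conn.execute(
            "SELECT id FROM tag WHERE name = ?", (name,)
        ).fetchone()
        conn.execute(
            "INSERT OR IGNORE INTO article_tag (article_id, tag_id) VALUES (?, ?)",
            (article_id, tag_id),
        )


def articles_with_tag(name):
    """Search by tag: titles of all articles carrying the given tag."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM article AS a
        JOIN article_tag AS at ON at.article_id = a.id
        JOIN tag AS t ON t.id = at.tag_id
        WHERE t.name = ?
        ORDER BY a.id
        """,
        (name.strip(),),
    ).fetchall()
    return [title for (title,) in rows]
```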