Draft / Live Content System Database Design - sql

I've been working on a project that requires draft/live versions of content and have thought of a design such as below:
Article
----------
ID
Creator
CreationDate
DraftContent (fk to ArticleContent)
PublicContent (fk to ArticleContent)
IsPendingApproval
ArticleContent
----------
Title
Body
I am wondering if it would be better to change the foreign keys upon an article being published or if it is better to just copy the contents from the draft table to the live table.
Any suggestions?
Edit: Both draft and live versions exist at once, although the live version is the only one that is visible to the public. There can only be one draft and one live version.
Part of the reason for this design is to force users to have their articles approved before they go live.
Update:
We decided to use Kieren's solution with a slight modification. Instead of using separate columns for items like IsPublished or IsLive, we decided to use a single state column. Otherwise the design remained the same.

Draft articles that become live and then are 'published'
The usual thing would be to have a status/type flag on the article table - IsLive.
Using separate tables is unnecessary and redundant; changing foreign keys doesn't make much sense either. Think of the article as a valid object, whether it's draft or live. The only difference is that in most cases you only want to display live articles. In some cases in the future, you might want to display both.
Articles that might be edited and have a new draft version after initially becoming live
In terms of one article having both a live and a draft version - the most common pattern would be to have a master Article entity/object, and then, say, ArticleVersion coming from that. The ArticleVersion would have the IsLive property, or even better, the Article itself would have a property, CurrentLiveVersionId. That way there can be live and draft versions lying around, but you'd usually only join Article onto ArticleVersion by that CurrentLiveVersionId to get the current live version.
Advantages of having the ArticleVersion table include the fact that the entire history of an article, a changelog, can be stored, so you can revert to previous versions if needed, or review changes. All for a very low implementation cost.
Let me know if I can clarify this method.
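To make the Article / ArticleVersion idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names and CurrentLiveVersionId come from the description above; the column types, the ArticleId column and the helper functions (save_version, publish, live_article) are assumptions for illustration only.

```python
import sqlite3

# Hypothetical schema: names follow the answer's wording, types are assumed.
SCHEMA = """
CREATE TABLE Article (
    Id INTEGER PRIMARY KEY,
    Creator TEXT NOT NULL,
    CurrentLiveVersionId INTEGER REFERENCES ArticleVersion(Id)
);
CREATE TABLE ArticleVersion (
    Id INTEGER PRIMARY KEY,
    ArticleId INTEGER NOT NULL REFERENCES Article(Id),
    Title TEXT NOT NULL,
    Body TEXT NOT NULL
);
"""

def save_version(conn, article_id, title, body):
    """Store a new version (a draft until it is published) and return its id."""
    cur = conn.execute(
        "INSERT INTO ArticleVersion (ArticleId, Title, Body) VALUES (?, ?, ?)",
        (article_id, title, body),
    )
    return cur.lastrowid

def publish(conn, article_id, version_id):
    """Make a version live by pointing the article at it."""
    conn.execute(
        "UPDATE Article SET CurrentLiveVersionId = ? WHERE Id = ?",
        (version_id, article_id),
    )

def live_article(conn, article_id):
    """Return (title, body) of the live version, or None if nothing is live."""
    return conn.execute(
        "SELECT v.Title, v.Body FROM Article a "
        "JOIN ArticleVersion v ON v.Id = a.CurrentLiveVersionId "
        "WHERE a.Id = ?",
        (article_id,),
    ).fetchone()

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO Article (Id, Creator) VALUES (1, 'alice')")
v1 = save_version(conn, 1, "Hello", "first text")
publish(conn, 1, v1)
v2 = save_version(conn, 1, "Hello", "edited draft")  # stored, but not live yet
```

Old versions are never overwritten here, so the ArticleVersion rows double as the changelog, and reverting is just another call to publish with an older version id.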

Your design looks appropriate to me. When a new version goes live, I would:
UPDATE the PublicContent key to point to the (formerly) draft article.
DELETE the no-longer-referenced formerly-published article.
NULL the DraftContent key or, if your model calls for always having a draft version, INSERT a new, empty draft into ArticleContent and point the DraftContent key to it.
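As a rough sketch, the steps above could run in a single transaction like this (Python with the built-in sqlite3 module). The ID column on ArticleContent, the column types, resetting IsPendingApproval, and the publish_draft helper are assumptions that the original schema does not spell out.

```python
import sqlite3

def publish_draft(conn, article_id):
    """Promote an article's draft content to live content."""
    with conn:  # one transaction: either every step applies or none does
        row = conn.execute(
            "SELECT DraftContent, PublicContent FROM Article WHERE ID = ?",
            (article_id,),
        ).fetchone()
        if row is None or row[0] is None:
            raise ValueError("article has no draft to publish")
        draft_id, old_public_id = row
        # Repoint PublicContent at the former draft and clear DraftContent.
        # Resetting IsPendingApproval here is an assumption.
        conn.execute(
            "UPDATE Article SET PublicContent = ?, DraftContent = NULL, "
            "IsPendingApproval = 0 WHERE ID = ?",
            (draft_id, article_id),
        )
        # Delete the formerly published content, now unreferenced.
        if old_public_id is not None:
            conn.execute(
                "DELETE FROM ArticleContent WHERE ID = ?", (old_public_id,)
            )

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE ArticleContent (ID INTEGER PRIMARY KEY, Title TEXT, Body TEXT);
CREATE TABLE Article (
    ID INTEGER PRIMARY KEY,
    DraftContent INTEGER REFERENCES ArticleContent(ID),
    PublicContent INTEGER REFERENCES ArticleContent(ID),
    IsPendingApproval INTEGER NOT NULL DEFAULT 0
);
INSERT INTO ArticleContent (ID, Title, Body)
    VALUES (1, 'Old', 'live text'), (2, 'New', 'draft text');
INSERT INTO Article (ID, DraftContent, PublicContent, IsPendingApproval)
    VALUES (1, 2, 1, 1);
""")
publish_draft(conn, 1)
```

Doing the update before the delete matters when foreign keys are enforced: the old content row can only be removed once nothing points at it.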

Related

Parent/Child Design For a Basic Social Network Using SQL

I'm trying to build a simple structure for a social network style application. But I have a little confusion about how to design the relationship between posts, comments and media.
So simply, media can be an image or a video (with an enumeration). It contains size and URL info about a standard image, thumbnail and video (according to the mediaType enumeration). A post may have multiple media attached to it. A comment may have multiple media attached to it. But when a media is used in one place, it cannot be used in another. No other post can have it. No other comment can have it. Also, when I implement users, they can refer to an image type media as their profilePic. When there is a messaging feature, some media might be attached to a message etc. So, I want things to be a little flexible.
I didn't want to add specific columns about thumbnailWidth, thumbnailSize, thumbnailURL etc. to multiple tables, because it would be just too much repetition. So I've decided to use a centralized media table to hold all the main information about an uploaded image or video.
I've decided to put the thumbnail and standard image info in the same row, otherwise it just felt too complicated to handle. I may divide images and videos into separate tables later.
Note: I don't have a structure for comments to reply to each other. That is a later concern :)
Here is the current design without the connection between media and other tables.
Media
----------
id
thumbnail_width
thumbnail_height
thumbnail_URL
standard_width
standard_height
standard_URL
media_type ("video" or "image")
source_URL (only used if media_type is "video")
(maybe other columns to be used with "video" type)
user_id (who uploaded the media)
Post
----------
id
title
body
user_id (who sent the post.)
Comment
----------
id
body
post_id (which post has this comment)
user_id
Option 1
So one option is adding nullable commentId and postId fields to the media table.
If a media is attached to a post, put the postId there. If it is attached to a comment, do the same for the commentId. If one of them has a value, the others must be null. But this may result in too many reference columns in the media table, because a media might be used in a lot of places as the project grows.
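For what it's worth, the "if one has a value, the others must be null" rule can be pushed into the database with a CHECK constraint. Below is a minimal sketch in Python with the built-in sqlite3 module; the trimmed-down Media table and the try_insert helper are assumptions for illustration, and the sum-of-comparisons trick relies on SQLite evaluating comparisons to 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Media (
    id INTEGER PRIMARY KEY,
    media_type TEXT NOT NULL CHECK (media_type IN ('image', 'video')),
    post_id INTEGER,
    comment_id INTEGER,
    -- At most one owner column may be set; each comparison yields 0 or 1.
    CHECK ((post_id IS NOT NULL) + (comment_id IS NOT NULL) <= 1)
);
""")

def try_insert(post_id, comment_id):
    """Insert a media row; return False if the owner rule rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO Media (media_type, post_id, comment_id) "
                "VALUES ('image', ?, ?)",
                (post_id, comment_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

attached_to_post = try_insert(1, None)
attached_to_comment = try_insert(None, 7)
attached_to_both = try_insert(1, 7)  # rejected by the CHECK constraint
```

Every new owner type (message, profile picture, ...) adds one more nullable column and one more term in the CHECK, which is exactly the growth problem described above.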
Option 2
Another option is creating tables for each relationship, like:
PostMedia
----------
id
post_id
media_id (unique. one-to-one relationship with Media table)
CommentMedia
----------
id
comment_id
media_id (unique. one-to-one relationship with Media table)
But now it becomes harder to check if a media is used in a post, before saving a comment. Or the other way around. We need to check the whole PostMedia table each time.
Another situation might be, when a user sets a media with the image type as their profile picture, we need to check if it was used in a post or a comment. I'm not sure about this constraint but it might come in handy for some situations.
I might set some ownerType enumeration in the media table. That might be post, comment, profilePic etc. And the PostMedia table could reference a media only if the ownerType is post.
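One way the ownerType idea could be enforced by the database itself is a composite foreign key: PostMedia carries an owner_type column fixed to 'post' and references Media on (id, owner_type) together. This is only a sketch in Python with the built-in sqlite3 module; the owner_type column names, the trimmed-down tables and the attach_to_post helper are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE Media (
    id INTEGER PRIMARY KEY,
    owner_type TEXT NOT NULL
        CHECK (owner_type IN ('post', 'comment', 'profilePic')),
    UNIQUE (id, owner_type)  -- target for the composite foreign key
);
CREATE TABLE PostMedia (
    post_id INTEGER NOT NULL,
    media_id INTEGER NOT NULL UNIQUE,  -- a media belongs to one post at most
    owner_type TEXT NOT NULL DEFAULT 'post' CHECK (owner_type = 'post'),
    FOREIGN KEY (media_id, owner_type) REFERENCES Media (id, owner_type)
);
INSERT INTO Media (id, owner_type) VALUES (1, 'post'), (2, 'comment');
""")

def attach_to_post(post_id, media_id):
    """Link a media row to a post; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO PostMedia (post_id, media_id) VALUES (?, ?)",
                (post_id, media_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

post_media_ok = attach_to_post(10, 1)       # media 1 is owned by a post
comment_media_ok = attach_to_post(10, 2)    # media 2 is a comment's media
```

With this shape there is no need to scan PostMedia before saving a comment: a CommentMedia table built the same way (fixed to 'comment') simply cannot reference a media whose owner_type is 'post'.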
Option 3
The centralized Media table idea is cool, but it comes with a lot of complexity I think. With an object-oriented design, I might just create an abstract media class, put all of the required columns and methods in that class and extend it as PostMedia, CommentMedia etc. It would be much easier to handle, but it ends up with a lot of identical columns and similar tables across the db. I don't know if it is a good design.
What would be the best practice here? I might be overcomplicating things; there might be simpler solutions. I'm open to any advice :)
Thanks!

Querying an implicit re-orderable list

I was searching for a way to re-order my records, like blog posts, for instance.
One of the solutions I have found is to use a self-reference to the previous (or next) record, like in a linked list (https://softwareengineering.stackexchange.com/a/375246). However, this requires the client side (a web service or perhaps a mobile app) to implement the linked-list traversal logic to derive the order.
Is there a way to do this at the database level?
The reason for this is that if you are deriving the order at the client-side, then if you want to display only the first 10 records, you would have to retrieve all the records anyway.
EDIT
It seems the blog posts example was a very bad example, sorry. I was thinking of blog posts as they are displayed on an admin dashboard, and the user can re-order the position they are displayed by dragging and dropping. Hope this is more clear.
EDIT 2
I guess, generally, what I'm really asking is: how can one implement and query a tree-like structure in SQL?
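For reference, one database-level way to walk such a self-referencing list is a recursive common table expression, which is also the usual tool for tree-like data. The sketch below uses Python's built-in sqlite3 module; the item table, its prev_id column and the first_n helper are hypothetical names, and the recursion is bounded so that only the first n rows are ever visited.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    prev_id INTEGER REFERENCES item(id)  -- NULL marks the head of the list
);
-- Display order is c, a, b regardless of the ids.
INSERT INTO item (id, title, prev_id) VALUES
    (1, 'a', 3), (2, 'b', 1), (3, 'c', NULL);
""")

def first_n(conn, n):
    """Return the first n (id, title) pairs in linked-list order."""
    return conn.execute(
        """
        WITH RECURSIVE ordered(id, title, position) AS (
            SELECT id, title, 1 FROM item WHERE prev_id IS NULL
            UNION ALL
            SELECT i.id, i.title, o.position + 1
            FROM item i JOIN ordered o ON i.prev_id = o.id
            WHERE o.position < ?          -- stop walking after n rows
        )
        SELECT id, title FROM ordered ORDER BY position LIMIT ?
        """,
        (n, n),
    ).fetchall()

top_two = first_n(conn, 2)
```

An index on prev_id would keep each recursive step cheap on a larger table.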

What is the permalink to a blog post on Shopify?

Given a product id (PRODUCTID), the permalink to the published product page on Shopify is https://SHOP.myshopify.com/products/PRODUCTID.
For a blog post, there are two ids, id of the blog post, and id of the blog. How do I get the permalink to the blog post?
I tried https://SHOP.myshopify.com/articles/BLOGPOSTID, but it did not work.
Not sure what you mean by permalink. When you access a product, if you want a longer-term solid reference to it, I think the handle serves as a better "permalink" than the ID. The handle is used for search engines and the site map. IDs are more for an administrative view of things, and note that an ID can change if you were to accidentally delete the product and recreate it. Happens all the time, I bet. But the handle, that stays.
As for referencing blog articles, yes. They remain a bit tougher than products, since they do have that extra reference ID in the path. The reference of blogs/name_of_the_blog/ID_article_handle is awkward for sure. Shopify still keeps the article ID in there because of some really longstanding old code that no one has seen a real reason to fix.
It used to be that a lot of pseudo-SEO-smart people dissed the whole Shopify URL scheme as unworkable for SEO, but I think in the end their objections proved to be a hefty lot of nothing to see here, move along.

What's the optimal way to filter a set of entities in a lookup?

I've got a lookup field on the Account entity called Something. Each such Something has a reference to an account. When my users click the magnifying glass, I want them to see a list of available Something records, but filtered to show only those instances that link to the entity currently being edited.
Also, I'll need to design such a filtration for Contact instances to only show the Something records that are related to the account that the currently regarded contact is a member of.
I can't decide between a plugin on Retrieve and some JS in OnLoad registering a fetchXML. All such operations will be done client-side. The solution needs only to work in CRM13 (and if possible apply some cool functionality in that version).
Suggestions?
JavaScript & FetchXml are your best option here, as with a Retrieve plugin you're taking the performance hit of executing on every retrieve, regardless of whether the entity is being retrieved for the lookup. A filtered lookup in JS only applies for those scenarios that require a change to the field on Account.
Another good reason for using a filtered lookup in JS is that they are now a supported feature in CRM 2013, as opposed to the "hack" that was required in 2011.
Some more info on addPreSearch and addCustomFilter can be found on MSDN and there's a decent blog post providing examples here.

Can a client ever reliably generate a PK for an object being written to a DB?

I've been fussing with this dilemma for a while now and I thought I'd approach SO.
A bit of background on my scenario:
I have Playlists which contain 0 or more PlaylistItem children.
Playlists are created infrequently. As such, I am OK with requesting a GUID from the server, waiting for a response and, on success, refreshing the UI to show the successfully added Playlist.
PlaylistItem objects are created frequently. As such, I am NOT OK with a loading message while I wait for the server to respond with a UUID.
This last fact is an optimization, I know, but I think it greatly improves the usability of the program.
Nevertheless, I would like to discuss my options for uniquely identifying my object client-side. I'll first highlight the two options I have tried and failed with, followed by a third option I am considering. I would love some insight into other possible solutions.
Generating a PK UUID client-side which will be persisted to the server.
This was my first choice. It was an obvious decision, but has some clear shortcomings. The first issue here is that client-side UUIDs can't and shouldn't be trusted for this sort of purpose. A malicious user can force PK collisions with ease. Furthermore, my understanding is that I should expect a greater chance of collisions if I choose to generate UUIDs client-side. Scratching that.
Generate a composite PK based on Playlist GUID and position in Playlist
I thought that this was a tricky, but great solution to my issue. A PlaylistItem's position is unique to a given Playlist collection and it is derivable both client-side and server-side. This seemed like a great fix. Unfortunately, having my position be part of the PK breaks the immutability of my PK. Whenever a PlaylistItem is reordered or deleted -- a large amount of PlaylistItem keys would need to be updated. Scratching that.
Generating a composite PK based on Playlist GUID and an auto-increment PlaylistItem ID
This solution is similar to the one above, but ensures that the PK is immutable by separating the composite key from the position. This is the current solution I am toying with. My only concern is that a malicious user could force collisions by modifying the auto-incremented id of the client before sending along. I don't think that this sort of malicious act would cause any harm to the system, but something to consider.
Okay! There you have it. Am I being stupid for doing all of this? Do I just suck it up and force my server to generate the GUIDs for my PlaylistItem objects? Or, is it possible to write a proper implementation?
UPDATE:
I am hoping to represent the user's action visually before the server has successfully saved to the database, and to implement the needed recovery techniques if the save fails. I am unsure if this is foolhardy, but I will explain my reasoning through a use case scenario:
The client would like to add a new PlaylistItem. To do so, a request to YouTube's API is made for all the necessary information to create a PlaylistItem. The client has all necessary information to create a PlaylistItem after YouTube's API has responded, except for the ability to uniquely identify it.
At this point, the user has already waited X timeframe for YouTube's API. Now, I would like to visually show the PlaylistItem on the client. If I opt to wait for the server, I am now waiting X + Y timeframe before there is a visual indication of success. In testing, this delay felt awkward.
My server is just a micro instance on Amazon's EC2. I could reduce Y timeframe by upgrading hardware, but I could eliminate Y completely with clever programming. This is the dilemma I am facing.
Okay, as you seemed to like it when I suggested it in a comment :)
You could use a high/low approach, which basically allows a client to reserve a bunch of keys at a time. The simplest way would probably be to make it a composite primary key, consisting of two integers. The client would have one call along the lines of "give me a major key". You'd autoincrement the "next major key" sequence, and record which client "owns" that major key. That client can then use any minor key alongside that major key, and know that they'll be isolated from any other clients.
When the client performs an insert, you can check that the client is using the right major key, i.e. one assigned to them.
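To illustrate, here is a small in-memory sketch of that high/low scheme in Python. The KeyServer and Client classes, their method names and the dictionary standing in for the database are all hypothetical; a real server would persist the major-key ownership and the rows.

```python
import itertools

class KeyServer:
    """Server side: hands out major keys and remembers who owns each one."""

    def __init__(self):
        self._next_major = itertools.count(1)
        self._owner = {}  # major key -> client id
        self._rows = {}   # (major, minor) -> payload, stand-in for a table

    def reserve_major(self, client_id):
        """The "give me a major key" call: allocate and record ownership."""
        major = next(self._next_major)
        self._owner[major] = client_id
        return major

    def insert(self, client_id, key, payload):
        """Accept a row only if the client owns the major part of the key."""
        major, _minor = key
        if self._owner.get(major) != client_id:
            raise PermissionError("major key not assigned to this client")
        if key in self._rows:
            raise KeyError("duplicate primary key")
        self._rows[key] = payload

class Client:
    """Client side: one round trip up front, then keys are minted locally."""

    def __init__(self, client_id, server):
        self.client_id = client_id
        self._major = server.reserve_major(client_id)
        self._minor = itertools.count(1)

    def new_key(self):
        """Return a fresh (major, minor) key without contacting the server."""
        return (self._major, next(self._minor))

server = KeyServer()
alice = Client("alice", server)
bob = Client("bob", server)
key_a = alice.new_key()
key_b = bob.new_key()
server.insert("alice", key_a, "first item")
```

A client that tampers with its minor keys can only collide with its own rows, because the major key check keeps it out of everyone else's key space.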
Of course, an alternative way of approaching this would be to just make the primary key { client ID, UUID } and let the client just specify any UUID...
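A minimal sketch of that { client ID, UUID } key, using Python's built-in sqlite3 and uuid modules. The PlaylistItem columns and the save_item helper are assumptions; the important assumption is that client_id comes from the authenticated session on the server, never from the request body.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE PlaylistItem (
    client_id TEXT NOT NULL,   -- set by the server from the session
    item_id   TEXT NOT NULL,   -- UUID chosen by the client
    title     TEXT NOT NULL,
    PRIMARY KEY (client_id, item_id)
)
""")

def save_item(session_client_id, item_id, title):
    """Insert an item; return False if this client already used the UUID."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO PlaylistItem VALUES (?, ?, ?)",
                (session_client_id, item_id, title),
            )
        return True
    except sqlite3.IntegrityError:
        return False

item_id = str(uuid.uuid4())  # generated client-side, before any round trip
first = save_item("client-a", item_id, "song")
replay = save_item("client-a", item_id, "song again")
other = save_item("client-b", item_id, "other client's song")
```

Here a forged or repeated UUID can only clash with rows belonging to the same client, so the UI can show the new item immediately and roll it back if the save is rejected.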