Querying an implicit re-orderable list - sql

I was searching for a way to re-order my records, like blog posts, for instance.
One of the solutions I have found is to self-reference to refer to the previous (or next) value, like in a linked list (https://softwareengineering.stackexchange.com/a/375246). However, this requires the client-side (a web service or perhaps a mobile app) to implement the linked-list travesal logic to derive the order.
Is there a way to do this at the database level?
The reason for this is that if you are deriving the order at the client-side, then if you want to display only the first 10 records, you would have to retrieve all the records anyway.
EDIT
It seems the blog posts example was a very bad example, sorry. I was thinking of blog posts as they are displayed on an admin dashboard, and the user can re-order the position they are displayed by dragging and dropping. Hope this is more clear.
EDIT 2
I guess, generally, what I'm really asking is, how can one implement and query a tree-like structure in SQL

Related

Creating a SOLR index for activity stream or newsfeed

I am trying to index the activity feed of a social portal am building. The portal allows users to follow each other to get updates from the people they follow as an activity feed sorted by date.
For example, user A will be following users B, C, D, E & F. So user A should see all the posts from B, C, D, E & F on his/her activity feed.
Let's assume the post consist of just two fields.
1. The text of the post. (text_field)
2. The name/UID of the user who posted it. (user_field)
Currently, I am creating an index for all the posts and indexing the text_field & user_field. In scale, there can be 1,000,000+ posts. A user may follow 100s if not 1000s of users. What will be the best way to create an index for this scenario?
Should I also index a person followers, so that its quickly looked up and then pass it to a second query for getting the posts of all those users sorted by date?
What is the best way to query the index consisting of all these posts, by passing the UID of all the users that are followed? Considering this may be in 100's or more.
Update:
The motivation for using Solr for the news feed was mainly inspired by this detailed slide and my brief discussion with OpenSocial team.
When starting off with a social portal, Fan out on write seems an overkill and more expensive. However Fan out on read is better. Both the slide and the OpenSocial team suggested using a search backend for Fan out on read. The slide mentioned above also have data on how it helped them.
At present, the feed is going to be flat and only sort criteria will be the date(recency). We won't be considering relevance or posts from more closer groups.
It's kind of abstract, but I will do my best here. Based on what you mentioned, I am not sure if Solr is really the right tool for the job here. You can still have Solr for full text search, but I am not sure about generating a news feed from it in this scenario. Remember that although Solr is pretty impressive, it is a search engine. I will pretend that you will stick with Solr for the rest of the post, keep in mind that we are trying to put a square peg through a round hole here though.
Here are a few additional questions you should think about.
You will probably want to add a timestamp of the post to the data element
You need to figure out how to properly sort the results. Is it in order of recency? Or based on posts that the user is more likely to interact with?
If a user has 1000+ connections, would he want to see an update from every one of them in the main feed? Or should posts from a closer group of friends show up higher?
Here are some comments about your questions:
1) If you index person's followers, it may be hard to keep up. I am assuming followers are going to be changing often and re-indexing in this scenario would not really be practical.
2) That sounds more on par, but again, you need to figure out the sorting. You can get a list of connections for the user, then run a search for top posts from all of them.

What's the optimal way to filter a set of entities in a lookup?

I've got a lookup field on Account entity called something. Each such Something has a reference to an account. When my users click the magnifying glass, I want them to see a list of available Something records but filtered to view only such instances that link to the currently treated entity.
Also, I'll need to design such a filtration for Contact instances to only show the Something records that are related to the account that the currently regarded contact is a member of.
I can't decide between a plugin on Retrieve and some JS in OnLoad registering a fetchXML. All such operations will be done client-side. The solution needs only to work in CRM13 (and if possible apply some cool functionality in that version).
Suggestions?
JavaScript & FetchXml are your best option here as with a Retrieve plugin you're taking the performance hit of executing on every retrieve regardless of whether the entity is being retrieved for the lookup. A filtered lookup in JS only applies for those scenarios that require a change to the field on Account.
Another other good reason for using a filtered lookup in Js is they are now a supported feature in CRM 2013 as opposed to the "hack" that was required in 2011.
Some more info on addPreSearch and addCustomFilter can be found on MSDN and there's a decent blog post providing examples here.

Identify item by either an ID or a slug in a RESTful API

I'm currently designing an API and I came a cross a little problem:
How should a URL of a RESTful API look like when you should be able to identify an item by either an ID or a slug?
I could think of three options:
GET /items/<id>
GET /items/<slug>
This requires that the slug and the ID are distinguishable, which is not necessarily given in this case. I can't think of a clean solution for this problem, except you do something like this:
GET /items/id/<id>
GET /items/slug/<slug>
This would work fine, however this is not the only place I want to identify items by either a slug or an ID and it would soon get very ugly when one wants to implement the same approach for the other actions. It's just not very extendable, which leads us to this approach:
GET /items?id=<id>
GET /items?slug=<slug>
This seems to be a good solution, but I don't know if it is what one would expect and thus it could lead to frustrating errors due to incorrect use. Also, it's not so easy - or let's say clean - to implement the routing for this one. However, it would be easily extendable and would look very similar to the method for getting multiple items:
GET /items?ids=<id:1>,<id:2>,<id:3>
GET /items?slugs=<slug:1>,<slug:2>,<slug:3>
But this has also a downside: What if someone wants to identify some of the items he want to fetch with IDs, but the others with a slug? Mixing these identifiers wouldn't be easy to achieve with this.
What is the best and most widely-accepted solution for these problems?
In general, what matters while designing such an API?
Of the three I prefer the third option, it's not uncommon to see that syntax; e.g. parts of Twitter's API allow that syntax:
https://dev.twitter.com/rest/reference/get/statuses/show/id
A fourth option is a hybrid approach, where you pick one (say, ID) as the typical access method for single items, but also allow queries based on the slug. E.g.:
GET /items/<id>
GET /items?slug=<slug>
GET /items?id=<id>
Your routing will obvious map /items/id to /items?id=
Extensible to multiple ids/slugs, but still meets the REST paradigm of matching URIs to the underlying data model.

The REST-way to check/uncheck like/unlike favorite/unfavorite a resource

Currently I am developing an API and within that API I want the signed in users to be able to like/unlike or favorite/unfavorite two resources.
My "Like" model (it's a Ruby on Rails 3 application) is polymorphic and belongs to two different resources:
/api/v1/resource-a/:id/likes
and
/api/v1/resource-a/:resource_a_id/resource-b/:id/likes
The thing is: I am in doubt what way to choose to make my resources as RESTful as possible. I already tried the next two ways to implement like/unlike structure in my URL's:
Case A: (like/unlike being the member of the "resource")
PUT /api/v1/resource/:id/like maps to Api::V1::ResourceController#like
PUT /api/v1/resource/:id/unlike maps to Api::V1::ResourceController#unlike
and case B: ("likes" is a resource on it's own)
POST /api/v1/resource/:id/likes maps to Api::V1::LikesController#create
DELETE /api/v1/resource/:id/likes maps to Api::V1::LikesController#destroy
In both cases I already have a user session, so I don't have to mention the id of the corresponding "like"-record when deleting/"unliking".
I would like to know how you guys have implemented such cases!
Update April 15th, 2011: With "session" I mean HTTP Basic Authentication header being sent with each request and providing encrypted username:password combination.
I think the fact that you're maintaining application state on the server (user session that contains the user id) is one of the problems here. It's making this a lot more difficult than it needs to be and it's breaking a REST's statelessness constraint.
In Case A, you've given URIs to operations, which again is not RESTful. URIs identify resources and state transitions should be performed using a uniform interface that is common to all resources. I think Case B is a lot better in this respect.
So, with these two things in mind, I'd propose something like:
PUT /api/v1/resource/:id/likes/:userid
DELETE /api/v1/resource/:id/likes/:userid
We also have the added benefit that a user can only register one 'Like' (they can repeat that 'Like' as many times as they like, and since the PUT is idempotent it has the same result no matter how many times it's performed). DELETE is also idempotent, so if an 'Unlike' operation is repeated many times for some reason then the system remains in a consistent state. Of course you can implement POST in this way, but if we use PUT and DELETE we can see that the rules associated with these verbs seem to fit our use-case really well.
I can also imagine another useful request:
GET /api/v1/resource/:id/likes/:userid
That would return details of a 'Like', such as the date it was made or the ordinal (i.e. 'This was the 50th like!').
case B is better, and here have a good sample from GitHub API.
Star a repo
PUT /user/starred/:owner/:repo
Unstar a repo
DELETE /user/starred/:owner/:repo
You are in effect defining a "like" resource, a fact that a user resource likes some other resource in your system. So in REST, you'll need to pick a resource name scheme that uniquely identifies this fact. I'd suggest (using songs as the example):
/like/user/{user-id}/song/{song-id}
Then PUT establishes a liking, and DELETE removes it. GET of course finds out if someone likes a particular song. And you could define GET /like/user/{user-id} to see a list of the songs a particular user likes, and GET /like/song/{song-id} to see a list of the users who like a particular song.
If you assume the user name is established by the existing session, as #joelittlejohn points out, and is not part of the like resource name, then you're violating REST's statelessness constraint and you lose some very important advantages. For instance, a user can only get their own likes, not their friends' likes. Also, it breaks HTTP caching, because one user's likes are indistinguishable from another's.

How to decide whether to split up a VB.Net application and, if so, how to split it up?

I have 2 1/2 years experience of VB.Net, mostly self taught, so please bear with me if I seem rather noobish still and do not know some of the basics. I would recommend you grab a cup of tea before starting on this, as it appears to have got quite long...
I currently have a rather large application (VB.Net website) of over 15000 lines of code at the last count. It does not do retail or anything particularly complex like that - it is literally just a wholesale viewing website with admin frontend, catalogue / catalogue management system and pageview system.
I don't really know much about how .Net applications work in the background - whether they are all loaded on the same thread or if each has its own thread... I just know how to code them, or at least like to think I do... :-)
Basically my application is set up as follows:
There are two different areas - the customer area and the administration frontend.
The main part of the customer frontend is the Catalogue. The MasterPage will load a list of products but that's all, and this is common to all the customer frontend pages.
I tend to work on only one or several parts of the application at a time before uploading the changes. So, for example, I may alter the hierarchy of the Catalogue and change the Catalogue page to match the hierarchy change whilst leaving everything else alone.
The pageview database is getting really quite large and so it is getting rather slow when the application is first requested due to the way it works.
The application timeout is set to 5 minutes - don't know how to change it, I have even tried asking this question on here and seem to remember the solution was quite complex and I was recommended not to change it, but if a customer requests the application 5 minutes after the last page view then it will reload the application from scratch. This means there is a very slow page load whenever it exceeds 5 minutes of inactivity.
I am not sure if this needs consideration to determine how best to split the application up, if at all, but each part of the catalogue system is set up as follows:
A Manager class at the top level, which is used by the admin frontend to add, edit and remove items of the specified type and the customer frontend to retrieve a list of items of the specified type. For example the "RangeManager" will contain a list of product "Ranges" and will be used to interact with these from the customer frontend.
An Item class, for example Range, which contains a list of Attributes. For example Name, Description, Visible, Created, CreatedBy and so on. The form for adding / editing loops through these to display relevant controls for the administrator. For example a Checkbox for BooleanAttribute.
An Attribute class, which can be of type StringAttribute, BooleanAttribute, IntegerAttribute and so on. There are also custom Attributes (not just datatypes) such as RangeAttribute, UserAttribute and so on. These are given a data field which is used to get a piece of data specific to the item it is contained in when it is first requested. Basically the Item is given a DataRow which is stored and accessed by Attributes only when they are first requested.
When one item is requested from a specific manager is requested, the manager will loop through all the items in the database and create a new instance of the item class. For example when a Range is requested from the RangeManager, the RangeManager will loop through all of the DataRows in the Ranges table and create a new instance of Range for each one. As stated above it simply creates a new instance with the DataRow, rather than loading all the data into it there and then. The Attributes themselves fetch the relevant data from the DataRow as and when they're first requested.
It just seems a tad stupid, in my mind, to recompile and upload the entire application every time I fix a minor bug or a spelling mistake for a word which is in the code behind (for example if I set the text of a Label dynamically). A fix / change to the Catalogue page, the way it is now, may mean a customer trying to view the Contact page, which is in no way related to the Catalogue page apart from by having the same MasterPage, cannot do so because the DLL is being uploaded.
Basically my question is, given my current situation, how would people suggest I change the architecture of the application by way of splitting it into multiple applications? I mean would it be just customer / admin, or customer / admin and pageviews, or some other way? Or not at all? Are there any other alternatives which I have not mentioned here? Could web services come in handy here? Like split the catalogue itself into a different application and just have the masterpage for all the other pages use a web service to get the names of the products to list on the left hand side? Am I just way WAY over-complicating things? Judging by the length of this question I probably am, and it wouldn't be the first time... I have tried to keep it short, but I always fail... :-)
Many thanks in advance, and sorry if I have just totally confused you!
Regards,
Richard
15000 LOC is not really all that big.
It sounds like you are not pre-compiling your site for publishing. You may want to read this: http://msdn.microsoft.com/en-us/library/1y1404zt(v=vs.80).aspx
Recompiling and uploading the application is the best way to do it. If all you are changing is your markup, that can be uploaded individually (e.g. changing some html layout in an aspx page).
I don't know what you mean here by application timeout, but if your app domain recycles every 5 minutes, then that doesn't seem right at all. You should look into this.
Also, if you find yourself working on various different parts of the site (i.e. many different changes), but need to deploy only some items in isolation, then you should look into how you are using your source control tools (you are using one, aren't you?). Look into something like GIT and branching/merging.
Start by reading:
Application Architecture Guide