Get short URL from long URL - url-shortener

I was just reading the question "How to code a URL shortener?", where the top answer focuses on getting an auto-incremented ID for the long URL. It then has one function that creates a short URL from the ID and another function that goes back from the short URL to the ID.
However, would this not mean that if you input the same long URL again to generate a short URL it would auto-increment to a new ID and therefore create a different short URL?
If you wanted to make sure it returned the same short URL, I think this means that the database, which currently stores one big hash table mapping 'id' to 'long url', needs a second index hash table mapping 'long url' to 'id'?
Is there a more efficient way of doing this rather than having to double the memory storage?

I have an implementation of this: I generate a UUID for each URL and store it as the key, with the actual URL as the value.
To make things clearer, here is the GitHub link.
In your front-end
<b>{{ **Key** }}</b>
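The scheme described in the question (an auto-incremented ID converted to a short code and back, plus a reverse lookup so the same long URL always yields the same code) can be sketched as follows. This is a minimal in-memory sketch, not the linked answer's code: the `Shortener` class, the two dictionaries, and the base-62 alphabet order are all assumptions made for illustration. In a real database the reverse lookup would more likely be a unique index on the long URL column, or on a fixed-size hash of it to save space, rather than a second table.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (the order is an arbitrary choice for this sketch)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode(n: int) -> str:
    """Convert a non-negative integer ID into a base-62 short code."""
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def decode(code: str) -> int:
    """Convert a base-62 short code back into the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)
    return n


class Shortener:
    """Hypothetical in-memory stand-in for the database tables."""

    def __init__(self):
        self.url_by_id = {}   # id -> long URL (the original table)
        self.id_by_url = {}   # long URL -> id (the extra reverse index)
        self.next_id = 1      # stands in for the auto-increment column

    def shorten(self, long_url: str) -> str:
        url_id = self.id_by_url.get(long_url)
        if url_id is None:
            # First time we see this URL: allocate a new ID.
            url_id = self.next_id
            self.next_id += 1
            self.url_by_id[url_id] = long_url
            self.id_by_url[long_url] = url_id
        return encode(url_id)

    def expand(self, code: str) -> str:
        return self.url_by_id[decode(code)]
```

Calling `shorten` twice with the same long URL returns the same code, because the second call finds the URL in the reverse index instead of allocating a new ID.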

Related

Using slug instead of ID in URLs and getting objects hashing the slug

I don't know if this question sounds too stupid, but I appreciate any answer to it.
I use a URL design that looks like Stack Overflow's, e.g.:
example.com/object/{ID}/{SLUG}
Here ID is just an integer corresponding to a serial key in my database (Postgres); the slug doesn't actually do anything special.
What if I want to use a more modern URL design like Quora e.g.:
example.com/object/{SLUG}
And in my backend software (Django in my case) I hash the slug using some lightweight hash function (16 or 32-bit for example) and get the corresponding object using an additional column that contains the hash (in that case it will just be another integer column)?
The slugs are guaranteed to be unique, and it is almost impossible to expand beyond 2^32 objects.
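One thing worth checking in this design: unique slugs do not guarantee unique hashes. Two different slugs can share the same 32-bit value, and by the birthday bound a collision becomes likely once the table holds some tens of thousands of rows, long before 2^32 objects. A lookup by hash therefore still has to compare the stored slug. The sketch below illustrates that idea with `zlib.crc32` standing in for the "lightweight hash function"; the `ObjectStore` class and its dictionary are hypothetical stand-ins for the table and its hash column, not Django code.

```python
import zlib


def slug_hash(slug: str) -> int:
    """32-bit hash of the slug. In Python 3, zlib.crc32 returns an unsigned
    value in [0, 2**32), which does not fit a signed 32-bit integer column
    without conversion."""
    return zlib.crc32(slug.encode("utf-8"))


class ObjectStore:
    """Hypothetical stand-in for a table with an indexed hash column."""

    def __init__(self):
        # hash -> list of rows, because several slugs may share one hash
        self.rows_by_hash = {}

    def add(self, slug: str, payload) -> None:
        row = {"slug": slug, "payload": payload}
        self.rows_by_hash.setdefault(slug_hash(slug), []).append(row)

    def get(self, slug: str):
        # Narrow down by hash first, then confirm with the real slug.
        for row in self.rows_by_hash.get(slug_hash(slug), []):
            if row["slug"] == slug:
                return row["payload"]
        return None
```

The same two-step lookup applies with a 16-bit hash, only with far more collisions per bucket.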

REST based URL for complex resource identifier

I am trying to construct a URL for a REST API that needs to use a complex resource identifier
e.g. Get specific Course
GET /Courses/{id}
where {id} = {TermId}/{SubjectId}/{SectionID}
Is it acceptable to format it as below, or is there a better way?
/Courses/{TermId}/{SubjectId}/{SectionID}
I would say it's not acceptable, because it confuses the clients that use your API. Basically, a / (slash) indicates a new resource. In this particular case you have a Courses resource containing a resource identified by TermId, which in turn contains SubjectId, and so on. This is not readable and not what the client expects. I see two possible solutions here:
1. Use a combined key, separated with - or another URI-safe character:
GET /Courses/{TermId}-{SubjectId}-{SectionID}
Just parse such a key on the server side.
2. Use a different URI:
GET /Courses/{courseId}/Terms/{termId}/subjects/{subjectId}/sections/{sectionId}
There are other options as well, but the one you suggested doesn't seem usable.
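Parsing a combined key on the server side could look like the sketch below. It assumes that none of the three components can itself contain the separator; the `CourseKey` type, the `parse_course_key` function, and the sample value are hypothetical names made up for illustration.

```python
from typing import NamedTuple


class CourseKey(NamedTuple):
    term_id: str
    subject_id: str
    section_id: str


def parse_course_key(raw: str, sep: str = "-") -> CourseKey:
    """Split '{TermId}-{SubjectId}-{SectionID}' into its three parts.

    Assumes the individual IDs never contain the separator themselves.
    """
    parts = raw.split(sep)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected TermId{sep}SubjectId{sep}SectionID, got {raw!r}")
    return CourseKey(*parts)


# Example: the path segment taken from GET /Courses/2024F-MATH101-02
key = parse_course_key("2024F-MATH101-02")
print(key.term_id, key.subject_id, key.section_id)
```

If the IDs can contain hyphens, a different separator (or a surrogate key) would be needed.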
As I see it, you have two reasonable options:
1. Use a compound key, as @Opal said
2. Use a surrogate key (an arbitrary key with no relation to your three unique constraints)
The advantage to (1) is that the URI is human-hackable - assuming that the user remembers the order to put the values in and what valid values can be. If a significant use case is going to be students using these URIs to find courses online they might like to skip the search step if they have all the relevant information and just punch those values into the URI. If your response type is HTML, this is not unreasonable.
The advantage to (2) is that it's not human-hackable - REST is about discovery through hypermedia. If the response type is JSON or XML, humans aren't going to be using these URIs directly.
I would suggest supporting the following endpoints:
GET /courses?termId={}&subjectId={}&sectionId={}
// all three parameters are optional. returns all courses that match the
// specified criteria - either a subset of the data or the full course
// data for each result
GET /courses/{courseId}
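A minimal sketch of the filtering behind the first endpoint, where every parameter is optional and omitted ones are ignored. The `find_courses` function and the sample records are hypothetical; a real handler would translate the same criteria into a database query.

```python
def find_courses(courses, term_id=None, subject_id=None, section_id=None):
    """Return the courses matching every criterion that was supplied.

    Parameters left as None are treated as "not specified".
    """
    criteria = {"termId": term_id, "subjectId": subject_id, "sectionId": section_id}
    active = {field: value for field, value in criteria.items() if value is not None}
    return [c for c in courses if all(c[f] == v for f, v in active.items())]


# Hypothetical sample data; courseId is the surrogate key.
COURSES = [
    {"courseId": 1, "termId": "2024F", "subjectId": "MATH101", "sectionId": "01"},
    {"courseId": 2, "termId": "2024F", "subjectId": "MATH101", "sectionId": "02"},
    {"courseId": 3, "termId": "2025S", "subjectId": "PHYS201", "sectionId": "01"},
]

# GET /courses?termId=2024F&subjectId=MATH101
print(find_courses(COURSES, term_id="2024F", subject_id="MATH101"))
```

With all three parameters supplied, the result contains at most one course, whose `courseId` can then be used with the second endpoint.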

RESTful API - URI Structure Advice

I have REST API URL structure similar to:
URL                 Method  Description
/api/contacts       GET     Returns an array of contacts
/api/contacts/:id   GET     Returns the contact with id of :id
/api/contacts       POST    Adds a new contact and returns it with an id added
/api/contacts/:id   PUT     Updates the contact with id of :id
/api/contacts/:id   PATCH   Partially updates the contact with id of :id
/api/contacts/:id   DELETE  Deletes the contact with id of :id
My question is about:
/api/contacts/:id GET
Suppose that in addition to fetching the contact by ID, I also want to fetch it by a unique alias.
What should the URI structure be if I want to be able to fetch a contact by either ID or alias?
If your aliases are not numeric, I would suggest using the same URI structure and figuring out on your end whether it's an ID or an alias, just like Facebook does with username and user_id: facebook.com/user_id or facebook.com/username.
Another approach would be to have the client use GET /contacts with some extra GET parameters as filters to first search for a contact, and then look up the ID from that response.
The last option, I think, would be to use a structure like GET /contacts/alias/:alias. But this would kind of imply that alias is a subresource of contacts.
The path and query part of IRIs are up to you. The path is for hierarchical data, like api/version/module/collection/item/property, the query is for non-hierarchical data, like ?display-fields="id,name,etc..." or ?search="brown teddy bear"&offset=125&count=25, etc...
What you have to keep in mind is that you are working with resources and not operations. So the IRIs are resource identifiers, like DELETE /something, and not operation identifiers, like POST /something/delete. You don't have to follow any particular structure for IRIs, so for example you could simply use POST /dashuif328rgfiwa. The server would understand it, but it would be much harder to write a router for this kind of IRI, which is why we use nice IRIs.
What is important is that a single IRI always belongs to only a single resource. So you cannot read cat properties with GET /cats/123 and write dog properties with PUT /cats/123. What people usually don't understand is that a single resource can have multiple IRIs, so for example /cats/123, /cats/name:kitty, /users/123/cats/kitty, cats/123?fields="id,name", etc... can belong to the same resource. Or if you want to give an IRI to a thing (the living cat, not the document which describes it), then you can use /cats/123#thing or /users/123#kitty, etc... You usually do that in RDF documents.
"What should be URI structure be if I want to be able to fetch contact by either ID or Alias?"
It can be /api/contacts/name:{name}, for example /api/contacts/name:John, since it is clearly hierarchical. Or you can check whether the param in /api/contacts/{param} is numeric or a string.
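Both variants can be handled by one small function in the router. The sketch below is only an illustration: the `parse_contact_ref` name and the `("id", ...)` / `("name", ...)` return shape are assumptions, and the numeric check only works if names or aliases are never purely numeric.

```python
import re


def parse_contact_ref(param: str):
    """Classify the last path segment of /api/contacts/{param}.

    Returns ("name", value) for an explicit 'name:' prefix,
    ("id", int) for an all-digit segment, and ("name", value) otherwise.
    """
    prefix = "name:"
    if param.startswith(prefix):
        return ("name", param[len(prefix):])
    if re.fullmatch(r"[0-9]+", param):
        return ("id", int(param))
    return ("name", param)


print(parse_contact_ref("123"))        # /api/contacts/123
print(parse_contact_ref("name:John"))  # /api/contacts/name:John
print(parse_contact_ref("John"))       # /api/contacts/John
```

The explicit prefix keeps working even for a name that happens to consist of digits, which the plain numeric check cannot distinguish from an ID.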
You can use the query too, but I don't recommend that. For example, the following IRI can have two separate meanings: /api/contacts?name="John". Either you want to list every contact named John, or you want one exact contact. So you have to establish some conventions for this kind of request in the router of your server-side application.
I would consider adding a "search" resource when you are trying to resolve a resource with the alias:
GET /api/contacts/:id
and
GET /api/contacts?alias=:alias
or
GET /api/contacts/search?q=:alias
First of all, the 'ID' in the URL doesn't have to be a numerical ID generated by your database. You could use any piece of data (including the alias) in the URL, as long as it's unique. Of course, if you are using numerical IDs everywhere, it is more consistent to do the same in your contacts API. But you could choose to use the aliases instead of numeric IDs (as long as they are always unique).
Another approach would be, as Stromgren suggested, to allow both numeric IDs and aliases in the URL:
/api/contacts/123
/api/contacts/foobar
But this can obviously cause problems if aliases can be numeric, because then you wouldn't have any way to differentiate between an ID and a (numeric) alias.
Last but not least, you can implement a way of filtering the complete collection, as shlomi33 already suggested. I wouldn't introduce a search resource, as that isn't really RESTful, so I'd go for the other solution instead:
/api/contacts?alias=foobar
Which should return all contacts with foobar as alias. Since the alias should be unique, this will return 1 or 0 results.
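Filtering the collection this way might look like the following sketch. The `list_contacts` function and the sample contacts are hypothetical and only illustrate that the endpoint still returns a list, which is empty or has one element when aliases are unique.

```python
# Hypothetical sample data standing in for the contacts table.
CONTACTS = [
    {"id": 123, "alias": "foobar", "name": "Foo Bar"},
    {"id": 124, "alias": "jdoe", "name": "John Doe"},
]


def list_contacts(contacts, alias=None):
    """Handler logic for GET /api/contacts and GET /api/contacts?alias=..."""
    if alias is None:
        return list(contacts)  # no filter: the whole collection
    return [c for c in contacts if c["alias"] == alias]


print(list_contacts(CONTACTS, alias="foobar"))   # one-element list
print(list_contacts(CONTACTS, alias="nobody"))   # empty list
```

The client reads the `id` from the single result and can use /api/contacts/:id from then on.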

Query string parameter names for SEO

If I have a website like:
google.com/index.html?c=123123&p=shoes
Will it be better for SEO to have it as:
google.com/index.html?code=123123&footwear=shoes
I mean, does giving useful names to query string parameters help SEO?
Yes, the query string could help Google understand the meaning of the page.
What is important with a query string is that you display unique content when the value of a parameter changes.
Example:
google.com/index.html?code=123123&footwear=shoes
google.com/index.html?code=123123&footwear=shoesB
If you display the same content in both cases, you can run into duplicate-content issues.
(You can also use a canonical URL.)
The best option would be to rewrite the query string as a friendly URL, like
google.com/footwear/shoes/name-product-ID
A unique URL for each product.
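As a rough sketch of that rewrite, the function below builds such a path from the query-string version of the URL. The `friendly_path` name, the product name argument, and the exact path layout are assumptions for illustration; only `urlsplit` and `parse_qs` from the standard library are real APIs here.

```python
from urllib.parse import parse_qs, urlsplit


def friendly_path(url: str, product_name: str) -> str:
    """Turn ...?code=<id>&footwear=<type> into /footwear/<type>/<name>-<id>."""
    params = parse_qs(urlsplit(url).query)
    code = params["code"][0]
    category = params["footwear"][0]
    # Lowercase the product name and join its words with hyphens.
    name_slug = "-".join(product_name.lower().split())
    return f"/footwear/{category}/{name_slug}-{code}"


print(friendly_path(
    "https://google.com/index.html?code=123123&footwear=shoes",
    "Red Sneaker",
))
```

Because the product ID is part of the path, two different products never end up sharing one URL.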
Here is a useful resource on duplicate content:
http://www.seomoz.org/learn-seo/duplicate-content
Hope this helps.

Store URL Keywords in database, or derive

If I have a book named "The Harold's Purple Crayon Collectors Set," I want the website URL to look like this:
www.site.com/book/harolds-purple-crayon/4324
I will need to write code to strip out things like noise words, special characters, words less than x chars long, limiting the final result to y words, etc, but after that code is written, what do I do with it?
Do I run each title through the code every time the URL is needed on my site, or instead, use the code to loop through all my titles and dump the results into a database and pull them from there instead of dynamically building them each time?
The best practice in this case is to save the friendly URLs to the database and retrieve them together with the other information about the book (in this case). Then all you have to do is re-create the URL from the stored string and the ID (as per your example).
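A minimal sketch of that approach: generate the keyword string once when the book is saved, store it alongside the record, and build the URL from the stored string and the ID. The `make_slug` and `book_url` functions, the noise-word list, and the default limits are all assumptions chosen so that the example title produces the URL from the question.

```python
import re

# Hypothetical noise-word list; a real one would be longer.
NOISE_WORDS = {"a", "an", "and", "of", "the"}


def make_slug(title: str, min_len: int = 3, max_words: int = 3) -> str:
    """Strip special characters, noise words and short words, then keep
    at most max_words words joined by hyphens."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", title.lower())
    words = [
        w for w in cleaned.split()
        if w not in NOISE_WORDS and len(w) >= min_len
    ]
    return "-".join(words[:max_words])


def book_url(book: dict) -> str:
    """Re-create the URL from the stored slug and the ID."""
    return f"www.site.com/book/{book['slug']}/{book['id']}"


# The slug is computed once, at save time, and stored with the record.
book = {"id": 4324, "title": "The Harold's Purple Crayon Collectors Set"}
book["slug"] = make_slug(book["title"])

print(book_url(book))  # www.site.com/book/harolds-purple-crayon/4324
```

Storing the slug also means that later changes to the noise-word list or the limits do not silently change URLs that are already published.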