Suppose I have a website with an option to choose a language, and all the content (text) changes with it. Where is it better, in terms of performance, to store this content?
I would build a translation file on the client side, is that right? (Also for sites where the content is huge)
Storing the whole translation list on the client side would be fine as long as you get this content from an API and only two languages are allowed. The client browser can handle that.
However, if you have more than three languages of translated text, I would advise you to fetch the translated content for each language from the API.
From personal experience: we recently developed an app with two language options, with all the texts on the client side, and we face some issues on pages that have more than 25 pages' worth of text. For us it's just one page that users hardly visit, so we don't care; but if this issue affected most pages, I would use the server side.
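If you do go the API route, the server side can stay tiny. A minimal sketch in PHP (the translations/ directory, file layout and language list are all made up):
<?php
// translations.php - serve one language bundle per request.
$allowed = ['en', 'fr', 'es'];
$lang = $_GET['lang'] ?? 'en';
if (!in_array($lang, $allowed, true)) {
    $lang = 'en'; // unknown code: fall back instead of erroring out
}
header('Content-Type: application/json; charset=utf-8');
header('Cache-Control: public, max-age=3600'); // let browsers cache the bundle
readfile(__DIR__ . "/translations/{$lang}.json");
The Cache-Control header means each client downloads a bundle at most once an hour, which keeps this approach cheap even with many languages.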
We run DokuWiki.
We have one page for every server.
We want to mix automated content (like number of CPUs) with content created by human beings by hand and keyboard.
What is an easy and not so "dirty" way to solve this?
Include generated pages and their sections into user-maintained pages, or vice versa. As a benefit, you will be able to forbid user access to the generated pages (namespaces) via ACLs.
Use plugins like data or sqlite to include smaller pieces of information on the page.
It might be enough to have discussions available for generated pages.
I'm maintaining an existing website that wants a site search. I implemented the search using the Yahoo API. The problem is that the API is returning irrelevant results. For example, there is a sidebar with a list of places, and if a user searches for "New York" the top results will be for pages that do not have "New York" in the main content section. I have tried adding Yahoo's class="robots-nocontent" to the sidebar; however, that was two weeks ago and there has been no update.
I also tried out Google's Search API but am having the same problem.
This site has mostly static content and about 50 pages total so it is very small.
How can I implement a simple search that only searches the main content portions of the page?
At the risk of sounding completely self-promoting as well as pushing yet another API on you, I wrote a blog post about implementing Bing for your site using jQuery.
The advantage of the jQuery approach is that you can tune the results quite specifically, based on filters passed to the API and playing around with the JSON (or XML/SOAP, if you prefer) result Bing returns, as well as being more selective about what data you actually have jQuery display.
The other thing you should probably be aware of is how to effectively use rel attributes on your content (especially links) so that search engines are aware of the relationship between the actual content they're crawling and the destination content it links to.
First, post a link to your website... we can probably help you more if we can see the problem.
It sounds like you're doing it wrong. Google Search should work on your website, unless your content is hidden behind JavaScript or forms or something, or your site isn't properly interlinked. Google has solved crawling static pages, so if that's what you have, it will work.
So, tell me... does your site say New York anywhere? If it does, have a look at the page and see how the word is used... maybe your site isn't as static as you think. Also, are people really going to search your site for New York? Why don't you input some search terms that are likely on your site.
Another thing to consider: if your site is really just 50 pages, is it realistic that people will want to search it? Maybe you don't need search; maybe you just need something like a commonly-used-links section.
The BOSS Site Search Widget is pretty slick.
I use the bookmarklet thing, but set it as my "home" page in my browser. So whatever site I'm on, I can hit my "home" button (which I never used anyway) and it pops up that handy site search thing.
A friend of mine told me that the company he works at is redoing the SEO for their large website. Large == both the number of pages and the traffic they get each day.
Currently they have (quote) a "deeply nested site", which I'm assuming means /x/y/z/a/b/c... or something. I also know it's very unRESTful, judging from some of the pages I've seen, e.g. foo.blah?a=1&b=2&c=3......z=24 (yep, lots of crap in the URL).
So updating their SEO sounds like a much needed thing.
But, they are going flat. I mean -> totally flat. eg. /foo-bar-pew-pew-abc-article1
This scares the bollox out of me.
From what he said (if I understood him right), each - character doesn't mean a new hierarchical level.
so /foo-bar-pew-pew-abc-article1 does not mean /foo/bar/pew/pew/abc/article1
A space could be replaced by a -. A + represents a space, but only if the two words are supposed to be one word (whatever that means). I.e. Jean-Luke would be jean+luke, but if I had a subject like 'hello world', it would be listed as hello-world.
Excuse me while i blow my head up.
Is it just me, or is it totally silly to go completely flat? I was under the impression that when SEO people say keep it as flat as possible, they are trying to say keep it to 1 or 2 levels, with 4 as the utter max.
Is it just me, or is a flat hierarchy a 'really really good thing' for SEO... for MEDIUM and LARGE sites (lots of resources, not necessarily lots of hits/page views)?
Well, let's take a step back and look at what SEO is supposed to accomplish; it's meant to help a search engine identify quality, relevant content for users based on key phrases and terms.
Take, for example, the following blog URLs:
* http://blog.example.com/articles/2010/01/20/how-to-improve-seo/
* http://blog.example.com/how-to-improve-seo/
Yes, one is deep and the other is flat; but the URL structure is important for two reasons:
* URL terms and phrases are high-value targets for determining the relevance of a page by a search engine
* A confusing URL may immediately force a user to skip your link in the search results
Let's face it: Google and other search engines can associate even the worst URLs with relevant content.
Take, for example, a search for "sears kenmore white refrigerator" in Google: http://www.google.com/search?q=sears+kenmore+white+refrigerator&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a.
Notice the top hit? The URL is http://www.sears.com/shc/s/p_10153_12605_04665802000P , and yet Google replaces the lousy URL with www.sears.com › Refrigerators › Top Freezers. (Granted, 2 results down is the true URL.)
If your goal for SEO is optimized organic relevance, then I would wholeheartedly recommend generating either key/value pairs in the URL, like www.sears.com/category/refrigerators/company/kenmore (meh), or phrase-like URLs like www.sears.com/kenmore/refrigerators/modelNumber. You want to align your URLs with the user's search terms and phrases to maximize your effort.
In the end, if you offer valuable content and you structure your content and site properly, the search engines will accurately gather it. You just need to help them realize how specific and authoritative your content is. :)
Generally, the less navigation needed to reach content, the better. But with a logical breadcrumb strategy and well-thought-out deep linking, excess directory depth can be managed so that it doesn't hurt SEO or visibility in search.
Remember that Google is trying to return the most relevant link and the best user experience, so if your site has 3 URLs coming up for the same search term and it takes 2 or 3 exits to find the appropriate content, Google will read that as bad and start lowering all of your URLs in the SERPs.
You have to consider how visitors will find your content, not just navigate it. Think content discovery, not just navigation.
HTH
Flat or deeply nested really shouldn't affect the SEO. The key part is how those individual pages are linked to; that will determine how they get ranked. I did write some basic stuff on this years ago (see here), but essentially, if pages are not buried deeply within a site, i.e. it doesn't take several clicks (or links, from Google's perspective) to reach them, then they should rank fairly much the same in either case. Google used to put a lot more weight on keywords in URLs, but this has been scaled back in more recent algorithm changes. It helps to have keywords there, but it's no longer the be-all and end-all.
What you/they will need to consider are the following two important points:
1) How will the URL structure be perceived by the users of the site? Will they be able to easily navigate the site without having to rely on the URL structure in the address bar?
2) In making navigational changes such as this, it's vitally important to set up redirects from the old URLs. Google hates 404s; they should either return 410 (Gone) HTTP responses for pages that are no longer valid, or 301 HTTP responses for permanent redirects (with the new URL).
In making any large changes such as this you can save loads of time getting the site indexed successfully by utilising XML sitemaps and Google's webmaster console.
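For the redirect point, a minimal PHP sketch of both response codes (the URL map is hypothetical; a real site would generate it from the old and new structures):
<?php
// redirects.php - answer moved or removed pages with the right status.
$moved = ['/old/deeply/nested/page' => '/flat-new-page']; // hypothetical map
$gone  = ['/discontinued-page' => true];

$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (isset($moved[$path])) {
    header('Location: ' . $moved[$path], true, 301); // permanent redirect
    exit;
}
if (isset($gone[$path])) {
    http_response_code(410); // tell crawlers the page is gone for good
    exit;
}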
Context:
I'm in the design phase of what I'm hoping will be a big website (lots of traffic, lots of users reading and writing to database).
I want to offer this website in the three languages I speak myself (English, French, and by the time I finish the website, I will hopefully have learned enough Spanish to offer that too)
Dilemma:
I'm wondering how I should go about offering these various languages (and perhaps more in the future).
Criteria:
Many methods exist for designing multi-language websites. I'm looking for the technique that will result in a faster browsing experience for the user.
Choices:
Currently, I can think of (and have read about) the following choices. They are sorted in order of preference up to now.
Store all language-specific strings in a database and fetch the right one depending on preferred language (members can choose which language they prefer), browser default language, and which language is selected during the current session, in that order.
Pros:
Most of the time, a single test at the beginning of the session confirms which language to use for the remainder of the session (stored in a SESSION variable). Otherwise, a user logging in also fetches the right language and keeps it until he/she logs out (no further tests). So the testing part should be pretty fast.
Cons:
I'm afraid that accessing the database all the time would be quite time-consuming (a longer page load for the user), especially considering that lots of users could be accessing the database at the same time for the same reason (getting the website text in the correct language), but also for posting comments and such.
Strings which include variables (e.g. "Hello " + user.name + ", how are you?") are harder to store because the variable (e.g. the user's name) changes for each user.
A direct link to a portal for a specific language would be ugly (e.g. www.site.com?lang=es)
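For illustration, roughly what that detection order could look like in PHP (the member field and the language list are made up):
<?php
// Detection order: member preference, else browser default, else the
// site default; the result is stored in $_SESSION so the test only
// runs once per session.
session_start();

function resolve_language(?array $member): string {
    $supported = ['en', 'fr', 'es'];
    if (isset($_SESSION['lang'])) {
        return $_SESSION['lang']; // already resolved this session
    }
    $lang = 'en'; // site-wide default
    $header = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? ''; // browser default
    foreach (explode(',', $header) as $part) {
        $code = strtolower(substr(trim($part), 0, 2));
        if (in_array($code, $supported, true)) {
            $lang = $code;
            break;
        }
    }
    if ($member !== null && isset($member['preferred_lang'])) {
        $lang = $member['preferred_lang']; // member preference wins
    }
    $_SESSION['lang'] = $lang;
    return $lang;
}

$lang = resolve_language(null); // null = anonymous visitor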
Store all language-specific strings in a text file and fetch the right one depending on preferred language (members can choose which language they prefer), browser default language, and which language is selected during the current session, in that order.
Pros:
Most of the time, a single test at the beginning of the session confirms which language to use for the remainder of the session (stored in a SESSION variable). Otherwise, a user logging in also fetches the right language and keeps it until he/she logs out (no further tests). So the testing part should be pretty fast.
Cons:
I'm afraid that accessing the text file all the time would be quite time-consuming (a longer page load for the user), especially considering that lots of users could be accessing the file at the same time for the same reason (getting the website text in the correct language).
Strings which include variables (e.g. "Hello " + user.name + ", how are you?") are harder to store because the variable (e.g. the user's name) changes for each user.
I don't think multiple users could access the text file concurrently, though I may be wrong. If that's the case though, every user loading a page would have to wait for his/her turn to access the text file.
Fetching the very last string of the text file could take pretty long...
A direct link to a portal for a specific language would be ugly (e.g. www.site.com?lang=es)
Creating multiple versions of the website in separate folders, where each version is in a different language.
Pros:
No extra treatment is needed for handling languages, so no extra waiting time.
Cons:
Maintaining the website will be like going to school: painful, long, and it makes you stupid after doing the same thing over and over again.
Ugly URLs (e.g. www.site.com/es/ instead of www.site.com).
Additionally, the choices above could be combined with one or more of the following techniques:
Caching certain frequently requested pages (in a singleton or static PHP function?). Certain sentences could also be cached for every language.
Pros
Quicker access for frequently-requested pages.
Which pages need caching can be determined dynamically, with time.
Cons
I'm not sure about this one, but would this end up bloating the server's RAM?
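For illustration, a minimal file-based page cache in PHP. Caching to disk rather than RAM sidesteps the bloating concern; the cache/ directory and the 10-minute lifetime are arbitrary:
<?php
// Cache key includes the language so each translation is cached separately.
$lang      = $_GET['lang'] ?? 'en'; // however the language was resolved
$cacheKey  = md5($_SERVER['REQUEST_URI'] . '|' . $lang);
$cacheFile = __DIR__ . "/cache/{$cacheKey}.html";

if (is_file($cacheFile) && filemtime($cacheFile) > time() - 600) {
    readfile($cacheFile); // cache hit: skip all rendering
    exit;
}
ob_start(); // cache miss: capture the rendered page
// ... render the page normally here ...
file_put_contents($cacheFile, ob_get_contents(), LOCK_EX);
ob_end_flush(); // send the fresh page to the client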
Rewriting the URL could be used for many things.
A user looking for direct access to one language could do so using www.site.com/fr/somefile and would be redirected to www.site.com/somefile, but with the selected language being stored in a session variable.
Pros
Search engines like this because they have two different pages to show for two different languages
Cons
Bookmarking a page doesn't mean you'll end up with the right language when you come back, unless I put the language information in the URL (www.site.com/somefile?lang=fr).
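A sketch of that rewrite in PHP, assuming an Apache RewriteRule (shown as a comment) maps the language prefix onto query parameters:
<?php
// Assumes a rewrite rule along the lines of:
//   RewriteRule ^(en|fr|es)/(.*)$ index.php?lang=$1&path=$2 [L,QSA]
// so a request for /fr/somefile arrives here with lang=fr, path=somefile.
session_start();

$supported = ['en', 'fr', 'es'];
if (isset($_GET['lang']) && in_array($_GET['lang'], $supported, true)) {
    $_SESSION['lang'] = $_GET['lang']; // remember the choice for this session
}
$path = $_GET['path'] ?? 'index';
// ... dispatch to the page identified by $path ...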
A little more info
I usually use the following technologies to make a website:
PHP
SQL
XHTML
CSS
Javascript (and AJAX)
This being said, if a solution requires that I learn a new language or something, I'm very open to doing so. I have no deadline for this project and I do intend to learn a lot from doing it!
Conclusion:
What I'm looking for is a method that allows me to offer multiple languages while not increasing page load time, and without going crazy trying to maintain the website. If you guys/gals have other ideas I should consider, I will try adding them to my list. Another possibility is that I'm overdoing this. Maybe I won't gain enough time with these methods for it all to be worth it; I just don't know how to verify whether I need to worry about this or not, so if you have any ideas for that, it would also help me.
Whether you use a database or a filesystem to store the translations, you should be loading the text all at once and then serving it from memory. Most applications will typically not have so much text that this becomes a problem. In Java or .Net this could be accomplished by storing the text in a singleton or static object. Then all the strings are in RAM and do not need to be loaded or parsed. If your platform does not have a convenient way to store data in ram, you could run a separate caching application such as memcached.
The rest of your concerns can be mitigated by hiding the details. Build or find a framework that lets you load your translations and then look them up by some key. If you decide to switch to files or a database later, the rest of your code is unaffected. In the short term do whichever is easier for you. I've found that it's best to have a mix: it's easier to manage application text along with the source code in a version control system. But some text changes often, or needs to change without requiring a build+deployment cycle, and that text should be in the DB.
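Since the question's stack is PHP, a rough equivalent of the singleton idea there is a static lookup class, with something like APCu as the cross-request cache (the lang/ file layout is made up):
<?php
// Load each language bundle once, keep it in a static array, and look
// strings up by key. Assumes files like lang/fr.php returning an array
// such as ['greeting' => 'Bonjour'].
final class I18n
{
    private static $bundles = [];

    public static function t(string $key, string $lang): string
    {
        if (!isset(self::$bundles[$lang])) {
            // APCu keeps the parsed bundle across requests, if available.
            $bundle = function_exists('apcu_fetch') ? apcu_fetch("i18n_$lang") : false;
            if ($bundle === false) {
                $bundle = require __DIR__ . "/lang/{$lang}.php";
                if (function_exists('apcu_store')) {
                    apcu_store("i18n_$lang", $bundle);
                }
            }
            self::$bundles[$lang] = $bundle;
        }
        return self::$bundles[$lang][$key] ?? $key; // fall back to the key
    }
}

echo I18n::t('greeting', 'fr');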
Finally, don't build strings with substitutions in them. Use some kind of format string, because otherwise your translators will go crazy trying to translate sentence fragments.
(Warning: Java code sample)
//WRONG
String msg = "Hello, " + username + ", welcome back.";
//RIGHT
String fmt = "Hello, %s, welcome back."; // in real code: load this string from a file or the db
String msg = String.format(fmt, username);
Another person mentioned encoding the language in the URL. This is the preferred way to do it if you care what a search engine thinks of your site. Google recommends using different hostnames or a different subdirectory. This means that the language headers sent by the user can't be used for anything, except perhaps initially sending them to one landing page or another. You will need to determine the language for each request based on the incoming URL (this actually simplifies your code a lot later on). In Java I'd store the language code in the Request and just grab it whenever I need it.
The easiest way to handle language codes in the URL is to use re-writing. A client sends a request for www.yoursite.com/de/somepage, and internally you re-write the request to www.yoursite.com/somepage and store the language identifier somewhere. In Java each request has an HttpServletRequest object where you can store attributes for the lifecycle of the request. If your framework doesn't have anything like that, you can just add a parameter to the URL: www.yoursite.com/de/somepage => www.yoursite.com/somepage?lang=de.
If you are using hostname-based languages you can use hostnames such as de.yoursite.com or www.yoursite.de. There are pros and cons to this approach. For one thing, using country-code TLDs means registering new TLDs and trying to figure out whether a country code is appropriate to represent a language (it's often not). Using different hostnames/domains means you have to consider under what domains cookies are stored. If you want a cookie-free subdomain you need to plan this carefully. But from the coding side, a language-based hostname doesn't need any additional re-writing; you can read the hostname (it's the Host header in the HTTP request) and parse it to determine the language.
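In PHP terms (the asker's stack), reading the Host header looks roughly like this; the subdomain list is made up:
<?php
// Hostname-based detection: de.yoursite.com -> "de". The www subdomain
// or a bare domain falls back to English.
$host = $_SERVER['HTTP_HOST'] ?? 'www.yoursite.com';
$sub  = strtolower(explode('.', $host)[0]);
$lang = in_array($sub, ['de', 'fr', 'es'], true) ? $sub : 'en';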
Offer the initial page in a language depending on the Accept-Language HTTP header.
Let the user set the language in the current session and, if they're authenticated, in their user profile.
In your code and templates, mark strings as "translatable." You should have tools that gather all the strings from your codebase and let your translators translate them.
Have a layer which loads the translations from the database, either individually or as a bundle, and applies them to the page being loaded. Cache these parts to make them fast: every page load shouldn't make a hundred database calls, one for every translatable string.
Check out how Django does it; it should be enlightening.
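PHP's native gettext bindings follow the same pattern: strings are marked with _(), the xgettext tool gathers them into .po files for translators, and msgfmt compiles those into binary catalogs. A minimal sketch, assuming the gettext extension is installed and locales/fr_FR/LC_MESSAGES/messages.mo exists:
<?php
// Point gettext at the compiled catalogs and pick a locale.
$locale = 'fr_FR.utf8';
putenv("LC_ALL=$locale");
setlocale(LC_ALL, $locale);
bindtextdomain('messages', __DIR__ . '/locales');
textdomain('messages');

echo _('Welcome back'); // prints the French translation if one exists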
"I'm afraid that accessing [the database/text file] all the time would be quite time-consuming"
It would be, but that's why you'd likely be using caching to some extent. Nearly all large sites are accessing data stored outside the HTML page itself and, as such, utilize caching techniques as needed.
Your question regarding speed really is irrelevant to having multiple languages. It's an issue of storing data (content) so it's easy to maintain and present to the user. Whether it's one language or 10 the problem is the same.
Create the most generic form of the site that you can. Import the translations from a database, with fallback, i.e. an ordered list of languages: if a translation does not exist, use the next best language (for German: German, Dutch, English, etc.).
You would solve performance issues by keeping caches of the dynamically created pages. [Check the dependent data and update if necessary]
The preferred language of a user is passed along in the HTTP request headers. Having a "select language" control plus a query string would often be unnecessary.
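A sketch of reading that header in PHP, respecting the q-values browsers send and then falling back down an ordered list (the available-languages list is made up):
<?php
// Parse Accept-Language, e.g. "fr-CA,fr;q=0.9,en;q=0.8", and return the
// best match from the languages the site actually has.
function preferred_language(array $available, string $default = 'en'): string {
    $header = $_SERVER['HTTP_ACCEPT_LANGUAGE'] ?? '';
    $prefs = [];
    foreach (explode(',', $header) as $part) {
        if (preg_match('/^\s*([a-zA-Z-]+)\s*(?:;\s*q=([0-9.]+))?/', $part, $m)) {
            $prefs[strtolower(substr($m[1], 0, 2))] = isset($m[2]) ? (float) $m[2] : 1.0;
        }
    }
    arsort($prefs); // highest q-value first
    foreach (array_keys($prefs) as $code) {
        if (in_array($code, $available, true)) {
            return $code;
        }
    }
    return $default; // nothing matched the fallback order
}

echo preferred_language(['en', 'fr', 'es']); // e.g. "fr" for a French browser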
Resource files would be one way to go. They are easier to send to translators. However, they can be difficult to reuse among multiple websites.
Databases are convenient because a database is the first thing that should be backed up on a website. It also has the benefit of being fast. However, if you have an extremely database-focused project, you may not want to add additional strain on your database.
For my solutions I want this:
The language should be indicated in the URL; it works better with Google indexing the page and people following the links in Google's search results.
As many pre-generated translations as possible, for faster page-serving.
The first is quite easily done by having a URL like http://example.com/fr/and-so-on. URL rewriting can turn that into http://example.com/and-so-on?lang=fr, which is potentially easier to handle.
For pre-generating translations, it is good to use an HTML template framework, so you can generate translated templates from one set of source templates. A blunt approach is to generate a sed script from a language key-value file, and run that sed script on each template to get a translated version.
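In the same blunt spirit, but in PHP rather than sed: a one-off build script that stamps out a translated copy of each template from per-language key-value files (all paths and the {{placeholder}} syntax are made up):
<?php
// Pre-generate translated templates offline so page-serving stays fast.
foreach (['en', 'fr'] as $lang) {
    $strings = require __DIR__ . "/lang/{$lang}.php"; // ['greeting' => 'Bonjour', ...]
    foreach (glob(__DIR__ . '/templates/*.html') as $template) {
        $html = file_get_contents($template);
        foreach ($strings as $key => $text) {
            $html = str_replace('{{' . $key . '}}', $text, $html);
        }
        $dir = __DIR__ . "/generated/{$lang}";
        if (!is_dir($dir)) {
            mkdir($dir, 0775, true);
        }
        file_put_contents($dir . '/' . basename($template), $html);
    }
}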
What remains then is to translate the dynamically generated parts of the pages. There are a few tools for that: Java has resource bundles, and GNU gettext is quite a nice tool.
I am working on a website in DNN. I want to change the language of the website or of a particular page. So I downloaded the language packages for Spanish (es-ES) and Chinese (zh-CN) and installed them from Host. But when I changed the language of the browser, the website language didn't change. I am working on DNN 5.0.
Please let me know how I can use language packages on a DNN website.
For initial translations and maintenance of DotNetNuke translations, I recommend the use of OmegaT. It handles resx files directly. And content (such as HTML or Blogs) can be downloaded, translated and then uploaded thanks to the APIs of DNN (drop me a note if you need the scripts).
OmegaT stores the translations in its translation memory (a TMX file, which is actually a kind of XML). It also uses Google Translate and similar services, and has a fast user interface, which increases translation speed a lot compared to continuously waiting for DotNetNuke to handle your updated resources.
More info on OmegaT. An example of a translated site and modules: site translated from Dutch into English
You should probably ask this in the DotNetNuke forums: http://www.dotnetnuke.com/tabid/795/default.aspx.
There's one dedicated forum for questions about language packs and localization. You will probably find your answer there: http://www.dotnetnuke.com/Community/Forums/tabid/795/forumid/77/scope/threads/Default.aspx
The language packs don't always have translations for everything on the site, especially content that you added yourself. You'll need to do two things to get them working properly:
Go to Admin > Languages, and enable the languages you want to use.
Open the Language Editor and start translating. Under each resource name, you will see an edit text box for the localized value, and a read-only text box for the default value. In most cases, you'll need to translate verbatim what you see under "default value".
We had to write our own menu provider to get the menu to do this. Instead of going for the resource files, we went for a database solution (other reasons applied to this as well), and we also built an interface for doing it. As for things like the Text/HTML module, there are some third-party builds that allow you to localize content. Apollo Software comes to mind; they have some multilanguage modules.
The language packs will typically only localize text used by the core such as "Login" and "Settings". It is designed so that you can have a site in a language other than English, not so you can have multiple languages on one site. You can easily have multiple portals, each with a different language.
In order to have multiple locales on one portal you will need to use a third party module or develop your own.