Spotify Web API special characters

Is there any documentation for the Spotify Web API on which characters are valid when searching? For example, "Macklemore & Ryan Lewis" needs the & to be URL-encoded for the request to work.
The character ":" seems to be completely invalid. To search for an album like "Pink Friday: Roman Reloaded", I have to remove the : from the string entirely. Even URL-encoding it doesn't work. This probably has to do with the fact that : seems to be used to separate the fields of the query.
Other characters like .[()/ seem to work ok. Any docs on this anywhere? Just want to know we are being comprehensive. Thanks.

Given the Björk example here,
http://ws.spotify.com/search/1/artist?q=artist:Bj%C3%B6rk
You'll need percent-encoding. I'm not sure what technology you're using; if it's JavaScript, this answer should help: URL encode sees “&” (ampersand) as “&amp;” HTML entity.
Update:
Colons are special characters used for advanced search. Queries involving colons need to be quoted, unless it's an advanced search :).
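As a minimal sketch of the percent-encoding described above, Python's standard urllib.parse.quote can encode the search term. The endpoint is the one from the Björk example; build_search_url is a hypothetical helper name, and leaving the colon after the field name unencoded is an assumption based on that example URL, not documented API behaviour.

```python
from urllib.parse import quote

BASE = "http://ws.spotify.com/search/1/artist"  # endpoint taken from the example above

def build_search_url(term, field=None):
    """Percent-encode a search term; optionally prefix an advanced-search field."""
    # safe="" makes quote() encode every reserved character, including "&" and "/".
    encoded = quote(term, safe="")
    if field is not None:
        # The colon after the field name stays literal: it is the field separator.
        return f"{BASE}?q={field}:{encoded}"
    return f"{BASE}?q={encoded}"

print(build_search_url("Björk", field="artist"))
print(build_search_url("Macklemore & Ryan Lewis"))
```

The first call reproduces the Björk URL from the answer; the second shows the ampersand turned into %26 so it is no longer read as a parameter separator.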

Related

Lucene query-time boosting culture code

I'm using the Lucene.Net implementation packaged with the Kentico CMS. The site that we're indexing has articles in various languages. If a user is viewing the Japanese version of the site (for example) and runs a search for 'VPN', we'd like them to see Japanese articles about VPN first, but also see other language articles in the results.
I'm trying to achieve this with query-time boosting of the _culture field. Since we're using the standard analyzer (really don't want to change that), and the standard analyzer treats hyphens as whitespace, I thought I'd try appending '(_culture:jp)^4' to the user's query. As you can see from the Luke tool's Explain output, that isn't doing anything to boost the documents with 'jp' in the field. What gives?
I've also tried:
_culture:"en-jp"
_culture:en AND _culture:jp
_culture:"en jp"
Update: It's something with the field. There's another field in the index named 'documentculture' that contains the same data (don't know why). But when I try '(documentculture:jp)^4', it works as I expect. That solves my problem, but I still have an academic question of how the fields are different.
Even though the standard analyzer ignores hyphens, I don't believe it will treat the two parts of your culture code as separate terms. Under normal circumstances a wildcard would therefore help you here. For example, the query vpn (_culture:en*)^4 would boost all documents with a culture starting with en.
However, in your case you want to match the end of the term. Unfortunately, Lucene syntax doesn't support wildcards at the start of terms for some reason (according to this reference). So I think you're going to have to consider changing the analyzer you're using. I generally find the Whitespace analyzer fits my needs best. I've just tried your scenario using the Whitespace analyzer and found that vpn (_culture:en-jp)^4 gives you what you need.
I understand if you don't accept this answer, though, since you stated you didn't want to change the analyzer!

How to restrict a search to multiple sites?

Restricting a search to multiple sites - is it possible to do this?
E.g. site:www.google+www.yahoo.com?
Bing does not provide a separate method of specifying which sites to search like Google's annotations. Instead, you need to add them as parameters in your query string. To do so, use the OR logical operator along with the site: specification. Bing prioritizes AND over OR, so make sure you put parentheses around the OR'ed terms. For example,
example search terms (site:google.com OR site:yahoo.com)
If you are adding a lot of sites, keep in mind that the total URL length must be less than 2048 characters, encoded.
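A rough sketch of that advice in Python: build the parenthesised OR clause, encode it, and check the length before sending. The function names and the https://www.bing.com/search base URL are assumptions for illustration; the 2048 limit is the figure quoted above.

```python
from urllib.parse import quote_plus

MAX_URL_LENGTH = 2048  # limit quoted in the answer above

def build_site_query(terms, sites):
    """Combine search terms with a parenthesised list of OR'ed site: filters."""
    site_clause = " OR ".join(f"site:{s}" for s in sites)
    return f"{terms} ({site_clause})"

def build_search_url(base_url, terms, sites):
    """Return an encoded search URL, or raise if it reaches the length limit."""
    url = f"{base_url}?q={quote_plus(build_site_query(terms, sites))}"
    if len(url) >= MAX_URL_LENGTH:
        raise ValueError(f"URL is {len(url)} characters; must be under {MAX_URL_LENGTH}")
    return url

query = build_site_query("example search terms", ["google.com", "yahoo.com"])
print(query)  # example search terms (site:google.com OR site:yahoo.com)
print(build_search_url("https://www.bing.com/search", "example search terms",
                       ["google.com", "yahoo.com"]))
```

Measuring the length after encoding matters because each parenthesis and colon grows from one character to three.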
Use OR operator and site: keyword.
E.g.
(site:http://superuser.com/ OR site:http://stackoverflow.com/) (some query)
It should work in most of the search engines such as Google, Bing, Yahoo, DuckDuckGo, etc.
Note: It's important to use capital letters for operators such as AND and OR; otherwise the query may not work.
Format your query like this:
site:dell.com OR site:ibm.com "search phrase"
That is, list each site/domain with OR in between. I tested this and it seems to work for both Bing and Google.
Syntax shamelessly taken from searchenginewatch.com.

How do you test your app for Iñtërnâtiônàlizætiøn? (Internationalization?)

How do you test your app for Iñtërnâtiônàlizætiøn compliance? I tell people to store the Unicode string Iñtërnâtiônàlizætiøn into each field and then see if it is displayed correctly on output.
--- including output as a cell's content in Excel reports, in rtf format for docs, xml files, etc.
What other tests should be done?
Added idea from @Paddy:
Also try a right-to-left language. Eg, שלום ירושלים ([The] Peace of Jerusalem). Should look like:
(source: kluger.com)
Note: Stackoverflow is implemented correctly. If text does not match the image, then you have a problem with your browser, os, or possibly a proxy.
Also note: You should not have to change or "set up" your already running app to accept either the Western European characters or the Hebrew example. You should be able to just type those characters into your app and have them come back correctly in your output. In case you don't have a Hebrew keyboard lying around, copy and paste the examples from this question into your app.
Pick a culture where the text reads from right to left and set your system up for that - make sure that it reads properly (easier said than done...).
Use one of the three "pseudo-locales" available since Windows Vista:
The three pseudo-locales are for testing three kinds of locales:
Base: The qps-ploc locale is used for English-like pseudo localizations. Its strings are longer versions of English strings, using non-Latin and accented characters instead of the normal script. Additionally, simple Latin strings should sort in reverse order with this locale.
Mirrored: qps-mirr is used for right-to-left pseudo data, which is another area of interest for testing.
East Asian: qps-asia is intended to utilize the large CJK character repertoire, which is also useful for testing.
Windows will start formatting dates, times, numbers, and currencies in a made-up pseudo-locale that looks enough like English that you can work with it, but is obvious enough that you notice when you're not respecting the locale:
[Шěđлеśđαỳ !!!], 8 ōf [Μäŕςћ !!] ōf 2006
There is more to internationalization than Unicode handling. You also need to make sure that dates show up localized to the user's timezone, if you know it (and make sure there's a way for people to tell you what their time zone is).
One handy fact for testing timezone handling is that there are two timezones (Pacific/Tongatapu and Pacific/Midway) that are actually 24 hours apart. So if timezones are being handled properly, the dates should never be the same for users in those two timezones for any timestamp. If you use any other timezones in your tests, results may vary depending on the time of day you run your test suite.
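The 24-hour gap can be checked with a small Python sketch. To stay self-contained it uses fixed UTC+13 and UTC-11 offsets as stand-ins for the two zones, which is an assumption; a real test suite would load the named zones from the tz database, where offsets can change over time. local_dates is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets standing in for the two zones named above (an assumption for
# this sketch; a real test would load them from the tz database).
TONGATAPU = timezone(timedelta(hours=13), "Pacific/Tongatapu")
MIDWAY = timezone(timedelta(hours=-11), "Pacific/Midway")

def local_dates(instant):
    """Return the calendar date of one instant in each of the two zones."""
    return instant.astimezone(TONGATAPU).date(), instant.astimezone(MIDWAY).date()

# Check every hour of an arbitrary day: the local dates always differ by one day.
start = datetime(2009, 11, 5, tzinfo=timezone.utc)
for hour in range(24):
    tonga, midway = local_dates(start + timedelta(hours=hour))
    assert tonga - midway == timedelta(days=1)
```

If the application under test ever renders the same date for a user in each zone, the timestamp is probably not being converted per user.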
You also need to make sure dates and times are formatted in a way that makes sense for the user's locale, or failing that, that any potential ambiguity in the rendering of dates is explained (e.g. "05/11/2009 (dd/mm/yyyy)").
"Iñtërnâtiônàlizætiøn" is a really bad string to test with since all the characters in it also appear in ISO-8859-1, so the string can work completely without any Unicode support at all! I've no idea why it's so commonly used when it utterly fails at its primary function!
Even Chinese or Hebrew text isn't a good choice (though right-to-left is a whole can of worms by itself) because it doesn't necessarily contain anything outside 3-byte UTF-8, which curiously was a very large hole in MySQL's default UTF-8 implementation (which is limited to 3-byte chars), until it was fixed by the addition of the utf8mb4 charset in MySQL 5.5. These days one of the more common uses of >3-byte UTF-8 is Emojis like these: [💝🐹🌇⛔]. If you don't see some pretty little coloured pictures between those brackets, congratulations, you just found a hole in your Unicode stack!
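To see which test strings actually exercise which part of the stack, a short Python sketch can report whether a string fits in ISO-8859-1 and how many UTF-8 bytes its widest character needs. The helper names utf8_width and fits_latin1 are made up for this example.

```python
def utf8_width(text):
    """Return the largest number of UTF-8 bytes used by any character in text."""
    return max(len(ch.encode("utf-8")) for ch in text)

def fits_latin1(text):
    """True if the text can be stored without any Unicode support (ISO-8859-1)."""
    try:
        text.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False

samples = ["Iñtërnâtiônàlizætiøn", "שלום ירושלים", "💝🐹🌇"]
for sample in samples:
    # Prints the sample, whether Latin-1 can hold it, and its widest character in bytes.
    print(sample, fits_latin1(sample), utf8_width(sample))
```

A string that fits in ISO-8859-1 proves little about Unicode support, and only a string whose width is 4 reaches beyond the 3-byte limit mentioned above.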
First, learn The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets.
Make sure your application can handle Turkish. It has several quirks that break applications that assume English rules. Because there are four kinds of letter "i" (dotted and dot-less, upper and lower case), applications that assume uppercase(i) => I will break when using Turkish rules, where uppercase(i) => İ.
A common thing to do is check if the user typed the command "exit" by using lowercase(userInput) == "exit" or uppercase(userInput) == "EXIT". This works as expected under English rules but will fail under Turkish rules where "exıt" != "exit" and "EXİT" != "EXIT". To do this correctly, one must use case-insensitive comparison routines which are built into all modern languages.
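The failure can be reproduced in Python with a small sketch. Python's own str.upper() and str.lower() do not apply Turkish rules, so turkish_upper and turkish_lower below are hypothetical helpers that imitate them for the four letters in question; is_exit_command shows one locale-independent way to compare, using str.casefold().

```python
def turkish_upper(text):
    """Uppercase using Turkish rules: i -> İ, ı -> I (hypothetical helper)."""
    return text.replace("i", "İ").replace("ı", "I").upper()

def turkish_lower(text):
    """Lowercase using Turkish rules: İ -> i, I -> ı (hypothetical helper)."""
    return text.replace("İ", "i").replace("I", "ı").lower()

def is_exit_command(user_input):
    """Compare with casefold(), which does not depend on the user's locale."""
    return user_input.casefold() == "exit".casefold()

print(turkish_upper("exit") == "EXIT")   # False: the result is "EXİT"
print(turkish_lower("EXIT") == "exit")   # False: the result is "exıt"
print(is_exit_command("EXIT"))           # True
```

In languages whose case conversion follows the current culture, the equivalent fix is to request an invariant or ordinal case-insensitive comparison explicitly.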
I was thinking about this question from a completely different angle. I can't recall exactly what we did, but on a previous project I think we wound up changing the Regional Settings (in the Regional and Language Options control panel?) to help us ensure the localized strings were working.

Do spaces in your URL (%20) have a negative impact on SEO?

All the articles I Googled on this subject are dated back in 2004-2005.
Basically I am structuring precanned searches, and it is based off of categories the client will input.
Example
content/(term name)/index.htm
Does it matter if I used the raw term with a space, which is converted to %20 in the URL, or should I convert the link to '-' and remove that before querying for results?
I already have it working, but does anyone know if this definitely has a negative impact on SEO and ranking?
No impact on SEO. A "-" just looks nicer, that's all.
You'd use %20 if you needed to preserve the exact term including a proper space when you read it back from the URL. Probably you don't.
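The trade-off can be sketched in Python: percent-encoding round-trips exactly, while swapping spaces for hyphens loses information whenever the term already contains a hyphen. The helper names and the sample term "wi-fi routers" are made up for illustration.

```python
from urllib.parse import quote, unquote

def encode_term(term):
    """Percent-encode the raw term; decoding restores it exactly."""
    return quote(term, safe="")

def hyphenate_term(term):
    """Replace spaces with hyphens for a cleaner-looking URL segment."""
    return term.replace(" ", "-")

term = "term name"
print(f"content/{encode_term(term)}/index.htm")     # content/term%20name/index.htm
print(f"content/{hyphenate_term(term)}/index.htm")  # content/term-name/index.htm

# Round trip: percent-encoding is reversible; hyphenation is not when the
# original term already contains a hyphen.
print(unquote(encode_term("wi-fi routers")))              # wi-fi routers
print(hyphenate_term("wi-fi routers").replace("-", " "))  # wi fi routers
```

If the hyphenated form is used, looking the category up by a stored slug avoids having to reverse the replacement at all.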
I personally think it should be "-".
I don't remember seeing a website that used %20.
"-" is one character and %20 is three, so more of the URL stays visible in the address bar.
For example, which is better:
Do spaces in your URL (%20) have a negative impact on SEO?
or
Do spaces in your URL (%20) have a negative impact on SEO?
Yes, don't use them. Google, Yahoo and Bing do not know how to leverage the spaces, and more importantly you are wasting a good opportunity to tell both the consumer and the search engines more about your product or page URL and what the topic of the content is.
However, sometimes it can't be helped, because you have had a website or e-commerce site for years and it is already indexed with a good page ranking.
In that case, if you do want a better naming convention, rename the URLs, but put every existing URL that contains a space into a 301 redirect mapped to its new URL.
%20 does not affect SEO, but it will destroy the readability of your URL. Since CMSs have taken all the attention, it's now easy to set up a dynamic URL structure. I recently read an article on SEO-friendly URLs which will help you avoid a Google penalty, improve your chances to rank, and make your links meaningful. Hope it helps.
As mentioned, it really doesn't matter from a search engine perspective. With that being said, however, it's generally not good practice to use spaces in URLs (%20). Replace it with a dash or concatenate it.
I use Blogger, and when I add labels to a blog post, the link to that label page has a space which is converted to "%20"; I have no control over that with Blogger. When I try to make the labels with '-' instead of a space they are not nice for humans, so I go with spaces and "%20" in URLs. I think this should not affect SERPs.
We use "%20" all over the place on our website and have not experienced any negative effects. We began doing this about two years ago, and at that time a few search engines had problems, but they have since disappeared. Some browsers will display a "%20" in the address bar, while others will display an empty space, but this really doesn't matter.
We're not so sure though that this has any positive effect on ranking, though it definitely has no negative effect. The thing to remember about Google is that while having a keyword as part of the base url, such as www.greatwidgets.com, is very helpful, using keywords as part of the page url, example: www.myexample.com/widgets.htm does not appear to result in any advantage. What matters is the page content and how many other pages out there have the exact same content. Also, incoming links from relevant websites with high rankings, without the rel="nofollow" tag are extremely important.
You cannot "trick" Google with fancy-looking URLs and h1 headers. That's right, h1 headers mean nothing, because Google doesn't require your input to tell them what's important.
Remember, if you're selling products and copying content from the manufacturer's website (or the competitor's website), Google's PANDA is going to be very angry. You'll need to reword your content so that it's not a verbatim copy from some other website. Google rewards originality, and severely punishes plagiarism. Seriously, PANDA will put the offending page on page 50 until it's brought into conformity with Google's policy on duplicate content.
Always use sitemaps to help the search engines.
I believe it looks better in a link if an underscore (_) is used.
content/term_name/index.htm
content/term-name/index.htm
content/term%20name/index.htm
It's better to use "-" instead of %20, since %20 looks like unprofessional coding to both search engines and visitors. Do you really think a visitor could remember a URL with %20? Make the pages for the users and not for the search engines. You will get the most benefit from this, and search engines will appreciate it.
In my view, spaces should not appear in URLs, as this is not good practice. We should use hyphens between the words in URLs. The website should also have a sitemap.xml file.
In my view, spaces do have a negative impact on SEO. Secondly, when creating a URL structure, hyphens should be used instead of underscores.
Yes, they do have a negative effect, as they affect the user experience. Users would like to have easy-to-remember URLs. Google suggests you separate your words with '-' and ideally not use '_' or spaces ('%20').
Something else to consider is that if you use spaces in your URLs, it will break automatic URL detection in a lot of software (e.g. email clients, chat apps, etc.), which treats a space as the end of the URL. This might negatively impact the "sharability" of your URLs.
Using spaces in URLs is still not common practice in 2020, and Google still recommends using - instead:
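A simplified illustration of that detection problem in Python: the regular expression below is only a stand-in for whatever a real mail or chat client does, and example.com is a placeholder domain. It treats a URL as running until the first whitespace, which is the behaviour described above.

```python
import re

# Simplified stand-in for an auto-linker: a URL runs until the first whitespace.
URL_PATTERN = re.compile(r"https?://\S+")

def detect_url(message):
    """Return the first URL-like run in the message, or None."""
    match = URL_PATTERN.search(message)
    return match.group(0) if match else None

print(detect_url("See http://example.com/content/term name/index.htm"))
print(detect_url("See http://example.com/content/term%20name/index.htm"))
print(detect_url("See http://example.com/content/term-name/index.htm"))
```

Only the first message, pasted with a raw space, yields a truncated link; the %20 and hyphen forms both survive intact.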
https://support.google.com/webmasters/answer/76329?hl=en

URL formatting tips for search engine optimization?

I am looking for URL encoding tips for an SEO-compliant site.
I have a list of variables I need!
hyphen = used to split locations, Leeds-UK-England
space = underscore for where spaces occur
hyphen = plus sign used in some British locations (stafford-upon-avon)
forward slash = exclamation used in house for names of things.
Are the ones chosen bad or good? Are there any better ones? I'm pretty sure I need all the data in order to decode the URLs properly.
My "SEO" gave me a list of things which are bad, but not good. I've searched these and google seems to give the same type of results.
Cheers, Sarkie
Google used not to recognise underscores as word separators - see this article from 2005. This has entered into received wisdom and most of the 'experts' and articles you will find on SEO will still be recommending this.
However, last year this changed: underscores are now recognised as word separators so it opens things up for URL design. This now allows using dashes as dashes and underscores as spaces which some consider more natural. I've not found many people who have caught up with this, including SEO consultants I deal with professionally.
As to a good system for your use case, I would recommend asking around some non technical people (colleagues, friends, family, etc) to see what they like.
Hyphens for spaces is the usual and preferred method.