Suppose that with Jersey I expose two resource paths:
/hello/{name}
/hello/goodby
If the user requests /hello/goodby, does Jersey guarantee that the "/hello/goodby" resource will be chosen, and not "/hello/{name}" with name equal to "goodby"?
I have cases like this in the services I expose. It seems that the static path is always chosen, but I'm looking for confirmation in the documentation and I don't see anything here: https://jersey.github.io/documentation/latest/jaxrs-resources.html#d0e2271
It's not going to be in the documentation. It's going to be in the JAX-RS Spec. Look in the section "3.7.2 Request Matching", and somewhere along in the cryptic mumbo jumbo you will see this:
Sort E using the number of literal characters in each member as the primary key
E being the methods that have qualified so far based on path. This means that the path with the most literal characters is prioritized. In your case, that's why /hello/goodby always wins: goodby consists of literal characters, while {name} has zero literal characters because it's a capture group.
That is correct. /hello/goodby takes precedence over /hello/{name}, assuming both are declared at the same level (both on classes or both on methods).
All matching candidates are sorted in descending order on the following keys:
Number of literal characters as the primary key
Number of path params as the secondary key
Number of path params with a non-default regular expression as the tertiary key
In your case, you have only literal characters and path params.
/hello/goodby - 12 literal characters (not counting the leading slash) and 0 path params.
/hello/{name} - 6 literal characters (hello/) and 1 path param.
According to the sorting algorithm, /hello/goodby sorts before /hello/{name}, so /hello/goodby is the best match.
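The ordering described above can be sketched in a few lines of Python. This is only an illustration of the three sort keys, not Jersey's actual implementation: the helper names sort_key and rank are made up, the leading slash is dropped before counting (an assumption about normalization), and nested braces inside a custom regex are not handled.

```python
import re

# Matches a path parameter such as {name} or {id: [0-9]+}.
# Assumption: no nested braces inside the parameter's regex.
PARAM = re.compile(r"\{[^}]*\}")

def sort_key(template):
    """Return (literal chars, path params, params with a custom regex)."""
    path = template.lstrip("/")          # assumed normalization
    params = PARAM.findall(path)
    custom = [p for p in params if ":" in p]
    literals = len(PARAM.sub("", path))  # everything outside the braces
    return (literals, len(params), len(custom))

def rank(templates):
    """Sort candidate templates in descending order on the three keys."""
    return sorted(templates, key=sort_key, reverse=True)

candidates = ["/hello/{name}", "/hello/goodby"]
print(rank(candidates))  # the static path comes first
```

The sketch only ranks templates that already match the request; deciding which templates match is a separate, earlier step in the spec.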
I have reviewed several tutorials on URL parameters, which all suggest two approaches:
Parameters can follow forward slashes (/), or be specified by parameter name and then parameter value. So either:
1) http://numbersapi.com/42
or
2) http://numbersapi.com/random?min=10&max=20
For the second one, I provide the parameter name and then the parameter value after the ?. I can also provide multiple parameters using an ampersand.
Now I have seen the request below, which works fine but does not fit the rules above:
http://numbersapi.com/42?json
I understand that the request sets 42 as a parameter, but why is the ? followed by just a value and not by a parameter name? Also, the ? seems to be used like an ampersand.
From Wikipedia:
Every HTTP URL conforms to the syntax of a generic URI. The URI generic syntax consists of a hierarchical sequence of five components:
URI = scheme:[//authority]path[?query][#fragment]
where the authority component divides into three subcomponents:
authority = [userinfo@]host[:port]
As the syntax shows, the ? ends the path part of the URL and starts the query part.
The query part is usually a &-separated string of name=value pairs, but it doesn't have to be, so json is a valid value for the query part.
Or, as the Wikipedia article puts it:
An optional query component preceded by a question mark (?), containing a query string of non-hierarchical data. Its syntax is not well defined, but by convention is most often a sequence of attribute–value pairs separated by a delimiter.
It is also fairly common for request processors to treat a name=value pair that is missing the = sign as if it were name=.
E.g. if you're writing Servlet code and call servletRequest.getParameter("json"), it would return an empty string ("") for that last URL in the question.
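The same behaviour can be seen outside of Servlets. As a small illustration (using Python's standard urllib.parse rather than the Servlet API, so the details are not identical), the sketch below splits the URL from the question and parses its query part. Note that keep_blank_values=True is needed here for the bare json to show up with an empty value.

```python
from urllib.parse import urlsplit, parse_qs

# The ? ends the path and starts the query.
parts = urlsplit("http://numbersapi.com/42?json")
print(parts.path)   # /42
print(parts.query)  # json

# A name without "=" is treated like "json=" when blank values are kept.
params = parse_qs(parts.query, keep_blank_values=True)
print(params)       # {'json': ['']}

# The conventional name=value form, for comparison.
ranged = parse_qs(urlsplit("http://numbersapi.com/random?min=10&max=20").query)
print(ranged)       # {'min': ['10'], 'max': ['20']}
```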
In Solr I have a field dedicated to URLs. The URL field can be up to 2000 characters long. However, I only ever need to search the first 200 characters.
Example URL:
https://www.google.co.uk/search/2014/here/?q=help+me&oq=stackoverflow&aqs=c
I've experimented over the last 2 weeks with n-grams and various combinations of tokenizers, to no avail. I always seem to fall short. I would provide examples, but they are all standard, so there is no point cluttering this with non-working types.
The main problem seems to be how Solr deals with punctuation. It treats characters outside A-Z, a-z and 0-9 as separators. How do I disable this for a field?
For example, I can search for 'google' and get the correct result, but when I search for 'google.co' nothing comes back. I have the same problem with most of the other non-alphanumeric characters; Solr seems to treat them as separators.
Everything needs to be *wildcard*searchable, from 4-character strings up to 200-character strings.
So the following search terms should all return the above result: '&aqs','ow&aqs=','ps://www.goo','q=help+','2014/he'... etc
How would you define a field type for the URL wildcard use case?
You can use a string field for your URL and use a filter that cuts it off at 200 characters. A regular expression can also be used to keep only the first 200 characters for that field.
A string field is not tokenized, so it matches against the exact value.
I am building an API endpoint that accepts DateTime as a parameter.
It is recommended not to use the : character as part of the URI, so I can't simply use the ISO 8601 format.
So far I have considered two formats:
A) Exclamation mark as minute delimiter:
http://api.example.com/resource/2013-08-29T12!15
Looks unnatural and even with clear documentation, API consumers are bound to make mistakes.
B) URI segment per DateTime part:
http://api.example.com/resource/2013/08/29/12/15
Looks unreadable. Also, once I add further numeric parameters - it will become incomprehensible!
Is there a standard or convention for representing date/time in URIs?
I'd use the data interchange standard format.
Check this: http://en.wikipedia.org/wiki/ISO_8601
You can use : in URI paths.
The colon is a reserved character, but it has no delimiting role in the path segment. So the following should apply:
If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
There is only one exception for relative-path references:
A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.
But note that some encoding libraries might percent-encode the colon anyway.
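As an illustration of that last point, here is a short Python sketch (the timestamp value is taken from the question; the variable names are made up). Python's urllib.parse.quote percent-encodes the colon by default, keeps it when told it is safe, and both forms decode back to the same ISO 8601 string.

```python
from datetime import datetime
from urllib.parse import quote, unquote

stamp = "2013-08-29T12:15"

# By default only "/" is treated as safe, so the colon is percent-encoded.
encoded_default = quote(stamp)
print(encoded_default)  # 2013-08-29T12%3A15

# Declaring ":" safe leaves the path segment readable.
encoded_keep = quote(stamp, safe=":")
print(encoded_keep)     # 2013-08-29T12:15

# Either form decodes to the same value on the server side.
for segment in (encoded_default, encoded_keep):
    parsed = datetime.strptime(unquote(segment), "%Y-%m-%dT%H:%M")
    print(parsed.isoformat())
```

A server that decodes the path segment before parsing should therefore accept both spellings.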
Let's say I have a regular expression that validates the input value as a whole. For example, for an email input box, when the user hits enter I check the value against ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$ to see if it is a valid email address.
What I want to achieve is to intercept character input too, and check each typed character to see whether it is a valid character. I can do this by adding an extra regular expression, e.g. [A-Z0-9._%+-], but that is not what I want.
Is there a way to extract the widest possible range of acceptable characters from a given regular expression? So in the example above, can I programmatically extract all the valid characters that are defined by the original regular expression (i.e. ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$)?
I would appreciate any help or hint.
P.S. This is a project for iOS written in Objective-C.
If you don't mind writing half a regex parser, certainly. You would have to be able to distinguish literals from meta-characters and to unroll/merge all character classes (including negated character classes, and nested negated character classes, if your regex flavor supports them).
If NSRegularExpression doesn't come with some convenience method, I cannot imagine how it would be possible otherwise. Just think about ^. When it is outside of a character class, it's a meta-character that you can ignore. When it is inside a character class, it negates the class if it is the first character, and is a literal otherwise. - is a meta-character inside character classes, unless it is the first character, the last character, or right after another character range (depending on regex flavor). And I'm not even speaking about escaped characters.
I don't know about NSRegularExpression, but some flavors also support nested character classes (like [a-z[^aeiou]] for all consonants). I think you get where I am going with this.
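To give an idea of what "half a regex parser" means, here is a deliberately limited Python sketch (the function name allowed_chars is made up). It assumes the pattern only contains literals, escaped punctuation, anchors, quantifiers and simple non-negated character classes, and it gives up on anything else. It is written in Python rather than Objective-C purely for brevity.

```python
def allowed_chars(pattern):
    """Collect the characters a very simple regex can ever match."""
    chars = set()
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\":
            nxt = pattern[i + 1]
            if nxt.isalnum():            # \d, \w, \b ... are not handled
                raise ValueError("unsupported escape: \\" + nxt)
            chars.add(nxt)               # escaped punctuation is a literal
            i += 2
        elif c == "[":
            end = pattern.index("]", i + 1)   # assumes no "]" inside the class
            body = pattern[i + 1:end]
            if body.startswith("^"):
                raise ValueError("negated classes are not handled")
            k = 0
            while k < len(body):
                # "a-z" is a range; a "-" at the start or end is a literal.
                if k + 2 < len(body) and body[k + 1] == "-":
                    lo, hi = ord(body[k]), ord(body[k + 2])
                    chars.update(chr(x) for x in range(lo, hi + 1))
                    k += 3
                else:
                    chars.add(body[k])
                    k += 1
            i = end + 1
        elif c == "{":
            i = pattern.index("}", i) + 1     # skip a {m,n} quantifier
        elif c in "^$+*?":
            i += 1                            # anchors and quantifiers
        elif c in ".()|":
            raise ValueError("unsupported construct: " + c)
        else:
            chars.add(c)                      # plain literal
            i += 1
    return chars

# The email pattern from the question, with @ as the separator.
EMAIL = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"
print("".join(sorted(allowed_chars(EMAIL))))
```

Even for this small subset, the special cases for - and ^ described above take up a good part of the code, and escapes inside classes, negation and nesting are still missing.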
I'd like to create a regular expression such that when I compare a string against an array of strings, matches are returned with the regex ignoring certain characters.
Here's one example. Consider the following array of names:
{
"Andy O'Brien",
"Bob O'Brian",
"Jim OBrien",
"Larry Oberlin"
}
If a user enters "ob", I'd like the app to apply a regex predicate to the array so that all of the names in the above array match (i.e. the ' is ignored).
I know I can run the match twice, first against each name and second against each name with the ignored chars stripped from the string. I'd rather this be done by a single regex so I don't need two passes.
Is this possible? This is for an iOS app and I'm using NSPredicate.
EDIT: clarification on use
From the initial answers I realized I wasn't clear. The example above is a specific one. I need a general solution where the array of names is a large array with diverse names and the string I am matching against is entered by the user. So I can't hard code the regex like [o]'?[b].
Also, I know how to do case-insensitive searches so don't need the answer to focus on that. Just need a solution to ignore the chars I don't want to match against.
Since you have discarded all the answers showing the ways it can be done, you are left with the answer:
NO, this cannot be done. Regex does not have an option to 'ignore' characters. Your only options are to modify the regex to match them, or to do a pass on your source text to get rid of the characters you want to ignore and then match against that. (Of course, then you may have the problem of correlating your 'cleaned' text with the actual source text.)
If I understand correctly, you want a way to match the characters "ob" 1) regardless of capitalization, and 2) regardless of whether there is an apostrophe in between them. That should be easy enough.
1) Use a case-insensitivity modifier, or use a regexp that specifies that the capital and lowercase version of the letter are both acceptable: [Oo][Bb]
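2) Use the ? quantifier to indicate that a character may be present zero or one times. o'?b will match both "o'b" and "ob". If you want to include other characters that may or may not be present, you can group them with the apostrophe in a character class. For example, o['~-]?b will match "ob", "o'b", "o-b", and "o~b". (The hyphen goes last in the class; written as ['-~] it would define a range from ' to ~ and match almost any printable ASCII character.)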
So the complete answer would be [Oo]'?[Bb].
Update: The OP asked for a solution that would cause the given character to be ignored in an arbitrary search string. You can do this by inserting '? after every character of the search string. For example, if you were given the search string oleary, you'd transform it into o'?l'?e'?a'?r'?y'?. Foolproof, though probably not optimal for performance. Note that this would match "o'leary" but also "o'lea'r'y'" if that's a concern.
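The transformation described in the update can be sketched as follows. This is a Python illustration rather than NSPredicate code, and the helper names build_pattern and matches are made up; the point is only how the pattern string is built from an arbitrary search term.

```python
import re

IGNORED = "'"   # characters to ignore; extend as needed

def build_pattern(term, ignored=IGNORED):
    """Insert an optional ignored-character class after every character."""
    optional = "[" + re.escape(ignored) + "]?"
    # re.escape keeps user input such as "." or "+" from acting as regex syntax.
    return "".join(re.escape(ch) + optional for ch in term)

def matches(term, names):
    """Return the names containing the term, ignoring IGNORED and case."""
    regex = re.compile(build_pattern(term), re.IGNORECASE)
    return [name for name in names if regex.search(name)]

names = ["Andy O'Brien", "Bob O'Brian", "Jim OBrien", "Larry Oberlin"]
print(matches("ob", names))    # all four names
print(matches("obr", names))   # every name except Larry Oberlin
```

The generated pattern grows linearly with the search term, so for typical user input the performance cost mentioned above should be modest.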
In this particular case, just throw the set of characters into the middle of the regex as optional. This works specifically because you have only two characters in your match string, otherwise the regex might get a bit verbose. For example, match case-insensitive against:
o[']*b
You can add more characters to that character class in the middle to ignore them. Note that the * matches any number of the characters in the class (so O'''Brien will match). To allow at most a single instance, change it to ?:
o[']?b
You can make particular characters optional with a question mark, which means that it will match whether they're there or not, e.g:
/o\'?b/
This would match all of the above. Add .+ to either side to match the other characters, and a space to mark the start of the surname:
/.+? o\'?b.+/
And use the case-insensitivity modifier to make it match regardless of capitalisation.