I'm currently working with Elm, but I have a few issues with JSON parsing in this language. The error I get is:
Err "Given an invalid JSON: Unexpected token \n in JSON at position 388"
What I need to do is this:
example
For char_meta, I want something like this:
[("Biographical Information", [("Japanese Name", "緑谷出久"), ...]), ...]
Here is the code:
Ellie link
P.S.: The only constant keys are character_name, lang, summary and char_meta. The keys inside char_meta are dynamic (that's why I use keyValuePairs), and the length of this array is always different (sometimes it's empty).
Thanks, I hope you can help me.
EDIT:
The Ellie link now points to the fixed code.
The issue is that Elm (or JS once transpiled) interprets the \n and \" sequences when parsing the string literal and replaces them with an actual newline and a double quote respectively, which results in invalid JSON.
If you want to have the JSON inline in the code, you need to escape the 5 backslashes by doubling them (\\n and \\").
This only applies to string literals; you won't have the issue if you load the JSON from the network, for instance.
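The same effect is easy to reproduce outside Elm. Here is a minimal Python sketch; the JSON payload and the try_parse helper are made up for illustration and are not taken from the Ellie link. With a single backslash the literal ends up containing a raw newline, which JSON does not allow inside a string; with a doubled backslash the JSON parser receives the escape sequence itself.

```python
import json

# Single backslash: Python turns \n into a real newline inside the literal,
# so the JSON parser sees a raw control character inside a string.
broken = '{"summary": "line one\nline two"}'

# Doubled backslash: the literal contains the two characters \ and n,
# which the JSON parser then decodes as an escaped newline.
fixed = '{"summary": "line one\\nline two"}'


def try_parse(text):
    """Return the decoded object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


print(try_parse(broken))
print(try_parse(fixed))
```

The first call fails to parse and the second succeeds, which mirrors the difference between the broken and the fixed Elm literal.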
I'm trying to describe SSH protocol in Kaitai language (.ksy file).
At the beginning, there is a protocol version exchange in the following format:
SSH-protoversion-softwareversion SP comments CR LF
where SP comments is optional. AFAIK, there is no way of describing an attribute as fully optional, only via an if condition. Does anybody know how to describe this relation in Kaitai, so that the parser also accepts this format: SSH-protoversion-softwareversion CR LF?
Thanks
Kaitai Struct is not designed to be what you would call a grammar in the traditional sense (i.e. something mapping to a regular language, context-free grammar, BNF, or something similar). Traditional grammars have a notion of "this element is optional" or "this element can be repeated multiple times", but KS works the other way around: it does not even attempt to solve the ambiguity problem, but rather builds on the fact that all binary formats are designed to be non-ambiguous.
So, whenever you encounter something like an "optional element" or "repeated element" without any further context, please pause and consider whether Kaitai Struct is the right tool for the task, and whether it is really a binary format you're trying to parse. For example, parsing something like JSON, XML, or YAML might be theoretically possible with KS, but the result will not be of much use.
That said, in this particular case it's perfectly possible to use Kaitai Struct; you just need to think about how a real-life binary parser would handle this. From my understanding, a real-life parser will read the whole line up to the CR byte and then do a second pass to interpret the contents of that line. You can model that in KS using something like this:
seq:
  - id: line
    terminator: 0xd # CR
    type: version_line
    # ^^^ this creates a substream with all bytes up to CR byte
  - id: reserved_lf
    contents: [0xa]
types:
  version_line:
    seq:
      - id: magic
        contents: 'SSH-'
      - id: proto_version
        type: str
        terminator: 0x2d # '-'
      - id: software_version
        type: str
        terminator: 0x20 # ' '
        eos-error: false
        # ^^^ if we don't find that space and will just hit end of stream, that's fine
      - id: comments
        type: str
        size-eos: true
        # ^^^ if we still have some data in the stream, that's all comment
If you want to get null instead of an empty string for comments when they're not included, just add an extra if: not _io.eof to the comments attribute.
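To make the two-pass idea concrete, here is a rough hand-written sketch in Python. It is not code generated by Kaitai Struct; the function name parse_version_line and the ASCII decoding are assumptions made for illustration. The first pass cuts the line at CR and checks for LF, the second pass splits the contents of the line.

```python
def parse_version_line(data: bytes):
    """Two-pass parse of 'SSH-proto-software[ SP comments] CR LF'."""
    # First pass: take everything up to the CR byte, then require LF.
    line, sep, rest = data.partition(b"\r")
    if not sep or not rest.startswith(b"\n"):
        raise ValueError("missing CR LF terminator")
    if not line.startswith(b"SSH-"):
        raise ValueError("missing 'SSH-' magic")

    # Second pass: interpret the contents of the line.
    proto_version, sep, tail = line[4:].partition(b"-")
    if not sep:
        raise ValueError("missing '-' after protocol version")
    # A missing space is fine: software_version then runs to end of line.
    software_version, space, comments = tail.partition(b" ")
    return (
        proto_version.decode("ascii"),
        software_version.decode("ascii"),
        comments.decode("ascii") if space else None,
    )


print(parse_version_line(b"SSH-2.0-OpenSSH_8.9\r\n"))
print(parse_version_line(b"SSH-2.0-OpenSSH_8.9 Ubuntu\r\n"))
```

In this sketch the comments come back as None when there is no space after the software version, which roughly corresponds to the variant with the extra if condition.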
I have a VARCHAR column and want to convert its values to JSON with PARSE_JSON:
({u'meta': {u'removedAt': None, u'validation': {u'createdTime': 157....)
When I use :
select get_path(PARSE_JSON(OFFER), 'field') from
this error occurs: SQL-Fehler [100069] [22P02]: Error parsing JSON: missing colon, pos 3.
So I tried to add a colon at position 3:
select get_path(PARSE_JSON(REPLACE (offer,'u','u:')), 'field') from
Then this error occurred: SQL-Fehler [100069] [22P02]: Error parsing JSON: misplaced colon, pos 10
At this point I don't know how to handle this, and the information from Snowflake doesn't really help.
https://support.snowflake.net/s/article/error-error-parsing-json-missing-comma-pos-number
Thanks for your help
Your 'JSON input' is actually the Python string representation (repr) of a dictionary, which is not valid JSON. While dictionaries in Python may look similar to JSON when printed in an interactive shell, they are not the same: here the u'...' string prefixes, the single quotes, and None are all invalid in JSON.
To produce valid JSON from your Python objects, use the json module's dump or dumps functions, and then pass the resulting serialized JSON string to PARSE_JSON.
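If the data has already been stored as a repr string, one possible way to repair it on the Python side is sketched below. The sample value and the helper name repr_to_json are made up for illustration; the real column content is not known here. The cleaner fix remains to call json.dumps at the point where the data is written.

```python
import ast
import json

# Hypothetical sample in the same style as the question (Python repr, not JSON).
offer_repr = "{u'meta': {u'removedAt': None, u'active': True}, u'field': u'value'}"


def repr_to_json(text):
    """Convert the repr of a Python literal into a JSON string."""
    # literal_eval only evaluates literals (dicts, lists, strings, numbers,
    # None, ...), so it does not execute arbitrary code.
    return json.dumps(ast.literal_eval(text))


print(repr_to_json(offer_repr))
```

The printed string uses double quotes, null and true, so it is in a form that a JSON parser can accept.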
I noticed while experimenting with tr///, that it doesn't seem to translate backslashes, even when escaped. For example,
say TR"\^/v"." given 'v^/\\';
say TR"\\^/v"." given 'v^/\\';
say TR"\ ^/v"." given 'v^/\\';
All of them output ...\ rather than what I expected, ....
There's some other weird behaviour too, like \ seemingly only escaping lowercase letters, but the docs page doesn't have much information... What exactly is the behaviour of backslashes (\) in transliteration (tr///)?
There is a bug caused by backslashes getting swallowed instead of correctly escaping things in the grammar for tr///.
say TR/\\// given '\\'
===SORRY!=== Error while compiling:
Malformed replacement part; couldn't find final /
at line 2
------> <BOL>⏏<EOL>
I have raised https://github.com/rakudo/rakudo/issues/2456 and submitted https://github.com/rakudo/rakudo/pull/2457 which fixes it.
The second part of the answer is that Perl 6 tries quite hard in some quoting constructs to only interpret \ as an escape for valid escape sequences, i.e. \n, \r, \s, \', etc. Otherwise it is left as a literal \.
I do not have an explanation for the observed problem. However, when you use the Perl 6 Str.trans method it looks like it's working as expected:
say 'v^/\\'.trans( "\\^/v" => "." );
Outputs:
....
Reference:
https://perl6advent.wordpress.com/2010/12/21/day-21-transliteration-and-beyond/
I have a string "Artîsté". I use json_encode from PHP on it and I get "Art\u00eest\u00e9".
How do I convert that to an NSString? I have tried many things and none of them work; I always end up getting Artîsté.
For example:
[NSString stringWithUTF8String:"Art\u00c3\u00aest\u00c3\u00a9"]; // Artîsté
@"Art\u00c3\u00aest\u00c3\u00a9"; // Artîsté
You can use CFStringCreateFromExternalRepresentation with the kCFStringEncodingNonLossyASCII encoding to parse the \uXXXX escape sequences. Check out my answer here:
Converting escaped UTF8 characters back to their original form
The problem is your input string:
"Art\u00c3\u00aest\u00c3\u00a9"
does in fact literally mean "Artîsté". \u00c3 is 'Ã', \u00ae is '®', and \u00a9 is '©'.
Whatever is producing your input string is receiving UTF-8 input but expecting something else (e.g., cp1252, ISO-8859-1, or ISO-8859-15).
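The mix-up can be reproduced in a few lines. This is only an illustrative Python sketch, assuming the bytes were misread as ISO-8859-1; it is not an Objective-C solution. It decodes the string PHP's json_encode produced, shows how the doubled characters arise, and reverses the damage.

```python
import json

# What PHP's json_encode produced for "Artîsté" (a valid JSON string).
decoded = json.loads('"Art\\u00eest\\u00e9"')

# Encoding to UTF-8 and misreading the bytes as ISO-8859-1 yields the
# doubled characters from the question.
mojibake = decoded.encode("utf-8").decode("iso-8859-1")

# The reverse round trip repairs the text, as long as no bytes were lost.
repaired = mojibake.encode("iso-8859-1").decode("utf-8")

print(decoded, mojibake, repaired)
```

So the escapes \u00c3\u00ae are simply the two UTF-8 bytes of î, each treated as its own character.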
Are square brackets in URLs allowed?
I noticed that Apache Commons HttpClient (3.0.1) throws an IOException; wget and Firefox, however, accept square brackets.
URL example:
http://example.com/path/to/file[3].html
My HTTP client encounters such URLs but I'm not sure whether to patch the code or to throw an exception (as it actually should be).
RFC 3986 states
A host identified by an Internet
Protocol literal address, version 6
[RFC3513] or later, is distinguished
by enclosing the IP literal within
square brackets ("[" and "]"). This
is the only place where square bracket
characters are allowed in the URI
syntax.
So, in theory, you should not see such URIs in the wild, as they should arrive encoded.
Square brackets [ and ] in URLs are not often supported.
Replace them by %5B and %5D:
Using a command line, the following example is based on bash and sed:
url='http://example.com?day=[0-3][0-9]'
encoded_url="$( sed 's/\[/%5B/g;s/]/%5D/g' <<< "$url")"
Using Java URLEncoder.encode(String s, String enc)
Using PHP rawurlencode() or urlencode()
<?php
echo '<a href="http://example.com/day/',
rawurlencode('[0-3][0-9]'), '">';
?>
output:
<a href="http://example.com/day/%5B0-3%5D%5B0-9%5D">
or:
<?php
$query_string = 'day=' . urlencode('[0-3][0-9]') .
'&month=' . urlencode('[0-1][0-9]');
echo '<a href="http://example.com?',
htmlentities($query_string), '">';
?>
Using your favorite programming language... Please extend this answer by posting a comment or by editing this answer directly to add the function you use in your programming language ;-)
For more details, see RFC 3986, which specifies the URI syntax. Appendix A collects the ABNF grammar: brackets belong to “gen-delims” and are not part of the query production, so they have to be %-encoded in the query string.
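In the same spirit as the examples above, here is a short Python sketch using the standard library's urllib.parse. The URLs are the example ones from this answer; the variable names are arbitrary.

```python
from urllib.parse import quote, urlencode

# quote() percent-encodes everything outside the unreserved set,
# so "[" becomes %5B and "]" becomes %5D.
path_segment = quote("[0-3][0-9]", safe="")
print("http://example.com/day/" + path_segment)

# urlencode() builds a whole query string and encodes each value.
query = urlencode({"day": "[0-3][0-9]", "month": "[0-1][0-9]"})
print("http://example.com?" + query)
```

Passing safe="" makes quote() encode "/" as well, which is usually what you want for a single path segment.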
I know this question is a bit old, but I just wanted to note that PHP uses brackets to pass arrays in a URL.
http://www.example.com/foo.php?bar[]=1&bar[]=2&bar[]=3
In this case $_GET['bar'] will contain array(1, 2, 3).
Pretty much the only characters not allowed in pathnames are # and ? as they signify the end of the path.
The URI RFC will have the definitive answer:
http://www.ietf.org/rfc/rfc1738.txt
Unsafe:
Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
The answer is that they should be hex encoded, but knowing Postel's law, most things will accept them verbatim.
Any browser or web-enabled software that accepts URLs and is not throwing an exception when special characters are introduced is almost guaranteed to be encoding the special characters behind the scenes. Curly brackets, square brackets, spaces, etc all have special encoded ways of representing them so as not to produce conflicts. As per the previous answers, the safest way to deal with these is to URL-encode them before handing them off to something that will try to resolve the URL.
For using the HttpClient commons class, you want to look into the org.apache.commons.httpclient.util.URIUtil class, specifically the encode() method. Use it to URI-encode the URL before trying to fetch it.
StackOverflow seems to not encode them:
https://stackoverflow.com/search?q=square+brackets+[url]
Best to URL encode those, as they are clearly not supported in all web servers. Sometimes, even when there is a standard, not everyone follows it.
According to the URL specification, the square brackets are not valid URL characters.
Here are the relevant snippets:
The "national" and "punctuation" characters do not appear in any
productions and therefore may not appear in URLs.
national { | } | vline | [ | ] | \ | ^ | ~
punctuation < | >
Square brackets are considered unsafe, but the majority of browsers will parse them correctly. Having said that, it is better to replace square brackets with some other characters.