In the HTTP header, line breaks are tokens to separate fields in the header.
But if I want to send a literal line break in a custom field, how should I escape it?
If you are designing your own custom extension field, you may use Base64 or quoted-printable to escape (and unescape) the value.
The actual answer to this question is that there is no standard for encoding line breaks.
You can use any Binary-to-text encoding such as URL-Encoding or Base64, but obviously that's only going to work if both sender and receiver implement the same method.
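As a minimal sketch of that idea, the snippet below encodes a value containing a CRLF with both percent-encoding and Base64, using only the Python standard library. The header name X-Example-Note and the sample value are made up for illustration; the receiver has to apply the matching decode step.

```python
import base64
from urllib.parse import quote, unquote

# Hypothetical header value that contains a literal CRLF.
value = "first line\r\nsecond line"

# Option 1: percent-encoding. With safe="" every reserved character,
# including CR and LF, is replaced by a %XX escape.
percent = quote(value, safe="")

# Option 2: Base64 over the UTF-8 bytes of the value.
b64 = base64.b64encode(value.encode("utf-8")).decode("ascii")

# Either form fits on a single header line (the header name is an assumption).
header_line = "X-Example-Note: " + percent

# The receiver reverses whichever encoding both sides agreed on.
assert unquote(percent) == value
assert base64.b64decode(b64).decode("utf-8") == value
```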
RFC 2616 did allow header values to be 'folded' (i.e. wrapped) over multiple lines, but the line breaks were treated as a single space character and were not part of the parsed field value.
However, that specification has been obsoleted by RFC 7230 which forbids folding:
Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one space or horizontal tab (obs-fold).
This specification deprecates such line folding except within the message/http media type (Section 8.3.1).
A sender MUST NOT generate a message that includes line folding
A standard for line breaks in HTTP Header field values is not – and never was – established.
According to RFC2616 4.2 Message Headers:
Header fields can be extended over
multiple lines by preceding each extra
line with at least one SP or HT.
where SP means a space character (0x20) and HT means a horizontal tab character (0x09).
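To make the folding rule concrete, here is a small sketch of how a receiver could unfold such legacy headers: a CRLF followed by at least one SP or HT is collapsed into a single space, so the line break never reaches the parsed value. The function name and the sample header block are assumptions for illustration.

```python
import re

def unfold_obs_fold(raw_headers):
    """Replace each CRLF followed by one or more SP/HT with a single space."""
    return re.sub(r"\r\n[ \t]+", " ", raw_headers)

# Hypothetical header block with one folded field.
raw = "X-Example: first part\r\n\t second part\r\nHost: example.com\r\n"
unfolded = unfold_obs_fold(raw)
print(unfolded)
```

The CRLF that separates X-Example from Host is left alone, because it is not followed by whitespace.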
The idea is that HTTP headers are ASCII-only, and newlines and the like are not allowed in field values. If both sender and receiver can interpret your encoding, then you can encode whatever you want, however you want. That is how internationalized domain names are handled in the Host header (the encoding is called Punycode).
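For illustration only, Python's built-in idna codec shows the kind of agreed-upon ASCII encoding meant here. The host name is a made-up example, and the codec implements the older IDNA 2003 rules.

```python
# Hypothetical internationalized host name ("\u00fc" is the letter u-umlaut).
hostname = "m\u00fcnchen.example"

# Labels with non-ASCII characters become ASCII labels starting with "xn--".
ascii_host = hostname.encode("idna")
print(ascii_host)

# Decoding restores the original name, so the mapping is reversible.
print(ascii_host.decode("idna"))
```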
Short answer is: You don't, unless you control both sender and receiver.
If it's a custom field, how you escape it depends entirely on how the targeted application is going to parse it. If this is some add-on you created, you could stick with URL encoding, since it's tried and true and lots of languages have encoding/decoding methods built in. Your web app would encode the value and your plug-in (or whatever you're working on) would decode it.
Currently, I am using the below code to set parameters to retrieve data from PACS.
DcmDataset findParams = DcmDataset();
findParams.putAndInsertString(DCM_QueryRetrieveLevel, "SERIES");
findParams.putAndInsertString(DCM_SpecificCharacterSet, "ISO_IR 192");
However, I wanted to check whether we can specify multiple character sets so that data in either encoding can be imported at the same time. The code would look something like the line below. I am asking whether this is possible because I don't have the facilities to verify it myself.
findParams.putAndInsertString(DCM_SpecificCharacterSet, "ISO_IR 192" ,"ISO_IR 100");
I think that what you want to express is that "this Query SCU can accept responses in the following character sets". This is plainly not possible. See a discussion in the DICOM newsgroup for reference. It ends with a proposal to add character set negotiation to the association negotiation. But such a supplement has not been submitted yet, and I am not aware of anyone working on it currently.
The semantics of the attribute Specific Character Set (0008,0005) in the context of the Query Retrieve Service Class:
PS3.4, C.4.1.1.3.1 Request Identifier Structure
Conditionally, the Attribute Specific Character Set (0008,0005). This Attribute shall be included if expanded or replacement character sets may be used in any of the Attributes in the Request Identifier. It shall not be included otherwise
I.e. it describes nothing but the character encoding of your request dataset.
and
C.4.1.1.3.2 Response Identifier Structure
Conditionally, the Attribute Specific Character Set (0008,0005). This Attribute shall be included if expanded or replacement character sets may be used in any of the Attributes in the Response Identifier. It shall not be included otherwise. The C-FIND SCP is not required to return responses in the Specific Character Set requested by the SCU if that character set is not supported by the SCP. The SCP may return responses with a different Specific Character Set.
I.e. you cannot control the character set in which the SCP will send you the responses. Surprising but a matter of fact.
Sending multiple values for the attribute is possible but has different semantics. It means that the request contains characters from different character sets, which are switched using Code Extension Techniques as defined in ISO 2022. An illustrative example of what this looks like and what it means can be found in PS3.5, H.3.2.
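As a rough illustration of the multi-valued form: in DICOM string encoding, multiple values of one attribute are separated by a backslash, so a Specific Character Set with two values is a single string with a backslash in the middle. The sketch below only splits such a string; the helper name is hypothetical and the two ISO 2022 terms are sample input, not a recommendation.

```python
def split_specific_character_set(value):
    """Split a DICOM multi-valued string on the backslash delimiter."""
    return [term.strip() for term in value.split("\\")]

# Single-valued: one character set for the whole dataset.
print(split_specific_character_set("ISO_IR 192"))

# Multi-valued: character sets switched via ISO 2022 code extensions
# (sample terms, assumed for illustration).
print(split_specific_character_set("ISO 2022 IR 13\\ISO 2022 IR 87"))
```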
What implementors usually do to avoid character set compatibility issues is to configure "the one and only" character set for a particular installation (i.e. a hospital) in a locale setting that is chosen at system setup. This works pretty well: for example, an installation in Russia will very likely support Cyrillic (ISO_IR 144), Unicode (ISO_IR 192), or both. In the case of "both", you can select whichever character set you prefer when configuring your system.
I have created a vCard 4.0 file with a text editor according to RFC 6350 by IETF. It is simple, and looks kind of like this:
BEGIN:VCARD
VERSION:4.0
KIND:individual
FN:René Descartes
N:Descartes;René;;;
TITLE:Façade Engineer
ADR;
GEO="geo:46.975308,0.699597";
LABEL="Headquarters":
;;29 Rue Descartes;;Descartes;37160;France
TEL;VALUE=uri;TYPE=home:tel:+33247597919
END:VCARD
The file is saved as somename.vcf (with CRLF and in UTF-8) and inspected on my iOS/macOS devices. However, the display of the file has many issues.
Non-ASCII characters are not decoded correctly.
The labels are all wrong.
The URI scheme is prepended to the phone number.
It is as if vCard 4.0 is not supported at all. Or did I make any mistakes?
The screenshot is attached below.
Like you suggested, it looks to me like the client does not support vCard version 4. For example, URI-formatted telephone numbers are only supported by version 4, which might explain why it is not rendering the phone number properly. Try using a version 3 vCard.
Your ADR property is formatted strangely. I might try putting it all on one line to see if that makes any difference. If your intent is to make use of line folding, each additional line must be prefixed with a single space according to the RFC. You are using two spaces.
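The folding rule can be sketched in a few lines of Python. This is a simplified illustration rather than a full vCard writer: unfolding removes each CRLF together with the single space or tab that follows it, and folding breaks a logical line so that no physical line exceeds 75 octets, without splitting a multi-byte UTF-8 character. The function names are made up for the example.

```python
import re

def unfold(text):
    """Remove every CRLF that is immediately followed by one space or tab."""
    return re.sub(r"\r\n[ \t]", "", text)

def fold(line, limit=75):
    """Fold one logical line so no physical line is longer than `limit` octets."""
    pieces, current, size = [], "", 0
    for ch in line:
        width = len(ch.encode("utf-8"))  # count octets, not characters
        if size + width > limit:
            pieces.append(current)
            current, size = " ", 1  # continuation line starts with ONE space
        current += ch
        size += width
    pieces.append(current)
    return "\r\n".join(pieces)

# The ADR property from the question as one logical line.
adr = ('ADR;GEO="geo:46.975308,0.699597";LABEL="Headquarters":'
       ';;29 Rue Descartes;;Descartes;37160;France')
folded = fold(adr)
print(folded)
```

Because only one whitespace character is removed per fold, a continuation line that starts with two spaces leaves a stray space inside the unfolded value.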
I have a WCF service that returns a string response containing &amp;, &lt; and &gt;. For example:
<response>&amp;</response>
Actually, I'm sending the '&' char, but it is encoded for some reason.
Instead, I would like to send the decoded response. The response I want is: <response>&</response>
Could someone suggest how to achieve this?
Thanks.
The XML Serializer has to encode special characters in order to generate a well-formed XML document.
According to the XML specification, the ampersand character must not appear in its literal form in element content or attribute values (it is only allowed literally inside CDATA sections, comments and processing instructions), hence the encoding.
The consuming application should be aware of this already, and will know to decode the &amp; back to an ampersand. However, if you are looking at your response in an XML-based tool such as SoapUI, you'll continue to see the symbol in its encoded form.
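The round trip is easy to see outside of WCF. The sketch below uses Python's standard xml.etree.ElementTree purely as an illustration of what any conforming XML serializer and parser do; it is not the WCF serializer itself.

```python
import xml.etree.ElementTree as ET

# Build <response> with a literal ampersand as its text.
elem = ET.Element("response")
elem.text = "&"

# Serializing escapes the ampersand so the document stays well-formed.
wire = ET.tostring(elem, encoding="unicode")
print(wire)    # <response>&amp;</response>

# Parsing on the consumer side restores the original character.
parsed = ET.fromstring(wire).text
print(parsed)  # &
```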
You can use
System.Web.HttpUtility.HtmlDecode()
on your particular data contract property or on the entire response string. Alternatively, you can use the method in the article below, which is not very developer-friendly but may be useful if you have lots of special characters in your response.
http://seroter.wordpress.com/2007/11/09/xml-web-services-and-special-characters/
In case of software data flow control, we use xon and xoff (0x11 and 0x13) standard characters to pause and resume transmission. But if we want to send binary data which contains characters which match with the ascii value of xon and xoff, what character set should we use to send xon or xoff ?
A simple solution is to use Base64 encoding, which is available in Python's standard library:
base64.b64encode(yourData) - encode
base64.b64decode(yourData) - decode
It adds overhead (roughly 33%), but the data on the wire consists only of printable characters. Note that HDLC itself does not use Base64: it keeps its flag sequence out of the payload with bit stuffing (or byte stuffing in asynchronous framing), which is another option for you.
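A quick way to convince yourself that this works: encode every possible byte value and check that neither XON (0x11) nor XOFF (0x13) shows up in the result. The payload here is just test data chosen for the example.

```python
import base64

XON, XOFF = 0x11, 0x13

# Worst-case payload: every byte value once, including XON and XOFF.
payload = bytes(range(256))

# Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=' padding.
encoded = base64.b64encode(payload)

assert XON not in encoded and XOFF not in encoded
assert base64.b64decode(encoded) == payload
print(len(payload), "bytes ->", len(encoded), "bytes on the wire")
```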
Using software handshaking precludes the sending of binary data.
Short of doing something esoteric (sending 9 bits/byte instead of 8, which is very non-standard), there is no way to distinguish 2 of the 256 possible binary data values from the 2 codes selected for use as XON/XOFF.
There are various protocols that attempt to deal with this. They all encode the "binary data" into something efficient but not a one-to-one mapping. One can use escape codes, compression, data packets, etc. Of course, both ends of the communication need to know how to encode/decode. This often limits your choices. If in doubt, start with Binary-to-text encoding as it tends to be easier to debug. http://en.wikipedia.org/wiki/Binary-to-text_encoding
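One of the options mentioned above, escape codes, can be sketched as simple byte stuffing. The scheme below borrows the convention used by PPP-style framing (escape byte 0x7D, then the original byte XOR 0x20); the choice of escape byte and the function names are assumptions for this example, and both ends must agree on them.

```python
XON, XOFF, ESC = 0x11, 0x13, 0x7D
RESERVED = {XON, XOFF, ESC}

def stuff(payload):
    """Replace each reserved byte with ESC followed by (byte XOR 0x20)."""
    out = bytearray()
    for b in payload:
        if b in RESERVED:
            out += bytes((ESC, b ^ 0x20))
        else:
            out.append(b)
    return bytes(out)

def unstuff(data):
    """Reverse stuff(): a byte following ESC is XORed with 0x20 again."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == ESC:
            nxt = next(it, None)
            if nxt is None:
                raise ValueError("truncated escape sequence")
            b = nxt ^ 0x20
        out.append(b)
    return bytes(out)

payload = bytes(range(256))
stuffed = stuff(payload)
assert XON not in stuffed and XOFF not in stuffed
assert unstuff(stuffed) == payload
```

Unlike Base64, the overhead depends on the data: only bytes that collide with a reserved value cost an extra byte.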
To be able to use those two special characters as control characters, you have to make sure they do not occur in the payload data. One way to do that is to encode the payload with a reduced alphabet that does not include the special characters. The binary-to-text encodings mentioned in a parallel answer would do the job, but if a low overhead that does not depend on the distribution of input bytes is critical, then the escapeless encoding may help.
I am building an API endpoint that accepts DateTime as a parameter.
It is recommended not to use the : character as part of the URI, so I can't simply use the ISO 8601 format.
So far I have considered two formats:
A) Exclamation mark as minute delimiter:
http://api.example.com/resource/2013-08-29T12!15
Looks unnatural and even with clear documentation, API consumers are bound to make mistakes.
B) URI segment per DateTime part:
http://api.example.com/resource/2013/08/29/12/15
Looks unreadable. Also, once I add further numeric parameters - it will become incomprehensible!
Is there a standard or convention for representing date/time in URIs?
I'd use the data interchange standard format.
Check this: http://en.wikipedia.org/wiki/ISO_8601
You can use : in URI paths.
The colon is a reserved character, but it has no delimiting role in the path segment. So the following should apply:
If a reserved character is found in a URI component and no delimiting role is known for that character, then it must be interpreted as representing the data octet corresponding to that character's encoding in US-ASCII.
There is only one exception for relative-path references:
A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.
But note that some encoding libraries might percent-encode the colon anyway.
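Python's urllib.parse is one example of such a library. The sketch below shows the colon being percent-encoded by default, how to keep it literal, and that either form decodes back to the same ISO 8601 string. The timestamp is the one from the question; the variable names are only for illustration.

```python
from datetime import datetime
from urllib.parse import quote, unquote

stamp = "2013-08-29T12:15"

# By default quote() treats only "/" as safe, so ":" becomes %3A.
encoded_default = quote(stamp)

# Declaring ":" as safe keeps the colon literal in the path segment.
encoded_literal = quote(stamp, safe=":")

print(encoded_default)   # 2013-08-29T12%3A15
print(encoded_literal)   # 2013-08-29T12:15

# A server that percent-decodes the segment sees the same value either way.
for segment in (encoded_default, encoded_literal):
    assert datetime.fromisoformat(unquote(segment)) == datetime(2013, 8, 29, 12, 15)
```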