Exim filter - Can a base64 or quoted-printable encoded message body be automatically decoded like the headers? - cpanel

I am attempting to filter unwanted incoming emails in my cPanel environment (with Exim as Mail Transfer Agent) based on the message body contents.
Often the message body is base64 or quoted-printable encoded (Content-Transfer-Encoding: quoted-printable or Content-Transfer-Encoding: base64), and in such cases
"$message_body contains <string>"
"$message_body matches <regexp>"
conditions fail because (I think) no decoding of the encoded message body takes place.
I read in The Exim Specification that for headers Exim decodes base64 or quoted-printable header text (an extract below):
$header_<header name>: or $h_<header name>:
$bheader_<header name>: or $bh_<header name>:
. bheader removes leading and trailing white space, and then decodes base64 or quoted-printable
MIME “words” within the header text, but does no character set translation.
. header tries to translate the string as decoded by bheader to a standard character set.
Can Exim decode base64 or quoted-printable encoded message body too? Can that be done in a cPanel & WHM environment?
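To illustrate why a substring test against the raw body fails on such messages, here is a minimal sketch outside of Exim, using Python's standard `email` package. The sample message and the search string are made up for illustration, and this says nothing about what Exim itself can or cannot do.

```python
import base64
from email import message_from_string

# Hypothetical message whose text body is base64 encoded.
raw = (
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(b"Buy cheap pills now").decode("ascii")
    + "\n"
)

msg = message_from_string(raw)

# The payload as transmitted: still base64 text, so the phrase is not visible.
raw_body = msg.get_payload()

# decode=True reverses the Content-Transfer-Encoding and returns bytes.
decoded_body = msg.get_payload(decode=True).decode("utf-8")

print("cheap pills" in raw_body)
print("cheap pills" in decoded_body)
```

Any filter that only sees the equivalent of `raw_body` cannot match plain-text phrases in it.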

Related

Apache adds data to output of javascript file

The weirdest thing, two of my javascript files have stopped being served due to incorrect mime type from apache. All my JS files have text/javascript, but two of them get application/octet-stream.
When troubleshooting I noticed that when I connect to the web server, it outputs "31c2" before the content of the file (see image). This is not an invisible character in the actual file, verified by hexdump. I am assuming that this is the source of the incorrect mime type reporting, but where does this come from? I noticed that after the file is output, apache also adds "0" on a single line.
How do I figure out what causes this? I might add that this file was last edited in 2017 and has worked flawlessly until today or yesterday, and I can't understand why.
Here are two requests side by side to a working .js file (left) and the one that reports incorrect mime type (right). There is no .htaccess file in any parent directory either.
Strings such as 31c2 are hexadecimal numbers. Decoded, 31c2 is 12738, which is the size in bytes of the chunk of data that follows. These strings only appear when you are using HTTP/1.1, not with HTTP/0.9, HTTP/1.0 or HTTP/2.
Why do these 'HEX' encoded numbers appear?
This occurs because HTTP/1.1 uses chunked transfer-encoding. Hence, you can see the header: Transfer-Encoding: chunked.
In chunked transfer-encoding, each chunk is framed like this:
<chunk size in hex> CRLF <chunk data> CRLF
Keep in mind that CR = \r (carriage return) and LF = \n (line feed).
Now, for example, if you want to send Hello, World! to the user:
HTTP/1.1 200 OK
[CRLF]
Connection: Keep-Alive
[CRLF]
Transfer-Encoding: chunked
[CRLF]
[CRLF] # empty line: end of headers, the first chunk size follows directly
5 # chunk size: 5 in hex is 5 bytes
[CRLF]
Hello
[CRLF] # end of the first chunk
8 # chunk size: 8 in hex is 8 bytes
[CRLF]
, World!
[CRLF] # end of the second chunk
0 # a zero-size chunk means this is the end of the transfer
[CRLF]
[CRLF] # final CRLF that closes the chunked body
HTTP/1.1 uses chunked transfer-encoding to send chunks as they become ready, instead of sending all the data at once. This is especially useful for large or dynamically generated responses: the server doesn't need to calculate the size of the response in advance, which saves time (this is also why the total download size is sometimes not shown when you download from some websites).
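The framing described above can be sketched in a few lines of Python. The function names `encode_chunked` and `decode_chunked` are made up for this illustration; this is not how Apache implements it, and trailers are ignored.

```python
def encode_chunked(parts):
    """Frame each bytes object in `parts` as one HTTP/1.1 chunk."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would terminate the body early
            size_line = format(len(part), "x").encode("ascii")
            out += size_line + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating chunk, no trailers


def decode_chunked(data):
    """Reassemble the body from chunked framing (trailers are ignored)."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Drop any chunk extension after ';' before parsing the hex size.
        size_text = data[pos:line_end].split(b";")[0].decode("ascii")
        size = int(size_text, 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that closes the chunk


wire = encode_chunked([b"Hello", b", World!"])
print(wire)                  # b'5\r\nHello\r\n8\r\n, World!\r\n0\r\n\r\n'
print(decode_chunked(wire))  # b'Hello, World!'
print(int("31c2", 16))       # 12738
```

A client that speaks HTTP/1.1 removes this framing before handing the body to the application, which is why the hex sizes only show up when looking at the raw connection.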
Why is your JS file not being detected as JavaScript?
It may be a bug in Apache. You should probably add this to your .htaccess/apache2.conf/httpd.conf to solve this issue:
AddType text/javascript js

How to specify this particular header in Postman

Resource URL
GET https://<MATD_IP>/php/session.php
The following HTTP headers should be specified in the session request:
Accept: application/vnd.ve.v1.0+json
Content-Type: application/json
VE-SDK-API: Base64 encoded "user name:password" string
VE-API-Version (Optional)
I am confused about what it means to specify a Base64 encoded string. I have tried to do it, but I am failing at it. Can anybody help me with the exact header parameters by giving an example?
Thank you
You could use this in your Pre-request Script:
let base64 = Buffer.from("username:password").toString('base64')
pm.request.headers.add({key: "VE-SDK-API", value: base64})
This will convert to Base64 and then create the header with the encoded value.
It most likely means that you need to provide a base64 string for that field. Write down the credentials with a : in between. Ex:
cooluser:str0ngP@ssword
Then you encode this exact string as base64 which would give you:
Y29vbHVzZXI6c3RyMG5nUEBzc3dvcmQ=
You can encode it in a Linux terminal with echo "XXX" | base64, or search the web for a "base64 encode" tool (not recommended for real credentials, for security reasons).
You can then use it for the headers:
Accept: application/vnd.ve.v1.0+json
Content-Type: application/json
VE-SDK-API: Y29vbHVzZXI6c3RyMG5nUEBzc3dvcmQ=
VE-API-Version: 1.x
Use the -n option so that echo does not append a trailing newline, which would otherwise be encoded as part of the string:
echo -n "username:password" | base64
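The same encoding step, and the effect of a stray trailing newline, can be checked with a short Python sketch. The helper name `encode_credentials` is hypothetical, and `username` / `password` are placeholders rather than real credentials.

```python
import base64


def encode_credentials(user, password):
    """Return the Base64 form of 'user:password' for use as a header value."""
    token = f"{user}:{password}".encode("utf-8")
    return base64.b64encode(token).decode("ascii")


header_value = encode_credentials("username", "password")
print(header_value)   # dXNlcm5hbWU6cGFzc3dvcmQ=

# What `echo "username:password" | base64` (without -n) effectively encodes:
with_newline = base64.b64encode(b"username:password\n").decode("ascii")
print(with_newline)   # dXNlcm5hbWU6cGFzc3dvcmQK
```

The two results differ only at the end, which makes the mistake easy to miss, but a server comparing the decoded value would see a password ending in a newline.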

How do I encode a : in a url?

I need to send a get request where the last part of the url is a json value. I have encoded the following {"period":"600s"} to use on multiple different sites, however they all come up with the same result where the : is not decoded.
The encoded url: stickiness=%7B%22period%22%3A%22600s%22%7D.
Its result when I enter it into my browser:
So how do I encode a :?
%3A is the encoding of :. The : character is reserved in URIs, for example for designating the port number (e.g. google.com:443 explicitly selects port 443, the default HTTPS port). If you want to include a literal : in a URI, it must be percent-encoded, which is what %3A is. The URL bar does not decode it because doing so would conflict with the reserved purpose of the : character.
The colon character is not decoded in the browser because it belongs to the reserved characters that already have an explicit meaning elsewhere in URLs: in this case, separating the protocol from the hostname, and the hostname from the port.
The relevant standard is RFC 1738, page 3:
Many URL schemes reserve certain characters for a special meaning:
their appearance in the scheme-specific part of the URL has a
designated semantics. If the character corresponding to an octet is
reserved in a scheme, the octet must be encoded. The characters ";",
"/", "?", ":", "#", "=" and "&" are the characters which may be
reserved for special meaning within a scheme. No other characters may
be reserved within a scheme.
Usually a URL has the same interpretation when an octet is
represented by a character and when it encoded. However, this is not
true for reserved characters: encoding a character reserved for a
particular scheme may change the semantics of a URL.
Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.
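As a quick check that the encoded query value from the question is already correct, here is a small Python sketch using the standard `urllib.parse` module. The parameter name `stickiness` comes from the question; everything else is illustrative.

```python
import json
from urllib.parse import quote, unquote

# Compact JSON, matching the value from the question.
value = json.dumps({"period": "600s"}, separators=(",", ":"))

# safe="" percent-encodes every reserved character, including ':'.
encoded = quote(value, safe="")
print("stickiness=" + encoded)  # stickiness=%7B%22period%22%3A%22600s%22%7D

# The receiving server reverses the encoding and gets the original JSON back.
print(unquote(encoded))         # {"period":"600s"}
```

So the `%3A` shown in the address bar is only a display choice of the browser; the server still receives a value that decodes to a colon.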

Replacing wrongly encoded letters with SQL

I have a database with data from the internet, but some pages have the wrong encoding, so letters like ã become Ã£ and ç becomes Ã§.
What are the possibilities to fix this? I'm using PostgreSQL.
I can use replace, but do I need a separate replace for each case? I was thinking about translate, but I see that it only transforms one character into another. Is it possible to translate two characters into one? Something like: TRANSLATE(text,'Ã£|Ã§','ã|ç').
This particular problem looks like you have UTF-8 encoded data being interpreted as a single-byte character set ("ç" becoming "Ã§" suggests iso-8859-1).
You can fix these up individually with a long chain of replace(...) calls. Or you can use postgresql's own character-conversion facilities:
select convert_from(convert_to('Â£20 - garÃ§on', 'iso-8859-1'), 'utf-8')
In order, this:
Converts the string back to binary using the iso-8859-1 codec (which will just change unicode codepoints back to bytes, assuming all the codepoints are under 256)
Reinterprets that binary output as UTF-8, so sequences such as {0xc2, 0xa3} are translated to '£'
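The same two-step round trip can be sketched outside the database in Python, which makes it easy to test on a few sample strings first. The function name `fix_mojibake` is made up for this illustration, and it assumes the damage really came from reading UTF-8 as iso-8859-1; the sample strings use escapes so the damaged characters are unambiguous.

```python
def fix_mojibake(text):
    """Undo 'UTF-8 bytes read as iso-8859-1' damage, if that is what happened."""
    try:
        # Step 1: map each code point back to the single byte it came from.
        # Step 2: reinterpret those bytes as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already correct): leave it unchanged.
        return text


damaged = "\u00c2\u00a320 - gar\u00c3\u00a7on"  # what the broken text looks like
repaired = fix_mojibake(damaged)
print(repaired)                                  # £20 - garçon

# Text that is already correct is returned as-is.
print(fix_mojibake("gar\u00e7on"))               # garçon
```

The fallback branch matters when only some rows are damaged: correctly stored text usually fails the second step and is returned untouched.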
You can fix some of the characters by replacing them, but not all. By decoding the data using the wrong encoding you have already removed some information, and that is impossible to get back.
You should find out what the correct encoding is for those pages, and use that when decoding the data.
Some pages have the encoding in the response header, e.g.
Content-Type: text/html; charset=utf8
Some pages have the encoding in the HTML head, e.g.
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
If the information is not in the header you would first have to decode the page (or at least a part of it) using the ASCII encoding (which is not a problem as the meta tag contains no special characters), find out the encoding, then decode the page using the correct encoding.
PostgreSQL has a string replacement function:
replace(string text, from text, to text): Replace all occurrences in string of substring from with substring to
Example:
replace ('abcdefabcdef', 'cd', 'XX') ==> abXXefabXXef

MIME "From:" header with national characters

What is the correct format of "From:" header when From Name contains national characters and dot (.) character?
We generate (using C# Chilkat lib) this:
From: =?utf-8?Q?Micha=C5=82_from_domain.com?= <abcdef#domain.com>
(where From Name = Michał from domain.com)
This works OK in most cases. However, we encountered an email provider which marks this header as invalid and uses Return-Path header instead (which is machine-readable only).
The error is:
Illegal-Object: Syntax error in From: address found on ps11.m5r2.onet:
From: =?utf-8?Q?Micha=C5=82_from_domain.com?=<abcdef#domain.com>
^-missing end of mailbox
The provider insists that the problem is the lack of a space between the name and the email address. This is not the case on our end (see the previous code example).
That email provider has a broken MTA. Unfortunately, you have to deal with it.
You're already formatting your non-ASCII "From" personal-part as an RFC 2047 encoded-word. Since you're using Q as the encoding, you can take advantage of the flexibility in the quoted-printable encoding and encode the . as well:
From: =?utf-8?Q?Micha=C5=82_from_domain=2Ecom?= <abcdef#domain.com>
(Note that the . has been replaced by its quoted-printable encoding, =2E.)
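For reference, a conservative Q encoder that escapes everything except ASCII letters and digits (and therefore the dot as well) can be sketched in Python as follows. The helper name `q_encode_word` is hypothetical, it is not what the Chilkat library does internally, and it does not split long names into several encoded-words (RFC 2047 limits each one to 75 characters).

```python
import string
from email.header import decode_header

# Bytes that are left as-is; everything else is escaped.
SAFE_BYTES = (string.ascii_letters + string.digits).encode("ascii")


def q_encode_word(text, charset="utf-8"):
    """Encode `text` as a single RFC 2047 'Q' encoded-word."""
    encoded = []
    for byte in text.encode(charset):
        if byte in SAFE_BYTES:
            encoded.append(chr(byte))
        elif byte == 0x20:
            encoded.append("_")            # space is written as underscore
        else:
            encoded.append(f"={byte:02X}")  # e.g. '.' becomes =2E
    return f"=?{charset}?Q?{''.join(encoded)}?="


word = q_encode_word("Micha\u0142 from domain.com")
print(word)  # =?utf-8?Q?Micha=C5=82_from_domain=2Ecom?=

# Round trip through the standard library decoder.
raw_bytes, _charset = decode_header(word)[0]
print(raw_bytes.decode("utf-8"))
```

Escaping more than strictly necessary is harmless for compliant receivers and avoids characters that a strict or buggy parser might treat as address syntax.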