How to write a "Hello, World" CGI with Rebol 3?

Let's start with something simple: a form with a field which gets echoed.

At the time of this writing (2013-01), Rebol 3 still lacks the few CGI-supporting functions which were bundled with Rebol 2. However, if you are fine with hacking up the missing CGI support yourself, you can still get going right away.
Before we start, you need to store the R3 binary on the machine on which you want to run your CGI, and you need to know the full path to where you stored it (for simplicity's sake). The following examples assume a Unix-style machine with the R3 binary in /usr/local/bin/rebol3.
Let's start with something even simpler than you requested: a CGI that just sends a "Hello, World!" page:
#!/usr/local/bin/rebol3 -cs
REBOL []
prin [
"Content-type: text/html" crlf
crlf
<!doctype html>
<title> "Rebol 3 CGI Sample: Hello" </title>
"Hello, World!"
]
This is identical to what you'd write in R2.
Onward to something slightly more interesting: reading and parsing an HTML form submission, as you requested.
For this we need to know two things about CGI: submitted data is passed as standard input to the CGI; other CGI-specific information is passed from the webserver via environment variables. We can access the input data in R3 via the system/ports/input port, and read environment variables by using the get-env native.
Let's embed the HTML form itself into the CGI, and do a mode switch within the CGI: if no data was submitted, show the HTML form; if data was submitted, process it and show an appropriate response. We can do that by writing a form that submits data via HTTP method POST, and then checking within the CGI if it was invoked via HTTP method GET (no data) or POST (form data). The method a CGI script was invoked with is available via the REQUEST_METHOD environment variable.
With all that said, here's the full script without further ado:
#!/usr/local/bin/rebol3 -cs
REBOL []
handle-get: function [] [
prin [
"Content-type: text/html" crlf
crlf
<!doctype html>
<title> "Rebol 3 CGI Sample: Form" </title>
<form method="POST">
"Your name:"
<input type="text" name="field">
<input type="submit">
</form>
]
]
handle-post: function [] [
data: to string! read system/ports/input
fields: parse data "&="
value: dehex select fields "field"
prin [
"Content-type: text/html" crlf
crlf
<!doctype html>
<title> "Rebol 3 CGI Sample: Response" </title>
"Hello," (join value "!")
]
]
main: does [
switch get-env "REQUEST_METHOD" [
"GET" [handle-get]
"POST" [handle-post]
]
]
main
The final piece to understanding this script is how to actually parse HTML form data sent to the CGI. Rebol 2 had a decode-cgi helper function for this, which Rebol 3 currently lacks.
However, for basic forms, it suffices to know that CGI data is sent in an encoding that separates fields with & and each field's name and value with =; everything is URL-encoded. So if we submit the form embedded above with a value of "Charlie", the CGI will receive field=Charlie as input. Submitting "Foo Bar" sends "field=Foo%20Bar". So, again: for basic forms, the combination of parse ... "&=" (for splitting up fields and field names and values) and dehex (for decoding the URL-encoding) as shown above will suffice.
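For comparison, the same decoding rule can be sketched outside Rebol. The following is a minimal Python illustration, not part of the original script; the function name parse_form is hypothetical. It splits on & and =, then URL-decodes both sides. It uses unquote_plus because browsers commonly send a space in form data as + rather than %20, and that function decodes both forms.

```python
from urllib.parse import unquote_plus


def parse_form(body: str) -> dict:
    """Decode a basic URL-encoded form body into a name -> value dict."""
    fields = {}
    for pair in body.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing "&"
        # Split only on the first "=", so values may contain "=" themselves.
        name, _, value = pair.partition("=")
        fields[unquote_plus(name)] = unquote_plus(value)
    return fields


print(parse_form("field=Foo%20Bar"))
```

Repeated field names overwrite each other in this sketch, which is enough for the single-field form above but not for multi-select inputs.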

Related

How to create an Elasticsearch document via URL (HTTP)

I am new to Elasticsearch and am trying to find a way to create a document via the URL (HTTP API). I have tried the options below, but none of them worked.
http://Myserver:9200/dilbert/user/3 -d '{ "name" : "Praveena" }
http://Myserver:9200/dilbert/user/3{ "name" : "Praveena" }
http://Myserver:9200/dilbert/user/3?pretty _create name=Praveena
I expect this to add a record. Here dilbert is the index name, user is the type, and 3 is the id. This index only contains a single element called name.
The first option should work, but you must explicitly set the request type to PUT.
It seems to me that you are using curl to insert data. If you send data to the server with the -d option, curl issues a POST request by default. But you specify the id of the document in the URL, so Elasticsearch expects a PUT request (see the official documentation). You can use POST requests only in the case of automatic ID generation.
So your request may look as follows:
curl -X PUT http://Myserver:9200/dilbert/user/3 -d '{ "name" : "Praveena" }'
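The same point about the default method can be illustrated with Python's standard library. This is only a sketch: the helper name build_put_request is hypothetical, the Content-Type header is an assumption not mentioned above, and the request is built but never sent. Like curl -d, a urllib request that carries a body defaults to POST unless the method is set explicitly.

```python
import json
import urllib.request


def build_put_request(url: str, document: dict) -> urllib.request.Request:
    """Build (but do not send) a PUT request carrying a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",  # without this, a request with data defaults to POST
        headers={"Content-Type": "application/json"},  # assumed header
    )


req = build_put_request("http://Myserver:9200/dilbert/user/3", {"name": "Praveena"})
print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing req to urllib.request.urlopen, which is left out here because it needs a reachable server.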

Debug and avoid periodic REBOL2 error, that try[] does not(?) catch?

I have run into an apparently uncatchable error while toying around with Rebol/Core (278-3-1) to make a kind of web server that serves a static text containing a redirect link to a new service location.
The specific location of the error appears to be in example code written by Carl Sassenrath himself, back in 2006, so I'm kind of baffled that there could be an undetected error after all these years.
I have three of these scripts running simultaneously, monitoring three individual ports. Essentially the script works as it should: when accessed repeatedly with multiple browsers at once (on all parallel scripts) it appears to be pretty stable, but one after another they fail. Sometimes after 2 minutes, sometimes after 20 minutes, and after adding the print statements sometimes even after 60 minutes, but eventually they fail like this:
** Script Error: Out of range or past end
** Where: forever
** Near: not empty? request: first http-port
I've tried wrapping just about every part of the program loop in a try[][exception], but the error still occurs. Unfortunately my search-fu appears to be weak this time of year, as I haven't found anything that could explain the problem.
The code is a cut down version of Carl Sassenrath's Tiny Web Server, slightly modified to bind to a specific IP, and to emit HTML instead of loading files:
REBOL [title: "TestMovedServer"]
AppName: "Test"
NewSite: "http://test.myserver.org"
listen-port: open/lines tcp://:81 browse http://10.100.44.6?
buffer: make string! 1024 ; will auto-expand if needed
forever [
http-port: first wait listen-port
clear buffer
while [not empty? request: first http-port][
print request
repend buffer [request newline]
print "----------"
]
repend buffer ["Address: " http-port/host newline]
print buffer
Location: ""
mime: "text/html"
parse buffer ["get" ["http" | "/ " | copy Location to " "]]
data: rejoin [{
<HTML><HEAD><TITLE>Site Relocated</TITLE></HEAD>
<BODY><CENTER><BR><BR><BR><BR><BR><BR>
<H1>} AppName { have moved to } NewSite {</H1>
<BR><BR><BR>Please update the link you came from.
<BR><BR><BR><BR><BR>(Continue directly to the requested page)
</CENTER></BODY></HTML>
}]
insert data rejoin ["HTTP/1.0 200 OK^/Content-type: " mime "^/^/"]
write-io http-port data length? data
close http-port
print "============"
]
I'm looking forward to seeing what you guys make of this!
You get an error when trying to read from a closed connection. This seems to work.
n: 0
forever [
http-port: first wait listen-port
clear buffer
if attempt [all [request: first http-port not empty? request]] [
until [
print request
repend buffer [request newline]
print "----------"
any [not request: first http-port empty? request]
]
repend buffer ["Address: " http-port/host newline]
print buffer
Location: ""
mime: "text/html"
parse buffer ["get" ["http" | "/ " | copy Location to " "]]
data: rejoin [{
<HTML><HEAD><TITLE>Site Relocated</TITLE></HEAD>
<BODY><CENTER><BR><BR><BR><BR><BR><BR>
<H1>} AppName n: n + 1 { has moved to } NewSite {</H1>
<BR><BR><BR>Please update the link you came from.
<BR><BR><BR><BR><BR>(Continue directly to the requested page)
</CENTER></BODY></HTML>
}]
insert data rejoin ["HTTP/1.0 200 OK^/Content-type: " mime "^/^/"]
write-io http-port data length? data
]
attempt [close http-port]
print "============"
]
Let us see the documentation for empty?
Summary:
Returns TRUE if a series is at its tail.
Usage:
empty? series
Arguments:
series - The series argument. (must be: series port bitset)
So empty? requires a series, port, or bitset argument. Your variable (request) receives one of these as long as the connection to the port is open, and empty? can then determine whether it is at its tail.
When the connection is closed or interrupted, your variable receives nothing; instead there is an access error on the port. An error has no tail, so empty? cannot handle it and the script stops with an error.
sqlab has replaced the bare empty? test with one wrapped in attempt:
if attempt [all [request: first http-port not empty? request]]
The ATTEMPT function is a shortcut for the frequent case of:
error? try [block]
With all, he guards against an error as well as against none.
ATTEMPT returns the result of the block if an error did not occur. If an error did occur, a NONE is returned.
Likewise, with until and
any [not request: first http-port empty? request]
he guards against both cases.
That is why his code works.
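The guard pattern described above (turn a read error into a "nothing" value, then stop on either nothing or an empty line) can be sketched in Python for readers less familiar with Rebol. This is a loose analogy, not a translation of the Rebol code: the names attempt and read_request are hypothetical, and read_line stands in for first http-port.

```python
def attempt(thunk):
    """Return thunk()'s result, or None if it raises (loosely like ATTEMPT)."""
    try:
        return thunk()
    except Exception:
        return None


def read_request(read_line):
    """Collect request lines until an empty line, a missing line, or an error."""
    lines = []
    while True:
        line = attempt(read_line)
        if not line:  # None (error or nothing read) or "" (end of headers)
            break
        lines.append(line)
    return lines
```

Both stop conditions are checked in one place, which mirrors the all and any guards in the Rebol answer: a closed connection simply ends the loop with whatever was collected so far.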

Flatten FDF / XFDF forms to PDF in PHP with utf-8 characters

My scenario:
A PDF template with formfields: template.pdf
An XFDF file that contains the data to be filled in: fieldData.xfdf
Now I need to have these two files combined & flattened.
pdftk does the job easily within php:
exec("pdftk template.pdf fill_form fieldData.xfdf output flatFile.pdf flatten");
Unfortunately this does not work with full utf-8 support.
For example: Cyrillic and Greek letters get scrambled. I used Arial for this, with a Unicode character set.
How can I flatten my Unicode files?
Is there any other PDF tool that offers Unicode support?
Does pdftk have a Unicode switch that I am missing?
EDIT 1: As this question has not been solved for more than 9 months, I decided to start a bounty for it. In case there are options to sponsor a feature or a bugfix in pdftk, I'd be glad to donate.
EDIT 2: I am not working on this project anymore, so I cannot verify new answers. If anyone has a similar problem, I would be glad if they could verify the answers in my place.
I found that when I used Jon's template but built the XML with DOMDocument, the numeric encoding was handled for me and worked well. My slight variation is below:
$xml = new DOMDocument( '1.0', 'UTF-8' );
$rootNode = $xml->createElement( 'xfdf' );
$rootNode->setAttribute( 'xmlns', 'http://ns.adobe.com/xfdf/' );
$rootNode->setAttribute( 'xml:space', 'preserve' );
$xml->appendChild( $rootNode );
$fieldsNode = $xml->createElement( 'fields' );
$rootNode->appendChild( $fieldsNode );
foreach ( $fields as $field => $value )
{
$fieldNode = $xml->createElement( 'field' );
$fieldNode->setAttribute( 'name', $field );
$fieldsNode->appendChild( $fieldNode );
$valueNode = $xml->createElement( 'value' );
$valueNode->appendChild( $xml->createTextNode( $value ) );
$fieldNode->appendChild( $valueNode );
}
$xml->save( $file );
You could try the trial version of http://www.adobe.com/products/livecycle/designer/ and see what PDF files it generates.
Another commercial tool you could try is http://www.appligent.com/fdfmerge. See page 16 in http://146.145.110.1/docs/userguide/FDFMergeUserGuide.pdf for how it handles XFDF with UTF-8.
I also had a look at the XFDF specification http://partners.adobe.com/public/developer/en/xml/xfdf_2.0.pdf
On page 12 it states:
Although XFDF is encoded in UTF-8, double byte characters are encoded as character references when exported from Acrobat.
For example, the Japanese double byte characters あ, い, and う are exported to XFDF using three character references. Here is an example of double byte characters in a form field:
...
<fields>
<field name="Text1">
<value>Here are 3 UTF-8 double byte
characters: あいう
</value>
</field>
</fields> ...
I looked through pdftk-1.44-dist/java/com/lowagie/text/pdf/XfdfReader.java. It doesn't seem to do anything special with the input.
Maybe pdftk will do what you want if you encode the non-ASCII characters as character references in your XFDF input.
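To make the idea of character references concrete, here is a minimal Python sketch that replaces every non-ASCII character with a decimal reference. The function name to_char_refs is hypothetical, and the sketch does not escape XML-special characters such as & or <, which a real XFDF writer would also need to handle. Whether pdftk honors such references is a separate question; other reports in this thread say version 1.44 does not.

```python
def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal character reference."""
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


print(to_char_refs("Poznań"))  # prints: Pozna&#324;
```

ASCII text passes through unchanged, so the function can be applied to every field value without checking its contents first.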
Using pdftk 1.44 on a Win7 machine, I encounter the same problems with XFDF files, whereas FDF works fine. I made an XFDF file without any special characters (only ANSI) but pdftk crashed again. I mailed the developer. Unfortunately there has been no answer so far.
Unfortunately, UTF-8 character encoding works with neither decimal nor hexadecimal references for non-ASCII characters in the source .xfdf file (pdftk v1.44).
I made some progress on this. Starting with code from http://koivi.com/fill-pdf-form-fields/, I modified the value encoding to output numeric codes for any characters outside the ascii range.
Now with pitulski's special strings:
Poznań Śródmieście Ćwiartka Ósma outputs Pozna ródmiecie wiartka Ósma with some box shapes superimposed
ęóąśłżźćńĘÓĄŚŁŻŹĆŃ outputs óÓ with more box shapes. I think it may be that the box shapes are characters my server doesn't recognize.
I tried it with some French characters: ùûüÿ€’“”«»àâæçéèêëïôœÙÛÜŸÀÂÆÇÉÈÊËÏÎÔ and they all came out OK, but some of them were overlapping.
--edit-- I just tried entering these manually into the form and got the same result minus the box shapes (using Evince). I then tried with a different form (created by someone else) - after entering ęóąśłżźćńĘÓĄŚŁŻŹĆŃ, ółÓŁ was displayed. It looks like it depends which characters are included in the document's embedded fonts.
/*
KOIVI HTML Form to FDF Parser for PHP (C) 2004 Justin Koivisto
Version 1.2.?
Last Modified: 2013/01/17 - Jon Hulka(jon dot hulka at gmail dot com)
- changed character encoding, all non-ascii characters get encoded as numeric character references
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.
This library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this library; if not, write to the Free Software Foundation,
Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Full license agreement notice can be found in the LICENSE file contained
within this distribution package.
Justin Koivisto
justin dot koivisto at gmail dot com
http://koivi.com
*/
/**
* createXFDF
*
* Takes values passed via associative array and generates XFDF file format
* with that data for the pdf address supplied.
*
* @param string $file The pdf file - url or file path accepted
* @param array $info data to use in key/value pairs no more than 2 dimensions
* @param string $enc default UTF-8, match server output: default_charset in php.ini
* @return string The XFDF data for acrobat reader to use in the pdf form file
*/
function createXFDF($file,$info,$enc='UTF-8'){
$data=
'<?xml version="1.0" encoding="'.$enc.'"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>';
foreach($info as $field => $val){
$data.='
<field name="'.$field.'">';
if(is_array($val)){
foreach($val as $opt)
//2013.01.17 - Jon Hulka - all non-ascii characters get character references
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($opt),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($opt,ENT_COMPAT,$enc).'</value>'."\n";
}else{
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($val),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($val,ENT_COMPAT,$enc).'</value>'."\n";
}
$data.='
</field>';
}
$data.='
</fields>
<ids original="'.md5($file).'" modified="'.time().'" />
<f href="'.$file.'" />
</xfdf>';
return $data;
}
While pdftk doesn't appear to support UTF-8 in the FDF file, I found that with
iconv -f utf-8 -t ISO_8859-1
in the pipeline converting that FDF file to ISO-Latin-1, then at least those characters that are in the Latin-1 code page will still be represented properly.
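The effect of converting to Latin-1 can be previewed with a short Python sketch. The function name to_latin1 is hypothetical. Replacing unconvertible characters with "?" is a choice made here purely for illustration and is not a claim about what iconv does with them.

```python
def to_latin1(text: str) -> bytes:
    """Encode text as ISO-8859-1, substituting "?" for unsupported characters."""
    return text.encode("latin-1", errors="replace")


print(to_latin1("Ósma"))    # Ó exists in Latin-1, so it survives
print(to_latin1("Poznań"))  # ń does not exist in Latin-1
```

This shows why the approach helps for Western European text but cannot cover the Polish, Cyrillic, or Greek samples discussed elsewhere in the thread.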
Which version of pdftk are you using?
I tried the same thing with Polish characters (UTF-8).
It does not work for me.
pdftk.exe, libiconv2.dll from: http://www.pdflabs.com/docs/install-pdftk/
Windows 7, cmd, file.pdf + file.fdf -> new.pdf
pdftk file.pdf fill_form file.xfdf output new.pdf flatten
Unhandled Java Exception:
java.lang.NoClassDefFoundError: gnu.gcj.convert.Input_UTF8 not found in [file:.\, core:/]
at 0x005a3abe (Unknown Source)
at 0x005a3fb2 (Unknown Source)
at 0x006119f4 (Unknown Source)
at 0x00649ee4 (Unknown Source)
at 0x005b4c44 (Unknown Source)
at 0x005470a9 (Unknown Source)
at 0x00549c52 (Unknown Source)
at 0x0059d348 (Unknown Source)
at 0x007323c9 (Unknown Source)
at 0x0054715a (Unknown Source)
at 0x00562349 (Unknown Source)
With an FDF file with the same content, however, the command ran without crashing.
But the characters in new.pdf are wrong.
pdftk file.pdf fill_form file.fdf output new.pdf flatten
---FDF---
%FDF-1.2
%âãÏÓ
1 0 obj<</FDF<</F(file.pdf)
/Fields[
<</T(Miejsce)/V(666 Poznań Śródmieście Ćwiartka Ósma)>>
<</T(Nr)/V(ęóąśłżźćńĘÓĄŚŁŻŹĆŃ)>>
]>>>>
endobj
trailer
<</Root 1 0 R>>
%%EOF
---XFDF---
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="file.pdf"/>
<fields>
<field name="Miejsce">
<value>666 Poznań Śródmieście Ćwiartka Ósma</value>
</field>
<field name="Nr">
<value>ęóąśłżźćńĘÓĄŚŁŻŹĆŃ</value>
</field>
</fields>
</xfdf>
---PDF---
Miejsce: 666 PoznaÅ— ÅıródmieÅłcie ăwiartka Ãfisma
Nr: ÄŽÃ³Ä–ÅłÅ‡Å¼ÅºÄ⁄Å—ÄŸÃfiÄ—ÅıņŻŹăÅ
You can introduce utf-8 characters by giving their unicode code in octal with \ddd
To solve this, I wrote PdfFormFillerUTF-8: http://sourceforge.net/projects/pdfformfiller2/
There is a drop-in replacement for the pdftk tool,
Mcpdf: https://github.com/m-click/mcpdf
that solves unicode issues when filling forms. Works for me with CP1250 characters (Central Europe).
From project page:
the following command fills in form data from DATA.xfdf into FORM.pdf
and writes the result to RESULT.pdf. It also flattens the document to
prevent further editing:
java -jar mcpdf.jar FORM.pdf fill_form - output - flatten < DATA.xfdf > RESULT.pdf
This corresponds exactly to the usual PDFtk command:
pdftk FORM.pdf fill_form - output - flatten < DATA.xfdf > RESULT.pdf
Note that you need to have JRE installed.
I have managed to make it work with pdftk by creating an XFDF file with UTF-8 encoding.
It took several tries, but what made it work as expected was adding 'need_appearances'.
Here is an example:
pdftk source.pdf fill_form data.xfdf output output.pdf need_appearances
I struggled with this issue for a long time, and I have finally found the solution!
So, let's start.
Download and install the latest version of pdftk (shown here as a Dockerfile step):
# PDFTK
RUN apk add openjdk8 \
&& cd /tmp \
&& wget https://gitlab.com/pdftk-java/pdftk/-/jobs/1507074845/artifacts/raw/build/libs/pdftk-all.jar \
&& mv pdftk-all.jar pdftk.jar \
&& echo '#!/usr/bin/env bash' > pdftk \
&& echo 'java -jar "$0.jar" "$@"' >> pdftk \
&& chmod 775 pdftk* \
&& mv pdftk* /usr/local/bin \
&& pdftk -version
Open your PDF form in Adobe Acrobat Reader and look at the field options to find out which font the fields use, for example Helvetica. Download this font.
Fill the form with the flatten option:
/usr/local/bin/pdftk A=form.pdf fill_form xfdf.xml output out.pdf drop_xfa need_appearances flatten replacement_font /path/to/font/HelveticaRegular.ttf
xfdf.xml example:
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>
<field name="Check Box 136">
<value>Your value | Значение (Cyrillic)</value>
</field>
</fields>
</xfdf>
Enjoy :)
pdftk supports encoding in UTF-16BE. It's not that difficult to convert from UTF-8 to UTF-16BE.
See: Weird characters when filling PDF with PDFTk
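The conversion itself is short. The following Python sketch encodes a string as UTF-16BE and prepends the byte order mark FE FF, which is how PDF text strings mark themselves as UTF-16BE. The function name is hypothetical, and whether a particular pdftk build accepts such values in an FDF file is an assumption that would need testing. Escaping of bytes that are special inside FDF literal strings (parentheses and backslashes) is not handled here.

```python
def to_utf16be_with_bom(text: str) -> bytes:
    """Encode text as UTF-16BE, prefixed with the byte order mark FE FF."""
    return b"\xfe\xff" + text.encode("utf-16-be")


print(to_utf16be_with_bom("ń").hex())  # prints: feff0144
```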

Showing a long running shell process with Apache

I have a CGI script which takes about 1 minute to run. Right now Apache only returns results to the browser once the process has finished.
How can I make it show the output like it was run on a terminal?
Here is an example which demonstrates the problem.
I want to see the numbers 1 to 5 appear as they are printed.
I had to disable mod_deflate to get chunked mode working with Apache.
I did not find another way for my CGI to disable the automatic gzip encoding.
There are several factors at play here. To eliminate a few issues, Apache and bash are not buffering any of the output. You can verify with this script:
#!/bin/sh
cat <<END
Content-Type: text/plain

END
for i in $(seq 1 10)
do
echo $i
sleep 1
done
Stick this somewhere that Apache is configured to execute CGI scripts, and test with netcat:
$ nc localhost 80
GET /cgi-bin/chunkit.cgi HTTP/1.1
Host: localhost
HTTP/1.1 200 OK
Date: Tue, 24 Aug 2010 23:26:24 GMT
Server: Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.7l DAV/2
Transfer-Encoding: chunked
Content-Type: text/plain
2
1
2
2
2
3
2
4
2
5
2
6
2
7
2
8
2
9
3
10
0
When I do this, I see in netcat each number appearing once per second, as intended.
Note that my version of Apache, at least, applies the chunked transfer encoding automatically, presumably because I didn't include a Content-Length; if you return the Transfer-Encoding: chunked header yourself, then you need to encode the output of your script in the chunked transfer encoding. That's pretty easy, even in a shell script:
chunk () {
printf '%x\r\n' "${#1}" # Length of the chunk in hex, CRLF
printf '%s\r\n' "$1" # Chunk itself, CRLF
}
chunk $'1\n' # This is a Bash-ism, since it's pretty hard to get a newline
chunk $'2\n' # character portably.
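The same framing can be sketched in Python, which makes the byte-level layout explicit. This is an illustration only; the names chunk and LAST_CHUNK are hypothetical. Each chunk is the payload length in hexadecimal, a CRLF, the payload, and another CRLF; a zero-length chunk ends the body. The length here counts bytes, which is what the encoding requires.

```python
def chunk(data: bytes) -> bytes:
    """Frame one payload in HTTP chunked transfer encoding."""
    size_line = f"{len(data):x}".encode("ascii")  # payload length in hex
    return size_line + b"\r\n" + data + b"\r\n"


# A zero-length chunk followed by an empty line terminates the body.
LAST_CHUNK = b"0\r\n\r\n"

body = chunk(b"1\n") + chunk(b"2\n") + LAST_CHUNK
print(body)
```

The first chunk comes out as the size line 2 followed by 1 and a newline, matching the pattern visible in the netcat transcript above.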
However, serve this to a browser, and you'll get varying results depending on the browser. On my system, Mac OS X 10.5.8, I see different behaviors between my browsers. In Safari, Chrome, and Firefox 4 beta, I don't start seeing output until I've sent somewhere around 1000 characters (I would guess 1024 including the headers, or something like that, but I haven't narrowed it down to the exact behavior). In Firefox 3.6, it starts displaying immediately.
I would guess that this delay is due to content type sniffing, or character encoding sniffing, which are in the process of being standardized. I have tried to see if I could get around the delay by specifying proper content types and character encodings, but without luck. You may have to send some padding data (which would be pretty easy to do invisibly if you use HTML instead of plain text), to get beyond that initial buffer.
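One way to produce such invisible padding is an HTML comment of a chosen size, sent right after the headers. The Python sketch below is only an illustration: the function name html_padding is hypothetical, and the 1024-byte default simply reuses the rough threshold guessed at above rather than any documented browser limit.

```python
def html_padding(min_bytes: int = 1024) -> str:
    """Return an HTML comment of min_bytes characters, used as invisible filler."""
    wrapper_length = len("<!--  -->")  # comment delimiters plus two spaces
    filler = " " * max(0, min_bytes - wrapper_length)
    return f"<!-- {filler} -->"


print(len(html_padding()))
```

Because the filler is plain spaces, its length in bytes equals its length in characters, so the output can be written before the first real content without affecting what is rendered.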
Once you start streaming HTML instead of plain text, the structure of your HTML matters too. Some content can be displayed progressively, while some cannot. For instance, streaming down <div>s into the body, with no styling, works fine, and can display progressively as it arrives. If you try to open a <pre> tag, and just stream content into that, Webkit based browsers will wait until they see the close tag to try to lay that out, while Firefox is happy to display it progressively. I don't know all of the corner cases; you'll have to experiment to see what works for you.
Anyhow, I hope this helps you get started. Let me know if you have any more questions!

What's the proper syntax in Rebol to execute an in-memory block of code including the header

This syntax doesn't work:
>> do load/header {rebol [Title: "Hello World"] Print System/Header/Script/Title }
** Script Error: Invalid path value: Header
** Near: Print System/Header/Script/Title
I want to get the meta-data in header.
My goal is mostly to be able to copy a whole Rebol source, including the header, to the clipboard and execute it in the console by doing something like do read clipboard://. That doesn't work if I include the header, and I can't strip the header since I need it.
Rewritten in response to comment.
Use load/header/next to create a two-item block: the script header followed by the script content:
loaded: load/header/next {rebol [Title: "Hello World"] Print "this is my script"^/a: 99 + 5 print a}
probe loaded/1 ;; shows the header
do loaded/2 ;; executes the script