Consider the example:
read_verilog ./tests/simple/fsm.v
synth -flatten -top fsm_test
abc -g AND
write_aiger -ascii -symbols hoho.aag
The resulting AIGER file contains input variable clk, which is dangling.
Is it possible to avoid introducing this clock input into the AIGER file?
Thanks.
Not automatically. The following options exist:
1. Use the SystemVerilog $global_clock feature to avoid having a clock input at all: use always @($global_clock) instead of always @(posedge clk), then remove the clk input from your design.
2. Remove the clock input near the end of your synthesis script, i.e. right before calling write_aiger, run something like delete -input fsm_test/clk. This turns the clock signal into a dangling wire internal to the module. Avoid doing this before running a lot of optimization commands, or you risk Yosys optimizing away all your FFs; near the end of your script it should be fine.
3. Combine option 2 with mapping your FFs to $ff/$_FF_ cells (the kind of FF cells generated by $global_clock blocks). The advantage of this approach is that the clk wire becomes truly unused, so there is no risk of optimizations messing with your FFs because they have an undriven clock input. I've now added a dff2ff.v techmap file in commit e7a984a that simplifies this a bit.
Script for option 2:
read_verilog ./tests/simple/fsm.v
synth -flatten -top fsm_test
abc -g AND
delete -input fsm_test/clk
write_aiger -ascii -symbols hoho.aag
Script for option 3 (requires Yosys git commit e7a984a or later):
read_verilog ./tests/simple/fsm.v
hierarchy -top fsm_test
proc
techmap -map +/dff2ff.v
delete fsm_test/clk
synth -flatten
abc -g AND
write_aiger -ascii -symbols hoho.aag
I have some (complex to me) XML that I need to convert into CSV. I need absolutely every value added to the CSV for every submission. I have tried a few basic things, but I can't get past the deep nesting and the different structures in this file.
Could someone please help me with a PowerShell script that would do this? I have made a start, but I cannot get all the data out; I only get Canvas Results.
Submissions.xml is too large to post here (102KB).
$d=([xml](gc submissions.xml)).CANVASRESULTS | % {
foreach ($i in $_.CANVASRESULTS) {
$o = New-Object Object
Add-Member -InputObject $o -MemberType NoteProperty -Name Submissions -Value $_.Submission
Add-Member -InputObject $o -MemberType NoteProperty -Name Submission -Value $i
$o
}
}
$d | ConvertTo-Csv -NoTypeInformation -Delimiter ","
Any time a complex XML document has deeply nested structures and you need to migrate it into a flat file format (i.e., txt, csv, xlsx, sql), consider using XSLT to simplify the XML first. For background, XSLT is a declarative, special-purpose programming language used to style, re-format, and re-structure XML/HTML and other SGML markup documents for various end-use purposes. Aside: SQL is also a declarative, special-purpose programming language.
For most software to import XML into a flat, two-dimensional format of rows and columns, the XML must consist of repeating elements (i.e., rows/records), each with one level of children for the columns/fields:
<data>
<row>
<column1>value</column1>
<column2>value</column2>
<column3>value</column3>
...
</row>
<row>
...
</row>
</data>
Nearly every programming language provides an XSLT processor, including PowerShell, Java, C#, Perl, PHP, Python, SAS, and even VBA in your everyday MS Excel. For your complex XML, below is an example XSLT stylesheet, followed by its output. Note that I manually create the output nodes based on label values from the original XML:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="*"/>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="CanvasResult">
<Data>
<xsl:for-each select="//Responses">
<Submission>
<Fitter><xsl:value-of select="Response[contains(Label, 'Fitter Name')]/Value"/></Fitter>
<Date><xsl:value-of select="Response[Label='Date']/Value"/></Date>
<Time><xsl:value-of select="Response[Label='Time']/Value"/></Time>
<Client><xsl:value-of select="Response[Label='Client']/Value"/></Client>
<Machine><xsl:value-of select="Response[Label='Machine']/Value"/></Machine>
<Hours><xsl:value-of select="Response[Label='Hours']/Value"/></Hours>
<Signature><xsl:value-of select="Response[Label='Signature']/Value"/></Signature>
<SubmissionDate><xsl:value-of select="Response[Label='Submission Date:']/Value"/></SubmissionDate>
<SubmissionTime><xsl:value-of select="Response[Label='Submission Time:']/Value"/></SubmissionTime>
<Customer><xsl:value-of select="Response[Label='Customer:']/Value"/></Customer>
<PlantLocation><xsl:value-of select="Response[Label='Plant Location']/Value"/></PlantLocation>
<PlantType><xsl:value-of select="Response[Label='Plant Type:']/Value"/></PlantType>
<PlantID><xsl:value-of select="Response[Label='Plant ID:']/Value"/></PlantID>
<PlantHours><xsl:value-of select="Response[Label='Plant Hours:']/Value"/></PlantHours>
<RegoExpiryDate><xsl:value-of select="Response[Label='Rego Expiry Date:']/Value"/></RegoExpiryDate>
<Comments><xsl:value-of select="Response[Label='Comments:']/Value"/></Comments>
</Submission>
</xsl:for-each>
</Data>
</xsl:template>
</xsl:stylesheet>
Output
<?xml version='1.0' encoding='UTF-8'?>
<Data>
...
<Submission>
<Fitter>Damian Stewart</Fitter>
<Date/>
<Time/>
<Client/>
<Machine/>
<Hours/>
<Signature/>
<SubmissionDate>28/09/2015</SubmissionDate>
<SubmissionTime>16:30</SubmissionTime>
<Customer>Dicks Diesels</Customer>
<PlantLocation/>
<PlantType>Dozer</PlantType>
<PlantID>DZ09</PlantID>
<PlantHours>2213.6</PlantHours>
<RegoExpiryDate>05/03/2016</RegoExpiryDate>
<Comments>Moving tomorrow from Daracon BOP to KCE BOP S6A Dam
Cabbie to operate</Comments>
</Submission>
...
</Data>
From there, you can import the two-dimensional XML into a usable rows/columns format, for example an MS Access database or an MS Excel spreadsheet. You will notice gaps in the data wherever the original XML has no content for the nodes created in the XSLT. A simple SQL cleanup can render the final dataset.
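If no XSLT processor is at hand, the same flattening idea can be sketched directly in Python. This is only an illustration: it assumes the Responses/Response/Label/Value layout that the stylesheet above relies on, the sample document and the second customer name are made up, and the helper names responses_to_rows and rows_to_csv are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample that mirrors the Responses/Response/Label/Value layout
# assumed by the stylesheet; the real Submissions.xml may differ.
SAMPLE = """<CanvasResult>
  <Submissions>
    <Submission>
      <Responses>
        <Response><Label>Customer:</Label><Value>Dicks Diesels</Value></Response>
        <Response><Label>Plant ID:</Label><Value>DZ09</Value></Response>
      </Responses>
    </Submission>
    <Submission>
      <Responses>
        <Response><Label>Customer:</Label><Value>Example Pty</Value></Response>
        <Response><Label>Plant Hours:</Label><Value>2213.6</Value></Response>
      </Responses>
    </Submission>
  </Submissions>
</CanvasResult>"""


def responses_to_rows(xml_text):
    """Return one dict per <Responses> block, keyed by the response label."""
    root = ET.fromstring(xml_text)
    rows = []
    for responses in root.iter("Responses"):
        row = {}
        for response in responses.findall("Response"):
            # Normalise labels such as "Customer:" to "Customer".
            label = (response.findtext("Label") or "").strip().rstrip(":")
            row[label] = (response.findtext("Value") or "").strip()
        rows.append(row)
    return rows


def rows_to_csv(rows):
    """Write the rows as CSV, using the union of all labels as columns."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(responses_to_rows(SAMPLE)), end="")
```

Because the columns are collected from whatever labels occur, submissions with different structures still land in one table, with empty cells where a label is missing.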
I'm trying to create a script in (g)AWK in which I'd like to put the following EXACT lines at the beginning of the output text file:
<?xml version="1.0" encoding="UTF-8"?>
<notes version="1">
<labels>
<label id="0" color="30DBFF">Custom Label 1</label>
<label id="1" color="30FF97">Custom Label 2</label>
<label id="2" color="E1FF80">Custom Label 3</label>
<label id="3" color="FF9B30">Custom Label 4</label>
<label id="4" color="FF304E">Custom Label 5</label>
<label id="5" color="FF30D7">Custom Label 6</label>
<label id="6" color="303EFF">Custom Label 7</label>
<label id="7" color="1985FF">Custom Label 8</label>
</labels>
and this one to the end:
</notes>
Here is my script so far:
BEGIN {printf("<?xml version="1.0" encoding="UTF-8"?>\n") > "notes.sasi89.xml"}
END {printf("</notes>") > "notes.sasi89.xml"}
My problem is that it's not printing the way I'd like, it gives me this in the output file:
<?xml version=1 encoding=-8?>
</notes>
Some characters and quotes are missing. I've tried studying the manuals, but they are too complicated for me. I would appreciate it if someone could give me a hand or point me in the right direction.
Answer is Community Wiki to give what credit can be given where credit is due.
Primary problem and solution
As swstephe noted in a comment:
You need to escape your quotes:
printf("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")
Anti-patterns
I regard your outline script as an anti-pattern (actually, two anti-patterns). You have:
BEGIN {printf("<?xml version="1.0" encoding="UTF-8"?>\n") > "notes.sasi89.xml"}
END {printf("</notes>") > "notes.sasi89.xml"}
The anti-patterns are:
You repeat the file name; you shouldn't. You would do better to use:
BEGIN {file = "notes.sasi89.xml"
printf("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n") > file}
END {printf("</notes>") > file}
You shouldn't be doing the I/O redirection in the awk script in the first place. You should let the shell do the I/O redirection.
awk '
BEGIN {printf("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n")}
END {printf("</notes>")}
' > notes.sasi89.xml
There are times when I/O redirection in the script is appropriate, but that's when you need output to multiple files. When, as appears very probable here, you have just one output file, make the script write to standard output and have the shell do the I/O redirection. It is much more flexible; you can rename the file more easily, and send the output to other programs via a pipe, etc, which is very much harder if you have the output file name embedded in the awk script.
So, I'm trying to migrate a database from Textpattern CMS to something more generic. There are some textpattern-specific commands inside of articles that pull in images. I want to turn these into generic HTML image links. At the moment, they look like this in the sql file:
<txp:upm_image image_id="4" form="dose" />
I want to turn these into something more like this:
<img src="4.jpg" class="dose" />
I've had some luck with TextWrangler doing some regex stuff, but I'm stumped. Any ideas on how to find & replace all of these image paths?
EDIT:
For future reference, here's what I ended up doing in PHP to output it:
$body = $post['Body_html'];
$pattern = '/txp:upm_image image_id="([0-9]+)" form="([^"]*)"/i';
$replacement = 'img src="/images/$1.jpg" class="$2"';
$body = preg_replace($pattern, $replacement, $body);
// outputs <img src="/images/59.jpg" class="dose" />
I wouldn't use grep; it's sed you want
$ echo '<txp:upm_image image_id="4" form="dose" />' | sed -e 's/^.*image_id="\([[:digit:]]*\)".*form="\([[:alpha:]]*\)".*/<img src="\1.jpg" class="\2" \/>/'
<img src="4.jpg" class="dose" />
$
If your class names contain alphanumeric characters, use [[:alnum:]] instead of [[:alpha:]].
(Tested on macOS/Darwin.)
Not sure which tool you are using but try this regex solution: Search for this:
<txp:upm_image\s+image_id="(\d+)"\s+form="([^"]*)"\s*\/>
And replace with this:
<img src="$1.jpg" class="$2" />
Note that this only works for txp tags having the same form as your example. It will fail if there are txp tags having extra attributes, or if they are in a different order.
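The limitation above (extra attributes, or a different attribute order) can be worked around by matching the whole tag first and then reading its attributes into a dictionary. The following is a rough Python sketch of that idea, not what the asker ended up using; the function name replace_txp_images is hypothetical, and it assumes attribute values are always double-quoted and contain no ">" character.

```python
import re

# Match a self-closing txp:upm_image tag and capture its raw attribute text.
TAG_RE = re.compile(r'<txp:upm_image\b([^>]*?)\s*/>', re.IGNORECASE)
# Match individual name="value" pairs inside the captured attribute text.
ATTR_RE = re.compile(r'([\w:-]+)="([^"]*)"')


def replace_txp_images(text):
    """Rewrite txp:upm_image tags as plain <img> tags, in any attribute order."""
    def to_img(match):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        if "image_id" not in attrs:
            # Leave tags without an image_id untouched.
            return match.group(0)
        class_attr = ' class="{}"'.format(attrs["form"]) if "form" in attrs else ""
        return '<img src="{}.jpg"{} />'.format(attrs["image_id"], class_attr)
    return TAG_RE.sub(to_img, text)


if __name__ == "__main__":
    print(replace_txp_images('<txp:upm_image form="dose" image_id="4" />'))
```

Attributes other than image_id and form are simply dropped, which matches the target format in the question.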
My scenario:
A PDF template with formfields: template.pdf
An XFDF file that contains the data to be filled in: fieldData.xfdf
Now I need to have these two files combined and flattened.
pdftk does the job easily within php:
exec("pdftk template.pdf fill_form fieldData.xfdf output flatFile.pdf flatten");
Unfortunately this does not work with full UTF-8 support.
For example, Cyrillic and Greek letters get scrambled. I used Arial for this, with a Unicode character set.
How can I flatten my Unicode files?
Is there any other PDF tool that offers Unicode support?
Does pdftk have a Unicode switch that I am missing?
EDIT 1: As this question has not been solved for more than 9 months, I decided to start a bounty for it. If there are options to sponsor a feature or a bugfix in pdftk, I'd be glad to donate.
EDIT 2: I am not working on this project anymore, so I cannot verify new answers. If anyone has a similar problem, I would be glad if they could verify the answers on my behalf.
I found that when I used Jon's template with DOMDocument instead, the numeric encoding was handled for me and it worked well. My slight variation is below:
$xml = new DOMDocument( '1.0', 'UTF-8' );
$rootNode = $xml->createElement( 'xfdf' );
$rootNode->setAttribute( 'xmlns', 'http://ns.adobe.com/xfdf/' );
$rootNode->setAttribute( 'xml:space', 'preserve' );
$xml->appendChild( $rootNode );
$fieldsNode = $xml->createElement( 'fields' );
$rootNode->appendChild( $fieldsNode );
foreach ( $fields as $field => $value )
{
$fieldNode = $xml->createElement( 'field' );
$fieldNode->setAttribute( 'name', $field );
$fieldsNode->appendChild( $fieldNode );
$valueNode = $xml->createElement( 'value' );
$valueNode->appendChild( $xml->createTextNode( $value ) );
$fieldNode->appendChild( $valueNode );
}
$xml->save( $file );
You could try the trial version of http://www.adobe.com/products/livecycle/designer/ and see what PDF files it generates.
Another commercial product you could try is http://www.appligent.com/fdfmerge. See page 16 in http://146.145.110.1/docs/userguide/FDFMergeUserGuide.pdf for how it handles XFDF with UTF-8.
I also had a look at the XFDF specification: http://partners.adobe.com/public/developer/en/xml/xfdf_2.0.pdf
On page 12 it states:
Although XFDF is encoded in UTF-8, double byte characters are encoded as character references when
exported from Acrobat.
For example, the Japanese double byte characters あ, い, and う are exported to XFDF using
three character references. Here is an example of double byte characters in a form field:
...
<fields>
<field name="Text1">
<value>Here are 3 UTF-8 double byte
characters: &#x3042;&#x3044;&#x3046;
</value>
</field>
</fields> ...
I looked through pdftk-1.44-dist/java/com/lowagie/text/pdf/XfdfReader.java. It doesn't seem to do anything special with the input.
Maybe pdftk will do what you want, when you encode the weird characters as character references in your xFDF input.
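As a rough illustration of what "encode as character references" means in practice, here is a small Python sketch that builds an XFDF document and serialises it as pure ASCII, so that every non-ASCII character becomes a numeric character reference. The function name build_xfdf is hypothetical, and whether a particular pdftk version accepts such input is not verified here (other answers report that 1.44 does not).

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(fields):
    """Return XFDF bytes for a dict of field name -> value.

    Serialising with encoding="us-ascii" makes ElementTree replace every
    non-ASCII character with a decimal character reference such as &#324;.
    """
    # The namespace is written as a plain attribute to keep the sketch short.
    root = ET.Element("xfdf", {"xmlns": XFDF_NS, "xml:space": "preserve"})
    fields_el = ET.SubElement(root, "fields")
    for name, value in fields.items():
        field_el = ET.SubElement(fields_el, "field", {"name": name})
        ET.SubElement(field_el, "value").text = value
    return ET.tostring(root, encoding="us-ascii")


if __name__ == "__main__":
    print(build_xfdf({"Miejsce": "Poznań"}).decode("ascii"))
```

An XML parser reading this output recovers the original characters, so the file stays valid XFDF for tools that do handle UTF-8.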
Using pdftk 1.44 on a Win7 machine, I encounter the same problems with XFDF files, whereas FDF works fine. I made an XFDF file without any special characters (only ANSI), but pdftk crashed again. I mailed the developer; unfortunately, there has been no answer so far.
Unfortunately, UTF-8 character encoding works with neither decimal nor hexadecimal references to non-ASCII characters in the source .xfdf file (pdftk v. 1.44).
I made some progress on this. Starting with code from http://koivi.com/fill-pdf-form-fields/, I modified the value encoding to output numeric codes for any characters outside the ascii range.
Now with pitulski's special strings:
Poznań Śródmieście Ćwiartka Ósma outputs Pozna ródmiecie wiartka Ósma with some box shapes superimposed
ęóąśłżźćńĘÓĄŚŁŻŹĆŃ outputs óÓ with more box shapes. I think it may be that the box shapes are characters my server doesn't recognize.
I tried it with some French characters: ùûüÿ€’“”«»àâæçéèêëïôœÙÛÜŸÀÂÆÇÉÈÊËÏÎÔ and they all came out OK, but some of them were overlapping.
--edit-- I just tried entering these manually into the form and got the same result minus the box shapes (using Evince). I then tried with a different form (created by someone else) - after entering ęóąśłżźćńĘÓĄŚŁŻŹĆŃ, ółÓŁ was displayed. It looks like it depends which characters are included in the document's embedded fonts.
/*
KOIVI HTML Form to FDF Parser for PHP (C) 2004 Justin Koivisto
Version 1.2.?
Last Modified: 2013/01/17 - Jon Hulka(jon dot hulka at gmail dot com)
- changed character encoding, all non-ascii characters get encoded as numeric character references
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.
This library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this library; if not, write to the Free Software Foundation,
Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Full license agreement notice can be found in the LICENSE file contained
within this distribution package.
Justin Koivisto
justin dot koivisto at gmail dot com
http://koivi.com
*/
/**
* createXFDF
*
* Takes values passed via associative array and generates XFDF file format
* with that data for the pdf address supplied.
*
* @param string $file The pdf file - url or file path accepted
* @param array $info data to use in key/value pairs no more than 2 dimensions
* @param string $enc default UTF-8, match server output: default_charset in php.ini
* @return string The XFDF data for acrobat reader to use in the pdf form file
*/
function createXFDF($file,$info,$enc='UTF-8'){
$data=
'<?xml version="1.0" encoding="'.$enc.'"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>';
foreach($info as $field => $val){
$data.='
<field name="'.$field.'">';
if(is_array($val)){
foreach($val as $opt)
//2013.01.17 - Jon Hulka - all non-ascii characters get character references
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($opt),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($opt,ENT_COMPAT,$enc).'</value>'."\n";
}else{
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($val),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($val,ENT_COMPAT,$enc).'</value>'."\n";
}
$data.='
</field>';
}
$data.='
</fields>
<ids original="'.md5($file).'" modified="'.time().'" />
<f href="'.$file.'" />
</xfdf>';
return $data;
}
While pdftk doesn't appear to support UTF-8 in the FDF file, I found that with
iconv -f utf-8 -t ISO_8859-1
in the pipeline converting that FDF file to ISO-Latin-1, then at least those characters that are in the Latin-1 code page will still be represented properly.
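To see in advance which characters would survive such a conversion, a quick check like the following Python sketch can help. It is only an illustration; the function name split_by_latin1 is hypothetical.

```python
def split_by_latin1(text):
    """Split text into characters ISO-8859-1 can represent and those it cannot."""
    kept, lost = [], []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
            kept.append(ch)
        except UnicodeEncodeError:
            lost.append(ch)
    return "".join(kept), "".join(lost)


if __name__ == "__main__":
    kept, lost = split_by_latin1("Poznań Ósma")
    print("representable:", kept)
    print("not representable:", lost)
```

For the Polish sample strings used elsewhere in this thread, Ó and ó are part of Latin-1, while letters such as ń, Ś, and Ć are not, so this workaround only helps for Western European text.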
Which version of pdftk are you using?
I tried the same thing with Polish characters (UTF-8).
It does not work for me.
pdftk.exe, libiconv2.dll from: http://www.pdflabs.com/docs/install-pdftk/
Windows 7, cmd, file.pdf + file.fdf -> new.pdf
pdftk file.pdf fill_form file.xfdf output new.pdf flatten
Unhandled Java Exception:
java.lang.NoClassDefFoundError: gnu.gcj.convert.Input_UTF8 not found in [file:.\, core:/]
at 0x005a3abe (Unknown Source)
at 0x005a3fb2 (Unknown Source)
at 0x006119f4 (Unknown Source)
at 0x00649ee4 (Unknown Source)
at 0x005b4c44 (Unknown Source)
at 0x005470a9 (Unknown Source)
at 0x00549c52 (Unknown Source)
at 0x0059d348 (Unknown Source)
at 0x007323c9 (Unknown Source)
at 0x0054715a (Unknown Source)
at 0x00562349 (Unknown Source)
With an FDF file with the same content, the command ran without errors.
However, the characters in new.pdf are wrong.
pdftk file.pdf fill_form file.fdf output new.pdf flatten
---FDF---
%FDF-1.2
%âãÏÓ
1 0 obj<</FDF<</F(file.pdf)
/Fields[
<</T(Miejsce)/V(666 Poznań Śródmieście Ćwiartka Ósma)>>
<</T(Nr)/V(ęóąśłżźćńĘÓĄŚŁŻŹĆŃ)>>
]>>>>
endobj
trailer
<</Root 1 0 R>>
%%EOF
---XFDF---
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="file.pdf"/>
<fields>
<field name="Miejsce">
<value>666 Poznań Śródmieście Ćwiartka Ósma</value>
</field>
<field name="Nr">
<value>ęóąśłżźćńĘÓĄŚŁŻŹĆŃ</value>
</field>
</fields>
</xfdf>
---PDF---
Miejsce: 666 PoznaÅ— ÅıródmieÅłcie ăwiartka Ãfisma
Nr: ÄŽÃ³Ä–ÅłÅ‡Å¼ÅºÄ⁄Å—ÄŸÃfiÄ—ÅıņŻŹăÅ
You can introduce UTF-8 characters by giving their Unicode code in octal with \ddd.
To solve this, I wrote PdfFormFillerUTF-8: http://sourceforge.net/projects/pdfformfiller2/
There is a drop-in replacement for the pdftk tool,
Mcpdf: https://github.com/m-click/mcpdf
that solves unicode issues when filling forms. Works for me with CP1250 characters (Central Europe).
From project page:
the following command fills in form data from DATA.xfdf into FORM.pdf
and writes the result to RESULT.pdf. It also flattens the document to
prevent further editing:
java -jar mcpdf.jar FORM.pdf fill_form - output - flatten < DATA.xfdf > RESULT.pdf
This corresponds exactly to the usual PDFtk command:
pdftk FORM.pdf fill_form - output - flatten < DATA.xfdf > RESULT.pdf
Note that you need to have JRE installed.
I managed to make it work with pdftk by creating an XFDF file with UTF-8 encoding.
It took several tries, but what made it work as expected was adding 'need_appearances'.
Here is an example:
pdftk source.pdf fill_form data.xfdf output output.pdf need_appearances
I have been solving this issue for a long time, and finally I have found the solution!
so, let's start.
download and install the latest version of pdftk
# PDFTK
RUN apk add openjdk8 \
&& cd /tmp \
&& wget https://gitlab.com/pdftk-java/pdftk/-/jobs/1507074845/artifacts/raw/build/libs/pdftk-all.jar \
&& mv pdftk-all.jar pdftk.jar \
&& echo '#!/usr/bin/env bash' > pdftk \
&& echo 'java -jar "$0.jar" "$@"' >> pdftk \
&& chmod 775 pdftk* \
&& mv pdftk* /usr/local/bin \
&& pdftk -version
Open your PDF form in Adobe Acrobat Reader and look at the field options to find out which font the fields use, for example Helvetica, then download this font.
Fill the form with the flatten option:
/usr/local/bin/pdftk A=form.pdf fill_form xfdf.xml output out.pdf drop_xfa need_appearances flatten replacement_font /path/to/font/HelveticaRegular.ttf
xfdf.xml example:
<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>
<field name="Check Box 136">
<value>Your value | Значение (Cyrillic)</value>
</field>
</fields>
</xfdf>
Enjoy :)
pdftk supports encoding in UTF-16BE. It's not that difficult to convert from UTF-8 to UTF-16BE.
See: Weird characters when filling PDF with PDFTk
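As a sketch of what that conversion could look like, the Python function below turns a Unicode value into a PDF literal string: a UTF-16BE byte sequence prefixed with the byte order mark FE FF, with the bytes that are special inside a literal string escaped. The function name fdf_text_string is hypothetical, and whether a given pdftk build accepts the result in an FDF /V entry is an assumption that should be tested against your own form.

```python
def fdf_text_string(value):
    """Encode a str as a PDF literal string in UTF-16BE with a leading BOM."""
    raw = b"\xfe\xff" + value.encode("utf-16-be")
    escaped = bytearray()
    for byte in raw:
        if byte in b"\\()":
            # Backslash and parentheses must be escaped at the byte level,
            # even when they are only one half of a UTF-16 code unit.
            escaped += b"\\" + bytes([byte])
        elif byte == 0x0D:
            escaped += b"\\r"  # raw CR bytes would be normalised by PDF readers
        elif byte == 0x0A:
            escaped += b"\\n"
        else:
            escaped.append(byte)
    return b"(" + bytes(escaped) + b")"


if __name__ == "__main__":
    # The result is binary, so an FDF file using it must be written in "wb" mode.
    print(fdf_text_string("Poznań"))
```

The returned bytes would replace the parenthesised value after /V in an FDF field entry.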