Is there a working binary diff tool that implements GDIFF (Generic Diff Format, NOT graphical file difference)? - command-line-tool

I've seen GDIFF (Generic Diff Format) on Wikipedia, and I wonder whether any command-line tool implements this standard. The best I have found so far is LibXDiff, but it's a library, so I'd need some extra work to make it run.
I know that among binary differs, VCDIFF (xdelta, etc.) and bsdiff achieve better compression ratios, but in my case I really need a straightforward format. VCDIFF can copy from anything before the current window (if I read the article correctly), and bsdiff's patch file format is more complex.
Update:
In the end I found that VCDIFF with xdelta3 is actually good and working when "disable small string-matching" and "disable external decompression" are both toggled. It also has a pretty good "printdelta" command that prints information that is very useful for my app, so I don't really need to extract the VCDIFF format from the patch file.

The javaxdelta library implements xdelta and GDIFF patches. It can be used as a command-line application like this:
# create patch
java -cp javaxdelta-2.0.1.jar:trove-1.0.2.jar com.nothome.delta.Delta source.file target.file patch.gdiff
# apply patch
java -cp javaxdelta-2.0.1.jar:trove-1.0.2.jar com.nothome.delta.GDiffPatcher unpatched.file patch.gdiff patched.file
I once wrote a wrapper around it to support patching directories (the GDIFF files for a directory are packed into one ZIP patch).
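To show why GDIFF counts as a straightforward format, here is a minimal sketch of a patch applier in Python. It assumes the opcode layout of the W3C GDIFF note (version 4) as I understand it: a five-byte header, then DATA commands (1 to 248), COPY commands (249 to 255) and an EOF command (0), with all integers big-endian. The function name apply_gdiff is my own, the sketch keeps everything in memory, and it does no bounds checking, so treat it as an illustration rather than a replacement for javaxdelta.

```python
import struct

GDIFF_MAGIC = b"\xd1\xff\xd1\xff\x04"  # four magic bytes followed by version 4

# COPY opcodes mapped to the struct format of (position, length), big-endian.
# Assumed from the W3C GDIFF note; verify against the spec before relying on it.
COPY_FORMATS = {
    249: ">HB", 250: ">HH", 251: ">Hi",
    252: ">iB", 253: ">iH", 254: ">ii",
    255: ">qi",
}


def apply_gdiff(source: bytes, patch: bytes) -> bytes:
    """Apply an in-memory GDIFF patch to `source` and return the new content."""
    if not patch.startswith(GDIFF_MAGIC):
        raise ValueError("not a GDIFF version 4 patch")
    out = bytearray()
    pos = len(GDIFF_MAGIC)
    while True:
        if pos >= len(patch):
            raise ValueError("patch ended without an EOF command")
        op = patch[pos]
        pos += 1
        if op == 0:  # EOF: the output is complete
            return bytes(out)
        if op <= 248:  # DATA: literal bytes stored in the patch itself
            if op <= 246:
                length = op  # the opcode is the length
            elif op == 247:
                (length,) = struct.unpack_from(">H", patch, pos)
                pos += 2
            else:
                (length,) = struct.unpack_from(">i", patch, pos)
                pos += 4
            out += patch[pos:pos + length]
            pos += length
        else:  # COPY: bytes taken from the source file
            fmt = COPY_FORMATS[op]
            offset, length = struct.unpack_from(fmt, patch, pos)
            pos += struct.calcsize(fmt)
            out += source[offset:offset + length]
```

For example, a patch consisting of the header, a COPY of 5 bytes from offset 0, a DATA command with the two bytes "??", a COPY of 5 bytes from offset 7 and an EOF turns "Hello, world!" into "Hello??world". Unlike VCDIFF, a COPY here only ever refers to the source file, never to output that was already written.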

Related

Track and output changes/diffs between LaTeX document revisions to PDF?

We want to keep track of changes in a LaTeX document in such a way that people who can't read LaTeX can also see the changes at once. The .tex files are stored in a git repository. So detailed information about the changes is available.
For this purpose, I think it might be possible to use the git diff output between two revisions to generate the PDF and somehow mark what has changed since the other selected revision of the document.
Do you know of an (easy) way to achieve this?
Do you know of other ways to visualize differences between PDF files?
[Expanding on my comment, since it apparently helped :-) ]
latexdiff is a Perl script that can diff two LaTeX documents and mark up changes without the distractions of the LaTeX markup itself. The README says:
latexdiff is a Perl script, which compares two latex files and marks
up significant differences between them (i.e. a diff for latex files).
Various options are available for visual markup using standard latex
packages such as "color.sty". Changes not directly affecting visible
text, for example in formatting commands, are still marked in
the latex source. Note that only files conforming to latex syntax will
be processed correctly, not generic TeX files. Some further
minor restrictions apply, see documentation.
A rudimentary revision facility is provided by another Perl script,
latexrevise, which accepts or rejects all changes. Manual
editing of the difference file can be used to override this default
behaviour and accept or reject selected changes only.
The author is F Tilmann.
The project is developed on GitHub, but you can get the script in a tarball from CTAN if you prefer. The link in the comment is a useful overview of how to use it.
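Since the .tex files live in a git repository, the whole workflow can be scripted. The following is a rough sketch in Python, not a tested tool: it assumes git, latexdiff and pdflatex are on the PATH, that the document is a single self-contained .tex file whose path is given relative to the repository root, and the function and directory names are made up for illustration.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def git_show_command(revision: str, tex_path: str) -> list[str]:
    """Command that prints `tex_path` as it was at `revision`."""
    return ["git", "show", f"{revision}:{tex_path}"]


def latexdiff_command(old_file: str, new_file: str) -> list[str]:
    """Command that writes the marked-up LaTeX source to stdout."""
    return ["latexdiff", old_file, new_file]


def build_diff_pdf(old_rev: str, new_rev: str, tex_path: str,
                   workdir: str = "diff-build") -> Path:
    """Extract two revisions, diff them with latexdiff, and compile the result."""
    out = Path(workdir)
    out.mkdir(exist_ok=True)
    # Write both revisions of the document into the (hypothetical) build directory.
    for name, rev in (("old.tex", old_rev), ("new.tex", new_rev)):
        text = subprocess.run(git_show_command(rev, tex_path), check=True,
                              capture_output=True, text=True).stdout
        (out / name).write_text(text)
    # latexdiff prints the difference file on stdout.
    diff = subprocess.run(
        latexdiff_command(str(out / "old.tex"), str(out / "new.tex")),
        check=True, capture_output=True, text=True).stdout
    (out / "diff.tex").write_text(diff)
    subprocess.run(["pdflatex", "-interaction=nonstopmode", "diff.tex"],
                   cwd=out, check=True)
    return out / "diff.pdf"
```

Calling build_diff_pdf("HEAD~1", "HEAD", "paper.tex") should then leave a diff.pdf with the changes marked. Documents that pull in figures or other files via \input will need those copied next to diff.tex. The latexdiff distribution also ships a latexdiff-vc wrapper aimed at version control systems, which may make a script like this unnecessary; check its documentation.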

Apache giraph process graph with a custom algorithm

I have a custom algorithm for processing a graph which accepts a txt file as input. Because it is a large-scale graph, I want to implement it in the Apache Giraph framework. I've done a lot of research but I am still not sure whether I am on the right path.
I am reading a .csv file which contains the graph data; using a parser, I convert it to the txt file and upload it to the HDFS file system of Hadoop.
I have read the SimpleShortestPathsVertex example from the Apache quick start guide and I can see that it processes the data from a file in HDFS using the jar-with-dependencies jar file.
My problem is that I haven't yet understood how I can add my algorithm to the Apache Giraph framework and start processing the graph. Can I add my algorithm to the framework using Eclipse and modify it from there, or is there some other way?
Thank you!
Have a look here:
https://cwiki.apache.org/confluence/display/GIRAPH/Shortest+Paths+Example
Were you able to run this example?
If yes:
Familiarize yourself with the different Writable formats of Hadoop! Otherwise it is hard to apply them to your algorithm.
All computation concerning the graph is done in the compute() function.
(If you're more advanced, have a look at the WorkerContext preSuperstep and at Aggregators!)
You can change the example, but as soon as you use other data types you have to change your VertexReader and VertexWriter.
If you have a specific algorithm in mind, decide what you need for the computation and specify the layout of your input file. Then adapt your VertexReader and VertexWriter. And then finally start implementing your compute() function!
Of course you can use Eclipse! Simply reference the Giraph jar (for me it is "giraph-0.1-jar-with-dependencies.jar") and start coding.
All you need is an instance of these files specific to your algorithm:
YourGiraphJob (the file starting the Hadoop/Giraph job)
YourVertex (Specifies your compute() function executed on each Vertex)
YourInputFormat (Specifying the Writable formats of YourReader)
YourOutputFormat (Specifying the Writable formats of YourWriter)
YourReader (Specifies how your inputFile is transformed e.g. that for each line a Vertex can be initialized using given information)
YourWriter (Specifies how your outputFile is generated from the vertices)
(optionally a WorkerContext, if you want to use Aggregators.)
Simply check out http://giraph.apache.org/source-repository.html
using Eclipse and you should have the code, including an example application you can toy around with!
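Before porting an algorithm, it can help to see the "think like a vertex" model that compute() follows. The sketch below is a toy, single-machine simulation in Python of superstep-based shortest paths. It is not the Giraph API: the function names, the message passing via plain dicts, and the graph layout (vertex id mapped to a list of (target, weight) edges) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def compute(vertex_id, value, incoming, out_edges, superstep, source_id):
    """Per-vertex logic, run once per active vertex in each superstep.

    Returns (new_value, messages); messages is a list of (target, distance).
    """
    candidate = 0.0 if (superstep == 0 and vertex_id == source_id) else math.inf
    candidate = min([candidate, *incoming])
    if candidate < value:
        # Found a shorter path: keep it and tell the neighbours.
        return candidate, [(target, candidate + weight) for target, weight in out_edges]
    return value, []  # nothing improved: the vertex effectively votes to halt


def run_supersteps(graph, source_id):
    """Drive compute() until no messages are in flight."""
    values = {v: math.inf for v in graph}
    inbox = {v: [] for v in graph}
    superstep = 0
    while superstep == 0 or any(inbox.values()):
        outbox = defaultdict(list)
        for v in graph:
            if superstep > 0 and not inbox[v]:
                continue  # halted vertices sleep until a message arrives
            values[v], messages = compute(
                v, values[v], inbox[v], graph[v], superstep, source_id)
            for target, distance in messages:
                outbox[target].append(distance)
        inbox = {v: outbox.get(v, []) for v in graph}
        superstep += 1
    return values


graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": [], "d": []}
print(run_supersteps(graph, "a"))
```

In the real framework the loop, the message delivery and the distribution across workers are handled for you; what you write is essentially the body of compute(), plus the reader and writer that map your input and output files to vertices.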

Override the .vimrc file in specific instances

I am looking to make vim function more like an IDE in specific situations. To do this I plan to write a Perl script called jvim that is passed a workspace path and opens a Java-specific instance of my modified vim. I would then extend this to other file types.
What I would like to do is have a .jvimrc that is loaded in preference to the standard .vimrc. This would then lead to me having plvim with a .plvimrc and pyvim with a .pyvimrc.
This should be fairly straightforward. I would also be looking to map commands that run scripts, such as :newclass, :newinterface, :newproject and :newpackage, but I think you get the idea.
Any advice you could give, as well as on overriding the .vimrc, would be great.
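One way to get this behaviour is Vim's -u command-line option, which names the initialization file to read in place of the default one. Below is a minimal launcher sketch, written in Python rather than the Perl mentioned above. The profile table and file names are hypothetical, and the script simply changes into the workspace directory before replacing itself with vim.

```python
import os
import sys
from pathlib import Path

# Hypothetical mapping from launcher name to its dedicated vimrc.
PROFILES = {
    "jvim": "~/.jvimrc",
    "plvim": "~/.plvimrc",
    "pyvim": "~/.pyvimrc",
}


def vim_command(profile):
    """Build the vim invocation that loads the profile's vimrc via -u."""
    vimrc = Path(PROFILES[profile]).expanduser()
    return ["vim", "-u", str(vimrc)]


def main(argv):
    # Usage (assumed): launcher.py jvim /path/to/workspace
    profile, workspace = argv[1], argv[2]
    os.chdir(workspace)  # start vim inside the workspace
    command = vim_command(profile)
    os.execvp(command[0], command)  # replace this process with vim


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

If the shared settings from .vimrc are still wanted, the .jvimrc can begin with "source ~/.vimrc" and then add its Java-specific parts. Note also that user-defined Ex commands in Vim must start with an uppercase letter, so the commands would have to be named :Newclass, :Newinterface and so on.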

Using Apparat dump with FDT and ant

I am totally new to Flash development and don't even know ActionScript yet.
I have to improve an existing Flash application, so first I need to understand the code.
I want to use some tool for code analysis, something to visualize class dependencies and code structure. I googled and found out about the Apparat tool. Now I'm struggling with it because I cannot find documentation that describes how to use Apparat. I'm frustrated, but it seems to be the only such tool.
So I started with an example.
I've set up Apparat running on FDT following this guide:
http://www.webdevotion.be/blog/2010/06/02/how-to-get-up-and-running-with-apparat/
The example (http://blog.joa-ebert.com/2010/05/26/new-apparat-example/) builds well and creates two SWF files. (I'm using ANT builder)
Now I want to analyze an existing SWF and get a PNG with the class dependencies.
How should I do that?
What do I have to add and where?
Or maybe someone can explain how to use dump from the Windows command line? Something like
dump example.swf exampleAnalysis.png
After resolving all dependencies (which was tricky), I managed to get dump running:
dump -i example.swf -uml
But it saves the UML diagram in .DOT format, which is really hard to read: Graphviz GVedit cannot zoom and exports to PNG only what you see (a messy, zoomed-out graph that is impossible to read), smyrna doesn't work, and zgrviewer fails to load some files.
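A possible way around the viewer problems is to skip the GUI tools and render the .DOT file with Graphviz's dot command-line program, which takes the output format via -T and the output file via -o. The Python sketch below only builds and runs that command. It assumes Graphviz is installed and on the PATH, and the file name example.dot is a guess at what dump produced.

```python
import subprocess
from pathlib import Path


def dot_render_command(dot_file, fmt="svg"):
    """Build the Graphviz command that renders `dot_file` to the given format."""
    output = str(Path(dot_file).with_suffix("." + fmt))
    return ["dot", "-T" + fmt, dot_file, "-o", output]


def render(dot_file, fmt="svg"):
    """Run Graphviz; raises CalledProcessError if dot reports a failure."""
    subprocess.run(dot_render_command(dot_file, fmt), check=True)


if __name__ == "__main__":
    # Hypothetical file name for the diagram written by dump.
    render("example.dot", "svg")
```

SVG is used as the default here because it stays sharp when zoomed in a browser; passing "png" instead renders the whole graph to a PNG rather than only the visible part of a viewer window.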

Converting PCL to PDF

I am looking to create (as a proof-of-concept) an OCaml (preferably) program that converts PCL code to PDF format. I am not sure where to start. Is there a standardized algorithm for doing so? Is there any other advice available for accomplishing this task?
Thanks!
Conversion of PCL to PDF can be incredibly complex (assuming you need it to be generic and not just for simple PCL). We've investigated this many times and in the end we always revert to using other tools. We keep investigating, as we are a development shop that uses and understands all elements of PCL in great detail. If you are not really familiar with PCL, it will be a daunting task. One of the major issues is that, over time, printers have become, for the most part, tolerant of malformed PCL, so creating something that follows the rules to the letter is not always sufficient. If, however, you have control over the PCL, you may be able to work it out with some amount of success.
I don't mean to turn you off this, and I realize that you've come here looking for a programming answer, but I have to say that this is far from a simple task and there are no 'standardized algorithms' for it (that I'm aware of).
If this is designed to be a tool to work alongside something else you are building, I'd highly recommend looking at these guys:
PageTech
This is by far the most complete set of tools (Windows) for handling this. There are a few others but, based on our extensive use of PCL and conversion tools over the years, this is the only one that works all the time.
EDIT: Most recently we've been working with LincPDF (http://www.lincolnco.com/). This is also an excellent product, which has one big benefit: deployment is simple. Some of the other tools have complex software installations. This solution is very easy for us to deploy as a feature in an application. It's also faster than any tool we've tested to date (at least with the PCL that we generate from our apps, which is quite complex as it includes specialized fonts and macros).
Ghostscript developers have recently integrated their sister products GhostXPS, GhostPCL and GhostSVG into their Ghostscript source code tree. (It's now called GhostPDL.) So all of these additional functionalities (load, render and convert XPS, PCL and SVG) are now available from there.
This means you could build their language switching binary from their sources. This, in theory, can consume PCL, PDF and PostScript and convert them to a host of other formats. While it worked for me whenever I needed it, Ghostscript developers recommend that you stop using the language switching binary (since it's 'almost non-supported' -- see KenS' comment to this answer) and instead switch to using the explicit binaries pcl6.exe (PCL input), gsvg.exe (SVG input, also 'almost non-supported') and gxps.exe (support status unclear to me).
So to 'convert PCL code to PDF format' as the request reads, you could use the pcl6 command line utility, a sister product to Ghostscript's gs/gswin32c.exe.
Sample command line:
pcl6.exe \
-o output.pdf \
-sDEVICE=pdfwrite \
[...more parameters as required (optional)...] \
-f input.pcl
Updated as per KenS' hints in the comment....
There is a series of reference books from HP; you could re-implement a PCL parser and output corresponding PDF.
You might start with the "PCL 5 Printer Language Technical Reference Manual" (http://h20000.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf) . Search HP for more (http://search.hp.com/query.html?qt=PCL+reference).
Or you could borrow code or ideas from GhostPCL (http://www.ghostscript.com/GhostPCL.html).