Looking for an advanced PDF comparison software [closed] - pdf

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
For a work assignment I have to read through 600-1000 page PDFs and form a general opinion on how similar they are (the extent to which the company putting them out is using 'boilerplate' formats).
I've used Adobe XI Pro DC's comparator software, which can compare two PDFs and highlight the parts that are different or changed. What I'll do is look through the comparison PDF, and any part that isn't highlighted I'll know is the exact same in both PDFs.
The problem with this software is that if the similar portions/strings are very far apart in terms of page count (e.g., the relevant section is on page 100 of the first PDF but page 25 of the second PDF), the comparison won't catch the match.
My ideal PDF comparator software would highlight only sections/paragraphs/strings in the first PDF that cannot be found ANYWHERE in the second PDF. Any section/string/paragraph which is also found in the second PDF, no matter how far apart in terms of page count, would stay unhighlighted.
Any help would be hugely appreciated!

You may want to check out the xdocdiff project (open source, so you can adapt it to your needs) and maybe some of the diff tools recommended in the TortoiseSVN documentation (both open source and closed source).
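If you end up adapting something yourself, the "found ANYWHERE in the second PDF" requirement from the question can be sketched in a few lines. This is only a rough illustration, assuming the text of both PDFs has already been extracted to plain strings by some other tool and that paragraphs are separated by blank lines. The function names and sample text are made up for the example, and it only catches exact matches after whitespace and case normalization.

```python
import re

def normalize(paragraph: str) -> str:
    """Lowercase and collapse whitespace so line wrapping does not matter."""
    return re.sub(r"\s+", " ", paragraph).strip().lower()

def split_paragraphs(text: str) -> list[str]:
    """Split extracted text on blank lines and drop empty chunks."""
    return [p for p in re.split(r"\n\s*\n", text) if p.strip()]

def unmatched_paragraphs(first_text: str, second_text: str) -> list[str]:
    """Return paragraphs of first_text that occur nowhere in second_text."""
    # A set lookup ignores position, so page distance is irrelevant.
    seen = {normalize(p) for p in split_paragraphs(second_text)}
    return [p.strip() for p in split_paragraphs(first_text)
            if normalize(p) not in seen]

# Hypothetical sample text standing in for two extracted PDFs.
first = "Standard terms apply.\n\nThe fee is 3%.\n\nGoverning law: Delaware."
second = "Governing law:   Delaware.\n\nSome other clause.\n\nStandard terms\napply."

print(unmatched_paragraphs(first, second))  # ['The fee is 3%.']
```

Only the paragraphs returned here would need highlighting. Near-duplicates (a changed date or name inside otherwise identical boilerplate) would need fuzzy matching on top of this.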

Related

Is there any working/real open source plagiarism checker available? [closed]

Closed 6 years ago.
I want to develop a plagiarism checker that compares several pieces of source code, but I couldn't find any proper source code or even a resource to get an idea about it.
I have checked Boss2, which is useless. They claim to use the Sherlock module for detecting plagiarism, but no such tool seems to be included in Boss2.
If any open-source detection tool is available for checking source code, please let me know.
Regards
I'm aware of open-source plagiarism detectors for text (e.g., WCopyFind), but not code.
I couldn't find... even a resource to get an idea about it.
The authors of the excellent closed-source tool MOSS have published a helpful paper about the technology.
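The paper in question is, as far as I know, the one on "winnowing" for document fingerprinting. To give an idea of the approach, here is a simplified sketch: hash every k-character substring, slide a window over the hashes, and keep the minimum hash from each window as a fingerprint. This is not MOSS's actual implementation. The parameter values, the use of CRC32 as the hash, and the Jaccard score at the end are all assumptions made for the example, and it discards the position information that a real tool would keep.

```python
import zlib

def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-character substring after removing whitespace and case."""
    cleaned = "".join(text.lower().split())
    return [zlib.crc32(cleaned[i:i + k].encode("utf-8"))
            for i in range(len(cleaned) - k + 1)]

def winnow(text: str, k: int = 5, w: int = 4) -> set[int]:
    """Keep the minimum hash of each window of w consecutive k-gram hashes."""
    hashes = kgram_hashes(text, k)
    if not hashes:
        return set()
    if len(hashes) < w:
        # Too short for a full window: fall back to a single fingerprint.
        return {min(hashes)}
    return {min(hashes[start:start + w])
            for start in range(len(hashes) - w + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two fingerprint sets (0.0 to 1.0)."""
    fa, fb = winnow(a), winnow(b)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)

original = "for i in range(10): total += i"
reformatted = "for i in range(10):\n    TOTAL += i"
print(similarity(original, reformatted))  # 1.0, whitespace and case are ignored
```

Handling renamed variables would need a tokenizer that replaces identifiers with a placeholder before hashing, which this sketch leaves out.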
I know the question is old, but I did land here from a Google search.
Sherlock is an open source plagiarism detector. Sherlock's home page is here
I wrote SimiCheck, and you are welcome to use it. If you are interested in an API, I could probably write one very quickly.
I wrote the original algorithm as part of the CrowdGrader peer-grading tool, but then I decided to make the comparison tools available independently.
SimiCheck can handle code, Word (.docx), html, pdf, text, ..., as well as .zip, .tar, .gz, .tgz, and some more formats, and can deal with variable renaming, code moves, code across multiple files, etc.

Free UML Drawing Tools [closed]

Closed 8 years ago.
What are the best free UML drawing tools?
All the ones I have found require paid membership and offer public users only limited functionality on a trial basis...rubbish!
For my (very simple) needs I used ArgoUML. I'm not an expert in it, but I found it easy enough to use. It's open source, and you can find a good user guide on its web page.
Have a look at StarUML ( http://staruml.sourceforge.net/en/ )
It's free, open source, and incredibly fully featured.
For a full list, check out the ones marked as Open Source here: http://en.wikipedia.org/wiki/List_of_Unified_Modeling_Language_tools
But I'd really recommend StarUML!
For my first two software engineering courses, I used the standalone version of UMLet, but it is just for diagrams. It exports to standard graphics formats or PDF. They also have an Eclipse plugin version, but I never used it.
For a no-frills drawing tool, I find Google Docs (drawings) pretty good. Note that printing works better under Mozilla than Chrome, strangely enough. In Chrome, I cannot get dashed lines to print.
Try UMLet. Supports Eclipse IDE.

Free library to read PDF files [closed]

Closed 7 years ago.
Is there a free way to read PDF files through VBA to extract basic text content? I need to automate a weekly data acquisition process at my company where data is contained in PDF files (which are updated weekly by the data provider). Also, is there a reference I can look into to understand the file structure (DOM?) of a PDF?
Adobe's PDF reference is online here: http://www.adobe.com/devnet/pdf/pdf_reference.html
I'm not sure about the best way to read PDFs from VBA directly, but if you can call an external Java or C# program, then I would recommend using iText for basic text extraction.
EDIT: I should maybe mention that Adobe's PDF reference is an 800-page beast. I found that it's good for looking up answers to particular questions (e.g., storing widths of embedded TrueType fonts), but it may not be a good place to start. For that, reading through the iText book helped me to get started on the format.
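On the file-structure question: a PDF is not a DOM but a sequence of numbered objects, with a `%PDF-x.y` header at the start and, near the end, a `startxref` keyword giving the byte offset of the cross-reference section, followed by `%%EOF`. As a small illustration of just those two landmarks (not text extraction), here is a sketch using only the standard library. The function name and the toy byte string are made up for the example, and the toy string is not a complete, renderable PDF.

```python
import re

def describe_pdf(data: bytes) -> dict:
    """Report the header version and the startxref offset of a PDF byte string."""
    header = re.match(rb"%PDF-(\d+\.\d+)", data)
    if header is None:
        raise ValueError("no %PDF- header at the start of the data")
    # The trailer sits at the end of the file, so only the tail is searched.
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", data[-1024:])
    return {
        "version": header.group(1).decode("ascii"),
        # With incremental updates there can be several; the last one wins.
        "startxref": int(matches[-1]) if matches else None,
    }

# Toy data for illustration only.
sample = (
    b"%PDF-1.4\n"
    b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
    b"xref\n0 1\n0000000000 65535 f \n"
    b"trailer\n<< /Size 1 >>\n"
    b"startxref\n45\n%%EOF\n"
)

info = describe_pdf(sample)
print(info)                                  # {'version': '1.4', 'startxref': 45}
print(sample[info["startxref"]:][:4])        # b'xref'
```

Getting from there to actual text means decoding compressed content streams and font encodings, which is why a library is the practical route.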
The iText book contains lots of worked examples for general PDF tasks and lots of background info to help you understand PDF files. It more than pays for itself very quickly!

Dynamic Collapsible Flow Chart Online [closed]

Closed 7 years ago.
I've been looking through a number of other posts relating to flowchart software.
I have been asked to put together a document outlining some of the typical problems our users encounter with our software product.
What I would like to do is create an interactive online flowchart that lets users choose from 4-5 overall headings describing what's wrong. It would then dynamically expand into more choices that pinpoint the problem, and so on, until they reach a resolution.
The key thing I have not been able to find in the flowchart software out there is the click-and-expand element.
- I don't want all options to appear to the end user in one huge flowchart, as that would distract from their specific issue.
- I want them to be able to click away and go down a specific avenue that will end up giving them some good things to try, based on their decisions/clicks.
I was originally thinking of building something in Flex or Silverlight (ideally someone would have a template out there) but am now thinking of taking advantage of third-party (ideally free) software.
This will need to be hosted in a browser.
Any ideas?
Check out FreeMind. It's mind mapping software, so not necessarily a flowcharting tool, but you can use it for what you describe.
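Whatever tool ends up rendering it, the click-and-expand behaviour described in the question boils down to a tree where only the children of the current node are shown. Here is a minimal sketch of that data structure, with hypothetical problem labels and advice text. It says nothing about the browser side, which would need its own front end.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One step in the troubleshooting tree."""
    label: str
    advice: str = ""  # filled in only on leaf nodes
    children: list["Node"] = field(default_factory=list)

def follow(root: Node, clicks: list[str]) -> Node:
    """Walk from the root along the labels the user clicked."""
    node = root
    for label in clicks:
        matches = [child for child in node.children if child.label == label]
        if not matches:
            raise KeyError(f"no option {label!r} under {node.label!r}")
        node = matches[0]
    return node

def render(node: Node) -> list[str]:
    """Show only the next level of choices, or the advice at a leaf."""
    if node.children:
        return [child.label for child in node.children]
    return [node.advice]

# Hypothetical content for illustration.
tree = Node("What's wrong?", children=[
    Node("Cannot log in", children=[
        Node("Forgot password", advice="Use the password reset link."),
        Node("Account locked", advice="Contact an administrator."),
    ]),
    Node("Report will not print", advice="Check the default printer."),
])

print(render(follow(tree, [])))
print(render(follow(tree, ["Cannot log in"])))
print(render(follow(tree, ["Cannot log in", "Account locked"])))
```

Because each click reveals a single level, the user never sees the whole chart at once, which is the point made in the two bullets of the question.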

Desktop search utility for pdf, chm and djvu files [closed]

Closed 8 years ago.
I want to write a tool that helps me search pdf/chm/djvu files on Linux. Any pointers on how to go about it?
The major problem is reading/importing data from all these files. Can this be done with C and shell scripting?
Tracker ships with Ubuntu 8.04. It was a significant switch from Beagle, which users believed was too resource (CPU) intensive and didn't yield good enough results. It indexes both pdf and chm and, according to this bug report, it also indexes djvu.
Note that djvu is an image compression format (optimized to compress 'pictures of text', typically the results of scanning). As such, you won't be able to search for text, except in the metadata (this is what the link sent by cdleary refers to), or if you first use OCR on the document to convert it into text.
The same is true for PDFs whose content is scanned articles/books.
How about a plugin for Beagle?
It already searches PDFs but you can add other file types.
Here is the relevant Wikipedia page: http://en.wikipedia.org/wiki/Beagle_(software)
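If you do want to write the tool yourself rather than use Tracker or Beagle, the searching part is small once the text is out of the files. Here is a minimal sketch of an inverted index, assuming the text of each file has already been extracted to a string by whatever extractor you use for each format. The file paths and contents are placeholders, and the extraction step itself is not shown.

```python
import re
from collections import defaultdict

def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of file paths that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path, text in documents.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the paths that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Placeholder paths and already-extracted text.
docs = {
    "papers/sorting.pdf": "Quicksort and merge sort compared",
    "manuals/editor.chm": "How to sort lines in the editor",
    "scans/notes.djvu": "Merge strategies for version control",
}

index = build_index(docs)
print(sorted(search(index, "merge sort")))  # ['papers/sorting.pdf']
```

The index could be written to disk and rebuilt only for files whose modification time changed, which is roughly where the resource cost of a desktop indexer comes from.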