Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I am working with a team on a very large software project. We have tons of documentation written in MS Word format, with no hyperlinked indexes and no search ability.
Every day we waste time trying to find the exact document or reference.
I was wondering whether there is a way, or even a professional tool, to convert all of this into a wiki format so that, maybe with a little manual (painful) help, it could be organised into something more accessible.
I use Google Desktop Search to make my life a little easier, but it's not the best solution.
I just want to know if any of you faced similar problems and possible solutions to this issue.
Confluence wiki lets you import Word documents. I've been told it's a really good wiki with a lot of features.
Open Source: Trac Wiki
The Trac Wiki system is a popular open source option. There's even a thread on another Stack Exchange site about converting Word documents into the Trac Wiki format. Some large-scale or notable users of Trac include Red Hat, Django, Handbrake and SourceForge.
Commercial: Confluence
Confluence is another popular option, as mentioned by Alex Korban. A good example of a large-scale use of it is the Application Server 7 community documentation of software company JBoss.
2015 Update - PressGang and Corilla
I'll just add a quick update that I agree so strongly with this problem of discoverable content and collaboration that I've worked on solving it in a few different projects.
PressGang (the prototype)
Please refer to the PressGang CCMS project for an idea of what we did at Red Hat to build tools to solve this. The lead engineer did a run-through video that you can see on Vimeo, and I've created a public Amazon AMI if you wish to try it. It's not being maintained but it's all open source. Check out the repository or get involved in the community fork.
Corilla (the product)
That was an amazing experience to be a part of, and it inspired me to spin out Corilla, an open source technical writing startup tackling exactly the problems you mention. There will be both a commercial/enterprise version and an open source community version. There's a beta currently running that I would encourage you to get involved with: the problems you're experiencing will be great input for shaping the solutions being built.
I don't know of tools that convert MS Word to a wiki, but if you save your documentation in ODF (the XML-based OpenDocument format), it should be possible to import the documents into your own program, or transform them using XSL, and output them in wiki format.
In the process you can try to guess some links based on string matching.
The difficult part, in a team used to writing documentation in a word processor (without even using basic word processing features like indexes, tables of contents and hyperlinks), is convincing them to use another system that is not a typewriter.
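As a rough illustration of the ODF approach, here is a minimal Python sketch. It assumes the documents were saved as .odt, reads only top-level headings and paragraphs from content.xml, and emits MediaWiki-style markup; lists, tables and images are ignored. The function names odt_to_wiki and guess_links are made up for this example, and the link guessing is plain exact string matching against page titles you already know.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used inside an OpenDocument content.xml file.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def odt_to_wiki(odt_file):
    """Convert top-level headings and paragraphs of an .odt file to wiki text."""
    with zipfile.ZipFile(odt_file) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    body = root.find(f"{{{OFFICE_NS}}}body/{{{OFFICE_NS}}}text")
    if body is None:
        return ""

    blocks = []
    for element in body:
        text = "".join(element.itertext()).strip()
        if not text:
            continue
        if element.tag == f"{{{TEXT_NS}}}h":
            # Outline level 1 becomes "== Heading ==", level 2 "=== Heading ===".
            level = int(element.get(f"{{{TEXT_NS}}}outline-level", "1"))
            marker = "=" * (level + 1)
            blocks.append(f"{marker} {text} {marker}")
        elif element.tag == f"{{{TEXT_NS}}}p":
            blocks.append(text)
    return "\n\n".join(blocks)


def guess_links(wiki_text, page_titles):
    """Wrap exact occurrences of known page titles in [[...]] links."""
    # Longest titles first so "Install Guide" wins over "Install".
    titles = sorted(page_titles, key=len, reverse=True)
    if not titles:
        return wiki_text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(title) for title in titles) + r")\b"
    )
    return pattern.sub(lambda match: f"[[{match.group(1)}]]", wiki_text)
```

Note that guess_links also links titles that appear inside headings, so the output still needs the manual review mentioned in the question.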
I need a personal archive tool for programming algorithms, lessons, techniques and code.
Something like a "personal wiki" that supports image attachments, code highlighting, content categorization and search across all content at any time.
I know I can use an open source tool like a forum or MediaWiki, but I need something tailored to this personal purpose.
It can be a desktop tool or a web tool.
For those who are searching for the same thing, I found some tools:
Here are some things I've tried with their pros and cons:
OneNote
Pros
Excellent ability to organise notes. You can have books containing section groups containing sections containing pages and subpages. I've got a book for Development, then a section group for Languages, then a section for Ruby, then pages for every topic within Ruby. The highlight here is that there's no penalty for creating dozens and dozens of pages on a given topic, which helps you keep things organised when you get really deep into a specific topic. It's an easy mistake to just say "Yeah, a section for Languages, then a page for PHP", but before you know it the PHP page is half a mile long, you'll never sit down to properly re-read it again, it becomes a pain to find the info you want, etc.
Great support for multi-user notebooks. It properly keeps track of who added what and what got changed when, making co-operation easy.
Syntax highlighting can be done with the OneTastic plugin, which lets you define custom styles. Just define a custom style in a monospaced font with a special colour, and call it Code.
Support for tabular data, attached files, audio, video, etc, if you need that sort of thing.
Cons
Need to use a special app to consult it, so you can't just hit it from a work computer or the like.
Web app is clunky and lacks full features; I still haven't gotten a desktop notebook to properly sync as a web app notebook.
Search isn't the best.
MediaWiki
Pros
If you make it public, you can use Google on your notes, which is better than any other search.
CSS means it's easy to style and present it how you want without manually altering every bit of text like you'd need to in OneNote.
Because it's just a website like any other, you can access it from any device without installing anything or having to log in.
Export as ePub file, meaning you can read all your notes on your Kindle/ereader, really good for refreshing.
Any page can belong to multiple categories, which is nice.
Built in syntax highlighting with code tags.
Cons
Limited/clunky ability to organise in tiers, ultimately a fatal flaw for me.
Becomes a pain to quickly add notes to pages. (I would kill for a no-page-reload transition between read/edit modes!)
Reliant on an internet connection (usually not a problem, but something to be aware of).
Plaintext files in folders
Pros
Zero learning curve/adaptation.
Read them anywhere with no special software (tip: put them in a shared Dropbox folder, map an address on your domain to that folder).
Read natively on ereaders or convert to ebook format with no real effort.
Cons
No syntax highlighting, no image/audio/video media, no tabular data.
Hard to fuzzy search.
Edit conflicts if you're studying alongside someone.
Google Drive
Pros
Excellent support for sharing/co-operation
Good search
Good mobile support
Supports a lot of media
Cons
Tends to be slow to use
Presentation options tend to be frustrating
Reliant on internet connection
My personal recommendation: OneNote + Onetastic plugin, using all tiers/dividers, exported to PDF or multiple PDFs regularly so you can consult them from elsewhere.
Quoted from this link:
https://www.reddit.com/r/learnprogramming/comments/3acusr/how_to_take_notes_while_learning_programming/
We have a rather extensive set of documentation for our software, currently written in German. Now we want to translate this documentation into English for our foreign customers. For this we will use an external translation service.
But we want to keep the English and German versions in sync as closely as possible, since the documentation will be updated in the future along with our software. When that happens, we want to give only the changed pages of the documentation to the translation service.
Currently we use Atlassian Confluence to manage our documentation, but it has no support for internationalization.
The next approach that came to my mind was using some external tool to write and manage the documentation and then export it to Confluence.
Things I found:
How to best manage multi-lingual presentations? - Use LaTeX and export it somehow to PDF/Confluence/whatever
Some approach based on DocBook or DITA (paper in German)
So what is the best way to manage our software documentation in German and English simultaneously?
At the moment, localization support in Wikis actually seems to be very poor. See for example http://www.kilkku.com/blog/2012/09/the-final-obstacle-to-wiki-tech-comm-localization/.
You would need an efficient way to prepare the source language files for translation. This seems to be a major problem with Wikis.
In addition, with an extensive set of documentation, to even have a chance of keeping multiple languages in sync, you or your service provider should use a translation memory system that can handle your file format. Translation memory systems divide the source text into segments. Normally a sentence corresponds to a segment; optionally, segmentation can also be done at the paragraph level. The translator works on these segments. In case of an update, the translation memory system detects new and modified texts automatically. Everything else can be pre-translated from the memory.
Now, I've been managing localization projects for more than 15 years, but I've never heard of a translation tool that handles LaTeX files. On the other hand, DocBook and DITA are supported by quite a few of these tools. For example, Maxprograms Swordfish is affordable and handles DITA as well as DocBook. In addition, with both formats, there seems to be the option to output to a wiki again (for example: http://sourceforge.net/projects/dita2wiki/), though I don't know how well established these methods are.
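To make the segment idea concrete, here is a toy Python sketch of what a translation memory does on an update. It is an illustration only: the sentence splitter is naive, the memory is a plain dictionary with exact-match lookup, and the function names are invented for this example. Real translation memory systems also offer fuzzy matching and format-aware segmentation.

```python
import re


def split_segments(text):
    """Split text into sentence-level segments (naive: break after . ! or ?)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def pretranslate(source_text, memory):
    """Pair each segment with its stored translation, or None if unknown."""
    return [(segment, memory.get(segment)) for segment in split_segments(source_text)]


def segments_for_translator(source_text, memory):
    """Return only the new or modified segments that still need translating."""
    return [
        segment
        for segment, translation in pretranslate(source_text, memory)
        if translation is None
    ]
```

For example, with a memory of {"Klicken Sie auf Speichern.": "Click Save."} and an updated page that adds one sentence, only the added sentence would be sent out for translation.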
I'm investigating different documentation systems for keeping a project's documentation up to date. Most recently I've been using DITA and the DITA OT, but its complexity makes me want to shoot myself.
Are there any systems that provide the following functionality:
Markdown support
Reusable content (I can refer to previously defined paragraphs or terms)
Localization support
Preferably, free or open source
Preferably, allows for multiple output formats
I wish I could use Pandoc for this, but it doesn't appear to support reusable content.
Edit: I just ended up writing my own library for this: https://github.com/gjtorikian/markdown_conrefs
If you don't mind reStructuredText instead of Markdown, Sphinx is worth a look.
You could use pandoc plus a pre- or post-processor.
That way you could easily implement snippet reuse.
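A preprocessor of that kind can be very small. The Python sketch below is one possible shape, not an existing tool: the {{snippet:name}} placeholder syntax and the expand_snippets function are invented for this example, and pandoc itself knows nothing about them. The idea is to expand placeholders first and hand the resulting Markdown to pandoc afterwards.

```python
import re

# Hypothetical placeholder syntax for this sketch: {{snippet:name}}
INCLUDE = re.compile(r"\{\{snippet:([A-Za-z0-9_-]+)\}\}")


def expand_snippets(markdown, snippets, _active=()):
    """Replace {{snippet:name}} placeholders with text from the snippets dict.

    Snippets may reference other snippets; circular references raise ValueError.
    """
    def replace(match):
        name = match.group(1)
        if name in _active:
            raise ValueError(f"circular snippet reference: {name}")
        if name not in snippets:
            raise KeyError(f"unknown snippet: {name}")
        # Expand nested placeholders inside the snippet before inserting it.
        return expand_snippets(snippets[name], snippets, _active + (name,))

    return INCLUDE.sub(replace, markdown)
```

The expanded text can then be passed to pandoc like any other Markdown input, for example by writing it to a file or piping it to pandoc on standard input.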
This is a topic close to my heart. There are quite a lot of Markdown processor options out there, but at the time of writing they are mostly personal solutions to this persistent problem. We all tend to get frustrated, make something to help in the short term, and share it.
The challenge has been to extend this to something built for purpose and at scale, which is where I've turned my focus over the last few years. That includes first working on PressGang CCMS inside a tech writing team at Red Hat, and then being inspired to spin out Corilla, a dedicated technical writing startup building the tool you require.
PressGang (the prototype)
Please refer to the PressGang CCMS project for an idea of what we did at Red Hat to build tools to solve this. The lead engineer did a run-through video that you can see on Vimeo, and I've created a public Amazon AMI if you wish to try it. It's not being maintained but it's all open source.
It's a relatively large stack written for the most part in Java, but was useful as a look into an open source project in this space. But with bias I'd suggest...
Corilla (the product)
We cofounded Corilla as an open source company to focus on bringing together the elements of content reuse and collaboration with the ease of Markdown and AsciiDoc. I've spent years writing DocBook XML, and quickly built my own snippets for Sublime Text to minimise the considerable overhead of authoring in that markup. The tide is of course turning. We need easier ways to write faster, and we need the content to be discoverable and reusable, and to let the entire team generate it in the formats they require.
I'd encourage you to get involved with the beta, as the technical writing and developer community is driving the project, and as we solve our problems together. Being able to resource and drive this to market is far more rewarding than having to pick through incomplete processor chains. I've been there, it's time we did more.
I'm trying to find a piece of software to help a small (~3-5 people) team write A) a user guide and B) an API reference for the extensible parts of the software. We quite like the idea of using a wiki of some form, but have a few specific requirements:
The ability to export the manual(s) into both online and offline readable forms (eg, a bunch of HTML files that can be put on a web server or read locally)
Automatically create pages from XML documentation. We are writing in VB.Net/C#.Net, and most of the API specific stuff already has XML documentation comments. It would be great if the API portion of the manual could specify the classes, methods, arguments etc, but also allow the writers to link to these pages from others. (eg, have a page that details the foo class, and be able to have a page that details how to do some general task with the API link to the page for the foo.increaseBarCount() method).
That's just about it, other than the obvious ('easy to use', 'do all the writing for me so that i can get out of the tedium of writing technical documentation', 'not cause a global thermonuclear war').
Does such a piece of software exist? Can a similar system be cobbled together using mediawiki extensions?
As to "automatically create pages from XML documentation", the obvious solution would be doxygen. It creates documentation (HTML, PDF, WinHelp ...) from special doc comments embedded in source code. It handles Javadoc comments, Qt style comments and XML comments. If you generate HTML, the page names are predictable, so they are easy to link to; doxygen can also create internal links automatically. If you use other formats, you can probably embed some sort of anchor to refer to page names, but I'm not quite sure about that.
With respect to general documentation, we have had good experiences using a wiki. We use MediaWiki (of Wikipedia fame), but any decent wiki will probably do.
We have never tried printing it, but Google shows various solutions for printing from MediaWiki, so you can probably make something work without too much hassle.
The main thing we like about using a wiki for docs is that you can easily change them when you detect a mistake; that keeps them fresh. Also, no worries about having an outdated copy.
You could check out Atlassian Confluence. I believe that they have a free version (possibly just an offer, so you may end up paying in the long run, don't know) for small teams. They also have an API, so it wouldn't be too hard to write a utility to extract the XML C# documentation and create pages from it using the API, if nothing already exists.
The wiki is easy to use and is used quite a bit by open source projects. You can look at the ANTLR Documentation, which is Confluence, for an example.
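A utility like the one suggested above could start from the XML documentation file that the C#/VB.NET compiler emits (a doc root with a members list of member elements). The Python sketch below only covers the extraction half: it turns each member into MediaWiki-style page text keyed by member name. Uploading the pages through a wiki API is left out, and the function name and page layout are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Prefixes used in member IDs of .NET XML documentation files.
KINDS = {"T": "Type", "M": "Method", "P": "Property", "F": "Field", "E": "Event"}


def _clean(element):
    """Return an element's text with whitespace collapsed ('' if missing)."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def pages_from_xml_docs(xml_text):
    """Map member names (e.g. 'Demo.Foo') to wiki page text."""
    root = ET.fromstring(xml_text)
    pages = {}
    for member in root.iter("member"):
        prefix, _, full_name = member.get("name", "").partition(":")
        kind = KINDS.get(prefix, "Member")
        lines = [f"== {kind}: {full_name} ==", _clean(member.find("summary"))]

        params = member.findall("param")
        if params:
            lines.append("=== Parameters ===")
            for param in params:
                lines.append(f"* '''{param.get('name')}''': {_clean(param)}")

        returns = member.find("returns")
        if returns is not None:
            lines.append("=== Returns ===")
            lines.append(_clean(returns))

        pages[full_name] = "\n".join(lines)
    return pages
```

Because the page names are derived from the member IDs, hand-written pages can link to them predictably. Inline references such as see elements carry no text and are dropped by this sketch.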
I'm looking for various NLP tools for a project I'm working on, and so far the most useful ones I've found are the Stanford NLP projects.
Does anyone know if there are other tools that are out there that would be useful for a language understander?
And more importantly, are there tools that are NOT out there?
Most specifically, I'm looking for an API for morphophoneme analysis etc.
EDIT: I am an academic (a student working on a research project) and am mainly looking for open source or, at least, open api projects.
I suggest you take a look at the following:
The usual NLP libraries like OpenNLP, LingPipe, NLTK, GATE, UIMA. All of these provide parsers and word stemmers (i.e. they don't give you back the root of a word, but its stem). Some also provide lemmatizers.
Websites which collect NLP tools. These are but a few of them: the wiki of the Association for Computational Linguistics, Language Technology World, and the website of the computational linguistics department at Heidelberg University.
I'm not aware of a tool which returns the root of a word, but, as I said, there are stemmers and lemmatizers. For lemmatization, try Tree Tagger or Morpha. Morphophonemic analysis is a term not specific enough to get you what you want.
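The difference between a stem and a lemma is easy to see in a toy example. The Python sketch below is deliberately simplistic and is not how the libraries above work internally: the stemmer just strips a few suffixes, and the lemmatizer is a tiny hand-written lookup table. Both function names and the word lists are made up for illustration.

```python
SUFFIXES = ("ing", "ed", "es", "s")

# Tiny hand-written lookup table standing in for a real lemma dictionary.
LEMMAS = {
    "ran": "run",
    "running": "run",
    "studies": "study",
    "mice": "mouse",
}


def toy_stem(word):
    """Strip the first matching suffix; the result need not be a real word."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemmatize(word):
    """Look up the dictionary form; fall back to the word itself."""
    word = word.lower()
    return LEMMAS.get(word, word)
```

Here toy_stem("studies") gives the truncated form "studi", while toy_lemmatize("studies") gives the dictionary form "study"; irregular forms such as "ran" or "mice" are only handled by the lookup.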
Once you know more specifically what you need, you could search the archives of the Corpora List or post a question there.
NLTK is an interesting toolkit which allows building NLP-based applications. It can be used for practical applications which require, for example, POS tagging, or which implement simple classifiers or entity extractors.
I'm unsure what a "language understander" application would encompass, but this sounds like something which may be beyond what can [easily] be based upon NLTK.
Reading the question completely, and its reference to morphophonics, seems to confirm that NLTK would probably not serve the OP's purpose very well; to my knowledge NLTK doesn't offer modules that deal with text at this level. You may want to check this for yourself, however, as NLTK is a broad and active project and may have seen recent additions in this area.
I want to chime in with a link to the MontyLingua Python package, which can be found here. I think it uses a different parser than NLTK.
http://www.fslog.com/2008/09/20/montylingua3-gpled-fork-of-montylingua/
You can google a comparison with NLTK.
Maluuba has just released an API to their Natural Language Processor. It's available at http://developer.maluuba.com.
There are three libraries written for it by Maluuba:
Python Library: https://github.com/maluuba/napi-python
Ruby Library: https://github.com/maluuba/napi-ruby
Java Library: https://github.com/maluuba/napi-java
To get a sense of its power, here is an example of what can be extracted from a query:
>> client.interpret phrase: 'Set up a meeting with Bob tomorrow \
night at 7 PM to discuss the TPS reports'
=>
{:entities=>
  {
    :daterange=>[{:start=>"2012-11-15", :end=>"2012-11-16"}],
    :title=>["meeting to discuss the tps reports"],
    :timerange=>[{:start=>"12:00:00AM", :end=>"12:00:00AM"}],
    :contacts=>[{:name=>"bob"}]
  },
 :action=>:CALENDAR_CREATE_EVENT,
 :category=>:CALENDAR
}