I have a document (PDF or DOCX, the only accepted formats for multi-turn) that contains a lot of subheadings, which translate to follow-up prompts. This all works fine! But I would like to enable context-only for all of my prompts, because the answers are not relevant out of context.
Can I denote this in the document itself? There are way too many prompts to check the button manually for each one.
I could write a script that sets contextOnly to true in the exported TSV, but that seems like a silly workaround.
There does not seem to be any way to indicate whether a question is context-only through the document extraction process, so you will need to automate this with a script. If you don't want to modify the TSV directly, you can use the QnA REST API. You can also access this API through the Bot Framework CLI but I don't know if that makes anything more convenient for you.
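If you do end up editing the exported TSV, the script can stay small. Here is a minimal sketch in Python. The column name `IsContextOnly` and the value `true` are assumptions on my part, so check them against the header row of your own export. It also assumes one record per line, with no tabs or line breaks inside a cell.

```python
def set_context_only(tsv_text: str,
                     column: str = "IsContextOnly",  # assumed column name, check your export
                     value: str = "true") -> str:    # assumed spelling of the flag
    """Return the TSV text with `column` set to `value` on every data row."""
    lines = [line.rstrip("\r") for line in tsv_text.rstrip("\n").split("\n")]
    header = lines[0].split("\t")
    idx = header.index(column)  # raises ValueError if the column is missing

    out = [lines[0]]
    for line in lines[1:]:
        if not line:
            out.append(line)  # keep blank lines untouched
            continue
        cells = line.split("\t")
        cells += [""] * (len(header) - len(cells))  # pad rows that are too short
        cells[idx] = value
        out.append("\t".join(cells))
    return "\n".join(out) + "\n"
```

Read the exported file, pass its text through `set_context_only`, write the result back, and re-import it. If only some rows should be context-only, add a condition before the assignment.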
I am trying to make a dynamic PDF generator as a .NET Core API. I want to take an existing PDF or .docx file and edit it so that the current name (John Doe) is replaced with a placeholder such as #NAME_PLACEHOLDER.
I then want to transform #NAME_PLACEHOLDER -> John Doe (or whatever is in the KeyValuePair or Dictionary<string, string>).
I am running this in a Docker environment, so I can easily execute commands, and I am willing to do that as well.
So far I have tried a few things:
1) pdf2htmlEX
Executes as pdf2htmlEX file.pdf
Does the job pretty well
Can be converted back to PDF using Google Chrome headless or similar
Problem: Only the characters already used in the PDF are available for the replacement text. So if the PDF only uses A, B and C, a D will be rendered in Times New Roman (or the default font)
2) LibreOffice ODT to PDF
This was pretty nice, because I could simply unzip the .odt file, open content.xml, search and replace, then save it as an .odt file again
Could be converted into PDF rather easily using soffice --convert-to pdf
LibreOffice is quite nice
Problem 1: Microsoft Word -> Save as ODT tends to break the formatting, so we have to use LibreOffice to go and change it back again
Problem 2: We don't want to move away from Microsoft's Office suite
3) HTML to PDF using Chrome Headless
What you see is what you get
By far the best option, if we're all developers aaand have unlimited time
Problem 1: Only our developers can make changes, since our marketing department does not know HTML
Problem 2: Our existing PDFs would have to be rewritten in HTML
As you can see, I have tried a bunch of things. None of them, except Chrome Headless, has lived up to my expectations. What I really like about #3 is what you see is what you get. I can make the whole thing in HTML, press CTRL+P and see what it looks like as a finished PDF, basically.
I am looking for a better solution, though. It can be paid. It can be free. All I need is to change out words/phrases with other words dynamically, which apparently seems like a tough thing to do.
Thanks for clearly specifying what you've already tried. It helps a lot in providing a succinct answer.
The conversion is always tricky - I'm sure you know Word has trouble displaying/editing some Word documents itself.
I have experience regarding point #2 "LibreOffice ODT to PDF" and can suggest a few things to test:
Don't use Microsoft to do the docx->odt conversion. It's not good as you know. Use LibreOffice itself to do this step. The rest of your process remains the same.
For some documents, LibreOffice handles doc->odt much better than docx->odt. So you can work with the DOC format instead and get a better result without any other changes.
You won't be able to remove the devs from the process, but you can certainly reduce their role allowing your business/marketing teams to have more direct input simply by:
get the starting point document to the devs to run through the conversion process. The devs can "clean up" the document to make it convert nicely.
make this version of the document the "official" starting point. The business or technical teams can load it, adjust it, and put it back into the process.
if possible, expose a test-platform to the business teams so they can download, adjust, upload and render to PDF. This cycle means they will be able to achieve more and if they're good, do impressive stuff without any dev input.
the above steps simply mean: don't expect perfect conversion of arbitrarily complex documents. Starting from a working baseline, even a complex one, is great.
Some of that might show you that your #2 is actually going to get the best overall results.
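The unzip / search-and-replace step from your #2 can be sketched like this in Python (your API is .NET, but the same few steps port directly). This is only a sketch under some assumptions: the function name is made up, the placeholders must appear as unbroken text in content.xml (an editor may split a placeholder across several XML elements if part of it is formatted differently), and values are XML-escaped so that a name containing & or < does not break the document.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_placeholders(odt_bytes: bytes, values: dict[str, str]) -> bytes:
    """Return a copy of the .odt with placeholders in content.xml replaced."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(odt_bytes)) as src, \
            zipfile.ZipFile(out, "w") as dst:
        # Copy entries in their original order so "mimetype" stays first,
        # and reuse each ZipInfo so its compression method is preserved.
        for info in src.infolist():
            data = src.read(info)
            if info.filename == "content.xml":
                text = data.decode("utf-8")
                for placeholder, value in values.items():
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            dst.writestr(info, data)
    return out.getvalue()
```

Write the returned bytes to a new .odt file and run your existing soffice --convert-to pdf step on it.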
I hope that helps.
The system I'm working with receives PDF documents, and each document contains two clickable images. Clicking an image just opens an HTTP URL. I need to update those two URLs when I receive the document.
So my question is, is it possible to find the events and change the url and then save the file again? Those two images can be anywhere in the document so I can't look in a specific location.
Edit: I forgot to say that I'm coding in C# so it needs to be a .NET library.
Yes, it's possible.
It's hard to describe the way it can be achieved without knowing how PDFs are constructed (there are a few ways to create the described behavior) and tools you are going to use.
I just want to tell you how I solved this problem, or rather where I found the solution. I used the code in this thread, and it worked like a charm.
I need to edit a LibreOffice Calc document programmatically in C++. I know that there is the odfkit library, which uses webodf, but it looks like it doesn't support editing .ods files.
Is there any alternative that offers this feature?
LibreOffice has an API, called UNO, for controlling it from another process. So if you need something more complicated, that would be the simplest route.
If you just need some simple transformation, the other option is to unpack the file with plain old zip library (libzip, libarchive, ...) and modify the XML manually.
The OpenDocument site also mentions lpOD, but its website seems defunct, and while a search turns up something that looks relevant, I am not sure whether there is anything usable.
See the SDK documentation, which has many examples.
I am a newbie in MTurk, and I am trying to create a very simple Categorization Project via their Requester UI (rather than the API).
Each batch I use has 10 items (a question and a possible answer). I have searched their documentation and forums with no luck, so I have several questions:
When I use their Standard Categorization template, I have no option for modifying the HTML and layout (as shown for the "Tagging of an image" project). The only formatting options are for the categories, instructions and includes/excludes. Is there a way to edit the HTML of the standard template they provide?
In the Standard Categorization template, while my input data file (csv file) contains 10 items, only 5 are shown (tried with 6, still only 5 are displayed in the preview). Is there a way to change this limitation?
When I try to use the "Create HITs Individually" (rather than the standard template, as explained above), I have the "Design Layout" options, but I cannot find a way to make the questions in the "form" required (which is possible via the API). Is there a way to achieve this?
If you stick to the standard project templates, you can't modify them. That's the reason to create HITs individually (through the RUI or via the API).
You'll have to show us your CSV file, because it's not really clear from your description what the issues could be.
Your third question is unclear, but basically for creating HITs individually, you simply do standard HTML markup and put in ${variablename} placeholders wherever you want one of your CSV upload variables to be placed.
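To make the placeholder idea concrete, here is a small local preview in Python. It is not MTurk code: it only mimics the substitution with `string.Template`, which happens to use the same ${name} syntax. The column names `question` and `answer`, the layout and the `preview` helper are all made up for illustration. The `required` attribute is the standard HTML5 one that browsers enforce on form submit; whether it is enough for your layout is something to verify in the preview.

```python
import csv
import html
import io
from string import Template

# Hypothetical layout: one question with a single required radio button.
LAYOUT = Template(
    "<p>${question}</p>\n"
    '<label><input type="radio" name="category" value="yes" required>'
    " ${answer}</label>\n"
)


def preview(csv_text: str) -> list[str]:
    """Render the layout once per CSV row, roughly one rendered form per row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    rendered = []
    for row in rows:
        # Escape the values so CSV content cannot break the markup.
        safe = {key: html.escape(value or "") for key, value in row.items() if key}
        rendered.append(LAYOUT.substitute(safe))  # KeyError if a column is missing
    return rendered
```

Calling `preview` on your upload file's contents gives one HTML fragment per CSV row, which is a quick way to check that every placeholder has a matching column header.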
If your project is at all large, I would definitely recommend going through the API. It's simply much more flexible than the RUI for creating any kind of customized design.
I'm new to MODx, but am quite impressed with its power and flexibility. There's only one caveat, and I'm hoping it's just because I don't know any better.
I'm a frontend dev, and I'm used to building websites of all sizes. But I usually work with files and version control. How would I keep this paradigm with MODx?
From my poking around so far, the only way I found to use an IDE, is to keep static files with my code, to later on copy/paste into MODx Manager. Far from ideal.
I'm aware that a lot of people use an "include" snippet, to include snippets, chunks, etc. Does this work for MODx specific tags? For example, if I include a file as a snippet, and I have a template variable defined in there (or a resource link), would that be properly rendered?
Also, is there a performance hit using a snippet by including a file, vs having the snippet code entered into MODx Manager?
Bottom line, how do you develop sites on MODx? Where do you enter your code? Is there a feature like the "Import HTML" but for snippets and chunks? Is there a way to create new Templates, Documents, Chunks, TVs, etc. without going through the Manager?
Thanks in advance!
There is a whole documentation site for developing in MODx, http://rtfm.modx.com/display/revolution20/Home, though it mostly concerns extending it rather than customization and modification. The short answer is no, there is no version control for your snippets and such, and yes, you will have to maintain them manually. [I wish that was not the case]
Most of your PHP code will go into either a snippet or a plugin, and yes, you can include static files in either of those. I don't know whether there is a performance gain or loss, but I would imagine not if your include is cacheable.
For the includes you can do something like this:
include_once $modx->config['base_path'].'_path_to_my.php_';
-sean
There is VersionX for Revolution, which gives you version control of chunks, snippets, resources and so on.
There is a package called Auditor that will allow you to implement version control in MODx.
EDIT
Sorry, I just noticed your question is tagged Revolution; Auditor is for Evo. I don't think there's a solution available yet, although I believe it is on the roadmap.