Is it possible to select text in zathura without the mouse?

It is kind of annoying to use the mouse every time I have to copy text from a PDF in zathura.
I was wondering if it is possible to select and copy text in zathura similar to Vim's visual mode? I did a Google search but couldn't find anything.
I use zathura-pdf-mupdf on Manjaro with Xfce.

As of Jan '20, the zathura official documentation is being reworked and does not contain anything related to usage.
Looking at the man page, it seems that zathura does not come with select / copy keybindings that mimic Vim's visual mode. The closest thing is scrolling with h, j, k, l. Here's the full list of mouse and key bindings.
Feel free to open a feature request.


GUI automation with clicking on text

There are many GUI automation tools that allow clicking on a specified image (the well-known Sikuli, for example). Is there any way to click on specified text rather than an image? The tool would then:
take a screenshot
recognize the text in it
find the position of the target text (somehow)
send a click event to that position
It would be much easier to write tests using this approach (many interfaces have text buttons, inputs, etc.) than to take screenshots of every single element.
I've seen an OCR feature in Sikuli, but it didn't work for me (I tried invoking click('some-text-here')).
Sikuli's built-in OCR features are pretty buggy and unstable. All (or at least most) of the related issues are listed in this BUG. There are a few possible workarounds, though they are not always applicable.
If the text is known, you can take a screenshot of it and then search for that image. For example, if you know the exact font of the text, you can automatically generate such text on the screen and use it as a pattern to locate it elsewhere.
The built-in Tesseract-based OCR usually performs significantly better when the font is bigger, "fatter" and in grayscale. Hence you might do some image processing in the background before attempting the actual recognition. I used ImageMagick to resize and filter the images for better recognition. It can run in the background as a command-line tool. For example:
convert -filter spline -resize 100x -unsharp 10x20 -type Grayscale
I am aware that this does not answer your question directly but these are steps you might consider taking towards the final solution.
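The four steps from the question (screenshot, recognize, locate, click) can be sketched roughly as follows. This is only a minimal illustration of the "locate" step, assuming some OCR engine has already returned one bounding box per recognized word; the WordBox type, the find_click_point helper and the sample boxes are all hypothetical, and taking the screenshot and sending the click are left to whatever automation tool is in use.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class WordBox(NamedTuple):
    """One OCR-recognized word with its bounding box in screen pixels (assumed format)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_click_point(boxes: Iterable[WordBox], target: str) -> Optional[Tuple[int, int]]:
    """Return the center of the first box whose text equals target, ignoring case."""
    wanted = target.strip().lower()
    for box in boxes:
        if box.text.strip().lower() == wanted:
            return (box.left + box.width // 2, box.top + box.height // 2)
    return None  # the text was not recognized anywhere on the screenshot


# Made-up OCR output for a dialog with two buttons.
boxes = [
    WordBox("Cancel", 40, 200, 60, 20),
    WordBox("OK", 120, 200, 30, 20),
]
print(find_click_point(boxes, "ok"))    # (135, 210)
print(find_click_point(boxes, "Save"))  # None
```

The returned point would then be passed to the tool's click function. Matching a multi-word label would need the boxes of adjacent words to be merged first, which this sketch does not attempt.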
I'm a developer at Deskover, and we are currently developing an application, UiPath Studio, that meets your needs.
We provide text recognition on various technologies with 100% accuracy, the ability to find specific text in an area of the screen, a control or an entire window, and also the ability to click text or controls.
You can execute different actions sequentially by creating workflows.
We at Deskover are big fans of the Sikuli project. We actually use the same image recognition engine in UiPath Studio.
UiPath Studio is a visual tool that helps you create workflows easily, but you can also use the underlying API and implement an application that extracts text and clicks on it.
You can find more details about the UiPath library here.

Edge Animate Automation

Most Adobe products have the ability to be automated using AppleScript or ExtendScript/JavaScript but I don't seem to see the same capabilities in Edge Animate. Maybe I'm just missing something. I'm looking to be able to do things like open the document, add images, save the document, etc. Has anyone been able to find anything like this? I've done a number of different searches to no avail.
I'm not exactly sure what you're asking, but I think you're talking about adding your own JavaScript, which can be done by clicking the curly brackets located next to any of the elements in the animation, or hitting Ctrl+E to see the full code.
Second, in terms of opening the document, you should be able to just double-click the .an file that it creates, and saving the document works just like in any other program: File > Save (As).
Adding images is as simple as File > Import (hotkey: Ctrl+I).
Not exactly what I was asking for, but I did end up finding a solution. I was looking for a mechanism to control Edge Animate similar to how you can control InDesign, Photoshop, and Flash via VBScript or JavaScript. This allows you to do things like import images into your document from an external script, save the document, export contents, etc. In the end, I wrote some code that sends keystrokes to the application, and that resolved the problem, although not ideally, IMHO. Thanks for your responses, though.
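For anyone landing here later, the keystroke approach could look roughly like the sketch below. It is not the code from the post above: it assumes macOS, drives the app through System Events via osascript, and "Adobe Edge Animate" is an assumed application name that may differ on your machine. The helper names are made up for illustration.

```python
import subprocess
from typing import List


def keystroke_script(app_name: str, key: str, modifiers: List[str]) -> List[str]:
    """Build AppleScript lines that focus app_name and send one key combination."""
    using = ""
    if modifiers:
        # e.g. ["command", "shift"] -> ' using {command down, shift down}'
        using = " using {" + ", ".join(m + " down" for m in modifiers) + "}"
    return [
        f'tell application "{app_name}" to activate',
        f'tell application "System Events" to keystroke "{key}"{using}',
    ]


def send_keystroke(app_name: str, key: str, modifiers: List[str]) -> None:
    """Run the script via osascript (macOS only, needs accessibility permission)."""
    args = ["osascript"]
    for line in keystroke_script(app_name, key, modifiers):
        args += ["-e", line]  # each -e adds one line of the script
    subprocess.run(args, check=True)


# "Adobe Edge Animate" is an assumed application name; adjust it to the installed one.
for line in keystroke_script("Adobe Edge Animate", "s", ["command"]):
    print(line)
```

Calling send_keystroke("Adobe Edge Animate", "s", ["command"]) would then send Cmd+S to the app. A short delay after activating the window may be needed in practice, and names containing double quotes are not escaped here.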

Possible to get _all_ rendered text on OS X?

I'm looking to log all the text that gets displayed on my OS X 10.6 machine, e.g. all webpage text (no matter the browser), PDF text (not necessarily the entire PDF, but at the very least all the text that was actually viewed), anything I type into emacs, any email I write.
I've looked at the Accessibility API, but it seems to be more about describing function than content, and in any case it relies on application developers having implemented accessibility objects. Is there something lower-level? Perhaps I can watch everything that goes through the OS font renderer?
After searching for a while, my impression is that Apple doesn't explicitly make this possible, so I'm open to any hackish suggestions you might have.
You'd have to get deep inside the Window Server to have any hope of getting all the text that was written to the screen. I suppose you could patch it yourself, but it's hard to see how without source. What you want has obvious nefarious uses so there's hardly going to be a public API for it.
Just a shot in the dark, but what about turning on Screen Sharing on the 'target' Mac and pointing a modified VNC client at it? I don't know whether text is sent as text over VNC or not, but if it is, that might be one place to start. It's effectively giving you a Window Server equivalent that you control.

Finding a word's frame (position and size) on the screen using Cocoa or Carbon

Here's a tough one:
I need to be able to find a word's position and size (its frame) on the screen (its first occurrence is enough; from there I should be able to get the next ones).
For example, I would like to be able to detect word positions in (but not limited to) Word, Excel and PowerPoint for Mac, as well as Safari and others.
The solution should be as fast as possible; I should be able to find at least 5-6 words per second and use as little CPU time as possible.
Here's what I thought of so far:
OCR in a window's screenshot / graphics context (any good Open Source framework that works on Mac OS X 10.4 and that can be used in a commercial product?). Evernote is very good at spotting words in images. I don't know if it uses a custom in-house engine or an Open Source / commercial one but that would be the kind of engine I would like to use if this is a "valid" solution. Ideally I would detect the word's frame in the active application's window (how to get the frame of another application?).
Getting some kind of "hook" on Quartz drawing of text and intercepting the location of the word when it's drawn (does not seem very feasible at first glance!).
AppleScript, but it depends a lot on what API the application offers (I don't think you can get a word's coordinates in a Word document from what I've seen) and it's slow.
... out of ideas ...
My goal is to get all the words' frames in a paragraph, in the right order, based on a string containing the text of the paragraph.
Thanks in advance for any hints!
As a starting place, you may want to take a look at QuickCursor's code. It retrieves text from many different applications through the AX Accessibility APIs. Now, it won't grab the pixel placement of the word, but it will at least return the NSString associated with the text in that UI element. Of course this means that the app in question has to support these APIs; I don't know if the MS Office suite would. In addition, it only supports editable elements, so an un-editable webpage in Safari won't work either. But it may give you a starting point for some ideas.
Take a look at the QCUIElement.{m,h}, and then the implementation in the QCAppDelegate.m (beginQuickCursorEdit:)... the implementation of his abstracted QCUIElement seems to be as simple as:
QCUIElement *focusedElement = [QCUIElement focusedElement];
id value = focusedElement.value;
Edit: Aha! Check out the Accessibility Inspector Sample code: UIElementInspector. It can actually get the AXPosition of elements on a page. Now, it's not word-by-word, but we're getting closer. It'll tell you the x,y placement of a textblock, as well as the words contained in the textblock.
This is possible, but very hard to get working reliably. You can play with Spell Catcher's Direct Connect feature to see an example.
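On the OCR route mentioned in the question: once some recognizer has produced word boxes, putting them in paragraph order is the easy part. Here is a rough sketch, assuming the recognizer returns (text, frame) pairs and that repeated words come back in reading order; the function name, the Frame layout and the sample data are made up for illustration.

```python
from typing import List, Optional, Tuple

Frame = Tuple[int, int, int, int]  # (x, y, width, height), an assumed layout


def frames_in_paragraph_order(
    paragraph: str, detected: List[Tuple[str, Frame]]
) -> List[Optional[Frame]]:
    """For each word of the paragraph, pick the first unused detected box with the same text."""
    used = [False] * len(detected)
    result: List[Optional[Frame]] = []
    for word in paragraph.split():
        frame = None
        for i, (text, candidate) in enumerate(detected):
            if not used[i] and text == word:
                used[i] = True  # each box is assigned to at most one word
                frame = candidate
                break
        result.append(frame)  # None when the word was not detected
    return result


# Made-up recognizer output, deliberately not in paragraph order.
detected = [
    ("cat", (50, 10, 30, 12)),
    ("the", (10, 10, 30, 12)),
    ("saw", (90, 10, 30, 12)),
    ("the", (130, 10, 30, 12)),
]
print(frames_in_paragraph_order("the cat saw the dog", detected))
```

Words the recognizer missed ("dog" here) come back as None, so the caller can decide whether to retry or skip them.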

side-by-side diff program that supports dragging blocks of text into windows?

I'm looking for a side-by-side diff program a la xxdiff or DiffMerge that, instead of diffing files, allows blocks of text to be dragged into either the left side or right side window.
I'm refactoring some SQL embedded in source files, and it would be nice to drag the SQL statements from each source file into the diff program instead of having to cut and paste them to files and then diff the files.
Any clues appreciated, bonus for mac and linux compatibility... Thanks!
Update: both WinMerge and Beyond Compare do this perfectly... thanks again guys!
WinMerge allows you to use Alt + Left and Alt + Right to move different text blocks to the left and right.
It's free and open source, and an overall great tool as well.
If you use Beyond Compare and start a new text compare, you can just paste into the windows and it will diff what you've pasted. Not quite drag and drop, but effectively the same.
No need to have the contents you want to diff in a file. I'd really recommend Beyond Compare; it's a great tool. You can get a trial version at:
http://www.scootersoftware.com/
Just to mention, it is Linux compatible, but I've only ever used it on Windows.
gVim (gvimdiff, vimdiff) can do it, not by dragging but with keyboard shortcuts.
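If none of these tools is at hand, the same idea (diffing pasted text rather than files) can be approximated with Python's standard difflib module. This is just a sketch with made-up SQL snippets and a hypothetical diff_text helper, and it prints a unified diff rather than a side-by-side view.

```python
import difflib
from typing import List

# Two made-up SQL statements, pasted in as strings instead of saved to files.
old_sql = """SELECT id, name
FROM users
WHERE active = 1"""

new_sql = """SELECT id, name, email
FROM users
WHERE active = 1"""


def diff_text(left: str, right: str) -> List[str]:
    """Return a unified diff of two text blocks, one output line per list item."""
    return list(
        difflib.unified_diff(
            left.splitlines(),
            right.splitlines(),
            fromfile="left",
            tofile="right",
            lineterm="",  # inputs have no trailing newlines, so add none to the headers
        )
    )


for line in diff_text(old_sql, new_sql):
    print(line)
```

For an actual side-by-side view, difflib.HtmlDiff().make_file(old_sql.splitlines(), new_sql.splitlines()) produces an HTML page that can be opened in a browser.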
It has great documentation: http://www.vim.org/htmldoc/diff.html
And works on Windows too.
Just start a new file comparison with Diffuse and paste the text into the comparison panes (click the realign button if the text is multiple lines long). Diffuse is free and works on Linux, Mac, and Windows. It also has syntax highlighting for SQL.
I'm using Meld http://meld.sourceforge.net/ and tkdiff http://tkdiff.sourceforge.net/ on Unix, Linux and the like.
Devart's dbForge Schema Compare for SQL Server is a quick schema comparison and synchronization tool that has such functionality.