Stitching images together by finding similarities in the images? - objective-c

I am writing an app that stitches iOS text message screen captures together vertically.
I have the images cropped to only contain the message bubbles and none of the navigation bar of the text input bar (expect the the first and last screencaps of course) but I don't know how to detect where to line up the images.
I need to detect where there is overlap in the conversations and then stitch the images at the point. (I know how to draw the images to a new image context too.)
I am looking for an open source framework that could help me achieve this and any advice as to how to accomplish this task.
Thanks!

You can also give a go to the following code:
https://github.com/foundry/OpenCVStitch

While it might be overkill for your project, you could have a look at PanoTools: an Open Source software library for manipulating and stitching panoramic images. It's portable (Win/Lin/Mac), and if you're looking forward to making the stitching on your computer, GUIs to it like Hugin may even already fit your use case (I only used them for stitching photographs, so check the projection settings).

Related

Windows Form, image gridlines

Currently I want to create an application.multiscreen transitionstrong text
But I have a problem I do not know how to do these lines that divide the image into several parts. I want to cut the image into several parts and these parts you can handle, please at least some ideas.
If you want to split images up programmatically you are first going to have to choose a file format to support. I suggest PNG to begin with because their encoding is fairly simple to understand (see HERE) and c# has a class to decode it (see HERE.)
You'll want to think of the PNG file as a matrix of RGB values that you can split up and store into separate new smaller PNG files.
If you want to support other image file types you will have to do some research into their encoding formats and handle them differently.
Well I did. Of course other method as I create a WPF application. show image
This program is intended to control 3, 6 and 9 monitors connected by Raspberry Pi. As you can see in the image I want to first grid to have 2 buttons. The first button to set the image resolution of first grid. The second button can send images from the first grid to the first monitor and so on all grids.
Thank you very much for your help. Wait a few suggestions.

How to measure different coordinates from a PDF file on Windows?

I am looking for a way to measure the coordinates of different rectangles on a PDF file?
Mainly I do have to perform some overprinting on an existing PDF and I need to know the x,y,w,h on where I am supposed to write the texts.
It seems that Preview.app on Mac has this ability but so far I wasn't able to find anything on Windows that does the same.
Please do not confuse this feature with the Measuring Tools from Adobe Reader which are used to measure distance in printed construction stuff, not the PDF page itself.
It seems that the default using of measure is point, so I need something that would allow to select a rectangle and that will tell me the coordinates.
Please do not suggest on exporting as a imagine and using something else to measure the pixels on the image.
Update: http://legacy.activepdf.com/support/knowledgebase/view.cfm?tk=rl&kb=11866 -- PDF Units, that's what I am looking for, something to measure the PDF coordinates in PDF units.
Disclaimer: I work for Atalasoft.
I know you said not to suggest this, but honestly, it's the easiest approach:
If you mean "sweep out a rectangle in the UI and report the coordinates", that's pretty straight forward, but it's going to be a build-your-own type of thing. What you will need are:
A PDF rasterizer (GhostScript, Acrobat, FoxIt, Atalasoft) to get you an image at a specific resolution.
A tool to display that image in a window and let you sweep out a rectangle (this is straight forward winforms type code for .NET, but we have a control that does this out of the box - combining 1 & 2 into one step).
A tool that can look at the structure of a PDF page and report back the crop box (if any) and the media box for each page (iText, DotPdf).
A tool/understanding of matrix transformations to build the matrix that goes from display space into PDF space (and/or vice versa, probably in iText, definitely in DotPdf)
The code flow becomes something like:
For each page:
Open document, pull out crop and media box, rasterize page, build transformation matrix.
Display image, build/hook into event for selection changing.
Push the image viewer rectangle coordinates through the transformation matrix.
Profit.
From a coding point of view (assuming 0 prior knowledge of this, but a decent understanding of linear algebra), from 3 days to a 2 weeks. If I were to write it, it would probably take on the order of a few hours, but I wrote most of our PDF tools and this is pretty easy.
If your goal is to intuit where rectangles are on the page and report back those coordinates, that's also doable, but it decidedly non-trivial in comparison. You need to write code that can rip through a PDF display list and interpret the contents correctly. That means being able to handle all the cumulative matrix transformations, the graphics state changes, the gstate object use, Form XObject placement, and so on. You need to answer the question "what is a rectangle?" because in PDF placement, it could be an re operator, a set of degenerate beziers, a set of lines, an image of a rectangle or (surprise!) a combination of all of the above. Honestly, intuiting anything about the content on a PDF page is a Herculean task.

Photoshop jsx image grid

What I am ultimately trying to do is to create a grid of images for print that are minor variations of the same thing (different text is all). Looking through online resources I was able to create a script that changes the text and exports all of the images necessary (several hundred). What I am trying to do now is to import all of these images into a new photoshop document and lay them all out in a grid and I can't seem to find any examples of this.
Can anyone point me in the right direction to place a file at a specific coordinate (I'm using CS5 and have the design suite so if there is a way in illustrator to do this quickly...)?
Also, I'm open to other ideas on how to do this (even other programs) easily. It's for labels so the positioning on the sheet has to be pretty precise...
The art layer object has a translate() method that takes delta x and y params. You'll need to open each image, copy it to the target document, get its current location (using artLayer.bounds) and do the math to find the deltas to position it where you want it. Your deltas can be in pixels so you'll get plenty of precision.
Check out your 'JavaScript Scripting Reference' pdf in your Adobe install directory for more details.
Ok I'm marking Anna's response as the answer because though I didn't fully test it, it seems like it should work and answers the original question with jsx. However I'm also leaving my final solution in case anyone else runs across this with the same issue and may prefer this method as well.
What I ended up doing instead is using InDesign. I figured out that it has a grid option that lets you import a number of files and place them all in an equal grid in a single command. This is almost exactly what I was looking for, except that it leaves a small border/margin in between the columns and grids and mine were designed to meet exactly.
I couldn't figure out how to make it not have the border (I have very little experience with InDesign, it may be possible). However I was able to select all my images and scale them uniformly to be the correct size, then I just selected each column and dragged it over to snap to the adjacent column and the same with rows...

Finding a word's frame (position and size) on the screen using Cocoa or Carbon

Here's a tough one:
I need to be able to find a word's position and size (its frame) on the screen (its first occurence is enough, from there I should be able to get the next ones).
For example, I would like to be able to detect word positions in (but not limited to) Word, Excel and PowerPoint for Mac, as well as Safari and others.
The solution should be as fast as possible; I should be able to find at least 5-6 words per second and use as little CPU time as possible.
Here's what I thought of so far:
OCR in a window's screenshot / graphics context (any good Open Source framework that works on Mac OS X 10.4 and that can be used in a commercial product?). Evernote is very good at spotting words in images. I don't know if it uses a custom in-house engine or an Open Source / commercial one but that would be the kind of engine I would like to use if this is a "valid" solution. Ideally I would detect the word's frame in the active application's window (how to get the frame of another application?).
Getting some kind of "hook" on Quartz drawing of text and intercepting the location of the word when it's drawn (does not seem very feasible at first glance!).
AppleScript, but it depends a lot on what API the application offers (I don't think you can get a word's coordinates in a Word document from what I've seen) and it's slow.
... out of ideas ...
My goal is to get all the word's frames in a paragraph in the right order based on a string containing the text of the paragraph.
Thanks in advance for any hints!
As a starting place, you may want to take a look at QuickCursor's code. It retrieves text from many different applications through the AX Accessibility APIs. Now, it won't grab the pixel placement of the word, but it will at least return the NSString associated with the text in that UI element. Of course this means that the app in question has to support these APIs; I don't know if the MS Office suite would. In addition, it only supports editable elements, so an un-editable webpage in Safari won't work either. But it may give you a starting point for some ideas.
Take a look at the QCUIElement.{m,h}, and then the implementation in the QCAppDelegate.m (beginQuickCursorEdit:)... the implementation of his abstracted QCUIElement seems to be as simple as:
QCUIElement *focusedElement = [QCUIElement focusedElement];
id value = focusedElement.value;
Edit: Aha! Check out the Accessibility Inspector Sample code: UIElementInspector. It can actually get the AXPosition of elements on a page. Now, it's not word-by-word, but we're getting closer. It'll tell you the x,y placement of a textblock, as well as the words contained in the textblock.
This is possible, but very hard to get working reliably. You can play with Spell Catcher's Direct Connect feature to see an example.

Automated Development of Presentation with Interactivity

I am trying to identify the right tool, language, software package, or other for the automated development of presentations, where the presentation is user interactive.
The presentation will consist of images with titles and some descriptive text. Most of the time there will be 35–70 images. I would like to show each image on a separate page, slide, tab, etc. (I guess proper terminology depends on the solution.)
The images will change, but the titles will remain the same, and there will be a little bit of change to the description of each image.
After putting the presentation together, I would like the user to be able to circle and "write" on the electronic image in kind of the wax pencil sense (I previously worked in a photo lab and we worked with wax pencils on negatives all the time and would like to have kind of a similar flexibility). Moreover, I would like users to be able to add comments as well, kind of in the way Adobe PDF Professional allows, e.g. inserting bubble comments, etc.
Most importantly, I would like to be able to do this in an automated way. Right now we are using PowerPoint, but the amount of time it is taking to put an image on a slide in PowerPoint, resize it, and then set up the text is killing us. Plus, as the images change it takes tons of time to go back and update them. Thus, we would like something that is a bit faster to update images and get the feedback from our few users. Does not necessarily have to be a web hosted solution, but could be run through a browser.
Sorry this is so long and thanks for any ideas and feedback, especially if there is an existing software package solution, language that can be used, or other approach to get this done.
These days, two of the most popular are Adobe Captivate and Articulate Presenter. For service, instead of product, you can check out services like http://voicethread.com.
I don't know of any product that completely answers your requirements.
But, for similar results I use two different tools for developing the presentations and another one for drawing while presenting.
If I just want to make a presentation made of pictures and texts, and I want to automate its creation, I use irfanview http://www.irfanview.com/ with its wonderful feature for automated slideshows. I put all the images together, annotate them (I use either their filenames, or if not enought, with EXIF and comment fields) and create a slideshow, that can be compiler into an .exe file.
If I want a more elaborated presentation. With full annotation capabilities, I use Wink http://www.debugmode.com/wink/
For drawing over the screen during the presentation, I use a very old bitmap drawing program, called PC-Draw, that allows, with a hotkey, to capture the screen as a bitmap and begin drawing over it, and with another hotkey, to return to the original screen without altering the running programs at all. I have not found it anywhere in the web. However, I found similar programs just a quick google away.
All three tools are free and easy (and even fun) to use.