Using an Office 2007-style extension (e.g. .docx) for a skin-based On-Screen keyboard

I'm creating an On-Screen keyboard for my application, and it supports skins as well.
Here's what I'm doing with the skins: I have a folder that contains some images and an XML file that maps the images to the keyboard. I want to ship that folder as a single zip file, as Office 2007 (.docx) and iPhone firmware (.ipsw) do. I know I can simply zip the folder and change the extension; what I need to know is how to read the files from code.

You've got two options: either 1) use a zip library such as SharpZipLib or DotNetZip, or 2) use the System.IO.Packaging namespace. Option 1 is probably the easiest.
There's nothing really magical about what Office and other programs are doing: they just read a zip file and pull things out of it as needed. Instead of loading an image from disk, you load it from a MemoryStream.
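The read-from-zip idea can be sketched in a few lines. This is only an illustration in Python, not the .NET code the question asks for; the function name load_skin and the manifest name skin.xml are assumptions made up for the example, and io.BytesIO plays the role that MemoryStream plays in .NET.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def load_skin(skin_path, manifest_name="skin.xml"):
    """Load a zipped skin into memory.

    manifest_name is a hypothetical name for the XML file that maps
    images to keys; use whatever name the skin format actually defines.
    Returns the parsed XML root and a dict of in-memory image streams.
    """
    with zipfile.ZipFile(skin_path) as archive:
        # Read every file entry; names ending in "/" are directory entries.
        files = {
            name: archive.read(name)
            for name in archive.namelist()
            if not name.endswith("/")
        }
    layout = ET.fromstring(files[manifest_name])
    # Wrap each image in a BytesIO so it can be handed to an image loader
    # exactly as if it were an open file.
    images = {
        name: io.BytesIO(data)
        for name, data in files.items()
        if name != manifest_name
    }
    return layout, images
```

Nothing is unpacked to disk here: the archive (whatever its extension) is opened once, and each image is available as a stream keyed by its path inside the zip.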


How do I use an existing PDF as a container?

I have a PDF (created from Word) for a game I wrote for an old 8-bit computer, and I'd like to embed the code for that game (binary, less than 32k) into that PDF. This way, my emulator can load the program by reading the PDF, and the two can be stored and shared in one file.
You could call this a form of steganography.
I know a PDF has a tree structure and uses ASCII to define its components; is there a way to add inert, "orphan" elements that won't cause problems for PDF readers? I think that would be the easiest way to do it. But I'm not sure how to do it.
The simplest solution would be to use a document attachment or a file attachment annotation.
Most PDF tools that are available support these features as they are pretty basic.

Export Video from PowerPoint using VBA

I want to export a video from PowerPoint using VBA. The video was inserted from the PC (embedded), not linked. I saw that this is possible for images using this line of code:
ActivePresentation.Slides(1).Shapes(1).Export "C:\Cover.PNG", ppShapeFormatPNG
but I couldn't do the same for videos.
Do you simply need to extract the original video from the PPTX file? If so, rename the file to give it a .ZIP extension, open the zip file and browse to the ppt/media folder. In it you'll find copies of any inserted (but not linked) sounds, videos and pictures.
If you have routines for working with ZIP files in VBA, you could probably work out how to do the same thing.
A possible alternative: size the video to fill the slide, then use PPT's SaveAs method to save as a WMV.
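For reference, the rename-and-unzip approach does not need the rename when done from code, because a zip library only looks at the file's contents. A minimal sketch in Python (not VBA), assuming the media live under ppt/media/ inside the package; extract_media is a name invented for this example:

```python
import zipfile


def extract_media(pptx_path, out_dir):
    """Copy every embedded media file from a .pptx into out_dir.

    Returns the list of paths written. Linked (not embedded) media are
    not stored in the package, so they will not appear here.
    """
    written = []
    with zipfile.ZipFile(pptx_path) as package:
        for name in package.namelist():
            # Skip directory entries; keep only files under ppt/media/.
            if name.startswith("ppt/media/") and not name.endswith("/"):
                written.append(package.extract(name, out_dir))
    return written
```

The files keep the names PowerPoint gave them inside the package, so a video may come out under a generic name rather than its original file name.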

Playing movie files without the URL in Visual Basic 2010

Right now in my project I have the Windows Media Player object, and I can get it to play movies by putting the movie URL into the URL property. How do I get it to play a movie that is in my resources? When I copy the executable to my other laptop to test it, it doesn't work.
If I remember correctly, when you embed a video file into the application you only embed its data, and the extension gets stripped.
So in order to play it, you would need to copy it to a temporary directory, add the extension yourself, and then play it from that location.
I haven't needed to do such a thing since 2005, so Visual Studio may have changed since then.
The easiest way would be to store the video file in the same location as your program's executable and simply refer to it by name.
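The copy-to-a-temporary-file step is the same in any language. As a rough illustration in Python rather than Visual Basic (write_temp_media is a made-up helper name, and ".wmv" is only an example extension), the embedded bytes are written to a temp file that carries the right extension, and the resulting path is what would be handed to the player's URL property:

```python
import tempfile


def write_temp_media(data, extension=".wmv"):
    """Write embedded media bytes to a temp file with the given extension.

    Returns the path of the new file. The caller is responsible for
    deleting it once playback has finished.
    """
    # delete=False keeps the file on disk after the handle is closed,
    # so another component (the media player) can open it by path.
    with tempfile.NamedTemporaryFile(suffix=extension, delete=False) as handle:
        handle.write(data)
        return handle.name
```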

PowerPoint Programmatically open/play mediaobject in add-in

I am working on a VSTO PowerPoint 2010 add-in that will let the user play back a media object (video or audio) in a Windows Form using the Windows Media Player control.
How can I extract the embedded media object and play it back to the user?
I have access to the object's name; will that be enough to get to the embedded object?
Kinda yes and no.
The "No". Through VBA and VSTO, the answer is no, or at least I've never seen it done. I've looked at this before and didn't find it to be possible.
The "Kinda Yes". Any embedded media in 2007/2010 can be extracted through Open XML. Here's where the "kinda" comes in: you can extract it so long as you know what you're extracting. That sounds easy enough, but it's not. When you insert a video or audio file, it gets embedded into a shape. That shape is given a name [1], which is the file name of the audio/video file. So if I insert the sample video that comes with Win7, the shape that holds the video is named "wildlife.wmv". An end user who knows how (via the Selection Pane in the client) can easily rename it, and in that case it would be impossible to find based on the name alone.
But if it hasn't been renamed, you would open an in-memory copy of your .pptx in Open XML and search for the name in each of the slides in the /ppt/slides/ folder. Once found, use its relationship ID to locate its file in the /ppt/media folder. Then you can pull it out, save it to disk, play it, etc.
[1] PowerPoint, however, renames the file based on an internal naming convention. My "wildlife.wmv" is renamed "media1.wmv" inside the package. Subsequent media items would be named media2.wmv, media1.mpg, etc.
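The lookup described above (shape name, then relationship ID, then file under /ppt/media) can be sketched without the Open XML SDK, since the package is a zip of XML parts. The Python below is a rough sketch under assumptions, not the SDK approach: find_media_parts is a hypothetical helper, it assumes each slide's relationships sit in ppt/slides/_rels/<slide>.xml.rels, and it collects every relationship ID referenced inside the matching shape, so a poster-frame image may be returned alongside the video.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace of r:embed / r:link attributes inside slide XML.
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# Namespace of the <Relationship> elements inside .rels parts.
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_media_parts(pptx, shape_name):
    """Return ppt/media/ part names referenced by shapes named shape_name."""
    found = []
    with zipfile.ZipFile(pptx) as package:
        names = set(package.namelist())
        slides = sorted(n for n in names
                        if n.startswith("ppt/slides/") and n.endswith(".xml"))
        for slide_path in slides:
            root = ET.fromstring(package.read(slide_path))
            parents = {child: parent for parent in root.iter() for child in parent}
            rel_ids = set()
            for element in root.iter():
                # cNvPr holds the shape name shown in the Selection Pane.
                if element.tag.endswith("}cNvPr") and element.get("name") == shape_name:
                    # Climb two levels (cNvPr -> nv*Pr -> shape element).
                    grand = parents.get(parents.get(element))
                    shape = grand if grand is not None else root
                    for node in shape.iter():
                        for attr, value in node.attrib.items():
                            if attr.startswith("{" + R_NS + "}") and value:
                                rel_ids.add(value)
            folder, filename = posixpath.split(slide_path)
            rels_path = posixpath.join(folder, "_rels", filename + ".rels")
            if not rel_ids or rels_path not in names:
                continue
            rels = ET.fromstring(package.read(rels_path))
            for rel in rels.iter("{" + PKG_REL_NS + "}Relationship"):
                if rel.get("Id") in rel_ids and rel.get("TargetMode") != "External":
                    # Targets are relative to the slide's folder.
                    target = posixpath.normpath(
                        posixpath.join(folder, rel.get("Target")))
                    if target.startswith("ppt/media/") and target not in found:
                        found.append(target)
    return found
```

Each returned name can then be read from the same zip and saved or played. As noted above, this only works while the shape still carries its original name.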

Extract text from a PowerPoint (.ppt or .pptx) file?

I'm currently using a combination of OpenOffice macros and a pdf2text program to extract text, and would like to find an easier, more efficient way of getting the text out of a PowerPoint file.
I've tried the Apache POI library without much luck: I ran into numerous exceptions inside the library when processing the files I'm looking at, and I don't particularly want to sift through the library's source code.
Is there an easy way to do this without using the aforementioned library?
If you have MS Office and you save the PPT as RTF (Rich Text Format), the resulting file contains just the text from the presentation. You could then open the file in any editor that understands RTF files and save it as a text (TXT) file.
I expect this to work from Open Office too.
Since you mention an API, this may not be the way to go for you, but maybe it will give you new ideas for getting there. Say, you could use multiple macros to do the conversion in stages...
Edit: I got curious and did a quick Google search.
This is what I found on one of the pages:
As people in this thread have pointed out, retrieving text from an OO
document isn't hard since it's just zipped xml that can be parsed with a
perl script. The problem is getting Microsoft Powerpoint documents into
a zipped XML format in the first place.
I've found that File -> Wizards -> Document Converter does exactly that.
Just tell it you want to convert Powerpoint documents, not templates,
point it to your source directory and where you want it to spit out the
result and you're away.
I then find unzip -p $file.sxi content.xml | perl -p -e
"s/<[^>]*>/\n/g;s/ +//;s/\n\n/\n/g;" -w
works rather well for extracting the text.
Sorry, I don't have Open Office handy to try any of that out.
pptx files are relatively easy to deal with, because they are just zipped xml - you can just unzip them and then strip all the xml tags from the content of the files in the 'ppt/slides' subdirectory of the unzipped stuff, yielding most of the pertinent text.
ppt files are a whole other ballgame, and the process is rendered even more painful because the canonical tool, catppt from the catdoc package, is susceptible to a buffer overflow that makes it nearly useless (it segfaults on a large percentage of ppt files).
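For the .pptx case, the unzip-and-strip-tags approach can be sketched in Python. Rather than stripping all tags with a regex, this version collects the DrawingML text runs (a:t elements), which is where slide text is stored; extract_pptx_text is a name chosen for the example, and slides are ordered by the number in their file name, which is assumed here to match presentation order but is not guaranteed to.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# DrawingML namespace; text runs are stored in <a:t> elements.
A_NS = "http://schemas.openxmlformats.org/drawingml/2006/main"
SLIDE_RE = re.compile(r"ppt/slides/slide(\d+)\.xml")


def extract_pptx_text(pptx_path):
    """Return a list with one string of text per slide in a .pptx file."""
    texts = []
    with zipfile.ZipFile(pptx_path) as package:
        slides = [n for n in package.namelist() if SLIDE_RE.fullmatch(n)]
        # Sort numerically so slide10.xml comes after slide2.xml.
        slides.sort(key=lambda n: int(SLIDE_RE.fullmatch(n).group(1)))
        for name in slides:
            root = ET.fromstring(package.read(name))
            runs = [t.text for t in root.iter("{%s}t" % A_NS) if t.text]
            texts.append(" ".join(runs))
    return texts
```

This yields "most of the pertinent text" in the same sense as above: presenter notes live in a separate folder of the package and are not read here, and this does nothing for legacy .ppt files.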
LibreOffice-5 File - Export - HTML includes both slide contents and presenter notes.
Then, open the .html file in Firefox or other browser, and File - Save Page As - Text File (or utility such as pandoc -o file.txt file.html).