How to search for .xlsx files? (Excel 2007 and newer) - google-custom-search

I am using Google Custom Search along with the XML API. From the documentation linked below, I can see that the XML API supports searches for .xls files, but what about .xlsx files? Half of our files are now in the newer .xlsx format and we need them to show up in our search results.
How does one search for .xlsx files with the XML API? This is not covered anywhere in the XML API documentation and searching for .xls files does not return any results for .xlsx files, when it should.
https://developers.google.com/custom-search/docs/xml_results

I figured it out. You can use filetype:xlsx even though the documentation does not include xlsx as a supported file type to search for.
Also, you can search for multiple file types. Here is the documentation on that: https://developers.google.com/custom-search/docs/xml_results#wsSpecialQueryTerms
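As a minimal sketch of the idea above, the query can combine several filetype: terms with OR. The endpoint and the cx, client, and output parameters shown here are assumptions based on the XML API request format and should be checked against the linked documentation; the helper names and the cx value are hypothetical.

```python
from urllib.parse import urlencode


def build_query(terms, filetypes):
    """Combine search terms with one or more filetype: restrictions."""
    # Several file types are joined with OR,
    # e.g. "filetype:xls OR filetype:xlsx".
    type_filter = " OR ".join(f"filetype:{ext}" for ext in filetypes)
    return f"{terms} {type_filter}".strip()


def build_url(cx, terms, filetypes):
    """Build a request URL (endpoint and parameter names are assumptions)."""
    params = {
        "cx": cx,                  # your search engine ID (placeholder)
        "client": "google-csbe",
        "output": "xml_no_dtd",
        "q": build_query(terms, filetypes),
    }
    return "https://www.google.com/cse?" + urlencode(params)


if __name__ == "__main__":
    print(build_url("YOUR_ENGINE_ID", "budget", ["xls", "xlsx"]))
```

urlencode takes care of escaping the colon and spaces in the q parameter, so the filetype: terms can be written in plain form.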

Related

Import .docx contents into MS Access

I began writing a docx document to do a project of mine.
Recently, I realized that it would be easier to manage that data if it was in a database.
So, I wanted to import that data into MS Access automatically, to avoid copying and pasting the data manually.
Is there any way to do it? I have only encountered ways of opening the Word application via Access. I also know that docx has an XML structure, so I imagine that if I can open that structure, it would be easy to write a parser in VBA.
There are two basic ways information can be taken out of a Word document and put into an Access database: automating the Word object model using VBA code running in either Word or Access OR extracting the WordOpenXML that makes up the Word document. You indicate you lean towards the second option.
Here, again, there are a number of approaches available:
Use VBA in Word or Access to extract the WordOpenXML of the document open in the Word application user interface.
Use VBA in Access together with non-VBA tools to "crack open" the Zip file and extract the XML.
Use the tools available in the .NET Framework to extract the content of the ZIP file and write it to Access using an OLE DB connection.
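To illustrate the "crack open the Zip file" idea outside of VBA, here is a minimal Python sketch that reads word/document.xml from a .docx and returns the plain text of each paragraph. The function name is hypothetical, and the sketch ignores formatting, headers, footnotes, and nested structures; writing the result to Access (for example over ODBC) is not shown.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as <w:p> and <w:t>.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_paragraphs(docx_file):
    """Return the text of each paragraph in the main document part.

    docx_file may be a path or a file-like object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more <w:t> runs.
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling read_paragraphs("songs.docx") (a hypothetical file name) would give a list of strings that could then be split into title and song text according to the document's layout.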
I understand your goal is to be able to recreate the document at a later point for printing, so you want to preserve all the formatting. In addition, you want to be able to read the content from within Access.
I believe this will require a minimum of four fields in the Access table:
1) ID
2) Title
3) Text of song
4) The complete WordOpenXML for re-creating the document
You don't mention (4) in the discussion and problem description, but if you want to store the formatting AND you want to be able to read the content I believe this is necessary. While WordOpenXML is "readable", there's a lot of mark-up in there which doesn't make reading comfortable.
All things being equal, I'd go for either VBA working on the open Word document or the .NET approach, using the Open XML SDK (free download .NET library you can reference in Visual Studio and distribute with solutions).
One important thing to keep in mind is storing the Word Open XML in the database. Unless something has changed in Access, you can't store the ZIP file - you need a "streamable" format. That would be the OOXML OPC flat-file format.
When you read the WordOpenXML from a document using VBA, that's what you get, which is why that would be an option for me. The Open XML SDK doesn't have that option, but there is code available from Eric White's blog for doing this.
When you later want to recreate and print the document it should be enough to stream the WordOpenXML to a file with the .xml extension. Or you could convert it back to a docx zip file (same blog).

How to put files inside files

MS Word's .docx files contain a bunch of .xml files.
Setup.exe files spit out hundreds of files that a program uses.
Zips, rars etc also hold lots of compressed stuff.
So how are they made? What does MS Word or another program that produces these files have to do to put files inside files?
When I looked this up I just got a bunch of results about compression, but let's say I wanted to make a program that 'wraps' files inside a file without making the final result any smaller. What would I even have to write?
I'm not asking/expecting any source code that does this, I just need a pointer. Is there something you think I'm misunderstanding based on what I've asked here?
Even a simple link to an article or some documentation would be greatly appreciated.
Ok, I'll just come up with some headers for ordinary files and write them along with the bytes of the actual files into one custom-defined file. You guys were very helpful, thank you!
Historically, Windows had a number of technologies to support solutions like this. These were often called Compound Files or Structured Storage. However, I don't think the newer Office documents use these technologies. I think the Office file formats are essentially ZIP files with a different extension. If you rename a file with a .docx extension to .zip and open it with your favorite compression tool, you'll see a bunch of folders and XML files.
Here are some links to descriptions of different file formats that create "files within files"
Zip file format
Compound File Binary Format (CFBF)
Structured Storage
Compound Document File Format
Office Open XML I: Exploring the Office Open XML Formats
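For the case raised in the question, wrapping files without making the result smaller, the ZIP format itself allows uncompressed ("stored") entries. The following is a minimal Python sketch using the standard zipfile module; the function names are hypothetical.

```python
import zipfile


def wrap_files(archive_path, members):
    """Write name -> bytes pairs into a ZIP archive without compression."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)


def list_members(archive_path):
    """Return (name, original size, stored size) for every entry."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            (info.filename, info.file_size, info.compress_size)
            for info in archive.infolist()
        ]
```

With ZIP_STORED the stored size of each entry equals its original size; the archive only adds its own headers and directory on top.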
At least on POSIX systems (e.g. Linux), a file is only a stream (i.e. a sequence) of bytes. And you can only grow (or shrink, i.e. truncate) it at the end - there is no way to insert bytes in the middle (without copying the rest).
You need some conventions, and some additional software, to handle it otherwise.
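One such convention, along the lines of the "headers plus bytes" idea from the question, can be sketched in Python as follows. The layout (a magic marker, an entry count, then a small header before each file's name and data) is entirely made up for illustration and is not any existing format.

```python
import struct

MAGIC = b"PACK"  # hypothetical 4-byte marker identifying the container
# Per-entry header: name length (uint16) and data length (uint64), little-endian.
ENTRY_HEADER = struct.Struct("<HQ")


def pack(files):
    """Serialize a dict of name -> bytes into one container blob."""
    out = bytearray(MAGIC)
    out += struct.pack("<I", len(files))  # number of entries
    for name, data in files.items():
        encoded = name.encode("utf-8")
        out += ENTRY_HEADER.pack(len(encoded), len(data))
        out += encoded
        out += data
    return bytes(out)


def unpack(blob):
    """Read a container blob back into a dict of name -> bytes."""
    if blob[:4] != MAGIC:
        raise ValueError("not a PACK container")
    (count,) = struct.unpack_from("<I", blob, 4)
    offset = 8
    files = {}
    for _ in range(count):
        name_len, data_len = ENTRY_HEADER.unpack_from(blob, offset)
        offset += ENTRY_HEADER.size
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        files[name] = blob[offset:offset + data_len]
        offset += data_len
    return files
```

Because entries are simply appended, adding a file at the end is cheap, while removing or resizing an entry in the middle means rewriting everything after it, which is the limitation described above.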
You might be interested in SQLite, which gives you a library to handle a single file (e.g. *.sqlite) as an SQL database.
You could also use GDBM - a library giving you some indexed file abstraction.
libtar is a library to manipulate tar archives. See also tardy, a tar file postprocessor.

create one pdf from multiple ppt files

Does anyone know how I can create one PDF file from multiple PPT files?
It could be done by writing a script or a program, but an existing program that does it would be best.
I searched the web for something like this but I didn't get any results.
If you want to convert the PPT/PPTX files to PDF and then join those converted PDF files into a single PDF using either .NET or Java, you may try Aspose.Slides and Aspose.Pdf.Kit components.
Aspose.Slides allows you to convert the PPT/PPTX files to PDF and Aspose.Pdf.Kit allows you to join the PDF files into a single PDF. Please see if this solution can work for your scenario.
Disclosure: I work as developer evangelist at Aspose.

Generating Excel file from Cocoa (iPhone app)

I'd like to offer the possibility for users of my app to export to Excel. I don't ever need to read Excel files.
The three ways I know of right now are to:
make a CSV file, which isn't too great as I'd like to have some custom formatting in the spreadsheet
make an XML file that I don't think people'd recognize as an Excel file
make a template xlsx file, unzip it in the app, do a lot of search-replacing in the files and then zip it back up again
Are there other alternatives? I'm not sure how supported .xlsx files are, and that seems like very much work. Are there any frameworks out there I can lean on, that perhaps even make old-school .xls files?
Cheers
Nik
Some options for you to consider:
1) You may be able to use OOXML: http://en.wikipedia.org/wiki/Office_Open_XML_file_formats. You may need the "Office Compatibility Pack" on computers with Excel 2003 or lower: http://go.microsoft.com/?linkid=5754865.
2) Excel 2000 uses the BIFF file format: http://sc.openoffice.org/excelfileformat.pdf (pdf). You may be able to create simple documents from the spec or based on other info on the web.
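The template approach from the question (unzip a template .xlsx, search and replace, zip it back up) can be sketched in a language-neutral way. The Python below is only an illustration of the steps, not Cocoa code; the placeholder syntax and function name are hypothetical, and it assumes the XML parts are UTF-8 encoded.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_template(template, output, replacements):
    """Copy a template .xlsx, replacing placeholders in its XML parts.

    template and output may be paths or file-like objects;
    replacements maps placeholder strings to the text to insert.
    """
    with zipfile.ZipFile(template) as src, \
            zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(".xml"):
                text = data.decode("utf-8")  # assumption: parts are UTF-8
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the result stays valid XML.
                    text = text.replace(placeholder, escape(value))
                data = text.encode("utf-8")
            # Non-XML parts (images, relationships) are copied unchanged.
            dst.writestr(info.filename, data)
```

In a typical workbook the cell strings live in xl/sharedStrings.xml, so placeholders typed into template cells would usually be found there.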

Rendering Word document without word

Are there any solutions for rendering MS Word 2003 documents (WordML) to PDF without MS Word? I found Aspose.Words, which seems good but has some problems. Is there any other solution out there?
You could use OpenOffice. It reads and writes Word documents and can save documents as PDF.
Another solution might be Altsoft's xml2pdf.
Antiword. I used this a while ago to have a Linux web app read MS Word documents and it worked fine.
Consider ZamZar.
On-line file conversion between many file formats: Upload the source file specifying target file type and an Email address; receive the converted file in your inbox.