Modify the content of an MS Word file contained inside a .zip file, without extracting it? - vb.net

Is it possible to manipulate the content of an MS Word file contained inside a .zip file, without extracting it?
I have 2,000 zip files containing Word files. I need to modify the same field in each of the 2,000 zipped MS Word files. Is this possible without extracting the file first?

Yes, it is possible, although whether it counts as "without extracting" is a matter of semantics. When I do this with single documents, I COPY (not extract) the XML file from the zip container, edit it as required, and then OVERWRITE it back into the zip container.
I've also tried to edit the file from within the zip file, but it can't be saved directly (at least not the way I tried), so a Save As (for example in Notepad++) would be required...
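
For the batch case in the question, the same copy, edit, overwrite idea can be done in memory. This is a minimal sketch in Python rather than VB.NET, and it rests on some assumptions: the Word files are .docx (themselves zip containers holding word/document.xml), the text to change appears as one unbroken string in that XML (Word sometimes splits text across runs, in which case a plain replace will miss it), and the function names are hypothetical. Python's zipfile module cannot overwrite a member in place, so the sketch builds a new archive and returns it.

```python
import io
import zipfile


def edit_docx_bytes(docx_bytes, old, new):
    """Return a new .docx (as bytes) with `old` replaced by `new` in word/document.xml."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                # Plain string replace; assumes the text is not split across runs.
                data = data.decode("utf-8").replace(old, new).encode("utf-8")
            dst.writestr(item, data)
    return out.getvalue()


def edit_docx_in_zip(zip_bytes, old, new):
    """Return a new outer zip (as bytes) in which every .docx member has been edited."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.lower().endswith(".docx"):
                data = edit_docx_bytes(data, old, new)
            dst.writestr(item, data)
    return out.getvalue()
```

To process the 2,000 archives you would read each zip's bytes, pass them through edit_docx_in_zip, and write the result back over the original file (after testing on copies). Nothing is extracted to disk, but the archive is still rewritten, which is the "semantics" point above.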

Related

How do I search multiple PDFs for a phrase and then copy the PDFs containing that string to a directory?

I have a directory of 5000+ single page PDF documents, and I need to filter those that contain a certain name. Finding the documents is easy enough, but is there a tool that would automatically copy these into a directory, while renaming them based on the sub-folder they are in?
I have tried using Adobe PDF and Foxit. While they did return all the relevant search results, there are more than 250, and it's very tedious to copy and rename each one individually.

How to search full content of code files (e.g. SQL scripts) stored in a SharePoint folder?

Have you run into this situation: many code files (e.g. SQL scripts, *.sql) are stored in a SharePoint folder, and you need a full-text search to find the one containing certain keywords (a table/column/function name, a comment, someone's initials, etc.)?
The problem is that the search function in a SharePoint document library only looks at the filenames, not the full content.
Of course, many code repo systems (e.g. BitBucket) have advanced full-text search functions (e.g. definition over usage), but in this case SharePoint hosts the "code repo", and the business depends on it...
Through experiment I found that the SharePoint search function does look at the full content of a few file types: plain text (.txt) and Word (.doc, .docx) files.
Since code files (*.sql, *.py, *.js, *.htm, *.css, etc.) are essentially text files, we can "cheat" SharePoint by appending ".txt" to the filename (e.g. my_script.sql → my_script.sql.txt), which makes the file content searchable once indexing finishes (in a matter of seconds, according to my test).
So if you manage a SharePoint folder of many code files, you can download the entire folder as a zipped file, unzip and modify all code filenames (with utility tools to do this in batch or run a DIY script), and re-upload... voila!
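
As an illustration of the "DIY script" step, here is a minimal Python sketch that appends ".txt" to every code file under an unzipped folder. The function name, the folder argument and the exact set of extensions are assumptions for the example; adjust them to the file types you actually store.

```python
from pathlib import Path

# Extensions treated as code files; taken from the examples above, extend as needed.
CODE_SUFFIXES = {".sql", ".py", ".js", ".htm", ".css"}


def append_txt(folder, suffixes=CODE_SUFFIXES):
    """Rename e.g. my_script.sql to my_script.sql.txt for every code file under `folder`."""
    renamed = []
    # sorted() materialises the listing first, so renaming does not disturb the walk.
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            target = path.with_name(path.name + ".txt")
            path.rename(target)
            renamed.append(target)
    return renamed
```

Run it on the unzipped download before re-uploading. Files that already end in .txt, or that have other extensions, are left alone.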

Hidden characters in Cytoscape names

I am attempting to upload a .cys file to a journal website as part of a submission. Although they accept all file types, when I try to upload the .cys file, the name exceeds the 64-character limit, apparently due to some hidden characters in the name. Is there any way to see the hidden characters in the filename and/or change them so that the filename is less than 64 characters?
You should be able to simply rename the .cys file. However, they might be attempting to unzip it automatically (a .cys file is actually just a zip archive of everything in a Cytoscape session), in which case they might be having problems with some of the filenames inside the .cys file, and those can't be renamed. You'll need to ask the journal about that.
-- scooter

Can you write to a file without knowing its complete file path?

In a program I am creating I want to write to a txt file. I know how to do this, however the method I use requires me to know the entire file path of the target file. Is there a way to do this without knowing the entire file path, or, if possible, write to a file located in the project resources?
There are several possible solutions:
Write the contents of the file to a MemoryStream, and when you know the path of the file, write the stream to the file.
Write the file contents to a temporary file, and when you know the path of the file, copy the temporary file to that path.
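
The two approaches above can be sketched as follows. This is an illustration in Python, not the .NET code the answer has in mind: io.StringIO stands in for MemoryStream, and the function names and the resolve_path callback (whatever eventually supplies the target path) are hypothetical.

```python
import io
import shutil
import tempfile
from pathlib import Path


def buffer_then_write(lines, resolve_path):
    """Collect the output in memory, then write it once the path is known."""
    buffer = io.StringIO()
    for line in lines:
        buffer.write(line + "\n")
    target = Path(resolve_path())  # the path only becomes known here
    target.write_text(buffer.getvalue(), encoding="utf-8")
    return target


def temp_then_copy(lines, resolve_path):
    """Write to a temporary file first, then copy it to the final path."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as tmp:
        for line in lines:
            tmp.write(line + "\n")
        tmp_path = Path(tmp.name)
    target = Path(resolve_path())  # the path only becomes known here
    shutil.copyfile(tmp_path, target)
    tmp_path.unlink()  # clean up the temporary file
    return target
```

The in-memory version is simpler for small outputs; the temporary-file version is the safer choice when the data may not fit comfortably in memory.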

Opening tsv format Eurostat data

I've been trying to open this data: http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fdemo_gind.tsv.gz. I've already unzipped it and get the tsv file, but when I opened it in gedit, it looks like a binary file. Could anybody help me to open this file?
The file is correctly formatted, even if it is not very readable for humans.
TSV is a file extension for a tab-delimited file used with spreadsheet
software. TSV stands for Tab Separated Values. TSV files are used for
raw data and can be imported into and exported from spreadsheet
software. TSV files are essentially text files, and the raw data can
be viewed by text editors, though they are often used when moving raw
data between spreadsheets.
You can import it into Excel or OpenOffice. Otherwise, you can convert it using an online service (for example, Google Sheets).
Once you've unarchived the original .gz file there are two more steps required to view the data, as noted on Eurostat's website.
TSV files may be imported into Excel by (1) Saving on hard disk with
the suffix .tsv and (2) unzipping and (3) saving the table(s) as Text
(*.txt).
As per user74158's comment, decompress/unzip the tsv file. This can
likely be done with many different programs; I used 7zip and it
worked for me. On Windows 7 I did this by right-clicking, hovering
over 7zip, selecting extract files, telling 7zip where I'd like to
extract the files to, and pressing OK.
Next, go to the file and change the .tsv file extension to .txt. Confirm that you want to change the file extension, and you should then be able to read the data.
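
If you would rather inspect the data with a script than rename files by hand, the downloaded .gz can be read directly. This is a minimal Python sketch that assumes the download is an ordinary gzip-compressed, UTF-8, tab-separated text file; the function name and the local path in the usage note are hypothetical.

```python
import csv
import gzip


def read_tsv_gz(path, max_rows=5):
    """Return the first `max_rows` rows of a gzip-compressed TSV file as lists of strings."""
    rows = []
    # "rt" decompresses and decodes on the fly, so no separate unzip step is needed.
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        for index, row in enumerate(reader):
            if index >= max_rows:
                break
            rows.append(row)
    return rows
```

For example, read_tsv_gz("demo_gind.tsv.gz") would return the header row and the first few data rows, which is a quick way to check that the file really is plain tab-separated text.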