Is there a way to save all hyperlinked documents in excel? - vba

I know this question has been answered before, but the problem I'm facing is a bit different, and this is why I'm asking for your help :)
So, I'm working with multiple excel files that contain multiple hyperlinks that lead to documents such as Excel files, PDFs, DOCs and sometimes even images. The problem with these hyperlinks is that they are not leading to a "normal" website, but to a special internal software, that in its turn links to a local address on my computer that contatins the desired file. That means that there is no direct link that could be grabbed with a simple VBA code.
Let's have an example, assuming the internal software name is "John":
I see in the Excel documents this link: John://3434545345/345345345
When I click on it, it opens the file, which is located, for example, in: C:/local/Cutekitten.pdf
After this long intro, my question is: Is there a way to automate the process of saving each document, instead of manually opening it and saving it? Could it be solved with a VBA code? Or does is require a different approach? I was actually thinking to bypass this problem by finding a way to open all hyperlinks at once with VBA, and then maybe find some code (not VBA?) that saves all open documents.
P.S Please keep in mind that I can't download EXE files or any other "suspicious" files due to workplace restrictions.
Any help will be much appreciated,
Thanks! :)

You may try this
'''
Input: test.xlsx
name link
1 location/file1.jpg
2 location/file2.xlsx
3 location/file3.pdf
4 location/file4.mp4
'''
from openpyxl import load_workbook
wb = load_workbook('test.xlsx')
print wb.get_sheet_names()
# ['Sheet 1', 'Sheet 2', 'Sheet 3']
ws1 = wb['Sheet 1']
## alternate -> worksheet2 = wb2.get_sheet_by_name('Sheet2')
import urllib
import os
## Result directory
directory = 'result'
if not os.path.exists(directory):
os.makedirs(directory)
for row in ws1:
if '.exe' in row[1].value:
continue
print row[0].value, '\t', row[1].value
urllib.urlretrieve (row[1].value, directory+'/'+row[1].value)
'''
Output:
Table 1 None
None None
name link
1 location/file1.jpg
2 location/file2.xlsx
3 location/file3.pdf
4 location/file4.mp4
'''

Sub PathsText()
Dim List(), Path As String
Dim i, x As Integer
Dim s As InlineShape
Dim fso As FileSystemObject, ts As TextStream
Set fso = New FileSystemObject
Set ts = fso.OpenTextFile("C:\MyFolder\List.txt", 8, True)
With ts
.WriteLine (ActiveDocument.InlineShapes.Count)
End With
For Each s In ActiveDocument.InlineShapes
Path = s.LinkFormat.SourcePath & "\" _
& s.LinkFormat.SourceName
With ts
.WriteLine (Path)
End With
Next s
End Sub
I guess the important part is if "Path = s.LinkFormat.SourcePath" works.
If so, if you have a ton of excel files to do it for, I'd put those (the ones with the links) in one folder. I'd make a new empty workbook with one button on a single sheet to call code. The button would "For Each workbook in directory, For Each sheet in workbook, For Each link on sheet" add source path to a text file.
Once you had a list of paths the next step would be obvious. I wish I had demo code, but I migrated to c# years ago. My monstrously slow cell-by-cell search, modify and graph inventions threatened to bury me.

Related

VBA - Loop through multiple subfolders on a network location with date search criteria/ Improve search speed

Purpose of my question and of the VBA code:
Get specific data (a couple columns) from each one of the "table.csv" file in a network directory. Each networkdirectory/subfolders01/subfolders02 contains one "table.csv" file but 100 other subfolders are included in each network/subfolders01. The other folders are not needed, the only one we are interested in is subfolder02 for each subfolder01. The number of subfolders01 in the network directory is about 15000. However I only need subfolders01 from Jan2020 to Apr2020,for example (200 subfolders).
Final purpose is to trend data.
Issue:
I am trying to understand how I could improve the VBA code that I am currently using.
This code goes through each subfolder one by one and then check the date and file name.
I am wondering if there is a way to add any search filters criteria for subfolder date and name to have a faster loop.
How can we avoid the code to go through each subfolders?
Please see below the code I am using,
I really appreciate your time and hope my request is clear.
'''
Function GetFiles(startPath As String) As Collection
Dim fso As Object, rv As New Collection, colFolders As New Collection, fpath As String
Dim subFolder As Object, f, dMinfold, dtMod
Set fso = CreateObject("Scripting.FileSystemObject")
dMinfold = ThisWorkbook.Sheets("Enter_Date").Cells(2, 1)
colFolders.Add startPath
Do While colFolders.Count > 0
fpath = colFolders(1)
colFolders.Remove 1
'process subfolders
For Each subFolder In fso.getfolder(fpath).subfolders
If subFolder.DateLastModified >= dMinfold Then
colFolders.Add subFolder.Path
End If
Next subFolder
'process files
f = Dir(fso.buildpath(fpath, "*Table.csv"), vbNormal)
Do While f <> ""
f = fso.buildpath(fpath, f)
dtMod = FileDateTime(f)
If dtMod >= dMinfold And Right(f, 3) = "csv" Then
rv.Add f
End If
f = Dir()
Loop
Loop
Set GetFiles = rv
End Function'''
Then I have my code to get transfer data from each file.
Thank you.
I'll put in screenshots to clear up the Get & Transform method, since it is the GUI approach rather than code.
It is possible to filter before loading contents, which will speed things up significantly.
I tried with a few thousand subfolders filtered down to 20, loads instantly.
Here's the initial screen for get data from folder
You can then filter on path. In your case it will be based on the date from the folder name.
Now that it's filtered you can expand the content using the header button.
Inside content, you'll have to expand again to convert from csv to excel table
Choose/rename columns as needed, then hit "close and load" to drop it into excel.
Default is to a new table, but you can "load to" if something more custom is needed.
Here's your output. You can right-click refresh or refresh from vba as needed.
Edit- Just noticed that I used .txt rather than .csv for the files. Might change how a step or two looks in the middle, but the general idea is the same.

Excel VBA to open Sharepoint folder and create list of hyperlinks for files within

Prior to being asked to migrate to SharePoint, I was using a collection of .xlsm files to set project teams up to manage projects. My Project Manager file included a macro that would go to a designated folder and create hyperlinks for all current project files. I've saved the collection of .xlsm files on SharePoint, but when I run the macro below (which I found here - Thank you!), I receive an error related to the "Set xFolder = xFSO.GetFolder(xPath)" line. Any help would be great. I've read several posting that may have the answer and tried several adjustments to the code, with no luck.
Sub Create_Hyperlinks_for_all_Current_Projects()
Range("B8:D38").Clear
MsgBox "Once you click OK, an explorer box will appear. Select the folder
containing all the CSTPs and then click OK again. HINT: The folder
containing all the CSTPs should be in the same folder this document was in
and should be called ''CSTPs''. Links to all CSTPs will then appear in the
white box on the Manager Menu."
Dim xFSO As Object
Dim xFolder As Object
Dim xFile As Object
Dim xFiDialog As FileDialog
Dim xPath As String
Dim I As Integer
Set xFiDialog = Application.FileDialog(msoFileDialogFolderPicker)
With xFiDialog
.InitialFileName = ThisWorkbook.Path
End With
If xFiDialog.Show = -1 Then
xPath = xFiDialog.SelectedItems(1)
End If
Set xFiDialog = Nothing
If xPath = "" Then Exit Sub
Set xFSO = CreateObject("Scripting.FileSystemObject")
Set xFolder = xFSO.GetFolder(xPath)
For Each xFile In xFolder.Files
I = I + 2
ActiveSheet.Hyperlinks.Add Cells(I + 6, 2), xFile.Path, , , xFile.Name
Next
End Sub
I hope you find the following notes helpful.
They summarize the outcomes from several weeks of frustration.
If you are using SP365,
then the filesystemobject
no longer works very well, if at all.
I had hundreds of macros dependent on the FSO.
These have all worked great up until my organization migrated to SP365 :o(
Now they only work if I manually click
the Open Explorer button, in SP, first.
Opening Explorer provides Win Explorer with the required permissions to access SP.
This in turn provides the FSO with the required permissions to access SP.
But...in my case...FSO only works for a while.
For my first work around...
I rolled out a prototype app, which used a macro to automate
opening IE, opening Win Explorer and initializing permissions for the FSO
All worked great on my machine, for about an hour,
then some kind of timeout took me back to square one.
Colleagues on other machines experienced a range of FSO behaviours.
From all working ok. To having to rerun the connection macro every 30 seconds.
To nothing working at all.
I then invested time trying to update macros to connect to SP as a network drive.
Again, this is a process that I have used for years, up until migration to SP365.
Again, success connecting to SP365 seems to be very machine dependent.
In the end...with respect to my own requirements...
My work around was to create an SP view that lists all files.
Then use the Export to Excel option in SP to create an Excel data query.
Then copy the query into an Excel file that I refer to as Config.xlsx.
Then use Excel RefreshAll to update the list of files, each time a list is required.
Its not very elegant...but, it works ;o)
As I say....I hope this helps you / someone.
Feel free to connect on LinkedIn if you require any followup advice.
Peter,
LinkedIn name DrPeterEHSmee

Macro (VBA) in Excel is failing on colleagues PC but works on mine when colleague is logged in

I'm developing VBA tools to automate a series of long winded administration tasks, the code runs fine in the following circumstances.
When I am logged into my PC
When my colleague is logged into my PC
When I am logged into my colleagues PC
However it fails to complete correctly, when my colleague runs it on her PC.
The specific area it is failing in is:
'creates 2 dims for location of the two files that need opening based on the critera set on the home page
Dim newdata As String
newdata = Range("f11").Value
Dim olddata As String
olddata = Range("f12").Value
Dim fileextension As String
fileextension = Range("f14").Value
Dim fulllocationolddata As String
fulllocationolddata = Range("f13") & olddata & fileextension
Dim fulllocationnewdata As String
fulllocationnewdata = Range("f13") & newdata & fileextension
'open file containing OLDDATA c&p previous days data to the conversion tool
'then shuts the old data workbook
Workbooks.Open Filename:=fulllocationolddata
Workbooks(olddata).Activate
Worksheets("sheet1").Select
Range("A1").CurrentRegion.Copy
Workbooks("Stockfile Conversion Tool.xlsm").Activate
Sheets("OLD STOCK").Activate
Range("A3").Select
Selection.PasteSpecial
Workbooks(olddata).Activate
Worksheets("sheet1").Select
Workbooks(olddata).Close SaveChanges:=False
The final line (Workbooks(olddata).Close SaveChanges:=False) does not shut the workbook, then later in the macro I open another workbook of the same name but as it's already open it just activates the window and the rest of the code falls apart.
If anyone has any ideas where I'm going wrong it would be appreciated.
Thanks in advance for your assistance
Plan303
Making my comments to an answer here:
Whether Workbooks("Name of the Workbook") is working or not depends on the settings in System control - Folder Options - View - [ x ] Hide extensions for known file types.
If this is set, then Excel's file extensions .xlsx, .xlsm, ... are not visible in Explorer or other file listings. Only the file names of the Excel files are visible. If so, then Workbooks("Name of the Workbook") will work.
If Hide extensions for known file types is not set, then Excel's file extensions .xlsx, .xlsm, ... are visible in Explorer or other file listings. If so, then Workbooks("Name of the Workbook") will not work. Then only Workbooks("Name of the Workbook.xlsx"), giving the name and the extension, will work.
But Workbooks("Name of the Workbook.xlsx") will also work if Hide extensions for known file types is set. So using the full name inclusive extension should be preferred.
So for the concrete question:
If olddata only contains the name of the workbook and not the file extension then Workbooks(olddata) will only work if Hide extensions for known file types is set in Folder Options. It will fail, if that option is not set and the file extensions are visible in Explorer. But Workbooks("Stockfile Conversion Tool.xlsm") will always work independent of whether Hide extensions for known file types is set or not. So Workbooks(olddata & fileextension) should also always work if olddata only contains the name of the workbook and fileextension contains .xlsx for example.

Print XPS from Google Search without dialog with file name listed in excel file

I have an Excel file with three column A, B, C; Serial #, File #, Name.
I want to search column A on Google, and print the first page result in XPS format with set directory and File name containing something like this:
"Serial # / File # / Name"
"1 / 1103 / Mock up Name A. xps"
From various search online, I managed to cut and paste the following code:
Const OLECMDID_PRINT = 6
Const OLECMDEXECOPT_DONTPROMPTUSER = 1
Const PRINT_WAITFORCOMPLETION = 2
Sub AutoSearch()
Dim N As Long, i As Long
N = Cells(Rows.Count, "C").End(xlUp).Row
Dim objIE
Set objIE = CreateObject("InternetExplorer.Application")
With ActiveWorkbook
For i = 2 To N
objIE.Navigate "http://www.Google.com/search?q=" & Cells(i, 1).Value
objIE.Visible = 1
Do While objIE.ReadyState <> 4
DoEvents
Loop
objIE.ExecWB OLECMDID_PRINT, OLECMDEXECOPT_DONTPROMPTUSER, 1
Next i
End With
End Sub
Above code allows me to grab column A name, search in Google, and print in XPS, but I still have no idea how to suppress the printing dialog, or manipulate it to save into the file name format and directory I want.
Being extreme unfamiliar with coding, all the solution for suppressing print dialog that I can find involving with something else other than VBA, but I have no idea how to translate that into vba.
(https://blogs.msdn.microsoft.com/fyuan/2007/02/24/printing-documents-to-microsoft-xps-document-writer-without-user-interaction/)
If you have any solution or better idea, please let me know.
Thank you so much
After "some" research and digging around, I found a walk around solution. Since I am not a programmer at all, I came across a freeware called, "http://www.weenysoft.com/free-html-to-pdf-converter.html", which is derived from "http://wkhtmltopdf.org/". For those super coder, the latter approach is better way, since you can tackle the solution more directly and efficiently; however, as for me, I had to stick with the first solution, which cannot give you time stamp on the pdf files it produces.
Thus, I need to use another walk around approach, "File_Name_Stamper.pdf (just google this)", to put the time stamp on my pdf files. To use on multiple files, you just need to use any Macro Recorder.
Again, totally not recommended for any programmers, above approaches are pretty sad, but does the job for the time being.
Cheers.

VBA list of filepaths of linked objects in document

I have a number of large Microsoft Word documents with many linked files from many Microsoft Excel spreadsheets. When opening a Word document, even with the 'update linked files at open' option unchecked:
Word still checks each link at its source by opening and closing the relevant excel spreadsheet for each individual link (so for x number of links, even if from the same spreadsheet, Word will open and close the spreadsheet x times). This means opening documents takes a very long time.
I have found that documents open faster if the spreadsheets containing the source of linked objects are already open, so Word doesn't keep opening, closing, reopening them.
So far, the beginnings of a solution I have is to create a list of all the filepaths of the linked objects, done by following VBA code:
Sub TypeArray()
Dim List(), Path As String
Dim i, x As Integer
Dim s As InlineShape
Dim fso As FileSystemObject, ts As TextStream
Set fso = New FileSystemObject
Set ts = fso.OpenTextFile("C:\MyFolder\List.txt", 8, True)
With ts
.WriteLine (ActiveDocument.InlineShapes.Count)
End With
For Each s In ActiveDocument.InlineShapes
Path = s.LinkFormat.SourcePath & "\" _
& s.LinkFormat.SourceName
With ts
.WriteLine (Path)
End With
Next s
End Sub
'--------------------------------------------------------------------------------------
Private Sub WriteStringToFile(pFileName As String, pString As String)
Dim intFileNum As Integer
intFileNum = FreeFile
Open pFileName For Append As intFileNum
Print #intFileNum, pString
Close intFileNum
End Sub
'--------------------------------------------------------------------------------------
Private Sub SendFileToNotePad(pFileName As String)
Dim lngReturn As Long
lngReturn = Shell("NOTEPAD.EXE " & pFileName, vbNormalFocus)
End Sub
which works well, but can only be used after a document is already open, which defeats its purpose.
So, finally, my question(s) are these:
1) Is there a way to run this code (or any better, more efficient code - suggestions are welcome) before opening a Word document and waiting through the long process of checking each link at its source?
2) Is there a way to avoid all this and simply have Word not check the links when it I open a document?
Sorry for the long question, and thank you for the help!
If I am not wrong there should be Document_Open event according to msdn. This should actually be a before open document and should be fired before updating links (at least it in excel it is fired before calculation).
Try opening the files on document open. Then you will face another problem, and so when to close the files, but that is a much easier thing to do. (probably document_close event...)
EDITTED:
As comments state, this is too late. You can create a word opener (as a single app or as an addin). The logic basically is:
'1) on something_open run GetOpenFileName dialog
'2) before opening the real thing, open all files accompanied
'3) open the document itself
'4) close all files
'5) close the opener itself
This is not the most trivial way, but I use this logic for exampe to make sure, that my applications always runs in a fresh copy of excel etc. But I understand that this is a workaround rather then a solution.
If you are still looking for something on this front, I created the following in a combination of VBA and VB.NET (in VS 2010) to show what can be done quite easily using that system. If VB.NET is no use to you, sorry, but there are reasons why I don't really want to spend time on the pure VBA approach.
At present, it is a "console" application which means you'll probably see a box flash up when it runs, but also means that you are more likely to be able to create this app without VS if you absolutely had to (AFAICR the VB.NET /compiler/ is actually free). It just fetches the link info. (i.e. there's currently no facility to modify links).
The overview is that you have a small piece of VBA (say, in your Normal template) and you need an open document. The VBA starts a Windows Shell, runs the VB.NET program and passes it the full path name of the document you want to open.
The VB.NET program opens the .docx (or whatever) and looks at all the Relationships of type "oleObject" that are referenced from the Main document part (so right now, the code ignores headers, footers, footnotes, endnotes and anywhere else you might have a link)
The VB.NET program automates Word (which we know is running) and writes each link URL into a sequence of Document Variables in the active document. These variables are called "Link1", "Link2", etc. If there are no links (I haven't actually tested that path properly) or the program can't find the file, "Link0" should be set to "0". Otherwise it should be set to the link count.
The shell executes synchronously, so your VBA resumes when it's done. Then you either have 0 links, or a set of links that you can process.
The VBA is like this:
Sub getLinkInfo()
' the full path name of the program, quoted if there are any spaces in it
' You would need to modify this
Const theProgram As String = """C:\VBNET\getmaindocumentolelinks.exe"""
' You will need a VBA reference to the "Windows Script Host Object Model"
Dim oShell As WshShell
Set oShell = CreateObject("WScript.Shell")
' plug your document name in here (again, notice the double quotes)
If oShell.Run(theProgram & " ""c:\a\testdocexplorer.docx""", , True) = 0 Then
With ActiveDocument.Variables
For i = 1 To CInt(.Item("Link0").Value)
Debug.Print .Item("Link" & CStr(i))
Next
End With
Else
MsgBox "Attempt to retrieve links failed"
End If
End Sub
For the VB.NET, you would need the Office Open XML SDK (I think it's version 2.5). You need to make references to that, and Microsoft.Office.Interop.Word.
The code is as follows:
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports System.IO
Imports System.Xml
Imports System.Xml.Linq
Imports DocumentFormat.OpenXml.Packaging
Imports Word = Microsoft.Office.Interop.Word
Module Module1
Const OLEOBJECT As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject"
Sub Main()
Dim s() As String = System.Environment.GetCommandLineArgs()
If UBound(s) > 0 Then
Dim wordApp As Word.Application
Try
wordApp = GetObject(, "Word.Application")
Dim targetDoc As Word.Document = wordApp.ActiveDocument
Try
Dim OOXMLDoc As WordprocessingDocument = WordprocessingDocument.Open(path:=s(1), isEditable:=False)
Dim linkUris As IEnumerable(Of System.Uri) = From rel In OOXMLDoc.MainDocumentPart.ExternalRelationships _
Where rel.RelationshipType = OLEOBJECT _
Select rel.Uri
For link As Integer = 0 To linkUris.Count - 1
targetDoc.Variables("Link" & CStr(link + 1)).Value = linkUris(link).ToString
Next
targetDoc.Variables("Link0").Value = CStr(linkUris.Count)
OOXMLDoc.Close()
Catch ex As Exception
targetDoc.Variables("Link0").Value = "0"
End Try
Finally
wordApp = Nothing
End Try
End If
End Sub
End Module
I originally wrote the .NET code as a COM object, which would be slightly easier to use from VBA, but significantly harder to set up on the .NET side and (frankly) much harder to modify & debug as you have constantly to close Word to release the references to the COM DLLs.
If you actually wanted to fix up the LINK paths, as far as I can tell, modifying them in the relationship records is enough to get Word to update the relevant LINK fields when it opens Word, which saves having to modify the XML code for the LINK fields as well. But that's another story...
I just found out that you can set/modify a DelayOleSrvParseDisplayName registry entry and a NoActivateOleLinkObjAtOpen registry entry to modify the global behaviour:
See http://support.microsoft.com/kb/970154
I also found that activedocument.fields can contain links to external objects (in my case, an Excel sheet).
Use this code to parse them:
for each f in activedocument.fields
debug.print f.code
next
And use activedocument.fields(FIELDNUMBER) to select each object, to figure out where it is in the document.
Maybe also activedocument.Variables and activedocument.Hyperlinks can contain links to external objects? (not in my case).