How to convert .docx and .pdf files to .txt - VB.NET

I am working on an application in which I need to convert .docx and .pdf files to .txt files with basic formatting. I searched the internet but couldn't find any free third-party DLLs. Can anyone suggest the best way to do this, and some DLLs I could reference?
Thanks in advance

http://support.microsoft.com/kb/316383 describes what you want to do with .docx files very well.
http://visualbasic.about.com/od/quicktips/qt/disppdf.htm describes the same, but with .pdf files.
Once you have read the files into your code, output to a .txt file using VB.NET's built-in file writing functions.

The code below will handle the job for you. It is something I wrote for the big boss haha. I hope it helps. It is an Excel VBA macro: it reads the first cell of the worksheet as the folder containing the .docx files, then converts them to .txt files one by one, saving each in the same folder.
Sub ConvertDocxToTxt()
    Const wdFormatText = 2
    Dim objWord As Object, objDoc As Object
    Dim strFolder As String, StrFile As String
    strFolder = Cells(1, "A").Value   ' folder path is read from cell A1
    If Not Len(strFolder) > 0 Or Dir(strFolder, vbDirectory) = "" Then
        MsgBox "Invalid Folder"
        Exit Sub
    End If
    Set objWord = CreateObject("Word.Application")   ' start Word once, not once per file
    StrFile = Dir(strFolder & "\*.docx")
    Do While Len(StrFile) > 0
        Set objDoc = objWord.Documents.Open(strFolder & "\" & StrFile, False, True)
        ' Swap the .docx extension for .txt instead of appending to it
        objDoc.SaveAs strFolder & "\" & Left(StrFile, InStrRev(StrFile, ".") - 1) & ".txt", wdFormatText
        objDoc.Close False
        StrFile = Dir
    Loop
    objWord.Quit
End Sub

Related

Open latest pdf file with vba

I have been looking for code on the internet and writing some myself to open the latest PDF file in a SharePoint folder. The files I am interested in are all named like "SD Progress_YYYYMMDD.pdf". So I tried looping through all the files in this folder, comparing the YYYYMMDD in each file name and keeping the highest value (classic max-value programming). Unfortunately I am quite new to VBA, and I believe I have a mistake with string or array dimensions in my code below, but I can't quite figure it out. The following error occurs at the first If statement:
Run-time error '13': Type mismatch
You guys are the experts, so if you have any advice for my code below I am very interested. Thank you
CODE BELOW HAS BEEN EDITED AND WORKS NOW. THANK YOU.
Sub Shop_Drawing_Status()
    Dim MyPath As String
    Dim MyFile As String
    Dim BaseName As String
    Dim LatestDate As String   ' YYYYMMDD, kept as text so leading zeros survive
    MyPath = "C:\Users\Documents...etc\"
    MyFile = Dir(MyPath & "SD Progress_*.pdf", vbNormal)
    While Len(MyFile) > 0
        BaseName = Left(MyFile, InStrRev(MyFile, ".") - 1)
        ' Fixed-width YYYYMMDD strings sort chronologically as text
        If Right(BaseName, 8) > LatestDate Then
            LatestDate = Right(BaseName, 8)
        End If
        MyFile = Dir()
    Wend
    If Len(LatestDate) > 0 Then
        ActiveWorkbook.FollowHyperlink MyPath & "SD Progress_" & LatestDate & ".pdf"
    End If
End Sub

Convert .txt file to .xlsx & remove unneeded rows & format columns correctly

I've got a folder which contains .txt files (they contain PHI, so I can't upload the .txt file, or an example without PHI, or even any images of it). I need an excel macro, which will allow the user to choose the folder containing the file, and will then insert the .txt file data into a new excel workbook, format the rows and columns appropriately, and finally save the file to the same folder that the source was found in.
So far I've got all of that working except for the formatting of rows and columns. As of now, the .txt data is inserted to a new workbook & worksheet, but I can't seem to figure out how to get rid of rows I don't need, or how to get the columns formatted appropriately.
Again, I can't upload the .txt file (or anything) because the Healthcare organization I work for blocks it - even if I've removed all PHI.
Below is the macro I've created so far:
Private Sub CommandButton2_Click()
On Error GoTo err
'Allow the user to choose the FOLDER where the TEXT file(s) are located
'The resulting EXCEL file will be saved in the same location
Dim FldrPath As String
Dim fldr As FileDialog
Dim fldrChosen As Integer
Set fldr = Application.FileDialog(msoFileDialogFolderPicker)
With fldr
.Title = "Select a Folder containing the Text File(s)"
.AllowMultiSelect = False
.InitialFileName = "\\FILELOCATION"
fldrChosen = .Show
If fldrChosen <> -1 Then
MsgBox "You Chose to Cancel"
Else
FldrPath = .SelectedItems(1)
End If
End With
If FldrPath <> "" Then
'Make a new workbook
Dim newWorkbook As Workbook
Set newWorkbook = Workbooks.Add
'Make worksheet1 of new workbook active
newWorkbook.Worksheets(1).Activate
'Completed files are saved in the chosen source file folder
Dim CurrentFile As String: CurrentFile = Dir(FldrPath & "\" & "*.txt")
Dim strLine() As String
Dim LineIndex As Long
Application.ScreenUpdating = False
Application.DisplayAlerts = False
While CurrentFile <> vbNullString
'Reset the line counter for each file
LineIndex = 0
Close #1
Open FldrPath & "\" & CurrentFile For Input As #1
While Not EOF(1)
'Count the line and grow the array to hold it
LineIndex = LineIndex + 1
ReDim Preserve strLine(1 To LineIndex)
Line Input #1, strLine(LineIndex)
Wend
Close #1
With ActiveSheet.Range("A1").Resize(LineIndex, 1)
.Value = WorksheetFunction.Transpose(strLine)
.TextToColumns Other:=True, OtherChar:="|"
End With
ActiveSheet.UsedRange.EntireColumn.AutoFit
ActiveSheet.Name = Replace(CurrentFile, ".txt", "")
ActiveWorkbook.SaveAs FldrPath & "\" & Replace(CurrentFile, ".txt", ".xls"), xlNormal
ActiveWorkbook.Close
CurrentFile = Dir
Wend
Application.DisplayAlerts = True
Application.ScreenUpdating = True
End If
Done:
Exit Sub
err:
MsgBox "The following ERROR Occurred:" & vbNewLine & err.Description
ActiveWorkbook.Close
End Sub
Any ideas of how I can delete entire lines from being brought into excel?
And how I can format the columns appropriately? So that I'm not getting 3 columns from the .txt file all jammed into 1 column in the resulting excel file?
Thanks
I'd recommend not reinventing the wheel. Microsoft provides an excellent add-on for this task: Power Query.
It lets you load every file in a folder and process them in bulk.
Here is a brief introduction to what it can do for you.
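If you would rather stay in VBA, the two problems in the question (dropping unwanted lines and splitting fields into columns) can be sketched as below. This is only an illustration under assumptions: KeepLine and WriteDelimitedLines are hypothetical helper names, the "|" delimiter is taken from the TextToColumns call in the question, and the rule "keep only non-blank lines that contain the delimiter" is a placeholder for whatever rule actually identifies the rows you need.

```vba
' Hypothetical rule: keep only non-blank lines that contain the delimiter.
Function KeepLine(ByVal txt As String, ByVal delim As String) As Boolean
    KeepLine = (Len(Trim$(txt)) > 0) And (InStr(1, txt, delim) > 0)
End Function

' Writes the kept lines to ws starting at A1, one field per column.
Sub WriteDelimitedLines(ByVal ws As Worksheet, lines() As String, ByVal delim As String)
    Dim i As Long, r As Long, fields() As String
    For i = LBound(lines) To UBound(lines)
        If KeepLine(lines(i), delim) Then
            r = r + 1
            fields = Split(lines(i), delim)
            ' A one-dimensional array fills a single row, left to right
            ws.Cells(r, 1).Resize(1, UBound(fields) + 1).Value = fields
        End If
    Next i
End Sub
```

In the macro from the question, a call such as WriteDelimitedLines ActiveSheet, strLine, "|" could stand in for the Transpose and TextToColumns block. Excel may still coerce fields that look like numbers or dates when they are written to cells.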

How to change file extensions in VBA

I feel like this must be simple, but I can't find the answer. I'm saving a bunch of csv files using vba and would like to change all the file extensions from .csv to .txt to import into another program (Revit) which only recognizes the .txt extension. Is this possible? Here is the command I'm using.
For I = 1 To WS_Count
path = CurDir() + "\" + ActiveWorkbook.Worksheets(I).Name
Sheets(ActiveWorkbook.Worksheets(I).Name).Select
ActiveWorkbook.SaveAs Filename:=path, FileFormat:=xlCSV, CreateBackup:=False
Name path As ("path" + ".txt")
Next I
Thanks!
You don't even need to open the files to rename them.
Sub M_snb()
    Name "G:\OF\example.csv" As "G:\OF\example.txt"
End Sub
You should change
FileFormat:=xlCSV
to
FileFormat:=xlTextWindows
(note that xlTextWindows writes tab-delimited text rather than comma-separated values).
See
https://msdn.microsoft.com/en-us/library/office/ff198017.aspx
or
The xlFileFormat enumeration (Excel) on MSDN
OK, got it. You can just add .txt to the file name, even though the file is still saved in CSV format.
WS_Count = ActiveWorkbook.Worksheets.Count
For I = 1 To WS_Count
path = CurDir() + "\" + ActiveWorkbook.Worksheets(I).Name + ".txt"
Sheets(ActiveWorkbook.Worksheets(I).Name).Select
ActiveWorkbook.SaveAs Filename:=path, FileFormat:=xlCSV, CreateBackup:=False
Debug.Print path
Next I
Try this. It switches to the specified directory on your system and renames every .csv file in it to .txt.
Sub changeExt()
strDir = "C:\Users\user\Desktop\xyz" 'Your file directory
With CreateObject("wscript.shell")
.currentdirectory = strDir
.Run "%comspec% /c ren *.csv *.txt", 0, True
End With
End Sub
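The same batch rename can also be sketched with VBA's own Name statement, without shelling out to the command prompt. The helper names ReplaceExtension and RenameCsvToTxt are made up for this illustration, and the directory is the same example path used above.

```vba
' Returns filePath with its extension swapped for newExt (given without a dot).
Function ReplaceExtension(ByVal filePath As String, ByVal newExt As String) As String
    Dim dotPos As Long, slashPos As Long
    dotPos = InStrRev(filePath, ".")
    slashPos = InStrRev(filePath, "\")
    If dotPos > slashPos Then
        ReplaceExtension = Left$(filePath, dotPos) & newExt
    Else
        ' No extension in the file name itself: just append one
        ReplaceExtension = filePath & "." & newExt
    End If
End Function

Sub RenameCsvToTxt()
    Dim strDir As String, f As String
    Dim pending As New Collection, v As Variant
    strDir = "C:\Users\user\Desktop\xyz\"   ' same example directory as above
    ' Collect the names first so renaming does not disturb the Dir enumeration
    f = Dir(strDir & "*.csv")
    Do While f <> ""
        pending.Add f
        f = Dir()
    Loop
    For Each v In pending
        Name strDir & v As ReplaceExtension(strDir & v, "txt")
    Next v
End Sub
```

Name raises a run-time error if the target file already exists, so stale .txt files from a previous run would need to be removed first.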

How to open file with format date and time in excel vba

I want to open a file named like TFM_20150224_084502 and copy a sheet from it. The file has a different date and time in its name each day. My code handles the date part of the name, but I can't work out how to handle the time part.
What code do I need to add?
Sub OpenCopy ()
Dim directory As String, fileName As String, sheet As Worksheet
Application.ScreenUpdating = False
Application.DisplayAlerts = False
directory = "z:\FY1415\FI\Weekly Report\Astry"
fileName = "TFM_" & Format(Date, "yyyymmdd") & ".xls"
Workbooks.Open "z:\FY1415\FI\Weekly Report\Astry\" & "TFM_" & Format(Date, "yyyymmdd") & ".xls"
Sheets("MSP").Copy After:=Workbooks("Generate Report 2.xlsm").Sheets("PlanOEE")
ActiveSheet.Name = "MSP"
End sub
It seems that some linebreaks have disappeared when you posted the code into your post, but assuming you are aware of this, I assume that the main problem you have is figuring out the name of the file you want to open?
The VBA Dir-function lets you search for a file in a folder, and lets you include wildcards in your search. I've included this function in your sub, and have tested it with a similarly named file on my computer (albeit without the copying of the sheet), and it opened the sheet:
Sub OpenCopy()
Dim directory As String, fileName As String, sheet As Worksheet
Application.ScreenUpdating = False
Application.DisplayAlerts = False
directory = "z:\FY1415\FI\Weekly Report\Astry\"
fileName = Dir(directory & "TFM_" & Format(Date, "yyyymmdd") & "*.xls*")
If fileName <> "" Then
With Workbooks.Open(directory & fileName)
.Sheets("MSP").Copy After:=Workbooks("Generate Report 2.xlsm").Sheets("PlanOEE")
End With
ActiveSheet.Name = "MSP"
End If
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub
The line relevant for finding the filename is, as you probably see:
fileName = Dir(directory & "TFM_" & Format(Date, "yyyymmdd") & "*.xls*")
I have simply used Dir to search for a file matching the string inside the parentheses, where the asterisks are wildcards. The reason I have included an asterisk after xls too is that the file may have an extension such as xlsx or xlsm in newer versions of Office. I've also added a backslash at the end of the directory string, since you'll have to include it before the filename anyway.
I have also added an if-clause around what you do with the workbook you open, in case no file fitting the search is found.
Note that this sub will only do what you want provided that only one file is generated for each date. If you want to loop through all files that include a given date, I would recommend having a look at this post here on SO, which explains how to loop through all files in a folder. Modifying the macros presented there to fit your needs should be fairly trivial.
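If several files can exist for the same date, one possible approach is to keep calling Dir and retain the last name in sort order. The sketch below assumes every file follows the fixed-width TFM_yyyymmdd_hhmmss layout from the question, so that comparing names as text also orders them by time. PatternForDate and LatestFileForDate are hypothetical helper names.

```vba
' Builds the Dir search pattern for one day's files, e.g. "TFM_20150224*.xls*".
Function PatternForDate(ByVal d As Date) As String
    PatternForDate = "TFM_" & Format(d, "yyyymmdd") & "*.xls*"
End Function

' Returns the name of the last matching file of the day, or "" if none.
Function LatestFileForDate(ByVal directory As String, ByVal d As Date) As String
    Dim fileName As String, latest As String
    fileName = Dir(directory & PatternForDate(d))
    Do While fileName <> ""
        ' Names share the fixed-width TFM_yyyymmdd_hhmmss layout, so a plain
        ' text comparison also orders them by time.
        If StrComp(fileName, latest, vbBinaryCompare) > 0 Then latest = fileName
        fileName = Dir()
    Loop
    LatestFileForDate = latest
End Function
```

In OpenCopy, fileName = LatestFileForDate(directory, Date) could then take the place of the single Dir call, with the rest of the sub unchanged.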

Word VBA code for saving forms

I have Word survey files, each containing forms filled by subjects. Until now I have manually exported the forms data by saving as txt and choosing the option "save form data as delimited text file".
I want to programmatically save as delimited text file all the .doc documents in a given directory. Alternatively, if this were to be too complicated, it would be sufficient to save one file at a time. The new txt files must have the same name as the original .doc files.
Thanks for your input, Jan Schejbal. I've reached a solution with this piece of code, so I'm sharing it for those who encounter the same problem. I received help from here
Sub Save_Forms_Data()
Application.ScreenUpdating = False
Dim strFolder As String, strFile As String, wdDoc As Document, strDocName As String
strFolder = CurDir
If strFolder = "" Then Exit Sub
strFile = Dir(strFolder & "\*.doc", vbNormal)
While strFile <> ""
Set wdDoc = Documents.Open(FileName:=strFolder & "\" & strFile, AddToRecentFiles:=False, Visible:=False)
With wdDoc
strDocName = Left(.FullName, InStrRev(.FullName, ".")) & "txt"
.SaveAs2 FileName:=strDocName, FileFormat:=wdFormatText, AddToRecentFiles:=False, _
SaveFormsData:=True, Encoding:=1252, InsertLineBreaks:=False, LineEnding:=wdCRLF
.Close SaveChanges:=False
End With
strFile = Dir()
Wend
Set wdDoc = Nothing
Application.ScreenUpdating = True
Application.Quit SaveChanges:=wdDoNotSaveChanges
End Sub
You can record a macro, which means you start the recording, do certain actions, then stop the recording, and VBA code for said actions is automatically generated. The code may not be very clean, but it should give you a good start to show you how the syntax looks and what commands you need for your actions. For certain things (e.g. dynamically specifying the file name), you will need to consult the documentation, but if you have any programming experience in any common language, this should not pose a significant problem once you have the "skeleton" provided by the macro recorder.
The more you want to automate, the more VBA you will need to learn. As VBA really isn't difficult, and it seems like you have a lot of repetitive work in front of you if you don't automate it, I'd suggest you learn it and Google what you need. This way, you will get your work done in a similar timeframe (or less, especially if this is not just a one-off thing), you will have a macro to do it next time, it will be less boring, and you will have learned a bit of VBA.