Excel vba macro to connect to a particular webpage - search and retrieve data - vba

I have a column, say column A, containing 1500 rows, each holding a hexadecimal-encoded string. What I need is to connect to a particular website, paste the string into its input box, press Decode, copy the result, and paste it back into column B.
Any help would be appreciated. I am new here.
Example:
String in Column A: 5468616e6b732061206c6f7420696e20616476616e6365
Website to search in: http://encodertool.com/hexadecimal
Copy from excel cell and paste in tab (under heading): ENTER AN Hexadecimal CONTENT TO DECODE
Then hit DECODE
Then Copy from DECODING RESULT
Finally paste back in ColumnB in my excel sheet.
Looking forward to an answer.
Thanks a million in advance.

Are you doing this as an exercise in automating the browser? It seems you could do it more easily directly in VBA.
From: http://bytes.com/topic/access/answers/874752-convert-hex-string
Sub tester()
Debug.Print fConvertHexToString( _
"5468616e6b732061206c6f7420696e20616476616e6365")
End Sub
Public Function fConvertHexToString(strHexString As String) As String
Dim intLenOfString As Integer
Dim intCounter As Integer
Dim strBuild As String
'Hex String must have a valid length, and it must be an even length
If Len(strHexString) = 0 Or Len(strHexString) Mod 2 <> 0 Then Exit Function
intLenOfString = Len(strHexString)
For intCounter = 1 To Len(strHexString)
If intCounter Mod 2 <> 0 Then 'need Hex pairs
'Retrieve the Value of the Hex Pair, then Convert to a Character,
'then Append to a Base String
strBuild = strBuild & Chr$(Val("&H" & Mid$(strHexString, intCounter, 2)))
End If
Next
fConvertHexToString = strBuild
End Function
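Applied to the question, the function above could be run over column A directly, with no browser involved. This is only a sketch: the sub name DecodeColumnA, the sheet name Sheet1, and the assumption that the data starts in A1 are illustrative and not taken from the question.

```vba
' Sketch: decode every hex string in column A into column B.
' Assumes fConvertHexToString (above) is in the same VBA project.
Public Sub DecodeColumnA()
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1") ' assumed sheet name
    ' Last non-empty row in column A
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        ' Skip blank cells; CStr guards against non-text cell values
        If Len(ws.Cells(r, "A").Value) > 0 Then
            ws.Cells(r, "B").Value = fConvertHexToString(CStr(ws.Cells(r, "A").Value))
        End If
    Next r
End Sub
```

Rows whose string has an odd length come back empty, because fConvertHexToString exits early for them.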

Something like this should do it; I have just run a mock test and it works. Give it a try. This is plain code that you can modify or enhance to suit your needs, but it does what you ask for.
Dim ie As InternetExplorer
Dim doc As HTMLDocument
Sub start()
Dim ran As Range
Dim cel As Excel.Range
Set ran = Worksheets("Sheet1").Range("A1:A4") 'Change Your input range here
For Each cel In ran
If cel.Value <> Empty Then
Set ie = New InternetExplorerMedium 'Open IE
ie.navigate ("http://encodertool.com/hexadecimal") 'Navigate to the decoder page
ie.Visible = True
'Wait until IE has finished loading; DoEvents keeps Excel responsive
Do
DoEvents
Loop Until ie.readyState = READYSTATE_COMPLETE
Set doc = ie.document
doc.getElementById("input_4").innerText = cel.Value ' Enter input value
test ' Click button
cel.Offset(0, 1).Value = doc.getElementById("output_4").innerText ' Save output value
ie.Quit 'Close IE only when an instance was opened for this cell
End If
Next cel
End Sub
'Click the Decode button
Sub test()
Dim cl_button As Object, one As Object
Set cl_button = doc.getElementsByTagName("a")
For Each one In cl_button
If one.getAttribute("onclick") = "ajaxfct('fcts.php','4')" Then
one.Click
Exit For
End If
Next one
End Sub
Before running the code, add references to the Microsoft HTML Object Library and Microsoft Internet Controls. Also change the input range; I have set it to A1:A4, so change it to whatever you need. Make sure there are no blank cells in the range. Also, if you don't want the browser to be displayed, set
ie.Visible = False
This is one way of doing it. There are many simpler and more effective ways of doing it.


Extract list of all input boxes on webpage vba

I want to create a list on Excel of all the labels of input boxes on a webpage- so I imagine the code would be something like:
Sub IEInteract()
Dim i As Long
Dim URL As String
Dim IE As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
URL = "mywebsite.com"
IE.Navigate URL
Do While IE.ReadyState = 4: DoEvents: Loop
Do Until IE.ReadyState = 4: DoEvents: Loop
objCollection = IE.Document.getElementsByTagName("input")
For Each el In objCollection
label = el.label 'or something like that????'
Debug.Print label
Next el
End Sub
Where am I going wrong? Thanks
BTW My VBA is OK, but my HTML is non-existent.
For learning purposes maybe choose a website that has more obvious inputboxes, rather than dropdowns.
Many inputboxes won't be pre-populated so maybe consider reading other properties of the retrieved elements. Or even writing to them and then retrieving those values.
Selecting by tag name can bring back a host of items that you might not have expected.
Bearing all of the above in mind, try running the following, which generates a collection of <input> tag elements.
Code:
Option Explicit
Public Sub PrintTagInfo()
'Tools > references > Microsoft XML and HTML Object library
Dim http As New XMLHTTP60 '<== this will be specific to your excel version
Dim html As New HTMLDocument
With http
.Open "GET", "https://www.mrexcel.com/forum/register.php", False
.send
html.body.innerHTML = .responseText
End With
Dim inputBoxes As MSHTML.IHTMLElementCollection, iBox As MSHTML.IHTMLElement, i As Long
Set inputBoxes = html.getElementsByTagName("input") '<== the collection of input tags on the page
'<== These are input boxes i.e. you are putting info into them so perhaps populate and then try to read what is in the entry box?
For Each iBox In inputBoxes
Debug.Print "Result #" & i + 1
Debug.Print vbNewLine
Debug.Print "ID: " & iBox.ID '<== select a sample of properties to print out as some maybe empty
Debug.Print "ClassName: " & iBox.className,
Debug.Print "Title: " & iBox.Title
Debug.Print String$(20, Chr$(61))
Debug.Print vbNewLine
i = i + 1
Next iBox
End Sub
From the above, it looks like class name might be in some ways more informative if you are looking to target boxes to input information into.
An initial inspection of the page source, selecting an inputbox and right-click > inspect... will help you refine your choices.
I noticed that a lot of the boxes of interest had the input tag with type="text".
This means you can target elements matching this pattern using CSS selectors, in this case the selector input[type="text"] (the quotes are doubled when the selector is written inside a VBA string literal).
Adjusting the former code to factor this in gives a smaller set of more targeted results. Note that .querySelectorAll, which applies the CSS selector, returns a NodeList object, and this has to be iterated over by index. A For Each loop over it can cause Excel to crash.
Code:
Option Explicit
Public Sub PrintTagInfo()
'Tools > references > Microsoft XML and HTML Object library
Dim http As New XMLHTTP60 '<== this will be specific to your excel version
Dim html As New HTMLDocument
With http
.Open "GET", "https://www.mrexcel.com/forum/register.php", False
.send
html.body.innerHTML = .responseText
End With
Dim inputBoxes As Object, i As Long
Set inputBoxes = html.querySelectorAll("input[type=""text""]") '<== the collection of text input boxes on page. Returned as a NodeList
'<== These are input boxes i.e. you are putting info into them so perhaps populate and then try to read what is in the entry box?
For i = 0 To inputBoxes.Length - 1
Debug.Print "Result #" & i + 1
Debug.Print vbNewLine
Debug.Print "ID: " & inputBoxes.Item(i).ID '<== select a sample of properties to print out as some maybe empty
Debug.Print "ClassName: " & inputBoxes.Item(i).className,
Debug.Print "Title: " & inputBoxes.Item(i).Title
Debug.Print String$(20, Chr$(61))
Debug.Print vbNewLine
Next i
End Sub
The references are added via VBE > Tools > References.
The two of interest are the Microsoft HTML Object Library and Microsoft XML. The XML one is version specific: XMLHTTP60 is for XML 6.0, so you will need to rewrite it to match your version if you are not using Excel 2016.
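For the labels the question asks about: where a page marks them up with <label> elements, the label text and the id of the input it belongs to (the for attribute, exposed as htmlFor) can be read in the same way. A minimal sketch, assuming the Microsoft HTML Object Library reference is set; the sub name PrintLabels and the HTML string are made-up stand-ins for a real page, which may not use <label> elements at all.

```vba
Option Explicit

' Sketch: list each <label> with the id of the input it points to.
Public Sub PrintLabels()
    Dim html As New HTMLDocument
    Dim labels As Object
    Dim i As Long

    ' Made-up markup standing in for a downloaded page
    html.body.innerHTML = "<label for=""user"">Username</label>" & _
                          "<input id=""user"" type=""text"">"

    Set labels = html.getElementsByTagName("label")
    For i = 0 To labels.Length - 1
        ' htmlFor holds the value of the label's "for" attribute
        Debug.Print labels.Item(i).htmlFor & " -> " & labels.Item(i).innerText
    Next i
End Sub
```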

Collect images url and data from webpage table to Excel table

I need to collect data from a table on a webpage; some of the table cells contain images.
The code should copy the data to Excel and, where a cell contains an image, write the image's src link instead of the image. The code is below, but it is not working: I don't know how to detect whether a cell has an image in it and add its src link to the Excel cell.
Sub extractData()
Dim IE As Object, obj As Object
Dim myYear As String
Dim r As Integer, c As Integer, t As Integer
Dim elemCollection As Object
Set IE = CreateObject("InternetExplorer.Application")
myYear = InputBox("Enter year")
With IE
.Visible = True
.navigate ("url")
While IE.ReadyState <> 4
DoEvents
Wend
For Each obj In IE.Document.All.Item("Year").Options
If obj.innerText = myYear Then
obj.Selected = True
End If
Next obj
IE.Document.getElementsByName("btn_search").Item.Click
Do While IE.busy: DoEvents: Loop
ThisWorkbook.Sheets("Sheet1").Range("A1:K2000").ClearContents
Set elemCollection = IE.Document.getElementsByTagName("TABLE")
For t = 0 To (elemCollection.Length - 9)
For r = 0 To (elemCollection(t).Rows.Length - 1)
For c = 0 To (elemCollection(t).Rows(r).Cells.Length - 1)
If elemCollection(t).Rows(r).Cells(c).innerText = "" Then
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).getAttribute("src")
Exit For
End If
Next
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).innerText
Next c
Next r
Next t
End With
Set IE = Nothing
End Sub
First, brush up on HTML Document Object Model. There are tons of tutorials on how to use JavaScript to work with the DOM, and VBA is real similar (because the DOM doesn't change based on language and VBA is very closely related to VBScript which is similar to JavaScript). Second, if you get an error but no line is highlighted when you click the Debug button, step through your code line by line with the F8 key. This will let you figure out where the error is occurring. Third, add a reference to the Microsoft HTML Object Library so you can use Intellisense for code hints.
It's tough to give an actual solution without seeing the HTML source so instead I'll give some pointers:
Use IE.Document.Body.getElementsByTagName("TABLE") (note the addition of BODY to narrow the scope) to get a collection of every table on the page. With a reference to the HTML Objects Lib you can do something like this:
Dim oTable As HTMLTable
Dim oCell As HTMLTableCell
Dim oImg As HTMLImg
Dim strSrc As String
'Example only: work on the first table found on the page
Set oTable = IE.Document.Body.getElementsByTagName("TABLE")(0)
For Each oCell In oTable.Cells
strSrc = ""
Set oImg = Nothing 'forget the image from the previous cell
On Error Resume Next
Set oImg = oCell.getElementsByTagName("img")(0) 'first img in the cell, if any
strSrc = oImg.src
On Error GoTo 0
If strSrc <> "" Then Debug.Print strSrc
Next
This should (I did not test it) loop through every cell in a table and attempt to get an img element. If it fails, no biggie, just continue to the next cell. If you want to use late binding after you get it working, remove the HTML Obj Lib reference then simply dim everything as an object. E.g.:
Dim oTable As Object 'HTMLTable
Dim oCell As Object 'HTMLTableCell
Dim oImg As Object 'HTMLImg
Dim strSrc As String
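The same idea can be packaged without error trapping by checking how many img elements a cell contains. The helper below is a late-bound sketch, not tested against the page in the question; the name CellTextOrImageSrc is hypothetical.

```vba
' Sketch: return the src of the first image in a table cell,
' or the cell's text when it contains no image.
Public Function CellTextOrImageSrc(ByVal tableCell As Object) As String
    Dim imgs As Object

    Set imgs = tableCell.getElementsByTagName("img")
    If imgs.Length > 0 Then
        CellTextOrImageSrc = imgs.Item(0).src
    Else
        CellTextOrImageSrc = tableCell.innerText
    End If
End Function
```

Inside the question's nested loops, the two assignments could then collapse into a single line such as ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1).Value = CellTextOrImageSrc(elemCollection(t).Rows(r).Cells(c)).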

VBA skipping code directly after submitting form in IE

Currently I have 2 pieces of code that work separately, but when used together they don't work properly.
The first code asks the user to input information which is stored. It then navigates to the correct webpage where it uses the stored user input information to navigate via filling and submitting a form. It arrives at the correct place.
The second code uses a specific URL via ie.navigate "insert url here" to navigate to the same place as the first code. It then scrapes URL data and stores it in a newly created sheet. It does this correctly.
When merging them I replace the navigation segment from the second code with the first code, but then it only stores the first 5 of 60 URLs as if it hadn't fully loaded the page before scraping data. It seems to skip the code directly after ie.document.forms(0).submit which is supposed to wait for the page to load before moving on to the scraping..
extra info: the button wasn't defined so I cannot just click it so I had to use ie.document.forms(0).submit
Summary of what I want the code to do:
request user input
store user input
open ie
navigate to page
enter user input into search field
select correct search category from listbox
submit form
'problem happens here
scrape url data
store url data in specific excel worksheet
The merged code:
Sub extractTablesData()
Dim ie As Object, obj As Object
Dim Var_input As String
Dim elemCollection As Object
Dim html As HTMLDocument
Dim Link As Object
Dim erow As Long
' create new sheet to store info
Application.DisplayAlerts = False
ThisWorkbook.Sheets("HL").Delete
ThisWorkbook.Sheets.Add.Name = "HL"
Application.DisplayAlerts = True
Set ie = CreateObject("InternetExplorer.Application")
Var_input = InputBox("Enter info")
With ie
.Visible = True
.navigate ("URL to the webpage")
While ie.readyState <> 4
DoEvents
Wend
'Input Term 1 into input box
ie.document.getElementById("trm1").Value = Var_input
'accessing the Field 1 ListBox
For Each obj In ie.document.all.Item("FIELD1").Options
If obj.Value = "value in listbox" Then
obj.Selected = True
End If
Next obj
' button undefined - using this to submit form
ie.document.forms(0).submit
'----------------------------------------------------------------
'seems to skip this part all together when merged
'Wait until IE is done loading page
Do While ie.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to website…"
DoEvents
Loop
'----------------------------------------------------------------
Set html = ie.document
Set ElementCol = html.getElementsByTagName("a")
For Each Link In ElementCol
erow = Worksheets("HL").Cells(Rows.Count, 1).End(xlUp).Offset(1, 0).Row
Cells(erow, 1).Value = Link
Cells(erow, 1).Columns.AutoFit
Next
Application.StatusBar = ""
Application.ScreenUpdating = True
End With
End Sub
I've been stuck for quite some time on this and haven't found any solutions on my own so I'm reaching out. Any help will be greatly appreciated!
You mentioned you think the website might not be fully loaded. This is a common problem because of the more dynamic elements on a webpage. The easiest way to handle this is to insert the line:
Application.Wait Now + Timevalue("00:00:02")
This pauses the code for an additional 2 seconds. Insert the line below the loop that waits for the page to load, which gives Internet Explorer a chance to catch up. Depending on the website and the reliability of your connection to it, I recommend adjusting this value anywhere up to about 5 seconds.
Most websites seem to require additional waiting like this, so it is handy code to remember when things don't work as expected. Hope this helps.
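A fixed pause can be combined with a wait loop that gives up after a while instead of spinning forever. The helper below is a sketch under assumptions: the name WaitUntilLoaded and the 30-second default are made up, and it checks Busy as well as readyState because, right after a form submit, readyState may still describe the previous page.

```vba
' Sketch: wait until the browser reports the page as loaded.
' Returns True on success, False if timeoutSeconds elapse first.
Public Function WaitUntilLoaded(ByVal browser As Object, _
                                Optional ByVal timeoutSeconds As Long = 30) As Boolean
    Dim deadline As Date

    deadline = DateAdd("s", timeoutSeconds, Now)
    Do While browser.Busy Or browser.readyState <> 4 ' 4 = READYSTATE_COMPLETE
        If Now > deadline Then Exit Function ' give up; result stays False
        DoEvents ' keep Excel responsive while waiting
    Loop
    WaitUntilLoaded = True
End Function

' Possible use after the submit in the question's code:
'   ie.document.forms(0).submit
'   Application.Wait Now + TimeValue("00:00:02")
'   If Not WaitUntilLoaded(ie) Then MsgBox "Page did not finish loading"
```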
I solved this by using a completely different method. I used a query table with strings to go where I wanted.
Sub ExtractTableData()
Dim This_input As String
Const prefix As String = "Beginning of url"
Const postfix As String = "end of url"
Dim qt As QueryTable
Dim ws As Worksheet
Application.DisplayAlerts = False
ThisWorkbook.Sheets("HL").Delete
ThisWorkbook.Sheets.Add.Name = "HL"
Application.DisplayAlerts = True
This_input = InputBox("enter key info to go to specific url")
Set ws = ActiveSheet
Set qt = ws.QueryTables.Add( _
Connection:="URL;" & prefix & This_input & postfix, _
Destination:=Worksheets("HL").Range("A1"))
qt.RefreshOnFileOpen = True
qt.WebSelectionType = xlSpecifiedTables
'qt.webtables is key to getting the specific table on the page
qt.WebTables = 2
qt.Refresh BackgroundQuery:=False
End Sub

VBA extract and parse data from website to Word

I'm trying to extract some data from here: http://www.hnb.hr/tecajn/f140215.dat
This is the exchange rate list from the Croatian National Bank. The file name "f140215.dat" is basically a date, formatted in the following order:
"f" "DDMMYY" ".dat"
I intend to have the data organized in a Word table, which contains the following cells:
Cell#1 where a user would manually input a date in the following format: "MMM DD, YYYY"
Cell#2 where a user would manually input the requested currency code name (USD, GBP, etc)
Cell#3 where the extracted exchange rate should appear for the specified date and currency.
Underneath the table there is an "UPDATE" button that updates the Cell#3 information. The script I'm asking for should be connected to that button.
After clicking the button, I'd like the script to do the following:
Figure out which page to go to based on the date inputted in Cell#1. For example, if Cell#1 contains "February 14, 2015", the script should point to "http://www.hnb.hr/tecajn/f140215.dat"
On that page, grab the middle value for the currency specified in Cell#2. For example, if Cell#2 contains "USD", the script should extract "6,766508" which is the middle value for "840USD001". Only the middle value is relevant.
Write this value to Cell#3.
So to sum it up, based in the criteria specified in the two table cells, the script needs to identify which page to go to and what data to extract from it, and with that data populate the third cell.
Hope I explained it well enough. This is only a part of the whole invoice generator I'm building. So far I've gotten everything to work, but this I really don't even know how to start. I can send the whole thing if needed, but figured it's not exactly relevant.
EDIT:
I watched some tutorials and played around, and this is what I got so far.
Enum READYSTATE
READYSTATE_UNINITIALIZED = 0
READYSTATE_LOADING = 1
READYSTATE_LOADED = 2
READYSTATE_INTERACTIVE = 3
READYSTATE_COMPLETE = 4
End Enum
Sub Test()
Dim ie As New InternetExplorer
ie.Visible = False
ie.navigate "http://www.hnb.hr/tecajn/f140215.dat"
Do While ie.READYSTATE <> READYSTATE_COMPLETE
DoEvents
Loop
Dim html As HTMLDocument
Set html = ie.document
MsgBox html.DocumentElement.innerText
End Sub
I know it's not much, but like I said, I'm new at this. I was able to get the data into the message box, but I have no idea how to parse it, and without that I can't really do anything mentioned above. What now?
EDIT 2:
Alright!! Made some progress! I've managed to parse it by using the split function:
Sub Test()
Dim ie As New InternetExplorer
ie.Visible = False
ie.navigate "http://www.hnb.hr/tecajn/f140215.dat"
Do While ie.READYSTATE <> READYSTATE_COMPLETE
DoEvents
Loop
Dim html As HTMLDocument
Set html = ie.document
Dim getData As String
getData = html.DocumentElement.innerText
'replaced all the space fields with line breaks
Dim repData As String
repData = Replace(getData, " ", vbCrLf)
'used line breaks as separators
Dim splData As Variant
splData = Split(repData, vbCrLf)
MsgBox splData(1)
MsgBox splData(2)
MsgBox splData(3)
End Sub
Right now it displays the parsed data in message boxes. The rest should be easy!
Addendum from OP's comment:
This is a part of the continued code:
Dim cur As String
cur = ActiveDocument.SelectContentControlsByTitle("valCombo").Item(1).Range.Text
If cur = "USD" Then
ActiveDocument.Tables(1).Cell(7, 3).Range.Text = splData(40) & " HRK"
End If
If cur = "EUR" Then
ActiveDocument.Tables(1).Cell(7, 3).Range.Text = splData(20) & " HRK"
End If
This way it works, but I'd like to set ActiveDocument.Tables(1).Cell(7, 3).Range.Text as a string. However, once I do that, it doesn't do anything. Why is that?
This should help you with the first half of your project, that being the retrieval of the data. As I mentioned in my earlier comment, data retrieval such as this is better suited to an MSXML2.ServerXMLHTTP type of object.
You will have to go into the VBE's Tools ► References and add Microsoft XML v6.0.
Sub scrape_CNB()
Dim u As String, dtDATE As Date, xmlHTTP As MSXML2.ServerXMLHTTP60
Dim sTMP As String, sCURR As String
Dim i As Long, j As Long, vLINE As Variant, vRATE As Variant
On Error GoTo CleanUp
Set xmlHTTP = New MSXML2.ServerXMLHTTP60
sCURR = "USD"
dtDATE = CDate("February 14, 2015")
With xmlHTTP
u = "http://www.hnb.hr/tecajn/f" & Format(dtDATE, "ddmmyy") & ".dat"
.Open "GET", u, False
.setRequestHeader "Content-Type", "text/xml"
.send
If .Status <> 200 Then GoTo CleanUp
sTMP = .responseText
vLINE = Split(sTMP, Chr(13) & Chr(10))
For i = LBound(vLINE) To UBound(vLINE)
If CBool(InStr(1, vLINE(i), sCURR, vbTextCompare)) Then
Do While CBool(InStr(1, vLINE(i), Chr(32) & Chr(32))): vLINE(i) = Replace(vLINE(i), Chr(32) & Chr(32), Chr(32)): Loop
vRATE = Split(vLINE(i), Chr(32))
For j = LBound(vRATE) To UBound(vRATE)
MsgBox j & ": " & vRATE(j)
Next j
Exit For
End If
Next i
End With
CleanUp:
Set xmlHTTP = Nothing
End Sub
Since you are not starting a full InternetExplorer object, this should be much quicker, and the .responseText that is returned is raw text, not HTML.
TBH, I find the cursor position based VBA programming within Word to be hard to deal with; preferring the one-to-one explicitly defined relationship(s) with an Excel worksheet. You may want to consider using Excel as a data repository and merging with Word to provide your invoice output.
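To go from the message boxes to the single value the question wants, the matched line can be reduced to its middle rate. The sketch below assumes each rate line holds the currency code followed by three space-separated values with the middle rate second (which matches the question's description of "6,766508" as the middle value for "840USD001", but has not been checked against the file); the function name MiddleRate is hypothetical.

```vba
' Sketch: extract the middle rate from one line of the .dat file.
' Assumed layout: <code> <first rate> <middle rate> <last rate>
Public Function MiddleRate(ByVal rateLine As String) As String
    Dim parts() As String

    rateLine = Trim$(rateLine)
    ' Collapse runs of spaces so Split yields one item per field
    Do While InStr(rateLine, "  ") > 0
        rateLine = Replace(rateLine, "  ", " ")
    Loop

    parts = Split(rateLine, " ")
    ' parts(0) is the currency code; parts(2) is assumed to be the middle rate
    If UBound(parts) >= 2 Then MiddleRate = parts(2)
End Function
```

In scrape_CNB, a call such as MiddleRate(vLINE(i)) could stand in for the inner For j loop once the line for the currency has been found. The value is returned as text, so the decimal comma is left untouched.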
Addendum:
Dim cur As String, t as long, r as long, c as long
cur = ActiveDocument.SelectContentControlsByTitle("valCombo").Item(1).Range.Text
t = 1: r = 7: c = 3
Select Case cur
Case "USD"
ActiveDocument.Tables(t).Cell(r, c).Range.Text = splData(40) & " HRK"
Case "EUR"
ActiveDocument.Tables(t).Cell(r, c).Range.Text = splData(20) & " HRK"
End Select

Vba- retrieve value from multiple internet explorer websites to multiple cells

Issue:
I would like to retrieve a particular value (Prev Close) from multiple webpages opened in Internet Explorer and copy the values to multiple cells (column C) automatically. I know how to retrieve a value from a single webpage into a single cell, but I have no idea how to retrieve from multiple webpages and copy the values to multiple cells.
My computer info:
1. Windows 8.1
2. Excel 2013
3. IE 11
My excel reference
Microsoft Object Library: yes
Microsoft Internet Controls: yes
Microsoft Form 2.0 Object library: yes
Microsoft Script Control 1.0: yes
URL:
http://finance.yahoo.com/q?s=hpq&type=2button&fr=uh3_finance_web_gs_ctrl1&uhb=uhb2
Below is my VBA code:
Private Sub CommandButton1_Click()
Dim ie As Object
Dim Doc As HTMLDocument
Dim prevClose As String
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = 0
ie.navigate "http://finance.yahoo.com/q;_ylt=AsqtxVZ0vjCPfBnINCrCWlXJgfME?uhb=uhb2&fr=uh3_finance_vert_gs_ctrl1_e&type=2button&s=" & Range("b2").Value
Do
DoEvents
Loop Until ie.readyState = 4
Set Doc = ie.document
prevClose = Trim(Doc.getElementById("table1").getElementsByTagName("td")(0).innerText)
Range("c2").Value = prevClose
End Sub
Don't use multiple tabs unless you really need to. It's an un-scalable solution that breaks quickly as the tabs add up.
It's far simpler and easier to just use one tab and deal with one webpage at a time using simple looping constructs. For this I am assuming that your URLs are the one you provided plus some string contained in column B.
Private Sub CommandButton1_Click()
Const YAHOO_PARTIAL_URL As String = "http://finance.yahoo.com/q;_ylt=AsqtxVZ0vjCPfBnINCrCWlXJgfME?uhb=uhb2&fr=uh3_finance_vert_gs_ctrl1_e&type=2button&s="
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = 0
For r = 2 To 10 ' Or whatever your row count is.
ie.navigate YAHOO_PARTIAL_URL & Cells(r, "B").Value
Do
DoEvents
Loop Until ie.readyState = 4
Dim Doc As HTMLDocument
Set Doc = ie.document
Dim prevClose As String
prevClose = Trim(Doc.getElementById("table1").getElementsByTagName("td")(0).innerText)
Cells(r, "C").Value = prevClose
Next r
End Sub
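Rather than hard-coding the upper bound of the loop, the last used row of column B can be looked up at run time. A small sketch; the function name LastUsedRow is made up for illustration.

```vba
' Sketch: row number of the last non-empty cell in a given column.
Public Function LastUsedRow(ByVal ws As Worksheet, ByVal columnLetter As String) As Long
    LastUsedRow = ws.Cells(ws.Rows.Count, columnLetter).End(xlUp).Row
End Function
```

The loop header above could then read For r = 2 To LastUsedRow(ActiveSheet, "B"), assuming the symbols sit on the active sheet.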