I need to fetch the price table from this page:
http://www.kieskeurig.nl/objectief/canon/ef_100mm_f2_usm/prijzen/bezorgen/167557#prijzen
So far I have developed this code to get the data
Sub TableExample()
Dim IE As Object
Dim doc As Object
Dim strURL As String
strURL = "http://www.kieskeurig.nl/objectief/canon/ef_100mm_f2_usm/prijzen/bezorgen/167557#prijzen"
' replace with URL of your choice
Set IE = CreateObject("InternetExplorer.Application")
With IE
'.Visible = True
.navigate strURL
Do Until .readyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.document
GetAllTables doc
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Sheets("Sheet1")
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.innerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
This code does work for me
But the problem is the 1st column i.e. the supplier data is not displaying in the fetched table.
Can anyone please help me on this
Replace your GetAllTables subroutine with the following:
Sub GetAllTables(doc As Object)
' get all the tables from a webpage document, doc, and put them in a new worksheet
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Sheets("Sheet1")
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
colno = 1
For Each cl In rw.Cells
If colno = 1 and nextrow > 1 then
Set classColl = doc.getElementsByClassName("shopLogoX")
Set imgTgt = classColl(nextrow - 2).getElementsByTagName("img")
rng.Value = imgTgt(0).getAttribute("alt")
Else
rng.Value = cl.innerText
End If
Set rng = rng.Offset(, 1)
I = I + 1
colno = colno + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
ws.Cells.ClearFormats
End Sub
The change is very little, actually. We use colno to track on which column we're in already in the row. Obviously, we check if we're in the first cell. If we are on the first column and not on the first row (header row), we create a collection of elements with class shopLogoX. This contains the img tags that have the alt attribute we want.
Tried, tested, and working. Let us know if this helps.
Related
I have a code that works with the selected running text but not working with the selected table cells.
Dim i As Integer
Dim oWords As Words
Dim oWord As Range
Set oWords = Selection.Range.Words
For i = 1 To oWords.Count Step 1
Set oWord = oWords(i)
''Make sure the word range doesn't include a space
Do While oWord.Characters.Last.text = " "
Call oWord.MoveEnd(WdUnits.wdCharacter, -1)
Loop
Debug.Print "'" & oWord.text & "'"
oWord.text = StrReverse(oWord.text)
Next i
I also have the code to extract each cell value but how to modify this to run on selected table cells.
First Code:
Sub Demo()
Dim x As String
Dim i As Integer
Dim j As Integer
Dim Tbl As Table
Set Tbl = ActiveDocument.Tables(1)
For i = 1 To Tbl.Rows.Count
For j = 1 To Tbl.Columns.Count
x = Tbl.Cell(i, j).Range.Text
Next j
Next i
End Sub
Second Code:
Sub testTable()
Dim arr As Variant
Dim intcols As Integer
Dim lngRows As Long
Dim lngCounter As Long
lngRows = ActiveDocument.Tables(1).Rows.Count
intcols = ActiveDocument.Tables(1).Columns.Count
arr = Split(Replace(ActiveDocument.Tables(1).Range.Text, Chr(7), ""), Chr(13))
For rw = 1 To lngRows
For col = 1 To intcols
Debug.Print "Table 1, Row " & rw & ", column " & col; " data is " & arr(lngCounter)
lngCounter = lngCounter + 1
Next
lngCounter = lngCounter + 1
Next
End Sub
Here is code that you should be able to adapt to your purpose.
Sub FindWordsInTableCells()
Dim doc As Word.Document, rng As Word.Range
Dim tbl As Word.Table, rw As Word.Row, cl As Word.Cell
Dim i As Integer, iRng As Word.Range
Set doc = ActiveDocument
For Each tbl In doc.Tables
For Each rw In tbl.rows
For Each cl In rw.Cells
Set rng = cl.Range
rng.MoveEnd Word.WdUnits.wdCharacter, Count:=-1
For i = 1 To rng.words.Count
Set iRng = rng.words(i)
Debug.Print iRng.Text
Next i
Next cl
Next rw
Next tbl
End Sub
If you want to only use cells that are currently selected then use this adaptation of the above routine.
Sub FindWordsInSelectedTableCells()
Dim rng As Word.Range
Dim cl As Word.Cell
Dim i As Integer, iRng As Word.Range
If Selection.Information(wdWithInTable) = True Then
For Each cl In Selection.Cells
Set rng = cl.Range
rng.MoveEnd Word.WdUnits.wdCharacter, Count:=-1
For i = 1 To rng.words.Count
Set iRng = rng.words(i)
rng.Select
'insert your word manipulation code here
Debug.Print Selection.Text
Next i
Next cl
End If
End Sub
as part of a research project i need to extract as much data as possible from a webpage. The problem is to access each table i have to follow lots of links which I can't get to work automatically.
Its from a greyhound-data.com. So an example would be I want to extract all the racing stats for every dog that raced in swindon between 1st Jan 2017- 28th Feb 2018. When i put it in the search engine I get 57236 races in a table. I have to follow the link on name of race for each race..
http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=2
My biggest problem is I don't know how to say follow the various links. And I don't know how to loop multiple times - once for each of the races in the original list.
I have created the simple Macro query :
Sub GetData()
Dim IE As Object
Dim doc As Object
Dim strURL As String
Dim I As Integer
For I = 1 To 9
strURL = "http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=" + Trim(Str(I))
Set IE = CreateObject("InternetExplorer.Application")
With IE
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
Next I
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
End Sub
it retrieves the all races data from the url automatically. But can not make the next step. on each page there is a "name of the race" tab and I need to get all the data on each page for each row. It is because I need to get the information of 1st place, 2nd place and third place.
Thanks for your time I know its a bit garbled!!
My new code after changes is looking like this:
Sub GetData()
Dim IE As Object
Dim doc As Object
Dim strURL As String
Dim I As Integer
For I = 1 To 9
strURL = "http://www.greyhound-data.com/d?racename=&country=13000&startmonth=3&endmonth=2&startdate=2017&enddate=2018&maxdist=unlimitied&class=any&order=dateD&x=" + Trim(Str(I))
Set IE = CreateObject("InternetExplorer.Application")
With IE
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
.Quit
End With
Next I
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
I = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(I, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(I, 2).Value Then Cells(I, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
I = I - 1 'we decrease row position
Loop
End Sub
but the case is that it returns the empty table as in this link : https://imageshack.us/i/poC4yhEZp
This code, after you get all your data, will check every race from end of list to start of list. And it will add in column A the related link to race.
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
I = I + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -I)
I = 0
Next rw
Next tbl
I = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(I, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(I, 2).Value Then Cells(I, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
I = I - 1 'we decrease row position
Loop
End Sub
In HTML all <a> tags are like this:
Henlow 26 Feb 2018 HT 5
The href attribute contains the link related to text between <a> and </a>. You can get it with a.href in VBA
To know the text between <a> and </a> you can use a.InnerText
What i did is just a simple loop to check every <a> tag. If the InnerText matchs the value in the cell, then I get the href attribute.
This code will get you all the links you want in your question. Just adapt the code to your needs (I pasted them in column A, but maybe you want to do something else with them).
You need 2 references for this code to work;
Microsoft HTML Object Library
Microsoft Internet Controls
And this is the final result:
I have done really good progress in my macros while getting the data in different URL's. with the code block below I am getting the error like "Undefined Object":
Sub GetData()
Dim IE As Object, doc As Object
Dim strURL As String, myDate As Date
Set IE = CreateObject("InternetExplorer.Application")
With IE
For myDate = CDate("01-05-2017") To CDate("05-05-2017")
strURL = "https://www.ukdogracing.net/racecards/" & Format(myDate, "dd-mm-yyyy") & "/monmore" ' Trim(Str(I))
.navigate strURL
Do Until .ReadyState = 4: DoEvents: Loop
Do While .Busy: DoEvents: Loop
Set doc = IE.Document
GetAllTables doc
Next myDate
.Quit
End With
End Sub
Sub GetAllTables(doc As Object)
Dim ws As Worksheet
Dim rng As Range
Dim tbl As Object
Dim rw As Object
Dim cl As Object
Dim tabno As Long
Dim nextrow As Long
Dim I As Long
Dim ThisLink As Object 'variable for <a> tags
Set ws = Worksheets.Add
For Each tbl In doc.getElementsByTagName("TABLE")
tabno = tabno + 1
nextrow = nextrow + 1
Set rng = ws.Range("B" & nextrow)
rng.Offset(, -1) = "Table " & tabno
For Each rw In tbl.Rows
For Each cl In rw.Cells
rng.Value = cl.outerText
Set rng = rng.Offset(, 1)
myDate = myDate + 1
Next cl
nextrow = nextrow + 1
Set rng = rng.Offset(1, -myDate)
myDate = 0
Next rw
Next tbl
myDate = Range("B" & Rows.Count).End(xlUp).Row 'last row with data
Do While Cells(myDate, 1).Value = "" 'will loop until first not blank found in column A (starting from last row of data, from end to start)
For Each ThisLink In doc.getElementsByTagName("a") 'we check all <a> tags
If ThisLink.innerText = Cells(myDate, 2).Value Then Cells(myDate, 1).Value = ThisLink.href 'If the innertext is the name of the race, in column A we add link
Next ThisLink
myDate = myDate - 1 'we decrease row position
Loop
End Sub
the issue is in line :
Do While Cells(myDate, 1).Value = ""
the aim of this line is that, continue to get the data from the links if the page has empty return. What is my mistake could you please help me? It is the last part of my work of macros. Thanks in advance.
I think the error you are experiencing may be to do with the value of myDate. It can't be zero.
If I do this:
Option Explicit
Sub TEST()
Dim myDate As Date
myDate = 1
Do While Cells(myDate, 1).Value = ""
myDate = myDate - 1
Loop
End Sub
I get the Object defined error. I avoid this with:
Option Explicit
Sub TEST()
Dim myDate As Date
myDate = 1
Do While Cells(myDate, 1).Value = ""
myDate = myDate - 1
If Int(myDate) = 0 Then Exit Do
Loop
End Sub
I have an issue which I use a lot of manual time on currently.
I have following simple data:
And I wish to convert all the accounts downwards with the name next to the accounts in another column. Currently I do this by using the 'text to columns' function and then manually copy the names down.. HARD work.. :)
This is an example of my wish scenario..
Hope you are able to help..
Thanks a lot
Kristoffer
The following short macro will take data from Sheet1 and output records in Sheet2:
Sub DataReorganizer()
Dim i As Long, j As Long, N As Long
Dim s1 As Worksheet, s2 As Worksheet
Set s1 = Sheets("Sheet1")
Set s2 = Sheets("Sheet2")
N = s1.Cells(Rows.Count, "A").End(xlUp).Row
j = 1
For i = 2 To N
v1 = s1.Cells(i, 1)
ary = Split(s1.Cells(i, 2), ";")
For Each a In ary
s2.Cells(j, 1).Value = v1
s2.Cells(j, 2).Value = a
j = j + 1
Next a
Next i
End Sub
Input:
and output:
Try this
Option Explicit
Sub Test()
Dim rng As Excel.Range
Set rng = ThisWorkbook.Worksheets.Item(1).Cells(1, 1).CurrentRegion
Set rng = rng.Offset(1)
Set rng = rng.Resize(rng.Rows.Count - 1)
Dim vPaste
Dim lTotalRows As Long
Dim lPass As Long
For lPass = 0 To 1
Dim rowLoop As Excel.Range
For Each rowLoop In rng.Rows
Dim sName As String
sName = rowLoop.Cells(1, 1)
Dim sAccounts As String
sAccounts = rowLoop.Cells(1, 2)
Dim vSplitAccounts As Variant
vSplitAccounts = VBA.Split(sAccounts, ";")
If lPass = 0 Then
lTotalRows = lTotalRows + UBound(vSplitAccounts) + 1
Else
Dim vLoop As Variant
For Each vLoop In vSplitAccounts
lTotalRows = lTotalRows + 1
vPaste(lTotalRows, 1) = sName
vPaste(lTotalRows, 2) = vLoop
Next vLoop
End If
Next
If lPass = 0 Then
ReDim vPaste(1 To lTotalRows, 1 To 2)
lTotalRows = 0
End If
Next
ThisWorkbook.Worksheets.Item(2).Cells(1, 1).Value = "Name"
ThisWorkbook.Worksheets.Item(2).Cells(1, 2).Value = "Account"
Dim rngPaste As Excel.Range
Set rngPaste = ThisWorkbook.Worksheets.Item(2).Cells(2, 1).Resize(lTotalRows, 2)
rngPaste.Value2 = vPaste
End Sub
I have code here which loops through a list of files; opening them, extracting data and moving it into the main workbook. What i am looking to do get it so the data for abel is in columns c and d but then put varo in f and g etc. the problem that i see is that the placement code is inside the loop so for each i it will just write over the previous line instead of being in a different column all together!
Sub Source_Data()
Dim r
Dim findValues() As String
Dim Wrbk As Workbook
Dim This As Workbook
Dim sht As Worksheet
Dim i
Dim tmp
Dim counter
Dim c As Range
Dim firstAddress
Dim rng As Range
ReDim findValues(1 To 3)
findValues(1) = "abel"
findValues(2) = "varo"
findValues(3) = "Tiger"
counter = 0
r = Range("A1").End(xlDown).Row
Set rng = Range(Cells(1, 1), Cells(r, 1))
Set This = ThisWorkbook
For Each tmp In rng
Workbooks.Open tmp
Set Wrbk = ActiveWorkbook
Set sht = ActiveSheet
For i = 1 To 3
With sht.Range(Cells(1, 1), Range("A1").SpecialCells(xlCellTypeLastCell))
Set c = .Find(findValues(i), LookIn:=xlValues)
If Not c Is Nothing Then
firstAddress = c.Offset(0, 2).Value
Do
This.Activate
tmp.Offset(0, 2).Value = tmp.Value
tmp.Offset(0, 3).Value = firstAddress
Set c = .FindNext(c)
counter = counter + 1
Loop While Not c Is Nothing And c.Value = firstAddress
End If
End With
Wrbk.Activate
Next
Wrbk.Close
Next tmp
End Sub
**EDIT:**I know it can be done by adding a multiplier of "i" to the offset value but this makes things bigger than they need to be if i wish to search for 50 keywords
Here is my answer, hope to help you, and as always, if you need an improvement, just tell me.
Sub Source_Data()
Dim r
Dim findValues() As String
Dim Wrbk As Workbook
Dim This As Workbook
Dim sht As Worksheet
Dim i
Dim tmp
Dim counter
Dim c As Range
Dim firstAddress
Dim rng As Range
Dim ColNum 'the columns number var
ReDim findValues(1 To 3)
findValues(1) = "abel"
findValues(2) = "varo"
findValues(3) = "Tiger"
counter = 0
r = Range("A1").End(xlDown).Row
Set rng = Range(Cells(1, 1), Cells(r, 1))
Set This = ThisWorkbook
For Each tmp In rng
Workbooks.Open tmp
Set Wrbk = ActiveWorkbook
Set sht = ActiveSheet
For i = 1 To 3
With sht.Range(Cells(1, 1), Range("A1").SpecialCells(xlCellTypeLastCell))
Set c = .Find(findValues(i), LookIn:=xlValues)
If Not c Is Nothing Then
firstAddress = c.Offset(0, 2).Value
Do
This.Activate
Select Case i 'Test var i (the value)
Case "abel" 'in case the value (that is a string) is equal to...
ColNum = 1 'set the var, with the number of the column you want
Case "varo" 'in case the value...
ColNum = 2 'Set the column...
Case "Tiger"
ColNum = 3
Case Else 'In case that the i var not match with anyvalue take this column number
ColNum = 20 'the garbage!
End Select
tmp.Offset(0, ColNum).Value = tmp.Value 'Put the value in the selected columns
tmp.Offset(0, ColNum + 1).Value = firstAddress 'and put the value to the next column of the
'selected column
Set c = .FindNext(c)
counter = counter + 1
Loop While Not c Is Nothing And c.Value = firstAddress
End If
End With
Wrbk.Activate
Next
Wrbk.Close
Next tmp
End Sub
Note:
You need to set the ColNum var to the values that you need, put there the numbers of the columns you really need to store the value of i and the next line is to put the address of the i var
You can just change these two lines:
tmp.Offset(0, 2).Value = tmp.Value
tmp.Offset(0, 3).Value = firstAddress
To this
tmp.Offset(0, 2 + (i-1)*2).Value = tmp.Value
tmp.Offset(0, 3 + (i-1)*2).Value = firstAddress