In Excel VBA, how can I use IMPORTHTML function and increment the date in the html link

I am importing a lot of data into an Excel file from an HTML page that stores the data on different pages by year, with similar links, except for the year. Is it possible to write a macro with a for loop to import data from every year on one sheet, by incrementing the year in the HTML link? I am currently using:
=IMPORTHTML("http://www.abc1900.com","table",0)
to import the data into the excel sheet. I would like to write this:
=IMPORTHTML("http://www.abc1900.com","table",0)
=IMPORTHTML("http://www.abc1901.com","table",0)
=IMPORTHTML("http://www.abc1902.com","table",0)
etc. I can't quite figure it out without ruining the link. Thanks.

Excel doesn't provide a function called IMPORTHTML so I assume this is a UDF.
Nevertheless, with a helper column containing the years, you can concatenate each year into the URL string:
      A        B
1     1900     =IMPORTHTML("http://www.abc" & A1 & ".com","table",0)
2     =A1+1    =IMPORTHTML("http://www.abc" & A2 & ".com","table",0)
and copy the row 2 formulas down.
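The concatenation idea can also be sketched outside the worksheet. The snippet below is only an illustration in Python, not part of the original answer: it builds one formula string per year, assuming the placeholder http://www.abcYYYY.com pattern from the question. The function name build_formulas and the three-year range are made up for the example.

```python
def build_formulas(first_year, last_year):
    """Return one IMPORTHTML formula string per year, both ends inclusive."""
    formulas = []
    for year in range(first_year, last_year + 1):
        # Only the year changes; the rest of the (placeholder) URL stays fixed.
        url = "http://www.abc{}.com".format(year)
        formulas.append('=IMPORTHTML("{}","table",0)'.format(url))
    return formulas


formulas = build_formulas(1900, 1902)
for formula in formulas:
    print(formula)
```

Building the URL in a separate step keeps the quoting simple, which is usually where the link gets "ruined" when the year is spliced in by hand.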

Is there a way to change a vlookup table_array in VBA?

Basically, I am trying to use VBA to change the VLookup table_array from one tab to another. Currently the VLookup value is something like this
=+vLookup($A4,'02.19.21'!$A$1:$O$9000,F$1,False)
02.19.21 is the name of the tab the data is pulled from, but every week a new tab is created with the most recent data, so next week the lookup will need to pull from a new tab called "02.26.21".
I need VBA to change '02.19.21' to '02.26.21', which is where the new data will be pulled from.
Edit: I have tried the macro recorder, but the issue is that the data changes weekly, meaning that if I record the change to 02.26.21, then when I need to do it for the following week, March 5th, it would not return that data.
I tried the Date function in VBA and then used it as a string, since the VLOOKUP would be looking at a tab called 02.26.21, but got errors when I did that.
I would consider using a named range for your vlookups instead of a direct sheet+range reference.
See here for named ranges for example: https://www.contextures.com/xlnames01.html
So your vlookup goes from this:
=+vLookup($A4,'02.19.21'!$A$1:$O$9000,F$1,False)
to this:
=+vLookup($A4,LookupTable,F$1,False)
So then when your code needs to change the source sheet, you can do something like:
ThisWorkbook.Names("LookupTable").RefersTo = _
    ThisWorkbook.Sheets("02.26.21").Range("A1:O9000")
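The weekly part of the problem is date arithmetic on the tab name. As a rough sketch in Python (not the answerer's code), the next tab name can be derived from the current one, assuming tabs are always named MM.DD.YY and created exactly seven days apart. The helper names next_weekly_tab and refers_to are hypothetical.

```python
from datetime import datetime, timedelta

TAB_FORMAT = "%m.%d.%y"  # assumption: tabs are named MM.DD.YY, e.g. "02.19.21"


def next_weekly_tab(current_tab, weeks=1):
    """Return the tab name that lies `weeks` weeks after `current_tab`."""
    current = datetime.strptime(current_tab, TAB_FORMAT)
    return (current + timedelta(weeks=weeks)).strftime(TAB_FORMAT)


def refers_to(tab_name, cell_range="$A$1:$O$9000"):
    """Build a reference string for the named range.

    The sheet name is wrapped in single quotes because it contains dots.
    """
    return "='{}'!{}".format(tab_name, cell_range)


print(next_weekly_tab("02.19.21"))
print(refers_to(next_weekly_tab("02.26.21")))
```

Parsing the name into a real date before adding a week avoids month rollover mistakes, such as the step from 02.26.21 to 03.05.21.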

Referencing a sheet by index number

I've got a LibreOffice Calc spreadsheet that I use to keep track of my accounts receivable at work. Each sheet lists invoices and their status (paid, unpaid, etc) as well as info about each invoice. I'm trying to create a Summary sheet that lists certain data from each sheet. Creating the sheet manually is easy, but I'm trying to "automate" the process. I want the summary page to auto-update if I add a new sheet (or remove one) as I add and remove accounts to the file.
I know that LibreOffice assigns each sheet an index number that I could refer to in some sort of formula, but I cannot find a function that I can use to refer to that index number when getting a value from a cell within it. One would expect that a function like Sheet(2) would reference the second sheet, but, alas, that is not so!
I've tried using the indirect and address functions without success, but I'm not sure if I'm not understanding these functions or if they're not appropriate for what I'm trying to accomplish.
This has been a missing piece in Calc for a long time. The preferred solution is to write a user-defined function. Spreadsheet formulas do not access sheets by index number but Basic can.
The following function is from https://ask.libreoffice.org/en/question/16604/how-do-i-access-the-current-sheet-name-in-formula-to-use-in-indirect/.
Function SheetName(Optional nSheet)
    If IsMissing(nSheet) Then
        SheetName = ThisComponent.getCurrentController().getActiveSheet().getName()
    Else
        SheetName = ThisComponent.getSheets().getByIndex(nSheet-1).getName()
    EndIf
End Function
Then get a relative address of the first sheet cell A1 like this.
=ADDRESS(1,1,4,,SHEETNAME(1))
A slightly different function is given at https://forum.openoffice.org/en/forum/viewtopic.php?f=9&t=49799.

grab and filter from more than 255 columns from a huge closed workbook

I have a huge workbook (0.6 million rows and 315 columns) whose column names I need to grab into an array. Due to the huge size, I don't want to open and close the workbook to copy the first row of the range. Also, I only want to grab the columns from the first row that begin with the word "Global ".
Can anyone help with a short code example of how to go about this? Please note that I have tried ADOX, ADO, etc., but both show the 255-column limitation. I also don't want to open the workbook, but rather pull the required "Global " columns from the 315 columns into an array.
Any help is most appreciated.
You can copy the first row of your target by opening a new workbook, and in A1 use this formula:
='C:\PATH_TO_TARGET\[TARGET_FILE_NAME.xlsx]WORKSHEET_NAME'!A1
Note that PATH+FILENAME+WORKSHEET is enclosed in single quotes, the FILENAME is enclosed in square brackets, and an exclamation mark separates the cell reference.
Then copy/paste or fill right to get the next 314 columns. Note: this formula will return zero for empty target cells.
Once you have the column headings, you can copy and Paste Special > Values if you want to destroy the links to the closed workbook.
Hope that helps
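To make the fill-right step concrete, here is a small Python sketch (an illustration only, not part of the answer above). It generates the 315 external-reference formulas and then filters a list of returned headers for the "Global " prefix. The helper names and the sample headers in the usage line are invented, and the path, file and sheet names are the same placeholders used in the formula above.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def header_formulas(path, filename, sheet, n_columns):
    """Build one closed-workbook reference per column of row 1."""
    prefix = "='{}[{}]{}'!".format(path, filename, sheet)
    return [prefix + column_letter(i) + "1" for i in range(1, n_columns + 1)]


def keep_global(headers):
    """Keep text headers starting with "Global ".

    Empty target cells come back as 0, so non-string values are skipped.
    """
    return [h for h in headers if isinstance(h, str) and h.startswith("Global ")]


formulas = header_formulas(
    "C:\\PATH_TO_TARGET\\", "TARGET_FILE_NAME.xlsx", "WORKSHEET_NAME", 315
)
print(formulas[0])
print(formulas[-1])
print(keep_global(["Global Sales", 0, "Region", "Global Cost"]))
```

Column 315 corresponds to LC, so the last generated reference ends in !LC1.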
You could use the Python programming language.
While it does not natively work with XLSX files, you just have to install the external openpyxl module from here: https://pypi.python.org/pypi/openpyxl
(You will also have to install Python, of course; just download it from www.python.org.)
It will make working with your data in an interactive Python session a piece of cake, and the time to open the workbook without having to load the Excel interface should be a fraction of what you are expecting. (I think it will have to fit in your memory, though.)
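But this is all I had to type in an interactive Python session to open a workbook and retrieve the column names that start with "bl":
import openpyxl
a = openpyxl.load_workbook("bla.xlsx")
# rows is an iterator in current openpyxl, so take the first row with next();
# the isinstance check skips empty or numeric header cells
[cell.value for cell in next(a.worksheets[0].rows) if isinstance(cell.value, str) and cell.value.startswith("bl")]
output:
Out[8]: ['bla', 'ble', 'bli', 'blo', 'blu']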
The last input line requires some Python knowledge to be understood, so here is a summary of what happens: Python is a language very fond of working with sequences, and the openpyxl library gives you your workbook as just that:
an object which is a sequence of worksheets, each worksheet having a rows attribute which yields all rows in the sheet, and each row being a sequence of cells. Each cell has a value attribute which holds its content.
The inline for statement is the compact form, but it could be written as a multi-line statement:
In [10]: for cell in next(a.worksheets[0].rows):
   ....:     if isinstance(cell.value, str) and cell.value.startswith("bl"):
   ....:         print(cell.value)
   ....:
bla
ble
bli
blo
blu
Keep in mind that by exploring Python a bit deeper, you can programmatically manipulate your data in a way that will be easier than working interactively, given a data set this size. You can even use Python itself to drop selected contents into an SQL database (including its built-in, single-file database, SQLite), where sophisticated indexes and queries can make working with your data a breeze.
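As a rough illustration of the SQLite idea (a sketch with made-up header names, not data from the asker's workbook), the standard-library sqlite3 module can hold the column names and filter them with a query. The table name headers and its columns are assumptions chosen for the example.

```python
import sqlite3

# Hypothetical header names standing in for the first row of the workbook.
headers = ["Global Sales", "Region", "Global Cost", "globalish"]

conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
conn.execute("CREATE TABLE headers (position INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO headers (position, name) VALUES (?, ?)",
    list(enumerate(headers, start=1)),
)

# GLOB is case-sensitive, unlike SQLite's default LIKE for ASCII text.
rows = conn.execute(
    "SELECT name FROM headers WHERE name GLOB 'Global *' ORDER BY position"
).fetchall()
global_columns = [name for (name,) in rows]
print(global_columns)
conn.close()
```

Storing the position alongside the name keeps the original column order available for later lookups.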

Excel - Copy the displayed value not the actual value

I am a pilot, and use a logbook program called Logten Pro. I have the ability to take excel spreadsheets saved from my work flight management software, and import them into Logten Pro using the CSV format.
My problem, however, is that the work flight management software exports the date and time of take-off of a flight into one cell in the following Excel format: DD/MM/YYYY H:MM:SS PM.
This is handled fine by Excel, and is formatted by default as DD/MM/YY even though the actual value is more specific, comprising the full date and time group.
This is a problem because Logten Pro will only auto-import the date if it is in DD/MM/YY format, and there is no way to pull out just the displayed DD/MM/YY date rather than the full date time group actual value, unless you manually go through and delete the extra text from the function box.
My question is: Is there a VBA macro that can automatically copy the actual displayed text, and paste it into another cell, changing the actual value as it does, to just the DD/MM/YY value? Additionally, can this be made to work down a whole column rather than individual cells at a time?
Note I have no VBA experience so the perfect answer would just be a complete VBA string I could copy and paste.
Thank You.
As pointed out in the comments, you'd better not use VBA but formulas instead.
This formula:
=TEXT(A1,"dd/mm/yy")
will return the formatted date as text. You can fill the formula down the whole range of your cells and then use Copy/Paste Special > Values so that only the values needed for the import into Logten Pro remain.
There are three options using formulas.
Rounddown
Excel stores the date time as a number and uses formatting to display it as a date.
The format is date.time, where the integer is the date and the fraction is the time.
As an example
01/01/2012 10:30:00 PM is stored as 40909.9375
All the values after the decimal place relate to the hours and minutes
So a formula could be used to round the number down to a whole number.
=ROUNDDOWN(A1,0)
Then format the value as a short date.
It will then display as 01/01/2012
INT
As above, but using a different formula to get rid of the fraction (time)
=INT(A1)
Text
Alternatively, the date alone could be extracted as text using this formula
=TEXT(A1,"dd/mm/yyyy")
It will then display as 01/01/2012
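The date.time storage described above can be checked with a few lines of Python. This is a minimal sketch, assuming the usual 1900 date system, in which serial numbers from 1 March 1900 onward count days from 30 December 1899. The function names are made up for the example.

```python
from datetime import datetime, timedelta

# Day zero for serial numbers on or after 1 March 1900 (assumed 1900 date system).
EXCEL_EPOCH = datetime(1899, 12, 30)


def from_excel_serial(serial):
    """Convert a serial number (integer part = date, fraction = time) to a datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


def date_only(serial):
    """Drop the time fraction, like INT or ROUNDDOWN, and format as dd/mm/yyyy."""
    return from_excel_serial(int(serial)).strftime("%d/%m/%Y")


stamp = from_excel_serial(40909.9375)
print(stamp)
print(date_only(40909.9375))
```

The fraction 0.9375 is 22.5 hours out of 24, which is why the example value corresponds to 10:30 PM.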
I'm a bit late to the party on this one, but I recently came across this while searching for answers to a similar problem.
Here is the answer I finally came up with.
Option Explicit
Sub ValuesToDisplayValues()
    Dim ThisRange As Range, ThisCell As Range
    Set ThisRange = Selection
    For Each ThisCell In ThisRange
        ThisCell.Value = WorksheetFunction.Text(ThisCell.Value, ThisCell.NumberFormat)
    Next ThisCell
End Sub
This answers the question as asked, except that the new values are pasted over the existing ones rather than into a new cell, as there is no simple way to know where you would want the new values to be pasted. It works on the whole range of selected cells, so you can do a whole column if needed.

How to AutoSum in Excel using VB.Net

Via VB.Net, is there any way to access the AutoSum feature that Excel has? I have a spreadsheet that I create and populate via a datatable using my application. I know how to sum based upon a predefined range (e.g., .cells(cnt + 1, 21).Formula = "=Sum(U3:U" & cnt & ")") but is there any way that I can just call a cell in my worksheet and have it AutoSum as if I was clicking the AutoSum button in Excel for that row? This would save me a lot of coding time based upon the logic my spreadsheet is going to need. Thanks for any help.
So are you trying to put the result of the sum of the complete row in the cells? Or do you want to put the "autosum" formula in the Excel sheet?
Either way, you can do it by selecting that particular cell and then writing the corresponding value or formula to the value or formula field of that cell.