collate multiple openpyxl worksheets into a different workbook? - openpyxl

I've written some code that runs a number of simulations (a few dozen), and for each run of the sim it builds an openpyxl worksheet with the result data. I'm having trouble figuring out an effective mechanism to put each of these worksheets back together into a single workbook.
I found "Copy whole worksheet with openpyxl", which is disheartening, as it suggests that I probably can't "just copy a worksheet as an object into a new workbook".
Are there other mechanisms that might work here? Is it even legal (sensible?) to pass just a worksheet object around from one function to another, or is this totally barking up the wrong tree?
I suppose I could instead create a whole separate workbook in each simulation run, and then collect all those workbooks, and then for each one do a brute-force copy of the data in each worksheet over to a new worksheet in the main workbook... but that just SEEMS like the wrong way to approach it. Is there something more elegant that I haven't considered?
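One possible approach, sketched here under some assumptions: instead of moving worksheet objects between workbooks, each simulation run returns plain rows (a list of lists), and a single function writes them into one workbook, one sheet per run. The names run_simulation, collate_into, and write_results are hypothetical, and the sample data is made up. The only openpyxl calls relied on are Workbook(), create_sheet(), append(), remove(), and save().

```python
def run_simulation(run_id):
    """Hypothetical stand-in for one simulation run: a header row plus data rows."""
    rows = [["step", "value"]]
    for step in range(3):
        rows.append([step, run_id * 10 + step])
    return rows


def collate_into(workbook, results):
    """Add one worksheet per run to `workbook`.

    `results` maps a sheet title to that run's rows. Only create_sheet()
    and append() are used, so each worksheet is built inside the workbook
    that will own it and never has to be copied between workbooks.
    """
    for title, rows in results.items():
        sheet = workbook.create_sheet(title=title)
        for row in rows:
            sheet.append(row)
    return workbook


def write_results(path, results):
    """Build a fresh openpyxl workbook from `results` and save it to `path`."""
    # Third-party import kept local so the helpers above work without openpyxl.
    from openpyxl import Workbook

    workbook = Workbook()
    workbook.remove(workbook.active)  # drop the default empty sheet
    collate_into(workbook, results)
    workbook.save(path)


results = {f"run_{i}": run_simulation(i) for i in range(1, 4)}
# write_results("simulations.xlsx", results)  # uncomment with openpyxl installed
```

Passing plain rows around avoids the problem entirely, since an openpyxl worksheet belongs to the workbook that created it. If the simulations need to set cell formatting too, another option is to create the workbook first and pass it to each run so that the run adds its own sheet.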

Related

Formatting Data in excel sheet with blue prism

I'm trying to run a duplicate check in which varying data is pulled from a website and compared to a master list stored in Excel. The information from the website is read from a table that has line breaks. These breaks carry over to the data collection they are initially stored in. Some of the data from the website is eventually written to the master list in Excel. So when I read the master list back into Blue Prism to run a duplicate check, the rows that have line breaks are written into a collection as multiple rows (e.g. I should have only 7 rows in my collection but am getting 42). Since the rows are not EXACTLY the same between the two collections, the automation does not recognize the duplicates when it runs.
The easiest way to solve this would be if I could make the collection rows have no line breaks as soon as the data is read. I've attempted to use the calculation stage to do so with no luck. I'm not sure if it is actually possible to do this, but would appreciate any direction.
Record an Excel macro to do the data sorting/cleaning in Excel (possibly Text To Columns, etc..) and then include the running of the macro as part of your Blue Prism process by using an action stage and the MS Excel VBO - Run Macro. Get the process to create an Excel instance (and create a handle data item from that stage), then use Open Workbook (whatever workbook you store your Macro in) and then use the MS Excel VBO - Run Macro (use the same handle created earlier and type in the name of the "macro").
It sounds like what is happening is that the MS Excel VBO is grabbing the data from the Excel Worksheet wholesale.
This is to say that it's accessing your Worksheet table, copying the cell values BUT not the cell formatting data, and then dumping the values into a BP collection.
Since it did not bring along any of the original cell formatting data to reference when it went to populate the collection, it's just breaking up the values based on carriage returns/line breaks. Thus, your collection is organized based on that, and not on the original Worksheet cells.
So, with that said, on to a solution!
Solution 1
Brute force the organization of the incoming Excel cell data to the collection by looping over the Excel Worksheet cell-by-cell.
Run a loop, and in that loop have BP go into the Excel Worksheet and grab the first populated cell it comes across. Run a formatting/cleanup Calculation stage over the data. Dump the cell value into a single collection field.
Repeat.
This is...inelegant, expensive at best, and not at all recommended for any medium to large dataset. But it's definitely the best way to do string manipulation and value comparisons before the data hits your collection. Since it sounds like you're using a Master template, you also know what the expected format of your data should be.
This method will enable you to implement Trim(), Concat(), or Split() in a Calculation stage to better organize your incoming data before you dump it into a collection.
This is also basically what I think you're already trying to do, but cell-by-cell instead of Worksheet row-by-row or table-by-table.
Solution 2
Clean up the table data you grab from the website before you dump it into the Excel Worksheet.
This is basically Solution 1, but in reverse. Simply format/clean up your data before it hits your Excel Worksheet.
I'm not sure this is any better than Solution 1, but, you know, it's something...
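To make the cleanup idea concrete, here is a minimal sketch of the logic in Python rather than Blue Prism. It assumes each row is a list of text cells. The names normalise_cell and find_duplicates and the sample rows are made up for illustration. In Blue Prism the same normalisation would have to be expressed in a Calculation or Code stage.

```python
def normalise_cell(text):
    """Collapse carriage returns, line feeds, and repeated spaces into single spaces."""
    return " ".join(text.split())


def find_duplicates(website_rows, master_rows):
    """Return the website rows that already exist in the master list after cleanup."""
    master_keys = {tuple(normalise_cell(cell) for cell in row) for row in master_rows}
    duplicates = []
    for row in website_rows:
        key = tuple(normalise_cell(cell) for cell in row)
        if key in master_keys:
            duplicates.append(row)
    return duplicates


# Made-up sample data: the first website row differs from the master row
# only by embedded line breaks.
website_rows = [["ACME\r\nLtd", "12 High St\nLondon"], ["New Co", "1 Main Rd"]]
master_rows = [["ACME Ltd", "12 High St London"]]
duplicates = find_duplicates(website_rows, master_rows)
```

Normalising both sides before comparing means the check no longer depends on whether the line breaks were removed when the data was written or when it was read back.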
Solution 3
Format the cell data IN the MS Excel Worksheet itself.
Basically rearrange the cells and cell data in the Excel Worksheet into a more predictable format by using the Split, Trim, Merge, or other actions included in the MS Excel VBO. You can also do this using the Data - OLEDB utility object, but that requires some pretty solid understanding of SQL syntax.
This would look like this using the MS Excel VBO:
Grab the Excel Worksheet data wholesale and dump into a collection
Count the rows/fields of the collection
Is that number consistent with the desired/expected format of your data?
If not, have the bot go back into the Excel Worksheet and reformat the cells by removing any carriage returns/line breaks/whatever else
Repeat.
However, I'm always reluctant to reformat any original source, as it's then hard to figure out what went wrong and where it went wrong when you've changed the original structure of your data. So it's best to always make a copy of the Worksheet before you make any manipulation.
Unfortunately I don't have access to my BP environment at the moment or I'd provide you with the exact object actions you'd need to do any of this, my bad. Once I do I'll update this answer.

VBA - Merging of Excel Workbooks - Possibly running out of Memory

I've done some google searching and looked around stackoverflow but haven't come across something which resolves the below.
Basically I have around 46 workbooks and worksheets which I need to merge into one for analysis purposes by another software.
wbBank.Sheets(c).Copy Before:=wbStructure.Worksheets(wbStructure.Sheets.Count)
For some of these worksheets in 'wbStructure', I need to do a copy functionality (per ISO currency or country) which means that from 46 respective worksheets, I end up with around 1300 worksheets for this particular workbook that is ending up consuming too much memory.
For each worksheet I also do the following:
Insert Rows / Columns depending on parameters set
go through each row and columns and put in descriptions
The functionality of all of this used to work correctly, however I had to change the functionality recently to keep the same excel object rather than opening a new Excel instance due to compatibility issues between Excel 2016 and Excel 2010.
Dim oXLApp As Object
Set oXLApp = GetObject(,"Excel.Application")
The reason I need to use GetObject is that I need to get data from the current Excel document, which contains the macro, and copy and paste this data through the other application. When I tried placing data in another object and then pasting the data, the formatting was lost, which creates an issue.
Are there any pointers one can give me to reduce the memory consumption, which is resulting in unusual behavior: random crashes throughout, errors such as "Not enough system resources to display correctly", etc.? The final workbook is substantially large, but it has to remain so because it is an input for another system.
Any feedback is highly appreciated.

Transfer all cells, formatting, formulas into new workbook for all sheets

My Excel workbook is getting incredibly slow to save, even though I have manual calculation on, nothing crazy going on in the vlookups, etc. But it spends so much time recalculating and I am wasting hours on this.
Is there any clean, easy way to just copy/paste all the contents of all the sheets (cell values, equations, etc) into a new workbook just in case the old one happens to be corrupted?
Check each sheet by hitting Ctrl+End. Does any of them take you to a row and/or column far outside of your actually used range? If so, delete the entire rows and columns past your REALLY used last cell (repeat for each worksheet) and then save a copy of your file.
Also, the xlsb file format is smaller so quicker to save and open.
Finally, if your wb is slow to calculate there is room for improvement. Any SUMIF(s) or array formulas in there? You say nothing crazy on the vlookup. What do you mean by nothing crazy? How many are there and to how many cells do they refer?

Excel VBA Job : Updation of master sheet

Scenario :
I have two workbooks: a "master" workbook and a "servant" workbook. The "servant" workbook is a subset of the "master" workbook. Both workbooks contain a column called "Commodity". The "master" workbook contains an exhaustive list of all items, whereas the "servant" workbook has only a few of those items, but it contains new, updated values for those few items.
Job given to me :
I had to manually first pick the commodity from "servant" workbook find that "commodity" in "master" workbook and update the master workbook with the new values which are populated in "servant" workbook.
Why I had to find each commodity manually is because sorting of data in both sheets doesn't match.
My Code :
Sub Macro4()
'
' Macro4 Macro
'
Windows("test2.xlsx").Activate
Range("A2:A15").Select
Selection.Copy
Windows("Test.xlsx").Activate
Range("B2:B15").Select
Windows("test2.xlsx").Activate
Range("B2:B15").Select
Application.CutCopyMode = False
Selection.Copy
Windows("Test.xlsx").Activate
ActiveSheet.Paste
Application.CutCopyMode = False
End Sub
I think I understand what you seek but I do not believe anyone could give it to you.
What you are asking for appears to be a very simple VBA macro. If you do not know enough VBA to write that macro, I do not believe you would understand it if it was given to you.
You must learn the basics. Search the web for “Excel VBA tutorial”. There are many to choose from so try a few and complete the one that matches your learning style. I prefer books. I visited a good library, reviewed their Excel VBA Primers and borrowed those I preferred to try at home. Finally I bought the one I liked the best as a permanent reference book.
I do not know how quickly you will absorb the key concepts of programming in Excel VBA. Some find those concepts obvious while for others programming is an alien world where they will never be happy. I suggest you invest half-a-day in studying VBA. If you are finding VBA reasonably easy to understand then start thinking about your macro.
The code you present has been created by the Excel Macro Recorder. The macro recorder is a great way of identifying the VBA statements corresponding to a keyboard function but it is not a good way to learn VBA. It will not help much with your macro.
The tutorial should explain the difference between “workbook” and “worksheet”. A workbook is a file saved to disk. “test2.xlsx” and “Test.xlsx” are workbooks. A workbook will contain one or more worksheets. A worksheet contains row and columns. In your question you use “workbook” for both workbooks and worksheets.
You will need to design your macro. For a simple macro, the easiest approach is often to record how you would approach the task manually:
Open master workbook.
Open servant workbook.
Within servant workbook, look at worksheet containing Commodity column.
For each entry in this column:
    Search for matching entry in Commodity column of master workbook.
    If matching entry found
        Copy select values from servant row to master row
    If matching entry not found
        Record error or take corrective action.
Next
Save master workbook.
Assuming I have understood your question, the above list is the structure of your macro. Most lines in the list correspond to one, or a few, simple VBA statements that you should have learnt from your tutorial.
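The same structure can be sketched outside VBA to check the logic before writing the macro. The Python below is only an illustration: it assumes each row is a dictionary keyed by column name, and the "Price" column, the sample data, and the function name update_master are hypothetical.

```python
def update_master(master_rows, servant_rows, key="Commodity"):
    """Copy updated values from servant rows into matching master rows.

    Rows are dictionaries keyed by column name. Returns the commodities
    that were present in the servant data but missing from the master.
    """
    # Index the master rows by commodity so the sort order does not matter.
    master_by_key = {row[key]: row for row in master_rows}
    not_found = []
    for servant_row in servant_rows:
        master_row = master_by_key.get(servant_row[key])
        if master_row is None:
            not_found.append(servant_row[key])  # record the error
            continue
        master_row.update(servant_row)  # copy the servant values across
    return not_found


# Made-up sample data with a hypothetical "Price" column.
master = [
    {"Commodity": "Wheat", "Price": 100},
    {"Commodity": "Corn", "Price": 80},
    {"Commodity": "Rice", "Price": 120},
]
servant = [
    {"Commodity": "Rice", "Price": 125},
    {"Commodity": "Barley", "Price": 60},
]
missing = update_master(master, servant)
```

Looking each commodity up by name, rather than by position, is what removes the need for the two sheets to be sorted the same way. In VBA the lookup step would typically be a search of the master Commodity column for each servant entry.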
In your macro you have already opened the master and servant workbooks. A VBA macro in one workbook can open other workbooks. Once you have a little experience, you will probably find this the most convenient approach. However, I suggest you leave multi-workbook macros to stage 2. I suggest you create a development workbook containing a copy of the master worksheet and the servant worksheet. Get your macro working within that single workbook first, then update it for multiple workbooks. I usually find it best to have macros in a workbook not accessible to the regular users. This macro will open both the master and servant workbooks.
I hope this gives you a start. Most people do not find programming in Excel VBA too difficult once they have mastered the basics. However, trying to find snippets of code on the web before you have mastered those basics is at best difficult and slow and often impossible.

Copy Excel worksheets, macros, and graphs from one workbook to another, moving links to the new workbook

I have an Excel workbook with a number of features:
One main user-facing sheet
One summary sheet based on the user-facing sheet's data
A number of graphs based on the user-facing sheet's data (as in, the type of graphs with a separate tab for them, rather than objects within a worksheet - I'm not sure if they have a special name or special properties)
A series of 'background' worksheets that calculate the values for the user-facing sheet
A macro to allow the user to sort the user-sheet by any column they wish, which is referenced in the user-facing sheet's Worksheet_SelectionChange event
However, for distribution I'd like to cull the sheets for simplicity (and file size - the entire data query is included on one of the sheets). I still need to calculate the values for the user-facing sheet, but it's only done once per dataset, so that can quite happily be copied as formatting then values.
The trouble, however, is transferring the dependent sheet, graphs and macros across to a new workbook so that instead of referencing the old workbook, they reference the new versions of the sheet. Ideally I'd like to do this with VBA or something, but my Google searches thus far don't seem to have turned up much of relevance.
Does anyone know how to do this?
I'm not sure I entirely understand your question, but I think what you want to do is create a new version of your workbook with fewer worksheets in it?
To do that in VBA, this code snippet is a good place to start:
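Sub Macro1()
    ' Save the workbook under a new name so that links point at the new file
    ActiveWorkbook.SaveAs Filename:= _
        "[your path here]\Book2.xls", FileFormat:=xlNormal
    ' Select and delete a sheet that is not needed in the new workbook
    Sheets("Sheet2").Select
    ActiveWindow.SelectedSheets.Delete
End Sub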
Creating a copy of the entire workbook, and then deleting what you don't need will make sure that the links do not reference the old workbook.
Once you have created the new workbook, with links intact, it is then pretty easy to remove all the things (calculations, etc.) that you don't need.