Read Excel rows, mark rows read - vba

I need to use VBA to read rows out of Excel into another application, but if the process dies in the middle, I need to know which rows were read.
Is the best way to put something in a column, per row, that says that row was done? Then save it after every row is read?
Doesn't seem like a great way.
Any help would be great. Thanks everybody.

I usually write the row to another sheet, then delete it from the original. That way your original is always what's unwritten, but you still have the written data. If you can't muck with the original, start by copying everything to a new sheet, then clearing contents as you process.

Read the entire range into an array (matrix), and keep track of the position (index), saving the position value to a file or database.

Not knowing what application you are writing to, it might be best to check that the data does not exist in the destination application for each and every write. Make an index or some other autonumber (to use a database term) in your destination app. This way you don't have to save the state data in your spreadsheet and save it every time, which will slow down your VBA big time.

Related

Formatting Data in excel sheet with blue prism

I'm trying to run a duplicate check In which varying data is pulled from a website and compared to a master list, the master list being stored in Excel. The information from the website is read from a table in which has line breaks. These breaks are translated over to the data collection they are initially stored in. Some of the data from the website us eventually written to the master list in Excel. So when I read the master list back into Blue Prism to run a duplicate check, the rows that have line breaks are written into a collection as multiple rows (ex. I should have on 7 rows in my collections but am getting 42). Since the rows are not EXACTLY the same between the 2 collections, when it runs the automation does not recognize the duplicates.
The easiest way to solve this would be if I could make the collection rows have no line breaks as soon as the data is read. I've attempted to use the calculation stage to do so with no luck. I'm not sure if it is actually possible to do this, but would appreciate any direction.
Record an Excel macro to do the data sorting/cleaning in Excel (possibly Text To Columns, etc..) and then include the running of the macro as part of your Blue Prism process by using an action stage and the MS Excel VBO - Run Macro. Get the process to create an Excel instance (and create a handle data item from that stage), then use Open Workbook (whatever workbook you store your Macro in) and then use the MS Excel VBO - Run Macro (use the same handle created earlier and type in the name of the "macro").
It sounds like what is happening is that the MS Excel VBO is grabbing the data from the Excel Worksheet wholesale.
This is to say that it's accessing your Worksheet table, copying the cell values BUT not the cell formatting data, and then dumping the values into a BP collection.
Since it did not bring along any of the original cell formatting data to reference when it went to populate the collection it's just breaking up the values based on crturn/line breaks. Thus, your collection is organized based on that, and not on the original Worksheet cell.
So, with that said, on to a solution!
Solution 1
Brute force the organization of the incoming Excel cell data to the collection by looping over the Excel Worksheet cell-by-cell.
Run a loop, and in that loop have BP go into the Excel Worksheet and grab the first populated cell it comes across. Run a formatting/cleanup Calculation stage over the data. Dump the cell value into a single collection field.
Repeat.
This is...inelegant, expensive at best, and not at all recommended for any medium to large dataset. But it's definitely the best way to do string manipulation and value comparisons before it hits your collection. Since it sounds like your using a Master template then you as-well know what the expected format of your data should be.
This method will enable you implement Trim(), Concat(), or Split() in a Calculation stage to better organize your incoming data before you dump it into a collection.
This is also basically what I think you're already trying to do, but cell-by-cell instead of Worksheet row-by-row or table-by-table.
Solution 2
Clean up the table data you grab from the website before you dump it into the Excel Worksheet.
This is basically Solution 1, but in reverse. Simply format/cleanup your data before it hits you Excel Worksheet.
I'm not sure this is any better than Solution 1, but, you know, it's something...
Solution 3
Format the cell data IN the MS Excel Worksheet itself.
Basically rearrange the cells and cell data in the Excel Worksheet into a more predictable format by using the Split, Trim, Merge, or other actions included in the MS Excel VBO. You can also do this using the Data - OLEDB utility object, but that requires some pretty solid understanding of SQL syntax.
This would look like this using the MS Excel VBO:
Grab the Excel Worksheet data wholesale and dump into a collection
Count the rows/fields of the collection
Is that number consistent with the desired/expected format of your data?
If not, have the bot go back into the Excel Worksheet and reformat the cells by removing any carriage returns/line breaks/whatever else
Repeat.
However, I'm always reluctant to reformat any original source, as it's then hard to figure out what wrong and where it went wrong when you've changed the original structure of your data. So it's best to always make a copy of the Worksheet before you make any manipulation.
Unfortunately I don't have access to my BP environment at the moment or I'd provide you with the act object actions you'd need to do any of this, my bad. Once I do I'll update this answer.

How do I prevent unlocked cells from being selected/editable? (VBA)

I have a very specific question that I am looking for help in answering. I have been researching for hours and I feel like I am not able to find what I am looking for. Below is a quick overview of the criteria that my document must follow:
I am using Excel 2013
I will just be using rows for data input (instead of an excel object
table).
The very top/first row will act as my "column header".
This top row will have AutoFilter enabled.
THE DOCUMENT MUST BE PROTECTED (a must-have)!
I will be using VBA code
Now, the final issue I am having with finishing this document are the last two criteria points that I must have:
The first/top row (column headers) must NOT BE EDITABLE.
Each column must be able to SORT AND FILTER.
Now, in a perfect world, I would just "Lock Cells" for the entire first row that acts as my column headers and when I protect the worksheet I would make sure to check the "Sort" and "Use AutoFilter" boxes.
However, this option does not work because there seems to be an issue when I try to sort the data. If I just filter the data there is no problem, but when I try to sort a column in ascending/descending order I will get an error informing me that I can't sort locked cells while in Protected mode. This is because when excel uses the Sort function, it counts the header as part of the data that is being sorted (I found this out through my research) even though I really just want the data below it to be sorted.
I have been trying to brainstorm on how to get past this issue as well as researching different methods, and I am having trouble coming to a final conclusion. However I have narrowed it down to 2 possible solutions:
I want to be able to keep the cells in the first row officially unlocked to allow the AutoFilter's sort command to work as intended, but make it "behave" like the cells are locked when a user tries to make changes to it (AKA, make the entire row un-editable or un-selectable).
The other option would be to keep the first row locked but somehow have an event in VBA that can tell when a user tries to "Sort" the column, which will then temporarily unprotect the worksheet, follow through with the intended sort command, then protect the worksheet again (apparently though, upon my research there is no such event that can trigger off the AutoFilter's sort command alone).
These 2 solutions are the most logical I can think of based off my research, but if someone out there is an Excel genius and knows another way I am open to suggestions.
Thanks in advance for your help/suggestions,
Travo
Consider using two header rows. The top row would be protected and the second row would facilitate filtering and sorting of the data in rows 3 and below.

In SQL how do I update a table with a similar table?

In my current Database I have a table whose data is manually entered or comes in an excel sheet every week. Before we had the "manual entry option", the table would be dropped and replaced by the excel version.
Now because there is data that only exists in the original table this can not be done.
I'm trying to find a way to update the original table with changes and additions from the (excel) table while preserving all rows not in the new sheet.
I've been attempting to simply use an insert query and an update query /but/ I can't find a way to detect changes in a record.
Any suggestions? I can provide the current sql if you'd find that helpful.
Based on what I have read so far, I think I can offer some suggestions:
It appears you have control of the MS Access. I would suggest adding a field to your data table called "source". Modify your form in the access database to store something like "m" for manual entry in the source field. When you import the excel, store an "e" for excel in the field.
You would need to do a one time scrub of the data to mark existing records as manual entries or excel entries. There are a couple of ways you can do it through automation/queries that I can explain in detail if you want.
Once past these steps, your excel process is fairly simple. You can delete all records with source = "e" and then do a full excel import. Manual records would remain unchanged.
This concept will allow you to add new sources and codes and allow you to handle each differently if needed. You just need to spend some time cleaning up your old data. I think you will find it worth it in the end.
Good Luck.

Unable to move / delete rows in shared workbook - Not enough resources

this one's a bit of a painful one so thank you for your help and patience with me.
We have an Excel spreadsheet that we use as a master file for our website products. As such there are quite a few sheets and quite a few products on each running along side some macros to provide some extra functionality (turning entered data into HTML for product page, etc).
My issue is that one of our most used spreadsheets has become a trouble in that it has some phantom formatting all the way down to the millionth-and-something row and all the way across, causing the last cell to be the very last cell possible.
The issue that has finally popped up as a result is that we can no longer move rows in, out or around the sheet (a required functionality) as it results in an 'out of resources error'.
I've tried:
Highlight all rows below used range to right-click> delete - Results in runtime error (from macro)
Highlighting large chunks of rows and using Clear All - Resulted in the 38MB file bloating to 380MB
Deleting a chunk of rows at a time - Maxed out at 1,000 before it caused Excel to crash
Moving to new spreadsheet - Broke all our macros (which I did not write and am not proficient enough to fix on a new sheet)
Disabling macros and trying the above options, only marginally more efficient but still out of resources
I'm at my wits end on this one and, while we can continue with most day-to-day functions, we will soon be completely unable to use this particular sheet as we need it at all.
I'm wondering if there might be a way to run a VBA script to remove these rows, potentially one by one? I've tried running a short script that went something like rows[960,1000000].Delete (forgive my terrible VBA markup), but this also resulted in not enough resources errors.
I'm wondering if there's anything like:
row = 960;
while(row<=1048576){row.Delete};
Continuing, the runtime error debug points me to the below if statement within the macro:
If Target.Count > 1 Then Exit Sub
Where Target is the variable passed to the sub.
Which strikes me as very odd because my (limited) understanding of VBA and IF's in general simply recognizes that 'if my selection is larger than 1 (row?), do not run this code..
Thanks again in advance.
Use this method only if you don't have any links into or out of the sheet that will get broken. Also might have Sql connections that might get broken. Might need to disable macros. There are many possible problems with this approach. Use at your own risk.
Note the exact "Name" and "(Name)" of the sheet; Look in the VBA code window at the properties for the sheet. "Name" is the name displayed on the worksheet tab. "(Name)" is the code name visible only in the properties window.
Make a list of range names on the sheet.
Copy the data to a new sheet.
Copy any macros to the new sheet.
Delete the old sheet.
Rename the "Name" and "(Name)" of the new sheet the same as the old one.
Recreate range names.
A better method if you don't have too many formats:
Disable macros and set calculation to manual. This avoids recalculating while doing your delete operation.
Select entire sheet and clear formats.
Delete all rows below your data.
Redo your formatting. Select entire column (not just used area) to apply format if applicable.
It is important to remove formatting on the entire sheet from A1 to the end. Otherwise you'll get the bloat you mentioned. Just that step may solve your problem. If not then proceed with removing all the rows below the data. This should not cause file size bloat.

Open OFFICE CALC Macro's: how to get the old data that was in a cell after it had been changed?

I did a macro and used ChartDataChangeEvent to get a value of a cell after it had been changed, but i need the old value too for going to a specific column in another sheet and changing there the value according to the cell that had been changed.
is there a way to get the old data(before modification)?
(i don't want to copy all the sheet, because my tables are very big and it will take a long time).
Thanks
Just thinking out loud here, would this be a viable path: create a script based "on data change" (executes every time when data is modified by user). Then copy the "new" data from the active cell, then perform an "undo", to get the old data back, copy the old data, then put the new data back where it was. Don't know if you would programmatically have access to the internal undo stack, because that would be easier.