I have an SSIS package that exports data from 3 sources to Excel. Before being written to the Excel file, the data goes through sorting, aggregation, and finally a Conditional Split. I placed data viewers at every step and can see that the records are there, but when the Excel file is exported I don't see any records.
Before the export happens, the previous day's Excel file is deleted and a new one is created from a template Excel file. At the end of the data flow task, the Excel file contains the columns from the template but no rows.
What is causing this issue? I have data viewers in place and I see the data as it flows through, but I don't see data after the Conditional Split or before the Conditional Split happens.
Please see the picture. After the Conditional Split there are 3 Excel destinations, one of which you can see.
I'm trying to run a duplicate check in which varying data is pulled from a website and compared to a master list stored in Excel. The information from the website is read from a table that has line breaks. These breaks carry over to the collection the data is initially stored in. Some of the data from the website is eventually written to the master list in Excel. So when I read the master list back into Blue Prism to run a duplicate check, the rows that have line breaks are written into a collection as multiple rows (e.g. I should have only 7 rows in my collection but am getting 42). Since the rows are not EXACTLY the same between the 2 collections, the automation does not recognize the duplicates when it runs.
The easiest way to solve this would be if I could make the collection rows have no line breaks as soon as the data is read. I've attempted to use the calculation stage to do so with no luck. I'm not sure if it is actually possible to do this, but would appreciate any direction.
Record an Excel macro to do the data sorting/cleaning in Excel (possibly Text to Columns, etc.), then run the macro as part of your Blue Prism process using an action stage and the MS Excel VBO - Run Macro. Have the process create an Excel instance (and create a handle data item from that stage), then use Open Workbook (for whichever workbook you store your macro in), and then use MS Excel VBO - Run Macro (with the same handle created earlier, typing in the name of the macro).
It sounds like what is happening is that the MS Excel VBO is grabbing the data from the Excel Worksheet wholesale.
This is to say that it's accessing your Worksheet table, copying the cell values BUT not the cell formatting data, and then dumping the values into a BP collection.
Since it did not bring along any of the original cell formatting data to reference when it went to populate the collection, it's just breaking up the values based on carriage returns/line breaks. Thus, your collection is organized based on that, and not on the original Worksheet cell.
So, with that said, on to a solution!
Solution 1
Brute force the organization of the incoming Excel cell data to the collection by looping over the Excel Worksheet cell-by-cell.
Run a loop, and in that loop have BP go into the Excel Worksheet and grab the first populated cell it comes across. Run a formatting/cleanup Calculation stage over the data. Dump the cell value into a single collection field.
Repeat.
This is...inelegant, expensive at best, and not at all recommended for any medium to large dataset. But it's definitely the best way to do string manipulation and value comparisons before the data hits your collection. Since it sounds like you're using a Master template, you also know what the expected format of your data should be.
This method will enable you to implement Trim(), Concat(), or Split() in a Calculation stage to better organize your incoming data before you dump it into a collection.
This is also basically what I think you're already trying to do, but cell-by-cell instead of Worksheet row-by-row or table-by-table.
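The cleanup step in Solution 1 boils down to replacing the line breaks inside each cell value before anything is compared. As a rough illustration of that logic only (written in Python rather than as Blue Prism stages, and with hypothetical names `clean_cell` and `find_duplicates` that do not come from any Blue Prism object), a minimal sketch might look like this:

```python
import re

def clean_cell(value: str) -> str:
    """Collapse carriage returns/line feeds (and the spaces around them) into one space."""
    # \r\n, \r and \n are all treated as line breaks here
    return re.sub(r"\s*[\r\n]+\s*", " ", value).strip()

def find_duplicates(website_rows, master_rows):
    """Return the website rows whose cleaned form already exists in the master list."""
    master_keys = {tuple(clean_cell(c) for c in row) for row in master_rows}
    return [row for row in website_rows
            if tuple(clean_cell(c) for c in row) in master_keys]
```

Because both sides are cleaned the same way before comparison, a website row containing "a", a line break, then "b" would match a master row containing "a b".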
Solution 2
Clean up the table data you grab from the website before you dump it into the Excel Worksheet.
This is basically Solution 1, but in reverse. Simply format/clean up your data before it hits your Excel Worksheet.
I'm not sure this is any better than Solution 1, but, you know, it's something...
Solution 3
Format the cell data IN the MS Excel Worksheet itself.
Basically rearrange the cells and cell data in the Excel Worksheet into a more predictable format by using the Split, Trim, Merge, or other actions included in the MS Excel VBO. You can also do this using the Data - OLEDB utility object, but that requires some pretty solid understanding of SQL syntax.
This would look like this using the MS Excel VBO:
Grab the Excel Worksheet data wholesale and dump into a collection
Count the rows/fields of the collection
Is that number consistent with the desired/expected format of your data?
If not, have the bot go back into the Excel Worksheet and reformat the cells by removing any carriage returns/line breaks/whatever else
Repeat.
However, I'm always reluctant to reformat any original source, as it's then hard to figure out what went wrong and where once you've changed the original structure of your data. So it's best to always make a copy of the Worksheet before you manipulate it.
Unfortunately I don't have access to my BP environment at the moment or I'd provide you with the exact object actions you'd need to do any of this, my bad. Once I do I'll update this answer.
Current Situation: I am consolidating a data package that receives multiple inputs from programs that roll into mine. Essentially, each program provides me with its set of data and I am responsible for accurately consolidating that data into one Excel workbook in a certain format. I have pushed out a template so that all the data comes back in the same format, but now I am having trouble consolidating quickly by any means other than copy and paste.
Solutions I've looked into: I've looked into using connections to see if I can pull in the data I need through a connected file; however, it imports the table in a different format and does not transfer correctly into the consolidation template that I have set up. I tried external source referencing to pull in data that way, but again the formatting is off. I know there are ways to just pull in the whole workbook, but I need data from specific points in these program workbooks that I can slot into my consolidated workbook. In most cases copying and pasting would work, but I have hundreds of programs that flow into this data sheet, so any help would be appreciated!
How about setting the destination range to the values of the source range?
e.g.:
rngTarget.Resize(rngSource.Rows.Count, rngSource.Columns.Count).Value = rngSource.Value
I am attempting to upload an Excel document into Access. I have used VBA to unhide all columns and rows and then delete columns and rows that are not being used. All of the worksheets upload into Access properly except one. This particular worksheet attempts to upload a field and label it Field 12. I am unable to find a way to delete this field. Any help?
It is probably the first column after your data...
Try either in VBA or in Excel deleting the columns to the right of your data (not just contents but an actual delete). I've found this typically happens when the columns to the right of your data contained data at one point and Access / Excel sees those as still containing data. Then try your import again.
Alternatively, you could upload into a new Access staging table before pulling your desired known columns into the final table through an INSERT query. Then you can delete the staging table if you like or delete it before the next import. In this way, each import can have its own "added columns".
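The staging-table approach can be sketched roughly as follows. This is only an illustration: it uses Python's built-in `sqlite3` module in place of Access, and the table names (`staging`, `final`) and column names (`id`, `name`, `Field12`) are hypothetical placeholders, not anything from the original workbook.

```python
import sqlite3

def import_via_staging(conn, rows):
    """Load raw rows into a staging table, then copy only the known columns."""
    cur = conn.cursor()
    # The staging table accepts whatever the import produced, stray column included
    cur.execute("CREATE TABLE staging (id INTEGER, name TEXT, Field12 TEXT)")
    cur.executemany("INSERT INTO staging VALUES (?, ?, ?)", rows)
    # The final table only has the columns we actually want
    cur.execute("CREATE TABLE IF NOT EXISTS final (id INTEGER, name TEXT)")
    cur.execute("INSERT INTO final (id, name) SELECT id, name FROM staging")
    # Drop the staging table so the next import starts clean
    cur.execute("DROP TABLE staging")
    conn.commit()
```

The stray column never reaches the final table, so it does not matter whether a given import produces it or not.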
I have data in a spreadsheet which I have to upload to SQL. The problem is that this data is quite crude. I need to rearrange the sheets in the Excel file in terms of their relation to each other. The first sheet has master data; a column of this sheet is to be linked to data in the other sheet. All I have is a sheet in which data is embedded. The relation between data is displayed using an expander button. Please tell me how I can rearrange this data fast? I think this can be done by running SQL queries or an SSIS package but I'm not sure.
You can use an SSIS package with multiple Source components (inside a Data Flow) to access each sheet.
Once you can see both sources, sort each of them on the key columns using a Sort component (Merge Join requires sorted inputs) and then use a Merge Join to combine the two sets of data.
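For anyone unfamiliar with what a sort followed by a merge join does conceptually, here is a minimal sketch of the idea in Python. It is not SSIS code; the function name `merge_join` and the sample column names are assumptions made for illustration.

```python
def merge_join(left, right, key):
    """Inner-join two lists of dicts on `key` using a sort-merge join."""
    # Both inputs must be sorted on the join key first
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Pair this left row with every right row sharing the key
            k = j
            while k < len(right) and right[k][key] == lk:
                out.append({**left[i], **right[k]})
                k += 1
            i += 1
    return out
```

For example, joining a master list `[{"id": 1, "name": "A"}]` with detail rows `[{"id": 1, "qty": 5}, {"id": 1, "qty": 7}]` on `"id"` yields one combined row per detail row.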
We have a relatively simple Reporting Services report that our users commonly export to Excel. I've noticed that the files produced by the Excel export seem unusually large. If I open one of these files and just click save, without making any changes, the file size reduces to about half of its previous size. Has anyone else run into this and is there a known workaround?
You've mentioned that the report is relatively simple, but this is important to check. The export to Excel will go to extraordinary lengths to try and maintain how your report looks.
If you have lots of different borders or colours (particularly if different formatting is determined by the data in your report) this will bloat the file.
Also check if many columns with very small and unusual sizes are created in the exported worksheet. The export does this to try and match alignment in Excel with the original report.
Try recreating your report as a basic table with no formatting or headers/footers and see if you can reproduce the problem. If Excel's behaviour is acceptable then add each piece of formatting back until it goes awry. Please let us know what you find.
I don't have an immediate solution, but a common problem in Excel is files bloating because one/some/all of the worksheets have saved all 64K rows instead of the ones being used. The fix in Excel is to select all the lower rows not being used, and delete them, then save the spreadsheet, close and reopen. Therefore, I'd pursue the angle of extra rows being saved in the export, and see if there is a way to keep this from happening.
What tool are you using when exporting to Excel?
I have also managed to reduce # of rows in my Excel worksheet by copying it to another worksheet, then deleting the original sheet.
You could also try copying only the data in your worksheet, and paste it into a new Excel Workbook (file).
I had the issue where the exported Excel files took an extremely long time to open and would stop responding every time you clicked on a cell.
Also, extra and merged columns would appear in the exported Excel files.
The fix was to make sure my header text boxes lined up with the beginning and end of columns in the data table just below them. Once both were aligned, there were no more extra columns in the exported Excel spreadsheets and the performance was back to normal.
Here's the reference that helped me understand the issue:
http://www.codegur.press/12747988/issue-report-export-to-excel-in-rdlc-report
Maybe I am answering your question very late, but here's the solution for exporting to CSV.
You need to give the field the name that you want to see as its column header (not the column name) in the designer.
By default all the text headers are exported as separate columns along with the table columns, so make sure that you set the Design name in the properties to the name you want to see.
The other important thing to note is the DataElementOutput option, which is set to Auto, meaning the element will be exported. You can change that if you don't want it to be exported.
Last but not least: after you export, the data looks messed up. You need to select the whole first column and go to the Data tab -> Text to Columns -> use comma as the delimiter and click Finish. That should solve your issue.
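If you would rather split the exported text programmatically than through the Excel dialog, the same comma-delimited split can be sketched with Python's standard `csv` module. The function name `split_csv_text` is a hypothetical helper, and this assumes the export is ordinary comma-separated text with optional double-quoted fields.

```python
import csv
import io

def split_csv_text(text: str):
    """Split comma-delimited text into rows of fields, honouring quoted commas."""
    return [row for row in csv.reader(io.StringIO(text))]
```

Unlike a naive split on every comma, this keeps a quoted value such as "Smith, J" together as one field.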