I am doing a project that requires me to transfer data from one Notepad text file to another (saved in Excel's tab-delimited format).
I have successfully done that; the only thing left is to sort the data after transferring it.
For your information, I am transferring five columns from the first file to the second, and I store that information in five arrays.
How am I supposed to sort them after pasting?
I tried the VB.NET sort function, but that sorts only one array and the rest of the arrays don't follow.
I also tried lines.sort, but the result was not satisfactory. Is there any other way to sort the data the way we normally do manually in Excel?
Any help will be very much appreciated.
One solution would be to create an object with five values in it. Then you would create a list of those objects (that way the values are all linked).
Then you would just do:
OBJECT.Sort(Function(x, y) x.valueToSortBy.CompareTo(y.valueToSortBy))
Here OBJECT stands for your list and valueToSortBy for the field you want to sort on. This sorts the list in place by that value, and the other four values in each object move with it.
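As a minimal sketch of that idea, assuming the five arrays have the same length: the class name Record, its field names, and the sample values below are all made up for illustration, so substitute your own columns.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module SortRowsSketch
    ' Hypothetical row type: one object per row, so the five values stay together.
    Class Record
        Public Name As String
        Public Code As String
        Public Quantity As Integer
        Public Price As Decimal
        Public Note As String
    End Class

    Sub Main()
        ' Stand-ins for the five parallel arrays from the question (sample data).
        Dim names() As String = {"pear", "apple", "fig"}
        Dim codes() As String = {"P1", "A1", "F1"}
        Dim quantities() As Integer = {3, 10, 7}
        Dim prices() As Decimal = {1.2D, 0.5D, 2D}
        Dim notes() As String = {"", "ripe", ""}

        ' Zip the arrays into one list of linked rows.
        Dim rows As New List(Of Record)
        For i As Integer = 0 To names.Length - 1
            Dim r As New Record()
            r.Name = names(i)
            r.Code = codes(i)
            r.Quantity = quantities(i)
            r.Price = prices(i)
            r.Note = notes(i)
            rows.Add(r)
        Next

        ' Sort in place by one column; the other four values travel with it.
        rows.Sort(Function(x, y) String.Compare(x.Name, y.Name, StringComparison.OrdinalIgnoreCase))

        ' Write the rows back out in tab-delimited form.
        For Each r As Record In rows
            Console.WriteLine(String.Join(vbTab, New String() {r.Name, r.Code, r.Quantity.ToString(), r.Price.ToString(), r.Note}))
        Next
    End Sub
End Module
```

To sort by a numeric column instead, compare that field in the lambda, for example x.Quantity.CompareTo(y.Quantity).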
Related
I have a collection of excel spreadsheets that are formatted... less than ideally.
I'm testing out some solutions involving SQLBulkCopy and OleDB, but I'm a bit concerned about how to handle the format of this sheet.
I was considering writing a custom Insert statement, but would like to see if there may be some easier way to implement a heuristic.
Below is a sample of the data I will be parsing:
The highlighted columns are the ones I'll be loading into the two tables. One table will hold order #s, and the other table will hold all the lines below that order number.
Any suggestions on tackling this would be lovely. The Excel sheets are hand-entered, so some weird cases exist (one order number with multiple carriers, for example), which raises the question of whether I should treat the first row with the order number as a line in the database structure I designed.
I'm implementing this importer within VB.net, to my dismay, to avoid being looked at funny by my coworkers :).
One approach would be to save the worksheet to a text file (e.g., CSV) and then use AWK to split it at the empty row. Some examples are in this SO answer: Bash how to split file on empty line with awk
You could then import the CSV files directly into the database.
Amusingly, if I wrote anything in VB.NET I'd definitely get looked at funny by my coworkers.
So I'd use a library called EPPlus to read the Excel file and not have to worry about converting it. How you do the blank-line detection is an open question; checking that the Value of ten cells on the row is Nothing or empty would suffice. Then take the next row as your parent, and treat the subsequent rows as children until the next blank row.
Take a look at this answer for more info on how to detect blank rows in Excel. If you get stuck turning any of the C# into VB, shoot us a question. Online converters exist because the two languages are the same thing under the hood.
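The parent/child grouping described above can be sketched independently of how the rows are read. This assumes each worksheet row has already been loaded as an array of cell strings (by EPPlus, OleDB, or anything else); OrderGroup, IsBlankRow, GroupRows, and the sample rows are hypothetical names and data, not part of any library.

```vbnet
Imports System
Imports System.Collections.Generic

Module GroupOrdersSketch
    ' Hypothetical container: the first row after a blank row is the parent,
    ' and every following row (until the next blank row) is a child line.
    Class OrderGroup
        Public Header As String()
        Public Lines As New List(Of String())
    End Class

    ' A row counts as blank when every cell is Nothing, empty, or whitespace.
    Function IsBlankRow(ByVal cells As String()) As Boolean
        For Each c As String In cells
            If Not String.IsNullOrWhiteSpace(c) Then Return False
        Next
        Return True
    End Function

    Function GroupRows(ByVal rows As IEnumerable(Of String())) As List(Of OrderGroup)
        Dim groups As New List(Of OrderGroup)
        Dim current As OrderGroup = Nothing
        For Each cells As String() In rows
            If IsBlankRow(cells) Then
                current = Nothing              ' a blank row closes the current order
            ElseIf current Is Nothing Then
                current = New OrderGroup()     ' first non-blank row starts a new order
                current.Header = cells
                groups.Add(current)
            Else
                current.Lines.Add(cells)       ' later rows belong to that order
            End If
        Next
        Return groups
    End Function

    Sub Main()
        ' Made-up sample: two orders separated by a blank row.
        Dim rows As New List(Of String())
        rows.Add(New String() {"1001", "UPS"})
        rows.Add(New String() {"", "widget"})
        rows.Add(New String() {"", ""})
        rows.Add(New String() {"1002", "FedEx"})
        rows.Add(New String() {"", "gadget"})

        For Each g As OrderGroup In GroupRows(rows)
            Console.WriteLine(g.Header(0) & ": " & g.Lines.Count.ToString() & " line(s)")
        Next
    End Sub
End Module
```

Whether the header row should also be stored as a line (the multiple-carrier case from the question) is a design decision this sketch leaves open; it only keeps the header separate so either choice remains possible.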
I'm trying to run a duplicate check in which varying data is pulled from a website and compared to a master list, the master list being stored in Excel. The information from the website is read from a table that contains line breaks. These breaks are carried over into the collection the data is initially stored in. Some of the data from the website is eventually written to the master list in Excel. So when I read the master list back into Blue Prism to run a duplicate check, the rows that have line breaks are written into a collection as multiple rows (e.g., I should have only 7 rows in my collection but am getting 42). Since the rows are not EXACTLY the same between the two collections, the automation does not recognize the duplicates when it runs.
The easiest way to solve this would be to strip the line breaks from the collection rows as soon as the data is read. I've attempted to use a Calculation stage to do so, with no luck. I'm not sure if it is actually possible to do this, but I would appreciate any direction.
Record an Excel macro to do the data sorting/cleaning in Excel (possibly Text To Columns, etc.) and then run the macro as part of your Blue Prism process by using an action stage and the MS Excel VBO - Run Macro. Have the process create an Excel instance (and store the handle from that stage in a data item), then use Open Workbook (for whatever workbook you store your macro in) and then use the MS Excel VBO - Run Macro (pass the same handle created earlier and type in the name of the macro).
It sounds like the MS Excel VBO is grabbing the data from the Excel Worksheet wholesale.
That is to say, it's accessing your Worksheet table, copying the cell values BUT not the cell formatting data, and then dumping the values into a BP collection.
Since it did not bring along any of the original cell formatting data to reference when it populated the collection, it's just breaking up the values at carriage returns/line breaks. Thus, your collection is organized based on those breaks, and not on the original Worksheet cells.
So, with that said, on to a solution!
Solution 1
Brute force the organization of the incoming Excel cell data to the collection by looping over the Excel Worksheet cell-by-cell.
Run a loop, and in that loop have BP go into the Excel Worksheet and grab the first populated cell it comes across. Run a formatting/cleanup Calculation stage over the data. Dump the cell value into a single collection field.
Repeat.
This is...inelegant, expensive at best, and not at all recommended for any medium to large dataset. But it's definitely the best way to do string manipulation and value comparisons before the data hits your collection. Since it sounds like you're using a master template, you also know what the expected format of your data should be.
This method will enable you to use Trim(), Concat(), or Split() in a Calculation stage to better organize your incoming data before you dump it into a collection.
This is also basically what I think you're already trying to do, but cell-by-cell instead of Worksheet row-by-row or table-by-table.
Solution 2
Clean up the table data you grab from the website before you dump it into the Excel Worksheet.
This is basically Solution 1, but in reverse. Simply format/clean up your data before it hits your Excel Worksheet.
I'm not sure this is any better than Solution 1, but, you know, it's something...
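For the cleanup step itself, here is a rough sketch of stripping line breaks from every text value in a table before it is written out or compared. It assumes the collection is available to code as a System.Data.DataTable (which, as far as I know, is how Blue Prism code stages expose collections, but treat that as an assumption); StripLineBreaks and CleanTable are hypothetical helper names.

```vbnet
Imports System
Imports System.Data
Imports Microsoft.VisualBasic

Module CleanLineBreaksSketch
    ' Replace every carriage return / line feed in a value with a single space.
    Function StripLineBreaks(ByVal value As String) As String
        If value Is Nothing Then Return ""
        Return value.Replace(vbCrLf, " ").Replace(vbCr, " ").Replace(vbLf, " ").Trim()
    End Function

    ' Apply the cleanup to every text column of the table, in place.
    Sub CleanTable(ByVal table As DataTable)
        For Each row As DataRow In table.Rows
            For Each col As DataColumn In table.Columns
                If col.DataType Is GetType(String) AndAlso Not row.IsNull(col) Then
                    row(col) = StripLineBreaks(CStr(row(col)))
                End If
            Next
        Next
    End Sub

    Sub Main()
        ' Made-up sample: one cell containing an embedded line feed.
        Dim t As New DataTable()
        t.Columns.Add("Name", GetType(String))
        t.Rows.Add("Acme" & vbLf & "Corp")

        CleanTable(t)
        Console.WriteLine(CStr(t.Rows(0)("Name")))   ' prints "Acme Corp"
    End Sub
End Module
```

Running the same cleanup on both the website data and the master list data means the two collections are normalized the same way before the duplicate check.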
Solution 3
Format the cell data IN the MS Excel Worksheet itself.
Basically rearrange the cells and cell data in the Excel Worksheet into a more predictable format by using the Split, Trim, Merge, or other actions included in the MS Excel VBO. You can also do this using the Data - OLEDB utility object, but that requires some pretty solid understanding of SQL syntax.
This would look like this using the MS Excel VBO:
Grab the Excel Worksheet data wholesale and dump into a collection
Count the rows/fields of the collection
Is that number consistent with the desired/expected format of your data?
If not, have the bot go back into the Excel Worksheet and reformat the cells by removing any carriage returns/line breaks/whatever else
Repeat.
However, I'm always reluctant to reformat any original source, as it's then hard to figure out what went wrong and where once you've changed the original structure of your data. So it's best to always make a copy of the Worksheet before you manipulate it.
Unfortunately I don't have access to my BP environment at the moment, or I'd provide you with the exact object actions you'd need to do any of this, my bad. Once I do, I'll update this answer.
You're most probably going to think "what an idiot", but remember I've never done any type of coding before, so this is all new to me.
My problem is that I'm working on a HUGE Excel sheet with loads of data that is not needed. I need to reduce the data to a few columns: I only need columns "A, K, AN, AQ", and in column "AS", which holds certain values (yes, no, blank), I only want the yes and blank values. Like I said, I've never done any coding before, but I know that you can use a macro to do it, so please help: how do I go about this?
Before trying to get into macros, try to use functions with if/else statements. They are quite easy to handle. For example: if (yes), then put it into X. Later, you could select everything you need. Also, check how the dollar sign is used in cell references.
Use these links to see if it is something for you.
One quick and dirty way of getting this job done would be to:
Delete the columns you don't need.
Select all cells in the range you're interested in, click the Insert menu, and choose "Table". If your columns have titles, select the box for "My table has headers."
-This turns your data into a table so that Excel recognizes that each row is an entry (instead of thinking that the cells are unrelated).
Now you can use the filter icon in the column headers to select and display only the rows containing the values in column X that you're interested in.
Note that there are some limitations to what the table feature is good for, so, as always, whether this is a good solution for you depends on what you want to do with the data.
It's pretty simple really but I can't seem to find anything that relates to this without using a database.
Basically I want to take data out of a row of cells in the DataGridView that will have been entered by the user and then manipulate that data as I like. This is without using a database, simply by temporarily storing the values in VB.
For example, in Excel you can perform a calculation between two cells next to each other and apply this calculation to every other row. I imagine you can do this by using a counter; however, I am unsure of the basics first.
Back Story:
NEW PROJECT FROM MANAGEMENT: I have been given a soft project from my boss to evaluate one of our current ETL plans to look for room for improvement in the process, and I am looking for guidance.
MOTIVE: Excel is currently being used and crashes quite often during the process due to file size.
TASK: Every month an analyst receives a large CSV file from a survey vendor containing up to 750 columns (not all with unique names) and over 15,000 rows. The job is simply to transform this large CSV file into an Excel file with seven worksheets, broken up based on the column headings in the CSV. Details of how it is broken up are below.
My question: is transforming one large CSV into an edited Excel file with multiple worksheets any easier or quicker using VB.NET and VS2010 (or VBA, for that matter), or would using Excel be the simplest way to continue this process? I am an expert Excel user, but I am still very much a beginner to intermediate at coding in VBA, VB.NET, or any other language.
Detailed Question:
I am open to using free or open source software, but I am most familiar with VB.NET, Excel, and Excel VBA. I have played around a bit with coding a simple Windows Forms application to load the CSV into a DataTable using TextFieldParser code similar to that found here. I have thought of loading it into an array, or even a 2D array, to more easily edit the column headings and find the duplicate column headings. The DataTable option still leaves me with more questions than answers, because I need unique column headings and I'm not sure I should bother with a DataTable if I'm going to write an Excel file right away. I tried CSVReader from CodeProject, but it won't work on files with duplicate header names. I feel as though I have writer's block, as I am not sure which direction I should take to handle such a process. Any input you can provide will be much appreciated, and I apologize if this question does not have a single, clear best answer. Thanks.
Current Analyst tasks using excel
The current analytical plan has the analyst open the CSV in Excel, insert a row above row 1, and use a VLOOKUP to replace the 'New' column names with the 'Old' column names, based on a simple two-column lookup table on a separate worksheet. For example:
New becomes Old
"org-name" becomes "org_name" or
"item_1_Vendor" becomes "item_1" or
"date-created_Survey" becomes "date_created"
etc...checking all sent "New" columns against the list of all possible 750 columns.
Then they paste values of the first row and then delete the 2nd row which contained the New headings we want to change.
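The header replacement above amounts to a dictionary lookup, which could be sketched in VB.NET as follows. The three mappings come from the example in the question; RenameHeaders, FindDuplicates, and the sample header row are hypothetical, and in practice the lookup would be loaded from the full two-column table rather than hard-coded.

```vbnet
Imports System
Imports System.Collections.Generic

Module RenameHeadersSketch
    ' Replace each "New" heading with its "Old" name when the lookup has one.
    Function RenameHeaders(ByVal headers As String(), ByVal lookup As Dictionary(Of String, String)) As String()
        Dim result(headers.Length - 1) As String
        For i As Integer = 0 To headers.Length - 1
            Dim oldName As String = Nothing
            If lookup.TryGetValue(headers(i), oldName) Then
                result(i) = oldName
            Else
                result(i) = headers(i)   ' no mapping: keep the heading as sent
            End If
        Next
        Return result
    End Function

    ' List headings that occur more than once (each duplicate reported once).
    Function FindDuplicates(ByVal headers As String()) As List(Of String)
        Dim seen As New HashSet(Of String)
        Dim dupes As New List(Of String)
        For Each h As String In headers
            If Not seen.Add(h) AndAlso Not dupes.Contains(h) Then dupes.Add(h)
        Next
        Return dupes
    End Function

    Sub Main()
        Dim lookup As New Dictionary(Of String, String)
        lookup.Add("org-name", "org_name")
        lookup.Add("item_1_Vendor", "item_1")
        lookup.Add("date-created_Survey", "date_created")

        ' Made-up header row containing a duplicate name.
        Dim headers() As String = {"org-name", "item_1_Vendor", "sid", "sid"}
        Dim renamed() As String = RenameHeaders(headers, lookup)

        Console.WriteLine(String.Join(", ", renamed))
        Console.WriteLine("Duplicates: " & String.Join(", ", FindDuplicates(renamed).ToArray()))
    End Sub
End Module
```

Working on the header row as a plain array sidesteps the unique-column-name requirement of a DataTable until the duplicates have been dealt with.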
Then the analyst has to fix the primary key on the file which is called "sid".
The Survey ID field (sid) should have a number for each row of the data file. Sometimes the sid shows up under the sid_HCAHPS or the sid_CGCAHPS fields instead.
The analyst would insert a column next to the "sid" field and put a formula in it like this, for example:
=IF(BE2<>"",BE2,IF(RD2<>"",RD2,IF(UH2<>"",UH2,"")))
Actual cell references would change but in the example excel formula,
"sid"=Range("BE2")
"sid_HCAHPS"=Range("RD2")
"sid_CGCAHPS"=Range("UH2")
Once the newly created primary key column is made and filled without blanks, we can delete the original "sid" column.
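The nested IF formula is a "first non-blank value" rule, which might look like this in VB.NET. FirstNonBlank is a hypothetical helper name and the ID values are made up; the argument order mirrors the formula (sid, then sid_HCAHPS, then sid_CGCAHPS).

```vbnet
Imports System

Module SidSketch
    ' Return the first candidate that is not blank, or "" if all are blank,
    ' mirroring =IF(BE2<>"",BE2,IF(RD2<>"",RD2,IF(UH2<>"",UH2,""))).
    Function FirstNonBlank(ByVal ParamArray candidates As String()) As String
        For Each value As String In candidates
            If Not String.IsNullOrEmpty(value) Then Return value
        Next
        Return ""
    End Function

    Sub Main()
        ' Arguments in order: sid, sid_HCAHPS, sid_CGCAHPS (sample values).
        Console.WriteLine(FirstNonBlank("", "20451", ""))        ' prints 20451
        Console.WriteLine(FirstNonBlank("10007", "20451", ""))   ' prints 10007
    End Sub
End Module
```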
The next step is to check the columns because there may be a redundant HCAHPS section of columns (due to a second survey being sent and then returned- coded as Wave 2), delete second set of columns "sid_HCAHPS" through "language"
Next is the largest alteration because we have setup a system where we send this information to our database admins in the form of a seven worksheet excel file to be loaded by an MS Access Query that creates a table from each sheet that gets loaded into our proprietary business intelligence software. All Done!!
Is your question, "Can VB.NET automate our current analyst tasks?" If so, then yes.
You could use the StreamReader class to get data from your CSV
(http://msdn.microsoft.com/en-us/library/system.io.streamreader.aspx)
Then store it either in an array, as you mentioned, or use the List class
(http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx)
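As a rough sketch of the reading step: this version swaps in TextFieldParser (which the question already mentions) instead of a bare StreamReader, because it copes with quoted fields that contain commas; StreamReader.ReadLine plus Split would also do for simple files. ReadRows and the sample text are hypothetical, and for a real file you would pass New StreamReader(path) instead of the StringReader.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module ReadCsvSketch
    ' Read every record of a comma-delimited source into a list of string arrays.
    Function ReadRows(ByVal reader As TextReader) As List(Of String())
        Dim rows As New List(Of String())
        Using parser As New TextFieldParser(reader)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            While Not parser.EndOfData
                rows.Add(parser.ReadFields())
            End While
        End Using
        Return rows
    End Function

    Sub Main()
        ' Made-up two-line sample; the second field of row 2 contains a comma.
        Dim sample As String = "sid,org-name" & Environment.NewLine & "1,""Acme, Inc."""
        Dim rows As List(Of String()) = ReadRows(New StringReader(sample))

        Console.WriteLine(rows.Count)    ' prints 2
        Console.WriteLine(rows(1)(1))    ' prints Acme, Inc.
    End Sub
End Module
```

Because each row is a plain string array, duplicate header names in the first row are not a problem at this stage.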
Once you've got all your data stored, you'll need to automate Excel. This is quite straightforward, but here's a link to get you started with that as well: http://support.microsoft.com/kb/301982/en-gb
With the List class you can create a list of custom objects using either classes or structures, e.g.:
We define a structure:
Structure rowOfData
Public intPrimaryKey as Integer
Public strIceCreamName as String
Public decPrice as Decimal
End Structure
We can then create a rowOfData and set its fields:
Dim iceCream1 as rowOfData
iceCream1.intPrimaryKey = 1
iceCream1.strIceCreamName = "Mr Whippy"
iceCream1.decPrice = 0.99D 'the D suffix makes the literal a Decimal, which Option Strict On requires
We create a list with:
Dim listOfIceCreams as New List(of rowOfData)
And add to it like this:
listOfIceCreams.Add(iceCream1)
listOfIceCreams.Add(iceCream2)
etc.
And access the members of the list like this:
listOfIceCreams(0).decPrice 'gives us the price of the ice Cream that was added to the list first.
Lists also have a lot of other useful methods that arrays don't. You could have a look through that MSDN List class link to see if anything jumps out at you that you might need.