I have an Excel spreadsheet with multiple columns under each header, and I would like to collapse them into one column. For some reason the user spread what was supposed to go into one column across five.
Given this:
+------------+-----------+-----------+-----------+-----------+--+
| | | DOB | | | |
+------------+-----------+-----------+-----------+-----------+--+
| | | 1/7/1980 | | | |
| | | | | 1/30/1947 | |
| | | | 3/12/1948 | | |
| | | 1/26/1941 | | | |
| | 6/26/1951 | | | | |
| 12/29/1974 | | | | | |
+------------+-----------+-----------+-----------+-----------+--+
I want this:
DOB
1/7/1980
1/30/1947
3/12/1948
1/26/1941
6/26/1951
12/29/1974
I tried this, but it creates turn of century dates for each of the blank columns.
TEXT(A1,"m/d/yyyy")&TEXT(B1,"m/d/yyyy") . . .
How can I avoid that? Or is there a better way?
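A blank cell counts as 0, and TEXT formats 0 as 1/0/1900, which is where the turn-of-the-century dates come from. Because Excel stores dates as numbers, and assuming the other 4 cells in the row are truly blank, you could just sum the row (format the result cell as a date if it shows a plain number):
=SUM(A1:E1)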
Found this pretty cool custom function that combines an entire range, concatenating the non-empty cells with a user-defined separator:
=ConcatenateRange(A1:E1,"")
Using the custom function:
Function ConcatenateRange(ByVal cell_range As Range, _
        Optional ByVal seperator As String) As String
    Dim newString As String
    Dim cellArray As Variant
    Dim i As Long, j As Long
    'A single cell returns a scalar, not an array, so handle it separately
    If cell_range.Cells.Count = 1 Then
        ConcatenateRange = CStr(cell_range.Value)
        Exit Function
    End If
    cellArray = cell_range.Value
    For i = 1 To UBound(cellArray, 1)
        For j = 1 To UBound(cellArray, 2)
            'Skip empty cells
            If Len(cellArray(i, j)) <> 0 Then
                newString = newString & (seperator & cellArray(i, j))
            End If
        Next
    Next
    'Drop the leading separator
    If Len(newString) <> 0 Then
        newString = Right$(newString, (Len(newString) - Len(seperator)))
    End If
    ConcatenateRange = newString
End Function
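If it helps to see the idea outside of Excel, here is a minimal sketch in Python (not VBA) of what both answers rely on: each row has at most one filled cell, so collapsing the row just means picking that cell. The function name collapse_row and the sample rows are made up for illustration.

```python
def collapse_row(cells):
    """Return the one non-blank value in a row, or "" if every cell is blank."""
    filled = [c for c in cells if c not in ("", None)]
    if len(filled) > 1:
        # The approach only makes sense with at most one value per row.
        raise ValueError("more than one value in row: %r" % (filled,))
    return filled[0] if filled else ""


# Sample rows mirroring the table in the question (five columns per row).
rows = [
    ["", "", "1/7/1980", "", ""],
    ["", "", "", "", "1/30/1947"],
    ["", "", "", "3/12/1948", ""],
    ["12/29/1974", "", "", "", ""],
]
dob = [collapse_row(r) for r in rows]
```

Raising an error on a second value is a design choice of this sketch; SUM would silently add two dates together in that case.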
I know a little about vba, and I would like to achieve this using vba.
I am transferring data from sheet to another sheet with some special case.
Given this situation:
In another sheet I have these ranges
A4:B10
D2:E10
G2:H10
My data is something like this
AXX | Contact no.
AXX | Address
AXX | Name
AXX | Summary
BXX | Address
BXX | Name
BXX | Contact no.
BXX | Details
CXX | Address
CXX | Name
CXX | Summary
DXX | Address
DXX | Name
DXX | Contact no.
DXX | Address
DXX | Name
My identifier is in the first column (AXX, BXX...).
The expected output is:
Row no| Column A | Column B | Column D | Column E |
1 | | | | |
2 | | | BXX | Address |
3 | | | BXX | Name |
4 | AXX | Contact no. | BXX | Contact no. |
5 | AXX | Address | BXX | Details |
6 | AXX | Name | | |
7 | AXX | Summary | CXX | Address |
8 | | | CXX | Name |
9 | | | CXX | Summary |
10 | | | | |
As you can see, my identifiers are AXX, BXX, and so on. For each group of rows sharing an identifier, I want to count its rows and compare that count with the rows still free in the current range.
BXX was not placed below AXX because only 3 rows remain in that range but BXX needs 4, so it is passed on to the next range. There should also be a blank row separating groups within a range, as seen between BXX and CXX.
For now, all I know is how to count the rows using a For loop. I would appreciate your help with this, thanks.
My Initial code to get row count
Dim aa, aaLastrow As Long
aaLastrow = ShtData.Range("A" & Rows.Count).End(xlUp).Row
For aa = 2 To aaLastrow
If ShtData.Cells(aa, 2).Value = ShtData.Cells(bb, 4).Value Then
Sheets("Sheet1").Cells(aa, 1).Value = ShtData.Cells(aa, 2).Value
End If
Next aa
I know my code is incorrect and I am not sure if this approach is on the right track.
Do it like this:
Sub test()
    Dim Data As Worksheet, ToWs As Worksheet
    Dim vData, vDB, vArray
    Dim i As Integer, j As Long, n As Long
    Dim rngDB(1 To 4) As Range
    Set Data = Sheets(1)
    Set ToWs = Sheets(2)
    vData = Data.Range("a1").CurrentRegion
    'Identifier prefixes, one per destination range below
    vArray = Array("A", "B", "C", "D")
    'Destination ranges are hard-coded for this data set
    With ToWs
        Set rngDB(1) = .Range("a4:b10")
        Set rngDB(2) = .Range("d2:e5")
        Set rngDB(3) = .Range("d7:e10")
        Set rngDB(4) = .Range("g2:h10")
    End With
    For i = 1 To 4
        n = 0
        rngDB(i).Clear
        vDB = rngDB(i)
        For j = 1 To UBound(vData)
            If vData(j, 1) Like vArray(i - 1) & "*" Then
                n = n + 1
                'Stop before writing past the end of the destination range
                If n > UBound(vDB, 1) Then Exit For
                vDB(n, 1) = vData(j, 1)
                vDB(n, 2) = vData(j, 2)
            End If
        Next j
        rngDB(i) = vDB
    Next i
End Sub
=COUNTIF($A:$A, "AXX")
will give you the count you seem to want. You can refine the range, and you can insert the reference to a cell instead of the hard "AXX". If you wish to use VBA you can call the function as Application.Countif(Range, CountWhat) where "Range" is a range you define in VBA and "CountWhat" is a variant.
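The placement rule from the question (a group goes into the current range only if it still fits, otherwise it moves to the next range, with one blank row between groups) can also be worked out without hard-coding where each group lands. Below is a rough sketch of that logic in Python rather than VBA, just to show the bookkeeping. The function name place_groups, the tuple layout, and the choice to count the separator row against the remaining space are my assumptions, not something from the question.

```python
from itertools import groupby


def place_groups(rows, ranges, gap=1):
    """Assign grouped rows to destination ranges.

    rows   -- list of (identifier, value) tuples, grouped by identifier
    ranges -- list of (column_label, first_row, last_row), tried in order
    gap    -- blank rows left between two groups in the same range
    Returns a dict mapping (column_label, sheet_row) to (identifier, value).
    """
    placed = {}
    slot = 0                  # index of the range being filled
    next_free = ranges[0][1]  # first unused row in that range
    for _, grp in groupby(rows, key=lambda r: r[0]):
        block = list(grp)
        while True:
            if slot >= len(ranges):
                raise ValueError("not enough room for all groups")
            label, first, last = ranges[slot]
            # No separator is needed when the block opens the range.
            start = next_free if next_free == first else next_free + gap
            if start + len(block) - 1 <= last:
                break
            # Block does not fit: move on to the next range.
            slot += 1
            if slot < len(ranges):
                next_free = ranges[slot][1]
        for offset, row in enumerate(block):
            placed[(label, start + offset)] = row
        next_free = start + len(block)
    return placed


# Sample data and ranges taken from the question.
data = (
    [("AXX", v) for v in ("Contact no.", "Address", "Name", "Summary")]
    + [("BXX", v) for v in ("Address", "Name", "Contact no.", "Details")]
    + [("CXX", v) for v in ("Address", "Name", "Summary")]
    + [("DXX", v) for v in ("Address", "Name", "Contact no.", "Address", "Name")]
)
layout = place_groups(data, [("A", 4, 10), ("D", 2, 10), ("G", 2, 10)])
```

With the sample data this puts AXX in A4:A7, BXX in D2:D5, CXX in D7:D9 and DXX in G2:G6, which lines up with the expected output in the question. The same loop structure should translate to VBA with a counter per range.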
I'm trying to split multi-line text box and paste the output on the next sheet in MS Excel using VBA, and I found a fragment of code below and it works:
Dim Str As String, a
Dim cnt As Integer
Dim w()
Str = xmlRequestTextBox.Value
a = Chr(10)
cnt = UBound(Split(Str, a))
MsgBox (Str)
MsgBox (a)
MsgBox (cnt)
ReDim w(1 To cnt + 1, 1 To 1)
For i = 0 To cnt
w(i + 1, 1) = Split(Str, Chr(10))(i)
Next i
Sheet2.range("A1").Resize(i, 1) = w
Sheet2.Cells.Replace Chr(13), " "
Now my problem is that when I modified it to use a one-dimensional array, it only outputs the value of the first element of the array. Why does the array have to be multi-dimensional? Thank you in advance.
Split() itself returns a one-dimensional, zero-based array. For multi-line text you get two levels by splitting twice: first on the line break, where the index refers to the row (line), and then splitting each line on a space, where the index refers to the individual words in that line. Hope this helps.
An example of using Split:
Option Explicit
Private Sub CommandButton1_Click()
Dim Str As String
Dim cnt As Integer
Dim w As Variant
Str = Me.xmlRequestTextBox.Value
w = Split(Str, Chr(10))
cnt = UBound(w)
MsgBox (Str)
MsgBox (Chr(10))
MsgBox (cnt)
' suppose the multi-line textbox value is:
' first line : 1 2 3
' second line: 4 5 6
' third line: 7 8 9
With ThisWorkbook.Sheets("split")
.Range("A1").Resize(cnt + 1) = w(1) ' writes "4 5 6" in A1:A3 (i.e. the 2nd element of w)
.Range("B1").Resize(cnt + 1) = Split(w(0), " ")(2) ' writes "3" in B1:B3 (i.e. the third element of the 1st element of w)
.Cells.Replace Chr(13), " "
End With
End Sub
Here I assumed that there is a UserForm with a multi-line textbox named "xmlRequestTextBox" and a button named "CommandButton1", whose Click event runs that piece of code of yours.
To answer your question, the array does not have to be multi-dimensional. However, Excel assumes a 1-dimensional array will be horizontal, that is, fill the columns in a single row. So if you're writing to a vertical range (multiple rows in a single column) in a spreadsheet, it will take the value of the first element (column) and write it to each cell in the column. For example, let's assume you have an array like this:
MyArray = Array(1, 2, 3, 4, 5)
This code will write the first element only of MyArray to the vertical range in column A:
Set MyRange = Range("A1").Resize(5, 1) ' Cells A1:A5
MyRange.Value = MyArray
Result:
A | B | C | D | E
+---+---+---+---+---+
1 | 1 | | | | |
2 | 1 | | | | |
3 | 1 | | | | |
4 | 1 | | | | |
5 | 1 | | | | |
If you write the array to a horizontal range in a single row then it will display the whole array:
Set MyRange = Range("A1").Resize(1, 5) ' Cells A1:E1
MyRange.Value = MyArray
Result:
A | B | C | D | E
+---+---+---+---+---+
1 | 1 | 2 | 3 | 4 | 5 |
2 | | | | | |
3 | | | | | |
4 | | | | | |
5 | | | | | |
However, I'm assuming you actually want to write the array to a vertical range in a single column. You can accomplish this by calling Application.Transpose (also available as Application.WorksheetFunction.Transpose) on the array. This will write each element of the array into its own row in column A:
Set MyRange = Range("A1").Resize(5, 1) ' Cells A1:A5
MyRange.Value = Application.Transpose(MyArray)
Result:
A | B | C | D | E
+---+---+---+---+---+
1 | 1 | | | | |
2 | 2 | | | | |
3 | 3 | | | | |
4 | 4 | | | | |
5 | 5 | | | | |
For more information on this, see Chip Pearson's page on VBA Arrays And Worksheet Ranges
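The shape difference is easier to see if you model a worksheet range as a list of rows. The following is a small Python sketch (not VBA) under that assumption: a column is n rows of one cell each, while a flat list of values is really a single row. The function names are made up for the example, and it drops the carriage returns instead of replacing them with spaces.

```python
def lines_to_column(text):
    """Split multi-line text into an n x 1 grid (one inner list per row)."""
    lines = text.replace("\r", "").split("\n")
    return [[line] for line in lines]


def lines_to_row(text):
    """Return the same lines as a 1 x n grid (a single row)."""
    lines = text.replace("\r", "").split("\n")
    return [lines]


grid = lines_to_column("1 2 3\r\n4 5 6\r\n7 8 9")
```

Here grid has three rows with one cell each, which is the shape a vertical range expects. That is what the (1 To cnt + 1, 1 To 1) array in the question builds, and what Transpose produces from a one-dimensional array.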
I have two Excel sheets: A, which contains products, and B, which lists the products we will discontinue when stock runs out.
I would like a macro so that we can make a list in B, run the macro, and have it find each item in sheet A, go to column E of that row, and enter today's date.
The hitch I have so far is making sure it does not overwrite previous entries in column E for rows that weren't found.
The basic code I have right now is this:
Sub Deletions()
Dim LastRow As Long
With Sheets("A") '<-set this worksheet reference properly
LastRow = .Range("A" & Cells.Rows.Count).End(xlUp).Row
With .Range("E2:E" & LastRow)
.Formula = "=IF(A1='B'!A1,TODAY(),)"
.Cells = .Value2
End With
End With
End Sub
The reason I need to use VBA is that we have over 100k items, and not everyone using this will know Excel very well. So we want to be able to make a list, put it in Excel, click the macro button, and voila.
Also, the list of removed items gets deleted afterwards, as the information is kept in sheet A. We also need to keep the dates of when products got discontinued, so it is very crucial that this macro not erase previous entries.
Here's my answer:
Please follow the comments inside the code.
Sub discontinue_Prods()
    'Sheet B holds the list of discontinued products, with a header in row 1.
    Dim shtA As Worksheet, shtB As Worksheet
    Dim disRange As Range, i As Range
    Dim r As Long        'last row of the list on sheet B
    Dim dLine As Variant 'row of the match on sheet A, or an error value
    Const c As Long = 1  'list column on sheet B (column A); change if needed
    Const A As Long = 1  'product column on sheet A
    Const E As Long = 5  'date column on sheet A

    Set shtA = Sheets("A") 'storing the sheets...
    Set shtB = Sheets("B")

    'Last used row of the list; searching up from the bottom also works
    'when the list has a single item.
    r = shtB.Cells(shtB.Rows.Count, c).End(xlUp).Row
    If r < 2 Then Exit Sub 'nothing below the header
    'Use 1 instead of 2 here if the list has no header.
    Set disRange = shtB.Range(shtB.Cells(2, c), shtB.Cells(r, c))

    For Each i In disRange 'each product that is going to be discontinued
        'Application.Match returns an error value instead of raising one,
        'so no On Error handling is needed.
        dLine = Application.Match(i.Value, shtA.Columns(A), 0)
        If Not IsError(dLine) Then
            'Write today's date (system date) as a static value to column E.
            'Rows without a match are never touched.
            shtA.Cells(dLine, E).Value = Date
            'IMPORTANT: if you want the formula instead, use this:
            'shtA.Cells(dLine, E).Formula = "=TODAY()"
        End If
    Next i
End Sub
The code goes over the cells in the list on Sheet B, looks each product up in Sheet A, and, when it finds a match, sets column E of that row to today's date (the system date). Note: if you want to use formulas instead, see the comments.
With a list like this:
Sheet A
+----------+-----+
| Products | Qty |
+----------+-----+
| Prod001 | 44 |
| Prod002 | 27 |
| Prod003 | 65 |
| Prod004 | 135 |
| Prod005 | 95 |
| Prod006 | 36 |
| Prod007 | 114 |
| Prod008 | 20 |
| Prod009 | 107 |
| Prod010 | 7 |
| Prod011 | 22 |
| Prod012 | 142 |
| Prod013 | 99 |
| Prod014 | 144 |
| Prod015 | 150 |
| Prod016 | 44 |
| Prod017 | 57 |
| Prod018 | 64 |
| Prod019 | 17 |
| Prod020 | 88 |
+----------+-----+
Sheet B
+----------+
| Products |
+----------+
| Prod017 |
| Prod011 |
| Prod005 |
| Prod018 |
| Prod006 |
| Prod009 |
| Prod006 |
| Prod001 |
| Prod017 |
+----------+
Result in Sheet A
+----------+-----+--+--+-----------+
| Products | Qty | | | |
+----------+-----+--+--+-----------+
| Prod001 | 44 | | | 2/23/2016 |
| Prod002 | 27 | | | |
| Prod003 | 65 | | | |
| Prod004 | 135 | | | |
| Prod005 | 95 | | | 2/23/2016 |
| Prod006 | 36 | | | 2/23/2016 |
| Prod007 | 114 | | | |
| Prod008 | 20 | | | |
| Prod009 | 107 | | | 2/23/2016 |
| Prod010 | 7 | | | |
| Prod011 | 22 | | | 2/23/2016 |
| Prod012 | 142 | | | |
| Prod013 | 99 | | | |
| Prod014 | 144 | | | |
| Prod015 | 150 | | | |
| Prod016 | 44 | | | |
| Prod017 | 57 | | | 2/23/2016 |
| Prod018 | 64 | | | 2/23/2016 |
| Prod019 | 17 | | | |
| Prod020 | 88 | | | |
+----------+-----+--+--+-----------+
I think you are overcomplicating this by using VBA.
Instead, you can do this with a simple Excel formula:
Assume 'Sheet B', column A holds the list of discontinued items. 'Sheet A' column A holds the name of each item, and you want today's date in column E, wherever there is a match of an item in Sheet B. Put this in 'Sheet A' E1 and copy it down to the end of the sheet.
=IF(ISERROR(MATCH(A1,'Sheet B'!A:A, 0)), "", TODAY())
This puts today's date in every row of Sheet A that matches any row in Sheet B. MATCH looks for the item anywhere in Sheet B's column A; if there is no match it returns an error, so ISERROR is TRUE and the IF produces "". If there is a match, there is no error and the IF produces TODAY(). Note that TODAY() recalculates every day, so copy the column and paste it as values if the date has to stay fixed.
This is what I would do:
Dim b As Variant
Dim i As Long, j As Long
For j = 1 To Range("A1").End(xlDown).Row 'Assuming the button is on the "B" sheet
    b = Cells(j, 1).Value 'This is your product in sheet "B", assuming it is in the first column
    For i = 1 To Sheets("A").Range("A1").End(xlDown).Row
        If Sheets("A").Cells(i, 1).Value = b Then 'The product was found in row i
            Sheets("A").Cells(i, 5) = Format(Now(), "MMM-DD-YYYY") 'Write today's date
            Exit For 'No need to keep looping
        End If
    Next i
Next j
It's very basic, but I'm sure it works.
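With over 100k items it may be worth indexing the product list once instead of scanning it for every discontinued item. Here is a minimal sketch of that idea in Python (not VBA), assuming each row of sheet A is a list of [product, qty, discontinued_on]. The function name, the row layout, and the rule of leaving already-dated rows alone are my assumptions, based on the requirement not to erase previous entries. In VBA a Scripting.Dictionary could play the same role as the dict here.

```python
from datetime import date


def stamp_discontinued(sheet_a, discontinued, today=None):
    """Set the date on matching rows of sheet_a; return how many were stamped.

    sheet_a      -- list of rows, each [product, qty, discontinued_on or None]
    discontinued -- iterable of product names (duplicates are fine)
    """
    today = today or date.today()
    # One pass to index the rows by product name, so each lookup is cheap.
    row_of = {row[0]: row for row in sheet_a}
    stamped = 0
    for name in discontinued:
        row = row_of.get(name)
        # Skip unknown products and rows that already carry a date.
        if row is not None and row[2] is None:
            row[2] = today
            stamped += 1
    return stamped


sheet_a = [
    ["Prod001", 44, None],
    ["Prod002", 27, date(2015, 1, 5)],  # discontinued earlier, must be kept
    ["Prod005", 95, None],
]
count = stamp_discontinued(
    sheet_a, ["Prod005", "Prod002", "Prod005", "Prod999"], today=date(2016, 2, 23)
)
```

If a product name appears more than once in sheet A, this sketch only stamps the last such row, so it assumes product names are unique there.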
I have a fairly large data set that needs to be exported as CSV from Excel for import into another application. The file cannot have duplicate column headings, but at the moment there are many of them. I need to consolidate these headings and their respective data into single columns and remove the duplicates.
I am trying to take data like this:
MAKE | MAKE | MAKE | MODEL | MODEL | TRIM |
-------------------------------------------
FORD | | | | | |
-------------------------------------------
| FIAT | | | | |
-------------------------------------------
| | MINI | | | |
-------------------------------------------
| | | PILOT | | |
-------------------------------------------
| | | | SC400 | |
-------------------------------------------
| | | | | EX |
-------------------------------------------
and turn it into this:
MAKE | MODEL | TRIM |
---------------------
FORD | | |
---------------------
FIAT | | |
---------------------
MINI | | |
---------------------
| PILOT | |
---------------------
| SC400 | |
---------------------
| | EX |
---------------------
Thanks in advance for any help in accomplishing this.
You need to separate the problem into smaller bits:
1. Read the unique titles and save them in a Dictionary object (as the value, you might want to store the column each title is going to be saved in).
2. Iterate through each cell, getting its value and reading its column header.
3. Write that value to a new sheet in the same row; for the column position, look up the current column title in the dictionary and get its position.
EDIT: Code tested and debugged. Works well.
Note: This method assumes that you have only 1 value per duplicated columns per row.
If you have more than 1 value for duplicated columns then the program will always save the last one (as it will overwrite the previous value). If you want a method that handles multiple values per column then you need to keep a Row number for each column in the new sheet and increment it by 1 each time you write data in that column.
Sub WriteValues()
    'Assuming your column titles are in row 1.
    'Needs a reference to "Microsoft Scripting Runtime" for Dictionary
    '(Tools > References in the VBA editor).
    Dim mainSheet As Worksheet
    Set mainSheet = ActiveSheet
    Dim maxCols As Long
    Dim maxRows As Long
    maxRows = 0
    maxCols = mainSheet.Cells(1, mainSheet.Columns.Count).End(xlToLeft).Column
    Dim colPositions As Dictionary
    Set colPositions = New Dictionary
    'Iterate through row 1 to get all unique titles
    Dim iCol As Long
    Dim lastRow As Long
    Dim title As String
    For iCol = 1 To maxCols
        'Map each title to its column in the new sheet, skipping repeats
        title = CStr(mainSheet.Cells(1, iCol).Value)
        If Not colPositions.Exists(title) Then
            colPositions.Add title, colPositions.Count + 1
        End If
        'Also record maxRows
        lastRow = mainSheet.Cells(mainSheet.Rows.Count, iCol).End(xlUp).Row
        If lastRow > maxRows Then maxRows = lastRow
    Next iCol
    Dim newSheet As Worksheet
    Set newSheet = Sheets.Add
    Dim col As Long
    Dim row As Long
    'Write column titles in new sheet
    Dim v As Variant
    iCol = 1
    For Each v In colPositions
        newSheet.Cells(1, iCol).Value = v
        iCol = iCol + 1
    Next v
    'Main data iterator
    Dim cellValue As String
    Dim valueColumn As String
    For row = 2 To maxRows
        For col = 1 To maxCols
            With mainSheet
                cellValue = .Cells(row, col).Value
                valueColumn = .Cells(1, col).Value
            End With
            'Copy non-empty values into the column assigned to their title
            If cellValue <> "" Then
                newSheet.Cells(row, colPositions(valueColumn)).Value = cellValue
            End If
        Next col
    Next row
End Sub
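For anyone who wants to check the logic before touching the workbook, here is the same idea as a short Python sketch (not VBA): map each distinct title to an output column, then copy every non-empty cell into the column for its title. The function name consolidate and the list-of-lists layout are assumptions for the example. As in the macro, a later value in the same row overwrites an earlier one for the same title.

```python
def consolidate(headers, rows):
    """Merge columns that share a header; return (unique_headers, new_rows)."""
    position = {}
    for h in headers:
        # The first occurrence of a title decides its column in the output.
        position.setdefault(h, len(position))
    merged = []
    for row in rows:
        out = [""] * len(position)
        for h, value in zip(headers, row):
            if value != "":
                out[position[h]] = value  # a later value overwrites an earlier one
        merged.append(out)
    return list(position), merged


# Sample data from the question.
headers = ["MAKE", "MAKE", "MAKE", "MODEL", "MODEL", "TRIM"]
rows = [
    ["FORD", "", "", "", "", ""],
    ["", "FIAT", "", "", "", ""],
    ["", "", "MINI", "", "", ""],
    ["", "", "", "PILOT", "", ""],
    ["", "", "", "", "SC400", ""],
    ["", "", "", "", "", "EX"],
]
titles, table = consolidate(headers, rows)
```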
I'm trying to join two tables with LINQ based on a dynamic key. The user can change the key via a combo box, and the key may be money, string, double, int, etc. Currently I'm getting the data just fine, but without filtering out the duplicates. I can filter the duplicates in VB afterwards, but it's slooooow. I'd like to do it in the LINQ query right out of the gate.
Here's the data:
First Table:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
Second Table:
------------------------------------------------------------
| OrangeIndex | OrangeCost | OrangeColor | OrangeDescription |
------------------------------------------------------------
| 1 | 1 | Orange | This is an Orange |
| 2 | 3 | Orange | |
| 3 | 2 | Orange | This is an Orange |
| 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------
Currently, I'm using the following code to get too much data:
Dim Matches = From mRows In LinqMasterTable Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows Distinct
Outcome:
-------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | Orange | This is an Orange |
| 1 | 3 | Red | This is a duplicate | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 3 | Orange | |
| 2 | 5 | Green | This is a duplicate | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 2 | Orange | This is an Orange |
| 3 | 4 | Pink | This is a duplicate | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 3 | Orange | |
| 4 | 2 | Yellow | This is a duplicate | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 2 | Orange | This is an Orange |
| 5 | 2 | Orange | This is a duplicate | 2 | Orange | This is an Orange |
-------------------------------------------------------------------------
Desired Outcome:
------------------------------------------------------------------------
| 1 | 3 | Red | This is an apple | 1 | 1 | Orange | This is an Orange |
| 2 | 5 | Green | This is an apple | 2 | 3 | Orange | |
| 3 | 4 | Pink | This is an apple | 3 | 2 | Orange | This is an Orange |
| 4 | 2 | Yellow | This is an apple | 4 | 3 | Orange | |
| 5 | 2 | Orange | This is an apple | 5 | 2 | Orange | This is an Orange |
------------------------------------------------------------------------
I have tried the following:
'Get the original Column Names into an Array List
'MasterTableColumns = GetColumns(qMasterDS, TheMasterTable) '(external code)
'Plug the Existing DataSet into a DataView:
Dim View As DataView = New DataView(qMasterTable)
'Sort by the Primary Key:
View.Sort = ThePrimaryKey
'Build a new table listing only one column:
Dim newListTable As DataTable = _
View.ToTable("UniqueData", True, ThePrimaryKey)
This returns a unique list, but no associated data:
-------------
| AppleIndex |
-------------
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
-------------
So I tried this instead:
'Build a new table with ALL the columns:
Dim newFullTable As DataTable = _
View.ToTable("UniqueData", True, _
MasterTableColumns(0), _
MasterTableColumns(1), _
MasterTableColumns(2), _
MasterTableColumns(3))
Unfortunately, it yields the following... with duplicates:
-------------------------------------------------------------
| AppleIndex | AppleCost | AppleColor | AppleDescription |
------------------------------------------------------------
| 1 | 3 | Red | This is an apple |
| 2 | 5 | Green | This is an apple |
| 3 | 4 | Pink | This is an apple |
| 4 | 2 | Yellow | This is an apple |
| 5 | 2 | Orange | This is an apple |
| 1 | 3 | Red | This is a duplicate|
| 2 | 5 | Green | This is a duplicate|
| 3 | 4 | Pink | This is a duplicate|
| 4 | 2 | Yellow | This is a duplicate|
| 5 | 2 | Orange | This is a duplicate|
-------------------------------------------------------------
Any ideas?
~~~~~~~~~~~~ Update: ~~~~~~~~~~~~
Jeff M suggested the following code. (Thanks Jeff.) However, it gives me an error. Does anyone know the syntax for making this work in VB? I've monkeyed with it a bit and can't seem to get it right.
Dim matches = _
From mRows In (From row In LinqMasterTable _
Group row By row(ThePrimaryKey) Into g() _
Select g.First()) _
Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows
Error in Third row at "row(ThePrimaryKey)":
"Range variable name can be inferred only from a simple or qualified name with no arguments."
Well, the basic problem isn't the LINQ. It's the fact that your First Table contains "duplicates" which aren't really duplicates, since in your example every row is distinct.
So, our question to you is "How do we identify the duplicates in the original table?". Once that is answered, the rest should be trivial.
For example (In C# since I'm not sure of the VB syntax)
var Matches = from mRows in LinqMasterTable
                  .Where(r => r.Field<string>("AppleDescription") == "This is an apple")
              join sRows in LinqSecondTable
                  on mRows[ThePrimaryKey] equals sRows[TheForignKey]
              orderby mRows[ThePrimaryKey]
              select new { mRows, sRows };
Edit:
Here's how I would write the C# LINQ query. Rather than using Distinct(), this alternate version uses a nested query with grouping, which should have similar semantics. It should be easily convertible to VB.
var matches = from mRows in (from row in LinqMasterTable
                             group row by row[ThePrimaryKey] into g
                             select g.First())
              join sRows in LinqSecondTable
                  on mRows[ThePrimaryKey] equals sRows[TheForignKey]
              orderby mRows[ThePrimaryKey]
              select new { mRows, sRows };
and my attempt at a VB version of the above:
Edit:
As for the most recent error, I know exactly how to deal with it. When I was playing with VB LINQ, I found that the compiler doesn't like complex grouping expressions. To get around that, assign row(ThePrimaryKey) to a temporary variable and group by that variable. It should work then.
Dim matches = From mRows In (From row In LinqMasterTable _
                             Let grouping = row(ThePrimaryKey) _
                             Group row By grouping Into g = Group _
                             Select g.First()) _
              Join sRows In LinqSecondTable _
              On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
              Order By mRows(ThePrimaryKey) _
              Select mRows, sRows
Actually upon second inspection, it turns out that what is being grouped by needs a name. The following will work.
Dim matches = From mRows In (From row In LinqMasterTable _
                             Group row By Grouping = row(ThePrimaryKey) Into g = Group _
                             Select g.First()) _
              Join sRows In LinqSecondTable _
              On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
              Order By mRows(ThePrimaryKey) _
              Select mRows, sRows
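In case the query syntax gets in the way, the semantics of the nested query are simple: keep the first row per key in the master table, then inner-join those rows to the second table and order by the key. Here is a rough Python sketch of just that logic (not VB, and not LINQ), with rows modelled as dicts. The function names and the sample rows are made up for illustration.

```python
def first_per_key(rows, key):
    """Keep only the first row seen for each value of row[key]."""
    firsts = {}
    for row in rows:
        firsts.setdefault(row[key], row)
    return list(firsts.values())


def join_first(master, second, primary_key, foreign_key):
    """Inner-join deduplicated master rows to second rows, ordered by key."""
    by_key = {}
    for s in second:
        by_key.setdefault(s[foreign_key], []).append(s)
    pairs = [
        (m, s)
        for m in first_per_key(master, primary_key)
        for s in by_key.get(m[primary_key], [])
    ]
    return sorted(pairs, key=lambda pair: pair[0][primary_key])


# Five apples followed by five "duplicates" with the same AppleIndex values.
apples = [
    {"AppleIndex": i, "AppleDescription": d}
    for d in ("This is an apple", "This is a duplicate")
    for i in range(1, 6)
]
oranges = [
    {"OrangeIndex": i, "OrangeCost": c}
    for i, c in zip(range(1, 6), (1, 3, 2, 3, 2))
]
matches = join_first(apples, oranges, "AppleIndex", "OrangeIndex")
```

Which row counts as "first" depends on the order of the master table, so this only gives the desired outcome when the row you want to keep comes before its duplicates.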
Declarations and Such:
Private Sub LinqTwoTableInnerJoin(ByRef qMasterDS As DataSet, _
ByRef qMasterTable As DataTable, _
ByRef qSecondDS As DataSet, _
ByRef qSecondTable As DataTable, _
ByRef qPrimaryKey As String, _
ByRef qForignKey As String, _
ByVal qResultsName As String)
Dim TheMasterTable As String = qMasterTable.TableName
Dim TheSecondTable As String = qSecondTable.TableName
Dim ThePrimaryKey As String = qPrimaryKey
Dim TheForignKey As String = qForignKey
Dim TheNewForignKey As String = ""
MasterTableColumns = GetColumns(qMasterDS, TheMasterTable)
SecondTableColumns = GetColumns(qSecondDS, TheSecondTable)
Dim mColumnCount As Integer = MasterTableColumns.Count
Dim sColumnCount As Integer = SecondTableColumns.Count
Dim ColumnCount As Integer = mColumnCount + sColumnCount
Dim LinqMasterTable = qMasterDS.Tables(TheMasterTable).AsEnumerable
Dim LinqSecondTable = qSecondDS.Tables(TheSecondTable).AsEnumerable
Get the Data and order it by the Selected Key:
Dim Matches = From mRows In LinqMasterTable Join sRows In LinqSecondTable _
On mRows(ThePrimaryKey) Equals sRows(TheForignKey) _
Order By mRows(ThePrimaryKey) _
Select mRows, sRows
Put the Results into a Dataset Table:
' Make sure the dataset is available and/or cleared:
If dsResults.Tables(qResultsName) Is Nothing Then dsResults.Tables.Add(qResultsName)
dsResults.Tables(qResultsName).Clear() : dsResults.Tables(qResultsName).Columns.Clear()
'Adds Master Table Column Names
For x = 0 To MasterTableColumns.Count - 1
dsResults.Tables(qResultsName).Columns.Add(MasterTableColumns(x))
Next
'Rename Second Table Names if Needed:
For x = 0 To SecondTableColumns.Count - 1
With dsResults.Tables(qResultsName)
For y = 0 To .Columns.Count - 1
If SecondTableColumns(x) = .Columns(y).ColumnName Then
SecondTableColumns(x) = SecondTableColumns(x) & "_2"
End If
Next
End With
Next
'Make sure that the Forign Key is a Unique Value
If TheForignKey = ThePrimaryKey Then
TheNewForignKey = TheForignKey & "_2"
Else
TheNewForignKey = TheForignKey
End If
'Adds Second Table Column Names
For x = 0 To SecondTableColumns.Count - 1
dsResults.Tables(qResultsName).Columns.Add(SecondTableColumns(x))
Next
'Copy Results into the Dataset:
For Each Match In Matches
'Build an array for each row:
Dim NewRow(ColumnCount - 1) As Object
'Add the mRow Items:
For x = 0 To MasterTableColumns.Count - 1
NewRow(x) = Match.mRows.Item(x)
Next
'Add the srow Items:
For x = 0 To SecondTableColumns.Count - 1
Dim y As Integer = x + (MasterTableColumns.Count)
NewRow(y) = Match.sRows.Item(x)
Next
'Add the array to dsResults as a Row:
dsResults.Tables(qResultsName).Rows.Add(NewRow)
Next
Give the user an option to clean out duplicates or not:
If chkUnique.Checked = True Then
ReMoveDuplicates(dsResults.Tables(qResultsName), ThePrimaryKey)
End If
Remove the Duplicates if they so desire:
Private Sub ReMoveDuplicates(ByRef SkipTable As DataTable, _
ByRef TableKey As String)
'Make sure that there's data to work with:
If SkipTable Is Nothing Then Exit Sub
If TableKey Is Nothing Then Exit Sub
'Create an ArrayList of rows to delete:
Dim DeleteRows As New ArrayList()
'Fill the Array with Row Number of the items equal
'to the item above them:
For x = 1 To SkipTable.Rows.Count - 1
Dim RowOne As DataRow = SkipTable.Rows(x - 1)
Dim RowTwo As DataRow = SkipTable.Rows(x)
If RowTwo.Item(TableKey) = RowOne.Item(TableKey) Then
DeleteRows.Add(x)
End If
Next
'If there are no hits, exit this sub:
If DeleteRows.Count < 1 Or DeleteRows Is Nothing Then
Exit Sub
End If
'Otherwise, remove the rows based on the row count value:
For x = 0 To DeleteRows.Count - 1
'Start at the END and count backwards so the duplicate
'item's row count value doesn't change with each deleted row
Dim KillRow As Integer = DeleteRows((DeleteRows.Count - 1) - x)
'Delete the row:
SkipTable.Rows(KillRow).Delete()
Next
End Sub
Then clean up any leftovers:
If Not chkRetainKeys.Checked = True Then 'Removes Forign Key
dsResults.Tables(qResultsName).Columns.Remove(TheNewForignKey)
End If
'Clear Arrays
MasterTableColumns.Clear()
SecondTableColumns.Clear()
Final Analysis:
Ran this against 2 files with 4 columns, 65,535 rows, and some duplicates. Process time: roughly 1 second. In fact, it took longer to load the fields into memory than it did to parse the data.
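For reference, the duplicate removal above boils down to two steps: collect the index of every row whose key equals the key of the row above it, then delete those rows from the end backwards so the remaining indexes stay valid. A minimal Python sketch of that technique follows (not VB). It assumes, as the code above does, that the rows are already ordered by the key. The function name and sample rows are made up.

```python
def remove_adjacent_duplicates(rows, key):
    """Delete, in place, every row whose key equals the key of the row above."""
    to_delete = [
        i for i in range(1, len(rows)) if rows[i][key] == rows[i - 1][key]
    ]
    # Delete from the end so the earlier indexes stay valid.
    for i in reversed(to_delete):
        del rows[i]
    return rows


results = [
    {"AppleIndex": 1, "AppleDescription": "This is an apple"},
    {"AppleIndex": 1, "AppleDescription": "This is a duplicate"},
    {"AppleIndex": 2, "AppleDescription": "This is an apple"},
    {"AppleIndex": 2, "AppleDescription": "This is a duplicate"},
    {"AppleIndex": 3, "AppleDescription": "This is an apple"},
]
remove_adjacent_duplicates(results, "AppleIndex")
```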