I'm trying to combine rows in a DataTable based on their shared ID. The table data looks something like this:
Member | ID | Assistant | Content
---------------------------------------------
16 | 1234 | jkaufman | 1/1/2015 - stuff1
16 | 1234 | jkaufman | 1/2/2015 - stuff2
16 | 4321 | mhatfield | 1/3/2015 - stuff3
16 | 4321 | mhatfield | 1/4/2015 - stuff4
16 | 4321 | mhatfield | 1/5/2015 - stuff5
16 | 5678 | psmith | 1/6/2015 - stuff6
I want to combine rows based on matching IDs. There are two steps I could use some clarification on. The first is merging the rows. The second is combining the Content columns so that the contents aren't lost. For the example above, here's what I want:
Member | ID | Assistant | Content
-------------------------------------------------------------------------------------------
16 | 1234 | jkaufman | 1/1/2015 - stuff1 \r\n 1/2/2015 - stuff2
16 | 4321 | mhatfield | 1/3/2015 - stuff3 \r\n 1/4/2015 - stuff4 \r\n 1/5/2015 - stuff5
16 | 5678 | psmith | 1/6/2015 - stuff6
My eventual goal is copy the DataTable to an Excel spreadsheet so I'm not sure sure if the \r\n is the correct newline character but that's the least of my concerns at this point.
Here's my code right now (EDIT: updated to current code):
Dim tmpRow As DataRow
dtFinal = dt.Clone()
Dim i As Integer = 0
While i < dt.Rows.Count
tmpRow = dtFinal.NewRow()
tmpRow.ItemArray = dt.Rows(i).ItemArray.Clone()
Dim j As Integer = i + 1
While j <= dt.Rows.Count
If j = dt.Rows.Count Then 'if we've iterated off the end of the datset
i = j
Exit While
End If
If dt.Rows(i).Item("ID") = dt.Rows(j).Item("ID") Then 'if we've found another entry for this id
'append change to tmpRow
tmpRow.Item("Content") = tmpRow.Item("Content").ToString & Environment.NewLine & dt.Rows(j).Item("Content").ToString
Else 'if we've run out of entries to combine
i = j
Exit While
End If
j += 1
End While
'add our combined row to the final result
dtFinal.ImportRow(tmpRow)
End While
When I export the final table to Excel, the spreadsheet is blank so I'm definitely doing something wrong.
Any help would be fantastic. Thanks!
I see various problems with your approach (with both versions; but the second one seems better). That's why I have preferred to write a whole working code to help transmit my ideas clearly.
Dim dtFinal As DataTable = New DataTable
For Each col As DataColumn In dt.Columns
dtFinal.Columns.Add(col.ColumnName, col.DataType)
Next
Dim oldRow As Integer = -1
Dim row As Integer = -1
While oldRow < dt.Rows.Count - 1
dtFinal.Rows.Add()
row = row + 1
oldRow = oldRow + 1
Dim curID As String = dt.Rows(oldRow)(1).ToString()
Dim lastCol As String = ""
While (oldRow < dt.Rows.Count AndAlso dt.Rows(oldRow)(1).ToString() = curID)
lastCol = lastCol & dt.Rows(oldRow)(3).ToString() & Environment.NewLine
oldRow = oldRow + 1
End While
oldRow = oldRow - 1
For i As Integer = 0 To 2
dtFinal.Rows(row)(i) = dt.Rows(oldRow)(i)
Next
dtFinal.Rows(row)(3) = lastCol
End While
Note that trying to come up with the most "elegant" solution or to maximise the given in-built functionalities might not be the best way to face certain situations. In the problem you propose, for example, I think that it is better going step by step (and reducing code size/improving elegance only after a properly working version is in place). This is the kind of code I have tried to create here: a simple one delivering what is expected (I think that this is the exact functionality you want; in any case, bear in mind that I am including a simplistic code which you are expected to take as a mere help to understand the point).
I find the VB syntax clunky compared to how it would be in C#, but you may prefer this Linq with grouping solution:
Dim merge = (From rw In dt.Rows.OfType(Of DataRow)()
Group rw By
New With {.fld1 = rw(0)}.fld1,
New With {.fld2 = rw(1)}.fld2,
New With {.fld3 = rw(2)}.fld3 Into Group).
Select(Function(x)
Return New With {.Member = x.fld1,
.ID = x.fld2,
.Assistant = x.fld3,
.Content = String.Join("", x.Group.Select(Function(y)
Return String.Join("", y.ItemArray)
End Function))}
End Function)
Related
I would like to create with this code, to do a multiple random (ie a loop) to do a random 20-30 times. how could i do that? and I want to increase the value of the +1 textbox each time it goes to the next random step.
TextBox1.Text = Val(TextBox1.Text) + 1
Code:
Dim strWords As String() = str1.Split(",")
'Create an instance of the Random class
Dim ValRnd As Integer = TxtNumRnd.Text
Dim rnd As New Random(ValRnd)' (Values Random not worked)
'Get a random number from 1 to 80 (2 digits)
TextBox1.Text = Val(TextBox1.Text) + 1
Dim randomNumber As Integer = rnd.Next(0, 81)
If randomNumber = strWords(StrwrVal.Text) Then
Exit For
Else
TxtRnd1.Text = TxtRnd1.Text & vbNewLine & randomNumber
End If
Next
Next
Values random not worked.
Dim rnd As New Random(5)'
While this may not be a direct answer it should serve to clarify a few things.
The Random Constructor - as I stated in the comments, using New Random(5) does not generate a sequence of 5 random numbers for you. It simply sets the seed for your random number generator. This means that the sequence of numbers that you generate when calling Random.Next() will follow the same pattern as they all have the same seed (see example).
Note: Ideally, when creating a New Random() do not set a seed value. They current time will be used as the default seed.
Creating a New Random(x) inside a loop with a defined seed
'Since random is declared inside the loop, using the same seed value
'each time the loop executes, the same random sequence would be generated.
'Random.Next() will then continually access the first value in the sequence.
For i = 1 To 5
Dim rnd As New Random(5)
TextBox1.AppendText($"{rnd.Next(0, 11)} | ")
Next
Output: 3 | 3 | 3 | 3 | 3
Creating a New Random(x) outside of the loop with a defined seed
'Since random is now declared outside the loop, and Random.Next() is called
'inside the loop, the output sequence actually progresses.
'Note that the first number is the same as the previous example as the seed is the same.
Dim rnd As New Random(5)
For i = 0 To 5
TextBox1.AppendText($"{rnd.Next(0, 11)} | ")
Next
Output: 3 | 3 | 2 | 6 | 5 | 10
Creating multiple instances of New Random(x) all with the SAME seed
'Instance 1
Dim rnd1 As New Random(5)
For i = 0 To 5
TextBox1.AppendText($"{rnd1.Next(0, 11)} | ")
Next
'Instance 2
Dim rnd2 As New Random(5)
For i = 0 To 5
TextBox2.AppendText($"{rnd2.Next(0, 11)} | ")
Next
Instance 3
Dim rnd3 As New Random(5)
For i = 0 To 5
TextBox3.AppendText($"{rnd3.Next(0, 11)} | ")
Next
Instance 1 Output: 3 | 3 | 2 | 6 | 5 | 10
Instance 2 Output: 3 | 3 | 2 | 6 | 5 | 10
Instance 3 Output: 3 | 3 | 2 | 6 | 5 | 10
I got 2 columns on 2 sheets of ~100000 cells long.
Those columns look like this:
---------
| 1 | a |
---------
| 2 | b |
---------
| 3 | c |
---------
| 4 | d |
---------
| 5 | e |
---------
and this:
---------
| 1 | a |
---------
| 3 | k |
---------
| 2 | b |
---------
| 4 | d |
---------
Now I am comparing the first columns to each other, if they match it has to check if the second column also matches. So the result will look like this:
---------------------
| 1 | a | correct |
---------------------
| 2 | b | correct |
---------------------
| 3 | c | wrong |
---------------------
| 4 | d | correct |
---------------------
| 5 | e | not found |
---------------------
I am using this function to do this: =IFERROR(IF(VLOOKUP(A3;newsheet!A:B;2;FALSE)=B3;"Correct";"Wrong");"Not Found") But to do this, it takes very long, I am using excel 2016 and all my 4 processors. Now it calculates slower and slower, probably because I got the first column on alphabetical order both, but the deeper it gets, the more rows it is going to check. So is there any way to let the VLOOKUP function not check the cells, it did already found an agreement.
So in my example: if it found the | 1 | a |, the next round it will search only the following remaining items:
---------------------
| 2 | b | correct |
---------------------
| 3 | c | wrong |
---------------------
| 4 | d | correct |
---------------------
| 5 | e | not found |
---------------------
Thanks in advance for helping me out with this problem
Unless you need the values to dynamically update; the VBA is a better alternative to 100K formulas. Using an ArrayList and arrays it took 1.98 seconds to process the data.
Sub ValidateData()
Application.ScreenUpdating = False
Application.Calculation = xlCalculationManual
'You'll need to adjust these const values
Const LOOKUP_SHEET As String = "newsheet"
Const TARGET_SHEET As String = "oldsheet"
Dim x As Long, y As Long
Dim data As Variant, results As Variant
Dim key As String
Dim list As Object
Set list = CreateObject("System.Collections.ArrayList")
With Worksheets(LOOKUP_SHEET)
'Load the values from columns A and B into an array
data = .Range("A1", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 2)
End With
For x = 1 To UBound(data, 1)
'Create a unique identifier
'using a delimiter to ensure value don't mix
key = data(x, 1) & "|" & data(x, 2)
If Not list.Contains(key) Then list.Add key
Next
With Worksheets(TARGET_SHEET)
'Load the values from columns A and B into an array
data = .Range("A1", .Range("A" & .Rows.Count).End(xlUp)).Resize(, 2)
'Resize the results array
ReDim results(1 To UBound(data), 1 To 1)
For x = 1 To UBound(data, 1)
'Create a unique identifier
'using a delimiter to ensure value don't mix
key = data(x, 1) & "|" & data(x, 2)
results(x, 1) = IIf(list.Contains(key), "Correct", "Wrong")
Next
.Range("C1").Resize(UBound(results, 1)) = results
End With
Application.Calculation = xlCalculationAutomatic
Application.ScreenUpdating = True
End Sub
Can't post comments because of rep issues, but assuming they are sorted suitably this I think does what you ask.
Edit:
Also if you want to check both columns at once instead of going one at a time, you can join the two columns to make a proxy. i.e. Autofilling down =A1 & B1
So you get a third column containing
1a
2b
2c
etc.
Cuts the vlookups required in half :)
Sub ihopethishelps()
Dim last As Long
Dim r As Long
Range("B1").Select
Selection.End(xlDown).Select
last = ActiveCell.Row - 1
Range("C1").Select
For r = 0 To last
ActiveCell.Offset(r, 0).Value = _
"=IFERROR(IF(VLOOKUP(A" & r + 1 & ",Sheet2!A" & r + 1 & ":O" & last & ",2,FALSE)=B" & r + 1 & "," & Chr(34) & "Correct" & Chr(34) & "," & Chr(34) & "Wrong" & Chr(34) & ")," & Chr(34) & "Not Found" & Chr(34) & ")"
Next
End Sub
I have a datatable called dtstore with 4 columns called section, department, palletnumber and uniquenumber. I am trying to make a new datatable called dtmulti which has an extra column called multi which shows the count for the number of duplicate rows...
dtstore
section | department | palletnumber | batchnumber
---------------------------------------------------
pipes 2012 1234 21
taps 2011 5678 345
pipes 2012 1234 21
taps 2011 5678 345
taps 2011 5678 345
plugs 2009 7643 63
dtmulti
section | department | palletnumber | batchnumber | multi
----------------------------------------------------------
pipes 2012 1234 21 2
taps 2011 5678 345 3
I have tried lots of approaches but my code always feels clumsy and bloated, is there an efficient way to do this?
Here is the code I am using:
Dim dupmulti = dataTable.AsEnumerable().GroupBy(Function(i) i).Where(Function(g) g.Count() = 2).Select(Function(g) g.Key)
For Each row In dupmulti multirow("Section") = dup("Section")
multirow("Department") = dup("Department")
multirow("PalletNumber") = dup("PalletNumber")
multirow("BatchNumber") = dup("BatchNumber")
multirow("Multi") = 2
Next
Assumptions of the code below these lines: the DataTable containing the original information is called dup. It might contain any number of duplicates and all of them can be defined by just looking at the first column.
'Creating final table from the columns in the original table
Dim multirow As DataTable = New DataTable
For Each col As DataColumn In dup.Columns
multirow.Columns.Add(col.ColumnName, col.DataType)
Next
multirow.Columns.Add("multi", GetType(Integer))
'Looping though the groupped rows (= no duplicates) on account of the first column
For Each groups In dup.AsEnumerable().GroupBy(Function(x) x(0))
multirow.Rows.Add()
'Adding all the cells in the corresponding row except the last one
For c As Integer = 0 To dup.Columns.Count - 1
multirow(multirow.Rows.Count - 1)(c) = groups(0)(c)
Next
'Adding the last cell (duplicates count)
multirow(multirow.Rows.Count - 1)(multirow.Columns.Count - 1) = groups.Count
Next
I'm having trouble with some VBA for Excel 2010. I have a list of names that have different serial numbers associated with them. The following code will look at a name in column A, look it up in the names dictionary for an array of serial numbers associated with this name, and print out each number in a new column.
Names Dictionary:
Names("Jane B") = [111112, 22222]
Output:
|Joe A | 11111
|Jane B | 111112| 22222 |
|Jim C | 11111 | 121212 | 1122112
Code:
Dim name, counter
For i = 2 To Worksheets("Contacts").UsedRange.Rows.Count
name = Worksheets("Contact").Cells(i, 1)
counter = 0
If names.Exists(name) Then
For Each serial In names(name)
Worksheets("Contact").Cells(i, 2+counter).Value = serial
counter = counter + 1
Next serial
End If
Next i
So far, so good. But the output format isn't good for inputting into Access. Instead, I'd like to have the following format:
|Joe A | 11111
|Jane B | 111112
|Jane B | 22222
|Jim C | 11111
|Jim C | 121212
|Jim C | 1122112
Here's my code:
Dim name, counter
For i = 2 To Worksheets("Contact").UsedRange.Rows.Count
name= Worksheets("Contact").Cells(i, 1)
counter = 0
If names.Exists(name) Then
For Each serial In names(name)
Worksheets("Contact").Cells(i + counter, 2).Value = serial
Worksheets("Contact").Cells(i + counter, 1).Value = name
Worksheets("Contact").Cells(i + counter + 1, 1).EntireRow.Insert
counter = counter + 1
Next serial
End If
Next i
This is where I run into a problem. My output looks like this:
|Joe A | 11111
|Joe A | 1700
|Joe A | 1700
|Joe A | 1700
|Joe A | 1700
|Joe A | 1700
|Joe A | 1700
While the numbers are all made up, the 1700 output is actually what is outputting, although that doesn't relate to any serial number (???).
Can anyone spot what's off in my code?
Thank you all for your time and consideration.
With gratitude,
Zac
Try this: Use a new sheet (example: "NewContactSheet").
Instead of inserting rows to the current contact sheet, which makes you insert a row then scan the next row (the one you just inserted) and insert it again and again.
Then scan the contact sheet one row at a time, and compare to the dictionary exactly as you are. Then, one serial at a time per Name, you add cell 1 and 2 on the new sheet and increment the row.
Without the dictionary to test with, and based on the original post saying "So far so good"...
Sub SerialNameMover()
Dim name As String
Dim counter As Integer
Dim lastContactRow As Integer
Dim newSheet As String
Dim nRow As Integer
Dim i As Integer
newSheet = "NewContactSheet"
nRow = 2
lastContactRow = Worksheets("Contact").UsedRange.Rows.Count
For i = 2 To lastContactRow
name = Sheets("Contact").Cells(i, 1)
If Names.Exists(name) Then
For Each serial In Names(name)
Sheets(newSheet).Cells(nRow, 1) = name
Sheets(newSheet).Cells(nRow, 2) = serial
nRow = nRow + 1
Next serial
End If
Next i
End Sub
I have a collection (IList(Of Sample)) of the following class:
Public Class Sample
Public sampleNum As String
Public volume As Integer
Public initial As Single
Public final As Single
End Class
This collection is filled from a regex that gets passed over a file.
What I want to do is use Linq to generate a collection of these for each unique samplenum using the following conditions.
For each samplenum:
Have the highest volume where the final is greater then one
If the sample has multiple records for this volume then pick the one with the the highest final
If the previous step leaves us with no records pick the record with the highest final ignoring volume
I am extremely new to Linq and just can't get my head around this. For now I have solved this using for each's and temporary lists but I am interested in how this would be handled using pure Linq.
Sample Data:
samplenum | volume | initial | final
1 | 50 | 8.47 | 6.87
1 | 300 | 8.93 | 3.15
2 | 5 | 8.28 | 6.48
2 | 10 | 8.18 | 5.63
2 | 5 | 8.33 | 6.63
2 | 10 | 8.26 | 5.58
3 | 1 | 8.31 | 0.75
3 | 5 | 8.19 | 0.03
4 | 50 | 8.28 | 6.55
4 | 300 | 7.19 | 0.03
This should hopefully solve your problems:
Dim source As IEnumerable(Of Sample)
' Get the data...
Dim processed = source _
.GroupBy(Function(s) s.sampleNum) _
.Select(Function(s) Process(s))
Dim array = processed.ToArray()
Console.ReadLine()
The Process function:
Private Function Process(ByVal sequence As IEnumerable(Of Sample)) As Sample
Dim filtered = (
From s In sequence
Where s.final > 1
Order By
s.volume Descending,
s.final Descending
)
' If we don't have any elements after the filtering,
' return the one with the highest final.
' Otherwise, return the first element.
If Not filtered.Any() Then
Return (From s In sequence Order By s.final Descending).FirstOrDefault()
Else
Return filtered.First()
End If
End Function
Try this. I haven't tried it but it should do what you want. There is probs a better way of doing this:
' For each sample number in the list
For Each el In (From p In lst Select p.sampleNum).Distinct()
' can cause odd results in some cases so always put the foreach var into another var
Dim var As String = el
' get another "list" but for this sample num
Dim res As IEnumerable(Of Sample) = lst.Where(Function(p) p.volume > 1 AndAlso p.sampleNum = var)
Dim sam As Sample ' the result
If Not res Is Nothing And res.Any() Then
' we have a result, so get the first result where the
sam = res.Where(Function(p) p.volume = res.Max(Function(x) x.volume)).First()
Else
' we have no results, so resort back to the normal list, for this sample number
sam = lst.Where(Function(p) p.sampleNum = var AndAlso p.volume = lst.Max(Function(x) x.volume)).First()
End If
'
' do what ever with the sample here
'
Next