Use .Contains() to match on a property of a property of <T> in LINQ query - vb.net

Looking for help on how to perform a LINQ query using the .Contains() method of a List(Of T) to get back elements that are not contained in a second List(Of T) based on a property of a property of T in the first List(Of T).
Here is some sample code that I wrote up. The scenario is fictitious, but the concept is still there.
Module Module1
Sub Main()
' Get all Files in a directory that contain `.mp` in the name
Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("C:\Test\Path").GetFiles("*.mp*").ToList
Dim ValidFiles As New List(Of fileStruct)
' Get all Files that actually have an extension of `.mp3`
AllFiles.ForEach(Sub(x) If x.Extension.Contains("mp3") Then ValidFiles.Add(New fileStruct(prop1:=x.Name, path:=x.FullName)))
' Attempting to get all files that are not listed in the Valid files list
Dim InvalidFiles As IO.FileInfo() = From file As IO.FileInfo In AllFiles Where Not ValidFiles.Contains(Function(x As fileStruct) x.fleInfo.FullName = file.FullName) Select file
' Errors on the `.Contains()` method because I have no idea what I'm doing and I am basically guessing at this point
'Here is the same but instead using the `.Any()` Method
Dim InvalidFiles As IO.FileInfo() = From file As IO.FileInfo In AllFiles Where Not ValidFiles.Any(Function(x As fileStruct) x.fleInfo.FullName = file.FullName) Select file
' This doesn't error out, but all files are returned
End Sub
Public Structure fileStruct
Private _filePath As String
Private _property1 As String
Public ReadOnly Property property1 As String
Get
Return _property1
End Get
End Property
Public ReadOnly Property fleInfo As IO.FileInfo
Get
Return New IO.FileInfo(_filePath)
End Get
End Property
Public Sub New(ByVal prop1 As String, ByVal path As String)
_property1 = prop1
_filePath = path
End Sub
End Structure
End Module

This is a more or less direct implementation of the MP3 files list in the question. I did use a FileItem class instead of a structure. The good part is afterwards:
' note: EnumerateFiles
Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("M:\Music").
EnumerateFiles("*.mp*", IO.SearchOption.AllDirectories).ToList()
Dim goofyFilter As String() = {"g", "h", "s", "a"}
' filter All files to those starting with the above (lots of
' Aerosmith, Steely Dan and Heart)
Dim ValidFiles As List(Of FileItem) = AllFiles.
Where(Function(w) goofyFilter.Contains((w.Name.ToLower)(0))).
Select(Function(s) New FileItem(s.FullName)).ToList()
Dim invalid As List(Of FileInfo)
invalid = AllFiles.Where(Function(w) Not ValidFiles.
Any(Function(a) w.FullName = a.FilePath)).ToList()
This is much the same as Sam's answer except with your file/mp3 usage. AllFiles has 809 items, ValidFiles has 274. The resulting invalid list is 535.
Now, let's speed it up 50-60x:
Same starting code for AllFiles and ValidFiles:
Dim FileItemValid = Function(s As String)
Dim valid As Boolean = False
For Each fi As FileItem In ValidFiles
If fi.FilePath = s Then
valid = True
Exit For
End If
Next
Return valid
End Function
invalid = AllFiles.Where(Function(w) FileItemValid(w.FullName) = False).ToList()
With a Stopwatch, the results are:
Where/Any count: 535, time: 572ms
FileItemValid count: 535, time: 9ms
You get similar results with a plain old For/Each loop that calls an IsValid function.
If you do not need other FileInfo, you could create your AllFiles as a list of the same structure as you are receiving so you can do property vs property compares, use Except and Contains:
AllFiles2 = Directory.EnumerateFiles("M:\Music", "*.mp3", IO.SearchOption.AllDirectories).
Select(Function(s) New FileItem(s)).ToList()
Now you can use Contains with middling results:
invalid2 = AllFiles2.Where(Function(w) Not ValidFiles.Contains(w)).ToList()
This also allows you to use Except which is simpler and faster:
invalid2 = AllFiles2.Except(ValidFiles).ToList()
Where/Contains count: 535, time: 74ms
Except count: 535, time: 3ms
Even if you need other items from FileInfo, you can easily fetch them given the filename.
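The answer does not show its FileItem class. For Contains and Except to compare two FileItem objects by path rather than by reference, the class needs value equality. The following is a minimal sketch of one way to do that; the class shape, the FilePath property, and the case-insensitive comparison are assumptions, not the answerer's actual code.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical FileItem; the answer does not show its definition.
Public Class FileItem
    Implements IEquatable(Of FileItem)

    Public ReadOnly Property FilePath As String

    Public Sub New(path As String)
        FilePath = If(path, String.Empty)
    End Sub

    ' Two items are equal when their paths match, ignoring case (an assumption).
    Public Overloads Function Equals(other As FileItem) As Boolean Implements IEquatable(Of FileItem).Equals
        Return other IsNot Nothing AndAlso
            String.Equals(FilePath, other.FilePath, StringComparison.OrdinalIgnoreCase)
    End Function

    Public Overrides Function Equals(obj As Object) As Boolean
        Return Equals(TryCast(obj, FileItem))
    End Function

    ' Must agree with Equals, because Except hashes the items.
    Public Overrides Function GetHashCode() As Integer
        Return StringComparer.OrdinalIgnoreCase.GetHashCode(FilePath)
    End Function
End Class

Module ExceptDemo
    Sub Main()
        Dim allFiles = New String() {"M:\Music\a.mp3", "M:\Music\b.mp4"}.
            Select(Function(p) New FileItem(p)).ToList()
        Dim validFiles = New List(Of FileItem) From {New FileItem("m:\music\A.MP3")}

        Dim invalid = allFiles.Except(validFiles).ToList()
        Console.WriteLine(invalid.Count)            ' 1
        Console.WriteLine(invalid(0).FilePath)      ' M:\Music\b.mp4
    End Sub
End Module
```

Without the Equals and GetHashCode overrides, two FileItem instances built from the same path would count as different objects, and Except would return every item.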

As others have noted, .Except() is a better approach but here is an answer to your question:
List<int> list1 = new List<int> { 1, 2, 3 };
List<int> list2 = new List<int> { 3, 4, 5 };
List<int> list3 = list1.Where(list1value => !list2.Contains(list1value)).ToList(); // 1, 2
Based on the comments, here is an example using different types. This query uses .Any():
List<Product> list1 = new List<Product> { ... };
List<Vendor> list2 = new List<Vendor> { ... };
List<Product> list3 = list1.Where(product => !list2.Any(vendor => product.VendorID == vendor.ID)).ToList();
// list3 will contain products with a vendorID that does not match the ID of any vendor in list2.
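Since the question is tagged vb.net, the same .Any() query can be sketched in VB.NET. Only the VendorID and ID members come from the answer above; the rest of the Product and Vendor classes, the Name property, and the function name are placeholders for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical shapes; only VendorID and ID appear in the answer.
Public Class Product
    Public Property Name As String
    Public Property VendorID As Integer
End Class

Public Class Vendor
    Public Property ID As Integer
End Class

Module AnyDemo
    ' Returns the products whose VendorID matches no vendor in the list.
    Public Function ProductsWithoutVendor(products As IEnumerable(Of Product),
                                          vendors As IEnumerable(Of Vendor)) As List(Of Product)
        Return products.
            Where(Function(p) Not vendors.Any(Function(v) v.ID = p.VendorID)).
            ToList()
    End Function

    Sub Main()
        Dim products = New List(Of Product) From {
            New Product With {.Name = "Widget", .VendorID = 1},
            New Product With {.Name = "Gadget", .VendorID = 9}
        }
        Dim vendors = New List(Of Vendor) From {New Vendor With {.ID = 1}}

        For Each p In ProductsWithoutVendor(products, vendors)
            Console.WriteLine(p.Name)   ' Gadget
        Next
    End Sub
End Module
```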

Simply use Except as CraigW suggested. You have to do some projections (select) to get it done.
Dim InvalidFiles as IO.FileInfo() = AllFiles.Select(Function(p) p.FullName).Except(ValidFiles.Select(Function(x) x.fleInfo.FullName)).Select(Function(fullName) New IO.FileInfo(fullName)).ToArray()
Note: This code is not really efficient and also not very readable but works.
But I would go for something like this:
Dim AllFiles As List(Of IO.FileInfo) = New IO.DirectoryInfo("C:\MyFiles").GetFiles("*.mp*").ToList
Dim ValidFiles As New List(Of fileStruct)
Dim InvalidFiles as New List(Of FileInfo)
For Each fileInfo As FileInfo In AllFiles
If fileInfo.Extension.Contains("mp3") Then
ValidFiles.Add(New fileStruct(prop1:=fileInfo.Name, path:=fileInfo.FullName))
Else
InvalidFiles.Add(fileInfo)
End If
Next
Simple, fast and readable.
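If the valid list already exists and the invalid files must be derived from it afterwards, another option (not taken from the answers above) is to put the valid paths into a HashSet(Of String) first and filter against it. This is a sketch under assumptions: the function name FilesNotIn is made up, and the case-insensitive comparison assumes Windows-style paths.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module InvalidFilesDemo
    ' Returns the files whose FullName is not among validPaths.
    ' Each lookup in the HashSet is a hash probe instead of a scan of the list.
    Public Function FilesNotIn(allFiles As IEnumerable(Of FileInfo),
                               validPaths As IEnumerable(Of String)) As List(Of FileInfo)
        Dim lookup As New HashSet(Of String)(validPaths, StringComparer.OrdinalIgnoreCase)
        Return allFiles.Where(Function(f) Not lookup.Contains(f.FullName)).ToList()
    End Function

    Sub Main()
        ' FileInfo objects can be created for files that do not exist.
        Dim folder As String = Path.GetTempPath()
        Dim allFiles = New List(Of FileInfo) From {
            New FileInfo(Path.Combine(folder, "a.mp3")),
            New FileInfo(Path.Combine(folder, "b.mp4"))
        }
        Dim validPaths = New String() {allFiles(0).FullName}

        For Each f In FilesNotIn(allFiles, validPaths)
            Console.WriteLine(f.Name)   ' b.mp4
        Next
    End Sub
End Module
```

With the types from the question, the call would look like FilesNotIn(AllFiles, ValidFiles.Select(Function(x) x.fleInfo.FullName)).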

Related

VB.NET Index out of Range exception related to text file

I have some code I have used many times over which has always worked great for me. The latest use, however, throws an exception under certain circumstances that I cannot seem to resolve. Here it is:
I read from a text file to an array, use it as a binding source for some of my controls (it autofills 3 controls based on the selection of a single control). I created a Student class with 4 properties (Name, ID, DOB and DOE). Here is the code I use:
Private Sub autoFill()
Dim rost As String = "Roster.txt"
Dim lines As List(Of String) = File.ReadAllLines(rost).ToList
Dim list As List(Of Student) = New List(Of Student)
For i As Integer = 0 To lines.Count - 1
Dim data As String() = lines(i).Split(":")
list.Add(New Student() With {
.StudentName = data(0),
.StudentID = data(1),
.StudentDOB = data(2),
.StudentDOE = data(3)
})
Next
StudentBindingSource.DataSource = list
End Sub
Now here is the problem. In the "For" loop when I set i to 0 to lines.count -1 it throws this error:
[Image: VB.NET exception screenshot]
However...If I change i to 1 instead of 0 it works OR if I take away data(2) and data(3) it works with i = 0. I would prefer to use 0 so that I can have a blank line in the combobox or "--choose--", etc. The only thing I have thought that might be useful is that my first row in the text file has nothing to split. Here is the line format of the text file:
Student Name ID# DOB DOE <-----This header row is NOT in the text file
Last Name, First Name : 0000000 : 01/01/2021 : 01/01/2021
I'm going to assume I'm missing something really simple here. Any guidance would be greatly appreciated! Thank you.
Before we get to the actual problem, let's re-work some things.
A better way to structure code, especially when working with data loading, is to have a method that accepts an input and returns a result. Additionally, calling ToList() or ToArray() is a very expensive operation for performance. Very often you can improve performance dramatically by working with a lower-level IEnumerable for as long as possible.
With those principles in mind, consider this code:
Private Function ReadStudentData(fileName As String) As IEnumerable(Of Student)
Dim lines As IEnumerable(Of String) = File.ReadLines(fileName)
Return lines.
Select(Function(line) line.Split(":")).
Select(Function(data)
Return New Student() With {
.StudentName = data(0),
.StudentID = data(1),
.StudentDOB = data(2),
.StudentDOE = data(3)
}
End Function)
End Function
Private Sub autoFill()
StudentBindingSource.DataSource = ReadStudentData("Roster.txt")
End Sub
Now on to the actual issue. The problem was not from looping through the list variable. The problem is the data array. At some point you have a line that doesn't have enough elements. This is common, for example, as the last line in a file.
There are many ways to address this. In some cases, the exception is already the appropriate result, because if you have bad data you really don't want to continue. In other cases you want to log the bad records, perhaps to a report you can easily review later. Or maybe you just want to ignore the error, or pre-filter for rows with the right number of columns. Here is an example of the last option:
Private Function ReadStudentData(fileName As String) As IEnumerable(Of Student)
Return File.ReadLines(fileName).
Select(Function(line) line.Split(":")).
Where(Function(data) data.Length = 4).
Select(Function(data)
Return New Student() With {
.StudentName = data(0),
.StudentID = data(1),
.StudentDOB = data(2),
.StudentDOE = data(3)
}
End Function)
End Function
Private Sub autoFill()
StudentBindingSource.DataSource = ReadStudentData("Roster.txt")
End Sub
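The "log the bad records" option mentioned above could be sketched as follows. This is an illustration, not the answerer's code: the Student class is a stand-in with the four properties used in the question, and the ParseStudents name, the rejected parameter, and the Trim calls (the sample line separates fields with " : ") are assumptions.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical Student with the four properties used in the question.
Public Class Student
    Public Property StudentName As String
    Public Property StudentID As String
    Public Property StudentDOB As String
    Public Property StudentDOE As String
End Class

Module RosterParser
    ' Parses "name : id : dob : doe" lines. Lines without exactly four
    ' fields are recorded in rejected as "line <number>: <text>".
    Public Function ParseStudents(lines As IEnumerable(Of String),
                                  rejected As List(Of String)) As List(Of Student)
        Dim students As New List(Of Student)
        Dim lineNumber As Integer = 0
        For Each line As String In lines
            lineNumber += 1
            Dim data As String() = line.Split(":"c)
            If data.Length <> 4 Then
                rejected.Add($"line {lineNumber}: {line}")
                Continue For
            End If
            students.Add(New Student With {
                .StudentName = data(0).Trim(),
                .StudentID = data(1).Trim(),
                .StudentDOB = data(2).Trim(),
                .StudentDOE = data(3).Trim()
            })
        Next
        Return students
    End Function

    Sub Main()
        Dim rejected As New List(Of String)
        Dim lines = New String() {"", "Doe, Jane : 0000000 : 01/01/2021 : 01/01/2021"}
        Dim students = ParseStudents(lines, rejected)
        Console.WriteLine(students.Count)   ' 1
        Console.WriteLine(rejected.Count)   ' 1
    End Sub
End Module
```

Here the blank first line is rejected rather than causing an exception, and the rejected list can be written to a log or shown to the user afterwards.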
The problem is that you didn't check 'data' to have enough elements to create the 'Student'. A simple check should fix it.
Private Sub autoFill()
Dim rost As String = "Roster.txt"
Dim lines As List(Of String) = File.ReadAllLines(rost).ToList
Dim list As List(Of Student) = New List(Of Student)
For i As Integer = 0 To lines.Count - 1
Dim data As String() = lines(i).Split(":"c)
'Check data
If data.Length >= 4 Then
list.Add(New Student() With {
.StudentName = data(0),
.StudentID = data(1),
.StudentDOB = data(2),
.StudentDOE = data(3)
})
End If
Next
StudentBindingSource.DataSource = list
End Sub
Try this code:
Dim list As List(Of Student) = New List(Of Student)(100)
Basically, this initializes the student list with a capacity. This is the capacity of the list, not the count/length.
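The difference between capacity and count can be checked with a small sketch like the one below. It only illustrates List(Of T) behavior: a capacity reserves storage for the list itself and does not change how many elements Split returns for a given line.

```vbnet
Imports System
Imports System.Collections.Generic

Module CapacityDemo
    Sub Main()
        Dim list As New List(Of String)(100)
        Console.WriteLine(list.Capacity)    ' 100
        Console.WriteLine(list.Count)       ' 0

        ' Capacity only reserves storage. Indexing an element that was never
        ' added still throws ArgumentOutOfRangeException.
        Try
            Dim first As String = list(0)
        Catch ex As ArgumentOutOfRangeException
            Console.WriteLine("list(0) is out of range")
        End Try
    End Sub
End Module
```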

Two List(Of String), when I add a value to one it gets added to the other

I'm getting the folders in two different directories and putting their names into two ListViews. If a folder name exists in only one directory, I would like to enter nothing into the other list, and vice versa; if the folder name is in both, I just want the name in both lists.
Here is the code I am using:
Private folders As New List(Of String), oFolders As New List(Of String), AllFoldrs As New List(Of DirectoryInfo), CurFdr As DirectoryInfo
Private Sub Get_LVItems()
folders.Clear() : oFolders.Clear() : CurLV.Items.Clear() : AllFoldrs.Clear() 'Empty folders List and CurrentLV items
Dim LV As ListView = CurLV, oLV As ListView = OtherLv
Dim Pth As String = CurTV.SelectedNode.Name, oPth As String = ""
Dim Dinfo As New DirectoryInfo(Pth), oDinfo As DirectoryInfo = Nothing
Dim TmpFoldrs As New List(Of DirectoryInfo), oTmpFoldrs As New List(Of DirectoryInfo)
If Not IsNothing(OtherTV.SelectedNode) Then
End If
If TVs_Syncd Then
oPth = OtherTV.SelectedNode.Name : oDinfo = New DirectoryInfo(oPth)
TmpFoldrs.AddRange(Dinfo.GetDirectories) : oTmpFoldrs.AddRange(oDinfo.GetDirectories)
AllFoldrs.AddRange(Dinfo.GetDirectories) : AllFoldrs.AddRange(oDinfo.GetDirectories)
AllFoldrs.Sort(AddressOf SrtAllFdrs)
Do While AllFoldrs.Count > 0
CurFdr = AllFoldrs(0)
Dim Found_fdr As DirectoryInfo = Nothing, oFound_fdr As DirectoryInfo = Nothing
For Each fdr As DirectoryInfo In TmpFoldrs
If fdr.Name = CurFdr.Name Then Found_fdr = fdr : Exit For
Next
If IsNothing(Found_fdr) Then folders.Add(Nothing) Else folders.Add(Found_fdr.FullName)
For Each ofdr As DirectoryInfo In oTmpFoldrs
If ofdr.Name = CurFdr.Name Then oFound_fdr = ofdr : Exit For
Next
If IsNothing(oFound_fdr) Then oFolders.Add(Nothing) Else oFolders.Add(oFound_fdr.FullName)
AllFoldrs.RemoveAll(AddressOf RemDirs)'After adding a folder to both collections (folders & oFolders) all instances of that folder get removed from AllFolders
Loop
LoadListView(oLV)
folders.Clear()
folders = oFolders
LoadListView(LV)
Else
folders.AddRange(Directory.GetDirectories(Pth))
LoadListView(LV)
End If
End Sub
Two functions: one for removing all instances of the folder name just processed, and the sort function for sorting AllFoldrs after adding all the folder names to it:
Public Function RemDirs(dir As DirectoryInfo) As Boolean
Return dir.Name = CurFdr.Name
End Function
Public Function SrtAllFdrs(ByVal X As DirectoryInfo, ByVal Y As DirectoryInfo) As Integer
Return X.Name.CompareTo(Y.Name)
End Function
I add both directories' folders to AllFoldrs and each directory's folders to its own Tmp list.
In the first For Each loop, I check whether the current folder name exists in the Tmp list. If it does, I put the value into Found_fdr; then, when I add it to the List(Of String) I am using to fill list A, it gets added to the other List(Of String) at the same time. It's baffling me!
This is much more to the point than my last question, which got no answers (that really didn't surprise me). Anyone? Please...
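A likely cause, judging only from the posted code: List(Of T) is a reference type, so the statement folders = oFolders does not copy any items. It makes both variables refer to the same list object, and from then on every Add or Clear through one name is visible through the other. The sketch below reproduces that effect in isolation, with made-up values, and shows a copy as one way to keep the lists separate.

```vbnet
Imports System
Imports System.Collections.Generic

Module AliasDemo
    Sub Main()
        Dim folders As New List(Of String)
        Dim oFolders As New List(Of String) From {"B"}

        folders = oFolders          ' both names now refer to ONE list
        folders.Add("A")
        Console.WriteLine(oFolders.Count)               ' 2
        Console.WriteLine(folders Is oFolders)          ' True

        ' Copying the elements keeps the two lists independent.
        folders = New List(Of String)(oFolders)
        folders.Add("C")
        Console.WriteLine(oFolders.Count)               ' 2
    End Sub
End Module
```

In the posted Get_LVItems, replacing folders = oFolders with a copy, or passing the list to load as an argument to LoadListView, would avoid sharing one list between the two fields. LoadListView is not shown in the question, so whether it can take such an argument is an assumption.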

Getfile with multiple extension filter and order by file name

I am working on a VB.NET desktop application. I need to get the files from a directory that have the extension .txt or .sql, and I also need those files ordered by folder name. How can I do both together?
Try
Dim s As String = Txtfolder.Text
Dim files As List(Of String) = New List(Of String)()
Try
For Each f As String In Directory.GetFiles(s, "*.*").Where(Function(f1) f1.EndsWith(".sql") OrElse f1.EndsWith(".txt")).OrderBy(Function(f) f.LastWriteTime).First()
files.Add(f)
Next
For Each d As String In Directory.GetDirectories(s)
files.AddRange(DirSearch(d))
Next
Catch excpt As System.Exception
MessageBox.Show(excpt.Message)
End Try
Private Function DirSearch(ByVal sDir As String) As List(Of String)
Dim files As List(Of String) = New List(Of String)()
Try
For Each f As String In Directory.GetFiles(sDir, "*.*").Where(Function(f1) f1.EndsWith(".sql") OrElse f1.EndsWith(".txt"))
files.Add(f)
Next
For Each d As String In Directory.GetDirectories(sDir)
files.AddRange(DirSearch(d))
Next
Catch excpt As System.Exception
MessageBox.Show(excpt.Message)
End Try
Return files
End Function
Here is an example of option 1 from my comment, i.e. get all file paths and filter yourself:
Dim folderPath = "folder path here"
Dim filePaths = Directory.GetFiles(folderPath).
Where(Function(s) {".txt", ".sql"}.Contains(Path.GetExtension(s))).
OrderBy(Function(s) Path.GetFileName(s)).
ToArray()
Here's an example of option 2, i.e. get paths by extension and combine:
Dim folderPath = "folder path here"
Dim filePaths = Directory.GetFiles(folderPath, "*.txt").
Concat(Directory.GetFiles(folderPath, "*.sql")).
OrderBy(Function(s) Path.GetFileName(s)).
ToArray()
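One caveat with filtering on the extension string is case: ".TXT" does not equal ".txt" in an ordinary Contains call. The sketch below is an assumption-laden variant of option 1, not the answerer's code. It uses a case-insensitive HashSet and sorts by containing folder and then by file name; the function name FilterAndSort and the sample paths are made up.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module ExtensionFilter
    ' Keeps the paths whose extension is in the set (case-insensitive),
    ' ordered by containing folder and then by file name.
    Public Function FilterAndSort(paths As IEnumerable(Of String),
                                  extensions As IEnumerable(Of String)) As List(Of String)
        Dim wanted As New HashSet(Of String)(extensions, StringComparer.OrdinalIgnoreCase)
        Return paths.
            Where(Function(p) wanted.Contains(Path.GetExtension(p))).
            OrderBy(Function(p) Path.GetDirectoryName(p), StringComparer.OrdinalIgnoreCase).
            ThenBy(Function(p) Path.GetFileName(p), StringComparer.OrdinalIgnoreCase).
            ToList()
    End Function

    Sub Main()
        ' In-memory sample paths; in real use they could come from
        ' Directory.EnumerateFiles(folderPath, "*.*", SearchOption.AllDirectories).
        Dim sample = New String() {
            Path.Combine("b", "z.TXT"),
            Path.Combine("a", "y.sql"),
            Path.Combine("a", "x.jpg")
        }
        For Each p In FilterAndSort(sample, New String() {".txt", ".sql"})
            Console.WriteLine(p)
        Next
    End Sub
End Module
```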
An alternative method, which allows searching multiple directories and filtering the results using multiple search patterns.
It returns an ordered List(Of String):
Private Function DirSearch(ByVal sDirList As String(), SearchPatter As String()) As List(Of String)
Return sDirList.SelectMany(
Function(dir) SearchPatter.SelectMany(
Function(filter)
Return Directory.GetFiles(dir, filter, SearchOption.AllDirectories)
End Function).OrderBy(Function(xDir) xDir)).ToList()
End Function
You can pass the method a list of paths and a list of extensions:
Dim SearchPaths As String() = New String() {"[Directory1]", "[Directory2]"}
Dim ItemSearchPattern As String() = New String() {"*.txt", "*.sql", "*.jpg"}
Dim DirListing As List(Of String) = DirSearch(SearchPaths, ItemSearchPattern)
Extract the content of a single directory with:
Dim FilesInDir As List(Of String) = DirListing.
Where(Function(entry) entry.ToUpper().
Contains("[DirectoryName]".ToUpper())).ToList()
This is a case insensitive filter. Remove (ToUpper()) for a case sensitive one.

Split() doesn't work properly

I'm doing a computing assessment and I've run into an issue with splitting a string. For some reason, when the string is split, the array stores the whole thing in Variable(0). The error occurs when it tries to assign TicketID(Index) a value: it says that the array index is out of bounds.
Here's the code:
Private Sub ReadInformation(ByRef TicketID() As String, CustomerID() As String, PurchaseMethod() As Char, NumberOfTickets() As Integer, FileName As String)
Dim Line, TextArray(3) As String
Dim Index As Integer
FileOpen(1, FileName, OpenMode.Input)
For Index = 0 To 499
Input(1, Line)
TextArray = Line.Split(",")
CustomerID(Index) = TextArray(0)
TicketID(Index) = TextArray(1)
NumberOfTickets(Index) = TextArray(2)
PurchaseMethod(Index) = TextArray(3)
MessageBox.Show(CustomerID(Index))
Next
FileClose()
End Sub
Here's the first 10 lines of the TextFile I'm trying to read:
C001,F3,10,S
C002,F3,2,O
C003,F3,3,S
C004,W2,9,S
C005,T3,10,S
C006,F3,2,S
C007,W1,3,O
C008,W3,1,O
C009,T2,2,S
C010,F2,9,O
Here's the Error Message I receive:
[Image: error message screenshot]
I would use Lists instead of arrays. That way you don't have to worry about the length of the arrays or whether the file has fewer than 500 lines. Of course, using the more advanced .NET Framework methods of the System.IO namespace (here, File.ReadLines) is a must.
Private Sub ReadInformation(TicketID As List(Of String), _
CustomerID As List(Of String), _
PurchaseMethod As List(Of Char), _
NumberOfTickets As List(Of Integer), _
FileName As String)
for each line in File.ReadLines(FileName)
Dim TextArray = Line.Split(","c)
if TextArray.Length > 3 Then
CustomerID.Add(TextArray(0))
TicketID.Add(TextArray(1))
' This line works just because you have Option Strict Off
' It should be changed as soon as possible
NumberOfTickets.Add(TextArray(2))
PurchaseMethod.Add(TextArray(3))
End If
Next
End Sub
You can call this version of your code declaring the 4 lists
Dim TicketID = New List(Of String)()
Dim CustomerID = New List(Of String)()
Dim PurchaseMethod = New List(Of Char)()
Dim NumberOfTickets = New List(Of Integer)()
ReadInformation(TicketID, CustomerID, PurchaseMethod, NumberOfTickets, FileName)
Another, more object-oriented approach is to create a class that represents a line of your data. Inside the loop you create instances of that class and add each instance to a single List.
Public Class CustomerData
Public Property TicketID As String
Public Property CustomerID As String
Public Property NumberOfTickets As Integer
Public Property PurchaseMethod As Char
End Class
Now the loop becomes
Private Function ReadInformation(FileName As String) as List(Of CustomerData)
Dim custData = New List(Of CustomerData)()
For Each line in File.ReadLines(FileName)
Dim TextArray = Line.Split(","c)
if TextArray.Length > 3 Then
Dim data = new CustomerData()
data.CustomerID = TextArray(0)
data.TicketID = TextArray(1)
data.NumberOfTickets = TextArray(2)
data.PurchaseMethod = TextArray(3)
custData.Add(data)
End If
Next
return custData
End Function
This version requires the declaration of just one list. You can call it by passing just the file name and receiving the result of the function:
Dim customers = ReadInformation(FileName)
For Each cust in customers
Console.WriteLine(cust.CustomerID)
...
Next
Or index it like an array
Dim theFirstCustomer = customers(0)
Console.WriteLine(theFirstCustomer.CustomerID)
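The comment in the first version notes that the implicit String-to-Integer conversion only compiles with Option Strict Off. A possible Option Strict On version of the per-line parsing is sketched below. CustomerData is the class from this answer; the ParseLine name, the Trim calls, and the choice to return Nothing for a malformed line are assumptions.

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic

Public Class CustomerData
    Public Property TicketID As String
    Public Property CustomerID As String
    Public Property NumberOfTickets As Integer
    Public Property PurchaseMethod As Char
End Class

Module TicketParser
    ' Returns Nothing when the line does not have four usable fields.
    Public Function ParseLine(line As String) As CustomerData
        Dim parts As String() = line.Split(","c)
        Dim count As Integer
        If parts.Length < 4 OrElse
           Not Integer.TryParse(parts(2), count) OrElse
           parts(3).Trim().Length = 0 Then
            Return Nothing
        End If
        Return New CustomerData With {
            .CustomerID = parts(0).Trim(),
            .TicketID = parts(1).Trim(),
            .NumberOfTickets = count,
            .PurchaseMethod = parts(3).Trim().Chars(0)
        }
    End Function

    Sub Main()
        Dim data = ParseLine("C001,F3,10,S")
        Console.WriteLine($"{data.CustomerID} {data.TicketID} {data.NumberOfTickets} {data.PurchaseMethod}")
        ' C001 F3 10 S
    End Sub
End Module
```

Integer.TryParse reports a bad number as False instead of throwing, so a malformed line can be skipped or logged by the caller.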

List all folders that are in any 3rd subdirectory from current

I would need to make an array list, displaying all folders that are in the 3rd subfolder from the current one.
Folder1/sub1folder/sub2folder/sub3folder
It has to be recursive. What I need is an array of strings that contains all the strings like the one above.
I do know how to look recursively into folders, but I do not know how to limit the search to the 3rd subfolder.
Thanks!
Here's my stab at it. I tested it and it works for me:
Dim resultList as List(Of String) = DirectorySearch(baseDirectoryPath, 0)
Function DirectorySearch(directoryPath As String, level As Integer) As List(Of String)
level += 1
Dim directories As String() = IO.Directory.GetDirectories(directoryPath)
Dim resultList As New List(Of String)
If level = 3 Then
For Each subDirectoryPath In directories
Dim result As String = GetFinalResult(subDirectoryPath)
resultList.Add(result)
Next
Else
For Each subDirectoryPath In directories
Dim partialResultList As List(Of String) = DirectorySearch(subDirectoryPath, level)
resultList.AddRange(partialResultList)
Next
End If
Return resultList
End Function
Private Function GetFinalResult(directoryPath As String) As String
Dim directoryInfo As New IO.DirectoryInfo(directoryPath)
Return String.Format("{0}/{1}/{2}/{3}",
directoryInfo.Parent.Parent.Parent.Name,
directoryInfo.Parent.Parent.Name,
directoryInfo.Parent.Name,
directoryInfo.Name)
End Function
If you had a recursive function which began at the current folder:
Public Function recurse(Optional depth As Integer = 0) As String()
Dim folderList As String() = {}
If (depth < 3) Then
' Not at the third level yet: go one level deeper
depth += 1
folderList = recurse(depth)
Else
'Do third subfolder analysis and set the output to folderList
End If
' Return on every path, not only from the Else branch
Return folderList
End Function
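The depth-counting idea can be made concrete with a sketch like the one below. It is not either answerer's code: the function name DirectoriesAtDepth is made up, it counts down instead of up, and it returns full paths rather than the "Folder1/sub1folder/..." strings from the question.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module DepthSearch
    ' Returns the full paths of the directories exactly `depth` levels
    ' below root (depth 1 = immediate subfolders).
    Public Function DirectoriesAtDepth(root As String, depth As Integer) As List(Of String)
        Dim result As New List(Of String)
        If depth < 1 Then Return result
        For Each subDir As String In Directory.EnumerateDirectories(root)
            If depth = 1 Then
                result.Add(subDir)
            Else
                result.AddRange(DirectoriesAtDepth(subDir, depth - 1))
            End If
        Next
        Return result
    End Function

    Sub Main()
        ' May throw UnauthorizedAccessException if a folder cannot be read.
        For Each folder In DirectoriesAtDepth(Directory.GetCurrentDirectory(), 3)
            Console.WriteLine(folder)
        Next
    End Sub
End Module
```

Because the recursion stops as soon as the requested depth is reached, folders deeper than the third level are never enumerated.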