Unknown number of columns to create PredictionEngine in ML.NET - vb.net

I have been writing a program to create custom Models for Machine Learning.
The user reads a .csv or .txt file and the data is passed to a DataGridView.
The DataGridView contains the columns "A", "B" and "C" (plus a "Label" column, which holds the result). I build the model by putting those column names in a List(Of String) and concatenating them like this (lista2 holds "A", "B", "C"; the last item of Lista is "Label"):
Dim estimator = mlContext.Transforms.Concatenate("Features", lista2.ToArray) _
.Append(mlContext.BinaryClassification.Trainers.FastTree(labelColumnName:=Lista(Lista.Count - 1), featureColumnName:="Features"))
After that, I have 2 classes named NewData and NewDataPrediction with the following properties:
Imports Microsoft.ML.Data

Public Class NewData
    <LoadColumn(0)>
    Public A As Single
    <LoadColumn(1)>
    Public B As Single
    <LoadColumn(2)>
    Public C As Single
    <LoadColumn(3)>
    Public Label As Boolean
End Class

Public Class NewDataPrediction
    Inherits NewData
    <ColumnName("PredictedLabel")>
    Public Property Prediction As Boolean
    Public Property Probability As Single
    Public Property Score As Single
End Class
After this, I create the PredictionEngine:
Dim predictionFunction As PredictionEngine(Of NewData, NewDataPrediction) = mlContext.Model.CreatePredictionEngine(Of NewData, NewDataPrediction)(model)
The model is created without any problem, and if I assign some values it predicts everything correctly.
The problem is this: because the program reads a text file, the data can contain a different number of columns with different names. How can I create the PredictionEngine with a NewData class that holds all of the columns (in an array, a list or something similar, with the columns in that list read as if they were properties of the class), given that I can't create a class with properties at runtime? How should I approach this problem? Thanks

Related

Replacing an object by a deserialized version of it, and preserving references

Say I have an object of my custom class, called AppSettings, which has various properties that hold both value types (integers, doubles, strings, etc.) and reference types (arrays, other custom objects, etc.). Some of these custom objects have their own custom objects, so the path down to some of the value type properties can go very deep.
For example:
<Serializable()>
Public Class AppSettings
    Public Property windowHeight As Integer = 600
    Public Property windowWidth As Integer = 800
    Public Property defaultLengthUnit As Unit = Units.meters
    Public Property defaultAngleUnit As Unit = Units.degrees
End Class
Where Unit class is defined as:
<Serializable()>
Public Class Unit
    Public Property Name As String
    Public Property Abbreviation As String
    Public Property Scale As Double
End Class
And Units module is defined as:
Public Module Units
    Public meters As New Unit With {
        .Name = "Meters",
        .Abbreviation = "m.",
        .Scale = 1
    }
    Public degrees As New Unit With {
        .Name = "Degrees",
        .Abbreviation = "°",
        .Scale = 1
    }
End Module
Some other code might refer or bind to some of the reference type properties, or their internal properties. Now, let's say I provide a way for the user to save current state of AppSettings by serializing it into XML:
Public Sub SerializeAppSettings(ByVal filename As String)
    Using sw As StreamWriter = New StreamWriter(filename)
        Dim xmls As XmlSerializer = New XmlSerializer(GetType(AppSettings))
        xmls.Serialize(sw, appSettings)
    End Using
End Sub
and then load them back (by deserializing) at any time while running the application:
Public Function DeserializeAppSettings(ByVal filename As String) As AppSettings
    If Not File.Exists(filename) Then Return Nothing
    Using sr As StreamReader = New StreamReader(filename)
        Dim xmls As XmlSerializer = New XmlSerializer(GetType(AppSettings))
        Return TryCast(xmls.Deserialize(sr), AppSettings)
    End Using
End Function
It is called like so:
AppSettings = DeserializeAppSettings(settingsFilePath)
The problem here is that all the references to AppSettings held by other objects and bindings are now broken, because deserialization replaces the old instance of AppSettings with a completely new instance, and the references are not transferred to it.
It appears that this doesn't break references to value-type properties (like windowHeight, which is Integer), but it definitely breaks references to reference-type properties, like defaultLengthUnit. So for example, if some other object or WPF control is referring/binding to, say, AppSettings.defaultLengthUnit.scaleToBaseUnit, it doesn't work anymore.
I wonder, how can I fix this, so that deserialization would replace the old instance of AppSettings and transfer all the references from it to the new instance that it generated?
As I understand it, there are three ways to go about it:
Replace the old instance with a new one in the exact same memory allocation, with the same internal ID, which would probably be too hacky, and I'm not sure it is possible at all.
Another way would be for the DeserializeAppSettings function to overwrite each property value of the current AppSettings instance, one by one, with the deserialized values. However, since some properties of AppSettings are objects, which have their own objects, which have their own objects (and so on), I would basically need to type out the whole hierarchy tree in that DeserializeAppSettings function to get down to the value-type properties. And every time I add or remove a property in the AppSettings class (or in any class that is used in its properties), I would also need to manually update the parsing code in the DeserializeAppSettings function. This is seriously unmaintainable.
Lastly, it would probably be possible to automate this value replacement through reflection, but reflection is very slow, and generally discouraged if there is any other option.
I hope I am missing something obvious here. Any suggestions on how to transfer all the references to AppSettings when the old instance of it is replaced with a new one through deserialization?
EDIT: Updated the code to include all the relevant classes.

Is it possible to create a multi dimensional array with varying types?

I have 3 controls on the same line: a label, a listbox and a textbox. On a different line I have 3 more controls of the same types (a label, a listbox and a textbox). I want to put them into a two-dimensional array (two rows of three controls) like this:
Dim multiArray() As Object = { {Lane1Label, ListBox1, TextBox1},
{Lane2Label, ListBox2, TextBox2} }
But it's not letting me do this. Is there a way?
Make a class (maybe even a custom or user control) to contain each line of controls:
Public Class ControlLine
    Public Property Lane As Label
    Public Property List As ListBox
    Public Property Text As TextBox
End Class
Then create a single dimensional array of these objects (or usually even better: a List(Of ControlLine) ) and put your items in here:
Dim lines() As ControlLine = {
    New ControlLine With { .Lane = Lane1Label, .List = ListBox1, .Text = TextBox1},
    New ControlLine With { .Lane = Lane2Label, .List = ListBox2, .Text = TextBox2}
}
This is much better because the items in your array remain strongly typed, which gives you compile-time checking and IDE support such as IntelliSense. Recent versions of Visual Studio can also accomplish this via Tuples.
And again, also consider abstracting this further into custom or user controls, where you can create a whole set with one simple constructor call, place one control on the form and have the whole set line up properly, and even think about data binding these ControlLines to a container like a FlowLayoutPanel instead of managing all the controls and arrays and placement yourself.
Just do this:
Dim multiArray(,) As Object = _
{ _
    {Lane1Label, ListBox1, TextBox1}, _
    {Lane2Label, ListBox2, TextBox2} _
}
Note the comma in (,) in the declaration: it is what makes this a two-dimensional array.

Arrange list of class type from uncollated to collated manner

I have a list of class type with class structure as below:
Class MyClass
    Public Property ID As Integer
    Public Property Details As Object
End Class
Now, my list contains items of MyClass type in following manner:
Item1(ID=1,Copy1)
Item2(ID=1,copy2)
Item3(ID=1,copy3)
Item4(ID=2,Copy1)
Item5(ID=2,copy2)
Item6(ID=2,copy3)
Item7(ID=3,Copy1)
Item8(ID=3,copy2)
Item9(ID=3,copy3)
I want to rearrange the same list in a collated manner, as below:
Item1(ID=1,Copy1)
Item2(ID=2,Copy1)
Item3(ID=3,Copy1)
Item4(ID=1,copy2)
Item5(ID=2,copy2)
Item6(ID=3,copy2)
Item7(ID=1,copy3)
Item8(ID=2,copy3)
Item9(ID=3,copy3)
My list contains thousands of entries in memory, so looping through it to do this does not seem advisable.
Is there an efficient way to do this?
I achieved this using the following LINQ GroupBy:
Dim oMyClassList As New List(Of MyClass)
For iCopySequence As Integer = 0 To iNoOfCopies - 1 Step 1
    oMyClassList.AddRange(oMyClassOriginalList.GroupBy(Function(v) v.ID).Select(Function(v) v(iCopySequence)).ToList)
Next
oMyClassOriginalList = oMyClassList

Combining duplicate objects(POJO) within an ArrayList and generating a third and new Object Java

I have an ArrayList that will contain duplicate objects. A duplicate is identified by its "firstId" field. There is one other field, but it is not used when comparing objects for equality. Once I find a duplicate, I need to combine the info from both objects and populate a different object with 3 fields.
This is what I would like to do
class ClassA
{
    private String firstId;
    private String secondId;
    // equals() and hashCode() overrides based on firstId only
    // getters and setters omitted
}
class ClassB
{
    private String firstId;
    private String secondId;
    private String thirdId;
    // getters and setters omitted
}
ClassA cls1A = new ClassA();
ClassA cls2A = new ClassA();
ClassB clsB = new ClassB();
if (cls1A.equals(cls2A))
{
    clsB.setFirstId(cls1A.getFirstId());
    clsB.setSecondId(cls1A.getSecondId());
    clsB.setThirdId(cls2A.getSecondId());
}
I have written quite a bit of code to identify the duplicates (basically adding each object to a HashSet, then reading back from both the ArrayList and the HashSet, matching each one and trying to create the third object). But there are a few bugs, and it also makes my program very slow. Is there a better solution to this?
You really want an easy way to group objects with the same firstId value together. Two possible approaches:
Sort your ArrayList (using a Comparator that looks only at firstId), then iterate over the sorted list, keeping a reference to the last item you found. If the last item and the current item match on firstId, create a new ClassB.
Instead of a HashSet, use a HashMap with String keys and Set values. Iterate over the ArrayList, storing in the map a mapping from each firstId value to all the secondId values you find with that firstId.
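The second approach could look roughly like the sketch below. It is only an illustration under some assumptions: the ClassA and ClassB here are simplified stand-ins for the question's classes, DuplicateMerger and merge are made-up names, a List is used as the map value instead of a Set so that the order of the secondId values is preserved, and a firstId is assumed to occur at most twice (any further occurrences are ignored).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-in for the question's ClassA.
class ClassA {
    final String firstId;
    final String secondId;

    ClassA(String firstId, String secondId) {
        this.firstId = firstId;
        this.secondId = secondId;
    }
}

// Simplified, hypothetical stand-in for the question's ClassB.
class ClassB {
    final String firstId;
    final String secondId;
    final String thirdId;

    ClassB(String firstId, String secondId, String thirdId) {
        this.firstId = firstId;
        this.secondId = secondId;
        this.thirdId = thirdId;
    }

    @Override
    public String toString() {
        return "ClassB(" + firstId + ", " + secondId + ", " + thirdId + ")";
    }
}

class DuplicateMerger {
    // One pass to group secondId values by firstId, one pass to build the results.
    static List<ClassB> merge(List<ClassA> items) {
        // LinkedHashMap keeps the firstId values in first-seen order.
        Map<String, List<String>> secondIdsByFirstId = new LinkedHashMap<>();
        for (ClassA a : items) {
            secondIdsByFirstId
                .computeIfAbsent(a.firstId, key -> new ArrayList<>())
                .add(a.secondId);
        }

        List<ClassB> merged = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : secondIdsByFirstId.entrySet()) {
            List<String> secondIds = entry.getValue();
            // Only a firstId that appeared at least twice is a duplicate.
            if (secondIds.size() >= 2) {
                merged.add(new ClassB(entry.getKey(), secondIds.get(0), secondIds.get(1)));
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<ClassA> input = Arrays.asList(
            new ClassA("x", "1"),
            new ClassA("y", "2"),
            new ClassA("x", "3"));
        System.out.println(merge(input)); // prints [ClassB(x, 1, 3)]
    }
}
```

Because each element is visited once and map lookups take constant time on average, this avoids comparing every object against every other one, which is likely where the slowness in the HashSet attempt comes from.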

Weird VB.NET array-property situation

I have this weird situation.
I have these two classes:
Public Class Entry
End Class
Public Class Core
End Class
One of the properties of the Core class will be an array of Entry objects. How can I declare it?
Now, the only way to change (add/remove) this array from outside should be using two functions - AddEntry(Ent As Entry) and RemoveEntry(ID As String). Note that here, whoever is calling the AddEntry function should only be bothered with creating an Entry object and passing it. It will be added to the existing array.
But, the Entry array should be accessible like this from outside, for looping through and printing or whatever like this:
' core1 is a valid Core object
For Each Ent As Entry In core1.Entries
    MsgBox(Ent.SomeProperty)
Next Ent
Is it possible to expose the Array as a property but restrict modification through functions alone? I know that the logic inside the Add and Remove functions can be inside the setter or getter, but the person wanting to add should pass only a single Entry object.
It is like saying: you have read-only access to the array, but to modify it, just create an object and pass it in, or pass the ID to remove an entry. Don't bother with the entire array.
I hope I am making sense.
Why do you want to expose it as an array?
What I would do is use a List internally to store the entries. (That List would be private.)
Create the necessary public methods (AddEntry / RemoveEntry / ...), which manipulate the private list.
Then, create a public property which exposes the List, but in a read-only fashion. That is, the property should return a ReadOnlyCollection instance.
Like this:
(I know it is in C#, but that's my 'main language' and I'm a bit too lazy to convert it to VB.NET)
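using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Core
{
    private List<Entry> _entries = new List<Entry>();

    public void AddEntry(Entry entry)
    {
        _entries.Add(entry);
    }

    // Read-only wrapper: callers can enumerate but not add or remove.
    public ReadOnlyCollection<Entry> Entries
    {
        get { return _entries.AsReadOnly(); }
    }
}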
EDIT: VB.Net version provided by MarkJ
Imports System.Collections.ObjectModel

Public Class Core
    Private _entries As New List(Of Entry)

    Public Sub AddEntry(newEntry As Entry)
        _entries.Add(newEntry)
    End Sub

    ' Read-only wrapper: callers can enumerate but not add or remove.
    Public ReadOnly Property Entries() As ReadOnlyCollection(Of Entry)
        Get
            Return _entries.AsReadOnly
        End Get
    End Property
End Class
Create a private field for the array and then create your accessing methods to work with the array internally. To let callers enumerate the array, expose a property of type IEnumerable(Of T).
This approach is not foolproof, however: a clever caller could simply cast the IEnumerable(Of T) back to an array and modify it, so it may be necessary to create a copy of the array and return that as the IEnumerable(Of T). Copying on every access has an obvious performance penalty. This is just one of many issues that can arise when arrays are used as underlying data structures.
You can keep the List private and instead return an IEnumerable. Code generated via Reflector - I hope it's readable:
Public Class Core
    ' Named _entries because VB.NET is case-insensitive, so a field called
    ' "entries" would clash with the Entries property.
    Private _entries As New List(Of Entry)

    Public Sub AddEntry(ByVal Ent As Entry)
        _entries.Add(Ent)
    End Sub

    Public Sub RemoveEntry(ByVal ID As String)
        ' Remove every entry whose Id matches.
        _entries.RemoveAll(Function(e) e.Id = ID)
    End Sub

    Public ReadOnly Property Entries As IEnumerable(Of Entry)
        Get
            Return _entries
        End Get
    End Property
End Class
Note: I'd recommend using a List<Entry> instead of an array if you'll be adding and removing objects - or perhaps even a Dictionary<string, Entry> given the way you are using it.