libsvm predicting same value for all data set instances

I am using the libsvm C# wrapper for SVR to predict salary, but the predicted results are wrong: libsvm gives the same value for all the instances. I have performed a grid search for parameter selection. How can I solve this issue? Here is my code.
Problem train = Problem.Read(@"H:\test.csv");
Problem test = Problem.Read(@"H:\testsvmf1.csv");
//For this example (and indeed, many scenarios), the default
//parameters will suffice.
Parameter parameters = new Parameter();
double C;
double Gamma;
//This will do a grid optimization to find the best parameters
//and store them in C and Gamma, outputting the entire
//search to params.txt.
ParameterSelection.Grid(train, parameters, @"H:\params.txt", out C, out Gamma);
parameters.C = 512;
parameters.Gamma = 0.5;
parameters.SvmType = SvmType.NU_SVR;
double cv = Training.PerformCrossValidation(train,parameters,10);
Console.Write(cv);
//Train the model using the optimal parameters.
Model model = Training.Train(train, parameters);
//Perform classification on the test data, putting the
//results in results.txt.
Prediction.Predict(test, @"H:\resultsnew1.txt", model, false);
}
public static SvmType NU_SVR { get; set; }

It looks like you perform a grid search to tune your parameters, but then you set them manually to fixed values (C=512,Gamma=0.5). The fixed parameters are used for training...
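As a rough illustration of that point, here is a minimal sketch in plain Java (not the libsvm API and not the C# wrapper). The class GridSearchSketch, the method search and the toy error function are all hypothetical stand-ins for the cross-validation that a real grid search would run, and the power-of-two grid is only an assumed range. The part that matters is the last step: the selected pair is what gets passed on to training, instead of being replaced by constants.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical stand-in for a grid search; not part of libsvm or its C# wrapper.
final class GridSearchSketch {

    /** Returns {bestC, bestGamma}, the grid point with the lowest reported error. */
    static double[] search(DoubleBinaryOperator cvError) {
        double bestC = Double.NaN;
        double bestGamma = Double.NaN;
        double bestErr = Double.POSITIVE_INFINITY;
        // Assumed grid: C = 2^-5 .. 2^15 and gamma = 2^-15 .. 2^3, exponent step 2.
        for (int logC = -5; logC <= 15; logC += 2) {
            for (int logG = -15; logG <= 3; logG += 2) {
                double c = Math.pow(2, logC);
                double g = Math.pow(2, logG);
                double err = cvError.applyAsDouble(c, g);
                if (err < bestErr) {
                    bestErr = err;
                    bestC = c;
                    bestGamma = g;
                }
            }
        }
        return new double[] {bestC, bestGamma};
    }

    public static void main(String[] args) {
        // Toy error surface (assumption, for illustration only): lowest at C = 8, gamma = 0.125.
        DoubleBinaryOperator toyError = (c, g) -> Math.abs(c - 8) + Math.abs(g - 0.125);
        double[] best = search(toyError);
        // Train with best[0] and best[1]; do not overwrite them with fixed values afterwards.
        System.out.println("C=" + best[0] + " gamma=" + best[1]);
    }
}
```

In the question's code the equivalent would be assigning the C and Gamma returned by the grid search to the parameters object before training.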

Related

Is it possible to cast a Tensor<TInt32> in Java to a double and other native Java variable types?

I'm new to TensorFlow and I cannot seem to find how to do this despite extensive searching online. I want to load the TensorFlow model I have built into Java to set some variable values, which in this case, is a double. Is there a good way of going about this?
I have looked at the TensorFlow copyTo() function but it doesn't seem relevant. I have found no relevant search results when trying to do this casting as well.
Here is the code snippet of what I am trying to do:
try(SavedModelBundle b = SavedModelBundle.load("/somePath", "serve")) {
Session s = b.session();
Tensor<TInt32> x = TInt32.scalarOf(1);
Tensor<TInt32> y = TInt32.scalarOf(2);
Tensor<TInt32> result = (Tensor<TInt32>) s.runner().feed("x", x).feed("y", y).fetch("ans").run().get(0);
// I know this won't work but do whatever is needed to convert to a double
this.ExampleDouble = result;
}
First you need to retrieve the output tensor(s) with their expected type.
If your model is returning a 32-bit integer (as it seems to, given your example), then you should do the following:
// Also note that all resources should be protected by a try-with-resources block...
try (SavedModelBundle model = SavedModelBundle.load("/somePath", "serve");
     Tensor<TInt32> x = TInt32.scalarOf(1);
     Tensor<TInt32> y = TInt32.scalarOf(2)) {
    // Let's use the new functional API instead of invoking the session directly
    Map<String, Tensor<?>> inputTensors = new HashMap<String, Tensor<?>>() {{
        put("x", x);
        put("y", y);
    }};
    try (Tensor<TInt32> result = model.call(inputTensors).get("ans").expect(TInt32.DTYPE)) {
        this.ExampleDouble = (double) result.data().getInt();
    }
}
It is a bit surprising, though, that the datatype of the model output is a 32-bit integer if you know that it returns a double-precision floating-point value. Make sure to request the right datatype, as found in the model signature, or the cast will fail (if it is a double, then the expected datatype might be TFloat64.DTYPE).
You can look at the signatures of your model by calling model.signatures().

Graph traversal name to graph name mapping

Is there any API using which I can get graphTraversalName to graphName mapping defined in the script?
I am using the below messy code but it's error-prone if both graphs are using the same underlying storage.
Map<String, String> graphTraversalToNameMap = new ConcurrentHashMap<String, String>();
while (traversalSourceIterator.hasNext()) {
    String traversalSource = traversalSourceIterator.next();
    String currentGraphString = ((GraphTraversalSource) graphManager.getAsBindings().get(traversalSource)).getGraph().toString();
    graphNameTraversalMap.put(currentGraphString, traversalSource);
}
Iterator<String> graphNamesIterator = graphManager.getGraphNames().iterator();
while (graphNamesIterator.hasNext()) {
    String graphName = graphNamesIterator.next();
    String currentGraphString = graphManager.getGraph(graphName).toString();
    String traversalSource = graphNameTraversalMap.get(currentGraphString);
    graphTraversalToNameMap.put(traversalSource, graphName);
}
Does gremlinExecutor.getScriptEngineManager().getBindings().entrySet() provide order guarantee? I can iterate over this and populate my map
Is there any API using which I can get graphTraversalName to graphName mapping defined in the script?
No. They share the same namespace in Gremlin Server so the relationship gets lost programmatically. You would need to do something like what you are doing but I wouldn't rely on toString() of a Graph for equality. Perhaps use the Graph instance itself? Although that might not work either depending on your situation and what you want for equality as you could have two different Graph configurations pointed at the same data and want to resolve those as the same graph. I'm also not sure that any approach will work generally for all graph systems. Anyway, I think I'd experiment with using Map<Graph, String> graphTraversalToNameMap for your case and see how that goes.
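To make the difference concrete, here is a minimal sketch of keying on toString() versus keying on the instance. It assumes a hypothetical FakeGraph class in place of a real TinkerPop Graph, and IdentityHashMap is my own choice to force reference equality; neither is something Gremlin Server provides.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class GraphKeySketch {

    // Hypothetical stand-in for a Graph: two instances on the same storage print identically.
    static final class FakeGraph {
        private final String backend;

        FakeGraph(String backend) {
            this.backend = backend;
        }

        @Override
        public String toString() {
            return "fakegraph[" + backend + "]";
        }
    }

    /** Number of distinct keys when graphs are keyed by their toString() value. */
    static int countByString(FakeGraph... graphs) {
        Map<String, String> byString = new HashMap<>();
        for (FakeGraph g : graphs) {
            byString.put(g.toString(), "name");
        }
        return byString.size();
    }

    /** Number of distinct keys when graphs are keyed by instance identity. */
    static int countByInstance(FakeGraph... graphs) {
        Map<FakeGraph, String> byInstance = new IdentityHashMap<>();
        for (FakeGraph g : graphs) {
            byInstance.put(g, "name");
        }
        return byInstance.size();
    }

    public static void main(String[] args) {
        FakeGraph a = new FakeGraph("shared-storage");
        FakeGraph b = new FakeGraph("shared-storage"); // same storage, different instance
        System.out.println(countByString(a, b));   // prints 1: the second entry overwrites the first
        System.out.println(countByInstance(a, b)); // prints 2: both instances are kept
    }
}
```

Whether you want one entry or two for graphs that share storage is exactly the equality question raised above, so pick the key type accordingly.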
Does gremlinExecutor.getScriptEngineManager().getBindings().entrySet() provide order guarantee?
No, as it is backed by a ConcurrentHashMap. You would have to provide your own order.
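One way to provide your own order is to copy the names out and sort them before iterating. This is only a sketch: OrderedBindingsSketch and sortedNames are hypothetical names, and a plain ConcurrentHashMap stands in for the server bindings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class OrderedBindingsSketch {

    /** Returns the binding names sorted alphabetically, independent of the map's iteration order. */
    static List<String> sortedNames(Map<String, Object> bindings) {
        List<String> names = new ArrayList<>(bindings.keySet());
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Stand-in for the bindings map; the keys and values are made up.
        Map<String, Object> bindings = new ConcurrentHashMap<>();
        bindings.put("g2", "second traversal source");
        bindings.put("g", "default traversal source");
        bindings.put("g1", "first traversal source");
        System.out.println(sortedNames(bindings)); // prints [g, g1, g2]
    }
}
```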
Underlying storage details can be obtained from the configuration object and can be used for the mapping. Sample code:
public class GraphTraversalMappingUtil {
    public static void populateGraphTraversalToNameMapping(GraphManager graphManager) {
        if (graphTraversalToNameMap.size() != 0) {
            return;
        }
        Iterator<String> traversalSourceIterator = graphManager.getTraversalSourceNames().iterator();
        Map<StorageBackendKey, String> storageKeyToTraversalMap = new HashMap<StorageBackendKey, String>();
        while (traversalSourceIterator.hasNext()) {
            String traversalSource = traversalSourceIterator.next();
            StorageBackendKey key = new StorageBackendKey(
                    graphManager.getTraversalSource(traversalSource).getGraph().configuration());
            storageKeyToTraversalMap.put(key, traversalSource);
        }
        Iterator<String> graphNamesIterator = graphManager.getGraphNames().iterator();
        while (graphNamesIterator.hasNext()) {
            String graphName = graphNamesIterator.next();
            StorageBackendKey key = new StorageBackendKey(
                    graphManager.getGraph(graphName).configuration());
            graphTraversalToNameMap.put(storageKeyToTraversalMap.get(key), graphName);
        }
    }
}
For the full code, see https://pastebin.com/7m8hi53p

How to enable parallelism for a custom U-SQL Extractor

I’m implementing a custom U-SQL Extractor for our internal file format (binary serialization). It works well in the "Atomic" mode:
[SqlUserDefinedExtractor(AtomicFileProcessing = true)]
public class BinaryExtractor : IExtractor
If I switch off the “Atomic” mode, it looks like U-SQL splits the file at an arbitrary position (I guess just into 250 MB chunks). This is not acceptable for me. The file format has a special row delimiter. Can I define a custom row delimiter in my Extractor and enable parallelism for it? Technically, I can change our row delimiter to a new one if that helps.
Could anyone help me with this question?
The file is indeed split into chunks (I think it is 1 GB at the moment, but the exact value is implementation defined and may change for performance reasons).
If the file is indeed row delimited, and assuming your raw input data for the row is less than 4MB, you can use the input.Split() function inside your UDO to do the splitting into rows. The call will automatically handle the case if the raw input data spans the chunk boundary (assuming it is less than 4MB).
Here is an example:
public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow outputrow)
{
    // this._row_delim = this._encoding.GetBytes(row_delim); in class ctor
    foreach (Stream current in input.Split(this._row_delim))
    {
        using (StreamReader streamReader = new StreamReader(current, this._encoding))
        {
            int num = 0;
            string[] array = streamReader.ReadToEnd().Split(new string[]{this._col_delim}, StringSplitOptions.None);
            for (int i = 0; i < array.Length; i++)
            {
                // DO YOUR PROCESSING
            }
        }
        yield return outputrow.AsReadOnly();
    }
}
Please note that you cannot read across chunk boundaries yourself and you should make sure your data is indeed splittable into rows.

How to iterate multiple times over data during PIG store function

I wonder if it is possible to write a user-defined store function for Pig that iterates twice over the data / input tuples.
I read here http://pig.apache.org/docs/r0.7.0/udf.html#Store+Functions how to write your own store function, e.g. by implementing your own "getNext()" method.
For my use case, however, it is necessary to see every tuple twice in the "getNext()" method, so I wonder whether there is a way to do that, for example by resetting the reader somehow or by overriding some other method...
Additional information: I am looking for a way to iterate from tuple 1 to tuple n and then again from 1 to n.
Does anyone have an idea how to do something like that?
Thanks!
Sebastian
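This is off the top of my head, but you could try something like this:
import java.io.IOException;
import org.apache.pig.builtin.PigStorage;
import org.apache.pig.data.Tuple;
class MyStorage extends PigStorage {
    private int counter = 0;
    private Tuple cachedTuple = null;
    // Returns each tuple twice in a row (1, 1, 2, 2, ...): a fresh tuple is
    // read on even calls and the cached one is repeated on odd calls.
    @Override
    public Tuple getNext() throws IOException {
        if (this.counter++ % 2 == 0) {
            this.cachedTuple = super.getNext();
        }
        return this.cachedTuple;
    }
}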
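Note that the snippet above hands out each tuple twice in a row, not 1 to n followed by 1 to n again. If you really need the second ordering, one option is to buffer the first pass and replay it afterwards. Below is a minimal plain-Java sketch of that idea. TwoPassIterator is a hypothetical helper and not part of Pig, it works on any Iterator rather than on Pig's reader, and it assumes the data seen by one task fits in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of Pig: yields every element of the source twice,
// first as a normal pass (1..n), then as a replay (1..n) from an in-memory buffer.
final class TwoPassIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final List<T> buffer = new ArrayList<>();
    private int replayIndex = 0;

    TwoPassIterator(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasNext() || replayIndex < buffer.size();
    }

    @Override
    public T next() {
        if (source.hasNext()) {
            // First pass: hand the element out and remember it for the replay.
            T item = source.next();
            buffer.add(item);
            return item;
        }
        if (replayIndex < buffer.size()) {
            // Second pass: replay the buffered elements in their original order.
            return buffer.get(replayIndex++);
        }
        throw new NoSuchElementException();
    }

    /** Convenience method: collects both passes over the input into one list. */
    static <T> List<T> twice(Iterable<T> input) {
        List<T> out = new ArrayList<>();
        Iterator<T> it = new TwoPassIterator<>(input.iterator());
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(twice(Arrays.asList("a", "b", "c"))); // prints [a, b, c, a, b, c]
    }
}
```

The trade-off is memory: everything from the first pass is held until the replay finishes.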

How to customize the labels of an Infragistics Ultrachart?

I am trying to customize the series labels of the X axis of a linear ultrachart using vb.net.
I looked into the documentation from Infragistics and found that I could use this code:
UltraChart.Axis.Y.Labels.SeriesLabels.FormatString = "<Item_Label>"
A description of the types of labels available can be seen here.
However, I'm not getting the result I expected. I get "Row#1" and I want to get only the "1".
I've tried the approach used in the first reply of this post in the Infragistics forums, which consists of using a hashtable with the customized labels. The code used there is the following (in C#):
Hashtable labelHash = new Hashtable();
labelHash.Add("CUSTOM", new MyLabelRenderer());
ultraChart1.LabelHash = labelHash;
xAxis.Labels.ItemFormatString = "<CUSTOM>";
public class MyLabelRenderer : IRenderLabel
{
    public string ToString(Hashtable context)
    {
        string label = (string)context["ITEM_LABEL"];
        int row = (int)context["DATA_ROW"];
        int col = (int)context["DATA_COLUMN"];
        //use row, col, current label's text or other info inside the context to get the axis label.
        //the string returned here will replace the current label.
        return label;
    }
}
This approach didn't work either.
I am using Infragistics NetAdvantage 2011.1.
Does anyone have any idea how to customize these labels in order to obtain only the number after "Row#"?
There are different approaches to solving this task. One possible solution is to handle the FillSceneGraph event. That way you can get the Text primitives and modify them. For example:
private void ultraChart1_FillSceneGraph(object sender, Infragistics.UltraChart.Shared.Events.FillSceneGraphEventArgs e)
{
    foreach (Primitive pr in e.SceneGraph)
    {
        if (pr is Text && ((Text)pr).labelStyle.Orientation == TextOrientation.VerticalLeftFacing)
        {
            pr.PE.Fill = Color.Red;
            ((Text)pr).SetTextString("My custom labels");
        }
    }
}
OK. I'll try to explain the FormatString property in more depth.
When you use this property, you can determine which information is shown (for example: item values, data values or series values). Of course, there is also the option to use your own custom format string.
For example:
axisX2.Labels.ItemFormat=AxisItemLabelFormat.Custom;
axisX2.Labels.ItemFormatString ="";
In this case the labels represent dates on the X axis, so with these two properties you can determine the date format (for example dd/MM/yyyy or MM/ddd/yy). In your scenario you have string values on your X axis. If you are not able to modify these string values at a lower level (for example in your database, through the TableAdapter's SQL query, or in the DataSet, i.e. before you set the DataSource of our UltraChart), then you can use the FillSceneGraph event and modify the Text primitives. You can find more details about this event at http://help.infragistics.com/Help/NetAdvantage/WinForms/2013.1/CLR4.0/html/Chart_Modify_Scene_Graph_Using_FillSceneGraph_Event.html If you need a sample or additional assistance, please do not hesitate to create a new forum thread on our web site - http://www.infragistics.com/community/forums/
I'll be glad to help you.