Calculate statistical distributions from a column in a datatable - vb.net

I don't know whether VB.Net has a built-in method for calculating the frequency distribution of an array of values, like the FREQUENCY() formula in Excel. If not, what is the easiest and fastest way of doing the same thing?
For example, I have a DataTable with my values in a column called "Cement Deviation":
Column Deviation
0
14
11
2
6
1
16
14
5
21
The bands for which I want to know the frequency of these values are:
From minValue To -50 by Step of 10
From -50 To -10 by Step of 5
From -10 To -5 by Step of 1
From -5 To -1 by Step of 0.5
From -1 To -0.5 by Step of 0.1
From -0.5 To -0.1 by Step of 0.05
From -0.1 To 0.1 by Step of 0.01
From 0.1 To 0.5 by Step of 0.05
From 0.5 To 1 by Step of 0.1
From 1 To 5 by Step of 0.5
From 5 To 10 by Step of 1
From 10 To 50 by Step of 5
From 50 To maxValue by Step of 10
Can someone help me with it?
Thanks

I don't know how you want to calculate it, since my experience with statistical distributions is limited and you haven't described the method you have in mind.
However, this at least compiles:
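Dim stat(2) As Integer
For Each row As DataRow In gridView.Rows
    ' Field(Of T) returns a strongly typed value, so this compiles with Option Strict On
    Dim cementDeviation = row.Field(Of Int32)("Cement Deviation")
    Select Case cementDeviation
        Case 0 To 10
            stat(0) += 1
        Case 11 To 20 ' starts at 11 so that 10 belongs to exactly one range
            stat(1) += 1
    End Select
Next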
In general there's nothing wrong with looping over the DataRows to calculate the values. But you should set Option Strict to On; your original code would then not compile, because row("Cement Deviation") is an Object, not an Integer. The benefit is that you are forced to use the correct types, which prevents nasty runtime errors.
Edit: Here is an example of how you could use dynamic ranges and count each class with LINQ. I have used a DataTable to store the min and max values, but you could also use a different in-memory collection such as a List(Of CustomClass) or, even better, the database.
You could also simply loop over the table, but you wanted to see a different approach. I like LINQ since it can reduce complexity and increase readability.
The range table with sample data:
Dim rangeTable = New DataTable()
rangeTable.Columns.Add("Min", GetType(Int32))
rangeTable.Columns.Add("Max", GetType(Int32))
For i = 0 To 90 Step 10
    rangeTable.Rows.Add(i, i + 10)
Next
A single LINQ query that counts the occurrences for every range, ordered by count descending:
' Each range is half-open (min <= value < max), so a value on a shared boundary is counted only once
Dim stats =
    From rangeRow In rangeTable
    Let min = rangeRow.Field(Of Int32)("Min")
    Let max = rangeRow.Field(Of Int32)("Max")
    Select StatsInfo = New With {
        .Min = min, .Max = max,
        .Count = (From devRow In devTable
                  Let cementDeviation = devRow.Field(Of Int32)("Cement Deviation")
                  Where cementDeviation >= min AndAlso cementDeviation < max).Count()
    }
    Order By StatsInfo.Count Descending
Output the result:
For Each stat In stats
    Console.WriteLine("Min: {0} Max: {1} Count: {2}", stat.Min, stat.Max, stat.Count)
Next
Note that I've renamed your DataTable to devTable, since gridView is not a good name for a table.
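For readers who want the Excel FREQUENCY() behaviour with bands of varying step size, the idea can be sketched outside VB.NET as well. The following Python sketch is only an illustration under assumptions: the helper names `band_edges` and `frequency` and the two example bands are hypothetical, and the bin convention (a value belongs to the first edge that is greater than or equal to it, with one extra slot for values above the last edge) is the one Excel uses, not necessarily the one the asker needs.

```python
from bisect import bisect_left

def band_edges(bands):
    """Expand (start, stop, step) bands into one sorted list of bin edges."""
    edges = []
    for start, stop, step in bands:
        steps = round((stop - start) / step)
        for k in range(steps + 1):
            # start + k * step avoids accumulating floating-point error
            edge = round(start + k * step, 10)
            # skip the edge shared with the previous band
            if not edges or edge > edges[-1]:
                edges.append(edge)
    return edges

def frequency(values, edges):
    """Count values per bin, following Excel's FREQUENCY convention.

    counts[i] holds the values v with edges[i-1] < v <= edges[i];
    the extra last slot holds the values above the final edge.
    """
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_left(edges, v)] += 1
    return counts

deviations = [0, 14, 11, 2, 6, 1, 16, 14, 5, 21]   # sample column from the question
edges = band_edges([(0, 10, 5), (10, 20, 5)])       # hypothetical bands: 0, 5, 10, 15, 20
print(frequency(deviations, edges))                 # [1, 3, 1, 3, 1, 1]
```

Because the edges are kept sorted, each lookup is a binary search, so the cost stays low even with the many narrow bands listed in the question.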

Why are the variables not taking the desired values

I have to check how many hundreds there are in a number and translate that count into words. For example, take the number 700. I have written the following code:
DATA(lv_dmbtr) = ZDS_FG-DMBTR. " Local variable of type DMBTR, so lv_dmbtr = 700.
lv_dmbtr = ZDS_FG-DMBTR MOD 100. " Intended to find how many times 100 goes into 700 via MOD; the result is stored in lv_dmbtr.
IF lv_dmbtr LE 9. " The value is at most 9 (a larger value would mean DMBTR is beyond the hundreds, e.g. 8000).
lv_hundred = lv_dmbtr / 100. " Divide 700 by 100 to get 7.
lv_hundred_check = lv_hundred MOD 1. " Copy the 7 into a new variable, in case lv_hundred is a decimal value, e.g. 7.32.
IF lv_hundred_check > 0.
CALL FUNCTION 'SPELL_AMOUNT'
EXPORTING
amount = lv_hundred_check
* CURRENCY = ' '
* FILLER = ' '
LANGUAGE = SY-LANGU
IMPORTING
in_words = lv_hundred_string // the value is put in the new string
EXCEPTIONS
not_found = 1
too_large = 2
OTHERS = 3.
ENDIF.
Now when I debug the code, all the variables have the value 0: lv_dmbtr, lv_hundred and lv_hundred_check are all 0.
Does anyone know where the problem might be?
Thank you in advance!
Sorry for the long comments in the code; I wanted to explain what I had done as clearly as I could.
To clarify, I want to display the hundreds digit of a number in words: 700 -> seven, 1400 -> four.
So the basic formula to get the hundreds digit of a number is the following: first find out how many times 100 fits completely into your number, using integer division.
99 / 100 = 0
700 / 100 = 7
701 / 100 = 7
1400 / 100 = 14
1401 / 100 = 14
Now you can simply take this number MOD 10 to get the hundreds digit.
0 MOD 10 = 0
7 MOD 10 = 7
14 MOD 10 = 4
Keep in mind that ABAP, in contrast to many other programming languages, rounds the result of a division automatically instead of truncating it, so the quotient has to be floored explicitly. In code this would be:
CONSTANTS lc_hundred TYPE f VALUE '100.0'.
DATA(lv_number) = 1403.
DATA(lv_hundred_count) = CONV i( floor( ( abs( lv_number ) / lc_hundred ) ) MOD 10 ).
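The same two steps (integer division by 100, then MOD 10) can be checked quickly in another language. This is a minimal Python sketch, not ABAP; the function name `hundreds_digit` is made up for illustration, and Python's `//` operator already floors, so no explicit floor call is needed here.

```python
def hundreds_digit(number):
    """Return the hundreds digit of an integer, ignoring its sign."""
    hundreds = abs(number) // 100   # integer division: how many full hundreds fit
    return hundreds % 10            # keep only the last digit of that count

for n in (99, 700, 701, 1400, 1403):
    print(n, hundreds_digit(n))
```

The resulting digit is what would then be handed to a spelling routine such as SPELL_AMOUNT.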

Calculating a total cost based on how many stripes someone wants on their clothes

I'm trying to make it so that if someone wants 3 or fewer stripes on their shorts, each stripe costs 50 cent on top of the 5.50 base cost for a pair of shorts, and every stripe after the third costs 2 euro. It works if they choose 3 or fewer, but once I enter any stripe count above 3 it just displays the base 5.50 cost for the shorts. I'm not sure what to do; any help is appreciated.
I have declared all my variables correctly, so I assume the problem is in the code below:
'calculate cost of Shorts
If mskShortStripes.Text <= 3 Then
dblTotalShorts += CDbl(mskShorts.Text * 5.5) + (mskShortStripes.Text * 0.5)
ElseIf mskShortStripes.Text > 3 Then
dblTotalShorts += CDbl(mskShorts.Text * 5.5) + (mskShortStripes.Text <= 3 * 0.5) + (mskShortStripes.Text > 3 * 2)
End If
You're asking for trouble working with the .Text property directly as if it were a number. It is not. Fun things happen when the value in your control is not actually a number.
Use Integer.TryParse to convert that string to a number:
Dim numberOfStripes As Integer
If Integer.TryParse(mskShortStripes.Text, numberOfStripes) Then
    If numberOfStripes >= 0 Then
        ' ... now do some math in here with the "numberOfStripes" variable ...
    Else
        MessageBox.Show("Number of Stripes can't be negative!")
    End If
Else
    MessageBox.Show("Invalid Number of Stripes!")
End If
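As for the math itself, one way to express the tiered pricing is to split the stripe count into a "first three" part and an "after the third" part. The sketch below is in Python rather than VB.NET and rests on assumptions: the names `shorts_cost` and `parse_count` are hypothetical, and the stripe surcharge is added once rather than per pair, mirroring the expression in the question.

```python
BASE_PRICE = 5.50     # per pair of shorts
CHEAP_STRIPE = 0.50   # each of the first three stripes
EXTRA_STRIPE = 2.00   # each stripe after the third
CHEAP_LIMIT = 3

def parse_count(text):
    """Return the text as an int, or None if it is not a whole number."""
    try:
        return int(text.strip())
    except ValueError:
        return None

def shorts_cost(pairs, stripes):
    """Base cost plus tiered stripe cost (assumption: stripes are charged once)."""
    if pairs < 0 or stripes < 0:
        raise ValueError("pairs and stripes can't be negative")
    cheap = min(stripes, CHEAP_LIMIT)        # stripes billed at the low rate
    extra = max(stripes - CHEAP_LIMIT, 0)    # stripes billed at the high rate
    return pairs * BASE_PRICE + cheap * CHEAP_STRIPE + extra * EXTRA_STRIPE

stripes = parse_count("5")
if stripes is not None:
    print(shorts_cost(1, stripes))   # 5.50 + 3 * 0.50 + 2 * 2.00 = 11.0
```

Using min and max this way removes the need for a separate branch for "3 or fewer" and "more than 3".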

Calculate diff() between selected rows

I have a dataframe with ordered times (in seconds) and a column that is either 0 or 1:
time bit
index
0 0.24 0
1 0.245 0
2 0.47 1
3 0.471 1
4 0.479 0
5 0.58 1
... ... ...
I want to select those rows where the time difference to the neighbouring row is below some threshold, say 0.01 s, but only for differences between a row with bit 1 and a row with bit 0. So in the above example I would only select rows 3 and 4 (or either one of them). I thought I would calculate the diff() of the time column, but I need to somehow select on the 0/1 bit as well.
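Coming from the future to answer this one. You can build a boolean mask that marks each row whose time gap to the previous row is below the threshold and whose bit differs from the previous row, then keep each marked row together with the row before it:
def filter_(df, threshold=0.01):
    # True where the gap to the previous row is small and the bit has flipped
    hit = (df.time.diff() < threshold) & (df.bit.diff().abs() == 1)
    # also keep the row just before each hit, so both rows of the pair are returned
    return df[hit | hit.shift(-1, fill_value=False)]

print(filter_(df, 0.01))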
Output:
time bit
3 0.471 1
4 0.479 0

C or Obj-c function to normalize range to x-y

I have output values that range from 150 down to 0. I want to map those to the range 0 to 1, or perhaps 0 to some value less than 1 such as 0.5, where 150 maps to 0 and 0 maps to 1 (or to that smaller maximum).
Is this considered interpolation? What is the formula to derive these values? And preferably, is there a built-in standard library function I can call?
Subtract the minimum from your number and divide the result by (max - min). With min = 0 and max = 150, this maps 150 to 1 and 0 to 0, with everything else a number in between. To reverse the direction, just compute 1 - result.
If you then need to map a value in 0-1 onto any custom range, multiply it by (MAX - MIN) and add MIN:
result = MIN + (MAX - MIN) * value
where value is a number between 0 and 1,
MIN is the number that 0 maps to, and
MAX is the number that 1 maps to.
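Put together, the two formulas above amount to linear interpolation. The following is a small Python sketch of them rather than C or Objective-C; the function names `normalize`, `lerp` and `remap_inverted` are invented for this example, and the 0 to 150 input range is taken from the question.

```python
def normalize(x, lo, hi):
    """Map x from the range [lo, hi] onto [0, 1]."""
    return (x - lo) / (hi - lo)

def lerp(t, out_min, out_max):
    """Map t from [0, 1] onto [out_min, out_max]."""
    return out_min + (out_max - out_min) * t

def remap_inverted(x, lo=0.0, hi=150.0, out_max=1.0):
    """Map hi to 0 and lo to out_max, as asked in the question."""
    return lerp(1.0 - normalize(x, lo, hi), 0.0, out_max)

print(remap_inverted(150))              # 0.0
print(remap_inverted(0))                # 1.0
print(remap_inverted(75, out_max=0.5))  # 0.25
```

The arithmetic is a single subtraction, division, multiplication and addition, so it translates directly into a one-line C function.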

min over one dimension followed by max over another dimension

I have an SQL table that looks like this:
i j x
0 0 0.5
0 1 1.0
0 2 1.5
1 0 1.4
1 1 1.3
1 2 1.2
and so on. I would like to take the average over the j dimension followed by the minimum over the i dimension. In this case, taking the average over the j dimension produces the following:
i x
0 1.0
1 1.3
Taking the minimum over the i dimension then produces the value 1.0, which is the final result. Is there an efficient way to perform a query like the one in this example, i.e., a query in which a sequence of dimension reduction operations is performed in a specified order?
Note that if we reverse the order of operations and take the minimum over the i dimension first, the intermediate result is
j x
0 0.5
1 1.0
2 1.2
Taking the average over the j dimension produces a final result of 0.9. Thus, the order of operations is important.
Phillip
http://phillipmfeldman.org
You can do it with a subquery, of course:
SELECT MIN(avg_over_j)
FROM (
    SELECT i, AVG(x) AS avg_over_j
    FROM TheTable
    GROUP BY i
) AS per_i; -- many databases require an alias on the derived table
But this isn't APL or the J language; SQL has no dedicated "dimension reduction" operations.
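To see the subquery pattern run end to end, and to confirm that the order of the two reductions matters, here is a small sketch using Python's built-in sqlite3 module with the sample rows from the question. The helper `reduce_twice` and the alias `t` are hypothetical names; the aggregate names are interpolated into the SQL text, which is only acceptable here because they are fixed constants and not user input.

```python
import sqlite3

rows = [(0, 0, 0.5), (0, 1, 1.0), (0, 2, 1.5),
        (1, 0, 1.4), (1, 1, 1.3), (1, 2, 1.2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TheTable (i INTEGER, j INTEGER, x REAL)")
conn.executemany("INSERT INTO TheTable VALUES (?, ?, ?)", rows)

def reduce_twice(outer, inner, group_col):
    """Apply `inner` within each group of `group_col`, then `outer` over the groups."""
    sql = (f"SELECT {outer}(v) FROM "
           f"(SELECT {inner}(x) AS v FROM TheTable GROUP BY {group_col}) AS t")
    return conn.execute(sql).fetchone()[0]

min_of_avg = reduce_twice("MIN", "AVG", "i")   # average over j, then minimum over i
avg_of_min = reduce_twice("AVG", "MIN", "j")   # minimum over i, then average over j
print(min_of_avg, avg_of_min)
```

Each additional reduction step is one more level of nesting, with the GROUP BY column naming the dimension that survives that step.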