How to read a 2D array from a file? - vb.net

I'm new to VB.NET programming, and I want to read a 2D array from a file. I have searched a lot and can't figure out how to do that. Here is the input file:
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1
And here is the code:
Dim map As Integer(,)
Dim reader As StreamReader
reader = IO.File.OpenText(folder + "\harta\harta.txt")
Dim linie As String, i, j As Integer
For i = 0 To 10
For j = 0 To 12
linie = reader.ReadLine()
map(i, j) = linie.Substring(j, linie.IndexOf(" ")) 'here is my problem'
Next j
Next i
reader.Close()
When I run the code, I get the following error:
An unhandled exception of type 'System.NullReferenceException' occurred in WindowsApplication1.exe
Edit:
I tried another method:
Dim reader As IO.StreamReader
reader = IO.File.OpenText(folder + "\harta\harta.txt")
Dim linie As String, i, j As Integer
For i = 0 To 10
linie = reader.ReadLine
Dim parametrii As String() = linie.Split(" ")
Dim parametru As String
j = 0
For Each parametru In parametrii
map(i, j) = parametru 'i get the same error here'
j += 1
Next
Next i
I really don't know what is wrong.

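Here you are. I fixed a few problems, which you can see by comparing this code with yours:
Dim map(10, 12) As Integer ' size the array (11 rows x 13 columns) so it is not Nothing
Dim reader As IO.StreamReader
reader = IO.File.OpenText("harta.txt")
Dim linie As String, i, j As Integer
For i = 0 To 10
    linie = reader.ReadLine.Trim ' read one line per row, not one per cell
    Dim valori As String() = linie.Split(" "c) ' split the row into its column values once
    For j = 0 To 12
        map(i, j) = CInt(valori(j)) ' convert explicitly instead of relying on implicit conversion
    Next j
Next i
reader.Close()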

You are reading too many lines: ReadLine sits inside the inner loop, so it is called 11 * 13 = 143 times on an 11-line file. When there is no line left to read, ReadLine returns a null reference (Nothing).
You need to call ReadLine once per row (i from 0 to 10) and, for each line, use Split to get the column values.
Once the file is exhausted, this call returns a null reference:
linie = reader.ReadLine()
And when you then attempt this:
linie.IndexOf(" ")
it throws the exception, because the linie variable is null.
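
For comparison, the same idea (one line of text per row, one split per line) can be sketched in Python. This is only an illustration of the approach, not the VB.NET fix itself; the function name parse_grid and the sample text are assumptions made for the example.

```python
def parse_grid(lines):
    """Turn whitespace-separated lines of integers into a list of rows."""
    grid = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines, e.g. one left by a trailing newline
            continue
        # One line of text is one row; each token is one column value.
        grid.append([int(token) for token in line.split()])
    return grid


sample = "1 1 1\n1 1 1\n"
grid = parse_grid(sample.splitlines())
print(len(grid), len(grid[0]))  # 2 3
```

Because the rows are built from whatever the file contains, nothing is indexed before it exists, which is the failure mode in the question.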

How to convert a pandas DataFrame to libsvm format?

I have a pandas DataFrame like the one below.
df
Out[50]:
0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 \
0 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1
1 0 1 1 1 0 0 1 1 1 1 ... 0 0 0 0 0 0 0 0
2 1 1 1 1 1 1 1 1 1 1 ... 0 0 0 0 0 0 0 0
3 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1
4 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1
5 1 0 0 1 1 1 1 0 0 0 ... 0 0 0 0 0 0 0 0
6 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1
7 0 0 0 0 0 0 0 0 0 0 ... 1 1 1 1 1 1 1 1
[8 rows x 100 columns]
I have the target variable as an array, as below.
[1, -1, -1, 1, 1, -1, 1, 1]
How can I map this target variable onto the DataFrame and convert it into libsvm format?
equi = {0:1, 1:-1, 2:-1,3:1,4:1,5:-1,6:1,7:1}
df["labels"] = df.index.map[(equi)]
d = df[np.setdiff1d(df.columns,['indx','labels'])]
e = df.label
dump_svmlight_file(d,e,'D:/result/smvlight2.dat')
ERROR:
File "D:/spyder/april.py", line 54, in <module>
df["labels"] = df.index.map[(equi)]
TypeError: 'method' object is not subscriptable
When I use
df["labels"] = df.index.list(map[(equi)])
ERROR:
AttributeError: 'RangeIndex' object has no attribute 'list'
Please help me to solve those errors.
I think you need to convert the index with to_series and then call map:
df["labels"] = df.index.to_series().map(equi)
Or rename the index:
df["labels"] = df.rename(index=equi).index
For the difference of columns, pandas has Index.difference. All together:
from sklearn.datasets import dump_svmlight_file
equi = {0:1, 1:-1, 2:-1,3:1,4:1,5:-1,6:1,7:1}
df["labels"] = df.rename(index=equi).index
e = df["labels"]
d = df[df.columns.difference(['indx','labels'])]
dump_svmlight_file(d,e,'C:/result/smvlight2.dat')
Also, it seems the labels column is not necessary:
from sklearn.datasets import dump_svmlight_file
equi = {0:1, 1:-1, 2:-1,3:1,4:1,5:-1,6:1,7:1}
e = df.rename(index=equi).index
d = df[df.columns.difference(['indx'])]
dump_svmlight_file(d,e,'C:/result/smvlight2.dat')
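
To make the target format concrete, here is a minimal hand-rolled sketch of the text layout, using plain lists instead of a DataFrame. It is not a replacement for dump_svmlight_file; the helper name to_libsvm_lines is hypothetical, and the choice of 1-based feature indices with zero values omitted is an assumption (dump_svmlight_file has a zero_based parameter that controls the index base).

```python
def to_libsvm_lines(rows, labels):
    """Format dense rows as '<label> <index>:<value> ...' text lines.

    Assumption: feature indices are 1-based and zero-valued features
    are left out, which keeps the file sparse.
    """
    lines = []
    for row, label in zip(rows, labels):
        features = " ".join(
            f"{index}:{value}"
            for index, value in enumerate(row, start=1)
            if value != 0
        )
        # rstrip handles a row with no non-zero features.
        lines.append(f"{label} {features}".rstrip())
    return lines


rows = [[0, 1, 1], [1, 0, 0]]
labels = [1, -1]
for line in to_libsvm_lines(rows, labels):
    print(line)
# 1 2:1 3:1
# -1 1:1
```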

Complex Excel Formula in Pandas

Excel Formulas I am trying to replicate in pandas:
Click here to download workbook
* Look at columns D, E and F
entsig and exsig are manual and can be changed. In real life they would be derived from the value of another column or a comparison of two other columns
ent = 1 if entsig previous = 1 and in = 0
in = 1 if ent previous = 1 or (in previous = 1 and ex = 0)
ex = 1 if exsig previous = 1 and in previous = 1
so either ent, in, or ex will always be = 1 but never more than one of them
import pandas as pd
df = pd.DataFrame(
[[0,0,0,0,0], [1,0,0,0,0], [1,0,0,0,0], [1,0,0,0,0], [0,0,0,0,0],
[0,1,0,0,0], [0,1,0,0,0], [1,0,0,0,0], [1,0,0,0,0], [0,0,0,0,0],
[0,0,0,0,0], [0,0,0,0,0], [0,1,0,0,0], [0,1,0,0,0], [0,1,0,0,0],
[0,0,0,0,0], [0,0,0,0,0], [1,0,0,0,0], [1,0,0,0,0], [1,0,0,0,0],
[1,1,0,0,0], [0,1,0,0,0], [0,1,0,0,0], [0,1,0,0,0]],
columns=['entsig', 'exsig','ent', 'in', 'ex'])
for i in df.index:
    df['ent'][(df.entsig.shift(1)==1) & (df['ent'].shift(1) == 0) & (df['in'].shift(1) == 0)]=1
    df['ex'][(df.exsig.shift(1)==1) & (df['in'].shift(1)==1)]=1
    df['in'][(df.ent.shift(1)==1) | ((df['in'].shift(1)==1) & (df['ex']==0))]=1
for j in df.index:
    df['ent'][df['in'] == 1]=0
    df['in'][df['ex']==1]=0
    df['ex'][df['ex'].shift(1)==1]=0
df
results in
entsig exsig ent in ex
0 0 0 0 0 0
1 1 0 0 0 0
2 1 0 1 0 0
3 1 0 0 1 0
4 0 0 0 1 0
5 0 1 0 1 0
6 0 1 0 0 1
7 1 0 0 0 0
8 1 0 1 0 0
9 0 0 0 1 0
10 0 0 0 1 0
11 0 0 0 1 0
12 0 1 0 1 0
13 0 1 0 0 1
14 0 1 0 0 0
15 0 0 0 0 0
16 0 0 0 0 0
17 1 0 0 0 0
18 1 0 1 0 0
19 1 0 0 1 0
20 1 1 0 1 0
21 0 1 0 0 1
22 0 1 0 0 0
23 0 1 0 0 0
Question
How can I make this code faster? It runs slow because it's a loop but I have not been able to come up with a solution that does not use loops. Any ideas or comments are appreciated.
If we can assume every group of 1's in entsig is followed by at least one 1 in
exsig, then you could compute ent, ex and in like this:
def ent_in_ex(df):
    entsig_mask = (df['entsig'].diff().shift(1) == 1)
    exsig_mask = (df['exsig'].diff().shift(1) == 1)
    df.loc[entsig_mask, 'ent'] = 1
    df.loc[exsig_mask, 'ex'] = 1
    df['in'] = df['ent'].shift(1).cumsum().subtract(df['ex'].cumsum(), fill_value=0)
    return df
If we can make this assumption, then ent_in_ex is significantly faster:
In [5]: %timeit orig(df)
10 loops, best of 3: 185 ms per loop
In [6]: %timeit ent_in_ex(df)
100 loops, best of 3: 2.23 ms per loop
In [95]: orig(df).equals(ent_in_ex(df))
Out[95]: True
where orig is the original code:
def orig(df):
    for i in df.index:
        df['ent'][(df.entsig.shift(1)==1) & (df['ent'].shift(1) == 0) & (df['in'].shift(1) == 0)]=1
        df['ex'][(df.exsig.shift(1)==1) & (df['in'].shift(1)==1)]=1
        df['in'][(df.ent.shift(1)==1) | ((df['in'].shift(1)==1) & (df['ex']==0))]=1
    for j in df.index:
        df['ent'][df['in'] == 1]=0
        df['in'][df['ex']==1]=0
        df['ex'][df['ex'].shift(1)==1]=0
    return df
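
If the assumption about entsig and exsig cannot be made, the rules can also be written as a single pass over the rows in plain Python. The sketch below still uses a loop, so it does not answer the "no loops" wish, but each row is visited once instead of re-evaluating whole columns on every iteration. The function name ent_in_ex_rows is hypothetical, and the entry condition follows the asker's loop code (previous row has neither ent nor in set) rather than the prose rule.

```python
def ent_in_ex_rows(entsig, exsig):
    """Compute the ent, in and ex columns row by row from the two signals."""
    n = len(entsig)
    ent, in_, ex = [0] * n, [0] * n, [0] * n
    for t in range(1, n):
        # Exit: exit signal on the previous row while in on the previous row.
        ex[t] = int(exsig[t - 1] == 1 and in_[t - 1] == 1)
        # In: entered on the previous row, or still in and not exiting now.
        in_[t] = int(ent[t - 1] == 1 or (in_[t - 1] == 1 and ex[t] == 0))
        # Entry: entry signal on the previous row while ent and in were both 0.
        ent[t] = int(entsig[t - 1] == 1 and ent[t - 1] == 0 and in_[t - 1] == 0)
    return ent, in_, ex


# The entsig and exsig columns from the question's DataFrame.
entsig = [0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
exsig = [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1]
ent, in_, ex = ent_in_ex_rows(entsig, exsig)
print([i for i, v in enumerate(ent) if v])  # [2, 8, 18]
print([i for i, v in enumerate(ex) if v])   # [6, 13, 21]
```

On the question's data this reproduces the table shown under "results in".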

Why doesn't my VBA function work properly?

I'm very new to VBA and programming, so this might be a dumb question. I have written the following code:
Function central(X)
Dim xc(300, 10), xa(200)
m = X.Rows.Count
n = X.Columns.Count
For j = 1 To n
xa(j) = 0
For i = 1 To m
xa(j) = xa(j) + X(i, j)
Next i
xa(j) = xa(j) / m
For i = 1 To m
xc(i, j) = X(i, j) - xa(j)
Next i
Next
central = xc()
End Function
This should output a matrix in which each element has the average of its column subtracted from it.
My problem is that the output is shifted by one row and one column. So for example for this table:
1 1 1
2 2 2
3 3 3
it gives me:
0 0 0
0 -1 -1
0 0 0
Thanks in advance!
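
For reference, the intended computation (subtract each column's mean from every element in that column) can be sketched in Python as below. This is an illustration of the expected result, not a VBA fix; the function name center_columns is an assumption. One possible explanation for the shift, offered as a guess rather than a confirmed diagnosis: a VBA array declared as Dim xc(300, 10) starts at index 0 unless Option Base 1 is set, while the loops fill it from index 1, which would leave row 0 and column 0 empty.

```python
def center_columns(matrix):
    """Return a new matrix with each column's mean subtracted from its entries."""
    row_count = len(matrix)
    column_count = len(matrix[0])
    # Mean of every column, computed once.
    means = [sum(row[j] for row in matrix) / row_count for j in range(column_count)]
    # Result and input use the same zero-based positions, so nothing shifts.
    return [[row[j] - means[j] for j in range(column_count)] for row in matrix]


centered = center_columns([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
print(centered)
# [[-1.0, -1.0, -1.0], [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
```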

set column value based on distinct values in another column

I am trying to do something very similar to this question: mysql - UPDATEing row based on other rows
I have a table, called modset, of the following form:
member year y1 y2 y3 y1y2 y2y3 y1y3 y1y2y3
a 1 0 0 0 0 0 0 0
a 2 0 0 0 0 0 0 0
a 3 0 0 0 0 0 0 0
b 1 0 0 0 0 0 0 0
b 2 0 0 0 0 0 0 0
c 1 0 0 0 0 0 0 0
c 3 0 0 0 0 0 0 0
d 2 0 0 0 0 0 0 0
Columns 3:9 are binary flags to indicate which combination of years the member has records in. So I wish the result of an SQL update to look as follows:
member year y1 y2 y3 y1y2 y2y3 y1y3 y1y2y3
a 1 0 0 0 0 0 0 1
a 2 0 0 0 0 0 0 1
a 3 0 0 0 0 0 0 1
b 1 0 0 0 1 0 0 0
b 2 0 0 0 1 0 0 0
c 1 0 0 0 0 0 1 0
c 3 0 0 0 0 0 1 0
d 2 0 1 0 0 0 0 0
The code in the question linked above does something very close but only when it is a count of the distinct years in which the member has records. I need to base the columns on the specific values of the years in which the member has records.
Thanks in advance!
SOLUTION
SELECT member,
case when min(distinct(year)) = 1 and max(distinct(year)) = 1 then 1 else 0 end y1,
case when min(distinct(year)) = 1 and max(distinct(year)) = 2 then 1 else 0 end y1y2,
case when min(distinct(year)) = 1 and max(distinct(year)) = 3 and count(distinct(year)) = 2 then 1 else 0 end y1y3,
case when min(distinct(year)) = 1 and max(distinct(year)) = 3 and count(distinct(year)) = 3 then 1 else 0 end y1y2y3,
case when min(distinct(year)) = 2 and max(distinct(year)) = 2 then 1 else 0 end y2,
case when min(distinct(year)) = 2 and max(distinct(year)) = 3 then 1 else 0 end y2y3,
case when min(distinct(year)) = 3 then 1 else 0 end y3
INTO temp5
FROM modset
GROUP BY member;
UPDATE modset M
SET y1 = T.y1, y2 = T.y2, y3 = T.y3, y1y2 = T.y1y2, y1y3 = T.y1y3, y2y3 = T.y2y3, y1y2y3 = T.y1y2y3
FROM temp5 T
WHERE T.member = M.member;
What is the query you are using to return the indicators of the years the member has records in?
It sounds like you would want to take your query results and use them in your update:
http://dev.mysql.com/doc/refman/5.0/en/update.html
It may look something like this:
UPDATE targetTable t, sourceTable s
SET t.y1 = s.y1, t.y2 = s.y2 -- (and so on...)
WHERE t.member = s.member AND t.year = s.year;
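
The flag logic itself, independent of any SQL dialect, can be sketched in Python: collect the set of years per member, then set exactly the one column whose name spells out that set. This is only an illustration of the rule described in the question; the function name year_flags and the in-memory rows are assumptions.

```python
from collections import defaultdict

FLAG_COLUMNS = ["y1", "y2", "y3", "y1y2", "y2y3", "y1y3", "y1y2y3"]


def year_flags(rows):
    """For (member, year) rows, flag the column matching each member's set of years."""
    years_by_member = defaultdict(set)
    for member, year in rows:
        years_by_member[member].add(year)

    flagged = []
    for member, year in rows:
        # e.g. years {1, 3} become the column name "y1y3"
        key = "".join(f"y{y}" for y in sorted(years_by_member[member]))
        flags = {column: int(column == key) for column in FLAG_COLUMNS}
        flagged.append((member, year, flags))
    return flagged


rows = [("a", 1), ("a", 2), ("a", 3), ("b", 1), ("b", 2), ("c", 1), ("c", 3), ("d", 2)]
for member, year, flags in year_flags(rows):
    print(member, year, [column for column, value in flags.items() if value])
```

Every row of a member gets the same flags, which matches the desired result table in the question.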

VB.NET : Generate all possible words on file

Example :
If I have the word "don", then the file will contain
ddd
ddo
ddn
dod
doo
don
dnd
dno
dnn
odd
odo
odn
ood
<...>
I have no idea how to do this. The words must be no shorter than 3 symbols.
I posted a solution on Experts Exchange, which you may not be able to see (if you have never paid them), so I am copying it here for you:
Question was:
I have n items and each item can be assigned a 1 or a 2. So I would like to get the matrix result that would generate all possible combinations.
For example, if n = 3, then the possible outcomes are those listed below. I need an algorithm that can generate this series for n. Please help, thanks. Ideally I would like to store the result in a DataTable.
1 1 1
1 1 2
1 2 1
2 1 1
2 1 2
1 2 2
2 2 1
2 2 2
Answer:
Dim HighestValue As Integer = 2 ' max value
Dim NrOfValues As Integer = 3 ' nr of values in one result
Dim Values(NrOfValues) As Integer
Dim i As Integer
For i = 0 To NrOfValues - 1
Values(i) = 1
Next
Values(NrOfValues - 1) = 0 ' to generate first as ALL 1
For i = 1 To HighestValue ^ NrOfValues
Values(NrOfValues - 1) += 1
For j As Integer = NrOfValues - 1 To 0 Step -1
If Values(j) > HighestValue Then
Values(j) = 1
Values(j - 1) += 1
End If
Next
Dim Result As String = ""
For j As Integer = 0 To NrOfValues - 1
Result = Result & CStr(Values(j))
Next
Debug.WriteLine(Result)
Next
OK, here's the solution; you just need to replace Debug.WriteLine with a write to your file:
Dim HighestValue As Integer = 3 ' max value
Dim NrOfValues As Integer = 3 ' nr of values in one result
Dim Values(NrOfValues) As Integer
Dim i As Integer
For i = 0 To NrOfValues - 1
Values(i) = 1
Next
Values(NrOfValues - 1) = 0 ' to generate first as ALL 1
For i = 1 To HighestValue ^ NrOfValues
Values(NrOfValues - 1) += 1
For j As Integer = NrOfValues - 1 To 0 Step -1
If Values(j) > HighestValue Then
Values(j) = 1
Values(j - 1) += 1
End If
Next
Dim Result As String = ""
For j As Integer = 0 To NrOfValues - 1
If Values(j) = 1 Then Result = Result & "d"
If Values(j) = 2 Then Result = Result & "o"
If Values(j) = 3 Then Result = Result & "n"
'Result = Result & CStr(Values(j))
Next
Debug.WriteLine(Result)
Next
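
The counter-style loop above enumerates every sequence of a fixed length over a small alphabet. For comparison, the same enumeration can be sketched in Python with itertools.product, which yields the words in the same order as the example in the question. The function names all_words and write_words, and the output path passed to write_words, are assumptions made for this sketch.

```python
from itertools import product


def all_words(letters, length):
    """Yield every string of the given length built from the given letters."""
    for combo in product(letters, repeat=length):
        yield "".join(combo)


def write_words(path, letters, length):
    """Write one word per line to the file at path and return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for word in all_words(letters, length):
            handle.write(word + "\n")
            count += 1
    return count


words = list(all_words("don", 3))
print(words[:6])   # ['ddd', 'ddo', 'ddn', 'dod', 'doo', 'don']
print(len(words))  # 27
```

For longer words, call all_words with a larger length; the number of results grows as len(letters) ** length.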