Pytesseract output into a list then split and getting error - python-tesseract

I'm trying to have pytesseract output into a list and then split it but I can't get it to work.
The part of the code that is not working:
mylist = []
mylist = [pytesseract.image_to_string(Image.open('test.png'))]
print(mylist)
list2 = mylist.split()
print(list2)
And the output I get is:
['33 44 55\n\x0c']
list2 = mylist.split( )
AttributeError: 'list' object has no attribute 'split'

The problem is that you are calling the string (str) method split on a list object.
Let's go through it step by step.
mylist = []
You don't need to declare mylist as an empty list first, because the next line
mylist = [pytesseract.image_to_string(Image.open('test.png'))]
assigns it again anyway. But if you want to use split, you need to assign the result without the brackets, so that mylist is a string rather than a one-element list.
mylist = pytesseract.image_to_string(Image.open('test.png'))
Now if you check the type of mylist
print(type(mylist))
The result should be:
<class 'str'>
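According to the documentation, the separator argument of split is optional. With no argument, the string is split on any run of whitespace; to split on a specific character, pass it explicitly. For instance:
list2 = mylist.split("\n")
print(list2)
splits mylist on the "\n" character only. To get the individual numbers, split on whitespace instead. The correct implementation is therefore: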
mylist = pytesseract.image_to_string(Image.open('test.png'))
print(mylist)
list2 = mylist.split()
print(list2)
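To see the difference without running OCR, here is a minimal sketch that hard-codes the string from the question's output. The name ocr_text is made up for illustration; in the real script the value would come from pytesseract.image_to_string.

```python
# Stand-in for pytesseract.image_to_string(...); value copied from the question's output.
ocr_text = '33 44 55\n\x0c'

# No argument: split on any run of whitespace (spaces, "\n" and the form feed "\x0c").
words = ocr_text.split()
print(words)        # ['33', '44', '55']

# Explicit "\n": split on newlines only; the trailing form feed stays as its own item.
lines = ocr_text.split("\n")
print(lines)        # ['33 44 55', '\x0c']

# If the text is already wrapped in a list, split each element instead of the list.
wrapped = [ocr_text]
split_each = [item.split() for item in wrapped]
print(split_each)   # [['33', '44', '55']]
```

The last form is what you would need if you really wanted to keep mylist as a list: split belongs to each string inside it, not to the list itself.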

Related

Remove an object from a List(of) based on object value

How do I remove an object from a list based on a value? I'm getting an index out of range error.
'A class called Person exists'
Dim PersonOne As New Person
PersonOne.name = "John"
PersonOne.age = 50
'Created a list to store them'
Dim MyList As New List(Of Person)
MyList.Add(PersonOne)
'Now to remove an object'
Dim ObjectToRemove As String = "John"
For Each item in MyList
If String.Compare(item.name, ObjectToRemove) = 0 Then
'How do I get the index to remove?'
End If
Next
Excuse the code, I bashed it out off the top of my head; my original code is a bit more convoluted. But the gist is that I just want to compare a value and, if it matches, remove that object from the List(Of).
Use the FindIndex method which accepts a predicate delegate. You can pass a lambda function for the predicate delegate.
(Also, I think it's clearer to use String.Equals with an explicit StringComparison instead of using String.Compare(x,y) == 0 as it signals intent better).
Dim personOneHasThisIndex As Integer
personOneHasThisIndex = MyList.FindIndex( Function(p) p.name.Equals( ObjectToRemove, StringComparison.Ordinal ) )
If personOneHasThisIndex > -1 Then
' Always check if the result is -1, which means it doesn't exist
MyList.RemoveAt( personOneHasThisIndex )
End If
Note that List<T>.RemoveAt(Int32 index) can be inefficient because it has to shift every element after the removed index down by one (lists cannot have "empty" slots). If you have a conceptual list you'll be adding to and removing from a lot, consider using a LinkedList or a HashSet (if the order of the elements doesn't matter) instead.
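For readers following along from the Python question at the top of the page, the same find-the-index-then-remove pattern can be sketched in Python. This is only an illustration, not VB.NET: the Person dataclass, the find_index helper and the second person "Mary" are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int


def find_index(people, name):
    """Return the index of the first person with this name, or -1 if absent."""
    for i, person in enumerate(people):
        if person.name == name:
            return i
    return -1


people = [Person("John", 50), Person("Mary", 41)]

index = find_index(people, "John")
if index > -1:          # same guard as the VB code: -1 means "not found"
    del people[index]   # removes by position, later elements shift down by one

print(people)
```

Looking up the index first and removing afterwards avoids modifying the list while iterating over it, which is the usual source of index errors in loops like the one in the question.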

Variables refer to the same instance

As part of my learning, I have been playing around with converting between List and IEnumerable.
What surprises me is that after executing the EditMyList procedure, MyIEnumerable contains the same data for each DBTable object as MyList. However, I have modified MyList only, without assigning it to MyIEnumerable again after the List was modified.
Can you explain what happened here and why MyList and MyIEnumerable refer to the same instance?
Public Class DBTable
Public Property TableName As String
Public Property NumberOfRows As Integer
End Class
Public Sub EditMyList
Dim MyList As New List(Of DBTable)
MyList.Add(New DBTable With {.TableName = "A", .NumberOfRows = 1})
MyList.Add(New DBTable With {.TableName = "B", .NumberOfRows = 2})
MyList.Add(New DBTable With {.TableName = "C", .NumberOfRows = 3})
Dim MyIEnumerable As IEnumerable(Of DBTable) = MyList
For Each item In MyList
item.NumberOfRows += 10
Next
End Sub
UPDATE: Here is a string case where b ends up not equal to a. String is also a reference type, so assigning one variable to another should copy just the reference. However, the result is different from the first example (explained by @Sefe):
Dim a As String
Dim b As String
a = "aaa"
b = "bbb"
a = b
' At this point a and b have the same value of "bbb"
a = "xxx"
' At this point I would expect a and b equal to "xxx", however a="xxx" but b="bbb"
A List is a reference type. That means it is created on the heap and your MyList variable contains just a reference (sometimes incorrectly called a "pointer") to the list. When you assign MyList to MyIEnumerable you don't copy the whole list, you just copy the reference. That means every change you make to the one list is visible through all the references.
If you want a new list you need to create it. You can use the list constructor:
Dim MyIEnumerable As IEnumerable(Of DBTable) = New List(Of DBTable)(MyList)
Since you don't need a list, but an IEnumerable you can also call the list's ToArray method:
Dim MyIEnumerable As IEnumerable(Of DBTable) = MyList.ToArray
You can also use LINQ:
Dim MyIEnumerable As IEnumerable(Of DBTable) = MyList.ToList
As far as the behavior of String is concerned, strings in .NET are immutable. That means that once created, they cannot be changed. String operations (for example concatenations) always create new strings. In other words: the copy operation you have to do manually for your lists is done automatically for strings. That's why strings appear to behave like value types.
Also, the assignment in your question would behave the same even if strings were mutable. When you assign a = "xxx", you update the reference held by a from "bbb" to "xxx". That, however, does not affect b, which still keeps its old reference.
Use the ToList() extension method to create another List:
Dim newCollection = MyList.ToList()
But note that the new list will still reference the same DBTable instances.
To create a "full" copy you need to create a new instance of DBTable for every item in the collection:
Dim newCollection = MyList.Select(Function(item)
Return New DBTable With {
.TableName = item.TableName,
.NumberOfRows = item.NumberOfRows
}
End Function).ToList()
For Each item In MyList
item.NumberOfRows += 10 ' will not affect the newCollection items
Next
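The three cases discussed above (copying the reference, copying the list, copying the items too) are not specific to .NET. As a rough illustration in Python rather than VB.NET, with a hypothetical DBTable class mirroring the one in the question:

```python
import copy


class DBTable:
    def __init__(self, table_name, number_of_rows):
        self.table_name = table_name
        self.number_of_rows = number_of_rows


my_list = [DBTable("A", 1), DBTable("B", 2), DBTable("C", 3)]

alias = my_list                # same list object (like assigning MyList to MyIEnumerable)
shallow = list(my_list)        # new list, but the same DBTable objects (like ToList)
deep = copy.deepcopy(my_list)  # new list holding new DBTable objects

for item in my_list:
    item.number_of_rows += 10

print([t.number_of_rows for t in alias])    # [11, 12, 13]
print([t.number_of_rows for t in shallow])  # [11, 12, 13]
print([t.number_of_rows for t in deep])     # [1, 2, 3]
```

Only the deep copy is unaffected by the loop, because it is the only one whose items are separate objects.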

Compare 2 ArrayLists and remove duplicates

I am looking to compare values of 2 different ArrayLists, and remove any duplicates from 1 ArrayList.
Example:
Arr1 = {HF,HA,GM,RV}
Arr2 = {FB,HA}
Since they have 'HA' in common, I would like to remove 'HA' from Arr1. Any help or point in the right direction would be appreciated.
You can use LINQ's Except but you will have to convert array lists to regular arrays first:
https://msdn.microsoft.com/en-us/library/bb300779(v=vs.110).aspx
Dim list1 As New ArrayList()
list1.Add("A")
list1.Add("B")
list1.Add("C")
Dim list2 As New ArrayList()
list2.Add("A")
list2.Add("B")
Dim array1 = list1.ToArray()
Dim array2 = list2.ToArray()
Dim except = array1.Except(array2).ToArray()
Also if you need a custom comparison, use this overload instead:
https://msdn.microsoft.com/en-us/library/bb336390(v=vs.110).aspx
EDIT
There are very few LINQ methods available for ArrayList, however you can convert it back very easily:
Dim arrayList as New ArrayList(except)
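For comparison, the same "remove everything that also appears in the second list" operation can be sketched in Python using the values from the question. This is an illustration only; the names arr1, arr2 and to_remove are chosen for the example.

```python
arr1 = ["HF", "HA", "GM", "RV"]
arr2 = ["FB", "HA"]

# Build a set once so each membership check is fast.
to_remove = set(arr2)

# Keep the items of arr1 that do not occur in arr2, preserving their order.
result = [code for code in arr1 if code not in to_remove]

print(result)  # ['HF', 'GM', 'RV']
```

One difference worth knowing: LINQ's Except returns distinct elements, whereas this list comprehension would keep a value that appears twice in arr1 as long as it is not in arr2.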

Evaluate expression

So, I have an object with some properties, like this: Dim.d1, Dim.d2,...,Dim.d50
that return strings. For example: Dim.d1="Description A", Dim.d2="Description B",etc.
What I want to do is to attribute these descriptions to the headers of a Gridview and for that I was thinking using indexes, like this pseudocode:
for i=0 until 49
e.Row.Cells[i].Text = Evaluate(Dim.d(i+1))
So, basically, I need a way to change the call to my object properties depending on the index, but I don't know if it is possible. When index i=0, call Dim.d1, when index i=1 call Dim.d2, and so on until 50.
Any ideas?
This is what Arrays or Lists are for!
var dim = new string[50];
dim[0] = "Description A";
dim[1] = "Description B";
// ... etc
for(var i=0;i<dim.Length;i++)
{
e.Row.Cells[i].Text = dim[i];
}
You can use methods in the System.Reflection namespace to do this. However, this answer is offered only because it answers the question as asked; you should look at the options suggested by the other answerers, e.g. a List(Of String) or something similar.
Anyway, let's say you have a class:
Public Class Class1
Public Property d1 As String
Public Property d2 As String
Public Property d3 As String
End Class
And then, let's say you create an instance of that class and set its properties:
Dim obj As New Class1
obj.d1 = "Foo"
obj.d2 = "Bar"
obj.d3 = "Test"
If you then want to loop from 1 to 3 and access d1, d2, d3 and so on, this is where you use Reflection:
For i As Integer = 1 To 3
Dim info As System.Reflection.PropertyInfo = obj.GetType().GetProperty("d" & i)
Dim val As String = info.GetValue(obj, Reflection.BindingFlags.GetProperty, Nothing, Nothing, Nothing)
Debug.Print(val.ToString)
Next
Will give you the output:
Foo
Bar
Test
Like Jamiec already posted, use an Array or List.
Where do your description labels come from?
If you have your descriptions in a comma separated string, here is the vb.net code:
dim descriptions as String = "Description A,Description B,Description C"
dim myArray as String() = descriptions.Split(cchar(","))
For i As Integer = 0 To myArray.Length - 1
e.Row.Cells(i).Text = myArray(i)
Next
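The two approaches in this thread (looking a property up by its computed name versus keeping the values in an indexable collection) can be compared side by side in a short Python sketch. It is an illustration only, not VB.NET: the Dim class here is a hypothetical stand-in with three properties instead of fifty.

```python
class Dim:
    """Hypothetical stand-in for the object with numbered properties."""

    def __init__(self):
        self.d1 = "Description A"
        self.d2 = "Description B"
        self.d3 = "Description C"


dim = Dim()

# Reflection-style: build the property name from the index, like GetProperty("d" & i).
by_name = [getattr(dim, f"d{i}") for i in range(1, 4)]
print(by_name)

# Collection-style: split a comma-separated string and index the result directly.
descriptions = "Description A,Description B,Description C".split(",")
headers = [descriptions[i] for i in range(len(descriptions))]
print(headers)
```

Both produce the same list, but the second version needs no name-building and cannot fail on a misspelled property name, which is why the answers above recommend a collection.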

How to assign a value to an array from a combobox

The code I have is:
Dim Dbase() As String = Nothing
Dbase(0) = Db_ComboBox.Text
I have declared Dbase as array and assigned Nothing, Db_ComboBox is a combobox.
For that assignment statement, I'm getting the following error: "Reference 'Dbase' has a value of 'Nothing'"
What is the reason for this error, and how can I take the value from the combobox and save it in the array?
You need to change this:
Dim Dbase() As String = Nothing
to this (declare an array of 1 element):
Dim Dbase(0) As String
And then this line will work:
Dbase(0) = Db_ComboBox.Text
If you need to change the size of your array later, you can use ReDim or ReDim Preserve, as required.
If you anticipate that the contents of Dbase will change often, I agree with @Joel's suggestion to switch to List(Of String) instead of handling array sizes manually.
Let's look at your code:
Dim Dbase() As String = Nothing
Dbase(0) = Db_ComboBox.Text
Especially the first line. That first line creates a variable that can refer to an array, but the = Nothing portion explicitly tells it, "Do not create a real array here yet". You have, effectively, a pointer that doesn't point to anything.
I suspect that what you really need is a List collection that you can append to over time:
Dim Dbase As New List(Of String)()
Dbase.Add(Db_ComboBox.Text)
Dbase() IS NOTHING. Look at this example:
cargoWeights = New Double(10) {}
atmospherePressures = New Short(2, 2, 4, 10) {}
inquiriesByYearMonthDay = New Byte(20)()() {}
That's how you create arrays.
More examples: http://msdn.microsoft.com/en-us/library/vstudio/wak0wfyt.aspx