import pandas error : Traceback (most recent call last) and expected string or bytes-like object - pandas

I installed GIS-Pro and use Jupyter, which I believe is included in the GIS-Pro package, to write Python code. Since yesterday, I get the following error when executing import pandas as pd:
TypeError Traceback (most recent call last)
C:\Users\AppData\Local\Temp\2/ipykernel_23172/4080736814.py in <module>
----> 1 import pandas as pd
C:\ArcGISPro28\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\__init__.py in <module>
# numpy compat
from pandas.compat import (
np_version_under1p18 as _np_version_under1p18,
is_numpy_dev as _is_numpy_dev,
C:\ArcGISPro28\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\compat\__init__.py in <module>
np_version_under1p20)
from pandas.compat.pyarrow import (
pa_version_under1p0,
pa_version_under2p0,
C:\ArcGISPro28\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\compat\pyarrow.py in <module>
_pa_version = pa.__version__
_palv = Version(_pa_version)
pa_version_under1p0 = _palv < Version("1.0.0")
pa_version_under2p0 = _palv < Version("2.0.0")
C:\ArcGISPro28\bin\Python\envs\arcgispro-py3\lib\site-packages\pandas\util\version\__init__.py in __init__(self, version)
# Validate the version and parse it into pieces
match = self._regex.search(version)
if not match:
raise InvalidVersion(f"Invalid version: '{version}'")
TypeError: expected string or bytes-like object

Jython ValueError: chr() arg not in range(256)

I am using Jython (jython2.7.0) to send a string value from a Java program to a Python method and then return the value to the Java program, but I get this error: ValueError: chr() arg not in range(256). What causes this problem, and how can I solve it?
Exception in thread "main" Traceback (most recent call last):
File "PageRanking.py", line 9, in <module>
from bs4 import BeautifulSoup
File "C:\jython2.7.0\Lib\bs4\__init__.py", line 35, in <module>
from .builder import builder_registry, ParserRejectedMarkup
File "C:\jython2.7.0\Lib\bs4\builder\__init__.py", line 7, in <module>
from bs4.element import (
File "C:\jython2.7.0\Lib\bs4\element.py", line 10, in <module>
from bs4.dammit import EntitySubstitution
File "C:\jython2.7.0\Lib\bs4\dammit.py", line 14, in <module>
from html.entities import codepoint2name
File "C:\jython2.7.0\Lib\html\__init__.py", line 6, in <module>
from html.entities import html5 as _html5
File "C:\jython2.7.0\Lib\html\entities.py", line 2507, in <module>
entitydefs[name] = chr(codepoint)
This is my Python code:
from __future__ import with_statement
from bs4 import BeautifulSoup
import requests
def pageRank(link):
    url = "https://checkpagerank.net/"
    payload = {'name':link}
    r = requests.post(url, payload)
    with open("requests_results.html", "wb") as f:
        f.write(r.content)
    with open(r'requests_results.html', "r", encoding='utf-8') as f:
        text= f.read()
    soup = BeautifulSoup(r.text, 'html.parser')
    results = soup.find_all('h2')
    SResult = results[1]
    first= SResult.contents[0]
    rankerName = first.find('b').text
    second= SResult.contents[2]
    rankervalue = second.find('b').text
    x = rankervalue[:1]
    x = int(x)
    x= x*100/10
    return x

Assertion error when making an MP4 video out of numpy arrays with OpenCV

I have this python code that should make a video:
import cv2
import numpy as np
out = cv2.VideoWriter("/tmp/test.mp4",
                      cv2.VideoWriter_fourcc(*'MP4V'),
                      25,
                      (500, 500),
                      True)
data = np.zeros((500,500,3))
for i in xrange(500):
    out.write(data)
out.release()
I expect a black video but the code throws an assertion error:
$ python test.py
OpenCV(3.4.1) Error: Assertion failed (image->depth == 8) in writeFrame, file /io/opencv/modules/videoio/src/cap_ffmpeg.cpp, line 274
Traceback (most recent call last):
File "test.py", line 11, in <module>
out.write(data)
cv2.error: OpenCV(3.4.1) /io/opencv/modules/videoio/src/cap_ffmpeg.cpp:274: error: (-215) image->depth == 8 in function writeFrame
I tried various fourcc values but none seem to work.
According to @jeru-luke and @dan-masek's comments:
import cv2
import numpy as np
out = cv2.VideoWriter("/tmp/test.mp4",
                      cv2.VideoWriter_fourcc(*'mp4v'),
                      25,
                      (1000, 500),
                      True)
data = np.transpose(np.zeros((1000, 500,3), np.uint8), (1,0,2))
for i in xrange(500):
    out.write(data)
out.release()
The problem is that you did not specify the data type of elements when calling np.zeros. As the documentation states, by default numpy will use float64.
>>> import numpy as np
>>> np.zeros((500,500,3)).dtype
dtype('float64')
However, the VideoWriter implementation only supports 8 bit image depth (as the "(image->depth == 8)" part of the error message suggests).
The solution is simple -- specify the appropriate data type, in this case uint8.
data = np.zeros((500,500,3), dtype=np.uint8)
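If the frames come from a computation that produces floats, one way to guard against this is to convert each frame before writing it. The sketch below is only an illustration: to_uint8_frame is a hypothetical helper name, and it assumes the pixel values are already on a 0..255 scale.

```python
import numpy as np

def to_uint8_frame(frame):
    """Hypothetical helper: return the frame as uint8, clipping to 0..255."""
    if frame.dtype == np.uint8:
        return frame  # already 8-bit, nothing to do
    # Clip first so out-of-range floats do not wrap around when cast.
    return np.clip(frame, 0, 255).astype(np.uint8)

data = np.zeros((500, 500, 3))   # float64 by default
frame = to_uint8_frame(data)     # same shape, 8-bit depth
print(data.dtype, frame.dtype)
```

The converted frame can then be passed to out.write(frame) in the loop from the question.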

hstack csr matrix with pandas array

I am doing an exercise on Amazon Reviews; the code is below.
I am not able to add a column (pandas array) to the CSR matrix that I got after applying BoW.
Even though the number of rows in both matrices matches, I cannot get it to work.
import sqlite3
import pandas as pd
import numpy as np
import nltk
import string
import matplotlib.pyplot as plt
import seaborn as sns
import scipy
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn import metrics
from sklearn.metrics import roc_curve, auc
from nltk.stem.porter import PorterStemmer
from sklearn.manifold import TSNE
#Create Connection to sqlite3
con = sqlite3.connect('C:/Users/609316120/Desktop/Python/Amazon_Review_Exercise/database/database.sqlite')
filtered_data = pd.read_sql_query("""select * from Reviews where Score != 3""", con)
def partition(x):
    if x < 3:
        return 'negative'
    return 'positive'
actualScore = filtered_data['Score']
actualScore.head()
positiveNegative = actualScore.map(partition)
positiveNegative.head(10)
filtered_data['Score'] = positiveNegative
filtered_data.head(1)
filtered_data.shape
display = pd.read_sql_query("""select * from Reviews where Score !=3 and Userid="AR5J8UI46CURR" ORDER BY PRODUCTID""", con)
sorted_data = filtered_data.sort_values('ProductId', axis=0, ascending=True, inplace=False, kind='quicksort', na_position='last')
final=sorted_data.drop_duplicates(subset={"UserId","ProfileName","Time","Text"}, keep='first', inplace=False)
final.shape
display = pd.read_sql_query(""" select * from reviews where score != 3 and id=44737 or id = 64422 order by productid""", con)
final=final[final.HelpfulnessNumerator<=final.HelpfulnessDenominator]
final['Score'].value_counts()
count_vect = CountVectorizer()
final_counts = count_vect.fit_transform(final['Text'].values)
final_counts.shape
type(final_counts)
positive_negative = final['Score']
#Below is giving error
final_counts = hstack((final_counts,positive_negative))
sparse.hstack combines the coo format matrices of the inputs into a new coo format matrix.
final_counts is a csr matrix, so the sparse.coo_matrix(final_counts) conversion is trivial.
positive_negative is a column of a DataFrame. Look at
sparse.coo_matrix(positive_negative)
It probably is a (1,n) sparse matrix. But to combine it with final_counts it needs to be (n,1) shaped.
Try creating the sparse matrix, and transposing it:
sparse.hstack((final_counts, sparse.coo_matrix(positive_negative).T))
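A small sketch of the shape issue, using made-up stand-in data (a 3x2 count matrix and a numeric column) rather than the real review data, and assuming the column is a plain 1-D NumPy array:

```python
import numpy as np
from scipy import sparse

# Stand-ins for the question's data (hypothetical values):
# a 3x2 count matrix and a numeric column with one entry per row.
final_counts = sparse.csr_matrix(np.array([[1, 0], [0, 2], [3, 0]]))
column = np.array([1, 0, 1])

row_vector = sparse.coo_matrix(column)   # a 1-D input becomes a single row
col_vector = row_vector.T                # transpose so there is one entry per row

merged = sparse.hstack((final_counts, col_vector))
print(row_vector.shape, col_vector.shape, merged.shape)
```

The row counts only line up after the transpose, which is why hstack needs the .T.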
I used the line below but am still getting an error:
merged_data = scipy.sparse.hstack((final_counts, scipy.sparse.coo_matrix(positive_negative).T))
The error is below:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sparse' is not defined
>>> merged_data = scipy.sparse.hstack((final_counts, sparse.coo_matrix(positive_negative).T))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'sparse' is not defined
>>> merged_data = scipy.sparse.hstack((final_counts, scipy.sparse.coo_matrix(positive_negative).T))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python34\lib\site-packages\scipy\sparse\construct.py", line 464, in hstack
return bmat([blocks], format=format, dtype=dtype)
File "C:\Python34\lib\site-packages\scipy\sparse\construct.py", line 600, in bmat
dtype = upcast(*all_dtypes) if all_dtypes else None
File "C:\Python34\lib\site-packages\scipy\sparse\sputils.py", line 52, in upcast
raise TypeError('no supported conversion for types: %r' % (args,))
TypeError: no supported conversion for types: (dtype('int64'), dtype('O'))
I was facing the same issue with sparse matrices. You can convert the CSR matrix to dense with todense() and then use np.hstack((dataframe.values,converted_dense_matrix)). That will work. numpy.hstack cannot handle sparse matrices directly.
However, for a very large data set, converting to a dense matrix is not a good idea. In your case scipy's hstack fails because the data types of the two inputs differ: one is int64 and the other is object.
Try positive_negative = final['Score'].values and pass that to scipy.sparse.hstack. If it doesn't work, can you post the output of positive_negative.dtype?
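Since the Score column was mapped to the strings 'positive' and 'negative', its dtype is object, which matches the dtype('O') in the error. One possible workaround, sketched here with made-up stand-in data, is to encode the labels as integers before stacking. The 1/0 encoding is my own assumption, not something from the question:

```python
import numpy as np
from scipy import sparse

# Stand-ins for the question's data (hypothetical values).
final_counts = sparse.csr_matrix(np.array([[1, 0], [0, 2], [3, 0]]))
labels = np.array(['positive', 'negative', 'positive'], dtype=object)

# Encode the string labels as integers: 1 for 'positive', 0 otherwise.
numeric_labels = (labels == 'positive').astype(np.int64)

# Both blocks are now numeric, so the sparse hstack can find a common dtype.
merged = sparse.hstack(
    (final_counts, sparse.coo_matrix(numeric_labels).T)
).tocsr()
print(merged.shape, merged.dtype)
```

This keeps everything sparse, so it avoids the memory cost of todense() on a large matrix.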

ImportError: No module named Image when importing ironpython dll

I have a python package called CoreCode which I have compiled using clr.CompileModules() in IronPython 2.7.5. This generated a file called CoreCode.dll. I then import this dll into my IronPython module by using clr.AddReference(). I know the dll works because I have successfully tested some of the classes as shown below. However, my problem lies with the Base_Slice_Previewer class. This class makes use of Image and ImageDraw from PIL in order to generate and save a bitmap file.
I know the problem doesn't lie with PIL because the package works perfectly well when run in Python 2.7. I'm assuming that this error is coming up because IronPython can't find PIL but I'm not sure how to work around this problem. Any help will be much appreciated.
Code to create the dll
import clr
clr.CompileModules("CoreCode.dll", "CoreCode\AdvancedFileHandlers\ScannerSliceWriter.py", "CoreCode\AdvancedFileHandlers\__init__.py", "CoreCode\MarcamFileHandlers\MTTExport.py", "CoreCode\MarcamFileHandlers\MTTImporter.py", "CoreCode\MarcamFileHandlers\__init__.py", "CoreCode\Visualizer\SlicePreviewMaker.py", "CoreCode\Visualizer\__init__.py", "CoreCode\Timer.py", "CoreCode\__init__.py")
Test for Timer.py
>>> import clr
>>> clr.AddReference('CoreCode.dll')
>>> from CoreCode.Timer import StopWatch
>>> stop_watch = StopWatch()
>>> print stop_watch.__str__()
0:00:00:00 0:00:00:00
>>>
Test for MTTExport.py
>>> from CoreCode.MarcamFileHandlers.MTTExport import MTT_Layer_Exporter
>>> mttlayer = MTT_Layer_Exporter()
>>> in_val = (2**20)+ (2**16) + 2
>>> bytes = mttlayer.write_lf_int(in_val, force_full_size=True)
>>> print "%s = %s" %(bytes, [hex(ord(x)) for x in bytes])
à ◄ ☻ = ['0xe0', '0x0', '0x0', '0x0', '0x0', '0x11', '0x0', '0x2']
>>>
Test for SlicePreviewMaker.py
>>> from CoreCode.Visualizer.SlicePreviewMaker import Base_Slice_Previewer
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "CoreCode\Visualizer\SlicePreviewMaker", line 1, in <module>
ImportError: No module named Image
>>>

win32com.client error

When using win32com, something puzzled me.
>>> import win32com
>>> w=win32com.client.Dispatch('Word.Application')
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
w=win32com.client.Dispatch('Word.Application')
AttributeError: 'module' object has no attribute 'client'
What's wrong?
win32com.client is a module in the win32com package. Importing the package alone does not load it, so you need to import the actual module.
import win32com.client
w = win32com.client.Dispatch('Word.Application')
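The same behaviour can be reproduced without win32com. The sketch below builds a throwaway package on disk (demo_pkg and its client submodule are hypothetical names made up for this illustration) and shows that the submodule only becomes an attribute of the package after it is imported explicitly:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package "demo_pkg" with one submodule "client".
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")  # empty __init__.py: it imports no submodules
with open(os.path.join(pkg_dir, "client.py"), "w") as f:
    f.write("def dispatch(name):\n    return 'dispatched ' + name\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg
has_client_before = hasattr(demo_pkg, "client")  # submodule not loaded yet

import demo_pkg.client
has_client_after = hasattr(demo_pkg, "client")   # now bound on the package
result = demo_pkg.client.dispatch("Word.Application")
print(has_client_before, has_client_after, result)
```

A package only exposes a submodule as an attribute if its __init__.py imports it or if something else has already imported it, which is why import win32com by itself was not enough here.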