KeyError: 'metric_file' using fbprophet and kernel restart - facebook-prophet

I'm trying to reproduce Ten's tutorial on Time Series Forecasting with fbprophet (https://xang1234.github.io/prophet/). The dataset can be downloaded from https://data.gov.sg/dataset/air-passenger-arrivals-total-by-region-and-selected-country-of-embarkation
Here's my code:
import pandas as pd
from fbprophet import Prophet

# Load the air passenger arrivals dataset and keep only arrivals from China
air = pd.read_csv(r'C:\Users\minri\Desktop\total-air-passenger-arrivals-by-country.csv')
air = air[air.level_3 == 'China']
air = air.drop(['level_1', 'level_2', 'level_3'], axis=1)
# Drop missing entries and rename to the column names Prophet expects
air = air[(air.value != 'na') & (air.value != '-')]
air = air.rename(columns={'month': 'ds', 'value': 'y'})

model_air = Prophet()
model_air.fit(air)
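For reference, a hedged variant of the preparation step with explicit type casts (my own addition, not part of the tutorial): Prophet fits a two-column frame of dates (ds) and numeric values (y), so making the conversion explicit before fit does no harm:
# Hedged sketch (my addition, not from the tutorial): make the date/numeric
# conversion of the two Prophet columns explicit before calling fit.
air['ds'] = pd.to_datetime(air['ds'])  # e.g. '1961-01' -> Timestamp
air['y'] = pd.to_numeric(air['y'])     # values were read from the CSV as strings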
Below are my current versions:
pandas: 1.0.4
numpy: 1.18.5
fbprophet: 0.7.1
pystan: 2.18.0.0
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
KeyError: 'metric_file'
Restarting kernel...
[SpyderKernelApp] WARNING | No such comm: d690f93a8e1511eb9fdc1063c874af38

Related

Geopandas and Spyder incompatibility

I want to run Geopandas modules in Spyder. Apparently Geopandas is compatible with Spyder 4.2.5 (not with any higher version), and I could run code with this combination. However, in one of my scripts I had to use the "input" command, and that is where the problem starts: Spyder 4.2.5 crashes when I try to run an input command. From the internet I learned that there was a bug in Spyder that was fixed in Spyder 5.3. Now I have no idea how to fix this problem: if I upgrade Spyder, Geopandas will not work; if I don't upgrade Spyder, 'input' will not work.
I was trying to run something like the following code
def Coditions_R3():
    print("This is R3")

def Coditions_R4():
    print("This is R4")

System = input('Please Enter drone system: \n')
print(System)
if System == 'R3':
    Coditions_R3()
elif System == 'R4':
    Coditions_R4()
Can anyone help? Is there any way to run Geopandas with higher Spyder versions, or something else I can use in place of input?
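One hedged workaround, sketched under the assumption that the value can be supplied when the script is launched rather than interactively: read the drone system from a command-line argument with argparse, so nothing blocks on input() in Spyder's console.
# Hedged sketch: take the drone system from the command line instead of input()
import argparse

def Coditions_R3():
    print("This is R3")

def Coditions_R4():
    print("This is R4")

parser = argparse.ArgumentParser()
parser.add_argument('system', help="drone system, e.g. R3 or R4")
args = parser.parse_args()

print(args.system)
if args.system == 'R3':
    Coditions_R3()
elif args.system == 'R4':
    Coditions_R4()
Run it from a terminal as python drone.py R3 (the script name here is hypothetical); Spyder's Run configuration also allows command-line arguments to be passed.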

KeyError: 'NormalizeUTF8' when loading a model in another Jupyter notebook

I'm trying to reload a model in another Jupyter notebook using this code:
import tensorflow as tf
reloaded = tf.saved_model.load('m_translator')
result = reloaded.tf_translate(input_text)
and I recently got this error:
KeyError Traceback (most recent call last)
File ~\anaconda3\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py:4177, in Graph._get_op_def(self, type)
4176 try:
-> 4177 return self._op_def_cache[type]
4178 except KeyError:
KeyError: 'NormalizeUTF8'
FileNotFoundError: Op type not registered 'NormalizeUTF8' in binary running on LAPTOP-D3PPA576. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
You may be trying to load on a different device from the computational device. Consider setting the `experimental_io_device` option in `tf.saved_model.LoadOptions` to the io_device such as '/job:localhost'.
I had this issue before. To solve it, you need to install tensorflow_text, matching your TensorFlow version. Check the version and install the corresponding package:
>>> tf.__version__
2.8.2
>>> !pip install tensorflow-text==2.8.2
In addition to installing the tensorflow_text library, what helped me with a similar problem was importing the library at the top of the notebook:
import tensorflow_text
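Putting the two answers together, a hedged sketch of the loading cell (paths and names taken from the question):
# tensorflow_text must be imported before loading so its ops (including
# NormalizeUTF8) are registered in the running binary.
import tensorflow as tf
import tensorflow_text  # noqa: F401  (imported only for op registration)

reloaded = tf.saved_model.load('m_translator')
result = reloaded.tf_translate(input_text)  # input_text as defined in the question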

How to run a pandas-Koalas program using spark-submit (Windows)?

I have a pandas DataFrame (sample program) that I converted to a Koalas DataFrame, and now I want to execute it on a Spark cluster (Windows standalone). When I try it from the command prompt as
spark-submit --master local hello.py, I get the error ModuleNotFoundError: No module named 'databricks'.
import pandas as pd
from databricks import koalas as ks

# Read the Excel sheet into pandas, then convert it to a Koalas DataFrame
workbook_loc = r"c:\2020\Book1.xlsx"
df = pd.read_excel(workbook_loc, sheet_name='Sheet1')
kdf = ks.from_pandas(df)
print(kdf)
What should I change so that I can make use of Spark cluster features? My actual program, written in pandas, does many things, and I want to use the Spark cluster to see performance improvements.
You should install koalas via the cluster's admin UI (Libraries/PyPI); if you just run pip install koalas on the cluster, it won't work.
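As a hedged aside for the local spark-submit case in the question (not part of the answer above): the ModuleNotFoundError usually just means koalas is not installed in the Python environment that spark-submit picks up, which can be checked from inside the script:
# Hedged sketch: confirm which interpreter actually runs the job and whether
# koalas is importable there. spark-submit uses the interpreter named by the
# PYSPARK_PYTHON environment variable (or the default python on PATH), so
# `pip install koalas` has to target that same environment.
import sys
print(sys.executable)          # interpreter executing hello.py
from databricks import koalas  # raises ModuleNotFoundError if koalas is missing
print(koalas.__version__)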

Cannot write pandas DataFrame to csv: __init__() got an unexpected keyword argument 'tupleize_cols'

I'm running the following basic code:
dfMain.to_csv('./January_filtered_International_WE.csv')
which used to run normally until yesterday. This morning I upgraded to pandas 0.25.0 while the code was running, and now I cannot write my 500k-row DataFrame to a CSV. I should mention that I left the Jupyter Notebook running in order to do some processing, so this morning when I opened it I already had the DataFrame, processed.
Versions (using Windows 10)
Jupyter notebook : 5.7.8
Python : 3.6.7
Pandas : 0.25.0
I would like to save my DataFrame in a fast and efficient manner as I will load it several times in the future. I do not want to close the notebook, as this would delete the DataFrame.
I tried:
downgrading to pandas 0.24.2 (the version used previously), but I still get __init__() got an unexpected keyword argument 'tupleize_cols'
using pd.to_pickle, but got a MemoryError
using pd.to_hdf, but got a MemoryError
using msgbox instead, but apparently it does not support DataFrames (got an error)
upgrading Jupyter Notebook, but got the following error:
ERROR: ipython 5.8.0 has requirement prompt-toolkit<2.0.0,>=1.0.4, but
you'll have prompt-toolkit 2.0.9 which is incompatible
so naturally I ran pip install prompt-toolkit==1.0.16, but then got this message:
ERROR: jupyter-console 6.0.0 has requirement prompt-toolkit<2.1.0,>=2.0.0, but you'll have prompt-toolkit 1.0.16 which is incompatible.
As an alternative, I went into PyCharm, ran a random DataFrame.to_csv, and it worked. This makes me think the issue is with Jupyter Notebook.
Any help on how to save the DataFrame (~12 GB) is appreciated!
Re-installing Jupyter did the trick in my case
I kept getting the same error, but updating Jupyter fixed it
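As a hedged aside beyond the Jupyter fixes above: DataFrame.to_csv accepts a chunksize argument that writes the frame a block of rows at a time, which can keep the peak memory of the write itself down (the chunk size below is an arbitrary example):
# Hedged sketch: write the large frame in blocks of rows rather than all at once.
dfMain.to_csv('./January_filtered_International_WE.csv', chunksize=100_000)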

Google BigQuery ML model schema ValueError

I'm using Google BigQuery ML for the first time and am trying to train a linear regression model using the following command:
%%bigquery
CREATE OR REPLACE MODEL `sandbox.sample_lr_model`
OPTIONS
(model_type='linear_reg',
data_split_method ='no_split',
max_iterations=1) AS
SELECT
y AS label,
x AS x
FROM
`sandbox.y2018m08d01_rh_sample_dataframe_to_bq_v01_v01`
This step fails with the following error message:
ValueError: Table has no schema: call 'client.get_table()'
However, the model is created and can be viewed; it has a so-called "Model schema". Am I doing something wrong?
Versions: google-cloud-bigquery==1.4.0, Python 3.5, Ubuntu
My input table is a minimal example.
This issue was fixed in https://github.com/GoogleCloudPlatform/google-cloud-python/pull/5602, which was released in version 1.4.0 of the google-cloud-bigquery library. To double-check your installed version, run !pip freeze | grep bigquery in a notebook cell.
Note that Datalab does not include the latest version of the google-cloud-bigquery library. To upgrade the version, run !pip install --upgrade google-cloud-bigquery.
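A hedged way to make the same version check from Python inside the notebook rather than via pip:
# Hedged sketch: print the installed client library version.
from google.cloud import bigquery
print(bigquery.__version__)  # the fix referenced above shipped in 1.4.0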