Deployment to Heroku causes pandas to read my CSV file incorrectly in my Streamlit app

I have a Streamlit app that uses three datasets to give descriptive statistics about three cities. The libraries used are Streamlit, pandas, and NumPy.
The app works perfectly on my local machine, but after deploying to Heroku, pandas doesn't read the CSV datasets correctly: it reads in only the version, oid, and size. Also, I used Git LFS to upload my datasets to GitHub (each dataset exceeds the 25 MB limit). I carefully followed the instructions on how to deploy an app to Heroku from this video (https://youtu.be/JwSS70SZdyM?t=10526).
(Screenshot omitted; it showed pandas' output containing only the version, oid, and size fields.)
I truly don't know what to try, as I haven't seen a solution to this problem on Stack Overflow before.
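For context, when Git LFS is set up but the real file was never pulled, what ends up on disk is just a three-line LFS pointer. The sketch below (with made-up oid and size values) reproduces the symptom of pandas parsing such a pointer instead of the actual city data:
# lfs_pointer_demo.py - a sketch of the symptom, not my actual data; the
# oid and size are placeholder values in the standard Git LFS pointer format.
import io
import pandas as pd

pointer_text = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:0000000000000000000000000000000000000000000000000000000000000000\n"
    "size 12345678\n"
)

# Parsing the pointer as if it were a CSV yields rows containing only
# 'version', 'oid', and 'size' - exactly the symptom described above.
df = pd.read_csv(io.StringIO(pointer_text), sep=" ", header=None)
print(df)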

Related

Install Scrapy on PythonAnywhere? (or Cloud9)

Can I run Scrapy on the free level of PythonAnywhere? I've looked, but haven't found instructions for installing it there.
If it can't be run on the free level of PythonAnywhere, is there another online environment where I can run Scrapy without needing to install Python and Scrapy on my computer?
EDIT: My question was just about PythonAnywhere, but in finding the answer to the question, I came across Cloud9 and found it to be a preferable alternative, which is explained in the answer.
Short summary:
Scrapy comes preinstalled on PythonAnywhere. No installation required.
I found an alternative that I like better: Cloud9. I was able to install Scrapy on it, but with a security issue that probably won't be a problem for me.
====================================
There were three parts to my question:
Can I run Scrapy in the free level of PythonAnywhere? This part has been answered: Yes, but with debilitating restrictions.
The other two parts have not been answered, but I've found some answers and will share them here.
What other online environments allow me to run Scrapy without needing to install Python and Scrapy on my computer? I haven't found a direct answer to this, but the free tutorial website, Python for Everybody ("Py4E"), has a page, Setting up your Python Development Environment, which lists four online Python environments. It provides a brief tutorial on PythonAnywhere and then just provides links to the other three: Trinket, Cloud9, and CodeAnywhere.
None of those four environments say anything about running Scrapy on them. With some more research, I did find out how to use Scrapy in PythonAnywhere, which I explain next below. Of the other three, Cloud9 is part of Amazon's AWS suite, which is a sophisticated set of software tools that I've used other parts of before. Based on that, I assumed it also accommodates Scrapy, and I checked it out as well. I've added the results of that below as a new part 4 to my question.
Now, the main part of my question: How to install Scrapy on PythonAnywhere? The answer is:
You don't. It's already installed!
It's amazing that PythonAnywhere's otherwise excellent documentation doesn't say anything about this. I found it out by following instructions that I hoped would lead me to installing Scrapy:
First, since I'm new to Python (but not to programming), I ran through Py4E's tutorial on PythonAnywhere. It's really a quick introduction to Python: it got me to write a simple program, told me to use the Bash Unix shell instead of the Python interpreter ("$" instead of ">>>"), and had me save the program to a file.
Next, I went to Scrapy's installation instructions. It has this wonderful line: "... if you’re already familiar with installation of Python packages, you can install Scrapy and its dependencies from PyPI with: pip install Scrapy". Of course, it doesn't follow that by saying what to do if I'm not familiar with that. Sigh!
After that, I somehow found my way to Python's official instructions on Installing Packages, which start by explaining that "package" means "a bundle of software to be installed", so I thought that might include Scrapy. I ran through the instructions there, and about halfway through, it told me to run:
python3 -m pip install "SomeProject"
(* Footnote below on syntax of that command)
The instructions said that "SomeProject" is supposed to be a project that's included in the Python Package Index, so I went there and searched for Scrapy. It gave me a list of 681 projects with "scrapy" in the name, and some of them looked like they might be various versions of Scrapy itself. None of them were called just "Scrapy", but the Scrapy instruction quoted above said to use just that name. So I held my breath and entered:
python3 -m pip install Scrapy
And guess what I got? PythonAnywhere told me:
Requirement already satisfied: Scrapy in /usr/local/lib/python3.9/site-packages (2.5.0)
That was followed by a couple of dozen more lines that all started with "Requirement already satisfied", which I took to be the dependencies required by Scrapy, all of them already present and ready to roll.
So, hmmm, Scrapy is already there? To find out if that's really true, I went to the tutorial on Scrapy's website. The first thing it said was to create a project by using the command:
scrapy startproject tutorial
I entered that, and PythonAnywhere told me that it had successfully created a new project. Since this was a Scrapy command, I conclude that, yes, indeed, I already have Scrapy installed and running on PythonAnywhere. No installation necessary!
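For anyone who wants to push the test one step further, here is a minimal spider sketch modeled on the official tutorial's quotes.toscrape.com example (the site and the CSS selectors are the tutorial's, not mine):
# tutorial/spiders/quotes_spider.py - a minimal spider in the style of the
# official Scrapy tutorial; quotes.toscrape.com is the tutorial's demo site.
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/page/1/"]

    def parse(self, response):
        # Yield one item per quote block on the page.
        for quote in response.css("div.quote"):
            yield {
                "text": quote.css("span.text::text").get(),
                "author": quote.css("small.author::text").get(),
            }

# From the project directory, `scrapy crawl quotes -O quotes.json` runs the
# spider and writes the scraped items to a JSON file.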
What about Cloud9? As I said above in my answer to part 2, when I found out about Cloud9, I was interested because it's part of Amazon Web Services ("AWS"). I've used other parts of AWS before and found them to be sophisticated, complicated, powerful, and well-documented. They are also very economical.
AWS is a commercial system run by Amazon. It charges fees based on usage, with no minimums, and with low-volume usage being free. The pricing page for Cloud9 shows it to be no exception. Cloud9 itself is free to use, but using it calls on other AWS resources that have charges.
The pricing page gives the following example: "If you use the default settings running an IDE for 4 hours per day for 20 days in a month with a 30-minute auto-hibernation setting your monthly charges for 90 hours of usage would be ... $2.05". That's less than half the lowest monthly cost of PythonAnywhere. (As stated in the answer by Giles Thomas, the free level of PythonAnywhere is not very useful for Scrapy.) I'm not sure how the amount of usage in the Cloud9 example compares with the amount of usage allowed by PythonAnywhere's $5/mo service, but my usage is going to be a lot less than either one, so I expect my cost of using Cloud9 to be very low, and possibly nothing. Furthermore, if I only use Scrapy for a project a couple of times a year, then with PythonAnywhere I'd have to close my account between projects to stop being charged. AWS, on the other hand, doesn't charge me when I'm not using it, so I can keep the account open at no cost between projects.
So based on both the quality of the AWS modules I've used and the low usage cost, I was very interested in Cloud9 as an alternative.
And I was not surprised to find that I could use Scrapy in it.
To figure that out, I quickly abandoned the webpage instructions in favor of downloading a PDF of the comprehensive User Guide from the documentation page. Comprehensive = 595 pages! But it's very well organized and cross-referenced, so I was able to learn what I needed by reading about 20 pages, which included a tutorial on using the GUI environment (pp. 29-38) and another on using Python in Cloud9 (pp. 423-427).
In that second tutorial, I ran:
python3 --version to find out that Python was already installed, version 3.7.10.
python -m pip --version to find out that pip version 20.2.2 is running.
After that tutorial, I was ready to find out if Scrapy is there. I had learned by then about pip show, so I ran:
python -m pip show Scrapy
The answer was no:
WARNING: Package(s) not found: Scrapy
So I repeated the command that I'd done earlier in PythonAnywhere:
python3 -m pip install Scrapy
This time, there were very few "Requirement already satisfied"s and instead there were a lot of "Collecting ... Downloading"s, followed by "Installing collected packages" and then "Successfully installed" with a long list that included Scrapy-2.6.1.
I repeated python -m pip show Scrapy and got several lines of output that told me Scrapy 2.6.1 is installed. Finally, I ran the same test I'd run before in PythonAnywhere, the first instruction in the official Scrapy tutorial:
scrapy startproject tutorial
and got the same output as before, telling me that the project had been created.
Bingo! I have Scrapy running in Cloud9.
On the negative side, there was a problem here. AWS has two levels of sign-in authority, called root users and IAM users. For proper security, I should be running Cloud9 as an IAM user, but I had a problem signing in that way. I posted a question on SO about that, but while waiting for an answer, I went ahead and started using Cloud9 as the root user. In the course of that, I got the message:
WARNING: Running pip install with root privileges is generally not a good idea.
That warning came with a suggestion of an alternative command that didn't make sense and didn't work when I tried it. So I'm not sure how much I've messed up the security of my AWS account by what I've been doing here. My work is not secretive, so the security may be a non-issue, but I'd still like to figure out how to proceed as an IAM user and clean up any damage I might have caused by what I've been doing as the root user. If anyone knows about that, please respond to the SO question about it linked in the previous paragraph.
So now I've got Scrapy running in Cloud9, and I'm going to go find out if it can get the data I need. I'll make another edit here if there are any surprises in terms of Cloud9 either (a) not being able to do something or (b) resulting in unexpected charges.
====================================
(*) Footnote on syntax of python3 -m pip install "SomeProject":
Since I was working in something called PythonAnywhere, I was tempted to think that this was a Python command. But then I had to remember that, within PythonAnywhere, I was working in Bash, a Unix shell, so python3 here is a shell command that starts the Python interpreter. I haven't found documentation of that exact command, but I did find documentation of the python command it's presumably based on. That documentation says, "-m module-name Searches ... for the named module and runs the corresponding .py file as a script." So this means that pip is a Python module for installing Python packages, run here as a script by the interpreter, and install <project name> is a subcommand passed to pip. (Somebody please correct me if I've said any of that wrong.)
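A tiny experiment makes the -m mechanism concrete (the file name here is my own invention, nothing official):
# demo.py - saved in the current directory. Running `python3 -m demo` from
# Bash locates this module and runs it as a script, just as the quoted
# documentation describes; __name__ is set to "__main__" in that case.
if __name__ == "__main__":
    print("demo module run as a script via -m")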
You can, but free accounts on PythonAnywhere are limited to accessing sites on a whitelist of official public APIs, so you will probably not be able to access non-API sites.

Is there a way to store a generated file in an .ipynb jupyter notebook?

In Jupyter notebooks, a whole bunch of things that are effectively files can be represented, for example videos or images. This is one of their core strengths.
Is it possible to actually store a file in a notebook, so that afterwards you can open the .ipynb and still download the file when the backend is no longer running?
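One approach that seems plausible, sketched under the assumption that a saved HTML output cell is acceptable: embed the file's bytes as a base64 data URI in a download link. The rendered link is stored in the .ipynb itself, so it should still work when the kernel is gone ("results.csv" is a placeholder name for whatever file the notebook generated):
# A sketch: embed a generated file in the notebook's saved output.
import base64

from IPython.display import HTML, display

with open("results.csv", "rb") as f:
    payload = base64.b64encode(f.read()).decode("ascii")

# The data URI lives inside the output cell, so it survives kernel shutdown.
display(HTML(
    f'<a download="results.csv" '
    f'href="data:application/octet-stream;base64,{payload}">'
    f'download results.csv</a>'
))
The obvious trade-off is that the notebook grows by roughly the base64-inflated size of the embedded payload.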

How to reduce an Electron app's size

My Electron application is insanely big on Mac after installation: around 1.39 GB for no apparent reason, even though it's around 70 MB on Windows. I tried to unpack the .dmg file to see what makes it so big and found a file called app.asar that accounts for most of the app's size (1.22 GB), and I don't know how to unpack this file.
So my questions are:
How can I make the application's size much smaller, like on Windows?
And what does the app.asar file contain?
I'm using electron-builder to build the app, by the way.
It depends on your app.
If your app is genuinely heavy, then a large size might be natural.
If not, try removing unwanted packages from the node_modules directory using npm.
Try the instructions here:
https://www.electronjs.org/docs/tutorial/application-distribution
Make electron app smaller?
https://github.com/electron/electron/issues/2003
You can also unpack the asar file using:
npx asar extract app.asar destfolder
More on that here: How to unpack an .asar file?
The asar file basically contains all your code and resources.
I hope I could help you 😉.
Some stuff was added based on others' comments.

Export a Python dataframe from a Google AI Platform Notebook to a table in Google BigQuery

I have a dataframe in a Google AI Platform Notebook (PN) that I would like to transfer to a table in Google BigQuery.
I am aware of the option to use df.to_gbq(), but that requires a pip install of pandas-gbq. I prefer to avoid pip installs on top of the libraries already included in PN, to keep the setup as simple as possible.
Am I perhaps missing an easy solution?
Br, Torben
There is a native way to import data into BigQuery: the BigQuery client library.
As the "google.cloud" library is already part of the AI Platform Notebooks environment, you won't need to install any new packages (thanks, Torben, for the confirmation!).
Here is a link to the official documentation
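A minimal sketch of that approach (the table ID is a placeholder, and note that load_table_from_dataframe typically relies on pyarrow under the hood to serialize the dataframe):
# A sketch of loading a pandas DataFrame into BigQuery with the native
# client library; "my-project.my_dataset.my_table" is a placeholder.
import pandas as pd
from google.cloud import bigquery

df = pd.DataFrame({"city": ["a", "b"], "value": [1, 2]})  # stand-in dataframe

client = bigquery.Client()  # picks up the notebook's default credentials
job = client.load_table_from_dataframe(df, "my-project.my_dataset.my_table")
job.result()  # block until the load job completes

print(f"Loaded {job.output_rows} rows.")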

Deploying TensorFlow models as a Windows exe

I want to use TensorFlow 1.4 for my ML modeling needs. My use case requires:
Training the model on GPU <--- I know how to do this with TF
Deploying the trained model on an ordinary box, as an .exe on CPU running Windows (for inference) <--- I don't know how to do this
Can somebody tell me if TF 1.4 supports this and, if so, point me to a guide or explain how it's done?
This is a little late, but this video on YouTube covers it pretty well.
He uses PyInstaller, which grabs everything needed and puts it all either into a single self-contained executable, or into a folder with the exe and its supporting files.
I've tried this myself and it works pretty well, although the result gets really huge: PyInstaller bundles the entire TensorFlow library and the Python interpreter, and if you use tensorflow-gpu it also includes the cuDNN files, which are around 600 MB, effectively leaving you with over 1 GB worth of files in the end.
That can be reduced by excluding modules you don't need; I recommend creating a virtual environment and starting from a clean installation of Python.
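As a rough sketch of the workflow (TF 1.x style to match the question; the checkpoint path and tensor names below are placeholders, not anything your model will necessarily use):
# inference.py - a minimal TF 1.x inference script to bundle with
# PyInstaller. "model.ckpt" and the tensor names are placeholder values.
import numpy as np
import tensorflow as tf

with tf.Session() as sess:
    # Restore the graph structure and the trained weights.
    saver = tf.train.import_meta_graph("model.ckpt.meta")
    saver.restore(sess, "model.ckpt")
    graph = tf.get_default_graph()
    x = graph.get_tensor_by_name("input:0")        # input placeholder
    y = graph.get_tensor_by_name("predictions:0")  # output tensor
    # Dummy input just to show the call; the shape is a placeholder too.
    result = sess.run(y, feed_dict={x: np.zeros((1, 10), dtype=np.float32)})
    print(result)

# On the Windows build machine (with CPU-only tensorflow installed), bundle:
#   pyinstaller --onefile inference.py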
Hope this helps anyway.