Import RRD files into Cacti

I have RRD files collected from different hosts and would like to import them into a Cacti instance. I followed what is described at http://docs.cacti.net/manual:087:8_rrdtool.05_external_rrds to generate graphs from the RRD files.
How can I update the files in the rra directory without overwriting them (i.e., append to them)?
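Not part of the original question, but one way to append data points from an external RRD into an existing file under rra/ is to read them back out with rrdtool fetch and feed them to rrdtool update. A rough sketch, assuming a single compatible data source and placeholder file names:
import subprocess

def append_rrd(source, target, start, end):
    # Pull the stored data points out of the external RRD.
    out = subprocess.run(
        ["rrdtool", "fetch", source, "AVERAGE", "-s", str(start), "-e", str(end)],
        capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if ":" not in line:
            continue  # skip the DS-name header and blank lines
        ts, values = line.split(":", 1)
        value = values.split()[0]
        if value in ("nan", "-nan"):
            continue  # no data recorded for this step
        # rrdtool update only accepts timestamps newer than the target's last
        # update, so this appends rather than overwrites.
        subprocess.run(["rrdtool", "update", target, f"{ts.strip()}:{value}"],
                       check=True)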

Related

PySpark project structure

I am very new to PySpark and the AWS domain. I have a project to complete where I need to read files from various data sources (a database, XML files) and store and process the data in the AWS cloud. Is there a project structure I should follow? I am thinking of having the following folders (a rough sketch of wiring them together follows the list):
config/
input/
utils/
process/
readme.txt
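A minimal sketch (not from the original question) of how a process/ entry point might wire the config/ and input/ folders together with PySpark; the file names, config keys, and the spark-xml package are assumptions:
import json
from pyspark.sql import SparkSession

def load_config(path="config/job.json"):
    # config/ holds connection strings, input paths, etc. (hypothetical file)
    with open(path) as f:
        return json.load(f)

def main():
    cfg = load_config()
    spark = SparkSession.builder.appName("etl-job").getOrCreate()

    # Read from a database over JDBC; credentials come from config/.
    db_df = (spark.read.format("jdbc")
             .option("url", cfg["jdbc_url"])
             .option("dbtable", cfg["table"])
             .load())

    # Read XML files dropped into input/ (needs the spark-xml package).
    xml_df = (spark.read.format("xml")
              .option("rowTag", cfg["row_tag"])
              .load("input/*.xml"))

    # process/ would hold the real transformations; write results out to S3.
    db_df.write.mode("overwrite").parquet(cfg["db_output_path"])
    xml_df.write.mode("overwrite").parquet(cfg["xml_output_path"])

if __name__ == "__main__":
    main()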

CSV/Pickle Files Too Large to Commit to GitHub Repo

I'm working on committing a project I have been working on for a while that we have not yet uploaded to GitHub. Most of it is Python/Pandas, where we do all our ETL work and save to CSV and pickle files that we then use for creating dashboards and running metrics on our data.
We are running into some issues with version control without using GitHub, so we want to get on top of that. I don't need version control on our CSV or pickle files, but I can't change the file paths or everything will break. When I try to make the initial commit to the repo, it won't let me because our pickle and CSV files are too big. Is there a way for me to commit the project without uploading the whole CSV/pickle files (the largest is ~10 GB)?
I have this in my .gitignore file, but it's still not letting me get around it. Thanks for any and all help!
*.csv
*.pickle
*.pyc
*.json
*.txt
__pycache__/MyScripts.cpython-38.pyc
.Git
.vscode/settings.json
*.pm
*.e2x
*.vim
*.dict
*.pl
*.xlsx

How to upload my dataset into Google Colab?

I have my dataset on my local device. Is there any way to upload this dataset into Google Colab directly?
Note:
I tried this code:
from google.colab import files
uploaded = files.upload()
But it uploads the files one by one. I want to upload the whole dataset directly.
Here's the workflow I used to upload a zip file and create a local data directory:
Zip the directory locally; something like: $ zip -r data.zip data
Upload the zip file of your data directory to Colab using Google's instructions:
from google.colab import files
uploaded = files.upload()
Once the zip file is uploaded, perform the following operations:
import zipfile
import io

# Wrap the uploaded bytes in a file-like object, open the archive,
# and extract everything into the current working directory.
zf = zipfile.ZipFile(io.BytesIO(uploaded['data.zip']), "r")
zf.extractall()
Your data directory should now be in Colab's working directory under a 'data' directory.
Zip or tar the files first, and then use tarfile or zipfile to unpack them.
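For example, a small sketch of the tarfile variant, assuming the archive was uploaded as data.tar.gz through the same files.upload() call:
import tarfile, io

# Open the uploaded gzipped tarball in memory and unpack it.
with tarfile.open(fileobj=io.BytesIO(uploaded["data.tar.gz"]), mode="r:gz") as tf:
    tf.extractall()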
Another way is to store the whole dataset in a NumPy object and upload it to Drive, where you can easily retrieve it. (Zipping and unzipping is also fine, but I had difficulty with it.)
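A rough sketch of that NumPy-plus-Drive approach (the file name, array names, and Drive path are placeholders, not from the original answer):
import numpy as np
from google.colab import drive

# On the local machine, bundle the dataset into one compressed file and
# upload it to Google Drive manually:
# np.savez_compressed("dataset.npz", images=images, labels=labels)

# In Colab, mount Drive and load the arrays back.
drive.mount("/content/drive")
data = np.load("/content/drive/MyDrive/dataset.npz")
images, labels = data["images"], data["labels"]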

Sphinx cannot find module for autodoc even after including it in sys.path

I have a large directory structure, so instead of manually adding the paths I am searching for all modules using os.walk(). Whenever I find a directory containing an __init__.py file, I add that directory to the PYTHONPATH using the following code in my conf.py file:
import os, sys
top = "/home/userme/Desktop/root"
for dirpath, dirnames, filenames in os.walk(top):
    if "__init__.py" in filenames:
        sys.path.insert(0, os.path.abspath(dirpath))
I am also printing the file paths to see that they are correct. Even then, while trying to read the corresponding .rst file, I get an error trying to import the module.
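Not from the original question, but for reference: autodoc usually needs the directory that contains the top-level package on sys.path, not every package subdirectory. A hedged variant of the same loop that adds only the parents of top-level packages could look like this:
import os, sys

top = "/home/userme/Desktop/root"
for dirpath, dirnames, filenames in os.walk(top):
    if "__init__.py" in filenames:
        parent = os.path.dirname(os.path.abspath(dirpath))
        # Add only the parent of a top-level package (its own parent has no
        # __init__.py), so "import package.submodule" resolves for autodoc.
        if not os.path.exists(os.path.join(parent, "__init__.py")) and parent not in sys.path:
            sys.path.insert(0, parent)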

Scrapy crawl appends locally, replaces on S3?

I implemented a Scrapy project that is now working fine locally. Using the crawl command, each spider appended its JSON lines to the same file if the file existed. When I changed the feed exporter to S3 using boto, it now overwrites the entire file with the data from the last-run spider instead of appending to the file.
Is there any way to get Scrapy/boto/S3 to append the JSON lines to the file the way it does locally?
Thanks
There is no way to append to a file in S3. You could enable versioning on the S3 bucket; then each time the file was written to S3, a new version of it would be created. You could then retrieve all versions of the file using the list_versions method of the boto Bucket object.
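A small sketch of that versioning idea with the old boto API (the bucket name and key prefix are placeholders, not from the answer):
import boto

conn = boto.connect_s3()
bucket = conn.get_bucket("my-feed-bucket")  # placeholder bucket name
bucket.configure_versioning(True)           # every subsequent write becomes a new version

# Later: list all stored versions of the feed file.
for version in bucket.list_versions(prefix="items.jl"):
    print(version.name, version.version_id)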
From reading the feed exporter code (https://github.com/scrapy/scrapy/blob/master/scrapy/contrib/feedexport.py), the file exporter opens the specified file in append mode, whilst the S3 exporter calls set_contents_from_file, which presumably overwrites the original file.
The boto S3 documentation (http://boto.readthedocs.org/en/latest/getting_started.html) doesn't mention being able to modify stored files, so the only solution would be to create a custom exporter that stores a local copy of the results, appends to that copy first, and then copies the file to S3.
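A rough sketch of that append-locally-then-copy idea with the old boto API (this is not Scrapy's exporter interface; the bucket and key names are placeholders): download the existing object if present, append the new JSON lines, and rewrite the whole object, since S3 itself cannot append.
import boto

def append_jsonlines_to_s3(local_path, bucket_name, key_name):
    conn = boto.connect_s3()                 # credentials from env vars or boto config
    bucket = conn.get_bucket(bucket_name)
    key = bucket.get_key(key_name)
    existing = key.get_contents_as_string() if key else b""
    with open(local_path, "rb") as f:
        new_lines = f.read()
    # S3 has no append, so the full object is rewritten with old + new content.
    bucket.new_key(key_name).set_contents_from_string(existing + new_lines)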