How to download netCDF4 file from webpage? - urllib

I want to download a netCDF4 file from a webpage. I can download the data file, but the file I get with the following code appears to be corrupted:
import shutil
import requests
from netCDF4 import Dataset

def download_file(url):
    local_filename = url.split('/')[-1]
    # stream the response body straight into a local file
    with requests.get(url, stream=True) as r:
        with open(local_filename, 'wb') as f:
            shutil.copyfileobj(r.raw, f)
    return local_filename
url = 'https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc'
local_filename = download_file(url)
sm_nc = Dataset(local_filename)
But I got the following error message:
Traceback (most recent call last):
  File "<ipython-input-98-809c92d8bce8>", line 1, in <module>
    sm_nc = Dataset(local_filename)
  File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -51] NetCDF: Unknown file format: b'SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc'
I also simply tried urllib.request.urlretrieve(url, './1.nc'), then sm_nc = Dataset('./1.nc'), but just got the following error message:
Traceback (most recent call last):
  File "<ipython-input-101-61d1f577421e>", line 1, in <module>
    sm_nc = Dataset('./1.nc')
  File "netCDF4/_netCDF4.pyx", line 2321, in netCDF4._netCDF4.Dataset.__init__
  File "netCDF4/_netCDF4.pyx", line 1885, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -51] NetCDF: Unknown file format: b'./1.nc'
But the thing is, if I paste the URL into the address bar of Safari or Chrome and download the file from there, the file I get is readable by netCDF4.Dataset. (You can try that yourself.) I have tried many other solutions, but none of them worked. Could anybody do me a favour? Thanks!
By the way, I am using requests 2.26.0, netCDF4 1.5.3, and the urllib.request that ships with Python 3.7.

You probably want to use urlretrieve. The following call to urllib should work:
import urllib.request
new_x = "/tmp/temp.nc"
x = "https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc"
urllib.request.urlretrieve(x, new_x)

When I try wget, it gives me an .nc file, but I am not sure about it: its size is only 19 KB. You can use wget in Python if this file is okay for you.
wget https://smos-diss.eo.esa.int/oads/data/SMOS_Open_V7/SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc
But it is not readable, because if you try to access the URL without logging in to the site, it gives you a meaningless file. If you paste this link into your browser and log in, it gives a 6 MB file, which I am sure is readable. If you still want to get the file with a Python script, look at selenium, which can click through the website for you, so you can log in and then download the file from the script.
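As a quick check (a diagnostic sketch, not part of the original answers), you can look at the first bytes of whatever was downloaded: a real netCDF file starts with b'CDF' (classic format) or b'\x89HDF' (netCDF-4/HDF5), whereas a login or redirect page will start with HTML. The filename below is the one produced by the code in the question.
# Diagnostic sketch: is the downloaded file really netCDF, or an HTML login page?
local_filename = 'SM_REPR_MIR_SMUDP2_20191222T183243_20191222T192549_700_300_1.nc'
with open(local_filename, 'rb') as f:
    magic = f.read(8)
if magic.startswith(b'CDF') or magic.startswith(b'\x89HDF'):
    print('Looks like a valid netCDF file')
else:
    # Most likely the server returned a login page instead of the data
    print('Not a netCDF file; first bytes:', magic)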

Related

ERROR trying to load Data to Google Collab from disk

I am trying to open and load some data from disk in Google Colab, but I get the following error message:
FileNotFoundError Traceback (most recent call last)
<ipython-input-38-cc9c795dc8d8> in <module>()
----> 1 test=open(r"C:\Users\Stefanos\Desktop\ΑΕΡΟΜΑΓΝΗΤΙΚΑ PUBLICATION\data\test.txt",mode="r")
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\Stefanos\\Desktop\\ΑΕΡΟΜΑΓΝΗΤΙΚΑ PUBLICATION\\data\\test.txt'
The error is raised by this code:
test=open(r"C:\Users\Stefanos\Desktop\ΑΕΡΟΜΑΓΝΗΤΙΚΑ PUBLICATION\data\test.txt",mode="r")
Your problem is that you are trying to load from disk using a path on your own computer!
Colab gives you a completely different machine in the cloud to work with, so it won't be able to open files on your computer.
You have to upload the files to Colab.
Use the following function to upload files. It will save them as well.
def upload_files():
    from google.colab import files
    uploaded = files.upload()
    for k, v in uploaded.items():
        open(k, 'wb').write(v)
    return list(uploaded.keys())
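A minimal usage sketch (assuming the file you pick in the upload dialog is called test.txt):
# Run the helper, pick test.txt in the upload dialog, then open it by name.
uploaded_names = upload_files()
test = open("test.txt", mode="r")  # the file now lives in Colab's working directory
print(test.read())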

How to fix "AttributeError: 'module' object has no attribute 'SOL_UDP'" error in Python Connector Mule

I'm trying to execute a basic script that returns a Cisco config file in JSON format. It runs successfully on Python 2.7.16 and Python 3.7.3, but when I try to execute the same script through the Python Connector for Mule ESB, I receive the error referred to in the title of this thread.
This is a Mule feature: the Python connector in this tool runs scripts with Jython 2.7.1, which is loaded as a library by Mule.
I expect the output to be a JSON file, but the actual output is:
Root Exception stack trace:
Traceback (most recent call last):
  File "<script>", line 2, in <module>
  File "C:\Python27\Lib\site-packages\ciscoconfparse\__init__.py", line 1, in <module>
    from ciscoconfparse import *
  File "C:\Python27\Lib\site-packages\ciscoconfparse\ciscoconfparse.py", line 17, in <module>
    from models_cisco import IOSHostnameLine, IOSRouteLine, IOSIntfLine
  File "C:\Python27\Lib\site-packages\ciscoconfparse\models_cisco.py", line 8, in <module>
    from ccp_util import _IPV6_REGEX_STR_COMPRESSED1, _IPV6_REGEX_STR_COMPRESSED2
  File "C:\Python27\Lib\site-packages\ciscoconfparse\ccp_util.py", line 16, in <module>
    from dns.resolver import Resolver
  File "C:\Python27\Lib\site-packages\dns\resolver.py", line 1148, in <module>
    _protocols_for_socktype = {
AttributeError: 'module' object has no attribute 'SOL_UDP'
The only thing I had to do was comment out that line in resolver.py, and with that the script ran smoothly in Anypoint Studio.
Thanks for your help; I hope this helps other people as well.
The problem appears to be that you are trying to execute a script that depends on another Python package. Mule supports executing Python scripts using the Jython implementation for Java, but it probably doesn't know about Python package dependencies.
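If editing resolver.py by hand is undesirable, a possible alternative (an assumption on my part, not taken from the answers above: it presumes the missing constant is the only problem) is to define socket.SOL_UDP before ciscoconfparse is imported, since Jython's socket module does not expose it while CPython defines it as 17:
import socket

# Jython's socket module lacks SOL_UDP; CPython defines it as 17.
# Patch it in before dnspython's resolver module gets imported.
if not hasattr(socket, 'SOL_UDP'):
    socket.SOL_UDP = 17

from ciscoconfparse import CiscoConfParse  # should now import without the AttributeError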

How to fix "calibration_pb2 from 'object_detection.protos' " error (Windows)

I've tried to run the command below, but it always gives a set of errors. I searched for answers, but none of them work for my code. There are two folders named 'object_detection', one in the research folder and the other in the object_detection-0.1-py3.7.egg folder, which might be causing the error; I tried to change the path, but the errors still persist.
I'm trying to execute this command:
C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
but the following errors come up:
Traceback (most recent call last):
  File "train.py", line 51, in <module>
    from object_detection.builders import model_builder
  File "C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\model_builder.py", line 27, in <module>
    from object_detection.builders import post_processing_builder
  File "C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\post_processing_builder.py", line 22, in <module>
    from object_detection.protos import post_processing_pb2
  File "C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\protos\post_processing_pb2.py", line 15, in <module>
    from object_detection.protos import calibration_pb2 as object__detection_dot_protos_dot_calibration__pb2
ImportError: cannot import name 'calibration_pb2' from 'object_detection.protos' (C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\protos\__init__.py)
I've tried using the
protoc object_detection/protos/*.proto --python_out=.
command but it brings up errors too.
Also, the environment was not made with conda; could that be the cause of the error? All the necessary packages are installed in the existing virtual environment, though.
Try this solution:
Check whether the file "calibration_pb2.py" is located in the following path, which in your case should be:
C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\protos\
If not, just copy it over from your working path:
C:\tensorflow1\models\research\object_detection\protos\
If that works, I suggest you copy all the other *_pb2.py files into the path mentioned above as well.
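A small sketch of that copy step (not part of the original answer; the two paths are the ones mentioned above, so adjust them to your own layout):
import glob
import os
import shutil

# Copy every generated *_pb2.py from the working tree into the installed egg.
src = r"C:\tensorflow1\models\research\object_detection\protos"
dst = r"C:\Users\Swayam\mypython\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\protos"

for path in glob.glob(os.path.join(src, "*_pb2.py")):
    shutil.copy(path, dst)
    print("copied", os.path.basename(path))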
You just need to compile the protos with this command:
protoc --python_out=. .\object_detection\protos\anchor_generator.proto .\object_detection\protos\argmax_matcher.proto .\object_detection\protos\bipartite_matcher.proto .\object_detection\protos\box_coder.proto .\object_detection\protos\box_predictor.proto .\object_detection\protos\eval.proto .\object_detection\protos\faster_rcnn.proto .\object_detection\protos\faster_rcnn_box_coder.proto .\object_detection\protos\grid_anchor_generator.proto .\object_detection\protos\hyperparams.proto .\object_detection\protos\image_resizer.proto .\object_detection\protos\input_reader.proto .\object_detection\protos\losses.proto .\object_detection\protos\matcher.proto .\object_detection\protos\mean_stddev_box_coder.proto .\object_detection\protos\model.proto .\object_detection\protos\optimizer.proto .\object_detection\protos\pipeline.proto .\object_detection\protos\post_processing.proto .\object_detection\protos\preprocessor.proto .\object_detection\protos\region_similarity_calculator.proto .\object_detection\protos\square_box_coder.proto .\object_detection\protos\ssd.proto .\object_detection\protos\ssd_anchor_generator.proto .\object_detection\protos\string_int_label_map.proto .\object_detection\protos\train.proto .\object_detection\protos\keypoint_box_coder.proto .\object_detection\protos\multiscale_anchor_generator.proto .\object_detection\protos\graph_rewriter.proto .\object_detection\protos\calibration.proto
It will resolve the issue.
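If you would rather not list every file by hand, a hedged alternative (assuming protoc is on your PATH and you run it from the research directory) is to expand the wildcard from Python, since the Windows shell does not expand *.proto for protoc:
import glob
import subprocess

# Run from C:\tensorflow1\models\research; the *.proto wildcard is expanded
# in Python because cmd.exe does not expand wildcards for protoc.
proto_files = glob.glob("object_detection/protos/*.proto")
subprocess.check_call(["protoc", "--python_out=."] + proto_files)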

Lxml import issues when using Scrapy

I am trying to use Scrapy with Anaconda/Miniconda on Windows 10. Installation goes fine, but trying to actually run Scrapy gives the following error:
Traceback (most recent call last):
  File "C:\ProgramData\Miniconda3\Scripts\scrapy-script.py", line 6, in <module>
    from scrapy.cmdline import execute
  File "C:\ProgramData\Miniconda3\lib\site-packages\scrapy\__init__.py", line 34, in <module>
    from scrapy.spiders import Spider
  File "C:\ProgramData\Miniconda3\lib\site-packages\scrapy\spiders\__init__.py", line 10, in <module>
    from scrapy.http import Request
  File "C:\ProgramData\Miniconda3\lib\site-packages\scrapy\http\__init__.py", line 11, in <module>
    from scrapy.http.request.form import FormRequest
  File "C:\ProgramData\Miniconda3\lib\site-packages\scrapy\http\request\form.py", line 11, in <module>
    import lxml.html
  File "C:\ProgramData\Miniconda3\lib\site-packages\lxml\html\__init__.py", line 53, in <module>
    from .. import etree
ImportError: DLL load failed: The specified module could not be found.
I have tried reinstalling Scrapy, lxml, and Anaconda itself (this time, I'm using a clean install of Miniconda), as well as downloading an unofficial lxml build from https://www.lfd.uci.edu/~gohlke/pythonlibs/, as suggested in one of the answers on Stack Overflow, but the problem persists. I have also done this on an Amazon AWS EC2 instance started from scratch, but I get the same issue.
It seems to be something relatively common, but I couldn't find an answer that would work for me. What's an appropriate way to address this? Is it just about lxml, or is there something else causing this problem?
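One way to narrow this down (a diagnostic sketch, not an answer from this thread) is to import lxml.etree on its own in the same environment; if that already fails with the same DLL error, the problem is the lxml build or its missing DLL dependencies rather than Scrapy:
# Run inside the same conda environment that Scrapy uses.
try:
    import lxml.etree
    print("lxml.etree imported fine:", lxml.etree.LXML_VERSION)
except ImportError as exc:
    # The same "DLL load failed" here means the issue is lxml itself,
    # typically a mismatched build or missing libxml2/libxslt DLLs.
    print("lxml is the problem:", exc)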

Unable to connect to endpoint when writing to S3 using Tensorflow

TensorFlow 1.4.0 comes with the S3 filesystem driver by default. I'm having trouble using it, and have this minimal example, which does not work for me:
import tensorflow as tf
f = tf.gfile.Open("s3://bucket/plipp", mode='w')
f.write("foo")
f.close()
which gives the following error:
Traceback (most recent call last):
  File "test2.py", line 5, in <module>
    f.close()
  File "/Users/me/venv3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 234, in close
    pywrap_tensorflow.Set_TF_Status_from_Status(status, ret_status)
  File "/Users/me/venv3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InternalError: : Unable to connect to endpoint
From what I can see, it seems like "Unable to connect to endpoint" is an error from the C++ AWS SDK. I've given myself * permissions to the bucket.
My bucket is in eu-west-1 and I've tried doing export S3_ENDPOINT=https://s3-eu-west-1.amazonaws.com and export S3_REGION=eu-west-1 since it seems that those variables are consumed by the S3 driver, but this changes nothing.
I've also tried using s3://bucket.s3-eu-west-1.amazonaws.com/plipp as the path, instead of just using the bucket name.
I can copy files to the bucket fine:
~> aws s3 cp foo s3://bucket/plipp
upload: ./foo to s3://bucket/plipp
Any ideas what I might be doing wrong? How can I debug further?
I'm not quite sure what went wrong last time I tried this, but now I got it working by just doing export S3_REGION=eu-west-1 and writing to the bucket with
with tf.gfile.Open("s3://bucket/plipp", mode='w') as f:
    f.write("foo")
So, don't export the S3_ENDPOINT variable.
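For completeness, a self-contained sketch of that working setup (the bucket name and region are the ones from the question, so substitute your own); the region can also be set from inside the script before TensorFlow touches S3:
import os

# Set the region before any S3 access, and do not set S3_ENDPOINT.
os.environ["S3_REGION"] = "eu-west-1"

import tensorflow as tf

with tf.gfile.Open("s3://bucket/plipp", mode='w') as f:
    f.write("foo")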