matplotlib fetch_historical_yahoo works when the start year is 2015 but not 2016?

I want to get just a single day (the last trading day) from Yahoo Finance using matplotlib, but I get an error on the code below. I really can't see why; can anyone help, please?
import datetime
import matplotlib.finance as finance
start_date = datetime.date(2016,8,1)
today = end_date = datetime.date.today()
ticker = 'GE'
fh = finance.fetch_historical_yahoo(ticker,start_date,end_date)
The error thrown is:
Traceback (most recent call last):
File "/home/dave/Desktop/development/1 10 16 mp single test.py", line
13, in <module>
fh = finance.fetch_historical_yahoo(ticker,start_date,end_date)
File "/usr/lib/python3/dist-packages/matplotlib/finance.py", line
431, in fetch_historical_yahoo
with contextlib.closing(urlopen(url)) as urlfh:
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 470, in open
response = meth(req, response)
File "/usr/lib/python3.4/urllib/request.py", line 580, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python3.4/urllib/request.py", line 508, in error
return self._call_chain(*args)
File "/usr/lib/python3.4/urllib/request.py", line 442, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 588, in
http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
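One way to narrow this down (a debugging sketch, not a fix; it assumes the same setup as the code above) is to catch the HTTPError and print the exact URL that fetch_historical_yahoo requested, so the 2015 and 2016 variants can be compared directly in a browser:
import datetime
from urllib.error import HTTPError
import matplotlib.finance as finance

start_date = datetime.date(2016, 8, 1)
end_date = datetime.date.today()
try:
    fh = finance.fetch_historical_yahoo('GE', start_date, end_date)
except HTTPError as e:
    # A 404 means Yahoo returned no table for this exact query URL.
    print(e.code, e.geturl())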

Related

Problem running Python script with Selenium: wrong parameter

Good morning. For a few days I have had this problem when I run my script in Python; until two days ago it was working, and now it doesn't work anymore, and I can't understand what the problem is.
I attach the error output and my imports.
Thanks
2022-07-14 15:49:04,691 INFO Your selenium-driver-updater library is up to date.
2022-07-14 15:49:05,756 INFO Latest version of edgedriver: 103.0.1264.51
2022-07-14 15:49:05,756 INFO Started download edgedriver latest_version: 103.0.1264.51
2022-07-14 15:49:05,756 INFO Started download edgedriver by url: https://msedgedriver.azureedge.net/103.0.1264.51/edgedriver_win64.zip
2022-07-14 15:49:06,882 ERROR error: Traceback (most recent call last):
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium_driver_updater\driverUpdater.py", line 133, in install
driver_path = DriverUpdater.__run_specific_driver()
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium_driver_updater\driverUpdater.py", line 339, in __run_specific_driver
driver_path = driver.main()
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium_driver_updater\_edgeDriver.py", line 54, in main
driver_path = self.__check_if_edgedriver_is_up_to_date()
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium_driver_updater\_edgeDriver.py", line 80, in __check_if_edgedriver_is_up_to_date
driver_path = self._download_driver()
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium_driver_updater\_edgeDriver.py", line 187, in _download_driver
archive_path = wget.download(url=url, out=out_path)
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\wget.py", line 526, in download
(tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 239, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 523, in open
response = meth(req, response)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 632, in http_response
response = self.parent.error(
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 561, in error
return self._call_chain(*args)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 494, in _call_chain
result = func(*args)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\urllib\request.py", line 641, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
c:\Users\diriy\OneDrive\Python\Monatseinteilung\Monatseint_Tool_v3.104.py:65: DeprecationWarning: executable_path has been deprecated, please pass in a Service object
driver = webdriver.Edge(filename, options=options)
Traceback (most recent call last):
File "c:\Users\diriy\OneDrive\Python\Monatseinteilung\Monatseint_Tool_v3.104.py", line 65, in <module>
driver = webdriver.Edge(filename, options=options)
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\edge\webdriver.py", line 61, in __init__
super().__init__(DesiredCapabilities.EDGE['browserName'], "ms",
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\chromium\webdriver.py", line 89, in __init__
self.service.start()
File "C:\Users\diriy\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\LocalCache\local-packages\Python39\site-packages\selenium\webdriver\common\service.py", line 71, in start
self.process = subprocess.Popen(cmd, env=self.env,
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
OSError: [WinError 87] Wrong parameter
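Two things stand out in the log. First, the edgedriver download itself returned 404, so the updater plausibly never produced a valid driver binary, and a path pointing at a missing or empty msedgedriver.exe is a likely source of the WinError 87 from CreateProcess. Second, the DeprecationWarning is independent of the crash but worth fixing; a sketch of the non-deprecated pattern (assuming Selenium 4; driver_path stands in for the 'filename' variable in the original script):
from selenium import webdriver
from selenium.webdriver.edge.service import Service
from selenium.webdriver.edge.options import Options

driver_path = r"C:\path\to\msedgedriver.exe"  # hypothetical path
options = Options()
# Pass the driver path via a Service object instead of the deprecated
# executable_path positional argument.
service = Service(executable_path=driver_path)
driver = webdriver.Edge(service=service, options=options)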

Why does a requests.exceptions.ReadTimeout error happen when executing (heavy?) code in Ganache using web3py?

I'm trying to generate a sorted list of random uint32 numbers. Generating the list is easily done:
for (uint24 i = 1; i < limit; i++) {
    seed = uint(keccak256(abi.encodePacked(seed)));
    sorted[i] = uint32(seed);
}
where limit is a uint24 indicating the number of samples, seed is an arbitrary uint, and sorted is a uint32[limit]. However, if I try to generate a sorted array like this:
for (uint24 i = 1; i < limit; i++) {
    seed = uint(keccak256(abi.encodePacked(seed)));
    sorted[i] = uint32(seed);
    uint24 j = i;
    while (sorted[j - 1] > sorted[j]) {
        (sorted[j - 1], sorted[j]) = (sorted[j], sorted[j - 1]);
        j--;
        if (j == 0) {
            break;
        }
    }
}
then this yields the expected result for small values of limit (like 10), but web3py fails with the following error for bigger inputs (like 300) when I try to call the function associated with the previous code:
Traceback (most recent call last):
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 445, in _make_request
six.raise_from(e, None)
File "<string>", line 3, in raise_from
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 440, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.9/http/client.py", line 1371, in getresponse
response.begin()
File "/usr/lib/python3.9/http/client.py", line 319, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.9/http/client.py", line 280, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.9/socket.py", line 704, in readinto
return self._sock.recv_into(b)
socket.timeout: timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/personal/Python/venv/lib/python3.9/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 755, in urlopen
retries = retries.increment(
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/util/retry.py", line 532, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/packages/six.py", line 770, in reraise
raise value
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 699, in urlopen
httplib_response = self._make_request(
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 447, in _make_request
self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
File "/home/personal/Python/venv/lib/python3.9/site-packages/urllib3/connectionpool.py", line 336, in _raise_timeout
raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPConnectionPool(host='127.0.0.1', port=8545): Read timed out. (read timeout=10)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/contract.py", line 957, in call
return call_contract_function(
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/contract.py", line 1501, in call_contract_function
return_data = web3.eth.call(
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/module.py", line 57, in caller
result = w3.manager.request_blocking(method_str,
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/manager.py", line 186, in request_blocking
response = self._make_request(method, params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/manager.py", line 147, in _make_request
return request_func(method, params)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/formatting.py", line 76, in apply_formatters
response = make_request(method, params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/gas_price_strategy.py", line 90, in middleware
return make_request(method, params)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/formatting.py", line 74, in apply_formatters
response = make_request(method, formatted_params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/attrdict.py", line 33, in middleware
response = make_request(method, params)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/formatting.py", line 74, in apply_formatters
response = make_request(method, formatted_params)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/formatting.py", line 74, in apply_formatters
response = make_request(method, formatted_params)
File "cytoolz/functoolz.pyx", line 250, in cytoolz.functoolz.curry.__call__
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/formatting.py", line 74, in apply_formatters
response = make_request(method, formatted_params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/buffered_gas_estimate.py", line 40, in middleware
return make_request(method, params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/middleware/exception_retry_request.py", line 105, in middleware
return make_request(method, params)
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/providers/rpc.py", line 88, in make_request
raw_response = make_post_request(
File "/home/personal/Python/venv/lib/python3.9/site-packages/web3/_utils/request.py", line 48, in make_post_request
response = session.post(endpoint_uri, data=data, *args, **kwargs) # type: ignore
File "/home/personal/Python/venv/lib/python3.9/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/home/personal/Python/venv/lib/python3.9/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/home/personal/Python/venv/lib/python3.9/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/home/personal/Python/venv/lib/python3.9/site-packages/requests/adapters.py", line 529, in send
raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPConnectionPool(host='127.0.0.1', port=8545): Read timed out. (read timeout=10)
I guess that Ganache is quite busy and that's why it can't answer fast enough for web3py to be satisfied, but the added code doesn't seem so heavy that it cannot be dealt with. Or am I missing something else that makes this code too heavy for Ganache?
You can set a timeout in your web3py HTTP provider:
Web3(Web3.HTTPProvider(endpoint_uri='http://127.0.0.1:8545', request_kwargs={'timeout': 600}))
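For completeness, a minimal self-contained sketch of the same idea (assuming web3.py v5 and a local Ganache node on the default port; 600 seconds is an arbitrary generous value):
from web3 import Web3

# Raise the HTTP read timeout from the 10 s default so a heavy eth_call has
# time to finish before requests gives up.
w3 = Web3(Web3.HTTPProvider(
    endpoint_uri='http://127.0.0.1:8545',
    request_kwargs={'timeout': 600},
))
print(w3.isConnected())  # sanity check before calling the contract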

Delete bulk rows in a BigQuery table based on a list of unique ids

So I tried to delete some rows in a BigQuery table with a simple query like this:
client = bigquery.Client()
query = "DELETE FROM Sample_Dataset.Sample_Table WHERE Sample_Table_Id IN {}".format(primary_key_list)
query_job = client.query(query, location="us-west1")
where primary_key_list is a Python list containing Sample_Table unique ids, like [123, 124, 125, ...].
Things work fine with the small amounts of data I retrieve, but when primary_key_list grows, it gives me an error:
The query is too large. The maximum standard SQL query length is
1024.00K characters, including comments and white space characters.
I realize the query will be long enough to reach the max query length, and after searching on Stack Overflow I found a solution using parameterized queries, so I changed my code to this:
client = bigquery.Client()
query = "DELETE FROM Sample_Dataset.Sample_Table WHERE Sample_Table_Id IN UNNEST(@List_Sample_Table_Id)"
job_config = bigquery.QueryJobConfig(
    query_parameters=[
        bigquery.ArrayQueryParameter("List_Sample_Table_Id", "INT64", primary_key_list),
    ]
)
query_job = client.query(query, job_config=job_config)
It stops giving me the maximum standard SQL query length error, but returns another exception. Any ideas on how to delete bulk rows from BigQuery?
I don't know if this information is useful or not, but I'm running this Python code on Google Cloud Dataproc, and this is just a function I recently added; before I added it, everything was working. Here's the log I got from running the delete with parameterized queries:
Traceback (most recent call last):
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 320, in _send_until_done
return self.connection.send(data)
File "/opt/conda/default/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1737, in send
self._raise_ssl_error(self._ssl, result)
File "/opt/conda/default/lib/python3.7/site-packages/OpenSSL/SSL.py", line 1639, in _raise_ssl_error
raise SysCallError(errno, errorcode.get(errno))
OpenSSL.SSL.SysCallError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1290, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1065, in _send_output
self.send(chunk)
File "/opt/conda/default/lib/python3.7/http/client.py", line 987, in send
self.sock.sendall(data)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 331, in sendall
sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 326, in _send_until_done
raise SocketError(str(e))
OSError: (32, 'EPIPE')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/conda/default/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
_stacktrace=sys.exc_info()[2])
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/connectionpool.py", line 354, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1290, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/opt/conda/default/lib/python3.7/http/client.py", line 1065, in _send_output
self.send(chunk)
File "/opt/conda/default/lib/python3.7/http/client.py", line 987, in send
self.sock.sendall(data)
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 331, in sendall
sent = self._send_until_done(data[total_sent:total_sent + SSL_WRITE_BLOCKSIZE])
File "/opt/conda/default/lib/python3.7/site-packages/urllib3/contrib/pyopenssl.py", line 326, in _send_until_done
raise SocketError(str(e))
urllib3.exceptions.ProtocolError: ('Connection aborted.', OSError("(32, 'EPIPE')"))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/tmp/a05a15822be04a9abdc1ba05e317bb2f/ItemHistory-get.py", line 92, in <module>
delete_duplicate_pk()
File "/tmp/a05a15822be04a9abdc1ba05e317bb2f/ItemHistory-get.py", line 84, in delete_duplicate_pk
query_job = client.query(query, job_config=job_config, location="asia-southeast2")
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/bigquery/client.py", line 2893, in query
query_job._begin(retry=retry, timeout=timeout)
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/bigquery/job/query.py", line 1069, in _begin
super(QueryJob, self)._begin(client=client, retry=retry, timeout=timeout)
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/bigquery/job/base.py", line 438, in _begin
timeout=timeout,
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/bigquery/client.py", line 643, in _call_api
return call()
File "/opt/conda/default/lib/python3.7/site-packages/google/api_core/retry.py", line 286, in retry_wrapped_func
on_error=on_error,
File "/opt/conda/default/lib/python3.7/site-packages/google/api_core/retry.py", line 184, in retry_target
return target()
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/_http.py", line 434, in api_request
timeout=timeout,
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/_http.py", line 292, in _make_request
method, url, headers, data, target_object, timeout=timeout
File "/opt/conda/default/lib/python3.7/site-packages/google/cloud/_http.py", line 330, in _do_request
url=url, method=method, headers=headers, data=data, timeout=timeout
File "/opt/conda/default/lib/python3.7/site-packages/google/auth/transport/requests.py", line 470, in request
**kwargs
File "/opt/conda/default/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/opt/conda/default/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/opt/conda/default/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', OSError("(32, 'EPIPE')"))
As @blackbishop mentioned, you can try upgrading requests to the latest version (in my case it solved the problem), but since I was trying to update bulk rows (say 500,000+ rows in BigQuery, where every row has a unique id), it turned out to give me a timeout with the small machine-type Dataproc clusters I used (if someone has the resources to try a better Dataproc cluster and succeeds, please feel free to edit this answer).
So I went with the MERGE statement, as documented here:
https://cloud.google.com/bigquery/docs/reference/standard-sql/dml-syntax#merge_statement
The script to update existing rows would look like this (assuming I have the new data I retrieved already loaded into Staging_Sample_Dataset.Staging_Sample_Table):
def merge_data():
    client = bigquery.Client()
    query = """MERGE INTO Sample_Dataset.Sample_Table st
    USING Staging_Sample_Dataset.Staging_Sample_Table ss
    ON st.Sample_Table_Id = ss.Sample_Table_Id
    WHEN MATCHED THEN UPDATE SET st.Your_Column = ss.Your_Column -- and the list goes on...
    WHEN NOT MATCHED THEN INSERT ROW
    """
    query_job = client.query(query, location="asia-southeast2")
    results = query_job.result()
Or I could delete the bulk rows and call another function to load the new data after this function executes:
def bulk_delete():
    client = bigquery.Client()
    query = """MERGE INTO Sample_Dataset.Sample_Table st
    USING Staging_Sample_Dataset.Staging_Sample_Table sst
    ON st.Sample_Table_Id = sst.Sample_Table_Id
    WHEN MATCHED THEN DELETE
    """
    query_job = client.query(query, location="asia-southeast2")
    results = query_job.result()
The concept is to loop over the array of ids you have and run the delete for each one of them.
This is just the structure, not the actual code, so it might need more than just tweaking:
client = bigquery.Client()
for pk in primary_key_list:
    query = "DELETE FROM Sample_Dataset.Sample_Table WHERE Sample_Table_Id = {}".format(pk)
    query_job = client.query(query, location="us-west1")
    query_job.result()
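A middle ground between one giant IN list and one DELETE per id (my own sketch, not from the answers above, reusing the parameterized-query approach from the question) is to delete in fixed-size chunks, so each request body stays far below the query-size limit:
from google.cloud import bigquery

client = bigquery.Client()
query = ("DELETE FROM Sample_Dataset.Sample_Table "
         "WHERE Sample_Table_Id IN UNNEST(@ids)")
chunk_size = 10000  # arbitrary; tune to your data
for i in range(0, len(primary_key_list), chunk_size):
    chunk = primary_key_list[i:i + chunk_size]
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ArrayQueryParameter("ids", "INT64", chunk)]
    )
    # result() blocks until this chunk's DELETE finishes before the next starts
    client.query(query, job_config=job_config, location="us-west1").result()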

python 3.8 urllib file:// issue

I want to open a local file with urllib.request.urlopen with the following code:
urllib.request.urlopen('file:///home/parham/.bashrc')
But it generates the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 525, in open
response = self._open(req, data)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, 'default',
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 502, in _call_chain
result = func(*args)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 806, in <lambda>
meth(r, proxy, type))
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 834, in proxy_open
return self.parent.open(req, timeout=req.timeout)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 531, in open
response = meth(req, response)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 640, in http_response
response = self.parent.error(
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 569, in error
return self._call_chain(*args)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 502, in _call_chain
result = func(*args)
File "/home/linuxbrew/.linuxbrew/Cellar/python#3.8/3.8.5/lib/python3.8/urllib/request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
Am I doing it right, or is the issue related to urllib?
This issue is related to the shell-proxy plugin of oh-my-zsh. After disabling it, everything works as before.
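If disabling the plugin isn't an option, a sketch that should sidestep the proxy settings entirely (a file:// URL never needs a proxy) is to build an opener with an empty ProxyHandler, which makes urllib ignore the proxy environment variables the plugin exports:
import urllib.request

# An empty dict means "no proxies", overriding http_proxy/https_proxy/all_proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open('file:///home/parham/.bashrc') as fh:
    print(fh.read().decode())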

<urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>

All I'm trying to do is first get the list of links from all anchor tags, and then scrape the text of the tables from each of those scraped links. But all I'm getting is an error. Can anybody please help me?
My code:
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
import csv
from datetime import datetime

pages = []
my_url = 'https://deltaimmigration.com.au/Australia-jobs/'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
col = page_soup.find('td', width="250")
# rows = table.find_all('tr')
# print(len(col))
# print(col.text)
filename = "links.txt"
f = open(filename, 'w')
headers = "Link of all jobs "
f.write(headers)
for a in col.find_all('a', href=True):
    link = a['href']
    f_link = link.replace("..", "https://deltaimmigration.com.au")
    pages.append(f_link)
for items in pages:
    tables = items.find('table', width="900")
    table = tables[0]
    for table in tables:
        rows_tables = table.tbody.find_all('tr')
        rows = rows_tables[0].text
        for row in rows:
            print(row.text)
f.close()
Can somebody please help me solve this? Can we not scrape data from the scraped links on the same page?
Error:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e:/scrape/link.py", line 8, in <module>
uClient = uReq(my_url)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>
user@Bidhya-1s MINGW64 /e/scrape
$ C:/Users/user/Anaconda3/python.exe e:/scrape/link.py
Traceback (most recent call last):
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 1317, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "C:\Users\user\Anaconda3\lib\http\client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "C:\Users\user\Anaconda3\lib\http\client.py", line 1290, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "C:\Users\user\Anaconda3\lib\http\client.py", line 1239, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "C:\Users\user\Anaconda3\lib\http\client.py", line 1026, in _send_output
self.send(msg)
File "C:\Users\user\Anaconda3\lib\http\client.py", line 966, in send
self.connect()
File "C:\Users\user\Anaconda3\lib\http\client.py", line 1414, in connect
server_hostname=server_hostname)
File "C:\Users\user\Anaconda3\lib\ssl.py", line 423, in wrap_socket
session=session
File "C:\Users\user\Anaconda3\lib\ssl.py", line 870, in _create
self.do_handshake()
File "C:\Users\user\Anaconda3\lib\ssl.py", line 1139, in do_handshake
self._sslobj.do_handshake()
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "e:/scrape/link.py", line 8, in <module>
uClient = uReq(my_url)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 525, in open
response = self._open(req, data)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 543, in _open
'_open', req)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 1360, in https_open
context=self._context, check_hostname=self._check_hostname)
File "C:\Users\user\Anaconda3\lib\urllib\request.py", line 1319, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10054] An existing connection was forcibly closed by the remote host>
Basically your code contains many faults and is hard to read.
I see that you are using find instead of findAll, which means you are only picking the first url, and then I don't see any request made to that url in order to parse it. Therefore I couldn't fully understand your desired target, as you haven't included much information. Anyway, from my point of view, I think my code below will help you.
import requests
from bs4 import BeautifulSoup

main = "https://deltaimmigration.com.au/Australia-jobs/"

def First():
    r = requests.get(main)
    soup = BeautifulSoup(r.text, 'html5lib')
    links = []
    with open("links.txt", 'w', newline="", encoding="UTF-8") as f:
        for item in soup.findAll("td", {'width': '250'}):
            item = item.contents[1].get("href")[3:]
            item = f"https://deltaimmigration.com.au/{item}"
            f.write(item + "\n")
            links.append(item)
    print(f"We Have Collected {len(links)} urls")
    return links

def Second():
    links = First()
    with requests.Session() as req:
        for link in links:
            print(f"Extracting {link}")
            r = req.get(link)
            soup = BeautifulSoup(r.text, 'html5lib')
            for item in soup.findAll("table", {'width': '900'}):
                print("*" * 40)
                print(item.text)
                print("*" * 40)

Second()
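As for why requests succeeds where urlopen was reset (a hedged guess, not something verified in the answer above): some servers drop connections based on the client's default headers. If you'd rather stay with urllib, sending a browser-like User-Agent (the value below is an arbitrary example) is the usual first thing to try:
from urllib.request import Request, urlopen

req = Request(
    'https://deltaimmigration.com.au/Australia-jobs/',
    headers={'User-Agent': 'Mozilla/5.0'},  # arbitrary browser-like UA
)
page_html = urlopen(req).read()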