Getting Error with Pytube: signature = cipher.get_signature(js, stream['s']) KeyError: 's' - urllib2

I have been this error when running my Pytube script:
signature = cipher.get_signature(js, stream['s'])
KeyError: 's'
My code is like this:
url = 'https://www.youtube.com/watch?v='
train_List = []
i = 0
while i < len(my_list):
if len(my_list[i]) > 6:
urls = url + my_list[i]
train_List.append(urls)
yt=YouTube(train_List[i])
t=yt.streams.filter(only_audio=True).all()
t[0].download('/pathtofolder')
i+=1
I also tried:
t=yt.streams.filter(file_extension='mp4').all()
I altered the cipher.py and helper.py files per the recommendation here: https://github.com/nficano/pytube/issues/353#issuecomment-455116197
but it hasn't fixed the problem. After making the alterations, I got the error noted above.
Next, I ran "pip install pytube --upgrade" per some other recommendations. Still getting the KeyError after it downloads a few audio files.
I've also implemented this in mixins.py, per the github issues:
if ('signature=' in url) or ('&sig=' in url) or ('&lsig=' in url):
but now it's hanging after 3 uploads.
Does anyone have a fix for this?

I fixed this issue (for me, at least) by changing line 49 in mixins.py to this:
signature = cipher.get_signature(js, stream['url'])
instead of
signature = cipher.get_signature(js, stream['s'])
and then changing lines 55-63 to
logger.debug(
'finished descrambling signature for itag=%s\n%s',
stream['itag'], pprint.pformat(
{
'url': stream['url'],
'signature': signature,
}, indent=2,
),
)

Have the same problem. If I try a second time it works most of the time...
url = 'https://www.youtube.com/watch?v=gQrkvZeE3Uc'
yt = YouTube(url)
yt.streams.filter(progressive=True, subtype='mp4').order_by('resolution').desc().last().download()
Thanks for any help

You have to fix the cipher.py file in lib/python3.6/site-packages/pytube:
In cipher.py, change pattern starting at line 38 to this:
pattern = [
r'(["\'])signature\1\s*,\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\.sig\|\|(?P<sig>[a-zA-Z0-9$]+)\(',
r'yt\.akamaized\.net/\)\s*\|\|\s*.*?\s*c\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?(?P<sig>[a-zA-Z0-9$]+)\(',
r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*(?:encodeURIComponent\s*\()?\s*(?P<sig>[a-zA-Z0-9$]+)\(',
r'\bc\s*&&\s*d\.set\([^,]+\s*,\s*\([^)]*\)\s*\(\s*(?P<sig>[a-zA-Z0-9$]+)\(',
]

One of the ways to solve it: pip install git+https://github.com/nficano/pytube.git
Other than that, check this thread

I did pip uninstall pytube.
Checked my python version with python --version, turned out to be the Anaconda 3.6.8 version.
So I did pip3 install pytube, pip3 being for Python 3 I believe.
It works no problems now.

had same issue, based on other answers that installed it from sources, just checked my current pytube version, which was 9.5.0 and see that the last released is 9.5.2, so I just run pip install --upgrade pytube and worked perfectly

Related

Writing apache beam pCollection to bigquery causes type Error

I have a simple beam pipeline, as follows:
with beam.Pipeline() as pipeline:
output = (
pipeline
| 'Read CSV' >> beam.io.ReadFromText('raw_files/myfile.csv',
skip_header_lines=True)
| 'Split strings' >> beam.Map(lambda x: x.split(','))
| 'Convert records to dictionary' >> beam.Map(to_json)
| beam.io.WriteToBigQuery(project='gcp_project_id',
dataset='datasetID',
table='tableID',
create_disposition=bigquery.CreateDisposition.CREATE_NEVER,
write_disposition=bigquery.WriteDisposition.WRITE_APPEND
)
)
However upon runnning I get a typeError, stating the following:
line 2147, in __init__
self.table_reference = bigquery_tools.parse_table_reference(if isinstance(table,
TableReference):
TypeError: isinstance() arg 2 must be a type or tuple of types
I have tried defining a TableReference object and passing it to the WriteToBigQuery class but still facing the same issue. Am I missing something here? I've been stuck at this step for what feels like forever and I don't know what to do. Any help is appreciated!
This probably occurred since you installed Apache Beam without the GCP modules. Please make sure to do following (in a virtual environment).
pip install apache-beam[gcp]
It's a weird error though so feel free to file a Github issue against the Apache Beam project.

Airflow Xcom with SSHOperator

Im trying to get param from SSHOperator into Xcom and get it in python.
def decision_function(**context):
ti = context['ti']
output_ssh= ti.xcom_pull(task_ids='ssh_task')
print('ssh output is: {}'.format(output_ssh))
ls = SSHOperator(
task_id="ssh_task",
command= "ls",
ssh_hook = sshHook,
dag = mydag)
get_xcom = PythonOperator(
task_id='test',
python_callable=decision_function,
provide_context=True,
dag=mydag
ls >> get_xcom
I get the wrong result in the xcom_pull:
ssh output is: MTcySyBhaXJmbG93CTIuMUcgQXV0b0hpdHVtX05ld
Every thing is work fine with BashOperator but when I try to use the SSH, it not working currect.
I also try to change in the config the enable_xcom_pickling param, but still not working.
airflow: 2.1.4
linux
thank you.
The value of Xcom may be b64encoded depending on the value of enable_xcom_pickling as you can see in the source code
The issue you are facing has been discussed in PR which suggested to change functionality to be like other operators but the PR was not merged and you can see the reasons for it in the code review comments.
if you wish you can create custom version of the operator without the enable_xcom_pickling condition:
class MySSHOperator(SSHOperator):
def execute(self, context=None) -> Union[bytes, str]:
result: Union[bytes, str]
if self.command is None:
raise AirflowException("SSH operator error: SSH command not specified. Aborting.")
# Forcing get_pty to True if the command begins with "sudo".
self.get_pty = self.command.startswith('sudo') or self.get_pty
try:
with self.get_ssh_client() as ssh_client:
result = self.run_ssh_client_command(ssh_client, self.command)
except Exception as e:
raise AirflowException(f"SSH operator error: {str(e)}")
return result.decode('utf-8')

pipenv - Pipfile.lock is not being generated due to the 'Could not find a version that matches keras-nightly~=2.5.0.dev' error

As the title clearly describes the issue I've been experiencing, no Pipfile.lock is being generated as I get the following error when I execute the recommended command pipenv lock --clear:
ERROR: ERROR: Could not find a version that matches keras-nightly~=2.5.0.dev
Skipped pre-versions: 2.5.0.dev2021020510, 2.5.0.dev2021020600, 2.5.0.dev2021020700, 2.5.0.dev2021020800, 2.5.0.dev2021020900, 2.5.0.dev2021021000, 2.5.0.dev2021021100, 2.5.0.dev2021021200, 2.5.0.dev2021021300, 2.5.0.dev2021021400, 2.5.0.dev2021021500, 2.5.0.dev2021021600, 2.5.0.dev2021021700, 2.5.0.dev2021021800, 2.5.0.dev2021021900, 2.5.0.dev2021022000, 2.5.0.dev2021022100, 2.5.0.dev2021022200, 2.5.0.dev2021022300, 2.5.0.dev2021022317, 2.5.0.dev2021022400, 2.5.0.dev2021022411, 2.5.0.dev2021022500, 2.5.0.dev2021022600, 2.5.0.dev2021022700, 2.5.0.dev2021022800, 2.5.0.dev2021030100, 2.5.0.dev2021030200, 2.5.0.dev2021030300, 2.5.0.dev2021030400, 2.5.0.dev2021030500, 2.5.0.dev2021030600, 2.5.0.dev2021030700, 2.5.0.dev2021030800, 2.5.0.dev2021030900, 2.5.0.dev2021031000, 2.5.0.dev2021031100, 2.5.0.dev2021031200, 2.5.0.dev2021031300, 2.5.0.dev2021031400, 2.5.0.dev2021031500, 2.5.0.dev2021031600, 2.5.0.dev2021031700, 2.5.0.dev2021031800, 2.5.0.dev2021032213, 2.5.0.dev2021032300, 2.5.0.dev2021032413, 2.5.0.dev2021032500, 2.5.0.dev2021032600, 2.5.0.dev2021032610, 2.5.0.dev2021032700, 2.5.0.dev2021032800, 2.5.0.dev2021032900, 2.6.0.dev2021033000, 2.6.0.dev2021033100, 2.6.0.dev2021040100, 2.6.0.dev2021040200, 2.6.0.dev2021040300, 2.6.0.dev2021040400, 2.6.0.dev2021040500, 2.6.0.dev2021040600, 2.6.0.dev2021040714, 2.6.0.dev2021040800, 2.6.0.dev2021040900, 2.6.0.dev2021041000, 2.6.0.dev2021041100, 2.6.0.dev2021041200, 2.6.0.dev2021041300, 2.6.0.dev2021041400, 2.6.0.dev2021041500, 2.6.0.dev2021041600, 2.6.0.dev2021041700, 2.6.0.dev2021041800, 2.6.0.dev2021041900, 2.6.0.dev2021042000, 2.6.0.dev2021042100, 2.6.0.dev2021042200, 2.6.0.dev2021042300, 2.6.0.dev2021042500, 2.6.0.dev2021042600, 2.6.0.dev2021042700, 2.6.0.dev2021042800, 2.6.0.dev2021042900, 2.6.0.dev2021043000, 2.6.0.dev2021050100, 2.6.0.dev2021050200, 2.6.0.dev2021050300, 2.6.0.dev2021050400, 2.6.0.dev2021050500, 2.6.0.dev2021050600, 2.6.0.dev2021051200, 2.6.0.dev2021051300, 2.6.0.dev2021051400, 2.6.0.dev2021051500, 2.6.0.dev2021051600, 2.6.0.dev2021051700, 2.6.0.dev2021051800, 2.6.0.dev2021051900, 2.6.0.dev2021052000, 2.6.0.dev2021052100, 2.6.0.dev2021052200, 2.6.0.dev2021052300, 2.6.0.dev2021052400, 2.6.0.dev2021052500, 2.6.0.dev2021052600, 2.6.0.dev2021052700
There are incompatible versions in the resolved dependencies.
So, how can I overcome this situation? I'm basically developing a deep neural network using Keras. I simply installed the following dependencies without explicitly declaring versions:
tensorflow = "*"
nltk = "*"
pandas = "*"
tweepy = "*"
textblob = "*"
seaborn = "*"
matplotlib = "*"
wordcloud = "*"
stop-words = "*"
vadersentiment = "*"
scikit-learn = "*"
keras = "*"
By looking at the pypi site for keras-nightly library, I could see that there are no versions named 2.5.0.dev. Check which package is generating the error and try downgrading that package.
Pipenv wants you to use the --pre flag in this situation. I think the simplest way to install it would be like this:
pipenv install tensorflow --python=3.8 --pre
I had the same problem. Apparently, tensorflow arises a sub-dependency resolution error with Pipfile when resolving keras. It searches for a version that matches keras-nightly~=2.5.0.dev. Looking at pypi, this version does not exist.
Whatsoever, installing a version 2.4.X works, e. g. pipenv install tensorflow~=2.4.1.
By Changing the python_version in pipenv file resolved my issue.
I had it 3.9 before.
Pipfile:
python_version = "3.8"
It seems that tensorflow does not work on python 3.9 as it has a dependency on the nightly.
If you are using python 3.9, your option is to go to tensorflow 2.6.0rc1,
adding those line to Pipfile fixed my issue.
[pipenv]
allow_prereleases = true

How do I set a backend for django-celery. I set CELERY_RESULT_BACKEND, but it is not recognized

I set CELERY_RESULT_BACKEND = "amqp" in celeryconfig.py
but I get:
>>> from tasks import add
>>> result = add.delay(3,5)
>>> result.ready()
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/result.py", line 105, in ready
return self.state in self.backend.READY_STATES
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/result.py", line 184, in state
return self.backend.get_status(self.task_id)
File "/djangoprojects/venv/local/lib/python2.7/site-packages/celery/backends/base.py", line 414, in _is_disabled
raise NotImplementedError("No result backend configured. "
NotImplementedError: No result backend configured. Please see the documentation for more information.
I just went through this so I can shed some light on this. One might think for all of the great documentation stating some of this would have been a bit more obvious.
I'll assume you have both RabbitMQ up and functioning (it needs to be running), and that you have dj-celery installed.
Once you have that then all you need to do is to include this single line in your setting.py file.
BROKER_URL = "amqp://guest:guest#localhost:5672//"
Then you need to run syncdb and start this thing up using:
python manage.py celeryd -E -B --loglevel=info
The -E states that you want events captured and the -B states you want celerybeats running. The former enable you to actually see something in the admin window and the later allows you to schedule. Finally you need to ensure that you are actually going to capture the events and the status. So in another terminal run this:
./manage.py celerycam
And then finally your able to see the working example provided in the docs.. -- Again assuming you created the tasks.py that is says to.
>>> result = add.delay(4, 4)
>>> result.ready() # returns True if the task has finished processing.
False
>>> result.result # task is not ready, so no return value yet.
None
>>> result.get() # Waits until the task is done and returns the retval.
8
>>> result.result # direct access to result, doesn't re-raise errors.
8
>>> result.successful() # returns True if the task didn't end in failure.
True
Furthermore then you are able to view your status in the admin panel.
I hope this helps!! I would add one more thing which helped me. Watching the RabbitMQ Log file was key as it helped me identify that django-celery was actually talking to RabbitMQ.
Are you running django celery?
If so, you need to start a python shell in the context of django (or whatever the technical term is).
Type:
python manage.py shell
And try your commands from that shell
HI tried everything to work celery v3.1.25 with Django 1.8 version nothing worked..
Finally below line helped me ,feeling happy
app = Celery('documents',backend="celery.backends.amqp:AMQPBackend")
Setting backend="celery.backends.amqp:AMQPBackend" fixed my error.

Python file location conventions and Import errors

I'm new to Python and I simply don't know how to handle this specific problem:
I'm trying to run an executable (named ros_M5e.py) that's located in the directory /opt/ros/diamondback/stacks/hrl/hrl_rfid/src/hrl_rfid/ros_M5e.py (annoyingly long filepath, I know, but necessary). However, within the ros_M5e.py file there is a call to another file that is further up the file path: from hrl.hrl_rfid.msg import RFIDread. The directory msg indeed is located at /opt/ros/diamondback/stacks/hrl/hrl_rfid/ and it does indeed contain the file RFIDread. However, whenever I try to execute ros_M5e.py I get this error:
Traceback (most recent call last):
File "/opt/ros/diamondback/stacks/hrl/hrl_rfid/src/hrl_rfid/ros_M5e.py", line 37, in <module>
from hrl.hrl_rfid.msg import RFIDread
ImportError: No module named hrl.hrl_rfid.msg
Would someone with some expertise please shine some light on this problem for me? It seems like just a rudimentary file location problem, but I just don't know the appropriate Python conventions to fix it. I've tried putting the ros_M5e.py file in the same directory as the files it calls and changing the filepaths but to no avail.
Thanks a lot,
Khiya
Sure, I can help you get it up and running.
From the StackOverflow posting, it would seem that you're checking out the stack to /opt/ros/diamondback. This is no good, as it is a system path. You need to install into your local path. The reason for "readonly" on the repository is that you do not have permissions to make changes to the code -- it will still work just fine for you on your local machine. I spent a fair amount of time showing how to use this package (at least the python version) here:
http://www.ros.org/wiki/hrl_rfid
I'll try to do a quick run-through for installing it.... Run the following commands:
cd
mkdir sandbox
cd sandbox/
svn checkout http://gt-ros-pkg.googlecode.com/svn/trunk/hrl/hrl_rfid hrl_rfid (double-check that this checkout works OK!)
Add the following line to the bottom of your bashrc to tell ROS where to find the new package. (You may use "gedit ~/.bashrc")
export ROS_PACKAGE_PATH=$ROS_PACKAGE_PATH:$HOME/sandbox/hrl_rfid
Now execute the following:
roscd hrl_rfid (did you end up in the correct directory?)
rosmake hrl_rfid (did it make without errors?)
roscd hrl_rfid/src/hrl_rfid
At this point everything is actually installed correctly. By default, ros_M5e.py assumes that the reader is located at "/dev/robot/RFIDreader". Unless you've already altered udev rules, this will not be the case on your machine. I suggest running through the code:
http://www.ros.org/wiki/hrl_rfid
using iPython (a command-line python prompt that will let you execute python commands one at a time) to make sure everything is working (replace /dev/ttyUSB0 with whatever device your RFID reader is connected as):
import lib_M5e as M5e
r = M5e.M5e( '/dev/ttyUSB0', readPwr = 3000 )
r.ChangeAntennaPorts( 1, 1 )
r.QueryEnvironment()
r.TrackSingleTag( 'test_tag_id_' )
r.ChangeTagID( 'test_tag_id_' )
r.QueryEnvironment()
r.TrackSingleTag( 'test_tag_id_' )
r.ChangeAntennaPorts( 2, 2 )
r.QueryEnvironment()
This means that the underlying library is working just fine. Next, test ROS (make sure "roscore" is running!), by putting this in a python file and executing:
import lib_M5e as M5e
def P1(r):
r.ChangeAntennaPorts(1,1)
return 'AntPort1'
def P2(r):
r.ChangeAntennaPorts(2,2)
return 'AntPort2'
def PrintDatum(data):
ant, ids, rssi = data
print data
r = M5e.M5e( '/dev/ttyUSB0', readPwr = 3000 )
q = M5e.M5e_Poller(r, antfuncs=[P1, P2], callbacks=[PrintDatum])
q.query_mode()
t0 = time.time()
while time.time() - t0 < 3.0:
time.sleep( 0.1 )
q.track_mode( 'test_tag_id_' )
t0 = time.time()
while time.time() - t0 < 3.0:
time.sleep( 0.1 )
q.stop()
OK, everything works now. You can make your own node that is tuned to your setup:
#!/usr/bin/python
import ros_M5e as rm
def P1(r):
r.ChangeAntennaPorts(1,1)
return 'AntPort1'
def P2(r):
r.ChangeAntennaPorts(2,2)
return 'AntPort2'
ros_rfid = rm.ROS_M5e( name = 'my_rfid_server',
readPwr = 3000,
portStr = '/dev/ttyUSB0',
antFuncs = [P1, P2],
callbacks = [] )
rospy.spin()
ros_rfid.stop()
Or, ping me back and I can tweak ros_M5e.py to take an optional "portStr" -- though I recommend making your own so that you can name your antennas sensibly. Also, I highly recommend setting udev rules to ensure that the RFID reader always gets assigned to the same device: http://www.hsi.gatech.edu/hrl-wiki/index.php/Linux_Tools#udev_for_persistent_device_naming
BUS=="usb", KERNEL=="ttyUSB*", SYSFS{idVendor}=="0403", SYSFS{idProduct}=="6001", SYSFS{serial}=="ftDXR6FS", SYMLINK+="robot/RFIDreader"
If you do not do this... there is no guarantee that the reader will always be enumerated at /dev/ttyUSBx.
Let me know if you have any further problems.
~Travis Deyle (Hizook.com)
PS -- Did you modify ros_M5e.py to "from hrl.hrl_rfid.msg import RFIDread"? In the repo, it is "from hrl_rfid.msg import RFIDread". The latter is correct. As long as you have your ROS_PACKAGE_PATH correctly defined, and you've run rosmake on the package, then the import statement should work just fine. Also, I would not recommend posting ROS-related questions to StackOverflow. Very few people on here are going to be familiar with the ROS ecosystem (which is VERY complex). Please post questions here instead:
http://answers.ros.org/
http://code.google.com/p/gt-ros-pkg/issues/list
You need to make sure that following are true:
Directory /opt/ros/diamondback/stacks/ is in your python path.
/opt/ros/diamondback/stacks/hr1 contains __init__.py
/opt/ros/diamondback/stacks/hr1/hr1_rfid contians __init__.py
/opt/ros/diamondback/stacks/hr1/hr1_rfid/msg contians __init__.py
As the asker explained in comments that the RFIDRead does not have .py extension, so here is how that can be imported.
import imp
imp.load_source('RFIDRead', '/opt/ros/diamondback/stacks/hr1/hr1_rfid/msg/RFIDRead.msg')
Check out imp documentation for more information.