Simple Python pytube code not working anymore - python

This is just simple Python code that worked when I first ran it, but didn't work the next day, even though I didn't change anything:
from pytube import YouTube
link = "https://www.youtube.com/watch?v=6_ardA6TuX0"
video = YouTube(link)
yt = video.streams.get_highest_resolution()
yt.download("Lieder")
I get this error:
Traceback (most recent call last):
File "C:\Users\edonj\OneDrive\Desktop\Spot\main.py", line 9, in <module>
yt.download("Lieder")
File "C:\Users\edonj\OneDrive\Desktop\Spot\venv\lib\site-packages\pytube\streams.py", line 252, in download
for chunk in request.stream(
File "C:\Users\edonj\OneDrive\Desktop\Spot\venv\lib\site-packages\pytube\request.py", line 185, in stream
chunk = response.read()
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\http\client.py", line 476, in read
s = self._safe_read(self.length)
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.9_3.9.3568.0_x64__qbz5n2kfra8p0\lib\http\client.py", line 628, in _safe_read
raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(66612 bytes read, 9370572 more expected)

VideoUnavailable.
Proxy will help you.
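The IncompleteRead in the traceback means the connection closed before the whole response body had arrived. Independent of the proxy suggestion, one common mitigation is simply to retry the download. The sketch below is not part of pytube; the helper name retry_on_incomplete_read and its parameters are hypothetical, and it assumes the failure is transient.

```python
import http.client
import time


def retry_on_incomplete_read(action, attempts=3, delay_seconds=2.0):
    """Call action() and retry it when the connection drops mid-transfer.

    action is any zero-argument callable (hypothetical helper, not pytube API).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except http.client.IncompleteRead as error:
            last_error = error
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                time.sleep(delay_seconds)
    raise last_error


# Possible usage with the code from the question (assumes pytube is installed):
# retry_on_incomplete_read(lambda: yt.download("Lieder"))
```

If every attempt fails, the last IncompleteRead is re-raised, so the original traceback is not hidden.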

Related

Using pytube on PyCharm

I have a script that worked fine in April 2021 when I created it, but now it gives me the following error. I'm not very experienced in coding, so any help would be great.
What I'm trying to do is simply download a song from YouTube as an MP4. I can see that the error says something is wrong inside the imported pytube module, but I am not skilled enough to see what it is.
I'm using macOS 12.1, PyCharm 2020.3, and Python 3.9.
Script:
import pytube
url = str('https://www.youtube.com/watch?v=gJLIiF15wjQ')
youtube = pytube.YouTube(url)
video = youtube.streams.get_by_itag(140)
video.download(output_path='/Users/clarajacobsen/Documents/TrueFIR/Klub100/Songs/', filename='test')
Error:
Traceback (most recent call last):
File "/Users/user/Documents/Folder1/venv/test.py", line 8, in <module>
video = youtube.streams.get_by_itag(140)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/__main__.py", line 292, in streams
return StreamQuery(self.fmt_streams)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/__main__.py", line 177, in fmt_streams
extract.apply_signature(stream_manifest, self.vid_info, self.js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/extract.py", line 409, in apply_signature
cipher = Cipher(js=js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 43, in __init__
self.throttling_plan = get_throttling_plan(js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 387, in get_throttling_plan
raw_code = get_throttling_function_code(js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 301, in get_throttling_function_code
code_lines_list = find_object_from_startpoint(js, match.span()[1]).split('\n')
AttributeError: 'NoneType' object has no attribute 'span'
After trying out solution 1, suggested by Sarim, error in PyCharm:
Traceback (most recent call last):
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/__main__.py", line 177, in fmt_streams
extract.apply_signature(stream_manifest, self.vid_info, self.js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/extract.py", line 409, in apply_signature
cipher = Cipher(js=js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 29, in __init__
self.transform_plan: List[str] = get_transform_plan(js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 197, in get_transform_plan
return regex_search(pattern, js, group=1).split(";")
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/helpers.py", line 129, in regex_search
raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for iha=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)}
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/user/Documents/Folder1/venv/test.py", line 5, in <module>
video = youtube.streams.get_by_itag(140)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/__main__.py", line 292, in streams
return StreamQuery(self.fmt_streams)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/__main__.py", line 184, in fmt_streams
extract.apply_signature(stream_manifest, self.vid_info, self.js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/extract.py", line 409, in apply_signature
cipher = Cipher(js=js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 29, in __init__
self.transform_plan: List[str] = get_transform_plan(js)
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/cipher.py", line 197, in get_transform_plan
return regex_search(pattern, js, group=1).split(";")
File "/Users/user/Documents/Folder1/venv/lib/python3.9/site-packages/pytube/helpers.py", line 129, in regex_search
raise RegexMatchError(caller="regex_search", pattern=pattern)
pytube.exceptions.RegexMatchError: regex_search: could not find match for iha=function\(\w\){[a-z=\.\(\"\)]*;(.*);(?:.+)}
After trying to run it in Google Colab:
/usr/local/lib/python3.7/dist-packages/pytube/cipher.py in get_throttling_function_code(js)
299
300 # Extract the code within curly braces for the function itself, and merge any split lines
--> 301 code_lines_list = find_object_from_startpoint(js, match.span()[1]).split('\n')
302 joined_lines = "".join(code_lines_list)
303
AttributeError: 'NoneType' object has no attribute 'span'
This fix does not depend on your operating system or Python version. I used Google Colab for this, so if you use Colab you can test it there.
Solution 1:
Install pytube with !pip install pytube
After installing pytube, shut down the kernel and the application you are running it in (VS Code, Jupyter Notebook, or Colab).
Then start the environment again and try importing pytube and running your code.
It should run now.
Solution 2, if you still get the same error as before:
Go to the folder where pytube is installed, open the package folder named "pytube", and open "cipher.py".
Go to line 293, which reads name = re.escape(get_throttling_function_name(js))
Replace it with name = "iha"
Then close all kernels and files that are running the code, and restart them completely.
One of these two solutions should work. The one that worked for me is the first.
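Finding "the folder where pytube is installed" can be the awkward part, especially inside a virtual environment. A minimal sketch of one way to locate it from Python follows; find_package_file is a hypothetical helper name, and it relies only on the standard importlib machinery.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_package_file(package: str, filename: str) -> Optional[Path]:
    """Return the path of filename inside an installed package, or None."""
    spec = importlib.util.find_spec(package)
    # find_spec returns None when the package is not installed in this
    # interpreter; plain modules have no submodule_search_locations.
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / filename
        if candidate.is_file():
            return candidate
    return None


# Prints the path of pytube's cipher.py for the interpreter running this
# script, or None if pytube is not installed there.
print(find_package_file("pytube", "cipher.py"))
```

Run it with the same interpreter (or kernel) that runs your script; otherwise it may report a different installation than the one that raises the error.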
As the traceback shows, the NoneType object is match inside pytube's cipher.py (line 301), not your youtube object: the regular expression that looks for the throttling function found nothing to match. Did you check whether the YouTube link, or anything on that video page that concerns you, has changed?

pytube.exceptions.RegexMatchError: __init__: could not find match for ^\w+\W

My issue is that I run this simple code to create a pytube stream object:
from pytube import YouTube
yt = YouTube('https://www.youtube.com/watch?v=aQNrG7ag2G4')
stream = yt.streams.filter(file_extension='mp4')
and end up with the error in the title.
Full error:
Traceback (most recent call last):
File ".\test.py", line 4, in <module>
stream = yt.streams.filter(file_extension='mp4')
File "C:\Users\logan\AppData\Local\Programs\Python\Python38\lib\site-packages\pytube\__main__.py", line 292, in streams
return StreamQuery(self.fmt_streams)
File "C:\Users\logan\AppData\Local\Programs\Python\Python38\lib\site-packages\pytube\__main__.py", line 184, in fmt_streams
extract.apply_signature(stream_manifest, self.vid_info, self.js)
File "C:\Users\logan\AppData\Local\Programs\Python\Python38\lib\site-packages\pytube\extract.py", line 409, in apply_signature
cipher = Cipher(js=js)
File "C:\Users\logan\AppData\Local\Programs\Python\Python38\lib\site-packages\pytube\cipher.py", line 33, in __init__
raise RegexMatchError(
pytube.exceptions.RegexMatchError: __init__: could not find match for ^\w+\W
Extra data:
python version: 3.8.10
pytube version: 11.0.2
As juanchosaravia suggested on https://github.com/pytube/pytube/issues/1199, to solve the problem, open the cipher.py file and replace line 30, which is:
var_regex = re.compile(r"^\w+\W")
with this line:
var_regex = re.compile(r"^\$*\w+\W")
After that, it worked again.
Go to cipher.py in the pytube folder inside site-packages.
Change line 30 from
var_regex = re.compile(r"^\w+\W")
to
var_regex = re.compile(r"^\$*\w+\W")
"^" anchors the match at the beginning of the string
"\$*" matches zero or more literal "$" characters
"\w+\W" matches one or more word characters followed by a single non-word character
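To see why the extra \$* matters, the two patterns can be compared directly with Python's re module. The sample strings below are made up for illustration and are not taken from YouTube's actual player code; the point is only that the old pattern cannot match a name that begins with "$".

```python
import re

OLD_VAR_REGEX = re.compile(r"^\w+\W")     # original line 30
NEW_VAR_REGEX = re.compile(r"^\$*\w+\W")  # patched line 30


def first_match(pattern, text):
    """Return the matched prefix, or None when the pattern does not match."""
    found = pattern.search(text)
    return found.group(0) if found else None


# Hypothetical inputs: one plain name, one name prefixed with "$".
for sample in ["ab.cd(a,1)", "$ab.cd(a,1)"]:
    print(sample, first_match(OLD_VAR_REGEX, sample), first_match(NEW_VAR_REGEX, sample))
```

"$" is not a word character, so \w+ cannot start on it; allowing any number of leading "$" characters lets the same pattern accept both kinds of name.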

Getting 2 errors while converting MP3 to WAV

I am trying to play an MP3 file using the pyglet module.
Following some suggestions, I have already installed avbin64 and moved avbin64.dll to the directory where my Python code is, but I am still getting two errors.
import pyglet
music = pyglet.resource.media('song.mp3')
music.play()
pyglet.app.run()
Error:
Traceback (most recent call last):
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 59, in __init__
self._wave = wave.open(file)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 510, in open
return Wave_read(f)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 164, in __init__
self.initfp(f)
File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\wave.py", line 131, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "F:/PycharmProjects/test/test2.py", line 3, in <module>
music = pyglet.resource.media('song.mp3')
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\resource.py", line 678, in media
return media.load(path, streaming=streaming)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\__init__.py", line 143, in load
raise first_exception
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\__init__.py", line 133, in load
loaded_source = decoder.decode(file, filename, streaming)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 109, in decode
return WaveSource(filename, file)
File "F:\PycharmProjects\test\venv\lib\site-packages\pyglet\media\codecs\wave.py", line 61, in __init__
raise WAVEDecodeException(e)
pyglet.media.codecs.wave.WAVEDecodeException: file does not start with RIFF id
As per "Loading media", you're supposed to open audio (and video) files with pyglet.media.load:
music = pyglet.media.load('song.mp3')
You must also have ffmpeg installed for pyglet to be able to read mp3 files (as per Supported media types). Make sure to follow the installation instructions.

Pyglet can't load .wav file

Env:
Ubuntu 18.04
Python 3.6.6
pyglet 1.3.2
Issue:
Based on the pyglet documentation, I am trying to run the following code:
import pyglet
pyglet.options["audio"] = ("openal", "pulse", "directsound", "silent")
explosion = pyglet.media.load('explosion.wav')
But the following exceptions occurred:
1) if file was converted by ffmpeg -i input.mp3 output.wav
Traceback (most recent call last):
File "<path_to_dir>/test_sound.py", line 3, in <module>
explosion = pyglet.media.load('zxc.wav', streaming=False)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/loader.py", line 63, in load
source = get_source_loader().load(filename, file)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/loader.py", line 84, in load
return WaveSource(filename, file)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/riff.py", line 197, in __init__
raise WAVEFormatException('Not a WAVE file')
pyglet.media.sources.riff.WAVEFormatException: Not a WAVE file
2) or this, for several .wav files from the internet
Traceback (most recent call last):
File "<path_to_dir>//test_sound.py", line 3, in <module>
explosion = pyglet.media.load('explosion.wav', streaming=False)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/loader.py", line 63, in load
source = get_source_loader().load(filename, file)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/loader.py", line 84, in load
return WaveSource(filename, file)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/riff.py", line 192, in __init__
format = wave_form.get_format_chunk()
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/riff.py", line 172, in get_format_chunk
for chunk in self.get_chunks():
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/riff.py", line 108, in get_chunks
chunk = cls(self.file, name, length, offset)
File "<path_to_env>lib/python3.6/site-packages/pyglet/media/sources/riff.py", line 153, in __init__
raise RIFFFormatException('Size of format chunk is incorrect.')
pyglet.media.sources.riff.RIFFFormatException: Size of format chunk is incorrect.
Question:
How do I play .wav files with pyglet correctly?
As in the example, it is probably an issue with either OpenAL or the WAV files. Do the procedural sounds play correctly, e.g.:
import pyglet
from pyglet.media.sources.procedural import Sine
sine = Sine(duration=1, frequency=500,
            sample_size=16, sample_rate=44100)
pyglet.media.StaticSource(sine).play()
Also, can you share an offending WAV file? I just ran a test on Linux Mint 19, Python 3.7.1 and pyglet 1.3.2 with https://github.com/pyreiz/pyreiz/blob/master/reiz/media/wav/ding.wav and it runs fine.

wave.Error: file does not start with RIFF id

I am trying to use the SpeechRecognition library (https://pypi.python.org/pypi/SpeechRecognition/). When running the example code (full example) below:
#!/usr/bin/env python3
import speech_recognition as sr
# obtain path to "english.wav" in the same folder as this script
from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "english.wav")
#AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "french.aiff")
#AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "chinese.flac")
# use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # read the entire audio file
I receive this error:
/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/bin/python3.4 /Users/adamg/te/Polli/ASR/SpeechRecognitionTest0.py
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 174, in __enter__
self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 497, in open
return Wave_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 163, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/wave.py", line 130, in initfp
raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 179, in __enter__
self.audio_reader = aifc.open(self.filename_or_fileobject, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 887, in open
return Aifc_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 340, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 305, in initfp
raise Error('file does not start with FORM id')
aifc.Error: file does not start with FORM id
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/adamg/te/Polli/ASR/SpeechRecognitionTest0.py", line 13, in <module>
with sr.AudioFile(AUDIO_FILE) as source:
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/speech_recognition/__init__.py", line 199, in __enter__
self.audio_reader = aifc.open(aiff_file, "rb")
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 887, in open
return Aifc_read(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 340, in __init__
self.initfp(f)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/aifc.py", line 303, in initfp
chunk = Chunk(file)
File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/chunk.py", line 63, in __init__
raise EOFError
EOFError
Process finished with exit code 1
How can this be fixed?
I had the same problem for a while. My mistake was downloading the WAV file using "Save as". When I double-clicked the downloaded file, no player could play it. After cloning the repository or downloading the entire zip file, I was able to play the WAV file and the error disappeared.
Your WAV file is probably corrupted. To check, try to play the file in any media player. Download the audio file into your environment correctly and then it should work.
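Besides trying a media player, the first bytes of the file show what it really is, which is exactly what the "file does not start with RIFF id" message is about. The sketch below checks only the well-known container signatures; the function names are hypothetical and it does not validate the rest of the file.

```python
def sniff_audio_header(header: bytes) -> str:
    """Guess the file type from its leading bytes."""
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    if header[:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
        return "aiff"
    if header[:4] == b"fLaC":
        return "flac"
    # A web page saved under a .wav name usually starts with an HTML tag.
    text_start = header.lstrip().lower()
    if text_start.startswith(b"<!doctype") or text_start.startswith(b"<html"):
        return "html"
    return "unknown"


def sniff_audio_file(path: str) -> str:
    """Read the first bytes of the file at path and classify them."""
    with open(path, "rb") as handle:
        return sniff_audio_header(handle.read(64))
```

For example, sniff_audio_file("english.wav") returning "html" would mean a web page was saved instead of the audio data.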
I ran into this issue when I downloaded the file the wrong way. I downloaded the audio file with the following command in Google Colab:
!wget 'https://github.com/mozilla/DeepSpeech/blob/master/data/smoke_test/LDC93S1_pcms16le_1_16000.wav'
This caused the error because the downloaded file was not valid audio (my media player could not play it): the blob URL returns GitHub's HTML page for the file, not the file itself.
I corrected the issue by downloading from the raw URL instead:
!wget 'https://raw.githubusercontent.com/mozilla/DeepSpeech/master/data/smoke_test/LDC93S1_pcms16le_1_16000.wav'
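The difference between the two URLs is mechanical: the host changes and the "blob" path segment is dropped. A small sketch of that rewrite, using only the URL shapes shown in the two commands; github_blob_to_raw is a hypothetical helper name, and it assumes a public file on github.com.

```python
from urllib.parse import urlsplit, urlunsplit


def github_blob_to_raw(url: str) -> str:
    """Turn a github.com '/owner/repo/blob/branch/path' URL into its raw form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Expected shape: owner / repo / "blob" / branch / at least one path part.
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a github.com blob URL: " + url)
    owner, repo, _blob, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])
    return urlunsplit((parts.scheme, "raw.githubusercontent.com", raw_path, "", ""))


print(github_blob_to_raw(
    "https://github.com/mozilla/DeepSpeech/blob/master/data/smoke_test/LDC93S1_pcms16le_1_16000.wav"
))
```

Note that files stored with Git LFS are a separate case: their raw URL may return only a small pointer file.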
For what it's worth, I ran into this problem when WAV files I had stored in a repo with Git LFS were not cloned properly. Their pointers were present, but the files were not. A
git lfs pull
fixed the problem.
That being said, this can happen any time the path points to a file that isn't actually a complete audio file. Hopefully that helps!
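A Git LFS pointer is a small text file whose first line starts with "version https://git-lfs.github.com/spec/v1", so a file that was never pulled can be detected before handing it to an audio library. A minimal sketch, with hypothetical function names:

```python
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_git_lfs_pointer(data: bytes) -> bool:
    """True when data begins like a Git LFS pointer file."""
    return data.startswith(LFS_POINTER_PREFIX)


def file_is_git_lfs_pointer(path: str) -> bool:
    """Check the first bytes of the file at path for the LFS pointer header."""
    with open(path, "rb") as handle:
        return is_git_lfs_pointer(handle.read(len(LFS_POINTER_PREFIX)))
```

If file_is_git_lfs_pointer returns True for a .wav path, running git lfs pull in the repository should replace the pointer with the real audio data.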
