OSError: Not a gzipped file (b've') python

I have the following code, and I made sure the file's name and extension are correct. However, I still get the error shown below.
I saw that someone asked a similar question here on Stack Overflow and read the answer, but it did not help me:
Failed to load a .bin.gz pre trained words2vecx
Any suggestions how to fix this?
Input:
import gensim
word2vec_path = "GoogleNews-vectors-negative300.bin.gz"
word2vec = gensim.models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)
Output:
OSError: Not a gzipped file (b've')

The problem is that the file you've downloaded is not a gzip file. If you check the file's size, it may be only a few KB. That is what happened to me when I downloaded it from this GitHub link, because the repository needs git-lfs.
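A quick way to confirm this before loading anything is to look at the first bytes of the file. The b've' in the error message is the two bytes Python found where it expected the gzip magic bytes, which is consistent with a Git LFS pointer file (these start with the text "version https://git-lfs"). The sketch below is only an illustration: describe_download and the demo file names are made up for this example and are not part of gensim.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def describe_download(path):
    """Classify a downloaded file by its first bytes and report its size."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(32)
    if head.startswith(GZIP_MAGIC):
        return "gzip", size
    if head.startswith(b"version https://git-lfs"):
        # A Git LFS pointer is a small text file, not the model itself.
        return "git-lfs pointer", size
    return "unknown", size


# Demo on two throwaway files (hypothetical names, created only for this example).
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "real.bin.gz")
    with gzip.open(real, "wb") as f:
        f.write(b"payload")

    pointer = os.path.join(tmp, "pointer.bin.gz")
    with open(pointer, "wb") as f:
        f.write(b"version https://git-lfs.github.com/spec/v1\n")

    real_kind, real_size = describe_download(real)
    pointer_kind, pointer_size = describe_download(pointer)

print(real_kind, real_size)
print(pointer_kind, pointer_size)
```

If your own GoogleNews-vectors-negative300.bin.gz comes back as anything other than "gzip", download it again before trying to load it.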
Here is an alternate solution to resolve this issue:
Download the model using the below command on your terminal:
wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
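Then decompress it (for example with gunzip) and load the resulting .bin file with gensim as usual; gensim can also read the .bin.gz file directly if you pass that name instead: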
from gensim import models
w = models.KeyedVectors.load_word2vec_format(
'GoogleNews-vectors-negative300.bin', binary=True)
Hope this helps you!!

Try this
from gensim import models
word2vec_path = 'https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz'
word2vec = models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)

Related

Open-Cv dnn error for python while using Yolov3. Using open-cv ver(4.2.0)

cv2.error: OpenCV(4.2.0) C:\projects\opencv-python\opencv\modules\dnn\src\darknet\darknet_io.cpp:677: error: (-212:Parsing error) Unknown layer type: in function 'cv::dnn::darknet::ReadDarknetFromCfgStream
Code:
import cv2
import numpy as np
# Load Yolo
path =r"D:\yolov3-coco"
weight = path+r"\yolov3.weights"
cfg = path+r"\yolov3.cfg"
net = cv2.dnn.readNetFromDarknet(weight ,cfg )
classes = []
with open(path+"\coco.txt", "r") as f:
classes = [line.strip() for line in f.readlines()]
print(cfg)
print(weight)
print(classes)
cv2.destroyAllWindows()
I have already tried net = cv2.dnn.readNet(weights, cfg), but it did not work. I also downloaded the weights and config files from https://pjreddie.com/media/files/yolov3.weights and put them in a folder called yolov3-coco.
You seem to be passing the *.weights file first and the *.cfg file second. readNetFromDarknet takes the *.cfg file first, then the Darknet weights file. Reference for readNetFromDarknet:
https://docs.opencv.org/master/d6/d0f/group__dnn.html#gafde362956af949cce087f3f25c6aff0d
This problem is similar to:
YOLO V3 Video Stream Object Detection
Maybe the yolov3-coco cfg and weights files do not match, which would explain the Unknown layer type error.
The path to yolov3.weights and yolov3.cfg may also be wrong, so you can pass the paths where they are stored directly:
net = cv2.dnn.readNetFromDarknet("yolov3-coco/yolov3.cfg", "yolov3-coco/yolov3.weights")
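Both points above (argument order and wrong paths) can be checked before OpenCV is involved at all. The following is a minimal sketch; darknet_files is a hypothetical helper written for this example, and it assumes the two files share a base name such as yolov3.

```python
from pathlib import Path
import tempfile


def darknet_files(folder, stem="yolov3"):
    """Return (cfg, weights) as strings, in the order readNetFromDarknet expects."""
    folder = Path(folder)
    cfg = folder / f"{stem}.cfg"
    weights = folder / f"{stem}.weights"
    # Fail early with a readable message instead of an OpenCV parsing error.
    missing = [str(p) for p in (cfg, weights) if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing Darknet files: " + ", ".join(missing))
    return str(cfg), str(weights)


# Demo with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "yolov3.cfg").touch()
    (Path(tmp) / "yolov3.weights").touch()
    cfg_path, weights_path = darknet_files(tmp)

print(Path(cfg_path).name, Path(weights_path).name)
```

With OpenCV installed, the result can be passed straight through, for example net = cv2.dnn.readNetFromDarknet(*darknet_files(r"D:\yolov3-coco")).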
For me, the problem was that I had the wrong files. I was trying this in Google Colab with TensorFlow 2.5.0. The following files worked:
!wget "https://pjreddie.com/media/files/yolov3.weights"
!wget "https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg"
!wget "https://raw.githubusercontent.com/pjreddie/darknet/master/data/coco.names"
Try using OpenCV version 5. That solved the error for me.
I had a problem like this. Check the paths in your code and make sure you know where the cfg and weights files are, then open the main folder and work from there.

Pytorch datasets.ImageFolder flag FileNotFoundError: [Errno 2] No such file or directory: '\u2068/

I am trying to load data from a local directory in PyTorch using datasets.ImageFolder, but I get a FileNotFoundError:
import torch
from torchvision import datasets, transforms
data_dir = '⁨/Users/Desktop/Udacity/AI for Trading/deep-learning-v2-pytorch/intro-to-pytorch/data⁩/Cat_Dog_data⁩/⁨train⁩'
transform = transforms.Compose([transforms.Resize(255),
transforms.CenterCrop(224),
transforms.ToTensor()])
dataset = datasets.ImageFolder(root=data_dir, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
FileNotFoundError: [Errno 2] No such file or directory: '\u2068/Users/Desktop/Udacity/AI for Trading/deep-learning-v2-pytorch/intro-to-pytorch/data\u2069/Cat_Dog_data\u2069/\u2068train\u2069'
Answer: Retyping the Image Dir Path manually instead of copy-pasting solved the problem.
This error is not related to PyTorch or the datasets library; it is raised by Python's os module.
I tried loading a simple path like:
import os
data_dir = '/Users'
os.chdir(data_dir)
Even the code above failed. After some research, the error turned out to have the following cause:
There are some invisible 'LEFT-TO-RIGHT MARK' (U+200E) and 'FIRST STRONG ISOLATE' (U+2068) characters in the string.
Retyping the path manually (instead of copy-pasting it) solved the problem.
This Stack Overflow answer helped me fix the problem. I still decided to post it again, because the problem had not been reported in the context of PyTorch, and hopefully this helps people connect the dots.
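If retyping is not practical, the invisible characters can be found and removed in code. The sketch below treats every character in Unicode category Cf (format) as suspect; that category covers U+200E, U+2068 and the U+2069 seen in the error message. The helper names and the shortened example path are assumptions made for illustration.

```python
import unicodedata


def find_format_chars(text):
    """List (index, code point, name) for every invisible format character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_format_chars(text):
    """Return text with all category-Cf characters removed."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# A shortened stand-in for a path pasted from a file manager.
pasted = "\u2068/Users/Desktop/data\u2069/Cat_Dog_data\u2069/\u2068train\u2069"

for hit in find_format_chars(pasted):
    print(hit)

cleaned = strip_format_chars(pasted)
print(repr(cleaned))
```

Note that category Cf also includes characters such as the zero-width joiner, which can legitimately occur in some file names, so it is worth printing what was found before stripping it.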

Error with NLTK package and other dependencies

I have installed the NLTK package and other dependencies and set the environment variables as follows:
STANFORD_MODELS=/mnt/d/stanford-ner/stanford-ner-2018-10-16/classifiers/english.all.3class.distsim.crf.ser.gz:/mnt/d/stanford-ner/stanford-ner-2018-10-16/classifiers/english.muc.7class.distsim.crf.ser.gz:/mnt/d/stanford-ner/stanford-ner-2018-10-16/classifiers/english.conll.4class.distsim.crf.ser.gz
CLASSPATH=/mnt/d/stanford-ner/stanford-ner-2018-10-16/stanford-ner.jar
When I try to access the classifier like below:
stanford_classifier = os.environ.get('STANFORD_MODELS').split(':')[0]
stanford_ner_path = os.environ.get('CLASSPATH').split(':')[0]
st = StanfordNERTagger(stanford_classifier, stanford_ner_path, encoding='utf-8')
I get the following error, and I don't understand what is causing it.
Error: Could not find or load main class edu.stanford.nlp.ie.crf.CRFClassifier
OSError: Java command failed : ['/mnt/c/Program Files (x86)/Common Files/Oracle/Java/javapath_target_1133041234/java.exe', '-mx1000m', '-cp', '/mnt/d/stanford-ner/stanford-ner-2018-10-16/stanford-ner.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-loadClassifier', '/mnt/d/stanford-ner/stanford-ner-2018-10-16/classifiers/english.all.3class.distsim.crf.ser.gz', '-textFile', '/tmp/tmpaiqclf_d', '-outputFormat', 'slashTags', '-tokenizerFactory', 'edu.stanford.nlp.process.WhitespaceTokenizer', '-tokenizerOptions', '"tokenizeNLs=false"', '-encoding', 'utf8']
I found the answer to this issue. I am using NLTK 3.4. From NLTK 3.3 onward, the Stanford NLP tools (POS tagger, NER tagger, tokenizer) are not loaded as part of nltk.tag but through nltk.parse.corenlp.CoreNLPParser. The Stack Overflow answer is at stackoverflow.com/questions/13883277/stanford-parser-and-nltk/… and the official documentation on GitHub is at github.com/nltk/nltk/wiki/Stanford-CoreNLP-API-in-NLTK.
Additional information: if you are facing a timeout issue from the NER tagger or any other parser of the CoreNLP API, please increase the timeout limit as described by dimazest in https://github.com/nltk/nltk/wiki/Stanford-CoreNLP-API-in-NLTK/_compare/3d64e56bede5e6d93502360f2fcd286b633cbdb9...f33be8b06094dae21f1437a6cb634f86ad7d83f7.
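For reference, NER tagging through CoreNLPParser looks roughly like the sketch below. It assumes a Stanford CoreNLP server has already been started separately and is listening on http://localhost:9000 (the default URL of CoreNLPParser); ner_tag is a hypothetical wrapper written for this example, not an NLTK function.

```python
def ner_tag(tokens, url="http://localhost:9000"):
    """Tag a list of tokens with NER labels via a running CoreNLP server."""
    # Imported inside the function so this file can still be loaded
    # on machines where NLTK is not installed.
    from nltk.parse.corenlp import CoreNLPParser

    # tagtype="ner" selects named-entity tagging; "pos" would select POS tagging.
    tagger = CoreNLPParser(url=url, tagtype="ner")
    return list(tagger.tag(tokens))
```

Once the server is up, calling ner_tag("Rami Eid is studying at Stony Brook University in NY".split()) should return a list of (token, label) pairs. Nothing here touches STANFORD_MODELS or CLASSPATH, because the models are loaded by the server process.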

Cannot load audio file in tensorflow (Windows10)

This is my problem.
I can load the audio_binary like this
audio_binary = tf.read_file(wav_file_path)
but when I try to read the wav with this:
from tensorflow.contrib import ffmpeg
waveform = ffmpeg.decode_audio( audio_binary, file_format='wav', samples_per_second=16000, channel_count=1)
I get the error ImportError: No module named 'tensorflow.contrib.ffmpeg.ops'
I have also tried doing this:
from tensorflow.contrib.framework.python.ops import audio_ops as contrib_audio
wav_decoder = contrib_audio.decode_wav(audio_binary, desired_channels=1)
and I get this error InvalidArgumentError: Header mismatch: Expected RIFF but found NIST
By the way I'm using tensorflow-gpu in a Jupyter notebook.
Any help would be highly appreciated.
Thanks!
You might want to check what version of tensorflow you currently have.
tensorflow 1.X:
tensorflow.contrib.ffmpeg.decode_audio()
tensorflow 2.X:
tensorflow.audio.decode_wav()
Keep in mind that decode_wav() needs the contents of the .wav file (for example, as returned by tf.io.read_file) and cannot read them from the file path on its own.
For more information on tensorflow.audio.decode_wav(), see the documentation here: https://www.tensorflow.org/api_docs/python/tf/audio/decode_wav
Check out this answer for more information: From audio to tensor, back to audio in tensorflow
In case someone has the same problem:
I was using the TIMIT database, and although its files have the .wav extension, they use a different encoding (NIST). I had to convert them to RIFF like this: forfiles /s /m *.wav /c "cmd /c sph2pipe -f wav @file @fnameRIFF.wav"
and then use the second command, contrib_audio.decode_wav(...)
Based on this answer:
Change huge amount of data from NIST to RIFF wav file
And this page:
http://soundfile.sapp.org/doc/WaveFormat/
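To tell which files need converting, the container type can be read from the first few bytes: a RIFF WAVE file starts with "RIFF" and has "WAVE" at byte offset 8, while a NIST SPHERE file starts with "NIST_1A". This is a minimal sketch; wav_container and the demo files are made up for illustration.

```python
import os
import tempfile
import wave


def wav_container(path):
    """Return 'RIFF', 'NIST' or 'unknown' based on the file header."""
    with open(path, "rb") as f:
        head = f.read(12)
    if head[:4] == b"RIFF" and head[8:12] == b"WAVE":
        return "RIFF"
    if head.startswith(b"NIST_1A"):
        return "NIST"
    return "unknown"


# Demo: one real RIFF file written with the wave module, one fake NIST header.
with tempfile.TemporaryDirectory() as tmp:
    riff_path = os.path.join(tmp, "riff.wav")
    with wave.open(riff_path, "wb") as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 16-bit samples
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * 160)  # 10 ms of silence

    nist_path = os.path.join(tmp, "nist.wav")
    with open(nist_path, "wb") as f:
        f.write(b"NIST_1A\n   1024\n")

    riff_kind = wav_container(riff_path)
    nist_kind = wav_container(nist_path)

print(riff_kind, nist_kind)
```

Files reported as NIST are the ones that trigger "Expected RIFF but found NIST" and need the sph2pipe conversion first.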

Fasttext for Python - module 'fasttext' has no attribute 'load_model'

Please forgive my newbness here, but fasttext is not working for me in Python. I am using Anaconda running Python 3.6. My code is as follows (just an example):
import fasttext
model = fasttext.load_model('/home/sproc/share/fastText/model.bin')
print(model.words)
This returns the following error:
Traceback (most recent call last):
File "/media/sf_VBoxShare/LiClipseWorkspace/test/testpack/fasttext.py", line 1, in <module>
import fasttext
File "/media/sf_VBoxShare/LiClipseWorkspace/test/testpack/fasttext.py", line 3, in <module>
model = fasttext.load_model('/home/sproc/share/fastText/model.bin')
AttributeError: module 'fasttext' has no attribute 'load_model'
It does the same thing with cbow and skipgram when trying to create word vectors. I checked the __init__.py file in the .../site-packages/fasttext directory and it imports said attributes, but they are not part of the model.py module. I'm guessing this has something to do with the shared object file, but I am not sure. Any help is greatly appreciated.
Here is a solution that worked for me when I got the error you are getting;
Import FastText
from gensim.models.wrappers import FastText
Load the binary
model=FastText.load_fasttext_format('wiki.simple.bin')
Rename your Python file.
Don't name it fasttext.py. If you name it that, what you import with "import fasttext" will be your own file.
You can rename it to 'fast_text.py' or something else.
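The traceback in the question already hints at this: "import fasttext" resolves to .../testpack/fasttext.py rather than to site-packages. To check where any module name resolves without importing it, something like the following sketch can be used. module_origin is a hypothetical helper, and the demo uses a made-up module name so that it runs without fasttext installed.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def module_origin(name):
    """Return the file a top-level module name would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Demo: a script placed early on sys.path wins the lookup for its own name.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "shadow_demo_pkg.py")
    with open(script, "w") as f:
        f.write("# stand-in for a script named like the package it imports\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        origin = module_origin("shadow_demo_pkg")
    finally:
        sys.path.remove(tmp)

print(os.path.basename(origin))
```

Running module_origin("fasttext") from the project directory shows whether the name points at your own fasttext.py or at the installed package.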
If you install fastText package instead of the old fasttext, then
import fastText
model = fastText.load_model('/home/sproc/share/fastText/model.bin')
should work as expected.
@spencerktm30 I recommend using pyfasttext instead of fasttext, which is no longer active and has a lot of bugs.
I actually faced a similar issue when trying to load a pre-trained C++ model, and I had to switch to pyfasttext to get it to work.
So this should hopefully work for you:
>>> from pyfasttext import FastText
>>> model = FastText('/home/sproc/share/fastText/model.bin')
Rename the file from fasttext.py to something else and it will work.
Apparently there are different fasttext python libraries out there!
fasttext != fasttext-win
