get_file(fname, origin=_URL, extract=True) not extracting the file

My code:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
dl_fname = '/Volumes/D/PythonCode/tf_transfer_learning/cats_and_dogs.zip'
path_to_zip = tf.keras.utils.get_file(dl_fname, origin=_URL, extract=True)
I can see cats_and_dogs.zip being downloaded, but it is not extracted/unzipped. I am on macOS, using PyCharm.
I am not sure why. Does anyone have a pointer? Thanks.

The URL is https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip but you saved it as cats_and_dogs.zip. When you extract it, you get a directory cats_and_dogs_filtered, no matter what the name of the zip file is.
It does seem, though, that the extracted directory ends up under ~/.keras/datasets/ even if one passes an absolute path for the zip file:
import tensorflow as tf
url = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
dl_path = "/tmp/cats_and_dogs_filtered.zip"
path_to_zip = tf.keras.utils.get_file(dl_path, origin=url, extract=True)
The zip file is saved to the absolute path, but the extracted path goes to ~/.keras/datasets/cats_and_dogs_filtered/.
print(path_to_zip)
# /tmp/cats_and_dogs_filtered.zip
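The point that the extracted directory name comes from the archive's contents, not from the zip file's name, can be illustrated with the standard library alone. This is a minimal sketch that does not use Keras; the tiny stand-in archive and the temporary paths are assumptions made only for the demonstration.

```python
import os
import tempfile
import zipfile

work = tempfile.mkdtemp()
# The zip file name is whatever you chose when downloading.
zip_path = os.path.join(work, "cats_and_dogs.zip")

# Build a tiny stand-in archive whose top-level folder is cats_and_dogs_filtered.
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("cats_and_dogs_filtered/train/cats/cat.0.jpg", b"")

# Extract it into a separate directory and look at what appears there.
extract_dir = os.path.join(work, "extracted")
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall(extract_dir)

extracted = os.listdir(extract_dir)
print(extracted)  # ['cats_and_dogs_filtered']
```

So when looking for the extracted data, search for the folder name stored inside the archive rather than a folder named after the zip file.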

Related

Cloned github repository url lead to error in google colab

I'm working on a Google Colab notebook available at this link: https://colab.research.google.com/github/lyricstopaintings/lyricstopaintings/blob/main/Lyrics%20inspired%20AI%20paintings.ipynb
On GitHub: https://github.com/lyricstopaintings/lyricstopaintings
I would like to store all the files for my project in my own GitHub repository, so I cloned a few public repositories, for example:
https://github.com/openai/CLIP
https://github.com/kostarion/guided-diffusion
The following script works fine:
import pathlib, shutil, os, sys
useCPU = False ##param {type:"boolean"}
if not is_colab:
    # If running locally, there's a good chance your env will need this in order to not crash upon np.matmul() or similar operations.
    os.environ['KMP_DUPLICATE_LIB_OK']='TRUE'
PROJECT_DIR = os.path.abspath(os.getcwd())
USE_ADABINS = True
if is_colab:
    if not google_drive:
        root_path = f'/content'
        model_path = '/content/models'
else:
    root_path = os.getcwd()
    model_path = f'{root_path}/models'
multipip_res = subprocess.run(['pip', 'install', 'lpips', 'datetime', 'timm', 'ftfy', 'einops', 'pytorch-lightning', 'omegaconf'], stdout=subprocess.PIPE).stdout.decode('utf-8')
print(multipip_res)
if is_colab:
    subprocess.run(['apt', 'install', 'imagemagick'], stdout=subprocess.PIPE).stdout.decode('utf-8')
try:
    from CLIP import clip
except:
    if not os.path.exists("CLIP"):
        gitclone("https://github.com/openai/CLIP")
    sys.path.append(f'{PROJECT_DIR}/CLIP')
try:
    from guided_diffusion.script_util import create_model_and_diffusion
except:
    if not os.path.exists("guided-diffusion"):
        gitclone("https://github.com/kostarion/guided-diffusion")
    sys.path.append(f'{PROJECT_DIR}/guided-diffusion')
# ... other modules and packages in the same structure
import torch
from dataclasses import dataclass
from functools import partial
import cv2
import pandas as pd
import gc
import io
import math
import timm
from IPython import display
import lpips
from PIL import Image, ImageOps
import requests
from glob import glob
import json
from types import SimpleNamespace
from torch import nn
from torch.nn import functional as F
import torchvision.transforms as T
import torchvision.transforms.functional as TF
from tqdm.notebook import tqdm
from CLIP import clip
from resize_right import resize
from guided_diffusion.script_util import create_model_and_diffusion, model_and_diffusion_defaults
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import random
from ipywidgets import Output
import hashlib
from functools import partial
if is_colab:
    os.chdir('/content')
    from google.colab import files
else:
    os.chdir(f'{PROJECT_DIR}')
from IPython.display import Image as ipyimg
from numpy import asarray
from einops import rearrange, repeat
import torch, torchvision
import time
from omegaconf import OmegaConf
import warnings
warnings.filterwarnings("ignore", category=UserWarning)
But when I change the links
https://github.com/openai/CLIP --> https://github.com/lyricstopaintings/lyricstopaintings/tree/main/CLIP
https://github.com/kostarion/guided-diffusion --> https://github.com/lyricstopaintings/lyricstopaintings/tree/main/guided-diffusion
I get the following error:
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-3-b4c7e2b3bdd9> in <module>()
107 import torchvision.transforms.functional as TF
108 from tqdm.notebook import tqdm
--> 109 from CLIP import clip
110 from resize_right import resize
111 from guided_diffusion.script_util import create_model_and_diffusion, model_and_diffusion_defaults
ModuleNotFoundError: No module named 'CLIP'
What is wrong with my approach? I'm new to this field, so sorry if this is something basic.

The GitHub repository should be referred to by its root URL, because a .../tree/main/CLIP URL is a web page for a subfolder, not something git can clone:
https://github.com/lyricstopaintings/lyricstopaintings
But this repository contains the cloned repositories as subfolders, so the path added to sys.path needs to include the subfolder:
sys.path.append(f'{PROJECT_DIR}/lyricstopaintings/CLIP')
I couldn't get from CLIP import clip working, so I changed it to import clip.
I cut the code down to a minimal example. I couldn't find some of the functions, such as subprocess.run and gitclone, so I replaced them with other functions.
import pathlib, shutil, os, sys
PROJECT_DIR = os.path.abspath(os.getcwd())
!pip install -q ftfy
try:
    import clip
except:
    # Clone only if the repository folder is not already there
    if not os.path.exists("lyricstopaintings"):
        !git clone -q "https://github.com/lyricstopaintings/lyricstopaintings"
    # Add the subfolder that directly contains the clip package
    sys.path.append(f'{PROJECT_DIR}/lyricstopaintings/CLIP')
    import clip
clip.available_models()
#
['RN50',
'RN101',
'RN50x4',
'RN50x16',
'RN50x64',
'ViT-B/32',
'ViT-B/16',
'ViT-L/14',
'ViT-L/14@336px']
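Why the subfolder has to be on sys.path can be checked without cloning anything. The sketch below builds a throwaway directory tree that imitates the cloned layout; the names myrepo and demo_clip are hypothetical stand-ins (chosen so they cannot clash with an installed clip package), not names from the repository.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical layout imitating a clone: <root>/myrepo/CLIP/demo_clip/__init__.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "myrepo", "CLIP", "demo_clip")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as fh:
    fh.write("def available_models():\n    return ['demo']\n")

# Adding only the clone root is not enough: the package sits one level deeper.
sys.path.append(os.path.join(root, "myrepo"))
try:
    importlib.import_module("demo_clip")
    found_from_root = True
except ModuleNotFoundError:
    found_from_root = False

# Adding the folder that directly contains the package makes it importable.
sys.path.append(os.path.join(root, "myrepo", "CLIP"))
importlib.invalidate_caches()
demo_clip = importlib.import_module("demo_clip")

print(found_from_root)               # False
print(demo_clip.available_models())  # ['demo']
```

The same reasoning applies to guided-diffusion: the entry appended to sys.path should be the folder that directly contains the guided_diffusion package.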

Python cv2.imread returns 'NoneType' object has no attribute 'shape'

I'm trying to read my pictures with cv2.imread from a folder named "Bilder", but I always get None returned. When I put my pictures in the folder "Straßenverkehr Projekt" (the folder, where my code [module.py] is also saved) it works.
Folder path of the pictures : C:\Users\ramif\Desktop\Straßenverkehr Projekt\Bilder
Folder path of the code: C:/Users/ramif/Desktop/Straßenverkehr Projekt/module.py
Traceback (most recent call last):
  File "c:/Users/ramif/Desktop/Straßenverkehr Projekt/module.py", line 12, in read_image
    print(img.shape)
AttributeError: 'NoneType' object has no attribute 'shape'
import cv2
import matplotlib.pyplot as plt
import numpy as np
import os
def read_image():
    'reading the images'
    folder = os.path.join(os.path.dirname(__file__),"Bilder")
    for i in os.listdir(folder):
        img = cv2.imread(i)
        print(img.shape)
read_image()

One of the annoying features of cv2.imread is that it does not raise an exception when it fails; it returns None, so you have to check the return value. A 'NoneType' error is usually a clue that it couldn't find or read the file. Here, os.listdir returns bare file names, which cv2.imread then looks up relative to the current working directory rather than in Bilder. There are a number of ways to solve this. Starting with your code, the easiest thing I can think of is to use os.chdir to change the working directory to where the pictures are:
import cv2
import matplotlib.pyplot as plt
import numpy as np
import os
def read_image():
    'reading the images'
    folder = os.path.join(os.path.dirname(__file__),"Bilder")
    os.chdir(folder)
    for i in os.listdir(folder):
        img = cv2.imread(i)
        print(img.shape)
read_image()
An alternative solution is to use os.path.join inside your for loop, i.e.
for i in os.listdir(folder):
    fullpath = os.path.join(folder, i)
    img = cv2.imread(fullpath)
    print(img.shape)
The @gautam-bose answer should work on Linux systems, but I forget what Python wants path separators to look like on Windows. If you print(folder) you can get an idea of what the separators are.

The problem is that os.listdir gives you just the file names inside the Bilder folder. To make this example work, you need to prepend the directory to the file name so that you have the complete path to the image.
import cv2
import matplotlib.pyplot as plt
import numpy as np
import os
def read_image():
    'reading the images'
    folder = os.path.join(os.path.dirname(__file__),"Bilder")
    for i in os.listdir(folder):
        img = cv2.imread(folder + '/' + i)
        print(img.shape)
read_image()
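Both answers come down to two habits: pass the full path, and check for None before touching the result. A minimal sketch of that pattern follows. The helper name read_images and its reader parameter are hypothetical; in the question's setting reader would be cv2.imread, but any callable that takes a path and returns None on failure fits.

```python
import os

def read_images(folder, reader):
    """Return {file name: image} for every file in folder that reader could load.

    reader is any callable taking a path and returning None on failure
    (the assumed use is cv2.imread).
    """
    images = {}
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)  # os.listdir yields bare names
        img = reader(full_path)
        if img is None:
            # Report the failure instead of crashing later on img.shape
            print(f"Skipping unreadable file: {full_path}")
            continue
        images[name] = img
    return images

# Assumed usage inside module.py:
#   folder = os.path.join(os.path.dirname(__file__), "Bilder")
#   images = read_images(folder, cv2.imread)
```

Skipping unreadable files also copes with non-image files (for example a stray text file) sitting in the same folder.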

How to solve FileNotFoundError: [Errno 2] No such file or directory for Python 3.7/Mac

Details:
Python 3.7.1, Mac OS High Sierra 10.13.6. I am using IDLE and running the program through the terminal. I recently had success with the MNIST handwritten numbers and now I am trying to train a Generative Adversarial Network with my own dataset. The dataset is a folder of images.
The error:
Traceback (most recent call last):
  File "Pride.py", line 29, in <module>
    listing = os.listdir(path1)
FileNotFoundError: [Errno 2] No such file or directory: 'Users/darren/Desktop/Pride'
I have looked at other threads on this issue but don't understand what's wrong with my path so I apologise if my error is due to something simple. The Python file that I'm executing from the terminal and my dataset folder are both on my Desktop.
Here is my code up until this point:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD,RMSprop,adam
from keras.utils import np_utils
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import os
import theano
from PIL import Image
from numpy import *
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
# input image dimensions
img_rows, img_cols = 200, 200
# number of channels
img_channels = 1
#%%
# data
path1 = "Users/darren/Desktop/Pride" #path of folder of images
path2 = "Users/darren/Desktop/Prideresized" #path of folder to save images
listing = os.listdir(path1)
num_samples=size(listing)
print ("num_samples")
for file in listing:
    im = Image.open(path1 + '\\' + file)
    img = im.resize((img_rows,img_cols))
    gray = img.convert('L')
    gray.save(path2 +'\\' + file, "JPEG")

On macOS, an absolute path must start with a leading slash. For example, if you save your dataset (or any other file) in the Documents folder, the path to that file looks like this:
melbourne_file_path = '/Users/adi/Documents/Top250.csv'
melbourne_data = pd.read_csv(melbourne_file_path)
melbourne_data.describe()

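If you're using relative paths (paths that do not start with a /), they are resolved against the current working directory, i.e. the directory you run the script from, so they must be written relative to it.
In your case, running the script from the Desktop: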
path1 = "Pride" #path of folder of images
path2 = "Prideresized" #path of folder to save images
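To see why the original path failed, it helps to check whether a path is absolute and what a relative one turns into once it is joined onto the working directory. This is a small sketch using pathlib's PurePosixPath, so it behaves the same on any platform; the function name resolve_like_os is hypothetical and only imitates the lookup.

```python
from pathlib import PurePosixPath

# Without the leading slash the path is relative, not absolute.
print(PurePosixPath("Users/darren/Desktop/Pride").is_absolute())   # False
print(PurePosixPath("/Users/darren/Desktop/Pride").is_absolute())  # True

def resolve_like_os(path, cwd):
    """Imitate how a relative path is looked up: joined onto the working directory."""
    return str(PurePosixPath(cwd) / path)

# Running from the Desktop with the original (relative) path:
resolved = resolve_like_os("Users/darren/Desktop/Pride", "/Users/darren/Desktop")
print(resolved)  # /Users/darren/Desktop/Users/darren/Desktop/Pride

# Running from the Desktop with the corrected relative path:
fixed = resolve_like_os("Pride", "/Users/darren/Desktop")
print(fixed)     # /Users/darren/Desktop/Pride
```

Either fix works: add the leading slash to make the path absolute, or shorten it so that it is correct relative to the directory you run from.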

how to show NewsAggregatorDataset using read_csv

I wrote the following code and want to see the head of the dataset, but it doesn't work.
I downloaded the dataset and put it in the project folder.
Can you help me? Thanks.
import pandas as pd
from pathlib import Path
path_csv = Path('NewsAggregatorDataset/newsCorpora.csv').absolute()
data = pd.read_csv(path_csv)
print(data.head())

If the file is in the NewsAggregatorDataset directory next to your script, you can build the path with the os module like this:
import pandas as pd
import os
path_csv = os.path.dirname(__file__)+'/NewsAggregatorDataset/newsCorpora.csv'
data = pd.read_csv(path_csv)
print(data.head())
Also see the lines below for how to get the current directory and the parent directory:
import os
current_directory = os.path.abspath(os.path.dirname(__file__)) # get current directory
parent_directory = os.path.abspath(current_directory + "/../") # get parent directory
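The same idea can be written with pathlib, which avoids joining strings by hand. This is a sketch under the assumption that the dataset folder sits next to the script; the helper name path_next_to is hypothetical, and the /tmp/project path below is only a placeholder for the demonstration.

```python
from pathlib import Path

def path_next_to(script_file, *parts):
    """Build a path relative to the folder containing script_file."""
    return Path(script_file).resolve().parent.joinpath(*parts)

# In a real script, pass __file__:
#   csv_path = path_next_to(__file__, "NewsAggregatorDataset", "newsCorpora.csv")
#   if not csv_path.exists():
#       raise FileNotFoundError(csv_path)
#   data = pd.read_csv(csv_path)

# Placeholder demonstration that needs no real files:
demo = path_next_to("/tmp/project/main.py", "NewsAggregatorDataset", "newsCorpora.csv")
print(demo.name)         # newsCorpora.csv
print(demo.parent.name)  # NewsAggregatorDataset
```

Checking csv_path.exists() before reading turns a confusing failure into an error message that prints the exact path that was tried.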

Creating Decision Tree using python

I am creating a decision tree using a dataset named "wine".
I am trying to execute the following code:
dt = c.fit(X_train, y_train)
Creating the image of the decision tree
(where "Malik Shahid Ali" is the location/path of the image):
def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png("Malik Shahid Ali")
    img = misc.imread("Malik Shahid Ali")
    plt.imshow(img)
Calling the image:
show_tree(dt, features, 'dec_tree_01.png')
But when I call it, I get the following error:
GraphViz's executables not found
Import section:
import numpy as np
import pandas as pd
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier, export_graphviz
from sklearn.model_selection import train_test_split
import graphviz
import pydotplus
import io
from scipy import misc
import matplotlib.pyplot as plt #sets up plotting under plt
import seaborn as sb
from pylab import rcParams
Reading the CSV dataset:
data=pd.read_csv('C:/Users/malik/Desktop/wine.csv',low_memory=False)
data.head()
train, test = train_test_split(data,test_size=0.15)
print("Training size: {} Test size: {}".format(len(train),len(test)))
c=DecisionTreeClassifier(min_samples_split=2)
features = ["id","Alcohol","Malic acid","Ash","Alcalinity of ash","Magnesium","Total phenols","Flavanoids","Field9Nonflavanoid phenols","Proanthocyanins","Color intensity","Hue","OD280/OD315 of diluted wines","Proline"]
X_train = train[features]
y_train = train["id"]
X_test = test[features]
y_test = test["id"]
y_test
dt = c.fit(X_train, y_train)
Path of the executable file:
import os
os.environ["PATH"] += os.pathsep + 'E:\Graphviz2.38\bin'
Image function:
def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    img = misc.imread(path)
    plt.imshow(img)
show_tree(dt, features, 'dec_tree_01.png')
Now, on this command, Jupyter gives an error like this:
E:\python\lib\site-packages\pydotplus\graphviz.py in create(self, prog, format)
1958 if self.progs is None:
1959 raise InvocationException(
-> 1960 'GraphViz\'s executables not found')
1961
1962 if prog not in self.progs:
InvocationException: GraphViz's executables not found

I'm re-purposing my answer to a related problem here.
Make sure you have installed the actual executables, not just the Python package. I used conda's install package (recommended over pip install graphviz, as pip install doesn't include the actual GraphViz executables).
Update
At the end of the day, an incorrectly formatted string path to the necessary directory was added to the environment variable PATH: in 'E:\Graphviz2.38\bin', Python reads \b as a backspace character, so the directory name is corrupted. Be sure to use double backslashes in the string path to the directory, e.g.:
import os
os.environ["PATH"] += os.pathsep + 'E:\\Graphviz2.38\\bin\\'
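The effect of the single backslashes can be checked directly, and a raw string is an alternative to doubling them. The sketch below is illustrative only: add_to_path is a hypothetical helper, and the E:\Graphviz2.38\bin directory is taken from the question, so it will only exist on that machine.

```python
import os
import shutil

# What 'E:\Graphviz2.38\bin' currently evaluates to: "\b" becomes a backspace (\x08).
broken = "E:\\Graphviz2.38" + "\bin"
escaped = "E:\\Graphviz2.38\\bin"   # doubled backslashes
raw = r"E:\Graphviz2.38\bin"        # raw string, same result

print(repr(broken))      # shows the backspace character in place of \b
print("\b" in broken)    # True
print(escaped == raw)    # True

def add_to_path(directory):
    """Append directory to PATH for this process unless it is already listed."""
    current = os.environ.get("PATH", "")
    if directory not in current.split(os.pathsep):
        os.environ["PATH"] = current + os.pathsep + directory

# Assumed usage on the question's machine:
#   add_to_path(raw)
# shutil.which returns None when the "dot" executable cannot be found on PATH.
print(shutil.which("dot"))
```

If shutil.which("dot") still prints None after PATH has been updated, the GraphViz executables themselves are probably not installed in that directory.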
