I'm following this tutorial using Windows 10 and PyCharm:
https://www.datacamp.com/community/tutorials/tensorflow-tutorial
My code so far is below.
As suggested, I installed scimage using pip install scikit-image, because pip install scimage doesn't work and tells me to install scikit-image.
The same approach worked for installing tensorflow, so I know it's installing into the correct directory, but the script won't run because it doesn't recognize import scimage.
Why won't it recognize this? Am I supposed to import something different?
I also tried import scikit-image as scimage, but it didn't find that either.
# Import `tensorflow`
import tensorflow as tf
import numpy as np
import os
import scimage

# Initialize two constants
x1 = tf.constant([1,2,3,4])
x2 = tf.constant([5,6,7,8])

# Multiply
result = tf.multiply(x1, x2)

# Print the result
print(result)

def load_data(data_directory):
    directories = [d for d in os.listdir(data_directory)
                   if os.path.isdir(os.path.join(data_directory, d))]
    labels = []
    images = []
    for d in directories:
        label_directory = os.path.join(data_directory, d)
        file_names = [os.path.join(label_directory, f)
                      for f in os.listdir(label_directory)
                      if f.endswith(".ppm")]
        for f in file_names:
            images.append(skimage.data.imread(f))
            labels.append(int(d))
    return images, labels

ROOT_PATH = "C:\\Users\\dm\\PycharmProjects\\test"
train_data_directory = os.path.join(ROOT_PATH, "Training")
test_data_directory = os.path.join(ROOT_PATH, "Testing")
The package is installed as scikit-image, but the module it provides is named skimage. You need to use:
import skimage
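The confusion here is between the name given to pip (scikit-image) and the name used in an import statement (skimage). As a minimal sketch using only the standard library, the following checks which module names the current interpreter can actually import. The helper name `is_importable` is made up for illustration and is not part of any library.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `import module_name` could succeed in this interpreter."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised for some malformed names or half-initialised modules.
        return False


# `skimage` is the import name of scikit-image; `scimage` is the
# misspelling from the question.
for name in ("skimage", "scimage"):
    print(name, "->", "importable" if is_importable(name) else "not found")
```

Running this with the same interpreter that PyCharm uses for the project shows whether the package landed in that environment. Separately, in recent scikit-image releases image reading is done through `skimage.io.imread`; if `skimage.data.imread` is not available in your version, that is the likely replacement.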
I'm trying to run Tesseract from Python. This is my code:
import cv2
import os
import numpy as np
import matplotlib.pyplot as plt
import pytesseract
import Image

# def main():
jpgCounter = 0
for root, dirs, files in os.walk('/home/manel/Desktop/fotografias etiquetas'):
    for file in files:
        if file.endswith('.jpg'):
            jpgCounter += 1

for i in range(1, 2):
    name = str(i) + ".jpg"
    nameBW = str(i) + "_bw.jpg"
    img = cv2.imread(name,0) # zero -> opens in grayscale
    # img = cv2.equalizeHist(img)
    kernel = np.array([[0,-1,0], [-1,5,-1], [0,-1,0]])
    img = cv2.filter2D(img, -1, kernel)
    cv2.normalize(img,img,0,255,cv2.NORM_MINMAX)
    med = np.median(img)
    retval, threshold_manual = cv2.threshold(img, med*0.6, 255, cv2.THRESH_BINARY)
    cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,11,2)
    print(pytesseract.image_to_string(threshold_manual, lang='eng', config='-psm 11', nice=0, output_type=Output.STRING))
The error I'm getting is the following:
NameError: name 'Output' is not defined
Any idea why I'm getting this?
Thank you!
Add this import:
from pytesseract import Output
The problem is that you installed the original pytesseract package (downloaded using pip) but are referring to the documentation of the madmaze GitHub version; the two are different.
I suggest uninstalling the present version, then cloning the GitHub repo and installing from it, by following these steps:
Uninstall the present version:
pip uninstall pytesseract
Clone the madmaze/pytesseract GitHub repo using git:
git clone https://github.com/madmaze/pytesseract.git
or download it directly from the repository page.
Go to the root directory of the cloned repo and run:
pip install .
I created a virtual env using venv (Python 3.xx).
There, I installed only a few packages:
I have one Python script that reads the files in a given directory and manipulates their data using pandas:
import pandas as pd
import os
path = os.getcwd()+'\Ph3_Charts'+'\\'
print(path)
from os import listdir
from os.path import isfile, join
days_range = [f for f in listdir(path) if isfile(join(path, f))]
dataframes = []
count = 0
for i in days_range:
    try:
        print(i,count)
        dataframes.append(pd.read_excel(path+i, sheet_name = "Issues Jira", index_col=0))
        count += 1
    except:
        pass
The problem seems to be with the variable path, as the program breaks when it tries to append some data frames from each of the listed files.
However, the red marked passage above shows the path just fine... The strangest thing is that when I run this program locally, the iteration works ok.
Any guesses why this is happening please?
The source of the issue is that you're forcing the script to use the backslash \ as the path separator. Your remote system runs Linux, whereas you're using Windows locally. Unlike Windows, Linux and macOS use the forward slash / to separate directories in a path, so the hard-coded backslashes produce a path that does not exist on the remote machine.
Below is a properly platform independent implementation that avoids such needless specificity:
import pandas as pd
import os
from os import listdir
from os.path import isfile, join

# Get CWD
CWD = os.getcwd()

# Join CWD with `Ph3_Charts` folder
PH3_CHART_PATH = os.path.join(CWD, 'Ph3_Charts')
print("PH3 chart path: '"+PH3_CHART_PATH+"'")

days_range = [f for f in listdir(PH3_CHART_PATH) if isfile(join(PH3_CHART_PATH, f))]
dataframes = []
count = 0
for i in days_range:
    try:
        print(i,count)
        # join() inserts the separator between the folder and the file name
        dataframes.append(pd.read_excel(join(PH3_CHART_PATH, i), sheet_name = "Issues Jira", index_col=0))
        count += 1
    except Exception as e: # catch *all* exceptions
        print(e) # print it out
This solution works with or without the venv features you discussed. Please see the file diff here (comparison of differences between your version and the above code).
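The same idea can also be sketched with `pathlib`, which builds paths with the `/` operator and picks the correct separator for whichever platform runs the script. This is an illustrative alternative, not the code from the answer above; the helper name `list_files` is made up, and the folder name `Ph3_Charts` is taken from the question.

```python
from pathlib import Path


def list_files(directory):
    """Return the full paths of the regular files directly inside `directory`."""
    directory = Path(directory)
    # iterdir() yields complete paths, so no "\\" or "/" has to be
    # concatenated by hand.
    return sorted(p for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    charts_dir = Path.cwd() / "Ph3_Charts"  # folder name taken from the question
    if charts_dir.is_dir():
        for file_path in list_files(charts_dir):
            print(file_path)
    else:
        print(f"No such directory: {charts_dir}")
```

Each returned `Path` can be passed straight to `pd.read_excel`, which accepts path-like objects.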
My code:
import matplotlib.pyplot as plt
import numpy as np
import os
import tensorflow as tf
from tensorflow.keras.preprocessing import image_dataset_from_directory
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
dl_fname = '/Volumes/D/PythonCode/tf_transfer_learning/cats_and_dogs.zip'
path_to_zip = tf.keras.utils.get_file(dl_fname, origin=_URL, extract=True)
I can see cats_and_dogs.zip being downloaded; however, it is not extracted/unzipped. I am on macOS, using PyCharm.
I am not sure why. Does anyone have a pointer? Thanks.
The URL is https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip but you saved it as cats_and_dogs.zip. When you extract it, you get a directory named cats_and_dogs_filtered, no matter what the zip file is called.
It does seem, though, that the extracted contents end up under ~/.keras/datasets/ even if one passes an absolute path for the download.
import tensorflow as tf
url = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
dl_path = "/tmp/cats_and_dogs_filtered.zip"
path_to_zip = tf.keras.utils.get_file(dl_path, origin=url, extract=True)
The zip file is saved to the absolute path, but the extracted path goes to ~/.keras/datasets/cats_and_dogs_filtered/.
print(path_to_zip)
# /tmp/cats_and_dogs_filtered.zip
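The point that the extracted folder name comes from inside the archive, not from the zip file's own name, can be checked without TensorFlow. The sketch below uses only the standard library and a tiny stand-in archive built on the fly; the file `cat.0.txt` and the helper `extract_and_list` are made up for illustration and say nothing about how `get_file` works internally.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_and_list(zip_path, dest_dir):
    """Extract `zip_path` into `dest_dir` and return the top-level entry names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return sorted(p.name for p in Path(dest_dir).iterdir())


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    # Build a stand-in archive whose contents live under one folder.
    zip_path = tmp / "cats_and_dogs.zip"  # the name chosen when saving
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("cats_and_dogs_filtered/train/cat.0.txt", "placeholder")

    out_dir = tmp / "extracted"
    out_dir.mkdir()
    top_level = extract_and_list(zip_path, out_dir)
    print(top_level)  # ['cats_and_dogs_filtered']
```

So when looking for the unzipped data, search for the folder name stored in the archive rather than the name given to the downloaded file.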
I am creating a decision tree using a dataset named "wine". I am trying to execute the following code:
dt = c.fit(X_train, y_train)
Creating the image of the decision tree:
def show_tree(tree, features, path):
    f = io.StringIO()
    export_graphviz(tree, out_file=f, feature_names=features)
    pydotplus.graph_from_dot_data(f.getvalue()).write_png(path)
    img = misc.imread(path)
    plt.rcParams["figure.figsize"] = (20 , 20)
    plt.imshow(img)
Calling the image:
show_tree(dt, features, 'dec_tree_01.png')
But when I call it, it gives the following error:
GraphViz's executables not found
I have installed graphviz-2.38.msi from their website, but the same error keeps showing.
I have also added the Graphviz bin directory to the PATH string in the user environment variables, like this:
%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\Program Files (x86)\Graphviz2.38\bin;
But that did not solve the problem either.
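Try appending the Graphviz bin directory to the PATH environment variable in code, like this:
import os
os.environ["PATH"] += os.pathsep + r'C:\Program Files (x86)\Graphviz2.38\bin'
Note: Do this at the top of the code, before show_tree() is executed. The path is written as a raw string (r'...') because in a normal string literal the \b in \bin is read as a backspace character.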
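To confirm that the change took effect, one option is to ask Python where it finds the Graphviz `dot` executable after updating PATH. The following is a sketch under the assumption that Graphviz is installed at the location given in the question; the helper name `add_to_path` is made up for illustration.

```python
import os
import shutil


def add_to_path(directory):
    """Append `directory` to this process's PATH if it is not already listed."""
    entries = [e for e in os.environ.get("PATH", "").split(os.pathsep) if e]
    if directory not in entries:
        os.environ["PATH"] = os.pathsep.join(entries + [directory])


# Raw string: in a normal string literal the "\b" in "\bin" is a backspace.
GRAPHVIZ_BIN = r"C:\Program Files (x86)\Graphviz2.38\bin"  # path from the question

add_to_path(GRAPHVIZ_BIN)
dot_path = shutil.which("dot")  # `dot` is the Graphviz layout executable
print("dot found at:", dot_path if dot_path else "not found on PATH")
```

If this prints "not found on PATH", the directory is probably wrong for your installation, and the pydotplus error would be expected to persist. The change only affects the running Python process, not the system-wide environment variables.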