I have a JSON file which I am using in score.py; however, it is not being found. When making a POST request to the endpoint I get the following error: "No such file or directory: '/var/azureml-app/model_adjustments.json'"
The JSON file is in the same folder as score.py, and I am deploying from a script:
path_to_source = "path/example"
inf_conf = InferenceConfig(entry_script="score.py", environment=environment, source_directory=path_to_source)
In my score.py file I have the following:
def init():
    global model

# note you can pass in multiple rows for scoring
def run(raw_data):
    try:
        # parse the features from the json doc
        dat = json.loads(raw_data)
        # deserialize the model file back into a sklearn model
        model_name = "{0}_{1}_{2}".format(dat["col1"], dat["col2"], dat["col3"])
        model_path = Model.get_model_path(model_name=model_name)
        model = load(model_path)
        # parse the savings json file
        current_directory = os.getcwd()
        file_name = "model_adjustments.json"
        json_file = os.path.join(current_directory, file_name)
        with open(json_file, "r") as fr:
            adjustments_dat = json.loads(fr.read())
        modelAdjustment = adjustments_dat[model_name]
        # create a dataframe of the features
        dfPredict = pd.DataFrame(
            columns=[
                "feature1",
                "feature2",
                "feature3",
                "feature4",
                "feature5"
            ]
        )
        # predict the outcome for these features
        prediction = model.predict(dfPredict)
        Predicted_val = prediction[0]
        # prediction + adjustment for buffer
        final_val = Predicted_val + modelAdjustment
        return {"final_val": final_val}
    except Exception as e:
        error = str(e)
        return {"debug": -1, "error": error}
I know it's related to the JSON file, because when I remove everything to do with it there is no error on the POST request and I get back a value as expected. I'm not sure why it's not reading the JSON file.
I have not had time to try this out, but I would assume that the score.py file (and any files in the same directory) will not be in os.getcwd().
I would try to use __file__ to get the path of score.py:
so, instead of:
current_directory = os.getcwd()
file_name = "model_adjustments.json"
json_file = os.path.join(current_directory, file_name)
rather try:
import os
import pathlib

model_directory = pathlib.Path(os.path.realpath(__file__)).parent.resolve()
file_name = "model_adjustments.json"
json_file = os.path.join(model_directory, file_name)
(Note: realpath is there to make sure symlinks are properly resolved)
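As a usage sketch, here is how the corrected lookup might sit in score.py (the helper name load_adjustments is illustrative, not part of the question's code):

import json
import os
import pathlib

def load_adjustments():
    # Resolve the directory score.py was deployed to, rather than os.getcwd(),
    # which in the AzureML container is /var/azureml-app.
    script_dir = pathlib.Path(os.path.realpath(__file__)).parent
    json_file = script_dir / "model_adjustments.json"
    with open(json_file, "r") as fr:
        return json.load(fr)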
I am brand new at all of this and I am completely lost, even after Googling, watching hours of YouTube videos, and reading posts on this site for the past week.
I am using Jupyter Notebook.
I have a config file with my API keys; it is called config.ipynb.
I have a different file where I am trying to call (I am not sure if this is the correct terminology) my config file so that I can connect to the Twitter API, but I am getting an attribute error.
Here is my code:
import numpy as np
import pandas as pd
import tweepy as tw
import configparser
#Read info from the config file named config.ipynb
config = configparser.ConfigParser()
config.read(config.ipynb)
api_key = config[twitter][API_key]
print(api_key) #to test if I did this correctly
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In [17], line 4
1 #Read info from the config file named config.ipynb
3 config = configparser.ConfigParser()
----> 4 config.read(config.ipynb)
5 api_key = config[twitter][API_key]
AttributeError: 'ConfigParser' object has no attribute 'ipynb'
After fixing my read() mistake I received a MissingSectionHeaderError.
MissingSectionHeaderError: File contains no section headers.
file: 'config.ipynb', line: 1 '{\n'.
My header in my config file is [twitter], but that gives me a NameError saying [twitter] is not defined... I have updated this many times per the readings, but I always get the same error.
My config.ipynb file code is below:
['twitter']
API_key = "" #key between the ""
API_secret = "" #key between the ""
Bearer_token = "" #key between the ""
Client_ID = "" #key between the ""
Client_Secret = "" #key between the ""
I have tried [twitter], ['twitter'], and ["twitter"], but all render a MissingSectionHeaderError.
Per your last comment in Brance's answer, this is probably related to your file path. If your file path is not correct, configparser will raise a KeyError or NameError.
Tested and working in Jupyter:
Note that no quotation marks are used around the section name (i.e. [twitter], not ["twitter"]):
# stackoverflow.txt
[twitter]
API_key = 6556456fghhgf
API_secret = afsdfsdf45435
import configparser
import os
# Define file path and make sure path is correct
file_name = "stackoverflow.txt"
# Config file stored in the same directory as the script.
# Get current working directory with os.getcwd()
file_path = os.path.join(os.getcwd(), file_name)
# Confirm that the file exists.
assert os.path.isfile(file_path) is True
# Read info from the config file named stackoverflow.txt
config = configparser.ConfigParser()
config.read(file_path)
# Will raise KeyError if the file path is not correct
api_key = config["twitter"]["API_key"]
print(api_key)
You are using the read() method incorrectly; the input should be a string containing the filename. So if your filename is config.ipynb, then you need to call the method as:
config.read('config.ipynb')
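One way to catch a wrong path early: read() returns the list of files it successfully parsed, so an empty list means nothing was loaded. A minimal sketch (the filename config.ini is illustrative):

import configparser

config = configparser.ConfigParser()
# read() silently skips files it cannot open and returns the list of
# files it did parse; an empty list means the path was wrong.
found = config.read('config.ini')
if not found:
    raise FileNotFoundError('config.ini was not found next to this script')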
I am trying to create an API function that takes an uploaded .csv file and opens it as a pandas DataFrame, like this:
from fastapi import FastAPI
from fastapi import UploadFile, Query, Form
import pandas as pd
app = FastAPI()
@app.post("/check")
def foo(file: UploadFile):
    df = pd.read_csv(file.file)
    return len(df)
Then, I am invoking my API:
import requests
url = 'http://127.0.0.1:8000/check'
file = {'file': open('data/ny_pollution_events.csv', 'rb')}
resp = requests.post(url=url, files=file)
print(resp.json())
But I got this error: FileNotFoundError: [Errno 2] No such file or directory: 'ny_pollution_events.csv'
As far as I understand from the docs, pandas is able to read a .csv file from a file-like object, which file.file is supposed to be. But it seems that here, in the read_csv() method, pandas obtains a name (not the file object itself) and tries to find the file locally.
Am I doing something wrong?
Can I somehow implement this logic?
To read the file with pandas, the file must be stored on your PC. Don't forget to import shutil. If you don't need the file to be stored on your PC, delete it afterwards using os.remove(filepath).
if not file.filename.lower().endswith((".csv", ".xlsx", ".xls")):
    return 404, "Please upload an xlsx, csv or xls file."
if file.filename.lower().endswith(".csv"):
    extension = ".csv"
elif file.filename.lower().endswith(".xlsx"):
    extension = ".xlsx"
elif file.filename.lower().endswith(".xls"):
    extension = ".xls"
# eventid = datetime.datetime.now().strftime('%Y%m-%d%H-%M%S-') + str(uuid4())
filepath = "location where you want to store file" + extension
with open(filepath, "wb") as buffer:
    shutil.copyfileobj(file.file, buffer)
try:
    if filepath.endswith(".csv"):
        df = pd.read_csv(filepath)
    else:
        df = pd.read_excel(filepath)
except Exception:
    return 401, "File is not proper"
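If you would rather not hard-code a destination path, here is a minimal sketch of the same save-then-read approach using Python's tempfile module (the endpoint and the returned shape are illustrative assumptions, not from the question):

import os
import shutil
import tempfile

import pandas as pd
from fastapi import FastAPI, UploadFile

app = FastAPI()

@app.post("/check")
def check(file: UploadFile):
    # Persist the upload to a named temporary file so pandas can open it by path.
    suffix = os.path.splitext(file.filename)[1].lower()
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(file.file, tmp)
        tmp_path = tmp.name
    try:
        df = pd.read_csv(tmp_path) if suffix == ".csv" else pd.read_excel(tmp_path)
        return {"rows": len(df)}
    finally:
        os.remove(tmp_path)  # clean up once the DataFrame is built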
I would like to ask for help with a Python script that is supposed to loop through a directory on a drive. Basically, what I want to do is convert over 10,000 DBF files to CSV. So far, I can achieve this on an individual DBF file using the dbfread and pandas packages. Running this script over 10,000 individual times is obviously not feasible, hence why I want to automate the task by writing a script that will loop through each DBF file in the directory.
Here is what I would like to do.
Define the directory
Write a for loop that will loop through each file in the directory
Only open a file with the extension '.dbf'
Convert to Pandas DataFrame
Define the name for the output file
Write to CSV and place file in a new directory
Here is the code that I was using to test whether I could convert an individual '.dbf' file to a CSV.
from dbfread import DBF
import pandas as pd
table = DBF('Name_of_File.dbf')
#I originally kept receiving a unicode decoding error
#So I manually adjusted the attributes below
table.encoding = 'utf-8' # Set encoding to utf-8 instead of 'ascii'
table.char_decode_errors = 'ignore' #ignore any decode errors while reading in the file
frame = pd.DataFrame(iter(table)) #Convert to DataFrame
print(frame) #Check to make sure the DataFrame is structured properly
frame.to_csv('Name_of_New_File')
The above code worked exactly as it was intended.
Here is my code to loop through the directory.
import os
from dbfread import DBF
import pandas as pd
directory = 'Path_to_directory'
dest_directory = 'Directory_to_place_new_file'

for file in os.listdir(directory):
    if file.endswith('.DBF'):
        print(f'Reading in {file}...')
        dbf = DBF(file)
        dbf.encoding = 'utf-8'
        dbf.char_decode_errors = 'ignore'
        print('\nConverting to DataFrame...')
        frame = pd.DataFrame(iter(dbf))
        print(frame)
        outfile = frame.os.path.join(frame + '_CSV' + '.csv')
        print('\nWriting to CSV...')
        outfile.to_csv(dest_directory, index=False)
        print('\nConverted to CSV. Moving to next file...')
    else:
        print('File not found.')
When I run this code, I receive a DBFNotFound error that says it couldn't find the first file in the directory. As I am looking at my code, I am not sure why this is happening when it worked in the first script.
This is the code from the dbfread package from where the exception is being raised.
class DBF(object):
    """DBF table."""

    def __init__(self, filename, encoding=None, ignorecase=True,
                 lowernames=False,
                 parserclass=FieldParser,
                 recfactory=collections.OrderedDict,
                 load=False,
                 raw=False,
                 ignore_missing_memofile=False,
                 char_decode_errors='strict'):

        self.encoding = encoding
        self.ignorecase = ignorecase
        self.lowernames = lowernames
        self.parserclass = parserclass
        self.raw = raw
        self.ignore_missing_memofile = ignore_missing_memofile
        self.char_decode_errors = char_decode_errors

        if recfactory is None:
            self.recfactory = lambda items: items
        else:
            self.recfactory = recfactory

        # Name part before .dbf is the table name
        self.name = os.path.basename(filename)
        self.name = os.path.splitext(self.name)[0].lower()
        self._records = None
        self._deleted = None

        if ignorecase:
            self.filename = ifind(filename)
            if not self.filename:
                raise DBFNotFound('could not find file {!r}'.format(filename))  # ERROR IS HERE
        else:
            self.filename = filename
Thank you for any help provided.
os.listdir returns the file names inside the directory, so you have to join them to the base path to get the full path:
for file_name in os.listdir(directory):
    if file_name.endswith('.DBF'):
        file_path = os.path.join(directory, file_name)
        print(f'Reading in {file_name}...')
        dbf = DBF(file_path)
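A minimal sketch of the whole loop with that fix applied (the output naming convention is my assumption; the encoding and char_decode_errors keyword arguments come straight from the __init__ signature quoted above):

import os
import pandas as pd
from dbfread import DBF

directory = 'Path_to_directory'
dest_directory = 'Directory_to_place_new_file'

for file_name in os.listdir(directory):
    if file_name.endswith('.DBF'):
        # Join the directory to the bare name so DBF receives a full path.
        file_path = os.path.join(directory, file_name)
        dbf = DBF(file_path, encoding='utf-8', char_decode_errors='ignore')
        frame = pd.DataFrame(iter(dbf))
        # Write the CSV into the destination directory, reusing the source name.
        base = os.path.splitext(file_name)[0]
        frame.to_csv(os.path.join(dest_directory, base + '.csv'), index=False)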
I have written some code to read the contents from a specific URL:
import requests
import os
def read_doc(doc_ID):
    filename = doc_ID + ".txt"
    if not os.path.exists(filename):
        my_url = encode_url(doc_ID)  # this is a call to another function that encodes the URL
        my_response = requests.get(my_url)
        if my_response.status_code == requests.codes.ok:
            return my_response.text
    return None
This checks if there's a file named doc_ID.txt (where doc_ID could be any name provided), and if there's no such file, it reads the contents from a specific URL and returns them. What I would like to do is store those returned contents in a file called doc_ID.txt. That is, I would like to finish my function by creating a new file in case it didn't exist at the beginning.
How can I do that? I tried this:
my_text = my_response.text
output = os.rename(my_text, filename)
return output
but then, the actual contents of the file would become the name of the file and I would get an error saying the filename is too long.
So the issue I think I'm seeing is that you want to put the contents of your request's response into the file, rather than naming the file with the contents. The code below should create a file with the filename you want, and insert the text from your response!
import requests
import os
def read_doc(doc_ID):
    filename = doc_ID + ".txt"
    if not os.path.exists(filename):
        my_url = encode_url(doc_ID)  # this is a call to another function that encodes the URL
        my_response = requests.get(my_url)
        if my_response.status_code == requests.codes.ok:
            with open(filename, "w") as file:
                file.write(my_response.text)
            return my_response.text
    return None
To write the response text to the file, you can simply use a Python file object; see https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
with open(filename, "w") as file:
    file.write(my_text)
I am trying to read a file in Python. This is the code I am using:
# test script
import csv
import json
import os
def loadKeys(key_file):
    json_keys = open(key_file).read()
    data = json.loads(json_keys)
    return data["api_key"], data["api_secret"], data["token"], data["token_secret"]

KEY_FILE = 'keys.json'
print(os.listdir(os.path.dirname(os.path.realpath(__file__))))
api_key, api_secret, token, token_secret = loadKeys(KEY_FILE)
However, it returns the following error:
->print(os.listdir(os.path.dirname(os.path.realpath(__file__))))
['.DS_Store', 'keys.json', 'script.py', 'test.py']
->api_key, api_secret, token, token_secret = loadKeys(KEY_FILE)
IOError: (2, 'No such file or directory', 'keys.json')
Is there anything I am doing wrong?
The KEY_FILE has no path, so it defaults to being looked up in the current working directory. You've listed the file in another directory, namely the result of:
os.path.dirname(os.path.realpath(__file__))
Use os.path.join:
path = os.path.dirname(os.path.realpath(__file__))
loadKeys(os.path.join(path, KEY_FILE))
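A minimal sketch of the corrected script in full (using a with-block so the file handle is closed; the structure otherwise mirrors the question's code):

import json
import os

def loadKeys(key_file):
    # Use a context manager so the file handle is closed promptly.
    with open(key_file) as fh:
        data = json.load(fh)
    return data["api_key"], data["api_secret"], data["token"], data["token_secret"]

# Resolve keys.json relative to this script, not the current working directory.
script_dir = os.path.dirname(os.path.realpath(__file__))
api_key, api_secret, token, token_secret = loadKeys(os.path.join(script_dir, 'keys.json'))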