nhlscrapi - download Data error - python

I am trying to get the statistics and game information of every game of an NHL Season. I am working with Stata. I found the package nhlscrapi and have written code to get all the data and statistics of a particular season:
# Import statements
# Notice how I import the whole modules and not the functions explicitly as given in the online example (good practice)
from nhlscrapi.games import game, cumstats
from nhlscrapi import constants
import csv
# Define season being considered:
season = 2012
# Get all stats they have defined
# Googled "get all methods of a class python" and found this:
# http://stackoverflow.com/questions/34439/finding-what-methods-an-object-has
# Also, needed to exclude some methods (ABCMeta, ...) after I checked what they do
# (I did that with: "help(cumstats.METHODNAME)") and saw that they did not contain stats
methods = [method for method in dir(cumstats) if callable(getattr(cumstats, method)) and
method != 'ABCMeta' and
method != 'AccumulateStats' and
method != 'ShotEventTallyBase' and
method != 'abstractmethod' and
method != 'TeamIncrementor' and
method != 'EF' and
method != 'St']
# Set up dictionary with all stats
cum_stats = {method: getattr(cumstats, method)() for method in methods}
print('All the stats:', cum_stats.keys())
# Now, look up how many games were in the regular season of the year 2012
maxgames = constants.GAME_CT_DICT[season]
# If one is interested in all the home coaches (as an example), one would first set up an empty list,
# and gradually fill it:
thingswewant_keys = ['home_coach', 'away_coach', 'home', 'away', 'attendance', 'Score', 'Fenwick']
thingswewant_values = {key: [] for key in thingswewant_keys if not key in cum_stats.keys()}
thingswewant_values.update({key+'_home': [] for key in cum_stats.keys()})
thingswewant_values.update({key+'_away': [] for key in cum_stats.keys()})
# Now, loop over all games in this season
for i in range(12):
    # Set up object which queries database
    # If one enters the following command in ipython: "help(game.Game)", one sees also alternative ways to set up
    # query other than the one given in the example
    ggames = game.Game(game.GameKey(season, game.GameType.Regular, i+1), cum_stats=cum_stats)
    # This object 'ggames' now contains all the information of 1 specific game.
    # To concatenate all the home coaches for example, one would do it like this
    for key in thingswewant_keys:
        if not key in cum_stats.keys():
            # First case: Information is attribute of ggames (e.g. home_coach)
            if not key in ['home', 'away', 'attendance']:
                thingswewant_values[key] += [getattr(ggames, key)]
            # Second case: Information is key of ggames.matchup (e.g. home)
            if key in ['home', 'away', 'attendance']:
                thingswewant_values[key] += [ggames.matchup[key]]
    # Third case: Information is a cum_stat
    # Figure out home_team and away team
    hometeam = ggames.matchup['home']
    awayteam = ggames.matchup['away']
    for key in cum_stats.keys():
        thingswewant_values[key+'_home'] += [ggames.cum_stats[key].total[hometeam]]
        thingswewant_values[key+'_away'] += [ggames.cum_stats[key].total[awayteam]]
# Make one single table out of all the columns
results = [tuple([key for key in thingswewant_values.keys()])]
results += zip(*[thingswewant_values[key] for key in thingswewant_values.keys()])
# Write to csv
with open('brrr.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(results)
The problem now is that in every season, after a certain game, the code stops with the following error:
Traceback (most recent call last):
File "C:/Users/Dennis/Downloads/AllStatsExcell.py", line 67, in <module>
thingswewant_values[key+'_home'] += [ggames.cum_stats[key].total[hometeam]]
File "C:\Python27\lib\site-packages\nhlscrapi\games\game.py", line 211, in cum_stats
return self.play_by_play.compute_stats()
File "C:\Python27\lib\site-packages\nhlscrapi\games\playbyplay.py", line 95, in compute_stats
for play in self._rep_reader.parse_plays_stream():
File "C:\Python27\lib\site-packages\nhlscrapi\scrapr\rtss.py", line 56, in parse_plays_stream
p_obj = parser.build_play(p)
File "C:\Python27\lib\site-packages\nhlscrapi\scrapr\rtss.py", line 130, in build_play
p['vis_on_ice'] = self.__skaters(skater_tab[0][0]) if len(skater_tab) else { }
File "C:\Python27\lib\site-packages\nhlscrapi\scrapr\rtss.py", line 159, in __skaters
if pl[0].text.isdigit():
AttributeError: 'NoneType' object has no attribute 'isdigit'
In the 2012 season, this occurs after game 12. So I ran just game 12 of the 2012 season:
ggames1 = game.Game(game.GameKey(2012, game.GameType.Regular, 12), cum_stats=cum_stats)
ggames1.cum_stats['ShootOut'].total
With ShootOut, for example, it crashes. But if I run this line again, I get the results.
I don't know how to fix this.
If I could just get the CSV file of all the games, even with some missing values, I would be very happy.

First, you need to do some debugging yourself. The error explicitly states:
File "C:/Users/Dennis/Downloads/AllStatsExcell.py", line 67, in <module>
thingswewant_values[key+'_home'] += [ggames.cum_stats[key].total[hometeam]]
That means the error is raised while line 67 of your program is executing. At the bottom it shows you what that error is:
AttributeError: 'NoneType' object has no attribute 'isdigit'
This means that something is trying to access the attribute isdigit on an object that is None. As you might surmise, None has no such attribute.
This is the offending line, along with the preceding for block:
for key in cum_stats.keys():
    thingswewant_values[key+'_home'] += [ggames.cum_stats[key].total[hometeam]]
What you want to do is probably the following:
for key in cum_stats.keys():
    try:
        thingswewant_values[key+'_home'] += [ggames.cum_stats[key].total[hometeam]]
    except Exception as e:
        print(e)
        print("key={}".format(key))
        print("hometeam={}".format(hometeam))
        # Note: this re-evaluates the failing expression, so it may raise again
        print("ggames.cum_stats={}".format(ggames.cum_stats[key].total[hometeam]))
This is a basic error catching block. The first print line should tell you the exception. The following ones inform you as to the state of various things you're utilizing in the offending line. Your job is to figure out which thing is NoneType (it may not be one of the ones I provided) and then, after that, figure out why it is NoneType. Essentially: look at the data you have and are trying to manipulate in that block. Something is missing in it.
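Since the stated goal is a CSV of all games even with some missing values, one option is to wrap the per-game work in a try/except and record blanks for any game that fails. This is only a minimal sketch, assuming Python 3. The names collect_rows and fake_fetch_row are hypothetical, and fake_fetch_row merely stands in for the nhlscrapi calls from the question (it is not part of that library):

```python
import csv
import io


def collect_rows(game_numbers, fetch_row, columns):
    """Build one CSV row per game, leaving blanks where a game fails to load."""
    rows, failed = [], []
    for number in game_numbers:
        try:
            stats = fetch_row(number)
        except Exception as exc:  # broad on purpose: the scraper's failure modes are unknown
            failed.append((number, repr(exc)))
            stats = {}
        rows.append([number] + [stats.get(col, "") for col in columns])
    return rows, failed


def fake_fetch_row(number):
    """Hypothetical stand-in for building game.Game(...) and reading its stats."""
    if number == 3:
        raise AttributeError("'NoneType' object has no attribute 'isdigit'")
    return {"home": "AAA", "away": "BBB"}


columns = ["home", "away"]
rows, failed = collect_rows(range(1, 5), fake_fetch_row, columns)

buffer = io.StringIO()  # swap for open("games.csv", "w", newline="") to write a file
writer = csv.writer(buffer)
writer.writerow(["game"] + columns)
writer.writerows(rows)
csv_text = buffer.getvalue()
```

The failed list keeps the game number and the exception text, so the games that need a second attempt can be identified afterwards.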

Related

Pandas dataframe not returning the index using the loc method

I'm trying to retrieve the index of a row within a dataframe using the loc method and a comparison of data from another dataframe within a for loop. Maybe I'm going about this wrong, I dunno. Here's a bit of information to help give the problem some context...
The following function imports some inventory data into a pandas dataframe from an xlsx file; this seemingly works just fine:
def import_inventory():
import warnings
try:
with warnings.catch_warnings(record=True):
warnings.simplefilter("always")
return pandas.read_excel(config_data["inventory_file"],header=1)
except Exception as E:
writelog.error(E)
sys.exit(E)
The following function imports some data from a combination of CSV files, creating a singular dataframe to work from during comparison; this seemingly works just fine:
def get_report_results():
output_dir = f"{config_data['output_path']}/reports"
report_ids = []
......
...execute and download the report csv files
......
reports_content = []
for path,current_directory,files in os.walk(output_dir):
for file in files:
file_path = os.path.join(path,file)
clean_csv_data(file_path) # This function simply cleans up the CSV content (removes blank rows, removes unnecessary footer data); updates same file that was sent in upon successful completion
current_file_content = pandas.read_csv(file_path,index_col=None,header=7)
reports_content.append(current_file_content)
reports_content = pandas.concat(reports_content,axis=0,ignore_index=True)
return reports_content
The problems exist here, at the following function that is supposed to search the reports content for the existence of an ID value then grab that row's index so I can use it in the future to modify some columns, add some columns.
def search_reports(inventory_df,reports_df):
for index,row in inventory_df.iterrows():
reports_index = reports_df.loc[reports_df["Inventory ID"] == row["Inv ID"]].index[0]
print(reports_df.iloc[reports_index]["Lookup ID"])
Here's the error I receive upon comparison
Length of values (1) does not match length of index (4729)
I can't quite figure out why this is happening. If I pull everything out of functions the work seems to happen the way it should. Any ideas?
There's a bit more work happening to the dataframe that comes from import_inventory, but didn't want to clutter the question. It's nothing major - one function adds a few columns that splits out a comma-separated value in the inventory into its own columns, another adds a column based on the contents of another column.
Edit:
As requested, the full stack trace is below. I've also included the other functions that operate on the original inventory_df object between its retrieval (import_inventory) and its final comparison (search_reports).
This function again operates on the inventory_df object, only this time it retrieves a single column from each row (if it has data) and breaks the semicolon-separated list of key-value pair tags apart for further inspection. If it finds one, it creates the necessary column for it and populates that row with the found value.
def sort_tags(inventory_df):
cluster_key = "Cluster:"
nodetype_key = "NodeType:"
project_key = "project:"
tags = inventory_df["Tags List"]
for index,tag in inventory_df.items():
if not pandas.isna(tag):
tag_keysvalues = tag.split(";")
if any(cluster_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(cluster_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Cluster Name"] = key_value_split[1]
if any(nodetype_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(nodetype_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Node Type"] = key_value_split[1]
if any(project_key in string for string in tag_keysvalues):
pair = [x for x in tag_keysvalues if x.startswith(project_key)]
key_value_split = pair[0].split(":")
inventory_df.loc[index, "Project Name"] = key_value_split[1]
return inventory_df
This function compares the new inventory DF with a CSV import-to-DF of the old inventory. It creates new columns based on old inventory data if it finds a match. I know this is ugly code, but I'm hoping to replace it when I can find a solution to my current problem.
def compare_inventories(old_inventory_df,inventory_df):
aws_rowcount = len(inventory_df)
now = parser.parse(datetime.utcnow().isoformat()).replace(tzinfo=timezone.utc).astimezone(tz=None)
for a_index,a_row in inventory_df.iterrows():
if a_row["Comments"] != "none":
for o_index,o_row in old_inventory_df.iterrows():
last_checkin = parser.parse(str(o_row["last_checkin"])).replace(tzinfo=timezone.utc).astimezone(tz=None)
if (a_row["Comments"] == o_row["asset_name"]) and ((now - timedelta(days=30)) <= last_checkin):
inventory_df.loc[a_index,["Found in OldInv","OldInv Address","OldInv Asset ID","Inv ID"]] = ["true",o_row["address"],o_row["asset_id"],o_row["host_id"]]
return inventory_df
Here's the stack trace for the error:
Traceback (most recent call last):
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\main.py", line 52, in main
reports_index = reports_df.loc[reports_df["Inventory ID"] == row["Inv ID"]].index
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\ops\common.py", line 70, in new_method
return method(self, other)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\arraylike.py", line 40, in __eq__
return self._cmp_method(other, operator.eq)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 5625, in _cmp_method
return self._construct_result(res_values, name=res_name)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 3017, in _construct_result
out = self._constructor(result, index=self.index)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\series.py", line 442, in __init__
com.require_length_match(data, index)
File "c:\Users\beefcake-quad\Code\INVENTORYAssetSnapshot\.venv\lib\site-packages\pandas\core\common.py", line 557, in require_length_match
raise ValueError(
ValueError: Length of values (1) does not match length of index (7150)
reports_index = reports_df.loc[report_data["Inventory ID"] == row["Inv ID"].index[0]
The closing ] is missing at the end of this line.

Python 3 Multidimensional List populated by existing MySQL table

I have an existing MySQL database whose rows need to be able to change without my rewriting code. I consulted the Stack Exchange question "Two dimensional array in python" and attempted those corrections (some of the attempts may be commented out). All login information has been changed.
import mysql.connector # Import MySQL Connector
import os
from sys import argv # Allow Arguments to be passed to this script
# DEFINE Global Variables
dataVar=[[],[]] # looked on stackexchange - failed
#dataVar=[] # Originally Did this - failed
#dataVar.append([]) # And This
#for i in range(20): dataVar.append(i) # Added this - failed
# MySQL Column Names -- Python doesn't allow string index of Lists.
UUID=1; # Unique Unit IDentifier
Name=10;
Type=3;
Description=11; # Written Description
V2=18;
ImgUrl=15; # Custom Icon
InStock=4; # Is it in stock 'Yes' or 'No'
New=2; # Display New Placard 'Yes' or 'No'
Damage=5; # Display Damage Warning 'Yes' or 'No'
Tankable=12; # Display Tank Warning 'Yes' or 'No'
Display=19; # Display If in stock 'Yes' or 'No'
Mix=16; # How to mix
Customer=17; # Customer Name
Profile=13; # Searchable keywords not in Description
ml005=14; # Prices u=Unavailble s=Standard
ml015=6; # Prices u=Unavailble s=Standard
ml030=7; # Prices u=Unavailble s=Standard
ml120=8; # Prices u=Unavailble s=Standard
Shots=9; # Prices u=Unavailble s=Standard
price005='$2.50';
price015='$5.00';
price030='$10.00';
price120='u'; # No 120ml Bottles
def fetchData():{
global totalCount # totalCount able to be written to in this function
global dataVar # Data able to be written to in this function
MySQLcon=mysql.connector.connect(user='u', password='p', host='localhost', database='d')
if (MySQLcon):
cursor=MySQLcon.cursor()
query="SELECT * FROM eJuice ORDER BY Name";
cursor.execute(query)
results=cursor.fetchall()
MySQLcon.close
totalCount=0;
for row in results: { # Fetch eJuice data
dataVar[UUID].append(row[0]);
dataVar[New].append(row[1]);
dataVar[Type].append(row[2]);
dataVar[InStock].append(row[3]);
dataVar[Damage].append(row[4]);
dataVar[ml015].append(row[5]);
dataVar[ml030].append(row[6]);
dataVar[ml120].append(row[7]);
dataVar[Shots].append(row[8]);
dataVar[Name].append(row[9]);
dataVar[Description].append(row[10]);
dataVar[Tankable].append(row[11]);
dataVar[Profile].append(row[12]);
dataVar[ml005].append(row[13]);
dataVar[ImgUrl].append(row[14]);
dataVar[Mix].append(row[15]);
dataVar[Customer].append(row[16]);
dataVar[V2].append(row[17]);
dataVar[Display].append(row[18]);
totalCount+=1;
}# End for row in results
}# End with MySQLcon
return
}# End fetchData()
# Start Program
fetchData();
# Create Display Function
# End Program
CLI output from running above code:
$ python3 main.py
Traceback (most recent call last):
File "main.py", line 88, in <module>
fetchData(); # Used from MySQL.py
File "main.py", line 61, in fetchData
dataVar[UUID].append(row[0]);
AttributeError: 'int' object has no attribute 'append'
You need to define a list containing as many lists as there are columns in the data table. You can do this:
unitUUID = []
unitNew = []
unitType = []
...
dataVar = [unitUUID, unitNew, unitType, ...]
You can now proceed to append items to each list:
unitUUID.append(row[0])
and so forth. Note that this is explained fairly clearly in the Two dimensional array in Python question. I suggest you read it carefully.
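The 'int' object has no attribute 'append' message appears consistent with the commented-out attempt that runs dataVar.append(i): that stores the integers themselves in dataVar rather than lists. A minimal sketch of building one list per column follows. The 0-based constants, the NUM_COLUMNS value, and the made-up rows are assumptions for illustration (the question uses its own column numbering and real query results):

```python
NUM_COLUMNS = 19  # assumed from the 19 row[...] reads in the question

# Hypothetical 0-based column positions (the question uses its own numbering)
UUID, NEW, TYPE = 0, 1, 2

# One independent empty list per column; [[]] * NUM_COLUMNS would alias a single list
data_var = [[] for _ in range(NUM_COLUMNS)]

# Made-up rows standing in for cursor.fetchall()
results = [
    tuple("r1c{}".format(c) for c in range(NUM_COLUMNS)),
    tuple("r2c{}".format(c) for c in range(NUM_COLUMNS)),
]

# Append each value to the list for its column
for row in results:
    for column, value in enumerate(row):
        data_var[column].append(value)

total_count = len(results)
```

With this layout, data_var[UUID] holds every value from the first column, and the constants only need to change if the table's column order changes.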

Script skips second for loop when reading a file

I am trying to read a log file and compare certain values against preset thresholds. My code manages to log the raw data in the first for loop of my function.
I have added print statements to try and figure out what was going on and I've managed to deduce that my second for loop never "happens".
This is my code:
def smartTest(log, passed_file):
# Threshold values based on averages, subject to change if need be
RRER = 5
SER = 5
OU = 5
UDMA = 5
MZER = 5
datafile = passed_file
# Log the raw data
log.write('=== LOGGING RAW DATA FROM SMART TEST===\r\n')
for line in datafile:
log.write(line)
log.write('=== END OF RAW DATA===\r\n')
print 'Checking SMART parameters...',
log.write('=== VERIFYING SMART PARAMETERS ===\r\n')
for line in datafile:
if 'Raw_Read_Error_Rate' in line:
line = line.split()
if int(line[9]) < RRER and datafile == 'diskOne.txt':
log.write("Raw_Read_Error_Rate SMART parameter is: %s. Value under threshold. DISK ONE OK!\r\n" %int(line[9]))
elif int(line[9]) < RRER and datafile == 'diskTwo.txt':
log.write("Raw_Read_Error_Rate SMART parameter is: %s. Value under threshold. DISK TWO OK!\r\n" %int(line[9]))
else:
print 'FAILED'
log.write("WARNING: Raw_Read_Error_Rate SMART parameter is: %s. Value over threshold!\r\n" %int(line[9]))
rcode = mbox(u'Attention!', u'One or more hardrives may need replacement.', 0x30)
This is how I am calling this function:
dataOne = diskOne()
smartTest(log, dataOne)
print 'Disk One Done'
diskOne() looks like this:
def diskOne():
if os.path.exists(r"C:\Dejero\HDD Guardian 0.6.1\Smartctl"):
os.chdir(r"C:\Dejero\HDD Guardian 0.6.1\Smartctl")
os.system("Smartctl -a /dev/csmi0,0 > C:\Dejero\Installation-Scripts\diskOne.txt")
# Store file in variable
os.chdir(r"C:\Dejero\Installation-Scripts")
datafile = open('diskOne.txt', 'rb')
return datafile
else:
log.write('Smart utility not found.\r\n')
I have tried googling similar issues to mine and have found none. I tried moving my first for loop into diskOne() but the same issue occurs. There is no syntax error and I am just not able to see the issue at this point.
It is not skipping your second loop. The first loop reads the file through to the end, which leaves the file offset at the end of the file, so the second loop has nothing left to read. You need to move the offset back to the start, which is easily done by adding this line before the second loop:
datafile.seek(0)
Ref: Documentation
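The effect can be reproduced in a few lines. This sketch assumes Python 3 and uses io.StringIO as a stand-in for the opened file, so nothing is read from disk:

```python
import io

datafile = io.StringIO("line one\nline two\n")  # stand-in for an opened file

first_pass = [line for line in datafile]   # reads through to the end
second_pass = [line for line in datafile]  # nothing left: the offset is at the end

datafile.seek(0)                           # rewind to the start
third_pass = [line for line in datafile]   # the same lines as first_pass
```

Here second_pass comes back empty, which is what makes the second loop look as if it never ran.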

select and make new list with specific information

EDIT2: Nevermind this, someone pointed my error. Thanks
First of all, this is an example of the results I have:
(172, 'Nucleus')
(172, 'Nucleus')
(472, 'Cytoplasm')
(472, 'Cytoplasm')
(472, 'Nucleus')
What I'm trying to do is match on the first number (position 0) and then check whether the label contains part of the word "nucleus" (here, "nuc"). It can happen that every entry for a given number contains "nucleus".
I'm trying to make two lists: the first would contain only the numbers whose entries all contain "nuc"; the second would contain the numbers that have "nuc" as well as other things (like "Cytoplasm" in my example).
That is only a small part of my results.
I didn't have example code at first, because I had no idea how to include each value from my query only once in the list (in the example, I would enter the number 172 twice).
EDIT: Oops, I wrote that before adding the code I tried.
Right now, my code looks like this. Here is how I got the example shown above:
def number1(self, position):
self.position = position
List = [self.name()]
for item in List:
for i in range(position, self.c.rowcount):
self.number(i)
def separate_list(self, list_signal):
nuc_list = []
not_nuc_list = []
for i in list_signal:
print(list_signal(i))
if list_signal(i)(0) == list_signal(i+1)(0):
if list_signal(i)(1) and list_signal(i+1)(1) == re.search("nuc"):
nuc_list.append(list_signal(i))
else:not_nuc_list.append(list_signal(i))
return nuc_list and not_nuc_list
dc = connection()
dc.separate_list(dc.number1(0))
error:
Traceback (most recent call last):
File "class vincent.py", line 91, in <module>
dc.separate_list(dc.number1(0))
File "class vincent.py", line 61, in separate_list
for i in list_signal:
TypeError: 'NoneType' object is not iterable
I know this is not pretty; I tried to do it as best I could (I'm new to Python and to programming in general).
A few things. If you are trying to get the item at index 0 of a list, as you say, use list_name[0]; if you are using position to sort, use a different method.
Are (172, 'Nucleus') ... (172, 'Nucleus') tuples, or are they lists of their own? With a list you can index with [0]; with a tuple you can also unpack it into two variables to work with the data, e.g. number, cell_type = (172, 'Nucleus').
Also, at the moment dc.number1 doesn't return anything, so its result can't be used as input to another function. Add a return of some sort, or change the input to whatever self.number is modifying.
You may want to make a list of all your results, e.g. [(172, 'Nucleus'), ... (172, 'Nucleus')]; then you can iterate through it with
for number, cell_type in results_list:
    print("{} {}".format(number, cell_type))
    # Should give you "172 Nucleus"
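For the two lists described in the question, one possible approach is to group the labels by number first and then classify each number. This is a sketch under assumptions: the results are (number, label) tuples as in the sample, matching on "nuc" is case-insensitive, and numbers with no "nuc" label at all go into neither list. The variable names are made up for illustration:

```python
from collections import defaultdict

# Sample data copied from the question
results = [
    (172, 'Nucleus'),
    (172, 'Nucleus'),
    (472, 'Cytoplasm'),
    (472, 'Cytoplasm'),
    (472, 'Nucleus'),
]

# Collect the distinct (lowercased) labels seen for each number
labels_by_number = defaultdict(set)
for number, label in results:
    labels_by_number[number].add(label.lower())

nuc_only = []       # numbers whose labels all contain "nuc"
nuc_and_other = []  # numbers with a "nuc" label plus at least one other label
for number in sorted(labels_by_number):
    labels = labels_by_number[number]
    has_nuc = any("nuc" in label for label in labels)
    has_other = any("nuc" not in label for label in labels)
    if has_nuc and not has_other:
        nuc_only.append(number)
    elif has_nuc and has_other:
        nuc_and_other.append(number)
```

Because the labels are grouped in a set per number, each number ends up in a list at most once, even though 172 appears twice in the input.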

Python NoneType error in some executions of a program

I was testing some code from a book about genetic algorithms and ran into a strange error. The code is the following:
import time
import random
import math
people = [('Seymour','BOS'),
('Franny','DAL'),
('Zooey','CAK'),
('Walt','MIA'),
('Buddy','ORD'),
('Les','OMA')]
# Laguardia
destination='LGA'
flights={}
#
for line in file('schedule.txt'):
origin,dest,depart,arrive,price=line.strip().split(',')
flights.setdefault((origin,dest),[])
# Add details to the list of possible flights
flights[(origin,dest)].append((depart,arrive,int(price)))
def getminutes(t):
x=time.strptime(t,'%H:%M')
return x[3]*60+x[4]
def printschedule(r):
for d in range(len(r)/2):
name=people[d][0]
origin=people[d][1]
out=flights[(origin,destination)][int(r[d])]
ret=flights[(destination,origin)][int(r[d+1])]
print '%10s%10s %5s-%5s $%3s %5s-%5s $%3s' % (name,origin,
out[0],out[1],out[2],
ret[0],ret[1],ret[2])
def schedulecost(sol):
totalprice=0
latestarrival=0
earliestdep=24*60
for d in range(len(sol)/2):
# Get the inbound and outbound flights
origin=people[d][1]
outbound=flights[(origin,destination)][int(sol[d])]
returnf=flights[(destination,origin)][int(sol[d+1])]
# Total price is the price of all outbound and return flights
totalprice+=outbound[2]
totalprice+=returnf[2]
# Track the latest arrival and earliest departure
if latestarrival<getminutes(outbound[1]): latestarrival=getminutes(outbound[1])
if earliestdep>getminutes(returnf[0]): earliestdep=getminutes(returnf[0])
# Every person must wait at the airport until the latest person arrives.
# They also must arrive at the same time and wait for their flights.
totalwait=0
for d in range(len(sol)/2):
origin=people[d][1]
outbound=flights[(origin,destination)][int(sol[d])]
returnf=flights[(destination,origin)][int(sol[d+1])]
totalwait+=latestarrival-getminutes(outbound[1])
totalwait+=getminutes(returnf[0])-earliestdep
# Does this solution require an extra day of car rental? That'll be $50!
if latestarrival>earliestdep: totalprice+=50
return totalprice+totalwait
def geneticoptimize(domain,costf,popsize=50,step=1,
mutprob=0.2,elite=0.2,maxiter=100):
# Mutation Operation
def mutate(vec):
i=random.randint(0,len(domain)-1)
if random.random()<0.5 and vec[i]>domain[i][0]:
return vec[0:i]+[vec[i]-step]+vec[i+1:]
elif vec[i]<domain[i][1]:
return vec[0:i]+[vec[i]+step]+vec[i+1:]
# Crossover Operation
def crossover(r1,r2):
i=random.randint(1,len(domain)-2)
return r1[0:i]+r2[i:]
# Build the initial population
pop=[]
for i in range(popsize):
vec=[random.randint(domain[i][0],domain[i][1])
for i in range(len(domain))]
pop.append(vec)
# How many winners from each generation?
topelite=int(elite*popsize)
# Main loop
for i in range(maxiter):
scores=[(costf(v),v) for v in pop]
scores.sort()
ranked=[v for (s,v) in scores]
# Start with the pure winners
pop=ranked[0:topelite]
# Add mutated and bred forms of the winners
while len(pop)<popsize:
if random.random()<mutprob:
# Mutation
c=random.randint(0,topelite)
pop.append(mutate(ranked[c]))
else:
# Crossover
c1=random.randint(0,topelite)
c2=random.randint(0,topelite)
pop.append(crossover(ranked[c1],ranked[c2]))
# Print current best score
print scores[0][0]
return scores[0][1]
This code uses a text file called schedule.txt, which can be downloaded from http://kiwitobes.com/optimize/schedule.txt
When I run the code, I enter the following, according to the book:
>>> domain=[(0,8)]*(len(optimization.people)*2)
>>> s=optimization.geneticoptimize(domain,optimization.schedulecost)
But the error that I get is:
Traceback (most recent call last):
File "<pyshell#12>", line 1, in <module>
s=optimization.geneticoptimize(domain,optimization.schedulecost)
File "optimization.py", line 99, in geneticoptimize
scores=[(costf(v),v) for v in pop]
File "optimization.py", line 42, in schedulecost
for d in range(len(sol)/2):
TypeError: object of type 'NoneType' has no len()
The thing is that the error message appears only sometimes. I have checked the code and cannot see where the fault could be, because pop is never populated with empty vectors.
Any help?
Thanks
You can get None in your pop list if neither of the conditions in the mutate function are met. In that case the control runs off the end of the function, which is the same as returning None. You need to update the code to either have only one condition, or to handle a case that doesn't meet either of them in a separate block:
def mutate(vec):
    i=random.randint(0,len(domain)-1)
    if random.random()<0.5 and vec[i]>domain[i][0]:
        return vec[0:i]+[vec[i]-step]+vec[i+1:]
    elif vec[i]<domain[i][1]:
        return vec[0:i]+[vec[i]+step]+vec[i+1:]
    else:
        # new code needed here!
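One way to fill in that missing branch is sketched below. It is a standalone variant that takes domain and step as parameters instead of reading them from the enclosing function, and the fallback shown (step down if possible, otherwise return an unchanged copy) is only one possible choice, not the book's code:

```python
import random


def mutate(vec, domain, step=1):
    """Return a mutated copy of vec; never returns None."""
    i = random.randint(0, len(domain) - 1)
    low, high = domain[i]
    if random.random() < 0.5 and vec[i] > low:
        return vec[:i] + [vec[i] - step] + vec[i + 1:]
    if vec[i] < high:
        return vec[:i] + [vec[i] + step] + vec[i + 1:]
    # Reached when vec[i] is already at the upper bound and the first branch
    # was not taken: step down if there is room, otherwise return a copy.
    if vec[i] > low:
        return vec[:i] + [vec[i] - step] + vec[i + 1:]
    return list(vec)


domain = [(0, 8)] * 12
vec = [8] * 12            # every position is at the upper bound
mutated = mutate(vec, domain)
```

With every value at the upper bound, the original version would return None about half the time, which matches the intermittent failure described in the question.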
It's not an answer, but I'd trap this error and print out the pop array when it arises to see what it looks like at the time. It looks from the code like it should never get into this state, as you point out, so first look to see if it does get into that state, and then backtrack until you find the conditions where it happens.
Presumably this only happens sometimes because there are randomised factors in your code?
