looping until the number of cells changed is negligible - python

This is probably a simple question, but it's driving me crazy! I have Python code that runs a cellular automaton on a land use grid. I've built a dictionary of cell ID: land use code, imported from a text file. I've also imported the adjacent neighbors of each cell from a second text file. For each cell, I find the highest land use code among the cell and its neighbors and count how often it occurs. If this value is greater than the code of the cell being processed and occurs 4 or more times, I update the dictionary for that cell ID. The land use codes are ranked by priority. You will see < 6 in the code below: 6 is water and wetlands, which I do not want to change. The first time I run the code, 7509 cells change land use based on their adjacent neighbors' land uses. If I comment out the code that reads the dictionary text file and run it again, around 5,000 cells change. Run it again, and even fewer change, and so on. What I would like to do is run this in a loop until only 0.0001 of the total cells change, and then break out of the loop.
I've tried several times using loops like "for r in range(999)" (something big) with "if End_Sim > count: break". But it breaks after the first pass, because the count goes back to zero. I've tried putting count = 0 inside the loop, and then it adds up. I want it to start over on every pass so the number of changed cells gets smaller and smaller. I'm stumped...hopefully this is trivial to somebody!
Here's my code (it's a clean slate...I've deleted my failed attempts to create the number of simulations loop):
import sys, string, csv
#Creating a dictionary of FID: LU_Codes from external txt file
text_file = open("H:\SWAT\NC\FID_Whole_Copy.txt", "rb")
#Lines = text_file.readlines()
FID_GC_dict = dict()
reader = csv.reader(text_file, delimiter='\t')
for line in reader:
    FID_GC_dict[line[0]] = int(line[1])
text_file.close()
#Importing neighbor list file for each FID value
Neighbors_file = open("H:\SWAT\NC\Pro_NL_Copy.txt","rb")
Entries = Neighbors_file.readlines()
Neighbors_file.close()
Neighbors_List = map(string.split, Entries)
#print Neighbors_List
#creates a list of the current FID
FID = [x[0] for x in Neighbors_List]
#Calculate when to end of one sweep
Tot_Cells = len(FID)
End_Sim = int(0.0001*Tot_Cells)
gridList = []
for nlist in Neighbors_List:
    row = []
    for item in nlist:
        row.append(FID_GC_dict[item])
    gridList.append(row)
#print gridList
#Performs cellular automata rules on land use grid codes
i = iter(FID)
count = 0
for glist in gridList:
    Cur_FID = i.next()
    Cur_GC = glist[0]
    glist.sort()
    lr_Value = glist[-1]
    if lr_Value < 6:
        tie_LR = glist.count(lr_Value)
        if tie_LR >= 4 and lr_Value > Cur_GC:
            FID_GC_dict[Cur_FID] = lr_Value
            #print "The updated gridcode for FID ", Cur_FID, "is ", FID_GC_dict[Cur_FID]
            count += 1
print count
Thanks for any help!

Use a while loop:
cnt_total = 1234 # init appropriately
cnt_changed = cnt_total
p = 0.0001 # stop once this fraction of the cells (or less) changes
while (cnt_changed > cnt_total*p):
    # your code here
    # remember to update the cnt_changed variable
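As a rough sketch of how this loop shape behaves, the example below uses a hypothetical run_sweep function (not from the question) as a stand-in for one pass of the cellular automaton. The point it illustrates is that the changed-cell count is recomputed from zero on every pass and then compared against the threshold.

```python
def run_sweep(state):
    """Hypothetical stand-in for one pass over the grid.

    Lowers every value above 0 by one and returns how many cells changed.
    """
    changed = 0  # starts from zero on every sweep
    for key, value in state.items():
        if value > 0:
            state[key] = value - 1
            changed += 1
    return changed


def simulate(state, p=0.0001):
    """Repeat sweeps until the changed fraction drops to p or below."""
    cnt_total = len(state)
    cnt_changed = cnt_total  # forces at least one sweep
    history = []
    while cnt_changed > cnt_total * p:
        cnt_changed = run_sweep(state)  # fresh count for this pass
        history.append(cnt_changed)
    return history


cells = {"a": 3, "b": 1, "c": 0, "d": 2}
print(simulate(cells))  # prints [3, 2, 1, 0]
```

Initializing cnt_changed to the total guarantees that the first sweep runs; after that, each pass overwrites it with its own count.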

Try a while loop with a break statement:
# initialization stuff
while(1):
    ...
    if x < 0.0001:
        break
    ...
http://docs.python.org/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops

I fixed the code so the simulations stop once the number of cells changed is less than 0.0001 of the total cells. I had the while loop in the wrong place. Here's the code, in case anyone is interested in cellular automata.
import sys, string, csv
#Creating a dictionary of FID: LU_Codes from external txt file
text_file = open("H:\SWAT\NC\FID_Whole_Copy.txt", "rb")
#Lines = text_file.readlines()
FID_GC_dict = dict()
reader = csv.reader(text_file, delimiter='\t')
for line in reader:
    FID_GC_dict[line[0]] = int(line[1])
text_file.close()
#Importing neighbor list file for each FID value
Neighbors_file = open("H:\SWAT\NC\Pro_NL_Copy.txt","rb")
Entries = Neighbors_file.readlines()
Neighbors_file.close()
Neighbors_List = map(string.split, Entries)
#print Neighbors_List
#creates a list of the current FID
FID = [x[0] for x in Neighbors_List]
#print FID
#Calculate when to end the simulations (negligible change in land use)
tot_cells = len(FID)
end_sim = tot_cells
p = 0.0001
#Performs cellular automata rules on land use grid codes
while (end_sim > tot_cells*p):
    gridList = []
    for nlist in Neighbors_List:
        row = []
        for item in nlist:
            row.append(FID_GC_dict[item])
        gridList.append(row)
    #print gridList
    i = iter(FID)
    count = 0
    for glist in gridList:
        Cur_FID = i.next()
        Cur_GC = glist[0]
        glist.sort()
        lr_Value = glist[-1]
        if lr_Value < 6:
            tie_LR = glist.count(lr_Value)
            if tie_LR >= 4 and lr_Value > Cur_GC:
                FID_GC_dict[Cur_FID] = lr_Value
                print "The updated gridcode for FID ", Cur_FID, "is ", FID_GC_dict[Cur_FID]
                count += 1
    end_sim = count
    print count


I'm getting index out of list Error [duplicate]

import sys
import csv
from math import sqrt
from collections import defaultdict
def calcDistance(x1, y1, x2, y2):
    distance = sqrt((x1-x2)**2 + (y1-y2)**2)
    return distance
def make_dict():
    return defaultdict(make_dict)
# Capture 1 input from the command line.
# NOTE: sys.argv[0] is the name of the python file
# Try "print sys.argv" (without the quotes) to see the sys.argv list
# 1 input --> the sys.argv list should have 2 elements.
if (len(sys.argv) == 2):
    print "\tOK. 1 command line argument was passed."
    # Now, we'll store the command line inputs to variables
    myFile = str(sys.argv[1])
else:
    print 'ERROR: You passed', len(sys.argv)-1, 'input parameters.'
    quit()
# Create an empty list:
cities = []
# Create an empty dictionary to hold our (x,y) coordinate info:
myCoordinates = {}
# Open our file:
myFile = '%s.csv' % (myFile)
with open(myFile, 'rb') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in spamreader:
        # Only read rows that do NOT start with the "%" character.
        if (row[0][0] != '%'):
            # print row
            id = int(row[0])
            isHome = int(row[1])
            x = float(row[2])
            y = float(row[3])
            myCoordinates[id] = {'x': x, 'y': y}
            # print myCoordinates[id]['x']
            # print myCoordinates[id]['y']
            if (isHome == 1):
                # Store this id as the home city
                homeCity = id
            cities.append(id)
print homeCity
print cities
# Create a TSP tour.
# VERSION 1 -- Using range() and for() loops:
myTour = []
for i in range(homeCity, len(cities)+1):
    myTour.append(i)
for i in range(1, homeCity+1):
    myTour.append(i)
print myTour
# VERSION 2 -- Using only range()
'''
firstPart = range(homeCity, len(cities)+1)
secondPart = range(1, homeCity+1)
myTour = firstPart + secondPart
print myTour
'''
tau = defaultdict(make_dict)
for i in cities:
    # print "distance[%d][%d] = 0" % (i, i)
    tau[i][i] = 0
    for j in range(i+1, len(cities)+1):
        # print "distance[%d][%d] > 0" % (i, j)
        tau[i][j] = calcDistance(myCoordinates[i]['x'], myCoordinates[i]['y'], myCoordinates[j]['x'], myCoordinates[j]['y'])
        # print "distance[%d][%d] = distance[%d][%d]" % (j, i, i, j)
        tau[j][i] = tau[i][j]
# FIXME -- Edit the code below...
# Calculate the total distance of our TSP solution:
i = myTour[i]
for myIndex in range(1, len(myTour)+1):
    j = myTour[myIndex]
    print j
Function to calculate cost based on distance. Needs to be modified.
def cost(rate,j):
    cost = rate * j
cost = cost(1000,j)
print cost
Also, I need to calculate the cost based on the distance traveled. With myIndex I am getting a list index out of range error, and I don't know what exactly is going wrong there. j is supposed to be the total distance calculated.
Lists in Python use 0-based indexing. If you add n elements to a list, the valid indexes run from 0 to n-1. But you are running the loop from 1 to n, so the last iteration raises a list index out of range error.
You should do this:
for myIndex in range(0, len(myTour)):
    j = myTour[myIndex]
    print(j)
If you get a list index out of range error, find the loop that accesses the list with 1-based indexing and change it from range(1,len(some_list)+1) to range(0,len(some_list)). Or you can simply write range(len(some_list)): when no start value is passed to range, it starts from 0 by default.
To calculate the cost, try this:
for myIndex in range(0, len(myTour)):
    j = myTour[myIndex]
    cost = rate * j
    print(cost)
Set the value of rate before starting the loop.
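Since the stated goal is a cost based on the total distance traveled, here is a minimal sketch of summing the leg distances along a tour. The tour_length helper, the sample coordinates, and the rate of 1000 per unit distance are assumptions for illustration, not part of the original code.

```python
from math import sqrt


def tour_length(tour, coords):
    """Sum the straight-line distances between consecutive stops."""
    total = 0.0
    # zip(tour, tour[1:]) pairs each stop with the next one, so no index
    # ever runs past the end of the list.
    for a, b in zip(tour, tour[1:]):
        (x1, y1), (x2, y2) = coords[a], coords[b]
        total += sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
    return total


# Hypothetical data: three cities, tour starts and ends at city 1.
coords = {1: (0.0, 0.0), 2: (3.0, 4.0), 3: (3.0, 0.0)}
tour = [1, 2, 3, 1]
distance = tour_length(tour, coords)
rate = 1000  # assumed cost per unit distance
print(distance, rate * distance)  # prints 12.0 12000.0
```

Iterating over pairs of stops avoids index arithmetic altogether, which sidesteps the off-by-one error discussed above.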

Python File IO - building dictionary and finding max value

Problem is to return the name of the event that has the highest number of participants in this text file:
#Beyond the Imposter Syndrome
32 students
4 faculty
10 industries
#Diversifying Computing Panel
15 students
20 faculty
#Movie Night
52 students
So I figured I had to build a dictionary with the event names as keys and, as values, the sum of the integers at the beginning of the other lines. I'm having a lot of trouble, and I think I'm making it more complicated than it is.
This is what I have so far:
def most_attended(fname):
    '''(str: filename, )'''
    d = {}
    f = open(fname)
    lines = f.read().split(' \n')
    print lines
    indexes = []
    count = 0
    for i in range(len(lines)):
        if lines[i].startswith('#'):
            event = lines[i].strip('#').strip()
            if event not in d:
                d[event] = []
            print d
            indexes.append(i)
            print indexes
        if not lines[i].startswith('#') and indexes !=0:
            num = lines[i].strip().split()[0]
            print num
            if num not in d[len(d)-1]:
                d[len(d)-1] += [num]
    print d
    f.close()
import sys
from collections import defaultdict
from operator import itemgetter
def load_data(file_name):
    events = defaultdict(int)
    current_event = None
    for line in open(file_name):
        if line.startswith('#'):
            current_event = line[1:].strip()
        else:
            participants_count = int(line.split()[0])
            events[current_event] += participants_count
    return events
if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('Usage:\n\t{} <file>\n'.format(sys.argv[0]))
    else:
        events = load_data(sys.argv[1])
        print('{}: {}'.format(*max(events.items(), key=itemgetter(1))))
Here's how I would do it.
with open("test.txt", "r") as f:
    docText = f.read()
eventsList = []
#start at one because we don't want what's before the first #
for item in docText.split("#")[1:]:
    individualLines = item.split("\n")
    #get the sum by finding everything after the name, name is the first line here
    sumPeople = 0
    #we don't want the title
    for line in individualLines[1:]:
        if not line == "":
            sumPeople += int(line.split(" ")[0]) #add everything before the first space to the sum
    #add to the list a tuple with (eventname, numpeopleatevent)
    eventsList.append((individualLines[0], sumPeople))
#get the item in the list with the max number of people
print(max(eventsList, key=lambda x: x[1]))
Essentially, you first want to split the document on #, ignoring the first item because that's always going to be empty. Now you have a list of events. For each event, you go through every additional line (every line except the first, which is the name) and add that line's value to the sum. Then you create a list of tuples like (eventname, numPeopleAtEvent). Finally, you use max() to get the item with the maximum number of people.
This code prints ('Movie Night', 104); you can format the output however you like.
Similar answers to the ones above.
result = {} # store the results
current_key = None # placeholder to hold the current_key
for line in lines:
    # find what event we are currently stripping data for
    # if this line doesnt start with '#', we can assume that its going to be info for the last seen event
    if line.startswith("#"):
        current_key = line[1:].strip()
        result[current_key] = 0
    elif current_key:
        # pull the number out of the string
        number = [int(s) for s in line.split() if s.isdigit()]
        # make sure we actually got a number in the line
        if len(number) > 0:
            result[current_key] = result[current_key] + number[0]
# max() over a dict iterates its keys, so compare by each key's total
print(max(result, key=result.get))
This will print "Movie Night".
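When the totals live in a dictionary, max iterates over the keys, so the key function has to map each key to its total. A small sketch, using totals worked out by hand from the sample file in the question (the variable names are made up for illustration):

```python
# Totals summed by hand from the sample file shown in the question.
totals = {
    "Beyond the Imposter Syndrome": 32 + 4 + 10,
    "Diversifying Computing Panel": 15 + 20,
    "Movie Night": 52,
}

# The key function looks up each name's total.
best_name = max(totals, key=totals.get)

# Alternative: compare (name, total) pairs by the total.
best_pair = max(totals.items(), key=lambda pair: pair[1])

print(best_name)  # prints Movie Night
print(best_pair)  # prints ('Movie Night', 52)
```

A key such as lambda x: x[1] applied to the dictionary itself would compare the second character of each event name rather than the counts.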
Your problem description says that you want to find the event with highest number of participants. I tried a solution which does not use list or dictionary.
Ps: I am new to Python.
import re
# lines is assumed to hold the whole file content as one string
bigEventName = ""
participants = 0
curEventName = ""
curEventParticipants = 0
# Use RegEx to split the file by lines
itr = re.finditer(r"^([#\w+].*)$", lines, flags = re.MULTILINE)
for m in itr:
    if m.group(1).startswith("#"):
        # Whenever a new group is encountered, check if the previous sum of
        # participants is more than the recent event. If so, save the results.
        if curEventParticipants > participants:
            participants = curEventParticipants
            bigEventName = curEventName
        # Reset the current event name and sum as 0
        curEventName = m.group(1)[1:]
        curEventParticipants = 0
    elif re.match(r"(\d+) .*", m.group(1)):
        # If it is line which starts with number, extract the number and sum it
        curEventParticipants += int(re.search(r"(\d+) .*", m.group(1)).group(1))
# This nasty code is needed to take care of the last event
bigEventName = curEventName if curEventParticipants > participants else bigEventName
# Here is the answer
print("Event: ", bigEventName)
You can do it without a dictionary and maybe make it a little simpler by keeping a running total per event:
with open('myfile.txt', 'r') as f:
    lines = [l.strip() for l in f if l.strip()] # drop blank lines and '\n'
highest = 0
event = ""
current = ""
total = 0
for l in lines:
    if l.startswith('#'):
        current = l[1:].strip() # a new event starts here
        total = 0
    else:
        total += int(l.split()[0])
        if total > highest:
            highest = total
            event = current
print (event)

Finding Maximum Value in CSV File

I have an assignment to find the average and maximum rainfall in the file "BoulderWeatherData.csv". I found the average using this code:
rain = open("BoulderWeatherData.csv", "r")
data = rain.readline()
print(rain)
data = rain.readlines()
total = 0
linecounter = 0
for rain in data:
    linecounter = linecounter + 1
print("The number of lines is", linecounter)
for line in data:
    r = line.split(",")
    total = total + float(r[4])
print(total)
average = float(total / linecounter)
print("The average rainfall is ", "%.2f" % average)
However, I can't seem to find the maximum using the same process. I tried the max function, but the value I passed it was a single float, which max cannot iterate over.
Any help would be appreciated.
This is my preferred way of handling this.
#!/usr/bin/env python3
rain = open("BoulderWeatherData.csv","r")
average = 0.0
total = 0
maxt = 0.0
for line in rain:
    try:
        p = float(line.split(",")[4])
        average += p
        total += 1
        maxt = max(maxt,p)
    except (ValueError, IndexError): # skip the header row and malformed lines
        pass
average = average / float(total)
print("Average:",average)
print("Maximum:",maxt)
This will output:
Average: 0.05465272591486193
Maximum: 1.98
import csv
INPUT = "BoulderWeatherData.csv"
PRECIP = 4 # 5th column
with open(INPUT, "r") as inf:
    incsv = csv.reader(inf)
    header = next(incsv, None) # skip header row
    precip = [float(row[PRECIP]) for row in incsv]
avg_precip = sum(precip, 0.) / (len(precip) or 1) # prevent div-by-0
max_precip = max(precip)
print(
    "Avg precip: {:0.3f} in/day, max precip: {:0.3f} in/day"
    .format(avg_precip, max_precip)
)
returns
Avg precip: 0.055 in/day, max precip: 1.980 in/day
max_rain = 0 # avoid naming this "max", which would shadow the built-in
for line in data:
    r = line.split(",")
    if float(r[4]) > max_rain:
        max_rain = float(r[4])
print(max_rain)
Something like that.
You're already accumulating total across loop iterations.
To keep track of a maxvalue, it's basically the same thing, except instead of adding you're maxing:
total = 0
maxvalue = 0
for line in data:
    r = line.split(",")
    value = float(r[4])
    total = total + value
    maxvalue = max(maxvalue, value)
print(total)
print(maxvalue)
Or, if you don't want to use the max function:
for line in data:
    r = line.split(",")
    value = float(r[4])
    total = total + value
    if value > maxvalue:
        maxvalue = value
This code finds the maximum and the average of the floats stored in the 5th position of each line in a .csv file.
rainval = []
Initializes the empty list where we will store the values.
with open ("BoulderWeatherData.csv", "r") as rain:
Opens the .csv file and names it "rain".
    for lines in rain:
This reads every line in rain until the end of the file.
        rainval += [float(lines.strip().split(",")[4])]
We append the float value found in the fifth position (index 4) of the line.
We repeat the above for every line in the .csv file.
print (sorted(rainval)[-1])
This sorts the values in the rainval list, takes the last (greatest) value, and prints it. This is the maximum value; max(rainval) returns the same result, since max works on floats as well as ints.
print (sum(rainval)/len(rainval))
This prints the average rainfall.
Alternatively, if we don't want to use arrays:
maxrain = -float("inf")
total, count = 0, 0
with open ("test.txt", "r") as rain:
    for lines in rain:
        temp = float(lines.strip().split(",")[4])
        if maxrain < temp:
            maxrain = temp
        total += temp
        count += 1
print (maxrain)
print (total/count)
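To try the approaches above without the real file, here is a small self-contained sketch. The sample rows are made up, and the column layout (precipitation in the 5th column, index 4, with one header row) is an assumption based on the code in this thread.

```python
import csv
import io

# Made-up sample standing in for BoulderWeatherData.csv; the real column
# layout is an assumption (precipitation in the 5th column, index 4).
SAMPLE = """station,date,tmax,tmin,precip
BOULDER,2014-01-01,40,20,0.00
BOULDER,2014-01-02,38,18,0.25
BOULDER,2014-01-03,35,15,1.50
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row
values = [float(row[4]) for row in reader]

average = sum(values) / len(values)
maximum = max(values)  # max accepts a list of floats directly
print("%.2f %.2f" % (average, maximum))  # prints 0.58 1.50
```

Collecting the values into a list first means max and sum can each be called once on the whole list, rather than on a single float inside the loop.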

Python, if statement to change a number

I am learning Python, and I made a script that searches a file for the lines that contain a keyword, collects them in a list, and then writes one selected entry of that list to a new file (a second command-line argument selects the entry).
Everything went well until I tried to add a check: if the selected index is greater than the actual len(list), the selected index should be set to len(list). For whatever reason, it does not work.
Can anyone please point out why it is not working? This is my script. Thanks a million for the help. (Here is a link with an example of the files that I am using as an input)
import sys
import re
filename = sys.argv[1]
line_select = int(sys.argv[2])
newfile = str(filename) + ".3d"
openold = open(filename,"r")
opennew = open(newfile,"w")
rline = openold.readlines()
energies = []
line_number = 0
for line in rline:
    line_number += 1
    if re.search( r"SCF Done", line ):
        words = line.split()
        energy = float( words[4] )
        energies.append(str(line_number) + " : " + "The energy of the molecule is %f kcal mol-1" % energy)
len_list = len(energies)
if line_select > len_list:
    line_select = len_list
print >>opennew, energies[line_select]
openold.close()
opennew.close()
The last element of the energies list is actually energies[len_list-1], since Python indexes start from 0.
So if you want to print the last element of energies whenever line_select is too large, you need to clamp line_select to one less than the list length:
if line_select >= len_list:
    line_select = len_list-1
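The same clamping can be written as a small helper. This is only a sketch: clamp_index is a hypothetical name, and the sample strings stand in for the real energies list.

```python
def clamp_index(requested, items):
    """Return requested, limited to the valid index range of items."""
    if not items:
        raise ValueError("items is empty")
    # Valid indexes run from 0 to len(items) - 1.
    return max(0, min(requested, len(items) - 1))


# Stand-in data for the energies list built by the script.
energies = ["12 : energy A", "40 : energy B", "77 : energy C"]
print(energies[clamp_index(10, energies)])  # prints 77 : energy C
print(energies[clamp_index(1, energies)])   # prints 40 : energy B
```

Note that an empty list has no valid index at all, so that case needs its own handling before indexing.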

looping until the number of cells changed is very small

This is a repost because I'm getting weird results. I'm trying to run a simulation loop for a cellular automaton that changes land use codes based on each cell's adjacent neighbors. I import a text file that maps each cell ID (key) to its land use code (value). I also import a text file with each cell's adjacent neighbors. The first time I run the code, 7509 cells change land use based on their adjacent neighbors' land uses. If I comment out the code that reads the dictionary text file and run it again, around 5,000 cells change. Run it again, and even fewer change, and so on. What I would like to do is run this in a loop until only 0.0001 of the total cells change, and then break out of the loop.
I've tried a while loop, but it's not giving me the results I'm looking for. After the first pass, the count is correct at 7509. After that, the count is 28,476 over and over again. I don't understand why this is happening, because the count should go back to zero. Can anyone tell me what I'm doing wrong? Here's the code:
import sys, string, csv
#Creating a dictionary of FID: LU_Codes from external txt file
text_file = open("H:\SWAT\NC\FID_Whole_Copy.txt", "rb")
#Lines = text_file.readlines()
FID_GC_dict = dict()
reader = csv.reader(text_file, delimiter='\t')
for line in reader:
    FID_GC_dict[line[0]] = int(line[1])
text_file.close()
#Importing neighbor list file for each FID value
Neighbors_file = open("H:\SWAT\NC\Pro_NL_Copy.txt","rb")
Entries = Neighbors_file.readlines()
Neighbors_file.close()
Neighbors_List = map(string.split, Entries)
#print Neighbors_List
#creates a list of the current FID
FID = [x[0] for x in Neighbors_List]
gridList = []
for nlist in Neighbors_List:
    row = []
    for item in nlist:
        row.append(FID_GC_dict[item])
    gridList.append(row)
#print gridList
#Calculate when to end of one sweep
tot_cells = len(FID)
end_sim = tot_cells
p = 0.0001
#Performs cellular automata rules on land use grid codes
while (end_sim > tot_cells*p):
    i = iter(FID)
    count = 0
    for glist in gridList:
        Cur_FID = i.next()
        Cur_GC = glist[0]
        glist.sort()
        lr_Value = glist[-1]
        if lr_Value < 6:
            tie_LR = glist.count(lr_Value)
            if tie_LR >= 4 and lr_Value > Cur_GC:
                FID_GC_dict[Cur_FID] = lr_Value
                #print "The updated gridcode for FID ", Cur_FID, "is ", FID_GC_dict[Cur_FID]
                count += 1
    end_sim = count
    print end_sim
Thanks for any help....again! :(
I fixed the code so that the simulations stop once the number of cells changed is less than 0.0001 of the total cells. I had put the while loop in the wrong place. If anyone is interested, here's the revised code for the land use cellular automaton.
import sys, string, csv
#Creating a dictionary of FID: LU_Codes from external txt file
text_file = open("H:\SWAT\NC\FID_Whole_Copy.txt", "rb")
#Lines = text_file.readlines()
FID_GC_dict = dict()
reader = csv.reader(text_file, delimiter='\t')
for line in reader:
    FID_GC_dict[line[0]] = int(line[1])
text_file.close()
#Importing neighbor list file for each FID value
Neighbors_file = open("H:\SWAT\NC\Pro_NL_Copy.txt","rb")
Entries = Neighbors_file.readlines()
Neighbors_file.close()
Neighbors_List = map(string.split, Entries)
#print Neighbors_List
#creates a list of the current FID
FID = [x[0] for x in Neighbors_List]
#print FID
#Calculate when to end the simulations (negligible change in land use)
tot_cells = len(FID)
end_sim = tot_cells
p = 0.0001
#Performs cellular automata rules on land use grid codes
while (end_sim > tot_cells*p):
    gridList = []
    for nlist in Neighbors_List:
        row = []
        for item in nlist:
            row.append(FID_GC_dict[item])
        gridList.append(row)
    #print gridList
    i = iter(FID)
    count = 0
    for glist in gridList:
        Cur_FID = i.next()
        Cur_GC = glist[0]
        glist.sort()
        lr_Value = glist[-1]
        if lr_Value < 6:
            tie_LR = glist.count(lr_Value)
            if tie_LR >= 4 and lr_Value > Cur_GC:
                FID_GC_dict[Cur_FID] = lr_Value
                print "The updated gridcode for FID ", Cur_FID, "is ", FID_GC_dict[Cur_FID]
                count += 1
    end_sim = count
    print count
I don't know which type of cellular automaton you are programming, so this is just a guess, but usually a cellular automaton works by computing a whole phase (generation) from the old values, ignoring updated values until the phase is finished.
When I got unexpected results from a simple cellular automaton, it was because I had forgotten to apply the phase to a backup grid and was applying it directly to the grid I was working on.
What I mean is that you should have 2 grids, let's call them grid1 and grid2, and do something like:
init grid1 with data
while number of generations < total generations needed
    calculate grid2 as the next generation of grid1
    grid1 = grid2 (you replace the real grid with the buffer)
Altering the values of grid1 directly will lead to different results, because you will change neighbours of cells that still have to be updated before the current phase is finished.
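The two-grid idea can be sketched in Python roughly as follows. This is an illustration under assumptions, not the asker's code: the function names, the tiny five-cell grid, and the dictionary-based layout are made up, while the update rule (highest code below 6, occurring at least 4 times, greater than the current code) is taken from the question.

```python
def next_generation(codes, neighbors):
    """Build a new grid from codes without modifying codes itself."""
    new_codes = dict(codes)  # the buffer (grid2)
    changed = 0
    for cell, adjacent in neighbors.items():
        # Always read from the old grid, never from the buffer.
        values = [codes[cell]] + [codes[n] for n in adjacent]
        top = max(values)
        # Rule from the question: codes of 6 and above are protected.
        if top < 6 and values.count(top) >= 4 and top > codes[cell]:
            new_codes[cell] = top
            changed += 1
    return new_codes, changed


def run(codes, neighbors, p=0.0001):
    """Repeat generations until the changed fraction is p or less."""
    changed = len(codes)
    while changed > len(codes) * p:
        codes, changed = next_generation(codes, neighbors)  # grid1 = grid2
    return codes


# Hypothetical grid: a center cell with four neighbors.
start = {"c": 1, "n1": 3, "n2": 3, "n3": 3, "n4": 3}
links = {"c": ["n1", "n2", "n3", "n4"],
         "n1": ["c"], "n2": ["c"], "n3": ["c"], "n4": ["c"]}
final = run(start, links)
print(final["c"], start["c"])  # prints 3 1
```

Because every generation reads only from the previous grid, the order in which cells are visited no longer affects the result.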
