How to limit for loop iterations within cursor? - python

I am using a for loop within a SearchCursor to iterate through features in a featureclass.
import arcpy
fc = r'C:\path\to\featureclass'
with arcpy.da.SearchCursor(fc, ["fieldA", "FieldB", "FieldC"]) as cursor:
    for row in cursor:
        # Do something...
I am currently troubleshooting the script and need to find a way to limit the iterations to, say, 5 rather than 3500 as it is currently configured. I know the most basic way to limit the number of iterations in a for loop is as follows:
numbers = [1, 2, 3, 4, 5]
for i in numbers[0:2]:
    print i
However, this approach does not work when iterating over a cursor object. What method can I use to limit the number of iterations of a for loop within a cursor object wrapped in a with statement?

You could use a list comprehension to grab everything and then take only the first five rows that you need. Check the example below:
max_rows = 5  # insert max number of iterations here (named to avoid shadowing the built-in max)
with arcpy.da.SearchCursor(fc, ["fieldA", "FieldB", "FieldC"]) as cursor:
    output = [list(row) for row in cursor][:max_rows]
It is important to note that each row comes back as a tuple, so list() is used to build a 2D list that you can use for whatever you need. Even with a 3500-row dataset this should finish in little time. I hope this helps!
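If you would rather stop reading after the first few rows instead of pulling the whole cursor into memory and then slicing, itertools.islice caps the iteration directly. A minimal sketch using the same placeholder field names:
from itertools import islice
import arcpy

fc = r'C:\path\to\featureclass'
max_rows = 5
with arcpy.da.SearchCursor(fc, ["fieldA", "FieldB", "FieldC"]) as cursor:
    # islice stops after max_rows rows; the remaining rows are never read
    output = [list(row) for row in islice(cursor, max_rows)]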

Add a counter and a logical statement to limit the number of iterations. For example:
import arcpy
import sys

fc = r'C:\path\to\featureclass'
count = 1  # Start a counter
with arcpy.da.SearchCursor(fc, ["fieldA", "FieldB", "FieldC"]) as cursor:
    for row in cursor:
        # Do something...
        if count >= 2:
            print "Processing stopped because iterations >= 2"
            sys.exit()
        count += 1

One possible way is to use enumerate and break out of the loop once the index reaches your limit:
x = 5  # maximum number of rows to process
for index, row in enumerate(cursor):
    if index >= x:
        break
    # do something with row...

Related

Having trouble writing a list to a new row each time during a while loop with openpyxl

I'm having trouble inserting the list values into a new row each time the code loops.
In a few words, every time it loops I need the code to write the values from lst into a separate row.
I have done extensive research to try to understand how this works and found plenty of examples, but unfortunately I couldn't make the code work by following them.
This is the code:
from openpyxl import Workbook

max_value = 5
lst = []
while True:
    wb = Workbook()
    ws = wb.active
    num = (max_value)
    for n in range(num):
        weight = float(input('Weight: '))
        lst.append(weight)
        ws.append(lst)
        if n+1 == max_value:
            wb.save(filename='path')
I have tried to add ws.insert_rows(idx=2, amount=1) just after the line ws.append(lst) like this:
...
ws.insert_rows(idx=2, amount=1)
ws.append(lst)
if n+1 == max_value:
    wb.save(filename='path')
but it doesn't do anything because I suppose it needs something that tells the code to write the next values in that row.
I have also tried something like this:
...
next_avail_row = len(list(ws.rows))
ws.append(lst)
if n+1 == max_value:
    wb.save(filename='path')
But here as well, I'm not sure how to tell the code to write to the row it finds via next_avail_row = len(list(ws.rows)).
Thoughts?
EDIT:
At the moment, if I enter at the prompt, for instance:
1,2,3,4,5
it outputs: [output screenshot not included]
If I continue inputting numbers, for instance:
7,6,5,4,3
it outputs: [output screenshot not included]
and so forth. What I expect instead is: [expected output screenshot not included]
In a few words, every time the function gets called it should write to the same file, but one row below. I hope that is a clear explanation.
There are a couple of issues with your code.
First, the while loop. Every time it loops through, it calls wb = Workbook() and eventually wb.save(filename='path'). This creates a new Excel workbook every time, and assuming the filename in the wb.save() call is the same each time, each save overwrites the workbook written on the previous pass.
Next, the list that holds the weight values being input: you aren't clearing it, so each time you append something in the loop the list just keeps growing.
Also, the line num = (max_value) doesn't really do anything, and you need some condition to break out of the while loop. while True will keep looping forever unless you break out of it at some point.
Here is some code that should do this the way that you want it to:
from openpyxl import Workbook

max_value = 5
line_count = 0
wb = Workbook()
ws = wb.active
while True:
    lst = []
    for n in range(max_value):
        weight = float(input('Weight: '))
        lst.append(weight)
    ws.append(lst)
    line_count += 1
    # an example condition to break out of the loop
    if line_count >= 5:
        break
wb.save(filename='path')
Here the Workbook object is created and saved only once, so the code isn't opening a new workbook each time and overwriting the previous one. The list lst is emptied on each pass through the loop, so each row will only be as long as max_value. I also added an example way of breaking out of the while True: loop: once 5 lines have been added to the workbook, it breaks out. You can use any exit condition you want; this one is just for illustration.

Python Code for Standard Deviation with data from SQLITE3

from math import *
import sqlite3

ages = sqlite3.connect('person.sqlite3')

def main():
    ageslist = ages.execute("SELECT age from person")
    # average age
    for row in ageslist:
        row[0]
        average = (sum(row[0]))/len(row[0])
    # subtracts average x from x or opposite and square, depending on n
    for n in range(len(ageslist) - 1):
        if numbers[n] > average:
            numbers.append((ageslist[n] - average)**2)
        if numbers[n] < average:
            numbers.append((average - ageslist[n])**2)
    # takes square rt of the sum of all these numbers and divides by n-1
    Stdv = math.sqrt(sum(ageslist))/(len(ageslist)-1)
    end = time()
    print(Stdv)

main()
I am trying to find the standard deviation of the ages from an SQLite3 database. However, I am getting the following error:
average = (sum(row[0]))/len(row[0])
TypeError: 'int' object is not iterable
How can I correct this?
The query sent to the database connection returns an iterator. You can only pass over that iterator once; after that it is exhausted. Here is a corrected version of your code that does what you are asking.
import math
import sqlite3

conn = sqlite3.connect('person.sqlite3')

def main():
    ages_iterator = conn.execute("SELECT age from person")
    # this turns the iterator into an actual list, which you need for stdev
    age_list = [a[0] for a in ages_iterator]
    # average age
    average = sum(age_list) / float(len(age_list))
    # squared difference of each x from the average; because the difference
    # is squared, it does not matter whether x is greater or less than the average
    numbers = [(age - average)**2 for age in age_list]
    # sample standard deviation: square root of the sum of the squared
    # differences divided by n-1
    Stdv = math.sqrt(sum(numbers) / float(len(numbers) - 1))
    print(Stdv)

main()
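If you are on Python 3.4 or newer, you can also let the standard library do this for you; statistics.stdev computes the sample standard deviation directly:
import statistics

Stdv = statistics.stdev(age_list)  # sample standard deviation of the ages
print(Stdv)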
Some quick comments on the code:
for row in ageslist:
    row[0]  # This statement does nothing
    average = (sum(row[0]))/len(row[0])  # This will not have a usable row value to reference,
                                         # because the rows in ageslist have already been iterated through
When you execute ageslist = ages.execute("SELECT age from person"), the ageslist variable is an iterable cursor. Once you iterate through it, you can no longer reference its values without executing the database command again.
So I believe you should keep one variable that sums the ages during each row iteration of the for loop, and another that counts the number of entries in the database; both can be updated inside the same for loop. I'm sure there is a more "pythonic" way to accomplish this, but a sketch of that single-pass idea is below.
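A minimal sketch of that approach, using the same table and column names as the question (only the running sum and count are kept in memory):
import sqlite3

conn = sqlite3.connect('person.sqlite3')
total = 0
count = 0
for (age,) in conn.execute("SELECT age FROM person"):
    # accumulate the running sum and the row count in a single pass
    total += age
    count += 1
average = total / float(count) if count else 0
print(average)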

Iteration moves the cursor in PyMongo?

Not sure what's happening here or why, it seems as if when I iterate on a cursor it moves it, because I can't run a second loop from the same starting point. My example:
players = db.player.find({'parent_api__id': 'stats', 'game__id': {'$in': games_list}, "played": "true"})
count = 0
for c in players:
    count = count + 1
for c in players:
    game = db.game.find_one({'parent_api__id': 'schedule', 'id': c['game__id']})
    c['date'] = game['scheduled']
    print c
With this version it never enters the second loop: if I put a print at the top of that loop it never hits it, and it never reaches the print c at the bottom.
Now if I comment out the loop with the count in it, so it looks like this:
players = db.player.find({'parent_api__id': 'stats', 'game__id': {'$in': games_list}, "played": "true"})
#count = 0
#for c in players:
#    count = count + 1
for c in players:
    game = db.game.find_one({'parent_api__id': 'schedule', 'id': c['game__id']})
    c['date'] = game['scheduled']
    print c
then it enters the 2nd loop and iterates completely fine (printing out as it goes along)
Why is this? Do I have to reset the cursor every time in between with another players = db.player.find({'parent_api__id' : 'stats', 'game__id':{'$in':games_list}, "played":"true"})? It seems like that can't be the way it was intended.
Thanks for any help you can provide!
Yes, once the current item has been consumed, a cursor (by definition) points to the next item (a document, in Mongo's case); the cursor provides an iterator interface and internally maintains a pointer to the items that have already been "consumed".
There are two approaches to solving the problem you are facing.
First, use the cursor's rewind() method to reset the cursor to its unevaluated original state.
Second, clone the cursor using clone(), which gives you a second cursor that runs the same query but is a completely new instance. Use this if you need to maintain the states of the two cursors separately during or after evaluation. A minimal sketch of both options is below.
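A minimal sketch, reusing the query (and the db and games_list variables) from the question; the rewind() and clone() calls are the only point here:
players = db.player.find({'parent_api__id': 'stats', 'game__id': {'$in': games_list}, "played": "true"})

# Option 1: rewind the same cursor after the first pass
count = sum(1 for _ in players)   # the first pass exhausts the cursor
players.rewind()                  # reset it to its unevaluated state
for c in players:
    print c

# Option 2: keep an independent copy from the start
players_copy = players.clone()    # a brand-new cursor for the same query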

Incorrect value from dynamodb table description and scan count

I'm having a problem with DynamoDB. I'm attempting to verify the data contained within,
but scan seems to be returning only a subset of the data. Here is the code I'm using with the Python boto bindings:
#!/usr/bin/python
# Check the scanned length of a table against the table description
import boto.dynamodb

# Connect
TABLENAME = "MyTableName"
sdbconn = boto.dynamodb.connect_to_region(
    "eu-west-1",
    aws_access_key_id='-snipped-',
    aws_secret_access_key='-snipped-')

# Initial scan
results = sdbconn.layer1.scan(TABLENAME, count=True)
previouskey = results['LastEvaluatedKey']

# Create counting variable
count = results['Count']

# DynamoDB scan results are limited to 1MB but return a key value to carry on for the next MB,
# so loop until it does not return a continuation point
while previouskey != False:
    results = sdbconn.layer1.scan(TABLENAME, exclusive_start_key=previouskey, count=True)
    print(count)
    count = count + results['Count']
    try:
        # get next key
        previouskey = results['LastEvaluatedKey']
    except:
        # no key returned, so that's all folks!
        print(previouskey)
        print("Reached End")
        previouskey = False

# these presumably should match; they don't on the MyTableName table, not even close
print(sdbconn.describe_table(TABLENAME)['Table']['ItemCount'])
print(count)
print(sdbconn.describe_table(TABLENAME)['Table']['ItemCount']) gives me 1748175 and
print(count) gives me 583021.
I was under the impression that these should always match? (I'm aware of the six-hour update interval, but only about 300 rows have been added in the last 24 hours.)
Does anyone know if this is an issue with DynamoDB, or does my code have a wrong assumption?
Figured it out finally: it's to do with Local Secondary Indexes. They show up in the table description's item count as separate items, and the table has two LSIs, causing it to report 3x the number of items actually present.
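If you want to see where the extra items come from, the DescribeTable response also carries a per-index ItemCount under LocalSecondaryIndexes. This is a rough sketch based on the DynamoDB DescribeTable JSON; the exact nesting may differ with your boto version:
description = sdbconn.describe_table(TABLENAME)['Table']
print(description['ItemCount'])  # the inflated table-level count
for lsi in description.get('LocalSecondaryIndexes', []):
    # each local secondary index reports its own item count
    print(lsi['IndexName'], lsi['ItemCount'])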

Loop thru a list and stop when the first string is found

I have a list and I want to extract to another list the data that exists between top_row and bottom_row.
I know top_row, and bottom_row corresponds to the last row whose data[0] is an integer (the next row is made of strings; there are also later rows with integers, but I'm not interested in those).
I've tried several things, but without success:
for row, data in enumerate(fileData):
    if row > row_elements: #top_row
        try:
            n = int(data[0])
            aux = True
        except:
            n = 0
        while aux: #until it finds the bottom_row
            elements.append(data)
The problem is that it never gets past the first matching row; if I replace while with if, I get every row whose first column is an integer.
fileData is like:
*Element, type=B31H
1, 1, 2
2, 2, 3
.
.
.
359, 374, 375
360, 375, 376
*Elset, elset=PART-1-1_LEDGER-1-LIN-1-2-RAD-2__PICKEDSET2, generate
I'm only interested in rows with first column values equal to 1 to 360.
Many thanks!
The code you've posted is confusing. For example, "aux" is a poorly-named variable. And the loop really wants to start with a specific element of the input, but it loops over everything until it finds the iteration it wants, turning what might be a constant-time operation into a linear one. Let's try rewriting it:
for record in fileData[row_elements:]: # skip first row_elements rows (might need +1?)
    try:
        int(record[0])
    except ValueError:
        break # found bottom_row, stop iterating
    elements.append(record)
If no exception is thrown in the try part, you basically end up with an endless loop, given that aux will always be True.
I'm not perfectly sure what you are doing in your code, since the shape of the data isn't clear and some things are unused (n, for example), but in general you can stop a running loop (both for and while loops) with the break statement:
for row, data in enumerate(fileData):
    if conditionToAbortTheLoop:
        break
So in your case, I would guess something like this would work:
for row, data in enumerate(fileData):
    if row > row_elements: # below `top_row`
        try:
            int(data[0])
        except ValueError:
            break # not an int value, `bottom_row` found
        # if we get here, we're between the top and bottom row
        elements.append(data)
Will this work?
for row, data in enumerate(fileData):
    if row > row_elements: #top_row
        try:
            n = int(data[0])
            elements.append(data)
        except ValueError:
            continue
Or what about:
elements = [int(data[0]) for data in fileData if data[0].isdigit()]
By the way, if you care to follow the convention of most python code, you can rename fileData to file_data.
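If you want something comprehension-sized that also stops at the first non-integer row instead of filtering the whole file, itertools.takewhile can do that. A sketch, assuming fileData and row_elements are as in the question and that the rows immediately after row_elements form the numeric block:
from itertools import takewhile

elements = list(takewhile(lambda data: data[0].isdigit(),
                          fileData[row_elements + 1:]))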
Use a generator:
def isInteger(testInput):
    try:
        int(testInput)
        return True
    except ValueError:
        return False

def integersOnly(fileData):
    # fileData must be an iterator (e.g. iter(some_list) or a file object)
    element = next(fileData)
    while isInteger(element):
        yield element
        element = next(fileData)
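A small usage example; the input list here is made up for illustration and stands in for the first-column values of the rows you care about:
values = iter(['1', '2', '360', '*Elset', '5'])
print(list(integersOnly(values)))  # ['1', '2', '360'] -- stops at the first non-integer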
