If-statement seemingly ignored by Write operation - python

What I am trying to do here is write the latitude and longitude of a pokemon sighting to a text file if it doesn't already exist there. Since I am using an infinite loop, I added an if-statement that prevents an already existing pair of coordinates from being added.
Note that I also have a list, Coordinates, that stores the same information. The list works: I checked, and no repeats are added to it. However, the text file has the same coordinates appended over and over again, even though it theoretically shouldn't, as the write is contained within the same if-block as the list update.
import requests
pokemon_url = 'https://pogo.appx.hk/top'
while True:
    response = requests.get(pokemon_url)
    response.raise_for_status()
    pokemon = response.json()[0:]
    Sighting = 0
    Coordinates = [None] * 100
    for num in range(len(pokemon)):
        if pokemon[num]['pokemon_name'] == 'Aerodactyl':
            Lat = pokemon[num]['latitude']
            Long = pokemon[num]['longitude']
            if (Lat, Long) not in Coordinates:
                Coordinates[Sighting] = (Lat, Long)
                file = open("aerodactyl.txt", "a")
                file.write(str(Lat) + "," + str(Long) + "\n")
                file.close()
                Sighting += 1
For clarity purposes, this is the output

You need to put your Sighting and Coordinates variables outside of the while loop if you do not want them to reset on every iteration.
However, there are a lot more things wrong with the code. Without trying it, here's what I spot:
You have no exit condition for the while loop. Please don't do this to the poor website. You'll essentially be spamming requests.
file.close should be file.close(), but overall you should only need to open the file once, not on every single iteration of the loop. Open it once, and close once you're done (assuming you will add an exit condition).
Slicing from 0 (response.json()[0:]) is unnecessary. By default the list starts at index 0. This may be a convoluted way to get a new list, but that seems unnecessary here.
Coordinates should not be a hard-coded list of 100 Nones. Just use a set to track existing coordinates.
Get rid of Sighting altogether. It doesn't make sense if you're re-issuing the request over and over again. If you want to iterate through the pokémon from one response, use enumerate if you need the index.
It's generally good practice to use snake case for Python variables.
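To make the points above concrete, here is a minimal sketch of the set-based approach. It is not tested against the real site: the function name record_new_sightings and the sample batch are made up for illustration, and an in-memory io.StringIO stands in for aerodactyl.txt so the example runs without network or disk access. The important part is that seen is created once, outside whatever loop does the polling.

```python
import io


def record_new_sightings(pokemon, seen, out, name="Aerodactyl"):
    """Write each not-yet-seen (latitude, longitude) pair for `name` to `out`.

    `seen` is a set owned by the caller, so it survives between calls.
    Returns the number of new pairs written.
    """
    added = 0
    for entry in pokemon:
        if entry["pokemon_name"] != name:
            continue
        coords = (entry["latitude"], entry["longitude"])
        if coords not in seen:
            seen.add(coords)
            out.write("{},{}\n".format(*coords))
            added += 1
    return added


# Hypothetical data standing in for response.json()
batch = [
    {"pokemon_name": "Aerodactyl", "latitude": 22.3, "longitude": 114.1},
    {"pokemon_name": "Pidgey", "latitude": 22.4, "longitude": 114.2},
    {"pokemon_name": "Aerodactyl", "latitude": 22.3, "longitude": 114.1},
]

seen = set()          # created once, outside the polling loop
out = io.StringIO()   # stands in for the opened text file
counts = [record_new_sightings(batch, seen, out) for _ in range(3)]
```

With a real file you would open aerodactyl.txt once in a with block, pass the file object as out, and sleep between requests inside a loop that has an exit condition.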

Try this:
#!/usr/bin/env python
# Python 2 (urllib2); on Python 3 the equivalent module is urllib.request
import urllib2
import json
pokemon_url = 'https://pogo.appx.hk/top'
pokemon = urllib2.urlopen(pokemon_url)
pokeset = json.load(pokemon)
# start empty and append, rather than pre-filling with 100 Nones
Coordinates = []
for num in range(len(pokeset)):
    if pokeset[num]['pokemon_name'] == 'Aerodactyl':
        Lat = pokeset[num]['latitude']
        Long = pokeset[num]['longitude']
        if (Lat, Long) not in Coordinates:
            Coordinates.append((Lat, Long))
            file = open("aerodactyl.txt", "a")
            file.write(str(Lat) + "," + str(Long) + "\n")
            file.close()

Related

How to assign a new value to the list in running loop

My data cutting loop seems to run ok in the loop, but when it prints the result outside the loop, the contents are unchanged. Presuming it's buggy because I'm trying to assign to what the for loop is running through, but I don't know.
For reference, it's a small web review scraper project I'm working on. To get it formatted to CSV with pandas, I think all the data needs to end at the same point (length), so I'm cutting any lists that are longer than the shortest. The values "cust_stars_result, rev_result, cust_res" are all lists with basic strings stored inside, in this case with lengths 16, 12, and 15. I try to slice everything down to 12 in the end, but the results are overwritten. What is the right/best way to go about this?
star_len = len(cust_stars_result)
rev_len = len(rev_result)
custname_len = len(cust_res)
print('customer name length: ' + str(custname_len) + ' -- review length: ' + str(rev_len) + ' -- star length: ' + str(star_len))
datalen = [star_len, rev_len, custname_len]
print(min(datalen))
datapack = [cust_stars_result, rev_result, cust_res]
# LOOPER FOR CULLING
for data in datapack:
    if len(data) != min(datalen):
        print("operating culler to make data even length")
        print(len(data))
        data = data[: min(datalen)]
        print(len(data)) #this comes out OK
    else:
        print("equal length, skipping culler")
        pass
print(datapack) # prints the original values
Inside your loop you update the data variable but that's just reassigning the value of that variable. You want to do something like
for i, data in enumerate(datapack):
    ...
    datapack[i] = data[: min(datalen)]
This will update the datapack element
While "trying to assign to what the for loop is running through" is a real issue, in this case the problem is rather that your code is not assigning anything to datapack when you change data. Instead, the loop assigns each item in datapack to data in turn, so when you rebind data to a new slice, datapack remains unchanged.
Instead, try either adding each item to a new list, and then assigning the new list to datapack:
temp = []
for data in datapack:
    ...
    temp.append(data[:min(datalen)])
datapack = temp
Or try using a range or enumerate loop:
for i, data in enumerate(datapack):
    ...
    datapack[i] = data[:min(datalen)]
There are fancier (but less readable and debuggable) ways to accomplish what you're doing here (slicing off the end of each list), such as the below, which uses a list comprehension and map:
mindatalen = min(map(len, datapack))
datapack = [data[:mindatalen] for data in datapack]
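To see the difference between rebinding the loop variable and changing the list itself, here is a small self-contained sketch. The three lists are stand-ins with the lengths from the question (16, 12 and 15), not the real scraped data. Note that the second loop uses del on a slice, which mutates each list in place; that is a different technique from the index assignment shown above, and it also shortens the original lists that datapack refers to.

```python
# Stand-in data with the lengths mentioned in the question
datapack = [list(range(16)), list(range(12)), list(range(15))]
shortest = min(len(data) for data in datapack)

# Rebinding: `data` is just a name pointing at each inner list in turn.
# Assigning a slice to it creates a new list and leaves datapack alone.
for data in datapack:
    data = data[:shortest]
lengths_after_rebinding = [len(data) for data in datapack]

# In-place mutation: deleting the tail changes the inner list object itself,
# so the change is visible through datapack as well.
for data in datapack:
    del data[shortest:]
lengths_after_in_place = [len(data) for data in datapack]

print(lengths_after_rebinding)
print(lengths_after_in_place)
```

If the original lists must stay untouched, prefer building a new datapack rather than deleting in place.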

Why does my output display turn 1 instead of turn 11 when I load game after saving?

image of output after saving and loading turn 11
def save_game(buildings,building_count,turn):
    f = open('data.txt', 'w')
    f.write('Turn {}\n'.format(turn))
    for i in range(len(board)):
        s = ''
        for j in range(len(board[i])):
            s += board[i][j] + ','
        f.write(s[:-1] + '\n')
    f.close()
    return buildings, building_count, turn
def load_game(buildings, building_count, turn):
    f = open('data.txt','r')
    f.readline()
    for line in f:
        data = line.strip('\n').split(',')
        board.append(data)
    f.close()
    return buildings, building_count, turn
Please help me with this using file I/O. This is a school assignment and I am not allowed to use imports or anything. Thank you so much!
Your load game doesn't change the input parameters buildings, building_count, turn in any way.
I would suspect 1 is just the default value or a random initialized value for turn, when you call the load_game function.
You append something into board, wherever that is from. So the code in general seems wrong.
As a start I would not pass any parameters except maybe the file path to load_game. Initialize the variables in there and fill them with the correct values.
Regarding the turn count problem specifically:
For the first line, you can read it in and split it into two parts using .split(' '). Check that the first entry in the list is equal to 'Turn', then take the second entry and parse it to an int.
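A rough sketch of that idea, using no imports. The helper name parse_turn and the shape of load_game (taking an iterable of lines and returning the board and the turn) are my own assumptions, since I can't see the rest of your program; the file format is the one your save_game writes.

```python
def parse_turn(line):
    """Return the turn number from a header line such as 'Turn 11'."""
    parts = line.strip().split(' ')
    if len(parts) == 2 and parts[0] == 'Turn':
        return int(parts[1])
    raise ValueError('unexpected header line: ' + repr(line))


def load_game(lines):
    """Rebuild (board, turn) from the lines of a save file.

    `lines` can be an open file or any iterable of strings.
    """
    lines = iter(lines)
    turn = parse_turn(next(lines))    # first line holds the turn
    board = []                        # start from an empty board
    for line in lines:
        board.append(line.strip('\n').split(','))
    return board, turn


# Example with in-memory lines; with a real file you would write:
#     f = open('data.txt', 'r'); board, turn = load_game(f); f.close()
board, turn = load_game(['Turn 11\n', 'HSE,FAC\n', 'SHP,HWY\n'])
print(turn)
```

The caller then has to use the returned turn instead of the value it started with, otherwise the display will keep showing the old number.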

Python readlines() doesn't function outputs bug in for loop

Intro:
I'm a beginner learning Python syntax at the moment. I've come across the concept of reading and writing files, which Python supports natively. I figured I'd give it a try and ran into bugs when I put the reading and writing commands in a loop. I wanted to randomly pick a name from a name file and then write it into a new file. My file includes 19239 lines of names, randrange(18238) generates from 0 - 18238, and, supposedly, would read a random line between 1 - 18239. The problem is that the code that reads and writes works without the for loop but not with the for loop.
My attempt:
from random import randrange
rdname = open("names.dat", "r")
wrmain = open("main.dat", "a")
rdmain = open("main.dat", "r")
for x in range(6):
    nm = rdname.readlines()[randrange(18238)]
    print(str(randrange(18238)) + ": " + nm)
    wrmain.write("\n" + nm)
...
Error code:
Exception has occurred: IndexError
list index out of range
Good luck with your programming journey.
The readlines() method has some non-intuitive behaviour. When you call readlines(), it reads the entire remaining content of the file and returns it as a list of strings, one per line, leaving the file position at the end. Thus the second time you call rdname.readlines()[randrange(18238)], there is nothing left to read and you actually get an empty list. So functionally you are telling your programme to run [][randrange(18238)] on the second iteration of the loop.
I also took the liberty of fixing the random number call. The way you had implemented it, one random number is drawn when selecting the name, nm = rdname.readlines()[randrange(18238)], and a different one when printing the line number, print(str(randrange(18238)) + ": " + nm), so the printed number would not match the selected name.
...
rdname = open("names.dat", "r")
wrmain = open("main.dat", "a")
rdmain = open("main.dat", "r")
rdname_list = rdname.readlines()
for x in range(6):
    rd_number = randrange(18238)
    nm = rdname_list[rd_number]
    print(str(rd_number) + ": " + nm)
    wrmain.write("\n" + nm)
...
...
rdname.readlines() exhausts your file handle. Running rdname.readlines() gives you the list of lines the first time, but returns an empty list every subsequent time. Obviously, you can't access an element in an empty list. To fix this, assign the result of readlines() to a variable just once, before your loop.
rdlines = rdname.readlines()
maxval = len(rdlines)
for x in range(6):
    randval = randrange(maxval)
    nm = rdlines[randval]
    print(str(randval) + ": " + nm)
    wrmain.write("\n" + nm)
Also, making sure your random number can only go to the length of your list is a good idea. No need to hardcode the length of the list though -- the len() function will give you that.
I highly recommend you take a look at how to debug small programs. Using a debugger to step through your code is immensely helpful because you can see how each line affects the values of your variables. In this case, if you'd looked at what rdname.readlines() returns in each iteration, it would be obvious why you got the IndexError: seeing that it returns an empty list on the second call would point you in the direction of the answer.
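If you want to see the exhaustion for yourself without touching your data files, here is a tiny experiment. It uses io.StringIO as a stand-in for names.dat (three made-up names), which behaves like an open text file for this purpose. It also shows random.choice, which picks an element directly so you don't have to manage indices at all.

```python
import io
import random

# In-memory stand-in for open("names.dat", "r")
names_file = io.StringIO("Ann\nBob\nCara\n")

first = names_file.readlines()    # reads everything up to the end of the file
second = names_file.readlines()   # position is already at the end: nothing left

names_file.seek(0)                # rewinding makes the content readable again
third = names_file.readlines()

# Pick six names from the list that was read once, before the loop
picked = [random.choice(first).strip() for _ in range(6)]

print(first)
print(second)
print(picked)
```

Rewinding with seek(0) inside the loop would also make your original code run, but re-reading the whole file on every iteration is wasteful; reading once before the loop is the cleaner fix.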

Python - program for searching for relevant cells in excel does not work correctly

I've written a code to search for relevant cells in an excel file. However, it does not work as well as I had hoped.
In pseudocode, this is what it should do:
Ask for input excel file
Ask for input textfile containing keywords to search for
Convert input textfile to list containing keywords
For each keyword in list, scan the excelfile
If the keyword is found within a cell, write it into a new excelfile
Repeat with next word
The code works, but some keywords are not found while they are present within the input excelfile. I think it might have something to do with the way I iterate over the list, since when I provide a single keyword to search for, it works correctly. This is my whole code: https://pastebin.com/euZzN3T3
This is the part I suspect is not working correctly. Splitting the textfile into a list works fine (I think).
#IF TEXTFILE
elif btext == True:
    #Split each line of textfile into a list
    file = open(txtfile, 'r')
    #Keywords in list
    for line in file:
        keywordlist = file.read().splitlines()
    nkeywords = len(keywordlist)
    print(keywordlist)
    print(nkeywords)
    #Iterate over each string in list, look for match in .xlsx file
    for i in range(1, nkeywords):
        nfound = 0
        ws_matches.cell(row = 1, column = i).value = str.lower(keywordlist[i-1])
        for j in range(1, worksheet.max_row + 1):
            cursor = worksheet.cell(row = j, column = c)
            cellcontent = str.lower(cursor.value)
            if match(keywordlist[i-1], cellcontent) == True:
                ws_matches.cell(row = 2 + nfound, column = i).value = cellcontent
                nfound = nfound + 1
and my match() function:
def match(keyword, content):
    """Check if the keyword is present within the cell content, return True if found, else False"""
    if content.find(keyword) == -1:
        return False
    else:
        return True
I'm new to Python so my apologies if the way I code looks like a warzone. Can someone help me see what I'm doing wrong (or could be doing better?)? Thank you for taking the time!
Splitting the textfile into a list works fine (I think).
This is something you should actually test (hint: as far as I can tell, the for loop reads the first line before file.read() collects the rest, so check whether the first keyword survives). The best way to make easily testable code is to isolate functional units into separate functions, i.e. you could make a function that takes the name of a text file and returns a list of keywords. Then you can easily check if that bit of code works on its own. A more pythonic way to read lines from a file (which is what you do, assuming one word per line) is as follows:
with open(filename) as f:
    # splitlines() drops the trailing newline that readlines() would keep on each keyword
    keywords = f.read().splitlines()
The rest of your code may actually work better than you expect. I'm not able to test it right now (and don't have your spreadsheet to try it on anyway), but if you're relying on nfound to give you an accurate count for all keywords, you've made a small but significant mistake: it's set to zero inside the loop, and thus you only get a count for the last keyword. Move nfound = 0 outside the loop.
In Python, the way to iterate over lists - or just about anything - is not to increment an integer and then use that integer to index the value in the list. Rather loop over the list (or other iterable) itself:
for keyword in keywordlist:
...
As a hint, you shouldn't need nkeywords at all.
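Putting those suggestions together, here is a sketch of how the keyword reading and the matching could be split into two small functions that you can test without any spreadsheet. The function names and sample data are made up, and plain lists of strings stand in for the worksheet column, so none of this touches your Excel library. Two things in the posted fragment look worth checking against it: range(1, nkeywords) stops one short of the last keyword, and the first line of the text file appears to be consumed by the for loop before file.read() runs.

```python
def read_keywords(lines):
    """Return lowercased, stripped keywords, skipping blank lines.

    `lines` can be an open text file or any iterable of strings.
    """
    return [line.strip().lower() for line in lines if line.strip()]


def find_matches(keywords, cells):
    """Map each keyword to the (lowercased) cell values that contain it."""
    matches = {}
    for keyword in keywords:
        matches[keyword] = [
            str(cell).lower()
            for cell in cells
            if cell is not None and keyword in str(cell).lower()
        ]
    return matches


# Made-up data standing in for the keyword file and one worksheet column
keywords = read_keywords(["Apple\n", "pear\n", "\n", "Plum\n"])
cells = ["Green apple pie", None, "Pear tart", "apple and PLUM jam"]
matches = find_matches(keywords, cells)

for keyword, found in matches.items():
    print(keyword, found)
```

Writing the results to the output workbook then becomes a separate, small step: one column per keyword, one row per entry in its list.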
I hope this gets you on the right track. When asking questions in future, it'd be a great help to provide more information about what goes wrong, and preferably enough to be able to reproduce the error.

In SPSS Python essentials, can I get the value of an SPSS variable returned to Python for further use?

I have a database where each case holds info about handwritten digits, eg:
Digit1Seq : when in the sequence of 12 digits the "1" was drawn
Digit1Ht: the height of the digit "1"
Digit1Width: its width
Digit2Seq: same info for digit "2"
on up to digit "12"
I find I now need the information organized a little differently as well. In particular, I want new variables with the height and width of the first digit written, then the height and width of the second, etc., as SPSS vars
FirstDigitHt
FirstDigitWidth ...
TwelvthDigitWidth
Here's a Python program I wrote to do within SPSS what ought to be a very simple computation, but it runs into a sort of namespace problem:
BEGIN PROGRAM PYTHON.
import spss
indices = ["1", "2", "3","4","5", "6", "7", "8", "9", "10", "11", "12"]
seq=0
for i in indices:
    spss.Submit("COMPUTE seq = COMDigit" + i + "Seq.")
    spss.Submit("EXECUTE.")
    spss.Submit("COMPUTE COM" + indices[seq] + "thWidth = COMDigit" + i + "Width.")
    spss.Submit("COMPUTE COM" + indices[seq] + "thHgt = COMDigit" + i + "Hgt.")
    spss.Submit("EXECUTE.")
END PROGRAM.
It's clear what's wrong here: the value of seq in the first COMPUTE command doesn't get back to Python, so the right thing can't happen in the next two COMPUTE commands. Python's value of seq doesn't change, so I end up with SPSS code that gives me only two variables (COM1thWidth and COM1thHgt), into which COMDigit1Width, COMDigit2Width, etc. get written.
Is there any way to get Python to access SPSS's value of seq each time so that the string concatenation will create the correct COMPUTE? Or am I just thinking about this incorrectly?
Have googled extensively, but find no way to do this.
As I'm new to using Python in SPSS (and not all that much of a wiz with SPSS), there may well be a far easier way to do this.
All suggestions most welcome.
Probably the easiest way to get your SPSS variable data into Python variables for manipulation is with the spss.Dataset class.
To do this, you will need:
1.) the dataset name of your SPSS Dataset
2.) either the name of the variable you want to pull data from or its index in your dataset.
If the name of the variable you want to extract data from is named 'seq' (as I believe it was in your question), then you can use something like:
BEGIN PROGRAM PYTHON.
from __future__ import with_statement
import spss
with spss.DataStep():
    #the lines below create references to your dataset,
    #to its variable list, and to its case data
    lv_dataset = spss.Dataset(name = <name of your SPSS dataset>)
    lv_caseData = lv_dataset.cases
    lv_variables = lv_dataset.varlist
    #to make use of an SPSS cases object, you specify in square brackets which rows and which variables to extract from, such as:
    #lv_theData = lv_caseData[rowStartIndex:rowEndIndex, columnStartIndex:columnEndIndex]
    #Each row you request to be extracted will be returned as a list of values, one value for each variable you request data for
    #This means that if you want to get data for one variable across many rows of data, you will get a list for each row of data, but each row's list will have only one value in it, hence in the code below, we grab the first element of each list returned
    #the line below extracts all the data from the SPSS variable named 'seq' in the dataset referenced above into a list
    lv_variableData = [itm[0] for itm in lv_caseData[0:len(lv_caseData), lv_variables['seq'].index]]
END PROGRAM.
There are lots of ways to process the case data held by Statistics via Python, but the case data has to be read explicitly using the spss.Cursor, spssdata.Spssdata, or spss.Dataset class. It does not live in the Python namespace.
In this case the simplest thing to do would be to just substitute the formula for seq into the later references. There are many other ways to tackle this.
Also, get rid of those EXECUTE calls. They just force unnecessary data passes. Statistics will automatically pass the data when it needs to based on the command stream.
Hi, I just stumbled across this, and you've probably moved on, but it might help other folks. I don't think you actually need to have Python access the SPSS values. I think something like this might work:
BEGIN PROGRAM PYTHON.
import spss
for i in range(1,13):
    k = "COMPUTE seq = COMDigit" + str(i) + "Seq."
    l = "Do if seq = " + str(i)+ "."
    m = "COMPUTE COM" + str(i) + "thWidth = COMDigit" + str(i) + "Width."
    n = "COMPUTE COM" + str(i) + "thHgt = COMDigit" + str(i) + "Hgt."
    o = "End if."
    # print(...) with a single argument works on both Python 2 and Python 3
    print(k)
    print(l)
    print(m)
    print(n)
    print(o)
    spss.Submit(k)
    spss.Submit(l)
    spss.Submit(m)
    spss.Submit(n)
    spss.Submit(o)
spss.Submit("EXECUTE.")
END PROGRAM.
But I'd have to see the data to make sure I'm understanding your problem correctly. Also, the print stuff makes the code look ugly, but it's the only way I can keep a handle on what's going on under the hood. Cheerio!
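One caveat, as far as I can tell without the data: the loop above only fills COM3thWidth when digit "3" happened to be drawn third. A variation that covers every combination is to generate one conditional assignment per (digit, position) pair, so SPSS does the comparison case by case and Python never needs the value of seq. The sketch below only builds the syntax strings, so it runs in plain Python; the function name is made up, the variable names follow the COMDigit…Seq / COM…thWidth pattern from the question, and the generated IF commands are untested against real data.

```python
def build_reorder_syntax(n_digits=12):
    """Build SPSS IF commands that copy each digit's width and height into
    the variables for the position at which that digit was drawn."""
    commands = []
    for digit in range(1, n_digits + 1):
        for position in range(1, n_digits + 1):
            # Condition: this digit was drawn at this position in the sequence
            condition = "IF (COMDigit{0}Seq = {1}) ".format(digit, position)
            commands.append(
                condition + "COM{1}thWidth = COMDigit{0}Width.".format(digit, position))
            commands.append(
                condition + "COM{1}thHgt = COMDigit{0}Hgt.".format(digit, position))
    return commands


commands = build_reorder_syntax()
print(len(commands))
print(commands[0])

# Inside a BEGIN PROGRAM block you would then submit them, for example:
#     for command in commands:
#         spss.Submit(command)
```

Printing the list first, as above, is a cheap way to check the generated syntax before submitting anything.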
