How to fix the error in Python code?

def main():
    fname = input("Enter name of file: ")
    with open(fname) as inf:
        animalnames, dates, locations = zip(*[line.strip().split(':') for line in inf])
    d = {}
    for animalname, loc in zip(animalname, locations):
        d.setdefault(animalname, []).append(loc)
    for k, v in d.items():
        print(k, end='\t')
        print(v.count('loc1'), end='\t')
        print(v.count('loc2'))
main()
I have a text file named animallog1.txt which contains the following:
a01:01-24-2011:s1
a03:01-24-2011:s2
a02:01-24-2011:s2
a03:02-02-2011:s2
a03:03-02-2011:s1
a02:04-19-2011:s2
a01:05-14-2011:s2
a02:06-11-2011:s2
a03:07-12-2011:s1
a01:08-19-2011:s1
a03:09-19-2011:s1
a03:10-19-2011:s2
a03:11-19-2011:s1
a03:12-19-2011:s2
I would like to use the above data, which is in the format animalname:date:location, to print the following table:
Number of times each animal visited each station:
Animal name    Station 1    Station 2
a01            2            1
a02            0            3
a03            4            4
========================================
I have tried, and the code above is what I have so far, but it gives me the error
builtins.UnboundLocalError: local variable 'animalname' referenced before assignment
Could someone help me fix this so I can get the desired result?

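You probably meant to iterate over animalnames (plural) here. In your version, zip(animalname, locations) reads the loop variable animalname before the loop has assigned it, which raises the UnboundLocalError. The corrected line is:
for animalname, loc in zip(animalnames, locations):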

In response to your second question, are you searching for the exact string names of the stations with count?
I see "loc1" and "loc2" are hard-coded into the provided snippet, and am wondering if that is just by way of example, or if the final script uses those instead of "s1" and "s2" or some flexible parameters.
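Putting both points together, here is a minimal sketch of one way the table could be produced. It assumes the station names in the file really are "s1" and "s2" (as in the sample data) and swaps the setdefault/list approach for a Counter per animal. The function names count_visits and print_table are made up for this illustration.

```python
from collections import Counter, defaultdict

STATIONS = ("s1", "s2")  # assumption: station names as they appear in the sample file


def count_visits(lines):
    """Return {animal: Counter({station: visits})} for 'animal:date:station' lines."""
    visits = defaultdict(Counter)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        animal, _date, station = line.split(":")
        visits[animal][station] += 1
    return visits


def print_table(visits):
    """Print one tab-separated row per animal with its count for each station."""
    print("Animal name\tStation 1\tStation 2")
    for animal in sorted(visits):
        counts = "\t".join(str(visits[animal][s]) for s in STATIONS)
        print(f"{animal}\t{counts}")


# The sample data from the question; for a real file, pass the open file object instead.
sample = """a01:01-24-2011:s1
a03:01-24-2011:s2
a02:01-24-2011:s2
a03:02-02-2011:s2
a03:03-02-2011:s1
a02:04-19-2011:s2
a01:05-14-2011:s2
a02:06-11-2011:s2
a03:07-12-2011:s1
a01:08-19-2011:s1
a03:09-19-2011:s1
a03:10-19-2011:s2
a03:11-19-2011:s1
a03:12-19-2011:s2""".splitlines()

print_table(count_visits(sample))
```

A Counter returns 0 for a station an animal never visited, so a02 gets a 0 in the first column without any special handling.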

Related

Writing multiple lists to a file

I have 2 lists and an output file passed to a function. I am having an issue with how to write the .write statement.
I have tried using an index, but I get a list error.
nameList = ['james','billy','kathy']
incomeList = [40000,50000,60000]
I need to search the lists and write the name and income to a file.
for income in incomeList:
    if income > 40000:
        output.write(str("%10d %12.2f \n") # (this is what I can't figure out)))
You can do it like this.
nameList = ['james','billy','kathy']
incomeList = [40000,50000,60000]
for k, v in zip(nameList, incomeList):
    if v > 40000:
        print(k, v)
Output:
billy 50000
kathy 60000
Maybe this is what you want:
for i, income in enumerate(incomeList):
    if income > 40000:
        output.write("%s %d\n" % (nameList[i], income))
In the case of 2 lists, I would suggest using a dict.
nameIncomeList = {'james':40000,'billy':50000,'kathy':60000}
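If you keep the separate lists, you can index them in parallel (note that the income must be converted to a string before concatenating):
f = open('f1.txt', 'w')
for i in range(len(nameList)):
    if incomeList[i] > 40000:
        f.write(nameList[i] + ' ' + str(incomeList[i]) + '\n')
f.close()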
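As a rough sketch of how the zip approach could be combined with the formatted .write call the question asks about: the function name write_high_earners and the threshold parameter are invented for this example, and the column widths (10 and 12) are taken from the format string in the question. Since a name is a string, it is formatted as text rather than with %d.

```python
import io


def write_high_earners(names, incomes, output, threshold=40000):
    """Write 'name income' lines for every income above the threshold."""
    for name, income in zip(names, incomes):
        if income > threshold:
            # Left-align the name in 10 columns, right-align the income in 12.
            output.write(f"{name:<10} {income:12.2f}\n")


nameList = ['james', 'billy', 'kathy']
incomeList = [40000, 50000, 60000]

# io.StringIO stands in for the real output file here; any object with .write works.
buffer = io.StringIO()
write_high_earners(nameList, incomeList, buffer)
print(buffer.getvalue(), end="")
```

With a real file, the same function can be called as write_high_earners(nameList, incomeList, output), where output is the already opened file.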

Searching for the lowest value in a file

I've included a test sample of my golf.txt file and the code I've used to print the results.
golf.txt:
Andrew
53
Dougie
21
The code to open this file and print the results (to keep this short I only have two players and two scores):
golfData = open('golf.txt','r')
whichLine=golfData.readlines()
for i in range(0,len(whichLine),2):
    print('Name:'+whichLine[i])
    print('Score:'+whichLine[i+1])
golfData.close()
Can I modify the code I have to pull out the minimum player score with name? I believe I can without writing to a list or dictionary but have NO clue how.
Help/suggestions much appreciated.
Use the min() function for that:
with open('file.txt') as f_in:
    min_player, min_score = min(zip(f_in, f_in), key=lambda k: int(k[1]))
print(min_player, min_score)
Prints:
Dougie
21
As @h0r53 indicated, you can try something like this:
golfData = open('golf.txt','r')
whichLine=golfData.readlines()
lowest=float('Inf')
Name=''
for i in range(0,len(whichLine),2):
    if float(whichLine[i+1])<lowest:
        lowest=float(whichLine[i+1])
        Name=whichLine[i]
golfData.close()
print(Name)
print(lowest)
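Lines read from a file keep their trailing newline, so the name and score may print with extra line breaks. The sketch below is one possible variant of the min() idea that strips the lines first. The function name lowest_score is invented for this example, and it assumes the file strictly alternates name and score lines with integer scores.

```python
def lowest_score(lines):
    """Return (name, score) for the lowest score in alternating name/score lines."""
    stripped = (line.strip() for line in lines)
    # zip(it, it) pulls two items at a time from the same iterator: (name, score).
    pairs = zip(stripped, stripped)
    name, score = min(pairs, key=lambda pair: int(pair[1]))
    return name, int(score)


# The sample from the question; with a real file, pass the open file object instead.
sample = ["Andrew\n", "53\n", "Dougie\n", "21\n"]
name, score = lowest_score(sample)
print(name, score)
```

No list or dictionary is built here; min() consumes the pairs one at a time.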

Find a list of numbers in another list using nested for loops and an if condition

I have 2 different Excel files, file 1 and file 2. I have stored the values of their columns in 2 different lists. I need to look up each number from file 1 in file 2, and I want the output to look like file 3 (the expected answer).
File 1:
File 2:
File 3/ Expected Answer:
I tried the code below for this requirement, but I don't know where I'm going wrong.
for j in range(len(terr_code)):
    g=terr_code[j]
    #print(g)
    for lists in Zip_code:
        Zip_code= lists.split(";")
        while('' in Zip_code):
            Zip_code.remove('')
        for i in range(len(Zip_code)):
            #print(i)
            h=Zip_code[i]
            print(g)
            if g in h:
                print(h)
                territory_code.append(str(terr_code[j]))
                print(territory_code[j])
                final_list.append(Zip_terr_Hem['NAME'][i])
                #print(final_list)
    s = ";"
    s= s.join(str(v) for v in final_list)
    #print(s)
    final_file['Territory Code'] = pd.Series(str(terr_code[j]))
    final_file['Territory Name'] = pd.Series(s)
final_file = pd.DataFrame(final_file )
final_file.to_csv('test file.csv', index=False)
The first for loop is working fine. But when I try to print the list of numbers from the second for loop, the first number gets printed multiple times. And although both lists are populated, the code never gets inside the if condition. Please tell me what I'm doing wrong here. Thanks.
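Judging only from the snippet, two things look suspicious: print(g) sits inside the innermost loop, so the same code is printed once per zip code, and Zip_code is reassigned inside the loop that iterates over it, so on the next pass of the outer loop it no longer holds the original rows. The following is a hedged sketch of the matching step with those two points changed. All data and the name match_territories are hypothetical, since the contents of the files are not shown, and it uses exact matching on the split values rather than the substring test in the question.

```python
def match_territories(terr_codes, zip_code_rows, names):
    """Map each territory code to the ';'-joined names of the rows that contain it."""
    result = {}
    for code in terr_codes:
        matched = []
        for row_index, row in enumerate(zip_code_rows):
            # A new name for the split values, so the list being looped over is never overwritten.
            parts = [part for part in row.split(";") if part]
            if code in parts:
                matched.append(names[row_index])
        result[code] = ";".join(matched)
    return result


# Hypothetical stand-ins for the column values read from the two Excel files.
terr_codes = ["100", "200"]
zip_code_rows = ["100;300;", "200;100", "400"]
names = ["North", "South", "East"]

print(match_territories(terr_codes, zip_code_rows, names))
```

The resulting dictionary could then be turned into a DataFrame once, after the loops, instead of assigning the columns on every pass.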

Python row.replace issue

Started fiddling with Python for the first time a week or so ago and have been trying to create a script that will replace instances of a string in a file with a new string. The actual reading and creation of a new file with intended strings seems to be successful, but error checking at the end of the file displays output suggesting that there is an error. I checked a few other threads but couldn't find a solution or alternative that fit what I was looking for or was at a level I was comfortable working with.
Apologies for messy/odd code structure, I am very new to the language. Initial four variables are example values.
editElement = "Testvalue"
newElement = "Testvalue2"
readFile = "/Users/Euan/Desktop/Testfile.csv"
writeFile = "/Users/Euan/Desktop/ModifiedFile.csv"
editelementCount1 = 0
newelementCount1 = 0
editelementCount2 = 0
newelementCount2 = 0
#Reading from file
print("Reading file...")
file1 = open(readFile,'r')
fileHolder = file1.readlines()
file1.close()
#Creating modified data
fileHolder_replaced = [row.replace(editElement, newElement) for row in fileHolder]
#Writing to file
file2 = open(writeFile,'w')
file2.writelines(fileHolder_replaced)
file2.close()
print("Modified file generated!")
#Error checking
for row in fileHolder:
    if editElement in row:
        editelementCount1 +=1
for row in fileHolder:
    if newElement in row:
        newelementCount1 +=1
for row in fileHolder_replaced:
    if editElement in row:
        editelementCount2 +=1
for row in fileHolder_replaced:
    if newElement in row:
        newelementCount2 +=1
print(editelementCount1 + newelementCount1)
print(editelementCount2 +newelementCount2)
Expected output would be the last two instances of 'print' displaying the same value, however...
The first instance of print returns the value of A + B as expected.
The second line only returns the value of B (from fileHolder), and from what I can see, A has indeed been converted to B (In fileHolder_replaced).
Edit:
For example,
if the first two counts show A and B to be 2029 and 1619 respectively (fileHolder), the last two counts show A as 0 and B as 2029 (fileHolder_replace). Obviously this is missing the original value of B.
Here is a more extended version of my comment.
If you look for "Testvalue" in the modified file, it will still be found, even where you expect only "Testvalue2". That's because the original value is a substring of the modified value. Therefore it should find twice the number of occurrences, or more precisely, the number of lines in which the string occurs.
If you check
if newElement in row
it tests whether the string newElement is contained anywhere in the string row.
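The two effects described above (substring matches, and counting lines rather than occurrences) can be seen in a small experiment. This is only an illustrative sketch with made-up rows; the helper names lines_containing and occurrences are not from the question.

```python
old, new = "Testvalue", "Testvalue2"

# Made-up rows: one with both values, one with only the old value, one with neither.
rows = ["Testvalue,Testvalue2\n", "Testvalue\n", "other\n"]
replaced = [row.replace(old, new) for row in rows]


def lines_containing(rows, text):
    """Count rows in which text appears at least once (what 'text in row' measures)."""
    return sum(1 for row in rows if text in row)


def occurrences(rows, text):
    """Count every non-overlapping appearance of text across all rows."""
    return sum(row.count(text) for row in rows)


# 'in' still finds the old value after replacing, because it is a prefix of the new one.
print(lines_containing(replaced, old))
print(lines_containing(replaced, new))

# Per-occurrence counts line up: each old occurrence became one new occurrence.
print(occurrences(rows, old), occurrences(replaced, new))
```

For a check like the one in the question, comparing occurrences(rows, old) with occurrences(replaced, new) is likely closer to what was intended than summing per-line counts.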

Output with Python Glob // Cannot find where is error in Python code

I have the following code, which does NOT give an error but it also does not produce an output.
The script is made to do the following:
The script takes an input file of 4 tab-separated columns:
It then counts the unique values in Column 1 and the frequency of corresponding values in Column 4 (which contains 2 different tags: C and D).
The output is 3 tab-separated columns containing the unique values of column 1 and their corresponding frequency of values in Column 4: Column 2 has the frequency of the string in Column 1 that corresponds with Tag C and Column 3 has the frequency of the string in Column 1 that corresponds with Tag D.
Here is a sample of input:
algorithm-n like-1-resonator-n 8.1848 C
algorithm-n produce-hull-n 7.9104 C
algorithm-n like-1-resonator-n 8.1848 D
algorithm-n produce-hull-n 7.9104 D
anything-n about-1-Zulus-n 7.3731 C
anything-n above-shortage-n 6.0142 C
anything-n above-1-gig-n 5.8967 C
anything-n above-1-magnification-n 7.8973 C
anything-n after-1-memory-n 2.5866 C
and here is a sample of the desired output:
algorithm-n 2 2
anything-n 5 0
The code I am using is the following (which one will see takes into consideration all suggestions from the comments):
from collections import defaultdict, Counter
def sortAndCount(opened_file):
    lemma_sense_freqs = defaultdict(Counter)
    for line in opened_file:
        lemma, _, _, senseCode = line.split()
        lemma_sense_freqs[lemma][senseCode] += 1
    return lemma_sense_freqs
def writeOutCsv(output_file, input_dict):
    with open(output_file, "wb") as outfile:
        for lemma in input_dict.keys():
            for senseCode in input_dict[lemma].keys():
                outstring = "\t".join([lemma, senseCode,\
                                       str(input_dict[lemma][senseCode])])
                outfile.write(outstring + "\n")
import os
import glob
folderPath = "Python_Counter" # declare here
for input_file in glob.glob(os.path.join(folderPath, 'out_')):
    with open(input_file, "rb") as opened_file:
        lemma_sense_freqs = sortAndCount(input_file)
        output_file = "count_*.csv"
        writeOutCsv(output_file, lemma_sense_freqs)
My intuition is the problem is coming from the "glob" function.
But, as I said before: the code itself DOES NOT give me an error -- but it doesn't seem to produce an output either.
Can someone help?
I have referred to the documentation here and here, and I cannot seem to understand what I am doing wrong.
Can someone provide some insight on how to get output for the files found by glob? I have a large number of files I need to process.
Regarding your original code, lemma_sense_freqs is not defined because it should be returned by the function sortAndCount(), and you never call that function.
For instance, you have a second function in your code, called writeOutCsv. You define it, and then you actually call it on the last line.
But you never call the function sortAndCount() (which is the one that should return the value of lemma_sense_freqs). Hence the error.
I don't know exactly what you want to achieve with that code, but at some point (try before the last line) you definitely need to write something like this:
lemma_sense_freqs = sortAndCount(input_file)
This is how you call the function you need; lemma_sense_freqs will then have a value assigned and you shouldn't get the error.
I cannot be more specific because it is not clear exactly what you want to achieve with that code. However, you are just experiencing a basic issue at the moment (you defined a function but never used it to retrieve the value of lemma_sense_freqs). Try adding the piece of code I suggest and play with it.
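As for the version of the code shown in the question, a glob pattern without a wildcard only matches a path with exactly that name, so a pattern ending in 'out_' most likely matches nothing and the loop body never runs. The sketch below shows one way the pieces might fit together. It assumes the input files are named out_ followed by something (pattern "out_*"), invents the output naming scheme count_<input name>.csv, and opens files in text mode as Python 3 requires for writing strings. The snake_case function names are also choices made for this example.

```python
import glob
import os
from collections import Counter, defaultdict


def sort_and_count(lines):
    """Count, per value in column 1, how often each tag in column 4 occurs."""
    freqs = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        lemma, _, _, tag = fields
        freqs[lemma][tag] += 1
    return freqs


def write_counts(output_path, freqs, tags=("C", "D")):
    """Write one tab-separated row per lemma: lemma, count for C, count for D."""
    with open(output_path, "w") as outfile:
        for lemma in sorted(freqs):
            counts = [str(freqs[lemma][tag]) for tag in tags]
            outfile.write("\t".join([lemma] + counts) + "\n")


def process_folder(folder, pattern="out_*"):
    """Process every file matching the pattern and return the output paths written."""
    written = []
    for input_path in sorted(glob.glob(os.path.join(folder, pattern))):
        with open(input_path) as infile:
            freqs = sort_and_count(infile)  # pass the open file, not its name
        # Hypothetical naming scheme: one output file per input file.
        output_name = "count_" + os.path.basename(input_path) + ".csv"
        output_path = os.path.join(folder, output_name)
        write_counts(output_path, freqs)
        written.append(output_path)
    return written
```

Calling process_folder("Python_Counter") would then write one count file per matching input file. Passing the file name instead of the open file to the counting function, as in the question, would iterate over the characters of the name rather than the lines of the file.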
