can someone explain to me what I did wrong? - python

I need help unscrambling this code. I am only allowed to use these specific lines of code, but I need to 'unscramble' it to make it work. To me, this code looks good but I don't seem to get it to work so I would like to find out why this is the case.
The assignment that I am trying to solve is as follows:
Read in the file using the csv reader and build a dictionary with the tree species as the key and a count of the number of times the tree appears. Use the "in" operator to see if a tree has been added, and if not set it to 1.
Print the dictionary with the counts at the end.
My code is as follows:
from BrowserFile import open as _
import csv
with open("treeinventory.csv", "r", newline='') as f:
    count = {}
    reader = csv.reader(f)
    for yard in reader:
        for tree in yard:
            if tree in count:
                count[tree] = 1
            else:
                count[tree] = count[tree] + 1
print(count)
I would love it if someone could help me and also explain why this code does not work as it is. I am trying to learn, and this would be very helpful!
thank you!

Generally, we don't solve "homework" problems on SO. You should also try to ask specific questions and put better titles on your questions. As such, I always like to post This to help new question askers out.
Since I'm here: The answer to your assignment is that line 9 and line 11 are swapped.
As written, the logic sets count[tree] to 1 when the key is already in the dict, and adds 1 to the value stored at count[tree] when it is not. That raises a KeyError when count[tree] is read for the addition in count[tree] + 1, because there is no value there yet.
Of course, without the input file, I can't actually run the code to verify it, so please try this out for yourself and update your question with specific issues if any come up.
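For reference, here is a minimal sketch of the counting logic with the two assignments in the right branches. The input file is not available, so the sample data, the SAMPLE_CSV name, and the count_trees helper are assumptions made for illustration (one row per yard, one tree per cell):

```python
import csv
import io

# Hypothetical stand-in for treeinventory.csv: one row per yard, one tree per cell.
SAMPLE_CSV = "oak,maple,oak\npine,oak\n"

def count_trees(csv_file):
    """Count how often each tree species appears in an open CSV file."""
    count = {}
    for yard in csv.reader(csv_file):
        for tree in yard:
            if tree in count:
                count[tree] = count[tree] + 1   # seen before: increment
            else:
                count[tree] = 1                 # first sighting: start at 1
    return count

tree_counts = count_trees(io.StringIO(SAMPLE_CSV))
print(tree_counts)
```

With the real file, the same function could be called on the handle returned by open("treeinventory.csv", "r", newline='').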


python nested loop list

I am currently stuck on a nested loop problem. I would greatly appreciate any insight or tips on how to solve it.
I am trying to append some values to a list in a for loop, and I succeeded in doing that. But how can I get the final list as a variable to use in another loop?
Let's say I am extracting something by appending to a list in a for loop:
a=list()
for b in hugo:
    a.append(ids)
    print(a)
gives me
[1]
[1,2]
[1,2,3]
[1,2,3,4]
But I only need the last line of the output, the complete list, as my variable to be used in another for loop. Can anybody give me some insight into how to do this? Your help is much appreciated. Thanks in advance.
Edit:
Actually I am not trying to get someone to do my homework for me. I am just testing some software programming using python. Here goes:
I am trying to write a script to extract files with the end name of .dat from ANSA pre-processor with the correct name and file ID
For example:
ID Name
1 hugo1.dat
8 hugo2.dat
11 hugo3.dat
18 hugo4.dat
Here is what I have written:
import os
import ansa
from ansa import base
from ansa import constants
from ansa import guitk
def export_include_content():
    directory = guitk.UserInput('Please enter the directory to Output dat files:')
    ishow=list()
    includes=list()
    setna=list()
    iname=list()
    # Set includes variables to collect the elements from a function known as "INCLUDE" from the software
    includes=base.CollectEntities(deck, None, "INCLUDE")
    # For loop to get information from the "INCLUDE" function with the end filename ".dat"
    for include in includes:
        ret=base.GetEntityCardValues(deck, include, 'NAME', 'ID')
        ids=str(ret['ID'])
        setname=ret['NAME']
        if setname.endswith('dat'):
            ishow.append(ids)
            iname.append(setname)
    # Print(ishow) gives me
    # [1]
    # [1,8]
    # [1,8,11]
    # [1,8,11,18]
    # print(iname) gives me
    # [hugo1]
    # [hugo1,hugo2]
    # [hugo1,hugo2,hugo3]
    # [hugo1,hugo2,hugo3,hugo4]
    # Now that I got both of my required list of IDs and Names. It's time for me to save the files with the respective IDs and Names.
    for a in ishow:
        test=base.GetEntity(deck,'INCLUDE',int(a))
        print(a)
        file_path_name=directory+"/"+iname
        print(file_path_name)
    # print(a) gives me
    # 1
    # 8
    # 11
    # 18
    # print(file_path_name) gives me
    # filepath/[hugo1,hugo2,hugo3,hugo4]
    # filepath/[hugo1,hugo2,hugo3,hugo4]
    # filepath/[hugo1,hugo2,hugo3,hugo4]
    # filepath/[hugo1,hugo2,hugo3,hugo4]
    # This is the part I got stuck. I wanted the output to be printed in this order:
    # 1
    # filepath/hugo1
    # 8
    # filepath/hugo2
    # 11
    # filepath/hugo3
    # 18
    # filepath/hugo4
But it doesn't work so far, which is why I am asking whether you can help me solve this problem :) Any help is appreciated! Thanks, all.
Your problem is with the code indent:
a=list()
for b in hugo:
    a.append(ids)
print(a)
Use a dictionary instead of two separate lists for the ids and names of the includes.
The code below creates a dictionary with the include id as the key and the corresponding include's name as the value. Later this dictionary is used to print the file names.
In case you want to save each include as a separate file, first isolate the include using the "Or" API; then use the output API that ANSA provides for each deck to save the file (make sure to enable the optional argument 'save visible'). For example, for NASTRAN it is OutputNastran; you can search for it in the API search tab of the script editor window.
# named include_names rather than dict, to avoid shadowing the built-in
include_names={}
for include in includes:
    ret=base.GetEntityCardValues(deck, include, 'NAME', 'ID')
    ids=str(ret['ID'])
    setname=ret['NAME']
    if setname.endswith('.dat'):
        include_names[ids]=setname
for k, v in include_names.items():
    test=base.GetEntity(deck,'INCLUDE',int(k))
    file_path_name=directory+"/"+v
    print(file_path_name)
Hope this helps
Assuming ids is actually just the elements in hugo:
a=[id for id in hugo]
print(a)
Or
a=hugo.copy()
print(a)
Or
print(hugo)
Or
a=hugo
print(a)
Or
string = "["
for elem in hugo:
    # str has no append(); build the string with += instead
    string += str(elem) + ","
print(string[:-1] + "]")
Edit: Added more amazing answers. The last is my personal favourite.
Edit 2:
Answer for your edited question:
This part
for a in ishow:
    test=base.GetEntity(deck,'INCLUDE',int(a))
    print(a)
    file_path_name=directory+"/"+iname
    print(file_path_name)
Needs to be changed to
for i in range(len(ishow)):
    test=base.GetEntity(deck,'INCLUDE',int(ishow[i]))
    file_path_name=directory+"/"+iname[i]
The print statements can be left if you wish.
When you are trying to refer to the same index in multiple lists, it is better to use for i in range(len(a)) so that you can access the same index in both.
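As an alternative to indexing with range(len(...)), the built-in zip pairs up the two lists directly. This is only a sketch: plain lists stand in for ishow and iname, the ANSA calls are left out, and the directory value and the paths list are made up for illustration.

```python
# Plain-list stand-ins for the ids and names collected in the question.
ishow = ["1", "8", "11", "18"]
iname = ["hugo1.dat", "hugo2.dat", "hugo3.dat", "hugo4.dat"]
directory = "filepath"  # hypothetical output directory

paths = []
for include_id, name in zip(ishow, iname):
    # Each iteration sees one id together with the name at the same position.
    file_path_name = directory + "/" + name
    print(include_id)
    print(file_path_name)
    paths.append(file_path_name)
```

zip stops at the shorter list, so it silently ignores extra entries if the two lists ever get out of step.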
Your current code has the loop printing every single time it iterates through, so move the print statement left to the same indent level as the for loop, so it only prints once the for loop has finished running its iterations.
a=list()
for b in hugo:
    a.append(ids)
print(a)

Python 3.x: len(myList) not matching actual number of elements of myList

I have a dictionary, myDict, containing many (120k+) keys with list-values that I append to each key appropriately.
I noticed when I return myDict, some key-values with e.g. one element in the list will show len(myDict["key_with_one_list_entry"]) > 1.
The opposite does not occur as far as I can tell.
What can be the reason for this? Could key collisions cause this?
Minimal reproducible example:
import os
from collections import defaultdict
# helper defined before the loop that calls it
def _getFileParametersFromFileName(fileName):
    _fileNameParameterList = fileName.split("-")
    return _fileNameParameterList[2]
fileDict = defaultdict(list)
for file in os.listdir("."):
    if file.endswith(".sh"):
        with open(file, "r") as file_ptr:
            for i, line in enumerate(file_ptr):
                if i == 0:
                    continue
                fileName = line.split("/")
                _targetId = _getFileParametersFromFileName(fileName[-1][0:-1])
                fileDict[_targetId].append(fileName[-1][0:-1])
These files are created beforehand; there is no risk of collision, as these are the only *.sh files in the directory.
There are approximately 130k keys in "fileDict" with list-values ranging from 1 to 12 entries.
Welcome to Stackoverflow!
You are mistaken. Consider: Python is a decades-old language with an active development team and a huge bank of tests. Ask yourself how likely it is that such a fundamental bug would have remained undiscovered all this time, only to reveal itself when your code comes along.
One of the more difficult aspects of learning to program is accepting that you make "avoidable" mistakes all the time. Your question assumes rather a lot, without offering any evidence.
Might I suggest that you either edit the question so that you can demonstrate what's actually going wrong, or delete it and post a new one. You may find this article helpful in formulating your question to attract answers.
Alright, I'll bite. I have reworked your code like so,
from glob import iglob
from collections import defaultdict
target_files = defaultdict(list)
for path in iglob('tesscurl*.sh'):
    with open(path, "r") as file:
        for i, line in enumerate(file):
            if i == 0:
                continue
            file_name = line.split("/")[-1].strip()
            target_id = file_name.split("-")[2]
            target_files[target_id].append(file_name)
def print_target(target):
    print(target, len(target_files[target]), target_files[target])
print_target('0000000000001275')
print_target('0000000000028465')
This outputs,
0000000000001275 1 ['tess2019112060037-s0011-0000000000001275-0143-s_lc.fits']
0000000000028465 1 ['tess2019112060037-s0011-0000000000028465-0143-s_lc.fits']
Or in short: I cannot reproduce the issue. Now, in the comments to another answer, you mention running this in a notebook. Are you aware that repeated runs of a single cell might erroneously update the global state, and thus cause the dictionary of targets to be updated more than once? I suspect that is the issue at hand, rather than something fundamental in Python none of us seem able to reproduce. I suggest you restart your kernel to clear the workspace, and re-run all cells just to be sure.
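To illustrate the suspected cause, here is a small sketch of what happens when the same filling code runs twice against a dictionary that is never reset. The run_cell function is a hypothetical stand-in for the notebook cell, and the input line is made up for illustration:

```python
from collections import defaultdict

# Global state, as it would persist between cell runs in a notebook kernel.
target_files = defaultdict(list)

# Hypothetical single line of a download script (path made up for illustration).
lines = ["data/tess2019112060037-s0011-0000000000001275-0143-s_lc.fits\n"]

def run_cell():
    """Stand-in for the notebook cell that fills the dictionary."""
    for line in lines:
        file_name = line.split("/")[-1].strip()
        target_id = file_name.split("-")[2]
        target_files[target_id].append(file_name)

run_cell()
first_run = len(target_files["0000000000001275"])
run_cell()  # same "cell" executed again without resetting target_files
second_run = len(target_files["0000000000001275"])
print(first_run, second_run)
```

Creating the defaultdict in the same cell as the loop would reset it on every run and avoid the duplicated entries.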

Python - Searching a dictionary for strings

Basically, I have a troubleshooting program in which I want the user to enter their problem. I take this input and split it into separate words. After that, I want to create a dictionary from the contents of a .CSV file, with recognisable keywords as the keys and the second column as the solutions. Finally, I want to check whether any of the words from the user's input is a key in the dictionary and, if so, print the solution.
The problem I am facing is that, although I can do the above, the loop goes through every word: if my input was 'My phone is wet', and 'wet' was a recognisable keyword, it would say 'Not recognised', 'Not recognised', 'Not recognised', and then finally print the solution. It says 'Not recognised' so many times because the strings 'My', 'phone' and 'is' are not recognised.
So how do I test whether any word of the user's input is in my dictionary without it outputting 'Not recognised' for every other word?
Sorry if this was unclear, I'm quite confused by the whole matter.
Code:
import csv, easygui as eg
KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
Problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
for Problems, Solutions in (KeywordsCSV.items()):
    pass
Note, I have the pass there, because this is the part I need help on.
My CSV file consists of:
problemKeyword | solution
For example;
wet Put the phone in a bowl of rice.
Your code reads like some ugly code golf. Let's clean it up before we look at how to solve the problem
import easygui as eg
import csv
# KeywordsCSV = dict(csv.reader(open('Keywords and Solutions.csv')))
# why are you nesting THREE function calls? That's awful. Don't do that.
# KeywordsCSV should be named something different, too. `problems` is probably fine.
with open("Keywords and Solutions.csv") as f:
    reader = csv.reader(f)
    problems = dict(reader)
problem = eg.enterbox('Please enter your problem: ', 'Troubleshooting').lower().split()
# this one's not bad, but I lowercased your `Problem` because capital-case
# words are idiomatically class names. Chaining this many functions together isn't
# ideal, but for this one-shot case it's not awful.
Let's break a second here and notice that I changed something on literally every line of your code. Take time to familiarize yourself with PEP8 when you can! It will drastically improve any code you write in Python.
Anyway, once you've got a problems dict, and a problem that should be a KEY in that dict (a single string; the list returned by .split() is unhashable and cannot be looked up directly), you can do:
if problem in problems:
    solution = problems[problem]
or even using the default return of dict.get:
solution = problems.get(problem)
# solution is None if the key is missing (dict.get does not raise KeyError)
If you wanted to loop this, you could do something like:
while True:
    problem = eg.enterbox(...) # as above
    solution = problems.get(problem)
    if solution is None:
        # invalid problem, warn the user
        print("Not recognised, please try again.")
    else:
        # display the solution? Do whatever it is you're doing with it and...
        print(solution)
        break
Just have a boolean and an if after the loop that only runs if none of the words in the sentence were recognized.
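A minimal sketch of that flag idea follows. The find_solutions helper and the one-entry keyword table are assumptions for illustration; in the question the table comes from the CSV file and the input from easygui.

```python
# Hypothetical keyword -> solution table; in the question this comes from the CSV file.
problems = {"wet": "Put the phone in a bowl of rice."}

def find_solutions(user_input, problems):
    """Return the solutions for every recognised word in the input."""
    solutions = []
    recognised = False                      # the boolean flag
    for word in user_input.lower().split():
        if word in problems:
            recognised = True
            solutions.append(problems[word])
    if not recognised:                      # checked once, after the loop
        solutions.append("Not recognised")
    return solutions

print(find_solutions("My phone is wet", problems))
print(find_solutions("My phone is slow", problems))
```

Because the check sits after the loop, 'Not recognised' appears at most once, however many words the sentence contains.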
I think you might be able to use something like:
# dict.has_key() was removed in Python 3; use the in operator instead
for word in Problem:
    if word in KeywordsCSV:
        print(KeywordsCSV.get(word))
or the list comprehension:
[KeywordsCSV.get(word) for word in Problem if word in KeywordsCSV]

Use generic keys in dictionary in Python

I am trying to name the keys in my dictionary generically, because the names will be based on data I read from a file. I am new to Python and have not been able to solve this; I hope you can help.
For example:
from collections import defaultdict
dic = defaultdict(dict)
dic = {}
if cycle == fergurson:
    dic[cycle] = {}
    if loop == mourinho:
        a = 2
        dic[cycle][loop] = {a}
Sorry if there is a syntax error or any other mistake.
The variables fergurson and mourinho will change depending on the files that I import later on.
So I am expecting to see this output when I type:
dic[fergurson][mourinho]
the result will be:
>>>dic[fergurson][mourinho]
['2']
It will be done by using Python
Naming things, as they say, is one of the two hardest problems in Computer Science. That and cache invalidation and off-by-one errors.
Instead of focusing on what to call it now, think of how you're going to use the variable in your code a few lines down.
If you were to read code that was
for filename in directory_list:
    print(filename)
it would be easy to presume that it is printing out a list of filenames.
On the other hand, if the same code had different names
for a in b:
    print(a)
it would be a lot less expressive as to what it is doing other than printing out a list of who knows what.
I know that this doesn't help what to call your 'dic' variable, but I hope that it gets you on the right track to find the right one for you.
I have found a way; if it is wrong, please correct it:
import re
dictionary={}
dsw = "I am a Geography teacher"
abc = "I am a clever student"
# .group(1) returns the captured word; the Match object itself would be the wrong key
search = re.search(r'(?<=Geography )(.\w+)',dsw).group(1)
dictionary[search]={}
again = re.search(r'(?<=clever )(.\w+)', abc).group(1)
dictionary[search][again]={}
number = 56
dictionary[search][again]=number
and so when you want to find your specific dictionary after running the program:
dictionary["teacher"]["student"]
you will get
>>>56
This is what I meant.
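The nested lookup from the question can also be sketched with collections.defaultdict, which creates the inner dictionary on first access. The string values assigned to cycle and loop are placeholders; in the question they would come from a file.

```python
from collections import defaultdict

# Hypothetical values; in the question they would be read from a file.
cycle = "fergurson"
loop = "mourinho"
a = 2

dic = defaultdict(dict)      # missing outer keys get an empty inner dict
dic[cycle][loop] = a         # no need to create dic[cycle] first

print(dic[cycle][loop])
```

The keys are ordinary values held in variables, so nothing about the dictionary has to change when the file contents do.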

Identifying keys with multiple values in a hash table

I am a beginner in Python scripting.
I have a CSV file which has 5 columns and over 1000 rows. I am attaching a screenshot to give an idea of how the file looks like. (I have included only 4 rows, but the real file has over 1000 rows). So the task I am trying to achieve is this:
I need to print an output csv file, which prints the rows of original csv file based on following conditions.
Each "number" field (column1) is supposed to have just one "name" field associated with it. If it has more than one name fields associated with it, it must throw an error (or display a message next to the number in the output.csv)
If a number field has just one name associated with it, simply print the entire row.
The data in CSV file is in the below format.
Number  Name    Choices
11234   ABCDEF  A1B6N5
11234   ABCDEF  A2B6C4
11234   EFGHJK  A4F2
11235   ABCDEF  A3F5H7
11236   MNOPQR  F3D4D5
So my expected output should look something like this. Flag and Message should be displayed only when a "number" has more than one "name" associated with it.
If a "name" has been associated to more than one "number" it should not be flagged. (like 11235 had same name as 11234, but not flagged).
Number  Name    Choices  Flag  Message
11234                    1     More than 1 name
11234
11234
11235   ABCDEF  A3F5H7
11236   MNOPQR  F3D4D5
I do understand that this can be implemented as a hashtable, where the number serves as a key and the name serves as value. If the value count is more than 1 for any key, we can probably set a flag and print the error message accordingly.
But could someone help me get started with this? As in, how do I implement this in Python?
Any help is appreciated.
Thanks!
Here are a few concepts you should learn and understand first:
Importing and Exporting CSV: https://docs.python.org/2/library/csv.html
Counter: https://docs.python.org/2/library/collections.html#collections.Counter
or
Defaultdict(int) for counting: https://docs.python.org/2/library/collections.html#collections.defaultdict
It sounds like you need column1 to be the key of a dictionary. If you're trying to count how many times it appears (that's not clear), then you can use names = defaultdict(int); names[key]+=1
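The same defaultdict idea can be adapted so that each number maps to the set of names seen with it, which is enough to decide which numbers to flag. This is a sketch under assumptions: the rows are copied from the question, and a comma-separated layout with a header row is assumed since the real file was only shown as a screenshot.

```python
import csv
import io
from collections import defaultdict

# Sample rows copied from the question; a real script would open the CSV file instead.
SAMPLE = """Number,Name,Choices
11234,ABCDEF,A1B6N5
11234,ABCDEF,A2B6C4
11234,EFGHJK,A4F2
11235,ABCDEF,A3F5H7
11236,MNOPQR,F3D4D5
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Map each number to the set of distinct names seen with it.
names_by_number = defaultdict(set)
for row in rows:
    names_by_number[row["Number"]].add(row["Name"])

# A number is flagged when more than one distinct name is attached to it.
flagged = sorted(n for n, names in names_by_number.items() if len(names) > 1)
print(flagged)
```

Using a set means a name repeated under the same number counts once, and a name shared between two numbers does not flag either of them.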
If all you want is to remove the duplicates with no counting or crash if there's a duplicate, then here's what you can do:
import csv
mydict = {}
with open('yourfile.csv', mode='r', newline='') as infile:
    reader = csv.reader(infile)
    for row in reader:
        key = row[0]
        if key in mydict:
            # Could handle this separately
            print("Bad key, already found: %s. Ignoring row: %s" % (key, row))
            raise ValueError("Element already found: %s" % key)
        mydict[key] = row
# Write to a different file: opening the input with mode='w' would truncate it
with open('output.csv', mode='w', newline='') as outfile:
    writer = csv.writer(outfile)
    writer.writerows(mydict.values())
If this doesn't work, please give us sample input and expected output. Either way, this should get you started. Also, be patient: you'll learn most by doing things wrong and figuring out why they are wrong. Good luck!
====
Update:
You have a few choices. The easiest for a beginner is probably to build two dictionaries and then output them.
Use key = row[1]
If key is already in the dictionary, remove it (del mydict[key]) and add it to the other dict multiple_dict = {}; multiple_dict[key] = [number, None, None, Data, Message]
def proc_entry(row):
    key = row[1]
    if key in mydict:
        # Second occurrence: move the key from mydict to multiple_dict
        multiple_dict[key] = [key, None, None, 1, "Message"]
        del mydict[key]
    elif key in multiple_dict:
        # Key was already duplicated, increase flag?
        multiple_dict[key][3] += 1
    else:
        # First occurrence: save the row
        mydict[key] = row
At this point, your code is getting complicated enough to use things like:
number, name, value = row, and splitting your code into functions. Then you should test the functions with known input to see if the output is as expected.
i.e. pre-load "mydict", then call your processing function and see how it worked. Even better? Learn to write simple unit tests :) .
While we could write it for you, that's not the spirit of Stackoverflow. If you have any more questions, you might want to split this into precise questions that haven't been answered already. Everything I mentioned above could have been found on Stackoverflow and a bit of practice. Knowing what solution to go for is then the art of programming! Have fun ...or hire a programmer if this isn't fun to you!
