Python name variable from string [duplicate] - python

Is it possible to create a variable name based on the value of a string?
I have a script that will read a file for blocks of information and store them in a dictionary. Each block's dictionary will then be appended to a 'master' dictionary. The number of blocks of information in a file will vary and uses the word 'done' to indicate the end of a block.
I want to do something like this:
master={}
block=0
for lines in file:
    if line != "done":
        $block.append(line)
    elif line == "done":
        master['$block'].append($block)
        block = block + 1
If a file had content like so:
eggs
done
bacon
done
ham
cheese
done
The result would be a dictionary with 3 lists:
master = {'0': ["eggs"], '1': ["bacon"], '2': ["ham", "cheese"]}
How could this be accomplished?

I would actually suggest using a list instead. Is there a specific reason you need dictionaries that behave like arrays?
If a list will do, you can use this:
with open("yourFile") as fd:
    arr = [x.strip().split() for x in fd.read().split("done")][:-1]
Output:
[['eggs'], ['bacon'], ['ham', 'cheese']]
If you want string indices that look like numbers, you could use this:
with open("yourFile") as fd:
    l = [x.strip().split() for x in fd.read().split("done")][:-1]
    print(dict(zip(map(str, range(len(l))), l)))

You seem to be misunderstanding how dictionaries work. Their keys are ordinary objects (any hashable value, such as an int or a string), so no magic is needed here.
We can, however, make your code nicer by using a collections.defaultdict to create the sublists as required.
from collections import defaultdict

master = defaultdict(list)
block = 0
for line in file:
    line = line.strip()  # drop the trailing newline so the comparison can match
    if line == "done":
        block += 1
    else:
        master[block].append(line)
I would, however, suggest that a dictionary is unnecessary if you want continuous, numbered indices - that's what lists are for. In that case, I suggest you follow Thrustmaster's first suggestion, or, as an alternative:
from itertools import takewhile

def repeat_while(predicate, action):
    while True:
        current = action()
        if not predicate(current):
            break
        else:
            yield current

with open("test") as file:
    action = lambda: list(takewhile(lambda line: not line == "done", (line.strip() for line in file)))
    print(list(repeat_while(lambda x: x, action)))
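For comparison, here is a minimal sketch of the same grouping written as a plain generator, assuming the marker always sits on a line of its own. The name split_blocks and the in-memory sample are illustrative and not from the question; a real file object could be passed in place of the StringIO.

```python
import io

def split_blocks(lines, marker="done"):
    """Yield one list of stripped lines per block that ends with `marker`."""
    current = []
    for raw in lines:
        line = raw.strip()  # remove the trailing newline before comparing
        if line == marker:
            yield current
            current = []
        elif line:
            current.append(line)
    # Lines after the last marker are dropped, following the question's
    # rule that "done" ends every block.

sample = io.StringIO("eggs\ndone\nbacon\ndone\nham\ncheese\ndone\n")
blocks = list(split_blocks(sample))
print(blocks)
```

Because each block is only yielded when its marker is seen, no empty trailing list has to be removed afterwards.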

I think that split on "done" is doomed to failure. Consider the list:
eggs
done
bacon
done
rare steak
well done steak
done
Stealing from Thrustmaster (whom I gave a +1 for my theft), I'd suggest:
>>> dict(enumerate(l.split() for l in open(file).read().split('\ndone\n') if l))
{0: ['eggs'], 1: ['bacon'], 2: ['ham', 'cheese']}
I know this expects a trailing "\n". If that is in question, you could use "open(file).read()+'\n'", or even "+'\n\ndone\n'" if the final done is optional.
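The difference between the two splitting strategies can be checked on an in-memory string. This is only a sketch: the sample text mirrors the list above, and splitlines() is my substitution for split() so that multi-word entries stay whole.

```python
text = "eggs\ndone\nbacon\ndone\nrare steak\nwell done steak\ndone\n"

# Splitting on the bare substring also cuts "well done steak" in two.
naive = [chunk.strip().split() for chunk in text.split("done")][:-1]

# Splitting on the marker as a whole line keeps that entry intact.
# splitlines() is used instead of split() so "rare steak" stays one item.
safe = [chunk.splitlines() for chunk in text.split("\ndone\n") if chunk]

print(naive)
print(safe)
```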

Use setattr or globals().
See How do I call setattr() on the current module?
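As a hedged illustration of what setattr does with a name held in a string, next to the dictionary alternative the other answers recommend: types.SimpleNamespace stands in for a module here purely so the sketch is self-contained, and the names holder, block0 and blocks are made up.

```python
from types import SimpleNamespace

holder = SimpleNamespace()
name = "block0"

# setattr creates an attribute whose name comes from a string.
setattr(holder, name, ["eggs"])
print(holder.block0)          # attribute access needs the name spelled out in code
print(getattr(holder, name))  # or another string-based lookup

# A plain dictionary does the same job without creating names dynamically.
blocks = {name: ["eggs"]}
print(blocks[name])
```

In most cases the dictionary is easier to inspect and iterate over than dynamically created names.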

Here's your code again, for juxtaposition:
master={}
block=0
for lines in file:
    if line != "done":
        $block.append(line)
    elif line == "done":
        master['$block'].append($block)
        block = block + 1
As mentioned in the post by Thrustmaster, it makes more sense to use a nested list here. Here's how you would do that; I've changed as little as possible structurally from your original code:
master=[[]] # Start with a list containing only a single list
for line in file: # Note the typo in your code: you wrote "for lines in file"
    if line != "done":
        master[-1].append(line) # Index -1 is the last element of your list
    else: # Since it's not not "done", it must in fact be "done"
        master.append([])
The only thing here is that you'll end up with one extra list at the end of your master list, so you should add a line to delete the last, empty sublist:
del master[-1]

Related

Python: Assigning user input as key in dictionary [closed]

Problem
I was trying to assign user input as a key in a dictionary. If user input is a key then print out its value, else print invalid key. The problem is the keys and the values will be from a text file. For simplicity I will just use random data for the text. Any help would be appreciated.
file.txt
Dog,bark
Cat,meow
bird,chirp
Code
def main():
    file = open("file.txt")
    for i in file:
        i = i.strip()
        animal, sound = i.split(",")
        dict = {animal : sound}
    keyinput = input("Enter animal to know what it sounds like: ")
    if keyinput in dict:
        print("The ",keyinput,sound,"s")
    else:
        print("The animal is not in the list")
On every iteration of the loop, you are redefining the dictionary. Instead, create it once and add new entries:
d = {}
for i in file:
    i = i.strip()
    animal, sound = i.split(",")
    d[animal] = sound
Then, you can access the dictionary items by key:
keyinput = input("Enter animal to know what it sounds like: ")
if keyinput in d:
    print("The {key} {value}s".format(key=keyinput, value=d[keyinput]))
else:
    print("The animal is not in the list")
Note that I've also renamed the dictionary variable from dict to d, since dict is a poor name choice: it shadows the built-in dict.
I've also changed the way you construct the output and used string formatting instead. If you enter Dog, the output is The Dog barks.
You can also initialize the dictionary in one line using the dict() constructor:
d = dict(line.strip().split(",") for line in file)
As a side note, to follow best practices and keep your code portable and reliable, use the with context manager when opening the file; it takes care of closing it properly:
with open("file.txt") as f:
    # ...
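Putting the pieces of this answer together, a self-contained sketch might look like the following. The function names load_sounds and describe are hypothetical, and io.StringIO stands in for open("file.txt") so the example runs without a file on disk.

```python
import io

def load_sounds(lines):
    """Build an animal -> sound dictionary from 'animal,sound' lines."""
    sounds = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines rather than fail on unpacking
        animal, sound = line.split(",")
        sounds[animal] = sound
    return sounds

def describe(animal, sounds):
    """Return the sentence to print for one user-supplied key."""
    if animal in sounds:
        return "The {key} {value}s".format(key=animal, value=sounds[animal])
    return "The animal is not in the list"

# io.StringIO stands in for open("file.txt") so the sketch runs as-is.
with io.StringIO("Dog,bark\nCat,meow\nbird,chirp\n") as f:
    sounds = load_sounds(f)

print(describe("Dog", sounds))
print(describe("Cow", sounds))
```

Returning the message instead of printing inside the function keeps the lookup easy to check separately from the input() call.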
OP, I've written some verbose explanatory notes in the code and fixed a few issues; I might have overlooked something, but check it out.
For one, avoid using dict as a variable name, since it shadows Python's built-in dict type.
Remember that, in most cases, you need to create a container before the loop rather than inside it, so that it accumulates values across iterations and is still available afterwards; this applies to your dictionary.
Also, remember to close files after reading/writing unless you use with open(filename) ...
def main():
    # declare a new, empty dictionary to hold your animal species and sounds.
    # Note that I'm avoiding the use of "dict" as a variable name since it
    # shadows/overrides the built-in type
    animal_dict = {}
    file = open("file.txt")
    for i in file:
        i = i.strip()
        animal, sound = i.split(",")
        animal_dict[animal] = sound
    # Remember to close your files after reading
    file.close()
    keyinput = input("Enter animal to know what it sounds like: ")
    if keyinput in animal_dict:
        # here, keyinput is the string/key and to do a lookup
        # in the dictionary, you use brackets.
        # animal_dict[keyinput] thus returns the sound
        print("The ",keyinput,animal_dict[keyinput],"s")
    else:
        print("The animal is not in the list")
There are comments on every line where I changed something, explaining what I changed, but to help readability I'm putting them here too.
On Line 2 I instantiated a dictionary - you were previously re-defining a dictionary for each line.
On Line 7 I changed your code to add something to the dictionary instead of just creating a new one. That's proper dictionary syntax.
On Line 10 I changed "if keyinput in dict" to "if keyinput in dict.keys()", since you're checking to see if the animal exists, and the animals in your file become the keys of the dictionary. (A plain "in dict" test already checks the keys, so this only makes the intent explicit.)
def main():
    dict = {} #Create an empty dictionary to add to
    file = open("file.txt")
    for i in file:
        i = i.strip()
        animal, sound = i.split(",")
        dict[animal] = sound #This is how you add a key/value pair to a dictionary in Python
    keyinput = input("Enter animal to know what it sounds like: ")
    if keyinput in dict.keys(): #.keys() makes the key lookup explicit
        print("The ",keyinput,dict[keyinput],"s") #Look up the sound for this key, not the last one read
    else:
        print("The animal is not in the list")
First of all, you should not give a variable the same name as a built-in. Secondly, the way you put the input into the dictionary will overwrite the previous values. You need to create the dictionary and then add the new values.
Third, you output the value sound without getting it from the dictionary.
dict as a variable should be named mydict
create mydict = {} before the initial loop
set mydict[animal] = sound in the first loop
mydict['dog'] = 'bark' # This is what happens
print keyinput and mydict[keyinput] if it is in the list.
You can also use mysound = mydict.get(keyinput, "not in dictionary") instead of the if.
See also: Why dict.get(key) instead of dict[key]?
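A short sketch of the difference between get() and indexing, using made-up sample data:

```python
mydict = {"Dog": "bark", "Cat": "meow"}

# get() returns the fallback instead of raising KeyError for a missing key.
present = mydict.get("Dog", "not in dictionary")
missing = mydict.get("Cow", "not in dictionary")
print(present, "|", missing)

# Indexing with [] raises KeyError when the key is absent.
try:
    mydict["Cow"]
except KeyError:
    print("KeyError for Cow")
```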

Building Nested dictionary in Python reading in line by line from file

The way I go about nested dictionary is this:
dicty = dict()
tmp = dict()
tmp["a"] = 1
tmp["b"] = 2
dicty["A"] = tmp
dicty == {"A" : {"a" : 1, "b" : 2}}
The problem starts when I try to implement this on a big file, reading in line by line.
This is printing the content per line in a list:
['proA', 'macbook', '0.666667']
['proA', 'smart', '0.666667']
['proA', 'ssd', '0.666667']
['FrontPage', 'frontpage', '0.710145']
['FrontPage', 'troubleshooting', '0.971014']
I would like to end up with a nested dictionary (ignore decimals):
{'FrontPage': {'frontpage': '0.710145', 'troubleshooting': '0.971014'},
'proA': {'macbook': '0.666667', 'smart': '0.666667', 'ssd': '0.666667'}}
As I am reading in line by line, I have to check whether or not the first word is still found in the file (they are all grouped), before I add it as a complete dict to the higher dict.
This is my implementation:
def doubleDict(filename):
    dicty = dict()
    with open(filename, "r") as f:
        row = 0
        tmp = dict()
        oldword = ""
        for line in f:
            values = line.rstrip().split(" ")
            print(values)
            if oldword == values[0]:
                tmp[values[1]] = values[2]
            else:
                if oldword is not "":
                    dicty[oldword] = tmp
                tmp.clear()
                oldword = values[0]
                tmp[values[1]] = values[2]
            row += 1
            if row % 25 == 0:
                print(dicty)
                break #print(row)
    return(dicty)
I would actually like to have this in pandas, but for now I would be happy if this would work as a dict. For some reason after reading in just the first 5 lines, I end up with:
{'proA': {'frontpage': '0.710145', 'troubleshooting': '0.971014'}},
which is clearly incorrect. What is wrong?
Use a collections.defaultdict() object to auto-instantiate nested dictionaries:
from collections import defaultdict

def doubleDict(filename):
    dicty = defaultdict(dict)
    with open(filename, "r") as f:
        # Count from 1 so the check below first fires on line 25, not on the first line
        for i, line in enumerate(f, 1):
            outer, inner, value = line.split()
            dicty[outer][inner] = value
            if i % 25 == 0:
                print(dicty)
                break #print(row)
    return(dicty)
I used enumerate() to generate the line count here; much simpler than keeping a separate counter going.
Even without a defaultdict, you can let the outer dictionary keep the reference to the nested dictionary, and retrieve it again by using values[0]; there is no need to keep the temp reference around:
>>> dicty = {}
>>> dicty['A'] = {}
>>> dicty['A']['a'] = 1
>>> dicty['A']['b'] = 2
>>> dicty
{'A': {'a': 1, 'b': 2}}
All the defaultdict then does is keep us from having to test if we already created that nested dictionary. Instead of:
if outer not in dicty:
    dicty[outer] = {}
dicty[outer][inner] = value
we simply omit the if test as defaultdict will create a new dictionary for us if the key was not yet present.
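To make the comparison concrete, here is a small sketch that builds the nested dictionary from the sample rows in the question, once with defaultdict and once with setdefault. The variable names with_default and explicit are my own.

```python
from collections import defaultdict

rows = [
    ["proA", "macbook", "0.666667"],
    ["proA", "smart", "0.666667"],
    ["proA", "ssd", "0.666667"],
    ["FrontPage", "frontpage", "0.710145"],
    ["FrontPage", "troubleshooting", "0.971014"],
]

with_default = defaultdict(dict)
explicit = {}
for outer, inner, value in rows:
    # defaultdict creates the inner dict on first access to a new key.
    with_default[outer][inner] = value
    # setdefault returns the existing inner dict, or inserts and returns a new one.
    explicit.setdefault(outer, {})[inner] = value

print(dict(with_default) == explicit)
print(explicit)
```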
While this isn't the ideal way to do things, you're pretty close to making it work.
Your main problem is that you're reusing the same tmp dictionary. After you insert it into dicty under the first key, you then clear it and start filling it with the new values. Replace tmp.clear() with tmp = {} to fix that, so you have a different dictionary for each key, instead of the same one for all keys.
Your second problem is that you're never storing the last tmp value in the dictionary when you reach the end, so add another dicty[oldword] = tmp after the for loop.
Your third problem is that you're checking if oldword is not "":. That may be true even if it's an empty string, because you're comparing identity, not equality. Just change that to if oldword:. (This one, you'll usually get away with, because small strings are usually interned and will usually share identity… but you shouldn't count on that.)
If you fix all three of those, you get this:
{'FrontPage': {'frontpage': '0.710145', 'troubleshooting': '0.971014'},
'proA': {'macbook': '0.666667', 'smart': '0.666667', 'ssd': '0.666667'}}
I'm not sure how to turn this into the format you claim to want, because that format isn't even a valid dictionary. But hopefully this gets you close.
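The first problem above (reusing one tmp dictionary) is easy to reproduce in isolation. The following is only a sketch with two made-up entries, showing why clear() affects what was already stored while rebinding tmp does not.

```python
# Buggy pattern: one dictionary object is stored twice, then cleared.
shared = {}
tmp = {}
tmp["macbook"] = "0.666667"
shared["proA"] = tmp
tmp.clear()                      # also empties shared["proA"]
tmp["frontpage"] = "0.710145"
shared["FrontPage"] = tmp
print(shared["proA"] is shared["FrontPage"])  # both keys hold the same object

# Fixed pattern: rebinding tmp leaves the stored dictionary untouched.
separate = {}
tmp = {"macbook": "0.666667"}
separate["proA"] = tmp
tmp = {}                         # a new object; separate["proA"] is unchanged
tmp["frontpage"] = "0.710145"
separate["FrontPage"] = tmp
print(separate)
```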
There are two simpler ways to do it:
Group the values with, e.g., itertools.groupby, then transform each group into a dict and insert it all in one step. This, like your existing code, requires that the input already be batched by values[0].
Use the dictionary as a dictionary. You can look up each key as it comes in and add to the value if found, create a new one if not. A defaultdict or the setdefault method will make this concise, but even if you don't know about those, it's pretty simple to write it out explicitly, and it'll still be less verbose than what you have now.
The second version is already explained very nicely in Martijn Pieters's answer.
The first can be written like this:
import itertools
import operator

def doubleDict(filename):
    with open(filename, "r") as f:
        rows = (line.rstrip().split(" ") for line in f)
        return {k: {values[1]: values[2] for values in g}
                for k, g in itertools.groupby(rows, key=operator.itemgetter(0))}
Of course that doesn't print out the dict so far after every 25 rows, but that's easy to add by turning the comprehension into an explicit loop (and ideally using enumerate instead of keeping an explicit row counter).
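One possible shape for that explicit loop is sketched below. It takes an iterable of lines rather than a filename so it can run on in-memory data; the name double_dict and the report_every parameter are assumptions of mine, not part of the original code.

```python
from itertools import groupby

def double_dict(lines, report_every=25):
    result = {}
    # Number the rows from 1 so progress can be reported every N rows.
    numbered = enumerate((line.split() for line in lines), 1)
    # Each item is (row_number, [outer, inner, value]); group on the outer key.
    for outer, group in groupby(numbered, key=lambda item: item[1][0]):
        inner_dict = result.setdefault(outer, {})
        for row_number, (_, inner, value) in group:
            inner_dict[inner] = value
            if row_number % report_every == 0:
                print(row_number, "rows read:", result)
    return result

lines = [
    "proA macbook 0.666667",
    "proA smart 0.666667",
    "proA ssd 0.666667",
    "FrontPage frontpage 0.710145",
    "FrontPage troubleshooting 0.971014",
]
nested = double_dict(lines, report_every=2)
```

Using setdefault for the outer key means a group that reappears later in unsorted input is merged rather than overwritten.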

How to add to a list without overwriting it when calling a method for the second time?

Here's an example of the code I'm trying to implement:
def notes():
    print "\nPlease enter any notes:"
    global texts
    texts = []
    if not texts:
        print "no notes exist."
        write_note()
    else:
        print "this note already exists"

def write_note():
    while True:
        global txt
        txt = raw_input(">>> ")
        if not txt:
            break
        else:
            texts.append(txt)
    print "\nNote(s) added to report."
    notes_menu()

def print_note():
    new_report.write("\nNotes:")
    for txt in texts:
        new_report.write("\n-%r" % txt)
    print "Note Printed to %r. Goodbye!" % file_name
    exit(0)
My goal here is that when "notes()" is called a second time (or any number of times after that), the new inputs are added to the "texts" list and don't overwrite it. I tried to at least determine whether the list was empty whenever "notes()" is called. But every time I do, regardless of how many items I've added to "texts" during the previous call, it always prints "no notes exist."
I'm kind of at a loss at this point. I've looked into dictionaries, but I'm not sure how to incorporate them into this code. Does anyone have any advice or suggestions?
I agree with the comments that suggest a better design would be to create a class that contains the texts. However, with respect to the code as it stands, it appears to me that texts = [] should be in the main code, outside notes(), so that the line is only ever run once.
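A minimal sketch of the class-based design mentioned above, written for Python 3 (the question's code is Python 2). The class name Notebook and its methods are hypothetical; the point is only that the list is created once, in __init__, and later calls append to it.

```python
class Notebook:
    """Hypothetical container that owns the list of notes."""

    def __init__(self):
        self.texts = []  # created once, when the notebook is created

    def add(self, text):
        if text:  # ignore empty input, as the original loop does
            self.texts.append(text)

    def report_lines(self):
        # Same "-%r" formatting the question uses when writing the report.
        return ["-%r" % text for text in self.texts]

notebook = Notebook()
notebook.add("first note")
notebook.add("second note")  # a later call appends; nothing is reset
print(notebook.report_lines())
```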
Without changing too much of what you have above, can I suggest a function that simply requests a new note and appends it to the existing list:
>>> notes = []
>>> def write_note(notes):
...     while True:
...         new_note = raw_input('>>> ')
...         if not new_note:
...             break
...         else:
...             notes.append(new_note)
Does this do what you are after?
When you run texts = [] you rebind texts to a new, empty list, discarding any items added earlier. Removing that line from notes() (and creating the list once, outside the function) should help.
Also, I think you may want to use the .extend() method. append adds a single item to the end of a list, i.e.:
>>> li = [1, 2, 3]
>>> li2 = [4, 5, 6]
>>> li.append(li2)
>>> li
[1, 2, 3, [4, 5, 6]]
Whereas extend() concatenates two lists:
>>> li = [1, 2, 3]
>>> li2 = [4, 5, 6]
>>> li.extend(li2)
>>> li
[1, 2, 3, 4, 5, 6]
This can be found in Dive Into Python.

Creating a Dictionary from File [duplicate]

I am trying to create a dictionary with two values: the first value is the text:
<NEW ARTICLE>
Take a look at
what I found.
<NEW ARTICLE>
It looks like something
dark and shiny.
<NEW ARTICLE>
But how can something be dark
and shiny at the same time?
<NEW ARTICLE>
I have no idea.
and the second value is the count of how many times the word "ARTICLE>" is used.
I tried different methods, and with one of them I received this error:
    (key, val) = line.split()
ValueError: need more than 1 value to unpack
I've tried a few different methods but to no avail, one method I tried said it gave too many values to unpack..
I want to be able to search for a key/word in the dictionary later on and find its appropriate count.
Using Python 3.
This should do it:
>>> with open("data1.txt") as f:
...     lines = f.read()
...     spl = lines.split("<NEW ARTICLE>")[1:]
...     dic = dict((i, x.strip()) for i, x in enumerate(spl))
...     print(dic)
...
{0: 'Take a look at \nwhat I found.',
 1: 'It looks like something\ndark and shiny.',
 2: 'But how can something be dark\nand shiny at the same time?',
 3: 'I have no idea.'}
Make sure you don't have an empty line somewhere:
if newdoc == True and line != "ARTICLE>" and line:
    (key, val) = line.split()
(an empty line would be split into [], which cannot be unpacked into two values...)
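Since the question asks for both the article texts and the number of markers, the two can be computed together. This is a sketch on a shortened in-memory sample in Python 3; the variable names marker, count and articles are my own.

```python
text = """<NEW ARTICLE>
Take a look at
what I found.
<NEW ARTICLE>
It looks like something
dark and shiny.
"""

marker = "<NEW ARTICLE>"
count = text.count(marker)  # how many times the marker appears

# The piece before the first marker is empty, so it is skipped with [1:].
articles = {i: part.strip() for i, part in enumerate(text.split(marker)[1:])}

print(count)
print(articles)
```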

Is there a better way to create dynamic functions on the fly, without using string formatting and exec?

I have written a little program that parses log files of anywhere between a few thousand lines to a few hundred thousand lines. For this, I have a function in my code which parses every line, looks for keywords, and returns the keywords with the associated values.
These log files consist of small sections. Each section has some values I'm interested in and want to store as a dictionary.
I have simplified the sample below, but the idea is the same.
My original function looked like this. It gets called between 100 and 10000 times per run, so you can understand why I want to optimize it:
def parse_txt(f):
    d = {}
    for line in f:
        if not line:
            pass
        elif 'apples' in line:
            d['apples'] = True
        elif 'bananas' in line:
            d['bananas'] = True
        elif line.startswith('End of section'):
            return d

f = open('fruit.txt','r')
d = parse_txt(f)
print d
The problem I run into, is that I have a lot of conditionals in my program, because it checks for a lot of different things and stores the values for it. And when checking every line for anywhere between 0 and 30 keywords, this gets slow fast. I don't want to do that, because, not every time I run the program I'm interested in everything. I'm only ever interested in 5-6 keywords, but I'm parsing every line for 30 or so keywords.
In order to optimize it, I wrote the following by using exec on a string:
def make_func(args):
    func_str = """
def parse_txt(f):
    d = {}
    for line in f:
        if not line:
            pass
"""
    if 'apples' in args:
        func_str += """
        elif 'apples' in line:
            d['apples'] = True
"""
    if 'bananas' in args:
        func_str += """
        elif 'bananas' in line:
            d['bananas'] = True
"""
    func_str += """
        elif line.startswith('End of section'):
            return d"""
    print func_str
    exec(func_str)
    return parse_txt

args = ['apples','bananas']
fun = make_func(args)
f = open('fruit.txt','r')
d = fun(f)
print d
This solution works great, because it speeds up the program by an order of magnitude and it is relatively simple. Depending on the arguments I put in, it will give me the first function, but without checking for all the stuff I don't need.
For example, if I give it args=['bananas'], it will not check for 'apples', which is exactly what I want to do.
This makes it much more efficient.
However, I do not like this solution very much, because it is not very readable, it is difficult to change, and it is very error prone whenever I modify something. Besides that, it feels a little bit dirty.
I am looking for alternative or better ways to do this. I have tried using a set of functions to call on every line, and while this worked, it did not offer me the speed increase that my current solution gives me, because it adds a few function calls for every line. My current solution doesn't have this problem, because it only has to be called once at the start of the program. I have read about the security issues with exec and eval, but I do not really care about that, because I'm the only one using it.
EDIT:
I should add that, for the sake of clarity, I have greatly simplified my function. From the answers I understand that I didn't make this clear enough.
I do not check for keywords in a consistent way. Sometimes I need to check for 2 or 3 keywords in a single line, sometimes just for 1. I also do not treat the result in the same way. For example, sometimes I extract a single value from the line I'm on, sometimes I need to parse the next 5 lines.
I would try defining a list of keywords you want to look for ("keywords") and doing this:
for word in keywords:
    if word in line:
        d[word] = True
Or, using a list comprehension:
dict([(word,True) for word in keywords if word in line])
Unless I'm mistaken this shouldn't be much slower than your version.
No need to use eval here, in my opinion. You're right in that an eval based solution should raise a red flag most of the time.
Edit: as you have to perform a different action depending on the keyword, I would just define function handlers and then use a dictionary like this:
def keyword_handler_word1(line):
    (...)
(...)
def keyword_handler_wordN(line):
    (...)
keyword_handlers = { 'word1': keyword_handler_word1, (...), 'wordN': keyword_handler_wordN }
Then, in the actual processing code:
for word in keywords:
    # keyword_handlers[word] is a function
    keyword_handlers[word](line)
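A runnable sketch of the handler-table idea follows. The two handlers, the "count=" keyword and the sample lines are invented for illustration; only the handlers named in wanted are ever consulted, which is the selective behaviour the question is after.

```python
def handle_apples(line, result):
    result["apples"] = True

def handle_count(line, result):
    # Hypothetical handler that extracts the number following "count=".
    result["count"] = int(line.split("count=")[1].split()[0])

keyword_handlers = {"apples": handle_apples, "count=": handle_count}

def parse_lines(lines, wanted):
    result = {}
    # Resolve the requested handlers once, before the loop over lines.
    active = [(word, keyword_handlers[word]) for word in wanted]
    for line in lines:
        for word, handler in active:
            if word in line:
                handler(line, result)
    return result

parsed = parse_lines(["3 apples", "count=7 items", "bananas"], ["apples", "count="])
print(parsed)
```

This still costs one function call per matching line, so whether it is fast enough compared with the exec-generated function would have to be measured.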
Use regular expressions. Something like the following:
>>> import re
>>> lookup = {'a': 'apple', 'b': 'banane'} # keyword: characters to look for
>>> pattern = '|'.join('(?P<%s>%s)' % (key, val) for key, val in lookup.items())
>>> re.search(pattern, 'apple aaa').groupdict()
{'a': 'apple', 'b': None}
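Building on the same idea, the pattern can be compiled once from only the keywords of interest, so each line is scanned a single time. This is a sketch; make_matcher is a made-up name, and plain alternation finds non-overlapping matches only, so keywords that overlap each other would need more care.

```python
import re

def make_matcher(keywords):
    # re.escape guards against keywords containing regex metacharacters.
    pattern = re.compile("|".join(re.escape(k) for k in keywords))

    def found_in(line):
        """Return the set of keywords that occur in `line`."""
        return set(pattern.findall(line))

    return found_in

found_in = make_matcher(["apples", "bananas"])
both = found_in("3 apples and 2 bananas")
none = found_in("no fruit here")
print(both, none)
```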
def create_parser(fruits):
    def parse_txt(f):
        d = {}
        for line in f:
            if not line:
                pass
            elif line.startswith('End of section'):
                return d
            else:
                for testfruit in fruits:
                    if testfruit in line:
                        d[testfruit] = True
    return parse_txt  # hand back the specialised parser
This is what you want - create a test function dynamically.
Depending on what you really want to do, it is, of course, possible to remove one level of complexity and define
def parse_txt(f, fruits):
    [...]
or
def parse_txt(fruits, f):
    [...]
and work with functools.partial.
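A sketch of the functools.partial variant, using the (fruits, f) argument order so the keyword list can be bound first. The sample section is in-memory data made up for the example, and the final return d for input without an end marker is my addition.

```python
from functools import partial

def parse_txt(fruits, f):
    d = {}
    for line in f:
        if line.startswith("End of section"):
            return d
        for fruit in fruits:
            if fruit in line:
                d[fruit] = True
    return d  # reached when the input ends without an end marker

# Bind the keyword list once; the result takes only the file-like argument.
parse_bananas = partial(parse_txt, ["bananas"])

section = ["apples 3", "bananas 2", "End of section", "bananas 9"]
found = parse_bananas(section)
print(found)
```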
You can use a set, like this:
fruit = set(['cocos', 'apple', 'lime'])
need = set(['cocos', 'pineapple'])
need.intersection(fruit)
which returns a set containing only 'cocos'.
