ValueError: too many values to unpack (expected 2) errors - python

My code:
def dictest():
    global my_glossary
    # read all lines of the file
    inFile = open("glossary.txt", "r")
    inText = inFile.read()
    inFile.close()
    my_glossary = {}
    # iterate through all lines, after removing the line-end character(s)
    for line in inText.splitlines():
        if line != '': # ignore empty lines
            (key,value) = line.split(",")
            my_glossary[key] = value
    addToGlossary = entryNew.get()
    addToGlossaryDef = outputNew.get()
    my_glossary[addToGlossary] = addToGlossaryDef
    # list all the dictionary entries
    for k,v in my_glossary.items():
        print('key:', k, ', value:', v)
My Output:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python34\lib\tkinter\__init__.py", line 1533, in __call__
return self.func(*args)
File "I:\School\Working Glossary.py", line 59, in MultiFunc
dictest()
File "I:\School\Working Glossary.py", line 14, in dictest
(key,value) = line.split(",")
ValueError: too many values to unpack (expected 2)
I am trying to build a keyword glossary that uses a text file for storage. I keep running into this error, which stops the program from working.
My text file contents:
bug, this is a test
test, this is another test
testing,testing
123,12354

I think you want this:
>>> line = "hello,world,foo,bar"
>>> (key, value) = line.split(",", 1)
>>> key
'hello'
>>> value
'world,foo,bar'
>>>
The change being: (key, value) = line.split(",", 1)
Passing 1 as the second argument (maxsplit) tells split to stop after the first comma, so the rest of the line, any further commas included, goes to value.
From the docs,
str.split([sep[, maxsplit]])
(...)
If maxsplit is given, at most maxsplit splits are done (thus, the list
will have at most maxsplit+1 elements).
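Putting the maxsplit idea into the glossary loader could look roughly like the sketch below. The function name parse_glossary, the decision to skip lines without a comma, and the "tricky" sample line are assumptions made for illustration, not part of the original program.

```python
def parse_glossary(lines):
    """Build a {term: definition} dict from 'term,definition' lines."""
    glossary = {}
    for line in lines:
        line = line.strip()
        if not line:           # ignore empty lines
            continue
        if "," not in line:    # assumption: skip lines that have no comma at all
            continue
        # maxsplit=1: only the first comma separates the key from the value
        key, value = line.split(",", 1)
        glossary[key.strip()] = value.strip()
    return glossary


# Hypothetical sample data; the last line has commas inside the definition.
sample = [
    "bug, this is a test",
    "test, this is another test",
    "",
    "tricky, a definition, with extra commas",
]
glossary = parse_glossary(sample)
print(glossary["tricky"])
```

With a real file, the same function can be fed the open file object, since iterating over a file yields its lines.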

Related

Python - Error while trying to split line of text

I am having an issue while trying to split a line of text I get from a .txt file. It is quite a big file, but I will paste only 2 lines, with the original text:
1307;Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode;KS1J/00080000/2;861;Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode;KS1J/00080990/2;
1306;Własność: udział 1/1<>Jan Nowak<>im. rodz.: Tomasz_ Maria<>Somewhere 2<>30-200 ZipCode;KW22222;861;Własność: udział 1/1<>GMINA TARNOWIEC<><>Tarnowiec 211<>30-200 ZipCode;KS1W/00080000/1;
The data I get from this file will be used to create reports, and _ and <> will be used for further formatting. I want to split each line on ;
The problem is that I get an error with both of the splitting methods I tried.
First, the basic .split(';'):
dane = open('dane_protokoly.txt', 'r')
for line in dane:
    a,b,c,d,e,f,g = line.split(';')
    print(a)
    print(b)
    print(c)
    print(d)
    print(e)
    print(f)
    print(g)
I get an error after the first line has been printed:
Traceback (most recent call last):
File "C:\Users\Admin\Desktop\Nowy folder\costam.py", line 36, in <module>
a,b,c,d,e,f,g = line.split(';')
ValueError: not enough values to unpack (expected 7, got 1)
The same happens when I create lists from this file (a list looks like: ['1307', 'Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode', 'KS1J/00080000/2', '861', 'Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode', 'KS1J/00080990/2', '']):
dane = plik('dane_protokoly.txt')
for line in dane:
    a = line[0]
    b = line[1]
    c = line[2]
    d = line[3]
    e = line[4]
    f = line[5]
    g = line[6]
    print(str(a))
    print(str(b))
    print(str(c))
    print(str(d))
    print(str(e))
    print(str(f))
Here I also get an error after the first line is printed correctly:
Traceback (most recent call last):
File "C:\Users\Admin\Desktop\Nowy folder\costam.py", line 22, in <module>
b = line[1]
IndexError: list index out of range
Any idea why I am getting these errors?
Sometimes line.split(';') does not return 7 values to unpack into (a, b, c, ...), so it is better to iterate over the result like this:
lst = line.split(';')
for item in lst:
    print(item)
There are blank lines between the records, and those are what cause the problem for you.
The syntax you used is also bad practice.
Change your code like this:
for line in open('dane_protokoly.txt').read().split('\n'):
    lst = line.split(';')
    for item in lst:
        print(item)
This version does not care about the blank lines in between.
As Rahul K P mentioned, the problems are the "empty" lines in between your lines with the data. You should skip them when trying to split your data.
Maybe use this as a starting point:
with open(r"dane_protokoly.txt", "r") as data_file:
    for line in data_file:
        # skip rows which only contain a newline special char
        if len(line)>1:
            data_row=line.strip().split(";")
            print(data_row)
Your second strategy didn't work because line[0] is essentially the whole line as it includes no spaces and the default is splitting at spaces.
Therefore there is no line[1] or line[2]... and therefore you get a list index out of range error.
I hope this helps. And I hope it solves your problem.
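To combine both answers, a minimal sketch could skip blank lines, drop the empty piece produced by the trailing ; and only then check the field count. The helper name split_record, the expected count of 6, and the shortened sample records are assumptions for illustration.

```python
def split_record(line, expected=6):
    """Return the fields of one ';'-separated record, or None for a blank line."""
    line = line.strip()
    if not line:
        return None
    fields = line.split(";")
    # Each record ends with ';', so split() leaves an empty string at the end.
    if fields[-1] == "":
        fields.pop()
    if len(fields) != expected:
        raise ValueError("expected %d fields, got %d: %r" % (expected, len(fields), line))
    return fields


# Hypothetical, shortened stand-ins for the real records, with a blank line between them.
sample = [
    "1307;owner A;KS1J/00080000/2;861;owner B;KS1J/00080990/2;\n",
    "\n",
    "1306;owner C;KW22222;861;owner D;KS1W/00080000/1;\n",
]
records = [r for r in map(split_record, sample) if r is not None]
for number, *rest in records:
    print(number, len(rest))
```

Raising on an unexpected field count makes a malformed line show up with its content, instead of as a bare unpacking error.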

why am I getting this error? twitter_list_dict.append(line[0]) IndexError: list index out of range

Traceback (most recent call last):
File "D:\myscripts\NewTermSentimentInference.py", line 88, in <module>
main()
File "D:\myscripts\NewTermSentimentInference.py", line 34, in main
tweets = tweet_dict(twitterData)
File "D:\myscripts\NewTermSentimentInference.py", line 15, in tweet_dict
twitter_list_dict.append(line[0])
IndexError: list index out of range
Code:
import csv
import sys
twitterData = sys.argv[0] # csv file
def tweet_dict(twitterData):
    ''' (file) -> list of dictionaries
    This method should take your csv file
    file and create a list of dictionaries.
    '''
    twitter_list_dict = []
    twitterfile = open(twitterData)
    twitterreader = csv.reader(twitterfile)
    for line in twitterreader:
        twitter_list_dict.append(line[1]) # the line marked as raising the IndexError
    return twitter_list_dict
def sentiment_dict(sentimentData):
    ''' (file) -> dictionary
    This method should take your sentiment file
    and create a dictionary in the form {word: value}
    '''
    afinnfile = open(sentimentData)
    scores = {} # initialize an empty dictionary
    for line in afinnfile:
        term, score = line.split("\t") # The file is tab-delimited. "\t" means "tab character"
        scores[term] = float(score) # Convert the score to a float.
    return scores # Return the {term: score} dictionary
def main():
    tweets = tweet_dict(twitterData)
    sentiment = sentiment_dict("AFINN-111.txt")
    accum_term = dict()
    """Calculating sentiment scores for the whole tweet with unknown terms set to score of zero
    See -> DeriveTweetSentimentEasy
    """
    for index in range(len(tweets)):
        tweet_word = tweets[index].split()
        sent_score = 0 # sentiment of the sentence
        term_count = {}
        term_list = []
I am trying to do sentiment analysis, but I get an IndexError at the marked line in tweet_dict, the method that builds the list from a CSV file of tweets collected from Twitter. Can someone please help me with it?
Check the input CSV for empty lines.
The 'list index out of range' error will get thrown in twitter_list_dict.append(line[0]) if line is an empty list, and hence has no first element to reference. The most likely culprit: one or more of the lines in the CSV is empty, which will lead csv.reader to return an empty list for line.
If empty lines in the CSV are expected, you can skip them by adding a check to ensure the line isn't empty:
    for line in twitterreader:
        if line: # Check if line is non-empty
            twitter_list_dict.append(line[0])
    return twitter_list_dict
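The behaviour described above is easy to reproduce without the real tweet file. The sketch below uses io.StringIO as a stand-in for the CSV; the helper name first_column and the sample rows are assumptions for illustration.

```python
import csv
import io


def first_column(csv_file):
    """Collect the first field of every non-empty CSV row."""
    reader = csv.reader(csv_file)
    # csv.reader yields [] for a blank line; `if row` filters those out
    return [row[0] for row in reader if row]


# Hypothetical data with a blank line in the middle.
sample = io.StringIO("first tweet,en\n\nsecond tweet,en\n")
print(first_column(sample))
```

Separately, note that sys.argv[0] is the path of the script itself; the first command-line argument is sys.argv[1], so that is worth checking if the CSV path is meant to come from the command line.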

Reading a file and storing contents into a dictionary - Python

I'm trying to store contents of a file into a dictionary and I want to return a value when I call its key. Each line of the file has two items (acronyms and corresponding phrases) that are separated by commas, and there are 585 lines. I want to store the acronyms on the left of the comma to the key, and the phrases on the right of the comma to the value. Here's what I have:
def read_file(filename):
    infile = open(filename, 'r')
    for line in infile:
        line = line.strip() #remove newline character at end of each line
        phrase = line.split(',')
        newDict = {'phrase[0]':'phrase[1]'}
    infile.close()
And here's what I get when I try to look up the values:
>>> read_file('acronyms.csv')
>>> acronyms=read_file('acronyms.csv')
>>> acronyms['ABT']
Traceback (most recent call last):
File "<pyshell#65>", line 1, in <module>
acronyms['ABT']
TypeError: 'NoneType' object is not subscriptable
>>>
If I add return newDict to the end of the body of the function, it obviously just returns {'phrase[0]':'phrase[1]'} when I call read_file('acronyms.csv'). I've also tried {phrase[0]:phrase[1]} (no single quotation marks) but that returns the same error. Thanks for any help.
def read_acronym_meanings(path:str):
    with open(path) as f:
        acronyms = dict(l.strip().split(',') for l in f)
    return acronyms
First off, you are creating a new dictionary at every iteration of the loop. Instead, create one dictionary before the loop and add an element for each line. Second, 'phrase[0]' includes the quotation marks, which make it a literal string instead of a reference to the phrase variable that you just created.
Also, try using the with keyword so that you don't have to explicitly close the file later.
def read(filename):
    newDict = {}
    with open(filename, 'r') as infile:
        for line in infile:
            line = line.strip() #remove newline character at end of each line
            phrase = line.split(',')
            newDict[phrase[0]] = phrase[1]
    return newDict
def read_file(filename):
    infile = open(filename, 'r')
    newDict = {}
    for line in infile:
        line = line.strip() #remove newline character at end of each line
        phrase = line.split(',', 1) # split max of one time
        newDict.update( {phrase[0]:phrase[1]})
    infile.close()
    return newDict
Your original creates a new dictionary every iteration of the loop.
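The two mistakes can be seen side by side in a small sketch. The function names build_wrong and build_right and the acronym meanings in the sample are made up for illustration; only the acronym ABT appears in the question.

```python
def build_wrong(lines):
    """Mirrors the original loop, with a return added."""
    for line in lines:
        phrase = line.strip().split(",", 1)
        # Quoted names are literal strings, and the dict is rebuilt on every pass.
        new_dict = {"phrase[0]": "phrase[1]"}
    return new_dict


def build_right(lines):
    """One dictionary, created once and filled inside the loop."""
    new_dict = {}
    for line in lines:
        phrase = line.strip().split(",", 1)
        new_dict[phrase[0]] = phrase[1]
    return new_dict


sample = ["ABT,about\n", "BRB,be right back\n"]  # hypothetical file contents
print(build_wrong(sample))
print(build_right(sample))
```

The TypeError in the question comes from a third issue: a function without a return statement returns None, and None cannot be indexed with ['ABT'].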

storing length of query sequence as value to the key that is sequence header: IndexError: list index out of range

I have written a python script to extract unaligned region from a blast alignment output. I made a dictionary which has a header(sequence identifier) as a key and length of the sequence as its value. The file I am dealing with is a csv file. Here is the piece of my code :
my_dict = {}
for line in fhand:
    line = line.rstrip()
    line = line.split(",")
    if line[0] == "Query":
        continue #Skipping the header of our csv file
    my_dict[line[0]] = int(line[2]) #Storing the sequence identifier as key and the length of sequence as its value.
Error:
Traceback (most recent call last):
File "Pf_extract_mapper.py", line 31, in <module>
my_dict[line[0]] = int(line[2]) #Storing the sequence identifier as key and the length of sequence as its value.
IndexError: list index out of range
Sample file I am working on:
Query,Hit ID,Query_length,Hit Def,E-Value,query_start,query_end,sbjct_start,sbjct_end
Seq1,seq11111,100,control1,2e-21,10,35,15,31
Seq1,seq22222,100,control2,34e-34,25,40,27,38
Seq1,seq33333,100,control3,25e-27,58,84,54,80
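from pprint import pprint as pp  # assuming pp refers to pprint.pprint
d = {}
with open('data.csv', 'r') as f:
    next(f, None)  # skip the header row
    for line in f:
        line = line.split(',')
        d[line[0]] = int(line[2])
pp(d)
Output:
{'Seq1': 100}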
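The answer above does not say what caused the IndexError. One plausible cause, stated here as an assumption, is a blank or short line somewhere in the real file, since such a line splits into fewer than three fields. A sketch that reads the columns by header name with csv.DictReader and skips incomplete rows (the function name query_lengths is made up for illustration):

```python
import csv
import io


def query_lengths(csv_file):
    """Map each Query identifier to its Query_length, skipping incomplete rows."""
    lengths = {}
    for row in csv.DictReader(csv_file):
        query = row.get("Query")
        length = row.get("Query_length")
        if not query or not length:   # short row: missing fields come back as None
            continue
        lengths[query] = int(length)
    return lengths


# The sample rows from the question, with a blank line added for the demonstration.
sample = io.StringIO(
    "Query,Hit ID,Query_length,Hit Def,E-Value,query_start,query_end,sbjct_start,sbjct_end\n"
    "Seq1,seq11111,100,control1,2e-21,10,35,15,31\n"
    "\n"
    "Seq1,seq22222,100,control2,34e-34,25,40,27,38\n"
)
print(query_lengths(sample))
```

Because every sample row has the same Query value, the result holds a single key, which matches the output shown above.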

Reading textfile into dictionary

My code:
fd = open('C:\Python27\\alu.txt', 'r')
D = dict(line.split("\n") for line in fd)
It shows the following error
Traceback (most recent call last):
File "C:\Users\ram\Desktop\rest_enz3.py", line 8, in <module>
D = dict(line.split("\n") for line in fd)
ValueError: dictionary update sequence element #69 has length 1; 2 is required
The only newline you'll ever find in line will be the one at the very end, so line.split("\n") will return a list of length 1. Perhaps you meant to use a different delimiter. If your file looks like...
lorem:ipsum
dolor:sit
Then you should do
D=dict(line.strip().split(":") for line in fd)
As Kevin above points out, line.split("\n") is a bit odd, but maybe the file is just a list of dictionary keys?
Regardless, the error you get implies that line.split("\n") returned just a single element for one of the lines (in other words, that line is missing the trailing newline). For example:
"Key1\n".split("\n") returns ["Key1", ""]
while
"Key1".split("\n") returns ["Key1"]
dict([["key1", ""],["key2", ""]])
is fine, while
dict([["key1", ""],["key2"]])
raises the error you quote.
The fix may be as simple as editing the file in question and adding a newline at the end of the file.
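The claim above can be checked in a few lines; the variable names are only for illustration.

```python
with_newline = "Key1\n".split("\n")     # line that ends with a newline
without_newline = "Key1".split("\n")    # last line of a file with no final newline
print(with_newline)
print(without_newline)

# A two-element list is a valid (key, value) pair for dict().
ok = dict([with_newline])
print(ok)

# A one-element list is not, and dict() reports which element was too short.
try:
    dict([with_newline, without_newline])
except ValueError as exc:
    message = str(exc)
    print(message)
```

The element number in the message is zero-based, so "element #69" in the question points at the 70th line of the file.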
File example: alu.txt
GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGA
TCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAA
AAATACAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGC
TGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCC
Reading the file:
with open('C:\\Python27\\alu.txt', 'r') as fp:
    dna = {'Key %i' % i: j.strip() for i, j in enumerate(fp.readlines(), 1)}
for key, value in dna.iteritems():
    print key, ':', value
Output (the order may vary, since a Python 2 dict is unordered):
Key 1 : GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGA
Key 2 : TCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAA
Key 3 : AAATACAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGC
If you're using Python 3, change the loop to this:
for key, value in dna.items():
    print(key, ':', value)
