Reading textfile into dictionary - python

My code:
fd = open('C:\Python27\\alu.txt', 'r')
D = dict(line.split("\n") for line in fd)
It shows the following error:
Traceback (most recent call last):
  File "C:\Users\ram\Desktop\rest_enz3.py", line 8, in <module>
    D = dict(line.split("\n") for line in fd)
ValueError: dictionary update sequence element #69 has length 1; 2 is required

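The only newline you'll ever find in line will be the one at the very end, so line.split("\n") can never give you a real key and value: you get the line's text plus an empty string, or a list of length 1 when the line has no trailing newline. Perhaps you meant to use a different delimiter. If your file looks like...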
lorem:ipsum
dolor:sit
Then you should do
D=dict(line.strip().split(":") for line in fd)

As Kevin above points out, line.split("\n") is a bit odd, but maybe the file is just a list of dictionary keys?
Regardless, the error you get implies that the line.split("\n") returns just a single element (in other words, the line is missing the trailing newline). For example:
"Key1\n".split("\n") returns ["Key1", ""]
while
"Key1".split("\n") returns ["Key1"]
dict([["key1", ""],["key2", ""]])
is fine, while
dict([["key1", ""],["key2"]])
raises the error you quote.
It may be as simple as editing the file in question and adding a new line at the end of the file.
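The behaviour described above can be checked directly. This is a minimal sketch that uses only the built-in str.split and dict; the sample strings are made up for illustration.

```python
# A line that still has its trailing newline splits into two pieces.
with_newline = "Key1\n".split("\n")
# A line without a trailing newline (typically the last one) gives one piece.
without_newline = "Key1".split("\n")

print(with_newline)     # ['Key1', '']
print(without_newline)  # ['Key1']

# dict() needs every element to be a pair, so the one-element list fails.
try:
    dict([with_newline, without_newline])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```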

File example: alu.txt
GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGA
TCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAA
AAATACAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGC
TGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCCGAGATCGCGCC
Reading file
with open('C:\\Python27\\alu.txt', 'r') as fp:
    # start counting at 1 so the keys match the output below
    dna = {'Key %i' % i: j.strip() for i, j in enumerate(fp, 1)}
for key, value in dna.iteritems():
    print key, ':', value
Key 1 : GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGA
Key 2 : TCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAA
Key 3 : AAATACAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTCGGGAGGC
If you're using Python 3, change the loop to:
for key, value in dna.items():
    print(key, ':', value)
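For reference, here is a self-contained Python 3 sketch of the same idea. It assumes an in-memory io.StringIO object as a stand-in for the real alu.txt file, holding two lines from the example, and numbers the keys from 1.

```python
import io

# Stand-in for open('alu.txt'): two sequence lines from the example file.
sample = io.StringIO(
    "GGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGA\n"
    "TCACGAGGTCAGGAGATCGAGACCATCCTGGCTAACACGGTGAAACCCCGTCTCTACTAA\n"
)

# Number the lines from 1 and strip the trailing newline from each.
dna = {"Key %i" % i: line.strip() for i, line in enumerate(sample, 1)}

for key, value in dna.items():
    print(key, ":", value)
```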

Related

string.replace function keep showing 'list index out of range error'

Here is what I want
read large JSON file(4.8MB)
replace specific words to new words
make new file then write line to new file
Here is my code.
def replaceString(input, replace_list):  # read one line, and in that line, replace string in replace_list[i][0] with string in replace_list[i][1]
    new_string = input
    for i in range(len(replace_list)):
        new_string = new_string.replace(replace_list[i][0], replace_list[i][1])
    return new_string

input_f = open("ko_ko.json", 'r')  # very long file
output_f = open("new_ko_ko.json", 'w')
replace_list = [["`","'"],["&#x27"],[" !","!"],[" ?","?"]]  # [ ["string to replace", "replacement string"], ... ]
input_line = input_f.readlines()[0]
new_lines = replaceString(input_line, replace_list)
output_f.write(new_lines)
When I run the program, it keeps showing the following error:
Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/jaegu/PycharmProjects/newJSON/makeJSON.py
Traceback (most recent call last):
File "/Users/jaegu/PycharmProjects/newJSON/makeJSON.py", line 13, in <module>
new_lines = replaceString(input_line,replace_list)
File "/Users/jaegu/PycharmProjects/newJSON/makeJSON.py", line 4, in replaceString
new_string = new_string.replace(replace_list[i][0], replace_list[i][1])
IndexError: list index out of range
One of your replace_list elements is a list with just one element: ["&#x27"]. There is no second element in that list, so you get an exception. Presumably you wanted that to be ["&#x27", "'"].
Some other remarks:
Use tuples for your pairs; the pairs don't need to be mutable, using tuples lets you catch bugs earlier.
Don't use range() when you can loop directly over your pairs:
for old, new in replace_list:
    new_string = new_string.replace(old, new)
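Putting both remarks together gives something like the sketch below. The function name replace_all and the sample input string are made up for illustration; the pairs are the ones from the question with the missing replacement filled in as assumed above.

```python
def replace_all(text, replacements):
    """Apply each (old, new) replacement to text, in order."""
    # Unpacking in the for loop raises ValueError for any entry
    # that is not exactly a pair, which points straight at the bad entry.
    for old, new in replacements:
        text = text.replace(old, new)
    return text


# Pairs stored as tuples, since they never need to change.
replacements = [("`", "'"), ("&#x27", "'"), (" !", "!"), (" ?", "?")]

result = replace_all("Hello ` world !", replacements)
print(result)
```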

Python - Error while trying to split line of text

I am having an issue while trying to split a line of text I get from a .txt file. It is quite a big file, but I will paste only 2 lines, with the original text:
1307;Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode;KS1J/00080000/2;861;Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode;KS1J/00080990/2;
1306;Własność: udział 1/1<>Jan Nowak<>im. rodz.: Tomasz_ Maria<>Somewhere 2<>30-200 ZipCode;KW22222;861;Własność: udział 1/1<>GMINA TARNOWIEC<><>Tarnowiec 211<>30-200 ZipCode;KS1W/00080000/1;
The data I get from this file will be used to create reports, and _ and <> will be used for further formatting. I want to have the line split on ;
The problem is that I get an error with both methods of splitting that I tried.
First, the basic .split(';'):
dane = open('dane_protokoly.txt', 'r')
for line in dane:
    a,b,c,d,e,f,g = line.split(';')
    print(a)
    print(b)
    print(c)
    print(d)
    print(e)
    print(f)
    print(g)
I get an error after the first iteration of the loop has printed:
Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\Nowy folder\costam.py", line 36, in <module>
    a,b,c,d,e,f,g = line.split(';')
ValueError: not enough values to unpack (expected 7, got 1)
The same happens when creating lists from this file (a list looks like: ['1307', 'Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode', 'KS1J/00080000/2', '861', 'Własność: udział 1/1<>GMINA TARNOWIEC<><> 211<>30-200 ZipCode', 'KS1J/00080990/2', '']):
dane = plik('dane_protokoly.txt')
for line in dane:
    a = line[0]
    b = line[1]
    c = line[2]
    d = line[3]
    e = line[4]
    f = line[5]
    g = line[6]
    print(str(a))
    print(str(b))
    print(str(c))
    print(str(d))
    print(str(e))
    print(str(f))
Here too the error comes after the first line has printed properly:
Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\Nowy folder\costam.py", line 22, in <module>
    b = line[1]
IndexError: list index out of range
Any idea why I am getting such errors?
Sometimes line.split(';') does not give 7 values to unpack into (a, b, c, ...), so it is better to iterate over the result like this:
lst = line.split(';')
for item in lst:
    print(item)
There are empty lines in between the data lines, and those are what cause the problem: an empty line splits into a single element.
Unpacking into a fixed number of names, as you did, is bad practice when the number of fields can vary.
Change your code like this:
for line in open('dane_protokoly.txt').read().split('\n'):
    lst = line.split(';')
    for item in lst:
        print(item)
This version does not break on the empty lines in between.
As Rahul K P mentioned, the problems are the "empty" lines in between your lines with the data. You should skip them when trying to split your data.
Maybe use this as a starting point:
with open(r"dane_protokoly.txt", "r") as data_file:
    for line in data_file:
        # skip rows which only contain a newline special char
        if len(line) > 1:
            data_row = line.strip().split(";")
            print(data_row)
Your second strategy most likely fails for the same reason. Assuming plik() splits each line on ;, an empty line gives a list with at most one element (the whole, empty line), so there is no line[1] or line[2] and you get a list index out of range error.
I hope this helps and solves your problem.
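Building on that starting point, the sketch below skips blank lines and also checks the field count before unpacking. It assumes an in-memory io.StringIO stand-in for dane_protokoly.txt with shortened, made-up field values; the expected count of 7 comes from the six values plus the empty string after the final ;.

```python
import io

# Stand-in for open('dane_protokoly.txt'); the blank line mimics the
# empty rows that break the unpacking. Field values here are made up.
data_file = io.StringIO(
    "1307;GMINA;KS1J/1;861;GMINA;KS1J/2;\n"
    "\n"
    "1306;Jan;KW2;861;GMINA;KS1W/1;\n"
)

rows = []
for line in data_file:
    line = line.strip()
    if not line:          # skip rows that held only a newline
        continue
    fields = line.split(";")
    if len(fields) != 7:  # 6 values plus the empty string after the last ';'
        continue
    rows.append(fields)

# Unpacking is safe now, because every kept row has exactly 7 fields.
for a, b, c, d, e, f, g in rows:
    print(a, c)
```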

too many values to unpack (expected 2)

def read_dict(file_name):
    f = open(file_name, 'r')
    dict_rap = {}
    for key, val in csv.reader(f):
        dict_rap[key] = str(val)
    f.close()
    return dict_rap

test_dict = {'wassup':['Hi','Hello'],'get up through':['to leave','to exit'],
    'its on with you':['good bye','have a nice day'],'bet':['ok','alright'],'ight':['ok','yes'],
    'whip':['car','vechile'],'lit':['fun','festive'],'guap':['money','currency'],'finesse':['to get desired results by anymeans','to trick someone'],
    'jugg':['how you makemoney','modern term for hustle'],'1111':['www'] }
Traceback (most recent call last):
  File "C:\Users\C2C\Desktop\rosetta_stone.py", line 97, in <module>
    reformed_dict = read_dict(file_name)#,test_dict)
  File "C:\Users\C2C\Desktop\rosetta_stone.py", line 63, in read_dict
    for key, val in csv.reader(f):
ValueError: too many values to unpack (expected 2)
From csv documentation ...
In [2]: csv.reader??
Docstring:
    csv_reader = reader(iterable [, dialect='excel']
                            [optional keyword args])
        for row in csv_reader:
            process(row)
......
......
The returned object is an iterator. Each iteration returns a row
I guess it's pretty self-explanatory...
Each row that csv.reader yields is a list of the fields on one line, not a ready-made key/value pair. So your dict-building code should go inside a loop over those rows and handle however many fields each row contains.
I'm afraid that csv.reader(f) does not return what you are expecting it to return. I don't know exactly what your .csv file looks like, but I doubt that each row contains exactly the two values that you are trying to put into the dictionary.
Assuming that the first 3 lines of your .csv look something like this:
wassup,hi,hello
get up through,to leave,to exit
its on you,good bye,have a nice day
a better way to read the .csv and iterate over each line might be:
...
my_csv = csv.reader(f)
for row in my_csv:
    # row is a list with all the values you have in one line in the .csv
    if len(row) > 1:
        key = row[0]      # for the 1st line the value is the string: 'wassup'
        values = row[1:]  # here for the first line you get the list: ['hi', 'hello']
        # ... and so on
It is saying that csv.reader(f) is yielding rows with more than two items, which you are trying to unpack into just two names (key and val).
Presuming that you are using the standard csv module, each row is a list with one item per delimited field. If you expect the input to have exactly two items per row, then perhaps you need to specify a different delimiter. For example, if your input uses semicolons instead of commas:
csv.reader(f, delimiter=";")
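One way to handle rows with a variable number of fields is to take the first field as the key and keep the rest as a list. This is a sketch only: it assumes the file layout guessed at above (key first, then any number of meanings) and uses an in-memory io.StringIO stand-in for the real file.

```python
import csv
import io

# Stand-in for the real file; the contents are the assumed layout from above.
f = io.StringIO(
    "wassup,hi,hello\n"
    "\n"
    "get up through,to leave,to exit\n"
)

dict_rap = {}
for row in csv.reader(f):
    if not row:            # csv.reader yields [] for blank lines
        continue
    key, *values = row     # first field is the key, the rest are its values
    dict_rap[key] = values

print(dict_rap)
```

Star-unpacking (key, *values) requires Python 3; on Python 2 the row[0] and row[1:] slicing shown earlier does the same job.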

How to create a dictionary that contains key‐value pairs from a text file

I have a text file (one.txt) that contains an arbitrary number of key‐value pairs (where the key and value are separated by a colon – e.g., x:17). Here are some (minus the numbers):
mattis:turpis
Aliquam:adipiscing
nonummy:ligula
Duis:ultricies
nonummy:pretium
urna:dolor
odio:mauris
lectus:per
quam:ridiculus
tellus:nonummy
consequat:metus
I need to open the file and create a dictionary that contains all of the key‐value pairs.
So far I have opened the file with
file = []
with open('one.txt', 'r') as _:
    for line in _:
        line = line.strip()
        if line:
            file.append(line)
I opened it this way to get rid of newline characters and the last blank line in the text file. This gives me a list of the key-value pairs within Python.
I am not sure how to create a dictionary from that list of key-value pairs.
Everything I have tried gives me an error. Some say something along the lines of
ValueError: dictionary update sequence element #0 has length 1; 2 is required
Use str.split():
with open('one.txt') as f:
    d = dict(l.strip().split(':') for l in f)
split() lets you specify the separator : to split each line into a key string and a value string. You can then use them to populate a dictionary, here called mydict:
mydict = {}
with open('one.txt', 'r') as _:
    for line in _:
        line = line.strip()
        if line:
            key, value = line.split(':')
            mydict[key] = value
print(mydict)
output:
{'mattis': 'turpis', 'lectus': 'per', 'tellus': 'nonummy', 'quam': 'ridiculus', 'Duis': 'ultricies', 'consequat': 'metus', 'nonummy': 'pretium', 'odio': 'mauris', 'urna': 'dolor', 'Aliquam': 'adipiscing'}
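A slightly more defensive variant is sketched below, assuming an in-memory io.StringIO stand-in for one.txt with a few of the pairs from the question. It uses str.partition, which always returns three parts, so blank or malformed lines can be skipped without raising an error. Note that a repeated key such as nonummy keeps only the last value seen.

```python
import io

# Stand-in for open('one.txt'); note the repeated key and the blank last line.
f = io.StringIO("mattis:turpis\nnonummy:ligula\nnonummy:pretium\n\n")

d = {}
for line in f:
    key, sep, value = line.strip().partition(":")
    if not sep:        # no colon found: blank or malformed line, skip it
        continue
    d[key] = value     # a repeated key keeps the last value seen

print(d)
```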

ValueError: too many values to unpack (expected 2) errors

My code:
def dictest():
    global my_glossary
    # read all lines of the file
    inFile = open("glossary.txt", "r")
    inText = inFile.read()
    inFile.close()
    my_glossary = {}
    # iterate through all lines, after removing the line-end character(s)
    for line in inText.splitlines():
        if line != '':  # ignore empty lines
            (key,value) = line.split(",")
            my_glossary[key] = value
    addToGlossary = entryNew.get()
    addToGlossaryDef = outputNew.get()
    my_glossary[addToGlossary] = addToGlossaryDef
    # list all the dictionary entries
    for k,v in my_glossary.items():
        print('key:', k, ', value:', v)
My Output:
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python34\lib\tkinter\__init__.py", line 1533, in __call__
    return self.func(*args)
  File "I:\School\Working Glossary.py", line 59, in MultiFunc
    dictest()
  File "I:\School\Working Glossary.py", line 14, in dictest
    (key,value) = line.split(",")
ValueError: too many values to unpack (expected 2)
I am trying to build a keyword glossary that uses a text file as storage. I keep running into this error, which stops the program from working.
My text file contents:
bug, this is a test
test, this is another test
testing,testing
123,12354
I think you want this:
>>> line = "hello,world,foo,bar"
>>> (key, value) = line.split(",", 1)
>>> key
'hello'
>>> value
'world,foo,bar'
>>>
The change being: (key, value) = line.split(",", 1)
Passing 1 as the second argument to split tells split to stop after it comes across 1 comma, passing the rest of the line to value.
From the docs,
str.split([sep[, maxsplit]])
(...)
If maxsplit is given, at most maxsplit splits are done (thus, the list
will have at most maxsplit+1 elements).
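Applied to the glossary file, the maxsplit approach might look like the sketch below. The function name parse_glossary and the sample text are made up for illustration; the second sample entry deliberately contains extra commas in its definition.

```python
def parse_glossary(text):
    """Build a dict from 'key,definition' lines; the definition may contain commas."""
    glossary = {}
    for line in text.splitlines():
        if "," not in line:   # skips empty lines and lines with no separator
            continue
        # Split on the first comma only; everything after it is the definition.
        key, value = line.split(",", 1)
        glossary[key.strip()] = value.strip()
    return glossary


sample = "bug, this is a test\n\ntesting,one, two, three\n"
glossary = parse_glossary(sample)
print(glossary)
```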
