Importing nested list into a text file - python

I've been working on a problem which I realise I am probably approaching the wrong way but am now confused and out of ideas. Any research that I have done has left me more confused, and thus I have come for help.
I have a nested list:
[['# Name Surname', 'Age', 'Class', 'Score', '\n'], ['name', '9', 'B',
'N/A', '\n'], ['name1', '9', 'B', 'N/A', '\n'], ['name2', '8', 'B',
'N/A', '\n'], ['name3', '9', 'B', 'N/A', '\n'], ['name4', '8', 'B',
'N/A', '']]
I am trying to write this list to a text file in the correct layout. To do this I flattened the list and then joined it together with ','.
The problem with this is that because the '\n' is being stored in the list itself, it adds a comma after this, which ends up turning this:
Name Surname,Age,Class,Score,
Name,9,B,N/A,
Name1,9,B,N/A,
Name2,8,B,N/A,
Name3,9,B,N/A,
Name4,8,B,N/A,
into:
Name Surname,Age,Class,Score,
,
,Name,9,B,N/A,
,Name1,9,B,N/A,
,Name2,8,B,N/A,
,Name3,9,B,N/A,
,Name4,8,B,N/A,
If I remove the \n from the code the formatting in the text file is all wrong due to no new lines.
Is there a better way to approach this or is there a quick fix to all my problems that I cannot see?
Thanks!
My code for reference:
def scorestore(score):
    user[accountLocation][3] = score
    file = ("classdata",schclass,".txt")
    file = "".join(file)
    flattened = [val for sublist in user for val in sublist]
    flatstring = ','.join(str(v) for v in flattened)
    accountlist = open(file,"w")
    accountlist.write(flatstring)
    accountlist.close()

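I'm not sure which list is the one in your post (sublist?), but when you flatten it, just discard the "\n" entries. If sublist is the outer list, this drops every row that is exactly ["\n"]:
flattened = [x for x in sublist if x != ["\n"]]
If the "\n" strings sit inside the rows instead, compare each value against "\n" rather than ["\n"].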

The easiest way would probably be to remove the newlines from the sublists as you get them, then print each sublist one at a time. This would look something like:
for sublist in user:
    print(",".join(val for val in sublist if not val.isspace()), file=accountlist)
This will fail on the 0 in your list, however, because 0 is an integer and has no isspace() method. I'm not sure if you intend to handle that, or if it's extraneous. If you do need to handle it, then you'll have to change the generator expression to str(val) for val in sublist if not str(val).isspace().
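A minimal sketch of the row-by-row approach described above, assuming Python 3. The helper name rows_to_lines and the sample rows are made up for illustration. Cells that are empty or contain only whitespace (such as '\n') are dropped, and everything else goes through str() so a stray integer does not break the join.

```python
def rows_to_lines(rows):
    """Turn a list of rows into comma-separated lines (hypothetical helper)."""
    lines = []
    for row in rows:
        # str() lets non-string values such as 0 through; strip() is empty
        # for '' and for whitespace-only cells like '\n', so those are skipped.
        cells = [str(val) for val in row if str(val).strip()]
        if cells:  # skip rows that held nothing but a newline
            lines.append(",".join(cells))
    return lines


# Illustrative data only, shaped like the list in the question.
sample = [
    ["# Name Surname", "Age", "Class", "Score", "\n"],
    ["\n"],
    ["name", "9", "B", "N/A", "\n"],
    ["name4", "8", "B", 0],
]

result = rows_to_lines(sample)
for line in result:
    print(line)
```

To write the lines to a file you could print each one with file=accountlist, as in the loop above.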

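Instead of making one string, how about writing lines? Use something like this:
list_of_list = [[...]]
lines = [','.join(line).strip() for line in list_of_list]
lines = [line for line in lines if line]
# strip() removed the newlines and writelines() would not add them back,
# so join the lines with '\n' explicitly
with open(file, 'w') as f:
    f.write('\n'.join(lines) + '\n')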

Use the csv module to make it easier:
import csv
data = [
    ['# Name Surname', 'Age', 'Class', 'Score','\n'],
    ['\n'],
    ['Name', '9', 'B', 'N/A','\n'],
    ['Name1', '9', 'B', 'N/A','\n'],
    ['Name2', '8', 'B', 'N/A','\n'],
    ['Name3', '9', 'B', 'N/A','\n'],
    ['Name4', '8', 'B', 0]
]
# Remove all the ending new lines
data = [row[:-1] if row[-1] == '\n' else row for row in data]
# Write to file (Python 3: text mode with newline=''; on Python 2 use 'wb' instead)
with open('write_sublists.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(data)
Discussion
Your data is irregular: some rows end with a newline and some don't. Some rows contain only strings and some contain mixed data types. The first step is to normalize them by removing all trailing newlines. The csv module can take care of mixed data types just fine.
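As a rough sketch of that normalize-then-write idea, assuming Python 3: the helper name normalize is hypothetical, the rows are sample data, and an in-memory io.StringIO buffer stands in for the real file so the result can be inspected directly.

```python
import csv
import io


def normalize(rows):
    """Drop a trailing newline or empty cell from each row; skip empty rows."""
    cleaned = []
    for row in rows:
        if row and row[-1] in ("\n", ""):
            row = row[:-1]
        if row:
            cleaned.append(row)
    return cleaned


# Illustrative data only.
data = [
    ["# Name Surname", "Age", "Class", "Score", "\n"],
    ["\n"],
    ["Name", "9", "B", "N/A", "\n"],
    ["Name4", "8", "B", 0],
]

buffer = io.StringIO()  # stands in for open(path, "w", newline="")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(normalize(data))
text = buffer.getvalue()
print(text)
```

Unlike the list comprehension above, this version also skips the row that contained only a newline, so no blank line ends up in the output.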

Related

how to access python lists made with a for loop(reading data from a file)

I have made several lists using a for loop by splitting lines from a text file. However, I do not know how to access the individual elements that these lists hold. I would prefer it if someone showed me how to combine these lists so the elements inside them are easier to access.
file2 = open('text_file.txt','r')
for line in file2:
    words = line.split()
    words.remove(words[0])
    print(words)
Right now, that prints out this:
['CP', '0.50', '96']
['HR', '1.00', '93']
['HR', '1.00', '85']
['HR', '1.00', '99']
['CP', '0.75', '100']
['CP', '1.00', '94']
['HR', '1.00', '88']
['CP', '1.00', '92']
As you can see, it shows 8 separate lists. When I try to access the data stored in these lists, using
words[0]
words[1]
or
words[2]
inside of the for loop, i get many values instead of just 1. These lists have no names, so how can I access each individual number or string in them? For example, when I try:
print(words[0])
i get the following:
CP
HR
HR
HR
CP
CP
HR
CP
however, when i do:
print(words[0])
i want to have this as the result:
CP
"inside of the for loop, i get many values instead of just 1"
Of course you will: it is a loop, and the loop body runs once per line, hence the many values.
What you need is to create a list of lists where you store your words, and then access them from outside the loop:
lines = []
with open('text_file.txt', 'r') as file2:
    for line in file2:
        words = line.split()
        words.remove(words[0])
        lines.append(words)
print(lines[0][0]) # to print CP from the first line in the document
Since it appears that you only want to read the 1st line of the file, use
line=file2.readline()
words = line.split()
print(words[1])
Try accessing the lists this way:
words = [['CP', '0.50', '96'],
         ['HR', '1.00', '93'],
         ['HR', '1.00', '85'],
         ['HR', '1.00', '99'],
         ['CP', '0.75', '100'],
         ['CP', '1.00', '94'],
         ['HR', '1.00', '88'],
         ['CP', '1.00', '92']]
[x[0] for x in words]  # the first element of every row
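A small sketch that ties these answers together, assuming Python 3. The file contents and the first-column labels (row1, row2, ...) are invented, and io.StringIO stands in for open('text_file.txt'). Once every row is stored in one outer list, a single cell is rows[i][j] and a whole column is a list comprehension.

```python
import io

# Stand-in for open('text_file.txt'); the first column is an invented label.
fake_file = io.StringIO(
    "row1 CP 0.50 96\n"
    "row2 HR 1.00 93\n"
    "row3 CP 0.75 100\n"
)

rows = []
for line in fake_file:
    words = line.split()
    if words:                   # ignore blank lines
        rows.append(words[1:])  # drop the first column, keep the rest

first_code = rows[0][0]                 # one cell
codes = [row[0] for row in rows]        # one column, as strings
scores = [int(row[2]) for row in rows]  # one column, converted to numbers

print(first_code)
print(codes)
print(scores)
```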

How to get two values in a list within a list?

I was trying to come up with a function that would read a .csv file so that I could get, for example, students' test grades. Example below:
NOME,G1,G2
Paulo,5.0,7.2
Pedro,6,4.1
Ana,3.3,2.3
Thereza,5,6.5
Roberto,7,5.2
Matheus,6.3,6.1
I managed to split the lines on the commas, but I end up with something like a matrix:
[['NOME', 'G1', 'G2'], ['Paulo', '5.0', '7.2'], ['Pedro', '6', '4.1'], ['Ana', '3.3', '2.3'], ['Thereza', '5', '6.5'], ['Roberto', '7', '5.2'], ['Matheus', '6.3', '6.1']]
How do I go from one list to the other and manage to get the grades within them?
This is the code I got so far:
def leArquivo(arquivo):
    arq = open(arquivo, 'r')
    conteudo = arq.read()
    arq.close()  # the parentheses are needed to actually call close
    return conteudo
def separaLinhas(conteudo):
    conteudo = conteudo.split('\n')
    conteudo1 = []
    for i in conteudo:
        conteudo1.append(i.split(','))
    return conteudo1
Where do I go from here?
A simple for loop will do it, i.e.:
notas = [['NOME', 'G1', 'G2'], ['Paulo', '5.0', '7.2'], ['Pedro', '6', '4.1'], ['Ana', '3.3', '2.3'], ['Thereza', '5', '6.5'], ['Roberto', '7', '5.2'], ['Matheus', '6.3', '6.1']]
for nota in notas[1:]:  ## [1:] skips the first item
    nome = nota[0]
    g1 = nota[1]
    g2 = nota[2]
    print("NOME:{} | G1: {} | G2: {}".format(nome, g1, g2))
PS: You may want to cast g1 and g2 to a float - float(nota[1])- if you need to perform math operations.
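A minimal sketch of that cast, assuming Python 3 and a shortened copy of the notas list. The averages dictionary is just one possible way to keep the results and is not part of the original answer.

```python
# Shortened copy of the data from the question.
notas = [
    ["NOME", "G1", "G2"],
    ["Paulo", "5.0", "7.2"],
    ["Pedro", "6", "4.1"],
]

averages = {}
for nome, g1, g2 in notas[1:]:  # [1:] skips the header row
    # float() turns the strings into numbers so they can be added
    averages[nome] = (float(g1) + float(g2)) / 2

for nome, media in averages.items():
    print("{}: {:.2f}".format(nome, media))
```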
Since you're working with a csv file, you may want to look at the csv module in Python. That module has many convenient options and forms in which the data can be read. The following is an example of reading data with csv.DictReader and using it:
import csv
# Read the data
with open('data.csv') as f:
    reader = csv.DictReader(f)
    data = [row for row in reader]
# Print it
for row in data:
    print(' '.join(['Nome:', row['NOME'], 'G1:', row['G1'], 'G2:', row['G2']]))
# Print only names and G2 grades as a table
print('- ' * 10)
print('NOME\t' + 'G2')
for row in data:
    print(row['NOME'] + '\t' + row['G2'])
# Average of G1 and G2 for each student
print('- ' * 10)
print('NOME\t' + 'Average')
for row in data:
    gpa = (float(row['G1']) + float(row['G2'])) / 2.0
    print(row['NOME'] + '\t' + str(gpa))
Here the data is read as a list of dictionaries - each element in the list is a dictionary representing a single row of your dataset. The dictionary keys are names of your headers (NOME, G1) and values are the corresponding values for that row.
That particular form can be useful in some situations. In the first part of the program the data is printed with keys and values, one row per line. The thing to note is that plain dictionaries were unordered before Python 3.7, so to guarantee a specific printing order on any version we need to traverse the dictionary "manually". I used join simply to demonstrate an alternative to format (which is actually more powerful) or to just typing everything with spaces in between. The second usage example prints names and the second grade as a table with proper headers. The third calculates the average and prints it as a table.
For me this approach proved very useful when dealing with datasets with several thousand entries that have many columns (headers) that I want to study separately (thus I don't mind them not being in order). To get an ordered dictionary you can use OrderedDict or consider other available data structures. I also use Python 2.7, but since you tagged the question as 3.X, the links point to 3.X documentation.

How to read files with one key but multiple values in Python

I am just beginning to work with Python and am a little confused. I understand the basic idea of a dictionary as (key, value). I am writing a program and want to read in a file, store it in a dictionary and then run different functions that reference the values. I am not sure if I should use a dictionary or lists. The basic layout of the file is:
A name followed by 12 different years, for example:
A 12 12 01 11 0 0 2 3 4 9 12 9
I am not sure what the best way to read in this information would be. I was thinking that a dictionary may be helpful if I had Name followed by Years, but I am not sure if I can map 12 years to one key name. I am really confused on how to do this. I can read in the file line by line, but not within the dictionary.
def readInFile():
    fileDict ={"Name ": "Years"}
    with open("names.txt", "r") as f:
        _ = next(f)
        for line in f:
            if line[1] in fileDict:
                fileDict[line[0]].append(line[1])
            else:
                fileDict[line[0]] = [line[1]]
My thinking with this code was to append each year to the value.
Please let me know if you have any recommendations.
Thank you!
You can do it in one line :)
print({line[0]:line[1:].split() for line in open('names.txt','r') if line[0]!='\n'})
output:
{'A': ['12', '12', '01', '11', '0', '0', '2', '3', '4', '9', '12', '9']}
The dict comprehension above is the same as:
dict_1={}
for line in open('names.txt', 'r'):
    if line[0]!='\n':
        dict_1[line[0]]=line[1:].split()
print(dict_1)
You can map 12 years to one key name. You seem to think that you need to choose between a dictionary and a list ("I am not sure if I should use a dictionary or lists.") But those are not alternatives. Your 12 years can usefully be represented as a list. Your names can be dictionary keys. So you need (as PM 2Ring suggests) a dictionary where the key is a name and the value is a list of years.
def readInFile():
    fileDict = {}
    with open(r"names.txt", "r") as f:
        for line in f:
            name, years = line.split(" ",1)
            fileDict[name] = years.split()
    return fileDict  # hand the dictionary back to the caller
There are two calls to the string method split(). The first splits the name from the years at the first space. (You can get the name using line[0], but only if the name is one character long, and that is unlikely to be useful with real data.) The second call to split() picks the years apart and puts them in a list.
The result from the one-line sample file will be the same as running this:
fileDict = {'A': ['12', '12', '01', '11', '0', '0', '2', '3', '4', '9', '12', '9']}
As you can see, these years are strings not integers: you may want to convert them.
Rather than doing:
_ = next(f)
to throw away your record count, consider doing
for line in f:
    if line.strip().isdigit():
        continue
instead. If you are using file's built-in iteration (for line in f) then it's generally best not to call next() on f yourself.
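Putting the pieces of this answer together, here is a minimal sketch, assuming Python 3. The helper name read_years and the two-line sample are hypothetical, and io.StringIO stands in for the real names.txt. It skips a digits-only record-count line, splits off the name, and converts the years to integers.

```python
import io


def read_years(lines):
    """Map each name to a list of integer years (hypothetical helper)."""
    result = {}
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and a lone record count such as "2".
        if not stripped or stripped.isdigit():
            continue
        name, *years = stripped.split()
        result[name] = [int(y) for y in years]
    return result


# Invented sample: a record count followed by two names.
fake_file = io.StringIO("2\nA 12 12 01 11\nB 13 13 02\n")
years_by_name = read_years(fake_file)
print(years_by_name)
```

With a real file you would pass the open file object instead, for example inside with open("names.txt") as f.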
It's also not clear to me why your code is doing this:
fileDict ={"Name ": "Years"}
This is a description of what you plan to put in the dictionary, but that is not how dictionaries work. They are not database tables with named columns. If you use a dictionary with key:name and value:list of years, that structure is implicit. The best you can do is describe it in a comment or a type annotation. Performing the assignment will result in this:
fileDict = {
    'A': ['12', '12', '01', '11', '0', '0', '2', '3', '4', '9', '12', '9'],
    'Name ': 'Years'
}
which mixes up description and data, and is probably not what you want, because your subsequent code is likely to expect a 12-list of years in the dictionary value, and if so it will choke on the string "Years".
Values in a dict can be anything, including a new dict, but in this case a list sounds good. Maybe something like this.
from io import StringIO # just to make it run without an actual file
the_file_content = 'A 12 12 01 11\nB 13 13 02'
fake_file = StringIO(the_file_content)
# for your real file, use this instead of fake_file:
# with open('names.txt', 'rt') as f:
#     lines = f.readlines()
lines = fake_file.readlines() # this goes away for you
lines = [l.strip().split(' ') for l in lines]
fileDict = {row[0]: row[1:] for row in lines}
# if you want the values to be actual numbers rather than strings
for k, v in fileDict.items():
    fileDict[k] = [int(i) for i in v]
In Python, many simple and even fairly complex transformations like this can be written in a single expression, such as a comprehension, rather than by looping with indices.

How do I turn a repeated list element with delimiters into a list?

I imported a CSV file that's basically a table with 5 headers and data sets with 5 elements.
With this code I turned that data into a list of individuals with 5 bits of information (list within a list):
import csv
readFile = open('Category.csv','r')
categoryList = []
for row in csv.reader(readFile):
    categoryList.append(row)
readFile.close()
Now I have a list of lists [[a,b,c,d,e],[a,b,c,d,e],[a,b,c,d,e]...]
However, element 2 (categoryList[i][2]), or 'c', in each inner list is a variable-length string of values separated by a delimiter (':'). How do I turn element 2 into a list itself? Basically making it look like this:
[[a,b,[1,2,3...],d,e],[a,b,[1,2,3...],d,e],[a,b,[1,2,3...],d,e]...]
I thought about looping through each inner list, finding element 2, and then using the .split(':') method to separate those values out.
The solution you suggested is feasible. You just don't need to do it after you read the file; you can do it while reading the input in the first place.
for row in csv.reader(readFile):
    row[2] = row[2].split(":") # Split element 2 of each row before appending
    categoryList.append(row)
Edit: I guess you know the purpose of the split function, so I will explain row[2].
Your data looks like [[a,b,c,d,e],[a,b,c,d,e],[a,b,c,d,e]...], which means each row is [a,b,c,d,e], so row[2] always corresponds to c. This way, you alter every c before you append its row, and the result is [[a,b,[1,2,3...],d,e],[a,b,[1,2,3...],d,e],[a,b,[1,2,3...],d,e]...].
I'm not entirely clear about your structure, but if c is a string separated by ':' then try the following (split already returns a list, so no extra list() call is needed):
c.split(':')
Let me know if it solved your problem
You can use a list comprehension on each row and split items containing ':' into a new sublist:
for row in csv.reader(readFile):
    new_row = [i.split(':') if ':' in i else i for i in row]
    categoryList.append(new_row)
This works if you also have other items in the row that you need to split on ':'.
Otherwise, you can directly split on the index if you only have one item containing ':':
for row in csv.reader(readFile):
    row[2] = row[2].split(':')
    categoryList.append(row)
Assume that you have a row like this:
row = ["foo", "bar", "1:2:3:4:5", "baz"]
To convert item [2] into a sublist, you can use
row[2] = row[2].split(":") # elements can be assigned to, yawn.
Now the row is ['foo', 'bar', ['1', '2', '3', '4', '5'], 'baz']
To graft the split items to the "top level" of the row, you can use
row[2:3] = row[2].split(":") # slices can be assigned to, too, yay!
Now the row is ['foo', 'bar', '1', '2', '3', '4', '5', 'baz']
This of course skips any defensive checks of the row data (can it at all be split?) that a real robust application should have.
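For illustration, a sketch of what such defensive checks might look like, assuming Python 3. The helper name split_field, its default arguments, and the choice to turn an empty cell into an empty list are all assumptions, not something the question specifies.

```python
def split_field(row, index=2, sep=":"):
    """Return a copy of row with row[index] split on sep (hypothetical helper)."""
    if index >= len(row):
        raise ValueError("row has no column {}: {!r}".format(index, row))
    new_row = list(row)  # leave the caller's list untouched
    cell = new_row[index]
    if isinstance(cell, str):
        # An empty cell becomes an empty list rather than [''].
        new_row[index] = cell.split(sep) if cell else []
    return new_row


nested = split_field(["foo", "bar", "1:2:3", "baz"])
empty = split_field(["foo", "bar", "", "baz"])
print(nested)
print(empty)
```

Raising on a short row is one option; depending on the data you might prefer to skip or log such rows instead.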

File Input and Output in Python, stripping of space and outputting to list

I am writing a program in Python to take 5 lines of input from a file 'var_input' and put them into a list, and then put each separate number into the list first or second.
I am just wondering what the best way would be to split each line on the space between the numbers and then append the numbers to the lists first or second. I am thinking about using Python's split method, but I am not sure how to do this.
Data in the input file would look like this:
18 24
10 5
101 567
234 90
107 4567
first should contain ['18', '10', '101', '234', '107']
second should contain ['24', '5', '567', '90', '4567']
Here's what I have so far:
first = []
second = []
file_input = open('var_input')
input_list = file_input.readlines()
Thank you so much; any help would be greatly appreciated.
You can do this with zip and split:
with open('var_input') as file_input:
    input_list = file_input.readlines()
first, second = zip(*[l.split() for l in input_list])
How it works: [l.split() for l in input_list] is a list comprehension, which applies the split method to each line (split with no arguments also discards the trailing newline, so no slicing is needed) to make it look like:
[["18", "24"], ["10", "5"], ["101", "567"], ["234", "90"], ["107", "4567"]]
zip is a function that then zips multiple lists together (when given with *, it treats each item in the input list as a separate argument). It "zips" it by taking the first item of each list, then the second item of each list (if you had three or more items on each line you'd end up with three or more output lists). The result will look like
[('18', '10', '101', '234', '107'), ('24', '5', '567', '90', '4567')]
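A small sketch of the same transposition, assuming Python 3. io.StringIO stands in for open('var_input') with a shortened sample, and the generator expression is only there to turn the tuples that zip produces into lists, as the question asks for.

```python
import io

# Shortened stand-in for the real 'var_input' file.
fake_file = io.StringIO("18 24\n10 5\n101 567\n")

# One [left, right] pair per non-blank line.
pairs = [line.split() for line in fake_file if line.strip()]

# zip(*pairs) yields one tuple per column; list() turns each into a list.
first, second = (list(column) for column in zip(*pairs))

print(first)
print(second)
```

The unpacking into first and second assumes every line holds exactly two values; a line with more or fewer would need separate handling.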
first = []
second = []
with open('var_input') as fp:
    for line in fp:
        temp = line.split()
        first.append(temp[0])
        second.append(temp[1])
This may look stupid, but it is simple and it works.
