generating list by reading from file - python

I want to generate a list of server addresses and credentials by reading from a file, as a single list, splitting on the newlines in the file.
The file is in this format:
login:username
pass:password
destPath:/directory/subdir/
ip:10.95.64.211
ip:10.95.64.215
ip:10.95.64.212
ip:10.95.64.219
ip:10.95.64.213
The output I want looks like this:
[['login:username', 'pass:password', 'destPath:/directory/subdirectory', 'ip:10.95.64.211;ip:10.95.64.215;ip:10.95.64.212;ip:10.95.64.219;ip:10.95.64.213']]
I tried this:
with open('file') as f:
    credentials = [x.strip().split('\n') for x in f.readlines()]
and this returns lists within a list:
[['login:username'], ['pass:password'], ['destPath:/directory/subdir/'], ['ip:10.95.64.211'], ['ip:10.95.64.215'], ['ip:10.95.64.212'], ['ip:10.95.64.219'], ['ip:10.95.64.213']]
I am new to Python. How can I split on the newline character and create a single list? Thank you in advance.

You could do it like this
with open('servers.dat') as f:
    L = [[line.strip() for line in f]]
print(L)
Output
[['login:username', 'pass:password', 'destPath:/directory/subdir/', 'ip:10.95.64.211', 'ip:10.95.64.215', 'ip:10.95.64.212', 'ip:10.95.64.219', 'ip:10.95.64.213']]
Just use a list comprehension to read the lines. You don't need to split on \n as the regular file iterator reads line by line. The double list is a bit unconventional, just remove the outer [] if you decide you don't want it.
I just noticed you wanted the list of IP addresses joined in one string. It's not clear, as it's off the screen in the question and you make no attempt to do it in your own code.
To do that, read the first three lines individually using next, then join the remaining lines using ; as your delimiter.
def reader(f):
    yield next(f)
    yield next(f)
    yield next(f)
    yield ';'.join(ip.strip() for ip in f)
with open('servers.dat') as f:
    L2 = [[line.strip() for line in reader(f)]]
For which the output is
[['login:username', 'pass:password', 'destPath:/directory/subdir/', 'ip:10.95.64.211;ip:10.95.64.215;ip:10.95.64.212;ip:10.95.64.219;ip:10.95.64.213']]
It does not match your expected output exactly as there is a typo 'destPath:/directory/subdirectory' instead of 'destPath:/directory/subdir' from the data.
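If the number of header lines can vary, one variation on the same idea is to separate lines by their prefix rather than by position. This is only a sketch: the function name parse_server_lines is made up for illustration, and it assumes that every address line starts with ip: and that everything else is a header.

```python
def parse_server_lines(lines):
    """Return [[header, ..., 'ip:a;ip:b;...']] from an iterable of text lines."""
    headers = []
    ips = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Assumption: address lines are exactly those starting with 'ip:'
        if line.startswith('ip:'):
            ips.append(line)
        else:
            headers.append(line)
    if ips:
        # Join all addresses into one ';'-separated string
        headers.append(';'.join(ips))
    return [headers]


sample = [
    'login:username\n',
    'pass:password\n',
    'destPath:/directory/subdir/\n',
    'ip:10.95.64.211\n',
    'ip:10.95.64.215\n',
]
print(parse_server_lines(sample))
```

With a real file you would pass the open file object instead of the sample list, for example parse_server_lines(f) inside a with open(...) block.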

This should work (wrapped in a function, since return is only valid inside one):
def read_lines(path):
    arr = []
    with open(path) as f:
        for line in f:
            arr.append(line.strip())
    return [arr]

You could just treat the file as a list and iterate through it with a for loop:
arr = []
with open('file', 'r') as f:
    for line in f:
        arr.append(line.strip('\n'))

Related

Enumerate using python

I'm a new coder and am currently trying to write a piece of code that, from an opened txt document, will print out the line number that each piece of information is on.
I've opened the file and stripped it of all its commas. I found online that you can use a function called enumerate() to get the line number. However, when I run the code, instead of getting numbers like 1, 2, 3 I get information like 0x113a2cff0. Any idea how to fix this problem, or what the actual problem is? The code showing how I used enumerate is below.
my_document = open("data.txt")
readDocument = my_document.readlines()
invalidData = []
for data in readDocument:
    stripDocument = data.strip()
    if stripDocument.isnumeric() == False:
        data = (enumerate(stripDocument))
        invalidData.append(data)
First of all, open the document and read its contents. It is good practice to use with, as it closes the file once the block is finished. The readlines function gathers all the lines (this assumes the data.txt file is in the same folder as your .py file):
with open("data.txt") as f:
    lines = f.readlines()
Next, use enumerate to add an index to each line, so you can read the indexes, use them, or even save them:
for index, line in enumerate(lines):
    print(index, line)
As a last point, if data.txt contains line breaks, each line will end with a \n, which you can remove with line.strip() if you need to.
The full code would be:
with open("data.txt") as f:
    lines = f.readlines()
for index, line in enumerate(lines):
    print(index, line.strip())
Taking your problem statement:
trying to write a piece of code that, from an opened txt document, will print out the line number that each piece of information is on
You're using enumerate incorrectly, as @roganjosh was trying to explain:
with open("data.txt") as my_document:
    for i, data in enumerate(my_document):
        print(i, data)
The way you're doing it now, you're not removing the commas. Without arguments, the strip() method only removes leading and trailing whitespace from the line. If you only want the data, this would work:
invalidData = []
for row_number, data in enumerate(readDocument):
    stripped_line = ''.join(data.strip().split(','))
    if not stripped_line.isnumeric():
        invalidData.append((row_number, data))
You can use the enumerate() function to enumerate a list. It returns an iterator that yields tuples containing the index first, then the line string. Like this:
(0, 'first line')
Your readDocument is a list of the lines, so it might be a good idea to name it accordingly.
lines = my_document.readlines()
for i, line in enumerate(lines):
    print(i, line)
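To tie the answers in this thread together, here is a small sketch that reports 1-based line numbers for lines that are not purely numeric once commas are removed. The function name find_non_numeric and the sample data are made up for illustration; the start argument of enumerate is what shifts the count from 0 to 1.

```python
def find_non_numeric(lines):
    """Return (line_number, text) pairs for lines that are not purely numeric."""
    invalid = []
    # start=1 makes the count begin at 1 instead of the default 0
    for line_number, raw in enumerate(lines, start=1):
        text = raw.strip()
        # Remove the commas before testing. Note that ''.isnumeric() is False,
        # so blank lines are reported as well.
        if not text.replace(',', '').isnumeric():
            invalid.append((line_number, text))
    return invalid


sample = ['1,2,3\n', 'abc\n', '42\n', '4,x\n']
print(find_non_numeric(sample))
```

The original 0x113a2cff0 output came from storing the enumerate object itself rather than unpacking the (index, value) pairs it yields in a for loop.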

Nested lists in python containing a single string and not single letters

I need to load text from a file which contains several lines, each containing letters separated by commas, into a 2-dimensional list. When I run this, I get a 2-dimensional list, but the nested lists contain single strings instead of separated values, and I cannot iterate over them. How do I solve this?
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split()
            matrix.append(line)
    return matrix
result:
[['a,p,p,l,e'], ['a,g,o,d,o'], ['n,n,e,r,t'], ['g,a,T,A,C'], ['m,i,c,s,r'], ['P,o,P,o,P']]
I need each letter in the nested lists to be a single string so I can use them.
thanks in advance
The split() function splits on whitespace by default. You can fix this by passing the string you want to split on. In this case, that would be a comma. The code below should work.
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            matrix.append(line)
    return matrix
The input format you described conforms to CSV format. Python has a library just for reading CSV files. If you just want to get the job done, you can use this library to do the work for you. Here's an example:
Input(test.csv):
a,string,here
more,strings,here
Code:
>>> import csv
>>> lines = []
>>> with open('test.csv') as file:
...     reader = csv.reader(file)
...     for row in reader:
...         lines.append(row)
...
>>>
Output:
>>> lines
[['a', 'string', 'here'], ['more', 'strings', 'here']]
Using the strip() function will get rid of the new line character as well:
def read_matrix_file(filename):
    matrix = []
    with open(filename, 'r') as matrix_letters:
        for line in matrix_letters:
            line = line.split(',')
            line[-1] = line[-1].strip()
            matrix.append(line)
    return matrix
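A more compact variant of the same approach strips each line before splitting, so the newline never reaches the last element. This is a sketch only: the name read_matrix is hypothetical, and io.StringIO stands in for a real file so the example runs on its own.

```python
import io


def read_matrix(lines):
    """Build a 2-D list from comma-separated lines, skipping blank ones."""
    # strip() removes the trailing newline before split(',') runs
    return [line.strip().split(',') for line in lines if line.strip()]


# io.StringIO behaves like an open text file for iteration purposes
sample = io.StringIO("a,p,p,l,e\na,g,o,d,o\n\n")
matrix = read_matrix(sample)
print(matrix)
```

Skipping blank lines is a design choice here; without the if clause, an empty line would produce a row containing a single empty string.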

Parsing a file from first char in each line

I'm trying to group a file by the first character in each line of the file.
For example, the file:
s/1/1/2/3/4/5///6
p/22/LLL/GP/1/3//
x//-/-/-/1/5/-/-/
s/1/1/2/3/4/5///6
p/22/LLL/GP/1/3//
x//-/-/-/1/5/-/-/
I need to group everything starting with the first s/ up to the next s/. I don't think split() will work since it would remove the delimiter.
Desired end result:
s/1/1/2/3/4/5///6
p/22/LLL/GP/1/3//
x//-/-/-/1/5/-/-/
s/1/1/2/3/4/5///6
p/22/LLL/GP/1/3//
x//-/-/-/1/5/-/-/
I'd prefer to do this without the re module if possible (is it?)
Edit: Attempts:
The following gets me the values in groups using list comprehension:
with open('/file/path', 'r') as f:
    content = f.read()
    groups = ['s/' + group for group in content.split('s/')[1:]]
Since the s/ is the first character in the sequence, I use the [1:] to avoid having an element of just s/ in groups[0].
Is there a better way? Or is this the best?
Assuming the first line of the file starts with 's/' you could try something like this:
groups = []
with open('test.txt', 'r') as f:
    for line in f:
        if line.startswith('s/'):
            groups.append('')
        groups[-1] += line
To deal with files that don't start with 's/' and have the first element be all lines until the first 's/', we can make a small change and add in an empty string on the first line:
groups = []
with open('test.txt', 'r') as f:
    for line in f:
        if line.startswith('s/') or not groups:
            groups.append('')
        groups[-1] += line
Alternatively, if we want to skip lines until the first 's/', we can do the following:
groups = []
with open('test.txt', 'r') as f:
    for line in f:
        if line.startswith('s/'):
            groups.append('')
        if groups:
            groups[-1] += line
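The skip-until-first-marker variant can also be wrapped in a reusable function that works on any iterable of lines. This is a sketch under stated assumptions: the name group_by_prefix and its prefix parameter are made up for illustration, and it collects lines in lists before joining them rather than concatenating strings.

```python
def group_by_prefix(lines, prefix='s/'):
    """Group lines so each group starts at a line beginning with prefix."""
    groups = []
    for line in lines:
        if line.startswith(prefix):
            groups.append([])          # a marker line opens a new group
        if groups:                     # lines before the first marker are skipped
            groups[-1].append(line)
    # Join each group's lines back into one string
    return [''.join(group) for group in groups]


sample = ['header\n', 's/1\n', 'p/2\n', 's/3\n', 'x/4\n']
print(group_by_prefix(sample))
```

Because startswith checks only the beginning of each line, an s/ appearing in the middle of a line does not start a new group, which is one difference from splitting the whole file content on 's/'.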

importing from a text file to a dictionary

filename:dictionary.txt
YAHOO:YHOO
GOOGLE INC:GOOG
Harley-Davidson:HOG
Yamana Gold:AUY
Sotheby’s:BID
inBev:BUD
code:
infile = open('dictionary.txt', 'r')
content= infile.readlines()
infile.close()
counters ={}
for line in content:
    counters.append(content)
print(counters)
I am trying to import the contents of dictionary.txt into a dictionary. I have searched through Stack Overflow, but I would like an answer in a simple form, please (not using with open...).
First off, instead of opening and closing the file explicitly, you can use a with statement, which closes the file automatically at the end of the block.
Secondly, as file objects are iterator-like objects (one-shot iterables), you can loop over the lines and split each one on the : character. You can do all of this in a generator expression passed to the dict function:
with open('dictionary.txt') as infile:
    my_dict = dict(line.strip().split(':') for line in infile)
I assume that you don't have colons in your keys.
In that case you should:
#read lines from your file
lines = open('dictionary.txt').read().splitlines()
#create an empty dictionary (named result so the built-in dict is not shadowed)
result = {}
#split every line at the first ':' and use the left element as a key for the right value
for l in lines:
    if ':' not in l:
        continue  #skip blank or malformed lines
    content = l.split(':', 1)
    result[content[0]] = content[1]
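Both answers fail on lines that contain no colon, and the dict(...) version also fails if a line contains more than one. A sketch that tolerates both cases could use str.partition, which always returns three parts. The function name parse_pairs and the sample lines are assumptions made for illustration.

```python
def parse_pairs(lines):
    """Build a dict from 'key:value' lines, ignoring lines without a colon."""
    result = {}
    for raw in lines:
        # partition splits at the first ':' only and never raises
        key, sep, value = raw.strip().partition(':')
        if not sep:
            continue  # no colon on this line (blank or malformed)
        result[key] = value
    return result


sample = ['YAHOO:YHOO\n', 'GOOGLE INC:GOOG\n', '\n', 'no separator\n']
print(parse_pairs(sample))
```

Since only the first colon is treated as the separator, any further colons stay in the value.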

Copy the last three lines of a text file in python?

I'm new to python and the way it handles variables and arrays of variables in lists is quite alien to me. I would normally read a text file into a vector and then copy the last three into a new array/vector by determining the size of the vector and then looping with a for loop a copy function for the last size-three into a new array.
I don't understand how for loops work in python so I can't do that.
so far I have:
#read text file into line list
numberOfLinesInChat = 3
text_file = open("Output.txt", "r")
lines = text_file.readlines()
text_file.close()
writeLines = []
if len(lines) > numberOfLinesInChat:
    i = 0
    while ((numberOfLinesInChat-i) >= 0):
        writeLine[i] = lines[(len(lines)-(numberOfLinesInChat-i))]
        i+= 1
#write what people say to text file
text_file = open("Output.txt", "w")
text_file.write(writeLines)
text_file.close()
To get the last three lines of a file efficiently, use deque:
from collections import deque
with open('somefile') as fin:
    last3 = deque(fin, 3)
This saves reading the whole file into memory to slice off what you didn't actually want.
To reflect your comment - your complete code would be:
from collections import deque
with open('somefile') as fin, open('outputfile', 'w') as fout:
    fout.writelines(deque(fin, 3))
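To see why the deque approach works, here is a self-contained sketch in which io.StringIO stands in for the real file. The helper name tail is made up for illustration. A deque created with a maximum length discards items from the opposite end as new ones arrive, so only the last n lines are ever held.

```python
import io
from collections import deque


def tail(lines, n=3):
    """Return the last n items of any iterable as a list."""
    # A bounded deque drops the oldest item whenever it is full
    return list(deque(lines, maxlen=n))


sample = io.StringIO("one\ntwo\nthree\nfour\nfive\n")
print(tail(sample))
```

If the input has fewer than n lines, all of them are returned, so no length check is needed beforehand.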
As long as you're ok to hold all of the file lines in memory, you can slice the list of lines to get the last x items. See http://docs.python.org/2/tutorial/introduction.html and search for 'slice notation'.
def get_chat_lines(file_path, num_chat_lines):
    with open(file_path) as src:
        lines = src.readlines()
    return lines[-num_chat_lines:]
>>> lines = get_chat_lines('Output.txt', 3)
>>> print(lines)
... ['line n-3\n', 'line n-2\n', 'line n-1']
First, to answer your question: my guess is that you got an error on the writeLine[i] line. You cannot assign to an index of an empty list, so replace that assignment with writeLines.append(...). After that, you should also use a loop to write the output:
text_file = open("Output.txt", "w")
for row in writeLines:
    text_file.write(row)
text_file.close()
May I suggest a more Pythonic way to write this? It would be as follows:
with open("Input.txt") as f_in, open("Output.txt", "w") as f_out:
    for row in f_in.readlines()[-3:]:
        f_out.write(row)
A possible solution:
lines = [l for l in open("Output.txt")]
file = open('Output.txt', 'w')
file.writelines(lines[-3:])
file.close()
This might be a little clearer if you do not know Python syntax.
lst_lines = lines.splitlines()
Assuming lines holds the file's contents as a single string, this will create a list containing all the lines in the text file.
Then for the last line you can do:
last = lst_lines[-1]
secondLast = lst_lines[-2]
and so on. List and string indexes can be counted from the end by using a negative number.
Or you can loop through them and pick specific ones using range, where start = the first index, stop = the index to stop before (it is not included), and step = what to increment by:
for i in range(start, stop, step):
    string = lst_lines[i]
Then just write them to a file.
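The indexing ideas in this answer can be tried out on a small list. This is a sketch with made-up sample data; it shows negative indexes, a slice for the last three items, and range with a step.

```python
lst_lines = ['first', 'second', 'third', 'fourth', 'fifth']

# Negative indexes count from the end of the list
last = lst_lines[-1]
second_last = lst_lines[-2]

# A slice from -3 to the end gives the last three items
last_three = lst_lines[-3:]

# Slicing never raises, even when the list is shorter than requested
short = ['only', 'two']
short_tail = short[-3:]

# range(start, stop, step) stops before stop; here it visits indexes 0, 2, 4
every_other = [lst_lines[i] for i in range(0, len(lst_lines), 2)]

print(last, second_last, last_three, short_tail, every_other)
```

For copying the last three lines, the slice is usually enough on its own, and the result can be passed straight to a file object's writelines method.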
