How to write new columns to a CSV file - python

How to create new columns when writing to a CSV file? For example, I have a list of data:
Mylist = [1,3,67,43,23,52,7,9,21]
I would like to start a new line after every third value, so the output would look as follows, with each number in a separate cell and arranged into three columns (a 3x3 grid):
1 3 67\n
43 23 52\n
7 9 21\n
I know that the escape sequence \n is used to start a new line, but how would I go about starting a new column? I would prefer to use only BASIC Python read/write functions, not the imported csv module. This seems like it would be a fairly easy thing to do, but I can't figure it out.

Don't reinvent the wheel. You split up your list into evenly sized chunks, then use the csv module to produce your output:
import csv
with open(filename, 'wb') as outfile:
    writer = csv.writer(outfile, delimiter=' ')
    for i in xrange(0, len(Mylist), 3):
        writer.writerow(Mylist[i:i + 3])
Even without the module, you can trivially join your columns using str.join(), but you have to explicitly map all values to strings first:
with open(filename, 'w') as outfile:
    for i in xrange(0, len(Mylist), 3):
        outfile.write(' '.join(map(str, Mylist[i:i + 3])) + '\n')
If you need to specifically pad your numbers to fit in columns 2 characters wide, add a format() call in a list comprehension:
with open(filename, 'w') as outfile:
    for i in xrange(0, len(Mylist), 3):
        outfile.write(' '.join([format(d, '<2d') for d in Mylist[i:i + 3]]) + '\n')
The '<2' width specifier left-aligns your numbers with whitespace.
Demo of the first and last options:
>>> import csv
>>> from io import BytesIO
>>> Mylist = [1,3,67,43,23,52,7,9,21]
>>> demo = BytesIO()
>>> writer = csv.writer(demo, delimiter=' ')
>>> for i in xrange(0, len(Mylist), 3):
...     writer.writerow(Mylist[i:i + 3])
...
8L
10L
8L
>>> print demo.getvalue()
1 3 67
43 23 52
7 9 21
>>> demo = BytesIO()
>>> for i in xrange(0, len(Mylist), 3):
...     demo.write(' '.join([format(d, '<2d') for d in Mylist[i:i + 3]]) + '\n')
...
9L
9L
9L
>>> print demo.getvalue()
1  3  67
43 23 52
7  9  21
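The snippets above are written for Python 2 (xrange, the print statement, binary mode for csv). A minimal sketch of the same chunking idea on Python 3 follows; the helper name write_rows and its width parameter are illustrative assumptions, not part of the answer above, and an in-memory buffer stands in for a real file.

```python
import csv
import io


def write_rows(values, outfile, width=3):
    """Write values to outfile as space-delimited rows of `width` items."""
    writer = csv.writer(outfile, delimiter=' ', lineterminator='\n')
    for i in range(0, len(values), width):
        writer.writerow(values[i:i + width])


Mylist = [1, 3, 67, 43, 23, 52, 7, 9, 21]

# An in-memory text buffer is used here so the sketch is self-contained.
# For a file on disk, open(filename, 'w', newline='') would take its place.
demo = io.StringIO()
write_rows(Mylist, demo)
print(demo.getvalue(), end='')
```

If the list length is not a multiple of width, the last row is simply shorter, because slicing past the end of a list is safe.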

try like this:
Mylist = [1,3,67,43,23,52,7,9,21]
with open('outfile', 'w') as f:
    for i in range(len(Mylist)):
        if (i+1)%3 == 0:
            f.write(" ".join(map(str, Mylist[i-2:i+1])) + '\n')
output:
1 3 67
43 23 52
7 9 21

Here's another option, maybe the easiest
#!/usr/bin/env python3
Mylist = [1,3,67,43,23,52,7,9,21]
filename = 'outfile.csv'
with open(filename, 'w') as outfile:
    for i in range(0, len(Mylist), 3):
        # file=outfile sends the line to the file instead of the console
        print('{} {} {}'.format(Mylist[i], Mylist[i+1], Mylist[i+2]), file=outfile)
Output
1 3 67
43 23 52
7 9 21

Related

Moving contents from one file to a new one in python

I want to write all the even numbers to the even file with spaces between them, e.g. 12 6 20 10 rather than 1262010, and with no spaces at the front or back. How can I do this?
def write_positive_even_to_file(filename):
    with open(filename, 'r') as orginal, open('xxx.txt', 'a') as even:
        red = orginal.read().split()
        for number in red:
            if number % 2 == 0:
                even.write(number + " ")
Input file:
15 12 6
7 20 9 10
13 17
3
You need to split the input into tokens (assumed to represent integers), convert each one to int, and then keep only the values that are even.
Something like this:
def write_positives(infile, outfile, mode='a'):
    with open(infile) as fin, open(outfile, mode) as fout:
        if (evens := [x for x in map(int, fin.read().split()) if x % 2 == 0]):
            print(*evens, file=fout)
By printing the unpacked list you get space separation by default, with no leading or trailing space.
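To see the effect on the sample input from the question, here is a small sketch that separates the filtering from the file handling. The helper name even_numbers is an assumption made for illustration, and an in-memory buffer stands in for the output file.

```python
import io


def even_numbers(text):
    """Return the even integers found in whitespace-separated text."""
    return [n for n in map(int, text.split()) if n % 2 == 0]


# Sample input copied from the question.
sample = "15 12 6\n7 20 9 10\n13 17\n3\n"

out = io.StringIO()  # stands in for the real output file
print(*even_numbers(sample), file=out)
print(repr(out.getvalue()))
```

The question's original code fails earlier than the spacing problem: number % 2 is applied to a string, so the tokens have to be converted with int first.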

Compare 2 .csv Files via Loop

I have 15 .csv files with the following formats:
**File 1**
MYC
RASSF1
DAPK1
MDM2
TP53
E2F1
...
**File 2**
K06227
C00187
GLI1
PTCH1
BMP2
TP53
...
I would like to create a loop that runs through each of the 15 files and compares 2 at each time, creating unique pairs. So, File 1 and File 2 would be compared with each other giving an output telling me how many matches it found and what they were. So in the above example, the output would be:
1 match and TP53
The loops would be used to compare all the files against each other so 1,3 (File 1 against File 3), 1,4 and so on.
f1 = set(open(str(cancers[1]) + '.csv', 'r'))
f2 = set(open(str(cancers[2]) + '.csv', 'r'))
f3 = open(str(cancers[1]) + '_vs_' + str(cancers[2]) + '.txt', 'wb').writelines(f1 & f2)
The above works but I'm having a hard time creating the looping portion.
To avoid comparing a file with itself, and to keep the code flexible in the number of cancers, I would write it like this. I assume cancers is a list.
# example list of cancers
cancers = ['BRCA', 'BLCA', 'HNSC']
fout = open('match.csv', 'w')
for i in range(len(cancers)):
    for j in range(len(cancers)):
        if j > i:
            # if the elements of cancers are already strings,
            # 'str(cancers[i])' is not needed
            f1 = [x.strip() for x in set(open(cancers[i] + '.csv', 'r'))]
            f2 = [x.strip() for x in set(open(cancers[j] + '.csv', 'r'))]
            match = list(set(f1) & set(f2))
            # I use ; to separate matched genes so that Excel can read the file
            fout.write('{}_vs_{},{} matches,{}\n'.format(
                cancers[i], cancers[j], len(match), ';'.join(match)))
fout.close()
Results
BRCA_vs_BLCA,1 matches,TP53
BRCA_vs_HNSC,6 matches,TP53;BMP2;GLI1;C00187;PTCH1;K06227
BLCA_vs_HNSC,1 matches,TP53
To loop through all pairs up to 15, something like this can do it:
for i in range(1, 15):
    for j in range(i+1, 16):
        f1 = set(open(str(cancers[i]) + '.csv', 'r'))
        f2 = set(open(str(cancers[j]) + '.csv', 'r'))
        f3 = open(str(cancers[i]) + '_vs_' + str(cancers[j]) + '.txt',
                  'wb').writelines(f1 & f2)
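Both answers build the unique pairs with nested index loops. As an alternative, itertools.combinations from the standard library yields each unordered pair exactly once. The sketch below is one way that could look; the helper shared_entries, the names File1 to File3, and the contents of File3 are made up for illustration, with in-memory lists standing in for the lines of the .csv files.

```python
from itertools import combinations


def shared_entries(lines_a, lines_b):
    """Return the sorted entries present in both iterables of lines."""
    set_a = {line.strip() for line in lines_a if line.strip()}
    set_b = {line.strip() for line in lines_b if line.strip()}
    return sorted(set_a & set_b)


# Hypothetical in-memory data; File1 and File2 follow the question's example,
# File3 is invented so that there is more than one pair to compare.
files = {
    'File1': ['MYC\n', 'RASSF1\n', 'DAPK1\n', 'MDM2\n', 'TP53\n', 'E2F1\n'],
    'File2': ['K06227\n', 'C00187\n', 'GLI1\n', 'PTCH1\n', 'BMP2\n', 'TP53\n'],
    'File3': ['GLI1\n', 'MYC\n'],
}

results = {}
# combinations(..., 2) never pairs a file with itself and never repeats a pair.
for name_a, name_b in combinations(sorted(files), 2):
    results[(name_a, name_b)] = shared_entries(files[name_a], files[name_b])

for (name_a, name_b), match in results.items():
    print('{}_vs_{}: {} match(es) {}'.format(
        name_a, name_b, len(match), ';'.join(match)))
```

With real files, each list would be replaced by the lines read from the corresponding file, ideally inside a with block so the file is closed afterwards.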

Find sum of numbers in line

This is what I have to do:
Read content of a text file, where two numbers separated by comma are on each line (like 10, 5\n, 12, 8\n, …)
Make a sum of those two numbers
Write into new text file two original numbers and the result of summation = like 10 + 5 = 15\n, 12 + 8 = 20\n, …
So far, I've got this:
import os
import sys
relative_path = "Homework 2.txt"
if not os.path.exists(relative_path):
    print "not found"
    sys.exit()
read_file = open(relative_path, "r")
lines = read_file.readlines()
read_file.close()
print lines
path_output = "data_result4.txt"
write_file = open(path_output, "w")
for line in lines:
    line_array = line.split()
    print line_array
You need to have a good understanding of python to understand this.
First, read the file, and get all of the lines by splitting it with a line feed (\n)
For each expression, calculate the answer and write it. Remember, you need to cast the numbers to integers so that they can be added together.
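with open('Original.txt') as f:
    lines = f.read().split('\n')
with open('answers.txt', 'w+') as f:
    for expression in lines:  # expression should be in format '12, 8'
        if not expression.strip():
            continue  # skip blank lines, such as the one after a trailing '\n'
        # int() ignores surrounding whitespace, so '12,8' and '12, 8' both work
        nums = [int(i) for i in expression.split(',')]
        f.write('{} + {} = {}\n'.format(nums[0], nums[1], nums[0] + nums[1]))
        # That should write '12 + 8 = 20\n'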
Make your last for loop look like this:
for line in lines:
    splitline = line.strip().split(",")
    summation = sum(map(int, splitline))
    write_file.write(" + ".join(splitline) + " = " + str(summation) + "\n")
One beautiful thing about that way is that you can have as many numbers as you want on a line, and it will still display correctly.
The input file looks like CSV, so just use the reader from Python's csv module.
Input File Homework 2.txt
1, 2
1,3
1,5
10,6
The script
import csv
f = open('Homework 2.txt', 'rb')
reader = csv.reader(f)
result = []
for line in list(reader):
    nums = [int(i) for i in line]
    result.append(["%(a)s + %(b)s = %(c)s" % {'a' : nums[0], 'b' : nums[1], 'c' : nums[0] + nums[1] }])
f = open('Homework 2 Output.txt', 'wb')
writer = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
for line in result:
    writer.writerow(line)
The output file is then Homework 2 Output.txt
1 + 2 = 3
1 + 3 = 4
1 + 5 = 6
10 + 6 = 16
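Several of the snippets in this thread are Python 2 (print statement, 'rb'/'wb' modes for csv). A minimal Python 3 sketch of the same task is given below, assuming comma-separated integers on each line; the function name format_sums and the in-memory sample are illustrative and not taken from any of the answers.

```python
def format_sums(lines):
    """Turn lines such as '10, 5' into strings such as '10 + 5 = 15'."""
    output = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines, e.g. at the end of the file
        # int() tolerates surrounding whitespace and the trailing newline
        nums = [int(part) for part in line.split(',')]
        output.append('{} = {}'.format(' + '.join(map(str, nums)), sum(nums)))
    return output


# Sample data mirroring the 'Homework 2.txt' input shown above.
sample = ['1, 2\n', '1,3\n', '1,5\n', '10,6\n', '\n']
result = format_sums(sample)
print('\n'.join(result))
```

To work on real files, the lines could come from iterating over an open file object, and each result string could be written out followed by '\n'.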

How to make Python read new lines and just lines?

I know that Python can read numbers like:
8
5
4
2
2
6
But I am not sure how to make it read it like:
8 5 4 2 2 6
Also, is there a way to make python read both ways? For example:
8 5 4
2
6
I think reading with new lines would be:
info = open("info.txt", "r")
lines = info.readlines()
info.close()
How can I change the code so it would read downwards and to the sides like in my third example above?
I have a program like this:
info = open("1.txt", "r")
lines = info.readlines()
numbers = []
for l in lines:
    num = int(l)
    numbers.append(str(num**2))
info.close()
info = open("1.txt", "w")
for num in numbers:
    info.write(num + "\n")
info.close()
How can I make the program read each number separately in new lines and in just lines?
Keeping them as strings:
with open("info.txt") as fobj:
    numbers = fobj.read().split()
Or, converting them to integers:
with open("info.txt") as fobj:
    numbers = [int(entry) for entry in fobj.read().split()]
This works with one number and several numbers per line.
This file content:
1
2
3 4 5
6
7
8 9 10
11
will result in this output for numbers:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
This approach reads the whole file at once. Unless your file is really large this is fine.
info = open("1.txt", "r")
lines = info.readlines()
numbers = []
for line in lines:
    for num_str in line.split(' '):
        num = int(num_str)
        numbers.append(str(num**2))
info.close()
info = open("test.txt", "r")
lines = info.readlines()
numbers = []
for l in lines:
    l = l.strip()
    lSplit = l.split(' ')
    if len(lSplit) == 1:
        num = int(l)
        numbers.append(str(num**2))
    else:
        for num in lSplit:
            num2 = int(num)
            numbers.append(str(num2**2))
print numbers
info.close()
A good way to do this is with a generator that iterates over the lines, and, for each line, yields each of the numbers on it. This works fine if there is only one number on the line (or none), too.
def numberfile(filename):
    with open(filename) as input:
        for line in input:
            for number in line.split():
                yield int(number)
Then you can just write, for example:
for n in numberfile("info.txt"):
    print(n)
If you don't care how many numbers per line, then you could try this to create the list of the squares of all the numbers.
I have simplified your code a bit by simply iterating over the open file using a with statement, but iterating over the readlines() result will work just as well (for small files - for large ones, this method doesn't require you to hold the whole content of the file in memory).
numbers = []
with open("1.txt", 'r') as f:
    for line in f:
        nums = line.split()
        for n in nums:
            numbers.append(str(int(n)**2))
Just another not yet posted way...
numbers = []
with open('info.txt') as f:
    for line in f:
        numbers.extend(map(int, line.split()))
file_ = """
1 2 3 4 5 6 7 8
9 10
11
12 13 14
"""
for number in file_.split():
    print number
>>
1
2
3
4
5
6
7
8
9
10
11
12
13
14

Find all string positions in file

If I want to find the position of a string in a file, I can do:
f = open('file.txt', 'r')
lines = f.read()
posn = lines.find('string')
What if the string occurs several times in the file and I want to find all the positions where it occurs? I have a list of strings, so right now my code is
for string in list:
    f = open('file.txt', 'r')
    lines = f.read()
    posn = lines.find(string)
My code is incomplete; it only finds the first position of each string in the list.
You can use the following
import re
a = open("file", "r")
g = a.read()
ma = re.finditer('test', g)
for t in ma:
    print t.start(), t.end()
Possible output
8 12
16 20
For example:
g='hahahatesthahatesthahahatest'
ma=re.finditer('test',g)
for t in ma:
    print t.start(), t.end()
Output
6 10
14 18
24 28
print g[t.start():t.end()] gives you test as expected
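You can just use enumerate. Note that this compares one character at a time, so it only works when the substring is a single character:
>>> s='this is a string'
>>> def find_pos(s,sub):
...     return [i for i,j in enumerate(s) if j==sub]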
...
>>> find_pos(s,'s')
[3, 6, 10]
This will return where your pattern is present in your file. Uses re.finditer.
import re
with open('your.file') as f:
    text = f.read()
    positions = [m.span() for m in re.finditer('pattern', text)]
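The question asks for every position of each string in a list. One possible sketch uses repeated str.find calls with a moving start index, so no regular-expression escaping is needed for strings containing characters such as . or *. The helper name find_all and the targets list are assumptions for illustration, and the sample text reuses the string from the re.finditer example above in place of the file contents.

```python
def find_all(text, sub):
    """Return the start index of every occurrence of sub, overlaps included."""
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume one character further on so overlapping matches are found too.
        start = text.find(sub, start + 1)
    return positions


# Stand-in for the text read from the file.
text = 'hahahatesthahatesthahahatest'
targets = ['test', 'haha']  # hypothetical list of strings to look for

positions = {s: find_all(text, s) for s in targets}
print(positions)
```

Unlike re.finditer, which reports non-overlapping matches, this loop also reports overlapping ones. If the regex route is preferred for literal strings, passing them through re.escape first avoids accidental pattern syntax.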
