Working with numbers in Python - python

Hello. I am very new to Python and programming in general.
I have 3 columns from a CSV file:
X,CH1,CH2,
Second,Volt,Volt,
2.66400e-02,4.00e-03,1.04e-03,
-2.66360e-02,4.00e-03,7.20e-04,
-2.66320e-02,4.00e-03,5.60e-04,
-2.66280e-02,4.00e-03,3.20e-04,
-2.66240e-02,4.00e-03,8.00e-05,
-2.66200e-02,4.00e-03,-2.40e-04,
-2.66160e-02,4.00e-03,-5.60e-04,
-2.66120e-02,4.00e-03,-7.20e-04,
-2.66080e-02,4.00e-03,-1.04e-03,
for example.
I am using:
import csv

with open('maximum.csv', 'rb') as f:
    reader = csv.reader(f, delimiter=',')
    for _ in xrange(2):
        next(f)
to skip the first two lines, as they are just text, and then
    for row in reader:
        x=(float(row[2]))
        print(x)
gives me
0.00104
0.00072
0.00056
0.00032
8e-05
-0.00024
-0.00056
-0.00072
-0.00104
So there is the question:
What should I write, so that it will give me an integer number instead of decimals, like
104
72
56
24
8
24
56
72
104
P.S I do not want just to multiply by 10^5
Thanks

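You have to multiply by 10^5 (written 10 ** 5 or 1e5 in Python, where ^ is bitwise XOR), because you actually want a bigger number.
Then apply int() to get 104 instead of 104.0. Because int() truncates, wrap the product in round() first, so that a floating-point result such as 103.99999999999999 does not become 103.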
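As a rough sketch of that advice, the snippet below skips the two text rows, scales the third column by 10 ** 5 and rounds before converting to int. The function name read_scaled_column and its parameters are made up for illustration, and the code is written for Python 3 (range instead of xrange, text-mode input).

```python
import csv
import io


def read_scaled_column(stream, column=2, power=5, header_lines=2):
    """Return one CSV column as integers scaled by 10 ** power."""
    reader = csv.reader(stream, delimiter=',')
    for _ in range(header_lines):
        next(reader)  # skip the text rows ("X,CH1,CH2," and "Second,Volt,Volt,")
    # round() before int() guards against results like 103.99999999999999
    return [int(round(float(row[column]) * 10 ** power))
            for row in reader if row]


sample = io.StringIO(
    "X,CH1,CH2,\n"
    "Second,Volt,Volt,\n"
    "2.66400e-02,4.00e-03,1.04e-03,\n"
    "-2.66360e-02,4.00e-03,7.20e-04,\n"
    "-2.66240e-02,4.00e-03,8.00e-05,\n"
    "-2.66200e-02,4.00e-03,-2.40e-04,\n"
)
scaled = read_scaled_column(sample)
print(scaled)
```

With a real file, open('maximum.csv', newline='') can be passed in place of the StringIO object.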

Related

printing column numbers in python

How can I print the first 52 numbers in one column, the next 52 in the second column, and so on, for 6 columns in total, with the layout repeating after that? I have lots of float numbers, listed in lines and separated by one space in a file.txt document. So in the end I want to have:
1 53 105 157 209 261
2
...
52 104 156 208 260 312
313 ... ... ... ... ...
...(another 52 numbers and so on)
I have tried this:
with open('file.txt') as f:
    line = f.read().split()
line1 = "\n".join("%-20s %s"%(line[i+len(line)/52],line[i+len(line)/6]) for i in range(len(line)/6))
print(line1)
However, this of course prints only 2 columns of numbers. I have tried adding line[i+len(line)/52] six times, but the code still does not work.
for row in range(52):
    for col in range(6):
        print line[row + 52*col],  # Dangling comma to stay on this line
    print  # Now go to the next line
Granted, you can do this in more Pythonic ways, but this will show you the algorithm structure and let you tighten the code as you wish.
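One possible tighter version is sketched below. Unlike the loop above, it also handles the repeating blocks of 312 numbers (52 rows by 6 columns) and a final block that is only partly filled. The name columnize and its parameters are invented for this example, and the code targets Python 3.

```python
def columnize(items, rows=52, cols=6):
    """Lay items out column by column, in blocks of rows * cols items."""
    block = rows * cols
    lines = []
    for start in range(0, len(items), block):
        chunk = items[start:start + block]
        for r in range(rows):
            # every rows-th item starting at r sits on the same output line
            cells = chunk[r::rows]
            if cells:
                lines.append(" ".join(cells))
    return lines


numbers = [str(n) for n in range(1, 315)]  # stand-in for f.read().split()
layout = columnize(numbers)
print("\n".join(layout))
```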

Read and replace the contents of a line using dictionary

I have a file below where I want to convert what is written on every fourth line into a number.
sample.fastq
@HISE
GGATCGCAATGGGTA
+
CC@!$%*&J#':AAA
@HISE
ATCGATCGATCGATA
+
()**D12EFHI@$;;
Each fourth line is a series of characters, each of which equates to a number (stored in a dictionary). I would like to convert each character into its corresponding number and then find the average of all those numbers on that line.
I have gotten as far as being able to display each of the characters individually, but I’m pretty stumped as to how to replace the characters with their numbers and then go on further.
script.py
d = {
    '!':0, '"':1, '#':2, '$':3, '%':4, '&':5, '\'':6, '(':7, ')':8,
    '*':9, '+':10, ',':11, '-':12, '.':13, '/':14, '0':15,'1':16,
    '2':17, '3':18, '4':19, '5':20, '6':21, '7':22, '8':23, '9':24,
    ':':25, ';':26, '<':27, '=':28, '>':29, '?':30, '@':31, 'A':32, 'B':33,
    'C':34, 'D':35, 'E':36, 'F':37, 'G':38, 'H':39, 'I':40, 'J':41 }
with open('sample.fastq') as fin:
    for i in fin.readlines()[3::4]:
        for j in i:
            print j
The output should be as below and stored in a new file.
output.txt
@HISE
GGATCGCAATGGGTA
+
19 #From 34 34 31 0 3 4 9 5 41 2 6 25 32 32 32
@HISE
ATCGATCGATCGATA
+
23 #From 7 8 9 9 35 16 17 36 37 39 40 31 3 26 26
Is what I’m proposing possible?
You can do this with a for loop over the input file lines:
with open('sample.fastq') as fin, open('outfile.fastq', "w") as outf:
    for i, line in enumerate(fin):
        if i % 4 == 3: # only change every fourth line
            # don't forget to do line[:-1] to get rid of newline
            qualities = [d[ch] for ch in line[:-1]]
            # take the average quality score. Note that integer division
            # (Python 2) truncates the result rather than rounding it
            average = sum(qualities) / len(qualities)
            # new version; average with \n at end
            line = str(average) + "\n"
        # write line (or new version thereof)
        outf.write(line)
This produces the output below. The second average is truncated from 22.6 to 22, whereas your example shows the rounded value 23:
@HISE
GGATCGCAATGGGTA
+
19
@HISE
ATCGATCGATCGATA
+
22
Assuming you read from stdin and write to stdout:
from sys import stdin

for i, line in enumerate(stdin, 1):
    line = line[:-1] # Remove newline
    if i % 4 != 0:
        print(line)
        continue
    nums = [d[c] for c in line]
    print(sum(nums) / float(len(nums)))
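The lookup table in the question appears to be each character's ASCII code minus 33 ('!' is 33 and maps to 0, 'J' is 74 and maps to 41), so a variant without the dictionary is possible. The sketch below assumes that offset holds for every quality character, which would also make '@' map to 31. It rounds the average, which matches the 19 and 23 in the requested output. The function name and the offset parameter are illustrative, and the code targets Python 3.

```python
def average_quality_lines(lines, offset=33):
    """Replace every fourth line with the rounded mean of its character scores."""
    out = []
    for i, line in enumerate(lines):
        text = line.rstrip("\n")
        if i % 4 == 3 and text:
            # assumed mapping: score = ASCII code - offset
            scores = [ord(ch) - offset for ch in text]
            text = str(int(round(sum(scores) / float(len(scores)))))
        out.append(text)
    return out


records = [
    "@HISE\n", "GGATCGCAATGGGTA\n", "+\n", "CC@!$%*&J#':AAA\n",
    "@HISE\n", "ATCGATCGATCGATA\n", "+\n", "()**D12EFHI@$;;\n",
]
converted = average_quality_lines(records)
print("\n".join(converted))
```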

Using Python to extract segment of numbers from a file?

I know there are questions on how to extract numbers from a text file, which have helped partially. Here is my problem. I have a text file that looks like:
Some crap here: 3434
A couple more lines
of crap.
34 56 56
34 55 55
A bunch more crap here
More crap here: 23
And more: 33
54 545 54
4555 55 55
I am trying to write a script that extracts the lines with the three numbers and put them into separate text files. For example, I'd have one file:
34 56 56
34 55 55
And another file:
54 545 54
4555 55 55
Right now I have:
for line in file_in:
    try:
        float(line[1])
        file_out.write(line)
    except ValueError:
        print "Just using this as placeholder"
This successfully puts both chunks of numbers into a single file. But I need it to put one chunk in one file, and another chunk in another file, and I'm lost on how to accomplish this.
You didn't specify which version of Python you are using, but you might approach it this way in Python 2.7.
string.translate takes a translation table (which can be None) and a group of characters to delete before the table is applied.
You can set the characters to delete to everything but 0-9, space and tab by slicing string.printable correctly:
>>> import string
>>> remove_chars = string.printable[10:-6] + string.printable[-4:]
>>> string.translate('Some crap 3434', None, remove_chars)
'  3434'
>>> string.translate('34 45 56', None, remove_chars)
'34 45 56'
Adding a strip to trim white space on the left and right, and iterating over a test file containing the data from your question:
>>> with open('testfile.txt') as testfile:
...     for line in testfile:
...         trans = line.translate(None, remove_chars).strip()
...         if trans:
...             print trans
...
3434
34 56 56
34 55 55
23
33
54 545 54
4555 55 55
You can use a regex here, but this requires reading the whole file into a variable with file.read() or similar, which is fine if the file is not huge.
((?:(?:\d+ ){2}\d+(?:\n|$))+)
See demo.
https://regex101.com/r/tX2bH4/20
import re
p = re.compile(r'((?:(?:\d+ ){2}\d+(?:\n|$))+)', re.IGNORECASE)
test_str = "Some crap here: 3434\nA couple more lines\nof crap.\n34 56 56\n34 55 55\nA bunch more crap here\nMore crap here: 23\nAnd more: 33\n54 545 54\n4555 55 55"
re.findall(p, test_str)
re.findall returns a list. You can easily write each item of the list to a new file.
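Writing each match to its own file could look roughly like this. The helper name write_blocks, the out_dir argument and the block1.txt, block2.txt naming scheme are assumptions made for the example. The pattern is the one above without the IGNORECASE flag, which has no effect on digits.

```python
import os
import re

BLOCK_RE = re.compile(r'((?:(?:\d+ ){2}\d+(?:\n|$))+)')


def write_blocks(text, out_dir, prefix="block"):
    """Write every run of three-number lines to its own file; return the paths."""
    paths = []
    for n, block in enumerate(BLOCK_RE.findall(text), 1):
        path = os.path.join(out_dir, "%s%d.txt" % (prefix, n))
        with open(path, "w") as out:
            out.write(block)
        paths.append(path)
    return paths
```

Calling write_blocks(open('input.txt').read(), '.') would then create one file per chunk in the current directory, where input.txt stands for your own file.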
To know if a string is a number you can use str.isdigit:
split = 0            # index of the current output file
should_split = True  # start a new file at the next numeric line
for line in file_in:
    # split line to parts
    parts = line.strip().split()
    # check the line is not empty and all parts are numbers
    if parts and all(part.isdigit() for part in parts):
        if should_split:
            # first numeric line after a gap: move on to the next file
            split += 1
            # don't split again until we skip a line
            should_split = False
        with open('split%d' % split, 'a') as f:
            f.write(line)
    else:
        # skipped line means we should split
        should_split = True
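The same idea can be sketched with itertools.groupby, which groups consecutive lines by whether they pass the isdigit check, so no flag variables are needed. The helper names is_number_row and number_blocks are invented here. Each returned block could then be written to its own file.

```python
from itertools import groupby


def is_number_row(line):
    """True if the line is non-empty and consists only of digit tokens."""
    parts = line.split()
    return bool(parts) and all(part.isdigit() for part in parts)


def number_blocks(lines):
    """Return the runs of consecutive numeric lines, one list per run."""
    return [list(group)
            for numeric, group in groupby(lines, key=is_number_row)
            if numeric]


sample_lines = [
    "Some crap here: 3434", "A couple more lines", "of crap.",
    "34 56 56", "34 55 55",
    "A bunch more crap here", "More crap here: 23", "And more: 33",
    "54 545 54", "4555 55 55",
]
blocks = number_blocks(sample_lines)
print(blocks)
```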

Filling tabs until the maximum length of column

I have a tab-delimited txt that looks like
11 22 33 44
53 25 36 25
74 89 24 35 and
But there is no "tab" after 44 and 25. So the 1st and 2nd rows have 4 columns, 3rd row has 5 columns.
To rewrite it so that tabs are shown,
11\t22\t33\t44
53\t25\t36\t25
74\t89\t24\t35\tand
I need to have a tool to mass-add tabs where there are no entries.
If the maximum length of column is n (n=5 in the above example), then I want to fill tabs until that nth column for all rows to make
11\t22\t33\t44\t
53\t25\t36\t25\t
74\t89\t24\t35\tand
I tried to do it by notepad++, and python by using replacer code like
map_dict = {'':'\t'}
but it seems I need more logic to do it.
I am assuming your file also contains newlines so it would actually look like this:
11\t22\t33\t44\n
53\t25\t36\t25\n
74\t89\t24\t35\tand\n
If you know for sure that the maximum length of your columns is 5, you can do it like this:
with open('my_file.txt') as my_file:
    y = lambda x: len(x.strip().split('\t'))
    a = [line if y(line) == 5 else '%s%s\n' % (line.strip(), '\t'*(5 - y(line)))
         for line in my_file.readlines()]
# ['11\t22\t33\t44\t\n', '53\t25\t36\t25\t\n', '74\t89\t24\t35\tand\n']
This will add trailing tabs until each row reaches 5 columns. You will get a list of lines that you need to write back to a file (I used 'my_file2.txt', but you can write back to the original one if you want).
with open('my_file2.txt', 'w+') as out_file:
    for line in a:
        out_file.write(line)
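If the maximum number of columns is not known in advance, it can be computed from the data first. A minimal sketch, assuming the whole file fits in memory (pad_rows is a made-up name):

```python
def pad_rows(lines, sep="\t"):
    """Pad every row with empty trailing fields up to the widest row."""
    rows = [line.rstrip("\r\n").split(sep) for line in lines]
    width = max(len(row) for row in rows) if rows else 0
    # joining the empty fields produces the trailing separators
    return [sep.join(row + [""] * (width - len(row))) for row in rows]


padded = pad_rows(["11\t22\t33\t44\n", "53\t25\t36\t25\n", "74\t89\t24\t35\tand\n"])
print(padded)
```

The returned lines have no newline, so add "\n" to each one when writing them back to a file.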
If I understood correctly, you can achieve this using only Notepad++, as follows:
And yes, if you have several files on which you want to perform this, you can record it as a macro and bind it to a key as a shortcut.

Writing CSV values to file results in values being separated in single characters (Python)

Kinda new to Python:
I have the following code:
def printCSV(output, values, header):
    """
    prints the output data as comma-separated values
    """

    try:
        with open(output, 'w') as csvFile:
            #print headers
            csvFile.write(header)

            for value in values:
                #print value, "\n"
                csvFile.write(",".join(value))
                csvFile.write("\n")
    except:
        print "Error occurred while writing CSV file..."
Values is a list constructed somewhat like this:
values = []
for i in range(0,5):
    row = "A,%s,%s,%s" % (0,stringval, intval)
    values.append(row)
When I open the file created by the above function, I expect to see something like this:
Col1,Col2,Col3,Col4
A,0,'hello',123
A,0,'foobar',42
Instead, I am seeing data like this:
Col1,Col2,Col3,Col4
A,0,'h','e','l','l','o',1,2,3
A,0,'f','o','o','b','a','r',4,2
Does anyone know what is causing this?
I even tried to use fopen and fwrite() directly, but the same problem still exists.
The problem you're encountering is that you're doing ",".join(value) with value being a string. Strings act like a collection of characters, so the command translates to "Join each character with a comma."
What you could do instead is use a tuple instead of a string for the row values you pass to printCSV. Note that join only accepts strings, so the numbers have to be converted with str() first:
values = []
for i in range(0,5):
    row = ('A', str(0), stringval, str(intval))
    values.append(row)
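Another option is the standard csv module, whose writer converts numbers to text itself and adds quoting where a value needs it. This is only a sketch: the function name write_rows is invented, the sample values come from the expected output in the question, and the code targets Python 3.

```python
import csv
import io


def write_rows(stream, header, rows):
    """Write a header and rows of mixed str/int values as CSV."""
    # lineterminator is set to "\n" here; the csv default is "\r\n"
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


buf = io.StringIO()
write_rows(buf,
           ["Col1", "Col2", "Col3", "Col4"],
           [("A", 0, "hello", 123), ("A", 0, "foobar", 42)])
print(buf.getvalue())
```

For a real file in Python 3, pass open(output, 'w', newline='') as the stream.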
