Replacing a numeric string with another in a .txt file [duplicate] - python

This question already has answers here:
How to convert a string to a number if it has commas in it as thousands separators?
(10 answers)
Closed 3 years ago.
My .txt file has a lot of numbers divided into two rows, but they are written the Brazilian way (the number 3.41 is written as 3,41). I know how to read each column; I just need to change every comma in the .txt to a dot, but I have no idea how to do that.
I thought of 3 ways to solve the problem:
Change every comma into a dot and overwrite the previous .txt,
Write another .txt with a different name, with every comma changed into a dot,
Import every string (that should be a float) from the .txt and use replace to change the "," into a ".".
Help with one of the first two ways would be better, especially the first one.
(I have only imported numpy and don't know how to use other libraries yet, so if you could help me with the code and recipes I would really appreciate it.) (Sorry about the bad English, love ya.)

Try this:
with open('input.txt') as input_f, open('output.txt', 'w') as output_f:
    # Copy the file line by line, turning every comma into a dot
    for line in input_f:
        output_f.write(line.replace(',', '.'))
For input.txt:
1,2,3,4,5
10,20,30,40
the output will be:
1.2.3.4.5
10.20.30.40
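The snippet above covers the second option from the question (writing a new file). For the first option, overwriting the original file, a minimal sketch could look like the following. The function name replace_commas_in_place is made up for illustration, and the sketch assumes the file is small enough to read into memory in one go.

```python
from pathlib import Path


def replace_commas_in_place(path):
    """Rewrite the file at `path` with every ',' changed to '.'.

    Hypothetical helper: reads the whole file, then overwrites it.
    """
    file_path = Path(path)
    text = file_path.read_text()
    file_path.write_text(text.replace(",", "."))
```

Calling replace_commas_in_place('input.txt') would then change the file itself, so keep a copy of the original if you might need it again.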

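While your question is tagged python, here's a super-simple non-Python way, using the sed command-line utility.
This will replace all commas (,) with dots (.) in your text file, overwriting the original file (this is GNU sed syntax; BSD/macOS sed needs a backup-suffix argument right after -i, e.g. -i ''):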
sed -e 's/,/./g' -i yourtext.txt
Or, if you want the output in a different file:
sed -e 's/,/./g' yourtext.txt > newfile.txt

umlaute's answer is good, but if you insist on doing this in Python you can use fileinput, which supports in-place replacement:
import fileinput
filename = 'yourtext.txt'  # path of the file to rewrite
with fileinput.FileInput(filename, inplace=True, backup='.bak') as file:
    for line in file:
        # With inplace=True, anything printed is written back into the file
        print(line.replace(',', '~').replace('.', ',').replace('~', '.'), end='')
This example assumes your data already contains both .'s and ,'s, so it uses the tilde as an interim character while swapping the two. If you have ~'s in your data, swap that out for another uncommon character.
If you're working with a csv, be careful not to replace your column delimiter character. In that case, you'll probably want a regex replace instead, to ensure that each comma replaced is surrounded by digits, for example the pattern r'(?<=\d),(?=\d)', which matches the comma alone.
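As a rough sketch of the regex idea, the following replaces a comma only when it has a digit on both sides. The helper name fix_decimal_commas and the sample string are invented for illustration. Note the limitation: a comma delimiter that happens to sit between two numeric fields looks exactly the same and would be replaced too, so this only helps when the delimiter is a different character or the neighbouring fields are not numeric.

```python
import re

# Lookbehind/lookahead: the comma must sit between two digits,
# but the digits themselves are not part of the match.
DECIMAL_COMMA = re.compile(r"(?<=\d),(?=\d)")


def fix_decimal_commas(text):
    """Turn decimal commas into dots, leaving other commas alone."""
    return DECIMAL_COMMA.sub(".", text)


print(fix_decimal_commas("3,41;10,5;abc,def"))  # prints 3.41;10.5;abc,def
```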

Related

How to write multiple lines in a file using multiline string? I really want to optimize the below code

I am using file_handle.write to write multiple lines, including their whitespace, to a file.
Code:
file_handle.write('$TTL 1h\n')
file_handle.write('#\tIN\tSOA\tns1.test.nimblestorage.com.\tis-ops.hpe.com. (\n')
file_handle.write('\t\t\t%s\t; serial\n' % serial_number)
file_handle.write('\t\t\t3h\t; refresh\n')
file_handle.write('\t\t\t30m\t; retry\n')
file_handle.write('\t\t\t30d\t; expire\n')
file_handle.write('\t\t\t5m )\t; minimum\n')
file_handle.write('\t\tNS\tns1.test.nimblestorage.com.\n')
file_handle.write('\t\tNS\tns2.test.nimblestorage.com.\n')
file_handle.write('\n')
But I would like to write all of these lines at once, using a single multi-line string.
Use triple quotes:
file_handle.write('''$TTL 1h
#\tIN\tSOA\tns1.test.nimblestorage.com.\tis-ops.hpe.com. (
\t\t\t{serial}\t; serial
...
'''.format(serial=serial_number))
There are reasons to want multi-line strings, in which case the answer by John Zwinck is the good one. However, if you want them in order to optimize file I/O, don't bother.
Python already buffers writes for you: see the buffering option in https://docs.python.org/3/library/functions.html#open
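For completeness, here is a sketch of the triple-quote approach written out with the lines from the question. The names ZONE_TEMPLATE and write_zone and the serial value are placeholders chosen for this example, and an in-memory io.StringIO stands in for the real file handle.

```python
import io

# Escapes such as \t are still interpreted inside a triple-quoted string.
ZONE_TEMPLATE = """$TTL 1h
#\tIN\tSOA\tns1.test.nimblestorage.com.\tis-ops.hpe.com. (
\t\t\t{serial}\t; serial
\t\t\t3h\t; refresh
\t\t\t30m\t; retry
\t\t\t30d\t; expire
\t\t\t5m )\t; minimum
\t\tNS\tns1.test.nimblestorage.com.
\t\tNS\tns2.test.nimblestorage.com.

"""


def write_zone(file_handle, serial_number):
    """Write the whole block with a single write() call."""
    file_handle.write(ZONE_TEMPLATE.format(serial=serial_number))


buffer = io.StringIO()  # stands in for a file opened with open(..., "w")
write_zone(buffer, 2024010101)  # made-up serial number
print(buffer.getvalue())
```

One thing to watch with str.format is that any literal { or } in the template would have to be doubled; the text here contains none.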

On python, how do I get rid of quotations after joining a list of floats?

Apologies if this has been asked before, I couldn't find the same question.
I'm trying to write 3 things to a CSV file in one line, productcode, amountentered and changecoins.
The changecoins is a list of floats which I have joined together using changecoins=",".join(map(str,changecoins)). This works fine except I still have quotations around the values which are then written into my csv file.
I've tried using strip and replace but they don't seem to work.
I've attached my code and output in the csv file below, does anyone know how to fix this?
changecoins=",".join(map(str,changecoins)).replace('"', '')
changeline=(productcode, amountentered, changecoins)
changewriter.writerow(changeline)
Output
01,1.0,"0.1,0.1"
01,2.0,"0.5,0.5,0.1,0.1"
04,1.0,"0.1,0.1,0.1,0.1"
Why not use extend?
result = [productcode, amountentered]
result.extend(changecoins)
changewriter.writerow(result)
if you want to get even more slick, you can just do:
result = [productcode, amountentered] + changecoins
changewriter.writerow(result)
or even just:
changewriter.writerow([productcode, amountentered] + changecoins)
It seems you're joining the floats unnecessarily. You already have a list of floats, so just tack it on to the other two values and pass the result to your csv writer.
This is most likely because you're using "," to join your values when the CSV delimiter is also a ,. The csv writer wraps the field in quotes so the "," inside the cell value isn't confused for a delimiter.
If you join with a character other than "," or change the delimiter for the file, the quotes will go away.
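To see both behaviours side by side, here is a small sketch that writes one row each way. The variable names follow the question, the sample values are taken from its output, and io.StringIO is used in place of the real CSV file.

```python
import csv
import io

productcode = "01"
amountentered = 1.0
changecoins = [0.1, 0.1]

buffer = io.StringIO()  # stands in for the real CSV file
changewriter = csv.writer(buffer)

# Each coin becomes its own column: no quotes are needed.
changewriter.writerow([productcode, amountentered] + changecoins)

# The joined string is a single field containing commas, so it gets quoted.
joined = ",".join(map(str, changecoins))
changewriter.writerow([productcode, amountentered, joined])

print(buffer.getvalue())
```

The first row comes out as 01,1.0,0.1,0.1 and the second as 01,1.0,"0.1,0.1".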

replace a string with regular expression in python

I have been learning regular expressions for a while but still find them confusing sometimes.
I am trying to replace every
self.assertRaisesRegexp(SomeError,'somestring'):
with
self.assertRaisesRegexp(SomeError,somemethod('somestring')):
How can I do it? I assume the first step is to fetch 'somestring', turn it into somemethod('somestring'), and then replace the original 'somestring'.
Here is your regular expression:
# f is going to be your file in string form
re.sub(r'(?m)self\.assertRaisesRegexp\((.+?),((?P<quote>[\'"]).*?(?P=quote))\)',r'self.assertRaisesRegexp(\1,somemethod(\2))',f)
This grabs each matching call and replaces it accordingly. The named group quote makes sure the opening and closing quotation marks match.
There is no need to iterate over the file either: re.sub replaces every match in the whole string. The leading "(?m)" turns on multiline mode, which only changes how ^ and $ behave, so it is harmless here but not required. I have tested this expression and it works as expected!
test
>>> print f
this is some
multi line example that self.assertRaisesRegexp(SomeError,'somestring'):
and so on. there self.assertRaisesRegexp(SomeError,'somestring'): will be many
of these in the file and I am just ranting for example
here is the last one self.assertRaisesRegexp(SomeError,'somestring'): okay
im done now
>>> print re.sub(r'(?m)self\.assertRaisesRegexp\((.+?),((?P<quote>[\'"]).*?(?P=quote))\)',r'self.assertRaisesRegexp(\1,somemethod(\2))',f)
this is some
multi line example that self.assertRaisesRegexp(SomeError,somemethod('somestring')):
and so on. there self.assertRaisesRegexp(SomeError,somemethod('somestring')): will be many
of these in the file and I am just ranting for example
here is the last one self.assertRaisesRegexp(SomeError,somemethod('somestring')): okay
im done now
A better tool for this particular task is sed:
$ sed -i 's/\(self.assertRaisesRegexp\)(\(.*\),\(.*\))/\1(\2,somemethod(\3))/' *.py
sed will take care of the file I/O, renaming files, etc.
If you already know how to do the file manipulation and how to iterate over the lines in each file, the Python re.sub call will look like:
import re
new_line = re.sub(r"(self\.assertRaisesRegexp)\((.*),(.*)\)",
                  r"\1(\2,somemethod(\3))",
                  old_line)
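Putting the pieces together, a self-contained sketch of the substitution might look like this. The helper name wrap_expected_message and the sample text are invented for illustration; somemethod is only the placeholder name used in the question. The pattern is a variation on the ones above that requires the second argument to be a quoted string.

```python
import re

# Group 1: the method name, group 2: the exception argument,
# group 3: the quoted string, group 4: the quote character it uses.
PATTERN = re.compile(
    r"(self\.assertRaisesRegexp)\((.*?),\s*((['\"]).*?\4)\)"
)


def wrap_expected_message(source):
    """Wrap the quoted second argument in somemethod(...)."""
    return PATTERN.sub(r"\1(\2,somemethod(\3))", source)


sample = (
    "self.assertRaisesRegexp(SomeError,'somestring'):\n"
    'self.assertRaisesRegexp(OtherError,"another"):\n'
)
print(wrap_expected_message(sample))
```

Because sub works on the whole string, the same function can be applied to the full contents of a file read with read().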

Regex out leading and trailing quotes if not contains comma

I'm at a total loss of how to do this.
My Question: I want to take this:
"A, two words with comma","B","C word without comma","D"
"E, two words with comma","F","G more stuff","H no commas here!"
... (continue)
To this:
"A, two words with comma",B,C word without comma,D
"E, two words with comma",F,G more stuff,H no commas here!
... (continue)
I used software that created 1,900 records in a text file. I think it was supposed to be a CSV, but whoever wrote the software doesn't seem to know how CSV files work, because a cell only needs quotes if it contains a comma (right?). At least I know that Excel puts everything in the first cell...
I would prefer this to be solvable using some sort of command line tool like perl or python (I'm on a Mac). I don't want to make a whole project in Java or anything to take care of this.
Any help is greatly appreciated!
Shot in the dark here, but I think that Excel is putting everything in the first column because it doesn't know it's being given comma-separated data.
Excel has a "text-to-columns" feature, where you'll be able to split a column by a delimiter (make sure you choose the comma).
There's more info here:
http://support.microsoft.com/kb/214261
edit
You might also try renaming the file from *.txt to *.csv. That will change the way Excel reads the file, so it better understands how to parse whatever it finds inside.
If a shell one-liner is an option, you can try this in a terminal:
sed 's/"\([^,]*\)"/\1/g' file.csv > new-file.csv
Technically, that file should be fine: the text is delimited with " and the fields are separated with ,
I don't see anything wrong with the first form at all: any field may be quoted, only some require it. More than likely, the author of the code didn't want to overcomplicate the logic and quoted everything.
One way to clean it up is to feed the data to csv and dump it back.
import csv
from io import StringIO
bad_data = """\
"A, two words with comma","B","C word without comma","D"
"E, two words with comma","F","G more stuff","H no commas here!"
"""
buffer = StringIO()
writer = csv.writer(buffer)
# Parse the over-quoted rows and write them back with minimal quoting
writer.writerows(csv.reader(bad_data.splitlines()))
buffer.seek(0)
print(buffer.read())
Python's csv.writer defaults to the "excel" dialect, which uses minimal quoting, so it only writes the quotes when they are necessary.
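Since the data in the question lives in a text file, the same round trip can be sketched file to file. The function name requote_csv and its parameters are assumptions made for this example.

```python
import csv


def requote_csv(src_path, dst_path):
    """Copy a CSV file, keeping quotes only around fields that need them."""
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst, quoting=csv.QUOTE_MINIMAL)
        writer.writerows(csv.reader(src))
```

Writing to a separate destination file keeps the original untouched in case the result is not what you expected.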

remove one comma using a python script

I have csv file with a line that looks something like this:
,,,,,,,,,,
That's 10 commas. I wish to remove only the last (i.e. the 10th) comma, so that the line changes to:
,,,,,,,,,
Has anyone had any experience dealing with a case like this? I use the vim text editor also. So, any help using python script or text editing commands using vim would be appreciated.
Removing last comma in current line in vim:
:s/,$//
The same for lines n through m:
:n,ms/,$//
The same for whole file:
:%s/,$//
This will do it in the simplest case. Once you've updated your question with what you're looking for, I'll update the code.
commas = "," * 10
print(commas.replace("," * 10, "," * 9))
If you want to remove the last comma on any given line you can do:
import re
commas = """,,,,,,,,,,
,,,,,,,,,,"""
# flags must be passed by keyword: the fourth positional argument is count
print(re.sub(r',$', '', commas, flags=re.MULTILINE))
And if, in any file, you want to take a line that is just 10 commas and make it 9 commas:
import re
commas = ",,,,,,,,,,\n,,,,,,,,,,\n,,,,,,,,,,"
print(re.sub(r'^,{10}$', ',' * 9, commas, flags=re.MULTILINE))
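If you would rather avoid regular expressions, the same idea can be sketched with plain string methods, one line at a time. The function names below are made up for this example; the sketch removes at most one trailing comma per line and keeps the line ending as it was.

```python
def drop_last_comma(line):
    """Remove one trailing comma from a line, keeping its line ending."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    if body.endswith(","):
        body = body[:-1]
    return body + ending


def drop_last_commas(lines):
    """Apply drop_last_comma to every line of an iterable."""
    return [drop_last_comma(line) for line in lines]


print(drop_last_commas([",,,,,,,,,,\n", "a,b\n"]))
```

The list passed in could just as well be the lines of an open file.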
I would use:
sed -i 's/,$//' file.csv
I really love Vim's normal command. So if you want to remove the last "column" from this CSV file, I'd do it like this:
:%normal $F,D
In other words, on every line of the file (%), execute the following normal-mode commands (normal):
$ - go to the end of the line;
F, - move the cursor back to the previous comma in this line;
D - delete from the cursor until the end of the line.
This can also be used with ranges (from line 1 to 20):
:1,20normal $F,D
But if there is nothing but a lot of commas with no data between them, you can simply do:
:%normal $X
