I need to delete numbers from a text file on windows XP. I am new to python and just installed it for data scrubbing.
I have stored the test file in C:\folder1\test1.txt
The contents of test1.txt are just one line:
This must not b3 delet3d, but the number at the end yes 134411
I want to create a file result1.txt which contains
This must not b3 delet3d, but the number at the end yes
Here is what I tried so far
import os
fin = os.open('C:\folder1\test1.txt','r')
I get the following error:
TypeError: an integer is required.
I am not sure what integer it is expecting.
Can you please let me know how to go about programming to get the result I want. Thanks a lot for your help.
You're using open() in the os module, which takes a numeric file mode. You want instead the builtin open() function. Also, backslashes in strings take on a special meaning in Python; you need to double them up if you really mean backslashes. Try:
fin = open('C:\\folder1\\test1.txt','r')
According to http://docs.python.org/library/os.html#file-descriptor-operations, os.open expects a 'flags' parameter made up of one or more of the flags listed there, and 'r' is not one of them. The docs also seem to indicate that you probably want to use open() rather than os.open().
f = open(r'C:\folder1\test1.txt','r')
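Once the file opens correctly, the trailing number still has to be removed. A minimal sketch of one way to do that, assuming the number to drop is always the last run of digits on the line; the helper names `strip_trailing_number` and `scrub_file` are made up for illustration:

```python
import re

def strip_trailing_number(line):
    # Drop the line ending first, then remove a run of digits at the very
    # end together with the whitespace in front of it. Digits inside
    # words (b3, delet3d) are left alone because they are not at the end.
    return re.sub(r'\s*\d+$', '', line.rstrip())

def scrub_file(src_path, dst_path):
    # Copy src_path to dst_path, stripping the trailing number of each line.
    with open(src_path, 'r') as fin, open(dst_path, 'w') as fout:
        for line in fin:
            fout.write(strip_trailing_number(line) + '\n')

# Example call (the result path is an assumption, the question only names the file):
# scrub_file(r'C:\folder1\test1.txt', r'C:\folder1\result1.txt')
```

The raw-string prefix `r'...'` in the example call avoids the backslash problem described above.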
I read various lines from a CSV file like this:
f1 = open(current_csv, 'rb')
table = f1.readlines()
f1.close()
So essentially any single line in table is something like this:
line = b' G\xe4rmanword: 123,45\r\n'
which type tells me is bytes, but I need to work around with .replace so I'm turning it into a string: line = str(line), but now line turned into
"b' G\\xe4rmanword: 123,45\\r\\n'"
with an added \ before every \. However, with print(line), they don't show up, but if I want to turn \xe4 into ae (an alternative way of writing ä) with line = line.replace('\xe4', 'ae'), this just does nothing. Using '\\xe4' works, however. But I would have expected the first one to turn \\xe4 into \ae instead of doing nothing, and the second option, while working, relies on my defining a new replacement pattern for ä. I'd rather avoid both.
So I'm trying to understand where the extra backslash comes from and how I can avoid it to start with, instead of having to fix it in my postprocessing. I have the feeling that something changed between python2 and 3, since the original csv reader is a python2 script I had translated with 2to3.
Yes, since Python 3 uses Unicode for all strings, the semantics of many string-related functions, including str, have changed compared to Python 2. In this particular case, you need to pass a second argument to str giving the encoding used in your input bytes value (which, judging from the use of German, is 'latin1'):
unicode_string = str(line, 'latin1')
Alternatively you can do the same using
unicode_string = line.decode('latin1')
And you'd probably want the \r\n removed, so add .rstrip() to that.
Besides, a more elegant solution for reading the file is:
with open(current_csv, 'rb') as f1:
    table = f1.readlines()
(so no need for close())
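To make the difference between str(line) and decoding concrete, here is a small sketch using the sample line from the question. It assumes the file really is latin1-encoded, as guessed above:

```python
line = b' G\xe4rmanword: 123,45\r\n'

# str(line) without an encoding returns the repr of the bytes object,
# i.e. text that starts with b' and contains literal backslashes.
as_repr = str(line)

# Decoding turns the byte 0xe4 into the single character 'ä'
# and rstrip() removes the trailing \r\n.
text = line.decode('latin1').rstrip()

# '\xe4' in a str literal is that same single character, so the
# replacement now behaves as originally expected.
text = text.replace('\xe4', 'ae')
print(text)
```

Opening the file in text mode with open(current_csv, encoding='latin1') should give decoded strings directly, which would avoid the separate decode step.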
I recently noticed that when I have the following code:
File = "/dir/to/file"
Content = "abcdefg"
with open(File,"a") as f:
    f.write(Content)
I got "7" as an output and it is the count of characters in the variable "Content". I do not recall seeing this (I used ipython notebook before, but this time I did it in the python environment in shell) and wonder if I did something wrong. My python version: Python 3.3.3. Thank you for your help.
This behaviour is normal for most .write() implementations; see also the I/O Base Classes documentation.
For example io.RawIOBase.write
Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written.
or io.TextIOBase.write
Write the string s to the stream and return the number of characters written.
Which IO-class is used depends on (the OS and) the parameters given to open. But as far as I can see all of them return some sort of "characters" or "bytes" written count.
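A short sketch of the return values, using in-memory streams from the io module as stand-ins for a real file so that nothing is written to disk:

```python
import io

# Text stream: write() returns the number of characters written.
text_stream = io.StringIO()
char_count = text_stream.write("abcdefg")

# Binary stream: write() returns the number of bytes written.
byte_stream = io.BytesIO()
byte_count = byte_stream.write(b"abc")

print(char_count, byte_count)
```

The interactive shell echoes the value of any expression that does not evaluate to None, which is why the 7 shows up there. Assigning the result to a name, as in the sketch, keeps the prompt quiet.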
I opened an image file in read-binary ("rb") mode and stored the data in a variable. Now I want to replace some values in the binary data with my own values, but it's not working with the usual string replace method:
f=open("a.jpg","rb")
a=f.read()
''' first line is '\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xe1\x00*Exif\x00\x00II*\x00\x08\x00\x00\x00\x0 '''
a=a.replace("ff","z")
print a
#but there's no change in a
Can anyone tell me where I am going wrong? I also tried
a=a.replace(b'ff',b'z')
but still the output was unchanged.
Can anyone tell me what I am supposed to do to perform the replacement?
I don't know which version of Python you're using (these kinds of operations differ between 2 and 3), but try a = str(a) before executing the replace method.
EDIT: For Python 2.7, the only reasonable way I've discovered to do what you want is to use the built-in function repr. Example:
>>> picture = open("some_picture.jpg", 'rb')
>>> first_line = picture.readline()
>>> first_line
'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xe1\x00*Exif\x00\x00II*\x00\x08\x00\x00\x00\x01\x001\x01\x02\x00\x07\x00\x00\x00\x1a\x00\x00\x00\x00\x00\x00\x00Google\x00\x00\xff\xdb\x00\x84\x00\x03\x02\x02\x03\x02\x02\x03\x03\x03\x03\x04\x03\x03\x04\x05\x08\x05\x05\x04\x04\x05\n'
>>> repr(first_line)
"'\\xff\\xd8\\xff\\xe0\\x00\\x10JFIF\\x00\\x01\\x01\\x00\\x00\\x01\\x00\\x01\\x00\\x00\\xff\\xe1\\x00*Exif\\x00\\x00II*\\x00\\x08\\x00\\x00\\x00\\x01\\x001\\x01\\x02\\x00\\x07\\x00\\x00\\x00\\x1a\\x00\\x00\\x00\\x00\\x00\\x00\\x00Google\\x00\\x00\\xff\\xdb\\x00\\x84\\x00\\x03\\x02\\x02\\x03\\x02\\x02\\x03\\x03\\x03\\x03\\x04\\x03\\x03\\x04\\x05\\x08\\x05\\x05\\x04\\x04\\x05\\n'"
>>> repr(first_line).replace('ff', 'SOME_OTHER_STRING')
"'\\xSOME_OTHER_STRING\\xd8\\xSOME_OTHER_STRING\\xe0\\x00\\x10JFIF\\x00\\x01\\x01\\x00\\x00\\x01\\x00\\x01\\x00\\x00\\xSOME_OTHER_STRING\\xe1\\x00*Exif\\x00\\x00II*\\x00\\x08\\x00\\x00\\x00\\x01\\x001\\x01\\x02\\x00\\x07\\x00\\x00\\x00\\x1a\\x00\\x00\\x00\\x00\\x00\\x00\\x00Google\\x00\\x00\\xSOME_OTHER_STRING\\xdb\\x00\\x84\\x00\\x03\\x02\\x02\\x03\\x02\\x02\\x03\\x03\\x03\\x03\\x04\\x03\\x03\\x04\\x05\\x08\\x05\\x05\\x04\\x04\\x05\\n'"
When you display a string at the Python console, the string is encoded so that you can see all of the characters, even the ones that aren't printable. Whenever you see something like \xff, that's not 4 characters, it's a single character in hex notation. To replace it, you also need to specify the same single character.
a = a.replace("\xff", "z")
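The same idea carries over to Python 3, where data read in "rb" mode is a bytes object and both arguments to replace must be bytes. A small sketch, using a few sample bytes as a stand-in for the start of the JPEG file:

```python
# Sample bytes standing in for the beginning of a.jpg
data = b'\xff\xd8\xff\xe0\x00\x10JFIF'

# b'ff' means the two ASCII letters "f" and "f". That sequence does not
# occur in the data, so nothing is replaced.
unchanged = data.replace(b'ff', b'z')

# b'\xff' is the single byte with value 255, which does occur.
replaced = data.replace(b'\xff', b'z')

print(unchanged == data)
print(replaced)
```

This also explains why the second attempt in the question, a.replace(b'ff', b'z'), left the data unchanged.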
I need to call an executable in a python script and also pass binary data (generated in the same script) to this executable.
I have it working like so:
bin = make_config(data)
open('binaryInfo.bin', 'wb+').write(bin)
os.system("something.exe " + "binaryInfo.bin")
I thought I could avoid creating the binaryInfo.bin file altogether by passing 'bin' straight to the os.system call:
bin = make_config(data)
os.system("something.exe " + bin)
But in this case I get an error:
"Can't convert 'bytes' object to str implicitly"
Does anyone know the correct syntax here? Is this even possible?
Not like you're doing it. You can't pass arbitrary binary data on the UNIX command line, as each argument is inherently treated as null-terminated, and there's a maximum total length limit which is typically 64KB or less.
With some applications which recognize this convention, you may be able to pipe data on stdin using something like:
pipe = os.popen("something.exe -", "w")
pipe.write(bin)
pipe.close()
If the application doesn't recognize "-" for stdin, though, you will probably have to use a temporary file like you're already doing.
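On Python 3 the same stdin approach can be written with the subprocess module, which accepts bytes directly. This is only a sketch: the child process here is the Python interpreter itself, used as a stand-in for something.exe, and `payload` stands in for the result of make_config(data). Whether the real executable reads its input from stdin is an assumption you would need to check.

```python
import subprocess
import sys

# Stand-in for bin = make_config(data): every possible byte value once.
payload = bytes(range(256))

# Stand-in child process: reads all of stdin as bytes and prints the length.
child_cmd = [
    sys.executable, "-c",
    "import sys; print(len(sys.stdin.buffer.read()))",
]

# input= sends the bytes to the child's stdin without a temporary file.
result = subprocess.run(child_cmd, input=payload,
                        stdout=subprocess.PIPE, check=True)
received = int(result.stdout)
print(received)
```

For the real program the command list would be something like ["something.exe", "-"], if it follows the "-" convention mentioned above.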
os.system(b"something.exe " + bin)
Should do it.. However, I'm not sure you should be sending binary data through the command line. There might be some sort of limit on character count. Also, does this something.exe actually accept binary data through the command line even?
How about base64-encoding it before sending and decoding it on the other end? As far as I know, command line arguments must be ASCII-range values (although this may not be true, I think it is).
Another option would be to keep doing it the way you currently are and pass the file.
Or maybe see this: Passing binary data as arguments in bash
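A sketch of the base64 round trip, assuming the receiving program can be changed to decode its argument (nothing in the question confirms that something.exe does this). The sample bytes are made up:

```python
import base64

# Stand-in for the binary config data
payload = b'\x00\xff\x10binary\x00data'

# Sending side: base64 output uses only ASCII letters, digits, '+', '/'
# and '=', so it can be placed on a command line as text.
encoded = base64.b64encode(payload).decode('ascii')

# Receiving side: decode the argument back into the original bytes.
decoded = base64.b64decode(encoded)

print(encoded)
print(decoded == payload)
```

The command line length limit still applies, and base64 makes the data roughly a third larger, so this only suits small payloads.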
I'm doing some extra credit for Zed Shaw's "Learn Python The Hard Way"; the extra credit for exercise 15 tells you to read through the pydoc for file to find other things you can do with files. I was interested in figuring out how to have the terminal print out a certain number of bytes of a text file using "read()". I can hard-code the argument for how many bytes to read, but I hit a wall when trying to prompt the user to define the number of bytes.
Here's the script as I have it so far:
from sys import argv
script, filename = argv
txt = open(filename)
print "Here's 24 bytes of your file %r:" % filename
print txt.read(24)
print """What about an arbitrary, not hard-coded number of bytes? Enter the number
of bytes you want read out of the txt file at this prompt, as an integer:"""
how_far = raw_input("> ")
print txt.read(how_far2) # this format makes sense in my head but obviously isn't the done thing.
terminal spits out the error:
"NameError: name 'how_far2' is not defined"
How do I prompt the user of the script to type in a number of bytes, and have the script read out that number of bytes?
BONUS QUESTIONS:
What is the actual-factual term for what I'm trying to do here? Pass a variable to a method? Pass a variable to a function?
Is the number of bytes an argument of read? Is that the correct term?
More generally, what's a good place to get a vocabulary list of python terms? Any other books Stack Overflow would recommend, or some in online documentation somewhere? Really looking for a no assumptions, no prior knowledge, "explain it to me like I'm five" level of granularity... a half hour of web-searching hasn't helped too much. I've not found terminology really collected together into any one place online despite a good amount of effort searching the web.
The error message is because you have used how_far in one place and how_far2 in the other.
You'll also need to convert how_far to an int before passing it to read - using int(how_far) for example
You will find it can be called passing a variable, parameter or argument. These are not Python terms, they are general programming terms
raw_input returns a string. file.read expects an integer -- likely you just need to convert the output from raw_input into an integer before you use it.
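Putting both answers together, here is a minimal sketch of the conversion step. It is written for Python 3, where raw_input is called input, and it uses an in-memory stream as a stand-in for the opened file; `read_requested` is a made-up helper name:

```python
import io

def read_requested(stream, user_text):
    # The prompt returns text, so convert it to an int before
    # passing it as the size argument of read().
    how_far = int(user_text)
    return stream.read(how_far)

# Stand-in for txt = open(filename)
txt = io.StringIO("The quick brown fox jumps over the lazy dog")

# In the real script, "9" would come from input("> ")
first = read_requested(txt, "9")
print(first)
```

If the user types something that is not a whole number, int() raises ValueError, which the script may want to catch and report.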