more ids getting deleted from resulting text file - python

I have an ID in the invalid_change variable which I am trying to delete from the input file "list.txt", writing the result to results.txt as below. Can anyone provide input on how this can be fixed? A sample input and expected output are below.
INPUT (list.txt):
350882 348521 350166
346917 352470
360049

EXPECTED OUTPUT (results.txt):
350882 348521 350166
346917
360049
invalid_list = ['352470', '12345']
f_write = open('results.txt', 'wb')
with open('list.txt', 'r') as f:
    for line in f:
        # delete the whole line if any invalid gerrit is present
        gerrit_list = line.strip().split(' ')
        ifvalid = True
        for gerrit in gerrit_list:
            try:  # check if invalid gerrit is present
                invalid_gerrit.index(invalid_change)
                ifvalid = False
                break
            except:
                pass
        if ifvalid:
            f_write.write(line)
f_write.close()

The problem is that when you reach this line:
346917 352470
the ifvalid variable is set to False and thus this specific line (even if adjusted) is not being written to the output file.
Possible solutions:
Modify the string, e.g. using regular expressions; they will do the replacing (see the sketch after this list).
Do not skip writing an "invalid" line to the file; just prepare it without the ID you want to remove (as in solution no. 1).
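For example, solution no. 1 might look like this (a minimal sketch, assuming the invalid_list from your code; re.escape guards against IDs that contain regex metacharacters):

import re

invalid_list = ['352470', '12345']
# Build a pattern that matches any invalid ID as a whole word.
pattern = re.compile(r'\b(?:' + '|'.join(map(re.escape, invalid_list)) + r')\b')

with open('list.txt') as f, open('results.txt', 'w') as f_write:
    for line in f:
        tokens = pattern.sub('', line).split()  # split() drops the leftover spaces
        if tokens:  # skip lines that became empty
            f_write.write(' '.join(tokens) + '\n')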
Let us know, if you have any questions.
Other problems in your code are:
the "diaper" anti-pattern: you catch all exceptions, including NameError, when you should catch only ValueError, or better, perform the test with the in keyword; this is why you do not see the errors caused by the missing invalid_change and invalid_gerrit variables,
instead of opening and closing f_write manually, you could manage it in the with statement you are already using, if you have Python 2.7+ or 3.1+,
str.split() splits on whitespace by default and removes empty strings from the result, so instead of writing line.strip().split(' ') you could just write line.split(),

Each item in gerrit_list is an individual number. You don't need to check whether the invalid change is contained within the number; in fact, substring matching can cause bugs (e.g. '12345' would also match inside '123456'). You can just directly iterate through the items in gerrit_list and throw away the ones that match invalid_list:
for line in f:
    gerrit_list = line.split()
    valid_gerrits = [gerrit for gerrit in gerrit_list if gerrit not in invalid_list]
    if valid_gerrits:
        newline = ' '.join(valid_gerrits) + '\n'
        f_write.write(newline)
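Combined with the with-statement advice from the previous answer, a complete version might look like this (a sketch using the file names from the question):

invalid_list = ['352470', '12345']

with open('list.txt') as f, open('results.txt', 'w') as f_write:
    for line in f:
        valid_gerrits = [g for g in line.split() if g not in invalid_list]
        if valid_gerrits:
            f_write.write(' '.join(valid_gerrits) + '\n')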


Python writing int into file

I have a problem with my Python code and I don't know what to do, because I'm fairly new to it.
date_now1 = datetime.datetime.now()
archive_date1 = date_now1.strftime("%d.%m.%Y")
f1 = open(archive_date1, "r+")
print("What product do you wish to delete ?")
delate_product = str(input())
for line in f1.readlines():
    if delate_product in line:
        list = line
        ready_product = list.split()
        quantity_of_product = int(ready_product[1])
        if quantity_of_product == 1:
            del line
            print("Product deleted")
        else:
            print("You have {} amounts of this product. How many do you want to delete ?".format(quantity_of_product))
            x = int(input())
            quantity_of_product = quantity_of_product - x
            temporary = "{}".format(quantity_of_product)
            print(type(temporary))
            f1.write(temporary) in ready_product[1]
I get the message
f1.write(temporary) in ready_product[1]
TypeError: 'in <string>' requires string as left operand, not int
When I do print(type(temporary)) it says string. I also tried str(quantity_of_product), but that doesn't work either. Maybe somebody could give me an idea of what to do, or what to read to get the answer.
The error is arising because you are asking python to find out whether an integer is "in" a string.
The output of f1.write(temporary) is an integer. To see this, try adding a print statement before the erroneous line. In contrast, ready_product[1] is a string (i.e. the second string element in the list "ready_product").
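For instance (Python 3, which the question's print(...) calls suggest; in Python 2, file.write() returns None):

>>> f = open('demo.txt', 'w')   # 'demo.txt' is just a scratch file
>>> f.write('17')
2
>>> f.close()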
The operator "in" takes two iterables and returns whether the first is "in" the second. For example:
>>> "hello in ["hello", "world"]
>> True
>>> "b" in "a string"
>> False
When Python attempts to see if an integer is "in" a string, it cannot and throws a TypeError, saying "requires string as left operand, not int". This is the root of your error.
You may also have a number of other errors in your code:
"list" is a reserved word in Python, and so calling your variable "list" is bad practice. Try another name such as _list (or delete the variable as it doesn't appear to serve a purpose).
"del line" deletes the variable "line". However, it won't delete the actual line in the text file, only the variable containing it. See Deleting a specific line in a file (python) for how to delete a line from a text file.
There doesn't appear to be a f1.close() statement in the code. This is necessary to close the file after use, as otherwise edits may not be saved.
Personally, instead of attempting to delete lines as I go, I'd maintain a list of lines in the text file, and delete/alter lines from the list as I go. Then, at the end of the program I'd rewrite the file from the list of altered lines.
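For example, a minimal sketch of that approach (the file name, the name-then-quantity line format, and the product to delete are all hypothetical):

with open('products.txt') as f:      # hypothetical file name
    lines = f.readlines()

new_lines = []
for line in lines:
    name, quantity = line.split()    # assumes "name quantity" lines
    if name == 'bread':              # hypothetical product to delete
        continue                     # drop this line from the output
    new_lines.append(line)

with open('products.txt', 'w') as f: # rewrite the file from the altered list
    f.writelines(new_lines)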

string mismatch in Python even if they have same value

Instead of keeping keys in my application, I intend to read the keys from the local file system into a variable (an array of strings) and use those array elements in my OAuth APIs. However, when I used the keys (in plaintext) as arguments to the OAuth APIs, authentication succeeded, but authentication failed when the same values were read into a variable from the file and that variable was passed to the OAuth API.
I tried comparing the key value and the variable value, only to find that they don't match even though they look exactly the same.
Input file looks as below:
$cat .keys
k1='jFOMZ0bI60fDAEKw53lYCj2r4'
k2='LNkyPehneIi8HeqTg1ji74H42jFkkBxZolRfzNFmaJKwLg7R7E'
secret_keys = []

def keys_io():
    key_file = open('/Users/homie/.keys', 'r+')
    for key in range(1, 5):
        secret_keys.append(key_file.readline().split("=")[1])
    print secret_keys[0]
    print (secret_keys[0] == "jFOMZ0bI60fDAEKw53lYCj2r4")

keys_io()
Output:
jFOMZ0bI60fDAEKw53lYCj2r4
False
What am I missing here?
You should strip the key that you read from the file, as it has a trailing \n:
print(secret_keys[0].strip() == "jFOMZ0bI60fDAEKw53lYCj2r4")
Or do it when reading it:
for key in range(1, 5):
    secret_keys.append(key_file.readline().split("=")[1].strip())
If the leading and trailing quote characters are bothering you, remove them with slicing, i.e. [1:-1] to drop the first and last characters.
I also refactored your function a bit:
def keys_io():
    with open('.keys', 'r+') as f:
        for line in f:
            secret_keys.append(line.split('=')[1].strip()[1:-1])
    print secret_keys[0]
    print (secret_keys[0] == "jFOMZ0bI60fDAEKw53lYCj2r4")
Use a context manager to open/close your file automatically.
Use for line in <opened_file> instead of other methods if you need to examine all lines.
Use strip() without arguments to remove unwanted space.
After these changes, keys_io works like a charm for me with the .keys file you presented.
When you read a text file in Python, you need to strip the newline character first. Another problem is that you have single quotes around the input text. So you need to change the code to:
secret_keys.append(key_file.readline().strip().split("=")[1])
and
if(secret_keys[0] == "\'jFOMZ0bI60fDAEKw53lYCj2r4\'"):

How do I reverse my for loop when I'm parsing a text file (python)

I have a for loop which iterates through a text file (in this case actually a Python file) and it is trying to extract all the functions (looking for the word def). Once it finds that word it starts recording lines until it hits a blank space (which I'm using to denote the end of the function).
My problem is that I want to back up once I hit a def in the file and record any comments that might come before the function. Ex: # This function does the following... etc. I want to back up until I no longer hit a hash.
How would I look backwards with this loop I have written?
for (counter, line) in enumerate(visible_texts):
    line = line.encode('utf-8')
    # if line doesn't contain def then ignore it
    if "def" in line and infunction == 0:
        match = re.search(r'\def (\w+)', line.strip())
        line = line.split("def")[1]
        print "Recording start of the function..."
        # Backup to see if there's any hashes above it (until the end of the hashes) ** how do I do this **
An example of output I would want at the end would be:
# This function was created by Thomas
# This function print a pass string into the function
def printme( str ):
    "This prints a passed string into this function"
    print str
    return
Don't back up; record comments as you go, until you hit a non-comment line. If that line is not a def line, discard the comments gathered:
comments = []
for (counter, line) in enumerate(visible_texts):
    if line.lstrip().startswith('#'):
        comments.append(line)
    elif "def" in line and not infunction:
        comment = '\n'.join(comments)
        comments = []
        # rest of your code
    else:
        comments = []
You can use the ast module to parse Python code.
You should get into the habit of using docstrings instead of comments to describe a function.
One of the advantages is that the ast module can fish out these docstrings using ast.get_docstring().
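For example, a minimal sketch (module.py is a placeholder file name):

import ast

with open('module.py') as f:
    tree = ast.parse(f.read())

for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(node.name, ast.get_docstring(node))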

Python command Line - multiple Line Input

I'm trying to solve a Krypto problem on https://www.spoj.pl in Python, which involves console input.
My problem is that the input string has multiple lines but is needed as one single string in the program.
If I just use raw_input() and paste the text into the console (for testing), Python treats it as if I pressed Enter after every line, so I need to call raw_input() multiple times in a loop.
The problem is that I cannot modify the input string in any way: it doesn't have any symbol that marks the end, and I don't know how many lines there are.
So what do I do?
Upon reaching end of stream on input, raw_input will return an empty string. So if you really need to accumulate entire input (which you probably should be avoiding given SPOJ constraints), then do:
buffer = ''
while True:
    line = raw_input()
    if not line: break
    buffer += line
# process input
Since the end of line is marked as '\r\n' on Windows or '\n' on Unix systems, it is straightforward to remove those strings using:
your_input.replace('\r\n', '')
Since raw_input() is designed to read a single line, you may have trouble this way.
A simple solution would be to put the input string in a text file and parse from there.
Assuming you have input.txt you can take values as
f = open(r'input.txt', 'rU')
for line in f:
    print line,
Using the best answer here, you will still get an EOFError that should be handled. So, I just added exception handling:
buffer = ''
while True:
    try:
        line = raw_input()
    except EOFError:
        break
    if not line:
        break
    buffer += line
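A different approach, which none of the answers above use but which is common for judge-style input, is to read everything from standard input at once with sys.stdin (a sketch):

import sys

data = sys.stdin.read()                # everything up to EOF, newlines included
single_line = ' '.join(data.split())   # collapse all whitespace, including newlines
# process single_line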

Python: How to ignore #comment lines when reading in a file

In Python, I have just read a line from a text file and I'd like to know how to code it to ignore comments with a hash # at the beginning of the line.
I think it should be something like this:
for
if line !contain #
then ...process line
else end for loop
But I'm new to Python and I don't know the syntax
You can use startswith(), e.g.:
for line in open("file"):
li=line.strip()
if not li.startswith("#"):
print line.rstrip()
I recommend you don't ignore the whole line when you see a # character; just ignore the rest of the line. You can do that easily with a string method called partition:
with open("filename") as f:
for line in f:
line = line.partition('#')[0]
line = line.rstrip()
# ... do something with line ...
partition returns a tuple: everything before the partition string, the partition string, and everything after the partition string. So, by indexing with [0] we take just the part before the partition string.
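For example:

>>> 'x = 1  # set x'.partition('#')
('x = 1  ', '#', ' set x')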
EDIT:
If you are using a version of Python that doesn't have partition(), here is code you could use:
with open("filename") as f:
for line in f:
line = line.split('#', 1)[0]
line = line.rstrip()
# ... do something with line ...
This splits the string on a '#' character, then keeps everything before the split. The 1 argument makes the .split() method stop after one split; since we are just grabbing the 0th substring (by indexing with [0]) you would get the same answer without the 1 argument, but this might be a little bit faster. (Simplified from my original code thanks to a comment from @gnr. My original code was messier for no good reason; thanks, @gnr.)
You could also just write your own version of partition(). Here is one called part():
def part(s, s_part):
    i0 = s.find(s_part)
    i1 = i0 + len(s_part)
    return (s[:i0], s[i0:i1], s[i1:])
@dalle noted that '#' can appear inside a string. It's not that easy to handle this case correctly, so I just ignored it, but I should have said something.
If your input file has simple enough rules for quoted strings, this isn't hard. It would be hard if you accepted any legal Python quoted string, because there are single-quoted, double-quoted, multiline quotes with a backslash escaping the end-of-line, triple quoted strings (using either single or double quotes), and even raw strings! The only possible way to correctly handle all that would be a complicated state machine.
But if we limit ourselves to just a simple quoted string, we can handle it with a simple state machine. We can even allow a backslash-quoted double quote inside the string.
c_backslash = '\\'
c_dquote = '"'
c_comment = '#'

def chop_comment(line):
    # a little state machine with two state variables:
    in_quote = False          # whether we are in a quoted string right now
    backslash_escape = False  # true if we just saw a backslash
    for i, ch in enumerate(line):
        if not in_quote and ch == c_comment:
            # not in a quote, saw a '#', it's a comment. Chop it and return!
            return line[:i]
        elif backslash_escape:
            # we must have just seen a backslash; reset that flag and continue
            backslash_escape = False
        elif in_quote and ch == c_backslash:
            # we are in a quote and we see a backslash; escape next char
            backslash_escape = True
        elif ch == c_dquote:
            in_quote = not in_quote
    return line
I didn't really want to get this complicated in a question tagged "beginner" but this state machine is reasonably simple, and I hope it will be interesting.
I'm coming at this late, but the problem of handling shell-style (or Python-style) # comments is a very common one.
I've been using some code almost every time I read a text file.
The problem is that it doesn't handle quoted or escaped comments properly. But it works for simple cases and is easy:
for line in whatever:
    line = line.split('#', 1)[0].strip()
    if not line:
        continue
    # process line
A more robust solution is to use shlex:
import shlex

for line in instream:
    lex = shlex.shlex(line)
    lex.whitespace = ''  # if you want to strip newlines, use '\n'
    line = ''.join(list(lex))
    if not line:
        continue
    # process decommented line
This shlex approach not only handles quotes and escapes properly, it adds a lot of cool functionality (like the ability to have files source other files if you want). I haven't tested it for speed on large files, but it is zippy enough for small stuff.
The common case when you're also splitting each input line into fields (on whitespace) is even simpler:
import shlex

for line in instream:
    fields = shlex.split(line, comments=True)
    if not fields:
        continue
    # process list of fields
This is the shortest possible form:
for line in open(filename):
    if line.startswith('#'):
        continue
    # PROCESS LINE HERE
The startswith() method on a string returns True if the string you call it on starts with the string you passed in.
While this is okay in some circumstances like shell scripts, it has two problems. First, it doesn't specify how to open the file. The default mode for opening a file is 'r', which leaves it implicit that you want text-mode handling; since you're expecting a text file it is better to open it explicitly with 'rt'. Although the text/binary distinction is irrelevant on UNIX-like operating systems, it's important on Windows (and on pre-OS X Macs).
The second problem is the open file handle. The open() function returns a file object, and it's considered good practice to close files when you're done with them. To do that, call the close() method on the object. Now, Python will probably do this for you, eventually; in Python objects are reference-counted, and when an object's reference count goes to zero it gets freed, and at some point after an object is freed Python will call its destructor (a special method called __del__). Note that I said probably: Python has a bad habit of not actually calling the destructor on objects whose reference count drops to zero shortly before the program finishes. I guess it's in a hurry!
For short-lived programs like shell scripts, and particularly for file objects, this doesn't matter. Your operating system will automatically clean up any file handles left open when the program finishes. But if you opened the file, read the contents, then started a long computation without explicitly closing the file handle first, Python is likely to leave the file handle open during your computation. And that's bad practice.
This version will work in any 2.x version of Python, and fixes both the problems I discussed above:
f = open(file, 'rt')
for line in f:
    if line.startswith('#'):
        continue
    # PROCESS LINE HERE
f.close()
This is the best general form for older versions of Python.
As suggested by steveha, using the "with" statement is now considered best practice. If you're using 2.6 or above you should write it this way:
with open(filename, 'rt') as f:
    for line in f:
        if line.startswith('#'):
            continue
        # PROCESS LINE HERE
The "with" statement will clean up the file handle for you.
In your question you said "lines that start with #", so that's what I've shown you here. If you want to filter out lines that start with optional whitespace and then a '#', you should strip the whitespace before looking for the '#'. In that case, you should change this:
if line.startswith('#'):
to this:
if line.lstrip().startswith('#'):
In Python, strings are immutable, so this doesn't change the value of line. The lstrip() method returns a copy of the string with all its leading whitespace removed.
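For example:

>>> '   # indented comment'.lstrip().startswith('#')
True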
I've found recently that a generator function does a great job of this. I've used similar functions to skip comment lines, blank lines, etc.
I define my function as
def skip_comments(file):
    for line in file:
        if not line.strip().startswith('#'):
            yield line
That way, I can just do
f = open('testfile')
for line in skip_comments(f):
    print line
This is reusable across all my code, and I can add any additional handling/logging/etc. that I need.
I know that this is an old thread, but this is a generator function that I use for my own purposes. It strips comments no matter where they appear in the line, as well as stripping leading/trailing whitespace and blank lines. The following source text:
# Comment line 1
# Comment line 2
# host01 # This host commented out.
host02 # This host not commented out.
host03
    host04 # Oops! Included leading whitespace in error!
will yield:
host02
host03
host04
Here is documented code, which includes a demo:
def strip_comments(item, *, token='#'):
    """Generator. Strips comments and whitespace from input lines.

    This generator strips comments, leading/trailing whitespace, and
    blank lines from its input.

    Arguments:
        item (obj): Object to strip comments from.
        token (str, optional): Comment delimiter. Defaults to ``#``.

    Yields:
        str: Next uncommented non-blank line from ``item`` with
            comments and leading/trailing whitespace stripped.
    """
    for line in item:
        s = line.split(token, 1)[0].strip()
        if s:
            yield s

if __name__ == '__main__':
    HOSTS = """# Comment line 1
# Comment line 2
# host01 # This host commented out.
host02 # This host not commented out.
host03
    host04 # Oops! Included leading whitespace in error!""".split('\n')

    hosts = strip_comments(HOSTS)
    print('\n'.join(h for h in hosts))
The normal use case will be to strip the comments from a file (i.e., a hosts file, as in my example above). If this is the case, then the tail end of the above code would be modified to:
if __name__ == '__main__':
    with open('aa.txt', 'r') as f:
        hosts = strip_comments(f)
        for host in hosts:
            print('\'%s\'' % host)
A more compact version of a filtering expression can also look like this:
for line in (l for l in open(filename) if not l.startswith('#')):
    # do something with line
(l for ... ) is called a "generator expression", which acts here as a wrapping iterator that filters out all unneeded lines from the file while iterating over it. Don't confuse it with the same thing in square brackets [l for ... ], which is a "list comprehension" that first reads all the lines from the file into memory and only then starts iterating over it.
Sometimes you might want to have it less one-liney and more readable:
lines = open(filename)
lines = (l for l in lines if ... )
# more filters and mappings you might want
for line in lines:
    # do something with line
All the filters will be executed on the fly in one iteration.
Use a regex such as re.compile(r"^(?:\s+)*#|(?:\s+)") to skip blank lines and comments.
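A usage sketch (the pattern here is tightened to r'^\s*(?:#|$)' so that it matches only whole comment lines and blank lines; the pattern above would also match any line with leading whitespace, which would skip indented code):

import re

skip = re.compile(r'^\s*(?:#|$)')  # blank lines or lines starting with '#'

with open('file.txt') as f:        # 'file.txt' is a placeholder
    for line in f:
        if skip.match(line):
            continue
        print(line.rstrip())       # process the remaining lines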
I tend to use
for line in lines:
    if '#' not in line:
        # do something
This will ignore the whole line, though the answer that uses partition has my upvote, as it can keep any information from before the #.
A good way to get rid of comments, which works for both whole-line and inline comments:
def clear_comments(f):
    new_text = ''
    for line in f.readlines():
        if "#" in line:
            line = line.split("#")[0] + '\n'  # keep the line break
        new_text += line
    return new_text
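Usage might look like this (the file name is a placeholder):

with open('settings.txt') as f:
    print(clear_comments(f))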
