I'm coding in Python 3.4 and trying to solve a task on CodeEval.
The input file consists of lines like:
31415;HYEMYDUMPS
45162;M%muxi%dncpqftiix"
14586214;Uix!&kotvx3
I try to read the input file this way:
import sys
ABC = " !\"#$%&'()*+,-./0123456789:<=>?#ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
test_cases = open(sys.argv[1], 'r')
for test in test_cases:
    cod = test.split(';')[0]
    phrase = test.split(';')[1]

def decode(cod, phrase):
    """my code"""

def main():
    decode(cod, phrase)

if __name__ == '__main__':
    main()
Is this the right way to read the input file? I ask because my solution's status is "Partially" solved...
What is the right way to read such lines (I mean, separated by ';' or ' ')?
Thanks, friends!
It is possible that your phrase contains semicolons too. When you split your test-case line, a phrase that contains semicolons is broken into two or more parts, and since you only assign the first of those parts to your phrase variable, you get incorrect decoding in those cases.
You will get a 'Partially' solved result because the cases where the phrase does not contain semicolons work just fine.
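If that is the cause, splitting only on the first ';' and keeping the rest of the line intact should be enough. Here is a minimal sketch of that idea, assuming the code and the phrase are always separated by the first semicolon on the line:
for test in test_cases:
    # partition() splits on the first ';' only, so any ';' inside the phrase survives;
    # rstrip('\n') drops the trailing line ending
    cod, _, phrase = test.rstrip('\n').partition(';')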
I would like to point out that this is just my guess. Please provide the text/link of the problem so that I can take a look and be of better help.
Related
I am new to regex, so please explain how you arrived at the answer. Anyway, I want to know the best way to match the input function in a separate Python file.
For example:
match.py
a = input("Enter a number")
b = input()
print(a+b)
Now I want to match ONLY the input statements and replace each one with a random number. I will do this from a separate file, main.py. My aim is to replace the input calls in match.py with random numbers so that I can check whether the output comes out as expected. You can think of match.py as a coding exercise where the user writes the code, and main.py as the file that evaluates whether the user's code is right. To do that I need to substitute the inputs myself and check that the code works for all kinds of inputs. I searched for "regex patterns for python input function", but the search did not turn up anything useful. I have a current way of doing it, but I don't think it works in all cases; I need a pattern that works for every case allowed by Python syntax. Here is the main.py I currently have. (It doesn't work for all cases: when the prompt string is written with single quotes it does not replace the call. I could just add single quotes to the pattern, but I also need to detect whether both quote styles are used.)
# Evaluating python file checking if input 2 numbers and print sum is correct
import re
import subprocess
input_pattern = re.compile(r"input\s?\([\"]?[\w]*[\"]?\)")
file = open("match.py", 'r')
read = file.read()
file.close()
code = read
matches = input_pattern.findall(code)
for match in matches:
    code = code.replace(match, '8')
file = open("match.py", 'w')
file.write(code)
file.close()
process = subprocess.Popen('python3 match.py', stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
out = process.communicate()[0]
print(out == b"16\n")
file = open("match.py", 'w')
file.write(read)
file.close()
Please let me know if you don't understand this question.
The following regex statement is very close to what you need:
input\s?\((?(?=[\"\'])[\"\'].*[\"\']\)|\))
I am using a conditional regex statement. However, I think it may need a nested conditional to avoid the situation where the user enters something like:
input(' text ")
But hopefully this gets you on the right track.
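As far as I can tell, the pattern above needs a regex flavor that supports lookahead-based conditionals; if you want to stay with the standard re module, here is a simpler alternation-based pattern of my own. It accepts either quote style but still won't cover every edge case (escaped quotes, expressions inside the call):
import re

# hypothetical sketch: match input(...) with a double-quoted prompt,
# a single-quoted prompt, or no prompt at all
input_pattern = re.compile(r"input\s*\((?:\"[^\"]*\"|'[^']*')?\s*\)")

code = 'a = input("Enter a number")\nb = input()\nc = input(\'again\')\n'
print(input_pattern.sub('8', code))
# a = 8
# b = 8
# c = 8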
Caveat emptor: I can spell p-y-t-h-o-n, and that's pretty much all there is to my knowledge. I tried to take some online classes, but after about 20 lectures without learning much I gave up a long time ago. So what I am going to ask is very simple, but I need help:
I have a file with the following structure:
object_name_here:
  object_owner:
    - me#my.email.com
    - user#another.email.com
  object_id: some_string_here
  identification: some_other_string_here
And this block repeats itself hundreds of times in the same file.
Other than object_name_here, which is unique and required, all the other lines may or may not be present, and the email addresses can range from none to 10+ different addresses.
What I want to do is export this information into a flat file, along the lines of /etc/passwd, with a twist.
For instance, I want the block above to yield a line like this:
object_name_here:object_owner=me#my_email.com,user#another.email.com:objectid=some_string_here:identification=some_other_string_here
Again, the number of fields and the length of the field contents are not fixed by any means. I am sure this is a pretty easy task to accomplish with Python, but I don't know how; I don't even know where to start.
Final Edit: Okay, I am able to write a shell script (bash, ksh, etc.) to parse the information, but when I asked this question originally I was under the impression that Python had a simpler way of handling uniform or semi-uniform data structures such as this one. My understanding was proven to be not very accurate. Sorry for wasting your time.
As jaypb points out, regular expressions are a good idea here. If you're interested in some Python 101, I'll give you some simple code to get you started on your own solution.
The following code is a quick-and-dirty way to lump every six lines of a file into one line of a new file:
# open some files to read and write
oldfile = open("oldfilename", "r")
newfile = open("newfilename", "w")
# initiate variables and iterate over the input file
count = 0
outputLine = ""
for line in oldfile:
    # we're going to append each line of the file to the variable outputLine
    # str.strip() will remove whitespace at the beginning and end of a string
    outputLine = outputLine + line.strip()
    # increment the counter
    count = count + 1
    # you know your interesting stuff is six lines long, so
    # write the output string to the new file and reset it every six lines
    if count % 6 == 0:
        newfile.write(outputLine + "\n")
        outputLine = ""
# clean up
oldfile.close()
newfile.close()
This isn't exactly what you want to do, but it gets you close. For instance, if you want to get rid of the " - " at the beginning of the email addresses and replace it with "=", then instead of just appending to outputLine you'd do something like:
if some condition:
    outputLine = outputLine + '=' + line[3:]
That last bit is a Python slice: [3:] means "give me everything from index 3 onward" (indexing starts at 0), and it works on things like strings and lists.
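For instance, in an interactive session:
>>> " - me#my.email.com"[3:]
'me#my.email.com'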
That'll get you started. Use Google and the Python docs (for instance, googling "python strip" takes you to the built-in types page for Python 2.7.10) to understand every line above, then change things around to get what you need.
Since you are replacing text substrings with different text substrings, this is a pretty natural place to use regular expressions.
Python, fortunately, has an excellent regular expressions library called re.
You will probably want to heavily utilize
re.sub(pattern, repl, string)
Look at the documentation here:
https://docs.python.org/3/library/re.html
Update: Here's an example of how to use the regular expression library:
#!/usr/bin/env python
import re
body = None
with open("sample.txt") as f:
body = f.read()
# Replace emails followed by other emails
body = re.sub(" * - ([a-zA-Z.#]*)\n * -", r"\1,", body)
# Replace declarations of object properties
body = re.sub(" +([a-zA-Z_]*): *[\n]*", r"\1=", body)
# Strip newlines
body = re.sub(":?\n", ":", body)
print (body)
Example output:
$ python example.py
object_name_here:object_owner=me#my.email.com, user#another.email.com:object_id=some_string_here:identification=some_other_string_here
I want to align all the strings in each column of a file in a particular order using a Python script. I have described the problem and the desired outcome using a sample scenario.
#sample.txt
start() "hello"
appended() "fly"
instantiated() "destination"
do() "increment"
logging_sampler() "dummy string"
Output scenario
#sample.txt(indented)
start()             "hello"
appended()          "fly"
instantiated()      "destination"
do()                "increment"
logging_sampler()   "dummy string"
So is there any python library that can process a file and provide the above indentation?
Is there any general solution, so that if I have a file with more than 2 columns I can still indent all the columns in the same manner?
So is there any python library that can process a file and provide the above indentation? NO
Is this Possible? Yes
You need a way to parse your lines and then display them in a formatted manner.
In your particular case parsing is straightforward, as you just need to split the string on the first occurring space. This can easily be done using str.partition. At times you may need more exotic parsing logic, for which you may have to use regex.
Formatting is even simpler if you know the Format String Syntax.
Demo
>>> st = open('sample.txt').read()
>>> for e in st.splitlines():
...     left, _, right = e.partition(' ')
...     print "{:<20}{:<20}".format(left, right)
...
start()             "hello"
appended()          "fly"
instantiated()      "destination"
do()                "increment"
logging_sampler()   "dummy string"
Adapted from this one, here's a function that takes a list of string lists and returns a list of formatted lines.
def table(lines, delim='\t'):
    lens = [len(max(col, key=len)) for col in zip(*lines)]
    fmt = delim.join('{:' + str(x) + '}' for x in lens)
    return [fmt.format(*line) for line in lines]
The rest is trivial:
import re
with open(__file__) as fp:
    lines = [re.split(r' ', s.strip(), maxsplit=1) for s in fp]
print '\n'.join(table(lines))
http://ideone.com/9WucPj
You can use the tab character ("\t") in your printing, but I'm not clear how you are printing sample.txt.
print string1+"\t"+string2
See here for more details
Newbie playing around with hashes here, and not getting the result I am looking for. I'm trying to get a hash from a txt file on the web, then compare that hash to a local hash.
For testing purposes I'm using SHA256.new("10").hexdigest(), which is: 4a44dc15364204a80fe80e9039455cc1608281820fe2b24f1e5233ade6af1dd5
CODE:
import urllib2
from Crypto.Hash import SHA256
source = urllib2.urlopen("<xxURLxx>")
line1 = source.readline() # get first line of the txt file in source which is the hash
localHash = SHA256.new("10").hexdigest()
if localHash == line1:  # I know, I shouldn't use == to compare hashes, but it is my first try.
    print("it works!")
else:
    print("it does not work...")
Printing the hash I get from the web file and the local hash, they show the same characters. But if I hash each of them one more time, I get different results.
Any ideas?
Had a look around S.O. and found:
Compare result from hexdigest() to a string
but the issue there was the lack of .digest() which I have.
Thank you in advance for any help.
If I had to guess, I'd say that changing
line1 = source.readline()
to
line1 = source.readline().strip()
will fix the problem. strip() removes leading and trailing whitespace, including the newline ('\n') character that will almost certainly be at the end of the first line read by readline.
You can see whether there are "invisible" characters like that by using repr, which renders them explicitly using escape characters:
>>> print repr('\t')
'\t'
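To illustrate with the hash from the question (assuming, as is typical, that readline() returns the line with its trailing newline):
>>> web_hash = "4a44dc15364204a80fe80e9039455cc1608281820fe2b24f1e5233ade6af1dd5\n"
>>> local_hash = "4a44dc15364204a80fe80e9039455cc1608281820fe2b24f1e5233ade6af1dd5"
>>> web_hash == local_hash
False
>>> web_hash.strip() == local_hash
True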
I was trying to make a script to help me automate clean-ups in the Linux kernel a little bit. The first thing on my agenda was to remove the braces ({}) from C-style if statements where they aren't necessary, i.e. around single-statement blocks. With my limited knowledge of regex in Python I got the script to a working state, such that:
if (!buf || !buf_len) {
        TRACE_RET(chip, STATUS_FAIL);
}
and the script turns it into:
if (!buf || !buf_len)
        TRACE_RET(chip, STATUS_FAIL);
That's what I want, but when I try it on real source files it seems to pick an if statement almost at random: it deletes that statement's opening brace even though the block contains multiple statements, and then removes a closing brace much further down the file, usually on an else statement or a long if statement.
So can someone please help me make the script only touch an if statement when it has a single-statement block, and correctly delete the corresponding opening and closing braces?
The current script looks like:
from sys import argv
import os
import sys
import re

get_filename = argv[1]
target = open(get_filename)
rename = get_filename + '.tmp'
temp = open(rename, 'w')

def if_statement():
    look = target.read()
    pattern = r'''if (\([^.)]*\)) (\{)(\n)([^>]+)(\})'''
    replacement = r'''if \1 \3\4'''
    pattern_obj = re.compile(pattern, re.MULTILINE)
    outtext = re.sub(pattern_obj, replacement, look)
    temp.write(outtext)
    temp.close()
    target.close()

if_statement()
Thanks in advance
In theory, this would mostly work:
re.sub(r'(if\s*\([^{]+\)\s*){([^;]*;)\s*}', r'\1\2', yourstring)
Note that this will fail on nested single-statement blocks and on semicolons inside string or character literals.
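For example, applied to the snippet from the question (just a quick sketch; I'd print the result and eyeball it before writing anything back to the source file):
import re

c_code = '''if (!buf || !buf_len) {
        TRACE_RET(chip, STATUS_FAIL);
}
'''

# group 1 keeps the 'if (...)' header, group 2 keeps the single statement;
# the surrounding braces are dropped
print(re.sub(r'(if\s*\([^{]+\)\s*){([^;]*;)\s*}', r'\1\2', c_code))
# if (!buf || !buf_len)
#         TRACE_RET(chip, STATUS_FAIL);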
In general, trying to parse C code with regex is a bad idea, and you really shouldn't get rid of those braces anyway. It's good practice to have them and they're not hurting anything.