I'm having a difficult time understanding how to get Python to call a system function...
the_file = ('logs/consolidated.log.gz')
webstuff = subprocess.Popen(['/usr/bin/zgrep', '/meatsauce/', the_file ],stdout=subprocess.PIPE) % dpt_search
for line in webstuff.stdout:
    print line
I'm trying to get Python to build another file from the lines matching my search string.
Thanks!
I recommend the PyMotW Subprocess page from Doug Hellmann, who, to quote the site, "Reads the docs so you don't have to".
Apart from that:
f = file('sourcefile')
for line in f:
    if 'pattern' in line:
        # mind the , at the end,
        # since there's no stripping involved
        # and print adds a newline without it
        print line,
If you need to match regular expressions, then apart from the Python Standard Library documentation for the re module, also refer to the PyMotW Regular Expressions page.
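As for the subprocess call in the question, here is a rough sketch of a corrected version; the search pattern and the output filename are assumptions, and the main fixes are passing the pattern as a plain list argument and dropping the % formatting on the Popen object:

import subprocess

# the pattern and output filename are placeholders for your real values
dpt_search = 'meatsauce'
the_file = 'logs/consolidated.log.gz'
webstuff = subprocess.Popen(['/usr/bin/zgrep', dpt_search, the_file],
                            stdout=subprocess.PIPE)
with open('matches.log', 'wb') as out:
    for line in webstuff.stdout:
        out.write(line)
webstuff.wait()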
I have a text file /etc/default/foo which contains one line:
FOO="/path/to/foo"
In my python script, I need to reference the variable FOO.
What is the simplest way to "source" the file /etc/default/foo into my python script, same as I would do in bash?
. /etc/default/foo
Same answer as @jil's; however, that answer is specific to a historical version of Python.
In modern Python (3.x):
exec(open('filename').read())
replaces execfile('filename') from 2.x
You could use execfile:
execfile("/etc/default/foo")
But please be aware that this will evaluate the contents of the file as-is into your program source. It is a potential security hazard unless you can fully trust the source.
It also means that the file needs to be valid python syntax (your given example file is).
Keep in mind that if you have a "text" file with this content that has a .py as the file extension, you can always do:
import mytextfile
print(mytextfile.FOO)
Of course, this assumes that the text file is syntactically correct as far as Python is concerned. On a project I worked on we did something similar to this. Turned some text files into Python files. Wacky but maybe worth consideration.
Just to give a different approach, note that if your original file is set up as
export FOO=/path/to/foo
You can do source /etc/default/foo; python myprogram.py (or . /etc/default/foo; python myprogram.py), and within myprogram.py all the values that were exported in the sourced file are visible in os.environ, e.g.
import os
os.environ["FOO"]
If you know for certain that it only contains VAR="QUOTED STRING" style variables, like this:
FOO="some value"
Then you can just do this:
>>> with open('foo.sysconfig') as fd:
...     exec(fd.read())
Which gets you:
>>> FOO
'some value'
(This is effectively the same thing as the execfile() solution
suggested in the other answer.)
This method has substantial security implications; if instead of FOO="some value" your file contained:
os.system("rm -rf /")
Then you would be In Trouble.
Alternatively, you can do this:
>>> import shlex
>>> with open('foo.sysconfig') as fd:
...     settings = {var: shlex.split(value) for var, value in [line.split('=', 1) for line in fd]}
Which gets you a dictionary settings that has:
>>> settings
{'FOO': ['some value']}
That settings = {...} line is using a dictionary comprehension. You could accomplish the same thing in a few more lines with a for loop and so forth.
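Spelled out with an explicit loop instead of the comprehension, it might look like this (same hypothetical foo.sysconfig file as above):

import shlex

settings = {}
with open('foo.sysconfig') as fd:
    for line in fd:
        var, value = line.split('=', 1)
        settings[var] = shlex.split(value)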
And of course if the file contains shell-style variable expansion like ${somevar:-value_if_not_set} then this isn't going to work (unless you write your very own shell style variable parser).
There are a couple ways to do this sort of thing.
You can indeed import the file as a module, as long as the data it contains corresponds to Python's syntax. But then either the file in question must be a .py in the same directory as your script, or you have to use imp (or importlib, depending on your version), as sketched below.
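For illustration, here is a sketch of the imp route under Python 2; the module name is made up, and the file must contain valid Python syntax:

import imp

# load /etc/default/foo as a module under a name of our choosing
foo_settings = imp.load_source('foo_settings', '/etc/default/foo')
print(foo_settings.FOO)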
Another solution (which I prefer) is to use a data format that Python can parse out of the box; JSON comes to mind as an example.
/etc/default/foo:
{"FOO":"path/to/foo"}
And in your python code :
import json
with open('/etc/default/foo') as file:
data = json.load(file)
FOO = data["FOO"]
## ...
file.close()
This way, you don't risk executing untrusted code...
You have the choice, depending on what you prefer. If your data file is auto-generated by some script, it might be easier to keep a simple syntax like FOO="path/to/foo" and use imp.
Hope that helps!
The Solution
Here is my approach: parse the bash file myself and process only variable assignment lines such as:
FOO="/path/to/foo"
Here is the code:
import shlex

def parse_shell_var(line):
    """
    Parse such lines as:
        FOO="My variable foo"

    :return: a tuple of var name and var value, such as
        ('FOO', 'My variable foo')
    """
    return shlex.split(line, posix=True)[0].split('=', 1)

if __name__ == '__main__':
    with open('shell_vars.sh') as f:
        shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
        print(shell_vars)
How It Works
Take a look at this snippet:
shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
This line iterates through the lines of the shell script, processing only those that contain an equal sign (not a fool-proof way to detect variable assignment, but the simplest). Next, those lines are run through the function parse_shell_var, which uses shlex.split to correctly handle the quotes (or the lack thereof). Finally, the pieces are assembled into a dictionary. The output of this script is:
{'MOO': '/dont/have/a/cow', 'FOO': 'my variable foo', 'BAR': 'My variable bar'}
Here is the contents of shell_vars.sh:
FOO='my variable foo'
BAR="My variable bar"
MOO=/dont/have/a/cow
echo $FOO
Discussion
This approach has a couple of advantages:
It does not execute the shell (either in bash or in Python), which avoids any side-effect
Consequently, it is safe to use, even if the origin of the shell script is unknown
It correctly handles values with or without quotes
This approach is not perfect; it has a few limitations:
The method of detecting variable assignment (by looking for the presence of the equal sign) is primitive and not accurate. There are ways to better detect these lines but that is the topic for another day
It does not correctly parse values which are built upon other variables or commands. That means, it will fail for lines such as:
FOO=$BAR
FOO=$(pwd)
Building on the answer that uses exec(open(...).read()): value = eval(open(...).read()) will instead return the value of the single expression in the file. E.g.:
1 + 1: 2
"Hello World": 'Hello World'
float(2) + 1: 3.0
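For instance (the filename is just an example):
>>> # expr.txt contains the single line: 1 + 1
>>> value = eval(open('expr.txt').read())
>>> value
2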
I would like to read the next logical line from a file into python, where logical means "according to the syntax of python".
I have written a small command which reads a set of statements from a file, and then prints out what you would get if you typed the statements into a python shell, complete with prompts and return values. Simple enough -- read each line, then eval. Which works just fine, until you hit a multi-line string.
I'm trying to avoid doing my own lexical analysis.
As a simple example, say I have a file containing
2 + 2
I want to print
>>> 2 + 2
4
and if I have a file with
"""Hello
World"""
I want to print
>>>> """Hello
...World"""
'Hello\nWorld'
The first of these is trivial -- read a line, eval, print. But then I need special support for comment lines. And now triple quotes. And so on.
You may want to take a look at the InteractiveInterpreter class from the code module.
The runsource() method shows how to deal with incomplete input.
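A quick sketch of that behavior: runsource() returns True while more input is expected and False once the source is complete (and has been executed):

>>> import code
>>> interp = code.InteractiveInterpreter()
>>> interp.runsource('"""Hello')   # unfinished triple-quoted string
True
>>> interp.runsource('x = 1')      # complete statement, gets executed
False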
Okay, so resi had the correct idea. Here is my trivial code which does the job.
#!/usr/bin/python
import sys
import code

class Shell(code.InteractiveConsole):
    def write(self, data):
        print(data)

cons = Shell()
file_contents = sys.stdin
prompt = ">>> "
for line in file_contents:
    print prompt + line,
    # push() returns True while the statement is still incomplete
    if cons.push(line.rstrip()):
        prompt = "... "
    else:
        prompt = ">>> "
abc=123
dabc=123
abc=456
dabc=789
aabd=123
From the above file I need to find lines beginning with abc= (leading whitespace doesn't matter).
in ruby I would put this in an array and do
matches = input.grep(/^\s*abc=.*/).map(&:strip)
I'm a total noob in Python; even calling me a fresh Python developer would be too much.
Maybe there is a better "Python way" of doing this without even grepping?
The Python version I have available on the platform where I need to solve the problem is 2.6.
Using Ruby is not an option this time.
with open("myfile.txt") as myfile:
matches = [line.rstrip() for line in myfile if line.lstrip().startswith("abc=")]
In Python you would typically use a list comprehension whose if clause does what you'd accomplish with Ruby's grep:
import sys, re
matches = [line.strip() for line in sys.stdin
           if re.match(r'^\s*abc=.*', line)]
I have the following lines (many, many):
...
gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567
..
What I'd like to do is find the 'particular' line (whatever number comes after the ':')
and replace that number with '111222333'. How can I do that using Python regular expressions?
for line in input:
    key, val = line.split(':')
    if key == 'particular':
        val = '111222333'
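To actually produce the rewritten output, you would reassemble each line and write it somewhere; a minimal sketch (input is the same iterable of lines as above, and writing to stdout is an assumption):

import sys

for line in input:
    key, val = line.split(':')
    if key == 'particular':
        # val keeps the leading space and trailing newline from the split
        val = ' 111222333\n'
    sys.stdout.write(key + ':' + val)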
I'm not sure regex would be of any value in this specific case. My guess is they'd be slower. That said, it can be done. Here's one way:
import re

for line in input:
    line = re.sub('^particular: .*', 'particular: 111222333', line)
There are subtleties involved in this, and this is almost certainly not what you'd want in production code. You need to check all of the re module constants to make sure the regex is acting the way you expect, etc. You might be surprised at the flexibility you find in dealing with problems like this in Python if you try not to use re (of course, this isn't to say re isn't useful) ;-)
Are you sure you need a regular expression?
other_number = '111222333'
some_text, some_number = line.split(': ')
new_line = ': '.join([some_text, other_number])
#!/usr/bin/env python
import re

text = '''gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567'''

# (?m) makes ^ match at each line start, so only particular's number changes
print(re.sub(r'(?m)^(particular: )[0-9]+', r'\g<1>111222333', text))
input = """gfnfgnfgnf: 5656756734
arvervfdsa: 1343453563
particular: 4685685685
erveveersd: 3453454545
verveversf: 7896789567"""
entries = re.split("\n+", input)
for entry in entries:
if entry.startswith("particular"):
entry = re.sub(r'[0-9]+', r'111222333', entry)
or with sed:
sed -e 's/^particular: [0-9].*$/particular: 111222333/g' file
An important point here is that if you have a lot of lines, you want to process them one by one. That is, instead of reading all the lines in replacing them, and writing them out again, you should read in a line at a time and write out a line at a time. (This would be inefficient if you were actually reading a line at a time from the disk; however, Python's IO is competent and will buffer the file for you.)
with open(...) as infile, open(...) as outfile:
    for line in infile:
        if line.startswith("particular"):
            outfile.write("particular: 111222333\n")
        else:
            outfile.write(line)
This will be speed- and memory-efficient.
Your sed example forces me to say neat!
python -c "import re, sys; print ''.join(re.sub(r'^(particular:) \d+', r'\1 111222333', l) for l in open(sys.argv[1]))" file
I have a lot of Perl scripts that looks something like the following. What it does is that it will automatically open any file given as a command line argument and in this case print the content of that file. If no file is given it will instead read from standard input.
while ( <> ) {
    print $_;
}
Is there a way to do something similar in Python without having to explicitly open each file?
The fileinput module in Python's standard library is designed exactly for this purpose, and I quote a bit of code from the URL I just gave:
import fileinput
for line in fileinput.input():
    process(line)
Use print in lieu of process and you have the exact equivalent of your Perl code.
You could look into sys.argv. It may help.
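For completeness, here is a sketch of what the sys.argv route might look like; it mimics fileinput by falling back to standard input when no filenames are given:

import sys

names = sys.argv[1:]
if names:
    for name in names:
        with open(name) as f:
            for line in f:
                sys.stdout.write(line)
else:
    for line in sys.stdin:
        sys.stdout.write(line)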