Parsing puppet-api YAML with Python

I am creating a script which needs to parse the YAML output that Puppet produces.
When I make a request against, for example, https://puppet:8140/production/catalog/my.testserver.no, I get back some YAML that looks something like:
--- &id001 !ruby/object:Puppet::Resource::Catalog
aliases: {}
applying: false
classes:
- s_baseconfig
...
edges:
- &id111 !ruby/object:Puppet::Relationship
  source: &id047 !ruby/object:Puppet::Resource
    catalog: *id001
    exported:
and so on. The problem is that when I do a yaml.load(yamlstream), I get an error like:
yaml.constructor.ConstructorError: could not determine a constructor for the tag '!ruby/object:Puppet::Resource::Catalog'
in "<string>", line 1, column 5:
--- &id001 !ruby/object:Puppet::Reso ...
^
As far as I know, this &id001 part is supported in YAML.
Is there any way around this? Can I tell the YAML parser to ignore these tags?
I only need a couple of lines from the YAML stream, so maybe regex is my friend here?
Has anyone written YAML cleanup regexes before?
You can get the YAML output with curl like this:
curl --cert /var/lib/puppet/ssl/certs/$(hostname).pem --key /var/lib/puppet/ssl/private_keys/$(hostname).pem --cacert /var/lib/puppet/ssl/certs/ca.pem -H 'Accept: yaml' https://puppet:8140/production/catalog/$(hostname)
I also found some info about this on the Puppet mailing list at http://www.mail-archive.com/puppet-users#googlegroups.com/msg24143.html, but I can't get it to work correctly...

I emailed Kirill Simonov, the creator of PyYAML, to get help parsing Puppet's YAML files.
He gladly helped with the following code. This code is for parsing a Puppet log, but I'm sure you can modify it to parse other Puppet YAML files.
The idea is to register the correct constructors for the Ruby tags; PyYAML can read the data after that.
Here goes:
#!/usr/bin/env python
import yaml

# Treat any !ruby/object:* tag as a plain mapping
def construct_ruby_object(loader, suffix, node):
    return loader.construct_yaml_map(node)

# Treat !ruby/sym as a plain string
def construct_ruby_sym(loader, node):
    return loader.construct_yaml_str(node)

yaml.add_multi_constructor(u"!ruby/object:", construct_ruby_object)
yaml.add_constructor(u"!ruby/sym", construct_ruby_sym)

stream = open('201203130939.yaml', 'r')
mydata = yaml.load(stream)
print mydata
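Note that with newer PyYAML releases (5.1+), a plain yaml.load() without a Loader argument issues a warning; the same constructors can instead be registered on SafeLoader. A minimal sketch, assuming the same file name:
import yaml

def construct_ruby_object(loader, suffix, node):
    return loader.construct_yaml_map(node)

def construct_ruby_sym(loader, node):
    return loader.construct_yaml_str(node)

# Register on SafeLoader so yaml.safe_load() accepts the Ruby tags
yaml.add_multi_constructor("!ruby/object:", construct_ruby_object, Loader=yaml.SafeLoader)
yaml.add_constructor("!ruby/sym", construct_ruby_sym, Loader=yaml.SafeLoader)

with open('201203130939.yaml') as stream:
    mydata = yaml.safe_load(stream)
print(mydata)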

I believe the crux of the matter is the fact that Puppet is using YAML "tags" for Ruby-fu, and that's confusing the default Python loader. In particular, PyYAML has no idea how to construct a ruby/object:Puppet::Resource::Catalog, which makes sense, since that's a Ruby object.
Here's a link showing some various uses of YAML tags: http://www.yaml.org/spec/1.2/spec.html#id2761292
I've gotten past this with a brute-force approach, simply doing something like:
cat the_yaml | sed 's#\!ruby/object.*$##gm' > cleaner.yaml
but now I'm stuck on an issue where the *resource_table* block confuses PyYAML with its complex keys (specifically, the use of '? ' to indicate the start of a complex key).
If you find a nice way around that, please let me know... but given how tied at the hip Puppet is to Ruby, it may just be easier to write your script directly in Ruby.
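One possible workaround for the complex-key issue (an untested sketch of my own, not from Puppet's tooling): replace the default mapping constructor so that '? '-style sequence keys, which PyYAML builds as unhashable lists, become hashable tuples:
import yaml

def construct_tuple_keyed_map(loader, node):
    # Build each key/value pair by hand; list-valued complex keys
    # are converted to tuples so they can be used as dict keys.
    mapping = {}
    for key_node, value_node in node.value:
        key = loader.construct_object(key_node, deep=True)
        if isinstance(key, list):
            key = tuple(key)
        mapping[key] = loader.construct_object(value_node, deep=True)
    return mapping

yaml.add_constructor(u'tag:yaml.org,2002:map', construct_tuple_keyed_map)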

I only needed the classes section, so I ended up creating this little Python function to strip it out...
Hope it's useful for someone :)
#!/usr/bin/env python
import re

def getSingleYamlClass(className, yamlList):
    printGroup = False
    groupIndent = 0
    firstInGroup = False
    output = ''
    for line in yamlList:
        # Count how many spaces in the beginning of our line
        spaceCount = len(re.findall(r'^[ ]*', line)[0])
        cleanLine = line.strip()
        if cleanLine == className:
            printGroup = True
            groupIndent = spaceCount
            firstInGroup = True
        if (printGroup and spaceCount > groupIndent) or firstInGroup:
            # Strip away the X amount of spaces for this group, so we get valid yaml
            output += re.sub(r'^[ ]{%s}' % groupIndent, '', line) + '\n'
            firstInGroup = False  # Reset this
        else:
            # End of our group, reset
            groupIndent = 0
            printGroup = False
    return output

getSingleYamlClass('classes:', open('puppet.yaml').readlines())
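Alternatively, once the Ruby-tag constructors from the accepted answer are registered, the classes section can be pulled out without any text munging. A minimal sketch, assuming the catalog parses to a plain dict with a top-level 'classes' key as in the sample above, and assuming the complex-key issue mentioned in the previous answer doesn't bite (or has been worked around):
import yaml

def construct_ruby_object(loader, suffix, node):
    return loader.construct_yaml_map(node)

def construct_ruby_sym(loader, node):
    return loader.construct_yaml_str(node)

yaml.add_multi_constructor(u"!ruby/object:", construct_ruby_object)
yaml.add_constructor(u"!ruby/sym", construct_ruby_sym)

with open('puppet.yaml') as stream:
    catalog = yaml.load(stream)
# 'classes' is a top-level key of the catalog object in the sample above
print catalog['classes']  # e.g. ['s_baseconfig', ...]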

Simple YAML parser:
import yaml

with open("file", "r") as f:  # 'f', not 'file', to avoid shadowing the builtin
    for line in f:
        # 'parsed', not 're', to avoid shadowing the re module
        parsed = yaml.load('\n'.join(line.split('?')[1:-1]).replace('?', '\n').replace('""', '\'').replace('"', '\''))
        # print '\n'.join(line.split('?')[1:-1])
        # print '\n'.join(line.split('?')[1:-1]).replace('?','\n').replace('""','\'').replace('"','\'')
        print line
        print parsed


How to best capture the final process progress output containing carriage returns

I use Python to connect multiple processing tools for NLP tasks together, and I also capture the output of each in case something fails and write it to a log.
Some tools need many hours and output their current status as a progress percentage with carriage returns (\r). They do many steps, so they mix normal messages and progress messages. That sometimes results in really large log files that are hard to view with less.
My log will look like this (for fast progress):
[DEBUG ] [FILE] [OUT] ^M4% done^M8% done^M12% done^M15% done^M19% done^M23% done^M27% done^M31% done^M35% done^M38% done^M42% done^M46% done^M50% done^M54% done^M58% done^M62% done^M65% done^M69% done^M73% done^M77% done^M81% done^M85% done^M88% done^M92% done^M96% done^M100% doneFinished
What I want is an easy way to collapse those strings in Python. (I guess it is also possible to do this after the pipeline is finished, by replacing the progress messages with e.g. sed ...)
My code for running and capturing the output looks like this:
import os
import subprocess
from tempfile import NamedTemporaryFile

# debug() and error() are my own logging helpers (definitions not shown)

def run_command_of(command):
    try:
        out_file = NamedTemporaryFile(mode='w+b', delete=False, suffix='out')
        err_file = NamedTemporaryFile(mode='w+b', delete=False, suffix='err')
        debug('Redirecting command output to temp files ...', \
              'out =', out_file.name, ', err =', err_file.name)
        p = subprocess.Popen(command, shell=True, \
                             stdout=out_file, stderr=err_file)
        p.communicate()
        status = p.returncode

        def fr_gen(file):
            debug('Reading from %s ...' % file.name)
            file.seek(0)
            for line in file:
                # TODO: UnicodeDecodeError?
                # reload(sys)
                # sys.setdefaultencoding('utf-8')
                # unicode(line, 'utf-8')
                # no decoding ...
                yield line.decode('utf-8', errors='replace').rstrip()
            debug('Closing temp file %s' % file.name)
            file.close()
            os.unlink(file.name)

        return (fr_gen(out_file), fr_gen(err_file), status)
    except:
        from sys import exc_info
        error('Error while running command', command, exc_info()[0], exc_info()[1])
        return (None, None, 1)

def execute(command, check_retcode_null=False):
    debug('run command:', command)
    out, err, status = run_command_of(command)
    debug('-> exit status:', status)
    if out is not None:
        is_empty = True
        for line in out:
            is_empty = False
            debug('[FILE]', '[OUT]', line.encode('utf-8', errors='replace'))
        if is_empty:
            debug('execute: no output')
    else:
        debug('execute: no output?')
    if err is not None:
        is_empty = True
        for line in err:
            is_empty = False
            debug('[FILE]', '[ERR]', line.encode('utf-8', errors='replace'))
        if is_empty:
            debug('execute: no error-output')
    else:
        debug('execute: no error-output?')
    if check_retcode_null:
        return status == 0
    return True
It is some older code in Python 2 (fun times with unicode strings) that I want to rewrite for Python 3 and improve upon. (I'm also open to suggestions on how to process the output in realtime instead of when everything is finished. Update: that is too broad and not exactly part of my problem.)
I can think of many approaches but do not know if there is a ready-to-use function/library/etc., and I could not find any. (My google-fu needs work.) The only things I found were ways to remove the CR/LF, but not the string portion that gets visually replaced. So I'm open to suggestions and improvements before I invest my time in reinventing the wheel. ;-)
My approach would be to use a regex to find sections in a string/line between \r characters and remove them. Optionally I would keep a single percentage value for really long processes. Something like \r([^\r]*\r).
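For illustration, a minimal sketch of that regex idea (my own toy string, not from the real logs): dropping every segment that ends in a carriage return keeps only what would remain visible on the terminal:
import re

line = '4% done\r8% done\r100% doneFinished'
# every run of non-CR characters that is followed by '\r' gets overwritten
print(re.sub(r'[^\r\n]*\r', '', line))  # -> '100% doneFinished'
Note that this does not handle partial overwrites (see the echo example in the answer below).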
Note: a possible duplicate of: How to pull the output of the most recent terminal command? That may require a wrapper script, but it can still be used to convert my old log files with script2log. I also found/got a suggestion for a plain Python way that fulfills my needs.
I think the solution for my use case is as simple as this snippet:
# my data
segments = ['abcdef', '567', '1234', 'xy', '\n']
s = '\r'.join(segments)

# really fast approach:
last = s.rstrip().split('\r')[-1]

# or: simulate the overwrites
parts = s.rstrip().split('\r')
last = parts[-1]
last_len = len(last)
for part in reversed(parts):
    if len(part) > last_len:
        last = last + part[last_len:]
        last_len = len(last)

# result
print(last)
Thanks to the comments on my question, I could better/further refine my requirements. In my case the only control characters are carriage returns (CR, \r), and a rather simple solution works, as tripleee suggested.
Why not simply take the last part after \r? Because the output of
echo -e "abcd\r12"
can result in:
12cd
The questions under the subprocess tag (also suggested in a comment by tripleee) should help with realtime/interleaved output, but they are outside my current focus. I will have to test for the best approach. I was already using stdbuf to switch the buffering when needed.

File parsing using Unix Shell Scripting

I am trying to do some transformation and I'm stuck. Here goes the problem description.
Below is the pipe-delimited file. I have masked the data!
AccountancyNumber|AccountancyNumberExtra|Amount|ApprovedBy|BranchCurrency|BranchGuid|BranchId|BranchName|CalculatedCurrency|CalculatedCurrencyAmount|CalculatedCurrencyVatAmount|ControllerBy|Country|Currency|CustomFieldEnabled|CustomFieldGuid|CustomFieldName|CustomFieldRequired|CustomFieldValue|Date|DateApproved|DateControlled|Email|EnterpriseNumber|ExpenseAccountGuid|ExpenseAccountName|ExpenseAccountStatus|ExpenseGuid|ExpenseReason|ExpenseStatus|ExternalId|GroupGuid|GroupId|GroupName|IBAN|Image|IsInvoice|MatchStatus|Merchant|MerchantEnterpriseNumber|Note|OwnerShip|PaymentMethod|PaymentMethodGuid|PaymentMethodName|ProjectGuid|ProjectId|ProjectName|Reimbursable|TravellerId|UserGUID|VatAmount|VatPercentage|XpdReference|VatCode|FileName|CreateTstamp
61470003||30.00|null|EUR|168fcea9-17d4-45a1-8b6f-bfb249cdbea6|BEL|BEL|USD,INR,EUR|35.20,2420.11,30.00|null,null,null|null|BE|EUR|true|0d4b767b-0988-47e8-9144-05e607169284|careertitle|false|FE|2018-07-24T00:00:00|null|null|abc_def#xyz.com||c32f03c6-31df-4fd8-8cc2-1c5f3a580aad|Meals - In Office|true|781d10d2-2f3b-43bc-866e-a653fefacbbe||Approved|70926|40ac7117-c7e2-42ea-b34f-96330c9380b6|BEL-FSP-Users|BEL-FSP-Users|||false|None|in office meal #1|||Personal|Cash|1ee44666-f4c7-44b3-acd3-8ecd7127480a|Cash|2cb4ccb7-634d-4386-af43-b4572ec72098|00AA06|00AA06|true||6c5a835f-5152-46db-923a-3ebd08c7dad3|null|null|XPD012245802||1820711.xml|2018-08-07 05:42:10.46
In this file, we have a CalculatedCurrency field with multiple values delimited by a comma. The file also has a CalculatedCurrencyAmount field which likewise has multiple comma-delimited values. But I need to pick up only the currency value from the CalculatedCurrency field which matches
BranchCurrency (another field in the file), and of course the corresponding CalculatedCurrencyAmount for that currency.
Required output : -
AccountancyNumber|AccountancyNumberExtra|Amount|ApprovedBy|BranchCurrency|BranchGuid|BranchId|BranchName|CalculatedCurrency|CalculatedCurrencyAmount|CalculatedCurrencyVatAmount|ControllerBy|Country|Currency|CustomFieldEnabled|CustomFieldGuid|CustomFieldName|CustomFieldRequired|CustomFieldValue|Date|DateApproved|DateControlled|Email|EnterpriseNumber|ExpenseAccountGuid|ExpenseAccountName|ExpenseAccountStatus|ExpenseGuid|ExpenseReason|ExpenseStatus|ExternalId|GroupGuid|GroupId|GroupName|IBAN|Image|IsInvoice|MatchStatus|Merchant|MerchantEnterpriseNumber|Note|OwnerShip|PaymentMethod|PaymentMethodGuid|PaymentMethodName|ProjectGuid|ProjectId|ProjectName|Reimbursable|TravellerId|UserGUID|VatAmount|VatPercentage|XpdReference|VatCode|FileName|CreateTstamp|ActualCurrency|ActualAmount
61470003||30.00|null|EUR|168fcea9-17d4-45a1-8b6f-bfb249cdbea6|BEL|BEL|USD,INR,EUR|35.20,2420.11,30.00|null,null,null|null|BE|EUR|true|0d4b767b-0988-47e8-9144-05e607169284|careertitle|false|FE|2018-07-24T00:00:00|null|null|abc_def#xyz.com||c32f03c6-31df-4fd8-8cc2-1c5f3a580aad|Meals - In Office|true|781d10d2-2f3b-43bc-866e-a653fefacbbe||Approved|70926|40ac7117-c7e2-42ea-b34f-96330c9380b6|BEL-FSP-Users|BEL-FSP-Users|||false|None|in office meal #1|||Personal|Cash|1ee44666-f4c7-44b3-acd3-8ecd7127480a|Cash|2cb4ccb7-634d-4386-af43-b4572ec72098|00AA06|00AA06|true||6c5a835f-5152-46db-923a-3ebd08c7dad3|null|null|XPD012245802||1820711.xml|2018-08-07 05:42:10.46|EUR|30.00
Please help.
Snaplogic Python Script
from com.snaplogic.scripting.language import ScriptHook
from com.snaplogic.scripting.language.ScriptHook import *
import csv

class TransformScript(ScriptHook):
    def __init__(self, input, output, error, log):
        self.input = input
        self.output = output
        self.error = error
        self.log = log

    def execute(self):
        self.log.info("Executing Transform script")
        while self.input.hasNext():
            data = self.input.next()
            branch_currency = data['BranchCurrency']
            calc_currency = data['CalculatedCurrency'].split(',')
            calc_currency_amount = data['CalculatedCurrencyAmount'].split(',')
            result = None
            for i, name in enumerate(calc_currency):
                result = calc_currency_amount[i] if name == branch_currency else result
            data["CalculatedCurrencyAmount"] = result
            result1 = calc_currency[i] if name == branch_currency else result
            data["CalculatedCurrency"] = result1
            try:
                data["mathTryCatch"] = data["counter2"].longValue() + 33
                self.output.write(data)
            except Exception as e:
                data["errorMessage"] = e.message
                self.error.write(data)
        self.log.info("Finished executing the Transform script")

hook = TransformScript(input, output, error, log)
Using bash with some arrays:
arr_find() {
    echo $(( $(printf "%s\0" "${@:2}" | grep -Fnxz "$1" | cut -d: -f1) - 1 ))
}

IFS='|' read -r -a headers
while IFS='|' read -r "${headers[@]}"; do
    IFS=',' read -r -a CalculatedCurrency <<<"$CalculatedCurrency"
    IFS=',' read -r -a CalculatedCurrencyAmount <<<"$CalculatedCurrencyAmount"
    idx=$(arr_find "$BranchCurrency" "${CalculatedCurrency[@]}")
    echo "BranchCurrency is $BranchCurrency. Hence CalculatedCurrency will be ${CalculatedCurrency[$idx]} and CalculatedCurrencyAmount will have to be ${CalculatedCurrencyAmount[$idx]}."
done
First I read all the header names. Then I read all values into variables named after those headers. Then I re-read the CalculatedCurrency* fields, since they are separated by ','. Then I find the element index which is equal to BranchCurrency inside CalculatedCurrency. Having the element index and the arrays, I can just print the output.
I know the OP asked for unix shell, but as an alternative option I'll show some code to do it using Python. (Obviously this code can be heavily improved as well.) The great advantage is readability: for instance, you can address your data by name. And the code is much more readable overall than doing this with awk et al.
Save your data in data.psv and write the following script into a file main.py. I've tested it using python3 and python2; both work. Run the script using python main.py.
Update: I've extended the script to parse all lines. In the example data, I've set BranchCurrency to EUR in the first line and USD in the second line, as a dummy test.
File: main.py
import csv

def parse_line(row):
    branch_currency = row['BranchCurrency']
    calc_currency = row['CalculatedCurrency'].split(',')
    calc_currency_amount = row['CalculatedCurrencyAmount'].split(',')
    result = None
    for i, name in enumerate(calc_currency):
        result = calc_currency_amount[i] if name == branch_currency else result
    return result

def main():
    with open('data.psv') as f:
        reader = csv.DictReader(f, delimiter='|')
        for row in reader:
            print(parse_line(row))

if __name__ == '__main__':
    main()
Example Data:
[:~] $ cat data.psv
AccountancyNumber|AccountancyNumberExtra|Amount|ApprovedBy|BranchCurrency|BranchGuid|BranchId|BranchName|CalculatedCurrency|CalculatedCurrencyAmount|CalculatedCurrencyVatAmount|ControllerBy|Country|Currency|CustomFieldEnabled|CustomFieldGuid|CustomFieldName|CustomFieldRequired|CustomFieldValue|Date|DateApproved|DateControlled|Email|EnterpriseNumber|ExpenseAccountGuid|ExpenseAccountName|ExpenseAccountStatus|ExpenseGuid|ExpenseReason|ExpenseStatus|ExternalId|GroupGuid|GroupId|GroupName|IBAN|Image|IsInvoice|MatchStatus|Merchant|MerchantEnterpriseNumber|Note|OwnerShip|PaymentMethod|PaymentMethodGuid|PaymentMethodName|ProjectGuid|ProjectId|ProjectName|Reimbursable|TravellerId|UserGUID|VatAmount|VatPercentage|XpdReference|VatCode|FileName|CreateTstamp
61470003||35.00|null|EUR|168fcea9-17d4-45a1-8b6f-bfb249cdbea6|BEL|BEL|USD,INR,EUR|35.20,2420.11,30.00|null,null,null|null|BE|EUR|true|0d4b767b-0988-47e8-9144-05e607169284|careertitle|false|FE|2018-07-24T00:00:00|null|null|abc_def#xyz.com||c32f03c6-31df-4fd8-8cc2-1c5f3a580aad|Meals - In Office|true|781d10d2-2f3b-43bc-866e-a653fefacbbe||Approved|70926|40ac7117-c7e2-42ea-b34f-96330c9380b6|BEL-FSP-Users|BEL-FSP-Users|||false|None|in office meal #1|||Personal|Cash|1ee44666-f4c7-44b3-acd3-8ecd7127480a|Cash|2cb4ccb7-634d-4386-af43-b4572ec72098|00AA06|00AA06|true||6c5a835f-5152-46db-923a-3ebd08c7dad3|null|null|XPD012245802||1820711.xml|2018-08-07 05:42:10.46
61470003||35.00|null|USD|168fcea9-17d4-45a1-8b6f-bfb249cdbea6|BEL|BEL|USD,INR,EUR|35.20,2420.11,30.00|null,null,null|null|BE|EUR|true|0d4b767b-0988-47e8-9144-05e607169284|careertitle|false|FE|2018-07-24T00:00:00|null|null|abc_def#xyz.com||c32f03c6-31df-4fd8-8cc2-1c5f3a580aad|Meals - In Office|true|781d10d2-2f3b-43bc-866e-a653fefacbbe||Approved|70926|40ac7117-c7e2-42ea-b34f-96330c9380b6|BEL-FSP-Users|BEL-FSP-Users|||false|None|in office meal #1|||Personal|Cash|1ee44666-f4c7-44b3-acd3-8ecd7127480a|Cash|2cb4ccb7-634d-4386-af43-b4572ec72098|00AA06|00AA06|true||6c5a835f-5152-46db-923a-3ebd08c7dad3|null|null|XPD012245802||1820711.xml|2018-08-07 05:42:10.46
Example Run:
[:~] $ python main.py
30.00
35.20
Using awk:
awk 'BEGIN{FS=OFS="|"}
NR==1{print $0,"ActualCurrency","ActualAmount";next}
{n=split($9,a,",");split($10,b,",");for(i=1;i<=n;i++) if(a[i]==$5) print $0,$5,b[i]}' file
BEGIN{FS=OFS="|"} sets the input and output delimiter to |.
The NR==1 statement takes care of the header by appending the 2 strings.
The 9th and 10th fields are split on the , separator, and the values are stored in the arrays a and b.
The for loop looks for the element of the a array corresponding to the 5th field. If found, the corresponding value of b is printed.
You don't need to use the Script snap here at all. Writing scripts for transformations all the time hampers performance and defeats the purpose of an iPaaS tool altogether. The Mapper should suffice.
I created a test pipeline for this problem. I saved the data provided in this question in a file, uploaded it to SnapLogic for the test, and parsed it in the pipeline using a CSV parser. Then I used a Mapper to do the required transformation.
Following is the expression for getting the actual amount:
$CalculatedCurrency.split(',').indexOf($BranchCurrency) >= 0 ? $CalculatedCurrencyAmount.split(',')[$CalculatedCurrency.split(',').indexOf($BranchCurrency)] : null
Avoid writing scripts for problems that can be solved using mappers.

update /etc/sysctl.conf with Python's ConfigParser

I am able to use Python's ConfigParser library to read /etc/sysctl.conf by adding a [dummy] section and overriding ConfigParser's read() method as follows:
import ConfigParser
import StringIO

class SysctlConfigParser(ConfigParser.ConfigParser):
    def read(self, fn):
        text = open(fn).read()
        contents = StringIO.StringIO("[dummy]\n" + text)
        self.readfp(contents, fn)
Now the tricky part is to write back the configuration updates that my Python program made, because if I called ConfigParser.write() directly, it would add back this [dummy] section as well:
[dummy]
net.netfilter.nf_conntrack_max = 313
net.netfilter.nf_conntrack_expect_max = 640
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 5
Here are my questions:
Is there an elegant way to make ConfigParser not add this [dummy] section? It seems odd if I have to open the file again just to remove the first line containing the dummy section.
Maybe ConfigParser is not the right tool to edit sysctl.conf? If so, are there any other Python libraries that would allow updating sysctl.conf in a convenient way from Python?
ConfigParser is designed for parsing INI-style configuration files. /etc/sysctl.conf is not that sort of file.
You could use the Augeas bindings for Python if you want a parser that works out-of-the-box:
import augeas
aug = augeas.Augeas()
aug.set('/files/etc/sysctl.conf/net.ipv4.ip_forwarding', '1')
aug.set('/files/etc/sysctl.conf/fs.inotify.max_user_watches', '8192')
aug.save()
The format of the file is pretty trivial (just a collection of <name> = <value> lines with optional comments).
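Given that, a hand-rolled updater is short. A minimal sketch of my own (untested against edge cases) that rewrites name = value lines in place, preserves comments, and appends settings that are not present yet:
def update_sysctl_conf(path, updates):
    # updates: dict mapping setting name -> new value
    out_lines = []
    seen = set()
    with open(path) as f:
        for line in f:
            stripped = line.strip()
            # keep blank lines, comments and anything unrecognized verbatim
            if not stripped or stripped.startswith(('#', ';')) or '=' not in stripped:
                out_lines.append(line)
                continue
            name = stripped.split('=', 1)[0].strip()
            if name in updates:
                out_lines.append('%s = %s\n' % (name, updates[name]))
                seen.add(name)
            else:
                out_lines.append(line)
    # append settings that were not in the file yet
    for name, value in updates.items():
        if name not in seen:
            out_lines.append('%s = %s\n' % (name, value))
    with open(path, 'w') as f:
        f.writelines(out_lines)

update_sysctl_conf('/etc/sysctl.conf',
                   {'net.netfilter.nf_conntrack_max': '313'})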

Newbie question about file formatting in Python

I'm writing a simple program in Python 2.7 using the pycURL library to submit file contents to pastebin.
Here's the code of the program:
#!/usr/bin/env python2
import pycurl, os

def send(file):
    print "Sending file to pastebin...."
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://pastebin.com/api_public.php")
    curl.setopt(pycurl.POST, True)
    curl.setopt(pycurl.POSTFIELDS, "paste_code=%s" % file)
    curl.setopt(pycurl.NOPROGRESS, True)
    curl.perform()

def main():
    content = raw_input("Provide the FULL path to the file: ")
    open = file(content, 'r')
    send(open.readlines())
    return 0

main()
The output on pastebin looks like a standard Python list: ['string\n', 'line of text\n', ...] etc.
Is there any way I could format it so it looks better and is actually human-readable? Also, I would be very happy if someone could tell me how to use multiple data inputs in POSTFIELDS. The Pastebin API uses paste_code as its main data input, but it can take optional things like paste_name, which sets the name of the upload, or paste_private, which makes it private.
First, use .read() as virhilo said.
The other step is to use urllib.urlencode() to get a string:
curl.setopt(pycurl.POSTFIELDS, urllib.urlencode({"paste_code": file}))
This will also allow you to post more fields:
curl.setopt(pycurl.POSTFIELDS, urllib.urlencode({"paste_code": file, "paste_name": name}))
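Putting it together, a minimal sketch of a send() that posts both fields via urlencode (same api_public.php endpoint and field names as in the question):
import pycurl
import urllib

def send(file_contents, name):
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://pastebin.com/api_public.php")
    curl.setopt(pycurl.POST, True)
    # urlencode escapes '&', '=', newlines etc. inside the pasted text
    curl.setopt(pycurl.POSTFIELDS,
                urllib.urlencode({"paste_code": file_contents,
                                  "paste_name": name}))
    curl.setopt(pycurl.NOPROGRESS, True)
    curl.perform()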
import pycurl, os

def send(file_contents, name):
    print "Sending file to pastebin...."
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, "http://pastebin.com/api_public.php")
    curl.setopt(pycurl.POST, True)
    curl.setopt(pycurl.POSTFIELDS, "paste_code=%s&paste_name=%s" \
                % (file_contents, name))
    curl.setopt(pycurl.NOPROGRESS, True)
    curl.perform()

if __name__ == "__main__":
    content = raw_input("Provide the FULL path to the file: ")
    with open(content, 'r') as f:
        send(f.read(), "yournamehere")
    print
When reading files, use the with statement (this makes sure your file gets closed properly if something goes wrong).
There's no need to have a main function and then call it. Use the if __name__ == "__main__" construct to have your script run automagically when called (but not when imported as a module).
For posting multiple values, you can manually build the URL: just separate the different key=value pairs with an ampersand (&). Like this: key1=value1&key2=value2. Or you can build one with urllib.urlencode (as others suggested).
EDIT: using urllib.urlencode on strings which are to be posted makes sure the content is encoded properly when your source string contains funny / reserved / unusual characters.
Use .read() instead of .readlines().
The POSTFIELDS should be sent the same way as you send query string arguments. So, in the first place, it's necessary to URL-encode the string that you're sending as paste_code, and then, using &, you can add more POST arguments.
Example:
paste_code=hello%20world&paste_name=test
Good luck!

Python - ConfigParser throwing comments

Using the ConfigParser module, how can I filter out and throw away every comment from an ini file?
import ConfigParser

config = ConfigParser.ConfigParser()
config.read("sample.cfg")
for section in config.sections():
    print section
    for option in config.options(section):
        print option, "=", config.get(section, option)
E.g. with the ini file below, the above basic script prints out the further comment lines as well, like:
something = 128 ; comment line1
; further comments
; one more line comment
What I need is only the section names and the pure key-value pairs inside them, without any comments. Does ConfigParser handle this somehow, or should I use regexps... or? Cheers
According to the docs, lines starting with ; or # will be ignored. It doesn't seem like your format satisfies that requirement. Can you by any chance change the format of your input file?
Edit: since you cannot modify your input files, I'd suggest pre-parsing them with something along these lines:
tmp_fname = 'config.tmp'
with open(config_file) as old_file:
    with open(tmp_fname, 'w') as tmp_file:
        # push every inline comment onto its own line so ConfigParser ignores it
        tmp_file.writelines(i.replace(';', '\n;') for i in old_file.readlines())
# then use tmp_fname with ConfigParser
Obviously, if a semicolon is present in the option values, you'll have to be more creative.
The best way is to write a commentless file subclass:
class CommentlessFile(file):
    def readline(self):
        line = super(CommentlessFile, self).readline()
        if line:
            line = line.split(';', 1)[0].strip()
            return line + '\n'
        else:
            return ''
You could use it then with ConfigParser (your code):
import ConfigParser

config = ConfigParser.ConfigParser()
config.readfp(CommentlessFile("sample.cfg"))
for section in config.sections():
    print section
    for option in config.options(section):
        print option, "=", config.get(section, option)
It seems your comments are not on lines that start with the comment leader. It should work if the comment leader is the first character on the line.
As the docs say: "(For backwards compatibility, only ; starts an inline comment, while # does not.)" So use ";" and not "#" for inline comments. It is working well for me.
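A quick way to see that difference (a minimal self-contained sketch for Python 2's ConfigParser, using a made-up config string):
import ConfigParser
import StringIO

cfg = "[section]\na = 1 ; stripped by the parser\nb = 2 # kept as part of the value\n"
config = ConfigParser.ConfigParser()
config.readfp(StringIO.StringIO(cfg))
print config.get('section', 'a')  # -> '1'
print config.get('section', 'b')  # -> '2 # kept as part of the value'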
Python 3 comes with a built-in solution: the class configparser.RawConfigParser has the constructor argument inline_comment_prefixes. Example:
class MyConfigParser(configparser.RawConfigParser):
    def __init__(self):
        configparser.RawConfigParser.__init__(
            self, inline_comment_prefixes=('#', ';'))
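With that, the question's loop works unchanged (a minimal sketch; subclassing is not even required, since the argument can also be passed directly):
import configparser

config = configparser.RawConfigParser(inline_comment_prefixes=('#', ';'))
config.read("sample.cfg")
for section in config.sections():
    print(section)
    for option in config.options(section):
        # inline '; ...' and '# ...' comments are now stripped from the values
        print(option, "=", config.get(section, option))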
