configparser does not show sections - python

I added a section and its values to an ini file, but configparser does not list any sections when I read the file back. What I've done:
import configparser
import os
# creating path
current_path = os.getcwd()
path = 'ini'
try:
    os.mkdir(path)
except OSError:
    print("Creation of the directory %s failed" % path)
# add section and its values
config = configparser.ConfigParser()
config['section-1'] = {'somekey' : 'somevalue'}
file = open(f'ini/inifile.ini', 'a')
with file as f:
    config.write(f)
file.close()
# get sections
config = configparser.ConfigParser()
file = open(f'ini/inifile.ini')
with file as f:
    config.read(f)
    print(config.sections())
file.close()
returns
[]
Similar code appears in the documentation, but mine doesn't work. What am I doing wrong, and how can I solve this problem?

From the docs, config.read() takes a filename (or a list of them), not a file object:
read(filenames, encoding=None)
Attempt to read and parse an iterable of filenames, returning a list of filenames which were successfully parsed.
If filenames is a string, a bytes object or a path-like object, it is treated as a single filename. ...
If none of the named files exist, the ConfigParser instance will contain an empty dataset. ...
A file object is an iterable of strings, so basically the config parser is trying to read each string in the file as a filename. Which is sort of interesting and silly, because if you passed it a file that contained the filename of your actual config...it would work.
Anyways, you should pass the filename directly to config.read(), i.e.
config.read("ini/inifile.ini")
Or, if you want to pass an open file object instead, simply use config.read_file(f). Read the docs for read_file() for more information.
As an aside, you are duplicating some of the work the context manager is doing for no gain. You can use the with block without creating the object explicitly first or closing it after (it will get closed automatically). Keep it simple:
with open("path/to/file.txt") as f:
    do_stuff_with_file(f)
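To make the difference concrete, here is a minimal sketch that writes a small ini file and then reads it back three ways. The temporary directory and the variable names are assumptions for the demo, not the asker's actual paths.

```python
import configparser
import os
import tempfile

# Hypothetical demo file in a temporary directory (not the asker's path).
tmp_dir = tempfile.mkdtemp()
ini_path = os.path.join(tmp_dir, "inifile.ini")

writer = configparser.ConfigParser()
writer["section-1"] = {"somekey": "somevalue"}
with open(ini_path, "w") as f:
    writer.write(f)

# Passing the open file object to read(): each line is treated as a filename.
wrong = configparser.ConfigParser()
with open(ini_path) as f:
    parsed_wrong = wrong.read(f)

# Option 1: pass the filename to read().
by_name = configparser.ConfigParser()
parsed_ok = by_name.read(ini_path)

# Option 2: pass the open file object to read_file().
by_file = configparser.ConfigParser()
with open(ini_path) as f:
    by_file.read_file(f)

print(wrong.sections())
print(by_name.sections())
print(by_file.sections())
```

Only the first parser ends up empty, because none of the "filenames" it was given (the lines of the ini file) exist on disk.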

Related

How to get absolute path of the file selected as input file in python?

I want the absolute path of the file selected as input file (from file browser in the form) using the python code below:
for attr, document in request.files.iteritems():
    orig_filename = document.filename
    print os.path.abspath(orig_filename)
    mhash = get_hash_for_doc(orig_filename)
This prints the current working directory (where the Python script is running) with orig_filename appended to it, which is the wrong path. I am using Python 2.7 and Flask 0.12 on Linux.
The requirement is to compute the hash of the file before uploading it to the server, so that duplicates can be detected. So I need to pass the selected file to a hashing function such as:
def get_hash_for_doc(orig_filename):
    mhash = None
    hash = sha1()  # or md5()
    with open(orig_filename, "rb") as f:
        # read in chunks so large files do not have to fit in memory
        for chunk in iter(lambda: f.read(128 * hash.block_size), b""):
            hash.update(chunk)
    mhash = hash.hexdigest()
    return mhash
In this function I want to read the file from the absolute path of orig_filename before uploading. Other checks are omitted here.
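Browsers generally send only the file's name and contents, not its path on the client machine, so the server cannot open the original file by path. Instead, write the uploaded data to a temporary file and run your hashing on that:
import tempfile, os
fd, tmp = tempfile.mkstemp()
try:
    # document is the uploaded file object taken from request.files
    with os.fdopen(fd, 'wb') as out:
        out.write(document.read())
    mhash = get_hash_for_doc(tmp)
finally:
    os.unlink(tmp)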
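If the upload is already available as a binary file-like object, the temporary file may not be needed at all. The sketch below hashes a stream directly in chunks. It is written for Python 3, hash_stream is a hypothetical helper name, and it assumes the upload object exposes a read() method that returns bytes; an io.BytesIO stands in for the upload here.

```python
import hashlib
import io

def hash_stream(stream, chunk_size=64 * 1024):
    """Return the SHA-1 hex digest of a binary file-like object, read in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

# Stand-in for an uploaded file; any object with a binary read() would do.
upload = io.BytesIO(b"example upload contents")
print(hash_stream(upload))

# Rewind so the same stream can still be saved afterwards.
upload.seek(0)
```

Remember to rewind the stream after hashing if you go on to save it, since reading consumes it.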
If you want the absolute path of an input file, simply use os.path.abspath:
savefile = os.path.abspath(Myinputfile)
where Myinputfile is a variable that contains the relative path and file name, for instance one derived from an argument supplied by the user.
If you prefer the absolute path of the containing folder, without the file name, try this:
saveloc = os.path.dirname(os.path.realpath(Myinputfile))
You can use pathlib to find the absolute path of the selected file.
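A minimal pathlib sketch, assuming the relative path comes from somewhere like a command-line argument (the path below is made up). Note that this only resolves paths on the machine where the script runs; it cannot recover the location of a file on a client's machine.

```python
from pathlib import Path

# Hypothetical relative path, e.g. taken from a command-line argument.
my_input_file = Path("data") / "input.txt"

absolute_file = my_input_file.resolve()   # absolute path of the file
absolute_dir = absolute_file.parent       # absolute path of its folder

print(absolute_file)
print(absolute_dir)
```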

Read .tar.gz file in Python

I have a 25 GB text file, so I compressed it to tar.gz and it shrank to 450 MB. Now I want to read that file from Python and process the text data. I followed a related question, but in my case the code doesn't work. The code is as follows:
import tarfile
import numpy as np
tar = tarfile.open("filename.tar.gz", "r:gz")
for member in tar.getmembers():
    f=tar.extractfile(member)
    content = f.read()
    Data = np.loadtxt(content)
The error is as follows:
Traceback (most recent call last):
  File "dataExtPlot.py", line 21, in <module>
    content = f.read()
AttributeError: 'NoneType' object has no attribute 'read'
Also, is there any other method to do this task?
The docs tell us that extractfile() returns None if the member is not a regular file or link.
One possible solution is to skip over the None results:
tar = tarfile.open("filename.tar.gz", "r:gz")
for member in tar.getmembers():
    f = tar.extractfile(member)
    if f is not None:
        content = f.read()
tarfile.extractfile() can return None if the member is neither a file nor a link. For example your tar archive might contain directories or device files. To fix:
import tarfile
import numpy as np
tar = tarfile.open("filename.tar.gz", "r:gz")
for member in tar.getmembers():
    f = tar.extractfile(member)
    if f:
        content = f.read()
        Data = np.loadtxt(content)
You may try this one
t = tarfile.open("filename.gz", "r")
for filename in t.getnames():
    try:
        f = t.extractfile(filename)
        Data = f.read()
        print filename, ':', Data
    except :
        print 'ERROR: Did not find %s in tar archive' % filename
My needs:
Python 3.
My tar.gz file consists of multiple UTF-8 text files and directories.
I need to read text lines from all files.
Problems:
The object returned by tar.extractfile() may be None.
The content that extractfile(fname).read() returns is a bytes object (e.g. b'Hello\t\xe4\xbd\xa0\xe5\xa5\xbd'), so non-ASCII characters don't display correctly.
Solutions:
Check the type of each member first. I followed the example in the tarfile docs. (Search "How to read a gzip compressed tar archive and display some member information")
Decode the bytes to a normal str. (ref - most voted answer)
Code:
import logging
import tarfile
logger = logging.getLogger(__name__)
with tarfile.open("sample.tar.gz", "r:gz") as tar:
    for tarinfo in tar:
        logger.info(f"{tarinfo.name} is {tarinfo.size} bytes in size and is: ")
        if tarinfo.isreg():
            logger.info(f"Is regular file: {tarinfo.name}")
            f = tar.extractfile(tarinfo.name)
            # To get the str instead of bytes
            # Decode with proper coding, e.g. utf-8
            content = f.read().decode('utf-8', errors='ignore')
            # Split the long str into lines
            # Specify your line-sep: e.g. \n
            lines = content.split('\n')
            for i, line in enumerate(lines):
                print(f"[{i}]: {line}\n")
        elif tarinfo.isdir():
            logger.info(f"Is dir: {tarinfo.name}")
        else:
            logger.info(f"Is something else: {tarinfo.name}.")
You cannot "read" the content of some special entries, such as directories or device files, yet tar supports them and tarfile will extract them all right. When you call extractfile() on one of them, it returns None rather than a file-like object, and you get the error because your tarball contains such an entry.
One approach is to determine the type of each entry in the tarball before extracting it; with this information at hand you can decide whether or not you can "read" the file. Calling getmembers() on the open archive returns tarfile.TarInfo objects that contain detailed information about the type of each entry.
The tarfile.TarInfo class has all the attributes and methods you need to determine the type of a tar member, such as isfile(), isdir(), islnk() or issym(), so you can decide what to do with each member (extract it or not, etc.).
For instance I use these to test the type of file in this patched tarfile to skip extracting special files and process links in a special way:
for tinfo in tar.getmembers():
    is_special = not (tinfo.isfile() or tinfo.isdir()
                      or tinfo.islnk() or tinfo.issym())
    ...
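Putting the type check together with decoding, here is a rough sketch that yields text lines from every regular file in a .tar.gz without reading a whole member into memory, which may matter for a member that is many gigabytes uncompressed. The helper name iter_text_lines and the tiny sample archive are assumptions for the demo.

```python
import io
import os
import tarfile
import tempfile

def iter_text_lines(archive_path, encoding="utf-8"):
    """Yield (member_name, line) for every regular file in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links and other special entries
            raw = tar.extractfile(member)
            # Wrap the binary stream so lines are decoded lazily, one at a time.
            with io.TextIOWrapper(raw, encoding=encoding) as text:
                for line in text:
                    yield member.name, line.rstrip("\n")

# Build a tiny hypothetical archive to demonstrate the function.
work_dir = tempfile.mkdtemp()
sample_path = os.path.join(work_dir, "sample.txt")
with open(sample_path, "w", encoding="utf-8") as f:
    f.write("Hello\t你好\nsecond line\n")
archive_path = os.path.join(work_dir, "sample.tar.gz")
with tarfile.open(archive_path, "w:gz") as tar:
    tar.add(work_dir, arcname="data", recursive=False)   # a directory entry
    tar.add(sample_path, arcname="data/sample.txt")      # a regular file

for name, line in iter_text_lines(archive_path):
    print(name, line)
```

The directory entry is skipped by the isfile() check, so extractfile() is only called on members that can actually be read.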
In a Jupyter notebook you can download and unpack an archive with a shell command like the one below:
!wget -c http://qwone.com/~jason/20Newsgroups/20news-bydate.tar.gz -O - | tar -xz

Writing and reading a new file by passing the files as an argument to a function in a python script

I have a python script that gets input file names from the command prompt. I created a list to store all the input files and pass that to a function to create a new file with all the input files merged at once. Now, I pass this newly written file as an input to another function. I am getting an error message
TypeError: coercing to Unicode: need string or buffer, list found
Code:
file_list = []
for arg in range(1,len(sys.argv)-2):
    file_list.append(sys.argv[arg])
process_name = sys.argv[len(sys.argv)-1]
integrate_files(file_list,process_name)
def integrate_files(file_list,process_name):
    with open('result.log', 'w' ) as result:
        for file_ in file_list:
            for line in open( file_, 'r' ):
                result.write( line )
    start_process(result,process_name)
def start_process(result,process_name):
    with open(result,'r') as mainFile:
        content = mainFile.readlines()
I am getting this error at the line containing with open(). I tried to print the abspath of the result.log file. It printed <closed file 'result.log', mode 'w' at 0x000000000227578>. Where am I going wrong? How should I create a new file and pass it to a function?
Your problem is that result is a closed file object:
start_process(result,process_name)
I think you want
start_process('result.log', process_name)
You could clean the script up a bit with
import shutil
import sys
def integrate_files(file_list,process_name):
    with open('result.log', 'w' ) as result:
        for file_ in file_list:
            with open(file_) as infile:
                shutil.copyfileobj(infile, result)
    start_process('result.log',process_name)
def start_process(result,process_name):
    with open(result,'r') as mainFile:
        content = mainFile.readlines()
# call the functions only after they have been defined
file_list = sys.argv[1:-1]
process_name = sys.argv[-1]
integrate_files(file_list,process_name)
The issue is here:
with open('result.log', 'w' ) as result:
    # ...
start_process(result,process_name)
Since you reopen your file in start_process, you should just pass the name:
start_process(result.name, process_name)
Or just be explicit:
start_process('result.log', process_name)
When you write with open('result.log', 'w') as result:, you make result be an object representing the actual file on disk. That is different from the name of the file.
You certainly can pass that result to another function. But since it will be the actual file object, and not a file name, you can't pass that to open - open expects a name of a file, and looks for the file with that name, in order to create a new file object.
You can call methods on that file object, but none of them will actually re-open the file. Instead, the simplest thing is to remember and pass the file name, so that start_process can open it again.
As shown in @matsjoyce's answer, the file object remembers the original file name. So you could pass the object, and have start_process get the name. But that's messy. Really, just pass the name. (You could, like mats showed, pass result.name explicitly instead of making your own name variable first). Passing file objects around is usually not what you want - do it only when you want to split the reading/writing work across functions (and have a good reason for that).
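A small sketch of the distinction, written for Python 3 and using a temporary directory instead of the asker's working directory (the path and the single-argument start_process are assumptions for the demo):

```python
import os
import tempfile

def start_process(path):
    """Reopen the file by name and return its lines."""
    with open(path, "r") as main_file:
        return main_file.readlines()

# Hypothetical log location in a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "result.log")

with open(log_path, "w") as result:
    result.write("first line\n")

# After the with block, result still exists but is a closed file object.
print(result.closed)
print(result.name == log_path)

# open() needs a path, so pass the name rather than the object.
lines = start_process(result.name)
print(lines)
```

Passing result itself to open() raises a TypeError, which is the Python 3 counterpart of the "coercing to Unicode" error in the question.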
In this:
with open('result.log', 'w' ) as result:
When you define result above, it is a file object that is open only inside that with block. Once the block ends the file is closed, so passing result to start_process does not work.
So either change start_process to:
with open('result.log','r') as mainFile:
Or you could pass the string 'result.log' into start_process instead of the variable result:
file_list = []
for arg in range(1,len(sys.argv)-2):
    file_list.append(sys.argv[arg])
process_name = sys.argv[len(sys.argv)-1]
integrate_files(file_list,process_name)
def integrate_files(file_list,process_name):
    with open('result.log', 'w' ) as result:
        for file_ in file_list:
            for line in open( file_, 'r' ):
                result.write( line )
    start_process('result.log',process_name)
def start_process(result,process_name):
    with open(result,'r') as mainFile:
        content = mainFile.readlines()

function for loading both strings and files on disk?

I have a design question. I have a function loadImage() for loading an image file. Now it accepts a string which is a file path. But I also want to be able to load files which are not on physical disk, eg. generated procedurally. I could have it accept a string, but then how could it know the string is not a file path but file data? I could add an extra boolean argument to specify that, but that doesn't sound very clean. Any ideas?
It's something like this now:
def loadImage(filepath):
    file = open(filepath, 'rb')
    data = file.read()
    # do stuff with data
The other version would be
def loadImage(data):
    # do stuff with data
How can I make this function accept either 'filepath' or 'data' and work out which one it was given?
You can change your loadImage function to expect an opened file-like object, such as:
def load_image(f):
    data = f.read()
... and then have that called from two functions, one of which expects a path and the other a string that contains the data:
from StringIO import StringIO
def load_image_from_path(path):
    with open(path, 'rb') as f:
        load_image(f)
def load_image_from_string(s):
    sio = StringIO(s)
    try:
        load_image(sio)
    finally:
        sio.close()
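For reference, a Python 3 flavoured sketch of the same idea, where io.BytesIO takes the place of StringIO for binary data. The function names mirror the answer above, load_image_from_bytes is a hypothetical name, and the "image" bytes are just a placeholder.

```python
import io
import os
import tempfile

def load_image(f):
    """Read raw image bytes from any binary file-like object."""
    data = f.read()
    # ... decode or process data here ...
    return data

def load_image_from_path(path):
    with open(path, "rb") as f:
        return load_image(f)

def load_image_from_bytes(raw):
    with io.BytesIO(raw) as buffer:
        return load_image(buffer)

# Placeholder bytes standing in for real or procedurally generated image data.
fake_image = b"\x89PNG placeholder bytes"

image_path = os.path.join(tempfile.mkdtemp(), "image.png")
with open(image_path, "wb") as f:
    f.write(fake_image)

print(load_image_from_path(image_path) == load_image_from_bytes(fake_image))
```

The caller states explicitly which kind of input it has, so load_image never has to guess.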
How about just creating two functions, loadImageFromString and loadImageFromFile?
This being Python, you can easily distinguish between a filename and a data string. I would do something like this:
import os.path as P
from StringIO import StringIO
def load_image(im):
    fin = None
    if P.isfile(im):
        fin = open(im, 'rb')
    else:
        fin = StringIO(im)
    # Read from fin like you would from any open file object
Another way to do it would be a try block instead of os.path, but the essence of the approach remains the same.

NameError: global name is not defined

Hello
My error occurs while generating a zip file. Can you tell me what I should do?
main.py", line 2289, in get
    buf=zipf.read(2048)
NameError: global name 'zipf' is not defined
The complete code is as follows:
def addFile(self,zipstream,url,fname):
    # get the contents
    result = urlfetch.fetch(url)
    # store the contents in a stream
    f=StringIO.StringIO(result.content)
    length = result.headers['Content-Length']
    f.seek(0)
    # write the contents to the zip file
    while True:
        buff = f.read(int(length))
        if buff=="":break
        zipstream.writestr(fname,buff)
    return zipstream
def get(self):
    self.response.headers["Cache-Control"] = "public,max-age=%s" % 86400
    start=datetime.datetime.now()-timedelta(days=20)
    count = int(self.request.get('count')) if not self.request.get('count')=='' else 1000
    from google.appengine.api import memcache
    memcache_key = "ads"
    data = memcache.get(memcache_key)
    if data is None:
        a= Ad.all().filter("modified >", start).filter("url IN", ['www.koolbusiness.com']).filter("published =", True).order("-modified").fetch(count)
        memcache.set("ads", a)
    else:
        a = data
    dispatch='templates/kml.html'
    template_values = {'a': a , 'request':self.request,}
    path = os.path.join(os.path.dirname(__file__), dispatch)
    output = template.render(path, template_values)
    self.response.headers['Content-Length'] = len(output)
    zipstream=StringIO.StringIO()
    file = zipfile.ZipFile(zipstream,"w")
    url = 'http://www.koolbusiness.com/list.kml'
    # repeat this for every URL that should be added to the zipfile
    file =self.addFile(file,url,"list.kml")
    # we have finished with the zip so package it up and write the directory
    file.close()
    zipstream.seek(0)
    # create and return the output stream
    self.response.headers['Content-Type'] ='application/zip'
    self.response.headers['Content-Disposition'] = 'attachment; filename="list.kmz"'
    while True:
        buf=zipf.read(2048)
        if buf=="": break
        self.response.out.write(buf)
That should probably be zipstream and not zipf. Replace zipf with zipstream and it might work.
I don't see where you declare zipf?
zipfile? Senthil Kumaran is probably right with zipstream since you seek(0) on zipstream before the while loop to read chunks of the mystery variable.
edit:
Almost certainly the variable is zipstream.
zipfile docs:
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
Open a ZIP file, where file can be either a path to a file (a string) or a file-like object. The mode parameter should be 'r' to read an existing file, 'w' to truncate and write a new file, or 'a' to append to an existing file. If mode is 'a' and file refers to an existing ZIP file, then additional files are added to it. If file does not refer to a ZIP file, then a new ZIP archive is appended to the file. This is meant for adding a ZIP archive to another file (such as python.exe).
Your code:
zipstream=StringIO.StringIO()
creates a file-like object using StringIO, which is essentially a "memory file" (read more in the docs)
file = zipfile.ZipFile(zipstream,"w")
opens the zipfile with the zipstream file-like object in 'w' mode
url = 'http://www.koolbusiness.com/list.kml'
# repeat this for every URL that should be added to the zipfile
file =self.addFile(file,url,"list.kml")
# we have finished with the zip so package it up and write the directory
file.close()
uses the addFile method to retrieve and write the retrieved data to the file-like object and returns it. The variables are slightly confusing because you pass a zipfile to the addFile method which aliases as zipstream (confusing because we are using zipstream as a StringIO file-like object). Anyways, the zipfile is returned, and closed to make sure everything is "written".
It was written to our "memory file", which we now seek to index 0
zipstream.seek(0)
and after doing some header stuff, we finally reach the while loop that will read our "memory-file" in chunks
while True:
    buf=zipstream.read(2048)
    if buf=="": break
    self.response.out.write(buf)
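The same flow can be sketched in Python 3 without the App Engine parts. Here io.BytesIO is the "memory file", the KML content is a placeholder rather than a fetched document, and the chunks are collected in a list where a web handler would write them to the response.

```python
import io
import zipfile

# Build the archive in memory; BytesIO plays the role of the "memory file".
zipstream = io.BytesIO()
with zipfile.ZipFile(zipstream, "w") as archive:
    # Placeholder content standing in for the fetched KML document.
    archive.writestr("list.kml", "<kml></kml>")

# Closing the ZipFile wrote the central directory; rewind before reading.
zipstream.seek(0)

chunks = []
while True:
    buf = zipstream.read(2048)
    if not buf:
        break
    chunks.append(buf)  # a web handler would write buf to the response here

payload = b"".join(chunks)
print(len(payload))
```

Note that the loop reads from zipstream, the underlying buffer, and not from the ZipFile object, which is already closed at that point.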
You need to declare:
global zipf
right after your
def get(self):
line. You are modifying a global variable, and this is the only way Python knows what you are doing.
