Memory error while parsing files in a folder

I'm using Python 2.7.
Here is my code to parse files in a folder:
import linecache
import glob
path = r"G:\test\folder1"
Key = '''testresult="NOK"'''
Files = glob.glob(path+'\*.xml')
for FileName in Files:
    Loop_Count = 1
    while Loop_Count != 50:
        Line_Read = linecache.getline(FileName, Loop_Count)
        if (Key in Line_Read):
            a = FileName.split('\\')
            b = len(a)-1
            print a[b]
            break
        elif(Loop_Count == 49):
            pass
        Loop_Count = Loop_Count+1
print "Completed"
If folder1 has many files, I get a memory error:
Traceback (most recent call last):
  File "C:\Users\whoKnows\Desktop\test_Check111.py", line 10, in <module>
    Line_Read = linecache.getline(FileName, Loop_Count)
  File "C:\Python27\lib\linecache.py", line 14, in getline
    lines = getlines(filename, module_globals)
  File "C:\Python27\lib\linecache.py", line 40, in getlines
    return updatecache(filename, module_globals)
  File "C:\Python27\lib\linecache.py", line 128, in updatecache
    lines = fp.readlines()
MemoryError
I think it's because I'm opening all the files for reading and not closing them. Can anyone please tell me how to close the files while using glob?

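MemoryError means you have run out of memory. glob only returns file names and does not open anything; the memory goes to linecache, which reads each file in full (the traceback ends in fp.readlines()) and keeps the lines in an in-memory cache, so the cache grows with every file you process. Try discarding the lines you no longer need by calling linecache.clearcache() once you are done with each file.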
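A minimal sketch of an alternative that avoids linecache altogether, assuming Python 3 and the same 49-line search window as the question. The name files_with_key and its parameters are made up for illustration. Each file is opened in a with block, so it is closed as soon as the search finishes, and itertools.islice stops reading after the first 49 lines instead of loading the whole file:

```python
import glob
import itertools
import os

KEY = 'testresult="NOK"'
MAX_LINES = 49  # the loop in the question inspects lines 1 through 49


def files_with_key(folder, key=KEY, max_lines=MAX_LINES):
    """Return base names of *.xml files whose first max_lines lines contain key."""
    matches = []
    for file_name in sorted(glob.glob(os.path.join(folder, "*.xml"))):
        # The with block closes the file when the search is over, and
        # islice never reads more than max_lines lines from it.
        with open(file_name, "r", errors="replace") as handle:
            for line in itertools.islice(handle, max_lines):
                if key in line:
                    matches.append(os.path.basename(file_name))
                    break
    return matches
```

Calling files_with_key(r"G:\test\folder1") would return the matching file names. Only one line is held in memory at a time, however many files the folder contains.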

Related

Errors with Python and eyed3

import os
import eyed3
def files(path):
    for file in os.listdir(path):
        if os.path.isfile(os.path.join(path, file)):
            yield file
def title_alteration(music_artist_string):
    music_artist_string = music_artist_string.replace(';', ' feat. ', 1)
    semicolon_count = music_artist_string.count(';')
    music_artist_string = music_artist_string.replace(';', ', ', semicolon_count-1)
    music_artist_string = music_artist_string.replace(';', ' & ')
    return music_artist_string
def main():
    audio_files = eyed3.load(files('D:\\iTunes Music\\iTunes\\iTunes Media\\Music'))
    title_alteration(audio_files.tag.artist)
if __name__ == '__main__':
    main()
Can I get some help debugging this? With some help from my last post I got it down to three distinct functions; now I just need to know why I get errors when I run it on this directory on my PC.
I'm getting these errors (TL;DR: it doesn't like line 20, the audio_files line):
Traceback (most recent call last):
  File "D:/Pycharm Projects/Music Alterations v2.py", line 25, in <module>
    main()
  File "D:/Pycharm Projects/Music Alterations v2.py", line 20, in main
    audio_files = eyed3.load(files('D:\\iTunes Music\\iTunes\\iTunes Media\\Music'))
  File "C:\Users\cLappy\AppData\Local\Programs\Python\Python38\lib\site-packages\eyed3\core.py", line 74, in load
    path = pathlib.Path(path)
  File "C:\Users\cLappy\AppData\Local\Programs\Python\Python38\lib\pathlib.py", line 1038, in __new__
    self = cls._from_parts(args, init=False)
  File "C:\Users\cLappy\AppData\Local\Programs\Python\Python38\lib\pathlib.py", line 679, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "C:\Users\cLappy\AppData\Local\Programs\Python\Python38\lib\pathlib.py", line 663, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not generator
Process finished with exit code 1

Your files function is a generator, which is meant to be used like an iterator. The eyed3.load function doesn't want a generator or an iterator; it wants a single pathname or something like one. Passing a generator where a pathname is expected does not magically cause iteration over all the values the generator would produce. It would work better to make a list of all the pathnames of interest and then iterate over that list, calling eyed3.load once for each pathname.
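The suggestion above could look roughly like this. It is a sketch, not tested against the asker's music library: list_file_paths and artists_in are hypothetical names, and the eyed3 calls assume that eyed3.load takes a single pathname and returns None for files it does not recognise. The helper also joins the folder onto each name, because os.listdir returns bare file names and eyed3.load needs a path it can open:

```python
import os


def list_file_paths(folder):
    """Return the full path of every regular file directly inside folder."""
    paths = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path):
            paths.append(full_path)
    return paths


def artists_in(folder):
    """Load each file separately and collect its artist tag."""
    import eyed3  # third-party; imported here so the helper above works without it

    artists = []
    for path in list_file_paths(folder):
        audio_file = eyed3.load(path)  # one pathname per call
        # Skip files eyed3 does not recognise and files without a tag.
        if audio_file is not None and audio_file.tag is not None:
            artists.append(audio_file.tag.artist)
    return artists
```

Each artist string returned by artists_in could then be passed to title_alteration from the question.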

EOFError: Ran out of input when unpickling non-empty file

Traceback (most recent call last):
  File "C:\Users\me\folder\project.py", line 999, in <module>
    load_data()
  File "C:\Users\me\folder\project.py", line 124, in load_data
    globals()[var_name] = pickle.(f)
EOFError: Ran out of input
I get this error when trying to unpickle a file, even though the file is non-empty. I've tried opening the file and printing its contents, and it is non-empty, but the unpickler still raises this error.
Does anyone know why this may be happening?
The code is as follows:
files_to_load = ['var1','var2',...]
def load_data():
    for var_name in files_to_load:
        path = '{}.txt'.format(var_name)
        if os.path.exists(path):
            with open(path, 'rb') as f:
                globals()[var_name] = pickle.Unpickler(f).load()
        else: globals()[var_name] = {}

Python Pillow - ValueError: Decompressed Data Too Large

I use the Pillow library to create thumbnails. I have to create a lot of them, more than 10,000 in fact.
The program works fine, but after processing roughly 1,500 images I get the following error:
Traceback (most recent call last):
  File "thumb.py", line 15, in <module>
    im = Image.open('/Users/Marcel/images/07032017/' + infile)
  File "/Users/Marcel/product-/PIL/Image.py", line 2339, in open
    im = _open_core(fp, filename, prefix)
  File "/Users/Marcel/product-/PIL/Image.py", line 2329, in _open_core
    im = factory(fp, filename)
  File "/Users/Marcel/product-/PIL/ImageFile.py", line 97, in __init__
    self._open()
  File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 538, in _open
    s = self.png.call(cid, pos, length)
  File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 136, in call
    return getattr(self, "chunk_" + cid.decode('ascii'))(pos, length)
  File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 319, in chunk_iCCP
    icc_profile = _safe_zlib_decompress(s[i+2:])
  File "/Users/Marcel/product-/PIL/PngImagePlugin.py", line 90, in _safe_zlib_decompress
    raise ValueError("Decompressed Data Too Large")
ValueError: Decompressed Data Too Large
My program is very straightforward:
import os, sys
import PIL
from PIL import Image
size = 235, 210
reviewedProductsList = open('products.txt', 'r')
reviewedProducts = reviewedProductsList.readlines()
t = map(lambda s: s.strip(), reviewedProducts)
print "Thumbs to create: '%s'" % len(reviewedProducts)
for infile in t:
    outfile = infile
    try:
        im = Image.open('/Users/Marcel/images/07032017/' + infile)
        im.thumbnail(size, Image.ANTIALIAS)
        print "thumb created"
        im.save('/Users/Marcel/product-/thumbs/' + outfile, "JPEG")
    except IOError, e:
        print "cannot create thumbnail for '%s'" % infile
        print "error: '%s'" % e
I am performing this operation locally on my MacBook Pro.

This check protects servers running Pillow against a potential DoS attack caused by decompression bombs. The error is raised when a compressed metadata chunk in the image (here the iCCP chunk, as the traceback shows) decompresses to more data than Pillow allows. See http://pillow.readthedocs.io/en/4.0.x/handbook/image-file-formats.html?highlight=decompression#png
Here's the CVE report: https://www.cvedetails.com/cve/CVE-2014-9601/
From a recent issue:
If you set ImageFile.LOAD_TRUNCATED_IMAGES to true, it will suppress
the error (but still not read the large metadata). Alternately, you can
change the values here: https://github.com/python-pillow/Pillow/blob/master/PIL/PngImagePlugin.py#L74
https://github.com/python-pillow/Pillow/issues/2445
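To illustrate the kind of guard that raises this error, here is a small stdlib-only sketch of capped decompression. It is not Pillow's source; safe_decompress and the 1 MiB MAX_CHUNK cap are assumptions chosen for the example. zlib's decompressobj accepts a max_length argument, and any input left in unconsumed_tail means the data would have expanded past the cap:

```python
import zlib

MAX_CHUNK = 1024 * 1024  # illustrative 1 MiB cap, chosen for this example


def safe_decompress(data, limit=MAX_CHUNK):
    """Decompress zlib data, refusing to produce more than limit bytes."""
    decompressor = zlib.decompressobj()
    plaintext = decompressor.decompress(data, limit)
    # Leftover input after limit bytes of output means the full result
    # would exceed the cap, so stop instead of allocating more memory.
    if decompressor.unconsumed_tail:
        raise ValueError("Decompressed Data Too Large")
    return plaintext
```

A few kilobytes of compressed input can expand to many megabytes, which is why a cap like this is applied to metadata chunks before they are fully decompressed.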

The following code sets the limit that the accepted answer refers to:
from PIL import PngImagePlugin
# Allow each compressed chunk to expand to up to 100 MiB
LARGE_ENOUGH_NUMBER = 100
PngImagePlugin.MAX_TEXT_CHUNK = LARGE_ENOUGH_NUMBER * (1024**2)
It's not documented how to set this value. I hope people find this useful.

Python. Can not open ZipFile

I have a set of *.tar.xz archives. Each of them may contain APK or JAR files, which are themselves zip archives. I'm trying to search for a pattern inside the contents of those zip archives. I use the following code to do it:
#! /usr/bin/env python3
import os
import glob
import tarfile
import shutil
import zipfile
def check(filename):
    if 'my_awesome_pattern' in open(file).read():
        print('matches')
def process_zip(f):
    z = zipfile.ZipFile(f, 'r') # <- here problem occurs
    z.extractall('tmp')
    z.close()
def process_jar(file):
    print('JAR')
    process_zip(file)
def process_apk(file):
    print('APK')
    process_zip(file)
def process_xml(file):
    print('XML')
    check(file)
def process_tar(filename):
    print(filename)
    tar = tarfile.open(filename)
    for entry in tar.getnames():
        print(">>> " + entry)
        if entry.endswith('xml'):
            tar.extract(entry)
            process_xml(entry)
            os.remove(entry)
        elif entry.endswith('jar'):
            tar.extract(entry)
            process_jar(entry)
            os.remove(entry)
        elif entry.endswith('apk'):
            tar.extract(entry)
            process_apk(entry)
            os.remove(entry)
    tar.close()
for file in glob.glob("*.tar.xz"):
    process_tar(file)
But the script stops with:
setupwizardtablet-all.tar.xz
>>> setupwizardtablet-all
>>> setupwizardtablet-all/nodpi
>>> setupwizardtablet-all/nodpi/priv-app
>>> setupwizardtablet-all/nodpi/priv-app/SetupWizard
>>> setupwizardtablet-all/nodpi/priv-app/SetupWizard/SetupWizard.apk
APK
Traceback (most recent call last):
  File "./scan.py", line 56, in <module>
    process_tar(file)
  File "./scan.py", line 49, in process_tar
    process_apk(entry)
  File "./scan.py", line 27, in process_apk
    process_zip(file)
  File "./scan.py", line 16, in process_zip
    z = zipfile.ZipFile(f, 'r') # <- here problem occurs
  File "/usr/lib/python3.4/zipfile.py", line 937, in __init__
    self._RealGetContents()
  File "/usr/lib/python3.4/zipfile.py", line 1034, in _RealGetContents
    x._decodeExtra()
  File "/usr/lib/python3.4/zipfile.py", line 415, in _decodeExtra
    tp, ln = unpack('<HH', extra[:4])
struct.error: unpack requires a bytes object of length 4
I'm stuck on this error. Python is not my cup of tea, so I'm looking for help.
Thanks in advance!

Pyglet unable to play .wav files

def morse_audio( item ):
    from pyglet import media
    import pyglet
    import time
    import glob
    import os
    import wave
    from contextlib import closing
    files = []
    audios = []
    for file in glob.glob('C:\\Users\\MQ\'s Virual World\\Downloads\\Morse\\*.wav'):
        ass = str(os.path.join('C:\\Users\MQ\'s Virual World\\Downloads\\Morse', file))
        print (ass)
        files.append(ass)
    #audio = media.load(files[1])
    #audio.play()
    #print (len(files))
    one = list(item)
    str_list = [x.strip(' ') for x in one]
    str_list = [x.strip('/') for x in str_list]
    for s in str_list[0]:
        if s != "-" and s != ".":
            list(item)
            for letter in item:
                for i in range(0, 51):
                    if letter == " ":
                        time.sleep(1.5)
                        audios.append("noise3.wav")
                        break
                    if letter != letterlst[i] and letter != letterlst[i].lower():
                        continue
                    else:
                        print (files[i])
                        audio = media.load(files[i])
                        audio.play()
                        audios.append(files[i])
                        audios.append("noise2.wav")
                        time.sleep(1)
        else:
            lst = item.split()
            print (' '.join(lst))
            for code in lst:
                for i in range(0, 51):
                    if code == "/":
                        time.sleep(1.5)
                        audios.append("noise3.wav")
                        break
                    if code != morse[i]:
                        continue
                    else:
                        print (files[i])
                        audio = media.load(files[i])
                        audio.play()
                        audios.append(files[i])
                        audios.append("noise2.wav")
                        time.sleep(1)
                        break
    outfile = "sounds.wav"
    data= []
    for file in audios:
        w = wave.open(file, 'rb')
        lol = w.getparams()
        print (lol)
        data.append( [w.getparams(), w.readframes(w.getnframes())] )
        w.close()
    with closing(wave.open(outfile, 'wb')) as output:
        # find sample rate from first file
        with closing(wave.open(audios[0])) as w:
            output.setparams(w.getparams())
        # write each file to output
        for audioo in audios:
            with closing(wave.open(audioo)) as w:
                output.writeframes(w.readframes(w.getnframes()))
This code previously worked. I wanted to use file types other than .wav, but because that worked so poorly I went back to .wav. These are different .wav files, but the ones that worked before now produce the same error message:
Traceback (most recent call last):
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\morsecode.py", line 187, in <module>
    morse_audio("0123456789ÁÄ#&':,$=!-().+?;/_")
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\morsecode.py", line 96, in morse_audio
    audio.play()
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\__init__.py", line 473, in play
    player.play()
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\__init__.py", line 1012, in play
    self._set_playing(True)
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\__init__.py", line 993, in _set_playing
    self._create_audio_player()
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\__init__.py", line 1083, in _create_audio_player
    self._audio_player = audio_driver.create_audio_player(group, self)
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\drivers\directsound\__init__.py", line 502, in create_audio_player
    return DirectSoundAudioPlayer(source_group, player)
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\media\drivers\directsound\__init__.py", line 184, in __init__
    None)
  File "C:\Users\MQ's Virual World\AppData\Local\Programs\Python\Python35-32\lib\site-packages\pyglet\com.py", line 125, in <lambda>
    self.method.get_field()(self.i, self.name)(obj, *args)
  File "_ctypes/callproc.c", line 920, in GetResult
OSError: [WinError -2147024809] The parameter is incorrect
I've tried .wav files that used to work. Playback works when I use an .ogg file, and also with MP3s. Only .wav files seem to cause the problem, and it started suddenly and seemingly at random.
