This was a suggestion from another stack thread that I'm finally getting back to. It was part of a discussion about how to embed tools into a maya file.
You can write the whole thing as a python package, zip it, then stuff the binary contents of the zip file into a fileInfo. When you need the code, look for it in the user's $MAYA_APP_DIR; if the zip isn't there, write the contents of the fileInfo to disk as a zip and then insert the zip into sys.path.
Source discussions were:
python copy scripts into script
and Maya Python Create and Use Zipped Package?
So far the programming is going okay, but I think I hit a snag. When I attempt this:
with open("..directory/myZip.zip","rb") as file:
cmds.fileInfo("myZip", file.read())
..and then I..
print cmds.fileInfo("myZip",q=1)
I get
[u'PK\003\004\024']
which is a bad translation of the first line of gibberish when reading the zip file as a text document.
How can I embed my zip file into my maya file as binary?
====================
Update:
Maya doesn't like having a direct read of the zip file written into fileInfo. I found various methods of turning it into an acceptable string that could be written, but decoding it back to a file didn't appear to work. I see now that Theodox's suggestion was to convert it to binary and put that in the fileInfo node.
How can one encode, store and then decode to write to file later?
If I were to convert to binary using for instance:
' '.join(format(ord(x), 'b') for x in line)
What code would I need to turn that back into the original utf-8 zip information?
you can find related code here:
http://tech-artists.org/forum/showthread.php?4161-Maya-API-Singleton-Nodes&highlight=mayapersist
the relevant bit is
import base64
encoded = base64.b64encode(value)
decoded = base64.b64decode(encoded)
Basically it's the same idea, except using the base64 module instead of binascii. Any method of converting an arbitrary character stream to an ascii-safe representation will work fine, as long as you use a reversible method: the potential problem you need to watch out for is a character in the data block that looks to Maya like a close quote - an open quote in the fileInfo will be messy in an MA file.
This example uses YAML to do arbitrary key-value pairs, but that part is irrelevant to storing the binary stuff. I've used this technique for fairly large data (up to 640k if I recall), but I don't know if it has an upper limit in terms of what you can stash in Maya.
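To make that concrete, here's a minimal sketch of the whole roundtrip (the function names and the "toolZip" fileInfo key are just placeholders I made up; this assumes Python 2 inside Maya):
import base64
import sys
import maya.cmds as cmds

def store_zip_in_scene(zip_path, key="toolZip"):
    # stash an ascii-safe copy of the zip's raw bytes in the scene's fileInfo
    with open(zip_path, "rb") as f:
        cmds.fileInfo(key, base64.b64encode(f.read()))

def restore_zip_from_scene(out_path, key="toolZip"):
    # write the fileInfo payload back out as a zip and make it importable
    encoded = cmds.fileInfo(key, q=True)
    if not encoded:
        return None
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(encoded[0]))
    if out_path not in sys.path:
        sys.path.append(out_path)  # zipimport lets you import straight from the zip
    return out_path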
Found the answer, thanks to a great script on Stack Overflow. I had to encode to 'string_escape', which is something I found while trying to figure out the whole characters situation. Anyway: you open the zip, encode its contents to 'string_escape', write that to fileInfo, and then decode it back before writing it out as a zip again.
Convert binary to ASCII and vice versa
import maya.cmds as cmds
import binascii

def text_to_bits(text, encoding='utf-8', errors='surrogatepass'):
    bits = bin(int(binascii.hexlify(text.encode(encoding, errors)), 16))[2:]
    return bits.zfill(8 * ((len(bits) + 7) // 8))

def text_from_bits(bits, encoding='utf-8', errors='surrogatepass'):
    n = int(bits, 2)
    return int2bytes(n).decode(encoding, errors)

def int2bytes(i):
    hex_string = '%x' % i
    n = len(hex_string)
    return binascii.unhexlify(hex_string.zfill(n + (n & 1)))
And then you can
with open("..\maya\scripts/test.zip","rb") as thing:
texty = text_to_bits(thing.read().encode('string_escape'))
cmds.fileInfo("binaryZip",texty)
...later
with open("..\maya\scripts/test_2.zip","wb") as thing:
texty = cmds.fileInfo("binaryZip",q=1)
thing.write( text_from_bits( texty ).decode('string_escape') )
and this appears to work.. so far..
Figured I'd post the final product for anyone wanting to undertake this approach. I tried to do some corruption checking, so that a bad zip doesn't get passed between machines. That's what all the check hashing is for.
def writeTimeFull(tl):
    import TimeFull
    #reload(TimeFull)
    with open(TimeFull.__file__.replace(".pyc",".py"),"r") as file:
        cmds.scriptNode( tl.scriptConnection[1][0], e=1, bs=file.read() )
    cmds.expression("spark_timeliner_activator",
                    e=1,s='if (Spark_Timeliner.ShowTimeliner == 1)\n'
                          '{\n'
                          '\tsetAttr Spark_Timeliner.ShowTimeliner 0;\n'
                          '\tpython \"Timeliner.InitTimeliner()\";\n'
                          '}',
                    o="Spark_Timeliner",ae=1,uc=all)

def checkHash(zipPath,hash1,hash2,hash3):
    check = False
    hashes = [hash1,hash2,hash3]
    for ii, hash in enumerate(hashes):
        if hash == hashes[(ii+1)%3]:
            hashes[(ii+2)%3] = hashes[ii]
            check = True
    if check:
        if md5(zipPath) == hashes[0]:
            return [zipPath,hashes[0],hashes[1],hashes[2]]
        else:
            cmds.warning("Hash checks and/or zip are corrupted. Attain toolbox_fix.zip, put it in scripts folder and restart.")
    return []

#this writes the zip file to the local users space
def saveOutZip(filename):
    if os.path.isfile(filename):
        if not os.path.isfile(filename.replace('_pkg','_'+__version__)):
            os.rename(filename,filename.replace('_pkg','_'+__version__))
    with open(filename,"w") as zipFile:
        zipInfo = cmds.fileInfo("zipInfo1",q=1)[0]
        zipHash_1 = cmds.fileInfo("zipHash1",q=1)[0]
        zipHash_2 = cmds.fileInfo("zipHash2",q=1)[0]
        zipHash_3 = cmds.fileInfo("zipHash3",q=1)[0]
        zipFile.write( base64.b64decode(zipInfo) )
    if checkHash(filename,zipHash_1,zipHash_2,zipHash_3):
        cmds.fileInfo("zipInfo2",zipInfo)
        return filename
    with open(filename,"w") as zipFile:
        zipInfo = cmds.fileInfo("zipInfo2",q=1)[0]
        zipHash_1 = cmds.fileInfo("zipHash1",q=1)[0]
        zipHash_2 = cmds.fileInfo("zipHash2",q=1)[0]
        zipHash_3 = cmds.fileInfo("zipHash3",q=1)[0]
        zipFile.write( base64.b64decode(zipInfo) )
    if checkHash(filename,zipHash_1,zipHash_2,zipHash_3):
        cmds.fileInfo("zipInfo1",zipInfo)
        return filename
    return False

#this writes the local zip to this file
def loadInZip(filename):
    zipResults = []
    for ii in range(0,10):
        with open(filename,"r") as theRead:
            zipResults.append([base64.b64encode(theRead.read())]+checkHash(filename,md5(filename),md5(filename),md5(filename)))
        if ii>0 and zipResults[ii]==zipResults[ii-1]:
            cmds.fileInfo("zipInfo1",zipResults[ii][0])
            cmds.fileInfo("zipInfo2",zipResults[ii-1][0])
            cmds.fileInfo("zipHash1",zipResults[ii][2])
            cmds.fileInfo("zipHash2",zipResults[ii][3])
            cmds.fileInfo("zipHash3",zipResults[ii][4])
            return True

#file check
#http://stackoverflow.com/questions/3431825/generating-a-md5-checksum-of-a-file
def md5(fname):
    import hashlib
    hash = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash.update(chunk)
    return hash.hexdigest()

filename = path+'/toolbox_pkg.zip'
zipPaths = [path+'/toolbox_update.zip',
            path+'/toolbox_fix.zip',
            path+'/toolbox_'+__version__+'.zip',
            filename]
zipPaths_exist = [os.path.isfile(zipPath) for zipPath in zipPaths ]

if any(zipPaths_exist[:2]):
    if zipPaths_exist[0]:
        cmds.warning('Timeliner update present. Forcing file to update version')
        if zipPaths_exist[2]:
            os.remove(zipPaths[3])
        elif os.path.isfile(zipPaths[3]):
            os.rename(zipPaths[3], zipPaths[2])
        os.rename(zipPaths[0],zipPaths[3])
        if zipPaths_exist[1]:
            os.remove(zipPaths[1])
    else:
        cmds.warning('Timeliner fix present. Replacing file to the fix version.')
        if os.path.isfile(zipPaths[3]):
            os.remove(zipPaths[3])
        os.rename(zipPaths[1],zipPaths[3])
    loadInZip(filename)

if not cmds.fileInfo("zipInfo1",q=1) and not cmds.fileInfo("zipInfo2",q=1):
    loadInZip(filename)
if not os.path.isfile(filename):
    saveOutZip(filename)

sys.path.append(filename)
import Timeliner
Timeliner.InitTimeliner(theVers=__version__)

if not any(zipPaths[:2]):
    if __version__ > Timeliner.__version__:
        cmds.warning('Saving out newer version of timeliner to local machine. Restart Maya to access latest version.')
        saveOutZip(filename)
    elif __version__ < Timeliner.__version__:
        cmds.warning('Timeliner on machine is newer than file version. Saving machine version over timeliner in file.')
        loadInZip(filename)
        __version__ = Timeliner.__version__

if __name__ != "__main__":
    tl = getTimeliner()
    writeTimeFull(tl)
I have created a simple checksum script that computes the checksum of a file called tecu.a2l and compares it to a few .md5 files, ensuring that they all have the exact same checksum whenever the script runs.
To make things easier to understand:
Let's say I have tecu.a2l with the checksum 1x2x3x. The .md5 files (if generated correctly) should then contain the same checksum (1x2x3x). If one of the .md5 files has a different checksum than the latest tecu.a2l, the script should give an error.
Hopefully the code fills in the blanks if my description isn't clear.
import hashlib
import dst_creator_constants as CONST
import Tkinter

path_a2l = 'C:<path>\tecu.a2l'
md5 = hashlib.md5()
blocks = 65565

with open(path_a2l, 'rb') as a2l:
    readA2L = a2l.read(blocks)

generatedMD5 = md5.hexdigest()
print "stop1"

ihx_md5_files = CONST.PATH_DELIVERABLES_DST
for file in ihx_md5_files:
    print "stop2"
    if file.endswith('.md5'):
        print "stop3"
        readMD5 = file.read()
        if compare_checksums:
            print "Yes"
            # Add successful TkInter msg here
        else:
            print "No"
            # Add error msg here

def compare_checksums(generatedMD5, readMD5):
    if generatedMD5 == readMD5:
        return True
    else:
        return False
When I run this script, nothing happens: no messages, nothing. If I type python checksum.py into cmd, it returns no message. So I put in some print statements to see what the issue could be. stop3 is never shown in the command prompt, which means the problem has something to do with the if file.endswith('.md5'): statement.
I have no idea why it's the culprit, since I used the same file.endswith() check in a previous related script and it worked there, so I am turning to you.
You never update your hash object, so the file contents just sit in your readA2L variable. Also, your file may be larger than the 65565-byte buffer you allow for it. Update the hasher as in the function below and let us know what the result is.
import hashlib as h
from os.path import isfile

hasher = h.md5()
block_size = 65536

def get_hexdigest(file_path, hasher, block_size):
    if isfile(file_path):
        with open(file_path, 'rb') as f:
            buf = f.read(block_size)
            while len(buf) > 0:
                # Update the hasher until the entire file has been read
                hasher.update(buf)
                buf = f.read(block_size)
            digest = hasher.hexdigest()
    else:
        return None
    return digest
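For example, usage could look something like this (the paths are placeholders, and each call should get a fresh hasher since get_hexdigest mutates the one you pass in); it compares the generated digest of the .a2l against the digest text stored in each .md5 file:
import hashlib

a2l_digest = get_hexdigest(r'C:\deliverables\tecu.a2l', hashlib.md5(), 65536)
for md5_path in [r'C:\deliverables\tecu_a.md5', r'C:\deliverables\tecu_b.md5']:
    with open(md5_path, 'r') as f:
        stored = f.read().split()[0].strip().lower()  # first token is typically the hex digest
    print md5_path, 'OK' if stored == a2l_digest else 'MISMATCH'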
I would like to read (in Python 2.7), line by line, from a csv (text) file, which is 7z compressed. I don't want to decompress the entire (large) file, but to stream the lines.
I tried pylzma.decompressobj() unsuccessfully. I get a data error. Note that this code doesn't yet read line by line:
input_filename = r"testing.csv.7z"
with open(input_filename, 'rb') as infile:
obj = pylzma.decompressobj()
o = open('decompressed.raw', 'wb')
obj = pylzma.decompressobj()
while True:
tmp = infile.read(1)
if not tmp: break
o.write(obj.decompress(tmp))
o.close()
Output:
o.write(obj.decompress(tmp))
ValueError: data error during decompression
This will allow you to iterate the lines. It's partially derived from some code I found in an answer to another question.
At this point in time (pylzma-0.5.0) the py7zlib module doesn't implement an API that would allow archive members to be read as a stream of bytes or characters — its ArchiveFile class only provides a read() function that decompresses and returns the uncompressed data in a member all at once. Given that, about the best that can be done is return bytes or lines iteratively via a Python generator using that as a buffer.
The following does the latter, but may not help if the problem is the archive member file itself is huge.
The code below should work in Python 3.x as well as 2.7.
import io
import os
import py7zlib
class SevenZFileError(py7zlib.ArchiveError):
    pass

class SevenZFile(object):
    @classmethod
    def is_7zfile(cls, filepath):
        """ Determine if filepath points to a valid 7z archive. """
        is7z = False
        fp = None
        try:
            fp = open(filepath, 'rb')
            archive = py7zlib.Archive7z(fp)
            _ = len(archive.getnames())
            is7z = True
        finally:
            if fp: fp.close()
        return is7z

    def __init__(self, filepath):
        fp = open(filepath, 'rb')
        self.filepath = filepath
        self.archive = py7zlib.Archive7z(fp)

    def __contains__(self, name):
        return name in self.archive.getnames()

    def readlines(self, name, newline=''):
        r""" Iterator of lines from named archive member.

        `newline` controls how line endings are handled.
        It can be None, '', '\n', '\r', and '\r\n' and works the same way as it does
        in StringIO. Note however that the default value is different and is to enable
        universal newlines mode, but line endings are returned untranslated.
        """
        archivefile = self.archive.getmember(name)
        if not archivefile:
            raise SevenZFileError('archive member %r not found in %r' %
                                  (name, self.filepath))

        # Decompress entire member and return its contents iteratively.
        data = archivefile.read().decode()
        for line in io.StringIO(data, newline=newline):
            yield line

if __name__ == '__main__':
    import csv

    if SevenZFile.is_7zfile('testing.csv.7z'):
        sevenZfile = SevenZFile('testing.csv.7z')
        if 'testing.csv' not in sevenZfile:
            print('testing.csv is not a member of testing.csv.7z')
        else:
            reader = csv.reader(sevenZfile.readlines('testing.csv'))
            for row in reader:
                print(', '.join(row))
If you were using Python 3.3+, you might be able to do this using the lzma module which was added to the standard library in that version.
See: lzma Examples
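For instance, something along these lines. Note that lzma reads .xz and raw LZMA streams, not the .7z container format, so this sketch assumes the CSV has been recompressed as, say, testing.csv.xz:
import csv
import lzma

# Stream rows without decompressing the whole file to disk first.
with lzma.open('testing.csv.xz', 'rt', newline='') as f:
    for row in csv.reader(f):
        print(row)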
If you can use Python 3, there is a useful library, py7zr, which supports partial 7zip decompression, as below:
import py7zr
import re

filter_pattern = re.compile(r'<your/target/file_and_directories/regex/expression>')
with py7zr.SevenZipFile('archive.7z', 'r') as archive:
    allfiles = archive.getnames()
    selective_files = [f for f in allfiles if filter_pattern.match(f)]
    archive.extract(targets=selective_files)
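If you'd rather avoid touching disk, py7zr can also hand selected members back as in-memory file objects. A rough sketch, assuming the read() method that returns a dict of BytesIO objects (check your py7zr version's API):
import io
import py7zr

with py7zr.SevenZipFile('testing.csv.7z', 'r') as archive:
    extracted = archive.read(targets=['testing.csv'])  # {member name: BytesIO}
    for line in io.TextIOWrapper(extracted['testing.csv'], encoding='utf-8'):
        print(line.rstrip('\n'))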
Hello, my error is produced when generating a zip file. Can you tell me what I should do?
main.py", line 2289, in get
buf=zipf.read(2048)
NameError: global name 'zipf' is not defined
The complete code is as follows:
def addFile(self,zipstream,url,fname):
    # get the contents
    result = urlfetch.fetch(url)
    # store the contents in a stream
    f=StringIO.StringIO(result.content)
    length = result.headers['Content-Length']
    f.seek(0)
    # write the contents to the zip file
    while True:
        buff = f.read(int(length))
        if buff=="":break
        zipstream.writestr(fname,buff)
    return zipstream

def get(self):
    self.response.headers["Cache-Control"] = "public,max-age=%s" % 86400
    start=datetime.datetime.now()-timedelta(days=20)
    count = int(self.request.get('count')) if not self.request.get('count')=='' else 1000
    from google.appengine.api import memcache
    memcache_key = "ads"
    data = memcache.get(memcache_key)
    if data is None:
        a= Ad.all().filter("modified >", start).filter("url IN", ['www.koolbusiness.com']).filter("published =", True).order("-modified").fetch(count)
        memcache.set("ads", a)
    else:
        a = data
    dispatch='templates/kml.html'
    template_values = {'a': a , 'request':self.request,}
    path = os.path.join(os.path.dirname(__file__), dispatch)
    output = template.render(path, template_values)
    self.response.headers['Content-Length'] = len(output)
    zipstream=StringIO.StringIO()
    file = zipfile.ZipFile(zipstream,"w")
    url = 'http://www.koolbusiness.com/list.kml'
    # repeat this for every URL that should be added to the zipfile
    file =self.addFile(file,url,"list.kml")
    # we have finished with the zip so package it up and write the directory
    file.close()
    zipstream.seek(0)
    # create and return the output stream
    self.response.headers['Content-Type'] ='application/zip'
    self.response.headers['Content-Disposition'] = 'attachment; filename="list.kmz"'
    while True:
        buf=zipf.read(2048)
        if buf=="": break
        self.response.out.write(buf)
That is probably zipstream and not zipf. So replace that with zipstream and it might work.
I don't see where you declare zipf. zipfile? Senthil Kumaran is probably right that it should be zipstream, since you seek(0) on zipstream right before the while loop that reads chunks of the mystery variable.
edit:
Almost certainly the variable is zipstream.
zipfile docs:
class zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])
Open a ZIP file, where file can be either a path to a file (a string) or a file-like object. The mode parameter should be 'r' to read an existing file, 'w' to truncate and write a new file, or 'a' to append to an existing file. If mode is 'a' and file refers to an existing ZIP file, then additional files are added to it. If file does not refer to a ZIP file, then a new ZIP archive is appended to the file. This is meant for adding a ZIP archive to another file (such as python.exe).
your code:
zipstream=StringIO.StringIO()
creates a file-like object using StringIO, which is essentially a "memory file"; read more in the docs
file = zipfile.ZipFile(zipstream,"w")
opens the zipfile with the zipstream file-like object in 'w' mode
url = 'http://www.koolbusiness.com/list.kml'
# repeat this for every URL that should be added to the zipfile
file =self.addFile(file,url,"list.kml")
# we have finished with the zip so package it up and write the directory
file.close()
uses the addFile method to retrieve and write the fetched data to the file-like object and returns it. The variable names are slightly confusing because you pass the zipfile to addFile, where it is aliased as zipstream (confusing because we are also using zipstream as the StringIO "memory file"). Anyway, the zipfile is returned and closed to make sure everything is "written".
It was written to our "memory file", which we now seek to index 0
zipstream.seek(0)
and after doing some header stuff, we finally reach the while loop that will read our "memory-file" in chunks
while True:
    buf=zipstream.read(2048)
    if buf=="": break
    self.response.out.write(buf)
You need to declare:
global zipf
right after your
def get(self):
line. You are modifying a global variable, and this is the only way Python knows what you are doing.
I've had a look around for the answer to this, but I only seem to be able to find software that does it for you. Does anybody know how to go about doing this in python?
I wrote a piece of python code that verifies the hashes of downloaded files against what's in a .torrent file. Assuming you want to check a download for corruption you may find this useful.
You need the bencode package to use this. Bencode is the serialization format used in .torrent files. It can marshal lists, dictionaries, strings and numbers somewhat like JSON.
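To get a feel for the format, a quick round trip with the bencode package looks like this (values made up):
import bencode

data = {'name': 'example', 'piece length': 262144}
encoded = bencode.bencode(data)   # roughly: 'd4:name7:example12:piece lengthi262144ee'
assert bencode.bdecode(encoded) == data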
The code takes the hashes contained in the info['pieces'] string:
torrent_file = open(sys.argv[1], "rb")
metainfo = bencode.bdecode(torrent_file.read())
info = metainfo['info']
pieces = StringIO.StringIO(info['pieces'])
That string contains a succession of 20 byte hashes (one for each piece). These hashes are then compared with the hash of the pieces of on-disk file(s).
The only complicated part of this code is handling multi-file torrents because a single torrent piece can span more than one file (internally BitTorrent treats multi-file downloads as a single contiguous file). I'm using the generator function pieces_generator() to abstract that away.
You may want to read the BitTorrent spec to understand this in more detail.
Full code below:
import sys, os, hashlib, StringIO, bencode

def pieces_generator(info):
    """Yield pieces from download file(s)."""
    piece_length = info['piece length']
    if 'files' in info: # yield pieces from a multi-file torrent
        piece = ""
        for file_info in info['files']:
            path = os.sep.join([info['name']] + file_info['path'])
            print path
            sfile = open(path.decode('UTF-8'), "rb")
            while True:
                piece += sfile.read(piece_length-len(piece))
                if len(piece) != piece_length:
                    sfile.close()
                    break
                yield piece
                piece = ""
        if piece != "":
            yield piece
    else: # yield pieces from a single file torrent
        path = info['name']
        print path
        sfile = open(path.decode('UTF-8'), "rb")
        while True:
            piece = sfile.read(piece_length)
            if not piece:
                sfile.close()
                return
            yield piece

def corruption_failure():
    """Display error message and exit"""
    print("download corrupted")
    exit(1)

def main():
    # Open torrent file
    torrent_file = open(sys.argv[1], "rb")
    metainfo = bencode.bdecode(torrent_file.read())
    info = metainfo['info']
    pieces = StringIO.StringIO(info['pieces'])
    # Iterate through pieces
    for piece in pieces_generator(info):
        # Compare piece hash with expected hash
        piece_hash = hashlib.sha1(piece).digest()
        if (piece_hash != pieces.read(20)):
            corruption_failure()
    # ensure we've read all pieces
    if pieces.read():
        corruption_failure()

if __name__ == "__main__":
    main()
Here is how I've extracted the hash value from a torrent file:
#!/usr/bin/python
import sys, os, hashlib, StringIO
import bencode

def main():
    # Open torrent file
    torrent_file = open(sys.argv[1], "rb")
    metainfo = bencode.bdecode(torrent_file.read())
    info = metainfo['info']
    print hashlib.sha1(bencode.bencode(info)).hexdigest()

if __name__ == "__main__":
    main()
It is the same as running command:
transmissioncli -i test.torrent 2>/dev/null | grep "^hash:" | awk '{print $2}'
Hope, it helps :)
According to this, you should be able to find the md5sums of files by searching for the part of the data that looks like:
d[...]6:md5sum32:[hash is here][...]e
(SHA is not part of the spec)
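If you just want to pull those optional md5sum entries out of the decoded metainfo rather than grepping the raw data, a small sketch (not every torrent includes them, and the filename here is a placeholder):
import bencode

with open('test.torrent', 'rb') as f:
    info = bencode.bdecode(f.read())['info']

if 'files' in info:                      # multi-file torrent
    for file_info in info['files']:
        if 'md5sum' in file_info:
            print '/'.join(file_info['path']), file_info['md5sum']
elif 'md5sum' in info:                   # single-file torrent
    print info['name'], info['md5sum']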
In a web app I am working on, the user can create a zip archive of a folder full of files. Here's the code:
files = torrent[0].files
zipfile = z.ZipFile(zipname, 'w')
output = ""
for f in files:
    zipfile.write(settings.PYRAT_TRANSMISSION_DOWNLOAD_DIR + "/" + f.name, f.name)
downloadurl = settings.PYRAT_DOWNLOAD_BASE_URL + "/" + settings.PYRAT_ARCHIVE_DIR + "/" + filename
output = "Download " + torrent_name + ""
return HttpResponse(output)
But this has the nasty side effect of a long wait (10+ seconds) while the zip archive is being downloaded. Is it possible to skip this? Instead of saving the archive to a file, is it possible to send it straight to the user?
I do believe that torrentflux provides this exact feature I am talking about: being able to zip GBs of data and download it within a second.
Check this Serving dynamically generated ZIP archives in Django
As mandrake says, the constructor of HttpResponse accepts iterable objects.
Luckily, the ZIP format is such that an archive can be created in a single pass; the central directory record is located at the very end of the file:
(Picture from Wikipedia)
And luckily, zipfile indeed doesn't do any seeks as long as you only add files.
Here is the code I came up with. Some notes:
I'm using this code for zipping up a bunch of JPEG pictures. There is no point compressing them; I'm using ZIP only as a container.
Memory usage is O(size_of_largest_file), not O(size_of_archive). And this is good enough for me: many relatively small files that add up to a potentially huge archive.
This code doesn't set Content-Length header, so user doesn't get nice progress indication. It should be possible to calculate this in advance if sizes of all files are known.
Serving the ZIP straight to user like this means that resume on downloads won't work.
So, here goes:
import zipfile
from django.http import HttpResponse

class ZipBuffer(object):
    """ A file-like object for zipfile.ZipFile to write into. """

    def __init__(self):
        self.data = []
        self.pos = 0

    def write(self, data):
        self.data.append(data)
        self.pos += len(data)

    def tell(self):
        # zipfile calls this so we need it
        return self.pos

    def flush(self):
        # zipfile calls this so we need it
        pass

    def get_and_clear(self):
        result = self.data
        self.data = []
        return result

def generate_zipped_stream():
    sink = ZipBuffer()
    archive = zipfile.ZipFile(sink, "w")
    for filename in ["file1.txt", "file2.txt"]:
        archive.writestr(filename, "contents of file here")
        for chunk in sink.get_and_clear():
            yield chunk

    archive.close()
    # close() generates some more data, so we yield that too
    for chunk in sink.get_and_clear():
        yield chunk

def my_django_view(request):
    response = HttpResponse(generate_zipped_stream(), mimetype="application/zip")
    response['Content-Disposition'] = 'attachment; filename=archive.zip'
    return response
Here's a simple Django view function which zips up (as an example) any readable files in /tmp and returns the zip file.
from django.http import HttpResponse
import zipfile
import os
from cStringIO import StringIO # caveats for Python 3.0 apply

def somezip(request):
    file = StringIO()
    zf = zipfile.ZipFile(file, mode='w', compression=zipfile.ZIP_DEFLATED)
    for fn in os.listdir("/tmp"):
        path = os.path.join("/tmp", fn)
        if os.path.isfile(path):
            try:
                zf.write(path)
            except IOError:
                pass
    zf.close()
    response = HttpResponse(file.getvalue(), mimetype="application/zip")
    response['Content-Disposition'] = 'attachment; filename=yourfiles.zip'
    return response
Of course this approach will only work if the zip file fits conveniently into memory - if not, you'll have to use a disk file (which you're trying to avoid). In that case, you just replace file = StringIO() with file = open('/path/to/yourfiles.zip', 'wb'), and replace file.getvalue() with code that reads the contents of the disk file.
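In other words, the disk-backed variant of the same view would look roughly like this (the zip path is just a placeholder):
import os
import zipfile
from django.http import HttpResponse

def somezip_to_disk(request):
    zip_path = '/path/to/yourfiles.zip'   # placeholder location on disk
    zf = zipfile.ZipFile(zip_path, mode='w', compression=zipfile.ZIP_DEFLATED)
    for fn in os.listdir("/tmp"):
        path = os.path.join("/tmp", fn)
        if os.path.isfile(path):
            try:
                zf.write(path)
            except IOError:
                pass
    zf.close()
    with open(zip_path, 'rb') as f:
        response = HttpResponse(f.read(), mimetype="application/zip")
    response['Content-Disposition'] = 'attachment; filename=yourfiles.zip'
    return response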
Does the zip library you are using allow output to a stream? You could stream directly to the user instead of temporarily writing to a zip file and then streaming it to the user.
It is possible to pass an iterator to the constructor of an HttpResponse (see docs). That would allow you to create a custom iterator that generates data as it is being requested. However, I don't think that will work with a zip (you would have to send a partial zip as it is being created).
The proper way, I think, would be to create the files offline, in a separate process. The user could then monitor the progress and download the file when it's ready (possibly by using the iterator method described above). This would be similar to what sites like YouTube do when you upload a file and wait for it to be processed. A very rough sketch of that offline approach is below.
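The view names, paths, and the use of a plain thread here are all made up for illustration; a real setup would more likely hand the work to a task queue such as Celery:
import os
import threading
import zipfile
from django.http import HttpResponse

ARCHIVE_PATH = '/tmp/archive.zip'    # placeholder output location
SOURCE_DIR = '/tmp/source'           # placeholder folder to zip

def build_archive():
    tmp = ARCHIVE_PATH + '.part'
    zf = zipfile.ZipFile(tmp, 'w')
    for fn in os.listdir(SOURCE_DIR):
        zf.write(os.path.join(SOURCE_DIR, fn), fn)
    zf.close()
    os.rename(tmp, ARCHIVE_PATH)     # only appears under its final name once complete

def start_zip(request):
    # kick off the zip build and return immediately
    threading.Thread(target=build_archive).start()
    return HttpResponse("started")

def zip_status(request):
    # the client polls this until the archive is ready, then downloads it
    return HttpResponse("ready" if os.path.exists(ARCHIVE_PATH) else "pending")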