Just to close out my investigation here, this is a cleaned-up version with a proper context manager to replace the objc_util one, and with some extra fluff removed.
from objc_util import *
#from objc_util import autoreleasepool  # this doesn't work properly
import photos
import sys, os, gc
import contextlib
try:
    # append the path as a single entry (+= with a string would add one entry per character)
    sys.path.append(os.path.expanduser('~'))
    from objc_hacks.memstatus import _get_taskinfo
except Exception:
    _get_taskinfo = None  # memstatus is not installed, or the device is not supported
from PIL import Image
all_photos = photos.get_assets()
@contextlib.contextmanager
def autoreleasepool():
    '''Replacement for objc_util.autoreleasepool that doesn't crash when rerunning the script.

    objc_util's version has two issues. First, a bug in ObjCInstance.__del__ means
    that release gets called on the NSAutoreleasePool, which is never proper for an
    NSAutoreleasePool. This wouldn't be an issue while the object is still alive,
    since those calls get ignored, it seems. The second issue affects all
    ObjCInstances: the _cached_methods dict prevents timely garbage collection
    due to a ref cycle, so ObjCInstances stick around (and thus never get release
    called) until gc is run. The problem is that once the pool has been drained,
    we are not guaranteed to have it around later. Since objc_util doesn't retain
    it, it might get released after draining, and when gc finally breaks the ref
    cycle, release gets called on a pointer that might be a new object by then,
    so release causes a crash.'''
    pool = ObjCClass('NSAutoreleasePool').new()
    try:
        yield
    finally:
        pool.drain()
        pool._cached_methods.clear()  # break ref cycle
        pool.__del__ = None  # prevent release from being called
for idx, asset in enumerate(all_photos):
    with autoreleasepool():
        # debugging
        if _get_taskinfo:
            print(f'\nSTART {_get_taskinfo().resident_size/1024/1024:3.2f} MB')
        # get the data
        fname = str(ObjCInstance(asset).valueForKey_('filename'))
        payload = asset.get_image_data(original=True)
        # Do some actual work here
        # ... insert upload, save, etc. code
        # Clean up
        del payload  # not actually needed
        ObjCInstance(asset)._cached_methods.clear()  # required !!!
        del asset  # not actually needed
        # debugging
        if _get_taskinfo:
            print(f'END. {_get_taskinfo().resident_size/1024/1024:3.2f} MB')
As written, the memory is basically constant each cycle:
START 76.20 MB
END. 76.52 MB
START 76.20 MB
END. 76.52 MB
START 76.20 MB
END. 76.94 MB
START 76.20 MB
END. 76.79 MB
START 76.20 MB
END. 76.85 MB
If I comment out the with autoreleasepool() line, essentially not using a release pool, it becomes obvious why looping over PHAssets never works: despite deleting the ObjC objects, the actual PHAsset data does not get cleared without draining the pool, and we gain roughly 1 to 2 MB per cycle:
START 94.90 MB
END. 96.81 MB
START 96.81 MB
END. 98.68 MB
START 98.68 MB
END. 99.94 MB
START 99.94 MB
END. 100.99 MB
START 101.09 MB
END. 102.03 MB
START 102.02 MB
END. 102.89 MB
If I use the pool, but comment out the two del lines:
START 63.28 MB
END. 65.14 MB
START 63.70 MB
END. 64.77 MB
START 63.51 MB
END. 62.46 MB
START 62.36 MB
END. 62.43 MB
START 62.35 MB
END. 66.07 MB
START 64.17 MB
END. 66.01 MB
We see that there is no net growth, although there is somewhat more variation from cycle to cycle, because the BytesIO object payload is still in scope at the start of the next iteration. If the meat of the loop is moved to a function, the results are the same with or without the del. This shows that reference-counting gc does do its job: deleting the BytesIO object (payload) is not necessary (this was a point of contention with @ccc in a previous thread). When it falls out of scope, the memory is cleared.
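To illustrate the scope point without needing Pythonista, here is a minimal pure-Python sketch. It assumes CPython's reference counting, and the Payload class and process_one function are hypothetical stand-ins for the BytesIO payload and the loop body, not part of photos or objc_util.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for the BytesIO returned by get_image_data()."""

    def __init__(self, size):
        self.data = bytes(size)


freed = []  # indices of payloads that have been deallocated


def process_one(idx):
    payload = Payload(1024 * 1024)
    # Register a callback that fires when the payload is deallocated.
    weakref.finalize(payload, freed.append, idx)
    return len(payload.data)  # note: no explicit `del payload`


gc.disable()  # rule out the cycle collector; only reference counting is left
try:
    for idx in range(3):
        process_one(idx)
        # The payload is already gone by the time the call returns.
        assert freed == list(range(idx + 1))
finally:
    gc.enable()
```

Because the payload is a local of the function, its last reference disappears when the function returns, so nothing carries over into the next iteration.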
Finally, to see the effect of the objc_util bug that prevents ObjCInstances from getting cleared when they fall out of scope, I comment out the
ObjCInstance(asset)._cached_methods.clear() line.
Since ObjCInstance objects themselves are not that big, we see a slow leak, growing maybe 0.1 MB every few hundred images. I seemed to get more crashes when starting and stopping the script many times, but that could be my imagination. So I conclude that this is probably not needed unless you are creating many tens of thousands of ObjCInstances, or you need to ensure that an object's release is called at a specific time for some other reason.
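For anyone curious how a method cache can delay cleanup, here is a rough pure-Python sketch of the mechanism. The Wrapper class is a hypothetical stand-in, and I am assuming the cycle in objc_util has this general shape (instance -> cache dict -> cached method -> instance); it is not objc_util's actual code.

```python
import gc
import weakref


class Wrapper:
    """Hypothetical stand-in for ObjCInstance: caches bound methods on itself."""

    def __init__(self):
        self._cached_methods = {}

    def call(self, name):
        # Storing a bound method in the instance's own dict creates a cycle:
        # self -> _cached_methods -> bound method -> self
        if name not in self._cached_methods:
            self._cached_methods[name] = self._invoke
        return self._cached_methods[name](name)

    def _invoke(self, name):
        return name


def alive_after_del(clear_cache):
    """Return True if the wrapper survives deletion of its last name."""
    w = Wrapper()
    w.call('description')
    ref = weakref.ref(w)
    if clear_cache:
        w._cached_methods.clear()  # break the cycle by hand
    del w
    return ref() is not None


gc.disable()  # keep the cycle collector from running mid-demo
try:
    leaked = alive_after_del(clear_cache=False)  # cycle keeps it alive
    prompt = alive_after_del(clear_cache=True)   # freed immediately
    gc.collect()  # the leaked wrapper is only reclaimed here
finally:
    gc.enable()

print(leaked, prompt)
```

With the cache left in place, the object lingers until the cycle collector runs, which is why release can end up being called much later than expected.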