Extracting metadata from image files (IPTC): erratic problems (of object class?) for jpg files with Python os.path.isfile()

I'm stuck on a tricky issue with image files (on PythonAnywhere, PAW).
While writing code to extract metadata from image files, I succeed with some of them, while others are reported as not existing, or as not being a file.

However, all the jpg files should be equivalent as far as my code is concerned:

  • All are in the same folder, and I have no path problem with them.
  • All are jpg or jpeg files. Some of them have manually added metadata (via the ACDSee tool on Windows), but these particular files are processed correctly.

import os, os.path
import iptcinfo3

from pathlib import Path

for fname in os.listdir('static/ImagesCabWillA'):
    fil1=Path(fname)
    if not fil1.exists():
        print(fname, " doesn't exist")
print("§"*15)

for fname in os.listdir('static/ImagesCabWillA'):
    if os.path.isfile(fname):
        print('ok ', fname, 'type : ', type(fname))
    else:
        print('NOT ok : ',fname, 'type : ', type(fname))

print('==== end list1 ====')

for fname in os.listdir('static/ImagesCabWillA'):
    if os.path.isfile(fname) and fname.split(".")[1] in ("jpg","jpeg","JPG","JPEG"):
        print("\n",fname,"\n")
        info = iptcinfo3.IPTCInfo(fname, force=True)
        print('-----------')
        print(info)
        # python what is b prefix in string printing : byte, byte string
        print (type(info["headline"]))
        print('=============')      #       <class 'bytes'>
#        if isinstance(info['object name'],bytes):
        print('zzzzzzzzz')
        print(info["headline"].decode("utf-8"))     #       , "\n"
        print(info["object name"].decode("utf-8"), "\n" )
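
Two details of the last loop above may be worth hardening, independently of the main problem. This is only a sketch under assumptions, not a confirmed fix. `fname.split(".")[1]` takes the text after the first dot, which is not the extension when a name contains more than one dot, and `.decode("utf-8")` raises if a field is absent or is not valid UTF-8. The helper names `is_jpeg` and `to_text` are hypothetical, and it is an assumption that a missing IPTC field comes back as `None`.

```python
import os

JPEG_EXTENSIONS = {".jpg", ".jpeg"}


def is_jpeg(name):
    """True if the file name ends with a JPEG extension, in any letter case."""
    return os.path.splitext(name)[1].lower() in JPEG_EXTENSIONS


def to_text(value, encoding="utf-8"):
    """Turn a metadata value into str without raising.

    Assumption: a value is None (field absent), bytes, or already a str.
    """
    if value is None:
        return ""
    if isinstance(value, bytes):
        # errors="replace" substitutes a marker for bytes that are not valid UTF-8
        return value.decode(encoding, errors="replace")
    return str(value)


# Small demonstration with names taken from the folder listing
print(is_jpeg("03_IMG-20220517-WA0006.jpeg"))
print(is_jpeg("photo.2024.JPG"))   # a name where split(".")[1] would give "2024"
print(repr(to_text(None)))
```

With helpers like these, the loop could call `to_text(info["headline"])` instead of decoding directly, so that one image without a headline does not stop the whole run.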

I have searched quite a lot on Google, but I haven't found the cause:

  • There is nothing specific to Flask in the problematic code; it is plain and simple Python.
  • I tried both os.path.isfile() and pathlib.Path(), but the problem is the same.
  • I suspect some heterogeneity in the encoding of the jpg file names, or in the internal encoding of the jpg files (some came initially via WhatsApp, others from a phone or a camera).
  • I suspect that the file-object class is modified by the processing in the for loop.
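
One further hypothesis can be tested directly. It is offered as a sketch, not as a confirmed diagnosis. `os.listdir()` returns bare file names without the directory, so `os.path.isfile(fname)` and `Path(fname).exists()` resolve each name against the current working directory, not against `static/ImagesCabWillA`. Under that assumption, a name would only pass the test if a file with the same name also happens to sit in the working directory. The function names `check_listing` and `demo`, and the temporary folders, are illustrative only.

```python
import os
import tempfile


def check_listing(folder):
    """Return (name, bare_ok, joined_ok) for each entry in folder.

    bare_ok   : os.path.isfile() on the bare name (relative to the cwd)
    joined_ok : os.path.isfile() on the name joined with the folder
    """
    results = []
    for name in sorted(os.listdir(folder)):
        bare_ok = os.path.isfile(name)
        joined_ok = os.path.isfile(os.path.join(folder, name))
        results.append((name, bare_ok, joined_ok))
    return results


def demo():
    """Run check_listing on throwaway folders so the sketch runs anywhere."""
    with tempfile.TemporaryDirectory() as work, \
            tempfile.TemporaryDirectory() as images:
        for name in ("01.jpg", "02_DSCF0225.JPG"):
            open(os.path.join(images, name), "wb").close()
        # Simulate a stray copy of one image in the working directory
        open(os.path.join(work, "01.jpg"), "wb").close()

        previous = os.getcwd()
        os.chdir(work)
        try:
            return check_listing(images)
        finally:
            # Leave the temporary folder before it is deleted
            os.chdir(previous)


for row in demo():
    print(row)
```

Running `check_listing('static/ImagesCabWillA')` from the same working directory as the script would show whether the "NOT ok" names are exactly those for which only the joined path is a file.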

Any ideas would be most welcome.
My current code, focusing on debugging this issue:
test_iptc3.py

Folder content:
01.jpg                        2024-03-22 20:33   45.3 KB
02_DSCF0225.JPG               2024-03-22 20:33  525.5 KB
03_IMG-20220517-WA0006.jpeg   2024-03-22 20:33  317.7 KB
04_IMG_20220207_182856.jpg    2024-03-22 20:33  386.8 KB
05_IMG-20220325-WA0014.jpeg   2024-03-22 20:33  201.5 KB
06_IMG-20220517-WA0013.jpeg   2024-03-22 20:33  303.9 KB
07_IMG-20220325-WA0021.jpeg   2024-03-22 20:33  197.9 KB
08_IMG_20220509_193214.jpg    2024-03-22 20:33  430.4 KB
09_DSCF8528.JPG               2024-03-22 20:33  273.1 KB
DSCF9384Lite.JPG              2024-04-14 17:42  462.2 KB
DSCF9387Lite.JPG              2024-04-14 16:33  496.4 KB

Python console output:

Pierre-Yves Delens is a new contributor to this site.