   linux.debian.bugs.dist      Ohh some weird Debian bug report thing      28,835 messages   

   Message 28,202 of 28,835   
   Albert Nash to All   
   Bug#1128417: pdftk-java writes dates in    
   19 Feb 26 15:30:01   
   
   From: AlbertNash0@ro.ru   
      
   Package: pdftk-java   
   Version: 3.3.3-2   
   Control: affects -1 exiftool   
      
   When updating the Info-dictionary date fields, pdftk-java encodes the date   
   string in UTF-16BE with BOM instead of ASCII (or PDFDocEncoding). This causes   
   an interoperability issue with exiftool, which does not normalize the dates
   into a human-readable form.
      
   Grab a sample PDF file, say, https://pdfobject.com/pdf/sample.pdf, and try to   
   update the creation date there with either update_info or update_info_utf8   
   (for our purposes, pick any):   
      
   $ pdftk sample.pdf update_info <(echo -e "InfoBegin\nInfoKey: CreationDate\nInfoValue: D:199812231952-08'00'") output sample_with_date.pdf
      
   Exiftool shows what's there, but the creation date is not in a human-readable   
   form, whereas the original modification date is way more readable:   
      
   $ exiftool -a -G sample_with_date.pdf | grep "PDF.*Date"   
   [PDF]           Modify Date                     : 2008:07:01 05:24:47Z   
   [PDF]           Create Date                     : D:199812231952-08'00'   
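
   For reference, the date syntax a consumer has to recognize can be sketched in
   Python. This is a minimal sketch under stated assumptions, not exiftool's
   implementation: the name parse_pdf_date and the regular expression are
   hypothetical, following the D:YYYYMMDDHHmmSSOHH'mm' layout from the PDF
   specification. It expects an already-decoded string, so a value that still
   carries a BOM and UTF-16BE bytes will not match until it has been decoded.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSSOHH'mm' where every field after the year is optional.
_PDF_DATE = re.compile(
    r"^D:(\d{4})(\d{2})?(\d{2})?(\d{2})?(\d{2})?(\d{2})?"
    r"(?:([Z+-])(?:(\d{2})'?(?:(\d{2})'?)?)?)?$"
)


def parse_pdf_date(value: str) -> datetime:
    """Parse an already-decoded PDF date string (hypothetical helper)."""
    m = _PDF_DATE.match(value)
    if m is None:
        raise ValueError(f"not a PDF date: {value!r}")
    year, month, day, hour, minute, second, sign, off_h, off_m = m.groups()

    tz = None  # no offset given: leave the datetime naive
    if sign == "Z":
        tz = timezone.utc
    elif sign in ("+", "-"):
        delta = timedelta(hours=int(off_h or 0), minutes=int(off_m or 0))
        tz = timezone(-delta if sign == "-" else delta)

    return datetime(
        int(year), int(month or 1), int(day or 1),
        int(hour or 0), int(minute or 0), int(second or 0),
        tzinfo=tz,
    )


# Both date values from the Info dictionary above parse once they are plain text.
print(parse_pdf_date("D:199812231952-08'00'").isoformat())
print(parse_pdf_date("D:20080701052447Z00'00'").isoformat())
```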
      
   The culprit is the encoding of the date in UTF-16BE, starting with the   
   byte-order mark FE FF:   
      
   $ mutool show sample_with_date.pdf trailer/Info | grep Date   
      /ModDate (D:20080701052447Z00'00')   
      /CreationDate    
   $ xxd sample_with_date.pdf | grep -A3 "Dat"   
   000045c0: 2028 5061 6765 7329 0a2f 4d6f 6444 6174   (Pages)./ModDat   
   000045d0: 6520 2844 3a32 3030 3830 3730 3130 3532  e (D:20080701052   
   000045e0: 3434 375a 3030 2730 3027 290a 2f43 7265  447Z00'00')./Cre   
   000045f0: 6174 696f 6e44 6174 6520 28fe ff00 4400  ationDate (...D.   
   00004600: 3a00 3100 3900 3900 3800 3100 3200 3200  :.1.9.9.8.1.2.2.   
   00004610: 3300 3100 3900 3500 3200 2d00 3000 3800  3.1.9.5.2.-.0.8.   
   00004620: 2700 3000 3000 2729 0a2f 5072 6f64 7563  '.0.0.')./Produc   
      
   Instead of UTF-16BE encoding beginning with FE FF, CreationDate should be   
   written as an ASCII string:   
   /CreationDate (D:199812231952-08'00')   
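
   The difference between the two encodings can be reproduced with a few lines
   of Python. This is only an illustration of the byte layout seen in the xxd
   dump, not code taken from pdftk-java; the variable names are arbitrary.

```python
import codecs

date = "D:199812231952-08'00'"

# ASCII form: one byte per character.
ascii_bytes = date.encode("ascii")

# UTF-16BE form as found in the file: BOM FE FF, then two bytes per character.
utf16_bytes = codecs.BOM_UTF16_BE + date.encode("utf-16-be")

print(ascii_bytes.hex(" "))
print(utf16_bytes.hex(" "))
```

   The second line of output starts with fe ff 00 44 00 3a, matching the bytes
   that follow "/CreationDate (" in the dump.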
      
   In the PDF spec 1.3,
   https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.3.pdf,
   Table 3.21, a date is a string, and a string is specified at the beginning of
   § 3.2.3 as a series of bytes (unsigned integer values in the range 0 to 255).
      
   In the PDF spec 1.7,
   https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf,
   Table 34, a date is an ASCII string.
      
   In the PDF spec 2.0,
   https://developer.adobe.com/document-services/docs/assets/5b15559b96303194340b99820d3a70fa/PDF_ISO_32000-2.pdf,
   Table 35, a date is also an ASCII string.
      
   Though § 7.9.4 in specs 1.7 and 2.0 says that the date is a text string, text
   strings can be PDFDocEncoded, and PDFDocEncoding contains printable ASCII.
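
   On the reading side, a consumer that wants to accept both forms could look
   for the BOM before interpreting the bytes. The following is a minimal Python
   sketch, not how exiftool or any other tool is known to work. The function
   name decode_text_string is hypothetical, and the fallback handles only the
   ASCII subset of PDFDocEncoding, since Python ships no PDFDocEncoding codec.

```python
import codecs


def decode_text_string(raw: bytes) -> str:
    """Decode the bytes of a PDF text string (hypothetical helper)."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Leading FE FF marks the rest of the string as UTF-16BE.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption: without a BOM, treat the value as the ASCII subset of
    # PDFDocEncoding; bytes outside ASCII raise UnicodeDecodeError here.
    return raw.decode("ascii")


utf16_value = codecs.BOM_UTF16_BE + "D:199812231952-08'00'".encode("utf-16-be")
ascii_value = b"D:199812231952-08'00'"

# Both byte forms decode to the same date text.
print(decode_text_string(utf16_value) == decode_text_string(ascii_value))
```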
      
   The aforementioned exiftool output demonstrates inconsistent encoding of date   
   fields within the same Info dictionary and reduced interoperability when   
   UTF-16BE is used.   
      
   Requested change: pdftk-java should write Info date fields using printable   
   ASCII (PDFDocEncoding subset) instead of UTF-16BE. Since the PDF date format   
   uses only printable ASCII characters, UTF-16BE encoding is unnecessary and
   results in inconsistent encoding and reduced interoperability with some PDF
   tools.
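
   The requested behavior amounts to a simple rule on the writing side, sketched
   here in Python for brevity. This is an assumption about what a fix could look
   like, not pdftk-java's actual code: encode_info_value is a hypothetical
   helper, and escaping of parentheses and backslashes in literal strings is
   left out.

```python
import codecs


def encode_info_value(value: str) -> bytes:
    """Pick the byte encoding for an Info dictionary value (hypothetical)."""
    if all(0x20 <= ord(ch) <= 0x7E for ch in value):
        # Printable ASCII only, which covers every valid PDF date string:
        # write the bytes as-is, with no BOM.
        return value.encode("ascii")
    # Anything else falls back to UTF-16BE with a leading BOM.
    return codecs.BOM_UTF16_BE + value.encode("utf-16-be")


print(encode_info_value("D:199812231952-08'00'"))
```

   With such a rule, date fields would always come out as plain ASCII, while
   values containing other characters would still be written as UTF-16BE.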
      
   This appears to be an upstream pdftk-java behavior rather than a Debian   
   packaging issue.   
      



(c) 1994,  bbs@darkrealms.ca