

   comp.editors      What? Edlin ain't good enough for you?      123,932 messages   


   Message 123,800 of 123,932   
   Marion to Lawrence D'Oliveiro   
   Re: What is the best free software for c   
   03 Mar 25 03:17:31   
   
   XPost: alt.comp.os.windows-11, comp.text.pdf, alt.comp.os.windows-10   
   From: marion@facts.com   
      
   On Sun, 2 Mar 2025 22:03:12 -0000 (UTC), Lawrence D'Oliveiro wrote :   
      
      
   > Perhaps another one I should mention is PDFMiner. This is a bit of a   
   > specialist one, focused on extracting text items from a PDF page, and   
   > using various heuristics to try to reassemble them into larger text   
   > blocks.   
      
   Thank you for adding value to the spirit of this conversation where the PDF   
   experts and editing experts are involved, along with the Windows users.   
      
   Looking up what the PDFMiner Python tool can do for us, it's important to   
   note it's apparently designed for extracting information from PDF files.   
      
   I'm not quite sure how PDFMiner differs from any of the other text
   extractors (e.g., PDF to TEXT), but it seems to gather layout data also.
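
   To make the "layout data" part concrete, here is a rough Python sketch of
   how I understand pdfminer.six exposes it: each text block comes with a
   bounding box in PDF points. The helper names (describe_block, text_blocks)
   are my own, and this assumes pdfminer.six is installed; treat it as an
   illustration rather than the recommended way to use the library.

```python
def describe_block(bbox, text):
    """Format one text block with its bounding box (x0, y0, x1, y1 in PDF points)."""
    x0, y0, x1, y1 = bbox
    return f"({x0:.0f},{y0:.0f})-({x1:.0f},{y1:.0f}): {text.strip()}"


def text_blocks(pdf_path):
    """Yield (page_number, description) for every text block in the PDF."""
    # Imported here so describe_block() can be used without pdfminer.six installed.
    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer

    for page_number, page in enumerate(extract_pages(pdf_path), start=1):
        for element in page:
            # Only text containers carry text; skip figures, lines, rectangles.
            if isinstance(element, LTTextContainer):
                yield page_number, describe_block(element.bbox, element.get_text())
```

   Looping over text_blocks("some.pdf") (a hypothetical file name) would then
   print where each block sits on the page along with its text.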
      
   While it can extract metadata, it seems to me it's mostly used to "mine"   
   large assemblages of PDF files for textual data of interest to the user.   
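
   As a sketch of what "mining" a pile of PDFs might look like, the Python
   below walks a folder and keeps the lines that mention a search term. It
   assumes pdfminer.six is installed; the function names and the
   case-insensitive line matching are my own choices, not anything the tool
   prescribes.

```python
from pathlib import Path


def matching_lines(text, term):
    """Return the stripped lines of `text` that contain `term`, ignoring case."""
    needle = term.lower()
    return [line.strip() for line in text.splitlines() if needle in line.lower()]


def mine_folder(folder, term):
    """Map each PDF path under `folder` to the lines of its text mentioning `term`."""
    # Imported here so matching_lines() works without pdfminer.six installed.
    from pdfminer.high_level import extract_text

    hits = {}
    for pdf_path in sorted(Path(folder).rglob("*.pdf")):
        found = matching_lines(extract_text(str(pdf_path)), term)
        if found:
            hits[str(pdf_path)] = found
    return hits
```

   Calling mine_folder on a directory of manuals with a term such as
   "warranty" would return only the files, and the lines in them, that
   mention it.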
      
   The original PDFMiner has apparently been forked as pdfminer.six, which,
   as far as I can tell from date stamps, is still actively being updated.
   
   Since it functions on Windows (within the Python environment) and since it
   does something useful (mining text in PDFs), I'll add it to the PDF chart as
   [x] Extract text (poppler) or mine text & metadata (pdfminer.six)
      
   Here's the current chart, where I simply ask for more things done to PDFs.   
   [?] Print book format PDF (FinePrint payware)   
   [x] Add or concatenate pages (pdftk, acrobat payware)   
   [x] Add signature (Adobe Reader Fill-and-sign sign-yourself tool)   
   [x] Archive sites (wkhtmltopdf, Acrobat payware, FastStone scroll capture)
   [x] Compress PDFs (ImageMagick, PDFgear, rlvision)   
   [x] Convert PDF to MSOffice (PDFgear, Calibre for MS Word only)   
   [x] Convert PDF to MSWord (Calibre, PDFgear)   
   [x] Convert PDF to epub format (Calibre)   
   [x] Convert PDF to PostScript (Calibre, Poppler)   
   [x] Convert PDFs to HTML (poppler)
   [x] Convert PDFs to PNG, JPEG, etc (poppler) using Cairo graphics
   [x] Convert PDFs to PPM/PGM/PBM image formats (poppler)
   [x] Create PDF new text (Irfanview or Paint.NET plugins + Ghostscript)   
   [x] Edit PDF existing text (Adobe Reader commenting, Acrobat payware)   
   [x] Embeds files into a PDF as attachments (poppler)   
   [x] Extract images (PDF-XChange Viewer, PDF Shaper, PDFgear, poppler)
   [x] Extract text (poppler) or mine text & metadata (pdfminer.six)
   [x] Extracts embedded files (attachments) from a PDF (poppler)   
   [x] Fastest PDF readers (Sumatra or Foxit)   
   [x] Globally search & replace PDF text (LibreOffice)
   [x] List fonts used in a PDF (poppler)   
   [x] Merge PDFs (pdfsam, pdftk, PDFgear, Poppler)   
   [x] Metadata display on command line (poppler)   
   [x] Metadata removal (LibreOffice Writer, PDFgear offline)   
   [x] OCR (PDF-XChange; FreeOCR, paperfile.net; GOCR, jocr.sourceforge.net)
   [x] Offline encrypt PDF with a password (pdfencrypt)   
   [x] Online shrink PDF    
   [x] PDF text to audio file (Balabolka)   
   [x] Remove pages (pdfsam, pdftk)   
   [x] Remove restrictions (Ghostscript, Ghostview, ps2edit, pdfwrite, pdf2djvu)
   [x] Renumber pages (Acrobat Reader)   
   [x] Reorder pages (mutool)   
   [x] Rotate pages (Acrobat Reader)   
   [x] Separates a PDF into individual pages (poppler)   
   [x] Split PDFs (PDFgear, Poppler)   
   [x] Tile PDFs (i.e., to print large posters) (Posterazor)   
   [?] What other tasks do you do to edit or modify a PDF file?   
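
   Since poppler shows up so often in the chart above, here is a rough Python
   sketch that maps a few of those tasks to the Poppler command-line
   utilities as I understand them (pdftotext, pdffonts, pdfinfo, pdfseparate,
   pdftocairo, pdfunite). The task labels, helper names and output file
   patterns are my own; it assumes the Poppler tools are on the PATH, and the
   flags are worth checking against your installed version.

```python
import shutil
import subprocess

# Task label (my own wording) -> function building the argument list.
POPPLER_TASKS = {
    "extract text": lambda pdf: ["pdftotext", "-layout", pdf, "-"],  # "-" = stdout
    "list fonts": lambda pdf: ["pdffonts", pdf],
    "show metadata": lambda pdf: ["pdfinfo", pdf],
    "split into pages": lambda pdf: ["pdfseparate", pdf, "page-%d.pdf"],
    "render to PNG": lambda pdf: ["pdftocairo", "-png", pdf, "page"],
}


def poppler_command(task, pdf):
    """Return the argument list for one of the tasks above."""
    return POPPLER_TASKS[task](pdf)


def merge_command(inputs, output):
    """pdfunite takes the input PDFs first and the merged output file last."""
    return ["pdfunite", *inputs, output]


def run(argv):
    """Run a command built above and return whatever it prints to stdout."""
    if shutil.which(argv[0]) is None:
        raise FileNotFoundError(f"{argv[0]} not found on PATH; is Poppler installed?")
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout
```

   For example, run(poppler_command("show metadata", "book.pdf")) would
   return the pdfinfo report for a hypothetical book.pdf.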
      
   --- SoupGate-DOS v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca