

   comp.editors      What? Edlin ain't good enough for you?      123,932 messages   


   Message 123,802 of 123,932   
   G to Marion   
   Re: What is the best free software for c   
   03 Mar 25 09:19:35   
   
   XPost: alt.comp.os.windows-11, comp.text.pdf, alt.comp.os.windows-10   
   From: g@nowhere.invalid   
      
   In comp.editors Marion wrote:
   > On Sun, 2 Mar 2025 22:03:12 -0000 (UTC), Lawrence D'Oliveiro wrote :   
   >   
   >> Perhaps another one I should mention is PDFMiner. This is a bit of a   
   >> specialist one, focused on extracting text items from a PDF page, and   
   >> using various heuristics to try to reassemble them into larger text   
   >> blocks.   
   >   
   > Thank you for adding value to the spirit of this conversation where the PDF   
   > experts and editing experts are involved, along with the Windows users.   
   >   
   > Looking up what the PDFMiner Python tool can do for us, it's important to   
   > note it's apparently designed for extracting information from PDF files.   
   >   
   > I'm not quite sure how PDFMiner differs from any of the other text
   > extractors (e.g., PDF to TEXT) but it seems to gather layout data also.   
   >   
   > While it can extract metadata, it seems to me it's mostly used to "mine"   
   > large assemblages of PDF files for textual data of interest to the user.   
   >   
   > The original PDFMiner has apparently been forked as pdfminer.six, which,
   > as far as I can tell from date stamps, is still actively being updated.   
   >   
   > Since it functions on Windows (within the Python environment) and since it
   > does something useful (mine text in PDFs), I'll add it to the PDF chart as
   > [x] Extract text (poppler) or mine text & metadata (pdfminer.six)
   >   
   > Here's the current chart, where I simply ask for more things done to PDFs.   
   > [?] Print book format PDF (FinePrint payware)   
   > [x] Add or concatenate pages (pdftk, acrobat payware)   
   > [x] Add signature (Adobe Reader Fill-and-sign sign-yourself tool)   
   > [x] Archive sites (wkhtmltopdf, Acrobat payware, fastone scroll capture)
   > [x] Compress PDFs (ImageMagick, PDFgear, rlvision)   
   > [x] Convert PDF to MSOffice (PDFgear, Calibre for MS Word only)   
   > [x] Convert PDF to MSWord (Calibre, PDFgear)   
   > [x] Convert PDF to epub format (Calibre)   
   > [x] Convert PDF to PostScript (Calibre, Poppler)   
   > [x] Converts PDFs to HTML (poppler)   
   > [x] Converts PDFs to PNG, JPEG, etc (poppler) using Cairo graphics   
   > [x] Converts PDFs to PPM/PGM/PBM image formats (poppler)   
   > [x] Create PDF new text (Irfanview or Paint.NET plugins + Ghostscript)   
   > [x] Edit PDF existing text (Adobe Reader commenting, Acrobat payware)   
   > [x] Embeds files into a PDF as attachments (poppler)   
   > [x] Extract images (PDF-XChange Viewer, PDF Shaper, PDFgear, poppler)
   > [x] Extract text (poppler) or mine text & metadata (pdfminer.six)
   > [x] Extracts embedded files (attachments) from a PDF (poppler)   
   > [x] Fastest PDF readers (Sumatra or Foxit)   
   > [x] Globally search & replace PDF text (Libre Office)   
   > [x] List fonts used in a PDF (poppler)   
   > [x] Merge PDFs (pdfsam, pdftk, PDFgear, Poppler)   
   > [x] Metadata display on command line (poppler)   
   > [x] Metadata removal (LibreOffice Writer, PDFgear offline)   
   > [x] OCR (PDF-XChange, freeOCR (paperfile.net), GOCR (jocr.sourceforge.net))
   > [x] Offline encrypt PDF with a password (pdfencrypt)   
   > [x] Online shrink PDF    
   > [x] PDF text to audio file (Balabolka)   
   > [x] Remove pages (pdfsam, pdftk)   
   > [x] Remove restrictions (Ghostscript, Ghostview, ps2edit, pdfwrite, pdf2djvu)
   > [x] Renumber pages (Acrobat Reader)   
   > [x] Reorder pages (mutool)   
   > [x] Rotate pages (Acrobat Reader)   
   > [x] Separates a PDF into individual pages (poppler)   
   > [x] Split PDFs (PDFgear, Poppler)   
   > [x] Tile PDFs (i.e., to print large posters) (Posterazor)   
   > [?] What other tasks do you do to edit or modify a PDF file?   
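
   On the PDFMiner point quoted above: the idea of reassembling scattered
   text items into larger blocks can be shown with a toy sketch. This is
   NOT PDFMiner's actual algorithm or API; the TextItem type, the function
   names and the two tolerance values are all made up for illustration.
   It only groups items with nearly equal baselines into lines, then joins
   lines that sit close together vertically into blocks.

```python
from dataclasses import dataclass


@dataclass
class TextItem:
    """One positioned text fragment (hypothetical type, for illustration only)."""
    x: float   # left edge of the fragment
    y: float   # baseline, measured upward from the bottom of the page
    text: str


def group_into_lines(items, y_tolerance=2.0):
    """Group items with nearly equal baselines into lines, top to bottom."""
    lines = []  # each entry is [baseline_y, [items on that line]]
    for item in sorted(items, key=lambda i: -i.y):
        if lines and abs(lines[-1][0] - item.y) <= y_tolerance:
            lines[-1][1].append(item)
        else:
            lines.append([item.y, [item]])
    # Within each line, read the fragments left to right.
    return [
        (y, " ".join(i.text for i in sorted(row, key=lambda i: i.x)))
        for y, row in lines
    ]


def group_into_blocks(items, y_tolerance=2.0, max_line_gap=14.0):
    """Merge consecutive lines into one block when the vertical gap is small."""
    blocks = []
    prev_y = None
    for y, text in group_into_lines(items, y_tolerance):
        if prev_y is not None and prev_y - y <= max_line_gap:
            blocks[-1].append(text)
        else:
            blocks.append([text])
        prev_y = y
    return ["\n".join(block) for block in blocks]


if __name__ == "__main__":
    sample = [
        TextItem(110, 700.5, "world"),
        TextItem(72, 700, "Hello"),
        TextItem(72, 688, "second line"),
        TextItem(72, 640, "New paragraph"),
    ]
    for block in group_into_blocks(sample):
        print(block)
        print("---")
```

   A real extractor has to cope with columns, rotated text and varying
   font sizes, so fixed tolerances like these would only be a starting
   point.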
      
   I suppose all are available for Windows; it would be useful to know which
   are also available for Linux or Mac.
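
   One rough way to start answering that for the command-line tools is to
   run a small check on each platform and see which executables turn up
   on PATH. The sketch below uses only Python's standard shutil.which.
   The command names are assumptions about how each tool is commonly
   installed (Ghostscript, for one, is typically "gs" on Linux/Mac but
   named differently on Windows), and a tool being absent from one
   machine does not mean it is unavailable for that platform.

```python
import shutil

# Label -> command name. These names are assumptions; packaging differs
# between platforms and distributions, so adjust them for your system.
TOOLS = {
    "poppler (pdftotext)": "pdftotext",
    "poppler (pdfinfo)": "pdfinfo",
    "pdftk": "pdftk",
    "mutool": "mutool",
    "Ghostscript": "gs",
    "wkhtmltopdf": "wkhtmltopdf",
}


def find_tools(tools=TOOLS, which=shutil.which):
    """Map each tool label to the executable path found on PATH, or None."""
    return {label: which(command) for label, command in tools.items()}


if __name__ == "__main__":
    # Print the result in the same [x] style as the chart above.
    for label, path in find_tools().items():
        mark = "[x]" if path else "[ ]"
        print(f"{mark} {label}: {path or 'not found on PATH'}")
```

   GUI programs from the chart (Sumatra, PDFgear and so on) would still
   need to be checked by hand.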
      
   G   
      
   --- SoupGate-DOS v1.05   
    * Origin: you cannot sedate... all the things you hate (1:229/2)   



(c) 1994,  bbs@darkrealms.ca