
How To Detect A Rotated Page In A PDF Document In Python?

Given a PDF document with multiple pages, how can I check whether a given page is rotated (-90, 90 or 180º)? Preferably using Python (pdfminer, pyPDF) ... UPDATE: The pages are scanned, a

Solution 1:

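I simply used the /Rotate attribute of the page in PyPDF2:

import PyPDF2

pdf = PyPDF2.PdfFileReader(open('example.pdf', 'rb'))
orientation = pdf.getPage(pagenumber).get('/Rotate')  # pagenumber is zero-based

The value can be 0, 90, 180, 270 or None (None when the page dictionary has no /Rotate entry).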
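Since the value may be None, it can help to normalize it before comparing. The following is only a minimal sketch: the helper name normalize_rotation is hypothetical (not part of PyPDF2), and it assumes that a missing entry should count as 0 and that negative angles such as -90 should map to their positive equivalent.

```python
def normalize_rotation(value):
    """Map a raw /Rotate value to one of 0, 90, 180 or 270.

    Hypothetical helper, not part of any PDF library.
    Assumption: a missing value (None) means "not rotated".
    """
    if value is None:
        return 0
    angle = int(value) % 360  # -90 becomes 270, 450 becomes 90
    if angle % 90 != 0:
        raise ValueError(f"rotation must be a multiple of 90, got {value!r}")
    return angle


print(normalize_rotation(None))  # 0
print(normalize_rotation(-90))   # 270
```

A page can then be treated as rotated whenever normalize_rotation(orientation) is non-zero.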

Solution 2:

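If you're using pdfminer, you can get the rotation from the .rotate attribute of a PDFPage instance.

# doc is a PDFDocument; interpreter is a PDFPageInterpreter set up beforehand
for page in PDFPage.create_pages(doc):
    interpreter.process_page(page)
    r = page.rotate  # rotation in degrees (0, 90, 180 or 270)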
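To answer the original question for a whole document, the per-page values can be collected into a list of rotated pages. This is a rough sketch: rotated_pages is a hypothetical helper, and the SimpleNamespace objects are stand-ins for real page objects so that the example runs without a PDF file. It only assumes that each page exposes a .rotate attribute.

```python
from types import SimpleNamespace


def rotated_pages(pages):
    """Return (page_number, angle) pairs for pages with non-zero rotation.

    Hypothetical helper; works on any iterable of objects that have
    a .rotate attribute. Page numbers start at 1.
    """
    result = []
    for number, page in enumerate(pages, start=1):
        # A missing or None rotation is treated as 0 (an assumption).
        angle = (getattr(page, "rotate", 0) or 0) % 360
        if angle:
            result.append((number, angle))
    return result


# Stand-in pages for illustration; with pdfminer you would pass the
# PDFPage objects yielded by the library instead.
fake_pages = [
    SimpleNamespace(rotate=0),
    SimpleNamespace(rotate=90),
    SimpleNamespace(rotate=180),
]
print(rotated_pages(fake_pages))  # [(2, 90), (3, 180)]
```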

Solution 3:

If you're using PDFMiner and want the orientation of each page:

from io import StringIO
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfparser import PDFParser
from pdfminer.pdfdocument import PDFDocument
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams

output_string = StringIO()
resource_manager = PDFResourceManager()
device = TextConverter(resource_manager, output_string, laparams=LAParams())
interpreter = PDFPageInterpreter(resource_manager, device)

for page in PDFPage.get_pages(open('sample.pdf', 'rb')):
    interpreter.process_page(page)

    if page.mediabox[2] - page.mediabox[0] > page.mediabox[3] - page.mediabox[1]:
        orientation = 'Landscape'
    else:
        orientation = 'Portrait'
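Note that the mediabox comparison describes the page before any rotation is applied, so a portrait mediabox with a rotation of 90 or 270 is displayed as landscape. A possible way to combine the two values is sketched below. The function displayed_orientation is a hypothetical helper, and it assumes the mediabox is given as four numbers (x0, y0, x1, y1) and the rotation as a multiple of 90.

```python
def displayed_orientation(mediabox, rotate=0):
    """Return 'Landscape' or 'Portrait' for the page as it is displayed.

    Hypothetical helper. mediabox is (x0, y0, x1, y1); rotate is the
    page rotation in degrees, assumed to be a multiple of 90.
    """
    x0, y0, x1, y1 = mediabox
    width, height = abs(x1 - x0), abs(y1 - y0)
    # A quarter turn (90 or 270 degrees) swaps width and height.
    if rotate % 180 == 90:
        width, height = height, width
    return 'Landscape' if width > height else 'Portrait'


print(displayed_orientation([0, 0, 612, 792]))      # Portrait
print(displayed_orientation([0, 0, 612, 792], 90))  # Landscape
```

Inside the loop, this could be called as displayed_orientation(page.mediabox, page.rotate).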
