How To Detect A Rotated Page In A PDF Document In Python?
Given a PDF document with multiple pages, how can I check whether a given page is rotated (-90, 90 or 180º)? Preferably using Python (pdfminer, pyPDF) ... UPDATE: The pages are scanned, a
Solution 1:
I simply used the /Rotate attribute of the page in PyPDF2:
pdf = PyPDF2.PdfFileReader(open('example.pdf', 'rb'))
orientation = pdf.getPage(pagenumber).get('/Rotate')
It can be 0, 90, 180, 270 or None.
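Since the value may be None (attribute absent) and the question also mentions -90, it can help to map whatever comes back to one of 0, 90, 180 or 270. The following is a minimal sketch of that idea, not part of the original answer: normalize_rotation and page_rotation are hypothetical helper names, and page_rotation assumes the legacy PdfFileReader/getPage API used above (newer PyPDF2 and pypdf releases renamed these calls).

```python
def normalize_rotation(raw):
    """Map a raw /Rotate value to one of 0, 90, 180, 270.

    None (attribute absent) is treated as 0; negative values such as -90
    are wrapped into the 0-359 range.
    """
    if raw is None:
        return 0
    angle = int(raw) % 360
    if angle % 90 != 0:
        raise ValueError(f"/Rotate should be a multiple of 90, got {raw!r}")
    return angle


def page_rotation(path, page_index):
    """Hypothetical helper: read /Rotate for one page with the legacy PyPDF2 API."""
    import PyPDF2  # imported lazily so normalize_rotation works without it

    with open(path, 'rb') as pdf_file:
        reader = PyPDF2.PdfFileReader(pdf_file)
        return normalize_rotation(reader.getPage(page_index).get('/Rotate'))
```

With this, a page stored with /Rotate -90 and one stored with 270 compare equal. Note that /Rotate only describes how the page should be displayed; it says nothing about the orientation of the content inside a scanned image.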
Solution 2:
If you're using pdfminer, you can get the rotation by reading the .rotate attribute of a PDFPage instance:
for page in PDFPage.create_pages(doc):
    interpreter.process_page(page)
    r = page.rotate
Solution 3:
If you're using PDFMiner and want the orientation by each page:
from io import StringIO
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.pdfpage import PDFPage

output_string = StringIO()
resource_manager = PDFResourceManager()
device = TextConverter(resource_manager, output_string, laparams=LAParams())
interpreter = PDFPageInterpreter(resource_manager, device)

with open('sample.pdf', 'rb') as pdf_file:
    for page in PDFPage.get_pages(pdf_file):
        interpreter.process_page(page)
        # mediabox is [x0, y0, x1, y1]: compare page width against page height
        if page.mediabox[2] - page.mediabox[0] > page.mediabox[3] - page.mediabox[1]:
            orientation = 'Landscape'
        else:
            orientation = 'Portrait'
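The comparison above looks only at the media box, so a portrait-sized page that carries a 90 or 270 degree rotation would still be reported as 'Portrait' even though a viewer shows it in landscape. The sketch below combines both values, assuming the displayed orientation is what you want. displayed_orientation and page_orientations are hypothetical names, not part of pdfminer; only PDFPage.get_pages, page.mediabox and page.rotate come from the library.

```python
def displayed_orientation(mediabox, rotate=0):
    """Classify a page as it would be displayed.

    mediabox is (x0, y0, x1, y1); rotate is the page rotation in degrees.
    """
    width = mediabox[2] - mediabox[0]
    height = mediabox[3] - mediabox[1]
    # A quarter turn (90 or 270) swaps the displayed width and height.
    if rotate % 180 == 90:
        width, height = height, width
    return 'Landscape' if width > height else 'Portrait'


def page_orientations(path):
    """Hypothetical helper: displayed orientation of every page, read via pdfminer."""
    from pdfminer.pdfpage import PDFPage  # imported lazily; needs pdfminer installed

    with open(path, 'rb') as pdf_file:
        return [displayed_orientation(page.mediabox, page.rotate)
                for page in PDFPage.get_pages(pdf_file)]
```

Calling page_orientations('sample.pdf') would return one string per page. Reading mediabox and rotate does not require interpreter.process_page, so this version skips the text extraction setup.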