PDF Files

import PyPDF2

Open the PDF file in binary read mode ('rb')

PyPDF2.PdfFileReader

numPages

getPage

extractText

Add pages

1. addPage
2. write

Get the text of all pages using a for loop
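
As a small illustration of the first step in the outline (opening the file in binary mode), here is a minimal sketch that uses only the standard library. The helper name looks_like_pdf is hypothetical and not part of PyPDF2; it relies on the fact that PDF files normally begin with the signature %PDF-, and it is meant as a quick sanity check rather than a full validation.

```python
def looks_like_pdf(path):
    """Return True if the file starts with the PDF signature b'%PDF-'."""
    # 'rb' = read + binary: bytes come back unchanged, with no text
    # decoding and no newline translation.
    with open(path, "rb") as f:
        # The with-block closes the file even if reading fails.
        return f.read(5) == b"%PDF-"


# Usage (the file name is the one used in this notebook):
# looks_like_pdf('Working_Business_Proposal.pdf')
```

Using a with-block here avoids having to remember a separate close() call.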

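import PyPDF2
# Open the PDF in binary read mode; the reader works on raw bytes
f = open('Working_Business_Proposal.pdf','rb')
pdf_reader = PyPDF2.PdfFileReader(f)
# Number of pages in the document
pdf_reader.numPages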
5
# getPage is zero-indexed, so 0 is the first page
first_page = pdf_reader.getPage(0)
data = first_page.extractText()
data
'Business Proposal\n The Revolution is Coming\n Leverage agile frameworks to provide a robust synopsis for high level \noverviews. Iterative approaches to corporate strategy foster collaborative \nthinking to further the overall value proposition. Organically grow the \nholistic world view of disruptive innovation via workplace diversity and \nempowerment. \nBring to the table win-win survival strategies to ensure proactive \ndomination. At the end of the day, going forward, a new normal that has \nevolved from generation X is on the runway heading towards a streamlined \ncloud solution. User generated content in real-time will have multiple \ntouchpoints for offshoring. \nCapitalize on low hanging fruit to identify a ballpark value added activity to \nbeta test. Override the digital divide with additional clickthroughs from \nDevOps. Nanotechnology immersion along the information highway will \nclose the loop on focusing solely on the bottom line. Podcasting operational change management inside of workßows to \nestablish a framework. Taking seamless key performance indicators ofßine \nto maximise the long tail. Keeping your eye on the ball while performing a \ndeep dive on the start-up mentality to derive convergence on cross-\nplatform integration. \nCollaboratively administrate empowered markets via plug-and-play \nnetworks. Dynamically procrastinate B2C users after installed base \nbeneÞts. Dramatically visualize customer directed convergence without \nrevolutionary ROI. \nEfÞciently unleash cross-media information without cross-media value. \nQuickly maximize timely deliverables for real-time schemas. Dramatically \nmaintain clicks-and-mortar solutions without functional solutions. \nBUSINESS PROPOSAL\n!1'
print(data)
Business Proposal
 The Revolution is Coming
 Leverage agile frameworks to provide a robust synopsis for high level 
overviews. Iterative approaches to corporate strategy foster collaborative 
thinking to further the overall value proposition. Organically grow the 
holistic world view of disruptive innovation via workplace diversity and 
empowerment. 
Bring to the table win-win survival strategies to ensure proactive 
domination. At the end of the day, going forward, a new normal that has 
evolved from generation X is on the runway heading towards a streamlined 
cloud solution. User generated content in real-time will have multiple 
touchpoints for offshoring. 
Capitalize on low hanging fruit to identify a ballpark value added activity to 
beta test. Override the digital divide with additional clickthroughs from 
DevOps. Nanotechnology immersion along the information highway will 
close the loop on focusing solely on the bottom line. Podcasting operational change management inside of workßows to 
establish a framework. Taking seamless key performance indicators ofßine 
to maximise the long tail. Keeping your eye on the ball while performing a 
deep dive on the start-up mentality to derive convergence on cross-
platform integration. 
Collaboratively administrate empowered markets via plug-and-play 
networks. Dynamically procrastinate B2C users after installed base 
beneÞts. Dramatically visualize customer directed convergence without 
revolutionary ROI. 
EfÞciently unleash cross-media information without cross-media value. 
Quickly maximize timely deliverables for real-time schemas. Dramatically 
maintain clicks-and-mortar solutions without functional solutions. 
BUSINESS PROPOSAL
!1
full_data = []

# Extract the text of every page, in order
for page_num in range(pdf_reader.numPages):
    page = pdf_reader.getPage(page_num)
    data = page.extractText()
    full_data.append(data)
# Index 1 holds the text of the second page
print(full_data[1])
Completely synergize resource taxing relationships via premier niche 
markets. Professionally cultivate one-to-one customer service with robust 
ideas. Dynamically innovate resource-leveling customer service for state of 
the art customer service. 
Objectively innovate empowered manufactured products whereas parallel 
platforms. Holisticly predominate extensible testing procedures for reliable 
supply chains. Dramatically engage top-line web services vis-a-vis 
cutting-edge deliverables. Proactively envisioned multimedia based expertise and cross-media 
growth strategies. Seamlessly visualize quality intellectual capital without 
superior collaboration and idea-sharing. Holistically pontiÞcate installed 
base portals after maintainable products. 
Phosßuorescently engage worldwide methodologies with web-enabled 
technology. Interactively coordinate proactive e-commerce via process-
centric "outside the box" thinking. Completely pursue scalable customer 
service through sustainable potentialities. 
Collaboratively administrate turnkey channels whereas virtual e-tailers. 
Objectively seize scalable metrics whereas proactive e-services. 
Seamlessly empower fully researched growth strategies and interoperable 
internal or "organic" sources. 
Credibly innovate granular internal or "organic" sources whereas high 
standards in web-readiness. Energistically scale future-proof core 
competencies vis-a-vis impactful experiences. Dramatically synthesize 
integrated schemas with optimal networks. Interactively procrastinate high-payoff content without backward-
compatible data. Quickly cultivate optimal processes and tactical 
architectures. Completely iterate covalent strategic theme areas via 
accurate e-markets. Globally incubate standards compliant channels before scalable beneÞts. 
Quickly disseminate superior deliverables whereas web-enabled 
BUSINESS PROPOSAL
!2
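
The loop above can also be sketched with the snake_case names used by PyPDF2 3.x and its successor pypdf (PdfReader, reader.pages, extract_text), where the camelCase calls used in this notebook are no longer available. This is a hedged sketch, not the notebook's own method: the function names pages_to_text and extract_all_text are hypothetical, and it assumes one of those newer releases is installed.

```python
def pages_to_text(pages):
    """Return a list with one extracted-text string per page, in order."""
    # 'or ""' guards against a page that yields no text (e.g. a scanned image).
    return [page.extract_text() or "" for page in pages]


def extract_all_text(pdf_path):
    """Open pdf_path and return the text of every page as a list."""
    # Assumption: PyPDF2 >= 3.0 is installed (pypdf exposes the same names).
    from PyPDF2 import PdfReader

    with open(pdf_path, "rb") as f:
        reader = PdfReader(f)
        # Extract while the file is still open.
        return pages_to_text(reader.pages)


# Usage (file name taken from this notebook):
# full_data = extract_all_text('Working_Business_Proposal.pdf')
# print(full_data[1])   # text of the second page
```

Separating pages_to_text from the file handling keeps the extraction step easy to check on its own.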

# Close the handle left open by the reading cells, then reopen the file
f.close()
f = open('Working_Business_Proposal.pdf','rb')
pdf_reader = PyPDF2.PdfFileReader(f)
first_page = pdf_reader.getPage(0)
type(first_page)
PyPDF2.pdf.PageObject
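pdf_writer = PyPDF2.PdfFileWriter()
# Add a single page to the writer
pdf_writer.addPage(first_page)
# Add every page of the source; page 0 was already added above,
# so the output file contains the first page twice
for i in range(pdf_reader.numPages):
    page = pdf_reader.getPage(i)
    pdf_writer.addPage(page)
# Write in binary mode, then close both files
new = open('new_file2.pdf','wb')
pdf_writer.write(new)
new.close()
f.close()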
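
The writing steps can be sketched the same way with the newer naming (PdfWriter, add_page), using with-blocks so that both files are closed automatically. The names copy_pages and copy_pdf are hypothetical helpers, and the sketch assumes PyPDF2 3.x or pypdf; unlike the cell above, it adds each page exactly once.

```python
def copy_pages(pages, writer):
    """Append each page to writer once, in order; return how many were added."""
    count = 0
    for page in pages:
        writer.add_page(page)
        count += 1
    return count


def copy_pdf(src_path, dst_path):
    """Copy every page of src_path into a new PDF at dst_path."""
    # Assumption: PyPDF2 >= 3.0 is installed (pypdf exposes the same names).
    from PyPDF2 import PdfReader, PdfWriter

    writer = PdfWriter()
    with open(src_path, "rb") as src:
        copy_pages(PdfReader(src).pages, writer)
        # Keep the source open until the output has been written.
        with open(dst_path, "wb") as dst:
            writer.write(dst)


# Usage (file names taken from this notebook):
# copy_pdf('Working_Business_Proposal.pdf', 'new_file2.pdf')
```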