Forum Archive

Issue with flattening PDF

CastroSATT

Hiya guys, I've got an issue. I need to use pure Python (obviously) to fill out a PDF form and then flatten it, as apparently that's how you make the form uneditable. I'm having a really big issue and just can't get it done.

So now here's the question: how do I take the output stream and save it as a PDF?

from io import BytesIO
import PyPDF2
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject, NumberObject

def set_need_appearances_writer(writer):
    # basically used to ensure there are no
    # overlapping form fields, which makes printing hard
    try:
        catalog = writer._root_object
        # get the AcroForm tree and add the "/NeedAppearances" attribute
        if "/AcroForm" not in catalog:
            writer._root_object.update({
                NameObject("/AcroForm"): IndirectObject(len(writer._objects), 0, writer)})

        need_appearances = NameObject("/NeedAppearances")
        writer._root_object["/AcroForm"][need_appearances] = BooleanObject(True)


    except Exception as e:
        print('set_need_appearances_writer() catch : ', repr(e))

    return writer  




# open the pdf
input_stream = open("JMA.pdf", "rb")
pdf_reader = PyPDF2.PdfFileReader(input_stream, strict=False)
if "/AcroForm" in pdf_reader.trailer["/Root"]:
    pdf_reader.trailer["/Root"]["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

pdf_writer = PyPDF2.PdfFileWriter()
set_need_appearances_writer(pdf_writer)
if "/AcroForm" in pdf_writer._root_object:
    # Acro form is form field, set needs appearances to fix printing issues
    pdf_writer._root_object["/AcroForm"].update(
        {NameObject("/NeedAppearances"): BooleanObject(True)})

#data_dict = dict() # this is a dict of your form values

data_dict = {
   'name': 'Jason allen',
   'addressf': 'Test address',
   }

pdf_writer.addPage(pdf_reader.getPage(0))
page = pdf_writer.getPage(0)
# update form fields
pdf_writer.updatePageFormFieldValues(page, data_dict)
for j in range(0, len(page['/Annots'])):
    writer_annot = page['/Annots'][j].getObject()
    for field in data_dict:
        if writer_annot.get('/T') == field:
            writer_annot.update({
                NameObject("/Ff"): NumberObject(1)    # make ReadOnly
            })
output_stream = BytesIO()
pdf_writer.write(output_stream)

# output_stream is your flattened pdf

print(output_stream)

cvp

@CastroSATT You lost the indentation of your code. Next time, please
try to use the code-formatting button in the menu just above your posted text.

cvp

@CastroSATT you could try this

with open('out.pdf', 'wb') as f:
    f.write(output_stream.getvalue())

CastroSATT

Didn't work. It gives me a PDF with the values replaced, but with the same issue.

JonB

Why are you using BytesIO instead of a file?

# open() replaces the Python 2-only file() builtin
outputStream = open(r"output.pdf", "wb")
pdf_writer.write(outputStream)
outputStream.close()

CastroSATT

@cvp Sorry, when I said it didn't work I meant it kind of worked. The issue is that it made the PDF, but still with editable form fields and not read-only. It's weird.

cvp

@CastroSATT Sorry, I didn't understand your request correctly. If this is a problem with the filled form, I can't help you. 😢 I'm sure somebody here will help you.

I suppose you got your code from here and here

cvp

@CastroSATT Try this, it should work

        if writer_annot.get('/T')[:len(field)] == field:

cvp

@CastroSATT Please tell me if that works for you, and if not, could you post your input PDF?

CastroSATT

Sorry, still no go. It makes the PDF and adds my dictionary values to the form, which is perfect.
I just can't make the text read-only so that everyone can see it, even in preview mode.

Template PDF

CastroSATT

So this is the complete template, clean as you can see.

This is what it looks like after processing:

Name and first line of address only

cvp

@CastroSATT I understand. I tested a little code with another PDF sample, and the output was read-only. I'll test...

CastroSATT

Thanks for the help, it's on my snagging list.

cvp

@CastroSATT Try this pdf with

from PyPDF2 import PdfFileReader, PdfFileWriter
from PyPDF2.generic import BooleanObject, NameObject, IndirectObject, NumberObject

TEMPLATE_PATH = 'SampleForm-1.pdf'
OUTPUT_PATH = 'out1.pdf'
data_dict = {
    'Name':'test',
    'Surname':'test'
}

if __name__ == '__main__':
    input_file = PdfFileReader(open(TEMPLATE_PATH, "rb"))

    output_file = PdfFileWriter()
    page = input_file.getPage(0)
    output_file.addPage(page)
    output_file.updatePageFormFieldValues(page, data_dict)
    page = output_file.getPage(0)

    for j in range(0, len(page['/Annots'])):
        writer_annot = page['/Annots'][j].getObject()
        print(writer_annot.get('/T'))  # to know what to put in data_dict
        for field in data_dict:
            # annotations without a /T entry cannot match any field name
            if writer_annot.get('/T') is None:
                continue
            if writer_annot.get('/T')[:len(field)] == field:
                writer_annot.update({
                    NameObject("/Ff"): NumberObject(1)   # make ReadOnly
                })

    output_stream = open(OUTPUT_PATH, "wb")

    output_file.write(output_stream)
    output_stream.close()

And you will see that the two fields from data_dict are readonly in out1.pdf
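
A note on the `/Ff` value used in the script above: in the PDF specification `/Ff` is a bit field, and ReadOnly is its lowest bit (value 1). Writing `NumberObject(1)` therefore replaces whatever flags the field already had, for example Multiline. A minimal sketch, assuming you would rather keep the existing flags; `with_read_only` is a hypothetical helper for illustration, not a PyPDF2 function:

```python
# Field-flag bits from the PDF specification (bit position 1 is the lowest bit).
FF_READ_ONLY = 1 << 0    # bit position 1
FF_MULTILINE = 1 << 12   # bit position 13


def with_read_only(flags):
    """Return the field flags with the ReadOnly bit set, keeping all other bits.

    `flags` is the current /Ff value, or None when the field has no /Ff entry.
    """
    current = int(flags) if flags is not None else 0
    return current | FF_READ_ONLY


if __name__ == "__main__":
    print(with_read_only(None))          # field had no /Ff entry
    print(with_read_only(FF_MULTILINE))  # a multiline field stays multiline
```

In the loop above this would probably look like `NumberObject(with_read_only(writer_annot.get('/Ff')))` in place of `NumberObject(1)`. The sketch ignores flags inherited from a parent field.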

In JMA2.pdf,

      print(writer_annot.get('/T')) # to know what to put in data_dict

gives None for the first line, perhaps the problem comes from there ⁉️
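
An annotation that has no `/T` entry returns None from `.get('/T')`, and slicing None with `[:len(field)]` raises a TypeError, so the name test needs a guard. A small sketch of a None-safe version of the prefix test; `matches_field` is a hypothetical helper and the `/T` values below are made up for illustration:

```python
def matches_field(title, field):
    """Return True when an annotation's /T value starts with the field name.

    `title` is whatever writer_annot.get('/T') returned; it is None for
    annotations that carry no /T entry, and those never match.
    """
    if title is None:
        return False
    return str(title).startswith(field)


if __name__ == "__main__":
    data_dict = {"Name": "test", "Surname": "test"}
    titles = [None, "Name", "Surname", "email"]  # made-up /T values
    for title in titles:
        hits = [field for field in data_dict if matches_field(title, field)]
        print(title, hits)
```

Note that a prefix test also matches a field called `Name2` when looking for `Name`; compare with `==` if the names in data_dict are meant to be exact.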

CastroSATT

Yeah, might be. I'm going to download a demo of Adobe and try to remake the template; I'm not sure how to get rid of that None. Thanks so much, I'll post an update in a few. I've been using PDFescape to make the template, and the template source (from years ago) has been played with a few times over the years, so maybe there are conflicts in there somewhere. But you're right.

It seems to work OK with that test one, although I would have preferred it to be uneditable. One thing at a time, methinks.

cvp

@CastroSATT With the example, if you define all the fields, the output PDF becomes uneditable

data_dict = {
    'Name':'the',
    'Surname':'king',
    'email':'xxxx@yyyy.com',
    'phone':'1234',
    'Mobile':'1234',
    'Street':'5th avenue',
    'House':'3',
    'Town':'NY',
    'Postcode':'1234',
    'Country':'Usa',
    'Comments':'no'
}

CastroSATT

Still having a barney.

- Tried your original template: works
- Tried your template after altering it in 2 other form makers: both then have the issue

Weird, so now I'm going to start with a blank form and 1 text box in each and see what happens then.

I think the issue has something to do with the None.

- When using PDFescape I get 1 None (needs more testing)

- Using PDFelement 6 Pro I get 2 None: one at the top, the 2nd at the end
  - Maybe something to do with the watermark added by the software; I am reaching.

Both using your working template (which has no None)

cvp

@CastroSATT Good luck. I don't think I can help any more. Honestly, I don't know anything about filling PDF forms...

cvp

@CastroSATT I have run my little script on your JMA2.pdf and opened the output PDF with Readdle's PDF Expert app, and the fields are filled and read-only

JonB

Castro, I don't think you ever stated what your actual problem is. Are you getting an error? Or is the saved form still editable? Or not filled?
What is your problem with None? Are you trying to fill the form with None?

cvp

@JonB I think his problem is that his output PDF is filled but still editable (with PDF editors). The "None" comes from my little script, where the first writer_annot.get('/T') gives None... It is not a real problem, but with my sample template there is no such "None" and the output file is filled and not editable. Thus, his post refers to this None.