Forum Archive

wcaleb

Jun 10, 2014 - 21:05

I am working on a workflow for using Docverter to convert markdown, latex, and HTML into the same three formats. The workflow uses Editorial 1.1's new custom UI feature to allow the user to set Docverter options before clicking a button. Output from Docverter will, depending on the button tapped, either be copied to the clipboard or inserted in the editor.

Right now, for debugging purposes, I'm using a simple string as the input for sending to Docverter, but eventually want the workflow to get the input text using editor.get_text().

I'm having (so far) two problems that I can't diagnose, one (I think) Editorial-related and one python-related.

The Editorial one: often, after the workflow runs successfully and I try to run it again, I get a spinning wheel where the wrench icon was, and the custom UI never appears again. I have to force quit the app. Occasionally I can run the workflow successfully two or three times before the spinning wheel appears, but more often I can run it only once before I have to force quit on the next attempt. I don't think it's a Docverter problem, because the API call is in a function that is never called unless a button on the custom UI is tapped.

The Python problem is a familiar one, but one that always gives me fits: encoding. If I change my diagnostic input string ("Hello world!" in the Gist) to one that has special characters like "smart quotes" and "smart apostrophes" and "em-dashes," then Docverter returns an error. I think this is because Docverter needs unicode, but I can't figure out the right sequence of encode/decode calls (or where to place them) to get things to work with special strings.

Any help would be most welcome!

peterh86

Jun 05, 2014 - 21:19

For encoding, google Unicode sandwich. It takes time to figure out, then you know how to avoid those errors forever.
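
For anyone reading along who hasn't met the term, here is a minimal sketch of the sandwich idea: decode bytes to text as soon as they come in, work only with text in the middle, and encode back to bytes only when data leaves the program. The function names are made up for illustration and are not part of Caleb's script or of Docverter.

```python
# -*- coding: utf-8 -*-
# Sketch of the "Unicode sandwich": bytes at the edges, text in the middle.
# read_input, process and write_output are hypothetical names.

def read_input(raw_bytes):
    # Bottom slice of bread: decode incoming bytes to text right away.
    return raw_bytes.decode('utf-8')

def process(text):
    # The filling: only ever handle text (unicode) in here.
    return text.strip().upper()

def write_output(text):
    # Top slice of bread: encode to bytes only when data leaves the program.
    return text.encode('utf-8')

# Pretend these bytes arrived from a file or a socket.
raw = u'  \u201csmart quotes\u201d \u2014 and dashes  '.encode('utf-8')
result = write_output(process(read_input(raw)))
```

The point is that encode/decode calls appear only in the two edge functions, so there is just one place to look when an encoding error shows up.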

I did a Docverter workflow, but was disappointed by the limited number of fonts available, and it seemed too complicated to send my own font. I can send you the workflow if you want.

If you are hacking Caleb's Docverter workflow: in post_multipart you at least need to replace the h.send call with h.send(body.encode('utf-8', 'replace')).

wcaleb

Jun 05, 2014 - 22:07

Thanks for the help. I am the Caleb of whom you speak, though as you can see I need help even hacking my own stuff!

I'll do some more reading, as you suggest. In the meantime, just adding the encode call to the body seems to have helped with the other problem I was describing. Maybe encoding errors were throwing exceptions that crashed the workflow?

wcaleb

Jun 06, 2014 - 03:54

I've read more about Unicode and watched the very helpful Unipain presentation, but I'm still having trouble debugging.

If I understand the "Unicode sandwich" concept, then encoding the multipart body handles the output edge.

To get into unicode as quickly as possible, I tried to turn my test string into a unicode string before writing it to the Docverter input file, using unicode().

After that, I presume I should try to keep everything in unicode in the program itself. Right away, I run into trouble when opening the Docverter input file to pass it to post_multipart() as part of files. At that point, the text is type str, right? But if I try to decode the file contents right after reading the file, it doesn't seem to fix my problem. Docverter returns an error message that does not appear when I use non-problematic strings.

{"error":"uninitialized constant DocverterServer::Manifest::InvalidManifestError"}

Is there a library I'm using that is doing some implicit encoding somewhere?

peterh86

Jun 06, 2014 - 06:00

Gosh, that Caleb. Thanks for posting that original script.

You can make your simple test string Unicode by putting u in front, s = u'test'. I seem to remember that the workflow action document text returns Unicode.

Here is my hack of your script. I haven't used it for a while and I'm not sure it works. You'll see I use codecs.open a couple of times. And I open the output of Docverter straight in Editorial with console.quicklook, to save using Dropbox.

# Now let's DO this

# imports this excerpt relies on
import codecs
import os
import console   # Editorial module
import workflow  # Editorial module

# post_multipart() is defined earlier in the script (not shown here).

# get workflow variables
Text = workflow.get_input()
FileName = workflow.get_variable('FileName')

# set file names
html_file_name = FileName + '.html'
pdf_file_name = FileName + '.pdf'

# Put the html in a file to send to Docverter.
# The with block closes the file so the text is flushed before it is read back.
with codecs.open(html_file_name, "w", encoding='utf-8') as f:
    f.write(Text)

# Set Docverter options and define fields using lists.
# Other options available at http://www.docverter.com/api.html#toc_2
fields = [("from", "html"), ("to", "pdf")]
with codecs.open(html_file_name, "r", encoding='utf-8') as f:
    files = [("input_files[]", html_file_name, f.read())]

pdf = post_multipart("c.docverter.com", "/convert", fields, files)

# The PDF is binary data, so write it in binary mode.
with open(pdf_file_name, "wb") as o:
    o.write(pdf)

# open PDF file in Editorial (can then share)
pdf_file_path = os.path.abspath(pdf_file_name)
console.quicklook(pdf_file_path)

# delete temporary files
os.remove(html_file_name)
os.remove(pdf_file_name)

wcaleb

Jun 07, 2014 - 15:55

I figured out the problem that was giving me the Invalid Manifest error. I needed to encode the body not only in the h.send line, as you suggested, but also in the line that calculated the content length header. Otherwise, the length of the body string would not match the length of the encoded body, raising a Docverter error when the request was made.
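
To illustrate the mismatch, here is a small sketch that encodes the body once and derives the Content-Length header from the resulting bytes. The helper name prepare_body and the boundary value are made up for this example; it is not the actual post_multipart code, only the same idea in isolation.

```python
# -*- coding: utf-8 -*-
# Hypothetical helper: encode the body once, then compute the header from the bytes.

def prepare_body(body_text, content_type):
    body_bytes = body_text.encode('utf-8')       # encode exactly once
    headers = {
        'Content-Type': content_type,
        'Content-Length': str(len(body_bytes)),  # length in bytes, not characters
    }
    return headers, body_bytes

# "Hello world!" wrapped in smart quotes
text = u'\u201cHello world!\u201d'
headers, body = prepare_body(text, 'multipart/form-data; boundary=example')

char_count = len(text)   # number of characters in the unicode string
byte_count = len(body)   # number of bytes actually sent
```

Here char_count is 14 but byte_count is 18, because each smart quote takes three bytes in UTF-8. A header built from the character count would promise four bytes fewer than the request actually sends.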

The working script does not seem to require the use of codecs to read from and write to the input file.

peterh86

Jun 08, 2014 - 01:31

Thanks. I wondered about needing to use codecs, because the resulting PDF gets written to a file without codecs.
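
One possible way to think about the difference, sketched below under my own assumptions: codecs only matters when text has to be turned into bytes, whereas a converted PDF is already bytes and can be written as-is in binary mode. The file names and the placeholder PDF content are invented for the example.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

tmp_dir = tempfile.mkdtemp()

# Text: codecs.open encodes the unicode string to UTF-8 bytes on write.
text_path = os.path.join(tmp_dir, 'input.html')
with codecs.open(text_path, 'w', encoding='utf-8') as f:
    f.write(u'<p>\u201cHello\u201d</p>')

# Binary: the data is already bytes, so plain open() in 'wb' mode is enough.
pdf_bytes = b'%PDF-1.4 placeholder bytes'  # stand-in, not a real PDF
pdf_path = os.path.join(tmp_dir, 'output.pdf')
with open(pdf_path, 'wb') as f:
    f.write(pdf_bytes)

# Read both files back as raw bytes to see what landed on disk.
with open(text_path, 'rb') as f:
    raw_text = f.read()
with open(pdf_path, 'rb') as f:
    raw_pdf = f.read()
```

The text file ends up holding the UTF-8 encoding of the string, while the PDF bytes come back unchanged.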

wcaleb

Jun 10, 2014 - 21:05

I have a pretty good working workflow now: Send to Docverter. Would appreciate feedback from testers.

Help with Docverter Custom UI and workflow