Forum Archive

Jozh

May 07, 2015 - 04:38

Is it possible to build a workflow that pulls all the links to PDFs off a webpage open in the built-in browser?

peterh86

May 05, 2015 - 04:16

Yes, but I can't help in detail. It probably requires a workflow with just a Python script.

Given the webpage address, you'd use Requests to get the webpage HTML, then search it for links ending in .pdf and return them in a list. I imagine you could use Requests to download the PDFs as well.
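
A rough sketch of that idea, not a tested workflow. The function names find_pdf_links and pdf_links_from_url are made up for illustration, and the regular expression is a simplifying assumption: it only catches quoted href attributes and will miss unusual markup.

```python
import re
from urllib.parse import urljoin


def find_pdf_links(html, base_url):
    """Return absolute URLs for every quoted href in `html` that ends in .pdf."""
    # Assumption: links appear as href="..." or href='...'.
    hrefs = re.findall(r'href\s*=\s*["\']([^"\']+)["\']', html, flags=re.IGNORECASE)
    links = []
    for href in hrefs:
        if href.lower().endswith('.pdf'):
            # Resolve relative links against the page address.
            url = urljoin(base_url, href)
            if url not in links:
                links.append(url)
    return links


def pdf_links_from_url(url):
    """Fetch a page with Requests and return the PDF links found in it."""
    # Imported here so find_pdf_links can be used without Requests installed.
    import requests
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return find_pdf_links(response.text, url)
```

Calling pdf_links_from_url with the page address should give back a list of links. Each one could then be fetched with requests.get(link).content and written to a file opened in binary mode.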

Gerzer

May 06, 2015 - 12:10

You might be able to pull the HTML directly from the built-in browser, but I’m not 100% sure.

ccc

May 07, 2015 - 04:38

See the two links below. The basic idea is to use requests to get the webpage HTML and use BeautifulSoup to parse that HTML to find the links that end in ".pdf".

http://omz-forums.appspot.com/pythonista/post/5903606662299648

http://omz-forums.appspot.com/pythonista/post/5253563362050048
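
For reference, a minimal sketch of the BeautifulSoup half of that idea. It is not taken from the linked posts. The function name pdf_links_from_html is hypothetical, and it assumes the bs4 package is available and that the PDFs are linked from ordinary a tags.

```python
from urllib.parse import urljoin, urlparse

from bs4 import BeautifulSoup


def pdf_links_from_html(html, base_url):
    """Return absolute URLs of <a href> targets whose path ends in .pdf."""
    # "html.parser" is the parser bundled with Python, so no extra install.
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for anchor in soup.find_all("a", href=True):
        # Resolve relative links against the page address.
        url = urljoin(base_url, anchor["href"])
        # Check only the path, so a query string such as ?download=1
        # after the file name does not hide the .pdf ending.
        if urlparse(url).path.lower().endswith(".pdf"):
            links.append(url)
    return links
```

The html argument would normally be requests.get(page_url).text, with page_url also passed in as base_url.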

.pdf links from a website