Is it possible to build a workflow that pulls all the links to PDFs off a webpage in the built-in browser?
Forum Archive
.pdf links from a website
Jozh
May 07, 2015 - 04:38
peterh86
May 05, 2015 - 04:16
Yes, but I can't help in detail. It probably requires a workflow with just a Python script.
Given the webpage address, you'd use Requests to get the webpage HTML, then search for links ending in .pdf and return them in a list. I imagine you could use Requests to download the PDFs as well.
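A rough sketch of that idea, not a tested workflow: it assumes the page is plain HTML with ordinary <a href="..."> links, and that Requests is available where the script runs. The names find_pdf_links, fetch_pdf_links and download_pdf are made up for illustration, and the standard-library HTMLParser is just one way to do the "search for links" step.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_pdf_links(html, base_url):
    """Return absolute URLs of all links in `html` whose path ends in .pdf."""
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        # Compare only the path, so "file.pdf?dl=1" still counts as a PDF.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            links.append(absolute)
    return links


def fetch_pdf_links(page_url):
    """Download `page_url` with Requests and return the PDF links on it."""
    import requests  # third-party; imported here so find_pdf_links works without it

    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    return find_pdf_links(response.text, page_url)


def download_pdf(pdf_url, dest_dir="."):
    """Save one PDF into `dest_dir` and return the path of the saved file."""
    import requests

    filename = os.path.basename(urlparse(pdf_url).path) or "download.pdf"
    dest_path = os.path.join(dest_dir, filename)
    response = requests.get(pdf_url, timeout=30)
    response.raise_for_status()
    with open(dest_path, "wb") as f:
        f.write(response.content)
    return dest_path
```

Calling fetch_pdf_links with the page address should give the list of links, and each one could then be passed to download_pdf. Links that only appear after JavaScript runs would not be found this way.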
Gerzer
May 06, 2015 - 12:10
You might be able to pull the HTML directly from the built-in browser, but I’m not 100% sure.
ccc
May 07, 2015 - 04:38
See the two links below. The basic idea is to use requests to get the webpage HTML and use BeautifulSoup to parse that HTML to find the links that end in ".pdf".
http://omz-forums.appspot.com/pythonista/post/5903606662299648
http://omz-forums.appspot.com/pythonista/post/5253563362050048
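For reference, the BeautifulSoup half of that idea might look roughly like the sketch below. It is not taken from the linked posts. It assumes the bs4 package is installed and that the page HTML is already in hand as a string; pdf_links_from_html is a hypothetical name.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup  # third-party package: beautifulsoup4


def pdf_links_from_html(html, base_url):
    """Return absolute URLs for every <a href> in `html` ending in .pdf."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # href=True skips anchors that have no href attribute at all.
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"].strip()
        if href.lower().endswith(".pdf"):
            links.append(urljoin(base_url, href))  # resolve relative links
    return links
```

The HTML string could come from something like requests.get(url).text, with the same url passed as base_url so that relative links resolve correctly. This simple endswith check would miss PDF links that carry a query string after the file name.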