What is screen scraping? From what I know, it's like getting info from some database (I think). Also, how can I screen scrape? Thanks for your answers.
Screen Scraping
Screen Scraping is the art and science of:
1) getting all the text from a computer display (terminal, webpage, etc.) and then
2) selecting out only those data fields of interest for storage or further processing.
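To make the two steps concrete, here is a minimal sketch that uses only Python's standard library. The sample page, the TextCollector class, and the all_text() and select_prices() helpers are all made up for illustration; a real page would be fetched from a terminal or the web rather than hard-coded.

```python
from html.parser import HTMLParser

# Hypothetical page content, hard-coded so the sketch runs without a network.
SAMPLE_PAGE = """
<html><body>
<h1>Price list</h1>
<p>Apples: $1.20</p>
<p>Pears: $0.95</p>
</body></html>
"""


class TextCollector(HTMLParser):
    """Step 1: collect every non-empty piece of text on the page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace between tags
            self.chunks.append(text)


def all_text(html):
    collector = TextCollector()
    collector.feed(html)
    return collector.chunks


def select_prices(chunks):
    """Step 2: keep only the fields of interest ('name: $price' lines)."""
    prices = {}
    for chunk in chunks:
        name, sep, price = chunk.partition(': $')
        if sep:  # the separator was found, so this chunk is a price line
            prices[name] = float(price)
    return prices


chunks = all_text(SAMPLE_PAGE)
prices = select_prices(chunks)
print(chunks)
print(prices)
```

The heading "Price list" shows up in step 1 but is dropped in step 2, which is the whole point of the second step.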
It used to be about getting data from terminal displays, but these days it is mostly about scraping data off of web pages. The Pythonista tools that I prefer for web scraping are requests (for getting all the HTML of a webpage) and Beautiful Soup 4, a.k.a. bs4 (for selecting out only those data fields of interest). bs4 is complicated, but it is supercool once you get the hang of it.
Here are two recent examples of web scraping. They follow the model:
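import bs4, requests

def get_beautiful_soup(url):
    # Fetch the page's HTML and parse it; naming the parser explicitly avoids a bs4 warning.
    return bs4.BeautifulSoup(requests.get(url).text, 'html.parser')

soup = get_beautiful_soup('http://omz-forums.appspot.com/pythonista')
print(soup.prettify())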
# See: http://www.crummy.com/software/BeautifulSoup/bs4/doc for all the things you can do with the soup.
As you can see by looking at the output, the harder part is selecting out only those data fields of interest. ;-)
If bs4 is too complicated for your purposes, you can do html = requests.get(url).text and then try using str.find() and str.partition(), or Python's regular expressions module, re, as a poor man's soup. Happy scraping.
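As a rough sketch of the poor man's soup approach, the snippet below pulls the title and the links out of a small HTML string. The sample HTML and the three helper functions are made up for illustration. In real use, html would come from requests.get(url).text, and patterns this simple will break on messy markup.

```python
import re

# Hypothetical HTML, hard-coded so the sketch runs without a network.
SAMPLE_HTML = (
    '<html><head><title>Pythonista Forum</title></head><body>'
    '<a href="/topic/1">First topic</a> '
    '<a href="/topic/2">Second topic</a>'
    '</body></html>'
)


def title_with_partition(html):
    """Grab the text between <title> and </title> using str.partition()."""
    _, found, rest = html.partition('<title>')
    if not found:
        return None
    title, _, _ = rest.partition('</title>')
    return title.strip()


def title_with_find(html):
    """Same result using str.find() and slicing."""
    start = html.find('<title>')
    if start == -1:
        return None
    start += len('<title>')
    end = html.find('</title>', start)
    if end == -1:
        return None
    return html[start:end].strip()


def links_with_re(html):
    """Return (href, link text) pairs using a deliberately simple regex."""
    return re.findall(r'<a\s+href="([^"]*)"[^>]*>(.*?)</a>', html, flags=re.S)


print(title_with_partition(SAMPLE_HTML))
print(title_with_find(SAMPLE_HTML))
print(links_with_re(SAMPLE_HTML))
```

This works for tidy pages, but once tags gain extra attributes, nesting, or odd quoting, bs4 is usually the less painful route.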
Cool! Thanks for the response.
Screen scraping is the technique of treating a website as a whole database and each webpage as a field in a table. I have written about how to log in to a particular website and scrape it. There are also many advanced web scraping tools available on the market if you do not want to write the code yourself.