Forum Archive

Strip some html codes out

locm

Here is my script:

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq
page_url = ('http://fcs002.xreflector.net/_user.html')
uClient = uReq(page_url)
page_soup = soup(uClient.read(), "html.parser")
uClient.close()  # parentheses are required to actually call close()
lh = page_soup.findAll("td")
print (lh[10])

It prints out:

<td><div align="left">WW6E </div></td>

How can I end up with just the WW6E?

Thanks,
Michael

ccc

print(lh[10].text.strip())

No space between print and ( because print() is a function just like every other function.

cvp

I think he wants to remove the HTML tags, like innerText via webView.eval_js
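
For comparison, here is a rough sketch of that kind of tag stripping using only the standard library's html.parser, with no bs4 involved. The TextExtractor class and strip_tags helper are hypothetical names made up for this illustration, and the sample markup is assumed to match what the td cell contains:

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keep the text nodes, ignore the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found between tags.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of an HTML fragment, whitespace-trimmed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks).strip()


# Assumed sample markup for one table cell.
print(strip_tags('<td><div align="left">WW6E </div></td>'))
```

This is only meant to show what "removing the tags" amounts to. If the page is already parsed with BeautifulSoup, reading the .text attribute of the tag is the shorter route.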

ccc

>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup('<td><div align="left">WW6E </div></td>', 'html.parser')
>>> soup.text.strip()
'WW6E'

BeautifulSoup is going to make JavaScript look sloppy.

locm

@ccc thank you - that works perfectly

locm

print(lh[10].text.strip())
works perfectly, thank you!