It’s difficult to write a value into a list of Beautiful Soup results every time, even when the data is not found

You don’t really need regex for that. You should be fine with bs4 and a CSS selector that matches links whose href starts with mailto.

Try this:

import requests
from bs4 import BeautifulSoup

html = requests.get("").text
soup = BeautifulSoup(html, "html.parser")
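# Select every <a> whose href attribute starts with "mailto"
mailtos = soup.select('a[href^=mailto]')
# Deduplicate with a set, then turn it back into a list for printing
print(list(set(m["href"] for m in mailtos)))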


['mailto:[email protected]', 'mailto:[email protected]', 'mailto:[email protected]', 'mailto:[email protected]']

And if you want “pure” emails, just change the last line to:

print(list(set(m["href"].replace("mailto:", "") for m in mailtos)))

To get this:

['[email protected]', '[email protected]', '[email protected]', '[email protected]']
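
If you also need one entry per result even when no email is found (which is what the title asks about), one possible approach is to loop over the containing elements and append a placeholder when there is no mailto link. This is only a sketch: the inline HTML, the div.card container and the example.com addresses are made up for illustration, so swap in whatever element wraps each result on your page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: three results, the second one has no mailto link.
html = """
<div class="card"><a href="mailto:a@example.com">Mail</a></div>
<div class="card"><a href="/contact">Contact</a></div>
<div class="card"><a href="mailto:c@example.com?subject=Hi">Mail</a></div>
"""

soup = BeautifulSoup(html, "html.parser")

emails = []
for card in soup.select("div.card"):
    # select_one returns the first match, or None if nothing matches
    link = card.select_one('a[href^="mailto:"]')
    if link is None:
        # Keep the list aligned with the results by storing a placeholder
        emails.append(None)
    else:
        # Drop the "mailto:" prefix and any "?subject=..." query part
        address = link["href"][len("mailto:"):].split("?", 1)[0]
        emails.append(address)

print(emails)
# ['a@example.com', None, 'c@example.com']
```

This way the list always has the same length as the number of results, so you can zip it with other per-result lists without things getting out of step.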
