Search In HTML Page Using Regex Patterns With Python

November 30, 2022

I'm trying to find a string inside an HTML page with known patterns. For example, in the following HTML code:

Solution 1:

    re.findall(r'<HR>\s*<font size="\+1">(.*?)</font><BR>', html, re.DOTALL)

findall returns a list with everything that is captured between the parentheses in the regular expression. I used re.DOTALL so the dot also matches end-of-line characters. I used \s* because I was not sure whether there would be any whitespace.

Solution 2:

This works, but may not be very robust:

    import re
    # Raw string so that \s and \+ reach the regex engine unchanged.
    r = re.compile(r'<HR>\s?<font size="\+1">(.+?)</font>\s?<BR>', re.IGNORECASE)
    r.findall(html)

You will be better off using a proper HTML parser. BeautifulSoup is excellent and easy to use.

Solution 3:

    re.findall(r'<HR>\n<font size="\+1">([^<]*)<\/font><BR>', html, re.MULTILINE)

Here [^<]* captures everything up to the next tag. Note that re.MULTILINE only changes how ^ and $ match, so it has no effect on this pattern.
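To make the trade-off between the regex answers and the "use a proper parser" advice concrete, here is a minimal sketch that runs Solution 1's pattern next to a small parser built on the standard library's html.parser. The sample markup is hypothetical, since the page from the question is not shown, and FontTextParser is an illustrative name rather than anything from the answers. BeautifulSoup would do the same job with less code, but it is a third-party install, so the sketch sticks to the standard library.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample; the actual page from the question is not shown.
html = """<HR>
<font size="+1">First title</font><BR>
<p>Some text</p>
<HR>
<font size="+1">Second
title</font><BR>
"""

# Solution 1's pattern: non-greedy capture, DOTALL lets "." cross newlines.
pattern = re.compile(r'<HR>\s*<font size="\+1">(.*?)</font><BR>', re.DOTALL)
regex_hits = pattern.findall(html)


class FontTextParser(HTMLParser):
    """Collect the text of every <font size="+1"> element (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.in_target = False
        self.hits = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "font" and dict(attrs).get("size") == "+1":
            self.in_target = True
            self.hits.append("")

    def handle_endtag(self, tag):
        if tag == "font":
            self.in_target = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current hit.
        if self.in_target:
            self.hits[-1] += data


parser = FontTextParser()
parser.feed(html)
parser.close()
parser_hits = parser.hits

print(regex_hits)   # ['First title', 'Second\ntitle']
print(parser_hits)  # ['First title', 'Second\ntitle']
```

Unlike the regex, this parser keys only on the font element and does not require the surrounding HR and BR tags, so it tolerates changes in case, whitespace, and attribute quoting but would need an extra check if other font size="+1" elements appear on the page.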