Get the HTML source of a WebElement in Selenium WebDriver using Python

玩技站长

I'm using the Python bindings to run Selenium WebDriver:

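from selenium import webdriver
from selenium.webdriver.common.by import By
wd = webdriver.Firefox()

I know I can grab a web element like this:

# find_element_by_css_selector was removed in Selenium 4.3; use find_element with By
elem = wd.find_element(By.CSS_SELECTOR, '#my-id')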

And I know I can get the full page source with:

wd.page_source

But is there a way to get the "element source"?

elem.source   # <-- returns the HTML as a string

The Selenium WebDriver documentation for Python is basically non-existent, and I don't see anything in the code that seems to enable this functionality.

What is the best way to access the HTML of an element (and its children)?

 
Comments
    • Dima Tisnek

      WebElement element = driver.findElement(By.id("foo"));
      String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);

      This code really works to get JavaScript from source as well!

      • wowandy

        In PHP Selenium WebDriver you can get page source like this:
        $html = $driver->getPageSource();

        Or get HTML of the element like this:
        // innerHTML if you need HTML of the element content
        $html = $element->getDomProperty('outerHTML');

        • christian

          In current versions of php-webdriver (1.12.0+) you need to use
          $element->getDomProperty('innerHTML');

          as pointed out in this issue: https://github.com/php-webdriver/php-webdriver/issues/929

          • user2849367

            Use execute_script to get the HTML.
            bs4 (BeautifulSoup) can also access HTML tags quickly.
            from bs4 import BeautifulSoup
            html = driver.execute_script("return document.documentElement.outerHTML")
            bs4_onepage_object = BeautifulSoup(html, "html.parser")
            # "atag" and "attribute" are placeholders for the tag name and class to match
            bs4_div_object = bs4_onepage_object.find_all("atag", class_="attribute")

            • Peter Mortensen

              And in a PHPUnit Selenium test it's like this:
              $text = $this->byCssSelector('.some-class-name')->attribute('innerHTML');

              • Peter Mortensen

                If you are interested in a solution for Selenium Remote Control in Python, here is how to get innerHTML:
                innerHTML = sel.get_eval("window.document.getElementById('prodid').innerHTML")

                • Peter Mortensen

                  The method I prefer for getting the rendered HTML is the following:
                  from selenium.webdriver.common.by import By
                  driver.get("http://www.google.com")
                  body_html = driver.find_element(By.XPATH, "/html/body")
                  print(body_html.text)

                  However, the above method strips all the tags (including nested ones) and returns only the text content. If you are interested in getting the HTML markup as well, use the method below.
                  print(body_html.get_attribute("innerHTML"))

                  • MaartenDev

                    This works seamlessly for me.

                    element.get_attribute('innerHTML')
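
The Python suggestions in the comments above can be pulled together into a small sketch. The helper names element_html and element_html_via_js are hypothetical and are not part of Selenium. The only Selenium calls used are WebElement.get_attribute and WebDriver.execute_script, both of which appear in the comments. The code assumes it is handed a live driver and element, such as wd and elem from the question.

```python
def element_html(element, outer=False):
    """Return an element's markup as a string via get_attribute.

    `element` is assumed to be a Selenium WebElement.
    innerHTML covers only the element's children; outerHTML also
    includes the element's own opening and closing tags.
    """
    return element.get_attribute("outerHTML" if outer else "innerHTML")


def element_html_via_js(driver, element, outer=False):
    """Return the same markup by running JavaScript in the page.

    This mirrors the Java executeScript comment: the element is passed
    to the script as arguments[0].
    """
    prop = "outerHTML" if outer else "innerHTML"
    return driver.execute_script("return arguments[0]." + prop + ";", element)
```

With the objects from the question, element_html(elem, outer=True) would be the closest match to the elem.source the asker had in mind, since it includes the #my-id tag itself and not only its children.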
