I am using the Python bindings to run Selenium WebDriver:
from selenium import webdriver
wd = webdriver.Firefox()
I know I can grab a web element like this:
elem = wd.find_element_by_css_selector('#my-id')
I know I can get the full page source with:
wd.page_source
But is there any way to get the "element source"?
elem.source # <-- returns the HTML as a string
The Selenium WebDriver documentation for Python is basically nonexistent, and I don't see anything in the code that seems to support this.
What is the best way to access the HTML of an element (and its children)?
Unknown region, reply 18
WebElement element = driver.findElement(By.id("foo"));
String contents = (String)((JavascriptExecutor)driver).executeScript("return arguments[0].innerHTML;", element);
This code really works to get JavaScript from source as well!
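For the Python bindings the question asks about, the same JavaScript approach might look like the sketch below. The helper name `inner_html_via_js` is made up for illustration; `execute_script` is the Python counterpart of `executeScript`, and the driver and element are assumed to come from an already running session.

```python
def inner_html_via_js(driver, element):
    """Return the element's inner HTML by evaluating JavaScript in the browser.

    Assumes `driver` is a live Selenium WebDriver and `element` is a
    WebElement found through it. The element is handed to the script
    as arguments[0].
    """
    return driver.execute_script("return arguments[0].innerHTML;", element)


# Hypothetical usage with a running browser session (not executed here):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Firefox()
#   element = driver.find_element(By.ID, "foo")
#   print(inner_html_via_js(driver, element))
```

Swapping `innerHTML` for `outerHTML` in the script string should return the element's own tag as well.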
Unknown region, reply 17
In PHP Selenium WebDriver you can get page source like this:
$html = $driver->getPageSource();
Or get HTML of the element like this:
// innerHTML if you need HTML of the element content
$html = $element->getDomProperty('outerHTML');
Unknown region, reply 16
In current versions of php-webdriver (1.12.0+) you need to use
$element->getDomProperty('innerHTML');
as pointed out in this issue: https://github.com/php-webdriver/php-webdriver/issues/929
Unknown region, reply 15
Use execute_script to get the HTML.
bs4 (BeautifulSoup) can then access the HTML tags quickly.
from bs4 import BeautifulSoup
html = adriver.execute_script("return document.documentElement.outerHTML")
bs4_onepage_object=BeautifulSoup(html,"html.parser")
bs4_div_object=bs4_onepage_object.find_all("atag",class_="attribute")
Unknown region, reply 14
And in PHPUnit Selenium test it’s like this:
$text = $this->byCssSelector('.some-class-name')->attribute('innerHTML');
Unknown region, reply 13
If you are interested in a solution for Selenium Remote Control in Python, here is how to get innerHTML:
innerHTML = sel.get_eval("window.document.getElementById('prodid').innerHTML")
Unknown region, reply 12
The method to get the rendered HTML I prefer is the following:
driver.get("http://www.google.com")
body_html = driver.find_element_by_xpath("/html/body")
print(body_html.text)
However, the above method strips all the tags (including nested ones) and returns only the text content. If you are interested in getting the HTML markup as well, use the method below.
print(body_html.get_attribute("innerHTML"))
Unknown region, reply 11
This works seamlessly for me.
element.get_attribute('innerHTML')
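Applied to the original question, a small wrapper around `get_attribute` could look roughly like this. It is only a sketch: `element_html` and its `include_self` flag are hypothetical names, and the commented usage assumes Selenium 4, where the `find_element_by_*` helpers are deprecated in favour of `find_element(By..., ...)`.

```python
def element_html(element, include_self=True):
    """Return a WebElement's markup as a string.

    'outerHTML' includes the element's own tag, while 'innerHTML'
    covers only its children. `element` is assumed to be a Selenium
    WebElement (anything with a get_attribute method will do).
    """
    name = "outerHTML" if include_self else "innerHTML"
    return element.get_attribute(name)


# Hypothetical usage with a running browser session (not executed here):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   wd = webdriver.Firefox()
#   elem = wd.find_element(By.CSS_SELECTOR, "#my-id")
#   print(element_html(elem))                      # element plus children
#   print(element_html(elem, include_self=False))  # children only
```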