侯体宗's Blog

simplehtmldom API Help Documentation

php  /  Posted by admin 7 years ago   422

API Reference

Helper functions
object str_get_html ( string $content ) Creates a DOM object from a string.
object file_get_html ( string $filename ) Creates a DOM object from a file or a URL.

DOM methods & properties

string plaintext Returns the contents extracted from HTML.
void clear () Cleans up memory.
void load ( string $content ) Loads contents from a string.
string save ( [string $filename] ) Dumps the internal DOM tree back into a string. If $filename is set, the resulting string is saved to that file.
void load_file ( string $filename ) Loads contents from a file or a URL.
void set_callback ( string $function_name ) Sets a callback function.
mixed find ( string $selector [, int $index] ) Finds elements by CSS selector. Returns the Nth element object if $index is set; otherwise returns an array of objects.

Element methods & properties

string [attribute] Read or write the element's attribute value.
string tag Read or write the tag name of the element.
string outertext Read or write the outer HTML text of the element.
string innertext Read or write the inner HTML text of the element.
string plaintext Read or write the plain text of the element.
mixed find ( string $selector [, int $index] ) Finds descendant elements by CSS selector. Returns the Nth element object if $index is set; otherwise returns an array of objects.

DOM traversing

mixed $e->children ( [int $index] ) Returns the Nth child object if $index is set; otherwise returns an array of children.
element $e->parent () Returns the parent of the element.
element $e->first_child () Returns the first child of the element, or null if not found.
element $e->last_child () Returns the last child of the element, or null if not found.
element $e->next_sibling () Returns the next sibling of the element, or null if not found.
element $e->prev_sibling () Returns the previous sibling of the element, or null if not found.

Camel naming conventions

You can also call methods using the W3C standard camel naming conventions.


Method                                                Mapping
string $e->getAttribute ( $name )                     string $e->attribute
void $e->setAttribute ( $name, $value )               void $e->attribute = $value
bool $e->hasAttribute ( $name )                       bool isset($e->attribute)
void $e->removeAttribute ( $name )                    void $e->attribute = null
element $e->getElementById ( $id )                    mixed $e->find ( "#$id", 0 )
mixed $e->getElementsById ( $id [, $index] )          mixed $e->find ( "#$id" [, int $index] )
element $e->getElementByTagName ( $name )             mixed $e->find ( $name, 0 )
mixed $e->getElementsByTagName ( $name [, $index] )   mixed $e->find ( $name [, int $index] )
element $e->parentNode ()                             element $e->parent ()
mixed $e->childNodes ( [$index] )                     mixed $e->children ( [int $index] )
element $e->firstChild ()                             element $e->first_child ()
element $e->lastChild ()                              element $e->last_child ()
element $e->nextSibling ()                            element $e->next_sibling ()
element $e->previousSibling ()                        element $e->prev_sibling ()

// Create a DOM object from a string
$html = str_get_html('Hello!');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from an HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('Hello!');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from an HTML file
$html->load_file('test.htm');
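
The loading helpers above can be combined with a simple failure check. The following is a minimal sketch, not the library's prescribed usage: it assumes simple_html_dom.php sits in the same directory as the script (the path is an assumption), and it assumes that your release of the library returns false from str_get_html() on empty or oversized input, as the 1.5 release does. The sample markup is made up for illustration.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical sample markup, used only for this sketch.
$html = str_get_html('<html><body><p>Hello!</p></body></html>');

// Some releases return false instead of an object when parsing is refused,
// so check before calling any method on the result.
if ($html === false) {
    exit("Could not parse the document\n");
}

// find() with an index returns one element (or null), not an array.
$text = $html->find('p', 0)->plaintext;
echo $text, "\n";

// Release the DOM's internal references once it is no longer needed.
$html->clear();
```

The same check applies to file_get_html(), which fetches the content before parsing and can fail for network or file-system reasons as well.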


// Find all anchors, returns an array of element objects
$ret = $html->find('a');

// Find the (N)th anchor, returns an element object or null if not found (zero based)
$ret = $html->find('a', 0);

// Find all <div> whose attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all elements that have the id attribute
$ret = $html->find('[id]');


// Find all elements whose id=foo
$ret = $html->find('#foo');

// Find all elements whose class=foo
$ret = $html->find('.foo');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');
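
To make the return types of find() concrete, here is a small sketch that runs a few of the selectors above against a made-up fragment. The markup and variable names are illustrative assumptions, and the require path assumes simple_html_dom.php sits next to the script.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical fragment: two anchors and one image.
$html = str_get_html(
    '<div id="nav">'
    . '<a href="/home" title="Home">Home</a>'
    . '<a href="/about">About</a>'
    . '<img src="logo.png" title="Logo">'
    . '</div>'
);

// Without an index, find() returns an array of element objects.
$hrefs = array();
foreach ($html->find('a') as $a) {
    $hrefs[] = $a->href;
}

// With an index, find() returns a single element, or null when out of range.
$first   = $html->find('a', 0);
$missing = $html->find('a', 5);

// A comma-separated selector collects matches for every part.
$titled = $html->find('a[title], img[title]');

print_r($hrefs);
echo $first->href, "\n";
echo count($titled), "\n";
```

Because an indexed find() may return null, it is worth checking the result before reading a property from it.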

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> whose class=hello
$es = $html->find('table.hello td');

// Find all td tags with attribute align=center in table tags
$es = $html->find('table td[align=center]');

// Find all <li> in <ul>
foreach($html->find('ul') as $ul)
{
    foreach($ul->find('li') as $li)
    {
        // do something...
    }
}

// Find first <li> in first <ul>
$e = $html->find('ul', 0)->find('li', 0);
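
The nested lookups above can be sketched end to end as follows. The list markup is a made-up example, and the require path is an assumption. The guard before the chained call is there because an indexed find() returns null when nothing matches, and calling find() on null would be a fatal error.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup with two lists.
$html = str_get_html('<ul><li>a</li><li>b</li></ul><ul><li>c</li></ul>');

// Descendant selector: every <li> inside any <ul>.
$all = count($html->find('ul li'));

// The same result, walking each <ul> and searching inside it.
$items = array();
foreach ($html->find('ul') as $ul) {
    foreach ($ul->find('li') as $li) {
        $items[] = $li->plaintext;
    }
}

// Chained lookup, guarded against a missing first <ul>.
$firstUl = $html->find('ul', 0);
$first   = ($firstUl !== null) ? $firstUl->find('li', 0)->plaintext : null;

echo $all, "\n";
print_r($items);
echo $first, "\n";
```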

Supports these operators in attribute selectors:

[attribute] Matches elements that have the specified attribute.
[attribute=value] Matches elements that have the specified attribute with a certain value.
[attribute!=value] Matches elements that don't have the specified attribute with a certain value.
[attribute^=value] Matches elements that have the specified attribute and it starts with a certain value.
[attribute$=value] Matches elements that have the specified attribute and it ends with a certain value.
[attribute*=value] Matches elements that have the specified attribute and it contains a certain value.
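
A short sketch of the prefix, suffix, and substring operators listed above, run against made-up anchors. The URLs and the require path are assumptions for illustration only. Selectors are written in single quotes so that PHP does not try to interpret the `$` in `$=`.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical anchors: one absolute link, one local link, one without href.
$html = str_get_html(
    '<a href="http://example.com/a.pdf">A</a>'
    . '<a href="/local/b.html">B</a>'
    . '<a name="top">C</a>'
);

$withHref = count($html->find('a[href]'));         // has the attribute at all
$absolute = count($html->find('a[href^=http]'));   // value starts with "http"
$pdfs     = count($html->find('a[href$=pdf]'));    // value ends with "pdf"
$local    = count($html->find('a[href*=local]'));  // value contains "local"

echo "$withHref $absolute $pdfs $local\n";
```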

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Get an attribute (if the attribute is a non-value attribute, e.g. checked, selected..., it returns true or false)
$value = $e->href;

// Set an attribute (if the attribute is a non-value attribute, e.g. checked, selected..., set its value to true or false)
$e->href = 'my link';

// Remove an attribute by setting its value to null
$e->href = null;

// Determine whether an attribute exists
if (isset($e->href))
    echo 'href exists!';
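
The attribute operations above can be sketched on a single made-up anchor as follows. The markup and the require path are assumptions. The sketch checks the rendered outertext rather than isset() after the removal, since how isset() treats an attribute that was set to null may differ between releases.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical anchor with two attributes.
$html = str_get_html('<a href="/old" title="x">link</a>');
$a = $html->find('a', 0);

// Read, test, overwrite, and remove attributes through magic properties.
$old = $a->href;           // current value of href
$had = isset($a->title);   // the title attribute is present
$a->href  = '/new';        // overwrite href
$a->title = null;          // remove title

// The rendered tag reflects the changes.
$out = $a->outertext;
echo $out, "\n";
```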

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag; // Returns: "div"
echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: "foo <b>bar</b>"
echo $e->plaintext; // Returns: "foo bar"



// Extract contents from HTML
echo $html->plaintext;

// Wrap an element
$e->outertext = '<div class="wrap">' . $e->outertext . '</div>';

// Remove an element by setting its outertext to an empty string
$e->outertext = '';

// Append an element
$e->outertext = $e->outertext . '<div>foo</div>';

// Insert an element
$e->outertext = '<div>foo</div>' . $e->outertext;

// If you are not so familiar with HTML DOM, check this link to learn more...
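
As a sketch of how the outertext edits above show up when the document is dumped, the snippet below removes one element and wraps another, then serializes the result. The markup and the require path are assumptions for illustration.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical markup with two inline elements.
$html = str_get_html('<div id="a"><b>bold</b><i>it</i></div>');

// Remove the <b> element by blanking its outertext.
$html->find('b', 0)->outertext = '';

// Wrap the <i> element in a <span>.
$i = $html->find('i', 0);
$i->outertext = '<span>' . $i->outertext . '</span>';

// The edits appear when the tree is dumped back to a string.
$result = $html->save();
echo $result, "\n";
```

Note that assigning to outertext replaces the text emitted for that node; it does not re-parse the new markup, so a later find('span') on the same object would not see the wrapper. Reloading the dumped string with str_get_html() is one way around that.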

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');
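
The traversal methods and their camel-case aliases can be tried on a small tree. This is a sketch under assumptions: the ids and markup are made up, and the require path assumes simple_html_dom.php sits next to the script.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical tree: one container with three children.
$html = str_get_html(
    '<div id="div1"><p id="p1">one</p><p id="p2">two</p><p id="p3">three</p></div>'
);
$div = $html->find('#div1', 0);

$childCount = count($div->children());             // all element children
$second     = $div->children(1)->id;               // zero-based index
$firstId    = $div->first_child()->id;
$lastId     = $div->last_child()->id;
$nextId     = $div->children(1)->next_sibling()->id;
$prevId     = $div->children(1)->prev_sibling()->id;
$parentId   = $div->children(0)->parent()->id;
$noSibling  = $div->last_child()->next_sibling();  // null past the last child

// Camel-case equivalent of the children(1)->id lookup.
$camel = $html->getElementById('div1')->childNodes(1)->getAttribute('id');

echo "$second $firstId $lastId $nextId $prevId $parentId $camel\n";
```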

// Dumps the internal DOM tree back into a string (the cast triggers the object's string conversion)
$str = (string) $html;

// Print it!
echo $html;

// Dumps the internal DOM tree back into a string
$str = $html->save();

// Dumps the internal DOM tree back into a file
$html->save('result.htm');
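
A minimal sketch of dumping to a string and to a file in one call, followed by cleanup. The temporary file name is generated by PHP and is only for illustration; the markup and the require path are assumptions.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

$html = str_get_html('<p>Hello</p>');

// save() returns the serialized tree and, when given a path, also writes it.
$path = tempnam(sys_get_temp_dir(), 'shd');
$str  = $html->save($path);

// Read the file back to confirm it holds the same markup, then remove it.
$fromDisk = file_get_contents($path);
unlink($path);

// Free the DOM's internal references; do not use $html after this point.
$html->clear();
unset($html);

echo $str, "\n";
```

Calling clear() matters most in long-running scripts that parse many documents in a loop.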

// Write a function with the parameter "$element"
function my_callback($element) {
    // Hide all <b> tags
    if ($element->tag == 'b')
        $element->outertext = '';
}

// Register the callback function by its function name
$html->set_callback('my_callback');

// The callback function is invoked while dumping
echo $html;
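
The callback mechanism above can be sketched on a concrete string. The function name hide_bold and the markup are hypothetical, and the require path is an assumption. The callback runs for each node as the tree is dumped, so the edit only takes effect at output time.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical callback: drop every <b> element from the output.
function hide_bold($element) {
    if ($element->tag == 'b') {
        $element->outertext = '';
    }
}

$html = str_get_html('<p>foo <b>bar</b></p>');

// Register the callback by name; it is invoked while dumping.
$html->set_callback('hide_bold');

$out = (string) $html;
echo $out, "\n";
```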
