require 'rubygems' : hpricot HTML Parsing






require 'rubygems'

require 'hpricot'

html = <<END_OF_HTML
<html>
<head>
  <title>This is the page title</title>
</head>

<body>
  <h1>Big heading!</h1>
  <p>A paragraph of text.</p>
  <ul><li>Item 1 in a list</li><li>Item 2</li><li class="highlighted">Item 3</li></ul>
</body>
</html>
END_OF_HTML

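# Parse the HTML string into an Hpricot document
doc = Hpricot(html)
# search returns every element matching "h1"; print the contents of the first one
puts doc.search("h1").first.inner_html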
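
For comparison, here is a minimal sketch of the same kind of lookup using REXML, which ships with Ruby, instead of Hpricot. This is a swapped-in library, not the approach used above, and it assumes the markup is well-formed XML; REXML raises a parse error on the sloppier HTML that Hpricot is designed to tolerate. The method name summarize_page and the hash keys it returns are hypothetical, chosen only for illustration.

```ruby
require 'rexml/document'

# Hypothetical helper: pulls the first heading, the list items, and the
# highlighted item out of a well-formed HTML/XML string using XPath.
def summarize_page(markup)
  doc = REXML::Document.new(markup)

  # XPath.first returns the first matching element, or nil if none match.
  heading     = REXML::XPath.first(doc, "//h1")
  highlighted = REXML::XPath.first(doc, "//li[@class='highlighted']")

  {
    heading:     heading && heading.text,
    # XPath.match returns every matching element in document order.
    items:       REXML::XPath.match(doc, "//ul/li").map(&:text),
    highlighted: highlighted && highlighted.text
  }
end

# Sample markup mirroring the page used above.
sample = <<END_OF_HTML
<html>
<head>
  <title>This is the page title</title>
</head>
<body>
  <h1>Big heading!</h1>
  <p>A paragraph of text.</p>
  <ul><li>Item 1 in a list</li><li>Item 2</li><li class="highlighted">Item 3</li></ul>
</body>
</html>
END_OF_HTML

summary = summarize_page(sample)
puts summary[:heading]
summary[:items].each { |item| puts item }
puts summary[:highlighted]
```

Returning nil for a missing element, rather than calling a method on the result directly, keeps the sketch from raising NoMethodError when a page has no matching tag.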

 








Related examples in the same category

1. Hpricot can work directly with open-uri to load HTML from remote files
2. Using a combination of search methods, search for the list within the HTML and then extract each item
3. Search for the first instance of an element only
4. Using CSS classes to find certain elements