Use a regular expression to get a web page title


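import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
  public static void main(String[] argv) throws Exception {

    URL url = new URL("http://www.java.com/");
    URLConnection urlConnection = url.openConnection();

    // Read the response as text, line by line. DataInputStream.readUTF() only
    // decodes data written by DataOutputStream.writeUTF(), so it cannot read HTML.
    // UTF-8 is assumed here; a real page may declare a different charset.
    StringBuilder sb = new StringBuilder();
    try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(urlConnection.getInputStream(), StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        sb.append(' ').append(line);
      }
    }

    // Collapse whitespace runs so a title split across lines matches as one string.
    String html = sb.toString().replaceAll("\\s+", " ");

    // Match <title> in any letter case, with optional attributes; capture lazily.
    Pattern p = Pattern.compile("<title\\b[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE);
    Matcher m = p.matcher(html);
    while (m.find()) {
      System.out.println(m.group(1).trim());
    }
  }
}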
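
The title matching can also be separated from the network code, so it can be tried on a plain string without fetching a page. The following is a minimal sketch, not part of the original example: the class name TitleExtractor and the method extractTitle are hypothetical names chosen for illustration, and it assumes only the first title element is of interest.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TitleExtractor {
  // Compiled once and reused; DOTALL lets '.' span line breaks inside the title.
  private static final Pattern TITLE = Pattern.compile(
      "<title\\b[^>]*>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

  /** Returns the text of the first title element, with whitespace runs collapsed. */
  static Optional<String> extractTitle(String html) {
    Matcher m = TITLE.matcher(html);
    if (!m.find()) {
      return Optional.empty();
    }
    return Optional.of(m.group(1).replaceAll("\\s+", " ").trim());
  }

  public static void main(String[] args) {
    // Hypothetical sample input: upper-case tag, an attribute, and line breaks.
    String sample = "<html><head><TITLE lang=\"en\">\n  Hello,\n  World </TITLE></head></html>";
    System.out.println(extractTitle(sample).orElse("(no title)"));
  }
}
```

Returning an Optional makes the "no title found" case explicit for the caller. A regular expression is enough for a single, simple tag like this; for anything more structural, the HTMLEditorKit-based parsers in the related examples are likely the better fit.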

Related examples in the same category

1.	Escape HTML special characters from a String
2.	Using javax.swing.text.html.HTMLEditorKit to parse html document
3.	Extract links from an HTML page
4.	extends HTMLEditorKit.ParserCallback
5.	HTML parser based on HTMLEditorKit.ParserCallback
6.	Get all hyper links from a web page
7.	Getting the Links in an HTML Document
8.	Getting the Text in an HTML Document
9.	Find and display hyperlinks contained within a web page
