Remove tags from an HTML string
 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.ComponentModel;

namespace NearForums
{
  public static class Utils
  {
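    /// <summary>
    /// Returns true if the value contains a bare p or div tag
    /// (opening or closing, without attributes, case-sensitive).
    /// </summary>
    public static bool IsHtmlFragment(string value)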
    {
      return Regex.IsMatch(value, @"</?(p|div)>");
    }

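    /// <summary>
    /// Removes tags from an HTML string
    /// </summary>
    /// <param name="value">The HTML string to strip; may be null.</param>
    /// <returns>The text without comments, scripts, styles and tags, or null if value is null.</returns>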
    public static string RemoveTags(string value)
    {
      if (value != null)
      {
        value = CleanHtmlComments(value);
        value = CleanHtmlBehaviour(value);
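        // Replace closing tags with a space so adjacent words stay separated.
        value = Regex.Replace(value, @"</[^>]+?>", " ");
        // Remove the remaining (opening and self-closing) tags.
        value = Regex.Replace(value, @"<[^>]+?>", "");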
        value = value.Trim();
      }
      return value;
    }

    /// <summary>
    /// Removes script and style elements, including their content
    /// </summary>
    /// <returns>The value without script and style elements.</returns>
    public static string CleanHtmlBehaviour(string value)
    {
      value = Regex.Replace(value, "(<style.+?</style>)|(<script.+?</script>)", "", RegexOptions.IgnoreCase | RegexOptions.Singleline);

      return value;
    }

    /// <summary>
    /// Removes HTML comments (including the conditional comments produced by MS Word).
    /// </summary>
    public static string CleanHtmlComments(string value)
    {
      //Remove html comments; Singleline lets a comment span several lines.
      value = Regex.Replace(value, "<!--.+?-->", "", RegexOptions.IgnoreCase | RegexOptions.Singleline);

      return value;
    }

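    /// <summary>
    /// Rewrites html anchors so they carry rel=nofollow and target=_blank.
    /// Only the href and the link content are kept; other attributes are dropped.
    /// Links to in-page fragments (href starting with #) are intended to be left unchanged.
    /// </summary>
    public static string HtmlLinkAddNoFollow(string value)
    {
      return Regex.Replace(value, "<a[^>]+href=\"?'?(?!#[\\w-]+)([^'\">]+)\"?'?[^>]*>(.*?)</a>", "<a href=\"$1\" rel=\"nofollow\" target=\"_blank\">$2</a>", RegexOptions.IgnoreCase | RegexOptions.Compiled);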
    }
  }
}
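
The order of the steps in RemoveTags matters: comments and script/style blocks are removed first so that their content does not survive as text, then closing tags become spaces, then the remaining tags are dropped. The following is a minimal, self-contained sketch of that sequence, not the NearForums code itself. The names RemoveTagsDemo and StripTags and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; it mirrors the steps of Utils.RemoveTags
// so the example can be compiled and run on its own.
public static class RemoveTagsDemo
{
    public static string StripTags(string html)
    {
        if (html == null) return null;

        // 1. Drop comments, which may span several lines.
        html = Regex.Replace(html, "<!--.+?-->", "", RegexOptions.Singleline);

        // 2. Drop style and script elements together with their content.
        html = Regex.Replace(html, "(<style.+?</style>)|(<script.+?</script>)", "",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // 3. Turn closing tags into spaces so neighbouring words stay apart.
        html = Regex.Replace(html, @"</[^>]+?>", " ");

        // 4. Remove whatever tags are left.
        html = Regex.Replace(html, @"<[^>]+?>", "");

        return html.Trim();
    }

    public static void Main()
    {
        // Sample input invented for this sketch.
        string input = "<p>Hello <b>world</b></p><!-- note --><script>alert(1);</script>";
        Console.WriteLine(StripTags(input)); // prints "Hello world"
    }
}
```

Regex-based stripping like this is only suitable for producing plain-text previews of reasonably well-formed markup. It should not be relied on as a security filter for untrusted HTML.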

   
  