Strip HTML : How to Remove HTML tags from a String in VBA

.

In this article I am going to write an User Defined Function (UDF) which is used to remove all HTMLs from a String Stored in Cell. It can be used as normal Inbuilt Formulas of Excel.

For writing this function i have used VBScript Regular Expression to Replace all the HTML tags with BLANK “”. In code you can see i have defined a regular expression to replace all HTML tags.

Regular Pattern which can be used to replace all HTML tags is : “<[^>]+>”

Follow the below Steps to create this UDF (User Defined Function):
1. Open Visual Basic Editor (Alt+ F11)
2. Open an Existing Module or add a New Module
3. Copy Paste the below Code and Save the Excel.

Function – Input as Cell Address


Function StripHTML(cell As Range) As String
 Dim RegEx As Object
 Set RegEx = CreateObject("vbscript.regexp")

 Dim sInput As String
 Dim sOut As String
 sInput = cell.Text
 
 With RegEx
   .Global = True
   .IgnoreCase = True
   .MultiLine = True
.Pattern = "<[^>]+>" 'Regular Expression for HTML Tags.
 End With

 sOut = RegEx.Replace(sInput, "")
 StripHTML = sOut
 Set RegEx = Nothing
End Function

4. Now Go to any Cell of any Sheet of your Workbook and use this formula =stripHTML(Cell)

REMOVE HTML TAGS FROM A STRING

REMOVE HTML TAGS FROM A STRING

Function – Pass whole String with HTML

Below function can be basically used in VBA while writing some Macro.
For example: At some point of time you recieve a string from some resource which has HTML tags with it and you want to remove all of them and want HTML tags free String then simply use this function.

Syntax To Use this Function in your Macro:


htmlFreeString=stripHTML("withHTMLString")

Function StripHTML(sInput As String) As String
 Dim RegEx As Object
 Set RegEx = CreateObject("vbscript.regexp")

 Dim sOut As String
 With RegEx
   .Global = True
   .IgnoreCase = True
   .MultiLine = True
.Pattern = "<[^>]+>" 'Regular Expression for HTML Tags.
 End With

 sOut = RegEx.Replace(sInput, "")
 StripHTML = sOut
 Set RegEx = Nothing
End Function

Buy a coffee for the author

Adsense

Download FREE Tools and Templates

There are many cool and useful excel tools and templates available to download for free. For most of the tools, you get the entire VBA code base too which you can look into it, play around it, and customize according to your need.

Dynamic Arrays and Spill Functions in Excel: A Beginner’s Guide
Dynamic Arrays and Spill Functions in Excel: A Beginner’s Guide

In today's tutorial, we'll be diving into the exciting world of dynamic arrays and spill functions in Office 365 Excel. These features have revolutionized the way we work with data, providing a more flexible and efficient way to handle arrays. I am going to explain...

How to Declare a Public Variable in VBA
How to Declare a Public Variable in VBA

While programming in VBA sometimes you need to declare a Public Variable that can store the value throughout the program. Use of Public Variable: Let's say you have 4 different Functions in your VBA Code or Module and you have a variable that may or may not be...

How to Copy content from Word using VBA

As many of us want to deal with Microsoft Word Document from Excel Macro/VBA. I am going to write few articles about Word from Excel Macro. This is the first article which opens a Word Document and read the whole content of that Word Document and put it in the Active...

What is Excel Formula?

Excel Formula is one of the best feature in Microsoft Excel, which makes Excel a very very rich application. There are so many useful built-in formulas available in Excel, which makes our work easier in Excel. For all the automated work, Excel Macro is not required. There are so many automated things can be done by using simple formulas in Excel. Formulas are simple text (With a Syntax) which is entered in to the Excel Worksheet Cells. So how computer will recognize whether it is a formula or simple text? Answer is simple.. every formula in Excel starts with Equal Sign (=).

You May Also Like…

15 Comments

  1. Rahul Kumar

    Hi i used this but inspite of new line it showing / in the string

    Reply
    • Vishwamitra Mishra

      Hi Rahul,
      I would appreciate if you Can explain me with an example, how you want and how the currect code is behaving for you.
      Thanks,
      Vish

      Reply
  2. Fabio

    Great code! Thanks so much!

    Reply
    • Vishwamitra Mishra

      Thanks Fabio !!

      Reply
  3. Dali

    With the "Function – Pass whole String with HTML" where do you input the syntax? Under a new module or? It is not working for me

    Reply
    • Vishwamitra Mishra

      What Syntax do you want to pass?? In that function you just need to pass the String with HTML tags and it will remove all the HTML tags and return you a string without HTML tags.

      Reply
  4. Scott

    Thanks!

    Reply
  5. Hien

    Hi, thanks for your post!

    How can we improve your function so that we can have:

    This is a test. Will this text be bold or italic

    to become (1):

    This is a test. Will this text be bold or italic

    instead of (2)

    This is a test. Will this text be bold or italic

    when using =stripHTML(Cell) formula.

    The resulted (1) will be rich formatted instead of only remove html tags.

    For solving the problem of
    http://stackoverflow.com/questions/9999713/html-t

    I am new in this field. So, I would hope to get understandable answer. Thank You in advance.

    Reply
  6. GreenMaxD

    Help you

    ‘ASCII Table ### Code HTML ### Character Entity ### Description
    sInput = Replace(sInput, “"”, “”””) ‘Guillemet double
    sInput = Replace(sInput, “&”, “&”) ‘Et commercial
    sInput = Replace(sInput, “<“, “”) ‘Supérieur à
    sInput = Replace(sInput, “€”, “€”) ‘Symbole Euro
    sInput = Replace(sInput, “†”, “†”) ‘Obèle
    sInput = Replace(sInput, “‡”, “‡”) ‘Double obèle
    sInput = Replace(sInput, “<“, “‹”) ‘
    sInput = Replace(sInput, “>”, “›”) ‘
    sInput = Replace(sInput, “œ”, “œ”) ‘
    sInput = Replace(sInput, “Ÿ”, “Ÿ”) ‘
    sInput = Replace(sInput, ” “, ” “) ‘Espace insécable
    sInput = Replace(sInput, “¡”, “¡”) ‘Exclamation inversée
    sInput = Replace(sInput, “¢”, “¢”) ‘Symbole centime 
    sInput = Replace(sInput, “£”, “£”) ‘Symbole livre 
    sInput = Replace(sInput, “¤”, “¤”) ‘Symbole monétaire
    sInput = Replace(sInput, “¥”, “¥”) ‘Signe Yen
    sInput = Replace(sInput, “¦ ou &brkbar;”, “¦”) ‘Barre verticale interrompue
    sInput = Replace(sInput, “§”, “§”) ‘Signe section 
    sInput = Replace(sInput, “¨ ou &die;”, “¨”) ‘Tréma
    sInput = Replace(sInput, “©”, “©”) ‘Copyright
    sInput = Replace(sInput, “ª”, “ª”) ‘Ordinal féminin
    sInput = Replace(sInput, “«”, “«”) ‘Chevron gauche ou guillemet gauche
    sInput = Replace(sInput, “¬”, “¬”) ‘Signe non
    sInput = Replace(sInput, “­”, “”) ‘Tiret conditionnel
    sInput = Replace(sInput, “®”, “®”) ‘Marque déposée
    sInput = Replace(sInput, “¯ ou &hibar;”, “¯”) ‘Macron
    sInput = Replace(sInput, “°”, “°”) ‘Signe degrés
    sInput = Replace(sInput, “±”, “±”) ‘Plus ou moins
    sInput = Replace(sInput, “²”, “²”) ‘Deux exposant
    sInput = Replace(sInput, “³”, “³”) ‘Trois exposant
    sInput = Replace(sInput, “´”, “´”) ‘Accent aigu
    sInput = Replace(sInput, “µ”, “µ”) ‘Signe micro
    sInput = Replace(sInput, “¶”, “¶”) ‘Signe paragraphe
    sInput = Replace(sInput, “·”, “·”) ‘Point milieu
    sInput = Replace(sInput, “¸”, “¸”) ‘Cédille
    sInput = Replace(sInput, “¹”, “¹”) ‘Un exposant
    sInput = Replace(sInput, “º”, “º”) ‘Ordinal masculin
    sInput = Replace(sInput, “»”, “»”) ‘Chevron droit ou guillemet droit
    sInput = Replace(sInput, “¼”, “¼”) ‘Fraction un quart
    sInput = Replace(sInput, “½”, “½”) ‘Fraction un demi
    sInput = Replace(sInput, “¾”, “¾”) ‘Fraction trois-quarts
    sInput = Replace(sInput, “¿”, “¿”) ‘Point d’interrogation inversé
    sInput = Replace(sInput, “À”, “À”) ‘A accent grave majuscule
    sInput = Replace(sInput, “Á”, “Á”) ‘A accent aigu majuscule 
    sInput = Replace(sInput, “”, “”) ‘A accent circonflexe majuscule 
    sInput = Replace(sInput, “Ô, “Ô) ‘A tilde majuscule
    sInput = Replace(sInput, “Ä”, “Ä”) ‘A tréma majuscule
    sInput = Replace(sInput, “Å”, “Å”) ‘A anneau majuscule
    sInput = Replace(sInput, “Æ”, “Æ”) ‘AE majuscule (digramme soudé ae, lettre)
    sInput = Replace(sInput, “Ç”, “Ç”) ‘C cédille majuscule
    sInput = Replace(sInput, “È”, “È”) ‘E accent grave majuscule
    sInput = Replace(sInput, “É”, “É”) ‘E accent aigu majuscule
    sInput = Replace(sInput, “Ê”, “Ê”) ‘E accent circonflexe majuscule
    sInput = Replace(sInput, “Ë”, “Ë”) ‘E tréma majuscule
    sInput = Replace(sInput, “Ì”, “Ì”) ‘I accent grave majuscule
    sInput = Replace(sInput, “Í”, “Í”) ‘I accent aigu majuscule
    sInput = Replace(sInput, “Δ, “Δ) ‘I accent circonflexe majuscule
    sInput = Replace(sInput, “Ï”, “Ï”) ‘I tréma majuscule
    sInput = Replace(sInput, “Ð ou &Dstrok;”, “Д) ‘Eth islandais
    sInput = Replace(sInput, “Ñ”, “Ñ”) ‘N tilde majuscule
    sInput = Replace(sInput, “Ò”, “Ò”) ‘O accent grave majuscule
    sInput = Replace(sInput, “Ó”, “Ó”) ‘O accent aigu majuscule
    sInput = Replace(sInput, “Ô”, “Ô”) ‘O accent circonflexe majuscule
    sInput = Replace(sInput, “Õ”, “Õ”) ‘O tilde majuscule
    sInput = Replace(sInput, “Ö”, “Ö”) ‘O tréma majuscule
    sInput = Replace(sInput, “×”, “×”) ‘Signe infini
    sInput = Replace(sInput, “Ø”, “Ø”) ‘O barré majuscule
    sInput = Replace(sInput, “Ù”, “Ù”) ‘U accent grave majuscule
    sInput = Replace(sInput, “Ú”, “Ú”) ‘U accent aigu majuscule
    sInput = Replace(sInput, “Û”, “Û”) ‘U accent circonflexe majuscule
    sInput = Replace(sInput, “Ü”, “Ü”) ‘U tréma majuscule
    sInput = Replace(sInput, “Ý”, “Ý”) ‘Y accent aigu majuscule
    sInput = Replace(sInput, “Þ”, “Þ”) ‘Thorn islandais majuscule
    sInput = Replace(sInput, “ß”, “ß”) ‘s dur allemand (szet)
    sInput = Replace(sInput, “à”, “à”) ‘a accent grave minuscule
    sInput = Replace(sInput, “á”, “á”) ‘a accent aigu minuscule
    sInput = Replace(sInput, “â”, “â”) ‘a accent circonflexe minuscule
    sInput = Replace(sInput, “ã”, “ã”) ‘a tilde minuscule
    sInput = Replace(sInput, “ä”, “ä”) ‘a tréma minuscule
    sInput = Replace(sInput, “å”, “å”) ‘a rond minuscule
    sInput = Replace(sInput, “æ”, “æ”) ‘e dans l’a minuscule
    sInput = Replace(sInput, “ç”, “ç”) ‘c cédille minuscule
    sInput = Replace(sInput, “è”, “è”) ‘e accent grave minuscule
    sInput = Replace(sInput, “é”, “é”) ‘e accent aigu minuscule
    sInput = Replace(sInput, “ê”, “ê”) ‘e accent circonflexe minuscule
    sInput = Replace(sInput, “ë”, “ë”) ‘e tréma minuscule
    sInput = Replace(sInput, “ì”, “ì”) ‘i accent grave minuscule
    sInput = Replace(sInput, “í”, “í”) ‘i accent aigu minuscule
    sInput = Replace(sInput, “î”, “î”) ‘i accent circonflexe minuscule
    sInput = Replace(sInput, “ï”, “ï”) ‘i tréma minuscule
    sInput = Replace(sInput, “ð”, “ð”) ‘eth islandais minuscule
    sInput = Replace(sInput, “ñ”, “ñ”) ‘n tilde minuscule
    sInput = Replace(sInput, “ò”, “ò”) ‘o accent grave minuscule
    sInput = Replace(sInput, “ó”, “ó”) ‘o accent aigu minuscule
    sInput = Replace(sInput, “ô”, “ô”) ‘o accent circonflexe minuscule
    sInput = Replace(sInput, “õ”, “õ”) ‘o tilde minuscule
    sInput = Replace(sInput, “ö”, “ö”) ‘o tréma minuscule
    sInput = Replace(sInput, “÷”, “÷”) ‘Signe division 
    sInput = Replace(sInput, “ø”, “ø”) ‘o barré minuscule
    sInput = Replace(sInput, “ù”, “ù”) ‘u accent grave minuscule
    sInput = Replace(sInput, “ú”, “ú”) ‘u accent aigu minuscule
    sInput = Replace(sInput, “û”, “û”) ‘u accent circonflexe minuscule
    sInput = Replace(sInput, “ü”, “ü”) ‘u tréma minuscule
    sInput = Replace(sInput, “ý”, “ý”) ‘y accent aigu minuscule
    sInput = Replace(sInput, “þ”, “þ”) ‘Thorn islandais minuscule
    sInput = Replace(sInput, “ÿ”, “ÿ”) ‘y tréma minuscule
    sInput = Replace(sInput, “♥”, “?”) ‘Cœur noir
    sInput = Replace(sInput, “♦”, “?”) ‘Carreau noir
    sInput = Replace(sInput, “♣”, “?”) ‘Trèfle noir
    sInput = Replace(sInput, “♠”, “?”) ‘Pique noir
    sInput = Replace(sInput, “•”, “•”) ‘Gros point médian
    sInput = Replace(sInput, “¶”, “¶”) ‘Pied de mouche
    sInput = Replace(sInput, “§”, “§”) ‘Paragraphe
    sInput = Replace(sInput, “↑”, “?”) ‘Flèche vers le haut
    sInput = Replace(sInput, “↓”, “?”) ‘Flèche vers le bas
    sInput = Replace(sInput, “→”, “?”) ‘Flèche vers la droite
    sInput = Replace(sInput, “←”, “?”) ‘Flèche vers la gauche
    sInput = Replace(sInput, “↔”, “?”) ‘Flèche bilatérale gauche-droite
    sInput = Replace(sInput, “◊”, “?”) ‘Losange
    sInput = Replace(sInput, “α”, “a”) ‘alpha Minuscule
    sInput = Replace(sInput, “Α”, “?”) ‘alpha Majuscule
    sInput = Replace(sInput, “β”, “ß”) ‘béta Minuscule
    sInput = Replace(sInput, “Β”, “?”) ‘béta Majuscule
    sInput = Replace(sInput, “γ”, “?”) ‘gamma Minuscule
    sInput = Replace(sInput, “Γ”, “G”) ‘gamma Majuscule
    sInput = Replace(sInput, “δ”, “d”) ‘delta Minuscule
    sInput = Replace(sInput, “Δ”, “?”) ‘delta Majuscule
    sInput = Replace(sInput, “ε”, “e”) ‘epsilon Minuscule
    sInput = Replace(sInput, “Ε”, “?”) ‘epsilon Majuscule
    sInput = Replace(sInput, “ζ”, “?”) ‘zéta Minuscule
    sInput = Replace(sInput, “Ζ”, “?”) ‘zéta Majuscule
    sInput = Replace(sInput, “η”, “?”) ‘éta Minuscule
    sInput = Replace(sInput, “Η”, “?”) ‘éta Majuscule
    sInput = Replace(sInput, “θ”, “?”) ‘théta Minuscule
    sInput = Replace(sInput, “Θ”, “T”) ‘théta Majuscule
    sInput = Replace(sInput, “ι”, “?”) ‘iota Minuscule
    sInput = Replace(sInput, “Ι”, “?”) ‘iota Majuscule
    sInput = Replace(sInput, “κ”, “?”) ‘kappa Minuscule
    sInput = Replace(sInput, “Κ”, “?”) ‘kappa Majuscule
    sInput = Replace(sInput, “λ”, “?”) ‘lambda Minuscule
    sInput = Replace(sInput, “Λ”, “?”) ‘lambda Majuscule
    sInput = Replace(sInput, “μ”, “µ”) ‘mu Minuscule
    sInput = Replace(sInput, “Μ”, “?”) ‘mu Majuscule
    sInput = Replace(sInput, “ν”, “?”) ‘nu Minuscule
    sInput = Replace(sInput, “Ν”, “?”) ‘nu Majuscule
    sInput = Replace(sInput, “ξ”, “?”) ‘xi Minuscule
    sInput = Replace(sInput, “Ξ”, “?”) ‘xi Majuscule
    sInput = Replace(sInput, “ο”, “?”) ‘omicron Minuscule
    sInput = Replace(sInput, “Ο”, “?”) ‘omicron Majuscule
    sInput = Replace(sInput, “π”, “p”) ‘pi Minuscule
    sInput = Replace(sInput, “Π”, “?”) ‘pi Majuscule
    sInput = Replace(sInput, “ρ”, “?”) ‘rho Minuscule
    sInput = Replace(sInput, “Ρ”, “?”) ‘rho Majuscule
    sInput = Replace(sInput, “σ”, “s”) ‘sigma Minuscule
    sInput = Replace(sInput, “ς”, “?”) ‘sigma Final Minuscule
    sInput = Replace(sInput, “Σ”, “S”) ‘sigma Majuscule
    sInput = Replace(sInput, “τ”, “t”) ‘tau Minuscule
    sInput = Replace(sInput, “Τ”, “?”) ‘tau Majuscule
    sInput = Replace(sInput, “υ”, “?”) ‘upsilon Minuscule
    sInput = Replace(sInput, “Υ”, “?”) ‘upsilon Majuscule
    sInput = Replace(sInput, “φ”, “f”) ‘phi Minuscule
    sInput = Replace(sInput, “Φ”, “F”) ‘phi Majuscule
    sInput = Replace(sInput, “χ”, “?”) ‘chi Minuscule
    sInput = Replace(sInput, “Χ”, “?”) ‘chi Majuscule
    sInput = Replace(sInput, “ψ”, “?”) ‘psi Minuscule
    sInput = Replace(sInput, “Ψ”, “?”) ‘psi Majuscule
    sInput = Replace(sInput, “ω”, “?”) ‘oméga Minuscule
    sInput = Replace(sInput, “Ω”, “O”) ‘oméga Majuscule

    Reply
    • Vishwamitra Mishra

      Thanks for this list 🙂

      Reply
  7. haoer

    Thank you, the Function StripHTML(sInput As String) As String

    is great , and useful for excel 2003 ~~

    Reply
  8. Sajiv

    Hi Vishwa,

    I am using your method in an excel that has macros. If I have only one sheet and have this method, it is getting called and it works well.

    But, if I put it in a workbook that has different sheets, this function is not getting called.

    Could you please let me know what mistake I am making ?

    Thanks
    Sajiv

    Reply
    • Vishwamitra Mishra

      Hi,
      You should put this function in a module in that workbook and then it should work for all the sheets in that workbook. Please check if your function is in a module or not. If not then please create a module and paste this function.

      Reply
  9. Liviu

    i can call the function
    =StripHTML(D60)
    but return the same thing as D60 cell with not html tags stripped:
    Dear colleagues, Please find attached for your approval the new version of Group Credit Control & Collection Policy and its annexes. The policy has been updated in the context of the Corporate Governance Changes.Thank youBest regards

    using xlsm file (so macro enabled)

    Reply
  10. phil

    as above my html text remained unchanged ie doesnt semm to work

    Reply

Trackbacks/Pingbacks

  1. VBA vervangen met een wildchar - Worksheet.nl - [...] paragraph.</p></body></html>" stringwithpipe = StripHTML(htmlstring) End Sub Bron __________________ gr. [...]
  2. Learn Excel Macro How to create User Defined Function in Excel - [...] from a cell UDF to Extract Special Characters from a cell UDF to Extract Alphabets from a cell UDF…
  3. Regular Expression and its usage in Excel VBA - […] Strip HTML : How to Remove HTML tags from a String in VBA […]

Submit a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Join and get a FREE! e-Book

Don't miss any articles, tools, tips and tricks, I publish here

You have Successfully Subscribed!

Pin It on Pinterest