
How to get the pure text without HTML element using JavaScript?

156

I have a button and some text in my HTML, like the following:

function get_content(){
   // I don't know how to do in here!!!
}

<input type="button" onclick="get_content()" value="Get Content"/>
<p id='txt'>
<span class="A">I am</span>
<span class="B">working in </span>
<span class="C">ABC company.</span>
</p>

When the user clicks the button, the content of <p id='txt'> should become the following:

<p id='txt'>
// All the HTML elements within the <p> will disappear
I am working in ABC company.
</p>

Can anyone help me write the JavaScript function?

Thank you.

1

82

[2017-07-25] Since this continues to be the accepted answer despite being a very hacky solution, I'm incorporating Gabi's code into it, leaving my own to serve as a bad example.

<script>
// my hacky approach:
function get_content() {
  var html = document.getElementById("txt").innerHTML;
  document.getElementById("txt").innerHTML = html.replace(/<[^>]*>/g, "");
}
// Gabi's elegant approach, but eliminating one unnecessary line of code:
function gabi_content() {
  var element = document.getElementById('txt');
  element.innerHTML = element.innerText || element.textContent;
}
// and exploiting the fact that IDs pollute the window namespace:
function txt_content() {
  txt.innerHTML = txt.innerText || txt.textContent;
}
</script>
<style>
.A {
  background: blue;
}

.B {
  font-style: italic;
}

.C {
  font-weight: bold;
}
</style>
<input type="button" onclick="get_content()" value="Get Content (bad)" />
<input type="button" onclick="gabi_content()" value="Get Content (good)" />
<input type="button" onclick="txt_content()" value="Get Content (shortest)" />
<p id='txt'>
  <span class="A">I am</span>
  <span class="B">working in </span>
  <span class="C">ABC company.</span>
</p>
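
All three functions above write the result back through innerHTML, so the extracted text is parsed as markup a second time. A minimal sketch of an alternative that stays on textContent for both reading and writing is below. The helper name replaceWithPlainText is hypothetical (it is not from any of the answers), the whitespace collapsing is an added choice, and the sketch assumes a browser that supports textContent.

```javascript
// Hypothetical helper (not from the original answers): collapse an element's
// markup into plain text without routing the text through innerHTML.
function replaceWithPlainText(element) {
  // textContent returns the concatenated text of all descendant nodes.
  // Collapse runs of whitespace (the newlines and indentation between the
  // <span> tags) into single spaces and trim the ends.
  var text = element.textContent.replace(/\s+/g, " ").trim();
  // Assigning to textContent replaces all children with a single text node;
  // characters such as "<" or "&" are stored literally, not parsed as HTML.
  element.textContent = text;
  return text;
}

// Browser usage (assumes the <p id="txt"> from the question exists):
// replaceWithPlainText(document.getElementById("txt"));
```

Because the function only touches the textContent property, it can also be exercised outside a browser with a plain object standing in for the element.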

4

  • 3

    Bad because hacky and slow. Is there even a guarantee that the rendered text itself must never contain tags?

    – Domi

    Jan 9, 2014 at 14:19


  • 1

    no, there is no such guarantee. I gave a disclaimer when I posted. it apparently served the purpose of the OP.

    Jan 9, 2014 at 17:12

  • 4

    Trying to parse HTML with regular expressions is really dangerous: it’s practically impossible (I suspect it may be theoretically impossible) to get right. There are too many edge cases, and then your code blows up when faced with strange input, which can frequently be exploited to do XSS.

    Feb 4, 2015 at 22:37

  • 2

    my guess as to why it was accepted: it’s a complete answer, which can be immediately cut-and-pasted as is into an html file and tested with a browser. I never said it was a good answer. I posted after seeing all the good answers were there, and not accepted, and figured the OP needed a little handholding. it still is good enough for any application for which the HTML source is already known not to contain unbalanced angle brackets.

    Aug 29, 2016 at 23:39
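
The comments above debate how fragile the regex approach is. As a rough illustration only, the sketch below runs the same pattern from get_content() on plain strings. The function name stripTagsWithRegex and both sample inputs are made up for this example. It shows one edge case: a ">" inside an attribute value ends the match early.

```javascript
// The same pattern used in get_content() above.
var TAG_PATTERN = /<[^>]*>/g;

// Hypothetical wrapper so the pattern can be tried on plain strings.
function stripTagsWithRegex(html) {
  return html.replace(TAG_PATTERN, "");
}

// Simple, well-formed markup: the tags are removed as intended.
var simple = stripTagsWithRegex('<span class="A">I am</span>');

// A ">" inside an attribute value terminates the match too soon,
// so the rest of the tag leaks into the output.
var leaky = stripTagsWithRegex('<img alt="a>b" src="x.png">done');

console.log(simple); // I am
console.log(leaky);  // b" src="x.png">done
```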

26

If you can use jQuery, then it's simple:

$("#txt").text()

2

  • 8

    I just have to say, look at all the pure JS answers and then look at this one. This is the second most important reason why I use jQuery (i.e., it simplifies tasks, reduces my workload, and increases readability). The first most important reason (to me) is because it handles many cross-compatibility issues, I might otherwise not even be aware of (like using jQuery to adjust opacity, so that I don’t have to write a separate line just for IE8 to target the filter property. I know that pure JS is technically more efficient when it comes to speed, but that hardly matters anymore in most normal..

    – VoidKing

    Oct 1, 2013 at 14:22

  • 11

    pure js one liner equivalent: document.querySelector("#txt").innerText; People include the entire jQuery library far too often when their only need is a couple of lines of code. It’s bad practice.

    – Levi

    Mar 11, 2018 at 13:36