[php] Strip out HTML and Special Characters

I'd like to use a PHP function (or any other approach) to remove all HTML code and special characters, so that only alphanumeric output is left.

$des = "Hello world)<b> (*&^%$#@! it's me: and; love you.<p>";

I want the output to become Hello world it s me and love you (only A-Z, a-z, 0-9 and whitespace).

I've tried strip_tags, but it removes only the HTML tags:

$clear = strip_tags($des); 
echo $clear;

So is there any way to do it?


The answer is


Strip out tags, leave only alphanumeric characters and space:

$clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags($des));

Edit: all credit to DaveRandom for the perfect solution, which also decodes HTML entities before stripping:

$clear = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags(html_entity_decode($des)));
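As a rough check of the one-liner above against the string from the question, here is a small sketch. The helper name keepAlnum is made up for illustration and is not part of the answer. It replaces each run of disallowed characters with a single space instead of deleting it, which is what produces the "it s" asked for in the question.

```php
<?php
// Hypothetical helper (not from the answer above): replace every run of
// non-alphanumeric characters with one space, then trim the ends.
function keepAlnum(string $text): string
{
    $plain = strip_tags(html_entity_decode($text));
    return trim(preg_replace('/[^a-zA-Z0-9]+/', ' ', $plain));
}

$des = "Hello world)<b> (*&^%$#@! it's me: and; love you.<p>";

// One-liner from the answer: characters are deleted, so "it's" becomes "its"
// and the two spaces around the removed ")" and "(" both survive.
$deleted = preg_replace('/[^a-zA-Z0-9\s]/', '', strip_tags(html_entity_decode($des)));
echo $deleted, "\n";        // Hello world  its me and love you

echo keepAlnum($des), "\n"; // Hello world it s me and love you
```

Which variant fits depends on whether punctuation should act as a word separator or simply disappear.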

Here's a function I've been using, put together from various threads around the net, that removes all tags and non-letter characters and leaves you with a clean phrase. Does anyone know how to modify this script to allow periods (.)? In other words, leave everything as is, but keep periods or other punctuation such as ! or a comma. Let me know.

function stripAlpha( $item )
{
    $search     = array(
         '@<script[^>]*?>.*?</script>@si'   // Strip out javascript
        ,'@<style[^>]*?>.*?</style>@si'     // Strip style tags (no U flag: it would make .*? greedy)
        ,'@<[\/\!]*?[^<>]*?>@si'            // Strip out HTML tags
        ,'@<![\s\S]*?--[ \t\n\r]*>@'        // Strip multi-line comments including CDATA
    );

    $pattern    = array(
         '#[^a-zA-Z ]#'                     // Non alpha characters
        ,'/\s+/'                            // Runs of whitespace
    );

    $replace    = array(
         ''                                 // ...are removed
        ,' '                                // ...become a single space
    );

    // Whitespace is collapsed only through $pattern; replacing runs of
    // whitespace with '' in $search would glue neighbouring words together.
    $item = preg_replace( $search, '', html_entity_decode( $item ) );
    $item = trim( preg_replace( $pattern, $replace, strip_tags( $item ) ) );
    return $item;
}

Remove all special characters without leaving extra spaces behind, written as a single statement:

trim(preg_replace('/ +/', ' ', preg_replace('/[^A-Za-z0-9 ]/', ' ', 
urldecode(html_entity_decode(strip_tags($string))))));

In more detail, building on the example above, consider the following string:

$string = '<div>This..</div> <a>is<a/> <strong>hello</strong> <i>world</i> ! ??? ?? ????? ??????! !@#$%^&&**(*)<>?:";p[]"/.,\|`~1@#$%^&^&*(()908978867564564534423412313`1`` "Arabic Text ?? ???? test 123 ?,.m,............ ~~~ ??]??}~?]?}"; ';

Code:

echo preg_replace('/[^A-Za-z0-9 !@#$%^&*().]/u','', strip_tags($string));

Allows: English letters (uppercase and lowercase), digits 0 to 9, spaces, and the characters !@#$%^&*().

Removes: all HTML tags, and any special characters other than those listed above


You can do it in a single line :) This is especially useful for GET or POST parameters:

// $_GET values are already URL-decoded by PHP, so no extra urldecode() is needed
$clear = preg_replace('/[^A-Za-z0-9\-]/', '', $_GET['id']);

All the other solutions are creepy because they come from people who arrogantly assume that English is the only language in the world :)

All of those solutions also strip letters with diacritics, such as ç or à.

If the goal is to remove the HTML and keep text in any language, the function documented in the PHP manual for that is simply:

$clear = strip_tags($des);
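If punctuation still has to go while accented letters stay, one option is a Unicode-aware character class instead of a-zA-Z. The sketch below assumes the input and the source file are valid UTF-8 and that PCRE has Unicode support (the usual case); the function name keepLettersAndDigits is hypothetical.

```php
<?php
// Hypothetical helper: keep letters and digits from any script,
// turn every other run of characters into a single space.
function keepLettersAndDigits(string $text): string
{
    $plain = strip_tags(html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8'));
    // \p{L} matches any letter, \p{N} any number;
    // the /u modifier makes PCRE treat pattern and subject as UTF-8.
    return trim(preg_replace('/[^\p{L}\p{N}]+/u', ' ', $plain));
}

echo keepLettersAndDigits("<p>Ça va, à bientôt!</p>"), "\n"; // Ça va à bientôt
```

Note that preg_replace returns null when the subject is not valid UTF-8 and /u is set, so real input may need validating first.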

To allow periods and any other characters, just add them to the character class.

Change '#[^a-zA-Z ]#' to '#[^a-zA-Z .()!]#'
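One possible generalisation of this change is to pass the extra characters as a parameter and escape them with preg_quote before building the character class. The name stripExcept and its $extra parameter are hypothetical and not part of the stripAlpha function above.

```php
<?php
// Hypothetical helper: keep letters, spaces, and whatever is listed in $extra.
function stripExcept(string $text, string $extra = ''): string
{
    // preg_quote escapes regex metacharacters; '#' is passed as the
    // delimiter so that it is escaped as well.
    $class   = '#[^a-zA-Z ' . preg_quote($extra, '#') . ']#';
    $cleaned = preg_replace($class, '', strip_tags($text));
    // Collapse any runs of spaces left behind and trim the ends.
    return trim(preg_replace('/\s+/', ' ', $cleaned));
}

echo stripExcept('Wait... what?! <b>Yes</b> (really).', '.()!'), "\n";
// Wait... what! Yes (really).
```

Escaping matters here because characters such as ], ^ and - have a special meaning inside a character class.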


preg_replace('/[^a-zA-Z0-9\s]/', '', $string) removes only the special characters and keeps the whitespace between words.