Best way to automatically remove comments from PHP code


What's the best way to remove comments from a PHP file?

I want to do something similar to php_strip_whitespace(), but it shouldn't remove the line breaks as well.

EG:

I want this:

    <?PHP
    // something
    if ($whatsit) {
        do_something(); # we do something here
        echo '<html>Some embedded HTML</html>';
    }
    /* another long comment */
    some_more_code();
    ?>

to become:

    <?PHP
    if ($whatsit) {
        do_something();
        echo '<html>Some embedded HTML</html>';
    }
    some_more_code();
    ?>

(Although if empty lines remain where comments were removed, that would be OK.)

It may not be possible because of the requirement to preserve embedded HTML. That's what has tripped up the things that have come up on Google.

This question has 11 answers | Original question | benlumley

If you already use an editor like UltraEdit, you can open one or more PHP files and then use a simple Find&Replace (CTRL+R) with the following Perl regexp:

    (?s)/\*.*?\*/

Beware: the regexp above also removes comment-like text inside a string. In echo "hello/*babe*/"; the /*babe*/ would be removed too. So it is only a reasonable option if you have a few files to process. To be absolutely sure it does not replace something that is not a comment, step through the Find&Replace and approve each replacement individually.

Run the command php --strip file.php in a command prompt (e.g. cmd.exe) to print the file with comments and whitespace stripped, then paste the output into http://www.writephponline.com/phpbeautifier to reformat it.

Here, file.php is your own file.

Here's the function posted above, modified to recursively remove all comments from all php files within a directory and all its subdirectories:

    function rmcomments($id) {
        if (file_exists($id)) {
            if (is_dir($id)) {
                $handle = opendir($id);
                // Compare against false so an entry named "0" does not end the loop.
                while (($file = readdir($handle)) !== false) {
                    if (($file != ".") && ($file != "..")) {
                        rmcomments($id . "/" . $file);
                    }
                }
                closedir($handle);
            } else if ((is_file($id)) && (pathinfo($id, PATHINFO_EXTENSION) == "php")) {
                if (!is_writable($id)) {
                    chmod($id, 0777);
                }
                if (is_writable($id)) {
                    $fileStr = file_get_contents($id);
                    $newStr = '';
                    $commentTokens = array(T_COMMENT);
                    if (defined('T_DOC_COMMENT')) {
                        $commentTokens[] = T_DOC_COMMENT;
                    }
                    if (defined('T_ML_COMMENT')) {
                        $commentTokens[] = T_ML_COMMENT;
                    }
                    $tokens = token_get_all($fileStr);
                    foreach ($tokens as $token) {
                        if (is_array($token)) {
                            // Skip comment tokens; keep the text of everything else.
                            if (in_array($token[0], $commentTokens)) {
                                continue;
                            }
                            $token = $token[1];
                        }
                        $newStr .= $token;
                    }
                    if (!file_put_contents($id, $newStr)) {
                        $open = fopen($id, "w");
                        fwrite($open, $newStr);
                        fclose($open);
                    }
                }
            }
        }
    }

    rmcomments("path/to/directory");

The catch is that a less robust matching algorithm (simple regex, for instance) will start stripping here when it clearly shouldn't:

if (preg_match('#^/*' . $this->index . '#', $this->permalink_structure)) { 

It might not affect your code, but eventually someone will get bit by your script. So you will have to use a utility that understands more of the language than you might otherwise expect.

-Adam
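To make that failure concrete, here is a small sketch that runs a naive regex and the tokenizer over a line like the one above. The sample source in $src, the trailing "real comment", and the function names naive_strip and token_strip are made up for illustration; they are not from any answer here.

```php
<?php
// Hypothetical sample: a regex literal containing "/*", followed by a real comment.
$src = <<<'CODE'
<?php
if (preg_match('#^/*' . $index . '#', $permalink)) { run(); } /* real comment */
CODE;

// Naive approach: delete everything from a "/*" up to the next "*/".
function naive_strip($code) {
    return preg_replace('~/\*.*?\*/~s', '', $code);
}

// Tokenizer approach: drop only the tokens PHP itself classifies as comments.
function token_strip($code) {
    $out = '';
    foreach (token_get_all($code) as $token) {
        if (is_array($token)) {
            if ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT) {
                continue;
            }
            $token = $token[1];
        }
        $out .= $token;
    }
    return $out;
}

echo naive_strip($src), "\n"; // the match starts inside the string literal, so code is lost
echo token_strip($src), "\n"; // only the trailing comment is removed
```

The regex has no idea that the first "/*" sits inside a quoted string, so it treats everything up to the next "*/" as a comment. The tokenizer sees the whole '#^/*' literal as one string token and leaves it alone.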

    /*
     * T_ML_COMMENT does not exist in PHP 5.
     * The following three lines define it in order to
     * preserve backwards compatibility.
     *
     * The next two lines define the PHP 5 only T_DOC_COMMENT,
     * which we will mask as T_ML_COMMENT for PHP 4.
     */
    if (! defined('T_ML_COMMENT')) {
        define('T_ML_COMMENT', T_COMMENT);
    } else {
        define('T_DOC_COMMENT', T_ML_COMMENT);
    }

    /*
     * Remove all comments in $file
     */
    function remove_comment($file) {
        $comment_token = array(T_COMMENT, T_ML_COMMENT, T_DOC_COMMENT);

        $input = file_get_contents($file);
        $tokens = token_get_all($input);
        $output = '';

        foreach ($tokens as $token) {
            if (is_string($token)) {
                $output .= $token;
            } else {
                list($id, $text) = $token;

                // Keep every token that is NOT a comment.
                if (!in_array($id, $comment_token)) {
                    $output .= $text;
                }
            }
        }

        file_put_contents($file, $output);
    }

    /*
     * Glob recursive
     * @return ['dir/filename', ...]
     */
    function glob_recursive($pattern, $flags = 0) {
        $file_list = glob($pattern, $flags);

        $sub_dir = glob(dirname($pattern) . '/*', GLOB_ONLYDIR);
        // If sub directories exist
        if (count($sub_dir) > 0) {
            $file_list = array_merge(
                glob_recursive(dirname($pattern) . '/*/' . basename($pattern), $flags),
                $file_list
            );
        }

        return $file_list;
    }

    // Remove all comments from '*.php', including sub directories
    foreach (glob_recursive('*.php') as $file) {
        remove_comment($file);
    }

I'd use the tokenizer. Here's my solution. It should work on both PHP 4 and 5:

    $fileStr = file_get_contents('path/to/file');
    $newStr  = '';

    $commentTokens = array(T_COMMENT);

    if (defined('T_DOC_COMMENT'))
        $commentTokens[] = T_DOC_COMMENT; // PHP 5
    if (defined('T_ML_COMMENT'))
        $commentTokens[] = T_ML_COMMENT;  // PHP 4

    $tokens = token_get_all($fileStr);

    foreach ($tokens as $token) {
        if (is_array($token)) {
            if (in_array($token[0], $commentTokens))
                continue;

            $token = $token[1];
        }

        $newStr .= $token;
    }

    echo $newStr;
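One detail matters for the line-break requirement in the question: before PHP 8.0, a // or # comment token includes its trailing newline, so dropping the token also joins the following line onto the current one. A possible variation, sketched below, re-emits just the newlines each comment contained. The function name strip_comments_keep_lines and the sample in $src are hypothetical, and the sketch assumes PHP 5 or later (where T_DOC_COMMENT is defined).

```php
<?php
// Hypothetical helper: remove comments but keep every line break,
// so line numbers stay the same and blank lines remain where comments were.
function strip_comments_keep_lines($code) {
    $out = '';
    foreach (token_get_all($code) as $token) {
        if (!is_array($token)) {
            $out .= $token; // single-character tokens such as ";" or "{"
            continue;
        }
        if ($token[0] === T_COMMENT || $token[0] === T_DOC_COMMENT) {
            // Re-emit only the newlines that were part of the comment text.
            $out .= str_repeat("\n", substr_count($token[1], "\n"));
            continue;
        }
        $out .= $token[1];
    }
    return $out;
}

// Illustrative input modelled on the question's example.
$src = "<?php\n// something\nif (\$x) {\n    f(); # note\n}\n/* long\ncomment */\ng();\n";
echo strip_comments_keep_lines($src);
```

Because the newline count is unchanged, error messages and stack traces from the stripped file still point at the same line numbers as the original. Files with Windows line endings would lose the carriage returns inside multi-line comments, which this sketch does not handle.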

For AJAX/JSON responses, I use the following PHP code to remove comments from HTML/JavaScript code so that it is smaller (about a 15% gain for my code).

    // Replace doubled spaces with single ones (ignored in HTML any way)
    $html = preg_replace('@(\s){2,}@', '\1', $html);
    // Remove single and multiline comments, tabs and newline chars
    $html = preg_replace(
        '@(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|((?<!:)//.*)|[\t\r\n]@i',
        '',
        $html
    );

Short and effective, but it can produce unexpected results if your code has messy syntax.
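As one example of such an unexpected result, the sketch below feeds the same two replacements a JavaScript string that happens to contain "//". The input snippet and its URL are placeholders invented for this illustration.

```php
<?php
// Hypothetical JavaScript snippet; the protocol-relative URL is a placeholder.
$html = 'var u = "//cdn.example.com/x.js"; load(u);';

// Same two replacements as in the answer above.
$html = preg_replace('@(\s){2,}@', '\1', $html);
$html = preg_replace(
    '@(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|((?<!:)//.*)|[\t\r\n]@i',
    '',
    $html
);

echo $html, "\n"; // prints: var u = "
```

The (?<!:) lookbehind protects http:// style URLs, but any other "//" inside a string is treated as the start of a comment and everything up to the end of the line is cut.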

Bash solution: if you want to recursively remove comments from all PHP files starting from the current directory, you can run this one-liner in a terminal. (It uses a temp1 file to store the PHP content for processing.) Note that this strips whitespace along with the comments.

    find . -type f -name '*.php' | while IFS= read -r VAR; do php -wq "$VAR" > temp1; cat temp1 > "$VAR"; done

Remove the temp1 file afterwards.

If PHP_Beautifier is installed, you can get nicely formatted code without comments with:

    find . -type f -name '*.php' | while IFS= read -r VAR; do php -wq "$VAR" > temp1; php_beautifier temp1 > temp2; cat temp2 > "$VAR"; done

Then remove the two temporary files (temp1, temp2).

How about using php -w to generate a file stripped of comments and whitespace, then using a beautifier like PHP_Beautifier to reformat for readability?
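If you would rather do the stripping step from inside PHP, the built-in php_strip_whitespace() function is documented as similar to running php -w. The sketch below is only an illustration: the sample source and the temporary file are made up, and the beautifier step is left out.

```php
<?php
// Hypothetical sample written to a temporary file, because
// php_strip_whitespace() takes a file name rather than a source string.
$path = tempnam(sys_get_temp_dir(), 'strip');
file_put_contents($path, "<?php\n// secret note\necho 'hi'; /* another note */\n");

// Returns the source with comments and whitespace stripped.
$stripped = php_strip_whitespace($path);
unlink($path);

echo $stripped, "\n";
```

As with php -w, whitespace is stripped along with the comments, so the original line layout is not kept. That is why a reformatting pass is suggested afterwards.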

A more powerful version: remove all comments from every PHP file in the folder.

    <?php
    $di = new RecursiveDirectoryIterator(__DIR__, RecursiveDirectoryIterator::SKIP_DOTS);
    $it = new RecursiveIteratorIterator($di);

    // Collect the path of every .php file below this directory.
    $fileArr = [];
    foreach ($it as $file) {
        if (pathinfo($file, PATHINFO_EXTENSION) == "php") {
            $fileArr[] = $file->getPathname();
        }
    }

    $arr = [T_COMMENT, T_DOC_COMMENT];
    $count = count($fileArr);
    // Start at 0 so the first file is processed too.
    for ($i = 0; $i < $count; $i++) {
        $fileStr = file_get_contents($fileArr[$i]);
        foreach (token_get_all($fileStr) as $token) {
            // Caveat: str_replace() removes the comment text wherever it occurs in the file.
            if (is_array($token) && in_array($token[0], $arr)) {
                $fileStr = str_replace($token[1], '', $fileStr);
            }
        }
        file_put_contents($fileArr[$i], $fileStr);
    }

    $fileStr = file_get_contents('file.php');
    foreach (token_get_all($fileStr) as $token) {
        if ($token[0] != T_COMMENT) {
            continue;
        }
        $fileStr = str_replace($token[1], '', $fileStr);
    }

    echo $fileStr;

Edit: I realised Ionut G. Stan has already suggested this, but I will leave the example here.

