Coder Perfect

How can I get rid of the characters ï»¿ at the beginning of a file?


When I open a CSS file in gedit, it looks normal, but when PHP reads it (to merge all the CSS files into one), it has the following characters prepended to it: ï»¿

Because PHP strips all whitespace while merging, a stray ï»¿ ends up in the middle of the code and breaks the whole stylesheet. As I said, I can’t see these characters when I open the file in gedit, so I can’t easily delete them.

I googled the issue and discovered that there is clearly a problem with the file encoding, which makes sense given that I’ve been transferring data between Linux/Windows servers via ftp and rsync, using a variety of text editors. I don’t know much about character encoding, so any assistance would be greatly appreciated.

If it’s any help, the file is saved as UTF-8, because gedit won’t let me save it as ISO-8859-15 (the document contains one or more characters that cannot be encoded using the specified character encoding). I tried saving it with both Windows and Linux line endings, but neither helped.

Asked by Matt

Solution #1

For you, I have three words:

Byte Order Mark (BOM)

In ISO-8859-1, that’s how the UTF-8 BOM is represented. You must either instruct your editor not to utilize BOMs or use a different editor to remove them.
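
To confirm that a file really starts with a BOM, you can look at its first three bytes. The following is a minimal sketch, assuming a POSIX-like shell with head, od, tr and iconv available; has_bom is a hypothetical helper name, not something from the question. The UTF-8 BOM is the byte sequence EF BB BF, and those same bytes read as ISO-8859-1 are ï, » and ¿.

```shell
# Hypothetical helper: succeeds (exit status 0) if the file starts with
# the three UTF-8 BOM bytes EF BB BF.
has_bom() {
    # Dump the first three bytes as hex and strip the whitespace od adds.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Demo on a throwaway file that starts with a BOM.
demo=$(mktemp)
printf '\357\273\277body { margin: 0; }\n' > "$demo"
if has_bom "$demo"; then
    echo "BOM found"
fi

# Decode the same three bytes as ISO-8859-1 to see what PHP's output shows.
printf '\357\273\277' | iconv -f ISO-8859-1 -t UTF-8
echo
rm -f "$demo"
```

Running has_bom against each CSS file before merging should show which files are responsible.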

You can use awk to automate the BOM removal, as described in this question.
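
One way such an awk removal might look is sketched below. It is not necessarily the command from the linked question; strip_bom is a hypothetical helper name, and the sketch assumes an awk that handles input byte by byte under LC_ALL=C.

```shell
# Hypothetical helper: print a file with a leading UTF-8 BOM removed.
strip_bom() {
    # LC_ALL=C makes awk work on bytes, so the BOM counts as 3 characters.
    LC_ALL=C awk -v bom="$(printf '\357\273\277')" '
        # Only the first line can carry a BOM; drop its first three bytes.
        NR == 1 && index($0, bom) == 1 { $0 = substr($0, 4) }
        { print }
    ' "$1"
}

# Demo: write a file with a BOM, then print it without one.
demo=$(mktemp)
printf '\357\273\277body { margin: 0; }\n' > "$demo"
strip_bom "$demo"
rm -f "$demo"
```

Typical use would be strip_bom style.css > style.clean.css (file names are placeholders). Note that awk's print always ends the last line with a newline, so a file that lacked a final newline gains one.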

As another answer suggests, the ideal solution is for PHP to interpret the BOM correctly; for this, you can use mb_internal_encoding(), as seen here:

   //Store the previous encoding in case some other piece of code is
   //sensitive to encoding and relies on the default value.
   $previous_encoding = mb_internal_encoding();

   //Set the encoding to UTF-8, so when reading files it ignores the BOM
   mb_internal_encoding('UTF-8');

   //Process the CSS files...

   //Finally, return to the previous encoding
   mb_internal_encoding($previous_encoding);

   //Rest of the code...

Answered by Vinko Vrsalovic

Solution #2

Open your file in Notepad++. From the Encoding menu, select Convert to UTF-8 without BOM, save the file, and replace the old file with the new one. That should fix it.

Answered by V.Rohan

Solution #3

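To strip every control character and every non-ASCII byte, including the BOM bytes in question, run the following in PHP. Be aware that this also removes newlines, tabs, and any legitimate non-ASCII characters in the content.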

$response = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $response);

Answered by Michael Schreiber

Solution #4

For those with shell access, here’s a quick command to find any files under the public_html directory that contain a BOM. Be sure to change the path to the correct one for your server.


grep -rl $'\xEF\xBB\xBF' /home/username/public_html

Open the file in vi if you’re familiar with the vi editor:

vi /path-to-file-name/file.php

And then run the following command to remove the BOM:

:set nobomb

Save the file:

:wq

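If many files are affected, fixing them one at a time in an editor gets tedious. The sketch below combines the same grep search with an awk rewrite to strip the BOM from every matching file in place. It is only an illustration under some assumptions: strip_bom_tree is a hypothetical helper name, grep must support -r, and file names must not contain newlines. Try it on a copy of the directory first.

```shell
# Hypothetical helper: strip a leading UTF-8 BOM from every file under a directory.
strip_bom_tree() {
    bom=$(printf '\357\273\277')
    # grep lists files containing the BOM bytes anywhere; the awk step then
    # changes only a first line that actually starts with the BOM.
    LC_ALL=C grep -rlF -- "$bom" "$1" | while IFS= read -r f; do
        tmp=$(mktemp)
        LC_ALL=C awk -v bom="$bom" '
            NR == 1 && index($0, bom) == 1 { $0 = substr($0, 4) }
            { print }
        ' "$f" > "$tmp" && cat "$tmp" > "$f"   # cat keeps the file permissions
        rm -f "$tmp"
    done
}

# Demo on a throwaway directory.
demo_dir=$(mktemp -d)
printf '\357\273\277body { margin: 0; }\n' > "$demo_dir/style.css"
strip_bom_tree "$demo_dir"
od -An -tx1 "$demo_dir/style.css" | head -n 1   # hex dump no longer starts with ef bb bf
rm -rf "$demo_dir"
```

On a real server the call would be strip_bom_tree /home/username/public_html, with the path adjusted as above.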

Answered by Diego Palomar

Solution #5

Because the BOM is just a short sequence of bytes ($EF $BB $BF for UTF-8), you can either remove it with a script or configure your editor so that it is never added.

Removing the BOM from UTF-8:

$file[0] =~ s/^\xEF\xBB\xBF//;

I’m sure it translates to PHP easily.

Answered by Eugene Yokota
