Can someone please convert this tiny snippet to PHP?
kaisellgren
06-01-2009, 12:11 AM
Hi,
Could someone please convert this code to PHP? It basically encodes characters into their corresponding HTML entity values.
public String encodeCharacter( char[] immune, Character c ) {
    char ch = c.charValue();

    // check for immune characters
    if ( containsCharacter( ch, immune ) ) {
        return ""+ch;
    }

    // check for alphanumeric characters
    String hex = Codec.getHexForNonAlphanumeric( ch );
    if ( hex == null ) {
        return ""+ch;
    }

    // check for illegal characters
    if ( ( ch <= 0x1f && ch != '\t' && ch != '\n' && ch != '\r' ) || ( ch >= 0x7f && ch <= 0x9f ) ) {
        return( " " );
    }

    // check if there's a defined entity
    String entityName = (String) characterToEntityMap.get(c);
    if (entityName != null) {
        return "&" + entityName + ";";
    }

    // return the hex entity as suggested in the spec
    return "&#x" + hex + ";";
}
Thanks for any help!
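For reference, a rough PHP sketch of the Java method above, not a definitive port. It assumes $c is a single UTF-8 encoded character and that the mbstring extension is loaded (mb_ord() needs PHP 7.2 or later). The snippet does not show characterToEntityMap or Codec.getHexForNonAlphanumeric(), so the small $entityMap below is a hypothetical stand-in, and the alphanumeric check is an assumption about what a null return from that helper means.

```php
<?php
// Sketch of encodeCharacter(): $immune is an array of characters to leave
// untouched, $c is a single UTF-8 encoded character.
function encodeCharacter(array $immune, string $c): string
{
    // Hypothetical stand-in for characterToEntityMap (code point => entity name).
    static $entityMap = [
        0x22 => 'quot',
        0x26 => 'amp',
        0x3C => 'lt',
        0x3E => 'gt',
    ];

    // check for immune characters
    if (in_array($c, $immune, true)) {
        return $c;
    }

    // check for alphanumeric characters (assumed to be the case where
    // getHexForNonAlphanumeric() returns null in the Java code)
    if (preg_match('/^[A-Za-z0-9]$/', $c) === 1) {
        return $c;
    }

    // Unicode code point of the character
    $cp = mb_ord($c, 'UTF-8');

    // check for illegal characters: control characters become a space
    if (($cp <= 0x1F && $c !== "\t" && $c !== "\n" && $c !== "\r")
        || ($cp >= 0x7F && $cp <= 0x9F)) {
        return ' ';
    }

    // check if there's a defined entity
    if (isset($entityMap[$cp])) {
        return '&' . $entityMap[$cp] . ';';
    }

    // otherwise return the hex entity
    return '&#x' . dechex($cp) . ';';
}
```

Calling encodeCharacter([], ';') gives the hex entity for the semicolon, while encodeCharacter([';'], ';') returns it unchanged because it is listed as immune.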
abduraooft
06-01-2009, 09:01 AM
Have you checked http://php.net/htmlentities ?
kaisellgren
06-01-2009, 03:41 PM
It only converts some basic characters...
$a = chr(1).'This is a chinese character: "華" as you can see... ;)';
This is a chinese character: "華" as you can see... ;)
The first character is not converted to &#1;! The Chinese character is not converted to its HTML entity, and neither are : . ; or ), in addition to the letters.
I need to convert any character to its corresponding HTML entity: &#xx; and not just the basic < > ' " etc.
Fou-Lu
06-01-2009, 05:12 PM
Works for me, just make sure you're changing your charset as well. Do you mean to keep that SOH, though? It doesn't look like the decode will remove it.
If you still can't get it to work, we can look at converting what you have. Going to Unicode will be a pain, though.
Oh sorry, I missed that you were looking for other things too.
How come you need to convert the other special chars, like ; and :? For that, yeah, we'll need to look at rewriting.
kaisellgren
06-01-2009, 06:14 PM
There are no display problems, and this is not a display problem. The issue is how the characters are represented in the HTML source.
For instance, you can display an & sign either by typing it in the HTML as a bare & (which I know does not obey strict standard rules) or by using the corresponding HTML entity: &amp;
Now what I want to do is to convert all characters into HTML entities. This certainly is possible.
For example, a space can be converted into &#32; (in addition to &nbsp;), where 32 is the corresponding value in the ASCII set. Now, I want to convert all characters to these entities. Well, okay, a-zA-Z0-9 do not need to be, but all other characters do, including . , : etc.
For example here: http://rishida.net/scripts/uniview/conversion.php
If you type A and convert, it reports the matching numeric entity, which is exactly what I want.
That is simple to do: just use ord() to find the number. However, I need help figuring this out for multibyte characters like the Chinese character I showed earlier, which is &#33775; in entity format. I want to do that kind of conversion in PHP.
EDIT: Omg, this board converted my characters... the chinese character equals to &# 33775;
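For the multibyte part, here is a minimal sketch rather than a finished solution. It assumes the input is valid UTF-8 and that the mbstring extension is loaded (mb_ord() needs PHP 7.2 or later); the function name toNumericEntities is made up for illustration. It leaves a-z, A-Z and 0-9 alone and turns every other character into a decimal numeric entity, using the code point rather than the per-byte value that ord() would give.

```php
<?php
// Hypothetical helper: convert every non-alphanumeric character of a
// UTF-8 string into a decimal numeric entity such as &#32;.
function toNumericEntities(string $text): string
{
    // The /u modifier makes the pattern match whole UTF-8 characters
    // instead of single bytes, so a multibyte character arrives in one piece.
    $result = preg_replace_callback(
        '/[^A-Za-z0-9]/u',
        function (array $m): string {
            // mb_ord() returns the Unicode code point of the character
            return '&#' . mb_ord($m[0], 'UTF-8') . ';';
        },
        $text
    );

    // preg_replace_callback() returns null on error, e.g. for invalid UTF-8
    if ($result === null) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    return $result;
}
```

Unlike the Java method in the first post, this sketch does not replace control characters with a space, so chr(1) comes out as a numeric entity too; add that check if the output has to be valid HTML.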