mb_ord
(PHP 7 >= 7.2.0, PHP 8)
mb_ord — Get Unicode code point of character
Description
mb_ord(string $string, ?string $encoding = null): int|false
Returns the Unicode code point value of the given character.
This function complements mb_chr().
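The relationship with mb_chr() can be illustrated with a round trip. This is a minimal sketch; the variable names are illustrative, and the encoding is passed explicitly so the result does not depend on the configured internal encoding.

```php
<?php
// Round trip: character -> code point -> character.
$char = "\u{20AC}";                        // EURO SIGN, written as an escape
$codePoint = mb_ord($char, 'UTF-8');       // 8364 (0x20AC)
$roundTrip = mb_chr($codePoint, 'UTF-8');  // back to the original character

var_dump($codePoint);            // int(8364)
var_dump($roundTrip === $char);  // bool(true)
```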
Parameters
- string
  A string.
- encoding
  The character encoding. If omitted or null, the internal character encoding is used.
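A short sketch of how the encoding parameter changes the result. It assumes the internal encoding is set to UTF-8 for the demonstration; the variable names are illustrative only.

```php
<?php
// With the encoding omitted, mb_ord() falls back to the internal encoding.
mb_internal_encoding('UTF-8');
$euro = "\u{20AC}";   // the three bytes E2 82 AC in UTF-8

$implicit = mb_ord($euro);            // uses the internal encoding (UTF-8)
$explicit = mb_ord($euro, 'UTF-8');   // same result, stated explicitly

// The same bytes read as a single-byte encoding: the first character is
// just the first byte, 0xE2.
$latin1 = mb_ord($euro, 'ISO-8859-1');

var_dump($implicit, $explicit, $latin1);   // int(8364) int(8364) int(226)
```

Passing the encoding explicitly keeps the result independent of the runtime configuration.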
Return Values
The Unicode code point for the first character of string, or false on failure.
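Because only the first character is examined and false signals failure, callers may want a small wrapper. The following is a sketch; first_code_point() is a hypothetical helper, not part of mbstring, and it treats an empty string as "no result" on its own rather than passing it to mb_ord().

```php
<?php
/**
 * Hypothetical helper: code point of the first character of $s,
 * or null when there is nothing usable.
 */
function first_code_point(string $s, string $encoding = 'UTF-8'): ?int
{
    if ($s === '') {
        // An empty string has no first character to inspect.
        return null;
    }
    $codePoint = mb_ord($s, $encoding);
    // Compare with === false: 0 is a valid code point (NUL) but is falsy.
    return $codePoint === false ? null : $codePoint;
}

var_dump(first_code_point('ABC'));  // int(65), only 'A' is examined
var_dump(first_code_point("\0"));   // int(0)
var_dump(first_code_point(''));     // NULL
```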
Changelog
Version | Description |
---|---|
8.0.0 | encoding is nullable now. |
Examples
<?php
var_dump(mb_ord("A", "UTF-8"));
var_dump(mb_ord("🐘", "UTF-8"));
var_dump(mb_ord("\x80", "ISO-8859-1"));
var_dump(mb_ord("\x80", "Windows-1252"));
?>
The above example will output:
int(65)
int(128024)
int(128)
int(8364)
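The integers above are decimal, while Unicode charts usually list code points in U+XXXX hexadecimal notation. A minimal sketch of the conversion follows; u_plus() is a hypothetical helper name, and it assumes mb_ord() succeeds for the given input.

```php
<?php
// Hypothetical helper: format the first character's code point as U+XXXX
// (at least four uppercase hexadecimal digits).
// Assumes mb_ord() does not return false for the input.
function u_plus(string $char, string $encoding = 'UTF-8'): string
{
    return sprintf('U+%04X', mb_ord($char, $encoding));
}

echo u_plus('A'), "\n";                     // U+0041
echo u_plus("\u{1F418}"), "\n";             // U+1F418
echo u_plus("\x80", 'Windows-1252'), "\n";  // U+20AC
```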
See Also
- mb_internal_encoding() - Set/Get internal character encoding
- mb_chr() - Return character by Unicode code point value
- IntlChar::ord() - Return Unicode code point value of character
- ord() - Convert the first byte of a string to a value between 0 and 255
User Contributed Notes 1 note
Andrew
2 years ago
With mb_ord() you can forget about the DIY uniord() from
https://www.php.net/manual/en/function.ord.php#42778
Both return the same values:
$array['Б'] = uniord('Б');
$array['🚷'] = uniord('🚷');
$array['mb_ord Б'] = mb_ord('Б');
$array['mb_ord 🚷'] = mb_ord('🚷');
function uniord($charUTF8)
{
    // Convert to UCS-4BE: four bytes per character, most significant first.
    $charUCS4 = mb_convert_encoding($charUTF8, 'UCS-4BE', 'UTF-8');
    $byte1 = ord(substr($charUCS4, 0, 1));
    $byte2 = ord(substr($charUCS4, 1, 1));
    $byte3 = ord(substr($charUCS4, 2, 1));
    $byte4 = ord(substr($charUCS4, 3, 1));
    // The most significant byte is shifted by 24 bits, not 32.
    return ($byte1 << 24) + ($byte2 << 16) + ($byte3 << 8) + $byte4;
}
var_export($array);
Shows:
array ( 'Б' => 1041, '🚷' => 128695, 'mb_ord Б' => 1041, 'mb_ord 🚷' => 128695, )
https://unicode-table.com/en/0411/
Б
Encoding | Hex | Dec (bytes) | Dec (as one integer) | Binary |
---|---|---|---|---|
UTF-8 | D0 91 | 208 145 | 53393 | 11010000 10010001 |
UTF-16BE | 04 11 | 4 17 | 1041 | 00000100 00010001 |
UTF-16LE | 11 04 | 17 4 | 4356 | 00010001 00000100 |
UTF-32BE | 00 00 04 11 | 0 0 4 17 | 1041 | 00000000 00000000 00000100 00010001 |
UTF-32LE | 11 04 00 00 | 17 4 0 0 | 285474816 | 00010001 00000100 00000000 00000000 |
https://unicode-table.com/en/1F6B7/
🚷
Encoding | Hex | Dec (bytes) | Dec (as one integer) | Binary |
---|---|---|---|---|
UTF-8 | F0 9F 9A B7 | 240 159 154 183 | 4036991671 | 11110000 10011111 10011010 10110111 |
UTF-16BE | D8 3D DE B7 | 216 61 222 183 | 3627933367 | 11011000 00111101 11011110 10110111 |
UTF-16LE | 3D D8 B7 DE | 61 216 183 222 | 1037613022 | 00111101 11011000 10110111 11011110 |
UTF-32BE | 00 01 F6 B7 | 0 1 246 183 | 128695 | 00000000 00000001 11110110 10110111 |
UTF-32LE | B7 F6 01 00 | 183 246 1 0 | 3086352640 | 10110111 11110110 00000001 00000000 |
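The hex column of these tables can be reproduced with mb_convert_encoding() and bin2hex(). The sketch below assumes the input character is UTF-8; the variable names are illustrative. It also shows that the UTF-32BE bytes, read as one big-endian 32-bit integer, are the code point that mb_ord() returns.

```php
<?php
// Print the byte sequence of one character in several encodings.
$char = "\u{0411}";   // CYRILLIC CAPITAL LETTER BE, given as UTF-8

foreach (['UTF-8', 'UTF-16BE', 'UTF-16LE', 'UTF-32BE', 'UTF-32LE'] as $enc) {
    $bytes = mb_convert_encoding($char, $enc, 'UTF-8');
    printf("%-8s %s\n", $enc, strtoupper(bin2hex($bytes)));
}

// Read the four UTF-32BE bytes as an unsigned big-endian 32-bit integer.
$utf32be = mb_convert_encoding($char, 'UTF-32BE', 'UTF-8');
$fromBytes = unpack('N', $utf32be)[1];

var_dump($fromBytes);                              // int(1041)
var_dump($fromBytes === mb_ord($char, 'UTF-8'));   // bool(true)
```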