Runtime Configuration
The behaviour of these functions is affected by settings in php.ini.
Name | Default | Changeable | Changelog |
---|---|---|---|
mbstring.language | "neutral" | PHP_INI_ALL | PHP_INI_PERDIR in PHP <= 5.2.6. |
mbstring.detect_order | NULL | PHP_INI_ALL | |
mbstring.http_input | "pass" | PHP_INI_ALL | |
mbstring.http_output | "pass" | PHP_INI_ALL | |
mbstring.internal_encoding | NULL | PHP_INI_ALL | |
mbstring.script_encoding | NULL | PHP_INI_ALL | Removed in PHP 5.4.0; use zend.script_encoding instead. |
mbstring.substitute_character | NULL | PHP_INI_ALL | |
mbstring.func_overload | "0" | PHP_INI_SYSTEM | PHP_INI_PERDIR in PHP <= 5.2.6. Deprecated as of PHP 7.2.0; removed as of PHP 8.0.0. |
mbstring.encoding_translation | "0" | PHP_INI_PERDIR | |
mbstring.http_output_conv_mimetypes | "^(text/|application/xhtml\+xml)" | PHP_INI_ALL | Available as of PHP 5.3.0. |
mbstring.strict_detection | "0" | PHP_INI_ALL | Available as of PHP 5.1.2. |
Here is a short explanation of the configuration directives.
mbstring.language string
The default national language setting (NLS) used by mbstring. Note that this option automatically defines mbstring.internal_encoding, so mbstring.internal_encoding should be placed after mbstring.language in php.ini.
mbstring.encoding_translation bool
Enables the transparent character encoding filter for incoming HTTP queries, which detects the encoding of the input and converts it to the internal character encoding.
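mbstring.internal_encoding string
Warning
This feature has been deprecated as of PHP 5.6.0. Relying on this feature is highly discouraged.
Defines the default internal character encoding.
Users of PHP 5.6 and later should leave this empty and set default_charset instead.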
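The internal encoding is what the mb_* functions fall back to when no encoding argument is given. The following is a minimal sketch of that behaviour, assuming PHP 7.0 or later (for the `\u{...}` escapes) with the mbstring extension loaded. It sets the encoding per request with mb_internal_encoding() rather than through default_charset, and the sample string is purely illustrative.

```php
<?php
// Sample text built from code points, so the result does not depend on
// how this source file was saved: U+65E5 U+672C U+8A9E (3 characters).
$text = "\u{65E5}\u{672C}\u{8A9E}";

// Set the internal encoding at runtime instead of via mbstring.internal_encoding.
mb_internal_encoding('UTF-8');

$bytes      = strlen($text);            // counts bytes
$characters = mb_strlen($text);         // uses the internal encoding (UTF-8)
$asBytes    = mb_strlen($text, '8bit'); // an explicit encoding overrides it

echo "$bytes $characters $asBytes\n";   // prints "9 3 9"
```

Passing the encoding explicitly, as in the last call, keeps a function's result independent of whatever the server configuration happens to be.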
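mbstring.http_input string
Warning
This feature has been deprecated as of PHP 5.6.0. Relying on this feature is highly discouraged.
Defines the default HTTP input character encoding.
Users of PHP 5.6 and later should leave this empty and set default_charset instead.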
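mbstring.http_output string
Warning
This feature has been deprecated as of PHP 5.6.0. Relying on this feature is highly discouraged.
Defines the default HTTP output character encoding.
Users of PHP 5.6 and later should leave this empty and set default_charset instead.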
mbstring.detect_order string
Defines the default character encoding detection order. See also mb_detect_order().
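The detection order can also be changed per request. The sketch below assumes PHP 7.0 or later with mbstring loaded; the two-entry order and the sample strings are chosen only for illustration. It shows that mb_detect_encoding() uses the current order when no encoding list is passed.

```php
<?php
// Replace the detection order for this request (the runtime counterpart of
// mbstring.detect_order = ASCII,UTF-8).
mb_detect_order(['ASCII', 'UTF-8']);
$order = mb_detect_order();             // the current order as an array

$ascii = 'plain text';
$utf8  = "\u{65E5}\u{672C}\u{8A9E}";    // three non-ASCII characters

// With no encoding list, mb_detect_encoding() falls back to the order above.
// The third argument requests strict detection.
$first  = mb_detect_encoding($ascii, null, true);
$second = mb_detect_encoding($utf8, null, true);

echo implode(',', $order), " | $first | $second\n";
// prints "ASCII,UTF-8 | ASCII | UTF-8"
```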
mbstring.substitute_character string
Defines the character substituted for characters that are invalid in the target encoding. See mb_substitute_character() for the supported values.
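A short sketch of what the substitute character does in practice, assuming PHP 7.0 or later with mbstring loaded. It converts a string containing one character that ASCII cannot represent, first with "?" as the substitute and then with substitution turned off. The sample string is illustrative.

```php
<?php
// "a", U+65E5, "b": the middle character has no ASCII equivalent.
$text = "a\u{65E5}b";

mb_substitute_character(0x3F);                    // code point of "?"
$marked = mb_convert_encoding($text, 'ASCII', 'UTF-8');

mb_substitute_character('none');                  // drop such characters
$dropped = mb_convert_encoding($text, 'ASCII', 'UTF-8');

echo "$marked $dropped\n";                        // prints "a?b ab"
```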
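mbstring.func_overload string
Warning
This feature has been deprecated as of PHP 7.2.0 and removed as of PHP 8.0.0. Relying on this feature is highly discouraged.
Overloads a set of single-byte functions with their mbstring counterparts. See Function overloading for more information.
This setting can only be changed from the php.ini file.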
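Because overloading is no longer available in current PHP versions, code that needs multibyte behaviour has to call the mb_* function itself. A minimal sketch of the difference, assuming PHP 7.0 or later with mbstring loaded and an illustrative sample string:

```php
<?php
$text = "\u{65E5}\u{672C}\u{8A9E}";          // 3 characters, 9 bytes in UTF-8

$byteCut = substr($text, 0, 2);              // first 2 bytes: cuts U+65E5 apart
$charCut = mb_substr($text, 0, 2, 'UTF-8');  // first 2 characters

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump($charCut === "\u{65E5}\u{672C}");      // bool(true)
```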
mbstring.http_output_conv_mimetypes string
mbstring.strict_detection bool
Enables strict encoding detection.
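According to the HTML 4.01 specification, Web browsers are allowed to submit a form in a character encoding different from the one used for the page. See mb_http_input() to detect the character encoding used by browsers.
Although popular browsers can make a reasonably accurate guess at the character encoding of a given HTML document, it is better to set the charset parameter of the Content-Type HTTP header to the appropriate value, either with header() or with the default_charset ini setting.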
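Declaring the charset from a script could look like the sketch below. The helper `contentTypeHeader()` is hypothetical (it is not part of PHP) and only assembles the header line; the MIME type and charset are example values.

```php
<?php
// Hypothetical helper (not part of PHP): builds a Content-Type header line.
function contentTypeHeader(string $mimeType, string $charset): string
{
    return sprintf('Content-Type: %s; charset=%s', $mimeType, $charset);
}

$line = contentTypeHeader('text/html', 'UTF-8');

// header() must be called before any output has been sent.
if (!headers_sent()) {
    header($line);
}
```

Setting default_charset in php.ini achieves the same result without touching each script, which is usually the simpler choice when every page uses one encoding.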
Example #1 php.ini setting examples
; Set default language
mbstring.language = Neutral ; Set default language to Neutral (UTF-8) (default)
mbstring.language = English ; Set default language to English
mbstring.language = Japanese ; Set default language to Japanese

;; Set default internal encoding
;; Note: make sure to use a character encoding that PHP can handle
mbstring.internal_encoding = UTF-8 ; Set internal encoding to UTF-8

;; Enable HTTP input encoding translation
mbstring.encoding_translation = On

;; Set default HTTP input character encoding
;; Note: scripts cannot change the http_input setting
mbstring.http_input = pass ; No conversion
mbstring.http_input = auto ; Set HTTP input to auto
                           ; "auto" is expanded according to mbstring.language
mbstring.http_input = SJIS ; Set HTTP input to SJIS
mbstring.http_input = UTF-8,SJIS,EUC-JP ; Specify order

;; Set default HTTP output character encoding
mbstring.http_output = pass ; No conversion
mbstring.http_output = UTF-8 ; Set HTTP output encoding to UTF-8

;; Set default character encoding detection order
mbstring.detect_order = auto ; Set detect order to auto
mbstring.detect_order = ASCII,JIS,UTF-8,SJIS,EUC-JP ; Specify order

;; Set default substitute character
mbstring.substitute_character = 12307 ; Specify Unicode value
mbstring.substitute_character = none ; Do not print character
mbstring.substitute_character = long ; Long example: U+3000,JIS+7E7E
Example #2 php.ini settings for EUC-JP users
;; Disable output buffering
output_buffering = Off

;; Set HTTP header charset
default_charset = EUC-JP

;; Set default language to Japanese
mbstring.language = Japanese

;; Enable HTTP input encoding translation
mbstring.encoding_translation = On

;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto

;; Convert HTTP output to EUC-JP
mbstring.http_output = EUC-JP

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none
Example #3 php.ini settings for SJIS users
;; Enable output buffering
output_buffering = On

;; Set mb_output_handler to enable output conversion
output_handler = mb_output_handler

;; Set HTTP header charset
default_charset = Shift_JIS

;; Set default language to Japanese
mbstring.language = Japanese

;; Set HTTP input encoding conversion to auto
mbstring.http_input = auto

;; Convert to SJIS
mbstring.http_output = SJIS

;; Set internal encoding to EUC-JP
mbstring.internal_encoding = EUC-JP

;; Do not print invalid characters
mbstring.substitute_character = none
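To check which values are actually in effect after editing php.ini, the directives can be read back with ini_get(). This is a rough sketch: the list of names is taken from the table on this page plus default_charset, and the `(not available)` label is an arbitrary choice for directives that the running PHP version does not register (for example, ones that have been removed).

```php
<?php
// Directive names taken from the table on this page, plus default_charset.
$directives = [
    'default_charset',
    'mbstring.language',
    'mbstring.detect_order',
    'mbstring.http_input',
    'mbstring.http_output',
    'mbstring.internal_encoding',
    'mbstring.substitute_character',
    'mbstring.func_overload',
    'mbstring.encoding_translation',
    'mbstring.http_output_conv_mimetypes',
    'mbstring.strict_detection',
];

$effective = [];
foreach ($directives as $name) {
    $value = ini_get($name); // false when the directive is not registered
    $effective[$name] = $value === false
        ? '(not available)'
        : var_export($value, true);
}

foreach ($effective as $name => $shown) {
    printf("%-36s %s\n", $name, $shown);
}
```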
User Contributed Notes 3 notes
String literals in the PHP script are encoded with the same encoding that the PHP file was saved with. This is not affected by default_charset or other .ini settings.
Scenario: The default_charset is KOI8-R, and there is a text file "input.txt" containing the string "Это текст для поиска." in KOI8-R encoding.
A PHP script is written:
<?php
// mb_internal_encoding('KOI8-R');
$string = 'текст';
$data = file_get_contents('input.txt');
echo mb_strpos($data, $string);
?>
But unfortunately it was saved as UTF-8.
It doesn't work; mb_strpos() returns false because it can't find the UTF-8-encoded "текст" inside the KOI8-R-encoded "Это текст для поиска.".
Adjusting the default_charset had no effect. Not even fiddling with mb_internal_encoding could fix it, simply because the strings involved had *different* encodings and without actually changing one of them they just weren't going to match.
Either re-save the source file as KOI8-R to match the data file, or re-save the data file as UTF-8 to match the source code. Only then will the script properly echo '4'.
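When re-saving either file is not an option, the data can be converted at runtime so that both strings share one encoding before searching. The following is a sketch under stated assumptions: the script itself is saved as UTF-8, and the KOI8-R bytes are generated in place as a stand-in for reading "input.txt".

```php
<?php
// Assumption: this source file is saved as UTF-8, so the literals are UTF-8.
$needle = 'текст';

// Stand-in for file_get_contents('input.txt'): the same sentence as KOI8-R bytes.
$data = mb_convert_encoding('Это текст для поиска.', 'KOI8-R', 'UTF-8');

// Byte for byte, the UTF-8 needle does not occur in the KOI8-R data.
var_dump(strpos($data, $needle));                      // bool(false)

// Convert the data to the encoding of the source file, then search.
$utf8Data = mb_convert_encoding($data, 'UTF-8', 'KOI8-R');
var_dump(mb_strpos($utf8Data, $needle, 0, 'UTF-8'));   // int(4)
```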
The documentation is vague about precisely which "NLS" language strings are valid for "mbstring.language".
According to http://php.net/manual/en/function.mb-language.php the values are "Japanese", "ja", "English", "en", or "uni" for UTF-8.
On the other hand, the sample on this current page omits "uni" but introduces "Neutral" as an undocumented option - which is also the default value:
<?php
var_dump( mb_language() ); // "neutral" (default if not set)
var_dump( mb_language( 'uni' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "uni"
var_dump( mb_language( 'neutral' ) ); // TRUE, valid language string
var_dump( mb_language() ); // "neutral"
?>
Note that you should at least set "mbstring.internal_encoding".
Just check as below:
<?php
echo mb_internal_encoding() . '<br />';
echo mb_regex_encoding();
?>
You might be surprised at unexpected values.
e.g.
mbstring.language = Japanese
;mbstring.internal_encoding (commented out, so phpinfo() shows "no value")
These two lines in "php.ini" have the same effect as
mb_internal_encoding("EUC-JP");
mb_regex_encoding("EUC-JP");
on Windows and Linux servers.
"mbstring.internal_encoding" defines the default encoding for "mb_" functions such as "mb_strlen()".
It also defines the default for "mb_ereg_" functions such as "mb_ereg()" when you don't set "mb_regex_encoding".