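I have a PHP script that I run from the terminal. Here is what it does:
> Fetches one row from the database (the table stores JSON strings to be processed specifically by this script);
> Converts the JSON string to an array and prepares the data to be inserted into the database;
> Inserts the required data into the database.
Here is the script: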
#!/usr/bin/php
<?PHP
//script used to parse tweets we have gathered from the twitter streaming API
mb_internal_encoding("UTF-8");
date_default_timezone_set('UTC');
require './config/config.php';
require './libs/db.class.php';
require './libs/tweetReadWrite.class.php';
require './libs/tweetHandle.class.php';
require './libs/tweetPrepare.class.php';
require './libs/pushOver.class.php';
require './libs/getLocationDetails.class.php';
//instantiate our classes
$twitdb = new db(Config::getConfig("twitterDbConnStr"),Config::getConfig("twitterDbUser"),Config::getConfig("twitterDbPass"));
$pushOvr = new PushOver(); // push error messages to my phone
$tweetPR = new TweetPrepare(); // prepares tweet data
$geoData = new getLocationDetails($pushOvr); // reverse geolocation using google maps API
$tweetIO = new TweetReadWrite($twitdb,$tweetPR,$pushOvr,$geoData); // read and write tweet data to the database
/* grab cached json row from the Oracle database
*
* the reason the JSON string is brought back in multiple parts is because
* PDO doesn't handle CLOBs very well and most of the time the JSON string
* is larger than 4000 chars - it's a hack but it works
*
* the following sql specifies a test row to work with which has characters like €$£ etc..
*/
$sql = "
SELECT a.tjc_id
, dbms_lob.substr(tweet_json, 4000,1) part1
, dbms_lob.substr(tweet_json, 4000,4001) part2
, dbms_lob.substr(tweet_json, 4000,8001) part3
FROM twtr_json_cache a
WHERE a.tjc_id = 8368
";
$sth = $twitdb->prepare($sql);
$sth->execute();
$data = $sth->fetchAll();
//join JSON string back together
$jsonRaw = $data[0]['PART1'].$data[0]['PART2'].$data[0]['PART3'];
//shouldn't need to do this, it doesn't affect the outcome anyway
$jsonRaw = mb_convert_encoding($jsonRaw, "UTF-8");
//convert JSON object to an array
$data = json_decode($jsonRaw,true);
//prepares the data (grabs the data I need from the JSON object and does some
//validation etc.), then finally submits it to the database
$result = $tweetIO->saveTweet($data); // returns BOOL
echo $result;
?>
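Now, if I run this from the terminal using ./proc_json_cache.php or php proc_json_cache.php, it works fine: the data reaches the database UTF-8 encoded, and the data in the database looks like this: £$@€ ;test.
If I call this script via cron, it still saves the data, but special characters such as € come out as squares, and the data in the database looks like this: $@
I have tried setting the following for the cron job: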
TERM=xterm
SHELL=/bin/bash
The settings above match my current shell session's environment. I also added the following to the bash script that calls my PHP script:
export NLS_LANG="ENGLISH_UNITED KINGDOM.AL32UTF8"
export LANG="en_GB.UTF-8"
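Again, these match my current shell environment, but I still get the character encoding problem when the script runs from cron rather than directly from the terminal.
Has anyone had a similar problem and can explain how to fix it?
Thanks in advance.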
Edit:
Here is some more information about the server:
Operating system: SUSE Linux Enterprise Server 11
PHP: 5.2.14
Best answer: Try adding the following to the bash script that calls the PHP script:
unset LANG LANGUAGE LC_CTYPE
export LANG=en_GB.UTF-8 LANGUAGE=en LC_CTYPE=en_GB.UTF-8
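For illustration, the two lines above could be folded into a small wrapper like the one below. This is only a sketch under assumptions: the function name run_with_utf8_locale and the script path in the usage comment are hypothetical, NLS_LANG is copied from the question, and unsetting LC_ALL is an extra precaution not in the original answer (LC_ALL, when set, overrides LC_CTYPE and LANG).

```shell
#!/bin/bash
# Hypothetical wrapper for the cron job; the function name is illustrative only.

run_with_utf8_locale() {
    # Run in a subshell so the caller's own environment is left untouched.
    (
        # Drop any locale settings inherited from cron.
        # LC_ALL is an addition here: if set, it would override LC_CTYPE and LANG.
        unset LANG LANGUAGE LC_CTYPE LC_ALL
        export LANG=en_GB.UTF-8 LANGUAGE=en LC_CTYPE=en_GB.UTF-8
        # Oracle client character set, taken from the question.
        export NLS_LANG="ENGLISH_UNITED KINGDOM.AL32UTF8"
        # Execute whatever command was passed in, with its arguments intact.
        "$@"
    )
}

# Example usage from the script that cron calls (the path is an assumption):
# run_with_utf8_locale /usr/bin/php /path/to/proc_json_cache.php
```

To check the wrapper from a terminal, starting it with an almost empty environment (for example via env -i /bin/bash followed by the wrapper's path) roughly approximates what cron provides, so an encoding difference should show up without waiting for the scheduled run.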