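PHP Tesseract OCR is a C++ extension for PHP, used for character recognition and for learning OCR in a PHP environment. This article describes in detail how to install Tesseract, PHP-CPP, and the PHPTesseract extension on Linux and OSX.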
1. Installing Tesseract 4.0.0
PHPTesseract is developed against Tesseract 4.0.0, so Tesseract 4.0.0 or later is required.
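If Tesseract is already present on the machine, it may be worth checking its version before going further. The following is a minimal sketch, not part of PHPTesseract: the helper names version_ge and tesseract_version are made up for illustration, and it assumes that sort supports the -V (version sort) option and that the first line of `tesseract --version` has the form "tesseract X.Y.Z".

```shell
#!/bin/sh
# Succeeds (exit status 0) when version $1 is greater than or equal to version $2.
# Assumption: `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: takes the second word of the first line printed by
# `tesseract --version` and strips a leading "v" if there is one.
tesseract_version() {
    tesseract --version 2>&1 | head -n 1 | awk '{print $2}' | sed 's/^v//'
}

if command -v tesseract >/dev/null 2>&1; then
    installed="$(tesseract_version)"
    if version_ge "$installed" "4.0.0"; then
        echo "tesseract $installed is new enough"
    else
        echo "tesseract $installed is older than 4.0.0"
    fi
else
    echo "tesseract is not installed yet"
fi
```

If the script reports an older version, or none at all, continue with the installation steps for your system.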
OSX:
Make sure Homebrew (the brew command) is installed before you start.
# brew install --with-training-tools tesseract
Ubuntu:
# apt-get install tesseract-ocr
CentOS:
Step 1: Install the build dependencies:
# yum install autoconf automake libtool libjpeg-devel libpng-devel libtiff-devel zlib-devel
The following is only needed on releases older than CentOS 7. On those releases yum provides autoconf 2.63, but building Tesseract 4.0.0 requires 2.69, so autoconf has to be upgraded by hand first. Check the autoconf version; if it is already 2.69 or later, skip ahead to Step 2.
# rpm -qf /usr/bin/autoconf
Remove the old package:
# rpm -e --nodeps autoconf-2.63
Install autoconf 2.69:
# wget ftp://ftp.gnu.org/gnu/autoconf/autoconf-2.69.tar.gz
# tar zxvf autoconf-2.69.tar.gz
# cd autoconf-2.69
# ./configure
# make
# make install
Install autoconf-archive:
# wget https://mirror.sergal.org/gnu/autoconf-archive/autoconf-archive-2016.09.16.tar.xz
# xz -d autoconf-archive-2016.09.16.tar.xz
# tar -xvf autoconf-archive-2016.09.16.tar
# cd autoconf-archive-2016.09.16
# ./configure
# make
# make install
Step 2: Install Leptonica:
# wget http://www.leptonica.org/source/leptonica-1.74.4.tar.gz
# tar zxvf leptonica-1.74.4.tar.gz
# cd leptonica-1.74.4/
# ./configure --prefix=/usr/local/leptonica
# make
# make install
Step 3: Configure the environment variables:
# vi /etc/profile
Add:
PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/leptonica/lib/pkgconfig
export PKG_CONFIG_PATH
CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:/usr/local/leptonica/include/leptonica
export CPLUS_INCLUDE_PATH
C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/leptonica/include/leptonica
export C_INCLUDE_PATH
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/leptonica/lib
export LD_LIBRARY_PATH
LIBRARY_PATH=$LIBRARY_PATH:/usr/local/leptonica/lib
export LIBRARY_PATH
LIBLEPT_HEADERSDIR=/usr/local/leptonica/include/leptonica
export LIBLEPT_HEADERSDIR
Then reload the profile:
# source /etc/profile
Step 4: Install a compiler with C++11 support (skip this if GCC has already been upgraded to 4.8.1).
See: https://blog.csdn.net/redfivehit/article/details/77275960
Step 5: Install Tesseract:
# git clone https://github.com/tesseract-ocr/tesseract.git
# cd tesseract
# ./autogen.sh
# ./configure --prefix=/usr/local/tesseract --with-extra-libraries=/usr/local/leptonica/lib
# make
# make install
# make training
# make training-install
Step 6: Configure the Tesseract environment variables:
# vi /etc/profile
Add:
PATH=$PATH:/usr/local/tesseract/bin
export PATH
export LC_ALL=C
Then reload the profile:
# source /etc/profile
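After the CentOS steps, a quick check can confirm that the toolchain and the freshly built binaries are visible in the current shell. This is only a sketch: have_cmd is a hypothetical helper, and it assumes that Leptonica installed a pkg-config file named lept.pc and that /etc/profile has been reloaded.

```shell
#!/bin/sh
# Prints whether a command is reachable through PATH.
# Returns non-zero when the command is missing.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Build tools: autoconf should report 2.69 or later, gcc 4.8.1 or later.
have_cmd autoconf && autoconf --version | head -n 1
have_cmd gcc && gcc -dumpversion

# Leptonica: pkg-config should find it through PKG_CONFIG_PATH.
# Assumption: the module is registered under the name "lept".
if have_cmd pkg-config; then
    pkg-config --modversion lept || echo "lept.pc not found; check PKG_CONFIG_PATH"
fi

# Tesseract itself: print the version and the first few installed languages.
if have_cmd tesseract; then
    tesseract --version 2>&1 | head -n 1
    tesseract --list-langs 2>&1 | head -n 5
fi
```

A "missing" line usually means that a step was skipped or that the matching variable in /etc/profile has not taken effect in this shell.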
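2. Installing PHP-CPP
Before installing PHP-CPP, add the php-config of your current PHP environment to the PATH environment variable. If running php-config already prints its output correctly, you can skip this.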
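One way to make php-config reachable is sketched below. The directory /usr/local/php/bin is only a placeholder for wherever your PHP build keeps its binaries, and path_append is a hypothetical helper, not something shipped with PHP or PHP-CPP.

```shell
#!/bin/sh
# Appends a directory to PATH only when it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$PATH:$1" ;;
    esac
    export PATH
}

# Hypothetical location; replace with the bin directory of your own PHP build.
PHP_BIN_DIR="/usr/local/php/bin"
path_append "$PHP_BIN_DIR"

if command -v php-config >/dev/null 2>&1; then
    # Both options are standard php-config flags.
    php-config --version
    php-config --extension-dir
else
    echo "php-config still not found; check PHP_BIN_DIR"
fi
```

This change only lasts for the current shell session. To keep it, the same PATH line would go into /etc/profile, in the same way as the Tesseract variables.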
# git clone https://github.com/CopernicaMarketingSoftware/PHP-CPP.git
# cd PHP-CPP
# make
# sudo make install
On OSX, any warnings that appear during compilation can be ignored.
3. Installing the PHPTesseract extension
Building PHPTesseract requires GCC 4.8 or later.
# git clone https://github.com/2654709623/php-tesseract-ocr.git
# cd php-tesseract-ocr
# make
# sudo make install
On OSX, any warnings that appear during compilation can be ignored.
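After a successful install, the extension's installation directory is printed. Add extension=<your extension install directory>/tesseract.so to php.ini. PHPTesseract is now installed; next, let's explore how to use it and learn about OCR.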
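The php.ini step can be double-checked from the command line. The sketch below rests on two assumptions that the text above does not confirm: that the extension was installed into the directory reported by `php-config --extension-dir`, and that the loaded module shows up in `php -m` under a name containing "tesseract". The helper extension_line is hypothetical.

```shell
#!/bin/sh
# Builds the php.ini directive for a given extension directory.
extension_line() {
    printf 'extension=%s/tesseract.so\n' "$1"
}

if command -v php-config >/dev/null 2>&1 && command -v php >/dev/null 2>&1; then
    # Assumption: the install step used PHP's default extension directory.
    ext_dir="$(php-config --extension-dir)"
    echo "Add this line to php.ini:"
    extension_line "$ext_dir"

    # Shows which php.ini files the command-line PHP actually loads.
    php --ini

    # Assumption: the module name contains "tesseract".
    if php -m | grep -i tesseract >/dev/null; then
        echo "extension is loaded"
    else
        echo "extension is not loaded yet"
    fi
else
    echo "php or php-config not found on PATH"
fi
```

If the directory printed by the install step differs from the one shown here, use the one from the install output. A web server or PHP-FPM may read a different php.ini than the command-line PHP and usually needs a restart to pick up the change.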