A Small Change in Installing SRILM

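  Yesterday a reader, Fanlc, left a comment under "A Detailed Guide to Configuring SRILM on 64-bit Ubuntu" (《Ubuntu 64位系统下SRILM的配置详解》): "Why does the 1.5.10 version I downloaded have no test folder? There is none after compiling either... how am I supposed to test it?" I did not have SRILM 1.5.10 on hand, so I downloaded a copy to take a look. There is indeed no test folder in the top-level directory. Comparing it against the SRILM 1.5.9 directory tree showed that this is a small new change.
  I compared the INSTALL files that ship with SRILM 1.5.9 and 1.5.10. Here is the diff output: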

103c103
< 7 - To test the compiled tools, change into the $SRILM/test directory and run
---
> 7 - To test the compiled tools, run
105c105,107
< gnumake all
---
> gnumake test
>
> from the top-level directory.
109,110c111,113
< reported, examine the output files in $SRILM/test/output and compare them
---
> reported, examine the output files in $SRILM/<subdir>/test/output and
> compare them to the corresponding files in $SRILM/<subdir>/test/reference,
> where <subdir> is a subdirectory name (lm, flm, lattice).
157c160
< $Date: 2009/06/28 09:12:45 $
---
> $Date: 2009/12/02 19:39:04 $

  The main change is in step 7, the test step. It used to be:

cd test
make all

  It is now:

make test

  In 1.5.10 the test directory is no longer at the top level. Instead there is one under each of $SRILM/lm, $SRILM/flm, and $SRILM/lattice.
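
  The layout difference can also be detected mechanically. The following is a minimal sketch, not something that ships with SRILM: the function name srilm_test_command is made up for illustration, and it relies only on what the INSTALL diff says, namely that older trees have a top-level test directory and newer ones do not.

```shell
#!/bin/sh
# Hypothetical helper (not part of SRILM): print the test commands that
# match the layout of the SRILM tree whose root is given as $1.
srilm_test_command() {
    root=$1
    if [ -d "$root/test" ]; then
        # 1.5.9-style layout: the tests live in the top-level test directory.
        echo "cd test && make all"
    else
        # 1.5.10-style layout: the tests live under lm, flm and lattice.
        echo "make test"
    fi
}
```

  Calling srilm_test_command "$SRILM" from a script prints the commands to run from the top-level directory for that tree.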
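  In the evening I tried building SRILM 1.5.10 on a fresh machine. The procedure is unchanged: check and install the dependencies, edit the Makefile, and build with make World. Only the test step changes to "make test", which likewise prints a long run of IDENTICAL results and a few DIFFERS.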
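
  With that many result lines scrolling past, a quick tally helps. Here is a rough sketch that assumes only that each comparison line in the saved test output contains the word IDENTICAL or DIFFERS. The function name summarize_test_log is my own, and the exact wording of the other parts of each line may differ between SRILM versions.

```shell
#!/bin/sh
# Hypothetical helper: summarize a saved test log read from standard input.
# Counts lines containing IDENTICAL or DIFFERS and echoes the DIFFERS lines
# so they can be inspected one by one.
summarize_test_log() {
    awk '
        /IDENTICAL/ { same++ }
        /DIFFERS/   { diff++; lines = lines $0 "\n" }
        END {
            printf "IDENTICAL: %d\nDIFFERS: %d\n", same, diff
            printf "%s", lines
        }
    '
}
```

  One way to use it is to save the output first, for example with make test 2>&1 | tee test.log, and then run summarize_test_log < test.log.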
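  I am writing this down as a memo and as a heads-up for readers who may run into the same issue later. Finally, thanks to reader Fanlc for pointing it out!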

Note: this is an original article. When reposting, please credit the source, "我爱自然语言处理": www.52nlp.cn

Permalink: http://www.52nlp.cn/安装Srilm的一点新变化

Posted in: Machine Translation, Language Model.

22 comments on "A Small Change in Installing SRILM"

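  1. Fanlc says:

    Thank you, the tests pass now. Before that I had come up with a workaround of my own, a slightly clumsy one: download an older release and copy its test folder over. I downloaded 1.4.6, copied its test directory in, and the tests succeeded as well. Your method is much more convenient, though. Thanks.

    52nlp replies:

    Ha, that approach works too!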

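  2. li_bopr says:

    With newer versions, just run make test.

    Also, with srilm 1.5.10 + the Moses 2010-04-26 release, this step ran fine:
    ./regenerate-makefiles.sh

    But this step fails with "Cannot find SRILM!":
    ./configure --with-srilm=/home/bopr/tools/srilm --with-irstlm=/home/bopr/tools/irstlm
    I have been stuck on it for ages and cannot find the cause.

    The system is ubuntu9.10. SRILM's tests show only two DIFFERS and ngram works well, so SRILM itself should be fine.

    The environment variables have also been added to /etc/profile.

    I am almost done. My QQ is 363954866 and my email is li_bopr at 126.com. I would be very grateful if an expert could help!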

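  3. li_bopr says:

    I am told it is an "oom" compilation problem in SRILM. What went wrong?

    52nlp replies:

    From your description it should be a path problem. SRILM itself is fine.
    As for the "oom compilation problem in SRILM", I do not quite follow. It would be best to paste the detailed error output here.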

  4. li_bopr says:

    Many thanks to the admin!
    My session log is below. Thank you!

    bopr@bopr-ubuntu:~$ cd tools/srilm/bin
    bopr@bopr-ubuntu:~/tools/srilm/bin$ ls
    align-with-tags i686-gcc4 nbest-error rescore-minimize-wer
    change-lm-vocab make-batch-counts nbest-rover rescore-reweight
    compare-sclite make-big-lm pfsg-from-ngram
    compute-sclite make-multiword-pfsg rescore-acoustic
    empty-sentence-lm merge-batch-counts rescore-decipher
    bopr@bopr-ubuntu:~/tools/srilm/bin$ cd i686-gcc4
    bopr@bopr-ubuntu:~/tools/srilm/bin/i686-gcc4$ ls
    add-classes-to-pfsg hits-from-log pfsg-to-dot
    add-dummy-bows htklat-vocab pfsg-to-fsm
    add-pauses-to-pfsg lattice-tool pfsg-vocab
    add-ppls log10-to-bytelog ppl-from-log
    anti-ngram make-abs-discount prettify
    bytelog-to-log10 make-diacritic-map remove-lowprob-ngrams
    classes-to-fsm make-google-ngrams replace-words-with-classes
    combine-acoustic-scores make-gt-discounts reverse-lm
    combine-rover-controls make-hiddens-lm reverse-ngram-counts
    compare-ppls make-kn-counts reverse-text
    compute-best-mix make-kn-discounts segment
    compute-best-rover-mix make-lm-subset segment-nbest
    compute-best-sentence-mix make-nbest-pfsg select-vocab
    compute-oov-rate make-ngram-pfsg sentid-to-ctm
    context-ngrams make-sub-lm sentid-to-sclite
    continuous-ngram-count merge-nbest sort-lm
    cumbin multi-ngram split-tagged-ngrams
    disambig nbest2-to-nbest1 subset-context-ngrams
    extract-skip-probs nbest-lattice subtract-ppls
    filter-event-counts nbest-mix tolower-ngram-counts
    find-reference-posteriors nbest-optimize uniform-classes
    fix-ctm nbest-posteriors uniq-ngram-counts
    fngram nbest-pron-score vp2text
    fngram-count nbest-vocab wlat-stats
    fsm-to-pfsg ngram wlat-to-dot
    get-gt-counts ngram-class wlat-to-pfsg
    get-unigram-probs ngram-count wordlat-to-lisp
    hidden-ngram ngram-merge
    bopr@bopr-ubuntu:~/tools/srilm/bin/i686-gcc4$ cd /home/bopr/tools/moses
    bopr@bopr-ubuntu:~/tools/moses$ sudo ./configure --with-srilm=/home/bopr/tools/srilm
    checking for a BSD-compatible install… /usr/bin/install -c
    checking whether build environment is sane… yes
    checking for a thread-safe mkdir -p… /bin/mkdir -p
    checking for gawk… gawk
    checking whether make sets $(MAKE)… yes
    checking for g++… g++
    checking for C++ compiler default output file name… a.out
    checking whether the C++ compiler works… yes
    checking whether we are cross compiling… no
    checking for suffix of executables…
    checking for suffix of object files… o
    checking whether we are using the GNU C++ compiler… yes
    checking whether g++ accepts -g… yes
    checking for style of include used by make… GNU
    checking dependency style of g++… gcc3
    checking how to run the C++ preprocessor… g++ -E
    checking build system type… i686-pc-linux-gnu
    checking host system type… i686-pc-linux-gnu
    checking for gcc… gcc
    checking whether we are using the GNU C compiler… yes
    checking whether gcc accepts -g… yes
    checking for gcc option to accept ISO C89… none needed
    checking dependency style of gcc… gcc3
    checking for a sed that does not truncate output… /bin/sed
    checking for grep that handles long lines and -e… /bin/grep
    checking for egrep… /bin/grep -E
    checking for fgrep… /bin/grep -F
    checking for ld used by gcc… /usr/bin/ld
    checking if the linker (/usr/bin/ld) is GNU ld… yes
    checking for BSD- or MS-compatible name lister (nm)… /usr/bin/nm -B
    checking the name lister (/usr/bin/nm -B) interface… BSD nm
    checking whether ln -s works… yes
    checking the maximum length of command line arguments… 1572864
    checking whether the shell understands some XSI constructs… yes
    checking whether the shell understands "+="… yes
    checking for /usr/bin/ld option to reload object files… -r
    checking for objdump… objdump
    checking how to recognize dependent libraries… pass_all
    checking for ar… ar
    checking for strip… strip
    checking for ranlib… ranlib
    checking command to parse /usr/bin/nm -B output from gcc object… ok
    checking how to run the C preprocessor… gcc -E
    checking for ANSI C header files… yes
    checking for sys/types.h… yes
    checking for sys/stat.h… yes
    checking for stdlib.h… yes
    checking for string.h… yes
    checking for memory.h… yes
    checking for strings.h… yes
    checking for inttypes.h… yes
    checking for stdint.h… yes
    checking for unistd.h… yes
    checking for dlfcn.h… yes
    checking whether we are using the GNU C++ compiler… (cached) yes
    checking whether g++ accepts -g… (cached) yes
    checking dependency style of g++… (cached) gcc3
    checking how to run the C++ preprocessor… g++ -E
    checking for objdir… .libs
    checking if gcc supports -fno-rtti -fno-exceptions… no
    checking for gcc option to produce PIC… -fPIC -DPIC
    checking if gcc PIC flag -fPIC -DPIC works… yes
    checking if gcc static flag -static works… yes
    checking if gcc supports -c -o file.o… yes
    checking if gcc supports -c -o file.o… (cached) yes
    checking whether the gcc linker (/usr/bin/ld) supports shared libraries… yes
    checking dynamic linker characteristics… GNU/Linux ld.so
    checking how to hardcode library paths into programs… immediate
    checking whether stripping libraries is possible… yes
    checking if libtool supports shared libraries… yes
    checking whether to build shared libraries… no
    checking whether to build static libraries… yes
    checking for ld used by g++… /usr/bin/ld
    checking if the linker (/usr/bin/ld) is GNU ld… yes
    checking whether the g++ linker (/usr/bin/ld) supports shared libraries… yes
    checking for g++ option to produce PIC… -fPIC -DPIC
    checking if g++ PIC flag -fPIC -DPIC works… yes
    checking if g++ static flag -static works… yes
    checking if g++ supports -c -o file.o… yes
    checking if g++ supports -c -o file.o… (cached) yes
    checking whether the g++ linker (/usr/bin/ld) supports shared libraries… yes
    checking dynamic linker characteristics… GNU/Linux ld.so
    checking how to hardcode library paths into programs… immediate
    checking for XMLRPC-C… ignored
    configure: trace enabled (default)
    configure: Building non-threaded moses. This will disable the moses server
    checking Ngram.h usability… yes
    checking Ngram.h presence… yes
    checking for Ngram.h… yes
    checking for trigram_init in -loolm… no
    configure: error: Cannot find SRILM! I am stuck here
    bopr@bopr-ubuntu:~/tools/moses$

    The relevant part of the configure file:

    if test "x$ac_cv_lib_oolm_trigram_init" = x""yes; then :
    cat >>confdefs.h <<_ACEOF
    #define HAVE_LIBOOLM 1
    _ACEOF

    LIBS="-loolm $LIBS"

    else
    as_fn_error "Cannot find SRILM! I am stuck here" "$LINENO" 5
    fi

    if true; then
    SRI_LM_TRUE=
    SRI_LM_FALSE='#'
    else
    SRI_LM_TRUE='#'
    SRI_LM_FALSE=
    fi


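  5. li_bopr says:

    Solved, thanks! In the machine-type file under srilm's sbin directory, change
    else if (`uname -m` == i686) then
    set MACHINE_TYPE = i686
    to:
    else if (`uname -m` == i686) then
    set MACHINE_TYPE = i686-gcc4

    Otherwise Moses cannot find the i686-gcc4 directory and then reports that it cannot find SRILM.

    Thanks again to the admin! Thanks, 52nlp!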

    52nlp replies:

    Glad you sorted it out yourself. You are welcome.
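
    The mismatch described in this thread can be checked before running configure. Below is a minimal sketch under the assumption, taken from the error messages quoted in these comments, that the Moses configure script looks for liboolm.a in <srilm-root>/lib/<machine-type>. The function name check_srilm_lib is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check that liboolm.a exists in <root>/lib/<type>.
# $1 = SRILM root directory
# $2 = machine type, e.g. the string printed by <root>/sbin/machine-type
check_srilm_lib() {
    libdir=$1/lib/$2
    if [ -f "$libdir/liboolm.a" ]; then
        echo "ok: $libdir/liboolm.a"
    else
        echo "missing: $libdir/liboolm.a" >&2
        return 1
    fi
}
```

    A call such as check_srilm_lib /home/bopr/tools/srilm "$(/home/bopr/tools/srilm/sbin/machine-type)" would report "missing" when machine-type prints i686 but the libraries were built into lib/i686-gcc4.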

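  6. shark says:

    Hello! I am using version 1.4.6. It compiled, but every test reports DIFFERS. Could you give me some pointers on what I should change? Also, after compiling there is no ngram executable, although the library files are there in the lib directory. Thanks. My QQ number is 1072787885. If it is convenient, please add me and help me solve this problem. Thank you.

    52nlp replies:

    "No ngram after compiling" suggests the build did not actually succeed. Rebuild following the steps above: check and install the dependencies, edit the Makefile, build with make World, and use "make test" for testing. If every step completes without errors, you should be fine.

    shark replies:

    Thanks. The error was that some of the dependencies mentioned above had not installed successfully. Thanks again to the admin!

    52nlp replies:

    Right. In my experience, many of the problems that come up when installing SRILM are related to its dependencies.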
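
    Since a missing ngram binary was the telltale sign here, a quick check after make World can catch a failed build before any tests are run. This is only a sketch: missing_srilm_tools is a made-up name, and the bin/<machine-type> location is taken from the directory listing in comment 4.

```shell
#!/bin/sh
# Hypothetical helper: report which tools are missing from <root>/bin/<type>.
# $1 = SRILM root, $2 = machine type, remaining arguments = tool names.
# Prints one missing tool per line and returns 1 if any tool is missing.
missing_srilm_tools() {
    bindir=$1/bin/$2
    shift 2
    status=0
    for tool in "$@"; do
        if [ ! -x "$bindir/$tool" ]; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}
```

    For example, missing_srilm_tools "$SRILM" i686-gcc4 ngram ngram-count prints nothing and returns 0 when both executables were built.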

  7. ironbridge says:

    I am still more used to programming with VC.
    Here is a page on compiling SRILM with VC on Windows:
    http://www.keithv.com/software/srilm/

    52nlp replies:

    Thank you very much!

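  8. wcr2011 says:

    What is the largest corpus that can be used for language model training?

    A 2.6 GB corpus with 2.4 million sentence lines cannot be used to train a 3-gram model. It reports that the corpus is too large!

    How can I train a language model on a 2.6 GB corpus?

    52nlp replies:

    For large-scale language model training, consider the make-batch-counts and merge-batch-counts commands that ship with SRILM. Search for the details; I believe someone has written about this.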
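
    As a starting point for the batch approach, the corpus first has to be cut into pieces that can be counted separately. The sketch below only does that splitting step. The name split_corpus and the chunk size are my own choices, and the make-batch-counts and merge-batch-counts invocations in the trailing comment follow the argument order given in SRILM's training-scripts manual page as I understand it, so check them against your installed version.

```shell
#!/bin/sh
# Hypothetical helper: split a large corpus into chunks of a fixed number of
# lines and print the chunk paths, one per line.
# $1 = corpus file, $2 = lines per chunk, $3 = output directory
split_corpus() {
    corpus=$1
    lines=$2
    outdir=$3
    mkdir -p "$outdir" || return 1
    split -l "$lines" "$corpus" "$outdir/part." || return 1
    ls "$outdir"/part.*
}

# Intended use with the SRILM scripts (not run here; verify the arguments
# against your SRILM version before relying on them):
#   split_corpus corpus.txt 100000 chunks > file-list
#   make-batch-counts file-list 10 /bin/cat counts -order 3
#   merge-batch-counts counts
```

    The printed list is meant to serve as the file list that make-batch-counts reads, so that no single counting run has to hold the whole corpus in memory.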

  9. xieqianlong says:

    Hello! I am using version 1.6.0 on fedora14 x86_64, with machine_type set to i686-m64. It compiled, and bin/i686-m64/ contains ngram and many other executables, but most of the tests report DIFFERS. Could you take a look at what is going on? QQ: 64四415零87. Thanks!

    xieqianlong replies:

    Here is the content of lm/test/output/ngram-count-lm.i686-m64.stderr:
    ./run-test: line 40: 25795 Illegal instruction ngram-count -debug 1 -order $order -count-lm -text $dir/eval97.text -vocab $dir/eval2001.vocab -init-lm swbd.3countlm -em-iters 10 -lm swbd.3countlm.reest
    ./run-test: line 50: 25796 Illegal instruction ngram -debug 0 -order $order -count-lm -lm swbd.3countlm.reest -vocab $dir/eval2001.vocab -ppl $dir/eval97.text
    ./run-test: line 62: 25797 Illegal instruction ngram -debug 0 -order $order -count-lm -lm swbd.3countlm.reest -mix-lm $dir/swbd.3bo $gz -lambda 0.5 -bayes 0 -vocab $dir/eval2001.vocab -ppl $dir/eval97.text

    lm/test/output/ngram-count-lm.i686-m64.stdout is empty.

    52nlp replies:

    If key executables such as ngram are present, you can ignore these DIFFERS. They depend on the environment and are hard to diagnose.

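  10. xieqianlong says:

    But when I run ngram-count I get an "Illegal instruction" error, so I think the installation did not really succeed.
    That said, I have since installed it on another machine and the tests passed there, so it may be a problem with this machine's configuration... Thanks.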

  11. Xin弦 says:

    I made the change described above in the machine-type file under srilm's sbin directory, from
    else if (`uname -m` == i686) then
    set MACHINE_TYPE = i686
    to:
    else if (`uname -m` == i686) then
    set MACHINE_TYPE = i686-gcc4
    but I still get this error:
    checking for Ngram.h… yes
    checking for trigram_init in -loolm… no
    configure: error: Cannot find SRILM’s library in /home/zhangxin/tools/srilm/lib/i686-gcc4
    The path that actually exists is /home/zhangxin/tools/srilm/libi686-gcc4, and the libi686-gcc4 directory contains libdstruct.a libflm.a liblattice.a libmisc.a libboolm.a
    I do not understand why it still says it cannot find the SRILM library. Thanks.

    52nlp replies:

    /home/zhangxin/tools/srilm/libi686-gcc4: is there no slash between lib and i686?
