OpenOffice (OO) won't start on my old laptop, but I regularly get documents (MS Word, OO) with a simple structure.

Are there any lightweight converters that turn doc into html, or at least fish the text out?


Re: doc -> html

I don't remember exactly, but I think there was a program called catdoc that did just that: it "fished the text out".
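
For what it's worth, catdoc prints the extracted text on standard output, so a minimal sketch could look like the one below. The names `txt_name` and `doc2txt` and the `.txt` naming rule are made up for illustration; they are not part of catdoc.

```shell
#!/bin/sh
# Hypothetical helper: derive the output name, e.g. report.doc -> report.txt
txt_name() {
    printf '%s\n' "${1%.doc}.txt"
}

# Hypothetical wrapper around catdoc; fails cleanly if catdoc is missing.
doc2txt() {
    if ! command -v catdoc >/dev/null 2>&1; then
        echo "catdoc is not installed" >&2
        return 127
    fi
    # catdoc writes the document text to stdout; redirect it into a file
    catdoc "$1" > "$(txt_name "$1")"
}
```

After sourcing this, `doc2txt report.doc` should leave the text in `report.txt`, assuming catdoc can read the file.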

ukez ()

Re: doc -> html

$ apt-cache show antiword
Package: antiword
Priority: optional
Section: text
Installed-Size: 500
Maintainer: Bdale Garbee <>
Architecture: i386
Version: 0.32-2
Depends: libc6 (>= 2.2.4-4)
Filename: pool/main/a/antiword/antiword_0.32-2_i386.deb
Size: 88490
MD5sum: 7c19befb191b9a5a88e77a7e87310d3e
Description: Converts MS Word files to text and ps
 Antiword is a free MS Word reader.
 It converts the binary files from MS Word 6, 7, 97 and 2000 to text and
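
A rough sketch of how antiword might be wrapped for the two output formats the description mentions. The function name `antiword_convert`, the output naming and the `a4` paper size are assumptions for illustration; `-p` is antiword's option for PostScript output with a given paper size.

```shell
#!/bin/sh
# Hypothetical wrapper: convert one Word file with antiword.
# $1 = input file, $2 = "text" (default) or "ps"
antiword_convert() {
    in=$1
    case "${2:-text}" in
        text) ext=txt; paper= ;;
        ps)   ext=ps;  paper=a4 ;;   # assumed paper size
        *)    echo "unknown format: $2" >&2; return 2 ;;
    esac
    if ! command -v antiword >/dev/null 2>&1; then
        echo "antiword is not installed" >&2
        return 127
    fi
    out="${in%.doc}.$ext"
    if [ -n "$paper" ]; then
        # -p selects PostScript output for the given paper size
        antiword -p "$paper" "$in" > "$out"
    else
        # default mode: plain text on stdout
        antiword "$in" > "$out"
    fi
}
```

With this, `antiword_convert report.doc` would write `report.txt` and `antiword_convert report.doc ps` would write `report.ps`.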

anonymous ()
In reply to: Re: doc -> html by anonymous

Re: doc -> html

$ apt-cache show catdoc
Package: catdoc
Priority: optional
Section: text
Installed-Size: 636
Maintainer: Pawel Wiecek <>
Architecture: i386
Version: 0.91.5-1.woody3
Depends: libc6 (>= 2.2.4-4)
Suggests: wish
Filename: pool/main/c/catdoc/catdoc_0.91.5-1.woody3_i386.deb
Size: 66898
MD5sum: 94f0f2f0bccb8abbed2f70fd70d8d9f1
Description: MS-Word to TeX or plain text converter
 This program extracts text from MS-Word files, trying to preserve
 as many special printable characters as possible. catdoc supports
 everything up to Word-97.
 It doesn't even try to preserve fancy Word formatting, because
 Word users usually don't care about document structure, and it is
 this very thing which is important to LaTeX users.
 Also provided is xls2csv, which extracts data from Excel spreadsheets
 and outputs it in comma-separated-value format.
 This package suggests tk because it also includes wordview, an
 optional Tk-based GUI for catdoc.  The MIME config provided in this
 package will use wordview if X is running, or catdoc directly if it
 is not.
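
The "wordview if X is running, otherwise catdoc" choice described above can be imitated by hand. This is only a sketch: it treats a non-empty `DISPLAY` variable as "X is running", which is a common heuristic and not necessarily what the package's MIME config actually does, and the function name `pick_word_viewer` is made up.

```shell
#!/bin/sh
# Hypothetical helper: print the name of the viewer to use.
# Assumption: a non-empty DISPLAY means an X session is available.
pick_word_viewer() {
    if [ -n "${DISPLAY:-}" ]; then
        echo wordview
    else
        echo catdoc
    fi
}
```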

anonymous ()

Re: doc -> html

wvHtml(1)                                                            wvHtml(1)

NAME
       wvHtml - convert msword documents to HTML4.0

SYNOPSIS
       wvHtml in_word_doc out_html_doc

DESCRIPTION
       wvHtml  converts  word documents into W3C certified HTML4.0 format. You
       can use Netscape or some other browser to then view your docs.

SEE ALSO
       wvAbw(1), wvWare(1), wvLatex(1),  wvCleanLatex(1),  wvPS(1),  wvDVI(1),
       wvPDF(1), wvText(1), wvWml(1), wvMime(1), catdoc(1), word2x(1)

AUTHOR
       Dom Lachowicz (current author and maintainer)
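
Going by the synopsis above, a whole directory of documents could be converted with a small loop. This is a sketch only: `html_name` and `convert_dir` are made-up names, and it changes into the directory first so that wvHtml only ever sees bare file names.

```shell
#!/bin/sh
# Hypothetical helper: report.doc -> report.html
html_name() {
    printf '%s\n' "${1%.doc}.html"
}

# Hypothetical batch wrapper: convert every .doc file in a directory.
convert_dir() {
    dir=${1:-.}
    if ! command -v wvHtml >/dev/null 2>&1; then
        echo "wvHtml is not installed" >&2
        return 127
    fi
    (
        cd "$dir" || exit 1
        for f in *.doc; do
            [ -e "$f" ] || continue   # the glob matched nothing
            # synopsis from the man page: wvHtml in_word_doc out_html_doc
            wvHtml "$f" "$(html_name "$f")" || echo "failed: $f" >&2
        done
    )
}
```

Calling `convert_dir ~/docs` would then leave a `.html` file next to each `.doc` file, as long as wvHtml can read them.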

Keiko ()
