html
pdf produced by pdflatex. Sometimes it is necessary to give collaborators a version of a report that can be edited outside of LaTeX. Tests of latex2rtf and hevea have shown that these are not adequate for reports that incorporate advanced features such as latex(describe()) output. One of the most reliable approaches is to use TtH to convert from LaTeX to html (but see TeX4ht below, which is probably better). However, if you need to convert tables that, unlike latex(describe(...)) output, do not contain pictures, hevea is the fastest and usually the best approach. This is an option in the html function of the Hmisc package.
pdf, and you can send all the individual pdf graphics files to collaborators or send the single large pdf report file. For reproducible research, collaborators should not edit graphics files; statisticians should attempt to make graphics publication-ready if they are to be used in a manuscript or grant proposal. Tables usually present the bigger challenge, because collaborators often need to extract tables into a Word document and sometimes need to reformat the tables. This is not what reproducible research is about. Statisticians need to produce tables in nearly final format so that if any data or computational methods are changed the table can be re-exported and re-inserted into a collaborator's document with very little manual intervention.
LaTeX is the gold standard for producing advanced tables. No other approach can handle the nuances that LaTeX can handle, and we have an abundance of R functions for producing LaTeX tables. (As an aside, pandoc is an excellent way to convert simple tables, but table formats are constrained to the very simple patterns supported by the markdown language.) Converting tables from LaTeX to html is the best approach for working with non-LaTeX users. There are two global choices:
Convert the entire .tex file to html and send this to the collaborator along with all graphics files and the html css stylesheet produced by the conversion software. Although rendering in html is good overall, often the R code and some of the pdf graphics do not come out right, which will cause confusion.
Put the tables in a .tex file that is external to the main report file. Here is an example:
Run htlatex tables.tex to produce tables.html, tables.css, and any needed graphics files, then zip up these new files and send them to the collaborator to open directly in Word. See the comment below about instructions for unzipping the archive. htlatex is part of the TeX4ht package on linux, Mac OS, and Windows.
Example output may be found in the attached htlatex.pdf, a pdf file produced just by printing to pdf from a browser. This output will look the same in Word once the html file is inserted into a Word document or opened whole, and all table elements are editable.
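The external-tables workflow described above might be scripted roughly as follows. This is only a sketch: the function names bundle_files and make_bundle are hypothetical, the `*x.png` pattern for generated images is borrowed from the l2h script later on this page, and the script assumes htlatex and zip are on the PATH (it stops with a message otherwise).

```shell
#!/bin/sh
# Sketch only: bundle_files and make_bundle are hypothetical helper names,
# not part of TeX4ht.

# Print the files a collaborator needs for base name $1:
# the html file, its css stylesheet, and any png images htlatex generated.
bundle_files() {
    base=$1
    for f in "$base.html" "$base.css" "$base"*x.png; do
        # An unmatched glob stays literal, so only print files that exist.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Convert $1.tex with htlatex, then zip the results into $1.zip.
make_bundle() {
    base=$1
    if ! command -v htlatex >/dev/null 2>&1 || ! command -v zip >/dev/null 2>&1; then
        echo "htlatex or zip not found; nothing done" >&2
        return 1
    fi
    htlatex "$base.tex" || return 1
    # zip -@ reads the list of file names from standard input.
    bundle_files "$base" | zip -@ "$base.zip"
}
```

With tables.tex in the current directory, `make_bundle tables` would leave tables.zip ready to send; the collaborator should extract the whole archive into one folder before opening tables.html in Word.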
You can even do some of the latter steps from within R:
As of Hmisc version 3.15-1 you can use the html function to run htlatex automatically to create html files:
html
Tables for Insertion by Collaborators into Word
If you use an Hmisc summary* function without specifying dotchart=TRUE to latex.summary.formula, the fastest and best way to convert a LaTeX table that is in its own .tex file is to use hevea. There is an option in the Hmisc html.latex function for using hevea. Here is an example:
In a knitr document you could have the following after the @ that ends the chunk to also include the LaTeX typeset table:
\input{/tmp/a}
TtH (thanks to Ben Bolker for providing most of the code). Here are the steps needed to get going.
Install tth:
sudo apt-get install tth
If you have picture code in documents as produced by latex.describe, install netpbm to create the ppmtogif executable:
sudo apt-get install netpbm
Download the sweave2html script into ~/bin and make it executable:
cd ~/bin
wget https://biostat.app.vumc.org/wiki/pub/Main/SweaveConvert/sweave2html -nv
(Note: type the web address, not the contents of the link.)
chmod u+x sweave2html
To convert a .tex file to html and create all the needed graphics files, do the following.
Put this in the .Rnw or .nw file: \SweaveOpts{prefix.string=graphics/plot}
Change to the directory where the .tex file is located (using the command 'cd ~username/pathname').
Open the resulting html file in konqueror and copy and paste it into an OpenOffice document and save in a variety of formats including Word. Use Select All (control+a), cut and paste, and all graphics and table formats will be preserved.
It is important to give your collaborator all the .pdf files in the graphics directory to use in manuscripts; do not let them use the lower resolution graphs that will be included in the filename.html document. Bundle all the necessary files to send to the collaborator, using for example
zip /tmp/z.zip foo.pdf foo.html *.gif graphics/*.pdf graphics/*.png
E-mail /tmp/z.zip as an attachment.
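Because the high-resolution pdf graphics are the part most easily forgotten, a small wrapper around the zip command above could check for them first. The following is a sketch under assumptions: count_pdf_graphics and bundle_report are hypothetical names, and the graphics directory name comes from the \SweaveOpts prefix.string setting shown earlier.

```shell
#!/bin/sh
# Sketch only: count_pdf_graphics and bundle_report are hypothetical helper names.

# Print the number of .pdf files directly inside directory $1.
count_pdf_graphics() {
    n=0
    for f in "$1"/*.pdf; do
        # An unmatched glob stays literal, so test that the file exists.
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Zip report $1 (foo means foo.pdf and foo.html) plus graphics into zip file $2.
bundle_report() {
    if [ "$(count_pdf_graphics graphics)" -eq 0 ]; then
        echo "warning: no pdf graphics found in graphics/" >&2
    fi
    if ! command -v zip >/dev/null 2>&1; then
        echo "zip not found" >&2
        return 1
    fi
    # Same file list as the zip command above.
    zip "$2" "$1.pdf" "$1.html" ./*.gif graphics/*.pdf graphics/*.png
}
```

Run from the report directory, `bundle_report foo /tmp/z.zip` would produce the same archive as the manual command while warning if the graphics directory holds no pdf files.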
TeX4ht is a LaTeX to html converter. It may be installed easily using apt-get. For Windows go here and note that many of the changes discussed there are not needed.
In one test TeX4ht performed well (including Greek letters and superscripts and LaTeX picture environments) although I did not see how to get postscript or pdf graphics to appear in the final output. Advanced summary.formula.reverse tables are handled nearly perfectly, including those that contain micro dot charts. TeX4ht is used as follows:
htlatex foo.tex # produces foo.html
mk4ht oolatex foo.tex # produces an OpenOffice .sxw file
Note that the tth package has to be installed for htlatex to run completely.
My test of the oolatex option resulted in output that was not as good as running htlatex and opening the resulting .html file in OpenOffice. See StatReport for more information and example output, and note its comment about turning off picture links in the OpenOffice document.
Put the following script, named l2h, in your ~/bin directory:
htlatex $1.tex
rm -f $1.idv $1.lg $1.tmp $1.4tc $1.xref $1.4ct
zip /tmp/$$.zip $1.html $1.css $1*x.png
oowriter $1.html
echo "pack [button .h -text \"/tmp/$$.zip contains html and related files for\ncollaborator to unpack into one folder, or:\n\nClick Edit ... Links ... Break Link\nClick View and uncheck Notes, then\nSave as Word 97/2000/XP and exit OpenOffice\" -command exit]" | wish
rm -f $1*x.png ${1}2.html $1.dvi
Run l2h my to convert my.tex to html and open OpenOffice to save it in Word 97/2000/XP. A popup will give you some pointers, such as unlinking pictures so that the document is self-contained if you e-mail it to someone. Examples are attached (see below for intro.tex and intro.doc). This process gives you two options. First, you can e-mail your collaborator the .zip file the script creates in /tmp. Second, you can go ahead and save the result as a Word 97 document.
Try having your collaborator use the html approach first. html files can be opened directly in Word, and Word will use the html style sheet (css file) that is included in the zip file. In addition to the html and css files, you will need to include any png image files created by htlatex. Further investigation has shown that OpenOffice and LibreOffice lose some font attributes but that opening the html directly in Word preserves them. So zipping the html, css, and png files and sending the zip file to the collaborator is the best approach. Be sure to tell your collaborator not to open the html file from within the zip archive but to extract all the files from the archive into a folder; otherwise Word will not find the image files.
If you do not use many LaTeX packages, your tables are not complex, and you do not make major use of equations, a faster approach is to install the latex2rtf package, which very quickly converts from LaTeX to rich text format (rtf), using a command such as latex2rtf -o my.rtf my.tex.
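One rough way to judge whether a document is simple enough for latex2rtf is to count the packages it loads and the equation environments it contains. The sketch below does this with grep; the function names and the thresholds (at most 3 packages, no equation environments) are hypothetical choices for illustration, not recommendations from the tests reported here, and table complexity is not checked at all.

```shell
#!/bin/sh
# Sketch only: count_matches, suggest_converter, and the thresholds are
# hypothetical; adjust them to your own documents.

# Print the number of lines in file $2 that match grep pattern $1.
count_matches() {
    # grep -c prints 0 but exits non-zero when nothing matches.
    grep -c -- "$1" "$2" || true
}

# Print "latex2rtf" for simple documents and "htlatex" otherwise.
suggest_converter() {
    tex=$1
    pkgs=$(count_matches '\\usepackage' "$tex")
    eqs=$(count_matches '\\begin{equation' "$tex")
    if [ "$pkgs" -le 3 ] && [ "$eqs" -eq 0 ]; then
        echo latex2rtf
    else
        echo htlatex
    fi
}
```

For example, `suggest_converter my.tex` prints the name of the tool to try first; a document flagged for htlatex can still be tried with latex2rtf, since the check is only a heuristic.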
You can create a file that can be opened in Firefox that beautifully renders equations without resorting to graphics by using MathML. The attached intro.xhtml was created by running mk4ht xhmlatex intro.tex then renaming intro.html as intro.xhtml. We don't currently know how to make OpenOffice open such files. To properly view intro.xhtml you have to save it to a local file so you can point to it outside of foswiki.
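The two steps used to create intro.xhtml (run mk4ht xhmlatex, then rename the output) can be combined in a small helper. This is a sketch: xhtml_name and make_xhtml are hypothetical names, and it assumes mk4ht from TeX4ht is installed.

```shell
#!/bin/sh
# Sketch only: xhtml_name and make_xhtml are hypothetical helper names.

# Print the .xhtml file name corresponding to an .html file name.
xhtml_name() {
    printf '%s\n' "${1%.html}.xhtml"
}

# Run mk4ht xhmlatex on $1.tex, then rename $1.html to $1.xhtml.
make_xhtml() {
    base=$1
    if ! command -v mk4ht >/dev/null 2>&1; then
        echo "mk4ht not found" >&2
        return 1
    fi
    mk4ht xhmlatex "$base.tex" || return 1
    mv "$base.html" "$(xhtml_name "$base.html")"
}
```

Calling `make_xhtml intro` in the directory containing intro.tex would leave intro.xhtml ready to open in Firefox.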
Sweave. Here is how to run an example (in linux), after installing the odfWeave package and the latest OpenOffice. Open /tmp/out.odt in OpenOffice Writer. The file can then be exported to open document or Word format. Note: On some systems the correct file name will be /usr/lib/R/site-library/odfWeave/examples/examples.odt.
This approach does not allow you to use the advanced table making capabilities of Hmisc that rely on LaTeX.
R2HTML package to produce .html reports.
Install ooconvert, tth, and tex4ht. To install ooconvert:
cd /tmp
bunzip2 ooconvert-*
tar xvf ooconvert-*
# You may need to edit line to change python2.3 to python
sudo chmod a+x ooconvert
sudo mv ooconvert /usr/local/bin
(or move it to ~/bin)
Put the following script, named ltx2doc, in ~/bin and chmod +x to make it executable:
mk4ht oolatex $1.tex
rm -f $1.css $1.idv $1.lg $1.tmp $1.4tc $1.xref $1.4ct
ooconvert $1.odt $1.doc
rm $1.odt
Run ltx2doc foo to convert foo.tex to foo.doc.
But see above for a better approach through html and OpenOffice.
Create a pdf, then use this server to convert it to .doc or .rtf, which will be e-mailed to you. Here is a great example: pdflatex output converted to Word (zip file) then back to pdf using Word or using pdftoword.com.
Also try http://zamzar.com. The result on the above test file was not nearly as good as with pdftoword (except for the complex summary 'reverse' tables!).
A test on 2015-11-14 on spaper.pdf got better results with pdftoword than with zamzar, but ggplot2 graphics were converted to editable characters (parts of which rendered correctly and parts didn't). http://pdfonline.com produced perfect html, but using Word on the html file was a total mess. It claims to convert to Word directly but really converts to defective rtf. On spaper.tex, tth (using a new shell script knitr2html) rendered well but would not recognize \begin{supp}...\end{supp}. htlatex has a bug related to BibLaTeX.
Doesn't work: freepdfconvert, smallpdf.com (partially worked), doc.zone, convertonlinefree.com, pdf2doc, convertpdftoword.net, pdfpublisher, wondershare.net pdfelement (failed to run using wine).
formswift.com rendered perfectly in their online editor after converting but required a credit card in order to download the Word document.
More information is available at http://www.freewaregenius.com/2010/03/06/how-to-convert-pdf-to-word-doc-for-free-a-comparative-test. See especially Nuance, which costs $ and runs only on Windows and Mac.
| Attachment | Size | Date | Who | Comment |
|---|---|---|---|---|
| htmlWeave.pdf | 60.8 K | 26 Jul 2006 - 22:59 | FrankHarrell | Automating Reports with Sweave by Greg Snow |
| intro.tex | 8.5 K | 06 May 2009 - 19:19 | FrankHarrell | LaTeX test file to try with l2h |
| htlatex.pdf | 31.9 K | 17 Sep 2014 - 18:00 | FrankHarrell | Output of htlatex after printing the page to pdf from a browser |
| intro.doc | 69.5 K | 06 May 2009 - 19:20 | FrankHarrell | Result of l2h intro after telling OpenOffice to save in Word format |
| intro.xhtml | 41.7 K | 17 May 2009 - 11:57 | FrankHarrell | Result of mk4ht xhmlatex intro.tex then renaming intro.html to intro.xhtml |
| sweave2html | 0.6 K | 14 May 2009 - 10:44 | WillGray | Sweave |