YaHP Converter (Yet another Html to Pdf converter) v1.3 (21/11/2011)
New 25/11/2011! Sample Da Linux French Page
New 25/11/2011! Sample freecode.com
YaHP is a Java library that converts an html document into a pdf document.
YaHP is licensed under the LGPL (GNU).
YaHP uses a pluggable renderer system.
Previous versions had 4 renderers; in versions 1.2.19 and 1.2.19a only the renderer based on flying saucer is available. The jar is now smaller.
- The flying saucer renderer: good rendering, can set header/footer in html, can choose page size.
Source code
The source code is available through github.
Issues
You can log any issue in the dedicated section.
Old renderer properties (not available since 1.2.19):
- The swing renderer: most stable, good rendering, can set header/footer, can choose page size.
- The OpenOffice.org renderer: average rendering, cannot set header/footer nor page size (worst renderer).
- The firefox renderer: best rendering, can set header/footer, can choose page size, but requires patching and compiling a version of firefox; tested only on linux. Download a pre-compiled version of the patched firefox (compiled on ubuntu dapper): mozila-print.tar.gz (08/04/2006)
To use the openoffice.org renderer, you must first launch openoffice in listen mode like this:
oowriter "-accept=socket,port=8100;urp;"
(with the quotes), then set the correct properties (see CSimpleConversion.java).
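Since the OpenOffice.org renderer depends on that listener being up, it can help to check the socket before attempting a conversion. The following is only a minimal sketch and not part of YaHP: the class name OfficeListenerCheck and the method isListening are hypothetical, the port 8100 is taken from the command above, and try-with-resources assumes Java 7 or later.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of YaHP: checks that something accepts
// TCP connections on the port given to oowriter with "-accept=socket,port=8100;urp;".
final class OfficeListenerCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unknown host, ...
            return false;
        }
    }

    public static void main(String[] args) {
        // 8100 is the port used in the oowriter command above.
        System.out.println("OpenOffice listener reachable: "
                + isListening("localhost", 8100, 1000));
    }
}
```

A successful connection only shows that some process is listening on the port; it does not prove that the process is OpenOffice.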
To use the firefox renderer, you must first download the firefox source, firefox-1.5.0.1-source.tar.bz2, from http://www.mozilla.org/, then patch it with the files "layout.printing.nsPrintEngine.diff" and "layout.printing.nsPrintEngineH.diff", then compile it by issuing:
./configure --prefix=/yourchoosenpath/mozilla-print/ --disable-logging --disable-tests --disable-oji --disable-view-source --disable-accessibility --disable-composer --disable-ldap --enable-canvas --disable-gnomeui --enable-application=browser --with-user-appdir=.print.mozilla --enable-system-cairo
then:
make install
Copy the sh script "fireprint" to /yourchoosenpath/mozilla-print/, modify the path inside the script, then set the correct properties (see CSimpleConversion.java). --script refers to the path /yourchoosenpath/mozilla-print/fireprint.
You also need ps2pdf installed and in the path.
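Whether ps2pdf is actually visible on the path can be verified from Java before running the firefox renderer. This is a rough sketch under stated assumptions, not YaHP code: PathLookup and findOnPath are hypothetical names, and the lookup matches the exact file name only, which suits linux (the only platform the firefox renderer was tested on).

```java
import java.io.File;

// Hypothetical helper, not part of YaHP: looks for an executable by name
// in the directories listed in a PATH-style string.
final class PathLookup {

    /** Returns the first executable file named 'executable' found in pathValue, or null. */
    static File findOnPath(String executable, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, executable);
            if (candidate.isFile() && candidate.canExecute()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File ps2pdf = findOnPath("ps2pdf", System.getenv("PATH"));
        System.out.println(ps2pdf == null
                ? "ps2pdf not found in the path"
                : "ps2pdf found at " + ps2pdf.getPath());
    }
}
```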
New 21/03/2009! Sample pdf with colors/tables/embedded fonts/alignment/non-latin alphabet New! source
New 21/03/2009! Sample pdf with pagebreaks
Sample html with pagebreaks (used to generate the previous pdf)
New 21/03/2009! Sample pdf with listing tables with automatic row break
Sample html with listing tables with automatic row break (used to generate the previous pdf)
New 21/03/2009! Sample pdf with watermark
Sample html with watermark (used to generate the previous pdf)
Sample image used as watermark (used to generate the previous pdf)
New 18/03/2009! Sample Freecode
New 18/03/2009! Sample Freshmeat
New 21/03/2009! Sample This page
Javadoc
Getting started
Copy the jars from $YAHP_HOME/YaHPConverter/lib/ and $YAHP_HOME/YaHPSample/lib/ into the classpath.
Or rebuild them by issuing 'ant' in $YAHP_HOME/YaHPConverter/; after the build, the jars will be in $YAHP_HOME/YaHPConverter/Run and the dependencies in $YAHP_HOME/YaHPConverter/lib/.
Example usage: (see CSimpleConversion.java)
// new converter
CYaHPConverter converter = new CYaHPConverter();
// save pdf in outfile
File fout = new File(outfile);
FileOutputStream out = new FileOutputStream(fout);
// contains configuration properties
Map properties = new HashMap();
// list containing header/footer
List headerFooterList = new ArrayList();
// add header/footer
headerFooterList.add(new IHtmlToPdfTransformer.CHeaderFooter(
"<table width=\"100%\"><tbody><tr><td align=\"left\">"+
"Generated with YaHPConverter.</td><td align=\"right\">Page <pagenumber>/<"+
"pagecount></td></tr></tbody></table>",
IHtmlToPdfTransformer.CHeaderFooter.HEADER));
headerFooterList.add(new IHtmlToPdfTransformer.CHeaderFooter(
"© 2009 Quentin Anciaux",
IHtmlToPdfTransformer.CHeaderFooter.FOOTER));
// use the flying saucer renderer
properties.put(IHtmlToPdfTransformer.PDF_RENDERER_CLASS,
IHtmlToPdfTransformer.FLYINGSAUCER_PDF_RENDERER);
// path where the TTF files to embed are located
properties.put(IHtmlToPdfTransformer.FOP_TTF_FONT_PATH, fontPath);
// convert the document at url and write the pdf to out
converter.convertToPdf(new URL(url),
IHtmlToPdfTransformer.A4P, headerFooterList, out,
properties);
out.flush();
out.close();
To change the page size after a page break, set the "size" attribute of the <yahp:pb /> tag. Examples:
- <yahp:pb size="A4L"/>
- <yahp:pb size="29.7,21"/>
- <yahp:pb size="21,29.7,1"/>
- <yahp:pb size="21,29.7,1,0.5,0.5,1"/>
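When the html is assembled in Java, the page break tags can be generated rather than typed by hand. The sketch below is only an illustration and not part of YaHP: PageBreakMarkup, pageBreak and joinWithBreaks are hypothetical names, and the size value is passed through verbatim, so it must already be in one of the forms listed above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of YaHP: builds <yahp:pb/> tags and
// inserts them between html fragments.
final class PageBreakMarkup {

    /** Builds a page break tag; a null or empty size yields a plain <yahp:pb/>. */
    static String pageBreak(String size) {
        if (size == null || size.isEmpty()) {
            return "<yahp:pb/>";
        }
        // The size string is copied as-is into the attribute.
        return "<yahp:pb size=\"" + size + "\"/>";
    }

    /** Joins html fragments, inserting a page break with the given size between them. */
    static String joinWithBreaks(List<String> fragments, String size) {
        StringBuilder html = new StringBuilder();
        for (int i = 0; i < fragments.size(); i++) {
            if (i > 0) {
                html.append(pageBreak(size));
            }
            html.append(fragments.get(i));
        }
        return html.toString();
    }

    public static void main(String[] args) {
        // prints <p>First page</p><yahp:pb size="A4L"/><p>Second page</p>
        System.out.println(joinWithBreaks(
                Arrays.asList("<p>First page</p>", "<p>Second page</p>"), "A4L"));
    }
}
```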
!! Download v1.3 (tar.gz) - 21/11/2011 !!
Changes:
- Updated third-party libraries to their latest or most compliant version:
- Flying Saucer: 16/04/2011
- iText: 2.1.7
- Apache Log4j: 1.2.16
- JTidy: r938
- Shanidom: 1.4.17
- Jaxen: 1.1.1
- Removed Apache Commons IO and Apache Commons Log libraries which are not needed any more.
- Removed extra calls in charge of adding a document producer since they are not applicable any more and got rid of deprecated constants.
- Moved source code to github.
- Thanks to Stéphane Thomas for all changes.
!! Download v1.2.20c (tar.gz) - 10/05/2011 !!
Changes:
- Reverted to the previous itext and flying saucer libraries due to a licensing problem. This version is LGPL.
Download v1.2.20b (tar.gz) - 17/01/2011
Changes:
!!!! ATTENTION: because the new itext 5.0.5 is licensed under AGPL, the whole distribution is therefore also licensed under AGPL; only the source code of YAHP is LGPL !!!!
- Do not validate html and do not download xhtml DTD from w3 site. (thanks to Johnathan Crawford)
- Bug correction in entities normalization.
- Inline remote css into a style element inside the document.
- Updated itext and flying saucer libraries to their latest versions. (thanks to Johnathan Crawford)
Download v1.2.20a (tar.gz) - 18/12/2009
Changes:
- Fix a problem with multiple text node copy by jtidy.
- Updated jars.
- Remove numerous "INFO" default logging to the console.
Download v1.2.20 (tar.gz) - 17/12/2009
Changes:
- Better handling of ms word generated html.
- Html document is normalized (entities are translated in characters) before rendering.
- Handle page break tag even if namespace declaration is missing
Download v1.2.19d (tar.gz) - 16/06/2009
Changes:
- Allow multiple threads to concurrently use the same CYaHPConverter object instead of serializing access (or using one CYaHPConverter object per thread).
Download v1.2.19c (tar.gz) - 07/05/2009
Changes:
- Can change page size and orientation after a page break.
- If the title tag is set in the html, it is used for the pdf title.
- Added LEGAL and LETTER constant for page size
Download v1.2.19b (tar.gz) - 01/05/2009
Changes:
- Fixed Out Of Memory Error when a textarea was present in the html source.
Download v1.2.19a (tar.gz) - 18/03/2009
Changes:
- Fixed incorrect page size when a page break is inserted.
Download v1.2.19 (tar.gz) - 13/03/2009
Changes:
- Header and Footer can contain html.
- Updated samples.
- Removed old renderers
Download v1.2.18c (tar.gz) - 07/07/2008
Changes:
- Rendering of form components (textfield, textarea, combo, radiobutton, checkbox, button, listbox) in the flying saucer renderer
- Corrected double encoding of &, < and >
- Normalize html entity before rendering
- Known bug: header/footer rendering does not work in headless mode for the flying saucer renderer
- Updated samples.
- Code cleanup
Download v1.2.18b (tar.gz) - 05/07/2008
Changes:
Download v1.2.18a (tar.gz) - 04/07/2008
Changes:
- Better rendering of tag soup.
Download v1.2.18 (tar.gz) - 04/07/2008
Changes:
- Clean up, ensure compat with jdk 1.5.
- Updated samples.
Download v1.2.18-pre1 (tar.gz) - 01/07/2008
Changes:
- Added flying saucer xhtml renderer.
- Updated samples.
Download v1.2.17 (tar.gz) - 02/07/2007
Changes:
- Fixed an NPE if the FOP_TTF_FONT_PATH property is not set.
Download v1.2.16 (tar.gz) - 19/06/2007
Changes:
- Font embedding no longer needs the fonts to be in the OS system fonts folder in jdk < 1.6 on windows OSs. (Thanks to Takis Bouyouris)
Download v1.2.15 (tar.gz) - 17/06/2007
Changes:
- Font embedding no longer needs the fonts to be in the OS system fonts folder in jdk < 1.6.
Download v1.2.14 (tar.gz) - 15/06/2007
Changes:
- Sometimes the font embedding was still not working while running inside tomcat, this bug has been fixed.
- Works again in headless mode.
- Font embedding no longer needs the fonts to be in the OS system fonts folder.
Download v1.2.13 (tar.gz) - 13/06/2007
Changes:
- Fix bug: The font embedding was not working while running inside tomcat. (Thanks to Takis Bouyouris)
- Fix bug: Sometimes when running inside tomcat, the following error occurred: 'UIDefaults.getUI() failed: no ComponentUI class for: javax.swing.JTextPane'. (Thanks to Takis Bouyouris)
Download v1.2.12 (tar.gz) - 11/06/2007
Changes:
- Removed the method "getResources" from the classloader, because this method is marked as final in the 1.4 JVM and so broke compatibility of yahp with the 1.4 vm.
Download v1.2.11 (tar.gz) - 11/05/2007
Changes:
- Updated classloader.
- Use current DPI screen settings to calculate page size.
Download v1.2.10 (tar.gz) - 26/04/2007
Changes:
- Updated parser to ShaniXmlParser-v1.4.16.
- Updated xalan.
Download v1.2.9 (tar.gz) - 24/04/2007
Changes:
- Updated parser to ShaniXmlParser-v1.4.15.
- The swing renderer can now render fieldset and legend tags.
Download v1.2.8 (tar.gz) - 19/04/2007
Changes:
- Updated parser to ShaniXmlParser-v1.4.14.
Download v1.2.7 (tar.gz) - 13/04/2007
Changes:
- v1.2.6 was broken in application server environment.
v1.2.6 (tar.gz) - 13/04/2007 (NUKED)
Changes:
- Use FOP 0.93.
- Can now automatically embed TrueType fonts by giving a path where the TTF files are located with the yahp parameter IHtmlToPdfTransformer.FOP_TTF_FONT_PATH.
- The page-break <yahp:pb> works again.
- Updated samples PDF for the swing renderer.
- Updated javadoc.
Download v1.2.5 (tar.gz) - 12/04/2007
Changes:
- Corrected PageSize class where bottom margin was set incorrectly in CM.
- Corrected support for accented letters.
- Updated classloading mechanism.
- Corrected incorrect page count on some html.
Download v1.2.4 (tar.gz) - 11/04/2007
Changes:
- Remove infinite loop in the css parser.
- Tidy the html before sending it to the renderer.
- Corrected a class cast exception in the swing border helper.
- If base url not set, take base tag as base url if found.
Download v1.2.3 (tar.gz) - 09/04/2007
Changes:
- Updated xml/html parser.
- Default charset to utf-8.
Download v1.2.2 (tar.gz) - 16/03/2007
Changes:
- Correct errors with commons-logging under tomcat on windows.
Download v1.2.1 (tar.gz) - 05/01/2007
Changes:
- Ignore attribute's case on image tag.
Download v1.2 (tar.gz) - 07/12/2006
Changes:
- Corrected rendering of elements with size set in percent.
Download v1.1beta2 (tar.gz) - 10/08/2006
Changes:
- Rendering of CSS border in the swing renderer.
- Better memory usage.
- Use Shani xml parser v1.4.6.
Download v1.0 (tar.gz) - 21/07/2006
Changes:
- Corrected non rendering of table row on edge of pages in the swing renderer.
- Better memory usage.
- Use Shani xml parser v1.4.2.
Download v0.99 (tar.gz) - 05/07/2006
Changes:
- Corrected disappearance of header/footer in the swing renderer.
- Better memory usage when css style is put on the document.
Download v0.98 (tar.gz) - 05/07/2006 (N/A)
Changes:
- Huge memory usage improvement.
- Use Shani xml parser v1.3.8.
Download v0.97 (tar.gz) - 23/06/2006
Changes:
- Add intelligent and automatic table rows break in the swing renderer.
Download v0.96 (tar.gz) - 19/06/2006
Changes:
- Fix incorrect alignment with embedded fonts in the swing renderer.
- Javadoc updated.
Download v0.95 (tar.gz) - 17/06/2006
Changes:
- Corrected incorrect right alignment of text in pdf generated by the swing renderer.
- Javadoc updated.
Download v0.94 (tar.gz) - 08/06/2006
Changes:
- The swing renderer now has a pagebreak tag which allows splitting one document into several pages.
- Possibility to embed font with the yahp-fop-config.xml file.
- Javadoc updated.
- Sample application updated.
Download v0.93 (tar.gz) - 15/04/2006
Changes:
- Better rendering of forms components (button, field, ...) in the swing renderer. (see widget.pdf)
- List boxes are now rendered with the swing renderer.
- Javadoc updated.
- Swing renderer samples files updated.
Download v0.92 (tar.gz) - 14/04/2006
Changes:
- Rendering of forms components (button, field, ...) is now custom made in the swing renderer.
- The swing renderer is two times faster.
- Correct rendering of scaled page with the swing renderer.
- Rendering of the content of input field and textarea with the swing renderer.
- Sample application updated.
- Javadoc updated.
- Swing renderer samples files updated.
Download v0.91 (tar.gz) - 11/04/2006
Changes:
- Can sign a document with a certificate.
- Code cleanup.
- Sample application updated.
- Javadoc updated.
Download v0.90 (tar.gz) - 08/04/2006
Changes:
- Corrected "drawing error" occurring in acrobat reader with sample pdfs generated with the firefox renderer, by using the latest ghostscript and not ghostscript eps.
- Ensure all buttons/combo/textfield are painted with the swing renderer.
- Sample application updated.
- Javadoc updated.
- All samples files updated.
Download v0.20 (tar.gz) - 07/04/2006
Changes:
- Support header/footer in utf-8 with the firefox renderer.
- Support concurrent rendering with the firefox renderer.
- Better rendering of comboboxes and buttons with the swing renderer, they are painted as vector instead of bitmap.
- Sample application updated.
- Javadoc updated.
- All samples files updated.
Download v0.19 (tar.gz) - 05/04/2006
Changes:
- Can set header/footer and page size with the firefox renderer.
- Recompiled iText to work on 1.4 JVM and corrected a LinkageError on 1.4 JVM.
- Sample application updated to use all the new properties.
- Javadoc updated.
- Firefox samples files updated.
Download v0.18 (tar.gz) - 02/04/2006
Changes:
- Added a new renderer which uses firefox as html renderer.
- Sample application updated to use all the new properties.
- Javadoc updated.
- Samples files updated.
Download v0.17 (tar.gz) - 30/03/2006
Changes:
- Renderers are now pluggable.
- Added a new renderer which uses OpenOffice.org writer as pdf generator.
- Sample application updated to use all the new properties.
- Refactoring and cleanup of code.
- Does not copy yahpxxx.jar in the temp directory anymore.
- Javadoc updated.
- Samples files updated.
Download v0.16 (tar.gz) - 24/03/2006
Changes:
- Added handling of pdf encryption.
- Added several properties in IHtmlToPdfTransformer interface, see javadoc.
- Cleanup of code.
- Updated the javadoc.
- Updated to the new ShaniXmlParser 1.3.6.
- All samples files updated.
Download v0.15 (tar.gz) - 23/03/2006
Changes:
- Correct rendering of page containing chinese characters.
- Better rendering of button/checkbox components.
- Updated to the new ShaniXmlParser 1.3.6-pre.
- Samples files updated.
Download v0.14 (tar.gz) - 20/03/2006
Changes:
- Intelligent cutting of pages.
- Better rendering of page footer.
Download v0.13 (tar.gz) - 17/03/2006
Changes:
- Detect if rendering in the event thread and avoid calling SwingUtilities invokeAndWait in this case.
- Ensure synchronized rendering of image inside the document.
- Updated to the new ShaniXmlParser 1.3.5.
Download v0.12 (tar.gz) - 16/11/2005
Changes:
- Use SwingUtilities.invokeAndWait to synchronize with the swing paint thread.
- Better rendering of page header. (page footer still rendered as image)
- Samples files updated.
Download v0.11 (tar.gz) - 07/11/2005
Changes:
- Added property "FAST_TRANSFORM", default true, which gives a faster transformation but produces a black background on transparent gifs under kpdf (only there so far).
- Circumvent a NullPointerException in JDK ParagraphView class under jdk 1.4.2
Download v0.10 (tar.gz) - 02/11/2005
Changes:
- Set the org.apache.commons.logging.Log System property to force the use of log4j instead of setting it wrongly with a LogFactoryImplementation.
Download v0.9 (tar.gz) - 28/10/2005
Changes:
- Correct rendering of comboboxes.
- Correct rendering of images with transparent zone (no more black background)
- Set the org.apache.commons.logging.Log System property to force the use of the default logger inside the Yahp context.
- Samples files updated.
- Javadoc updated
Download v0.8 (tar.gz) - 26/10/2005
Changes:
- Corrected a memory leak in the classloader due to commons logging.
- Destroy the classloader on finalization.
- Added the META-INF/services/org.apache.commons.logging.LogFactory file to force the use of the default logger inside the Yahp context.
Download v0.7 (tar.gz) - 23/10/2005
Changes:
- Remove unselected option tags from the DOM.
- Updated xml parser.
Download v0.6 (tar.gz) - 24/09/2005
Changes:
- Better rendering quality.
- Render directly in the pdfgraphics2d and do not use an offscreen buffer, which had bad rendering quality.
- Fonts are now rendered as vectors and not as bitmaps.
- Samples updated.
- Javadoc updated.
Download v0.5 (tar.gz) - 23/09/2005 (modified, first 0.5 has still buttons display problem, consider the first 0.5 as nuked ;)
Changes:
- Force HTMLEditorKit on the JTextPane used for rendering. (prevent source display)
- Remove the doctype node if any before giving the source to the JTextPane
- Now renders correctly the text fields, buttons, comboboxes, ... (before, they were blank)
Download v0.4 (tar.gz) - 22/09/2005
Changes:
- Remove the use of TimeoutException in the CMutex class because this exception only exists in JDK 1.5
- Set the thread context classloader to prevent Duplicate Class.
Download v0.3 (tar.gz) - 21/09/2005
Changes:
- Document/javadoc
- Set antialiasing on the graphics2d object.
- handle '../' in css and image links.
Download v0.2 (tar.gz)
Changes:
- Use a specialised classloader to load inner jar.
- Compile FOP for jdk1.4 instead of 1.5.
Download v0.1 (tar.gz)
Contact : quentin.anciaux@advalvas.be