com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy Java Examples
The following examples show how to use
com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy.
Example #1
Source File: PDF2WordExample.java, from tutorials (MIT License)
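import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xwpf.usermodel.BreakType;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

private static void generateDocFromPDF(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    // try-with-resources closes the Word document even if extraction fails
    try (XWPFDocument doc = new XWPFDocument()) {
        PdfReaderContentParser parser = new PdfReaderContentParser(reader);
        // iText page numbers are 1-based
        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
            TextExtractionStrategy strategy =
                    parser.processContent(i, new SimpleTextExtractionStrategy());
            String text = strategy.getResultantText();
            // One paragraph per PDF page, followed by a page break
            XWPFParagraph p = doc.createParagraph();
            XWPFRun run = p.createRun();
            run.setText(text);
            run.addBreak(BreakType.PAGE);
        }
        try (FileOutputStream out = new FileOutputStream("src/output/pdf.docx")) {
            doc.write(out);
        }
    } finally {
        reader.close();
    }
}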
Example #2
Source File: OfficeUtils.java, from dk-fitting (Apache License 2.0)
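import java.io.IOException;

import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.SimpleTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

public static String itextPdf2Txt(String filePath) throws IOException {
    PdfReader reader = new PdfReader(filePath);
    try {
        PdfReaderContentParser parser = new PdfReaderContentParser(reader);
        // StringBuilder suffices here: the buffer never leaves this method
        StringBuilder buff = new StringBuilder();
        // Use a fresh strategy per page and concatenate the page texts
        for (int i = 1; i <= reader.getNumberOfPages(); i++) {
            TextExtractionStrategy strategy =
                    parser.processContent(i, new SimpleTextExtractionStrategy());
            buff.append(strategy.getResultantText());
        }
        // String res = new String(buff.toString().getBytes("utf-8"), "utf-8");
        return buff.toString();
    } finally {
        // Release the underlying file handle even if extraction fails
        reader.close();
    }
}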
Example #3
Source File: TextExtraction.java, from testarea-itext5 (GNU Affero General Public License v3.0)
String extractSimple(PdfReader reader, int pageNo) throws IOException {
    return PdfTextExtractor.getTextFromPage(reader, pageNo, new SimpleTextExtractionStrategy() {
        boolean empty = true;

        @Override
        public void beginTextBlock() {
            if (!empty)
                appendTextChunk("<BLOCK>");
            super.beginTextBlock();
        }

        @Override
        public void endTextBlock() {
            if (!empty)
                appendTextChunk("</BLOCK>\n");
            super.endTextBlock();
        }

        @Override
        public String getResultantText() {
            if (empty)
                return super.getResultantText();
            else
                return "<BLOCK>" + super.getResultantText();
        }

        @Override
        public void renderText(TextRenderInfo renderInfo) {
            empty = false;
            super.renderText(renderInfo);
        }
    });
}
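It may not be obvious why Example #3 prepends "<BLOCK>" in getResultantText() instead of emitting it in beginTextBlock(). The sketch below replays the same flag logic with plain Java and no iText dependency, which makes the order of events easier to follow. BlockMarkerSketch, MarkingCollector, and extract are hypothetical names invented for this illustration; they are not iText classes. The sketch also assumes each block renders as one string, whereas the real SimpleTextExtractionStrategy inserts spaces and newlines based on text positions.

```java
import java.util.Arrays;
import java.util.List;

class BlockMarkerSketch {

    /** Hypothetical stand-in for the anonymous strategy in Example #3 (not an iText class). */
    static class MarkingCollector {
        private final StringBuilder result = new StringBuilder();
        private boolean empty = true; // true until the first text is rendered

        void beginTextBlock() {
            // The first block's opening marker is skipped here because the
            // flag is still true when that block begins.
            if (!empty) result.append("<BLOCK>");
        }

        void endTextBlock() {
            if (!empty) result.append("</BLOCK>\n");
        }

        void renderText(String text) {
            empty = false;
            result.append(text);
        }

        String getResultantText() {
            // Supply the opening marker that beginTextBlock() skipped.
            return empty ? result.toString() : "<BLOCK>" + result;
        }
    }

    /** Replays begin/render/end events for each block, in order. */
    static String extract(List<String> blocks) {
        MarkingCollector collector = new MarkingCollector();
        for (String block : blocks) {
            collector.beginTextBlock();
            if (!block.isEmpty()) collector.renderText(block);
            collector.endTextBlock();
        }
        return collector.getResultantText();
    }

    public static void main(String[] args) {
        // Prints:
        // <BLOCK>Hello</BLOCK>
        // <BLOCK>World</BLOCK>
        System.out.print(extract(Arrays.asList("Hello", "World")));
    }
}
```

Because the flag only flips inside renderText, text blocks that come before any rendered text produce no markers at all, and a page without text yields an empty string.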