Java Code Examples for org.apache.poi.xwpf.extractor.XWPFWordExtractor#getText()
The following examples show how to use org.apache.poi.xwpf.extractor.XWPFWordExtractor#getText().
Example 1
Source File: FileBeanParser.java From everywhere with Apache License 2.0
private static String readDoc(String filePath, InputStream is) throws Exception {
    String text = "";
    is = FileMagic.prepareToCheckMagic(is);
    try {
        if (FileMagic.valueOf(is) == FileMagic.OLE2) {
            WordExtractor ex = new WordExtractor(is);
            text = ex.getText();
            ex.close();
        } else if (FileMagic.valueOf(is) == FileMagic.OOXML) {
            XWPFDocument doc = new XWPFDocument(is);
            XWPFWordExtractor extractor = new XWPFWordExtractor(doc);
            text = extractor.getText();
            extractor.close();
        }
    } catch (OfficeXmlFileException e) {
        logger.error(filePath, e);
    } finally {
        if (is != null) {
            is.close();
        }
    }
    return text;
}
Example 2
Source File: IndexerTextExtractor.java From eplmp with Eclipse Public License 1.0
private String microsoftWordDocumentToString(InputStream inputStream) throws IOException {
    String strRet;
    try (InputStream wordStream = new BufferedInputStream(inputStream)) {
        if (POIFSFileSystem.hasPOIFSHeader(wordStream)) {
            WordExtractor wordExtractor = new WordExtractor(wordStream);
            strRet = wordExtractor.getText();
            wordExtractor.close();
        } else {
            XWPFWordExtractor wordXExtractor = new XWPFWordExtractor(new XWPFDocument(wordStream));
            strRet = wordXExtractor.getText();
            wordXExtractor.close();
        }
    }
    return strRet;
}
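Examples 1 and 2 both decide between WordExtractor and XWPFWordExtractor by peeking at the first bytes of the stream. The following is a minimal sketch of that idea using only the JDK, not the POI implementation: the class WordFormatSniffer and its Format enum are hypothetical names introduced here for illustration. It assumes the legacy .doc format starts with the OLE2 compound-file signature and that .docx is packaged as a ZIP archive, so a ZIP signature only suggests OOXML rather than proving it.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; it is not part of Apache POI.
final class WordFormatSniffer {

    enum Format { OLE2, OOXML, UNKNOWN }

    // Signature of an OLE2 compound file (legacy .doc, .xls, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Signature of a ZIP local file header ("PK\3\4"); .docx is a ZIP container.
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };

    // Peeks at the header and rewinds, so the caller can still parse the stream.
    static Format sniff(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] header = new byte[OLE2_MAGIC.length];
        in.mark(header.length);
        int filled = 0;
        while (filled < header.length) {
            int n = in.read(header, filled, header.length - filled);
            if (n < 0) {
                break; // stream is shorter than the longest signature
            }
            filled += n;
        }
        in.reset();

        if (startsWith(header, filled, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        if (startsWith(header, filled, ZIP_MAGIC)) {
            return Format.OOXML; // assumption: any ZIP is treated as OOXML
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would wrap the input in a BufferedInputStream (which supports mark/reset), call WordFormatSniffer.sniff(stream), and then hand the same stream to the matching extractor. Treating an unrecognized header as UNKNOWN, instead of falling through to the OOXML branch as Example 2 does, lets the caller report an unsupported file explicitly.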
Example 3
Source File: OOXMLWordFormatModule.java From ontopia with Apache License 2.0
@Override
public void readContent(ClassifiableContentIF cc, TextHandlerIF handler) {
    try {
        OPCPackage opc = OPCPackage.open(new ByteArrayInputStream(cc.getContent()));
        XWPFWordExtractor extractor = new XWPFWordExtractor(opc);
        String s = extractor.getText();
        char[] c = s.toCharArray();
        handler.startRegion("document");
        handler.text(c, 0, c.length);
        handler.endRegion();
    } catch (Exception e) {
        throw new OntopiaRuntimeException(e);
    }
}
Example 4
Source File: MSOfficeBox.java From wandora with GNU General Public License v3.0
public static String getDocxText(File file) {
    try {
        XWPFDocument docx = new XWPFDocument(new FileInputStream(file));
        XWPFWordExtractor extractor = new XWPFWordExtractor(docx);
        String text = extractor.getText();
        return text;
    } catch (Exception e) {
        e.printStackTrace();
    }
    return null;
}