Java Code Examples for org.apache.tika.mime.MimeType#getName()

The following examples show how to use org.apache.tika.mime.MimeType#getName().
Example 1
Source File: ReferenceResource.java    From oodt with Apache License 2.0
/**
 * Gets the name of the MIME type for the reference.
 * @return the name of the MIME type for the reference
 */
@XmlElement(name = "mimeType")
public String getMimeTypeName()
{
  MimeType m = reference.getMimeType();
  if (m != null)
  {
    return m.getName();
  }
  return null;
}
 
Example 2
Source File: MimeUtil.java    From anthelion with Apache License 2.0
/**
 * A facade for trying all the possible mime type resolution
 * strategies available within Tika. First, the mime type provided in
 * <code>typeName</code> is cleaned, with {@link #cleanMimeType(String)}.
 * Then the cleaned mime type is looked up in the underlying Tika
 * {@link MimeTypes} registry, by its cleaned name. If the {@link MimeType} is
 * found, then that mime type is used, otherwise URL resolution is
 * used to try and determine the mime type. If that means is unsuccessful, and
 * if <code>mime.type.magic</code> is enabled in {@link NutchConfiguration},
 * then mime type magic resolution is used to try and obtain a
 * better-than-the-default approximation of the {@link MimeType}.
 * 
 * @param typeName
 *          The original mime type, returned from a {@link ProtocolOutput}.
 * @param url
 *          The URL that Nutch was trying to crawl.
 * @param data
 *          The byte data, returned from the crawl, if any.
 * @return The automatically guessed {@link MimeType} name.
 */
public String autoResolveContentType(String typeName, String url, byte[] data) {
  String retType = null;
  String magicType = null;
  MimeType type = null;
  String cleanedMimeType = null;

  try {
    // clean the declared type once, then normalize it through the registry
    String cleaned = MimeUtil.cleanMimeType(typeName);
    cleanedMimeType = cleaned != null
        ? this.mimeTypes.forName(cleaned).getName()
        : null;
  } catch (MimeTypeException mte) {
    // Seems to be a malformed mime type name; cleanedMimeType stays null
  }

  // first try to get the type from the cleaned type name
  try {
    type = cleanedMimeType != null ? this.mimeTypes.forName(cleanedMimeType)
        : null;
  } catch (MimeTypeException e) {
    type = null;
  }

  // if returned null, or if it's the default type then try url resolution
  if (type == null || type.getName().equals(MimeTypes.OCTET_STREAM)) {
    // If no mime-type header, or cannot find a corresponding registered
    // mime-type, then guess a mime-type from the url pattern
    type = this.mimeTypes.getMimeType(url) != null ? this.mimeTypes
        .getMimeType(url) : type;
  }

  // type can still be null here if neither the name nor the url resolved
  retType = type != null ? type.getName() : null;

  // if magic is enabled use mime magic to guess if the mime type returned
  // from the magic guess is different than the one that's already set so far
  // if it is, and it's not the default mime type, then go with the mime type
  // returned by the magic
  if (this.mimeMagic) {
    magicType = tika.detect(data);

    // Deprecated in Tika 1.0 See https://issues.apache.org/jira/browse/NUTCH-1230
    //MimeType magicType = this.mimeTypes.getMimeType(data);
    if (magicType != null && !magicType.equals(MimeTypes.OCTET_STREAM)
        && !magicType.equals(MimeTypes.PLAIN_TEXT)
        && retType != null && !retType.equals(magicType)) {

      // If magic enabled and the current mime type differs from that of the
      // one returned from the magic, take the magic mimeType
      retType = magicType;
    }

  }

  // if type is STILL null after all the resolution strategies, go for the
  // default type (applied whether or not magic is enabled)
  if (retType == null) {
    retType = MimeTypes.OCTET_STREAM;
  }

  return retType;
}
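
The resolution order described in the Javadoc of Example 2 (declared type first, then the URL, then magic bytes, then the default) can be sketched without Tika or Nutch. The following is only an illustration under stated assumptions: the class ContentTypeCascade, its methods, the three-entry extension table, and the two magic signatures are all hypothetical stand-ins, not part of the Tika or Nutch APIs. Unlike the original method, this sketch also lets a magic match apply when no earlier step produced a type.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-alone sketch; none of these names are Tika or Nutch APIs. */
class ContentTypeCascade {
    static final String OCTET_STREAM = "application/octet-stream";
    static final String PLAIN_TEXT = "text/plain";

    // Tiny extension table standing in for a real MIME registry (assumption).
    private static final Map<String, String> BY_EXTENSION = Map.of(
        "html", "text/html",
        "pdf", "application/pdf",
        "png", "image/png");

    /** Strips parameters such as "; charset=UTF-8" and lower-cases the rest. */
    static String clean(String declared) {
        if (declared == null) return null;
        int semi = declared.indexOf(';');
        String base = (semi >= 0 ? declared.substring(0, semi) : declared)
            .trim().toLowerCase(Locale.ROOT);
        return base.isEmpty() ? null : base;
    }

    /** Guesses a type from the file extension in the URL path, or returns null. */
    static String fromUrl(String url) {
        if (url == null) return null;
        int query = url.indexOf('?');
        String path = query >= 0 ? url.substring(0, query) : url;
        int dot = path.lastIndexOf('.');
        // A dot before the last slash belongs to the host or a directory name.
        if (dot < 0 || dot < path.lastIndexOf('/')) return null;
        return BY_EXTENSION.get(path.substring(dot + 1).toLowerCase(Locale.ROOT));
    }

    /** Checks two well-known signatures (PDF and PNG); returns null otherwise. */
    static String fromMagic(byte[] data) {
        if (data == null || data.length < 4) return null;
        if (data[0] == '%' && data[1] == 'P' && data[2] == 'D' && data[3] == 'F') {
            return "application/pdf";
        }
        if ((data[0] & 0xFF) == 0x89 && data[1] == 'P' && data[2] == 'N' && data[3] == 'G') {
            return "image/png";
        }
        return null;
    }

    /** Declared type, then URL, then (optionally) magic, then the default. */
    static String resolve(String declared, String url, byte[] data, boolean useMagic) {
        String type = clean(declared);
        if (type == null || type.equals(OCTET_STREAM)) {
            String guessed = fromUrl(url);
            if (guessed != null) type = guessed;
        }
        if (useMagic) {
            String magic = fromMagic(data);
            // A specific magic result overrides a differing earlier guess.
            if (magic != null && !magic.equals(OCTET_STREAM)
                    && !magic.equals(PLAIN_TEXT) && !magic.equals(type)) {
                type = magic;
            }
        }
        return type != null ? type : OCTET_STREAM;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "http://example.com/doc.pdf?x=1", null, false));
    }
}
```

Keeping each strategy in its own null-returning method makes the fallback chain explicit and ensures the default type is applied in one place, regardless of whether magic detection is enabled.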