Java Code Examples for htsjdk.samtools.util.IOUtil#hasBlockCompressedExtension()
The following examples show how to use htsjdk.samtools.util.IOUtil#hasBlockCompressedExtension().
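Before the project examples, it may help to see roughly what the check does. The following is a minimal standalone sketch, not the htsjdk implementation: the class name, the method name looksBlockCompressed, the extension list (.gz, .gzip, .bgz, .bgzf) and the case-insensitive matching are all assumptions made for illustration, so verify them against the htsjdk version you use. The point it shows is that the decision is based on the file name only, never on the file contents.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Standalone approximation for illustration; NOT the htsjdk implementation.
final class BlockCompressedExtensionSketch {

    // Assumed extension list; check it against your htsjdk version.
    static final List<String> ASSUMED_EXTENSIONS =
            Arrays.asList(".gz", ".gzip", ".bgz", ".bgzf");

    // Returns true if the name ends with one of the assumed extensions.
    // Case-insensitive matching is also an assumption of this sketch.
    static boolean looksBlockCompressed(final String fileName) {
        final String lower = fileName.toLowerCase(Locale.ROOT);
        return ASSUMED_EXTENSIONS.stream().anyMatch(lower::endsWith);
    }

    public static void main(final String[] args) {
        // Print the verdict for a few sample names.
        for (final String name : Arrays.asList("calls.vcf.gz", "calls.vcf.bgz", "calls.vcf", "calls.bcf")) {
            System.out.println(name + " -> " + looksBlockCompressed(name));
        }
    }
}
```

Because only the name is inspected, a plain gzip file and a true block-compressed (BGZF) file with the same extension are indistinguishable to a check like this.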
Example 1
Source File: PathLineIterator.java From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Returns an iterator so you can iterate over the lines in the text file like so:
 * for (String line: new PathLineIterator(path)) {
 *   // do something with the line
 * }
 *
 * It's also closeable so you can close it when done, or use it in a try-with-resources
 * to close it automatically.
 *
 * If the file name ends in ".gz", this will decompress it automatically.
 *
 * Consider also using XReadLines if you need trimming or skipping comments.
 *
 * @param path path to a text file.
 */
public PathLineIterator(final Path path) {
    try {
        InputStream is = Files.newInputStream(Utils.nonNull(path, "path shouldn't be null"));
        if (IOUtil.hasBlockCompressedExtension(path)) {
            is = IOUtils.makeZippedInputStream(is);
        }
        BufferedReader br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
        lines = br.lines();
    } catch (final CharacterCodingException ex) {
        throw new UserException("Error detected in file character encoding. Possible inconsistent character encodings within the file: " + path.toUri().toString(), ex);
    } catch (final IOException x) {
        throw new UserException("Error reading " + path.toUri().toString(), x);
    }
}
Example 2
Source File: ReadMetadata.java From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Serializes a read-metadata by itself into a file.
 * @param meta the read-metadata to serialize.
 * @param whereTo the name of the file or resource where it will go to.
 * @throws IllegalArgumentException if either {@code meta} or {@code whereTo}
 * is {@code null}.
 * @throws UserException if there was a problem during serialization.
 */
public static void writeStandalone(final ReadMetadata meta, final String whereTo) {
    try {
        final OutputStream outputStream = BucketUtils.createFile(whereTo);
        final OutputStream actualStream = IOUtil.hasBlockCompressedExtension(whereTo)
                ? new GzipCompressorOutputStream(outputStream) : outputStream;
        final Output output = new Output(actualStream);
        final Kryo kryo = new Kryo();
        final Serializer serializer = new Serializer();
        output.writeString(MAGIC_STRING);
        output.writeString(VERSION_STRING);
        serializer.write(kryo, output, meta);
        output.close();
    } catch (final IOException ex) {
        throw new UserException.CouldNotCreateOutputFile(whereTo, ex);
    }
}
Example 3
Source File: FeatureDataSource.java From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Creates a FeatureDataSource backed by the provided FeatureInput. We will look ahead the specified number of bases
 * during queries that produce cache misses.
 *
 * @param featureInput a FeatureInput specifying a source of Features
 * @param queryLookaheadBases look ahead this many bases during queries that produce cache misses
 * @param targetFeatureType When searching for a {@link FeatureCodec} for this data source, restrict the search to codecs
 *                          that produce this type of Feature. May be null, which results in an unrestricted search.
 * @param cloudPrefetchBuffer MB size of caching/prefetching wrapper for the data, if on Google Cloud (0 to disable).
 * @param cloudIndexPrefetchBuffer MB size of caching/prefetching wrapper for the index, if on Google Cloud (0 to disable).
 * @param genomicsDBOptions options and info for reading from a GenomicsDB; may be null
 */
public FeatureDataSource(final FeatureInput<T> featureInput, final int queryLookaheadBases,
                         final Class<? extends Feature> targetFeatureType,
                         final int cloudPrefetchBuffer, final int cloudIndexPrefetchBuffer,
                         final GenomicsDBOptions genomicsDBOptions) {
    Utils.validateArg(queryLookaheadBases >= 0, "Query lookahead bases must be >= 0");
    this.featureInput = Utils.nonNull(featureInput, "featureInput must not be null");
    if (IOUtils.isGenomicsDBPath(featureInput)) {
        Utils.nonNull(genomicsDBOptions, "GenomicsDBOptions must not be null. Calling tool may not read from a GenomicsDB data source.");
    }

    // Create a feature reader without requiring an index. We will require one ourselves as soon as
    // a query by interval is attempted.
    this.featureReader = getFeatureReader(featureInput, targetFeatureType,
            BucketUtils.getPrefetchingWrapper(cloudPrefetchBuffer),
            BucketUtils.getPrefetchingWrapper(cloudIndexPrefetchBuffer),
            genomicsDBOptions);

    if (IOUtils.isGenomicsDBPath(featureInput)) {
        // genomics db uri's have no associated index file to read from, but they do support random access
        this.hasIndex = false;
        this.supportsRandomAccess = true;
    } else if (featureReader instanceof AbstractFeatureReader) {
        this.hasIndex = ((AbstractFeatureReader<T, ?>) featureReader).hasIndex();
        this.supportsRandomAccess = hasIndex;
    } else {
        throw new GATKException("Found a feature input that was neither GenomicsDB or a Tribble AbstractFeatureReader. Input was " + featureInput.toString() + ".");
    }

    // Due to a bug in HTSJDK, unindexed block compressed input files may fail to parse completely. For safety,
    // these files have been disabled. See https://github.com/broadinstitute/gatk/issues/4224 for discussion
    if (!hasIndex && IOUtil.hasBlockCompressedExtension(featureInput.getFeaturePath())) {
        throw new UserException.MissingIndex(featureInput.toString(), "Support for unindexed block-compressed files has been temporarily disabled. Try running IndexFeatureFile on the input.");
    }

    this.currentIterator = null;
    this.intervalsForTraversal = null;
    this.queryCache = new FeatureCache<>();
    this.queryLookaheadBases = queryLookaheadBases;
}
Example 4
Source File: SAMPileupCodec.java From gatk with BSD 3-Clause "New" or "Revised" License
/**
 * Only files with {@link #SAM_PILEUP_FILE_EXTENSIONS} could be parsed
 * @param path path the file to test for parsability with this codec
 * @return {@code true} if the path has the correct file extension {@link #SAM_PILEUP_FILE_EXTENSIONS}, {@code false} otherwise
 */
@Override
public boolean canDecode(final String path) {
    final String noBlockCompressedPath;
    if (IOUtil.hasBlockCompressedExtension(path)) {
        noBlockCompressedPath = FilenameUtils.removeExtension(path).toLowerCase();
    } else {
        noBlockCompressedPath = path.toLowerCase();
    }
    return SAM_PILEUP_FILE_EXTENSIONS.stream().anyMatch(ext -> noBlockCompressedPath.endsWith("." + ext));
}
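The strip-then-match logic of Example 4 can be tried without GATK or htsjdk on the classpath. The sketch below is a hypothetical standalone rewrite: the class name, both extension lists ("pileup" and "samp" for the codec, and .gz, .gzip, .bgz, .bgzf for compression) and the manual suffix stripping that stands in for FilenameUtils.removeExtension are assumptions for illustration, not values confirmed by the snippet above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical standalone version of the Example 4 logic; not GATK code.
final class PileupCanDecodeSketch {

    // Assumed compression extensions (stand-in for the htsjdk check).
    static final List<String> ASSUMED_COMPRESSED_EXTENSIONS =
            Arrays.asList(".gz", ".gzip", ".bgz", ".bgzf");

    // Assumed codec extensions (stand-in for SAM_PILEUP_FILE_EXTENSIONS).
    static final List<String> ASSUMED_PILEUP_EXTENSIONS =
            Arrays.asList("pileup", "samp");

    static boolean canDecode(final String path) {
        String name = path.toLowerCase(Locale.ROOT);
        // Drop one trailing compression extension, if present.
        for (final String ext : ASSUMED_COMPRESSED_EXTENSIONS) {
            if (name.endsWith(ext)) {
                name = name.substring(0, name.length() - ext.length());
                break;
            }
        }
        // Match what remains against the codec's own extensions.
        final String stripped = name;
        return ASSUMED_PILEUP_EXTENSIONS.stream().anyMatch(ext -> stripped.endsWith("." + ext));
    }

    public static void main(final String[] args) {
        for (final String p : Arrays.asList("sample.pileup", "sample.pileup.gz", "sample.vcf.gz")) {
            System.out.println(p + " -> " + canDecode(p));
        }
    }
}
```

Stripping the compression suffix first lets the same extension list serve both plain and compressed inputs, which appears to be the intent of the original method.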
Example 5
Source File: GatherVcfsCloud.java From gatk with BSD 3-Clause "New" or "Revised" License
/** Checks (via filename checking) that all files appear to be block compressed files. */
@VisibleForTesting
static boolean areAllBlockCompressed(final List<Path> input) {
    for (final Path path : input) {
        if (path == null) {
            return false;
        }
        final String pathString = path.toUri().toString();
        if (pathString.endsWith(".bcf") || !IOUtil.hasBlockCompressedExtension(pathString)) {
            return false;
        }
    }
    return true;
}
Example 6
Source File: IndexFeatureFile.java From gatk with BSD 3-Clause "New" or "Revised" License
private Index createAppropriateIndexInMemory(final FeatureCodec<? extends Feature, ?> codec) {
    try {
        // For block-compression files, write a Tabix index
        if (IOUtil.hasBlockCompressedExtension(featurePath.toPath())) {
            // Creating tabix indices with a non standard extensions can cause problems so we disable it
            if (outputPath != null && !outputPath.getURIString().endsWith(FileExtensions.TABIX_INDEX)) {
                throw new UserException("The index for " + featurePath + " must be written to a file with a \"" + FileExtensions.TABIX_INDEX + "\" extension");
            }

            // TODO: this could benefit from provided sequence dictionary from reference
            // TODO: this can be an optional parameter for the tool
            return IndexFactory.createIndex(featurePath.toPath(), codec, IndexFactory.IndexType.TABIX, null);
        }
        // TODO: detection of GVCF files should not be file-extension-based. Need to come up with canonical
        // TODO: way of detecting GVCFs based on the contents (may require changes to the spec!)
        else if (featurePath.getURIString().endsWith(GVCF_FILE_EXTENSION)) {
            // Optimize GVCF indices for the use case of having a large number of GVCFs open simultaneously
            return IndexFactory.createLinearIndex(featurePath.toPath(), codec, OPTIMAL_GVCF_INDEX_BIN_SIZE);
        } else {
            // Optimize indices for other kinds of files for seek time / querying
            return IndexFactory.createDynamicIndex(featurePath.toPath(), codec, IndexFactory.IndexBalanceApproach.FOR_SEEK_TIME);
        }
    } catch (TribbleException e) {
        // Underlying cause here is usually a malformed file, but can also be things like
        // "codec does not support tabix"
        throw new UserException.CouldNotIndexFile(featurePath.toPath(), e);
    }
}
Example 7
Source File: XGBoostEvidenceFilter.java From gatk with BSD 3-Clause "New" or "Revised" License
private static InputStream resourcePathToInputStream(final String resourcePath) throws IOException {
    final InputStream inputStream = XGBoostEvidenceFilter.class.getResourceAsStream(resourcePath);
    return IOUtil.hasBlockCompressedExtension(resourcePath)
            ? IOUtils.makeZippedInputStream(new BufferedInputStream(inputStream))
            : inputStream;
}