Python vggish_input.wavfile_to_examples() Examples

The following are 2 code examples of vggish_input.wavfile_to_examples(). You can vote up the ones you like or vote down the ones you don't like, and go to the original project or source file by following the links above each example. You may also want to check out all available functions/classes of the module vggish_input , or try the search function .
Example #1
Source File: audio_transfer_learning.py    From sklearn-audio-transfer-learning with ISC License 5 votes vote down vote up
def extract_vggish_features(paths, path2gt, model): 
    """Extracts VGGish features and their corresponding ground_truth and identifiers (the path).

       VGGish features are extracted from non-overlapping audio patches of 0.96 seconds, 
       where each audio patch covers 64 mel bands and 96 frames of 10 ms each.

       We repeat ground_truth and identifiers to fit the number of extracted VGGish features.
    """
    # 1) Extract log-mel spectrograms
    first_audio = True
    for p in paths:
        if first_audio:
            input_data = vggish_input.wavfile_to_examples(config['audio_folder'] + p)
            ground_truth = np.repeat(path2gt[p], input_data.shape[0], axis=0)
            identifiers = np.repeat(p, input_data.shape[0], axis=0)
            first_audio = False
        else:
            tmp_in = vggish_input.wavfile_to_examples(config['audio_folder'] + p)
            input_data = np.concatenate((input_data, tmp_in), axis=0)
            tmp_gt = np.repeat(path2gt[p], tmp_in.shape[0], axis=0)
            ground_truth = np.concatenate((ground_truth, tmp_gt), axis=0)
            tmp_id = np.repeat(p, tmp_in.shape[0], axis=0)
            identifiers = np.concatenate((identifiers, tmp_id), axis=0)

    # 2) Load Tensorflow model to extract VGGish features
    with tf.Graph().as_default(), tf.Session() as sess:
        vggish_slim.define_vggish_slim(training=False)
        vggish_slim.load_vggish_slim_checkpoint(sess, 'vggish_model.ckpt')
        features_tensor = sess.graph.get_tensor_by_name(vggish_params.INPUT_TENSOR_NAME)
        embedding_tensor = sess.graph.get_tensor_by_name(vggish_params.OUTPUT_TENSOR_NAME)
        extracted_feat = sess.run([embedding_tensor], feed_dict={features_tensor: input_data})
        feature = np.squeeze(np.asarray(extracted_feat))

    return [feature, ground_truth, identifiers] 
Example #2
Source File: audio_feature_extractor.py    From Tensorflow-Audio-Classification with Apache License 2.0 5 votes vote down vote up
def wavfile_to_features(wav_file):
        assert os.path.exists(wav_file), '{} not exists!'.format(wav_file)
        mel_features = vggish_input.wavfile_to_examples(wav_file)
        return mel_features