Python vggish_input.waveform_to_examples() Examples
The following are 9 code examples of vggish_input.waveform_to_examples().
Example #1
Source File: vggish_train_demo.py From Tensorflow-Audio-Classification with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    print('all_examples shape:', all_examples.shape)
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    print('all_labels shape:', all_labels.shape)
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
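The docstring above promises NumPy arrays, but the zip-and-shuffle step hands back Python lists. As a minimal sketch of one alternative (not taken from any of the listed projects), a shared permutation index can shuffle both arrays together and keep them as arrays. The helper name `shuffle_in_unison` and the stand-in data are hypothetical; the stand-in patches only mimic the `[num_frames, num_bands]` layout and are not real log mel features.

```python
import numpy as np


def shuffle_in_unison(examples, labels, rng=None):
    """Shuffle two arrays along axis 0 with the same random order.

    Hypothetical helper for illustration; it is not part of vggish_input.
    """
    if len(examples) != len(labels):
        raise ValueError("examples and labels must have the same length")
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(examples))  # One random ordering of row indices.
    return examples[order], labels[order]


# Stand-in data: six patches of shape (96, 64), each filled with its class index.
class_ids = [0, 0, 1, 1, 2, 2]
examples = np.stack(
    [np.full((96, 64), c, dtype=np.float32) for c in class_ids])
labels = np.eye(3, dtype=np.int64)[class_ids]  # One-hot rows, shape (6, 3).

features, shuffled_labels = shuffle_in_unison(
    examples, labels, rng=np.random.default_rng(0))
print(features.shape, shuffled_labels.shape)
```

Because both arrays are indexed with the same `order`, each patch stays paired with its label row, and the results can be fed to a model without converting lists back to arrays.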
Example #2
Source File: vggish_train_demo.py From yolo_v2 with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #3
Source File: vggish_train_demo.py From Gun-Detector with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #4
Source File: vggish_train_demo.py From object_detection_kitti with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    # Concatenate along the batch axis; `+` on NumPy arrays would add them
    # element-wise instead of joining them.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #5
Source File: vggish_train_demo.py From object_detection_with_tensorflow with MIT License

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #6
Source File: extract_audioset_embedding.py From audioset_classification with MIT License

import os

import tensorflow as tf

import vggish_input
import vggish_params
import vggish_postprocess
import vggish_slim

# read_audio() is a helper defined elsewhere in the source project.


def extract_audioset_embedding():
    """Extract log mel spectrogram features."""
    # Arguments & parameters
    mel_bins = vggish_params.NUM_BANDS
    sample_rate = vggish_params.SAMPLE_RATE
    input_len = vggish_params.NUM_FRAMES
    embedding_size = vggish_params.EMBEDDING_SIZE

    # You may modify EXAMPLE_HOP_SECONDS in vggish_params.py to change the
    # hop size.

    # Paths
    audio_path = 'appendixes/01.wav'
    checkpoint_path = os.path.join('vggish_model.ckpt')
    pca_params_path = os.path.join('vggish_pca_params.npz')

    if not os.path.isfile(checkpoint_path):
        raise Exception(
            'Please download vggish_model.ckpt from '
            'https://storage.googleapis.com/audioset/vggish_model.ckpt '
            'and put it in the root of this codebase. ')

    if not os.path.isfile(pca_params_path):
        raise Exception(
            'Please download vggish_pca_params.npz from '
            'https://storage.googleapis.com/audioset/vggish_pca_params.npz '
            'and put it in the root of this codebase. ')

    # Load model
    sess = tf.Session()
    vggish_slim.define_vggish_slim(training=False)
    vggish_slim.load_vggish_slim_checkpoint(sess, checkpoint_path)
    features_tensor = sess.graph.get_tensor_by_name(
        vggish_params.INPUT_TENSOR_NAME)
    embedding_tensor = sess.graph.get_tensor_by_name(
        vggish_params.OUTPUT_TENSOR_NAME)
    pproc = vggish_postprocess.Postprocessor(pca_params_path)

    # Read audio
    (audio, _) = read_audio(audio_path, target_fs=sample_rate)

    # Extract log mel feature
    logmel = vggish_input.waveform_to_examples(audio, sample_rate)

    # Extract embedding feature
    [embedding_batch] = sess.run([embedding_tensor],
                                 feed_dict={features_tensor: logmel})

    # PCA
    postprocessed_batch = pproc.postprocess(embedding_batch)

    print('Audio length: {}'.format(len(audio)))
    print('Log mel shape: {}'.format(logmel.shape))
    print('Embedding feature shape: {}'.format(postprocessed_batch.shape))
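The example above notes that EXAMPLE_HOP_SECONDS controls the hop size and then prints the log mel shape. The following is a rough sketch of the framing arithmetic that determines that shape, so the effect of a different hop can be estimated before running the model. It does not call vggish_input. The constant values are assumptions about a typical vggish_params.py (16 kHz audio, 25 ms STFT window, 10 ms STFT hop, 0.96 s examples, 64 mel bands) and should be checked against your copy; the function names are hypothetical.

```python
# Assumed values; verify against vggish_params.py in your checkout.
SAMPLE_RATE = 16000                  # Hz, rate the waveform is resampled to.
STFT_WINDOW_LENGTH_SECONDS = 0.025
STFT_HOP_LENGTH_SECONDS = 0.010
EXAMPLE_WINDOW_SECONDS = 0.96
NUM_BANDS = 64


def count_windows(num_items, window, hop):
    """Number of complete windows of `window` items taken every `hop` items."""
    if num_items < window:
        return 0
    return 1 + (num_items - window) // hop


def expected_examples_shape(num_seconds, example_hop_seconds=0.96):
    """Estimate (num_examples, num_frames, num_bands) for a clip length."""
    num_samples = int(round(num_seconds * SAMPLE_RATE))
    stft_window = int(round(SAMPLE_RATE * STFT_WINDOW_LENGTH_SECONDS))
    stft_hop = int(round(SAMPLE_RATE * STFT_HOP_LENGTH_SECONDS))
    num_stft_frames = count_windows(num_samples, stft_window, stft_hop)

    # Spectrogram frames per second, then frames per example and per hop.
    features_rate = 1.0 / STFT_HOP_LENGTH_SECONDS
    example_window = int(round(EXAMPLE_WINDOW_SECONDS * features_rate))
    example_hop = int(round(example_hop_seconds * features_rate))
    num_examples = count_windows(num_stft_frames, example_window, example_hop)
    return (num_examples, example_window, NUM_BANDS)


print(expected_examples_shape(5))                            # (5, 96, 64)
print(expected_examples_shape(5, example_hop_seconds=0.48))  # (9, 96, 64)
```

Under these assumptions, halving the hop to 0.48 s makes consecutive examples overlap by half, which roughly doubles the number of patches, and therefore embeddings, for the same clip.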
Example #7
Source File: vggish_train_demo.py From g-tensorflow-models with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #8
Source File: vggish_train_demo.py From models with Apache License 2.0

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)
Example #9
Source File: vggish_train_demo.py From multilabel-image-classification-tensorflow with MIT License

from random import shuffle

import numpy as np

import vggish_input


def _get_examples_batch():
    """Returns a shuffled batch of examples of all audio classes.

    Note that this is just a toy function because this is a simple demo
    intended to illustrate how the training code might work.

    Returns:
      a tuple (features, labels) where features is a NumPy array of shape
      [batch_size, num_frames, num_bands] where the batch_size is variable and
      each row is a log mel spectrogram patch of shape [num_frames, num_bands]
      suitable for feeding VGGish, while labels is a NumPy array of shape
      [batch_size, num_classes] where each row is a multi-hot label vector that
      provides the labels for corresponding rows in features.
    """
    # Make a waveform for each class.
    num_seconds = 5
    sr = 44100  # Sampling rate.
    t = np.linspace(0, num_seconds, int(num_seconds * sr))  # Time axis.
    # Random sine wave.
    freq = np.random.uniform(100, 1000)
    sine = np.sin(2 * np.pi * freq * t)
    # Random constant signal.
    magnitude = np.random.uniform(-1, 1)
    const = magnitude * t
    # White noise.
    noise = np.random.normal(-1, 1, size=t.shape)

    # Make examples of each signal and corresponding labels.
    # Sine is class index 0, Const class index 1, Noise class index 2.
    sine_examples = vggish_input.waveform_to_examples(sine, sr)
    sine_labels = np.array([[1, 0, 0]] * sine_examples.shape[0])
    const_examples = vggish_input.waveform_to_examples(const, sr)
    const_labels = np.array([[0, 1, 0]] * const_examples.shape[0])
    noise_examples = vggish_input.waveform_to_examples(noise, sr)
    noise_labels = np.array([[0, 0, 1]] * noise_examples.shape[0])

    # Shuffle (example, label) pairs across all classes.
    all_examples = np.concatenate((sine_examples, const_examples, noise_examples))
    all_labels = np.concatenate((sine_labels, const_labels, noise_labels))
    labeled_examples = list(zip(all_examples, all_labels))
    shuffle(labeled_examples)

    # Separate and return the features and labels.
    features = [example for (example, _) in labeled_examples]
    labels = [label for (_, label) in labeled_examples]
    return (features, labels)