Batch processing of files
Using the Python standard library (i.e., the glob and os modules), we can also quickly code up batch operations, e.g. over all files with a certain extension in a directory. For example, we can make a list of all .wav files in the audio directory, use Praat to pre-emphasize these Sound objects, and then write each pre-emphasized sound to a WAV and an AIFF file.
[1]:
# Find all .wav files in a directory, pre-emphasize and save as new .wav and .aiff file
import parselmouth
import glob
import os.path
for wave_file in glob.glob("audio/*.wav"):
    print("Processing {}...".format(wave_file))
    s = parselmouth.Sound(wave_file)
    s.pre_emphasize()
    s.save(os.path.splitext(wave_file)[0] + "_pre.wav", 'WAV') # or parselmouth.SoundFileFormat.WAV instead of 'WAV'
    s.save(os.path.splitext(wave_file)[0] + "_pre.aiff", 'AIFF')
Processing audio/3_b.wav...
Processing audio/2_y.wav...
Processing audio/4_y.wav...
Processing audio/2_b.wav...
Processing audio/5_b.wav...
Processing audio/1_b.wav...
Processing audio/the_north_wind_and_the_sun.wav...
Processing audio/3_y.wav...
Processing audio/bat.wav...
Processing audio/1_y.wav...
Processing audio/4_b.wav...
Processing audio/5_y.wav...
Processing audio/bet.wav...
After running this, the audio directory contains, next to each original .wav file, a pre-emphasized copy written as a .wav and as an .aiff file. The reading, pre-emphasis, and writing are all done by Praat, while looping over all .wav files is done by standard Python code.
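One thing to watch for with the loop above: if it is run a second time without cleaning up, the pattern audio/*.wav also matches the generated *_pre.wav files. The following is a minimal sketch of one way to derive the output names and skip already generated files, using only pathlib. The helper names output_paths and is_generated and the example file names are hypothetical, chosen for illustration.

```python
from pathlib import Path


def output_paths(wave_file, suffix="_pre"):
    """Return the (.wav, .aiff) output paths for one input file."""
    path = Path(wave_file)
    stem = path.stem + suffix
    return path.with_name(stem + ".wav"), path.with_name(stem + ".aiff")


def is_generated(wave_file, suffix="_pre"):
    """True if the file name already carries the output suffix."""
    return Path(wave_file).stem.endswith(suffix)


# Hypothetical file names, standing in for the result of glob.glob("audio/*.wav")
candidates = ["audio/bat.wav", "audio/bat_pre.wav", "audio/1_b.wav"]
to_process = [f for f in candidates if not is_generated(f)]
print(to_process)
print(output_paths("audio/bat.wav"))
```

In the batch loop, the filtered list would replace the raw glob result, and the two returned paths (converted with str) would be passed to s.save.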
[2]:
# List the current contents of the audio/ folder
!ls audio/
1_b.wav 2_y_pre.aiff 4_b_pre.wav bat.wav
1_b_pre.aiff 2_y_pre.wav 4_y.wav bat_pre.aiff
1_b_pre.wav 3_b.wav 4_y_pre.aiff bat_pre.wav
1_y.wav 3_b_pre.aiff 4_y_pre.wav bet.wav
1_y_pre.aiff 3_b_pre.wav 5_b.wav bet_pre.aiff
1_y_pre.wav 3_y.wav 5_b_pre.aiff bet_pre.wav
2_b.wav 3_y_pre.aiff 5_b_pre.wav the_north_wind_and_the_sun.wav
2_b_pre.aiff 3_y_pre.wav 5_y.wav the_north_wind_and_the_sun_pre.aiff
2_b_pre.wav 4_b.wav 5_y_pre.aiff the_north_wind_and_the_sun_pre.wav
2_y.wav 4_b_pre.aiff 5_y_pre.wav
[3]:
# Remove the generated audio files again, to clean up the output from this example
!rm audio/*_pre.wav
!rm audio/*_pre.aiff
Similarly, we can use the pandas library to read a CSV file with data collected in an experiment, and loop over that data to e.g. extract the mean harmonics-to-noise ratio. The results CSV has the following structure:
condition | … | pp_id
---|---|---
0 | … | 1877
1 | … | 801
1 | … | 2456
0 | … | 3126
The following code would read such a table, loop over it, use Praat through Parselmouth to run the analysis for each row, and then write an augmented CSV file to disk. To illustrate, we use an example set of sound fragments: results.csv, 1_b.wav, 2_b.wav, 3_b.wav, 4_b.wav, 5_b.wav, 1_y.wav, 2_y.wav, 3_y.wav, 4_y.wav, 5_y.wav.
In our example, the original CSV file, results.csv, contains the following table:
[4]:
import pandas as pd
print(pd.read_csv("other/results.csv"))
condition pp_id
0 3 y
1 5 y
2 4 b
3 2 y
4 5 b
5 2 b
6 3 b
7 1 y
8 1 b
9 4 y
[5]:
def analyse_sound(row):
    # Build the path of the sound file that belongs to this row
    condition, pp_id = row['condition'], row['pp_id']
    filepath = "audio/{}_{}.wav".format(condition, pp_id)
    sound = parselmouth.Sound(filepath)
    harmonicity = sound.to_harmonicity()
    # Average over all frames, skipping the -200 placeholder values
    return harmonicity.values[harmonicity.values != -200].mean()
# Read in the experimental results file
dataframe = pd.read_csv("other/results.csv")
# Apply parselmouth wrapper function row-wise
dataframe['harmonics_to_noise'] = dataframe.apply(analyse_sound, axis='columns')
# Write out the updated dataframe
dataframe.to_csv("processed_results.csv", index=False)
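The row-wise apply pattern used here can be tried out without any audio files. The sketch below assumes a small in-memory CSV with the same two columns as results.csv, and replaces the Praat analysis with a hypothetical stand-in function, sound_path, that only builds the file path for each row.

```python
import io

import pandas as pd

# Hypothetical CSV content with the same two columns as results.csv
csv_text = "condition,pp_id\n3,y\n5,b\n"
frame = pd.read_csv(io.StringIO(csv_text))


def sound_path(row):
    """Build the file path for one row, the same way analyse_sound does."""
    return "audio/{}_{}.wav".format(row["condition"], row["pp_id"])


# axis="columns" passes each row to the function as a Series
frame["filepath"] = frame.apply(sound_path, axis="columns")
print(frame["filepath"].tolist())
```

Checking the generated paths this way, before running the full analysis, makes it easier to spot rows that do not map to an existing sound file.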
We can now have a look at the results by reading in the processed_results.csv file again:
[6]:
print(pd.read_csv("processed_results.csv"))
condition pp_id harmonics_to_noise
0 3 y 22.615414
1 5 y 16.403205
2 4 b 17.839167
3 2 y 21.054674
4 5 b 16.092489
5 2 b 12.378289
6 3 b 15.718858
7 1 y 16.704779
8 1 b 12.874451
9 4 y 18.431586
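The harmonics_to_noise values are means over only those frames whose value is not -200, which analyse_sound treats as a placeholder. As a rough illustration of why that filter matters, the sketch below applies the same boolean mask to a made-up array. The numbers are hypothetical and not taken from any of the sound files.

```python
import numpy as np

# Hypothetical harmonicity values in dB; -200 stands in for placeholder frames
values = np.array([12.5, -200.0, 15.5, -200.0, 14.0])

# Mean over all frames, placeholders included
naive_mean = values.mean()

# Mean over the remaining frames only, as in analyse_sound
filtered_mean = values[values != -200].mean()

print(naive_mean, filtered_mean)
```

Even a few placeholder frames pull the unfiltered mean far below the range of the actual values, so the mask is applied before averaging.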
[7]:
# Clean up, remove the CSV file generated by this example
!rm processed_results.csv