The Learning Innovation Hub is trialling the Automatic Speech Recognition (ASR) feature in the Echo360 Active Learning Platform.

Much like the dictation features available on phones and PCs, ASR converts speech (in this case, audio from Echo360 recordings) into a transcript that is displayed to students in sync with video playback. Transcripts are searchable, allowing students to find instances of a search term within a recording or across all recordings within the unit.


The ASR dialogue in an Echo360 video. Source: echo360.com

ASR differs from closed captions in two ways. First, the ASR transcript is displayed in a separate window from the video itself, whereas closed captions overlay the video. Second, closed captions intended for accessibility are typically highly accurate because they are created and edited by a human, whereas ASR transcripts are machine generated. It’s an important distinction, as ASR is not intended for accessibility, although the ASR transcript can be downloaded and edited to a closed caption standard.

So how accurate is ASR?

It varies with the quality of the recorded sound. A recording of a presenter wearing a well-positioned lapel microphone will produce a more accurate ASR transcript than a recording of a presenter speaking some distance away from the lectern-mounted microphone. At best, recordings may achieve 80% accuracy; at worst, a reverberant off-microphone recording would probably achieve 0%, even though the recording is still intelligible to human ears.

If you would like to try out ASR with your Session 1 Echo360 recordings, please email ilearn.help@mq.edu.au with your unit code.

Posted by David Morgan
