Introducing Motion-Speech

In our last post we looked into the mechanics of creating a gem for RubyMotion, detailing the structure, spec coverage and basics of getting your files into the RubyMotion "pipeline" to be included in the compilation process.

Since we have covered the basics previously, in this article we will spend most of our time in the code itself and have fun with dynamic methods, text-to-speech functionality, and method delegation.

We will examine the motion-speech gem created for this article.

Getting Started

Before we get started, please take a minute to familiarize yourself with the two concepts we'll be covering in depth. The first is meta-programming in Ruby, and specifically what you can do with it in RubyMotion. The second is AVSpeechSynthesizer and friends, which is a fairly simple read.

Ruby meta-programming

Before we go any further, I encourage you to read this excellently detailed article from the wonderful Clay Allsopp for lots of great meta-programming techniques. Clay covers dynamic methods, singleton methods, class methods, instance_eval and more. In my opinion, this is a must-read if you want to begin to unlock the expressiveness RubyMotion affords you and the ability to construct a DSL for your own use.

AVSpeechSynthesizer

For the purposes of this gem, we will take a look at how to make an iOS device running iOS 7 or later talk. AVSpeechSynthesizer provides an interface for converting text to speech on the device.

Once created, the synthesizer can optionally inform a delegate when it begins, pauses, or finishes speech. The unit of speech is an utterance in Apple parlance. Accordingly, you ask the synthesizer to speak an utterance, which is an instance of AVSpeechUtterance.

The utterance itself is where most of the customization occurs. You can configure delays before or after the synthesizer speaks the utterance, change the rate at which it is spoken, and set a different voice, pitch, or any number of other factors to customize the experience.
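
As a rough illustration of those knobs, the sketch below applies a hash of options to an utterance-like object. rate, pitchMultiplier, voice, preUtteranceDelay and postUtteranceDelay are AVSpeechUtterance properties; the FakeUtterance struct, the OPTION_SETTERS table and the configure_utterance helper are hypothetical names, introduced only so the snippet runs in plain Ruby without AVFoundation.

```ruby
# Hypothetical stand-in for AVSpeechUtterance so this runs outside RubyMotion.
# The member names mirror the real utterance properties.
FakeUtterance = Struct.new(:string, :rate, :pitchMultiplier, :voice,
                           :preUtteranceDelay, :postUtteranceDelay)

# Maps friendly option keys onto utterance property setters.
# This mapping is an assumption for illustration, not the gem's actual code.
OPTION_SETTERS = {
  rate:       :rate=,
  pitch:      :pitchMultiplier=,
  voice:      :voice=,
  pre_delay:  :preUtteranceDelay=,
  post_delay: :postUtteranceDelay=
}.freeze

# Applies each known option to the utterance and rejects unknown keys.
def configure_utterance(utterance, options = {})
  options.each do |key, value|
    setter = OPTION_SETTERS.fetch(key) do
      raise ArgumentError, "unknown utterance option: #{key.inspect}"
    end
    utterance.public_send(setter, value)
  end
  utterance
end

utterance = configure_utterance(FakeUtterance.new("Hello"), rate: 0.3, pitch: 1.5)
puts utterance.rate
puts utterance.pitchMultiplier
```

Under RubyMotion the same helper should work against a real AVSpeechUtterance, since it only relies on the property setters.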

Motion-Speech

To make working with AVSpeechSynthesizer and AVSpeechUtterance more Ruby-like and fun, we will create a gem that provides the following features:

  • Pass a string and have it spoken, without having to create all the needed Cocoa classes yourself.
  • Pass a block that will be called once the string has been spoken.
  • Pass a block accepting one argument, through which you can customize how you respond to an utterance being started, paused, resumed, canceled, or finished.
  • Provide simple methods such as pause, stop, and resume to control playback.
  • Provide a simple way for any object, not just a string, to be passed to our speaker and be spoken as we want.

This interface might look like the following, and in fact it does. This is taken directly from the README of the motion-speech gem created for this article.

# Speak a sentence
Motion::Speech::Speaker.speak "Getting started with speech"

# Control the rate of speech
Motion::Speech::Speaker.speak "Getting started with speech", rate: 1

# Control the pitch
Motion::Speech::Speaker.speak "Getting started", pitch: 2.0

# Custom voice
voice_ref = AVSpeechSynthesisVoice.voiceWithLanguage("some_lang")
Motion::Speech::Speaker.speak "lorem", voice: voice_ref

# Pass a block to be called when the speech is completed
Motion::Speech::Speaker.speak "Getting started with speech" do
  puts "completed the utterance"
end

Providing a speakable interface looks like:

class Name < String
  def to_speakable
    "My name is #{self}"
  end
end

my_name = Name.new("Matt Brewer")
Motion::Speech::Speaker.speak my_name
# => "My name is Matt Brewer" spoken

Providing a way to further monitor events via a block syntax:

Motion::Speech::Speaker.speak "lorem" do |events|
  events.start do |speaker|
    puts "started speaking: '#{speaker.message}'"
  end

  events.finish do |speaker|
    puts "finished speaking: '#{speaker.message}'"
  end

  events.pause do |speaker|
    puts "paused while speaking: '#{speaker.message}'"
  end

  events.cancel do |speaker|
    puts "canceled while speaking: '#{speaker.message}'"
  end

  events.resume do |speaker|
    puts "resumed speaking: '#{speaker.message}'"
  end
end

And lastly, controlling playback:

speaker = Motion::Speech::Speaker.speak "lorem"

# pausing playback accepts symbols or constants
speaker.pause :word
speaker.pause :immediate
speaker.pause AVSpeechBoundaryImmediate

speaker.paused?
# => true

speaker.speaking?
# => false

# stopping playback accepts symbols or constants
speaker.stop :word
speaker.stop :immediate
speaker.stop AVSpeechBoundaryImmediate

# resume playback
speaker.resume
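
One plausible way to accept both symbols and constants is a small lookup table. The sketch below is an assumption about how such a helper could look, not the gem's actual implementation: BOUNDARIES and boundary_from are hypothetical names, and the integer values are stand-ins so the snippet runs in plain Ruby.

```ruby
# Stand-in values so the sketch runs outside RubyMotion. On iOS these
# constants come from AVFoundation and are left untouched.
AVSpeechBoundaryImmediate = 0 unless defined?(AVSpeechBoundaryImmediate)
AVSpeechBoundaryWord      = 1 unless defined?(AVSpeechBoundaryWord)

# Hypothetical lookup from friendly symbols to boundary constants.
BOUNDARIES = {
  immediate: AVSpeechBoundaryImmediate,
  word:      AVSpeechBoundaryWord
}.freeze

# Accepts :immediate, :word, or a raw constant and returns the constant.
def boundary_from(value)
  return value unless value.is_a?(Symbol)

  BOUNDARIES.fetch(value) do
    raise ArgumentError, "unknown boundary: #{value.inspect}"
  end
end

puts boundary_from(:word) == AVSpeechBoundaryWord
puts boundary_from(AVSpeechBoundaryImmediate) == AVSpeechBoundaryImmediate
```

The resolved value is what you would then hand to AVSpeechSynthesizer's pauseSpeakingAtBoundary or stopSpeakingAtBoundary.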

Text-to-speech

What follows is an excerpt of the Motion::Speech::Speaker class, trimmed down to show the minimum functionality needed to produce speech.

module Motion
  module Speech
    class Speaker
      attr_reader :message, :options

      MultipleCallsToSpeakError = Class.new(StandardError)

      def self.speak(*args, &block)
        new(*args, &block).speak
      end

      def initialize(speakable, options={}, &block)
        @message = string_from_speakable(speakable)
        @options = options
        @spoken = false

        if block_given?
          if block.arity == 0
            events.finish &block
          elsif block.arity == 1
            block.call events
          else
            raise ArgumentError, 'block must accept either 0 or 1 arguments'
          end
        end
      end

      def speak
        raise MultipleCallsToSpeakError if @spoken

        synthesizer.speakUtterance utterance
        @spoken = true
        self
      end

      def utterance
        return @utterance unless @utterance.nil?

        @utterance = AVSpeechUtterance.speechUtteranceWithString(message)
        @utterance.rate = options.fetch(:rate, 0.15)
        @utterance
      end

      def synthesizer
        @synthesizer ||= AVSpeechSynthesizer.new.tap { |s| s.delegate = self }
      end

      private

      def events
        @events ||= EventBlock.new
      end

      def string_from_speakable(speakable)
        if speakable.respond_to?(:to_speakable)
          speakable.to_speakable
        else
          speakable
        end
      end
    end
  end
end

There are a few things to take note of here:

Creating an Error Class

We can easily create a subclass using the Class.new syntax and assigning that to a constant. In this case we use the following to create an exception class.

MultipleCallsToSpeakError = Class.new(StandardError)
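
To see the error class in action without a device, here is a small plain-Ruby sketch. OneShot is a hypothetical stand-in for Speaker that keeps only the "spoken" flag; the error message text is also made up for the example.

```ruby
# Class.new(StandardError) builds an anonymous subclass; assigning it to a
# constant gives the class its name.
MultipleCallsToSpeakError = Class.new(StandardError)

# Hypothetical stand-in for Speaker that only tracks the "spoken" flag.
class OneShot
  def initialize
    @spoken = false
  end

  def speak
    raise MultipleCallsToSpeakError, "speak may only be called once" if @spoken

    @spoken = true
    self
  end
end

one_shot = OneShot.new
one_shot.speak

begin
  one_shot.speak
rescue MultipleCallsToSpeakError => e
  puts "#{e.class}: #{e.message}"
end
# prints "MultipleCallsToSpeakError: speak may only be called once"
```

Because the class descends from StandardError, a bare rescue will catch it too.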

Splat Operator

Your good friend the splat operator still works as you expect. There is no need to spell out the arguments that the class method speak accepts, since it simply forwards them, along with the block, to initialize.

def self.speak(*args, &block)
  new(*args, &block).speak
end
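
The same forwarding pattern can be tried in plain Ruby. Greeter below is a hypothetical class, unrelated to the gem, that mirrors the shape of Speaker: a class-level convenience method that passes everything through to initialize.

```ruby
# Hypothetical class showing the same forwarding pattern as Speaker.speak.
class Greeter
  attr_reader :name, :options, :callback

  # Class-level convenience: forwards every argument and the block untouched.
  def self.greet(*args, &block)
    new(*args, &block).greet
  end

  def initialize(name, options = {}, &block)
    @name = name
    @options = options
    @callback = block
  end

  # Builds the greeting, hands it to the block if one was given, and returns it.
  def greet
    greeting = "#{options.fetch(:salutation, 'Hello')}, #{name}"
    callback.call(greeting) if callback
    greeting
  end
end

Greeter.greet("Matt", salutation: "Howdy") { |text| puts text }
# prints "Howdy, Matt"
```

If initialize later gains a new argument, the class method does not need to change.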

Block Arity and Inspection

All of the methods provided for the inspection of a block still exist. We use this to our advantage: if you pass a block accepting no arguments, we assume you want the block called when the utterance is completed. If you pass a block accepting one argument, we pass you an instance of Motion::Speech::EventBlock, allowing further customization. If your block accepts two or more arguments, you're doing something wrong, and we raise an ArgumentError.

if block_given?
  if block.arity == 0
    events.finish &block
  elsif block.arity == 1
    block.call events
  else
    raise ArgumentError, 'block must accept either 0 or 1 arguments'
  end
end
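
If you want to poke at arity yourself, the following plain-Ruby sketch shows the values for a few block shapes. classify_block is a hypothetical helper that mirrors the branching above; the symbols it returns are made up for the example.

```ruby
# Arity for a few block shapes: none, one, two, and splat parameters.
p [proc { }.arity, proc { |x| }.arity, proc { |x, y| }.arity, proc { |*xs| }.arity]
# prints [0, 1, 2, -1]

# Hypothetical helper mirroring the dispatch in Speaker#initialize.
def classify_block(&block)
  return :none if block.nil?

  case block.arity
  when 0 then :completion_callback
  when 1 then :event_configuration
  else raise ArgumentError, 'block must accept either 0 or 1 arguments'
  end
end

p classify_block { }            # prints :completion_callback
p classify_block { |events| }   # prints :event_configuration
```

Note that a splat block reports a negative arity, so it falls into the error branch as well.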

respond_to?

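respond_to? works fine here, and we'll use it to check whether the object we were given knows how to turn itself into a speakable string. If it does not, we assume it already is one and pass it through unchanged.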

def string_from_speakable(speakable)
  if speakable.respond_to?(:to_speakable)
    speakable.to_speakable
  else
    speakable
  end
end
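
Since the check is duck-typed, the speakable object does not have to be a String subclass. Here is a plain-Ruby sketch; Temperature is a hypothetical value object invented for the example, and the helper is reproduced so the snippet stands alone.

```ruby
# Same helper as in the excerpt, reproduced so the example is self-contained.
def string_from_speakable(speakable)
  if speakable.respond_to?(:to_speakable)
    speakable.to_speakable
  else
    speakable
  end
end

# Hypothetical value object: not a String, but it knows how to describe itself.
Temperature = Struct.new(:degrees, :unit) do
  def to_speakable
    "It is #{degrees} degrees #{unit}"
  end
end

puts string_from_speakable(Temperature.new(72, "Fahrenheit"))
# prints "It is 72 degrees Fahrenheit"
puts string_from_speakable("plain strings pass through")
# prints "plain strings pass through"
```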

EventBlock and meta-programming

Now we get to the point of the article with some more meta-programming, as we examine the Motion::Speech::EventBlock class.

module Motion
  module Speech
    class EventBlock

      Events = %w(start finish cancel pause resume).freeze

      Events.each do |method|
        define_method method do |*args, &block|
          if !block.nil?
            instance_variable_set("@#{method}_block", block)
          else
            instance_variable_get("@#{method}_block")
          end
        end
      end

      def call(event, speaker)
        block = send(event)
        block.call(speaker) unless block.nil?
      end
    end
  end
end

The preceding class illustrates usage of several useful concepts:

  • #define_method works as you would expect, so use it to your advantage.
  • #instance_variable_get and #instance_variable_set are amazingly helpful when you are in the REPL poking through your app, but they can be useful in crafting DSL classes as well.
  • Don't be afraid to use #send, #public_send and friends.
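
To try these three techniques in isolation, here is a small plain-Ruby sketch. Settings is a hypothetical class, unrelated to the gem, with a combined reader/writer per attribute and one private method to show the difference between #send and #public_send.

```ruby
# Hypothetical class, unrelated to the gem, isolating the three techniques.
class Settings
  %w(volume language).each do |name|
    # define_method: one combined reader/writer per attribute name.
    define_method(name) do |*args|
      ivar = "@#{name}"
      if args.empty?
        instance_variable_get(ivar)             # reader: no argument given
      else
        instance_variable_set(ivar, args.first) # writer: store the argument
      end
    end
  end

  private

  def secret
    "hidden"
  end
end

settings = Settings.new
settings.volume 5
puts settings.volume          # prints 5
puts settings.send(:secret)   # prints hidden; send ignores visibility

begin
  settings.public_send(:secret)
rescue NoMethodError
  puts "public_send respects private"
end
```

Reaching for #public_send by default is a reasonable habit, since it keeps private methods private.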

What does this class actually accomplish? For each event, it provides a method that stores the block when given one and returns the stored block when called without one. The call method then looks up the block for an event and invokes it with the speaker, if one was stored.

event = Motion::Speech::EventBlock.new
event.start { true }
event.start.should.be.instance_of Proc

event.finish.should.be.nil

event.call(:finish, nil).should.be.nil
event.call(:start, nil).should.be.true

An instance of EventBlock is what allows the user of the Speaker class to easily listen in on callbacks from the synthesizer as the state changes. Speaker itself is configured as the delegate of the synthesizer and implements the related delegate methods.

module Motion
  module Speech
    class Speaker

      def speechSynthesizer(s, didFinishSpeechUtterance: utterance)
        events.call :finish, self
      end

      def speechSynthesizer(s, didStartSpeechUtterance: utterance)
        events.call :start, self
      end

      def speechSynthesizer(s, didCancelSpeechUtterance: utterance)
        events.call :cancel, self
      end

      def speechSynthesizer(s, didPauseSpeechUtterance: utterance)
        events.call :pause, self
      end

      def speechSynthesizer(s, didContinueSpeechUtterance: utterance)
        events.call :resume, self
      end

      def events
        @events ||= EventBlock.new
      end

    end
  end
end

As you may have noticed in Speaker#initialize, if the given block accepts no arguments, we store it as the completion callback with a call to events.finish &block. Alternatively, if the block accepts one argument, we call it and pass in the events instance for the user to customize as they see fit.
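
The whole flow, from delegate callback to user block, can be simulated in plain Ruby. In the sketch below, FakeSynthesizer and DemoSpeaker are hypothetical stand-ins, and did_start and did_finish are made-up method names that replace the Objective-C style delegate selectors, which only exist under RubyMotion. EventBlock is reproduced from above, without its modules, so the snippet stands alone.

```ruby
# EventBlock as shown earlier in the article.
class EventBlock
  Events = %w(start finish cancel pause resume).freeze

  Events.each do |method|
    define_method method do |*args, &block|
      if !block.nil?
        instance_variable_set("@#{method}_block", block)
      else
        instance_variable_get("@#{method}_block")
      end
    end
  end

  def call(event, speaker)
    block = send(event)
    block.call(speaker) unless block.nil?
  end
end

# Hypothetical stand-in for AVSpeechSynthesizer: it "speaks" synchronously
# and notifies its delegate through plain Ruby methods.
class FakeSynthesizer
  attr_accessor :delegate

  def speak(message)
    delegate.did_start(message) if delegate
    delegate.did_finish(message) if delegate
  end
end

# Hypothetical, trimmed-down Speaker wired to the fake synthesizer.
class DemoSpeaker
  attr_reader :message

  def initialize(message, &block)
    @message = message
    @events = EventBlock.new
    if block
      block.arity == 0 ? @events.finish(&block) : block.call(@events)
    end
    @synthesizer = FakeSynthesizer.new.tap { |s| s.delegate = self }
  end

  def speak
    @synthesizer.speak(message)
    self
  end

  # Delegate callbacks simply forward to the stored event blocks.
  def did_start(_message)
    @events.call :start, self
  end

  def did_finish(_message)
    @events.call :finish, self
  end
end

log = []
DemoSpeaker.new("lorem") do |events|
  events.start  { |speaker| log << "started: #{speaker.message}" }
  events.finish { |speaker| log << "finished: #{speaker.message}" }
end.speak

puts log
# prints "started: lorem" then "finished: lorem"
```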

Wrapping Up

What are some examples of ways to continue working on this gem?

  • Maybe Motion::Speech::Speaker could be a subclass of AVSpeechSynthesizer since so much of the workings of the class are based upon the synthesizer.
  • Possibly subclass AVSpeechSynthesisVoice to make finding a voice and applying it to an utterance even easier.

I would love to hear your ideas and comments on the article and the gem.

Up Next

In the next article we will take a break from RubyMotion and dive into the world of Rails and ActiveAdmin. I've been a contributor and maintainer of ActiveAdmin for quite a while and have written a gem, ActiveAdmin-StateMachine, that we will examine in more detail.

ActiveAdmin-StateMachine is a great example of crafting a simple DSL for ActiveAdmin exposing functionality from another popular Rails gem, state_machine.
