The Speech framework recognizes speech in recorded audio files and live audio, with support for custom language models. You configure an SFSpeechRecognizer for a locale and submit either an SFSpeechURLRecognitionRequest for a recorded file or an SFSpeechAudioBufferRecognitionRequest for live audio, then receive results through an SFSpeechRecognitionTask whose SFSpeechRecognitionResult carries an SFTranscription composed of SFTranscriptionSegment values. Custom recognition is driven through SFSpeechLanguageModel and SFCustomLanguageModelData. Before starting recognition, check the app's SFSpeechRecognizerAuthorizationStatus. The framework also provides a SpeechAnalyzer actor and modular SpeechTranscriber, DictationTranscriber, and SpeechDetector components that consume AnalyzerInput and build on the SpeechModule protocol for analyzing audio.
Speech Recognition Essentials
Create a recognizer for a locale and start transcribing audio.
- SFSpeechRecognizer (class, macOS 10.15+): An object you use to check for the availability of the speech recognition service, and to initiate the speech recognition process.
- SFSpeechRecognizerDelegate (protocol, macOS 10.15+): A protocol that you adopt in your objects to track the availability of a speech recognizer.
- SFSpeechRecognizerAuthorizationStatus (enumeration, macOS 10.15+): The app's authorization to perform speech recognition.
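
As a minimal sketch of how these three types fit together, the following helper requests authorization and then creates a recognizer for a locale. The function name `makeRecognizer` and its completion-handler shape are hypothetical and exist only for illustration. It assumes the app's Info.plist already contains an `NSSpeechRecognitionUsageDescription` entry, and that the code is compiled in Swift 5 language mode (stricter concurrency checking may require `@Sendable` annotations).

```swift
import Speech

// Hypothetical helper: asks the user for permission, then hands back a
// recognizer for the given locale, or nil if recognition can't start.
func makeRecognizer(localeIdentifier: String,
                    completion: @escaping (SFSpeechRecognizer?) -> Void) {
    SFSpeechRecognizer.requestAuthorization { status in
        // The handler is not guaranteed to run on the main queue.
        guard status == .authorized else {
            completion(nil)
            return
        }
        // The initializer returns nil when the locale isn't supported.
        let recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier))
        // Availability can change over time; adopt SFSpeechRecognizerDelegate
        // to be told when it does.
        completion(recognizer?.isAvailable == true ? recognizer : nil)
    }
}
```

A caller might invoke `makeRecognizer(localeIdentifier: "en-US") { recognizer in ... }` and hop back to the main queue before touching any UI.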
Recognition Requests
Describe the audio to transcribe, whether a recorded file or a live audio stream.
- SFSpeechRecognitionRequest (class, macOS 10.15+): An abstract class that represents a request to recognize speech from an audio source.
- SFSpeechURLRecognitionRequest (class, macOS 10.15+): A request to recognize speech in a recorded audio file.
- SFSpeechAudioBufferRecognitionRequest (class, macOS 10.15+): A request to recognize speech from captured audio content, such as audio from the device's microphone.
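
For live audio, one common pattern is to tap an AVAudioEngine input node and append each captured buffer to an SFSpeechAudioBufferRecognitionRequest. The sketch below shows that pattern; the function name `startLiveRequest` and the buffer size of 1024 frames are illustrative assumptions, not values taken from the framework documentation. On iOS you would typically also configure an AVAudioSession first, which is omitted here.

```swift
import AVFoundation
import Speech

// Hypothetical helper: starts capturing microphone audio and feeds it
// into a buffer-based recognition request.
func startLiveRequest(engine: AVAudioEngine) throws -> SFSpeechAudioBufferRecognitionRequest {
    let request = SFSpeechAudioBufferRecognitionRequest()
    request.shouldReportPartialResults = true   // deliver results as the user speaks

    let input = engine.inputNode
    let format = input.outputFormat(forBus: 0)

    // Each captured buffer is appended to the request as it arrives.
    input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
        request.append(buffer)
    }

    engine.prepare()
    try engine.start()
    return request
}
```

To finish, stop the engine, call `removeTap(onBus: 0)` on the input node, and call `endAudio()` on the request so the recognizer knows no more audio is coming.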
Recognition Tasks and Results
Track an in-progress recognition and read back its transcribed output.
- SFSpeechRecognitionTask (class, macOS 10.15+): A task object for monitoring the speech recognition progress.
- SFSpeechRecognitionTaskDelegate (protocol, macOS 10.15+): A protocol with methods for managing multi-utterance speech recognition requests.
- SFSpeechRecognitionTaskState (enumeration, macOS 10.15+): The state of the task associated with the recognition request.
- SFSpeechRecognitionTaskHint (enumeration, macOS 10.15+): The type of task for which you are using speech recognition.
- SFSpeechRecognitionResult (class, macOS 10.15+): An object that contains the partial or final results of a speech recognition request.
- SFSpeechRecognitionMetadata (class, macOS 11.3+): The metadata of speech in the audio of a speech recognition request.
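
The relationship between a request, its task, and the results it delivers can be sketched with a recorded file. The helper name `transcribeFile` and its `Result`-based completion handler are hypothetical. The sketch assumes authorization has already been granted and that the recognizer passed in is available.

```swift
import Speech

// Hypothetical helper: transcribes a recorded file and reports only the
// final transcription (or the first error) to the caller.
@discardableResult
func transcribeFile(at url: URL,
                    using recognizer: SFSpeechRecognizer,
                    completion: @escaping (Result<String, Error>) -> Void) -> SFSpeechRecognitionTask {
    let request = SFSpeechURLRecognitionRequest(url: url)
    request.shouldReportPartialResults = false  // only the final result is needed here
    request.taskHint = .dictation               // optional hint about the kind of speech

    // The returned task can be used to observe state or to cancel.
    return recognizer.recognitionTask(with: request) { result, error in
        if let error {
            completion(.failure(error))
        } else if let result, result.isFinal {
            completion(.success(result.bestTranscription.formattedString))
        }
    }
}
```

Keeping a reference to the returned SFSpeechRecognitionTask lets the caller call `cancel()` if the user abandons the operation.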
Transcriptions
Inspect the recognized text and its constituent segments.
- SFTranscription (class, macOS 10.15+): A textual representation of the specified speech in its entirety, as recognized by the speech recognizer.
- SFTranscriptionSegment (class, macOS 10.15+): A discrete part of an entire transcription, as identified by the speech recognizer.
- SFAcousticFeature (class, macOS 10.15+): The value of a voice analysis metric.
- SFVoiceAnalytics (class, macOS 10.15+): A collection of vocal analysis metrics.
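
To illustrate how a transcription breaks down into segments, the following sketch walks the segments of the best transcription and formats each one with its time range and confidence. The function name `describeSegments` and the output format are illustrative assumptions.

```swift
import Speech

// Hypothetical helper: produces one human-readable line per segment of
// the best transcription in a recognition result.
func describeSegments(of result: SFSpeechRecognitionResult) -> [String] {
    result.bestTranscription.segments.map { segment in
        // timestamp and duration are expressed in seconds from the start of the audio.
        let start = String(format: "%.2f", segment.timestamp)
        let end = String(format: "%.2f", segment.timestamp + segment.duration)
        return "\(start)s-\(end)s \(segment.substring) (confidence \(segment.confidence))"
    }
}
```

Segment confidence is generally only meaningful on final results, so a caller may want to check `result.isFinal` before relying on it.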
Custom Language Models
Bias recognition toward domain-specific vocabulary using custom language model data.
- SFSpeechLanguageModel (class, macOS 14+): A language model built from custom training data.
- SFCustomLanguageModelData (class): An object that generates and exports custom language model training data.
- DataInsertable (protocol): A protocol supporting the custom language model training data result builder.
- TemplateInsertable (protocol): A protocol supporting the custom language model training data result builder.
Speech Analysis
Analyze audio through a modular pipeline of transcription and detection components.
- SpeechAnalyzer (actor): An actor that coordinates a pipeline of modules to analyze an audio stream.
- SpeechTranscriber (class): A speech-to-text transcription module that's appropriate for normal conversation and general purposes.
- DictationTranscriber (class): A speech-to-text transcription module that's similar to system dictation features and compatible with older devices.
- SpeechDetector (class): A module that performs a voice activity detection (VAD) analysis.
- SpeechModule (protocol): Protocol that all analyzer modules conform to.
- LocaleDependentSpeechModule (protocol): A module that requires locale-specific assets.
- SpeechModuleResult (protocol): Protocol that all module results conform to.
- SpeechModels (enumeration): Namespace for methods related to model management.
Analyzer Input
Supply and convert audio input that feeds the speech analysis pipeline.
- AnalyzerInput (structure): Time-coded audio data.
- AnalyzerInputConverter (class): Converts audio buffers to a format suitable for analysis by a speech analyzer.
- AnalysisContext (class): Contextual information that may be shared among analyzers.
- AssetInputSequenceProvider (class): Reads from an audio file or asset, providing its audio in a format suitable for analysis by a speech analyzer.
- CaptureInputSequenceProvider (class): Reads from an AV capture device such as a microphone, providing the captured audio in a format suitable for analysis by a speech analyzer.
Asset Management
Manage the on-device models and assets required for analysis.
- AssetInventory (class): Manages the assets that are necessary for transcription or other analyses.
Errors
Errors reported by speech recognition and analysis.
- SFSpeechError (structure, macOS 14+): A structure describing errors that occur during speech recognition or analysis.
Classes
- AssetInstallationRequest (class): An object that describes, downloads, and installs a selection of assets.