
Real-Time Speech-to-Text Recognition on Huawei HarmonyOS

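Scenario

Convert audio into text. Short-speech mode accepts up to 60 s of audio and long-speech mode accepts up to 8 h. The audio can come from a PCM audio file or from live speech.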
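The two duration limits can be checked before a session is started. The following is a minimal sketch, not part of the Core Speech Kit API: `chooseMode` and the `ModeChoice` labels are hypothetical names, `'short'` matches the `recognizerMode` value used in this article, and `'long'` is an assumed counterpart.

```typescript
// Limits stated in the scenario description, in milliseconds.
const SHORT_MODE_MAX_MS = 60 * 1000;          // 60 s
const LONG_MODE_MAX_MS = 8 * 60 * 60 * 1000;  // 8 h

// Hypothetical labels for this sketch only.
type ModeChoice = 'short' | 'long' | 'unsupported';

// Pick a mode for audio of the given duration, or report that it is too long.
function chooseMode(durationMs: number): ModeChoice {
  if (!Number.isFinite(durationMs) || durationMs <= 0) {
    return 'unsupported';
  }
  if (durationMs <= SHORT_MODE_MAX_MS) {
    return 'short';
  }
  if (durationMs <= LONG_MODE_MAX_MS) {
    return 'long';
  }
  return 'unsupported';
}

console.log(chooseMode(45 * 1000));       // a 45 s clip
console.log(chooseMode(30 * 60 * 1000));  // a 30 min recording
```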

Development Steps

  1. Import the speech recognition classes into your project.

      import { speechRecognizer } from '@kit.CoreSpeechKit';
      import { BusinessError } from '@kit.BasicServicesKit';

  2. Call createEngine to initialize the engine and create a SpeechRecognitionEngine instance.

    createEngine has two call forms. This example uses the callback form; see the API reference for the other.

      let asrEngine: speechRecognizer.SpeechRecognitionEngine;
      let sessionId: string = '123456';
      // Parameters for creating the engine
      let extraParam: Record<string, Object> = {"locate": "CN", "recognizerMode": "short"};
      let initParamsInfo: speechRecognizer.CreateEngineParams = {
        language: 'zh-CN',
        online: 1,
        extraParams: extraParam
      };
      // Create the engine; the result is returned through a callback
      speechRecognizer.createEngine(initParamsInfo, (err: BusinessError, speechRecognitionEngine: speechRecognizer.SpeechRecognitionEngine) => {
        if (!err) {
          console.info('Succeeded in creating engine.');
          // Keep the created engine instance
          asrEngine = speechRecognitionEngine;
        } else {
          // Example: error code 1002200008 is returned when the engine is being destroyed
          console.error(`Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
        }
      });
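The error codes mentioned in this article's code comments can be gathered into a small lookup so that log messages are easier to read. This is only a sketch: `describeAsrError` is a hypothetical helper, the descriptions paraphrase the comments in this article, and treating "busy" and "being destroyed" as transient is an assumption rather than documented behavior.

```typescript
// Descriptions paraphrase the comments in this article's code samples.
const ASR_ERROR_HINTS: Record<number, string> = {
  1002200001: 'Engine creation failed: unsupported language or mode, initialization timeout, or missing resources',
  1002200002: 'Start failed: startListening was called again for a session that is already started',
  1002200006: 'Engine busy: recognition is already in progress, possibly in another application',
  1002200008: 'Engine is being destroyed',
};

// Assumption for this sketch: these two conditions may clear up on their own.
const TRANSIENT_CODES: number[] = [1002200006, 1002200008];

interface AsrErrorInfo {
  hint: string;
  transient: boolean;
}

// Map an error code to a readable hint and a "worth trying again later" flag.
function describeAsrError(code: number): AsrErrorInfo {
  const known: string | undefined = ASR_ERROR_HINTS[code];
  const hint = known !== undefined ? known : `Unknown error code ${code}; see the error code reference`;
  return { hint: hint, transient: TRANSIENT_CODES.indexOf(code) !== -1 };
}

console.log(describeAsrError(1002200008));
```

A callback could log `describeAsrError(err.code).hint` next to the raw code and message.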

  3. Once you have the SpeechRecognitionEngine instance, create a RecognitionListener object and pass it to setListener to receive speech recognition callbacks.

      // Create the listener object
      let setListener: speechRecognizer.RecognitionListener = {
        // Called when recognition starts successfully
        onStart(sessionId: string, eventMessage: string) {
          console.info(`onStart, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
        },
        // Called on events
        onEvent(sessionId: string, eventCode: number, eventMessage: string) {
          console.info(`onEvent, sessionId: ${sessionId} eventCode: ${eventCode} eventMessage: ${eventMessage}`);
        },
        // Called with recognition results, both intermediate and final
        onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
          console.info(`onResult, sessionId: ${sessionId} result: ${JSON.stringify(result)}`);
        },
        // Called when recognition completes
        onComplete(sessionId: string, eventMessage: string) {
          console.info(`onComplete, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
        },
        // Called on errors; error codes are reported through this method
        // Example: error code 1002200006 means the engine is busy because recognition is in progress
        // For more error codes, see the error code reference
        onError(sessionId: string, errorCode: number, errorMessage: string) {
          console.error(`onError, sessionId: ${sessionId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
        }
      }
      // Register the listener
      asrEngine.setListener(setListener);
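Because onResult delivers both intermediate and final results, an application usually needs to decide which text to display at any moment. The sketch below shows one possible approach. `RecognitionResultLike` is a local stand-in whose fields (`isFinal`, `isLast`, `result`) are assumed to mirror speechRecognizer.SpeechRecognitionResult, and `TranscriptCollector` is a hypothetical helper. It also assumes each intermediate result replaces the previous one rather than extending it.

```typescript
// Local stand-in for speechRecognizer.SpeechRecognitionResult (assumed shape).
interface RecognitionResultLike {
  isFinal: boolean;  // assumed: true once this piece of text will no longer change
  isLast: boolean;   // assumed: true for the last result of the session
  result: string;    // recognized text
}

class TranscriptCollector {
  private finalized: string[] = [];
  private pending: string = '';

  // Intended to be called from onResult.
  add(r: RecognitionResultLike): void {
    if (r.isFinal) {
      this.finalized.push(r.result);
      this.pending = '';
    } else {
      // Interim text replaces the previous interim text.
      this.pending = r.result;
    }
  }

  // Text to display now: finalized pieces followed by the current interim piece.
  text(): string {
    return this.finalized.join('') + this.pending;
  }
}

const collector = new TranscriptCollector();
collector.add({ isFinal: false, isLast: false, result: '你' });
collector.add({ isFinal: false, isLast: false, result: '你好' });
collector.add({ isFinal: true, isLast: true, result: '你好。' });
console.log(collector.text());
```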

  4. Set the start parameters for file-to-text and microphone-to-text recognition respectively, then call startListening to start recognition.

      // The two methods below are members of the page component; this.sessionId is one of its fields.
      // Start recognition for audio written from a file
      private startListeningForWriteAudio() {
        // Parameters for starting recognition
        let recognizerParams: speechRecognizer.StartParams = {
          sessionId: this.sessionId,
          audioInfo: { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 } // See AudioInfo for the audioInfo parameters
        }
        // Start recognition
        asrEngine.startListening(recognizerParams);
      };
      // Start recognition for audio captured from the microphone
      private startListeningForRecording() {
        let audioParam: speechRecognizer.AudioInfo = { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 }
        let extraParam: Record<string, Object> = {
          "recognitionMode": 0,
          "vadBegin": 2000,
          "vadEnd": 3000,
          "maxAudioDuration": 20000
        }
        let recognizerParams: speechRecognizer.StartParams = {
          sessionId: this.sessionId,
          audioInfo: audioParam,
          extraParams: extraParam
        }
        console.info('startListening start');
        asrEngine.startListening(recognizerParams);
      };
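The audioInfo values determine how many bytes of PCM data one second of audio occupies, which is useful when sizing buffers. The sketch below works through that arithmetic for the format used in this step. `PcmFormat`, `bytesPerSecond`, and `bytesForDuration` are hypothetical names, and the example assumes that `maxAudioDuration` is expressed in milliseconds.

```typescript
// Mirrors the numeric audioInfo fields used in this step.
interface PcmFormat {
  sampleRate: number;    // samples per second, e.g. 16000
  soundChannel: number;  // number of channels, e.g. 1
  sampleBit: number;     // bits per sample, e.g. 16
}

// Raw PCM byte rate: samples/s x channels x bytes per sample.
function bytesPerSecond(f: PcmFormat): number {
  return f.sampleRate * f.soundChannel * (f.sampleBit / 8);
}

// Bytes of PCM needed to hold audio of the given duration.
function bytesForDuration(f: PcmFormat, durationMs: number): number {
  return (bytesPerSecond(f) * durationMs) / 1000;
}

const format: PcmFormat = { sampleRate: 16000, soundChannel: 1, sampleBit: 16 };
console.log(bytesPerSecond(format));           // 32000 bytes per second
console.log(bytesForDuration(format, 20000));  // 640000 bytes for 20000 ms
```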

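  5. Call writeAudio to pass in the audio stream. To read from an audio file, prepare a PCM audio file in advance.

      // Placeholder: fill this with real audio data before writing
      let uint8Array: Uint8Array = new Uint8Array();
      // The audio stream can be obtained in two ways: 1. by recording; 2. by reading an audio file
      // For option 2, see the demo
      // Write the audio stream; each write must be exactly 640 or 1280 bytes long
      asrEngine.writeAudio(sessionId, uint8Array);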
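Since writeAudio accepts only 640-byte or 1280-byte buffers, audio that is already in memory has to be cut into frames of one of those sizes first. The following sketch shows one way to do that. `splitIntoFrames` is a hypothetical helper, and zero-padding the final partial frame (which is silence for signed 16-bit PCM) is a design choice of this sketch, not something the API prescribes.

```typescript
// Split PCM bytes into fixed-size frames.
// The final partial frame is zero-padded instead of being dropped.
function splitIntoFrames(pcm: Uint8Array, frameBytes: 640 | 1280): Uint8Array[] {
  const frames: Uint8Array[] = [];
  for (let offset = 0; offset < pcm.length; offset += frameBytes) {
    const frame = new Uint8Array(frameBytes); // starts zero-filled
    const end = Math.min(offset + frameBytes, pcm.length);
    frame.set(pcm.subarray(offset, end));
    frames.push(frame);
  }
  return frames;
}

// 3000 bytes of dummy audio: two full 1280-byte frames plus 440 bytes padded to 1280.
const dummyPcm = new Uint8Array(3000).fill(1);
const frames = splitIntoFrames(dummyPcm, 1280);
console.log(frames.length, frames[2].length);
```

With 16 kHz, 16-bit, mono PCM, a 1280-byte frame holds 40 ms of audio, which is why the complete example waits 40 ms between writes when reading from a file.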

  6. (Optional) To query the languages supported by the speech recognition service, call listLanguages.

    listLanguages has two call forms. This example uses the Promise form; see the API reference for the other.

      // Parameters for the query
      let languageQuery: speechRecognizer.LanguageQuery = {
        sessionId: sessionId
      };
      // Call listLanguages
      asrEngine.listLanguages(languageQuery).then((res: Array<string>) => {
        console.info(`Succeeded in listing languages, result: ${JSON.stringify(res)}.`);
      }).catch((err: BusinessError) => {
        console.error(`Failed to list languages. Code: ${err.code}, message: ${err.message}.`);
      });

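  7. (Optional) To end recognition, call finish.

      // End recognition
      asrEngine.finish(sessionId);

  8. (Optional) To cancel recognition, call cancel.

      // Cancel recognition
      asrEngine.cancel(sessionId);

  9. (Optional) To release the speech recognition engine resources, call shutdown.

      // Release the engine resources
      asrEngine.shutdown();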

  10. Add the ohos.permission.MICROPHONE permission to the module.json5 configuration file so that the microphone can be used. For details, see the section on declaring permissions.

      //...
      "requestPermissions": [
        {
          "name" : "ohos.permission.MICROPHONE",
          "reason": "$string:reason",
          "usedScene": {
            "abilities": [
              "EntryAbility"
            ],
            "when":"inuse"
          }
        }
      ],
      //...

Development Example

Tap a button to convert audio into text. The index.ets file is as follows:

import { speechRecognizer } from '@kit.CoreSpeechKit';
import { BusinessError } from '@kit.BasicServicesKit';
import { fileIo } from '@kit.CoreFileKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import AudioCapturer from './AudioCapturer';
const TAG = 'CoreSpeechKitDemo';
let asrEngine: speechRecognizer.SpeechRecognitionEngine;
@Entry
@Component
struct Index {
  @State createCount: number = 0;
  @State result: boolean = false;
  @State voiceInfo: string = "";
  @State sessionId: string = "123456";
  private mAudioCapturer = new AudioCapturer();

  build() {
    Column() {
      Scroll() {
        Column() {
          Button() {
            Text("CreateEngineByCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.createCount++;
            hilog.info(0x0000, TAG, `CreateAsrEngine:createCount:${this.createCount}`);
            this.createByCallback();
          })

          Button() {
            Text("setListener")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.setListener();
          })

          Button() {
            Text("startRecording")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.startRecording();
          })

          Button() {
            Text("writeAudio")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.writeAudio();
          })

          Button() {
            Text("queryLanguagesCallback")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            this.queryLanguagesCallback();
          })

          Button() {
            Text("finish")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // End recognition
            hilog.info(0x0000, TAG, "finish click:-->");
            asrEngine.finish(this.sessionId);
          })

          Button() {
            Text("cancel")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Cancel recognition
            hilog.info(0x0000, TAG, "cancel click:-->");
            asrEngine.cancel(this.sessionId);
          })

          Button() {
            Text("shutdown")
              .fontColor(Color.White)
              .fontSize(20)
          }
          .type(ButtonType.Capsule)
          .backgroundColor("#317AE7")
          .width("80%")
          .height(50)
          .margin(10)
          .onClick(() => {
            // Release the engine
            asrEngine.shutdown();
          })
        }
        .layoutWeight(1)
      }
      .width('100%')
      .height('100%')
    }
  }
  // Create the engine; the result is returned through a callback
  private createByCallback() {
    // Parameters for creating the engine
    let extraParam: Record<string, Object> = {"locate": "CN", "recognizerMode": "short"};
    let initParamsInfo: speechRecognizer.CreateEngineParams = {
      language: 'zh-CN',
      online: 1,
      extraParams: extraParam
    };
    // Call createEngine
    speechRecognizer.createEngine(initParamsInfo, (err: BusinessError, speechRecognitionEngine:
      speechRecognizer.SpeechRecognitionEngine) => {
      if (!err) {
        hilog.info(0x0000, TAG, 'Succeeded in creating engine.');
        // Keep the created engine instance
        asrEngine = speechRecognitionEngine;
      } else {
        // Error 1002200001: engine creation failed (unsupported language or mode, initialization timeout, missing resources, and so on)
        // Error 1002200006: the engine is busy, typically because several applications are calling the speech recognition engine at the same time
        // Error 1002200008: the engine is being destroyed
        hilog.error(0x0000, TAG, `Failed to create engine. Code: ${err.code}, message: ${err.message}.`);
      }
    });
  }
  // Query the supported languages; the result is returned through a callback
  private queryLanguagesCallback() {
    // Parameters for the query
    let languageQuery: speechRecognizer.LanguageQuery = {
      sessionId: this.sessionId
    };
    // Call listLanguages
    asrEngine.listLanguages(languageQuery, (err: BusinessError, languages: Array<string>) => {
      if (!err) {
        // Receive the currently supported languages
        hilog.info(0x0000, TAG, `Succeeded in listing languages, result: ${JSON.stringify(languages)}`);
      } else {
        hilog.error(0x0000, TAG, `Failed to list languages. Code: ${err.code}, message: ${err.message}.`);
      }
    });
  };
  // Start recognition for audio written from a file
  private startListeningForWriteAudio() {
    // Parameters for starting recognition
    let recognizerParams: speechRecognizer.StartParams = {
      sessionId: this.sessionId,
      audioInfo: { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 } // See AudioInfo for the audioInfo parameters
    }
    // Start recognition
    asrEngine.startListening(recognizerParams);
  };

  // Start recognition for audio captured from the microphone
  private startListeningForRecording() {
    let audioParam: speechRecognizer.AudioInfo = { audioType: 'pcm', sampleRate: 16000, soundChannel: 1, sampleBit: 16 }
    let extraParam: Record<string, Object> = {
      "recognitionMode": 0,
      "vadBegin": 2000,
      "vadEnd": 3000,
      "maxAudioDuration": 20000
    }
    let recognizerParams: speechRecognizer.StartParams = {
      sessionId: this.sessionId,
      audioInfo: audioParam,
      extraParams: extraParam
    }
    hilog.info(0x0000, TAG, 'startListening start');
    asrEngine.startListening(recognizerParams);
  };
  // Write an audio stream read from a PCM file
  private async writeAudio() {
    this.startListeningForWriteAudio();
    let ctx = getContext(this);
    // Use the first file found in the application's files directory
    let filenames: string[] = fileIo.listFileSync(ctx.filesDir);
    if (filenames.length <= 0) {
      hilog.error(0x0000, TAG, 'No audio file found in the files directory.');
      return;
    }
    let filePath: string = `${ctx.filesDir}/${filenames[0]}`;
    let file = fileIo.openSync(filePath, fileIo.OpenMode.READ_ONLY);
    try {
      let buf: ArrayBuffer = new ArrayBuffer(1280);
      let offset: number = 0;
      // Send the file in 1280-byte frames; a trailing partial frame is not sent
      while (1280 == fileIo.readSync(file.fd, buf, {
        offset: offset
      })) {
        let uint8Array: Uint8Array = new Uint8Array(buf);
        // Use the same session ID that was passed to startListening
        asrEngine.writeAudio(this.sessionId, uint8Array);
        // Wait 40 ms, the duration of 1280 bytes of 16 kHz, 16-bit, mono PCM
        await this.countDownLatch(1);
        offset = offset + 1280;
      }
    } catch (err) {
      hilog.error(0x0000, TAG, `Failed to read from file. Code: ${err.code}, message: ${err.message}.`);
    } finally {
      if (null != file) {
        fileIo.closeSync(file);
      }
    }
  }
  // Microphone speech to text
  private async startRecording() {
    this.startListeningForRecording();
    // Capture audio from the microphone
    await this.mAudioCapturer.init((dataBuffer: ArrayBuffer) => {
      hilog.info(0x0000, TAG, `start write, byteLength: ${dataBuffer.byteLength}`);
      let uint8Array: Uint8Array = new Uint8Array(dataBuffer);
      // Write the audio stream, using the same session ID that was passed to startListening
      asrEngine.writeAudio(this.sessionId, uint8Array);
    });
    hilog.info(0x0000, TAG, 'create capturer success');
    // init only creates the capturer; start() reads audio and invokes the callback until stop() is called
    this.mAudioCapturer.start();
  };
  // Wait for `count` intervals of 40 ms
  public async countDownLatch(count: number) {
    while (count > 0) {
      await this.sleep(40);
      count--;
    }
  }

  // Sleep for the given number of milliseconds
  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
  // Register the listener
  private setListener() {
    // Create the listener object
    let setListener: speechRecognizer.RecognitionListener = {
      // Called when recognition starts successfully
      onStart(sessionId: string, eventMessage: string) {
        hilog.info(0x0000, TAG, `onStart, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Called on events
      onEvent(sessionId: string, eventCode: number, eventMessage: string) {
        hilog.info(0x0000, TAG, `onEvent, sessionId: ${sessionId} eventCode: ${eventCode} eventMessage: ${eventMessage}`);
      },
      // Called with recognition results, both intermediate and final
      onResult(sessionId: string, result: speechRecognizer.SpeechRecognitionResult) {
        hilog.info(0x0000, TAG, `onResult, sessionId: ${sessionId} result: ${JSON.stringify(result)}`);
      },
      // Called when recognition completes
      onComplete(sessionId: string, eventMessage: string) {
        hilog.info(0x0000, TAG, `onComplete, sessionId: ${sessionId} eventMessage: ${eventMessage}`);
      },
      // Called on errors; error codes are reported through this method
      // Error 1002200002: starting recognition failed, triggered when startListening is called repeatedly
      // For more error codes, see the error code reference
      onError(sessionId: string, errorCode: number, errorMessage: string) {
        hilog.error(0x0000, TAG, `onError, sessionId: ${sessionId} errorCode: ${errorCode} errorMessage: ${errorMessage}`);
      },
    }
    // Register the listener
    asrEngine.setListener(setListener);
  };
}

Add an AudioCapturer.ts file to obtain the audio stream from the microphone.


'use strict';
/*
 * Copyright (c) Huawei Technologies Co., Ltd. 2023-2023. All rights reserved.
 */
import { audio } from '@kit.AudioKit';
import { hilog } from '@kit.PerformanceAnalysisKit';

const TAG = 'AudioCapturer';

/**
 * Audio collector tool
 */
export default class AudioCapturer {
  /**
   * Collector object
   */
  private mAudioCapturer = null;

  /**
   * Audio Data Callback Method
   */
  private mDataCallBack: (data: ArrayBuffer) => void = null;

  /**
   * Indicates whether recording data can be obtained.
   */
  private mCanWrite: boolean = true;

  /**
   * Audio stream information
   */
  private audioStreamInfo = {
    samplingRate: audio.AudioSamplingRate.SAMPLE_RATE_16000,
    channels: audio.AudioChannel.CHANNEL_1,
    sampleFormat: audio.AudioSampleFormat.SAMPLE_FORMAT_S16LE,
    encodingType: audio.AudioEncodingType.ENCODING_TYPE_RAW
  }

  /**
   * Audio collector information
   */
  private audioCapturerInfo = {
    source: audio.SourceType.SOURCE_TYPE_MIC,
    capturerFlags: 0
  }

  /**
   * Audio Collector Option Information
   */
  private audioCapturerOptions = {
    streamInfo: this.audioStreamInfo,
    capturerInfo: this.audioCapturerInfo
  }

  /**
   * Initialize
   * @param dataCallBack called with each buffer read from the microphone
   */
  public async init(dataCallBack: (data: ArrayBuffer) => void) {
    if (null != this.mAudioCapturer) {
      hilog.error(0x0000, TAG, 'AudioCapturerUtil already init');
      return;
    }
    this.mDataCallBack = dataCallBack;
    this.mAudioCapturer = await audio.createAudioCapturer(this.audioCapturerOptions).catch(error => {
      hilog.error(0x0000, TAG, `AudioCapturerUtil init createAudioCapturer failed, code is ${error.code}, message is ${error.message}`);
    });
  }

  /**
   * start recording
   */
  public async start() {
    hilog.info(0x0000, TAG, `AudioCapturerUtil start`);
    // init was not called or failed to create the capturer
    if (!this.mAudioCapturer) {
      hilog.error(0x0000, TAG, `AudioCapturerUtil start failed, capturer is not initialized`);
      return;
    }
    let stateGroup = [audio.AudioState.STATE_PREPARED, audio.AudioState.STATE_PAUSED, audio.AudioState.STATE_STOPPED];
    if (stateGroup.indexOf(this.mAudioCapturer.state) === -1) {
      hilog.error(0x0000, TAG, `AudioCapturerUtil start failed`);
      return;
    }
    this.mCanWrite = true;
    await this.mAudioCapturer.start();
    // Keep reading and forwarding buffers until stop() clears mCanWrite
    while (this.mCanWrite) {
      let bufferSize = await this.mAudioCapturer.getBufferSize();
      let buffer = await this.mAudioCapturer.read(bufferSize, true);
      this.mDataCallBack(buffer)
    }
  }

  /**
   * stop recording
   */
  public async stop() {
    if (this.mAudioCapturer.state !== audio.AudioState.STATE_RUNNING && this.mAudioCapturer.state !== audio.AudioState.STATE_PAUSED) {
      hilog.error(0x0000, TAG, `AudioCapturerUtil stop Capturer is not running or paused`);
      return;
    }
    this.mCanWrite = false;
    await this.mAudioCapturer.stop();
    if (this.mAudioCapturer.state === audio.AudioState.STATE_STOPPED) {
      hilog.info(0x0000, TAG, `AudioCapturerUtil Capturer stopped`);
    } else {
      hilog.error(0x0000, TAG, `Capturer stop failed`);
    }
  }

  /**
   * release
   */
  public async release() {
    if (this.mAudioCapturer.state === audio.AudioState.STATE_RELEASED || this.mAudioCapturer.state === audio.AudioState.STATE_NEW) {
      hilog.error(0x0000, TAG, `Capturer already released`);
      return;
    }
    await this.mAudioCapturer.release();
    // Check the state before clearing the reference
    if (this.mAudioCapturer.state == audio.AudioState.STATE_RELEASED) {
      hilog.info(0x0000, TAG, `Capturer released`);
    } else {
      hilog.error(0x0000, TAG, `Capturer release failed`);
    }
    this.mAudioCapturer = null;
  }
}
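The capturer above forwards buffers of whatever size getBufferSize reports, while writeAudio accepts only 640-byte or 1280-byte buffers. If the two sizes differ on a given device, the microphone data would need to be regrouped before it is written. The sketch below shows one way to do that. `FrameAssembler` is a hypothetical helper that is not part of any HarmonyOS kit, and whether regrouping is needed at all depends on the buffer size the device actually reports.

```typescript
// Regroup chunks of arbitrary size into fixed-size frames.
class FrameAssembler {
  private buffered: Uint8Array = new Uint8Array(0);
  private readonly frameBytes: 640 | 1280;
  private readonly onFrame: (frame: Uint8Array) => void;

  constructor(frameBytes: 640 | 1280, onFrame: (frame: Uint8Array) => void) {
    this.frameBytes = frameBytes;
    this.onFrame = onFrame;
  }

  // Accept a chunk of any size and emit every frame that it completes.
  push(chunk: ArrayBuffer): void {
    const incoming = new Uint8Array(chunk);
    const merged = new Uint8Array(this.buffered.length + incoming.length);
    merged.set(this.buffered, 0);
    merged.set(incoming, this.buffered.length);

    let offset = 0;
    while (merged.length - offset >= this.frameBytes) {
      // slice() copies, so an emitted frame never aliases the internal buffer.
      this.onFrame(merged.slice(offset, offset + this.frameBytes));
      offset += this.frameBytes;
    }
    // Keep the leftover bytes for the next call.
    this.buffered = merged.slice(offset);
  }

  // Number of bytes still waiting for the next frame.
  pendingBytes(): number {
    return this.buffered.length;
  }
}

const emitted: Uint8Array[] = [];
const assembler = new FrameAssembler(640, (frame: Uint8Array) => { emitted.push(frame); });
assembler.push(new ArrayBuffer(1000)); // one full frame, 360 bytes left over
assembler.push(new ArrayBuffer(1000)); // 1360 bytes available: two more frames, 80 left over
console.log(emitted.length, assembler.pendingBytes());
```

In the page component, the callback passed to `init` could call `assembler.push(dataBuffer)`, with `onFrame` forwarding each frame to `asrEngine.writeAudio`.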

Request the microphone permission in the EntryAbility.ets file.


import { abilityAccessCtrl, AbilityConstant, UIAbility, Want } from '@kit.AbilityKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import { window } from '@kit.ArkUI';
import { BusinessError } from '@kit.BasicServicesKit';

export default class EntryAbility extends UIAbility {
  onCreate(want: Want, launchParam: AbilityConstant.LaunchParam): void {
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onCreate');
  }

  onDestroy(): void {
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onDestroy');
  }

  onWindowStageCreate(windowStage: window.WindowStage): void {
    // Main window is created, set main page for this ability
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageCreate');

    // Ask the user for the microphone permission
    let atManager = abilityAccessCtrl.createAtManager();
    atManager.requestPermissionsFromUser(this.context, ['ohos.permission.MICROPHONE']).then((data) => {
      hilog.info(0x0000, 'testTag', 'data:' + JSON.stringify(data));
      hilog.info(0x0000, 'testTag', 'data permissions:' + data.permissions);
      hilog.info(0x0000, 'testTag', 'data authResults:' + data.authResults);
    }).catch((err: BusinessError) => {
      hilog.error(0x0000, 'testTag', 'errCode: ' + err.code + ', errMessage: ' + err.message);
    });

    windowStage.loadContent('pages/Index', (err, data) => {
      if (err.code) {
        hilog.error(0x0000, 'testTag', 'Failed to load the content. Cause: %{public}s', JSON.stringify(err) ?? '');
        return;
      }
      hilog.info(0x0000, 'testTag', 'Succeeded in loading the content. Data: %{public}s', JSON.stringify(data) ?? '');
    });
  }

  onWindowStageDestroy(): void {
    // Main window is destroyed, release UI related resources
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onWindowStageDestroy');
  }

  onForeground(): void {
    // Ability has brought to foreground
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onForeground');
  }

  onBackground(): void {
    // Ability has back to background
    hilog.info(0x0000, 'testTag', '%{public}s', 'Ability onBackground');
  }
}
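The `then` callback above only logs `permissions` and `authResults`. An application would normally also check whether the microphone permission was actually granted before starting a recording. The sketch below shows such a check under the assumption that an `authResults` entry of 0 means granted and any other value means not granted. `PermissionResultLike` is a local stand-in for the real result type, and `deniedPermissions` is a hypothetical helper.

```typescript
// Minimal stand-in for the result passed to the `then` callback.
interface PermissionResultLike {
  permissions: string[];
  authResults: number[]; // assumed: 0 = granted, anything else = not granted
}

// Return the names of the requested permissions that were not granted.
function deniedPermissions(result: PermissionResultLike): string[] {
  return result.permissions.filter((_name, index) => result.authResults[index] !== 0);
}

const sample: PermissionResultLike = {
  permissions: ['ohos.permission.MICROPHONE'],
  authResults: [-1],
};
console.log(deniedPermissions(sample));
```

If the returned list is non-empty, the page could disable the startRecording button or explain to the user why the permission is needed.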
