| apply_bpe | Apply BPE Merges |
| apply_timestamp_rules | Apply Timestamp Token Rules |
| audio_duration | Get Audio Duration |
| audio_to_mel | Convert Audio to Mel Spectrogram |
| beam_search_decode | Beam Search Decode |
| build_byte_decoder | Build Reverse Byte Decoder |
| byte_to_token | Convert Byte to BPE Token |
| clean_text | Clean Transcribed Text |
| compression_ratio | Compression Ratio |
| compute_stft | Compute STFT Magnitude |
| compute_word_timestamps | Word-Level Timestamp Alignment |
| copy_if_exists | Copy Weight if It Exists |
| create_decoder | Create Decoder from Config |
| create_encoder | Create Encoder from Config |
| create_mel_filterbank_fallback | Create Mel Filterbank (Fallback) |
| decode_bpe_bytes | Decode BPE Bytes Back to Text |
| decode_timestamp | Decode Timestamp Token |
| decode_with_fallback | Decode with Temperature Fallback |
| detect_language | Language Detection |
| detect_language_from_mel | Detect Language from Mel Spectrogram |
| detect_language_from_pipeline | Detect Language from Pipeline |
| download_tokenizer_files | Download Tokenizer Files from HuggingFace |
| download_whisper_model | Download Model from HuggingFace |
| dtw_align | DTW Alignment |
| ensure_tokenizer_files | Ensure Tokenizer Files Are Downloaded |
| expand_kv_cache | Expand KV Cache for Beam Search |
| extract_segments | Extract Segments with Timestamps |
| forced_decode | Forced Decode |
| get_initial_tokens | Get Initial Decoder Tokens |
| get_model_path | Get Model Cache Path |
| get_weights_path | Get Path to Model Weights |
| greedy_decode | Greedy Decoding |
| group_into_words | Group Subword Tokens into Words |
| hz_to_mel | Convert Hz to Mel Scale |
| is_timestamp_token | Check if Token Is a Timestamp |
| list_downloaded_models | List Downloaded Models |
| list_whisper_models | List Available Models |
| load_audio | Load and Preprocess Audio |
| load_decoder_weights | Load Decoder Weights |
| load_encoder_weights | Load Encoder Weights |
| load_mel_filterbank | Load Pre-computed Mel Filterbank |
| load_whisper_model | Load Whisper Model |
| load_whisper_weights | Load Weights from Safetensors |
| medfilt1 | 1D Median Filter |
| mel_to_hz | Convert Mel Scale to Hz |
| model_exists | Check if Model Is Downloaded |
| pad_or_trim | Pad or Trim Audio to Fixed Length |
| parse_device | Parse Device Argument |
| parse_dtype | Parse Dtype Argument |
| rearrange_kv_cache | Rearrange KV Cache by Beam Indices |
| sample_decode | Sample Decode |
| split_audio | Split Long Audio into Chunks |
| tokenizer_decode | Decode Token IDs to Text |
| tokenizer_encode | Encode Text to Token IDs |
| transcribe | Transcribe Audio |
| transcribe_chunk | Transcribe Single Chunk |
| transcribe_long | Transcribe Long Audio |
| whisper_attention | Attention Layer |
| whisper_config | Whisper Model Configurations |
| whisper_decoder | Text Decoder |
| whisper_decoder_layer | Decoder Layer |
| whisper_device | Device and Dtype Management |
| whisper_dtype | Get Default Dtype |
| whisper_encoder | Audio Encoder |
| whisper_encoder_layer | Encoder Layer |
| whisper_language_table | Whisper Language Table |
| whisper_lang_from_id | Get Language Code from Token ID |
| whisper_lang_token | Get Language Token ID |
| whisper_model | Whisper Model |
| whisper_pipeline | Whisper Transcription |
| WHISPER_SAMPLE_RATE | Whisper Audio Sample Rate |
| whisper_special_tokens | Special Token IDs |
| whisper_tokenizer | Whisper BPE Tokenizer |