ales committed on
Commit bcc9601 · verified · 1 Parent(s): ebf26a7

Delete readme.md

Files changed (1)
  1. readme.md +0 -41
readme.md DELETED
@@ -1,41 +0,0 @@
- ---
- title: ai-audio-books
- emoji: 📕
- colorFrom: blue
- colorTo: gray
- sdk: gradio
- sdk_version: 4.44.1
- app_file: app.py
- pinned: false
- ---
-
- ### Action items
- - [ ] move speaker split to new pipeline
- - [ ] env template
- - [ ] move from AI/ML api to langchain
- - [ ] bugfix w/ 11labs api
- - [ ] async synthesis
- - [ ] map characters to voices
- - [ ] emotion enrichment: add intonation markers, auto-set TTS params
- - [x] generate good enough sound effects for background
- - [ ] mix effects with narration
- - [x] allow file upload (.txt)
- - optimizations
- - [ ] combine sequential phrases of same character in single phrase
- - [ ] support large texts. use batching. problem: how to ensure same characters?
- can detect characters in first prompt, then split text in each batch into character phrases
- - [ ] probably split large phrases into smaller ones
-
- ### Backlog
- - [ ] prepare text for TTS
- - [x] prepare prompt to split text into character phrases
- - [ ] split large text into batches, process each batch separately, concat batches
- - [ ] try to identify unknown characters
- - [ ] select voices for TTS
- - [ ] map characters to available voices
- - [ ] use LLM to recognize characters for a given text and provide descriptions
- detailed enough to select appropriate voice
- - [ ] preprocess text phrases for TTS: add intonation markers, auto-set TTS params
- - [ ] run TTS to create narration
- - [ ] add effects. mix them with created narration
-