This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +0 -2
  2. .streamlit/config.toml +0 -10
  3. README.md +12 -38
  4. app.py +217 -322
  5. clarifai_grpc_helper.py +71 -0
  6. examples/example_04.json +0 -3
  7. file_embeddings/embeddings.npy +0 -3
  8. file_embeddings/icons.npy +0 -3
  9. global_config.py +12 -56
  10. helpers/__init__.py +0 -0
  11. helpers/icons_embeddings.py +0 -166
  12. helpers/image_search.py +0 -148
  13. helpers/pptx_helper.py +0 -982
  14. helpers/text_helper.py +0 -89
  15. icons/png128/0-circle.png +0 -0
  16. icons/png128/1-circle.png +0 -0
  17. icons/png128/123.png +0 -0
  18. icons/png128/2-circle.png +0 -0
  19. icons/png128/3-circle.png +0 -0
  20. icons/png128/4-circle.png +0 -0
  21. icons/png128/5-circle.png +0 -0
  22. icons/png128/6-circle.png +0 -0
  23. icons/png128/7-circle.png +0 -0
  24. icons/png128/8-circle.png +0 -0
  25. icons/png128/9-circle.png +0 -0
  26. icons/png128/activity.png +0 -0
  27. icons/png128/airplane.png +0 -0
  28. icons/png128/alarm.png +0 -0
  29. icons/png128/alien-head.png +0 -0
  30. icons/png128/alphabet.png +0 -0
  31. icons/png128/amazon.png +0 -0
  32. icons/png128/amritsar-golden-temple.png +0 -0
  33. icons/png128/amsterdam-canal.png +0 -0
  34. icons/png128/amsterdam-windmill.png +0 -0
  35. icons/png128/android.png +0 -0
  36. icons/png128/angkor-wat.png +0 -0
  37. icons/png128/apple.png +0 -0
  38. icons/png128/archive.png +0 -0
  39. icons/png128/argentina-obelisk.png +0 -0
  40. icons/png128/artificial-intelligence-brain.png +0 -0
  41. icons/png128/atlanta.png +0 -0
  42. icons/png128/austin.png +0 -0
  43. icons/png128/automation-decision.png +0 -0
  44. icons/png128/award.png +0 -0
  45. icons/png128/balloon.png +0 -0
  46. icons/png128/ban.png +0 -0
  47. icons/png128/bandaid.png +0 -0
  48. icons/png128/bangalore.png +0 -0
  49. icons/png128/bank.png +0 -0
  50. icons/png128/bar-chart-line.png +0 -0
.gitattributes CHANGED
@@ -33,5 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
- *.pptx filter=lfs diff=lfs merge=lfs -text
37
- pptx_templates/Minimalist_sales_pitch.pptx filter=lfs diff=lfs merge=lfs -text
 
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
 
 
.streamlit/config.toml DELETED
@@ -1,10 +0,0 @@
1
- [server]
2
- runOnSave = true
3
- headless = false
4
- maxUploadSize = 0
5
-
6
- [browser]
7
- gatherUsageStats = false
8
-
9
- [theme]
10
- base = "dark"
README.md CHANGED
@@ -4,7 +4,7 @@ emoji: 🏢
4
  colorFrom: yellow
5
  colorTo: green
6
  sdk: streamlit
7
- sdk_version: 1.32.2
8
  app_file: app.py
9
  pinned: false
10
  license: mit
@@ -16,62 +16,36 @@ We spend a lot of time on creating the slides and organizing our thoughts for an
16
  With SlideDeck AI, co-create slide decks on any topic with Generative Artificial Intelligence.
17
  Describe your topic and let SlideDeck AI generate a PowerPoint slide deck for you—it's as simple as that!
18
 
19
- SlideDeck AI is powered by [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407).
20
  Originally, it was built using the Llama 2 API provided by Clarifai.
21
 
22
- *Update (v4.0)*: Legacy SlideDeck AI allowed one-shot generation of a slide deck based on the inputs.
23
- In contrast, SlideDeck AI *Reloaded* enables an iterative workflow with a conversational interface,
24
- where you can create and improve the presentation.
25
-
26
-
27
  # Process
28
 
29
  SlideDeck AI works in the following way:
30
 
31
- 1. Given a topic description, it uses Mistral Nemo Instruct to generate the *initial* content of the slides.
32
  The output is generated as structured JSON data based on a pre-defined schema.
33
- 2. Next, it uses the keywords from the JSON output to search and download a few images with a certain probability.
34
- 3. Subsequently, it uses the `python-pptx` library to generate the slides,
35
  based on the JSON data from the previous step.
36
- A user can choose from a set of three pre-defined presentation templates.
37
- 4. At this stage onward, a user can provide additional instructions to *refine* the content.
38
- For example, one can ask to add another slide or modify an existing slide.
39
- A history of instructions is maintained.
40
- 5. Every time SlideDeck AI generates a PowerPoint presentation, a download button is provided.
41
- Clicking on the button will download the file.
42
-
43
-
44
- # Icons
45
-
46
- SlideDeck AI uses a subset of icons from [bootstrap-icons-1.11.3](https://github.com/twbs/icons)
47
- (MIT license) in the slides. A few icons from [SVG Repo](https://www.svgrepo.com/)
48
- (CC0, MIT, and Apache licenses) are also used.
49
-
50
-
51
- # Known Issues
52
-
53
- **Connection timeout**: Requests sent to the Hugging Face Inference endpoint might time out. If that happens, wait for a while and try again.
54
 
55
- The following is not an issue but might appear as a strange behavior:
56
- - **Cannot paste text in the input box**: If the length of the copied text is greater than the maximum
57
- number of allowed characters in the textbox, pasting would not work.
58
 
59
 
60
  # Local Development
61
 
62
- SlideDeck AI uses [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)
63
  via the Hugging Face Inference API.
64
- To run this project by yourself, you need to provide the `HUGGINGFACEHUB_API_TOKEN` API key,
65
- for example, in a `.env` file. For image search, the `PEXEL_API_KEY` should be added.
66
- Visit the respective websites to obtain the keys.
67
 
68
 
69
  # Live Demo
70
 
71
- - [SlideDeck AI](https://huggingface.co/spaces/barunsaha/slide-deck-ai) on Hugging Face Spaces
72
- - [Demo video](https://youtu.be/QvAKzNKtk9k) of the chat interface on YouTube
73
 
74
 
75
  # Award
76
 
77
- SlideDeck AI has won the 3rd Place in the [Llama 2 Hackathon with Clarifai](https://lablab.ai/event/llama-2-hackathon-with-clarifai) in 2023.
 
4
  colorFrom: yellow
5
  colorTo: green
6
  sdk: streamlit
7
+ sdk_version: 1.26.0
8
  app_file: app.py
9
  pinned: false
10
  license: mit
 
16
  With SlideDeck AI, co-create slide decks on any topic with Generative Artificial Intelligence.
17
  Describe your topic and let SlideDeck AI generate a PowerPoint slide deck for you—it's as simple as that!
18
 
19
+ SlideDeck AI is powered by [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
20
  Originally, it was built using the Llama 2 API provided by Clarifai.
21
 
 
 
 
 
 
22
  # Process
23
 
24
  SlideDeck AI works in the following way:
25
 
26
+ 1. Given a topic description, it uses Mistral 7B Instruct to generate the outline/contents of the slides.
27
  The output is generated as structured JSON data based on a pre-defined schema.
28
+ 2. Subsequently, it uses the `python-pptx` library to generate the slides,
 
29
  based on the JSON data from the previous step.
30
+ Here, a user can choose from a set of three pre-defined presentation templates.
31
+ 3. In addition, it uses Metaphor to fetch Web pages related to the topic.
32
 
33
+ 4. ~~Finally, it uses Stable Diffusion 2 to generate an image, based on the title and each slide heading.~~
 
 
34
 
35
 
36
  # Local Development
37
 
38
+ SlideDeck AI uses [Mistral 7B Instruct](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
39
  via the Hugging Face Inference API.
40
+ To run this project by yourself, you need to provide the `HUGGINGFACEHUB_API_TOKEN` and `METAPHOR_API_KEY` API keys,
41
+ for example, in a `.env` file. Visit the respective websites to obtain the keys.
 
42
 
43
 
44
  # Live Demo
45
 
46
+ [SlideDeck AI](https://huggingface.co/spaces/barunsaha/slide-deck-ai)
 
47
 
48
 
49
  # Award
50
 
51
+ SlideDeck AI has won the 3rd Place in the [Llama 2 Hackathon with Clarifai](https://lablab.ai/event/llama-2-hackathon-with-clarifai).
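
Note on the README's Local Development section: the new text requires both `HUGGINGFACEHUB_API_TOKEN` and `METAPHOR_API_KEY`. The following is a minimal sketch, not code from this repository, of how a startup check could fail early when either key is missing. The function name `load_api_keys` and the choice to raise `RuntimeError` are assumptions made for illustration. Loading a `.env` file into the environment is left out here.

```python
import os
from typing import Dict, Mapping

# The two key names come from the README; everything else is illustrative.
REQUIRED_KEYS = ('HUGGINGFACEHUB_API_TOKEN', 'METAPHOR_API_KEY')


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required API keys, or raise if any is missing or blank."""
    missing = [name for name in REQUIRED_KEYS if not env.get(name, '').strip()]
    if missing:
        raise RuntimeError('Missing API keys: ' + ', '.join(missing))

    # Strip stray whitespace that often sneaks in via copy-paste.
    return {name: env[name].strip() for name in REQUIRED_KEYS}
```

Passing the environment as a parameter keeps the check easy to exercise with a plain dictionary instead of the real process environment.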
app.py CHANGED
@@ -1,415 +1,310 @@
1
- """
2
- Streamlit app containing the UI and the application logic.
3
- """
4
- import datetime
5
- import logging
6
  import pathlib
7
- import random
8
- import sys
9
  import tempfile
10
- from typing import List, Union
11
 
12
- import huggingface_hub
13
  import json5
14
- import requests
15
  import streamlit as st
16
- from langchain_community.chat_message_histories import StreamlitChatMessageHistory
17
- from langchain_core.messages import HumanMessage
18
- from langchain_core.prompts import ChatPromptTemplate
19
-
20
- sys.path.append('..')
21
- sys.path.append('../..')
22
 
23
- import helpers.icons_embeddings as ice
 
24
  from global_config import GlobalConfig
25
- from helpers import llm_helper, pptx_helper, text_helper
26
 
27
 
28
- @st.cache_data
29
- def _load_strings() -> dict:
30
- """
31
- Load various strings to be displayed in the app.
32
- :return: The dictionary of strings.
33
- """
34
 
35
- with open(GlobalConfig.APP_STRINGS_FILE, 'r', encoding='utf-8') as in_file:
36
- return json5.loads(in_file.read())
 
 
37
 
38
 
39
  @st.cache_data
40
- def _get_prompt_template(is_refinement: bool) -> str:
41
  """
42
- Return a prompt template.
43
 
44
- :param is_refinement: Whether this is the initial or refinement prompt.
45
- :return: The prompt template as f-string.
46
  """
47
 
48
- if is_refinement:
49
- with open(GlobalConfig.REFINEMENT_PROMPT_TEMPLATE, 'r', encoding='utf-8') as in_file:
50
- template = in_file.read()
51
- else:
52
- with open(GlobalConfig.INITIAL_PROMPT_TEMPLATE, 'r', encoding='utf-8') as in_file:
53
- template = in_file.read()
54
-
55
- return template
56
 
57
 
58
  @st.cache_resource
59
- def _get_llm():
60
  """
61
- Get an LLM instance.
62
 
63
- :return: The LLM.
64
  """
65
 
66
- return llm_helper.get_hf_endpoint()
67
 
68
 
69
  @st.cache_data
70
- def _get_icons_list() -> List[str]:
71
  """
72
- Get a list of available icons names without the dir name and file extension.
73
 
74
- :return: A list of the icons.
 
75
  """
76
 
77
- return ice.get_icons_list()
78
-
79
-
80
- APP_TEXT = _load_strings()
81
-
82
- # Session variables
83
- CHAT_MESSAGES = 'chat_messages'
84
- DOWNLOAD_FILE_KEY = 'download_file_name'
85
- IS_IT_REFINEMENT = 'is_it_refinement'
86
-
87
-
88
- logger = logging.getLogger(__name__)
89
 
90
- texts = list(GlobalConfig.PPTX_TEMPLATE_FILES.keys())
91
- captions = [GlobalConfig.PPTX_TEMPLATE_FILES[x]['caption'] for x in texts]
92
- pptx_template = st.sidebar.radio(
93
- 'Select a presentation template:',
94
- texts,
95
- captions=captions,
96
- horizontal=True
97
- )
98
 
99
 
100
  def build_ui():
101
  """
102
- Display the input elements for content generation.
103
  """
104
 
 
 
105
  st.title(APP_TEXT['app_name'])
106
  st.subheader(APP_TEXT['caption'])
107
  st.markdown(
108
- '![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fhuggingface.co%2Fspaces%2Fbarunsaha%2Fslide-deck-ai&countColor=%23263759)' # noqa: E501
 
 
 
 
 
109
  )
110
 
111
- with st.expander('Usage Policies and Limitations'):
112
- st.text(APP_TEXT['tos'] + '\n\n' + APP_TEXT['tos2'])
113
-
114
- set_up_chat_ui()
115
-
116
-
117
- def set_up_chat_ui():
118
- """
119
- Prepare the chat interface and related functionality.
120
- """
121
-
122
- with st.expander('Usage Instructions'):
123
- st.markdown(GlobalConfig.CHAT_USAGE_INSTRUCTIONS)
124
- st.markdown(
125
- '[SlideDeck AI](https://github.com/barun-saha/slide-deck-ai) is an Open-Source project.' # noqa: E501
126
- ' It is powered by' # noqa: E501
127
- ' [Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407).' # noqa: E501
128
  )
129
 
130
- st.info(
131
- 'If you like SlideDeck AI, please consider leaving a heart ❤️ on the'
132
- ' [Hugging Face Space](https://huggingface.co/spaces/barunsaha/slide-deck-ai/) or'
133
- ' a star ⭐ on [GitHub](https://github.com/barun-saha/slide-deck-ai).'
134
- ' Your [feedback](https://forms.gle/JECFBGhjvSj7moBx9) is appreciated.'
135
- )
136
-
137
- # view_messages = st.expander('View the messages in the session state')
138
 
139
- st.chat_message('ai').write(
140
- random.choice(APP_TEXT['ai_greetings'])
141
- )
 
 
 
142
 
143
- history = StreamlitChatMessageHistory(key=CHAT_MESSAGES)
 
144
 
145
- if _is_it_refinement():
146
- template = _get_prompt_template(is_refinement=True)
147
- else:
148
- template = _get_prompt_template(is_refinement=False)
149
-
150
- prompt_template = ChatPromptTemplate.from_template(template)
151
-
152
- # Since Streamlit app reloads at every interaction, display the chat history
153
- # from the saved session state
154
- for msg in history.messages:
155
- msg_type = msg.type
156
- if msg_type == 'user':
157
- st.chat_message(msg_type).write(msg.content)
158
- else:
159
- st.chat_message(msg_type).code(msg.content, language='json')
160
-
161
- if prompt := st.chat_input(
162
- placeholder=APP_TEXT['chat_placeholder'],
163
- max_chars=GlobalConfig.LLM_MODEL_MAX_INPUT_LENGTH
164
- ):
165
- if not text_helper.is_valid_prompt(prompt):
166
- st.error(
167
- 'Not enough information provided!'
168
- ' Please be a little more descriptive and type a few words'
169
- ' with a few characters :)'
170
- )
171
- return
172
-
173
- logger.info('User input: %s | #characters: %d', prompt, len(prompt))
174
- st.chat_message('user').write(prompt)
175
-
176
- user_messages = _get_user_messages()
177
- user_messages.append(prompt)
178
- list_of_msgs = [
179
- f'{idx + 1}. {msg}' for idx, msg in enumerate(user_messages)
180
- ]
181
- list_of_msgs = '\n'.join(list_of_msgs)
182
-
183
- if _is_it_refinement():
184
- formatted_template = prompt_template.format(
185
- **{
186
- 'instructions': list_of_msgs,
187
- 'previous_content': _get_last_response(),
188
- 'icons_list': '\n'.join(_get_icons_list())
189
- }
190
- )
191
- else:
192
- formatted_template = prompt_template.format(
193
- **{
194
- 'question': prompt,
195
- 'icons_list': '\n'.join(_get_icons_list())
196
- }
197
- )
198
-
199
- progress_bar = st.progress(0, 'Preparing to call LLM...')
200
- response = ''
201
 
202
- try:
203
- for chunk in _get_llm().stream(formatted_template):
204
- response += chunk
 
205
 
206
- # Update the progress bar
207
- progress_percentage = min(
208
- len(response) / GlobalConfig.LLM_MODEL_MAX_OUTPUT_LENGTH, 0.95
209
- )
210
- progress_bar.progress(
211
- progress_percentage,
212
- text='Streaming content...this might take a while...'
213
- )
214
- except requests.exceptions.ConnectionError:
215
- msg = (
216
- 'A connection error occurred while streaming content from the LLM endpoint.'
217
- ' Unfortunately, the slide deck cannot be generated. Please try again later.'
218
- )
219
- logger.error(msg)
220
- st.error(msg)
221
- return
222
- except huggingface_hub.errors.ValidationError as ve:
223
- msg = (
224
- f'An error occurred while trying to generate the content: {ve}'
225
- '\nPlease try again with a significantly shorter input text.'
226
- )
227
- logger.error(msg)
228
- st.error(msg)
229
- return
230
- except Exception as ex:
231
- msg = (
232
- f'An unexpected error occurred while generating the content: {ex}'
233
- '\nPlease try again later, possibly with different inputs.'
234
- )
235
- logger.error(msg)
236
- st.error(msg)
237
- return
238
-
239
- history.add_user_message(prompt)
240
- history.add_ai_message(response)
241
-
242
- # The content has been generated as JSON
243
- # There may be trailing ``` at the end of the response -- remove them
244
- # To be careful: ``` may be part of the content as well when code is generated
245
- response_cleaned = text_helper.get_clean_json(response)
246
-
247
- logger.info(
248
- 'Cleaned JSON response:: original length: %d | cleaned length: %d',
249
- len(response), len(response_cleaned)
250
- )
251
- # logger.debug('Cleaned JSON: %s', response_cleaned)
252
 
253
- # Now create the PPT file
254
- progress_bar.progress(
255
- GlobalConfig.LLM_PROGRESS_MAX,
256
- text='Finding photos online and generating the slide deck...'
257
- )
258
- path = generate_slide_deck(response_cleaned)
259
- progress_bar.progress(1.0, text='Done!')
260
 
261
- st.chat_message('ai').code(response, language='json')
262
-
263
- if path:
264
- _display_download_button(path)
265
-
266
- logger.info(
267
- '#messages in history / 2: %d',
268
- len(st.session_state[CHAT_MESSAGES]) / 2
269
- )
270
 
271
 
272
- def generate_slide_deck(json_str: str) -> Union[pathlib.Path, None]:
273
  """
274
- Create a slide deck and return the file path. In case there is any error creating the slide
275
- deck, the path may be to an empty file.
276
 
277
- :param json_str: The content in *valid* JSON format.
278
- :return: The path to the .pptx file or `None` in case of error.
 
 
279
  """
280
 
281
- try:
282
- parsed_data = json5.loads(json_str)
283
- except ValueError:
284
- st.error(
285
- 'Encountered error while parsing JSON...will fix it and retry'
286
- )
287
- logger.error(
288
- 'Caught ValueError: trying again after repairing JSON...'
289
- )
290
- try:
291
- parsed_data = json5.loads(text_helper.fix_malformed_json(json_str))
292
- except ValueError:
293
- st.error(
294
- 'Encountered an error again while fixing JSON...'
295
- 'the slide deck cannot be created, unfortunately ☹'
296
- '\nPlease try again later.'
297
- )
298
- logger.error(
299
- 'Caught ValueError: failed to repair JSON!'
300
- )
301
-
302
- return None
303
- except RecursionError:
304
- st.error(
305
- 'Encountered an error while parsing JSON...'
306
- 'the slide deck cannot be created, unfortunately ☹'
307
- '\nPlease try again later.'
308
- )
309
- logger.error('Caught RecursionError while parsing JSON. Cannot generate the slide deck!')
310
 
311
- return None
312
- except Exception:
313
- st.error(
314
- 'Encountered an error while parsing JSON...'
315
- 'the slide deck cannot be created, unfortunately ☹'
316
- '\nPlease try again later.'
317
- )
318
- logger.error(
319
- 'Caught an unexpected exception: failed to parse JSON!'
320
- )
321
 
322
- return None
323
 
324
- if DOWNLOAD_FILE_KEY in st.session_state:
325
- path = pathlib.Path(st.session_state[DOWNLOAD_FILE_KEY])
326
- else:
327
- temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
328
- path = pathlib.Path(temp.name)
329
- st.session_state[DOWNLOAD_FILE_KEY] = str(path)
330
 
331
- if temp:
332
- temp.close()
333
 
334
- try:
335
- logger.debug('Creating PPTX file: %s...', st.session_state[DOWNLOAD_FILE_KEY])
336
- pptx_helper.generate_powerpoint_presentation(
337
- parsed_data,
338
- slides_template=pptx_template,
339
- output_file_path=path
340
- )
341
- except Exception as ex:
342
- st.error(APP_TEXT['content_generation_error'])
343
- logger.error('Caught a generic exception: %s', str(ex))
344
 
345
- return path
 
346
 
347
 
348
- def _is_it_refinement() -> bool:
349
  """
350
- Whether it is the initial prompt or a refinement.
351
 
352
- :return: True if it is a refinement; False if it is the initial prompt.
 
 
353
  """
354
 
355
- if IS_IT_REFINEMENT in st.session_state:
356
- return True
357
-
358
- if len(st.session_state[CHAT_MESSAGES]) >= 2:
359
- # Prepare for the next call
360
- st.session_state[IS_IT_REFINEMENT] = True
361
- return True
362
-
363
- return False
364
 
 
 
 
 
 
 
 
 
 
365
 
366
- def _get_user_messages() -> List[str]:
367
- """
368
- Get a list of user messages submitted until now from the session state.
369
 
370
- :return: The list of user messages.
371
- """
372
 
373
- return [
374
- msg.content for msg in st.session_state[CHAT_MESSAGES] if isinstance(msg, HumanMessage)
375
- ]
376
 
377
 
378
- def _get_last_response() -> str:
379
  """
380
- Get the last response generated by AI.
381
 
382
- :return: The response text.
 
 
 
383
  """
384
 
385
- return st.session_state[CHAT_MESSAGES][-1].content
 
386
 
 
 
 
 
 
387
 
388
- def _display_messages_history(view_messages: st.expander):
389
- """
390
- Display the history of messages.
391
 
392
- :param view_messages: The list of AI and Human messages.
393
- """
 
 
 
 
 
 
394
 
395
- with view_messages:
396
- view_messages.json(st.session_state[CHAT_MESSAGES])
397
 
 
398
 
399
- def _display_download_button(file_path: pathlib.Path):
 
400
  """
401
- Display a download button to download a slide deck.
402
 
403
- :param file_path: The path of the .pptx file.
404
  """
405
 
406
- with open(file_path, 'rb') as download_file:
407
- st.download_button(
408
- 'Download PPTX file ⬇️',
409
- data=download_file,
410
- file_name='Presentation.pptx',
411
- key=datetime.datetime.now()
412
- )
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
413
 
414
 
415
  def main():
 
 
 
 
 
 
1
  import pathlib
2
+ import logging
 
3
  import tempfile
4
+ from typing import List, Tuple
5
 
 
6
  import json5
7
+ import metaphor_python as metaphor
8
  import streamlit as st
 
 
 
 
 
 
9
 
10
+ import llm_helper
11
+ import pptx_helper
12
  from global_config import GlobalConfig
 
13
 
14
 
15
+ APP_TEXT = json5.loads(open(GlobalConfig.APP_STRINGS_FILE, 'r', encoding='utf-8').read())
16
+ GB_CONVERTER = 2 ** 30
17
+
 
 
 
18
 
19
+ logging.basicConfig(
20
+ level=GlobalConfig.LOG_LEVEL,
21
+ format='%(asctime)s - %(message)s',
22
+ )
23
 
24
 
25
  @st.cache_data
26
+ def get_contents_wrapper(text: str) -> str:
27
  """
28
+ Fetch and cache the slide deck contents on a topic by calling an external API.
29
 
30
+ :param text: The presentation topic
31
+ :return: The slide deck contents or outline in JSON format
32
  """
33
 
34
+ logging.info('LLM call because of cache miss...')
35
+ return llm_helper.generate_slides_content(text).strip()
 
 
 
 
 
 
36
 
37
 
38
  @st.cache_resource
39
+ def get_metaphor_client_wrapper() -> metaphor.Metaphor:
40
  """
41
+ Create a Metaphor client for semantic Web search.
42
 
43
+ :return: Metaphor instance
44
  """
45
 
46
+ return metaphor.Metaphor(api_key=GlobalConfig.METAPHOR_API_KEY)
47
 
48
 
49
  @st.cache_data
50
+ def get_web_search_results_wrapper(text: str) -> List[Tuple[str, str]]:
51
  """
52
+ Fetch and cache the Web search results on a given topic.
53
 
54
+ :param text: The topic
55
+ :return: A list of (title, link) tuples
56
  """
57
 
58
+ results = []
59
+ search_results = get_metaphor_client_wrapper().search(
60
+ text,
61
+ use_autoprompt=True,
62
+ num_results=5
63
+ )
 
 
 
 
 
 
64
 
65
+ for a_result in search_results.results:
66
+ results.append((a_result.title, a_result.url))
67
+
68
+ return results
69
+
70
+
71
+ # def get_disk_used_percentage() -> float:
72
+ # """
73
+ # Compute the disk usage.
74
+ #
75
+ # :return: Percentage of the disk space currently used
76
+ # """
77
+ #
78
+ # total, used, free = shutil.disk_usage(__file__)
79
+ # total = total // GB_CONVERTER
80
+ # used = used // GB_CONVERTER
81
+ # free = free // GB_CONVERTER
82
+ # used_perc = 100.0 * used / total
83
+ #
84
+ # logging.debug(f'Total: {total} GB\n'
85
+ # f'Used: {used} GB\n'
86
+ # f'Free: {free} GB')
87
+ #
88
+ # logging.debug('\n'.join(os.listdir()))
89
+ #
90
+ # return used_perc
91
 
92
 
93
  def build_ui():
94
  """
95
+ Display the input elements for content generation. Only covers the first step.
96
  """
97
 
98
+ # get_disk_used_percentage()
99
+
100
  st.title(APP_TEXT['app_name'])
101
  st.subheader(APP_TEXT['caption'])
102
  st.markdown(
103
+ 'Powered by'
104
+ ' [Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).'
105
+ )
106
+ st.markdown(
107
+ '*If the JSON is generated or parsed incorrectly, try again later by making minor changes'
108
+ ' to the input text.*'
109
  )
110
 
111
+ with st.form('my_form'):
112
+ # Topic input
113
+ try:
114
+ with open(GlobalConfig.PRELOAD_DATA_FILE, 'r', encoding='utf-8') as in_file:
115
+ preload_data = json5.loads(in_file.read())
116
+ except (FileExistsError, FileNotFoundError):
117
+ preload_data = {'topic': '', 'audience': ''}
118
+
119
+ topic = st.text_area(
120
+ APP_TEXT['input_labels'][0],
121
+ value=preload_data['topic']
 
 
 
 
 
 
122
  )
123
 
124
+ texts = list(GlobalConfig.PPTX_TEMPLATE_FILES.keys())
125
+ captions = [GlobalConfig.PPTX_TEMPLATE_FILES[x]['caption'] for x in texts]
 
 
 
 
 
 
126
 
127
+ pptx_template = st.radio(
128
+ 'Select a presentation template:',
129
+ texts,
130
+ captions=captions,
131
+ horizontal=True
132
+ )
133
 
134
+ st.divider()
135
+ submit = st.form_submit_button('Generate slide deck')
136
 
137
+ if submit:
138
+ # st.write(f'Clicked {time.time()}')
139
+ st.session_state.submitted = True
140
 
141
+ # https://github.com/streamlit/streamlit/issues/3832#issuecomment-1138994421
142
+ if 'submitted' in st.session_state:
143
+ progress_text = 'Generating the slides...give it a moment'
144
+ progress_bar = st.progress(0, text=progress_text)
145
 
146
+ topic_txt = topic.strip()
147
+ generate_presentation(topic_txt, pptx_template, progress_bar)
148
 
149
+ st.divider()
150
+ st.text(APP_TEXT['tos'])
151
+ st.text(APP_TEXT['tos2'])
 
 
 
 
152
 
153
+ st.markdown(
154
+ '![Visitors]'
155
+ '(https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fhuggingface.co%2Fspaces%2Fbarunsaha%2Fslide-deck-ai&countColor=%23263759)'
156
+ )
 
 
 
 
 
157
 
158
 
159
+ def generate_presentation(topic: str, pptx_template: str, progress_bar):
160
  """
161
+ Process the inputs to generate the slides.
 
162
 
163
+ :param topic: The presentation topic based on which contents are to be generated
164
+ :param pptx_template: The PowerPoint template name to be used
165
+ :param progress_bar: Progress bar from the page
166
+ :return:
167
  """
168
 
169
+ topic_length = len(topic)
170
+ logging.debug('Input length:: topic: %s', topic_length)
171
 
172
+ if topic_length >= 10:
173
+ logging.debug('Topic: %s', topic)
174
+ target_length = min(topic_length, GlobalConfig.LLM_MODEL_MAX_INPUT_LENGTH)
 
 
 
 
 
 
 
175
 
176
+ try:
177
+ # Step 1: Generate the contents in JSON format using an LLM
178
+ json_str = process_slides_contents(topic[:target_length], progress_bar)
179
+ logging.debug('Truncated topic: %s', topic[:target_length])
180
+ logging.debug('Length of JSON: %d', len(json_str))
181
+
182
+ # Step 2: Generate the slide deck based on the template specified
183
+ if len(json_str) > 0:
184
+ st.info(
185
+ 'Tip: The generated content doesn\'t look so great?'
186
+ ' Need alternatives? Just change your description text and try again.',
187
+ icon="💡️"
188
+ )
189
+ else:
190
+ st.error(
191
+ 'Unfortunately, JSON generation failed, so the next steps would lead'
192
+ ' to nowhere. Try again or come back later.'
193
+ )
194
+ return
195
 
196
+ all_headers = generate_slide_deck(json_str, pptx_template, progress_bar)
 
 
 
 
 
197
 
198
+ # Step 3: Bonus stuff: Web references and AI art
199
+ show_bonus_stuff(all_headers)
200
 
201
+ except ValueError as ve:
202
+ st.error(f'Unfortunately, an error occurred: {ve}! '
203
+ f'Please change the text, try again later, or report it, sharing your inputs.')
 
 
 
 
 
 
 
204
 
205
+ else:
206
+ st.error('Not enough information provided! Please be a little more descriptive :)')
207
 
208
 
209
+ def process_slides_contents(text: str, progress_bar: st.progress) -> str:
210
  """
211
+ Convert given text into structured data and display. Update the UI.
212
 
213
+ :param text: The topic description for the presentation
214
+ :param progress_bar: Progress bar for this step
215
+ :return: The contents as a JSON-formatted string
216
  """
217
 
218
+ json_str = ''
 
 
 
 
 
 
 
 
219
 
220
+ try:
221
+ logging.info('Calling LLM for content generation on the topic: %s', text)
222
+ json_str = get_contents_wrapper(text)
223
+ except Exception as ex:
224
+ st.error(
225
+ f'An exception occurred while trying to convert to JSON. It could be because of heavy'
226
+ f' traffic or something else. Try doing it again or try again later.'
227
+ f'\nError message: {ex}'
228
+ )
229
 
230
+ progress_bar.progress(50, text='Contents generated')
 
 
231
 
232
+ with st.expander('The generated contents (in JSON format)'):
233
+ st.code(json_str, language='json')
234
 
235
+ return json_str
 
 
236
 
237
 
238
+ def generate_slide_deck(json_str: str, pptx_template: str, progress_bar) -> List:
239
  """
240
+ Create a slide deck.
241
 
242
+ :param json_str: The contents in JSON format
243
+ :param pptx_template: The PPTX template name
244
+ :param progress_bar: Progress bar
245
+ :return: A list of all slide headers and the title
246
  """
247
 
248
+ progress_text = 'Creating the slide deck...give it a moment'
249
+ progress_bar.progress(75, text=progress_text)
250
 
251
+ # # Get a unique name for the file to save -- use the session ID
252
+ # ctx = st_sr.get_script_run_ctx()
253
+ # session_id = ctx.session_id
254
+ # timestamp = time.time()
255
+ # output_file_name = f'{session_id}_{timestamp}.pptx'
256
 
257
+ temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
258
+ path = pathlib.Path(temp.name)
 
259
 
260
+ logging.info('Creating PPTX file...')
261
+ all_headers = pptx_helper.generate_powerpoint_presentation(
262
+ json_str,
263
+ as_yaml=False,
264
+ slides_template=pptx_template,
265
+ output_file_path=path
266
+ )
267
+ progress_bar.progress(100, text='Done!')
268
 
269
+ with open(path, 'rb') as f:
270
+ st.download_button('Download PPTX file', f, file_name='Presentation.pptx')
271
 
272
+ return all_headers
273
 
274
+
275
+ def show_bonus_stuff(ppt_headers: List[str]):
276
  """
277
+ Show bonus stuff for the presentation.
278
 
279
+ :param ppt_headers: A list of the slide headings.
280
  """
281
 
282
+ # Use the presentation title and the slide headers to find relevant info online
283
+ logging.info('Calling Metaphor search...')
284
+ ppt_text = ' '.join(ppt_headers)
285
+ search_results = get_web_search_results_wrapper(ppt_text)
286
+ md_text_items = []
287
+
288
+ for (title, link) in search_results:
289
+ md_text_items.append(f'[{title}]({link})')
290
+
291
+ with st.expander('Related Web references'):
292
+ st.markdown('\n\n'.join(md_text_items))
293
+
294
+ logging.info('Done!')
295
+
296
+ # # Avoid image generation. It costs time and an API call, so just limit to the text generation.
297
+ # with st.expander('AI-generated image on the presentation topic'):
298
+ # logging.info('Calling SDXL for image generation...')
299
+ # # img_empty.write('')
300
+ # # img_text.write(APP_TEXT['image_info'])
301
+ # image = get_ai_image_wrapper(ppt_text)
302
+ #
303
+ # if len(image) > 0:
304
+ # image = base64.b64decode(image)
305
+ # st.image(image, caption=ppt_text)
306
+ # st.info('Tip: Right-click on the image to save it.', icon="💡️")
307
+ # logging.info('Image added')
308
 
309
 
310
  def main():
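
Note on `generate_presentation()` in the new `app.py`: the input handling amounts to a minimum-length check (`topic_length >= 10`) followed by truncation to `GlobalConfig.LLM_MODEL_MAX_INPUT_LENGTH`. A minimal, standalone sketch of that logic is shown below. The helper name `prepare_topic` and the value `1000` for the maximum length are hypothetical; the actual limit lives in `global_config.py`, whose contents are not visible in this view.

```python
from typing import Optional

MIN_TOPIC_LENGTH = 10    # mirrors the `topic_length >= 10` check in the diff
MAX_INPUT_LENGTH = 1000  # hypothetical stand-in for LLM_MODEL_MAX_INPUT_LENGTH


def prepare_topic(topic: str, max_length: int = MAX_INPUT_LENGTH) -> Optional[str]:
    """
    Validate and truncate a topic description.

    :param topic: The raw text typed by the user
    :param max_length: The maximum number of characters to keep
    :return: The stripped, truncated topic, or None if the text is too short
    """

    topic = topic.strip()

    if len(topic) < MIN_TOPIC_LENGTH:
        return None

    # Slicing never raises, even when the topic is shorter than max_length
    return topic[:max_length]
```

Returning `None` for a too-short topic lets the caller decide how to report the problem, for example with `st.error(...)` as the diff does.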
clarifai_grpc_helper.py ADDED
@@ -0,0 +1,71 @@
1
+ from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
2
+ from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
3
+ from clarifai_grpc.grpc.api.status import status_code_pb2
4
+
5
+ from global_config import GlobalConfig
6
+
7
+
8
+ CHANNEL = ClarifaiChannel.get_grpc_channel()
9
+ STUB = service_pb2_grpc.V2Stub(CHANNEL)
10
+
11
+ METADATA = (
12
+ ('authorization', 'Key ' + GlobalConfig.CLARIFAI_PAT),
13
+ )
14
+
15
+ USER_DATA_OBJECT = resources_pb2.UserAppIDSet(
16
+ user_id=GlobalConfig.CLARIFAI_USER_ID,
17
+ app_id=GlobalConfig.CLARIFAI_APP_ID
18
+ )
19
+
20
+ RAW_TEXT = '''You are a helpful, intelligent chatbot. Create the slides for a presentation on the given topic. Include main headings for each slide, detailed bullet points for each slide. Add relevant content to each slide. Do not output any blank line.
21
+
22
+ Topic:
23
+ Talk about AI, covering what it is and how it works. Add its pros, cons, and future prospects. Also, cover its job prospects.
24
+ '''
25
+
26
+
27
+ def get_text_from_llm(prompt: str) -> str:
28
+ post_model_outputs_response = STUB.PostModelOutputs(
29
+ service_pb2.PostModelOutputsRequest(
30
+ user_app_id=USER_DATA_OBJECT, # The userDataObject is created in the overview and is required when using a PAT
31
+ model_id=GlobalConfig.CLARIFAI_MODEL_ID,
32
+ # version_id=MODEL_VERSION_ID, # This is optional. Defaults to the latest model version
33
+ inputs=[
34
+ resources_pb2.Input(
35
+ data=resources_pb2.Data(
36
+ text=resources_pb2.Text(
37
+ raw=prompt
38
+ )
39
+ )
40
+ )
41
+ ]
42
+ ),
43
+ metadata=METADATA
44
+ )
45
+
46
+ if post_model_outputs_response.status.code != status_code_pb2.SUCCESS:
47
+ print(post_model_outputs_response.status)
48
+ raise Exception(f"Post model outputs failed, status: {post_model_outputs_response.status.description}")
49
+
50
+ # Since we have one input, one output will exist here
51
+ output = post_model_outputs_response.outputs[0]
52
+
53
+ # print("Completion:\n")
54
+ # print(output.data.text.raw)
55
+
56
+ return output.data.text.raw
57
+
58
+
59
+ if __name__ == '__main__':
60
+ topic = ('Talk about AI, covering what it is and how it works.'
61
+ ' Add its pros, cons, and future prospects.'
62
+ ' Also, cover its job prospects.'
63
+ )
64
+ print(topic)
65
+
66
+ with open(GlobalConfig.SLIDES_TEMPLATE_FILE, 'r') as in_file:
67
+ prompt_txt = in_file.read()
68
+ prompt_txt = prompt_txt.replace('{topic}', topic)
69
+ response_txt = get_text_from_llm(prompt_txt)
70
+
71
+ print('Output:\n', response_txt)
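Note on the response handling in `get_text_from_llm()`: the function checks the status code, raises on failure, and then reads the first output. The sketch below shows the same flow against stand-in objects so that it runs without the Clarifai client or network access. `LLMRequestError`, `extract_text`, and the `SUCCESS` constant are hypothetical names used only for illustration; the real code compares against `status_code_pb2.SUCCESS`. The empty-outputs guard is an extra assumption, not something the diff contains.

```python
from types import SimpleNamespace

# Stand-in for status_code_pb2.SUCCESS; the numeric value is an assumption for this sketch.
SUCCESS = 10000


class LLMRequestError(RuntimeError):
    """Hypothetical error type, used instead of a bare Exception."""


def extract_text(response) -> str:
    """Return the raw text of the first output, or raise if the call failed."""
    if response.status.code != SUCCESS:
        raise LLMRequestError(
            f'Post model outputs failed, status: {response.status.description}'
        )

    if not response.outputs:
        raise LLMRequestError('The response contains no outputs')

    # One input was sent, so one output is expected.
    return response.outputs[0].data.text.raw


def make_response(code: int, description: str, text: str = ''):
    """Build a fake response object with the attribute layout used above."""
    output = SimpleNamespace(data=SimpleNamespace(text=SimpleNamespace(raw=text)))
    status = SimpleNamespace(code=code, description=description)
    return SimpleNamespace(status=status, outputs=[output])


if __name__ == '__main__':
    print(extract_text(make_response(SUCCESS, 'Ok', 'Slide 1: Introduction')))
```

A dedicated exception type lets the Streamlit layer catch LLM failures separately from unrelated errors.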
examples/example_04.json DELETED
@@ -1,3 +0,0 @@
1
- {
2
- "topic": "12 slides on a basic tutorial on Python along with examples"
3
- }
 
 
 
 
file_embeddings/embeddings.npy DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:64a1ba79b20c81ba7ed6604468736f74ae89813fe378191af1d8574c008b3ab5
3
- size 326784
 
 
 
 
file_embeddings/icons.npy DELETED
@@ -1,3 +0,0 @@
1
- version https://git-lfs.github.com/spec/v1
2
- oid sha256:ce5ce4c86bb213915606921084b3516464154edcae12f4bc708d62c6bd7acebb
3
- size 51168
 
 
 
 
global_config.py CHANGED
@@ -1,7 +1,3 @@
1
- """
2
- A set of configurations used by the app.
3
- """
4
- import logging
5
  import os
6
 
7
  from dataclasses import dataclass
@@ -13,72 +9,32 @@ load_dotenv()
13
 
14
  @dataclass(frozen=True)
15
  class GlobalConfig:
16
- """
17
- A data class holding the configurations.
18
- """
19
-
20
- HF_LLM_MODEL_NAME = 'mistralai/Mistral-Nemo-Instruct-2407'
21
- LLM_MODEL_TEMPERATURE = 0.2
22
- LLM_MODEL_MIN_OUTPUT_LENGTH = 100
23
- LLM_MODEL_MAX_OUTPUT_LENGTH = 4 * 4096 # tokens
24
- LLM_MODEL_MAX_INPUT_LENGTH = 400 # characters
25
 
26
  HUGGINGFACEHUB_API_TOKEN = os.environ.get('HUGGINGFACEHUB_API_TOKEN', '')
 
27
 
28
  LOG_LEVEL = 'DEBUG'
29
- COUNT_TOKENS = False
30
  APP_STRINGS_FILE = 'strings.json'
31
  PRELOAD_DATA_FILE = 'examples/example_02.json'
32
  SLIDES_TEMPLATE_FILE = 'langchain_templates/template_combined.txt'
33
- INITIAL_PROMPT_TEMPLATE = 'langchain_templates/chat_prompts/initial_template_v4_two_cols_img.txt'
34
- REFINEMENT_PROMPT_TEMPLATE = 'langchain_templates/chat_prompts/refinement_template_v4_two_cols_img.txt'
35
-
36
- LLM_PROGRESS_MAX = 90
37
- ICONS_DIR = 'icons/png128/'
38
- TINY_BERT_MODEL = 'gaunernst/bert-mini-uncased'
39
- EMBEDDINGS_FILE_NAME = 'file_embeddings/embeddings.npy'
40
- ICONS_FILE_NAME = 'file_embeddings/icons.npy'
41
 
42
  PPTX_TEMPLATE_FILES = {
43
- 'Basic': {
44
  'file': 'pptx_templates/Blank.pptx',
45
- 'caption': 'A good start (Uses [photos](https://unsplash.com/photos/AFZ-qBPEceA) by [cetteup](https://unsplash.com/@cetteup?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash) on [Unsplash](https://unsplash.com/photos/a-foggy-forest-filled-with-lots-of-trees-d3ci37Gcgxg?utm_content=creditCopyText&utm_medium=referral&utm_source=unsplash)) 🟧'
46
- },
47
- 'Minimalist Sales Pitch': {
48
- 'file': 'pptx_templates/Minimalist_sales_pitch.pptx',
49
- 'caption': 'In high contrast ⬛'
50
  },
51
  'Ion Boardroom': {
52
  'file': 'pptx_templates/Ion_Boardroom.pptx',
53
- 'caption': 'Make some bold decisions 🟥'
54
  },
55
  'Urban Monochrome': {
56
  'file': 'pptx_templates/Urban_monochrome.pptx',
57
- 'caption': 'Marvel in a monochrome dream'
58
- },
59
  }
60
-
61
- # This is a long text, so not incorporated as a string in `strings.json`
62
- CHAT_USAGE_INSTRUCTIONS = (
63
- 'Briefly describe your topic of presentation in the textbox provided below.'
64
- ' For example:\n'
65
- '- Make a slide deck on AI.'
66
- '\n\n'
67
- 'Subsequently, you can add follow-up instructions, e.g.:\n'
68
- '- Can you add a slide on GPUs?'
69
- '\n\n'
70
- ' You can also ask it to refine any particular slide, e.g.:\n'
71
- '- Make the slide with title \'Examples of AI\' a bit more descriptive.'
72
- '\n\n'
73
- 'See this [demo video](https://youtu.be/QvAKzNKtk9k) for a brief walkthrough.\n\n'
74
- ' SlideDeck AI does not have access to the Web, apart for searching for images relevant'
75
- ' to the slides. Photos are added probabilistically; transparency needs to be changed'
76
- ' manually, if required.'
77
- )
78
-
79
-
80
- logging.basicConfig(
81
- level=GlobalConfig.LOG_LEVEL,
82
- format='%(asctime)s - %(levelname)s - %(name)s - %(message)s',
83
- datefmt='%Y-%m-%d %H:%M:%S'
84
- )
 
 
 
 
 
1
  import os
2
 
3
  from dataclasses import dataclass
 
9
 
10
  @dataclass(frozen=True)
11
  class GlobalConfig:
12
+ HF_LLM_MODEL_NAME = 'mistralai/Mistral-7B-Instruct-v0.2'
13
+ LLM_MODEL_TEMPERATURE: float = 0.2
14
+ LLM_MODEL_MIN_OUTPUT_LENGTH: int = 50
15
+ LLM_MODEL_MAX_OUTPUT_LENGTH: int = 2000
16
+ LLM_MODEL_MAX_INPUT_LENGTH: int = 300
 
 
 
 
17
 
18
  HUGGINGFACEHUB_API_TOKEN = os.environ.get('HUGGINGFACEHUB_API_TOKEN', '')
19
+ METAPHOR_API_KEY = os.environ.get('METAPHOR_API_KEY', '')
20
 
21
  LOG_LEVEL = 'DEBUG'
 
22
  APP_STRINGS_FILE = 'strings.json'
23
  PRELOAD_DATA_FILE = 'examples/example_02.json'
24
  SLIDES_TEMPLATE_FILE = 'langchain_templates/template_combined.txt'
25
+ JSON_TEMPLATE_FILE = 'langchain_templates/text_to_json_template_02.txt'
 
 
 
 
 
 
 
26
 
27
  PPTX_TEMPLATE_FILES = {
28
+ 'Blank': {
29
  'file': 'pptx_templates/Blank.pptx',
30
+ 'caption': 'A good start'
 
 
 
 
31
  },
32
  'Ion Boardroom': {
33
  'file': 'pptx_templates/Ion_Boardroom.pptx',
34
+ 'caption': 'Make some bold decisions'
35
  },
36
  'Urban Monochrome': {
37
  'file': 'pptx_templates/Urban_monochrome.pptx',
38
+ 'caption': 'Marvel in a monochrome dream'
39
+ }
40
  }
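Note on the new `GlobalConfig`: it mixes annotated attributes (for example `LLM_MODEL_TEMPERATURE: float = 0.2`) with unannotated ones (for example `HF_LLM_MODEL_NAME = '...'`). In a dataclass, only annotated attributes become fields; unannotated ones stay plain class attributes. Both can be read as `GlobalConfig.NAME`, so the app is unaffected, but the difference shows up with `dataclasses.fields()` or when constructing instances. A minimal sketch with a hypothetical `DemoConfig` class:

```python
from dataclasses import FrozenInstanceError, dataclass, fields


@dataclass(frozen=True)
class DemoConfig:
    # No annotation: a plain class attribute, ignored by the dataclass machinery.
    MODEL_NAME = 'some-model'
    # Annotated: a real dataclass field with a default value.
    TEMPERATURE: float = 0.2


if __name__ == '__main__':
    # Only the annotated attribute is reported as a field.
    print([f.name for f in fields(DemoConfig)])

    config = DemoConfig(TEMPERATURE=0.5)
    try:
        config.TEMPERATURE = 0.9  # frozen=True blocks assignment on instances
    except FrozenInstanceError:
        print('Instances are read-only')
```

Note that `frozen=True` protects instances only; class-level reads such as `DemoConfig.MODEL_NAME` work without creating an instance.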
helpers/__init__.py DELETED
File without changes
helpers/icons_embeddings.py DELETED
@@ -1,166 +0,0 @@
1
- """
2
- Generate and save the embeddings of a pre-defined list of icons.
3
- Compare them with keyword embeddings to find the most relevant icons.
4
- """
5
- import os
6
- import pathlib
7
- import sys
8
- from typing import List, Tuple
9
-
10
- import numpy as np
11
- from sklearn.metrics.pairwise import cosine_similarity
12
- from transformers import BertTokenizer, BertModel
13
-
14
- sys.path.append('..')
15
- sys.path.append('../..')
16
-
17
- from global_config import GlobalConfig
18
-
19
-
20
- tokenizer = BertTokenizer.from_pretrained(GlobalConfig.TINY_BERT_MODEL)
21
- model = BertModel.from_pretrained(GlobalConfig.TINY_BERT_MODEL)
22
-
23
-
24
- def get_icons_list() -> List[str]:
25
- """
26
- Get a list of available icons.
27
-
28
- :return: The icons file names.
29
- """
30
-
31
- items = pathlib.Path('../' + GlobalConfig.ICONS_DIR).glob('*.png')
32
- items = [
33
- os.path.basename(str(item)).removesuffix('.png') for item in items
34
- ]
35
-
36
- return items
37
-
38
-
39
- def get_embeddings(texts) -> np.ndarray:
40
- """
41
- Generate embeddings for a list of texts using a pre-trained language model.
42
-
43
- :param texts: A string or a list of strings to be converted into embeddings.
44
- :type texts: Union[str, List[str]]
45
- :return: A NumPy array containing the embeddings for the input texts.
46
- :rtype: numpy.ndarray
47
-
48
- :raises ValueError: If the input is not a string or a list of strings, or if any element
49
- in the list is not a string.
50
-
51
- Example usage:
52
- >>> keyword = 'neural network'
53
- >>> file_names = ['neural_network_icon.png', 'data_analysis_icon.png', 'machine_learning.png']
54
- >>> keyword_embeddings = get_embeddings(keyword)
55
- >>> file_name_embeddings = get_embeddings(file_names)
56
- """
57
-
58
- inputs = tokenizer(texts, return_tensors='pt', padding=True, max_length=128, truncation=True)
59
- outputs = model(**inputs)
60
-
61
- return outputs.last_hidden_state.mean(dim=1).detach().numpy()
62
-
63
-
64
- def save_icons_embeddings():
65
- """
66
- Generate and save the embeddings for the icon file names.
67
- """
68
-
69
- file_names = get_icons_list()
70
- print(f'{len(file_names)} icon files available...')
71
- file_name_embeddings = get_embeddings(file_names)
72
- print(f'file_name_embeddings.shape: {file_name_embeddings.shape}')
73
-
74
- # Save embeddings to a file
75
- np.save(GlobalConfig.EMBEDDINGS_FILE_NAME, file_name_embeddings)
76
- np.save(GlobalConfig.ICONS_FILE_NAME, file_names) # Save file names for reference
77
-
78
-
79
- def load_saved_embeddings() -> Tuple[np.ndarray, np.ndarray]:
80
- """
81
- Load precomputed embeddings and icons file names.
82
-
83
- :return: The embeddings and the icon file names.
84
- """
85
-
86
- file_name_embeddings = np.load(GlobalConfig.EMBEDDINGS_FILE_NAME)
87
- file_names = np.load(GlobalConfig.ICONS_FILE_NAME)
88
-
89
- return file_name_embeddings, file_names
90
-
91
-
92
- def find_icons(keywords: List[str]) -> List[str]:
93
- """
94
- Find relevant icon file names for a list of keywords.
95
-
96
- :param keywords: The list of one or more keywords.
97
- :return: A list of the file names relevant for each keyword.
98
- """
99
-
100
- keyword_embeddings = get_embeddings(keywords)
101
- file_name_embeddings, file_names = load_saved_embeddings()
102
-
103
- # Compute similarity
104
- similarities = cosine_similarity(keyword_embeddings, file_name_embeddings)
105
- icon_files = file_names[np.argmax(similarities, axis=-1)]
106
-
107
- return icon_files
108
-
109
-
110
- def main():
111
- """
112
- Example usage.
113
- """
114
-
115
- # Run this again if icons are to be added/removed
116
- save_icons_embeddings()
117
-
118
- keywords = [
119
- 'deep learning',
120
- '',
121
- 'recycling',
122
- 'handshake',
123
- 'Ferry',
124
- 'rain drop',
125
- 'speech bubble',
126
- 'mental resilience',
127
- 'turmeric',
128
- 'Art',
129
- 'price tag',
130
- 'Oxygen',
131
- 'oxygen',
132
- 'Social Connection',
133
- 'Accomplishment',
134
- 'Python',
135
- 'XML',
136
- 'Handshake',
137
- ]
138
- icon_files = find_icons(keywords)
139
- print(
140
- f'The relevant icon files are:\n'
141
- f'{list(zip(keywords, icon_files))}'
142
- )
143
-
144
- # BERT tiny:
145
- # [('deep learning', 'deep-learning'), ('', '123'), ('recycling', 'refinery'),
146
- # ('handshake', 'dash-circle'), ('Ferry', 'cart'), ('rain drop', 'bucket'),
147
- # ('speech bubble', 'globe'), ('mental resilience', 'exclamation-triangle'),
148
- # ('turmeric', 'kebab'), ('Art', 'display'), ('price tag', 'bug-fill'),
149
- # ('Oxygen', 'radioactive')]
150
-
151
- # BERT mini
152
- # [('deep learning', 'deep-learning'), ('', 'compass'), ('recycling', 'tools'),
153
- # ('handshake', 'bandaid'), ('Ferry', 'cart'), ('rain drop', 'trash'),
154
- # ('speech bubble', 'image'), ('mental resilience', 'recycle'), ('turmeric', 'linkedin'),
155
- # ('Art', 'book'), ('price tag', 'card-image'), ('Oxygen', 'radioactive')]
156
-
157
- # BERT small
158
- # [('deep learning', 'deep-learning'), ('', 'gem'), ('recycling', 'tools'),
159
- # ('handshake', 'handbag'), ('Ferry', 'truck'), ('rain drop', 'bucket'),
160
- # ('speech bubble', 'strategy'), ('mental resilience', 'deep-learning'),
161
- # ('turmeric', 'flower'),
162
- # ('Art', 'book'), ('price tag', 'hotdog'), ('Oxygen', 'radioactive')]
163
-
164
-
165
- if __name__ == '__main__':
166
- main()
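Note on the removed `find_icons()`: it embeds the keywords, computes cosine similarity against the precomputed icon-name embeddings, and takes the `argmax` per keyword. The sketch below reproduces only the matching step with NumPy and hand-made 2-D vectors, so it runs without `transformers` or scikit-learn. The function names and the toy vectors are assumptions for illustration; the removed code used `sklearn.metrics.pairwise.cosine_similarity` on BERT embeddings.

```python
import numpy as np


def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarities between the rows of a and b."""
    # Normalize each row to unit length; rows with zero norm are not handled here.
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)

    return a_unit @ b_unit.T


def best_matches(
        query_vecs: np.ndarray,
        candidate_vecs: np.ndarray,
        candidate_names: np.ndarray
) -> np.ndarray:
    """Pick, for each query vector, the name of the most similar candidate."""
    similarities = cosine_similarity_matrix(query_vecs, candidate_vecs)

    return candidate_names[np.argmax(similarities, axis=-1)]


if __name__ == '__main__':
    names = np.array(['east', 'north', 'diagonal'])
    candidates = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    queries = np.array([[2.0, 0.1], [0.1, 3.0]])
    print(best_matches(queries, candidates, names))
```

Because `argmax` always returns some index, every keyword gets an icon even when the best similarity is low, which is consistent with the odd pairings recorded in the comments of the removed `main()`.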
helpers/image_search.py DELETED
@@ -1,148 +0,0 @@
1
- """
2
- Search photos using Pexels API.
3
- """
4
- import logging
5
- import os
6
- import random
7
- from io import BytesIO
8
- from typing import Union, Tuple, Literal
9
- from urllib.parse import urlparse, parse_qs
10
-
11
- import requests
12
- from dotenv import load_dotenv
13
-
14
-
15
- load_dotenv()
16
-
17
-
18
- REQUEST_TIMEOUT = 12
19
- MAX_PHOTOS = 3
20
-
21
-
22
- # Only show errors
23
- logging.getLogger('urllib3').setLevel(logging.ERROR)
24
- # Disable all child loggers of urllib3, e.g. urllib3.connectionpool
25
- # logging.getLogger('urllib3').propagate = True
26
-
27
-
28
-
29
- def search_pexels(
30
- query: str,
31
- size: Literal['small', 'medium', 'large'] = 'medium',
32
- per_page: int = MAX_PHOTOS
33
- ) -> dict:
34
- """
35
- Searches for images on Pexels using the provided query.
36
-
37
- This function sends a GET request to the Pexels API with the specified search query
38
- and authorization header containing the API key. It returns the JSON response from the API.
39
-
40
- [2024-08-31] Note:
41
- `curl` succeeds but the API call via Python `requests` fails. Apparently, this could be due to
42
- Cloudflare (or others) blocking the requests, perhaps identifying as Web-scraping. So,
43
- changing the user-agent to Firefox.
44
- https://stackoverflow.com/a/74674276/147021
45
- https://stackoverflow.com/a/51268523/147021
46
- https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/User-Agent/Firefox#linux
47
-
48
- :param query: The search query for finding images.
49
- :param size: The size of the images: small, medium, or large.
50
- :param per_page: No. of results to be displayed per page.
51
- :return: The JSON response from the Pexels API containing search results.
52
- :raises requests.exceptions.RequestException: If the request to the Pexels API fails.
53
- """
54
-
55
- url = 'https://api.pexels.com/v1/search'
56
- headers = {
57
- 'Authorization': os.getenv('PEXEL_API_KEY'),
58
- 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
59
- }
60
- params = {
61
- 'query': query,
62
- 'size': size,
63
- 'page': 1,
64
- 'per_page': per_page
65
- }
66
- response = requests.get(url, headers=headers, params=params, timeout=REQUEST_TIMEOUT)
67
- response.raise_for_status() # Ensure the request was successful
68
-
69
- return response.json()
70
-
71
-
72
- def get_photo_url_from_api_response(
73
- json_response: dict
74
- ) -> Tuple[Union[str, None], Union[str, None]]:
75
- """
76
- Return a randomly chosen photo from a Pexels search API response. In addition, also return
77
- the original URL of the page on Pexels.
78
-
79
- :param json_response: The JSON response.
80
- :return: The selected photo URL and page URL or `None`.
81
- """
82
-
83
- page_url = None
84
- photo_url = None
85
-
86
- if 'photos' in json_response:
87
- photos = json_response['photos']
88
-
89
- if photos:
90
- photo_idx = random.randrange(len(photos))  # The API may return fewer than MAX_PHOTOS photos
91
- photo = photos[photo_idx]
92
-
93
- if 'url' in photo:
94
- page_url = photo['url']
95
-
96
- if 'src' in photo:
97
- if 'large' in photo['src']:
98
- photo_url = photo['src']['large']
99
- elif 'original' in photo['src']:
100
- photo_url = photo['src']['original']
101
-
102
- return photo_url, page_url
103
-
104
-
105
- def get_image_from_url(url: str) -> BytesIO:
106
- """
107
- Fetches an image from the specified URL and returns it as a BytesIO object.
108
-
109
- This function sends a GET request to the provided URL, retrieves the image data,
110
- and wraps it in a BytesIO object, which can be used like a file.
111
-
112
- :param url: The URL of the image to be fetched.
113
- :return: A BytesIO object containing the image data.
114
- :raises requests.exceptions.RequestException: If the request to the URL fails.
115
- """
116
-
117
- headers = {
118
- 'Authorization': os.getenv('PEXEL_API_KEY'),
119
- 'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:10.0) Gecko/20100101 Firefox/10.0',
120
- }
121
- response = requests.get(url, headers=headers, stream=True, timeout=REQUEST_TIMEOUT)
122
- response.raise_for_status()
123
- image_data = BytesIO(response.content)
124
-
125
- return image_data
126
-
127
-
128
- def extract_dimensions(url: str) -> Tuple[int, int]:
129
- """
130
- Extracts the height and width from the URL parameters.
131
-
132
- :param url: The URL containing the image dimensions.
133
- :return: A tuple containing the width and height as integers.
134
- """
135
- parsed_url = urlparse(url)
136
- query_params = parse_qs(parsed_url.query)
137
- width = int(query_params.get('w', [0])[0])
138
- height = int(query_params.get('h', [0])[0])
139
-
140
- return width, height
141
-
142
-
143
- if __name__ == '__main__':
144
- print(
145
- search_pexels(
146
- query='people'
147
- )
148
- )
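Note on the removed `get_photo_url_from_api_response()`: it picks one photo at random from the search response and returns its image URL together with the page URL. The sketch below is a compact variant that chooses from whatever photos are actually present, so short or empty result lists are handled. The name `pick_photo` and the optional `rng` parameter are assumptions added for illustration and testability; the dictionary keys (`photos`, `url`, `src`, `large`, `original`) are the ones the removed code reads.

```python
import random
from typing import Optional, Tuple


def pick_photo(
        json_response: dict,
        rng: Optional[random.Random] = None
) -> Tuple[Optional[str], Optional[str]]:
    """Return (photo URL, page URL) for a randomly chosen photo, or (None, None)."""
    rng = rng or random.Random()
    photos = json_response.get('photos') or []

    if not photos:
        return None, None

    # Choose among the photos actually returned, however many there are.
    photo = rng.choice(photos)
    src = photo.get('src', {})
    # Prefer the 'large' rendition and fall back to 'original'.
    photo_url = src.get('large') or src.get('original')

    return photo_url, photo.get('url')


if __name__ == '__main__':
    response = {
        'photos': [
            {'url': 'https://example.com/page', 'src': {'large': 'https://example.com/large.jpg'}}
        ]
    }
    print(pick_photo(response))
```

Passing a seeded `random.Random` makes the choice reproducible in tests.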
helpers/pptx_helper.py DELETED
@@ -1,982 +0,0 @@
1
- """
2
- A set of functions to create a PowerPoint slide deck.
3
- """
4
- import logging
5
- import os
6
- import pathlib
7
- import random
8
- import re
9
- import sys
10
- import tempfile
11
- from typing import List, Tuple, Optional
12
-
13
- import json5
14
- import pptx
15
- from dotenv import load_dotenv
16
- from pptx.enum.shapes import MSO_AUTO_SHAPE_TYPE
17
- from pptx.shapes.placeholder import PicturePlaceholder, SlidePlaceholder
18
-
19
- sys.path.append('..')
20
- sys.path.append('../..')
21
-
22
- import helpers.icons_embeddings as ice
23
- import helpers.image_search as ims
24
- from global_config import GlobalConfig
25
-
26
-
27
- load_dotenv()
28
-
29
-
30
- # English Metric Unit (used by PowerPoint) to inches
31
- EMU_TO_INCH_SCALING_FACTOR = 1.0 / 914400
32
- INCHES_3 = pptx.util.Inches(3)
33
- INCHES_2 = pptx.util.Inches(2)
34
- INCHES_1_5 = pptx.util.Inches(1.5)
35
- INCHES_1 = pptx.util.Inches(1)
36
- INCHES_0_8 = pptx.util.Inches(0.8)
37
- INCHES_0_9 = pptx.util.Inches(0.9)
38
- INCHES_0_5 = pptx.util.Inches(0.5)
39
- INCHES_0_4 = pptx.util.Inches(0.4)
40
- INCHES_0_3 = pptx.util.Inches(0.3)
41
- INCHES_0_2 = pptx.util.Inches(0.2)
42
-
43
- STEP_BY_STEP_PROCESS_MARKER = '>> '
44
- ICON_BEGINNING_MARKER = '[['
45
- ICON_END_MARKER = ']]'
46
-
47
- ICON_SIZE = INCHES_0_8
48
- ICON_BG_SIZE = INCHES_1
49
-
50
- IMAGE_DISPLAY_PROBABILITY = 1 / 3.0
51
- FOREGROUND_IMAGE_PROBABILITY = 0.8
52
-
53
- SLIDE_NUMBER_REGEX = re.compile(r"^slide[ ]+\d+:", re.IGNORECASE)
54
- ICONS_REGEX = re.compile(r"\[\[(.*?)\]\]\s*(.*)")
55
-
56
- ICON_COLORS = [
57
- pptx.dml.color.RGBColor.from_string('800000'), # Maroon
58
- pptx.dml.color.RGBColor.from_string('6A5ACD'), # SlateBlue
59
- pptx.dml.color.RGBColor.from_string('556B2F'), # DarkOliveGreen
60
- pptx.dml.color.RGBColor.from_string('2F4F4F'), # DarkSlateGray
61
- pptx.dml.color.RGBColor.from_string('4682B4'), # SteelBlue
62
- pptx.dml.color.RGBColor.from_string('5F9EA0'), # CadetBlue
63
- ]
64
-
65
-
66
- logger = logging.getLogger(__name__)
67
- logging.getLogger('PIL.PngImagePlugin').setLevel(logging.ERROR)
68
-
69
-
70
- def remove_slide_number_from_heading(header: str) -> str:
71
- """
72
- Remove the slide number from a given slide header.
73
-
74
- :param header: The header of a slide.
75
- :return: The header without slide number.
76
- """
77
-
78
- if SLIDE_NUMBER_REGEX.match(header):
79
- idx = header.find(':')
80
- header = header[idx + 1:]
81
-
82
- return header
83
-
84
-
85
- def generate_powerpoint_presentation(
86
- parsed_data: dict,
87
- slides_template: str,
88
- output_file_path: pathlib.Path
89
- ) -> List:
90
- """
91
- Create and save a PowerPoint presentation file containing the content in JSON format.
92
-
93
- :param parsed_data: The presentation content as parsed JSON data.
94
- :param slides_template: The PPTX template to use.
95
- :param output_file_path: The path of the PPTX file to save as.
96
- :return: A list of presentation title and slides headers.
97
- """
98
-
99
- presentation = pptx.Presentation(GlobalConfig.PPTX_TEMPLATE_FILES[slides_template]['file'])
100
- slide_width_inch, slide_height_inch = _get_slide_width_height_inches(presentation)
101
-
102
- # The title slide
103
- title_slide_layout = presentation.slide_layouts[0]
104
- slide = presentation.slides.add_slide(title_slide_layout)
105
- title = slide.shapes.title
106
- subtitle = slide.placeholders[1]
107
- title.text = parsed_data['title']
108
- logger.info(
109
- 'PPT title: %s | #slides: %d | template: %s',
110
- title.text, len(parsed_data['slides']),
111
- GlobalConfig.PPTX_TEMPLATE_FILES[slides_template]['file']
112
- )
113
- subtitle.text = 'by Myself and SlideDeck AI :)'
114
- all_headers = [title.text, ]
115
-
116
- # Add content in a loop
117
- for a_slide in parsed_data['slides']:
118
- is_processing_done = _handle_icons_ideas(
119
- presentation=presentation,
120
- slide_json=a_slide,
121
- slide_width_inch=slide_width_inch,
122
- slide_height_inch=slide_height_inch
123
- )
124
-
125
- if not is_processing_done:
126
- is_processing_done = _handle_double_col_layout(
127
- presentation=presentation,
128
- slide_json=a_slide,
129
- slide_width_inch=slide_width_inch,
130
- slide_height_inch=slide_height_inch
131
- )
132
-
133
- if not is_processing_done:
134
- is_processing_done = _handle_step_by_step_process(
135
- presentation=presentation,
136
- slide_json=a_slide,
137
- slide_width_inch=slide_width_inch,
138
- slide_height_inch=slide_height_inch
139
- )
140
-
141
- if not is_processing_done:
142
- _handle_default_display(
143
- presentation=presentation,
144
- slide_json=a_slide,
145
- slide_width_inch=slide_width_inch,
146
- slide_height_inch=slide_height_inch
147
- )
148
-
149
- # The thank-you slide
150
- last_slide_layout = presentation.slide_layouts[0]
151
- slide = presentation.slides.add_slide(last_slide_layout)
152
- title = slide.shapes.title
153
- title.text = 'Thank you!'
154
-
155
- presentation.save(output_file_path)
156
-
157
- return all_headers
158
-
159
-
160
- def get_flat_list_of_contents(items: list, level: int) -> List[Tuple]:
161
- """
162
- Flatten a (hierarchical) list of bullet points to a single list containing each item and
163
- its level.
164
-
165
- :param items: A bullet point (string or list).
166
- :param level: The current level of hierarchy.
167
- :return: A list of (bullet item text, hierarchical level) tuples.
168
- """
169
-
170
- flat_list = []
171
-
172
- for item in items:
173
- if isinstance(item, str):
174
- flat_list.append((item, level))
175
- elif isinstance(item, list):
176
- flat_list = flat_list + get_flat_list_of_contents(item, level + 1)
177
-
178
- return flat_list
179
-
180
-
181
- def get_slide_placeholders(
182
- slide: pptx.slide.Slide,
183
- layout_number: int,
184
- is_debug: bool = False
185
- ) -> List[Tuple[int, str]]:
186
- """
187
- Return the index and name (lower case) of all placeholders present in a slide, except
188
- the title placeholder.
189
-
190
- A placeholder in a slide is a place to add content. Each placeholder has a name and an index.
191
- This index is NOT a list index, rather a set of keys used to look up a dict. So, `idx` is
192
- non-contiguous. Also, the title placeholder of a slide always has index 0. User-added
193
- placeholders get indices assigned starting from 10.
194
-
195
- With user-edited or added placeholders, their index may be difficult to track. This function
196
- returns the placeholders' names as well, which could be useful to distinguish between the
197
- different placeholders.
198
-
199
- :param slide: The slide.
200
- :param layout_number: The layout number used by the slide.
201
- :param is_debug: Whether to print debugging statements.
202
- :return: A list containing placeholders (idx, name) tuples, except the title placeholder.
203
- """
204
-
205
- if is_debug:
206
- print(
207
- f'Slide layout #{layout_number}:'
208
- f' # of placeholders: {len(slide.shapes.placeholders)} (including the title)'
209
- )
210
-
211
- placeholders = [
212
- (shape.placeholder_format.idx, shape.name.lower()) for shape in slide.shapes.placeholders
213
- ]
214
- placeholders.pop(0) # Remove the title placeholder
215
-
216
- if is_debug:
217
- print(placeholders)
218
-
219
- return placeholders
220
-
221
-
222
- def _handle_default_display(
223
- presentation: pptx.Presentation,
224
- slide_json: dict,
225
- slide_width_inch: float,
226
- slide_height_inch: float
227
- ):
228
- """
229
- Display a list of text in a slide.
230
-
231
- :param presentation: The presentation object.
232
- :param slide_json: The content of the slide as JSON data.
233
- :param slide_width_inch: The width of the slide in inches.
234
- :param slide_height_inch: The height of the slide in inches.
235
- """
236
-
237
- status = False
238
-
239
- if 'img_keywords' in slide_json:
240
- if random.random() < IMAGE_DISPLAY_PROBABILITY:
241
- if random.random() < FOREGROUND_IMAGE_PROBABILITY:
242
- status = _handle_display_image__in_foreground(
243
- presentation,
244
- slide_json,
245
- slide_width_inch,
246
- slide_height_inch
247
- )
248
- else:
249
- status = _handle_display_image__in_background(
250
- presentation,
251
- slide_json,
252
- slide_width_inch,
253
- slide_height_inch
254
- )
255
-
256
- if status:
257
- return
258
-
259
- # Image display failed, so display only text
260
- bullet_slide_layout = presentation.slide_layouts[1]
261
- slide = presentation.slides.add_slide(bullet_slide_layout)
262
-
263
- shapes = slide.shapes
264
- title_shape = shapes.title
265
-
266
- try:
267
- body_shape = shapes.placeholders[1]
268
- except KeyError:
269
- placeholders = get_slide_placeholders(slide, layout_number=1)
270
- body_shape = shapes.placeholders[placeholders[0][0]]
271
-
272
- title_shape.text = remove_slide_number_from_heading(slide_json['heading'])
273
- text_frame = body_shape.text_frame
274
-
275
- # The bullet_points may contain a nested hierarchy of JSON arrays
276
- # In some scenarios, it may contain objects (dictionaries) because the LLM generated so
277
- # ^ The second scenario is not covered
278
-
279
- flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
280
-
281
- for idx, an_item in enumerate(flat_items_list):
282
- if idx == 0:
283
- text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
284
- else:
285
- paragraph = text_frame.add_paragraph()
286
- paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
287
- paragraph.level = an_item[1]
288
-
289
- _handle_key_message(
290
- the_slide=slide,
291
- slide_json=slide_json,
292
- slide_height_inch=slide_height_inch,
293
- slide_width_inch=slide_width_inch
294
- )
295
-
296
-
297
- def _handle_display_image__in_foreground(
298
- presentation: pptx.Presentation,
299
- slide_json: dict,
300
- slide_width_inch: float,
301
- slide_height_inch: float
302
- ) -> bool:
303
- """
304
- Create a slide with text and image using a picture placeholder layout. If no image keyword is
305
- available, it will add only text to the slide.
306
-
307
- :param presentation: The presentation object.
308
- :param slide_json: The content of the slide as JSON data.
309
- :param slide_width_inch: The width of the slide in inches.
310
- :param slide_height_inch: The height of the slide in inches.
311
- :return: True if the slide has been processed.
312
- """
313
-
314
- img_keywords = slide_json['img_keywords'].strip()
315
- slide = presentation.slide_layouts[8] # Picture with Caption
316
- slide = presentation.slides.add_slide(slide)
317
- placeholders = None
318
-
319
- title_placeholder = slide.shapes.title
320
- title_placeholder.text = remove_slide_number_from_heading(slide_json['heading'])
321
-
322
- try:
323
- pic_col: PicturePlaceholder = slide.shapes.placeholders[1]
324
- except KeyError:
325
- placeholders = get_slide_placeholders(slide, layout_number=8)
326
- pic_col = None
327
- for idx, name in placeholders:
328
- if 'picture' in name:
329
- pic_col: PicturePlaceholder = slide.shapes.placeholders[idx]
330
-
331
- try:
332
- text_col: SlidePlaceholder = slide.shapes.placeholders[2]
333
- except KeyError:
334
- text_col = None
335
- if not placeholders:
336
- placeholders = get_slide_placeholders(slide, layout_number=8)
337
-
338
- for idx, name in placeholders:
339
- if 'content' in name:
340
- text_col: SlidePlaceholder = slide.shapes.placeholders[idx]
341
-
342
- flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
343
-
344
- for idx, an_item in enumerate(flat_items_list):
345
- if idx == 0:
346
- text_col.text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
347
- else:
348
- paragraph = text_col.text_frame.add_paragraph()
349
- paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
350
- paragraph.level = an_item[1]
351
-
352
- if not img_keywords:
353
- # No keywords, so no image search and addition
354
- return True
355
-
356
- try:
357
- photo_url, page_url = ims.get_photo_url_from_api_response(
358
- ims.search_pexels(query=img_keywords, size='medium')
359
- )
360
-
361
- if photo_url:
362
- pic_col.insert_picture(
363
- ims.get_image_from_url(photo_url)
364
- )
365
-
366
- _add_text_at_bottom(
367
- slide=slide,
368
- slide_width_inch=slide_width_inch,
369
- slide_height_inch=slide_height_inch,
370
- text='Photo provided by Pexels',
371
- hyperlink=page_url
372
- )
373
- except Exception as ex:
374
- logger.error(
375
- '*** Error occurred while running adding image to slide: %s',
376
- str(ex)
377
- )
378
-
379
- return True
380
-
381
-
382
- def _handle_display_image__in_background(
383
- presentation: pptx.Presentation,
384
- slide_json: dict,
385
- slide_width_inch: float,
386
- slide_height_inch: float
387
- ) -> bool:
388
- """
389
- Add a slide with text and an image in the background. It works just like
390
- `_handle_default_display()` but with a background image added. If no image keyword is
391
- available, it will add only text to the slide.
392
-
393
- :param presentation: The presentation object.
394
- :param slide_json: The content of the slide as JSON data.
395
- :param slide_width_inch: The width of the slide in inches.
396
- :param slide_height_inch: The height of the slide in inches.
397
- :return: True if the slide has been processed.
398
- """
399
-
400
- img_keywords = slide_json['img_keywords'].strip()
401
-
402
- # Add a photo in the background, text in the foreground
403
- slide = presentation.slides.add_slide(presentation.slide_layouts[1])
404
- title_shape = slide.shapes.title
405
-
406
- try:
407
- body_shape = slide.shapes.placeholders[1]
408
- except KeyError:
409
- placeholders = get_slide_placeholders(slide, layout_number=1)
410
- # Layout 1 usually has two placeholders, including the title
411
- body_shape = slide.shapes.placeholders[placeholders[0][0]]
412
-
413
- title_shape.text = remove_slide_number_from_heading(slide_json['heading'])
414
-
415
- flat_items_list = get_flat_list_of_contents(slide_json['bullet_points'], level=0)
416
-
417
- for idx, an_item in enumerate(flat_items_list):
418
- if idx == 0:
419
- body_shape.text_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
420
- else:
421
- paragraph = body_shape.text_frame.add_paragraph()
422
- paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
423
- paragraph.level = an_item[1]
424
-
425
- if not img_keywords:
426
- # No keywords, so no image search and addition
427
- return True
428
-
429
- try:
430
- photo_url, page_url = ims.get_photo_url_from_api_response(
431
- ims.search_pexels(query=img_keywords, size='large')
432
- )
433
-
434
- if photo_url:
435
- picture = slide.shapes.add_picture(
436
- image_file=ims.get_image_from_url(photo_url),
437
- left=0,
438
- top=0,
439
- width=pptx.util.Inches(slide_width_inch),
440
- )
441
-
442
- _add_text_at_bottom(
443
- slide=slide,
444
- slide_width_inch=slide_width_inch,
445
- slide_height_inch=slide_height_inch,
446
- text='Photo provided by Pexels',
447
- hyperlink=page_url
448
- )
449
-
450
- # Move picture to background
451
- # https://github.com/scanny/python-pptx/issues/49#issuecomment-137172836
452
- slide.shapes._spTree.remove(picture._element)
453
- slide.shapes._spTree.insert(2, picture._element)
454
- except Exception as ex:
455
- logger.error(
456
- '*** Error occurred while running adding image to the slide background: %s',
457
- str(ex)
458
- )
459
-
460
- return True
461
-
462
-
463
- def _handle_icons_ideas(
464
- presentation: pptx.Presentation(),
465
- slide_json: dict,
466
- slide_width_inch: float,
467
- slide_height_inch: float
468
- ):
469
- """
470
- Add a slide with some icons and text.
471
- If no suitable icons are found, the step numbers are shown.
472
-
473
- :param presentation: The presentation object.
474
- :param slide_json: The content of the slide as JSON data.
475
- :param slide_width_inch: The width of the slide in inches.
476
- :param slide_height_inch: The height of the slide in inches.
477
- :return: True if the slide has been processed.
478
- """
479
-
480
- if 'bullet_points' in slide_json and slide_json['bullet_points']:
481
- items = slide_json['bullet_points']
482
-
483
- # Ensure that it is a single list of strings without any sub-list
484
- for step in items:
485
- if not isinstance(step, str) or not step.startswith(ICON_BEGINNING_MARKER):
486
- return False
487
-
488
- slide_layout = presentation.slide_layouts[5]
489
- slide = presentation.slides.add_slide(slide_layout)
490
- slide.shapes.title.text = remove_slide_number_from_heading(slide_json['heading'])
491
-
492
- n_items = len(items)
493
- text_box_size = INCHES_2
494
-
495
- # Calculate the total width of all pictures and the spacing
496
- total_width = n_items * ICON_SIZE
497
- spacing = (pptx.util.Inches(slide_width_inch) - total_width) / (n_items + 1)
498
- top = INCHES_3
499
-
500
- icons_texts = [
501
- (match.group(1), match.group(2)) for match in [
502
- ICONS_REGEX.search(item) for item in items
503
- ]
504
- ]
505
- fallback_icon_files = ice.find_icons([item[0] for item in icons_texts])
506
-
507
- for idx, item in enumerate(icons_texts):
508
- icon, accompanying_text = item
509
- icon_path = f'{GlobalConfig.ICONS_DIR}/{icon}.png'
510
-
511
- if not os.path.exists(icon_path):
512
- logger.warning(
513
- 'Icon not found: %s...using fallback icon: %s',
514
- icon, fallback_icon_files[idx]
515
- )
516
- icon_path = f'{GlobalConfig.ICONS_DIR}/{fallback_icon_files[idx]}.png'
517
-
518
- left = spacing + idx * (ICON_SIZE + spacing)
519
- # Calculate the center position for alignment
520
- center = left + ICON_SIZE / 2
521
-
522
- # Add a rectangle shape with a fill color (background)
523
- # The size of the shape is slightly bigger than the icon, so align the icon position
524
- shape = slide.shapes.add_shape(
525
- MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
526
- center - INCHES_0_5,
527
- top - (ICON_BG_SIZE - ICON_SIZE) / 2,
528
- INCHES_1, INCHES_1
529
- )
530
- shape.fill.solid()
531
- shape.shadow.inherit = False
532
-
533
- # Set the icon's background shape color
534
- shape.fill.fore_color.rgb = shape.line.color.rgb = random.choice(ICON_COLORS)
535
-
536
- # Add the icon image on top of the colored shape
537
- slide.shapes.add_picture(icon_path, left, top, height=ICON_SIZE)
538
-
539
- # Add a text box below the shape
540
- text_box = slide.shapes.add_shape(
541
- MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
542
- left=center - text_box_size / 2, # Center the text box horizontally
543
- top=top + ICON_SIZE + INCHES_0_2,
544
- width=text_box_size,
545
- height=text_box_size
546
- )
547
- text_frame = text_box.text_frame
548
- text_frame.text = accompanying_text
549
- text_frame.word_wrap = True
550
- text_frame.paragraphs[0].alignment = pptx.enum.text.PP_ALIGN.CENTER
551
-
552
- # Center the text vertically
553
- text_frame.vertical_anchor = pptx.enum.text.MSO_ANCHOR.MIDDLE
554
- text_box.fill.background() # No fill
555
- text_box.line.fill.background() # No line
556
- text_box.shadow.inherit = False
557
-
558
- # Set the font color based on the theme
559
- for paragraph in text_frame.paragraphs:
560
- for run in paragraph.runs:
561
- run.font.color.theme_color = pptx.enum.dml.MSO_THEME_COLOR.TEXT_2
562
-
563
- _add_text_at_bottom(
564
- slide=slide,
565
- slide_width_inch=slide_width_inch,
566
- slide_height_inch=slide_height_inch,
567
- text='More icons available in the SlideDeck AI repository',
568
- hyperlink='https://github.com/barun-saha/slide-deck-ai/tree/main/icons/png128'
569
- )
570
-
571
- return True
572
-
573
- return False
574
-
575
-
576
- def _add_text_at_bottom(
577
- slide: pptx.slide.Slide,
578
- slide_width_inch: float,
579
- slide_height_inch: float,
580
- text: str,
581
- hyperlink: Optional[str] = None,
582
- target_height: Optional[float] = 0.5
583
- ):
584
- """
585
- Add arbitrary text to a textbox positioned near the lower left side of a slide.
586
-
587
- :param slide: The slide.
588
- :param slide_width_inch: The width of the slide.
589
- :param slide_height_inch: The height of the slide.
590
- :param target_height: the target height of the box in inches (optional).
591
- :param text: The text to be added
592
- :param hyperlink: The hyperlink to be added to the text (optional).
593
- """
594
-
595
- footer = slide.shapes.add_textbox(
596
- left=INCHES_1,
597
- top=pptx.util.Inches(slide_height_inch - target_height),
598
- width=pptx.util.Inches(slide_width_inch),
599
- height=pptx.util.Inches(target_height)
600
- )
601
-
602
- paragraph = footer.text_frame.paragraphs[0]
603
- run = paragraph.add_run()
604
- run.text = text
605
- run.font.size = pptx.util.Pt(10)
606
- run.font.underline = False
607
-
608
- if hyperlink:
609
- run.hyperlink.address = hyperlink
610
-
611
-
612
- def _handle_double_col_layout(
613
- presentation: pptx.Presentation(),
614
- slide_json: dict,
615
- slide_width_inch: float,
616
- slide_height_inch: float
617
- ) -> bool:
618
- """
619
- Add a slide with a double column layout for comparison.
620
-
621
- :param presentation: The presentation object.
622
- :param slide_json: The content of the slide as JSON data.
623
- :param slide_width_inch: The width of the slide in inches.
624
- :param slide_height_inch: The height of the slide in inches.
625
- :return: True if double col layout has been added; False otherwise.
626
- """
627
-
628
- if 'bullet_points' in slide_json and slide_json['bullet_points']:
629
- double_col_content = slide_json['bullet_points']
630
-
631
- if double_col_content and (
632
- len(double_col_content) == 2
633
- ) and isinstance(double_col_content[0], dict) and isinstance(double_col_content[1], dict):
634
- slide = presentation.slide_layouts[4]
635
- slide = presentation.slides.add_slide(slide)
636
- placeholders = None
637
-
638
- shapes = slide.shapes
639
- title_placeholder = shapes.title
640
- title_placeholder.text = remove_slide_number_from_heading(slide_json['heading'])
641
-
642
- try:
643
- left_heading, right_heading = shapes.placeholders[1], shapes.placeholders[3]
644
- except KeyError:
645
- # For manually edited/added master slides, the placeholder idx numbers in the dict
646
- # will be different (>= 10)
647
- left_heading, right_heading = None, None
648
- placeholders = get_slide_placeholders(slide, layout_number=4)
649
-
650
- for idx, name in placeholders:
651
- if 'text placeholder' in name:
652
- if not left_heading:
653
- left_heading = shapes.placeholders[idx]
654
- elif not right_heading:
655
- right_heading = shapes.placeholders[idx]
656
-
657
- try:
658
- left_col, right_col = shapes.placeholders[2], shapes.placeholders[4]
659
- except KeyError:
660
- left_col, right_col = None, None
661
- if not placeholders:
662
- placeholders = get_slide_placeholders(slide, layout_number=4)
663
-
664
- for idx, name in placeholders:
665
- if 'content placeholder' in name:
666
- if not left_col:
667
- left_col = shapes.placeholders[idx]
668
- elif not right_col:
669
- right_col = shapes.placeholders[idx]
670
-
671
- left_col_frame, right_col_frame = left_col.text_frame, right_col.text_frame
672
-
673
- if 'heading' in double_col_content[0] and left_heading:
674
- left_heading.text = double_col_content[0]['heading']
675
- if 'bullet_points' in double_col_content[0]:
676
- flat_items_list = get_flat_list_of_contents(
677
- double_col_content[0]['bullet_points'], level=0
678
- )
679
-
680
- if not left_heading:
681
- left_col_frame.text = double_col_content[0]['heading']
682
-
683
- for idx, an_item in enumerate(flat_items_list):
684
- if left_heading and idx == 0:
685
- left_col_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
686
- else:
687
- paragraph = left_col_frame.add_paragraph()
688
- paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
689
- paragraph.level = an_item[1]
690
-
691
- if 'heading' in double_col_content[1] and right_heading:
692
- right_heading.text = double_col_content[1]['heading']
693
- if 'bullet_points' in double_col_content[1]:
694
- flat_items_list = get_flat_list_of_contents(
695
- double_col_content[1]['bullet_points'], level=0
696
- )
697
-
698
- if not right_heading:
699
- right_col_frame.text = double_col_content[1]['heading']
700
-
701
- for idx, an_item in enumerate(flat_items_list):
702
- if right_col_frame and idx == 0:
703
- right_col_frame.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
704
- else:
705
- paragraph = right_col_frame.add_paragraph()
706
- paragraph.text = an_item[0].removeprefix(STEP_BY_STEP_PROCESS_MARKER)
707
- paragraph.level = an_item[1]
708
-
709
- _handle_key_message(
710
- the_slide=slide,
711
- slide_json=slide_json,
712
- slide_height_inch=slide_height_inch,
713
- slide_width_inch=slide_width_inch
714
- )
715
-
716
- return True
717
-
718
- return False
719
-
720
-
721
- def _handle_step_by_step_process(
722
- presentation: pptx.Presentation,
723
- slide_json: dict,
724
- slide_width_inch: float,
725
- slide_height_inch: float
726
- ) -> bool:
727
- """
728
- Add shapes to display a step-by-step process in the slide, if available.
729
-
730
- :param presentation: The presentation object.
731
- :param slide_json: The content of the slide as JSON data.
732
- :param slide_width_inch: The width of the slide in inches.
733
- :param slide_height_inch: The height of the slide in inches.
734
- :return: True if this slide has a step-by-step process depiction added; False otherwise.
735
- """
736
-
737
- if 'bullet_points' in slide_json and slide_json['bullet_points']:
738
- steps = slide_json['bullet_points']
739
-
740
- no_marker_count = 0.0
741
- n_steps = len(steps)
742
-
743
- # Ensure that it is a single list of strings without any sub-list
744
- for step in steps:
745
- if not isinstance(step, str):
746
- return False
747
-
748
- # In some cases, one or two steps may not begin with >>, e.g.:
749
- # {
750
- # "heading": "Step-by-Step Process: Creating a Legacy",
751
- # "bullet_points": [
752
- # "Identify your unique talents and passions",
753
- # ">> Develop your skills and knowledge",
754
- # ">> Create meaningful work",
755
- # ">> Share your work with the world",
756
- # ">> Continuously learn and adapt"
757
- # ],
758
- # "key_message": ""
759
- # },
760
- #
761
- # Use a threshold, e.g., at most 25%
762
- if not step.startswith(STEP_BY_STEP_PROCESS_MARKER):
763
- no_marker_count += 1
764
-
765
- slide_header = slide_json['heading'].lower()
766
- if (no_marker_count / n_steps > 0.25) and not (
767
- ('step-by-step' in slide_header) or ('step by step' in slide_header)
768
- ):
769
- return False
770
-
771
- if n_steps < 3 or n_steps > 6:
772
- # Two steps -- probably not a process
773
- # More than 5--6 steps -- would likely cause a visual clutter
774
- return False
775
-
776
- bullet_slide_layout = presentation.slide_layouts[1]
777
- slide = presentation.slides.add_slide(bullet_slide_layout)
778
- shapes = slide.shapes
779
- shapes.title.text = remove_slide_number_from_heading(slide_json['heading'])
780
-
781
- if 3 <= n_steps <= 4:
782
- # Horizontal display
783
- height = INCHES_1_5
784
- width = pptx.util.Inches(slide_width_inch / n_steps - 0.01)
785
- top = pptx.util.Inches(slide_height_inch / 2)
786
- left = pptx.util.Inches((slide_width_inch - width.inches * n_steps) / 2 + 0.05)
787
-
788
- for step in steps:
789
- shape = shapes.add_shape(MSO_AUTO_SHAPE_TYPE.CHEVRON, left, top, width, height)
790
- shape.text = step.removeprefix(STEP_BY_STEP_PROCESS_MARKER)
791
- left += width - INCHES_0_4
792
- elif 4 < n_steps <= 6:
793
- # Vertical display
794
- height = pptx.util.Inches(0.65)
795
- top = pptx.util.Inches(slide_height_inch / 4)
796
- left = INCHES_1 # slide_width_inch - width.inches)
797
-
798
- # Find the close to median width, based on the length of each text, to be set
799
- # for the shapes
800
- width = pptx.util.Inches(slide_width_inch * 2 / 3)
801
- lengths = [len(step) for step in steps]
802
- font_size_20pt = pptx.util.Pt(20)
803
- widths = sorted(
804
- [
805
- min(
806
- pptx.util.Inches(font_size_20pt.inches * a_len),
807
- width
808
- ) for a_len in lengths
809
- ]
810
- )
811
- width = widths[len(widths) // 2]
812
-
813
- for step in steps:
814
- shape = shapes.add_shape(MSO_AUTO_SHAPE_TYPE.PENTAGON, left, top, width, height)
815
- shape.text = step.removeprefix(STEP_BY_STEP_PROCESS_MARKER)
816
- top += height + INCHES_0_3
817
- left += INCHES_0_5
818
-
819
- return True
820
-
821
-
822
- def _handle_key_message(
823
- the_slide: pptx.slide.Slide,
824
- slide_json: dict,
825
- slide_width_inch: float,
826
- slide_height_inch: float
827
- ):
828
- """
829
- Add a shape to display the key message in the slide, if available.
830
-
831
- :param the_slide: The slide to be processed.
832
- :param slide_json: The content of the slide as JSON data.
833
- :param slide_width_inch: The width of the slide in inches.
834
- :param slide_height_inch: The height of the slide in inches.
835
- """
836
-
837
- if 'key_message' in slide_json and slide_json['key_message']:
838
- height = pptx.util.Inches(1.6)
839
- width = pptx.util.Inches(slide_width_inch / 2.3)
840
- top = pptx.util.Inches(slide_height_inch - height.inches - 0.1)
841
- left = pptx.util.Inches((slide_width_inch - width.inches) / 2)
842
- shape = the_slide.shapes.add_shape(
843
- MSO_AUTO_SHAPE_TYPE.ROUNDED_RECTANGLE,
844
- left=left,
845
- top=top,
846
- width=width,
847
- height=height
848
- )
849
- shape.text = slide_json['key_message']
850
-
851
-
852
- def _get_slide_width_height_inches(presentation: pptx.Presentation) -> Tuple[float, float]:
853
- """
854
- Get the dimensions of a slide in inches.
855
-
856
- :param presentation: The presentation object.
857
- :return: The width and the height.
858
- """
859
-
860
- slide_width_inch = EMU_TO_INCH_SCALING_FACTOR * presentation.slide_width
861
- slide_height_inch = EMU_TO_INCH_SCALING_FACTOR * presentation.slide_height
862
- # logger.debug('Slide width: %f, height: %f', slide_width_inch, slide_height_inch)
863
-
864
- return slide_width_inch, slide_height_inch
865
-
866
-
867
- if __name__ == '__main__':
868
- _JSON_DATA = '''
869
- {
870
- "title": "AI Applications: Transforming Industries",
871
- "slides": [
872
- {
873
- "heading": "Introduction to AI Applications",
874
- "bullet_points": [
875
- "Artificial Intelligence (AI) is transforming various industries",
876
- "AI applications range from simple decision-making tools to complex systems",
877
- "AI can be categorized into types: Rule-based, Instance-based, and Model-based"
878
- ],
879
- "key_message": "AI is a broad field with diverse applications and categories",
880
- "img_keywords": "AI, transformation, industries, decision-making, categories"
881
- },
882
- {
883
- "heading": "AI in Everyday Life",
884
- "bullet_points": [
885
- "Virtual assistants like Siri, Alexa, and Google Assistant",
886
- "Recommender systems in Netflix, Amazon, and Spotify",
887
- "Fraud detection in banking and credit card transactions"
888
- ],
889
- "key_message": "AI is integrated into our daily lives through various services",
890
- "img_keywords": "virtual assistants, recommender systems, fraud detection"
891
- },
892
- {
893
- "heading": "AI in Healthcare",
894
- "bullet_points": [
895
- "Disease diagnosis and prediction using machine learning algorithms",
896
- "Personalized medicine and drug discovery",
897
- "AI-powered robotic surgeries and remote patient monitoring"
898
- ],
899
- "key_message": "AI is revolutionizing healthcare with improved diagnostics and patient care",
900
- "img_keywords": "healthcare, disease diagnosis, personalized medicine, robotic surgeries"
901
- },
902
- {
903
- "heading": "AI in Key Industries",
904
- "bullet_points": [
905
- {
906
- "heading": "Retail",
907
- "bullet_points": [
908
- "Inventory management and demand forecasting",
909
- "Customer segmentation and targeted marketing",
910
- "AI-driven chatbots for customer service"
911
- ]
912
- },
913
- {
914
- "heading": "Finance",
915
- "bullet_points": [
916
- "Credit scoring and risk assessment",
917
- "Algorithmic trading and portfolio management",
918
- "AI for detecting money laundering and cyber fraud"
919
- ]
920
- }
921
- ],
922
- "key_message": "AI is transforming retail and finance with improved operations and decision-making",
923
- "img_keywords": "retail, finance, inventory management, credit scoring, algorithmic trading"
924
- },
925
- {
926
- "heading": "AI in Education",
927
- "bullet_points": [
928
- "Personalized learning paths and adaptive testing",
929
- "Intelligent tutoring systems for skill development",
930
- "AI for predicting student performance and dropout rates"
931
- ],
932
- "key_message": "AI is personalizing education and improving student outcomes",
933
- },
934
- {
935
- "heading": "Step-by-Step: AI Development Process",
936
- "bullet_points": [
937
- ">> Define the problem and objectives",
938
- ">> Collect and preprocess data",
939
- ">> Select and train the AI model",
940
- ">> Evaluate and optimize the model",
941
- ">> Deploy and monitor the AI system"
942
- ],
943
- "key_message": "Developing AI involves a structured process from problem definition to deployment",
944
- "img_keywords": ""
945
- },
946
- {
947
- "heading": "AI Icons: Key Aspects",
948
- "bullet_points": [
949
- "[[brain]] Human-like intelligence and decision-making",
950
- "[[robot]] Automation and physical tasks",
951
- "[[]] Data processing and cloud computing",
952
- "[[lightbulb]] Insights and predictions",
953
- "[[globe2]] Global connectivity and impact"
954
- ],
955
- "key_message": "AI encompasses various aspects, from human-like intelligence to global impact",
956
- "img_keywords": "AI aspects, intelligence, automation, data processing, global impact"
957
- },
958
- {
959
- "heading": "Conclusion: Embracing AI's Potential",
960
- "bullet_points": [
961
- "AI is transforming industries and improving lives",
962
- "Ethical considerations are crucial for responsible AI development",
963
- "Invest in AI education and workforce development",
964
- "Call to action: Explore AI applications and contribute to shaping its future"
965
- ],
966
- "key_message": "AI offers immense potential, and we must embrace it responsibly",
967
- "img_keywords": "AI transformation, ethical considerations, AI education, future of AI"
968
- }
969
- ]
970
- }'''
971
-
972
- temp = tempfile.NamedTemporaryFile(delete=False, suffix='.pptx')
973
- path = pathlib.Path(temp.name)
974
-
975
- generate_powerpoint_presentation(
976
- json5.loads(_JSON_DATA),
977
- output_file_path=path,
978
- slides_template='Basic'
979
- )
980
- print(f'File path: {path}')
981
-
982
- temp.close()

helpers/text_helper.py DELETED
@@ -1,89 +0,0 @@
1
- import json_repair as jr
2
-
3
-
4
- def is_valid_prompt(prompt: str) -> bool:
5
- """
6
- Verify whether user input satisfies the concerned constraints.
7
-
8
- :param prompt: The user input text.
9
- :return: True if all criteria are satisfied; False otherwise.
10
- """
11
-
12
- if len(prompt) < 7 or ' ' not in prompt:
13
- return False
14
-
15
- return True
16
-
17
-
18
- def get_clean_json(json_str: str) -> str:
19
- """
20
- Attempt to clean a JSON response string from the LLM by removing the trailing ```
21
- and any text beyond that.
22
- CAUTION: May not be always accurate.
23
-
24
- :param json_str: The input string in JSON format.
25
- :return: The "cleaned" JSON string.
26
- """
27
-
28
- # An example of response containing JSON and other text:
29
- # {
30
- # "title": "AI and the Future: A Transformative Journey",
31
- # "slides": [
32
- # ...
33
- # ]
34
- # } <<---- This is end of valid JSON content
35
- # ```
36
- #
37
- # ```vbnet
38
- # Please note that the JSON output is in valid format but the content of the "Role of GPUs in AI" slide is just an example and may not be factually accurate. For accurate information, you should consult relevant resources and update the content accordingly.
39
- # ```
40
- response_cleaned = json_str
41
-
42
- while True:
43
- idx = json_str.rfind('```') # -1 on failure
44
-
45
- if idx <= 0:
46
- break
47
-
48
- # In the ideal scenario, the character before the last ``` should be
49
- # a new line or a closing bracket }
50
- prev_char = json_str[idx - 1]
51
-
52
- if (prev_char == '}') or (prev_char == '\n' and json_str[idx - 2] == '}'):
53
- response_cleaned = json_str[:idx]
54
-
55
- json_str = json_str[:idx]
56
-
57
- return response_cleaned
58
-
59
-
60
- def fix_malformed_json(json_str: str) -> str:
61
- """
62
- Try and fix the syntax error(s) in a JSON string.
63
-
64
- :param json_str: The input JSON string.
65
- :return: The fixed JSON string.
66
- """
67
-
68
- return jr.repair_json(json_str, skip_json_loads=True)
69
-
70
-
71
- if __name__ == '__main__':
72
- json1 = '''{
73
- "key": "value"
74
- }
75
- '''
76
- json2 = '''["Reason": "Regular updates help protect against known vulnerabilities."]'''
77
- json3 = '''["Reason" Regular updates help protect against known vulnerabilities."]'''
78
- json4 = '''
79
- {"bullet_points": [
80
- ">> Write without stopping or editing",
81
- >> Set daily writing goals and stick to them,
82
- ">> Allow yourself to make mistakes"
83
- ],}
84
- '''
85
-
86
- print(fix_malformed_json(json1))
87
- print(fix_malformed_json(json2))
88
- print(fix_malformed_json(json3))
89
- print(fix_malformed_json(json4))

icons/png128/0-circle.png DELETED
Binary file (4.1 kB)
 
icons/png128/1-circle.png DELETED
Binary file (3.45 kB)
 
icons/png128/123.png DELETED
Binary file (2.5 kB)
 
icons/png128/2-circle.png DELETED
Binary file (4.01 kB)
 
icons/png128/3-circle.png DELETED
Binary file (4.24 kB)
 
icons/png128/4-circle.png DELETED
Binary file (3.74 kB)
 
icons/png128/5-circle.png DELETED
Binary file (4.12 kB)
 
icons/png128/6-circle.png DELETED
Binary file (4.37 kB)
 
icons/png128/7-circle.png DELETED
Binary file (3.78 kB)
 
icons/png128/8-circle.png DELETED
Binary file (4.43 kB)
 
icons/png128/9-circle.png DELETED
Binary file (4.44 kB)
 
icons/png128/activity.png DELETED
Binary file (1.38 kB)
 
icons/png128/airplane.png DELETED
Binary file (2.09 kB)
 
icons/png128/alarm.png DELETED
Binary file (4.08 kB)
 
icons/png128/alien-head.png DELETED
Binary file (4.73 kB)
 
icons/png128/alphabet.png DELETED
Binary file (2.44 kB)
 
icons/png128/amazon.png DELETED
Binary file (3.56 kB)
 
icons/png128/amritsar-golden-temple.png DELETED
Binary file (4.44 kB)
 
icons/png128/amsterdam-canal.png DELETED
Binary file (3.32 kB)
 
icons/png128/amsterdam-windmill.png DELETED
Binary file (2.67 kB)
 
icons/png128/android.png DELETED
Binary file (2.24 kB)
 
icons/png128/angkor-wat.png DELETED
Binary file (2.64 kB)
 
icons/png128/apple.png DELETED
Binary file (2.4 kB)
 
icons/png128/archive.png DELETED
Binary file (1.27 kB)
 
icons/png128/argentina-obelisk.png DELETED
Binary file (1.39 kB)
 
icons/png128/artificial-intelligence-brain.png DELETED
Binary file (4.73 kB)
 
icons/png128/atlanta.png DELETED
Binary file (2.87 kB)
 
icons/png128/austin.png DELETED
Binary file (1.72 kB)
 
icons/png128/automation-decision.png DELETED
Binary file (1.19 kB)
 
icons/png128/award.png DELETED
Binary file (2.55 kB)
 
icons/png128/balloon.png DELETED
Binary file (2.83 kB)
 
icons/png128/ban.png DELETED
Binary file (3.32 kB)
 
icons/png128/bandaid.png DELETED
Binary file (3.53 kB)
 
icons/png128/bangalore.png DELETED
Binary file (2.4 kB)
 
icons/png128/bank.png DELETED
Binary file (1.4 kB)
 
icons/png128/bar-chart-line.png DELETED
Binary file (802 Bytes)