
Fabrizio Salmi

youknowwhatAImean

fabriziosalmi

AI & ML interests

None yet


Organizations

None yet

youknowwhatAImean's activity

upvoted a collection 11 days ago

Gemma Neogenesis 💎🌍🇮🇹

Collection

Datasets and models for Neogenesis: Post-training recipe for improving Gemma 2 for a specific language. Notebook: https://t.ly/iuKdy • 11 items • Updated 22 days ago • 5

liked 2 models 8 months ago

CAMB-AI/MARS5-TTS

Text-to-Speech • Updated Jul 5, 2024 • 212 • 448

llamas-community/LlamaGuard-7b

Text Generation • Updated Dec 7, 2023 • 226 • 12

New activity in youknowwhatAImean/simplemath-ita-sparse 9 months ago

[bot] Conversion to Parquet

#2 opened 9 months ago by parquet-converter

liked a Space 9 months ago

424

MeloTTS

🗣

Fast, efficient, & multilingual text-to-speech

liked a model 9 months ago

microsoft/Phi-3-mini-128k-instruct

Text Generation • Updated Aug 20, 2024 • 208k • 1.63k

New activity in youknowwhatAImean/simplemath-ita-sparse 9 months ago

Librarian Bot: Add language metadata for dataset

#1 opened 9 months ago by librarian-bot

updated a dataset 9 months ago

youknowwhatAImean/simplemath-ita-sparse

Viewer • Updated May 8, 2024 • 20M • 57

upvoted an article 9 months ago

Article

How to Finetune phi-3 on MacBook Pro • Apr 24, 2024 • 65

updated a Space 9 months ago

AutoTrain Advanced

🚀

reacted to Jaward's post with 👍 9 months ago

Post

5377

All You Need to Know About Phi-3 (Technical Report Walkthrough)

Summary of Summaries:
Phi-3-mini
- Architecture specs: decoder-only transformer, model size: 3.8 billion parameters, default context length [ 4K ], extended to [ 128K ] via LongRope, vocab size [ 32064 ], trained on 3.3 trillion tokens in bfloat16.
- Rivals the performance of larger models such as Mixtral 8x7B and GPT-3.5, while being small enough to run locally on a smartphone.
- Trained on a high-quality dataset of heavily filtered web data and LLM-generated synthetic data.
- Can be quantized to 4 bits, occupying ≈ 1.8GB of memory.
- Ran natively on an iPhone 14 with the A16 Bionic chip at an inference speed of more than 12 tokens per second.
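
The ≈ 1.8GB figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an illustration: the helper name `quantized_size_bytes` is made up here, it counts weights only (no quantization scales, activations, or KV cache), and the 3.8 billion parameter count is taken from the summary above.

```python
def quantized_size_bytes(num_params: float, bits_per_weight: int) -> float:
    """Weights-only storage in bytes (hypothetical helper, ignores all overhead)."""
    return num_params * bits_per_weight / 8


PHI3_MINI_PARAMS = 3.8e9  # parameter count quoted in the post

size_4bit = quantized_size_bytes(PHI3_MINI_PARAMS, 4)
size_bf16 = quantized_size_bytes(PHI3_MINI_PARAMS, 16)

# Report both decimal gigabytes (1e9 bytes) and binary gibibytes (2**30 bytes).
print(f"4-bit: {size_4bit / 1e9:.2f} GB ({size_4bit / 2**30:.2f} GiB)")
print(f"bf16:  {size_bf16 / 1e9:.2f} GB ({size_bf16 / 2**30:.2f} GiB)")
```

Under these assumptions the 4-bit weights come to about 1.9 GB, or roughly 1.77 GiB, which lines up with the quoted ≈ 1.8GB if that number is read in binary units. The bfloat16 weights are four times larger, which is what makes quantization the difference between fitting on a phone and not.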

Phi-3-small
- Architecture specs: also decoder-only, 7B parameters, vocab size [ 100352 ], default context length [ 8K ], hidden dimension: 4096, number of heads and layers: follows the standard 7B-class structure.
- Uses the tiktoken tokenizer (for better multilingual tokenization).

Phi-3-medium
- Architecture specs: also decoder-only, hidden dimension: 5120, number of heads: 40, number of layers: 40, same tokenizer as phi-3-mini, trained on 4.8 trillion tokens.

Training Methodology:
- Prioritizes training-data quality over raw scale, deviating from standard scaling laws.
- The models undergo two-phase pre-training on a mix of web sources and synthetic data, targeting general knowledge and logical reasoning skills.
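
One way to see the departure from standard scaling laws is to compare training tokens per parameter against the commonly cited compute-optimal rule of thumb of roughly 20 tokens per parameter. That 20x figure is an outside assumption, not something stated in this post; the token and parameter counts are the phi-3-mini numbers quoted above. A rough sketch:

```python
# Assumption: ~20 training tokens per parameter is a widely quoted
# compute-optimal rule of thumb; it is not taken from the post itself.
RULE_OF_THUMB_TOKENS_PER_PARAM = 20


def tokens_per_param(train_tokens: float, num_params: float) -> float:
    """Ratio of training tokens to model parameters."""
    return train_tokens / num_params


# phi-3-mini: 3.3 trillion tokens, 3.8 billion parameters (from the post)
mini_ratio = tokens_per_param(3.3e12, 3.8e9)

print(f"phi-3-mini: {mini_ratio:.0f} tokens per parameter")
print(f"vs. rule of thumb: {mini_ratio / RULE_OF_THUMB_TOKENS_PER_PARAM:.0f}x")
```

With these inputs phi-3-mini sees on the order of 870 tokens per parameter, dozens of times more than the rule of thumb would suggest for a model of its size.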

Performance:
- Phi-3-mini achieves competitive scores on standard benchmarks like MMLU and MT-Bench, indicating strong reasoning capabilities.
- The larger variants (phi-3-small, phi-3-medium) perform even better, suggesting the approach scales with model size.

Limitations:
- phi-3-mini: limited by its small size on tasks requiring extensive factual knowledge; primarily supports English.
- phi-3-small: limited multilingual support.

Hosting LLMs locally is a big win for OSS: private, secure inference on the go 😎