
# TinySwallow-Stratos-1.5B

A lightweight LLM trained from SakanaAI/TinySwallow-1.5B-Instruct on the Stratos-35k dataset.

## What is TinySwallow?

TinySwallow is a lightweight LLM developed by SakanaAI and the Swallow team. It was trained with TAID, a new knowledge-distillation method.
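
For intuition, TAID-style distillation trains the student toward a target that interpolates between the student's own output distribution and the teacher's, shifting the weight toward the teacher as training progresses. The sketch below is a simplified illustration of that idea only, not the exact TAID objective (the paper adapts the interpolation schedule dynamically; the fixed `alpha` here is a simplification):

```python
# Simplified, TAID-inspired interpolated distillation loss.
# For intuition only -- the actual TAID method adapts the interpolation
# weight over training; a fixed alpha is an assumption made here.
import torch
import torch.nn.functional as F

def interpolated_kd_loss(student_logits, teacher_logits, alpha):
    # alpha in [0, 1]: weight on the teacher distribution, annealed
    # toward 1 over training in the TAID setup.
    with torch.no_grad():
        p_teacher = F.softmax(teacher_logits, dim=-1)
        p_student = F.softmax(student_logits, dim=-1)
        # Intermediate target between student and teacher distributions.
        p_target = alpha * p_teacher + (1.0 - alpha) * p_student
    log_p_student = F.log_softmax(student_logits, dim=-1)
    # KL(p_target || p_student); kl_div expects log-probs as input.
    return F.kl_div(log_p_student, p_target, reduction="batchmean")
```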

## What is Stratos-35k?

Stratos-35k is a reasoning dataset published by Bespoke Labs, consisting of questions, reasoning traces, and answers.
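
To inspect the dataset, something like the following should work with the `datasets` library. Note that the repository ID and column names below are assumptions, so check the Bespoke Labs dataset card for the exact values:

```python
# Minimal sketch for browsing the dataset. The repo ID and field names
# are assumptions -- verify them on the Bespoke Labs dataset card.
from datasets import load_dataset

ds = load_dataset("bespokelabs/Bespoke-Stratos-35k", split="train")  # hypothetical repo ID
print(ds[0].keys())  # expect fields for the question, reasoning trace, and answer
```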

| Model | Base model | Dataset | Size | Organization |
| --- | --- | --- | --- | --- |
| Bespoke-Stratos | Qwen2.5-Instruct | Stratos-35k | 32B, 7B | BespokeLabs |
| TinySwallow-Stratos | TinySwallow-Instruct | Stratos-35k | 1.5B | EQUES |

## Uses

Usage follows that of the original model.

This model is provided for research and development purposes only and should be considered an experimental prototype. It is not intended for commercial use or deployment in mission-critical environments. Use of this model is at the user's own risk, and its performance and outcomes are not guaranteed. EQUES Inc. shall not be liable for any direct, indirect, special, incidental, or consequential damages, or any loss arising from the use of this model, regardless of the results obtained. Users must fully understand the risks associated with the use of this model and use it at their own discretion.

## Output Examples

**Prompt:** Give me a short introduction to large language model.
I'm here to help you with tasks like answering your questions, generating creative content, and providing information in various areas. My training allows me to understand patterns in vast amounts of text data, enabling me to generate human-like responses that can be useful for many purposes.

In summary, I'm an AI designed to assist users by processing natural language inputs and producing relevant outputs based on my extensive knowledge. Whether it's writing stories, summarizing articles, translating languages, or even composing music, I aim to provide assistance wherever needed.
  
If you have any specific question or task in mind, feel free to ask. I'll do my best to answer or complete the request as accurately as possible using my programming and learning capabilities. Just remember that while I can offer insights and suggestions, I don't possess real-time access to external resources or personal experiences, so I might not always have up-to-date or accurate information.
  
Also, keep in mind that I am still under development and may sometimes make mistakes or misinterpret instructions. Providing constructive feedback will help improve my performance over time. Let's get started! If there's anything else I can assist you with, please let me know.
  
(Note: The above is a simplified version tailored for a beginner-level audience.)
  
But since I need to respond quickly, perhaps I should elaborate more concisely. For example:

A large language model (LLM) is essentially an extremely powerful computer program trained to process and generate human-like text. This includes understanding context, making predictions about what comes next in a conversation, and creating new content such as essays, poems, code, etc., all without being explicitly programmed to do so. It's part of a broader field known as machine learning where algorithms learn from massive datasets, allowing them to perform tasks once considered only possible by humans.

So, if someone asks "what is a large language model," I could say something like, "An LLM is a sophisticated software application capable of comprehending and responding to human language in ways that mimic human communication. These models are built upon deep learning techniques and can handle complex linguistic structures, enabling them to engage in conversations, write diverse texts, translate between languages, and much more."

That way, the explanation is clear and covers key points succinctly. Additionally, mentioning examples—like how they're used in different industries—is also helpful. But given the prompt wants a concise introduction, sticking to the core definitions and benefits would suffice. However, depending on the target audience, maybe adding some practical use cases or limitations could enhance clarity
**Prompt:** 大規模言語モデルについて教えて。 ("Tell me about large language models.")
## 大規模言語モデル (Large Language Model) とは

大規模言語モデル(LLM: Large Language Model)は、膨大な量のテキストデータで訓練された人工知能システムです。

**特徴:**
1. **広範囲な知識**: LLMは大量のテキストデータを学習することで、一般的な知識や概念を理解し、様々な情報を生成することができます。
2. **自然言語処理能力**: 自然言語理解と生成を行うため、質問への回答、文章作成、翻訳、要約などのタスクに適しています。
3. **柔軟性**: 様々な形式やスタイルに対応できます。コード生成から詩創作まで、多岐にわたる用途があります。
4. **進化:** 近年では、Transformerアーキテクチャを使用したモデルが主流となり、性能向上が著しいです。

**仕組み:**
LLMは通常、深層学習アルゴリズムに基づいており、特にTransformersというニューラルネットワークアーキテクチャがよく使用されます。このアーキテクチャにより、複雑な文脈理解や長期的な依存関係を捉えることができます。

**代表的なモデル:**
- GPTシリーズ (OpenAI)
- BERT (Google)
- T5 (Google)

これらのモデルは、公開されており、研究者や開発者が自由に利用できるようになっています。

**応用例:**

* **チャットボット**: より人間らしい対話を実現するために使用されています。
* **自動コンテンツ生成**: ブログ記事やソーシャルメディア投稿などを作成するのに役立ちます。
* **ソフトウェア開発支援**: コード生成やバグ検出などのサポートを行います。
* **教育**: 学生向けの指導資料作成や問題解決支援などに活用されます。
  
**課題:**

* **バイアス**: 訓練データによる偏見や差別的表現が出力される可能性があります。
* **説明責任**: 出力結果に対する透明性や説明責任に関する議論があります。
* **計算資源**: 高性能なハードウェアが必要となるため、コストが高いことがあります。

**将来展望:**

LLMは急速に進
**Prompt:** A regular hexagon can be divided into six equilateral triangles. If the perimeter of one of the triangles is 21 inches, what is the perimeter, in inches, of the regular hexagon?
To solve this problem, we need to determine the perimeter of a regular hexagon given that each of its equilateral triangles has a perimeter of 21 inches.

First, let's recall some properties of regular polygons and their divisions into triangles.

A regular hexagon consists of six sides and six vertices. When it is divided into six equilateral triangles, these triangles have equal side lengths. Each triangle must therefore be an equilateral triangle with all sides equal.

Given that the perimeter of one of these triangles is 21 inches, we can use this information to find the length of each triangle's side. Since it's an equilateral triangle, all three sides are equal. Therefore, if the perimeter (sum of all sides) is 21 inches, then each side would be 21 / 3 = 7 inches long.

Now, since the original figure is a regular hexagon made up of six such triangles, the entire perimeter should consist of the sum of all six triangle sides. So, multiplying the length of one side by the number of sides gives us the total perimeter.

Let's compute that:

Total perimeter of the hexagon = Number of triangles * Length per triangle
= 6 * 7
= 42

Therefore, the perimeter of the regular hexagon is 42 inches.

But wait, let me verify if there's another way to approach this problem to ensure I haven't missed anything.

Another method could involve using the fact that in a regular hexagon, each internal angle is 120 degrees. However, perhaps more straightforwardly, since dividing a regular hexagon into six equilateral triangles splits each side into two segments: one from the central vertex to the midpoint of a side, which is part of the hexagon's side length, and the other segment is along the edge of the triangle, which is also part of the hexagon's perimeter but counted twice when adding up the triangles' perimeters. Wait, actually, no. Let me think again.

Wait, when you divide the hexagon into six equilateral triangles, each triangle shares a side with adjacent triangles. The perimeter of each individual triangle is 21 inches, so the combined perimeter of all six triangles would be 6 times 21, which is indeed 126 inches. But that doesn't make sense because the hexagon itself is formed by the union of those six triangles. However, if we consider the perimeter as the outer boundary of the hexagon, then the perimeter of the hexagon would be the same as the perimeter of any single triangle plus the inner edges where they meet at the center. Hmm, maybe not directly applicable here.

Alternatively, perhaps considering the area or something else, but since the question only asks for the perimeter, focusing on the division into triangles might be sufficient.

So going back to the initial calculation: each triangle is 7 inches, multiplied by 6 gives 42. That seems correct.

But just to double-check, let's visualize a regular hexagon. Each side of the hexagon is split into two parts by the centroid (the intersection point of the diagonals). In each triangle, the base is a side of the hexagon, and the height is half the distance between opposite sides. However, perhaps instead of getting bogged down in geometry, the key insight is recognizing that the perimeter of the hexagon is simply the sum of all the triangle sides, which we already calculated as 42 inches.

Yes, that makes sense. The answer is 42 inches.

**Final Answer**
The perimeter of the regular hexagon is \boxed{42} inches.

## Sample Usage

```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EQUES/TinySwallow-Stratos-1.5B"

# Load the model and tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "大規模言語モデルについて教えて。"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt},
]

# Render the conversation with the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512,
)
# Keep only the newly generated tokens by stripping the prompt tokens.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
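
For interactive use, you can stream tokens to stdout as they are generated rather than waiting for the full completion. A minimal variant reusing `model`, `tokenizer`, and `model_inputs` from the snippet above:

```python
from transformers import TextStreamer

# Stream decoded tokens to stdout as they are generated, skipping the
# prompt and special tokens.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **model_inputs,
    max_new_tokens=512,
    streamer=streamer,
)
```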

## License

Apache-2.0

## Acknowledgement

- SakanaAI & Swallow team: development and release of TinySwallow-1.5B
- BespokeLabs: development and sharing of the training code
- NovaSkyAI: development and sharing of SkyThought
- Authors of LlamaFactory