RWKV v7 Potato Model Card


Model Overview

  • Name: RWKV v7 Potato
  • Architecture: RWKV v7 with MoLE (Mixture of LoRA Experts)
  • Base Model: RWKV-x070-World-0.4B-v2.9-20250107-ctx4096
  • Parameter Count: 0.6B (540M)
  • License: Apache 2.0

Technical Specifications

  • Training Approach: LoRA (r=256)
  • Expert Configuration (see the sketch after this list):
    • Total LoRA Experts: 4
    • Active LoRA Experts: 2 (shared Expert 0 plus one routed expert)
  • End Token: \n\n\x17
  • Inference: supported only by the latest RWKV-Infer
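
Below is a minimal sketch of how a Mixture of LoRA Experts (MoLE) linear layer could combine a frozen base weight with an always-active shared expert and one routed expert. It is an illustration under assumptions, not the actual RWKV v7 Potato code: the MoLELinear module, the softmax router, and the top-1 routing rule are hypothetical, while r=256 and the 4-expert / 2-active configuration follow the specifications above.

```python
# Hypothetical MoLE layer sketch (PyTorch). Expert 0 is the shared expert and
# is always applied; one of the remaining experts is chosen per token (top-1).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoLELinear(nn.Module):
    def __init__(self, d_in, d_out, r=256, n_experts=4, alpha=256):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)      # base weights stay frozen
        self.scale = alpha / r
        # LoRA factors: delta_W_e = B_e @ A_e, one pair per expert
        self.A = nn.Parameter(torch.randn(n_experts, r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(n_experts, d_out, r))
        # Router scores only the non-shared experts (1 .. n_experts - 1)
        self.router = nn.Linear(d_in, n_experts - 1, bias=False)

    def forward(self, x):                            # x: (batch, d_in)
        y = self.base(x)
        # Shared Expert 0 is always active
        y = y + self.scale * F.linear(F.linear(x, self.A[0]), self.B[0])
        # Top-1 routing over the remaining experts
        gate = F.softmax(self.router(x), dim=-1)     # (batch, n_experts - 1)
        top_w, top_i = gate.max(dim=-1)
        for e in range(1, self.A.shape[0]):
            mask = (top_i == e - 1).float().unsqueeze(-1)
            delta = self.scale * F.linear(F.linear(x, self.A[e]), self.B[e])
            y = y + mask * top_w.unsqueeze(-1) * delta
        return y


# Example: two active experts (shared + routed) per token
layer = MoLELinear(d_in=1024, d_out=1024)
print(layer(torch.randn(2, 1024)).shape)             # torch.Size([2, 1024])
```

In an actual deployment, stopping on the \n\n\x17 end token and the per-layer wiring of the experts are handled by the inference runtime (RWKV-Infer); this sketch only illustrates the expert-mixture idea.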

Language Support

  • English
  • Japanese
  • Chinese

Dataset

  • CJE (Chinese, Japanese, English) 900k pairs, pre-instruct tuning

Purpose and Use Case

This model serves as a proof-of-concept experiment to investigate the effectiveness of the Mixture of LoRA Experts (MoLE) architecture in small-parameter large language models (LLMs).

Limitations and Known Issues

The model's small parameter count (0.6B) significantly impacts its performance:

  • Responses are consistently inaccurate
  • Not suitable for production use or tasks requiring reliability
  • Should be considered an experimental research model only
  • Inference is slow because LoRA weights are merged at inference time (illustrated below)
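
The latency point above is easy to illustrate: when the active LoRA experts can change per request (or per token), the low-rank deltas cannot be folded into the base weights once and for all, so every forward pass pays for extra matrix multiplications. The sketch below uses arbitrary dimensions and is not RWKV-Infer code.

```python
# Hypothetical cost comparison: statically merged weights vs. applying a LoRA
# delta at inference time. Dimensions are arbitrary examples.
import time
import torch

d, r, steps = 2048, 256, 100
x = torch.randn(32, d)
W = torch.randn(d, d)                 # frozen base weight
A = torch.randn(r, d) * 0.01          # LoRA down-projection
B = torch.randn(d, r) * 0.01          # LoRA up-projection

# Option 1: merge once up front (only possible when the expert mix is fixed)
W_merged = W + B @ A
t0 = time.time()
for _ in range(steps):
    y_merged = x @ W_merged.T
t_merged = time.time() - t0

# Option 2: keep the delta separate and apply it on every step, as a per-token
# expert mixture forces the runtime to do
t0 = time.time()
for _ in range(steps):
    y_unmerged = x @ W.T + (x @ A.T) @ B.T
t_unmerged = time.time() - t0

print(f"merged: {t_merged:.4f}s  unmerged: {t_unmerged:.4f}s")
```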

Research Context

This implementation explores the viability of the MoLE architecture in resource-constrained environments, specifically examining how expert-mixture mechanisms perform in small-scale language models.

License Information

This model is released under the Apache 2.0 license, allowing for both academic and commercial use with appropriate attribution.
