East China Normal University

university

AI & ML interests

None defined yet.

Recent Activity

not-lain 
posted an update 7 days ago
We now have more than 2,000 public AI models using ModelHubMixin 🤗
not-lain 
posted an update 12 days ago
Published a new blog post 📖
In this blog post I walk through the Transformer architecture, emphasizing how tensor shapes propagate through each layer.
🔗 https://huggingface.co./blog/not-lain/tensor-dims
Some interesting takeaways:
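As a loose illustration of the shape bookkeeping the post walks through, here is a minimal PyTorch sketch of a single attention block; the dimensions and layer choices below are arbitrary picks of mine, not taken from the blog post:

import torch
from torch import nn

# arbitrary illustrative dimensions: batch=2, seq_len=8, d_model=64, heads=4
batch, seq_len, d_model, n_heads = 2, 8, 64, 4
x = torch.randn(batch, seq_len, d_model)  # (2, 8, 64)

attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))

attn_out, attn_weights = attn(x, x, x)
print(attn_out.shape)       # torch.Size([2, 8, 64]) -- attention preserves (batch, seq_len, d_model)
print(attn_weights.shape)   # torch.Size([2, 8, 8])  -- one score per (query, key) pair, averaged over heads
print(ffn(attn_out).shape)  # torch.Size([2, 8, 64]) -- the FFN expands to 4*d_model internally, then projects back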
Sri-Vigneshwar-DJ 
posted an update 14 days ago
Check out phi-4 from Microsoft, dropped a day ago... If you ❤️ the Phi series, then here is the GGUF: Sri-Vigneshwar-DJ/phi-4-GGUF. phi-4 is a highly efficient 14B open LLM that beats much larger models at math and reasoning; check out the evaluations on the Open LLM Leaderboard.

Technical paper: https://arxiv.org/pdf/2412.08905; the data synthesis approach is interesting.
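For reference, one way to pull a quantized file from that GGUF repo is with huggingface_hub; the exact filename below is a placeholder, since I have not checked which quantization variants the repo actually ships:

from huggingface_hub import hf_hub_download

# The filename is hypothetical -- check the repo's file listing for the
# actual quantization variants (Q4_K_M, Q8_0, etc.) it provides.
gguf_path = hf_hub_download(
    repo_id="Sri-Vigneshwar-DJ/phi-4-GGUF",
    filename="phi-4-Q4_K_M.gguf",
)
print(gguf_path)  # local cache path; load it with llama.cpp or another GGUF-compatible runtime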
Sri-Vigneshwar-DJ 
posted an update 17 days ago
Just sharing a thought: I started using DeepSeek V3 a lot, and an idea struck me about agents "orchestrating during inference" on a test-time compute model like DeepSeek V3 or the O1 series.

Agents (instructions + function calls + memory) execute during inference, and based on the output, a decision is made whether to scale up the time spent reasoning or to perform other tasks.
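A rough sketch of that orchestration idea in plain Python: the call_model stub, the confidence field, and the 0.7 threshold are hypothetical stand-ins of mine, not an existing DeepSeek or OpenAI API.

def call_model(prompt: str) -> dict:
    # stand-in for an inference call to a test-time-compute model
    return {"text": "...", "tool_call": None, "confidence": 0.9}

def run_agent(task: str, tools: dict, max_rounds: int = 3) -> str:
    memory: list[str] = []                           # accumulated tool results
    output = {"text": ""}
    for _ in range(max_rounds):
        output = call_model(task + "\n" + "\n".join(memory))
        if output["tool_call"]:                      # the model requested a function call
            name, args = output["tool_call"]
            memory.append(str(tools[name](**args)))  # feed the tool result back as memory
        elif output["confidence"] < 0.7:             # low confidence -> scale up reasoning time
            task = "Think step by step and re-check your answer:\n" + task
        else:
            break                                    # confident enough -> stop spending compute
    return output["text"]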
Sri-Vigneshwar-DJ 
posted an update 19 days ago
Combining smolagents with Anthropic’s best practices simplifies building powerful AI agents:

1. Code-Based Agents: Write actions as Python code, reducing steps by 30%.
2. Prompt Chaining: Break tasks into sequential subtasks with validation gates.
3. Routing: Classify inputs and direct them to specialized handlers.
4. Fallback: Handle tasks even if classification fails.

https://huggingface.co./blog/Sri-Vigneshwar-DJ/building-effective-agents-with-anthropics-best-pra
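As a plain-Python sketch of points 3 and 4 above (routing plus fallback), here is roughly what that pattern looks like; the keyword classifier and the handlers are hypothetical, and this is not the smolagents API itself:

def classify(query: str) -> str | None:
    # stand-in classifier; in practice this would typically be an LLM call
    if "refund" in query.lower():
        return "billing"
    if "error" in query.lower():
        return "support"
    return None  # classification failed

HANDLERS = {
    "billing": lambda q: f"[billing agent] handling: {q}",
    "support": lambda q: f"[support agent] handling: {q}",
}

def route(query: str) -> str:
    label = classify(query)
    handler = HANDLERS.get(label)
    if handler is None:  # fallback: no confident route, use a general handler
        return f"[general agent] handling: {query}"
    return handler(query)

print(route("I want a refund for my order"))  # routed to the billing handler
print(route("Tell me a joke"))                # falls back to the general handler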
not-lain 
posted an update 2 months ago
Ever wondered how you can make an API call to a visual question answering model without sending an image URL? 👀

You can do that by converting your local image to base64 and sending it to the API.

Recently I made some changes to my library "loadimg" that make converting images to base64 a breeze.
🔗 https://github.com/not-lain/loadimg

API request example 🛠️:
from loadimg import load_img
from huggingface_hub import InferenceClient

# convert a local path, URL, PIL image, or numpy array to base64
my_b64_img = load_img(imgPath_url_pillow_or_numpy, output_type="base64")

client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": my_b64_img  # base64 allows using images without uploading them to the web
                }
            }
        ]
    }
]

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True,
)

for chunk in stream:
    print(chunk.choices[0].delta.content, end="")
not-lain 
posted an update 6 months ago
I am now a Hugging Face fellow 🥳
not-lain 
posted an update 7 months ago
I have finished writing a blog post about building an image-based retrieval system. This is one of the first-ever approaches to building such a pipeline using only open-source models/libraries 🤗

You can check out the blog post at https://huggingface.co./blog/not-lain/image-retriever and the associated Space at not-lain/image-retriever.

✨ If you want to request another blog post, let me know down below or reach out to me through any of my social media.

📖 Happy reading!
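The blog post is the full walkthrough; as a loose sketch of what an open-source image-retrieval pipeline can look like, here is an example using a CLIP checkpoint via sentence-transformers. The model choice and the file names are mine, not necessarily what the post uses.

from sentence_transformers import SentenceTransformer, util
from PIL import Image

# open-source CLIP checkpoint; the blog post may use a different model
model = SentenceTransformer("clip-ViT-B-32")

# hypothetical local files standing in for an image collection
corpus_paths = ["cat.jpg", "dog.jpg", "car.jpg"]
corpus_emb = model.encode([Image.open(p) for p in corpus_paths], convert_to_tensor=True)

# CLIP embeds text and images in the same space, so we can retrieve images from a text query
query_emb = model.encode("a photo of a cat", convert_to_tensor=True)
scores = util.cos_sim(query_emb, corpus_emb)[0]
best = int(scores.argmax())
print(corpus_paths[best], float(scores[best]))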
not-lain 
posted an update 7 months ago
Hello beautiful people.
I wanted to thank everyone who read my blog post, and I am glad to share that we have reached 11,000 readers 🥳
I couldn't have done this without you, so once again, thanks a lot everyone for the support 💖
If you haven't already, you can read my blog post at: https://huggingface.co./blog/not-lain/rag-chatbot-using-llama3
not-lain 
posted an update 8 months ago
It is with great pleasure that I inform you that Hugging Face's ModelHubMixin has reached 200+ models on the Hub 🥳

ModelHubMixin is a class developed by HF to easily integrate AI models with the Hub, and it comes with 3 methods (a short example follows at the end of this post):
* save_pretrained
* from_pretrained
* push_to_hub

Shoutout to @nielsr, @Wauplin, and everyone else at HF for their awesome work 🤗

If you are not familiar with ModelHubMixin and you are looking for extra resources, you might consider:
* docs: https://huggingface.co./docs/huggingface_hub/main/en/package_reference/mixins
* 🔗 blog about training models with the trainer API and using ModelHubMixin: https://huggingface.co./blog/not-lain/trainer-api-and-mixin-classes
* 🔗 GitHub repo with pip integration: https://github.com/not-lain/PyTorchModelHubMixin-template
* 🔗 basic guide: https://huggingface.co./posts/not-lain/884273241241808
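Here is what the three methods look like in practice on any mixin-based model; MyModel is a placeholder for a class that inherits the mixin (a full definition appears in the next post), and the repo id below is made up:

# MyModel is assumed to subclass nn.Module and PyTorchModelHubMixin,
# as shown in the walkthrough in the next post.
model = MyModel(3, 1)

model.save_pretrained("my-awesome-model")               # save weights + config to a local folder
model.push_to_hub("your-username/my-awesome-model")     # upload to the Hub (placeholder repo id)
reloaded = MyModel.from_pretrained("my-awesome-model")  # rebuild from the local folder (or a repo id)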
not-lain 
posted an update 8 months ago
If you're a researcher or developing your own model 👀 you might want to take a look at Hugging Face's ModelHubMixin classes.
They are used to seamlessly integrate your AI model with the Hub and to save/load your model easily 🚀

1️⃣ make sure you're using the appropriate library version
pip install -qU "huggingface_hub>=0.22"

2️⃣ inherit from the appropriate class
from huggingface_hub import PyTorchModelHubMixin
from torch import nn

class MyModel(nn.Module, PyTorchModelHubMixin):
  def __init__(self, a, b):
    super().__init__()
    self.layer = nn.Linear(a, b)
  def forward(self, inputs):
    return self.layer(inputs)

3️⃣ create an instance of the model
first_model = MyModel(3, 1)

4️⃣ push the model to the hub (or use the save_pretrained method to save locally)
first_model.push_to_hub("not-lain/test")

5️⃣ load and initialize the model from the hub using the original class
pretrained_model = MyModel.from_pretrained("not-lain/test")