Benhao Tang's picture

Benhao Tang PRO

benhaotang

AI & ML interests

Physics Master student in theoretical particle physics at Universitรคt Heidelberg, actively looking into the possibilities of integrating AI into future physics research.

Recent Activity

Organizations

None yet

benhaotang's activity

replied to their post 9 days ago
view reply

image.png

OK, grok 3 deep research also failed on my benchmark...

And this is the final solution it gives me:

Use wsl --shutdown before hibernating; if it fails, try net stop LxssManager.

What? How about just tell me if WSL have problem, just do not using WSL... How can this be a solution when there is even an official troubleshooting guide that provide more solutions. This is the even worst than gemini and perplexity, at least they read the official guide, just got lost in github issue threads... Now I really want to know how OpenAI's compares to mine, if I have 200 dollars.

upvoted an article 9 days ago
view article
Article

Building a Real-Time Video Chat with Gemini 2.0, Gradio, and WebRTC ๐Ÿ‘€๐Ÿ‘‚

By freddyaboulton โ€ข
โ€ข 6
reacted to schuler's post with ๐Ÿ˜Ž 10 days ago
view post
Post
3371
๐Ÿ”ฎ GPT-3 implemented in pure Free Pascal!
https://github.com/joaopauloschuler/gpt-3-for-pascal

This implementation follows the GPT-3 Small architecture from the landmark paper "Language Models are Few-Shot Learners":
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚     Input Layer       โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ Token & Positional    โ”‚
โ”‚     Embedding         โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚   12x Transformer     โ”‚
โ”‚      Blocks           โ”‚
โ”‚  - 12 heads           โ”‚
โ”‚  - 768 hidden dims    โ”‚
โ”‚  - 3072 intermediate  โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚   Output Layer        โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Clean Pascal Implementation
for CntLayer := 1 to {Layers=}12 do
begin
  Result.AddTransformerBlockCAI(
    {Heads=}12, 
    {intermediate dimensions=}4*768, 
    {NoForward=}true, 
    {HasNorm=}true, 
    false
  );
end;

upvoted an article 11 days ago
view article
Article

Introducing smolagents: simple agents that write actions in code.

โ€ข 790
reacted to louisbrulenaudet's post with ๐Ÿค— 11 days ago
view post
Post
3051
I am pleased to introduce my first project built upon Hugging Faceโ€™s smolagents framework, integrated with Alpaca for financial market analysis automation ๐Ÿฆ™๐Ÿค—

The project implements technical indicators such as the Relative Strength Index (RSI) and Bollinger Bands to provide momentum and volatility analysis. Market data is retrieved through the Alpaca API, enabling access to historical price information across various timeframes.

AI-powered insights are generated using Hugging Faceโ€™s inference API, facilitating the analysis of market trends through natural language processing with DuckDuckGo search integration for real-time sentiment analysis based on financial news ๐Ÿฆ†

Link to the GitHub project: https://github.com/louisbrulenaudet/agentic-market-tool

posted an update 13 days ago
view post
Post
2386
Try out my updated implementation of forked OpenDeepResearcher(link below) as an OpenAI compatible endpoint, but with full control, can be deployed completely free with Gemini api or completely locally with ollama, or pay-as-you-go in BYOK format, the AI agents will think dynamically based on the difficulties of given research, compatible with any OpenAI compatible configurable clients(Msty, Chatbox, even vscode AI Toolkit playground).

If you don't want to pay OpenAI $200 to use or want to take control of your deep research, check out here:
๐Ÿ‘‰ https://github.com/benhaotang/OpenDeepResearcher-via-searxng

**Personal take**

Based on my testing against Perplexity's and Gemini's implementation with some Physics domain questions, mine is comparable and very competent at finding even the most rare articles or methods.

Also a funny benchmark of mine to test all these searching models, is to trouble shot a WSL2 hanging issue I experienced last year, with prompt:

> wsl2 in windows hangs in background with high vmmem cpu usage once in a while, especially after hibernation, no error logs captured in linux, also unable to shutdown in powershell, provide solutions

the final solution that took me a day last year to find is to patch the kernel with some steps documented in carlfriedrich's repo and wait Microsoft to solve it(it is buried deep in wsl issues). Out of the three, only my Deep Research agent has found this solution, Perplexity and Gemini just focus on other force restart or memory management methods. I am very impressed with how it has this kind of obscure and scarce trouble shooting ability.

**Limitations**

Some caveats to be done later:
- Multi-turn conversation is not yet supported, so no follow-up questions
- System message is only extra writing instructions, don't affect on search
- Small local model may have trouble citing source reliably, I am working on a fix to fact check all citation claims
  • 1 reply
ยท
upvoted an article 25 days ago
upvoted an article about 1 month ago
view article
Article

๐Ÿบ๐Ÿฆโ€โฌ› LLM Comparison/Test: Phi-4, Qwen2 VL 72B Instruct, Aya Expanse 32B in my updated MMLU-Pro CS benchmark

By wolfram โ€ข
โ€ข 4