Possibly of interest as a precursor: https://huggingface.co./papers/2110.01509
Gregor Betz PRO
ggbetz
AI & ML interests
Reasoning, AGI, AI Safety, AI Reliability
Recent Activity
updated a dataset 3 days ago: cot-leaderboard/cot-leaderboard-results
updated a dataset 3 days ago: cot-leaderboard/cot-eval-results
ggbetz's activity
replied to singhsidhukuldeep's post 4 days ago
posted an update 5 days ago
We've just released syncIALO -- a multi-purpose synthetic debate and argument mapping corpus with more than 600k arguments:
📝 Blog article: https://huggingface.co./blog/ggbetz/introducing-syncialo
🛢️ Dataset: DebateLabKIT/syncialo-raw
👩💻 Code: https://github.com/debatelab/syncIALO
🤗 Hugging Face has sponsored the syncIALO project through inference time / compute credits. 🙏 We gratefully acknowledge the generous support. 🫶
posted an update 5 months ago
Hi, just a brief follow-up on our Guided Reasoning (GuiR) system:
I've created a template space that facilitates testing:
1. Duplicate the space logikon/guir-chat
2. Set up your own inference servers and provide their details in the config file
3. Add API keys as secrets
4. Your personal GuiR playground is ready
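Steps 1 and 3 above can also be scripted. The following is only a minimal sketch, assuming the `huggingface_hub` client library (its `HfApi.duplicate_space` and `HfApi.add_space_secret` methods); the helper name `setup_playground`, the target space name, and the secret name are hypothetical and not taken from the GuiR template. Step 2 stays manual, since the config file's name and format are not given here.

```python
def setup_playground(api, target_space, secrets, source_space="logikon/guir-chat"):
    """Duplicate the template Space and store API keys as Space secrets.

    `api` is expected to behave like `huggingface_hub.HfApi` (assumption).
    `secrets` maps secret names to values; the names are up to you.
    """
    # Step 1: duplicate the template Space into your own namespace.
    api.duplicate_space(from_id=source_space, to_id=target_space, private=True)

    # Step 3: register each API key as a secret of the new Space.
    for key, value in secrets.items():
        api.add_space_secret(repo_id=target_space, key=key, value=value)

    return target_space


# Hypothetical usage (requires `pip install huggingface_hub` and a logged-in token):
#
#   import os
#   from huggingface_hub import HfApi
#   setup_playground(
#       HfApi(),
#       "your-username/guir-chat",
#       {"INFERENCE_API_KEY": os.environ["INFERENCE_API_KEY"]},
#   )
```

Passing the client in as an argument keeps the helper free of network calls at import time, so it can be tried against a stub before touching a real account.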
Cheers, Gregor
posted an update 5 months ago
🧭 Guided Reasoning
👋Hi everyone,
We've released Guided Reasoning:
Our AI guides walk your favorite LLM through complex reasoning problems.
🎯 Goals:
1️⃣ Reliability. AIs consistently follow reasoning methods.
2️⃣ Self-explainability. AIs see reasoning protocols and can explain internal deliberation.
3️⃣ Contestability. Users may amend AI reasoning and revise plausibility assessments.
Try out Guided Reasoning with our light demo chatbot, powered by 🤗 Hugging Face's free Inference API and small LLMs. (Sorry for the poor latency and limited availability -- we are currently searching for 💸 compute sponsors to run more powerful models, faster, and optimize guided reasoning performance.)
Built on top of Logikon's open-source AI reasoning analytics.
Demo chat app: logikon/benjamin-chat
GitHub: https://github.com/logikon-ai/logikon
Technical report: https://arxiv.org/abs/2408.16331
➡️ Check it out and get involved! Looking forward to hearing from you.
replied to their post 10 months ago
Sorry for this, it is up and running again.
replied to their post 10 months ago
Hi @clefourrier , thx for the quick reply and positive feedback. Absolutely, we'd be happy to collaborate on a blog post. :-) I've been following the Leaderboards on the Hub series -- should we write an initial draft and share it with you, or would you suggest another way to proceed?
posted an update 10 months ago
🥇Open CoT Leaderboard
We're delighted to announce the [Open CoT Leaderboard](logikon/open_cot_leaderboard) on 🤗 Spaces.
Unlike other LLM performance leaderboards, the Open CoT Leaderboard is not tracking absolute benchmark accuracies, but relative **accuracy gains** due to **chain-of-thought**.
Eval datasets that underpin the leaderboard are hosted [here](https://huggingface.co./cot-leaderboard).
Feedback and suggestions more than welcome.
@clefourrier
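To make the idea of a chain-of-thought accuracy gain concrete, here is a small sketch. It assumes the gain is the difference between a model's accuracy with and without chain-of-thought on the same task, and that the relative gain divides that difference by the baseline accuracy; the leaderboard's exact metric definition is not stated in this post, and the function name `cot_accuracy_gain` is hypothetical.

```python
def cot_accuracy_gain(acc_base, acc_cot):
    """Return (absolute_gain, relative_gain) of CoT over the baseline.

    acc_base: accuracy without chain-of-thought, in (0, 1].
    acc_cot:  accuracy with chain-of-thought, in [0, 1].
    Both definitions are assumptions for illustration only.
    """
    if not 0.0 < acc_base <= 1.0:
        raise ValueError("acc_base must be in (0, 1]")
    if not 0.0 <= acc_cot <= 1.0:
        raise ValueError("acc_cot must be in [0, 1]")

    absolute_gain = acc_cot - acc_base          # difference in accuracy
    relative_gain = absolute_gain / acc_base    # gain as a fraction of the baseline
    return absolute_gain, relative_gain


# Made-up numbers: a model scoring 0.50 without CoT and 0.60 with CoT.
absolute, relative = cot_accuracy_gain(0.50, 0.60)
print(f"absolute gain: {absolute:.2f}, relative gain: {relative:.0%}")
```

A model with a high absolute score can still show little or no gain under this view, which is the contrast with conventional accuracy leaderboards drawn above.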