teknium1

Maximize Human Potential

Funding Links: https://github.com/sponsors/teknium1

Name: Teknium
Kind: user
Followers: 2384
Following: 27
Total stars: 2820
Repositories count: 39
Created at: 2023-03-30T20:26:07.729Z
Updated at: 2025-07-19T07:58:38.769Z
Last synced at: 2025-07-19T07:58:38.769Z

GitHub Sponsors Profile

Hello, I'm Teknium1 👋
I'm a Python programmer, AI enthusiast, and co-founder of NousResearch.
My work centers on AI and data engineering, and I contribute mainly by releasing open-source Large Language Models (LLMs) and datasets.
🚀 My Work
💼 Nous Research
I've contributed significantly to the development of several open-source LLMs under Nous Research's Hugging Face organization.
Here are a couple of them:

Hermes 2 Pro 8B - The most powerful pure Hermes model, with function calling capabilities.
Nous-Hermes-2-Yi-34B - Nous' latest and most powerful LLM yet.
Nous-Hermes-Llama2-13b - A Hermes model built on Llama 1 and Llama 2.
GPT4-x-Vicuna-13b - A Vicuna model fine-tuned on GPT-4-generated data.

🚀 Personal Projects
On my personal Hugging Face account, Teknium, I have released several models and datasets, including my work on the Replit-3b model and OpenHermes:

DataForge - Economics - A dataset built by DataForge, my data synthesis pipeline (not public).
OpenHermes 2.5 Mistral 7B - The most powerful OpenHermes model, with much better coding skills than OpenHermes 2.
OpenHermes 2 Mistral 7B - Version 2 of the Open Hermes series.
OpenHermes 13B - An open-source version of Nous-Hermes!
OpenHermes Dataset - The publicly available version of Hermes' dataset.
Replit-Instruct 3B - This model doubled the code performance of the base Replit LLM.

💻 GitHub Projects
I've been part of several intriguing projects on GitHub. Here are a few of them:

Prompt Engineering Toolkit - A React web application for comparing and building prompts with several LLMs.
LLM-Benchmark-Logs - A repository of benchmarks I've run on various LLMs. They originally lived in Nous' Discord, but that became too disorganized, so they now live on GitHub.
LLM-Logbook - A collection of responses from various LLMs to 100 random crowdsourced prompts. A temporary project that became too expensive to continue.
GPTeacher - A collection of modular datasets generated by GPT-4, for training LLMs.
RawTransform - A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats.
stanford_alpaca-replit - Modified Stanford-Alpaca Trainer for Training Replit's Code Model.
alpaca-roleplay-discordbot - An LLM Discord bot that roleplays!
alpaca-discord - A Simple Discord Bot for the Alpaca LLM.

💼 CarperAI / StabilityAI
I have worked on research, ablation planning, and dataset cleaning/filtering for:

StableBeluga/Free Willy 2 - Orca replication on 70B Llama-2
StableBeluga/Free-Willy-1 - Orca replication on 65B Llama-1

Both are 10% Orca replications, trained on Llama-1 65B and Llama-2 70B. I am also working on domain expert knowledge and task distillation.
💼 Open Orca
Working with the Open Orca team on data cleaning, networking, ablations, and more:

Open Orca HuggingFace Repo - An open-source replication of the Orca paper

📫 Get in Touch

Twitter: https://twitter.com/Teknium1
Discord: Teknium