Open-source datasets from AMD GenAI on Hugging Face
Utility-based benchmark with 33K+ questions across 4 domains for evaluating caption quality
High-quality synthetic reasoning dataset with 27K math and science problems for fine-tuning LLMs
Long-context training data for Instella models with a 128K-token context length
Synthetic math dataset for training reasoning models on AMD GPUs
Synthetic GSM8K-style math problems for Instella model training
Benchmark for evaluating LLM reasoning with 412 novel Tic-Tac-Toe-style game questions across 4 game types