Description
Compare and benchmark multiple LLMs with ease — right from your local machine. This n8n workflow integrates with LM Studio to dynamically test prompts, log performance metrics, and track results in Google Sheets.
Whether you're a developer, researcher, or data scientist, this tool helps you quickly evaluate response quality, readability, and speed across different local models — with adjustable parameters like temperature and top-p.
🔹 How It Works:
✅ Connects to LM Studio and fetches the list of currently loaded models (see the request sketch after this list)
✅ Sends predefined or custom prompts to each model
✅ Measures and logs response time, word count, and readability for every response
✅ Stores all results in a structured Google Sheet for comparison
✅ Supports dynamic tweaking of temperature 🔥 and top-p 🎯 per test
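For context, here is a minimal standalone sketch (TypeScript, outside n8n) of the two HTTP calls the workflow's HTTP Request nodes make, assuming LM Studio's OpenAI-compatible server is running at its default address of http://localhost:1234. The helper name `benchmarkPrompt` and the default temperature/top-p values are illustrative, not part of the template.

```typescript
// Sketch of the per-model benchmark loop, assuming LM Studio's default local server.
const LM_STUDIO_URL = "http://localhost:1234/v1"; // adjust if you changed the port

interface BenchmarkResult {
  model: string;
  responseMs: number;
  wordCount: number;
  text: string;
}

async function benchmarkPrompt(
  prompt: string,
  temperature = 0.7,
  topP = 0.9
): Promise<BenchmarkResult[]> {
  // 1. Fetch the models currently loaded in LM Studio.
  const models: { data: { id: string }[] } =
    await (await fetch(`${LM_STUDIO_URL}/models`)).json();

  const results: BenchmarkResult[] = [];
  for (const { id } of models.data) {
    // 2. Send the same prompt to each model and time the round trip.
    const started = Date.now();
    const completion = await (
      await fetch(`${LM_STUDIO_URL}/chat/completions`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({
          model: id,
          messages: [{ role: "user", content: prompt }],
          temperature,
          top_p: topP,
        }),
      })
    ).json();

    const text: string = completion.choices[0].message.content;
    results.push({
      model: id,
      responseMs: Date.now() - started,
      wordCount: text.trim().split(/\s+/).length,
      text,
    });
  }
  return results;
}
```

In the workflow itself these requests are issued by n8n HTTP Request nodes and the timing is captured between nodes; the sketch only makes the request and response shapes explicit.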
🤖 What It Automates:
🧠 Prompt Testing Across Multiple LLMs
⏱️ Response Time Tracking
📊 Readability & Word Count Analysis (see the metric sketch after this list)
📄 Google Sheets Logging for Benchmarking
⚙️ Parameter Customization (temp, top-p)
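The description above doesn't pin down the exact readability formula, so the sketch below uses the standard Flesch Reading Ease score plus simple word and sentence counts as one reasonable way to compute these metrics. `countSyllables` is a rough heuristic, not a linguistic parser, and your workflow's Code node may use a different score entirely.

```typescript
// Hypothetical readability helpers; the workflow's exact formula may differ.
// Flesch Reading Ease: 206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)

function countSyllables(word: string): number {
  // Rough heuristic: count vowel groups, with a minimum of one syllable.
  const groups = word.toLowerCase().match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 1);
}

function textStats(text: string) {
  const words = text.trim().split(/\s+/).filter(Boolean);
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);

  const fleschReadingEase =
    206.835 -
    1.015 * (words.length / Math.max(1, sentences.length)) -
    84.6 * (syllables / Math.max(1, words.length));

  return {
    wordCount: words.length,
    sentenceCount: sentences.length,
    readability: Number(fleschReadingEase.toFixed(1)),
  };
}
```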
💡 Why Choose This Workflow?
📊 Instant Benchmarks – Run A/B tests across multiple models side-by-side
🧪 Local Testing, Real Metrics – All testing is done on your machine with full control
📂 Organized Tracking – Easily compare models in a structured Google Sheet
🛠️ Tweak and Repeat – Change parameters and re-run tests with ease
🧠 Perfect for Model Evaluation – Ideal for research, model fine-tuning, or quality control
👤 Who Is This For?
✔️ Developers testing local fine-tuned LLMs
✔️ Data scientists comparing model outputs
✔️ AI researchers validating LLM performance
✔️ Teams deciding between open-source models
✔️ Anyone using LM Studio for LLM experiments
🔗 Integrations Used:
🔗 LM Studio API / Localhost Access – For sending prompts and fetching responses
🔗 Google Sheets API – For result logging and performance tracking (a suggested column layout is sketched after this list)
🔗 n8n HTTP Nodes – For flexible testing flow and logic branching
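If you adapt the Google Sheets node, one row per model/prompt/parameter combination keeps side-by-side comparison simple. The column layout below is only a suggested shape, not a schema required by the template; rename or drop fields to match your sheet.

```typescript
// Hypothetical column layout for the benchmark sheet.
interface SheetRow {
  timestamp: string;   // when the test ran
  model: string;       // LM Studio model id
  prompt: string;      // the prompt that was sent
  temperature: number; // sampling temperature used
  top_p: number;       // nucleus-sampling value used
  response_ms: number; // round-trip latency
  word_count: number;  // words in the response
  readability: number; // readability score of the response
  response: string;    // the raw model output
}
```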
🚀 Get Started Today!
No more guessing — just clear, data-backed insights into how your LLMs perform.