💡 Summary
A Python library that enhances Zotero searches using a safe code execution pattern to manage large datasets efficiently.
Risk: Medium. Review: shell/CLI command execution; outbound network access (SSRF, data egress); API keys/tokens handling and storage; filesystem read/write scope and path traversal; dependency pinning and supply-chain risk. Run with least privilege and audit before enabling in production.
Zotero Code Execution
Efficient multi-strategy Zotero search using the code execution pattern
A Python library for Zotero MCP that implements Anthropic's code execution pattern to enable safe, comprehensive searches without context overflow or crashes.
Skill Installation
For Claude Code
- Clone or download this repository
- Copy the skill/ folder to your Claude Code skills directory:
  cp -r skill ~/.claude/skills/zotero-mcp-code
- Restart Claude Code to load the skill
Quick Start
import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths
from zotero_lib import SearchOrchestrator, format_results

# Single comprehensive search - fetches 100+ items, returns top 20
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("embodied cognition", max_results=20)
print(format_results(results))
That's it! This automatically:
- ✅ Performs semantic + keyword + tag searches
- ✅ Deduplicates results
- ✅ Ranks by relevance
- ✅ Keeps large datasets in code (no crashes)
Multi-Term Searches
For OR-style searches (e.g., multiple spellings or languages), search each term separately and merge:
# Search for "Atayal" OR "泰雅族"
all_results = {}
for term in ['Atayal', '泰雅族']:
    results = orchestrator.comprehensive_search(term, max_results=50)
    for item in results:
        all_results[item.key] = item  # Deduplicate by key

# Re-rank combined results
ranked = orchestrator._rank_items(list(all_results.values()), 'Atayal 泰雅族')
print(format_results(ranked[:25]))
Why? Zotero treats multi-word queries as AND conditions. Searching "Atayal 泰雅族" finds items matching BOTH terms, not either term.
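The merge-by-key step can be tried without a live Zotero library. The following is a minimal sketch: `Item`, `fake_search`, and `or_search` are hypothetical stand-ins invented for illustration, not part of this library. Unlike the snippet above, it keeps the first occurrence of each key rather than the last.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for a search result; only `key` matters here."""
    key: str
    title: str


def or_search(terms: Iterable[str], search: Callable[[str], List[Item]]) -> List[Item]:
    """Run one search per term and merge the results, deduplicating by item key."""
    merged: Dict[str, Item] = {}
    for term in terms:
        for item in search(term):
            # setdefault keeps the first occurrence, so result order is stable
            merged.setdefault(item.key, item)
    return list(merged.values())


# Toy index standing in for a Zotero library (assumed data, for illustration only)
_INDEX = {
    "Atayal": [Item("A1", "Atayal weaving"), Item("B2", "Atayal 泰雅族 oral history")],
    "泰雅族": [Item("B2", "Atayal 泰雅族 oral history"), Item("C3", "泰雅族語言研究")],
}


def fake_search(term: str) -> List[Item]:
    """Hypothetical search function: exact lookup in the toy index."""
    return _INDEX.get(term, [])


results = or_search(["Atayal", "泰雅族"], fake_search)
print([item.key for item in results])  # ['A1', 'B2', 'C3']
```

Item B2 matches both terms but appears once in the merged list, which is the behavior an OR-style search needs.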
Why This Exists
The Problem
Direct MCP tool calls have limitations:
- 🚫 Crash risk with large result sets (>15-20 items)
- 🚫 Token bloat - all results load into LLM context
- 🚫 Manual orchestration - multiple searches, manual deduplication
- 🚫 No ranking - results not sorted by relevance
The Solution
Code execution keeps large datasets in the execution environment:
- ✅ No crashes - only filtered results return to context
- ✅ Token efficient - process 100+ items, return top 20
- ✅ Auto-orchestration - multi-strategy search in one call
- ✅ Auto-ranking - results sorted by relevance
Features
Multi-Strategy Search
One function call performs:
- Semantic search (multiple variations)
- Keyword search (multiple modes)
- Tag-based search
- Automatic deduplication
- Relevance ranking
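To make the deduplicate-then-rank idea concrete, here is a minimal sketch. The scoring rule (title hits weighted above abstract hits) is an assumption chosen for illustration; this README does not describe the library's actual ranking, and `score` and `dedupe_and_rank` are hypothetical names.

```python
from typing import Dict, List

Record = Dict[str, str]


def score(item: Record, query: str) -> int:
    """Count query-term hits, weighting title matches above abstract matches."""
    terms = query.lower().split()
    title = item.get("title", "").lower()
    abstract = item.get("abstract", "").lower()
    return sum(2 * (term in title) + (term in abstract) for term in terms)


def dedupe_and_rank(batches: List[List[Record]], query: str, max_results: int) -> List[Record]:
    """Merge batches from several strategies, drop duplicate keys, rank, truncate."""
    unique: Dict[str, Record] = {}
    for batch in batches:
        for item in batch:
            unique.setdefault(item["key"], item)
    ranked = sorted(unique.values(), key=lambda it: score(it, query), reverse=True)
    return ranked[:max_results]


# Assumed sample data standing in for two search strategies
semantic_hits = [
    {"key": "K1", "title": "Embodied cognition in robots", "abstract": "..."},
    {"key": "K2", "title": "Motor learning", "abstract": "Links to embodied cognition."},
]
keyword_hits = [
    {"key": "K1", "title": "Embodied cognition in robots", "abstract": "..."},
    {"key": "K3", "title": "Unrelated survey", "abstract": "Nothing relevant."},
]

top = dedupe_and_rank([semantic_hits, keyword_hits], "embodied cognition", max_results=2)
print([item["key"] for item in top])  # ['K1', 'K2']
```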
Safe Large Searches
# ❌ Old way: Crash risk
results1 = zotero_semantic_search("query", limit=10)  # Limited to 10
results2 = zotero_search_items("query", limit=10)  # Another 10
# Manual deduplication, manual ranking...

# ✅ New way: Safe and comprehensive
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("query", max_results=20)
# Fetches 100+, processes in code, returns top 20
Advanced Filtering
# Fetch broadly, filter in code
library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)
items = library.search_items("machine learning", limit=100)  # Safe!

# Filter to recent journal articles
filtered = orchestrator.filter_by_criteria(
    items,
    item_types=["journalArticle"],
    date_range=(2020, 2025)
)
Installation
Requirements
- Python 3.8+
- Zotero MCP installed via pipx
- Claude Code or similar code execution environment
Setup
- Clone this repository:
git clone https://github.com/yourusername/zotero-code-execution.git
cd zotero-code-execution
- Install dependencies (optional - usually already installed with Zotero MCP):
pip install -r requirements.txt
- Use in your code:
import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths  # Adds zotero_mcp to path
from zotero_lib import SearchOrchestrator, format_results
Usage Examples
Basic Search
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("neural networks", max_results=20)
print(format_results(results))
Filter by Author
library = ZoteroLibrary()
results = library.search_items("Kahneman", qmode="titleCreatorYear", limit=50)
# Sort newest first; items without a date sort last instead of raising TypeError
sorted_results = sorted(results, key=lambda x: x.date or "", reverse=True)
print(format_results(sorted_results))
Tag-Based Search
library = ZoteroLibrary()
results = library.search_by_tag(["learning", "cognition"], limit=50)
print(format_results(results[:20]))
Recent Papers
library = ZoteroLibrary()
results = library.get_recent(limit=20)
print(format_results(results))
Custom Filtering
library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)
items = library.search_items("AI", limit=100)

# Only recent papers with DOI
recent_with_doi = [
    item for item in items
    if item.doi and item.date and int(item.date[:4]) >= 2020
]
print(format_results(recent_with_doi))
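The `int(item.date[:4])` check assumes every date begins with a four-digit year. Zotero's date field is free-form text, so if the library passes dates through unparsed (an assumption, not confirmed by this README), a value such as "Spring 2020" would raise `ValueError`. A more tolerant helper might look like this; `extract_year` is a hypothetical name, not part of the library.

```python
import re
from typing import Optional

# A four-digit year from 1500 to 2099 that is not part of a longer digit run
_YEAR = re.compile(r"(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)")


def extract_year(date: Optional[str]) -> Optional[int]:
    """Return the first plausible year found in a free-form date string, else None."""
    if not date:
        return None
    match = _YEAR.search(date)
    return int(match.group(1)) if match else None


for raw in ["2021-03-15", "Spring 2020", "15 March 2019", "n.d.", None]:
    print(repr(raw), "->", extract_year(raw))
```

With such a helper, the filter condition could become `(extract_year(item.date) or 0) >= 2020`, which skips undated items instead of failing.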
See examples.py for 8 complete working examples.
Claude Code Skill
This repository includes a Claude Code skill for easy integration.
Installation
Copy the skill to your Claude skills directory:
cp -r claude-skill ~/.claude/skills/zotero-mcp-code
Usage
In Claude Code, searches will automatically use the code execution pattern:
"Find papers about embodied cognition"
Claude will write code using this library instead of direct MCP calls.
See claude-skill/SKILL.md for complete skill documentation.
API Reference
SearchOrchestrator
Main class for automated multi-strategy searching.
comprehensive_search(query, max_results=20, use_semantic=True, use_keyword=True, use_tags=True, search_limit_per_strategy=50)
Performs comprehensive search with automatic deduplication and ranking.
Returns: List of ZoteroItem objects
filter_by_criteria(items, item_types=None, date_range=None, required_tags=None, excluded_tags=None)
Filter items by various criteria.
Returns: Filtered list of ZoteroItem objects
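As a rough illustration of how the four criteria could combine, here is a minimal sketch. It is not the library's implementation: the `Item` fields, the inclusive date range, and the rule that all required tags must be present are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Sequence, Tuple


@dataclass
class Item:
    """Hypothetical stand-in for ZoteroItem with only the fields used below."""
    key: str
    item_type: str
    year: Optional[int] = None
    tags: List[str] = field(default_factory=list)


def filter_items(
    items: Iterable[Item],
    item_types: Optional[Sequence[str]] = None,
    date_range: Optional[Tuple[int, int]] = None,
    required_tags: Optional[Sequence[str]] = None,
    excluded_tags: Optional[Sequence[str]] = None,
) -> List[Item]:
    """Keep items that pass every criterion that was supplied."""
    kept = []
    for item in items:
        if item_types and item.item_type not in item_types:
            continue
        if date_range:
            start, end = date_range  # assumed inclusive on both ends
            if item.year is None or not (start <= item.year <= end):
                continue
        tags = set(item.tags)
        if required_tags and not set(required_tags) <= tags:
            continue  # at least one required tag is missing
        if excluded_tags and tags & set(excluded_tags):
            continue  # carries an excluded tag
        kept.append(item)
    return kept


library_items = [
    Item("A", "journalArticle", 2021, ["ml", "review"]),
    Item("B", "book", 2022, ["ml"]),
    Item("C", "journalArticle", 2018, ["ml"]),
    Item("D", "journalArticle", 2023, ["ml", "draft"]),
]

recent_articles = filter_items(
    library_items,
    item_types=["journalArticle"],
    date_range=(2020, 2025),
    excluded_tags=["draft"],
)
print([item.key for item in recent_articles])  # ['A']
```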
ZoteroLibrary
Low-level interface to Zotero.
- search_items(query, ...) - Keyword search
- semantic_search(query, ...) - Semantic/vector search
- search_by_tag(tags, ...) - Tag-based search
- get_recent(limit) - Recently added items
- get_tags() - All library tags
Helper Functions
- format_results(items, include_abstracts=True, max_abstract_length=300) - Format as markdown
See README_LIBRARY.md for complete API documentation.
Architecture
Based on Anthropic's code execution with MCP:
- Claude writes Python code (not direct MCP calls)
- Code fetches large datasets (100+ items) from Zotero
- Code processes in execution environment (dedup, rank, filter)
- Only filtered results return to LLM context (20 items)
Result: Large datasets stay out of context, preventing crashes and saving tokens.
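The four steps can be sketched end to end with stand-in data. Everything here is hypothetical (`fetch_all` fakes the Zotero fetch, and the summary format is invented); the point is only that the large payload stays inside the execution environment while a short summary is what gets printed back.

```python
import json
from typing import Dict, List


def fetch_all(n: int = 120) -> List[Dict[str, str]]:
    """Hypothetical stand-in for a large fetch; keys repeat to simulate duplicates."""
    return [
        {"key": f"K{i % 100}", "title": f"Paper {i % 100}", "abstract": "x" * 500}
        for i in range(n)
    ]


def summarize(items: List[Dict[str, str]], max_results: int = 20) -> str:
    """Deduplicate by key, keep the first max_results, and return one line per item."""
    unique = {item["key"]: item for item in items}
    top = list(unique.values())[:max_results]
    return "\n".join(f"- {item['title']} ({item['key']})" for item in top)


raw = fetch_all()                  # the large dataset stays in this process
summary = summarize(raw)           # only this string would reach the LLM context
raw_chars = len(json.dumps(raw))

print(f"{len(raw)} items fetched, {len(summary.splitlines())} lines returned")
print(f"raw payload: {raw_chars} chars, returned summary: {len(summary)} chars")
```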
Performance
Expected Benefits
Based on Anthropic's pattern and implementation design:
- Token reduction: 50-90% (exact amount depends on search size)
- Function calls: 5-10x → 1x (confirmed by design)
- Search limits: 10-15 → 100+ items (safe in code)
- Crash prevention: Likely effective (untested)
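A back-of-the-envelope calculation shows where a 50-90% range could come from. This is a sketch under two simplifying assumptions, not a measurement: every item costs the same number of tokens, and a direct MCP call would have loaded every fetched item into context. `estimated_reduction` is a hypothetical helper.

```python
def estimated_reduction(items_fetched: int, items_returned: int) -> float:
    """Fraction of per-item tokens saved when only items_returned reach the context."""
    if items_fetched <= 0:
        raise ValueError("items_fetched must be positive")
    if not 0 <= items_returned <= items_fetched:
        raise ValueError("items_returned must be between 0 and items_fetched")
    return 1 - items_returned / items_fetched


for returned in (50, 20, 10):
    saved = estimated_reduction(100, returned)
    print(f"100 fetched, {returned} returned: {saved:.0%} fewer item tokens")
```

The estimate ignores the tokens spent on the code itself and on formatting, so real savings would need to be measured.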
Status
⚠️ Proof of concept - Performance claims are theoretical projections, not measured results.
See HONEST_STATUS.md for detailed status and validation needs.
Documentation
- README_LIBRARY.md - Complete library documentation
- QUICK_START.md - Quick reference guide
- CLAUDE_INSTRUCTIONS.md - Instructions for Claude Code
- examples.py - 8 working examples
- IMPLEMENTATION_SUMMARY.md - Technical details
- HONEST_STATUS.md - Implementation status
- claude-skill/SKILL.md - Claude Code skill docs
Contributing
Contributions welcome! Areas for improvement:
- Performance validation - Measure actual token savings
- Better ranking - Incorporate semantic similarity scores
- Caching - Cache search results with invalidation
- Parallel processing - Execute search strategies concurrently
- Export functions - Batch BibTeX generation, CSV export
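For the parallel-processing idea, one possible starting point is a thread pool that runs each strategy concurrently. This is only a sketch: the strategy functions below are hypothetical placeholders, and it assumes the underlying search calls are I/O-bound and thread-safe, which has not been verified for this library.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Strategy = Callable[[str], List[str]]


def run_strategies(query: str, strategies: Dict[str, Strategy]) -> Dict[str, List[str]]:
    """Run every strategy on the same query in its own thread and collect the results."""
    with ThreadPoolExecutor(max_workers=len(strategies) or 1) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in strategies.items()}
        # result() blocks until that strategy finishes and re-raises its exceptions
        return {name: future.result() for name, future in futures.items()}


# Placeholder strategies standing in for semantic, keyword, and tag searches
def semantic(query: str) -> List[str]:
    return [f"semantic:{query}"]


def keyword(query: str) -> List[str]:
    return [f"keyword:{query}"]


def by_tag(query: str) -> List[str]:
    return [f"tag:{query}"]


results = run_strategies(
    "embodied cognition",
    {"semantic": semantic, "keyword": keyword, "tags": by_tag},
)
print(results)
```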
License
MIT License - see LICENSE file for details.
Credits
- Based on Zotero MCP
- Inspired by Anthropic's code execution with MCP
Related Projects
- Zotero MCP - The underlying MCP server
- Claude Code - Code execution environment
- FastMCP - MCP server framework
Citation
If you use this in research, please cite:
@software{zotero_code_execution,
  title = {Zotero Code Execution: Efficient Multi-Strategy Search},
  year = {2025},
  url = {https://github.com/kerim/zotero-code-execution}
}
Pros
- Efficient handling of large datasets
- Automatic deduplication and ranking
- Supports multiple search strategies
- Reduces crash risks during searches
Cons
- Performance claims are theoretical and untested
- Requires specific environment setup
- Potential learning curve for new users
- Limited documentation on advanced features
Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.
Copyright belongs to the original author kerim.
