Co-Pilot

Updated 24 days ago

zotero-code-execution

Name: zotero-code-execution
Rating: 4.0 (23 reviews)
Author: kerim

kerim/zotero-code-execution

💡 Summary

A Python library that enhances Zotero searches using a safe code execution pattern to manage large datasets efficiently.

🎯 Target Audience

Researchers needing efficient literature searches
Developers integrating Zotero with custom applications
Academics looking for advanced filtering options
Data scientists managing large bibliographic datasets
Students conducting comprehensive literature reviews

🤖 AI Roast: “Powerful, but the setup might scare off the impatient.”

Security Analysis: Medium Risk

Risk: Medium. Review: shell/CLI command execution; outbound network access (SSRF, data egress); API keys/tokens handling and storage; filesystem read/write scope and path traversal; dependency pinning and supply-chain risk. Run with least privilege and audit before enabling in production.

Zotero Code Execution

Efficient multi-strategy Zotero search using the code execution pattern

A Python library for Zotero MCP that implements Anthropic's code execution pattern. It is designed to run comprehensive searches without overflowing the LLM context or crashing on large result sets.

Skill Installation

For Claude Code

Clone or download this repository
Copy the skill/ folder to your Claude Code skills directory:
```
cp -r skill ~/.claude/skills/zotero-mcp-code
```
Restart Claude Code to load the skill

Quick Start

import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths
from zotero_lib import SearchOrchestrator, format_results

# Single comprehensive search - fetches 100+ items, returns top 20
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("embodied cognition", max_results=20)
print(format_results(results))

That's it! This automatically:

✅ Performs semantic + keyword + tag searches
✅ Deduplicates results
✅ Ranks by relevance
✅ Keeps large datasets in code (no crashes)

Multi-Term Searches

For OR-style searches (e.g., multiple spellings or languages), search each term separately and merge:

# Search for "Atayal" OR "泰雅族"
all_results = {}

for term in ['Atayal', '泰雅族']:
    results = orchestrator.comprehensive_search(term, max_results=50)
    for item in results:
        all_results[item.key] = item  # Deduplicate by key

# Re-rank combined results
ranked = orchestrator._rank_items(list(all_results.values()), 'Atayal 泰雅族')
print(format_results(ranked[:25]))

Why? Zotero treats multi-word queries as AND conditions. Searching "Atayal 泰雅族" finds items matching BOTH terms, not either term.
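
The difference between AND-style and OR-style matching can be sketched without a live Zotero library. The example below is a minimal illustration, not the library's code: `FAKE_LIBRARY`, `fake_search`, and `search_any` are hypothetical names, and the items are plain dictionaries rather than ZoteroItem objects.

```python
# Hypothetical in-memory stand-in for a Zotero library (illustration only).
FAKE_LIBRARY = [
    {"key": "K1", "title": "Atayal weaving traditions"},
    {"key": "K2", "title": "泰雅族的口述歷史"},
    {"key": "K3", "title": "Atayal 泰雅族 language survey"},
]


def fake_search(term, limit):
    """Mimic AND-style matching: every word of the query must appear in the title."""
    words = term.split()
    hits = [item for item in FAKE_LIBRARY if all(w in item["title"] for w in words)]
    return hits[:limit]


def search_any(terms, search_fn, per_term_limit=50):
    """Run search_fn once per term and merge the results, deduplicating by key."""
    merged = {}
    for term in terms:
        for item in search_fn(term, per_term_limit):
            merged.setdefault(item["key"], item)  # keep the first copy of each key
    return list(merged.values())


and_hits = fake_search("Atayal 泰雅族", 50)                # both words required
or_hits = search_any(["Atayal", "泰雅族"], fake_search)    # either word is enough
print(len(and_hits), len(or_hits))
```

The combined query matches only the item containing both terms, while the per-term merge picks up all three.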

Why This Exists

The Problem

Direct MCP tool calls have limitations:

🚫 Crash risk with large result sets (>15-20 items)
🚫 Token bloat - all results load into LLM context
🚫 Manual orchestration - multiple searches, manual deduplication
🚫 No ranking - results not sorted by relevance

The Solution

Code execution keeps large datasets in the execution environment:

✅ No crashes - only filtered results return to context
✅ Token efficient - process 100+ items, return top 20
✅ Auto-orchestration - multi-strategy search in one call
✅ Auto-ranking - results sorted by relevance

Features

Multi-Strategy Search

One function call performs:

Semantic search (multiple variations)
Keyword search (multiple modes)
Tag-based search
Automatic deduplication
Relevance ranking
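
To make the deduplication and ranking steps concrete, here is a minimal sketch of how results from several strategies might be merged. It is an assumption-laden illustration: `Item`, `merge_and_rank`, and the term-count scoring are hypothetical and are not taken from the library's actual ranking code.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical stand-in for a search result (not the library's ZoteroItem)."""
    key: str
    title: str
    tags: list = field(default_factory=list)


def merge_and_rank(result_lists, query):
    """Merge several result lists, drop duplicate keys, and rank by term overlap."""
    unique = {}
    for results in result_lists:
        for item in results:
            unique.setdefault(item.key, item)  # first occurrence of a key wins

    terms = query.lower().split()

    def score(item):
        # Naive relevance: how often the query terms occur in title and tags.
        text = (item.title + " " + " ".join(item.tags)).lower()
        return sum(text.count(term) for term in terms)

    # sorted() is stable, so items with equal scores keep their merge order.
    return sorted(unique.values(), key=score, reverse=True)


semantic = [Item("A1", "Embodied cognition and perception"), Item("B2", "Memory systems")]
keyword = [Item("A1", "Embodied cognition and perception"),
           Item("C3", "Cognition in robots", ["embodied"])]

ranked = merge_and_rank([semantic, keyword], "embodied cognition")
print([item.key for item in ranked])
```

Item `A1` appears in both input lists but only once in the output, and the item with no matching terms sorts last.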

Safe Large Searches

# ❌ Old way: Crash risk
results1 = zotero_semantic_search("query", limit=10)  # Limited to 10
results2 = zotero_search_items("query", limit=10)     # Another 10
# Manual deduplication, manual ranking...

# ✅ New way: Safe and comprehensive
orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("query", max_results=20)
# Fetches 100+, processes in code, returns top 20

Advanced Filtering

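# Fetch broadly, filter in code
# (assumes ZoteroLibrary is importable from zotero_lib, like SearchOrchestrator)
from zotero_lib import SearchOrchestrator, ZoteroLibrary

library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)
items = library.search_items("machine learning", limit=100)  # Safe!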

# Filter to recent journal articles
filtered = orchestrator.filter_by_criteria(
    items,
    item_types=["journalArticle"],
    date_range=(2020, 2025)
)
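
For readers curious what this kind of in-code filtering amounts to, the following is a minimal sketch. It does not reproduce `filter_by_criteria`: the `Item` class, the `filter_items` function, and the choice to treat `date_range` as inclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    """Hypothetical stand-in for a library item (not the library's ZoteroItem)."""
    key: str
    item_type: str
    year: Optional[int]


def filter_items(items, item_types=None, date_range=None):
    """Keep items whose type is allowed and whose year is inside date_range.

    date_range is treated as an inclusive (start_year, end_year) pair;
    that is an assumption of this sketch.
    """
    kept = []
    for item in items:
        if item_types is not None and item.item_type not in item_types:
            continue
        if date_range is not None:
            start, end = date_range
            # Items without a known year cannot satisfy a date filter.
            if item.year is None or not (start <= item.year <= end):
                continue
        kept.append(item)
    return kept


sample = [
    Item("A", "journalArticle", 2021),
    Item("B", "book", 2022),
    Item("C", "journalArticle", 2015),
    Item("D", "journalArticle", None),
]
recent_articles = filter_items(sample, item_types=["journalArticle"], date_range=(2020, 2025))
print([item.key for item in recent_articles])
```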

Installation

Requirements

Python 3.8+
Zotero MCP installed via pipx
Claude Code or similar code execution environment

Setup

Clone this repository:

git clone https://github.com/kerim/zotero-code-execution.git
cd zotero-code-execution

Install dependencies (optional - usually already installed with Zotero MCP):

pip install -r requirements.txt

Use in your code:

import sys
sys.path.append('/path/to/zotero-code-execution')
import setup_paths  # Adds zotero_mcp to path
from zotero_lib import SearchOrchestrator, ZoteroLibrary, format_results

Usage Examples

Basic Search

orchestrator = SearchOrchestrator()
results = orchestrator.comprehensive_search("neural networks", max_results=20)
print(format_results(results))

Filter by Author

library = ZoteroLibrary()
results = library.search_items("Kahneman", qmode="titleCreatorYear", limit=50)
sorted_results = sorted(results, key=lambda x: x.date or "", reverse=True)  # items without a date sort last
print(format_results(sorted_results))

Tag-Based Search

library = ZoteroLibrary()
results = library.search_by_tag(["learning", "cognition"], limit=50)
print(format_results(results[:20]))

Recent Papers

library = ZoteroLibrary()
results = library.get_recent(limit=20)
print(format_results(results))

Custom Filtering

library = ZoteroLibrary()
orchestrator = SearchOrchestrator(library)

items = library.search_items("AI", limit=100)

# Only recent papers with DOI
recent_with_doi = [
    item for item in items
    if item.doi and item.date and int(item.date[:4]) >= 2020
]
print(format_results(recent_with_doi))
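
Slicing the first four characters works when dates start with a year, but it raises `ValueError` on free-form values such as "March 2020". A more tolerant year parser might look like the sketch below; `extract_year` and `YEAR_PATTERN` are hypothetical helpers, not part of the library, and the 1500-2099 range is an arbitrary choice for this example.

```python
import re

# Matches a standalone four-digit year between 1500 and 2099.
YEAR_PATTERN = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def extract_year(date_text):
    """Return the first plausible four-digit year in date_text, or None."""
    if not date_text:
        return None
    match = YEAR_PATTERN.search(date_text)
    return int(match.group(1)) if match else None


for raw in ["2021-03-15", "March 2020", "15 March 2019", "n.d.", "", None]:
    print(repr(raw), extract_year(raw))
```

With such a helper, the filter condition could be written as `(extract_year(item.date) or 0) >= 2020`.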

See examples.py for 8 complete working examples.

Claude Code Skill

This repository includes a Claude Code skill for easy integration.

Installation

Copy the skill to your Claude skills directory:

cp -r claude-skill ~/.claude/skills/zotero-mcp-code

Usage

In Claude Code, searches will automatically use the code execution pattern:

"Find papers about embodied cognition"

Claude will write code using this library instead of direct MCP calls.

See claude-skill/SKILL.md for complete skill documentation.

API Reference

`SearchOrchestrator`

Main class for automated multi-strategy searching.

`comprehensive_search(query, max_results=20, use_semantic=True, use_keyword=True, use_tags=True, search_limit_per_strategy=50)`

Performs comprehensive search with automatic deduplication and ranking.

Returns: List of ZoteroItem objects

`filter_by_criteria(items, item_types=None, date_range=None, required_tags=None, excluded_tags=None)`

Filter items by various criteria.

Returns: Filtered list of ZoteroItem objects

`ZoteroLibrary`

Low-level interface to Zotero.

search_items(query, ...) - Keyword search
semantic_search(query, ...) - Semantic/vector search
search_by_tag(tags, ...) - Tag-based search
get_recent(limit) - Recently added items
get_tags() - All library tags

Helper Functions

format_results(items, include_abstracts=True, max_abstract_length=300) - Format as markdown
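
The effect of `include_abstracts` and `max_abstract_length` can be illustrated with a small stand-in formatter. This is only a sketch: `format_items`, the dictionary-based items, and the exact markdown layout are assumptions, and the real `format_results` output may differ.

```python
def format_items(items, include_abstracts=True, max_abstract_length=300):
    """Render items as a numbered markdown list, truncating long abstracts."""
    lines = []
    for number, item in enumerate(items, start=1):
        lines.append(f"{number}. **{item['title']}** ({item.get('date', 'n.d.')})")
        abstract = item.get("abstract", "")
        if include_abstracts and abstract:
            if len(abstract) > max_abstract_length:
                # Cut at the limit and mark the truncation.
                abstract = abstract[:max_abstract_length].rstrip() + "..."
            lines.append(f"   {abstract}")
    return "\n".join(lines)


demo_items = [
    {"title": "Thinking, Fast and Slow", "date": "2011", "abstract": "a" * 400},
    {"title": "Untitled note"},
]
print(format_items(demo_items, max_abstract_length=50))
```

Truncating abstracts is one more way to keep the text returned to the LLM context small.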

See README_LIBRARY.md for complete API documentation.

Architecture

Based on Anthropic's code execution with MCP:

Claude writes Python code (not direct MCP calls)
Code fetches large datasets (100+ items) from Zotero
Code processes in execution environment (dedup, rank, filter)
Only filtered results return to LLM context (20 items)

Result: Large datasets stay out of context, preventing crashes and saving tokens.
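
The four steps above can be sketched with synthetic data. Everything here is hypothetical (`fetch_many`, `summarize_for_context`, the fake `score` field), and the printed sizes describe the made-up records only; they say nothing about real token savings.

```python
import json


def fetch_many(count):
    """Hypothetical stand-in for a large Zotero fetch; returns synthetic records."""
    return [
        {"key": f"ITEM{i:03d}", "title": f"Paper {i}", "abstract": "x" * 500, "score": i % 17}
        for i in range(count)
    ]


def summarize_for_context(items, top_n=20):
    """Rank in the execution environment and return only compact top-N records."""
    ranked = sorted(items, key=lambda item: item["score"], reverse=True)[:top_n]
    return [{"key": item["key"], "title": item["title"]} for item in ranked]


all_items = fetch_many(120)                 # stays in the execution environment
summary = summarize_for_context(all_items)  # the only part returned to the LLM

full_size = len(json.dumps(all_items))
summary_size = len(json.dumps(summary))
print(f"fetched {len(all_items)} items ({full_size} chars), "
      f"returned {len(summary)} items ({summary_size} chars)")
```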

Performance

Expected Benefits

Based on Anthropic's pattern and implementation design:

Token reduction: 50-90% (exact amount depends on search size)
Function calls: 5-10 per search → 1 (follows from the design)
Search limits: 10-15 → 100+ items (safe in code)
Crash prevention: Likely effective (untested)

Status

⚠️ Proof of concept - Performance claims are theoretical projections, not measured results.

See HONEST_STATUS.md for detailed status and validation needs.

Documentation

README_LIBRARY.md - Complete library documentation
QUICK_START.md - Quick reference guide
CLAUDE_INSTRUCTIONS.md - Instructions for Claude Code
examples.py - 8 working examples
IMPLEMENTATION_SUMMARY.md - Technical details
HONEST_STATUS.md - Implementation status
claude-skill/SKILL.md - Claude Code skill docs

Contributing

Contributions welcome! Areas for improvement:

Performance validation - Measure actual token savings
Better ranking - Incorporate semantic similarity scores
Caching - Cache search results with invalidation
Parallel processing - Execute search strategies concurrently
Export functions - Batch BibTeX generation, CSV export

License

MIT License - see LICENSE file for details.

Credits

Based on Zotero MCP
Inspired by Anthropic's code execution with MCP

Related Projects

Zotero MCP - The underlying MCP server
Claude Code - Code execution environment
FastMCP - MCP server framework

Citation

If you use this in research, please cite:

@software{zotero_code_execution,
  title = {Zotero Code Execution: Efficient Multi-Strategy Search},
  year = {2025},
  url = {https://github.com/kerim/zotero-code-execution}
}

5-Dim Analysis

Clarity: 8/10
Novelty: 8/10
Utility: 9/10
Completeness: 8/10
Maintainability: 7/10

Pros & Cons

Pros

Efficient handling of large datasets
Automatic deduplication and ranking
Supports multiple search strategies
Reduces crash risks during searches

Cons

Performance claims are theoretical and untested
Requires specific environment setup
Potential learning curve for new users
Limited documentation on advanced features

Disclaimer: This content is sourced from GitHub open source projects for display and rating purposes only.