| author | grothedev <grothedev@gmail.com> | 2025-11-01 21:50:32 -0400 |
|---|---|---|
| committer | grothedev <grothedev@gmail.com> | 2025-11-01 21:50:32 -0400 |
| commit | 2a5dafb00d090079ea9519ac3e9d22e54dc1e6da | |
| tree | f4526536e2f760e1d7d05e7283f5092e7236438f | |
| parent | 3b026ec58a4205b3ff7589b509b937a38ef5e825 | |
update spec
| -rwxr-xr-x | ai_api_module.py | 2 |
| -rwxr-xr-x | cli.py | 3 |
| -rw-r--r-- | spec.md | 199 |
3 files changed, 134 insertions, 70 deletions
diff --git a/ai_api_module.py b/ai_api_module.py
index 16f80df..3cf616b 100755
--- a/ai_api_module.py
+++ b/ai_api_module.py
@@ -23,7 +23,7 @@ api_key_env_key = 'GEMINI_API_KEY'
 # the api key, if you prefer to set it here
 api_key = None

-urlbase = 'https://generativelanguage.googleapis.com/v1beta/' #models/gemini-2.0-flash:generateContent'
+urlbase = 'https://generativelanguage.googleapis.com/v1beta' #models/gemini-2.0-flash:generateContent'
 endpoints = {
     'chat': {
         'simple': urlbase+'/chat/completions', # without custom assistant #use components/schemas/ChatCompletionRequest
diff --git a/cli.py b/cli.py
--- a/cli.py
+++ b/cli.py
@@ -59,7 +59,8 @@ def main():
                 #print(f" ModelID: {m['id']}\n Description: {m['description']}\n Tokens: {m['max_completion_tokens']}\n Modalities: {str(m['modalities'])}")
                 print()
         else:
-            print('error: ' + str(res[1]))
+            print('error: ')
+            print(str(res[1].text))
         return

     if args.list_conversations:
diff --git a/spec.md b/spec.md
--- a/spec.md
+++ b/spec.md
@@ -4,21 +4,14 @@

 GEMI is a flexible tool for interfacing with AI language models via external HTTP REST APIs (such as OpenAI, Anthropic, or compatible endpoints). It provides multiple interface options (CLI, REPL, Web UI) and advanced features like conversation management, file context integration, and intelligent document retrieval.

-### Use Cases
-- Interactive AI-assisted coding and development
-- Document analysis with local file context
-- Persistent conversation sessions with history management
-- Automated AI workflows via CLI scripting
-- Web-based team collaboration with shared AI access
-
 ## Architecture

 ### Core Module
 **Responsibilities:**
-- HTTP API client for LLM communication
+- HTTP API client for communication with external model service (e.g. OpenAI, Anthropic, Gemini, Ollama)
 - Conversation context management
-- Message formatting and transformation
-- File context integration and caching
+- Message formatting and transformation (e.g. an adapter to the varying protocols of each of the models' API specs)
+- File context integration and caching (configurable triggers for when the model should look at file metadata and when it should read file contents)
 - Session persistence and retrieval

 ### CLI Interface
@@ -87,34 +80,73 @@ GEMI is a flexible tool for interfacing with AI language models via external HTT
 - **Diff View**: See changes in code suggestions
 - **Search History**: Full-text search across conversations

-## Tech Stack
+## Technical Requirements

 ### Language & Runtime
-- **Python 3.10+**: Core implementation language
-- **asyncio**: Async HTTP and concurrent operations
-
-### HTTP & API
-- **httpx**: Modern async HTTP client
-- **pydantic**: Data validation and settings management
-
-### CLI & REPL
-- **click**: CLI framework
-- **prompt_toolkit**: Rich REPL with history/completion
-- **rich**: Terminal formatting and syntax highlighting
-
-### Web Server
-- **FastAPI**: Modern async web framework
-- **uvicorn**: ASGI server
-- **websockets**: Real-time communication
-- **Jinja2**: HTML templating
-
-### Data Storage
-- **SQLite**: Conversation history and metadata
-- **sqlite-vec** (optional): Vector similarity search for files
-
-### Configuration
-- **python-dotenv**: Environment variable loading
-- **pyyaml** or **tomli**: Config file parsing
+- **Language**: Python 3.10+ (current implementation)
+- **Async Support**: Required for efficient HTTP operations and concurrent tasks
+- **Cross-platform**: Must work on Linux, macOS, Windows
+
+### HTTP Client Requirements
+- Async/await support
+- Connection pooling
+- Timeout and retry mechanisms
+- Streaming response support
+- Custom header support
+
+### CLI Requirements
+- Argument parsing with subcommands
+- Help text generation
+- Input/output stream handling
+- Exit code standards
+
+### REPL Requirements
+- Multi-line input support
+- Command history persistence
+- Auto-completion
+- Syntax highlighting for code blocks
+- Configurable key bindings
+
+### Storage Requirements
+- Relational database for structured data (conversations, sessions)
+- File-based storage for large content
+- Transaction support
+- Schema migration capability
+- Optional: Vector similarity search
+
+### Web Server Requirements (Future)
+- Async request handling
+- WebSocket support for streaming
+- Static file serving
+- Request validation
+- Session management
+
+## Technology Considerations
+
+*Note: These are suggested technologies based on the current Python implementation. Alternative choices are acceptable if they meet the technical requirements above.*
+
+### Current Implementation Stack
+- **HTTP**: httpx (alternatives: aiohttp, requests)
+- **Validation**: pydantic (alternatives: marshmallow, attrs)
+- **CLI**: click (alternatives: argparse, typer)
+- **REPL**: prompt_toolkit (alternatives: readline, custom)
+- **Terminal UI**: rich (alternatives: colorama, termcolor)
+- **Database**: SQLite (alternatives: PostgreSQL, DuckDB)
+- **Config**: pyyaml/tomli (alternatives: json, ini, toml)
+
+### Future Web Stack Considerations
+- **Framework**: FastAPI, Flask, or Django
+- **ASGI Server**: uvicorn, hypercorn, or daphne
+- **Frontend**: Plain HTML/JS, React, or Vue.js
+
+### Alternative Language Considerations
+If rewriting from Python:
+- **Rust**: Excellent performance, compiled binary, strong typing
+  - Trade-offs: Steeper learning curve, longer compile times
+- **Go**: Simple concurrency, fast compilation, single binary
+  - Trade-offs: Less rich ecosystem for ML/AI tooling
+- **TypeScript/Node**: Familiar for web developers, good async support
+  - Trade-offs: Runtime performance, less mature for CLI tools

 ## Configuration

@@ -122,10 +154,10 @@ GEMI is a flexible tool for interfacing with AI language models via external HTT
 ```yaml
 # API Configuration
 api:
-  provider: "openai" # openai, anthropic, custom
-  base_url: "https://api.openai.com/v1"
-  api_key: "${OPENAI_API_KEY}"
-  model: "gpt-4-turbo-preview"
+  provider: "gemini" # openai, anthropic, custom
+  base_url: "https://generativelanguage.googleapis.com/v1beta/"
+  api_key: "${GEMINI_API_KEY}"
+  model: "gemini-2.0-flash"
   timeout: 60
   max_retries: 3

@@ -135,6 +167,7 @@ model:
   max_tokens: 4096
   top_p: 1.0
   system_prompt: "You are a helpful coding assistant..."
+  system_prompts_path: "./prompts"

 # Conversation Settings
 conversation:
@@ -172,22 +205,36 @@ web:

 ## API Design

-### Core Classes
-```python
-# Message types
-Message(role: str, content: str, metadata: dict)
-Conversation(id: str, messages: List[Message], created_at: datetime)
-
-# API client
-LLMClient(config: Config)
-  - send_message(conversation: Conversation) -> Message
-  - stream_message(conversation: Conversation) -> AsyncIterator[str]
-
-# File indexer
-FileIndex(db_path: Path)
-  - index_directory(path: Path) -> int
-  - search_relevant(query: str, limit: int) -> List[FileInfo]
-  - get_file_content(path: Path) -> str
+*High-level interface contracts - implementation details flexible*
+
+### Core Abstractions
+
+**Message Protocol**
+```
+Message: { role: string, content: string, metadata: dict }
+Conversation: { id: string, messages: List[Message], created: timestamp }
+```
+
+**API Client Interface**
+```
+send_message(conversation, config) -> Message
+stream_message(conversation, config) -> Iterator[string]
+list_models() -> List[ModelInfo]
+```
+
+**Storage Interface**
+```
+save_conversation(conversation) -> session_id
+load_conversation(session_id) -> Conversation
+list_sessions() -> List[SessionMetadata]
+delete_session(session_id) -> bool
+```
+
+**File Index Interface**
+```
+index_directory(path) -> int
+search_relevant(query, limit) -> List[FileInfo]
+get_file_content(path) -> string
 ```

 ## Data Models
@@ -241,22 +288,22 @@ FileIndex(db_path: Path)

 ## Testing Strategy

-### Unit Tests
-- Config loading and validation
-- Message formatting
-- File indexing logic
-- API response parsing
+### Test Categories
+- **Unit Tests**: Individual function/class behavior
+- **Integration Tests**: Component interactions
+- **End-to-End Tests**: Full user workflows
+- **Performance Tests**: Throughput and latency benchmarks

-### Integration Tests
-- End-to-end API calls (with mocking)
-- File system operations
-- Database operations
-- Web server endpoints
+### Testing Requirements
+- Mockable external dependencies (API calls, file system)
+- Deterministic test data
+- CI/CD integration capability
+- Code coverage reporting (target: >80%)

 ### Manual Testing
 - CLI usability testing
 - REPL interaction flows
-- Web UI responsiveness
+- Web UI responsiveness (future)
 - Cross-platform compatibility

 ## Implementation Notes
@@ -266,6 +313,13 @@ FileIndex(db_path: Path)
 - Missing conversation context management (critical)
 - Need to refactor for modularity

+### Design Principles
+1. **Modularity**: Each component should be independently testable
+2. **Configurability**: Behavior controlled via config, not code changes
+3. **Extensibility**: Easy to add new API providers, storage backends
+4. **Observability**: Comprehensive logging and error reporting
+5. **User-Centric**: Intuitive defaults, helpful error messages
+
 ### Next Immediate Steps
 1. Implement conversation persistence layer
 2. Add session listing and selection
@@ -273,9 +327,18 @@ FileIndex(db_path: Path)
 4. Separate concerns into modules (api, storage, cli)
 5. Add basic error handling and logging

+### Architectural Decisions to Make
+- [ ] Storage format for conversations (JSON, SQLite, both?)
+- [ ] Config file format (YAML, TOML, or support both?)
+- [ ] Streaming vs batch for file indexing
+- [ ] Caching strategy for API responses
+- [ ] Plugin architecture design (if needed)
+
 ### Future Considerations
 - Support for function calling / tool use
 - Image input support for vision models
 - Voice input/output integration
 - Collaborative features (shared sessions)
-- API rate limiting and queuing
\ No newline at end of file
+- API rate limiting and queuing
+- Multi-provider fallback/routing
+- Local model support (Ollama, llama.cpp)
\ No newline at end of file
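
Note on the `urlbase` change: the `endpoints` dict in `ai_api_module.py` builds each URL as `urlbase+'/chat/completions'`, so a base that ends in `/` appears to produce a doubled slash, which is presumably why the trailing slash was dropped. The sketch below illustrates the difference and one way to make the join tolerant of either form. `join_url` is a hypothetical helper introduced here for illustration; it is not part of the repository.

```python
# Sketch only: join_url is a hypothetical helper, not part of ai_api_module.py.
def join_url(base: str, path: str) -> str:
    """Join base and path with exactly one '/' between them."""
    return base.rstrip('/') + '/' + path.lstrip('/')


OLD_BASE = 'https://generativelanguage.googleapis.com/v1beta/'  # value before this commit
NEW_BASE = 'https://generativelanguage.googleapis.com/v1beta'   # value after this commit

# Plain concatenation, as the endpoints dict does, doubles the slash with the old value.
old_simple = OLD_BASE + '/chat/completions'
new_simple = NEW_BASE + '/chat/completions'

# The helper yields the same URL whichever form of the base is configured.
normalized = join_url(OLD_BASE, '/chat/completions')

if __name__ == '__main__':
    print(old_simple)
    print(new_simple)
    print(normalized)
```

A helper like this could also cover the `base_url` example in `spec.md`, which still ends in `/` after this commit.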
