Query your private documents and get citation-backed answers from Gemini

Post author: @githubprojects


Query Your Private Docs with AI and Get Citation-Backed Answers

Ever wished you could ask an AI a question about your internal documentation, a private PDF, or a company memo and get a precise answer that actually shows its work? Most AI tools either require you to upload sensitive data to a third-party service or can't tell you where in your documents the answer came from. That's a deal-breaker for a lot of private or proprietary information.

Enter AgentField. It's a local, open-source tool that lets you point Google's Gemini at your own collection of documents. You get clear, conversational answers, and crucially, it cites the exact source document and passage it used. Your files stay on your machine (only the retrieved text snippets are sent to Gemini to generate the answer), and you can verify every claim.

What It Does

In simple terms, AgentField is a local retrieval-augmented generation (RAG) system. You give it a folder of documents (PDFs, text files, etc.), it processes and indexes them locally, and then you can ask questions in a clean web interface. It searches through your docs, finds relevant snippets, and instructs Gemini to compose an answer based solely on that context. The final answer includes inline citations (like [1]) that link back to the source material.
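
To make that retrieve-then-generate flow concrete, here is a minimal sketch in plain Python. It is not AgentField's code: word-count cosine similarity stands in for the local embedding model and vector store, the function names (tokenize, retrieve, build_prompt) and the sample chunks are hypothetical, and the final call to Gemini is left out. What it shows is the shape of the pipeline: rank snippets against the question, number them, and hand the model only that numbered context.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real embedding model."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return up to k chunks that share words with the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c["text"]))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question, hits):
    """Number each snippet so the model can cite it as [1], [2], ..."""
    context = "\n".join(
        f"[{i}] ({h['source']}) {h['text']}" for i, h in enumerate(hits, start=1)
    )
    return (
        "Answer using only the numbered context below. "
        "Cite the snippets you rely on as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical sample data; a real index would hold chunks of your own files.
chunks = [
    {"source": "handbook.txt", "text": "Vacation requests must be filed two weeks in advance."},
    {"source": "api.txt", "text": "The rate limit is 100 requests per minute per key."},
    {"source": "memo.txt", "text": "The office closes at noon on Fridays in August."},
]
question = "What is the API rate limit?"
hits = retrieve(question, chunks)
prompt = build_prompt(question, hits)
print(prompt)
```

Note that only the snippets selected by retrieve end up in the prompt, which is the property the privacy claim rests on: the unselected chunk never appears in the text that would be sent to the model.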

Why It's Cool

The "cool factor" here isn't just about using a large language model; it's about the specific choices that make this practical for developers and teams.

  • Privacy-First & Local: Document processing, embedding, and vector search all run on your machine, using a local embedding model and a local vector database. The one external call is to the Gemini API for final answer generation, and it sends only the extracted text snippets, not the whole files.
  • Verifiable Answers: The citation feature is the star. No more guessing if the AI is hallucinating an answer about your project's API. You can click the citation and immediately see the source text it referenced. This builds trust.
  • Clean, Simple Stack: It leverages solid, modern tools: chromadb for the local vector store, Ollama for running local embedding models, and Google's Gemini API for the final answer synthesis. It's a great example of a practical RAG pipeline.
  • Developer-Friendly: It's a Python project with clear setup instructions. It's the kind of tool you can get running in an afternoon, understand, and even modify for your own needs.
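
As an illustration of why numbered citations make answers checkable, the sketch below maps each [n] marker in an answer back to the snippet it refers to and flags markers that point at nothing. The helper name resolve_citations, the source dictionaries, and the sample answer are all hypothetical; how AgentField itself links citations to sources is not shown here.

```python
import re


def resolve_citations(answer, sources):
    """Map each [n] marker in the answer to the snippet it points at.

    Returns (resolved, dangling): resolved is {n: source dict}, and
    dangling lists marker numbers that match no supplied snippet.
    """
    resolved, dangling = {}, []
    for match in re.finditer(r"\[(\d+)\]", answer):
        n = int(match.group(1))
        if 1 <= n <= len(sources):
            resolved[n] = sources[n - 1]  # markers are 1-based
        elif n not in dangling:
            dangling.append(n)
    return resolved, dangling


# Hypothetical snippets that were sent to the model, in prompt order.
sources = [
    {"file": "api.txt", "text": "The rate limit is 100 requests per minute per key."},
    {"file": "memo.txt", "text": "The office closes at noon on Fridays in August."},
]
answer = "The limit is 100 requests per minute [1]. Bursts are allowed [3]."

resolved, dangling = resolve_citations(answer, sources)
for n, src in resolved.items():
    print(f"[{n}] {src['file']}: {src['text']}")
print("markers with no source:", dangling)
```

A dangling marker such as [3] here is a useful warning sign: the sentence carrying it has no snippet backing it and deserves a manual check.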

How to Try It

Getting started is straightforward if you're comfortable in a terminal. You'll need Python, an Ollama instance running locally, and a Google AI Studio API key for Gemini.

  1. Clone the repo:

    git clone https://github.com/Agent-Field/agentfield.git
    cd agentfield
    
  2. Set up your environment: Install the requirements and set your API key.

    pip install -r requirements.txt
    export GOOGLE_API_KEY="your_google_gemini_api_key_here"
    
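  3. Run Ollama: In a separate terminal, start the server (skip this if Ollama is already running as a background service).

    ollama serve

     Then, from another terminal, pull a lightweight embedding model. The pull command talks to the running server, so it comes second.

    ollama pull nomic-embed-text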
    
  4. Add your documents: Place your PDFs, text files, etc., into the data/ directory.

  5. Run the app: Back in the project directory, start the application.

    python app.py
    
  6. Open your browser: Navigate to http://localhost:5000, ingest your documents, and start asking questions.
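
Before launching the app, a small preflight script can catch the two most common setup slips from the steps above: a missing API key and an empty data/ directory. This is a hypothetical helper, not part of the repository; it assumes the GOOGLE_API_KEY variable and data/ folder named in the steps, and it does not check whether Ollama is running.

```python
import os
from pathlib import Path


def preflight(env, data_dir):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")
    data = Path(data_dir)
    if not data.is_dir():
        problems.append(f"{data} directory not found")
    elif not any(p.is_file() for p in data.iterdir()):
        problems.append(f"{data} contains no files")
    return problems


# Run from the project root; prints nothing when both checks pass.
for problem in preflight(os.environ, "data"):
    print("fix before running app.py:", problem)
```

Passing the environment and directory in as arguments, rather than reading them inside the function, keeps the check easy to try against a scratch folder.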

For full details and any updated steps, check out the AgentField GitHub repository.

Final Thoughts

AgentField feels like a pragmatic step towards truly useful AI for knowledge work. It addresses two of the biggest headaches with AI over private material: privacy and traceability. As a developer, I can see this being useful for onboarding (point it at the project docs and ask away), querying internal process docs, or even as a research assistant for a collection of papers.

It's not a monolithic SaaS platform; it's a composable tool you can run today. The fact that it's open-source means you can tweak the embedding model, the prompt, or the UI to fit your workflow. If you've been curious about RAG systems or need a trustworthy way to chat with your private docs, this is a perfect project to spin up and experiment with.


Project ID: f6858183-3529-410a-a124-bb02b4508534
Last updated: April 1, 2026 at 05:41 AM