ChatGeo-Magi
Large Language Models aided by Retrieval-Augmented Generation for Enhanced Geomagnetic Data Access and Customer Support
Introduction
The integration of Large Language Models (LLMs) into scientific workflows offers new opportunities for improving access to complex, domain-specific datasets. ChatGeo-Magi explores this potential in the context of geomagnetism. By combining LLMs with retrieval-augmented generation (RAG) and API data retrieval, the system can answer geomagnetism questions grounded in authoritative documents and generate plots and figures from NOAA’s geomagnetic field calculators.
The geomagnetism group at NOAA and CIRES develops and distributes models such as the World Magnetic Model (WMM), which supports navigation, mineral exploration, and scientific research. These models are used tens of thousands of times daily, and the need for improved Q&A and customer support interfaces continues to grow. ChatGeo-Magi aims to address this challenge by providing a system capable of domain-grounded explanations, calculator outputs, and data visualization.
Challenges
Early experiments revealed a limitation: when a single model was given both document context and tool-calling instructions, it frequently ignored or corrupted the syntax needed to call APIs correctly. This made tool execution unreliable.
To mitigate this issue, we implemented a two-LLM system. The first model is dedicated solely to tool routing: it determines whether a user query requires an API call and, if so, outputs the call in the correct syntax. The second model receives the RAG context, the user query, and (if applicable) the API response. Dividing responsibilities in this way reduced syntax errors and improved reliability when chaining document-grounded reasoning with live data retrieval. The accompanying diagram illustrates this division.
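The division of labor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the JSON tool-call format, the prompt wording, and the names `route`, `answer`, and `pipeline` are all hypothetical, and each model is represented by a plain callable that maps a prompt string to a completion string.

```python
import json
from typing import Callable, Dict, Optional

# Hypothetical stand-in: any callable that maps a prompt to a completion.
LLM = Callable[[str], str]

# Assumed prompt and JSON call format; the source does not specify either.
ROUTER_PROMPT = (
    "Decide whether the question needs a geomagnetic calculator call. "
    'Reply with JSON only, e.g. {"tool": "declination", "args": {...}} '
    'or {"tool": null}.\nQuestion: '
)


def route(router_llm: LLM, query: str) -> Optional[dict]:
    """First model: tool routing only. Returns a tool call, or None."""
    raw = router_llm(ROUTER_PROMPT + query)
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return None  # malformed syntax: fall back to a document-only answer
    if not isinstance(call, dict) or not call.get("tool"):
        return None
    return call


def answer(answer_llm: LLM, query: str, rag_context: str,
           api_response: Optional[str] = None) -> str:
    """Second model: sees RAG context, the query, and any API response."""
    parts = ["Context:\n" + rag_context, "Question: " + query]
    if api_response is not None:
        parts.append("API response: " + api_response)
    return answer_llm("\n\n".join(parts))


def pipeline(query: str, router_llm: LLM, answer_llm: LLM,
             retrieve: Callable[[str], str],
             tools: Dict[str, Callable[..., str]]) -> str:
    """Run routing first, then the grounded answer step."""
    call = route(router_llm, query)
    api_response = None
    if call is not None and call["tool"] in tools:
        api_response = tools[call["tool"]](**(call.get("args") or {}))
    return answer(answer_llm, query, retrieve(query), api_response)
```

Because the routing model never sees the retrieved documents, its prompt stays short and its output format stays narrow, which is the property the two-model split relies on.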
The Approach
In more detail, the approach combines two methods:
Retrieval-Augmented Generation (RAG):
Documents (technical reports, FAQs, emails) are chunked, embedded using HuggingFace embeddings, and indexed in a Chroma vector database. Each query is embedded into the same vector space, and the nearest document chunks are retrieved as additional context for the LLM. This grounds answers in source material and reduces hallucinations. The accompanying figure visualizes this vector space.
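To make the chunk, embed, and retrieve steps concrete, here is a minimal, dependency-free sketch. It deliberately swaps in toy components: a bag-of-words `embed` function stands in for the HuggingFace embedding model, and the hypothetical `TinyIndex` class stands in for the Chroma database. Only the shape of the workflow carries over; the chunk sizes and function names are assumptions.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def chunk(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class TinyIndex:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: List[Tuple[str, Counter]] = []

    def add(self, chunks: List[str]) -> None:
        self.items.extend((c, embed(c)) for c in chunks)

    def query(self, text: str, k: int = 2) -> List[str]:
        """Return the k chunks nearest to the query in embedding space."""
        q = embed(text)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]),
                        reverse=True)
        return [c for c, _ in ranked[:k]]
```

In a real deployment the retrieved chunks would be concatenated into the prompt of the answering model; a learned embedding also matches paraphrases, which this word-count version cannot.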
API-driven data retrieval:
The system is equipped with tools for querying NOAA’s geomagnetic APIs. For example, a query such as “What is the magnetic declination in Deadhorse, Alaska today?” is transformed into a structured API request and executed, and the result is returned to the user with proper units. Additional tools support time series plots, multi-series comparisons, and 2D contour mapping of geomagnetic field values. The accompanying plot was generated by the system from the prompt “Plot me the magnetic declination in Alpine Alaska from 2000 until now”.
Together, these two methods allow the model to both explain background concepts and return live scientific outputs in the same conversational flow.
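As an illustration of the API-driven step, the sketch below turns a location and date into a structured request URL and formats a returned declination with units. The endpoint and parameter names (`lat1`, `lon1`, `model`, `startYear`, `resultFormat`, `key`) reflect our understanding of NOAA's public declination calculator and should be checked against its documentation; the API key is a placeholder, the helper names are hypothetical, and the Deadhorse coordinates in the usage note are approximate. No network call is made here.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint; verify against NOAA's calculator documentation.
BASE_URL = "https://www.ngdc.noaa.gov/geomag-web/calculators/calculateDeclination"


def declination_request(lat: float, lon: float,
                        on: Optional[date] = None,
                        api_key: str = "YOUR_KEY") -> str:
    """Build a declination request URL for a point and date."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude/longitude out of range")
    on = on or date.today()
    params = {
        "lat1": lat,
        "lon1": lon,
        "model": "WMM",            # assumed model identifier
        "startYear": on.year,
        "startMonth": on.month,
        "startDay": on.day,
        "resultFormat": "json",
        "key": api_key,            # placeholder, not a real key
    }
    return BASE_URL + "?" + urlencode(params)


def format_declination(degrees: float) -> str:
    """Render a signed declination with units (positive values are east)."""
    direction = "E" if degrees >= 0 else "W"
    return f"{abs(degrees):.2f}° {direction}"
```

For the Deadhorse example, `declination_request(70.2, -148.46)` would produce the URL to fetch, and the numeric value parsed from the response would be passed through `format_declination` before being handed to the answering model.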
Current Implementation
The prototype was developed in Python using:
- LangGraph to orchestrate agent execution and manage memory across sessions.
- ChromaDB for document vector storage and retrieval.
- Ollama to run Gemma 2 (27B parameters) on a virtual machine with two NVIDIA A100 GPUs.
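For readers unfamiliar with Ollama, the following sketch shows how a locally served model can be queried over Ollama's HTTP interface using only the standard library. It assumes the server runs at Ollama's default local address and that the model is available under the tag `gemma2:27b`; the prototype itself goes through LangGraph, so this is an illustration of the serving layer rather than the project's integration code.

```python
import json
from urllib.request import Request, urlopen

# Ollama's default local endpoint for single-turn generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "gemma2:27b") -> Request:
    """Build a non-streaming generation request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def generate(prompt: str, model: str = "gemma2:27b",
             timeout: float = 120.0) -> str:
    """Send the request and return the generated text.

    Requires a running Ollama server with the model already pulled.
    """
    with urlopen(build_generate_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

A function like `generate` could serve as either of the two model callables in a routing setup, with a different prompt for each role.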
The deployed agent is capable of:
- Retrieving relevant geomagnetism documents through RAG.
- Executing geomagnetic calculations via NOAA APIs.
- Generating line plots and contour maps in response to user queries.
In initial testing, the agent handled a wide range of questions, from definitional prompts (“What is secular variation?”) to dynamic calculations (“Plot declination in Alpine, Alaska from 2000 to 2024”).
Ongoing Work
The project remains under active development. Current priorities include:
- Expanding the document corpus with additional NOAA and CIRES geomagnetism references.
- Improving reliability of tool routing with more robust agentic control.
- Exploring multi-agent configurations for better separation of reasoning and tool execution.