My current product has hundreds of APIs. Every time I need to refer to the API specification, I have to navigate to the Swagger link, scroll endlessly, or use browser search to find what I need, and then manually filter things out. It's frustrating and painfully slow. Even worse, every developer who needs to integrate with the API has to go through the same experience.

This frustration led me to explore how AI could improve the process. This post is a deep dive into that journey and how it evolved into something simple, robust, and effective.

## Preparation

### Rich API Documentation

As a first step, I reviewed the entire API documentation to make sure each endpoint has clear summary and description details in the Swagger docs. This is a critical step for better discoverability in later stages. I also made sure the summary and description are unique for each API.

```json
"paths": {
  "/users/accounts/{account-id}": {
    "put": {
      "tags": ["Account API"],
      "summary": "Update Test suite by test-suite-id",
      "externalDocs": {
        "description": "Robust get account details description",
        "url": "https://mydocs.com/description"
      }
    }
  }
}
```

### Categorization of the APIs

With hundreds of APIs available, it can be challenging to identify which ones are related. Categorizing the APIs makes management more efficient and later simplifies selecting the right API based on natural language input. This categorization was implemented using the tagging concept defined in the OpenAPI specification (the `tags` field in the snippet above).

## Building Natural Language Search

### User Input

The user enters a natural language question related to the API.

Example: *How to retrieve account details?*

### Classify Category

At this stage, the question and all the available categories are sent to the LLM.
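As a rough sketch, this first classification stage might look like the following. The `ask_llm` callable and the helper names are assumptions for illustration, not code from the actual system; plug in whatever LLM client you use.

```python
# Sketch of the category-classification stage. The ask_llm callable and
# helper names are illustrative assumptions, not the author's actual code.

def build_category_prompt(question: str, categories: list[str]) -> str:
    """Build a prompt asking the LLM to pick exactly one category."""
    return (
        "Classify the question into exactly one of these API categories.\n"
        "Reply with the category name only.\n"
        f"Categories: {', '.join(categories)}\n"
        f"Question: {question}\n"
    )

def classify_category(question, categories, ask_llm):
    """ask_llm: any callable that sends a prompt and returns the reply text."""
    answer = ask_llm(build_category_prompt(question, categories)).strip()
    # Guard against free-form replies: accept only a known category label.
    return answer if answer in categories else None
```

Accepting only a known label keeps the routing deterministic even when the model answers verbosely; anything else can be treated as "no match" and retried or surfaced to the user.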
The LLM is tasked with returning the high-level category the question falls into. The output of this step is one of the categories.

### Classify Specific API

Based on the category the LLM identified, the system sends another request to the model with the same question, but this time it includes all API details within the category detected in the previous step.

This is where the earlier preparation pays off: the more descriptive and well-structured the API documentation, the better the results. Clear descriptions help the LLM accurately determine which API the user is asking about. The output of this step is a single, specific API.

### Enrich API Response Details

The OpenAPI specification of the selected API is then provided to the LLM, alongside the original question, to generate a detailed, context-rich description of the API.

For example, if the user asks, "How can I retrieve account details using an account ID?", the response will include the relevant specification details of the Account API.

## Extension

With the system's enhanced ability to accurately detect the appropriate API, users can now go a step further and generate code snippets to interact with various APIs directly.

For example:

- "Share Python code to call the Get Account Details API for a given ID."
- "Provide a cURL command to fetch account details by ID."
- "Generate a Go client to retrieve account details for a specific ID."

## Lessons Learned and Insights

Rich documentation is imperative for better accuracy when working with AI systems. Precise, clear, and to-the-point documentation is essential for robustness. Bonus: we also used an LLM to generate a summary and description for each API, which helped immensely.

### Categorize First

- **Why:** With hundreds of APIs, categorization reduces cognitive load and improves retrieval.
- **How:** Group related APIs into a small set of clear categories.
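One way to keep that category set in sync with the spec is to derive it from the OpenAPI tags added during preparation. A minimal sketch, assuming the spec is already parsed into a dict (the helper name is ours, not from the post):

```python
from collections import defaultdict

# Sketch: derive the category list from OpenAPI tags. Assumes the spec is
# already parsed into a dict; the helper name is an illustrative assumption.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}

def group_by_tag(spec: dict) -> dict[str, list[str]]:
    """Map each OpenAPI tag to the operations that carry it."""
    groups = defaultdict(list)
    for path, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            for tag in op.get("tags", ["Uncategorized"]):
                groups[tag].append(f"{method.upper()} {path}")
    return dict(groups)
```

The keys of the returned mapping become the category list for the first classification stage, so adding or retagging an endpoint in the spec updates the router automatically.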
AI systems perform better when the label space is limited.

- **Scale tip:** If the catalog is very large, add sub-categories for finer routing.

### Build Iteratively

- **Start small:** Take a subset of the spec and train/validate a router that can reliably select the correct API.
- **Expand gradually:** Add more APIs over time, measure accuracy, and prioritize areas with misclassifications.
- **Focus:** Optimize precision and recall rather than breadth at the outset.

### Close the Loop with Users

- **Collect feedback:** Capture cases where the system picked the wrong API.
- **Act on signals:** Refine the misidentified APIs' descriptions, summaries, and tags; clarify overlapping scopes.
- **Repeat:** Re-evaluate after each change to confirm that accuracy improves and regressions are avoided.

## Conclusion

As the number of available APIs continues to grow, exploring and managing them requires a new approach. With the rise of AI agents powered by large language models (LLMs), developers now have a more intuitive and efficient way to discover and interact with APIs, saving countless hours previously spent searching for the right endpoints.

The potential doesn't stop there. This concept can evolve into a standalone product capable of seamlessly ingesting OpenAPI specifications at runtime and exposing them through a natural language interface, offering users an out-of-the-box solution for API exploration.

Hopefully, this article has illustrated how to leverage LLMs effectively and how well-structured API documentation can create a smoother, more intelligent discovery experience.