Local LLM Messenger Chat with GenAI on Your iPhone

December 8, 2025 · 1122 words · 6 min

In this post, we want to share another winning project from last year's AI/ML Hackathon. This time we will dive into Local LLM Messenger (LoLLMM), an honorable mention winner.

Developers are pushing the boundaries to bring the power of artificial intelligence (AI) to everyone. One exciting approach involves integrating large language models (LLMs) with familiar messaging platforms like Slack and iMessage. This isn't just about convenience; it's about transforming these platforms into launchpads for interacting with powerful AI tools.

Imagine this: You need a quick code snippet or some help brainstorming solutions to a coding problem. With an LLM integrated into your messaging app, you can chat with your AI assistant directly within the familiar interface to generate creative ideas or work through problems. No more complex commands or clunky interfaces, just a natural conversation that unlocks the power of AI.

Integrating with messaging platforms can be a time-consuming task, especially for macOS users. That's where Local LLM Messenger (LoLLMM) steps in, offering a streamlined solution for connecting with your AI via iMessage. The following demo, which was submitted to the AI/ML Hackathon, provides an overview of LoLLM Messenger (Figure 1).

The LoLLM Messenger bot allows you to send iMessages to generative AI (GenAI) models running directly on your computer. This approach eliminates the need for complex setups and cloud services, making it easier for developers to experiment with LLMs locally. LoLLM Messenger includes impressive features that make it a standout among similar projects.

The architecture diagram shown in Figure 2 provides a high-level overview of the components and interactions within the LoLLM Messenger project. It illustrates how the main application, AI models, messaging platform, and external APIs work together to enable users to send iMessages to AI models running on their computers.
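To make the flow concrete, here is a minimal sketch of the core idea: an incoming message's text is forwarded to a local Ollama server and the reply is returned. This is an illustration, not the project's actual code; the default model name and the helper names are assumptions, while the `/api/generate` endpoint and port 11434 are Ollama's standard defaults.

```python
# Minimal sketch: forward a prompt to a local Ollama server (stdlib only).
# Helper names and the default model are assumed for illustration.
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_ollama_payload(prompt: str, model: str = "llama2") -> dict:
    """Build the JSON body that Ollama's generate API expects."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str, model: str = "llama2") -> str:
    """Send the prompt to the local Ollama server and return its reply text."""
    body = json.dumps(build_ollama_payload(prompt, model)).encode()
    req = request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# With Ollama running locally, ask_ollama("Write a haiku about Docker")
# would return the model's generated reply.
```

In the real project, a web framework sits in front of this call so that Sendblue can deliver each incoming iMessage as an HTTP request.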
By leveraging Docker, Sendblue, and Ollama, LoLLM Messenger offers a seamless and efficient solution for exploring AI models without relying on cloud-based services. LoLLM Messenger uses Docker Compose to manage the required services. Docker Compose simplifies the process by handling the setup and configuration of multiple containers, including the main application, ngrok (for creating a secure tunnel), and Ollama (a server that bridges the gap between messaging apps and AI models). The LoLLM Messenger tech stack includes Docker Compose, ngrok, Ollama, FastAPI, and the Sendblue API.

To get started, ensure that you have the required components installed and set up. Then open a terminal window and clone the sample application repository. You should now have the project files in your directory.

The main script is a Python script that uses the FastAPI framework to create a web server for an AI-powered messaging application. The script interacts with OpenAI's GPT-3 model and an Ollama endpoint to generate responses, and it uses Sendblue's API to send messages.

The script first imports the necessary libraries, including FastAPI, requests, logging, and other required modules. It then sets up configuration variables, such as API keys, the callback URL, the Ollama API endpoint, and maximum context and word limits. Next, the script configures logging, setting the log level to INFO and creating a file handler that writes log messages to a file. It then defines various functions for interacting with the AI models, managing context, sending messages, handling callbacks, and executing slash commands.

Two classes are defined to represent the structure of incoming messages and callback data. The code also includes functions and classes to handle different aspects of the messaging platform, such as setting default models, validating models, interacting with the Sendblue API, and processing messages.
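The setup phase described above can be sketched roughly as follows. This uses plain dataclasses and stdlib logging to keep it self-contained; the field names, log file name, limits, and slash commands are all assumptions for illustration, not the project's actual identifiers.

```python
# Illustrative sketch of the script's setup: configuration constants,
# INFO-level logging with a file handler, message/callback classes, and a
# tiny slash-command dispatcher. All names here are assumed, not actual.
import logging
from dataclasses import dataclass

MAX_CONTEXT = 20   # assumed cap on stored conversation turns
MAX_WORDS = 250    # assumed cap on response length

# Logging at INFO level with a file handler, as described above.
logger = logging.getLogger("lollmm")
logger.setLevel(logging.INFO)
logger.addHandler(logging.FileHandler("lollmm.log"))

@dataclass
class InboundMessage:
    """Shape of an incoming iMessage forwarded by Sendblue (fields assumed)."""
    from_number: str
    content: str

@dataclass
class StatusCallback:
    """Shape of a delivery-status callback from Sendblue (fields assumed)."""
    message_handle: str
    status: str

# Default-model state plus a minimal slash-command dispatcher.
state = {"model": "llama2"}  # assumed default model

def handle_command(text: str) -> str:
    """Route slash commands; anything else would be forwarded to the model."""
    if text.startswith("/model "):
        state["model"] = text.split(" ", 1)[1].strip()
        return f"Default model set to {state['model']}"
    if text == "/help":
        return "Commands: /model <name>, /help"
    return f"(forward to {state['model']})"
```

The real script wires classes like these into FastAPI request handlers, so that each webhook delivery is parsed, routed through the command dispatcher, and answered via the Sendblue API.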
It also includes functions to handle slash commands, create messages from context, and append context to a file.

Navigate to the project directory and create a new file for your environment variables. Next, add the ngrok authtoken to the Docker Compose file; you can get this token from your ngrok dashboard. Then run the application stack, and you will see output similar to the following. If you're testing on a system without an NVIDIA GPU, you can remove the GPU-related configuration from the Compose file.

Watch the output for your ngrok endpoint. Append the callback path to your ngrok URL to form the webhook URL, then add it under the webhooks URL section on Sendblue and save it (Figure 3).

The service is configured to expose the application on port 8000 and provide a secure tunnel to the public internet through an ngrok domain. The ngrok service logs indicate that it has started the web service and established a client session. They also show that the tunnel session has started and has been successfully established with the lollmm service. The ngrok service uses the specified ngrok authentication token, which is required to access the ngrok service. Overall, the ngrok service is running correctly and is able to establish a secure tunnel to the lollmm service. Ensure that there are no error logs when you run the ngrok container (Figure 4), and that the LoLLM Messenger container is actively up and running (Figure 5).

The logs show that the Ollama service has opened the specified port (11434) and is listening for incoming connections. They also indicate that the Ollama service has mounted its data directory from the host machine into the container. Overall, the Ollama service is running correctly and is ready to serve AI models for inference.

To test the functionality of the lollmm service, you first need to add your contact number to the Sendblue dashboard.
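An illustrative Compose file for this stack might look like the sketch below. The service names, image tags, volume path, and GPU block are assumptions based on the components described above, not the project's actual file; as noted, the GPU `deploy` block can be dropped on machines without an NVIDIA GPU.

```yaml
# Illustrative docker-compose sketch (names and paths assumed).
services:
  lollmm:                      # main FastAPI application
    build: .
    ports:
      - "8000:8000"
    env_file: .env             # API keys and other environment variables
  ngrok:                       # secure tunnel to the public internet
    image: ngrok/ngrok
    command: http lollmm:8000
    environment:
      - NGROK_AUTHTOKEN=${NGROK_AUTHTOKEN}
  ollama:                      # local model server on its default port
    image: ollama/ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama:/root/.ollama   # persist downloaded models on the host
    deploy:                      # remove this block if you have no NVIDIA GPU
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

With a layout like this, `docker compose up` starts all three services together, and the ngrok container's logs show the public URL to register with Sendblue.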
Then you should be able to send messages to the Sendblue number and observe the responses from the lollmm service (Figure 6). The Sendblue platform sends HTTP requests to your lollmm service's callback endpoint, and the service processes these requests and returns the appropriate responses.

The first time you chat with the LLM (Figure 7), you can check the container logs. Next, install a model (Figure 8); once you set the default model, the container logs confirm the change. The lollmm service is running correctly and can handle HTTP requests arriving through the ngrok tunnel. You can also use the ngrok tunnel URL directly to test the functionality of the lollmm service by sending HTTP requests to the appropriate paths (Figure 9).

LoLLM Messenger is a valuable tool for developers and enthusiasts looking to push the boundaries of LLM integration within messaging apps. It lets developers craft custom chatbots for specific needs, add real-time sentiment analysis to messages, or explore entirely new AI features in their messaging experience. To get started, explore the project and discover the potential of local LLMs.