"Dive into the fascinating world of AI as we celebrate ChatGPT's first anniversary and explore the dynamic evolution of both closed and open Large Language Models—a journey reshaping our digital interactions!"
ChatGPT turned one year old today. Since its launch on November 30, 2022, it has changed how we talk to AI, making conversations feel far more natural. ChatGPT was the first AI chatbot of its kind to give safe, detailed answers, follow instructions, and correct its own mistakes. It quickly became popular, reaching 100 million users in just two months, faster than apps like Instagram, TikTok, or YouTube. The launch was a milestone for AI, and its effects have been felt across many industries.
With ChatGPT, OpenAI showed that by fine-tuning a large language model (LLM) with specific training methods, the model could answer questions and follow instructions across a wide range of tasks. Today, ChatGPT is genuinely useful for applications like customer service and online assistance.
Interest in LLMs keeps growing, with new models appearing regularly in both academia and industry. Closed LLMs like OpenAI's GPT models still generally outperform open-source ones, but open-source LLMs are catching up fast, and the field keeps shifting as both kinds of models are retrained on new data.
Recently, a research team including researchers from Salesforce and universities in Singapore compared popular open-source LLMs to see how they stack up against closed models like ChatGPT. They produced a detailed survey, well worth reading, that compares high-performing open-source LLMs with ChatGPT across different tasks.
The survey's findings are useful for both researchers and businesses. It gives researchers an overview of the latest trends and promising research directions in open-source LLMs, and it helps business leaders decide how to use open-source LLMs effectively. The report also discusses the main challenges facing open-source LLMs, especially around safety and accuracy.
This image from the survey displays a timeline of the development of Large Language Models (LLMs) from May 2020 to November 2023, differentiating between closed-source models, shown below the timeline arrow, and open-source models, shown above it. The timeline indicates a rapid evolution in the field, with several key players releasing both closed and open-source models.
The timeline starts with GPT-3, a notable closed-source model from OpenAI, in May 2020, and shows a clear acceleration in LLM development from there. Throughout 2022, a mix of closed and open-source models emerged, with OpenAI's ChatGPT, a closed-source model, as the most significant release in November 2022, suggesting a trend towards more sophisticated and increasingly commercialized AI tools.
By early 2023, the development of open-source models gained momentum, with models like BLOOM and Alpaca indicating a community drive towards accessible AI. The timeline also shows a burst of new models from various organizations in mid-2023, reflecting a diverse ecosystem of LLMs. Notably, the timeline anticipates new entrants by the end of 2023, implying ongoing innovation and competition.
The spread of models over time reflects an increasing diversity in the capabilities and applications of LLMs. The presence of both open and closed-source models suggests a dynamic market where both proprietary technology and community-driven projects are thriving. This competition and variation serve to advance the field, potentially leading to more robust, versatile, and accessible language models for a wide array of applications.
The survey shows that GPT-3.5-turbo and GPT-4 are top-rated for safety, mainly because they use a method called Reinforcement Learning from Human Feedback (RLHF). RLHF works by first gathering human feedback on which kinds of responses people prefer. That preference data is then used to train a reward model that learns to score responses the way humans would. Finally, the reward model guides further training of the LLM so that it responds in ways people prefer and avoids answers that are rude or biased.
However, RLHF needs a lot of detailed human feedback, which is expensive to collect, so it is rarely used in open-source LLMs.
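To make the reward-modeling step of RLHF more concrete, here is a minimal sketch (in PyTorch) of training a reward model on pairwise human preferences with the standard pairwise ranking loss. The tiny model, the random "preference" batch, and the hyperparameters are illustrative assumptions, not the recipe behind GPT-3.5-turbo or GPT-4.

```python
# Minimal sketch of RLHF's reward-modeling step: train r(x, y) so that responses
# humans preferred score higher than rejected ones, via the pairwise loss
#   loss = -log sigmoid(r(chosen) - r(rejected)).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRewardModel(nn.Module):
    """Toy stand-in for a transformer-based reward model: embeds token ids,
    mean-pools them, and maps the result to a single scalar reward."""
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        pooled = self.embed(token_ids).mean(dim=1)
        return self.score(pooled).squeeze(-1)      # (batch,) scalar rewards

reward_model = TinyRewardModel()
optimizer = torch.optim.AdamW(reward_model.parameters(), lr=1e-4)

# Hypothetical preference batch: token ids for (prompt + chosen) and (prompt + rejected).
chosen = torch.randint(0, 1000, (8, 32))
rejected = torch.randint(0, 1000, (8, 32))

for _ in range(100):
    loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full RLHF pipeline, a reward model trained this way then guides a reinforcement learning loop (e.g. PPO) over the LLM's own generations; collecting the human preference pairs that feed this step is the expensive part the survey points to.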
Another useful insight from the survey is its write-up of best practices for training Large Language Models (LLMs), a process that is both complex and resource-intensive, involving data collection, model design, and training. Despite the trend of releasing open-source LLMs, the exact practices behind the leading models often remain undisclosed.
Data: LLM pre-training uses trillions of data tokens from public sources, with a focus on ethical considerations like excluding personal information. Fine-tuning, though using less data, emphasizes higher quality for improved performance, especially in specialized areas.
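As a rough illustration of the "excluding personal information" practice mentioned above, pre-training pipelines typically run a filtering pass over raw documents before tokenization. The regexes and placeholder labels below are simplified assumptions for illustration; production pipelines use far more thorough PII detection.

```python
# Simplified sketch of a PII-scrubbing pass over pre-training documents.
# The patterns here are illustrative only; real pipelines detect many more PII types.
import re

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "ssn":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub_document(text: str) -> str:
    """Replace likely PII spans with typed placeholders before the text
    enters the pre-training corpus."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}_REMOVED]", text)
    return text

print(scrub_document("Contact Jane at jane.doe@example.com or +1 (555) 123-4567."))
# -> Contact Jane at [EMAIL_REMOVED] or [PHONE_REMOVED].
```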
Model Architecture: Most LLMs use a decoder-only transformer architecture, with various techniques like Llama-2's Ghost attention for better dialogue control and sliding window attention for longer contexts.
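To illustrate the sliding window attention idea, the sketch below builds the attention mask for a decoder-only transformer in which each token attends only to itself and the previous `window - 1` tokens rather than the entire prefix. This is a generic illustration of the masking pattern, not any particular model's implementation.

```python
# Sketch of a causal sliding-window attention mask for a decoder-only transformer.
# Query position i may attend to key positions j with i - window < j <= i,
# which keeps per-token attention cost bounded for long contexts.
import torch

def sliding_window_causal_mask(seq_len: int, window: int) -> torch.Tensor:
    """Return a (seq_len, seq_len) boolean mask; True means 'may attend'."""
    i = torch.arange(seq_len).unsqueeze(1)   # query positions
    j = torch.arange(seq_len).unsqueeze(0)   # key positions
    causal = j <= i                          # never attend to future tokens
    in_window = (i - j) < window             # only the last `window` tokens
    return causal & in_window

print(sliding_window_causal_mask(seq_len=6, window=3).int())
# tensor([[1, 0, 0, 0, 0, 0],
#         [1, 1, 0, 0, 0, 0],
#         [1, 1, 1, 0, 0, 0],
#         [0, 1, 1, 1, 0, 0],
#         [0, 0, 1, 1, 1, 0],
#         [0, 0, 0, 1, 1, 1]])
```

In practice this mask (or an equivalent rolling key-value cache) is applied inside each attention layer before the softmax.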
Training: Supervised fine-tuning (SFT) is crucial, with tens of thousands of annotations proving sufficient, as seen in Llama-2's 27,540 annotations. Quality and diversity of data are vital. Reinforcement Learning from Human Feedback (RLHF) uses algorithms like proximal policy optimization (PPO) to align model behavior with human preferences, enhancing safety. An alternative to PPO is direct preference optimization (DPO).
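Because the survey flags direct preference optimization (DPO) as an alternative to PPO, here is a minimal sketch of the DPO objective: the policy is optimized directly on preference pairs using log-probability ratios against a frozen reference (SFT) model, with no separate reward model or RL rollout loop. The tensors below are placeholder per-sequence log-probabilities that a real implementation would compute from model outputs.

```python
# Minimal sketch of the DPO loss, assuming we already have summed log-probabilities
# of the chosen and rejected responses under the policy being trained and under a
# frozen reference (SFT) model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    """Push the policy's log-prob margin for the chosen response above the
    reference model's margin, scaled by beta."""
    policy_margin = policy_chosen_logps - policy_rejected_logps
    ref_margin = ref_chosen_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Illustrative log-probabilities for a batch of four preference pairs.
loss = dpo_loss(
    policy_chosen_logps=torch.tensor([-12.0, -9.5, -14.2, -11.0]),
    policy_rejected_logps=torch.tensor([-13.5, -10.0, -13.8, -12.7]),
    ref_chosen_logps=torch.tensor([-12.5, -9.8, -14.0, -11.4]),
    ref_rejected_logps=torch.tensor([-13.0, -10.1, -14.1, -12.2]),
)
print(loss)
```

Folding the preference signal directly into this loss is what lets DPO skip the separate reward model and PPO rollouts that full RLHF requires.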
Data Contamination during Pre-training: The increasing issue of data contamination, where pre-training corpora include benchmark data, affects perceptions of LLMs' generalization abilities. Addressing this requires exploring the overlap between benchmarks and pre-training data and assessing overfitting to enhance LLM reliability. Future directions include standardizing disclosure of pre-training data and developing methods to mitigate data contamination.
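A simple way to start exploring the benchmark/pre-training overlap mentioned above is an n-gram containment check. The sketch below is deliberately naive and only illustrative: real decontamination work normalizes text, hashes n-grams, and scans corpora with trillions of tokens.

```python
# Naive sketch of flagging benchmark examples that share long n-grams with a
# pre-training corpus; illustrative only, not a production decontamination pipeline.
def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def contamination_rate(benchmark_examples: list[str],
                       pretraining_docs: list[str], n: int = 8) -> float:
    """Fraction of benchmark examples sharing at least one n-gram with the corpus."""
    corpus_ngrams: set[tuple[str, ...]] = set()
    for doc in pretraining_docs:
        corpus_ngrams |= ngrams(doc, n)
    flagged = sum(1 for ex in benchmark_examples if ngrams(ex, n) & corpus_ngrams)
    return flagged / max(len(benchmark_examples), 1)
```

A high rate on a benchmark would suggest its scores reflect memorization rather than generalization, which is exactly the reliability concern the survey raises.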
Closed-Source Development of Alignment: RLHF for alignment is gaining attention, but few open-source LLMs use it because high-quality preference data and pre-trained reward models are scarce. Addressing the scarcity of diverse, high-quality preference data in complex scenarios remains a challenge.
Difficulty of Continuous Improvement: Enhancing fundamental LLM abilities faces challenges such as the high cost of exploring data mixtures and a reliance on knowledge distillation from closed-source models, which may mask scalability issues. Future research could explore novel methods, such as unsupervised learning, to advance LLM abilities while managing these challenges and costs.
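To make the knowledge-distillation point concrete: many open-source instruction-tuned models are trained on responses generated by a stronger teacher model. The sketch below assumes a hypothetical `teacher_generate` function standing in for a closed-source API call and simply writes out an instruction-tuning dataset built from the teacher's outputs.

```python
# Sketch of dataset-level knowledge distillation: collect a stronger teacher model's
# responses to a prompt pool, then use them as supervised fine-tuning data for a
# smaller open model. `teacher_generate` is a hypothetical placeholder, not a real API.
import json

def teacher_generate(prompt: str) -> str:
    """Placeholder for querying a closed-source teacher model."""
    raise NotImplementedError("swap in a real API call here")

def build_distillation_dataset(prompts: list[str], out_path: str) -> None:
    """Write (instruction, response) pairs as JSONL for supervised fine-tuning."""
    with open(out_path, "w") as f:
        for prompt in prompts:
            record = {"instruction": prompt, "response": teacher_generate(prompt)}
            f.write(json.dumps(record) + "\n")
```

The survey's concern is that scores obtained this way can hide whether a model would keep improving if it were scaled independently of its teacher.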
I. Read and Learn: Dive into the detailed survey comparing open-source LLMs with ChatGPT. It’s a treasure trove of information for anyone interested in the future of AI.
II. Engage with AI Safely: Understand the importance of safety in AI interactions. The use of Reinforcement Learning from Human Feedback (RLHF) in models like GPT-3.5-turbo and GPT-4 helps keep responses respectful and unbiased. Explore how this can be implemented more broadly in AI technology.
III. Contribute to AI Development: For those in the field, consider the challenges in training LLMs, like data contamination and the need for diverse, quality data. There’s a call to action for developing methods to mitigate these challenges and to explore new training techniques, such as unsupervised learning, for the continuous improvement of LLMs.