This is the second post in the series Exploring the Vocabulary of Gen AI, and it continues the work I started here in my first post, where I provided an overview of AI terminology, including:
- Artificial Intelligence (AI)
- Machine Learning
- Artificial Neural Network (ANN)
- Deep Learning
- Generative AI (GAI)
- Foundation model
- Large Language Models (LLMs)
- Natural Language Processing (NLP)
- Transformer Model
- Generative Pre-trained Transformer (GPT)
Responsible AI
Responsible AI sets out principles and practices for using AI so that it is adopted, implemented, and operated fairly, legally, and ethically, and so that it provides trust and transparency to businesses and customers. Considerations about how AI is used, and the impact it may have on people, should be governed and controlled by rules and frameworks, and trust and confidence should be built into every model and application built on AI.
Labeled data
Labeled data helps machine learning models and algorithms process and learn from raw data. Data is ‘labeled’ because it carries tags and features that describe the target data, making it useful and informative. For example, if you have a picture of a tiger, you can label it as ‘tiger’. This gives the raw data context and helps the ML model learn to recognize other images of tigers. The raw input data can be text, images, videos, and so on, and it usually requires human intervention to label it correctly.
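As a rough illustration, a labeled dataset can be as simple as pairing each raw input with its tag. Here is a minimal Python sketch, with made-up file names, of what that pairing might look like before it is fed to a model:

```python
# A minimal sketch of labeled data: each raw input is paired with a tag
# describing what it represents. File names and labels are illustrative only.
labeled_images = [
    {"file": "photo_001.jpg", "label": "tiger"},
    {"file": "photo_002.jpg", "label": "elephant"},
    {"file": "photo_003.jpg", "label": "tiger"},
]

# The labels become the 'answers' a supervised model later learns to predict.
for example in labeled_images:
    print(f"{example['file']} -> {example['label']}")
```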
Supervised learning
Supervised learning is a machine learning method that uses a large set of labeled data to predict output values. Over time, the algorithm learns a mapping function that captures the relationship between the labeled input data and the predicted output. The process is considered ‘supervised’ because the algorithm is corrected whenever it produces an incorrect output mapping from the input data as it learns. For example, if it sees a picture of a lion and classifies it as a tiger, the algorithm is corrected and the example is sent back through so it can learn again.
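To make this concrete, here is a minimal supervised-learning sketch in Python, assuming scikit-learn is installed. Labeled examples train a classifier, and the fitting process repeatedly corrects the model until its predicted labels match the known ones as closely as possible:

```python
# Minimal supervised learning: features plus known labels train a classifier.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                  # features and their labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)                        # learn the input -> label mapping
print("accuracy:", model.score(X_test, y_test))    # check it against held-out labels
```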
Unsupervised learning
Unsupervised learning differs from supervised learning in that it does not use labeled data. Instead, the model is given complete autonomy to identify features of the unlabeled data and the differences, structures, and relationships between the data points. For example, if the unlabeled data contains images of tigers, elephants, and giraffes, the machine learning model must identify and compare features and attributes in each photo, such as color, pattern, facial features, size, and shape, to tell the images apart.
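A minimal unsupervised-learning sketch, again assuming scikit-learn is installed, looks like this: no labels are provided, and a clustering algorithm groups the data points purely by the structure it finds in their features:

```python
# Minimal unsupervised learning: cluster unlabeled points by their structure.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)  # labels discarded

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(X)        # each point is assigned to a cluster
print(clusters[:10])                    # cluster ids discovered without any labels
```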
Semi-supervised learning
This is a learning method that combines supervised and unsupervised techniques, using both labeled and unlabeled data in the process. Typically the labeled dataset is much smaller than the unlabeled one, which avoids having to tag a huge amount of data. The smaller labeled set is used to help train the model, and unsupervised learning techniques are then used to help classify the remaining data points.
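Here is a minimal semi-supervised sketch, assuming scikit-learn is installed. Only a small slice of the data keeps its labels; the rest is marked as unlabeled, and a self-training wrapper assigns labels to it as the model learns:

```python
# Minimal semi-supervised learning: a few labels, many unlabeled points (-1).
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
y_partial = y.copy()
rng = np.random.default_rng(0)
unlabeled = rng.random(len(y)) > 0.2      # hide roughly 80% of the labels
y_partial[unlabeled] = -1                 # -1 marks a point as unlabeled

model = SelfTrainingClassifier(SVC(probability=True))
model.fit(X, y_partial)                   # trains on labels, then labels the rest
print("points that started unlabeled:", int(unlabeled.sum()))
```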
Prompt Engineering
Prompt engineering allows you to refine the input prompts you give to large language models to produce the most appropriate output. Prompt engineering techniques let you optimize prompts to improve the performance of generative AI models on specific tasks. By adjusting and changing input prompts, you can shape the output and behavior of AI responses to make them more relevant. Prompt engineering is a discipline that can transform the way humans interact with AI.
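As a simple illustration, here is a before/after of a refined prompt. The `generate()` call is a hypothetical placeholder for whichever model API you happen to be using:

```python
# Illustrative prompt engineering: the refined prompt constrains role, format,
# and audience, which typically produces a more relevant response.
vague_prompt = "Tell me about tigers."

refined_prompt = (
    "You are a wildlife educator. In exactly three bullet points, explain "
    "how tigers differ from lions, using plain language suitable for a "
    "ten-year-old reader."
)

print(vague_prompt)
print(refined_prompt)
# response = generate(refined_prompt)   # hypothetical call to your model API
```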
Prompt Chaining
Prompt chaining is a technique used with large language models and NLP that allows conversational interactions to build on previous responses and inputs. It creates contextual awareness through a series of consecutive prompts, producing human-like language exchanges and interactions, and as a result it is often implemented successfully in chatbots. It improves the user experience because the model responds to bite-sized blocks of data (multiple prompts) instead of working from a single, comprehensive prompt that can be difficult to respond to.
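A minimal sketch of the idea: each new prompt carries the prior exchange, so the model answers in context. The `ask_model()` function here is a hypothetical stand-in for a real LLM call:

```python
# Minimal prompt chaining: the conversation history is re-sent with each turn.
def ask_model(messages):
    """Hypothetical placeholder for sending the conversation to an LLM."""
    return f"(model reply to: {messages[-1]['content']!r})"

conversation = [{"role": "user", "content": "Summarise what RAG is in one sentence."}]
reply = ask_model(conversation)
conversation.append({"role": "assistant", "content": reply})

# The follow-up prompt is answered in the context of the earlier exchange.
conversation.append({"role": "user", "content": "Now give me an example use case."})
print(ask_model(conversation))
```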
Retrieval Augmented Generation (RAG)
RAG is a framework used within AI that lets you supply additional factual data to the foundation model from external sources, helping it generate responses with up-to-date information. Since a foundation model is only as good as the data it was trained on, if there are gaps or inaccuracies in its responses you can supplement the model with additional external data so it has up-to-date, reliable, and accurate information to work with. For example, if you ask a question like, “What is the latest stock price for Amazon?”, RAG takes that question, uses external sources to find the information, and then generates a response; this up-to-date information is not stored in the foundation model you are using.
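The sketch below shows the basic shape of the idea: retrieve a relevant snippet from an external source, then put it into the prompt before calling the model. The documents, figures, and `ask_model()` call are illustrative placeholders, not a real retrieval system:

```python
# Minimal RAG shape: retrieve external context, then augment the prompt with it.
import string

documents = [
    "Amazon (AMZN) stock closing price on 2024-05-01: $180.00 (illustrative figure).",
    "The Amazon rainforest spans roughly 5.5 million square kilometres.",
]

def retrieve(question, docs):
    """Naive keyword retrieval: pick the document sharing the most words."""
    def words(text):
        return {w.strip(string.punctuation) for w in text.lower().split()}
    q = words(question)
    return max(docs, key=lambda d: len(q & words(d)))

question = "What is the latest stock price for Amazon?"
context = retrieve(question, documents)

prompt = f"Using only this context: {context}\nAnswer the question: {question}"
print(prompt)
# response = ask_model(prompt)   # hypothetical LLM call with the augmented prompt
```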
Parameters
AI parameters are the variables within a machine learning model that the algorithm adjusts during training to optimize performance and generalize the patterns in the data. These values direct the model’s behavior and are tuned to minimize the difference between predicted and actual results.
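A tiny worked example makes this clearer: in the sketch below, `w` and `b` are the model’s only two parameters, and a simple training loop nudges them to shrink the gap between predictions and the actual values:

```python
# Two parameters (w, b) adjusted by gradient descent to fit y = 2x + 1.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]    # (x, y) pairs

w, b = 0.0, 0.0                                # the model's parameters
lr = 0.05                                      # learning rate

for _ in range(2000):
    for x, y in data:
        pred = w * x + b
        error = pred - y                       # predicted minus actual
        w -= lr * error * x                    # adjust parameters to reduce error
        b -= lr * error

print(f"learned parameters: w={w:.2f}, b={b:.2f}")   # close to w=2, b=1
```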
Fine-tuning
Fine-tuning is a technique for improving and enhancing the performance of a pre-trained model on a specific task or dataset. A model initially trained on a large dataset can then be fine-tuned using a smaller, task-specific dataset. This allows the model to adjust its parameters to better adapt to the nuances of the new data, improving accuracy and effectiveness for the target application.
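One common pattern is to freeze most of the pre-trained weights and only train a small task-specific layer on the new data. The sketch below shows that shape using PyTorch (assumed installed); the “backbone” is just a stand-in for a real pre-trained model:

```python
# Fine-tuning sketch: freeze a (placeholder) pre-trained backbone, train a new head.
import torch
import torch.nn as nn

backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU())   # stand-in for a pre-trained model
for p in backbone.parameters():
    p.requires_grad = False                               # keep existing knowledge fixed

head = nn.Linear(32, 3)                                   # new task-specific layer
model = nn.Sequential(backbone, head)

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)  # only the head is updated
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 16)                                   # small task-specific batch
y = torch.randint(0, 3, (64,))
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

print("final loss:", loss.item())
```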
In the next post, we will continue to focus on AI and talk about the following topics:
- Bias
- Hallucination
- Temperature
- Anthropomorphism
- Completion
- Tokens
- Emergence in AI
- Embedding
- Text classification
- Context window