|
Today, we are announcing the availability of Llama 3.1 models on Amazon Bedrock. The Llama 3.1 models are Meta’s most advanced and capable models to date: a collection of 8B, 70B, and 405B parameter models that demonstrate state-of-the-art performance on a wide range of industry benchmarks and bring new capabilities to generative artificial intelligence (generative AI) applications.
All Llama 3.1 models support a 128K context length, an increase of 120K tokens over Llama 3’s 8K (16 times larger), and they improve inference for multilingual conversational use cases in eight languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
You can now build, experiment, and responsibly scale your generative AI ideas using three new Llama 3.1 models from Meta on Amazon Bedrock.
- Llama 3.1 405B (Preview) is the world’s largest publicly available large language model (LLM), according to Meta. The model sets a new standard for AI and is ideal for enterprise-grade applications and research and development (R&D). It is well suited to tasks such as synthetic data generation, where model outputs can be used to improve smaller Llama models, and to model distillation, which transfers knowledge from the 405B model to smaller models. This model excels at general knowledge, long-form text generation, multilingual translation, machine translation, coding, math, tool use, enhanced contextual understanding, and advanced reasoning and decision making. For more information, see How to Generate Synthetic Data for Model Distillation Using Llama 3.1 405B on the AWS Machine Learning Blog.
- Llama 3.1 70B is ideal for content creation, conversational AI, language understanding, R&D, and enterprise applications. The model excels at accurate text summarization, text classification, sentiment analysis and nuanced reasoning, language modeling, dialogue systems, code generation, and instruction following.
- Llama 3.1 8B is best suited for environments with limited compute power and resources. The model excels at text summarization, text classification, sentiment analysis, and language translation requiring low-latency inference.
Meta measured the performance of Llama 3.1 on over 150 benchmark datasets spanning a wide range of languages, as well as on extensive human evaluations. As the chart below shows, Llama 3.1 outperforms Llama 3 in every major benchmark category.
To learn more about the features and capabilities of Llama 3.1, see the Llama 3.1 model card from Meta and Llama models in the AWS documentation.
Llama 3.1’s responsible AI capabilities, combined with Amazon Bedrock’s data governance and model evaluation capabilities, enable you to confidently build secure and trustworthy generative AI applications.
- Guardrails for Amazon Bedrock – Create multiple guardrails with different configurations tailored to your specific use cases and responsible AI policies, implementing custom safeguards that promote safe interactions between users and your generative AI applications. Guardrails for Amazon Bedrock lets you continuously monitor and analyze user inputs and model responses that might violate customer-defined policies, detect hallucinations in model responses that are not grounded in enterprise data or relevant to the user’s query, and evaluate across a variety of models, including custom and third-party models. To get started, visit Creating Guardrails in the AWS documentation.
- Model Evaluation on Amazon Bedrock – Evaluate, compare, and select the best Llama model for your use case in just a few steps, using either automated or human evaluation. With model evaluation in Amazon Bedrock, you can choose automated evaluation with predefined metrics such as accuracy, robustness, and toxicity, or a human evaluation workflow for subjective or custom metrics such as relevance, style, and alignment to brand voice. Model evaluation provides built-in curated datasets, or you can bring your own dataset. To get started, visit Getting Started with Model Evaluation in the AWS documentation.
To learn more about how AWS keeps your data and applications secure and private, visit the Amazon Bedrock Security and Privacy page.
Getting started with Llama 3.1 models on Amazon Bedrock
If you are new to Meta’s Llama models, go to the Amazon Bedrock console and choose Model access in the bottom left navigation pane. To access the latest Llama 3.1 models from Meta, request access separately for Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, or Llama 3.1 405B Instruct.
To request access to the Llama 3.1 405B preview on Amazon Bedrock, contact your AWS account team or submit a support ticket through the AWS Management Console. When creating the support ticket, select Amazon Bedrock as the Service and Models as the Category.
To test the Llama 3.1 models in the Amazon Bedrock console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model, and select Meta as the category and Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, or Llama 3.1 405B Instruct as the model.
In the following example, we selected the Llama 3.1 405B Instruct model.
By choosing View API request, you can also access the models using code examples for the AWS Command Line Interface (AWS CLI) and the AWS SDKs. You can use the following model IDs: meta.llama3-1-8b-instruct-v1:0, meta.llama3-1-70b-instruct-v1:0, or meta.llama3-1-405b-instruct-v1:0.
Below is a sample AWS CLI command.
aws bedrock-runtime invoke-model \
--model-id meta.llama3-1-405b-instruct-v1:0 \
--body "{\"prompt\":\" [INST]You are a very intelligent bot with exceptional critical thinking[/INST] I went to the market and bought 10 apples. I gave 2 apples to your friend and 2 to the helper. I then went and bought 5 more apples and ate 1. How many apples did I remain with? Let's think step by step.\",\"max_gen_len\":512,\"temperature\":0.5,\"top_p\":0.9}" \
--cli-binary-format raw-in-base64-out \
--region us-east-1 \
invoke-model-output.txt
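If you would rather build the request body in code than hand-escape JSON, the same payload can be assembled in Python. This is a minimal sketch: the field names (prompt, max_gen_len, temperature, top_p) come from the CLI example above, and the resulting string is what you would pass as the body of an invoke_model call (which itself requires AWS credentials, so it is not shown here).

```python
import json

# Build the same request body as the --body argument of the CLI command above.
body = {
    "prompt": (
        "[INST]You are a very intelligent bot with exceptional critical "
        "thinking[/INST] I went to the market and bought 10 apples. "
        "I gave 2 apples to your friend and 2 to the helper. I then went and "
        "bought 5 more apples and ate 1. How many apples did I remain with? "
        "Let's think step by step."
    ),
    "max_gen_len": 512,
    "temperature": 0.5,
    "top_p": 0.9,
}

# Serialize to the JSON string expected by the Bedrock Runtime invoke_model body.
payload = json.dumps(body)
print(payload)
```

Serializing with json.dumps avoids the backslash escaping needed when the JSON is written inline in a shell command.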
You can use the AWS SDK to build applications in a variety of programming languages using code examples for the Llama model on Amazon Bedrock. The following Python code example demonstrates how to send a text message to Llama for text generation using the Amazon Bedrock Converse API.
import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Set the model ID, e.g., Llama 3.1 405B Instruct.
model_id = "meta.llama3-1-405b-instruct-v1:0"

# Start a conversation with the user message.
user_message = "Describe the purpose of a 'hello world' program in one line."
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)
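The nested lookup in the example above can be wrapped in a small helper. This is a sketch assuming the Converse API response shape used in the example (output → message → content, a list of blocks that may each carry a text field); the sample response below is hand-built for illustration, not output from a real API call.

```python
def extract_converse_text(response):
    """Concatenate all text blocks from a Converse API response dict.

    Assumes response["output"]["message"]["content"] is a list of content
    blocks, each of which may carry a "text" field.
    """
    blocks = response["output"]["message"]["content"]
    return "".join(block["text"] for block in blocks if "text" in block)

# Hand-built sample response for illustration (not from a real API call).
sample = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "A 'hello world' program verifies that "},
                {"text": "your toolchain works end to end."},
            ],
        }
    }
}

print(extract_converse_text(sample))
```

Concatenating every text block, rather than reading only index 0, also covers responses where the model returns more than one content block.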
All Llama 3.1 models (8B, 70B, and 405B) are also available in Amazon SageMaker JumpStart. You can discover and deploy Llama 3.1 models in just a few clicks in Amazon SageMaker Studio, or deploy them programmatically with the SageMaker Python SDK. You can operate your models with SageMaker features such as SageMaker Pipelines, SageMaker Debugger, and container logs, and with Virtual Private Cloud (VPC) controls that help provide data security.
Fine-tuning for Llama 3.1 models is coming soon to Amazon Bedrock and Amazon SageMaker JumpStart. Once you build a fine-tuned model in SageMaker JumpStart, you can also import your custom model to Amazon Bedrock. For more information, see Meta Llama 3.1 models are now available in Amazon SageMaker JumpStart on the AWS Machine Learning Blog.
For customers who want greater flexibility and control over the underlying resources and run Llama 3.1 models through their own self-managed machine learning workflows, Amazon Elastic Compute Cloud (Amazon EC2) instances based on AWS Trainium and AWS Inferentia provide a high-performance, cost-effective way to deploy Llama 3.1 models on AWS. For more information, see AWS AI Chip Provides High Performance and Low Cost for Meta Llama 3.1 Models on AWS on the AWS Machine Learning Blog.
To celebrate this launch, Meta’s Business Development Manager, Parkin Kent, talks about the power of the collaboration between Meta and Amazon, highlighting how they’re working together to push the boundaries of what’s possible with generative AI.
Learn how enterprises are leveraging the power of generative AI with Llama models on Amazon Bedrock. Nomura, a global financial services group with operations across 30 countries and regions, is democratizing generative AI across its organization using Llama models on Amazon Bedrock.
Available Now
Meta’s Llama 3.1 8B and 70B models are generally available, and the Llama 3.1 405B model is available in preview today on Amazon Bedrock in the US West (Oregon) Region. To request access to the Llama 3.1 405B preview on Amazon Bedrock, contact your AWS account team or submit a support ticket. Check back for updates to the full list of Regions. For more information, see the Llama product page on Amazon Bedrock and the Amazon Bedrock pricing page.
Try Llama 3.1 today in the Amazon Bedrock console, and send us feedback through AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.
Visit the community.aws site to find in-depth technical content and learn how the Builder community is using Amazon Bedrock in their solutions. Let us know what you’ve built with Llama 3.1 on Amazon Bedrock!
— Channy