Paginating AWS API Results using the Boto3 Python SDK

21. December 2016

Author: Doug Ireton

Boto3 is Amazon’s officially supported AWS SDK for Python. It’s the de facto way to interact with AWS via Python.

If you’ve used Boto3 to query AWS resources, you may have run into limits on how many resources a query to the specified AWS API will return, generally 50 or 100 results, although S3 will return up to 1000 results. The AWS APIs return “pages” of results. If you are trying to retrieve more than one “page” of results you will need to use a paginator to issue multiple API requests on your behalf.

Introduction

Boto3 provides Paginators to automatically issue multiple API requests to retrieve all the results (e.g. on an API call to EC2.DescribeInstances). Paginators are straightforward to use, but not all Boto3 services provide paginator support. For those services you’ll need to write your own paginator in Python.

In this post, I’ll show you how to retrieve all query results for Boto3 services which provide Pagination support, and I’ll show you how to write a custom paginator for services which don’t provide built-in pagination support.

Built-In Paginators

Most services in the Boto3 SDK provide Paginators. See S3 Paginators for example.

Once you determine you need to paginate your results, you’ll need to call the get_paginator() method.

How do I know I need a Paginator?

If you suspect you aren’t getting all the results from your Boto3 API call, there are a couple of ways to check. You can look in the AWS console (e.g. number of Running Instances), or run a query via the aws command-line interface.

Here’s an example of querying an S3 bucket via the AWS command-line. Boto3 will return the first 1000 S3 objects from the bucket, but since there are a total of 1002 objects, you’ll need to paginate.

Counting results using the AWS CLI

Here’s a boto3 example which, by default, will return the first 1000 objects from a given S3 bucket.

Determining if the results are truncated

The S3 response dictionary provides some helpful properties, like IsTruncated, KeyCount, and MaxKeys which tell you if the results were truncated. If resp['IsTruncated'] is True, you know you’ll need to use a Paginator to return all the results.
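As a rough illustration of the check described above, the following sketch makes a single list_objects_v2 call and reports the three fields. The helper name summarize_listing and the bucket name in the usage comment are hypothetical, and the client is passed in as an argument so the function can be tried without AWS credentials.

```python
def summarize_listing(s3_client, bucket_name):
    """Make one list_objects_v2 call and report whether the listing was cut short."""
    resp = s3_client.list_objects_v2(Bucket=bucket_name)
    return {
        "key_count": resp.get("KeyCount", 0),            # keys returned in this response
        "max_keys": resp.get("MaxKeys"),                 # per-request ceiling applied by S3
        "is_truncated": resp.get("IsTruncated", False),  # True when more keys remain
    }


# Typical usage (needs boto3 and AWS credentials; the bucket name is made up):
#   import boto3
#   summary = summarize_listing(boto3.client("s3"), "my-example-bucket")
#   if summary["is_truncated"]:
#       print("More objects remain; use a paginator to fetch them all.")
```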

Using Boto3’s Built-In Paginators

The Boto3 documentation provides a good overview of how to use the built-in paginators, so I won’t repeat it here.

If a given service has Paginators built-in, they are documented in the Paginators section of the service docs, e.g. AutoScaling and EC2.
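For quick reference, here is a minimal sketch of the pattern those docs describe, applied to S3. The function name list_all_keys is hypothetical, and the client is passed in rather than created inside the function. get_paginator() and paginate() are the Boto3 client and paginator calls.

```python
def list_all_keys(s3_client, bucket_name):
    """Return every object key in the bucket, following all result pages."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    # paginate() yields one response dictionary per page and issues the
    # follow-up requests for us.
    for page in paginator.paginate(Bucket=bucket_name):
        # "Contents" is absent when a page (or the whole bucket) has no objects.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


# Typical usage (needs boto3 and AWS credentials; the bucket name is made up):
#   import boto3
#   keys = list_all_keys(boto3.client("s3"), "my-example-bucket")
#   print(len(keys))
```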

Determine if a method can be paginated

You can also verify if the boto3 service provides Paginators via the client.can_paginate() method.
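A small sketch of that check, assuming you already have a client object. The helper name pageable_operations is hypothetical; can_paginate() is the client method mentioned above and takes the snake_case operation name.

```python
def pageable_operations(client, operation_names):
    """Map each operation name to True/False, as reported by can_paginate()."""
    return {name: client.can_paginate(name) for name in operation_names}


# Typical usage (needs boto3; the operation names are only examples):
#   import boto3
#   s3 = boto3.client("s3")
#   print(pageable_operations(s3, ["list_objects_v2", "get_object"]))
```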


So, that’s it for built-in paginators. In this section I showed you how to determine if your API results are being truncated, pointed you to Boto3’s excellent documentation on Paginators, and showed you how to use the can_paginate() method to verify if a given service method supports pagination.

If the Boto3 service you are using provides paginators, you should use them. They are tested and well documented. In the next section, I’ll show you how to write your own paginator.

How to Write Your Own Paginator

Some Boto3 services, such as AWS Config, don’t provide paginators. For these services, you will have to write your own paginator code in Python to retrieve all the query results. In this section, I’ll show you how to write your own paginator.

You Might Need To Write Your Own Paginator If…

Some Boto3 SDK services aren’t as built-out as S3 or EC2. For example, the AWS Config service doesn’t provide paginators. The first clue is that the Boto3 AWS ConfigService docs don’t have a “Paginators” section.

The can_paginate Method

You can also ask the individual service client’s can_paginate method if it supports paginating. For example, here’s how to do that for the AWS config client. In the example below, we determine that the config service doesn’t support paginating for the get_compliance_details_by_config_rule method.

Operation Not Pageable Error

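If you try to paginate a method without a built-in paginator, get_paginator() raises an OperationNotPageableError (from botocore.exceptions), with a message similar to “Operation cannot be paginated: get_compliance_details_by_config_rule”.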

If you get an error like this, it’s time to roll up your sleeves and write your own paginator.
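One way to avoid hitting this error at runtime is to ask can_paginate() before calling get_paginator(). The sketch below assumes that approach; get_paginator_if_available is a hypothetical helper name, not part of Boto3.

```python
def get_paginator_if_available(client, operation_name):
    """Return the built-in paginator, or None when the operation has none."""
    if client.can_paginate(operation_name):
        return client.get_paginator(operation_name)
    # No built-in paginator: the caller has to page through results manually.
    return None


# Typical usage (needs boto3 and AWS credentials):
#   import boto3
#   config = boto3.client("config")
#   paginator = get_paginator_if_available(
#       config, "get_compliance_details_by_config_rule")
#   if paginator is None:
#       print("No built-in paginator; fall back to a hand-written loop.")
```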

Writing a Paginator

Writing a paginator is fairly straightforward. When you call the AWS service API, it returns up to the maximum number of results for one page, plus a long hex string token, next_token, if there are more results.

Approach

To create a paginator for this, you make calls to the service API in a loop until next_token is empty, collecting the results from each loop iteration in a list. At the end of the loop, you will have all the results in the list.
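The approach above can be sketched generically as follows. The function name collect_all and its parameters are hypothetical. It assumes the API method accepts the token under the same key it returns it under (NextToken here), and that the token is simply absent on the last page.

```python
def collect_all(fetch_page, items_key, token_key="NextToken", **kwargs):
    """Call fetch_page repeatedly, gathering items until no token is returned."""
    items = []
    next_token = None
    while True:
        if next_token:
            # Hand back the token from the previous response to get the next page.
            kwargs[token_key] = next_token
        page = fetch_page(**kwargs)
        items.extend(page.get(items_key, []))
        next_token = page.get(token_key)
        if not next_token:
            break  # no token means this was the last page
    return items


# Typical usage (needs boto3 and AWS credentials):
#   import boto3
#   config = boto3.client("config")
#   results = collect_all(
#       config.get_compliance_details_by_config_rule,
#       "EvaluationResults",
#       ConfigRuleName="required-tags",
#       ComplianceTypes=["NON_COMPLIANT"],
#   )
```

Passing the bound client method in as fetch_page keeps the loop independent of any one service, at the cost of having to name the result and token keys for each API.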

In the example code below, I’m calling the AWS Config service to get a list of resources (e.g. EC2 instances), which are not compliant with the required-tags Config rule.

As you read the example code below, it might help to read the Boto3 SDK docs for the get_compliance_details_by_config_rule method, especially the “Response Syntax” section.

Example Paginator
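The listing below is a minimal sketch of such a paginator, written to match the walkthrough in the following sections rather than a verbatim copy of any original code. It assumes the response carries its results under EvaluationResults and its token under NextToken. The optional config argument to main() is an addition for illustration, so the loop can be exercised with a stand-in client.

```python
def get_resources_from(compliance_details):
    """Extract one batch of resources and the next-page token from a response."""
    results = compliance_details.get("EvaluationResults", [])
    current_batch = [
        result["EvaluationResultIdentifier"]["EvaluationResultQualifier"]
        for result in results
    ]
    # NextToken is missing from the final page, so .get() returns None there.
    next_token = compliance_details.get("NextToken")
    return current_batch, next_token


def main(config=None):
    """Return all resources that are NON_COMPLIANT with the required-tags rule."""
    if config is None:
        import boto3  # imported here so the helper above has no AWS dependency
        config = boto3.client("config")

    next_token = None  # our "claim check" for the next page of results
    resources = []     # holds the final result set

    while True:
        request = {
            "ConfigRuleName": "required-tags",
            "ComplianceTypes": ["NON_COMPLIANT"],
            "Limit": 100,
        }
        if next_token:
            request["NextToken"] = next_token
        compliance_details = config.get_compliance_details_by_config_rule(**request)
        current_batch, next_token = get_resources_from(compliance_details)
        resources += current_batch
        if not next_token:
            break
    return resources


# Typical usage (needs boto3 and AWS credentials):
#   for resource in main():
#       print(resource["ResourceType"], resource["ResourceId"])
```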

Example Paginator – main() Method

In the example above, the main() method creates the config client and initializes the next_token variable. The resources list will hold the final result set.

The while loop is the heart of the paginating code. In each loop iteration, we call the get_compliance_details_by_config_rule method, passing next_token as a parameter. Again, next_token is a long hex string returned by the given AWS service API method. It’s our “claim check” for the next set of results.

Next, we extract the current_batch of AWS resources and the next_token string from the compliance_details dictionary returned by our API call.

Example Paginator – get_resources_from() Helper Method

get_resources_from(compliance_details) is an extracted helper method for parsing the compliance_details dictionary. It returns our current batch (up to 100 results) of resources and our next_token “claim check” so we can get the next page of results from config.get_compliance_details_by_config_rule().

I hope the example is helpful in writing your own custom paginator.


In this section on writing your own paginators I showed you a Boto3 documentation example of a service without built-in Paginator support. I discussed the can_paginate method and showed you the error you get if you try to paginate a method which doesn’t support pagination. Finally, I discussed an approach for writing a custom paginator in Python and showed a concrete example of a custom paginator which passes the NextToken “claim check” string to fetch the next page of results.

Summary

In this post, I covered paginating AWS API responses with the Boto3 SDK. Like most APIs (Twitter, GitHub, Atlassian, etc.), AWS paginates API responses over a set limit, generally 50 or 100 resources. Knowing how to paginate results is crucial when dealing with large AWS accounts which may contain thousands of resources.

I hope this post has taught you a bit about paginators and how to get all your results from the AWS APIs.

About the Author

Doug Ireton is a Sr. DevOps engineer at 1Strategy, an AWS Consulting Partner specializing in Amazon Web Services (AWS). He has 23 years of experience in IT, working at Microsoft, Washington Mutual Bank, and Nordstrom in diverse roles including tester, Windows Server engineer, developer, and Chef engineer, helping app and platform teams manage thousands of servers via automation.