Code assistance for boto3, always up to date and in any IDE

21. December 2018 2018 0

If you’re like me and work with the boto3 sdk to automate your Ops, then you probably are familiar with this sight:

image1

No code completion! It’s almost as useful as coding in Notepad, isn’t it? This is one of the major quirks of the boto3 sdk. Due to its dynamic nature, we don’t get code completion like for other libraries like we are used to.

I used to deal with this by going back and forth with the boto3 docs. However, this impacted my productivity by interrupting my flow all the time. I had recently adopted Python as my primary language and had second thoughts on whether it was the right tool to automate my AWS stuff. Eventually, I even became sick of all the back-and-forth.

A couple of weeks ago, I thought enough was enough. I decided to solve the code completion problem so that I never have to worry about it anymore.

But before starting it, a few naysaying questions cropped up in my head:

  1. How would I find time to support all the APIs that the community and I want?
  2. Will this work be beneficial to people not using the X IDE?
  3. With 12 releases of boto3 in the last 15 days, will this become a full time job to continuously update my solution?

Thankfully, I found a lazy programmer’s solution that I could conceive in a weekend. I put up an open source package and released it on PyPI. I announced this on reddit and within a few hours, I saw this:

image4

Looks like a few people are going to find this useful! 🙂

In this post I will describe botostubs, a package that gives you code completion for boto3, all methods in all APIs. It even automatically supports any new boto3 releases.

Read on to learn a couple of less-used facilities in boto3 that made this project possible. You will also learn how I automated myself out of the job of maintaining botostubs by leveraging a simple deployment pipeline on AWS that costs about $0.05 per month to run.

What’s botostubs?

botostubs is a PyPI package, which you can install to give you code completion for any AWS service. You install it in your Python runtime using pip, add “import botostubs” to your scripts and a type hint for your boto3 clients and you’re good to go:

image18

Now, instead of “no suggestions”, your IDE can offer you something more useful like:

image17

The parameters in boto3 are dynamic too so what about them?

With botostubs, you can now get to know which parameters are supported and also which are required or optional:

image10

Much more useful, right? No more back-and-forth with the boto3 docs, yay!

The above is for Intellij/PyCharm but will this work in other IDEs?

Here are a couple of screenshots of botostubs running in Visual Studio Code:

image2

image3

Looks like it works! You should be able to use botostubs in any IDE that supports code completion from python packages.

Why is this an issue in the first place?

As I mentioned before, the boto3 sdk is dynamic, i.e the methods and APIs don’t exist as code. As it says in the guide,

It uses a data-driven approach to generate classes at runtime from JSON description files …

The SDK maintainers do it to be able to enhance the SDK reliably and faster. This is great for the maintainers but terrible for us, the end users of the SDK.

Therefore, we need statically defined classes and methods. Since boto3 doesn’t work that way, we need a separate solution.

How botostubs works

At a high level, we need a way to discover all the available APIs, find out about the method signatures and package them up as classes in a module.

  1. Get a boto3 session
  2. Loop over its available clients
  3. Find out about each client’s operations
  4. Generate class signatures
  5. Dump them in a Python module

I didn’t know much about boto3 internals before so I had to do some digging on how to accomplish that. You can use what I’ve learnt here if you’re interested in building tools on top of boto3.

First, about the clients. It’s easy when you already know which API you need, e.g with S3, you write:

client = boto3.client(‘s3’)

But for our situation, we don’t know which ones are there in advance. I could have hardcoded them but I need a scalable and foolproof way. I found out that the way to do that is with a session’s get_available_services() facility.

image19

Tip: Much of what I’ve learnt have been though Intellij’s debugger. Very handy especially when having to deal with dynamic code.

image13

For example, to learn what tricks are involved to get the dynamic code to convert to actual API calls to AWS, you can place a breakpoint in _make_api_call found in boto3’s client.py:

image14

Steps 1 and 2 solved. Next, I had to find out which operations are possible in a scalable fashion. For example, the S3 API supports about 98 operations for listing objects, uploading and downloading them. Coding 98 operations is no fun, so I’m forced to get creative.

Digging deeper, I found out that clients have an internal botocore’s service model that had everything that I was looking for. Through the service model you can find the service documentation, api version, etc.

Side note: botocore is a factored out library that is shared with the AWS CLI. Much of what boto3 is capable is actually powered by botocore.

In particular, we can read the available operation names. E.g the service model for the ACM api returns:

image6

Step 3 was therefore solved with:

image23

Next, we need to know what parameters are available for each operation. In boto parlance, they are called “input shapes”. (Similarly, you can get the output shape if needed) Digging some more in the service model source, I found out that we can get the input shape with the operation model:

image12

This told me the required and optional parameters. The missing part of generating the method signatures was then solved. (I don’t need the method body since I’m generating stubs)

Then it was a matter of generating classes based on the clients and operations above and package them in a Python module.

For any version of boto, I had to run my script, run the twine PyPI utility and it will output a PyPI package that’s up to date with upstream boto3. All of that took about 100 lines of Python code.

Another problem remained to be solved though; with a new boto3 release every time you change your t-shirt, I would need to run it and re-upload to PyPI several times a week. So, wouldn’t this become a maintenance hassle for me?

The deployment pipeline

To solve this problem, I looked to AWS itself. The simplest way I found out was to use their build tool and invoke it on a schedule. What I want is a way to get the latest boto3 version, run the script and upload the artefact to PyPI. All without my intervention.

The relevant AWS services to achieve this is Cloudwatch Events (to trigger other services on a schedule), CodeBuild (managed build service in the cloud) and SNS (for email notifications). This is what the architecture looks like on AWS:

image20

Image generated with viz-cfn

The image above describes the CloudFormation template used to deploy on Github as well as the code.

The AWS Codebuild Project looks like this:

image5

image15

To keep my credentials outside of source control, I also attached a service role to give CodeBuild permissions to write logs and read my PyPI username and password from the Systems Manager parameter store.

I also enabled the build badge feature so that I can show the build status on Github:

image7

 

 

For intricate details, check out the buildspec.yml and the project definition.

I want this project to be invoked on a schedule (I chose every 3 days) and I can accomplish that with a Cloudwatch Event Rule:

image11

When the rule gets triggered, I see that my codebuild project does what it needs to do; clone the git repo, generate the stubs and upload to PyPI:

image21

This whole process is done in about 25 seconds. Since this is entirely hands off, I needed some way to be kept in the loop. After the build has run, there’s another Cloudwatch Event which gets triggered for build events on the project. It sends a notification to SNS which in turns sends me an email to let me know if everything went OK:

image9

The build event and notification.

image8

The SNS Topic with an email subscription.

That’s it! But what about my AWS bill? My estimate is that it should be around $0.05 every month. Besides, it will definitely not break the bank, so I’m pretty satisfied with it! Imagine how much it would cost to maintain a build server on your own to accomplish all of that.

What’s with the weird versioning?

You will notice botostubs versions look like this:

image22

It currently follows boto3 releases in the format 0.4.x.y.z. Therefore, if botostubs is currently at 0.4.1.9.61, then it means that it will offer whatever is available in boto3 version 1.9.61. I included the boto version in mine to make it more obvious what version of boto3 that botostubs was generated from but also because PyPI does not allow uploads at the same version number.

Are people using it?

According to pypistats.org, botostubs has been downloaded about 600 times in its initial week after I showed it to the reddit community. So it seems that it was a tool well needed:

image16

Your turn

If this sounds that something that you’ll need, get started by running:

pip install botostubs

Run it and let me know if you have any advice on how to make this better.

Credit

Huge thanks goes to another project called pyboto3 for the original idea. The issues that I had with it was that it was unmaintained and supported legacy Python only. I wouldn’t have known that this would be possible were it not for pyboto3.

Open for contribution

botostubs is an open source project, so feel free to send your pull requests.

A couple of areas where I’ll need some help:

  • Support Python < 3.5
  • Support boto3 high-level resources (as opposed to just low-level clients)

Summary

In this article, I’ve shared my process for developing botostubs through examining the internals of boto3 and automate its maintenance with a deployment pipeline that handles all the grunt work. If you like it, I would appreciate it if you share it with a fellow Python DevOps engineer

https://pypi.org/project/botostubs/.

I hope you are inspired to find solutions for AWS challenges that are not straightforward and share them with the community.

If you used what you’ve learnt above to build something new, let me know, I’d love to take a look! Tweet me @jeshan25.

About the Author

Jeshan Babooa is an independent software developer from Mauritius. He is passionate about all things infra automation on AWS especially with tools like Cloudformation and Lambda. He is the guy behind LambdaTV, a youtube channel dedicated to teaching serverless on AWS. You can reach him on Twitter @jeshan25.

About the Editors

Ed Anderson is the SRE Manager at RealSelf, organizer of ServerlessDays Seattle, and occasional public speaker. Find him on Twitter at @edyesed.

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.


AWS SAM – My Exciting First Open Source Experience

10. December 2018 2018 0

My first acquaintance with AWS Cloud happened through a wonderful tool – SAM CLI. It was a lot of fun playing around with it –  writing straightforward YAML based resource templates, deploying them to AWS Cloud or simply invoking Lambda functions locally. But after trying to debug my .NET Core Lambda function, I realized that it was not yet supported by the CLI. At this point, I started looking through the open issues and found this one that described exactly what I was experiencing it turned out to be a hot topic, so it got me interested instantly. As I reviewed the thread to get additional context about the issue, I learned that the project was looking for help.

The ability to debug Lambda on .NET locally was critical for me to fully understand how exactly my function was going to be invoked in the Cloud, and troubleshoot future problems. I decided to answer the call to help and investigate the problem further. Feeling like Superman that came to save the day, I posted a comment on the thread to say that, “I’m looking into this too.” It took no more than 10 minutes to receive a welcoming reply from members of the core team. When you receive a reply in a matter of minutes, you definitely understand that the project is truly alive and the maintainers are open to collaboration –  especially since being hosted on GitHub does not necessarily mean that a project is intended for contribution and community support. Feeling inspired, I began my investigation.

The Problem

Unfortunately the .NET Core host program doesn’t have any “debug-mode” switch (you can track the issue here), meaning that there is no way to start .NET program so it will immediately pause after launching, wait for the debugger to attach to it, and only then proceed (for example Java (using JDWP and its suspend flag) and Node.js are capable of doing this), and SAM needed some way to work around this. After several days of investigation and prototyping, I came up with a prototype of a working solution that used a remote debugger attachment to the .NET Lambda runner inside of the running Docker container (to execute AWS Lambda function locally, SAM internally spins up a Docker container with a Lambda-like environment).

 

I joined the AWS Serverless Application Slack organization to collaborate with the maintainers and understand more about the current CLI design. Thanks to the very welcoming community on the Slack channel, I almost immediately started to feel like I was part of the SAM team! . At this point, I was ready to write up a POC of .NET Core debugging with SAM. I have a README.md you can look through which explains my solution in greater detail.

 

My POC was reviewed by the maintainers of SAM CLI. Together we discussed and resolved all open questions, shaped up the design to perfection and agreed upon the approach. To streamline the process and break it down, one of the members of the core team suggested that I propose a solution and identify the set of tasks through a design document. I had no problems doing so because the SAM CLI project has a well defined and structured template for writing this kind of document. The core team reviewed and approved. Everything was set up and ready to go, and I started implementing the feature.

Solution

Docker Lambda

My first target was the Docker Lambda repo. More specifically, it was the implementation of a “start in break mode” flag for .NET Core 2.0 and 2.1 runner (implementation details here).

I must admit, that this was my very first open source pull request, and it turned out to be overkill, so my apologies to the maintainer. PR included many more changes than required to solve the problem. The unnecessary changes were related to code style refactoring and minor improvements. And don’t get me wrong – best practices and refactoring changes are not undesired nor unwelcome, but they should be implemented in a separate PR to be a good citizen in the open source community. Okay, with that lesson learned, I’ve opened a slim version of my original PR with only the required code changes and with detailed explanations of them. SAM CLI is up next on our list.

 

 

AWS SAM CLI

Thanks to the awesome development guide at SAM CLI repo, it was super easy to set up the environment and get going. Even though I am not a Python guy, the development felt smooth and straightforward. The CLI codebase has a fine-grained modular architecture, and everything was clearly named and documented, so I faced no problems getting around the project. Configured Flake8 and Pylint linters kept an eye on my changes, so following project code-style guidelines was just a matter of fixing warnings if they appear. And of course, decent unit tests code coverage (97% at the time of writing) not only helped me to rapidly understand how each component worked within the system, but also made me feel rock solid confident as I was introducing changes.

 

The core team and contributors all deserve great thumbs up 👍 for keeping the project in a such a wonderful and contribution-friendly state!

 

However, I did encounter some troubles running unit and integration tests locally on my beloved Windows machine. I got plenty of unexpected Python bugs, “windows-not-supported” limitations and stuff like that along the way, at one point I was in complete despair. But help comes if you seek it. After asking the community for guidance, we collectively came up with a Docker solution by running tests inside Ubuntu container. And finally, I was able to setup my local environment and run all required tests before submitting the pull request. Ensuring that you hadn’t broken anything is a critical process while collaborating on any kind of project, and especially open source!

According to my initial approach to .NET Core debugging, I implemented --container-name feature, which should’ve allowed users of SAM CLI to specify a name for the running Lambda container, to identify it later to perform attaching. During review, the core team found some corner cases when this approach introduces some limitations and diversity for the whole debugging experience compared to the other runtimes, so I started to look for possible workarounds.

I looked at the problem from a different perspective and came up with a solution, that enabled .NET Core debugging for AWS SAM CLI with almost no changes made 🎉 (more about this approach here and here). The team immediately loved this approach, because it closely aligned with the way other runtimes deal with debugging currently, providing a consistent experience for the users. I am happy, that I got valuable feedback from the community because exactly that got me thinking about the way to improve my solution. Constructive criticism, when handled properly, makes perfection. Now, the feature is merged into develop and is waiting for the next release to come into play! Thanks to everyone, who’d taken part into this amazing open source journey to better software!

Conclusion

Bottom line, I can say that I had a smooth and pleasant first open source experience. I’m happy that it was with SAM, as I had a super interesting time collaborating with other passionate SAM developers and contributors across the globe through GitHub threads, emails and Slack. I liked how rapid and reactive all of those interactions were – it was a great experience writing up design docs for various purposes, and being part of a strong team collaborating to reach the common goal. It was both challenging and fun to quickly explore SAM codebase and then follow its fully automated code-style rules. And, lastly, it was inspiring to experience how easy it is to contribute to a big and very well-known open source project like SAM framework.

 

Based on my experience, I came up with this battle-tested and bulletproof checklist:

  1. Before getting your hands dirty with code, take your time and engage in discussion with the core team to make sure you understand the issue and try to get as much context on it as possible. 🔍
  2. I suggest writing up a descriptive design doc, which explains the problem (or feature) and proposed solution (or implementation) in great detail, especially if it’s a common practice on the project you’re contributing to
  3. Don’t be afraid or embarrassed to ask for help from the community or the maintainers if you’re stuck. Everyone wants you to succeed 🎉
  4. Unit tests are a great source of truth in decent projects, so look through them to get a better understanding of all the moving parts. Try to cover your changes with required tests, as these tests also help future contributors (which could be you)!
  5. Always try your best to match project code-style – thanks to modern linters this one is a piece of 🍰
  6. Keep your code changes small and concise. Don’t try to pack every single code style and best practices change into a tiny PR intended to fix a minor bug
  7. The PR review process is a conversation between you and other collaborators about the changes you’ve made. Reading and reviewing every comment you receive carefully, helps to build a common understanding of the problem and avoid confusion. Never take those comments personally and try to be polite and respectful while working on them. Don’t be afraid to disagree or have a constructive discussion with reviewers. Try to come to a mutual agreement in the end and, if required, introduce specified changes
  8. Try not to force your pull request as the team may have some important tasks at hand or approaching release deadlines. Remember that good pull requests get pulled and not pushed.

 

It’s good to know that anyone from the community can have an impact just by leaving a comment or fixing a small bug. It’s cool that new features can be brought in by the community with the help of the core team. Together we can build better open source software!

 

Based on my experience, SAM is a good and welcoming project to contribute to thanks to the well-done design and awesome community. If you have any doubt, just give it a try. SAM could be a perfect start for your open source career 😉

 

When you start it is hard to stop: #795, #828, #829, #843

Contribute with caution

Thank You!

I want to give special thanks to @sanathkr, @jfuss, @TheSriram, @mikemorain and @mhart for supporting me continuously along this path.

 

About the Author

My name is Nikita Dobriansky, from sunny Odessa, Ukraine. Throughout my career in software development, I’ve been working with C# desktop applications. Currently, I’m a C# engineer at Lohika, building bleeding-edge UWP apps. I am super excited about .NET Core and container-based applications. For me, AWS is a key to better serverless ☁️ future.

In my free time, I enjoy listening to the great music and making some myself 🎸

@ndobryanskyy

About the Editor

Jennifer Davis is a Senior Cloud Advocate at Microsoft. Jennifer is the coauthor of Effective DevOps. Previously, she was a principal site reliability engineer at RealSelf, developed cookbooks to simplify building and managing infrastructure at Chef, and built reliable service platforms at Yahoo. She is a core organizer of devopsdays and organizes the Silicon Valley event. She is the founder of CoffeeOps. She has spoken and written about DevOps, Operations, Monitoring, and Automation.