Deploying big spaCy NLP models on AWS Lambda + S3

Xoel López Barata
4 min read · Aug 20, 2020
Source: Unsplash

Hey y’all!

A few weeks ago, I started using spaCy to detect locations in job descriptions. spaCy is an NLP library that lets you do pretty powerful stuff out of the box and get things done fast.
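To give a rough idea of what that looks like, here is a minimal sketch of location detection with spaCy's named entity recognizer. It is not the actual NoiceJobs code: the function names and the `en_core_web_sm` model name are assumptions for illustration, and it assumes an English pipeline that tags places with the `GPE` and `LOC` labels.

```python
# Entity labels that spaCy's English pipelines use for places:
# GPE = countries, cities, states; LOC = other locations (mountains, rivers, ...).
LOCATION_LABELS = {"GPE", "LOC"}


def extract_locations(doc):
    """Return the unique location strings in a processed spaCy Doc, in order of appearance."""
    locations = []
    for ent in doc.ents:
        if ent.label_ in LOCATION_LABELS and ent.text not in locations:
            locations.append(ent.text)
    return locations


def load_model(name="en_core_web_sm"):
    """Load a spaCy pipeline (hypothetical helper; the model name is an assumption)."""
    # Imported here so the cost of importing spaCy is only paid when a model is needed.
    import spacy

    return spacy.load(name)


# Usage (requires `pip install spacy` and a downloaded model):
#   nlp = load_model()
#   doc = nlp("We are hiring a backend developer in Berlin. Remote within Europe is fine.")
#   print(extract_locations(doc))
```

Loading the model is the expensive part in both time and memory, so in a web app you would normally call `load_model()` once at startup and reuse the resulting `nlp` object for every request.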

Everything was working fine locally. But NoiceJobs (my project) is hosted on Heroku and uses the cheapest dynos possible, with only 0.5 GB of RAM. That’s enough for running simple apps, but ML code is usually more memory- and CPU-intensive, so when I deployed the new version of the app on Heroku I kept getting “memory quota exceeded” errors.

Some AWS engineers jumped into the conversation and, after some back-and-forth, we came to the conclusion that AWS could be a good solution for my problem.

I had used Flask on AWS Lambda in the past with Zappa and liked how easy the deployment process is and how fast you can get a small app running without too much hassle (most of the time).

Zappa lets you deploy Django or Flask apps on AWS Lambda, but I’d rather use Flask for something simple like this to keep memory…

Xoel López Barata

Freelance data scientist and software developer. On Twitter: @xoelipedes.