Predicting animal adoption with Random Forest, SVM

Across the country, animal shelters work around the clock to help pets get rescued. Most people assume adoption is driven by intuition and emotion: a family comes in, falls in love, and welcomes a new pet into their home.

As readers here are likely to suspect, if you have the data, you can see there’s more to the story. Fortunately, the Austin Animal Shelter has made their animal rescue data public. They’re the largest no-kill shelter in the United States, housing over 18,000 animals each year, which provides us with plenty of information to go on.

Joanne Lin, a student at Thinkful’s data science bootcamp, decided to jump in and find insights that can help shelters get more pets rescued. By analyzing shelter outcomes for nearly 80,000 animals (roughly 44k dogs and 30k cats), Joanne was able to drill down on:

Which features are most predictive of whether or not an animal gets adopted
Which animal names are the most desirable for adoptions
What days of the week result in the most adoptions
What are the most “adoptable” dogs and cats, respectively

After diving into the data and trying to get granular with each animal’s physical features, it became clear that the data required, amusingly enough, some simple feature engineering.

Original Data Set

Name	Species	Breed	Description
Sadie	Canine	Dalmatian	White spotted
Lucky	Canine	Border Collie	Black bicolor
Piper	Canine	Australian Terrier	Red sable

More Granular

Name	Species	Breed	Coat Color	Coat Pattern
Sadie	Canine	Dalmatian	White	Spotted
Lucky	Canine	Border Collie	Black	Bicolor
Piper	Canine	Australian Terrier	Red	Sable
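To make that step concrete, here is a minimal sketch of how the split could be done in pandas. It is not Joanne’s actual code. It assumes the first word of the description is the color and everything after it is the pattern, which holds for the three sample rows above but may not hold for every record in the real data. The function name split_description is made up for this example.

```python
import pandas as pd

# Rows copied from the "Original Data Set" table above.
raw = pd.DataFrame(
    {
        "Name": ["Sadie", "Lucky", "Piper"],
        "Species": ["Canine", "Canine", "Canine"],
        "Breed": ["Dalmatian", "Border Collie", "Australian Terrier"],
        "Description": ["White spotted", "Black bicolor", "Red sable"],
    }
)


def split_description(df: pd.DataFrame) -> pd.DataFrame:
    """Split 'Description' into 'Coat Color' and 'Coat Pattern'.

    Assumption: the first word is the color, the remainder is the pattern.
    """
    # Split on the first run of whitespace only, into two columns (0 and 1).
    parts = df["Description"].str.split(n=1, expand=True)
    out = df.drop(columns="Description").copy()
    out["Coat Color"] = parts[0].str.capitalize()
    out["Coat Pattern"] = parts[1].str.capitalize()
    return out


granular = split_description(raw)
print(granular)
```

Descriptions that don’t follow the two-part shape (a single word, say, or a two-word color) would need their own handling before a split like this could be trusted.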

We left some columns out of those tables for clarity, including age and whether the pet had been spayed or neutered, which the data represents as ‘intact’. Joanne then ran the engineered features through Random Forest to determine which pet attributes were most predictive of adoption.

For cats:

For dogs:

For cats, the top 3 factors were whether it was spayed or neutered, how old the cat was, and what type of coat it had. When it came to dogs, adopters also cared about age and coat, but the #1 factor was the pup’s breed. Interestingly, adopters seemed to care a lot more about whether a cat had a name than whether a dog did.
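If you’d like to try this kind of ranking yourself, here is a rough sketch using scikit-learn’s RandomForestClassifier and its feature_importances_ attribute. Everything about the data is an assumption: the rows are randomly generated, the column names (age_months, intact, has_name, coat_color) are invented for illustration, and the rule that decides who gets “adopted” is made up. It shows the mechanics only and says nothing about the real shelter results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Synthetic stand-in data: NOT the Austin shelter records.
toy = pd.DataFrame(
    {
        "age_months": rng.integers(1, 180, size=n),
        "intact": rng.integers(0, 2, size=n),
        "has_name": rng.integers(0, 2, size=n),
        "coat_color": rng.choice(["White", "Black", "Red"], size=n),
    }
)

# Made-up rule so the toy label depends on age and intact status only.
score = (
    1.5 * (1 - toy["intact"])
    - toy["age_months"] / 90
    + rng.normal(0, 0.5, size=n)
)
toy["adopted"] = (score > 0).astype(int)

# One-hot encode the categorical column so the forest sees numeric inputs.
X = pd.get_dummies(toy.drop(columns="adopted"), columns=["coat_color"])
y = toy["adopted"]

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X, y)

# Impurity-based importances, one per input column, summing to 1.
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```

One thing to keep in mind: impurity-based importances tend to favor features with many distinct values, so a ranking like this is a starting point for questions, not a final answer.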

To define the boundary between ‘adoptable’ and ‘not adoptable’, Joanne also worked with a Support Vector Machine. There’s quite a lot to this one, and I’d highly recommend diving right into Joanne’s notebook on GitHub, where you can see all of the code, but here’s a quick visual to give you the basic idea of what she found:
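As a rough illustration of the kind of model involved (this is not Joanne’s code, which lives in her notebook), here is a small scikit-learn sketch that fits an SVM to separate two classes. The data is randomly generated, the two features (age in months and an intact flag) and the labeling rule are invented, and the RBF kernel and its settings are assumptions, since the post doesn’t say which kernel she used.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 400

# Synthetic stand-in features: NOT the Austin shelter records.
age_months = rng.integers(1, 180, size=n)
intact = rng.integers(0, 2, size=n)
X = np.column_stack([age_months, intact]).astype(float)

# Made-up labeling rule with noise: 1 = "adoptable", 0 = "not adoptable".
signal = 1.5 * (1 - intact) - age_months / 90 + rng.normal(0, 0.3, size=n)
y = (signal > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1, stratify=y
)

# SVMs are sensitive to feature scale, so standardize before fitting.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
# Signed distance to the boundary: positive values fall on the class-1 side.
margins = model.decision_function(X_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The decision_function values are what a boundary plot is drawn from: points near zero sit close to the line between the two classes.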

With these insights, shelters can prioritize where to spend their resources to drive more adoptions. Whether they’re allocating more budget to spay or neuter their animals, making sure they name every potential pet (or at least the cats), or making sure they have extra adoption staff on hand for weekends, we’re hoping to see data drive a better life for shelter pets.

More information about the project can be found here.
