This article was written by Yhat.
Introduction
One of my favorite things about Python is that users get the benefit of observing the R community and then emulating the best parts of it. I’m a big believer that a language is only as helpful as its libraries and tools.
This post is about pandasql, a Python package we (Yhat) wrote that emulates the R package sqldf. It’s a small but mighty library made up of just 358 lines of code. The idea of pandasql is to make Python speak SQL. For those of you who come from a SQL-first background or still “think in SQL”, pandasql is a nice way to take advantage of the strengths of both languages.
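To give a feel for what “making Python speak SQL” means, here is a minimal sketch that uses only the standard library's sqlite3 module rather than pandasql itself, so it runs without any extra installs. The `pets` table, its rows, and the `run_query` helper are all hypothetical and chosen only for illustration; they are not from pandasql or from this article.

```python
import sqlite3

# Hypothetical sample rows (not from the article): (name, species, weight_kg)
rows = [
    ("Rex", "dog", 30.0),
    ("Tom", "cat", 4.5),
    ("Fido", "dog", 12.0),
    ("Kit", "cat", 3.5),
]


def run_query(query, rows):
    """Load rows into an in-memory SQLite table named 'pets' and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE pets (name TEXT, species TEXT, weight_kg REAL)")
        conn.executemany("INSERT INTO pets VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# A plain SQL aggregation, issued from Python.
result = run_query(
    "SELECT species, COUNT(*) AS n, AVG(weight_kg) AS avg_weight "
    "FROM pets GROUP BY species ORDER BY species",
    rows,
)
print(result)
```

With pandasql installed, the comparable step would be to call `sqldf(query, locals())` with a query that names a pandas DataFrame in scope, which gives back the result as a DataFrame instead of a list of tuples.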
In this introduction, we’ll show you how to get up and running with pandasql inside of Rodeo, the integrated development environment (IDE) we built for data exploration and analysis. Rodeo is an open source and completely free tool. If you’re an R user, it’s a comparable tool with a similar feel to RStudio. As of today, Rodeo can only run Python code, but last week we added syntax highlighting for a bunch of other languages to the editor (Markdown, JSON, Julia, SQL). As you may have read or guessed, we’ve got big plans for Rodeo, including adding SQL support so that you can run your SQL queries right inside of Rodeo, even without our handy little pandasql. More on that in the next week or two!
What you will find in this article:
- Downloading Rodeo
- A bit of background, if you’re curious
- Install pandasql
- Check out the datasets
- An odd graph
- It’s just SQL
- Final thoughts
To check out all this information, click here.