Reddit reviews Pandas for Everyone: Python Data Analysis (Addison-Wesley Data & Analytics Series)

We found 2 Reddit comments about Pandas for Everyone: Python Data Analysis (Addison-Wesley Data & Analytics Series). Here are the top ones, ranked by their Reddit score.

Addison-Wesley Professional

u/syntonicC · 13 points · r/datascience

I used R for about 4 years before I moved to Python to use it for deep learning. I have been using Python for about 2 years now.

>Are R and Python considered redundant, or are there some situations where one will be preferred over the other? If I become proficient at using Python for data wrangling, analysis, and visualization, will I have any reason to continue using R?

It depends. I haven't really found anything that I can do in Python that I could not already do in R. I still use R because I like it better as a functional programming language and because it has a wide variety of more specific statistical packages (many for biology) that are just not available for Python yet. There are some specific cases where I just find it more intuitive and simpler to implement a solution in R. And generally, I just prefer ggplot2 over any of the various Python plotting packages. Also, R has high-level APIs for things like TensorFlow, so it's not like you can't do deep learning in R.

The biggest advantages of Python are its speed and its ability to work within a larger programming framework. A lot of companies tend to use Python because the models they build are integrated into a larger system that needs the capabilities of a fully-fledged programming language. Python is generally faster and manages big data sets in memory better. R is moving toward fixing these issues, but there are still limitations.

>Where should I start? I'm looking for a resource that isn't aimed at complete beginners, since I've been using R for a few years, and took a C class before that. At the same time I wouldn't claim to be an experienced programmer. I'm interested in learning Python both for data analysis and for general programming.

I learned Python syntax using Learn Python 3 the Hard Way. I learned about Pandas, data wrangling, etc. using Pandas for Everyone and Pandas Cookbook. If I were to suggest just one book, it would be Pandas for Everyone. You can learn Python syntax from YouTube, MOOCs, or online tutorials. The Pandas Cookbook is just extra practice. To be honest though, the general conventions used by Pandas for data analysis and manipulation are very similar to R in many ways, especially if you've used anything in Hadley Wickham's Tidyverse. Finally, I made a Pandas cheatsheet while I was learning and included equivalent R functions in some places. I would be happy to share this Google Sheets file with you if you are interested.

>What IDE(s) should I use, and what are some must learn packages? I'm hoping to find something similar to RStudio.

I started off using PyCharm. I've heard good things about Spyder. But now, I actually still use RStudio! It is fully integrated with Python thanks to the reticulate package. You can pass data structures between the languages and use both in RMarkdown. You can also use virtual environments, which are popular with Python. Once you install the package:

library(reticulate)
use_virtualenv("path_to_my_virtual_env") # Start virtual environment

You can now run Python scripts directly in the RStudio console

# If you want a Python REPL to use interactively just like in R run:
repl_python()


It's really easy to use and even comes with auto-complete and everything else.
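
Once the REPL is open, a quick check like the one below can confirm which interpreter and environment you actually ended up in. This is only a minimal sketch using the standard library: `describe_interpreter` is a made-up helper name, not part of reticulate, and the prefix comparison assumes a `venv` or a recent `virtualenv` (older virtualenv releases signalled the environment differently). The executable path may also come back empty when Python is embedded in another process.

```python
import sys


def describe_interpreter():
    """Return basic facts about the running Python interpreter.

    Hypothetical helper for illustration; not part of reticulate.
    """
    return {
        # Path to the Python binary in use (may be empty when embedded).
        "executable": sys.executable,
        # Root directory of the active environment.
        "prefix": sys.prefix,
        # Root directory of the underlying Python installation.
        "base_prefix": sys.base_prefix,
        # Assumption: inside a venv / recent virtualenv, the two prefixes differ.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # Short version string, e.g. major.minor.patch.
        "version": sys.version.split()[0],
    }


info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `in_virtualenv` comes back `False`, or `prefix` is not the path you passed to `use_virtualenv()`, the session is probably attached to a different Python than you intended.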

Hope that helped.

u/CanYouPleaseChill · 1 point · r/Python

I highly recommend Pandas for Everyone: Python Data Analysis by Daniel Chen. It's extremely practical and well-organized.