Reddit reviews ggplot2: Elegant Graphics for Data Analysis (Use R!)

We found 6 Reddit comments about ggplot2: Elegant Graphics for Data Analysis (Use R!). Here are the top ones, ranked by their Reddit score.

u/I_am_not_at_work · 19 points · r/bioinformatics
  1. Download RStudio
  2. Try online tutorials like this, this, here, and this pdf.
  3. R can produce amazingly ugly or beautiful graphs. ggplot2 is my favorite, and these books 1,2,3 will give you a solid foundation on how to use it.
  4. Are you just interested in RNAseq or ChIPseq? Are you running the entire bioinformatic pipeline from QC through to RPKM/counts generation? This blog post can give you a decent idea of a basic workflow for differential gene expression analysis. Most of that uses R and Unix-based tools. But there is also a lot else out there that you can google and then learn from.
  5. Keep in mind that any error message that you can't figure out has already happened to many other people. A google search will find you a Stack Overflow or Biostars post asking how to solve whatever problem you have encountered. So don't be discouraged when you can't figure something out.
u/SharpSightLabs · 13 points · r/Python

Here's what I'd recommend.


GETTING STARTED WITH DATA SCIENCE


If you're interested in learning data science I'd suggest the following:
 

Tools

  1. I’d recommend learning R before Python (although Python is an exceptional tool). Here are a few reasons.
    1. Many of the hot tech companies in SF, the Valley, and NYC like Google, Apple, FB, LinkedIn, and Twitter are using R for much of their data science (not all of it, but a lot).
    2. R is the most common programming language among data scientists. O’Reilly Media just released their 2014 Data Science Salary Survey. I’ll caveat though, that Python came in at a close second. Which leads me to the third reason:
    3. R has 2 packages that dramatically streamline the DS workflow:
      • dplyr for data manipulation
      • ggplot2 for data visualization

        Learning these has several benefits. They streamline your workflow. They speed up your learning process, since they are very easy to use. And perhaps most importantly, they really teach you how to think about analyzing data. ggplot2 has a deep underlying structure to its syntax, based on the Grammar of Graphics theoretical framework. I won’t go into that too much, but suffice it to say, when you learn the ggplot2 syntax, you’re actually learning how to think about data visualization in a very deep way. You’ll eventually understand how to create complex visualizations without much effort.
         

Skill Areas

My recommendations are:

  2. Learn basic data visualizations first. Start with the essential plots:
    • the scatter plot
    • the bar chart
    • the line chart
      (But again, I recommend learning these in R’s ggplot2.) The reasons I recommend these are:
      1. They are, hands down, the most common plots. For entry-level jobs, you’ll use these every day.
      2. They are “foundational” in the sense that when you learn about the underlying structure of these plots, it begins to open up the world of complex data visualizations.
        As with any discipline, you need to learn the foundations first; this will dramatically speed your progress in the intermediate to advanced stages.
      3. You’ll need these plots as “data exploration” tools. Whether you’re finding insights for your business partners or investigating the results of a sophisticated ML algorithm, you’ll likely be exploring your data visually.
      4. These plots are your best “data communication” tools. As noted elsewhere in this thread, C-level execs need you to translate your data-driven insights into simple language that can be understood in a 1-hour meeting. Communicating visually with the basic plots will be your best method for communicating to a non-technical audience. Communicating to non-technical audiences is a critical (and rare) auxiliary skill, so if you can learn to do this you will be very highly valued by management.
        I usually suggest learning these with dummy data (for simplicity), but if you have a simple .csv file, that should work too.
  3. Learn data management second (AKA, data wrangling, data munging)
    After you learn data visualization, I suggest that you “back into” data management. For this, you should find a dataset and learn to reshape it.
    The core data management skills:
    • subsetting (filtering out rows)
    • selecting columns
    • sorting
    • adding variables
    • aggregating
    • joining
      You can start learning these here. Again, I recommend learning these in R’s dplyr because dplyr makes these tasks very straightforward. It also teaches you how to think about data wrangling in terms of workflow: the “chaining operator” in dplyr helps you wire these commands together in a way that really matches the analytics workflow. dplyr makes it seamless.
  4. Learn machine learning last.
    ML is sort of like the “data science 301” course vs. the 102 and 103 levels of the data-vis and data manipulation stuff I outlined above.
    Here, I’ll just give book recos:
  5. Nathan Yao of Flowing Data is great. His blog shows excellent data visualization examples. Also, I highly recommend his books. In particular, Data Points. Data Points will help you learn how to think about visualization.
  6. The book ggplot2 by Hadley Wickham. This is a great resource (though a little outdated, as Hadley has since updated the ggplot2 package).
  7. I also really like Randal Olson’s work (AKA, /u/rhiever). He creates some great data visualizations that can serve as inspiration as you start learning.
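
The six core data management skills listed above (subsetting, selecting columns, sorting, adding variables, aggregating, joining) can be sketched on dummy data. This is only an illustration in plain Python with the standard library, not dplyr and not anything from the original comment. The `sales` and `managers` data and every variable name are hypothetical, invented for the example.

```python
from collections import defaultdict

# Hypothetical dummy data, invented purely for illustration.
sales = [
    {"region": "east", "product": "a", "units": 10, "price": 2.0},
    {"region": "west", "product": "a", "units": 4, "price": 2.0},
    {"region": "east", "product": "b", "units": 7, "price": 5.0},
    {"region": "west", "product": "b", "units": 1, "price": 5.0},
]
managers = {"east": "Kim", "west": "Lee"}  # hypothetical lookup table

# 1. Subsetting: keep only rows with at least 4 units sold.
subset = [row for row in sales if row["units"] >= 4]

# 2. Selecting columns: drop everything except the fields we need.
selected = [
    {key: row[key] for key in ("region", "units", "price")} for row in subset
]

# 3. Sorting: order rows by units, largest first.
ordered = sorted(selected, key=lambda row: row["units"], reverse=True)

# 4. Adding a variable: derive revenue from units and price.
with_revenue = [
    {**row, "revenue": row["units"] * row["price"]} for row in ordered
]

# 5. Aggregating: total revenue per region.
revenue_by_region = defaultdict(float)
for row in with_revenue:
    revenue_by_region[row["region"]] += row["revenue"]

# 6. Joining: attach each region's manager from the lookup table.
joined = [
    {"region": region, "revenue": revenue, "manager": managers[region]}
    for region, revenue in sorted(revenue_by_region.items())
]

print(joined)
```

Each step feeds the next, which is roughly the chained workflow the comment describes; in dplyr the same six steps would be written with its own verbs rather than list comprehensions.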
     

    TL;DR

    I'd recommend learning R for data science before Python. Learn data visualization first (with R's ggplot2), using simple data or dummy data. Then find a more complicated dataset. Learn data manipulation second (with R's dplyr), and practice data manipulation on your more complex data. Learn machine learning last.

u/iacobus42 · 4 points · r/statistics

Anything by Tufte and the Flowing Data book and blog are great starting places. Tufte is more theory driven, for lack of a better term, while the Flowing Data sources have more "worked" examples (with R, Python, etc).

It would be worth learning ggplot2 as well if you are interested in data visualization, as that seems to be the current "standard" tool. Hadley Wickham's website and his Use R! book on ggplot2 are great places to start.

Relatedly, Wickham's PhD thesis is all about tools and strategies for data visualization and can be found for free on his website. There is also an hour long seminar and slides to go with the paper.

u/therealprotonk · 2 points · r/statistics

There are a few example-based approaches in R. Hadley's book on ggplot2 is worth a look, as is the online documentation. Both the book and the docs are more instructions on how to use ggplot2 than general guides for visualization, but the core ideas behind the grammar of graphics and ggplot2 are good starting points. As a bonus, the book is cheap and all the code in the examples is available online. Data Analysis and Graphics Using R is a much longer and more general introduction to experiments, statistics and graphics. If you are looking for an example-heavy text to help you work through both stats and data visualization, I recommend it. However, it is long and somewhat expensive.

Tufte is certainly worth your time. I doubt there is a definitive guide. Data visualization is a bit like UI/UX design. There are a bunch of canonical rules which you shouldn't break until you know exactly what you are doing--then breaking them can be extremely valuable.

u/buckhenderson · 1 point · r/rprogramming

there's always the book by the author of ggplot: ggplot2: Elegant Graphics for Data Analysis. i haven't actually read it, but i'd imagine it's pretty thorough. there's going to be a new edition coming out, coinciding with a newer version of ggplot, iirc. there's also the official documentation here (looks like they just revamped the site).

but i learned it just by figuring out what i wanted to plot, and then using google/stackoverflow

u/shorttails · 1 point · r/dataisbeautiful

Hadley (ggplot2 author) also has a book on the package if you want to get a solid foundation: here