Reddit reviews Introduction to Machine Learning (Adaptive Computation and Machine Learning series)

We found 4 Reddit comments about Introduction to Machine Learning (Adaptive Computation and Machine Learning series). Here are the top ones, ranked by their Reddit score.

Check price on Amazon

4 Reddit comments about Introduction to Machine Learning (Adaptive Computation and Machine Learning series):

u/krunk7 · 10 pointsr/programming

Absolutely.

Check out The Elements of Statistical Learning and Introduction to Machine Learning.

edit those books are about practical applications of what we've learning to date from the neural network style of pattern classification. So it's not about modeling an actual biological neuron. For modeling of the biology, it's been a while since I futzed with that. But when I wrote a paper on modeling synaptic firing, Polymer Solutions: An Introduction to Physical Properties was the book for that class. Damned if I remember if that book has the details I needed or if I had to use auxiliary materials though.

u/ffualo · 3 pointsr/askscience

Hi RandomNumber37,

So here's a little bit about me first; I don't want to misrepresent myself. My background is in economics and political science, where I was interested in statistical models that predict rare international events like war and state failure. It's here I became obsessed with statistics, machine learning, etc. Also, I've been programming in many languages since I was a kid, so after my undergraduate work in the social sciences and statistics, I took a job with a bioinformatics group doing coding. I thought this would be a temporary job until graduate school in economics or quantitative political science.

However working with large-scale biological and sequencing data was way more awesome than I expected. This caused me to shift focus. I also did a fair amount of work on computational statistics, i.e. ways of trying to make R better, understanding compiler technologies, etc. So after, I became more purely interested in statistics and computational biology, and I thought I would go to graduate school for pure statistics so I could also devote some time to computational statistics. However, now I work in a plant breeding lab (which I absolutely love). I will do this about another 2-3 years before I transition into a graduate program. This would mean I've worked in the field about 6 years before applying to graduate programs.

So, with that out of the way here are answers to your questions and some advice I offer:

How much of your time is spent working with the plants themselves vs with computer-organized data?

Being that my background isn't in biology, I don't currently work with plants much. However, this is why I moved towards plant biology. Before getting obsessed about social science methods, I loved plants. I worked at an orchid greenhouse, and actually went to UC Davis thinking I'd study plant biology (until an awesome political science professor got me excited about science applied to political data). However, the scientists I work with are often not doing too much work with plants: many grow the plants, do the wet lab work, then spend more than half the time (sometimes up to 90%) analyzing the huge amount of data. I spend my full day in front of a computer, except when a colleague wants me to check out something cool in the lab, etc.
With what kind of operations does your computer aid you?

Everything. We get raw sequencing data, I have to analyze it from start to finish. Or, from raw sequencing files until the point where the numbers behind it tell a story. I also spend a huge amount of my time writing programs that do certain things for biologists in our group. Everything — protein prediction, data quality analysis, statistical modeling, etc.
Do you see a full cycle... from plant, to data, to application of knowledge to your specimens (and back to data)?

Yes, at this current position I am starting to (which I why I sought work in plant biology). It depends on what plant you work with (Arabidopsis = short life cycle, you can do lots of stuff, vs citrus tree = long life cycle, you can't do lots of stuff). But some of the more awesome longer term projects will take 4 years to fully materialize.

So now, what steps were more important? I will tell you the three things that have helped me the most. As a point of how much they've helped me, I'll just mention that despite that not having a Phd (yet), or much of a background in biology other than what I've taught myself or learned on the job (which is actually quite a lot after 4 years in the field), I've had (and continue to receive) really nice job offers.
Learn programming really, really, really well. If you want to be a step above the rest, learn python and R. Perl is huge in bioinformatics, but it's a disgusting ugly language that's dying out in my opinion. It sucks for reproducibility; no one can read anyone else's code. It was great when everyone was racing to get the human genome sequenced and had to write quick scripts constantly. Now, we have larger software platforms for that stuff, and what will count most in the future is the distribution of your scientific code. Reproducibility problems will soon be primarily dry lab, not wet lab. If you doubt that, read the "Forensic Bioinformatics" paper (http://projecteuclid.org/euclid.aoas/1267453942) which was a game changer for me. I've always been passionate about open science and reproducibility, but that made me realize that we'll have a huge problem in a few years if we're not careful.

Anyways, I'd recommend learning:

Python (with BioPython). Also, with Django if you're building web apps to interface with scientific databases.
R (with Bioconductor).
Unix command line (sed/awk, bash)
Know your editor. I use emacs. Even if it takes you 80 hours to learn emacs or your editor well, you will regain that time over a year of work. I promise. People watch me use emacs and they say it makes them dizzy because they can't keep up. That's dozens of hours saved each week.

Now, optionally (but highly, highly recommended):
C. Absolutely necessary to debug compiling programs or writing high-usage programs that need to be fast.
SQL. You'll be storing biological data in databases. SQL is important. Use SQLite a lot. People like huge PostgreSQL or MySQL databases for even small things, but this is a waste of time IMO if you're just going to be the one accessing it. Bioconductor leverages huge amount of SQLite because it's so easy and awesome.

Now, even more optionally:
Lisp. Lisp will change the way you think about programming. It's also used with AraCyc, MetaCyc, and PlantCyc data. I've used it extensively in these applications. The ratio of how Lisp has changed my thinking to how much I use it in production code is HUGE. Learn functional programming concepts; then concepts like map/reduce will fall easily into place. Know object orientation too.
Javascript. I love JS. It's doing amazing things too. And part of being a very effective bioinformatician/statistician is being able to easily convey your data. There is no easier and more interactive medium than a browser. Check out d3.js. Even old scientists can click a link and interact with data via Javascript. In contrast, they wouldn't want to install some old dusty Java application. Of course, with this comes HTML, XML, JSON, etc, etc. so learn those too.

Learn statistics REALLY WELL. Honestly, try to pick up a statistics minor (over a CS minor IMO). Lots of brilliant programmers buy the Cormen algorithm book and are set for data structures and algorithms. But understanding statistics at a deeper level — that takes intimate study via courses. I would recommend taking courses on probability theory and mathematical statistics. I took two courses as part of our mathematical statistics series and I cannot even begin to emphasize how helpful they were. I hear a quote once: at Google they use Bayes theorem like other programmers use "if" statements. Same thing in bioinformatics. Look at the best SNP callers, software, etc, and they're using population genetics models and Bayes approaches. Know math stats early, and it will permeate your thinking in the best ways.

Another quick story: I had a statistics graduate student come tell me he was working for a rather well known genomics professor on campus. He asked me how to analyze RNA-seq data. He said he wanted to use ANOVA. Even though he was a statistics graduate student, he went immediately to normality-assuming models, which was definitely not the case with this data the case. So know your Poisson, negative binomial, gamma, etc distributions. A probability course should introduce them all to you. It will also means when you start learning more theoretical population genetics, you'll be set.

Also, buy a book on machine learning (Elements of Statistical Learning II is good, and a free PDF is available). ESL II is good, but dense; don't let it discourage you. I also like this book. But again, this is dense stuff, don't let it discourage you.
Learn data structures and algorithms well. I think a single course, or doing this on your own is sufficient. However, if you want to do what Heng Li does (author of BWA, samtools, and fermi assembler) you need much, much more. Compression-based data structures are huge in bioinformatics now. I love this stuff, but it's too removed from the biology to be very interesting to me. But if that's the direction you want to move into, hang around CS department more.
Learn to code well. This is vastly underemphasized in the sciences. Learn about test-driven development. Get the habit of writing unit tests early, and writing good documentation. Learn Git too — this is a must.

u/dfmtr · 2 pointsr/MachineLearning

You can read through a machine learning textbook (Alpaydin's and Bishop's books are solid), and make sure you can follow the derivations. Key concepts in linear algebra and statistics are usually in the appendices, and Wikipedia is pretty good for more basic stuff you might be missing.

u/jhill515 · 1 pointr/MachineLearning

Personally, I like "Introduction to Machine Learning" by Alpaydin.

I also strongly recommend reading "The Computational Complexity of Machine Learning" by Michael Kerns.

I agree with @machinedunlearned on the point that ML is a multidisciplinary field. I've been doing work in this field for several years, and I don't consider myself a subject matter expert on much outside of what I call Intelligent Systems. As such, I tend to get "tied at the hip" to a field expert when I'm applying ML to various problems. That said, note that ML has a wide range of techniques that are ever expanding since it's a hot area of research. First gain a broad understanding of what constitutes supervised, unsupervised, and reinforcement learning, then lean when each type of learning is best applied to various problems. That skill will prove invaluable. The references will touch on this some, but don't be afraid to try something different to learn something new!