Computers & technology books>Books>Databases & big data books

Best data modeling & design books according to redditors

We found 378 Reddit comments discussing the best data modeling & design books. We ranked the 78 resulting products by number of redditors who mentioned them. Here are the top 20.

1. Purely Functional Data Structures

34 mentions

Used Book in Good Condition

Read Reddit comments View price

2. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

30 mentions

Read Reddit comments View price

3. Dataclysm: Who We Are (When We Think No One's Looking)

25 mentions

Read Reddit comments View price

4. Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves

24 mentions

Broadway Books

Read Reddit comments View price

8. SQL Antipatterns: Avoiding the Pitfalls of Database Programming (Pragmatic Programmers)

10 mentions

Used Book in Good Condition

Read Reddit comments View price

9. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics

9 mentions

John Wiley Sons

Read Reddit comments View price

10. R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics (O'reilly Cookbooks)

7 mentions

O'Reilly Media

Read Reddit comments View price

11. High Performance MySQL: Optimization, Backups, Replication, and More

7 mentions

Used Book in Good Condition

Read Reddit comments View price

12. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

6 mentions

Read Reddit comments View price

13. An Introduction to Database Systems (8th Edition)

6 mentions

Read Reddit comments View price

14. Joe Celko's SQL for Smarties: Advanced SQL Programming Second Edition (The Morgan Kaufmann Series in Data Management Systems)

5 mentions

Used Book in Good Condition

Read Reddit comments View price

15. Interactive Data Visualization for the Web: An Introduction to Designing with D3

5 mentions

O Reilly Media

Read Reddit comments View price

16. Learning Python Design Patterns

4 mentions

Read Reddit comments View price

17. Java 8 in Action: Lambdas, Streams, and functional-style programming

4 mentions

Manning Publications

Read Reddit comments View price

18. Visualizing Data

4 mentions

Used Book in Good Condition

Read Reddit comments View price

19. Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves

3 mentions

Read Reddit comments View price

20. Deep Learning: A Practitioner's Approach

3 mentions

O Reilly Media

Read Reddit comments View price

Top Reddit comments about Data Modeling & Design:

u/samort7 · 257 pointsr/learnprogramming

Here's my list of the classics:

General Computing

u/healydorf · 227 pointsr/cscareerquestions

> When is it okay to get complacent in your job and when is it not?

That's 100% up to you. Different strokes for different folks and all that.

> How important is it to constantly be working on or learning new stuff?

Extremely important. So much so that I give almost no pushback if my people wanna spend a few days per month at a conference/training. Company will even pay for most of it. Find a company that has a line-item in the budget for professional development -- dollars that are specifically intended to be spent by the end of the year on training, conferences, etc.

And that's not exclusive to software/data/compsci. Any skilled labor is changing constantly. Professional development is important.

> For the data engineers out there what skills should I perfect that will make me employable / desirable anywhere?

Become familiar with a variety of query languages and syntax. SQL, Elastic, AQL, N1QL, a time series DB -- the specific one doesn't really matter, just know more than "basic SQL joins" that you'll see in an undergrad database course.

Recommended reading: Designing Data Intensive Applications.

u/Aldairion · 66 pointsr/AskMen

That came from data pulled off OkCupid and you can read more about this and other findings in Dataclysm, which was written by OkCupid founder Christian Rudder. It's actually a very interesting read and it covers trends in behavior beyond just that which applies to dating or attractiveness.

It's worth noting that the same data showed that a vast majority of men find women most attractive between the ages of 18 - 23 or so whereas women were pretty consistently attracted to men with a few years of their own age. There are also a lot of variables that affect what metric they're using to gauge "attractiveness" so I would take that figure with a grain of salt.

A large percentage of men don't even put much effort into their baseline appearance, either because they don't want to, don't have to, or don't think to. If we're talking about looks and looks alone, then I'm not entirely surprised. Maybe it's not 80%, but if you're comparing one group of people who have been conditioned to put a little extra effort into their appearance, to another that hasn't, or has even been discouraged from doing so, then I could see why perceptions of attractiveness would skew in one direction more than the other.

Basically, don't take a line from an OkCupid blog to heart.

u/Surprise_Buttsecks · 65 pointsr/todayilearned

If you take a look at the book OKCupid's founder wrote (Dataclysm) he makes the point that men's ratings for women are normally distributed, but women's ratings for men are a power law distribution.

u/dons · 38 pointsr/programming

The darcs people (and the pure FP world) have been talking about the similarities between persistant structures, transactional memory and rollback, and revision control for years, FWIW. Good to see these ideas becoming more widespread.

E.g. here's a similar, older triple:

tcache, database+persistance based on STM
darcs, flexible dvcs based on (true) immutable patches (no rebase)
haskell, typed, pure-by-default functional language with the first built-in STM

and I'll throw in NixOS, a Linux package manager based on immutable packages.

This is what the FP revolution looks like, I guess.

u/UniverseCity · 32 pointsr/golang

Designing Data-Intensive Applications seems to be the industry standard, although it's not Go specific.

u/Vitate · 26 pointsr/cscareerquestions

Much of this stuff is learnable outside of work, too, at least at a superficially-passable level. Trust me.

Pick up a few seminal books and read them with vigor. That's all you need to do.

Here are some books I can personally recommend from my library:

Software Design

A Philosophy of Software Design
Clean Code
Dive Into Design Patterns (Better than the GoF book IMO)
Designing Data Intensive Applications
Microservice Patterns
Test-Driven Development

&#x200B;

Software Industry
The Software Craftsman
The Clean Coder
Stealing the Corner Office

u/pippx · 24 pointsr/tumblr

The attraction graphs look very similar to ones that I saw in a book I read recently -- Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves. It's written by the co-founder of OkCupid, so loads of the data came directly from there. That's what the OP graphs look like to me. You can use the "look inside" feature and search for "attraction"; page 47 has one of the graphs I'm referring to.

u/VanFailin · 23 pointsr/programming

I can totally believe that that code made it to production, especially while a site is still growing, but if they needed an expert to tell them not to use LIKE queries...

The book on SQL Antipatterns has my favorite cover ever, and it's a great presentation.

u/ZoraSage · 22 pointsr/polyamory

A lot are clearly copy and pasted. If it doesn't reference or ask about something in my profile, I don't bother responding.

If you're interested in this sort of thing, you should read Dataclysm.

u/cfors · 22 pointsr/datascience

Designing Data Intensive Applications is your ticket here. It takes you through a lot of the algorithms and architecture present in the distributed technologies out there.

In a data engineering role you will probably just be munging data through a pipeline making it useful for the analysts/scientists to use, so a book recommendation for that depends on the technology you will be using. Here are some of my favorite resources for the various tools I used in my experience as a Data Engineer:

High Performant Spark
acloud.guru
Python Cookbook

Good luck in your new position!

u/FunkyCannaHigh · 22 pointsr/devops

https://landing.google.com/sre/books/

&#x200B;

SRE book is free, workbook is not.

https://cloud.google.com/solutions/best-practices-for-operating-containers

https://cloud.google.com/solutions/about-capacity-optimization-with-global-lb

Some of this is google cloud specific but the principles are the same with on-prem or a different provider. "State-of-the-art" deployments are usually learned by using best practices since each distributed app's deployment will vary. These books will help with best practices:

https://www.amazon.com/Microservices-Patterns-examples-Chris-Richardson/dp/1617294543/

https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321/

https://www.amazon.com/Designing-Distributed-Systems-Patterns-Paradigms/dp/1491983647/

u/adremeaux · 21 pointsr/compsci

I know people love to hate on it but that's largely what Stephen Wolfram's A New Kind Of Science is about. It is, basically, a way of showing how many of natures most complex designs can be represented by very simple sets of rules.

u/thecometblast · 20 pointsr/TheRedPill

Some thoughts
One thing that got me thinking was his slide on the how and the why. Basically the chart looks like this:

Advice | Reason |
--------|-----------|
confidence | risk taking |
charisma | social hierarchy |
competence | provisions |
leadership | overall survival |

Talking to a stranger is risk taking. Having good charisma makes you seem higher up on the totem pole. Who gathered the most animals? A big question in women's hypergamous brain is who have the most provisions.

This got me to thinking about how I would develop social confidence? "The most important mark of confidence a man can do is to start a conversation with somebody... approach, approach, approach." (@~34:00)

So I brainstormed:

Advice | Reason | Action|
--------|------|--------|
confidence | risk taking | Approach
charisma | social hierarchy | Work in Bar/Meet Ups/ ...
competence | provisions | Job/Budgeting/Investing/show dangerous side...
leadership | overall survival | Get in Leadership Positions/Volunteer...

How feasible are the actions? Approaching can be done today by going outside, but I am [insert hamstering] and she is [hamstering]....

Here are the books he recommended @~40:18

A Billion Wicked Thoughts: What the Internet Tells Us About Sexual Relationships

Shows what men and women want.
Dataclysm
Date-onomics: How Dating Became a Lopsided Numbers Game
What's the most popular book for women? 50 shades... (a man taking charge is attractive and dominant)

Advice:

Become keen observers of human nature and behavior based on reality. One way is to take walks with your dog, sit at a cafe and eavesdrop on people on dates.

He also recommended getting social history books and getting a book list together. Not sure if the list above is the list or a quick glimpse.

Background:

Man is dying. I saw him on reddit offering free advice and skype sessions before. I thought there may be a catch and I was insecure. Fast forward today I see him on the stage, I wish I have taken up the offer
and am thinking about spending a day with him. Usually never have someone like that in my life, wonder about how a day with him would be like. Crowd in the room are tired and silencing his side jokes, but sometimes the
crowd (or one person) comes alive and responds. I would of been stoic/quiet/beta (on and on) in the audience, but would fantasize about his points. At end no one seem to have questions so he have to probe the audience "anyone want to know about my eye patch?"

questions around @48:00
your pickup line?
charisma and leadership?

etc.

u/notingoodshape · 20 pointsr/dataisbeautiful

If you think OKTrends was cool, you should read Christian Rudder's new book, Dataclysm! It's amazing.

In fact, everyone in this sub would probably be at least somewhat interested in this book.

u/itkovian · 20 pointsr/haskell

I can highly recommend Okasaki's book on data structures: https://www.amazon.com/Purely-Functional-Data-Structures-Okasaki/dp/0521663504, if you are looking for inspiration or techniques.

u/Eager_Question · 19 pointsr/Feminism

Women talking

https://www.sciencedirect.com/science/article/pii/089858989290018R

https://timedotcom.files.wordpress.com/2017/06/d3375-genderandlanguageintheworkplace.pdf

Women going for people roughly their own age

https://www.businessinsider.com/dataclysm-shows-men-are-attracted-to-women-in-their-20s-2014-10

Source book: https://www.amazon.ca/dp/0385347375/ref=sr_1_1?keywords=dataclysm&ie=UTF8&sr=8-1&linkCode=gs2&tag=bica09-20

u/I_am_not_at_work · 19 pointsr/bioinformatics

Download RStudio
Try online tutorials like this, this, here, and this pdf.
R can produce amazingly ugly or beautiful graphs. ggplot2 is my favorite and these books 1,2,3 will give you solid foundation on how to use it.
Are you just interested in RNAseq or ChIPseq? Are you running the entire bioinformatic pipeline from QC through to RPKM/counts generation? This blog post can give you a decent idea on a basic workflow for differential gene expression analysis. Most of that is R and unix based tools. But there is also a lot else out there that you can google and then learn from.
Keep in mind that any error message that you can't figure out has already happened to many other people. A google search will find you a stack overflow or biostars post asking how to solve whatever problem you have encounter. So don't be discourage when you can't figure out something.

u/estiquaatzi · 19 pointsr/ItalyInformatica

La scelta del linguaggio di programmazione dipende molto dal contesto e dalla applicazione specifica. R é ottimo per l'analisi statistica, ma appunto si adatta solo a quello.

Per iniziare, mantenendo una forte connessione con quello che desideri studiare, ti suggerisco python.

Leggi "Python Data Science Handbook: Essential Tools for Working with Data" e "Learning Python"

https://www.amazon.com/Python-Data-Science-Handbook-Essential/dp/1491912057
https://www.amazon.com/Learning-Python-5th-Mark-Lutz/dp/1449355730/Installa python 3.7 tramite Anaconda, ed usa Spyder come editor ( già incluso nella distribuzione ) . In alternativa PyCharm è ottimo ma richiede delle conoscenze più approfondite per l'installazione e la configurazione dell'ambiente e la creazione di ambienti virtuali.
https://www.anaconda.com/distribution/
https://www.jetbrains.com/pycharm/

Impara a manipolare e filtrare dati con Pandas, Numpy, Scipy,
https://pandas.pydata.org
https://www.numpy.org
https://www.scipy.org/

Impara a mostrare dati preacquisiti con matplotlib, seaborn, bokeh,
https://matplotlib.org/
https://seaborn.pydata.org/index.html
https://bokeh.pydata.org/en/latest/

o dataset che possono evolvere rapidamente con pyqtgraph
http://www.pyqtgraph.org

Mi stavo quasi dimenticando, per analisi statistiche uso Statsmodels
http://www.statsmodels.org/stable/index.html

Quanto tempo serve? Non lo so. Io imparo linguaggi e librerie nuove ogni giorno perchè ogni mese disegno qualcosa di nuovo, diverso, migliorato.

Edit: aggiungo un commento finale. Arrivati ad un certo livello di affinità con la programmazione, il linguaggio di programmazione scelto non importa quanto la disponibilità di librerie affidabili, ben documentate, e ben mantenute. Dal mio punto di vista avere buone librerie è essenziale perché creare ex novo una libreria ha un costo {temporale, di skills, di focus, etc..} che spesso preferisco allocare diversamente.

u/Sam_Yagan · 18 pointsr/IAmA

First, look at http://blog.okcupid.com/.

Second look at http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375.

Finally, yes, I think technical backgrounds are probably the most helpful for people in business.

u/edwardkmett · 17 pointsr/programming

Three books that come to mind:

Types And Programming Languages by Benjamin Pierce covers the ins and outs of Damas-Milner-style type inference, and how to build the bulk of a compiler. Moreover, it talks about why certain extensions to type systems yield type systems that are not inferrable, or worse may not terminate. It is very useful in that it helps you shape an understanding to understand what can be done by the compiler.

Purely Functional Data Structures by Chris Okasaki covers how to do things efficiently in a purely functional (lazy or strict) setting and how to reason about asymptotics in that setting. Given the 'functional programming is the way of the future' mindset that pervades the industry, its a good idea to explore and understand how to reason in this way.

Introduction to Algorithms by Cormen et al. covers a ton of imperative algorithms in pretty good detail and serves as a great toolbox for when you aren't sure what tool you need.

That should total out to around $250.

u/joshuaeckroth · 16 pointsr/compsci

To my knowledge, Chris Okasaki made a big impact with this work in this area, and directly influenced Clojure, among other projects.

His book is a great read: http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

It's based on his PhD thesis: https://www.cs.cmu.edu/~rwh/theses/okasaki.pdf

This StackOverflow question addresses what's changed since the late 90's: http://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki

u/45g · 16 pointsr/programming

You are repeating yourself and your position is inconsistent. You acknowledge the usefulness of generic types and functions and yet you claim it is enough to have a fixed set of those. How come there was a need for them in the first place? Why does Go need the generic functionality it already has? What if somebody wants to implement something from Purely Functional Data Structures for example? What if i need a singly linked list? Your point is I should have many of them. One per type. This is plain ridiculous. For one it precludes the possibility of putting this into libraries which is to say it goes directly against re-use. If you fail to see that I feel sorry for you.

u/fernandotakai · 14 pointsr/programming

i've been reading Designing Data-Intensive Applications by Martin Kleppman and i would recommend to all backend developers out there that want to step up their game.

(i also love that it's a language agnostic book)

u/jakevdp · 13 pointsr/Python

You can buy direct from the publisher: http://shop.oreilly.com/product/0636920034919.do

But it's a bit cheaper on Amazon

u/jozefg · 12 pointsr/programming

I'd suggest

Learn You a Haskell For Great Good
Real World Haskell (Though some parts are a bit dated)
Parallel and Concurrent Programming in Haskell
Purely Functional Datastructures

Now past this it's not entirely clear where to go, it's much more based on what you're interested in. For web stuff there's Yesod and it's associated literature. It's also around this time where reading some good Haskell blogs is pretty helpful. In no particular order, some of my favorites are
A Neighbourhood of Infinity
Haskell For All
Yesod/Snoyman's blog
Edward Kmett's stuff on FPComplete
Edward Yang's blog
Lindsey Kuper's blog

And many, many more.

Also, if you discovery type theory is interesting to you, there's a whole host of books to dig into on that, my personal favorite introduction is currently PFPL.

u/parts_of_speech · 12 pointsr/datascience

Hey, DE here with lots of experience, and I was self taught. I can be pretty specific about the subfield and what is necessary to know and not know. In an inversion of the normal path I did a mid career M.Sc in CS so it was kind of amusing to see what was and was not relevant in traditional CS. Prestigious C.S. programs prepare you for an academic career in C.S. theory but the down and dirty of moving and processing data use only a specific subset. You can also get a lot done without the theory for a while.

If I had to transition now, I'd look into a bootcamp program like Insight Data Engineering. At least look at their syllabus. In terms of CS fundamentals... https://teachyourselfcs.com/ offers a list of resources you can use over the years to fill in the blanks. They put you in front of employers, force you to finish a demo project.

Data Engineering is more fundamentally operational in nature that most software engineering You care a lot about things happening reliably across multiple systems, and when using many systems the fragility increases a lot. A typical pipeline can cross a hundred actual computers and 3 or 4 different frameworks.doesn't need a lot of it. (Also I'm doing the inverse transition as you... trying to understand multivariate time series right now)

I have trained jr coders to be come data engineers and I focus a lot on Operating System fundamentals: network, memory, processes. Debugging systems is a different skill set than debugging code, it's often much more I/O centric. It's very useful to be quick on the command line too as you are often shelling in to diagnose what's happening on this computer or that. Checking 'top', 'netstat', grepping through logs. Distributed systems are a pain. Data Eng in production is like 1/4 linux sysadmin.

It's good to be a language polyglot. (python, bash commands, SQL, Java)

Those massive java stack traces are less intimidating when you know that Java's design encourages lots of deep class hierarchies, and every library you import introduces a few layers to the stack trace. But usually the meat and potatoes method you need to look at is at the top of a given thread. Scala is only useful because of Spark, and the level of Scala you need to know for Spark is small compared to the full extent of the language. Mostly you are programatically configuring a computation graph.

Kleppman's book is a great way to skip to relevant things in large system design.

https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321

It's very worth understanding how relational databases work because all the big distributed systems are basically subsets of relational database functionality, compromised for the sake of the distributed-ness. The fundamental concepts of how the data is partitioned, written to disk, caching, indexing, query optimization and transaction handling all apply. Whether the input is SQL or Spark, you are usually generate the same few fundamental operations (google Relational Algebra) and asking the system to execute it the best way it knows how. We face the same data issues now we did in the 70s but at a larger scale.

Keeping up with the framework or storage product fashion show is a lot easier when you have these fundamentals. I used Ramakrishnan, Database Management Systems. But anything that puts you in the position of asking how database systems work from the inside is extremely relevant even for "big data" distributed systems.

https://www.amazon.com/Database-Management-Systems-Raghu-Ramakrishnan/dp/0072465638

I also saw this recently and by the ToC it covers lots of stuff.

https://www.amazon.com/Database-Internals-Deep-Distributed-Systems-ebook/dp/B07XW76VHZ/ref=sr_1_1?keywords=database+internals&qid=1568739274&s=gateway&sr=8-1

But to keep in mind... the designers of these big data systems all had a thorough grounding in the issues of single node relational databases systems. It's very clarifying to see things through that lens.

u/ManHuman · 12 pointsr/UofT

If you want to a job upon graduation, you need the following items:

Work experience. No work experience, no job upon graduation. Sucks, right? But that's a fact. Try to get as many internships as possible.
Languages: Python (fucking hot right now; NumPy, Pandas, TensorFlow), SQL (you need to know this as the back of your hand), R, and SAS (maybe, depends from the employer; from what I have heard, SAS is dying out).
Now, let's talk about cherry on top. Few things that may really spice up your resume are TA and research opportunities. Additionally, it would be nice to have some independent projects, e.g. Time Series analysis of the Toronto housing market.

The problem with the Stats degree is that it is heavily theoretical. So, in order to balance it out, you need to get experience. Overall, I liked my experience with Stats, although I wish I spend more time on internships.

To summarize: work experience, programming, research.

Also, Machine Learning is hot right now. Pick up some books such as:
Hands-On Machine Learning with ScikitLearn and TensorFlow
Python for Data Analysis
Python Data Science Handbook.

Lastly, you gotta network like your life depends on it. Meetup.com and eventbrite.come have some pretty good Data Science/ML/Programming networking events where you can make connections and learn about the industry demands. Additionally, leverage the power of LinkedIn; create your profile and start asking people out for coffee in order to learn what they do, how they do it, what tools they use and for you to gain insight into the market demands and what you can expect upon graduation.

May Central Limit Theorem work with all your distributions.

Also, another thing that seems to be hot in financial markets is Risk Management. I would suggest you speaking with the Stats profs or Risk Management profs from Rotman in order to understand how you can leverage your Stats degree in Risk Management. Fantastic, here is one of the first things you can do for networking. Fuck, I wish I was back in uni.

Sorry, just remembered. Hadoop is also pretty important as is Tableau (for data visualization).

Ah, yes, experience. I don't know whether you spent the last part of 2017 and early part of 2018 on searching for internships. If not, keep searching you still have a slight chance to find some for this summer. Indeed and LinkedIn are pretty good sources. Lastly, try reaching out to recruiters from various organizations in order to learn if they have anything available. Now, if you don't find anything at all, like AT ALL, I would suggest either you take summer school and start looking for internships during either the Fall or Summer semesters OR contact the temp agencies to see what opportunities they have. Some opportunities may not be related to what you studied, but at least they will give you some work experience and your resume will not look as empty as it does now. Also, if I am correct, then U of T should have an alumni database. Try going through that database, find the alumni of interest, reach out to them, and ask them out for coffee to learn more about what they do and if they have anything available. Tick tock, tick tock.

After some googling, indeed

How am I doing? I am depressed man, I am fucking depressed. But, TensorFlow is keeping me awake.

u/queensnake · 11 pointsr/programming

Thanks much, I appreciate that.

edit: Note, all, the one book I've seen on this.

u/cabbagerat · 10 pointsr/compsci

Start with a good algorithms book like Introduction to algorithms. You'll also want a good discrete math text. Concrete Mathematics is one that I like, but there are several great alternatives. If you are learning new math, pick up The Princeton Companion To Mathematics, which is a great reference to have around if you find yourself with a gap in your knowledge. Not a seminal text in theoretical CS, but certain to expand your mind, is Purely functional data structures.

On the practice side, pick up a copy of The C programming language. Not only is K&R a classic text, and a great read, it really set the tone for the way that programming has been taught and learned ever since. I also highly recommend Elements of Programming.

Also, since you mention Papadimitriou, take a look at Logicomix.

u/rieux · 10 pointsr/programming

Professor of computer science at U.S. Military Academy, probably best known for his book Purely Functional Data Structures.

u/[deleted] · 10 pointsr/programming

Reads like someone that read half of SQL For Smarties, and didn't understand that half.

u/AerieC · 10 pointsr/ExperiencedDevs

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann.

Amazing book for anyone who works on (or wants to work on) large scale applications.

u/dnew · 9 pointsr/google

> how to use google to solve a programming problem

You can't. You have to figure out how to solve the problem yourself. Then you use Google to look up individual pieces of that.

In other words, you have to go "Well, I need to open the file, then read it line by line, find the first opening brace, find the last closing brace, and extract the piece of the string between those two braces, then print that out."

How do I open a file? I can google that.

How do I find the opening brace? I can google that.

How do I chop out the middle of a string into a new string? I can google that.

See what I mean?

> CS textbooks in general just aren't as well written

Not any more. People just generally don't give a shit, I've found. I've learned numerous programming languages by reading the manual for the compiler in older times. Nowadays, you're lucky if there's even a formal spec of the syntax of the language, let alone a complete readable manual. The "Ruby on Rails" text that seems to be the authoritative text is full of stuff like "this routine seems to do ...." meaning the guy writing it doesn't actually know, and didn't bother to read the source code to figure it out for sure.

However, the good news is that the classic books full of the knowledge that does not become outdated are actually very well written. Start with some of Knuth's texts (https://en.wikipedia.org/wiki/The_Art_of_Computer_Programming), Date's book on SQL and relational models (http://www.amazon.com/Introduction-Database-Systems-8th/dp/0321197844), Bertrand Meyer on OOP (http://www.amazon.com/Object-Oriented-Software-Construction-Book-CD-ROM/dp/0136291554) and so on. (That last is even available as a PDF floating around.)

> some of the knowledge you gain could become potentially outdated in the future

Everything that you could look up on Google will be outdated in about five years. The stuff about how computers work, how to solve problems, etc never gets outdated.

On the other hand, it's one of the few jobs where you can take a job to do X and start working on it without any idea of how to do X. I've been programming almost 40 years and I've never taken a job that I knew how to do when I took the job.

u/Big-Red-Shirts · 9 pointsr/MGTOW

It's not a conspiracy guys.

The old blog posts and research projects were mostly from one of the cofounders - Christian Rudder. Who has a BS in Mathematics from Harvard.

OkCupid was sold in 2011. (Though I think Christian Rudder did occasionally still post there.)

He went on to compile and expand on the themes from his OkCupid blog posts, in his book "Dataclysm."

https://www.amazon.com/dp/0385347391

Check it out. It's a fun read.

u/atium_ · 9 pointsr/haskell

Not what you are asking for really, but you'll get better with experience.

Take a few imperative algorithms and convert them over.
Solve some problems on HackerRank. Do it your way, afterwards compare your solution with some of the other Haskell solutions.

Some functional algorithms and data structures are done very differently. Chris Okasaki has a book Purely Functional Data Structures that covers some (though its for ML)

There are papers/articles on topics such as Functional Binomial Queues and Hinze has got a paper on Priority Search Queues that also covers an implementation of Dijkstra and Prims.

The Haskell Wiki has got a page listing functional pearls. Maybe also take a look at how dynamic programming and such paradigms are done functionally.

For most algorithms you can write it in a imperative manner and use mutation and looping constructs, if you have to. But you aren't going to find some guide to convert any algorithm into idiomatic Haskell. Some functional implementations require you to think differently.

u/japple · 9 pointsr/programming

Chris Okasaki's Purely Functional Data Structures

u/ceolceol · 9 pointsr/PHP

In this order:

u/stephencorona · 9 pointsr/PHP

Your database is (most likely) going to be your bottleneck.

Get High Performance mySQL - it's a great book that'll help you make smart design decisions. Make sure you get the 2008 edition and not the 2004 edition.

Check out Percona Server, a more performant fork of mySQL 5.1 server.

Depending on your Database load, you should look into mySQL Master/Slave Replication. Maatkit has lots of useful tools for monitoring/backing up mySQL Replication.

Like some other commenters have mentioned, check out Memcached. Make sure you use pecl-memcached for accessing Memcached from PHP. There is another extension, pecl-memcache, but it's missing CAS token support, which helps with concurrency.

Make sure you build pecl-memcached with the latest version of libmemcached. Older versions have a nasty bug that ignore timeouts, so if your memcached server goes down, your PHP processes will wait forever.

Use Nginx and PHP-FPM (it comes with the latest version of PHP). Nginx uses so much less memory than Apache. I have a Nginx Server that is pushing 250mbps of traffic and uses less than 200MB of memory. PHP-FPM uses (about) the same amount of memory as mod_php, but gives you way more control.

You also might want to check out the various NoSQL databases and see if any of them fit your work-load. It sounds like Redis might be a useful tool for you, but YMMV- you'll have to check it out and see if it fits your needs.

u/CowboyFromSmell · 9 pointsr/compsci

Designing Data Intensive Applications by Martin Kleppmann is a solid overview of the field and gives you plenty more references for further investigation. It starts on singe-host databases and expands out to all kinds of distributed systems. Starting on single host systems is important because it helps you appreciate the designs of the distributed systems that replaced them.

Edit: markdown is hard

u/SupportVectorMachine · 9 pointsr/deeplearning

Not OP, but among those he listed, I think Chollet's book is the best combination of practical, code-based content and genuinely valuable insights from a practitioner. Its examples are all in the Keras framework, which Chollet developed as a high-level API to sit on top of a number of possible DL libraries. But with TensorFlow 2.0, the Keras API is now fundamental to how you would write code in this pretty dominant framework. It's also a very well-written book.

Ordinarily, I resist books that are too focused on one framework over another. I'd never personally want a DL book in Java, for instance. But I think Chollet's book is good enough to recommend regardless of the platform you intend to use, although it will certainly be more immediately useful if you are working with tf.Keras.

u/sleepingsquirrel · 9 pointsr/ECE

Here's some lighter reading I've liked:

Analog Circuit Design: Art, Science and Personalities
The Chip : How Two Americans Invented the Microchip and Launched a Revolution
Gödel, Escher, Bach: An Eternal Golden Braid
Feynamn Lectures on Computation

Readable, but more hardcore

Hardware:
The Design of CMOS Radio-Frequency Integrated Circuits, Second Edition

Mathy:
A New Kind of Science. I liked this book a lot, it is worth the price just for the interesting pictures. Make up your mind for yourself, and don't let negative reviewers unduly influence you.
Probability Theory: The Logic of Science
Information Theory, Inference and Learning Algorithms

Programmy:
Mastering Regular Expressions

Unixy:
Unix Power Tools

Off the wall:
Molecular Biology of the Cell

u/ThisIsMyJetPackWHEEE · 8 pointsr/comics

The posts died down when the author behind them started working on his book. It's pretty great, and is basically the same stuff. Check it out.

"Dataclysm"
http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375

u/okcukv · 8 pointsr/LadiesofScience

Excel does not make publication quality graphics. I recommend Matlab or matplotlib (python) whenever I review papers with Excel figures in them.

> How did you learn the best way to organize and present your data in your publications?

Cleveland's book is a good start. Although he is is maybe a little too austere. But in general, better to have too little ink than too much.

u/ablakok · 7 pointsr/programming

Introduction to Database Systems by Chris Date. It's a solid introduction to the theory and to practical issues. The author helped develop the relational database model at IBM with E. F. Codd. He's a hard-line advocate for the relational model. Some people think he's too hard-line. It will help you make database enemies because you'll think you know more than other people do.

u/Calabast · 7 pointsr/dataisbeautiful

(And also Dataclysm, the book this article is based on, and where the graphics came from. I know I know, the article very clearly mentions the book, but for people who don't click your link, I want them to at least see its name.)

u/5m20s147 · 7 pointsr/MGTOW

There is a book about it: https://www.amazon.com.br/Dataclysm-Identity-What-Online-Offline-Selves/dp/0385347391

u/froggyenterprisesltd · 7 pointsr/statistics

I'm not a design expert, but I do know that just because Nate uses Excel himself doesn't mean that he's the guy generating these plots. I'm fairly certain that most of the journalists putting these together are using ggplot from R or python.

If you're interested in exact replicas, your language can do 80% of the heavy lifting by giving you the bones of the structure. But to really bring it home, you need a program like Inkspace or Illustrator to polish these up.

I don't think there's any language now that effectively uses good design sensibilities. This is discussed a bit in the book Visualize This by Nathan Yau.

For most people, it looks like the python / R tutorials listed here should get the job done.

edit: a word

u/bhrgunatha · 7 pointsr/compsci

This isn't criticism or a judgment, but that sounds like an odd request. If you've really absorbed what's in CLRS, I would imagine you could just research those data structures yourself and, for example, look at some open source implementations.

Or research what's in other Data Structures and Algorithms books and read up on them.

Having said that - there is an MIT course on advanced data structures.

I also enjoyed Chris Okasaki's Purely Functional Data Structures

There are 2 Coursera courses in particular - Princeton University's Algorithms Part I and Algorithms Part II - they've provided a web site for their book where lots of algorithms and data structures are implemented using Java with the libraries and source code freely available.

u/snatchinvader · 7 pointsr/haskell

A good book describing similar techniques for designing and implementing efficient data structures with lazy evaluation is Purely Functional Data Structures.

u/AlSweigart · 7 pointsr/learnpython

Yeah, the course follows most of the book's content, though there are some chapters that the course doesn't cover. But it's a nice supplement regardless.

I don't really know of any follow up material off the top of my head. I'd recommend learning about version control (like git) and can recommend the free books Version Control by Example and Pro Git. Other than that, I've noticed that Data Science from Scratch is doing very well on Amazon, so you might want to check that out.

u/666f6f626172 · 7 pointsr/datascience

I doubt any courses you take would spend more than a day on the basics of a language. That's something you need to learn on your own. What's your background like? It sounds like you don't have much programming experience, so perhaps start with this. Then maybe this for learning numpy, pandas, and matplotlib.

EDIT: Didn't realize you were still in high school. I don't believe there's a specific data science undergrad program anywhere, but any STEM undergrad program will probably include an introductory programming course.

u/GetMedievalOnYourAss · 7 pointsr/MGTOW

FYI, this graph is from a book called Dataclysm. It was written by one of the co-founders of OKCupid. He also happens to have a degree in data analysis. He collected all kinds of bizarre and abstract data from the users of his website to gain insight into their TRUE desires. The book is revealing because it's basically showing what humans REALLY WANT when they think no one is watching. Plenty of Red Pill graphs and statistics in that book. I'm surprised Feminists haven't banned it.

u/zerro_4 · 6 pointsr/iamverysmart

Actually, his point about Asian men isn't completely wrong.

http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375

Asian mean are ranked as the least attractive, at least through the data accumulated through OKCupid.

u/kqr · 6 pointsr/haskell

This is a completely unhelpful answer, but if you're looking to get to know the things you listed under not comfortable, there is

u/th3_gibs0n · 6 pointsr/datascience

Data Engineering is different everywhere and task dependent. The best advice I can give is have SQL be your second language. Then depending on your role or daily tasks you would be looking at extra materials.

General Insightful Reads:

u/PM_me_goat_gifs · 6 pointsr/cscareerquestions

> scalability was a rare issue

Designing Data-Intensive Applications is a great book. Get yourself into some good personal habits, learn to cook efficiently, find a good gym near your new job, and spend some time sitting in the park reading that book.

u/Mofo_Turtles · 6 pointsr/cscareerquestions

This book is a very good for Distributed Systems at a high level.

u/terrorobe · 6 pointsr/PostgreSQL

By now already dated but a good top-to-bottom introduction into Postgres in the real world is PostgreSQL 9.0 High Performance.

Most of the things Postgres does is exposed via system tables & views - for example pg_stat_activity & pg_locks.

The rest of the documentation is great as well, give it a read.

If you are new to system administration & architecture, you may want to put Designing data intensive applications on your shopping list as well to broaden your horizon.

If you have Postgres-specific questions you can ask them here or reach out to the community.

edit: fixed links

u/sayubuntu · 6 pointsr/learnpython

After you finish that and are comfortable in python check out Python Data Science Handbook. I am not a data analyst, I am a PhD student doing research in fields that generate/require a lot of data.

The handbook goes over pythons numpy package and then gets into pandas. Pandas should be the tool you want to learn. Under the hood it uses numpy a lot so don’t skip the first half. Numpy implements a lot of matrix operations in FORTRAN/C if you use it properly (avoid loops when possible) it is incredibly efficient on large datasets.

While you are learning python I highly reccomend using jupyter lab.

Good luck!

u/friend_in_rome · 6 pointsr/learnpython

Python Data Science Handbook is awesome. Doesn't cover Scikit-Lean, but it covers Pandas (which inherently means Numpy), and some visualization stuff too.

u/piss_n_boots · 6 pointsr/learnpython

I bought this and have just begun reading it. seems good so far although in the first example they're pulling in Flask and some other stuff which, to me, made the code harder to think through.

be aware, the book is really short -- about 100 pages. (I don't have it on hand to confirm and I'm not convinced it's exactly 100 as Amazon says.) I didn't realize that and thought the price (compared to others I was considering) was good -- then it arrived... still, I think it's a good intro.

u/ianblu1 · 5 pointsr/datascience

I usually recommend this book for this sort of problem: https://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X

In it you'll get your feet wet with respect to basic python and be exposed to how you would implement some core algorithms from scratch. Once you know that it should be relatively straightforward to move to the higher level libraries.

It's important to note that there aren't really "equivalent functions" mapping R to python. This is because R and python optimize for different things. R is a declarative analysis language- you tell it what you want it to do, not how to do it. Python is a full featured programming language also used for software development, so it supports many different paradigms (OO, functional, etc.). There are component libraries such as sklearn that implement declarative apis that will let you say things like "fit a model with these characteristics" or pandas that lets you say things like "what is the average of value in all of these columns". But in general python itself doesn't really work that way. You build things bottoms up.

u/ionforge · 5 pointsr/programming

If you want to read more about the topic, this book just got released https://www.amazon.com/gp/aw/d/B06XPJML5D/ref=ya_aw_dod_pi?ie=UTF8&amp;psc=1 and it is pretty much a must read for back end developers.

u/ToAskMoreQuestions · 4 pointsr/datascience

Check out Dataclysm by Christian Rudder.

u/ChefJoe98136 · 4 pointsr/SeattleWA

Hrm, that's from this book ? Another way to put it is that women are conditioned to respond that guys are generally not attractive based on a photo whereas men give a more broad distribution that encompasses a full scale ("she's a perfect ten"). There's also quite an industry around giving women makeup and a rigorous education about how to make themselves photograph/appear more attractive/cover flaws that most guys (at least those who would be evaluated by women) aren't exactly indoctrinated with.

https://www.amazon.com/gp/product/0385347391

u/iacobus42 · 4 pointsr/statistics

Anything by Tufte and the Flowing Data book and blog are great starting places. Tufte is more theory driven, for lack of a better term, while the Flowing Data sources have more "worked" examples (with R, Python, etc).

It would be worth learning ggplot2 as well if you are interested in data visualization as that seems to be the current "standard" tool. Hadley Wickham's website and UseR book on ggplot2 are great places to start.

Relatedly, Wickham's PhD thesis is all about tools and strategies for data visualization and can be found for free on his website. There is also an hour long seminar and slides to go with the paper.

u/killingRadio · 4 pointsr/datascience

Get a copy of Visualize This. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics https://www.amazon.com/dp/0470944889/ref=cm_sw_r_cp_api_jA9KzbF2FGX6M

u/RedDeckWins · 4 pointsr/Clojure

I would highly recommend reading Purely Functional Data Structures.

Right now it only works with numbers. If you utilized compare you could make it more generalized.

u/PM_ME_UR_OBSIDIAN · 4 pointsr/programming

You'll find your answers in this book. Great both as a tutorial and a reference.

The TL;DR version is that with a bit of cleverness you can use redundancy in your data structures to save time and memory. For example, naively implementing a purely functional stack is easy peasy. Just take an immutable linked list; all stack operations are O(1) time and space.

u/ensiferous · 4 pointsr/PHP

You don't know even half of MySQL until you you've read: http://www.amazon.com/gp/offer-listing/0596101716/ref=tmm_pap_new_olp_sr?ie=UTF8&amp;condition=new

u/Rylick · 4 pointsr/de

So doof es klingt: einfach machen. Learning by doing ist die beste Art und Weise R zu lernen. Am Anfang ist es recht schwer, am Ende dann aber wesentlich einfacher und schneller als Point&Klick Programme wie SPSS.
Ich finde das R Cookbook ganz gut um ein Nachschlagewerk zu haben. http://www.amazon.de/Cookbook-OReilly-Cookbooks-Paul-Teetor/dp/0596809158

u/ansalonhistorian · 4 pointsr/DistributedSystems

If you want to learn it systematically, consider the following:

The popular DDIA book: Designing Data-Intensive Applications gives you some insights into data systems, which are the main reason why people study those difficult distributed theories.

The underestimated textbook: Distributed Systems: An Algorithmic Approach shows you the reasoning behind the scene and gives you a taste of the algorithms used in distributed systems.

When you think it's finally over: Distributed Algorithms talks about the system models and algorithms in a more formal way.

u/_a9o_ · 4 pointsr/cscareerquestions

If you're doing backend/server side work, there's no better book than:
Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems https://www.amazon.com/dp/1449373321/ref=cm_sw_r_cp_apa_h5nPBbZ9ZAWG9

In terms of learning what it takes to level up, I highly recommend the following books:
The Senior Software Engineer: 11 Practices of an Effective Technical Leader https://www.amazon.com/dp/0990702804/ref=cm_sw_r_cp_apa_o6nPBbVY8XDM9

The Effective Engineer: How to Leverage Your Efforts In Software Engineering to Make a Disproportionate and Meaningful Impact https://www.amazon.com/dp/0996128107/ref=cm_sw_r_cp_apa_n7nPBbB1ZDP2H

u/kambabamba · 4 pointsr/cscareerquestions

https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321

u/_pml · 4 pointsr/MachineLearning

The best chapters are the ones where he covers the ML method from scratch (like ANN). The ones that start with scikit-learn are OK, but you are really learning the scikit-learn API. The code layout is not nearly as good as O'Reilly books. His coding style leaves something to be desired (OO and mutations everywhere). As an alternative, I'd recommend the O'Reilly book: "Data Science From Scratch" by Joel Grus
https://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1467497329&amp;sr=1-1&amp;keywords=data+science+from+scratch
which covers every techniques from 'scratch.' His coding style is much better. Disadvantage is that all the routines are written in pure Python (slow).

u/KeepingItClassy11 · 4 pointsr/learnpython

I don't love the Dummies books for technical subjects; O'Reilly books are far superior. Their Python Data Science Handbook by Jake VanderPlas is worth its weight in gold, IMO.

u/OverQualifried · 4 pointsr/Python

I'm freshening up on Python for work, and these are my materials:

Mastering Python Design Patterns https://www.amazon.com/dp/1783989327/ref=cm_sw_r_cp_awd_kiCKwbSP5AQ1M

Learning Python Design Patterns https://www.amazon.com/dp/1783283378/ref=cm_sw_r_cp_awd_BiCKwbGT2FA1Z

Fluent Python https://www.amazon.com/dp/1491946008/ref=cm_sw_r_cp_awd_WiCKwbQ2MK9N

Design Patterns: Elements of Reusable Object-Oriented Software https://www.amazon.com/dp/0201633612/ref=cm_sw_r_cp_awd_fjCKwb5JQA3KG

I recommend them to OP.

u/ircmaxell · 4 pointsr/PHP

I'd strongly suggest that you get the book SQL Antipatterns.

Specifically Polymorphic Associations starting on slide 32. It's detailed in the book, but the slide gives you some good information.

Basically, solution #3 where you use a base parent table. Store the content, title and date in a common "content" table, then store the content-specific information in sub-tables.

u/jumb1 · 3 pointsr/AskMen

They also released a book called Dataclysm which sounds interesting. I've bought the book, but have yet to read it.

u/trastevere · 3 pointsr/OkCupid

From what I know, they'd like to; their author has spent the last 3 years making an update to them and compiling the results into an independent book.

I'm sure some variant will return at some point.

u/NoFunInBand · 3 pointsr/AskWomen

In my opinion this is one of the most interesting blogs on the Internet, so I'm going to plug the guy's book that just came out last week. I'm halfway through it, and basically humanity is terrible.

u/PunnyPenguins · 3 pointsr/OkCupid

According to Christian Rudder, it's one of the most common phrases for white peoples' profiles.

u/80_20 · 3 pointsr/PurplePillDebate

It is from the okcupid book called "Dataclysm: Who We Are When We Think No One's Looking" by Christian Rudder, founder of okcupid and mathematics graduate of Harvard. It is a New York Times bestseller.

Since you are obviously interested in relationships, you should read it. It gives insight like we've never seen before because it is based off of "big data".

I am not a part of the red pill thing. I rarely go there. I consider myself more of an incel advocate myself. (I'm not incel myself though)

u/the-capitan · 3 pointsr/RedPillWomen

right. they peak between 20 and 24. (that's actual data from dataclysm). the point of mentioning 25 is that virtually all women are on the downslide by then.

u/davomyster · 3 pointsr/dataisbeautiful

I agree that these data aren't nearly as interesting as the old posts but you're comparing two different blogs. The old one with all of the detailed insight was written by one of the company founders, Christian Rudder, who wrote an entire book on the subject. You seem like you're really into the deep data analytics side of things and if you or anyone else who loved the old style of posts hasn't read it, I highly recommend it: http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375

That blog was called OKTrends. It looks like it was last updated in 2014, the same year Match.com bought out OKCupid. Maybe Rudder didn't stick around to write blog posts anymore, I'm not sure, but this new blog we're all commenting about is called "The Deep End" so I suspect Rudder didn't write it.

Also, what makes any of you think that this simpler, less in-depth blog post has anything to do with a weakening of their matching algorithm in favor of more "folk wisdom and religion"? It's just a blog post.

u/ohmsnap · 3 pointsr/Cyberpunk

My guess is that there is more intentionally sexual art of women, and while that fact alone wouldn't make the case for it being sexist stick, there can definitely be too much of it and it could be the result of an underlying issue.

There are 77 pictures in this photoset, and pretty much all of them reinforce that "young and attractive" type that men of nearly every age idealize. Here's the women for comparison. At the very least, there's what appears to be an imbalance. Source of data

Most of the users on the subreddit are consumers, though. I think this being a conversation amonst content creators would be a pretty good idea.

Edit: parent comment added additional research, neat.

u/whattodo-whattodo · 3 pointsr/dating

This has to be a joke.

The book Dataclysm shows statistics collected from online dating sites. As you can see the chart on the right shows which ages are most attractive to men as they age. Now it's horribly skewed because all of us men are stupid. But, it shows that a 28 year old guy is MOST interested in a girl your age.

So where can you find one? Anywhere. All of us. Just pick one! ;-p

u/Yorian_Dates · 3 pointsr/MGTOW

if want more depth, I recommend the book where the info came from:

https://www.amazon.com/Dataclysm-Identity-What-Online-Offline-Selves/dp/0385347391

I found the book in the most useful Internet website after pornhub: libgen.io

Someone posted this book here months ago. The books is written by the (co-)founder of Okcupid himself. He shows with numbers and statistics what we've talked about here for years.

u/jorgegalepos · 3 pointsr/datavisualization

Nathan Yau, Visualize this

Dona Wong, The Wall Street Journal Guide to Information Graphics: The Dos and Don'ts of Presenting Data, Facts, and Figures

u/_Paxifist_ · 3 pointsr/datascience

http://www.amazon.com/Visualize-This-FlowingData-Visualization-Statistics/dp/0470944889

Took a data viz class last year. This was the textbook. Nathan yau's website flowing data is a good resource as well. Also check out d3.js for an advanced/flexible technology for visualizing data.

u/yeahbutbut · 3 pointsr/programming

> If you can spare the ram and computing time, sure. This also exists in OOP under the name of Memento pattern but is hardly ever applied because of how slow it can be with big data sets.

The advantage with immutable data structures is that your "modifications" are stored as a delta from the original so the memory requirements are fairly low. [0][1] You probably would have plenty of ram to spare.

>`How do you write the following in FP, with a single stack

(def graph (atom #{ #_"vertices go here"}))
(def stack (atom (list)))

(let [some-value 42.0]
(def my-command {:do (fn [graph] (map #(merge %1 {:length (+ (:length %1) some-value)} graph)
:undo (fn [graph] (map #(merge %1 {:length (- (:length %1) some-value)} graph)})

(defn apply-command [cmd]
;; replace the graph with a version mutated by do
(swap! graph (:do cmd))
;; put the undo function on the stack
(swap! stack conj (partial swap! graph (:undo cmd))))

(defn undo-last []
(swap! stack
(fn [stack]
;; run the undo fn
((first stack))
;; return the stack sans the top element
(rest stack))))

(apply-command my-command)
(clojure.pprint/pprint @graph)
(undo-last)
(clojure.pprint/pprint @graph)

But you probably wouldn't have the graph as a global atom, someValue would be injected into the command, etc, etc.

[0] https://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504/ref=sr_1_1?ie=UTF8&amp;qid=1493603954&amp;sr=8-1&amp;keywords=purely+functional+data+structures

[1] http://blog.higher-order.net/2009/09/08/understanding-clojures-persistenthashmap-deftwice.html

Edit: formatting, do was + undo was - in the original, add usage at the end

u/panicClark · 3 pointsr/ItalyInformatica

Io lavoro come sviluppatore ormai da diversi anni, anch'io non laureato (o meglio, laureato lo sarei, ma in un ambito piuttosto distante dall'informatica).

Le difficoltà maggiori all'inizio le ho incontrate quando si trattava di andare un pelino oltre al "giocare col lego" con linguaggi e framework (rigorosamente di alto livello): i fondamentali di come funzionano le reti e i protocolli, le strutture dati e gli algoritmi. Il primo ambito sto ancora cercando di approfondirlo bene, per strutture dati e algoritmi all'epoca mi consigliarono Introduction to Algorithms e devo dire che mi ci sono trovato abbastanza bene, seppure l'ho trovato noioso da seguire.

Mi è tornato relativamente più utile approfondire i linguaggi funzionali. Il classico in tal senso è Purely Functional Data Structures, ma a me è piaciuto di più Functional Programming in Scala.

u/gfixler · 3 pointsr/haskell

>When working in Java you just need to embrace it.

Haha. Agreed. When you're a hostage, just do what they say, and live to fight another day.

>...showed me how it's supposed to be done.

I've tried to see how it's supposed to be done many times, but it's just a broken abstraction for me. If I want to turn off a light, I flip the switch to off. In OOP, I'm supposed to create a Light class to hold the state of everything related to the light, then accessor methods with access control levels set up just so to protect me from the world, in case anyone wants to make something based on my whole lighting setup. Then I need to create nouns to shepherd my verbs around, like LightSwitchToggleAccessor, and worry about interfaces and implementations and design patterns.

In Haskell I'd say "A light can just be on or off; let's make it an alias for a boolean."

type Light = Bool

I want to be able to turn it on and off; that's just a morphism from Light state to Light state.

toggleLight :: Light -> Light
toggleLight = not

And that's it. If I realize later that I don't want Light and Bool to be interchangeable, I'd just make Light it's own type with a simple tweak to give it its own two states:

data Light = Lit | Unlit

And change the toggle to match:

toggleLight :: Light -> Light
toggleLight Lit = Unlit
toggleLight Unlit = Lit

Then I could toggle a big list of lights:

map toggleLight [light1, light2, mainLight, ...]

Or turn them all on:

map (const Lit) [light1, light2, ...]

I have equational reasoning. I can do like-for-like transformations. I get all the goodness of category theoretic abstractions, giving me reusability the likes of which I've never seen in OOP (not even close). Etc.

>objects are closures

Closures are immutable (hence the glory of this). Objects tend to be mutable, which is a nightmare (every day where I work in C#).

>try to keep as much stuff pure as possible

But you just have no way of knowing what's pure and what isn't in any of the OOP environments I've seen, and it is so obvious in C# at work; it plagues us constantly - new bugs daily, and projects always slow tremendously as they grow, and things become unchangeable, because they're too ossified. Just that small thing, that need to specify effects in your types, makes it so much easier to reason about what actually goes on in a function. For example, my Lights up there actually can't do anything in the world. I know that because of their "Light -> Light" types. All they can do is tweak data, the same way every single time they're called - you can replace them with table lookups. They'd have to get some kind of IO markup in their types before they could change anything, which is part of that equational, deterministic reasoning that makes FP so easy to understand, even as projects grow.

I don't want to try to do things. I want it to be fun to do what's good, and impossible to do what's bad. The goal of a great type system is to "make illegal states impossible to represent." I made it impossible to mess with the world, and so I can know with 100% certainty what toggleLights does. I quite literally cannot know what the same function would do in C#. It could return a different result every time. Multiply that up to a few 100klocs, and I have no idea how our projects work, and no idea what I'm breaking when I push commits (and I often break things, and everyone else constantly breaks my stuff, because we can't properly reason about anything).

u/Wegener · 3 pointsr/algotrading

Right now I'm reading The Art of R Programming. It seems like it has a lot of good knowledge but also seems really disorganized. The author uses control statements without explanation in the 2nd chapter about vectors to demonstrate their ability, and then doesn't get back to control statements until chapter 7. But being a seasoned programmer I don't think things like that will bother you too much. This is the only R book I've used, so my opinion isn't very broad based. The reviews for R Cookbook seem pretty good and I'm a little sorry I didn't start with that instead.

Hopefully someone else can chime in.

u/jcukier · 3 pointsr/DataVizRequests

1 book by far is Andy Kirk’s. Data Visualisation: A Handbook for Data Driven Design https://www.amazon.com/dp/1526468921/ref=cm_sw_r_cp_api_i_rjx3DbDVRPFDN

It’s very broad and accessible yet substantial. That’s the book I recommend to anyone who need to read just one book.

2 is RJ Andrews book Info We Trust: How to Inspire the World with Data https://www.amazon.com/dp/1119483891/ref=cm_sw_r_cp_api_i_gmx3Db0FDG9DC.

This is a wonderful book that I read as an ode to visualization as a medium. It’s more artistic than Andy’s book both in its topic and its execution.

3 book depends on your specific interest. Dashboards/tableau? https://www.amazon.com/big-book-dashboard/s?k=big+book+of+dashboard.

Data art? https://www.amazon.com/dear-data-book/s?k=dear+data+book

Data journalism/ storytelling? Data-Driven Storytelling (AK Peters Visualization Series) https://www.amazon.com/dp/B07CCZPKV3/ref=cm_sw_r_cp_api_i_Msx3DbF1GZMG8

Science of visualization? https://www.amazon.com/Information-Visualization-Perception-Interactive-Technologies/dp/0123814642

Visualization from an academic point of view? https://www.amazon.com/Visualization-Analysis-Design-AK-Peters/dp/1466508914

D3js? https://www.amazon.com/Interactive-Data-Visualization-Web-Introduction/dp/1449339735

u/PM_ME_YOUR_DOOTFILES · 3 pointsr/programming

> Data Intensive systems book

Are you referring to this book? Seems like a good book according to Amazon.

u/forever_i_b_stangin · 3 pointsr/webdev

I strongly recommend this book: https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321

u/Spawnbroker · 3 pointsr/ExperiencedDevs

If you really want to push the envelope on TC, especially as a more experienced dev, you're going to need to ace the system design interview(s).

I'm still learning this myself, but a good book you might want to check out is Designing Data-Intensive Applications. I've also heard good things about Grokking the System Design Interview.

Good luck! I'm going through the studying process as well, it's brutal.

u/FullOfEnnui · 3 pointsr/cscareerquestions

A few books that I have been going through: Programming Interviews Exposed, The Algorithm Design Manual, and Designing Data-Intensive Applications.

u/ryfm · 3 pointsr/compsci

this one is good https://amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321/

u/wyzaard · 3 pointsr/IOPsychology

If your calculus needs brushing up then I am guessing that you will probably benefit from putting some effort into linear algebra too. Just a guess though.

The Sage Hanbook of Quantitative Methods in Psychology is aimed at advanced graduate students and working researchers. The Oxford Handbook of Quantitative Methods in Psychology, Volume 1 and Volume 2 is even more comprehensive with Volume 1 covering some more philosophical topics not covered in the Sage Handbook.

An introduction to programming and computer science like this one (there are many others) is probably a good idea. You can also jump straight into a basic introduction to data science like Data Science from Scratch: First Principles with Python. The author can be amusing. Consider the quote in the preface:

> "There is a healthy debate raging over the best language for learning data science. Many people believe it’s the statistical programming language R. (We call those people wrong.) A few people suggest Java or Scala. However, in my opinion, Python is the obvious choice."

u/_starbelly · 3 pointsr/guitarpedals

Thanks! I can't wait to slay this beast. I timed my defense such that I could go let it all out at a Power Trip show a few days later, haha.

Python seems pretty intuitive to me in my initial tinkering; I also come from a Matlab/R background. I'll definitely check out pandas and scikit-learn! Do you have any suggestions for resources to efficiently learn Python? I'm working on Data Science From Scratch right now.

I have a friend who recently graduated from my same program and is now working as a data scientist at a financial startup in CA. He said the exact same things. I can't wait to make more than just slave wages....

One more question: Any recommendations for an R Studio-like IDE for Python in OSX?

u/core_dumpd · 3 pointsr/datascience

Jose Portilla on Udemy has some good python based courses (and also frequents this subreddit). There's regularly sales or some sort of coupon code available to get any of the courses for $10-$15, so it's very reasonable.

For books:

https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1491957662/ref=asap_bc?ie=UTF8 ... it's not out yet, but due any day. You can also get preview access on sites like Safari Online (which would also have all the books below).

https://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X/ref=sr_1_1

For general python:

https://www.amazon.com/Python-Crash-Course-Hands-Project-Based/dp/1593276036/ref=sr_1_1

https://www.amazon.com/Automate-Boring-Stuff-Python-Programming/dp/1593275994/ref=sr_1_1

No Starch Press, OReilly, APress and Manning generally have pretty good quality publications. I'd usually skip anything from Packt, unless it's specifically received good reviews.

u/zzzeek · 3 pointsr/compsci

SQL for Smarties

u/JosephCW · 3 pointsr/cscareerquestions

Books and when possible building side projects.

List of Java-related books I've found helpful.

Clean Code

Java 8 in Action

Data Structures & Algorithms in Java

Test-Driven Java Development

The last book (Test-Driven Java Development) briefly introduces different testing frameworks for java. It gives you a good start to work off of on your own.

Ninja Edit: I'm also adding two websites that have rather useful examples/diagrams for different design patterns.

DZone

TutorialsPoint

u/fabio1618 · 3 pointsr/ItalyInformatica

Questo è un classico anche se un po' datato: https://www.amazon.it/Effective-Java-Joshua-Bloch/dp/0321356683

Ti consiglio di abbinarlo a https://www.amazon.it/Java-Action-Lambdas-functional-style-programming/dp/1617291994 per le ultime novità di java 8 (che sono fondamentali).
Sono in inglese e te ne devi fare una ragione... il 99% del materiale di qualità è in inglese.

P.S. Nè html nè css sono "linguaggi"

u/Armorweave · 3 pointsr/learnprogramming

Fundamentals of Database Systems, it covers a broad range of topics about databases including database design theory, normalisation and data modeling.

SQL Antipatterns is a really great book.

u/forgetfulcoder · 3 pointsr/learnphp

PHP The Right Way is good.

If you want something for SQL I strongly recommend SQL Antipatterns.

If you want something more abstract, Head First Design Patterns is good. It uses Java in its examples but it applies to PHP too.

u/Himmelswind · 3 pointsr/cscareerquestions

Some resources I found useful:

This Github repository is a really good overview. Although it doesn't exactly give a deep understanding of any particular topic, it's a really good way of understanding the system design "landscape". After reading this, I had a much better idea of what I needed to study more.
Designing Data-Intensive Applications is an awesome and thorough book that covers just about everything you need to know for a system design interview.
Maybe a bit obvious, but CTCI's system design chapter is useful (although not enough on its own).
It's in some ways a bit orthogonal to system design, but Computer Systems: A Programmer's Perspective gave me a much better idea of how the hell this machine I touch all day works. I think learning the things covered in here helped me speak with more confidence on system design.

u/askhistoriansapp · 3 pointsr/cscareerquestions

I've had the experience where I was turned down for a $80k/y job because they straight up didn't like me and I passed a $155k/y interview with a palindrome check question.

As software guys I think to one degree or another we're all on some sort of a spectrum :) What makes you good at this job is always going in 100%, all-or-nothing, winner-take-all and the reality of the matter is that it's not actually like that. Don't take a single loss like that's going to be your life now. It's a little easier to see if you come from the background I come from (immigrant) but I get it.

Imagine that you fail 5 more interviews and then, after that, you are guaranteed to make 200k working 30 hours remotely (it happens)

You can now go live your life anywhere on the planet and crush it. It just has to be 5 though, not 4. If you imagine this to be true, you'll suddenly see how that lifts you out of your negative frame of mind.

Meanwhile, focus on things you can control:

Read Elements of Programming Interviews in Python (or whatever flavor you prefer) because it's a very comprehensive book that's easily accessible
Coding problems in Ruby is also good and very succinct (if you care about Ruby, but it's thorough)
Exercise
Hang out with friends, get different perspectives like on this forum, although reddit in general is very negative and cancerous

Work on that, remain focused and next thing you know you'll be off the market

Edit: Also check out The Senior Software Engineer and Designing Data-Intensive Applications because those are key to everything but "leetcode" stuff.

u/WhackAMoleE · 2 pointsr/compsci

The classic is CJ Date, Introduction to Database Systems.

http://www.amazon.com/Introduction-Database-Systems-8th-Edition/dp/0321197844

Be aware that this is all theory. It's not going to tell you how to write MySQL queries.

u/TweetTranscriber · 2 pointsr/chile

📅 2018-04-23 ⏰ 23:56:15 (UTC)

>Sex differences in age preferences: Women tend to rate men roughly their own age as most attractive; men tend to rate women in their early twenties as most attractive, regardless of their own age https://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375/ #chi2018 @okcupid

>— Steve Stewart-Williams (@SteveStuWill)

>🔁️ 24 💟 64

📷 image

&nbsp;

^(I'm a bot and this action was done automatically)

u/notahitandrun · 2 pointsr/askgaybros

I'm around your same age and demographic as the OP and I am in the SF Bay Area. I have had countless guys do what you describe. Seems like major issues with communication and or other options. I find the same dudes often on Grindr, A4A, and other sites with differing more risque text in their profiles (conflicting what they say on OKcupid; maybe they are swamped with guys on the apps or their mailboxes).

Like you I find it quite weird they will message me, or we "match" which means they took the effort to do both and they never respond. Maybe its similar to Tinder everyone wants a self esteem boost but doesn't want to put much effort out for anything else. I've tried the direct route, the talk on the phone or have a drink route and the flirty chat route. It just seems guys on Okcupid are flakes (I even get guys from other areas contacting me). I think some of them are using it as a instantaneous chat function or geo-located grindr functionality (the app), but when you respond a day latter no response, maybe they have found someone else or messaged many others. It a free site after all (not paid like match).

Like you my response rate is low but not out of being too picky, there are some straight up freaks who contact me and have nothing in their profile, or never read mine and you can tell from the questions and text or guys across the world who want a bf. The vibe I get from Okcupid is they match with you and don't really checkout the questions, then latter they read the question find one they don't like and ignore you. For instance I have many "tops" contact me and realize its not going to work based on the questions they finally read. Try taking a look at your most important rated questions and seeing if there is something they can reject you for by looking at compatibility on their profile. The silly thing about OKcupid is it gives you a match rating based on those questions that can be answered multiple (various) ways and sometimes really don't matter or mean your compatible with a date.

I also find the average age is on the young side with guys in their 30s and over being pretty rare. I read the book Dataclysm by the guy who had the OKcupid blog. He said on average you contact 1000 people and maybe get 10 responses (those are based of of straight interactions), so imagine the more superficial and flakey gay world. He also said too many pictures is bad it gives someone a reason to reject you or too many questions answered, yet if you don't answer questions you don't get shown on wall (where everyone answers new / re-answers questions). Okcupid is the equivalent of Grindr or Craiglist, lots of response but little follow through or real dates. There was a guy in LA (UCLA) who was a mathematician who supposedly quantified and was able to game okcupid, he had to respond to thousands of profiles (he used UCLA's Super Computers as bots) to get a gf and went on countless dates a day.

http://abcnews.go.com/GMA/video/boston-mathematician-hacks-dating-site-okcupid-find-true-21635472

http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375

https://www.ted.com/talks/amy_webb_how_i_hacked_online_dating?language=en

u/TajMy · 2 pointsr/MGTOW

> some guy

Not just "some guy", but a co-founder of OKCupid, who just happened to have a mathematics degree from Harvard. The man and his book.

u/Janube · 2 pointsr/todayilearned

The book from which this TIL originated.

u/Badhugs · 2 pointsr/geography

Some books I can recommend for map nerds: Strange Maps: An Atlas of Cartographic Curiosities, How to Lie With Maps, and a related book that's a bit more useful for data visualization - Visualize This: The FlowingData Guide to Design, Visualization, and Statistics.

The typographic maps from Axis Maps are pretty awesome and there's all kinds of map-related stuff on Etsy.

u/htedream · 2 pointsr/Clojure

most of the algorithms books are for any programming language as long as they are imperative.

as far as functional languages go, there are:

Purely Functional Data Structures
Pearls of Functional Algorithm Design

even though neither are using clojure.

If you are interested in academic papers, you can find good references here:
http://cstheory.stackexchange.com/questions/1539/whats-new-in-purely-functional-data-structures-since-okasaki

u/NLeCompte_functional · 2 pointsr/haskell

I have not read Functional Programming In Scala so I am unsure of the scope.

But Purely Functional Data Structures is a classic: https://www.amazon.com/Purely-Functional-Data-Structures-Okasaki/dp/0521663504

It's largely focused on SML, but all the examples are also given in Haskell. And for learning Haskell (or Scala/F#/Agda/etc), porting the SML examples is a good exercise.

u/jdh30 · 2 pointsr/rust

> To be honest I don't entirely understand the term "functional data structure" I'm sort of new to functional programming myself.

I'm sure you're familiar with the idea of an immutable int or double or even string. Purely functional data structures just extend this idea to collections like lists, arrays, sets, maps, stacks, queues, dictionaries and so on. Whereas operations on mutable collections take the operation and collection and return nothing, operations on purely functional data structures return a new data structure.

Here's an example signature for a mutable set:

val empty : unit -> Set<'a>
val contains : 'a -> Set<'a> -> bool
val add : 'a -> Set<'a> -> unit
val remove : 'a -> Set<'a> -> unit

and here is the equivalent for a purely functional set:

val empty : Set<'a>
val contains : 'a -> Set<'a> -> bool
val add : 'a -> Set<'a> -> Set<'a>
val remove : 'a -> Set<'a> -> Set<'a>

Note that empty is now just a value rather than a function (because you cannot mutate it!) and add and remove return new sets.

The advantages of purely functional data structures are:

Makes it much easier to reason about programs because even collections never get mutated.
Backtracking in logic programming is a no-brainer: just reuse the old version of a collection.
Free infinite undo buffers because all old versions can be recorded.
Better incrementality so shorter pauses in low latency programs.
No more "this collection was mutated while you were iterating it" problems.

The disadvantages are:
Can result in more verbose code, e.g. graph algorithms often require a lot more code.
Can be much slower than mutable collections. For example, there is no fast purely functional dictionary data structure: they are all ~10x slower than a decent hash table.

The obvious solution is to copy the entire input data structure but it turns out it can be done much more efficiently than that. In particular, if all collections are balanced trees then almost every imaginable operation can be done in O(log n) time complexity.

Chris Okasaki's PhD thesis that was turned into a book is the standard monograph on the subject.

In practice, purely functional APIs are perhaps the most useful application of purely functional data structures. For example, you can give whole collections to "alien" code safe in the knowledge that your own copy cannot be mutated.

If you want to get started with purely functional data structures just dick around with lists in OCaml or F#. Create a little list:

> let xs = [2;3];;
val int list = [2; 3]

create a couple of new lists by prepending different values onto the original list:

> list ys = 5::xs;;
val int list = [5; 2; 3]

> list zs = 6::xs;;
val int list = [6; 2; 3]

> xs;;
val int list = [2; 3]

Note how prepending 5 onto xs didn't alter xs so it could still be reused even after ys had operated on it.

You might also want to check out my Benefits of OCaml web page. I'd love to see the symbolic manipulation and interpreter examples translated into Rust!

> Personally I used Atom for a while, until I learned how to use Vim, now I use that. IDE information for Rust can be found at https://areweideyet.com/

Excellent. I'll check it out, thanks.

u/nekochanwork · 2 pointsr/learnprogramming

Expert F# 4.0 by Dom Syme (creator of F#) is a useful reference for people who already have OO programming experience, but want to learn FP. It's also a .NET language, so it interops with C# with minimal effort.

The F# Wikibook is a little dated, but otherwise a useful free reference for people learning F# for the first time.

Purely Functional Data Structures by Chris Okasaki is still the best reference available for data modeling. It uses SML as it's reference language, but almost all of the examples can be converted to equivalent F#, Haskell, or Scala.

u/formalsystem · 2 pointsr/programming

There are a lot but my favorites are

https://www.amazon.com/Purely-Functional-Data-Structures-Okasaki-dp-0521663504/dp/0521663504/ref=mt_paperback?_encoding=UTF8&me=&qid=

https://www.amazon.com/Structure-Interpretation-Classical-Mechanics-Press/dp/0262028964/ref=sr_1_2?crid=3RF2M50NKPULH&keywords=structure+and+interpretation+of+classical+mechanics&qid=1572889989&s=books&sprefix=structure+and+inte%2Cstripbooks%2C207&sr=1-2

https://www.amazon.com/Structure-Interpretation-Computer-Programs-Engineering/dp/0262510871/ref=sr_1_1?crid=1CSGG60L6328Q&keywords=structure+and+interpretation+of+computer+programs&qid=1572890005&s=books&sprefix=structure+and+inte%2Cstripbooks%2C413&sr=1-1

There are also more mathy type theory books and category theory books you should check out but I'd probably start with those 3

u/stulove · 2 pointsr/compsci

On the functional programming front, Purely Functional Data Structures has some fun stuff in it. You should be really familiar with functional languages before going through it though.

u/Mason-B · 2 pointsr/programming

An immutable list is implemented as described by the other responses, but equivalent immutable data structures exist for all mutable data structures. Immutable arrays with O(1) lookup and O(1) 'assignment', which of course enables O(1) dictionaries. And all the others.

This talk by Rich Hikley, creator of Clojure, has a good example of how it works (About 23 minutes in, but the rest of the talk is good). Also see Purely Functional Structures for an indepth look at it, and many more.

u/tedivm · 2 pointsr/PHP

Percona has more stability and better performance. It backports many features from MySQL 5.5 and has quite a few additional features. It's developed by the authors of the MySQL Performance Blog and High Performance MySQL. It remains API compatible so you can use the standard mysql client libraries, as most of the differences are internal, making transition very easy.

We switched to Percona at work recently, after being introduced to it through the XtraBackup utility that they distribute. This is one of the best innodb hotbackup tools I've seen, particularly for the price of free.

u/StoneCypher · 2 pointsr/SQL

http://www.amazon.com/Joe-Celkos-SQL-Smarties-Programming/dp/1558605762

http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557

Yes, the name of the first one is horrible. It's a fantastic book.

Since you're talking MySQL, you might also want

http://www.amazon.com/High-Performance-MySQL-Optimization-Replication/dp/0596101716

u/localhost127 · 2 pointsr/learnprogramming

I'm not too sure on your level of knowledge, but this book is pretty good for the intermediate level. Plus it's an O'Reilly book, so you get a sweet picture of an animal on the cover.

u/DrewEugene17 · 2 pointsr/italy

https://rcompanion.org/rcompanion/index.html
https://www.r-bloggers.com/
https://www.datacamp.com/community/open-courses/introduzione-a-r
https://cran.r-project.org/doc/contrib/manuale.0.3.pdf
https://www.amazon.it/Introductory-Statistics-R-Peter-Dalgaard/dp/0387790535/ref=sr_1_6?ie=UTF8&amp;qid=1539355565&amp;sr=8-6&amp;keywords=statistics+with+r
https://www.amazon.it/R-Cookbook-Paul-Teetor/dp/0596809158/ref=sr_1_1?ie=UTF8&amp;qid=1539355551&amp;sr=8-1&amp;keywords=r+cookbook

Il terzo sopratutto, se puoi investirci qualche soldo, è secondo me il più veloce. Non parlo strettamente di quel corso gratuito in italiano, ma di tutti i corsi presenti nel portale. Essendo direttamente applicativi ti mitigano la curva di apprendimento di molte volte.

Dell'ultimo libro (ma anche dello springer) ci sono poi tutti i seguiti nelle rispettive collane, in base al proprio indirizzo e a quello che si vuole fare (medicina, finanza, cibo, ecologia, morfologia, etc.)

edit: credo esistano svariati corsi base in inglese su youtube, e qualcosa pure in italiano.

u/brews · 2 pointsr/statistics

As you already have programming experience I strongly recommend you try "The Art of R Programming" sooner or later. The majority of other books discuss R from a statistical aspect. This book, however, approaches it as a programming language. One of the few R books I own ("R graphics" and "ggplot2" might be others, but that's a bit advanced.)

This site is a great resource for all those simple little R-isms that I forget from time to time. "The R Cookbook" is another resource, much like the above, but with a bit more meat.

There are LOADS of other resources out there. If you ever have a question, just google it + "R stats" and you'll usually find what you need.

You might also want to subscript to "R Bloggers", it's a planet with loads of sources. It's inspiring and educational to see all the things people put R to use for.

u/Illithilitch · 2 pointsr/RStudio

From my coursework, I found these resources helpful:

http://www-bcf.usc.edu/~gareth/ISL/

https://www.amazon.com/Art-Programming-Statistical-Software-Design/dp/1593273843/ref=sr_1_3?ie=UTF8&qid=1527726606&sr=8-3&keywords=programming+with+R

https://www.amazon.com/Cookbook-Analysis-Statistics-Graphics-Cookbooks/dp/0596809158/ref=sr_1_9?ie=UTF8&qid=1527726606&sr=8-9&keywords=programming+with+R

https://www.amazon.com/Mastering-RStudio-Develop-Communicate-Collaborate-ebook/dp/B0123RVFZG/ref=sr_1_1_sspa?s=books&ie=UTF8&qid=1527726659&sr=1-1-spons&keywords=Rstudio&psc=1

u/bill_cleveland_fan · 2 pointsr/statistics

It's an interesting book.

R's powerful
ggplot2 graphics system has a default output
style which follows many of these principles, and it looks good.

But it's not my favourite book in this area.
My favourite would be (both)
Bill Cleveland's books

The Elements of Graphing Data (1ed 1985, 2ed 1994)
Visualizing Data (1993)

After seeing references to Cleveland in the
R documentation
(for example, the
loess
and
lattice
packages),
I read both the Cleveland books, and found them extremely interesting.

There's a classic paper by Cleveland and McGill,
"Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods"
(you can download a PDF)
which is also interesting. (And if you find that interesting, you would
most likely enjoy the books mentioned above.)

The Cleveland books are not widely famous like
The Visual Display of Quantitative Information,
but I found them more appealing in a way that's kind of
hard to describe. But, very roughly
Cleveland feels more like a statistician trying to create
visualisations which are efficiently and accurately perceived.
Tufte feels a little like a designer trying to create beautiful
visualisations based on a kind of minimalist aesthetic. Or
maybe like a philosopher trying to find the essence of a
visualisation.

The conclusions of the two approaches are not necessarily
incompatible. They would certainly agree on the
undesirability of most of the ridiculous
stuff
in the MS Excel plot menu. (So if Tufte stops people doing that, then the more people who read him, the better).

But when there's tension between the two approaches then I'd
choose the first (Cleveland).

For example, the
Tufte (minimalist) boxplots
manage to represent the same information as a box plot, but with less ink.
But they feel like they might not be as easy to read.
(See also "W. A. Stock and J. T. Behrens. Box, line, and midgap plots: Effects of display characteristics on the accuracy and bias of estimates of whisker length. Journal of Educational Statistics, 16(1): 1–20, 1991"
(abstract) )

u/notjustanymike · 2 pointsr/webdev

My vote would be D3 over out of the box libraries like highcharts.

Since you're focusing on statistic, I'm guessing you'll want to do some decent custom visualizations to highlight certain aspects of the data. Typically most of the charting libraries work well for traditional vis (bar charts, etc) but fall apart trying to make more advanced visuals.

D3 has a learning curve, but it's not as high as people think. One book in particular really helped me understand how it works, and once you know, you'll never go back.

https://www.amazon.com/Interactive-Data-Visualization-Web-Introduction/dp/1449339735

u/Kicker_of_Infants · 2 pointsr/ProgrammerHumor

This is the book I'm following. I was not very familiar with d3.js beforehand, but it serves as a solid introduction to the core concepts. Very easy to read and hands-on approach. The first three chapters (around 60 pages) are introductions to html, css, javascript, with some descriptions of basic syntax and whatnot. Starting chapter 4 it's pure D3.

Building scatter diagrams, bar charts, geodata, representing graphs with interconnected nodes, all sorts of fun stuff.

u/messacz · 2 pointsr/mongodb

It's normal thing in distributed systems. It's pretty logical :)

https://en.wikipedia.org/wiki/Quorum_(distributed_computing)
https://en.wikipedia.org/wiki/CAP_theorem
https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
https://blog.yugabyte.com/how-does-consensus-based-replication-work-in-distributed-databases/
https://www.amazon.com/dp/1449373321/ref=cm_sw_em_r_mt_dp_U_GkfUDb87YKGR4
etc. etc. :)

PostgreSQL is a classic single-server database, not a distributed database; it supports multiple replication strategies, I think the closest one to MongoDB is this: https://www.postgresql.org/docs/current/warm-standby.html#SYNCHRONOUS-REPLICATION (notice: it's not even the default setting 😀)

Timescale uses PostgreSQL under the hood, so same thing as above.

u/ProfessionalTensions · 2 pointsr/financialindependence

Honestly, I just read a lot of blog posts. Sometimes for fun, but most of the time when I'm trying to solve a specific problem. I also make sure to document what I'm learning in github (like this (not mine)) and throw up any personal projects I work on. I also try to creatively mention in interviews that I'm self-taught and always ready to learn more. I know I've gotten lucky along the way, but I also spend hours and hours applying to jobs.

If you want hard resources: the Kimball approach was one of the first things I got familiar with and Designing Data-Intensive Application is a great modern day resource. Both are pretty dry, but once you find yourself in a situation where their knowledge applies, you'll be thankful for it a thousand times over. I've even had the Kimball approach come up in an interview....so, you never know.

Edit: I also like to watch all of the PyCon videos that even remotely relate to data.

u/tpintsch · 2 pointsr/datascience

Hello, I am an undergrad student. I am taking a Data Science course this semester. It's the first time the course has ever been run so it's a bit disorganized but I am very excited about this field and I have learned a lot on my own.I have read 3 Data Science books that are all fantastic and are suited to very different types of classes. I'd like to share my experience and book recommendations with you.

Target - 200 level Business/Marketing or Science departments without a programming/math focus.
Textbook - Data Science for Business https://www.amazon.com/gp/product/1449361323/ref=ya_st_dp_summary
My Comments - This book provides a good overview of Data Science concepts with a focus on business related analysis. There is very little math or programming instruction which makes this ideal for students who would benefit from an understanding of Data Science but do not have math/cs experience.
Pre-Reqs - None.

Target - 200 level Math/Cs or Physics/Engineering departments.
Textbook -Data Mining: Practical Machine Learning Tools and Techniques https://www.amazon.com/gp/aw/d/0123748569/ref=pd_aw_sim_14_3?ie=UTF8&amp;dpID=6122EOEQhOL&amp;dpSrc=sims&amp;preST=_AC_UL100_SR100%2C100_&amp;refRID=YPZ70F6SKHCE7BBFTN3H
My comments: This book is more in depth than my first recommendation. It focuses on math and computer science approaches with machine learning applications. There are many opportunities for projects from this book. The biggest strength is the instruction on the open source workbench Weka. As an instructor you can easily demonstrate data cleaning, analysis, visualization, machine learning, decision trees, and linear regression. The GUI makes it easy for students to jump right into playing with data in a meaningful way. They won't struggle with knowledge gaps in coding and statistics. Weka isn't used in the industry as far as I can tell, it also fails on large data sets. However, for an Intro to Data Science without many pre-reqs this would be my choice.
Pre-Req - Basic Statistics, Computer Science 1 or Computer Applications.

Target - 300/400 level Math/Cs majors
Textbook - Data Science from Scratch: First Principles with Python
http://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X
My comments: I am infatuated with this book. It delights me. I love math, and am quickly becoming enamored by computer science as well. This is the book I wish we used for my class. It quickly moves through some math and Python review into a thorough but captivating treatment of all things data science. If your goal is to prepare students for careers in Data Science this book is my top pick.
Pre-Reqs - Computer Science 1 and 2 (hopefully using Python as the language), Linear Algebra, Statistics (basic will do, advanced preferred), and Calculus.

Additional suggestions:
Look into using Tableau for visualization. It's free for students, easy to get started with, and a popular tool. I like to use it for casual analysis and pictures for my presentations.

Kaggle is a wonderful resource and you may even be able to have your class participate in projects on this website.

Quantified Self is another great resource. http://quantifiedself.com
One of my assignments that's a semester long project was to collect data I've created and analyze it. I'm using Sleep as Android to track my sleep patterns all semester and will be giving a presentation on the analysis. The Quantified Self website has active forums and a plethora of good ideas on personal data analytics. It's been a really fun and fantastic learning experience so far.

As far as flow? Introduce visualization from the start before wrangling and analysis. Show or share videos of exciting Data Science presentations. Once your students have their curiosity sparked and have played around in Tableau or Weka then start in on the practicalities of really working with the data. To be honest, your example data sets are going to be pretty clean, small, and easy to work with. Wrangling won't really be necessary unless you are teaching advanced Data Science/Big Data techniques. You should focus more on Data Mining. The books I recommended are very easy to cover in a semester, I would suggest that you model your course outline according to the book. Good luck!

u/AKGeef · 2 pointsr/datascience

I don't know of any MOOCs that use Keras, so your best bet might be going through their documentation.

If you are looking for a Data Science MOOC that uses Python, University of Michigan has one here.

Also, another great resource is Joel Grus's book called Data Science from Scratch.

u/Dansio · 2 pointsr/learnprogramming

Then learning Python would be very useful for you. I have used the book called Automate the Boring stuff (Free).

For data science and machine learning I use: Data Science from Scratch and Hands on Machine Learning with Scikit-learn and Tensorflow.

For AI I have used Artificial Intelligence: A Modern Approach (3rd ed.).

u/funnythingaboutmybak · 2 pointsr/datascience

u/Gimagon · 2 pointsr/neuralnetworks

I would highly recommend Aurélien Géron's book. The first half is an introduction to standard machine learning techniques, which I would recommend reading through if you have little familiarity. The second half is dedicated to neural networks and takes you from the basics up to very results from very recent (2017) literature. It has examples building networks both from scratch and with TensorFlow.

If you want to dive deeper, the book Deep Learning is a little more theoretical, but lacks a lot of low level detail.

Joel Grus's "Data Science From Scratch" is another good reference.

u/Sarcuss · 2 pointsr/AskStatistics

Although I am not a statistician myself and given your background, some of my recommendations would be:

All of Statistics for a concise treatment of well....all of statistics :p
[Introduction to Statistical Learning/Elements of Statistical Learning] (https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1493070885&amp;sr=1-1&amp;keywords=introduction+to+statistical+learning) avaliable for free in the internet if you are interested in applying statistical methods with Machine Learning
Introduction to Probability for a intro to probability theory which underlies most of statistics.
One of either [Python Data Science Handbook] (https://www.amazon.com/Python-Data-Science-Handbook-Essential/dp/1491912057/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1493071026&amp;sr=1-1&amp;keywords=python+data+science+handbook) or [R for Data Science] (https://www.amazon.com/Data-Science-Transform-Visualize-Model/dp/1491910399/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1493071056&amp;sr=1-1&amp;keywords=r+for+data+science) for building programming chops in your language of choice Python or R.

This should probably be enough for now but if you need more recommendations just say so :)

Regarding the time series question, it's not my area of expertise but since time series analysis ends up employing many statistical methods, I think it can be considered an area of statistics (Statisticians around here correct me if I am wrong :P)

u/choleropteryx · 2 pointsr/CasualMath

Books on Fractal Geometry tend to have pretty pictures:

Indra's Pearls: The Vision of Felix Klein by David Mumford et al.

Beauty of Fractals by Heinz-Otto Peitgen et al

Fractal Geometry of Nature by Benoit Mandelbrot

For what it's worth New Kind of Science by Stepeh Wolfram has tons of pretty pictures, even if the content is dubious.

you might also want to checkout the Non-Euclidean Geometry for babies and other similar titles.

u/Toenex · 2 pointsr/java

As someone who is a long time but intermittent Java developer myself I'd suggest focusing on Java 8 and in particular how the arrival of lambdas is influence the language and ecosystem. As an experienced OO developer I'd guess most other aspects won't present the same learning curve. Superficially even lambdas can appear as a method to reduce boiler plating, but the implications of this trend toward supporting an increased functional programming style run much much deeper I feel. With that in mind I would suggest either the book Java 8 in Action or Functional Programming in Java.

u/vmsmith · 2 pointsr/learnpython

I just bought Learning Python Design Patterns. I haven't spent a long time with it so far, but my first few skims of it gave me a fairly good impression.

u/dustinmm80 · 2 pointsr/Python

I've read these and enjoyed them:

u/AQuietMan · 2 pointsr/DatabaseHelp

I think the best first book you can get is Bill Karwin's SQL Antipatterns. That book alone will keep you from making most of the mistakes that come back to bite new designers.

u/rbatra · 2 pointsr/SQL

Beginners: Practical SQL Handbook or SQL Primer (Disclosure: I wrote the latter)
Advanced: The Art of SQL and SQL Antipatterns
Reference: Introduction to SQL or SQL Cookbook

u/gram3000 · 2 pointsr/Database

I think it would depend on your data and how its being used. There's a great book 'SQL Anti patterns' that explains different approaches, pros and cons and suggests alternatives: https://www.amazon.co.uk/SQL-Antipatterns-Programming-Pragmatic-Programmers-x/dp/1934356557

u/mrmonkeyriding · 2 pointsr/webdev

I buy books because they go into a lot more details, or often are written really well, and easy to follow. Also, it's really nice to read paper. Often I keep books in the office as it's a quick and reliable way to research a topic in-depth without scrolling through hundreds of shit articles on a particular (and even controversial subject).

I really recommend these:

High Performance MySQL: Optimization, Backups, and Replication - I've read snippets, but it's recommended a lot and very good for more advanced readers.

SQL Antipatterns: Avoiding the Pitfalls of Database Programming - VERY beginner friendly, easy to read, follow, provides real and common scenarios and explains the anti-pattern, it's problems, the reasons to sometime excuse their use, and solutions. I love this book.

The Go Programming Language - Very good read, not TOO technical jargon, very nice to read, explains in depth and in an understandable way.

I've had plenty more over the years, but these are my current I have at home. Still more on order. :)

u/squishles · 1 pointr/BlackPeopleTwitter

4000 record excel doc? or 4000 excel documents

y'all motherfuckers need Date https://www.amazon.com/Introduction-Database-Systems-8th/dp/0321197844

u/ObnoxiousFactczecher · 1 pointr/startrek

u/read_it_at_work · 1 pointr/learnprogramming

http://www.doctrine-project.org/2010/07/27/document-oriented-databases-vs-relational-databases.html

> Relational databases were traditionally the most obvious solution for applications that needed to store retrieve/data. With the growth of internet user-base, the number of reads and writes a typical application needed to perform grew rapidly. This led to the need for scaling. Traditional RDBMSs were hard to scale (SQL operation or Transaction spanning multiple nodes doesn’t scale well). With solutions like MySQL Cluster and Oracle RAC , this is much less of a problem now, but it wasn’t the case for a while, which led to many companies abandoning traditional RDBMSs for “noSQL” data stores.

https://www.google.com/search?q=document+-oriented+databases+vs+relational+databases

https://stackoverflow.com/questions/1289130/database-where-should-i-start-from

https://stackoverflow.com/a/1289160
>Introduction to Databases course: http://infolab.stanford.edu/~widom/cs145/
>and this textbook: Introduction to Database Systems, An (8th Edition)
>http://rads.stackoverflow.com/amzn/click/0321197844

u/wizardApprentice · 1 pointr/AskMen

Thanks man - am currently reading Dataclysm, the book written by one of Okcupid's founders. You should check it out if you like data analysis.

u/Prof_Acorn · 1 pointr/dataisbeautiful

I'd guess it's from Dataclysm, which just came out.

u/tee_tea · 1 pointr/gaybros

I haven't actually read this, but it was written by one of the founders of okcupid. Hope it's some help.

http://www.amazon.com/Dataclysm-When-Think-Ones-Looking/dp/0385347375

u/soafraidofbees · 1 pointr/OkCupid

Har de har har to all the comedians replying to you... here are some non-joke answers:

Dataclysm, by OKCupid founder Christian Rudder
Tiny Beautiful Things: Advice on Love and Life from Dear Sugar, an advice columnist I happen to love who could teach a lot of OKC users a thing or two
OKCupid A-List gift subscription (you'd have to know their username... could maybe print out a homemade "coupon" for them to redeem with you later if you don't know it)
phone tripod, for taking better profile selfies

u/lifeisfractal · 1 pointr/AskWomen

u/nunboi · 1 pointr/PoliticalDiscussion

OP it has nothing to do with politics in its outlook, but for the effects of gender and race based biases in practice, check out the book Dataclysm, but the Chief Information Scientist at OK Cupid: https://www.amazon.com/Dataclysm-Identity-What-Online-Offline-Selves/dp/0385347391

u/myLifeAsThrowaway · 1 pointr/IncelTears

>As someone who has worked in research in the past

Sure, carrying a clipboard and harassing people in front of Costco gives you real authority on the matter. Here's a book by the same people that did the study. Since you're "in the biz" maybe it'll be interesting to you.

>Also, you may be any level of ugly, unless you are actually disfigured, there will be people interested on you as long as you have an interesting personality - it doesn't matter how much you say the opposite.

Well funny how I haven't found any of them. Must be my shitty personality, eh? Here's my OkCupid inbox from a few years back where I've used some normie's photos instead of my own, and my original (and rather long) profile content. I also tried the same profile content with my own pics, and hardy got any messages (and those that I did get were not friendly or flirty). Conclusion: F A C E

>first, it is because of society, then I show it's not

You didn't show me shit, you just said what you believe with nothing to support it.

>it's because men are not picky, then I show it's not true

You didn't show me shit, you just said what you believe about yourself.

>then it's because I don't flirt with women, then I show I do

My experience in flirting with women outpaces yours quite a bit. It's just that you don't have the kind of face that repulses people.

>then you know women better than they know themselves, and you know more about flirting than anyone else

I'm an authority on how women react to me. Unless they can detect my horrible personality with their sixth sense (that somehow fails to detect hooking up with an abuser), then they are completely and identically uninterested in me whether I flirt or not and whether I talk or not. Conclusion: F A C E

>And the reason for all of that? Because you cannot accept, not even for 1 second, that maybe, just maybe, your personality and behavior play a role in how people react to you too, and you could spend sometime working on yours just like you've spent 13 years in a gym.

Sure, I accept my personality is (or has become) shitty too, but is it so shitty that no one's ever loved me and it's just a coincidence that my face is ugly? Funny how that works. And funny how a shitty personality is not a barrier for good looking people to get in a relationship.

>I have no time for this victim mentality man, nor does anyone else. Have a good night.

Homophobia: doesn't exist.

Racism: doesn't exist.

Sexism: doesn't exist.

Any person who's being discriminated against should work on their personality instead.

u/AmazonInfoBot · 1 pointr/BustyPetite

Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves.

Price: $10.87

Hi, I'm Amazon Info Bot. Every time you purchase an item through one of these links, 10% of the products price goes to the American Cancer Society.

1st Month Donation Proof Please Upvote This Comment so that I may comment more, and raise more.

[My Motive](/s "My Aunt passed away this last year from Breast Cancer. I'm in my 1st semester of Computer Science and decided to take on a project that would make a drop of difference on this world and hopefully contribute to stopping others from losing an aunt they loved as much as I did.") | [Why Not Use Amazon Smile](/s "Amazon smile gives .5% of your purchase to charity, amazon affiliate gives ~10%. That is a 20x greater affect per purchase."| Amazon Music Unlimited 30-Day Free Trial | Amazon Prime 30-Day Free Trial | 6 Months Free w/ Prime Student

u/ElasticHeadBand · 1 pointr/short

>Because OK Cupid is definitely the best measure for dating right?

Uh, yeah. It's the only measure we have.

>I still haven't seen a source.

Since you're too lazy to type a few words into google:

http://blog.okcupid.com/index.php/your-looks-and-online-dating/

There was even a book published by the guy who founded OKC who talks about dating and dating trends like this:

http://www.amazon.com/Dataclysm-Identity--What-Online-Offline-Selves/dp/0385347391/ref=sr_1_2?ie=UTF8&amp;qid=1462480969&amp;sr=8-2&amp;keywords=ok+cupid

It's pretty common knowledge at this point. Surprised you haven't heard about this until now.

u/TatuTattoo · 1 pointr/toronto

Hijacking top comment to note that the founder of OkCupid wrote a fascinating book on this phenomenon. It's called Dataclysm and was my favourite book of 2014.

https://www.amazon.ca/Dataclysm-Identity-What-Online-Offline-Selves/dp/0385347391

See also:

https://theblog.okcupid.com/race-and-attraction-2009-2014-107dcbb4f060#.noikefokj

u/omaolligain · 1 pointr/AskSocialScience

Why would you need to? The top commenter was saying the belief is the result of selection bias in popular culture.

If pop. culture caused us to legitmiatly see Scandanavian people, for example, on the street and believe them to be more beautiful on average how would that somehow invalidate OP's question?

OP is essentially asking a question about the role of certain social constructs. If you don't believe the construct exists fine, but we can go out and measure it via surveys and see if it does if we really wanted (and I assure you someone already has). The founder of OKCupid, Christian Rudder, wrote a book (Dataclysm) detailing all the beauty and attraction data they gathered on the dating website. It makes the case pretty solidly that some races/ethnicities are considered more attractive. Whether that's good or not is not really the point.

u/pornaccount9876 · 1 pointr/sex

Read Dataclysm by Christian Rudder if you want a more rigorous analysis of the differences in dating approaches for men and women. Or just read some of his blog, OkTrends. The short version is, this video is absolutely representative of gender's role in online dating, regardless of attractiveness.

u/utopista114 · 1 pointr/IncelTears

N por the OKCupid studies was in the hundreds of thousands. The guy running the studies is a freak of statistics. Granted, is still slanted by people in online dating, but the N is so big that you can make conclusions at least about internet-based dating (which is very popular in many countries, nowadays the most common way to meet people).

His book: https://www.amazon.com/Dataclysm-Identity-What-Online-Offline-Selves/dp/0385347391

u/alexander_P_L_O_T_Z · 1 pointr/dataisbeautiful

I like these books:

u/InboxZero · 1 pointr/guns

Cool, thanks for the reply. I do a lot of reports/charts/bs at work and I'm always interested in learning about stuff like this.

I'm thinking about picking this up.

u/mreiland · 1 pointr/rust

immutable data can cause problems with data structures and the like, here's a book on the topic.

http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504?ie=UTF8&amp;*Version*=1&amp;*entries*=0

u/DeusExCochina · 1 pointr/Clojure

Heh, I'm not about to! I don't even try to understand the other Purely Functional Data Structures - it's all black magic to me. But you're right - that doubly linked lists map so poorly onto immutable structures is no doubt a very strong reason we're not seeing them there.

u/rocksInGorges · 1 pointr/compsci

http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

http://www.amazon.com/Introduction-Algorithms-Second-Thomas-Cormen/dp/0262032937

u/loup-vaillant · 1 pointr/programming

> That's not an experiment, it's an anecdote about a thing that happened once.

It's a whole class, taught by a not exactly nobody professor. If it was one student, that would be anecdotal. But this is a sizeable sample, bordering on "statistically significant". As for "happened once", I'm sure he taught several other similar classes since then. Maybe we should ask him how it went?

A better argument than the worn out "anecdote", is to suspect the evidence to be filtered, one way or the other. I presented the argument to counter some point I believed false, but nothing guarantees that I didn't know of, and omitted, arguments to the contrary. (There are many reasons why I may do so, included but not limited to self deception, dishonesty, conflict of interest…) I will just hereby swear that I do not recall having ever encountered evidence that mandatory indentation was either detrimental, or neutral to the learning of programming languages. Trust me, or don't.

> And it's an anecdote about introducing absolute novices to programming.

It was their second language. I assume they programmed for at least a semester.

> Even if it were an experiment, experiments don't provide arguments, they provide data to use to test arguments.

Experiments provide evidence for or against hypothesises. Pointing out "hey, look at this experiment that crushes your beliefs flat!" is the argument. Which may have flaws besides the experiment itself (the results of the experiment may have to crush my beliefs flat, and I misread the paper). &lt;/pedantic&gt;

---

> And even if this were an experiment with a result compatible with an argument about indentation, there's no reason to think that this would have any bearing on infix expression shenanigans in Lisp.

I agree. Yes, you have read that correctly, I agree. &lt;Sardonic smile…&gt;

There is something I suspect you and many others in this thread have totally missed: sweet expressions are not just about infix expressions. That's a detail. The crux of sweet expression is actually significant indentation. Here:

define (factorial n)
if (<= n 1)
1
(* n (factorial (- n 1)))

I don't like the last line (too many parentheses). Let's try this:

define (factorial n)
if (<= n 1)
1

n
factorial (- n 1)

So, while results about indentation doesn't have any bearing about infix notation, it does have direct bearing about sweet expressions as a whole.

> You're pretty sloppy when you address something that seems to support your position, aren't you?

You just deserve my smug smile :-D

u/mozilla_kmc · 1 pointr/rust

> FWIW, a persistent data structure is somewhat orthogonal to laziness.

But you do need lazy evaluation (in-place update of thunks, whether that's provided by the language or a library) to get amortized time guarantees on persistent data structures. How much this matters in practice, I do not know.

u/codygman · 1 pointr/programming

http://www.amazon.com/gp/aw/d/0521663504?pc_redir=1405759645&amp;robot_redir=1

u/gahathat · 1 pointr/webdev

I haven't read it yet, but this book is about building/managing large scale databases.

u/soupydreck · 1 pointr/statistics

Aside from Tufte, you might find Cleveland's Visualizing Data worthwhile. I'm reading Stephen Few's Now You See It: Simple Visualization Techniques for Quantitative Analysis now.

Also, try following some related blogs, like Nathan Yau's Flowing Data or Kaiser Fung's Junk Charts. You can get a sense of some appropriate and/or inappropriate ways of visualizing data from these.

Finally, once you get more familiar, get something like Murrell's R Graphics. This will help you understand the basics of the base R graphics capabilities so you can make what you want, exactly how you want. ggplot2 is awesome, too, but understanding the basics is really helpful. Hope that helps.

u/digitalorchard · 1 pointr/gis

Not as quick as OP's tutorial, but the "Interactive Data Visualization For The Web" book is good too.

http://www.amazon.com/Interactive-Data-Visualization-Scott-Murray/dp/1449339735

u/lordmister_15 · 1 pointr/cscareerquestions

I'm a little late to the thread but I work at a company that operates at a large scale and I've found Designing Data Intensive Applications to be the best overview of modern techniques for scalable applications

u/vira28 · 1 pointr/Firebase

On a side note. I am currently reading https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321. Loving it so far. Author clearly explains the difference b/w relational & document model.

Highly recommended.

u/puppy_and_puppy · 1 pointr/MensLib

Weird how I just finished the book Designing Data-Intensive Applications, and it ended with a section on ethics in computer science/big data that ties into this article really well. I'll add some of the sources from that section of the book here if people are curious. Cathy's book is in there, too.

https://www.linkedin.com/pulse/making-hard-choices-quest-ethics-machine-learning-igor-perisic
https://www.theguardian.com/commentisfree/2015/dec/06/algorithm-writers-should-have-code-of-conduct
https://www.theatlantic.com/technology/archive/2014/02/welcome-to-algorithmic-prison/283985/
https://www.theatlantic.com/magazine/archive/2013/12/theyre-watching-you-at-work/354681/
https://www.theguardian.com/technology/2016/aug/03/algorithm-racist-human-employers-work
https://www.scientificamerican.com/article/how-a-machine-learns-prejudice/
http://idlewords.com/talks/sase_panel.htm
https://www.amazon.com/Weapons-Math-Destruction-Increases-Inequality/dp/0553418815
https://www.nytimes.com/2016/08/01/opinion/make-algorithms-accountable.html
https://www.theguardian.com/technology/2016/nov/10/facebook-fake-news-election-conspiracy-theories

There's also this fiasco: https://www.theguardian.com/technology/2018/jan/12/google-racism-ban-gorilla-black-people

We're definitely having our machine babies learn from our own racist, classist, sexist data and giving people with malicious intent access to unprecedented amounts of data.

Some forms of machine learning, like decision trees or random forests, have output that resembles a flow-chart which is nicer for humans because you can follow the decisions the algorithm is making. Deep learning with neural nets is real hard to understand. The model is basically just a ton of numbers.

If you're curious about deep learning from the "how" side, Andrew Ng's deep learning courses on Coursera are really good: https://www.coursera.org/specializations/deep-learning

Andrew Ng is kind of like the Fred Rogers of machine learning. He also has a machine learning course on Coursera that I've heard is great.

u/tekedozai · 1 pointr/softwarearchitecture

This one has some good info:

https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable/dp/1449373321

And prob this:

http://erlang.org/download/armstrong_thesis_2003.pdf

u/gin_and_toxic · 1 pointr/webdev

Clean Architecture: https://www.amazon.com/dp/0134494164/ (also read Clean Code if you haven't).

Designing Data-Intensive Applications: https://www.amazon.com/dp/1449373321/

u/gfever · 1 pointr/cscareerquestions

Robert Martin books are good read "Clean Code" and his architecture book.

Learn design patterns: Head First Design Patterns: A Brain-Friendly Guide

Supplement with leetcode: Elements of programming interviews

You need some linux in your life: https://www.amazon.com/gp/product/0134277554/ref=ox_sc_act_title_1?smid=ATVPDKIKX0DER&psc=1

Get some system design knowledge: https://www.amazon.com/gp/product/1449373321?pf_rd_p=183f5289-9dc0-416f-942e-e8f213ef368b&pf_rd_r=NZSW6YF36GPNR9EM27XB

You need some CI/CD knowledge: The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations

u/jakc13 · 1 pointr/learnpython

Looks good, and seems to have good reviews. May well order that.

However, I am more after online learning style courses that include online tutorials and videos. More my style of learning.

u/KeyVisual · 1 pointr/datascience

What resources would you recommend for newbies? I'm currently reading Data Science from Scratch(Grus) and Python for Data Analysis(McKinney). Anything else I should check out?

Love the blog!

u/mrdevlar · 1 pointr/statistics

The books I already mentioned in this thread will cover that. That said, I am generally anti-test and pro-estimation.

If you're already a proficient programmer. Try "Data Science from Scratch". I've found it to be one of the better books on the mechanics that underpin a lot of the work.

u/fieldcady · 1 pointr/datascience

First off, thank you for your service!

I hate to say it but you've got quite a lot of ground to make up. It's hard for me to gauge whether you have the coding skills needed. I get the impression that it's mostly sys admin stuff, which is good but not really sufficient (correct me if I'm wrong). You may want to teach yourself python if you don't use it yet.

The Coursera class on machine learning is something you should look into, since it will introduce you to a large body of knowledge that is critical for DS and probably all new to you.

I also encourage reading a book on data science, which would give you a good overview of the field as a whole and let you assess where the gaps are in your knowledge. I published one recently, which has great coverage of topics but has gotten mixed reviews so far. Here's another one which has better reviews, and is by a guy I know and respect.

u/KingEnchiladas · 1 pointr/datascience

I'm a sophomore in college wanting to get in to the data science field after I graduate. I'm currently learning Python in a class of mine and I'm looking to do some learning on my own. I've found two books, Data Science from Scratch: First Principles with Python and Data Science from Scratch: Practical Guide with Python My roommate has a copy of the first book and I've looked through it some. I'm wondering if anyone has experience with either of these, or any other resources that would be helpful for me.

Thanks for your help!

u/ziegl3r · 1 pointr/cscareerquestions

Thanks for the response.

Yea I have 2 quarters and summer school before transfer to university. Currently taking calculus I and next quarter calculus II.

I started that coursera course but realized I should probably go to school and learn math there since my parents are paying for it.

I just finished the statistics course offered at my junior college and am reading [Data Science From Scratch](http://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/149190142X "").

u/SethGecko11 · 1 pointr/Python

There is That book coming out in 10 days by Jake VanderPlas. I haven't read it yet (obviously) but his youtube lectures are great.

u/alzho12 · 1 pointr/datascience

As far as Python books, you should get these 2:
Python Data Science Handbook and Python Machine Learning.

u/porygon93 · 1 pointr/deeplearning

Not a course, but I suggest you to take a look at this book.
https://www.amazon.com/Deep-Learning-Practitioners-Josh-Patterson/dp/1491914254

u/crazysmoove · 1 pointr/java

Take a look at https://www.amazon.com/Deep-Learning-Practitioners-Josh-Patterson/dp/1491914254

u/troymccabe · 1 pointr/PHP

Another vote for nested sets here. We use it with some pretty decent traffic and as dazzled did we have a pretty robust class around it to handle all the resizing and whatnot.

In terms of your m-ary tree, this is what you'd want to use to be able to get any level. Imagine your dataset grew large and you were trying to go through it with php. You'd run out of time or memory.

For more information you can check out Joe Celko's book, SQL for Smarties: http://www.amazon.com/Joe-Celkos-SQL-Smarties-Programming/dp/1558605762

u/webnrrd2k · 1 pointr/programming

This isn't really a design book, but if you are going to do anything beyond the basics you should read Joe Celko's SQL for Smarties.

u/anilamda · 1 pointr/BarbarianProgramming

I wonder what the author would think of some of the cellular automata in A New Kind of Science.

u/McMonty · 1 pointr/todayilearned

Not what I was thinking at all. Read these:

http://www.amazon.com/New-Kind-Science-Stephen-Wolfram/dp/1579550088

http://www.amazon.com/G%C3%B6del-Escher-Bach-Eternal-Golden/dp/0465026567/ref=sr_1_1?s=books&amp;ie=UTF8&amp;qid=1371004122&amp;sr=1-1&amp;keywords=godel+escher+bach

And try out this free online course:
http://www.santafe.edu/education/schools/sfi-mooc/introduction-complexity/

u/potifar · 1 pointr/IAmA

I'm pretty unfamiliar with your work (except W|A), so I looked up one of your books on Amazon. The top rated review is rather dismissive (one star). I'm sure you're aware of this. Care to comment on it? Is he judging your work unfairly?

u/HowAboutABook · 1 pointr/technology

A New Kind of Science

u/manuranga · 1 pointr/lectures

read the top comment on his book at amazon

u/CunningAllusionment · 1 pointr/godot

Wow. Thanks for taking such a close look at it. I took a summer class on deterministic cellular automata that generate chaotic patterns like this one (we basically just worked off of Wolfram's "New Kind of Science"), so it's pretty exciting to encounter such a pattern unexpectedly "in the wild".

I'm not sure if it's clear what I intended this thing to do, but the idea is that on frame x+1 squares are black only if they had an odd number of black neighbors on frame x and white otherwise.

What seems to be happening instead is that each square's color is being updated as its being checked, so square (1, 1) is determining it's state by the new state of squares (0, 0), (0, 1), (0, 2), and (1, 0) and the current state of the other four squares its adjacent to.

I don't really understand why it's doing that because neighborCount is incremented based on a check of pixelArray[x][y] and is then used to set a value in newArray[x][y] which is then used to set color. There shouldn't be any way for neighborCount to see values in newArray, but there is somehow. I can only think that somehow pixelArray is being constantly updated to be the same as newArray, but I don't understand why. They're set to be equal in only 2 locations, at the end of setup() and after next_frame() is called.

Does using draw rect improve performance? I've found it takes about a half second to draw each frame with 10x10 squares. I've assumed this is due to it checking almost 60,000 if statements per frame, but maybe having that many nodes loaded is a memory sink?

Thanks again.

u/BenevolentCheese · 1 pointr/mildlyinteresting

If you'd like to learn more about the fibonacci spiral in nature and other patterns in nature based on underlying math, consider a light read of the first 700 pages of A New Kind of Science by God King slash Universal Mind Genius slash Erotic Sex Lord Stephen Wolfram.

u/denialerror · 1 pointr/learnprogramming

Java 8 in Action is what we use at work.

Edit: Stupid formatting...

u/MeGustaDerp · 1 pointr/SQL

Ah... I was thinking about getting that book. What did you think about it overall?

Just a link for future reference

u/mckennac4 · 1 pointr/cscareerquestions

Designing data-intensive applications by Martin Kleppmann has been recommended to me recently.

https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable-ebook/dp/B06XPJML5D

u/Ty1eRRR · 1 pointr/learnprogramming

I strongly advise you to check out this book. Best thing I have read in my life. There you will find a lot of answers.

u/dublos · 0 pointsr/OkCupid

Simple.

Your account is old.

You've answered questions in a time period before the current categorization.

Those questions are included in your overall match percentage.

Some of them are not categorized, so they do not count in the individual subcategories.

> Obviously this cannot be accurate mathematically.
Actually, yes it can be accurate mathematically, even if every single question you've answered is categorized and included thanks to the arcane & complicated method that match percentages are calculated.

One version is in Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves

And I think there's a write up of it online as well.

It's not pretty.

u/Manbearjosh · 0 pointsr/OkCupid

You should read Dataclysm, written by one of the OkC founders, somewhat insightful.

https://www.amazon.com/gp/product/0385347391?tag=randohouseinc10100-20

u/artsrc · 0 pointsr/programming

Can't you get laziness with a function pointer (in a thunk struct perhaps) in any language from C up? Having it as the default is a syntactic convenience?

What are doubly linked lists really useful for? What about having two singly linked lists in opposite orders?

Would you start by buying this book?

http://www.amazon.com/Purely-Functional-Structures-Chris-Okasaki/dp/0521663504

I tend to use sets, maps, queues (priority, fifo), and lists (indexed and singly linked).

u/gergi · 0 pointsr/dataisbeautiful

Actualy, this is one of the first things you get taught if you take a visualization class, Never ever alter the data. And the slope is a very basic property of the data.

> it simply wouldn't be effective communication to make the decrease look really small

Again, if the data doesn't provide a big decrease, don't make it look like it does.

e.g. Try reading this nice one. http://www.amazon.com/Interactive-Data-Visualization-Scott-Murray/dp/1449339735

u/DaveVoyles · 0 pointsr/cscareerquestions

Clean code (mentioned elsewhere)
The Mythical Man Month
Code: The Hidden Language of Computer Hardware and Software

Docker or Kubernetes:
Docker Deep Dive
The Kubernetes Book

&#x200B;

Data Science, Machine Learning:

Hands-On Machine Learning with Scikit-Learn and TensorFlow
Now that you understand Python and the applicable libraries, you can actually use it for ML

Python for Data Analysis
Fantastic for learning Python and growing familiar with the libraries you’ll use in data analysis. It is from the creator of the Pandas framework.

Python Data Science Handbook
Great overviews of Juypter notebooks, NumPy, Pandas, Matplotlib, Scikit-learn
Free lite version & code from the author here.

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking
So helpful for explaining the business use case for ML and data science to non-technical individuals
Also helps boil down problems into data science terms, and realize if it really is an ML problem

I put together a list of resources on my blog.

u/nziring · 0 pointsr/compsci

If you want to dive into cellular automata in a fairly approachable but very deep way, consider Wolfram's A New Kind of Science. For more academic treatments, maybe Schiff's Cellular Automata?

u/PLEASE_USE_LOGIC · -1 pointsr/AskMen

1

2

3

4

5

6

7

I've read them all; they've helped a ton^1000

u/aftersox · -1 pointsr/CrappyDesign

It's a poor representation of data. In pie charts you compare angles. Humans are poor at comparing the magnitudes of angles. Without the table, labels with the actual numbers, etc. it would be very difficult to compare the information.

For instance, it is difficult based just on the visualization if Instinct or Valor has more players. A bar, column, or dot plot will show things much better. Humans are far better at perceiving differences in length or position. That table on the right is necessary - that means the pie chart is useless.

If you are serious about designing visualizations of data, I suggest you read some books by Willilam Cleveland or Edward Tufte.

EDIT: Here is article I often share with people on this topic.

u/Robin_Banx · -3 pointsr/math

A New Kind of Science by Stephen Wolfram (the Mathematica guy) is supposed to be good. Never read it myself, very much want to at some point: http://www.amazon.com/New-Kind-Science-Stephen-Wolfram/dp/1579550088/ref=sr_1_1?ie=UTF8&amp;qid=1330979913&amp;sr=8-1

u/gnocchicotti · -5 pointsr/Bumble

Uh huh, thanks. Just relating personal observation but I appreciate your input.

EDIT: Just to be particularly specific the 20% stat is my recollections from this book which is very much based on very real statistics from the founder of OKC who had unfettered access to all of the data their user base coughed up. It's eye-opening.

u/28wheels · -16 pointsr/BustyPetite

What the fuck you talking about you fucktard? I have data to prove my facts.

It’s from a book called

Dataclysm. written by the founder of one of the largest online dating websites in the world. Used millions of interactions bewteen males and females to figure out amazing things.

Best data modeling & design books according to redditors

1. Purely Functional Data Structures

2. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

3. Dataclysm: Who We Are (When We Think No One's Looking)

4. Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves

5. Data Science from Scratch: First Principles with Python

6. A New Kind of Science

7. Python Data Science Handbook: Essential Tools for Working with Data

8. SQL Antipatterns: Avoiding the Pitfalls of Database Programming (Pragmatic Programmers)

9. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics

10. R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics (O'reilly Cookbooks)

11. High Performance MySQL: Optimization, Backups, Replication, and More

12. Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems

13. An Introduction to Database Systems (8th Edition)

14. Joe Celko's SQL for Smarties: Advanced SQL Programming Second Edition (The Morgan Kaufmann Series in Data Management Systems)

15. Interactive Data Visualization for the Web: An Introduction to Designing with D3

16. Learning Python Design Patterns

17. Java 8 in Action: Lambdas, Streams, and functional-style programming

18. Visualizing Data

19. Dataclysm: Love, Sex, Race, and Identity--What Our Online Lives Tell Us about Our Offline Selves

20. Deep Learning: A Practitioner's Approach

Top Reddit comments about Data Modeling & Design: