Reddit reviews Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More

We found 5 Reddit comments about Mining the Social Web: Data Mining Facebook, Twitter, Linkedin, Google+, Github, And More. Here are the top ones, ranked by their Reddit score.


5 Reddit comments about Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Google+, GitHub, and More:

u/greenspans · 13 points · r/DataHoarder

I'm pretty sure this book talked about how easy it was to scrape Facebook before they locked down their API.

https://www.amazon.com/Mining-Social-Web-Facebook-LinkedIn/dp/1449367615/

A lot of people probably did this. I remember a talk given in my city; the guy had a few thousand people sign up to his app and got millions of entries into his graph database:

https://maxdemarzi.com/2013/01/28/facebook-graph-search-with-cypher-and-neo4j/

Popular game devs probably got oodles of data. Must have been awesome having a social graph of the US.

u/dr1fter · 5 points · r/IWantToLearn

Hoowhee, how did this text get so... wally.

Bots are usually (fairly) simple programs, so Python will make it easy to get at all the common functionality you want (maybe looking for pattern matches in a piece of text, some math/analytics, saving files to your hard drive, converting images...) and in practice you'll mostly only be limited by what you can figure out how to do in your language of choice, regardless of the bot you're writing.
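As a rough illustration of the "pattern matches in a piece of text" part, here is a minimal, hypothetical sketch of the kind of matching a simple bot might do with Python's standard library (the patterns are illustrative, not production-grade):

```python
import re

# Illustrative patterns a bot might use: @mentions and URLs in a post.
MENTION_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")

def extract_features(text):
    """Return the @mentions and URLs found in a piece of text."""
    return {
        "mentions": MENTION_RE.findall(text),
        "urls": URL_RE.findall(text),
    }

post = "Thanks @alice, the write-up is at https://example.com/notes"
print(extract_features(post))
# → {'mentions': ['alice'], 'urls': ['https://example.com/notes']}
```

Everything else a bot does (saving files, math, image conversion) layers on top of small building blocks like this.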

Wherever possible, you should use official APIs (which will often support Python these days), or at least third-party APIs that are built on top of the official ones. The APIs are sort of like a mediator between your bot and the service, or a menu of remotely-accessible functionality -- for Twitter it might include things like "get the list of tweet IDs posted by this user ID in the last month" and "get the full text and metadata for this tweet ID." The set of functionality in that API determines what is and isn't possible for your bot to do (and depending on the service, it might actually hide a lot of complexity around sending messages to multiple servers, authenticating the request, etc.).
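Under the hood, one of those "menu items" is ultimately just a structured HTTP request. A hedged sketch, where the endpoint and parameter names are hypothetical (real services document their own paths and auth schemes):

```python
from urllib.parse import urlencode

# Hypothetical API base URL, standing in for a real service's endpoint.
BASE = "https://api.example.com/2/users"

def user_tweets_url(user_id, days=30):
    """Build the request URL for 'tweets by this user in the last N days'."""
    query = urlencode({"days": days, "max_results": 100})
    return f"{BASE}/{user_id}/tweets?{query}"

print(user_tweets_url(12345))
# → https://api.example.com/2/users/12345/tweets?days=30&max_results=100

# A real client would send this with an auth header, e.g. with requests:
#   requests.get(url, headers={"Authorization": f"Bearer {token}"})
```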

When there's no API (or if the official API doesn't let you do something that you know should be possible) you usually have to switch to scraping. It's error-prone (could break any time) and frowned on by a lot of services (which is why you have to think about rate limiting and bans -- you may well be violating their terms of service, and either way you're using the service in unintended ways that might interfere with its normal functioning). "Unofficial APIs" are often just scrapers under the hood, tidied up into something that looks more like a normal API. I've written a ton of little scrapers in Python -- it really is a great tool for the job.
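To make the scraping idea concrete, here is a minimal sketch using only Python's standard library. Real scrapers usually pair a fetcher (requests/urllib) with a parser (BeautifulSoup/lxml); the HTML is inlined here so the example runs without network access:

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag encountered in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

page = '<html><body><a href="/post/1">one</a> <a href="/post/2">two</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)
# → ['/post/1', '/post/2']

# When fetching many pages, sleep between requests (time.sleep) to respect
# rate limits -- exactly the bans-and-throttling issue mentioned above.
```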

I suppose the other case is that some services can be built in standardized ways, so you don't need an official API from that particular company, because anyone else's API for that standard should be interoperable. That's common for databases, for example, but probably not the services you're talking about -- the popular web services are usually either proprietary, or a "standard" they invented that no one else actually uses, so you're basically stuck with the official API anyways.

For a lot of examples of integrating with public APIs, you can try Mining the Social Web from O'Reilly. I didn't actually spend a lot of time with that book personally (I wasn't expecting the sort of "cookbook" format with lots of examples and code) but it might cover some of the APIs that you're interested in.

u/wscottsanders · 3 points · r/learnpython

You might look at the O'Reilly book "Mining the Social Web". I've found it very helpful, and the author has even responded to my questions about how to get the virtual machine with the IPython notebooks up and running. It has code examples, explanations of the underlying technologies, and intros to things like natural language processing included.

http://www.amazon.com/Mining-Social-Web-Facebook-LinkedIn/dp/1449367615

u/rsoccermemesarecance · 2 points · r/datascience

Mining the Social Web

Not exactly what you're looking for but it's very helpful, imo

u/sazken · 2 points · r/GetStudying

Yo, I'm not getting that image, but at a base level I can tell you this -

  1. I don't know if you know any R or Python, but there are good NLP (Natural Language Processing) libraries available for both

    Here's a good book for Python: http://www.nltk.org/book/

    A link to some more: http://nlp.stanford.edu/~manning/courses/DigitalHumanities/DH2011-Manning.pdf

    And for R, there's http://www.springer.com/us/book/9783319207018
    and
    https://www.amazon.com/Analysis-Students-Literature-Quantitative-Humanities-ebook/dp/B00PUM0DAA/ref=sr_1_9?ie=UTF8&qid=1483316118&sr=8-9&keywords=humanities+r

    There's also this https://www.amazon.com/Mining-Social-Web-Facebook-LinkedIn/dp/1449367615/ref=asap_bc?ie=UTF8 for web scraping with Python

    I know the R context better, and using R, you'd want to do something like this:

  2. Scrape a bunch of sites using the R library 'rvest'
  3. Put everything into a 'Corpus' using the 'tm' library
  4. Use some form of clustering (k-nearest neighbor, LDA, or Structural Topic Model using the libraries 'knn', 'lda', or 'stm' respectively) to draw out trends in the data

    And that's that!
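The scrape → corpus → cluster pipeline sketched above in R has a rough Python analogue. This is a hedged, standard-library-only toy (real work would reach for scikit-learn's TfidfVectorizer and KMeans, or gensim for LDA): documents become bag-of-words vectors, then a simple cosine-similarity threshold groups them.

```python
import math
from collections import Counter

# Toy "corpus" standing in for scraped pages.
docs = [
    "cats purr and cats sleep",
    "dogs bark and dogs fetch",
    "sleepy cats purr softly",
    "loud dogs bark often",
]

def vectorize(text):
    """Bag-of-words vector: word -> count."""
    return Counter(text.split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

vectors = [vectorize(d) for d in docs]

# Greedy clustering: each doc joins the first cluster whose representative
# (first member) it resembles above a threshold; otherwise it starts one.
clusters = []
for i, vec in enumerate(vectors):
    for cluster in clusters:
        if cosine(vec, vectors[cluster[0]]) > 0.3:
            cluster.append(i)
            break
    else:
        clusters.append([i])

print(clusters)
# → [[0, 2], [1, 3]]  (cat docs together, dog docs together)
```

The threshold-based grouping stands in for the k-NN/LDA/STM step; the real libraries give you properly weighted features and principled cluster assignments.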