Best data storage & retrieval books according to redditors

We found 35 Reddit comments discussing the best data storage & retrieval books. We ranked the 9 resulting products by the number of redditors who mentioned them. Here are the top results.

Top Reddit comments about Network Storage & Retrieval Administration:

u/librik · 13 points · r/programming

Bit-parallel text search algorithms like this are all covered in ridiculous detail in the book Flexible Pattern Matching in Strings by Gonzalo Navarro and Mathieu Raffinot. (Also, don't skip the errata.) It's a good book (if you're willing to accept that it's biased propaganda for bitwise string searching while claiming to be an impartial recipe book for all string search algorithms), and it even deals with fuzzy approximate text matching using these techniques.

What makes it so interesting is that this is the first time I've seen the full extent of bit twiddling hacks, developed in theoretical depth in "Hacker's Delight", really pushed extensively to solve a real problem. I mean, occasionally you would see Population Count or Clear Lowest Bit as an optimization trick in a chess program, but these guys use all of it as the basic technology for their field.

Sun Wu and Udi Manber were probably the first people to spot that, so long as the complete set of states fits inside a machine word, the CPU can be seen as simulating a Nondeterministic Finite Automaton very, very quickly. At that point, the race was on, and every new "bit hack" discovered extended the range of NFAs that could run inside a register. Combine that with code generation, as the article here does, and you've got something that runs like a bat out of hell. (But you'll notice the big problem is that it can tell you a regular expression match ends, but it can't tell you where it starts!)
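To make the "NFA in a register" idea concrete, here is a minimal sketch of the exact-match Shift-And variant of bitap. It is an illustration only, not code from the book or the linked article; the function name `shift_and_search` is made up for this example. Python integers are unbounded, so the machine-word limit that matters in C (pattern length up to 64) does not show up here.

```python
def shift_and_search(pattern: str, text: str) -> list[int]:
    """Return the index of the last character of every match of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []

    # masks[c] has bit i set exactly when pattern[i] == c.
    masks: dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that marks "whole pattern matched"
    state = 0              # bit i set <=> pattern[:i + 1] matches ending here
    ends = []
    for pos, ch in enumerate(text):
        # Advance every active NFA state by one step, start a fresh attempt
        # (the "| 1"), and keep only states consistent with this character.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            ends.append(pos)
    return ends


print(shift_and_search("aba", "ababa"))
```

Note that the automaton only reports where a match ends. For a fixed string the start is simply `end - len(pattern) + 1`, but a regular expression has no fixed length, which is why recovering the start is the hard part mentioned above.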

u/fieldcady · 8 points · r/datascience

In terms of doing data science work, your biggest weakness is likely to be your coding chops. I write a lot about this in my book on data science: the math is largely cookie-cutter for entry-level DS positions and often not even needed, but if you can't write code then you're dead in the water.

Adjusting your role at the current company so that you start to do more coding might be the best way forward if you can wrangle it. Otherwise I would suggest doing a bunch of medium-size (think 250-500 lines of code) projects in Python so that you can chug it out.

u/xroche · 8 points · r/france

Worth reading on this subject: How The Web Was Born, a fascinating book that explains the birth of the Internet.

Why did the Cyclades project, which was ahead of Arpanet at the time, end up dying? Two reasons: politics and politics.

  • The first: it competed with Minitel (and with X.25, now dead)

  • The second: the story goes that when the researchers presented their brilliant invention, the government technocrats asked one question:
    • How do we bill for the packets sent?
    • The researchers' answer: "We can't, it wasn't designed for that!"

      Knowing smiles from the technocrats: X.25, which allows billing per packet, had won the game. It goes to show that short-sightedness and a lack of humility can cost a country its leadership in high technology.

      Ah, rest assured: the petty technocrats who killed Cyclades were celebrated and promoted, as is only proper. The Louis Pouzins of the world, for their part, had to make do with discreet tributes and thanks, notably from Americans.

      Then again, maybe it's not so bad that Arpanet won the war: Europe, with its far-fetched ISO standards and its industry committees as political as they are incompetent, would have driven the project into the ditch anyway.

u/burntsushi · 8 points · r/rust

Aye. And personally, I'm not a huge fan of using edit distance for fuzzy searches in most cases. I've found n-gram (with n=3 or n=4) to be much more effective. And you can use that in conjunction with bitap, for example, by using an n-gram index to narrow down your search space. I use an n-gram index in imdb-rename.
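As a rough illustration of the n-gram idea, here is a minimal in-memory trigram index. This is a sketch under my own assumptions, not how imdb-rename is implemented: the class name `NgramIndex`, the lowercasing, and the Jaccard-style score are all choices made for this example.

```python
from collections import Counter, defaultdict


def ngrams(s: str, n: int = 3) -> set[str]:
    """Return the set of character n-grams of s (lowercased)."""
    s = s.lower()
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


class NgramIndex:
    def __init__(self, n: int = 3):
        self.n = n
        self.names: list[str] = []
        self.postings: defaultdict[str, set[int]] = defaultdict(set)

    def add(self, name: str) -> None:
        doc_id = len(self.names)
        self.names.append(name)
        for gram in ngrams(name, self.n):
            self.postings[gram].add(doc_id)

    def search(self, query: str, limit: int = 5) -> list[tuple[float, str]]:
        query_grams = ngrams(query, self.n)
        # Only names sharing at least one n-gram with the query are candidates.
        shared = Counter()
        for gram in query_grams:
            for doc_id in self.postings.get(gram, ()):
                shared[doc_id] += 1
        scored = []
        for doc_id, count in shared.items():
            union = len(query_grams | ngrams(self.names[doc_id], self.n))
            scored.append((count / union, self.names[doc_id]))
        # Best score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return scored[:limit]


index = NgramIndex()
for title in ["The Matrix", "The Matrix Reloaded", "Toy Story"]:
    index.add(title)
results = index.search("the matrx")
print(results)
```

The candidates that come back could then be re-checked with a slower, more exact matcher such as bitap, which is the "narrow down your search space" step.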

If you like algorithms like bitap, you'll definitely like Flexible Pattern Matching in Strings, which covers generalizations of bitaps for regexes themselves, for both Thompson and Glushkov automata.

u/TeachMeToVlanDaddy · 4 points · r/vmware
u/StudiedUnderSinn · 2 points · r/WTF

Amazingly, 'Altavista Search Revolution' is still for sale, proudly bearing the digital logo!

u/vdm · 1 point · r/programming

5. Read 'Weaving the Web'. In it, TimBL explains that HTML was meant to be just a way of linking to the real documents, in word processing formats, .ps etc. It was not intended to be the primary authoring medium, but people took it and ran with it, confounding their expectations.

Also recommended: How the Web Was Born.

u/notacrackheadofficer · 1 point · r/worldnews

As an American, I'd like to say that allowing Microsoft and UNESCO to dictate our future schooling [Common Core] is the greatest lockstep towards data compiling ever conceived of. The whole world should join us in gobbling the warty cock of dossier collection agencies.
Brzezinski's 1971 book said it all.
Between Two Ages: America's Role in the Technetronic Era. It shows itself to be the actual planning guide, and not an academic predictive theory by some outsider.
A must read, although almost no one will glance at it.
The modern technocrats learned from the early ones that they should not be so public about it.
http://en.wikipedia.org/wiki/Technocracy_movement
That wiki is funny. It alleges that it all went away in the 40s.
But anyone can see that it isn't gone. http://www.economist.com/node/21538698 They are also funny, as in ''It can't happen here!'' LOL
Ray Kurzweil: His books are also must reads for the ten or twenty of us in the world who give a fuck about this modern data mining control operation. He is also very open about what is going on.
He's a bit kooky about the singularity thing, but offers great insight to the workings and processes in place, and being developed.
http://www.kurzweilai.net/data-mining-opens-the-door-to-predictive-neuroscience
http://www.kurzweilai.net/data-mining-our-dreams
This is an awesome, informative, flawed and incomplete book.
http://www.amazon.com/Historical-Information-Science-Emerging-Unidiscipline/dp/1573870714
Here's a great example of public hypnotizing PR
http://www.healthcatalyst.com/genius-thinker-ray-kurzweil-will-show-future
They have us chasing after outdated worm hole nothings. They certainly have the public, and maybe even the world government[s] duped. No one could reasonably doubt that there are bigger/more advanced things happening, than we know about.
The distraction game at this point is like throwing an empty hand ball to a dog. No grand conspiracy necessary, for the last hundred years. Just simple distractions. No grand conspiracy has been necessary, since the advance of mass media.
Common Core will be of no use to the public, and priceless to the technocrats.
http://www.unesco.org/new/en/communication-and-information/
Looks well planned out. Every US school board looks like bumbling fucking idiots. We are fucked.
I hope that everyone notices that I am not a Tea Party, Democrat, GOP, Alex Jones, or Glenn Beck fan. I am an independent researcher whose only theory is ''They are up to something'' and nothing else. I do not know who ''they'' are, or what ''something'' is.
I know I am the only person I know, or have talked to online that has ever read UNESCO's founding document by Julian Huxley. UNESCO's motto ''Building Peace in the Minds of People'' is a hypnosis based nonsense phrase of impressive construction.
It is as attractive and meaningless as it could possibly be.
Classic Milton Erickson material. It sounds like Transcendental Meditation or some Hare Krishna crap. ''Building Peace''. What the fuck? ''Welcome to class children. We are not focusing on education or knowledge anymore. We will be building peace in your minds''
Fucking loonytunes blather, distracting everyone into signing up for data mining, brought to you by Microsoft.
Gates partnered with UNESCO/U.N. to fund “Education For All” as well. See http://bettereducationforall.org/

The “Education For All” developer is UNESCO, a branch of the United Nations. Education For All’s key document is called “The Dakar Framework for Action: Education For All: Meeting Our Collective Commitments.” Read the full text here: http://unesdoc.unesco.org/images/0012/001211/121147e.pdf
At this link, you can learn about how Education For All works:
http://www.unesco.org/new/en/education/themes/leading-the-international-agenda/education-for-all/international-cooperation/high-level-group/
Someone will call me mentally ill today for posting this wealth of information, free of all theory.
Julian's brother Aldous Huxley wrote a book 1000 times more important than Brave New World called ''Ends and Means''. It is an analysis of the coming technocratic/global thinking era, and its pitfalls and downfalls. You will be the only person to have read it in a thousand mile radius if you find it. No one talks about that fantastic book anywhere. It's the non-fiction guide to the road to ''Brave New World''. You would think that you would have heard of it.
Edit: formatting

u/7h3dud3 · 1 point · r/vmware

Purchase copies of both the Clustering Deep Dive and Host Resources Deep Dive books. You can also find digital copies from Rubrik for free at the following:

https://pages.rubrik.com/host-resources-deep-dive_request.html

https://pages.rubrik.com/clustering-deep-dive-ebook.html

If you're going to run vSAN there is also a vSAN Deep Dive book available.

u/ASnugglyBear · 1 point · r/iOSProgramming

Typically iOS devs at their level of polish are using Core Data to store information about the app entities. This is an object graph system that maps to disk, the web, a database, or really anything else.

Marcus Zarra wrote an excellent book on doing this.

To get the numbers, they then do fetches from the object store, then count things. Count things by days, or by type, etc.

u/viv_social · 1 point · r/iOSProgramming

NSUserDefaults is meant to be used with strings, booleans and NSData representations of other (serialised) data. I believe you serialise your class to be stored in UserDefaults, which is fine.

When I started developing on the iPod Touch 2G, I had issues storing and retrieving more than a few hundred kilobytes of data. It took me some months to understand the basics of Core Data (I never wanted to use raw SQLite, which is an option). Even today I have not mastered it, because mastering Core Data only happens with time and experience. I still don't like the way Core Data handles calls from multiple threads, or its merge mechanisms, but that is the way of life :)

Sooner rather than later your dataset will grow and you will be hard pressed for options. I suggest you start with a simple architecture (one entity with one property) and scale up as you learn.

This is a comprehensive guide to learning Core Data and mastering it ;) Core Data: Data Storage and Management for iOS, OS X, and iCloud (Pragmatic Programmers), from one of the masters of Core Data.