u/NAMOS · 10 points · r/onions

Basically any SRE advice for a normal service applies, but replace/complement HAProxy / nginx / ingress controller / ELB with the Tor daemon / OnionBalance.

I run Ablative Hosting and we have a few people who value uptime over anonymity, so we follow the usual processes for keeping stuff online.

Have multiples of everything (especially stuff that doesn't keep state), and make sure you monitor everything: connections, memory pressure, open files, free RAM, etc.
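As a rough illustration of the kind of checks meant here, the following is a minimal sketch, assuming a Linux host with /proc available. The metric names, the limits format and the function names are all made up for the example; a real setup would more likely use an existing monitoring agent.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def check_thresholds(metrics, limits):
    """Return the names of metrics that crossed their limits.

    `limits` is a hypothetical format: name -> ("min" | "max", threshold).
    """
    alerts = []
    for name, (kind, threshold) in limits.items():
        value = metrics.get(name)
        if value is None:
            continue
        if kind == "min" and value < threshold:
            alerts.append(name)
        elif kind == "max" and value > threshold:
            alerts.append(name)
    return alerts


def collect_local_metrics():
    """Linux-only: read available memory and this process's open file count."""
    with open("/proc/meminfo") as fh:
        mem = parse_meminfo(fh.read())
    return {
        "mem_available_kb": mem.get("MemAvailable", 0),
        "open_fds": len(os.listdir("/proc/self/fd")),
    }
```

Calling `check_thresholds(collect_local_metrics(), {"mem_available_kb": ("min", 500000), "open_fds": ("max", 1024)})` returns the list of metrics that need attention; the thresholds shown are placeholders, not recommendations.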

Think of the Tor daemon's onion service as a TCP reverse proxy with load-balancing capability, then follow any other advice on building reliable infrastructure:

https://www.amazon.co.uk/Site-Reliability-Engineering-Production-Systems/dp/149192912X/
https://www.amazon.co.uk/Infrastructure-Code-Managing-Servers-Cloud/dp/1491924357/
https://www.amazon.co.uk/Architecting-Scale-Lee-Atchinson/dp/1491943394/
https://www.amazon.co.uk/Building-Microservices-Sam-Newman/dp/1491950358/
https://www.amazon.co.uk/Practical-Monitoring-Mike-Julian/dp/1491957352/
https://www.amazon.co.uk/Cloud-Native-Infrastructure-Justin-Garrison/dp/1491984309/
https://www.amazon.co.uk/Designing-Distributed-Systems-Brendan-Burns/dp/1491983647/
https://www.amazon.co.uk/Databases-at-Scale-Operations-Engineering/dp/1491925949/
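To make the "TCP reverse proxy with load-balancing capability" mental model concrete, here is a minimal sketch of a generic round-robin TCP reverse proxy. It is not how Tor or OnionBalance work internally; it only shows the role they play in front of your backends. The backend addresses and the listen port are hypothetical.

```python
import asyncio
import itertools

# Hypothetical backends: the application servers sitting behind the proxy.
BACKENDS = [("127.0.0.1", 8081), ("127.0.0.1", 8082)]


def round_robin(backends):
    """Return a function that hands out backends in rotation."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while True:
            data = await reader.read(65536)
            if not data:
                break
            writer.write(data)
            await writer.drain()
    finally:
        writer.close()


def make_handler(pick_backend):
    """Build a connection handler that forwards each client to one backend."""
    async def handle(client_reader, client_writer):
        host, port = pick_backend()
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            # Backend is down: drop the client instead of hanging.
            client_writer.close()
            return
        # Relay both directions; errors on one side simply end the session.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
            return_exceptions=True,
        )
    return handle


async def main(listen_host="127.0.0.1", listen_port=8080):
    """Listen locally and spread incoming connections across BACKENDS."""
    handler = make_handler(round_robin(BACKENDS))
    server = await asyncio.start_server(handler, listen_host, listen_port)
    async with server:
        await server.serve_forever()
```

Running `asyncio.run(main())` starts the proxy. The sketch has no health checks or retries, which is exactly the kind of gap the books above teach you to close.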

Once you've got to grips with running a reliable service, you can start layering your onion reverse proxy / load balancing on top:
Isolate your daemons from the clearnet (clearnet example: CloudFlare 'Access' suggests you only allow CloudFlare's IPs through your firewall, guaranteeing that only traffic scrubbed by them reaches your servers).
Configure OnionBalance (clearnet example: using keepalived / CARP / F5 LTMs to have redundancy for the ingress controller).
Harden your software. Unlike the clearnet examples of security, you'll want to ensure that your software can't be 'tricked' into making a call to a hostile server (e.g. "Please fetch and attach content at this URL"). You can do this with tricks like Ablative's "QuadHop" setup, or simply ensure that your software doesn't have such functionality.
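If the software does need a "fetch this URL" feature, one common mitigation is an explicit allowlist for outbound requests. The following is a minimal sketch of that idea, not a description of Ablative's "QuadHop" setup; the hostname and function name are invented for the example.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this application may ever fetch from.
ALLOWED_FETCH_HOSTS = {"media.internal.example"}


def is_fetch_allowed(url, allowed_hosts=ALLOWED_FETCH_HOSTS):
    """Return True only for http(s) URLs whose host is explicitly allowlisted."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        # Malformed URLs (e.g. broken IPv6 literals) are rejected outright.
        return False
    if parts.scheme not in ("http", "https"):
        return False
    return host is not None and host in allowed_hosts
```

This only checks the URL as given. It does not cover redirects or DNS tricks, so the firewall-level isolation described above is still needed as the real backstop.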

All of this aside, check /u/alecmuffett's "Onions that don't suck" repo for examples that are both well set up and stable.

TL;DR: Tor is just a TCP reverse proxy with load-balancing capabilities; go learn some DevOps doodads.

Edit: As per Alec's comment, clarified that Tor is technically a reverse proxy with load-balancing capabilities rather than a straight-up TCP load balancer.

u/Himekat · 10 points · r/cscareerquestions

I don't really have many good resources for you. I don't read a lot of technical books or websites/blogs outside of my day-to-day job. I've heard some pretty amazing things about Site Reliability Engineering and Effective DevOps, but I have yet to read either of them.

Overall, as you move forward in your career, I would encourage you to learn as much as you can about the ecosystem your code lives in. A lot of people who go into DevOps have really broad and comprehensive knowledge about the entire stack, all the way from networking and servers, to writing code, to building/deploying/hosting that code, to performance tuning that code, to logging and monitoring issues within the code, etc. Some developers really get stuck on "well, I've written the application, so I'm done, right?" but really there's a lot more to it, and that's what DevOps people know and do.

u/leemachine85 · 7 points · r/devops

My advice, don't worry about coding or experience in this or that solution.

Read, Read, Read!

Learn the discipline, the fundamentals, know what DevOps is.

Start with this:

https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X

u/SuperQue · 5 points · r/sysadmin

I wouldn't say it's too early to get a good high-level overview of best practices.

It's always good to have a solid theoretical/philosophical understanding before diving into the specifics.

I'll put my vote down for Site Reliability Engineering as a good reference for high-level ideas, with less on the low-level details.

u/Uzthunder · 4 points · r/devops

Site Reliability Engineering: How Google Runs Production Systems

u/YuleTideCamel · 3 points · r/learnprogramming

Sure, I really enjoy these podcasts:

.NET Rocks. Even if you don't do .NET, they cover high-level concepts and talk to a lot of smart people about various things. On a recent episode they mentioned KataCoda, which is an amazing site for learning Docker. I had never heard of it.
Hanselminutes. Great technology podcast.
JavaScript Jabber
Ruby Rogues

As for books, here are some tech books I have read and enjoyed:
Clean Code: Amazing book on how to write good, maintainable code. It's language agnostic.
The Clean Coder: Same author as ^ but instead of code, it focuses on being a professional developer and the skills needed to succeed in the industry.
Site Reliability Engineering: How Google Runs Production Systems (https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X): I'm reading this book now and it's a great look into how you maintain high-scale web applications.
Notes to a Software Team Leader: Growing Self Organizing Teams. Even if you're still learning, it's a good book to see what a good lead is like, and hopefully one day you can grow into that role.
Head First Design Patterns
Code: The Hidden Language of Computer Hardware and Software

There's obviously a ton of other books, but those immediately come to mind.

u/atoi · 3 points · r/sysadmin

If I were you I would look into the site reliability engineering book. The first few chapters address your question pretty well. The tl;dr version is don't guess. Get stakeholders (specifically management) to agree to a specific SLA target based on your needs and provide the funding necessary to hit those numbers.
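To show why the exact target matters when you have that conversation, here is a small sketch of the downtime arithmetic behind an availability number. The function name and the 30-day window are assumptions for illustration, not something taken from the book.

```python
def allowed_downtime_minutes(target_percent, window_days=30):
    """Minutes of downtime permitted by an availability target over a window."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    # The unavailable fraction of the window is the whole downtime budget.
    return window_minutes * (100 - target_percent) / 100
```

For example, a 99.9% target over 30 days leaves roughly 43 minutes of downtime, while 99.99% leaves a little over 4 minutes. Each extra nine shrinks the budget tenfold, which is why the funding has to be agreed along with the number.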

u/AccomplishedAdmin · 3 points · r/ITCareerQuestions

Sysadmin, I've been doing Lead SysEng/DevOps/SRE for the past 4 years and literally have multiple written offers I'm trying to choose from right now.

I only started looking 3 weeks ago.

Learn multiple clouds (I've done the big 3 in prod and other ones for utils/tools/hobby/legacy systems), Kubernetes/Docker, Linux, distributed systems and Ansible/Puppet/Chef.

Read this:
https://www.amazon.com/Practice-Cloud-System-Administration-Practices/dp/032194318X/
and this:
https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X/

See if you can buy time for the internship offer; having multiple offers is always better :)
Is the internship paid?

u/Skylis · 2 points · r/networking

The SRE book

https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X

u/Kaelin · 2 points · r/devops

This one

Site Reliability Engineering: How Google Runs Production Systems

Free Online: https://landing.google.com/sre/book.html

Amazon: https://www.amazon.com/dp/149192912X/ref=cm_sw_r_cp_tai_wxJdAbNKAZKZH

u/Squibbles1077 · 2 points · r/devops

The O'Reilly "Site Reliability Engineering" book was well worth reading imo

https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X/

u/CSMastermind · 2 points · r/AskComputerScience

Senior Level Software Engineer Reading List

Read This First

u/zachpuls · 1 point · r/networking

Here's what I used to use:
https://www.amazon.com/gp/product/0521899435/ref=ox_sc_saved_title_2?smid=ATVPDKIKX0DER&psc=1

https://www.amazon.com/gp/product/149192912X/ref=ox_sc_saved_title_1?smid=A26MBA1SHJSNWL&psc=1

But here are some other good reads/materials:

http://www.aosabook.org/en/distsys.html

https://disco.ethz.ch/courses/podc_allstars/lecture/podc.pdf

I can't find the link I was thinking about, but here's a similar one: http://lms.uop.edu.jo/lms/pluginfile.php/2069/mod_resource/content/0/designing-distributed-systems-google-case-study.pdf

u/solid7 · 1 point · r/learnprogramming

Google site reliability engineering wrote a book. Check it out.

u/studweiser83 · 1 point · r/devops

You wanna be devops? Read how the best do it, then take classes to answer everything you have questions about: https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X

u/fireduck · 1 point · r/almosthomeless

Grow a beard, read this book (http://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X) and apply for SRE jobs. Lots of them in Seattle, if my email inbox is any indicator.

I'm not joking about the beard.

u/pooogles · 1 point · r/sysadmin

>How did you get started in DevOps?

I watched https://www.youtube.com/watch?v=LdOe18KhtT4. I realised this was the future and if you wanted to be in a high performing organisation you need to do what they're doing.

Unless you're in an organisation that is willing to undergo the cultural change of Operations and Development working together you're probably not going to go far. Creating a devops organisation from scratch is HARD unless everyone is on board.

Looking into the technology is the simple part; try reading around the movement. The Phoenix Project (http://www.amazon.com/Phoenix-Project-DevOps-Helping-Business/dp/0988262509) is a good start. From there I'd look into Continuous Integration and Continuous Delivery (https://www.amazon.co.uk/Continuous-Integration-Improving-Software-Signature/dp/0321336380 & https://www.amazon.co.uk/Continuous-Delivery-Deployment-Automation-Addison-Wesley/dp/0321601912).

If by this point you don't know a programming language you're going to be in serious trouble. Learn something, be it PowerShell (and honestly you will probably want to move on to C# if you want to be amazing at what you do) or Python/Ruby.

Honestly you should be working towards what Google does with SRE if you want to be at the leading edge. https://www.amazon.co.uk/Site-Reliability-Engineering-Production-Systems/dp/149192912X.

u/wolfador · 1 point · r/cscareerquestions

Have you read https://www.amazon.com/Site-Reliability-Engineering-Production-Systems/dp/149192912X already? It covers a bit of who/how they hire along with what they do. Might help some. Good luck!

Reddit reviews Site Reliability Engineering: How Google Runs Production Systems

18 Reddit comments about Site Reliability Engineering: How Google Runs Production Systems:
