Advent of Sysadmin 2025

290 points by lazyant 14 hours ago

Here are 12 Sysadmin/DevOps (they're synonyms now!) challenges, straight from the day job:

  1.  Get a user to stop logging in as root.
  2.  Get all users to stop sharing the same login and password for all servers.
  3.  Get a user to upgrade their app's dependencies to versions newer than 2010.
  4.  Get a user to use configuration management rather than scp'ing config files from their laptop to the server.
  5.  Get a user to bake immutable images w/configuration rather than using configuration management.
  6.  Get a user to switch from Jenkins to GitHub Actions.
  7.  Get a user to stop keeping one file with all production secrets in S3, and use a secrets vault instead.
  8.  Convince a user (and management) you need to buy new servers, because although "we haven't had one go down in years", every one has a faulty power supply, hard drive, network card, RAM, etc., and the hardware's so old you can't find spare parts.
  9.  Get management to give you the authority to force users to rotate their AWS access keys which are 8 years old.
  10. Get a user to stop using the aws root account's access keys for their application.
  11. Get a user to build their application in a container.
  12. Get a user to deploy their application without you.

After you complete each one, you get a glass of scotch. Happy Holidays!

DoctorOW 7 minutes ago

> Get a user to use configuration management rather than scp'ing config files from their laptop to the server.
Damn, this one I'm guilty of. Though I'm not a real Sysadmin/DevOps, I'm just throwing something together and deploying it on a LAN-only VM for security reasons (I don't trust the type of code I would write).
infogulch 25 minutes ago

Q: 3. Get a user to upgrade their app's dependencies to versions newer than 2010.
A: For each dependency, compute its age in years as (max(most recent version release date, date of most recent CVE on library) - used version release date). Average those ages across all dependencies, then sleep for that many seconds before the app starts.
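A minimal sketch of that startup penalty in Python. The `Dependency` fields and function names are hypothetical, made up for illustration; a real version would have to pull release and CVE dates from a package index and an advisory database.

```python
import time
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, Optional

DAYS_PER_YEAR = 365.25  # assumption: average year length, leap years included


@dataclass
class Dependency:
    name: str
    used_release: date                 # release date of the version in use
    latest_release: date               # release date of the newest version
    latest_cve: Optional[date] = None  # date of the most recent CVE, if any


def age_years(dep: Dependency) -> float:
    """Years between the version in use and the newest release or CVE."""
    newest = dep.latest_release
    if dep.latest_cve is not None:
        newest = max(newest, dep.latest_cve)
    return (newest - dep.used_release).days / DAYS_PER_YEAR


def startup_delay_seconds(deps: Iterable[Dependency]) -> float:
    """Average dependency age in years, reinterpreted as seconds."""
    ages = [age_years(d) for d in deps]
    return sum(ages) / len(ages) if ages else 0.0


def sleep_before_start(deps: Iterable[Dependency],
                       sleep: Callable[[float], None] = time.sleep) -> float:
    """Apply the penalty; `sleep` is injectable so it can be stubbed out."""
    delay = startup_delay_seconds(deps)
    sleep(delay)
    return delay
```

Calling `sleep_before_start(deps)` at the top of the app's entry point would apply the delay. Average age in years makes a gentle nudge, since a decade of neglect only costs about ten seconds.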
cobertos 11 hours ago

Re: 6. ... Github Actions
GitHub Actions left a bad taste in my mouth after it randomly removed authenticated workers from the pool once they'd been offline for ~5 days.
This was after setting up a relatively complex PR workflow (an always-on cheap server starts up a very expensive build server with specific hardware), only to have it break randomly after a PR didn't come in for a few days. There was no indication that this happens, and no workaround from GitHub.
There are better solutions for CI; GitHub's is half-baked.
- paulddraper 17 minutes ago
  
  This is documented currently (supposed to be 14 days). [1]
  That said, I have found runners to be unnecessarily difficult.
  But Jenkins has its own quirks, and when I used GitLab, it used ancient docker-machine and outdated AMIs by default.
  I think Buildkite has been the only one to make this easy and scalable. But it is meant for self hosted runners.
  [1] https://docs.github.com/en/enterprise-cloud@latest/actions/h...
- swyx 10 hours ago
  
  bugs happen to all of us. what's your better solution - gitlab?
  - shoo 8 hours ago
    
    Roll 2d6, sum result. Your CI migration target is:
    2. migrate secret manager. Roll again
    3. cloud build
    4. gocd
    5. jenkins
    6. gitlab
    7. github actions
    8. bamboo
    9. codepipeline
    10. buildbot
    11. team foundation server
    12. migrate version control. Roll again
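    For anyone who would rather automate the decision, a rough Python sketch of the table above. The function names are made up for illustration, and it assumes "Roll again" means the extra migration is added on top of whatever the next roll gives.

```python
import random
from typing import List, Optional

# The table from the comment above, keyed by the 2d6 total.
TARGETS = {
    2: "migrate secret manager",
    3: "cloud build",
    4: "gocd",
    5: "jenkins",
    6: "gitlab",
    7: "github actions",
    8: "bamboo",
    9: "codepipeline",
    10: "buildbot",
    11: "team foundation server",
    12: "migrate version control",
}
ROLL_AGAIN = {2, 12}  # these results add work and force another roll


def roll_2d6(rng: random.Random) -> int:
    """Sum of two six-sided dice."""
    return rng.randint(1, 6) + rng.randint(1, 6)


def pick_migrations(rng: Optional[random.Random] = None) -> List[str]:
    """Roll until a result that does not say 'Roll again' comes up."""
    if rng is None:
        rng = random.Random()
    plan = []
    while True:
        total = roll_2d6(rng)
        plan.append(TARGETS[total])
        if total not in ROLL_AGAIN:
            return plan
```

    Since 7 is the most likely 2d6 total, the table quietly favours GitHub Actions.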
    
    swyx 7 hours ago
    
    somehow i am really liking the kind of people that comment in the comment sections of sysadmin posts. i wonder what personality type this is
    
    n4bz0r 6 hours ago
    
    Sysadmin.
    
    speakspokespok 5 hours ago
    
    SysEng
  - esseph 9 hours ago
    
    GitLab pipelines are really good.
    
    Balinares 8 hours ago
    
    Not in love with its insistence on recreating the container from scratch every step of the pipeline, among a bundle of other irksome quirks. There are certainly worse choices, though.
    
    friendzis 3 hours ago
    
    Opposite of Jenkins where you have shared workspaces and have to manually ensure workspace is clean or suffer from reproducibility issues with tainted workspaces.
  - sharts 10 hours ago
    
    honestly jenkins really isn't that bad
    
    friendzis an hour ago
    
    Hudson/Jenkins is just not architected for large, multi-project deployments, isolated environments and specialized nodes. It can work if you do not need these features, but otherwise it's a fight against the environment.
    You need a beefy master and it is your single point of failure. Untimely triggers of heavy jobs overwhelm controller? All projects are down. Jobs need to be carefully crafted to be resumable at all.
    Heavy reliance on master means that even sending out webhooks on stage status changes is extremely error prone.
    When your jobs require certain tools to be available you are expected to package those as part of agent deployment as Jenkins relies on host tools. In reality you end up rolling your own tool management system that every job has to call in some canonical manner.
    There is no built-in way to isolate environments. You can harden the system a bit with various ACLs, but in the end you either have to trust projects or build up and maintain infrastructures for different projects isolated at host level.
    In cases when time-wise significant processing happens externally, you have to block an executor.
    
    bionsystem 7 hours ago
    
    Yeah I was thinking of using it for us actually. Connects to everything, lots of plugins, etc. I wonder what the hate is from, they are all pretty bad aren't they ?
    Will test forgejo's CI first as we'll use the repo anyway, but if it ain't for me, it's going to be jenkins I assume.
    
    n4bz0r 6 hours ago
    
    Cons:
    - DSL is harder to get into.
    - Hard to reproduce a setup unless builds are in DSL and Jenkins itself is in a fixed version container with everything stored in easily transferable bind volumes; config export/import isn't straightforward.
    - Builds tend to break in a really weird way when something (even external things like Gitea) updates.
    - I've had my setup broken once after updating Jenkins and not being able to update the plugins to match the newer Jenkins version.
    - Reliance on system packages instead of containerized build environment out of the box.
    - Heavier on resources than some of the alternatives.
    Pros:
    - GUI is getting prettier lately for some reason.
    - Great extendability via plugins.
    - A known tool for many.
    - Can mostly be configured via GUI, including build jobs, which helps to get around things at first (but leads into the reproducibility trap later on).
    Wouldn't say there is a lot of hate, but there are some pain points compared to managed Gitlab. Using managed Gitlab/Github is simply the easiest option.
    Setting up your own Gitlab instance + Runners with rootless containers is not without quirks, too.
    
    bionsystem 5 hours ago
    
    I have previous experience with it. I agree with most points. Jobs can be downloaded as XML config and thus kept/versioned. But the rest is valid. I just don't want to manage GitLab; we already have it at corp level, we just can't use it right now in preprod/prod, and I need something which will be either throwaway or kept just for very specific tasks that shouldn't move much in the long run.
    
    n4bz0r 4 hours ago
    
    For a throwaway, I don't think Jenkins will be much of a problem. Or any other tool for that matter. My only suggestion would be to still put some extra effort into building your own Jenkins container on top of the official one [0]. Add all the packages and plugins you might need to your image, so you can easily move and modify the installation, as well as simply see what all the dependencies are. Did a throwaway, non-containerized Jenkins installation once which ended up not being a throwaway. Couldn't move it into containers (or anywhere for that matter) without really digging in.
    Haven't spent a lot of time with it myself, but if Jenkins isn't of much appeal, Drone [1] seems to be another popular (and lightweight) alternative.
    [0] https://hub.docker.com/_/jenkins/
    [1] https://www.drone.io
n4bz0r 6 hours ago

> Sysadmin/DevOps (they're synonyms now!)
I've notified the authorities and social services.
jagged-chisel 12 hours ago

> … from Jenkins to GitHub Actions.
Oh, good lord why?
- vachina 11 hours ago
  
  Because sysadmin wants to outsource their responsibilities (and job).
f1shy 5 hours ago

>> Sysadmin/DevOps (they're synonyms now!)
Is this really like that? Isn't there any Unix/DBA anymore? I associate DevOps to what at my time we called "operations" and "development". We had 5 teams or so:
1) Developers, who would architect and write code, 2) Operations who would deploy, monitor and address customer complaints, 3) Unix (aka SYS) administrators, who would take care of housekeeping of well, the OS (and web servers/middleware), 4) DBA who would be monitoring and optimizing Oracle/Postgres, and 5) Network admins, who would take care of Load Balancers, Routers, Switches, Firewalls (well, there were 2 security experts for that also)
So I think DevOps would be a mix of 1&2, to avoid the daily wars that would constantly happen "THEY did it wrong!"
Can somebody clear my mind, please!? It seems I was out of it for too long?!
- rtp4me 32 minutes ago
  
  For 4) - consider PGHero[1] and PGTuner[2] instead of a full-time DBA. We use both in production and they work very well to help track down performance issues with Postgres.
  [1] https://github.com/ankane/pghero
  [2] https://pgtune.leopard.in.ua/
  Edit: For the record, I have worked at a few small companies as the "SysAdmin" guy who did the whole complement of servers, OS, storage, networking, VMs, DB, perf tuning, etc.
- Wilya 4 hours ago
  
  In full-cloud environments, in small/middle companies I've worked at:
  Developers handle 1). Devops handle 2)/3)/5). Nobody does 4)
  - f1shy 3 hours ago
    
    Thanks. That is an interesting insight into the current reality. I assume the developers take care of optimization of queries; set up indexes and development of schemas and DB backups is handled by devops.
    I must say, again I thought (I read it somewhere?) DevOps should take care of the constant battle between Devs and Operations (I've seen enough of that in my time) by merging 1 and 2 together. But it seems to be just a name change, and if anything, it seems worse, as an (IMHO) critical and central component, like the DB, now has totally distributed responsibilities. I would like to know what happens when e.g. a DB crashes because a filesystem is full, "because one developer made another index, because one from devops had a complaint because X was too slow".
    Either the people are far more professional than in my time, or it must be a shitshow to watch while eating popcorn.
    
    friendzis an hour ago
    
    > DevOps should take care of the constant battle between Devs and Operations
    In practice there is no way to relay "query fubar, fix" back, because we are much agile, very scrum: feature is done when the ticket is closed, new tickets are handled by product owners. Reality is antithesis of that double Ouroboros.
    In practice developers write code, devops deploy "teh clouds" (writing yamls is the deving part) and we throw moar servers at some cloud db when performance becomes sub-par.
  - sgarland 3 hours ago
    
    Nobody does 4 until they’ve had multiple large incidents involving DBs, or the spend gets hilariously out of control.
    Then they hire DBREs because they think DBA sounds antiquated, who then enter a hellscape of knowing exactly what the root issues are (poorly-designed schemata, unperformant queries, and applications without proper backoff and graceful degradation), and being utterly unable to convince management of this (“what if we switched to $SOME_DBAAS? That would fix it, right?”).
  - avhception 4 hours ago
    
    Can confirm: that's exactly what we do.
betaby 11 hours ago

5. and 6. are a matter of taste (trade-offs), the rest is spot on!
daemonologist 11 hours ago

You get me the permissions to do half of this stuff, and I'll do whatever you want.
Nextgrid 4 hours ago

> Get a user to stop logging in as root.
It really depends if the machine is hosting anything that you don't want some users to access. If the machine is single-purpose and any user is already able to access everything valuable from it (DB with customer data, etc) or trivially elevate to root (via sudo, docker access, etc) then it's just pointless extra typing and security theatre.
athrowaway3z 6 hours ago
```
  9.  Get management to give you the authority to force users to rotate their AWS access keys which are 8 years old.
```
Saying "keys which are 8 years old" implies you're worried about the keys themselves, which is just wrong. (Their security state depends on monitoring)
You can definitely make a strong argument that the organization needs practice rotating, so I would advise reframing it as an org-survivability-planning challenge and not a key-security issue.
technion 8 hours ago

I know it's a common view that sysadmin/devops are the same these days, but with a current sysadmin role nothing you've mentioned sounds relevant. Let me give you my list:
1. Patch Microsoft Exchange with only a three hour outage window
2. Train a user to use OneDrive instead of emailing 50MB files back and forth
3. Set up eight printers for six users. Deal with 9GB printer drivers.
4. Ask an exec if he would please let you add MFA to their mailbox.
5. Sit there calmly while that exec yells like a WWE wrestler about the ways he plans to ruin you in response
6. Debate the cost of a custom mouse pad for one person across three meetings
7. Deploy any standard Windows app that expects everyone to be an administrator without making everyone an administrator
8. Deploy an app that expects UAC disabled without disabling UAC
9. Debug some finance person's 9000 line Excel function
- hnlmorg 6 hours ago
  
  That sounds more like Desktop Support than a SysAdmin role. My condolences if that's the job you landed when interviewing for a SysAdmin role
- hansmayer 7 hours ago
  
  What you describe sounds more like a MS "Modern Workplace" / IT support in a corporate environment.
  - technion 5 hours ago
    
    Are we arguing that corporate workers aren't "real sysadmins"?
    
    jabroni_salad 31 minutes ago
    
    HN culture as a whole doesn't really recognize the validity of businesses that buy software vs build software.
    
    jagged-chisel 5 hours ago
    
    Pretty sure they mean “general IT support isn’t sysadmin work.”
  - Xiol 6 hours ago
    
    i.e., Hell
alberth 11 hours ago

I’d be super interested to see solutions to each, just to learn from.
JuniperMesos 9 hours ago

A lot of these problems seem pretty solvable, if you're the admin of the machine (or cloud system) and the user isn't.
If you don't want a user to log in as root, disable the root password (or change it to something only you know) and disable root ssh. If you want people to stop sharing the same login and password across all servers, there's several ways to do it but the most straightforward one seems like it would be to enforce the use of a hardware key (yubikey or similar) for login. If people aren't using configuration management software and are leaving machines in an inconsistent state, again there are several options but I'd look into this NixOS project: https://github.com/nix-community/impermanence + some policy of rebooting the machines regularly.
If you don't like how users are making use of AWS resources and secrets, then set up AWS permissions to force them to do so the correct way. In general if someone is using a system in a bad or insecure way, then after alerting them with some lead time, deliberately break their workflow and force them to come to you in order to make progress. If the thing you suggest is actually the correct course of action for your organization, then it will be worthwhile.
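On the "disable root ssh" step: in OpenSSH this usually comes down to the `PermitRootLogin` directive in `sshd_config`. A rough audit sketch in Python, under loud assumptions: it is a simplified parse that stops at the first `Match` block and does not follow `Include` files, the `prohibit-password` default reflects recent OpenSSH releases and may differ on older systems, and the function names are made up for illustration.

```python
def effective_permit_root_login(config_text: str,
                                default: str = "prohibit-password") -> str:
    """Return the global PermitRootLogin value from sshd_config text.

    sshd uses the first value it finds for a keyword, so the first
    occurrence wins. Keywords are case-insensitive.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        # Keyword and value may be separated by whitespace or a single '='.
        parts = line.replace("=", " ", 1).split()
        if not parts:
            continue
        keyword = parts[0].lower()
        if keyword == "match":
            break  # simplification: ignore conditional blocks entirely
        if keyword == "permitrootlogin" and len(parts) > 1:
            return parts[1].lower()
    return default


def root_ssh_disabled(config_text: str) -> bool:
    """True only when root login over SSH is refused outright."""
    return effective_permit_root_login(config_text) == "no"
```

Feeding it the contents of the server's `sshd_config` gives a quick yes/no, though the authoritative check is whatever effective configuration the running daemon reports.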
- philipwhiuk 4 hours ago
  
  None of them are technically hard. All of them are bureaucracy-hard.
  If you just do any of this list without the proper migration plan/time, someone senior in the org will complain and you will lose.
  - jakeydus 3 hours ago
    
    > If you just do any of this […], some senior in the org will complain and you will lose.
    More accurate statement imo.
- skywhopper 3 hours ago
  
  It’s not as easy as “I can technically change this”. If you think it is, you don’t understand the job of a sysadmin.
UltraSane 2 hours ago

Best practice is to use IP-restricted keys.

melvinodsa 14 hours ago

When I get sad and have nothing to do in the world, maybe hacking into a sad server's problem seems very interesting

kralos 11 hours ago

    imagine typing in a terminal...
    you want to delete the previous word so press ctrl+w...
    actually you're in a browser; the window closes...

:sadness:

melvinodsa 11 hours ago

We used to run a terminal in the browser using https://github.com/yudai/gotty and the entire dev team remapped their Ctrl+w to Ctrl+`. We did frontend and backend development with this setup for almost 1.5 years. Muscle memory stuck, and to this day I still fear that my actual terminal will close if I use Ctrl+w :P
tambourine_man 3 hours ago

Which is why macOS command key is such an undervalued nicety. One key for GUI stuff, one for command-line stuff.
protomikron 5 hours ago

You can use ctrl+shift+t to open the recently closed tab again.
fduran 11 hours ago

hello, creator here, sorry about that. In this case you can click again on the "Open the Server Terminal in a New Window" button
- kralos 5 hours ago
  
  It would be cool if we could SSH into the temporary host (I'm guessing these hosts currently aren't internet connected to avoid abuse so might not be possible or require some super careful firewalling)
CoolCold 8 hours ago

I feel your pain - bites me from time to time, especially in KVM ;)

truekonrads 3 hours ago

I absolutely love the sadservers. Can’t wait for windows version.

irusensei 6 hours ago

It seems it's called SRE nowadays right? I hate how things keep being renamed for no reason other than making more buzzwords for suits.

phrotoma 2 hours ago

The definition I liked best, which I _think_ came from one of the Google SRE books though I'm not certain, was: "SRE is what happens when you consider operations to be a software problem".
oarmstrong 4 hours ago

I share your disdain for buzzwords but SRE is definitely a different role.
kortilla 3 hours ago

Nope, SREs keep applications running on a platform. Lots of metrics, tools to deploy apps in whatever rollout process the company has, etc.
In small companies, sysadmin might be a duty of the SRE team, but they definitely diverge if you have a large on-prem deployment or work with bespoke VMs in the cloud.

ofrzeta 5 hours ago

It doesn't seem to record my progress.

dontdoxxme 5 hours ago

Without sharing too many spoilers... I solved the challenge but the check script was unhappy. The curl commands in the script worked fine, the earlier parts of the script failed, i.e. it didn't like how I'd decided to make that work.

This kind of thing annoys me. This is why CTFs are great, where the goal is to get the flag string. Obviously harder to do for sysadmin, but expecting a particular configuration when I managed to make it work without doing things exactly as they wanted is no better than a poorly written exam.

udev4096 10 hours ago

I wonder if we could get something like that for k8s, docker and other container ecosystem

teddyh 14 hours ago

[flagged]

thatxliner 14 hours ago

well advent of code also needs an account
- npinsker 12 hours ago
  
  It’s not necessary to see the problems though
  - unsnap_biceps 12 hours ago
    
    It's not clear that you will need an account to see the problems. Logged in with my account and it's exactly the same page. It's not Dec 1st everywhere yet, so they might open up for everyone when they do open them up.
- stonecharioteer 13 hours ago
  
  This also has a paid account and a business account.
  - thaumasiotes 2 hours ago
    
    And if you have a paid account, you get extra time to complete the challenge!
    Somehow, SadServers seems to have entirely missed the concept of a "puzzle".
fduran 11 hours ago

Checking out how the platform works was two clicks away: home -> give me a server.
I don't know of any other SaaS which gives you a VM with one click without any registration but we do it.
In any case thanks for the feedback, I've put a button on this /advent page for clarity, cheers
fragmede 14 hours ago

how do you want it to work? do you even sysadmin?
- jbmsf 13 hours ago
  
  I see: a page offering something interesting but vague.
  If you tell me more, I might sign up. If I have to create an account first, I'm walking away.
- teddyh 13 hours ago
  
  > how do you want it to work?
  I would like to see and try to solve the scenarios for myself, not to get meaningless internet points. If you look at their front page, you can do that right now. So why do I have to create an account to even see these special advent scenarios?
  > do you even sysadmin?
  Yes.
- mekoka 14 hours ago
  
  I think the point is "ok, account is free, then what?"
  At 5$/m I might give the paid subscription a try.

NooneAtAll3 11 hours ago

what's the deal with 12-days advent calendars lately?

nstart 8 hours ago

Time pressures during christmas/holidays mean that the original calendars were becoming too stressful to handle. Seen several calendars switching to 12 consecutive days or 1 every 2 days challenges.
fyltr 3 hours ago

Well, 12+12=24, so now we can complete two advents
aljaz823 10 hours ago

Advent of Code went from 25 days to 12 days starting this year, as it took too much time each year to come up with 25 unique challenges [1].
[1] https://adventofcode.com/2025/about#faq_num_days
swyx 10 hours ago

aren't they canonically 12? 12 days of christmas etc
- dragonwriter 10 hours ago
  
  No, Advent is the liturgical season preceding Christmas, beginning the fourth Sunday before Christmas (which is also the Sunday nearest November 30); it is a period of at least three weeks and one day (the shortest period that can start on a Sunday and include four Sundays).
  The 12 days of Christmas start on Christmas and end on January 5, the eve of the Feast of Epiphany.
  12-day advent calendars are a fairly recent invention that mirrors the 12-days of Christmas, but has no direct correspondence to anything in any traditional Christian religious calendar (the more common 24-day format is also a modern, but less recent, invention detached from the religious calendar, that simplifies by ignoring the floating start date of advent and always starting on Dec. 1.)
- d5ve 10 hours ago
  
  Don't the 12 Days of Christmas start on the 25th though?
  - thaumasiotes 3 hours ago
    
    Yes, Christmas is the first of the twelve days of Christmas.
    Advent begins on the fourth Sunday before Christmas, which was Nov 30 this year. It ends on Dec 24. Therefore it is technically anywhere from 22 to 28 days long.
    Advent calendars begin on Dec 1 and end on Dec 25.
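    The date arithmetic can be checked with a short Python sketch. The function names are made up for illustration, and it assumes the rule as stated here: Advent runs from the fourth Sunday before Christmas through Dec 24.

```python
from datetime import date, timedelta


def advent_start(year: int) -> date:
    """Fourth Sunday strictly before Dec 25 of the given year."""
    christmas = date(year, 12, 25)
    # weekday(): Monday=0 ... Sunday=6. If Christmas falls on a Sunday,
    # the last Sunday before it is a full week earlier.
    days_back = (christmas.weekday() + 1) % 7 or 7
    last_sunday_before = christmas - timedelta(days=days_back)
    return last_sunday_before - timedelta(weeks=3)


def advent_length_days(year: int) -> int:
    """Days from the start of Advent through Dec 24, inclusive."""
    return (date(year, 12, 24) - advent_start(year)).days + 1


print(advent_start(2025))        # 2025-11-30
print(advent_length_days(2025))  # 25
```

    The shortest case (22 days) is a year when Christmas falls on a Monday, and the longest (28 days) is when it falls on a Sunday.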
- c0wb0yc0d3r 10 hours ago
  
  Advent calendars track time until Christmas. “12 days of Christmas” are the twelve days after Christmas.

tonyhart7 10 hours ago

now we need advent of arts, math, etc.

dubya 38 minutes ago

For math, the AMC 10 and AMC 12 tests have 25 questions each, some of them quite challenging. Both are high school level math, no calculus. Search "2025 amc 10" for this year's problems and solutions.

rvz 13 hours ago

[flagged]

gryfft 13 hours ago

Don't drag me into this.
- ctxc 12 hours ago
  
  Do you have notifications set up or something? xD
  - gryfft 12 hours ago
    
    No, I just occasionally suffer a failure of self-control when I see my almost-namesake in a comment.
mekoka 13 hours ago

Could you elaborate?