Mongo migration

For the past few months I’ve been at a terrific job, doing devops at a small SaaS company. Real quick, SaaS means “Software as a Service” & refers to companies that have a webapp that they sell access to and/or set up a version of for their customers. There are a lot of challenges with doing devops for a company like this, striking the balance between the heavyweight solutions and the latest and greatest to find what’s right for us, all the while (personally speaking) doing a LOT of learning on the topic. That’s not to say that heavyweight and the latest & greatest are opposed; there are a few more weights on that spinning disk, not the least of which is “what we were doing before was …”.

So what I’ve been working on for the last few weeks, somewhere between the old solution & the new hotness, has been a Mongo problem. We deal in data that must be scrubbed before we analyze it. The way that works is: the host captures data, scrubs ALL of it there, sends it on to our long-term storage database, and then all local data on that host is removed after a couple of days. What we’ll do with all of this in five, ten years will hopefully be the subject of another post, but for now we are only dealing with about 30GB of data in the long-term storage DB, collected over the last couple of years. Let’s call that “Storeo,” and the hosts it comes from “partner databases,” which is true enough.

We’ve developed a couple of schemas for Storeo, and we only upgrade our partners from one to the next with code releases, so we have a couple of old versions of Storeo kicking around. The next piece of this story is that we have an analytics dashboard set up for each partner, which pulls from Storeo based on a domain field in the data we get from each partner. There’s one dashboard for each version of Storeo that they (and we) have to refer to, which means multiple dashboards just to get all the info! So that’s foolish, yeah? As a result, a previous engineer wrote a Mongo migration script to migrate all data from version 1 to 2, and then from version 2 to 3, the current version. So there are two steps to this – first, migrate all the legacy data up to the current version so everything can be analyzed in the same way, and second, do this regularly, so that even if partners are using older versions, we roll that data up and there is ONE source of truth for all their data.

As happens occasionally, no one can quite remember how I got this project, but it’s been a ride. Mostly good, occasionally “how the hell does Mongo even work?”. Some of the problems I’ve gone through have been of a Mongo nature, some of them of a sysadmin nature, some of them just basic DBA. Many of these steps might make you scream, but I’m cataloguing them because I want to try to get down what all I’ve done and learned. When you are self-taught, your education comes in fits and starts, in no particular (and sometimes infuriatingly out-of) order. So I’m going to do my best to show you all the things I did wrong, too.

Problem 1 – Where to Test

I wanted to test the migration locally, not on the production Storeo server, which continues to receive data from all our partner databases. First, I fired up the mongodump docs and tried that. Well, I nearly immediately ran out of room, and deleted that dump/ directory and its contents. When I looked around with df -h /, a command which shows you disk usage for the root filesystem, human-readable, the output showed only a couple gigs left. Well, I knew that dumping a 15GB database wasn’t going to work locally. So I investigated a lot of other options, like sending the mongodump to another server (technically possible), or SSHing into the server and streaming all the dumped data to my local machine, which has plenty of space on it. This probably took a couple days of investigation between other tasks.
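For the record, the stream-it-over-SSH idea has roughly this shape – a sketch of what I was going for rather than what actually happened, assuming a reasonably recent mongodump (3.2+) for the write-archive-to-stdout trick; the user and host names are placeholders:

ssh someuser@storeo-prod "mongodump --archive --db=storeo1" > storeo1.archive   # dump streams over SSH straight onto my disk
mongorestore --archive=storeo1.archive                                          # restore locally from that archive file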

None of this really panned out (but I still think it should have), and my boss let me know that there’s a 300GB volume attached to Storeo, and I said, wait, but I didn’t see that, I looked for something like that, and they gently let me know not to give df any arguments if you want to see all the disks mounted on a server. With that, a df -h showed me the 300GB volume, mounted on /var/lib! Excellent. On a practical note, it’s extremely sensible to have all the data for your application stored on a volume rather than on some enormously provisioned server. When you use AWS, one volume is much the same as the next, so putting databases on their own volumes is pretty sensible. Keep your basic server’s disk very bare bones, and put the more complex stuff on modular disks that you can move around if you need to.
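In other words:

df -h /    # only the root filesystem – the nearly-full disk I’d been staring at
df -h      # every mounted filesystem, including the 300GB volume on /var/lib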

So with that!! I made a directory for myself there to separate from the production stuff, confirmed that mongodump/mongorestore do NOT interrupt read/write operations, and made mongodumps of versions 1, 2 and 3. This took… maybe an hour. Then, because they were still quite large (Mongo is very jealous of disk space), I tarballed & gzipped them to reduce them down to half a gig or so. We use magic-wormhole all the time at work (available with a quick pip install magic-wormhole [assuming you have Python and pip installed {but it doesn’t have to be just a Python thing, just like I use ag and that’s a super Perl-y tool}]) so I sent these tarballs to my local machine, untarred/ungzipped them, and mongorestored them into the local copies of Storeo versions 1, 2, & 3 that I use to run our app on my own machine. This probably, with carefulness and lots of reading, took another couple hours. At this point we’re probably a week in.
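The transfer itself was roughly this shape, for one version – the directory and file names here are made up:

tar -czf storeo1-dump.tar.gz storeo1-dump/      # on the server: tarball & gzip one version’s dump dir
wormhole send storeo1-dump.tar.gz               # prints a one-time code to give the receiving side
wormhole receive                                # on my laptop: paste that code when prompted
tar -xzf storeo1-dump.tar.gz
mongorestore -d storeo1 storeo1-dump/storeo1    # restore into the local copy of that version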

Problem 2 – How to Test

At this point, I finally started testing the migration itself, since everything was a safe copy and totally destructible. I also retained the tarballs in case I ended up wanting to drop the database or fiddle with it in some unrecoverable way. I took a count of the documents being migrated, and of the space taken up by each DB (which was different from prod – I thought until this week that those sizes should stay constant through prod-mongodump-tarball-mongorestore, but that’s not true – apparently most databases are wiggly with their sizing). The migration script is a javascript script (how do you even say that) that you feed into mongo like so: mongo migration1-to-2.js, within which you define dbSource and dbTarget. The source, in this case, is version 1 of Storeo, and the target is version 2. Each of these is a distinct database managed by Mongo. With great trepidation, I did iiiit. Ok, I’ve left a piece out. I, um, didn’t know how to run JS. Googling said “oh just give the path to the browser!” so I did and, uh – that didn’t work. You may be saying “Duh.” Look, I’ve never done any front-end at all, and have never touched javascript outside that Codecademy series I did on here a couple years back. With my tail between my legs I asked my boss again, & was told about the above, just mongo filename.js.
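Side note: taking those document counts is a one-liner per collection from the regular shell – the collection names here are the ones from the migration script further down:

mongo storeo1 --quiet --eval 'db.stats().dataSize'        # rough size of the source DB’s data
mongo storeo1 --quiet --eval 'db.collection_2.count()'    # document count to compare once the migration is done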

The script took three hours!! Gah! So I ran the next one, which took SEVEN (since it contained everything from the first one, too), plus regular attention to the SSH session so I didn’t lose the process (don’t worry, linux-loving friends, I’ll get there, just keep reading). These two migrations took two business days. At this point, we started talking to the team who manages the data analysis dashboards for our partners about some of the complexities. Because a) this isn’t a tool from Mongo, so there are no public docs on it, and b) you can only test Storeo behavior after the data has been scrubbed and sent, even locally, we decided to set up a few demo servers to point at test versions of the database.

Remember the volume attached to Storeo on production? Whoo! I logged onto Storeo and learned a ton more about mongodump & mongorestore, and made teststoreo1, teststoreo2, and teststoreo3, exact mongodump/restore copies of versions 1, 2 & 3 of Storeo. Their sizes, again, were different, but we’ve learned that that’s ok! Mongo has a lot of guarantees; space management isn’t one of them, so pack extra disk and we’ll be fine. This took a lot of googling and careful testing, because the last thing I wanted to do was mongorestore back into the place I’d mongodumped from – at the time I wasn’t sure whether mongorestore overwrites the target entirely, and I wanted to be cautious about potentially losing data. So: make the directory, mongodump into it while specifying the database, then restore into a new database (with the same name as the directory you’ve just made – this isn’t mandatory but made it easier to trace) while feeding it the path where the mongodump lives.

mkdir teststoreo1 # make the directory
mongodump -d storeo1 -o teststoreo1/ # dump the database with the name storeo1 into the dir we just made 
... # this takes some time, depending of course on the size
mongorestore -d teststoreo1 teststoreo1/storeo1 # there could be a dump/ in front of this end path

So after doing this for the other two Storeo databases as well, a show dbs command in the Mongo shell outputs all three production Storeos, as well as all three test Storeos. This meant we were in a good place to do some final testing. There were a few more meetings assessing risk and the complexity of all the pieces of our infrastructure that touch Storeo, how you do. Because the function of Storeo is to continually take in stripped data, I had to ensure that we weren’t going to lose information being sent during the migration. Because it’s not an officially supported tool but something we wrote in-house – and I hadn’t been able to find an existing tool that moves data from one Mongo DB to another – it’s hard to know what will and won’t impact production. So I set up one of our demo servers to send its stripped data to teststoreo1, and then kicked off the migration from teststoreo1 to teststoreo2 to make sure there was no data loss. On that demo server, while the migration was migratin’, I made a bunch of new dummy data that I’d be able to trace back to this demo server. A few hours later, when the 1-to-2 migration was complete, sure enough there were a handful of documents in teststoreo1 that were new – they’d been held & NOT sent! With this, I was very happy with the migration script.
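That spot-check can be a quick one-liner too – the domain value here is made up, but the idea is to count the dummy documents created mid-migration and confirm they’re still sitting in the source DB:

mongo teststoreo1 --quiet --eval 'db.collection_2.find({domain: "demo-server-1"}).count()'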

So I kicked off the following script with mongo migrate1-2.js, suspended the process with ctrl-z, and put it in the background (after identifying it as job 1) with bg %1, so it wouldn’t be interrupted by my leaving the session (see?).

'use strict';

var dbSource = connect("localhost/storeo1");
var dbTarget = connect("localhost/storeo2");

// The migration process could take so long that new documents may be created
// while the script is still running. We will move only the ones created
// before the start of the process
var now = new ISODate();

dbSource.collection_1.find().forEach(function(elem){
    elem.schemaVersion = 2; // this means each element is given the NEW schema version
    dbTarget.collection_1.insert(elem); // copy it into the target DB
});

dbSource.collection_2.find({createTime: {$lt: now}}).forEach(function(elem){
    elem.schemaVersion = 2;
    dbTarget.collection_2.insert(elem);
});

dbSource.collection_3.find({timestamp: {$lt: now}}).forEach(function(elem){
    elem.schemaVersion = 2;
    dbTarget.collection_3.insert(elem);
});

dbSource.collection_1.remove({}); // this collection did not have a timestamp
dbSource.collection_2.remove({createTime: {$lt: now}});
dbSource.collection_3.remove({timestamp: {$lt: now}});

The second script was the same except that dbSource and dbTarget were defined as storeo2 and storeo3, respectively. As with the testing, the first one took about three hours, the second, seven. With each one, I kicked it off, put it in the background, then checked on it… later. Because it’d been backgrounded (that’s a verb, sure), it wasn’t quiiiite possible to tell when it was done. That could be fixed with some kind of output at the end of the script, but that’s not how I did it!
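For the linux-loving friends, the backgrounding dance looks roughly like this; the disown step is an optional belt-and-suspenders extra that keeps the job alive even if the SSH session drops:

mongo migrate1-2.js     # kick it off in the foreground
# press ctrl-z          # suspend it (it shows up as job [1])
bg %1                   # resume job 1 in the background
disown %1               # optional: detach it from the shell so a hangup won’t kill it
jobs                    # later: check whether it’s still running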

Then I set up a lil cron job there at the end to regularly move data from 1 to 2, and once that had run for the first time, I set up the second cron job to move it from 2 to 3.
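Something like these two crontab entries, in spirit – the times, paths, and log files are all made up, with the second job offset a few hours so it runs after the first has (usually) finished:

# crontab -e on the Storeo host
0 1 * * * /usr/bin/mongo /opt/storeo/migrate1-2.js >> /var/log/storeo/migrate1-2.log 2>&1
0 5 * * * /usr/bin/mongo /opt/storeo/migrate2-3.js >> /var/log/storeo/migrate2-3.log 2>&1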

Who wants to talk about Mongo????????

Exploring Dockerfiles

I’d like to continue the previous entry on Docker a little further. Last time we talked about the installation process & a little more, so this time we’re going to talk about the next part of getting started with Docker – writing a Dockerfile.

Here’s what we talked about last time, and with one odd little exception (why did I promise to talk about load testing…) we’re going to cover all these things!

So, next steps, make the container persistent – it isn’t yet, and play around with Dockerfiles, and just do a little more spying on the produced container itself & probably try to do some babby’s frist load testing things in there & spy on the container as a process without the box & all its processes within!

First let’s take a look at the Docker container we created last time. Just like at your native command line, many docker commands resemble low-level Linux commands, so just like you’d use ps to look at the processes running at any given time on your machine, you can use docker ps to see all the containers Docker is managing at any given time. If you followed along last time you’ll see some that have been exited but which you don’t have access to – each time you run a docker run -it … bash you get a new container. But the old ones are still there! The -a (all) flag will show us these Exited boxes with docker ps -a.

rachel $ docker ps -a
CONTAINER ID        IMAGE                      COMMAND                  CREATED             STATUS                     PORTS               NAMES
b5de9583d7b3        fedora                     "bash"                   10 minutes ago      Exited (0) 3 seconds ago                       pedantic_morse
35192bfa05d4        images/cowsay-dockerfile   "/usr/games/cowsay *P"   2 hours ago         Exited (0) 2 hours ago                         gigantic_goldberg
a0e40d55125a        images/cowsayimage         "/usr/games/cowsay 'D"   3 hours ago         Exited (0) 3 hours ago                         jovial_mcnulty
d32381833772        debian                     "bash"                   3 hours ago         Exited (0) 3 hours ago                         cowsay

You’ll notice a few things, first that the names are a mix of adjective_noun, except one – the cowsay container example is from the excellent Using Docker, where I’ve gained a lot of my recent Docker information. Their status is all Exited. Some of the container-specific commands are similar to the init.d service commands, like start and stop, plus rm, so let’s start the desired container in that list up there. The container we’re going to start up is the Fedora one, similar to the one we made before – though it is true that I only made it ~10 minutes ago!

docker start pedantic_morse

So now the output of docker ps includes the container we just started. So how do we keep it? We commit it, just like with Git! Replace pedantic_morse with whatever name yours has been assigned beneath the NAMES column.

rachel $ docker commit pedantic_morse images/morse

So what we’ve done here is create an image from which we can create containers. images/morse is the image, pedantic_morse is the Docker container we crafted it from. Every time we run the image images/morse, it creates a new container, so at this point our changes still aren’t persisted in ONE long-running container, HOWEVER we can use this image to perform one-offs.
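A one-off from that image looks just like the runs from last time, only pointed at our freshly committed image:

sudo docker run -it images/morse bash    # a brand new container, created from the committed image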

Clearly we’re not getting into the strength of Docker, yet. So now it’s time for a very basic Dockerfile. Just like Vagrantfile and Procfile & probably a few other similarly intended setup files, the D in Dockerfile is capitalized and there’s no extension to it, because remember – Linux doesn’t care about file extensions!

The main piece to know with Dockerfiles is that their syntax can be as minimal as you like, and personally I recommend making them non-complex – major structural pieces only, and insert kickoff scripts or use some config management in the container itself for anything much more complicated. I reserve the right to change my mind on this later! That’s also more of a topic for next time. But the way it looks, the RUN command will run any shell command you put in it, but if you need anything more complex, the contents become a lot more murky, in my opinion. Simple is better than complex, but complex is better than complicated, so let’s do what we need to here.

For posterity and a simplistic example, here’s the first Dockerfile I ever wrote. (ed note: I trimmed this down because each line of a Dockerfile creates a new layer – try to consolidate Dockerfile lines as much as possible)

FROM fedora:23
RUN /bin/bash
RUN echo "the dockerfile took!"

RUN dnf install -y wget tar man


The output of this, which is a bit long to post, pulls down version 23 of Fedora, runs bash (which just starts and exits in its own layer – it doesn’t actually change the shell used for the later lines), prints “the dockerfile took!” to stdout, and then installs those three packages. I’m unsure why some of those aren’t present in a base Fedora image, but it doesn’t appear to be related to what I’m working on in this blog post, so we’ll leave it be for now.
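To actually build and poke at it (the tag name here is arbitrary – pick whatever you like):

sudo docker build -t images/fedora-first .     # run from the directory containing the Dockerfile
sudo docker run -it images/fedora-first bash   # hop in and confirm wget, tar, and man are really there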

This is about ten times longer than I thought it would be, woohoo! I hope you learned something, please please let me know if I’ve missed the mark on anything, cheers!

Tune in next time and we’ll talk about a more complicated Dockerfile, and syncing it up to… something 🙂 come back and you’ll find out what!


I am looking for work! If you’ve been browsing my blog long, or not, you’ll know that I’m primarily a backend-focused Python developer with config management, virtualization, and documentarian bents. My peopling & coworking is LEGIT and I love mentorship and thinking about information transmission. I’m also interested in tech writing, if it’s for something good and chewy that I personally want to see more documentation on (read: everything complicated), and I’d consider a dip into devops/SRE too.

Some of my musts include the following:

  • An established team of at least several years. It seems like between 5-10 years is a very good sweet spot for the kind of growth I’m looking for.
  • You use the Agile development strategy, or something similarly modern. Sprints, clear work assignment and tracking, post-mortems.
  • You use safe, modern Git practices.
  • You have other women in the company.
  • You have onboarded people before.
  • You, as an organization, have made an attempt at writing internal documentation.
  • I am happy to work remotely, but I do not want to be your only remote worker.
  • I am happiER to work in Portland, and require the flexibility to work from home a few times per month, once onboarded.
  • A semi-dedicated resource of whom I can ask friendly questions for the first several months.

Some of my wants:

  • To not be the first woman engineer you hire. This has been very difficult to find.
  • To primarily use Python, with the flexibility to learn new languages.
  • To have the time granted to write great documentation along with the features and fixes I write for you.
  • To be part of a rich code-reviewing team, where everyone’s commits are reviewed, even architects’.

Leave a comment or email me at rkellyalso aat gmail and I’ll shoot you my resume. Let’s doo thiiis.

“Composting” Docker

Right now I’m fiddling with Docker and combing through their terrific, extensive docs. As I have a history of doing lately with this blog, I’ll talk about my own setup and the installation process, all the way to the Docker image I’ve created and messed with, and what’s next for my own knowledge.

New laptop

First, I recently purchased a machine from Free Geek so I’m going through the delightful process (really :D) of setting it up. When I volunteered at Free Geek (which I HIGHLY recommend, please ping me for details!!), the computer I received after building desktops for them for a while was 40G HDD, 512MB DDR2 RAM, and was capable of a wired-only connection to the internet. After a USB wifi antenna, the machine could do all the browsing I needed, which was all I did at the time. At the time, that kind of machine cost anywhere between $80-$130 if my memory serves me correctly. I believe this was about 2011.

Two days ago, I wanted a laptop that would be able to run VMs without too much trouble, and I wanted to spend $200 or less as I am unemployed. Five years after my volunteer stint, I got a machine with 8GB DDR3 RAM, a 250GB SATA HDD, and a dual-core i5 processor, in a pretty giant old Dell Latitude E63340, for exactly $200. AMAZING. I’m astounded at how cheap it is. And what else has changed? It’s no longer Ubuntu that they ship on the computers they clean, reassemble to spec, and sell or give away – it’s Linux Mint, specifically Edition 17.1. Thoughts so far are really just that it feels very Windows-y with its focus on the (not) Start menu, and you’ll get no complaints from me on this front. It’s Ubuntu with a coat of paint on top, and I’m not toooootally sure this is what I want to be running, but it definitely works and it probably has the broadest reach of packages available for the end-user (not enterprise, which is of course another red-colored ball of wax), so it will do for now, and at some point I am sure I’ll explore other distros.


On to Docker! I’ve previously read up on Docker, and find its resource allocation methodology super interesting. I’ve used a lot of Vagrant/VMWare on my OSX box at my previous job, and while the mechanism to spin up/destroy boxes was pretty good, the amount of cruft left behind became frustrating at a work level – I had to clear out vast swaths of space with a machete on an irritatingly regular basis! Basically, Docker runs without a hypervisor – which, to keep things in layperson’s terms, is the fairly heavyweight piece of software that manages virtual machines. That they’ve figured out a way around the heaviness of a hypervisor is a Big Deal. And as I understand it, it uses the resources of the host machine very intelligently. With the previous VM paradigm, you gave both disk and memory permanently (insomuch as a VM that you create & tear down relatively easily is permanent) to the machine, and the VM held onto that inaccessibly while in use. No such selfish tactics are employed by Docker, and use of memory in particular is supposedly much more elastic.

Installation was just a few shell commands which worked with no fuss; I used this official site. The instructions are for Ubuntu 14.04, which Mint 17 is based on, and as I’ve covered above, all Ubuntu packages can be used on Linux Mint with little to no adaptation required.

sudo apt-get update
sudo apt-get install apt-transport-https ca-certificates
sudo apt-get install linux-image-extra-$(uname -r)
sudo apt-key adv --keyserver hkp:// --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo 'deb ubuntu-trusty main' | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt-get update
sudo apt-get install docker-engine

Then, the Docker engine can be started! First start the service (this may be different if your linux distro uses systemctl, like a newer version of Fedora or RHEL/CentOS) & then make sure their dummy example works.

sudo service docker start
sudo docker run hello-world

Ok! Now we’ve got Docker installed. If you encounter anything funny here during installation I’d love to hear about it, please leave a comment! I can try to help.

There are two primary ways for the new end-user to continue forward. One of them is writing a Dockerfile. I leave that for its own blog post, but hold me to it! I want to write a few basic Dockerfiles for my own reference & I imagine they’ll be of some use to others! The second way I’ll mention here is via DockerHub, which you don’t even directly need to interact with to use! To me it’s reminiscent of pulling images down from VagrantCloud, but again, you don’t even have to go there to grab things. You can feed a few different common distros into the docker run command and it will pop you right into a container of that OS! It’s still rather magical at this point, so I’m still learning more about it so it becomes a bit less magical 😉

Getting into the box I wanted was as simple as plugging in the distro I wanted, Fedora, into the docker run command, like so:

sudo docker run -it fedora /bin/bash

What this led to for me, which I need to learn SO much more about, is an EXTREMELY spare version of Fedora that has very few executables I’m used to. Because it’s the latest version of Fedora, its package manager is dnf & not yum, but it still knows what you mean & permits installation “via” yum – but really it’s just aliased to /usr/bin/dnf, haha, which is fine.

Quick fun fact: dnf is an abbreviation of “Dandified Yellowdog Updater, Modified” – you can see the yum in there as the commonly used RHEL and RHEL-flavored linuxes’ package manager. To me, whenever I type dnf install -- it looks as though I’m typing “do not f&$%ing install” 😀

So, next steps, make the container persistent – it isn’t yet, and play around with Dockerfiles, and just do a little more spying on the produced container itself & probably try to do some babby’s frist load testing things in there & spy on the container as a process without the box & all its processes within!

See you next time! Would love to hear from you. If I’ve missed the mark on anything with this pretty chewy piece of technology please let me know, or if there’s anything you’d like to see me cover leave a comment!

btw the title of this is a joke which was made during the Docker-fiddling open space I held at Open Source Bridge 2016 🙂 if you were there, thanks for coming! Super fun discussion.

Postgres on Fedora

Note: This is a post from several months ago in the ol’ drafts bin and there’s a ton of information here, even though it is incomplete. I’m not running Fedora any more, but it’s possible this could help someone else, so I hit publish.

This shouldn’t be too long a post, but I’ve encountered something that does not really feel like it ought to be an edge case!

In trying to install PostgreSQL on Fedora 23 I ran into a few snags per the Other Linux Installation book published here using the download guide for RH-flavored Linuxes here.

Sidebar: In the Installation section of the introduction to Postgres (I know PG pretty well in the context of Puppet Enterprise, but I really want to expand that knowledge since I know a lot of people use & love it [/diatribe on why I’m doing this]) it says the following:

If you are installing PostgreSQL yourself, then refer to Chapter 15 for instructions on installation, and return to this guide when the installation is complete. Be sure to follow closely the section about setting up the appropriate environment variables.

That’s in section 1.1, on the Installation page, by the way. This might be why we can’t have nice things.

MOVING ALONG, the way which has worked for me to install Postgres is the following, thanks in much part to a) This Fedora Project doc and b) working knowledge of the su command.

$ sudo dnf install postgresql-server postgresql-contrib
 ... "are you sure you want to install y/N" ...

$ sudo systemctl enable postgresql
Created symlink from /etc/systemd/system/ to /usr/lib/systemd/system/postgresql.service.

$ sudo postgresql-setup initdb
WARNING: using obsoleted argument syntax, try --help
WARNING: arguments transformed to: postgresql-setup --initdb --unit postgresql
 * Initializing database in '/var/lib/pgsql/data'
 * Initialized, logs are in /var/lib/pgsql/initdb_postgresql.log

$ sudo systemctl start postgresql

So, only after making the dummy initdb did the systemctl start command go through, but I noticed that there was still no pgsql executable, so I couldn’t actually use postgres yet. Finally, I catted the postgres log like so:

$ sudo cat /var/lib/pgsql/initdb_postgresql.log
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

Success. You can now start the database server using:

    /usr/bin/postgres -D /var/lib/pgsql/data
    /usr/bin/pg_ctl -D /var/lib/pgsql/data -l logfile start

which, up at the top, points out that this needs to be run as the postgres user, and, after some fiddling, that the -D flag needs to come before the /var/lib/pgsql/data path as the data-directory designation. Also, seeing su crap itself a number of times was irritating. To switch to the postgres user, it was necessary to provide a password! I hadn’t set one, so I tried a couple of easy guesses & couldn’t figure it out, and a (very quick) DDG didn’t yield anything either, so I snuck around it with sudo -u postgres psql, which only asked me for my sudo password – A-OK! But Then! It complained that it didn’t have permissions to get into ~, but reasonably got me into the postgres user’s prompt. HERE, FINALLY, I was able to run

postgres=# /usr/bin/postgres -D /var/lib/pgsql/data

Though… now I’m noticing that anything I type in there doesn’t even throw an error. It even offers ‘Type “help” for help,’ and yet when I do, with and without quotes, there is no output & no result. And I still don’t have a pgsql executable.

Ok, but there IS a result from which postgres: the nicely predictable /usr/bin/postgres. One success – I have a universally executable postgres! So when I run it with no arguments, hoping for more information from my sleuthing, I get some!

 $ postgres
postgres does not know where to find the server configuration file.
You must specify the --config-file or -D invocation option or set the PGDATA environment variable.

So that’s where the -D is necessary. Cool. But we also definitely don’t have the PGDATA env var set, so:

postgres -D /var/lib/pgsql/data
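For reference, a minimal sanity check using only the pieces that already behaved above – the systemd unit plus sudo -u postgres – with an arbitrary query tacked on the end:

sudo systemctl start postgresql                  # start the server the systemd way
sudo -u postgres psql -c 'SELECT version();'     # connect as the postgres OS user and run any old query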

Wow! There’s more to do here, and I don’t have time right now, so have SOME information, yet incomplete!

Setting up my Fedora workstation

Please note that this post contains content not suitable for those who give no craps about the way a Linux box can be set up for an end-user. I do not blame you, those who give no craps.

So, leaving Puppet after 18 months (on great terms! hi friends!) I find myself in need of my own real development machine. I begrudgingly find myself admiring macs after all, but after two minutes of looking at craigslist and finding $700 MBPs from five years ago, I concluded that that Just Isn’t Going To Happen, much as I love iTerm2. I’ll buy myself a new computer soon, but in the meantime a friend of mine had an Ubuntu 12.04 ThinkPad X220 she said I could borrow for a bit, and oh my god, I am in LOVE. This machine is great and zippy and POWERFUL. I might… I might just buy a clone when I’ve got the bucks, even though you can pretty much only get them used at this point.

Rather than jump back into Ubuntu which is pretty familiar ground, I wanted something slightly different and my friend Amy has been extolling Fedora’s virtues for years. Further, at Puppet, we virtualized nearly all our testbeds in CentOS using the amazing, moooostly internal (but totally available!) Puppet Debug Kit created & maintained by my brilliant former coworker who is still doing phenomenal work over there. Ok, so I will definitely miss my buds there!

So because I spent about half my time on the job in CentOS & Fedora is the closest end-user version of that with a UI (sorry I’m not hardcore enough to only run a server for my dev box haha!), I grabbed the instructions & made a Fedora-specific bootable usb drive with their (prev linked) docs. After formatting the drive, writing the .iso to it, and plugging it in, I had to fiddle with the BIOS, which on the x220 was incredibly easy – first, on bootup it tells you EXACTLY how to get into BIOS, and it gives you the option to do a one-time boot via USB, rather than having to muck around with boot order! Fabulous!! Then with a bit of wiggling (had to get into a babby command line rq to tell it to choose the Linux0 option which kicked off the install, please, friends, do not ask me why) the installation went off without a hitch, with LITERALLY NONE HITCHES.

It was after rebooting that I started to learn how powerful this little machine really is. It’s fast, despite having 1/4 the memory of my old work MBP (though I really don’t know how that scales), and the trackpad uses all the gestures I’m used to from working on macs.

Then I set up my prompt, and without wanting to get toooo too deep into the oogly bits of bash formatting, I had to try and test and try and test, and finally went from:

export PS1='\[\e[0;36m\]rk\[\e[m\] \[\e[1;37m\@ \w \[\e[m\] \n $ '

which threw a non-ASCII character error, and when I fiddled, lost the ability to shut off the bold white text, haha, to:

export PS1='\[\e[0;36m\]rk\[\e[m\] \[\e[1;37m\@ \w \e[m\] \n $ '

Huh. That’s only one [ different. Just bless ya, monospace blog draft.
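For reference, here’s a variant with every color escape fully wrapped in \[ \] (the wrapping is what keeps bash from miscounting the prompt’s width) – same cyan user, bold white time & working directory, newline, prompt character:

export PS1='\[\e[0;36m\]rk\[\e[m\] \[\e[1;37m\]\@ \w\[\e[m\] \n $ '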

Anyway, then I got ambitious. I wanted to see if I could run Spotify outside a webapp, because a webapp makes it IMMEDIATELY less likely to be used, and I rely pretty heavily on it, during and outside the workday. Using this set of instructions, which states a requirement of RPM Fusion (installable here), I got going. These are for Fedora 20 & I’m on 23, but I knew I could get it going. I was so excited for this – I LOVE a new Linux system’s first sudo yum (or whatever) update – so I ran that & a few minutes later tried to get RPM Fusion itself installed with the following command:

su -c 'yum localinstall --nogpgcheck $(rpm -E %fedora).noarch.rpm $(rpm -E %fedora).noarch.rpm'

But it griped at me about there being no localinstall user – it was griping because we’d told it to perform a command with a specified user with the su command, but it had received no user. Usually this should result in its just using root, so it’s close to the same as just using sudo in front of important things you run in the terminal, but my bash version 4.3.42 was having none of it. So I peeled out the su -c (the -c just means you’re passing it a command to execute immediately, then return to the normal user after execution, rather than switching wholly into the specified user). The issue I ran into thereafter was still localinstall, which my machine still couldn’t find. I made a few attempts at installing localinstall (so meta) but it escaped me. I found this Stack Overflow-ish post asking about basically the same difficulty I was having, and more or less someone says that yum install and yum localinstall accomplish the same thing and the only reason the other still exists is for backwards compatibility. So I changed localinstall to install, removed the su -c ' ', added a sudo since the yum install would want it, and BAM, RPM Fusion on Fedora 23!

sudo yum install $(rpm -E %fedora).noarch.rpm $(rpm -E %fedora).noarch.rpm

Then, all there was to do was run the lil commands to actually get Spotify since RPM Fusion’s installed! The “dnf” of the Fedora package manager command cracks me up – sounds a lot like “do not f****ng” before “install blahblahpackage”, and I refuse to look up what it means because I laugh every time.

dnf config-manager --add-repo=
dnf install spotify-client

And that’s all it actually took, which, haha, looking back at what I’ve written, I guess is slightly more complicated than “that’s all it took” might warrant.

Next I need to find a terminal I’m happier with! I seriously miss iTerm2 so if you have any Fedora-flavored terminal loves let me know in the comments. I need tabs, man. I need ’em.

EOY 2014

Whew. I thought 2013 was fast. 2014 was bananas. I think I’m a bit rusty from not writing as much as I ought to, so let’s just jump right in!

Grad School?

In February, I received a letter of acceptance from my absolute dream graduate school to teach high school mathematics. In what has become a theme, it was something I’d worked incredibly hard for over a long period of time to achieve, and I actually made it. I knew what I would do before I even got the letter, but that didn’t actually make it any easier – I turned them down, not only because it would have increased my debt by 130% from what is already a serious amount of money, and not only because I never would have made anywhere near enough to have repaid this ~$100k of loans, but for a host of other reasons as well. With a heavy heart, I emailed my amazing advisor and advocate at the school, a nearly two year relationship, to let her know I wouldn’t be enrolling.

So I threw myself into computer science study. With PyLadies, and my last quarter of school to complete my French degree and Math minor, I busted the proverbial it to get a job, which I did, literally the day I was done with my final FINAL exams, hopefully ever : )


The job started as an internship with lots of hands-on pair coding with my boss, “how would you solve this problem,” “let’s refactor together,” along with some “hey would you reach out to this person to sponsor (x),” which was great! I got to use a way nicer computer (macbook O2) than I had (a lovely old giant brick of a PC laptop on which I installed Ubuntu 12.04), and I finally started using git at the command line. It became pretty quickly evident that we worked together fabulously, and we started to think about what kinds of projects we could do together, so we started working on new ideas, largely centered on education. It was, actually, an incredible collaboration.

We thought up topics and people for Security in Python, Data Science in Python, Twisted, Django From the Actual Beginning, and a few more that I’m sure I’m missing – it was rather a fire-hose of ideas! My boss was the kind of person who had six good ideas before breakfast, and it was a fast-paced, sometimes stressful, REALLY productive space for six months.

While there, I learned git to a granular degree and now lead a monthly workshop on it and plan to lead/teach the PyLadies annual course on it as well, and while I didn’t improve my Python chops much, I learned a lot about how computers are really working, under the hood – well, under a relative hood, I got into no hardware, lol, not at all. The os and sys modules, jeez! I’ve been saying lately that it’s those two modules that turn Python from an expensive calculator into something really powerful.

Codecademy played a little role, too, as you can see if you search the javascript tag on here – while I don’t code in JS, it’s a fairly ubiquitous language & I’m glad to have some familiarity with how it handles different kinds of problems.

I also, god-willing, learned a bit of project & people management the hard way. I don’t ever want to do that again, hooray! It’s good to know, especially considering that women are often directed from engineering career paths to soft-skill positions – now I know what to push back against.

Tutorial Creation

We settled into a Python tutorial with a local $TOPIC_IN_PYTHON expert, and worked really hard on outlining, scoping, creation, refinement, refactoring, presentation, program executability, git monsters, project jupyter/pip/virtualenv & dependency concerns, and so many other logistical issues.

After we flew to the place to film it, the company we’d signed the contract with offered a counter-contract, letting us off the hook, and decided not to publish it after all. While honestly heartbreaking, there was a serious amount that we all learned.


While scrambling to find a new gig after my internship ended, I stumbled on a number of really awesome opportunities, and though I only was offered one – obviously I stopped looking once I got an offer – I met and now keep in touch with many of the folks I interviewed with, because while it didn’t work out, these are all really neat people at really cool companies, so that has been really validating.

I was strictly unemployed for all of two weeks before I got an offer from Puppet, where I’ve been since mid October, and totally in love with my job. 2014! WOW.


This year, I will be learning the ins and outs of system administration, I’ll become Puppet certified, I’ll learn Ruby, and I’ll be dipping my toe into web app development as well, all while being part of the neatest leadership team of ladies ever with PyLadies PDX. I think I’d like to get Red Hat certified, or close to it, as I’ve come into Puppet without any sysadmin background, and that’s something I’d really, really like to be good at at the new spot. I continue to be amazed by how well I am treated in my new career and how much I can do for people, even while clawing at so much more – frustration at the not-knowing-enough is a constant underlying anxiety for me at the new job, and I suspect it will continue for some time, which in fact is a good thing. Studying math, and trying to stay aware of what’s Going On in tech, is a pretty good primer for the huge amounts of Not Knowing involved in working in tech, for probably at least the first few years.

WHOO! Onward and upward, or as they say, “Up and to the right!”