The NLP Job Hunt

Castelvecchio in Verona

Around a week after graduation, I sent off a small handful of applications to a few different companies in computational linguistics. I didn’t spend much time thinking about the whole thing, because shortly thereafter, I left to attend EMNLP (a big comp ling conference that was held in Brussels). After that I headed to Paris to meet some friends, then returned to Trentino to hike a bit more in the Brenta Dolomites, and then went to Berlin. Below I’ll describe my experience, and the advice I’ve gotten, for applying to both smaller companies and big companies, interspersed with images of my recent travels for fun.

Searching for jobs

First of all, it was pretty easy to find jobs that looked appealing or related to my studies in smaller companies. One nice source was nlppeople, who had the most relevant openings. Other sources like linguistlist and corporalist also seem useful, and then there are the typical postings on LinkedIn or Indeed that seem to target typical software engineers a little more. Another couple of places I found later on, but didn’t explore were remoteok.io and remoteml, so I wonder if those are actually useful (anyone have any experience with them?).

On the other hand, finding jobs for the big companies like Amazon, Google, Facebook, Microsoft, Apple, etc. entails going onto those guys’ websites and doing a search. The correct job opening tends to be called something like “Applied Scientist” or “Research Scientist” and has some description of the field you’d be working in or the project you’d be working on. It’s not always clear what exactly you’d be doing, and it’s easier to get an interview there if you have an acquaintance that can push your resume through to the right recruiters.

In any case, finding interesting jobs and actually getting an interesting job are different beasts.

Small company interviews

Interviews for normal companies (and start-ups) seem to consist of the following stages:

  1. introductory phone screen conversation
  2. technical interview
  3. coding project
  4. follow-up interview and/or final interview

My (limited) experience with these has been pretty positive. The introductory phone screen has typically covered the company’s work and business model, along with questions about your own background and cultural fit. The technical interview asks machine learning and computer science questions, with a skew towards the position you’d be working in. The coding project has typically focused on a task relevant to what the company is working on. The follow-up interview might ask a few more questions about your knowledge, to see how you stack up against other candidates. The final interview already covers logistics such as salary, start times, moving, and so on.

This interview process is not easy, but it also does seem very reasonable. The questions I saw were typically to the point, and not outside the bounds of what I should be expected to know about after completing my degree, and planning to move into industry. In terms of time frames, the small companies were pretty quick on getting back to me, usually taking only one or two weeks after receiving my resume to respond, and just a few days in between each step thereafter.

It’s possible I got lucky with the small companies I interviewed for, because I heard that other people had strange interviews, where the small companies were trying to replicate the interview process of the big companies, which I believe would be a mistake.

Big company interviews

Interviews for big companies (Amazon, Google, etc.) are very different. The best way to describe it is as a massive comp sci entrance exam. Everyone takes these entrance exams, and typically, after passing, you get further interviews with the specific group you would be working with. The process seems to consist of the following stages (though I admit that I myself did not complete the whole process, so I’m not sure about the end):

  1. phone screen with behavioural, basic comp sci, and basic machine learning questions
  2. phone technical interview
  3. on site all day technical interviews with whiteboard coding (and sometimes presentation of own work)
  4. follow-up interviews with teams of choice
  5. final interviews with logistics

I won’t sugar coat this. If you are taking the big company entrance exams, you need to have a computer science degree and remember a good chunk of what you learned, or you need to (re-)teach yourself computer science fundamentals. This is really shitty for those of us who are coming from a theoretical linguistics background and the LCT program, which does not cover these fundamentals (although I think they really should offer them to those who don’t have them). Below I’ve assembled all the advice I’ve received from various sources on what to study before making applications to the big companies. Some companies may ask for less of the computer science stuff, and more stuff related to your degree, but it’s better to over-prepare than under-prepare.

Hiking trail in the Brenta Dolomites

Behavioural questions

First of all, some of the companies ask you behavioural questions, like “Have you ever had a conflict with a coworker?” or “Have you ever failed to meet a deadline?” or “What are your weaknesses?” For me, I kind of handle these questions on the spot. I feel that the best way to deal with them is to say “Hmm, let me think about that…” and then start thinking about working conditions at your previous job/internship/whatever. Usually, something relevant pops to mind.

Some people might find it easier to research the most common behavioural questions, and take time to think of a scenario for the most common ones. There is also a formula that can be followed which leads to a succinct answer to these types of questions, called STAR. These methods might be the more principled way to attack behavioural questions.

In any case, I feel like these questions are sort of bullshit, and I find it easier to bullshit my way through them, because that also leads to a more natural way of talking about the problem for me. I also have a lot of prior work experience, so it’s not that hard for me to conjure up some scenarios. I don’t think I’ve ever flat out failed this section, but I’ve also never applied for leadership positions where this section is probably a lot more heavily weighted.

Brussels Town Hall

Topics to cover

For the computer science entrance exams at the big companies, you can use leetcode.com, topcoder.com and projecteuler.net to practice, and read the well-known book Cracking the Coding Interview as well (behavioural questions are in there too). In short, you will need to know:

  • algorithm complexity (big-O notation for runtime and memory)
  • sorting: n*log(n) complexity algorithms such as quicksort and merge sort
  • hashtables: how they work and how to implement one in code using only arrays
  • trees: how to construct and manipulate binary trees, n-ary trees, tries, red/black trees (and/or splay trees, and AVL trees); how to traverse trees using breadth-first search and depth-first search; the difference between inorder, postorder, and preorder
  • graphs: objects, pointers, matrix, and adjacency list representations of graphs; how to traverse them using breadth-first search and depth-first search; their complexity, tradeoffs, and implementation in code
  • other algorithms: Dijkstra and A*
  • NP-complete: what this means, and problems such as the traveling salesman, and the knapsack problem
  • combinatorics: n-choose-k
  • probability: Bayes’ rule, likelihood, prior, posterior
  • statistics: significance testing, distributions such as Gaussian and Poisson
  • concurrency: processes, scheduling, locks, mutexes, semaphores, monitors, deadlock and livelock and how to avoid them, parallelization on multi-core systems
  • object oriented system design: feature sets, interfaces, class hierarchies, constraints, simplicity and robustness, tradeoffs
  • development practices: validating designs, testing whiteboard code, preventing bugs, code maintainability and readability, refactor/review sample code
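
To give a feel for one of the items above (implementing a hashtable using only arrays), here is a minimal sketch in Python. It is just one way to do it, not the answer an interviewer necessarily expects: it assumes separate chaining for collisions, uses plain Python lists as the arrays, and the class name ArrayHashTable, the starting capacity of 8, and the resize threshold are all my own illustrative choices.

```python
class ArrayHashTable:
    """Hash table with separate chaining, stored in plain Python lists."""

    def __init__(self, capacity=8):
        # One list ("bucket") per slot; each bucket holds (key, value) pairs.
        self._buckets = [[] for _ in range(capacity)]
        self._size = 0

    def _index(self, key):
        # Map the key's hash onto a valid slot in the bucket array.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self._size += 1
        # Arbitrary threshold: grow once buckets average more than 2 items.
        if self._size > 2 * len(self._buckets):
            self._resize()

    def get(self, key, default=None):
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def _resize(self):
        # Double the number of buckets and re-insert every pair.
        old_buckets = self._buckets
        self._buckets = [[] for _ in range(2 * len(old_buckets))]
        for bucket in old_buckets:
            for key, value in bucket:
                self._buckets[self._index(key)].append((key, value))

    def __len__(self):
        return self._size
```

With a reasonable hash function, put and get take constant time on average, and linear time in the worst case when everything lands in one bucket. That tradeoff is the kind of thing you would be expected to explain out loud.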

In addition to computer science, you will need to know machine learning. If you only took one course on it during your LCT program, you will probably need to study some things that you missed, including:

  • supervised/unsupervised/semi-supervised learning
  • generative vs. discriminative models
  • clustering
  • classification
  • regression
  • overfitting/underfitting
  • cross-validation
  • regularization
  • bias-variance tradeoff
  • ROC curves
  • train vs. dev vs. test data
  • ML algorithms: naive Bayes, linear regression, logistic regression, decision trees, random forests, KNN, K-means, SVM, HMMs, Viterbi, GMMs
  • neural networks and their specific issues: feedforward DNNs, RNNs, LSTMs, vanishing/exploding gradient problem, attention, stochastic gradient descent, learning rate, mini-batches, etc.
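
As a small illustration of one item from this list, cross-validation, here is a rough sketch of how k-fold splits can be generated in plain Python. The function name k_fold_indices is made up for this example, and it assumes the data has already been shuffled; in practice a library such as scikit-learn would normally handle this for you.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, dev_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and the number of items")
    indices = list(range(n_items))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        dev = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, dev
        start += size
```

Each item lands in the held-out fold exactly once, so the score averaged over the k folds uses all of the data for evaluation without ever testing on something the model was trained on.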

You will want to be familiar with the issues in computational linguistics and your specific field, which will depend on what the company is doing and the job you are applying to. This part you might not have to study as much for, since it will depend on your interests and will probably be related to your studies. In any case, it could include topics such as:

  • language modeling, including smoothing
  • FSTs and regular expressions
  • word embeddings (and sentence embeddings)
  • common traditional and state-of-the-art algorithms in your chosen sub-field (e.g. for machine translation you should know SMT models and also Transformer NNs, for speech recognition you should know about HMM-GMMs and also TDNNs)
  • handling big data and data cleanup (e.g. text normalization for language data, detecting misaligned data for MT, disambiguating speech from noise in speech data)
  • other issues specific to language processing (e.g. different scripts, word orders, phonologies, etc.)
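
For the language modeling item above, a toy example may help show what "smoothing" means in practice. The sketch below is a bigram model with add-one (Laplace) smoothing, which is only the simplest of the smoothing methods you might be asked about. The function names, the whitespace tokenization, and the "<s>" and "</s>" boundary markers are assumptions made for the sake of the example.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count contexts and bigrams over whitespace-tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        # Every token except the last one serves as a context for the next.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens, tokens[1:]))
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, word, context_counts, bigram_counts, vocab):
    """P(word | prev) with add-one smoothing over the vocabulary."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))
```

Trained on the two sentences "the cat sat" and "the dog sat", this gives P(cat | the) = (1 + 1) / (2 + 6) = 0.25, while the unseen bigram "the sat" still gets a small non-zero probability of 1/8 instead of zero.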

Finally, you will want to know some modern technologies for working with machine learning, neural networks, computational linguistics, and software engineering in general, such as, for example:

  • common sources of language data
  • common data formats (e.g. XML, SQL databases, etc.)
  • Python and packages like numpy, scipy, matplotlib, spacy, nltk
  • MATLAB
  • C and/or Java could also be helpful if you know them
  • TensorFlow, Torch, Keras, deeplearning4j or similar for NNs
  • Kaldi for speech recognition
  • Git for version control
  • cloud computing
  • Docker
  • Linux and bash

There might be more topics that I missed, but that’s the gist of it I think. It seems like a lot, because… well, it is. It basically covers an undergraduate degree in computer science, a graduate degree in machine learning, and one or two courses in computational linguistics. You likely won’t need to know all of it for whatever job you’re applying to, but it’s not unrealistic to have questions asked from any of these topics. You may not know an answer to every question, and that might also be ok, but it’s good if you know the large majority.

My feeling is that if you come from that comp sci background and studied comp ling, you will just have a little bit to brush up on, while if you came from theoretical linguistics and studied comp ling (at least in LCT), you will need to spend an extra semester (or more depending on how quickly you learn) to properly learn what you need to know.

At the big companies, I was told that I should apply in the topic I had the most experience in (speech recognition for me) rather than applying to other topics I might be interested in, because this is where I had the best chance of actually getting hired.

The Eiffel Tower in autumn

Final thoughts

For me, I admit that I certainly don’t know all of the things I’ve listed above. First of all, since I don’t have a comp sci background, I never studied any of the comp sci topics in a structured way. Second, I feel that the LCT program did not have a curriculum that progressed in a logical order over the course of the entire two years, which would have supported me in learning what I needed to know. In essence, I had to restart my progress at my second uni, because my second uni didn’t really have a curriculum that allowed me to keep learning on the same track I was already on. In addition, many of the topics that I did cover during my studies were taught in a disorganized way, and/or a superficial manner, and/or in-depth but very quickly. Therefore, those items that I did cover of the topics above, I covered in a way that didn’t really solidify my understanding of them.

Having graduated, I no longer see an easy path and time-investment opportunities towards learning them. Yes, there are MOOCs, but my personal learning style really benefits from in-class instruction. I will probably have to keep studying in evening courses if I want to properly learn some of those computer science topics I’m missing. Otherwise, I have to hope that the next job I have provides me opportunities to fill in at least some of the gaps.

In any case, I am going to be very busy soon– I have accepted an offer at a start-up in Berlin.

Hallo Berlin!

Weeks 44-46

My favorite part of Paris! (From La Fermette)

I was very busy with finals all month. I’m afraid things didn’t go as well as I hoped. I studied a lot throughout the whole semester, but my studying often seemed to lead to more confusion rather than clarity on these topics. It’s also true that while I studied a lot, I spent a lot of time not studying too. For example, I traveled, once to a wedding in the US (causing lots of jetlag in both directions), and once to Paris to see some of my husband’s family that were visiting from the US.

At least now, all my exams are over… but I doubt I got great marks. Doing poorly on an exam is really demoralizing, because it feels like despite all the hard work you did throughout the semester, you didn’t seem to learn anything, or otherwise, you feel like you learned a lot throughout the semester, but you didn’t get a chance to show what you know on the exam. Some of the exams this semester seemed disproportionately hard, or completely unrelated to what was done in class, and judging from others’ reactions, I wasn’t the only one that felt that way.

Apart from exams, I have two group projects still due. The first is an iPad app for the software engineering class. Through this project, I learned a lot about Swift, Xcode, and iOS app development. The class itself, including the classwork and the exam, was pretty pointless. But I actually kind of enjoyed working on this project in the end, because I got to make something concrete, and because I actually enjoyed working in our group, for the most part. We all fell into our own little niches by the end, and I think we did a good job of handling all of the responsibilities, and trading them off when necessary as well.

The second project involves semantic parsing using neural networks, and unfortunately, I legitimately feel like I didn’t contribute enough to this one. It’s not quite over yet (I guess it’s due at the end of August), so hopefully I can somehow earn my keep on that team. If not, I guess I’ll just take the hit to my grade. I don’t think I actually need the credits at this point (unless I failed one of the finals).

I don’t expect to find out my grades from this semester for a very long time. In fact, not all of my grades from last semester (4-5 months ago) are in yet. Submitting grades seems to be up to each professor (as I’ve mentioned before, this school has little central organization), and some seem to enjoy taking their time. It’s really quite appalling.

The battery meter shows how the battery just dropped right off (and the phone turned off suddenly) at around 58% battery left.

In other news, my phone battery decided to crap out. It now drains incredibly fast and dies when the battery claims to be between 20% and 60%. It’s the kind of phone where you can’t easily open the back and replace the battery. I can try to muck about with it (voiding the warranty), or I can send it back to the US to get a replacement. Messing with it means possibly breaking it and being without a phone, and sending it back means I would have to be without a phone while it’s in the mail. So I would need to buy a phone in the meantime, even if it’s a cheap-o one. Thankfully, my husband’s family is once again coming to my rescue, and helping me buy a proper, new phone as a birthday present. This one just has to not completely fail on me in the next month.

In general, I feel that the trend of these types of phone problems has been on the rise. I don’t know if it’s planned obsolescence or if it’s just an actual failure in battery design, or if it’s a combination of the two. Either way, it’s clear that the big phone companies will continue to skimp here.

Bunch of bullshit, if you ask me.

The worst part is that I know that I am the one that gets myself into these situations– I am the one that chooses to buy phones from the companies that get away with this nonsense, because I use my phone in every aspect of my daily life. It’s an every-tool, but it’s also a shiny toy, and it’s one of the categories I am willing to drop a lot of cash on.

Long story short, it’s been a stressful month, and now, I need to finish planning the move to Italy! I am also trying to plan some travel in August before my move, because I don’t want to waste the whole month just sitting around… but we’ll see if those plans come to fruition so late in the game.

Costs

  • €225 – rent
  • €90 – health insurance
  • €25 – phone
  • €5 – meds
  • €5.5 – school supplies
  • €20 – bouldering
  • €133 – groceries
  • €61 – dining
  • €50 – trip to Paris (family helped with the bulk of the costs)
  • €89 – new phone (family helped with the bulk of the cost)
  • €450 – deposit on apartment in Italy (the initial one to hold it, another will be due once I get there)
  • Total: €1153

Weeks 32 & 33

The last two weeks have just been a whirlwind of emotion, I guess. My family came to visit me in Europe, and although it started out great, there was definitely a theme of misfortune throughout much of it. One of the stress points was that I was the only one who spoke any of the local languages, so I had to translate/coordinate things, while also trying to keep my family from panicking. Another stress point was my family’s near pathological avoidance of planning. But those were minor things. The hardest part to deal with was the theft in the second leg of the trip… but let me start from the beginning.

My family landed in Paris. The weather was great, we hit up all the big sights, went to a bunch of museums, and ate a lot of delicious food. Unfortunately, my husband got kind of sick the first couple days, so we didn’t see much of him (at least he had seen Paris with me earlier), but he did manage to join us near the end for a couple things he hadn’t seen before.

The next place we had on our itinerary was Switzerland. As mentioned, my family has some sort of strange aversion to finalizing plans. Thankfully, my mom had ordered accommodation for us near Paris, Geneva, and Munich for the trip, but she hadn’t planned on how to get from one place to the next. We actually weren’t even staying in Geneva or Munich itself for the second and third parts of the trip, but quite far away by public transport in both places, so my parents intended to rent a car once in Geneva and to use it for the rest of the trip.

We took a train from Paris to the small town we were staying at near Geneva (actually in France). The only affordable train that was available by the time we were making the booking would come in after 22:00. Like most small towns, this one didn’t really have a public transport system that late at night, which meant we ended up waiting around for a long time for 2 separate taxis to take us to the house we were staying at.

Then, the next day was completely wasted on trying to get that rental car. We had to split up into 2 groups. One group went to rent a car at a nearby place for the duration of our stay near Geneva, and the second to the Geneva airport (via 3 busses) to rent a different car to Munich. We had to do it this way because the rental car agencies that rented internationally had no cars available since we didn’t reserve ahead of time. Suffice it to say, this was a very stressful and frustrating day for everyone.

The day after, my big brother got sick, and I later caught it as well. (By the way, I’ve been sick 7 out of 8 months I’ve been in Europe.) I was actually expecting to get sick since my family had traveled on planes, so I wasn’t surprised, but that didn’t make it any less annoying. Also, one of the days, my husband ended up having to work so we didn’t see much of him again. But Switzerland, eastern France, and the Alps were beautiful, so we managed to enjoy our time there nonetheless.

We had to leave quite late on our last day in Switzerland, because we had to do a lot of driving back and forth to drop off the old car and pick up the new one. On our way out, we stopped by Lausanne. My family went to the cathedral, and my husband and I went to visit with a friend.

This is when it all went to shit.

We were having a great time, right up until my big brother called to tell us that in the hour or so that they had been away from their car, it had gotten a window smashed. My husband’s and my backpacks were stolen. These were the only two things in the car cabin (since we didn’t have space for them in the trunk), and so they were the two things that were stolen. Thankfully, nothing else was taken, and everyone’s passports, money, and phones were safe as well. Also, thankfully my friend was willing to waste an entire night with us at the police to help explain the situation, since although I speak French, I would still have trouble with the whole process. Most importantly, no one was hurt.

However, we lost the rest of the day and night to this, and we had to rent a hotel nearby to stay the night as well. Even though it was just mine and my husband’s stuff, we lost a lot of expensive things to this theft, as well as a lot of small things that are just annoying to have to collect again. In my case, I lost my backpack, which had basically my whole life in it (I don’t have a lot with me in Europe). Here’s a summary of the major things:

  • Both of our house keys and my husband’s car keys (~$60 for me to replace, ~$800 for him to replace the electronic car key)
  • My husband’s expensive MacBook Pro (~$1700)
  • Much of my husband’s collection of Netrunner cards, along with his winnings (promo cards, special tokens, etc.) from championships (~$350)
  • A brand new Nintendo Switch my husband had just gotten me as a gift with the new Zelda game (~$350)
  • My backpack which I had spent 6 months finding to be exactly right for my needs (~$100)
  • My work laptop that I just bought a few months ago (~$600)
  • A huge external hard drive with a bunch of pictures; thankfully I have the pictures backed up elsewhere (~$100)
  • Almost all of the clothes I own including my nice button up shirt, my travel towel, my toiletries (~$170 I guess)
  • My glasses case with a spare pair of glasses, and most of my glasses cleaning cloths (~$200)
  • Chargers for everything, including my only USB Type C to Type C cable for my phone and my US extension cord for all my appliances
  • All the little junk I carry in my backpack (e.g. a pocket knife, a combination lock for when I go to hostels, a pen+stylus, plug adapters, my key chains, etc.)
  • My Blizzard authenticator, so I guess I have to figure out how to cancel that
  • All the little souvenirs I had just bought from Paris (magnets/postcards)
  • Around 6 months worth of my thyroxine prescription meds that my husband had brought me from the US

So yea… after this, the vacation got less fun (and of course the two days after I come home are Easter vacation days in Germany, so I can’t even buy replacement clothes right away). I am very lucky that I was with my family during this time though, because they really helped me out. My big brother and dad generously offered me their laptops (I ended up taking my big brother’s). My mom bought me some clothes, my dad bought me some chargers and a cheapo backpack, I bought my husband a full collection of Netrunner cards… Basically, all the stuff will be replaced eventually.

After all that, what were we to do, but continue on with the plan? We drove to our AirBnB near Munich. We visited Neuschwanstein Castle, I climbed up to the top of a cliff to catch the sunset, we ate more amazing food, and eventually, we said our goodbyes.

It’s gonna sound weird to say, but despite all of that shit, I had a good vacation. Even though so many frustrating things happened, I didn’t realize how much I had missed my family, and of course my husband (who is still living in the US). And as for the stolen stuff, well, it’s just stuff.

Lessons learned:

  • Keep your passport/money/phone on you. This saved our bacon.
  • Don’t leave stuff in the car cabin. This makes you an easy target.
  • Don’t bring expensive stuff on trips. Then it hurts less to replace.
  • Leave a few shirts at home so you have clothes for when you return.
  • Have great friends and family. I don’t know how to do that, I just got lucky.

Costs:

  • rent – €225
  • replacing some of my things (should be in the mail soon) – ~$250
  • the gift of an entire Netrunner card collection for my husband – $350
  • souvenirs (most of them now stolen) – €33
  • transportation – €129
  • lockers to store stuff at the train station – €30
  • food (my family paid most of the time) – €60
  • cold medicine/cough drops – €8
  • Total: €485 and $600