The NLP Job Hunt

Castelvecchio in Verona

Around a week after graduation, I sent off a small handful of applications to a few different companies in computational linguistics. I didn’t spend much time thinking about the whole thing, because shortly thereafter, I left to attend EMNLP (a big comp ling conference that was held in Brussels). After that I headed to Paris to meet some friends, then returned to Trentino to hike a bit more in the Brenta Dolomites, and then went to Berlin. Below I’ll describe my experience, and the advice I’ve gotten, for applying to both smaller and big companies, interspersed with images of my recent travels for fun.

Searching for jobs

First of all, it was pretty easy to find jobs at smaller companies that looked appealing or related to my studies. One nice source was nlppeople, which had the most relevant openings. Other sources like linguistlist and corporalist also seem useful, and then there are the typical postings on LinkedIn or Indeed, which seem to target typical software engineers a little more. Another couple of places I found later on but didn’t explore were remoteok.io and remoteml, so I wonder if those are actually useful (anyone have any experience with them?).

On the other hand, finding jobs for the big companies like Amazon, Google, Facebook, Microsoft, Apple, etc. entails going onto those guys’ websites and doing a search. The correct job opening tends to be called something like “Applied Scientist” or “Research Scientist” and has some description of the field you’d be working in or the project you’d be working on. It’s not always clear what exactly you’d be doing, and it’s easier to get an interview there if you have an acquaintance that can push your resume through to the right recruiters.

In any case, finding interesting jobs and actually getting an interesting job are different beasts.

Small company interviews

Interviews for normal companies (and start-ups) seem to consist of the following stages:

  1. introductory phone screen conversation
  2. technical interview
  3. coding project
  4. follow-up interview and/or final interview

My (limited) experience with these has been pretty positive. The introductory phone screen has typically covered the company’s work and business model, along with questions about your own background and cultural fit. The technical interview asks machine learning and computer science questions, with a skew towards the position you’d be working in. The coding project has typically focused on a task relevant to what the company is working on. The follow-up interview might ask a few more questions about your knowledge, to see how you stack up against other candidates. The final interview moves on to logistics such as salary, start dates, moving, and so on.

This interview process is not easy, but it also does seem very reasonable. The questions I saw were typically to the point, and not outside the bounds of what I should be expected to know about after completing my degree, and planning to move into industry. In terms of time frames, the small companies were pretty quick on getting back to me, usually taking only one or two weeks after receiving my resume to respond, and just a few days in between each step thereafter.

It’s possible I got lucky with the small companies I interviewed with, because I heard that other people had strange interviews where the small companies were trying to replicate the interview process of the big companies, which I believe would be a mistake.

Big company interviews

Interviews for big companies (Amazon, Google, etc.) are very different. The best way to describe it is as a massive comp sci entrance exam. Everyone takes these entrance exams, and typically, after passing, you get further interviews with the specific group you would be working with. The process seems to consist of the following stages (though I admit that I myself did not complete the whole process, so I’m not sure about the end):

  1. phone screen with behavioural, basic comp sci, and basic machine learning questions
  2. phone technical interview
  3. on site all day technical interviews with whiteboard coding (and sometimes presentation of own work)
  4. follow-up interviews with teams of choice
  5. final interviews with logistics

I won’t sugar coat this. If you are taking the big company entrance exams, you need to have a computer science degree and remember a good chunk of what you learned, or you need to (re-)teach yourself computer science fundamentals. This is really shitty for us who are coming from a theoretical linguistics background and the LCT program, which does not cover these fundamentals (although I think they really should offer them to those who don’t have them). Below I’ve assembled all the advice I’ve received from various sources on what to study before making applications to the big companies. Some companies may ask for less of the computer science stuff, and more stuff related to your degree, but it’s better to over-prepare than under-prepare.

Hiking trail in the Brenta Dolomites

Behavioural questions

First of all, some of the companies ask you behavioural questions, like “Have you ever had a conflict with a coworker?” or “Have you ever failed to meet a deadline?” or “What are your weaknesses?” For me, I kind of handle these questions on the spot. I feel that the best way to deal with them is to say “Hmm, let me think about that…” and then start thinking about working conditions at your previous job/internship/whatever. Usually, something relevant pops to mind.

Some people might find it easier to research the most common behavioural questions, and take time to think of a scenario for the most common ones. There is also a formula that can be followed which leads to a succinct answer to these types of questions, called STAR. These methods might be the more principled way to attack behavioural questions.

In any case, I feel like these questions are sort of bullshit, and I find it easier to bullshit my way through them, because that also leads to a more natural way of talking about the problem for me. I also have a lot of prior work experience, so it’s not that hard for me to conjure up some scenarios. I don’t think I’ve ever flat out failed this section, but I’ve also never applied for leadership positions where this section is probably a lot more heavily weighted.

Brussels Town Hall

Topics to cover

For the computer science entrance exams at the big companies, you can use leetcode.com, topcoder.com and projecteuler.net to practice, and read the well-known book Cracking the Coding Interview as well (behavioural questions are in there too). In short, you will need to know:

  • algorithm complexity (big-O notation for runtime and memory)
  • sorting: n*log(n) complexity algorithms such as quicksort and merge sort
  • hashtables: how they work and how to implement one in code using only arrays
  • trees: how to construct and manipulate binary trees, n-ary trees, tries, red/black trees (and/or splay trees, and AVL trees); how to traverse trees using breadth-first search and depth-first search; the difference between inorder, postorder, and preorder
  • graphs: objects, pointers, matrix, and adjacency list representations of graphs; how to traverse them using breadth-first search and depth-first search; their complexity, tradeoffs, and implementation in code
  • other algorithms: Dijkstra and A*
  • NP-complete: what this means, and problems such as the traveling salesman, and the knapsack problem
  • combinatorics: n-choose-k
  • probability: Bayes’ theorem, likelihood, prior, posterior
  • statistics: significance testing, distributions such as Gaussian and Poisson
  • concurrency: processes, scheduling, locks, mutexes, semaphores, monitors, deadlock and livelock and how to avoid them, parallelization on multi-core systems
  • object oriented system design: feature sets, interfaces, class hierarchies, constraints, simplicity and robustness, tradeoffs
  • development practices: validating designs, testing whiteboard code, preventing bugs, code maintainability and readability, refactor/review sample code
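
To give a feel for the "implement a hashtable using only arrays" item above, here is a minimal sketch in Python. It is only one possible approach (open addressing with linear probing), and the class name ArrayHashTable and its methods are made up for illustration. It leaves out deletion, which a real interview answer would probably need to discuss.

```python
class ArrayHashTable:
    """Toy hash table built on two plain arrays (open addressing, linear probing)."""

    _EMPTY = object()  # sentinel marking a slot that has never been used

    def __init__(self, capacity=8):
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity
        self._size = 0

    def _find_slot(self, key):
        # Start at the hashed index and step forward until we reach
        # either the key itself or an empty slot.
        capacity = len(self._keys)
        index = hash(key) % capacity
        while self._keys[index] is not self._EMPTY and self._keys[index] != key:
            index = (index + 1) % capacity
        return index

    def put(self, key, value):
        # Grow before the table is more than half full, so probing
        # always finds an empty slot and stays short on average.
        if (self._size + 1) * 2 > len(self._keys):
            self._resize(len(self._keys) * 2)
        index = self._find_slot(key)
        if self._keys[index] is self._EMPTY:
            self._size += 1
        self._keys[index] = key
        self._values[index] = value

    def get(self, key, default=None):
        index = self._find_slot(key)
        if self._keys[index] is self._EMPTY:
            return default
        return self._values[index]

    def _resize(self, new_capacity):
        # Re-insert every stored pair, because slot positions depend on capacity.
        old_keys, old_values = self._keys, self._values
        self._keys = [self._EMPTY] * new_capacity
        self._values = [None] * new_capacity
        self._size = 0
        for k, v in zip(old_keys, old_values):
            if k is not self._EMPTY:
                self.put(k, v)

    def __len__(self):
        return self._size
```

Lookups and inserts are O(1) on average and O(n) in the worst case, which is the kind of tradeoff the big-O item at the top of the list is about.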

In addition to computer science, you will need to know machine learning. If you only took one course on it during your LCT program, you will probably need to study some things that you missed, including:

  • supervised/unsupervised/semi-supervised learning
  • generative vs. discriminative models
  • clustering
  • classification
  • regression
  • overfitting/underfitting
  • cross-validation
  • regularization
  • bias-variance tradeoff
  • ROC curves
  • train vs. dev vs. test data
  • ML algorithms: naive Bayes, linear regression, logistic regression, decision trees, random forests, KNN, K-means, SVM, HMMs, Viterbi, GMMs
  • neural networks and their specific issues: feedforward DNNs, RNNs, LSTMs, vanishing/exploding gradient problem, attention, stochastic gradient descent, learning rate, mini-batches, etc.
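
As an example of the level of detail that can come up, here is a rough sketch of how the data split for k-fold cross-validation (one of the items above) could be written by hand in Python. The function name k_fold_indices is hypothetical, and the sketch assumes the data has already been shuffled. In practice you would more likely reach for a library routine.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumes the items were shuffled beforehand; folds are contiguous blocks.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    indices = list(range(n_items))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

Each item lands in the test set exactly once, so averaging the k test scores gives an estimate of performance on unseen data without setting aside a single fixed test split.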

You will want to be familiar with the issues in computational linguistics and your specific field, which will depend on what the company is doing and the job you are applying to. This part you might not have to study as much for, since it will depend on your interests and will probably be related to your studies. In any case, it could include topics such as:

  • language modeling, including smoothing
  • FSTs and regular expressions
  • word embeddings (and sentence embeddings)
  • common traditional and state-of-the-art algorithms in your chosen sub-field (e.g. for machine translation you should know SMT models and also Transformer NNs, for speech recognition you should know about HMM-GMMs and also TDNNs)
  • handling big data and data cleanup (e.g. text normalization for language data, detecting misaligned data for MT, disambiguating speech from noise in speech data)
  • other issues specific to language processing (e.g. different scripts, word orders, phonologies, etc.)
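
For the first item in this list, a small illustration may help: below is a minimal sketch of a bigram language model with add-one (Laplace) smoothing in Python. The function names and the toy corpus are made up for illustration, and add-one is just the simplest smoothing method, not necessarily the one an interviewer would want you to use. For simplicity, the start symbol is counted as part of the vocabulary.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count context words and bigrams over tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        vocab.update(tokens)
        # Every token except the last one serves as a context for the next token.
        context_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))
    return context_counts, bigram_counts, vocab


def bigram_prob(prev, word, context_counts, bigram_counts, vocab):
    """P(word | prev) with add-one smoothing, so unseen bigrams get non-zero mass."""
    return (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + len(vocab))


corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
contexts, bigrams, vocab = train_bigram_model(corpus)
seen = bigram_prob("the", "cat", contexts, bigrams, vocab)    # (1 + 1) / (2 + 6)
unseen = bigram_prob("cat", "the", contexts, bigrams, vocab)  # (0 + 1) / (1 + 6)
```

Without the added counts, the unseen bigram would get probability zero, which would zero out the probability of any sentence containing it.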

Finally, you will want to know some modern technologies for working with machine learning, neural networks, computational linguistics, and software engineering in general, such as, for example:

  • common sources of language data
  • common data formats (e.g. XML, SQL databases, etc.)
  • Python and packages like numpy, scipy, matplotlib, spacy, nltk
  • MATLAB
  • C and/or Java could also be helpful if you know them
  • TensorFlow, Torch, Keras, deeplearning4j or similar for NNs
  • Kaldi for speech recognition
  • Git for version control
  • cloud computing
  • Docker
  • Linux and bash

There might be more topics that I missed, but that’s the gist of it I think. It seems like a lot, because… well, it is. It basically covers an undergraduate degree in computer science, a graduate degree in machine learning, and one or two courses in computational linguistics. You likely won’t need to know all of it for whatever job you’re applying to, but it’s not unrealistic to be asked questions from any of these topics. You may not know the answer to every question, and that might also be ok, but it’s good if you know the large majority.

My feeling is that if you come from a comp sci background and studied comp ling, you will just have a little bit to brush up on, while if you came from theoretical linguistics and studied comp ling (at least in LCT), you will need to spend an extra semester (or more depending on how quickly you learn) to properly learn what you need to know.

At the big companies, I was told that I should apply in the topic I had the most experience in (speech recognition for me) rather than applying to other topics I might be interested in, because this is where I had the best chance of actually getting hired.

The Eiffel Tower in autumn

Final thoughts

For me, I admit that I certainly don’t know all of the things I’ve listed above. First of all, since I don’t have a comp sci background, I never studied any of the comp sci topics in a structured way. Second, I feel that the LCT program did not have a curriculum that progressed in a logical order over the course of the entire two years, which would have supported me in learning what I needed to know. In essence, I had to restart my progress at my second uni, because my second uni didn’t really have a curriculum that allowed me to keep learning on the same track I was already on. In addition, many of the topics that I did cover during my studies were taught in a disorganized way, and/or a superficial manner, and/or in-depth but very quickly. Therefore, those items that I did cover of the topics above, I covered in a way that didn’t really solidify my understanding of them.

Having graduated, I no longer see an easy path and time-investment opportunities towards learning them. Yes, there are MOOCs, but my personal learning style really benefits from in-class instruction. I will probably have to keep studying in evening courses if I want to properly learn some of those computer science topics I’m missing. Otherwise, I have to hope that the next job I have provides me opportunities to fill in at least some of the gaps.

In any case, I am going to be very busy soon: I have accepted an offer at a start-up in Berlin.

Hallo Berlin!
