LCT retrospective

Prague Castle

Last month I attended the annual meeting of my study program. This year, the meeting was held at Charles University in the beautiful capital city of the Czech Republic, Prague. Since I have already graduated, this was probably the last LCT meeting that I will attend (although who knows!). As usual, it was an absolute blast.

As the graduating class, we participated in a small, but very formal, graduation ceremony. I already have the two diplomas from the two universities, so this ceremony was just something extra. We did receive a supplementary LCT document with a pretty nice description of the program and its requirements. I imagine this is something I could submit to anyone asking for more details about LCT, but I doubt that I will need to submit it anywhere ever, since it’s not an official diploma or transcript of records. Nevertheless, I enjoyed the ceremony, because the format was close to that of American graduation ceremonies– that is, it was very formal. There were speeches by professors and a student, there was organ music and singing, and it was held in a beautiful old hall in the Charles University in Prague. 

Insignia of the Charles University in Prague.

It was liberating being one of the graduating class. For one, I didn’t have to worry too much about making it to any particular talk or event, so I was able to sleep in! But also, I found that I did get a new perspective on the LCT program by coming for the third year. Having graduated a while ago means that my life has moved away from all of the discussions on lectures, studying, advisers, etc. Hearing all of that talk again reminded me how urgent it all felt at the time. Looking back made me realize how much and how quickly my life had changed– for better or worse. Either way, despite being a graduate, I still felt welcome. I met other graduates there, who were going through experiences similar to mine. So even though I am not in the university mindset anymore, I can still feel like I am part of this larger community.

The LCT students are incredible– not just the ones from my year or my universities–but from all the years and universities. Just about each person in the program is driven, open, and interesting in their own way. As ever, there are people who like to work, there are people who prefer to party, there are some who work hard/play hard, and there are some who chill. Nearly no one comes from the same country or the same background, which is probably the best part. As an alum at the meeting, I felt like I got to look back at the program and see it with many eyes and many points of view.

Now that I am in the workforce, I can see that having had the time to explore and meet new people was the biggest advantage I gained from LCT. I feel I learned how to be part of a community and how to go out there and find answers and guidance for myself when I needed it. Now, I have a stable career that I feel will propel me forward, but I don’t have as much time to explore new things. Still, I have to keep learning, which means I have to do the learning on my own time.

I have to keep learning… a LOT. Because what I learned in the LCT program wasn’t enough preparation for the professional world. I now have to introduce myself to a whole host of frameworks, design paradigms, algorithms, technologies, work methodologies, and attitudes that I have never had to face before. During my coursework, I spent little time on hands-on practice with modern tools. Not only that, but since I am further missing the computer science background and the web development experience that many programmers have these days, I have to learn all of those things afresh as well, in order to compete with/work alongside these people in the workforce.

To give some concrete examples, in just the last couple weeks I was struggling with CUDA driver installation (for the billionth time), Docker, REST APIs, Python’s Flask web framework, the OpenNMT-tf framework for machine translation (I already struggled with Marian, Sockeye, and OpenNMT-lua a while back), making a presentation on some recent research (i.e. reading papers and dissecting math) on a specific topic in machine translation, and a bunch of code refactoring. That’s just in the last couple weeks.

Prague Astronomical Clock

It sounds exciting, but actually, it’s very stressful to have to learn everything at once. I wish we had had some more practical courses in my master’s that taught some of these theoretical ideas using real tools (e.g. scipy, tensorflow, matplotlib) and provided assignments in standardized formats (e.g. APIs to query or Docker containers to run), just so that we could get a little bit more used to those tools, if not completely comfortable with them.

I suppose one could ask how professors could possibly keep up with all of the tools coming out all the time, to be able to teach us that. I would respond: how are we managing it then with much less experience? Because we, the students, do eventually manage it all on our own somehow– you just do what you gotta do– but it’s the lack of guidance from our mentors in this area that easily leads to unnecessary stress and a steep learning curve. Another response might be “you have to learn how to teach yourself.” Of course that is true, but learning how to teach yourself and having guidance in your studies are not mutually exclusive. At my unis, it wasn’t just like this with practical topics. It was like this with many things, much of the time. I won’t say “all the time” because there were a few gem classes/professors, but much of the time, the students got together and taught each other things they had learned 5 minutes ago. This is why the LCT program was so invaluable– it was full of students ready, willing, and able to do this, and to make a party out of it.

In the end, doing the LCT program was the right decision for me, because even though I feel the education was probably of lower quality than what you’d get at a top (in my field) US public university, I gained many soft skills and many many worthwhile experiences. If I could go back, I would definitely do it again, but only after studying a bit more on my own in the prerequisites/background topics first. In short, I would teach myself 75% of what I need to know on my own in terms of skills and theory, and then come to LCT for the last little bit on research. Things would be calmer then, and I think I could get even more out of the LCT program this way. I wonder, is it like this with all the Erasmus Mundus programs or all unis in Europe? Professors themselves seem to bounce around a lot, so is it just luck based on what professors are there the year you happen to go?

The LCT meeting was a great opportunity to look back and process everything that has happened in the last 2+ years. But now that I’ve spent some time looking back, it’s time to start looking forward. As usual, I don’t know what comes next. I have a lot of vague ideas and few concrete plans. Visiting Prague was really nice, because it reminded me that even though I don’t like big cities that much, there might be bigger cities out there that could still fit me– unlike Berlin, which is really a mismatch for my preferences, I think. In the long term, I know Berlin is not the right place for me. In some ways, it might make sense to move back home to the US. I think the salaries are still quite a bit higher there for programmers, and it would be nice to be closer to family. Eventually, I definitely want to do that… but I’m not quite ready to stop traipsing across the world just yet!

View from Prague Castle

The NLP Job Hunt


Castelvecchio in Verona

Around a week after graduation, I sent off a very small handful of applications to a few different companies in computational linguistics. I didn’t spend much time thinking about the whole thing, because shortly thereafter, I left to attend EMNLP (a big comp ling conference that was held in Brussels). After that I headed to Paris to meet some friends, then returned to Trentino to hike a bit more in the Brenta Dolomites, and then went to Berlin. Below I’ll describe my experience and advice I’ve gotten for applying to both smaller companies, and big companies, interspersed with images of my recent travels for fun.

Searching for jobs

First of all, it was pretty easy to find jobs that looked appealing or related to my studies in smaller companies. One nice source was nlppeople, who had the most relevant openings. Other sources like linguistlist and corporalist also seem useful, and then there are the typical postings on LinkedIn or Indeed that seem to target typical software engineers a little more. Another couple of places I found later on, but didn’t explore were remoteok.io and remoteml, so I wonder if those are actually useful (anyone have any experience with them?).

On the other hand, finding jobs for the big companies like Amazon, Google, Facebook, Microsoft, Apple, etc. entails going onto those guys’ websites and doing a search. The correct job opening tends to be called something like “Applied Scientist” or “Research Scientist” and has some description of the field you’d be working in or the project you’d be working on. It’s not always clear what exactly you’d be doing, and it’s easier to get an interview there if you have an acquaintance that can push your resume through to the right recruiters.

In any case, finding interesting jobs and actually getting an interesting job are different beasts.

Small company interviews

Interviews for normal companies (and start-ups) seem to consist of the following stages:

  1. introductory phone screen conversation
  2. technical interview
  3. coding project
  4. follow-up interview and/or final interview

My (limited) experience with these has been pretty positive. The introductory phone screen has typically talked about the company’s work and business model, and has asked about your own background and cultural fit. The technical interview asks machine learning and computer science questions, with a skew towards the position you’d be working in. The coding project has typically focused on a task relevant to what the company is working on. The follow-up interview might ask a few more questions about your knowledge, to see how you are stacked up against other candidates. The final interview will already talk about logistics such as salary, start times, moving, and so on.

This interview process is not easy, but it also does seem very reasonable. The questions I saw were typically to the point, and not outside the bounds of what I should be expected to know about after completing my degree, and planning to move into industry. In terms of time frames, the small companies were pretty quick on getting back to me, usually taking only one or two weeks after receiving my resume to respond, and just a few days in between each step thereafter.

It’s possible I got lucky with the small companies I interviewed for, because I heard that other people had strange interviews, where the small companies were trying to replicate the interview process of the big companies, which I believe would be a mistake.

Big company interviews

Interviews for big companies (Amazon, Google, etc.) are very different. The best way to describe it is as a massive comp sci entrance exam. Everyone takes these entrance exams, and typically, after passing, you get further interviews with the specific group you would be working with. The process seems to consist of the following stages (though I admit that I myself did not complete the whole process, so I’m not sure about the end):

  1. phone screen with behavioural, basic comp sci, and basic machine learning questions
  2. phone technical interview
  3. on site all day technical interviews with whiteboard coding (and sometimes presentation of own work)
  4. follow-up interviews with teams of choice
  5. final interviews with logistics

I won’t sugar coat this. If you are taking the big company entrance exams, you need to have a computer science degree and remember a good chunk of what you learned, or you need to (re-)teach yourself computer science fundamentals. This is really shitty for us who are coming from a theoretical linguistics background and the LCT program, which does not cover these fundamentals (although I think they really should offer them to those who don’t have them). Below I’ve assembled all the advice I’ve received from various sources on what to study before making applications to the big companies. Some companies may ask for less of the computer science stuff, and more stuff related to your degree, but it’s better to over-prepare than under-prepare.


Hiking trail in the Brenta Dolomites

Behavioural questions

First of all, some of the companies ask you behavioural questions, like “Have you ever had a conflict with a coworker?” or “Have you ever failed to meet a deadline?” or “What are your weaknesses?” For me, I kind of handle these questions on the spot. I feel that the best way to deal with them is to say “Hmm, let me think about that…” and then start thinking about working conditions at your previous job/internship/whatever. Usually, something relevant pops to mind.

Some people might find it easier to research the most common behavioural questions, and take time to think of a scenario for the most common ones. There is also a formula that can be followed which leads to a succinct answer to these types of questions, called STAR. These methods might be the more principled way to attack behavioural questions.

In any case, I feel like these questions are sort of bullshit, and I find it easier to bullshit my way through them, because that also leads to a more natural way of talking about the problem for me. I also have a lot of prior work experience, so it’s not that hard for me to conjure up some scenarios. I don’t think I’ve ever flat out failed this section, but I’ve also never applied for leadership positions where this section is probably a lot more heavily weighted.


Brussels Town Hall

Topics to cover

For the computer science entrance exams at the big companies, you can use leetcode.com, topcoder.com and projecteuler.net to practice, and read the well-known book Cracking the Coding Interview as well (behavioural questions are in there too). In short, you will need to know:

  • algorithm complexity (big-O notation for runtime and memory)
  • sorting: n*log(n) complexity algorithms such as quicksort and merge sort
  • hashtables: how they work and how to implement one in code using only arrays
  • trees: how to construct and manipulate binary trees, n-ary trees, tries, red/black trees (and/or splay trees, and AVL trees); how to traverse trees using breadth-first search and depth-first search; the difference between inorder, postorder, and preorder
  • graphs: objects, pointers, matrix, and adjacency list representations of graphs; how to traverse them using breadth-first search and depth-first search; their complexity, tradeoffs, and implementation in code
  • other algorithms: Dijkstra and A*
  • NP-complete: what this means, and problems such as the traveling salesman, and the knapsack problem
  • combinatorics: n-choose-k
  • probability: bayes, likelihood, prior, posterior
  • statistics: significance testing, distributions such as Gaussian and Poisson
  • concurrency: processes, scheduling, locks, mutexes, semaphores, monitors, deadlock and livelock and how to avoid them, parallelization on multi-core systems
  • object oriented system design: feature sets, interfaces, class hierarchies, constraints, simplicity and robustness, tradeoffs
  • development practices: validating designs, testing whiteboard code, preventing bugs, code maintainability and readability, refactor/review sample code
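
To make the hashtable item above a bit more concrete, here is a minimal sketch of a hash table built on nothing but plain arrays (Python lists), using linear probing. This is just one possible approach, not a standard implementation: the class name ArrayHashTable and its method names are made up for illustration, it relies on Python's built-in hash(), and it deliberately leaves out deletion to stay short.

```python
class ArrayHashTable:
    """Toy hash table with linear probing, backed by two plain lists."""

    _EMPTY = object()  # sentinel marking a slot that has never been used

    def __init__(self, capacity=8):
        self._keys = [self._EMPTY] * capacity
        self._values = [None] * capacity
        self._size = 0

    def _find_slot(self, key):
        # Start at the hashed index and walk forward until we hit the key
        # or an empty slot. The table is kept at most half full, so an
        # empty slot always exists and the loop terminates.
        capacity = len(self._keys)
        index = hash(key) % capacity
        while self._keys[index] is not self._EMPTY and self._keys[index] != key:
            index = (index + 1) % capacity
        return index

    def put(self, key, value):
        # Grow before the table gets more than half full, which keeps
        # probe sequences short (amortized O(1) inserts and lookups).
        if (self._size + 1) * 2 > len(self._keys):
            self._resize(2 * len(self._keys))
        index = self._find_slot(key)
        if self._keys[index] is self._EMPTY:
            self._size += 1
        self._keys[index] = key
        self._values[index] = value

    def get(self, key, default=None):
        index = self._find_slot(key)
        if self._keys[index] is self._EMPTY:
            return default
        return self._values[index]

    def _resize(self, new_capacity):
        # Re-insert every stored pair, since indices depend on the capacity.
        old_keys, old_values = self._keys, self._values
        self._keys = [self._EMPTY] * new_capacity
        self._values = [None] * new_capacity
        self._size = 0
        for key, value in zip(old_keys, old_values):
            if key is not self._EMPTY:
                self.put(key, value)

    def __len__(self):
        return self._size
```

In an interview you would probably also be asked about the tradeoffs: linear probing is cache-friendly but clusters badly at high load, while separate chaining (an array of lists) makes deletion much simpler.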

In addition to computer science, you will need to know machine learning. If you only took one course on it during your LCT program, you will probably need to study some things that you missed, including:

  • supervised/unsupervised/semi-supervised learning
  • generative vs. discriminative models
  • clustering
  • classification
  • regression
  • overfitting/underfitting
  • cross-validation
  • regularization
  • bias-variance tradeoff
  • ROC curves
  • train vs. dev vs. test data
  • ML algorithms: naive bayes, linear regression, logistic regression, decision trees, random forests, KNN, K-means, SVM, HMMs, Viterbi, GMMs
  • neural networks and their specific issues: feedforward DNNs, RNNs, LSTMs, vanishing/exploding gradient problem, attention, stochastic gradient descent, learning rate, mini-batches, etc.
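
As an example of the kind of thing you might be asked to write out for the cross-validation item above, here is a small sketch of k-fold splitting in plain Python. The function name k_fold_indices is hypothetical, and the sketch assumes the data has already been shuffled; in practice you would more likely use a library routine than roll your own.

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, held_out_indices) for each of k folds."""
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")
    # Spread the remainder over the first folds so that fold sizes
    # differ by at most one item.
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        held_out = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_items))
        yield train, held_out
        start = stop


if __name__ == "__main__":
    for train, held_out in k_fold_indices(10, 3):
        print(len(train), "train items, held out:", held_out)
```

Every item ends up in the held-out set exactly once, so averaging the k scores gives an estimate that uses all of the data without ever evaluating on something the model was trained on.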

You will want to be familiar with the issues in computational linguistics and your specific field, which will depend on what the company is doing and the job you are applying to. This part you might not have to study as much for, since it will depend on your interests and will probably be related to your studies. In any case, it could include topics such as:

  • language modeling, including smoothing
  • FSTs and regular expressions
  • word embeddings (and sentence embeddings)
  • common traditional and state-of-the-art algorithms in your chosen sub-field (e.g. for machine translation you should know SMT models and also Transformer NNs, for speech recognition you should know about HMM-GMMs and also TDNNs)
  • handling big data and data cleanup (e.g. text normalization for language data, detecting misaligned data for MT, disambiguating speech from noise in speech data)
  • other issues specific to language processing (e.g. different scripts, word orders, phonologies, etc.)
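
For the language modeling item above, a minimal sketch of a bigram model with add-one (Laplace) smoothing might look like the following. The function name train_bigram_model and the <s> and </s> boundary markers are assumptions made for this example, and add-one is only the simplest smoothing method; interviewers may well ask about better ones such as Kneser-Ney.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Return (prob, vocab), where prob(word, previous) is add-one smoothed."""
    context_counts = Counter()  # how often each word appears as a history
    bigram_counts = Counter()   # how often each (previous, word) pair appears
    vocab = {"</s>"}            # words that can be predicted, incl. end marker
    for tokens in sentences:
        padded = ["<s>", *tokens, "</s>"]
        vocab.update(tokens)
        context_counts.update(padded[:-1])
        bigram_counts.update(zip(padded[:-1], padded[1:]))

    def prob(word, previous):
        # Add one to every bigram count, and the vocabulary size to the
        # context count, so that unseen bigrams get non-zero probability
        # and the distribution over the vocabulary still sums to one.
        numerator = bigram_counts[(previous, word)] + 1
        denominator = context_counts[previous] + len(vocab)
        return numerator / denominator

    return prob, vocab


if __name__ == "__main__":
    prob, vocab = train_bigram_model([["the", "cat", "sat"], ["the", "dog", "sat"]])
    print(prob("cat", "the"), prob("the", "the"))
```

Without smoothing, any sentence containing a single unseen bigram would get probability zero, which is the problem this trick is meant to avoid.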

Finally, you will want to know some modern technologies for working with machine learning, neural networks, computational linguistics, and software engineering in general, such as, for example:

  • common sources of language data
  • common data formats (e.g. XML, SQL databases, etc.)
  • Python and packages like numpy, scipy, matplotlib, spacy, nltk
  • MATLAB
  • C and/or Java could also be helpful if you know them
  • TensorFlow, Torch, Keras, deeplearning4j or similar for NNs
  • Kaldi for speech recognition
  • Git for version control
  • cloud computing
  • Docker
  • Linux and bash

There might be more topics that I missed, but that’s the gist of it I think. It seems like a lot, because… well, it is. It basically covers an undergraduate degree in computer science, a graduate degree in machine learning, and one or two courses in computational linguistics. You likely won’t need to know all of it for whatever job you’re applying to, but it’s not unrealistic to have questions asked from any of these topics. You may not know an answer to every question, and that might also be ok, but it’s good if you know the large majority.

My feeling is that if you come from that comp sci background and studied comp ling, you will just have a little bit to brush up on, while if you came from theoretical linguistics and studied comp ling (at least in LCT), you will need to spend an extra semester (or more depending on how quickly you learn) to properly learn what you need to know.

At the big companies, I was told that I should apply in the topic I had the most experience in (speech recognition for me) rather than applying to other topics I might be interested in, because this is where I had the best chance of getting actually hired.


The Eiffel Tower in autumn

Final thoughts

For me, I admit that I certainly don’t know all of the things I’ve listed above. First of all, since I don’t have a comp sci background, I never studied any of the comp sci topics in a structured way. Second, I feel that the LCT program did not have a curriculum that progressed in a logical order over the course of the entire two years, which would have supported me in learning what I needed to know. In essence, I had to restart my progress at my second uni, because my second uni didn’t really have a curriculum that allowed me to keep learning on the same track I was already on. In addition, many of the topics that I did cover during my studies were taught in a disorganized way, and/or a superficial manner, and/or in-depth but very quickly. Therefore, the topics above that I did cover, I covered in a way that didn’t really solidify my understanding of them.

Having graduated, I no longer see an easy path and time-investment opportunities towards learning them. Yes, there are MOOCs, but my personal learning style really benefits from in-class instruction. I will probably have to keep studying in evening courses if I want to properly learn some of those computer science topics I’m missing. Otherwise, I have to hope that the next job I have provides me opportunities to fill in at least some of the gaps.

In any case, I am going to be very busy soon– I have accepted an offer at a start-up in Berlin.


Hallo Berlin!

Weeks 36 through 38

Busy! I thought I’d have an easier time this semester, but I’m afraid it is not so. There are more interesting classes offered this time around, but they are also harder and I am finding very little time to myself. I work every day of the week, usually quite a bit more than 8 hours a day. I haven’t been cooking much due to the busy schedule.

Like at the start of last semester, my schedule this semester is not entirely settled yet, but the classes still in the running are:

  • Software engineering
  • Semantics
  • Statistical natural language processing
  • TensorFlow (programming project)
  • Semantic Parsing (presentation + programming project)
  • Language Technology II

Software engineering (SWE) is a class I need to fulfill the requirement from UdS that states I need >8 credits from a comp sci department class taught by a non computational linguistics (COLI) prof. For this class, we get into groups of 5, and we work on a software project for a client from around the school or area. Each group works on something different. The deliverables for the class are project-management style reports, as well as the completed project. The client basically gets free interns for a semester.

In our project, we are working with a software engineer from DFKI to create an app for psychologists working with patients with dementia, Alzheimer’s, Parkinson’s, and similar. The tests involve things like asking patients to name images, or describe a scene, or tell the time. The app records their responses, analyzes the speech, and reports statistics on the data. The speech recognition part and analysis is done by DFKI. Our bit is just the front end. This has to include things like a nice UI, a database for patient tracking, audio recording, and so on. Also– and this is the stickler– it must be an iPad app. The problem is none of us have experience writing apps for iPad, and only a couple of us have Macs that we can use to compile and test the code. So yea, this is gonna be a fun ride.

The next class on the to-keep list is Semantics, which I need to fulfill the last core course requirement, plus it’s helpful for one of the LCT requirements which I haven’t finished yet as well (LT-M3 I think). Semantics is the one theoretical linguistics topic that I didn’t cover in my undergrad, so it makes sense to take it now.

Next is statistical natural language processing (SNLP). This class introduces a lot of the basic computational and info-theoretic techniques that I need to know (although some I already went over last semester); however, it’s a frustrating class, because the lectures and the assignments are completely disassociated, so I am basically teaching myself everything involved. I work on the assignments with a partner, and I feel like we are a good team, although we do have some kinks to work out. Still, even with two of us, it takes us at least twice the prof’s estimated time for us to finish the assignments.

The TensorFlow programming project sounds like a really relevant thing that I want to work on. Unfortunately, this thing hasn’t even started yet (a month into the semester), and it won’t finish until well after I am in Italy. The time frame for this isn’t great, but I am hesitant to drop it until I at least see what it’s about and how it will go.

Semantic parsing is another class that I’m not that sure about. We read a paper, and do a presentation. After all the presentations are done, we work together on a semantic parser, either implementing a system that we read about, or implementing our own system. I am already committed to doing the presentation, but I am not sure how many software projects I can do at once while also taking a bunch of classes.

Finally, Language Technology II just goes over some techniques for machine translation. It has a good curriculum, but unfortunately, it’s a very slow class, and it has very few (if any) assignments. To be honest, I’m just sort of keeping this class in my back pocket for now in case something else goes awry, but I most likely will drop it.

In addition to the above, I am attending a few other classes in a not too serious way. I’ll probably stop attending these as the semester wears on (the order below reflects the order in which I will stop attending them):

  • Methods of Mathematical Analysis: I don’t like the way the prof teaches, and the curriculum isn’t as good as it could be, but maybe I’ll learn something useful
  • French Culture and Conversation: just a relaxing thing I’m doing for fun
  • German classes (“Grammatik” and “Allgemeine Deutsch Kurse”): it seems silly to be in Germany and never learn any German
  • Italian: I’m moving to Italy, and I’ve barely studied Italian, but I’m finding it difficult to put much effort into it with everything else going on

It’s a shame that I had to drop some of the other very interesting sounding classes, like Image Processing and Computer Vision, Artificial Intelligence, and a seminar on Minimalism (Syntax), but I just didn’t judge that I could manage them and/or didn’t need them as much as some of the required things.

Next week I am going to Malta for the yearly LCT conference. I hope I can enjoy it, because I will also be quite busy due to all the work that is still due.

 

 

In other news, spring is in full swing! The sun is warm, the evenings are pleasant, and I am finally so so happy with the weather. A bunch of us got together for a Grillabend (barbeque) out in the park. People cooked various delicious things, and it was such a relaxing time.

There were about 18 people there. I have to say, I am normally a fairly introverted person, and I don’t really feel comfortable in large groups. But somehow, I don’t feel that normal stress of having to be sociable when surrounded by these folks, and I actually get energy from hanging out, rather than getting fatigued by it. Moving to Italy is going to be bittersweet.

Costs:

I’m overspending on food (as usual), partly due to busyness, partly due to laziness, partly due to the enjoyment of shopping for food. =\ Next week will be expensive too since I’ll be travelling. By the way, my HiWi job ends this month, so my ideal budget will be getting cut down again.

  • €250 – rent
  • €90 – health insurance
  • €60 – replacement key (from when all my shit got stolen)
  • €25 – phone
  • €50 – train tickets for a later trip
  • €10 – bouldering
  • €168 – groceries
  • €76 – dining/snacks
  • €6 – school supplies
  • Total: €735