By Gwen Spencer, Ph.D. Operations Research, Data Scientist at Stripe
I’m writing to you two years into my industry career, and from my second “Scientist” position at a fast-growing private technology company. I wanted to start this article by naming my favorite aspect of working as a scientist in industry. Mulling it over though, I couldn’t choose just one! So first, on a day-to-day basis, what I love most about my job is the high rate at which I get to learn new ideas with and from the super sharp domain experts I collaborate with. On a longer time-scale, what I’ve found most intoxicating about working in industry is the opportunity to see my models pressure-tested at enormous scale by the complex adversary that is the real world.
Some problems I’ve worked on:
After two years as an interdisciplinary academic postdoc, and 4.5 years as Mathematics faculty at a private liberal arts college in Massachusetts, I accepted my first industry position in Seattle at a long-haul trucking startup called Convoy. In my 13 months at Convoy, I worked to automate the spatial rebalancing of the national fleet of trailers (each 53 feet long) that are used to offer a long-haul shipping product called “Drop and Hook.” Drop and Hook shipping involves pre-positioning trailers so that shippers can load trailers in advance of pick up time. This minimizes delays for truck drivers who can arrive, hook up to a preloaded trailer, and drive off. Enterprise shippers love this shipping option because they have highly choreographed routines to boost efficiency within their warehouses: with Drop and Hook, shippers can operate on their own finely-tuned warehouse schedules without interruption even if truck drivers get stuck in traffic, or arrive late due to a snag with a previous job.
Rebalancing trailers optimally requires distributing thousands of trailers across a network of hundreds of shipper sites so that trailers are always onsite when needed for outbound jobs, and so that empty trailers don’t have to be transported more miles than necessary. By working on integer-programming algorithms to reduce the empty mileage of the Drop and Hook shipping product, I helped to keep costs low for Convoy’s shipper customers while also reducing the carbon footprint required to serve their long-haul shipping needs. An extremely cool moment with my engineering partner was when we started to “take our hands off the wheel” and the fully automated system took over. As the first COVID-related lockdowns began in California, our rebalancing work helped stretch Convoy’s Drop and Hook capacity to accommodate the surge in demand our shippers faced as they restocked critical consumer goods. Contributing our small part to keep the national supply chain moving during this tail event was both inspiring and technically fascinating.
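To make the shape of the rebalancing problem concrete, here is a toy sketch. It is not Convoy's system: the site names, trailer counts, and mileages are invented for illustration, and the function name `rebalance` is hypothetical. It also swaps integer programming for brute-force enumeration, which is only workable for a handful of trailers. The objective, though, is the one described above: cover every site's trailer shortfall while minimizing total empty miles.

```python
from collections import Counter
from itertools import permutations


def rebalance(surplus, deficit, miles):
    """Return (total_empty_miles, plan) for the cheapest way to move trailers.

    surplus: {site: number of spare trailers}
    deficit: {site: number of trailers still needed}
    miles:   {(from_site, to_site): empty miles for one trailer move}
    The plan maps (from_site, to_site) to the number of trailers moved.
    """
    # Expand counts into one entry per physical trailer / per open slot.
    trailers = [site for site, n in surplus.items() for _ in range(n)]
    slots = [site for site, n in deficit.items() for _ in range(n)]
    if len(trailers) != len(slots):
        raise ValueError("toy model assumes total surplus equals total deficit")

    best_cost, best_plan = None, None
    # Try every distinct way of matching trailers to slots (exponential cost;
    # a real system would hand this to an integer-programming solver).
    for assignment in set(permutations(slots)):
        moves = list(zip(trailers, assignment))
        cost = sum(miles[move] for move in moves)
        if best_cost is None or cost < best_cost:
            best_cost, best_plan = cost, dict(Counter(moves))
    return best_cost, best_plan


# Hypothetical data: two sites with spare trailers, two sites that need them.
surplus = {"A": 2, "B": 1}
deficit = {"X": 1, "Y": 2}
miles = {("A", "X"): 100, ("A", "Y"): 300, ("B", "X"): 200, ("B", "Y"): 50}

cost, plan = rebalance(surplus, deficit, miles)
```

In this made-up instance the cheapest plan sends one trailer from A to X, one from A to Y, and one from B to Y, for 450 empty miles. A production formulation would add the operational constraints and timing that the toy ignores.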
One of my favorite aspects of the job at Convoy was working closely with the Operations Team to iteratively improve and refine the constraints and objectives that were informing my algorithm’s automated rebalancing decisions. With that as context, I was really excited when an opportunity came up to be specifically embedded with the Operations Team at Stripe. Stripe is a fintech software company that builds economic infrastructure for the internet. Stripe serves millions of international businesses, from small companies that have only a few employees up to enterprise-scale customers like Lyft, Instacart, Slack, and Amazon. By making it simple for businesses of all sizes to securely send and receive payments online (including payments that cross international boundaries and currencies), Stripe empowers its customers to focus on their own primary businesses.
Because of the COVID lockdowns, many businesses around the world have had to quickly move online or dramatically expand the share of their business that operates online.
To support Stripe’s large and diverse user base, we provide several channels for users to contact Support, including chat, email and requesting a callback. Planning the staffing needed to meet rigorous targets for response time in each channel requires time-series forecasts for a wide variety of specialized support offerings. I work on these time-series forecasting problems with a special emphasis on planning over time. Many choices about staffing and hiring must be locked in months in advance, so we need to think carefully about the uncertainty associated with different forecast time horizons and make intentional tradeoffs. Also, since we typically need to produce forecasts for a large number of quantities that evolve very differently over time, we need to build automated model training and model selection that can provide robust performance with minimal human input.
What’s particularly special about fast-growing tech companies?
In both companies where I’ve worked, scientists have access to high-ownership, high-impact opportunities. Rather than exploring how to tweak a sophisticated system that is already functioning well, a scientist at a fast-growing tech company will typically have a larger role in formulating the problem, iterating on the problem statement, and designing a first fundamental approach. My favorite explicitly-stated company value at Convoy was, “Love Problems, Not Solutions.” At an early-stage company, it is critically important for all team members to redirect focus to make sure we are solving the right problems, and to be wary of intriguing but distracting technical rabbit holes: what we decide to prioritize can truly have a material impact on the success of the company.
Next, in my experience, if a fast-growing company decides that it can afford to allocate a substantial chunk of science time to a project, that decision will be paired with a sense of urgency and a commitment to resource the project so that it moves from model to deployment quickly. In conducting interview loops, I’ve frequently chatted with scientists from larger, older organizations where large projects were shelved, sometimes after years of scientist effort. While killing a long-term project to redirect resources may be highly rational for a company, it can be very discouraging for the individual contributors. Being early in my industry career, I want to gain experience on the entire project pipeline: from basic formulation to full deployment. Building up my sample size on how to make each phase of this process run smoothly is both really personally engaging and an important aspect of my professional growth as an industry scientist.
Last, I’ll mention that as a postdoc, and in a couple of subsequent interdisciplinary academic collaborations, I absolutely relished having access to collaborators whose deep expertise was different from my own. Hashing out models on a whiteboard with economists and ecologists was so fun it made me wonder: could I be doing Science as a full-time job? For me, direct access to these different highly-developed mental models and ways of reasoning is really the intellectual spice of life! Lots of factors shape who your daily contacts are in an industry position. Working at tech companies, my primary collaborators have been terrific engineers and super sharp domain experts, which I have found to be an awesome learning opportunity. At Stripe, the Data Science organization encourages embedded scientists to sit with their stakeholders 3 days a week and with our data science colleagues 2 days a week: this is one of the things that makes me most excited about offices reopening after COVID.
Translatable skills from academic training:
At Stripe, my data science colleagues have a wide variety of different technical backgrounds: biology, physics, epidemiology, etc. While I currently work under a general Data Scientist title, the problems I’ve worked on in industry are mostly related to core concepts from my graduate training in Operations Research: algorithms, spatial optimization, decision making over time, stochastic processes, queuing, statistics, concepts about robustness and quantifying performance, and so on.
Outside of my technical familiarity with certain topics, a lot of the value I bring to my team is connected to habits of mind from my mathematical training. First, in both pure and applied math research, one of the great pleasures is inventing language to formalize initially fuzzy ideas. The right language, or even, the right evocative name, can set off a cascade of useful reasoning, exposing paths forward that were initially invisible. This is absolutely a phenomenon in industry as well. Inventing the right language can accelerate iteration towards the most meaningful questions and can unblock teams.
Second, to listen to mathematics talks productively, you often can’t rely on understanding each detail in linear order: instead you have to practice a kind of “modular ingestion.” In real time, this involves listening for the high-level structure of the reasoning, and being able to fluently black-box and un-black-box various sub-modules to investigate details after the fact. These listening habits are incredibly powerful in trying to ingest messy real-world settings from experts who know countless corner cases, caveats, and lurking hazards. Another listening skill is habitually building an expanding survey of “typical approaches” used to reason about major classes of problems. Almost every academic I know does this in the background: being sensitive to when you are encountering a new mode of reasoning or when a problem fits a mode of reasoning you recognize is a huge asset in an industry setting.
Finally, in certain ways, the intense specialization of mathematical training can be quite good preparation to navigate life as a generalist. Mathematical arguments often contain long sequences of logical implications. Mathematicians build up strength in an unusual muscle: the ability to peel back 80% of their own mental model and rebuild implications very quickly based on the introduction of a new assumption, or in response to a flaw exposed early in the chain of reasoning. In industry, and particularly at fast-growing tech companies, data scientists often meet new problems, features, and obstacles weekly: the ability to quickly iterate and pivot your mental model (and to bring others along with you) is like a superpower!
Exploration vs. Exploitation:
Every career path offers a different collection of rewards. To say that we often face large decisions without the benefit of full information is an understatement. For me, a really successful strategy has been to pay close attention to what I find personally engaging on a daily basis. If you don’t have access to direct experience about a future direction you are considering (e.g. an internship), how could you start to gather evidence? Seeking out others who have taken paths that you are considering (and reading articles online…) is a great place to start. Asking concrete questions about how someone spent their last 40 hours at work can be illuminating. Don’t worry if you aren’t already sure what path is right for you: learning your own preferences and forming a vision of the type of exercise you’d like to be giving your brain in 2-3 years is a serious investment.