At INRIX, we value our incredible employees across the world and appreciate the uniqueness of each individual. Each month, we plan to shine a spotlight on our employees at INRIX to discover more about how they got started in their professions, how they have benefited from their INRIX careers, and who they are outside of work. In our first employee spotlight blog, we introduce you to Joshua Kidd, a Data Scientist at our INRIX office in Altrincham, Manchester. Joshua shares his passions for solving problems, creating data visualizations with maps, and hosting Dungeons and Dragons games with friends.
Which INRIX office do you work out of? What do you do there?
I work in the data science department in Altrincham, Manchester in the UK. Most of our team is based here, with a couple of other data scientists based around the US. The problems we work on day to day vary a lot, but generally we use machine learning models and statistical methods to solve problems and derive insight from the truly massive amounts of data at INRIX.
The European HR, service ops, and finance teams all work out of the Manchester office as well. Obviously, we’ve all had to work from home during COVID, and one of the main things I’ve missed is the chance to catch up with the rest of the office. Pre-COVID, I was part of “team fun”, which organized things like barbecues, ice skating, murder mysteries, etc. Hopefully, with more of us in the office, we’ll be able to start them up again. We’ve got a few ideas kicking around which we don’t want to spoil.
What brought you to INRIX?
INRIX is actually my first job out of university. I originally joined as a data analyst and worked my way up to be a data scientist. One of the things that impressed me the most when I applied, and what’s kept me here, is how interesting and varied the problems INRIX works on are. One project may be using computer vision to detect bikes from traffic cameras, the next might be forecasting traffic using graph neural networks. I’ve also appreciated the opportunity to learn from lots of very intelligent colleagues from a lot of different backgrounds.
Plus, I get to make lots of pretty visualizations of maps, which I’d probably be doing in my free time anyway.
Where did your interest in data science begin?
In the UK, you typically specialize in one subject at university, unlike the US, where I believe you take classes in quite a few before deciding on your major. I studied Mathematics and Statistics at Lancaster University here in the UK; it has always been my favorite subject. One of the more enjoyable modules was on medical statistics. Up until that point, a lot of what we had been learning was interesting but somewhat abstracted from real-world problems.
It was satisfying to see how the specific knowledge we’d been gaining was actually applied, as well as just how important domain knowledge is. The things we learnt in that module, like how to structure and stratify trials, or how to evaluate historic data, all have a direct bearing on the kind of work I do now. There’s just something rewarding about being able to sort through data, step around all the potential pitfalls, and measure clearly and concretely how useful the work you’ve just done is.
As data has grown and become so important to companies, can you tell us what a data scientist does? What does a typical day look like for you?
Data scientist as a professional title is still relatively new, and what a data scientist is expected to do will depend an awful lot on the company they work for. Ultimately, the aim of a data scientist is to gain actionable insights from data, although that definition is probably too concise to be helpful. A key difference between the work of a data scientist and an analyst is that a scientist will use advanced statistical techniques like machine learning not only to interpret the meaning of a dataset but also to build a model that forecasts future events.
As an example, a scientist might work on a dataset of information about visits to a store website, work out which information is relevant to predicting whether a customer will purchase an item, and then produce a model to rank products so that those most likely to be purchased are at the top.
At INRIX, a lot of our focus is on research and development. So, we’ll typically spend several months exploring whether new features or model architecture can be used to improve one of our existing algorithms, or even create an entirely new product.
What’s your favorite project that you’ve worked on since you’ve been at INRIX? Why?
One I’ve been working on recently is something called mode prediction. We receive GPS location information from lots of different providers, which we use to predict speeds on roads. If the data comes from an inbuilt sat nav, we can be sure that the GPS points we are getting come from a car. However, some of our data comes from mobile navigation apps, and users may be taking other forms of transport like bikes, trains, walking, etc. The aim of the model was to predict the ‘mode’ of a trip based on its GPS points.
What made this project enjoyable is that I got to apply an interesting new ML method that we as a team have been talking about, called active learning. Although we have lots of GPS traces, they’re generally not labelled, and if we want to build a supervised model, we need to train it on data we already know the correct answer to. We can manually label GPS traces, but it’s slow work. With active learning, we use an iterative process where we make an initial version of the model, run that against a pool of traces, then work out which traces the model is least confident about. We use this to determine which data to manually label for the next iteration of the model, which is much more efficient than selecting randomly.
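For readers who would like to see the idea in code, here is a minimal sketch of that loop. It is not the INRIX model: the single feature (a trace's mean speed), the 30 km/h `oracle` rule standing in for a human labeller, the tiny logistic regression, and all function names are assumptions made up for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(xs, ys, epochs=500, lr=1.0):
    """Fit p(car | x) = sigmoid(w * x + b) with plain gradient descent."""
    w = b = 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


def most_uncertain(probs, k):
    """Return the positions of the k probabilities closest to 0.5."""
    order = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return order[:k]


def oracle(speed_kmh):
    """Stand-in for manual labelling (hypothetical rule): 1 = car, 0 = other."""
    return 1 if speed_kmh > 30 else 0


def active_learning(pool, rounds=3, batch_size=5):
    """Label only the traces the current model is least confident about."""
    # Seed set: the slowest and fastest traces in the pool.
    slowest = min(range(len(pool)), key=lambda i: pool[i])
    fastest = max(range(len(pool)), key=lambda i: pool[i])
    labels = {i: oracle(pool[i]) for i in (slowest, fastest)}

    for _ in range(rounds):
        idx = list(labels)
        # Speeds are divided by 100 to keep gradient descent well behaved.
        w, b = train([pool[i] / 100 for i in idx], [labels[i] for i in idx])
        unlabelled = [i for i in range(len(pool)) if i not in labels]
        probs = [sigmoid(w * pool[i] / 100 + b) for i in unlabelled]
        # Ask the "human" to label only the least confident traces.
        for pos in most_uncertain(probs, batch_size):
            i = unlabelled[pos]
            labels[i] = oracle(pool[i])

    idx = list(labels)
    model = train([pool[i] / 100 for i in idx], [labels[i] for i in idx])
    return model, labels


# Synthetic pool of mean trace speeds in km/h (made-up data).
rng = random.Random(0)
pool = [rng.uniform(0, 120) for _ in range(200)]
model, labels = active_learning(pool)
print(f"Manually labelled {len(labels)} of {len(pool)} traces")
```

Choosing the predictions closest to 0.5 is one common way to measure "least confident" (often called uncertainty sampling); other selection rules would slot into `most_uncertain` without changing the rest of the loop.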
What advice would you give someone in college or just starting out in their career who wants to get involved in your field?
I would say a key one is to focus on applying your knowledge practically. One of the things I’ve found most important when interviewing candidates is their description of projects they’ve worked on previously. You must be able to answer questions like: what was the aim of your project, how did you determine which model was most suitable, how did you choose features, and how did you measure success? Kaggle can be useful for this if you’re just starting out, as they have lots of pre-generated datasets and problems for you to work on. Make sure you’re not just copying others’ code though, and that you can fully justify all the decisions made.
Another piece of advice would be to look for meetups in your city. They can be very useful both for interesting talks on data science and for networking. Oftentimes, companies will run hackathons, which are an opportunity to apply some of your knowledge, and maybe learn from your peers.
Code-wise, Python is valuable and relatively easy to learn. Get very familiar with using Python libraries like NumPy, SciPy and pandas, as well as JupyterLab. Data will often be stored in databases, so knowledge of SQL is also useful.
Communication skills are also very important. A key part of the job is clearly communicating findings to non-technical members of the company. A final pet peeve of mine: always label your graphs!
What is something that you’re passionate about outside of work?
It’s probably no surprise that I’m a massive nerd, and most of my hobbies reflect that. I’ve been running a Dungeons and Dragons game for my friends for around 4 years now, and it’s always a laugh to spend several hours coming up with interesting stories to share with them. We play a fair amount of board games too. My favorite is a Dutch one called Food Chain Magnate, where you try and set up rival fast food companies. Avery Alder has another fun one called The Quiet Year, where you collaboratively tell the story of a community after the collapse of civilization by taking turns to expand a map.
Also, I absolutely love hiking. Now that international travel is safer, I’m definitely planning on getting up into the Swiss Alps. The area around Kandersteg is one of my favorite places.
If you could have dinner with anyone (dead or alive), who would it be and why?
It’s probably one of the most common answers, but I’d have to go with Carl Sagan. He was an incredible science communicator, and it’d be fascinating just to listen to him for an evening. He had a real ability to distill vast or complex ideas, like the scale of the universe or time. I still listen to him talking about the pale blue dot every now and then. “A mote of dust suspended in a sunbeam”, beautiful.
What is one thing on your bucket list?
I’ve been pretty lucky with ticking things off, to be honest. However, one of the wilder ideas that a friend of mine and I would talk about is the Mongol Rally. It’s a massive race, starting from Prague and driving all the way to Ulan-Ude in Russia, about 4000 miles as the crow flies (the actual route would be much longer). You choose your own route, whether that’s up near the Arctic Circle or down past the Caspian Sea through Iran. You’re really supposed to just choose your route as you go, with no planning. If your car breaks down, you have to fix it yourself and you can’t ring up for help. And it’s likely that it will break down, because the third rule is that it has to be a terrible car: the engine must be under 1 litre, or 1000cc. The final bit of the rally is across the Mongolian steppes, hence the name.
Would you like to join the Manchester office at INRIX? We’re currently hiring for a Senior Data Scientist in our Manchester office. Learn more about working at INRIX and our open roles.