

A Q&A With Chief Medical Officer at Labfront Dr. Andrew C. Ahn
About Time Series Analysis
Let me know if you are interested in sharing your career journey. You can email me at ahobby@healthdatasciencenewsletter.com. I will do my best to get to all the requests.
Summary
Andrea Hobby interviewed Dr. Andrew Ahn, the Chief Medical Officer of Labfront and an Assistant Professor of Medicine & Radiology at Harvard Medical School. These are just a sample of the job titles he holds.
Can you briefly give an overview of your career? Also, discuss your area of research.
My area of specialization is the mathematical analysis of physiological time series. Physiological time series is a rather broad category, so it can encompass heart rhythms, respiratory rate, ECGs, EEGs, and accelerometer data: any data that can be captured in real time or continuously from an individual, whether in a healthy or a sick setting. And the question is, how did I get into this field? The way I got here was roundabout.
I finished my residency in Internal Medicine at the University of Michigan. Then I did two years of what's called hospital medicine, where I cared for sick patients admitted to the hospital. I decided that that type of work did not appeal to my overall interests, and ultimately, I decided to go into research.
I was very interested in integrative medicine, which was, in the past, called alternative medicine. I did a three-year fellowship in general medicine and complementary and alternative medicine. Through that fellowship, I started to look into acupuncture, particularly its mechanisms of action. What are acupuncture meridians and points? One of the things associated with acupuncture points and meridians was their electromagnetic-like properties. And so as a result of that, I got into biophysics.
Along the way, as a part of the fellowship, I got an MPH at the School of Public Health at Harvard. This got me into learning biostatistics and epidemiology, getting used to a little Stata and much more SAS. SAS is heavily used at the Harvard School of Public Health. Also, we were allowed to cross-register for courses at one of the local schools in Boston, and I decided to cross-register for classes at MIT.
The course was called nonlinear dynamics and chaos theory. I did horribly in the class because, although biophysics was my undergraduate major, I had forgotten all my biophysics and all the math. So I struggled through it. But it exposed me to the idea that there's a lot of hidden information contained in continuous data.
And that got me learning MATLAB. Then, while I was doing my acupuncture research, I continued that through an NIH Career Development Grant. So I continued to take these graduate-level courses at MIT, where I continued to learn MATLAB and became comfortable dealing with physiological time series. After my career development grant had expired, I realized that acupuncture had limited funding, so I shifted to analyzing physiological time series, particularly from wearables.
And that's where I became the Associate Director of the Center for Dynamical Biomarkers. It's a lab at Beth Israel Deaconess Medical Center headed by Professor CK Peng, one of the pioneers in nonlinear dynamics in health. After that, I ultimately moved into this Chief Medical Officer role at Labfront, where we predominantly deal with digital health devices.
Can you talk more about your work in physiological time series data?
At Labfront, we are a mission-oriented startup, and our goal is to, quote-unquote, democratize health science. The way we do that is to give access to the data that you get from these wearable devices. Unfortunately, many of these companies, like Apple and Fitbit, lock up that data. What they deliver to you is data that is usually processed through some black box algorithm, and we don't know how they arrive at it. So what we're doing at Labfront is partnering with companies such as Garmin or Polar, who are willing to make the raw data available to the users, including citizens and researchers. We help make the data available to the users, and our specialty is to help analyze the data.
The Garmin devices we commonly use are wristwatches. From the wristwatch, you can get a lot of data, including accelerometer data from which you can derive the number of steps. You can also derive sleep, but that's all through the accelerometer. You can also get heart rate from the watch. You may have noticed that the back of the watch lights up green or red. The green light detects the pulse, essentially the pulse rate.
The green does a better job of reflecting against hemoglobin, the molecule in the red blood cells that absorbs light. For whatever reason, green works best in the reflective mode. So the green light is emitted from the back of the watch, and the reflected light is picked up by a sensor on the watch. That is why the green system works better for heart rate. But on occasion, you may notice that the watch turns red. The reason is that red serves a different function: it's not just measuring heart rate, it measures oxygen saturation.
So we work with that data. Our primary clients, or target audience, right now are academic researchers, who are increasingly doing tests and trials outside of the hospital, in the home and work setting. COVID has catalyzed this change. A lot of patients don't want to come into the hospital, so remote monitoring became much more popular. As a result, we are helping researchers start to collect data from home. This data includes survey questionnaires. So, if you want to know how the person slept, their stress level, and their mental health status, we can collect that information and integrate it with the data you get from wearables, such as Garmin.
Can you discuss some of the challenges or limitations of working with physiological time series data? How are you overcoming it?
The biggest challenge is that the data we're getting tends to be noisy. People are wearing watches, some of them wear them loosely, some of them are doing vigorous activities; a lot of factors come into play. It is quite different from the highly controlled laboratory setting that you get in the hospital. The biggest challenge is to figure out how to deal with that noise. How do you determine whether the signal you're getting is truly physiological or is noise? As you learn, you develop certain tricks of the trade over time. For instance, in heart rhythms, you get beat-to-beat intervals from the watch. Often, they are less accurate when someone is running or very physically active. So what you can do is analyze the heart rhythms while also taking a look at the accelerometer data. If the accelerometer shows the person is extremely active, that heart rhythm is unlikely to be accurate, and sometimes you have to ignore it. There are also other sorts of signal-processing tools. If there are too many irregularities, there are certain statistical rules you can apply that also eliminate the noise. So we do a combination of things to remove the noise.
The other challenge is that physiological time series is not as widely taught as the more prominent data science fields. In the health field, you have the population health experts, the epidemiologists, whose methods are more widely applied. There are large available datasets out there. On the reverse side, you have a lot of molecular data, like genomics, metabolomics, or proteomics, and the whole omics field. So you have these two ends, the macroscopic and microscopic scales, where data science has matured and a number of courses are available. But physiological time series, at the mesoscopic scale, is less widely incorporated into what's defined as data science. So data science has these connotations of either the big or the small scale.
From that standpoint, it's hard for someone interested in getting into this field to know how to deal with this kind of dataset. They can learn from fields like acoustics, where people deal with sound, and there's certainly a whole field of signal processing. But the question of how you apply that to physiological data is challenging.
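The two "tricks" described above, ignoring beats recorded during heavy motion and applying a statistical rule to irregular beats, can be sketched roughly as follows. This is a minimal illustration, not Labfront's actual pipeline: the function name, the 1.5 g motion threshold, the 5-beat window, and the 20% deviation rule are all assumptions chosen for the example, and the sample values are made up.

```python
from statistics import median


def flag_unreliable_beats(rr_ms, motion_g, motion_limit=1.5,
                          half_window=2, max_rel_dev=0.2):
    """Return one boolean per beat: True means "treat as noise".

    rr_ms    -- beat-to-beat intervals in milliseconds
    motion_g -- accelerometer magnitude (in g) aligned to each beat
    All thresholds are illustrative assumptions, not validated values.
    """
    if len(rr_ms) != len(motion_g):
        raise ValueError("rr_ms and motion_g must have the same length")

    flags = []
    for i, (rr, motion) in enumerate(zip(rr_ms, motion_g)):
        # Rule 1: heavy movement makes the optical signal untrustworthy.
        too_active = motion > motion_limit

        # Rule 2: an interval far from the median of its neighbours
        # (here, more than 20% away) is treated as an artifact.
        lo = max(0, i - half_window)
        hi = min(len(rr_ms), i + half_window + 1)
        local_median = median(rr_ms[lo:hi])
        outlier = abs(rr - local_median) > max_rel_dev * local_median

        flags.append(too_active or outlier)
    return flags


# Made-up example: one implausibly short interval, then two beats
# recorded while the wearer was moving vigorously.
rr_ms = [810, 805, 820, 400, 815, 790, 780, 770, 800]
motion_g = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.4, 2.6, 1.0]

flags = flag_unreliable_beats(rr_ms, motion_g)
clean = [rr for rr, bad in zip(rr_ms, flags) if not bad]
print(clean)
```

Note that the two beats recorded during motion are dropped even though their values look plausible, which is the point of checking the accelerometer alongside the heart rhythm.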
Where do you see the field evolving in the next five or ten years in health data science and the area of physiological time series?
Yeah, that is a good question. Certainly, devices will continuously develop and improve. The big area will be multi-modality, where you have all these datasets coming in from different types of sensors but integrated into a single platform. You could see how physical activity affects your blood glucose levels, your blood pressure, or your mental activity using EEGs. The multi-modal aspect of using the internet of things to appreciate the intricate network of activities across the sensors will be an exciting field. The second area is in signal processing, for sure. Physiological datasets tend to be nonlinear and non-stationary. Because they're nonlinear and non-stationary, they typically work poorly with standard signal processing methods such as Fourier or wavelet transforms. So more adaptive methods are going to be more popular. Those are the two biggest areas in the next five years.
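One way to see why a plain Fourier transform struggles with non-stationary data is a toy example like the one below. It is only a sketch built on made-up sine waves, not physiological data and not a method from the interview: two signals contain the same slow and fast rhythms in opposite order, yet their overall Fourier magnitude spectra are identical, so the spectrum alone cannot say when the rhythm changed. The helper name `dft_magnitudes` and the signal parameters are assumptions for illustration.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of the discrete Fourier transform (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


half = 32
# A slow rhythm (2 cycles per half) and a fast rhythm (8 cycles per half).
slow = [math.sin(2 * math.pi * 2 * t / half) for t in range(half)]
fast = [math.sin(2 * math.pi * 8 * t / half) for t in range(half)]

# Same ingredients, opposite order in time.
slow_then_fast = slow + fast
fast_then_slow = fast + slow

mag_a = dft_magnitudes(slow_then_fast)
mag_b = dft_magnitudes(fast_then_slow)

# The signals differ, but their magnitude spectra match.
signals_differ = slow_then_fast != fast_then_slow
same_spectrum = all(math.isclose(a, b, abs_tol=1e-9)
                    for a, b in zip(mag_a, mag_b))
print(signals_differ, same_spectrum)
```

The timing information is carried only in the phase, which is part of why methods that adapt to changes over time are attractive for signals whose rhythm shifts.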
Do you have any books, resources, articles, or tools you suggest people read to learn more about your research area?
Well, that's a good question. We have a blog on our website; Labfront.com has a lot. Also, we pay attention to other fields, such as acoustic sciences or mechanical engineering, where they assess the fragility of a machine by analyzing its vibrations. You see what techniques they use in these different fields. They focus much more on time series, and that's where I get a lot of information. Then, I like NCBI. I like to see what research is happening in the heart rate variability or the accelerometer field. I get weekly notifications about new research, which helps me keep up to date on what processing techniques are out there.
Anything else that you'd like to share?
The important thing when doing data science at this mesoscopic level or scale (that's just my term) is to appreciate the underlying physiology. We see engineers apply standard tools to biological systems such as the human body or human activity without that appreciation. Heart rate variability, for instance, involves a very complex set of physiological and biological mechanisms that are underappreciated by many of these engineers. A physiologically informed signal analysis will be how we really understand and optimally apply these tools in the future.