In today’s blog we chat with Dr. Rahul Shetty, who holds a Master’s in statistics from Cornell University and a PhD in astronomy from the University of Maryland. Following his PhD, Dr. Shetty worked at the Center for Astrophysics at Harvard and at the Institute for Theoretical Astrophysics at Heidelberg University before becoming a data scientist. The experiences he describes are unfortunately not uncommon, but sharing them today is part of the conversations we hope will ultimately contribute to making STEM a better place.
David: Hi Rahul. Can you give us some background starting from birthplace all the way to your doctoral studies and beyond?
Rahul: I was born in India, and moved to the Philippines when I was 3 years old. My dad was an agronomist, and his research took us to West Africa two years later. That’s where I spent much of my childhood, mostly Mali, and a couple of years in Nigeria and Niger. Subsequently, my parents sent me to boarding school in the US, where I remained through my bachelors and PhD. After a total of 9 years as a postdoctoral researcher, including a “sabbatical” for studying statistics, I decided to leave academia and started to work as a data scientist in a financial-technology company in Amsterdam.
The two years there were a great introduction to the private sector. In order to diversify my experience, I consulted with a number of mid-size companies and larger corporations. That was also an invaluable experience which broadened my perspective on how data and machine learning can be harnessed for optimizing various aspects of a business. For the last year and a half, I have been leading the data science efforts at Leap, which is a startup focusing on balancing the electric grid, in turn minimizing the impact of energy consumption on the environment.
David: What are the main questions in your field of expertise?
Rahul: The overarching question in my field is: How do stars form in the interstellar medium (ISM)? There are many secondary questions: Is star formation regulated and if so, how? How does turbulence in the ISM affect star formation, and vice versa? I was also interested in how we can accurately identify the physical properties of the ISM from the observations.
I started my research career as a theorist, building models of the ISM in spiral galaxies with my advisor Eve Ostriker. Using MHD simulations, we investigated how the gas in spiral arms collapses to form molecular clouds. We also injected energy representing supernovae to follow the subsequent evolution of the ISM in spiral galaxies.
David: Which offset star formation?
Rahul: Yes - The supernovae dispersed the molecular clouds, ceasing star formation in those clouds. But, as these events drove turbulence and created expanding shells, the shells would eventually collide to form a new generation of molecular clouds. As a postdoc I worked with observers on understanding the 3D structure of clouds, and the properties of dust in the ISM. We critically assessed techniques that are commonly employed to infer the spectral energy distribution (SED) of the dust from infra-red (IR) observations. We found that noise severely impacted the conclusions drawn from the fitting results. By applying standard techniques to simple dust models, it turned out that the fits produced an artificial anti-correlation between two physical parameters defining the SED, very similar to trends commonly estimated from analyses of real observations.
David: So you proved that these weren’t physical correlations?
Rahul: To a large degree. We concluded that we could not rule out that the observed anti-correlation is artificial. We showed why such a correlation naturally arises statistically due to the form of the SED and the uncertainties in the observations. Following that investigation, I started seeking out an appropriate solution. This is when I had the very good fortune of running into a brilliant astro-statistician, Brandon Kelly. He suggested that we could use Bayesian methods to untangle the effects of uncertainties. We decided to collaborate and came up with a hierarchical Bayesian method that handles the effects of the uncertainties in order to estimate the underlying properties. Brandon’s method was remarkable as it could accurately recover the true underlying correlations of the dust models.
David: And how different were they from the previous correlations found from observations?
Rahul: Very different. The shape of the SED is determined by two parameters: the temperature, which primarily impacts the Wien regime of the SED, and the slope of the Rayleigh-Jeans tail. The latter, called the spectral index, encodes information about the composition of the dust. The temperature and spectral index are fit at each pixel of a map, and the fits often produced an anti-correlation between these two parameters. But if you push the SED tail one way, you have to pull the temperature in the opposite sense to compensate. The observations are of course noisy, and even uncertainties on the order of a few percent or less are enough to throw off the fit significantly. When you are trying to model this complex SED with only a small number of IR fluxes, there’s not enough information to accurately estimate the temperature and the index. You can create a very simple dust model, add a little bit of noise to the fluxes, and find an anti-correlation in the estimated parameters when there’s actually no correlation in the model.
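The experiment described in that last sentence can be sketched in a few lines of Python. This is only an illustration, not the code used in the original study: the five wavelengths, the true temperature (20 K) and spectral index (2.0), the 5% noise level, the grid ranges, and the function names `greybody` and `simulate_fits` are all assumptions made for the example. Every simulated SED shares the same true parameters, so any correlation among the fitted values comes purely from noise and the fitting procedure.

```python
import numpy as np

H_OVER_K = 4.799e-11   # Planck constant / Boltzmann constant [K s]
C_MICRON = 2.998e14    # speed of light [micron / s]


def greybody(nu, temp, beta):
    """Unnormalised modified blackbody: nu^(3 + beta) / (exp(h nu / k T) - 1)."""
    x = nu / 1e12  # rescale frequency to keep the numbers well behaved
    return x ** (3.0 + beta) / np.expm1(H_OVER_K * nu / temp)


def simulate_fits(n_real=300, noise=0.05, seed=0):
    """Fit many noisy copies of ONE dust model by minimum chi-square on a grid."""
    rng = np.random.default_rng(seed)
    # Assumed far-IR bands (microns), converted to frequency in Hz.
    nu = C_MICRON / np.array([100.0, 160.0, 250.0, 350.0, 500.0])
    truth = greybody(nu, 20.0, 2.0)      # assumed true T = 20 K, beta = 2
    sigma = noise * truth                # assumed 5% flux uncertainties

    # Trial temperatures and spectral indices.
    temps = np.linspace(10.0, 40.0, 121)
    betas = np.linspace(0.5, 3.5, 121)
    tt, bb = np.meshgrid(temps, betas, indexing="ij")
    model = greybody(nu, tt[..., None], bb[..., None])   # (n_T, n_beta, n_band)
    weight = 1.0 / sigma**2
    norm = (model**2 * weight).sum(axis=-1)

    fits = np.empty((n_real, 2))
    for k in range(n_real):
        flux = truth + rng.normal(0.0, sigma)
        # The best amplitude for each (T, beta) follows from linear least squares.
        amp = (flux * model * weight).sum(axis=-1) / norm
        chi2 = ((flux - amp[..., None] * model) ** 2 * weight).sum(axis=-1)
        i, j = np.unravel_index(np.argmin(chi2), chi2.shape)
        fits[k] = temps[i], betas[j]
    return fits


fits = simulate_fits()
r = np.corrcoef(fits[:, 0], fits[:, 1])[0, 1]
print(f"correlation between fitted T and beta: {r:.2f}")
```

The fitted temperatures and spectral indices scatter along a negatively sloped band even though the underlying model has no correlation at all, which is the degeneracy described above.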
When employing a chi-square fit to estimate the SED at each pixel, a crucial implicit assumption is that the pixels are independent. The underlying assumption is that temperatures and spectral indices are independent. Yet, the results always produced a strong negative correlation, which violates the assumption of independence. The problem is that the implicit assumptions are ignored and/or that their consequences are simply unknown. One major advantage of the hierarchical Bayesian method was that we could relax the assumption of independence, and actually investigate whether and to what extent there might be correlations between the SED parameters. We first applied the new technique to model data, and demonstrated that it could accurately recover the true underlying correlations between the SED parameters.
The model fitting results were very encouraging. At that point we applied it to real data from the Herschel Telescope. Again, we saw the anti-correlation using the traditional method, but we found a slight positive correlation from the Bayesian method. We speculated that the estimated trend might be due to grain size and grain growth in the dense ISM prior to the formation of young stars. We were no dust experts, so we cautioned that these results needed further analysis. Nevertheless, that was rather interesting and led me to consider how common statistical methods and the underlying assumptions might affect other observational results. I learned a lot about statistical rigour and I wanted a broader understanding of hierarchical methods. That’s when I decided to invest more time into statistics.
I subsequently investigated the Kennicutt-Schmidt (KS) relationship and found a very similar story there too. Researchers here were using the so-called bisector fit. Based on a publication from 1990 in the Astrophysical Journal, many investigations have relied on the bisector, which essentially averages the parameters of two linear regressions: one estimating the mean value of x given y, and the other estimating y given x. This is not recommended at all. It produced a linear KS relation between CO and star formation. But again, the reason you get that is the highly flawed method. One can straightforwardly demonstrate that the bisector produces a linear relationship from a number of different noisy models, and that it is the direct consequence of averaging over the x given y and the y given x methods. There’s absolutely no reason to average over two fitting results.
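A small numerical sketch makes the point concrete. The setup is an assumption chosen for illustration, not data from any survey: mock log-log values with a true sublinear slope of 0.7 and scatter only in y. The variable names `log_gas` and `log_sfr` and the helper `regression_slopes` are hypothetical, and the bisector slope is computed with the usual angle-bisector formula for the two ordinary least-squares lines.

```python
import numpy as np


def regression_slopes(x, y):
    """Return the slopes of OLS(y|x), OLS(x|y) (expressed as dy/dx), and their bisector."""
    dx = x - x.mean()
    dy = y - y.mean()
    sxx = (dx * dx).sum()
    syy = (dy * dy).sum()
    sxy = (dx * dy).sum()
    b_yx = sxy / sxx   # regress y on x
    b_xy = syy / sxy   # regress x on y, then invert to a dy/dx slope
    # Slope of the line that bisects the angle between the two fitted lines.
    bisector = (b_yx * b_xy - 1.0
                + np.sqrt((1.0 + b_yx**2) * (1.0 + b_xy**2))) / (b_yx + b_xy)
    return b_yx, b_xy, bisector


rng = np.random.default_rng(1)
log_gas = rng.normal(1.0, 0.5, 5000)                    # assumed spread in x
log_sfr = 0.7 * log_gas + rng.normal(0.0, 0.3, 5000)    # true slope 0.7, noise in y only

b_yx, b_xy, bisector = regression_slopes(log_gas, log_sfr)
print(f"y given x: {b_yx:.2f}   x given y: {b_xy:.2f}   bisector: {bisector:.2f}")
```

Under these assumptions the y-given-x regression recovers the sublinear slope, the inverted x-given-y regression is pulled well above it by the scatter, and the bisector lands in between, much closer to a linear relation than the true slope of 0.7.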
We applied a hierarchical Bayesian fitting algorithm to investigate the differences in KS relationships in galaxies, as well as a possible universal relationship. Using published data, we found that there is no single KS trend, and that on average there is a sublinear relationship in the population of galaxies we investigated. To us this was an interesting finding with important consequences, but it unfortunately perturbed a number of researchers. In the case of the dust SEDs, I could understand the reaction from senior scientists who had spent decades studying dust, for whom the work of some young postdocs is not accepted easily.
But in the case of the Kennicutt-Schmidt relationship, I was surprised at the resistance, and sometimes hostility, because the researchers involved were basically my age. They were very resistant, borderline aggressive, in conferences and referee reports. They did not want to accept that it was remotely possible that their previous work produced inaccurate results.
David: Do these people all have jobs in astronomy?
Rahul: Yes. Many of them now have permanent positions.
David: Did you get an actual rational, reasonable, criticism to what you were doing?
Rahul: No, it was not reasonable at all. There were many discussions with the researchers who worked on dust SEDs and the KS relationship. A large part of those discussions focused on potential problems with the data, or on the claim that the variations had been discussed in previous works. There was strong resistance to the findings that the data portrayed a relationship between parameters that was different from prior work, but those particular points were only given cursory attention during our discussions. There appeared to be a reluctance to fully understand or accept more statistically rigorous methods.
Another type of resistance I received had to do with the mechanics of Bayesian methods. There is a misperception that having to deal with prior information is somehow a major impediment. I would attempt to explain that incorporating prior information is actually advantageous, and in Bayesian statistics one explicitly defines the priors. One can assume broad or narrow priors, based on previous work, and those are accurately incorporated in the current investigation. Furthermore, it is relatively easy to modify the priors if new information arises, and/or simply to test an alternative hypothesis. Traditional frequentist methods, by contrast, also carry underlying assumptions, such as the assumption of independence, which is effectively a constraining prior that can lead to highly erroneous parameter estimates. It appeared that people were grasping at reasons to discount Bayesian methods without fully considering the impact of their argument on their own work. I remember often thinking that, instead of trying to convince others of the importance of statistical rigour, I would rather be focusing the conversation on the underlying astrophysics.
I have to mention that I had solid support from my colleagues in Heidelberg. They were fascinated by the results and the potential of the statistical methods to further extract physical information from the observations accurately.
David: Basically astronomers don’t understand statistics?
Rahul: Many don’t, and more disappointing is the absence of motivation to develop and apply sound statistical methods. This is disappointing because rigorous statistics can reveal a lot of hidden information from the data. I think this lack of statistical rigour not only produces inaccurate results, but also potentially misses new and fascinating discoveries buried in the data. So there is this effort that needs to happen to learn statistics. That’s what you have to do in order to learn and apply robust methods to accurately derive physical properties from observations, but many simply do not make the effort to do it.
My experience of continually having to defend the statistics was rather unsatisfying and, in some respects, contradictory to my motivations for becoming an astrophysicist. People often get into academia because they’re curious and want to learn something, and perhaps eventually contribute to the field. What I was finding was that those who had been in the field for some time were often too invested in their previous results, and simply did not want to accept new results which might contradict their previous work.
David: We know that from the history of science. This is the way people respond. They don’t want to hear it from young people. We can understand that. But the issue is, how is the field structured that allows these people to win? That’s the problem.
Rahul: Certainly. That’s the problem.
David: Are there interesting cultural aspects of science and are they different between the places you have been?
Rahul: Between Germany and the US, for example, there are clear similarities. There’s definitely a mindset. People are very hard-working. But I think there is more pressure and stress in the US. There is more of a work-life balance in Germany. It’s totally fine for people to take 3 weeks off in the summer and maybe a week or 2 in the winter to go skiing and no one will bat an eye. In fact, they might bat an eye if you don’t do that. In the US, instead, it’s like ‘oh my God! You’re taking 3 weeks off.’ The same is true in industry in the Netherlands, where work-life balance is emphasized. People might look at you funny if you regularly work past 6 PM.
David: Is it hard-working, talented people that make it in academia?
Rahul: I do not think being hard-working and talented is sufficient for landing long-term positions in astronomy. Basically, all astronomers are, but if I remember the statistics, I think only 1 in 5 people remain in the field 10 years after their PhDs. Unfortunately, it’s also who you know and whose results you can advance. There is also a little bit of an old boys type of system where if you come from a certain group of people, and you toe the party line, then you are more likely to get a permanent position.
David: What about solutions?
Rahul: This is a tough one to address. Right now there’s such a big demand for people with mathematics and technical backgrounds in industry. Combining the intense pressure in academia with the potential in industry, including increased salaries, there are strong incentives for astronomers to leave academia. The positive news is that those thinking of leaving do not have to worry about their career prospects. This is great for individuals, but unfortunate from the point of view of the research field. Too many highly trained individuals end up leaving academia, ultimately slowing down progress in the field.
I find this situation rather unfortunate, again not from a personal point of view, but when considering the state of the field as a whole. I had a feeling that there might be a pandemic of burn-out in academia, which could be completely avoided because we are doing this to ourselves. We have built this system where in order to remain in academia, we have to publish and garner citations for past work. Large collaborations may split up the results, so that a large fraction of publications simply repeat what's in previous papers. There needs to be a change in this system. We need to change the whole evaluation system.
Additionally, we can be more open to new ideas, even if they might contradict prior work. It should not matter that later results supplant prior ones if there is valid evidence. We should all have the mindset that we did our best with the data, tools and methods on hand at the time. As new methods are developed and shown to provide more accurate results, the old understanding needs to be updated and be an acceptable part of the process, and not limit our progress.
David: I guess we need more people that are less defensive about their work and are more genuinely interested in getting at the way the world works, and less at building their legacies. We might need to learn to be happy being wrong. After all, when someone shows that you’re wrong, you’ve helped them get to where they are by working through your contributions. So you are participating in the process of getting toward truth.
Rahul: Yes - exactly. I think we can tone down the toxicity with a shift in mindset. That would decrease the pressure to publish and garner citations, and focus our attention on advancing the methods, contributing to furthering our understanding of nature.
David: Thank you Dr. Shetty!