Exactly one year ago I decided to leave academia and become a data scientist. For those who don't know, a data scientist is somewhere between a software engineer, a statistician, and an algorithm expert - essentially, they're the "?" that lies between "data" and "profit" for the modern company. UCSF and my adviser really were as good as it gets (amaaaazing by academic standards), but the post-doc scene is a total scam. We all know it's a scam, but no one wants to throw away 7-10 years of passion, labor, and the eventual recognition that comes with being one of the few world experts in...whatever it is you do.
It's hard to walk away from. Graduate students started coming to me for advice. And not just "how does this stupid machine work" advice, but "how should I do this analysis" and "what does this mean" advice. I would give talks without anxiety and stand my ground in disagreement with professors who had more high-tier publications than I'd manage in a lifetime. That's the gold of academia - correcting someone more famous than you in front of others. Young scientists live for that shit. At least I did, but I had a chip on my shoulder, which my graduate PI definitely nurtured (we were poor white punks with nothing to lose). Flash forward to a tech interview where some kid is asking you to calculate a confidence interval and your response of "I haven't done that in 10 years and I can just wiki it" does not fly. He thought you were an idiot before you opened your mouth, anyways.
The first step was to stop working on that alllllmost-finished manuscript from graduate school. Man, it was such a clean story, and the figures were so nice and polished. I told myself I'd finish it when I had a real job. My old PI could write it if he wanted to; the results and methods were all there. I had already given hundreds of hours of my own free time to it. But I really _learned_ something about neuromodulation of information processing in the neocortex. No one cares; it was time to move on. I gave myself three months to pick up Python, work on some machine learning side-projects, and study math/stats/CS before I started applying. Every morning and evening I'd stand on the Caltrain, coffee tumbler in one hand and phone in the other, explaining gradient descent or L1 versus L2 regularization to the auditor in my mind. It was time. I was gonna blow some minds. Maybe I'd even have a good gig by Christmas and buy my family really nice presents this year.
Compared to others with comparable talent, I was lucky to have gotten so many interviews. Against all logic and advice, I became obsessed with every company that I got a callback from and convinced that I was just one chat away from my dream job. I was wrong. That's not how tech in San Francisco works at all. That didn't stop me, nor did 6 months of rejection. Do you know what a data challenge is? It's when a company sends you data and gives you a few days to analyze it and then rejects you with no feedback. Every time it happened I'd count it as free training and tell myself that I was that much closer to being the most amazing data scientist ever. Sometimes I'd pass a first round coding or stats interview and think that the next door opened to a friendly culture-fit conversation. I was wrong. That's not how tech in San Francisco works at all. I started to believe that my gender was putting me at a disadvantage, which it was, but it was much more than that.
Some time in April, with a wicked cold and the nonchalance that constant rejection brings, I got a break in the form of an interview for a fellowship. The company, called Insight Data Science, is an incubator that helps PhDs transition into data science. There are no classes, just some direction and the opportunity to show off your skills to good companies. It's competitive, and for good reason: elitism in such a burgeoning field is absolutely priceless. The person who interviewed me was a woman and former academic, and for the first time I felt like someone actually _believed_ that I wasn't a total phony. That was the missing link - to find people like me. I left my post-doc, increased my credit card limits, and joined the program.
Insight DS was more stressful than I thought it would be. Even as a friendly establishment, invested in our success, the cold Silicon Valley air of "be impressive or no one cares" wafted in through the ventilation system. My peers, who left their work at the Large Hadron Collider and similar pillars of scientific legitimacy, were crucial to understanding that I was not some fluke; there is a system. Every one of them, genuine, kind, and brilliant to the core, struggled considerably. We learned from and encouraged each other despite the fact we were often in direct competition. What emerged was a fleet of 32 data scientists and engineers, who will defend and support each other for the rest of our careers. We will all get good jobs. We will all benefit from each other's success.
The program didn't make me smarter, and the skills I developed during 2-3 months were minuscule compared to the 11 years I utterly _devoted_ to becoming a Neuroscientist. I did work hard, and I executed an impressive project. Being in Insight didn't exempt any of us from the 2-3 rounds of technical interviews required by Bay Area tech companies. By the end I was acing data challenges and didn't break a sweat when asked whiteboard coding questions, but I still felt I did it all just barely. As if giving up one hour early, on just one night, would have made it all fall apart. That feeling of just _barely_ pulling it off has followed me around ever since I was a little kid, running around shirtless in a trailer park in Florida. I feel like I've been holding my breath ever since.
I did get my dream job - the dream job that I hadn't dreamt of yet because I didn't know it existed. I certainly didn't think it would be at Salesforce(IQ), but like most people I had no idea what was out there. Now I get to build data products on the cutting edge of artificial intelligence, and I'm overwhelmed with just how privileged I am. So, I have a few weeks of freedom before I start. The first few days, all I could think about was those 11 years of constant work and worry. I worried that I wasn't immediately happy and feared that I had somehow permanently lost some important aspect of my personality. It turned out that I just needed to sleep. It turns out that having a good life, where people treat you with respect and your actions are rewarded and validated, is awesome. It turns out that misery isn't actually the secret sauce of meaning and insight, and that being broke and afraid doesn't make anyone a better person. You're probably not a fluke. There is a system, and it's much less complicated than you try to convince yourself when you're on the unfortunate side of it.
Lately friends and peers have asked me about becoming a Data Scientist, and I wanted to write this all down before I forgot the human part of it. People struggle everywhere, all the time. I know that my struggles pale in comparison to those of most people in the world, but this doesn't change the point: it's not necessary. There's not some finite amount of suffering that you must endure before it's all suddenly paid back. If you have the rare opportunity to translate your work and efforts into a better life, I highly recommend taking it. But don't think for a second that strife is the tuition that everyone has to pay.
First Order Residue
Wednesday, August 31, 2016
Wednesday, June 22, 2016
Monday, February 8, 2016
Part 2 of Building an Open-source Electrophysiology Rig: Triggers and Signal Flow
Our rig is going to be customized for synchronization of audio stimuli and neural recordings; stimulus design, presentation, and analysis will all be performed in MATLAB. The general design, however, can be modified for visual stimuli and implementation in Python.
In addition to the Intan Components listed in Part 1, we also purchased a Roland Quad Capture external sound card ($250) and used a speaker and big ole sound amplifier that were lying around. The photo below shows the basic set-up that I'm using to prototype the system.
What you need to keep in mind when designing your system is that every digital-to-analog (DAC) and analog-to-digital conversion (ADC) has its own independent jitter and delay. Your goal is to minimize the number of independent conversions so that you can best time-lock the neural signals (i.e. spikes) that you record with the stimulus that was presented.
Borrowing from the principle of TTLs, triggers can be used to temporally align your audio stimulus with the recorded neural activity. When you design your stimulus (in MATLAB or some other software), you also design a separate channel with the same time discretization; this second channel will contain square steps that can either be spaced at regular time intervals or indicate the onset and offset of short stimulus trials.
15 seconds of a stimulus (blue) and corresponding triggers (black). While plotted on the same axes, these signals are saved on two different channels.
End of Trial Trigger
Onset of Trial Trigger
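For anyone going the Python route, here is a minimal sketch of the two-channel idea described above: one channel holds the stimulus and a second channel, on the same sample grid, holds the square trigger pulses. Everything specific in it is an assumption for illustration, not my actual MATLAB code: the 44.1 kHz sample rate, the 1 kHz tone, the pulse widths, and the function names `pulse_train` and `build_block` are all made up for the example.

```python
import numpy as np

FS = 44100  # sample rate in Hz (assumed value, match it to your sound card)

def pulse_train(n_samples, start_times_s, pulse_s, n_pulses, gap_s, fs=FS):
    """Return a 0/1 channel with n_pulses square pulses at each start time."""
    trig = np.zeros(n_samples)
    pulse_n = int(round(pulse_s * fs))
    gap_n = int(round(gap_s * fs))
    for t0 in start_times_s:
        start = int(round(t0 * fs))
        for k in range(n_pulses):
            a = start + k * (pulse_n + gap_n)
            trig[a:a + pulse_n] = 1.0
    return trig

def build_block(n_trials=3, trial_s=1.0, gap_between_trials_s=0.5,
                tone_hz=1000.0, fs=FS):
    """Return an (n_samples, 2) array: column 0 stimulus, column 1 triggers."""
    period_s = trial_s + gap_between_trials_s
    n_samples = int(round(n_trials * period_s * fs))
    t = np.arange(n_samples) / fs
    onsets = np.arange(n_trials) * period_s
    offsets = onsets + trial_s

    # Placeholder stimulus: a pure tone during each trial, silence in between.
    stim = np.zeros(n_samples)
    for on, off in zip(onsets, offsets):
        mask = (t >= on) & (t < off)
        stim[mask] = 0.5 * np.sin(2 * np.pi * tone_hz * t[mask])

    # Onset trigger: one 5 ms pulse. Offset trigger: two 2 ms pulses.
    # Different widths AND different counts keep the two easy to tell apart.
    onset_trig = pulse_train(n_samples, onsets, 0.005, 1, 0.0, fs)
    offset_trig = pulse_train(n_samples, offsets, 0.002, 2, 0.002, fs)
    triggers = np.maximum(onset_trig, offset_trig)

    return np.column_stack([stim, triggers])

block = build_block()
```

The two columns of `block` would go out on two separate output channels of the sound card. Scaling the trigger amplitude to whatever your hardware expects is left out here.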
These triggers, along with a copy of the stimulus, will be recorded by the Intan board through the ADC port. This allows for a more accurate alignment of your recorded neural activity with the stimulus. It's important to ensure that your different triggers are distinct from each other, so that you can distinguish between them in your analysis script. I made mine distinct in both the duration of the pulses and the number of repetitions.
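On the analysis side, telling the triggers apart can be as simple as thresholding the recorded ADC trace and measuring how long each pulse stays high. The sketch below is a hedged Python illustration of that idea, not my analysis script: the 30 kHz sample rate, the 3.3 V pulse height, the 1.65 V threshold, the nominal pulse widths, and the names `find_pulses` and `classify_pulses` are all assumptions, and it only uses pulse duration (grouping by number of repetitions is left out).

```python
import numpy as np

def find_pulses(adc, fs, threshold):
    """Return (start indices, durations in seconds) of pulses above threshold."""
    high = (np.asarray(adc) > threshold).astype(int)
    # Pad with zeros so pulses touching either end of the trace are still found.
    edges = np.diff(np.concatenate([[0], high, [0]]))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return starts, (stops - starts) / fs

def classify_pulses(durations_s, onset_pulse_s=0.005, offset_pulse_s=0.002):
    """Label each pulse by whichever nominal width it is closest to."""
    d = np.asarray(durations_s)
    is_onset = np.abs(d - onset_pulse_s) < np.abs(d - offset_pulse_s)
    return np.where(is_onset, "onset", "offset")

# Synthetic "recording" standing in for a real ADC trace (assumed values).
fs = 30000
rng = np.random.default_rng(0)
adc = 0.05 * rng.standard_normal(fs)   # 1 s of low-level noise
adc[3000:3150] += 3.3                  # one 5 ms onset pulse at 0.1 s
adc[15000:15060] += 3.3                # first 2 ms offset pulse at 0.5 s
adc[15120:15180] += 3.3                # second 2 ms offset pulse

starts, durations = find_pulses(adc, fs, threshold=1.65)
labels = classify_pulses(durations)
```

With the pulse start indices in hand, spike times can be expressed relative to the nearest onset trigger rather than relative to the start of the recording.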
If you are using the free Intan GUI software, you can trigger the onset of your recordings via a digital or analog input. I designed my stimuli to have a distinct recording onset trigger 1.5 seconds before the first trial. Below is a screenshot of the Intan GUI and the pop-up window where you tell the software what input to trigger off of. You can save your entire configuration for the Intan software so that you don't have to change these settings each experiment.
In part 3, I'm going to go over my stimulus design, presentation, and analysis code, and talk about interfacing MATLAB's dsp toolbox with your sound card.
Wednesday, December 9, 2015
Building an Open Source electrophysiology rig: Intan-gible no more, Part 1
A traditional electrophysiology rig for neural recordings, complete with proprietary software and A/D/D/A components for recording and stimulus presentation, will run you about 35-100 thousand dollars. The real kicker is that upgrading proprietary systems to include more channels costs tens of thousands more.
Traditional ephys rig of the 2000's
Perhaps it's the Open Source mentality of Gen X and Y, perhaps it's that Scientists don't like "black boxes," or perhaps it's simply that Academics are strapped for cash, but the Intan board is quickly becoming the favored alternative.
A full set-up, including the main interface and amplifier boards, costs about 4-6K depending on how many recording channels you need; it covers 16 to 128. It also comes with free software, which saves files in formats that can be easily opened in MATLAB or Python. They provide a limited-time free license for custom MATLAB software and a LabView library, if you're so inclined. A more popular alternative to using their licensed software is to use Open-ephys.
For data acquisition, the board and free software really raise the standard of what it means to be "plug-n-play."
Intan board
With an electrode (we use Neuronexus standard 16-channel shanks), amplifier, and adapter boards, you can be collecting data within a day.
Of course, simply collecting data is never enough. Like many other scientists studying sensory or motor systems, you need to synchronize your recordings with one or many stimuli. That is where the fun begins. In Part 2, I will update y'all on my procedure for generating auditory stimuli to present as trial blocks and synchronizing those stimuli with neural recordings.
Friday, November 6, 2015
El Capitan, give me back my ship
I was so excited to get a new Mac at work, and even more so when I realized that Apple's Time Machine successfully imported all of my old files and applications (including the ever-so-rare subscription-free versions of Illustrator and Photoshop). There is, of course, no story without a struggle. I found out that I couldn't use git!
Luckily, this guy provided excellent instructions on how to fix the problem. My only addition is a note for anyone who has never accessed the terminal from recovery mode: after you reboot and hold [command + R], you have to select your language, and then click on "utilities" from the upper LEFT portion of your screen. Terminal is listed there.