eric.novik - Unnatural Consequences

Looking back at 2019

As 2019 was closing I had a feeling that I had not done that much during the year. But then I started looking over my journal entries, my photographs, the books I read (mostly listened to), the progress we made at Generable, and I realized that I did get some things done and experienced joy (and sometimes agony) along the way.

I started the year by going to see The Jungle, a play that plants you in the middle of the Calais Jungle and follows the lives of refugees struggling for survival. Jacki and I came away deeply moved by the experience.

Later in January Colleen Chien invited Dan, Jim Savage, and me to give a short talk at her class at Columbia Law School to discuss facts and fiction in the AI/ML ecosystem today. I mostly focused on our work with Stan in Clinical Research and how probabilistic modeling is making it possible to construct models that are explainable, transparent, and perform very well predictively, especially if we are able to approximate the data generating process well.

In early February we celebrated Andrei’s first birthday. I never thought that I would be a father again but Andrei is bringing so much joy to my life that it almost seems worth it. OK, it is worth it. I think. I am pretty sure. I love you, Andrei!

Winter came in March to our neighborhood and we had some fun times in the snow. Andrei loves being outside and I can not wait until he is old enough to go skiing with us.

In early March, Generable had our first off-site in Pocono Pines. This is when we started seriously thinking about what product we wanted to build. At the time we thought we were going to make a tool for Stats and Data Science types to build models from different components. That turned out to be wrong. More on that later.

In the middle of March, my oldest son Ben and I built a computer from individual parts. He was surprisingly enthusiastic about this enterprise and I enjoyed working with him on this. When we turned it on and loaded the OS (Ubuntu) it became obvious that the project succeeded. Ben has not used this computer since and I turn it on only occasionally, but the whole experience was totally worth it. Thanks, Ben, for putting up with my relentless pursuit to turn you into a nerd.

Later in March, we saw another play at St Ann’s Warehouse called The B-Side. This play is a musical in an off-Broadway sense of the word. The main character sings along to a vinyl album containing songs by African American convicts in a Texas prison. I love seeing this kind of production making it to the serious stage and selling out a large theater in Brooklyn.

At the end of March, we went to see Marys Seacole (based on the life of Mary Seacole) at the Lincoln Center. The story follows Mary throughout her life and to the battlefield of the Crimean War. I don’t remember a lot of details from that play; perhaps it did not leave an impression on me or I just drank too much bourbon shortly thereafter.

In April, I was invited to be on a panel with other alumni of Columbia University’s MA in Statistics program. I love coming back to the stats department and talking to current students and recent graduates. I usually tell them to learn some Bayesian stats, since most of them will graduate without encountering a posterior distribution. A tragic state of affairs, but that’s how it is for now.

Later that month, we started re-designing the Generable platform and focusing on what we call the Clinical Lead: the person who oversees early clinical trials and gives an opinion on whether a treatment should advance to a late-stage clinical trial. Inside a Pharma company, this is not just a clinical decision; there are economic factors at work, but the clinician makes an assessment of whether the drug is working. We abandoned the model-building idea and instead embraced communicating model results and supporting decision-makers.

At the end of April, we went to see Oklahoma on Broadway. I know people love this musical, but something about it did not click for me. I love that it is made and I think I understand the scope, but I could not quite grow to love it.

In early May I was visiting a colleague in Northampton where we have a small office. I was staying in an Airbnb house inhabited by an artsy old lady.

“What do you do?” she asked me one evening when I came home late, slightly drunk.

“I am a Statistician working in clinical research, early clinical trials in Oncology.”

“Did you say Oncology?”, she asked.

“Yes”, I answered.

“Thank you for everything you do! I am a cancer survivor.”

I was completely taken aback as this had never happened to me before.

“I am not a doctor, I do not treat patients, and if we make any contributions, it would be many years from now”, I told her.

This was not false modesty. I really did not feel that I deserved her thanks, not yet anyway. But she wouldn’t have it. I finally told her she was very welcome and, now completely sober and slightly teary-eyed, stumbled upstairs and went to bed.

Later in May, I attended and co-presented at the Bayes-Pharma conference in Lyon, France. Marmaduke Woodman from the University of Aix-Marseille and I talked about the work we did fitting Stan models to epileptic seizure data collected from electrodes implanted in patients’ brains. The hope is that these models could be used to improve the precision of surgical interventions. I think they are planning a clinical trial for later this year.

In June, I attended the PAGE (European Pharmacometrics) conference in Stockholm, and right after the conference, I caught a short flight across the Baltic Sea to visit my mother in Riga, the city of my birth. Riga is a beautiful, modern European city with manicured parks, well-maintained Art Nouveau architecture, mild weather, and tragic history.

Every year, I promise myself that I will spend some time with my parents and this year I kept my promise. We have a lot in common, my mother and I; kindred spirits, so to speak. For one, she is just as vulgar as I am and she appreciates my not-so-kosher jokes.

On the way home, I had a stopover in Amsterdam, where I spent a few hours at the Rembrandt House Museum before taking a long flight back to New York.

In June, my daughter Miriam graduated from middle school. She worked really hard and improved her grades considerably. I was (and still am) so proud of her.

At the end of June, the Generable crew had the second off-site meeting in Denver. I like spending one week with our remote team every three months or so. It helps to get on the same page, agree on key priorities, collaborate on technical tasks, and just spend some time hanging out together.

At the beginning of August, we spent a customary week on Lake Sunapee in New Hampshire. Ben took some sailing lessons, we played some tennis, biked, played Monkey Bridge (a Buros family tradition), and generally had a nice relaxing time. I worked most of the time but it did feel like a vacation.

On the way back from Sunapee we stopped in Belchertown to break up the long drive back. In the morning, Ben and I took a 10-mile bike ride to visit the University of Massachusetts at Amherst. It is a lovely suburban campus with lots of green lawns. “Can you picture yourself going here?” I asked Ben. “I dunno,” replied Ben, which is his standard reply. At least he is not over-confident!

In the middle of August, the Generable crew attended StanCon in Cambridge UK, an annual conference dedicated to all things Stan. Generable was one of the sponsors and Dan and Krzysztof presented.

In September, we took Andrei to the New York Aquarium on Coney Island. They have recently renovated the place and some of the construction is still in progress.

At the end of September, Jacki, Miriam, and I went to Disney World. Miriam wanted to go for a long time and I am happy we were able to do it. The Magic Kingdom is aging and not very gracefully, but the Flight of Passage ride in the Animal Kingdom left such an impression on me that I am seriously considering getting a VR set even though the ride itself is not in VR. It’s pretty damn close to R. While in Orlando, my dad came and stayed with us for a few days, which was nice as I do not get to see him that much anymore.

In October, we showed an alpha version of the Generable platform at the ACOP conference in Orlando. This is the first time we were able to afford a booth, which is some kind of milestone. A lot of people don’t like “working the booth” but I do, particularly when the traffic is heavy which was not always the case at ACOP. Next year we should be doing more clinically oriented conferences but we will likely be back at ACOP and PAGE.

In September we went to see Slave Play, a Broadway production that is too weird to describe so I am not going to try. If you go see it, and perhaps you should, it will make you very uncomfortable, which I am sure is by design.

We ended the year with the play The Sound Inside with Mary-Louise Parker (Weeds and other good stuff) in the lead role. This was my favorite play of the year and one that I will remember for a long time. Poignant references to DFW, parallels and direct references to Raskolnikov from Crime and Punishment, and a masterful soliloquy by Mary-Louise are just some of the features that made this play special for me.

This was in many ways a theatrical year.

Here is to you 2019!

The Book of Where

I have some exciting news to share: my co-author, Tony Schwartz, and I just signed a contract to write what surely will become a best seller: The Book of Where.

The book is a culmination of years of research into a revolutionary new science that is concerned with figuring out, you know, where things are.

For generations geographers, cartographers, topographers, sailors, and other location scientists have been trying in vain to pin down the idea of location and missing it by a mile. Sure they have their Mercator projections, triangulations, GPS, and other round-about contraptions, but what they don’t have is a language of location that is capable of precisely identifying this elusive entity. Until now.

We have come up with an operator that makes it possible, finally, to uncover, you know, where shit is. Yes, you guessed it, it is the find() operator and the corresponding find-calculus.

And it’s not all theory! If you order the book, you will be able to answer such age-old questions as:

Where the f*ck are my keys?
Where is the Bermuda triangle and how to get there?
What is a map anyway?

Tony and I are thrilled to get this in front of popular audiences and we are looking forward to a productive public discussion about this important topic.

Now, go out there and find something!

2019 Predictions

Prediction is very difficult, especially if it’s about the future.
— Niels Bohr

2018 has turned the page and we have already completed approximately 0.27% of 2019. I don’t know about you but I feel like I am behind. So to procrastinate some more, here are my (silly) predictions for 2019.

Trump will remain president with P = 0.60. 2019 will no doubt be a tough year for Trump as the Mueller report will likely become public, but I am betting that Republicans will continue to support him and even though the impeachment in the house is quite likely, the removal from office is not so certain.
The market (SP500) will continue to be volatile with the VIX staying well above its historic average (~11) for most of the year with P = 0.70. If we are to believe the model, there is about 90% chance that SPX will be between 3,200 and 2,000 by the end of April or about 45% chance that it will be below its current level and above 2,000. I am more pessimistic and I will give it P = 0.60 that it will be below the current level of 2,500 by April.

The UK will not exit the EU (no Brexit) with P = 0.60. This is purely based on my conversation with someone who lives in the EU and spends a lot of time analyzing European economies.
I recently bought some cryptocurrency (a tiny amount of BTC and ETH) so I can keep myself informed and also because everyone was aggressively selling. I am pretty bullish on crypto longer term, but less certain about the current crop of offerings, although BTC proved to be very resilient. My prediction for 2019 is that BTC will not recover and will stay under its highs with P = 0.90.
We will not find a cure for any cancers with P = 0.80, which is a reversal from my last year’s prediction, and the one I am hoping to lose. I like where the cancer therapies are going, but our understanding of the mechanism is still quite weak, the methods we use to evaluate their effectiveness are quite poor (but getting better), and I am not holding my breath for data mining technologies (also known as AI) making any breakthroughs in this space.
I selfishly hope that 2019 will be the year of Bayes. I would like to see more universities offering Bayesian courses at undergraduate and graduate levels (this one from Aki @ Aalto looks amazing, for example), more companies getting started using sound probabilistic approaches, and FDA and EMA moving closer to embracing the Bayesian paradigm (we are rooting for you, Frank). I have no idea how to measure this, so no specific predictions here.

How did I do on my 2018 predictions

On 1 Jan 2018, I made the following entry into my journal

Will Trump still be president? Yes. (P = 80%)
Will Mueller team link Russia to Trump: a) To Trump campaign yes (P = 60%); b) to Trump No (P = 70%)
Will Crypto continue to rise? Yes. (P = 60%)
Will the stock market end its rise? No. (P = 55%)
Will Republicans lose control of the house in November? Yes. (P = 75%)
Will there be a war with North Korea? No. (P = 95%)
Will the New York Times go out of business? No. (P = 85%)
Will we cure one specific type of cancer? Yes. (P = 60%)
Will there be at least one Bayesian-based company that will raise Series B? (P = 70%)

I also said that I would compute my gain/loss using a hypothetical payoff function: $100*\text{log}(2p)$ if I am right and $100*\text{log}(2 * (1-p))$ if I am wrong, where p is the probability I assign to the event occurring. We could use any base for a log but base 2 is natural as it compensates at the notional value ($100) if the bet is made with probability 1. I will describe why this particular payoff function makes sense in another post. (The tacit assumption here is that I would have been able to find a counterparty for each one of these bets, which is debatable.)

Trump is still president: $100*\text{log2}(2*0.80) = 68$
Mueller linked the Trump campaign to Russia. The word link was not defined. I think it is reasonable to assume that the link had been established, but I could see how, if my counterparty was a strong Trump supporter, they could dispute this claim. Anyway: $100*\text{log2}(2*0.60) = 26$
Mueller linked Trump to Russia. Same as above in terms of the likelihood of it being contested, but I think I lost this bet: $100*\text{log2}(2*0.30) = -74$
Crypto did not continue to rise: $100*\text{log2}(2*0.40) = -32$
Stock market ended its rise: $100*\text{log2}(2*0.45) = -15$
Republicans lost control of the house in November: $100*\text{log2}(2*0.75) = 58$
Thankfully, there is no war with North Korea: $100*\text{log2}(2*0.95) = 93$
New York Times is still in business: $100*\text{log2}(2*0.85) = 76$
I am not sure what made me so optimistic regarding the cure for one type of cancer. Currently, the most promising cancer therapies are PD-1/PD-L1 immune checkpoint inhibitors and there have been documented cases of people who became cancer-free after being treated with one of these drugs, but I think it would be too generous to say that we have cured one type of cancer. Perhaps more impressively, Luxturna will cure your blindness with one shot to each eye if a) you have a rare form of blindness that this drug targets and b) you have $850,000 to spend. $100*\text{log2}(2*0.40) = -32$
There were a few startups based on the Bayesian paradigm and Gamalon came close with a $20M Series A round, but none raised Series B to my knowledge: $100*\text{log2}(2*0.30) = -74$

To summarize, I am up $94. Is this good or bad? It depends. A good forecaster is well-calibrated and we do not have enough here to compute my calibration. The second condition is that for the same level of calibration we prefer a forecaster that predicts with higher certainty, a concept known as sharpness. Check out this paper if you are curious.
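For anyone who wants to check the arithmetic, here is a minimal Python sketch of the payoff function and the tally. The function name payoff, the bets list, and the labels are illustrative choices and not part of the original journal entry; each tuple holds the probability I assigned to my answer and whether that answer turned out to be right. Because the script does not round each bet to the nearest dollar before summing, its total can differ from the tally above by about a dollar.

```python
import math


def payoff(p_assigned, was_right, notional=100.0):
    """Base-2 log payoff: notional * log2(2p) if right,
    notional * log2(2 * (1 - p)) if wrong."""
    p = p_assigned if was_right else 1.0 - p_assigned
    return notional * math.log2(2.0 * p)


# (label, probability assigned to my answer, was my answer right?)
bets = [
    ("Trump still president: yes", 0.80, True),
    ("Mueller links Trump campaign to Russia: yes", 0.60, True),
    ("Mueller links Trump to Russia: no", 0.70, False),
    ("Crypto continues to rise: yes", 0.60, False),
    ("Stock market ends its rise: no", 0.55, False),
    ("Republicans lose the House: yes", 0.75, True),
    ("War with North Korea: no", 0.95, True),
    ("New York Times out of business: no", 0.85, True),
    ("Cure for one type of cancer: yes", 0.60, False),
    ("Bayesian company raises Series B: yes", 0.70, False),
]

results = [(label, payoff(p, right)) for label, p, right in bets]
total = sum(amount for _, amount in results)

for label, amount in results:
    print(f"{label:45s} {amount:8.1f}")
print(f"{'Total':45s} {total:8.1f}")
```

Note that payoff(1.0, True) returns the full notional value and payoff(0.5, True) returns 0, which is the sense in which base 2 is the natural choice here.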

Good Thinking

“The subjectivist (i.e. Bayesian) states his judgements, whereas the objectivist sweeps them under the carpet by calling assumptions knowledge, and he basks in the glorious objectivity of science.” – I.J. Good

Irving J. Good was a mathematician and a statistician of the Bayesian variety. During the war, he worked with Alan Turing at Bletchley Park and later was a research professor of statistics at Virginia Tech. Good was convinced of the utility of Bayesian methods when most of the academy was dead set against it; that took a certain amount of courage and foresight.

One of the delightful aspects of this book is that Good’s humor and sarcasm are so clearly on display. For instance, one of the chapters is called 46656 Varieties of Bayesians, where he derives this number using a combinatorial argument.

In the above quote, Good zooms in on what he considers to be the difference between the frequentist (objectivist) and Bayesian schools. This argument seems to hold to this day. My experience interacting with Bayesians and Frequentists, particularly in Biostatistics, is that Bayesians tend to work from first principles, making their assumptions explicit by writing down the data generating process. Frequentists tend to use black box modeling tools that have hidden assumptions. The confounding variable here is this desire for writing down the likelihood (and priors) directly, versus relying on some function like say glm() in R to do it for you. As a side note, glm() in R does not regularize the coefficient estimates and so it will fail when the data are completely separable, because the maximum likelihood estimates do not exist (the coefficients run off to infinity).
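To see what goes wrong under separation, here is a small Python sketch (not R, and not how glm() works internally, which uses iteratively reweighted least squares) that fits a one-coefficient logistic regression by plain gradient ascent. The toy data, the function names, and the penalty value are all assumptions made for illustration. On completely separable data the unpenalized slope keeps growing the longer the optimizer runs, while adding an L2 penalty, which corresponds to a Normal prior on the coefficient, yields a finite estimate.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_slope(xs, ys, penalty=0.0, steps=20000, lr=0.1):
    """Gradient ascent on the log-likelihood of a no-intercept logistic
    regression, P(y = 1 | x) = sigmoid(b * x), with an optional L2 penalty."""
    b = 0.0
    for _ in range(steps):
        # Gradient of the Bernoulli log-likelihood with respect to b
        grad = sum((y - sigmoid(b * x)) * x for x, y in zip(xs, ys))
        # Gradient of -(penalty / 2) * b^2, i.e. the log of a
        # Normal(0, sd = 1 / sqrt(penalty)) prior, up to a constant
        grad -= penalty * b
        b += lr * grad
    return b


# Completely separable toy data: y is 1 exactly when x is positive
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

b_few = fit_slope(xs, ys, steps=1000)                # unpenalized, short run
b_many = fit_slope(xs, ys, steps=20000)              # unpenalized, long run
b_reg = fit_slope(xs, ys, penalty=1.0, steps=20000)  # with a Normal(0, 1) prior

print("unpenalized slope after 1,000 steps: ", b_few)
print("unpenalized slope after 20,000 steps:", b_many)
print("penalized slope (posterior mode):    ", b_reg)
```

The unpenalized runs never settle: more iterations simply produce a larger slope, which is the optimizer chasing a maximum that does not exist. The penalized fit converges to the same value no matter how long it runs.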

The key insight is that nothing precludes Frequentists from working with likelihoods directly, and many do, but I bet that most don’t.

Another subtle difference is that people, being naturally Bayesian, generally rely on prior probabilities when making judgments. Priors are always there, even under the Frequentist framework, but some very famous and very clever Frequentists failed to take them into account, as demonstrated by this amusing bit from Good:

Does pair programming apply to statistics?

The New Yorker recently published an article entitled “The Friendship That Made Google Huge.” The article describes a collaboration between Sanjay Ghemawat and Jeff Dean, two early Google employees responsible for developing some of the critical pieces of Google’s infrastructure.

One of the fascinating aspects of this collaboration was that they programmed together, a practice known as pair-programming. One person typically drives by typing and the other navigates by commenting, pointing out alternative solutions, spotting errors, and so on. The benefits of pair programming cited by c2 are increased discipline, better code, resilient flow, improved morale, collective code ownership, mentoring, team cohesion, and fewer interruptions. These seem reasonable, although I am not sure how much work went into validating these attributes.

Reading the article, I was wondering what the application of this technique to Statistics would look like. And I don’t mean to the computational aspect of Statistics. It seems pretty clear that if we are collaborating on the development of statistical software, pair-programming could be applied directly. But what about the process of, say, thinking about a new statistical algorithm?

When I started attending Stan meetings in Andrew Gelman’s office, I think around 2015, they were still a lot of fun. A few people usually gathered in a circle and discussions often took off on a statistical tangent. That was the best part. I remember one time Andrew went up to the blackboard and said something like “I have been thinking about this idea for optimizing hyper-parameter values in hierarchical models…” and proceeded to scribble formulas on the board. This was the beginning of what he dubbed a GMO (Gradient-based marginal optimization) algorithm. See here from Bob Carpenter for more details. I think he wanted to get some feedback and stress-test his ideas by writing them on the board and having other people comment. I am not sure if this qualifies as pair-statisticsing (more like three-pair), but maybe close enough?

Scientists collaborate all the time, although there are loners like Andrew Wiles, for example. But what about a close collaboration where two people sit next to each other and one is writing and the other commenting or, more likely, using the same blackboard? It seems like it would be a useful exercise. I for one would be too embarrassed to undertake it. I should try pair-programming first.

Married once shame on you, married twice…

Last week my sister Sarah got married. I thought long and hard about the appropriate wedding gift and decided that I should give the newlyweds some unsolicited advice. Here it goes.

No, it’s not a compromise

Someone once told me that marriage is a compromise. I disagree very strongly with this notion and yet it is somehow almost conventional wisdom, so I must explain. To see why, suppose that instead of a lover you have a very close friend. It helps to think about a specific person in your life. This person is so close to you that you can share your innermost secrets without embarrassment or fear of being judged. You can ask and respect their opinion on subjects that are important to you. And so on. Now imagine that you view your relationship with this person through the lens of compromise. I bet you will not be friends for very long. There is no quid pro quo in friendship, and there should not be one in marriage. Do things for each other because you want to, not because you have to.

No expectations

Some of my biggest disappointments in life came from expectations. In my field of work, statistics, we love expectations. We cannot work without them. But in relationships, I found, it helps not to have any. Think about it. You come home from work and your spouse was home all day, but the dinner is not ready. Do you get immediately mad? Why? Because you expected dinner. Too bad for you! You should instead ask if maybe your partner wants to go out to eat or cook something together, or heavens forbid, skip a meal. But what if the dinner had been made? Be surprised and be thankful. Personally, given the choice, I would rather be pleasantly surprised than severely disappointed. Wouldn’t you?

Relationships change, people not so much, but there are exceptions of course

There are some people who are capable of change. For example, I know people who used to vote for Republicans and now they vote for Democrats. But these cases are quite rare. (It has been shown, for example, that one’s political opinions are formed in one’s 20s and tend to stay constant over a lifetime.) So what does that have to do with marriage? Everything! If you think that your partner will change for you, you are crazy! Most likely they will be the same person they are today. Love that person, not the person you think he or she will become.

Two marriage stressors: kids and money

Tolstoy said: “All happy families are alike; each unhappy family is unhappy in its own way.” There have been a lot of studies trying to figure out what breaks marriages apart. Two things stand out: kids and finances. But wait, I am not trying to convince you not to have kids or to chase after money all your life. Instead, recognize that having kids will put a lot of stress on your relationship and if you choose to have kids, you should be ready for it. I know I was not, and my (first) marriage suffered as a result. Psychologist Daniel Gilbert in his book “Stumbling on Happiness” showed the following chart of marital happiness. All four studies point to the fact that once families start having children their marital satisfaction decreases (*). So what to do? Get a lot of help! We got some help when my kids were young, but it was not nearly enough. When kids are young both of you will be stressed and tired. It’s not a very romantic time. If you can afford live-in or more or less full-time help, do it! Make sure to schedule regular date nights, when the two of you can go out without the kids. Go on weekend getaways. In short, do what you can to protect your relationship from stress. Which brings me to the second stressor, finances, or lack thereof. Here, I am not much of an expert but having a budget and a financial plan will likely help. Today there are lots of services that are not too expensive and will help you with that.

[(*) Statistically speaking, the figure is missing a control group: people who were married but never had any kids. If that group shows a similar pattern to the one above, we are likely observing some kind of generic marriage fatigue, but if it is more or less constant (a straight line), then perhaps the change in satisfaction is due to kids.]

Don’t go to bed angry

Jacki told me about this one and I like it so much I thought I would throw it in. Don’t go to bed angry means that you should try to resolve your disagreements before going to sleep. Don’t hold it in, talk about it, understand each other’s point of view, and only then go to sleep. As a side effect, you will probably sleep better.

You are married, but you are still individuals

Back in the old days, people assumed their partner’s identity. Women were especially prone to this because they were relying on men financially and often emotionally. We don’t have to go too far back into the past to find examples. My maternal grandmother lived her life in the service of her husband. She relied on him for everything. He made all the important decisions and she did not question any of it, at least to my knowledge. It was not a very happy marriage, but they survived as a couple. When my grandfather died of cancer, my grandmother fell apart. She could not stop talking about him, she was lost in every possible way. Thankfully, her daughter was there to pick up the mantle. I have no idea what would have happened to my grandma if my mother was not there for her during that time. Nothing good I suppose. What is the point of this story? Don’t lose yourself in one another. Even though you are married you are still thinking, feeling, contributing individuals. You are stronger together, but you don’t collapse when you are apart.

Time together, time apart

Sometimes being apart is great. You can get lost in your own thoughts, reevaluate your decisions, reflect on all the good and bad things that happened. This summer Jacki rented a small apartment by the beach and spent almost an entire week there by herself. I was very supportive of this and happy that she was able to do it. I missed her a lot and she told me she missed me too. Towards the end of the week, I took the kids and joined her. We had a great time together, but we did not collapse when we were apart.

Last words, I promise…

In closing, let me say that I have made a lot of mistakes in my life. Some, I can not correct, but I can learn from them and I can change, for the better I hope. I also hope that you can learn from my mistakes, if only a little. I wish both of you love everlasting, respect for one another, and a strong enduring friendship.

Good luck.

How to Get a Job That You Don’t Hate

First of all, if you have not seen the movie Office Space, stop reading this blog post and watch it. I will wait.

Welcome back.

I am often asked to give career advice. This is strange since I don’t think I ever had a career. I had jobs, some terrible, some pretty good ones. I started a company in 2010. I am starting another one now. But a career, never. OK, so with that out of the way …

Earlier this month I was invited to discuss job search strategies with students in the MA in Statistics program at Columbia University. After the discussion, I posted the following blurb on my Facebook page.

Talking about careers in data analysis and stats with students in the MA in Statistics program at Columbia. My key messages: 1) if you want to work for banks, make sure you know what you are getting into; 2) think of an interview as a two way street: they interview you, you interview them; 3) if you hate your job, quit (if you can) and don’t worry about what it would look like on your resume; 4) don’t apply online, get a referral, go to meet ups, etc.; 5) learn some Bayesian stats — you will be a better human and know more than most of your peers.

I thought it would be useful to people if I elaborated on these a bit so here it goes.

If you want to work for banks, make sure you know what you are getting into

A lot of students in the MA in Stats program want to work for banks. I am not sure why that is but it must have something to do with the geography and expectations of high earnings. Whatever the motivation, it is a good idea to know what you are getting into. Not everyone hates working for banks, but in my experience, technical people who end up working there are not very happy. I think they find that the culture does not agree with them very much. My advice is always to ask to speak with your potential future peers and ask them about three things they love and three things they hate about their work. You would be surprised what you will learn. Having said that, I have met people on the “business” side of banks that absolutely love it. Like with anything else, do your research and make your decisions based on conditional probabilities, not population averages.

Think of an interview as a two-way street: they interview you, you interview them

This should be obvious, but most people don’t do it. The thing to recognize is that there is an inherent risk asymmetry between you and your prospective employer. You are just one candidate or employee out of many. They can make a mistake with you and they would probably be ok, but you are about to commit several years of your life to them (in expectation) and so you should be the one doing the interviewing! Of course, the reality of the sparse labor market is such that usually you need them more than they need you, and so the roles are flipped. This fact, although daunting, should not deter you.

You want to find out what it would be like spending most of your waking hours at a job you do not yet have. This is not easy. To get started, make a simple two-category list: 1) culture; and 2) technical. For example, if you want flexible working hours, put that in the culture column and if you just must program in R, put that in the technical. Once you are done making the list, rank order the items. Do this before you take any interviews. After the interview, try to score the prospective employer along those dimensions. Where is the money column, you ask? That part is easy: know your minimum number and don’t be afraid to let them know what that is … but be reasonable, which means know what the market is paying and where you are on the skill / experience curve.

If you hate your job, quit if you can and don’t worry about what it would look like on your resume

Some jobs are just plain awful. If you do what I recommended above, you will probably avoid most of those, but every now and again one will creep up on you. What to do? Quit! Sure, this is easier said than done, but at the very least immediately start looking for a new job and make some notes about how you were duped with this one. Introspection is a great tool and I use it often.

A friend of mine spent years working at a company for a horrible boss and even though he eventually quit he still has emotional anxiety over the whole affair. Life is way too short to work for assholes. Get out now. But what about the resume, you ask, and I answer: if you are a technical person, github (or something like that) is your resume.

Don’t apply online, get a referral, go to meet-ups, and so on, but I am sorry I can’t refer you because I don’t know you

When I was working for a bank we had an opening for a business analyst. Now, here is the thing: a business analyst does not analyze the business. What does she do? She writes requirements for a proposed piece of software. Anyway, that’s beside the point. When this job was posted by the HR department we received over 200 resumes! I don’t remember if we hired anyone, but you can imagine your chances of getting such a job. (Well, you can just compute them, but whatever.) The short story is, don’t apply online.

The best jobs I ever got were referred to me by my friends and classmates. Meetups are also some of the best places to get technical jobs. New York Statistical Programming Meetup is a great one for stats people and they often advertise jobs during their events. Another great way is to start contributing to some open source software. Where can you find great open source projects? GitHub, of course.

But Eric, why can’t you introduce me to some of those friends of yours that have all these great jobs? The truth is that they will not be my friends for much longer if I start doing that and you should not do it either. Your referral is a reflection on you, so use it wisely and only introduce people you know well.

Learn some Bayesian statistics — you will be a better human and know more than most of your peers

When I was getting my MA in Stats at CU, they did not have a master’s-level Bayesian class. This is a tragedy of modern statistical education, but things are getting better. My friend and co-founder Ben Goodrich is teaching an excellent Bayesian class for master’s students in the QMSS program. The stats department also offers the class and Andrew Gelman teaches a PhD-level Bayesian course. If you are not at Columbia, Coursera recently started offering Bayesian classes. This one looks particularly interesting.

So why all the hype about Bayes? It’s a long story, but here were my initial views on the subject. I now work exclusively in the Bayesian framework. In short, Bayes keeps me close to the reasons why I fell in love with statistics in the first place — it lets me model the world directly using probability distributions.

Even if you are not a statistician you should learn about conditional probabilities and Bayes’ rule so you do not commit common fallacies such as the prosecutor’s fallacy, especially if you are a prosecutor.
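
To make the prosecutor’s fallacy concrete, here is a minimal sketch of Bayes’ rule in Python. Every number in it is made up for illustration: a hypothetical forensic test that matches an innocent person one time in 10,000 and always matches the guilty one, in a city of 1,000,000 equally plausible suspects. The function name `posterior_guilt` is hypothetical too.

```python
from fractions import Fraction


def posterior_guilt(prior, p_match_given_guilty, p_match_given_innocent):
    """Return P(guilty | match) using Bayes' rule."""
    # Numerator: P(match | guilty) * P(guilty)
    numerator = p_match_given_guilty * prior
    # Denominator: total probability of a match, guilty or innocent
    evidence = numerator + p_match_given_innocent * (1 - prior)
    return numerator / evidence


# Illustrative assumptions only, not real forensic figures.
prior = Fraction(1, 1_000_000)                # one culprit among a million people
p_match_given_guilty = Fraction(1)            # the test never misses the culprit
p_match_given_innocent = Fraction(1, 10_000)  # false match rate

posterior = posterior_guilt(prior, p_match_given_guilty, p_match_given_innocent)
print(f"P(guilty | match) = {float(posterior):.4f}")
```

The fallacy is to read the 1-in-10,000 false match rate as a 1-in-10,000 chance of innocence. Under these assumptions the probability of guilt given a match is only about 1%, because roughly a hundred innocent people in the city would match as well.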

Bonus feature: why do you want a regular job anyway?

Recently I was on a Google Hangouts call with a friend of mine who works as a contract programmer. His arrangement with the company is that he works remotely. For most people remotely means working from home. For him, it means working from anywhere in the world. Right now he lives in a small apartment in Medellín, Colombia. He showed me the view from his window. It looks approximately like this:

To quote Tina Fey: I want to go to there.

The idea that an employer dictates both the hours during which you must work and the location of where the work must be performed is somewhat outdated. Sure, there are lots of jobs out there that legitimately require this kind of commitment, but it is no longer the norm. Take a look at that culture column I mentioned before and see where you stand relative to hours / location flexibility and choose accordingly.

Note to people seeking an H1B visa

A lot of people I speak with are in the US on an F1 (student) visa. It is really tragic that the US does not award work visas to foreign graduates, but this is unlikely to change anytime soon. The common misconception is that you need to find a large company to sponsor your H1B (work visa). You do not. Lots of small companies can and do sponsor H1Bs. When I was working for a small startup in San Francisco in the mid-90s, we sponsored several H1Bs for Eastern European immigrants. The key is finding an experienced attorney who processes many applications and asking her for advice. Reputable attorneys will not charge you for the initial consultation.

If you have any other questions, please ask them in the comments.

Talks, Lectures, and Workshops. What is the Difference?

I am about to go on a mini speaking tour and in preparation I am skimming Scott Berkun’s “Confessions of a Public Speaker.” I like this book, but while reading it I realized that I will be giving two different types of “speeches”. Let’s call them talks and workshops, and even though in both cases the subject will be Stan, the audience’s expectations will be different and my presentation must reflect those differences. In particular, Scott’s book is a lot more relevant to talks than workshops.

Most inexperienced speakers assume that the people who come to their talks want to learn something and some people do have that expectation, but those are usually inexperienced consumers of talks. The truth is that it is very unlikely that you will learn something during a talk. Learning is a hard and active process and it is not going to happen by passively absorbing sound and light waves in a reclining position. The most realistic goal for a talk is to inspire people to learn more about the subject. This is a difficult task for the presenter, but if you want to know how to do it well, I highly recommend Scott’s book.

A workshop is a different animal. As the name suggests, the participants will be working alongside the presenters and in so doing are hoping to come away with enough initial knowledge to jump start their own exploration. People who attend the workshop have already been inspired to learn more and the bar is therefore higher than during a talk. So what are the important attributes of a good workshop?

To think about that, imagine you are taking a technical class at a university. You are listening to a lecture. Are you at a talk or at a workshop? The listening part gives it away. Most likely you are at an uninspiring talk that should instead be a workshop. In order for the workshop to go well, here is my short list of requirements:

1. Participants have the required background at the right level of abstraction
2. If this is a computing workshop, participants have already installed and tested the required software
3. Presenters have designed a series of exercises that gradually guide the audience through a set of hands-on tasks, each illuminating a different part of the subject
4. Participants have a chance to discuss the problem and their solutions with each other and with the instructors
5. There is a mechanism for immediate feedback that tells the instructors if the majority have mastered the task

As a presenter, I cannot control 1 and 2, but I must make it easy for people to assess their level of knowledge, and the software installation instructions must be clear.

Creating exercises is very time-consuming, but I believe it is necessary for workshop-style learning. Time for discussions can be woven into the exercises and the output of the exercises can be shared with the rest of the class. Which brings me to the feedback mechanism, which is perhaps the most often overlooked aspect of the workshop.

I don’t have much experience with a feedback system during workshops, but I have used live surveys during talks and they work really well. For computing workshops, I would like to experiment with live code editors, where participants have a chance to post their code, their questions, and the error messages to a shared workspace. This would only work for moderate-size groups, but it seems to me that workshops should only be conducted in relatively small groups (say 50 people or fewer).

If you have any pointers on how to make the workshop experience better, feel free to post them in the comments.

Thoreau: Thoughts on his Indictment and Defense

I never read Walden, not in its entirety anyway. I read most of the first chapter. It was dreadful. I still remember struggling to keep up with the narrative and wondering why this is such a big deal. Overall, I love the message of the simple life, civil disobedience, and living as one with nature. I do not love the apparently hypocritical obsession with seclusion and the disdain for all humanity. But this, of course, is a very shallow view of Thoreau. But then again, I do not have the patience to study him deeply. Fortunately, Kathryn Schulz and Jedediah Purdy do and offer an indictment of the man and a somewhat halfhearted defense.

I really enjoyed reading both of these, but perhaps not surprisingly I found the indictment more convincing. The defense goes something like this. Sure, Thoreau was a hypocrite and an asshole, but we should not blame the message for the messenger (i.e. ad hominem or an opposite of blaming the messenger) even though in this case it happens to be the same person. I can get behind this argument. In science and in business there were and surely are lots of arrogant assholes, who nevertheless made important contributions. John Nash, despite a very favorable portrayal in the movie A Beautiful Mind (the book is much less flattering), was not a very nice man. Steve Jobs was no sweetheart either. And so on. So, is Thoreau’s message important enough to stand on its own? That I am not qualified to answer, but the contrarian and anti-authoritarian in me wants to believe that it is.

Thanks to Bryan Lewis for pointing me to these articles on his web page.