useR! 2019 - Columbus Collaboratory
useR! is the main meeting for R enthusiasts, users (useRs) and developers in the world. This year the conference was hosted in the beautiful city of Toulouse in France. For those who don’t know, R is a free and open source programming language, originally intended for statistical computing. Although R was first released just over 26 years ago, it has become increasingly popular in the past decade, particularly due to the increased demand for computational and statistical analyses in nearly every domain. Personally, I have been using R for about 4.5 years, and this was my 5th R-related conference (I have attended the last 4 consecutive R/Bioconductor conferences).
At Columbus Collaboratory (CC), we are very enthusiastic about open source software and the R language. In fact, our data science team is entirely made up of useRs. As such, CC sent three data scientists to useR: Abbas Rizvi (me), Katie Sasso-Schafer, and Slava Nikitin. Katie and I contributed to the conference through a poster and a lightning talk, respectively. I will get more into these later in this post.
Day 1 - Tutorials
Katie and I went to the tutorial sessions on Day 1. We went to Package Development, which was led by the tidyverse dev team (Hadley Wickham, Jenny Bryan, and Jim Hester). We felt that this was an incredible opportunity to be instructed by some of modern-day R’s most influential contributors. Personally, I have been developing packages for about 2 years now, and I had a reasonable workflow down that would enable me to take my code on a project and develop it into a better-defined R package.
However, I was very excited to learn from the leaders of this field and see how they build packages. Jenny Bryan began by having us build a simple package using usethis. Jim Hester showed us a very simple example of the often neglected, but mightily important, concept of unit testing.
Abbas and Katie with the tidyverse development team during a coffee break! From left to right: Jim Hester, Katie Sasso-Schafer, Abbas Rizvi, Hadley Wickham, Jenny Bryan
While I was quite familiar with devtools, I had never really been exposed to usethis. This tutorial was actually quite eye-opening. I have already integrated usethis into my workflow, and when teaching those unfamiliar with package development I now have them start with usethis. It makes building a package incredibly easy, user-friendly, and much less error-prone compared to my old workflow. Hadley Wickham ended the tutorial with best practices for documenting your package. I found it quite fascinating that he said something along the lines of “building a package is easy, but being able to communicate and get people to use your work effectively is a lifelong challenge” (this is not a direct quote). I really liked that sentiment because when it comes to software, communication is often neglected and documentation is often poor, and the downstream consequence is that the message is lost and the software is rendered useless.
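For readers who have not tried it, here is a minimal sketch of what a usethis-driven workflow can look like. It is not the tutorial's actual material: the package name mypkg, the greet function, the temporary directory, and the choice of testthat as the testing framework are all assumptions made for illustration.

```r
# Sketch only: "mypkg" and greet() are hypothetical names for illustration.
library(usethis)

# Build the package skeleton in a throwaway location.
pkg_path <- file.path(tempdir(), "mypkg")
create_package(pkg_path, open = FALSE)
proj_set(pkg_path)  # make the new package the active usethis project

# Create R/greet.R, then fill it with a small documented function.
use_r("greet", open = FALSE)
writeLines(
  c(
    "#' Greet someone by name",
    "#' @param name A character string.",
    "#' @export",
    "greet <- function(name) paste0(\"Hello, \", name, \"!\")"
  ),
  file.path(pkg_path, "R", "greet.R")
)

# Set up testing infrastructure (assumed here to be testthat) and add a test file.
use_testthat()
use_test("greet", open = FALSE)
writeLines(
  c(
    "test_that(\"greet() builds a greeting\", {",
    "  expect_equal(greet(\"useR\"), \"Hello, useR!\")",
    "})"
  ),
  file.path(pkg_path, "tests", "testthat", "test-greet.R")
)

# From here the usual devtools steps would apply, for example:
# devtools::document(pkg_path)
# devtools::test(pkg_path)
# devtools::check(pkg_path)
```

The appeal is that each use_*() call writes the file or configuration it is named after, so there is much less to remember or mistype than when setting up the directory structure by hand.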
Using Python within RStudio
The second tutorial I went to was called Keeping an exotic pet in your home! Taming Python to live in RStudio because sometimes the best language is both! This was primarily an introductory tutorial on the reticulate package. It interactively taught attendees how to pass objects between R and Python sessions, how to use Python and Python methods in R code, and how to integrate Python code snippets into one’s R workflow and RMarkdown documentation. While I did already have a decent amount of Python experience (several graduate-level classes and smaller projects), I did find this tutorial quite useful. The presenter and TAs were very knowledgeable and helpful. I potentially have an upcoming project at CC where I may have to use Python at a client’s request, and I will definitely be looking towards adding reticulate into my workflow.
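To give a flavor of passing objects back and forth, here is a small sketch using reticulate. It assumes a Python installation that reticulate can find, and the variable names (squares, r_values, total) are made up for this example rather than taken from the tutorial.

```r
# Sketch only: assumes reticulate can locate a working Python installation.
library(reticulate)

# Run a Python snippet; its variables live in Python's main module.
py_run_string("squares = [i ** 2 for i in range(5)]")

# Pull the Python object back into R through the `py` object.
squares_in_r <- py$squares

# Push an R vector into Python, then use it from Python code.
py$r_values <- c(1.5, 2.5, 3.0)
py_run_string("total = sum(r_values)")
total_in_r <- py$total

# Call a Python built-in function directly from R.
builtins <- import_builtins()
n_items <- builtins$len(squares_in_r)
```

In an RMarkdown document the same exchange happens between chunks: R chunks reach Python objects through py, and Python chunks reach R objects through r.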
Day 2 - Conference
Gala Dinner: Cité de l’espace
The Gala Dinner at useR was at Cité de l’espace (City of Space), a scientific discovery center focused on space. The venue was small but jam-packed with all sorts of space-related exhibits. We saw mock-up control rooms and mock-up launch vehicles (Ariane 5), and were able to sit in space capsules. The dinner was a very good time, with a lot of good food and beverages and R-related conversation. We met some really awesome people and it was quite enjoyable.
Katie, Slava, and Abbas getting to embark on a journey in outer space!
Day 3 - Conference
R Implementation, Optimization and Tools (RIOT) Workshop. This workshop brought together members of the R Core team and some major contributors to the deep internals of R. As Slava put it, “this workshop is where R is forged”. I sat in on about 3 talks at this workshop, and I believe Slava sat in on a few more than I did. It was quite fascinating.
The talks I remember:
- Radford M. Neal presented his pqR (pretty quick R), which is a new version of the R interpreter. It extends the R language and can automatically run tasks such as numerical computations in parallel on multi-core processors.
- Jan Vitek - R Melts Brains - talked about his decade-long journey designing a new byte code format for R, going from LLVM to ASTs (using Java), to a year off, back to LLVM, and now to static single assignment (SSA) form for the intermediate representation (IR) using PIR. I’m not really sure what all this stuff meant, but it sounded really difficult.
Keynote - How Bioconductor advances science and contributes to R
A very special highlight of useR for me was Martin Morgan’s talk on Bioconductor. One may think I am very biased here because, if you know me, my dissertation and previous work before CC was in the field of genomics, but really it’s because I personally know him and think very highly of him. He was one of my mentors from my days at Roswell Park Cancer Institute (Buffalo, NY) and I have maintained a relationship with him since. Martin is on the R core team and a director of the Bioconductor R repository.
This was mostly a high-level talk, introducing R as a fancy calculator and showing how individuals can actually impact the community with their personality and expressiveness. He went over the rigorous Bioconductor package submission process, support, release cycles, the R classes/objects used, and how to represent very large data.
Martin giving gwasurvivr a shoutout!
The most surprising part of his talk was when he gave the R package I developed during my PhD a shoutout! I was super excited and blown away. What a guy!
Very briefly, Katie’s poster was titled caRReRR: How to transition from academia to industry. Here she discussed how to use one’s academic background (master’s or doctoral level training) and build upon those skills to get an industry position. Katie covered topics such as how one should create a focused resume that highlights skills. She had quite a few people stop by her poster and hear her take on how one should go about making this non-trivial transition.
Katie presenting her poster at useR2019!
Abbas chatting with Martin Morgan and Slava chatting with Jan Vitek
I spent most of the early part of the poster session talking with Martin. Knowing that he had just come from the R core meeting a few days prior, I was trying to get the inside scoop. I talked to him about the RIOT talks and he referred me to R internals to get some insight into what these guys were talking about. We talked about a lot of deep R topics, but the most fun point he told me about was how R core was discussing stringsAsFactors=TRUE as the default setting, and that it was a major topic of discussion during the R core meeting. The biggest barrier to removing this feature would really be legacy code: how many packages would break, and would it be worth making such a significant change within R itself? Martin was telling me that they were thinking deeply about this.
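For anyone who has not been bitten by this setting, here is a small sketch of what it changes. The data frame and its columns are invented for illustration, and the argument is set explicitly in both calls so the result does not depend on which default your version of R uses.

```r
# Hypothetical data, only to show the effect of the argument.
cities <- c("Toulouse", "Columbus", "Buffalo")

# With stringsAsFactors = TRUE, character columns become factors.
df_factor <- data.frame(city = cities, visits = c(1, 10, 5),
                        stringsAsFactors = TRUE)

# With stringsAsFactors = FALSE, they stay as plain character vectors.
df_char <- data.frame(city = cities, visits = c(1, 10, 5),
                      stringsAsFactors = FALSE)

class(df_factor$city)  # "factor"
class(df_char$city)    # "character"

# A common surprise: a factor's levels are sorted, so its underlying
# integer codes follow alphabetical order, not the original input order.
levels(df_factor$city)
as.integer(df_factor$city)
```

Code that silently relies on getting factors (or on not getting them) is exactly the kind of legacy code that a change of default could break.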
Day 4 - Conference
Photon Lightning Talk
My talk was on the latest development of photon. Photon is a side project that CC analytics has been working on for about a year and a half. In fact, Katie presented the preliminary version of photon last year at useR2018. photon is an R package where the main feature is a miniUI RStudio add-in (essentially a form that a user populates in RStudio) that allows users on Mac and Windows to build an Electron-Shiny app with minimal effort. My contribution to this project has been the development of the R package and miniUI, while Slava and Pete Gordon came up with the idea and implemented most of photon’s core. We had a lot of interest from this talk and many useRs were excited to try it.
Shoutout from RStudio’s Romain Francois!
So I didn’t really talk too much about any of the other talks that occurred throughout this conference. There was a plethora of information and packages being presented at useR2019! It was quite amazing. I’ll just quickly list a few things that I thought were interesting.
data.table, which has been the center of a strange debate in the R community the past few months, was presented in the optimization track by Arun Srinivasan. I thought this was an amazing talk, really highlighting how awesome this package is and that the syntax is actually quite intuitive for the R user.
There were many talks on shiny. Joe Cheng’s shinymeta keynote was pretty interesting, highlighting how to make shiny work more reproducible and reveal the underlying R code. golem was introduced by the ThinkR folks, exhibiting how we can modularize shiny apps and have a package that does more of the setup work for you.
I think my favorite part of this entire conference, though, was the atmosphere. You can really interact with some major contributors to the R and data science community and there are really no big egos. It’s a fun environment where everyone is just excited about using R and making it easier and better. So, long story short, useR2019! was an excellent experience and I look forward to going to more R conferences in the future.