class: center, middle, inverse, title-slide

# Packages submission and reviews; how does it work?

### Lluís Revilla

Lluis_Revilla

---
name:intro

# Brief introduction

Goals of a *submission*

- Sharing something of quality that can be useful to others.
- Make it easier for others to build upon your package.
- Other: work, grant, prestige ...

???

Submissions are tough, especially when coming from places with poor training.
Lack of confidence/experience with reviews.

--

| Archives reviewing packages | Objectives of the reviews? |
|:---:|:---:|
| <a href=https://cran.r-project.org/>CRAN</a> | Non-trivial publication quality packages. |
| <a href=https://bioconductor.org/>Bioconductor</a> | Promote high-quality, well documented and interoperable software. |
| <a href=https://ropensci.org/>rOpenSci</a> | Drive the adoption of best practices with useful, transparent and constructive feedback. |

???

Differences in objectives, but all are looking for quality.

CRAN: Point errors, comments

Bioconductor: In-detail comments on style, classes, dependencies, structure…

rOpenSci: guideline for reviewers (about style, tests, functions, description, documentation, …)

CRAN ~16000 packages, Bioconductor ~2000, rOpenSci ~300

To work with these slides use `xaringan::infinite_moon_reader()`

---
name:projects
class: center

# Project differences

| | CRAN | Bioconductor | rOpenSci |
|:---------|:---:|:---:|:---:|
|Guides | <a href=https://cran.r-project.org/doc/manuals/r-release/R-exts.html>R-exts</a> | <a href=https://www.bioconductor.org/developers/package-guidelines>Website</a> | <a href=https://devguide.ropensci.org/index.html>Book</a> |
|Submit | <a href=https://cran.r-project.org/submit.html>tar.gz file</a> | <a href=https://github.com/Bioconductor/Contributions/>open an issue</a> | <a href=https://github.com/ropensci/software-review/>open an issue</a> |
|Review | email & ftp | GitHub | GitHub |
|Setup | None | SSH key, subscribe to mailing list | CI tests |
|Checks | check --as-cran | check; BiocCheck | check --as-cran |
|OS | Windows, Unix, macOS | Windows, Unix, macOS | Windows, Unix, macOS |
|Versions | oldrel, release, patched, devel | release, devel | oldrel, release, devel |
|Cycle | Always open | 2 annual releases | Always open |
|Editors | 0 | 0 | ~10 |
|Reviewers | <b>~5</b> | ~10 | Volunteers |

.middle[Different setup, different review.]

???

The different projects/archives have different setups.

*Read the table*

In all of them you first need to pass the automatic checks in place before a human looks at the submission.

Will use data from the three projects but mostly refer to CRAN.

---
name:submissions

# Submissions

<img src="index_files/figure-html/submissions-1.png" title="Three bar plots with new submissions, each bar is a month: on the left CRAN with 9 months collected, in the middle Bioconductor with 5 years of data, on the right rOpenSci with 6 years of data. CRAN has about 300 monthly submissions, Bioconductor 30, rOpenSci 10. Some variance can be observed, especially on Bioconductor and rOpenSci." alt="Three bar plots with new submissions, each bar is a month: on the left CRAN with 9 months collected, in the middle Bioconductor with 5 years of data, on the right rOpenSci with 6 years of data. CRAN has about 300 monthly submissions, Bioconductor 30, rOpenSci 10. Some variance can be observed, especially on Bioconductor and rOpenSci." width="1080" style="display: block; margin: auto;" />

.center[
CRAN data thanks to the [incoming dashboard](https://lockedata.github.io/cransays/articles/dashboard.html).
]

???

One order of magnitude of difference between each of them: CRAN > Bioconductor > rOpenSci.

High variability between months.

Also, very little data has been collected from CRAN so far.
(There are also some hiccups in the CRAN collection: near the end of May the cron job stopped working for a week.)

---
name:organization

# Organization

<img src="index_files/figure-html/cran-holidays-1.png" title="Line plot with the number of packages in CRAN's folders newbies and pretest from September 2020 to May 2021, counted hourly. Pretest is mainly below 10 packages and newbies around 70. There are some increases in newbies packages around October and after the CRAN holidays of December-January (marked in red). There are two spikes of packages in the pretest folder, one after the holidays and another one at the beginning of April." alt="Line plot with the number of packages in CRAN's folders newbies and pretest from September 2020 to May 2021, counted hourly. Pretest is mainly below 10 packages and newbies around 70. There are some increases in newbies packages around October and after the CRAN holidays of December-January (marked in red). There are two spikes of packages in the pretest folder, one after the holidays and another one at the beginning of April." width="1080" style="display: block; margin: auto;" />

.center[
Packages are moved by reviewers [between folders](https://llrs.dev/2021/01/cran-review/#cran-load).
]

???

Many folders, but these two are the most important.

There isn't an explanation from CRAN about how they work.

Pretest is for resubmissions (newer versions of packages) and also for newbies.

---

# Workload after holidays

<img src="index_files/figure-html/cran-holidays-zoom-1.png" title="A zoom of the previous plot showing only the packages in the CRAN queue after the holidays. The spike of pretest packages after the holidays is clearly seen (it reaches ~140 packages), followed by a sustained high number of packages in newbies (around 70 packages) until mid-February. At the beginning of April there is another spike of pretest packages, but newbies remain at 25 packages and pretest even lower." alt="A zoom of the previous plot showing only the packages in the CRAN queue after the holidays. The spike of pretest packages after the holidays is clearly seen (it reaches ~140 packages), followed by a sustained high number of packages in newbies (around 70 packages) until mid-February. At the beginning of April there is another spike of pretest packages, but newbies remain at 25 packages and pretest even lower." width="1080" style="display: block; margin: auto;" />

.center[Big volume of work! Patience!]

???

2 months to get back to normal for new packages.

Resubmissions of packages are served first.

---
name:submission-patterns

# Submissions patterns

<img src="index_files/figure-html/cran-day-month-1.png" title="Two plots with a loess estimation of the number of packages in CRAN's folders newbies and pretest. On the left by day of the month: newbies has a dip at the beginning of the month and around days 20-29 but is around 70 packages a day, while pretest is constant at around 50 packages each day. On the right the same data by day of the week: many packages at the beginning of the week and fewer on the weekend. Pretest packages fall from 50 to around 30, while newbies drop from 80 to 70." alt="Two plots with a loess estimation of the number of packages in CRAN's folders newbies and pretest. On the left by day of the month: newbies has a dip at the beginning of the month and around days 20-29 but is around 70 packages a day, while pretest is constant at around 50 packages each day. On the right the same data by day of the week: many packages at the beginning of the week and fewer on the weekend. Pretest packages fall from 50 to around 30, while newbies drop from 80 to 70." width="1080" style="display: block; margin: auto;" />

.center[
Check [dashboard](https://lockedata.github.io/cransays/articles/dashboard.html) before submitting?
]

???

Submit when you are ready: better in the queue than outside.

---
name:review-time

# Review time

<img src="index_files/figure-html/cran-review-1.png" title="Histogram of the time that a submission is in CRAN's queue. One big histogram from 0 to over 2000 hours, where most of them are below 500 h and decay in a logarithmic pattern. Above it a zoom on the first week, split by 24 h up to 168 h (1 week). Most submissions spend less than 24 h in the queue." alt="Histogram of the time that a submission is in CRAN's queue. One big histogram from 0 to over 2000 hours, where most of them are below 500 h and decay in a logarithmic pattern. Above it a zoom on the first week, split by 24 h up to 168 h (1 week). Most submissions spend less than 24 h in the queue." width="1080" style="display: block; margin: auto;" />

.center[Reviews are short, brief and to the point.]

???

Median time of submissions ~10 hours, mean time ~37.99 hours.

Summary in hours (minimum, 1st quartile, median, mean, 3rd quartile, maximum): 1, 2, 10, 37.99, 43, 1941

---

# Review speed

<img src="index_files/figure-html/cran-submission-time-1.png" title="A plot with the loess estimation of hours for submission on CRAN. One line if the package is new, another if it is an update. Updated packages are 5 hours in the queue, while new packages start from 160 hours, drop to 80 before the CRAN holidays (end of December and beginning of January), increase again after the holidays to around 120, and slowly decay until they reach 40 hours." alt="A plot with the loess estimation of hours for submission on CRAN. One line if the package is new, another if it is an update. Updated packages are 5 hours in the queue, while new packages start from 160 hours, drop to 80 before the CRAN holidays (end of December and beginning of January), increase again after the holidays to around 120, and slowly decay until they reach 40 hours." width="1080" style="display: block; margin: auto;" />

.center[Expect 3-7 days till your new package is on CRAN.]

???

Times differ: it can be shorter or longer. Most of the longer ones need resubmission.
Resubmit with a different version (it makes it easier to track how many resubmissions there are).

CRAN: 80h

Bioconductor: most of them in 1 month

rOpenSci: in 2 months (seeking 2 reviewers and posting the reviews).

```
## # A tibble: 2 x 2
##   new     time
##   <chr>  <dbl>
## 1 New     76.5
## 2 Update   5
```

---
name: users

# Users role

<img src="index_files/figure-html/users-plots-1.png" title="Two plots showing the number of actions done by users and on how many submissions they did them. On the left Bioconductor and on the right rOpenSci. The point size is according to how many users did so; there are two colors and shapes, one for regular users and one for editors (rOpenSci) or reviewers (Bioconductor). The most active people are core people from the project, but there are some regular users involved in many issues and doing many actions too." alt="Two plots showing the number of actions done by users and on how many submissions they did them. On the left Bioconductor and on the right rOpenSci. The point size is according to how many users did so; there are two colors and shapes, one for regular users and one for editors (rOpenSci) or reviewers (Bioconductor). The most active people are core people from the project, but there are some regular users involved in many issues and doing many actions too." width="1080" style="display: block; margin: auto;" />

.center[Some users are very involved.]

???

Bioconductor reviewers do a lot, rOpenSci editors too.

Both organizations have a group of users involved in the package review system, even if Bioconductor doesn't explicitly ask for reviewers from the community.

Bioconductor is now considering how to improve the review system.

Omitted bots: bioc-issue-bot and ropensci-review-bot (new March 2021).

---

# Comments

<img src="index_files/figure-html/comments-1.png" title="Four plots, in 2 rows and 2 columns, the first column for Bioconductor and the second for rOpenSci. The first row shows comments from reviewers in relation to authors' comments (almost linear relation). The second row shows other users' comments vs authors' comments. There is a linear relationship only on rOpenSci, as this includes the reviewers." alt="Four plots, in 2 rows and 2 columns, the first column for Bioconductor and the second for rOpenSci. The first row shows comments from reviewers in relation to authors' comments (almost linear relation). The second row shows other users' comments vs authors' comments. There is a linear relationship only on rOpenSci, as this includes the reviewers." width="1080" style="display: block; margin: auto;" />

.center[
A dialog between authors and reviewers & editors.
]

???

Non-reviewer users on Bioconductor still chime in to help.

---
name:bot

# Bot role

<img src="index_files/figure-html/bioc-issue-bot-1.png" title="Tile plot with rows showing the different messages from bioc-issue-bot and columns being each issue for Bioconductor. Each tile is colored by the number of times the bot posted the message. The plot shows how the bot changed with time and which is the most common feedback provided (in order of most feedback given): build results, valid push, received, accepted, reviewer assigned. And common errors: missing repository, repost, fix version, closing issue, lacking SSH key, multiple repositories detected..." alt="Tile plot with rows showing the different messages from bioc-issue-bot and columns being each issue for Bioconductor. Each tile is colored by the number of times the bot posted the message. The plot shows how the bot changed with time and which is the most common feedback provided (in order of most feedback given): build results, valid push, received, accepted, reviewer assigned. And common errors: missing repository, repost, fix version, closing issue, lacking SSH key, multiple repositories detected..." width="1080" style="display: block; margin: auto;" />

.center[The bot helps with the process and changes with the process.]

???

The bot provides feedback on many issues and actions performed.
It can be adapted to changes in requirements or errors.

rOpenSci is going to have a bot too: [ropensci-review-bot](https://github.com/ropensci-review-bot/).

---
exclude: true
name: labels

# Labels

<img src="index_files/figure-html/labels-1.png" title="Two tile plots showing labels related to the review process on the vertical axis and issues on the horizontal axis. On the left Bioconductor and on the right rOpenSci. Bioconductor shows many accepted packages, few declined and more inactive issues. The rOpenSci plot shows more labels, which allow one to better know the state of the review." alt="Two tile plots showing labels related to the review process on the vertical axis and issues on the horizontal axis. On the left Bioconductor and on the right rOpenSci. Bioconductor shows many accepted packages, few declined and more inactive issues. The rOpenSci plot shows more labels, which allow one to better know the state of the review." width="1080" style="display: block; margin: auto;" />

.bottom[
.center[
Labels are used to indicate progress on the submission.
]
]

???

On Bioconductor most problems with the submissions are not with the package itself but with not replying or choosing another venue.

rOpenSci provides more detailed questioning about the scope of a package.

---
name:success-submissions

# Success submissions

<img src="index_files/figure-html/cran_success-1.png" title="On the left a bar plot with package submissions to CRAN on the x axis and the number of packages on the vertical axis. The bars are colored by whether they are accepted or not. It is also split by new packages and updated packages. More new packages than updates are not accepted on the first try, but on resubmission they are accepted. The plot on the right shows the acceptance rate of CRAN for the range of dates from 2020/09 to 2021/06. Two lines: one for new submissions, which shows a consistent rate around 81%, and one for package updates, which is between 85% and 95% (until the time series gets too close for the review to be finished)." alt="On the left a bar plot with package submissions to CRAN on the x axis and the number of packages on the vertical axis. The bars are colored by whether they are accepted or not. It is also split by new packages and updated packages. More new packages than updates are not accepted on the first try, but on resubmission they are accepted. The plot on the right shows the acceptance rate of CRAN for the range of dates from 2020/09 to 2021/06. Two lines: one for new submissions, which shows a consistent rate around 81%, and one for package updates, which is between 85% and 95% (until the time series gets too close for the review to be finished)." width="1080" style="display: block; margin: auto;" />

.bottom[.center[High approval rates!!]]

???

Bioconductor & rOpenSci: 50%; some submissions are abandoned or do not fit the project.

New packages and older ones face different problems.

A more in-depth review requires 1 month for each reviewer.

```
## # A tibble: 35 x 7
##    submission_n new    Accepted     n  perc suspended perc_suspended
##    <fct>        <chr>  <lgl>    <int> <dbl>     <dbl>          <dbl>
##  1 1            New    FALSE      451 19.4        451          32.4
##  2 1            New    TRUE      1868 80.6          0           0
##  3 1            Update FALSE      494 12.2        494          35.5
##  4 1            Update TRUE      3551 87.8          0           0
##  5 2            New    FALSE      109 27.1        109           7.84
##  6 2            New    TRUE       293 72.9          0           0
##  7 2            Update FALSE      162  8.15       162          11.7
##  8 2            Update TRUE      1826 91.9          0           0
##  9 3            New    FALSE       26 22.4         26           1.87
## 10 3            New    TRUE        90 77.6          0           0
## # … with 25 more rows
```

```
## # A tibble: 2 x 3
##   Approved     n  perc
##   <chr>    <int> <dbl>
## 1 No         983 0.474
## 2 Yes       1092 0.526
```

```
## # A tibble: 1 x 4
##   `1. awaiting moderation` `2. review in progress` `3a. accepted` `3b. declined`
##                      <dbl>                   <dbl>          <dbl>          <dbl>
## 1                0.0000231                   0.222           35.9           19.1
```

|name                            | Median days| Total days|
|:-------------------------------|-----------:|----------:|
|1/editor-checks                 |         2.7|        2.7|
|2/seeking-reviewer(s)           |         2.1|        4.8|
|3/reviewer(s)-assigned          |         6.9|       11.7|
|4/review(s)-in-awaiting-changes |        26.3|       38.0|
|5/awaiting-reviewer(s)-response |        17.1|       55.1|
|6/approved                      |        12.7|       67.8|

---
name:submit

# Submit!

.pull-left[
.tip-submission[
Prepare

.center[
Manual to [create R packages](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Creating-R-packages), [R Packages](https://r-pkgs.org/)

Follow policies ([CRAN](https://cran.r-project.org/web/packages/policies.html)) and **guidelines** ([Bioconductor](https://www.bioconductor.org/developers/package-submission/), [rOpenSci](https://devguide.ropensci.org/)).
]
]
.tip-submission[
.center[
**Check**

[Use Rhub](https://builder.r-hub.io/), [GitHub Actions](https://github.com/r-lib/actions)]
]
.tip-submission[
.right[(Re)***Submit***]

.center[[Fix and explain](https://cran.r-project.org/web/packages/policies.html#Re_002dsubmission) on re-submission.]
]
]

???

Follow the detailed guidelines from Bioconductor and rOpenSci.

Fix any problem that you haven't detected previously (double check the policy on CRAN).

Resubmit

--

.pull-right[
.center[
# Thanks

R core and CRAN team, Bioconductor core, rOpenSci editors and reviewers
]
.bottom[
.center[***Q&A ?***

Some answers on [Lluís's blog](https://llrs.dev/post/) posts: [Bioconductor](https://llrs.dev/2020/07/bioconductor-submissions-reviews/), [rOpenSci](https://llrs.dev/2020/09/ropensci-submissions/), [CRAN](https://llrs.dev/2021/01/cran-review/).
]
]
]

???

Thanks also to the package authors (mainly tidyverse, ggplot2 and rhub, and gh).

Maëlle Salmon and Stephanie Locke for the dashboard.

rOpenSci review: [Video](https://www.youtube.com/watch?v=iJnn_9xKkqk)
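
The **Check** tip on this slide could be sketched locally in R before submitting. This is only a minimal sketch, not the procedure used by any of the archives: it assumes the `rcmdcheck` package is installed, `pkg_dir` is a hypothetical path, and `ready_to_submit()` is an illustrative helper invented for this note. It treats errors and warnings as blocking and leaves NOTEs for manual review, since a new submission usually gets at least one NOTE.

```r
# Illustrative helper (not part of any package): a check result is considered
# ready when it has no errors and no warnings. `res` is any list with the
# character vectors `errors`, `warnings` and `notes`, which is the shape of
# the object returned by rcmdcheck::rcmdcheck().
ready_to_submit <- function(res) {
  length(res$errors) == 0 && length(res$warnings) == 0
}

pkg_dir <- "path/to/mypackage"  # hypothetical path, replace with your package

# Only run the check when the directory exists and rcmdcheck is available.
if (dir.exists(pkg_dir) && requireNamespace("rcmdcheck", quietly = TRUE)) {
  # Same flag as listed in the project differences table: check --as-cran
  res <- rcmdcheck::rcmdcheck(pkg_dir, args = "--as-cran", error_on = "never")
  message("Ready to submit: ", ready_to_submit(res))
  message("NOTEs to review by hand: ", length(res$notes))
}
```

For Bioconductor the table also lists BiocCheck, which would be run in addition to this check.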