If you repeat a function call many times in R with the map or apply functions, you can speed it up by running the calls in parallel with the furrr package. In this episode of Code Club, Pat runs the run_ml function from mikropml 100 times with the future_map function so that it finishes faster while he evaluates different hyperparameter settings for a machine learning model. He describes how to set up parallelization and uses the tictoc package to compare the run time of the code in series versus in parallel. The data come from a microbiome study his lab published that looked for biomarkers associated with colorectal cancer.
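The serial-versus-parallel comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from the episode: slow_task is a hypothetical stand-in for a call to run_ml, and the choice of 8 repetitions and 4 workers is an assumption made only to keep the example quick.

```r
library(purrr)   # map()
library(furrr)   # future_map(); attaching furrr also attaches future, which provides plan()
library(tictoc)  # tic() / toc()

# Hypothetical stand-in for an expensive call such as run_ml():
# it sleeps for half a second and returns a value derived from its input.
slow_task <- function(seed) {
  Sys.sleep(0.5)
  seed^2
}

seeds <- 1:8  # assumed number of repetitions, kept small for the sketch

# Time the serial version
tic("serial")
serial <- map(seeds, slow_task)
toc()

# Start background R sessions; 4 workers is an assumption, adjust to your machine
plan(multisession, workers = 4)

# Time the parallel version; future_map() takes the same arguments as map()
tic("parallel")
parallel <- future_map(seeds, slow_task)
toc()

# Return to sequential execution and shut down the workers
plan(sequential)
```

Because future_map mirrors map, swapping one for the other is usually the only change needed in the mapping call itself. If the mapped function draws random numbers, furrr's documentation describes passing .options = furrr_options(seed = TRUE) to get parallel-safe random number streams.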
In this episode, Pat will use functions from the #furrr, #tictoc, and #mikropml R packages and data handling functions from dplyr and the rest of the tidyverse in #RStudio. The accompanying blog post can be found at https://www.riffomonas.org/code_club/....
If you're interested in taking an upcoming 3 day R workshop, email me at [email protected]!
R: https://r-project.org
RStudio: https://rstudio.com
Raw data: https://github.com/riffomonas/raw_dat...
Workshops: https://www.mothur.org/wiki/workshops
You can also find complete tutorials for learning R with the tidyverse using...
Microbial ecology data: https://www.riffomonas.org/minimalR/
General data: https://www.riffomonas.org/generalR/
0:00 Introduction
2:54 Timing serial execution of code
6:26 Parallelizing code with furrr package
12:56 Synthesizing 100 splits
15:08 Recap