Reproducibility is about results that can be obtained by someone else (or you in the future) given the same data and the same code. This is a technical problem.
We talk about computational reproducibility.
Each degree of reproducibility requires additional skills and time. While some of those skills (e.g. literate programming, version control, setting up environments) pay off in the long run, they can require a high up-front investment.
According to Wilson et al. (2017), good practices for better reproducibility can be organized into the following six topics:
Data management
Project organization
Tracking changes
Collaboration
Manuscript
Code & Software
Questions
Which version of analyses.R is the final one? Which version of data.csv?
We need a tool that deals with versions for us.
git is a Version Control System (VCS): it records the successive versions of a project so that any of them can be inspected or restored.
git and GitHub are not the same thing
git is free and open-source software.
GitHub (and co) is a web platform to host and share projects tracked by git.
In other words: you do not need GitHub to use git, but you cannot use GitHub without using git.
git is a command-line interface (CLI) tool: you use git from a terminal (e.g. git status / log / add / commit).
But a lot of third-party tools provide a graphical interface to git (e.g. RStudio, GitKraken, GitHub Desktop, extensions for VSCode, VSCodium, neovim, etc.).
Just keep in mind that for some operations you will need to use the terminal.
Git main panel
Stage files, view differences and commit changes
View history and versions
Check out and follow the tutorial from N. Casajus, 2024, Setting up R: https://frbcesab.github.io/rsetup/.
git for tracking changes
How does git work?
git takes a sequence of snapshots (unlike file hosting services like Dropbox, Google Drive, etc.).
In the git universe, a snapshot is a version, i.e. the state of the whole project at a specific point in time.
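A snapshot is a two-step process: first stage (select) the files to include, then commit them to create a new version, described by a commit message.
Initialize git in an (empty) folder (the repository)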
Add new files in the repository
Stage (select) one file
Stage (select) several files
Stage (select) all files
Commit changes to create a new version
Now we are up-to-date
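The steps above can be sketched in a terminal as follows. This is a minimal example run in a throwaway folder; the file contents, the identity (Jane Doe) and the commit message are hypothetical.

```shell
# Throwaway repository in a temporary folder (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Identity used for commits in this repository only (hypothetical values)
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"

# Add new files in the repository
echo "x <- 1" > analyses.R
echo "a,b" > data.csv
echo "# My project" > README.md

# Stage (select) one file
git add analyses.R
# Stage (select) several files
git add data.csv README.md
# Stage (select) all files
git add .

# Commit changes to create a new version
git commit --quiet -m "Add analysis script, data and README"

# Now we are up-to-date: nothing is left to stage or commit
git status --short
```

Staging the same file twice is harmless, so the three git add variants can be run one after the other here.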
When committing a new version (w/ git commit), the following information must be added:
WHO - the person who has made the changes (recorded by git)
WHEN - the date of the commit (recorded by git)
WHAT - the files that have been modified (selected with git add)
WHY - the reason of the commit, i.e. what has been done compared to the previous version (written with git commit)
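As an illustration, the four pieces of information can be read back from the last commit. This is a sketch in a throwaway repository; the identity, file name and message are hypothetical, and the format placeholders (%an, %ad, %s) are standard git log options.

```shell
# Throwaway repository (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"                 # WHO (hypothetical identity)
git config user.email "jane.doe@example.com"

echo "x <- 1" > analyses.R
git add analyses.R                                  # WHAT: files selected for the version
git commit --quiet -m "Add first analysis script"   # WHY: the commit message

# Read the information back from the last commit
who="$(git log -1 --format=%an)"                    # author name
when="$(git log -1 --format=%ad --date=short)"      # commit date, added by git
what="$(git show --name-only --format= HEAD)"       # files changed by the commit
why="$(git log -1 --format=%s)"                     # first line of the message
printf 'WHO:  %s\nWHEN: %s\nWHAT: %s\nWHY:  %s\n' "$who" "$when" "$what" "$why"
```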
1. Undo recent, uncommitted and unstaged changes
You have modified a file but have not staged changes and you want to restore the previous version
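One possible way to do this is git restore (available in recent versions of git; older versions use git checkout -- <file>). A minimal sketch in a throwaway repository, with hypothetical file content and identity:

```shell
# Throwaway repository with one committed file (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"
echo "x <- 1" > analyses.R
git add analyses.R
git commit --quiet -m "Add analysis script"

# Modify the file without staging the change
echo "x <- 2" > analyses.R
git status --short

# Discard the modification and restore the last committed version
git restore analyses.R
cat analyses.R
```

Be careful: the discarded modification was never committed, so git cannot bring it back.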
2. Unstage uncommitted files
You have modified and staged file(s) but have not committed the changes yet, and you want to unstage the file(s) and restore the previous version
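A possible sequence for this case, again with git restore (recent versions of git): first unstage the file, then restore its content. The repository, file content and identity below are hypothetical.

```shell
# Throwaway repository with one committed file (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"
echo "x <- 1" > analyses.R
git add analyses.R
git commit --quiet -m "Add analysis script"

# Modify the file and stage the change
echo "x <- 2" > analyses.R
git add analyses.R

# Step 1: unstage the file (the modification stays in the working folder)
git restore --staged analyses.R

# Step 2: restore the last committed version of the file
git restore analyses.R
cat analyses.R
```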
3. Revert one commit
You want to reverse the effects of a commit: use git revert
# Print git history
git log --oneline
# f960dd3 (HEAD -> main) commit 4
# dd4472c commit 3
# 2bb9bb4 commit 2
# 2d79e7e commit 1
git revert does not alter the history: it creates a new commit that undoes the changes.
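To make this concrete, here is a sketch that rebuilds a four-commit history in a throwaway repository and reverts the third commit. The file names (file1.txt, etc.) and the identity are hypothetical; each commit touches its own file so that the revert applies cleanly.

```shell
# Throwaway repository with four commits (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"
for i in 1 2 3 4; do
  echo "content $i" > "file$i.txt"
  git add "file$i.txt"
  git commit --quiet -m "commit $i"
done

# Reverse the effects of "commit 3" (HEAD is commit 4, HEAD~1 is commit 3)
git revert --no-edit HEAD~1

# The four original commits are still there, plus one new "Revert" commit
git log --oneline
```

A commit hash taken from git log --oneline can be used in place of HEAD~1.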
4. Deleting commits
You want to delete one or more commits: use git reset --hard
# Print git history
git log --oneline
# f960dd3 (HEAD -> main) commit 4
# dd4472c commit 3
# 2bb9bb4 commit 2
# 2d79e7e commit 1
git reset --hard alters the history. Be careful with this command.
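For comparison, a sketch of git reset --hard on the same kind of four-commit history, in a throwaway repository with hypothetical file names and identity. Here the last two commits are deleted.

```shell
# Throwaway repository with four commits (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"
for i in 1 2 3 4; do
  echo "content $i" > "file$i.txt"
  git add "file$i.txt"
  git commit --quiet -m "commit $i"
done

# Go back to "commit 2": commits 3 and 4 disappear from the history
# and their files are removed from the working folder
git reset --quiet --hard HEAD~2

git log --oneline
ls
```

If the deleted commits were already pushed, the local history no longer matches the remote one, which is one reason to prefer git revert on shared projects.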
GitHub and co are cloud-based git repository hosting services
Perfect solutions to collaborate on projects tracked by git
Add a new file: README.md
Stage changes
Commit changes
Push changes to remote
Pull changes from remote
Make local changes
Stage changes
Commit changes
Don’t forget to push changes to remote
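The whole cycle can be tried without any account. In the sketch below, a bare repository on disk stands in for the GitHub remote (an assumption made so the example runs offline), and two folders named alice and bob play the role of two collaborators (hypothetical names).

```shell
work="$(mktemp -d)"
cd "$work"

# A bare repository on disk stands in for the remote (assumption)
git init --quiet --bare remote.git

# First collaborator: add README.md, stage, commit and push
git init --quiet alice
cd alice
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin "$work/remote.git"
echo "# Project" > README.md
git add README.md
git commit --quiet -m "Add README"
git push --quiet -u origin HEAD

# Second collaborator: get the project, make local changes, stage, commit, push
cd "$work"
git clone --quiet remote.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo "More details" >> README.md
git add README.md
git commit --quiet -m "Describe the project"
git push --quiet

# First collaborator: pull changes from remote
cd "$work/alice"
git pull --quiet
cat README.md
```

With GitHub, the only difference is the address of the remote, which is the URL of the repository instead of a local path.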
When you try to push, you might see the following error message:
git push
# To github.com:ahasverus/projectname.git
# ! [rejected] main -> main (fetch first)
#
# error: failed to push some refs to 'github.com:ahasverus/projectname.git'
#
# hint: Updates were rejected because the remote contains work that you do
# hint: not have locally. This is usually caused by another repository pushing
# hint: to the same ref. You may want to first integrate the remote changes
# hint: (e.g., 'git pull ...') before pushing again.
# hint: See the 'Note about fast-forwards' in 'git push --help' for details.
Just git pull and try to git push again.
When you try to pull, you might see the following error message:
git pull
# [...]
# Auto-merging README.md
# CONFLICT (content): Merge conflict in README.md
#
# error: could not apply b8302e6... edit README
#
# hint: Resolve all conflicts manually, mark them as resolved with
# hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
# hint: You can instead skip this commit: run "git rebase --skip".
# hint: To abort and get back to the state before "git rebase",
# hint: run "git rebase --abort".
Welcome to the wonderful world of git conflicts
A git conflict appears when two versions cannot be merged by git because changes have been made to the same lines.
You have to decide which version you want to keep.
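A conflict can be reproduced locally to practice. The sketch below uses two branches and git merge instead of a pull (a simplification: the conflict markers and the resolution steps are the same idea, but the messages differ from a pull with rebase). The branch name colleague, the file content and the identity are hypothetical.

```shell
# Throwaway repository (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Jane Doe"
git config user.email "jane.doe@example.com"
echo "Title: draft" > README.md
git add README.md
git commit --quiet -m "Add README"
base="$(git rev-parse --abbrev-ref HEAD)"   # main or master, depending on the git setup

# A colleague changes the first line on another branch
git checkout --quiet -b colleague
echo "Title: version B" > README.md
git commit --quiet -am "Colleague edits the title"

# Meanwhile, the same line is changed on the main branch
git checkout --quiet "$base"
echo "Title: version A" > README.md
git commit --quiet -am "I edit the title"

# git cannot merge the two versions: the command stops with a conflict
# ('|| true' only prevents a script from stopping here)
git merge colleague || true
conflicted="$(git diff --name-only --diff-filter=U)"   # files in conflict
cat README.md                                          # shows the conflict markers

# Decide which version to keep, then stage and commit to finish the merge
echo "Title: version A" > README.md
git add README.md
git commit --quiet -m "Resolve conflict: keep version A"
```

In practice the file is edited by hand (or in RStudio, VSCode, etc.) to remove the markers and keep the wanted lines.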
.gitignore
We can also tell git to ignore specific files: this is the purpose of the .gitignore file.
Which files? For instance:
Template for projects available here
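As an illustration, a small .gitignore could look like the sketch below. The patterns are hypothetical examples for an R project (session files, temporary files, a folder of large raw data) and should be adapted to each project.

```shell
# Throwaway repository (hypothetical project)
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Hypothetical patterns: adapt them to your own project
cat > .gitignore <<'EOF'
# R session files
.Rhistory
.RData
# Temporary files
*.tmp
# Large raw data kept outside version control
data/raw/
EOF

# Create some files, ignored and not ignored
mkdir -p data/raw
touch .Rhistory notes.tmp data/raw/big.csv analyses.R

# Only the files that are not ignored are listed
git status --short

# Ask git whether (and because of which pattern) a file is ignored
git check-ignore -v notes.tmp
```

The .gitignore file itself is usually staged and committed like any other file, so that all collaborators share the same rules.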
You can access millions of open source projects and contribute to their development.
And if your GitHub repository is public, everyone can use and contribute to your project.
pull
stage and commit
push
Many resources online.
Please contact me if you have any issue using git/GitHub: romain.frelat@fondationbiodiversite.fr