Step-by-step Contribution Guide
This document contains instructions for collaborating on the different libraries of Nixtla.
Diving into a new technology can be challenging and overwhelming. We’ve been there too, and we’re ready to help with any issues you encounter while following these steps. Don’t hesitate to reach out to us on Slack: ping Fede and she’ll be glad to assist you.
Table of Contents 📚
- Prerequisites
- Git fork-and-pull workflow
- Set Up a Conda Environment
- Install required libraries for development
- Start editable mode
- Set Up your Notebook based development environment
- Start Coding
- Example with Screen-shots
Prerequisites
- GitHub: You should already have a GitHub account and a basic understanding of its functionalities. If not, check this guide.
- Python: Python should be installed on your system. If not, check this guide.
- conda: You need to have conda installed, along with a good grasp of fundamental operations such as creating, activating, and deactivating environments. If not, check this guide.
Git fork-and-pull workflow
1. Fork the Project: Start by forking the Nixtla repository to your own GitHub account. This creates a personal copy of the project where you can make changes without affecting the main repository.
2. Clone the Forked Repository: Clone the forked repository to your local machine so that you can work with the code directly on your system:
git clone https://github.com/<your-username>/nixtla.git
3. Create a Branch: Branching in Git is a key strategy for managing and isolating changes to your project. It lets you separate work on different features, fixes, and issues without interfering with the main, production-ready codebase.
- Main Branch: The default branch with production-ready code.
- Feature Branches: For new features, create branches prefixed with ‘feature/’, for example:
git checkout -b feature/new-model
- Fix Branches: For bug fixes, use the ‘fix/’ prefix, for example:
git checkout -b fix/forecasting-bug
- Issue Branches: For specific issues, use either of:
git checkout -b issue/issue-number
git checkout -b issue/issue-description
After testing, branches are merged back into the main branch via a pull request and then typically deleted to keep the repository clean. You can read more about GitHub and branching here.
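The naming convention above can be captured in a few lines of Python. This is only an illustrative sketch: the helper branch_name and its slug rules are assumptions made for this example and are not part of any Nixtla tooling.

```python
import re

# Prefixes named in this guide; the helper itself is hypothetical.
PREFIXES = ("feature", "fix", "issue")


def branch_name(kind: str, description: str) -> str:
    """Build a branch name such as 'feature/new-model' from free text."""
    if kind not in PREFIXES:
        raise ValueError(f"kind must be one of {PREFIXES}, got {kind!r}")
    # Lowercase, then collapse every run of non-alphanumerics into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    return f"{kind}/{slug}"


print(branch_name("feature", "New model"))
print(branch_name("fix", "Forecasting bug"))
```

The printed names can be passed to git checkout -b as in the list above.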
Set Up a Conda Environment
If you want to use Docker or Codespaces, let us know by opening an issue and we will set you up.
Next, you’ll need to set up a Conda environment. Conda is an open-source package management and environment management system that runs on Windows, macOS, and Linux. It allows you to create separate environments containing files, packages, and dependencies that will not interact with each other.
First, make sure you have Anaconda or Miniconda installed on your system. Alternatively, check out these guides: Anaconda, Miniconda, and Mamba.
Then create a new environment using:
conda create -n nixtla-env python=3.10
You can also create the environment with mamba, which is faster than Conda:
mamba create -n nixtla-env python=3.10
You can replace nixtla-env with something more meaningful to you, e.g. statsforecast-env or mlforecast-env. You can always check the list of environments on your system using:
conda env list
Activate your new environment with:
conda activate nixtla-env
Install required libraries for development
The environment.yml file contains all the dependencies required for the project. To install these dependencies, use the mamba package manager, which offers faster package installation and environment resolution than Conda. If you haven’t installed mamba yet, you can do so using:
conda install mamba -c conda-forge
Run the following command to install the dependencies:
mamba env update -f environment.yml
In some libraries (e.g. StatsForecast), environment.yml is inside a folder called dev. In that case, run:
mamba env update -f dev/environment.yml
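Because the environment file can live in either of the two places mentioned above, a small lookup can tell you which path to pass to mamba. This is a sketch under that assumption only; find_environment_file is a hypothetical helper, not something shipped with the libraries.

```python
from pathlib import Path
from typing import Optional


def find_environment_file(repo_root: str) -> Optional[Path]:
    """Return the environment.yml of a cloned repo, or None if there is none."""
    root = Path(repo_root)
    # Check the repository root first, then the dev folder.
    for candidate in (root / "environment.yml", root / "dev" / "environment.yml"):
        if candidate.is_file():
            return candidate
    return None


# Run from the root of your clone; prints None when no file is found.
print(find_environment_file("."))
```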
Start editable mode
Install the library in editable mode using:
pip install -e ".[dev]"
This means the package is linked directly to the source code, allowing any changes made to the source code to be immediately reflected in your Python environment without the need to reinstall the package. This is useful for testing changes during package development.
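One way to confirm that the editable install took effect is to ask Python where it would load the package from. With an editable install, the path should normally point into your clone rather than into site-packages. The helper module_location is an assumption for this sketch, and json is used only as a stand-in so the example runs anywhere; replace it with the library you installed, for example mlforecast.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def module_location(name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, if any."""
    spec = importlib.util.find_spec(name)
    # Built-in, frozen, and namespace packages have no file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


# Stand-in module; use the name of the library you are developing instead.
print(module_location("json"))
```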
Set Up your Notebook based development environment
Notebook-based development refers to using interactive notebooks, such as Jupyter Notebooks, for coding, data analysis, and visualization. Here’s a brief description of its characteristics:
- Interactivity: Code in notebooks is written in cells which can be run independently. This allows for iterative development and testing of small code snippets.
- Visualization: Notebooks can render charts, tables, images, and other graphical outputs within the same interface, making it great for data exploration and analysis.
- Documentation: Notebooks support Markdown and HTML, allowing for detailed inline documentation. Code, outputs, and documentation are in one place, which is ideal for tutorials, reports, or sharing work.
For notebook-based development you’ll need nbdev and a notebook editor (such as VS Code, Jupyter Notebook, or Jupyter Lab). nbdev and jupyter were installed in the previous step. If you use VS Code, follow this tutorial.
nbdev makes debugging and refactoring your code much easier than in traditional programming environments since you always have live objects at your fingertips. nbdev also promotes software engineering best practices because tests and documentation are first class.
All your changes must be written in the notebooks contained in the library (under the nbs directory). Once a specific notebook is open (more details to come), you can write your Python code in cells within the notebook, as you would in a traditional Python development workflow. You can break down complex problems into smaller parts, visualize data, and document your thought process. Along with your code, you can include markdown cells to add documentation directly in the notebook, such as explanations of your logic and usage examples. nbdev also allows you to write tests inline with your code: after writing a function, you can immediately write tests for it in the following cells.
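As an illustration of the inline-test pattern, the two "cells" below show a function followed directly by its tests. The function moving_average is a made-up example and does not come from any Nixtla library; in a real notebook each commented section would be its own cell.

```python
# Cell 1: the function under development (hypothetical example).
def moving_average(values, window):
    """Return the mean of every consecutive run of `window` values."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Cell 2: tests written right after the function, as plain assertions.
assert moving_average([1, 2, 3, 4], 2) == [1.5, 2.5, 3.5]
assert moving_average([5], 1) == [5.0]
assert moving_average([1, 2], 3) == []  # fewer values than the window
```

If an assertion fails, the cell raises an error right where the function is defined, which makes the cause easy to spot.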
Once your code is ready, nbdev can automatically convert your notebook into Python scripts. Code cells are converted into Python code, and markdown cells into comments and docstrings.
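To give a rough idea of what such a conversion involves, the sketch below reads a notebook file (which is plain JSON) and collects the code cells marked with nbdev's `#| export` directive. It is a deliberately simplified illustration and not nbdev's actual implementation; the function name exported_source and the tiny demo notebook are assumptions made for this example.

```python
import json


def exported_source(notebook_json: str) -> str:
    """Join the source of all code cells that start with '#| export'."""
    notebook = json.loads(notebook_json)
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        lines = text.splitlines()
        # Accept both '#| export' and '#|export' on the first line.
        if lines and lines[0].replace(" ", "") == "#|export":
            chunks.append("\n".join(lines[1:]).strip("\n"))
    return "\n\n".join(chunks)


# Minimal in-memory notebook: one markdown cell, one exported cell, one test cell.
demo = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Utilities"]},
        {"cell_type": "code", "source": ["#| export\n", "def double(x):\n", "    return 2 * x"]},
        {"cell_type": "code", "source": "assert double(2) == 4"},
    ]
})
print(exported_source(demo))
```

In this sketch the test cell is left out of the result because it carries no export directive, so only the function definition is collected.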
Start Coding
Open a Jupyter notebook using jupyter lab (or VS Code).
- Make Your Changes: Make changes to the codebase, ensuring your changes are self-contained and cohesive.
- Commit Your Changes: Add the changed files, then commit them. Please use Conventional Commits for the commit message:
git add [your_modified_file_0.ipynb] [your_modified_file_1.ipynb]
git commit -m "<type>: <Your descriptive commit message>"
- Push Your Changes: Push your changes to the remote repository on GitHub with:
git push origin feature/your-feature-name
- Open a Pull Request: Open a pull request from your new branch on the Nixtla repository on GitHub. Provide a thorough description of your changes when creating the pull request.
- Wait for Review: The maintainers of the Nixtla project will review your changes. Be ready to iterate on your contributions based on their feedback.
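If you want to check a commit subject against the Conventional Commits format before committing, a small pattern match is enough for the common cases. The sketch below is an approximation: the list of allowed types is an assumption (projects may accept a different set), and is_conventional is a hypothetical helper rather than a tool used by the maintainers.

```python
import re

# Commonly used types; the exact set a project accepts may differ (assumption).
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "build", "ci", "chore", "revert")

# <type>[(scope)][!]: <description>
PATTERN = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def is_conventional(subject: str) -> bool:
    """Return True when the subject line follows the pattern above."""
    return PATTERN.match(subject) is not None


for subject in ("feat: add new model", "Updated stuff"):
    print(subject, "->", is_conventional(subject))
```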
Remember, contributing to open-source projects is a collaborative effort. Respect the work of others, welcome feedback, and always strive to improve. Happy coding!
Nixtla can assist contributors with stipends for computing infrastructure. If you are interested, please join our Slack and write to Fede or Max.
You can find a detailed step-by-step guide with screenshots below.
Example with Screen-shots
1. Create a fork of the mlforecast repo
The first thing you need to do is create a fork of the GitHub repository to your own account:
Your fork on your account will look like this:
In that repository, you can make your changes and then request to have them added to the main repo.
2. Clone the repository
In this tutorial, we are using macOS (the steps also work on Linux distributions). If you are a collaborator of Nixtla, you can request an AWS instance to collaborate from there. If this is the case, please reach out to Max or Fede on Slack to receive the appropriate access. We also use Visual Studio Code, which you can download from here.
Once the repository is created, you need to clone it to your own computer. Simply copy the repository URL from GitHub as shown below:
Then open Visual Studio Code, click on “Clone Git Repository,” and paste the line you just copied into the top part of the window, as shown below:
Select the folder where you want to copy the repository:
And choose to open the cloned repository:
You will end up with something like this:
3. Create the Conda environment
Open a terminal within Visual Studio Code, as shown in the image:
You can use conda, but we highly recommend using Mamba to speed up the creation of the Conda environment. To install it, run the following in the terminal you just opened:
conda install mamba -c conda-forge
Create an empty environment named mlforecast with the following command:
mamba create -n mlforecast python=3.10
Activate the newly created environment using:
conda activate mlforecast
Install the libraries listed in the environment file environment.yml using:
mamba env update -f environment.yml
Now install the library in editable mode, together with its additional development dependencies, using:
pip install -e ".[dev]"
4. Make the changes you want.
In this section, we assume that we want to increase the default number of windows used to create prediction intervals from 2 to 3. The first thing we need to do is create a specific branch for that change:
git checkout -b [new_branch]
Once created, open the notebook you want to modify. In this case, it’s nbs/utils.ipynb, which contains the metadata for the prediction intervals. After opening it, click on the environment you want to use (top right) and select the mlforecast environment:
Next, execute the notebook and make the necessary changes. In this case, we want to modify the PredictionIntervals class:
We will change the default value of n_windows from 2 to 3:
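A quick way to validate a changed default from a notebook cell is to read it back from the signature. The sketch below uses a stand-in class so that it runs on its own: ExampleIntervals, its parameters, and the helper default_of are all hypothetical, and the parameter name n_windows is an assumption. In the notebook you would pass the real class instead.

```python
import inspect


def default_of(callable_obj, parameter: str):
    """Return the default value declared for one parameter of a callable."""
    return inspect.signature(callable_obj).parameters[parameter].default


# Hypothetical stand-in so the example is self-contained; this is not the
# mlforecast source code.
class ExampleIntervals:
    def __init__(self, n_windows: int = 3, h: int = 1):
        self.n_windows = n_windows
        self.h = h


# The check you would keep in a test cell after editing the default.
assert default_of(ExampleIntervals, "n_windows") == 3
print(default_of(ExampleIntervals, "n_windows"))
```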
Once you have made the change and performed any necessary validations, it’s time to convert the notebook to Python modules. To do this, run the following in the terminal:
nbdev_export
You will see that the mlforecast/utils.py file has been modified (the changes from nbs/utils.ipynb are reflected in that module). Before committing the changes, clean the notebooks and verify that the linters pass:
./action_files/clean_nbs
./action_files/lint
Once you have done the above, add the changes using:
git add nbs/utils.ipynb mlforecast/utils.py
Create a descriptive commit message for the changes using:
git commit -m "[description of changes]"
Finally, push your changes using:
git push
5. Create a pull request.
In GitHub, open your fork of the original repo. Once inside, you will see the changes you just pushed. Click on “Compare & pull request”:
Include an appropriate title for your pull request and fill in the necessary information. Once you’re done, click on “Create pull request”.
Finally, you will see something like this:
Notes
- This file was generated using this file. Please change that file if you want to enhance the document.