1. Getting Started
This page covers the essential setup required for both the Data Owner (DO) and the Data Scientist (DS) to join the secure, distributed data science ecosystem.
1.1 Data Owner (DO) Setup: Hosting the Datasite
The DO prepares their machine to securely host the data and the RDS Dashboard.
| Step | Instruction | Command/Action |
|---|---|---|
| Install CLI | Install the SyftBox Command Line Interface for client interaction. | `curl -fSL https://syftbox.net/install.sh \| sh` |
| Authenticate | Log in with your email, enter the one-time password (OTP) you receive, then read your refresh token from the SyftBox config file. | `cat ~/.syftbox/config.json` |
| Deploy Dashboard | Run the RDS-Dashboard Docker container, using your email and token as environment variables. | See DO Quickstart for the full Docker command. |
| Access Dashboard | Verify the dashboard is running. | Navigate to `localhost:8000` |
| Upload Data | Create a dataset container by uploading your private data and accompanying mock data. | Dashboard > Datasets > Add a Dataset |
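
The Authenticate and Access Dashboard steps can also be checked from a script. The following is a minimal sketch, not part of SyftBox or the RDS Dashboard: the helper names are hypothetical, the key names `email` and `refresh_token` are assumptions about what `~/.syftbox/config.json` contains, and the `http://` scheme for the dashboard URL is assumed. Adjust these to match your own setup.

```python
import json
from pathlib import Path
from urllib.request import urlopen

# Config location shown in the Authenticate step.
DEFAULT_CONFIG = Path.home() / ".syftbox" / "config.json"

# ASSUMPTION: illustrative key names. Inspect your own config.json to see
# which names SyftBox actually writes.
ASSUMED_KEYS = ("email", "refresh_token")


def missing_config_keys(path=DEFAULT_CONFIG, required=ASSUMED_KEYS):
    """Return the required keys that are absent or empty in the config file."""
    with open(path, encoding="utf-8") as fh:
        config = json.load(fh)
    return [key for key in required if not config.get(key)]


def dashboard_is_up(url="http://localhost:8000", timeout=3.0):
    """Return True if an HTTP GET to the dashboard URL succeeds."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses.
        return False
```

Calling `missing_config_keys()` with no arguments reads the default config path and returns an empty list when every assumed key is present; `dashboard_is_up()` returns `False` until the container is serving requests.
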
1.2 Data Scientist (DS) Setup: Preparing the Client
The DS prepares their local development environment for writing and submitting federated learning (FL) jobs.
| Step | Instruction | Command/Action |
|---|---|---|
| Install CLI | Install the SyftBox CLI locally. | `curl -fSL https://syftbox.net/install.sh \| sh` |
| Install Dependencies | Install the required Python packages (e.g., Flower, PyTorch, Syft Flwr). | `pip install syft_flwr` (plus project-specific dependencies) |
| Clone Project | Get the source code for the FL job you intend to run. | `git clone [project_url]` |
| Client Initialization | Initialize secure connections to your client and the DO's datasite. | See DS Quickstart for the Python code block. |
| Validate Logic | Use the mock data to test your model's code structure and logic without accessing private data. | Run the FL workflow locally on the mock data. |
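
Before running the FL workflow on mock data, it can help to confirm that the dependencies from the Install Dependencies step are importable. The sketch below uses only the standard library; the helper name is hypothetical, and the import names `flwr`, `torch`, and `syft_flwr` are assumed to correspond to Flower, PyTorch, and Syft Flwr, so extend the list with your project-specific packages.

```python
from importlib.util import find_spec

# ASSUMPTION: top-level import names for Flower, PyTorch, and Syft Flwr.
ASSUMED_MODULES = ("flwr", "torch", "syft_flwr")


def missing_modules(names=ASSUMED_MODULES):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]
```

An empty list from `missing_modules()` means every listed package was found in the current environment; any names returned still need to be installed before the local mock-data run.
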