Part I: Data Owner - FL Client
The Data Owner provides the secure environment and raw data required for training.
1. Setup & SyftBox Execution
- Install: Run the SyftBox installation script. This establishes your local machine as a "datasite."
- Run Client: Start the SyftBox client. It will run as a background service, managing secure file synchronization with the network.
- Login: Your client automatically registers you. Verify your identity by checking your local config at ~/.syftbox/config.json.
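The config check in the Login step can also be scripted. The following is a minimal sketch, assuming only that the file at ~/.syftbox/config.json is a JSON object; the helper name load_syftbox_config is hypothetical, and no particular key names are assumed because they may differ between SyftBox versions.

```python
import json
from pathlib import Path

# Default location taken from the Login step above.
DEFAULT_CONFIG = Path.home() / ".syftbox" / "config.json"


def load_syftbox_config(path=DEFAULT_CONFIG):
    """Return the parsed config as a dict, or None if the file is missing.

    Hypothetical helper for illustration; it is not part of SyftBox.
    """
    path = Path(path)
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    config = load_syftbox_config()
    if config is None:
        print("No config found; is the SyftBox client installed and running?")
    else:
        # Print every top-level entry instead of assuming a schema.
        for key, value in config.items():
            print(f"{key}: {value}")
```

If the printed entries show the identity you expect, the client registered you correctly. Avoid sharing the output, since a config file may contain credentials.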
2. Datasite Administration
- Admin Access: Launch the RDS-Dashboard (via Docker). This is your command center.
- Login: Open localhost:8000. You are automatically logged in as the Admin of your local datasite.
3. Creating Syft Datasets
- Private Data: Upload your sensitive diabetes_train.csv. This data never leaves your machine.
- Mock Data: Upload a synthetic diabetes_mock.csv with identical columns (e.g., Glucose, BMI, Age). This allows Data Scientists to write code without seeing real patient info.
- Metadata Sync: Once created in the Dashboard, the metadata (name and schema) is synced to the network, making it "discoverable."
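Before uploading, it can help to confirm that the mock file really has the same columns as the private file. The sketch below uses only the Python standard library and assumes both files are comma-separated with a header row; the function names read_header and compare_schemas are hypothetical and not part of SyftBox or the Dashboard.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        return next(csv.reader(fh), [])


def compare_schemas(private_path, mock_path):
    """Compare the header rows of the private and mock CSV files.

    Returns a dict with:
      missing    - columns in the private file but not in the mock file
      extra      - columns in the mock file but not in the private file
      same_order - True only if both headers are identical, order included
    """
    private_cols = read_header(private_path)
    mock_cols = read_header(mock_path)
    return {
        "missing": sorted(set(private_cols) - set(mock_cols)),
        "extra": sorted(set(mock_cols) - set(private_cols)),
        "same_order": private_cols == mock_cols,
    }


# Example call, using the file names from the steps above:
# report = compare_schemas("diabetes_train.csv", "diabetes_mock.csv")
# print(report)
```

An empty missing list, an empty extra list, and same_order set to True indicate that code written against the mock data should find the same columns in the private data. The check covers column names only, not value types or ranges.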
4. Review & Approval
- Audit: When a job arrives, open the Jobs tab. Use the "View Code" feature to audit the Python scripts.
- Execute: Click "Approve", then "Run". SyftBox then executes the Data Scientist's model training locally on your private data.
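As a supplement to reading the submitted code by hand, a quick static scan can list which modules a script imports, so that unexpected network libraries stand out. This is a minimal sketch using Python's standard ast module; it is not a Dashboard feature, the names list_imports and flag_network_imports are hypothetical, and the NETWORK_MODULES watch list is an illustrative assumption to adapt to your own review policy. It does not detect dynamic imports (for example via importlib), so it does not replace a manual audit.

```python
import ast

# Hypothetical watch list of modules that can open network connections.
NETWORK_MODULES = {"socket", "requests", "urllib", "http", "ftplib", "smtplib"}


def list_imports(source):
    """Return the sorted top-level package names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports such as "from . import x" have no module name
            # and are skipped here.
            modules.add(node.module.split(".")[0])
    return sorted(modules)


def flag_network_imports(source):
    """Return the imported packages that appear on the watch list."""
    return [name for name in list_imports(source) if name in NETWORK_MODULES]


# Example call on a script saved from the Jobs tab (path is a placeholder):
# with open("submitted_job.py", encoding="utf-8") as fh:
#     print(flag_network_imports(fh.read()))
```

A non-empty result is a prompt to look more closely at why a training script needs network access before you approve it.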