diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md new file mode 100644 index 0000000..07744b6 --- /dev/null +++ b/.claude/CLAUDE.md @@ -0,0 +1,178 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## Project Overview + +Multi-user JupyterHub 4 deployment platform with data science stack, GPU support (auto-detection), and NativeAuthenticator. The platform spawns isolated JupyterLab environments per user using DockerSpawner, backed by the `stellars/stellars-jupyterlab-ds` image. + +**Architecture**: Docker Compose orchestrates three main services: +- **Traefik**: Reverse proxy handling TLS termination and routing (ports 80, 443, 8080) +- **JupyterHub**: Central hub managing user authentication and spawning user containers +- **Watchtower**: Automatic image updates (daily at midnight) + +User containers are dynamically spawned into the `jupyterhub_network` with per-user persistent volumes for home, workspace, and cache directories. 
+ +## Common Development Commands + +### Building and Deployment + +```bash +# Build the JupyterHub container +make build + +# Build with verbose output +make build_verbose + +# Build using script directly +./scripts/build.sh + +# Pull latest image from DockerHub +make pull + +# Push image to DockerHub +make push +``` + +### Starting and Stopping + +```bash +# Start platform (detached mode, respects compose_override.yml if present) +./start.sh + +# Start with docker compose directly +docker compose --env-file .env -f compose.yml up --no-recreate --no-build -d + +# Start with override file +docker compose --env-file .env -f compose.yml -f compose_override.yml up --no-recreate --no-build -d + +# Stop and clean up +make clean +``` + +### Accessing Services + +- JupyterHub: `https://localhost/jupyterhub` +- Traefik Dashboard: `http://localhost:8080/dashboard` +- First-time setup: Self-register as `admin` user (auto-authorized) + +## Configuration Architecture + +### Primary Configuration: `config/jupyterhub_config.py` + +This Python configuration file controls all JupyterHub behavior: + +**Environment Variables** (set in compose.yml or compose_override.yml): +- `JUPYTERHUB_ADMIN`: Admin username (default: `admin`) +- `DOCKER_NOTEBOOK_IMAGE`: JupyterLab image to spawn (default: `stellars/stellars-jupyterlab-ds:latest`) +- `DOCKER_NETWORK_NAME`: Network for spawned containers (default: `jupyterhub_network`) +- `JUPYTERHUB_BASE_URL`: URL prefix (default: `/jupyterhub`) +- `ENABLE_GPU_SUPPORT`: GPU mode - `0` (disabled), `1` (enabled), `2` (auto-detect) +- `ENABLE_JUPYTERHUB_SSL`: Direct SSL config - `0` (disabled), `1` (enabled) +- `ENABLE_SERVICE_MLFLOW`: Enable MLflow tracking (`0`/`1`) +- `ENABLE_SERVICE_GLANCES`: Enable resource monitor (`0`/`1`) +- `ENABLE_SERVICE_TENSORBOARD`: Enable TensorBoard (`0`/`1`) +- `NVIDIA_AUTODETECT_IMAGE`: Image for GPU detection (default: `nvidia/cuda:12.9.1-base-ubuntu24.04`) + +**GPU Auto-Detection**: When `ENABLE_GPU_SUPPORT=2`, the 
platform attempts to run `nvidia-smi` in a CUDA container. If successful, GPU support is enabled for all spawned user containers via `device_requests`. + +**User Container Configuration**: +- Spawned containers use `DockerSpawner` with per-user volumes +- Default working directory: `/home/lab/workspace` +- Container name pattern: `jupyterlab-{username}` +- Persistent volumes: + - `jupyterlab-{username}_home`: `/home` + - `jupyterlab-{username}_workspace`: `/home/lab/workspace` + - `jupyterlab-{username}_cache`: `/home/lab/.cache` + - `jupyterhub_shared`: `/mnt/shared` (shared across all users) + +### Override Pattern: `compose_override.yml` + +Create this file to customize deployment without modifying tracked files: + +```yaml +services: + jupyterhub: + volumes: + - ./config/jupyterhub_config_override.py:/srv/jupyterhub/jupyterhub_config.py:ro + environment: + - ENABLE_GPU_SUPPORT=1 +``` + +**IMPORTANT**: `compose_override.yml` contains deployment-specific credentials (CIFS passwords, etc.) and should never be committed. + +### TLS Certificates + +Certificates are auto-generated at startup by `/mkcert.sh` script and stored in `jupyterhub_certs` volume. Traefik reads certificates from `/mnt/certs/certs.yml` configuration file. + +## Docker Image Build Process + +**Dockerfile**: `services/jupyterhub/Dockerfile.jupyterhub` + +Build stages: +1. Base image: `jupyterhub/jupyterhub:latest` +2. Install system packages from `conf/apt-packages.yml` using `yq` parser +3. Copy startup scripts from `conf/bin/` (executable permissions set to 755) +4. Install Python packages: `docker`, `dockerspawner`, `jupyterhub-nativeauthenticator` +5. Copy certificate templates from `templates/certs/` +6. Entrypoint: `/start-platform.sh` + +**Platform Initialization**: `/start-platform.sh` executes scripts in `/start-platform.d/` directory sequentially before launching JupyterHub. 
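The platform initialization step described above can be sketched in Python. This is an illustration only: the real `/start-platform.sh` is a shell script, and the sorted-by-name ordering, the `*.sh` filter, and the `run_init_scripts` name are assumptions, not details taken from the repository.

```python
import subprocess
from pathlib import Path


def run_init_scripts(directory: str) -> list[str]:
    """Run each *.sh file in `directory` one at a time, in sorted name order.

    Hypothetical helper: the ordering and the *.sh filter are assumptions.
    Execution stops at the first script that exits non-zero, because
    check=True makes subprocess.run raise CalledProcessError.
    """
    executed = []
    for script in sorted(Path(directory).glob("*.sh")):
        # Each script finishes before the next one starts (sequential run).
        subprocess.run(["/bin/sh", str(script)], check=True)
        executed.append(script.name)
    return executed
```

Under this assumption, numeric prefixes such as `10_` and `20_` (hypothetical names) would make the execution order explicit.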
+
+## Authentication
+
+**NativeAuthenticator** configuration in `jupyterhub_config.py`:
+- Self-registration enabled (`c.NativeAuthenticator.enable_signup = True`)
+- Open signup disabled (`c.NativeAuthenticator.open_signup = False`)
+- All registered users allowed to log in (`c.Authenticator.allow_all = True`)
+- Admin users defined by `JUPYTERHUB_ADMIN` environment variable
+- Admin panel access: `https://localhost/jupyterhub/hub/admin`
+
+## Networking and Volumes
+
+**Networks**:
+- `jupyterhub_network`: Bridge network connecting hub, Traefik, and spawned user containers
+
+**Volumes**:
+- `jupyterhub_data`: Persistent database (`jupyterhub.sqlite`) and cookie secrets
+- `jupyterhub_certs`: TLS certificates shared with Traefik
+- `jupyterhub_shared`: Shared storage across all user environments (can be mounted as CIFS)
+- Per-user volumes: Created dynamically on first spawn
+
+## CIFS/NAS Integration
+
+To mount network storage in user containers, override the `jupyterhub_shared` volume in `compose_override.yml`:
+
+```yaml
+volumes:
+  jupyterhub_shared:
+    driver: local
+    name: jupyterhub_shared
+    driver_opts:
+      type: cifs
+      device: //nas_ip/share_name
+      o: username=xxx,password=yyy,uid=1000,gid=1000
+```
+
+User containers will access this at `/mnt/shared`.
+ +## Troubleshooting + +**GPU not detected**: +- Verify NVIDIA Docker runtime: `docker run --rm --gpus all nvidia/cuda:12.9.1-base-ubuntu24.04 nvidia-smi` +- Check `NVIDIA_AUTODETECT_IMAGE` matches your CUDA version +- Manually enable with `ENABLE_GPU_SUPPORT=1` + +**Container spawn failures**: +- Check Docker socket permissions: `/var/run/docker.sock` must be accessible +- Verify network exists: `docker network inspect jupyterhub_network` +- Review logs: `docker logs ` + +**Authentication issues**: +- Admin user must match `JUPYTERHUB_ADMIN` environment variable +- Database persisted in `jupyterhub_data` volume - may need reset if corrupted +- Cookie secret persisted in `/data/jupyterhub_cookie_secret` + +## Related Projects + +User environments spawned from: https://github.com/stellarshenson/stellars-jupyterlab-ds diff --git a/.claude/JOURNAL.md b/.claude/JOURNAL.md index 95862b8..c48c815 100644 --- a/.claude/JOURNAL.md +++ b/.claude/JOURNAL.md @@ -6,3 +6,12 @@ This journal tracks substantive work on documents, diagrams, and documentation c 1. **Task - Add Docker badges**: added Docker pulls and GitHub stars badges to README.md
**Result**: README now displays Docker pulls badge (stellars/stellars-jupyterhub-ds), Docker image size badge, and GitHub stars badge + +2. **Task - Project initialization and documentation**: Analyzed codebase and created comprehensive project documentation
+ **Result**: Created `.claude/CLAUDE.md` with detailed architecture overview, configuration patterns, common commands, GPU auto-detection logic, volume management, authentication setup, and troubleshooting guide for future Claude Code instances + +3. **Task - Feature planning for user controls**: Designed two self-service features for JupyterHub user control panel
+ **Result**: Created `FEATURE_PLAN.md` documenting Reset Home Volume and Restart Server features with implementation details, API handlers, UI templates, JavaScript integration, security considerations, edge cases, testing plans, and rollout strategy + +4. **Task - Version management implementation**: Added version tracking and tagging system matching stellars-jupyterlab-ds pattern
+ **Result**: Created `project.env` with project metadata and version 1.0.0_jh-4.x, updated `Makefile` with increment_version and tag targets, auto-increment on build, dual-tag push (latest and versioned), leveraging existing Docker socket access for both planned features diff --git a/FEATURE_PLAN.md b/FEATURE_PLAN.md new file mode 100644 index 0000000..5ebff8c --- /dev/null +++ b/FEATURE_PLAN.md @@ -0,0 +1,654 @@ +# Feature Plan: User Control Panel Enhancements + +## Overview + +Enhance JupyterHub user control panel with two self-service features: +1. **Reset Home Volume**: Allow users to reset their home directory volume when server is stopped +2. **Restart Server**: Provide one-click server restart functionality + +Both features include confirmation dialogs and proper permission enforcement. + +## Feature Scope + +### Feature 1: Reset Home Volume + +**Access Control**: +- Users can reset their own home volume +- Admins can reset any user's home volume + +**Volume Scope**: +- Only `jupyterlab-{username}_home` volume +- Does NOT affect workspace (`jupyterlab-{username}_workspace`) or cache (`jupyterlab-{username}_cache`) volumes + +**UI Location**: +- User control panel (accessible to both user and admin) +- Button visible only when server is stopped + +### Feature 2: Restart Server + +**Access Control**: +- Users can restart their own server +- Admins can restart any user's server + +**Functionality**: +- Uses Docker's native container restart (preserves container, does NOT recreate) +- Performs graceful restart with configurable timeout +- Maintains all volumes, network connections, and container configuration +- Equivalent to "Restart" button in Docker Desktop + +**UI Location**: +- User control panel (accessible to both user and admin) +- Button visible only when server is running + +**Technical Approach**: +- Direct Docker API call: `container.restart(timeout=10)` +- Does NOT use JupyterHub's `stop()` and `spawn()` methods (which would recreate container) +- 
Container ID remains the same after restart
+
+## Technical Requirements
+
+### Prerequisites
+- User's JupyterLab server must be stopped (for reset volume)
+- Volume `jupyterlab-{username}_home` must exist (for reset volume)
+- **Docker socket accessible at `/var/run/docker.sock`** (already configured in `compose.yml` line 54 with read-write access)
+- Docker Python SDK available (already installed in `Dockerfile.jupyterhub`)
+
+### Existing Infrastructure Leveraged
+Both features utilize infrastructure already in place:
+- **Docker Socket**: Mounted at `/var/run/docker.sock:rw` for DockerSpawner; both features reuse it for volume management and container restart
+- **Docker Python SDK**: Already installed via `pip install docker` in the JupyterHub image
+- **Container Naming Pattern**: Follows existing convention `jupyterlab-{username}` from `jupyterhub_config.py` line 112
+- **Volume Naming Pattern**: Follows existing convention `jupyterlab-{username}_home` from `jupyterhub_config.py` line 116
+
+### Permission Model
+- **User access**: Can only reset their own home volume and restart their own server
+- **Admin access**: Can reset any user's home volume and restart any user's server
+- Implemented via custom decorator: `@admin_or_self`
+
+## Implementation Steps
+
+### 1. Create Custom API Handler
+
+**File**: `services/jupyterhub/conf/bin/volume_handler.py` (or inline in `config/jupyterhub_config.py`)
+
+**Purpose**: Handle volume reset requests via REST API
+
+**Endpoint**: `DELETE /hub/api/users/{username}/reset-home-volume`
+
+**Logic**:
+```python
+import functools
+
+import docker
+from jupyterhub.handlers import BaseHandler
+from tornado import web
+
+
+def admin_or_self(method):
+    """Custom decorator: allow only admins or the user named in the URL.
+
+    Minimal sketch; assumes `self.current_user` exposes `.name` and `.admin`.
+    """
+    @functools.wraps(method)
+    async def wrapper(self, username, *args, **kwargs):
+        current = self.current_user
+        if current is None or not (current.admin or current.name == username):
+            raise web.HTTPError(403)
+        return await method(self, username, *args, **kwargs)
+    return wrapper
+
+
+class ResetHomeVolumeHandler(BaseHandler):
+    @admin_or_self
+    async def delete(self, username):
+        # 1. Verify user exists
+        user = self.find_user(username)
+        if not user:
+            raise web.HTTPError(404, "User not found")
+
+        # 2. Check server is stopped
+        spawner = user.spawner
+        if spawner.active:
+            raise web.HTTPError(400, "Server must be stopped before resetting volume")
+
+        # 3. Connect to Docker
+        docker_client = docker.DockerClient(base_url='unix://var/run/docker.sock')
+
+        # 4. Verify volume exists
+        volume_name = f'jupyterlab-{username}_home'
+        try:
+            volume = docker_client.volumes.get(volume_name)
+        except docker.errors.NotFound:
+            raise web.HTTPError(404, f"Volume {volume_name} not found")
+
+        # 5. Remove volume
+        try:
+            volume.remove()
+        except docker.errors.APIError as e:
+            raise web.HTTPError(500, f"Failed to remove volume: {e}")
+        self.set_status(200)
+        self.finish({"message": f"Volume {volume_name} successfully reset"})
+```
+
+**Error Handling**:
+- 403: Caller is neither the target user nor an admin
+- 404: User not found or volume doesn't exist
+- 400: Server still running
+- 500: Docker API error (volume in use, permission denied)
+
+### 2. Register API Handler
+
+**File**: `config/jupyterhub_config.py`
+
+Add handler registration:
+```python
+from volume_handler import ResetHomeVolumeHandler
+
+c.JupyterHub.extra_handlers = [
+    (r'/api/users/([^/]+)/reset-home-volume', ResetHomeVolumeHandler),
+]
+```
+
+### 3. Extend User Control Panel Template
+
+**File**: `services/jupyterhub/templates/home.html` (override default template)
+
+**Template Structure**:
+- Extend JupyterHub's base `home.html` template
+- Add "Reset Home Volume" button in server controls section
+- Button states:
+  - Enabled: Server stopped AND volume exists
+  - Disabled: Server running OR volume doesn't exist
+  - Tooltip explaining current state
+
+**Button HTML**:
+```html
+{% if not user.server %}
+
+{% endif %}
+```
+
+### 4. Create Confirmation Modal
+
+**File**: `services/jupyterhub/templates/home.html` (inline modal)
+
+**Modal Content**:
+```html
+
+```
+
+### 5. Implement Client-Side JavaScript
+
+**File**: `services/jupyterhub/templates/home.html` (inline script)
+
+**Functionality**:
+- Check server status and volume existence on page load
+- Enable/disable reset button based on state
+- Handle modal confirmation
+- Make API call to reset endpoint
+- Display success/error notifications
+
+**JavaScript Logic**:
+```javascript
+
+```
+
+### 6. Update Docker Configuration
+
+**No changes required**:
+- Docker Python SDK already installed in `Dockerfile.jupyterhub`
+- Docker socket already mounted in `compose.yml` (line 54)
+- Existing Docker client code in `jupyterhub_config.py` can be referenced
+
+---
+
+## Feature 2: Restart Server Implementation
+
+### 1. Create Restart Server API Handler
+
+**File**: `config/jupyterhub_config.py` (inline with volume handler)
+
+**Purpose**: Handle server restart requests via REST API
+
+**Endpoint**: `POST /hub/api/users/{username}/restart-server`
+
+**Logic**:
+```python
+import docker
+from jupyterhub.handlers import BaseHandler
+from tornado import web
+
+# admin_or_self is the custom decorator defined with ResetHomeVolumeHandler
+# in the same module, so it is not imported here.
+
+
+class RestartServerHandler(BaseHandler):
+    @admin_or_self
+    async def post(self, username):
+        # 1. Verify user exists
+        user = self.find_user(username)
+        if not user:
+            raise web.HTTPError(404, "User not found")
+
+        # 2. Check server is running
+        spawner = user.spawner
+        if not spawner.active:
+            raise web.HTTPError(400, "Server is not running")
+
+        # 3. Derive container name from the existing naming pattern
+        container_name = f'jupyterlab-{username}'
+
+        # 4. Connect to Docker and restart container
+        docker_client = docker.DockerClient(base_url='unix://var/run/docker.sock')
+
+        try:
+            # Get the container
+            container = docker_client.containers.get(container_name)
+
+            # Restart the container (graceful restart with 10s timeout)
+            container.restart(timeout=10)
+        except docker.errors.NotFound:
+            raise web.HTTPError(404, f"Container {container_name} not found")
+        except docker.errors.APIError as e:
+            raise web.HTTPError(500, f"Failed to restart container: {e}")
+        self.set_status(200)
+        self.finish({"message": f"Container {container_name} successfully restarted"})
+```
+
+**Error Handling**:
+- 403: Caller is neither the target user nor an admin
+- 404: User not found or container doesn't exist
+- 400: Server not running (spawner not active)
+- 500: Docker API error during restart
+
+### 2. Register Restart Handler
+
+**File**: `config/jupyterhub_config.py`
+
+Update handler registration:
+```python
+from volume_handler import ResetHomeVolumeHandler, RestartServerHandler
+
+c.JupyterHub.extra_handlers = [
+    (r'/api/users/([^/]+)/reset-home-volume', ResetHomeVolumeHandler),
+    (r'/api/users/([^/]+)/restart-server', RestartServerHandler),
+]
+```
+
+### 3. Add Restart Button to Template
+
+**File**: `services/jupyterhub/templates/home.html`
+
+**Button HTML** (add next to existing server controls):
+```html
+{% if user.server %}
+
+{% endif %}
+```
+
+### 4. Create Restart Confirmation Modal
+
+**File**: `services/jupyterhub/templates/home.html`
+
+**Modal Content**:
+```html
+
+```
+
+### 5. 
Implement Restart JavaScript + +**File**: `services/jupyterhub/templates/home.html` (add to existing script) + +**JavaScript Logic**: +```javascript +// Restart server handler +$('#confirm-restart-btn').on('click', function() { + const username = "{{ user.name }}"; + const apiUrl = `/hub/api/users/${username}/restart-server`; + + // Disable button and show loading state + $('#confirm-restart-btn').prop('disabled', true).text('Restarting...'); + + $.ajax({ + url: apiUrl, + type: 'POST', + headers: { + 'Authorization': 'token ' + window.jhdata.api_token + }, + success: function(response) { + $('#restart-server-modal').modal('hide'); + alert('Server successfully restarted. Redirecting to your server...'); + // Redirect to user's server + window.location.href = `/user/${username}/lab`; + }, + error: function(xhr) { + $('#restart-server-modal').modal('hide'); + const errorMsg = xhr.responseJSON?.message || 'Failed to restart server'; + alert(`Error: ${errorMsg}`); + $('#confirm-restart-btn').prop('disabled', false).text('Yes, Restart Server'); + } + }); +}); +``` + +### 6. 
Enhanced Status Polling (Optional) + +**File**: `services/jupyterhub/templates/home.html` + +Add polling to detect when restart completes: +```javascript +function pollServerStatus(username) { + const interval = setInterval(function() { + $.ajax({ + url: `/hub/api/users/${username}`, + type: 'GET', + headers: { + 'Authorization': 'token ' + window.jhdata.api_token + }, + success: function(data) { + if (data.server && data.server.ready) { + clearInterval(interval); + window.location.href = `/user/${username}/lab`; + } + } + }); + }, 2000); // Poll every 2 seconds + + // Timeout after 60 seconds + setTimeout(function() { + clearInterval(interval); + }, 60000); +} +``` + +## Files to Create/Modify + +### New Files +- `services/jupyterhub/templates/home.html` - Custom user control panel template with both features + +### Modified Files +- `config/jupyterhub_config.py` - Register API handlers, add volume reset and restart server handler classes +- `services/jupyterhub/Dockerfile.jupyterhub` - No changes needed (Docker SDK already installed) + +### Optional Separate Files +- `services/jupyterhub/conf/bin/volume_handler.py` - API handler logic for both features (can be inline in config instead) + +## Testing Plan + +### Unit Tests + +#### Reset Home Volume Tests +1. **API Handler Tests**: + - Test permission enforcement (user can only reset own volume) + - Test admin can reset any user's volume + - Test rejection when server is running + - Test volume not found error handling + - Test Docker API error handling + +2. **Volume Operations Tests**: + - Create test volume + - Verify volume exists check + - Verify volume removal + - Test volume in use scenario + +#### Restart Server Tests +1. **API Handler Tests**: + - Test permission enforcement (user can only restart own server) + - Test admin can restart any user's server + - Test rejection when server is not running + - Test stop operation failure handling + - Test start operation failure handling + +2. 
**Server Operations Tests**: + - Verify server status check (running/stopped) + - Test graceful shutdown + - Test server restart sequence + - Test concurrent restart requests + +### Integration Tests + +#### Reset Home Volume Tests +1. **UI Flow Tests**: + - Button appears only when server stopped + - Modal displays correct volume name + - Confirmation triggers API call + - Success notification displays + - Error handling for failed API calls + +2. **End-to-End Tests**: + - User stops server + - User clicks reset button + - User confirms in modal + - Volume is removed + - User starts server (new volume created) + - Verify clean home directory + +#### Restart Server Tests +1. **UI Flow Tests**: + - Button appears only when server running + - Modal displays proper warning + - Confirmation triggers API call + - Loading state during restart + - Redirect to server after restart + - Error handling for failed restart + +2. **End-to-End Tests**: + - User has running server + - User clicks restart button + - User confirms in modal + - Server stops gracefully + - Server starts automatically + - User redirected to new server instance + - Verify server is functional after restart + +#### Combined Features Tests +1. **Button State Management**: + - Reset button visible when server stopped + - Restart button visible when server running + - Both buttons never visible simultaneously + - Button states update after operations + +2. **Workflow Tests**: + - Restart server -> works normally + - Stop server -> Reset volume -> Start server -> verify clean home + - Reset volume -> Start server -> Restart server -> verify functionality + +## Security Considerations + +### Reset Home Volume +1. **Permission Validation**: Always verify user has permission to reset volume (own volume or admin) +2. **Server State Check**: Prevent volume deletion while container is running +3. **Volume Ownership**: Validate volume name matches expected pattern `jupyterlab-{username}_home` +4. 
**Docker Socket Access**: Limit Docker operations to volume management only +5. **Input Validation**: Sanitize username parameter to prevent injection attacks +6. **Audit Logging**: Log all volume reset operations with username and timestamp + +### Restart Server +1. **Permission Validation**: Verify user can only restart own server (or is admin) +2. **State Validation**: Ensure server is actually running before attempting restart +3. **Resource Limits**: Prevent restart request flooding (rate limiting) +4. **Graceful Shutdown**: Allow proper cleanup before forced termination +5. **Session Integrity**: Invalidate old server tokens after restart +6. **Audit Logging**: Log all restart operations with username, timestamp, and outcome + +### Both Features +1. **CSRF Protection**: All API endpoints must validate CSRF tokens +2. **Authentication**: Require valid JupyterHub session token +3. **Authorization**: Implement `@admin_or_self` decorator consistently +4. **Rate Limiting**: Prevent abuse through repeated operations +5. **Error Disclosure**: Don't expose internal system details in error messages + +## Edge Cases + +### Reset Home Volume +1. **Volume doesn't exist**: Display informative error, don't fail silently +2. **Server starting/stopping**: Disable button during transition states +3. **Volume in use by orphaned container**: Attempt force removal or display cleanup instructions +4. **Multiple concurrent reset requests**: Implement request locking/queuing +5. **Admin resetting admin's volume**: Require additional confirmation +6. **Network errors during API call**: Display retry option +7. **Volume has active snapshots/backups**: Check for dependencies before removal + +### Restart Server +1. **Server not responding**: Implement timeout and force stop if graceful shutdown fails +2. **Restart during server startup**: Queue restart request until server is fully running +3. **Container stuck in stopping state**: Detect and handle orphaned containers +4. 
**Multiple concurrent restart requests**: Prevent duplicate restarts with request locking +5. **Restart fails to start**: Display error and provide manual start option +6. **User opens multiple tabs**: Synchronize state across browser tabs +7. **Network interruption during restart**: Handle client-side timeout gracefully + +### Combined Features +1. **Rapid operation switching**: User stops -> resets -> starts -> restarts quickly +2. **Session expires during operation**: Re-authenticate and resume or show clear error +3. **Hub restart during user operation**: Handle hub unavailability gracefully +4. **Docker daemon unavailable**: Detect and display system-level error message + +## Future Enhancements + +### Reset Home Volume +1. **Backup before reset**: Create automatic backup to `jupyterhub_shared` before deletion +2. **Selective reset**: Allow resetting workspace or cache volumes individually +3. **Reset all volumes**: Single action to reset home, workspace, and cache +4. **Volume size display**: Show current volume size before reset +5. **Reset history**: Log of volume reset operations per user +6. **Scheduled resets**: Allow users to schedule periodic volume resets +7. **Template volumes**: Pre-populate new volumes with template files +8. **Email notification**: Send confirmation email after volume reset + +### Restart Server +1. **Scheduled restarts**: Allow users to schedule regular server restarts +2. **Restart with options**: Choose specific image version or resource limits +3. **Pre-restart save**: Automatically save all open notebooks before restart +4. **Restart notifications**: WebSocket-based real-time status updates +5. **Restart analytics**: Track restart frequency and success rates per user +6. **Soft restart**: Restart JupyterLab without container restart (when possible) +7. **Batch restart**: Admin can restart multiple user servers simultaneously +8. 
**Auto-restart on failure**: Automatically restart server if it crashes + +### Combined Features +1. **Workflow presets**: "Clean slate" button that resets volume and restarts server +2. **Operation queue**: Queue multiple operations (stop, reset, restart) in sequence +3. **Health checks**: Automatic server health monitoring with auto-restart option +4. **Resource optimization**: Suggest restart when server uses excessive resources + +## Dependencies + +- **JupyterHub**: 4.x (current base image: `jupyterhub/jupyterhub:latest`) +- **Docker Python SDK**: Already installed via pip +- **NativeAuthenticator**: Already configured for user management +- **Bootstrap**: Available in JupyterHub default templates for modal styling +- **jQuery**: Available in JupyterHub default templates for AJAX calls + +## Rollout Plan + +1. **Development**: Implement on local environment + - Feature 1: Reset Home Volume (priority: high) + - Feature 2: Restart Server (priority: medium) +2. **Testing**: Verify all test cases pass for both features +3. **Documentation**: Update README.md and `.claude/CLAUDE.md` with feature descriptions +4. **Deployment**: Build new Docker image with both features +5. **User Communication**: Notify users of new self-service capabilities +6. **Monitoring**: Track usage and error rates for both features during first week +7. **Iteration**: Gather user feedback and implement improvements + +## Implementation Priority + +### Phase 1: Core Features +1. Reset Home Volume API handler and basic UI +2. Restart Server API handler and basic UI +3. Both confirmation modals + +### Phase 2: Enhanced UX +1. Status polling for restart operation +2. Better error messages and user feedback +3. Loading states and progress indicators + +### Phase 3: Polish +1. Audit logging for both operations +2. Rate limiting implementation +3. Edge case handling +4. 
Accessibility improvements + +## Summary + +This feature plan adds two essential self-service capabilities to JupyterHub: + +**Reset Home Volume** allows users to cleanly start over by removing their home directory volume when their server is stopped. This is useful for resolving corrupted environments or starting fresh with a clean slate. The operation uses Docker API to safely remove the `jupyterlab-{username}_home` volume after confirming the server is stopped. + +**Restart Server** provides a convenient one-click solution to restart a running JupyterLab container using Docker's native restart functionality. Unlike JupyterHub's stop/spawn cycle (which recreates containers), this uses `container.restart()` to preserve the container identity, volumes, and configuration. This helps users quickly recover from server issues or apply certain configuration changes without losing their environment setup. + +Both features maintain security through permission validation, provide clear user feedback through confirmation modals, and integrate seamlessly into the existing JupyterHub user control panel. diff --git a/Makefile b/Makefile index 1706467..f225a79 100644 --- a/Makefile +++ b/Makefile @@ -4,14 +4,34 @@ # GLOBALS # ################################################################################# .DEFAULT_GOAL := help -.PHONY: help build push start clean +.PHONY: help build push start clean increment_version tag + +# Include project configuration +include project.env + +# Use VERSION from project.env as TAG (strip quotes) +TAG := $(subst ",,$(VERSION)) ################################################################################# # COMMANDS # ################################################################################# +## increment patch version in project.env +increment_version: + @echo "Incrementing patch version..." 
+ @awk -F= '/^VERSION=/ { \ + gsub(/"/, "", $$2); \ + match($$2, /^([0-9]+\.[0-9]+\.)([0-9]+)(_.*$$)/, parts); \ + new_patch = parts[2] + 1; \ + new_version = parts[1] new_patch parts[3]; \ + print "VERSION=\"" new_version "\""; \ + print "Version updated: " $$2 " -> " new_version > "/dev/stderr"; \ + next; \ + } \ + { print }' project.env > project.env.tmp && mv project.env.tmp project.env + ## build docker containers -build: +build: increment_version @cd ./scripts && ./build.sh ## build docker containers and output logs @@ -23,12 +43,23 @@ pull: docker pull stellars/stellars-jupyterhub-ds:latest ## push docker containers to repo -push: +push: tag docker push stellars/stellars-jupyterhub-ds:latest + docker push stellars/stellars-jupyterhub-ds:$(TAG) -## start jupyterlab (fg) +tag: + @if git tag -l | grep -q "^$(TAG)$$"; then \ + echo "Git tag $(TAG) already exists, skipping tagging"; \ + else \ + echo "Creating git tag: $(TAG)"; \ + git tag $(TAG); \ + echo "Creating docker tag: $(TAG)"; \ + docker tag stellars/stellars-jupyterhub-ds:latest stellars/stellars-jupyterhub-ds:$(TAG); \ + fi + +## start jupyterhub (fg) start: - @cd ./bin && ./start.sh + @./start.sh ## clean orphaned containers clean: diff --git a/project.env b/project.env new file mode 100644 index 0000000..e40d00e --- /dev/null +++ b/project.env @@ -0,0 +1,13 @@ +# Project Configuration +PROJECT_NAME="stellars-jupyterhub-ds" +PROJECT_DESCRIPTION="Multi-user JupyterHub 4 deployment platform with data science stack, GPU auto-detection, NativeAuthenticator, and isolated per-user environments spawned via DockerSpawner" + +# Version +VERSION="2.11.35_cuda-12.9.1_jh-5.4.2" +VERSION_COMMENT="Jupyterhub with GPU auto-detection, NativeAuthenticator, and DockerSpawner configuration and new build system" + +# Author +AUTHOR_NAME="Konrad Jelen" +AUTHOR_ALIAS="Stellars Henson" +AUTHOR_EMAIL="konrad.jelen+github@gmail.com" +AUTHOR_LINKEDIN="https://www.linkedin.com/in/konradjelen/"
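
The patch bump performed by the `increment_version` target can be sketched in Python, which may help when checking what the recipe should produce. The recipe relies on the three-argument `match()`, a gawk extension, so a sketch like this is also an option where gawk is not installed. `bump_patch` and `bump_env_text` are hypothetical names, and the sketch assumes the `MAJOR.MINOR.PATCH_suffix` version shape used in `project.env`.

```python
import re

# Same shape as the awk pattern: "MAJOR.MINOR." + PATCH + "_suffix".
VERSION_PATTERN = re.compile(r"^([0-9]+\.[0-9]+\.)([0-9]+)(_.*)$")


def bump_patch(version: str) -> str:
    """Return `version` with its patch number incremented by one."""
    match = VERSION_PATTERN.match(version)
    if match is None:
        # Unlike the awk recipe, refuse to rewrite an unexpected format.
        raise ValueError(f"unexpected version format: {version!r}")
    prefix, patch, suffix = match.groups()
    return f"{prefix}{int(patch) + 1}{suffix}"


def bump_env_text(text: str) -> str:
    """Rewrite only the VERSION= line of project.env-style text."""
    lines = []
    for line in text.splitlines():
        # "VERSION_COMMENT=" does not start with "VERSION=", so it is untouched.
        if line.startswith("VERSION="):
            current = line.split("=", 1)[1].strip('"')
            line = f'VERSION="{bump_patch(current)}"'
        lines.append(line)
    return "\n".join(lines) + "\n"


print(bump_patch("2.11.35_cuda-12.9.1_jh-5.4.2"))
# prints 2.11.36_cuda-12.9.1_jh-5.4.2
```

Raising on an unmatched version is a deliberate difference from the awk recipe, which has no such guard.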