System Design for a Google Docs-like System
Designing a system like Google Docs involves creating a collaborative real-time document editing platform that allows multiple users to work on documents simultaneously. Here's a detailed system design using Prisma for database modeling, Express.js for API routes, and a recommended tech stack:
Tech Stack
- Backend: Node.js with Express.js
- Database: PostgreSQL for relational data (users, documents, revisions), Redis for caching
- ORM: Prisma for database interactions
- Real-time Collaboration: WebSocket (Socket.io) for real-time updates
- Authentication: JWT (JSON Web Tokens) for user authentication and authorization
- Messaging: RabbitMQ for handling asynchronous tasks (notifications, document updates)
- Storage: Amazon S3 for storing document revisions and attachments
- Monitoring: Prometheus and Grafana for monitoring system metrics
System Components
- Client Applications (Web and Mobile):
  - Interfaces for users to create, edit, and share documents, collaborate in real time, and manage document versions.
- Express.js Backend
  - Prisma Models:

    // schema.prisma
    datasource db {
      provider = "postgresql"
      url      = env("DATABASE_URL")
    }

    generator client {
      provider = "prisma-client-js"
    }

    model User {
      id              Int        @id @default(autoincrement())
      username        String     @unique
      email           String     @unique
      password        String     // bcrypt hash, never the plain-text password
      // Two relations point at Document, so each one needs its own name
      documents       Document[] @relation("Owner")
      sharedDocuments Document[] @relation("Collaborators")
      createdAt       DateTime   @default(now())
    }

    model Document {
      id            Int                @id @default(autoincrement())
      title         String
      content       String
      ownerId       Int
      owner         User               @relation("Owner", fields: [ownerId], references: [id])
      // Implicit many-to-many relation; Prisma manages the join table
      collaborators User[]             @relation("Collaborators")
      revisions     DocumentRevision[]
      createdAt     DateTime           @default(now())
    }

    model DocumentRevision {
      id         Int      @id @default(autoincrement())
      documentId Int
      document   Document @relation(fields: [documentId], references: [id])
      content    String
      revisionNo Int
      createdAt  DateTime @default(now())

      // A revision number may appear only once per document
      @@unique([documentId, revisionNo])
    }
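  - Express API Routes:

    // server.js
    const express = require('express');
    const http = require('http');
    const { Server } = require('socket.io');
    const { PrismaClient } = require('@prisma/client');
    const jwt = require('jsonwebtoken');
    const bcrypt = require('bcrypt');

    const prisma = new PrismaClient();
    const app = express();
    const server = http.createServer(app);
    const io = new Server(server);

    // Read the signing secret from the environment instead of hard-coding it
    const JWT_SECRET = process.env.JWT_SECRET;
    if (!JWT_SECRET) throw new Error('JWT_SECRET must be set');

    app.use(express.json());

    // Express middleware: verify the Bearer token and set req.userId
    function authenticate(req, res, next) {
      const [scheme, token] = (req.headers.authorization || '').split(' ');
      if (scheme !== 'Bearer' || !token) {
        return res.status(401).json({ error: 'Missing token' });
      }
      try {
        req.userId = jwt.verify(token, JWT_SECRET).userId;
        next();
      } catch (error) {
        res.status(401).json({ error: 'Invalid or expired token' });
      }
    }

    // Socket.io middleware for authentication
    io.use((socket, next) => {
      const token = socket.handshake.auth.token;
      try {
        const decoded = jwt.verify(token, JWT_SECRET);
        socket.userId = decoded.userId;
        next();
      } catch (error) {
        next(new Error('Authentication error'));
      }
    });

    // Endpoint for user registration
    app.post('/register', async (req, res) => {
      const { username, email, password } = req.body;
      try {
        const hashedPassword = await bcrypt.hash(password, 10);
        const user = await prisma.user.create({
          data: { username, email, password: hashedPassword },
          // Never send the password hash back to the client
          select: { id: true, username: true, email: true, createdAt: true },
        });
        res.status(201).json(user);
      } catch (error) {
        console.error(error);
        res.status(500).json({ error: 'Failed to register user' });
      }
    });

    // Endpoint for user login and JWT generation
    app.post('/login', async (req, res) => {
      const { username, password } = req.body;
      try {
        const user = await prisma.user.findUnique({ where: { username } });
        if (!user) {
          return res.status(404).json({ error: 'User not found' });
        }
        const passwordMatch = await bcrypt.compare(password, user.password);
        if (!passwordMatch) {
          return res.status(401).json({ error: 'Invalid password' });
        }
        const token = jwt.sign({ userId: user.id }, JWT_SECRET, { expiresIn: '1h' });
        res.json({ token });
      } catch (error) {
        console.error(error);
        res.status(500).json({ error: 'Login failed' });
      }
    });

    // Endpoint for creating a new document (requires a valid token)
    app.post('/documents', authenticate, async (req, res) => {
      const { title, content } = req.body;
      try {
        const user = await prisma.user.findUnique({ where: { id: req.userId } });
        if (!user) {
          return res.status(404).json({ error: 'User not found' });
        }
        const document = await prisma.document.create({
          data: {
            title,
            content,
            ownerId: user.id,
            collaborators: { connect: { id: user.id } },
            revisions: { create: { content, revisionNo: 1 } },
          },
        });
        res.status(201).json(document);
      } catch (error) {
        console.error(error);
        res.status(500).json({ error: 'Failed to create document' });
      }
    });

    // WebSocket event handling for real-time collaboration
    io.on('connection', (socket) => {
      console.log(`User ${socket.userId} connected`);

      // Join a per-document room, but only if the user may access the document
      socket.on('join-document', async ({ documentId } = {}) => {
        try {
          const document = await prisma.document.findFirst({
            where: {
              id: documentId,
              OR: [
                { ownerId: socket.userId },
                { collaborators: { some: { id: socket.userId } } },
              ],
            },
            select: { id: true },
          });
          if (document) socket.join(`document:${documentId}`);
        } catch (error) {
          console.error('Error joining document:', error);
        }
      });

      socket.on('document-edit', async ({ documentId, content } = {}) => {
        const room = `document:${documentId}`;
        // Ignore edits from sockets that have not joined this document
        if (!socket.rooms.has(room)) return;
        try {
          // Store the revision and the current content together. Concurrent
          // edits can still compute the same revisionNo; the unique constraint
          // on (documentId, revisionNo) then rejects the later one.
          await prisma.$transaction(async (tx) => {
            const revisionNo = (await tx.documentRevision.count({ where: { documentId } })) + 1;
            await tx.documentRevision.create({ data: { documentId, content, revisionNo } });
            await tx.document.update({ where: { id: documentId }, data: { content } });
          });
          // Notify only the other participants of this document
          socket.to(room).emit('document-update', { documentId, content });
        } catch (error) {
          console.error('Error editing document:', error);
        }
      });

      socket.on('disconnect', () => {
        console.log(`User ${socket.userId} disconnected`);
      });
    });

    const PORT = process.env.PORT || 3000;
    server.listen(PORT, () => {
      console.log(`Server is running on http://localhost:${PORT}`);
    });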
- Database Layer
  - PostgreSQL: Stores user data, document content, revisions, and transactional information.
  - Redis: Used for caching document content, session management, and improving real-time collaboration performance.
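The caching role described for Redis is often implemented as a cache-aside read path. The following is a minimal sketch under stated assumptions, not part of the original design: `createMemoryCache` and the `db` object are hypothetical in-memory stand-ins (a real deployment would call a Redis client and Prisma), and the key format and TTL are illustrative choices.

```javascript
// Cache-aside read path for document content (illustrative sketch).
// `cache` stands in for Redis and `db` for PostgreSQL; both are hypothetical
// in-memory stubs so the example runs without external services.
const CACHE_TTL_MS = 60_000; // assumed TTL; tune for real workloads

function createMemoryCache(now = () => Date.now()) {
  const entries = new Map();
  return {
    async get(key) {
      const entry = entries.get(key);
      if (!entry) return null;
      if (entry.expiresAt <= now()) {
        entries.delete(key); // expired entries behave like misses
        return null;
      }
      return entry.value;
    },
    async set(key, value, ttlMs) {
      entries.set(key, { value, expiresAt: now() + ttlMs });
    },
    async del(key) {
      entries.delete(key);
    },
  };
}

async function getDocumentContent(documentId, cache, db) {
  const key = `document:${documentId}:content`;
  const cached = await cache.get(key);
  if (cached !== null) return cached; // cache hit
  const content = await db.loadContent(documentId); // miss: read the source of truth
  if (content !== null) await cache.set(key, content, CACHE_TTL_MS);
  return content;
}

// After a write, drop the cached copy so the next read reloads fresh content.
async function saveDocumentContent(documentId, content, cache, db) {
  await db.saveContent(documentId, content);
  await cache.del(`document:${documentId}:content`);
}
```

Invalidating on write rather than updating the cached value keeps the write path simple, at the cost of one extra database read after each save.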
- Real-time Collaboration
  - WebSocket (Socket.io): Enables real-time synchronization of document edits across multiple users, providing a seamless collaborative experience.
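Synchronizing edits also means deciding what happens when two users edit the same revision at once. One simple policy, assumed here purely for illustration, is to reject an edit whose base revision is stale and ask that client to resync; production editors typically rely on operational transformation or CRDTs instead. `DocumentSession` is a hypothetical in-memory helper, not a Socket.io API.

```javascript
// Optimistic concurrency for whole-document edits (illustrative sketch).
class DocumentSession {
  constructor(content = '') {
    this.content = content;
    this.revisionNo = 1;
  }

  // Accept the edit only if the client started from the latest revision.
  applyEdit({ baseRevisionNo, content }) {
    if (baseRevisionNo !== this.revisionNo) {
      // Stale edit: return the current state so the client can resync
      // instead of silently overwriting newer work.
      return { accepted: false, revisionNo: this.revisionNo, content: this.content };
    }
    this.content = content;
    this.revisionNo += 1;
    return { accepted: true, revisionNo: this.revisionNo, content: this.content };
  }
}

const session = new DocumentSession('Hello');
const first = session.applyEdit({ baseRevisionNo: 1, content: 'Hello, world' });
const stale = session.applyEdit({ baseRevisionNo: 1, content: 'Hello there' });
console.log(first.accepted, stale.accepted, session.content);
// prints: true false Hello, world
```

In a Socket.io handler, an accepted result would be broadcast to the document's room, while a rejected one would be sent back only to the originating socket.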
- Authentication and Authorization
  - JWT: Token-based authentication for securing API endpoints and managing user sessions.
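To make the token mechanics concrete, the sketch below builds and checks an HS256 token with only Node's built-in `crypto` module. It is an illustration of what a JWT contains, not a replacement for a maintained library such as `jsonwebtoken`; the function names `signToken` and `verifyToken` and the injectable `nowSeconds` parameter are assumptions made for this example.

```javascript
// Anatomy of an HS256 JWT: base64url(header).base64url(payload).signature
const { createHmac, timingSafeEqual } = require('crypto');

const b64url = (text) => Buffer.from(text).toString('base64url');
const currentSeconds = () => Math.floor(Date.now() / 1000);

function signToken(payload, secret, expiresInSeconds, nowSeconds = currentSeconds()) {
  const header = b64url(JSON.stringify({ alg: 'HS256', typ: 'JWT' }));
  const claims = { ...payload, iat: nowSeconds, exp: nowSeconds + expiresInSeconds };
  const body = b64url(JSON.stringify(claims));
  const signature = createHmac('sha256', secret).update(`${header}.${body}`).digest('base64url');
  return `${header}.${body}.${signature}`;
}

function verifyToken(token, secret, nowSeconds = currentSeconds()) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Malformed token');
  const [header, body, signature] = parts;

  // Recompute the signature and compare in constant time.
  const expected = createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const actual = Buffer.from(signature, 'base64url');
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) {
    throw new Error('Invalid signature');
  }

  // Only trust the header and claims after the signature has been checked.
  const { alg } = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
  if (alg !== 'HS256') throw new Error('Unexpected algorithm');
  const claims = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
  if (claims.exp <= nowSeconds) throw new Error('Token expired');
  return claims;
}
```

The payload is only encoded, not encrypted, so a token should carry identifiers such as `userId` rather than secrets.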
- Messaging
  - RabbitMQ: Used for handling asynchronous tasks such as sending notifications to users about document updates.
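The point of the queue is that the edit handler only publishes a small message and returns, while a separate worker delivers notifications and retries failures. The sketch below shows that shape with a hypothetical `InMemoryQueue` standing in for a RabbitMQ queue; the message fields, `queueDocumentUpdated`, and the retry limit are assumptions for illustration, and a real deployment would publish and consume through an AMQP client.

```javascript
// Decoupling notifications from the request path (illustrative sketch).
class InMemoryQueue {
  constructor() {
    this.messages = [];
  }

  publish(message) {
    // Serialize the message the way a broker would carry it.
    this.messages.push(JSON.stringify(message));
  }

  // Hand each message to the handler, retrying failures up to maxAttempts.
  // Returns the messages that never succeeded (a stand-in for a dead-letter queue).
  drain(handler, maxAttempts = 3) {
    const pending = this.messages.map((body) => ({ body, attempts: 0 }));
    this.messages = [];
    const failed = [];
    while (pending.length > 0) {
      const item = pending.shift();
      item.attempts += 1;
      try {
        handler(JSON.parse(item.body)); // returning normally acts as the "ack"
      } catch (error) {
        if (item.attempts < maxAttempts) pending.push(item);
        else failed.push(JSON.parse(item.body));
      }
    }
    return failed;
  }
}

// Producer: called after an edit is saved; it does no slow work itself.
function queueDocumentUpdated(queue, { documentId, editorId, collaboratorIds }) {
  for (const userId of collaboratorIds) {
    if (userId !== editorId) {
      queue.publish({ type: 'document-updated', documentId, userId });
    }
  }
}
```

Because a failed message is handed to the handler again, the handler should be idempotent, for example by deduplicating on document and user.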
- Storage
  - Amazon S3: Stores document revisions and attachments securely and provides efficient retrieval.
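If revision snapshots are written to S3, retrieval is easiest when object keys follow a predictable layout. The scheme below is a hypothetical convention, not an S3 requirement: `revisionKey`, `parseRevisionKey`, and the `documents/<id>/revisions/<n>.json` pattern are assumptions for illustration.

```javascript
// Object-key layout for revision snapshots (illustrative sketch).
const REVISION_DIGITS = 10;

function revisionKey(documentId, revisionNo) {
  if (!Number.isInteger(documentId) || !Number.isInteger(revisionNo) || revisionNo < 1) {
    throw new RangeError('documentId and revisionNo must be integers, with revisionNo >= 1');
  }
  // Zero-pad so that lexicographic key order matches revision order
  // when objects are listed under the document's prefix.
  const padded = String(revisionNo).padStart(REVISION_DIGITS, '0');
  return `documents/${documentId}/revisions/${padded}.json`;
}

function parseRevisionKey(key) {
  const match = /^documents\/(\d+)\/revisions\/(\d+)\.json$/.exec(key);
  if (!match) return null;
  return { documentId: Number(match[1]), revisionNo: Number(match[2]) };
}

console.log(revisionKey(42, 7));
// prints: documents/42/revisions/0000000007.json
```

Grouping keys by document also means all revisions of one document share a prefix, which keeps listing and lifecycle rules scoped to that document.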
- Monitoring and Analytics
  - Prometheus and Grafana: Monitor system metrics, track performance, and troubleshoot issues proactively.
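Prometheus collects metrics by scraping a plain-text endpoint exposed by the service. The sketch below renders a counter in that text exposition format using no dependencies; in practice a Prometheus client library would normally do this, and the `Counter` class and the metric name `document_edits_total` are hypothetical.

```javascript
// Rendering a counter in the Prometheus text exposition format (illustrative sketch).
function formatLabels(labels) {
  const pairs = Object.keys(labels).sort().map((name) => {
    // Escape backslashes, double quotes, and newlines in label values.
    const value = String(labels[name])
      .replace(/\\/g, '\\\\')
      .replace(/"/g, '\\"')
      .replace(/\n/g, '\\n');
    return `${name}="${value}"`;
  });
  return pairs.length > 0 ? `{${pairs.join(',')}}` : '';
}

class Counter {
  constructor(name, help) {
    this.name = name;
    this.help = help;
    this.series = new Map(); // rendered label set -> running total
  }

  inc(labels = {}, amount = 1) {
    const key = formatLabels(labels);
    this.series.set(key, (this.series.get(key) || 0) + amount);
  }

  render() {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`];
    for (const [labels, value] of this.series) {
      lines.push(`${this.name}${labels} ${value}`);
    }
    return lines.join('\n') + '\n';
  }
}

const edits = new Counter('document_edits_total', 'Number of document edits received.');
edits.inc({ status: 'accepted' });
edits.inc({ status: 'accepted' });
edits.inc({ status: 'rejected' });
console.log(edits.render());
// prints:
// # HELP document_edits_total Number of document edits received.
// # TYPE document_edits_total counter
// document_edits_total{status="accepted"} 2
// document_edits_total{status="rejected"} 1
```

An Express route such as `GET /metrics` could return this string for Prometheus to scrape, with Grafana charting the resulting series.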
Why This Tech Stack?
- Node.js with Express.js: Lightweight and efficient for handling asynchronous I/O operations and building scalable APIs.
- Prisma: Simplifies database interactions with type-safe queries and migrations.
- WebSocket (Socket.io): Facilitates real-time collaboration by enabling bi-directional communication between clients and server.
- PostgreSQL: Provides relational data storage and supports complex queries for document management and user interactions.
- Redis: Improves performance with caching and pub/sub capabilities for real-time updates and session management.
- Amazon S3: Secure and scalable storage solution for storing document revisions and attachments.
Scalability and Fault Tolerance
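- Horizontal Scaling: Deploy multiple instances of the backend behind a load balancer to handle high traffic and ensure availability. With Socket.io, this also requires sticky sessions and a shared adapter (for example, the Redis adapter) so that events reach clients connected to other instances.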
- Database Sharding: Partition databases to distribute load and scale horizontally as the number of documents and users increases.
- Redundancy and Backup: Store backups in Amazon S3 and deploy services across multiple availability zones (AZs) for fault tolerance and disaster recovery.
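As a rough illustration of the sharding point above, the sketch below maps a document ID to a shard by hashing it. This is one possible placement scheme chosen for the example, not something the design prescribes; `shardFor` and the shard count are hypothetical.

```javascript
// Hash-based shard selection for documents (illustrative sketch).
const { createHash } = require('crypto');

function shardFor(documentId, shardCount) {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError('shardCount must be a positive integer');
  }
  // Hash the ID so that sequential IDs spread evenly across shards,
  // then reduce the first 32 bits of the digest to a shard index.
  const digest = createHash('sha256').update(String(documentId)).digest();
  return digest.readUInt32BE(0) % shardCount;
}

// The same document always maps to the same shard.
console.log(shardFor(42, 4) === shardFor(42, 4));
// prints: true
```

Plain modulo placement moves most documents to a different shard whenever the shard count changes, which is why larger systems often prefer consistent hashing or a lookup table that records each document's shard.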
Published on: Jul 10, 2024, 01:27 AM