The Backend of MITM Proxy
2023-03-03
TL;DR
This blog post discusses how man-in-the-middle proxy backend is built using the hyper, rustls, and tokio crates. A man-in-the-middle proxy is a type of proxy server that intercepts all traffic between a client and a server. This allows the proxy server to read and modify the data that is being sent between the two parties. The hyper crate is a high-level HTTP library that provides support for HTTP/1.1 and HTTP/2, request and response bodies, cookies, and redirects. The rustls crate is a TLS implementation that provides support for the TLS 1.2, 1.3, and 1.4 protocols, cipher suites, and certificate validation. The tokio crate is an asynchronous runtime that provides support for asynchronous I/O, tasks and futures, and concurrency and parallelism.
Toc
[
Hide]
-
1 Internal Proxy
- 1.1 Struct
- 1.2 proxy Method
-
1.3
process_connect Method
- 1.3.1 serve_stream function
- 1.4 upgrade_websocket Method
- 2 Rewind Crate
- 3 Proxyhandler Struct
- 4 Body decoder
Internal Proxy
Struct
The InternalProxy struct has several fields:
- ca: a
CertificateAuthoritystruct, which is used to generate TLS certificates for secure connections. - client: a
hyper Clientinstance, which is used to forward requests to the appropriate server. - http_handler: an
HttpHandlertrait object, which is used to handle HTTP requests and responses. - websocket_connector: an optional
Connectorstruct, which is used to establish WebSocket connections. - remote_addr: the IP address and port of the client making the request.
proxy Method
pub(crate) async fn proxy( mut self, req: Request<Body>, ) -> Result<Response<Body>, hyper::Error>{
let ctx = HttpContext {
remote_addr: self.remote_addr,
};
let req = match self
.http_handler
.handle_request(&ctx, req)
.await
{
RequestResponse::Request(req) => req,
RequestResponse::Response(res) => return Ok(res),
};
// handle request
}
The InternalProxy struct has a proxy method that takes a Request<Body> object, which contains the client's request. The method first creates an HttpContext struct to represent the context of the request, including the remote IP address. It then passes the request to the http_handler to handle it. If the handler returns a RequestResponse::Response variant, it means the request has been handled and the method can return the response to the client. Otherwise, the method proceeds to handle the request itself.
if req.method() == Method::CONNECT{
self.process_connect(req)
} else if hyper_tungstenite::is_upgrade_request(&req){
Ok(self.upgrade_websocket(req))
} else {
let res = self
.client
.request(normalize_request(req))
.await?;
Ok(self
.http_handler
.handle_response(&ctx, res)
.await)
}
If the request method is CONNECT, the method calls the process_connect method to handle it. If the request is an upgrade request for a WebSocket connection, the method calls the upgrade_websocket method to handle it. Otherwise, the method forwards the request to the appropriate server using the client instance, and returns the response to the client.
process_connect Method
The process_connect method handles a CONNECT request, which is used to establish a tunnel between the client and the server. The method upgrades the client's connection to a tunnel, and then reads the first few bytes of the tunnel to determine the protocol being used. If the protocol is HTTP or HTTPS, the method sets up a new connection to the appropriate server and establishes a new tunnel to that server. If the protocol is unknown, the method forwards the data over the tunnel to the server.
fn process_connect(self, mut req: Request<Body>) -> Result<Response<Body>, hyper::Error>{
// --snip--
}
It receives an incoming Request object, which represents an HTTP/HTTPS request.
let fut = async move{
match hyper::upgrade::on(&mut req).await{
Ok(mut upgraded) => { --snip-- },
Err(e) => eprintln!("Upgrade error {e}"),
}
}
It creates a future using the async move syntax, which creates an asynchronous closure that captures the self object and the req object. Then it calls the hyper::upgrade::on method, passing in the req object, which checks if the request is eligible for upgrade to a WebSocket.
let mut buffer = [0; 4];
let bytes_read = match upgraded.read(&mut buffer).await {
Ok(bytes_read) => bytes_read,
Err(e) => {
eprintln!("Failed to read from upgraded connection: {e}");
return;
}
};
let mut upgraded = Rewind::new_buffered(
upgraded,
bytes::Bytes::copy_from_slice(buffer[..bytes_read].as_ref()),
);
// -- snip --
If the request is eligible for upgrade (Ok(mut upgraded) branch), the function attempts to read the first four bytes of the connection using the read method of the upgraded connection. Rewind can "rewind" back to the initial bytes of the message if it needs to restart the WebSocket or HTTPS protocol. This is because the initial bytes of the message that triggered the upgrade request are stored in the pre field of the Rewind struct, .
if buffer == *b"GET " {
if let Err(e) = self.serve_stream(upgraded, Scheme::HTTP).await {
eprintln!("Websocket connect error: {e}");
}
} // --snip--
The function then checks whether the first four bytes are equal to the ASCII string "GET ". If so, it calls the serve_stream method with the upgraded connection and the Scheme::HTTP enum variant. This means that the connection will be upgraded to a WebSocket connection.
else if buffer[..2] == *b"\x16\x03"{
let authority = req.uri()
.authority()
.expect("Uri doesn't contain authority");
let server_config = self
.ca
.gen_server_config(authority)
.await;
let stream = match TlsAcceptor::from(server_config).accept(upgraded).await {
Ok(stream) => stream,
Err(e) => {
eprintln!("Failed to enstablish TLS Connection:{e}");
return;
}
};
if let Err(e) = self.serve_stream(stream, Scheme::HTTPS).await {
if !e.to_string().starts_with("error shutting down connection"){
eprintln!("HTTPS connect error: {e}");
}
}
} //--snip--
If the first four bytes are not equal to "GET ", the function checks whether the first two bytes are equal to "\x16\x03", which is the signature for an SSL/TLS connection. If so, the function extracts the server name from the request URI and creates a new TlsAcceptor object using the server's certificate authority. It then attempts to accept a new TLS connection using the accept method of the TlsAcceptor.
If the TLS connection is successfully established, the function calls the serve_stream method with the new TLS connection and the Scheme::HTTPS enum variant.
else {
eprintln!(
"Unknown protocol, read '{:02X?}' from upgraded connection",
&buffer[..bytes_read]
);
let authority = req
.uri()
.authority()
.expect("Uri doesn't contain authority")
.as_ref();
let mut server = match TcpStream::connect(authority).await {
Ok(server) => server,
Err(e) => {
eprintln!{"failed to connect to {authority}: {e}"};
return;
}
};
if let Err(e) = tokio::io::copy_bidirectional(&mut upgraded, &mut server).await {
eprintln!("Failed to tunnel unknown protocol to {}: {}", authority, e);
}
}
//--snip--
If the first four bytes are not equal to "GET " and the first two bytes are not equal to "\x16\x03", the function assumes that the connection is a plaintext TCP connection. It extracts the server name from the request URI and attempts to establish a new TCP connection using the TcpStream::connect method.
If the TCP connection is successfully established, the function uses the tokio::io::copy_bidirectional method to tunnel the plaintext TCP connection to the remote server.
tokio::spawn(fut);
Ok(Response::new(Body::empty()))
Finally, the function spawns the fut asynchronous closure using the tokio::spawn method.
serve_stream function
upgrade_websocket Method
This method is used to upgrade an HTTP connection to a WebSocket connection. It takes in a Request object and returns a Response object with a body of Body.
let mut req = {
let (mut parts, _) = req.into_parts();
parts.uri = {
let mut parts = parts.uri.into_parts();
parts.scheme = if parts.scheme.unwrap_or(Scheme::HTTP) == Scheme::HTTP {
Some("ws".try_into().expect("Failed to convert scheme"))
} else {
Some("wss".try_into().expect("Failed to convert scheme"))
};
Uri::from_parts(parts).expect("Failed to build URI")
};
Request::from_parts(parts, ())
};
First, the Request object is deconstructed into its parts using the into_parts() method. The URI scheme is then checked and converted to either "ws" or "wss" if it is "http" or "https", respectively.
let (res, websocket) = hyper_tungstenite::upgrade(&mut req, None).expect("Request missing headers");
The upgrade() function from the hyper_tungstenite crate is then used to upgrade the connection to a WebSocket connection.
let fut = async move{
match websocket.await{
Ok(ws) => {
if let Err(e) = self.handle_websocket(ws, req).await {
eprintln!("Failed to handle websocket: {e}");
}
}
Err(e) => {
eprintln!("Failed to upgrade to websocket: {e}");
}
}
};
tokio::spawn(fut);
A new asynchronous task is spawned using the tokio::spawn() function to handle the WebSocket connection. The handle_websocket() method (currently an empty method) is called with the WebSocketStream and original Request objects as arguments. If an error occurs during the handling of the WebSocket, an error message is printed to the console.
Rewind Crate
This crate provides a rewindable wrapper for an asynchronous read/write stream. This can be useful because it allows to read and modif the data that is being sent between the client and the server. rewind() method allows you to reset the buffer to the beginning. This can be useful if you want to replay data that has been read from the underlying stream.
#[allow(dead_code)]
pub(crate) fn rewind(&mut self, bs: Bytes) {
debug_assert!(self.pre.is_none());
self.pre = Some(bs);
}
Proxyhandler Struct
It can be used to modify HTTP and HTTPS traffic. The ProxyHandler struct implements the Handler trait, which provides methods for handling HTTP and HTTPS requests and responses.
handle_request() method handles an HTTP request, it first copies the body of the request to a temporary buffer, then it creates a new ProxiedRequest object from the request headers, body and timestamp. Finally it returns the request object. The same is done with responses via the handle_responses() method.
Body decoder
This crate doesn't exists yet, its goal will be to decode the encripted bodies to make them readable. The body decoder will be easy to use and will be able to decode bodies from a variety of sources including HTTP requests, HTTP responses and Files.