In the previous blog post, I went over how to implement a simple git server over HTTP.
After a lot more code (138 commits) and still not a single line of frontend, here we are again with another post about upsilon, this time explaining how to write a server that acts as a git remote over the ssh:// protocol.
The age-old question: Why?
Well, ssh:// is a very common protocol for serving git repositories, and although it is maybe not as common as http(s), it is still quite handy and easy to use. For completeness’ sake, I wanted to support it, and this was also a fantastic opportunity to learn how to actually write an ssh server.
As far as implementation details go, the ssh parts of the server are pretty similar to how the server handles the data-store: there are some common operations that are shared between ssh servers, and swapping out the actual server implementation is as easy as changing a few configuration variables.
That being said, similar to the data-stores, there’s currently only one implementation of it, which is based on russh. Using OpenSSH sshd for the server doesn’t seem like that good of an idea, as authentication is a little harder to handle when we don’t know who is who, which gave rise to our custom ssh server. I really do hope I didn’t introduce any major security flaws, given my lack of prior knowledge of the details of the ssh protocol itself and how it should be used.
How?
I’m glad you asked. Looking back on it, I would rank the ssh server somewhere in between git-daemon (for the git:// protocol) and git-http-backend (for the http(s):// protocol) in terms of difficulty and time required to implement. I didn’t spend multiple days banging my head against the wall, asking why the git client thinks the repository is corrupted, as I did when writing the http backend, but I still learned quite a bit about ssh here… Well, you actually learn more when things don’t go right, am I right? But ssh had its own fair share of problems too, so let’s get to it.
First things first
The ssh:// protocol is very different from http://, so first a few basic things about how git over ssh:// operates.
- In ssh, after all the preliminary checks have been performed (the client checking whether the server is in the ~/.ssh/known_hosts list), the client authenticates itself (only ssh keys are allowed: no username / password and, god forbid, no unauthenticated access; with no authentication at all we might as well just use the git:// protocol, there would be no point in bothering with ssh://, heck, it is even faster!).
- After the client has authenticated itself, the ssh session is started, which will go on until the client disconnects.
- In this session, the client can open multiple channels, which are basically just streams of data (the server can open channels too, but for the purposes of git that should not happen, or otherwise the git client would be really surprised and confused).
- On a channel, the client can send some “requests”, but for git only one of them is relevant: exec. This is also what happens when you pass a command to run when you connect with the plain ssh command (e.g. ssh [email protected] echo 1). The exec request is used to run a single command on a channel. The ssh server, when it receives one such request, spawns a shell (sh -c '...') and runs the command inside it, piping stdout and stderr to the channel (stdout as data, stderr as extended_data with type code 1), and piping from the channel (data) to the shell’s stdin.
- On an exec request, the server should also send a success message, which is basically just a confirmation that the command was started successfully, or a failure message if it failed to start.
- The actual git commands that the git client sends are git-upload-pack, git-receive-pack and git-upload-archive, which are the commands used to fetch from, push to, and serve git archive --remote with a repository respectively. These commands are run on the channel that was opened by the client, and the server will respond with the appropriate data.
- Those commands are meant to be run inside git-shell. As it turns out, it is not available in git-for-windows, so the webserver just says “no” when we ask it to run an ssh server on windows.
- All the commands are “interactive”-ish, as in they all use stdin to do their job. This is why we need to pipe from the channel data to the shell’s stdin, or otherwise they just will not work.
- After the client decides it is done sending, it sends an eof message, after which the server will close the stdin pipe, which the corresponding shell process (and thus the git command that was run in said shell process) will pick up on and finish up. After that, the server will go on and send the rest of the data that it has to send (stdout and stderr), and finally announce to the client the exit code of the command that was run.
- After that, the git client decides it is done, and so it closes the channel, closes the session, and disconnects.
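To make the exec part of that list a bit more concrete, here is a minimal sketch of the flow using only the Rust standard library. It is not the code from upsilon: the ChannelMsg enum and the handle_exec function are hypothetical names made up for illustration, the whole client input is assumed to arrive before the eof, and a real server would stream data in both directions concurrently instead of buffering it.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Type code that marks extended data as stderr in the ssh protocol.
const SSH_EXTENDED_DATA_STDERR: u32 = 1;

/// Hypothetical stand-ins for the messages a server sends back on a channel.
#[derive(Debug, PartialEq)]
enum ChannelMsg {
    Success,
    Failure,
    Data(Vec<u8>),              // stdout of the command
    ExtendedData(u32, Vec<u8>), // stderr of the command, tagged with its type code
    ExitStatus(i32),
}

/// Handles one exec request: runs `command` in a shell, feeds it the data the
/// client sent before its eof, and returns the replies in the order they
/// would be sent on the channel.
fn handle_exec(command: &str, client_data: &[u8]) -> io::Result<Vec<ChannelMsg>> {
    let spawned = Command::new("sh")
        .arg("-c")
        .arg(command)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn();

    // The command could not even be started: answer the request with a failure.
    let mut child = match spawned {
        Ok(child) => child,
        Err(_) => return Ok(vec![ChannelMsg::Failure]),
    };
    let mut replies = vec![ChannelMsg::Success];

    // Channel data goes to the shell's stdin. Dropping the handle closes the
    // pipe, which is what the server does when the client sends eof.
    // A write error is ignored here: the command may exit without reading.
    // Assumption: the input is small; large inputs need concurrent streaming.
    let mut stdin = child.stdin.take().expect("stdin was requested as a pipe");
    let _ = stdin.write_all(client_data);
    drop(stdin);

    // Collect whatever the command still has to say, then wait for it to exit.
    let output = child.wait_with_output()?;
    if !output.stdout.is_empty() {
        replies.push(ChannelMsg::Data(output.stdout));
    }
    if !output.stderr.is_empty() {
        replies.push(ChannelMsg::ExtendedData(SSH_EXTENDED_DATA_STDERR, output.stderr));
    }
    // `code()` is `None` when the process was killed by a signal; -1 is an
    // arbitrary placeholder chosen for this sketch.
    replies.push(ChannelMsg::ExitStatus(output.status.code().unwrap_or(-1)));
    Ok(replies)
}
```

Calling handle_exec("cat", b"hello") returns the success confirmation, then the echoed data, then the exit status, which mirrors the order of events in the list above.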
Actual implementation process
I’ve said above that I used russh for the ssh server, which is a “fork” of thrussh (if it can be considered one), both of which are pure-rust ssh server (and client, but that’s beside the point) libraries.
The first thing I did was to figure out how to get the authentication bit right, and russh has a very nice example for the server side of things.
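As a rough illustration of what “knowing who is who” can look like, here is a small sketch of a key-to-user lookup that only ever accepts public keys. Everything in it (KeyStore, AuthAttempt, AuthResult, UserId) is a hypothetical name invented for this example and has nothing to do with the russh API. It also assumes that the ssh library has already verified the client’s signature before the lookup happens, so it must not be read as a complete authentication scheme.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a user known to the server.
#[derive(Debug, PartialEq, Clone, Copy)]
struct UserId(u64);

/// The ways a client might try to authenticate (illustrative only).
enum AuthAttempt<'a> {
    NoAuth,
    Password { user: &'a str, password: &'a str },
    PublicKey { algorithm: &'a str, key_blob: &'a str },
}

#[derive(Debug, PartialEq)]
enum AuthResult {
    Accept(UserId),
    Reject,
}

/// Maps (algorithm, base64 key blob) pairs to the user who registered them.
struct KeyStore {
    keys: HashMap<(String, String), UserId>,
}

impl KeyStore {
    fn new() -> Self {
        KeyStore { keys: HashMap::new() }
    }

    /// Registers a key given in the usual "algorithm blob [comment]" form.
    /// Returns false if the line does not have at least two fields.
    fn add_key(&mut self, public_key_line: &str, user: UserId) -> bool {
        let mut parts = public_key_line.split_whitespace();
        match (parts.next(), parts.next()) {
            (Some(algorithm), Some(blob)) => {
                self.keys.insert((algorithm.to_string(), blob.to_string()), user);
                true
            }
            _ => false,
        }
    }

    /// Only public keys are accepted; passwords and "none" are always rejected.
    /// Assumption: the signature proving possession of the private key has
    /// already been checked by the ssh library before this is called.
    fn authenticate(&self, attempt: &AuthAttempt) -> AuthResult {
        match attempt {
            AuthAttempt::PublicKey { algorithm, key_blob } => {
                let key = (algorithm.to_string(), key_blob.to_string());
                match self.keys.get(&key) {
                    Some(user) => AuthResult::Accept(*user),
                    None => AuthResult::Reject,
                }
            }
            AuthAttempt::NoAuth | AuthAttempt::Password { .. } => AuthResult::Reject,
        }
    }
}
```

The point of the lookup is that a successful authentication yields a user, not just a yes or no, so later permission checks on a repository know who is asking.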
After that, I had to figure out how to actually run the git commands, which proved to be a little harder on windows (due to the lack of git-shell), so I just gave up on ssh on windows for now.
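For illustration, here is one way the command from an exec request could be checked before handing it to git-shell. This is a sketch under assumptions, not the code from upsilon: GitService, parse_exec and git_shell_command are hypothetical names, the request is assumed to have the usual form of a command name followed by a single-quoted repository path, and the path checks are deliberately crude.

```rust
use std::process::Command;

/// The three commands a git client is expected to send over ssh.
#[derive(Debug, PartialEq, Clone, Copy)]
enum GitService {
    UploadPack,    // fetch / clone
    ReceivePack,   // push
    UploadArchive, // git archive --remote
}

impl GitService {
    fn name(self) -> &'static str {
        match self {
            GitService::UploadPack => "git-upload-pack",
            GitService::ReceivePack => "git-receive-pack",
            GitService::UploadArchive => "git-upload-archive",
        }
    }
}

/// Parses an exec request such as `git-upload-pack '/user/repo.git'`.
/// Returns None for anything that is not one of the three git commands
/// or whose path looks suspicious.
fn parse_exec(request: &str) -> Option<(GitService, String)> {
    let (command, argument) = request.trim().split_once(' ')?;
    let service = match command {
        "git-upload-pack" => GitService::UploadPack,
        "git-receive-pack" => GitService::ReceivePack,
        "git-upload-archive" => GitService::UploadArchive,
        _ => return None,
    };

    // Strip one pair of surrounding single quotes, if present.
    let argument = argument.trim();
    let path = argument
        .strip_prefix('\'')
        .and_then(|rest| rest.strip_suffix('\''))
        .unwrap_or(argument);

    // Crude safety checks (assumptions of this sketch): no quotes that could
    // break out of the quoting, no leading dash, no parent-directory segments.
    let suspicious = path.is_empty()
        || path.contains('\'')
        || path.starts_with('-')
        || path.split('/').any(|segment| segment == "..");
    if suspicious {
        return None;
    }
    Some((service, path.to_string()))
}

/// Builds (but does not run) the git-shell invocation for a parsed request.
fn git_shell_command(service: GitService, path: &str) -> Command {
    let mut command = Command::new("git-shell");
    command.arg("-c").arg(format!("{} '{}'", service.name(), path));
    command
}
```

A real server would additionally map the requested path onto wherever the repositories actually live and check that the authenticated user may access it before spawning anything.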
Tests that use ssh are also ignored on windows because of that, but running Ubuntu in WSL2 on windows is a good workaround for now.
The whole (current) implementation of the russh server is in the upsilon-ssh-russh crate.