Glen Pitt-Pladdy :: Blog

git http with Nginx via Flask wsgi application (git4nginx)

Out of the box, git ships with a cgi program so that it can be served by any webserver that supports cgi, like Apache. These days, with the big shift to Nginx, which is an extremely performant pure webserver (without all the embellishments), this is a whole lot more difficult to do. While there are tricks like cgi wrappers, Apache also provides mechanisms for some basic permissions control (eg. read-only users), which isn't so easy to achieve with Nginx given its focus on pure webserving.

Since I've been migrating a lot of my older services to Nginx, this triggered me to start coding...

CGI Wrapper

The basic requirement is to execute git-http-backend as cgi (appropriate environment variables, with POST data on stdin and output from stdout). This can be done with some existing tools, which frightened me the moment I realised they install with world access to execute cgi! It never ceases to amaze me how many things default to an insecure state.
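
As a rough illustration of what "executing as cgi" involves, here is a minimal Python sketch. It is not the git4nginx code: the function names are made up for this example, and the set of environment variables is just the subset I would expect git-http-backend to need. The backend's install path varies by system, so it is left to the caller.

```python
import subprocess


def git_backend_env(project_root, method, path_info, query, user):
    """Build a cgi environment for git-http-backend (illustrative subset)."""
    return {
        "GIT_PROJECT_ROOT": project_root,  # directory holding the repos
        "GIT_HTTP_EXPORT_ALL": "1",        # assumption: export every repo found
        "REQUEST_METHOD": method,
        "PATH_INFO": path_info,
        "QUERY_STRING": query,
        "REMOTE_USER": user,               # authenticated user, visible to hooks
    }


def run_cgi(command, cgi_env, body=b""):
    """Run a cgi program: metadata in the environment, request body on stdin."""
    result = subprocess.run(
        command,
        input=body,
        env=cgi_env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,
    )
    return result.stdout


def split_cgi_response(raw):
    """Split raw cgi output into (status, headers, body)."""
    # Headers end at the first blank line, written with either line ending
    found = [(raw.find(sep), sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        raise ValueError("no header/body separator in cgi output")
    pos, sep = min(found)
    head, body = raw[:pos], raw[pos + len(sep):]
    status = "200 OK"  # cgi default when no Status header is sent
    headers = []
    for line in head.splitlines():
        name, _, value = line.decode("latin-1").partition(":")
        if name.strip().lower() == "status":
            status = value.strip()
        else:
            headers.append((name.strip(), value.strip()))
    return status, headers, body
```

A wrapper would call run_cgi() with the full path to git-http-backend and hand the parsed status, headers and body back to the web framework. POST requests would also need CONTENT_TYPE added to the environment.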

This also left me with the problem of protecting some repos as read-only for some users. The permissions available in Apache for doing this are minimal, but sufficient most of the time if enough effort is spent getting the setup right. It made sense to me that we really need user groups support and repo (project) group support would also be useful. Then, any permission can be achieved with a combination of the user and user groups being applied to the projects and project groups.

For finer grained control, the git hook environment also contains execution environment variables such as REMOTE_USER, and we can add custom variables (eg. the location of the config file) to enable hooks to apply restrictions such as branch based controls.

Nginx to uwsgi, to Flask, to git

This uses Flask, a clean, minimal web (http) framework for Python which makes it fast and easy to build minimal APIs, websites and most other http related things. It's typically paired with Nginx to do the front end http stuff (TLS, static files, etc.). This is the basis for this project.

The first working version took a morning to work out, including collecting all the information needed to have some confidence that git was working as expected, and knowing what limitations applied and at what point controls would need to move to hooks. After a few rounds of refinement I've added some sanity checking and better configurability: authorisation (read or write) can be applied at the top level of the hierarchy, or to an individual repo, on a user or group basis.

This provides the basic functionality that many have been used to with Apache, including user groups, with the benefit that it can be applied in a finer grained manner per-repo, which is more difficult with Apache.

Get the Code & setup

Be conscious that this is very early stage code, mostly put together an hour here and there, and it likely has bugs. No significant tests have been created yet and things will almost certainly change over time.

The project is in GitHub at https://github.com/glenpp/git4nginx

Dependencies

You will need to have at least the githttp.py file and a configuration file (example provided) in a location where uwsgi can execute it (www-data user & group on Debian based systems). If you don't already have them then you will need some dependencies. On Debian based systems these packages would be needed:

  • git
  • nginx
  • uwsgi
  • uwsgi-plugin-python3
  • python3-flask
  • python3-yaml
  • apache2-utils (for managing htpasswd files also used by Nginx)

The rest are likely automatic dependencies and core Python, but there is a chance that I've taken something for granted as being available.

uwsgi

On Debian based systems getting the application running should be fairly straightforward:

  1. Make a copy of the uwsgi example config in /etc/uwsgi/apps-available/ under a suitable application name
  2. Edit the file to reflect the location of githttp.py and any other specifics of your setup
  3. Set the environment variable GIT4NGINX_CONFIG to the path to the configuration file
  4. Optionally set the environment variable GIT4NGINX_LOG_LEVEL to a keyword for your level of log detail (eg. DEBUG, INFO or WARNING)
  5. Symlink this file to /etc/uwsgi/apps-enabled/ to enable the configuration
  6. Reload (or restart if appropriate) uwsgi and check the logs below /var/log/uwsgi/app/ to verify everything is healthy
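
To show how the two environment variables from the steps above might be consumed on the Python side, here is a small sketch. The function name and the WARNING default are assumptions for illustration, not taken from githttp.py.

```python
import logging
import os


def load_settings(environ=os.environ):
    """Return (config_path, log_level) from the uwsgi-provided environment."""
    config_path = environ.get("GIT4NGINX_CONFIG")
    if not config_path:
        raise RuntimeError("GIT4NGINX_CONFIG is not set")
    # Assumption: fall back to WARNING when no level is configured
    level_name = environ.get("GIT4NGINX_LOG_LEVEL", "WARNING").upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        raise RuntimeError(f"unknown log level: {level_name}")
    return config_path, level
```

Failing loudly when the config path is missing makes a misconfigured uwsgi app show up straight away in the logs rather than at the first request.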

Git Repos

The repos you are serving need to be fully accessible to the www-data user and/or group (assuming Debian based again). It would probably complicate things a lot (and defeat access controls) if they were also being accessed by other mechanisms like ssh or direct filesystem access.

The simplest approach would really be to just have the repos owned and exclusively used by the application user.

This supports either having all the repos at the top level of a directory, or having a layer of project group directories, which likely suits more complex environments such as multiple teams which might each have multiple projects.

Config file

The config file is in YAML which is a good balance between human and machine readability. This needs to include the path to your repos, user authentication (includes group memberships) and authorisation (what the user is authorised to access).

It's important to understand that for Authorisation, permissions are inherited and accumulate, so a user might have read permission from the top level but pick up write access in config specific to a repo.
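
The accumulation can be pictured with a short Python sketch. The data layout here ("users" and "groups" keys mapping names to lists of permissions) is a hypothetical stand-in and not the real config schema - see the example config in the project for that.

```python
def effective_permissions(levels, user, groups):
    """Union the permissions granted at every level, broadest level first."""
    granted = set()
    for level in levels:
        # Permissions granted directly to the user at this level
        granted.update(level.get("users", {}).get(user, []))
        # Permissions granted through any of the user's groups
        for group in groups:
            granted.update(level.get("groups", {}).get(group, []))
    return granted


# Hypothetical levels: top level, then one specific repo
top = {"groups": {"staff": ["read"]}}
repo = {"users": {"alice": ["write"]}}
alice = effective_permissions([top, repo], "alice", ["staff"])
```

Here alice ends up with both read (via the staff group at the top level) and write (from the repo level), while another member of staff would only have read.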

For Hook, the configuration is also inherited from higher levels, but here the configuration for a hook plugin overrides inherited configuration. This allows a plugin to be enabled globally or for a whole project group, then disabled or given specific configuration on a per-repo basis.
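
In contrast to the accumulating permissions, the override behaviour for hook plugins could be sketched like this. The plugin keys, the use of False to disable a plugin, and replacing a plugin's settings wholesale (rather than merging key by key) are all assumptions made for the example.

```python
def effective_hook_config(levels):
    """Resolve hook plugin config; the most specific level wins per plugin."""
    merged = {}
    for level in levels:  # ordered from global down to the repo
        for plugin, settings in level.items():
            merged[plugin] = settings  # replaces anything inherited
    # Assumption: a plugin set to False at a deeper level is switched off
    return {plugin: cfg for plugin, cfg in merged.items() if cfg is not False}


# Hypothetical config: enabled globally, switched off for one repo
global_level = {"branch_flow": {"flow": ["dev", "test", "master"]}}
repo_level = {"branch_flow": False}
resolved = effective_hook_config([global_level, repo_level])
```

With these inputs the plugin is disabled for that repo only, while every other repo still inherits the global setting.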

 

Nginx

The example configuration for Nginx is not complete and largely shows an example location section to support this tool. The rest of the file would need to be crafted to suit your environment with appropriate TLS configuration.

The important things to note in this file are:

  • set REMOTE_USER - this passes user information through to the application which doesn't happen by default
  • rewrite the location so the application operates from the root irrespective of what url you serve your git repos on

On Debian based systems this config would normally go below /etc/nginx/sites-available/ and be enabled with a symlink from /etc/nginx/sites-enabled/

Reload (or restart if appropriate) Nginx and then start testing.

Creating Repos

The application requires that all repos end .git and that their names (and project group directories) are a limited length and only contain basic characters (alphanumeric, - and _) to be safe.
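
A naming check along these lines could look like the following sketch. The length limit of 64 and the function name are assumptions for illustration - the real limits are whatever githttp.py enforces.

```python
import re

MAX_NAME_LENGTH = 64  # hypothetical limit, not taken from the project
_SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def parse_repo_path(path):
    """Return (project_group or None, repo) for 'repo.git' or 'group/repo.git'."""
    parts = path.strip("/").split("/")
    if len(parts) not in (1, 2):
        raise ValueError("expected repo.git or group/repo.git")
    *group, repo = parts
    if not repo.endswith(".git"):
        raise ValueError("repo name must end .git")
    # Validate the group (if any) and the repo name without its suffix
    for name in group + [repo[: -len(".git")]]:
        if len(name) > MAX_NAME_LENGTH or not _SAFE_NAME.fullmatch(name):
            raise ValueError(f"unsafe name: {name!r}")
    return (group[0] if group else None), repo
```

Matching the whole name against a small whitelist rejects things like ".." or embedded slashes without needing a list of special cases.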

As mentioned above, the application (running as user and group www-data on Debian based systems) should have full access to the repo, and ideally exclusive access, otherwise you will have to work out additional complications.

If in doubt check the application log below /var/log/uwsgi/app/ where any problems should be logged.

Repos should be bare (--bare option) repos.

You might also like to configure additional protections as described at https://www.git-scm.com/book/en/v2/Customizing-Git-Git-Configuration such as these:

git config --local receive.denyNonFastforwards true
git config --local receive.denyDeletes true

And any others that are appropriate for your requirements.

A repo creation script is provided to simplify things: create_repo.sh

Adding Hooks

Additional restrictions can be applied with git hooks. These are done via a plugin based system which should only need symlinking from the hooks/ directory in the repo to the master script below hooks/. Most likely the setup.sh script will do all you need if run within the repo you have created.

The location of the config file and a temporary directory for logs are passed from githttp.py to the hooks via environment variables, and the log files are read and re-logged after the request for diagnostics.

The githttp.py application adds additional information in environment variables for use by hooks:

  • GIT4NGINX_CONFIG - as above, the path to the config file
  • REMOTE_USER - passed through as would happen normally with CGI, the authenticated user
  • GIT4NGINX_GROUPS - string of a JSON sequence (list) of the groups the user is a member of and can be used for applying additional access controls
  • GIT4NGINX_INFO - string of a JSON map (dict) of the user info configured for the user in the config file
  • GIT4NGINX_LOG_DIR - path to the temporary log directory used for passing back logs from plugins named _ and logged in the format '%(created)f [%(levelname)s] %(message)s (%(filename)s:%(lineno)d)'
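
A hook plugin written in Python could pick these variables up roughly as follows. This is a sketch: the function name and the returned dict layout are made up for the example, and only the variable names and log format come from the list above.

```python
import json
import logging
import os

# Log format quoted from the GIT4NGINX_LOG_DIR description above
LOG_FORMAT = "%(created)f [%(levelname)s] %(message)s (%(filename)s:%(lineno)d)"


def read_hook_context(environ=os.environ):
    """Collect what githttp.py passes to hooks into one dict."""
    return {
        "config": environ.get("GIT4NGINX_CONFIG"),
        "user": environ.get("REMOTE_USER"),
        # Both of these arrive as JSON strings and need decoding
        "groups": json.loads(environ.get("GIT4NGINX_GROUPS", "[]")),
        "info": json.loads(environ.get("GIT4NGINX_INFO", "{}")),
        "log_dir": environ.get("GIT4NGINX_LOG_DIR"),
    }


def hook_formatter():
    """Formatter producing lines in the expected log format."""
    return logging.Formatter(LOG_FORMAT)
```

Taking the environment as a parameter keeps the function easy to exercise outside of a real push.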

The hook plugins are configured in the main config file and the configuration for a plugin deepest in the structure (most specific to the repo, or in some cases branch) takes priority over broader configuration (eg. at a project group level).

Plugin: branch flow

This provides a simple protection against putting code into the wrong branch, which is easily done by cloning a repo (master branch by default) and not switching to the development or feature branch as appropriate before committing. It requires that any revisions being pushed to a configured branch are already present in the preceding branch before allowing the push. This discourages mistakes and tries to ensure the right flow of code through branches is followed (eg. dev->test->master).
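
The core of such a check can be sketched in a few lines of Python. This is not the plugin's actual code: the function names are invented, the flow is assumed to be a simple ordered list, and leaving the first branch in the flow unrestricted is my assumption. The ancestry test uses git merge-base --is-ancestor, which exits 0 when the revision is reachable from the branch.

```python
import subprocess


def preceding_branch(flow, branch):
    """Return the branch before `branch` in the flow, or None if there is none."""
    if branch not in flow:
        return None  # not configured: treated as a feature branch
    index = flow.index(branch)
    return flow[index - 1] if index > 0 else None


def git_is_ancestor(revision, branch):
    """True when `revision` is already contained in `branch`."""
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", revision, f"refs/heads/{branch}"]
    )
    return result.returncode == 0  # any failure counts as "not present"


def push_allowed(flow, branch, revision, is_ancestor=git_is_ancestor):
    """Allow the push only if the revision already went through the previous branch."""
    previous = preceding_branch(flow, branch)
    if previous is None:
        return True
    return is_ancestor(revision, previous)
```

With a flow of dev, test, master, a push to master is only accepted once the same revision is already in test.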

Branches not listed in the configuration are assumed to be feature branches and not restricted (else you should configure them).

Plugin: branch protect

This provides write restrictions on configured branches, which is useful if certain users (possibly automation) need to be the only ones allowed to promote code into a branch. Repo level write access is still needed, with this plugin applying additional write protection to specific branches.

Configuration follows the same approach as the main authorisation section in the configuration - see examples.

Branches not listed in the configuration are assumed to be unprotected and are not restricted. The aim is to stop key branches being damaged, such as the final steps of a deployment pipeline where maybe only automation and key people should be able to make changes.
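
The decision this plugin makes could be pictured with the following Python sketch. The config shape (branch name mapping to allowed "users" and "groups") is a hypothetical simplification - see the examples in the project for the real thing.

```python
def branch_write_allowed(protected, branch, user, groups):
    """Decide whether `user` may push to `branch` under the protection rules."""
    rule = protected.get(branch)
    if rule is None:
        return True  # unlisted branches are unprotected
    if user in rule.get("users", []):
        return True
    # Otherwise any shared group is enough
    return any(group in rule.get("groups", []) for group in groups)


# Hypothetical rules: only the deploy bot and release managers touch master
rules = {"master": {"users": ["deploybot"], "groups": ["release"]}}
```

This only covers the branch level decision. Repo level write access would already have been checked before a hook gets to run.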

Authentication Plugins

Authentication is done via plugins which will allow many different approaches to be taken in the longer term (eg. create a plugin for your corporate identity service).

The authentication plugins are below authentication_plugins/ and the most basic starter plugin is lookup_groups which does exactly what it says - it assumes the authentication is done upstream and the username is in REMOTE_USER, then looks up the groups associated with that user in the config file.

Authentication plugins return a tuple of 3 items:

  • Authenticated bool - is this user authenticated? With lookup_groups this is always True if the user is found in the config file since the authentication actually happened at the Nginx level.
  • Groups list - what groups the user is a member of
  • Info dict - this is everything below the info key for the user in the config file and may be empty. The kinds of information that may be used are:
    • name - the full name of the user
    • email - the email address for the user (eg. could be used for notifications by hooks)
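
A plugin in the spirit of lookup_groups, returning the 3 item tuple described above, might look like this sketch. The function signature and the "groups" key in the user entry are assumptions for illustration - check the plugins below authentication_plugins/ for the real interface.

```python
def lookup_groups(users, environ):
    """Return (authenticated, groups, info) for the user in REMOTE_USER."""
    username = environ.get("REMOTE_USER")
    entry = users.get(username) if username else None
    if entry is None:
        # Unknown or missing user: not authenticated, nothing to pass on
        return False, [], {}
    groups = list(entry.get("groups", []))
    info = dict(entry.get("info") or {})  # info may be absent or empty
    return True, groups, info


# Hypothetical users section as it might look once loaded from YAML
users = {
    "alice": {"groups": ["staff"], "info": {"name": "Alice", "email": "alice@example.com"}},
    "bot": {"groups": ["automation"]},
}
```

Since the password check has already happened in Nginx, the only way this returns False is when the user is not in the config file.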

Authentication information is made available to hooks, which allows for applying additional access controls as well as integrating with other tooling (eg. a CI/CD pipeline).