Glen Pitt-Pladdy :: Blog

git http with Nginx via Flask wsgi application (git4nginx)
Out of the box git ships with a cgi program so that it can be served by a webserver that supports cgi, like Apache. These days, with the big shift to Nginx, which is an extremely performant pure webserver (without all the embellishments), it's a whole lot more difficult to do this. While there are tricks like cgi wrappers, Apache also provided mechanisms for doing some basic permissions control (eg. read-only users), which isn't so easy to achieve with Nginx given its focus on pure webserving. Since I've been migrating a lot of my older services to Nginx, this triggered me to start coding...

CGI Wrapper

The basic requirement is to execute git-http-backend as cgi (appropriate environment variables, with POST data on stdin and output from stdout). This can be done with some existing tools, which got me frightened the moment I realised they installed with world access to execute cgi! It never ceases to amaze me how many things default to an insecure state.

This also left me with the problem of protecting some repos as read-only for some users. The permissions available in Apache for doing this are minimal, but sufficient most of the time if enough effort is spent getting the setup right. It made sense to me that we really need user groups support, and repo (project) group support would also be useful. Then any permission can be achieved with a combination of the user and user groups being applied to the projects and project groups.

For finer grained control, the git hook environment also contains execution environment such as REMOTE_USER, and we can also add custom variables (eg. the location of the config file) to enable hooks to apply restrictions such as branch based controls.

Nginx to uwsgi, to Flask, to git

This uses Flask, a clean, minimal web (http) framework for Python which makes it fast and easy to build minimal APIs, websites and most other http related things. It's typically paired with Nginx to do the front end http stuff (TLS, static files, etc.).
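To make the cgi requirement described above a bit more concrete, here is a minimal sketch of how a Python application might run git-http-backend. This is not the code from githttp.py: the function names, the fixed PATH and the use of GIT_HTTP_EXPORT_ALL are my assumptions for illustration. The variable names themselves (GIT_PROJECT_ROOT, PATH_INFO, REMOTE_USER and so on) are the ones git-http-backend documents, and I'm assuming the backend separates its headers from the body with a blank CRLF line.

```python
import subprocess


def build_cgi_env(method, path_info, query_string, content_type,
                  remote_user, project_root):
    """Build the cgi environment that git-http-backend reads (illustrative helper)."""
    env = {
        "GIT_PROJECT_ROOT": project_root,  # directory holding the bare repos
        "GIT_HTTP_EXPORT_ALL": "1",        # assumption: export every repo found there
        "REQUEST_METHOD": method,
        "PATH_INFO": path_info,            # eg. /myrepo.git/info/refs
        "QUERY_STRING": query_string,
        "CONTENT_TYPE": content_type,
        "PATH": "/usr/bin:/bin",           # assumption: where the git binary lives
    }
    if remote_user:
        env["REMOTE_USER"] = remote_user   # also ends up visible to git hooks
    return env


def parse_cgi_output(raw):
    """Split raw cgi output into (status, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status = 200  # cgi default when no Status header is sent
    headers = []
    for line in head.split(b"\r\n"):
        if not line:
            continue
        name, _, value = line.decode("latin-1").partition(":")
        value = value.strip()
        if name.lower() == "status":
            status = int(value.split()[0])
        else:
            headers.append((name, value))
    return status, headers, body


def run_backend(env, request_body=b""):
    """Run git-http-backend with POST data on stdin and collect its stdout."""
    result = subprocess.run(
        ["git", "http-backend"],
        input=request_body,
        env=env,
        capture_output=True,
    )
    return parse_cgi_output(result.stdout)
```

A web framework view would then only need to translate the incoming request into build_cgi_env() arguments and turn the returned status, headers and body back into a response. Keeping the environment to an explicit whitelist like this avoids leaking the application's own environment into git.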
Nginx passing requests via uwsgi to a Flask application, which in turn runs git, is the basis for this project. The first working version took a morning to work out, including collecting all the information needed to have some confidence that git was working as expected, knowing what limitations applied and at what point controls would need to move to hooks.

After a few rounds of refinement I've added some sanity checking and better configurability: authorisation (read or write) can be applied at the top level of the hierarchy, or to an individual repo, on a user or group basis. This provides the basic functionality that many have been used to with Apache, including user groups, with the benefit that it can be applied in a finer grained manner per-repo, which is more difficult with Apache.

Get the Code & setup

Be conscious that this is very early stage code, mostly put together an hour here and there, and likely has bugs. No significant tests have been created yet and things will almost certainly change over time.

The project is on GitHub at https://github.com/glenpp/git4nginx

Dependencies

You will need to have at least the githttp.py file and a configuration file (example provided) in a location where uwsgi can execute it (www-data user & group on Debian based systems).

If you don't already have them then you will need some dependencies. On Debian based systems these packages would be needed:
The rest are likely automatic dependencies and core Python, but there is a chance that there's something I've taken for granted as being available.

uwsgi

On Debian based systems getting the application running should be fairly straightforward:
Git Repos

The repos you are serving need to be fully accessible to the www-data user and/or group (assuming Debian based again). It would probably complicate things a lot (and defeat access controls) if they were also being accessed by other mechanisms like ssh or direct filesystem access. The simplest approach would really be to just have the repos owned and exclusively used by the application user.

This supports both having all the repos at the top level of a directory, and having a layer of project group directories, which likely suits more complex environments such as multiple teams which might each have multiple projects.

Config file

The config file is in YAML, which is a good balance between human and machine readability. This needs to include the path to your repos, user authentication (includes group memberships) and authorisation (what the user is authorised to access).

It's important to understand that for Authorisation, permissions are inherited and accumulate, so a user might have read permission from the top level but pick up write access in config specific to a repo.

For Hook the configuration is also inherited from higher levels, but instead of accumulating, the configuration for a hook plugin overrides inherited configuration. This allows a plugin to be enabled globally or for a whole project group, then disabled or given specific configuration on a per-repo basis.
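The difference between the two inheritance rules is easier to see in code. The following is only a sketch under my own assumptions: the dictionary layout (users/groups keys, a list of levels from broadest to most specific) and the plugin key names are hypothetical and are not the real git4nginx config schema, so check the example config in the project for the actual structure.

```python
def effective_permissions(user, groups, levels):
    """Accumulate permissions from the broadest level down to the repo.

    levels is a list of dicts ordered top level -> project group -> repo,
    each shaped like {"users": {name: [perms]}, "groups": {name: [perms]}}
    (hypothetical layout for illustration).
    """
    granted = set()
    for level in levels:
        granted.update(level.get("users", {}).get(user, ()))
        for group in groups:
            granted.update(level.get("groups", {}).get(group, ()))
    return granted


def effective_hook_config(levels):
    """Merge hook plugin config so the most specific level wins.

    Each level maps plugin name -> plugin config; a plugin configured at a
    deeper level replaces whatever was inherited for that plugin.
    """
    merged = {}
    for level in levels:
        merged.update(level)
    return merged


# Authorisation accumulates: read from the top level, write added on the repo
top_auth = {"groups": {"staff": ["read"]}}
repo_auth = {"users": {"alice": ["write"]}}
print(sorted(effective_permissions("alice", ["staff"], [top_auth, repo_auth])))
# ['read', 'write']

# Hook config overrides: enabled globally, switched off for one repo
top_hooks = {"branch_flow": {"enabled": True}}
repo_hooks = {"branch_flow": {"enabled": False}}
print(effective_hook_config([top_hooks, repo_hooks]))
# {'branch_flow': {'enabled': False}}
```

The practical consequence is that authorisation can only be widened further down the hierarchy, while a hook plugin can be both tightened and relaxed per repo.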
Nginx

The example configuration for Nginx is not complete and largely shows an example location section to support this tool. The rest of the file would need to be crafted to suit your environment with appropriate TLS configuration. The important things to note in this file are:
On Debian based systems this config would normally go below /etc/nginx/sites-available/ and be enabled with a symlink from /etc/nginx/sites-enabled/

Reload (or restart if appropriate) Nginx and then start testing.

Creating Repos

The application requires that all repos end .git and that their names (and project group directories) are a limited length and only contain basic characters (alphanumeric, - and _) to be safe.

As mentioned above, the application (running as user and group www-data on Debian based systems) should have full access to the repo, and ideally exclusive access, otherwise you will have to work out additional complications. If in doubt check the application log below /var/log/uwsgi/app/ where any problems should be logged.

Repos should be bare (--bare option) repos. You might also like to configure additional protections as described at https://www.git-scm.com/book/en/v2/Customizing-Git-Git-Configuration such as these:

git config --local receive.denyNonFastforwards true

And any others that are appropriate for your requirements.

A repo creation script is provided to simplify things: create_repo.sh

Adding Hooks

Additional restrictions can be applied with git hooks. These are done via a plugin based system which should only need symlinking from the hooks/ directory in the repo to the master script below hooks/ and most likely the setup.sh script will do all you need if run within the repo you have created.

The location of the config file and a temporary directory for logs are passed from githttp.py to the hooks via environment variables, and the log files are read and re-logged after the request for diagnostics.

The githttp.py application adds additional information in environment variables for use by hooks:
The hook plugins are configured in the main config file and the configuration for a plugin deepest in the structure (most specific to the repo, or in some cases branch) takes priority over broader configuration (eg. at a project group level).

Plugin: branch flow

This provides a simple protection against putting code into the wrong branch, which is easily done by cloning a repo (master branch by default) and not switching to the development or feature branch as appropriate before committing.

It requires that any revisions being pushed to a configured branch are present in the preceding branch before the push is allowed. This discourages mistakes and tries to ensure the right flow of code through branches is followed (eg. dev->test->master). Branches not listed in the configuration are assumed to be feature branches and not restricted (else you should configure them).

Plugin: branch protect

This provides write restrictions to configured branches, which is useful if certain users (possibly automation) need to be the only ones allowed to promote code into a branch. Repo level write access is still needed, with this plugin applying write protection to specific branches on top. Configuration follows the same approach as the main authorisation section in the configuration - see examples.

Branches not listed in the configuration are assumed to be unprotected and not restricted. The aim of the plugin is to stop key branches being damaged, such as the final steps of a deployment pipeline where maybe only automation and key people should be able to make changes.

Authentication Plugins

Authentication is done via plugins which will allow many different approaches to be taken in the longer term (eg. create a plugin for your corporate identity service).
The authentication plugins are below authentication_plugins/ and the most basic starter plugin is lookup_groups, which does exactly what it says: it assumes the authentication is done upstream and the username is in REMOTE_USER, then looks up the groups associated with that user in the config file. Authentication plugins return a tuple of 3 items:
Authentication information is made available to hooks, which allows for applying additional authentication as well as integrating with other tooling (eg. a CI/CD pipeline).
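As an illustration of what a hook can do with that information, here is a rough sketch of a branch protection check in the spirit of the branch protect plugin. It is not the plugin shipped with git4nginx: the PROTECTED mapping, the user names and the function names are made up for the example. What it does rely on is standard git behaviour: a pre-receive hook gets one line per updated ref on stdin in the form "old-sha new-sha refname", and exiting non-zero rejects the push.

```python
import os
import sys

# Hypothetical rules: branch name -> users allowed to push to it.
# Branches not listed here are treated as unprotected.
PROTECTED = {
    "master": {"deploy-bot"},
    "test": {"deploy-bot", "alice"},
}


def rejected_branches(lines, user, protected):
    """Return the protected branches this user is not allowed to update."""
    rejected = []
    prefix = "refs/heads/"
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # not a well formed "old new ref" line
        _old, _new, ref = parts
        if not ref.startswith(prefix):
            continue  # tags and other refs are not handled in this sketch
        branch = ref[len(prefix):]
        allowed = protected.get(branch)
        if allowed is not None and user not in allowed:
            rejected.append(branch)
    return rejected


def main(stdin=sys.stdin):
    """Entry point for a pre-receive hook; the return value is the exit code."""
    user = os.environ.get("REMOTE_USER", "")
    bad = rejected_branches(stdin, user, PROTECTED)
    for branch in bad:
        print(f"push to '{branch}' denied for user '{user}'", file=sys.stderr)
    return 1 if bad else 0
```

A real hook script would finish by calling sys.exit(main()). Anything written to stderr by the hook is relayed by git to the person pushing, so the message tells them which branch was refused.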
Copyright Glen Pitt-Pladdy 2008-2023