Distributed Server Management with GitHub

Sample Python code to update files from a GitHub repository


I used this script to update a set of management scripts that were running on numerous servers. I could have just checked the scripts out with a "git pull", but I wanted additional controls over which script was installed on which server, so I wrote a Python script to do that.

I call this a distributed, or decentralized, approach: instead of a central authority such as a Salt, Chef, or Puppet master server pushing new management scripts to all servers, each server pulls updated scripts from GitHub.

The script uses the PyGithub library, so let's start by installing that, as well as the YAML library.

pip3 install PyGithub
pip3 install pyaml

Obtaining an API Token from GitHub

To use the following script, you'll first need to obtain an API token from GitHub. In your GitHub account, navigate to "Settings", then "Developer settings", click on "Personal access tokens", and create a new token. No additional permissions are needed for now.

You definitely need to protect that personal access token, as it allows access to your account, not just to a particular repository. For this, I decided to use the rsa-crypto library to encrypt it. The following commands install the library, create a new RSA key set, and store the encrypted personal access token in a configuration file as the value of the option named ghe_token. See Public/Private Key RSA Encryption and Python Encryption for additional details on using that library and the associated command-line tool.

pip3 install rsa-crypto
rsa_crypto create
Enter key password:
Re-enter key password:
Creating key...
Created password-protected private/public keys file /Users/me/rsa_key.bin
Use the "extract" keyword to create public and private key files.
touch ~/.rsa_values.conf
rsa_crypto set -o ghe_token
Using key: /Users/me/rsa_key.bin
Opening encrypted key /Users/me/rsa_key.bin
Enter key password:
Enter value:
set section:  option:ghe_token
Updated /Users/me/.rsa_values.conf

The Python script will then be configured to read and decrypt the ghe_token configuration “option” to retrieve and use the personal access token. If this seems too complex, feel free to find a different solution, but please never embed an access token directly in your script: there is a high likelihood that it will end up on GitHub for anyone to see and use!
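
If the RSA-based setup is more than you need, one common alternative is to read the token from an environment variable. The sketch below is only an illustration and is not what the script in this post does; the variable name GHE_TOKEN and the helper load_token are assumptions made up for this example.

```python
import os


def load_token(env_var="GHE_TOKEN", environ=None):
    """Return the access token stored in an environment variable.

    env_var is a hypothetical variable name chosen for this example.
    """
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        # Fail early instead of calling the API with an empty token
        raise RuntimeError("Environment variable %s is not set" % env_var)
    return token


if __name__ == "__main__":
    try:
        ghe_token = load_token()
        print("Token loaded (%d characters)" % len(ghe_token))
    except RuntimeError as error:
        print(error)
```

The token still has to be protected wherever the variable is defined, so this only moves the secret out of the script; it does not encrypt it the way the rsa-crypto approach does.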

The Python script

Before you can use this script, you'll need to change a few parameters: set ghe_repo to point to your own repository, and local_base_path to the local directory on your system where you want the server management scripts to reside.

The script assumes that you are using the public GitHub, but the code needed if you are using a private GitHub Enterprise repository internal to your organization has also been included and commented out.

You can find this script in a GitHub Repository.

The script will first retrieve a configuration file update_scripts.yaml. That file contains three sections:

  • update_always - contains a list of files that should always be pulled from GitHub
  • update_if_present - contains a list of files that should be updated from GitHub only if they are present
  • remove - contains a list of files that should be deleted from the local server

That file allows you to centrally control which scripts you want to have deployed to all of your servers. If you need a new script installed, push it to GitHub and add an entry to the configuration file, and your servers will download the new script the next time this script is executed. You can also "trick" a system into downloading scripts from the update_if_present section simply by creating the file with the touch command and then running this update script. The script will also update itself if it is included in the list.
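
The script assumes that all three sections exist in update_scripts.yaml. As a minimal sketch (the helper normalize_file_list and the sample file names are hypothetical and not part of the original script), the parsed configuration could be checked like this before it is used, so that a missing or empty section is treated as an empty list:

```python
SECTIONS = ("update_always", "update_if_present", "remove")


def normalize_file_list(file_list):
    """Return a dict with all three sections present as lists of strings."""
    if not isinstance(file_list, dict):
        raise ValueError("Configuration must map section names to lists")
    normalized = {}
    for section in SECTIONS:
        # A missing key gives None from get(); a section left empty in the
        # YAML file is also loaded as None. Both become an empty list here.
        entries = file_list.get(section) or []
        if not isinstance(entries, list):
            raise ValueError("Section %s must be a list of file names" % section)
        normalized[section] = [str(entry) for entry in entries]
    return normalized


# Example of what the YAML loader might return for a small configuration
# file; the file names are made up for this illustration.
example = {
    "update_always": ["update_scripts.py", "check_disk.py"],
    "update_if_present": ["backup_db.py"],
    "remove": None,
}

if __name__ == "__main__":
    print(normalize_file_list(example))
```

With a check like this in place, each of the three loops can iterate over its section without first testing whether the section exists.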

All three sections of the file will be processed. Files in the update_always and update_if_present sections will only be updated if the GitHub hash is different from the local hash, making the pull more efficient. The script will also make all Python scripts executable for the current user.
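
The hash comparison works because GitHub reports the Git blob SHA-1 of each file, which is the SHA-1 of a short header ("blob", the size in bytes, and a NUL byte) followed by the file contents. The following is a minimal sketch of that check; the function names git_blob_sha1 and needs_update are made up for this illustration and do not appear in the script.

```python
import hashlib
import os


def git_blob_sha1(data):
    """Return the Git blob SHA-1 (hex string) for the given bytes."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def needs_update(local_path, remote_sha, always_update):
    """Decide whether a file should be downloaded from the repository."""
    if not os.path.exists(local_path):
        # Missing files are only installed for the update_always section
        return always_update
    with open(local_path, "rb") as local_file:
        # Existing files are refreshed only when the hashes differ
        return git_blob_sha1(local_file.read()) != remote_sha


if __name__ == "__main__":
    # Should match what `git hash-object` reports for the same content
    print(git_blob_sha1(b"hello\n"))
```

Hashing the raw bytes, rather than decoded text, keeps the result correct for files that are not valid UTF-8.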

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import (absolute_import, division,
                        print_function, unicode_literals)
__author__ = "Videre Research, LLC"
__version__ = "1.0.2"
'''
Install and configure the server management scripts and other supporting scripts.
'''
# I M P O R T S ###############################################################
import os
import sys
import logging
import traceback
import base64
import github
import yaml
import hashlib
import rsa_crypto
import argparse
if (sys.version_info > (3, 0)):
    from urllib.parse import urljoin
else:
    from urlparse import urljoin
# G L O B A L S ###############################################################
# # Uncomment if you are using your own internal GitHub Repository
# ghe_organization = 'my_ghe_repo'
# ghe_hostname = 'mydomain.com'
ghe_repo = 'Christophe-Gauge/GitHub'
remote_base_path = '/'
local_base_path = '/opt/scripts'
remote_config_file = 'update_scripts.yaml'
ghe_branch = 'main'
logger = logging.getLogger()
logging.basicConfig(level=logging.INFO)
logger.info("Path:    %s" % (os.path.realpath(__file__)))
logger.info("Version:  %s" % (__version__))
args = argparse.Namespace(option='ghe_token')
ghe_token = rsa_crypto.decrypt_value(args)
# C O D E #####################################################################
def get_ghe(remote_file, repo, always_update):
    """Get a file from GHE and save it locally"""
    my_file_name = os.path.basename(remote_file)
    remote_file_name = urljoin(remote_base_path, remote_file)
    local_file_name = os.path.join(local_base_path, my_file_name)
    logger.info("Retrieving remote GHE file %s%s to %s" % (repo.full_name, remote_file_name, local_file_name))
    try:
        remoteSHA = repo.get_contents(remote_file_name, ref=ghe_branch).sha
    except github.UnknownObjectException as e:
        logger.error(f"Remote file not found {remote_file_name}")
        return
    except Exception as e:
        logger.error("Error {0}".format(str(e)))
        logger.error(traceback.format_exc())
        return
    # If the file exists then let's get the hash to see if an update is needed
    if os.path.exists(local_file_name):
        # Compute the Git blob SHA-1 of the local file from its raw bytes,
        # so that files that are not valid UTF-8 are hashed correctly too
        with open(local_file_name, 'rb') as file_for_hash:
            data = file_for_hash.read()
        header = ("blob %d\0" % len(data)).encode('utf-8')
        localSHA = hashlib.sha1(header + data).hexdigest()
        if remoteSHA == localSHA:
            logger.info('File is present, hash is the same, we already have the latest file, NOT updating.')
            return
        else:
            logger.info('File is present, hash is different %s - %s' % (remoteSHA, localSHA))
    else:
        # This flag indicates that a file should only be updated if it already exists
        if not always_update:
            logger.info('File is not present NOT updating')
            return
    try:
        file_contents = repo.get_contents(remote_file_name, ref=ghe_branch)
        local_file_content = str(base64.b64decode(file_contents.content).decode('utf-8', 'ignore'))
        # Write the new file to disk
        with open(local_file_name, "w") as text_file:
            text_file.write(local_file_content)
        if my_file_name.endswith('.py'):
            os.chmod(local_file_name, 0o700)
        else:
            os.chmod(local_file_name, 0o400)
        logger.info('File was updated')
    except Exception as e:
        logger.error("Error {0}".format(str(e)))
        logger.error(traceback.format_exc())
def main():
    """Main function."""
    gh = github.Github(login_or_token=ghe_token)
    repo = gh.get_repo(ghe_repo)
    # # Uncomment if you are using your own internal GitHub Repository
    # gh = github.Github(base_url=f"https://{ghe_hostname}/api/v3", login_or_token=ghe_token)
    # org = gh.get_organization(ghe_organization)
    # repo = org.get_repo(ghe_repo)
    if not os.path.exists(local_base_path):
        os.makedirs(local_base_path)
    remote_file_name = urljoin(remote_base_path, remote_config_file)
    logger.info("Retrieving remote GHE file %s%s" % (repo.full_name, remote_file_name))
    try:
        file_contents = repo.get_contents(remote_file_name, ref=ghe_branch)
        text_contents = str(base64.b64decode(file_contents.content).decode('utf-8', 'ignore'))
        file_list = yaml.load(text_contents, Loader=yaml.SafeLoader)
        logger.info(yaml.safe_dump(file_list, default_flow_style=False))
    except github.UnknownObjectException:
        # Raised by PyGithub when the file does not exist (HTTP 404)
        logger.error(f"Remote file not found {remote_file_name}")
        sys.exit(1)
    except Exception as e:
        logger.error("Error {0}".format(str(e)))
        logger.error(traceback.format_exc())
        sys.exit(1)
    for file in file_list['update_always']:
        logger.info(file)
        get_ghe(file, repo, True)
    for file in file_list['update_if_present']:
        logger.info(file)
        get_ghe(file, repo, False)
    for file in file_list['remove']:
        if os.path.exists(file):
            os.remove(file)
            logger.info('File %s was deleted' % file)
    sys.exit(0)
###############################################################################
if __name__ == "__main__":
    main()
# E N D   O F   F I L E #######################################################