Generating Website Content with Python

Streamlining with Automation

Once you have solved a problem, the next logical step is to automate the solution. Python is my go-to tool in this regard. On one project I was tasked with growing a site that hosted primarily video content. In this post I'll show you how I solved that problem using AWS and Python.

Python for HTML Generation

Suppose you want to generate an HTML snippet for each video you host, along with a cover photo. Each snippet then contains an S3 URL for the video and an S3 URL for the cover photo. You could write these by hand, but it gets tedious very quickly. What I wanted was to keep all the videos in a directory and have Python scan that directory and produce the HTML for me. This approach scales well: once you have the videos, you can release a large batch of content at once and keep a steady stream of new material coming.
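To make the idea of "two S3 URLs per video" concrete, here is a minimal sketch that builds both URLs for a single video. The function names, the example bucket, the vids/ and pics/ prefixes, and the .png cover photo extension are all assumptions for illustration; adjust them to your own layout.

```python
def s3_url(bucket, key):
    """Build a path-style S3 URL for an object key."""
    return "https://s3.amazonaws.com/%s/%s" % (bucket, key)


def media_urls(bucket, subdir, video_name):
    """Return (video_url, poster_url) for one video and its cover photo."""
    # Assumes the cover photo shares the video's name with a .png extension.
    stem = video_name.rsplit(".", 1)[0]
    video_url = s3_url(bucket, "vids/%s/%s" % (subdir, video_name))
    poster_url = s3_url(bucket, "pics/%s/%s.png" % (subdir, stem))
    return video_url, poster_url


# Hypothetical bucket and file names, for illustration only.
video_url, poster_url = media_urls("example-bucket", "seasons", "fall.mp4")
print(video_url)
print(poster_url)
```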

Using S3

S3 provides a great deal of power in this context. It gives every object a URL and lets you control exactly who can access it. In a previous post I showed you an S3 bucket policy that allows object access only from within your domain. However, that policy would also block your own AWS CLI access to the bucket. To fix that, you can add a condition with your own IP address so that you can upload from your home IP while still blocking access from outside your domain. Here is how you could implement that in a bucket policy statement:

{
    "Sid": "Explicit deny to ensure requests are allowed only from specific referer or IP.",
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:*",
    "Resource": "arn:aws:s3:::<your-bucket>/*",
    "Condition": {
        "StringNotLike": {
            "aws:Referer": "http://<your-bucket>/*"
        },
        "NotIpAddress": {
            "aws:SourceIp": "<your IP>"
        }
    }
}
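To see how this statement fits into a complete policy document, here is a minimal Python sketch that assembles it with the standard library. The function name build_policy and the sample bucket, referer, and IP values are placeholders of mine, not part of the original setup. Because both conditions sit in the same Deny statement, both must be true for the deny to apply, so a request is blocked only when it has neither a matching referer nor your IP.

```python
import json


def build_policy(bucket, referer, source_ip):
    """Return a bucket policy dict wrapping the Deny statement shown above."""
    statement = {
        "Sid": "Explicit deny to ensure requests are allowed only from specific referer or IP.",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": "arn:aws:s3:::%s/*" % bucket,
        "Condition": {
            # Both conditions must hold for the Deny to apply.
            "StringNotLike": {"aws:Referer": referer},
            "NotIpAddress": {"aws:SourceIp": source_ip},
        },
    }
    return {"Version": "2012-10-17", "Statement": [statement]}


# Placeholder values for illustration only.
policy = build_policy("example-bucket", "http://www.example.com/*", "203.0.113.10/32")
policy_json = json.dumps(policy, indent=4)
print(policy_json)
```

Any Allow statements from your existing policy would go in the same Statement list alongside this one.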

Now you can easily upload any media content to S3 with a Python script, or even a Bash script using the CLI. If you had a folder structure of vids/2017/, then a Bash script of the form

aws s3 cp "vids/$1" "s3://your-bucket/vids/$1" --recursive
aws s3 cp "pics/$1" "s3://your-bucket/pics/$1" --recursive

could be named upload.sh and parameterized by the sub-directory that you want to upload. If your videos are categorized by year (2017, 2018, etc.), you could run ./upload.sh 2017 to get that year's files into S3.
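If you would rather drive the upload from Python, the same idea can be sketched by building the CLI commands and handing them to subprocess. This is only an illustration: the function names and the default bucket name are hypothetical, and it assumes the AWS CLI is installed and configured. The --recursive flag is included because each sub-directory holds several files.

```python
import subprocess


def build_upload_commands(subdir, bucket="your-bucket"):
    """Return one `aws s3 cp` argument list per media folder."""
    commands = []
    for folder in ("vids", "pics"):
        local = "%s/%s" % (folder, subdir)
        remote = "s3://%s/%s/%s" % (bucket, folder, subdir)
        # --recursive copies every file under the local directory.
        commands.append(["aws", "s3", "cp", local, remote, "--recursive"])
    return commands


def upload(subdir, bucket="your-bucket"):
    """Run the copies; needs the AWS CLI installed and configured."""
    for command in build_upload_commands(subdir, bucket):
        # check=True raises an error if the CLI exits with a failure code.
        subprocess.run(command, check=True)
```

Calling upload("2017") would then mirror running ./upload.sh 2017.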

File Name Patterns

One way to achieve automation is to establish naming patterns early on. People can get hung up on naming, but experience shows the names themselves matter less than keeping the format consistent. The assumption I used in this case is that each video and its photo have the same name but different extensions. Revisiting the upload script from the previous section, you can see how this would populate data in S3.
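A naming convention is only useful if the files actually follow it, so a quick check before uploading can help. The sketch below assumes videos end in .mp4 and cover photos in .png; the helper names and the sample file lists are made up for illustration.

```python
import os


def poster_name(video_name):
    """Map a video file name to its cover photo name, e.g. fall.mp4 -> fall.png."""
    stem, _ext = os.path.splitext(video_name)
    return stem + ".png"


def missing_posters(video_names, picture_names):
    """Return the videos that have no cover photo with a matching name."""
    pictures = set(picture_names)
    return [v for v in video_names if poster_name(v) not in pictures]


# Sample file lists, for illustration only.
videos = ["fall.mp4", "spring.mp4", "winter.mp4"]
pictures = ["fall.png", "winter.png"]
print(missing_posters(videos, pictures))  # prints ['spring.mp4']
```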

Reading Directories with Python

Our desired outcome here is a file containing HTML. The two main parts of our Python script are reading the files in the directory and writing the formatted HTML to a file. This involves Python file operations to open and write a file. As we did in the upload script above, we can parameterize on the sub-directory of videos we want to work with. Here is the HTML generation script:

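import glob
import os
import sys

# Template for one video entry. The two %s slots take the cover photo path and
# the video path. The backslash continuations keep the leading spaces in the string.
template = "<div style=\"margin-bottom: 2em;\"><span style=\"display: none;\">.</span></div> \
            [video poster=\"https://s3.amazonaws.com/<your-bucket>/pics/%s\" \
                width=\"720\" height=\"404\" mp4=\"https://s3.amazonaws.com/<your-bucket>/vids/%s\"][/video]"

# The sub-directory to scan (for example 2017) comes from the command line.
subdir = sys.argv[1]
# Sort the matches so the output order is predictable.
filenames = sorted(glob.glob(os.path.join("vids", subdir, "*")))

with open("output.txt", "w") as f:
    for path in filenames:
        # Strip the directory part using the current platform's separator.
        name = os.path.basename(path)
        if name == "Thumbs.db":
            # Skip the thumbnail cache that Windows leaves in media folders.
            continue

        video = subdir + "/" + name
        # The cover photo shares the video's name but has a .png extension.
        poster = os.path.splitext(video)[0] + ".png"

        f.write(template % (poster, video))
        f.write("\n\n")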

Here is the final output generated for a sub-directory named seasons:

<div style="margin-bottom: 2em;"><span style="display: none;">.</span></div>             [video poster="https://s3.amazonaws.com/<your-bucket>/pics/seasons/fall.png"                width="720" height="404" mp4="https://s3.amazonaws.com/<your-bucket>/vids/seasons/fall.mp4"][/video]

<div style="margin-bottom: 2em;"><span style="display: none;">.</span></div>             [video poster="https://s3.amazonaws.com/<your-bucket>/pics/seasons/spring.png"              width="720" height="404" mp4="https://s3.amazonaws.com/<your-bucket>/vids/seasons/spring.mp4"][/video]

<div style="margin-bottom: 2em;"><span style="display: none;">.</span></div>             [video poster="https://s3.amazonaws.com/<your-bucket>/pics/seasons/summer.png"              width="720" height="404" mp4="https://s3.amazonaws.com/<your-bucket>/vids/seasons/summer.mp4"][/video]

<div style="margin-bottom: 2em;"><span style="display: none;">.</span></div>             [video poster="https://s3.amazonaws.com/<your-bucket>/pics/seasons/winter.png"              width="720" height="404" mp4="https://s3.amazonaws.com/<your-bucket>/vids/seasons/winter.mp4"][/video]

Now it's easy to copy this onto your web server in the appropriate place. The videos will all play from S3 because the bucket policy allows requests that originate from pages on your domain.

Taking it Further

It would be easy in this case to not only produce the HTML content but also copy it to your site from within your script. I haven't taken that step yet, but it's something to think about for the future. The real takeaway from this post is that automating with Python is easy if you have firm assumptions and know what you want to accomplish.