Homework 3 - Multimedia Uploads


Jump to current week

Introduction


You will continue to add features to your app and pull from the repo to get the new front end features.

Sample Deployment: https://312demo.nico.engineer/

Before you get started on this hw, please go to this line in your server.py: "socketserver.TCPServer.allow_reuse_address = True" and "server = socketserver.TCPServer((host, port), MyTCPHandler)". Change both of these lines from TCPServer to ThreadingTCPServer. This will make sure that your server can handle multiple requests at once, without blocking other requests.


Learning Objective - Image and Video Uploads


By the end of this objective, your users will be able to upload an image to use as their profile picture and share videos with other users. To enable these features, we'll first need a multipart parser.


Helper Method: parse_multipart

Create a file named util/multipart.py containing the following function (This file does not define a class). In this file, write a function named "parse_multipart" that takes a Request object as a parameter. This function will assume that the input Request is a multipart request and will extract all the relevant values of the request. This function returns an object containing the following fields (You have some freedom in how you design the classes for the objects as long as they have the required fields. Please note that a dict with these fields is not an object. You must create a class, then create objects using that class).

  • boundary
    • The value of the boundary from the Content-Type header as a string
  • parts
    • A list of all the parts of the request in the order in which they appear in the request. Each part must be an object with the following fields
      • headers
        • A dictionary of all the headers for the part in the same format as a Request object
      • name
        • The name from the Content-Disposition header that matches the name of that part in the HTML form as a string
      • content
        • The content of the part in bytes
        • Note: The content may be binary (eg. An image or video) and must never be decoded as a string.
        • Note: The content may contain the sequence b"\r\n\r\n" or b"\r\n" and there will be a test case for this. Be sure that you are not corrupting your data when this sequence is in the content of a part


Avatar Uploads

Add functionality to allow users to change their avatar (profile picture). Add the following path to render a new page using layout.html.

  • "/change-avatar" - Render change-avatar.html

Visit the Change Avatar page to find a form that sends multipart requests to your server containing an image chosen by the user. Your server must listen for POST requests at the "/api/users/avatar" endpoint and process the uploaded images. This endpoint must have the following features:

  • After a user uploads an avatar, this avatar must appear next to all their new chat messages. It is acceptable if their previous messages are not updated. This is done by adding an "imageURL" field to all their chat messages. The value of this field is the path where you saved the image (eg. "imageURL": "/public/imgs/profile-pics/eda92e0a-eb7a-430b-a938-916d2102b480.png")
  • When an image is successfully uploaded, respond with a 200 and a message of your choice
  • If the same user uploads a second avatar, it replaces the first one
  • Update the users "/api/users/@me" endpoint to include their "imageURL" in a field
  • Users can upload either .jpg, .png, or .gif files. You may identify the image type using the file extension of the uploaded file, and you must serve the images with the corresponding MIME type
  • Avatar uploads must persist through a server restart (eg. Images must be saved to disk and the users account must be updated in the database)

Please keep the following notes in mind when implementing this feature. All of these will also apply to video uploads.

File Naming: When an image is uploaded, your server will save the image as a file. It is recommended that you devise a naming convention for the image files instead of using the names submitted by your users. Naming the images "image1.jpg", "image2.jpg", "image3.jpg", etc is fine, or you can generate guids as the names.

Persistence: Your uploads must persist through a server restart. You should store your images in files (It's generally bad practice to store large files in a database), and store the filenames in your database. Since your images are stored in files, they will already persist through a restart.

Buffering: Your app must allow for large images to be uploaded. You'll accomplish this by buffering your HTTP requests. Read the content length of the request and buffer until you read the whole body. Your buffering should be able to handle arbitrarily large files. You must use proper buffering for this. Do not increase the TCP buffer size by passing a large int to the recv method (That wouldn't work anyway for very large files). You can assume that all the headers, upto the first "\r\n\r\n", of the request are always read in the first read from the socket with a 2048 buffer (eg. You do not have buffer while reading the headers, and you can safely read the content length before any buffering). It's recommended that you buffer in server.py before sending the request to your router, and not just in the route for changing the avatar, as this will cover all future large requests for any path.

Do not submit with any print statements printing out the full buffered request body. If you submit to autolab with these print statements, it will crash your submission. This is because you're printing the contents of a 2GB video file.



VideoTube

Implement a [simplified] YouTube clone where users can share videos with each other. Add the following paths that will be rendered using layout.html.

  • "/videotube" - Render videotube.html (Displays all videos)
  • "/videotube/upload" - Render upload.html (Upload video form)
  • "/videotube/videos/{videoID}" - Render view-video.html (Displays a single video)

To enable the video sharing feature on the front end, you will build out three additional endpoints. These endpoints will allow users to upload videos, return all videos, and return the details of a single video.


Upload videos (`POST /api/videos`)

This endpoint with receive videos uploaded from the front end that will contain the videos title, description, and the video itself. You should parse the request and store all this information. As with images, the videos should be stored in files with only the filename stored in your database. Respond to uploads with a 200 OK containing a JSON object with the unique id of the new video.

Response (JSON): {"id": str}

This endpoint only needs to handle .mp4 videos.

After a video has been uploaded, be sure your "/public" path will serve these videos. As long as you save videos in the public directory, this will be mostly automatic but don't forget to add the MIME type for .mp4 content.

Testing will only be done with authenticated users. It's up to you if you want to support guest video uploads.


Retrieve all videos (`GET /api/videos`)

Return a JSON object containing information about all videos that have been uploaded in the following format:

Response (JSON): {"videos": [{"author_id": str, "title": str, "description": str, "video_path": str,"created_at": str "id": str}, ...]}

Where "created_at" should be a timestamp that will be displayed on the front end. You can/should use the datetime package to get a string format for the current time.


Retrieve a single video (`GET /api/videos/{video_id}`)

Return all information about the video with an id matching {video_id}. The format follows the same structure as the previous path, but with only 1 video object.

Response (JSON): {"video": {"author_id": str, "title": str, "description": str, "video_path": str,"created_at": str "id": str}}



Application Objective 1 - Video Subtitles using LLMs


Videos are subtitled on the internet using VTT files. Historically, this was done by hand by humans. However, you're being given the exciting opportunity to leverage AI and LLMs to automatically generate these subtitles. This application objective will involve you taking an uploaded video, extracting the audio from that video, and uploading that audio to an API in order to generate a VTT file for the video.

A free-to-use API is provided for you at https://transcription-api.nico.engineer/. You will need to read this README file to understand how to use the API. Note that the API readme explains how to run the API which you do not have to do for this objective. The API is already running at the url provided and the readme is for you to understand how to use this API. You do not have to create an AWS account.

The API is limited to only being able to transcribe videos less than or equal to one minute in duration. You should use this demo video for testing transcription, as it's what we will use: https://312-transcriptions.s3.us-east-1.amazonaws.com/transcriptions/30+sec.mp4. This video will produce this outputted VTT file https://312-transcriptions.s3.us-east-1.amazonaws.com/transcriptions/ac23b8cb-3660-47fe-b7e2-01436fcd7d84.vtt. You can use these as test values to avoid hitting ratelimits on the API (it is heavily ratelimited to avoid high cost).

To accomplish this objective, you will need to extract the audio from the video using ffmpeg. You'll also find that ffprobe (which comes with ffmpeg) is useful for determining video duration.

You will need to use an auth token when interacting with the API. Check your feedback on the "HW2-AO1 API Key" assignment on Autolab to obtain your API key. Do not abuse the API or send unnecessary requests. Bulky or otherwise abusive requests will result in your access being revoked.

When a video is uploaded, you should extract the audio from the video as an mp3 file, and upload it to the api. You should read the docs for the API to understand how to do this. You should save the returned id by the API with your video metadata in the database, under the "transcription_id" field.

You will make a GET `/api/transcriptions/{videoId}` endpoint. This route should call the transcription API with the transcription_id associated with the requested video. The transcription API will return either a VTT file if the transcription is complete, or a 420 status code if the transcription is not complete/does not exist. Your API should return either a 200 with the VTT file, or a 400 if the transcription is not complete/does not exist. If successful, you should be able to upload a video to the site and see the subtitles associated with your uploaded video.

NOTE: The transcription API is asynchronous, therefore it does not wait until the transcription is complete to return a response. Because of the heavy computing nature of transcription, the API will take about 30 seconds before a transcription is actually ready. Therefore, when testing, you should wait 30 seconds before refreshing the page on the view video page to see the subtitles.

WARNING: In order to avoid your submission crashing on autolab, you must ensure that you handle any failures in transcription gracefully and ensure that video uploading still succeeds, even if your transcription doesn't for whatever reason.



Application Objective 2 - Thumbnail Chooser


Use ffmpeg to generate 5 possible thumbnail images for each video that is uploaded. These 5 thumbnails should be: The first frame of the video, the frame at 25% of the videos runtime, the frame at 50%, the frame at 75%, and the last frame. Save all 5 frames as images and add them all to your videos data in a "thumbnails" field as an array.

Example field to be added to videos - "thumbnails": [ "public/imgs/thumbnails/67c260f1bb497fca803f6c49_0.jpg", "public/imgs/thumbnails/67c260f1bb497fca803f6c49_1.jpg", "public/imgs/thumbnails/67c260f1bb497fca803f6c49_2.jpg", "public/imgs/thumbnails/67c260f1bb497fca803f6c49_3.jpg", "public/imgs/thumbnails/67c260f1bb497fca803f6c49_4.jpg" ]

Add a second field named "thumbnailURL" to your video that is the selected thumbnail which will default to the first frame of the video.

Example field to be added to videos - "thumbnailURL": "public/imgs/thumbnails/67c260f1bb497fca803f6c49_0.jpg"

Add the following path to render the thumbnail chooser page.

  • "/videotube/set-thumbnail" - Render set-thumbnail.html

Using this page, users can change their thumbnail for a specific video. When they do, a request will be sent to the following endpoint.


Change thumbnail (`PUT /api/thumbnails/{videoId}`)

Request body: {"thumbnailURL":"public/imgs/thumbnails/67c260f1bb497fca803f6c49_4.jpg"}

When a request is received at this endpoint, update the "thumbnailURL" field of the video with id {videoId} to the provided value. Respond with a message in JSON object indicating that the update was successful.

Response format (JSON): {"message": str}



Application Objective 3 - Adaptive Bit-Rate


When a video is uploaded, re-encode it to HLS with at least 2 different resolutions/bit-rates and host all required files for the video. Add the path to the index file for the video in a new field named "hls_path" to the video.

Example field to be added to videos - "hls_path": "public/videos/2439857234.m3u8"

The single index file should refer to multiple other index files with different bit-rates. If done correctly, you will be able to select the resolution manually on the front end when viewing a video.



Submission


Submit all files for your server to Autolab in a .zip file

If you used any external libraries, be sure to add them to your requirements.txt. Autolab will install all dependencies in this file, and no other dependencies, before starting your server.

It is strongly recommended that you download and test your submission after submitting. To do this, download your zip file into a new directory, unzip your zip file, enter the directory where the files were unzipped, run docker compose up, then navigate to localhost:8080 in your browser. This simulates exactly what the TAs will do during grading.



Grading


Each objective will be scored on a 0-3 scale as follows:

3 (Complete) Clearly correct
2 (Complete) Mostly correct, but with some minor issues
1 (Incomplete) Not all features outlined in this document are functional, but an honest attempt was made to complete the objective
0.3 (Incomplete) The objective would earn a 3, but a security risk was found while testing
0.2 (Incomplete) The objective would earn a 2, but a security risk was found while testing
0.1 (Incomplete) The objective would earn a 1, but a security risk was found while testing
0 (Incomplete) No attempt to complete the objective or violation of the assignment (Ex. Using an HTTP library)

Note that for your final grade there is no difference between a 2 and 3, or a 0 and a 1. The numeric score is meant to give you more feedback on your work.

3 Objective Complete
2 Objective Complete
1 Objective Not Complete
0 Objective Not Complete

Autograded objectives are graded on a pass/fail basis with grades of 3.0 or 0.0.



Security Essay


For each objective for which you earned a 0.3 or 0.2, you will still have an opportunity to earn credit for the objective by submitting an essay about the security issue you exposed. These essays must:

  • Be at least 1000 words in length
  • Explain the security issue from your submission with specific details about your code
  • Describe how you fixed the issue in your submission with specific details about the code you changed
  • Explain why this security issue is a concern and the damage that could be done if you exposed this issue in production code with live users

Any submission that does not meet all these criteria will be rejected and your objective will remain incomplete.

Due Date: Security essays are due 1-week after grades are released.

Any essay may be subject to an interview with the course staff to verify that you understand the importance of the security issue that you exposed. If an interview is required, you will be contacted by the course staff for scheduling. Decisions of whether or not an interview is required will be made at the discretion of the course staff.

When you don't have to write an essay:

  • If you never submit a security violation, you never have to write an essay for this course. Be safe. Be secure
  • If you earn a 0.1, there's no need to write an essay since you would not complete the objective anyway