diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..e43b0f9 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +.DS_Store diff --git a/LICENSE b/LICENSE index b5347b0..78138e4 100644 --- a/LICENSE +++ b/LICENSE @@ -1,4 +1,4 @@ -Copyright (c) 2016, Sam Redmond +Copyright (c) 2016-2020, Sam Redmond, Parth Sarin, and Michael Cooper All rights reserved. Redistribution and use in source and binary forms, with or without @@ -23,4 +23,4 @@ SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. The views and conclusions contained in the software and documentation are those of the authors and should not be interpreted as representing official policies, -either expressed or implied, of the FreeBSD Project. \ No newline at end of file +either expressed or implied, of the FreeBSD Project. diff --git a/README.md b/README.md deleted file mode 100644 index b05a678..0000000 --- a/README.md +++ /dev/null @@ -1,15 +0,0 @@ -# Python Assignments - -Assignments for CS41. - -## List of Assignments - -**Assignment 0:** Welcome to Python! - -**Assignment 1:** Cryptography - -**Assignment 2:** Quest for the Holy Grail! - -**Assignment 3:** Style-fy - -**Assignment 4:** Final Project diff --git a/assign0/README.md b/assign0/README.md index 7515dbf..4759bb1 100644 --- a/assign0/README.md +++ b/assign0/README.md @@ -1,115 +1,337 @@ # Assignment 0: Welcome to Python! -**Due: 11:59:59 PM, Tue April 11th** +> Due: 11:59 PM, Friday, April 9, 2021 ## Overview -This introductory assignment aims to introduce you to a few Python fundamentals. More importantly, the goal of this warmup assignment is to ensure that your local Python installation is set up correctly and that you are familiar with the CS41 submission process. -At a high level, for this assignment you will implement a Python script that answers questions about yourself. +This introductory assignment aims to give you practice with a few of the Python fundamentals we've covered in class. 
More importantly, the goal of this warmup assignment is to ensure that your local Python installation is set up correctly and that you are familiar with the CS41 submission process. -*Expected Time: 1 hour (if it takes much longer than that, email us)* +Note: As this is Assignment 0, please get started early! We want to make sure that there's time to resolve any installation or submission problems. -Note: Get started early! We want to resolve any installation or submission problems earlier rather than later. +## Outline -## Review +- In Part 1, you will create a file called `coconuts.py` in which you write code to calculate the coconut-carrying capacity of swallows. +- In Part 2, you will create a file called `cheese.py` in which you write code to play the part of a cheese shop owner as they discuss inventory with a client. +- In Part 3, you will design and implement your own chatbot program from a broad specification - we're excited to see what types of chatbots you choose to build! -If you would like, get a quick refresher by flipping through our slides from the first week on the [course website.](http://stanfordpython.com/#lectures) +(If coconuts and cheese shops sound rather random, there is some method to this madness - the first two parts of this assignment are themed, as you will soon see, around Monty Python sketches which inspired the name of the Python Programming Language). + +## Installing Python + +Follow our instructions for installing Python 3.9.2 and setting up a virtual environment. 
_It is important that you have Python 3.9.2 installed on your system, as some of the features we will be discussing later in the quarter are unique to this most recent version of Python._ + +- For macOS users, follow [these instructions](https://github.com/stanfordpython/python-handouts/blob/master/installing-python-macos.md) +- For Linux users, follow [these instructions](https://github.com/stanfordpython/python-handouts/blob/master/installing-python-linux.md) +- For Windows users, follow [these instructions](https://github.com/stanfordpython/python-handouts/blob/master/installing-python-windows.md) + +**IMPORTANT: Every time you open a new terminal session, you will need to activate your virtual environment again.** ## Starter Files -There are no starter files for this assignment. You will create and submit a file called `intro.py`. +There are no starter files for this assignment. + +You will create and submit three Python files, named `coconuts.py`, `cheese.py`, and `chatbot.py`, and some text files which store data for your chatbot. + +A reasonable starter file might look like: + +```python +#!/usr/bin/env python3 +""" +File: .py +------------------- + +What does this file do? +""" + +# Write additional code and functions here. + +def main(): +    # Write the main execution of your program here. +    pass  # Placeholder so the file runs before you add code; remove it once main() has a body. + +if __name__ == '__main__': +    main() +``` + +## Part 1: Coconut-Carrying Capacities + +Before anything else, watch [this three-minute video](https://www.youtube.com/watch?v=zqtS9xyl0f4) on YouTube. The transcript can be found [here](http://www.montypython.net/scripts/HG-cocoscene.php). + +We're going to help the guards out and compute whether a collection of swallows can carry a collection of coconuts between them. We know that "a five ounce bird could not carry a one pound coconut," so we will assume that a 5.5 ounce bird can carry a one pound coconut. More specifically, we will assume that every 5.5 ounces of bird can carry one pound of coconut. 
+ +Create a file named `coconuts.py` in which we will write our program. In this portion of the assignment, we will ask the user for two numbers: the total ounces of birds that are carrying coconuts, and the total weight in pounds of the coconuts. + +Prompt the user for the number of ounces of bird by asking: `"How many ounces of birds are carrying the coconuts? "`. Prompt the user for the number of pounds of coconut by asking: `"How many pounds of coconuts are there? "`. Remember to convert these values to numeric types! + +If the total number of ounces of birds divided by the number of pounds of coconuts is at least 5.5 (including the value 5.5), then print `"Yes! Carrying the coconuts is possible."`. Otherwise, print `"No. Carrying the coconuts is impossible."` + +You can assume that the user input is formatted correctly. + +### Sample Runs + +Your program should be able to emulate the following sample runs. *Make sure to activate your virtual environment before executing these lines of code!* User input is ***bolded and italicized***: + +
(cs41-env)$ python coconuts.py
+How many ounces of birds are carrying the coconuts? 5
+How many pounds of coconuts are there? 1
+No. Carrying the coconuts is impossible.
+
+(cs41-env)$ python coconuts.py
+How many ounces of birds are carrying the coconuts? 6.2
+How many pounds of coconuts are there? 1.1
+Yes! Carrying the coconuts is possible.
+
+(cs41-env)$ python coconuts.py
+How many ounces of birds are carrying the coconuts? 17
+How many pounds of coconuts are there? 3
+Yes! Carrying the coconuts is possible.
+
+(cs41-env)$ python coconuts.py
+How many ounces of birds are carrying the coconuts? 12.5
+How many pounds of coconuts are there? 2.5
+No. Carrying the coconuts is impossible.
+
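The threshold rule behind these runs can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: the names `OUNCES_OF_BIRD_PER_POUND`, `can_carry`, and `verdict` are hypothetical and not part of the handout, and the code that prompts for input is left out.

```python
# Hypothetical helpers illustrating the 5.5-ounces-of-bird-per-pound rule.
# The names below are assumptions made for this sketch, not part of the handout.
OUNCES_OF_BIRD_PER_POUND = 5.5


def can_carry(bird_ounces, coconut_pounds):
    """Return True if the birds can carry the coconuts between them."""
    # "At least 5.5" includes the boundary value itself, hence >= rather than >.
    return bird_ounces / coconut_pounds >= OUNCES_OF_BIRD_PER_POUND


def verdict(bird_ounces, coconut_pounds):
    """Return the message the handout asks the program to print."""
    if can_carry(bird_ounces, coconut_pounds):
        return "Yes! Carrying the coconuts is possible."
    return "No. Carrying the coconuts is impossible."
```

The sketch assumes the coconut weight is positive; a weight of zero would raise `ZeroDivisionError`, which the handout's "input is formatted correctly" assumption does not explicitly rule out.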
+ +Submit the `coconuts.py` file, which should be the code for this segment of the assignment. + +## Part 2: The Cheese Shop + +Before anything else, watch [this five-minute video](https://www.youtube.com/watch?v=Hz1JWzyvv8A) (cw: gun violence) on YouTube. The transcript can be found [here](http://www.montypython.net/scripts/cheese.php). + +In this part of the assignment, you will play the part of Michael Palin, the owner of the National Cheese Emporium. + +Unlike the owner in the sketch, you *do* have some cheeses, and you will repeatedly ask the user what cheese they would like, and then you will respond whether or not you have that cheese. + +Make a list of `cheeses` containing `"Muenster"`, `"Cheddar"`, and `"Red Leicester"`; these are the cheeses you have in your shop. + +Begin by printing the string `"Good morning. Welcome to the National Cheese Emporium!"` to the console. + +Next, in a loop, we will repeatedly ask the user which cheese they would like to buy. If the user enters the exact name of a cheese that you have, affirm that you have the cheese they asked for in the format `"We have {}, yessir."`. If the user does not enter the exact name of a cheese that you have, say that you don't have that cheese in the form `"I'm afraid we don't have any {}."` and then reprompt the user to ask for another cheese. + +The user is also allowed to enter special questions. If the user enters either of the strings `"You... do have some cheese, don't you?"` or `"Have you in fact got any cheese here at all?"`, you must reply by listing the number of cheeses that you have in the format `"We have {} cheese(s)!"`, along with the name of each type of cheese you have, one on each line. + +### Sample Usage + +Your program should be able to emulate the following sample runs. *Make sure to activate your virtual environment before executing these lines of code!* + +
(cs41-env)$ python cheese.py
+Good morning. Welcome to the National Cheese Emporium!
+What would you like? Red Windsor
+I'm afraid we don't have any Red Windsor.
+What would you like? Lancashire
+I'm afraid we don't have any Lancashire.
+What would you like? cheddar
+I'm afraid we don't have any cheddar.
+What would you like? Cheddar
+We have Cheddar, yessir.
+
+(cs41-env)$ python cheese.py
+Good morning. Welcome to the National Cheese Emporium!
+What would you like? Red Windsor
+I'm afraid we don't have any Red Windsor.
+What would you like? cHeDdAr
+I'm afraid we don't have any cHeDdAr.
+What would you like? Have you in fact got any cheese here at all?
+We have 3 cheese(s)!
+Muenster
+Cheddar
+Red Leicester
+What would you like? exit
+I'm afraid we don't have any exit.
+What would you like? LET ME OUT
+I'm afraid we don't have any LET ME OUT.
+What would you like? Cheddar
+We have Cheddar, yessir.
+
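The exact-match behaviour in these runs (`cheddar` and `cHeDdAr` are turned away, `Cheddar` is accepted) falls out of a plain membership test on the list. Here is a minimal sketch, assuming hypothetical helper names `respond` and `inventory_lines`; the input loop and the detection of the special questions are left out.

```python
# Hypothetical helpers; the names are assumptions made for this sketch.
CHEESES = ["Muenster", "Cheddar", "Red Leicester"]


def respond(request, cheeses=CHEESES):
    """Return the shopkeeper's reply to a single cheese request."""
    # `in` on a list compares elements with ==, so the match is exact
    # and case-sensitive: "cheddar" is not "Cheddar".
    if request in cheeses:
        return "We have {}, yessir.".format(request)
    return "I'm afraid we don't have any {}.".format(request)


def inventory_lines(cheeses=CHEESES):
    """Return the lines printed in reply to either special question."""
    return ["We have {} cheese(s)!".format(len(cheeses))] + list(cheeses)
```

Returning strings instead of printing them directly is just a design choice for this sketch; it keeps the replies easy to check without capturing console output.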
+ +For anything that is not detailed in the above specification, your program can behave in any way you'd like. For example, you can customize the prompts and messages. + +### Hints + +You can check that an element is contained in a collection by using the keyword `in`. + +### Submission + +Please submit a file called `cheese.py` which contains the code for this segment of the assignment. -# Program Specification +## Part 3: Chatbot -Your program will prompt a line of input from the user (any prompt is acceptable), check if it matches one of the following questions, and if so, print an answer to the question. +In the third part of this assignment, you will implement a chatbot program which has a conversation with the user. There are just two requirements for this program: -- What is your name? -- What is your quest? -- What do you do in your free time? -- What else would you like to tell us that you haven't already expressed through the application? -- What are you most excited to learn about this quarter? +1. *Ask the user for input and print out responses*. The program should read input from the console and, based on the input, print out some response. You don't have to respond immediately after every input; the program could keep asking questions, just as long as it responds eventually. +2. *Access stored data*. The program should access data stored in a text file over the course of its run, so that data from previous runs is accessible to users in future runs. -Additionally, if the user enters the special input: `"What can you answer?"`, you must print out the list of questions that your program can answer, one question per line. +Beyond this, make it your own! Here are two examples that your course staff made previously: -If the user's input does not exactly match one of these above inputs, your program can do whatever it wants. 
+### Example 1: The Doors of Destiny -## Sample Demonstration +This program is a simple user authentication chatbot that acts like a gatekeeper before the "Doors of Destiny". The gatekeeper stops all travellers attempting to pass through the gate, and asks them for their name and passphrase. If the traveller provides a name and passphrase which are contained in the gatekeeper's Book of Records, the traveller may proceed. If not, the traveller is denied entry. -Your program should be able to emulate the following sample runs. +Upon being asked for a name and passphrase, each traveller has two options: they may (a) provide a valid name and passphrase, or (b) bribe the guard - which allows them to add a new name and passphrase to the Book of Records - for a small fee. + +Here's what the chatbot looks like (user input is ***bolded and italicized***): + +
(cs41-env)$ python chatbot.py
+Halt! Welcome to the Doors of Destiny. 
+Should you wish to proceed, you must identify yourself within the Book of Records. 
+
+Is your name present in our book? yes
+What is your name, traveller? Michael
+What is your passphrase? parthsarin12345
+Welcome through, peaceful soul!
+
+Halt! Welcome to the Doors of Destiny. 
+Should you wish to proceed, you must identify yourself within the Book of Records. 
+
+Is your name present in our book? no
+Psst! I'm... not supposed to tell you this, but for a small... compensation... I might be able to add you to the Book of Records without the Warden noticing. 
+
+Would you like to be added to the Book of Records? yes
+Perfect! I've added you - but I don't come cheap! I charge 100 coins for my services. 
+Can you make the deposit? yes
+Deposit successful! (You have 5532 coins remaining in your account).
+
+What is your name, then, traveller? Michael
+What is your passphrase? parthsarin12345
+Welcome through the Doors of Destiny! And it's been a pleasure doing business with you.
+
+Halt! Welcome to the Doors of Destiny. 
+Should you wish to proceed, you must identify yourself within the Book of Records. 
+
+Is your name present in our book? yes
+What is your name, traveller? Michael
+What is your passphrase? cs41isacoolclass
+The passphrase you presented does not match our records! Guards - arrest this intruder!
+
+ +### Example 2: Simple Schedule + +This program is more like a virtual assistant (think Siri, Alexa, etc.), which allows users to schedule events and see their calendar. Here's a sample run for this program (note that this uses `MMDDYYYY` as an encoding scheme for dates, and that hours are represented as floating point numbers, so 14.5 means 2:30PM); once again, user input is ***bolded and italicized***: + +
(cs41-env)$ python chatbot.py
+Hello there, it's Hal, your friendly scheduling assistant! 
+
+Would you like to add a new event, or check an existing time slot? add
+What is the name of the event? CS41 Lecture
+On which day would you like to schedule the event? 03302021
+What is the start time? 14.5
+What is the end time? 16
+
+Successfully added the event to your day!
+
+Would you like to add a new event, or check an existing time slot? check
+On which day would you like to check for scheduled events? 03302021
+What time would you like to check for availability? 15
+At that time, you'll be busy with CS41 Lecture.
+
+Would you like to add a new event, or check an existing time slot? open the pod bay doors
+I'm sorry Dave, I'm afraid I can't let you do that.
+
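Because this run encodes hours as plain floating point numbers, an availability check reduces to two comparisons. The sketch below is one hypothetical way to write it: the dictionary keys, the function names `is_busy` and `format_hour`, and the choice to treat the end time as exclusive are all assumptions, not requirements of the assignment.

```python
# Hypothetical helpers for the float-hour encoding (14.5 means 2:30PM).
# Assumes hours lie in the range [0, 24).


def is_busy(event, date, hour):
    """Return True if `hour` on `date` falls inside `event`."""
    # Assumed convention: the start time is inclusive, the end time exclusive,
    # so back-to-back events do not overlap.
    return event["date"] == date and event["start_time"] <= hour < event["end_time"]


def format_hour(hour):
    """Render a float hour such as 14.5 as a 12-hour clock string."""
    total_minutes = round(hour * 60)
    whole, minutes = divmod(total_minutes, 60)
    suffix = "AM" if whole < 12 else "PM"
    display = whole % 12 or 12  # 0 and 12 both display as 12
    return "{}:{:02d}{}".format(display, minutes, suffix)
```

With the event from the run stored as `{"event_name": "CS41 Lecture", "date": "03302021", "start_time": 14.5, "end_time": 16.0}`, checking hour `15` on that date reports the user as busy, matching the output shown.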
+ +These chatbots are adorable and geeky! Feel free to bring your personality and passions to this part of the assignment. 😊 + +### File I/O and Data Formatting +Note that with both of these examples, we've left the scheme you use to store data to the file open-ended. For example, the authentication system might have two files - `coins.txt` to keep track of the coin balance, and `data.txt` to track names and passphrases - that look like this: + +``` +5632 +``` ``` -$ python3 intro.py -Ask me a question: What is your name? -It is Arthur, King of the Britons. - -$ python3 intro.py -Ask me a question: What is your quest? -To seek the holy grail. - -$ python3 intro.py -Ask me a question: What is the airspeed velocity of an unladen swallow? -(error: unknown question) What do you mean? African or European swallow? - -$ python3 intro.py -Ask me a question: What can you answer? -What is your name? -What is your quest? -What do you do in your free time? -What else would you like to tell us that you haven't already expressed through the application? -What are you most excited to learn about this quarter? +Michael:parthsarin12345 +Parth:pythoniscool!! +Antonio:cs41isafunclass ``` -In the above cases, the input prompt is `"Ask me a question: "`, and the user types her response to finish the line. +The scheduling system in the second example might have a single data file that looks like this: + +``` +event_name: CS41 Lecture +date: 03302021 +start_time: 14.5 +end_time: 16 + +event_name: Workout +date: 04012021 +start_time: 10 +end_time: 11 +``` + +The format for your data is up to you, and will likely be informed by the theme around which you'd like to design your chatbot. If you're having trouble working out a data format, though, feel free to reach out, and we're more than happy to help brainstorm with you. + +### Submission + +Please submit a file called `chatbot.py` which contains the code for this segment of the assignment. Please also submit the following text files. 
+ +- `sampleruns.txt` should contain input and output from a couple of sample runs, copied and pasted from your Terminal (similar to what we've included above). This way we know how to interact with your program +- Any data files for your program (there should be at least one). Include some nontrivial data in the file so we don't have to add data before we use the program (e.g. in the authentication example, we should be able to attempt to login immediately, without first needing to create a file of credentials and populate it with a series of usernames and passwords). ## Extensions > Extensions on Assignment 0? If you insist. -This section of an assignment handout gives a few of our suggestions if you're looking for ways to go above and beyond the requirements of the assignment. At no point are you required to implement an extension - although they sometimes provide interesting challenges or alternative approaches to problem-solving. +This section of an assignment handout usually gives a few of our suggestions if you're looking for ways to go above and beyond the requirements of the assignment. At no point are you ever required to implement an extension - although they sometimes provide interesting challenges or alternative approaches to problem-solving. -In general, when you submit an extension to an assignment, add `-ext` to the end of the filename. In this case, if you want to submit an extension, you should submit both an unmodified `intro.py` file that implements the unextended assignment and an `intro-ext.py` file that implements your extension. +When you submit an extension to an assignment, you should submit *both* an unmodified `coconuts.py` file that implements the unextended assignment and a `coconuts-ext.py` file that implements your extension. Extension programs should also contain a module-level comment explaining what the extended assignment does differently. 
### Binge-Watch Monty Python Videos on Youtube Including but not limited to: -- [The Cheese Shop](https://www.youtube.com/watch?v=cWDdd5KKhts) -- [The Bridge of Death](https://www.youtube.com/watch?v=dPOyOM7wxlE) - [The Dead Parrot](https://www.youtube.com/watch?v=4vuW6tQ0218) +- [The Argument Clinic](https://www.youtube.com/watch?v=XNkjDuSVXiE) + +or [the Monty Python channel](https://www.youtube.com/user/MontyPython/videos?sort=p&flow=grid&view=0) + +Include in your submission a file called `review.txt` with your thoughts on whichever videos you've watched! -You can tell your family that it's "for class." +You can tell your friends and family, "it's for class." -### Multiple Questions -Allow the user to ask more than just one question by putting your question-answering logic in a loop. Continue until the user enters an empty line. +### `coconuts`: Multiple questions +Allow the user to assess the coconut-carrying capacity of birds by putting your question-answering logic in a loop. Continue until the user enters an empty line for either the number of ounces of swallow or the number of pounds of coconut. -### Read Questions and Answers from a File -Store questions and answers in some file format of your devising, and read the questions and answers from the file into some data structure before prompting the user for answers. +### `coconuts`: Advanced coconut-carrying logic +Ask the user to differentiate between a European and African swallow. Penalize (or reward) groups of swallows or individual swallows. -### Dialogue -Implement some notion of dialogue, where the user can repeatedly chat with the program, and the program's behavior changes based on the user's inputs. +### `cheese`: Assignment Expressions +If you're feeling especially fancy, [you can use assignment expressions](https://www.python.org/dev/peps/pep-0572/) (a nifty new feature in Python 3.8) to assign a value to a variable in a loop condition. -### Answer More Questions? 
-Expand the range of questions that can be answered. You can add the classic icebreaker questions "What is your favorite flavor of ice cream" and "If you could have any superpower, what would it be?" Feel free to add any additional questions and answers you'd like. +### `cheese`: Cycle through responses +Instead of always using the same prompts and responses, cycle through a list of predetermined responses. -### Match Flexibility -Use another matching strategy to check if the user has asked a particular question. You can check substrings, regular expressions, edit distance (perhaps edit distance on substrings) or any number of clever metrics. +### `cheese`: Fuzzy matching +Allow the user to enter any input, and search for each of your cheeses, case-insensitively, in their input. That is, let the user ask: `"Any Norwegian Jarlsberger, per chance?"`. +### `chatbot`: Go Nuts! +As you are likely well aware, building effective chatbots which can fluently converse with - and understand - humans is an open area of research in computer science. Though we've designed this assignment so that you implement a simple chatbot within a well-scoped set of requirements, there truly is no ceiling to where you can take this. As two starter ideas, though, we'd recommend adding new features to your chatbot within the theme you've defined, and seeing whether you can make your chatbot robust to imperfect input (in the calendar chatbot, for example, whether the user types `add` or `add event`, the outcome should be the same - how robust can you make your chatbot to such variability in user input?). ## Grading -Your grade will be assessed on completion. If you successfully submit a Python program that answers each of the list of questions, you will receive full credit on this assignment. +This assignment will just be submitted for feedback, not a grade! -We will not be evaluating any style on this assignment. 
+## Style Checks -## Submitting +While not necessary for this assignment, we want to point out a useful tool for following the mechanics of Python style. The `pycodestyle` command-line tool takes as arguments a list of Python files and outputs a list of mechanical style violations. This catches small things like inconsistent spacing, line length, whitespace, but not larger things like program design, idiomatic Python, or structural complexity. *Nobody writes error-free code! `pycodestyle` is there to help your code comply with the (somewhat arbitrary) rules that the Python community has decided on.* -See the [submission instructions](https://github.com/stanfordpython/python-handouts/blob/master/submitting-assignments.md) on the course website. +You can run `pycodestyle` as follows: -For assignment 0, the key ideas are: ``` -$ ssh @myth.stanford.edu "mkdir -p ~/cs41/assign0" -$ scp @myth.stanford.edu:~/cs41/assign0/ -$ ssh @myth.stanford.edu -<... connect to myth ...> -myth$ cd ~/cs41/assign0/ -myth$ /usr/class/cs41/tools/submit +(cs41-env)$ pycodestyle coconuts.py cheese.py chatbot.py ``` -> With <3 by @sredmond +Any style violations will be printed to the console. You can automatically apply all of these changes using the `autopep8` tool. Be warned that the `autopep8` tool overwrites your files in-place, and may substantially change them, so you might want to apply changes by hand. However, `autopep8` can be a good time saver. + +``` +(cs41-env)$ autopep8 coconuts.py cheese.py chatbot.py +``` + +If you just want to see what changes would be made, but not apply them, you can use `autopep8 --diff coconuts.py cheese.py chatbot.py` instead. + +During setup, we installed both `pycodestyle` and `autopep8` into our virtual environment, so they will be available inside of the virtual environment. That is, make sure that you have activated your `cs41-env` virtual environment in order to run these tools. 
+ +## Submitting + +Submit the python files you've created to [Paperless](https://paperless.stanford.edu). + +If you've added any extra files or extensions above to the assignment, you should include those in your Paperless submission as well. + +## Credit +Credit for the assignment idea and much of this writeup go to Sam Redmond (@sredmond). + +> With love, 🦄s, and 🐘s by @psarin and @coopermj diff --git a/assign1/README.md b/assign1/README.md deleted file mode 100644 index 77c3f12..0000000 --- a/assign1/README.md +++ /dev/null @@ -1,553 +0,0 @@ -# Assignment 1: Cryptography -**Due: 11:59:59 PM, Thursday April 27th** - -## Overview -In this assignment, you will build a cryptography suite that implements three different cryptosystems - Caesar cipher, Vigenere cipher, and the Merkle-Hellman Knapsack Cryptosystem. This handout will walk you through the details of building this text-based cryptography tool. We want to instill good Pythonic practices from the beginning - so we encourage you to think critically about writing clean Python code. - -*Expected Time: 6 hours (if it takes much longer than that, email us)* - -Note: Get started early! Merkle-Hellman is the hardest cipher to implement. - -## Review - -Get a quick refresher by flipping through our slides from the first few weeks on [the course website](http://stanfordpython.com) - -## Starter Files - -We’ve provided starter files available on the website as a skeleton for this assignment. Here’s an overview of what’s in it: - -1. `crypto.py` is the primary file you will modify. It will implement all the functions to decrypt/encrypt strings. -2. `utils.py` provides useful utilities for console interaction and for Merkle-Hellman -3. `crypto-console.py` runs an interactive console that lets you test your cryptography functions. -2. `design.txt` is where you'll record the design decisions you're making. -3. `feedback.txt` is where you'll answer some questions about how the course is going overall -4. 
`tests/` folder contains test input and output -5. `res/` folder of sample text files to play around with file I/O. For Merkle-Hellman, the seed we used was 41 - -``` -res/caesar-plain.txt and res/caesar-cipher.txt -res/vigenere-plain.txt and res/vigenere-cipher.txt -res/mh-plain.txt and res/mh-cipher.txt -``` - -# Cryptography Suite - -## Building the Ciphers -In this section, you will build cipher functions to encrypt and decrypt messages. We'll give a brief overview of each cipher and give some pointers on how it fits it into the starter files. - -### Caesar Cipher - -A Caesar cipher involves shifting each character in a plaintext by three letters forward: - -``` -A -> D, B -> E, C -> F, etc... -``` - -At the end of the alphabet, the cipher mapping wraps around the end, so: - -``` -X -> A, Y -> B, Z -> C. -``` - -For example, encrypting `'PYTHON'` using a Caesar cipher gives - -``` -PYTHON -|||||| -SBWKRQ -``` - -For this part, implement the functions: - -``` -encrypt_caesar(plaintext) -decrypt_caesar(ciphertext) -``` - -Notes: - -- You can assume that the plaintext/ciphertext will always have length greater than zero. -- You can assume that all alphabetic characters will be in uppercase. -- If you encounter a non-alphabetic character, do not modify it. - - -You should test your ciphers using the interactive interpreter: - -``` -(cs41) $ python3 -iq crypto.py ->>> encrypt_caesar("PYTHON") -"SBWKRQ" ->>> decrypt_caesar("SBWKRQ") -"PYTHON" -``` - -A non-exhaustive list of test cases, represented by a tab-delimited (plaintext, ciphertext) pair are given in the text file `tests/caesar-tests.txt`. - -### Vigenere Cipher - -A Vigenere cipher is very similar to a Caesar cipher; however, in a Vigenere cipher, every character in the plaintext could be shifted by a different amount. The amount of shift is determined by a keyword, where 'A' corresponds to shift of 0 (no shift), 'B' corresponds to a shift of 1, ..., and 'Z' corresponds to a shift of 25. 
- -The keyword is repeated or truncated as necessary to fit the length of the plaintext. As an example, encrypting `"ATTACKATDAWN"` with the key `"LEMON"` gives: - -``` -Plaintext: ATTACKATDAWN -Key: LEMONLEMONLE -Ciphertext: LXFOPVEFRNHR -``` - -Looking more closely, each letter in the ciphertext is the sum of the letters in the plaintext and the key. Thus, the first character of ciphertext is L due to the following calculations: - -``` -A + L = 0 + 11 = 11 -> L -``` - -It may be useful to use the functions ord and chr which convert strings of length one to and from, respectively, their ASCII numerical equivalents. - -Implement the methods: - -``` -encrypt_vigenere(plaintext, keyword) -decrypt_vigenere(ciphertext, keyword) -``` - -Notes: - -- You can assume that there will be no non-alphabetic characters in the plaintext, ciphertext, or keyword. -- You can assume that all of the characters will be in uppercase. -- You can assume that plaintext/ciphertext/keyword will always have at least one character in it. - -Then, try testing the methods using the interactive interpreter again. - -``` -(cs41) $ python3 -iq crypto.py ->>> encrypt_vigenere("ATTACKATDAWN", "LEMON") -"LXFOPVEFRNHR" ->>> decrypt_vigenere("LXFOPVEFRNHR", "LEMON") -"ATTACKATDAWN" -``` - -Another list of non-exhaustive tests are available at `tests/vigenere-tests.txt`. - -### Merkle-Hellman Knapsack Cryptosystem - -Public-key cryptography is essential to modern society. You may have heard of RSA - one of the most popular public-key cryptosystems. Less well known, however, is the Merkle-Hellman Knapsack Cryptosystem, one of the earliest public-key cryptosystems (invented in 1978!), which relies on the NP-complete subset sum problem. Although it has been broken, it illustrates several important concepts in public-key cryptography and gives you lots of practice with Pythonic constructs. - -Building the Merkle-Hellman Cryptosystem involves three parts: - -1. Key Generation -2. Encryption -3. 
Decryption - -At a high level, in the Merkle-Hellman Knapsack Cryptosystem, all participants go through key generation once to construct both a public key and a private key, linked together in some mathematical way. Public keys are made publicly available, whereas private keys are kept under lock and key (no pun intended). Usually, public keys will lead to some sort of encryption function, and private keys will lead to some sort of decryption function, and in many ways they act as inverses. - -For Person A to send message m to Person B, Person A encrypts message m using Person B's public key. Person B then decrypts the encrypted message using Person B's private key. Often, long messages are sent in shorter chunks, with each chunk encrypted separately before it is sent to the recipient. - -Make sure you understand the general idea behind public-key cryptosystems before moving forward. You don't need to know all of the details, but you should be able to explain why Person A doesn't encrypt an outgoing message with her own public key. - -First, we'll discuss the mathematics behind the Merkle-Hellman Knapsack Cryptosystem, and then we'll dive into what functions you have to write for this assignment. - - -#### Key Generation -In the key generation step, we will construct a private key and a public key. - -Choose a fixed integer `n` for the chunk size (in bits) of messages to send. For this assignment, we'll use `n = 8` bits, so we can encrypt and decrypt messages one byte at a time. - -First, we must build a superincreasing sequence of `n` nonzero natural numbers: - -``` -w = (w_1, w_2, ..., w_n) -``` - -A superincreasing sequence is one in which every element is greater than the sum of all previous elements. For example, `(1, 3, 6, 13, 27, 52)` is a superincreasing sequence, but `(1, 3, 4, 9, 15, 25)` is not. One way to construct a superincreasing sequence is to start with some small number - say, a random number between 2 and 10. 
You can generate the next number by selecting randomly from a range like `[total + 1, 2 * total]` or something similar, where `total` is the sum of all of the elements so far. In this way, we can gradually build up our sequence to whatever size we need - in this case, until `n = 8`. - -Next, we pick a random integer `q`, such that `q` is greater than the sum of the the elements in `w`. To leverage code we've already written, let's choose `q` between `[total + 1, 2 * total]`, where `total` is the sum over all elements of `w`. - -Then, we choose a random integer `r` such that `gcd(r, q) = 1` (i.e. r and q are coprime). To accomplish this, it's sufficient to just generate random numbers in the range `[2, q-1]` until you find some `r` such that `r` and `q` are coprime. (Hint: the `utils` module exports a convenient `coprime` function. Use it!) - -Finally, we calculate the tuple - -``` -beta = (b_1, b_2, ..., b_n) -``` - -where - -``` -b_i = r × w_i mod q -``` - -The public key is `beta`, while the private key is `(w, q, r)`. - -Both `w` and `beta` should be converted to tuples. - -*Implementation Note:* - -To find random integers, you can use the `randint(a, b)` function (returns a random integer in the range [a, b], including both end points) from the `random` module. For example, - -``` -import random -x = random.randint(1, 6) # returns either 1, 2, 3, 4, 5, 6 with uniform probability - -``` - -#### Encryption - -To encrypt a character, first convert it into it's equivalent bits. For example, `'A'`, which is 65 in ASCII, becomes `[0, 1, 0, 0, 0, 0, 0, 1]`. - -To encrypt this character, we just have to encrypt an 8-bit message. Call it: - -``` -alpha = (a_1, a_2, ..., a_n) -``` - -where `a_i` is the `i`-th bit of the message and `a_i` is either 0 or 1. With that, we can calculate - -``` -c = sum of a_i × b_i for i = 1 to n -``` - -The ciphertext is then `c`. 
- -*Implementation Note:* - -Whenever you're encrypting or decrypting data using Merkle-Hellman, you'll want to deal with bits. Fortunately, the `utils` module exports the `bits_to_byte(bits)` and `byte_to_bits(byte)` functions which respectively convert an array of length 8 containing 1s and 0s to an integer between 0-255 (conceptually, a byte). - -#### Decryption - -In order to decrypt a ciphertext `c`, a receiver has to find the message bits `alpha_i` such that they satisfy - -``` -c = sum of a_i × b_i for i = 1 to n -``` - -This is generally a hard problem, if the `b_i` are random values, because the receiver would have to solve an instance of the subset sum problem, which is known to be NP-hard (i.e. very hard). However, we constructed `beta` in a very special way, such that if we *also* know the private key `(w, q, r)`, then we can decrypt the message more easily. - -The key to decryption will be a special integer `s` that has some nice properties - namely, that `s` is the modular inverse of `r` modulo `q`. That means `s` satisfies the equation `r × s mod q = 1`, and since `r` was chosen such that `gcd(r, q) = 1`, it will always be possible to find such an `s` using something called the Extended Euclidean algorithm, which we've implemented for you (in `utils.modinv(r, q)`). It's a really cool algorithm! If you're interested, go read about it on Wikipedia. - -Once `s` is known, the receiver of the ciphertext computes - -``` -c' = cs mod(q) -``` - -Because we know `r × s mod q = 1` and `b_i = r × w_i (mod q)`, it's also true that - -``` -b_i s = w_i × r × s = w_i (mod q). -``` - -Therefore, - -``` -c' = c × s = sum of a_i × b_i × s for each i = a_i × w_i (mod q). -``` - -Wow! We've converted our problem of solving a subset sum problem over the `b_i`s, which might be a very nasty sequence, to an equivalent problem over the `w_i`s, which form a very nice sequence. 
- -Thus the receiver has to solve the subset sum problem - -``` -c' = sum of a_i × w_i for i = 1 to n -``` - -This problem is computationally easy because `w` was chosen to be a superincreasing sequence! Take the largest element in `w`, say `w_k`. If `w_k > c'` , then `a_k = 0`, and if `w_k <= c'`, then `a_k = 1`. Then, subtract `w_k × a_k` from `c'` , and repeat these steps until you have figured out all of `alpha`. - -Still confused? This stuff can get complicated. Wikipedia provides [a great example](https://en.wikipedia.org/wiki/Merkle%E2%80%93Hellman_knapsack_cryptosystem#Example) to work through if you prefer concrete numbers over abstract symbols. - -#### Implementation - -What do you actually have to implement? We've taken care of a lot of the math behind-the-scenes (if you want to check out how, look into `utils.py`), so you're job focuses more on the data structures. In particular, you need to write the following four functions. - -``` -def generate_private_key(n=8): - """Generate a private key for use in the Merkle-Hellman Knapsack Cryptosystem - - Following the instructions in the handout, construct the private key components - of the MH Cryptosystem. This consistutes 3 tasks: - - 1. Build a superincreasing sequence `w` of length n - (Note: you can check if a sequence is superincreasing with `utils.is_superincreasing(seq)`) - 2. Choose some integer `q` greater than the sum of all elements in `w` - 3. Discover an integer `r` between 2 and q that is coprime to `q` (you can use utils.coprime) - - You'll need to use the random module for this function, which has been imported already - - Somehow, you'll have to return all of these values out of this function! Can we do that in Python?! - - @param n bitsize of message to send (default 8) - @type n int - - @return 3-tuple `(w, q, r)`, with `w` a n-tuple, and q and r ints. - """ - pass - -def create_public_key(private_key): - """Creates a public key corresponding to the given private key. 
- - To accomplish this, you only need to build and return `beta` as described in the handout. - - beta = (b_1, b_2, ..., b_n) where b_i = r × w_i mod q - - Hint: this can be written in one line using a list comprehension - - @param private_key The private key - @type private_key 3-tuple `(w, q, r)`, with `w` a n-tuple, and q and r ints. - - @return n-tuple public key - """ - pass - - -def encrypt_mh(message, public_key): - """Encrypt an outgoing message using a public key. - - 1. Separate the message into chunks the size of the public key (in our case, fixed at 8) - 2. For each byte, determine the 8 bits (the `a_i`s) using `utils.byte_to_bits` - 3. Encrypt the 8 message bits by computing - c = sum of a_i * b_i for i = 1 to n - 4. Return a list of the encrypted ciphertexts for each chunk in the message - - Hint: think about using `zip` at some point - - @param message The message to be encrypted - @type message bytes - @param public_key The public key of the desired recipient - @type public_key n-tuple of ints - - @return list of ints representing encrypted bytes - """ - return message - -def decrypt_mh(message, private_key): - """Decrypt an incoming message using a private key - - 1. Extract w, q, and r from the private key - 2. Compute s, the modular inverse of r mod q, using the - Extended Euclidean algorithm (implemented at `utils.modinv(r, q)`) - 3. For each byte-sized chunk, compute - c' = cs (mod q) - 4. Solve the superincreasing subset sum using c' and w to recover the original byte - 5. Reconsitite the encrypted bytes to get the original message back - - @param message Encrypted message chunks - @type message list of ints - @param private_key The private key of the recipient - @type private_key 3-tuple of w, q, and r - - @return bytearray or str of decrypted characters - """ - return message -``` - -Note: We're aware this is a hard problem. It's supposed to challenge you! 
If you're stuck, even on something that *seems* simple, please please reach out to the course staff over Piazza or during office hours. We'll be more than happy to help! - -*Full credit to the Wikipedia summary for this explanation! The description is shamelessly copied and modified :)* -## Console Menu -In order to better test this program, we've provided a console menu to interact with the cryptography suite. This shouldn't replace your normal debugging process - rather, view it as an augmentation of the tools you have to track down any elusive bugs. - -In general, we don't do very much error handling (since the console menu is intended as a tool for you to debug), so it may crash gracelessly on bad input. You're welcome to modify or change the console menu as you see fit. We'll only be testing your application-level functions. - -A sample run of the program might look like: - -``` -(cs41) $ python3 crypto-console.py -Welcome to the Cryptography Suite! ----------------------------------- -* Tool * -(C)aesar, (V)igenere or (M)erkle-Hellman? c -* Action * -(E)ncrypt or (D)ecrypt? e -* Input * -(F)ile or (S)tring? s -Enter a string: hello! -* Transform * -Encrypting HELLO! using Caesar cipher... -* Output * -(F)ile or (S)tring? s -"KHOOR" -Again? (Y/N) y ----------------------------------- -* Tool * -(C)aesar, (V)igenere or (M)erkle-Hellman? v -* Action * -(E)ncrypt or (D)ecrypt? d -* Input * -(F)ile or (S)tring? f -Filename? res/vigenere-cipher.txt -* Transform * -Keyword? LEMON -Decrypting LXFOPVEFRNHR using Vigenere cipher and keyword LEMON... -* Output * -(F)ile or (S)tring? s -"ATTACKATDAWN" -Again? (Y/N) n -Goodbye! -(cs41) $ -``` - - -## Extensions - -What?! You still haven't had enough? Okay, your call. - -The following section contains possible extensions and is **entirely optional**. If you choose to take a crack at any, regardless of how far you get, let us know how it went in your feedback! 
- -### Scytale Cipher -*Difficulty: 🌶 🌶* - -The scytale was used as far back as the [Spartans](http://www.australianscience.com.au/technology/a-scytale-cryptography-of-the-ancient-sparta/), and is one example of ancient cryptography (thought to be used in military campaigns). The [Wikipedia page](https://en.wikipedia.org/wiki/Scytale) has a good overview. - -Below is a sample encryption of the plaintext "IAMHURTVERYBADLYHELP" using a scytale cipher with circumference 5 to generate the ciphertext "IRYYATBHMVAEHEDLURLP" - -We write the message diagonally down (around) the scytale, and then - -``` -I . . . . R . . . . Y . . . . Y . . . . -. A . . . . T . . . . B . . . . H . . . -. . M . . . . V . . . . A . . . . E . . -. . . H . . . . E . . . . D . . . . L . -. . . . U . . . . R . . . . L . . . . P -``` - -The ciphertext is obtained by reading from left to right, top to bottom. In this example, the ciphertext is - -``` -IRYYATBHMVAEHEDLURLP -``` - -Implement the functions: - -``` -encrypt_scytale(plaintext, circumference) -decrypt_scytale(ciphertext, circumference) -``` - -What will you do when the length of the message is not a perfect multiple of the circumference? - -Consider using list comprehensions and slice syntax to simplify your implementation. - -Want more of a challenge (🌶 🌶 🌶)? Try to decrypt an arbitrary ciphertext without knowing the circumference of the scytale. - - -### Railfence Cipher -*Difficulty: 🌶 🌶 🌶* - -Below is a sample encryption of the plaintext "WEAREDISCOVEREDFLEEATONCE" using a railfence with 3 rails to generate the ciphertext "WECRLTEERDSOEEFEAOCAIVDEN" - -We write the message diagonally - -``` -W . . . E . . . C . . . R . . . L . . . T . . . E -. E . R . D . S . O . E . E . F . E . A . O . C . -. . A . . . I . . . V . . . D . . . E . . . N . . -``` - -The ciphertext is obtained by reading the rails from left to right, top to bottom. 
- -``` -WECRLTEERDSOEEFEAOCAIVDEN -``` - -Implement the functions: - -``` -encrypt_railfence(plaintext, num_rails) -decrypt_railfence(ciphertext, num_rails) -``` - -How will you handle the cases where the last ascending (or descending) segment doesn't reach a corner? - -Consider using list comprehensions and slice syntax (especially assigning into slices) to simplify your implementation. - -Want more of a challenge (🌶 🌶 🌶 🌶)? Try to decrypt an arbitrary ciphertext without knowing the number of rails used. - -### Intelligent Codebreaker -*Difficulty: 🌶 🌶 🌶* - -Suppose that you have access to some ciphertext that you know has been encrypted using a Vigenere cipher. Furthermore, suppose that you know that the corresponding plaintext has been written using only words in `/usr/share/dict/words`, whitespace, and punctuation, although you don’t know the exact message. Finally, suppose that you know that someone has encrypted a message using a Vigenere cipher with a key drawn from a preset list of words, (again, let's suppose from `/usr/share/dict/words`). Can you still decrypt the ciphertext? - -For many of the incorrect keys, the resulting plaintext will be gibberish, but there will also be incorrect keys for which the resulting plaintext sounds English-y, but isn't quite right. Thus, the bulk of this problem lies in evaluating how close to a valid English sentence a given sequence of letters is. - -Your top-level function should be - -``` -decrypt_vigenere(ciphertext, possible_keys) -``` - -Besides that, you are free to implement this program however you see fit. However, think about the Python style guidelines before continuing. - -You can test your method on the text inside of `secret_message.txt`. - -For more of a challenge (🌶 🌶 🌶 🌶), broaden your definition of English-y to allow finding plaintexts in which not all words come from `/usr/share/dict/words` (a message we're interested in decrypting, for example, might contain a person's name). 
What other signals in the text can you look for? - -You can also try to break Vigenere encryptions using a combination of the above tactic and a frequency attack for a given key-length. - -### Error Handling -*Difficulty: 🌶* - -Currently, our library functions (`encrypt_*` and `decrypt_*`) maks a lot of strong assumptions about the input - see the Notes section in each part. Can you add code to handle cases where these assumptions are violated? - -### Encrypt Non-Text Files -*Difficulty: 🌶 🌶* - -So far, our ciphers have been applied solely to text-based messages full of ascii letters from the alphabet. However, it is possible to extend these encryption methods to work on arbitrary binary data, such as images, audio files, and more. For this extension, choose at least one of the encryption techniques and make it work on binary files. You will need to use the binary flag when reading from files. You may want to read about text sequence types compared with binary sequence types as well. - -### Cracking Merkle-Hellman -*Difficulty: 🌶 🌶 🌶 🌶 🌶* - -Unfortunately, there is a polynomial-time algorithm for breaking the Merkle-Hellman Knapsack Cryptosystem. Implement it. - -The algorithm's details are described in [Shamir's 1984 paper](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.123.5840&rep=rep1&type=pdf). - -## Design - -Please submit a short design document (`design.txt`) describing your approach to each of the parts of the assignment. "Short" means just a few sentences (1-3) per part discussing the rationale behind your decision to implement this program in the way you did. Consider answering the following questions: - -1. What data structures did you use to handle transformation of data? -2. What Pythonic ideas or strategies did you incorporate in your approach, if any? - -## Feedback - -We hope you have been enjoying the course so far, and would love to hear from you about how this first real assignment went! 
- -To help us out, please answer the following questions in the `feedback.txt` file provided with the starter code: - -1. How long did this assignment take you to complete? -2. What has been the best part of the class so far? -3. What can we do to make this class more enjoyable for you? - -Thank you for being our guinea pigs this quarter - we're learning from you as well as we teach this course! - -## Grading - -Your grade will be assessed on both functionality and style. - -Functionality will be determined entirely by your program's correctness on a suite of unit tests (some of which are provided with the starter code). - -Stylistically, you will be evaluated on your general program design (a la 106 series: decomposition, logic, etc) as well as your Python-specific style. In particular, we will be looking for "Pythonic" approaches to solving problems, as opposed to "non-Pythonic" solutions, that emphasize the Zen of Python. We will also be looking at your Python syntax and mechanics. We encourage you to format your code in accordance with [Python style guidelines](https://www.python.org/dev/peps/pep-0008/). You can find a tool to help format your code [online](http://pep8online.com/). If you have any questions, please don't hesitate to let us know. Think about the [Zen of Python](https://www.python.org/dev/peps/pep-0020/) when making design decisions. - -## Deliverables - -1. Your modified `crypto.py` -2. The `design.txt` file documenting your design decisions -3. The `feedback.txt` letting us know how we're doing! - -## Submitting - -See the [submission instructions](https://github.com/stanfordpython/python-handouts/blob/master/submitting-assignments.md) on the course website. - -For assignment 1, the key ideas are: - -``` -$ ssh @myth.stanford.edu "mkdir -p ~/cs41/assign1" -$ scp -r @myth.stanford.edu:~/cs41/assign1/ -$ ssh @myth.stanford.edu -<... 
connect to myth ...> -myth$ cd ~/cs41/assign1/ -myth$ /usr/class/cs41/tools/submit -``` - -## Credit -*Sherman Leung (@skleung), Python Tutorial, Learn Python the Hard Way, Google Python, MIT OCW 6.189, Project Euler, and Wikipedia's list of ciphers.* - -> With <3 by @sredmond diff --git a/assign1/crypto-console.py b/assign1/crypto-console.py deleted file mode 100644 index 425436f..0000000 --- a/assign1/crypto-console.py +++ /dev/null @@ -1,193 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -File: crypto-console.py ------------------------ -Implements a console menu to interact with the cryptography functions exported -by the crypto module. - -If you are a student, you shouldn't need to change anything in this file. -""" -import random - -from crypto import (encrypt_caesar, decrypt_caesar, - encrypt_vigenere, decrypt_vigenere, - generate_private_key, create_public_key, - encrypt_mh, decrypt_mh) - - -############################# -# GENERAL CONSOLE UTILITIES # -############################# - -def get_tool(): - print("* Tool *") - return _get_selection("(C)aesar, (V)igenere or (M)erkle-Hellman? ", "CVM") - - -def get_action(): - """Return true iff encrypt""" - print("* Action *") - return _get_selection("(E)ncrypt or (D)ecrypt? ", "ED") - - -def get_filename(): - filename = input("Filename? ") - while not filename: - filename = input("Filename? ") - return filename - - -def get_input(binary=False): - print("* Input *") - choice = _get_selection("(F)ile or (S)tring? ", "FS") - if choice == 'S': - text = input("Enter a string: ").strip().upper() - while not text: - text = input("Enter a string: ").strip().upper() - if binary: - return bytes(text, encoding='utf8') - return text - else: - filename = get_filename() - flags = 'r' - if binary: - flags += 'b' - with open(filename, flags) as infile: - return infile.read() - - -def set_output(output, binary=False): - print("* Output *") - choice = _get_selection("(F)ile or (S)tring? 
", "FS") - if choice == 'S': - print(output) - else: - filename = get_filename() - flags = 'w' - if binary: - flags += 'b' - with open(filename, flags) as outfile: - print("Writing data to {}...".format(filename)) - outfile.write(output) - - -def _get_selection(prompt, options): - choice = input(prompt).upper() - while not choice or choice[0] not in options: - choice = input("Please enter one of {}. {}".format('/'.join(options), prompt)).upper() - return choice[0] - - -def get_yes_or_no(prompt, reprompt=None): - """ - Asks the user whether they would like to continue. - Responses that begin with a `Y` return True. (case-insensitively) - Responses that begin with a `N` return False. (case-insensitively) - All other responses (including '') cause a reprompt. - """ - if not reprompt: - reprompt = prompt - - choice = input("{} (Y/N) ".format(prompt)).upper() - while not choice or choice[0] not in ['Y', 'N']: - choice = input("Please enter either 'Y' or 'N'. {} (Y/N)? ".format(reprompt)).upper() - return choice[0] == 'Y' - - -def clean_caesar(text): - """Convert text to a form compatible with the preconditions imposed by Caesar cipher""" - return text.upper() - - -def clean_vigenere(text): - return ''.join(ch for ch in text.upper() if ch.isupper()) - - -def run_caesar(): - action = get_action() - encrypting = action == 'E' - data = clean_caesar(get_input(binary=False)) - - print("* Transform *") - print("{}crypting {} using Caesar cipher...".format('En' if encrypting else 'De', data)) - - output = (encrypt_caesar if encrypting else decrypt_caesar)(data) - - set_output(output) - - -def run_vigenere(): - action = get_action() - encrypting = action == 'E' - data = clean_vigenere(get_input(binary=False)) - - print("* Transform *") - keyword = clean_vigenere(input("Keyword? 
")) - - print("{}crypting {} using Vigenere cipher and keyword {}...".format('En' if encrypting else 'De', data, keyword)) - - output = (encrypt_vigenere if encrypting else decrypt_vigenere)(data, keyword) - - set_output(output) - - -def run_merkle_hellman(): - action = get_action() - - print("* Seed *") - seed = input("Set Seed [enter for random]: ") - import random - if not seed: - random.seed() - else: - random.seed(seed) - - print("* Building private key...") - - private_key = generate_private_key() - public_key = create_public_key(private_key) - - if action == 'E': # Encrypt - data = get_input(binary=True) - print("* Transform *") - chunks = encrypt_mh(data, public_key) - output = ' '.join(map(str, chunks)) - else: # Decrypt - data = get_input(binary=False) - chunks = [int(line.strip()) for line in data.split() if line.strip()] - print("* Transform *") - output = decrypt_mh(chunks, private_key) - - set_output(output) - - -def run_suite(): - """ - Runs a single iteration of the cryptography suite. - - Asks the user for input text from a string or file, whether to encrypt - or decrypt, what tool to use, and where to show the output. - """ - print('-' * 34) - tool = get_tool() - # This isn't the cleanest way to implement functional control flow, - # but I thought it was too cool to not sneak in here! 
- commands = { - 'C': run_caesar, # Caesar Cipher - 'V': run_vigenere, # Vigenere Cipher - 'M': run_merkle_hellman # Merkle-Hellman Knapsack Cryptosystem - } - commands[tool]() - - -def main(): - """Harness for CS41 Assignment 1""" - print("Welcome to the Cryptography Suite!") - run_suite() - while get_yes_or_no("Again?"): - run_suite() - print("Goodbye!") - - -if __name__ == '__main__': - main() diff --git a/assign1/crypto.py b/assign1/crypto.py deleted file mode 100644 index 9745701..0000000 --- a/assign1/crypto.py +++ /dev/null @@ -1,130 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -File: crypto.py ---------------- -Assignment 1: Cryptography -Course: CS 41 -Name: -SUNet: - -Replace this with a description of the program. -""" -import utils - -# Caesar Cipher - -def encrypt_caesar(plaintext): - """Encrypt plaintext using a Caesar cipher. - - Add more implementation details here. - """ - raise NotImplementedError # Your implementation here - - -def decrypt_caesar(ciphertext): - """Decrypt a ciphertext using a Caesar cipher. - - Add more implementation details here. - """ - raise NotImplementedError # Your implementation here - - -# Vigenere Cipher - -def encrypt_vigenere(plaintext, keyword): - """Encrypt plaintext using a Vigenere cipher with a keyword. - - Add more implementation details here. - """ - raise NotImplementedError # Your implementation here - - -def decrypt_vigenere(ciphertext, keyword): - """Decrypt ciphertext using a Vigenere cipher with a keyword. - - Add more implementation details here. - """ - raise NotImplementedError # Your implementation here - - -# Merkle-Hellman Knapsack Cryptosystem - -def generate_private_key(n=8): - """Generate a private key for use in the Merkle-Hellman Knapsack Cryptosystem. - - Following the instructions in the handout, construct the private key components - of the MH Cryptosystem. This consistutes 3 tasks: - - 1. 
Build a superincreasing sequence `w` of length n - (Note: you can check if a sequence is superincreasing with `utils.is_superincreasing(seq)`) - 2. Choose some integer `q` greater than the sum of all elements in `w` - 3. Discover an integer `r` between 2 and q that is coprime to `q` (you can use utils.coprime) - - You'll need to use the random module for this function, which has been imported already - - Somehow, you'll have to return all of these values out of this function! Can we do that in Python?! - - @param n bitsize of message to send (default 8) - @type n int - - @return 3-tuple `(w, q, r)`, with `w` a n-tuple, and q and r ints. - """ - raise NotImplementedError # Your implementation here - -def create_public_key(private_key): - """Create a public key corresponding to the given private key. - - To accomplish this, you only need to build and return `beta` as described in the handout. - - beta = (b_1, b_2, ..., b_n) where b_i = r × w_i mod q - - Hint: this can be written in one line using a list comprehension - - @param private_key The private key - @type private_key 3-tuple `(w, q, r)`, with `w` a n-tuple, and q and r ints. - - @return n-tuple public key - """ - raise NotImplementedError # Your implementation here - - -def encrypt_mh(message, public_key): - """Encrypt an outgoing message using a public key. - - 1. Separate the message into chunks the size of the public key (in our case, fixed at 8) - 2. For each byte, determine the 8 bits (the `a_i`s) using `utils.byte_to_bits` - 3. Encrypt the 8 message bits by computing - c = sum of a_i * b_i for i = 1 to n - 4. 
Return a list of the encrypted ciphertexts for each chunk in the message - - Hint: think about using `zip` at some point - - @param message The message to be encrypted - @type message bytes - @param public_key The public key of the desired recipient - @type public_key n-tuple of ints - - @return list of ints representing encrypted bytes - """ - raise NotImplementedError # Your implementation here - -def decrypt_mh(message, private_key): - """Decrypt an incoming message using a private key - - 1. Extract w, q, and r from the private key - 2. Compute s, the modular inverse of r mod q, using the - Extended Euclidean algorithm (implemented at `utils.modinv(r, q)`) - 3. For each byte-sized chunk, compute - c' = cs (mod q) - 4. Solve the superincreasing subset sum using c' and w to recover the original byte - 5. Reconsitite the encrypted bytes to get the original message back - - @param message Encrypted message chunks - @type message list of ints - @param private_key The private key of the recipient - @type private_key 3-tuple of w, q, and r - - @return bytearray or str of decrypted characters - """ - raise NotImplementedError # Your implementation here - diff --git a/assign1/design.txt b/assign1/design.txt deleted file mode 100644 index 69ce946..0000000 --- a/assign1/design.txt +++ /dev/null @@ -1,14 +0,0 @@ -Name: -SUNet: - -In 1-3 sentences per section, comment on your approach to each of the parts of the assignment. What was your high-level strategy? How did you translate that into code? Did you make use of any Pythonic practices? We want you to reflect on your coding style, and whether you're making full use of the utilities provides. - - -# Caesar Cipher - - -# Vigenere Cipher - - -# Merkle-Hellman Knapsack Cryptosystem - diff --git a/assign1/feedback.txt b/assign1/feedback.txt deleted file mode 100644 index 6861316..0000000 --- a/assign1/feedback.txt +++ /dev/null @@ -1,17 +0,0 @@ -Name: -SUNet: - -1) How long did this assignment take you to complete? 
- - - - -2) What has been the best part of the class so far? - - - - -3) What can we do to make this class more enjoyable for you? - - - diff --git a/assign1/not_a_secret_message.txt b/assign1/not_a_secret_message.txt deleted file mode 100644 index c23ed7c..0000000 --- a/assign1/not_a_secret_message.txt +++ /dev/null @@ -1,5 +0,0 @@ -# Why hello! You've found a secret message! I guess filenames can be misleading sometimes. - -# There's an inscription here stating that this text was encrypted with a Vigenere cipher using a common word (from /usr/share/dict/words) as a keyword. What could it say?! Surely gold and glory await your discovery. - -WCR FCH! VJTK MBJ'PX PBITQMPUNIGGN HUXQ TZGVVLFLBG PQ FBQU PQ IVGFXZEL. VBCCLAZL, X'K CBGG PBWPBT P JHA CS ICQA GB IFTA WG'H Y EPHGAC XHGVTP YVF LDS MV RRRPRWH GWGL. HDCPPXUHYN GM DCEZQ ULHGTP BM W ETNXHH ETNXHH ETNXHH ETNXHH ZNQXST JXRA DCESQ MOOG PPX AVR HYFL ZRCEMO OF IFX RSL! QSM DVNI YKL HUT MWKG LDS CBGG VSXZGRS RAL YRN YGFKNN? UXSZ, FXLVL MBJ KTKS VI RAPG SPP, POM QDL'M FCH BYDL O CGGOHHR EMLA CA EGTGNN LGMO HUT QXJFRI UHYR "CXLXHDCAC" TUR V'AJ ZLH LDS T UWPT 10 CQAFN QMGBG CDGGAG. LDS PVB'G WYOL HB GCILOG GCILOG GCILOG GCILOG IFBZ QBJPLL! 
\ No newline at end of file diff --git a/assign1/res/caesar-cipher.txt b/assign1/res/caesar-cipher.txt deleted file mode 100644 index 77ba61d..0000000 --- a/assign1/res/caesar-cipher.txt +++ /dev/null @@ -1 +0,0 @@ -SBWKRQ \ No newline at end of file diff --git a/assign1/res/caesar-plain.txt b/assign1/res/caesar-plain.txt deleted file mode 100644 index 38a33ed..0000000 --- a/assign1/res/caesar-plain.txt +++ /dev/null @@ -1 +0,0 @@ -PYTHON \ No newline at end of file diff --git a/assign1/res/mh-cipher.txt b/assign1/res/mh-cipher.txt deleted file mode 100644 index 19d0331..0000000 --- a/assign1/res/mh-cipher.txt +++ /dev/null @@ -1 +0,0 @@ -3504 4497 392 5338 2287 2299 6110 4052 7176 4272 1650 1870 4994 2287 3291 7176 6110 \ No newline at end of file diff --git a/assign1/res/mh-plain.txt b/assign1/res/mh-plain.txt deleted file mode 100644 index b67ce27..0000000 --- a/assign1/res/mh-plain.txt +++ /dev/null @@ -1 +0,0 @@ -<3 StanfordPython \ No newline at end of file diff --git a/assign1/res/vigenere-cipher.txt b/assign1/res/vigenere-cipher.txt deleted file mode 100644 index b04ec8d..0000000 --- a/assign1/res/vigenere-cipher.txt +++ /dev/null @@ -1 +0,0 @@ -LXFOPVEFRNHR \ No newline at end of file diff --git a/assign1/res/vigenere-plain.txt b/assign1/res/vigenere-plain.txt deleted file mode 100644 index 877099d..0000000 --- a/assign1/res/vigenere-plain.txt +++ /dev/null @@ -1 +0,0 @@ -Attack At Dawn! 
\ No newline at end of file diff --git a/assign1/tests/caesar-tests.txt b/assign1/tests/caesar-tests.txt deleted file mode 100644 index 0331bd0..0000000 --- a/assign1/tests/caesar-tests.txt +++ /dev/null @@ -1,19 +0,0 @@ - -A D -B E -I L -X A -Z C -AA DD -TH WK -CAT FDW -DOG GRJ -TOO WRR -DAMN GDPQ -DANIEL GDQLHO -PYTHON SBWKRQ -WHEEEEEE ZKHHHHHH -WITH SPACE ZLWK VSDFH -WITH TWO SPACES ZLWK WZR VSDFHV -NUM83R5 QXP83U5 -0DD !T$ 0GG !W$ \ No newline at end of file diff --git a/assign1/tests/vigenere-tests.txt b/assign1/tests/vigenere-tests.txt deleted file mode 100644 index 96dd137..0000000 --- a/assign1/tests/vigenere-tests.txt +++ /dev/null @@ -1,11 +0,0 @@ -FLEEATONCE A FLEEATONCE -IMHIT H PTOPA -ATTACKATDAWN LEMON LXFOPVEFRNHR -WEAREDISCOVERED LEMON HIMFROMEQBGIDSQ -WEAREDISCOVERED MELON IILFRPMDQBHICSQ -CANTBELIEVE ITSNOTBUTTER KTFGPXMCXOI -CART MAN OAEF -HYPE HYPE OWEI -SAMELENGTH PYTHONISTA HYFLZRVYMH -SHORTERKEY XYZZYZ PFNQRDOIDX -A ONEINPUT O \ No newline at end of file diff --git a/assign1/utils.py b/assign1/utils.py deleted file mode 100644 index ccbe151..0000000 --- a/assign1/utils.py +++ /dev/null @@ -1,66 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -Mathematical utilities for CS41's Assignment 1: Cryptography. -""" -import fractions as _fractions - -class Error(Exception): - """Base class for exceptions in this module.""" - -class BinaryConversionError(Error): - """Custom exception for invalid binary conversions.""" - pass - -def is_superincreasing(seq): - """Return whether a given sequence is superincreasing.""" - ct = 0 # Total so far - for n in seq: - if n <= ct: - return False - ct += n - return True - - -def modinv(a, b): - """Returns the modular inverse of a mod b. 
- - Pre: a < b and gcd(a, b) = 1 - - Adapted from https://en.wikibooks.org/wiki/Algorithm_Implementation/ - Mathematics/Extended_Euclidean_algorithm#Python - """ - saved = b - x, y, u, v = 0, 1, 1, 0 - while a: - q, r = b // a, b % a - m, n = x - u*q, y - v*q - b, a, x, y, u, v = a, r, u, v, m, n - return x % saved - - -def coprime(a, b): - """Returns True iff `gcd(a, b) == 1`, i.e. iff `a` and `b` are coprime""" - return _fractions.gcd(a, b) == 1 - - -def byte_to_bits(byte): - if not 0 <= byte <= 255: - raise BinaryConversionError(byte) - - out = [] - for i in range(8): - out.append(byte & 1) - byte >>= 1 - return out[::-1] - - -def bits_to_byte(bits): - if not all(bit == 0 or bit == 1 for bit in bits): - raise BinaryConversionError("Invalid bitstring passed") - - byte = 0 - for bit in bits: - byte *= 2 - if bit: - byte += 1 - return byte diff --git a/assign2/README.md b/assign2/README.md deleted file mode 100644 index 13ddb56..0000000 --- a/assign2/README.md +++ /dev/null @@ -1,121 +0,0 @@ -# Assignment 2: Quest for the Holy Grail! - -**Due: 4:59:59 PM, Sat May 7th** - -## Overview - -Congratulations on a great quarter so far! - -You have embarked on a quest to find the famous Holy Grail. Along the way, you will have to write various Python scripts to assist you in your journey. We know you're up to the challenge. Your quest will draw on everything you have learned thus far in the course. - -We've actually hidden a Holy Grail somewhere on campus. The first person to find it gets late night, on us =D. Godspeed. - -*Note: We want you to enjoy the beautiful Stanford weather, as a nice break from midterms, and have some fun solving small puzzles.* - -![Holy Grail](http://parktheatreholland.com/wp-content/uploads/2014/10/banner-python.jpg) - -## Logistics - -Download the [starter files](https://github.com/stanfordpython/python-assignments/tree/master/assign2) from GitHub and use some of the hints below to solve the puzzles and find the Holy Grail! 
- -## Puzzle Guidelines - -The staff has created a suite of challenges that will bring you ever closer to the Holy Grail. - -### Graduate from Knight School - -Before you can leave on your quest, you must first graduate from Knight School. In order to do that, you'll need to assemble your schedule so that you're enrolled in as many courses as possible. After all, the sooner you graduate, the sooner you can start your journey! - - -In the case of a tie (i.e. two classes that have the same start and end times), choose the class that comes first in the file. - -The correct set of classes in chronological order will provide you with a token that will unlock `dna.zip`, so you can continue in your quest. - -### Cross the Moat - -**Immediately after unzipping `dna.zip`, you should fill out the form linked in `completed-knightschool.txt` so we know how far along you are.** - -Now that you have your schedule for next quarter, you're ready to embark on your quest. Unfortunately, there's a challenge in the way, and you'll need to use your Python skills to advance. Solve the puzzle to yield the password to unlock `grail.zip`. More instructions are available in the `README.md` inside `dna.zip`. - -### The Final Piece - -**Immediately after unzipping `grail.zip`, you should fill out the form linked in `completed-dna.txt` so we know how far along you are.** - -Having completed the previous challenges, you're now almost to the holy grail, the grand prize. Read through the `README.md` in `grail.zip` for detailed instructions, and don't hesitate to ask questions on Piazza if you get stuck. - -*Note: this final puzzle requires you to seek the holy grail at a physical spot on campus, so you should not wait until Saturday afternoon to start this piece. The holy grail comes with a link to a Google form, which will let us know that you've found the grail.* - -## The Hint Machine - -We've included in the starter files a `mystery.pyc` file. 
This file represents a byte-compiled Python file (compiled using CPython 3.4). If you're stuck, you can attempt to glean a useful hint towards the puzzles by codebreaking the symbols exported by this module. - -The `mystery` module represented by the `.pyc` file can be imported into the python interpreter or into your own script. The `mystery` module contains a `hint` function that takes some number and type of parameters. Use the hints given by the python compiler, the function itself, and your own introspection skills to figure out what to pass into this mystery function to obtain your next hint. - -For example, - -``` -$ python3 ->>> import mystery ->>> mystery.hint() # You might pass some arguments into `hint` -# some output here -``` - -## Starter Files - -``` -assign2/ -├── README.md -├── knightschool -│   ├── courses.txt -│   └── knightschool.py -├── dna.zip -├── grail.zip -└── mystery.pyc -``` - -In addition to this `README`, you've been given a few other tools to help you on your quest: - -* `knightschool/`: Starter code for the first puzzle. -* `dna.zip`: Locked starter code for the second puzzle. You'll need to graduate from knight school before you can unlock this puzzle. -* `grail.zip`: Locked starter code for the third puzzle. An elder knight has given you a locked chest containing vital information about the location of the holy grail. Unfortunately, the knight never gave you the key, so you'll need to find a suitable passkey yourself. -* `mystery.pyc`: A hint-giving, coconut-toting byte-compiled Python module that can help out if you're stuck. - -## General Advice - -As the Zen of Python states, "now is better than never." Get started early on this assignment! - -If you get stuck, post general inquiries on Piazza. If you’re blocked on a particular hint given by the compiled python file, please send us a private note on Piazza! - -## Extensions - -There aren't very many predefined extensions for this project. 
If you think of one, let us know! - -## Submitting - -Submit your final code using the `submit` script on AFS, as with Assignment 1. - -``` -myth$ /usr/class/cs41/tools/submit -``` - -## Grading - -### Functionality - -Your functionality grade is determined purely by your progress in the quest. If you complete all of the challenges, you'll receive a guaranteed check-plus. If you complete only the first two challenges, you'll receive a check. If you complete only the first challenge, you'll receive a check-minus. - -Remember, we're using the Google forms linked in every piece of the puzzle to determine how far you've made it in the quest, so make sure to submit the form as soon as you unlock a new level! **If you don't submit the form, we have no way of determining your progress, and thus can't give you credit for completion of that part.** - -In particular, you need to find the physical holy grail (using the clues in `grail.zip`) on campus to get full marks. The holy grail will be hidden in a bag somewhere. Good luck! - -### Style - -Your style grade is comprised of three main components - Pythonic practices, program design, and Python mechanics. "Pythonic practices" refers to your use of the Python tools we've talked about in class, and emphasizes Pythonic thinking. "Program design" refers to general programming style - decomposition, commenting, algorithms. "Python mechanics" refers to naming, spacing, parenthesizing, etc. Basically everything covered in PEP 8. - -We know that there are many ways to solve each of the challenges, so spend time thinking about the best approach before beginning. - -## Credit - -Inspiration for this assignment comes from the fantastic 1975 British masterpiece, [Monty Python and the Holy Grail](https://www.youtube.com/v/F41SSqJx2tU). As always, credit to Sherman Leung (@skleung) for the original handout, and to David Slater (@dsslater) for minor edits. 
David wrote the class selection problem, and Conner Smith (@csmith95) wrote the DNA puzzle. - -> With <3 by @sredmond diff --git a/assign2/dna.zip b/assign2/dna.zip deleted file mode 100644 index 1b0d670..0000000 Binary files a/assign2/dna.zip and /dev/null differ diff --git a/assign2/grail.zip b/assign2/grail.zip deleted file mode 100644 index d1ab147..0000000 Binary files a/assign2/grail.zip and /dev/null differ diff --git a/assign2/knightschool/courses.txt b/assign2/knightschool/courses.txt deleted file mode 100644 index c4842af..0000000 --- a/assign2/knightschool/courses.txt +++ /dev/null @@ -1,34 +0,0 @@ -CS22A 13.5 15 a -CS29N 13.5 15 t -CS41 13.5 15 E -CS94SI 15.5 16.5 U -CS101 13.5 15 7 -CS102 10.5 12 _ -CS106A 11.5 12.5 r -CS106B 14.5 15.5 q -CS106L 13.5 14.5 2 -CS107 12.5 15 - -CS107E 13.5 15 i -CS109 15 16.5 z -CS110 10.5 11.5 l -CS142 10.5 11.5 x -CS143 10.5 12 4 -CS155 13.5 15 e -CS161 15 16.5 S -CS166 15 16.5 s -CS167 15 16.5 w -CS168 13.5 15 v -CS170 19.5 22.5 p -CS181 13.5 15 I -CS181W 15 16.5 9 -CS190 12.5 15.5 n -CS193A 15 16.5 M -CS193P 16.5 18 P -CS193W 15 16.5 3 -CS194 16.5 18 o -CS194W 16.5 18 H -CS196 18 19.5 D -CS198 16.5 18.5 g -CS198B 13.5 15 6 -CS205A 9 10.5 D -CS210B 16.5 18 J diff --git a/assign2/knightschool/knightschool.py b/assign2/knightschool/knightschool.py deleted file mode 100644 index 17b5c73..0000000 --- a/assign2/knightschool/knightschool.py +++ /dev/null @@ -1,109 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -File: knightschool.py ---------------------- -Assignment 2: Quest for the Holy Grail -Course: CS 41 -Name: -SUNet: - -Replace this with a description of the program. - -It's Winter quarter of your Senior year, and you have a few more Ws then you'd like. -In order for you to 'Camp Stanford' next quarter, you need to take as many classes as -possible. Lucky for you all classes meet MWF, so you dont have to worry about the -days of the week, just the times of the day. 
You need to make a schedule that has -no overlapping classes, however you don't need to take into account the time of -getting from one class to another. One last thing - you are fine with classes that -start at 9, but any course that runs later than 6:30 is off the table. - -The final key for the next puzzle will be the chronological concatenation of letters -from your final course list. - -The list of possible classes will be listed like this: - - course_name start_time end_time letter - -For example, - - CS41 13.5 15 a - -You'll notice that we use decimals for minutes and military time to avoid confusion. - -One way to solve this problem involes a naive brute force approach. Look at all possible -orderings of subsets of classes, and keep the longest one that is valid. However, that approach may -be too slow. Instead, there are ways to filter out classes that you would never take. When done -properly, this will leave few enough classes that a brute force approach will work in reasonable -time. - -Notes: - -* Finish the implementation of a class representing a Course. What attributes should - instances of this class have? - -* You can define a __str__ or __repr__ method for Courses so that they are more - readable when printed to the console. - -* If two classes occur at the same time, choose the class that comes first in the data file. - -* Would you ever choose CourseY instead of CourseX? - CourseX 13.5 15 X - CourseY 13.5 16 Y -""" -import itertools - -class Course: - def __init__(self, *what_other_attributes_go_here): - """ - What attributes should the class have? - """ - pass - - def __str__(self): - return "Course({})".format(self) - -def brute_schedule(courses): - """Determine a final course list, subject to time constraints. - - There are a few clever ways to solve this problem, but this function will - just implement a brute force approach, and return the final puzzle answer. 
- """ - return '' - -def fast_schedule(courses): - """Filters courses subject to time and overlap constraints. - - For every start time, keep the course with the shortest duration that - appears first in the list of courses, and throw out any courses that - start earlier than 9 AM or end later than 6:30 PM - - For example, with these three courses: - - CourseA 9 12 A - CourseB 9 11.5 B - CourseC 9.5 12 C - - CourseB is strictly better than CourseA, so we remove CourseA from - consideration, leaving: - - CourseA 9 12 A - CourseB 9 11.5 B - CourseC 9.5 12 C - - You should return the final puzzle answer. - """ - # Filter out strictly dominated courses - - # With the smallest set of courses, return the final key - return brute_schedule(courses) - - -if __name__ == '__main__': - DATA_FILE = 'courses.txt' - - with open(DATA_FILE, 'r') as f: - course_infos = [line.strip() for line in f] - - courses = [Course(*info.split()) for info in course_infos] - - print(fast_schedule(courses)) diff --git a/assign2/mystery.pyc b/assign2/mystery.pyc deleted file mode 100644 index f9b38f1..0000000 Binary files a/assign2/mystery.pyc and /dev/null differ diff --git a/assign3/README.md b/assign3/README.md deleted file mode 100644 index f3cca30..0000000 --- a/assign3/README.md +++ /dev/null @@ -1,441 +0,0 @@ -# Assignment 3: Wallscraper - -## Congratulations! - -Congratulations on completing Week 6! Midterm season is almost over, and we've basically finished discussing the syntax of the Python language. At this point, you know most of the important stuff about the language itself. Therefore, we'll spend most of the rest of the time in class going over useful builtin- or third-party modules that are omnipresent in the Python ecosystem. However, as far as the language itself goes, you have all become skilled in the art of the Python language. - -## Overview - -Sigh... another CS41 lab day and boring assignment. Better open up reddit and see what's new on /r/funny. 
- -**PSYCH IT'S THE MOST AMAZING LAB EVER.** - -And while you're at it, you really need a new desktop background wallpaper. So, head on over to /r/wallpapers, or maybe /r/wallpaper. If you're feeling up for it, even /r/earthporn and /r/spaceporn. - -Generally, these labs have focused on exploring nuances of the Python language - whether the syntax, semantics, or style of thinking. However, since we've know almost wrapped up talking about the language, labs will become a period of time for you to build something awesome. - -In particular, today you will write a program that automatically downloads the top wallpaper from reddit every night to your local computer, and optionally sets it as your desktop background. So cool! - -## Getting Set Up - -While Python's standard library has a lot of functionality included, we sometimes prefer to work with third-party packages. For this project, we're going to primarily use `requests`, a fantastic web client for Python written by Kenneth Reitz. - -### Installing Required Packages - -As always, the first step is installing any required packages using `pip`. At the very least, you should ensure that `requests` and `Pillow` are installed inside your virtual environment: - -``` -$ # Activate your virtual environment -(cs41)$ pip install requests -(cs41)$ pip install Pillow -``` - -You are free to install any other third-party libraries you think will be useful. In particular, `awesome-slugify` can be used to normalize possibly complicated filenames. - -Our solution also uses the `os`, `sys`, `io`, `subprocess`, `pathlib`, `imghdr`, and `mimetypes` packages from the standard library, if you're looking for possibly useful builtin tools. - -### Check Installations -To ensure that you've successfully installed requests, enter an interactive session and check: - -``` -(cs41)$ python3 -Python 3.4.3 (v3.4.3:9b73f1c3e601, Feb 23 2015, 02:52:03) -[GCC 4.2.1 (Apple Inc. 
build 5666) (dot 3)] on darwin -Type "help", "copyright", "credits" or "license" for more information. ->>> import requests ->>> from PIL import Image ->>> -``` - -If you get back the interactive prompt `>>>`, everything is installed correctly! If, on the other hand, you get an `ImportError` - please let one of the course staff know. - -## Wallscraper Specification - -The internet is full of many awesome things: [cat videos](https://www.youtube.com/v/2XID_W4neJo), [the most awesome person in the world](https://www.facebook.com/), and most importantly, [reddit](https://www.reddit.com/r/python). - -Take a (brief) look at [reddit.com/r/wallpapers](https://www.reddit.com/r/wallpapers) - there is so much happening on the page. Images are dynamically loaded, buttons ask you to click them, ads on the side demand your attention - it can be hard to find the data, and although we could use something like `BeautifulSoup` to parse all this junk, it seems too complicated to be worth it. - -On the other hand, take a (long) look at [reddit.com/r/wallpapers.json](https://www.reddit.com/r/wallpapers.json) (note the suffix `.json`). By adding this suffix to the query, we get back a rich data structure representing, in this case, posts on /r/wallpapers. - -### Overview - -At a high level, you will need to extract a list of the top posts from a subreddit, and for each of the posts, download the linked image if it represents a (SFW) image. - -### Aside: Using `requests` - -In this section, we'll explore some of the functionality of the `requests` module, which will be quite useful for this assignment. You can skip this if you already know how `requests` works. - -A sample usage of the `requests` package is shown below. 
- -``` ->>> import requests ->>> response = requests.get('http://stanfordpython.com') ->>> print(response) - ->>> print(type(response)) - -``` - -The `requests.get` function returns a `Response` object that represents the response returned by the server, in this case by `stanfordpython.com`. (There are similar `post`, `put`, `patch`, and `delete` functions defined by `requests`). - -A `Response` instance supports a lot of attribute references: - -``` ->>> response. -response.apparent_encoding response.history response.raise_for_status -response.close response.is_permanent_redirect response.raw -response.connection response.is_redirect response.reason -response.content response.iter_content response.request -response.cookies response.iter_lines response.status_code -response.elapsed response.json response.text -response.encoding response.links response.url -response.headers response.ok -``` - -For the purposes of this assignment, we only care about a few of them: - -``` -# true iff the request to the server was successful -response.ok - -# the raw server response, as a bytestring -response.content - -# return a python dictionary of the response data, if the data represents JSON-encoded data, otherwise raise an Exception -response.json() -``` - -More information on the `requests` library can be found [here](http://docs.python-requests.org/en/latest/) - -### Query Subreddit Data - -In this section, your task is to write a function `query` that accepts as an argument a subreddit to query (e.g. `'wallpapers'` or `'funny+gifs'`), and returns the JSON server response from reddit as a Python dictionary. You can add any additional positional or keyword arguments as you see fit. 
- -Your function should gracefully handle all of the following scenarios: - -* There is no internet connection -* The user supplies a string that doesn't represent a valid subreddit -* The requests module, in particular the `get` function, throws any exception from `requests.exceptions` (hint: look through [the source code](https://github.com/kennethreitz/requests/blob/master/requests/exceptions.py) to find the base exception class for the `requests` package.) -* reddit responds with a status that is not `ok` - -In all of these situations, your `query` function should print out an informative error message. - -To test this function, write a few lines of code with a reasonable subreddit, and compute the number of posts with a score greater than 500. - -#### Note: Rate Limits - -Reddit imposes a rate limit on generic scripts that make too many requests to its server (default is >30 per minute). If your script is getting rate-limited, Reddit will respond with a `<Response [429]>`, which specifically means: *429 Client Error: Too Many Requests.* - -To avoid this problem, we need to tell Reddit that we're not just some random script by adding a `User-Agent` to our request. In particular, you need to add `headers={'User-Agent': }` as a keyword argument to `requests.get`. - -For instance, I would use - -``` -requests.get( - 'http://www.website.com', - headers={'User-Agent': 'Wallscraper Script by sredmond'} -) -``` - -This should get around the rate-limit problem. - - -### Building a `RedditPost` Class - -Next, you should build a `RedditPost` class that represents a single post. - -A RedditPost object must support two methods, and can support as many helper functions as you see fit. - -* `__init__(self, data)`: Initialize a RedditPost from a JSON dictionary representing the post, as extracted from the top-level subreddit JSON. -* `download(self)`: Tries to download the Reddit post. 
Must determine (1) if the post can be downloaded, (2) where to download the post, and (3) actually download the post. - -For now, let's focus on the constructor. As discussed in the data model section, there are lots of irrelevant attributes in the JSON returned by the subreddit. - -Write the `__init__` method, and only keep the attributes that correspond to an attribute you think will be useful. If an attribute is missing or the data is otherwise corrupted, your program should handle the error gracefully. - -Additionally, you can implement the magic method `__str__(self)` to return a string representing a human-readable form of a post, allowing us to more easily debug when printing a `RedditPost` to the console. We suggest printing the posts in the following format: `"{title} ({score}): {url}"`. - -### Load Response Data into Post Objects - -Write the code to convert the data returned by `query` into a list of `RedditPost` objects. You can accomplish this in one line using a list comprehension. As a sanity check, your list of `RedditPost`s should have length 25 (or perhaps 26 or 27). - -If the data is bad - i.e. keys are missing, information is not structured as you suspect, etc. - your program should not crash. Rather, it should gracefully handle the errors and proceed accordingly. Who's responsibility is it to check for malformed data? - -At this point, rewrite your old code to determine the number of posts with a score greater than 500. This should also take one line of code. - -### Download an Image Post - -Ultimately, our goal is to download wallpapers. Implement the `download(self)` method in the `RedditPost` class that attempts to download the post. - -If the post doesn't represent an image, don't download anything. How can you tell if the post represents an image? You can look at the `url` - does it end with `'.jpg'`,`'.png'`. or any other image suffix? Is `is_self` `True` or `False`? Is `post_hint` `image`, `link`, or something else? 
Is the domain something recognizable like `'i.imgur.com'`? - -You can add any other conditions you'd like on the downloaded wallpapers - perhaps you only download images from imgur, or only wallpapers with a score over 500, or only gilded posts. - -Where do you download the file to? Use the aspect ratio and width/height to store the download in a structured place. For example, if an image is 1920 by 1080, store it in `wallpapers/16x9/1920x1080/image.png`. How can you title the file? For one, you can use the title of the post. However, sometimes Reddit posts have titles that aren't amenable to filesystems, so you should probably slugify the title in some way. Furthermore, most titles have something like `'[1920x1080]'` in the title. You should use a regular expression to detect and remove anything that looks like that, possibly using `re.sub`. - -Hint: if you're writing image data to an open file object, make sure that the file has been opened with `wb` flags for (w)riting in (b)inary mode. Generally, when reading or writing binary data like images or sound files, it's a good idea to use the `'b'` option. - -Test this method by downloading one of the posts. - -If you have successfully downloaded a photo, congratulations! That's pretty dang impressive. - - -### Tying Everything Together - -Ultimately, the goal of this step is to combine all of the pieces you've already written to complete the final project. - -Write the code to take the list of posts generated earlier, and download them all to your filesystem. Cool! - -## Data Model - -Building this wallpaper scraper involves scraping structured data from Reddit. How exactly is this data structured? - -### Top-Level Subreddit Data - `listing` - -The data returned by a subreddit is a `listing` object, which is used to paginate content that is too long to display in one go. 
- -``` -{'data': {'after': 't3_4if6xu', - 'before': None, - 'children': [array of Things], - 'modhash': ''}, - 'kind': 'Listing'} -``` - -Where `'children'` is a list of `Thing`s (the real Reddit name for this data model!) - -If you want to get the previous or next page, supply a query argument `before` or `after` with the value, usually used in conjunction with `count`. - -### Intermediate Storage - `Thing` - -A `Thing`, for our purposes, looks like the following: - -``` -{ - 'data': Post, - 'kind': 't3' -} -``` - -where the only important field, `'data'`, contains a single `Post` object. - -### Reddit Post - `Post` - -The most important data model to understand is that of a Reddit post. In it's entirety, a post looks like: - -``` -{ - 'approved_by': None, - 'archived': True, - 'author': 'onewallpaperaweek', - 'author_flair_css_class': None, - 'author_flair_text': None, - 'banned_by': None, - 'clicked': False, - 'created': 1431645046.0, - 'created_utc': 1431616246.0, - 'distinguished': None, - 'domain': 'i.imgur.com', - 'downs': 0, - 'edited': False, - 'from': None, - 'from_id': None, - 'from_kind': None, - 'gilded': 0, - 'hidden': False, - 'hide_score': False, - 'id': '35ybkb', - 'is_self': False, - 'likes': None, - 'link_flair_css_class': None, - 'link_flair_text': None, - 'locked': False, - 'media': None, - 'media_embed': {}, - 'mod_reports': [], - 'name': 't3_35ybkb', - 'num_comments': 27, - 'num_reports': None, - 'over_18': False, - 'permalink': '/r/wallpaper/comments/35ybkb/its_a_misty_mood_sort_of_day_1920x1080/', - 'post_hint': 'image', - 'preview': { - 'images': [{ - 'id': '_8zF29cGX1DwJ0KDnbYGbh2oycytb6RQS1d807LC898', - 'resolutions': [{ - 'height': 60, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=108&s=43747ff659df46f3a8cbd0699b3fc2ec', - 'width': 108 - }, { - 'height': 121, - 'url': 
'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=216&s=5a7b8c50f90f08cf38121bfbbb518cc2', - 'width': 216 - }, { - 'height': 179, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=320&s=dddde0d7b2389824adf43cae298bcd92', - 'width': 320 - }, { - 'height': 359, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=640&s=c2a05cd908a84f0261dafe2b99b70a8a', - 'width': 640 - }, { - 'height': 539, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=960&s=f9112c9b8a95cf8a9399bd4c049a510e', - 'width': 960 - }, { - 'height': 607, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?fit=crop&crop=faces%2Centropy&arh=2&w=1080&s=95913eda85a710c8e753cc15f498c7e2', - 'width': 1080 - }], - 'source': { - 'height': 1079, - 'url': 'https://i.redditmedia.com/jwI5mvqJE-Cx1C5S99XP-RSB6B3TJKVJr-KKTVmb2zg.jpg?s=c69cfcbf626335086ae4273a6b54b45e', - 'width': 1919 - }, - 'variants': {} - }] - }, - 'quarantine': False, - 'removal_reason': None, - 'report_reasons': None, - 'saved': False, - 'score': 833, - 'secure_media': None, - 'secure_media_embed': {}, - 'selftext': '', - 'selftext_html': None, - 'stickied': False, - 'subreddit': 'wallpaper', - 'subreddit_id': 't5_2qmjl', - 'suggested_sort': None, - 'thumbnail': 'http://a.thumbs.redditmedia.com/VJxDvwX98DdVVckX5-bXrO6gmoh7oHCHPBLIfyjvRn4.jpg', - 'title': "It's a Misty Mood sort of day [1920x1080]", - 'ups': 833, - 'url': 'http://i.imgur.com/fWbnJYt.jpg', - 'user_reports': [], - 'visited': False -} -``` - -That's quite a lot of information! Much of this information isn't relevant to our purposes. 
For this assignment, you should keep only the following attributes: - -``` -subreddit - which subreddit this post originated from -is_self - True iff the post is a self-, text-only post -ups - number of upvotes -post_hint - reddit's guess of the content of the post (could be 'image', 'link', or something else.) -title - title of the post -downs - number of downvotes -score - the overall score of the post (basically ups - downs, but with "vote fuzzing") -url - the post's link, if it is not a self post -domain - the domain of the url -permalink - a permanent link to the reddit post -created_utc - epoch timestamp in UTC of the post's creation -num_comments - how many comments the post has -preview - data structure containing image previews -name - unique name for this post -over_18 - true iff the post is not safe for work (NSFW) -``` - -You will find that some of these attributes are more helpful than others. - -### Extras - -#### Learning More - -For the full description of Reddit JSON objects, check out [the documentation](https://github.com/reddit/reddit/wiki/JSON) - -#### Viewing JSON Data In-Browser - -If you're planning to poke around sample JSON data from the browser, I highly recommend JSONView for [Chrome](https://chrome.google.com/webstore/detail/jsonview/chklaanhfefbnpoihckbnefhakgolnmc) and [Firefox](https://addons.mozilla.org/en-us/firefox/addon/jsonview/). This browser addition makes it easy to explore the structure of JSON from the browser. Unfortunately, there isn't a good equivalent tool for Safari. - -## Pythonic Suggestions - -When processing the data from a given subreddit, make use of list comprehensions to simplify your data exploration. For example, you should never need to build an empty list during any part of this project. - -If you pass `stream=True` as a keyword argument to `requests.get`, the `.content` will not be loaded at once into memory. 
Instead, you can use `requests.iter_content(chunk_size=1024)` to iterate over the server response content. This is generally considered good practice, and should be used when downloading image files, which may be arbitrarily large. - -In keeping with the motto of "coding for the common case", you should generally blindly assume that your data is properly formatted, and catch any improper behavior in an `except` block. That is, use exceptional control flow to simplify error handling. - -## Extensions - -Some of these are easy, some of these are very hard. - -### Download Albums - -Add support for downloading imgur albums. - -### Command Line Utility -We saw in class that command-line arguments can be passed to Python scripts, and these arguments will be available through `sys.argv`. Modify your program so that it can be invoked with a single command-line argument representing the subreddit to scrape data from. So, `$ python wallscraper.py wallpaper` would download all the top wallpapers of the day, and `$ python wallscraper.py fffffffuuuuuuuuuuuu` would download all the top rage comics. - -### Configure your computer so that this script runs every hour/day/month -Both OS X and Linux have ways to schedule a program to run every so often (Windows is harder). If you decide to do this option, talk with us. It's one of the coolest extensions, because you get awesome wallpapers over time, but it's also one of the hardest to get right. If you want to read up on your own, look up `launchd` and `cron`. - -### Programmatically set the highest-scoring wallpaper as your desktop wallpaper -Both OS X and Linux have command-line tools to programmatically set your desktop background to be a specified file path (again, Windows is harder). In combination with the previous extension, you could have an automatically shifting desktop background of the internet's top trending wallpapers! - -### Support for Pagination -We currently scrape only one page of Reddit data at a time. 
In the response data, there are pagination tokens `before` and `after` than can be used to scroll through pages and pages of reddit. Use these pagination tokens to search through arbitrarily many pages of a subreddit. - -### Wallpaper deduplication -If we ever encounter the same wallpaper twice, we'll process the data twice, download it twice, etc. Implement a system that will eliminate image download duplication. You have freedom to implement this however you want. - -### Logging -When you encounter errors, log the errors instead of printing an error message. Use the `logging` library. - -### Parallel Processing and Multithreading -Extend the current download code to make use of Python's multiprocessing and multithreading primitives. - -## Starter Code - -``` -assign3/ -├── README.md -└── wallpapers/ -└── wallscraperutils.py -└── wallscraper.py -``` - -In addition to this `README`, the other starter files are: - -* `wallpapers/`: where all the downloaded wallpapers will go. -* `wallscraper.py`: Barebones starter code. All of your program logic will go into this file. -* `wallscraperutils.py`: A few helper functions that may simplify some of the less interesting steps of the assignment. Read through the file for more information. - -## Submitting - -Submit your final code using the `submit` script on AFS, as with previous assignments. - -``` -myth$ /usr/class/cs41/tools/submit -``` - -Furthermore, we have built a style-checking tool for you. On AFS, you can run - -``` -myth$ /usr/class/cs41/tools/stylecheck path/to/wallscraper.py -``` - -to run PEP8 linting on your Python files. We highly recommend fixing all of the style before submitting your final solution. You can do so using the `autopep8` module discussed in lecture, although you'll have to `pip install autopep8` first inside your virtual environment. - -## Grading - -As stated in class, this assignment is optional. 
If you choose to complete this assignment, we will replace your lowest grade so far with the grades from this assignment if it helps you. - -### Functionality - -We'll be testing your code on live Reddit data, so make sure it works on real subreddits (as stated, we suggest `'/r/wallpapers+wallpaper+earthporn'`). There a lot of different ways you can take this assignment, so we'll be assessing functionality on a case-by-case basis. If your program handles errors gracefully and successfully downloads wallpapers from the internet, that's deserving of a ✓+! If the wallscraper is mostly correct, but fails on some inputs or crashes in certain conditions, that's a ✓. If the program *drastically* fails to either (1) connect to the internet and extract a list of top posts or (2) save posts to the filesystem, that would be a ✓-. - -### Style - -As always, your style grade is comprised of three main components: - -* **Pythonic practices:** Proper use of the Python tools and ways of thinking introduced in this class - using list comprehensions where appropriate, intelligent utilizing iterables/generators where appropriate, etc. -* **Program design:** General programming style - decomposition, commenting, logic, algorithm design, etc. -* **Python mechanics:** Basically everything covered in PEP8 - naming, spacing, parenthesizing, etc. 
- -## Credit - -*This assignment was inspired by a late-night conversation with Eddie Wang (@eddiew), and wouldn't be possible without the careful review of Sherman Leung (@skleung) and course helpers David Slater (@dsslater), Brexton Pham (@bpham), Conner Smith (@csmith95), and Matt Mahowald (@mmahowald)* - -> With <3 by @sredmond \ No newline at end of file diff --git a/assign3/wallscraper.py b/assign3/wallscraper.py deleted file mode 100644 index b142753..0000000 --- a/assign3/wallscraper.py +++ /dev/null @@ -1,30 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -File: wallscraper.py --------------------- -Assignment 3: Wallscraper -Course: CS 41 -Name: -SUNet: - -Replace this with a description of the program. -""" -import utils - - -class RedditPost: - def __init__(self, data): - pass - - def download(self): - pass - - def __str__(self): - return "" - - -def main(): - pass - -if __name__ == '__main__': - main() diff --git a/assign3/wallscraperutils.py b/assign3/wallscraperutils.py deleted file mode 100644 index dd91889..0000000 --- a/assign3/wallscraperutils.py +++ /dev/null @@ -1,53 +0,0 @@ -#!/usr/bin/env python3 -tt -""" -Miscellaneous utilities for wallscraper -""" -from fractions import Fraction -import sys -import pathlib - -WALLPAPER_FOLDER = pathlib.Path(__file__).parent / 'wallpapers' - -def get_aspect_ratio(width, height): - # Credit to https://en.wikipedia.org/wiki/Aspect_ratio_(image) - common = [ - (4, 3), # (COMMON) present-day TV standard - (16, 10), # (COMMON) Standard widescreen computer monitor - (5, 3), # (COMMON) photography standard - (16, 9), # (COMMON) video widescreen standard - ] - - uncommon = [ - (1, 1), # Square - (19, 16), # Movietone Ratio - (5, 4), # Early TV format - (11, 8), # Academy-preferred standard - (3, 2), # 8-perf 35mm film - (14, 9), # Compromise widescreen on some commercials - (15, 9), # Nintendo 3DS / 35 mm - (17, 10), # Some android tablets - (7, 4), # Early 35mm - (37, 20), # 35mm US/UK widescreen standard - (2, 1), # 
SuperScope - (22, 10), # 70mm standard - (21, 9), # Temporary cinema displays - (3, 1), # photography standard - ] - - f = Fraction(width, height) - for ratio in common: - for monitor in range(1, 5): - if f == Fraction(*ratio) * monitor: - return ratio[0] * monitor, ratio[1] - - for ratio in uncommon: - if f == Fraction(*ratio): - return ratio - - # Screens for vertical mobile images - for ratio in common + uncommon: - if f == Fraction(ratio[1], ratio[0]): - return ratio[1], ratio[0] - - sys.stderr.write("Unrecognized dimensions {}x{}: Using {}x{}\n".format(width, height, f.numerator, f.denominator)) - return f.numerator, f.denominator diff --git a/assign4/README.md b/assign4/README.md deleted file mode 100644 index 7897773..0000000 --- a/assign4/README.md +++ /dev/null @@ -1,83 +0,0 @@ -# Assignment 4: Final Project! - -## The Proposal - -*Due: Friday, May 20th at 11:59:59 PM* - -![The Proposal](https://raw.githubusercontent.com/stanfordpython/python-assignments/master/assign4/proposal.png) - -As described in class, the purpose of the project proposal is for the course staff to ensure that the project is well-scoped and incorporates Python in some meaningful way. We can also suggest useful packages for your project. The better your proposal, the more we can assist you by pointing you away from common pitfalls and towards good solutions. - -You should use [`template.md`](https://github.com/stanfordpython/python-assignments/blob/master/assign4/template.md) as a starting template for your proposal. You can access the raw markdown [here](https://raw.githubusercontent.com/stanfordpython/python-assignments/master/assign4/template.md) or just copy-paste into a Google Doc. See [`sampleproposal.md`](https://github.com/stanfordpython/python-assignments/blob/master/assign4/sampleproposal.md) for an example of what we’re looking for. 
Additionally, we've added some ideas to [`ideas.md`](https://github.com/stanfordpython/python-assignments/blob/master/assign4/ideas.md) if you're stuck. - -In order to submit the proposal, drop your file into our [Google Drive Folder](https://drive.google.com/open?id=0B-eHIhYpHrGDdHJzclFoem1rR1E). When we review your proposals, we'll look at the most recently added proposal. - -*Note: you can use late days on the project proposal, but hopefully you will complete the proposal on time so that we can get feedback to you sooner.* - -## The Project - -*Due: Sunday, May 29th at 11:59:59 PM* - -Implement the project you have proposed, incorporating feedback that we will return to you by Monday, May 23rd at the latest. You are free to begin working on the project before you hear from us. - -*Note: you can only use up to one late day on the final project!* - -### Development Strategy and Hints - -Have a plan. We're making you submit a project proposal so that you think about potential challenges and your plan to overcome them. - -Start small and iterate quickly. Python, unlike many other languages, allows you to rapidly iterate. Consider developing your code in small steps using the interactive interpreter. - -Build incrementally and test frequently! This project will likely be the largest Python project you have written, so make sure each task works before moving on! - -### Deliverables - -In addition to your code, you must include a `README.md` file as a meaningful writeup of your final project. - -This writeup should contain a technical overview of the project and the code therein. In effect, you're writing documentation for your project - if the first thing someone reads about your project is the README, what information does she need to know? We're asking you to also include a technical section in your README to describe the code design, the purpose of various modules, and any requirements (e.g. 
must run a certain version of Python, or must have a particular operating system, or must have a Postgres database running, or must have a Google account, or anything else). - -In addition, we're asking you to write installation/execution instructions. After we download your code, what steps do we have to perform to get it up and running? For many of you, the answer will just be "run the main python script," but several others will have more complex configuration. If we can't set up your project, we have no way to confirm that your project works correctly, so we hope that your installation instructions are clear, correct, replicable, and concise. - -Other general sections of a README usually include, but are not limited to: known bugs, contact information for the maintainer (that's you!), and credits/acknowledgements. - -In total, the README should be about 700-1500 words, with a majority of that going to the technical overview. Of course, these words counts are estimates, and you're free to write fewer or more as you see fit. - -### Starter Code - -For this assignment, we are not explicitly providing any code to you. However, we have been working on an early alpha release of a `stanford` package that makes CS106A/B/X-style functionality (graphics, sound) available in Python 3. The software hasn't been tested much, and is surely buggy, but if you'd like to work with our development libraries, let us know. - -You are free to use any builtin modules, publicly available code, or any code you find online, as long as you cite it appropriately. Use Google and StackOverflow a lot! Chances are that someone has built a library to help with your project. - -You may *not* use proprietary code, code which requires a paid license, anything which promotes illegal activity, etc. - -## Grading - -The project *proposal* grade will be assessed purely on completion. Did you do it? Great! If not - less great. 
- -Your final project grade will be assessed on both functionality and style. - -Functionality will be determined holistically using a combination of difficulty of project and success of execution. Unfortunately, that's as detailed as we can get given the breadth of possible topics. In effect, if you put in your fair share of effort, we'll be reasonable. =) - -Stylistically, as always, you'll be assessed on three main categories: - -* **Pythonic practices:** Proper use of the Python tools and ways of thinking introduced in this class - using list comprehensions where appropriate, intelligent utilizing iterables/generators where appropriate, etc. Show us that you've learned how to think like a Python programmer! -* **Program design:** General programming style - decomposition, commenting, logic, algorithm design, etc. -* **Python mechanics:** Basically everything covered in PEP8 - naming, spacing, parenthesizing, etc. - -We hold the final project to a higher standard of style than the assignments, since it's naturally more freeform. Make sure that your code is something that you are proud showing off! - -## Submitting - -When you have finished your final project, you can submit all your files using the `submit` script as usual: - -``` -myth$ /usr/class/cs41/tools/submit -``` - -**WARNING: If you use any third-party libraries, ensure that you have generated a `requirements.txt` file listing your project's dependencies before submitting. You can do this by putting the output of `$ pip freeze` into a file. When exercising your code, we guarantee that we will run `$ pip install -r requirements.txt` to install this list of dependencies.** - -If your project is sufficiently convoluted, make sure to add the corresponding clarifying information in your README file. - -We highly recommend that you check your submission folder on AFS after submitting to ensure that all necessary files were copied over successfully. 
- -> With <3 by @sredmond \ No newline at end of file diff --git a/assign4/ideas.md b/assign4/ideas.md deleted file mode 100644 index e0ed8c1..0000000 --- a/assign4/ideas.md +++ /dev/null @@ -1,88 +0,0 @@ -# Potential Project Ideas - -## Stuck? -*Here are some suggestions* - -If you're totally stuck, it can help to think through your answers to the following questions. - -What categories of stuff are you interested in? Machine learning? Sports? Food? Fashion? Music? Dev tools? Photography? Healthcare? Social media? Video games? Sleeping? Board games? Theater? Biophysics? Algorithms? Systems? Automation? AI? etc. - -What data is available? Census data? Genome data? API data? Survey data? etc. - -What problems do you care about? Healthcare? Food insecurity? Education? Climate change? Divestment? Mental health? Politics? etc. - -You can do almost anything you would like to! - -## Random Ideas -*Miscellaneous ideas that we thought of over the course of the quarter.* - -2-step auto-authenicator - who needs security?! 
just auto-2-step all messages from a certain sender - -pizza button - a hardware button that when pressed will use my credit card info to deliver domino's directly to the dorm room - -paperless.py - clone and update paperless so that it can support CS41 (way out of scope for this class but a person can dream) - -canvasdav - a WebDAV client for Canvas (Coursework had one, Canvas doesn't) for local filesystem-like mounting of Canvas materials - -astparse - check for similarity between python code by looking at the ASTs - -piazzAI - a piazza client that answers common questions by using a hand-selected bank of information about the course or by learning based off of other student/TA answers - -stamp - convert spotify playlists to apple music playlists and back - -herowid - using the erowid database, a huge set of experiences of people on a variety of illegal drugs, classify a person's substance abuse based on their speech or text patterns - -xl2py - an excel spreadsheet to python converter, with reference resolution and everything - -facegraph - scrape all facebook chats with a given person and generate visualizations and analytics of your lifetime conversation - -quarto AI - AI to play a hyper-tic-tac-toe game - -codenames AI - AI using NLP to suggest a clue that is very similar to one set of words but very different from another set of words - -wireframe visualizer - build a project that visualizes 3D graphics using raytracing - -convex optimizer - generic convex optimizations on sufficiently constrained problems - -## Proposals from Autumn 2015 -*List of proposals from last quarter's final projects. This list exists to show you the incredible breadth of projects that are possible!* - -Bach - Programming a Python-based piano interface with various musical capabilities. 
- -NoNonsense.py - scrape the genome of a new organism and return all the genes that will need to be changed, as well as all suggested primers to design to convert this to a viable system for nonsense suppression. - -TeaWithStrangers - When given a group of people and their survey answers, the program will output clusters of people based on common interests with optimized probability of great conversations and new friendships. - -RaspiTrack.py - Implementing a small number of computer vision algorithms in Python using a Raspberry Pi and a RPi Camera module, specifically focused on detecting circular objects, including edge-detection (Canny), k-means, and blob detection. - -Tweet Analyzer - scrapes Tweets based on a hashtag and performs sentiment analysis - -Hearts - Programming a Python-based interface for the game Hearts. - -Kaggle.py - Participate in a Kaggle competition using Python - -Juke - desktop jukebox application, allowing a user to create a playlist then people in proximity will be able to queue songs from their laptops onto the playlist - -Fantasy Football Trade Generator - propose mutually beneficial trades between teams in any fantasy football league. - -Dexcom Data Visualizer - suite that parses Dexcom data to create useful visualizations and metrics using Python. - -MashedHeadlines.py - Twitter bot that will mash together two headlines and tweet them every hour - -Random Rapper - version of the random writer project from CS106B that scrapes rap lyrics from the lyrics website Genius.com to generate random, rap-like output. - -## Ex-Assignments -*These projects were barely cut from being CS41 assignments. With some modification, they could be good final projects!* - -Ghost AI - build an AI for the popular word game Ghost, where players sequentially add a character to a growing prefix, losing if they spell a full word or if the current string is not a valid prefix of any word. 
- -Image Triangulation - Use Delaunay triangulation to auto-artsify profile pics into low-poly renders of the image - -## Stolen Assignments -*Implementations of programming assignments from other classes, this time in Python!* - -Karel (CS106A)- build a Karel emulator in Python, including a graphical interface, a lexer and parser for the karel language, and enforced correctness) - -RSS Aggregator (CS110) - aggregates news from lots of RSS feeds - - diff --git a/assign4/proposal.png b/assign4/proposal.png deleted file mode 100644 index 8cad38b..0000000 Binary files a/assign4/proposal.png and /dev/null differ diff --git a/assign4/sampleproposal.md b/assign4/sampleproposal.md deleted file mode 100644 index 271be97..0000000 --- a/assign4/sampleproposal.md +++ /dev/null @@ -1,66 +0,0 @@ -# CS41 Sample Project Proposal: Sous Chef - -> Sam Redmond (sredmond) and Guido van Rossum (bdfl) - - -## Overview - -We want to build a program that suggests (and generates) recipes given a set of ingredients to include and exclude. - - -## Background - -There are lots of recipe websites on the internet, where you can search for a term like "pie" and you'll get back a list of pies. Usually, there are lots of good utilities in place to search the results in various ways - by price of ingredients, by total preparation time, etc. However, the results are determined only by the query string. - -We want to let a user supply a set of ingredients to include and exclude - say, ingredients they already have in their kitchen and allergens, respectively - and then we suggest highly-rated recipes that use or focus on those ingredients. Perhaps they can restrict the results to dishes with only the specified ingredients in case they *really* don't want to go to the market. Hopefully, we can also use some machine learning techniques to generate new recipes centered on the given ingredients using a variant of k-nearest neighbors and a few hand-chosen heuristics. 
- - -## Implementation Strategy - -Our project falls into three major categories - scraping the data, interacting with the user, and computing recipes to output to the user. - -First, we're going to scrape the [BigOven API](http://api2.bigoven.com/web/documentation), which has over 350,000 recipes. We'll put all of the recipes into a big database so that we can efficiently query it later. For this part, we're going to rely extensively on the `requests` module discussed in class - and perhaps some of the multiprocessing primitives in the standard library to speed up the download. There's also [Food2Fork API](http://food2fork.com/about/api) which has over 200,000 recipes if we need more data points. Lastly, we could always scrape the HTML of other common recipe sites like allrecipes.com. - -To interact with the user, we're going to have a simple text-based I/O system where the user enters the ingredients one at a time and then signals when done. We'll then output a bunch of recipes for the user to choose from. - -To actually compute the best recipes, we'll simply filter out all the recipes that *don't* contain the desired ingredients or *do* contain the undesired ingredients. Of the remaining recipes, we'll sort them on a combination of relevance to the original ingredient suggestions and overall ratings. We plan to use the awesome pandas library for better data manipulation of the recipes in raw Python. - - -## Tasks - -1. Authenticate to the BigOven API -2. Download all of the recipes into a database -3. Load the recipes into Python Recipe class and Cookbook class -4. Main loop that asks user for ingredients and returns recipes using the class interface -5. Match ingredient names to names in recipes to filter bad recipes -6. Sort remaining recipes by a "good" heuristic (we'll need to try a lot of heuristics) -7. *(Stretch)* Use common food substitutions and word misspellings for more flexible user input (i.e. buter -> butter, pop -> soda) -8. 
*(Stretch)* Map the recipes into a high-dimensional vector space and run clustering algorithms to find the best-matching recipes -9. *(Stretch)* Integrate with Instacart so that any missing ingredients can get automatically delivered - -Honestly, the only part we're worried about is the actual algorithm of choosing the best matching recipes. Can a naive algorithm do "well enough," or do we need to incorporate ML techniques to get reasonable results? We're fairly confident that we can scrape the recipe data and do the console I/O. - - -### Estimated Timeline - -**(Core)** - -* Task 1 (1 hours) - both -* Task 2 (2 hours) - both -* Task 3 (0.5 hours) - Guido -* Task 4 (1 hours) - Sam -* Task 5 (1 hours) - Sam -* Task 6 (3 hours) - both - -**(Stretch)** - -* Task 7 (1.5 hours) - Sam -* Task 8 (4 hours) - Sam -* Task 9 (5 hours) - Guido - -We've made a little progress on Task 1 (we acquired an API token), but we haven't used it to connect to the API endpoint yet. - - -## Resources - -All of our data is going to come from the APIs described above, but we're also going to hand-code some test recipes for small data sets to make sure the general logic is working. \ No newline at end of file diff --git a/assign4/template.md b/assign4/template.md deleted file mode 100644 index 641e167..0000000 --- a/assign4/template.md +++ /dev/null @@ -1,48 +0,0 @@ -# Project Title - -> Team Member 1 (sunet1) [ ,Team Member 2 (sunet2) [, and Team Member 3 (sunet3) ] ] - - -## Overview -*This section should be on the order of a few (1-2) sentences. After reading this, we should have a general sense of your final project topic. You can imagine that this is the description blurb for your project if we were to make a CS41 Final Project Program.* - -At a high-level, what do you propose to work on for your final project? - - -## Background -*This section should be on the order of a few (1-2) paragraphs. 
If your project is highly domain-specific, this section may be longer.* - -Give us some context for your project. We may not be as informed about the topic as you are. Why are you excited about this topic? Do you have any experience with projects like this? What effects could this project have on the outside world? - - -## Implementation Strategy -*This section should be on the order of a few (2-3) short paragraphs, and captures your current program design.* - -At a high-level, how do you plan to implement your project? How does it incorporate Python? - -What are the main pieces that your projects breaks down into? -How do these pieces connect to each other? If they share data in some way, how? - -Which Python packages do you plan to work with? How will these external modules connect to the code you write? - - -## Tasks -*This section is the most important because it gives us a sense of the scope of your project and forces you to think about the deliverables to which you'll hold yourself.* - -Break down your project into a sequence of small tasks that you can feasibly accomplish to incrementally build towards a complete working project. - -Include at least three stretch goals. A *stretch goal* is an extension that you think would be really awesome, but would probably be outside the scope of the final project. - -What tasks do you think will be easy? What tasks do you think will be hard? Tedious? Trivial? Give us a sense of your current outlook on these project tasks. - -Annotate each of the tasks (including stretch goals) with an estimate for how long you believe the task will take you. If you're in a group, also annotate each task with the names of anyone responsible for the task. - -If you've already accomplished some of the tasks, make a note of it! If you're incorporating a final project for another class into this final project, you'll be held to a higher standard of quality and quantity, so make a note of that too. 
- -This task list isn't binding - you can change your mind, modify tasks, etc - but it's a good starting point to organize incremental development. - - -## Resources -*This section is smaller and less vital than the others. If you're not using any external resources, you can leave this blank.* - -What external resources will you be working with? If you are working with a dataset, how will you acquire the data set? If you're publishing some project (e.g. web app), how will you host your project? If your project requires hardware, how will you get it? Are there any other resources you need to acquire before starting? \ No newline at end of file