Let's start the new year off right and talk about one of my favorite subjects: Application Programming Interfaces, or APIs.  In short, they are a wonderful thing.  For those less familiar with the premise of APIs, let's dive into some greater detail.  If you are already well versed, feel free to skip ahead.

Simplified, web applications have three components: a front end, a back end server, and a database.  The front end manages how the client interacts with the application and how content received from the back end is displayed.  When the client clicks a button or asks to view things like blog content or profile information, the front end sends a request to the back end server, which retrieves the content from the database.  The front end then renders the content for the client in an interactive or aesthetically pleasing way, and the session continues.  This cycle repeats until the client decides to leave the session and move on to activities that don't involve the internet.

What does this have to do with APIs, you ask?  A great deal.  APIs are exposed routes on the back end application server that clients can access directly, without needing a front end interface.  The extent of an API's capabilities depends on the application's design.  In many cases, though, most things you can do through the web interface you can also do directly through the API.

API requests have four basic components.  The first component is the URL or route of the request.  Simply put, this is the server endpoint you are making your request to.  The second is the header.  The request header typically carries the authentication type and the authentication credentials of a request.  Query parameters are the third basic component.  These parameters describe what subset of data you are seeking to retrieve or manipulate from the endpoint.  The last component is the Hypertext Transfer Protocol (HTTP) method included with the request.
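
To make those four pieces concrete, here is a minimal sketch that assembles a request without sending anything.  It uses only the Python standard library, and the endpoint, token, and parameter names are made up for illustration; they are not part of any real API.

```python
from urllib.parse import urlencode
from urllib.request import Request

# 1. URL or route: a hypothetical endpoint, used only for illustration
base_url = "https://api.example.com/v1/items"

# 2. Header: authentication type plus a placeholder credential
headers = {"Authorization": "Bearer example-token"}

# 3. Query parameters: which subset of the data we want
params = {"state": "MN", "per_page": 40}

# 4. HTTP method
method = "GET"

# Combine the pieces into a request object. Nothing is sent over the network.
req = Request(f"{base_url}?{urlencode(params)}", headers=headers, method=method)

print(req.get_method())
print(req.full_url)
print(req.get_header("Authorization"))
```

The requests package used later in this post takes the same four pieces, just with friendlier syntax.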

A word of caution.  When your Python scripts put credentials in request headers, pass those credentials in through shell environment variables instead of writing them into the code.  This best practice keeps credentials from being committed to version control along with your code, and it is the approach used in this tutorial.

GET, POST, PUT, and DELETE are the four primary HTTP methods.  These methods define the type of operation the server should perform with the request.  For today's example we're going to work with the GET method, which, as you may have guessed, asks the server to get the data from the database and route it back to the client.  In this operation no data is manipulated on the server; it is only retrieved for use by the client.
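
As a rough cheat sheet, the sketch below pairs each method with the operation it conventionally stands for.  The descriptions and the is_read_only helper are my own illustration; what a given endpoint actually does with each method is up to that API's design.

```python
# Conventional meanings only; always check the documentation of the API you use.
HTTP_METHODS = {
    "GET": "retrieve data without changing it",
    "POST": "create a new record",
    "PUT": "replace or update an existing record",
    "DELETE": "remove a record",
}


def is_read_only(method):
    """Return True for the one method in this list that only retrieves data."""
    return method.upper() == "GET"


for method, purpose in HTTP_METHODS.items():
    print(f"{method:<6} read-only={is_read_only(method)!s:<5} {purpose}")
```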

Enough chit chat.  Let's get to work.

The API we are going to work with comes from the U.S. Department of Education.  To follow along you need to request an API key through the Department of Education webpage.  Reference the documentation for this specific API here.  In this example I use the API to retrieve the number of students and the tuition for Minnesota colleges in 2018.

#The requests package is the workhorse for the API request and this post's namesake!
import requests
#The os package enables the use of environment variables (among other things) within your code
import os

#The url variable is the location of your request.  Reference the specific API documentation you're working with for details on how to access a specific endpoint.  I've listed some details on the query parameters below.
#per_page: The number of listings on a page.  For this API you can go up to 40 listings per page.
#school.degrees_awarded.predominant: Enumerated field denoting the predominant degree awarded by a specific institution.  For this example, 3 indicates bachelor's degrees.  Changing the parameter to 2 will search for schools that predominantly award associate degrees.
#school.state: The state of the institution.  In this case I've chosen MN. Go Twins!
#fields: This parameter lists the fields you would like to retrieve from the query.  If you're familiar with SQL this is akin to the column list in a SELECT statement.  If you list no fields, the API will return all fields.  Note this data set is pretty enormous, so if you do not list any fields, performance will suffer.
#For this example I am returning the name of the school, the size of the student body in 2018, the 2018 cost of in-state tuition, and the 2018 cost of out-of-state tuition.

url = "https://api.data.gov/ed/collegescorecard/v1/schools.json?per_page=40&school.degrees_awarded.predominant=3&school.state=MN&fields=id,school.name,2018.student.size,2018.cost.tuition.in_state,2018.cost.tuition.out_of_state"

#Establish an empty payload.  This example is a GET request and does not require a payload.  If using POST you need to supply data for the endpoint to work with.
payload = {}

#Input your authorization header and authentication credentials.  Note that I set an environment variable named APIKEY to provide credentials.  Again, it is best practice to never place credentials within the code you're writing.
headers = {
  'Authorization': 'Basic '+os.environ['APIKEY']
}

#Use the request function from the requests package to communicate with the endpoint.  The first parameter denotes the HTTP method, in this case GET.  The second parameter is the endpoint url.  The third is the headers associated with the request, including the credentials.  The final parameter is any data you're sending with the request.  In this case the payload is empty because we're simply getting data from the server instead of manipulating data.  The result of the API request in this example is returned as an object called response.

response = requests.request("GET", url, headers=headers, data=payload)

#Take the response object and use the json() method to decode the JSON body into Python dictionaries and lists.
print(response.json())
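
One optional variation on the script above: the long url string can be assembled from a dictionary of query parameters, which makes it easier to swap the state or the field list.  The sketch below uses only the standard library, and build_url is a helper name I made up for this post.  It only builds the string; it does not contact the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.data.gov/ed/collegescorecard/v1/schools.json"


def build_url(base_url, params, fields):
    """Join the base URL, the query parameters, and a comma-separated field list."""
    query = dict(params)
    query["fields"] = ",".join(fields)
    # safe="," leaves the commas in the field list unescaped,
    # so the result matches the hand-written url string
    return f"{base_url}?{urlencode(query, safe=',')}"


params = {
    "per_page": 40,
    "school.degrees_awarded.predominant": 3,
    "school.state": "MN",
}
fields = [
    "id",
    "school.name",
    "2018.student.size",
    "2018.cost.tuition.in_state",
    "2018.cost.tuition.out_of_state",
]

url = build_url(BASE_URL, params, fields)
print(url)
```

The requests package can also do this encoding for you if you pass the dictionary through its params argument, although it may percent-encode the commas.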

If you're using macOS or Linux, set the environment variable APIKEY used in this tutorial with the export command shown below.

currentdirectory$ export APIKEY=yourkey

The export command simply makes a variable named APIKEY available to programs started from your current terminal session; the variable disappears when the session ends.  The key is then accessible at runtime through Python's os module.  Using environment variables is a security best practice that keeps your credentials from being exposed within your code.
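
If the variable is missing, os.environ['APIKEY'] fails with a bare KeyError, so a small helper with a friendlier message can save some head scratching.  The sketch below is one way to do it; load_api_key, api_key_header, and basic_auth_header are names I made up.  Two assumptions to check against the API documentation before relying on them: my understanding is that api.data.gov also accepts the key in an X-Api-Key header, and that standard HTTP Basic authentication expects the credential to be base64 encoded as "username:password" (here the key as the username with an empty password).

```python
import base64
import os


def load_api_key(env=None, name="APIKEY"):
    """Read the API key from the environment and fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before running this script.")
    return key


def api_key_header(key):
    # Assumption: the API gateway accepts the key in an X-Api-Key header.
    return {"X-Api-Key": key}


def basic_auth_header(key):
    # Standard Basic auth: base64 of "username:password", with an empty password here.
    token = base64.b64encode(f"{key}:".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Demo with a stand-in environment so no real key is needed
fake_env = {"APIKEY": "not-a-real-key"}
print(api_key_header(load_api_key(fake_env)))
```

In a real script you would call load_api_key() with no arguments so it reads from your shell session.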

With that out of the way, let's take a look at a snippet of the response.

{'metadata': {'total': 37, 'page': 0, 'per_page': 40},
 'results': [{'school.name': 'Walden University',
   '2018.student.size': 6910,
   '2018.cost.tuition.out_of_state': 12120,
   'id': 125231,
   '2018.cost.tuition.in_state': 12120},
  {'school.name': 'Gustavus Adolphus College',
   '2018.student.size': 2229,
   '2018.cost.tuition.out_of_state': 45400,
   'id': 173647,
   '2018.cost.tuition.in_state': 45400},
  {'school.name': 'Minneapolis College of Art and Design',
   '2018.student.size': 717,
   '2018.cost.tuition.out_of_state': 39120,
   'id': 174127,
   '2018.cost.tuition.in_state': 39120},
   
#... the rest of the response is removed for ease of reading
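
Once the response is decoded it is ordinary Python data, so you can start working with it right away.  Here is a small sketch that uses only the three schools shown in the excerpt above; the summarize helper is a name I made up.  I am also assuming that some fields can come back empty in the full data set, so the sort treats a missing size as zero.

```python
import math

# Excerpt of the response shown above (only the three schools listed)
response_json = {
    "metadata": {"total": 37, "page": 0, "per_page": 40},
    "results": [
        {"school.name": "Walden University", "2018.student.size": 6910,
         "2018.cost.tuition.out_of_state": 12120, "id": 125231,
         "2018.cost.tuition.in_state": 12120},
        {"school.name": "Gustavus Adolphus College", "2018.student.size": 2229,
         "2018.cost.tuition.out_of_state": 45400, "id": 173647,
         "2018.cost.tuition.in_state": 45400},
        {"school.name": "Minneapolis College of Art and Design", "2018.student.size": 717,
         "2018.cost.tuition.out_of_state": 39120, "id": 174127,
         "2018.cost.tuition.in_state": 39120},
    ],
}


def summarize(results):
    """Return (name, size, in-state tuition) tuples, largest school first."""
    rows = [
        (r["school.name"], r.get("2018.student.size"), r.get("2018.cost.tuition.in_state"))
        for r in results
    ]
    # Treat a missing size as 0 so the sort does not fail on empty values
    return sorted(rows, key=lambda row: row[1] or 0, reverse=True)


# The metadata tells us how many pages are needed to see every school
meta = response_json["metadata"]
total_pages = math.ceil(meta["total"] / meta["per_page"])

rows = summarize(response_json["results"])
for name, size, tuition in rows:
    print(f"{name}: {size} students, {tuition} in-state tuition")
print(f"pages needed: {total_pages}")
```

With 37 schools and 40 listings per page, everything fits in a single request.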

Voila!  You've now used the Python requests package to make an authenticated GET request to the Department of Education API.  In the next tutorial I will demonstrate how we can use the JavaScript fetch API to make the same request and render the result in a more client-friendly way using a front end built with React.  In that article we will interactively explore which colleges have the greatest disparity between in-state and out-of-state tuition.

Stay tuned!

Aaron