Commit b6e53550 authored by Sorrel Harriet's avatar Sorrel Harriet
Browse files

a resource for people using mysql

parent dc1ce6a8
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Working with MySQL from Python\n",
"In this tutorial you are shown how to execute SQL statements on a MySQL database from a Python script.\n",
"\n",
"It is assumed here that you have completed Lab 3 and have a basic understanding of relational models and SQL.\n",
"\n",
"**Please note that, since this is a notebook file, you cannot run it from terminal like a .py script.**\n",
"\n",
"## MySQL Connector API\n",
"In this tutorial we will be using the [MySQL Connector API](https://dev.mysql.com/doc/connector-python/en/connector-python-introduction.html) for Python. This library provides a set of utilities we can use to connect to and query a MySQL database. API stands for \\`Application Programming Interface', and it is a general term used to describe a set of classes/functions/routines that facilitate the interaction between 2 different software applications (or different components in the same application).\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# import the mysql connector API\n",
"import mysql.connector"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# store some reuseable configuration options\n",
"# edit these according to your database settings\n",
"config = {\n",
" 'user' : 'USERNAME',\n",
" 'password' : 'PASSWORD',\n",
" 'host' : 'localhost',\n",
" 'database' : 'DATABASENAME'\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Connect to a database\n",
"First off, we will define a reuseable function that establishes a connection with a database, according to the configuration options that we pass it. We will also use this function to try to handle and report on any connection errors."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [],
"source": [
"def connect(config):\n",
" \"\"\" Creates a connection with a MySQL database\n",
" Returns a connection object (handle to the database)\n",
" \"\"\"\n",
"\n",
" try:\n",
" cnx = mysql.connector.connect(**config)\n",
" print( \"Connected to {} database as {}\".format( config['database'], config['user'] ) )\n",
" return cnx\n",
"\n",
" except mysql.connector.Error as err:\n",
" if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:\n",
" print( \"Something is wrong with your user name or password\" )\n",
" elif err.errno == errorcode.ER_BAD_DB_ERROR:\n",
" print( \"Database does not exist\" )\n",
" else:\n",
" print(err)\n",
"\n",
" else:\n",
" cnx.close()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Query the database\n",
"Next we're going to do some work on the database. For that, we need to establish a connection with the database, and then instantiate a **cursor** object. A cursor is a control structure that enables traversal over the records in the database."
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connected to flucks database as sorrel\n",
"\n",
"Querying database...\n",
"\n",
"Img id\tImg URL\t\t\t\t\t\tFlucks given\n",
"1\thttps://pbs.twimg.com/media/DMvdd6gWkAAehPL.jpg\t7\n",
"2\thttps://pbs.twimg.com/media/DMvfNuCXUAAW4F1.jpg\t9\n",
"3\thttps://pbs.twimg.com/media/DMvE0hsXUAAs4pl.jpg\t4\n",
"4\thttps://pbs.twimg.com/media/DMvXMmRU8AACFWi.jpg\t2\n",
"5\thttps://pbs.twimg.com/media/DMvQbNAXcAAyTI8.jpg\t6\n",
"6\thttps://pbs.twimg.com/media/DMviSSOVoAAZyy5.jpg\t4\n",
"7\thttps://pbs.twimg.com/media/DMr4y9nXkAAWbB4.jpg\t6\n",
"8\thttps://pbs.twimg.com/media/DMvWP1PWsAA6lPP.jpg\t7\n",
"9\thttps://pbs.twimg.com/media/DE6JC4AU0AEFE6k.jpg\t6\n",
"10\thttps://pbs.twimg.com/media/DMvdFlhW0AA4x-S.jpg\t8\n",
"11\thttps://pbs.twimg.com/media/DMr2cTxVQAAneAm.jpg\t6\n",
"12\thttps://pbs.twimg.com/media/DMvVQk5W4AAI5Ss.jpg\t4\n",
"13\thttps://pbs.twimg.com/media/DMve8MtX0AEzX87.jpg\t5\n",
"14\thttps://pbs.twimg.com/media/DMvm75YWAAAsCDL.jpg\t7\n",
"15\thttps://pbs.twimg.com/media/DMu6hT2WAAAjIdq.jpg\t3\n",
"16\thttps://pbs.twimg.com/media/DMvW43jUIAAwsar.jpg\t6\n",
"17\thttps://pbs.twimg.com/media/DMnrbtNU8AEC6lY.jpg\t3\n",
"18\thttps://pbs.twimg.com/media/DMvfUrrXUAEsGqa.jpg\t1\n",
"19\thttps://pbs.twimg.com/media/DMveH-OWsAAPLP1.jpg\t5\n",
"20\thttps://pbs.twimg.com/media/DMrz1zmVwAAzIGR.jpg\t1\n"
]
}
],
"source": [
"# create a connection handle to database\n",
"cnx = connect(config)\n",
"\n",
"# check there's an open connection\n",
"if cnx:\n",
" \n",
" # create a cursor object\n",
" cursor = cnx.cursor()\n",
" \n",
" print(\"\\nQuerying database...\")\n",
"\n",
" # fetch some data (i.e. the number of times each image has been `served')\n",
" query = (\"SELECT m.id AS media_id, m.img_url, COUNT(*) AS num_serves FROM Media m INNER JOIN MediaServe ms ON ms.media_id=m.id GROUP BY media_id\")\n",
" cursor.execute(query)\n",
" \n",
" # display some column headings\n",
" print(\"\\nImg id\\tImg URL\\t\\t\\t\\t\\t\\tFlucks given\")\n",
" \n",
" # display the data returned in the cursor object\n",
" for (media_id, img_url, num_serves) in cursor:\n",
" print(\"{}\\t{}\\t{}\".format( media_id, img_url, num_serves ))\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Inserting data\n",
"We can follow a similar process when inserting data. The only difference here is that the data to be inserted is being passed as a second argument in the execute method of the cursor object. This is an optional variation on what we saw previously, where the entire SQL string was passed as a single argument. \n",
"\n",
"The advantage of passing the data separately for an insert is that it leverages Python's string formating rules to run checks on the data that is being inserted (i.e. does the value match the type indicated by the type **tokens**). Of course, that's assuming that the right tokens are used!\n",
"\n",
"We can also retrieve the **lastrowid** property from the cursor - useful when a single insert is part of **transaction**, in which a subsequent insert updates a child table."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"21\n"
]
}
],
"source": [
"# check there's an open connection\n",
"if cnx:\n",
" \n",
" # define the SQL statement (inserts new row in Media)\n",
" # note the use of placeholder `token' \n",
" # which helps to ensure the integrity of the data\n",
" sql = (\"INSERT INTO Media \"\n",
" \"(img_url,alt_txt) \"\n",
" \"VALUES (%s,%s)\")\n",
" \n",
" # define values to replace the tokens\n",
" data = (\"https://pbs.twimg.com/media/DM0o8UwXUAAAFm2.jpg\",\"Image related to 'kitten' scraped from Twitter\")\n",
"\n",
" # execute the query on the database\n",
" cursor.execute(sql, data)\n",
" \n",
" # get id of the new row\n",
" fluck_id = cursor.lastrowid\n",
" print(fluck_id)\n",
"\n",
" # make sure data is committed to the database\n",
" cnx.commit() "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Close the connection\n",
"Finally, we should remember to close the cursor and connection with the database. This is because, if many connections are left open, it might affect the performance of the queries. In practice, leaving connections open (or establishing multiple connections) is unlikely to have a noticeable effect on performance until many users are running the application, but it is not a good habit to get into!"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# when we finish working, close the connection\n",
"cursor.close()\n",
"cnx.close()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment