Commit 5313f9a1 authored by Sorrel Harriet

Merge branch 'master' of gitlab.doc.gold.ac.uk:data-networks-web/lab-exercises

parents dfef6089 b6e53550
%% Cell type:markdown id: tags:
# Working with MySQL from Python
This tutorial shows how to execute SQL statements on a MySQL database from a Python script.
It is assumed that you have completed Lab 3 and have a basic understanding of relational models and SQL.
**Please note that, since this is a notebook file, you cannot run it from the terminal like a .py script.**
## MySQL Connector API
In this tutorial we will be using the [MySQL Connector API](https://dev.mysql.com/doc/connector-python/en/connector-python-introduction.html) for Python. This library provides a set of utilities we can use to connect to and query a MySQL database. API stands for 'Application Programming Interface', a general term for a set of classes, functions or routines that facilitate the interaction between two different software applications (or between different components of the same application).
%% Cell type:code id: tags:
``` python
# import the mysql connector API
import mysql.connector
```
%% Cell type:code id: tags:
``` python
# store some reusable configuration options
# edit these according to your database settings
config = {
    'user' : 'USERNAME',
    'password' : 'PASSWORD',
    'host' : 'localhost',
    'database' : 'DATABASENAME'
}
```
%% Cell type:markdown id: tags:
## Connect to a database
First off, we will define a reusable function that establishes a connection with a database, according to the configuration options that we pass it. We will also use this function to handle and report on any connection errors.
%% Cell type:code id: tags:
``` python
# error codes used below to identify common connection problems
from mysql.connector import errorcode

def connect(config):
    """ Creates a connection with a MySQL database
    Returns a connection object (handle to the database),
    or None if the connection could not be established
    """
    try:
        cnx = mysql.connector.connect(**config)
        print( "Connected to {} database as {}".format( config['database'], config['user'] ) )
        return cnx
    except mysql.connector.Error as err:
        if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
            print( "Something is wrong with your user name or password" )
        elif err.errno == errorcode.ER_BAD_DB_ERROR:
            print( "Database does not exist" )
        else:
            print(err)
        # no connection was made, so callers can test the result with `if cnx:`
        return None
```
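%% Cell type:markdown id: tags:
The `**config` syntax used in `connect` unpacks a dictionary into keyword arguments, so each key in `config` is matched to a parameter of the same name. The short sketch below illustrates this with `fake_connect`, a hypothetical stand-in written only for this example (it is not part of the MySQL Connector API and does not touch a database).
%% Cell type:code id: tags:
```python
# Hypothetical stand-in for mysql.connector.connect, used only to show
# how a dictionary is unpacked into keyword arguments.
def fake_connect(user, password, host="127.0.0.1", database=None):
    return "{}@{}/{}".format(user, host, database)

config = {
    'user' : 'USERNAME',
    'password' : 'PASSWORD',
    'host' : 'localhost',
    'database' : 'DATABASENAME'
}

# these two calls are equivalent
explicit = fake_connect(user='USERNAME', password='PASSWORD',
                        host='localhost', database='DATABASENAME')
unpacked = fake_connect(**config)

print(unpacked)
print(explicit == unpacked)
```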
%% Cell type:markdown id: tags:
## Query the database
Next we're going to do some work on the database. For that, we need to establish a connection with the database, and then instantiate a **cursor** object. A cursor is a control structure that lets us execute statements and traverse the rows they return.
%% Cell type:code id: tags:
``` python
# create a connection handle to database
cnx = connect(config)
# check there's an open connection
if cnx:
    # create a cursor object
    cursor = cnx.cursor()
    print("\nQuerying database...")
    # fetch some data (i.e. the number of times each image has been 'served')
    query = ("SELECT m.id AS media_id, m.img_url, COUNT(*) AS num_serves "
             "FROM Media m INNER JOIN MediaServe ms ON ms.media_id=m.id "
             "GROUP BY m.id, m.img_url")
    cursor.execute(query)
    # display some column headings
    print("\nImg id\tImg URL\t\t\t\t\t\tFlucks given")
    # display the data returned in the cursor object
    for (media_id, img_url, num_serves) in cursor:
        print("{}\t{}\t{}".format( media_id, img_url, num_serves ))
```
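%% Cell type:markdown id: tags:
If you do not have the lab database to hand, the same connect, execute and iterate pattern can be tried with Python's built-in `sqlite3` module, which follows the same DB-API conventions. The sketch below is not MySQL: it uses an in-memory SQLite database, an assumed and much simplified version of the `Media` and `MediaServe` tables, made-up sample rows, and `?` placeholders (SQLite's style) in place of `%s`.
%% Cell type:code id: tags:
```python
import sqlite3

# in-memory SQLite database, used here only as a stand-in for MySQL
cnx = sqlite3.connect(":memory:")
cursor = cnx.cursor()

# assumed, simplified schema and sample data for illustration
cursor.execute("CREATE TABLE Media (id INTEGER PRIMARY KEY, img_url TEXT)")
cursor.execute("CREATE TABLE MediaServe (id INTEGER PRIMARY KEY, media_id INTEGER)")
cursor.executemany("INSERT INTO Media (id, img_url) VALUES (?, ?)",
                   [(1, "a.jpg"), (2, "b.jpg"), (3, "c.jpg")])
cursor.executemany("INSERT INTO MediaServe (media_id) VALUES (?)",
                   [(1,), (1,), (2,)])

# same shape of query as in the tutorial: count the serves per image
query = ("SELECT m.id AS media_id, m.img_url, COUNT(*) AS num_serves "
         "FROM Media m INNER JOIN MediaServe ms ON ms.media_id = m.id "
         "GROUP BY m.id, m.img_url ORDER BY m.id")
cursor.execute(query)

# the cursor yields one tuple per row, which we unpack as we iterate
rows = []
for (media_id, img_url, num_serves) in cursor:
    rows.append((media_id, img_url, num_serves))
    print("{}\t{}\t{}".format(media_id, img_url, num_serves))

cursor.close()
cnx.close()
```
%% Cell type:markdown id: tags:
Image 3 has no matching rows in `MediaServe`, so the `INNER JOIN` leaves it out of the results.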
%% Cell type:markdown id: tags:
## Inserting data
We can follow a similar process when inserting data. The only difference is that the data to be inserted is passed as a second argument to the cursor's execute method. This is an optional variation on what we saw previously, where the entire SQL string was passed as a single argument.
The advantage of passing the data separately is that the connector, rather than our own string formatting, converts each Python value to a MySQL type and escapes it before it is placed in the statement. This guards against malformed statements and SQL injection. Note that the `%s` **tokens** are placeholders defined by the connector, not Python string formatting: `%s` is used for every value, whatever its type.
We can also retrieve the **lastrowid** property from the cursor. This is useful when a single insert is part of a **transaction**, in which a subsequent insert updates a child table.
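To see why passing the data separately matters, the sketch below compares a statement built by hand with one that uses placeholders. It runs against an in-memory SQLite database through the standard `sqlite3` module rather than MySQL, so the placeholder is `?` instead of `%s`, and the table is an assumed, simplified `Media`. The value contains quote characters, much like the alt text used in this tutorial.
%% Cell type:code id: tags:
```python
import sqlite3

# in-memory SQLite database, used here only as a stand-in for MySQL
cnx = sqlite3.connect(":memory:")
cursor = cnx.cursor()
cursor.execute("CREATE TABLE Media (id INTEGER PRIMARY KEY, img_url TEXT, alt_txt TEXT)")

# a value containing quote characters
alt_txt = "Image related to 'kitten' scraped from Twitter"

# unsafe: pasting the value into the SQL text lets its quotes
# terminate the string literal early and break the statement
unsafe_sql = "INSERT INTO Media (img_url, alt_txt) VALUES ('x.jpg', '{}')".format(alt_txt)
try:
    cursor.execute(unsafe_sql)
    unsafe_ok = True
except sqlite3.Error as err:
    unsafe_ok = False
    print("String-built statement failed:", err)

# safe: the value is handed to the driver separately from the SQL text
cursor.execute("INSERT INTO Media (img_url, alt_txt) VALUES (?, ?)", ("x.jpg", alt_txt))
cursor.execute("SELECT alt_txt FROM Media")
stored = cursor.fetchone()[0]
print(stored == alt_txt)

cursor.close()
cnx.close()
```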
%% Cell type:code id: tags:
``` python
# check there's an open connection
if cnx:
    # define the SQL statement (inserts new row in Media)
    # note the use of placeholder 'tokens' (%s), which let the
    # connector convert and escape the values safely
    sql = ("INSERT INTO Media "
           "(img_url,alt_txt) "
           "VALUES (%s,%s)")
    # define values to replace the tokens
    data = ("https://pbs.twimg.com/media/DM0o8UwXUAAAFm2.jpg","Image related to 'kitten' scraped from Twitter")
    # execute the query on the database
    cursor.execute(sql, data)
    # get id of the new row
    fluck_id = cursor.lastrowid
    print(fluck_id)
    # make sure data is committed to the database
    cnx.commit()
```
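%% Cell type:markdown id: tags:
The use of `lastrowid` within a transaction can be sketched as follows: a parent row is inserted, its id is reused for a child row, and both are committed together or rolled back together. This is an illustration only, run against an in-memory SQLite database via `sqlite3` (with `?` placeholders) rather than MySQL. The function `insert_media_and_serve`, its `break_child` flag and the simplified table definitions are all assumptions made for this example.
%% Cell type:code id: tags:
```python
import sqlite3

# in-memory SQLite database, used here only as a stand-in for MySQL
cnx = sqlite3.connect(":memory:")
setup = cnx.cursor()
setup.execute("CREATE TABLE Media (id INTEGER PRIMARY KEY, img_url TEXT, alt_txt TEXT)")
setup.execute("CREATE TABLE MediaServe (id INTEGER PRIMARY KEY, media_id INTEGER NOT NULL)")
cnx.commit()
setup.close()

def insert_media_and_serve(cnx, img_url, alt_txt, break_child=False):
    """ Inserts a Media row and a MediaServe row as one transaction
    Returns the new Media id, or None if the transaction was rolled back
    """
    cursor = cnx.cursor()
    try:
        cursor.execute("INSERT INTO Media (img_url, alt_txt) VALUES (?, ?)",
                       (img_url, alt_txt))
        # id generated for the parent row, needed by the child row
        media_id = cursor.lastrowid
        # break_child simulates a failure by violating the NOT NULL constraint
        child_value = None if break_child else media_id
        cursor.execute("INSERT INTO MediaServe (media_id) VALUES (?)", (child_value,))
        cnx.commit()
        return media_id
    except sqlite3.Error:
        # undo the parent insert too, so no orphaned row is left behind
        cnx.rollback()
        return None
    finally:
        cursor.close()

first_id = insert_media_and_serve(cnx, "kitten.jpg", "A kitten")
failed_id = insert_media_and_serve(cnx, "puppy.jpg", "A puppy", break_child=True)

check = cnx.cursor()
check.execute("SELECT COUNT(*) FROM Media")
media_count = check.fetchone()[0]
check.close()
cnx.close()

print(first_id, failed_id, media_count)
```
%% Cell type:markdown id: tags:
Because the second call is rolled back, only the first image remains in `Media`. With MySQL Connector the equivalent calls are `cnx.commit()` and `cnx.rollback()` on the connection object.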
%% Cell type:markdown id: tags:
## Close the connection
Finally, we should remember to close the cursor and the connection to the database. Each open connection holds resources on the database server, and the server only accepts a limited number of simultaneous connections, so connections that are left open can eventually degrade performance or cause new connections to be refused. In practice, this is unlikely to have a noticeable effect until many users are running the application, but leaving connections open is not a good habit to get into!
%% Cell type:code id: tags:
``` python
# when we finish working, close the connection
cursor.close()
cnx.close()
```
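%% Cell type:markdown id: tags:
One way to make sure the close calls always run, even if a query raises an error part-way through, is to wrap the objects in `contextlib.closing`, which calls `close()` on leaving the `with` block. The sketch below shows the pattern with an in-memory SQLite database via `sqlite3` as a stand-in. It should carry over to any connection or cursor object that provides a `close()` method, which is assumed here to include those returned by MySQL Connector.
%% Cell type:code id: tags:
```python
import sqlite3
from contextlib import closing

# both objects are closed automatically when their block ends,
# whether or not an exception was raised inside it
with closing(sqlite3.connect(":memory:")) as cnx:
    with closing(cnx.cursor()) as cursor:
        cursor.execute("SELECT 1 + 1")
        result = cursor.fetchone()[0]

# using the connection after the block fails, confirming it was closed
try:
    cnx.cursor()
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(result, still_open)
```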