Notes: Django

Misc Django Notes
by Oliver; March 20, 2017
   
 

Introduction

Here are some unpolished notes about Django, the well-known backend framework written in Python. For these notes, I'm using Django version 1.11.1. (Side note: this website is written in Django:)

A Note About Webdev

A quick note about webdev. When you're creating a website, it's going to go through many iterations before the finished product (if, indeed, it ever finishes—many sites undergo slow, continuous evolution). You'll probably host your site on AWS, but you don't want your rough drafts to be visible to the public. The way I like to solve this issue is to develop and host the website on my local computer, a Mac, and port it to AWS once it's good enough. With git, this isn't too hard and it provides a nice division between your development site and your publication-ready site.

Database: Postgres

First things first. Django needs a database. Let's choose postgres.

Install Postgres and Start It

On Mac, install postgres:
$ brew install postgresql
Initialize a location where postgres stores its data:
$ initdb /path/postgres_data -E utf8 
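You may also need to create a default database named after your user, which is what psql connects to when you don't name a database: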
$ createdb
Start or stop postgres:
$ pg_ctl -D /path/postgres_data -l /path/logfile start 
$ pg_ctl -D /path/postgres_data stop 
We must create a db for our Django project. Let's call our database myDB:
$ createdb myDB 
Check out your postgres processes:
$ ps -Af | grep postgres

The Postgres Shell

Open the postgres shell:
$ psql
Ditto, but attach to a particular db:
$ psql myDB
Ditto, but attach to a particular db as a particular user:
$ psql -d myDB -U myUserName
In the postgres shell, list databases, then connect to one:
=> \l
=> \connect myDB
Show tables, and how big they are:
=> \dt+
Show first 10 rows from mytable:
=> SELECT * FROM mytable LIMIT 10;
Show last 10 rows from mytable:
=> SELECT * FROM mytable ORDER BY id DESC LIMIT 10;
Get the number of rows in mytable:
=> SELECT COUNT(*) FROM mytable;

Starting your Django Project

Let's make an overarching directory for the project called mySite and go into it:
$ mkdir mySite
$ cd mySite
You'll want to use virtualenv so you don't mix up your Django-related Python packages and your globally installed Python packages. Also, let's be sure to use Python 3, not Python 2. Here we go:
$ virtualenv -p python3 venv
$ source venv/bin/activate
$ pip install django
$ pip install psycopg2 # the PostgreSQL adapter for Python, used by Django's postgres backend
$ pip freeze > requirements.txt # record the packages we've installed
$ django-admin startproject myProject
Here's what our directory structure looks like so far:
mySite/
├── myProject
│   ├── manage.py
│   └── myProject
│       ├── __init__.py
│       ├── settings.py
│       ├── urls.py
│       └── wsgi.py
├── notes
└── venv
    ├── bin
    ├── include
    ├── lib
    └── pip-selfcheck.json

Modifying the Database in settings.py

We see that Django has created a settings.py, which is the configuration file for the project. In settings.py, change the default db from sqlite to your postgres db:
# DATABASES = {
#    'default': {
#        'ENGINE': 'django.db.backends.sqlite3',
#        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
#    }
# }

# use postgres instead

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'myDB',
        'USER': 'myUserName',
        'PASSWORD': 'myPassword',
        'HOST': 'localhost',
        'PORT': '',
    }
}
Note: if you're using a public git repo, don't commit your settings file, because it contains a secret key (and now your database password, too)!
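One way to keep the password out of the settings file altogether is to read it from environment variables. Here's a minimal sketch of that idea. The MYSITE_* variable names and the database_settings helper are my own inventions for illustration, not anything Django defines:

```python
import os


def database_settings(env=os.environ):
    """Build a Django-style DATABASES dict from environment variables.

    The MYSITE_* names are hypothetical; use whatever naming suits you.
    """
    return {
        'default': {
            'ENGINE': 'django.db.backends.postgresql',
            'NAME': env.get('MYSITE_DB_NAME', 'myDB'),
            'USER': env.get('MYSITE_DB_USER', 'myUserName'),
            # No fallback for the password: a missing variable raises KeyError
            # instead of silently connecting with a wrong credential.
            'PASSWORD': env['MYSITE_DB_PASSWORD'],
            'HOST': env.get('MYSITE_DB_HOST', 'localhost'),
            'PORT': env.get('MYSITE_DB_PORT', ''),
        }
    }
```

In settings.py you would then write DATABASES = database_settings() and export MYSITE_DB_PASSWORD in your shell before running manage.py. The same trick should work for SECRET_KEY.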

Creating the Database Schema

Now that we've created our database with postgres's createdb command and linked to it in our settings.py file, we have to create the database schema. That's:
$ python manage.py migrate

Starting Git

Not using version control is not an option! Here are some standard commands to get git up and running.

Referring to the directory tree above, we're in the mySite/ directory. First, I like to make a .gitignore file that looks like this:
$ cat .gitignore
notes
*.pyc
settings.py
venv
Now start the repository:
$ git init
$ echo "# myProject" >> README.md
$ git add .gitignore README.md requirements.txt
$ git commit -m 'first commit - add .gitignore, README, requirements.txt'
If you have an empty repository waiting on GitHub, hook it up:
$ git remote add origin git@github.com:myUserName/myProject.git
$ git push -u origin master

Starting an App within your Project

Follow the Django tutorial. Let's make an app called sitebackend:
$ python manage.py startapp sitebackend
Now our directory structure is looking something like this:
mySite/
├── README.md
├── notes
├── myProject
│   ├── manage.py
│   ├── notes
│   ├── myProject
│   │   ├── __init__.py
│   │   ├── __pycache__
│   │   ├── settings.py
│   │   ├── urls.py
│   │   └── wsgi.py
│   └── sitebackend
│       ├── __init__.py
│       ├── __pycache__
│       ├── admin.py
│       ├── apps.py
│       ├── migrations
│       ├── models.py
│       ├── tests.py
│       ├── urls.py
│       └── views.py
├── requirements.txt
└── venv
    ├── bin
    ├── include
    ├── lib
    └── pip-selfcheck.json
Follow the docs to modify the following files:

myProject/sitebackend/urls.py:
from django.conf.urls import url

from . import views

urlpatterns = [
    url(r'^$', views.index, name='index'),
]
myProject/sitebackend/views.py:
from django.shortcuts import render
from django.http import HttpResponse

def index(request):
    return HttpResponse("Hello, world")
myProject/myProject/urls.py:
from django.conf.urls import include, url
from django.contrib import admin

urlpatterns = [
    url(r'^home/', include('sitebackend.urls')),
    url(r'^admin/', admin.site.urls),
]
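To see how the two URL files cooperate, here's a rough pure-Python imitation of the matching: the project-level pattern consumes the home/ prefix and include() hands whatever is left to the app's patterns. This is a simplified sketch, not Django's actual resolver; the resolve and _match functions and the string "targets" are stand-ins I made up:

```python
import re

# Simplified stand-ins for the two URLconfs above: (regex, target) pairs.
PROJECT_PATTERNS = [
    (r'^home/', 'sitebackend.urls'),   # include(): delegate the remainder
    (r'^admin/', 'admin.site.urls'),
]
APP_PATTERNS = [
    (r'^$', 'views.index'),
]
# Which targets are included URLconfs rather than views.
INCLUDED = {'sitebackend.urls': APP_PATTERNS}


def _match(path, patterns):
    """Try each pattern in order; follow includes with the unmatched remainder."""
    for pattern, target in patterns:
        found = re.search(pattern, path)
        if not found:
            continue
        if target in INCLUDED:
            # include() chops off the matched prefix before delegating
            view = _match(path[found.end():], INCLUDED[target])
            if view is not None:
                return view
            continue
        return target
    return None


def resolve(url_path):
    """Return the name of the target a request path would reach, or None."""
    # Patterns are matched against the path without its leading slash.
    path = url_path[1:] if url_path.startswith('/') else url_path
    return _match(path, PROJECT_PATTERNS)
```

With these patterns, resolve('/home/') lands on views.index, while resolve('/home/extra') and resolve('/') match nothing, which in Django would mean a 404.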
Now try serving your initial Django site, as discussed in the next section.

Running the Django Mini-Server

When your project goes into production, you'll want to use a proper server like nginx (see Setting up Django and your web server with uWSGI and nginx). However, you can test your project during development without the hassle of configuring nginx. Django comes bundled with a mini-server. Run it:
$ python manage.py runserver
By default this serves the page on port 8000. To, say, run on port 8001 instead:
$ python myProject/manage.py runserver localhost:8001
(That's via Stackoverflow: Django change default runserver port.) If you're on AWS EC2, don't forget to open security permissions on the port you want to access.

Creating your Models

Suppose we want to define a (biological) virus object. Here's an example models file, sitebackend/models.py:
from django.db import models

class Virus(models.Model):
    """full description of the viruses (nucleic acid info and taxonomy)"""
    name = models.CharField(max_length=70)
    # virus taxonomic identifier
    taxid = models.CharField(max_length=10)
    # DNA, RNA or RETRO
    nucleic1 = models.CharField(max_length=70)
    # ssDNA, dsDNA, (+)ssRNA, (-)ssRNA, dsRNA, RETRO
    nucleic2 = models.CharField(max_length=70)
    # taxonomic info
    order = models.CharField(max_length=70)
    family = models.CharField(max_length=70)
    subfamily = models.CharField(max_length=70)
    genus = models.CharField(max_length=70)
    species = models.CharField(max_length=70)
Now we need to transmit this schema into our database:
$ python manage.py makemigrations sitebackend
$ python manage.py sqlmigrate sitebackend 0001 # optional: print the SQL this migration will run
$ python manage.py migrate

Loading Data into your Database

I often have data in text files and face the issue of importing that data into the database. One way to accomplish this is to write a loader script. I'll create a scripts/ directory in mySite/:
mySite/
├── README.md
├── notes
├── myProject
├── requirements.txt
├── scripts
│   ├── loader1.py
│   └── notes
└── venv
Suppose our text file looks like this:
#taxid  nucleic1        nucleic2        order   family  subfamily       genus   specie  name
568715  RNA     (+)ssRNA        nan     Astroviridae    nan     nan     nan     Astrovirus MLB1
683172  RNA     (+)ssRNA        nan     Astroviridae    nan     nan     nan     Astrovirus MLB2
1247114 RNA     (+)ssRNA        nan     Astroviridae    nan     nan     nan     Astrovirus MLB3
645687  RNA     (+)ssRNA        nan     Astroviridae    nan     nan     nan     Astrovirus VA1
Then we could write a script loader1.py as follows:
import sys
sys.path.append('../myProject')
import django
django.setup()
from sitebackend.models import Virus

# input file has header:
# #taxid nucleic1 nucleic2 order family subfamily genus specie name

header = 1
for line in sys.stdin:
    if header:
        header = 0
        continue
    fields = line.strip().split("\t")
    v = Virus(name = fields[8],
        taxid = fields[0],
        nucleic1 = fields[1],
        nucleic2 = fields[2],
        order = fields[3],
        family = fields[4],
        subfamily = fields[5],
        genus = fields[6],
        species = fields[7]
    )
    v.save()
Now we can run it as follows:
$ export DJANGO_SETTINGS_MODULE=myProject.settings
$ cat file.txt | python ./loader1.py
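The loader above indexes fields by position and saves one row at a time. As a possible refinement, here's a sketch of the parsing half that names the columns once and complains about malformed rows. The parse_virus_rows function and the FIELDS list are my own naming, and I'm assuming the file is tab-separated, as the split("\t") above implies:

```python
import csv

# Column order of the input file. The header says "specie", but the model
# field is called "species".
FIELDS = ['taxid', 'nucleic1', 'nucleic2', 'order', 'family',
          'subfamily', 'genus', 'species', 'name']


def parse_virus_rows(lines):
    """Yield one dict of Virus field values per tab-separated data line."""
    reader = csv.reader(lines, delimiter='\t', quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith('#'):
            continue  # skip blank lines and the '#taxid ...' header
        if len(row) != len(FIELDS):
            raise ValueError('expected %d columns, got %d: %r'
                             % (len(FIELDS), len(row), row))
        yield dict(zip(FIELDS, row))
```

Inside the loader, after django.setup(), each dict can be turned into an object with Virus(**row). For big files, collecting the objects in a list and calling Virus.objects.bulk_create(objects, batch_size=1000) should cut down on the number of INSERT statements compared with calling save() per row.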

Loading Data into your Database from Fixtures

Another way to load your data is via fixtures, which you can read about in the Django documentation. You can make a directory, e.g., here:
$ mkdir -p myProject/sitebackend/fixtures
(the docs say: "By default, Django looks in the fixtures directory inside each app for fixtures") and throw a file of JSON data in the directory. For example, suppose we have cancer objects in our models.py. Then our fixture might look like this:

myCancerData.json:
[
  {"fields": {"name": "Gastric cancer"}, "pk": 1, "model": "sitebackend.Cancer"}, 
  {"fields": {"name": "Colorectal cancer"}, "pk": 2, "model": "sitebackend.Cancer"}, 
  {"fields": {"name": "Glioma"}, "pk": 3, "model": "sitebackend.Cancer"}
]
pk is the primary key.

Common question: suppose we have another database table that links to our cancer table via foreign keys. How do we express that with fixtures? The answer is to use the cancer object's pk to link it. For example, suppose we have patient objects and each patient is associated with a particular cancer. Then our patient fixture might look like this:
[
  {"fields": {"patientid": 53, "study": 1, "cancer": 1}, "pk": 330, "model": "sitebackend.Patient"}, 
  {"fields": {"patientid": 89, "study": 1, "cancer": 2}, "pk": 227, "model": "sitebackend.Patient"}, 
  {"fields": {"patientid": 66, "study": 1, "cancer": 1}, "pk": 19, "model": "sitebackend.Patient"}
]
This captures the relationship that the patient with pk == 227 has Colorectal cancer.

We still haven't loaded the data in the database. To do that, run:
$ python manage.py loaddata myCancerData.json
manage.py loaddata is (pardon the language) finicky as fuck—i.e., the opposite of robust. I discovered the following super-annoying "gotchas":
  • using single quotes not double quotes throws an error
  • a trailing comma at the end of the file ( },] as opposed to }] ) throws an error
  • loading 500,000 objects threw a mystery error; 250,000 objects was ok
Also, it should be noted, if your data is in text files, you'll still have to write a script. Only this time it will be to transform your text file into JSON format.
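Here's a sketch of what such a conversion script might look like. Letting the json module do the writing sidesteps the quoting and trailing-comma gotchas, and a small chunking helper lets you split a huge fixture into several files. The function names and the tab-separated input are assumptions of mine:

```python
import json


def lines_to_fixture(lines, model, field_names, first_pk=1):
    """Convert tab-separated lines into a list of Django fixture entries."""
    entries = []
    pk = first_pk
    for line in lines:
        line = line.rstrip('\n')
        if not line or line.startswith('#'):
            continue  # skip blank lines and header/comment lines
        values = line.split('\t')
        entries.append({
            'model': model,
            'pk': pk,
            'fields': dict(zip(field_names, values)),
        })
        pk += 1
    return entries


def chunks(entries, size):
    """Split a long fixture into smaller lists, one per output file."""
    return [entries[i:i + size] for i in range(0, len(entries), size)]


def write_fixture(entries, path):
    """Write entries as JSON: double quotes, no trailing comma."""
    with open(path, 'w') as handle:
        json.dump(entries, handle, indent=2)
```

For the cancer example, lines_to_fixture(open('cancers.txt'), 'sitebackend.Cancer', ['name']) would produce entries shaped like the fixture shown earlier (cancers.txt being a hypothetical one-column file), and each chunk can be written out and fed to loaddata separately.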

The Django REST framework

I like to use the Django REST framework. The point is to make your backend a lean, JSON-serving API and to handle all the front-end rendering with a JavaScript framework, like Angular or Vue. These JavaScript frameworks will digest your JSON and deal with it in a more elegant and interactive fashion than Django. You thus save yourself from having to use Django's templating engine and are easily set up to build a SPA ("single page application").

Follow their docs to install it:
$ pip install djangorestframework
then add it to your INSTALLED_APPS list in settings.py:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'sitebackend.apps.SitebackendConfig',
    'rest_framework',
]
Now we're going to take inspiration from this tutorial: http://www.django-rest-framework.org/tutorial/2-requests-and-responses/.

Edit sitebackend/views.py to be:
from django.shortcuts import render
from django.http import HttpResponse

# http://www.django-rest-framework.org/tutorial/2-requests-and-responses/
from rest_framework import status
from rest_framework.decorators import api_view
from rest_framework.response import Response

from sitebackend.models import Virus

def index(request):
    return HttpResponse("Hello, world")

@api_view(['GET',])
def get_virus_all(request):
    """
    Get list of virus objects

    Sample output:
    GET /virus
    [
        {
            "order": "nan",
            "species": "Adeno-associated dependoparvovirus A",
            "taxid": "10804",
            ...
        },
        ...
    ]
    """

    # return Response([i.__dict__ for i in Virus.objects.all()[0:10]])

    res = []
    for i in Virus.objects.all():
        i.__dict__.pop('_state', None)
        res.append(i.__dict__)

    return Response(res)
The reason I'm deleting the _state key is that it throws an error if you don't:
<django.db.models.base.ModelState object at ... > is not JSON serializable
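One wrinkle: popping _state out of i.__dict__ modifies the model instance itself. A gentler variant is to copy only the public attributes into a fresh dict. Here's a sketch using a stand-in class so it runs without Django; FakeVirus, FakeState and to_plain_dict are hypothetical names for illustration:

```python
class FakeState(object):
    """Stand-in for Django's ModelState, which is not JSON serializable."""


class FakeVirus(object):
    """Hypothetical stand-in for a Virus model instance."""

    def __init__(self, id, name, taxid):
        self._state = FakeState()   # Django attaches a similar private attribute
        self.id = id
        self.name = name
        self.taxid = taxid


def to_plain_dict(obj):
    """Copy public attributes into a new dict; obj.__dict__ stays untouched."""
    return {key: value for key, value in vars(obj).items()
            if not key.startswith('_')}
```

In the view, this would become res = [to_plain_dict(i) for i in Virus.objects.all()]. Another option worth trying is Virus.objects.values(), which returns dictionaries directly, so Response(list(Virus.objects.values())) may be all you need.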
Now we're going to hook this function up to the appropriate URL. Edit sitebackend/urls.py to be:
from django.conf.urls import url

from . import views

urlpatterns = [
    url(r'^$', views.index, name='index'),
    url(r'^virus/$', views.get_virus_all),
]
The result is a svelte, JSON-serving back-end! Here's what it looks like in the browser:

image

Now your front-end javascript framework can crunch this data and go wild with it—filtering it, populating menus, etc.

The Django Shell

The django shell is ideal for testing database queries. Fire it up:
$ python manage.py shell
Let's suppose we've defined a "sample" class in models.py, and our database is populated with sample objects. Your particular project might have user objects or article objects or whatever, but no matter.

Get all sample objects:
In [1]: from sitebackend.models import Sample

In [2]: Sample.objects.all()
Out[2]: <QuerySet [<Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, <Sample: Sample object>, '...(remaining elements truncated)...']>
Get the sample object with id 1:
In [1]: Sample.objects.get(id = 1)
Out[1]: <Sample: Sample object>
To get the dictionary representation of the sample object with id 1, we can look at the object's __dict__ attribute:
In [1]: Sample.objects.get(id = 1).__dict__
Out[1]:
{'id': 1,
 'patient_id': 1,
 'sampleid': 'P1.T'}
Note the difference between the .get and .filter methods: .get is used when you expect one result and will return something of the object type you're querying; while .filter can return more than one object and thus will yield something of the QuerySet type. Here's .filter:
In [1]: MyGene.objects.filter(name = 'TP53')
Out[1]: <QuerySet [<MyGene: MyGene object>]>

In [2]: type(MyGene.objects.filter(name = 'TP53'))
Out[2]: django.db.models.query.QuerySet
Here's .get:
In [3]: MyGene.objects.get(name = 'TP53')
Out[3]: <MyGene: MyGene object>

In [4]: type(MyGene.objects.get(name = 'TP53'))
Out[4]: sitebackend.models.MyGene
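One more difference worth knowing: when nothing matches, .filter returns an empty QuerySet, whereas .get raises the model's DoesNotExist exception (and MultipleObjectsReturned if more than one row matches). Here's a small helper sketch that turns the "no match" case into None. The name get_or_none is mine, not Django's:

```python
def get_or_none(model, **lookup):
    """Return the single matching object, or None if nothing matches.

    Works with any class that exposes Django's `objects.get` and
    `DoesNotExist`. MultipleObjectsReturned is deliberately not caught,
    since several matches usually signal a data problem.
    """
    try:
        return model.objects.get(**lookup)
    except model.DoesNotExist:
        return None
```

In the shell you'd call it as get_or_none(MyGene, name='TP53'). For a quick one-off, MyGene.objects.filter(name='TP53').first() gives a similar result without a helper.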

Serving Django with nginx

As noted above, refer to: Setting up Django and your web server with uWSGI and nginx. I also have a post on the subject here, which closely mirrors the above link.

One of the first steps that page mentions is to install the development version of Python, and then to install uwsgi:
$ pip install uwsgi
Eventually, you have to start messing around with the nginx config file. Here are some sample nginx commands on my system (Amazon Linux):
$ sudo /etc/init.d/nginx start # start it
$ sudo /etc/init.d/nginx stop # stop it
$ sudo /etc/init.d/nginx restart
$ sudo nginx -t # test the config file syntax and print its path

Miscellaneous

Access-Control-Allow-Origin errors with your frontend framework? Install django-cors-headers (see Wikipedia: Same-origin policy; Wikipedia: Cross-origin resource sharing).
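After pip install django-cors-headers, the settings.py changes look roughly like this. It's a sketch with shortened lists, and the localhost:8080 origin is just an assumed address for a front-end dev server:

```python
# settings.py fragment for django-cors-headers (lists shortened for illustration)
INSTALLED_APPS = [
    'django.contrib.staticfiles',
    'sitebackend.apps.SitebackendConfig',
    'rest_framework',
    'corsheaders',
]

MIDDLEWARE = [
    # CorsMiddleware should sit as high as possible, above CommonMiddleware.
    'corsheaders.middleware.CorsMiddleware',
    'django.middleware.common.CommonMiddleware',
]

# Hypothetical origin of the front-end dev server. Releases of
# django-cors-headers contemporary with Django 1.11 call this setting
# CORS_ORIGIN_WHITELIST; newer releases name it CORS_ALLOWED_ORIGINS and
# expect a scheme, e.g. 'http://localhost:8080'.
CORS_ORIGIN_WHITELIST = ('localhost:8080',)
```

Check the package's README for the setting names that match the version you install.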