Skip to main content

Relationships in Database using SQLAlchemy

In this post we will discuss the basics of managing relationships between database tables in SQLAlchemy for One to One, One to Many and Many to Many relationships. Before we proceed, lets try to quickly refresh basic relational database concepts.

Relational Database Table : 
A relational database consists of one or more table(s) with each table having rows and columns. Think of table as a grid with column(s) going from left to right and the rows representing the data. Columns are defined to hold a specific type of data, for example, dates, numeric, strings etc. A table can contain zero or more rows while if the rows are zero, table is said to be empty. Example would be  a table of employees in a company.

Primary Key : 
A Primary Key is the combination of one or more columns, whose data is used to uniquely identify the rows in the table. It acts like an address used to identify the homes for postal delivery. If there are more columns are used to represent the primary key, then the combination of data in these columns is used to uniquely represent the rows in table. Example would be "Employee ID" in a table of employees. 

Foreign Key : 
A Foreign Key is the combination of one or more columns that refer to the Primary Key of an another table. Example would be a second table dealing with Employee Details which will have the "Employee ID" as a Foreign Key (This is acting as a Primary Key in Employees table).

One to Many Relationship :
In One to Many Relationship, we can have one table whose rows can be referenced by multiple rows in an another table.

             An example would be the "Employer" and "Employees" table where we can have multiple employees belonging to a unique employer. In this case also, we need to identify Parent and Child tables. As a general rule, the table with One attribute is the Primary table and the other table as Child table. In terms of database, the Child table must hold the Foreign Key and should have an additional Primary Key.

There is another new concept from traditional database in SQLAlchemy - relationships. Relationships complement foreign key and are basically used in ORM to establish relationship between database models (Python Classes). These are very helpful in writing queries involving multiple tables. Foreign key is basically telling our SQL database about the relationship between two tables and Relationship is telling the application (ORM) about the relationship between two tables/classes. Relationship will help ORM to deal with table joins.

class Employer(Base):
    __tablename__ = 'employer_table'

    id = Column(Integer, primary_key=True)
    employer_name = Column(String)
    employees = relationship("Employee")

    def __repr__(self):
        return f'Employer : {self.employer_name}'

class Employee(Base):
    __tablename__ = 'employee_table'

    id = Column(Integer, primary_key=True)
    employer_id = Column(Integer, ForeignKey('employer_table.id'))
    employee_name = Column(String)

    def __repr__(self):
        return f'Employee : {self.employee_name}'

Notice that the foreign key is referencing to the parent table with the name defined in "__tablename__" (since foreign key is more relevant for database relationship). Relationship in Employer class is referncing to Employee class by class name.

We can then add the data using ORM as below :-

from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from models import Employer, Employee

# Create the engine
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
session = Session()

# Create All TablesEmployer.metadata.create_all(engine)
Employee.metadata.create_all(engine)

# Add an employee and employer & add reference
employer1 = Employer(employer_name="ABC Corp")
employee1 = Employee(employee_name="John")
employer1.employees.append(employee1)
session.add(employer1)
session.add(employee1)
session.commit()

# Add a new employer
employer2 = Employer(employer_name="DEF Corp")
session.add(employer2)
session.commit()

# Add an employee to an existing employer
employer = session.query(Employer).filter(
           Employer.employer_name=='DEF Corp').first()
employee = Employee(employee_name="Steve")
employee.employer_id = employer.id
session.add(employee)
session.commit()

print("Employees for employer1",employer1.employees)
print("Employees for employer2",employer2.employees)

Output
Employees for employer1 [Employee : John]
Employees for employer2 [Employee : Steve]

Notice that in the previous example, the data for employees belonging to particular employer can only be accessed using employer instance i.e. it is mainly a one directional flow. SQLAlchemy offers something called backref to make the relation bi-directional.

So, we can modify the relationship in our Employer table as below

class Employer(Base):
    __tablename__ = 'employer_table'

    id = Column(Integer, primary_key=True)
    employer_name = Column(String)
    employees = relationship("Employee", backref='employer')

    def __repr__(self):
        return f'Employer : {self.employer_name}'

print("Employer for employee1",employee1.employer)
print("Employer for employee2",employee.employer)

This allows to access the employer from the employee instance using the backref name that we added.

Employer for employee1 Employer : ABC Corp
Employer for employee2 Employer : DEF Corp

SQLAlchemy offers two ways to offer relationship - backref and backpopulates. If you are using back_populates, then relationship needs to be defined on both the tables/models. If you are using backref, relationship needs to be defined on only one table and can be defined on any table.

class Employer(Base):
    __tablename__ = 'employer_table'

    id = Column(Integer, primary_key=True)
    employer_name = Column(String)
    employees = relationship("Employee", back_populates='employer')

    def __repr__(self):
        return f'Employer : {self.employer_name}'

class Employee(Base):
    __tablename__ = 'employee_table'

    id = Column(Integer, primary_key=True)
    employer_id = Column(Integer, ForeignKey('employer_table.id'))
    employer = relationship("Employer", back_populates='employees')
    employee_name = Column(String)

    def __repr__(self):
        return f'Employee : {self.employee_name}'


One to One Relationship :
One to One relationship means that each row in a database table is linked with one and ONLY one other row in another table. So, if we have two tables A and B, then each row in A is linked with other row in B.

Lets take an example to understand it better :- 
           Consider a case where we have two database tables holding the records of Persons and the Passports. Now, in any country, one person can hold only one passport for that country. If we have to represent this in database, we have to identify the Parent and Child table. In our example, Persons table would be the primary table since a person can exist even without a passport. To represent the One to One relationship in a database, the child table must hold the Foreign Key which actually serves as the Primary Key for that table. So, in this way, you can't hold the duplicate records for the Parent table in Child table. As you can see that the Child table is not having its own primary key and rather is using the Foreign Key as its own Primary Key. So, it is clear that the Foreign Key must be unique in this case.

from sqlalchemy import Column, Integer, String, ForeignKey, Table, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, backref, sessionmaker

Base = declarative_base()

class Person(Base):
    __tablename__ = 'person_table'

    id = Column(Integer, primary_key=True)
    person_name = Column(String)
    passport = relationship("Passport", uselist=False, backref='person')

    def __repr__(self):
        return f'Person : {self.person_name}'

class Passport(Base):
    __tablename__ = 'passport_table'

    id = Column(Integer, primary_key=True)
    passport_id = Column(Integer, ForeignKey('person_table.id'), unique=True)
    passport_number = Column(String)

    def __repr__(self):
        return f'Passport : {self.passport_number}'


# Create the engine
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
session = Session()

# Create All Tables
Person.metadata.create_all(engine)
Passport.metadata.create_all(engine)

# Add an employee and employer & add reference
person1 = Person(person_name="John Doe")
passport1 = Passport(passport_number="ABC123456")
person1.passport = passport1

passport2 = Passport(passport_number="DEF123456")
person2 = Person(person_name="Steve", passport=passport2)

session.add_all([person1, passport1, passport2, person2])
session.commit()


print("Passport id for person1",person1.passport)
print("Passport id for person2",person2.passport)



Many to Many Relationship :
In Many to Many Relationship, the rows from both the tables can reference each other with no limits. Typically, in database, this is handled by a "join table" which will be the third table having the primary keys of both the tables.

             An example would be the "Student" and "Course" tables. In this case, it would be tricky to establish Parent Child relationship. One way is to try to identify which table can exist without the other. In our case, clearly it is Student which should be the Primary table.


"""
Example for Many to Many Relationship using SQLAlchemy

"""

from sqlalchemy import Column, Integer, String, ForeignKey, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

association_table = Table('association', Base.metadata,
    Column('student_id', Integer, ForeignKey('student_table.id')),
    Column('course_id', Integer, ForeignKey('course_table.id'))
)

class Student(Base):
    __tablename__ = 'student_table'

    id = Column(Integer, primary_key=True)
    student_name = Column(String)
    courses = relationship("Course",
                          secondary=association_table,
                          backref="students")

    def __repr__(self):
        return f'Student : {self.student_name}'

class Course(Base):
    __tablename__ = 'course_table'

    id = Column(Integer, primary_key=True)
    course_name = Column(String)

    def __repr__(self):
        return f'Course : {self.course_name}'


# Create the engine
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
session = Session()

# Create All Tables
Student.metadata.create_all(engine)
Course.metadata.create_all(engine)

# Create new entries for Student and Course
student1 = Student(student_name="John")
student2 = Student(student_name="Steve")
course1 = Course(course_name="CS101")
course2 = Course(course_name="DS101")

# Associate Student and Course
student1.courses.append(course1)
student1.courses.append(course2)
student2.courses.append(course1)

# Commit the changes
session.add(student1)
session.add(student2)
session.add(course1)
session.add(course2)
session.commit()

# Query the database and print
print("Courses for student1",student1.courses)
print("Courses for student2",student2.courses)
print("Students enrolled for course1",course1.students)
print("Students enrolled for course2",course2.students)


Many to Many Relationship using Association :
In Many to Many Relationship, the rows from both the tables can reference each other with no limits. Typically, in database, this is handled by a "join table" which will be the third table having the primary keys of both the tables.

             An example would be the "Student" and "Course" tables. In this case, it would be tricky to establish Parent Child relationship. One way is to try to identify which table can exist without the other. In our case, clearly it is Student which should be the Primary table.


"""
Example for Many to Many Relationship using SQLAlchemy using Associations
Associations allow to add extra fields

"""

from sqlalchemy import Column, Integer, String, ForeignKey, Table
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Student(Base):
    __tablename__ = 'student_table'

    id = Column(Integer, primary_key=True)
    student_name = Column(String)
    courses = relationship("Course", secondary="student_course_link")

    def __repr__(self):
        return f'Student : {self.student_name}'

class Course(Base):
    __tablename__ = 'course_table'

    id = Column(Integer, primary_key=True)
    course_name = Column(String)
    students = relationship(Student, secondary="student_course_link")

    def __repr__(self):
        return f'Course : {self.course_name}'

class StudentCourseLink(Base):
   __tablename__ = 'student_course_link'

   student_id = Column(Integer, ForeignKey('student_table.id'), primary_key=True)
   course_id = Column(Integer, ForeignKey('course_table.id'), primary_key=True)
   extra_data = Column(String(256))
   course = relationship(Course, backref="student_bref")
   student = relationship(Student, backref="course_bref")

# Create the engine
engine = create_engine('sqlite:///:memory:', echo=False)
Session = sessionmaker(bind=engine)
session = Session()

# Create All Tables
Student.metadata.create_all(engine)
Course.metadata.create_all(engine)

student1 = Student(student_name="John")
course1 = Course(course_name="CS101")
student1_course1 = StudentCourseLink(student=student1, course=course1, extra_data='Full-Time-Student')

# Commit the changes
session.add(student1)
session.add(course1)
session.add(student1_course1)
session.commit()

# Query the database and print the records
q1 = session.query(StudentCourseLink).join(Student).filter(Student.student_name == 'John').one()
print(q1.student.student_name)

q2 = session.query(StudentCourseLink).filter(StudentCourseLink.extra_data == 'Full-Time-Student').one()
print(q2.student.student_name)

for x in session.query( Student, Course).filter(StudentCourseLink.student_id == Student.id,
   StudentCourseLink.course_id == Course.id).order_by(StudentCourseLink.student_id).all():
   print ("Student: {} Course: {}".format(x.Student.student_name, x.Course.course_name))

Code for all the relationship examples are available at my Github repository

Comments