Object Relational Mapping (ORM)
Introduction
Software systems employ persistent storage to store data for future use. Often Relational Database Management Systems (RDBMS) act as persistent storage for storing entities being managed by a system. RDMBS support Structured Query Language (SQL) which standardizes how we interact with various database servers offered by multiple vendors (Oracle, MySQL, PostgreSQL etc.)
RDBMS organizes data in a collection of entities known as a table, with each entity captured as one record/row on tables. RDBMS also allows users to express constraints and associations between two entities as indices.
Problem
Object Oriented Programming (OOP) is a programming model that organizes software design around data. Data is modeled as objects, which are in-memory representation of the data in discussion. Evey OOP language supports constructs that help user to define the structure of an object.
For eg.. Java offers Class to model an entity. Following is an example Object model for Employee entity with Id
, firstName
and lastName
as attributes on it.
public class Employee {
private String employeeId;
private String firstName;
private String lastName;
}
While OOP constructs help user to define and manage data in-memory during processing, a key challenge comes into play when these entities modeled as objects need to be persisted/retrieved from underlying rdbms storage.
For e.g. let’s say following sql table represents employee with 3 records:
In order to write / read data from the specified table, programmer will have to write SQL queries.
# Writing data to the table
Insert into Employees(employee_id, first_name, last_name) values ('emp4', 'Emp4', 'Gamma');# Reading data from the table
select `employee_id`, `first_name`, `last_name` from employees where employee_id = 'emp1'
Following are the key challenges:
- Programmers have to write SQL queries and translate entities modeled as Object in supported SQL format to store, read, query in-memory entities both at the time of writing to storage and reading it back from storage.
- Both models must be in sync and we should coordinate changes to each in order to continue working. Out of sync changes between data model and object entities can lead to loss of functionality or loss of data sometimes.
- Where underlying data storage or storage api changes, the programmer must update the object models to adopt.
ORM
Object Relational Mapping is an acronym used to represent mechanisms which let programmers define relational transformations for object models supported by the language. This helps programmer to not think around underlying rdbms storage used and translation to and from objects during read / write.
ORM exposes language specific constructs, usually as annotations that programmer can use to denote the structure. ORM then use these constructs to understand, analyze and generate translations and mappings to underlying rdbms entities.
For e.g. hibernate
offers annotations for java programmer to define entity model . Following is example of using ORM annotations for employee entity
@Entity
public class Employee {
@Id
@Column
private String employeeId;
@Column
private String firstName;
@Column
private String lastName;
}
Example Orms
- Java: Hibernate
- Typescript: TypeORM
- Python: SQLAlchemy
- Dot net: Linq
Pros
ORMs offer a variety of benefits as listed below
- ORM generates the underlying RDBMS representation for the OOP models. We can employ ORM to generate both schema and store/retrieval queries.
- ORM keeps the data model in sync with the object model and updates the underlying structure.
- ORM by existence and use across multiple codebases generated queries are robust and avoids general pitfalls like
sql injection
and other similar issues - When we need to change the underlying rdbms provider, we can leverage ORM to port application with no significant changes to codebase.
- ORMs support regular programming paradigms such as extension and composition, assisting programmers to reuse existing pieces and avoid duplication.
Cons
- N+1 Select Problem: With OneToMany mappings, ORMs tend to execute repetitive queries in specific situations, which can be averted by calling a single broader query. ORMs having no awareness of the wholistic logical data models and hence cannot optimize for such cases.
- Sub-Optimal queries: ORMs are built for a broader audience and might not always generate the most optimal query. With additional domain awareness and profiler use, one can write a better query for slow performing queries.
- Learning Curve: One has to ramp up on ORM constructs for specific languages and configurations required. The learning curve can be steep in some scenarios. There has been some standardization to introduce abstract concepts applicable to multiple ORMs. For e.g. JPA attempts to standardize ORM annotations which can be used across multiple ORM providers for Java language.
Conclusion
We can group benefits from ORM as following
- Developer Productivity: Faster development time , avoiding churn to reiterate over queries as the data model evolves. Developers also don’t need to switch context while writing code.
- Application Structure: Provides easy-to-use constructs and organizes application persistent storage code by using language constructs around DRY.
- Maintainability: Easy to maintain applications since ORMs evolve and offer better and more coverages of underlying functionalities.
While ORMs offer a collection of amazing features and help improve developer productivity. We cannot assume them to be a silver bullet to solve for every problem with persistence storage and complexity around it.
Some situations might still need customized queries, which most ORMs do support by allowing the programmer to execute hand written native platform queries.
Having said that ORMs can not be a replacement for learning SQL and are more of an assistive technology that helps improve developer experience and productivity.