Categories
Motivation
In the forties and fifties (mostly in the works of Cartan, Eilenberg, MacLane, and Steenrod), it was realized that there was a systematic way of developing certain relations of linear algebra, depending only on fairly general constructions which were mostly arrow-theoretic, and were affectionately called abstract nonsense by Steenrod.
My source: Riehl
Perhaps the purpose of categorical algebra is to show that which is trivial is trivially trivial.
It is likely you were unwittingly exposed to mathematical categories long before you first heard the words "category theory." Real vector spaces and their linear transformations? That's a category. Groups and their group homomorphisms? A category. Rings, fields, or topological spaces (and the "appropriate" maps between them)? All categories.
At the most intuitive level, a mathematical category simply consists of "stuff" (usually mathematical objects with prescribed algebraic structures) and the "maps" between them (usually, but not always, set maps that respect those algebraic structures). Category theory, then, can be thought of as a mathematical language that is broadly applicable across algebra, topology, set theory, logic, and beyond. This is part of what gives category theory its power, namely its ability to universally describe constructions and ideas across different mathematical disciplines. It brings under a single umbrella the study of sets (with their set maps), vector spaces (with their linear transformations), groups (with their homomorphisms), and topological spaces (with their continuous maps), just to name a few.
Category theory studies objects (e.g., groups) and the arrows between them (e.g., homomorphisms). Every general result of category theory is a result that can be interpreted and used in your favorite category. There are maps between categories, called functors, which allow us to connect categories to each other; and there are maps between functors, called natural transformations, which can provide deep insights into fundamental mathematical constructions (e.g., free groups) and operations (e.g., tensor products).
There is a second, less obvious benefit to studying category theory that I personally feel is even more profound. Thinking categorically can push us to embrace new, abstract ideas that we might initially find unintuitive, but which eventually provide incredible new insights. These insights and lessons are sprinkled throughout these notes. Keep your eyes peeled for them!
Formal definitions
Any formal definition of category is admittedly a bit clunky, so remember the general idea: you have objects, and you have arrows between those objects.
A category consists of the following data:

A collection^{[1]} of objects

A collection of arrows

For each arrow, specified domain and codomain objects. The notation $f \colon x \to y$ signifies that $f$ is an arrow with domain $x$ and codomain $y$.

For each object, a specified identity arrow. The notation $1_x$ denotes the identity arrow for the object $x$.

Any pair of arrows $f \colon x \to y$ and $g \colon y \to z$ with the codomain of $f$ equal to the domain of $g$ is called a composable pair. For each composable pair of arrows, there is a specified composite arrow with domain the domain of $f$ and codomain the codomain of $g$. We denote this composite arrow $g \circ f$ (or simply $gf$, if there is no cause for confusion).

These data are subject to the following two axioms:

(Identity) For any arrow $f \colon x \to y$, the composites $1_y \circ f$ and $f \circ 1_x$ are both equal to $f$.

(Associativity) For any composable triple of arrows $f \colon w \to x$, $g \colon x \to y$, $h \colon y \to z$, the composites $h \circ (g \circ f)$ and $(h \circ g) \circ f$ are equal.
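To make the definition concrete, here is a minimal sketch of a finite category stored as plain Python data, together with a brute-force check of the two axioms. The object names `x`, `y`, `z`, the arrow names `f`, `g`, `gf`, and the helpers `compose` and `check_axioms` are all made up for illustration; this is one possible encoding, not a standard one.

```python
from itertools import product

# Objects and arrows of a tiny category (all names are illustrative).
objects = ["x", "y", "z"]

# Each arrow name maps to its (domain, codomain) pair.
arrows = {
    "1_x": ("x", "x"),
    "1_y": ("y", "y"),
    "1_z": ("z", "z"),
    "f": ("x", "y"),
    "g": ("y", "z"),
    "gf": ("x", "z"),
}

# The specified identity arrow for each object.
identity = {"x": "1_x", "y": "1_y", "z": "1_z"}


def compose(g, f):
    """Return the composite 'g after f', or None if the pair is not composable."""
    if arrows[f][1] != arrows[g][0]:
        return None
    if f == identity[arrows[f][0]]:
        return g
    if g == identity[arrows[g][1]]:
        return f
    # The only composable pair of non-identity arrows in this example.
    return {("g", "f"): "gf"}[(g, f)]


def check_axioms():
    """Check the identity and associativity axioms by exhaustive search."""
    for f, (dom, cod) in arrows.items():
        if compose(identity[cod], f) != f or compose(f, identity[dom]) != f:
            return False
    for f, g, h in product(arrows, repeat=3):
        if arrows[f][1] == arrows[g][0] and arrows[g][1] == arrows[h][0]:
            if compose(h, compose(g, f)) != compose(compose(h, g), f):
                return False
    return True
```

Calling `check_axioms()` runs through every arrow and every composable triple, which is only feasible because the example is finite and very small.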
Visualization
When we want to visualize categories it's useful to think of the objects as dots and the arrows as ... arrows. For example, we might visualize two objects
There are a few things to note. First, if this image is meant to represent an entire category, then it is implicitly assumed that all of the properties required to be a category hold. For example, there must exist an arrow corresponding to the composition
That being said, it is much more common to draw a picture like the above to represent a small "part" of a category, in which case we are not meant to assume that the only arrows between
In all cases, it is common convention (mainly for our own sanity) to omit any arrows that must necessarily be in the category, per the definition of category. So we usually don't draw the identity arrows, nor do we draw the compositions of all composable arrows. We simply assume that those are all present. In our current example, we might then simply sketch the following diagram:
For one final simplification, it is common to simplify the visual presentation of the objects, either by dropping their labels, or removing the dots, such as below:
More examples can be found here.
Conventions
The language and notation of category theory are not completely standardized, but here are some common conventions.
Abstract categories are sometimes denoted with a single capital script letter, such as
If there is no cause for confusion, it's reasonable (and common) to drop all pretension with script lettering and simply use capital letters to denote categories; e.g., the category
It is common to use the word morphism in place of "arrow." I will personally use "morphism" when working with known algebraic objects (such as groups or modules), as it harkens back to the word "homomorphism" that was (and regrettably still is) used in those contexts. However, for an abstract category I will stick to "arrow."
Given two objects
Most (but not all) categories are named after their objects. For example, the category with objects all groups and with arrows all group morphisms is called "the category of groups" and is usually denoted with some variation of
Set-theoretic issues
At the most fundamental and rigorous level, there are some technical logical issues that need to be addressed. This section briefly addresses those concerns, but this is not something we will worry about elsewhere.
When attempting to study all objects of a certain type, it is easy to run into set-theoretic issues along the lines of Russell's paradox. A common convention is to assume there is a big enough set
A category is small if its collection of arrows is a small set.
Since each object is uniquely associated with an identity arrow, in a small category the collection of objects is also a small set. Unfortunately, many of the common categories we encounter are not small, i.e., are large.
A category is locally small if for any pair of objects the collection of arrows between those objects is a small set.
Will we worry about any of this? We will not. Instead, we will embrace the following quote:
The search for the most useful set-theoretical foundations for category theory is a fascinating topic that unfortunately would require too long of a digression to explore. Instead, we sweep these foundational issues under the rug, not because these issues are not serious or interesting, but because they distract from the task at hand.
Examples
Examples are abundant! We begin with some really basic categories before discussing the (more complicated) categories you've likely encountered before.
A common convention with very "simple" categories is to simply sketch a visual representation of the category, with dots used to represent objects and arrows to represent ... arrows. It is also common convention not to draw the identity arrows, nor any arrows that are necessarily there by the composition assumption.
Some basic categories
The smallest possible category is the empty category, which has no objects or arrows. This category is usually denoted
The next smallest categories are the categories that have a single object. It is common to let
Since there is a unique object and a unique arrow (the identity arrow on that object), there's no point in even labeling them. However, if you were to label the object you might^{[3]} label it as below:
With our convention of omitting identity arrows, we would visualize this category simply as
There are lots of categories that have a unique object but many arrows. Such categories are in (natural) bijection with monoids.
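As a rough illustration of this correspondence, the sketch below wraps a monoid (here, strings under concatenation) as a category with a single object: the arrows are the monoid's elements, composition is the monoid operation, and the identity arrow is the unit. The class name `OneObjectCategory` and its methods are hypothetical, chosen only for this example.

```python
class OneObjectCategory:
    """A monoid viewed as a category with exactly one object.

    Arrows are the elements of the monoid. Every pair of arrows is
    composable, because all arrows share the same domain and codomain.
    """

    def __init__(self, op, unit):
        self.op = op      # the monoid operation, used as composition
        self.unit = unit  # the monoid unit, used as the identity arrow

    def identity(self):
        return self.unit

    def compose(self, g, f):
        # 'g after f' is just the monoid product of g and f.
        return self.op(g, f)


# Strings under concatenation form a (non-commutative) monoid.
strings = OneObjectCategory(op=lambda g, f: g + f, unit="")

f, g, h = "ab", "c", "de"

# Identity axiom, checked on a sample arrow.
identity_ok = (
    strings.compose(strings.identity(), f) == f
    and strings.compose(f, strings.identity()) == f
)

# Associativity axiom, checked on a sample triple.
assoc_ok = strings.compose(h, strings.compose(g, f)) == strings.compose(
    strings.compose(h, g), f
)
```

The two category axioms reduce exactly to the unit and associativity laws of the monoid, which is why the two notions carry the same information.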
Continuing the pattern above, the category denoted
As one last basic example in this specific sequence of categories, the category
Note that arrow
Preorders
A preorder is a category
Sets as categories
A category is discrete when every arrow is an identity arrow. In other words, it's basically just a set (of objects). For example, a discrete category with six objects might be visualized as below. As usual, the identity arrows are not shown.
On the other hand, for a given set
Note that both the discrete categories above, and categories such as
Groups as categories
For a given group
For example, if
Note here that I have chosen to include the identity arrow, since it corresponds to the identity element in
The association to each group
Matrices over a fixed commutative ring
For each commutative ring
For example, in the category
Note that composition is written algebraically right-to-left ("inside out"), so the composition of the two arrows above corresponds to the arrow labeled by the product of those matrices in the opposite (visual) order.
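The right-to-left convention can be checked on a small case. The sketch below assumes that an arrow from n to m is an m-by-n matrix, so that the composite "g after f" is the matrix product g·f; this convention, the integer entries, and the helper names `matmul`, `identity`, and `compose` are assumptions made for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("matrix shapes do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def identity(n):
    """The identity arrow on the object n: the n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def compose(g, f):
    """Composite 'g after f', written right-to-left as the product g * f."""
    return matmul(g, f)


f = [[1, 2, 0],
     [0, 1, 1]]   # an arrow 3 -> 2 (a 2-by-3 matrix)
g = [[2, 1]]      # an arrow 2 -> 1 (a 1-by-2 matrix)

gf = compose(g, f)  # an arrow 3 -> 1 (a 1-by-3 matrix)
```

Here `f` is drawn first and `g` second, yet the label on the composite is the product with `g` on the left, which is the "opposite visual order" described above. Trying `compose(f, g)` raises a `ValueError`, because that pair is not composable.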
This is the rare case of a category named after its arrows!
Opposite categories
For each category
Why consider such a category? We'll see.
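For a hands-on feel, here is a small sketch that assumes the usual definition of the opposite category: the same objects and arrows, with each arrow's domain and codomain swapped and the order of composition reversed. The class `FiniteCategory`, the helper `table`, and the arrow names are hypothetical and serve only this example.

```python
class FiniteCategory:
    """A finite category given by named arrows and a composition function."""

    def __init__(self, arrows, compose):
        self.arrows = dict(arrows)  # arrow name -> (domain, codomain)
        self._compose = compose     # function (g, f) -> name of 'g after f'

    def dom(self, a):
        return self.arrows[a][0]

    def cod(self, a):
        return self.arrows[a][1]

    def compose(self, g, f):
        if self.cod(f) != self.dom(g):
            raise ValueError("arrows are not composable")
        return self._compose(g, f)

    def opposite(self):
        # Swap domain and codomain of every arrow, and reverse composition.
        flipped = {name: (cod, dom) for name, (dom, cod) in self.arrows.items()}
        return FiniteCategory(flipped, lambda g, f: self._compose(f, g))


def table(g, f):
    """Composition for the example category x -f-> y -g-> z."""
    if f.startswith("1_"):
        return g
    if g.startswith("1_"):
        return f
    return {("g", "f"): "gf"}[(g, f)]


C = FiniteCategory(
    {
        "1_x": ("x", "x"), "1_y": ("y", "y"), "1_z": ("z", "z"),
        "f": ("x", "y"), "g": ("y", "z"), "gf": ("x", "z"),
    },
    table,
)
C_op = C.opposite()
```

In `C` the composable pair is "g after f"; in `C_op` the arrows point the other way, so the composable pair is "f after g", and it yields the same arrow `gf`, now running from `z` to `x`. Taking the opposite twice returns the original domains and codomains.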
Large categories
Most of the objects we encounter in math are the objects of some (large) categories. Below is a quick roundup of some with which you might already be familiar:
Category  Objects  Arrows 

sets  set maps (i.e., functions)  
sets with selected base point  basepoint-preserving set maps  
categories  functors  
monoids  morphisms of monoids  
groups  group homomorphisms  
abelian groups  group homomorphisms  
rings (with unity)  (unit-preserving) ring homomorphisms  
commutative rings (with unity)  (unit-preserving) ring homomorphisms  
left modules over the ring 

right modules over the ring 

topological spaces  continuous maps  
topological spaces  homotopy classes of maps  
topological spaces with selected base point  basepoint-preserving continuous maps  
Here we use the word "collection" (as opposed to "set") to allow for set theory technicalities, such as a "class" of objects. For our purposes you can assume the objects of our categories form a set. ↩︎
At least in the case these collections are sets. See below. ↩︎
You'll shortly see why I labeled the object with a number, as opposed to a letter. ↩︎
And I choose to denote the unique object with a star for flair. ↩︎