Unions, Intersections and Relative Complements
Finite, Infinite, Denumerable and Uncountable Sets
Set theory was developed in the second half of the Nineteenth Century. It has its roots in the work of Georg Cantor, although contributions of others such as Gottlob Frege and Giuseppe Peano were significant. Ultimately, the goal of Set Theory was to provide a common axiomatic basis for all of mathematics. In some sense, mathematics could then be reduced to logic. Attempts to provide an axiomatic basis for mathematics were undertaken by such prominent individuals as Bertrand Russell, Alfred North Whitehead, and David Hilbert.
A set is a collection of things of any kind. If B is a set we
call the "things" in B the elements or members of B. In symbols
meansthat
b
is an element of B. Similarly, for a set B the statement
means
that the object b is not in B. Sets themselves are often
symbolized by enclosing their elements within "curly brackets". For example
{Jimmy Carter, Ronald Reagan, George Bush, Bill Clinton} represents the
set of US presidents during the years 1980 to 1995. Note: The use
of { } , which is also employed as a grouping symbol, for sets could be
confusing, but generally the context makes what is intended clear.
Sets can also be described by a rule or predicate members must satisfy. If P(x) is the predicate "x was a president of the US during the years 1980 to 1995", then the set listed above could be symbolized as {x | P(x) }. This is rendered in words as "the set of all x such that P(x)". The "|" stands for the expression "such that", although some authors prefer ":" to mean the same thing. The use of curly brackets and a rule to specify sets is called set-builder notation and is used throughout mathematics.
Naïve Set Theory is based on the following two axioms.
1. Axiom of Abstraction (or Comprehension): Any
collection of objects that can either be listed or described by some predicate
constitutes a set. Furthermore, an object is a member of a set if and only
if it is one of the objects listed or satisfies the open predicate describing
the set. In symbols given any predicate, P(x), the set {x
|
P(x)}
exists and
.
2. Axiom of Extensionality: Given two sets A
and B, A = B if and only if
.
In
words, two sets are identical if and only if they have exactly the same
elements.
Let A = { Bob Jones, Felix the cat, Mars, 9 } and B = { 9, Mars, Bob Jones, Felix the cat}. By the application of the axioms A = B, since every member of one set is also in the other set. Thus, the order in which elements are listed is immaterial in defining a set. A set has no characteristics other than its status as a collection of its elements.
These two axioms seem very intuitive and self-evident. In fact, they seem almost devoid of content. You might suspect that they lack sufficient power to prove anything of importance. In fact, these two axioms are too powerful! You can prove absolutely everything from them because they are self-contradictory! We will discus this later when we outline Russell's Paradox. Suffice it to say that the problem lies in the Axiom of Abstraction's lack of restriction on the predicate P(x). Thus, by the Axiom of Abstraction, we are allowed to formulate things such as the set of all sets. By making our universe too big we are begging for trouble! In our use of Naïve set theory we will purposely avoid such constructions.
For a given predicate P(x), consider the
set
.
Suppose
for some a,
,
then by the Axiom of Extensionality,
.
Hence, by Indirect Proof, we've established that the set Q has no
members, i.e.,
.
This
set is called the empty Set or the null Set and is commonly
symbolized by either empty brackets { } or the symbol
.
The empty set goes by infinitely many aliases. For example, the following
sets are all different names for the one and only empty set. (Z
is the "standard" name for the set of integers.)
![]()
Note: The set
is
not the empty set. This is a set of sets which has an element, namely the
null set.
Some sets are contained entirely within other sets. For
example, the set of women is contained within the set of human beings.
We say that
is
a subset of
.
In
symbols,
.
The
formal definition of inclusion (A being a subset of B) is
stated as follows:
![]()
In words, given any two sets A and B, A is a subset of B if and only if every element of A is also an element of B.
Some authors use the symbol
instead
of
for inclusion.
In analogy to the symbols <and
,
we
will use
for
inclusion and will reserve
to
mean that A is a proper subset of B. This means that A
is contained in B, but is not all of B. In symbols, if A
and B are sets,
.
Since
isalways
true when p is false and since
is
always false, for any set A, it is true that
.
Hence, the empty set is a subset of every set,
.
From
the notion of subset we can characterize set equality by the following
theorem, given two sets A and B,
.
Proof: Assume A = B, then
by the Axiom of Extensionality every element in A is also in B
and every element in B is also in A, so
and
.
So
.
Assume
.
Pick
any x. If x is in set A, since
,
x
is also in B. If x is in set B, since
,
xis
also in A. So,
.
So
by the Axiom of Extensionality, A = B. So
.
Consider the set L = {a, b, c}. The full list of the 8 subsets of L is as follows:
,{a},
{b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}.
The set of all possible subsets of a given set A is called the power set of A, in symbols P(A). Thus, for the set L,
P(L)
= {
,{a},
{b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.
Unions, Intersections and Relative Complements
Given two sets A and B we can define the following binary set operations.
1. The union of A and B :
.
2. The intersection of A and B :
.
3. The relative complement of A in B
:
.
The union of two sets is everything in either one of them. It is the "sum total" of the two sets. The intersection of two sets is the "overlap", i.e., the part of the two sets which is common to both. If the intersection is empty, we say the two sets are disjoint. This means they have no elements in common. In probability theory two disjoint events are called mutually exclusive, which means one event happening excludes the possibility of the other having happened. The relative complement of A in B are the elements in B without the ones also in A. One might read this as B "minus"
A or B "without" A. For example, if A = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and B = {0, 2, 4, 6, 8, 10, 12, 14, 16}, then
.
From the definitions of intersection, union and inclusion,
we can immediately establish that for any sets A and B,
,
,
and
.
Applying the indicated tautology or contradiction along
with the set operation definitions establishes each of the following set
operation identities for any sets A,
B, and C.
| Tautology | Set Operation Identity |
| Contradiction | Set Operation Identity |
The above identities serve as the basis for an "algebra of sets".
We can now prove the following two theorems:
1.
2.
Proof:
1.
.
Now consider any x in A. Since
,
x
is also in B. Therefore x is in
.
Hence,
.
Since
,we
have
.
2.
.
Now consider any x in
.
Then x is either in A or B, but since
,
if x is in A, it is also in B. Therefore x
is in B. Hence,
.
Since
,we
have
.
As was noted earlier, allowing the scope of sets to become
too large (such as the set of all sets) can lead to contradictions. This
can often be avoided by imagining that all the sets under consideration
are subsets of some finite universe. Some authors use U
for such a set, but we will call it S to avoid confusion with the
set operation of union. S is also the symbol used in probability
for the universe of all possible outcomes of an experiment. In this context
S
is called the Sample Space. A common usage is to abbreviate
by
.
This of course is a more concise notation. It also reflects the definitions
of Boolean
Algebra, where multiplication, as indicated by the juxtaposition of two
variables, is defined as set intersection and addition is set union. Working
within the restriction of a universe S, allows us to define the
complement
of a set A as the relative complement of A in S.
![]()
Since the complement of A is that part of "everything"
(i.e., S ) that's not A , we have the interpretation
of complement as negation. In symbols,
is
"like"
.
Assuming that the sets A and B are subsets of S and using the algebra of sets we can establish the following:
;
;![]()
| Set Identity | Interpretation |
| What's not part of everything is nothing | |
| What's not part of nothing is everything. | |
| Not (not A) is A. | |
| A and not A is always false. | |
| B but not A is B without A. | |
| DeMorgan's Rule for negating an OR. | |
| DeMorgan's Rule for negating an AND. | |
| A is present when either A is present with B or A is present without B. | |
| Either A is in the universe (S) or it is not (EMI). |
Set operations in a universe S can be viewed graphically by a Venn diagram. Here the rectangle represents S and any subset of S is drawn as a closed figure within this rectangle. This is illustrated below.
Venn Diagram Showing the Intersection and Union of Sets A and B
The most basic numbers are the natural numbers ("counting" numbers or positive integers). While there are ways to construct the natural numbers using sets, such a development is beyond the scope of these notes. Furthermore, this construction is not intuitive and strikes many people (mathematicians included) as artificial. Thus, we will consider the natural numbers as undefined and primitive objects. This approach can also be taken to extremes as seen in the work of Leopold Kronecker, who declared "God created the integers, all else is the work of man". The set of natural numbers is often symbolized as either Nor Z+. If the ellipsis is understood to represent the same sequence repeating without end, then N = { 1, 2, 3, 4, 5, 6, 7, … }.
After much resistance zero was eventually accepted as a "valid" number. The set of whole numbers is given by
W =
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, … } = union of N
with
{ 0 }. The negative of a natural number, n, is its opposite
,
so
that
.
We define the set of integers as Z
=
.
Since not every mathematical process (for example, measurements with a rule) results in an integer answer, we need to consider numbers that are fractions or ratios of integers. Assuming that the process of division is understood, we can write
.
For integers this can be interpreted as repeated subtraction
just as multiplication can be interpreted as repeated addition. Thus, 34
divided by 5 is 6, remainder 4. This means that we can subtract 5 from
34 six times before we get a remainder of 4 less than 5. Essentially, thecomputation
tells us how many 5's are in 34. There are six 5's in 34 with 4 left over.
This is one reason why division by zero is meaningless. How many 0's are
in 34? How many 0's could we subtract from 34 till we get a remainder less
than 0? So when we write
,
we assume that q is not zero. We therefore define the set of rational
numbers as the set of all permissible ratios of integers. In symbols, the
set of rationals is given by
Q
=
.
A
second way to characterize the set of rationals is
Q = {x | x has a terminating or repeating decimal representation }. In fact this definition can be shortened as just the set of repeating decimals. Any terminating decimal is a repeating decimal with a repeating string of 0's. For example, 3.458 = 3.458000… . One problem with defining rational numbers this way it that the decimal representation of some rational numbers is not unique. Consider the decimal 0.49999… , where it is understood that the string of 9's never stops. This is the rational number one half which is also represented by the terminating decimal 0.5 . This redundancy in representation exits for every rational number that is a terminating decimal. To understand the equivalence of these forms requires a familiarity with limits and infinite geometric series. These concepts are beyond the scope of the present notes. However, the following computation may make the result at least seem plausible.

Since we would like our decimal representation of rational numbers to be unique, we will assume that all numbers that end in an infinite string of 9's have been rewritten as the corresponding terminating decimal.
Still there are processes whose answers can not be rational. Pythagoras is said to have known that the square root of 2 is such a number. A number that is not rational is called irrational. This certainly does not mean that the number is "deranged" or even illogical, it simply means that the number can not be expressed as a ratio of integers. Other Greek mathematicians such as Theodorus, were aware of additional irrational quantities.
The proof that
is
irrational was one of the first examples of indirect proof. It assumes
some familiarity with arithmetic and is presented below.
Assume that
,
where P and Q are both positive integers without any common
factors. If P and Q were not originally in lowest terms,
we could remove as many common factors as necessary until we arrived at
a rational representation of
that was in lowest terms. So without any loss of generality, we will assume
that P and Q have no common factors. Now, by definition of
the square root,
.
So
P
must be even since it has an even square. Hence,
P = 2M for some natural number M.
Substituting this into the expression for P squared yields
.
So Q must be even since it has an even square. Hence, 2 divides
both P and Q in contradiction to the assumption that P
and Q have no common factors. Therefore, it is impossible to express
as
a rational number.
In fact, using the Fundamental Theorem of Arithmetic, we can show that the n'th root of any natural number that is not a perfect n'th power must be irrational. Let a be a natural number that is not a perfect n'th power. This means that there is no natural number m with the property that mn = a .
Now suppose
,
where P and Q are natural numbers with P > Q
>
1 . We know Q > 1, since Q = 1 implies that Pn
is a. As before we can assume that P and Q have no
common factors. Now by the Fundamental Theorem of Arithmetic, P
and Q can both be written as unique products of prime factors. Specifically,
if the J prime factors of P are p1, p2,
… pJ, and K prime factors of Q are q1,
q2,
… qK , then none of the prime factors of P match
any of the prime factors of Q.
![]()
By the definition of the n'th root, we have
.
Now, the prime factors of Pn are the same as the prime
factors of P, i.e.,
p1, p2, … pJ, and the prime factors of Qn are the same as the prime factors of Q . Only the multiplicity (how often each factor occurs) has changed.
![]()
However,
so
we have a second factorization of Pn , which involves
different prime factors. This is a contradiction of the Fundamental Theorem
of Arithmetic. So for a not a perfect n'th power,
is
irrational.
By taking the union of the set of rational numbers with the set of irrational numbers we form a bigger set, R, called the set of real numbers. It is convenient to represent the real numbers geometrically as points on a "number line". This is illustrated below.

Often we want to work with subsets of real numbers that correspond to intervals on the number line.
The open
interval
from a to b does not include the endpoints;
.
The closed
interval
from a to b does include the endpoints;
.
The semi-open or semi-closed intervals from a to b include just one endpoint;
and
.
The entire real number line can be represented using this
notation;
R.Here
the symbol
is
"infinity" and is not any kind of real number. It is used in this
context to mean that the interval has no bounds.
An ordered pair is a pair of objects in which (unlike a set) order makes a difference. We represent ordered pairs by (a, b) . This use of parentheses reflects an earlier time when the choice of typographical symbols was limited. Unfortunately, it is the third use of parentheses we've encountered! The earlier uses were as a grouping symbol and to designate an open interval of real numbers. Reluctance to change notation means that we'll have to rely on context to distinguish which use of parentheses is intended. In (a, b) a is called the first component and b is called the second component. The main idea of an ordered pair is that for two ordered pairs to be equal their components must match.
![]()
In fact, the relation (a, b) can defined as the set of sets { {a}, {a, b} } , which ensures the above property.
A set of ordered pairs is called a relation. In symbols this can be written as follows:
![]()
Here, in the second statement we have introduced an abbreviated form of set builder notation in which we designate immediately after the curly brackets that this will be a set of ordered pairs. The nature or conditions of the relation are given by the predicate xRy. For example, let xMy be the statement "y is the mother of x". Then the relation M is a set whose elements are of the form (child, mother). We might say this relation defines the "relationship" of motherhood. In fact, this is why the name relation was chosen to designate a set of ordered pairs.
The set of values of the first component of the ordered pairs in a relation is called the domain of the relation and is designated as Dom(R). The set of values of the second component of the ordered pairs in a relation is called the range of the relation and is designated as Rng(R). Here, R means the given relation, not the set of real numbers.
Dom(R) =
Rng(R) = ![]()
In the relation M, Dom(M) is the set of all individuals who were someone's child, while Rng(M) is the set of all individuals who are mothers. Undoubtedly, the intersection of these two sets is not empty!
Given any relation, we can define its inverse by reversing each ordered pair. The inverse of M, usually designated by M-1 would be the set of all (mother, child) pairs. In general, we define the inverse of a relation as follows:
![]()
From this it is obvious that the Dom(R-1)= Rng(R) and Rng(R-1) = Dom(R).As a second example, if R were the relation consisting of all (names, phone numbers) in a community (i.e., the contents of a "phone book"), then R-1 would be an "inverse phone book" with entries of the form (phone numbers, names).
So far we have allowed the predicate relationship xRy to define the relation and its domain and range. An alternate approach is to start with two sets A and B and define the relation (and hence the relationship) as all possible pairings of elements of A with elements of B. This is called the Cartesian (after René Descartes) cross product of A with B.
![]()
From this definition we see
that
.
Thus in general, the Cartesian product is non-commutative,
i.e.,
.
The set
R xRcommonly
called R2
is the set of all ordered pairs of real numbers. Just as Rrepresents
all the points on a line, RxR
represents all the points in a plane. Thus RxR
is
referred to as the Cartesian plane.
The schematic below graphically represents a relation between two sets A and B. From the 10 connections drawn between the two sets we can list the 10 ordered pairs in R. Thus, each connection represents a "relationship" between an object in the domain and an object in the range. Note: In a relation, a given element in the domain can connect to more than one element in the range and vice versa.

A relation R is called reflexive on a set
A
if and only if
.
A relation R is called symmetric if and
only if
.
A relation R is called transitive if and
only if
.
Note:
1. If R is symmetric, then Dom(R) = Rng(R). To prove this, let x be any element of Dom(R), then there must be some
y in Rng(R) with xRy. Since R is symmetric, we must also have yRx. Hence, x is also an element of Rng(R) and
.
Now let y be any element of Rng(R), then there must be some
x
in the Dom(R) with xRy.
Again since R is symmetric, we must also have yRx.
Hence, y is an element of Dom(R) and
.
2. If R is symmetric and transitive, then R is reflexive on Dom(R). Since R is symmetric, we have Dom(R) = Rng(R).
Let x be any element of Dom(R), then there is a y in Dom(R) with (x, y) in R. Since R is symmetric, we have both
xRy and yRx. Since R is transitive, xRy and yRx implies xRx. Hence, for all x in Dom(R), we have xRx.
If a relation is reflexive, symmetric, and transitive, it is called an equivalence relation on the set A. The prototypical equivalence relation is equality, which obviously has all three properties. A slightly different example of an equivalence relation is "x weighs the same as y", which of course does not assert that x and y are the same object. It merely states that when the objects are weighed the measurements are equal. A third example of an equivalence relation is "x has the same shape as y". This is the geometric equivalence relation called similarity.
The significance of an equivalence relation on a set A is that it allows us to partition A into disjoint subsets all of which have the same property. For example, we could partition a set of people into "equal weight" groups, or the set of all possible triangles into subsets based on similarity (shape).
Often we are interested in relationships between two sets of objects in which the choice of an object in one set completely and unambiguously determines the related object in the second set. For example, if a TV is working properly, the choice of channel number should unambiguously determine one and only one picture! We often label the first set, that controls the choice of objects in the second set, as the input set or the independent variable set. The second set is then called the output or dependent variable set. The idea of these terms is that there is "some freedom" in the choice of the independent variable (like the TV channel), while the output seen depends on this choice. In such a determined relationship it is customary to state the independent variable as the first component (it's selected first after all), and the output as the second component. The key restriction is that for a given input, there must be one and only one (i.e., a unique) output. The input determines the output, so we should only see one response. Thus, in this kind of relation, an ordered pair with a given first component can only appear once! Such a relation is called a function.
A relation f is a function if and only if
.
Since the second or output component of a function is completely determined
by the first or input component, it is common practice to use the following
notation in defining a function f :
.
Here, f (x) is read "f of x" and designates the unique element
in the Rng(f ) that is associated with the input
x from the
Dom(f ). We can picture a function as a "machine" or "black box"
that converts the input x into an output designated by f
(x) . We then think of the function not so much as a set of ordered
pairs, but rather a process that produces a unique response based on the
initial input. This terminology is also used in statements like "the health
of the nation's economy is a function of the level of productivity", which
is interpreted to mean that the level of productivity determines the health
of the economy. Labeling certain buttons on a calculator as "function keys"
means that using these buttons along with the appropriate input generates
a particular and unique output. Sometimes in mathematics we can specify
the process that converts input into output through a formula. For example,
if Dom(f ) = R and we
have the formula
, we
generate ordered pairs in RxR
by applying this formula to each real number x .
The formula can be applied to any input as is illustrated by the following:
The last four examples illustrate that when we write a formula for a function, f (x) = …x… , the variable x is a "dummy" and just stands for the input value. The essence of the process, which is the function, stays the same for any choice of the independent variable name.
For
,
the Rng(f ) =
is
not all of R.
A second notation for functions is
.
Here A is the domain of f, and the "action" of the function
is to map or transform (the
symbol,
which in this context does not mean implication!) elements of this
domain into elements of the set B. The notation of a function
"of a set", f (A), means the range of f when the domain
of f is A.
f (A) =Rng(f ) =
.
In general, the range is only a subset of B. If the Rng(f
) = B, we say the function is onto or surjective.
This is expressed in symbols by
.
To be more precise,
.
In the example,
,
we would write f : R
R,
the function is into R,
but not onto R.
What makes a function a function is that for every input in the domain there is one and only one output in the range.
This means a particular first component or input occurs in one and only one ordered pair in the complete set of ordered pairs that constitutes the function. If in addition, we have that every second component or output occurs in
one and only one ordered pair in the function, we say
that the function is 1-1 or injective and write
.
This means that every output in the range is generated once and only once
by its own particular input. Hence, if the outputs match, then the inputs
must match. This gives us the following definition:
.
Equivalently a function is 1-1 if and only if
.
In the TV example discussed earlier, the function that
converts channels into pictures is usually not 1-1. It is quite possible
that the same picture is being shown on different stations. If the president
is addressing the nation, I don't conclude my set is broken just because
the same picture pops up on different channels. The picture is still a
function of channel, it's just not a 1-1 function. The function given by
is also not 1-1, since we get the same output for both a given number and
its negative.
One important property of 1-1 functions is that the inverse relation f -1 is also a function. Recall that the inverse of a relation is set of ordered pairs obtained by switching the order of all pairs in the original relation. For a function this means switching input with output. The domain of f -1 is the range of f and the range of f -1 is the domain of f.
![]()
Suppose for some x, y, and z that (x, y) and (x, z) are all in f -1. Then (y, x) and (z, x) are all in f , i.e., f (y) = x and
f (z) = x. Since f is 1-1, it must be that y = z . Hence, we have established that
, so that f-1
is a function.
Since f -1 is a function, we can write the relation in two different, but equivalent ways.
![]()
Thus, we have the following equivalence that for any 1-1 function f .
![]()
Thus, for any x in the range of a 1-1 function f,
,
and for any x in the domain of a 1-1 function f ,
.
Consider the function defined from set A to set B by the four ordered pairs shown below.
This is certainly a function since no element in the domain A is mapped to more than one element in the range. However, this function is not 1-1 since both x1 and x2 in the domain get mapped to the same element, y5 in the range. The function is also not onto since there are elements in B, y2, y4, and y6 , which are not in the range of f .

Consider the function defined from set A to set B by the four ordered pairs shown below.
This is a 1-1 function since every element in the domain A connects to one and only one element in the range, and every element in the range is connected to only one element in the domain. Since the range does not include the elements y6 and y4 of B, the function is not onto.

Note: Since f is a 1-1 function, an inverse function f -1 exits. Its domain is not B, since f was not onto.
Consider the function defined from set A to set B by the four ordered pairs shown below.
This is a function since every element in the domain A connects to one and only one element in the range. However, this function is not 1-1 since both x1 and x4 in the domain get mapped to the same element, y3 in the range. This is an onto function since there are no elements in B which are not in the range of f .

A function from set A to set B that is both 1-1 and onto
is called a bijection.
If f is a bijection, then not only is Rng(f ) = Dom( f-1) = B, but the Rng(f -1) = Dom(f ) = A. In a bijection each element of A is matched with one and only one element of B, and each element of B is matched with one and only one element of A. It's like a game of "musical chairs" where there are exactly the same number of chairs as people. Each person gets a chair and each chair gets a person. The two sets are in a 1-1 correspondence and have exactly the same number of elements.
Consider the function defined from set A to set B by the four ordered pairs shown below.
This is a 1-1 function since every element in the domain A connects to one and only one element in the range and every element in the range connects to one and only one element in the domain. This is also an onto function since there are no elements in B which are not in the range of f . Thus, this function is a bijection.

Note: Since f is a 1-1 function, an inverse function f -1exits. Since f is an onto function the domain of f -1 is B.
An example of a function which is a bijection from any set unto itself is the Identity Function, I(x) = x . This function is called the identity function because it returns whatever was put in. The output is identical to the input.
For any set A,
.
Often the output of one process becomes the input of a new process.
To model this mathematically we need to apply two or more functions in
succession. This is called composition and is symbolized by an "open circle"
between the two function names. To be more precise we state the following
definition: given sets A, B, and C, and functions f, and g with
,
the composition of "g with f " is a function
whose
output on the Dom(f ) is characterized by
.
We have the following three results.
1. Compositions of onto functions are onto.
Let c be any element of C, then since g is an onto
function from B to C, there is a b in B with
g(b)
= c. Now, since f is an onto function from A to B,
there is a a in A with f (a) = b . Hence,
there is an a in A with
.
2. Compositions of 1-1 functions are 1-1 .
Suppose for some a in A and b in A that
.
Now
and
and g is 1-1 from B to C, so
.
However, f is 1-1 from A to B, so a = b
. Hence,
.
3. Compositions of bijections are bijections.
This follows from the combined application of 1 and 2.
Finite, Infinite, Denumerable and Uncountable Sets
As we have seen, two sets A and B have exactly the same
number of elements if there exits a bijection or one-to-one correspondence
between them. We formalize this observation with the following definition.
Two sets A and B are said to be cardinally equivalent, in
symbols,
,
if and only if there exits a bijection between them.
![]()
A set A is finite if and only if either A is empty or,
for some
N
, the set {1, 2, 3, … n}
A.
The natural number n is called the cardinality of A
or Card(A) and is simply the number of elements of A. If
A
is empty, Card(A) = 0.
Note: This definition recalls the most basic of mathematical operations: counting. As we count, we go through {1, 2, 3, … n} setting up a one-to-one correspondence with the objects being counted.
Cardinal equivalence is an equivalence relation. To demonstrate this, we need only find or construct an appropriate bijection.
Reflexive: Let f (x) = I(x)
= x , the identity function. For any set A,
.
Symmetric: If
,
there is a bijection f from A to B. Then the function
f-1
is a bijection from
B to
A.
Transitive: If
and
,
there are bijections
f and g with
and
. Then
the composition
is
a bijection from A to C.
In general for a finite set A with n elements, A = {e1, e2, e3, e4, e5, … en}, P(A) has 2n elements. More formally, if Card(A) is n, then Card(P(A)) = 2n. We can verify this formula using mathematical induction.
Base Case: n = 0 (Technically, 0 is
not
a
natural number, but establishing this base case combined with the Induction
Step will establish the desired theorem.) If n = 0,
A =
.
Suppose
, then
.
If
,
then
, which
is a contradiction.
.
Hence,
.
Therefore,
has
only 1 = 20 elements.
Induction Step: Assume that any set of n elements has 2n elements. Now consider any set of the form
A = {e1, e2, e3,
e4, e5, … en, en+1}.
Let
, then
either
. If
,
then x must be one of the 2n subsets of {e1,
e2, e3, e4, e5, … en}.
If
, then
let
. Every element
of y must then come from the set {e1, e2,
e3, e4, e5, … en}, i.e.,
then y must be one of the 2n subsets of {e1,
e2, e3, e4, e5, … en}.
Hence, there are 2n different choices for x with
.
Thus, the total number of choices for x is 2n
+ 2n =2n+1.
What if there is no subset of N which has a bijection to A? In that case we say the set A is infinite and the Card(A) is not a natural number. Stated differently, a set A is infinite if and only if it's not finite. The set Nis an infinite set. This result may seem intuitively obvious, but it still requires a proof based on the definitions. We could use Mathematical Induction and indirect proof to show that for every natural number n, there does not exist a bijection from {1, 2, 3, …, n } to N . For the sake of brevity, we will not work through this argument. A set A which is cardinally equivalent to Nis called denumerable or countably infinite.
Galileo
seems to have been the first person to claim that the set of even numbers
E
= {2, 4, 6, 8, …}, despite that it has "only half" of the elements in N,
has the same number of elements as N,
i.e., E
N
.
This result follows from the fact that the function f (x)
= 2x is a bijection from N
to E. For any non-zero m, a linear function of the
form,
f (x) = mx + b, is a bijection from N tothe infinite arithmetic sequence {b + m, b + 2m, b + 3m, b + 4m, … }. We therefore conclude that all such sequences are cardinally equivalent to N.
Even infinite geometric sequences like {10, 100,
1000, 104, 105, 106, 107 ,
… }, despite their "sparse" appearance, are cardinally equivalent to N.
This is true, since for
,
the function
is
a bijection from N to
such sets.
What can we say about infinite sets that include
all of the natural numbers? Are they "bigger" than N
? The following bijection from N
to Z shows that N
Z.

Arrange all the positive rational numbers P/Q
in a table with the numerator P labeling the columns of the table
and the denominator Q labeling the rows. Since the desired map from
N
to Q +is
to be 1-1 it is necessary to remove all multiple occurrences of the same
rational numbers. Thus, all unit fractions such as 2/2, 3/3, etc. are "struck
off" the table since they equal 1/1. In a like manner any fraction equivalent
to a fraction in an earlier row is removed. After this process is complete
(it will take forever!) we have each element ofQ
+ listed once and only once in the table. But the positions
of these entries can be indexed by a single natural number by moving diagonally
through the table and "counting off" as we go from each listed entry to
the next listed entry. [A slightly different form of this argument would
be to use the movement along the diagonal to "count" all the elements of
N
xN
(i.e.,
N
xN
N).
We then argue that, since the elements of Q
+ can be constructed from a infinite subset ofN
xN,
Q
+
N
. However, this approach anticipates definitions and arguments that we
have yet to elaborate.]

In the function f which is the bijection from N to Q +;
Now we can show that N
Q.
The argument is almost identical to the demonstration that N
Z.
Let f be the bijection from N
to Q +shown
above. Then the function g defined below is a bijection from from
N
to Q.

Cantor designated the Card(N)
as
(aleph
naught, aleph being the first letter of the Hebrew alphabet). Cantor believed
that
, while not
a number in the ordinary sense of the word, represented the "smallest kind
of infinity". He called
the
first transfinite cardinal.
In order to compare cardinalities of various sets, we introduce the following notations for two sets A and B.
and
.
Thus,
,
while
.
The idea is that, if a 1-1 map from A to B exists, then B must have at least as many members as A. Furthermore, if B has at least as many members as A, but a one to one correspondence from A to B is impossible, then B must have "more members" than A.
The similarity of
and
to the
ordinary symbols
is intentional. For any real numbers a, b, and c,
we have the transitive property that if a < b and b
< c , then
a < c . For
and
we have
the following results which seem almost intuitively obvious based on "everyday"
notions of size and comparison. With the exception of property 6, the arguments
are pretty straightforward. They soon, in fact, seem identical! Once you
have understood say the first seven, you can probably skim the rest. Property
10
is needed when we compare Card(R)
to Card(N).
1. For any sets A and B,
.
Suppose A is a subset of B, let f be the identity function, i.e., I(x) = x, and Dom(f ) = A, then f (A) = Rng(f ) = A, which is a subset of B. Since I is 1-1, we have a 1-1 function from A to B .
2.
.
If
we can find 1-1 functions f and g with
and
,
then the composition of g with f is 1-1 from A to
C.
3.
.
If
we can find 1-1 functions f and g with
and
,
then the composition of g with f is 1-1 from A to
C.
4.
.
Suppose
.
Then there is a 1-1 function f with
.
Now, pick any x in A. Since A is a subset of B,
x
is also in B and so f (x) is in C. Since f
is
1-1on B, it must also be 1-1 on A. So we have
.
5.
.
Suppose
.
Then there is a 1-1 function f with
.
Now, pick any x in A. Then f (x) is in B.
Since B is a subset of C, f (x) is also in
C.
So we have
.
6.
.
If
we can find a 1-1 function f with
and
,
so
and
.
The other part of the theorem,
is a result proved by Cantor, Felix
Bernstein, and Friedrich
Schröder and is known as the Cantor-
Schröder-Bernstein (CSB) Theorem. It's proof is rather
complex and technical, but also extremely clever and maybe even beautiful.
It involves using the generalized union of sets of infinite sequences and
is beyond the scope of these notes. The interested reader should consult
the references.
7.
.
Suppose
.
Then we can find 1-1 functions f and g with
and
.
The composition of g with f is 1-1 from A to C,
so
. As an
indirect proof, suppose now that
,
then
. Since
,
we have
,
hence by property 3,
.
But
implies
that
, so by the
CSB Theorem
.
This contradicts
,
so it must not be true that
.
8.
.
Suppose
.
Then we can find 1-1 functions f and g with
and
.
The composition of g with f is 1-1 from A to C
, so
. As
an indirect proof, suppose now that
,
then
. Since
,
we have
, hence
by property 3,
.
But
implies
that
, so by the
CSB Theorem
.
This contradicts
,
so it must not be true that
.
9.
.
Suppose
.
Then we have
and
so from property 4, we have
.
As an indirect proof, suppose now that
,
then
. Since
,
by property 3 we have
.
Since
and property
1
imply that
,
by the CSB Theorem
.
This contradicts
,
so it must not be true that
.
10.
.
Suppose
.
Then we have
and
so from property 5, we have
.
As an indirect proof, suppose now that
,
then
. Since
,
by property 1 we have
.
Hence,
by
property
3. The CSB Theorem then implies
.
This contradicts
,
so it must not be true that
.
If a set A has the property that
N , we say that A is countable.
A set A is said to be uncountable if and
only if N
A
.
We are now ready to show that there are uncountable sets
"bigger" than N. The
following proof is due to Cantor and presents his famous diagonal
argument that N
R. First, consider the function
.
This is certainly a 1-1 function from N
to the open interval (0, 1). We will show that no such 1-1 function can
be onto.Let g be any 1-1 function from N
to the open interval (0, 1). Let
designate the elements of the range of g. Each of these a's
is a real number between zero and one. If we agree that all rational numbers
that are terminating decimals be represented only by their repeating string
of 0's and not by a repeating string of 9's, the output of g can
be represented uniquely by the decimal expansion of each an.
As is shown below, let an, j
be the j'th decimal digit of an = g(n). Now pick
two different decimal digits, say 3 and 5. Define the following function
h(n)
and let bn = h(n) for every natural number
n.

Then b is a real number in the open interval (0, 1) whose decimal expansion consists only of 3's and 5's. Furthermore, at decimal place n, bn fails to match the n'th decimal digit (an,n) of an .

For any natural number n, the decimal expansion of b does not match the decimal expansions of any of an. Therefore, the number b is not an element of the range of g. Thus, no 1-1 map from N to the open interval (0, 1) can be onto.
Hence, N
(0,
1). Since the open interval (0, 1) is a subset of R,from
property 10 we conclude N
R. Thus, the set of real numbers is uncountable.
Note: You might think we could solve the problem in the above proof by simply including the derived number b in the range of the 1-1 function g. This wouldn't solve the problem. Because now we'd be working with a "new" g, and then we'd construct a "new" b that would not be in the range of this function. There are "just too many" real numbers to count! Any scheme that tries to count them all is guaranteed to miss some.
Surprisingly, set size as measured with cardinality is
very different than geometric measures of length or area. For any pair
of real numbers a and d, with a < d, the
linear function
is
a bijection from (0, 1) to the interval (a, d). Thus, (a,
d)
(0,
1). Since (0, 1) is uncountable, so is (a, d). The geometric
length of (0, 1) is 1, while the geometric length of (a, d)
,
which ranges from the "indescribably small as a approaches d
to the incomprehensibly large as d is chosen arbitrarily far from
a.
Yet, never the less, (0, 1) and (a, d) have the "same number"
of elements.
Using some properties of rational functions, we could
show that the following function is a bijection from the open interval
to
.

An even more unbelievable result is that ![]()
![]()
.
The first set is one-dimensional and has no geometric area, while the second
set is two-dimensional and has an area of 1.
To see the cardinal equivalency consider the function
f
from
to
defined
by the following "coding" scheme. Given any number in (0, 1) it has a unique
decimal expansion (provided we convert all trailing strings of 9's into
terminating decimals). Let
. The output of f applied to a is the
ordered pair
.
First,
this function is 1-1, since there is an inverse "decoding" in which we
use alternate decimal digits from the first and second components of the
output to reconstruct the decimal expansion of the input. Second, given
any ordered pair (b, d) in
,
using the decoding scheme allows us to reconstruct the number a
in (0, 1) with f (a) = (b, d), so the function
is onto.
It might now seem that Ris
the biggest possible set. Cantor, however, showed how starting with any
set, we can get a set with larger cardinality. Specifically, for any set
A,
A
P(A)
.
Proof. First consider the function, T(x)
={ x }, which given an object forms the set with that object and
only that object as an element (the "singleton" of x). Then for
any a in A, T(a) = {a} is a subset of
A,
so T(a) is in P(A).
Now suppose T(a) = T(b), .i.e., {a}
= {b}. Since two sets are equal if and only if they have the same
elements, we conclude a = b. So we have
P(A)
. Now consider any function f with
P(A).
Consider the set B,
.
Now since every element in B is also in A, we have that B
is a subset of A. Hence, B is an element of P(A).
Now, can we find x in A, with f (x) = B
?
Assume we've found an x in A with f (x) = B. Now, is this x in B? If x is in B, then x must be in A and x is not in the set given by f (x) = B. That is, since x is in A, if x is in B, then it is not in B! Thus, if x is in B we get a contradiction. Therefore, it must be the case that x is not in B. However, since x is not in B, x is not in f (x). That is, we have x in A and x not in f (x), which is precisely the membership criteria for B! Thus, x is both not in B and in B. Another contradiction! We must therefore conclude that there is no x in A with f (x) = B. Thus, no 1-1 map from A to P(A) can be onto.
Thus, there is no largest set! By forming the power set we can always make a larger one. In particular,
N
P(N) and R
P(R).
Both of the sets R and P(N) are uncountable. Which is larger? Again Cantor provided the answer.
Theorem: R
P(N
)
.
Proof. In analogy with the decimal representation every
real b number in the open interval (0, 1) can be represented by
a binary number, b = 0.b1b2b3b4b5b6b7b8…
bn,
where each bn is either 0 or 1. As with decimal numbers
we will assume that any repeating binary number with a trailing string
of 1's has been rewritten as a terminating binary number. Hence, the binary
representation of the real numbers in (0, 1) is unique. We will now generate
another coding scheme which codes every subset of N
into
a binary number in [0, 1]. Let A be any subset of N,
then we can construct the function f that maps A into the
binary number 0.a1a2a3a4a5a6a7a8…
an…,
with an = 1 if n is in
A, and an
= 0 if n is not in A. If A is empty f (A)
= 0, and if A is N, f (A) = 1. The function
f
is certainly 1-1 from P(N
)
to [0, 1], since if two subsets of N
map to the same binary number, they must contain exactly the same natural
numbers. Let b be any real number in (0, 1), then let B be
the subset of N with
n
in B if and only if bn = 1. By construction
f
(B)
= b, so f is an onto function. So we have established that
P(N
)
[0, 1]. Since
the interval [0, 1]
R,we
have R
P(N
).
The symbol c is sometimes used for the Card(R).
Since Card({1, 2, 3,…n} ) = 2n, the result R
P(N
)
is often written symbolically as
.
Cantor made the conjecture that there is no set A,
whose cardinality is between the natural numbers and the real numbers.
More precisely this assertion, known as the Continuum
Hypothesis, says that there does not exist a set A with the
property that N
A
R
.
In the year 1900, at the dawn of a new century, David Hilbert, a passionate advocate of Cantor's work, addressed the Second International Congress of Mathematicians in Paris. He presented 23 famous unsolved problems for consideration. The first problem on the list was to establish the truth or falsehood of the Continuum Hypothesis. Hilbert presented in his speech what was almost an article of faith among mathematicians, namely, that if a mathematical proposition is true, it must either be an axiom or theorem of mathematics. According to this view all true mathematical statements can be proved.
To quote Hilbert ,
"The great importance of definite problems for the progress of mathematical science in general ... is undeniable. ... [for] as long as a branch of knowledge supplies a surplus of such problems, it maintains its vitality. ... every mathematician certainly shares ..the conviction that every mathematical problem is necessarily capable of strict resolution ... we hear within ourselves the constant cry: There is the problem, seek the solution. You can find it through pure thought..."
In 1902 while Gottlob Frege was preparing to publish his major opus on the foundations of mathematics representing his life's work, he received a letter from the young English philosopher Bertrand Russell. After reading this correspondence, Frege wrote in the introduction of his book,
"A scientist can hardly meet with anything more undesirable than to have the foundation give way just as the work is finished. In this position I was put by a letter from Mr. Bertrand Russell as the work was nearly through the press."
What had happened? Inspired by the work of Cantor and
others, Bertrand Russell considered a construction reminiscent of the set
B
in Cantor's proof that A
P(A).
Specifically, he considered the set of all sets that are not elements of
themselves, i.e., the set of all collections that are not self-contained.
Let
. This
set is blatantly self-referential, but it is not necessarily unreasonable.
For example, a set of dogs is in W, for a set of dogs is not a dog!
In any case, the Axiom of Abstraction says we can talk about W.
The big question we arrive at now is the following: is
?
Well, if
, then
by the Axiom of Abstraction,
.
Similarly, if
,
then
. We
get a contradiction either way! There's no way out. Naïve set theory
is inherently self-contradictory!
Almost immediately mathematicians and logicians tried to "fix" set theory by various modifications of the Axiom of Abstraction that seemed to be the source of the trouble. Russell and Whitehead in their book, Principia Mathematica, developed a "theory of types" which did not even allow the question, "is a set an element of itself ", to be posed. Other mathematicians developed alternate axiom schemes for set theory that did not allow Russell's Paradox to occur. The most widely accepted are based on the work of Ernst Zermelo, Adolf Fraenkel , and Thoralf Skolem. This system of axioms is called ZF and many mathematicians feel that it (or perhaps it with the Axiom of Choice added) contain all of the formal methods required for mathematics. Some would even go so far as to say that a proof of a theorem is valid if and only if it could, at least in principle, be formulated and proven in ZF.
In 1930 Kurt Gödel completely turned over the apple cart! He showed that all of the systems of set theory developed to avoid Russel's Paradox could not be proven to be consistent. That is, using methods acceptable to mathematicians, he was able to show that there is no guarantee that there are not other paradoxes, just like Russell's Paradox, which would render all of these set theories self-contradictory! In fact, the details of his proof are even more perplexing. For what Gödel really demonstrated was that if these set theories are consistent (no contradictions possible), then they must be incomplete. This means that there are well-formed formulas in these axiomatic systems which can neither be proved nor disproved using the rules of these axiomatic systems. For this reason, Gödel's result is called Gödel's Incompleteness Theorem. A statement that can neither be proved nor disproved within an axiomatic system is called undecidable. Gödel in essence showed that any axiomatic system with sufficient "power" to encapsulate the methods of mathematics must of necessity contain undecidable statements. The situation is again reminiscent of Cantor's theorem that the real numbers in the open interval (0, 1) are uncountable. Adding the number b developed by the Cantor "diagonal slash" construction to the list of real numbers does not make the real numbers countable, since now a "new" b could be generated which is not on the list. In an analogous fashion, adding either an undeciable statement (or its negation) to the list of axioms to make a "new" axiomatic system, just generates a "new" undecidable statement within this "new" system!
In particular, Gödel's Incompleteness Theorem guarantees that no finite scheme of axioms can prove or disprove all the well-formed statements involving the natural numbers! There are statements in any axiomatic formulation of number theory (probably the most basic subject in mathematics) that are undecidable.
Based on the work of Gödel in 1940 and of Paul Cohen in 1963, the Continuum Hypothesis is undecidable within standard ZF theory! So much for Hilbert's first problem. This constituted a resolution he probably did not imagine when he first posed the question. There are some very deep, if not downright disturbing, questions here. Is the Continuum Hypothesis true? What is meant by the truth of any undecidable proposition? What does this tell us about the truth of any mathematical statement?
V. Naïve Set Theory <----------> Table of Contents <----------> VI. Closing Comments