I want to be able to express set notations fluently in math fields used in machine learning, so I started reading Naive Set Theory by Halmos.
But I have been facing a lot of problems, like:
- On pages 1-6, I encountered "if and only if" and had to go to Wikipedia to actually understand it.
- Also, I had to search Math.SE to understand such sentences as:
> It is equally harmless if the letter used has already been used with "for some" or "for all". Recall that "for some $x \ (x \in A)$" means the same as "for some $y \ (y \in A)$"; it follows that a judicious change of notation will always avert alphabetic collisions.
" nothing contains everything "
Every page is a conundrum that requires huge amounts of mental gymnastics. I think I'm not ready to read the book, yet.
Can anyone recommend a better introduction to informal set theory than Halmos?
My background: I majored in biology and attended calculus, statistics, real analysis, and linear algebra classes at university. However, I dropped out of all the math classes very early to focus on biology. Now I want to learn math again. My high school didn't teach any theoretical foundations of math at all.
---
You are ready. You don't read a math book like you read a novel; you can literally spend days on one page. You are not going to find a better book than Halmos's, so you might as well grab a ton of scratch paper and go to town.
I do recommend that you get acquainted with the fundamentals of logic and boolean algebra first. You must become comfortable with implies, if and only if, for all, there exists, and, and or.
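(Not part of the original answer — a minimal Python sketch of the connectives it names, since truth tables are the quickest way to get comfortable with them. Note that "p implies q" is, by definition, "(not p) or q", which means it is true whenever p is false; this often surprises newcomers.)

```python
def implies(p: bool, q: bool) -> bool:
    # "p implies q" is false only when p is true and q is false.
    return (not p) or q

def iff(p: bool, q: bool) -> bool:
    # "p if and only if q": implication in both directions.
    return implies(p, q) and implies(q, p)

# Enumerate the full truth table for and, or, implies, iff.
for p in (True, False):
    for q in (True, False):
        print(f"p={p!s:<5} q={q!s:<5} "
              f"and={(p and q)!s:<5} or={(p or q)!s:<5} "
              f"implies={implies(p, q)!s:<5} iff={iff(p, q)}")
```

Working through a table like this by hand, then checking it against the program, is a good first exercise.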
Since you have found your way to Stack Exchange, you have access to quality help when your brain starts melting from overload.
In terms of set theory, the phrase 'nothing contains everything', taken out of context, can mean either that the empty set contains all sets or that there is no such thing as the set of everything.
The first interpretation is ridiculous. The second is actually quite profound and has not always been believed.
Finally, there is something an old boss of mine once told me. "Nobody ever accomplished anything by reading the whole book." There is much in Halmos that you probably just don't need to know. Don't feel obliged to learn everything in the book.
---

> I want to be able to express set notations fluently in math fields used in machine learning.
A book on set theory probably isn't the right place to be looking. Here's my advice:
Start off by trying to understand "propositional logic" (aka "Boolean logic"). Try Googling terms like "introduction to propositional logic", and perhaps get yourself a good introductory book on logic. Following sylvia's advice, consider getting your hands on How To Prove It and/or similar books.
Once you understand propositional logic, make it your goal to understand "first-order logic." Try Googling "introduction to first-order logic" and/or similar search terms.
Seriously, make sure you understand first-order logic. Being able to express one's ideas correctly using first-order logic is arguably one of the most important skills a mathematician must possess.
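(My illustration, not the answer's: over a finite domain, the first-order quantifiers "for all" and "there exists" map directly onto Python's `all()` and `any()`, which is a handy way to sanity-check quantified statements.)

```python
A = {1, 2, 3, 4}

# "for all x in A, x > 0"
forall_positive = all(x > 0 for x in A)

# "there exists x in A such that x is even"
exists_even = any(x % 2 == 0 for x in A)

# A false universal claim: "for all x in A, x is even"
forall_even = all(x % 2 == 0 for x in A)

print(forall_positive, exists_even, forall_even)  # True True False
```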
Realize that you don't have to use set-builder notation to express yourself unambiguously. In particular, observe that the following both express the same thing.
- $A = \{x \in X \mid \mathrm{blah}\}$.
- $A$ is the unique subset of $X$ such that (for all $x \in X$) the following are equivalent:
  - $x \in A$
  - $\mathrm{blah}$
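(A sketch of mine, not from the answer: Python's set comprehension is nearly a direct transcription of set-builder notation. Here I use "$x$ is even" as a stand-in for the hypothetical predicate $\mathrm{blah}$.)

```python
X = set(range(10))

# A = {x in X | x is even}  -- set-builder notation, transcribed
A = {x for x in X if x % 2 == 0}

# The equivalent characterisation: A is the unique subset of X such
# that, for all x in X, (x in A) iff (x is even).
assert all((x in A) == (x % 2 == 0) for x in X)

print(sorted(A))  # [0, 2, 4, 6, 8]
```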
I think Halmos' Naive Set Theory is primarily concerned with set theory as a foundation on top of which mathematics is built, but the word "naive", if I understand correctly, just means he's viewing the concept of a set concretely as a collection of things rather than axiomatically as being whatever satisfies the axioms. As to set theory applied to machine learning, it may be that what is needed differs from the content of Halmos' book. I like the book by E. Kamke, which was reprinted by Dover Books, but I don't know how well suited to machine learning the topic is. Is it useful in machine learning to know how to prove that $2^{\aleph_0}\ne\aleph_\omega$? I wouldn't be surprised if it's not, but I don't know.
As for the difference between "for some $x$, $(x\in A)$" and "for some $y$, $(y\in A)$", it is the same as the difference between the first and last terms in this expression: $$ \sum_{x=1}^5 x^2 = \underbrace{1^2 + 2^2 + 3^2 + 4^2 + 5^2}_{\text{No }x\text{ or }y\text{ appears here.}} = \sum_{y=1}^5 y^2. \tag 1 $$ Here $x$ and $y$ are "bound" variables, and there is a Wikipedia article about this topic. The first and last expressions in $(1)$ are called "alphabetic variants" of each other.

If one is faced with an expression like $$ \left(\sum_{i\in A}f(i)\right)\left(\sum_{i\in B} g(i)\right), $$ it can be useful to write this alphabetic variant: $$ \left(\sum_{i\in A}f(i)\right)\left(\sum_{j\in B} g(j)\right), $$ because from that one can go on to see that this is equal to $$ \sum_{i\in A}\left(\sum_{j\in B} \Big( f(i)g(j) \Big) \right). $$ A similar thing happens when one wants to put formulas in first-order logic into what is called prenex normal form.
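(A quick Python check of the point above — mine, not the answer's. The name of a bound variable is irrelevant, and renaming one index lets the product of sums be rewritten as a double sum. The sets and functions `f`, `g` below are arbitrary stand-ins.)

```python
# Renaming a bound variable changes nothing: both sums are 55.
sum_x = sum(x**2 for x in range(1, 6))
sum_y = sum(y**2 for y in range(1, 6))
assert sum_x == sum_y == 55

A = {1, 2, 3}
B = {10, 20}

def f(i):
    return i * i

def g(j):
    return j + 1

# Product of two sums...
lhs = sum(f(i) for i in A) * sum(g(i) for i in B)

# ...equals the double sum, once the second index is renamed to j.
rhs = sum(sum(f(i) * g(j) for j in B) for i in A)

assert lhs == rhs
print(lhs)  # 448
```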
I would take "nothing contains everything" to mean that there is no set of which everything is a member. In particular, in the conventional Zermelo–Fraenkel theory, no set is a member of itself. Moreover, every set has more subsets than it has members (this is Cantor's theorem), so its subsets cannot all be members of it, although in some instances some of them are.
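(A finite illustration of "every set has more subsets than it has members" — my Python sketch, not part of the answer: a set with $n$ members has $2^n$ subsets, and $2^n > n$.)

```python
from itertools import chain, combinations

def powerset(s):
    """Return all subsets of s, as a list of sets."""
    s = list(s)
    return [set(c)
            for c in chain.from_iterable(
                combinations(s, r) for r in range(len(s) + 1))]

S = {1, 2, 3}
subsets = powerset(S)

assert len(subsets) == 2 ** len(S) == 8
assert len(subsets) > len(S)   # more subsets than members
print(subsets)
```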
---

While not exactly an introduction to set theory per se, I found An Introduction to Proofs and the Mathematical Vernacular by Martin V. Day to be of great help in getting acquainted with a number of concepts and notations.
Although its focus lies on reading, writing, and checking proofs, it provides all the necessary information to understand those proofs. In my opinion, it manages to keep the language simple and helps students find their way by pointing out important caveats.
I think you're reading the wrong book. Of course, if you want to continue reading Halmos for fun and/or the intellectual challenge, that's great, but what you're doing is the equivalent of trying to learn to drive by reading books about automotive engineering. A more mathematical analogy would be to think back to the courses you took on calculus and real analysis: the calculus course was about how to use differentiation and integration to solve problems, whereas the analysis course was about how differentiation and integration work "under the hood".
For machine learning, you will be a user of sets and you might do some calculation on sets but you won't be doing research into what sets are and what are the limits of what they can do. For your purposes, you don't really need to know what's "under the hood" and it will be entirely sufficient to say that a set is a collection of objects defined by some stated property. Yes, that way lies Russell's paradox but Russell's paradox only comes up if you use "weird" properties to define sets and you won't be going anywhere like that. In particular, essentially all the objects you'll come across that you might want to describe as "sets" will be subsets of something that definitely is a set, such as the integers. In those areas, set theory works in just the way you'd expect it to.
The first chapter of any undergrad textbook on general discrete mathematics will almost certainly contain all the set theory you need for machine learning.
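(As a concrete example of that user-level set theory — a sketch of mine in Python, not from the answer: subsets of a known set, the integers below 10, with the usual operations.)

```python
universe = set(range(10))

# Two subsets of the universe, defined by stated properties.
evens = {n for n in universe if n % 2 == 0}   # {0, 2, 4, 6, 8}
small = {n for n in universe if n < 5}        # {0, 1, 2, 3, 4}

print(evens & small)   # intersection: {0, 2, 4}
print(evens | small)   # union: {0, 1, 2, 3, 4, 6, 8}
print(evens - small)   # difference: {6, 8}
print(small <= universe)  # subset relation: True
```

No "weird" self-referential properties anywhere in sight, so set theory here works in just the way you'd expect it to.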