Cameron Smith: welcome to my personal site

In 1853 Richard Burton, a British orientalist and explorer, disguised himself as a Persian merchant and visited the Islamic holy city of Mecca. For Burton, the stakes were high: not only was the journey itself extremely dangerous (his caravan was attacked by bandits) but the punishment for non-Muslims entering the holy city was death.

Over the last two centuries things have calmed down significantly: there are drive-through KFC and Taco Bell restaurants on the way to the city, and the chance of dying in a car accident is much greater than the chance of falling victim to banditry. However, the penalty for trespassing remains harsh: arrest, imprisonment, and deportation. Despite this, in January 2020 I decided to visit Mecca. This is my journal.

Public health organizations have stated that social distancing is one of the most effective ways to reduce the spread of the coronavirus. Social distancing entails active separation from in-person social gatherings: avoiding parties, public transport, crowded streets, and any other source of fun. When people have to be around others, the CDC offers specific guidelines for keeping yourself and others safe, one of which is to stay six feet away from others. While this seems like a simple recommendation, it has specific, geometric implications…

A beginner's guide to learning how to program with Python.

This article is here to answer all the questions you’ve ever had about linear regression. I’ll walk you through every step of the process, from acquiring and describing data to creating a model, evaluating it, optimizing it, and preventing it from overfitting. Along the way I’ll introduce generally applicable data science techniques like data munging, visualizations, and some nifty Python code. I take a two-pronged approach to introducing linear regression: code and math. Every problem I present in this article can be described from an abstract, mathematical perspective as well as concretely in code. Both perspectives are important, and each sheds light on different aspects of linear regression, so I do a careful dance back and forth between them. At the end of this article, you’ll have a deep understanding of linear regression. If you’re a beginner, you’ll find yourself comfortable. And if you’ve already encountered linear regression before, you’ll leave with all of your nagging questions answered.

Before reading this you should have a basic understanding of coding principles. All the code in this article is in Python, but if you know another high-level language it should be easy to follow along even if you’ve never written a single line of Python before. I also use Python’s matplotlib and pandas libraries frequently, and it’ll be worth your while to check out their documentation if you find yourself getting lost. I also introduce some mathematical formalisms from calculus and linear algebra. However, I explain these topics in such a way that they will make intuitive (if not explicit) sense even to people who’ve never taken a calculus class before.

Although linear regression is often introduced in a formal, statistical way, I’m going to talk about the topic from a less formal, more machine-learning perspective. This doesn’t mean that I’m any less rigorous, but that the rigour comes from the practical perspective of a data scientist rather than the theoretical perspective of a mathematician. Hold onto your seats and enjoy the ride!

This blog post also exists as a Jupyter notebook. Check it out here.
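To give a flavour of the code-and-math pairing, here is a minimal sketch of a one-feature least-squares fit. It is an illustration rather than the article's own code: the function name `fit_line` and the data are made up, and it uses plain Python instead of pandas so that it runs with no dependencies. The math side is the familiar result that the best-fit slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # the fitted line always passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data lying exactly on y = 2x + 1 (hypothetical, for illustration only)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Because the made-up points sit exactly on a line, the fit recovers that line; with noisy real data the same two formulas give the line that minimizes the sum of squared errors.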

After the Chernobyl nuclear disaster in 1986, a Zone of Alienation (The Zone) was established in the vicinity of the Chernobyl nuclear power plant. Humans were evacuated by the Soviet military and never allowed to return. But The Zone is not entirely abandoned: it’s been re-inhabited by wolves, badgers, moose, and foxes – all species that had gone extinct in The Zone long ago…

This is how I create modeling servers on Google Cloud. Below I’ve listed the individual steps. However, keep in mind that I tend to bundle the programmatic portions of these steps into a bash script so that I’m not copy-pasting each line. Also, the Google Cloud CLI can be used to create projects and instances instead of the GUI. However, I find that using the GUI doesn’t add much more time, and it can be a useful tool for visualizing your data usage and projects. Get familiar with both!

“With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.”

–John von Neumann

Models are simplified representations of the world around us. Sometimes by simplifying something, we’re able to see connections that we wouldn’t be able to notice with all the messy details still present. Because models can represent relationships, they can also be used to predict the future. However, sometimes modeling something too closely becomes a problem. After all, a model should be a simplified version of reality: when you try to create an over-complicated model, you might begin noticing connections that don’t actually exist! In this blog post I’ll discuss how regularization can be used to make sure our models reflect reality in meaningful ways.
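As a rough illustration of the idea, the sketch below uses one common form of regularization, an L2 ("ridge") penalty, on the simplest possible model: a line through the origin, y = w * x. The choice of ridge, the function name `ridge_slope`, and the data are my assumptions for the example, not necessarily what the post itself uses. Minimizing the squared error plus `penalty * w**2` gives the closed form w = sum(x*y) / (sum(x*x) + penalty), so a larger penalty pulls the slope toward zero.

```python
def ridge_slope(xs, ys, penalty):
    """Slope w minimizing sum((y - w*x)**2) + penalty * w**2 (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # penalty = 0 gives the ordinary least-squares slope;
    # a larger penalty makes the denominator bigger and shrinks w toward 0
    return sxy / (sxx + penalty)


# Made-up, slightly noisy data (hypothetical, for illustration only)
xs = [1.0, 2.0, 3.0]
ys = [2.1, 3.9, 6.2]
for penalty in (0.0, 1.0, 10.0):
    print(penalty, ridge_slope(xs, ys, penalty))
```

The penalty trades a slightly worse fit to the data at hand for a simpler model, which is the sense in which regularization discourages a model from chasing connections that are only noise.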

I struggled with algebra in middle school. I still don’t like algebra. When I was studying for math tests as a kid I kept asking myself “why does this matter?” and “what does this mean?” After all, mathematics is emphatically not reality: it’s only connected to the world and our experiences through clever metaphors (or isomorphisms if you want to use math jargon). Yes, a perfect sphere has never existed, but the shape of the Earth is spherical enough that we can pretend it’s one.

The biggest reason that people struggle with understanding academic topics is that those topics aren’t given a sense of meaningfulness. This isn’t the fault of learners; it’s the fault of educators. When we navigate the world we can’t help but try to discover the meaning of the things we hear, think, and see. Humans are great at extracting meaningfulness: we understand the gist of articles, the plot of a story, or the intentions of someone talking to us. So it’s only natural that when we don’t find algebra meaningful we lose interest.

You’ll never understand or appreciate mathematics, poetry, or chemistry if you view them as the meaningless manipulation of numbers, words, or chemicals. Unfortunately, this is exactly how all of these fields are taught in school. In this blog post I want to show how to make things more meaningful. Along the way, we’ll develop some techniques to help us extract meaning from the world around us.

People associate cities with traffic, asphalt, and buildings. San Francisco often seems like one enormous parking lot, disconnected from nature. But there’s a real jungle coexisting with the urban jungle. Hiding in plain sight is an abundance of deadly plants. These range from the terrifying (like the deliriant Brugmansia aurea) to the lethal (Ricinus communis). I spent a day walking around San Francisco identifying some of the deadly plants that live among us. None of these plants are in inaccessible areas: all are easily spotted from streets or parks. Anyone can identify, observe, and maybe even bring home some of these poisonous plants.

When I first encountered multi-threading and multi-processing, I wasn’t able to distinguish the two. For me, both were some sort of magical way to make your programs run faster. However, understanding how multi-threading and multi-processing work is critical for many medium- and large-sized software projects. In this post, I’ll explain how each works.

When I’m trying to decipher some hairy math formula, I find it helpful to translate the equation into code. In my experience, it’s often easier to follow the logical flow of a programming function than an equivalent function written in mathematical notation. This guide is intended for programmers who want to gain a deeper understanding of both mathematical notations and concepts. I decided to use Python as it’s closer to pseudo-code than any other language I’m familiar with.

This is not meant to be a reference manual or encyclopedia. Instead, this document is intended to give a general overview of mathematical concepts and their relationship with code. All code snippets were typed into the interpreter, so I am omitting the canonical >>>.
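As a small taste of the approach, here is a sketch of how two common notations read as loops: capital sigma (the sum of f(i) for i from a lower bound to an upper bound) and capital pi (the product over the same range). The helper names `summation` and `product` are made up for this example and are not taken from the guide itself.

```python
import math


def summation(f, lower, upper):
    """Sigma notation: add up f(i) for i = lower, ..., upper (inclusive)."""
    total = 0
    for i in range(lower, upper + 1):  # math bounds are inclusive, range() is not
        total += f(i)
    return total


def product(f, lower, upper):
    """Pi notation: multiply f(i) for i = lower, ..., upper (inclusive)."""
    result = 1
    for i in range(lower, upper + 1):
        result *= f(i)
    return result


sum_of_squares = summation(lambda i: i ** 2, 1, 4)  # 1 + 4 + 9 + 16
five_factorial = product(lambda i: i, 1, 5)         # 1 * 2 * 3 * 4 * 5
print(sum_of_squares, five_factorial, math.factorial(5))
```

The one detail that tends to trip people up is the upper bound: the notation includes it, while Python's `range` stops just before it, hence the `upper + 1`.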

In this article, I’m going to teach you how to build a text classification application from scratch. To get started, all you need to know is a little Python, the rudiments of Bash, and how to use Git. The finished application will have a simple interface that allows users to enter blocks of text and then returns the identity of that text.

This project has three steps. The first is constructing a corpus of language data. The second is training and testing a language classifier model to predict categories. The third step is deploying the application to the web along with an API.

You can find the source code on GitHub. If you’d like a sneak peek at what the application looks like in the wild, click here.

100% freegan for a month.

I’m obsessed with learning different writing systems, but keeping things organized can be difficult. Fortunately, very few writing systems are independent inventions: most are derived from other scripts. To make things easier for myself, I created a taxonomic tree of all writing systems descended from Egyptian Hieroglyphs.1,2 Also included are some inspired orthographies such as Cherokee, which was invented by Sequoyah through the process of “stimulus diffusion”. Click here for the full screen version (recommended).

Mouse over a node in this tree to see some information about the script.

1. Data taken from Wikipedia

2. Some taxonomic groupings such as North Brahmic and South Brahmic are used for convenience of organization even though they are not scripts themselves.

Visualizations of the world's growing (and changing) population.

I’ll be exploring pangrams, which are sentences that contain every character in a writing system at least once. The pangram most familiar to English-speakers is “the quick brown fox jumps over the lazy dog.” Although they’re primarily used in typography, pangrams can also be useful for learning new writing systems.

In this blog post I’ll be looking at pangrams in Korean, Japanese, Arabic, and Hindi. Although all languages are created equal, some writing systems are more user-friendly than others: in the process of exploring pangrams, I’ll have the chance to contrast the relative merits of different orthographies.
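The definition translates almost directly into code. The sketch below is my own illustration (the names `missing_letters` and `is_pangram` are made up): it checks a sentence against an alphabet, which defaults to the 26 lowercase English letters. Passing in a different alphabet is a simplifying assumption, since scripts that combine characters into larger units would likely need extra normalization first.

```python
import string


def missing_letters(sentence, alphabet=string.ascii_lowercase):
    """Return the characters of `alphabet` that never appear in `sentence`."""
    seen = set(sentence.lower())
    return sorted(set(alphabet) - seen)


def is_pangram(sentence, alphabet=string.ascii_lowercase):
    """A sentence is a pangram when no character of the alphabet is missing."""
    return not missing_letters(sentence, alphabet)


classic = "The quick brown fox jumps over the lazy dog."
almost = "The quick brown fox jumped over the lazy dog."
print(is_pangram(classic))      # True
print(missing_letters(almost))  # ['s']
```

Reporting which letters are missing, rather than only a yes or no, makes it easier to see why a near-pangram falls short.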

Symbols from Lalish, holy mountain for the Yazidi people.

Pictures from my trip to Northern Iraq.

My favorite fruit.

An unlawful ascent.