Visualizing My Photo Library

Note: There’s an interactive version of this post as an Observable notebook here.


My iCloud Photo Library, where I keep the majority of my photos, keeps growing steadily. Currently, there are a total of 17016 photos in there.¹ I wanted to get an overview of my photo habits by charting how and when I take photos.

Photos by Year and Month

Number of photos by year and month.

Let’s start with the number of photos I took each year, shown in the chart at the top. The trend is obvious: I take more photos each year. While 2012 to 2016 saw moderate growth, I was quite surprised by the explosion in 2017 and 2018.

I see a few reasons for this: Obviously, I just take more photos in general. More specifically, I stopped filtering myself. I now take a lot of pictures of everyday mundane things for archival purposes, like a visual diary. Ten years down the road, I want to get a good feel for what my life is like today. And finally, I started asking friends and family to share their photos with me more regularly.

The bottom right chart shows photos by month. December is the clear winner and reflects the collected Christmas and New Year’s Eve photos. Apparently I don’t take many photos in the cold months of January and February.

The chart on the bottom left combines both charts and gives photos per year and month. Christmas time each year is noticeable again (and increasing each year) but I also have yearly photos in April from my birthday.

Photos by Weekday and Time of Day

Number of photos by weekday and time of day.

I was particularly interested in when I take most photos, i.e. which weekday or time of day. The bottom right chart clearly highlights Saturday as my photo day of choice. Makes sense since that’s when most exciting things happen. Friday and Sunday, too, stand out from the regular workdays.

The chart on the top shows at which time during the day I take most photos. It slowly ramps up from late morning and peaks from early afternoon to evening. During nights I tend to sleep (or maybe am awake but don’t shoot many photos). Surprisingly, in all these years, I’ve only managed to take six photos between four and five a.m.

From the bottom left chart we can see that at least on Fridays and Saturdays, I do take some late night and early morning photos.

Every Photo I’ve Ever Taken


Here is one dot for every photo in my library. Looking at this feels weirdly intimate. These dots represent a lot of moments.

Most noticeable, again, is the increase in photo density over time.

Data

The data is exported from my iCloud Photo Library on my iPhone. I used the Pythonista iOS app and the Photos API to export a simple list of timestamps for each photo. You can find the script and dataset here.
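As a rough illustration of how such a list of timestamps can be turned into the counts behind the charts, here is a minimal sketch. It assumes the timestamps are ISO 8601 strings, which may not match the actual export format, and the function name `bin_timestamps` and the sample values are made up for this example.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def bin_timestamps(timestamps):
    """Count photos per year, month, weekday and hour of day."""
    counts = {
        "year": Counter(),
        "month": Counter(),
        "weekday": Counter(),
        "hour": Counter(),
    }
    for ts in timestamps:
        # Assumption: each timestamp is an ISO 8601 string
        taken = datetime.fromisoformat(ts)
        counts["year"][taken.year] += 1
        counts["month"][taken.month] += 1
        counts["weekday"][WEEKDAYS[taken.weekday()]] += 1
        counts["hour"][taken.hour] += 1
    return counts

# Made-up sample data, not taken from the real library
sample = ["2017-12-24T18:30:00", "2018-04-07T14:05:00", "2018-12-24T19:10:00"]
counts = bin_timestamps(sample)
print(counts["month"].most_common(1))
```

Combining two of these keys, for example `(taken.year, taken.month)`, would give the counts for the combined year-and-month chart.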

Charts were created with the excellent Vega-Lite library.

Conclusion

This was a fun way to make some temporal sense of my photo library.

In this post, I only considered photos by creation date. In the future, I want to include and visualize additional metadata for each photo such as:

Additionally, I could add some crossfilters. For example, see how weekday and time of day charts change for each year. There are a lot of dimensions for simple data like this.

Photo libraries used to be nothing more than sequences of photos. But with ever-growing photo libraries spanning decades and potentially reaching hundreds of thousands of photos, it will become increasingly hard to keep an overview. Apple, Google and others have noticed this and are heavily ramping up their machine learning and other automated photo analysis approaches (face and object recognition, smart photo clustering, memory reminders, filtering photos across multiple dimensions etc.). I’m excited to see where this is going and how we’ll interact with our photo libraries in the future.

  1. Not all of the photos in my library were taken by me and not all photos I’ve ever taken are in there. And for simplicity, I used the term “photo” to denote any type of item in my library, be it photo, video, GIF, audio etc.

IchSagNurWeb

Recently, I was asked if I could figure out which single word “Ich sag nur Web” stands for when rearranging the letters (German for “I just say web”). I tried but I couldn’t come up with the solution. Since I was too proud to give up, I employed some automated cheating.

The idea was to simply generate each permutation of the letters and check which of them are valid German words. This requires testing 12! = 479 001 600 words which would take a very long time. Fortunately, we can use the fact that in German the letter “c” almost exclusively occurs within the sequence “ch” or “sch” to reduce the search space.

In the case of “sch”, we only need to permute the remaining nine letters “iagnurweb” and insert “sch” at each of the ten possible positions in every permutation. This reduces the search to 10 * 9! = 10! = 3 628 800 words, two orders of magnitude fewer. For “ch” it would be 11! = 39 916 800 words.
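The arithmetic above can be double-checked with a few lines of Python. This is only a sanity check of the counts; the helper name `candidates` is made up for this sketch.

```python
from math import factorial

riddle = "ichsagnurweb"
# All twelve letters are distinct, so every permutation is a different word
assert len(set(riddle)) == len(riddle)

def candidates(fixed_block):
    """Words to test when fixed_block is kept together as one unit."""
    remaining = len(riddle) - len(fixed_block)
    # remaining! permutations, each with (remaining + 1) insertion positions
    return (remaining + 1) * factorial(remaining)

brute_force = factorial(len(riddle))
with_sch = candidates("sch")
with_ch = candidates("ch")
print(brute_force, with_sch, with_ch)
```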

Code

For Python, I found the pyenchant spell-checking library which I used to check for valid German words. Since the number of permutations is quite large, I offloaded it to multiple processes.

from itertools import permutations
from multiprocessing import Pool
import enchant

riddle = "ichsagnurweb"
sch = "sch"
german_dictionary = enchant.Dict("de_DE")

def check(permutation):
    permutation = "".join(permutation)
    for i in range(len(permutation) + 1):
        word = permutation[:i] + sch + permutation[i:]
        word = word.capitalize()
        if german_dictionary.check(word):
            print(word)

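def main():
    # Permute only the letters that are not part of the fixed "sch" block
    characters = set(riddle) - set(sch)
    perms = permutations(characters)
    # Parallelize work across all CPU cores
    workers = Pool()
    workers.imap(check, perms, chunksize=50000)
    workers.close()
    workers.join()

# Guard the entry point so that worker processes do not re-run main()
# on platforms where multiprocessing starts workers with "spawn"
if __name__ == "__main__":
    main()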

Results

After about two minutes on my machine, the “sch” version spit out the following words. Some of the words are hilarious and some aren’t even real words. The only real candidate on the list is Braunschweig, a city in Germany. Riddle solved!

Arschweinbug
Barschwungei
Bauschwinger
Bierwaschgnu
Bierwaschung
Braugenwisch
Braunschweig
Braunwegschi
Buschwegiran
Buschwegrain
Einwaschburg
Erbschwingau
Genbrauwisch
Genraubwisch
Grabneuwisch
Grabwunschei
Neugrabwisch
Raubgenwisch
Schubwegiran
Schubwegrain
Schwingbauer
Schwingbraue
Schwingerbau
Schwungbarei
Waschbiergnu
Wascheinburg
Waschreibung
Wegbraunschi
Wegbuschiran
Wegbuschrain
Wegschubiran
Wegschubrain
Weinarschbug
Wunschgrabei

For fun, I also ran the “ch” version. It took about 25 minutes to generate the following:

Achsburgwein
Achsburgwien
Achswegrubin
Achsweinburg
Arschweinbug
Bachwegruins
Bachwegurins
Barschwungei
Barweichgnus
Bauchingwers
Bauschwinger
Beirangwuchs
Bierwachgnus
Bierwachsgnu
Bierwaschgnu
Bierwaschung
Braugenwisch
Braunschweig
Braunwegschi
Bruchgaswein
Bruchgaswien
Bruchweganis
Bruchweingas
Buchgraswein
Buchgraswien
Buchsargwein
Buchsargwien
Buchsingware
Buchwagensir
Buchwagensri
Buchwarnsieg
Buchwegrains
Buchweingras
Buchweinsarg
Burgachswein
Burgachswien
Burgeinwachs
Burgnachweis
Burgsachwein
Burgsachwien
Burgwachsein
Burgweichsan
Buschwegiran
Buschwegrain
Busringwache
Buswachniger
Busweichgarn
Busweichrang
Eichwarnbugs
Einburgwachs
Eingrabwuchs
Einwachsburg
Einwaschburg
Erbschwingau
Gasbruchwein
Gasbruchwien
Gasweinbruch
Genbrauwisch
Genraubwisch
Grabeinwuchs
Grabneuwisch
Grabsuchwein
Grabsuchwien
Grabwunschei
Grasbuchwein
Grasbuchwien
Grasweinbuch
Nachweisburg
Neugrabwisch
Rangbeiwuchs
Rangsuchweib
Rangweichbus
Raubgenwisch
Ringbuswache
Ringsubwache
Ringsuchwabe
Sachburgwein
Sachburgwien
Sachwegrubin
Sachweinburg
Sargbuchwein
Sargbuchwien
Sargweinbuch
Schubwegiran
Schubwegrain
Schwingbauer
Schwingbraue
Schwingerbau
Schwungbarei
Singbuchware
Subringwache
Subwachniger
Subweichgarn
Subweichrang
Suchgrabwein
Suchgrabwien
Suchrangweib
Suchringwabe
Suchweingrab
Wachbiergnus
Wachburgsein
Wachbusniger
Wachsbiergnu
Wachseinburg
Wachsreibung
Wachsubniger
Wagenbuchsir
Wagenbuchsri
Warnbuchsieg
Warneichbugs
Waschbiergnu
Wascheinburg
Waschreibung
Wegachsrubin
Wegbachruins
Wegbachurins
Wegbraunschi
Wegbruchanis
Wegbuchrains
Wegbuschiran
Wegbuschrain
Wegsachrubin
Wegschubiran
Wegschubrain
Weichbargnus
Weichburgsan
Weichbusgarn
Weichbusrang
Weichrangbus
Weichsubgarn
Weichsubrang
Weinachsburg
Weinarschbug
Weinbruchgas
Weinbuchgras
Weinbuchsarg
Weingasbruch
Weingrasbuch
Weinsachburg
Weinsargbuch
Weinsuchgrab
Wunschgrabei

5 Years of Home Screens

All home screens of the past five years.

About five years ago, I started taking one screenshot of my iPhone home screen each month and archiving it. I really don’t know why but I just kept doing it. With 2018 coming to an end, it was time to sit down and put all home screens together. The result is a curious cross section of my iPhone usage history.

Observations

I thought this would be more interesting. Just looking at the screenshots and types of apps, there don’t seem to be significant changes in how I’ve used my phone over the past five years. Well, onto the next five years!

Home Screen Shortcut

I probably wouldn’t have stuck with collecting home screens this long if it weren’t for this shortcut I created. It makes the entire process quick and easy. The shortcut takes the home screen image as input, adds it to a specific photo album and then archives it to Day One along with some additional metadata (iOS version, device type).

Plotting Weight on iOS

Generated weight chart
Generated chart with Shortcuts and Pythonista.

Sporadically, I log my weight into Apple Health on my phone. I want to see my weight charted over time but I don’t like to keep yet another app around for this purpose. This is better solved with some iOS automation and the Shortcuts app.

The problem is Shortcuts itself has no capability to chart data. Enter Pythonista. Pythonista is a full-fledged Python environment on your iOS device! It even includes the matplotlib library which we’ll use to plot the weight samples.

Since Pythonista has no API for accessing Apple Health data, we’ll use Shortcuts as a frontend. This shortcut collects the weight samples as CSV, then calls this Pythonista script which generates the chart and returns the image back to the calling shortcut.
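To give an idea of what the Python side of this could look like, here is a minimal sketch. It is not the script linked above. It assumes the shortcut hands over rows of the form `date,weight` with dates as `YYYY-MM-DD`, which is a guess at the format, and the names `parse_samples` and `plot_weight` are made up. How the data gets into Pythonista and how the image gets back to Shortcuts is left out.

```python
import csv
import io
from datetime import datetime

def parse_samples(csv_text):
    """Parse 'date,weight' rows into a sorted list of (datetime, float)."""
    samples = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        try:
            # Assumption: dates look like 2018-12-01
            when = datetime.strptime(row[0].strip(), "%Y-%m-%d")
            weight = float(row[1])
        except ValueError:
            continue  # skip a header line or unparsable rows
        samples.append((when, weight))
    return sorted(samples)

def plot_weight(samples, path="weight.png"):
    """Draw the samples as a line chart and save it as a PNG file."""
    # Imported here so that parsing also works where matplotlib is missing
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no window needed
    import matplotlib.pyplot as plt

    dates = [when for when, _ in samples]
    weights = [weight for _, weight in samples]
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(dates, weights, marker="o")
    ax.set_ylabel("Weight (kg)")
    fig.autofmt_xdate()  # tilt the date labels so they don't overlap
    fig.savefig(path)
    plt.close(fig)
    return path

# Made-up sample data
samples = parse_samples("date,weight\n2018-12-02,75.1\n2018-12-01,75.4\n")
```

Calling `plot_weight(samples)` then writes the chart to `weight.png` in the current directory.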

The result is the chart above. The cool thing is, everything—from data collection to charting—happens right on your phone and without having to install third-party one-purpose apps.

Download the shortcut here and the Python script here and then install it in Pythonista.

Emoji Fragmentation

Emoji Fragmentation describes two problems in the beloved world of emoji:

  1. Different platforms (and versions of platforms) support different sets of emojis.
  2. The same emojis look different across platforms.
Example of emoji fragmentation with the Drooling Face emoji.
Drooling Face emoji. Emoji images: © Apple, Samsung, Google, Microsoft

In the first case, an emoji you send to a friend might render as a placeholder box if it’s not available on her platform. In the second case, the meaning of what you want to convey might change slightly or sometimes drastically. While emoji support and visual styles seem to be converging over time across platforms, fragmentation still remains a problem.

I’ve been known to consult the excellent Emojipedia before sending a sensitive message to make sure my intentions remain clear emoji-wise when received on a different platform. This is a cumbersome process so it was time to automate it.

Screenshot
The shortcut in action for the monocle emoji. Facial expressions differ significantly.

The result is this shortcut for the iOS Shortcuts app. It expects a single emoji character on the clipboard and shows you how it will look across platforms. You can trigger the shortcut from the share sheet, Notification Center or Siri. That way, you don’t have to leave your messaging app to compare emojis.

Limitations

The shortcut only shows the current version of an emoji for each platform. If the receiver is on an older version, the emojis might still look different.

Since there is no Emojipedia API, the shortcut works by scraping the corresponding Emojipedia entry. In case the site updates its HTML structure, the shortcut will most likely break.

Conclusion

Emojis are serious business. Download the shortcut here.

Alfred File Actions++

Alfred File Actions
File Actions in Finder.

Alfred’s file actions are one of my most used features of the app. They allow you to quickly perform actions on the selected file in Finder, like moving, deleting or sharing. I’ve always wanted this feature for other apps, not just for Finder.

Let’s say I’m in Sketch editing an image and want to email it to somebody. Currently, I’d have to find the file in Finder and then bring up the file actions for it. It would be great to have a feature that shows the file action panel for the currently opened file, no matter which app I’m in.

File Actions in any app!

I made this workflow for Alfred that does just that. Press the ⌘. shortcut (can be configured) in any document-based app and the file action panel appears, just like in Finder. This workflow is a godsend for me as it speeds up many common file-related tasks.

Workflow and source code are available here.

Counting Mountain Summits with Strava

The Brocken summit
Finding summits

There’s a small mountain near me that I’ve probably summited dozens of times so far. I don’t keep track but I’ve always been curious about the exact number. Let’s find out.

The mountain in question is the Brocken, at 1141 metres the highest peak in the Harz mountains. In this post, we’ll determine the number of times I’ve successfully climbed it by analyzing all my Strava activities. To date, I have about 800 runs, hikes and other activities logged on there.

The source code for this project is on GitHub.

Assumptions

Data

An activity is represented by a polyline of geographical coordinates. The summit is represented by a polygon. In particular, we’ll use a circle with a radius of 350 metres around the peak (marked red in the images). This is large enough to account for inaccuracies in the GPS data.

Split Activities

An activity that starts directly at the summit is not counted as a successful summit. This is in response to activities where I logged ascent and descent separately. For example, I would run up the mountain, take a lunch break at the top and hike down later. If we were to treat the two parts separately, we’d risk overcounting the number of summits.

Multi-Summits

Multiple summits within a single activity are counted individually. Hey, if I put in the effort, I want to see it reflected.

We need to define some criteria for what constitutes a valid subsequent summit. Going down a few metres from the peak and right back up is probably not sufficient. To make it simple, we’ll use a second circle with an arbitrary radius of 3 kilometres around the peak. The boundary of this circle must be crossed between one successful summit and the next.

One activity with multiple summits
One activity with two summits. Right: Between the two summits, the activity extends beyond the required 3 km from the peak, marked by the outer circle.

Finding Summits

The basic idea is straightforward: If an activity contains a summit, its polyline must intersect the summit polygon. However, this alone is not enough as it does not account for multiple summits within a single activity.

To do that, we need to take a closer look at the individual segments of an activity. We obtain these segments by splitting up the line along the boundary of the summit.

Consider the following activity:

Schematic view of an activity
A schematic view of an activity. Right: The activity split into line segments by the summit polygon.

The activity crosses in and out of the summit three times. Cutting the activity along the summit boundary leaves us with seven individual line segments (the three inner ones included).

Of these, we can identify four different types of segments by where they lie in relation to the summit. The two interesting ones here are:

A start segment itself marks one successful summit. A loop segment marks a summit only if it is not completely contained by the outer boundary circle. The inner and the end segments can be disregarded.

As such, the activity above contains two successful summits.

We repeat this process for every activity and sum up the results to arrive at the total number of valid summits.
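The counting rules can also be sketched in Python. Note that this is a simplified stand-in and not the turf.js implementation: instead of splitting the line along the polygon boundary, it only checks the distance of each recorded GPS point to the peak, so it could miss a crossing if points are sparse. The function names and the peak coordinates are assumptions made for this example.

```python
import math

# Approximate coordinates of the Brocken peak (lat, lon), assumed for this sketch
BROCKEN = (51.7991, 10.6156)

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * math.asin(math.sqrt(h))

def count_summits(track, peak=BROCKEN, summit_radius=350, outer_radius=3000):
    """Count valid summits in one activity given as a list of (lat, lon)."""
    if not track:
        return 0
    summits = 0
    # An activity that starts inside the summit circle does not count,
    # so the first summit is only "armed" when the track starts outside.
    armed = haversine_m(track[0], peak) > summit_radius
    for point in track:
        distance = haversine_m(point, peak)
        if distance <= summit_radius:
            if armed:
                summits += 1
                armed = False  # must leave the outer circle before the next one
        elif distance > outer_radius:
            armed = True  # crossed the outer boundary, next summit counts
    return summits

def offset(degrees_north):
    """Helper for the demo: a point some degrees of latitude north of the peak."""
    return (BROCKEN[0] + degrees_north, BROCKEN[1])

# 0.05 degrees is about 5.6 km (beyond the outer circle),
# 0.01 degrees is about 1.1 km (outside the summit, inside the outer circle)
demo = [offset(0.05), offset(0), offset(0.01), offset(0), offset(0.05), offset(0)]
print(count_summits(demo))
```

In the demo track, the second visit to the peak is not counted because the activity only went about 1.1 km away in between, which mirrors the loop segment rule described above.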

Implementation

I implemented the algorithm using the spatial analysis package turf.js. To get the input data, I first wrote a script to batch download all my activities from the Strava API and convert them to GeoJSON.

Results

All identified activities with summits
All 57 identified summit activities.

Here is what the script spit out:

Total number of activities: 777

Total number of summits: 60

Activities with summits: 57 (48 runs, 9 hikes)
   Shortest: 8 km, longest: 59 km, mean: 21 km
   Lowest elevation gain: 363 m, highest: 1881 m, mean: 719 m

60 summits is a respectable number if I say so myself.

Only three of the activities contain multiple summits. I actually expected there to be more. This is a consequence of our choice of 3 kilometres for the outer perimeter between subsequent summits. If we reduce the required radius by just 100 metres, the number of valid summits increases to 64.

Also captured: my preference for trail running over hiking.

The source code for this project is on GitHub.