NYC Playgrounds by Neighborhood

If you’ve never found yourself wishing to be young again, New York City will certainly change that. With access to more than 1,200 recreational play areas chock-full of fun and interactive playthings, life is good for an NYC kid. Heck, some playgrounds are even equipped with spinning water wheels, climbing walls, and giant building blocks. Jealous yet? Well, you should be.

Read More »

Sleep Data Analysis with R

I have been keeping track of my sleep for over two years using the Sleep Cycle app for the iPhone. I initially downloaded the app for its smart alarm feature, which promises to wake you up without leaving you feeling tired (every person’s dream). The app allegedly achieves this feat by triggering the alarm when you are in light sleep, which it infers from your movements in bed. When you are finally able to wake up, you are treated to a nice summary of last night’s slumber:

Read More »

Knn Faces

Read More »

Resolving EntityRef - Expecting ';'

So Frank and I were scraping data off the web (with permission, of course) using R’s XML package when suddenly a wild error appeared! A quick search brought up a few StackOverflow posts and blogs offering common solutions. At first glance, we thought that the URL query string was the source of our woes: the xmlParse() function could not read the unescaped ampersands!
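To make the suspected failure mode concrete, here is a minimal sketch written in Python's standard library rather than R's XML package. The sample markup, the `BARE_AMP` pattern, and the `escape_bare_ampersands` helper are all made up for illustration, and this only covers the first hypothesis (a bare `&` in a query string), not necessarily the real fix.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: an "&" that does NOT start an entity
# such as &amp; &#38; or &#x26;
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def escape_bare_ampersands(markup: str) -> str:
    """Replace bare ampersands with &amp; and leave real entities alone."""
    return BARE_AMP.sub("&amp;", markup)


# Made-up markup with an unescaped "&" inside a URL query string.
raw = '<links><a href="http://example.com/search?q=r&page=2">next</a></links>'

# A strict XML parser reads "&page" as the start of an entity reference
# and gives up when no ";" follows.
try:
    ET.fromstring(raw)
    parsed_raw = True
except ET.ParseError:
    parsed_raw = False

# After escaping, the same markup parses and the attribute value is
# decoded back to a plain "&".
root = ET.fromstring(escape_bare_ampersands(raw))
href = root.find("a").get("href")
```

A regular expression is a blunt tool here; it is only meant to show why a parser might expect a `;` after an ampersand.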

Read More »

NYC Open Data Presentation

Back in October, I gave a brief presentation on the benefits of NYC Open Data for tracking issues at the district level. This was primarily motivated by my work at the New York City Council and will be the basis of my MPH thesis, in which I will be creating a knowledge management system to quantify and visualize the 311 data for better communication in the policy arena.

Read More »

Image Compression with PCA

Ever wonder how graphics software is able to reduce the file size of your image without a significant loss in quality? Welcome to the world of image compression! Expanding on a previous post in which I used principal component analysis (PCA) to generate so-called “eigenfaces”, I will be using the infamous Lenna image to demonstrate how the same technique can be used to compress images and reduce file size.

Read More »

Denoising a PNG Image with kNN Imputation

Given a PNG image with noise, where noise is defined as having an RGB value equal to [0, 0, 0], we can use k-nearest neighbors imputation to fill in the zeros using nearest neighbor averaging. For each row with a zero value, we find the k-nearest neighbors using a Euclidean metric, confined to columns for which that row is not zero. The accuracy of this image restoration technique will vary depending on how we set our k parameter.
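The procedure described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the code from the post: it assumes a single-channel image stored as a list of rows, treats any zero as noise, and uses a made-up `knn_impute` helper and toy data.

```python
import math


def knn_impute(rows, k=2):
    """Fill zero entries in each row with the mean of its k nearest rows.

    Simplification: distances use the columns observed (non-zero) in the
    target row, without checking whether the other row is noisy there.
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v == 0]
        if not missing:
            continue
        observed = [j for j, v in enumerate(row) if v != 0]
        # Euclidean distance to every other row, confined to the columns
        # for which this row is not zero.
        candidates = []
        for m, other in enumerate(rows):
            if m == i:
                continue
            dist = math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))
            candidates.append((dist, m))
        candidates.sort()
        neighbors = [m for _, m in candidates[:k]]
        # Average the neighbors' values, skipping neighbors that are
        # themselves zero in the column being filled.
        for j in missing:
            donors = [rows[m][j] for m in neighbors if rows[m][j] != 0]
            if donors:
                filled[i][j] = sum(donors) / len(donors)
    return filled


# Toy 4x3 "image": the zero in the second row stands in for a noisy pixel.
image = [
    [10, 20, 30],
    [12, 0, 33],
    [11, 22, 31],
    [90, 80, 70],
]
restored = knn_impute(image, k=2)
```

With k=2 the noisy pixel is filled from the two similar rows; raising k to 3 would pull in the very different last row and skew the average, which is one way the choice of k affects the result.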

Read More »

Principal Component Analysis and Eigenfaces

After an afternoon of playing around with Python’s sklearn library, I present to you a short little experiment in dimensionality reduction using the Extended Yale Face Database B. The database contains 16,128 images of 28 human subjects under 9 poses and 64 illumination conditions. Here are some example images from this paper:

Read More »

Exploring New York City Council Meetings Data

If you haven’t noticed from my previous blog posts on New York City 311 complaints and web scraping legislation, I am a huge fan of using data to improve local politics and municipal policies. We often hear of data science in the context of training recommendation engines for behemoths like Amazon and Netflix, but with cities like New York pushing for open data, even a small-time data wrangler like myself can provide (hopefully) useful insights for my community.

Read More »