Digital Footprint - What Facebook Knows About Me

Protecting your digital footprint is important, so for a few weeks I used a Chrome extension called Data Selfie to find out what Facebook knows about me.

Yitaek Hwang

Earlier this year, DuckDuckGo, a search engine committed to protecting its users’ privacy, reported that it received more than 14 million searches a day. That figure pales in comparison to Google’s 3.5 billion a day, but the continued rise in popularity of pro-privacy and anonymity websites indicates users’ growing interest in protecting their digital footprint.

A couple of weeks back, I came across a Chrome extension called Data Selfie that collects data from your Facebook newsfeed and applies machine learning to give a snapshot of what Facebook might know about you through your digital footprint.

The creators, Hang Do Thi Duc and Regina Flores Mir, aim to explain how our “data profiles, the ones we actively create, compare to the profiles made by the machines at Facebook, Google, and Co. — the profiles we never get to see, but unconsciously create”.

Although I only post on Facebook once every couple of months to let the world know I’m alive on social media, I was nevertheless curious to see what Facebook would conclude about me. Setup was easy: just install the Chrome extension and let it collect data in the background while you browse.

Data Selfie stores locally which links you click, which posts you like, how long you spend reading a post, and what you type. This information is passed to the Apply Magic Sauce API and IBM Watson to predict psycho-demographic traits such as personality, religious orientation, and political orientation.
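To make that concrete, here is a minimal sketch of what a local activity log of this kind could look like. It is not Data Selfie's actual code: the `ActivityLog` class, its field names, and the shape of the summary are all assumptions made for illustration, and no real prediction API is called.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ActivityLog:
    """Hypothetical local store for the four kinds of data listed above."""
    clicked_links: list[str] = field(default_factory=list)
    liked_posts: list[str] = field(default_factory=list)
    read_seconds: dict[str, float] = field(default_factory=dict)
    typed_text: list[str] = field(default_factory=list)

    def record_read(self, post_id: str, seconds: float) -> None:
        # Accumulate time across repeated views of the same post.
        self.read_seconds[post_id] = self.read_seconds.get(post_id, 0.0) + seconds

    def summary(self) -> dict:
        """Condense the raw log into a small payload (assumed shape)."""
        words = Counter(
            word.lower().strip(".,!?")
            for text in self.typed_text
            for word in text.split()
        )
        return {
            "links_clicked": len(self.clicked_links),
            "posts_liked": len(self.liked_posts),
            "total_read_seconds": sum(self.read_seconds.values()),
            "top_words": [w for w, _ in words.most_common(3)],
        }


log = ActivityLog()
log.clicked_links.append("https://example.com/article")
log.liked_posts.append("post-1")
log.record_read("post-1", 12.5)
log.record_read("post-1", 7.5)
log.typed_text.append("Go Duke! Duke wins again.")
print(log.summary())
```

Even this toy version shows how a handful of clicks, likes, and typed words quickly adds up to a profile that a prediction service could work with.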

My Data Selfie

The first few days after installing Data Selfie, I found myself more conscious of my digital footprint even though the algorithms did not have enough data to generate predictions. I realized that I almost exclusively use Facebook on my phone. Since Data Selfie is not supported on mobile, I had to force myself to use Facebook on my laptop to gather data.

Even with limited use, Data Selfie applied sentiment analysis to predict which keywords are relevant to me and how I feel about each of them. It was interesting to find that it thought I was -0.6 negative on Silicon Valley and 0.57 positive on global health. I don’t recall clicking on or interacting with any particular post on those topics, so I wondered whether certain ads had popped up on my newsfeed without registering with me.

There were also some hilarious results: it thought I was -0.78 negative on President Trump, but 0.32 positive on Donald Trump. It also thought I was 0.3 positive on Duke and simultaneously -0.75 negative on Duke.

Perhaps that reflects all the positive and negative press we receive every basketball game. Data Selfie also concluded that I was in the 51st percentile for Intelligence and the 50th for Leadership (those numbers were lower until today).
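One plausible reason the same subject can end up with two different scores is that each name is tracked as its own keyword. The sketch below illustrates that idea with a toy word-list scorer. It is an assumption for illustration only, not how Data Selfie or IBM Watson actually compute sentiment; the `LEXICON` values, the function names, and the sample posts are all made up.

```python
from collections import defaultdict

# Tiny hand-made lexicon (hypothetical values); real services use trained models.
LEXICON = {"great": 1.0, "love": 0.8, "win": 0.5,
           "bad": -0.7, "terrible": -1.0, "loss": -0.5}


def sentence_score(sentence: str) -> float:
    """Average the lexicon values of known words; 0.0 if none are known."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def keyword_sentiment(sentences: list[str], keywords: list[str]) -> dict[str, float]:
    """Average sentence scores per keyword, matching each keyword as an exact phrase."""
    scores = defaultdict(list)
    for sentence in sentences:
        score = sentence_score(sentence)
        for keyword in keywords:
            if keyword.lower() in sentence.lower():
                scores[keyword].append(score)
    return {k: round(sum(v) / len(v), 2) for k, v in scores.items()}


# Made-up posts that refer to the same city under two names.
posts = [
    "NYC traffic is terrible.",
    "New York City has great food.",
    "NYC had a bad storm.",
]
result = keyword_sentiment(posts, ["NYC", "New York City"])
print(result)
```

Because the scorer never learns that the two names refer to the same place, they end up with scores of opposite sign, much like the President Trump and Donald Trump pair.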

The most interesting aspect of using Data Selfie was finding the misalignment between what I perceive my identity to be and how Facebook perceives me based on my digital footprint.

For example, Data Selfie concluded that I am competitive based on what I type (yellow), as well as easily stressed and emotional. I consider myself the complete opposite, which lines up more with the green plusses (posts I looked at).

Also, while Data Selfie nailed most of my shopping preferences, it thought that I probably don’t like outdoor activities and probably won’t consider starting a business in the next few years. I thought that was interesting, since I’ve been researching outdoor activities for my upcoming Iceland trip quite extensively, and I’m in an entrepreneurship fellowship, with my newsfeed currently filled with Kickstarter and Indiegogo campaigns.

Image credit: Data Selfie

What Does This Mean & What Can I Do?

The gap between my perceived self and my online persona forced me to rethink how I interact online. Do I somehow project my inner thoughts more online than I do in person?

Also, in the age of fake news (or shall I say “alternative facts”), I can echo the concerns that Flores Mir raises about the ethical implications of our lack of control over digital data. As Katharine Schwab asks in her review on FastCoDesign, how ethical, and how dangerous, is it when your data is used to sell you on a candidate?

“If it’s just Ugg boots and Mac cosmetics, maybe that’s okay,” Flores Mir says. “But if they’re targeting you to try and influence your vote for either Hillary Clinton or Donald Trump — maybe that’s not okay.”

Data privacy is becoming a hot topic, and I have since learned that a big data company hyper-targeted pro-Trump ads during the election based on personality profiles created on Facebook, using tools similar to Data Selfie. Against that backdrop, digital literacy is becoming a critical factor in an increasingly connected world.

Finally, while you don’t have much control over what data you surrender to use Google and Facebook, you can opt out of targeted ads by following these simple steps:

Yitaek Hwang - Senior Writer, IoT For All
Yitaek is a Senior Writer at IoT For All who loves learning about IoT, machine learning, and artificial intelligence. He graduated from Duke University with a dual degree in electrical/computer and biomedical engineering and is a huge Cameron Crazie.