To Catch a Catfisher

This fall at HackTX, my group and I left the beaten path to try something a little different with our 24-hour project. It started off as something of a joke, but the greater implications of what we were making became more apparent the longer we worked into the night.

Idea

After brainstorming for a bit in our group chat, we settled on the idea of making a bot that would scan Tinder for certain images. In effect, you could determine whether a specific individual had an active Tinder account in a certain area. We pitched our idea to the judges as something that would detect if someone was catfishing using your images. They eventually arrived at the same conclusion we did: it would be extremely easy to target specific people using this app, and it could be weaponized with ease. Does your husband activate a Tinder account when he goes on work trips? How long after you broke up with your ex did she make a Tinder account? These are some extremely creepy use cases, and that’s kind of the point. What our project did was underline the total failings of location-based hookup apps like Tinder.

Execution

We managed to finish most of the project within the 24 hours. There were some parts that we couldn’t get fixed in time, but we were able to submit with about a minute to spare. On to the individual parts that I worked on.

Tinder

Obviously, a major part of this project was figuring out how to access Tinder. I didn’t want to have to emulate it on a phone, or just record the screen and get data from that. I needed direct access to the data.

Authentication

The first thing I had to do was figure out how to get through Tinder’s authentication. That way, I wouldn’t have to actually use the app; I could just send requests for profiles and receive the data as if I were the actual mobile client. At that point, Tinder doesn’t know the difference between the actual app and the code on my computer. I’m not going to go into too much detail on this step, because I don’t want to assist the creation of more bots on Tinder. After about an hour, I had a bot hooked up to a dummy Tinder account, registered with a burner phone number, making authenticated requests. I initially used Facebook for authentication, but they flagged our account for suspicious activity. Good job, Facebook.

here is some example code at this point

 
 

Tinder Requests

Once I had access to the client-side functions of the Tinder app, I could make requests to their server. The way Tinder retrieves profiles granted us incredible power in our profile searching. One of the messages we could send set our coordinates and a search radius for profiles. This allowed us to pinpoint any location where we wanted to look for profiles. Then, all we had to do was continually ask Tinder for new profiles in that area. We had access to all of the users’ photos, their Instagram usernames, ages, and locations. Here’s an example output with a random name and the Instagram profile removed:

Image provided from Tinder


Image Processing

Now that the program had access to users’ Tinder information, including remote URLs to their photos, it was time to compare those photos against a local directory of photos and find matches. We thought about the kinds of things our program needed to do, and it came down to this:

  • The program had to identify a copied image

  • A copied image has to be identified even if it is cropped, flipped, or put through a filter of some sort

  • The program should use Microsoft Azure facial detection and recognition to identify images that aren’t copied, but likely contain the target’s face

Due to the time constraint, we decided that three methods of measurement would be sufficient for our demo the next day. This is what we decided to use:

Mean Squared Error and Structural Similarity Measure

Using this guide, I was able to get Mean Squared Error to work for the images with a bit of duct tape and elbow grease. I needed to pull the image from the remote URL and also load the local images for processing. Then I had to normalize their dimensions so that the matrix operation wouldn’t scream in agony.
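As a rough illustration of that normalize-then-compare step, here is a minimal, dependency-free sketch. It is not the code from the project: it assumes the images have already been downloaded and decoded into 2D lists of grayscale values, and the helper names (`resize_nearest`, `mse`, `compare`) and the 64x64 working size are made up for the example. In practice a library such as NumPy or OpenCV would do the heavy lifting.

```python
def resize_nearest(pixels, width, height):
    """Resize a 2D grayscale image (list of rows) with nearest-neighbor sampling."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def mse(image_a, image_b):
    """Mean squared error between two grayscale images of identical size."""
    if len(image_a) != len(image_b) or len(image_a[0]) != len(image_b[0]):
        raise ValueError("images must have the same dimensions")
    total = sum(
        (a - b) ** 2
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    )
    return total / (len(image_a) * len(image_a[0]))


def compare(remote, local, size=(64, 64)):
    """Normalize both images to the same (assumed) size, then score with MSE."""
    width, height = size
    return mse(
        resize_nearest(remote, width, height),
        resize_nearest(local, width, height),
    )
```

A score of 0 means the two normalized images are pixel-for-pixel identical; anything larger means they differ somewhere, which is why this measure is mostly good at spotting exact copies.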

Using MSE, the program can determine whether the sample image is an exact copy of an image from the local directory: identical images score 0, and the score grows as the pixels diverge. Outside of this scenario, it is not as useful. Structural similarity takes the cake in this regard, because it compares luminance, contrast, and structure within local windows of the image, so it is less reliant on the holistic makeup of the picture.
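For reference, the standard definition of SSIM for two image windows x and y is usually written as follows (this is the textbook form, not necessarily the exact expression pictured in the original post):

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + c_1)\,(2\sigma_{xy} + c_2)}
       {(\mu_x^2 + \mu_y^2 + c_1)\,(\sigma_x^2 + \sigma_y^2 + c_2)}
```

Here \mu_x and \mu_y are the mean pixel values of the two windows, \sigma_x^2 and \sigma_y^2 are their variances, and \sigma_{xy} is their covariance. The constants c_1 = (k_1 L)^2 and c_2 = (k_2 L)^2 keep the division stable, where L is the dynamic range of the pixel values (255 for 8-bit images) and k_1 = 0.01, k_2 = 0.03 are the commonly used defaults. Identical windows score 1.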

SSIM formula to look smart


Image Hashing with Dhash

Dhash was also a huge part of this program for determining similar images. This algorithm works really well and is a great redundancy for the previous two, because it’s much more accurate and reliable. Here’s how it works.

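Start with an image

Compress the photo to about 8x8 (Dhash typically uses 9 columns by 8 rows, so that each row yields 8 neighbor comparisons and the hash comes out to 64 bits)

Convert the compressed image to grayscale

And then, using the grayscale array, you can calculate a hash! Each bit records whether a pixel is brighter or darker than the pixel next to it.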

1100100101101001001111000001100000001000000000000000011100111111

Using the bitstring you get from Dhash, you can count the number of bits that differ from the hashes of your source images (the Hamming distance) and set a threshold for what counts as a matching image.
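The hashing and comparison steps can be sketched in a few lines. This is an illustrative sketch rather than the project’s code: it assumes the image has already been shrunk to a grayscale thumbnail 9 pixels wide by 8 tall (so each row gives 8 left-to-right comparisons, 64 bits total), it arbitrarily sets a bit to 1 when the left pixel is brighter than the right one (implementations differ on this convention), and the threshold of 10 differing bits is a placeholder, not a value from the project.

```python
def dhash_bits(gray, hash_size=8):
    """Return the difference-hash bitstring of a grayscale thumbnail.

    `gray` is a list of `hash_size` rows, each with `hash_size + 1` pixels.
    """
    if len(gray) != hash_size or any(len(row) != hash_size + 1 for row in gray):
        raise ValueError("expected a (hash_size + 1) x hash_size thumbnail")
    # One bit per adjacent pair: 1 if the left pixel is brighter (assumed convention).
    return "".join(
        "1" if row[x] > row[x + 1] else "0"
        for row in gray
        for x in range(hash_size)
    )


def hamming_distance(bits_a, bits_b):
    """Count the positions at which two equal-length bitstrings differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


def is_match(bits_a, bits_b, threshold=10):
    """Treat two hashes as the same image if few enough bits differ.

    The default threshold is an assumption for illustration only.
    """
    return hamming_distance(bits_a, bits_b) <= threshold
```

Because the hash only looks at whether brightness rises or falls between neighbors, small changes such as recompression or a mild filter tend to flip only a few bits, which is what makes a distance threshold workable.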

here is some example output at this point

Judgement

If you noticed that the facial recognition probability is the same for all of the photos, it’s because we couldn’t get it to work in time. I had the input ready to go for when we merged, but we were cutting it too close. Come back to the site and check on my progress with adding facial recognition.

We did manage to get a UI working thanks to a teammate who was up all night building a GUI for us. It made our presentation go by a lot more smoothly.

 
 

In the wild

We decided to put the program to the test and see if it could identify me on a Tinder profile. We also needed to create a dummy account to do the searching. We made the target account with my own photos and name so that it wasn’t creepy, and we supplied the program with the photos from my profile:

Within 5 minutes of starting, it had found my account.

 
 

you know I’m plugging Laser Catch whenever I can


Conclusion

In the end, we were really excited to see that our program could actually work in the wild, and fast enough for a judge to watch. It also showed that there are extreme vulnerabilities in location-based searching like the kind found in Tinder. Another issue is the culture surrounding the service. When people post their information on that site, they are thinking about the interpersonal interactions with other users, not the ways that this information can be used against them.